How to create a Microsoft Custom Voice through the API or SDK

Updated: 2024-10-09 11:18:11

Problem description

I am evaluating Microsoft Custom Voice as a potential vendor and want to know how to programmatically create and train custom voices either through an API or SDK.

After an extensive search, I have only found documentation showing how to create a custom voice through their custom voice portal. There is one line in this page that hints at a custom voice training API.

Below is that passage and the link to that documentation page.

Could you help me either figure out how to do this or confirm that no such API exists?

Once you have prepared your data, you can start to upload them to the Custom Voice portal, or through the Custom Voice training API.

docs.microsoft/en-us/azure/cognitive-services/speech-service/how-to-custom-voice-create-voice

Solution

I don't think George's answer is relevant, since you are asking specifically about "Custom" voice / speech.

Link to the APIs

There are APIs for this, but you are right that the documentation is not clear. You can find the API and its available operations at westus.cris.ai/swagger/ui/index. Note that it exists in several regions; for example, West Europe is at westeurope.cris.ai/swagger/ui

Currently, this page points to the Speech 2.0 API, but it looks like Microsoft will release a 3.0 version soon. If you look at the network calls made by the Speech portal (link), you can see that they are in fact already using this 3.0 API (preview).

How to use them?

You can get a feel for the process by going through it in the portal and checking which calls are made on the API side.

Here is an overview of the process:

Create your dataset for model training, using the /datasets/upload operation.

Once your dataset has been processed successfully, create a model with a POST request to /models (see operation here). The body of this POST carries several details: the base model, the training dataset to use, etc. This is the operation that trains your model; you don't need a separate call to start the training.

Once training is completed (you can check the status with a GET on /models, or with a GET on /models/yourModelId for a specific model), you can "deploy" it. To do that, create an endpoint based on this model with a POST to /endpoints (see operation here).

Then you can monitor the deployment status by calling GET on /endpoints, or GET by ID, as for models.
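
The steps after the dataset upload could be scripted roughly as follows. This is only a sketch under several assumptions, not the documented client: the path prefix in BASE_URL, the JSON field names in the request bodies, the "id" and "status" fields in the responses, and the "Succeeded" / "Failed" status values are all guesses for illustration and should be checked against the Swagger UI for your region. Only the /models and /endpoints paths come from the overview above. The dataset is assumed to be uploaded already, and dataset_id and base_model_id are hypothetical placeholders.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Assumption: the host comes from the answer above; the path prefix is a guess.
BASE_URL = "https://westus.cris.ai/api/texttospeech/v2.0"


def http_send(method: str, url: str, key: str, body: Optional[dict] = None) -> dict:
    """Send one JSON request and return the decoded JSON response ({} if empty)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Ocp-Apim-Subscription-Key", key)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else {}


def wait_until_done(get_status: Callable[[], str], poll_seconds: float = 30.0,
                    max_polls: int = 120,
                    sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll until the resource reports success; raise on failure or timeout."""
    for _ in range(max_polls):
        status = get_status()
        if status == "Succeeded":  # assumed status value
            return
        if status == "Failed":  # assumed status value
            raise RuntimeError("operation failed")
        sleep(poll_seconds)
    raise TimeoutError("operation did not finish in time")


def train_and_deploy(key: str, dataset_id: str, base_model_id: str,
                     send: Callable[..., dict] = http_send,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Create a model, wait for training, create an endpoint, wait for deployment."""
    # POST /models starts the training; the body field names are assumptions.
    model = send("POST", f"{BASE_URL}/models", key, {
        "name": "my-custom-voice",
        "baseModel": {"id": base_model_id},
        "datasets": [{"id": dataset_id}],
    })
    model_id = model["id"]  # assumption: the response body carries the new ID
    wait_until_done(
        lambda: send("GET", f"{BASE_URL}/models/{model_id}", key).get("status", ""),
        sleep=sleep)

    # POST /endpoints deploys the trained model; the body field names are assumptions.
    endpoint = send("POST", f"{BASE_URL}/endpoints", key, {
        "name": "my-custom-voice-endpoint",
        "models": [{"id": model_id}],
    })
    endpoint_id = endpoint["id"]
    wait_until_done(
        lambda: send("GET", f"{BASE_URL}/endpoints/{endpoint_id}", key).get("status", ""),
        sleep=sleep)
    return {"model_id": model_id, "endpoint_id": endpoint_id}
```

With a real subscription key, a call would look like train_and_deploy("yourKey", "yourDatasetId", "yourBaseModelId"). The send and sleep parameters are there so the flow can be exercised with a stub before pointing it at the real service.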

Published: 2023-11-28 11:51:44

Article link: https://www.elefans.com/category/jswz/34/1642289.html