I am working on a speech recognition system project, and I have used a deep neural network for the recognition itself. I also need the start and end times of the words occurring in the given speech. Can you suggest resources for solving the problem of timestamp generation in speech recognition? I know the Amazon Transcribe service generates timestamps too, but I haven't been able to find any papers about it.
Recommended answer
If you're interested in trying Microsoft's speech service (aka.ms/speech/sdk), we support word-level timestamps as well. You can start with one of our quickstart samples (available in many programming languages) and add a couple more lines of code to get the word-level timing information.
Basically, after trying out the default microphone quickstart or file quickstart, you add a couple of lines of code to request the word-level timestamps, plus one more line to retrieve the JSON response provided by the service, which contains the word-level timing information.
For example, in C#, you'd do this for your SpeechConfig object:
    config.OutputFormat = OutputFormat.Detailed;
    config.RequestWordLevelTimestamps();
And once you've received your SpeechRecognitionResult object, you'd do this:
    var json = result.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult);
    Console.WriteLine(json);
If you're using another supported programming language (C++, Java, JavaScript, Objective-C, Swift, Python, etc.), the code would be slightly different.
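Once you have that JSON string, getting per-word start and end times is mostly a parsing exercise. Here is a minimal Python sketch of that step. It assumes the detailed response has an NBest list whose first entry carries a Words array with Word, Offset and Duration fields, and that Offset and Duration are in 100-nanosecond units. The word_timings helper and the SAMPLE payload are hypothetical and hand-written for illustration, so check the field names against the JSON your own request returns.

```python
import json

# Assumption: Offset and Duration are expressed in 100-nanosecond units.
TICKS_PER_SECOND = 10_000_000


def word_timings(json_text):
    """Return (word, start_seconds, end_seconds) tuples for the top hypothesis."""
    response = json.loads(json_text)
    nbest = response.get("NBest", [])
    if not nbest:
        # Nothing recognized, or the detailed output format was not requested.
        return []
    timings = []
    for entry in nbest[0].get("Words", []):
        start = entry["Offset"] / TICKS_PER_SECOND
        end = (entry["Offset"] + entry["Duration"]) / TICKS_PER_SECOND
        timings.append((entry["Word"], start, end))
    return timings


# Hand-written sample in the assumed shape; this is not real service output.
SAMPLE = """
{
  "RecognitionStatus": "Success",
  "DisplayText": "Hello world.",
  "NBest": [
    {
      "Lexical": "hello world",
      "Words": [
        {"Word": "hello", "Offset": 5000000, "Duration": 4000000},
        {"Word": "world", "Offset": 10000000, "Duration": 5000000}
      ]
    }
  ]
}
"""

if __name__ == "__main__":
    for word, start, end in word_timings(SAMPLE):
        print(f"{word}: {start:.2f}s to {end:.2f}s")
```

The function only reads the first NBest entry, on the assumption that the top-ranked hypothesis is the one you want; loop over the whole list if you need timings for the alternatives too.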
Good luck.
Rob Chambers, Microsoft Architect and Engineering Manager