Google Speech API: streaming audio longer than 1 minute

Updated: 2024-10-24 07:26:14
Problem description

I would like to extract a person's utterances from a stream of telephone audio. The phone audio is routed to my server, which then creates a streaming recognition request. How can I tell whether a word belongs to a complete utterance or to an utterance that is still being transcribed? Should I compare timestamps between words? Will the API continue to return interim results even if there is no speech for a certain amount of time in the streaming phone audio? How can I exceed the 1-minute limit on streaming audio?

Recommended answer

Regarding the first three questions:

You don't need to compare timestamps between words. You can tell whether a word is part of a complete utterance (a final result) by looking at the is_final flag in the StreamingRecognitionResult. If the flag is set to true, the response corresponds to a completed transcription; otherwise, it is an interim result.
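The is_final check described above can be sketched as follows. This is a minimal illustration, not the answerer's code: it assumes each streaming response exposes a results list, and each result exposes is_final and an alternatives list with a transcript field, as the Speech API's streaming responses do. The helper names split_final_and_interim and _fake_response are hypothetical, and the fake responses stand in for a real stream so the sketch runs without credentials.

```python
from types import SimpleNamespace


def split_final_and_interim(responses):
    """Yield ("final" | "interim", transcript) for each streaming result.

    `responses` is any iterable of objects shaped like streaming
    recognition responses (results -> alternatives -> transcript).
    """
    for response in responses:
        for result in response.results:
            if not result.alternatives:
                continue  # nothing transcribed in this result
            transcript = result.alternatives[0].transcript
            kind = "final" if result.is_final else "interim"
            yield kind, transcript


def _fake_response(transcript, is_final):
    # Hypothetical stand-in for a real response object, for demonstration only.
    alternative = SimpleNamespace(transcript=transcript)
    result = SimpleNamespace(alternatives=[alternative], is_final=is_final)
    return SimpleNamespace(results=[result])


demo_stream = [
    _fake_response("hello", False),        # utterance still in progress
    _fake_response("hello world", True),   # utterance completed
]
events = list(split_final_and_interim(demo_stream))
for kind, transcript in events:
    print(kind, transcript)
```

With a real client, the same function could be fed the iterator returned by the streaming call instead of demo_stream; only results tagged "final" would then be stored as complete utterances.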

Once you get a final result, no interim results should be generated until new utterances are streamed.

Regarding your last question: you can't exceed the 1-minute limit. You need to send multiple requests instead.
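One way to picture "multiple requests" is to cut the incoming audio into sessions that each stay under the limit and open a new streaming request per session. The sketch below shows only the splitting step. The function name split_into_sessions, the fixed chunk duration, and the 55-second default (a safety margin below the 1-minute limit) are assumptions for illustration, not values from the API or the answer.

```python
def split_into_sessions(chunks, chunk_seconds, max_seconds=55.0):
    """Group audio chunks into sessions no longer than max_seconds.

    chunks:        iterable of raw audio chunks (bytes)
    chunk_seconds: duration of each chunk in seconds (assumed constant)
    max_seconds:   assumed per-request budget, below the 1-minute limit
    """
    session, elapsed = [], 0.0
    for chunk in chunks:
        # Start a new session when this chunk would exceed the budget.
        if session and elapsed + chunk_seconds > max_seconds:
            yield session
            session, elapsed = [], 0.0
        session.append(chunk)
        elapsed += chunk_seconds
    if session:
        yield session  # flush the last, possibly shorter, session


# 130 one-second chunks of silence as placeholder audio.
audio_chunks = [b"\x00" * 16000] * 130
sessions = list(split_into_sessions(audio_chunks, chunk_seconds=1.0))
lengths = [len(s) for s in sessions]
print(lengths)
```

Each yielded session would be sent as its own streaming recognition request. A fixed time cut can land in the middle of an utterance, so a possible refinement, not covered by the answer, is to start the next request right after a result whose is_final flag is true.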

Published: 2023-11-26 12:54:20
Article link: https://www.elefans.com/category/jswz/34/1633964.html