Google Cloud Speech-to-Text Models by Google
1 model · 2023
Details
Researcher: Google
License: Proprietary
Commercial use: Allowed with conditions
Models: 1
Released: 2023
Capabilities
Multimodal: All models
Google Cloud hosted automatic speech recognition model family.
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Google Cloud Speech-to-Text | Use when the workload needs speech recognition, multimodal inputs, and audio. | 2023-01 | speech recognition, multimodal inputs, audio | Current |
Release Timeline
1 release group (2023-01), 1 current
- Google Cloud Speech-to-Text: Current. Signals: speech recognition, multimodal inputs, audio.
Specifications (1 model)
| Model | Released | Multimodal |
|---|---|---|
| Google Cloud Speech-to-Text | 2023-01 | Yes |
Frequently Asked Questions
- What is Google Cloud Speech-to-Text used for?
- Google Cloud Speech-to-Text is used for speech recognition on audio input, and the listing tags it as multimodal. The family description and listed model capabilities point to those workloads as the best fit.
- How does Google Cloud Speech-to-Text compare to OpenAI Whisper?
- Both families target audio workloads. Google Cloud Speech-to-Text is Google's proprietary, Google Cloud hosted speech recognition family, and OpenAI Whisper by OpenAI is the closest related family to compare it against. Google Cloud Speech-to-Text has 1 listed variant, so compare the specs and pricing tables before choosing a production model.
- Which Google Cloud Speech-to-Text model should I use?
- Only one variant is listed, Google Cloud Speech-to-Text (released 2023-01, status Current), so it is both the latest and the most capable local choice. If price is the main constraint, start with the pricing table, but note that the local data does not include complete provider pricing for this family.
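
To make the speech recognition use case above more concrete, here is a minimal sketch of the JSON body that a synchronous recognition request would carry. It only builds the payload and sends nothing. The endpoint URL, the field names (`config`, `audio`, `encoding`, `sampleRateHertz`, `languageCode`), and the `LINEAR16` encoding are assumptions based on the public v1 REST API, not details from this listing, and the helper name `build_recognize_request` is hypothetical. Check the current Google Cloud documentation before relying on any of them.

```python
import base64
import json

# Assumed endpoint of the public v1 REST API (not stated in this listing).
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, language_code="en-US", sample_rate_hertz=16000):
    """Build a JSON-serializable body for a synchronous recognize call.

    Hypothetical helper: assumes raw 16-bit linear PCM audio (LINEAR16).
    Other audio formats would need a different `encoding` value.
    """
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {
            # Inline audio is carried as base64-encoded text.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }


if __name__ == "__main__":
    # 10 ms of silence at 16 kHz, 16-bit mono, standing in for real audio.
    placeholder_pcm = b"\x00\x00" * 160
    body = build_recognize_request(placeholder_pcm)
    print("POST", RECOGNIZE_URL)
    print(json.dumps(body["config"], indent=2))
```

Sending the request would additionally require authentication with a Google Cloud project, which is left out here on purpose.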