Google Cloud Speech-to-Text
Released: 2023-01-01
Last refreshed: 2026-06-07
Status: Researched today
Proprietary, Commercial use with conditions, Multimodal, Vision, Audio
Google Cloud Speech-to-Text has tracked model metadata but no tracked provider pricing, which keeps it from being a default production pick.
Use it for
- Teams evaluating vision
Do not use it for
- Cost-sensitive launches that need sourced token pricing
- Strict JSON or tool-calling flows
- Teams that need a tracked hosted API route today
Specifications
- Released: 2023-01-01
- Architecture: transformer
- Specialization: speech-recognition
- Openness: Proprietary
- License: Proprietary, Commercial use with conditions
- Training: pretrained
Pricing
No tracked provider token pricing is available yet.
About
Enterprise speech recognition service with 125+ language support and advanced audio processing capabilities.
Google Cloud Speech-to-Text is a proprietary model. The structured metadata tracks multimodal input and audio. No headline benchmark score is tracked for Google Cloud Speech-to-Text yet.
Top use-case fit
Vision
Included by capability and metadata signals in the decision map.
Provider price ladder
No tracked provider token pricing is available for this model yet.
Capabilities
Multimodal, Audio
Benchmark peer bars for Vision
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.
API versions
google-cloud-speech-to-text