LLM Reference

Grok Voice Models by xAI


Details

Researcher: xAI
License: Proprietary
Commercial use: With conditions
Models: 1
Released: 2026

Capabilities

Multimodal: All models
Reasoning: All models
Function Calling: All models
Tool Use: All models
Structured Outputs: All models

About

Grok Voice is xAI's family of full-duplex voice agent models for enterprise workflows. It is designed for complex, multi-step customer support, sales, and enterprise applications, with real-time reasoning, structured data capture, and broad language support. The family is evaluated on the τ-voice Bench leaderboard for full-duplex voice agents under realistic conditions.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Use when the workload needs voice, reasoning, and tool use.

2026-04: voice, reasoning, tool use
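
As a rough illustration of how use-when guidance like the line above could be derived from seed fields, here is a minimal Python sketch. The `VariantSeed` record, its field names, and the `use_when` helper are hypothetical and only assume that a variant carries a list of capability tags and an optional replacement; this is not the site's actual schema or logic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VariantSeed:
    """Hypothetical seed record; field names are assumptions for illustration."""
    name: str
    released: str                                   # e.g. "2026-04"
    capabilities: List[str] = field(default_factory=list)
    replaced_by: Optional[str] = None               # set when a newer variant supersedes this one


def join_words(items: List[str]) -> str:
    """Join items as an English list, using an Oxford comma for three or more."""
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def use_when(seed: VariantSeed) -> str:
    """Build a one-line use-when hint from the seed's fields."""
    if seed.replaced_by:
        return f"Prefer {seed.replaced_by}; {seed.name} has been replaced."
    if not seed.capabilities:
        return "No capability data listed."
    return f"Use when the workload needs {join_words(seed.capabilities)}."


seed = VariantSeed(
    name="Grok Voice Think Fast 1.0",
    released="2026-04",
    capabilities=["voice", "reasoning", "tool use"],
)
print(use_when(seed))
```

With the capability tags listed for this variant, the sketch reproduces the guidance sentence shown above; a variant with a replacement set would instead point the reader to its successor.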

Release Timeline

1 release group (1 current)
2026-04: Grok Voice Think Fast 1.0 (Current)
voice, reasoning, tool use

Specifications (1 model)

Grok Voice model specifications comparison
| Model | Released | Multimodal | Reasoning | Fn Calling | Tool Use | Structured Outputs |
|---|---|---|---|---|---|---|
| Grok Voice Think Fast 1.0 | 2026-04 | Yes | Yes | Yes | Yes | Yes |

Frequently Asked Questions

What is Grok Voice used for?
Grok Voice is used for voice, vision and multimodal work, and reasoning. The family description and listed model capabilities point to those workloads as the best fit.
How does Grok Voice compare to Grok 1?
Grok Voice by xAI is strongest where you need voice, while Grok 1 by xAI is the closest related family to check for reasoning. Grok Voice has 1 listed variant, so compare the specs and pricing tables before choosing a production model.
Which Grok Voice model should I use?
If price is the main constraint, check the pricing table first, but note that Grok Voice does not have complete provider pricing in the local data. For the most capable and most recent listed model, evaluate Grok Voice Think Fast 1.0, which supports reasoning, tool use, function calling, structured outputs, and multimodal inputs.