How many GroqCloud models does LLMReference track?

LLMReference currently tracks 9 models available through GroqCloud's API. GroqCloud's full catalog may be larger.

What are GroqCloud's most popular models?

GroqCloud's top models include Llama 3.1 8B Instruct, gpt-oss-120b, gpt-oss-20b, GPT OSS Safeguard 20B, and Llama 4 Scout 17B-16E Instruct.

What is GroqCloud's pricing?

GroqCloud pricing ranges from $0.03/1M to $0.59/1M input tokens depending on the model.

GroqCloud

Researched 29d ago | Inference Platform | Tier 1

Groq

Coding | RAG | Agents | Long context | Vision | Classification | JSON / Tool use

GroqCloud offers 9 tracked models (9 with output token pricing). This catalog covers coding, RAG, and agents; open any model detail page for benchmarks, batch tiers, and migration prompts.
Covers 7 workload areas across 9 tracked models; last verified 2026-06-07.

Use it for

Teams comparing token and batch pricing across this provider's models
Operators routing coding, RAG, and agents workloads through this API

Do not use it for

Final benchmark picks without opening the relevant model detail page

Tracked models

Models available through this provider

Priced output routes

Models with output token pricing tracked

Cheapest output

$0.030

Llama Prompt Guard 2 22M on this route

Batch-ready models

No batch pricing tracked

Latest model release

2025-08-05

335d since newest release

Freshness

2026-06-07

Researched 29d ago

fresh

Information

Type: Inference Platform

Tier: Tier 1

Models: 9

Company: Groq

Founded: 2016

Mountain View, California, United States

Groq is a company specializing in AI inference technology, particularly with their flagship product, the Language Processing Unit (LPU™). Their AI platform focuses on delivering fast, affordable, and energy-efficient AI inference solutions. Groq's technology is designed to unlock new classes of AI applications and use cases, emphasizing speed and performance in AI processing. Key aspects of Groq's AI platform include:

1. GroqChat: A chat interface leveraging their LPU technology.
2. GroqCloud™ Developer Hub: A platform for developers to build and deploy AI applications using Groq's infrastructure.
3. LPU™ AI Inference Technology: Their core technology designed for high-speed AI processing.

Groq's approach to AI is centered on creating systems that are not only fast but also energy-efficient, potentially addressing some of the scalability and sustainability challenges in the AI industry. The company designs, fabricates, and assembles its LPU and related systems in North America, emphasizing local production and control over their technology stack.

Links

Website | X / Twitter | LinkedIn | Crunchbase

Catalog freshness

The newest model tracked on this provider was released 2025-08-05 (335d ago).
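The age figure above can be reproduced from the dates on this page. The sketch below assumes the page was viewed 29 days after the last-verified date (the "Researched 29d ago" label); the function name `days_since_release` is hypothetical and is not part of any LLMReference tooling.

```python
from datetime import date, timedelta

# Values copied from this page.
NEWEST_RELEASE = date(2025, 8, 5)   # "Latest model release"
LAST_VERIFIED = date(2026, 6, 7)    # "Freshness" / "last verified"
RESEARCHED_DAYS_AGO = 29            # "Researched 29d ago"


def days_since_release(release: date, verified: date, researched_days_ago: int) -> int:
    """Days between the newest release and the assumed viewing date.

    Assumption: the viewing date is the verification date plus the
    "researched N days ago" offset shown on the page.
    """
    viewed_on = verified + timedelta(days=researched_days_ago)
    return (viewed_on - release).days


if __name__ == "__main__":
    print(days_since_release(NEWEST_RELEASE, LAST_VERIFIED, RESEARCHED_DAYS_AGO))
```

Under that assumption the result matches the 335d shown above; the gap at verification time alone is 306 days.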

Where this host wins

Coding: 2 tracked models with SWE-bench / HumanEval-style scores.
RAG: 5 tracked models with RULER / needle-in-a-haystack retrieval benchmarks.
Agentic: 4 tracked models with BFCL, tau-bench, and SWE-bench tool-use coverage.
Long-context: 5 tracked models with context-token or InfiniteBench-class signal.

Getting started

Official product, docs, and pricing links — confirm quotas and regions in the vendor docs.

Product Docs Portal Pricing

Compliance notes

No verified compliance claims (SOC 2, ISO, HIPAA) tracked for this provider yet — check the vendor's trust center for current certifications.

Platform Overview

Groq's AI platform is built around its Language Processing Unit (LPU™), an architecture designed for high-speed AI inference. The LPU™ is claimed to run inference up to 1000 times faster than conventional services such as ChatGPT, with low latency, which suits real-time applications such as chatbots and voice assistants. The platform handles a range of AI workloads, including natural language processing, computer vision, and complex computations, without extensive retraining or reconfiguration. It supports mixed-precision operations and ships with a software stack intended to simplify deployment.

GroqChat and the GroqCloud™ Developer Hub round out the platform. GroqChat lets users interact with multiple large language models (LLMs), while the GroqCloud™ Developer Hub offers a no-code environment for exploring APIs and featured models, so developers can experiment without extensive coding. On-demand pricing and flexible deployment options make the platform adaptable to different enterprise needs and allow AI capabilities to be integrated quickly into operational workflows.

Compare per-model pricing, input and output token costs, batch availability, and benchmark coverage.

Available Models (9)


All models available as Serverless

Model	Input (per 1M)	Output (per 1M)
GPT OSS Safeguard 20B	$0.075	$0.30
gpt-oss-120b	$0.15	$0.60
gpt-oss-20b	$0.075	$0.30
Llama Prompt Guard 2 22M	$0.03	$0.03
Llama Prompt Guard 2 86M	$0.04	$0.04
Qwen3-32B	$0.29	$0.59
Llama 4 Scout 17B-16E Instruct	$0.11	$0.34
Llama 3.3 70B Instruct (free)	$0.59	$0.79
Llama 3.1 8B Instruct	$0.05	$0.08
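
To turn the per-1M-token prices above into a per-request figure, a minimal sketch follows. The prices are copied from the table as tracked on this page and may differ from the vendor's current rates; the helper names `estimate_cost` and `input_price_range` are hypothetical and not part of any Groq or LLMReference API.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the table above: (input, output).
PRICES_PER_1M = {
    "GPT OSS Safeguard 20B": (Decimal("0.075"), Decimal("0.30")),
    "gpt-oss-120b": (Decimal("0.15"), Decimal("0.60")),
    "gpt-oss-20b": (Decimal("0.075"), Decimal("0.30")),
    "Llama Prompt Guard 2 22M": (Decimal("0.03"), Decimal("0.03")),
    "Llama Prompt Guard 2 86M": (Decimal("0.04"), Decimal("0.04")),
    "Qwen3-32B": (Decimal("0.29"), Decimal("0.59")),
    "Llama 4 Scout 17B-16E Instruct": (Decimal("0.11"), Decimal("0.34")),
    "Llama 3.3 70B Instruct (free)": (Decimal("0.59"), Decimal("0.79")),
    "Llama 3.1 8B Instruct": (Decimal("0.05"), Decimal("0.08")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost of one request at the tracked per-1M prices."""
    input_price, output_price = PRICES_PER_1M[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


def input_price_range() -> tuple[Decimal, Decimal]:
    """Lowest and highest tracked input price per 1M tokens."""
    input_prices = [prices[0] for prices in PRICES_PER_1M.values()]
    return min(input_prices), max(input_prices)


if __name__ == "__main__":
    # 2,000 input tokens and 500 output tokens on gpt-oss-120b.
    print(estimate_cost("gpt-oss-120b", 2_000, 500))
    print(input_price_range())
```

`Decimal` is used instead of floats so that sub-cent amounts do not pick up binary rounding noise. The range helper reproduces the $0.03 to $0.59 input-price span quoted in the FAQ at the top of the page.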