GroqCloud
Researched today · Inference Platform · Tier 1 · Groq
GroqCloud exposes 9 tracked models, all 9 with output token pricing in seed data. Task coverage across this catalog includes coding, RAG, and agents; open any model detail page for benchmarks, batch tiers, and migration prompts.
Portfolio context: 6 decision-task tags, 9 catalog rows, latest research stamp 2026-05-22.
Use this portfolio page for
- Teams comparing token and batch economics on this surface
- Operators routing coding, RAG, and agent workloads through this API
Do not stop here for
- Final benchmark picks without opening the relevant model detail page
| Metric | Value | Note |
|---|---|---|
| Catalog rows | 9 | Models linked to this provider in seed data |
| Priced output routes | 9 | Rows with token_out in seed data |
| Cheapest output | $0.030 | Llama Prompt Guard 2 22M on this route |
| Batch-ready SKUs | 0 | No batch pricing tracked |
| Latest catalog ship | 2025-08-05 | 290d since dated release field |
| Freshness | 2026-05-22 | Researched today |
Catalog release signal
Latest ISO-dated model.release in this catalog is 2025-08-05 (290d ago).
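The 290d figure can be reproduced from the two dates shown on this page. A minimal sketch, assuming the age is simply the research stamp minus the latest release date in whole days; the function name `days_since_release` is illustrative and not part of the catalog:

```python
from datetime import date


def days_since_release(release_iso: str, researched_iso: str) -> int:
    """Return whole days between a release date and the research stamp.

    Both arguments are ISO-8601 dates (YYYY-MM-DD). The subtraction order
    is an assumption: a positive result means the release is in the past.
    """
    release = date.fromisoformat(release_iso)
    researched = date.fromisoformat(researched_iso)
    return (researched - release).days


# Dates taken from this page: latest release and latest research stamp.
print(days_since_release("2025-08-05", "2026-05-22"))  # 290
```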
Where this host wins
- Coding: 1 tracked model with SWE-bench / HumanEval-style scores.
- RAG: 5 tracked models with RULER / needle-in-a-haystack retrieval benchmarks.
- Agentic: 4 tracked models with BFCL, tau-bench, and SWE-bench tool-use coverage.
- Long-context: 5 tracked models with context-token or InfiniteBench-class signal.
Getting started
Official entry points from seed metadata — confirm quotas and regions in vendor docs.
Compliance notes (verbatim seed excerpts)
Not yet verified from seed copy — no SOC/ISO/HIPAA-class sentences detected to quote verbatim.
Platform Overview
Groq's AI platform is built around its Language Processing Unit (LPU™), an architecture designed for high-speed, low-latency AI inference. The seed description claims speeds up to 1000 times faster than services such as ChatGPT; no benchmark is cited for that figure on this page. Low latency makes the platform suited to real-time applications such as chatbots and voice assistants.
According to the seed description, the platform handles natural language processing, computer vision, and other compute-heavy workloads without extensive retraining or reconfiguration, supports mixed-precision operations, and ships with a software stack intended to simplify deployment.
Two developer-facing products sit on top of the hardware. GroqChat provides a chat interface to multiple large language models (LLMs), and the GroqCloud™ Developer Hub offers a no-code environment for exploring APIs and featured models, so teams can experiment without writing much code. Pricing is on-demand, and deployment options are flexible enough to fit a range of enterprise workflows.
Available Models (9)
All models available as Serverless
| Model | Input (per 1M) | Output (per 1M) |
|---|---|---|
| GPT OSS Safeguard 20B | $0.075 | $0.30 |
| gpt-oss-120b | $0.15 | $0.60 |
| gpt-oss-20b | $0.075 | $0.30 |
| Llama Prompt Guard 2 22M | $0.03 | $0.03 |
| Llama Prompt Guard 2 86M | $0.04 | $0.04 |
| Qwen3-32B | $0.29 | $0.59 |
| Llama 4 Scout 17B-16E Instruct | $0.11 | $0.34 |
| Llama 3.3 70B Instruct (free) | $0.59 | $0.79 |
| Llama 3.1 8B Instruct | $0.05 | $0.08 |
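To compare routes on token economics, the per-1M prices above can be turned into a per-request or per-workload estimate. The sketch below assumes cost is linear in tokens with no minimums, caching discounts, or rate tiers, which this page does not confirm; `PRICES` and `estimate_cost` are illustrative names, and only a few rows of the table are copied in.

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "gpt-oss-120b": (0.15, 0.60),
    "gpt-oss-20b": (0.075, 0.30),
    "Qwen3-32B": (0.29, 0.59),
    "Llama 3.1 8B Instruct": (0.05, 0.08),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost for a workload, assuming purely linear token pricing."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Example workload: 2M input tokens and 500k output tokens on gpt-oss-120b.
cost = estimate_cost("gpt-oss-120b", 2_000_000, 500_000)
print(f"${cost:.2f}")  # $0.60
```

Swapping the model key shows how the same workload prices out on a different row; confirm current rates in vendor docs before budgeting.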
Platform Details
Organization
Groq specializes in AI inference technology built on its flagship product, the Language Processing Unit (LPU™). Its platform focuses on fast, affordable, and energy-efficient inference, with the stated goal of enabling new classes of AI applications. Key parts of the platform include:
1. GroqChat: a chat interface running on LPU technology.
2. GroqCloud™ Developer Hub: a platform for developers to build and deploy AI applications on Groq's infrastructure.
3. LPU™ AI Inference Technology: the core technology, designed for high-speed AI processing.
Groq aims for systems that are both fast and energy-efficient, which may address some of the scalability and sustainability challenges in the AI industry. The company designs, fabricates, and assembles its LPU and related systems in North America, emphasizing local production and control over its technology stack.