How many Chutes AI models does LLMReference track?

LLMReference currently tracks 5 models available through Chutes AI's API. Chutes AI's full catalog may be larger.

What are Chutes AI's most popular models?

Chutes AI's top models include Llama 3.3 70B Instruct (free), Qwen2.5-72B-Instruct, Gemma 2 9B Instruct, Mistral Large 2 (2407), and Grok-3.

What is Chutes AI's pricing?

Chutes AI pricing ranges from $0.10 to $0.80 per 1M input tokens and from $0.30 to $2.40 per 1M output tokens, depending on the model.

Chutes AI

Researched 6d ago

Coding, RAG, Long context, Vision, Classification, JSON / Tool use

Chutes AI offers 5 tracked models, all 5 with output token pricing. The catalog spans 6 workload areas, including coding, RAG, and long context; open any model detail page for benchmarks, batch tiers, and migration prompts.
Last verified 2026-06-30.

Use it for

Teams comparing token and batch pricing across this provider's models
Operators routing coding, RAG, and long-context workloads through this API

Do not use it for

Final benchmark picks without opening the relevant model detail page

Tracked models

5

Models available through this provider

Priced output routes

5

Models with output token pricing tracked

Cheapest output

$0.300

Gemma 2 9B Instruct on this route

Batch-ready models

No batch pricing tracked

Latest model release

2024-12-06

577d since newest release

Freshness

2026-06-30

fresh

Information

Models: 5

Company: Chutes AI

Chutes AI is a decentralized GPU compute network providing inference API access to open-source AI models.

Catalog freshness

The newest model tracked on this provider was released 2024-12-06 (577d ago).
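
The age figure above is plain date arithmetic, and a minimal sketch of it follows. The release date and the 2026-06-30 verification date come from this page; the "as of" date of 2026-07-06 is an assumption, inferred from the page being researched 6 days before it was viewed. The helper name `days_since` is hypothetical.

```python
from datetime import date


def days_since(release: date, as_of: date) -> int:
    """Return the number of whole days between a release date and a reference date."""
    return (as_of - release).days


newest_release = date(2024, 12, 6)   # from this page
last_verified = date(2026, 6, 30)    # from this page
assumed_today = date(2026, 7, 6)     # assumption: 6 days after the last verification

print(days_since(newest_release, last_verified))  # 571
print(days_since(newest_release, assumed_today))  # 577
```

The 577-day figure matches only when the reference date is the assumed viewing date rather than the verification date, so the number will drift as the page ages.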

Where this host wins

Coding: 1 tracked model with SWE-bench / HumanEval-style scores.
RAG: 2 tracked models with RULER / needle-in-a-haystack retrieval benchmarks.
Long-context: 2 tracked models with context-token or InfiniteBench-class signal.
Vision: 1 tracked model with multimodal benchmark coverage.

Compliance notes

No verified compliance claims (SOC 2, ISO, HIPAA) are tracked for this provider yet; check the vendor's trust center for current certifications.

Platform Overview

Compare per-model pricing, input and output token costs, batch availability, and benchmark coverage.

Available Models (5)

All models available as Serverless

Model	Input (USD per 1M tokens)	Output (USD per 1M tokens)
Grok-3	$0.80	$2.40
Llama 3.3 70B Instruct (free)	$0.22	$0.66
Mistral Large 2 (2407)	$0.50	$1.50
Gemma 2 9B Instruct	$0.10	$0.30
Qwen2.5-72B-Instruct	$0.18	$0.54
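
To compare these routes for a concrete workload, the per-1M-token prices can be turned into a cost estimate. The sketch below is illustrative only: the prices are copied from the table above, while the function names (`estimate_cost`, `cheapest_by_output`) and the example token counts are hypothetical and not part of any Chutes AI or LLMReference API.

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "Grok-3": (0.80, 2.40),
    "Llama 3.3 70B Instruct (free)": (0.22, 0.66),
    "Mistral Large 2 (2407)": (0.50, 1.50),
    "Gemma 2 9B Instruct": (0.10, 0.30),
    "Qwen2.5-72B-Instruct": (0.18, 0.54),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload from per-1M-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_by_output() -> str:
    """Return the model with the lowest output-token price."""
    return min(PRICES, key=lambda model: PRICES[model][1])


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost("Gemma 2 9B Instruct", 2_000_000, 500_000)
print(f"${cost:.2f}")        # $0.35
print(cheapest_by_output())  # Gemma 2 9B Instruct
```

In every row of the table the output price is three times the input price, so output-heavy workloads dominate the bill on any of these routes.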

Where else to run this

Mistral Large 2 (2407) on Chutes AI

Provider setup and pricing

Gemma 2 9B Instruct on Chutes AI

Provider setup and pricing

Qwen2.5-72B-Instruct on Chutes AI

Provider setup and pricing

Mistral Large 2 (2407) on Microsoft Foundry

Alternative host

Gemma 2 9B Instruct on Fireworks AI

Alternative host

Qwen2.5-72B-Instruct on DeepInfra

Alternative host