LLM Reference

Claude 3.5 Haiku

Released
2024-10-22
Last refreshed
2026-06-01
Status
Researched 3d ago
CodingRAGAgentsLong contextVisionClassificationJSON / Tool use

Claude 3.5 Haiku is worth evaluating for coding, rag, and agents when its provider route and context window match the workload.

Use it for

  • Teams evaluating coding, rag, and agents
  • Workloads that can use a 200k context window
  • Buyers comparing 4 tracked provider routes

Do not use it for

  • Workloads where another current model has stronger sourced task evidence
Specifications
Released
2024-10-22
Context
200k
Architecture
Decoder Only
Specialization
general
Training
finetuned
Created by

Developing safe and ethical AI systems.

San Francisco, California, United States
Founded 2021
Website
Pricing
Output / 1M
$4.00
Input / 1M
$0.800

Cheapest of 6 routes · Anthropic · cache read $0.080

About

Claude 3.5 Haiku is Anthropic's latest AI model, known for its speed and efficiency while maintaining high intelligence. It is optimized for applications needing rapid response, like interactive chatbots and real-time content moderation. Initially text-only, future plans include image input capabilities. It excels in delivering fast, accurate code suggestions, processing and categorizing information swiftly, and handling large volumes of user interactions. Priced accessibly, it offers advanced coding, tool use, and reasoning abilities. Though initially surpassing Claude 3 Haiku in benchmarks, its pricing reflects its enhanced performance 123457.

Claude 3.5 Haiku is Anthropic's fastest and most cost-efficient model in the Claude 3.5 family, released on October 22, 2024. It supports a 200,000-token context window, processes both text and image inputs, and supports tool use, parallel function calls, and the Message Batches API. These features make it well-suited for high-throughput applications: interactive chatbots, real-time content moderation, automated code review, and large-scale data classification pipelines where latency and cost matter more than frontier accuracy.

On capability benchmarks, Claude 3.5 Haiku surpasses Claude 3 Opus on several evaluations including HumanEval (87.6%) while maintaining Haiku-tier response speed. It is particularly strong on coding, structured output, and agentic tool use with self-correction, outperforming Claude 3 Haiku substantially on instruction-following quality. The knowledge cutoff is July 2024.

Pricing as of late 2024 was $0.80 per million input tokens and $4.00 per million output tokens. Prompt caching reduces effective input cost by up to 90%; the Message Batches API offers an additional 50% reduction for non-latency-sensitive workloads. The model is available through Anthropic's direct API, AWS Bedrock, Google Vertex AI, Fireworks AI, OpenRouter, and the Vercel AI Gateway.

Claude 3.5 Haiku has a 200k-token context window.

Claude 3.5 Haiku input tokens at $0.8/1M, output at $4/1M.

Top use-case fit: coding, agents, and build tasks

Coding

Included by capability and metadata signals in the decision map.

RAG

Included by capability and metadata signals in the decision map.

Agents

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare all 6

Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

ProviderInput / 1MOutput / 1MBatch in / outCacheRoute
Anthropic$0.800$4.00$0.400 / $2.00read $0.080 / 5m $1.00 / 1h $1.60
Serverless
AWS Bedrock$0.800$4.00--
Serverless
GCP Vertex AI$0.800$4.00--
Serverless
OpenRouter$0.800$4.00--
Serverless

Capabilities

VisionReasoningStructured OutputsCode Execution

Benchmark peer barsfor Coding

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.

Rankings & picks(3)

Comparison and alternatives

Browse all comparisons →
Show all 4 popular comparisonssorted by 7-day search impressions