Concepts & capability filters

Capability filtercapabilityintermediate

Code execution

Also known as: sandboxed code, code interpreter, computer use

run code as part of a workflow

81

matching active models

22

tracked providers

58

models with routes

model.code_execution

Definition

Code execution capability means a model route or surrounding product can run generated code, calculations, or sandboxed scripts as part of completing a task. For model selection, treat it as an execution-surface flag and still inspect the provider route before relying on it.

Models With Code execution

Showing the first 80 decision-sorted matches, with model flags and provider-route evidence from seed data.

81 matches

ModelReleaseContextCapabilitiesProvider route

GPT-4 Vision Preview

2023-11-06

Researched 134d ago

128K

128,000 tokens

128K contextVisionMultimodalCode exec

No tracked provider route

Post-training variant of GLM-5 from Zhipu AI with enhanced reasoning and coding capabilities. 754B parameters (40B active) in Mixture of Experts architecture. Optimized for complex agentic workflows and multi-step reasoning. Available via Z.AI API and open weights under the MIT license.

2026-04-07

Researched 11d ago

200k

200,000 tokens

200k contextReasoningTool useFunctionsJSONCode exec

$1.05 in / $3.50 out / 1M tokens

3 routes

Claude 3 Sonnet

Claude 3 Sonnet by Anthropic is a versatile large language AI model, balancing intelligence and speed for diverse enterprise use cases. It is part of the Claude 3 family, positioned between the powerful Opus and the faster Haiku models. Sonnet excels in nuanced content creation, accurate summarization, and complex scientific query handling while also showcasing proficiency in non-English languages and coding tasks. Additionally, it enhances vision capabilities with exceptional skills in visual reasoning, such as interpreting charts, graphs, and transcribing text from imperfect images, which benefits industries like retail, logistics, and finance. Operated at twice the speed of Claude 3 Opus, Sonnet is efficient in context-sensitive customer support and multi-step workflows. It has achieved AI Safety Level 2 (ASL-2) and is accessible through multiple platforms, including Claude.ai, the Claude iOS app, the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI.

2024-03-04

Researched 26d ago

200K

200,000 tokens

200K contextReasoningVisionMultimodalJSONCode exec

$3.00 in / $15.00 out / 1M tokens

2 routes · 1 cache

DeepSeek R1: Reasoning-optimized model with extended thinking capabilities. 128K context.

2025-01-20

Researched 26d ago

128K

128,000 tokens

128K contextReasoningJSONCode exec

$0.100 in / $0.300 out / 1M tokens

13 routes

Claude 3.7 Sonnet

Claude 3.7 Sonnet is Anthropic's advanced model with extended thinking capabilities, offering state-of-the-art reasoning for complex tasks.

2024-03-04

Researched 26d ago

200K

200,000 tokens

200K contextReasoningVisionMultimodalTool useFunctions

$3.00 in / $15.00 out / 1M tokens

6 routes · 1 batch

Qwen2.5-Coder-32B-Instruct

Instruction-optimized 32B code flagship for production systems requiring top-tier code reasoning, generation, and multi-file analysis.

2024-11-12

Researched 26d ago

—

No window data

JSONCode exec

$0.180 in / $0.180 out / 1M tokens

5 routes

GPT-5.2 is OpenAI's incremental update in the GPT-5 series offering improvements in agentic coding and long-context performance at 128K context.

2025-12-11

Researched 1d ago

400K

400,000 tokens

400K contextReasoningVisionMultimodalTool useFunctions

$1.75 in / $14.00 out / 1M tokens

2 routes

OpenAI o3 reasoning model with advanced multi-step problem-solving capabilities.

2025-03-31

Researched 5d ago

200K

200,000 tokens

200K contextReasoningJSONCode execPrompt cacheBatch

$2.00 in / $8.00 out / 1M tokens

2 routes · 1 batch · 1 cache

Qwen2.5-Coder-32B

32B flagship code specialist matching GPT-4o performance with SOTA multi-language repair (75.2% on MdEval) and 3.7% improvement on repo-wide context benchmarks.

2024-11-12

Researched 26d ago

—

No window data

JSONCode exec

$0.200 in / $0.200 out / 1M tokens

2 routes

o1-mini (09-12)

OpenAI o1-mini model emphasizing fast reasoning for smaller tasks and problems.

2024-09-12

Researched 134d ago

128K

128,000 tokens

128K contextReasoningCode exec

$1.10 in / $4.40 out / 1M tokens

1 route

GPT-4o Audio Preview (12-17)

Updated GPT-4o audio model with improved multimodal audio-text understanding.

2024-12-17

Researched 134d ago

128K

128,000 tokens

128K contextVisionCode exec

No tracked provider route

2024-11-20

Researched 134d ago

128K

128,000 tokens

128K contextVisionCode exec

No tracked provider route

GPT-4o Audio Preview (10-01)

GPT-4o model with integrated audio I/O capabilities for multimodal interactions.

2024-10-01

Researched 134d ago

128K

128,000 tokens

128K contextVisionCode exec

No tracked provider route

o1-preview (09-12)

OpenAI o1 preview model emphasizing reasoning and complex problem-solving.

2024-09-12

Researched 134d ago

128K

128,000 tokens

128K contextReasoningCode exec

No tracked provider route

The chatgpt-4o-latest model version continuously points to the version of GPT-4o used in ChatGPT, and is updated frequently, when there are significant changes.

2024-05-13

Researched 134d ago

128K

128,000 tokens

128K contextVisionCode exec

No tracked provider route

Cerebras GPT 590M

The Cerebras GPT 590M is a robust language model featuring 590 million parameters and a transformer architecture akin to GPT-3. It is optimized for natural language processing tasks such as text generation, completion, and summarization. Trained using the Chinchilla scaling laws and Cerebras' weight streaming technology, this model achieves high efficiency, offering faster training times and reduced costs. The Andromeda AI supercomputer facilitated its training on the extensive Pile dataset. Open-sourced under the Apache 2.0 license, it primarily supports English and requires additional tuning for other languages and conversational applications due to its lack of reinforcement learning from human feedback.

2023-03-13

Researched 134d ago

—

No window data

ReasoningCode exec

No tracked provider route

Megatron GPT 5B

The NeMo Megatron-GPT 5B is a transformer-based language model with 5 billion trainable parameters, inspired by models like GPT-2 and GPT-3 1. Its architecture is a decoder-only transformer, designed to sequentially process input for text generation and language understanding tasks 15. Trained on "The Piles" dataset by Eleuther.AI, it leverages its substantial dataset to produce coherent and natural-sounding text while also answering questions and completing sentences 5. Despite its strengths, the model can reflect biases and toxic language from its dataset, sometimes yielding inappropriate outputs. Evaluations on benchmarks like the LM Evaluation Test Suite showcase its varying performance, scoring 0.5566 on ARC-Easy and 0.6133 on Winogrande 1, indicating both strengths and limitations across different tasks.

2019-08-28

Researched 134d ago

—

No window data

ReasoningCode exec

No tracked provider route

OpenAI's previous intelligent reasoning model with configurable reasoning effort. Released August 2025. Supports minimal, low, medium, and high reasoning levels. Succeeded by GPT-5.1 and later models.

2025-08-07

Researched 5d ago

400K

400,000 tokens

400K contextReasoningVisionMultimodalTool useFunctions

$1.25 in / $10.00 out / 1M tokens

3 routes · 1 batch · 1 cache

Near-frontier intelligence for cost-sensitive, low-latency, high-volume workloads. Released August 2025. Replaces o4-mini (shutting down Oct 2026).

2025-08-07

Researched 5d ago

400K

400,000 tokens

400K contextReasoningVisionMultimodalTool useFunctions

$0.250 in / $2.00 out / 1M tokens

3 routes · 1 batch · 1 cache

GPT-5 Pro is OpenAI's most advanced GPT-5 tier, offering major improvements in reasoning, code quality, and user experience for enterprise and power-user applications at 400K context.

2025-10-01

Researched 18d ago

400K

400,000 tokens

400K contextVisionMultimodalTool useFunctionsJSON

No tracked provider route

Fastest, cheapest GPT-5 variant for summarization and classification tasks. Also available via Realtime API.

2025-08-07

Researched 5d ago

400K

400,000 tokens

400K contextReasoningVisionMultimodalTool useFunctions

$0.050 in / $0.400 out / 1M tokens

3 routes · 1 batch · 1 cache

Premium extended-reasoning GPT-5.4 variant producing smarter and more precise responses. Replacement for o3-deep-research and o4-mini-deep-research. No prompt caching discount.

2026-03-01

Researched 5d ago

1.1M

1,050,000 tokens

1.1M contextReasoningVisionMultimodalTool useFunctions

$30.00 in / $180.00 out / 1M tokens

2 routes · 1 batch

Speed-optimized Gemini 3 model from Google DeepMind with frontier intelligence. Combines high performance with lower cost and latency. 1M token context window.

2025-12-17

Researched 134d ago

1M

1,000,000 tokens

1M contextVisionMultimodalTool useFunctionsCode exec

$0.100 in / $0.400 out / 1M tokens

2 routes

Google DeepMind's most advanced reasoning Gemini model. Part of the Gemini 3 series with frontier-class intelligence, multimodal understanding, and 1M token context window.

2025-12-11

Researched 134d ago

1M

1,000,000 tokens

1M contextVisionMultimodalTool useFunctionsCode exec

$1.25 in / $5.00 out / 1M tokens

2 routes

GPT-5.2 Pro is OpenAI's most advanced GPT-5.2 tier offering major improvements in agentic coding and long-context performance for enterprise use at 400K context.

2026-01-01

Researched 18d ago

400K

400,000 tokens

400K contextVisionMultimodalTool useFunctionsJSON

No tracked provider route

GPT-5.1-Codex is a coding-specialized version of GPT-5.1, optimized for software engineering and agentic coding workflows at 400K context.

2025-12-01

Researched 18d ago

400K

400,000 tokens

400K contextVisionMultimodalTool useFunctionsJSON

No tracked provider route

GPT-5 Codex is OpenAI's coding-specialized variant of GPT-5, optimized for software engineering workflows, code generation, and agentic coding tasks at 400K context.

2025-10-01

Researched 18d ago

400K

400,000 tokens

400K contextVisionMultimodalTool useFunctionsJSON

No tracked provider route

Gemini 3 Flash Preview

Frontier-class performance rivaling larger models at a fraction of the cost. Most intelligent Gemini model built for speed, combining frontier intelligence with superior search and grounding. $0.50 input / $3.00 output per 1M tokens.

2025-12-17

Researched 26d ago

1M

1,000,000 tokens

1M contextVisionMultimodalTool useFunctionsJSON

$0.500 in / $3.00 out / 1M tokens

3 routes

Advanced o3 reasoning model for complex math, science, and coding problems. Supports tools, vision, and extended thinking. Available to Pro users. Released June 10, 2025.

2025-06-10

Researched 26d ago

—

No window data

ReasoningVisionMultimodalTool useFunctionsJSON

$20.00 in / $80.00 out / 1M tokens

2 routes

OpenAI's GPT-4.1 model released April 2025, excelling at coding tasks, precise instruction following, and web development. Outperforms GPT-4o in these areas with a 1 million token context window. Available via API and in ChatGPT for Plus, Pro, Team, Enterprise, and Edu users.

2025-04-01

Researched 5d ago

1M

1,047,576 tokens

1M contextVisionMultimodalTool useFunctionsJSON

$2.00 in / $8.00 out / 1M tokens

3 routes · 1 batch · 1 cache

Fast and efficient small model from OpenAI replacing GPT-4o mini. Released April 2025 alongside GPT-4.1. Shows improvements in instruction-following, coding, and intelligence with a 1 million token context window. Available in ChatGPT for paid users.

2025-04-01

Researched 5d ago

1M

1,047,576 tokens

1M contextVisionMultimodalTool useFunctionsJSON

$0.400 in / $1.60 out / 1M tokens

3 routes · 1 cache

KAT Coder Pro V2

KAT-Coder-Pro V2 is the latest high-performance coding model in KwaiPilot's KAT-Coder series, designed for complex enterprise-grade coding tasks and agentic software development at 256K context.

2026-03-01

Researched 18d ago

256K

256,000 tokens

256K contextTool useFunctionsJSONCode exec

$0.300 in / $1.20 out / 1M tokens

1 route

Claude 3.5 Haiku

Claude 3.5 Haiku is Anthropic's latest AI model, known for its speed and efficiency while maintaining high intelligence. It is optimized for applications needing rapid response, like interactive chatbots and real-time content moderation. Initially text-only, future plans include image input capabilities. It excels in delivering fast, accurate code suggestions, processing and categorizing information swiftly, and handling large volumes of user interactions. Priced accessibly, it offers advanced coding, tool use, and reasoning abilities. Though initially surpassing Claude 3 Haiku in benchmarks, its pricing reflects its enhanced performance 123457.

2024-10-22

Researched 26d ago

200k

200,000 tokens

200k contextReasoningVisionJSONCode execBatch

$0.800 in / $4.00 out / 1M tokens

5 routes · 1 batch · 1 cache

Morph V3 Fast is Morph's fastest code apply model at ~10,500 tokens/sec with 96% accuracy, optimized for rapid code transformations in AI coding workflows.

2026-03-01

Researched 18d ago

80K

80,000 tokens

Code exec

No tracked provider route

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits directly into source files at 256K context, designed for precise apply operations in AI coding agents.

2026-01-01

Researched 18d ago

256K

256,000 tokens

256K contextCode exec

No tracked provider route

Enhanced reasoning and grounded retrieval model from DeepSeek with multimodal text and image understanding.

2025-08-21

Researched 1d ago

64K

64,000 tokens

VisionMultimodalJSONCode exec

$0.560 in / $1.68 out / 1M tokens

6 routes

Claude 3.5 Sonnet

Claude 3.5 Sonnet, the latest in Anthropic's line of large language models, merges state-of-the-art reasoning, coding, and natural language understanding capabilities with advanced multi-modal processing. Released in October 2024, it excels in benchmarks against previous models and competitors, thanks to its scalable attention mechanisms and massive neural network architecture. Its dynamic routing enables specialization in various tasks, supporting applications from software development and data analysis to customer support and content creation. Users benefit from its "Artifacts" feature for real-time collaborative workflows and can access the model through platforms like Claude.ai and APIs at competitive pricing rates.

2024-06-20

Researched 26d ago

200K

200,000 tokens

200K contextReasoningVisionMultimodalFunctionsJSON

$3.00 in / $15.00 out / 1M tokens

6 routes · 1 cache

Claude Opus 4.5

Claude Opus 4.5 available on AWS Bedrock

2025-11-01

Researched 26d ago

200K

200,000 tokens

200K contextReasoningVisionMultimodalTool useFunctions

$5.00 in / $25.00 out / 1M tokens

5 routes · 1 batch

OpenAI GPT-4o: Flagship multimodal model with vision, function calling, and broad capability. $2.50/M input, $10/M output.

2024-05-13

Researched 5d ago

128K

128,000 tokens

128K contextVisionMultimodalTool useFunctionsJSON

$2.50 in / $10.00 out / 1M tokens

4 routes · 1 batch · 1 cache

Claude Mythos Preview

Claude Mythos Preview is Anthropic's frontier research model, positioned above the public Claude 4 family and released exclusively via invitation-only Project Glasswing to roughly 12 launch partners and over 40 organizations working on critical infrastructure. No public API or self-serve access. Specializes in defensive cybersecurity — autonomously identified zero-day vulnerabilities including a 27-year-old OpenBSD TCP SACK remote code execution bug and a 17-year-old FreeBSD NFS RCE. Codenamed Capybara internally. Scores 93.9% on SWE-bench Verified, 82.0% on Terminal-Bench 2.0, and 97.6% on USAMO 2026. Partner pricing: $25/$125 per million tokens (input/output). Max output: 128K tokens. Knowledge cutoff: December 2025.

2026-04-07

Researched 14d ago

1M

1,000,000 tokens

1M contextReasoningVisionMultimodalTool useFunctions

$25.00 in / $125.00 out / 1M tokens

1 route

Morph V3 Large is Morph's high-accuracy code apply model, achieving ~98% accuracy for precise code transformations at ~4,500 tokens/sec and 256K context.

2026-03-01

Researched 18d ago

256K

256,000 tokens

256K contextCode exec

No tracked provider route

Relace Search uses parallel file view and grep tools to explore a codebase and return relevant file sections with 256K context, specialized for AI coding agent pipelines.

2026-01-01

Researched 18d ago

256K

256,000 tokens

256K contextTool useCode exec

No tracked provider route

Arcee Coder Large

Coder Large is Arcee AI's 32B code-focused model, trained on permissively-licensed GitHub repositories and fine-tuned from Qwen 2.5-Instruct for software engineering tasks.

2025-12-01

Researched 18d ago

32K

32,000 tokens

Tool useFunctionsJSONCode exec

No tracked provider route

Cogito v2.1 671B

Cogito v2.1 671B MoE is Deep Cogito's strongest open model, matching performance of frontier closed models. It features deep thinking capabilities and strong results on coding, reasoning, and math benchmarks.

2025-11-19

Researched 8d ago

128K

128,000 tokens

128K contextReasoningTool useFunctionsJSONCode exec

No tracked provider route

Mistral Medium 3

Mistral Medium 3 is Mistral AI's enterprise-grade model delivering frontier-level capabilities including vision, function calling, and code generation at competitive cost for business applications.

2025-05-01

Researched 18d ago

128K

128,000 tokens

128K contextVisionMultimodalTool useFunctionsJSON

No tracked provider route

Claude Sonnet 4.6

Claude Sonnet 4.6 is Anthropic's best combination of speed and intelligence. Proprietary decoder-only model with 1M-token context, 64K max output, multimodal vision, extended thinking, and function calling. Available via Anthropic API, AWS Bedrock, GCP Vertex AI, and OpenRouter at $3/1M input and $15/1M output tokens.

2026-02-17

Researched 7d ago

1M

1,000,000 tokens

1M contextReasoningVisionMultimodalTool useFunctions

$3.00 in / $15.00 out / 1M tokens

4 routes · 1 batch · 1 cache

Claude Opus 4.7

Claude Opus 4.7 is Anthropic's generally available flagship model with 1M context, 128K max output, adaptive thinking, and a new tokenizer with roughly 555K words per 1M tokens.

2026-04-16

Researched 1d ago

1M

1,000,000 tokens

1M contextReasoningVisionMultimodalTool useFunctions

$5.00 in / $25.00 out / 1M tokens

5 routes · 1 batch · 1 cache

Claude Opus 4.6

Claude Opus 4.6 available on AWS Bedrock

2026-02-05

Researched 26d ago

1M

1,000,000 tokens

1M contextReasoningVisionMultimodalTool useFunctions

$5.00 in / $25.00 out / 1M tokens

4 routes · 1 batch · 1 cache

GLM-4.7 is Z.ai's flagship text model featuring enhanced programming capabilities and deeper reasoning at 200K context, succeeding GLM-4.6.

2026-03-01

Researched 18d ago

200K

200,000 tokens

200K contextTool useFunctionsJSONCode exec

$0.600 in / $2.20 out / 1M tokens

1 route

Mistral Small 3.2 24B

Mistral Small 3.2 24B is an updated instruction-tuned model from Mistral optimized for function calling, structured outputs, and vision tasks at 128K context with open weights.

2025-06-01

Researched 18d ago

128K

128,000 tokens

128K contextVisionMultimodalTool useFunctionsJSON

Pricing not tracked / 1M tokens

1 route

DeepSeek R1 0528

2025-01-01

Researched 26d ago

160K

160,000 tokens

160K contextReasoningJSONCode exec

$0.100 in / $0.300 out / 1M tokens

5 routes

Gemini 3.1 Pro Preview

Google: Gemini 3.1 Pro Preview available via OpenRouter. Pricing: $2/1M input, $12/1M output.

2026-02-19

Researched 26d ago

1M

1,000,000 tokens

1M contextVisionMultimodalTool useFunctionsJSON

$2.00 in / $12.00 out / 1M tokens

4 routes

Gemini 2.5 Flash

Google: Gemini 2.5 Flash available via OpenRouter. Pricing: $0.3/1M input, $2.5/1M output.

2025-06-17

Researched 26d ago

1M

1,000,000 tokens

1M contextVisionMultimodalTool useFunctionsJSON

$0.300 in / $2.50 out / 1M tokens

4 routes

DeepSeek V3.2 available on AWS Bedrock

2025-01-01

Researched 26d ago

160K

160,000 tokens

160K contextJSONCode exec

$0.252 in / $0.378 out / 1M tokens

4 routes

Qwen3-Coder-480B-A35B-Instruct

Qwen Qwen3 Coder 480B A35B Instruct available on AWS Bedrock

2025-01-01

Researched 26d ago

256K

256,000 tokens

256K contextJSONCode exec

$1.20 in / $1.20 out / 1M tokens

4 routes

Gemini 2.5 Flash Lite Preview 09-2025

Google: Gemini 2.5 Flash Lite Preview 09-2025 available via OpenRouter. Pricing: $0.1/1M input, $0.4/1M output.

2025-09-01

Researched 26d ago

1M

1,000,000 tokens

1M contextVisionMultimodalTool useFunctionsJSON

$0.100 in / $0.400 out / 1M tokens

3 routes

Gemini 2.5 Flash Lite

Google: Gemini 2.5 Flash Lite available via OpenRouter. Pricing: $0.1/1M input, $0.4/1M output.

2025-07-22

Researched 26d ago

1M

1,000,000 tokens

1M contextVisionMultimodalTool useFunctionsJSON

$0.100 in / $0.400 out / 1M tokens

3 routes

Google: Gemini 2.5 Pro available via OpenRouter. Pricing: $1.25/1M input, $10/1M output.

2025-06-17

Researched 26d ago

1M

1,000,000 tokens

1M contextVisionMultimodalTool useFunctionsJSON

$1.25 in / $10.00 out / 1M tokens

3 routes

Gemini 3.1 Flash-Lite

Gemini 3.1 Flash-Lite is Google's generally available low-latency Gemini 3.1 model, launched May 7, 2026. It is optimized for high-volume, cost-sensitive workloads with text, image, and video inputs, a 1M token context window, and a 66K token maximum output. The GA model uses the stable API ID gemini-3.1-flash-lite and replaces gemini-3.1-flash-lite-preview, which is scheduled to shut down on May 25, 2026. Pricing is $0.25 per 1M input tokens and $1.50 per 1M output tokens.

2026-05-07

Researched 7d ago

1M

1,048,576 tokens

1M contextVisionMultimodalTool useFunctionsJSON

Google AI Studio

$0.250 in / $1.50 out / 1M tokens

2 routes

DeepSeek V3.2 Exp

DeepSeek: DeepSeek V3.2 Exp available via OpenRouter. Pricing: $0.27/1M input, $0.41/1M output.

2025-04-10

Researched 26d ago

164K

164,000 tokens

164K contextJSONCode exec

$0.270 in / $0.410 out / 1M tokens

2 routes

Qwen3-Coder-14B-Instruct

2025-01-01

Researched 134d ago

256K

256,000 tokens

256K contextCode exec

$0.200 in / $0.200 out / 1M tokens

1 route

Qwen3-Coder-32B-Instruct

2025-01-01

Researched 134d ago

256K

256,000 tokens

256K contextCode exec

$0.900 in / $0.900 out / 1M tokens

1 route

Qwen3-Coder-3B-Instruct

2025-01-01

Researched 134d ago

256K

256,000 tokens

256K contextCode exec

$0.100 in / $0.100 out / 1M tokens

1 route

Qwen3-Coder-72B-Instruct

2025-01-01

Researched 134d ago

256K

256,000 tokens

256K contextCode exec

$0.900 in / $0.900 out / 1M tokens

1 route

Qwen3-Coder-8B-Instruct

2025-01-01

Researched 134d ago

256K

256,000 tokens

256K contextCode exec

$0.200 in / $0.200 out / 1M tokens

1 route

GPT-5.4 Nano is the smallest and fastest variant in the GPT-5.4 family, optimized for edge deployment and low-latency tasks. Model ID: gpt-5.4-nano.

2026-03-05

Researched 5d ago

400K

400,000 tokens

400K contextMultimodalTool useFunctionsJSONCode exec

$0.200 in / $1.25 out / 1M tokens

2 routes · 1 batch · 1 cache

GPT-5.5 Pro is OpenAI's premium variant of GPT-5.5, released April 23, 2026. Targets large quality gains for business, legal, education, and data science use cases. Scores 39.6% on FrontierMath Tier 4 (postdoctoral-level math problems), compared to 22.9% for Claude Opus 4.7. Priced at 6× the standard GPT-5.5 API rate. Available to ChatGPT subscribers and via API.

2026-04-23

Researched 5d ago

1.1M

1,050,000 tokens

1.1M contextReasoningVisionMultimodalTool useFunctions

$30.00 in / $180.00 out / 1M tokens

2 routes · 1 batch

GPT-5.4 Mini is a smaller, cost-efficient variant of GPT-5.4 with a 400K token context window. Designed for tasks requiring long-context processing at lower cost. Model ID: gpt-5.4-mini.

2026-03-05

Researched 5d ago

400K

400,000 tokens

400K contextReasoningMultimodalTool useFunctionsJSON

$0.750 in / $4.50 out / 1M tokens

2 routes · 1 batch · 1 cache

GPT-5.5 is OpenAI's fully retrained agentic model, released April 23, 2026. Optimized for agentic coding, computer use, knowledge work, and early scientific research. Achieves 82.7% on Terminal-Bench 2.0, 84.9% on GDPval, and 58.6% on SWE-Bench Pro. Individual factual claims are 23% more likely to be correct versus GPT-5.4, with factual errors 3% less frequent. Uses fewer tokens than GPT-5.4 for equivalent tasks. Supports text and image inputs. Available to ChatGPT Plus, Business, and Enterprise subscribers; API access coming soon. Model ID: gpt-5.5.

2026-04-23

Researched 5d ago

1.1M

1,050,000 tokens

1.1M contextReasoningVisionMultimodalTool useFunctions

$5.00 in / $30.00 out / 1M tokens

2 routes · 1 batch · 1 cache

GPT-5.4 is OpenAI's flagship frontier reasoning model, released March 5, 2026. It incorporates advances from GPT-5.3-Codex for coding and agentic workflows, and adds 'Thinking' mode with editable reasoning plans. Key capabilities include computer use (navigating interfaces via Playwright), image understanding and generation integration, full-stack web app generation, tool calling, and deep research. Knowledge cutoff is August 31, 2025. Model ID: gpt-5.4.

2026-03-05

Researched 5d ago

1.1M

1,050,000 tokens

1.1M contextReasoningMultimodalTool useFunctionsJSON

$2.50 in / $15.00 out / 1M tokens

2 routes · 1 batch · 1 cache

Most capable agentic coding model from OpenAI. Optimized for long-horizon, agentic coding tasks in the Codex CLI and API. Note: GPT-5.3-Codex-Spark is a distinct ChatGPT Pro research preview (not API-accessible).

2026-02-05

Researched 5d ago

400K

400,000 tokens

400K contextReasoningVisionTool useFunctionsJSON

$1.75 in / $14.00 out / 1M tokens

2 routes · 1 cache

Claude Haiku 4.5

Claude Haiku 4.5 available on AWS Bedrock

2025-10-01

Researched 26d ago

200k

200,000 tokens

200k contextVisionMultimodalTool useFunctionsJSON

$0.800 in / $4.00 out / 1M tokens

7 routes · 1 batch

Qwen3-Coder-Next

Qwen3-Coder-Next is an ultra-sparse Mixture-of-Experts coding agent model from Alibaba's Qwen team, released February 3, 2026 under Apache 2.0. It has 80B total parameters with 3B active at inference, delivering substantially higher throughput than comparable dense models. It supports a native 256K context window, function calling, structured outputs, Claude Code, Qwen Code, Cline, Kilo, and other scaffold templates. Benchmarks reported in the DAT-3724 datapack include SWE-Bench Pro 44.3%, SWE-Bench Resolved 70.6%, and TerminalBench 2 36.2%.

2026-02-03

Researched 11d ago

256K

256,144 tokens

256K contextReasoningTool useFunctionsJSONCode exec

$0.120 in / $0.800 out / 1M tokens

2 routes

GPT-5.5 Instant

GPT-5.5 Instant is OpenAI's latest Instant model used in ChatGPT, released May 5, 2026 as the new default ChatGPT model and exposed in the API as chat-latest. OpenAI says the update improves factuality, image analysis, STEM answers, web-search decisions, personalization from past chats/files/connected Gmail, and concise conversational style. OpenAI reports 52.5% fewer hallucinated claims than GPT-5.3 Instant on high-stakes prompts and 37.3% fewer inaccurate claims on difficult conversations flagged for factual errors.

2026-05-05

Researched 5d ago

400K

400,000 tokens

400K contextVisionMultimodalTool useFunctionsJSON

$1.50 in / $6.00 out / 1M tokens

1 route

Mistral Devstral 2 123B

Mistral Devstral 2 123B available on AWS Bedrock

2025-12-01

Researched 26d ago

—

No window data

JSONCode exec

$0.400 in / $2.00 out / 1M tokens

1 route

Qwen3-Coder-30B-A3B-Instruct

Qwen Qwen3 Coder 30B A3B Instruct available on AWS Bedrock

2025-12-01

Researched 26d ago

—

No window data

JSONCode exec

$0.150 in / $0.620 out / 1M tokens

1 route

Llama 4 Maverick 17B

Multimodal Llama 4 with 128 experts, optimized for fast responses with minimal computational cost

2025-10-01

Researched 26d ago

128k

128,000 tokens

128k contextMultimodalJSONCode execBatch

$0.240 in / $0.970 out / 1M tokens

1 route · 1 batch

Qwen3.5-Coder-32B

Larger code specialist variant for complex programming tasks and large codebase analysis.

2026-01-15

Researched 134d ago

256k

256,000 tokens

256k contextCode exec

No tracked provider route

Qwen3.5-Coder-7B

Code-focused variant of Qwen3.5 optimized for code generation, understanding, and software engineering tasks.

2026-01-15

Researched 134d ago

256k

256,000 tokens

256k contextCode exec

No tracked provider route

GPT-5.2-Codex is OpenAI's agentic coding model built on GPT-5.2, released December 18, 2025. Optimized for long-horizon professional software engineering tasks including large refactors, multi-file changes, and codebase migrations. Features context compaction for handling tasks exceeding a single context window, enhanced vision for interpreting screenshots, diagrams, and UI, and strengthened safeguards for defensive cybersecurity workflows. Available to paid ChatGPT users and via the Responses API.

2025-12-18

Researched 27d ago

—

No window data

ReasoningVisionMultimodalTool useFunctionsCode exec

No tracked provider route