OLMo 3.1 32B Instruct

Name: OLMo 3.1 32B Instruct
Author: Allen Institute for Artificial Intelligence (AI2)

Released: 2026-02-01
Last refreshed: 2026-04-27
Status: Researched 88d ago

Open source, Commercial use: permitted, Coding, Agents, Classification, JSON / Tool use

OLMo 3.1 32B Instruct is a released open-source model for coding, agents, and classification; evaluate it while provider pricing coverage matures.

Use it for

Teams evaluating coding, agents, and classification
Workloads that can use a 64k context window

Do not use it for

Cost-sensitive launches that need sourced token pricing
Vision or document-understanding workloads
Teams that need a tracked hosted API route today

Specifications

Family: OLMo
Released: 2026-02-01
Context: 64k
Parameters: 32B
Architecture: Decoder Only
Knowledge cutoff: 2024-12
Specialization: general
Openness: Open source
License: Apache 2.0 (OSI-approved), Commercial use: permitted
Weights: Available
Code: Unknown
Training: Pretrained

Created by

Allen Institute for Artificial Intelligence (AI2)

Advocating for open science and source

Seattle, Washington, United States

Founded 2014

Pricing

No tracked provider token pricing is available yet.

Links

Website HuggingFace

About

OLMo 3.1 32B Instruct is Allen Institute for AI's large-scale 32B instruction-tuned model engineered for high performance across language understanding, reasoning, and coding tasks.

OLMo 3.1 32B Instruct is an open-source model in the OLMo family. The structured metadata tracks a 64k-token context window, function calling, tool use, and structured outputs. Headline tracked benchmarks include AIME 2024 (67.8), AIME 2025 (57.9), and Google-Proof Q&A (48.6).
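As a rough illustration of what the 64k-token context window means for a request, the sketch below checks whether a prompt plus a reserved output budget fits in the window. It assumes "64k" means 65,536 tokens (it could also mean 64,000) and that generated tokens share the window with the prompt, which is typical for decoder-only models; the function name and parameters are hypothetical and not part of any OLMo or provider API.

```python
# Assumption: "64k" is read as 64 * 1024 tokens; the page does not say
# whether the exact limit is 65,536 or 64,000.
CONTEXT_WINDOW_TOKENS = 64 * 1024


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the reserved output budget fit in the window."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= window


if __name__ == "__main__":
    print(fits_context(60_000, 4_000))  # 64,000 tokens needed
    print(fits_context(64_000, 4_000))  # 68,000 tokens needed
```

Token counts should come from the model's own tokenizer; a character-based estimate is only a rough guide.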

Top use-case fit: coding, agents, and build tasks

Coding

2 relevant benchmarks in the decision map.

Agents

Included by capability and metadata signals in the decision map.

Classification

1 relevant benchmark in the decision map.

Provider price ladder

No tracked provider token pricing is available for this model yet.

Capabilities

Function Calling, Tool Use, Structured Outputs

Benchmark peer bars for Coding

HumanEval (Rank 25 of 97)

Model	Score
Claude Sonnet 4.6	98.0
(name not listed)	96.7
Claude Opus 4.6	95.0
Grok-3	94.5
OLMo 3.1 32B Instruct (current)	86.7

LiveCodeBench (Rank 49 of 55)

Model	Score
DeepSeek V4 Pro	93.5
Gemini 3.1 Pro Preview	91.7
DeepSeek V4 Flash	91.6
Qwen3.7-Max	91.6
OLMo 3.1 32B Instruct (current)	54.7

Benchmark scores (7)

Scores are benchmark-specific and direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether a margin is meaningful for your workload.

Benchmark	Score	Version	Observed	Evaluation	Source
AIME 2024	67.8	AIME 2024 (accuracy)	2026-06-07	—	Source
AIME 2025	57.9	AIME 2025 (accuracy)	2026-06-07	—	Source
Google-Proof Q&A	48.6	GPQA (accuracy)	2026-06-07	—	Source
HumanEval	86.7	HumanEval (pass@1)	2026-06-07	—	Source
LiveCodeBench	54.7	LiveCodeBench v3 (accuracy)	2026-06-07	—	Source
MATH-500	93.4	MATH benchmark (accuracy)	2026-06-07	—	Source
Massive Multitask Language Understanding	80.9	From official HuggingFace model card (accuracy)	2026-06-07	—	Source
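To make the point about margins concrete, here is a minimal sketch that turns the two coding peer bars on this page into a gap to the leader and a count of models outranked. The scores and ranks are copied from this page; the helper names are illustrative, and the sketch assumes that higher scores are better on both suites and that ranks contain no ties.

```python
# Figures copied from the coding peer bars on this page.
PEER_BARS = {
    "HumanEval": {"leader": 98.0, "score": 86.7, "rank": 25, "field": 97},
    "LiveCodeBench": {"leader": 93.5, "score": 54.7, "rank": 49, "field": 55},
}


def gap_to_leader(entry: dict) -> float:
    """Points between the top listed score and this model's score."""
    return round(entry["leader"] - entry["score"], 1)


def models_outranked(entry: dict) -> int:
    """Number of ranked models below this one (assumes no ties)."""
    return entry["field"] - entry["rank"]


def summarize(bars: dict) -> dict:
    """Gap, outranked count, and outranked share for each benchmark."""
    return {
        name: {
            "gap": gap_to_leader(entry),
            "outranked": models_outranked(entry),
            "share_outranked": round(models_outranked(entry) / entry["field"], 2),
        }
        for name, entry in bars.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(PEER_BARS).items():
        print(name, stats)
```

On these figures the model trails the HumanEval leader by 11.3 points while outranking 72 of 97 models, but trails the LiveCodeBench leader by 38.8 points while outranking only 6 of 55, so the two coding suites give quite different pictures.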

Migration checks

No linked migration route is available for this model yet.

Frequently asked questions

What is the context window of OLMo 3.1 32B Instruct?

OLMo 3.1 32B Instruct has a context window of 64k tokens.

When was OLMo 3.1 32B Instruct released?

OLMo 3.1 32B Instruct was released on 2026-02-01.

What benchmarks has OLMo 3.1 32B Instruct been tested on?

OLMo 3.1 32B Instruct has been evaluated on 7 benchmarks, including AIME 2024, AIME 2025, Google-Proof Q&A, HumanEval, and LiveCodeBench.