What is the context window of Mistral NeMo (2407)?

Mistral NeMo (2407) has a context window of 128k tokens.

How much does Mistral NeMo (2407) cost?

Mistral NeMo (2407) input pricing ranges from $0.02 to $0.30 per 1M tokens, depending on the provider.

When was Mistral NeMo (2407) released?

Mistral NeMo (2407) was released on 2024-07-18.

Which providers offer Mistral NeMo (2407)?

Mistral NeMo (2407) is available from 7 providers: Mistral AI Studio, OpenRouter, Fireworks AI, Bitdeer AI, SiliconFlow, Vercel AI Gateway, Novita AI.

Mistral NeMo (2407)

Name: Mistral NeMo (2407)
Author: Mistral AI

Released: 2024-07-18
Last refreshed: 2026-06-01

Long context

Mistral NeMo (2407) is worth evaluating for long-context workloads when its 128k context window and the chosen provider route both match the workload's requirements.

Use it for

Teams evaluating long context
Workloads that can use a 128k context window
Buyers comparing 7 tracked provider routes
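
Whether a workload "can use a 128k context window" comes down to a simple budget check. The sketch below is illustrative only: it assumes 128k means 128,000 tokens, that the prompt and the generated output share the same window, and that token counts come from the model's own tokenizer. The function name and parameters are hypothetical, not part of any provider API.

```python
# Assumption: "128k" is treated as 128,000 tokens; some providers may
# expose a different limit on their route.
CONTEXT_WINDOW_TOKENS = 128_000


def fits_in_context(prompt_tokens: int, max_output_tokens: int,
                    context_window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the reserved output budget fits.

    Assumes prompt and completion tokens count against the same window.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= context_window


if __name__ == "__main__":
    # A 100,000-token document with 4,000 tokens reserved for the answer.
    print(fits_in_context(100_000, 4_000))
    # A 127,000-token prompt leaves too little room for 2,000 output tokens.
    print(fits_in_context(127_000, 2_000))
```

Reserving the output budget up front avoids requests that are accepted but truncated mid-answer.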

Do not use it for

Vision or document-understanding workloads
Strict JSON or tool-calling flows

Specifications

Family: Mistral NeMo
Released: 2024-07-18
Context: 128k
Parameters: 12B
Architecture: Decoder-only
Knowledge cutoff: 2024-04
Specialization: general
Training: finetuned

Created by

Mistral AI

Enterprise AI solutions for trust and transparency.

Paris, France

Founded 2023

Pricing

Input / 1M tokens: $0.020
Output / 1M tokens: $0.030

Cheapest of 7 routes: OpenRouter
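
As a rough illustration of what these per-million rates mean for a single request, the sketch below applies the cheapest listed prices ($0.020 input, $0.030 output per 1M tokens). The function name is hypothetical, and real invoices may differ because of provider-specific rounding, minimums, or rate changes.

```python
from decimal import Decimal

# Cheapest listed route on this page, in USD per 1M tokens.
INPUT_USD_PER_1M = Decimal("0.020")
OUTPUT_USD_PER_1M = Decimal("0.030")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price: Decimal = INPUT_USD_PER_1M,
                      output_price: Decimal = OUTPUT_USD_PER_1M) -> Decimal:
    """Estimate the cost of one request from its token counts.

    Decimal is used instead of float so tiny per-request amounts
    are not distorted by binary floating-point rounding.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 100,000 input tokens and 2,000 output tokens: about $0.00206.
    print(estimate_cost_usd(100_000, 2_000))
```

To compare routes, pass another provider's rates as `input_price` and `output_price`.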

Providers (7)

Mistral AI Studio, OpenRouter, Fireworks AI, Bitdeer AI, SiliconFlow, Vercel AI Gateway, Novita AI

About

Mistral NeMo is a 12B-parameter open-source language model developed by Mistral AI in collaboration with NVIDIA and designed for efficient general-purpose and reasoning workloads. Its 128K-token context window suits long documents and multi-step reasoning over large inputs. The model is optimized for fast inference, which makes it a candidate for enterprise deployments that must balance output quality against resource cost.

Mistral NeMo is a 12-billion-parameter open-source language model developed jointly by Mistral AI and NVIDIA, released in July 2024. It supports a 128,000-token context window and is available under the Apache 2.0 license, making it freely usable for both research and commercial applications. The model was designed to replace Mistral 7B as the default open mid-tier model, offering substantially longer context and improved multilingual capability at a modest increase in parameter count.

A notable feature is the Tekken tokenizer, which has a vocabulary of approximately 131,000 tokens, significantly larger than the roughly 32,000-token vocabulary of the previous Mistral tokenizer. The larger vocabulary improves tokenization efficiency for multilingual text, including European and Asian languages: the same text encodes to fewer tokens, which lowers cost and latency for multilingual applications. The model architecture is otherwise a standard decoder-only transformer, similar to Mistral 7B, optimized for efficient inference on commodity hardware.

Mistral NeMo is available through Mistral AI's API (Mistral AI Studio), OpenRouter, Fireworks AI, Bitdeer AI, SiliconFlow, Vercel AI Gateway, and Novita AI. It can also be self-hosted using the weights on Hugging Face (mistralai/Mistral-Nemo-Instruct-2407). At 12B parameters, it is heavier than Mistral 7B but substantially cheaper to run than Mistral Small or Mistral Large. In 16-bit precision the weights alone occupy roughly 24 GB, so a single GPU with 24GB VRAM is a tight fit; 8-bit or lower-precision quantization leaves more headroom for long contexts. For applications currently using Mistral 7B that need longer context or better multilingual coverage, Mistral NeMo is the natural upgrade path.
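
The single-GPU sizing above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below multiplies the nominal 12B parameter count by the bytes each parameter occupies at a given precision. It is an estimate of weight storage only: KV cache, activations, and framework overhead are excluded, and the helper name and dtype table are illustrative assumptions rather than any library's API.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "fp16": 2.0,
    "fp8": 1.0,
    "int4": 0.5,
}

GIB = 2**30  # bytes in one gibibyte


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory for model weights alone, in GiB.

    Excludes KV cache, activations, and runtime overhead, all of which
    grow with batch size and context length.
    """
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unknown dtype: {dtype}")
    return num_params * BYTES_PER_PARAM[dtype] / GIB


if __name__ == "__main__":
    nominal_params = 12e9  # "12B" as listed; the exact count may differ
    for dtype in ("bf16", "fp8", "int4"):
        print(f"{dtype}: {weight_memory_gib(nominal_params, dtype):.1f} GiB")
```

At 16-bit precision the weights alone come to a little over 22 GiB, which is why a 24 GB card leaves little room for a long-context KV cache without quantization.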
