
Mistral Codestral 2508 vs GPT-4o (05-13)

Side-by-side comparison of specifications, capabilities, and pricing.

                    Mistral Codestral 2508    GPT-4o (05-13)
Released            2025-08-20                2024-05-13
Context window      256K                      128K
Parameters          24B MoE                   1.76T (8x222B MoE)*
Architecture        decoder only              mixture of experts
License             Unknown                   Proprietary
Knowledge cutoff    -                         2023-10

Capabilities

Vision
Multimodal
Reasoning
Function calling
Tool use
JSON mode
Code execution

Availability

Providers

Benchmarks

Benchmark                                   Mistral Codestral 2508    GPT-4o (05-13)
HellaSwag                                   -                         96.4
HumanEval                                   -                         90.2
Massive Multitask Language Understanding    -                         88.7