LLM Reference

Mixtral 8x22B v0.1

About

Mixtral 8x22B v0.1 is a pretrained generative Sparse Mixture of Experts (MoE) model created by Mistral AI [1][2][4]. Instead of running every parameter on every token, a learned router sends each token to a small subset of expert sub-networks, which improves efficiency and performance relative to comparably sized dense language models [2][10][12]. Its eight 22B-parameter experts amount to roughly 141 billion total parameters, of which only a fraction are active per token, and the model supports a context length of 65,536 tokens (64K) [10][13]. It performs strongly on text generation, completion, and question answering, outperforming models such as LLaMA 2 70B on a range of benchmarks [4][5][7]. As a base model, however, it has no built-in moderation, so unfiltered outputs can be inappropriate or harmful [2][4][10]. It also requires substantial VRAM (approximately 260 GB in FP16 and 73 GB in INT4) [10][13], its knowledge is limited to its training data, and it can still misread nuanced context. Instruct-tuned variants such as Mixtral-8x22B-Instruct-v0.1 address some of these limitations by improving instruction following [3][5][6].
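Since v0.1 is the base (pretrained) model, it is typically used for plain text completion rather than chat. Below is a minimal sketch of loading and sampling from it with the Hugging Face transformers library, assuming the mistralai/Mixtral-8x22B-v0.1 checkpoint and hardware that meets the VRAM figures above.

```python
# Minimal sketch: loading the base model with Hugging Face transformers.
# Assumes the "mistralai/Mixtral-8x22B-v0.1" checkpoint and enough GPU memory
# (roughly 260 GB in FP16, per the figures above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x22B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # FP16 weights; 4-bit quantization reduces the footprint toward the INT4 figure
    device_map="auto",          # spread layers across available GPUs
)

# The base model is a plain completion model, so prompt it with a text prefix,
# not a chat template.
inputs = tokenizer("Mixture of Experts models work by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```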

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode
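Capabilities such as JSON mode and function calling are normally exercised through a hosted endpoint, and providers usually expose them on the instruct-tuned variant. The sketch below assumes an OpenAI-compatible serverless endpoint; the base URL and model identifier are placeholders, and whether JSON mode is available depends on the provider (see the table that follows).

```python
# Hedged sketch of a JSON-mode request against an OpenAI-compatible endpoint.
# BASE_URL and the model name are placeholders, not confirmed identifiers.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical provider endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="mixtral-8x22b",  # placeholder; each provider names the model differently
    messages=[
        {"role": "user", "content": "Return the capital of France as JSON with a 'capital' key."}
    ],
    response_format={"type": "json_object"},  # JSON mode: constrains output to valid JSON
)
print(response.choices[0].message.content)
```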

Providers (8)

Provider                    Input (per 1M)   Output (per 1M)   Type
NVIDIA NIM                  n/a              n/a               Provisioned
OctoAI API                  $1.20            $1.20             Serverless
Fireworks AI Platform       $1.20            $1.20             Serverless, Provisioned
deepinfra API               n/a              n/a               Serverless
Baseten API                 n/a              n/a               Serverless
Azure OpenAI                n/a              n/a               Provisioned
Mistral AI La Plateforme    n/a              n/a               Serverless
Together AI API             $1.20            $1.20             Serverless
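To make the pricing unit concrete, the snippet below computes a per-request cost from the per-1M-token rates, using the $1.20 input / $1.20 output figure listed for several of the serverless providers; the token counts are illustrative.

```python
# Illustrative cost calculation from per-1M-token pricing.
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float = 1.20, output_rate: float = 1.20) -> float:
    """Rates are USD per 1M tokens, as in the table above."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion.
print(f"${request_cost(2_000, 500):.4f}")  # $0.0030
```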

Specifications

Family: Mixtral
Released: 2024-04-17
Parameters: 8x22B
Context: 64K
Architecture: Mixture of Experts
Specialization: general
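The Mixture of Experts architecture listed above refers to sparse routing: a learned gate sends each token through only a small subset of expert feed-forward networks (Mixtral uses 8 experts with 2 active per token), which is why far fewer parameters are active per token than the total count suggests. The toy sketch below illustrates that routing step with made-up dimensions and random weights; it is not Mixtral's actual implementation.

```python
# Toy sketch of sparse top-2 expert routing (dimensions and weights are made up;
# this shows the mechanism, not Mixtral's real code).
import numpy as np

hidden_dim, num_experts, top_k = 16, 8, 2
rng = np.random.default_rng(0)

gate_w = rng.normal(size=(hidden_dim, num_experts))                    # router weights
experts = [rng.normal(size=(hidden_dim, hidden_dim)) for _ in range(num_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x through its top-k experts."""
    logits = x @ gate_w
    top = np.argsort(logits)[-top_k:]                                  # indices of the 2 best-scoring experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()          # softmax over the selected experts only
    # Only the selected experts run, so most parameters stay inactive for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=hidden_dim)
print(moe_layer(token).shape)  # (16,)
```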