Mixtral 8x22B v0.1
About
Mixtral 8x22B v0.1 is a pretrained generative Sparse Mixture-of-Experts (MoE) model from Mistral AI [1][2][4]. Its architecture routes each token to a small subset of specialized sub-networks, termed "experts," which improves efficiency and performance relative to dense large language models of comparable capability [2][10][12]. The model has 141 billion total parameters, of which roughly 39 billion are active per token, and supports a 64K-token (65,536) context window [10][13]. It performs well at text generation, completion, and question answering, outperforming models such as LLaMA 2 70B on a range of benchmarks [4][5][7].

As a base model, however, it has no built-in moderation and can produce inappropriate or harmful content unless output filtering is applied [2][4][10]. It also requires substantial VRAM for inference (approximately 260 GB in FP16 and 73 GB in INT4 [10][13]), and its knowledge is limited to its training cutoff. Instruct-tuned variants such as Mixtral-8x22B-Instruct-v0.1 address some of these limitations by improving instruction following [3][5][6].
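The sparse routing described above can be sketched in plain Python. This is a toy illustration of top-2 expert gating, not Mistral's actual implementation: the gate here is a simple linear scorer and the experts are arbitrary callables, both stand-ins for the learned networks in the real model.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Sparse MoE layer: score every expert with the gate, run only the
    top_k highest-scoring experts on x, and combine their outputs weighted
    by renormalized gate probabilities. The other experts are never run,
    which is where the efficiency of sparse MoE comes from."""
    # One gate logit per expert (here: a dot product with the input).
    logits = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    probs = softmax(logits)
    # Select the top_k experts and renormalize their probabilities.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    # Weighted sum of only the selected experts' outputs.
    out = [0.0] * len(x)
    for i in top:
        y = experts[i](x)
        for d in range(len(x)):
            out[d] += (probs[i] / norm) * y[d]
    return out, top
```

With 8 experts and `top_k=2`, each token pays the compute cost of 2 experts rather than 8, which is the efficiency trade-off the paragraph above refers to.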
Providers (8)
| Provider | Input (per 1M tokens) | Output (per 1M tokens) | Type |
|---|---|---|---|
| NVIDIA NIM | — | — | Provisioned |
| OctoAI API | $1.20 | $1.20 | Serverless |
| Fireworks AI Platform | $1.20 | $1.20 | Serverless, Provisioned |
| deepinfra API | — | — | Serverless |
| Baseten API | — | — | Serverless |
| Azure OpenAI | — | — | Provisioned |
| Mistral AI La Plateforme | — | — | Serverless |
| Together AI API | $1.20 | $1.20 | Serverless |
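Per-request cost from the per-1M-token prices above is simple arithmetic. A minimal sketch, using the $1.20/1M serverless rate several providers list; the token counts in the usage example are illustrative:

```python
def request_cost(input_tokens, output_tokens, in_per_m=1.20, out_per_m=1.20):
    """USD cost of one request, given input/output token counts and
    per-1M-token prices (defaults match the $1.20/1M serverless rate)."""
    return (input_tokens / 1_000_000) * in_per_m + (output_tokens / 1_000_000) * out_per_m
```

For example, a request with a 2,000-token prompt and a 500-token completion costs `request_cost(2000, 500)`, i.e. $0.003 at that rate.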