LLM Reference

Mixtral 8x22B v0.1

About

Mixtral 8x22B v0.1 is a pretrained generative Sparse Mixture of Experts (MoE) model created by Mistral AI [1][2][4]. Instead of running every parameter on every token, a learned router sends each token to a small subset of expert sub-networks, which improves efficiency and performance relative to comparably sized dense language models [2][10][12]. Its eight 22B-parameter experts amount to roughly 141 billion total parameters, of which only a fraction are active per token, and the model supports a context length of 65,536 tokens (64K) [10][13]. It performs strongly on text generation, completion, and question answering, outperforming models such as LLaMA 2 70B on a range of benchmarks [4][5][7]. As a base model, however, it has no built-in moderation, so unfiltered outputs can be inappropriate or harmful [2][4][10]. It also requires substantial VRAM (approximately 260 GB in FP16 and 73 GB in INT4) [10][13], its knowledge is limited to its training data, and it can still misread nuanced context. Instruct-tuned variants such as Mixtral-8x22B-Instruct-v0.1 address some of these limitations by improving instruction following [3][5][6].
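Since v0.1 is the base (pretrained) model, it is typically used for plain text completion rather than chat. Below is a minimal sketch of loading and sampling from it with the Hugging Face transformers library, assuming the mistralai/Mixtral-8x22B-v0.1 checkpoint and hardware that meets the VRAM figures above.

```python
# Minimal sketch: loading the base model with Hugging Face transformers.
# Assumes the "mistralai/Mixtral-8x22B-v0.1" checkpoint and enough GPU memory
# (roughly 260 GB in FP16, per the figures above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x22B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # FP16 weights; 4-bit quantization reduces the footprint toward the INT4 figure
    device_map="auto",          # spread layers across available GPUs
)

# The base model is a plain completion model, so prompt it with a text prefix,
# not a chat template.
inputs = tokenizer("Mixture of Experts models work by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```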

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode
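Capabilities such as JSON mode and function calling are normally exercised through a hosted endpoint, and providers usually expose them on the instruct-tuned variant. The sketch below assumes an OpenAI-compatible serverless endpoint; the base URL and model identifier are placeholders, and whether JSON mode is available depends on the provider (see the table that follows).

```python
# Hedged sketch of a JSON-mode request against an OpenAI-compatible endpoint.
# BASE_URL and the model name are placeholders, not confirmed identifiers.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical provider endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="mixtral-8x22b",  # placeholder; each provider names the model differently
    messages=[
        {"role": "user", "content": "Return the capital of France as JSON with a 'capital' key."}
    ],
    response_format={"type": "json_object"},  # JSON mode: constrains output to valid JSON
)
print(response.choices[0].message.content)
```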

Providers (8)

Provider                    Input (per 1M)   Output (per 1M)   Type
NVIDIA NIM                  n/a              n/a               Provisioned
OctoAI API                  $1.20            $1.20             Serverless
Fireworks AI Platform       $1.20            $1.20             Serverless, Provisioned
deepinfra API               n/a              n/a               Serverless
Baseten API                 n/a              n/a               Serverless
Azure OpenAI                n/a              n/a               Provisioned
Mistral AI La Plateforme    n/a              n/a               Serverless
Together AI API             $1.20            $1.20             Serverless
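To make the pricing unit concrete, the snippet below computes a per-request cost from the per-1M-token rates, using the $1.20 input / $1.20 output figure listed for several of the serverless providers; the token counts are illustrative.

```python
# Illustrative cost calculation from per-1M-token pricing.
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float = 1.20, output_rate: float = 1.20) -> float:
    """Rates are USD per 1M tokens, as in the table above."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion.
print(f"${request_cost(2_000, 500):.4f}")  # $0.0030
```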

Specifications

Family: Mixtral
Released: 2024-04-17
Parameters: 8x22B
Context: 64K
Architecture: Mixture of Experts
Specialization: general
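The Mixture of Experts architecture listed above refers to sparse routing: a learned gate sends each token through only a small subset of expert feed-forward networks (Mixtral uses 8 experts with 2 active per token), which is why far fewer parameters are active per token than the total count suggests. The toy sketch below illustrates that routing step with made-up dimensions and random weights; it is not Mixtral's actual implementation.

```python
# Toy sketch of sparse top-2 expert routing (dimensions and weights are made up;
# this shows the mechanism, not Mixtral's real code).
import numpy as np

hidden_dim, num_experts, top_k = 16, 8, 2
rng = np.random.default_rng(0)

gate_w = rng.normal(size=(hidden_dim, num_experts))                    # router weights
experts = [rng.normal(size=(hidden_dim, hidden_dim)) for _ in range(num_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x through its top-k experts."""
    logits = x @ gate_w
    top = np.argsort(logits)[-top_k:]                                  # indices of the 2 best-scoring experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()          # softmax over the selected experts only
    # Only the selected experts run, so most parameters stay inactive for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=hidden_dim)
print(moe_layer(token).shape)  # (16,)
```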