LLM Reference
Microsoft Foundry

Mixtral 8x22B v0.1 on Microsoft Foundry

Mixtral · MistralAI

Provisioned

Compare Mixtral 8x22B v0.1 Across Providers

Provider        Input (per 1M)    Output (per 1M)
NVIDIA NIM      —                 —
OctoAI API      $1.20             $1.20
Fireworks AI    $1.20             $1.20
DeepInfra       $0.65             $0.65
Baseten API     —                 —
View all 8 providers →

Pricing

Type             Price (per 1M)
Input tokens     $2.00
Output tokens    $6.00
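Given the per-token rates above, the cost of a request is simple arithmetic: tokens divided by one million, times the rate, summed across input and output. A minimal sketch (the function name and example token counts are illustrative):

```python
# Microsoft Foundry pricing for Mixtral 8x22B v0.1, per the table above.
INPUT_PRICE_PER_1M = 2.00   # USD per 1M input tokens
OUTPUT_PRICE_PER_1M = 6.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M

# Example: a 10K-token prompt producing a 2K-token completion.
print(f"${request_cost(10_000, 2_000):.4f}")  # $0.0320
```

Note that output tokens cost three times as much as input tokens, so completion length dominates for generation-heavy workloads.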

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution

About Mixtral 8x22B v0.1

Mixtral 8x22B v0.1 is a pretrained generative Sparse Mixture of Experts (MoE) model created by Mistral AI [1][2][4]. It uses an architecture in which a router directs each token to a small subset of specialized sub-networks, termed "experts," improving efficiency and performance relative to dense large language models [2][10][12]. The model comprises eight 22B-parameter experts, roughly 141 billion parameters in total with about 39 billion active per token, and supports a context length of 64K tokens [10][13]. It excels at text generation, completion, and question answering, outperforming models such as LLaMA 2 70B on a range of benchmarks [4][5][7]. As a base model, however, it has no built-in moderation and can generate inappropriate or harmful content without external filtering [2][4][10]. It also requires substantial VRAM, approximately 260 GB in FP16 and 73 GB in INT4 [10][13], and may struggle with complex contextual understanding and up-to-date knowledge. Instruct-tuned variants such as Mixtral-8x22B-Instruct-v0.1 address some of these limitations by improving instruction following [3][5][6].
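The sparse-MoE idea described above can be sketched in a few lines: a learned gate scores all experts, only the top-2 actually run, and their outputs are combined with the renormalized gate weights. This is a toy illustration only; the dimensions, random weights, and single-linear-map "experts" are stand-ins (in Mixtral each expert is a full feed-forward block inside every transformer layer):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2  # Mixtral uses 8 experts, top-2 routing

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy experts: plain linear maps standing in for feed-forward blocks.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) * 0.02  # router weights

def moe_forward(x):
    logits = x @ gate_w                    # router score for each expert
    top = np.argsort(logits)[-top_k:]      # indices of the top-2 experts
    weights = softmax(logits[top])         # renormalize scores over the winners
    # Only the selected experts execute, which is why only a fraction of
    # the total parameters is active for any given token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.standard_normal(d_model))
print(y.shape)  # (16,)
```

Because only 2 of the 8 experts run per token, compute per token scales with the active parameters rather than the full parameter count, which is the efficiency advantage the paragraph above refers to.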

Get Started

Model Specs

Released: 2024-04-17
Parameters: 8x22B
Context: 64K
Architecture: Mixture of Experts

GPU-Hour Providers (1)

Related Models on Microsoft Foundry