Using Mixtral 8x7B on Baseten API

Implementation guide · Mixtral · MistralAI

ServerlessOpen Source

Quick Start

1
Create an account at Baseten API and generate an API key.
2
Use the Baseten API SDK or REST API to call mixtral-8x7b — see the documentation for request format.

API Portal Documentation Pricing

Code Examples

See Baseten API documentation for integration details.

About Baseten API

The AI platform offers a comprehensive suite of features designed to streamline the development and deployment of machine learning models. At its core, the platform supports open-source models, allowing developers to leverage existing frameworks and tools for their AI applications. This flexibility is coupled with rapid deployment capabilities, enabling organizations to quickly bring their models into production environments. The platform's architecture is built for scalability, accommodating fluctuating workloads and user demands without compromising performance. A standout feature is its high-speed inference capabilities, crucial for applications that require real-time data processing and decision-making. Cost-effectiveness is a key advantage of the platform, implementing a pay-as-you-go model that minimizes initial investments while optimizing resource utilization. The platform also boasts flexible deployment options, allowing users to deploy models across various environments including cloud, on-premises, or edge devices. This versatility empowers organizations to tailor their deployment strategies to specific needs and existing infrastructure. By combining these features, the platform provides a robust solution that enables businesses to fully harness the potential of AI while maintaining control over costs and deployment logistics.

Baseten is an AI infrastructure platform that provides comprehensive tools for deploying and serving machine learning models efficiently and cost-effectively. The platform offers: 1. Rapid deployment: Users can deploy models in minutes, avoiding complex processes. 2. Support for open-source models: Baseten allows deployment of best-in-class open-source models. 3. Optimized serving: The platform provides optimized serving for custom models. 4. Scalability: Horizontally scalable services enable smooth transition from prototype to production. 5. High-speed inference: Baseten offers fast inference on infrastructure that automatically scales with traffic. 6. Cost-efficiency: The platform includes a scaled-to-zero feature to optimize costs. 7. Flexible deployment options: Models can be run on Baseten's cloud or the user's infrastructure. Baseten aims to simplify the ML deployment process while ensuring performance, scalability, and cost-efficiency for AI builders and developers.

View all models on Baseten API →

Pricing on Baseten API

Capabilities

No model capability flags are currently sourced.

About Mixtral 8x7B

Mixtral 8x7B, developed by Mistral AI, features a cutting-edge Mixture of Experts (MoE) architecture, utilizing eight experts with seven billion parameters each, yielding a total of 46.7 billion parameters. This architecture activates only two experts per token, allowing for efficient processing and a 6x faster inference rate compared to Llama 2 70B. The model excels in performance, surpassing Llama 2 70B and competing with GPT-3.5 on numerous benchmarks. It supports multiple languages and can handle context up to 32,000 tokens, enhancing understanding of lengthy text. Designed for diverse tasks, it is strong in code generation and available under a permissive Apache 2.0 license, promoting community engagement. Compatible with various optimization tools, its weights are easily deployable, with Mistral AI continuing to improve its capabilities through performance optimizations and fine-tuning efforts.

Full model details →

Model Specs

Released2023-12-11

Parameters8x7B

Context32k

ArchitectureMixture of Experts

Knowledge cutoff2023-12

Baseten

San Francisco, California, United States