llmreference
Databricks Foundation Model Serving

Using Mixtral 8x7B on Databricks Foundation Model Serving

Implementation guide · Mixtral · MistralAI

Serverless

Quick Start

  1. 1
    Create an account at Databricks Foundation Model Serving and generate an API key.
  2. 2
    Use the Databricks Foundation Model Serving SDK or REST API to call mixtral-8x7b — see the documentation for request format.
  3. 3
    You'll be billed $0.50/1M input, $1.00/1M output tokens. See full pricing.

Code Examples

About Databricks Foundation Model Serving

Databricks offers a comprehensive AI platform that integrates a lakehouse model, combining the flexibility of data lakes with the management capabilities of data warehouses. The platform features a natural language interface for conversational data querying, automated infrastructure management for optimized performance, and robust governance tools ensuring data privacy and compliance. It supports a wide range of functionalities including data engineering, real-time streaming, and a marketplace for data sharing, while enabling seamless collaboration among data scientists, engineers, and DevOps teams . The platform's capabilities extend to advanced machine learning operations (MLOps), facilitating the entire lifecycle of AI model development. It includes built-in support for popular libraries like TensorFlow and PyTorch, tools for monitoring data quality and model performance, and automated workflows for building production-ready ETL pipelines. The platform also integrates with large language models (LLMs) for generative AI applications, emphasizing cost efficiency and ease of use. This comprehensive suite of tools empowers organizations to effectively leverage AI while maintaining control over their data and models .

Databricks offers a comprehensive Data Intelligence Platform that unifies data, analytics, and AI capabilities. Their platform, known as the Databricks Lakehouse, combines the best features of data lakes and data warehouses, enabling organizations to handle large-scale data processing, analytics, and machine learning workloads in a single, unified environment. Key features of Databricks' AI platform include: 1. Apache Spark integration: As the creators of Apache Spark, Databricks provides optimized performance for big data processing and analytics. 2. Delta Lake: An open-source storage layer that brings reliability to data lakes, ensuring data quality and consistency. 3. MLflow: An open-source platform for managing the machine learning lifecycle, including experimentation, reproducibility, and deployment. 4. Collaborative notebooks: Interactive environments for data scientists and analysts to work together on data exploration, model development, and visualization. 5. AutoML: Automated machine learning capabilities to streamline the model development process. 6. Generative AI support: Tools and frameworks for developing and deploying generative AI models. 7. Data governance: Unity Catalog provides centralized governance and security controls across the entire data estate. 8. Scalable infrastructure: Cloud-native architecture that allows for elastic scaling of compute resources. Databricks' platform is designed to democratize data and AI, making it accessible to organizations of all sizes. It's used by over 10,000 organizations worldwide, including more than 50% of the Fortune 500 companies, for various use cases such as data engineering, machine learning, and business analytics.

Pricing on Databricks Foundation Model Serving

TypePrice (per 1M)
Input tokens$0.50
Output tokens$1.00

Capabilities

No model capability flags are currently sourced.

About Mixtral 8x7B

Mixtral 8x7B, developed by Mistral AI, features a cutting-edge Mixture of Experts (MoE) architecture, utilizing eight experts with seven billion parameters each, yielding a total of 46.7 billion parameters. This architecture activates only two experts per token, allowing for efficient processing and a 6x faster inference rate compared to Llama 2 70B. The model excels in performance, surpassing Llama 2 70B and competing with GPT-3.5 on numerous benchmarks. It supports multiple languages and can handle context up to 32,000 tokens, enhancing understanding of lengthy text. Designed for diverse tasks, it is strong in code generation and available under a permissive Apache 2.0 license, promoting community engagement. Compatible with various optimization tools, its weights are easily deployable, with Mistral AI continuing to improve its capabilities through performance optimizations and fine-tuning efforts.

Model Specs

Released2023-12-11
Parameters8x7B
Context32K
ArchitectureMixture of Experts
Knowledge cutoff2023-12

Provider

Databricks Foundation Model Serving
Databricks Foundation Model Serving

Databricks

San Francisco, California, United States