MPT 30B
About
MPT-30B, developed by MosaicML, is a large language model (LLM) built on a decoder-only transformer architecture and trained to predict the next token in a sequence. Trained on a diverse dataset of 1 trillion tokens of English text and code, it handles a broad range of NLP tasks, including text generation, question answering, summarization, and code generation. Architectural choices such as FlashAttention, ALiBi position encoding, and the removal of bias terms improve its efficiency, allowing it to run on a single high-end GPU. MPT-30B also offers fine-tuned variants, mpt-30b-instruct and mpt-30b-chat, for specialized tasks like instruction following and dialogue generation, and is available for commercial use.
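ALiBi (Attention with Linear Biases), mentioned above, replaces learned positional embeddings with a head-specific linear penalty added directly to attention scores, which is part of what lets the model extrapolate to longer contexts efficiently. The following is a minimal illustrative sketch of the standard ALiBi computation, not MosaicML's implementation; the function names are hypothetical, and it assumes the number of heads is a power of two as in the original ALiBi formulation:

```python
import math

def alibi_slopes(n_heads):
    # Head-specific slopes form a geometric sequence:
    # 2^(-8/n), 2^(-16/n), ..., 2^(-8) for n heads.
    # Assumes n_heads is a power of two.
    return [2 ** (-(8.0 / n_heads) * (i + 1)) for i in range(n_heads)]

def alibi_bias(seq_len, slope):
    # Causal ALiBi bias matrix for one head: for query position q and
    # key position k <= q, add slope * -(q - k) to the attention score;
    # future positions (k > q) are masked with -inf.
    bias = [[0.0] * seq_len for _ in range(seq_len)]
    for q in range(seq_len):
        for k in range(seq_len):
            if k <= q:
                bias[q][k] = -slope * (q - k)
            else:
                bias[q][k] = float("-inf")
    return bias
```

Because the penalty is a fixed linear function of query-key distance rather than a learned embedding, no positional parameters need to be stored or retrained for longer sequences.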
Capabilities
Providers (1)
| Provider | Input (per 1M tokens) | Output (per 1M tokens) | Type |
|---|---|---|---|
| Databricks Foundation Model Serving | $1 | $1 | Serverless |
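Given the per-token rates in the table, estimating the cost of a serving workload is simple arithmetic. A small sketch, assuming the $1 per 1M tokens rates listed above for both input and output (the helper name is illustrative):

```python
def serving_cost(input_tokens, output_tokens, input_rate=1.0, output_rate=1.0):
    """Estimate serving cost in USD.

    Rates are expressed in USD per 1M tokens, matching the pricing table
    (Databricks Foundation Model Serving, serverless: $1 in / $1 out).
    """
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: 500K input tokens + 500K output tokens at $1/$1 per 1M costs $1.00.
```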