LLM Reference

MPT-30B

About

MPT-30B, developed by MosaicML, is a large language model (LLM) built on a decoder-only transformer architecture trained to predict the next token in a sequence. Trained on a diverse dataset of 1 trillion tokens of English text and code, it handles a broad range of NLP tasks, including text generation, question answering, summarization, and code generation. Architectural choices such as FlashAttention, ALiBi (Attention with Linear Biases), and the removal of bias terms improve efficiency and allow the model to run on a single high-end GPU. MosaicML also provides fine-tuned variants, mpt-30b-instruct and mpt-30b-chat, for instruction following and dialogue generation, and MPT-30B is available for commercial use.
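ALiBi, one of the architectural choices mentioned above, replaces learned positional embeddings with a per-head linear penalty on attention scores proportional to the distance between query and key positions. A minimal sketch of the idea, assuming the power-of-two head count case; function names are illustrative, not taken from MosaicML's code:

```python
import numpy as np

def alibi_slopes(n_heads: int) -> list[float]:
    # Geometric sequence of per-head slopes 2^(-8/n), 2^(-16/n), ..., 2^(-8),
    # as defined for head counts that are powers of two.
    start = 2 ** (-8.0 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

def alibi_bias(n_heads: int, seq_len: int) -> np.ndarray:
    # Bias added to attention scores before softmax: head h penalizes
    # query position i attending to key position j by slope_h * (i - j).
    slopes = np.array(alibi_slopes(n_heads))   # shape (H,)
    i = np.arange(seq_len)[:, None]            # query positions
    j = np.arange(seq_len)[None, :]            # key positions
    distance = (i - j).clip(min=0)             # causal distances, 0 on/above diagonal
    return -slopes[:, None, None] * distance   # shape (H, L, L)
```

Because the penalty grows with distance, nearby tokens dominate attention by default, which is one reason ALiBi-trained models extrapolate to sequence lengths longer than those seen in training.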

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode

Providers (1)

Provider: Databricks Foundation Model Serving
Input (per 1M tokens): $1
Output (per 1M tokens): $1
Type: Serverless
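Per-token pricing like the above can be converted to a per-request cost with simple arithmetic. A small sketch, using the $1-per-1M input and output rates listed for Databricks as defaults (the function name and parameters are illustrative):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float = 1.0, output_rate: float = 1.0) -> float:
    """Estimated USD cost of one request.

    Rates are USD per 1M tokens; defaults match the Databricks
    Foundation Model Serving rates listed above ($1 in, $1 out).
    """
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate
```

For example, a request with 500k input tokens and 500k output tokens at these rates costs $1.00 in total.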

Specifications

Family: MPT
Parameters: 30B
Architecture: Decoder Only
Specialization: General