LLM Reference

MPT

Databricks Mosaic · CC-BY-NC-SA-4.0
2 models · 2023 · From $0.50/1M input tokens

About

The MosaicML Pretrained Transformer (MPT) family is a collection of open-source large language models designed for diverse applications and licensed for commercial use. The models use a decoder-only architecture similar to GPT, with optimized layer implementations for improved performance and training stability. Notably, MPT models remove fixed context-length limits by replacing traditional positional embeddings with ALiBi (Attention with Linear Biases). The family comprises the base model MPT-7B alongside specialized variants such as MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, each fine-tuned for a distinct task ranging from instruction following to long-form storytelling. The base models were trained on roughly 1 trillion tokens of text and code.
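To illustrate how ALiBi replaces positional embeddings, here is a minimal NumPy sketch of the per-head bias it adds to attention logits. This is an illustration of the published ALiBi scheme, not MPT's actual implementation; the function names are made up for this example.

```python
import numpy as np

def alibi_slopes(num_heads):
    # Per-head slopes: a geometric sequence starting at 2^(-8/num_heads),
    # as defined in the ALiBi paper (assumes num_heads is a power of two).
    start = 2.0 ** (-8.0 / num_heads)
    return np.array([start ** (i + 1) for i in range(num_heads)])

def alibi_bias(seq_len, num_heads):
    # Bias added to attention logits: bias[h, i, j] = -slope[h] * (i - j)
    # for each key j at or before query i. Because the penalty depends only
    # on relative distance, no learned positional embedding is needed and
    # the model can extrapolate beyond its training context length.
    pos = np.arange(seq_len)
    rel = pos[None, :] - pos[:, None]   # j - i: 0 on the diagonal, negative in the past
    rel = np.minimum(rel, 0)            # future positions are causally masked anyway
    return alibi_slopes(num_heads)[:, None, None] * rel
```

Each head gets a different slope, so some heads attend mostly locally while others keep a longer effective range.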

Specifications (2 models)

MPT model specifications comparison
Model     Released   Parameters
MPT 30B   2023-03    30B
MPT 7B    2023-03    7B

Available From (2 providers)

Pricing

MPT model pricing by provider
Model     Provider                               Input / 1M   Output / 1M   Type
MPT 7B    Databricks Foundation Model Serving    $0.50        $0.50         Serverless
MPT 30B   Databricks Foundation Model Serving    $1.00        $1.00         Serverless
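Given the per-million-token prices above, the cost of a single request is simple arithmetic. A small illustrative helper (not a Databricks API):

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    # Prices are USD per 1M tokens, as listed in the pricing table above.
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: 200k input + 50k output tokens on MPT 30B at $1 / $1 per 1M tokens
print(request_cost(200_000, 50_000, 1.0, 1.0))  # → 0.25 (USD)
```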

Frequently Asked Questions

What is MPT?
MPT (MosaicML Pretrained Transformer) is a family of open-source, commercially licensed large language models from Databricks Mosaic. The models use a GPT-style decoder-only architecture with ALiBi (Attention with Linear Biases) in place of positional embeddings, and were trained on roughly 1 trillion tokens of text and code. Variants include the base MPT-7B plus MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, each fine-tuned for a distinct task.
How many models are in the MPT family?
The MPT family contains 2 models.
What is the latest MPT model?
The latest model is MPT 30B, released in 2023-03.
How much does MPT cost?
MPT models range from $0.50/1M to $1.00/1M input tokens, depending on the model and provider.

Models (2)