LLM Reference

Breeze Models by MediaTek-Research

2 models · 2023 · Up to 32k context

About

The Breeze large language model (LLM) family, created by MediaTek Research, is a series of open-source models built on the Mistral-7B foundation. The models are tailored for Traditional Chinese and support both Traditional Chinese and English. They are available in base and instruction-tuned variants, with the latter optimized for tasks such as question answering, retrieval-augmented generation (RAG), multi-round chat, and summarization. A standout feature is inference speed on Traditional Chinese text, which is reported to be double that of models such as Mistral-7B and Llama 7B. This is largely due to an expanded vocabulary that adds 30,000 Traditional Chinese tokens, so the same text is encoded in fewer tokens. The Breeze models also perform well in benchmarks against other open-source models of similar size.
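The link between a larger vocabulary and faster inference can be sketched with a toy tokenizer. The example below is only an illustration under stated assumptions: `greedy_tokenize`, `base_vocab`, and `extended_vocab` are hypothetical and are not Breeze's or Mistral's real tokenizers or vocabularies. It shows that when a vocabulary lacks pieces for Chinese text, each character may fall back to several byte-level tokens, whereas added word-level pieces cover the same text in far fewer tokens. Because a model generates one token per decoding step, fewer tokens for the same text generally means fewer steps.

```python
# Toy illustration only: these vocabularies are hypothetical and far smaller
# than any real tokenizer vocabulary.

def greedy_tokenize(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match tokenization with a UTF-8 byte fallback."""
    max_len = max((len(piece) for piece in vocab), default=1)
    tokens: list[str] = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary piece that matches at position i.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # No piece matched: emit one token per UTF-8 byte of this character.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


# Hypothetical base vocabulary with no Chinese pieces.
base_vocab = {"the", " ", "model"}
# Hypothetical extension that adds a few Traditional Chinese pieces.
extended_vocab = base_vocab | {"繁體", "中文", "語言", "模型"}

text = "繁體中文語言模型"  # "Traditional Chinese language model"
base_tokens = greedy_tokenize(text, base_vocab)
extended_tokens = greedy_tokenize(text, extended_vocab)

print(len(base_tokens), len(extended_tokens))
```

The token-count ratio in this toy case is much larger than anything a real tokenizer comparison would show, since real base vocabularies already contain many common Chinese characters. It should not be read as a measurement of Breeze.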

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Breeze 7B (Current)

Use when the workload needs 32k context and 7B parameters.

2023-11 · 32k context · 7B parameters

Release Timeline

1 release group
2023-11 (2 current)
Breeze 7B (Current): 32k context · 7B parameters
Breeze 8x7B (Current): 32k context · 7B parameters

Specifications (2 models)

Breeze model specifications comparison
Model        Released  Context  Parameters
Breeze 7B    2023-11   32k      7B
Breeze 8x7B  2023-11   32k      7B

Available From (1 provider)

Frequently Asked Questions

What is Breeze used for?
Breeze is best suited to chatbot and role-playing workloads, according to the family description and the listed model capabilities.
How does Breeze compare to Claude 3?
Breeze by MediaTek-Research is strongest for chatbot and role-playing use cases, while Claude 3 by Anthropic is the closest related family to consider for vision and multimodal work. Breeze has 2 listed variants with up to 32k context; Claude 3 reaches up to 200k context. Compare the specs and pricing tables before choosing a production model.
Which Breeze model should I use?
Breeze does not have complete provider pricing in the local data, so if price is the main constraint, check the pricing table first. For the most capable and most recent listed option, evaluate Breeze 7B with 32k context.

Models (2)