Smaug Models by Abacus.AI
Details
About
The Smaug family of large language models from Abacus.AI is known for strong performance on reasoning and mathematics benchmarks. The models are open source, and Smaug-72B builds on Qwen-72B. They are fine-tuned with techniques such as DPO-Positive (DPOP), which addresses a limitation of the standard DPO loss that is most pronounced on datasets where the preferred and rejected completions differ only slightly. The method is documented in a research paper on arXiv. Smaug-72B topped the Hugging Face Open LLM Leaderboard and was the first open-source model to average above 80 across its key evaluations. The Smaug models are available on Hugging Face, where the community can download and modify them.
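To make the DPOP idea concrete, the following is a minimal sketch of a per-example loss, assuming the formulation usually given for DPO-Positive: the standard DPO margin minus a penalty that activates when the policy assigns the preferred completion a lower log-probability than the reference model does. The function names, the inputs (summed sequence log-probabilities), and the default values of `beta` and `lam` are illustrative assumptions, not the Abacus.AI training code.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(pol_w: float, pol_l: float, ref_w: float, ref_l: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (preferred, rejected) pair.

    pol_* are policy log-probs, ref_* are reference-model log-probs;
    `w` marks the preferred completion and `l` the rejected one.
    """
    margin = (pol_w - ref_w) - (pol_l - ref_l)
    return -log_sigmoid(beta * margin)


def dpop_loss(pol_w: float, pol_l: float, ref_w: float, ref_l: float,
              beta: float = 0.1, lam: float = 50.0) -> float:
    """DPOP-style loss (assumed form): DPO margin minus a penalty.

    The penalty is positive only when the policy's log-prob of the
    preferred completion has dropped below the reference model's.
    """
    margin = (pol_w - ref_w) - (pol_l - ref_l)
    penalty = max(0.0, ref_w - pol_w)
    return -log_sigmoid(beta * (margin - lam * penalty))


# Hypothetical numbers: the policy lowered BOTH log-probs, but lowered the
# rejected one more, so plain DPO still sees a positive margin.
plain = dpo_loss(pol_w=-12.0, pol_l=-14.0, ref_w=-10.0, ref_l=-10.5)
positive = dpop_loss(pol_w=-12.0, pol_l=-14.0, ref_w=-10.0, ref_l=-10.5)
print(f"DPO loss:  {plain:.4f}")
print(f"DPOP loss: {positive:.4f}")
```

In this made-up case plain DPO reports a small loss even though the preferred completion became less likely, while the DPOP-style penalty makes the loss large. When the policy keeps the preferred completion at least as likely as the reference does, the penalty is zero and the two losses coincide.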
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Smaug 72B | Use when the workload needs 32k context and 72B parameters. | 2023-12 | 32k context, 72B parameters | Current |
| Smaug 34B | Use when the workload needs 200k context and 34B parameters. | 2023-12 | 200k context, 34B parameters | Current |
Release Timeline
1 release group
Specifications (2 models)
Available From (1 provider)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| Smaug 72B | Microsoft Foundry | $1 | $2 | Provisioned |
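As a rough illustration of how the listed rates translate into a bill, the sketch below estimates cost from token counts. It assumes the table's $1 and $2 figures apply linearly per million input and output tokens; provisioned deployments may be billed differently, so treat this as arithmetic only. The function name and the example token counts are hypothetical.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_per_million: float = 1.0,
                      output_per_million: float = 2.0) -> float:
    """Estimate cost in USD from token counts and per-1M-token rates.

    Defaults mirror the Smaug 72B row in the pricing table above.
    """
    input_cost = input_tokens / 1_000_000 * input_per_million
    output_cost = output_tokens / 1_000_000 * output_per_million
    return input_cost + output_cost


# Hypothetical workload: 250,000 input tokens and 50,000 output tokens.
cost = estimate_cost_usd(250_000, 50_000)
print(f"Estimated cost: ${cost:.2f}")
```

For that hypothetical workload the input side contributes $0.25 and the output side $0.10.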
Frequently Asked Questions
- What is Smaug used for?
- Smaug is used for coding and math-heavy prompts. The family description and listed model capabilities point to those workloads as the best fit.
- How does Smaug compare to Smaug 2?
- Smaug and Smaug 2 are related Abacus.AI families, and both are listed for coding workloads. Smaug has 2 listed variants and reaches up to 200k context, while Smaug 2 reaches up to 32k context, so compare the specs and pricing tables before choosing a production model.


