DeepSeek Math Models by DeepSeek
Details
About
The DeepSeekMath family of large language models (LLMs) is an open-source collection focused on mathematical reasoning. The models have 7 billion parameters, are built on DeepSeek-Coder-Base-v1.5, and were further pre-trained on 120 billion mathematics-related tokens from Common Crawl, supplemented with natural language and code data [1, 4, 5]. A standout feature is Group Relative Policy Optimization (GRPO), a reinforcement learning algorithm intended to improve mathematical problem solving while reducing memory consumption during training [1, 4]. The suite includes DeepSeekMath-Base 7B, DeepSeekMath-Instruct 7B, and DeepSeekMath-RL 7B, each corresponding to a different stage of the training pipeline; the RL variant reaches 51.7% accuracy on the MATH benchmark without external tools [1, 4, 5]. The models are available on Hugging Face and GitHub [4, 5]. DeepSeekMath's reported performance approaches that of proprietary models such as Gemini-Ultra and GPT-4, making it a notable open-source option for mathematical tasks [1, 4].
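To make the "group relative" idea concrete, here is a minimal sketch of the advantage computation GRPO is generally described as using: several answers are sampled for the same problem, and each answer's reward is normalized against the mean and standard deviation of its group, so no separate value (critic) model is needed as a baseline. The function name, the `eps` term, the choice of population standard deviation, and the example rewards are all assumptions for illustration, not DeepSeek's code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples for the same prompt.

    Hypothetical helper: the group mean acts as the baseline, which is what
    lets this approach skip a learned value model.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one problem (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Correct answers receive a positive advantage and incorrect ones a negative advantage of equal size, so the policy update favors answers that beat their own group's average.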
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| DeepSeek Math | Use when the workload needs 4k context and 7B parameters. | 2024-09 | 4k context, 7B parameters | Current |
| DeepSeek Math 7B RL | Use when the workload needs math and 7B parameters. | 2024-03 | math, 7B parameters | Current |
| DeepSeek Math 7B | Use when the workload needs math and 7B parameters. | 2024-02 | math, 7B parameters | Current |
| DeepSeek Math 7B Instruct | Use when the workload needs math and 7B parameters. | 2024-02 | math, 7B parameters | Current |
Release Timeline
3 release groups
Specifications (4 models)
| Model | Released | Context | Parameters |
|---|---|---|---|
| DeepSeek Math | 2024-09 | 4k | 7B |
| DeepSeek Math 7B RL | 2024-03 | — | 7B |
| DeepSeek Math 7B | 2024-02 | — | 7B |
| DeepSeek Math 7B Instruct | 2024-02 | — | 7B |
Available From (1 provider)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| DeepSeek Math 7B | Replicate API | $0.05 | $0.25 | Serverless |
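As a rough illustration of how the per-million-token prices above turn into a bill, the sketch below estimates the cost of a workload from its input and output token counts. The function name and the example token counts are hypothetical; only the two prices come from the table, and actual billing may differ.

```python
from decimal import Decimal

# Listed prices for DeepSeek Math 7B on the Replicate API (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("0.05")
OUTPUT_PRICE_PER_M = Decimal("0.25")


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated cost in USD for the given token counts."""
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) / million * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / million * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost(2_000_000, 500_000)
print(f"Estimated cost: ${cost}")
```

Because output tokens are priced at five times the input rate, long generated solutions can dominate the total even when prompts are large.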
Frequently Asked Questions
- What is DeepSeek Math used for?
  - DeepSeek Math is used for mathematics and coding. The family description and listed model capabilities point to those workloads as the best fit.
- How does DeepSeek Math compare to Janus?
  - DeepSeek Math by DeepSeek is strongest for mathematics, while Janus, also by DeepSeek, is the closest related family to consider for image generation. DeepSeek Math has 4 listed variants with up to 4k context, so compare the specifications and pricing tables before choosing a production model.
- Which DeepSeek Math model should I use?
  - For the lowest listed input price, start with DeepSeek Math 7B through the Replicate API at $0.05 per 1M input tokens. For the latest variant to run locally, evaluate DeepSeek Math with 4k context.





