LLM Reference

DeepSeek Math Models by DeepSeek

DeepSeek | DeepSeek License | Open weights | Mathematics
4 models | 2024 | Up to 4k ctx | From $0.05/1M input

Details

Researcher: DeepSeek
Commercial use: Allowed
Models: 4
Released: 2024
Max context: 4k

About

The DeepSeekMath family of large language models (LLMs) is an open-weights collection focused on mathematical reasoning. The models have 7 billion parameters and are initialized from DeepSeek-Coder-Base-v1.5 7B, then further pre-trained on 120 billion math-related tokens drawn from Common Crawl, supplemented with natural language and code data [1][4][5]. A standout feature is Group Relative Policy Optimization (GRPO), a reinforcement learning algorithm derived from Proximal Policy Optimization (PPO) that improves mathematical problem solving while reducing the memory needed for training [1][4]. The suite includes DeepSeekMath-Base 7B, DeepSeekMath-Instruct 7B, and DeepSeekMath-RL 7B, each corresponding to a different stage of the training pipeline; the RL variant reaches 51.7% accuracy on the MATH benchmark without external tools [1][4][5]. The models are available on Hugging Face and GitHub [4][5]. On MATH, DeepSeekMath approaches the performance of proprietary models such as Gemini-Ultra and GPT-4, which makes it a notable open-source option for mathematical tasks [1][4].
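To make the GRPO idea above more concrete, here is a minimal sketch of its group-relative advantage step: several answers are sampled for the same problem, and each answer's reward is normalized against the mean and standard deviation of its own group, so no separate value (critic) model is needed. The function name, the sample rewards, the `eps` term, and the use of the population standard deviation are assumptions made for illustration. This is not DeepSeek's training code, and it omits the policy update itself.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each sampled answer's reward against its own group.

    `rewards` holds one scalar reward per answer sampled for the same prompt.
    `eps` is an illustrative guard against division by zero when every
    answer in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std is an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one problem (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that score above their group's mean get a positive advantage and those below get a negative one. The baseline comes from the group itself rather than from a learned critic, which is where the memory saving comes from.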

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.


Use when the workload needs 4k context and 7B parameters.

2024-09 | 4k context | 7B parameters

Use when the workload needs math and 7B parameters.

2024-03 | math | 7B parameters

Use when the workload needs math and 7B parameters.

2024-02 | math | 7B parameters

Use when the workload needs math and 7B parameters.

2024-02 | math | 7B parameters

Release Timeline

3 release groups

2024-09 (1 current)
DeepSeek Math | 4k context | 7B parameters | Current

2024-03 (1 current)
DeepSeek Math 7B RL | math | 7B parameters | Current

2024-02 (2 current)
DeepSeek Math 7B | math | 7B parameters | Current
DeepSeek Math 7B Instruct | math | 7B parameters | Current

Specifications (4 models)

DeepSeek Math model specifications comparison
Model | Released | Context | Parameters
DeepSeek Math | 2024-09 | 4k | 7B
DeepSeek Math 7B RL | 2024-03 | not listed | 7B
DeepSeek Math 7B | 2024-02 | not listed | 7B
DeepSeek Math 7B Instruct | 2024-02 | not listed | 7B

Available From (1 provider)

Pricing

DeepSeek Math model pricing by provider
Model | Provider | Input / 1M | Output / 1M | Type
DeepSeek Math 7B | Replicate API | $0.05 | $0.25 | Serverless
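
As a rough illustration of how the per-million-token prices above translate into a bill, the sketch below multiplies token counts by the listed Replicate API rates. The function name and the example workload are hypothetical. Actual billing may differ from this simple linear estimate.

```python
# Listed prices for DeepSeek Math 7B on Replicate API, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.25


def estimate_cost(input_tokens, output_tokens):
    """Return an estimated cost in USD for the given token counts."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost(2_000_000, 500_000)
print(f"Estimated cost: ${cost:.3f}")
```

For this example workload the estimate is $0.10 for input plus $0.125 for output, or $0.225 in total. Output tokens cost five times as much as input tokens, so long generated solutions dominate the bill.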

Frequently Asked Questions

What is DeepSeek Math used for?
DeepSeek Math is used for mathematics and coding. The family description and the listed model capabilities point to those workloads as the best fit.
How does DeepSeek Math compare to Janus?
DeepSeek Math is the better fit for mathematics workloads, while Janus, also by DeepSeek, is the closest related family to check for image generation. DeepSeek Math has 4 listed variants and reaches up to 4k context, so compare the specs and pricing tables before choosing a production model.
Which DeepSeek Math model should I use?
For the lowest listed input price, start with DeepSeek Math 7B through Replicate API at $0.05/1M input tokens. For the most recent listed variant to run locally, evaluate DeepSeek Math (2024-09) with 4k context.
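
When checking whether a workload fits the listed 4k context, a quick budget check along these lines can help. It is a minimal sketch that assumes "4k" means 4,096 tokens and that the prompt and the generated tokens share the same window. The helper names are hypothetical, and token counts must come from the model's own tokenizer.

```python
# Assumption: the listed "4k" context corresponds to 4,096 tokens.
CONTEXT_LIMIT = 4096


def fits_in_context(prompt_tokens, max_new_tokens, limit=CONTEXT_LIMIT):
    """Return True if the prompt plus the requested generation fits the window."""
    return prompt_tokens + max_new_tokens <= limit


def max_new_tokens_allowed(prompt_tokens, limit=CONTEXT_LIMIT):
    """Return how many tokens are left for generation after the prompt."""
    return max(0, limit - prompt_tokens)


# Hypothetical request: a 3,500-token prompt asking for up to 1,024 new tokens.
print(fits_in_context(3500, 1024))
print(max_new_tokens_allowed(3500))
```

Step-by-step mathematical solutions can be long, so it is worth reserving generation room before filling the prompt with few-shot examples.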