Llemma 7B
Llemma 7B has tracked model metadata but no tracked provider pricing, which keeps it from being a default production pick.
Use it for
- Teams evaluating classification and JSON / tool use
- Workloads that can use a 4k context window
Do not use it for
- Cost-sensitive launches that need sourced token pricing
- Vision or document-understanding workloads
- Teams that need a tracked hosted API route today
- Family: Llemma
- Released: 2023-09-26
- Context: 4k
- Parameters: 7B
- Architecture: Decoder-only
- Knowledge cutoff: 2023-04
- Specialization: general
- Training: finetuned
No tracked provider token pricing is available yet.
About
Llemma 7B is an open-source large language model with 7 billion parameters, built for mathematical tasks. It is initialized from Code Llama 7B and further pretrained for 200 billion tokens on Proof-Pile-2, a mixture of scientific papers, web data containing mathematics, and mathematical code. The model is aimed at chain-of-thought mathematical reasoning and is reported to outperform Llama-2 and Code Llama on benchmarks such as MATH and GSM8k. It can also use tools such as a Python interpreter and attempt formal theorem proving without additional fine-tuning. The weights are openly available, which makes the model a base for further research.
Llemma 7B is a model in the Llemma family. The structured metadata tracks a 4k-token context window and structured outputs. No headline benchmark score is tracked for Llemma 7B yet.
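A 4k-token window has to hold both the prompt and the generated answer, so it can help to check the budget before sending a request. The sketch below is illustrative only: it assumes "4k" means 4096 tokens and uses a rough heuristic of about 4 characters per token, and the names `estimate_tokens` and `fits_context` are hypothetical helpers, not part of any Llemma tooling. For exact counts, use the model's own tokenizer.

```python
# Assumption: "4k" means 4096 tokens; the page only states "4k".
CONTEXT_WINDOW_TOKENS = 4096
# Assumption: rough heuristic of ~4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate; a real tokenizer is needed for exact counts."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_context(prompt: str, max_new_tokens: int) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window."""
    return estimate_tokens(prompt) + max_new_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    prompt = "Prove that the sum of two even integers is even."
    print(fits_context(prompt, max_new_tokens=512))
```

Reserving `max_new_tokens` up front matters for chain-of-thought style answers, which can be much longer than the prompt itself.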
Top use-case fit
Classification
Included by capability and metadata signals in the decision map.
JSON / Tool use
Included by capability and metadata signals in the decision map.
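When evaluating the classification and JSON fit, it is prudent to validate each reply rather than trust that it is well-formed. The following is a minimal sketch under stated assumptions: the reply format `{"label": "..."}`, the label set, and the helper name `parse_label` are all hypothetical and chosen for illustration; none of them comes from this page or from the model's documentation.

```python
import json

# Hypothetical label set for illustration; not taken from this page.
ALLOWED_LABELS = {"algebra", "geometry", "number_theory"}


def parse_label(raw_output: str):
    """Return the label from a reply like {"label": "algebra"}, or None if invalid."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # reply was not valid JSON
    if not isinstance(data, dict):
        return None  # valid JSON, but not an object
    label = data.get("label")
    # Accept only string labels that belong to the expected set.
    if isinstance(label, str) and label in ALLOWED_LABELS:
        return label
    return None


if __name__ == "__main__":
    print(parse_label('{"label": "algebra"}'))
    print(parse_label("The answer is algebra."))
```

Returning `None` for any malformed reply lets the caller count failures separately from wrong labels during an evaluation run.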
Provider price ladder
No tracked provider token pricing is available for this model yet.
Capabilities
Benchmark peer bars for Classification
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.