
Llemma
About
Llemma is a family of open-access large language models (LLMs) specialized for mathematical reasoning. Developed by EleutherAI, the models were initialized from Code Llama weights and further trained on Proof-Pile-2, a dataset of 55 billion unique tokens drawn from mathematical and scientific documents. This training lets Llemma models perform chain-of-thought mathematical reasoning and use computational tools such as Python and formal theorem provers. Llemma is available in 7-billion and 34-billion parameter variants, and the models, particularly the larger one, outperform other LLMs of similar size on a range of mathematical benchmarks. Because the project is released openly, it supports ongoing research on mathematical reasoning with LLMs 23.