LLM Reference

SaulLM

Developer: Equall · Domain: Legal
6 models · 2024

About

The SaulLM family is a collection of large language models built specifically for the legal domain. The foundational SaulLM-7B model comprises 7 billion parameters and was initially trained on an extensive English legal corpus of over 30 billion tokens; it was then further refined through continued pretraining and instruction fine-tuning to produce SaulLM-7B-Instruct, which is optimized for instruction-following tasks in the legal sector. Following the original model's success, the family expanded with the larger SaulLM-54B and SaulLM-141B, comprising 54 billion and 141 billion parameters, respectively. These models are based on the Mixtral architecture and focus on deeper domain adaptation, including continued pretraining on a legal dataset exceeding 540 billion tokens. Fine-tuned with specialized instruction protocols and aligned with human legal interpretation preferences, all models in the SaulLM family are released under a permissive MIT license, fostering open collaboration within the legal AI community.
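As a rough sketch of how one of these models might be used in practice, the snippet below loads a SaulLM instruct model with the Hugging Face transformers library. The repository id `Equall/Saul-7B-Instruct-v1` and the Mistral-style `[INST]` prompt format are assumptions (SaulLM-7B is initialized from Mistral-7B, so this format is a reasonable guess); check the Equall organization on the Hugging Face Hub for exact names.

```python
MODEL_ID = "Equall/Saul-7B-Instruct-v1"  # assumed repo id; verify on the Hub


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in Mistral-style [INST] tags.

    SaulLM-7B derives from Mistral-7B, so this template is an assumption,
    not a documented SaulLM requirement.
    """
    return f"<s>[INST] {instruction} [/INST]"


if __name__ == "__main__":
    # Heavy imports and weight download (~7B parameters) are kept inside
    # main so the prompt helper above stays usable without a GPU.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    prompt = build_prompt("Summarize the doctrine of consideration.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=200)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The instruct variants are the ones tuned for this kind of prompted use; the base models (Saul 7B, 54B, 141B) are better suited as starting points for further fine-tuning.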

Specifications (6 models)

SaulLM model specifications comparison

Model               Released  Parameters
Saul 141B           2024-07   141B
Saul 54B            2024-07   54B
Saul 141B Instruct  2024-07   141B
Saul 54B Instruct   2024-07   54B
Saul 7B             2024-02   7B
Saul 7B Instruct    2024-02   7B

Frequently Asked Questions

What is SaulLM?
SaulLM is a family of large language models built for the legal domain. It spans the 7-billion-parameter SaulLM-7B, trained on over 30 billion tokens of English legal text, and the larger Mixtral-based SaulLM-54B and SaulLM-141B, which continue pretraining on a legal dataset exceeding 540 billion tokens. Each size ships in base and instruction-tuned variants, and all models are released under the MIT license.
How many models are in the SaulLM family?
The SaulLM family contains 6 models.
What is the latest SaulLM model?
The latest models are Saul 141B and Saul 141B Instruct, released in July 2024.