StarCoder

About

StarCoder is a family of openly licensed large language models (LLMs) for code generation and editing. The models were developed by the BigCode project, a collaboration led by Hugging Face and ServiceNow, and trained on permissively licensed GitHub data covering more than 80 programming languages, along with Git commits, GitHub issues, and Jupyter notebooks. StarCoderBase, the foundation model, has 15.5 billion parameters and was trained on 1 trillion tokens; StarCoder is StarCoderBase further fine-tuned on about 35 billion Python tokens. Beyond code completion, the models can modify code according to instructions, explain snippets in natural language, and act as a technical assistant. They are released under the BigCode OpenRAIL-M v1 license, which permits both research and commercial use subject to its use restrictions. A later generation, StarCoder2, reports improved performance and is available in 3B, 7B, and 15B parameter sizes.
