LLM Reference

StarCoder

About

StarCoder is a large language model specialized for code generation and understanding, developed by the BigCode project, a collaboration between Hugging Face and ServiceNow. It uses a decoder-only transformer architecture with multi-query attention and an 8192-token context window. The base model, StarCoderBase, has 15.5 billion parameters and was trained on 1 trillion tokens of permissively licensed GitHub data spanning many programming languages. StarCoder itself is StarCoderBase fine-tuned on an additional 35 billion Python tokens, giving it strong performance on Python tasks. The model is multilingual across programming languages and can act as a technical assistant through dialogue as well as support code autocompletion, modification, and debugging. It is released under the OpenRAIL-M license, which permits open access and redistribution. StarCoder 2 extends these capabilities further, supporting over 600 programming languages and incorporating architectural improvements for better performance.
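Autocompletion with StarCoder is typically driven by fill-in-the-middle (FIM) prompting, where the model generates the code that belongs between a given prefix and suffix. A minimal sketch of building such a prompt, assuming the FIM sentinel tokens published with the StarCoder tokenizer (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Construct a prefix-suffix-middle (PSM) fill-in-the-middle prompt.

    The model is expected to continue generation after <fim_middle>
    with the code that fits between `prefix` and `suffix`.
    Token names assume StarCoder's published tokenizer special tokens.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"


# Ask the model to fill in the body of a function:
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n",
)
print(prompt)
```

The resulting string is what you would pass as the prompt to the model; the completion it returns is the inferred "middle" segment.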

Capabilities

Multimodal · Function Calling · Tool Use · JSON Mode

Providers (1)

Provider: Fireworks AI Platform
Input (per 1M tokens): $0.20
Output (per 1M tokens): $0.20
Type: Serverless
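At the listed rate of $0.20 per million tokens for both input and output, per-request cost is a simple linear function of token counts. A small helper illustrating the arithmetic (constant names are ours, prices are from the table above):

```python
# Fireworks AI Platform rates from the providers table above (USD per 1M tokens).
INPUT_USD_PER_M = 0.20
OUTPUT_USD_PER_M = 0.20


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of a single request at per-million-token rates."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


# Example: a full 8K-token context plus a 512-token completion.
print(f"${request_cost_usd(8192, 512):.6f}")
```

Since input and output are priced identically here, cost reduces to $0.20 per million total tokens processed.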

Specifications

Family: StarCoder
Parameters: 15.5B
Context: 8K tokens
Architecture: Decoder-only
Specialization: General