LLM Reference

StarCoder2 3B

About

StarCoder2-3B is a 3-billion-parameter large language model developed by the BigCode project for code generation tasks. Its architecture is a transformer decoder with Grouped Query Attention and sliding-window attention, trained with the Fill-in-the-Middle objective on 17 programming languages. The model supports a context window of 16,384 tokens, which suits tasks such as code completion and code translation over large inputs. It is a base model, not instruction-tuned, so it is not designed to follow natural-language instructions directly. Quantized versions are available for lower memory usage, and users should note that generated code is not guaranteed to be error-free.
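The Fill-in-the-Middle objective mentioned above means the model can complete a gap between a given prefix and suffix, not just continue text left to right. A minimal sketch of how such a prompt is arranged, using the FIM special tokens published with the StarCoder2 tokenizer (the helper function itself is illustrative, not part of any official API):

```python
# Sketch: building a Fill-in-the-Middle (FIM) prompt for StarCoder2.
# <fim_prefix>, <fim_suffix>, <fim_middle> are the special tokens from
# the StarCoder2 tokenizer; build_fim_prompt is a hypothetical helper.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange prefix and suffix so the model generates the middle."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    prefix="def average(xs):\n    return ",
    suffix=" / len(xs)\n",
)
print(prompt)
```

The model then generates the missing middle (here, likely something like `sum(xs)`) until it emits an end-of-middle token.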

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode

Providers (1)

Provider: Fireworks AI Platform
Input (per 1M): n/a
Output (per 1M): n/a
Type: Provisioned

Specifications

Parameters: 3B
Context: 16K
Architecture: Decoder Only
Specialization: code
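The 3B parameter count and the quantized variants noted above translate directly into memory requirements. A back-of-envelope sketch of weight memory at different precisions (weights only; it ignores the KV cache and activations, and the figures are illustrative estimates, not measurements):

```python
# Rough weight-memory estimate for a 3B-parameter model at several
# weight precisions. Excludes KV cache and activation memory.

PARAMS = 3e9  # 3 billion parameters

def weight_memory_gib(bits_per_param: float) -> float:
    """Bytes for all weights at the given precision, in GiB."""
    return PARAMS * bits_per_param / 8 / 2**30

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{weight_memory_gib(bits):.1f} GiB")
# fp16: ~5.6 GiB, int8: ~2.8 GiB, int4: ~1.4 GiB
```

This is why 4-bit quantized builds fit comfortably on consumer GPUs while the fp16 checkpoint needs roughly 6 GiB for weights alone.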