StarCoder2 3B
About
StarCoder2-3B is a 3-billion-parameter large language model developed by the BigCode project for code generation tasks. Its architecture is a transformer decoder with grouped-query attention and sliding-window attention, trained with a Fill-in-the-Middle objective. The model supports a context window of 16,384 tokens, allowing it to handle large contexts for tasks like code completion and translation across 17 programming languages; however, it is not designed for direct instruction-following. Grouped-query attention and the availability of quantized versions keep memory usage low, though users should note that generated code may not be error-free.
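Because the model is trained with a Fill-in-the-Middle objective rather than instruction-following, prompts are typically structured with special sentinel tokens that mark the code before and after the gap to be completed. A minimal sketch of building such a prompt is below; the token names (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) follow the BigCode FIM convention and should be verified against the model's own tokenizer configuration.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code before and after the gap so the model
    generates the missing middle after the <fim_middle> token.

    Sentinel token names are assumptions based on the BigCode
    FIM convention; check the tokenizer's special tokens.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"


# Example: ask the model to fill in the body of a function.
prompt = build_fim_prompt(
    prefix="def average(xs):\n    return ",
    suffix=" / len(xs)\n",
)
print(prompt)
```

The resulting string would then be passed to the model as an ordinary completion prompt; everything the model emits after `<fim_middle>` is the proposed infill.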
Capabilities
Multimodal · Function Calling · Tool Use · JSON Mode
Providers(1)
| Provider | Input (per 1M) | Output (per 1M) | Type |
|---|---|---|---|
| Fireworks AI Platform | — | — | Provisioned |