LLM Reference

DeepSeek Coder 6.7B

About

DeepSeek Coder 6.7B is a large language model tailored for code tasks such as code completion and insertion. With 6.7 billion parameters, it performs well on repository-level code projects. It was trained on 2 trillion tokens, roughly 87% code and 13% natural language in English and Chinese. Its decoder-only architecture supports a 16K context window, and training included a fill-in-the-middle (FIM) objective, which enables project-level code infilling. The model achieves state-of-the-art results among open-source code models on several benchmarks and supports commercial use under the DeepSeek license. Quantized versions are available on Hugging Face for diverse hardware, and a fine-tuned deepseek-coder-6.7b-instruct variant, further trained on 2 billion instruction tokens, is also available.
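The fill-in-the-middle objective mentioned above means the base model can complete a gap between a code prefix and suffix. A minimal sketch of building such a prompt is shown below; the special token strings follow the format published in the DeepSeek Coder repository, but treat them as assumptions and verify against the tokenizer of the exact model revision you use.

```python
# Sketch: constructing a fill-in-the-middle (FIM) prompt for the
# deepseek-coder-6.7b-base model. The sentinel tokens below are taken from
# the DeepSeek Coder README; confirm them against your tokenizer's vocabulary.
FIM_BEGIN = "<｜fim▁begin｜>"  # marks the start of the code prefix
FIM_HOLE = "<｜fim▁hole｜>"    # marks the position to be filled in
FIM_END = "<｜fim▁end｜>"      # marks the end of the code suffix

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the cursor into a single FIM prompt."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"

# Ask the model to fill in the body of a function:
prompt = build_fim_prompt(
    prefix="def quick_sort(arr):\n    ",
    suffix="\n    return arr\n",
)
```

The resulting string would then be tokenized and passed to the model for generation (for example via Hugging Face `transformers`); the completion generated between the prefix and suffix is the infilled code.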

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode

Providers (3)

Provider              | Input (per 1M) | Output (per 1M) | Type
Cloudflare Workers AI |                |                 | Serverless
Fireworks AI Platform |                |                 | Provisioned
NVIDIA NIM            |                |                 | Provisioned

Specifications

Released: 2024-03-07
Parameters: 6.7B
Architecture: Decoder Only
Specialization: Code