LLM Reference

DeepSeek Coder 6.7B

About

DeepSeek Coder 6.7B is a large language model tailored for code tasks such as code completion and insertion. With 6.7 billion parameters, it performs well on repository-level code projects. It was trained on 2 trillion tokens, roughly 87% code and 13% natural language in English and Chinese. Its decoder-only architecture supports a 16K context window, and training included a fill-in-the-middle (FIM) objective, which enables project-level code infilling. The model achieves state-of-the-art results among open-source code models on several benchmarks and supports commercial use under the DeepSeek license. Quantized versions are available on Hugging Face for diverse hardware, and a fine-tuned deepseek-coder-6.7b-instruct variant, further trained on 2 billion instruction tokens, is also available.
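The fill-in-the-middle objective mentioned above means the base model can complete a gap between a code prefix and suffix. A minimal sketch of building such a prompt is shown below; the special token strings follow the format published in the DeepSeek Coder repository, but treat them as assumptions and verify against the tokenizer of the exact model revision you use.

```python
# Sketch: constructing a fill-in-the-middle (FIM) prompt for the
# deepseek-coder-6.7b-base model. The sentinel tokens below are taken from
# the DeepSeek Coder README; confirm them against your tokenizer's vocabulary.
FIM_BEGIN = "<｜fim▁begin｜>"  # marks the start of the code prefix
FIM_HOLE = "<｜fim▁hole｜>"    # marks the position to be filled in
FIM_END = "<｜fim▁end｜>"      # marks the end of the code suffix

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the cursor into a single FIM prompt."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"

# Ask the model to fill in the body of a function:
prompt = build_fim_prompt(
    prefix="def quick_sort(arr):\n    ",
    suffix="\n    return arr\n",
)
```

The resulting string would then be tokenized and passed to the model for generation (for example via Hugging Face `transformers`); the completion generated between the prefix and suffix is the infilled code.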

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode

Providers (3)

Provider              | Input (per 1M) | Output (per 1M) | Type
Cloudflare Workers AI |                |                 | Serverless
Fireworks AI Platform |                |                 | Provisioned
NVIDIA NIM            |                |                 | Provisioned

Specifications

Released: 2024-03-07
Parameters: 6.7B
Architecture: Decoder Only
Specialization: Code