
DeepSeek Coder V2

About

DeepSeek Coder V2 is an open-source Mixture-of-Experts (MoE) code language model tailored for code intelligence and software development. It rivals closed-source models such as GPT-4 Turbo on code-specific tasks, having been further pre-trained from a DeepSeek-V2 base on an additional 6 trillion tokens. This training strengthens its coding and mathematical reasoning while preserving strong general language ability. The model supports 338 programming languages and a context length of 128K tokens, making it capable of handling extensive codebases and complex tasks. DeepSeek Coder V2 is accessible via Hugging Face, DeepSeek's official website, and an OpenAI-compatible API. Its architecture combines Multi-head Latent Attention (MLA) with the DeepSeekMoE framework for efficient inference and cost-effective training.
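
Because the API is OpenAI-compatible, the standard OpenAI Python SDK can be pointed at it. The sketch below assumes the base URL https://api.deepseek.com and the model identifier deepseek-coder; confirm both against the provider's current documentation.

```python
# Minimal sketch: calling DeepSeek Coder V2 through an OpenAI-compatible API.
# The base_url and model name are assumptions; verify them with the provider.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",  # assumed DeepSeek endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-coder",  # assumed model identifier
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a linked list."},
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```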

Capabilities

Multimodal · Function Calling · Tool Use · JSON Mode
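
On OpenAI-compatible endpoints, JSON mode is typically requested through the response_format parameter, as in the sketch below. Whether a given provider honors this parameter for this model is an assumption; check the provider's docs.

```python
# Sketch of requesting structured JSON output via an OpenAI-compatible API.
# Endpoint, model name, and response_format support are assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="deepseek-coder",  # assumed identifier
    response_format={"type": "json_object"},  # JSON mode
    messages=[
        {"role": "system", "content": "Reply in JSON with keys 'language' and 'snippet'."},
        {"role": "user", "content": "Give me a hello-world program in Go."},
    ],
)

print(response.choices[0].message.content)  # a JSON string
```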

Providers (2)

Provider                 Input (per 1M)   Output (per 1M)   Type
DeepSeek Platform        —                —                 Serverless
Fireworks AI Platform    $1.20            $1.20             Serverless, Provisioned

Specifications

Released: 2024-06-17
Parameters: 236B
Context: 128K tokens
Architecture: Mixture of Experts
Specialization: Code