LLM ReferenceLLM Reference
OpenRouter

Using DeepSeek V4 Flash on OpenRouter

Implementation guide · DeepSeek V4 · DeepSeek

ServerlessOpen Source

Quick Start

  1. 1
    Create an account at OpenRouter and generate an API key.
  2. 2
    Use the OpenRouter SDK or REST API to call deepseek/deepseek-v4-flash — see the documentation for request format.
  3. 3
    You'll be billed $0.14/1M input, $0.28/1M output tokens. See full pricing.

Code Examples

See OpenRouter documentation for integration details.

About OpenRouter

OpenRouter provides a unified interface for Large Language Models with better pricing, improved uptime, and no subscription requirements. Route across providers for cost optimization and reliability.

Multi-provider LLM aggregator offering unified API access to 300+ models from all major labs and emerging providers, with automatic failover for reliability.

Pricing on OpenRouter

TypePrice (per 1M)
Input tokens$0.14
Output tokens$0.28

Capabilities

VisionMultimodalReasoningFunction CallingTool UseStructured OutputsCode Execution

About DeepSeek V4 Flash

DeepSeek V4 Flash is a 284B parameter (13B activated) Mixture-of-Experts language model with 1M-token context. Features a hybrid attention architecture combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) for efficient long-context inference. Supports thinking and non-thinking modes. Legacy API aliases deepseek-chat and deepseek-reasoner map to this model's non-thinking and thinking modes respectively. Pricing: $0.14/1M input, $0.28/1M output (cache hit: $0.028/1M input). MIT licensed.

Model Specs

Released2026-04-24
Parameters284B
Context1M
ArchitectureMixture of Experts

Provider

OpenRouter
OpenRouter

OpenRouter, Inc.

New York, NY, USA