Using DeepSeek V4 Flash on Novita AI
Implementation guide · DeepSeek V4 · DeepSeek
ServerlessOpen Source
Quick Start
- 1
- 2Use the Novita AI SDK or REST API to call
deepseek-v4-flash. - 3
Code Examples
Code examples for this provider have not been sourced yet.
About Novita AI
Novita AI offers a GPU-based inference API for image, video, and language model generation with a broad catalog of open-source models.
Pricing on Novita AI
| Type | Price (per 1M) |
|---|---|
| Input tokens | $0.14 |
| Output tokens | $0.28 |
Capabilities
ReasoningFunction CallingTool UseStructured Outputs
About DeepSeek V4 Flash
DeepSeek V4 Flash is a 284B parameter (13B activated) Mixture-of-Experts language model with 1M-token context. Features a hybrid attention architecture combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) for efficient long-context inference. Supports thinking and non-thinking modes. Legacy API aliases deepseek-chat and deepseek-reasoner map to this model's non-thinking and thinking modes respectively. Pricing: $0.14/1M input, $0.28/1M output (cache hit: $0.0028/1M input). MIT licensed.
Model Specs
Released2026-04-24
Parameters284B
Context1M
ArchitectureMixture of Experts