LLM Reference

Using DeepSeek V4 Pro on Novita AI

Implementation guide · DeepSeek V4 · DeepSeek

ServerlessOpen Source

Quick Start

  1. 1
    Create an account at Novita AI and generate an API key.
  2. 2
    Use the Novita AI SDK or REST API to call deepseek-v4-pro.
  3. 3
    You'll be billed $1.64/1M input, $3.38/1M output tokens. See full pricing.

Code Examples

Code examples for this provider have not been sourced yet.

About Novita AI

Novita AI offers a GPU-based inference API for image, video, and language model generation with a broad catalog of open-source models.

Pricing on Novita AI

TypePrice (per 1M)
Input tokens$1.64
Output tokens$3.38

Capabilities

ReasoningFunction CallingTool UseStructured OutputsPrompt Caching

About DeepSeek V4 Pro

DeepSeek V4 Pro is DeepSeek's flagship open-weights model, released April 24 2026 under the MIT license. Architecture: 1.6T total / 49B active parameters, MoE with Compressed Sparse Attention (CSA) + Heavily Compressed Attention (HCA) hybrid — requiring only 27% of inference FLOPs vs standard 1M-context transformers — plus Manifold-Constrained Hyper-Connections (mHC) and Muon Optimizer. Context window: 1,000,000 tokens; max output: 384,000 tokens (Think Max mode requires ≥384K context). Text-only (no vision/image input). Supports three reasoning modes: Non-Think, Think High, Think Max. Function calling, tool use, and structured outputs supported. Key benchmarks: SWE-bench Verified 80.6%, SWE-bench Pro 55.4%, LiveCodeBench 93.5%, GPQA Diamond 90.1%, MMLU-Pro 87.5%, Terminal-Bench 2.0 67.9%, Chatbot Arena 1460 (2026-04-28). Current API pricing: $0.435/$0.87 per 1M input/output tokens (75% discount active until 2026-05-31 15:59 UTC); regular rate $1.74/$3.48.

Model Specs

Released2026-04-24
Parameters1.6T
Context1M
ArchitectureMixture of Experts (MoE) with CSA+HCA hybrid attention

Provider

Novita AI