SiliconFlow

Researched 3d ago | Inference Platform | Tier 3

Coding | RAG | Agents | Long context | Vision | Classification | JSON / Tool use | API

SiliconFlow offers 13 tracked models, all 13 with output token pricing. This catalog covers coding, RAG, and agent workloads; open any model detail page for benchmarks, batch tiers, and migration prompts.

Covers 7 workload areas across 13 tracked models; last verified 2026-06-29.

Use it for

  • Teams comparing token and batch pricing across this provider's models
  • Operators routing coding, RAG, and agent workloads through this API

Do not use it for

  • Final benchmark picks without opening the relevant model detail page

  • Tracked models: 13 (models available through this provider)
  • Priced output routes: 13 (models with output token pricing tracked)
  • Cheapest output: $0.040 (Qwen2.5-7B-Instruct on this route)
  • Batch-ready models: 0 (no batch pricing tracked)
  • Latest model release: 2026-06-02 (30d since newest release)
  • Freshness: 2026-06-29 (researched 3d ago; fresh)

Information

Type: Inference Platform
Tier: Tier 3
Models: 13
Company: SiliconFlow

SiliconFlow is a model serving platform for open and closed model inference, offering fast and cost-effective API access to popular AI models.

Catalog freshness

The newest model tracked on this provider was released 2026-06-02 (30d ago).

Where this host wins

  • Coding: 10 tracked models with SWE-bench / HumanEval-style scores.
  • RAG: 9 tracked models with RULER / needle-in-a-haystack retrieval benchmarks.
  • Agentic: 4 tracked models with BFCL, tau-bench, and SWE-bench tool-use coverage.
  • Long-context: 10 tracked models with context-token or InfiniteBench-class signal.

Compliance notes

No verified compliance claims (SOC 2, ISO, HIPAA) are tracked for this provider yet; check the vendor's trust center for current certifications.

Platform Overview

Compare per-model pricing, input and output token costs, batch availability, and benchmark coverage.

Available Models (13)

All models available as Serverless

Model                         Input (per 1M)   Output (per 1M)
Nex-N2-Pro                    $0.50            $2.50
DeepSeek R1                   $0.25            $0.80
DeepSeek V3                   $0.15            $0.50
Qwen2.5-Coder-32B-Instruct    $0.18            $0.18
Grok-2                        $0.50            $0.50
Mistral Large 2 (2407)        $2.00            $2.00
Mistral NeMo (2407)           $0.30            $0.30
Qwen2.5-14B-Instruct          $0.08            $0.08
Qwen2.5-32B-Instruct          $0.15            $0.15
Qwen2.5-72B-Instruct          $0.28            $0.28

Showing 10 of 13 tracked models; see the full catalog for the rest.
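
To make the per-1M-token prices above easier to compare for a concrete workload, here is a minimal Python sketch that estimates spend and ranks the listed rows. The names `PRICES`, `estimate_cost`, and `rank_by_cost` are illustrative and not part of any SiliconFlow SDK. The sketch assumes billing is linear in tokens, with no caching discounts, minimums, or batch tiers. Prices are copied from the rows shown on this page and may be out of date.

```python
# Prices in USD per 1M tokens as (input, output), copied from the rows listed above.
# Assumption: billing is linear in token count; real invoices may differ.
PRICES = {
    "Nex-N2-Pro": (0.50, 2.50),
    "DeepSeek R1": (0.25, 0.80),
    "DeepSeek V3": (0.15, 0.50),
    "Qwen2.5-Coder-32B-Instruct": (0.18, 0.18),
    "Grok-2": (0.50, 0.50),
    "Mistral Large 2 (2407)": (2.00, 2.00),
    "Mistral NeMo (2407)": (0.30, 0.30),
    "Qwen2.5-14B-Instruct": (0.08, 0.08),
    "Qwen2.5-32B-Instruct": (0.15, 0.15),
    "Qwen2.5-72B-Instruct": (0.28, 0.28),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload on one model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[str]:
    """Return the listed model names sorted from cheapest to most expensive."""
    return sorted(
        PRICES,
        key=lambda model: estimate_cost(model, input_tokens, output_tokens),
    )


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for name in rank_by_cost(2_000_000, 500_000):
        print(f"{name:<28} ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

Because input and output prices differ for some models (Nex-N2-Pro, DeepSeek R1, DeepSeek V3), the ranking can shift with the input-to-output ratio, so it is worth running the estimate with token counts that match the actual workload rather than comparing output prices alone.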

Where else to run this