LLM Reference

DeepInfra

AI Platform

DeepInfra offers serverless AI inference through a simple API, supporting hundreds of models across text generation, embeddings, and more. Pricing is pay-per-token, with no upfront commitments.

About DeepInfra

DeepInfra is a cloud inference platform offering cost-effective access to open-source AI models. It provides serverless inference for leading models from Meta, Mistral, Alibaba, and others with competitive token-based pricing.

Available Models (10)

Model                            Input (per 1M)   Output (per 1M)   Type
Qwen2.5 72B Instruct             $23              $23               Serverless
Qwen2.5 Coder 32B                $20              $20               Serverless
Qwen2.5 14B Instruct             $10              $10               Serverless
Qwen2.5 7B Instruct              $3               $3                Serverless
Qwen2 57B-A14B                   $16              $16               Serverless
Llama 3.1 70B Instruct           $40              $40               Serverless
DeepSeek V3                      $32              $89               Serverless
DeepSeek R1 Distill Llama 70B    $70              $80               Serverless
Mixtral 8x7B                     $54              $54               Serverless
Mistral NeMo Instruct (2407)     $2               $4                Serverless
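
Because input and output tokens are priced separately per 1M tokens, the cost of a request can be estimated from its token counts. The sketch below is illustrative only: it copies three rows from the table above at face value (in whatever unit the listing uses), and the names `PRICES_PER_MILLION` and `estimate_cost` are hypothetical helpers, not part of any DeepInfra API.

```python
# Prices per 1M tokens as (input, output), copied as listed in the table above.
# The unit is taken at face value from the listing; check current pricing before relying on it.
PRICES_PER_MILLION = {
    "DeepSeek V3": (32.0, 89.0),
    "Mistral NeMo Instruct (2407)": (2.0, 4.0),
    "Qwen2.5 7B Instruct": (3.0, 3.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one request, in the table's price unit."""
    input_price, output_price = PRICES_PER_MILLION[model]
    # Each side is billed in proportion to its share of one million tokens.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # 500k input tokens and 100k output tokens on DeepSeek V3.
    print(estimate_cost("DeepSeek V3", 500_000, 100_000))
```

For models whose input and output prices differ, such as DeepSeek V3, the output side can dominate the bill for generation-heavy workloads even when the prompt is much longer than the response.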

Company Info

Founded: 2023
San Francisco, California, United States