LLM Reference

Models on Cloudflare Workers AI

20 models available · Cloudflare

ModelInput (per 1M)Output (per 1M)Context
DeepSeek Coder 6.7B
DeepSeek Math 7B
Falcon 7B
Gemma 2B Instruct2K
Gemma 7B Instruct8K
Hermes 2 Pro Mistral 7B32K
Llama 2 13B Chat4K
Llama 2 7B Chat4K
Llama 3 8B Instruct8K
Llama Guard 7B2K
Mistral 7B v0.18K
OpenChat 3.5 (0106)8K
OpenHermes 2.5 Mistral 7B32K
Phi-2
Qwen1.5-0.5B
Qwen1.5-1.8B
Qwen1.5-14B
Qwen1.5-7B
SQLCoder 7B 2
Starling LM 7B Beta

About Cloudflare Workers AI

Cloudflare Workers AI is a serverless GPU inference platform enabling developers to run machine learning models on Cloudflare's global edge network. It supports diverse AI tasks including text generation, image classification, automatic speech recognition, and real-time language translation. The platform provides pay-per-use pricing and access to a curated library of open-source models from Hugging Face, enabling rapid deployment without complex infrastructure management. Key features include low-latency edge computing, streaming responses for large language models, context length customization, and the AI Gateway for monitoring, caching, and cost optimization.

Full provider profile →