Models on Cloudflare Workers AI
20 models available · Cloudflare
| Model | Input (per 1M) | Output (per 1M) | Context | |
|---|---|---|---|---|
| DeepSeek Coder 6.7B | — | — | — | |
| DeepSeek Math 7B | — | — | — | |
| Falcon 7B | — | — | — | |
| Gemma 2B Instruct | — | — | 2K | |
| Gemma 7B Instruct | — | — | 8K | |
| Hermes 2 Pro Mistral 7B | — | — | 32K | |
| Llama 2 13B Chat | — | — | 4K | |
| Llama 2 7B Chat | — | — | 4K | |
| Llama 3 8B Instruct | — | — | 8K | |
| Llama Guard 7B | — | — | 2K | |
| Mistral 7B v0.1 | — | — | 8K | |
| OpenChat 3.5 (0106) | — | — | 8K | |
| OpenHermes 2.5 Mistral 7B | — | — | 32K | |
| Phi-2 | — | — | — | |
| Qwen1.5-0.5B | — | — | — | |
| Qwen1.5-1.8B | — | — | — | |
| Qwen1.5-14B | — | — | — | |
| Qwen1.5-7B | — | — | — | |
| SQLCoder 7B 2 | — | — | — | |
| Starling LM 7B Beta | — | — | — |
About Cloudflare Workers AI
Cloudflare Workers AI is a serverless GPU inference platform enabling developers to run machine learning models on Cloudflare's global edge network. It supports diverse AI tasks including text generation, image classification, automatic speech recognition, and real-time language translation. The platform provides pay-per-use pricing and access to a curated library of open-source models from Hugging Face, enabling rapid deployment without complex infrastructure management. Key features include low-latency edge computing, streaming responses for large language models, context length customization, and the AI Gateway for monitoring, caching, and cost optimization.
Full provider profile →