Cloudflare Workers AI Models — Pricing & Benchmarks
40 models available · Cloudflare
Cloudflare Workers AI hosts 40 AI models in this catalog. The lowest listed input price is Llama 3.2 1B Instruct at $0.027/1M input tokens. LLM Reference lets you compare these models across all 80 providers without switching tabs.
Where else to run this
Pricing Overview
About Cloudflare Workers AI
Cloudflare Workers AI is a serverless GPU inference platform enabling developers to run machine learning models on Cloudflare's global edge network. It supports diverse AI tasks including text generation, image classification, automatic speech recognition, and real-time language translation. The platform provides pay-per-use pricing and access to a curated library of open-source models from Hugging Face, enabling rapid deployment without complex infrastructure management. Key features include low-latency edge computing, streaming responses for large language models, context length customization, and the AI Gateway for monitoring, caching, and cost optimization.
Full provider profile →