Run inference on region: Earth

Build and deploy ambitious AI applications to Cloudflare's global network

Full-stack AI Building Blocks

Serverless AI on GPUs

Run generative AI tasks on our global network of NVIDIA GPUs with no extra setup.

Models Included

Choose from a variety of popular models in our catalog including Llama-2, Whisper, and ResNet50.

Available everywhere

Run AI models from Workers, Pages, or anywhere via our REST API.

Supercharge with Vectorize

Generate and store embeddings in a globally distributed vector database.

AI Gateway

Improve reliability and scalability with caching, rate limiting, and analytics.

Train with R2

Build multi-cloud training architectures with free egress.

Simplified with Neurons

Estimate the costs and number of neurons needed for your typical AI workloads.

Estimate your cost

Choose your models and enter the average response size and estimated daily traffic.

Model: @cf/meta/m2m100-1.2b
Inputs: input tokens, output tokens, requests
Daily Neurons (usage): 6.6K
Daily Cost: Free

Neurons aggregate usage across different models into a single metric. Each account gets 10K free neurons per day; usage beyond that is billed at $0.011 per 1,000 neurons.

This calculator is for informational purposes only. Prices reflect the public fees as of March 1, 2024, and do not include taxes or any other fees.
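The neuron pricing described above can be turned into a rough back-of-the-envelope estimate. The sketch below is illustrative only: the function and constant names are hypothetical, it assumes the free daily allocation is subtracted before any billing, and it does not model how many neurons a given model consumes per token or request.

```typescript
// Figures taken from the pricing text above (public fees as of March 1, 2024).
const FREE_DAILY_NEURONS = 10_000;
const USD_PER_1K_NEURONS = 0.011;

// Hypothetical helper: estimates the daily cost in USD for a given neuron usage.
// Assumes the free allocation is applied first and only the excess is billed.
function estimateDailyCostUsd(dailyNeurons: number): number {
  if (dailyNeurons < 0) {
    throw new RangeError("dailyNeurons must be non-negative");
  }
  const billableNeurons = Math.max(0, dailyNeurons - FREE_DAILY_NEURONS);
  return (billableNeurons / 1000) * USD_PER_1K_NEURONS;
}

// The 6.6K neurons shown in the calculator fall under the free allocation.
console.log(estimateDailyCostUsd(6_600));
// 25K neurons leave 15K billable, which is roughly $0.165 per day.
console.log(estimateDailyCostUsd(25_000).toFixed(3));
```

Under these assumptions, usage at or below 10K neurons per day costs nothing, which matches the "Free" result shown for 6.6K daily neurons.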

Zero to production in minutes

Less boilerplate. More fun.

Choose a template from our curated catalog of off-the-shelf models that let you perform tasks including image classification, sentiment analysis, speech recognition, text generation, and translation.

Add a vector database without breaking the bank

Speed up and scale your AI workflows with Vectorize. Generate new embeddings or store existing ones to enable search on top of your own data for repeated use with machine learning models.

Grab your model and go

All it takes is a few lines of code with Workers AI and Vectorize to run an AI inference task on Pages using your favorite framework, Workers, or any stack via an API. Pick your model and go.
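As a rough idea of what "a few lines of code" might look like, here is a minimal sketch of a translation call using the @cf/meta/m2m100-1.2b model from the calculator above. It covers only the inference step, not Vectorize. The `AiBinding` interface is a hypothetical stand-in for the Workers AI binding so the example stays self-contained, and the input and output field names (`text`, `source_lang`, `target_lang`, `translated_text`) are assumptions to verify against the model's documentation.

```typescript
// Hypothetical minimal shape of the Workers AI binding (e.g. `env.AI` in a Worker).
// The real type comes from Cloudflare's type definitions.
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

const TRANSLATION_MODEL = "@cf/meta/m2m100-1.2b";

// Translates `text` into `targetLang` by running the model through the binding.
// Field names are assumed; check the model's schema before relying on them.
async function translate(
  ai: AiBinding,
  text: string,
  targetLang: string,
  sourceLang: string = "english",
): Promise<string> {
  const result = (await ai.run(TRANSLATION_MODEL, {
    text,
    source_lang: sourceLang,
    target_lang: targetLang,
  })) as { translated_text?: unknown };

  if (typeof result.translated_text !== "string") {
    throw new Error("Unexpected response shape from the model");
  }
  return result.translated_text;
}
```

Inside a Worker, a function like this would typically be called from the fetch handler with the AI binding configured for the project, for example `await translate(env.AI, "Hello", "french")`.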

Working with the best AI companies in the world

Meta, NVIDIA, Microsoft, Hugging Face, Databricks

Enhance and protect your AI applications

Build reliable, secure, cost-effective AI architectures

No more surprise bills from your AI vendors

The AI Gateway adds a layer of control and protection to LLM applications.
  • Apply rate-limits and caching to protect back-end infrastructure and avoid surprise bills.
  • Gain visibility into how many people are using the service.
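To make the idea behind the bullets above concrete, here is a toy, in-memory sketch of what caching, rate limiting, and basic usage counting do in front of a model backend. It is not how AI Gateway is implemented or configured; every name is hypothetical, and the rate limit is a simple fixed call budget with no time window.

```typescript
// A backend is anything that turns a prompt into an answer (e.g. a paid LLM API).
type Backend = (prompt: string) => string;

// Hypothetical wrapper: serves repeated prompts from a cache and refuses
// backend calls once a fixed budget is used up.
function withToyGateway(backend: Backend, maxBackendCalls: number) {
  const cache = new Map<string, string>();
  let backendCalls = 0;
  let totalRequests = 0;

  return {
    ask(prompt: string): string {
      totalRequests += 1;
      const cached = cache.get(prompt);
      if (cached !== undefined) {
        return cached; // Cache hit: the backend is not called or billed.
      }
      if (backendCalls >= maxBackendCalls) {
        throw new Error("Rate limit exceeded");
      }
      backendCalls += 1;
      const answer = backend(prompt);
      cache.set(prompt, answer);
      return answer;
    },
    // Simple analytics: how much traffic arrived versus how much reached the backend.
    stats() {
      return { totalRequests, backendCalls };
    },
  };
}
```

With a budget of one backend call, asking the same prompt twice reaches the backend only once, and a different second prompt is rejected instead of generating an unexpected charge.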

Train where it's cheapest with egress-free data

Cost-effective storage for training models and AI-generated assets with R2
  • Egress-free storage makes multi-cloud architectures for training LLMs affordable.
  • Limitless storage for the ever-growing assets generated by users.

You're in good company

"We use Cloudflare for everything – storage, cache, queues, and most importantly for training data and deploying the app on the edge, so I can ensure the product is reliable and fast. It's also been the most affordable option, with competitors costing more for a single day's worth of requests than Cloudflare costs in a month."
Bhanu Teja Pachipulusu, Founder, SiteGPT.ai
Leonardo, SiteGPT, character.ai