Cloudflare Workers AI Provider Guide

Open-model inference at the edge via Cloudflare Workers AI


Overview

Cloudflare Workers AI serves Meta Llama, Mistral, and other open models on GPUs distributed across Cloudflare's global network. NeuroLink connects to it through the Workers AI OpenAI-compatible endpoint.

Key Facts

  • Protocol: OpenAI-compatible (/v1/chat/completions)
  • Default base URL: https://api.cloudflare.com/client/v4/accounts/{accountId}/ai/v1
  • Default model: @cf/meta/llama-3.3-70b-instruct-fp8-fast
  • Streaming: Yes
  • Tool calling: Limited (model-dependent)
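
To make the Key Facts concrete, here is a minimal sketch of how a request to the OpenAI-compatible endpoint could be assembled by hand. It assumes the account ID is substituted into the default base URL and the API token is sent as a Bearer token. The helper name buildChatRequest and its parameters are hypothetical illustrations, not part of NeuroLink, which handles this for you.

```typescript
// Hypothetical helper (not a NeuroLink API): assembles a chat completions
// request for the Workers AI OpenAI-compatible endpoint.

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

const DEFAULT_MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast";

function buildChatRequest(
  accountId: string,
  apiToken: string,
  prompt: string,
  model: string = DEFAULT_MODEL,
  stream: boolean = false,
): ChatRequest {
  // Substitute the account ID into the default base URL from Key Facts.
  const baseUrl =
    `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/ai/v1`;

  return {
    // The OpenAI-compatible chat route lives under the /ai/v1 base path.
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
        // Set to true to request a streamed response.
        stream,
      }),
    },
  };
}
```

The returned object can be passed straight to fetch, for example fetch(req.url, req.init), in any runtime that provides the Fetch API.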

Quick Start

1. Get Credentials

You need both:

  • A Cloudflare Account ID (Cloudflare dashboard → right sidebar)
  • A Workers AI API token with the Workers AI Read & Write permission (Profile → API Tokens → Create Token)

2. Configure

CLOUDFLARE_ACCOUNT_ID=your-account-id
CLOUDFLARE_API_KEY=your-workers-ai-token
CLOUDFLARE_MODEL=@cf/meta/llama-3.3-70b-instruct-fp8-fast

3. Generate

import { NeuroLink } from "@juspay/neurolink";

// Credentials and model are expected to come from the CLOUDFLARE_* variables set in step 2.
const ai = new NeuroLink();

const result = await ai.generate({
  provider: "cloudflare",
  input: { text: "Why is Cloudflare's edge network significant?" },
});

console.log(result.content);

Supported Models (sample)

Model ID                                    Notes
@cf/meta/llama-3.3-70b-instruct-fp8-fast    Default
@cf/meta/llama-3.1-70b-instruct             Llama 3.1 70B
@cf/meta/llama-3.1-8b-instruct              Fast tier
@cf/meta/llama-3.2-11b-vision-instruct      Vision-capable

Browse the full model catalog: https://developers.cloudflare.com/workers-ai/models


CLI Usage

pnpm run cli generate "..." --provider cloudflare

Provider Aliases

Alias         Example
cloudflare    --provider cloudflare
cf            --provider cf

Configuration Reference

Environment Variable     Required    Default
CLOUDFLARE_ACCOUNT_ID    Yes         (none)
CLOUDFLARE_API_KEY       Yes         (none)
CLOUDFLARE_MODEL         No          @cf/meta/llama-3.3-70b-instruct-fp8-fast
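
The required and default rules in the reference above can be checked before any request is made. The following is a minimal sketch of such a check, assuming the environment is passed in as a plain object. The function loadCloudflareConfig and the CloudflareConfig type are hypothetical names used for illustration only; NeuroLink's own validation may behave differently.

```typescript
// Hypothetical pre-flight check (not a NeuroLink API): validates the
// CLOUDFLARE_* variables and applies the documented default model.

interface CloudflareConfig {
  accountId: string;
  apiKey: string;
  model: string;
}

const FALLBACK_MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast";

function loadCloudflareConfig(
  env: Record<string, string | undefined>,
): CloudflareConfig {
  // Both of these are marked "Required: Yes" in the reference.
  const required = ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_KEY"];
  const missing = required.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`,
    );
  }

  return {
    accountId: (env["CLOUDFLARE_ACCOUNT_ID"] as string).trim(),
    apiKey: (env["CLOUDFLARE_API_KEY"] as string).trim(),
    // CLOUDFLARE_MODEL is optional; fall back to the documented default.
    model: env["CLOUDFLARE_MODEL"]?.trim() || FALLBACK_MODEL,
  };
}
```

In Node.js this could be called as loadCloudflareConfig(process.env) at startup, so a missing account ID or token fails fast with a clear message instead of surfacing later as an authentication error.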

See Also