Cloudflare Workers AI Provider Guide

Open-model inference at the edge via Cloudflare Workers AI

Overview

Cloudflare Workers AI serves Meta Llama, Mistral, and other open models on GPUs distributed across Cloudflare's global network. NeuroLink connects to it through Cloudflare's OpenAI-compatible endpoint.

Key Facts

Protocol: OpenAI-compatible (/v1/chat/completions)
Default base URL: https://api.cloudflare.com/client/v4/accounts/{accountId}/ai/v1
Default model: @cf/meta/llama-3.3-70b-instruct-fp8-fast
Streaming: Yes
Tool calling: Limited (model-dependent)
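
To make the protocol and base URL above concrete, here is a minimal sketch of how a raw request to the OpenAI-compatible endpoint could be assembled without NeuroLink. The helper name `buildChatRequest` and the `ChatRequest` type are hypothetical and exist only for illustration. The URL template and default model are taken from this guide. Bearer-token authentication and the `messages` body shape are assumed from the OpenAI chat completions convention.

```typescript
// Hypothetical helper for illustration; not part of NeuroLink.
// Only the URL template and the default model come from this guide.
const DEFAULT_MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast";

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  accountId: string,
  apiToken: string,
  prompt: string,
  model: string = DEFAULT_MODEL,
): ChatRequest {
  // Fill the {accountId} placeholder in the default base URL.
  const baseUrl = `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/ai/v1`;
  return {
    // The OpenAI-compatible route is appended to the base URL.
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        // Assumption: the API token is sent as a Bearer token.
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      // Assumption: OpenAI-style chat body with a single user message.
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`. Keeping request construction separate from the network call makes the URL and headers easy to inspect before anything is sent.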

Quick Start

1. Get Credentials

You need both:

A Cloudflare Account ID (Cloudflare dashboard → right sidebar)
A Workers AI API token with the Workers AI Read & Write permission (Profile → API Tokens → Create Token)

2. Configure

CLOUDFLARE_ACCOUNT_ID=your-account-id
CLOUDFLARE_API_KEY=your-workers-ai-token
CLOUDFLARE_MODEL=@cf/meta/llama-3.3-70b-instruct-fp8-fast

3. Generate

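import { NeuroLink } from "@juspay/neurolink";

// Credentials and the model come from the CLOUDFLARE_* variables set in step 2.
const ai = new NeuroLink();

// Top-level await requires an ES module context.
const result = await ai.generate({
  provider: "cloudflare",
  input: { text: "Why is Cloudflare's edge network significant?" },
});
console.log(result.content);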

Supported Models (sample)

Model ID	Notes
`@cf/meta/llama-3.3-70b-instruct-fp8-fast`	Default
`@cf/meta/llama-3.1-70b-instruct`	Llama 3.1 70B
`@cf/meta/llama-3.1-8b-instruct`	Fast tier
`@cf/meta/llama-3.2-11b-vision-instruct`	Vision-capable

Browse: https://developers.cloudflare.com/workers-ai/models

CLI Usage

pnpm run cli generate "..." --provider cloudflare

Provider Aliases

Alias	Example
`cloudflare`	`--provider cloudflare`
`cf`	`--provider cf`

Configuration Reference

Environment Variable	Required	Default
`CLOUDFLARE_ACCOUNT_ID`	Yes	—
`CLOUDFLARE_API_KEY`	Yes	—
`CLOUDFLARE_MODEL`	No	`@cf/meta/llama-3.3-70b-instruct-fp8-fast`
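
The table above can be sketched as a small startup check that fails fast when a required variable is missing and falls back to the default model otherwise. This is an illustrative sketch, not NeuroLink's actual configuration code. The function name `loadCloudflareConfig` and the `CloudflareConfig` type are hypothetical. Only the variable names and the default model come from this guide.

```typescript
// Hypothetical config loader for illustration; NeuroLink's internals may differ.
interface CloudflareConfig {
  accountId: string;
  apiKey: string;
  model: string;
}

const FALLBACK_MODEL = "@cf/meta/llama-3.3-70b-instruct-fp8-fast";

function loadCloudflareConfig(
  env: Record<string, string | undefined>,
): CloudflareConfig {
  // Collect every missing required variable so the error lists them all at once.
  const required = ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`,
    );
  }
  return {
    accountId: env["CLOUDFLARE_ACCOUNT_ID"] as string,
    apiKey: env["CLOUDFLARE_API_KEY"] as string,
    // CLOUDFLARE_MODEL is optional; an unset or empty value uses the default.
    model: env["CLOUDFLARE_MODEL"] || FALLBACK_MODEL,
  };
}
```

In a Node.js process this would typically be called as `loadCloudflareConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test with plain objects.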
