AI Provider Guides

Complete setup guides for all supported AI providers.


🆓 Free Tier Providers

Start with zero cost using these free-tier options:

Hugging Face

100,000+ open-source models

  • ✅ Free inference API
  • 🌍 Largest model collection
  • 🔓 Fully open source
  • 📊 Models by task: chat, classification, NER, summarization

Setup Guide →

Google AI Studio

Gemini models with generous free tier

  • ✅ 1,500 requests/day free
  • ⚡ Fast Gemini 2.0 Flash
  • 🎯 15 requests/minute
  • 💰 Pay-as-you-go option

Setup Guide →


🤖 Direct AI Providers

Access leading AI models directly from their creators:

Anthropic

Claude models with API key or OAuth authentication

  • 🧠 Claude 4.5 Opus/Sonnet/Haiku, Claude 4.0 Opus/Sonnet
  • 🔐 API key or OAuth (Pro/Max subscription)
  • 💭 Extended thinking for deep reasoning
  • 📄 200K context window, multimodal support

Setup Guide →


🏢 Enterprise Providers

Production-grade providers for enterprise deployments:

Azure OpenAI

Enterprise AI with Microsoft Azure

  • 🔒 SOC2, HIPAA, ISO 27001 compliant
  • 🌍 Multi-region deployment (30+ regions)
  • 🛡️ Private endpoints with VNet
  • 💼 Enterprise SLAs

Setup Guide →

Google Vertex AI

Google Cloud ML platform

  • ☁️ GCP integration
  • 🔐 IAM, VPC, service accounts
  • 🌏 Global deployment
  • 🎯 Gemini, PaLM, Codey models

Setup Guide →

AWS Bedrock

Serverless AI on AWS

  • 📦 13 foundation models (Claude, Llama, Mistral)
  • 🔐 IAM, VPC integration
  • 🌍 Multi-region (us-east-1, eu-west-1, ap-southeast-1)
  • 💰 Pay-per-use pricing

Setup Guide →


🌍 Compliance-Focused

Providers with specific compliance certifications:

Mistral AI

European AI with GDPR compliance

  • 🇪🇺 EU data residency
  • ✅ GDPR compliant by default
  • 🔓 Open source models
  • 💰 Cost-effective

Setup Guide →


🧑‍💻 Hosted Inference Providers

Access frontier models via hosted cloud inference APIs:

DeepSeek

deepseek-chat (V3) and deepseek-reasoner (R1)

  • 🧠 deepseek-chat — high-quality general chat at low cost
  • 💭 deepseek-reasoner — R1 chain-of-thought reasoning model
  • 🔑 API key from platform.deepseek.com
  • 🔄 Aliases: ds

Setup Guide →

NVIDIA NIM

400+ models via NVIDIA's hosted and self-hosted inference platform

  • 🚀 Llama 3.3 70B Instruct (default), Mistral, Nemotron, and 400+ catalog models
  • 🔧 NIM-specific extras: top_k, min_p, repetition_penalty, reasoning_budget
  • 🔑 API key from build.nvidia.com
  • 🖥️ Also supports self-hosted NIM endpoints via NVIDIA_NIM_BASE_URL
  • 🔄 Aliases: nim, nvidia

Setup Guide →


💻 Local Providers

Run models entirely on your own hardware — no API key or internet required for inference:

LM Studio

Run any supported model locally with a GUI app

  • 🖥️ Download and run models via the LM Studio desktop application
  • 🔍 Auto-discovers the loaded model from /v1/models (no model name required)
  • 🌐 OpenAI-compatible API at http://localhost:1234/v1 by default
  • 🆓 No API key needed for local use (key optional for reverse-proxy setups)
  • 🔄 Aliases: lmstudio, lms

Setup Guide →

llama.cpp

High-performance local inference via llama-server

  • ⚡ Run GGUF models with llama-server at http://localhost:8080/v1 by default
  • 🔍 Auto-discovers the loaded model from /v1/models
  • 🛠️ Tool support requires --jinja flag when starting llama-server
  • 🆓 No API key needed for local use (key optional for reverse-proxy setups)
  • 🔄 Aliases: llama.cpp

Setup Guide →
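Both local servers speak the standard OpenAI-compatible API, so any client that can POST to `/v1/chat/completions` works with either. A minimal sketch of assembling such a request (the `buildChatRequest` helper and its defaults are illustrative, not part of any library; the endpoint path and body shape follow the OpenAI chat completions convention):

```javascript
// Sketch: build an OpenAI-compatible chat request for a local server.
// LM Studio listens on http://localhost:1234/v1 by default and
// llama-server on http://localhost:8080/v1; both auto-discover the
// loaded model, so the model name is often ignored.
function buildChatRequest(baseUrl, text, model = "local-model") {
  return {
    url: `${baseUrl}/chat/completions`, // standard OpenAI-compatible path
    body: {
      model, // local servers typically substitute the loaded model
      messages: [{ role: "user", content: text }],
    },
  };
}

// Example: target LM Studio's default endpoint
const req = buildChatRequest("http://localhost:1234/v1", "Hello");
console.log(req.url); // http://localhost:1234/v1/chat/completions
```

Swap the base URL for `http://localhost:8080/v1` to hit llama-server instead; no API key header is needed for purely local use.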


🔌 Aggregators & Proxies

Access multiple providers through unified interfaces:

OpenRouter

300+ models from 60+ providers

  • 🌐 Single API for all major providers (Anthropic, OpenAI, Google, Meta, etc.)
  • ⚡ Automatic failover and routing
  • 💰 Competitive pricing with cost optimization
  • 🎯 Zero lock-in: switch models instantly
  • 📊 Usage tracking dashboard
  • 🆓 Free models available

Setup Guide →

OpenAI Compatible

OpenRouter, vLLM, LocalAI, and more

  • 🌐 100+ models through OpenRouter
  • 💻 Local deployment with vLLM
  • 🔓 Self-hosted with LocalAI
  • 🔄 Drop-in OpenAI replacement

Setup Guide →

LiteLLM

100+ providers through proxy

  • 🔄 Unified API for 100+ providers
  • 📊 Load balancing and fallbacks
  • 💰 Cost tracking
  • 🎯 Model routing

Setup Guide →


🎙️ Voice Providers

Synthesize speech, transcribe audio, or run live voice sessions. Voice providers are separate from LLM providers — they handle audio I/O rather than text generation.

Text-to-Speech (TTS)

OpenAI TTS

Highest-quality text-to-speech

  • 🎙️ Voices: alloy, echo, fable, onyx, nova, shimmer
  • 🎵 Models: tts-1 (fast) and tts-1-hd (high quality)
  • 🎼 Formats: MP3, WAV, OGG, Opus
  • 🔑 Auth: API Key (OPENAI_API_KEY)

Setup Guide →

ElevenLabs

Best multilingual and voice-cloning TTS

  • 🌍 Supports 30+ languages with natural prosody
  • 🎭 Custom voice cloning from short audio samples
  • 🎼 Formats: MP3, WAV (raw PCM, surfaced as pcm16), Opus (Ogg container)
  • 🔑 Auth: API Key (ELEVENLABS_API_KEY)

Setup Guide →

Google TTS

1M characters/month free tier

  • 💰 Generous free tier for standard voices
  • 🌍 380+ voices across 50+ languages
  • 🎼 Formats: MP3, WAV, OGG
  • 🔑 Auth: Service Account

Setup Guide →

Azure TTS

Enterprise TTS with full SSML support

  • 🏢 Fine-grained prosody control via SSML
  • 🌍 400+ neural voices, 140+ languages
  • 🎼 Formats: MP3, WAV (PCM), Opus (Ogg container)
  • 🔑 Auth: API Key + Region

Setup Guide →


Speech-to-Text (STT)

Whisper (OpenAI)

Highest transcription accuracy

  • 🎯 Best-in-class accuracy on diverse audio
  • 🌍 Multilingual with automatic language detection
  • 🎼 Formats: WAV, MP3, M4A, FLAC, OGG, OPUS, WEBM, MP4, MPEG, MPGA
  • 🔑 Auth: API Key (OPENAI_API_KEY)

Setup Guide →

Deepgram

Real-time streaming transcription via WebSocket

  • ⚡ Sub-300 ms word-level results over WebSocket
  • 🌊 REST batch and WebSocket streaming modes
  • 🎼 Formats: WAV, MP3, OGG, FLAC
  • 🔑 Auth: API Key (DEEPGRAM_API_KEY)

Setup Guide →

Google STT

125+ languages with speaker diarization

  • 🌍 Best fit for existing Google Cloud users
  • 👥 Speaker diarization and multi-channel audio
  • 🎼 Formats: WAV, FLAC, MP3, OGG
  • 🔑 Auth: API Key (GOOGLE_AI_API_KEY / GEMINI_API_KEY) or Service Account (GOOGLE_APPLICATION_CREDENTIALS)

Setup Guide →

Azure STT

Enterprise STT with custom model training

  • 🏢 Batch transcription and custom model support
  • 🔒 Compliance controls for regulated industries
  • 🎼 Formats: WAV (PCM), Ogg/Opus — convert MP3 to WAV first
  • 🔑 Auth: API Key + Region

Setup Guide →


Realtime Voice

Realtime providers maintain a persistent bidirectional WebSocket connection, enabling low-latency spoken conversation with the AI model.

OpenAI Realtime

Low-latency bidirectional voice over WebSocket

  • ⚡ Full-duplex audio stream with GPT-4o
  • 🎵 Voice activity detection (VAD) built-in
  • 🎼 Formats: WAV, Opus
  • 🔑 Auth: API Key (OPENAI_API_KEY)

Setup Guide →

Gemini Live

Google's native realtime voice API

  • ⚡ Native multimodal realtime session with Gemini
  • 🎵 Supports audio + video input simultaneously
  • 🎼 Formats: WAV, Opus
  • 🔑 Auth: API Key (GOOGLE_AI_API_KEY or GEMINI_API_KEY)

Setup Guide →


Quick Comparison

| Provider          | Free Tier  | Enterprise | GDPR   | Latency | Best For                              |
| ----------------- | ---------- | ---------- | ------ | ------- | ------------------------------------- |
| Anthropic         | Limited    |            |        | Low     | Reasoning, coding, Claude             |
| Hugging Face      | ✅         |            |        | Medium  | Open source, experimentation          |
| Google AI         | ✅         |            |        | Low     | Free tier, Gemini                     |
| Mistral AI        |            |            | ✅     | Low     | EU compliance, cost                   |
| OpenRouter        | Varies     |            |        | Low     | Multi-model, automatic failover       |
| OpenAI Compatible | Varies     |            | Varies | Varies  | Flexibility, local deployment         |
| LiteLLM           | Varies     |            |        | Low     | Multi-provider, unified API           |
| Azure OpenAI      |            | ✅         |        | Low     | Enterprise, Microsoft ecosystem       |
| Vertex AI         |            | ✅         |        | Low     | Enterprise, GCP ecosystem             |
| AWS Bedrock       |            | ✅         |        | Low     | Enterprise, AWS ecosystem             |
| DeepSeek          |            |            |        | Low     | Cost-effective reasoning, R1 model    |
| NVIDIA NIM        | Varies     |            |        | Low     | NVIDIA-hosted or self-hosted LLMs     |
| LM Studio         | ✅ (Local) |            |        | Varies  | Local GUI model management            |
| llama.cpp         | ✅ (Local) |            |        | Varies  | High-performance local GGUF inference |
| OpenAI TTS        |            |            |        | Low     | High-quality TTS (tts-1-hd)           |
| ElevenLabs        | Varies     |            |        | Low     | Multilingual TTS, voice cloning       |
| Google TTS        | ✅         |            |        | Low     | Cost-effective TTS, 1M chars free     |
| Azure TTS         |            | ✅         |        | Low     | Enterprise TTS, SSML support          |
| Whisper           |            |            |        | Low     | Best STT accuracy                     |
| Deepgram          | Varies     |            |        | Low     | Real-time STT streaming (WebSocket)   |
| Google STT        |            |            |        | Low     | STT for GCP users, 125+ languages     |
| Azure STT         |            | ✅         |        | Low     | Enterprise STT, custom models         |
| OpenAI Realtime   |            |            |        | Low     | Realtime bidirectional voice          |
| Gemini Live       |            |            |        | Low     | Realtime voice + video (Gemini)       |

Setup Strategies

Strategy 1: Free Tier with Failover

const ai = new NeuroLink({
  providers: [
    {
      name: "google-ai",
      priority: 1,
      config: { apiKey: process.env.GOOGLE_AI_KEY },
      quotas: { daily: 1500 },
    },
    {
      name: "openai",
      priority: 2,
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
  ],
  failoverConfig: { enabled: true, fallbackOnQuota: true },
});

const result = await ai.generate({
  input: { text: "Hello world" },
});
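Conceptually, the failover configuration above means: try providers in ascending `priority` order and fall through to the next one when a call fails (for example, when the daily quota is exhausted). A standalone sketch of that behavior (this illustrates the idea only; it is not NeuroLink's internal implementation):

```javascript
// Conceptual sketch of priority-ordered failover: call providers in
// ascending priority order and return the first successful result.
async function generateWithFailover(providers, call) {
  const ordered = [...providers].sort((a, b) => a.priority - b.priority);
  let lastError;
  for (const provider of ordered) {
    try {
      return await call(provider); // first success wins
    } catch (err) {
      lastError = err; // quota or transport error: try the next provider
    }
  }
  throw lastError; // every provider failed
}
```

With the configuration above, google-ai (priority 1) handles traffic until its free quota is exhausted, at which point requests fall through to openai (priority 2).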

Strategy 2: Multi-Region Enterprise

const ai = new NeuroLink({
  providers: [
    {
      name: "azure-us",
      region: "us-east",
      config: {
        /* Azure US */
      },
    },
    {
      name: "azure-eu",
      region: "eu-west",
      config: {
        /* Azure EU */
      },
    },
    {
      name: "bedrock-us",
      region: "us-east",
      config: {
        /* Bedrock US */
      },
    },
  ],
  loadBalancing: "latency-based",
});
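The `"latency-based"` strategy can be pictured as routing each request to whichever configured provider currently reports the lowest latency. A small sketch of that selection step (names and the `latenciesMs` shape are illustrative, not NeuroLink internals):

```javascript
// Conceptual sketch of latency-based routing: pick the provider with
// the lowest recent latency measurement; unmeasured providers rank last.
function pickByLatency(providers, latenciesMs) {
  return providers.reduce((best, p) =>
    (latenciesMs[p.name] ?? Infinity) < (latenciesMs[best.name] ?? Infinity)
      ? p
      : best
  );
}

// Example: azure-eu is currently the fastest, so it is selected
const chosen = pickByLatency(
  [{ name: "azure-us" }, { name: "azure-eu" }, { name: "bedrock-us" }],
  { "azure-us": 120, "azure-eu": 45, "bedrock-us": 90 }
);
console.log(chosen.name); // azure-eu
```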

Strategy 3: GDPR Compliance

const ai = new NeuroLink({
  providers: [
    {
      name: "mistral",
      priority: 1,
      config: { apiKey: process.env.MISTRAL_API_KEY },
    },
    {
      name: "azure-eu",
      priority: 2,
      config: {
        /* Azure EU region */
      },
    },
  ],
  compliance: {
    framework: "GDPR",
    dataResidency: "EU",
  },
});

Next Steps

  1. Choose a provider based on your requirements (free tier, compliance, region)
  2. Follow the setup guide to get your API key
  3. Configure NeuroLink with the provider
  4. Test the integration with a simple request
  5. Add failover for production reliability