Synthesize speech, transcribe audio, or run live voice sessions. Voice providers are separate from LLM providers — they handle audio I/O rather than text generation.

Text-to-Speech (TTS)

OpenAI TTS

Highest-quality text-to-speech

🎙️ Voices: alloy, echo, fable, onyx, nova, shimmer
🎵 Models: tts-1 (fast) and tts-1-hd (high quality)
🎼 Formats: MP3, WAV, OGG, Opus
🔑 Auth: API Key (OPENAI_API_KEY)

Setup Guide →

ElevenLabs

Best multilingual and voice-cloning TTS

🌍 Supports 30+ languages with natural prosody
🎭 Custom voice cloning from short audio samples
🎼 Formats: MP3, WAV (raw PCM, surfaced as pcm16), Opus (Ogg container)
🔑 Auth: API Key (ELEVENLABS_API_KEY)

Setup Guide →

Google TTS

1M characters/month free tier

💰 Generous free tier for standard voices
🌍 380+ voices across 50+ languages
🎼 Formats: MP3, WAV, OGG
🔑 Auth: Service Account

Setup Guide →

Azure TTS

Enterprise TTS with full SSML support

🏢 Fine-grained prosody control via SSML
🌍 400+ neural voices, 140+ languages
🎼 Formats: MP3, WAV (PCM), Opus (Ogg container)
🔑 Auth: API Key + Region

Setup Guide →

Speech-to-Text (STT)

Whisper (OpenAI)

Highest transcription accuracy

🎯 Best-in-class accuracy on diverse audio
🌍 Multilingual with automatic language detection
🎼 Formats: WAV, MP3, M4A, FLAC, OGG, OPUS, WEBM, MP4, MPEG, MPGA
🔑 Auth: API Key (OPENAI_API_KEY)

Setup Guide →

Deepgram

Real-time streaming transcription via WebSocket

⚡ Sub-300 ms word-level results over WebSocket
🌊 REST batch and WebSocket streaming modes
🎼 Formats: WAV, MP3, OGG, FLAC
🔑 Auth: API Key (DEEPGRAM_API_KEY)

Setup Guide →

Google STT

125+ languages with speaker diarization

🌍 Best fit for existing Google Cloud users
👥 Speaker diarization and multi-channel audio
🎼 Formats: WAV, FLAC, MP3, OGG
🔑 Auth: API Key (GOOGLE_AI_API_KEY / GEMINI_API_KEY) or Service Account (GOOGLE_APPLICATION_CREDENTIALS)

Setup Guide →

Azure STT

Enterprise STT with custom model training

🏢 Batch transcription and custom model support
🔒 Compliance controls for regulated industries
🎼 Formats: WAV (PCM), Ogg/Opus — convert MP3 to WAV first
🔑 Auth: API Key + Region

Setup Guide →
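
The format lists in the STT cards above can drive a quick pre-flight check before audio is sent for transcription. The following is a minimal sketch and not part of NeuroLink: the provider keys, `providersAccepting`, and `needsConversion` are hypothetical names, and the file extension is treated as a rough proxy for the actual codec.

```typescript
// Accepted input formats per STT provider, copied from the provider cards above.
// The keys are illustrative labels, not NeuroLink provider identifiers.
const sttFormats: Record<string, readonly string[]> = {
  whisper: ["wav", "mp3", "m4a", "flac", "ogg", "opus", "webm", "mp4", "mpeg", "mpga"],
  deepgram: ["wav", "mp3", "ogg", "flac"],
  "google-stt": ["wav", "flac", "mp3", "ogg"],
  "azure-stt": ["wav", "ogg", "opus"],
};

// Returns the providers whose format list contains the file's extension.
function providersAccepting(fileName: string): string[] {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  return Object.keys(sttFormats).filter((name) => sttFormats[name].includes(ext));
}

// True when the file should be converted (for example MP3 to WAV) before upload.
function needsConversion(provider: string, fileName: string): boolean {
  return !providersAccepting(fileName).includes(provider);
}
```

For example, `needsConversion("azure-stt", "call.mp3")` flags the MP3 case called out in the Azure STT card, while the same file passes for the other three providers.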

Realtime Voice

Realtime providers maintain a persistent bidirectional WebSocket connection, enabling low-latency spoken conversation with the AI model.

OpenAI Realtime

Low-latency bidirectional voice over WebSocket

⚡ Full-duplex audio stream with GPT-4o
🎵 Voice activity detection (VAD) built-in
🎼 Formats: WAV, Opus
🔑 Auth: API Key (OPENAI_API_KEY)

Setup Guide →

Gemini Live

Google's native realtime voice API

⚡ Native multimodal realtime session with Gemini
🎵 Supports audio + video input simultaneously
🎼 Formats: WAV, Opus
🔑 Auth: API Key (GOOGLE_AI_API_KEY or GEMINI_API_KEY)

Setup Guide →

Quick Comparison

Provider	Free Tier	Enterprise	GDPR	Latency	Best For
Anthropic	Limited	✅	✅	Low	Reasoning, coding, Claude
Hugging Face	✅	❌	✅	Medium	Open source, experimentation
Google AI	✅	✅	✅	Low	Free tier, Gemini
Mistral AI	❌	✅	✅	Low	EU compliance, cost
OpenRouter	✅	✅	Varies	Low	Multi-model, automatic failover
OpenAI Compatible	Varies	✅	Varies	Varies	Flexibility, local deployment
LiteLLM	❌	✅	Varies	Low	Multi-provider, unified API
Azure OpenAI	❌	✅	✅	Low	Enterprise, Microsoft ecosystem
Vertex AI	❌	✅	✅	Low	Enterprise, GCP ecosystem
AWS Bedrock	❌	✅	✅	Low	Enterprise, AWS ecosystem
DeepSeek	❌	✅	❌	Low	Cost-effective reasoning, R1 model
NVIDIA NIM	❌	✅	Varies	Low	NVIDIA-hosted or self-hosted LLMs
LM Studio	✅ (Local)	❌	✅	Varies	Local GUI model management
llama.cpp	✅ (Local)	❌	✅	Varies	High-performance local GGUF inference
OpenAI TTS	❌	✅	✅	Low	High-quality TTS (tts-1-hd)
ElevenLabs	❌	✅	Varies	Low	Multilingual TTS, voice cloning
Google TTS	✅	✅	✅	Low	Cost-effective TTS, 1M chars free
Azure TTS	❌	✅	✅	Low	Enterprise TTS, SSML support
Whisper	❌	✅	✅	Low	Best STT accuracy
Deepgram	❌	✅	Varies	Low	Real-time STT streaming (WebSocket)
Google STT	❌	✅	✅	Low	STT for GCP users, 125+ languages
Azure STT	❌	✅	✅	Low	Enterprise STT, custom models
OpenAI Realtime	❌	✅	✅	Low	Realtime bidirectional voice
Gemini Live	❌	✅	✅	Low	Realtime voice + video (Gemini)
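
The voice rows of the table can be encoded as data to shortlist providers programmatically. This is an illustrative sketch only: `VoiceProviderRow`, `voiceProviders`, and `shortlist` are hypothetical names, and the values are transcribed from the Free Tier and GDPR columns above.

```typescript
type VoiceKind = "tts" | "stt" | "realtime";

interface VoiceProviderRow {
  name: string;
  kind: VoiceKind;
  freeTier: boolean;
  gdpr: "yes" | "varies";
}

// Voice rows from the Quick Comparison table (Free Tier and GDPR columns).
const voiceProviders: VoiceProviderRow[] = [
  { name: "OpenAI TTS", kind: "tts", freeTier: false, gdpr: "yes" },
  { name: "ElevenLabs", kind: "tts", freeTier: false, gdpr: "varies" },
  { name: "Google TTS", kind: "tts", freeTier: true, gdpr: "yes" },
  { name: "Azure TTS", kind: "tts", freeTier: false, gdpr: "yes" },
  { name: "Whisper", kind: "stt", freeTier: false, gdpr: "yes" },
  { name: "Deepgram", kind: "stt", freeTier: false, gdpr: "varies" },
  { name: "Google STT", kind: "stt", freeTier: false, gdpr: "yes" },
  { name: "Azure STT", kind: "stt", freeTier: false, gdpr: "yes" },
  { name: "OpenAI Realtime", kind: "realtime", freeTier: false, gdpr: "yes" },
  { name: "Gemini Live", kind: "realtime", freeTier: false, gdpr: "yes" },
];

// Returns provider names of one kind that satisfy the optional requirements.
function shortlist(
  kind: VoiceKind,
  opts: { requireFreeTier?: boolean; requireGdpr?: boolean } = {},
): string[] {
  return voiceProviders
    .filter((p) => p.kind === kind)
    .filter((p) => !opts.requireFreeTier || p.freeTier)
    .filter((p) => !opts.requireGdpr || p.gdpr === "yes")
    .map((p) => p.name);
}
```

Rows marked "Varies" are excluded when `requireGdpr` is set, which is the conservative reading; check the provider's own terms before relying on it.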

Setup Strategies

Strategy 1: Free Tier First (Recommended for Development)

SDK Usage

import { NeuroLink } from "@juspay/neurolink";

// Try Google AI first; fall back to OpenAI when the daily quota is exhausted.
const ai = new NeuroLink({
  providers: [
    {
      name: "google-ai",
      priority: 1,
      config: { apiKey: process.env.GOOGLE_AI_API_KEY },
      quotas: { daily: 1500 },
    },
    {
      name: "openai",
      priority: 2,
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
  ],
  failoverConfig: { enabled: true, fallbackOnQuota: true },
});

const result = await ai.generate({
  input: { text: "Hello world" },
});

CLI Usage

# Set up environment variables
export GOOGLE_AI_API_KEY="your-key"
export OPENAI_API_KEY="your-key"

# Use with automatic failover
npx @juspay/neurolink generate "Hello world" \
  --provider google-ai
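
The idea behind priority ordering with quota fallback can be sketched independently of the SDK. The code below is an assumption about the general technique, not NeuroLink's internal implementation: `ProviderEntry`, `pickProvider`, and the usage map are hypothetical names introduced for illustration.

```typescript
interface ProviderEntry {
  name: string;
  priority: number; // lower number is tried first
  dailyQuota?: number; // undefined means no quota is tracked
}

// Picks the highest-priority provider that still has quota left today.
function pickProvider(
  providers: ProviderEntry[],
  usedToday: Record<string, number>,
): string | undefined {
  const ordered = [...providers].sort((a, b) => a.priority - b.priority);
  const available = ordered.find(
    (p) => p.dailyQuota === undefined || (usedToday[p.name] ?? 0) < p.dailyQuota,
  );
  return available?.name;
}

// Mirrors the Strategy 1 configuration: free tier first, paid provider second.
const freeTierFirst: ProviderEntry[] = [
  { name: "google-ai", priority: 1, dailyQuota: 1500 },
  { name: "openai", priority: 2 },
];
```

With 10 requests used, `pickProvider(freeTierFirst, { "google-ai": 10 })` selects `google-ai`; once usage reaches 1500 it selects `openai` instead.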

Strategy 2: Multi-Region Enterprise

const ai = new NeuroLink({
  providers: [
    {
      name: "azure-us",
      region: "us-east",
      config: {
        /* Azure US */
      },
    },
    {
      name: "azure-eu",
      region: "eu-west",
      config: {
        /* Azure EU */
      },
    },
    {
      name: "bedrock-us",
      region: "us-east",
      config: {
        /* Bedrock US */
      },
    },
  ],
  loadBalancing: "latency-based",
});
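
One common way to implement latency-based balancing is to keep a smoothed latency per provider and route to the lowest. The sketch below uses an exponential moving average; it is an assumption about how such a strategy could work, not a description of what `loadBalancing: "latency-based"` does internally, and `recordLatency`, `fastestProvider`, and `SMOOTHING` are hypothetical names.

```typescript
// Weight given to the newest sample in the exponential moving average.
const SMOOTHING = 0.3;

// Smoothed latency in milliseconds, keyed by provider name.
const averageLatencyMs = new Map<string, number>();

// Folds a new latency sample into the provider's moving average.
function recordLatency(provider: string, latencyMs: number): void {
  const previous = averageLatencyMs.get(provider);
  const next =
    previous === undefined
      ? latencyMs
      : SMOOTHING * latencyMs + (1 - SMOOTHING) * previous;
  averageLatencyMs.set(provider, next);
}

// Returns the candidate with the lowest smoothed latency.
function fastestProvider(candidates: string[]): string | undefined {
  let best: string | undefined;
  let bestLatency = Infinity;
  for (const name of candidates) {
    // Unmeasured providers count as 0 ms so each gets sampled at least once.
    const latency = averageLatencyMs.get(name) ?? 0;
    if (latency < bestLatency) {
      best = name;
      bestLatency = latency;
    }
  }
  return best;
}
```

Smoothing keeps a single slow response from immediately demoting a provider, while a sustained slowdown still shifts traffic to the other regions.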

Strategy 3: GDPR Compliance

const ai = new NeuroLink({
  providers: [
    {
      name: "mistral",
      priority: 1,
      config: { apiKey: process.env.MISTRAL_API_KEY },
    },
    {
      name: "azure-eu",
      priority: 2,
      config: {
        /* Azure EU region */
      },
    },
  ],
  compliance: {
    framework: "GDPR",
    dataResidency: "EU",
  },
});
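
A data-residency constraint amounts to filtering the provider list before priorities are applied. The following is a minimal sketch under stated assumptions: the `residency` field, `RegionalProvider`, and `providersFor` are hypothetical and do not come from the NeuroLink API, and the residency values assigned to each provider are illustrative.

```typescript
type Residency = "EU" | "US";

interface RegionalProvider {
  name: string;
  priority: number; // lower number is tried first
  residency: Residency; // hypothetical field, not a NeuroLink option
}

// Keeps only providers in the required region, ordered by priority.
function providersFor(required: Residency, providers: RegionalProvider[]): string[] {
  return providers
    .filter((p) => p.residency === required)
    .sort((a, b) => a.priority - b.priority)
    .map((p) => p.name);
}

// Illustrative values only; confirm each provider's hosting region yourself.
const regionalCandidates: RegionalProvider[] = [
  { name: "azure-us", priority: 1, residency: "US" },
  { name: "azure-eu", priority: 2, residency: "EU" },
  { name: "mistral", priority: 1, residency: "EU" },
];
```

Filtering first means a US-hosted provider can never be chosen as a fallback, even when every EU provider is unavailable.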

Next Steps

Choose a provider based on your requirements (free tier, compliance, region)
Follow the setup guide to get your API key
Configure NeuroLink with the provider
Test the integration with a simple request
Add failover for production reliability

Related Documentation

Multi-Provider Failover - High availability patterns
Cost Optimization - Reduce costs by 80-95%
Compliance & Security - GDPR, SOC2, HIPAA
Load Balancing - Distribution strategies
Voice Providers Comparison - TTS, STT, and Realtime capability matrix
Voice Provider Selection - Choosing the right voice provider
