# NeuroLink Documentation (Complete)

> Enterprise AI Development Platform - Unified provider access, MCP integration, professional CLI

Generated: 2026-02-27T07:54:12.406Z
Summary version: https://docs.neurolink.ink/llms.txt
Total files: 354

---

## Table of Contents

### Introduction
- NeuroLink
- Page Not Found
- AI Analysis Tools
- NeuroLink AI Enhancements - Complete Documentation
- ️ AI Development Workflow Tools
- Automated Publishing Guide (Semantic Release)
- Business Documentation Hub
- Business Value Guide: Analytics & Evaluation Features
- Workflow Engine - High-Level Design
- Workflow Engine - Low-Level Design
- Domain Configuration Examples for NeuroLink CLI
- ️ NeuroLink CLI Guide
- ️ CLI Reference Guide
- Lighthouse Unified Integration Guide
- npm Trusted Publishing Setup
- Step-by-Step Integration Tutorials
- Industry Use Cases: Real-World Applications
- Visual Demonstrations

### Getting Started
- Getting Started
- AI Provider Guides
- Quick Start
- Installation
- Environment Variables Configuration Guide
- AWS Bedrock Provider Guide
- Azure OpenAI Provider Guide
- Google AI Studio Provider Guide
- ⚙️ Provider Configuration Guide
- Google Vertex AI Provider Guide
- Hugging Face Provider Guide
- Redis Quick Start (5 Minutes)
- LiteLLM Provider Guide
- Mistral AI Provider Guide
- Ollama Setup Guide
- OpenAI-Compatible Providers Guide
- OpenRouter Provider Guide
- SageMaker Integration - Deploy Your Custom AI Models

### SDK Reference
- SDK Reference
- API Reference
- Advanced SDK Features
- SDK Custom Tools Guide
- SDK Custom Tools Guide
- ️ Framework Integration Guide
- NestJS Integration Guide

### CLI
- CLI Command Reference
- CLI Guide
- Advanced CLI Usage
- CLI Examples

### Features
- Feature Guides
- Audio Input & Transcription Guide
- Auto Evaluation Engine
- CLI Loop Sessions
- Context Compaction
- Redis Conversation History Export
- CSV File Support
- Enterprise Human-in-the-Loop System
- File Processors Guide
- Guardrails AI Integration with Middleware
- Guardrails Implementation Guide
- Guardrails Middleware
- Human-in-the-Loop (HITL) Workflows
- Image Generation Streaming Guide
- Interactive CLI - Your AI Development Environment
- MCP Tools Ecosystem - 58+ Integrations
- Memory Guide
- Multimodal Chat Experiences
- Multimodal Capabilities Guide
- Observability Guide
- Office Documents Support
- PDF File Support
- Provider Orchestration Brain
- RAG Document Processing Guide
- Real-time Services Guide
- Regional Streaming Controls
- Speech-to-Speech Agents: Architecture and Gemini Live Integration Plan
- Structured Output with Zod Schemas
- Extended Thinking Configuration
- Text-to-Speech (TTS) Integration Guide
- Video Analysis
- Video Generation with Veo 3.1

### Examples
- Examples & Tutorials
- Advanced Examples
- Basic Usage Examples
- Business Applications
- Tool Blocking Feature Example
- Use Cases & Applications

### Cookbook
- NeuroLink Cookbook
- Batch Processing
- Context Window Management
- Conversation Summarization
- Cost Optimization
- Error Recovery Patterns
- Multi-Provider Fallback
- Rate Limit Handling
- Streaming with Retry Logic
- Structured Output with JSON Schema
- Tool Chaining with MCP

### MCP Integration
- MCP Foundation (Model Context Protocol)
- MCP Configuration Locations Across AI Development Tools
- MCP Concurrency Control Guide
- NeuroLink Docs MCP Server
- HTTP Transport for MCP Servers
- MCP (Model Context Protocol) Integration Guide
- NeuroLink MCP Latency Optimization Implementation Guide
- MCP Foundation Testing Guide

### Advanced
- Advanced Features
- Analytics & Evaluation
- Built-in Middleware Reference
- CLI Guide
- Enterprise Features
- NeuroLink Factory Patterns - Complete Implementation Guide
- Factory Pattern Migration Guide
- Memory Integration with Mem0
- Middleware System Architecture
- Streaming Responses
- Updated Provider Test Results

### Reference
- Reference
- Analytics Reference
- Error Code Reference
- Provider Behavior Guide
- Provider Capabilities Audit
- AI Provider Comparison Guide
- Provider Feature Compatibility Reference
- Troubleshooting Guide
- Frequently Asked Questions
- Provider Selection Guide
- Srvr Cofiguratio Rfrc []

### Tutorials
- NeuroLink Tutorials
- Build a Complete Chat Application
- Build a RAG System
- Video Tutorials

### Development
- Development
- System Architecture
- Changelog Automation & Formatting
- CLI Factory Integration Impact Assessment
- Factory Pattern Architecture
- Factory Pattern Migration Guide
- Design Doc: Large Context Handling via Map-Reduce Summarization
- Automated Link Checking
- Package Version Overrides Documentation
- ✅ Provider-Agnostic Testing Framework - UPDATED STATUS
- COMPREHENSIVE TESTING & VERIFICATION PLAN
- NeuroLink Testing Guide - ALL 9 PROVIDERS WORKING
- Documentation Versioning

### Guides
- NeuroLink Guides
- Server Adapters
- Migration Guides
- Enterprise Guides
- Hono Adapter
- Express Adapter
- Fastify Adapter
- Koa Adapter
- Middleware Reference
- Streaming Guide
- WebSocket Support
- Error Handling
- Domain-Specific AI Usage Guide
- Security Best Practices
- MCP Server Catalog
- Migrating from LangChain to NeuroLink
- Express.js Integration Guide
- Production Code Patterns
- Audit Trails & Compliance Logging
- Deployment Guide
- Dynamic Model Configuration System
- Migrating from Vercel AI SDK to NeuroLink
- Fastify Integration Guide
- Real-World Use Cases
- Compliance & Security Guide
- Next.js Integration Guide
- Cost Optimization Guide
- GitHub Action Guide
- SvelteKit Integration Guide
- Load Balancing Strategies
- Migration Guide (v7.40 → v7.47)
- Monitoring & Observability Guide
- Provider Selection Wizard
- Multi-Provider Failover & High Availability
- Complete Redis Configuration Guide
- Multi-Region Deployment Guide
- Redis Migration Patterns
- Session Management & Persistence Guide
- Vector Stores Guide

### Memory
- Conversation Memory
- NeuroLink Mem0 Memory Integration
- Automatic Conversation Summarization

### Observability
- Health Monitoring & Auto-Recovery Guide
- Provider Status Monitoring and Health Management
- Enterprise Telemetry Guide

### Deployment
- ⚙️ NeuroLink Configuration Guide
- ️ Enterprise Configuration Management Guide
- Enterprise & Proxy Setup Guide
- Performance Optimization Guide for NeuroLink CLI with Domain Features
- Performance Optimization Guide

### Demos
- Visual Demos
- Interactive Demo
- Screenshots Gallery
- Video Demonstrations

### About
- NeuroLink Vision & Roadmap

### Community
- Changelog
- Contributor Covenant Code of Conduct
- Contributing to NeuroLink

### Workflows
- AI-Driven Tool Orchestration Guide
- Custom Middleware Development Guide
- Error Handling
- NeuroLink Middleware System
- Advanced AI Model Orchestration

### Visual Content
- AI Development Workflow Tools - Visual Proof Documentation
- Phase 1.2 Screenshot Summary
- MCP CLI Screenshots
- Phase 1.2 AI Development Workflow Tools - Visual Content Achievement Report
- Phase 1.2 AI Development Workflow Tools - Visual Content Plan

### Playground
- Interactive Playground

### Rag
- RAG Processing - CLI Reference
- RAG Processing - Configuration Guide
- RAG Processing - Testing Guide
- RAG Processing - Manual Verification Checklist

### Implementation Guides
- RAG Document Processing - Implementation Guide

### Api
- NeuroLink API Reference v8.42.0
- Variable: DEFAULT_HTTP_RETRY_CONFIG
- Enumeration: AIProviderName
- Type Alias: AIModelProviderConfig
- Function: assembleContext()
- Class: AIProviderFactory
- Variable: DEFAULT_PROVIDER_CONFIGS
- Enumeration: BedrockModels
- Type Alias: AIProvider
- Function: batchRerank()
- Class: ChunkerFactory
- Variable: DEFAULT_RATE_LIMIT_CONFIG
- Enumeration: OpenAIModels
- Type Alias: AnalyticsData
- Function: buildObservabilityConfigFromEnv()
- Class: ChunkerRegistry
- Variable: VERSION
- Enumeration: VertexModels
- Type Alias: AuthorizationUrlResult
- Function: calculateExpiresAt()
- Class: CircuitBreakerManager
- Variable: dynamicModelProvider
- Type Alias: Chunk
- Function: chunkText()
- Class: FileTokenStorage
- Variable: globalCircuitBreakerManager
- Type Alias: ChunkMetadata
- Function: createAIProvider()
- Class: GraphRAG
- Variable: globalRateLimiterManager
- Type Alias: ChunkParams
- Function: createAIProviderWithFallback()
- Class: HTTPRateLimiter
- Variable: mcpLogger
- Type Alias: ChunkerConfig
- Function: createBestAIProvider()
- Class: InMemoryBM25Index
- Type Alias: ChunkerMetadata
- Function: createChunker()
- Class: InMemoryTokenStorage
- Type Alias: ChunkingStrategy
- Function: createContextEnricher()
- Class: InMemoryVectorStore
- Type Alias: DiscoveredMcp\<TTools\>
- Function: createContextWindow()
- Class: MCPCircuitBreaker
- Type Alias: DocumentType
- Function: createHybridSearch()
- Class: MDocument
- Type Alias: DynamicModelConfig
- Function: createOAuthProviderFromConfig()
- Class: MiddlewareFactory
- Type Alias: EnhancedProvider
- Function: createReranker()
- Class: NeuroLink
- Type Alias: EvaluationData
- Function: createVectorQueryTool()
- Class: NeuroLinkOAuthProvider
- Type Alias: ExecutionContext\<T\>
- Function: executeMCP()
- Class: RAGPipeline
- Type Alias: ExtractParams
- Function: flushOpenTelemetry()
- Class: RateLimiterManager
- Type Alias: GenerateOptions
- ~~Function: generateText()~~
- Class: RerankerFactory
- Type Alias: GenerateResult
- Function: getAvailableProviders()
- Class: RerankerRegistry
- Type Alias: HTTPRetryConfig
- Function: getAvailableRerankerTypes()
- Type Alias: HybridSearchConfig
- Function: getAvailableStrategies()
- Type Alias: LangfuseConfig
- Function: getBestProvider()
- Type Alias: LangfuseSpanAttributes
- Function: getChunkerMetadata()
- Type Alias: LogLevel
- Function: getLangfuseContext()
- Type Alias: MCPOAuthConfig
- Function: getLangfuseHealthStatus()
- Type Alias: MCPServerInfo
- Function: getLangfuseSpanProcessor()
- Type Alias: MDocumentConfig
- Function: getMCPStats()
- Type Alias: McpMetadata
- Function: getSpanProcessors()
- Type Alias: MiddlewareConfig
- Function: getTelemetryStatus()
- Type Alias: MiddlewareContext
- Function: getTracer()
- Type Alias: MiddlewareFactoryOptions
- Function: getTracerProvider()
- Type Alias: MiddlewarePreset
- Function: initializeMCPEcosystem()
- Type Alias: ModelRegistry
- Function: initializeOpenTelemetry()
- Type Alias: NeuroLinkMiddleware
- Function: initializeTelemetry()
- Type Alias: OAuthClientInformation
- Function: isRetryableHTTPError()
- Type Alias: OAuthTokens
- Function: isRetryableStatusCode()
- Type Alias: ObservabilityConfig
- Function: isTokenExpired()
- Type Alias: OpenTelemetryConfig
- Function: isUsingExternalTracerProvider()
- Type Alias: ProviderAttempt
- Function: isValidProvider()
- ~~Type Alias: RateLimitConfig~~
- Function: linearCombination()
- Type Alias: RerankerConfig
- Function: listMCPs()
- Type Alias: RerankerType
- Function: loadDocument()
- Type Alias: StreamingOptions
- Function: loadDocuments()
- Type Alias: SupportedModelName
- Function: reciprocalRankFusion()
- Type Alias: TextGenerationOptions
- Function: rerank()
- Type Alias: TextGenerationResult
- Function: setLangfuseContext()
- Type Alias: TokenExchangeRequest
- Function: shutdownOpenTelemetry()
- Type Alias: TokenStorage
- Function: simpleRerank()
- Type Alias: ToolContext
- Function: validateTool()
- Type Alias: ToolDefinition\<TArgs, TResult\>
- Function: withHTTPRetry()
- Type Alias: ToolExecutionResult\<T\>
- Type Alias: ToolInfo
- Type Alias: ToolResult\<T\>
- Type Alias: TraceNameFormat
- Type Alias: VectorQueryToolConfig

---

# Introduction

## NeuroLink

<!-- Source: index.md -->

NeuroLink
  The Enterprise AI SDK for Production Applications
  13 Providers | 58+ MCP Tools | HITL Security | Redis Persistence

[[Image: npm version]](https://www.npmjs.com/package/@juspay/neurolink)
[[Image: npm downloads]](https://www.npmjs.com/package/@juspay/neurolink)
[[Image: Build Status]](https://github.com/juspay/neurolink/actions/workflows/ci.yml)
[[Image: Coverage Status]](https://coveralls.io/github/juspay/neurolink?branch=main)
[[Image: License: MIT]](https://opensource.org/licenses/MIT)
[[Image: TypeScript]](https://www.typescriptlang.org/)
[[Image: GitHub Stars]](https://github.com/juspay/neurolink/stargazers)
[[Image: Discord]](https://discord.gg/neurolink)

Enterprise AI development platform with unified provider access, production-ready tooling, and an opinionated factory architecture. NeuroLink ships as both a TypeScript SDK and a professional CLI so teams can build, operate, and iterate on AI features quickly.

##  What is NeuroLink?

**NeuroLink is the universal AI integration platform that unifies 13 major AI providers and 100+ models under one consistent API.**

Extracted from production systems at Juspay and battle-tested at enterprise scale, NeuroLink provides a production-ready solution for integrating AI into any application. Whether you're building with OpenAI, Anthropic, Google, AWS Bedrock, Azure, or any of our 13 supported providers, NeuroLink gives you a single, consistent interface that works everywhere.

**Why NeuroLink?** Switch providers with a single parameter change, leverage 64+ built-in tools and MCP servers, deploy with confidence using enterprise features like Redis memory and multi-provider failover, and optimize costs automatically with intelligent routing. Use it via our professional CLI or TypeScript SDK—whichever fits your workflow.

**Where we're headed:** We're building for the future of AI—edge-first execution and continuous streaming architectures that make AI practically free and universally available. **[Read our vision →](/docs/about/vision)**

**[Get Started in \ [Observability Guide](/docs/observability/health-monitoring)
- **Server Adapters** -- Deploy NeuroLink as an HTTP API server with your framework of choice (Hono, Express, Fastify, Koa). Full CLI support with `serve` and `server` commands for foreground/background modes, route management, and OpenAPI generation. -> [Server Adapters Guide](/docs/guides/server-adapters)
- **Title Generation Events** -- Emit real-time events when conversation titles are auto-generated. Listen to `conversation:titleGenerated` for session tracking. -> [Conversation Memory Guide](/docs/memory/conversation)
- **Custom Title Prompts** -- Customize conversation title generation with `NEUROLINK_TITLE_PROMPT` environment variable. Use `${userMessage}` placeholder for dynamic prompts. -> [Conversation Memory Guide](/docs/memory/conversation)
- **Video Generation** -- Transform images into 8-second videos with synchronized audio using Google Veo 3.1 via Vertex AI. Supports 720p/1080p resolutions, portrait/landscape aspect ratios. -> [Video Generation Guide](/docs/features/video-generation)
- **Image Generation** -- Generate images from text prompts using Gemini models via Vertex AI or Google AI Studio. Supports streaming mode with automatic file saving. -> [Image Generation Guide](/docs/features/image-generation)
- **HTTP/Streamable HTTP Transport for MCP** -- Connect to remote MCP servers via HTTP with authentication headers, retry logic, and rate limiting. -> [HTTP Transport Guide](/docs/mcp/http-transport)
- **Gemini 3 Preview Support** - Full support for gemini-3-flash-preview and gemini-3-pro-preview with extended thinking capabilities
- **Structured Output with Zod Schemas** -- Type-safe JSON generation with automatic validation using `schema` + `output.format: "json"` in `generate()`. -> [Structured Output Guide](/docs/cookbook/structured-output)
- **CSV File Support** -- Attach CSV files to prompts for AI-powered data analysis with auto-detection. -> [CSV Guide](/docs/features/multimodal-chat.md#csv-file-support)
- **PDF File Support** -- Process PDF documents with native visual analysis for Vertex AI, Anthropic, Bedrock, AI Studio. -> [PDF Guide](/docs/features/pdf-support)
- **50+ File Types** -- Process Excel, Word, RTF, JSON, YAML, XML, HTML, SVG, Markdown, and 50+ code languages with intelligent content extraction. -> [File Processors Guide](/docs/features/file-processors)
- **LiteLLM Integration** -- Access 100+ AI models from all major providers through unified interface. -> [Setup Guide](/docs/getting-started/providers/litellm)
- **SageMaker Integration** -- Deploy and use custom trained models on AWS infrastructure. -> [Setup Guide](/docs/getting-started/providers/sagemaker)
- **OpenRouter Integration** -- Access 300+ models from OpenAI, Anthropic, Google, Meta, and more through a single unified API. -> [Setup Guide](/docs/getting-started/providers/openrouter)
- **Human-in-the-loop workflows** -- Pause generation for user approval/input before tool execution. -> [HITL Guide](/docs/features/hitl)
- **Guardrails middleware** -- Block PII, profanity, and unsafe content with built-in filtering. -> [Guardrails Guide](/docs/features/guardrails)
- **Context summarization** -- Automatic conversation compression for long-running sessions. -> [Summarization Guide](/docs/memory/summarization)
- **Redis conversation export** -- Export full session history as JSON for analytics and debugging. -> [History Guide](/docs/features/conversation-history)

```typescript
// Multi-Model Workflow Engine (v8.42.0)

const neurolink = new NeuroLink();

// Run a consensus workflow with multiple models
const result = await neurolink.runConsensusWorkflow({
  prompt: "Explain quantum computing",
  models: [
    { provider: "anthropic", modelId: "claude-3-5-sonnet-20241022" },
    { provider: "openai", modelId: "gpt-4" },
    { provider: "google-ai", modelId: "gemini-2.0-flash-exp" },
  ],
  judgeModel: { provider: "anthropic", modelId: "claude-3-5-sonnet-20241022" },
  options: { temperature: 0.7 },
});

console.log(result.response); // Best response selected by judge
console.log(result.score); // Quality score (0-100)
console.log(result.metrics); // Detailed performance metrics

// Image Generation with Gemini (v8.31.0)
const image = await neurolink.generateImage({
  prompt: "A futuristic cityscape",
  provider: "google-ai",
  model: "imagen-3.0-generate-002",
});

// HTTP Transport for Remote MCP (v8.29.0)
await neurolink.addExternalMCPServer("remote-tools", {
  transport: "http",
  url: "https://mcp.example.com/v1",
  headers: { Authorization: "Bearer token" },
  retries: 3,
  timeout: 15000,
});
```

---

Previous Updates (Q4 2025)

- **Image Generation** – Generate images from text prompts using Gemini models via Vertex AI or Google AI Studio. → [Guide](/docs/features/image-generation)
- **Gemini 3 Preview Support** - Full support for `gemini-3-flash-preview` and `gemini-3-pro-preview` with extended thinking
- **Structured Output with Zod Schemas** – Type-safe JSON generation with automatic validation. → [Guide](/docs/cookbook/structured-output)
- **CSV & PDF File Support** – Attach CSV/PDF files to prompts with auto-detection. → [CSV](/docs/features/multimodal-chat.md#csv-file-support) | [PDF](/docs/features/pdf-support)
- **LiteLLM & SageMaker** – Access 100+ models via LiteLLM, deploy custom models on SageMaker. → [LiteLLM](/docs/getting-started/providers/litellm) | [SageMaker](/docs/getting-started/providers/sagemaker)
- **OpenRouter Integration** – Access 300+ models through a single unified API. → [Guide](/docs/getting-started/providers/openrouter)
- **HITL & Guardrails** – Human-in-the-loop approval workflows and content filtering middleware. → [HITL](/docs/features/hitl) | [Guardrails](/docs/features/guardrails)
- **Redis & Context Management** – Session export, conversation history, and automatic summarization. → [History](/docs/features/conversation-history)

## Enterprise Security: Human-in-the-Loop (HITL)

NeuroLink includes a **production-ready HITL system** for regulated industries and high-stakes AI operations:

| Capability                  | Description                                               | Use Case                                   |
| --------------------------- | --------------------------------------------------------- | ------------------------------------------ |
| **Tool Approval Workflows** | Require human approval before AI executes sensitive tools | Financial transactions, data modifications |
| **Output Validation**       | Route AI outputs through human review pipelines           | Medical diagnosis, legal documents         |
| **Confidence Thresholds**   | Automatically trigger human review below confidence level | Critical business decisions                |
| **Complete Audit Trail**    | Full audit logging for compliance (HIPAA, SOC2, GDPR)     | Regulated industries                       |

```typescript

const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: ["writeFile", "executeCode", "sendEmail"],
    confidenceThreshold: 0.85,
    reviewCallback: async (action, context) => {
      // Custom review logic - integrate with your approval system
      return await yourApprovalSystem.requestReview(action);
    },
  },
});

// AI pauses for human approval before executing sensitive tools
const result = await neurolink.generate({
  input: { text: "Send quarterly report to stakeholders" },
});
```

**[Enterprise HITL Guide](/docs/features/enterprise-hitl)** | **[Quick Start](/docs/features/hitl)**

## Get Started in Two Steps

```bash
# 1. Run the interactive setup wizard (select providers, validate keys)
pnpm dlx @juspay/neurolink setup

# 2. Start generating with automatic provider selection
npx @juspay/neurolink generate "Write a launch plan for multimodal chat"
```

Need a persistent workspace? Launch loop mode with `npx @juspay/neurolink loop` - [Learn more →](/docs/features/cli-loop-sessions)

##  Complete Feature Set

NeuroLink is a comprehensive AI development platform. Every feature below is production-ready and fully documented.

###  AI Provider Integration

**13 providers unified under one API** - Switch providers with a single parameter change.

| Provider              | Models                                             | Free Tier       | Tool Support | Status        | Documentation                                                      |
| --------------------- | -------------------------------------------------- | --------------- | ------------ | ------------- | ------------------------------------------------------------------ |
| **OpenAI**            | GPT-4o, GPT-4o-mini, o1                            | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#openai)            |
| **Anthropic**         | Claude 4.5 Opus/Sonnet/Haiku, Claude 4 Opus/Sonnet | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#anthropic)         |
| **Google AI Studio**  | Gemini 3 Flash/Pro, Gemini 2.5 Flash/Pro           | ✅ Free Tier    | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#google-ai)         |
| **AWS Bedrock**       | Claude, Titan, Llama, Nova                         | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#bedrock)           |
| **Google Vertex**     | Gemini 3/2.5 (gemini-3-\*-preview)                 | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#vertex)            |
| **Azure OpenAI**      | GPT-4, GPT-4o, o1                                  | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#azure)             |
| **LiteLLM**           | 100+ models unified                                | Varies          | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/providers/litellm)                              |
| **AWS SageMaker**     | Custom deployed models                             | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/providers/sagemaker)                            |
| **Mistral AI**        | Mistral Large, Small                               | ✅ Free Tier    | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#mistral)           |
| **Hugging Face**      | 100,000+ models                                    | ✅ Free         | ⚠️ Partial   | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#huggingface)       |
| **Ollama**            | Local models (Llama, Mistral)                      | ✅ Free (Local) | ⚠️ Partial   | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#ollama)            |
| **OpenAI Compatible** | Any OpenAI-compatible endpoint                     | Varies          | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#openai-compatible) |
| **OpenRouter**        | 200+ Models via OpenRouter                         | Varies          | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/providers/openrouter)             |

**[ Provider Comparison Guide](/docs/reference/provider-comparison)** - Detailed feature matrix and selection criteria
**[ Provider Feature Compatibility](/docs/reference/provider-feature-compatibility)** - Test-based compatibility reference for all 19 features across 13 providers

---

###  Built-in Tools & MCP Integration

**6 Core Tools** (work across all providers, zero configuration):

| Tool                 | Purpose                  | Auto-Available          | Documentation                         |
| -------------------- | ------------------------ | ----------------------- | ------------------------------------- |
| `getCurrentTime`     | Real-time clock access   | ✅                      | [Tool Reference](/docs/sdk/custom-tools) |
| `readFile`           | File system reading      | ✅                      | [Tool Reference](/docs/sdk/custom-tools) |
| `writeFile`          | File system writing      | ✅                      | [Tool Reference](/docs/sdk/custom-tools) |
| `listDirectory`      | Directory listing        | ✅                      | [Tool Reference](/docs/sdk/custom-tools) |
| `calculateMath`      | Mathematical operations  | ✅                      | [Tool Reference](/docs/sdk/custom-tools) |
| `websearchGrounding` | Google Vertex web search | ⚠️ Requires credentials | [Tool Reference](/docs/sdk/custom-tools) |

**58+ External MCP Servers** supported (GitHub, PostgreSQL, Google Drive, Slack, and more):

```typescript
// stdio transport - local MCP servers via command execution
await neurolink.addExternalMCPServer("github", {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-github"],
  transport: "stdio",
  env: { GITHUB_TOKEN: process.env.GITHUB_TOKEN },
});

// HTTP transport - remote MCP servers via URL
await neurolink.addExternalMCPServer("github-copilot", {
  transport: "http",
  url: "https://api.githubcopilot.com/mcp",
  headers: { Authorization: "Bearer YOUR_COPILOT_TOKEN" },
  timeout: 15000,
  retries: 5,
});

// Tools automatically available to AI
const result = await neurolink.generate({
  input: { text: 'Create a GitHub issue titled "Bug in auth flow"' },
});
```

**MCP Transport Options:**

| Transport   | Use Case       | Key Features                                    |
| ----------- | -------------- | ----------------------------------------------- |
| `stdio`     | Local servers  | Command execution, environment variables        |
| `http`      | Remote servers | URL-based, auth headers, retries, rate limiting |
| `sse`       | Event streams  | Server-Sent Events, real-time updates           |
| `websocket` | Bi-directional | Full-duplex communication                       |

**[ MCP Integration Guide](/docs/mcp/integration)** - Setup external servers
**[ HTTP Transport Guide](/docs/mcp/http-transport)** - Remote MCP server configuration

---

###  Developer Experience Features

**SDK-First Design** with TypeScript, IntelliSense, and type safety:

| Feature                     | Description                                                   | Documentation                                        |
| --------------------------- | ------------------------------------------------------------- | ---------------------------------------------------- |
| **Auto Provider Selection** | Intelligent provider fallback                                 | [SDK Guide](/docs/sdk/index.md#auto-selection)             |
| **Streaming Responses**     | Real-time token streaming                                     | [Streaming Guide](/docs/advanced/streaming)             |
| **Conversation Memory**     | Automatic context management                                  | [Memory Guide](/docs/sdk/index.md#memory)                  |
| **Full Type Safety**        | Complete TypeScript types                                     | [Type Reference](/docs/sdk/api-reference)               |
| **Error Handling**          | Graceful provider fallback                                    | [Error Guide](/docs/reference/troubleshooting)          |
| **Analytics & Evaluation**  | Usage tracking, quality scores                                | [Analytics Guide](/docs/reference/analytics)             |
| **Middleware System**       | Request/response hooks                                        | [Middleware Guide](/docs/workflows/custom-middleware)       |
| **Framework Integration**   | Next.js, SvelteKit, Express                                   | [Framework Guides](/docs/sdk/framework-integration)     |
| **Extended Thinking**       | Native thinking/reasoning mode for Gemini 3 and Claude models | [Thinking Guide](/docs/features/thinking-configuration) |

---

###  Multimodal & File Processing

**17+ file categories supported** (50+ total file types including code languages) with intelligent content extraction and provider-agnostic processing:

| Category      | Supported Types                                            | Processing                          |
| ------------- | ---------------------------------------------------------- | ----------------------------------- |
| **Documents** | Excel (`.xlsx`, `.xls`), Word (`.docx`), RTF, OpenDocument | Sheet extraction, text extraction   |
| **Data**      | JSON, YAML, XML                                            | Validation, syntax highlighting     |
| **Markup**    | HTML, SVG, Markdown, Text                                  | OWASP-compliant sanitization        |
| **Code**      | 50+ languages (TypeScript, Python, Java, Go, etc.)         | Language detection, syntax metadata |
| **Config**    | `.env`, `.ini`, `.toml`, `.cfg`                            | Secure parsing                      |
| **Media**     | Images (PNG, JPEG, WebP, GIF), PDFs, CSV                   | Provider-specific formatting        |

```typescript
// Process any supported file type
const result = await neurolink.generate({
  input: {
    text: "Analyze this data and code",
    files: [
      "./data.xlsx", // Excel spreadsheet
      "./config.yaml", // YAML configuration
      "./diagram.svg", // SVG (injected as sanitized text)
      "./main.py", // Python source code
    ],
  },
});

// CLI: Use --file for any supported type
// neurolink generate "Analyze this" --file ./report.xlsx --file ./config.json
```

**Key Features:**

- **ProcessorRegistry** - Priority-based processor selection with fallback
- **OWASP Security** - HTML/SVG sanitization prevents XSS attacks
- **Auto-detection** - FileDetector identifies file types by extension and content
- **Provider-agnostic** - All processors work across all 13 AI providers

**[ File Processors Guide](/docs/features/file-processors)** - Complete reference for all file types

---

###  Enterprise & Production Features

**Production-ready capabilities for regulated industries:**

| Feature                     | Description                        | Use Case                  | Documentation                                          |
| --------------------------- | ---------------------------------- | ------------------------- | ------------------------------------------------------ |
| **Enterprise Proxy**        | Corporate proxy support            | Behind firewalls          | [Proxy Setup](/docs/deployment/enterprise-proxy)               |
| **Redis Memory**            | Distributed conversation state     | Multi-instance deployment | [Redis Guide](/docs/getting-started/provider-setup.md#redis) |
| **Cost Optimization**       | Automatic cheapest model selection | Budget control            | [Cost Guide](/docs/)                        |
| **Multi-Provider Failover** | Automatic provider switching       | High availability         | [Failover Guide](/docs/)                    |
| **Telemetry & Monitoring**  | OpenTelemetry integration          | Observability             | [Telemetry Guide](/docs/observability/telemetry)                  |
| **Security Hardening**      | Credential management, auditing    | Compliance                | [Security Guide](/docs/guides/enterprise)               |
| **Custom Model Hosting**    | SageMaker integration              | Private models            | [SageMaker Guide](/docs/getting-started/providers/sagemaker)            |
| **Load Balancing**          | LiteLLM proxy integration          | Scale & routing           | [Load Balancing](/docs/getting-started/providers/litellm)               |

**Security & Compliance:**

- ✅ SOC2 Type II compliant deployments
- ✅ ISO 27001 certified infrastructure compatible
- ✅ GDPR-compliant data handling (EU providers available)
- ✅ HIPAA compatible (with proper configuration)
- ✅ Hardened OS verified (SELinux, AppArmor)
- ✅ Zero credential logging
- ✅ Encrypted configuration storage
- ✅ Automatic context window management with 4-stage compaction pipeline and 80% budget gate

**[ Enterprise Deployment Guide](/docs/guides/enterprise)** - Complete production checklist

---

## Enterprise Persistence: Redis Memory

Production-ready distributed conversation state for multi-instance deployments:

### Capabilities

| Feature                | Description                                  | Benefit                     |
| ---------------------- | -------------------------------------------- | --------------------------- |
| **Distributed Memory** | Share conversation context across instances  | Horizontal scaling          |
| **Session Export**     | Export full history as JSON                  | Analytics, debugging, audit |
| **Auto-Detection**     | Automatic Redis discovery from environment   | Zero-config in containers   |
| **Graceful Failover**  | Falls back to in-memory if Redis unavailable | High availability           |
| **TTL Management**     | Configurable session expiration              | Memory management           |

### Quick Setup

```typescript

// Auto-detect Redis from REDIS_URL environment variable
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis", // Automatically uses REDIS_URL
    ttl: 86400, // 24-hour session expiration
  },
});

// Or explicit configuration
const neurolinkExplicit = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis",
    redis: {
      host: "redis.example.com",
      port: 6379,
      password: process.env.REDIS_PASSWORD,
      tls: true, // Enable for production
    },
  },
});

// Export conversation for analytics
const history = await neurolink.exportConversation({ format: "json" });
await saveToDataWarehouse(history);
```

### Docker Quick Start

```bash
# Start Redis
docker run -d --name neurolink-redis -p 6379:6379 redis:7-alpine

# Configure NeuroLink
export REDIS_URL=redis://localhost:6379

# Start your application
node your-app.js
```

**[Redis Setup Guide](/docs/getting-started/redis-quickstart)** | **[Production Configuration](/docs/guides/redis-configuration)** | **[Migration Patterns](/docs/guides/redis-migration)**

---

###  Professional CLI

**15+ commands** for every workflow:

| Command          | Purpose                              | Example                    | Documentation                               |
| ---------------- | ------------------------------------ | -------------------------- | ------------------------------------------- |
| `setup`          | Interactive provider configuration   | `neurolink setup`          | [Setup Guide](/docs/)                 |
| `generate`       | Text generation                      | `neurolink gen "Hello"`    | [Generate](/docs/cli/commands.md#generate)        |
| `stream`         | Streaming generation                 | `neurolink stream "Story"` | [Stream](/docs/cli/commands.md#stream)            |
| `status`         | Provider health check                | `neurolink status`         | [Status](/docs/cli/commands.md#status)            |
| `loop`           | Interactive session                  | `neurolink loop`           | [Loop](/docs/cli/commands.md#loop)                |
| `mcp`            | MCP server management                | `neurolink mcp discover`   | [MCP CLI](/docs/cli/commands.md#mcp)              |
| `models`         | Model listing                        | `neurolink models`         | [Models](/docs/cli/commands.md#models)            |
| `eval`           | Model evaluation                     | `neurolink eval`           | [Eval](/docs/cli/commands.md#eval)                |
| `serve`          | Start HTTP server in foreground mode | `neurolink serve`          | [Serve](/docs/cli/commands.md#serve)              |
| `server start`   | Start HTTP server in background mode | `neurolink server start`   | [Server](/docs/cli/commands.md#server-subcommand) |
| `server stop`    | Stop running background server       | `neurolink server stop`    | [Server](/docs/cli/commands.md#server-subcommand) |
| `server status`  | Show server status information       | `neurolink server status`  | [Server](/docs/cli/commands.md#server-subcommand) |
| `server routes`  | List all registered API routes       | `neurolink server routes`  | [Server](/docs/cli/commands.md#server-subcommand) |
| `server config`  | View or modify server configuration  | `neurolink server config`  | [Server](/docs/cli/commands.md#server-subcommand) |
| `server openapi` | Generate OpenAPI specification       | `neurolink server openapi` | [Server](/docs/cli/commands.md#server-subcommand) |

**[ Complete CLI Reference](/docs/cli/commands)** - All commands and options

---

###  GitHub Action

Run AI-powered workflows directly in GitHub Actions with 13 provider support and automatic PR/issue commenting.

```yaml
- uses: juspay/neurolink@v1
  with:
    anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    prompt: "Review this PR for security issues and code quality"
    post_comment: true
```

| Feature                | Description                                                                               |
| ---------------------- | ----------------------------------------------------------------------------------------- |
| **Multi-Provider**     | 13 providers with unified interface                                                       |
| **PR/Issue Comments**  | Auto-post AI responses with intelligent updates                                           |
| **Multimodal Support** | Attach images, PDFs, CSVs, Excel, Word, JSON, YAML, XML, HTML, SVG, code files to prompts |
| **Cost Tracking**      | Built-in analytics and quality evaluation                                                 |
| **Extended Thinking**  | Deep reasoning with thinking tokens                                                       |

**[ GitHub Action Guide](/docs/guides/github-action)** - Complete setup and examples

---

##  Smart Model Selection

NeuroLink features intelligent model selection and cost optimization:

### Cost Optimization Features

- ** Automatic Cost Optimization**: Selects cheapest models for simple tasks
- ** LiteLLM Model Routing**: Access 100+ models with automatic load balancing
- ** Capability-Based Selection**: Find models with specific features (vision, function calling)
- **⚡ Intelligent Fallback**: Seamless switching when providers fail

```bash
# Cost optimization - automatically use cheapest model
npx @juspay/neurolink generate "Hello" --optimize-cost

# LiteLLM specific model selection
npx @juspay/neurolink generate "Complex analysis" --provider litellm --model "anthropic/claude-3-5-sonnet"

# Auto-select best available provider
npx @juspay/neurolink generate "Write code" # Automatically chooses optimal provider
```

## Revolutionary Interactive CLI

NeuroLink's CLI goes beyond simple commands - it's a **full AI development environment**:

### Why Interactive Mode Changes Everything

| Feature       | Traditional CLI   | NeuroLink Interactive          |
| ------------- | ----------------- | ------------------------------ |
| Session State | None              | Full persistence               |
| Memory        | Per-command       | Conversation-aware             |
| Configuration | Flags per command | `/set` persists across session |
| Tool Testing  | Manual per tool   | Live discovery & testing       |
| Streaming     | Optional          | Real-time default              |

### Live Demo: Development Session

```bash
$ npx @juspay/neurolink loop --enable-conversation-memory

neurolink > /set provider vertex
✓ provider set to vertex (Gemini 3 support enabled)

neurolink > /set model gemini-3-flash-preview
✓ model set to gemini-3-flash-preview

neurolink > Analyze my project architecture and suggest improvements

✓ Analyzing your project structure...
[AI provides detailed analysis, remembering context]

neurolink > Now implement the first suggestion
[AI remembers previous context and implements suggestion]

neurolink > /mcp discover
✓ Discovered 58 MCP tools:
   GitHub: create_issue, list_repos, create_pr...
   PostgreSQL: query, insert, update...
   [full list]

neurolink > Use the GitHub tool to create an issue for this improvement
✓ Creating issue... (requires HITL approval if configured)

neurolink > /export json > session-2026-01-01.json
✓ Exported 15 messages to session-2026-01-01.json

neurolink > exit
Session saved. Resume with: neurolink loop --session session-2026-01-01.json
```

### Session Commands Reference

| Command              | Purpose                                              |
| -------------------- | ---------------------------------------------------- |
| `/set  ` | Persist configuration (provider, model, temperature) |
| `/mcp discover`      | List all available MCP tools                         |
| `/export json`       | Export conversation to JSON                          |
| `/history`           | View conversation history                            |
| `/clear`             | Clear context while keeping settings                 |

**[Interactive CLI Guide](/docs/cli)** | **[CLI Reference](/docs/cli/commands)**

Skip the wizard and configure manually? See [`docs/getting-started/provider-setup.md`](/docs/getting-started/provider-setup).

## CLI & SDK Essentials

`neurolink` CLI mirrors the SDK so teams can script experiments and codify them later.

```bash
# Discover available providers and models
npx @juspay/neurolink status
npx @juspay/neurolink models list --provider google-ai

# Route to a specific provider/model
npx @juspay/neurolink generate "Summarize customer feedback" \
  --provider azure --model gpt-4o-mini

# Turn on analytics + evaluation for observability
npx @juspay/neurolink generate "Draft release notes" \
  --enable-analytics --enable-evaluation --format json
```

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis",
  },
  enableOrchestration: true,
});

const result = await neurolink.generate({
  input: {
    text: "Create a comprehensive analysis",
    files: [
      "./sales_data.csv", // Auto-detected as CSV
      "examples/data/invoice.pdf", // Auto-detected as PDF
      "./diagrams/architecture.png", // Auto-detected as image
      "./report.xlsx", // Auto-detected as Excel
      "./config.json", // Auto-detected as JSON
      "./diagram.svg", // Auto-detected as SVG (injected as text)
      "./app.ts", // Auto-detected as TypeScript code
    ],
  },
  provider: "vertex", // PDF-capable provider (see docs/features/pdf-support.md)
  enableEvaluation: true,
  region: "us-east-1",
});

console.log(result.content);
console.log(result.evaluation?.overallScore);
```

### Gemini 3 with Extended Thinking

```typescript

const neurolink = new NeuroLink();

// Use Gemini 3 with extended thinking for complex reasoning
const result = await neurolink.generate({
  input: {
    text: "Solve this step by step: What is the optimal strategy for...",
  },
  provider: "vertex",
  model: "gemini-3-flash-preview",
  thinkingLevel: "medium", // Options: "minimal", "low", "medium", "high"
});

console.log(result.content);
```

Full command and API breakdown lives in [`docs/cli/commands.md`](/docs/cli/commands) and [`docs/sdk/api-reference.md`](/docs/sdk/api-reference).

## Platform Capabilities at a Glance

| Capability               | Highlights                                                                                                               |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------ |
| **Provider unification** | 13+ providers with automatic fallback, cost-aware routing, provider orchestration (Q3).                                  |
| **Multimodal pipeline**  | Stream images + CSV data + PDF documents across providers with local/remote assets. Auto-detection for mixed file types. |
| **Quality & governance** | Auto-evaluation engine (Q3), guardrails middleware (Q4), HITL workflows (Q4), audit logging.                             |
| **Memory & context**     | Conversation memory, Mem0 integration, Redis history export (Q4), context summarization (Q4).                            |
| **CLI tooling**          | Loop sessions (Q3), setup wizard, config validation, Redis auto-detect, JSON output.                                     |
| **Enterprise ops**       | Proxy support, regional routing (Q3), telemetry hooks, configuration management.                                         |
| **Tool ecosystem**       | MCP auto discovery, HTTP/stdio/SSE/WebSocket transports, LiteLLM hub access, SageMaker custom deployment, web search.    |

## Documentation Map

| Area            | When to Use                                           | Link                                                        |
| --------------- | ----------------------------------------------------- | ----------------------------------------------------------- |
| Getting started | Install, configure, run first prompt                  | [`docs/getting-started/index.md`](/docs/) |
| Feature guides  | Understand new functionality front-to-back            | [`docs/features/index.md`](/docs/)               |
| CLI reference   | Command syntax, flags, loop sessions                  | [`docs/cli/index.md`](/docs/)                         |
| SDK reference   | Classes, methods, options                             | [`docs/sdk/index.md`](/docs/)                         |
| Integrations    | LiteLLM, SageMaker, MCP, Mem0                         | [`docs/litellm-integration.md`](/docs/getting-started/providers/litellm)     |
| Advanced        | Middleware, architecture, streaming patterns          | [`docs/advanced/index.md`](/docs/)               |
| Cookbook        | Practical recipes for common patterns                 | [`docs/cookbook/index.md`](/docs/)               |
| Guides          | Migration, Redis, troubleshooting, provider selection | [`docs/guides/index.md`](/docs/)                   |
| Operations      | Configuration, troubleshooting, provider matrix       | [`docs/reference/index.md`](/docs/)             |

### New in 2026: Enhanced Documentation

**Enterprise Features:**

- [Enterprise HITL Guide](/docs/features/enterprise-hitl) - Production-ready approval workflows
- [Interactive CLI Guide](/docs/cli) - AI development environment
- [MCP Tools Showcase](/docs/features/mcp-tools-showcase) - 58+ external tools & 6 built-in tools

**Provider Intelligence:**

- [Provider Capabilities Audit](/docs/reference/provider-capabilities-audit) - Technical capabilities matrix
- [Provider Selection Guide](/docs/reference/provider-selection) - Interactive decision wizard
- [Provider Comparison](/docs/reference/provider-comparison) - Feature & cost comparison

**Middleware System:**

- [Middleware Architecture](/docs/advanced/middleware-architecture) - Complete lifecycle & patterns
- [Built-in Middleware](/docs/advanced/builtin-middleware) - Analytics, Guardrails, Evaluation
- [Custom Middleware Guide](/docs/workflows/custom-middleware) - Build your own

**Redis & Persistence:**

- [Redis Quick Start](/docs/getting-started/redis-quickstart) - 5-minute setup
- [Redis Configuration](/docs/guides/redis-configuration) - Production-ready setup
- [Redis Migration](/docs/guides/redis-migration) - Migration patterns

**Migration Guides:**

- [From LangChain](/docs/guides/migration/from-langchain) - Complete migration guide
- [From Vercel AI SDK](/docs/guides/migration/from-vercel-ai-sdk) - Next.js focused

**Developer Experience:**

- [Cookbook](/docs/) - 10 practical recipes
- [Troubleshooting Guide](/docs/reference/troubleshooting) - Common issues & solutions

## Integrations

- **LiteLLM 100+ model hub** – Unified access to third-party models via LiteLLM routing. → [`docs/litellm-integration.md`](/docs/getting-started/providers/litellm)
- **Amazon SageMaker** – Deploy and call custom endpoints directly from NeuroLink CLI/SDK. → [`docs/sagemaker-integration.md`](/docs/getting-started/providers/sagemaker)
- **Mem0 conversational memory** – Persistent semantic memory with vector store support. → [`docs/mem0-integration.md`](/docs/memory/mem0)
- **Enterprise proxy & security** – Configure outbound policies and compliance posture. → [`docs/enterprise-proxy-setup.md`](/docs/deployment/enterprise-proxy)
- **Configuration automation** – Manage environments, regions, and credentials safely. → [`docs/configuration-management.md`](/docs/deployment/configuration-management)
- **MCP tool ecosystem** – Auto-discover Model Context Protocol tools and extend workflows. → [`docs/advanced/mcp-integration.md`](/docs/mcp/integration)
- **Remote MCP via HTTP** – Connect to HTTP-based MCP servers with authentication, retries, and rate limiting. → [`docs/mcp-http-transport.md`](/docs/mcp/http-transport)

## Contributing & Support

- Bug reports and feature requests → [GitHub Issues](https://github.com/juspay/neurolink/issues)
- Development workflow, testing, and pull request guidelines → [`docs/development/contributing.md`](/docs/community/contributing)
- Documentation improvements → open a PR referencing the documentation matrix.

---

NeuroLink is built with ❤️ by Juspay. Contributions, questions, and production feedback are always welcome.

---

## Page Not Found

<!-- Source: 404.md -->

# Page Not Found

- ⚠️ **404 - Page Not Found**

  The page you're looking for doesn't exist or has been moved.

##  What you can do:

##  Popular Pages

-  **[Getting Started](/docs/)**

  Quick setup and installation guides

-  **[CLI Guide](/docs/)**

  Command-line interface documentation

-  **[SDK Reference](/docs/)**

  API documentation and examples

- ⭐ **[Examples](/docs/)**

  Practical usage examples

## 🆘 Need Help?

If you think this is an error, please:

1. Check our [Troubleshooting Guide](/docs/reference/troubleshooting)
2. Search our [FAQ](/docs/reference/faq)
3. Report the issue on [GitHub](https://github.com/juspay/neurolink/issues)

---

[← Back to Home](/docs/)

---

## AI Analysis Tools

<!-- Source: ai-analysis-tools.md -->

#  AI Analysis Tools

**NeuroLink** features **3 specialized AI Analysis Tools** for AI optimization and workflow enhancement. These tools work seamlessly behind our factory method interface, providing enterprise-grade AI analysis capabilities.

##  Production Status

**Production Ready: 20/20 Tests Passing (100% Success Rate)**

- ✅ **3 AI Analysis Tools Implemented**: Complete AI optimization and analysis capabilities
- ✅ **Enterprise Integration**: Professional web interface with full API endpoints
- ✅ **Performance Validated**: All tools execute under 1ms individually, 7 seconds total for full suite
- ✅ **Production Infrastructure**: Rich context, permissions, error handling, comprehensive validation

##  Available Tools

### 1. AI Usage Analysis - `analyzeAIUsage()`

Analyze AI usage patterns, token consumption, and cost optimization across all providers.

```typescript
const analysis = await provider.analyzeAIUsage({
  timeframe: "last-24-hours",
  providers: ["openai", "bedrock", "vertex", "google-ai"],
  includeOptimizations: true,
});

console.log(analysis.tokenUsage); // Token consumption patterns
console.log(analysis.costBreakdown); // Cost analysis by provider
console.log(analysis.recommendations); // Optimization suggestions
```

**Features:**

- **Token Usage Analytics**: Detailed breakdown by provider and time period
- **Cost Optimization**: Identify most cost-effective providers for your workload
- **Usage Patterns**: Detect peak usage times and optimization opportunities
- **Provider Comparison**: Side-by-side cost and performance analysis

### 2. Provider Performance Benchmarking - `benchmarkProviders()`

Advanced benchmarking with latency, quality, and cost metrics across all AI providers.

```typescript
const benchmark = await provider.benchmarkProviders({
  iterations: 3,
  testPrompts: ["balanced", "creative", "technical"],
  includeQualityMetrics: true,
});

console.log(benchmark.latencyResults); // Response time comparisons
console.log(benchmark.qualityScores); // Content quality analysis
console.log(benchmark.costEfficiency); // Cost per token analysis
```

**Features:**

- **Latency Testing**: Measure real response times across providers
- **Quality Assessment**: Evaluate output quality for different prompt types
- **Cost Efficiency**: Calculate cost per token and value metrics
- **Provider Rankings**: Automatic ranking by performance criteria

### 3. Prompt Parameter Optimization - `optimizePrompt()`

Optimize prompt parameters (temperature, max tokens, style) for better output quality.

```typescript
const optimization = await provider.optimizePrompt({
  prompt: "Write a professional email explaining AI benefits",
  style: "balanced",
  optimizeFor: "quality",
  includeAlternatives: true,
});

console.log(optimization.optimizedParameters); // Temperature, max tokens, etc.
console.log(optimization.expectedImprovement); // Quality enhancement predictions
console.log(optimization.alternatives); // Alternative parameter sets
```

**Features:**

- **Parameter Tuning**: Automatic optimization of temperature, max tokens, style
- **Quality Prediction**: Estimate quality improvements from parameter changes
- **Alternative Suggestions**: Multiple parameter sets for different use cases
- **Style Optimization**: Adjust parameters for specific writing styles

##  Business Benefits

### Cost Optimization

- **Provider Cost Analysis**: Identify most cost-effective providers for your workload
- **Usage Pattern Insights**: Detect opportunities to reduce token consumption
- **Budget Planning**: Predict costs based on historical usage patterns

### Performance Enhancement

- **Real-time Benchmarking**: Continuous performance monitoring across providers
- **Quality Metrics**: Measure and improve output quality over time
- **Latency Optimization**: Choose fastest providers for time-sensitive applications

### Parameter Intelligence

- **Automated Tuning**: Remove guesswork from prompt parameter selection
- **Quality Prediction**: Understand impact of parameter changes before implementation
- **Style Adaptation**: Optimize parameters for different content types

##  Interactive Web Interface

All AI Analysis Tools are available through our unified demo application with professional UI:

```bash
cd neurolink-demo && node server.js
# Visit http://localhost:9876 to see AI Analysis Tools in action
```

### Features

- ✅ **Real-time Analysis**: Interactive forms for all 3 analysis tools
- ✅ **API Endpoints**: Full REST API at `/api/ai/analyze-usage`, `/api/ai/benchmark-performance`, `/api/ai/optimize-parameters`
- ✅ **JSON Results**: Comprehensive analysis results with visual feedback
- ✅ **Simulation Mode**: Fallback to realistic simulated responses for demonstration

### API Endpoints

#### Analyze AI Usage

```bash
POST /api/ai/analyze-usage
Content-Type: application/json

{
  "timeframe": "last-24-hours",
  "providers": ["openai", "vertex", "google-ai"],
  "includeOptimizations": true
}
```

#### Benchmark Performance

```bash
POST /api/ai/benchmark-performance
Content-Type: application/json

{
  "iterations": 3,
  "testPrompts": ["balanced", "creative"],
  "includeQualityMetrics": true
}
```

#### Optimize Parameters

```bash
POST /api/ai/optimize-parameters
Content-Type: application/json

{
  "prompt": "Write a technical blog post",
  "style": "professional",
  "optimizeFor": "quality"
}
```

##  Visual Documentation

### Screenshots

- **AI Usage Analysis Interface**: Interactive form with real-time token analysis
- **Performance Benchmarking**: Provider comparison with latency and quality metrics
- **Parameter Optimization**: Prompt tuning interface with multiple suggestions

### Demo Videos

All analysis tools are demonstrated in our comprehensive demo videos:

- **[Visual Demos](/docs/)** - Real-time analysis and optimization demonstrations

##  Technical Implementation

### MCP Integration

AI Analysis Tools are implemented as MCP (Model Context Protocol) tools that work internally behind our factory methods:

```typescript
// Internal MCP tool execution (transparent to users)
const mcpTools = [
  "analyze-ai-usage",
  "benchmark-provider-performance",
  "optimize-prompt-parameters",
];
```

### Error Handling

- **Graceful Fallback**: Tools fall back to simulation mode if AI providers unavailable
- **Comprehensive Validation**: Input validation and error reporting
- **Production Logging**: Detailed logging for debugging and monitoring

### Performance Metrics

- **Tool Execution**: Individual tools execute under 1ms
- **Suite Execution**: Complete analysis suite runs in ~7 seconds
- **API Response**: REST endpoints respond within 2-5 seconds
- **Error Recovery**: Automatic fallback to simulation mode on provider failures

##  Getting Started

1. **Install NeuroLink**: `npm install @juspay/neurolink`
2. **Set up providers**: Configure at least one AI provider (see [Provider Configuration](/docs/getting-started/provider-setup)) (now with authentication and model availability checks)
3. **Try the tools**: Use factory methods or visit the demo application
4. **Integrate APIs**: Use REST endpoints for web applications

##  Related Documentation

- **[Main README](/docs/)** - Project overview and quick start
- **[AI Workflow Tools](/docs/ai-workflow-tools)** - Development lifecycle tools
- **[MCP Foundation](/docs/mcp/overview)** - Technical architecture details
- **[API Reference](/docs/sdk/api-reference)** - Complete TypeScript API
- **[Visual Demos](/docs/visual-demos)** - Screenshots and videos

---

**Enterprise AI Analysis** - Transform your AI development workflow with data-driven insights and optimization recommendations.

---

## NeuroLink AI Enhancements - Complete Documentation

<!-- Source: ai-enhancements.md -->

#  NeuroLink AI Enhancements - Complete Documentation

## Overview

NeuroLink v3.1.0 introduces 6 powerful AI enhancement features that transform it from a basic AI SDK into a comprehensive AI development platform with quality monitoring and analytics capabilities.

## 🆕 New Features

### 1. Response Quality Evaluation ⭐

AI-powered quality scoring using fast, cost-effective models to evaluate response quality on multiple dimensions.

**Metrics:**

- **Relevance** (1-10): How well the response addresses the prompt
- **Accuracy** (1-10): Factual correctness of the information
- **Completeness** (1-10): Whether the response fully answers the question
- **Overall** (1-10): Combined quality assessment

**Configuration:**

```bash
# Optional environment variables
NEUROLINK_EVALUATION_MODEL=gemini-2.5-flash
NEUROLINK_EVALUATION_PROVIDER=google-ai
```

### 2. Usage Analytics

Comprehensive tracking of AI usage patterns, costs, and performance metrics.

**Metrics Captured:**

- Token usage (input, output, total)
- Estimated costs (based on provider pricing)
- Response time
- Provider and model used
- Custom context data
- Timestamp

**Supported Cost Estimation:**

- OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5 Turbo)
- Anthropic (Claude 3 Opus, Sonnet, Haiku)
- Google AI (Gemini Pro, Gemini 2.5 Flash)

### 3. Generic Context Flow

Pass custom context objects through the entire AI request lifecycle for domain-specific tracking and analytics.

**Use Cases:**

- User identification (`userId`, `sessionId`)
- Domain-specific metadata (`department`, `project`)
- Request categorization (`priority`, `type`)
- Custom business logic data

### 4. Quality Monitoring

Analytics and evaluation data returned in response objects for user-controlled alerting and monitoring.

**No External Dependencies:**

- All data stays within NeuroLink ecosystem
- Users control what to do with the data
- No forced external endpoints or webhooks

## ️ SDK Usage

### Basic Usage with Analytics

```typescript

const sdk = new NeuroLink();

const result = await sdk.generate({
  input: { text: "Explain artificial intelligence in simple terms" },
  provider: "openai",
  enableAnalytics: true, // 🆕 NEW: Track usage and costs
  context: {
    // 🆕 NEW: Custom context
    userId: "user-123",
    department: "engineering",
    requestType: "explanation",
  },
});

console.log(result.content); // AI response
console.log(result.analytics); // Usage metrics
// {
//   provider: 'openai',
//   model: 'gpt-4o',
//   tokens: { input: 15, output: 150, total: 165 },
//   cost: 0.00495,  // Estimated cost in USD
//   responseTime: 2340,
//   timestamp: '2025-01-15T10:30:00.000Z',
//   context: { userId: 'user-123', department: 'engineering', requestType: 'explanation' }
// }
```

### Usage with Quality Evaluation

```typescript
const result = await sdk.generate({
  input: { text: "Write a technical explanation of machine learning" },
  provider: "google-ai",
  enableEvaluation: true, // 🆕 NEW: AI quality scoring
  context: {
    domain: "technology",
    audience: "technical",
    expectedLength: "detailed",
  },
});

console.log(result.evaluation);
// {
//   relevanceScore: 9,
//   accuracyScore: 8,
//   completenessScore: 9,
//   overallScore: 8.7,
//   evaluationModel: 'gemini-2.5-flash',
//   evaluationTime: 1200
// }
```

### Combined Analytics and Evaluation

```typescript
const result = await sdk.generate({
  input: { text: "Generate a product description for AI software" },
  enableAnalytics: true, // Track usage and costs
  enableEvaluation: true, // Score response quality
  context: {
    productId: "ai-toolkit-v2",
    userId: "marketing-001",
    campaign: "product-launch-2025",
  },
});

// Access all enhancement data
const { content, analytics, evaluation } = result;

// Custom monitoring logic
if (evaluation.overallScore  0.1) {
  console.warn("High cost request detected");
}

// Send to your monitoring system
sendToMonitoring({
  requestId: analytics.context.productId,
  quality: evaluation.overallScore,
  cost: analytics.cost,
  responseTime: analytics.responseTime,
});
```

## ️ CLI Usage

### Analytics Tracking

```bash
# Enable analytics with debug output
npx @juspay/neurolink generate "Explain quantum computing" \
  --enable-analytics \
  --debug

# Output includes:
# - AI response text
# - Token usage details
# - Estimated costs
# - Response time
# - Provider information
```

### Quality Evaluation

```bash
# Enable response quality scoring
npx @juspay/neurolink generate "Write a business proposal" \
  --enable-evaluation \
  --debug

# Output includes:
# - AI response text
# - Quality scores (relevance, accuracy, completeness, overall)
# - Evaluation model used
# - Evaluation time
```

### Custom Context

```bash
# Pass custom context data
npx @juspay/neurolink generate "Help with customer issue" \
  --context '{"userId":"support-001","priority":"high","department":"customer-service"}' \
  --enable-analytics \
  --debug

# Context appears in analytics data for tracking
```

### All Features Combined

```bash
# Use all enhancement features together
npx @juspay/neurolink generate "Generate marketing copy for AI product" \
  --enable-analytics \
  --enable-evaluation \
  --context '{"campaign":"q1-2025","target":"developers","budget":"high"}' \
  --provider openai \
  --temperature 0.8 \
  --debug
```

### 5. Universal Evaluation System

Enterprise-grade multi-provider evaluation with intelligent fallback, cost optimization, and performance tuning.

**Key Features:**

- **9 Provider Support**: Google AI, OpenAI, Anthropic, Vertex, Bedrock, Azure, Ollama, Hugging Face, Mistral
- **Intelligent Fallback**: Automatic provider selection when primary fails
- **Cost Optimization**: Provider-specific cost calculations and budget awareness
- **Performance Modes**: Fast, balanced, and quality evaluation options
- **Retry Logic**: Robust error handling with exponential backoff

**Configuration:**

```bash
# Primary evaluation setup
NEUROLINK_EVALUATION_PROVIDER=google-ai
NEUROLINK_EVALUATION_MODE=fast
NEUROLINK_EVALUATION_FALLBACK_ENABLED=true
NEUROLINK_EVALUATION_FALLBACK_PROVIDERS=openai,anthropic,vertex

# Cost optimization
NEUROLINK_EVALUATION_PREFER_CHEAP=true
NEUROLINK_EVALUATION_MAX_COST_PER_EVAL=0.01

# Performance tuning
NEUROLINK_EVALUATION_TIMEOUT=10000
NEUROLINK_EVALUATION_RETRY_ATTEMPTS=2
```

**Usage:**

```typescript
// Automatic provider selection
const result = await sdk.generate({
  input: { text: "Explain quantum computing" },
  enableEvaluation: true, // Uses configured evaluation system
});

// Will try: google-ai → openai → anthropic → vertex (if primary fails)
```

**CLI Usage:**

```bash
# Uses Universal Evaluation System automatically
npx @juspay/neurolink generate "What is machine learning?" --enable-evaluation

# With debug to see provider selection
npx @juspay/neurolink generate "Explain AI" --enable-evaluation --debug
```

### 6. Lighthouse Enhanced Evaluation

Domain-aware evaluation with 6-dimensional scoring based on Lighthouse AI platform patterns.

**Enhanced Scoring Dimensions:**

- **Relevance Score** (1-10): How well response addresses the prompt
- **Accuracy Score** (1-10): Factual correctness of information
- **Completeness Score** (1-10): Whether response fully answers question
- **Domain Alignment** (1-10): Expertise alignment with specified domain
- **Terminology Accuracy** (1-10): Proper use of domain-specific terms
- **Tool Effectiveness** (1-10): How well MCP tools were utilized

**Advanced Features:**

- **Context Integration**: Tool usage tracking and conversation history
- **Domain Expertise**: Specialized evaluation prompts for specific domains
- **Enterprise Telemetry**: Structured logging with OpenTelemetry patterns
- **Backward Compatibility**: Full compatibility with Universal Evaluation System

**CLI Usage:**

```bash
# Basic Lighthouse-style evaluation
npx @juspay/neurolink generate "Fix this Python code" \
  --lighthouse-style \
  --evaluation-domain "Python coding assistant"

# Enterprise evaluation with full context
npx @juspay/neurolink generate "Analyze sales performance" \
  --lighthouse-style \
  --evaluation-domain "Business data analyst" \
  --tool-usage-context "Used sales-data and analytics MCP tools" \
  --context '{"role":"senior_analyst","department":"sales"}'
```

**SDK Usage:**

```typescript

  performEnhancedEvaluation,
  createEnhancedContext,
} from "@juspay/neurolink";

// Create enhanced evaluation context
const enhancedContext = createEnhancedContext(
  "Write a business proposal for Q1 expansion",
  result.text,
  {
    domain: "Business development",
    role: "Business proposal assistant",
    toolsUsed: ["generate", "analytics-helper"],
    conversationHistory: [
      { role: "user", content: "I need help with our Q1 business plan" },
      {
        role: "assistant",
        content: "I can help you create a comprehensive plan",
      },
    ],
  },
);

// Perform enhanced evaluation
const domainEvaluation = await performEnhancedEvaluation(enhancedContext);
console.log(" Enhanced Evaluation:", domainEvaluation);
// {
//   relevanceScore: 9, accuracyScore: 8, completenessScore: 9,
//   domainAlignment: 9, terminologyAccuracy: 8, toolEffectiveness: 9,
//   overall: 8.7, alertSeverity: 'none'
// }
```

##  Interface Reference

### Enhanced TextGenerationOptions

```typescript
type TextGenerationOptions = {
  // Existing fields (unchanged)
  input: { text: string };
  provider?: string;
  model?: string;
  temperature?: number;
  maxTokens?: number;
  systemPrompt?: string;
  timeout?: number | string;
  disableTools?: boolean;

  // NEW: AI Enhancement fields
  enableAnalytics?: boolean; // Default: false
  enableEvaluation?: boolean; // Default: false
  context?: Record; // Default: undefined
};
```

### AnalyticsData Structure

```typescript
type AnalyticsData = {
  provider: string; // AI provider used
  model: string; // Specific model name
  tokens: {
    input: number; // Input tokens
    output: number; // Output tokens
    total: number; // Total tokens
  };
  cost?: number; // Estimated cost (USD)
  responseTime: number; // Response time (ms)
  timestamp: string; // ISO timestamp
  context?: Record; // User context
};
```

### EvaluationData Structure

```typescript
type EvaluationData = {
  relevanceScore: number; // 1-10 scale
  accuracyScore: number; // 1-10 scale
  completenessScore: number; // 1-10 scale
  overallScore: number; // 1-10 scale
  evaluationModel: string; // Model used for evaluation
  evaluationTime: number; // Evaluation time (ms)
};
```

##  Configuration

### Environment Variables

```bash
# Response Quality Evaluation (optional)
NEUROLINK_EVALUATION_MODEL=gemini-2.5-flash
NEUROLINK_EVALUATION_PROVIDER=google-ai

# Provider API Keys (existing)
OPENAI_API_KEY=sk-your-openai-key
GOOGLE_AI_API_KEY=AIza-your-google-ai-key
AWS_ACCESS_KEY_ID=your-aws-access-key
# ... other provider keys
```

### Cost Estimation Configuration

Built-in pricing for major providers (updated regularly):

```typescript
const costMap = {
  openai: {
    "gpt-4": { input: 0.03, output: 0.06 },
    "gpt-4-turbo": { input: 0.01, output: 0.03 },
    "gpt-3.5-turbo": { input: 0.0015, output: 0.002 },
  },
  anthropic: {
    "claude-3-opus": { input: 0.015, output: 0.075 },
    "claude-3-sonnet": { input: 0.003, output: 0.015 },
    "claude-3-haiku": { input: 0.00025, output: 0.00125 },
  },
  "google-ai": {
    "gemini-pro": { input: 0.00035, output: 0.00105 },
    "gemini-2.5-flash": { input: 0.000075, output: 0.0003 },
  },
};
```

##  Performance Considerations

### Performance Impact

- **Features Disabled (default)**: Zero overhead
- **Analytics Only**: \ r.evaluation.overallScore >= 8,
);
```

### Cost Monitoring Dashboard

```typescript
function createCostDashboard() {
  const dailyCosts = [];
  const qualityMetrics = [];

  // Track all AI requests
  sdk.onResponse((result) => {
    if (result.analytics) {
      dailyCosts.push({
        date: new Date(result.analytics.timestamp),
        cost: result.analytics.cost,
        provider: result.analytics.provider,
        tokens: result.analytics.tokens.total,
      });
    }

    if (result.evaluation) {
      qualityMetrics.push({
        date: new Date(),
        quality: result.evaluation.overallScore,
        prompt: result.analytics?.context?.promptType,
      });
    }
  });
}
```

##  Best Practices

1. **Enable Analytics by Default**: Track all production usage
2. **Selective Evaluation**: Use for critical or customer-facing content
3. **Meaningful Context**: Include user/session IDs for tracking
4. **Quality Thresholds**: Set minimum quality scores for auto-publish
5. **Cost Alerts**: Monitor spending with custom thresholds
6. **Performance Monitoring**: Track response times and token usage
7. **A/B Testing**: Use context to track different prompt strategies

---

_NeuroLink AI Enhancements v3.1.0 - Transform your AI applications with comprehensive quality monitoring and analytics._

---

## ️ AI Development Workflow Tools

<!-- Source: ai-workflow-tools.md -->

# ️ AI Development Workflow Tools

**NeuroLink** features **4 specialized AI Development Workflow Tools** for comprehensive AI development lifecycle support. These tools work seamlessly behind our factory method interface, providing enterprise-grade development assistance.

##  Production Status

**Production Ready: 24/24 Tests Passing (100% Success Rate)**

- ✅ **4 AI Workflow Tools Implemented**: Complete development lifecycle support
- ✅ **Platform Evolution**: NeuroLink now features 10 specialized tools (3 core + 3 analysis + 4 workflow)
- ✅ **Performance Validated**: All tools designed for \ sum + item.price, 0); }",
  testTypes: ["unit", "integration", "edge-cases"],
  framework: "jest",
});

console.log(testCases.unitTests); // Unit test scenarios
console.log(testCases.edgeCases); // Edge case coverage
console.log(testCases.integrationTests); // Integration test patterns
```

**Features:**

- **Unit Test Generation**: Comprehensive unit test coverage for functions and classes
- **Edge Case Detection**: Identify and test boundary conditions and error scenarios
- **Integration Testing**: Generate tests for component interactions and API endpoints
- **Framework Support**: Jest, Mocha, Vitest, and other popular testing frameworks
- **Realistic Data**: Generate meaningful test data and mock scenarios

### 2. Code Refactoring - `refactorCode()`

AI-powered code refactoring and optimization with performance and maintainability improvements.

```typescript
const refactoring = await provider.refactorCode({
  sourceCode: `
    function processUsers(users) {
      var result = [];
      for (var i = 0; i ;
  };
  issues: Array;
  recommendations: string[];
  correctedOutput?: string;
  confidence: number;
};
```

##  Interactive Web Interface

All AI Development Workflow Tools are available through our unified demo application:

```bash
cd neurolink-demo && node server.js
# Visit http://localhost:9876 to see all 10 AI tools in action
```

### Features

- ✅ **Complete Tool Suite**: Interactive forms for all 10 specialized tools (3 core + 3 analysis + 4 workflow)
- ✅ **Full API Coverage**: REST endpoints for all AI Analysis and Workflow tools
- ✅ **Professional Results**: Comprehensive output with structured JSON responses
- ✅ **Demonstration Mode**: Realistic examples for immediate evaluation

### API Endpoints

#### Generate Test Cases

```bash
POST /api/ai/generate-test-cases
Content-Type: application/json

{
  "codeFunction": "function add(a, b) { return a + b; }",
  "testTypes": ["unit", "edge-cases"],
  "framework": "jest"
}
```

#### Refactor Code

```bash
POST /api/ai/refactor-code
Content-Type: application/json

{
  "sourceCode": "var users = []; // legacy code...",
  "target": "modern-es6",
  "focusAreas": ["performance", "readability"]
}
```

#### Generate Documentation

```bash
POST /api/ai/generate-documentation
Content-Type: application/json

{
  "codeBase": "class ApiService { ... }",
  "outputFormat": "markdown",
  "includeExamples": true
}
```

#### Debug AI Output

```bash
POST /api/ai/debug-ai-output
Content-Type: application/json

{
  "aiResponse": "{ malformed json... }",
  "expectedFormat": "json",
  "issueTypes": ["format", "logic"]
}
```

##  Visual Documentation

### Screenshots

- **Test Case Generation**: Interactive form showing comprehensive test generation
- **Code Refactoring**: Before/after code comparison with optimization suggestions
- **Documentation Generator**: Automatic API documentation creation interface
- **Debug Assistant**: AI output analysis with issue identification and fixes

### Demo Videos

All workflow tools are demonstrated in our comprehensive demo videos:

- **[Visual Demos](/docs/)** - Complete workflow demonstrations and technical applications

##  Technical Implementation

### MCP Integration

AI Workflow Tools are implemented as MCP (Model Context Protocol) tools that work internally behind our factory methods:

```typescript
// Internal MCP tool execution (transparent to users)
const workflowTools = [
  "generate-test-cases",
  "refactor-code",
  "generate-documentation",
  "debug-ai-output",
];
```

### Real AI Integration

- **Enhanced AI Generation**: All tools now use real AI generation instead of mock data
- **NeuroLink Integration**: Tools leverage actual `NeuroLink` class with automatic fallback
- **Graceful Fallback**: AI tools fall back to mock data only if AI parsing fails
- **Provider Tracking**: Tools report which AI provider was actually used

### Error Handling

- **Comprehensive Validation**: Input validation and error reporting for all tools
- **Production Logging**: Detailed logging for debugging and monitoring
- **Graceful Degradation**: Fallback to simulation mode when AI providers unavailable
- **Context Preservation**: Maintain context across tool execution chains

### Performance Metrics

- **Tool Execution**: Individual tools designed for \<100ms execution
- **API Response**: REST endpoints respond within 2-5 seconds
- **Error Recovery**: Automatic fallback mechanisms for reliability
- **Resource Management**: Efficient handling of large code bases and outputs

##  Getting Started

### Prerequisites

1. **Install NeuroLink**: `npm install @juspay/neurolink`
2. **Configure Providers**: Set up at least one AI provider (see [Provider Configuration](/docs/getting-started/provider-setup)) (now with authentication and model availability checks)
3. **Verify Setup**: Run `npx @juspay/neurolink status` to check connectivity

### Quick Examples

#### Generate Tests for Your Code

```typescript

const provider = createBestAIProvider();
const tests = await provider.generateTestCases({
  codeFunction: "your-function-here",
  testTypes: ["unit", "edge-cases"],
  framework: "jest",
});
```

#### Refactor Legacy Code

```typescript
const refactored = await provider.refactorCode({
  sourceCode: "legacy-code-here",
  target: "modern-es6",
  focusAreas: ["performance", "readability"],
});
```

#### Generate Documentation

```typescript
const docs = await provider.generateDocumentation({
  codeBase: "your-code-here",
  outputFormat: "markdown",
  includeExamples: true,
});
```

### Integration Patterns

#### CI/CD Integration

```yaml
# GitHub Actions example
- name: Generate Tests
  run: npx @juspay/neurolink generate-test-cases --input src/ --output tests/
```

#### Development Workflow

```bash
# Local development commands
neurolink refactor-code --file legacy.js --target modern
neurolink generate-docs --input src/ --output docs/
neurolink debug-output --file ai-response.json --format json
```

##  Current Integration Status

**Total Workflow Tools**: 4 specialized development tools

- **Test Generation**: Comprehensive test case creation for all code types
- **Code Refactoring**: AI-powered optimization and modernization
- **Documentation**: Automatic generation of API docs and guides
- **Debug Assistance**: AI output validation and correction

**Platform Achievement**: NeuroLink has successfully evolved into a **Comprehensive AI Development Platform** with complete development lifecycle support.

##  Related Documentation

- **[Main README](/docs/)** - Project overview and quick start
- **[AI Analysis Tools](/docs/ai-analysis-tools)** - AI optimization and analysis tools
- **[MCP Foundation](/docs/mcp/overview)** - Technical architecture details
- **[API Reference](/docs/sdk/api-reference)** - Complete TypeScript API
- **[CLI Guide](/docs/cli)** - Command-line interface documentation
- **[Visual Demos](/docs/visual-demos)** - Screenshots and videos

---

**AI-Powered Development** - Accelerate your development workflow with intelligent code generation, optimization, and quality assurance tools.

---

## Automated Publishing Guide (Semantic Release)

<!-- Source: automated-publishing-guide.md -->

#  Automated Publishing Guide (Semantic Release)

Complete step-by-step guide to set up **semantic-release** for automated GitHub releases, tags, and NPM publishing for NeuroLink.

##  Current Status

✅ **GitHub Workflow** - `.github/workflows/release.yml` configured with semantic-release
✅ **Semantic Release Config** - `.releaserc.json` configured
✅ **Dependencies Added** - All semantic-release packages in package.json
⏳ **NPM Token Setup** - Required for NPM publishing
⏳ **First Release** - Ready to trigger after NPM token

##  Step-by-Step Setup

### **Step 1: Create NPM Automation Token**

1. **Login to NPM:**

   ```bash
   npm login
   ```

   Use your NPM account credentials

2. **Create Automation Token:**

   ```bash
   npm token create --type=automation
   ```

3. **Copy the token** (starts with `npm_...`)

### **Step 2: Add NPM Token to GitHub Secrets**

1. Go to: https://github.com/juspay/neurolink/settings/secrets/actions
2. Click **"New repository secret"**
3. **Name:** `NPM_TOKEN`
4. **Value:** Paste your NPM automation token
5. Click **"Add secret"**

### **Step 3: Use Conventional Commits**

Semantic-release uses **conventional commits** to determine version bumps:

```bash
# PATCH version (1.7.0 → 1.7.1) - Bug fixes
git commit -m "fix: resolve CLI authentication issue"
git commit -m "perf: improve provider selection speed"

# MINOR version (1.7.0 → 1.8.0) - New features
git commit -m "feat: add new AI provider support"
git commit -m "feat(cli): add batch processing command"

# MAJOR version (1.7.0 → 2.0.0) - Breaking changes
git commit -m "feat!: remove deprecated API methods"
git commit -m "fix!: change provider interface signature"

# Alternative major version syntax
git commit -m "feat: add new authentication

BREAKING CHANGE: Previous auth methods no longer supported"
```

### **Step 4: Trigger Automatic Release**

 **Just push to release branch with conventional commits!**

```bash
# Make your changes with conventional commits
git add .
git commit -m "feat: add Google AI Studio integration"

# Push to release branch
git checkout release
git merge your-feature-branch
git push origin release

#  SEMANTIC RELEASE HANDLES EVERYTHING:
# ✅ Analyzes commit messages
# ✅ Determines version bump (patch/minor/major)
# ✅ Generates CHANGELOG.md
# ✅ Creates Git tag
# ✅ Creates GitHub release with notes
# ✅ Publishes to NPM
# ✅ Publishes to GitHub Packages
# ✅ Commits version changes back to repo
```

##  How Semantic Release Works

### **Commit Analysis:**

- **fix:** → Patch release (1.7.0 → 1.7.1)
- **feat:** → Minor release (1.7.0 → 1.8.0)
- **BREAKING CHANGE** or **!** → Major release (1.7.0 → 2.0.0)
- **docs:, style:, refactor:, test:, chore:** → No release

### **Generated Assets:**

- ️ **Git Tag:** `v1.8.0` (automatically created)
-  **CHANGELOG.md** (automatically generated and committed)
-  **GitHub Release** (with professional release notes)
-  **NPM Package:** https://www.npmjs.com/package/@juspay/neurolink
-  **GitHub Package:** https://github.com/juspay/neurolink/packages

### **Automatic Updates:**

- ✅ **package.json version** updated and committed
- ✅ **CHANGELOG.md** generated and committed
- ✅ **Git tags** created automatically
- ✅ **Release notes** generated from commits

##  Expected Results

After pushing conventional commits to release branch:

### **Automatic Process:**

1.  **Analyzes commits** since last release
2.  **Determines version** based on conventional commits
3.  **Generates CHANGELOG.md** from commit messages
4. ️ **Creates Git tag** (e.g., v1.8.0)
5.  **Creates GitHub release** with generated notes
6.  **Publishes to NPM** registry
7.  **Publishes to GitHub Packages**
8.  **Commits changes** back to release branch

### **GitHub Repository:**

- ✅ **Tags:** Automatically created (v1.8.0)
- ✅ **Releases:** Professional release notes from commits
- ✅ **Packages:** Available on GitHub Packages
- ✅ **CHANGELOG.md:** Auto-generated and updated

### **NPM Registry:**

- ✅ **Published Package:** `@juspay/neurolink@1.8.0`
- ✅ **Installation:** `npm install @juspay/neurolink`

##  Troubleshooting

### **Common Issues:**

#### **"No release published"**

- **Cause:** No conventional commits since last release
- **Solution:** Use proper conventional commit format (`feat:`, `fix:`, etc.)

#### **"NPM_TOKEN not found"**

- **Solution:** Add NPM token to GitHub repository secrets
- **Check:** Repository → Settings → Secrets and variables → Actions

#### **"Permission denied to publish"**

- **Solution:** Ensure NPM token has publishing permissions
- **Fix:** Create new automation token with correct permissions

#### **"CHANGELOG.md conflicts"**

- **Solution:** Semantic-release handles this automatically
- **Info:** Don't manually edit CHANGELOG.md - it's auto-generated

### **Verification Commands:**

```bash
# Check if package is published
npm view @juspay/neurolink

# Check latest release
gh release view --web

# Check semantic-release dry run (locally)
npx semantic-release --dry-run
```

##  Conventional Commit Examples

### **Feature Examples:**

```bash
feat: add OpenAI GPT-4o support
feat(cli): add --stream flag for real-time output
feat(providers): add retry logic for failed requests
```

### **Bug Fix Examples:**

```bash
fix: resolve memory leak in provider selection
fix(auth): handle expired API keys gracefully
fix(cli): correct typo in help text
```

### **Breaking Change Examples:**

```bash
feat!: change provider interface to async/await
fix!: remove deprecated createProvider function

# Or with body:
feat: redesign authentication system

BREAKING CHANGE: All providers now require async initialization
```

### **Other Types:**

```bash
docs: update README with new provider instructions
style: fix code formatting in provider files
refactor: simplify error handling logic
test: add unit tests for new providers
chore: update dependencies to latest versions
perf: optimize provider selection algorithm
```

##  Future Releases

### **Fully Automated Process:**

1. Write code with conventional commits
2. Push to release branch
3. **That's it!** Semantic-release handles everything else

### **No Manual Steps Required:**

- ❌ No manual version bumping
- ❌ No manual changelog writing
- ❌ No manual tag creation
- ❌ No manual release creation
- ❌ No manual NPM publishing

### **Professional Results:**

- ✅ Consistent versioning with SemVer
- ✅ Professional changelogs from commits
- ✅ Comprehensive release notes
- ✅ Zero human error in releases

## ✅ Next Steps

1. **Complete Step 1-2:** NPM token setup
2. **Use conventional commits:** Follow the format above
3. **Push to release branch:** Automatic release triggered
4. **Verify:** Check all platforms have packages
5. **Celebrate:** You now have industry-standard automation!

---

** Need Help?**

- Check the workflow logs in GitHub Actions
- Ensure NPM_TOKEN is properly configured
- Use conventional commit format
- Test with `npx semantic-release --dry-run` locally

The semantic-release workflow is the industry standard used by thousands of open-source projects. Once set up, you'll have bulletproof, professional-grade release automation!

##  References

- [Semantic Release Documentation](https://semantic-release.gitbook.io/)
- [Conventional Commits Specification](https://www.conventionalcommits.org/)
- [GitHub Actions for Semantic Release](https://github.com/semantic-release/semantic-release/blob/master/docs/usage/github-actions.md)

---

## Business Documentation Hub

<!-- Source: business-documentation.md -->

#  Business Documentation Hub

> **Transform your AI operations with NeuroLink's enterprise analytics and quality evaluation features**

This hub provides comprehensive business-focused documentation for implementing NeuroLink's analytics and evaluation features in production environments.

##  Documentation Overview

###  [Business Value Guide](/docs/business-value)

**ROI-focused guide with real cost savings and quality improvements**

- **Cost Optimization**: 35-40% reduction in AI spending
- **Quality Improvement**: 85-95% consistency in AI responses
- **Performance Monitoring**: Real-time business intelligence
- **Industry Examples**: E-commerce, healthcare, finance, SaaS
- **ROI Calculator**: Measure 300-1000% return on investment

###  [Industry Use Cases](/docs/use-cases)

**Real-world applications across 8+ industries**

- **E-commerce**: Product descriptions with cost optimization
- **Healthcare**: Patient education with 100% compliance
- **Financial Services**: Investment reports with regulatory compliance
- **SaaS**: Customer support automation (88% satisfaction)
- **Education**: Course content creation (8x faster)
- **Manufacturing**: Safety documentation (OSHA compliant)
- **Hospitality**: Marketing content (18% booking increase)
- **Mobile Apps**: App store optimization

###  Integration Tutorials

**Step-by-step implementation guides**

- **Quick Start**: 15-minute setup guide
- **Web Application**: Express.js + frontend integration
- **Batch Processing**: CSV data processing at scale
- **Real-Time Monitoring**: Analytics dashboard creation
- **Cost Optimization**: Automatic model selection
- **Industry Examples**: Production-ready implementations

###  [Technical Implementation](/docs/ai-enhancements)

**Technical feature specifications**

- **Analytics System**: Usage tracking and cost analysis
- **Evaluation System**: AI-powered quality scoring
- **Context Flow**: Custom data through request chains
- **Configuration**: Environment setup and model selection

###  [Testing & Validation](/docs/development/testing)

**Comprehensive testing and validation guides**

- **Feature Testing**: Analytics and evaluation validation
- **Integration Testing**: End-to-end workflow verification
- **Performance Testing**: Load and stress testing
- **Quality Assurance**: Testing methodology and best practices

##  Quick Navigation by Role

###  **Business Decision Makers**

**Start Here**: [Business Value Guide](/docs/business-value)

- See immediate ROI potential (300-1000% returns)
- Review cost optimization examples (35-40% savings)
- Understand quality improvement metrics (85-95% consistency)
- Compare industry success stories

### ‍ **Product Managers**

**Start Here**: [Industry Use Cases](/docs/use-cases)

- Find your industry's specific implementation
- See real-world success metrics
- Understand quality gates and business rules
- Review customer satisfaction improvements

### ‍ **Developers & Engineers**

**Start Here**: Integration Tutorials

- Follow step-by-step implementation guides
- Review code examples and best practices
- Set up monitoring and analytics dashboards
- Implement cost optimization strategies

###  **QA & Testing Teams**

**Start Here**: [Testing & Validation](/docs/development/testing)

- Comprehensive testing methodologies
- Quality assurance frameworks
- Performance benchmarking
- Validation scripts and tools

##  Implementation Roadmap

### Week 1: Foundation

1. **Read**: [Business Value Guide](/docs/business-value) - Understand ROI potential
2. **Review**: [Industry Use Cases](/docs/use-cases) - Find relevant examples
3. **Setup**: Basic analytics tracking
4. **Measure**: Baseline costs and quality

### Week 2: Implementation

1. **Follow**: [Quick Start Tutorial](/docs/tutorials.md#quick-start-15-minutes)
2. **Enable**: Analytics and evaluation features
3. **Configure**: Quality gates and cost monitoring
4. **Test**: Validation using [Testing Guide](/docs/development/testing)

### Week 3: Optimization

1. **Implement**: Cost optimization strategies
2. **Setup**: Real-time monitoring dashboard
3. **Configure**: Department-level tracking
4. **Measure**: Quality improvement metrics

### Week 4: Scale

1. **Deploy**: Production implementation
2. **Monitor**: ROI and performance metrics
3. **Optimize**: Based on analytics data
4. **Expand**: Roll out to additional teams

##  Expected Business Outcomes

###  Cost Optimization

- **Month 1**: 15-25% cost reduction through basic optimization
- **Month 2**: 25-35% cost reduction through advanced model selection
- **Month 3**: 35-45% cost reduction through department-level optimization
- **Ongoing**: Continuous optimization based on analytics insights

### ⭐ Quality Improvement

- **Week 1**: Baseline quality measurement established
- **Week 2**: Quality gates prevent low-quality content
- **Month 1**: 20-30% improvement in content consistency
- **Month 3**: 85-95% quality consistency achieved

###  Productivity Gains

- **Immediate**: Real-time cost and quality visibility
- **Week 2**: Automated quality control reduces manual review
- **Month 1**: 50-75% reduction in content review time
- **Month 3**: 10x faster content creation with quality assurance

##  Success Stories Summary

### E-commerce Company

- **Challenge**: 50,000 product descriptions monthly
- **Solution**: Analytics-driven model selection + quality gates
- **Results**: 65% cost reduction, 90% quality consistency, 10x faster creation

### Healthcare Organization

- **Challenge**: Regulatory compliance for patient education
- **Solution**: Strict evaluation thresholds + medical review workflows
- **Results**: 100% compliance, 75% faster creation, 40% better comprehension

### SaaS Company

- **Challenge**: Scale customer support while maintaining quality
- **Solution**: Tiered quality control + response time optimization
- **Results**: 88% satisfaction, 60% cost reduction, 10x volume handling

### Financial Services

- **Challenge**: Accurate investment reports with regulatory compliance
- **Solution**: Compliance frameworks + fact-checking requirements
- **Results**: Zero violations, 5x faster reports, 45% better ratings

##  Technical Architecture Overview

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Application   │────│   NeuroLink SDK  │────│  AI Providers   │
│  (Your Code)    │    │  with Analytics  │    │ (9 Providers)   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Quality Gates  │    │   Cost Tracking  │    │   Performance   │
│  & Evaluation   │    │   & Analytics    │    │   Monitoring    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```

### Core Components

- **Analytics System**: Real-time usage tracking and cost analysis
- **Evaluation System**: AI-powered response quality scoring
- **Context Flow**: Custom business data through request chains
- **Quality Gates**: Automated quality control and review workflows
- **Cost Optimization**: Intelligent provider and model selection

##  Support & Resources

### Getting Help

- **Technical Issues**: [GitHub Issues](https://github.com/juspay/neurolink/issues)
- **Feature Requests**: [GitHub Discussions](https://github.com/juspay/neurolink/discussions)
- **Documentation**: [Complete API Reference](/docs/)
- **Examples**: [Working Code Examples](/docs/)

### Community

- **NPM Package**: [@juspay/neurolink](https://www.npmjs.com/package/@juspay/neurolink)
- **GitHub Repository**: [juspay/neurolink](https://github.com/juspay/neurolink)
- **License**: MIT (Production-friendly)

##  Next Steps

1. **Assess Your Needs**: Review [Industry Use Cases](/docs/use-cases) for your sector
2. **Calculate ROI**: Use examples in [Business Value Guide](/docs/business-value)
3. **Start Implementation**: Follow the Integration Tutorials
4. **Validate Results**: Use [Testing & Validation](/docs/development/testing)
5. **Optimize & Scale**: Monitor analytics and optimize based on data

---

**Ready to transform your AI operations?**

Start with the [Business Value Guide](/docs/business-value) to understand the ROI potential, then move to [Industry Use Cases](/docs/use-cases) to see how organizations like yours are achieving success.

The analytics and evaluation features typically deliver **300-1000% ROI** within 3-6 months through cost optimization, quality improvement, and productivity gains.

---

## Business Value Guide: Analytics & Evaluation Features

<!-- Source: business-value.md -->

#  Business Value Guide: Analytics & Evaluation Features

- [✅ Performance Monitoring Achieved:](#performance-monitoring-achieved)
- [ Next Steps](#next-steps)

NeuroLink's analytics and evaluation features deliver measurable business value through cost optimization, quality improvement, and performance monitoring. This guide shows real-world examples of business impact and ROI.

##  Cost Optimization

### Problem: Uncontrolled AI Spending

**Before NeuroLink Analytics:**

- No visibility into AI provider costs
- Using expensive models for simple tasks
- No department-level cost tracking
- Estimated monthly spend: **$5,000-$8,000**

**After NeuroLink Analytics:**

- Real-time cost tracking by provider, model, department
- Automatic model selection based on task complexity
- Cost optimization alerts and recommendations
- Actual monthly spend: **$3,200-$4,500** (35-40% reduction)

### ROI Example: E-commerce Company

```javascript
// Before: Using GPT-4 for all product descriptions
const expensiveResult = await provider.generate({
  input: { text: "Write product description for basic t-shirt" },
  model: "gpt-4-turbo", // $30/1M tokens
  enableAnalytics: true,
});
// Cost per description: $0.12
// Monthly cost (10,000 descriptions): $1,200

// After: Using analytics-driven model selection
const optimizedResult = await provider.generate({
  input: { text: "Write product description for basic t-shirt" },
  model: "gpt-3.5-turbo", // $3/1M tokens
  enableAnalytics: true,
});
// Cost per description: $0.015
// Monthly cost (10,000 descriptions): $150
// Monthly savings: $1,050 (87.5% reduction)
```

### Department-Level Cost Tracking

```javascript
// Track costs by department
const marketingResult = await provider.generate({
  input: { text: "Create social media post" },
  enableAnalytics: true,
  context: { department: "marketing", campaign: "Q1-launch" },
});

const supportResult = await provider.generate({
  input: { text: "Generate customer response" },
  enableAnalytics: true,
  context: { department: "support", priority: "high" },
});

// Analytics dashboard shows:
// Marketing: $450/month (social posts, ad copy)
// Support: $230/month (customer responses)
// Sales: $180/month (email templates)
// Total visibility enables budget allocation
```

## ⭐ Quality Improvement

### Problem: Inconsistent AI Response Quality

**Before NeuroLink Evaluation:**

- No automated quality assessment
- Manual review required for all content
- Inconsistent response quality (60-75% satisfaction)
- High review overhead (2-3 hours daily)

**After NeuroLink Evaluation:**

- Automated quality scoring (relevance, accuracy, completeness)
- Quality gates prevent low-quality content
- Consistent high-quality responses (85-95% satisfaction)
- Reduced review time (30 minutes daily)

### ROI Example: Customer Support

```javascript
// Automated quality control
const supportResponse = await provider.generate({
  input: { text: "Customer complaining about delayed shipment" },
  enableEvaluation: true,
  enableAnalytics: true,
  context: {
    customerTier: "premium",
    issueType: "shipping",
    urgency: "high",
  },
});

// Quality gates
if (supportResponse.evaluation.overall = 9) {
  await publishContent(medicalContent);
} else {
  await medicalProfessionalReview(medicalContent);
}

// Results:
// - 95% accuracy maintained (regulatory compliance)
// - 40% faster content creation
// - Zero compliance violations
```

##  Performance Monitoring

### Real-Time Business Intelligence

```bash
# Daily analytics reporting
npx @juspay/neurolink generate "Daily report summary" \
  --enable-analytics --enable-evaluation \
  --context '{"report_type":"daily","department":"analytics"}' \
  --debug

# Output includes:
#  Analytics: Response time: 1,200ms, Cost: $0.08, Tokens: 1,250
# ⭐ Evaluation: Overall: 9/10, Accuracy: 9/10, Completeness: 8/10
```

### Performance Optimization Dashboard

```javascript
// Track performance trends
const performanceData = {
  dailyStats: await analytics.getDailyUsage(),
  qualityTrends: await evaluation.getQualityTrends(),
  costOptimization: await analytics.getCostOptimization(),
};

// Key Performance Indicators:
// - Average response time: 1.2s (target: <2s)
// - Quality score trend: +15% this month
// - Cost per task: -25% vs last quarter
// - Provider reliability: 99.2% uptime
```

##  Industry-Specific Value

### E-commerce

**Use Case:** Product description generation

- **Volume:** 50,000 products/month
- **Cost Savings:** $2,400/month (optimized model selection)
- **Quality Improvement:** 85% consistency (vs 60% manual)
- **Time Savings:** 200 hours/month human writing

### Healthcare

**Use Case:** Patient education content

- **Compliance:** 98% accuracy requirement met
- **Review Time:** 75% reduction in medical review
- **Patient Satisfaction:** +30% comprehension scores
- **Risk Mitigation:** Zero compliance violations

### Financial Services

**Use Case:** Investment report generation

- **Accuracy:** 95% fact-checking score required
- **Compliance:** Automated regulatory review
- **Client Satisfaction:** +40% report quality ratings
- **Productivity:** 3x faster report generation

### SaaS Companies

**Use Case:** Customer communication

- **Response Time:** 90% under 30 seconds
- **Quality:** 88% customer satisfaction
- **Cost:** 60% reduction vs human-only support
- **Scalability:** Handle 10x volume with same team

##  ROI Calculation Framework

### Cost Savings Calculator

```javascript
// Monthly cost analysis
const monthlyROI = {
  // Before NeuroLink
  aiProviderCosts: 5000, // Unoptimized spending
  humanReviewHours: 80, // Manual quality review
  humanHourlyRate: 50, // $50/hour for reviewers
  qualityIssues: 12, // Monthly quality problems
  issueResolutionCost: 200, // $200 per quality issue

  // After NeuroLink
  optimizedAICosts: 3200, // 36% cost reduction
  reducedReviewHours: 20, // 75% review time reduction
  qualityIssuesPrevented: 10, // Quality gates prevent issues

  // Calculate savings
  totalMonthlySavings() {
    const aiSavings = this.aiProviderCosts - this.optimizedAICosts;
    const laborSavings =
      (this.humanReviewHours - this.reducedReviewHours) * this.humanHourlyRate;
    const qualitySavings =
      this.qualityIssuesPrevented * this.issueResolutionCost;

    return aiSavings + laborSavings + qualitySavings;
    // Result: $1,800 + $3,000 + $2,000 = $6,800/month savings
  },
};

// Annual ROI: $81,600 savings
// Implementation cost: ~$5,000 (development time)
// ROI: 1,632% (16x return on investment)
```

### Quality Improvement Metrics

```javascript
const qualityMetrics = {
  beforeNeuroLink: {
    averageQualityScore: 6.5, // Out of 10
    customerSatisfaction: 72, // Percentage
    manualReviewRequired: 100, // Percentage
    complianceViolations: 3, // Per month
  },

  afterNeuroLink: {
    averageQualityScore: 8.7, // +34% improvement
    customerSatisfaction: 89, // +24% improvement
    manualReviewRequired: 25, // -75% reduction
    complianceViolations: 0, // Zero violations
  },
};
```

##  Getting Started with Business Value

### Week 1: Baseline Measurement

```bash
# Measure current costs without analytics
npx @juspay/neurolink generate "Business content" --provider openai
# Note: No cost tracking, no quality metrics
```

### Week 2: Enable Analytics

```bash
# Start tracking costs and usage
npx @juspay/neurolink generate "Business content" \
  --provider openai --enable-analytics --debug
# Result: Immediate cost visibility
```

### Week 3: Add Quality Control

```bash
# Add automated quality assessment
npx @juspay/neurolink generate "Business content" \
  --provider openai --enable-analytics --enable-evaluation --debug
# Result: Quality scores + cost tracking
```

### Week 4: Optimize Based on Data

```bash
# Use analytics data to optimize provider/model selection
npx @juspay/neurolink generate "Business content" \
  --provider google-ai --model gemini-2.5-flash \
  --enable-analytics --enable-evaluation --debug
# Result: Optimized costs + maintained quality
```

##  Business Value Checklist

### ✅ Cost Optimization Achieved:

- [ ] Real-time cost tracking implemented
- [ ] Department-level cost allocation setup
- [ ] Model optimization based on task complexity
- [ ] Monthly cost reduction of 25-40%
- [ ] Automated cost alerts configured

### ✅ Quality Improvement Achieved:

- [ ] Automated quality scoring implemented
- [ ] Quality gates prevent low-quality content
- [ ] Customer satisfaction increased 20%+
- [ ] Manual review time reduced 70%+
- [ ] Compliance requirements met consistently

### Performance Monitoring Achieved:

- [ ] Real-time performance dashboards
- [ ] Quality trend analysis
- [ ] Cost optimization recommendations
- [ ] Provider reliability monitoring
- [ ] Business intelligence reporting

## Next Steps

1. **Implement Analytics**: Start with cost tracking
2. **Add Quality Control**: Implement evaluation scoring
3. **Measure Baseline**: Document current costs/quality
4. **Optimize Based on Data**: Use insights for improvement
5. **Scale Across Organization**: Roll out to all teams

The combination of analytics and evaluation features typically delivers **300-1000% ROI** within 3-6 months through cost optimization, quality improvement, and productivity gains.

---

## Workflow Engine - High-Level Design

<!-- Source: WORKFLOW-ENGINE-HLD.md -->

# Neurolink Workflow Engine - High-Level Design (HLD)

**Version**: 1.0
**Date**: November 28, 2025
**Status**: Implementation Complete
**Author**: Neurolink Team

##  Goals & Non-Goals

### Goals (Testing Phase)

1. **Enable Multi-Model Workflows**: Run N models in parallel for the same prompt
2. **Intelligent Evaluation**: Use judge models to score (0-100) and rank responses
3. **Comprehensive Logging**: Detailed metrics for AB testing and evaluation
4. **Original Output**: Return best response unchanged for production safety
5. **Cost Transparency**: Provide clear cost/performance metrics
6. **Seamless Integration**: Work with existing Neurolink provider layer

### Non-Goals (Phase 1 - Testing)

- ❌ Response conditioning/modification (deferred until testing validates workflows)
- ❌ Streaming workflow execution (deferred to Phase 2)
- ❌ Stateful/resumable workflows (deferred to Phase 2)
- ❌ DAG-based workflow chaining (deferred to Phase 3)
- ❌ Human-in-the-loop approval steps (deferred to Phase 3)
- ❌ Workflow versioning/migration (deferred to Phase 3)

---

## ️ Architecture Overview

### System Context

```
┌─────────────────────────────────────────────────────────────┐
│                      Neurolink SDK                           │
│                                                              │
│  ┌──────────────┐         ┌─────────────────┐              │
│  │   NeuroLink  │────────▶│ Workflow Engine │◀─────┐       │
│  │    Class     │         └─────────────────┘      │       │
│  └──────────────┘                 │                │       │
│         │                          ▼                │       │
│         │                  ┌──────────────┐        │       │
│         │                  │   Workflow   │        │       │
│         │                  │   Registry   │        │       │
│         │                  └──────────────┘        │       │
│         │                          │                │       │
│         ▼                          ▼                │       │
│  ┌──────────────┐         ┌──────────────┐        │       │
│  │  AI Provider │         │   Ensemble   │────────┘       │
│  │   Factory    │◀────────│   Executor   │                │
│  └──────────────┘         └──────────────┘                │
│         │                          │                        │
│         │                          ▼                        │
│         │                  ┌──────────────┐                │
│         │                  │    Judge     │                │
│         │                  │   Scorer     │                │
│         │                  └──────────────┘                │
│         │                          │                        │
│         ▼                          ▼                        │
│  ┌──────────────────────────────────────┐                 │
│  │          BaseProvider Layer           │                 │
│  │  (OpenAI, Anthropic, Google, etc.)    │                 │
│  └──────────────────────────────────────┘                 │
└─────────────────────────────────────────────────────────────┘
```

### Component Hierarchy

```
src/lib/types/
└── workflowTypes.ts           # All workflow type definitions (centralized)

src/lib/workflow/
├── index.ts                    # Public API exports
├── types.ts                    # Re-exports from types/workflowTypes.ts
├── config.ts                   # Configuration schemas & defaults
│
├── core/
│   ├── workflowRunner.ts      # Main orchestrator
│   ├── workflowRegistry.ts    # Workflow template registry
│   ├── ensembleExecutor.ts    # Multi-model parallel execution
│   ├── judgeScorer.ts         # Judge model scoring
│   └── responseConditioner.ts # Response post-processing
│
├── workflows/                  # Built-in workflow implementations
│   ├── consensusWorkflow.ts   # 3-5 models + judge
│   ├── fallbackWorkflow.ts    # Sequential fallback chain
│   ├── multiJudgeWorkflow.ts  # Multiple judges with voting
│   └── adaptiveWorkflow.ts    # Dynamic model selection
│
└── utils/
    ├── workflowValidation.ts  # Config validation
    └── workflowMetrics.ts     # Performance tracking
```

---

##  Workflow Execution Flow

### High-Level Process

```
┌────────────────────────────────────────────────────────────┐
│ 1. USER REQUEST                                             │
│    neuro.generate({                                        │
│      workflowConfig: { workflowId: 'consensus-3' },        │
│      input: { text: 'Explain quantum computing' }          │
│    })                                                       │
└────────────────────────────────────────────────────────────┘
                            ↓
┌────────────────────────────────────────────────────────────┐
│ 2. WORKFLOW RESOLUTION                                      │
│    - Load workflow config from registry                    │
│    - Validate configuration                                │
│    - Apply runtime overrides (if any)                      │
└────────────────────────────────────────────────────────────┘
                            ↓
┌────────────────────────────────────────────────────────────┐
│ 3. ENSEMBLE EXECUTION (Parallel)                           │
│    ┌──────────┐  ┌──────────┐  ┌──────────┐              │
│    │ Model 1  │  │ Model 2  │  │ Model 3  │              │
│    │ GPT-4o   │  │ Claude   │  │ Gemini   │              │
│    └──────────┘  └──────────┘  └──────────┘              │
│         │             │             │                       │
│         └─────────────┴─────────────┘                      │
│                       ↓                                     │
│    [Response 1, Response 2, Response 3]                    │
└────────────────────────────────────────────────────────────┘
                            ↓
┌────────────────────────────────────────────────────────────┐
│ 4. JUDGE SCORING (Optional)                                │
│    - Format responses for judge evaluation                 │
│    - Call judge model with structured schema               │
│    - Parse scores: { resp1: 8.5, resp2: 9.2, resp3: 7.8 } │
│    - Rank/select best response                             │
└────────────────────────────────────────────────────────────┘
                            ↓
┌────────────────────────────────────────────────────────────┐
│ 5. RESPONSE CONDITIONING (Optional)                        │
│    - Calculate confidence score                            │
│    - Adjust tone based on confidence                       │
│    - Add metadata (models used, scores, timing)            │
│    - Format final response                                 │
└────────────────────────────────────────────────────────────┘
                            ↓
┌────────────────────────────────────────────────────────────┐
│ 6. RETURN WORKFLOW RESULT                                  │
│    {                                                        │
│      content: "Quantum computing is...",                   │
│      confidence: 0.92,                                     │
│      ensembleResponses: [...],                             │
│      judgeScores: {...},                                   │
│      totalTime: 3421                                       │
│    }                                                        │
└────────────────────────────────────────────────────────────┘
```

---

##  Core Components

### 1. Workflow Runner

**Purpose**: Main orchestrator that executes workflows end-to-end

**Responsibilities**:

- Load and validate workflow configurations
- Coordinate ensemble → judge → conditioning pipeline
- Handle errors and partial failures
- Aggregate results with comprehensive metrics

**Key Methods**:

```typescript
class WorkflowRunner {
  async execute(
    config: WorkflowConfig,
    input: WorkflowInput,
  ): Promise;

  async executeWithRetry(
    config: WorkflowConfig,
    input: WorkflowInput,
    retries: number,
  ): Promise;
}
```

---

### 2. Workflow Registry

**Purpose**: Manage workflow templates (built-in + custom)

**Responsibilities**:

- Store workflow configurations
- Provide workflow discovery API
- Validate configs before registration
- Support workflow CRUD operations

**Key Methods**:

```typescript
class WorkflowRegistry {
  register(config: WorkflowConfig): void;
  get(id: string): WorkflowConfig | undefined;
  list(): WorkflowConfig[];
  validate(config: WorkflowConfig): ValidationResult;
}
```

---

### 3. Ensemble Executor

**Purpose**: Execute multiple models in parallel

**Responsibilities**:

- Create provider instances for each model
- Execute requests concurrently via `Promise.all()`
- Collect responses with timing/usage data
- Handle individual model failures gracefully

**Key Methods**:

```typescript
class EnsembleExecutor {
  async execute(
    models: ModelConfig[],
    input: string,
  ): Promise;

  async executeWithTimeout(
    models: ModelConfig[],
    input: string,
    timeout: number,
  ): Promise;
}
```

**Integration Points**:

- Uses `AIProviderFactory.createProvider()` for model instantiation
- Calls `BaseProvider.generate()` for each model
- Leverages existing analytics from `core/analytics.ts`

---

### 4. Judge Scorer

**Purpose**: Evaluate and rank ensemble responses

**Responsibilities**:

- Format ensemble results for judge evaluation
- Call judge model with structured output schema
- Parse scores/rankings from judge response
- Support multiple scoring strategies (numeric, ranking, best-pick)

**Key Methods**:

```typescript
class JudgeScorer {
  async score(
    responses: EnsembleResponse[],
    judgeConfig: JudgeConfig,
  ): Promise;

  async scoreMultiJudge(
    responses: EnsembleResponse[],
    judgeConfigs: JudgeConfig[],
  ): Promise;
}
```

**Scoring Strategies**:

1. **Numeric Scoring**: Return 0-10 scores for each response
2. **Ranking**: Order responses from best to worst
3. **Best Pick**: Select single best response with reasoning
4. **Multi-Judge Voting**: Average scores from multiple judges

---

### 5. Response Conditioner

**Purpose**: Post-process responses based on confidence

**Responsibilities**:

- Calculate overall confidence score
- Adjust tone based on confidence level
- Add structured metadata
- Format final user-facing response

**Key Methods**:

```typescript
class ResponseConditioner {
  async condition(
    response: string,
    confidence: number,
    config: ConditioningConfig,
  ): Promise;

  calculateConfidence(scores: JudgeScores, consensus: number): number;
}
```

**Conditioning Rules**:

- **High confidence (>0.8)**: Direct, assertive language
- **Medium confidence (0.5-0.8)**: Balanced, qualified language
- **Low confidence (\; // Custom metadata
};
```

### ModelConfig

```typescript
type ModelConfig = {
  provider: AIProviderName; // e.g., 'openai', 'anthropic'
  model: string; // e.g., 'gpt-4o', 'claude-3-5-sonnet'
  weight?: number; // Weight for voting (0-1)
  temperature?: number; // Model temperature
  maxTokens?: number; // Max response tokens
  systemPrompt?: string; // Custom system prompt
  timeout?: number; // Per-model timeout (ms)
};
```

### JudgeConfig

```typescript
type JudgeConfig = {
  provider: AIProviderName; // Judge model provider
  model: string; // Judge model name
  criteria: string[]; // Evaluation criteria
  outputFormat: JudgeOutputFormat; // 'scores' | 'ranking' | 'best'
  systemPrompt?: string; // Custom judge prompt
  blindEvaluation?: boolean; // Hide provider names
};
```

### WorkflowResult

```typescript
type WorkflowResult = {
  content: string; // Final conditioned response

  ensembleResponses: EnsembleResponse[]; // All model responses
  judgeScores?: JudgeScores; // Judge evaluation
  selectedResponse?: string; // Selected best response

  confidence: number; // Overall confidence (0-1)
  totalTime: number; // Total execution time (ms)
  workflow: string; // Workflow ID used

  usage?: AggregatedUsage; // Token usage across all models
  analytics?: WorkflowAnalytics; // Detailed analytics
  metadata?: Record; // Custom metadata
};
```

---

##  Integration Points

### With Existing Neurolink Infrastructure

#### 1. AIProviderFactory

```typescript
// Workflow uses existing factory for provider creation
const provider = await AIProviderFactory.createProvider(
  modelConfig.provider,
  modelConfig.model,
);
```

#### 2. BaseProvider

```typescript
// All models use standard generate() method
const result = await provider.generate({
  input: { text: prompt },
  temperature: modelConfig.temperature,
  systemPrompt: modelConfig.systemPrompt,
});
```

#### 3. Analytics & Evaluation

```typescript
// Workflow aggregates existing analytics
const analytics = createAnalytics(provider, model, result, time);
const evaluation = await evaluateResponse(query, response);
```

#### 4. NeuroLink Class Extension

```typescript
// Workflow execution via generate() method
export class NeuroLink {
  async generate(
    options: GenerateOptions & { workflowConfig?: WorkflowGenerateOptions },
  ): Promise {
    if (options.workflowConfig) {
      const workflow = workflowRegistry.get(options.workflowConfig.workflowId);
      return await workflowRunner.execute(workflow, options);
    }
    // ... existing generate logic
  }
}

// Standalone registry functions

  registerWorkflow,
  listWorkflows,
  getWorkflow,
} from "@juspay/neurolink/workflow";

registerWorkflow(config);
const workflows = listWorkflows();
const workflow = getWorkflow("consensus-3");
```

---

##  Built-in Workflows

### 1. Consensus Workflow (consensus-3)

**Purpose**: Cross-validate responses across 3 models with judge scoring

```typescript
{
  id: 'consensus-3',
  name: 'Three Model Consensus',
  type: 'ensemble',
  models: [
    { provider: 'openai', model: 'gpt-4o' },
    { provider: 'anthropic', model: 'claude-3-5-sonnet' },
    { provider: 'google-ai', model: 'gemini-2.5-flash' }
  ],
  judge: {
    provider: 'openai',
    model: 'gpt-4o',
    criteria: ['accuracy', 'clarity', 'completeness'],
    outputFormat: 'best'
  },
  conditioning: {
    useConfidence: true,
    toneAdjustment: 'neutral'
  }
}
```

**Use Cases**: High-stakes decisions, factual queries, technical explanations

---

### 2. Fast Fallback Workflow (fast-fallback)

**Purpose**: Try fast model first, fallback to powerful model if needed

```typescript
{
  id: 'fast-fallback',
  name: 'Fast with Quality Fallback',
  type: 'chain',
  models: [
    { provider: 'google-ai', model: 'gemini-2.5-flash', timeout: 5000 },
    { provider: 'anthropic', model: 'claude-3-5-sonnet', timeout: 10000 }
  ],
  conditioning: {
    useConfidence: true,
    metadata: { strategy: 'fast-first' }
  }
}
```

**Use Cases**: Cost optimization, performance-sensitive applications

---

### 3. Quality Max Workflow (quality-max)

**Purpose**: Maximum quality with dual powerful models

```typescript
{
  id: 'quality-max',
  name: 'Maximum Quality Ensemble',
  type: 'ensemble',
  models: [
    { provider: 'openai', model: 'gpt-4o', temperature: 0.3 },
    { provider: 'anthropic', model: 'claude-3-5-sonnet', temperature: 0.3 }
  ],
  judge: {
    provider: 'anthropic',
    model: 'claude-3-5-sonnet',
    criteria: ['depth', 'reasoning', 'accuracy', 'safety'],
    outputFormat: 'scores'
  },
  conditioning: {
    useConfidence: true,
    toneAdjustment: 'strengthen'
  }
}
```

**Use Cases**: Research, analysis, critical business decisions

---

### 4. Multi-Judge Workflow (multi-judge-5)

**Purpose**: Use multiple judges to eliminate bias

```typescript
{
  id: 'multi-judge-5',
  name: 'Multi-Judge Consensus',
  type: 'ensemble',
  models: [
    { provider: 'openai', model: 'gpt-4o' },
    { provider: 'anthropic', model: 'claude-3-5-sonnet' },
    { provider: 'google-ai', model: 'gemini-2.5-pro' }
  ],
  judges: [  // Multiple judges
    { provider: 'openai', model: 'gpt-4o', criteria: ['accuracy'] },
    { provider: 'anthropic', model: 'claude-3-5-sonnet', criteria: ['safety'] }
  ],
  conditioning: {
    useConfidence: true,
    toneAdjustment: 'neutral'
  }
}
```

**Use Cases**: Bias-sensitive applications, fairness requirements

---

##  Performance Characteristics

### Expected Latency

| Workflow Type | Models | Judge | Expected Latency | Cost Multiplier |
| ------------- | ------ | ----- | ---------------- | --------------- |
| Consensus-3   | 3      | 1     | 3-5 seconds      | 4x              |
| Fast-Fallback | 1-2    | 0     | 1-3 seconds      | 1-2x            |
| Quality-Max   | 2      | 1     | 3-4 seconds      | 3x              |
| Multi-Judge-5 | 3      | 2     | 4-6 seconds      | 5x              |

### Optimization Strategies

1. **Parallel Execution**: All ensemble models run concurrently
2. **Timeout Controls**: Per-model timeout prevents hanging
3. **Early Termination**: Optional "first N responses" mode
4. **Model Selection**: Lightweight models for speed, powerful for quality
5. **Concurrency Control**: p-limit for controlled parallel execution

---

##  Security & Safety

### Input Validation

- Validate workflow configs before execution
- Sanitize user inputs before passing to models
- Enforce token limits per model
- Validate judge output schemas

### Cost Controls

- Pre-execution cost estimation
- Per-workflow budget limits
- Cost tracking and alerting
- Rate limiting on workflow execution

### Error Handling

- Graceful degradation on partial failures
- Retry logic with exponential backoff
- Detailed error logging and metrics
- Fallback to single-model execution

---

##  Observability

### Metrics to Track

1. **Execution Metrics**
   - Total workflow execution time
   - Per-model response time
   - Judge scoring time
   - Ensemble success rate

2. **Quality Metrics**
   - Judge scores distribution
   - Consensus levels
   - Confidence scores
   - Response variation

3. **Cost Metrics**
   - Total tokens used
   - Cost per workflow
   - Cost breakdown by model
   - Budget utilization

4. **Error Metrics**
   - Model failure rate
   - Timeout frequency
   - Validation errors
   - Retry attempts

### Logging

- Structured JSON logs for all workflow executions
- Debug mode for detailed execution traces
- Performance profiling for optimization
- Audit trail for compliance

---

##  API Design

### Public API

```typescript
// Import from main package

  registerWorkflow,
  listWorkflows,
  getWorkflow,
} from "@juspay/neurolink/workflow";

// Initialize
const neuro = new NeuroLink();

// Execute built-in workflow (TESTING PHASE)
const result = await neuro.generate({
  workflowConfig: {
    workflowId: "consensus-3",
    timeout: 30000,
  },
  input: { text: "Explain machine learning" },
});

// Result contains original response + evaluation metrics
console.log(result.content); // Original best response (unchanged)
console.log(result.score); // 87 (out of 100)
console.log(result.reasoning); // "Clear and accurate explanation"

// Detailed metrics for AB testing
console.log(result.ensembleResponses); // All 3 model responses
console.log(result.judgeScores); // Individual scores
console.log(result.confidence); // 0.87
console.log(result.totalTime); // 3200ms

// Register custom workflow using standalone function
registerWorkflow({
  id: "custom-workflow",
  name: "My Custom Workflow",
  type: "ensemble",
  models: [
    { provider: "openai", model: "gpt-4o" },
    { provider: "anthropic", model: "claude-3-5-sonnet" },
  ],
});

// Execute custom workflow
const customResult = await neuro.generate({
  workflowConfig: { workflowId: "custom-workflow" },
  input: { text: "Custom query" },
});

// List available workflows (standalone function)
const workflows = listWorkflows();

// Get workflow details (standalone function)
const workflowConfig = getWorkflow("consensus-3");
```

---

##  Success Criteria

### Phase 1 (MVP)

- ✅ Support 3+ ensemble models running in parallel
- ✅ Implement judge-based scoring with structured output
- ✅ Response conditioning with confidence-based tone adjustment
- ✅ 3 built-in workflows (consensus, fallback, quality-max)
- ✅ Custom workflow registration API
- ✅ Comprehensive analytics and metrics
- ✅ Full TypeScript type safety
- ✅ Integration tests with real providers

### Performance Targets

- **Latency**: \95% workflow completion
- **Cost Accuracy**: ±5% cost estimation accuracy
- **Error Recovery**: Handle 2/3 model failures gracefully

### Documentation

- High-Level Design (this document)
- Low-Level Design with implementation details
- API Reference documentation
- Tutorial with 5+ examples
- Migration guide for existing users

---

##  Future Enhancements (Post-MVP)

### Phase 2: Streaming & Advanced Patterns

- **Streaming Workflows**: Progressive results with `streamWorkflow()`
- **Workflow State Management**: Persistent workflow state
- **Async Workflows**: Background execution with callbacks
- **Workflow Chaining**: Connect workflows in pipelines

### Phase 3: Enterprise Features

- **DAG-based Workflows**: Complex multi-stage orchestration
- **Human-in-the-Loop**: Manual approval/judging steps
- **Workflow Versioning**: Manage workflow evolution
- **A/B Testing**: Compare workflow performance
- **Workflow Marketplace**: Share and discover workflows

### Phase 4: Advanced Intelligence

- **Adaptive Workflows**: Auto-select models based on query
- **Self-Improving Workflows**: Learn from past executions
- **Cost Optimization**: Auto-route to cheapest viable models
- **Quality Prediction**: Predict confidence before execution

---

##  References

### Internal Documentation

- [Factory Pattern Architecture](/docs/development/factory-architecture)
- [MCP Foundation](/docs/mcp/overview)
- [Configuration Management](/docs/deployment/configuration)
- [API Reference](/docs/sdk/api-reference)

### External Resources

- [Vercel AI SDK Documentation](https://sdk.vercel.ai/docs)
- [Ensemble Methods in ML](https://en.wikipedia.org/wiki/Ensemble_learning)
- [LLM Judge Patterns](https://arxiv.org/abs/2306.05685)

---

##  Appendix

### Glossary

- **Ensemble**: Running multiple models in parallel for the same input
- **Judge Model**: AI model that evaluates and scores responses
- **Conditioning**: Post-processing response based on metadata/confidence
- **Workflow**: Declarative configuration of ensemble + judge + conditioning
- **Consensus**: Agreement level between ensemble models
- **Confidence**: Calculated metric representing response reliability

### Assumptions

1. All providers support concurrent requests
2. Judge models support structured output (Zod schemas)
3. Sufficient API rate limits for parallel execution
4. Network latency is manageable (\<1s per model)

### Constraints

1. Maximum 10 models per ensemble (performance/cost)
2. Maximum 3 judges per workflow (complexity)
3. Minimum 2 models for meaningful ensemble
4. Judge model must differ from ensemble models (bias prevention)

---

**Document Status**: ✅ Approved for Implementation
**Next Step**: Low-Level Design (LLD) document

---

## Workflow Engine - Low-Level Design

<!-- Source: WORKFLOW-ENGINE-LLD.md -->

# Neurolink Workflow Engine - Low-Level Design (LLD)

**Version**: 1.0
**Date**: November 28, 2025
**Status**: Implementation Complete
**Author**: Neurolink Team

## ️ File Structure

```text
src/
├── lib/
│   ├── workflow/
│   │   ├── index.ts                        # Public API exports (60 lines)
│   │   ├── types.ts                        # Type definitions (250 lines)
│   │   ├── config.ts                       # Configuration schemas (150 lines)
│   │   │
│   │   ├── core/
│   │   │   ├── workflowRunner.ts          # Main orchestrator (400 lines)
│   │   │   ├── workflowRegistry.ts        # Workflow management (200 lines)
│   │   │   ├── ensembleExecutor.ts        # Parallel execution (300 lines)
│   │   │   ├── judgeScorer.ts             # Judge scoring logic (350 lines)
│   │   │   └── responseConditioner.ts     # Response conditioning (200 lines)
│   │   │
│   │   ├── workflows/                      # Built-in workflows (800 lines total)
│   │   │   ├── consensusWorkflow.ts       # Consensus pattern (200 lines)
│   │   │   ├── fallbackWorkflow.ts        # Fallback chain (150 lines)
│   │   │   ├── multiJudgeWorkflow.ts      # Multi-judge voting (250 lines)
│   │   │   └── adaptiveWorkflow.ts        # Adaptive selection (200 lines)
│   │   │
│   │   └── utils/
│   │       ├── workflowValidation.ts      # Validation utilities (250 lines)
│   │       └── workflowMetrics.ts         # Metrics tracking (150 lines)
│   │
│   ├── neurolink.ts                        # MODIFY: Add workflow methods (20 lines)
│   └── index.ts                            # MODIFY: Export workflow types (10 lines)
```

**Total Estimated Lines**: ~3,000 lines

---

##  Module Specifications

---

## 1. Types Module (`workflow/types.ts`)

### Core Type Definitions

```typescript
/**
 * workflow/types.ts
 * Core type definitions for the Workflow Engine
 */

  AIProviderName,
  AnalyticsData,
  EvaluationData,
} from "../lib/core/types.js";

// ============================================================================
// WORKFLOW CONFIGURATION TYPES
// ============================================================================

/**
 * Workflow type enumeration
 */
export type WorkflowType = "ensemble" | "chain" | "adaptive" | "custom";

/**
 * Judge output format
 */
export type JudgeOutputFormat = "scores" | "ranking" | "best" | "detailed";

/**
 * Tone adjustment strategy
 */
export type ToneAdjustment = "soften" | "strengthen" | "neutral";

/**
 * Complete workflow configuration
 */
export type WorkflowConfig = {
  // Identification
  id: string;
  name: string;
  description?: string;
  version?: string;

  // Workflow definition
  type: WorkflowType;
  models: ModelConfig[];

  // Optional components
  judge?: JudgeConfig;
  judges?: JudgeConfig[]; // For multi-judge workflows
  conditioning?: ConditioningConfig;
  execution?: ExecutionConfig;

  // Metadata
  tags?: string[];
  metadata?: Record;
  createdAt?: string;
  updatedAt?: string;
};

/**
 * Model configuration for ensemble
 */
export type ModelConfig = {
  // Required fields
  provider: AIProviderName;
  model: string;

  // Optional tuning
  weight?: number; // For weighted voting (0-1)
  temperature?: number; // Model temperature (0-2)
  maxTokens?: number; // Max output tokens
  systemPrompt?: string; // Custom system prompt
  timeout?: number; // Per-model timeout (ms)

  // Advanced options
  topP?: number;
  topK?: number;
  presencePenalty?: number;
  frequencyPenalty?: number;

  // Metadata
  label?: string; // Human-readable label
  metadata?: Record;
};

/**
 * Judge model configuration
 */
export type JudgeConfig = {
  // Required fields
  provider: AIProviderName;
  model: string;
  criteria: string[]; // Evaluation criteria
  outputFormat: JudgeOutputFormat;

  // Optional configuration
  systemPrompt?: string; // Custom judge prompt
  temperature?: number; // Judge temperature (usually low)
  maxTokens?: number; // Max judge output
  timeout?: number; // Judge timeout (ms)

  // Advanced options
  blindEvaluation?: boolean; // Hide provider names
  includeReasoning: boolean; // REQUIRED: Always include short explanation
  scoreScale: {
    // Fixed 0-100 scale for testing phase
    min: 0;
    max: 100;
  };

  // Metadata
  label?: string;
  metadata?: Record;
};

/**
 * Response conditioning configuration
 */
export type ConditioningConfig = {
  // Confidence-based conditioning
  useConfidence: boolean;
  confidenceThresholds?: {
    high: number; // Default: 0.8
    medium: number; // Default: 0.5
    low: number; // Default: 0.3
  };

  // Tone adjustment
  toneAdjustment?: ToneAdjustment;

  // Metadata injection
  includeMetadata?: boolean;
  metadataFields?: string[]; // Which fields to include

  // Response formatting
  addConfidenceStatement?: boolean;
  addModelAttribution?: boolean;
  addExecutionTime?: boolean;

  // Custom metadata
  metadata?: Record;
};

/**
 * Workflow execution configuration
 */
export type ExecutionConfig = {
  // Timeout settings
  timeout?: number; // Total workflow timeout (ms)
  modelTimeout?: number; // Per-model timeout (ms)
  judgeTimeout?: number; // Judge timeout (ms)

  // Retry settings
  retries?: number; // Max retries on failure
  retryDelay?: number; // Delay between retries (ms)
  retryableErrors?: string[]; // Error codes to retry

  // Optimization
  parallelism?: number; // Max parallel models
  earlyTermination?: boolean; // Stop after N responses
  minResponses?: number; // Minimum required responses

  // Cost controls
  maxCost?: number; // Max cost per execution
  costThreshold?: number; // Warn at cost threshold

  // Monitoring
  enableMetrics?: boolean;
  enableTracing?: boolean;

  // Metadata
  metadata?: Record;
};

// ============================================================================
// WORKFLOW INPUT/OUTPUT TYPES
// ============================================================================

/**
 * Input for workflow execution
 */
export type WorkflowInput = {
  text: string;
  context?: Record;
  metadata?: Record;
};

/**
 * Options for workflow execution
 */
export type WorkflowGenerateOptions = {
  // Required
  workflowId: string;
  input: WorkflowInput;

  // Optional overrides
  overrides?: Partial;
  timeout?: number | string;

  // Additional options
  enableAnalytics?: boolean;
  enableEvaluation?: boolean;
  context?: Record;
};

/**
 * Complete workflow execution result
 * NOTE: For testing phase - returns original content unchanged with evaluation metrics
 */
export type WorkflowResult = {
  // Primary output (ORIGINAL, UNMODIFIED)
  content: string;

  // Evaluation metrics (for AB testing)
  score: number; // Judge score (0-100)
  reasoning: string; // Short summary of why this score

  // Ensemble data
  ensembleResponses: EnsembleResponse[];

  // Judge data (if used)
  judgeScores?: JudgeScores;
  selectedResponse?: EnsembleResponse;

  // Quality metrics
  confidence: number; // Overall confidence (0-1)
  consensus?: number; // Agreement level (0-1)

  // Performance metrics
  totalTime: number; // Total execution time (ms)
  ensembleTime: number; // Ensemble phase time (ms)
  judgeTime?: number; // Judge phase time (ms)
  conditioningTime?: number; // Conditioning time (ms)

  // Workflow metadata
  workflow: string; // Workflow ID
  workflowName: string; // Workflow name
  workflowVersion?: string; // Workflow version

  // Resource usage
  usage?: AggregatedUsage;
  cost?: number; // Total estimated cost

  // Analytics and evaluation
  analytics?: WorkflowAnalytics;
  evaluation?: EvaluationData;

  // Additional metadata
  metadata?: Record;
  timestamp: string;
};

/**
 * Single ensemble model response
 */
export type EnsembleResponse = {
  // Model identification
  provider: string;
  model: string;
  modelLabel?: string;

  // Response content
  content: string;

  // Performance metrics
  responseTime: number; // Response time (ms)

  // Resource usage
  usage?: {
    inputTokens: number;
    outputTokens: number;
    totalTokens: number;
  };

  // Status
  status: "success" | "failure" | "timeout" | "partial";
  error?: string;

  // Metadata
  metadata?: Record;
  timestamp: string;
};

/**
 * Judge scoring results
 * NOTE: Scores are 0-100 for standardized evaluation
 */
export type JudgeScores = {
  // Judge identification
  judgeProvider: string;
  judgeModel: string;

  // Scoring results (0-100 scale)
  scores: Record; // { "response-0": 85, "response-1": 92 }
  ranking?: string[]; // Ordered list of response IDs
  bestResponse?: string; // ID of best response

  // Evaluation details
  criteria: string[];
  reasoning?: string;
  confidenceInJudgment?: number;

  // Performance
  judgeTime: number; // Judge execution time (ms)

  // Metadata
  metadata?: Record;
  timestamp: string;
};

/**
 * Multi-judge voting results
 */
export type MultiJudgeScores = {
  // Individual judge results
  judges: JudgeScores[];

  // Aggregated results
  averageScores: Record;
  aggregatedRanking: string[];
  consensusLevel: number; // Agreement between judges (0-1)

  // Final selection
  bestResponse: string;
  confidence: number;

  // Metadata
  votingStrategy: "average" | "median" | "majority";
  metadata?: Record;
};

/**
 * Aggregated token usage across all models
 */
export type AggregatedUsage = {
  totalInputTokens: number;
  totalOutputTokens: number;
  totalTokens: number;

  // Per-model breakdown
  byModel: Array;

  // Judge usage (if applicable)
  judgeUsage?: {
    inputTokens: number;
    outputTokens: number;
    totalTokens: number;
    cost?: number;
  };
};

/**
 * Workflow-specific analytics
 */
export type WorkflowAnalytics = AnalyticsData & {
  // Workflow-specific metrics
  workflowId: string;
  workflowType: WorkflowType;

  // Ensemble metrics
  modelsExecuted: number;
  modelsSuccessful: number;
  modelsFailed: number;

  // Quality metrics
  averageConfidence: number;
  consensusLevel?: number;

  // Performance distribution
  modelResponseTimes: Record;
  fastestModel?: string;
  slowestModel?: string;

  // Cost breakdown
  totalCost: number;
  costByModel: Record;
  costEfficiency?: number; // Quality per dollar
};

// ============================================================================
// VALIDATION & ERROR TYPES
// ============================================================================

/**
 * Workflow validation result
 */
export type WorkflowValidationResult = {
  valid: boolean;
  errors: WorkflowValidationError[];
  warnings: WorkflowValidationWarning[];
};

/**
 * Validation error
 */
export type WorkflowValidationError = {
  field: string;
  message: string;
  code: string;
  severity: "error" | "critical";
};

/**
 * Validation warning
 */
export type WorkflowValidationWarning = {
  field: string;
  message: string;
  code: string;
  recommendation?: string;
};

/**
 * Workflow execution error
 */
export type WorkflowError = Error & {
  code: string;
  workflowId: string;
  phase: "ensemble" | "judge" | "conditioning" | "validation";
  details?: Record;
  retryable: boolean;
};
```

---

## 2. Configuration Module (`workflow/config.ts`)

### Configuration Schemas & Defaults

```typescript
/**
 * workflow/config.ts
 * Configuration schemas, validation, and defaults
 */

  WorkflowConfig,
  ModelConfig,
  JudgeConfig,
  ConditioningConfig,
  ExecutionConfig,
} from "./types.js";

// ============================================================================
// ZOD VALIDATION SCHEMAS
// ============================================================================

/**
 * Model configuration schema
 */
export const ModelConfigSchema = z.object({
  provider: z.string().min(1),
  model: z.string().min(1),
  weight: z.number().min(0).max(1).optional(),
  temperature: z.number().min(0).max(2).optional(),
  maxTokens: z.number().int().positive().optional(),
  systemPrompt: z.string().optional(),
  timeout: z.number().int().positive().optional(),
  topP: z.number().min(0).max(1).optional(),
  topK: z.number().int().positive().optional(),
  presencePenalty: z.number().min(-2).max(2).optional(),
  frequencyPenalty: z.number().min(-2).max(2).optional(),
  label: z.string().optional(),
  metadata: z.record(z.unknown()).optional(),
});

/**
 * Judge configuration schema
 */
export const JudgeConfigSchema = z.object({
  provider: z.string().min(1),
  model: z.string().min(1),
  criteria: z.array(z.string()).min(1),
  outputFormat: z.enum(["scores", "ranking", "best", "detailed"]),
  systemPrompt: z.string().optional(),
  temperature: z.number().min(0).max(2).optional(),
  maxTokens: z.number().int().positive().optional(),
  timeout: z.number().int().positive().optional(),
  blindEvaluation: z.boolean().optional(),
  includeReasoning: z.boolean().optional(),
  scoreScale: z
    .object({
      min: z.number(),
      max: z.number(),
    })
    .optional(),
  label: z.string().optional(),
  metadata: z.record(z.unknown()).optional(),
});

/**
 * Conditioning configuration schema
 */
export const ConditioningConfigSchema = z.object({
  useConfidence: z.boolean(),
  confidenceThresholds: z
    .object({
      high: z.number().min(0).max(1),
      medium: z.number().min(0).max(1),
      low: z.number().min(0).max(1),
    })
    .optional(),
  toneAdjustment: z.enum(["soften", "strengthen", "neutral"]).optional(),
  includeMetadata: z.boolean().optional(),
  metadataFields: z.array(z.string()).optional(),
  addConfidenceStatement: z.boolean().optional(),
  addModelAttribution: z.boolean().optional(),
  addExecutionTime: z.boolean().optional(),
  metadata: z.record(z.unknown()).optional(),
});

/**
 * Execution configuration schema
 */
export const ExecutionConfigSchema = z.object({
  timeout: z.number().int().positive().optional(),
  modelTimeout: z.number().int().positive().optional(),
  judgeTimeout: z.number().int().positive().optional(),
  retries: z.number().int().min(0).max(5).optional(),
  retryDelay: z.number().int().positive().optional(),
  retryableErrors: z.array(z.string()).optional(),
  parallelism: z.number().int().positive().optional(),
  earlyTermination: z.boolean().optional(),
  minResponses: z.number().int().positive().optional(),
  maxCost: z.number().positive().optional(),
  costThreshold: z.number().positive().optional(),
  enableMetrics: z.boolean().optional(),
  enableTracing: z.boolean().optional(),
  metadata: z.record(z.unknown()).optional(),
});

/**
 * Complete workflow configuration schema
 */
export const WorkflowConfigSchema = z
  .object({
    id: z.string().min(1),
    name: z.string().min(1),
    description: z.string().optional(),
    version: z.string().optional(),
    type: z.enum(["ensemble", "chain", "adaptive", "custom"]),
    models: z.array(ModelConfigSchema).min(1),
    judge: JudgeConfigSchema.optional(),
    judges: z.array(JudgeConfigSchema).optional(),
    conditioning: ConditioningConfigSchema.optional(),
    execution: ExecutionConfigSchema.optional(),
    tags: z.array(z.string()).optional(),
    metadata: z.record(z.unknown()).optional(),
    createdAt: z.string().optional(),
    updatedAt: z.string().optional(),
  })
  .refine(
    (data) => {
      // Cannot have both judge and judges
      if (data.judge && data.judges) {
        return false;
      }
      // Ensemble and adaptive need at least 2 models
      if (
        (data.type === "ensemble" || data.type === "adaptive") &&
        data.models.length  &
    Pick,
): WorkflowConfig {
  const base: WorkflowConfig = {
    id: partial.id,
    name: partial.name,
    type: partial.type,
    models: partial.models,
    ...partial,
  };

  return mergeWithDefaults(base);
}
```

---

## 3. Workflow Runner (`workflow/core/workflowRunner.ts`)

### Main Orchestrator Implementation

```typescript
/**
 * workflow/core/workflowRunner.ts
 * Main workflow orchestrator - coordinates ensemble, judge, and conditioning
 */

  WorkflowConfig,
  WorkflowInput,
  WorkflowResult,
  WorkflowGenerateOptions,
  EnsembleResponse,
  JudgeScores,
  MultiJudgeScores,
  AggregatedUsage,
  WorkflowAnalytics,
} from "../types.js";

/**
 * Main workflow execution orchestrator
 */
export class WorkflowRunner {
  private ensembleExecutor: EnsembleExecutor;
  private judgeScorer: JudgeScorer;
  private responseConditioner: ResponseConditioner;
  private metrics: WorkflowMetrics;

  constructor() {
    this.ensembleExecutor = new EnsembleExecutor();
    this.judgeScorer = new JudgeScorer();
    this.responseConditioner = new ResponseConditioner();
    this.metrics = new WorkflowMetrics();
  }

  /**
   * Execute workflow end-to-end
   */
  async execute(
    config: WorkflowConfig,
    options: WorkflowGenerateOptions,
  ): Promise {
    const functionTag = "WorkflowRunner.execute";
    const startTime = Date.now();

    logger.info(`[${functionTag}] Starting workflow execution`, {
      workflowId: config.id,
      workflowType: config.type,
      models: config.models.length,
    });

    try {
      // Phase 1: Execute ensemble
      const ensembleStart = Date.now();
      const ensembleResponses = await this.executeEnsemblePhase(
        config,
        options.input,
      );
      const ensembleTime = Date.now() - ensembleStart;

      logger.debug(`[${functionTag}] Ensemble phase complete`, {
        responses: ensembleResponses.length,
        successful: ensembleResponses.filter((r) => r.status === "success")
          .length,
        time: ensembleTime,
      });

      // Phase 2: Judge scoring (optional)
      let judgeScores: JudgeScores | MultiJudgeScores | undefined;
      let judgeTime = 0;

      if (config.judge || config.judges) {
        const judgeStart = Date.now();
        judgeScores = await this.executeJudgePhase(config, ensembleResponses);
        judgeTime = Date.now() - judgeStart;

        logger.debug(`[${functionTag}] Judge phase complete`, {
          judgeTime,
          bestResponse: judgeScores.bestResponse,
        });
      }

      // Phase 3: Extract score and reasoning (NO CONDITIONING in testing phase)
      const { score, reasoning } = this.extractScoreAndReasoning(judgeScores);

      // Use original best response content (UNCHANGED)
      const selectedResponse = this.selectBestResponse(
        ensembleResponses,
        judgeScores,
      );
      const finalContent = selectedResponse.content;

      // Calculate final metrics
      const totalTime = Date.now() - startTime;
      const usage = this.aggregateUsage(ensembleResponses, judgeScores);
      const analytics = this.createAnalytics(
        config,
        ensembleResponses,
        judgeScores,
        totalTime,
      );

      // Build complete result (TESTING PHASE: original content + evaluation)
      const result: WorkflowResult = {
        content: finalContent, // ORIGINAL, UNMODIFIED
        score, // 0-100
        reasoning, // Short summary
        ensembleResponses,
        judgeScores,
        selectedResponse,
        confidence: this.calculateConfidence(ensembleResponses, judgeScores),
        consensus: this.calculateConsensus(ensembleResponses),
        totalTime,
        ensembleTime,
        judgeTime: judgeTime > 0 ? judgeTime : undefined,
        workflow: config.id,
        workflowName: config.name,
        workflowVersion: config.version,
        usage,
        cost: this.calculateTotalCost(usage),
        analytics,
        metadata: {
          ...config.metadata,
        },
        timestamp: new Date().toISOString(),
      };

      // Comprehensive logging for AB testing evaluation
      logger.info(`[${functionTag}] Workflow execution complete`, {
        workflowId: config.id,
        workflowType: config.type,
        totalTime,
        ensembleTime,
        judgeTime,
        score: result.score,
        reasoning: result.reasoning,
        confidence: result.confidence,
        consensus: result.consensus,
        modelsExecuted: ensembleResponses.length,
        modelsSuccessful: ensembleResponses.filter(
          (r) => r.status === "success",
        ).length,
        selectedModel: `${selectedResponse.provider}/${selectedResponse.model}`,
        allScores: judgeScores?.scores,
        timestamp: result.timestamp,
      });

      // Record metrics
      this.metrics.recordExecution(config.id, result);

      return result;
    } catch (error) {
      logger.error(`[${functionTag}] Workflow execution failed`, {
        workflowId: config.id,
        error: error instanceof Error ? error.message : String(error),
      });

      throw new WorkflowError(
        `Workflow execution failed: ${error instanceof Error ? error.message : String(error)}`,
        {
          code: "WORKFLOW_EXECUTION_FAILED",
          workflowId: config.id,
          phase: "execution",
          retryable: true,
        },
      );
    }
  }

  /**
   * Execute ensemble phase
   */
  private async executeEnsemblePhase(
    config: WorkflowConfig,
    input: WorkflowInput,
  ): Promise {
    const functionTag = "WorkflowRunner.executeEnsemblePhase";

    try {
      const responses = await this.ensembleExecutor.execute(
        config.models,
        input,
        config.execution,
      );

      // Validate minimum responses
      const successfulResponses = responses.filter(
        (r) => r.status === "success",
      );
      const minResponses = config.execution?.minResponses || 1;

      if (successfulResponses.length  {
    const functionTag = "WorkflowRunner.executeJudgePhase";

    try {
      // Filter successful responses only
      const validResponses = responses.filter((r) => r.status === "success");

      if (validResponses.length === 0) {
        throw new Error("No valid responses to judge");
      }

      // Multi-judge workflow
      if (config.judges && config.judges.length > 0) {
        return await this.judgeScorer.scoreMultiJudge(
          validResponses,
          config.judges,
          config.execution,
        );
      }

      // Single judge workflow
      if (config.judge) {
        return await this.judgeScorer.score(
          validResponses,
          config.judge,
          config.execution,
        );
      }

      throw new Error("No judge configuration provided");
    } catch (error) {
      logger.error(`[${functionTag}] Judge scoring failed`, { error });
      throw error;
    }
  }

  /**
   * Extract score and reasoning from judge results
   * NOTE: Testing phase - no response modification
   */
  private extractScoreAndReasoning(
    judgeScores?: JudgeScores | MultiJudgeScores,
  ): { score: number; reasoning: string } {
    if (!judgeScores) {
      return { score: 0, reasoning: "No judge scoring performed" };
    }

    // Get best response score (0-100)
    const bestResponseId = judgeScores.bestResponse || "response-0";
    const score = judgeScores.scores[bestResponseId] || 0;

    // Get reasoning (keep it short)
    const reasoning = judgeScores.reasoning
      ? judgeScores.reasoning.slice(0, 200) // Max 200 chars for summary
      : "Score assigned by judge";

    return { score, reasoning };
  }

  /**
   * Select best response based on judge scores or fallback
   */
  private selectBestResponse(
    responses: EnsembleResponse[],
    judgeScores?: JudgeScores | MultiJudgeScores,
  ): EnsembleResponse {
    // Use judge selection if available
    if (judgeScores?.bestResponse) {
      const index = parseInt(judgeScores.bestResponse.replace("response-", ""));
      return responses[index];
    }

    // Fallback: first successful response
    const successful = responses.find((r) => r.status === "success");
    if (successful) {
      return successful;
    }

    // Fallback: first response (even if failed)
    return responses[0];
  }

  /**
   * Calculate confidence score
   */
  private calculateConfidence(
    responses: EnsembleResponse[],
    judgeScores?: JudgeScores | MultiJudgeScores,
  ): number {
    // If judge provided confidence
    if (
      judgeScores &&
      "confidenceInJudgment" in judgeScores &&
      judgeScores.confidenceInJudgment
    ) {
      return judgeScores.confidenceInJudgment;
    }

    // Calculate from judge scores
    if (judgeScores && "scores" in judgeScores) {
      const scores = Object.values(judgeScores.scores);
      const maxScore = Math.max(...scores);
      const avgScore = scores.reduce((a, b) => a + b, 0) / scores.length;

      // Normalize to 0-1
      const scoreRange = 10; // Assuming 0-10 scale
      return (maxScore / scoreRange + avgScore / scoreRange) / 2;
    }

    // Fallback: based on success rate
    const successCount = responses.filter((r) => r.status === "success").length;
    return successCount / responses.length;
  }

  /**
   * Calculate consensus level
   */
  private calculateConsensus(responses: EnsembleResponse[]): number {
    const successful = responses.filter((r) => r.status === "success");
    if (successful.length  r.content.length);
    const avgLength = lengths.reduce((a, b) => a + b, 0) / lengths.length;
    const variance =
      lengths.reduce((sum, len) => sum + Math.pow(len - avgLength, 2), 0) /
      lengths.length;
    const stdDev = Math.sqrt(variance);

    // Normalize to 0-1 (lower std dev = higher consensus)
    return Math.max(0, 1 - stdDev / avgLength);
  }

  /**
   * Aggregate token usage
   */
  private aggregateUsage(
    responses: EnsembleResponse[],
    judgeScores?: JudgeScores | MultiJudgeScores,
  ): AggregatedUsage {
    const byModel = responses
      .filter((r) => r.usage)
      .map((r) => ({
        provider: r.provider,
        model: r.model,
        inputTokens: r.usage!.inputTokens,
        outputTokens: r.usage!.outputTokens,
        totalTokens: r.usage!.totalTokens,
      }));

    const totalInputTokens = byModel.reduce((sum, m) => sum + m.inputTokens, 0);
    const totalOutputTokens = byModel.reduce(
      (sum, m) => sum + m.outputTokens,
      0,
    );

    return {
      totalInputTokens,
      totalOutputTokens,
      totalTokens: totalInputTokens + totalOutputTokens,
      byModel,
    };
  }

  /**
   * Create workflow analytics
   */
  private createAnalytics(
    config: WorkflowConfig,
    responses: EnsembleResponse[],
    judgeScores: JudgeScores | MultiJudgeScores | undefined,
    totalTime: number,
  ): WorkflowAnalytics {
    const successful = responses.filter((r) => r.status === "success");
    const failed = responses.filter((r) => r.status !== "success");

    const modelResponseTimes: Record = {};
    responses.forEach((r) => {
      modelResponseTimes[`${r.provider}/${r.model}`] = r.responseTime;
    });

    const sortedByTime = [...responses].sort(
      (a, b) => a.responseTime - b.responseTime,
    );

    return {
      workflowId: config.id,
      workflowType: config.type,
      modelsExecuted: responses.length,
      modelsSuccessful: successful.length,
      modelsFailed: failed.length,
      averageConfidence: this.calculateConfidence(responses, judgeScores),
      consensusLevel: this.calculateConsensus(responses),
      modelResponseTimes,
      fastestModel: sortedByTime[0]
        ? `${sortedByTime[0].provider}/${sortedByTime[0].model}`
        : undefined,
      slowestModel: sortedByTime[sortedByTime.length - 1]
        ? `${sortedByTime[sortedByTime.length - 1].provider}/${sortedByTime[sortedByTime.length - 1].model}`
        : undefined,
      totalCost: 0, // Calculated separately
      costByModel: {},
      provider: config.models[0].provider,
      model: config.models[0].model,
      tokens: {
        input: 0,
        output: 0,
        total: 0,
      },
      responseTime: totalTime,
      timestamp: new Date().toISOString(),
    };
  }

  /**
   * Calculate total cost
   */
  private calculateTotalCost(usage: AggregatedUsage): number {
    // TODO: Implement actual cost calculation based on provider pricing
    return usage.totalTokens * 0.00001; // Placeholder
  }
}
```

---

## 4. Ensemble Executor (`workflow/core/ensembleExecutor.ts`)

### Parallel Model Execution

```typescript
/**
 * workflow/core/ensembleExecutor.ts
 * Executes multiple models in parallel for ensemble workflows
 */

  ModelConfig,
  WorkflowInput,
  EnsembleResponse,
  ExecutionConfig,
} from "../types.js";

/**
 * Executes ensemble of models in parallel
 */
export class EnsembleExecutor {
  /**
   * Execute all models in parallel
   */
  async execute(
    models: ModelConfig[],
    input: WorkflowInput,
    execution?: ExecutionConfig,
  ): Promise {
    const functionTag = "EnsembleExecutor.execute";

    logger.debug(`[${functionTag}] Starting ensemble execution`, {
      models: models.length,
      parallelism: execution?.parallelism,
    });

    // Set up concurrency limit
    const limit = pLimit(execution?.parallelism || 10);

    // Execute all models in parallel
    const promises = models.map((modelConfig, index) =>
      limit(() => this.executeModel(modelConfig, input, index, execution)),
    );

    const responses = await Promise.all(promises);

    logger.debug(`[${functionTag}] Ensemble execution complete`, {
      total: responses.length,
      successful: responses.filter((r) => r.status === "success").length,
    });

    return responses;
  }

  /**
   * Execute single model
   */
  private async executeModel(
    modelConfig: ModelConfig,
    input: WorkflowInput,
    index: number,
    execution?: ExecutionConfig,
  ): Promise {
    const functionTag = "EnsembleExecutor.executeModel";
    const startTime = Date.now();

    try {
      logger.debug(`[${functionTag}] Executing model`, {
        provider: modelConfig.provider,
        model: modelConfig.model,
        index,
      });

      // Create provider instance
      const provider = await AIProviderFactory.createProvider(
        modelConfig.provider,
        modelConfig.model,
      );

      // Execute with timeout
      const timeout = modelConfig.timeout || execution?.modelTimeout || 15000;
      const result = await this.executeWithTimeout(
        provider,
        modelConfig,
        input,
        timeout,
      );

      const responseTime = Date.now() - startTime;

      // Build successful response
      const response: EnsembleResponse = {
        provider: modelConfig.provider,
        model: modelConfig.model,
        modelLabel: modelConfig.label,
        content: result.content,
        responseTime,
        usage: result.usage
          ? {
              inputTokens: result.usage.promptTokens || 0,
              outputTokens: result.usage.completionTokens || 0,
              totalTokens: result.usage.totalTokens || 0,
            }
          : undefined,
        status: "success",
        metadata: modelConfig.metadata,
        timestamp: new Date().toISOString(),
      };

      logger.debug(`[${functionTag}] Model execution successful`, {
        provider: modelConfig.provider,
        model: modelConfig.model,
        responseTime,
      });

      return response;
    } catch (error) {
      const responseTime = Date.now() - startTime;

      logger.warn(`[${functionTag}] Model execution failed`, {
        provider: modelConfig.provider,
        model: modelConfig.model,
        error: error instanceof Error ? error.message : String(error),
      });

      // Build error response
      return {
        provider: modelConfig.provider,
        model: modelConfig.model,
        modelLabel: modelConfig.label,
        content: "",
        responseTime,
        status:
          error instanceof Error && error.name === "TimeoutError"
            ? "timeout"
            : "failure",
        error: error instanceof Error ? error.message : String(error),
        metadata: modelConfig.metadata,
        timestamp: new Date().toISOString(),
      };
    }
  }

  /**
   * Execute provider with timeout
   */
  private async executeWithTimeout(
    provider: AIProvider,
    modelConfig: ModelConfig,
    input: WorkflowInput,
    timeout: number,
  ): Promise {
    return Promise.race([
      provider.generate({
        input: { text: input.text },
        systemPrompt: modelConfig.systemPrompt,
        temperature: modelConfig.temperature,
        maxTokens: modelConfig.maxTokens,
      }),
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error("Timeout")), timeout),
      ),
    ]);
  }
}
```

---

_Due to length constraints, I'll continue with the remaining modules in a structured format._

---

## 5. Judge Scorer (`workflow/core/judgeScorer.ts`) - Key Methods

```typescript
export class JudgeScorer {
  async score(
    responses: EnsembleResponse[],
    judgeConfig: JudgeConfig,
    execution?: ExecutionConfig,
  ): Promise;

  async scoreMultiJudge(
    responses: EnsembleResponse[],
    judgeConfigs: JudgeConfig[],
    execution?: ExecutionConfig,
  ): Promise;

  private async executeJudge(
    responses: EnsembleResponse[],
    judgeConfig: JudgeConfig,
  ): Promise;

  private formatPromptForJudge(
    responses: EnsembleResponse[],
    judgeConfig: JudgeConfig,
  ): string;

  private parseJudgeResponse(
    judgeResponse: string,
    outputFormat: JudgeOutputFormat,
  ): JudgeScores;
}
```

**Key Algorithm**: Judge Prompt Generation

```typescript
private formatPromptForJudge(responses, judgeConfig): string {
  const responseTexts = responses.map((r, i) =>
    `Response ${i}: ${judgeConfig.blindEvaluation ? '' : `(${r.provider}/${r.model})`}\n${r.content}`
  ).join('\n\n');

  return `
You are an expert judge evaluating AI responses.

Criteria: ${judgeConfig.criteria.join(', ')}

Responses to evaluate:
${responseTexts}

Please score each response on a scale of ${scoreScale.min}-${scoreScale.max} for each criterion.
Return your evaluation in JSON format:
{
  "scores": { "response-0": 8.5, "response-1": 9.2 },
  "ranking": ["response-1", "response-0"],
  "bestResponse": "response-1",
  "reasoning": "Response 1 demonstrates..."
}
`;
}
```

---

## 6. Response Conditioner (`workflow/core/responseConditioner.ts`) - Key Methods

```typescript
export class ResponseConditioner {
  async condition(
    content: string,
    confidence: number,
    config: ConditioningConfig,
    context?: ConditioningContext,
  ): Promise;

  private adjustTone(
    content: string,
    confidence: number,
    adjustment: ToneAdjustment,
  ): string;

  private addMetadata(
    content: string,
    config: ConditioningConfig,
    context: ConditioningContext,
  ): string;

  private getConfidenceStatement(confidence: number): string;
}
```

**Tone Adjustment Algorithm**:

```typescript
private adjustTone(content: string, confidence: number, adjustment: ToneAdjustment): string {
  const thresholds = config.confidenceThresholds;

  if (confidence >= thresholds.high) {
    // High confidence - assertive tone
    return adjustment === 'strengthen'
      ? `Definitively, ${content}`
      : content;
  } else if (confidence >= thresholds.medium) {
    // Medium confidence - balanced tone
    return adjustment === 'soften'
      ? `Based on the analysis, ${content}`
      : content;
  } else {
    // Low confidence - tentative tone
    return adjustment === 'soften'
      ? `It appears that ${content}. However, this conclusion has lower confidence.`
      : `Note: This response has lower confidence. ${content}`;
  }
}
```

---

## 7. Workflow Registry (`workflow/core/workflowRegistry.ts`) - Key Methods

```typescript
export class WorkflowRegistry {
  private workflows: Map;

  register(config: WorkflowConfig): void;
  unregister(id: string): boolean;
  get(id: string): WorkflowConfig | undefined;
  list(filter?: WorkflowFilter): WorkflowConfig[];
  validate(config: WorkflowConfig): WorkflowValidationResult;
  exists(id: string): boolean;
  update(id: string, updates: Partial): void;
}
```

---

## 8. Integration with NeuroLink Class

### Modifications to `src/lib/neurolink.ts`

```typescript
// Add import at top

  WorkflowConfig,
  WorkflowGenerateOptions,
  WorkflowResult,
} from "./workflow/types.js";

export class NeuroLink {
  // Add workflow runner instance
  private workflowRunner: WorkflowRunner;

  constructor(options?: NeuroLinkOptions) {
    // ... existing code ...
    this.workflowRunner = new WorkflowRunner();
  }

  /**
   * Execute a workflow with ensemble and judge via generate()
   * Workflows are accessed through the workflowConfig option
   */
  async generate(
    options: GenerateOptions & { workflowConfig?: WorkflowConfig },
  ): Promise {
    if (options.workflowConfig) {
      // Workflow execution path
      return await this.workflowRunner.execute(options.workflowConfig, options);
    }
    // ... existing generate logic
  }
}

// Standalone registry functions (not class methods)

  registerWorkflow,
  listWorkflows,
  getWorkflow,
} from "@juspay/neurolink/workflow";

// Register custom workflow
registerWorkflow(config);

// List available workflows
const workflows = listWorkflows();

// Get workflow configuration
const workflow = getWorkflow("consensus-3");
```

---

## 9. Testing Strategy

### Unit Tests

```typescript
// test/workflow/ensembleExecutor.test.ts
describe("EnsembleExecutor", () => {
  test("executes multiple models in parallel", async () => {
    const executor = new EnsembleExecutor();
    const responses = await executor.execute([...models], input);

    expect(responses).toHaveLength(3);
    expect(responses.filter((r) => r.status === "success")).toHaveLength(3);
  });

  test("handles individual model failures gracefully", async () => {
    // Mock one model to fail
    const responses = await executor.execute([...models], input);

    expect(responses).toHaveLength(3);
    expect(responses.filter((r) => r.status === "failure")).toHaveLength(1);
  });

  test("respects timeout configuration", async () => {
    const responses = await executor.execute(
      [{ ...model, timeout: 100 }],
      input,
    );

    expect(responses[0].status).toBe("timeout");
  });
});
```

### Integration Tests

```typescript
// test/workflow/integration/workflow.test.ts
describe("Workflow Integration", () => {
  test("executes consensus workflow end-to-end", async () => {
    const neuro = new NeuroLink();
    const workflowConfig = getWorkflow("consensus-3");

    const result = await neuro.generate({
      workflowConfig,
      input: { text: "Test query" },
    });

    expect(result.content).toBeDefined();
    expect(result.workflow?.ensembleResponses).toHaveLength(3);
    expect(result.workflow?.judgeScores).toBeDefined();
  });
});
```

---

## 10. Error Handling Strategy

### Error Hierarchy

```typescript
// workflow/utils/workflowErrors.ts
export class WorkflowError extends Error {
  constructor(
    message: string,
    public details: {
      code: string;
      workflowId: string;
      phase: "ensemble" | "judge" | "conditioning" | "validation";
      retryable: boolean;
      originalError?: Error;
    },
  ) {
    super(message);
    this.name = "WorkflowError";
  }
}

export class EnsembleExecutionError extends WorkflowError {
  constructor(
    workflowId: string,
    modelErrors: Array,
  ) {
    super("Ensemble execution failed", {
      code: "ENSEMBLE_FAILED",
      workflowId,
      phase: "ensemble",
      retryable: true,
    });
  }
}

export class JudgeScoringError extends WorkflowError {
  constructor(workflowId: string, judgeError: Error) {
    super("Judge scoring failed", {
      code: "JUDGE_FAILED",
      workflowId,
      phase: "judge",
      retryable: true,
      originalError: judgeError,
    });
  }
}
```

### Retry Logic

```typescript
async executeWithRetry(
  config: WorkflowConfig,
  options: WorkflowGenerateOptions
): Promise {
  const maxRetries = config.execution?.retries || 1;
  let lastError: Error;

  for (let attempt = 0; attempt  setTimeout(resolve, delay * (attempt + 1)));
        continue;
      }

      throw error;
    }
  }

  throw lastError!;
}
```

---

## 11. Performance Optimizations

### Parallel Execution Optimization

```typescript
// Use p-limit for controlled parallelism
const limit = pLimit(config.execution?.parallelism || 10);

// Batch model execution
const batches = chunk(models, limit);
const allResponses: EnsembleResponse[] = [];

for (const batch of batches) {
  const batchResponses = await Promise.all(
    batch.map((model) => limit(() => this.executeModel(model, input))),
  );
  allResponses.push(...batchResponses);
}
```

---

## 12. Observability & Monitoring

### Structured Logging

```typescript
logger.info("WorkflowExecution", {
  workflowId: config.id,
  workflowType: config.type,
  phase: "ensemble",
  models: config.models.length,
  duration: Date.now() - startTime,
  success: true,
});
```

### Metrics Collection

```typescript
// workflow/utils/workflowMetrics.ts
export class WorkflowMetrics {
  recordExecution(workflowId: string, result: WorkflowResult): void;
  recordModelLatency(provider: string, model: string, latency: number): void;
  recordJudgeLatency(provider: string, model: string, latency: number): void;
  recordError(workflowId: string, phase: string, error: Error): void;

  getMetrics(workflowId: string): WorkflowMetricsData;
  exportPrometheusMetrics(): string;
}
```

---

## 13. Security Considerations

### Input Validation

````typescript
// Sanitize all user inputs before passing to models
function sanitizeInput(input: string): string {
  // Remove potential prompt injection attempts
  return input
    .replace(/```[^`]*```/g, "") // Remove code blocks
    .replace(/]*>.*?/gi, "") // Remove scripts
    .trim();
}
````

### Cost Controls

```typescript
// Pre-execution cost estimation
async estimateCost(config: WorkflowConfig, input: WorkflowInput): Promise {
  const estimatedTokens = estimateTokenCount(input.text);
  const modelCosts = config.models.map(m =>
    calculateModelCost(m.provider, m.model, estimatedTokens)
  );
  const totalCost = modelCosts.reduce((a, b) => a + b, 0);

  if (config.execution?.maxCost && totalCost > config.execution.maxCost) {
    throw new Error(`Estimated cost ${totalCost} exceeds limit ${config.execution.maxCost}`);
  }

  return totalCost;
}
```

---

## 14. Built-in Workflow Implementations

### Consensus Workflow

```typescript
// workflow/workflows/consensusWorkflow.ts
export const CONSENSUS_3_WORKFLOW: WorkflowConfig = {
  id: "consensus-3",
  name: "Three Model Consensus",
  description:
    "Cross-validate responses across 3 leading models with judge scoring",
  type: "ensemble",
  models: [
    {
      provider: "openai",
      model: "gpt-4o",
      temperature: 0.3,
      label: "OpenAI GPT-4o",
    },
    {
      provider: "anthropic",
      model: "claude-3-5-sonnet",
      temperature: 0.3,
      label: "Anthropic Claude 3.5 Sonnet",
    },
    {
      provider: "google-ai",
      model: "gemini-2.5-flash",
      temperature: 0.3,
      label: "Google Gemini 2.5 Flash",
    },
  ],
  judge: {
    provider: "openai",
    model: "gpt-4o",
    criteria: ["accuracy", "clarity", "completeness", "depth"],
    outputFormat: "detailed",
    temperature: 0.1,
    includeReasoning: true, // REQUIRED for testing
    scoreScale: {
      min: 0,
      max: 100, // Standard 0-100 scale
    },
  },
  conditioning: {
    useConfidence: true,
    toneAdjustment: "neutral",
    addConfidenceStatement: true,
    includeMetadata: false,
  },
  execution: {
    timeout: 30000,
    modelTimeout: 15000,
    judgeTimeout: 10000,
    minResponses: 2,
    enableMetrics: true,
  },
};
```

---

## 15. API Usage Examples

### Basic Usage

```typescript

const neuro = new NeuroLink();

// Use built-in workflow (TESTING PHASE)
const workflowConfig = getWorkflow("consensus-3");
const result = await neuro.generate({
  workflowConfig,
  input: { text: "Explain quantum entanglement" },
});

// Original response (unchanged)
console.log(result.content); // Original AI response

// Workflow metadata (when using workflowConfig)
console.log(result.workflow?.selectedModel); // Selected best model
console.log(result.workflow?.metrics?.totalTime); // Execution time
console.log(result.workflow?.ensembleResponses?.length); // 3
```

### Custom Workflow

```typescript
// Register custom workflow using standalone function
registerWorkflow({
  id: "custom-medical",
  name: "Medical Query Workflow",
  type: "ensemble",
  models: [
    {
      provider: "openai",
      model: "gpt-4o",
      systemPrompt: "You are a medical expert...",
    },
    {
      provider: "anthropic",
      model: "claude-3-5-sonnet",
      systemPrompt: "You are a medical expert...",
    },
  ],
  judge: {
    provider: "openai",
    model: "gpt-4o",
    criteria: ["medical_accuracy", "safety", "clarity"],
    outputFormat: "scores",
  },
});

// Execute custom workflow
const customWorkflow = getWorkflow("custom-medical");
const result = await neuro.generate({
  workflowConfig: customWorkflow,
  input: { text: "What are the symptoms of type 2 diabetes?" },
});
```

---

## 16. Migration Path for Existing Users

### Backward Compatibility

```typescript
// Existing code continues to work
const result = await neuro.generate({
  input: { text: "Hello" },
});

// Workflow feature is enabled via workflowConfig option
const workflowConfig = getWorkflow("consensus-3");
const workflowResult = await neuro.generate({
  workflowConfig,
  input: { text: "Hello" },
});
```

### Gradual Adoption

1. **Phase 1**: Users can try workflows alongside existing methods
2. **Phase 2**: Workflows become recommended for high-stakes queries
3. **Phase 3**: Workflows are default with single-model as fallback

---

## 17. Performance Benchmarks (Expected)

| Workflow      | Models | Judge | Latency (p50) | Latency (p95) | Cost Multiplier |
| ------------- | ------ | ----- | ------------- | ------------- | --------------- |
| consensus-3   | 3      | 1     | 3.2s          | 5.1s          | 4.2x            |
| fast-fallback | 1-2    | 0     | 1.1s          | 2.8s          | 1.3x            |
| quality-max   | 2      | 1     | 3.5s          | 4.9s          | 3.1x            |
| multi-judge-5 | 3      | 2     | 4.8s          | 6.7s          | 5.3x            |

---

## 18. Future Enhancements

### Phase 2: Streaming Support

```typescript
async streamWorkflow(
  options: WorkflowGenerateOptions
): AsyncIterable {
  // Stream ensemble responses as they arrive
  // Update judge scores progressively
  // Condition final response
}
```

### Phase 3: Workflow Chaining

```typescript
const pipeline = neuro.createWorkflowPipeline([
  "quality-check", // First workflow
  "fact-verification", // Second workflow
  "final-polish", // Third workflow
]);

const result = await pipeline.execute({
  input: { text: "Complex query" },
});
```

---

##  Implementation Checklist

- [ ] Create `src/workflow/` directory structure
- [ ] Implement `types.ts` with all interfaces
- [ ] Implement `config.ts` with Zod schemas
- [ ] Implement `ensembleExecutor.ts`
- [ ] Implement `judgeScorer.ts`
- [ ] Implement `responseConditioner.ts`
- [ ] Implement `workflowRegistry.ts`
- [ ] Implement `workflowRunner.ts`
- [ ] Create built-in workflows (consensus, fallback, quality-max)
- [ ] Add methods to `NeuroLink` class
- [ ] Export types from `src/lib/index.ts`
- [ ] Write unit tests (80% coverage target)
- [ ] Write integration tests
- [ ] Add JSDoc documentation
- [ ] Create user guide with examples
- [ ] Add CLI support (optional Phase 2)

---

**Document Status**: ✅ Ready for Implementation
**Next Step**: Code generation upon approval

---

## Domain Configuration Examples for NeuroLink CLI

<!-- Source: cli-domain-examples.md -->

# Domain Configuration Examples for NeuroLink CLI

This document provides comprehensive examples of using domain-specific features with the NeuroLink CLI, showcasing the Phase 1 Factory Infrastructure capabilities.

## Table of Contents

- [Basic Domain Usage](#basic-domain-usage)
- [Healthcare Domain Examples](#healthcare-domain-examples)
- [Analytics Domain Examples](#analytics-domain-examples)
- [Finance Domain Examples](#finance-domain-examples)
- [E-commerce Domain Examples](#e-commerce-domain-examples)
- [Context Integration Examples](#context-integration-examples)
- [Evaluation and Analytics](#evaluation-and-analytics)
- [Provider-Specific Examples](#provider-specific-examples)
- [Streaming with Domains](#streaming-with-domains)
- [Configuration Management](#configuration-management)
- [Advanced Use Cases](#advanced-use-cases)

## Basic Domain Usage

### Simple Domain Generation

```bash
# Basic healthcare domain usage
neurolink generate "Analyze patient symptoms: fever, headache, fatigue" \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --format json

# Basic analytics domain usage
neurolink generate "Calculate quarterly revenue growth trends" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --format json
```

### Domain-Specific Streaming

```bash
# Stream with finance domain evaluation
neurolink stream "Assess investment portfolio risk for retirement planning" \
  --evaluationDomain finance \
  --enable-evaluation

# Stream with ecommerce domain evaluation
neurolink stream "Optimize conversion funnel for online retail store" \
  --evaluationDomain ecommerce \
  --enable-evaluation
```

## Healthcare Domain Examples

### Medical Diagnosis Support

```bash
# Comprehensive symptom analysis
neurolink generate "Patient presents with: chest pain (8/10), shortness of breath, elevated heart rate (110 BPM), diaphoresis. History: hypertension, diabetes. Age 65. Provide differential diagnosis and recommended tests." \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --enable-analytics \
  --provider google-ai \
  --max-tokens 800 \
  --format json
```

### Treatment Planning

```bash
# Treatment recommendation with context
neurolink generate "Develop treatment plan for Type 2 diabetes patient" \
  --context '{"patientAge":55,"comorbidities":["hypertension","obesity"],"allergies":["penicillin"],"currentMedications":["metformin","lisinopril"]}' \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --format json
```

### Medical Research Analysis

```bash
# Clinical trial data analysis
neurolink stream "Analyze clinical trial results for new cardiovascular drug" \
  --context '{"studyType":"randomized-controlled","sampleSize":2000,"primaryEndpoint":"MACE reduction","duration":"24-months"}' \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --enable-analytics \
  --provider anthropic
```

## Analytics Domain Examples

### Business Intelligence

```bash
# Quarterly business analysis
neurolink generate "Analyze Q3 performance metrics and identify growth opportunities" \
  --context '{"revenue":"$2.5M","growth":"15%","customerAcquisition":450,"churnRate":"3.2%","marketSegment":"B2B-SaaS"}' \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --format json \
  --max-tokens 1000
```

### Data Science Insights

```bash
# Machine learning model performance analysis
neurolink generate "Evaluate ML model performance and recommend optimizations" \
  --context '{"modelType":"gradient-boosting","accuracy":0.87,"precision":0.83,"recall":0.91,"f1Score":0.87,"trainingData":"50k-samples","features":42}' \
  --evaluationDomain analytics \
  --enable-evaluation \
  --provider openai \
  --format json
```

### Predictive Analytics

```bash
# Sales forecasting with streaming
neurolink stream "Generate sales forecast for next quarter based on historical trends" \
  --context '{"historicalData":"3-years","seasonality":"high","marketTrends":"positive","competitiveAnalysis":"included"}' \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics
```

## Finance Domain Examples

### Investment Analysis

```bash
# Portfolio risk assessment
neurolink generate "Assess risk profile of diversified investment portfolio" \
  --context '{"assetAllocation":{"stocks":0.60,"bonds":0.30,"alternatives":0.10},"totalValue":"$500k","timeHorizon":"10-years","riskTolerance":"moderate"}' \
  --evaluationDomain finance \
  --enable-evaluation \
  --enable-analytics \
  --format json
```

### Financial Planning

```bash
# Retirement planning analysis
neurolink generate "Create comprehensive retirement savings strategy" \
  --context '{"currentAge":35,"retirementAge":65,"currentSavings":"$75k","annualIncome":"$120k","savingsRate":"15%","expectedReturns":"7%"}' \
  --evaluationDomain finance \
  --enable-evaluation \
  --provider vertex \
  --max-tokens 1200
```

### Market Analysis

```bash
# Economic trend analysis with streaming
neurolink stream "Analyze current market conditions and economic indicators" \
  --context '{"inflationRate":"3.2%","unemploymentRate":"3.8%","fedFundsRate":"5.25%","gdpGrowth":"2.1%","marketVolatility":"elevated"}' \
  --evaluationDomain finance \
  --enable-evaluation
```

## E-commerce Domain Examples

### Conversion Optimization

```bash
# E-commerce funnel analysis
neurolink generate "Optimize checkout process to reduce cart abandonment" \
  --context '{"cartAbandonmentRate":"68%","checkoutSteps":4,"averageLoadTime":"3.2s","mobileUsers":"75%","paymentOptions":["card","paypal","apple-pay"]}' \
  --evaluationDomain ecommerce \
  --enable-evaluation \
  --enable-analytics \
  --format json
```

### Customer Experience

```bash
# Product recommendation strategy
neurolink generate "Develop personalized product recommendation engine" \
  --context '{"userBase":"50k-active","purchaseHistory":"available","browsingData":"tracked","categoryCount":25,"averageOrderValue":"$85"}' \
  --evaluationDomain ecommerce \
  --enable-evaluation \
  --provider google-ai
```

### Marketing Campaign Analysis

```bash
# Campaign performance optimization
neurolink stream "Analyze digital marketing campaign performance and ROI" \
  --context '{"channels":["social","email","ppc","seo"],"budget":"$50k","duration":"3-months","conversions":1250,"cac":"$40","ltv":"$300"}' \
  --evaluationDomain ecommerce \
  --enable-evaluation \
  --enable-analytics
```

## Context Integration Examples

### Complex Organizational Context

```bash
# Enterprise analytics with comprehensive context
neurolink generate "Analyze operational efficiency across multiple departments" \
  --context '{
    "organization": {
      "id": "acme-corp-2024",
      "industry": "technology",
      "size": "mid-market",
      "locations": ["us-east", "eu-west", "apac-south"]
    },
    "departments": {
      "engineering": {"headcount": 120, "budget": "$8M", "kpis": ["velocity", "quality", "innovation"]},
      "sales": {"headcount": 45, "budget": "$2M", "kpis": ["revenue", "pipeline", "conversion"]},
      "marketing": {"headcount": 25, "budget": "$1.5M", "kpis": ["leads", "brand", "engagement"]}
    },
    "timeframe": "Q3-2024",
    "objectives": ["growth", "efficiency", "scalability"]
  }' \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --format json \
  --max-tokens 1500
```

### Multi-Domain Context

```bash
# Healthcare analytics with regulatory context
neurolink generate "Analyze patient outcomes while ensuring HIPAA compliance" \
  --context '{
    "healthcare": {
      "facilityType": "hospital",
      "specialties": ["cardiology", "oncology", "emergency"],
      "patientVolume": "daily-500"
    },
    "compliance": {
      "frameworks": ["HIPAA", "SOX", "FDA"],
      "auditStatus": "current",
      "dataClassification": "sensitive"
    },
    "analytics": {
      "metricsTracked": ["readmission-rates", "patient-satisfaction", "treatment-outcomes"],
      "reportingFrequency": "monthly",
      "stakeholders": ["medical-staff", "administration", "regulators"]
    }
  }' \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --enable-analytics \
  --provider anthropic
```

## Evaluation and Analytics

### Comprehensive Evaluation Setup

```bash
# Full evaluation with custom domain
neurolink generate "Develop AI strategy for enterprise transformation" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --context '{"industry":"manufacturing","aiMaturity":"beginner","budget":"$2M","timeline":"18-months"}' \
  --provider google-ai \
  --format json \
  --max-tokens 2000
```

### Analytics-Only Mode

```bash
# Analytics without evaluation
neurolink generate "Create quarterly performance report" \
  --enable-analytics \
  --context '{"quarter":"Q3","metrics":["revenue","growth","efficiency"],"stakeholders":["executives","board","investors"]}' \
  --format json
```

### Evaluation-Only Mode

```bash
# Evaluation without analytics
neurolink generate "Review software architecture decisions" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --context '{"architecture":"microservices","scale":"enterprise","complexity":"high"}'
```

## Provider-Specific Examples

### OpenAI with Healthcare Domain

```bash
neurolink generate "Analyze drug interaction risks for polypharmacy patient" \
  --provider openai \
  --model gpt-4 \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --context '{"medications":["warfarin","amiodarone","simvastatin"],"age":78,"kidneyFunction":"moderate-impairment"}' \
  --format json
```

### Anthropic with Finance Domain

```bash
neurolink generate "Assess cryptocurrency investment strategy risks" \
  --provider anthropic \
  --model claude-3-5-sonnet-20241022 \
  --evaluationDomain finance \
  --enable-evaluation \
  --enable-analytics \
  --context '{"portfolio":"traditional","riskTolerance":"low","cryptoAllocation":"5%","timeHorizon":"long-term"}' \
  --format json
```

### Google AI with Analytics Domain

```bash
neurolink stream "Optimize supply chain logistics using AI predictions" \
  --provider google-ai \
  --model gemini-2.5-pro \
  --evaluationDomain analytics \
  --enable-evaluation \
  --context '{"supplyChain":"global","products":"electronics","demandVolatility":"high","inventoryTurnover":"quarterly"}'
```

## Streaming with Domains

### Interactive Healthcare Consultation

```bash
# Stream medical case analysis
neurolink stream "Walk through differential diagnosis process for complex case" \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --context '{"setting":"emergency-room","urgency":"high","resources":"full-diagnostic"}' \
  --provider anthropic
```

### Real-time Financial Analysis

```bash
# Stream market analysis
neurolink stream "Provide real-time analysis of market volatility impact" \
  --evaluationDomain finance \
  --enable-evaluation \
  --enable-analytics \
  --context '{"marketConditions":"volatile","portfolio":"balanced","clientRisk":"moderate"}'
```

### Live Business Intelligence

```bash
# Stream business insights
neurolink stream "Generate actionable insights from real-time business metrics" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --context '{"dataSource":"live-dashboard","updateFrequency":"real-time","stakeholder":"c-suite"}'
```

## Configuration Management

### Setting Domain Defaults

```bash
# Configure default domain settings
neurolink config init
# Follow prompts to set:
# - Default Evaluation Domain: analytics
# - Enable Analytics by Default: yes
# - Enable Evaluation by Default: yes
```

### Domain-Specific Configuration

```bash
# Show current domain configuration
neurolink config show

# Export configuration with domain settings
neurolink config export --format json > neurolink-domain-config.json

# Validate domain configuration
neurolink config validate
```

### Custom Domain Setup

```bash
# Initialize with custom domain preferences
neurolink config init
# Select healthcare as default domain
# Configure evaluation criteria: accuracy, safety, compliance, clarity
# Enable diagnostic accuracy tracking
# Enable treatment outcomes tracking
```

## Advanced Use Cases

### Multi-Step Analysis Pipeline

```bash
# Step 1: Initial analysis
neurolink generate "Conduct preliminary market research analysis" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --context '{"market":"fintech","stage":"preliminary","scope":"competitive-landscape"}' \
  --output step1-analysis.json \
  --format json

# Step 2: Deep dive based on initial findings
neurolink generate "Deep dive into identified market opportunities" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --context '{"previousAnalysis":"step1-analysis.json","focus":"opportunity-sizing","methodology":"bottom-up"}' \
  --format json
```

### Cross-Domain Analysis

```bash
# Healthcare + Analytics combined analysis
neurolink generate "Analyze healthcare cost optimization using data analytics" \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --enable-analytics \
  --context '{
    "healthcare": {"costs":"rising","quality":"maintained","patient-satisfaction":"high"},
    "analytics": {"dataAvailable":["claims","outcomes","satisfaction"],"methodology":"predictive-modeling"}
  }' \
  --format json \
  --max-tokens 2000
```

### Compliance-Aware Generation

```bash
# Finance with regulatory compliance
neurolink generate "Develop investment strategy complying with fiduciary standards" \
  --evaluationDomain finance \
  --enable-evaluation \
  --context '{
    "regulatory": {"framework":"DOL-fiduciary","state":"california","clientType":"retirement-plan"},
    "investment": {"universe":"mutual-funds","fees":"low-cost","diversification":"required"}
  }' \
  --format json
```

### Performance-Optimized Commands

```bash
# High-performance analytics processing
neurolink generate "Process large dataset for business insights" \
  --evaluationDomain analytics \
  --enable-analytics \
  --provider vertex \
  --max-tokens 1000 \
  --timeout 180 \
  --context '{"dataSize":"100GB","processing":"distributed","latency":"low","accuracy":"high"}' \
  --format json
```

## Best Practices

### 1. Domain Selection Guidelines

- **Healthcare**: Medical analysis, diagnosis support, treatment planning, regulatory compliance
- **Analytics**: Data analysis, business intelligence, predictive modeling, performance metrics
- **Finance**: Investment analysis, risk assessment, financial planning, market analysis
- **E-commerce**: Conversion optimization, customer experience, marketing campaigns, sales analytics

### 2. Context Structure Best Practices

```bash
# Well-structured context example
neurolink generate "Your analysis request" \
  --context '{
    "domain_specific": {
      "key_metrics": ["metric1", "metric2"],
      "constraints": ["constraint1", "constraint2"]
    },
    "organizational": {
      "size": "enterprise",
      "industry": "technology"
    },
    "temporal": {
      "timeframe": "Q3-2024",
      "urgency": "high"
    }
  }' \
  --evaluationDomain analytics \
  --enable-evaluation
```

### 3. Output Format Selection

- Use `--format json` for structured analysis and integration
- Use `--format text` for human-readable reports
- Use `--format table` for comparative data presentation

### 4. Performance Optimization

- Use `--max-tokens` to control response length
- Enable `--enable-analytics` for detailed performance metrics
- Use appropriate providers for specific domains
- Structure context data efficiently

### 5. Evaluation Best Practices

- Always enable evaluation for critical domain applications
- Use domain-specific evaluation criteria
- Monitor evaluation scores for quality assurance
- Combine evaluation with analytics for comprehensive insights

## Troubleshooting

### Common Issues and Solutions

1. **Unknown domain error**

   ```bash
   # Ensure domain name is supported
   neurolink generate "test" --evaluationDomain healthcare  # ✓ Correct
   neurolink generate "test" --evaluationDomain medical     # ✗ Incorrect
   ```

2. **Context parsing errors**

   ```bash
   # Use proper JSON formatting
   neurolink generate "test" --context '{"key":"value"}'     # ✓ Correct
   neurolink generate "test" --context '{key:value}'        # ✗ Incorrect
   ```

3. **Performance issues**

   ```bash
   # Optimize token limits and context size
   neurolink generate "test" --max-tokens 500 --context '{"minimal":"data"}'
   ```

4. **Provider compatibility**
   ```bash
   # Test with different providers if needed
   neurolink generate "test" --provider google-ai --evaluationDomain healthcare
   neurolink generate "test" --provider anthropic --evaluationDomain finance
   ```

## Additional Resources

- [CLI Reference](/docs/cli/commands)
- [Configuration Guide](/docs/deployment/configuration)
- [Performance Optimization](/docs/deployment/performance-guide)
- [API Documentation](/docs/sdk/api-reference)

For more examples and advanced usage patterns, visit the [NeuroLink Examples Repository](https://github.com/juspay/neurolink-examples).

---

## ️ NeuroLink CLI Guide

<!-- Source: cli-guide.md -->

# ️ NeuroLink CLI Guide

## Command-Line Philosophy

The NeuroLink CLI is designed with the developer experience in mind. Our goal is to provide a tool that is not only powerful and flexible but also a pleasure to use. Here are the core principles that guide our design:

- **Clear and Consistent Commands:** We use a clear and consistent command structure to make the CLI easy to learn and use. All commands follow a logical `verb-noun` structure (e.g., `neurolink generate`, `neurolink models list`).

- **Human-Readable and Machine-Readable Output:** The CLI provides both human-readable text output and machine-readable JSON output. This makes it easy to use the CLI both interactively and in automated scripts.

- **Smart Defaults:** We provide smart defaults for all commands, so you can get started quickly without having to configure everything upfront.

- **Great Developer Experience:** We use animated spinners, colorized output, and helpful error messages to provide a great developer experience.

The NeuroLink CLI provides all SDK functionality through an elegant command-line interface with professional UX features.

## Installation & Usage

### Option 1: NPX (No Installation Required)

```bash
# Use directly without installation
npx @juspay/neurolink --help
npx @juspay/neurolink generate "Hello, AI!"
npx @juspay/neurolink status
```

### Option 2: Global Installation

```bash
# Install globally for convenient access
npm install -g @juspay/neurolink

# Then use anywhere
neurolink --help
neurolink generate "Write a haiku about programming"
neurolink status --verbose
```

### Option 3: Local Project Usage

```bash
# Add to project and use via npm scripts
npm install @juspay/neurolink
npx neurolink generate "Explain TypeScript"
```

## Commands Reference

### `generate ` - Core Text Generation (Recommended)

Generate AI content with customizable parameters. Prepared for multimodal support.

```bash
# Basic text generation
npx @juspay/neurolink generate "Explain quantum computing"

# With provider and model selection
npx @juspay/neurolink generate "what is deepest you can think?" --provider google-ai --model gemini-2.5-flash

# With different model for detailed responses
npx @juspay/neurolink generate "Write a comprehensive analysis" --provider google-ai --model gemini-2.5-pro

# With temperature control
npx @juspay/neurolink generate "Creative writing" --temperature 0.9

# With system prompt
npx @juspay/neurolink generate "Write code" --system "You are a senior developer"

# JSON output for scripting and automation
npx @juspay/neurolink generate "Summary of AI" --format json
npx @juspay/neurolink gen "Create product specification" --format json --provider google-ai

# JSON Output Example:
{
  "content": "AI (Artificial Intelligence) represents a transformative technology...",
  "provider": "google-ai",
  "model": "gemini-2.5-flash",
  "usage": {
    "promptTokens": 12,
    "completionTokens": 156,
    "totalTokens": 168
  },
  "responseTime": 987
}

# Parse JSON in shell scripts
response=$(npx @juspay/neurolink gen "Generate greeting" --format json)
content=$(echo "$response" | jq -r '.content')
echo "AI says: $content"

# Debug mode with detailed metadata
npx @juspay/neurolink generate "Hello AI" --debug
```

### `gen ` - Shortest Form

Quick command alias for fast usage.

```bash
# Basic generation (shortest)
npx @juspay/neurolink gen "Explain quantum computing"

# With provider and model
npx @juspay/neurolink gen "what is deepest you can think?" --provider google-ai --model gemini-2.5-flash

# With different model for comprehensive responses
npx @juspay/neurolink gen "Analyze this problem" --provider google-ai --model gemini-2.5-pro
```

**Available Options:**

- `--provider ` - Choose specific provider or 'auto' (default: auto)
- `--temperature ` - Creativity level 0.0-1.0 (default: 0.7)
- `--maxTokens ` - Maximum tokens to generate (default: 1000)
- `--system ` - System prompt to guide AI behavior
- `--format ` - Output format: 'text', 'json', or 'table' (default: text)
- `--debug` - Enable debug mode with verbose output and metadata
- `--timeout ` - Request timeout in seconds (default: 120)
- `--quiet` - Suppress spinners and progress indicators
- `--enableAnalytics` - Enable usage analytics collection (Phase 3 feature)
- `--enableEvaluation` - Enable AI response quality evaluation (Phase 3 feature)
- `--evaluationDomain ` - Domain expertise for evaluation context (e.g., "Senior Software Architect")
- `--context ` - JSON context object for custom data (e.g., '{"userId":"123","project":"api-design"}')
- `--disableTools` - Disable MCP tool integration (tools enabled by default)

**Video Generation Options (Veo 3.1):**

- `--outputMode ` - Output mode: 'text' (default) or 'video'
- `--image ` - Path to input image file for image-based video generation (required for video mode, e.g., ./input.jpg)
- `--videoOutput ` - Path to save generated video file (e.g., ./output.mp4)
- `--videoResolution ` - Video resolution: '720p' or '1080p' (default: 720p)
- `--videoLength ` - Video duration: 4, 6, or 8 seconds (default: 6)
- `--videoAspectRatio ` - Aspect ratio: '9:16' (portrait) or '16:9' (landscape, default: 16:9)
- `--videoAudio ` - Include synchronized audio (default: true)

**Output Example:**

```
 Generating text...
✅ Text generated successfully!
Quantum computing represents a revolutionary approach to information processing...
ℹ️  127 tokens used
```

**Debug Mode Output:**

```
 Generating text...
✅ Text generated successfully!

Quantum computing represents a revolutionary approach to information processing...

{
  "provider": "openai",
  "usage": {
    "promptTokens": 15,
    "completionTokens": 127,
    "totalTokens": 142
  },
  "responseTime": 1234
}
ℹ️  142 tokens used
```

### 🆕 Phase 3 Enhanced Features Examples

```bash
# Analytics Collection (Phase 3.1 Complete)
npx @juspay/neurolink generate "Explain machine learning" --enableAnalytics --debug

# Response Quality Evaluation (Phase 3.1 Complete)
npx @juspay/neurolink generate "Write Python code for prime numbers" --enableEvaluation --debug

# Combined Analytics + Evaluation
npx @juspay/neurolink generate "Design a REST API" --enableAnalytics --enableEvaluation --debug

# Domain-specific Evaluation Context
npx @juspay/neurolink generate "Debug this code issue" --enableEvaluation --evaluationDomain "Senior Software Engineer" --debug

# Custom Context for Analytics
npx @juspay/neurolink generate "Help with project" --context '{"userId":"123","project":"AI-platform"}' --enableAnalytics --debug
```

**Phase 3 Analytics Output Example:**

```
 Analytics:
   Provider: google-ai
   Tokens: 434 input + 127 output = 561 total
   Cost: $0.00042
   Time: 1.2s
   Tools: getCurrentTime, writeFile

 Response Evaluation:
   Relevance: 10/10
   Accuracy: 9/10
   Completeness: 9/10
   Overall: 9/10
   Reasoning: Response directly addresses the request with accurate code implementation.
             Includes comprehensive examples and error handling. Minor improvement
             could be adding more edge case documentation.
```

### `stream ` - Real-time Streaming

Stream AI generation in real-time with optional agent support.

```bash
# Basic streaming
npx @juspay/neurolink stream "Tell me a story"

# With specific provider
npx @juspay/neurolink stream "Tell me a story" --provider openai

# With agent tool support (default - AI can use tools)
npx @juspay/neurolink stream "What time is it?" --provider google-ai

# Without tools (traditional text-only mode)
npx @juspay/neurolink stream "Tell me a story" --disableTools

# Debug mode with tool execution logging
npx @juspay/neurolink stream "What time is it?" --debug

# Temperature control for creative streaming
npx @juspay/neurolink stream "Write a poem" --temperature 0.9

# Real Streaming with Analytics (Phase 3.2B Complete)
npx @juspay/neurolink stream "Explain quantum computing" --enableAnalytics --enableEvaluation --debug

# With custom timeout for long streaming operations
npx @juspay/neurolink stream "Write a long story" --timeout 120

# Quiet mode with timeout
npx @juspay/neurolink stream "Hello world" --quiet --timeout 10s
```

**Available Options:**

- `--provider ` - Choose specific provider or 'auto' (default: auto)
- `--temperature ` - Creativity level 0.0-1.0 (default: 0.7)
- `--debug` - Enable debug mode with interleaved logging
- `--quiet` - Suppress progress messages and status updates
- `--timeout ` - Request timeout (default: 2m for streaming). Accepts: '30s', '2m', '5000' (ms), '1h'
- `--disable-tools` - Disable agent tool support for text-only mode

**Output Example:**

```
 Streaming from auto provider...

Once upon a time, in a world where technology had advanced beyond...
[text streams in real-time as it's generated]
```

**Debug Mode Output:**

```
 Streaming from openai provider with debug logging...

Once upon a time[DEBUG: chunk received, 15 chars]
, in a world where technology[DEBUG: chunk received, 25 chars]
...
[text streams with interleaved debug information]
```

### `batch ` - Process Multiple Prompts

Process multiple prompts from a file efficiently with progress tracking.

```bash
# Create a file with prompts (one per line)
echo -e "Write a haiku\nExplain gravity\nDescribe the ocean" > prompts.txt

# Process all prompts
neurolink batch prompts.txt

# Save results to JSON file
neurolink batch prompts.txt --output results.json

# Add delay between requests (rate limiting)
neurolink batch prompts.txt --delay 2000

# With custom timeout per request
neurolink batch prompts.txt --timeout 45s

# Process with specific provider and timeout
neurolink batch prompts.txt --provider openai --timeout 1m --output results.json
```

**Output Example:**

```
 Processing 3 prompts...

✅ 1/3 completed
✅ 2/3 completed
✅ 3/3 completed
✅ Results saved to results.json
```

### `models` - Dynamic Model Management

The dynamic model system provides intelligent model selection and cost optimization.

```bash
# List all available models with pricing
neurolink models list

# Search models by capability
neurolink models search --capability functionCalling
neurolink models search --capability vision --max-price 0.001

# Get best model for specific use case
neurolink models best --use-case coding
neurolink models best --use-case vision
neurolink models best --use-case cheapest

# Resolve model aliases
neurolink models resolve anthropic claude-latest
neurolink models resolve google fastest

# Show model configuration server status
neurolink models server-status

# Test model parameter support
node dist/cli/index.js generate "what is deepest you can think?" --provider google-ai --model gemini-2.5-flash
node dist/cli/index.js generate "Analyze this complex problem" --provider google-ai --model gemini-2.5-pro
```

**Available Options:**

- `--capability ` - Filter by capability (functionCalling, vision, code-execution)
- `--max-price ` - Maximum price per 1K input tokens
- `--provider ` - Filter by specific provider
- `--exclude-deprecated` - Exclude deprecated models
- `--format ` - Output format: 'table', 'json', 'csv' (default: table)
- `--optimize-cost` - Automatically select cheapest suitable model
- `--use-case ` - Find best model for: coding, analysis, vision, fastest, cheapest

**Example Output:**

```
 Dynamic Model Inventory (Auto-Updated)

┌─────────────┬──────────────────────┬────────────┬─────────────────────────────────┬──────────────┐
│ Provider    │ Model                │ Input Cost │ Capabilities                    │ Status       │
├─────────────┼──────────────────────┼────────────┼─────────────────────────────────┼──────────────┤
│ google      │ gemini-2.0-flash     │ $0.000075  │ functionCalling, vision, code  │ ✅ Active    │
│ openai      │ gpt-4o-mini          │ $0.000150  │ functionCalling, json-mode     │ ✅ Active    │
│ anthropic   │ claude-3-haiku       │ $0.000250  │ functionCalling                │ ✅ Active    │
│ anthropic   │ claude-3-sonnet      │ $0.003000  │ functionCalling, vision        │ ✅ Active    │
│ openai      │ gpt-4o               │ $0.005000  │ functionCalling, vision        │ ✅ Active    │
│ anthropic   │ claude-3-opus        │ $0.015000  │ functionCalling, vision, analysis │ ✅ Active │
│ openai      │ gpt-4-turbo          │ $0.010000  │ functionCalling, vision        │ ❌ Deprecated │
└─────────────┴──────────────────────┴────────────┴─────────────────────────────────┴──────────────┘

 Cost Range: $0.000075 - $0.015000 per 1K tokens (200x difference)
 Capabilities: 9 functionCalling, 7 vision, 1 code-execution
⚡ Cheapest: google/gemini-2.0-flash
 Most Capable: anthropic/claude-3-opus
```

### `status` - Provider Diagnostics

Check the health and connectivity of all configured AI providers. This now includes authentication and model availability checks.

```bash
# Check all provider connectivity
neurolink status

# Verbose output with detailed information
neurolink status --verbose
```

**Output Example:**

```
 Checking AI provider status...

✅ openai: ✅ Working (234ms)
✅ bedrock: ✅ Working (456ms)
❌ vertex: ❌ Authentication failed

 Summary: 2/3 providers working
```

### `get-best-provider` - Auto-selection Testing

Test which provider would be automatically selected.

```bash
# Test which provider would be auto-selected
neurolink get-best-provider

# Debug mode with selection reasoning
neurolink get-best-provider --debug
```

**Available Options:**

- `--debug` - Show selection logic and reasoning

**Output Example:**

```
 Finding best provider...
✅ Best provider: bedrock
```

**Debug Mode Output:**

```
 Finding best provider...
✅ Best provider selected: openai

Best available provider: openai
Selection based on: availability, performance, and configuration
```

### `provider` - Provider Management Commands

Comprehensive provider management and diagnostics.

#### `provider status` - Detailed Provider Status

```bash
# Check all provider connectivity
neurolink provider status

# Verbose output with detailed information
neurolink provider status --verbose
```

#### `provider list` - List Available Providers

```bash
# List all supported providers
neurolink provider list
```

**Output Example:**

```
Available providers: openai, bedrock, vertex, anthropic, azure, google-ai, huggingface, ollama, mistral
```

#### `provider configure ` - Configuration Help

```bash
# Get configuration guidance for specific provider
neurolink provider configure openai
neurolink provider configure bedrock
neurolink provider configure vertex
neurolink provider configure google-ai
```

**For detailed setup instructions** → See [Provider Configuration Guide](/docs/getting-started/provider-setup)

**Output Example:**

```
 Configuration guidance for openai:
 Set relevant environment variables for API keys and other settings.
   Refer to the documentation for details: https://github.com/juspay/neurolink#configuration
```

### `config` - Configuration Management Commands

Manage NeuroLink configuration settings and preferences.

#### `config setup` - Interactive Setup

```bash
# Run interactive configuration setup
neurolink config setup

# Alias for setup
neurolink config init
```

#### `config show` - Display Current Configuration

```bash
# Show current NeuroLink configuration
neurolink config show
```

#### `config set  ` - Set Configuration Values

```bash
# Set configuration key-value pairs
neurolink config set provider openai
neurolink config set temperature 0.8
neurolink config set max-tokens 1000
```

#### `config import ` - Import Configuration

```bash
# Import configuration from JSON file
neurolink config import my-config.json
```

#### `config export ` - Export Configuration

```bash
# Export current configuration to file
neurolink config export backup-config.json
```

#### `config validate` - Validate Configuration

```bash
# Validate current configuration settings
neurolink config validate
```

#### `config reset` - Reset to Defaults

```bash
# Reset configuration to default values
neurolink config reset
```

**Available Options:**

- `--format ` - Output format: `table` (default), `json`, `yaml`, `summary`
- `--include-inactive` - Include servers that may not be currently active
- `--preferred-tools ` - Prioritize specific tools (comma-separated)
- `--workspace-only` - Search only workspace/project configurations
- `--global-only` - Search only global configurations

**Output Example:**

```
 NeuroLink MCP Server Discovery
✔ Discovery completed!

 Found 29 MCP servers:
────────────────────────────────────────

1.  kite
   Title: kite
   Source: Claude Desktop (global)
   Command: bash -c source ~/.nvm/nvm.sh && nvm exec 20 npx mcp-remote https://mcp.kite.trade/sse

2.  github.com/modelcontextprotocol/servers/tree/main/src/puppeteer
   Title: github.com/modelcontextprotocol/servers/tree/main/src/puppeteer
   Source: Cline AI Coder (global)
   Command: npx -y @modelcontextprotocol/server-puppeteer

 Discovery Statistics:
   Execution time: 15ms
   Config files found: 5
   Servers discovered: 29
   Duplicates removed: 0

 Search Sources:
    Claude Desktop: 1 location(s)
    Windsurf: 1 location(s)
    VS Code: 1 location(s)
    Cline AI Coder: 1 location(s)
   ⚙️ Generic: 1 location(s)
```

**Supported Tools & Platforms:**

✅ **Claude Desktop** - Global configuration discovery
✅ **VS Code** - Global and workspace configurations
✅ **Cursor** - Global and project configurations
✅ **Windsurf (Codeium)** - Global configuration discovery
✅ **Cline AI Coder** - Extension globalStorage discovery
✅ **Continue Dev** - Global configuration discovery
✅ **Aider** - Global configuration discovery
✅ **Generic Configs** - Project-level MCP configurations

**Resilient JSON Parser:**

The discovery system includes a sophisticated JSON parser that handles common configuration file issues:

✅ **Trailing Commas** - Automatically removes trailing commas
✅ **JavaScript Comments** - Strips `//` and `/* */` comments
✅ **Control Characters** - Fixes unescaped control characters
✅ **Unquoted Keys** - Adds missing quotes to object keys
✅ **Non-printable Characters** - Sanitizes problematic characters
✅ **Multiple Repair Strategies** - Three-stage repair with graceful fallback

### `discover` - Auto-Discover MCP Servers

Automatically discover MCP server configurations from all major AI development tools on your system.

```bash
# Basic discovery with table output
neurolink discover

# Different output formats
neurolink discover --format table
neurolink discover --format json
neurolink discover --format yaml
neurolink discover --format summary
```

**Options:**

- `--format ` - Output format: table, json, yaml, summary (default: table)
- `--include-inactive` - Include servers that may not be currently active
- `--preferred-tools ` - Prioritize specific tools (comma-separated)
- `--workspace-only` - Search only workspace/project configurations
- `--global-only` - Search only global configurations

**Output Example:**

```
 NeuroLink MCP Server Discovery
✔ Discovery completed!

 Found 29 MCP servers:
────────────────────────────────────────
1.  kite
   Title: kite
   Source: Claude Desktop (global)
   Command: bash -c source ~/.nvm/nvm.sh && nvm exec 20 npx mcp-remote https://mcp.kite.trade/sse

2.  github.com/modelcontextprotocol/servers/tree/main/src/puppeteer
   Title: github.com/modelcontextprotocol/servers/tree/main/src/puppeteer
   Source: Cline AI Coder (global)
   Command: npx -y @modelcontextprotocol/server-puppeteer

 Discovery Statistics:
   Execution time: 15ms
   Config files found: 5
   Servers discovered: 29
   Duplicates removed: 0
```

### `mcp` - Model Context Protocol Integration

Manage external MCP servers for extended functionality. Connect to filesystem operations, GitHub integration, database access, and more through the growing MCP ecosystem.

> **Status Update (v1.7.1):** Built-in tools are fully functional! External MCP server discovery is working (58+ servers found), with activation currently in development.

#### ✅ Working Now: Built-in Tool Testing

```bash
# Test built-in time tool
neurolink generate "What time is it?"

# Test tool discovery
neurolink generate "What tools do you have access to? List and categorize them."

# Multi-tool integration test
neurolink generate "Can you help me refactor some code? And what time is it right now?"
```

#### `mcp list` - List Configured Servers

```bash
# List all discovered MCP servers (58+ found from all AI tools)
neurolink mcp list

# List with live connectivity status (external activation in development)
neurolink mcp list --status
```

**Current Output Example:**

```
 Discovered MCP servers (58+ found):

 filesystem
   Command: npx -y @modelcontextprotocol/server-filesystem /
   Transport: stdio
 filesystem: Discovered (activation in development)

 github
   Command: npx @modelcontextprotocol/server-github
   Transport: stdio
 github: Discovered (activation in development)

... (56+ more servers discovered)
```

#### `mcp install` - Install Popular Servers (Discovery Phase)

> **Note:** Installation commands are available but servers are currently in discovery/placeholder mode. Full activation coming soon!

```bash
# Install filesystem server for file operations (discovered but not yet activated)
neurolink mcp install filesystem

# Install GitHub server for repository management (discovered but not yet activated)
neurolink mcp install github

# Install PostgreSQL server for database operations (discovered but not yet activated)
neurolink mcp install postgres

# Install browser automation server (discovered but not yet activated)
neurolink mcp install puppeteer

# Install web search server (discovered but not yet activated)
neurolink mcp install brave-search
```

**Current Output Example:**

```
 Installing MCP server: filesystem
 Server discovered and configured
 Note: Server activation in development - use built-in tools for now
 Test built-in tools with: neurolink generate "What time is it?" --debug
```

#### `mcp add` - Add Custom Servers

```bash
# Add custom server with basic command
neurolink mcp add myserver "python /path/to/server.py"

# Add server with arguments
neurolink mcp add myserver "npx my-mcp-server" --args "arg1,arg2"

# Add SSE-based server
neurolink mcp add webserver "http://localhost:8080" --transport sse

# Add server with environment variables
neurolink mcp add dbserver "npx db-server" --env '{"DB_URL": "postgresql://..."}'

# Add server with custom working directory
neurolink mcp add localserver "python server.py" --cwd "/project/directory"
```

#### `mcp test` - Test Server Connectivity (Development Phase)

> **Current Status:** Built-in tools are fully testable! External server connectivity testing is under development.

```bash
# ✅ Working: Test built-in tools
neurolink generate "What time is it?" --debug

#  In Development: Test external server connectivity
neurolink mcp test filesystem

#  Working: List discovered servers
neurolink mcp list --status
```

**Current Output Example (Built-in Tools):**

```
✅ Built-in tool execution via AI:
 The current time is Friday, December 13, 2024 at 10:30:45 AM PST
 Available tools: 5 built-in tools discovered
 External servers: 58+ discovered, activation in development
```

**Future Output Example (External Servers):**

```
 Testing MCP server: filesystem (Coming Soon)

⠋ Connecting...⠙ Getting capabilities...⠹ Listing tools...
✔ ✅ Connection successful!

 Server Capabilities:
   Protocol Version: 2024-11-05
   Tools: ✅ Supported

️  Available Tools:
   • read_file: Read file contents from filesystem
   • write_file: Create/overwrite files
   • edit_file: Make line-based edits
   // ...existing tools...
```

#### `mcp remove` - Remove Servers

```bash
# Remove configured server
neurolink mcp remove old-server

# Remove multiple servers
neurolink mcp remove server1 server2 server3
```

#### `mcp exec` - Execute Tools (Development Phase)

> **Current Status:** Built-in tools work via AI generation! Direct external tool execution is under development.

```bash
# ✅ Working Now: Built-in tools via AI generation
neurolink generate "What time is it?" --debug
neurolink generate "What tools do you have access to?" --debug

#  Coming Soon: Direct external tool execution
neurolink mcp exec filesystem read_file --params '{"path": "index.md"}'
neurolink mcp exec github create_issue --params '{"owner": "juspay", "repo": "neurolink", "title": "Bug report", "body": "Description"}'
neurolink mcp exec postgres execute_query --params '{"query": "SELECT * FROM users LIMIT 10"}'
neurolink mcp exec filesystem list_directory --params '{"path": "."}'
neurolink mcp exec puppeteer navigate --params '{"url": "https://example.com"}'
neurolink mcp exec puppeteer screenshot --params '{"name": "homepage"}'
```

**Current Working Output (Built-in Tools):**

```
✅ Built-in tool execution via AI:
 The current time is Friday, December 13, 2024 at 10:30:45 AM PST
 Available tools: 5 built-in tools discovered
 External servers: 58+ discovered, activation in development
```

### MCP Command Options

#### Global MCP Options

- `--help, -h` - Show MCP command help
- `--status` - Include live connectivity status (for `list` command)

#### Server Management Options

- `--args ` - Comma-separated command arguments
- `--transport ` - Transport type: `stdio` (default) or `sse`
- `--url ` - Server URL (for SSE transport)
- `--env ` - Environment variables as JSON string
- `--cwd ` - Working directory for server process

#### Tool Execution Options

- `--params ` - Tool parameters as JSON string
- `--timeout ` - Execution timeout in milliseconds

### MCP Integration Examples

#### File Operations Workflow

```bash
# Install and test filesystem server
neurolink mcp install filesystem
neurolink mcp test filesystem

# (Future) Execute file operations
neurolink mcp exec filesystem read_file --params '{"path": "package.json"}'
neurolink mcp exec filesystem list_directory --params '{"path": "src"}'
neurolink mcp exec filesystem search_files --params '{"path": ".", "pattern": "*.ts"}'
```

#### GitHub Integration Workflow

```bash
# Install GitHub server
neurolink mcp install github
neurolink mcp test github

# (Future) GitHub operations
neurolink mcp exec github search_repositories --params '{"query": "neurolink"}'
neurolink mcp exec github create_issue --params '{"title": "Feature request", "body": "Add new feature"}''
```

#### Database Operations Workflow

```bash
# Install PostgreSQL server
neurolink mcp install postgres
neurolink mcp test postgres

# (Future) Database operations
neurolink mcp exec postgres query --params '{"sql": "SELECT version()"}'
neurolink mcp exec postgres list-tables --params '{}'
```

#### Custom Server Development

```bash
# Add your custom MCP server
neurolink mcp add myapp "python /path/to/my-mcp-server.py" \
  --env '{"API_KEY": "secret", "DEBUG": "true"}' \
  --cwd "/my/project"

# Test your server
neurolink mcp test myapp

# Use your custom tools
neurolink mcp exec myapp my_custom_tool --params '{"input": "data"}'
```

### `ollama` - Local Model Management

Manage Ollama local models directly from NeuroLink CLI.

#### `ollama list-models` - List Installed Models

```bash
neurolink ollama list-models
```

#### `ollama pull ` - Download Model

```bash
neurolink ollama pull llama2
neurolink ollama pull codellama
```

#### `ollama remove ` - Remove Model

```bash
neurolink ollama remove llama2
```

#### `ollama status` - Check Ollama Service

```bash
neurolink ollama status
```

#### `ollama start` - Start Ollama Service

```bash
neurolink ollama start
```

#### `ollama stop` - Stop Ollama Service

```bash
neurolink ollama stop
```

#### `ollama setup` - Interactive Setup

```bash
neurolink ollama setup
```

----- | ----------- | -------------------------------- |
| `agent`  | /api/agent  | AI agent execution and streaming |
| `tool`   | /api/tools  | Tool listing and execution       |
| `mcp`    | /api/mcp    | MCP server management            |
| `memory` | /api/memory | Conversation memory              |
| `health` | /api/health | Health checks and metrics        |

### Managing Server Configuration

View and modify server settings:

```bash
# Show all configuration
neurolink server config

# Get specific value
neurolink server config --get defaultPort
neurolink server config --get cors.enabled

# Set configuration values
neurolink server config --set defaultPort=8080
neurolink server config --set rateLimit.maxRequests=200

# Reset to defaults
neurolink server config --reset

# Export as JSON
neurolink server config --format json
```

### Generating OpenAPI Specification

Generate API documentation:

```bash
# Output to stdout
neurolink server openapi

# Save to file
neurolink server openapi -o openapi.json

# Generate YAML format
neurolink server openapi --format yaml -o api-spec.yaml

# With custom metadata
neurolink server openapi --title "My API" --version "1.0.0"
```

### Server Command Reference

| Command                    | Description                |
| -------------------------- | -------------------------- |
| `serve [options]`          | Start server in foreground |
| `server start [options]`   | Start server in background |
| `server stop [--force]`    | Stop background server     |
| `server status [--format]` | Show server status         |
| `server routes [options]`  | List registered routes     |
| `server config [options]`  | Manage configuration       |
| `server openapi [options]` | Generate OpenAPI spec      |

### Framework Selection

Choose the right framework for your needs:

```bash
# Hono (default) - Lightweight, fast, edge-ready
neurolink serve --framework hono

# Express - Most ecosystem support, familiar API
neurolink serve --framework express

# Fastify - High performance, schema validation
neurolink serve --framework fastify

# Koa - Elegant middleware composition
neurolink serve --framework koa
```

### MCP Configuration Management

MCP servers are automatically configured in `.mcp-config.json`:

```json
{
  "mcpServers": {
    "filesystem": {
      "name": "filesystem",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/"],
      "transport": "stdio"
    },
    "github": {
      "name": "github",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "transport": "stdio"
    }
  }
}
```

## Command Options

### Global Options

- `--help, -h` - Show help information
- `--version, -v` - Show version number

### Generation Options

- `--provider ` - Choose provider: `auto` (default), `openai`, `bedrock`, `vertex`, `anthropic`, `azure`, `google-ai`, `huggingface`, `ollama`, `mistral`
- `--temperature ` - Creativity level: `0.0` (focused) to `1.0` (creative), default: `0.7`
- `--max-tokens ` - Maximum tokens to generate, default: `1000`
- `--format ` - Output format: `text` (default) or `json`

### Batch Processing Options

- `--output ` - Save results to JSON file
- `--delay ` - Delay between requests in milliseconds, default: `1000`
- `--timeout ` - Request timeout per prompt (default: 30s). Accepts: '30s', '2m', '5000' (ms), '1h'

### Status Options

- `--verbose, -v` - Show detailed diagnostic information

## CLI Features

### ✨ Professional UX

- **Animated Spinners**: Beautiful animations during AI generation
- **Colorized Output**: Green ✅ for success, red ❌ for errors, blue ℹ️ for info
- **Progress Tracking**: Real-time progress for batch operations
- **Smart Error Messages**: Helpful hints for common issues

### ️ Developer-Friendly

- **Multiple Output Formats**: Text for humans, JSON for scripts
- **Provider Selection**: Test specific providers or use auto-selection
- **Batch Processing**: Handle multiple prompts efficiently
- **Status Monitoring**: Check provider health and connectivity

###  Automation Ready

- **Exit Codes**: Standard exit codes for scripting
- **JSON Output**: Structured data for automated workflows
- **Environment Variables**: All SDK environment variables work with CLI
- **Scriptable**: Perfect for CI/CD pipelines and automation

## Usage Examples

### Creative Writing Workflow

```bash
# Generate creative content with high temperature
neurolink generate "Write a sci-fi story opening" \
  --provider openai \
  --temperature 0.9 \
  --max-tokens 1000 \
  --format json > story.json

# Check what was generated
cat story.json | jq '.content'

# Extract specific fields from JSON response
cat story.json | jq -r '.provider, .usage.totalTokens, .responseTime'

# Automated workflow with JSON parsing
story_response=$(neurolink gen "Write a mystery story" --format json)
title=$(echo "$story_response" | jq -r '.content' | head -1)
tokens=$(echo "$story_response" | jq -r '.usage.totalTokens')
echo "Generated story: $title (${tokens} tokens)"
```

### Batch Content Processing

```bash
# Create prompts file
cat > content-prompts.txt  status.json

# Parse results in scripts
working_providers=$(cat status.json | jq '[.[] | select(.status == "working")] | length')
echo "Working providers: $working_providers"
```

### Integration with Shell Scripts

```bash
#!/bin/bash
# AI-powered commit message generator

# Get git diff
diff=$(git diff --cached --name-only)

if [ -z "$diff" ]; then
  echo "No staged changes found"
  exit 1
fi

# Generate commit message
commit_msg=$(neurolink generate \
  "Generate a concise git commit message for these changes: $diff" \
  --max-tokens 50 \
  --temperature 0.3)

echo "Suggested commit message:"
echo "$commit_msg"

# Optionally auto-commit
read -p "Use this commit message? (y/N): " -n 1 -r
if [[ $REPLY =~ ^[Yy]$ ]]; then
  git commit -m "$commit_msg"
fi
```

## Environment Setup

The CLI uses the same environment variables as the SDK:

```bash
# Set up your providers (same as SDK)
export OPENAI_API_KEY="sk-your-key"
export AWS_ACCESS_KEY_ID="your-aws-key"
export AWS_SECRET_ACCESS_KEY="your-aws-secret"
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

# Corporate proxy support (automatic detection)
export HTTPS_PROXY="http://your-corporate-proxy:port"
export HTTP_PROXY="http://your-corporate-proxy:port"

# Test configuration
neurolink status
```

###  Enterprise Proxy Support

The CLI automatically works behind corporate proxies:

```bash
# Set proxy environment variables
export HTTPS_PROXY=http://proxy.company.com:8080
export HTTP_PROXY=http://proxy.company.com:8080

# CLI commands work automatically through proxy
npx @juspay/neurolink generate "Hello from corporate network"
npx @juspay/neurolink status
```

**No additional configuration required** - proxy detection is automatic.

**For detailed proxy setup** → See [Enterprise & Proxy Setup Guide](/docs/deployment/enterprise-proxy)

## CLI vs SDK Comparison

| Feature                | CLI                    | SDK                      |
| ---------------------- | ---------------------- | ------------------------ |
| **Text Generation**    | ✅ `generate`          | ✅ `generate()`          |
| **Streaming**          | ✅ `stream`            | ✅ `stream()`            |
| **Provider Selection** | ✅ `--provider` flag   | ✅ `createProvider()`    |
| **Batch Processing**   | ✅ `batch` command     | ✅ Manual implementation |
| **Status Monitoring**  | ✅ `status` command    | ✅ Manual testing        |
| **JSON Output**        | ✅ `--format json`     | ✅ Native objects        |
| **Automation**         | ✅ Perfect for scripts | ✅ Perfect for apps      |
| **Learning Curve**     |  Low                 |  Medium                |

## When to Use CLI vs SDK

### Use the CLI when

-  **Prototyping**: Quick testing of prompts and providers
-  **Scripting**: Shell scripts and automation workflows
-  **Debugging**: Checking provider status and testing connectivity
-  **Batch Processing**: Processing multiple prompts from files
-  **One-off Tasks**: Generating content without writing code

### Use the SDK when

- ️ **Application Development**: Building web apps, APIs, or services
-  **Real-time Integration**: Chat interfaces, streaming responses
- ⚙️ **Complex Logic**: Custom provider fallback, error handling
-  **UI Integration**: React components, Svelte stores
-  **Production Applications**: Full-featured applications

## ⭐ Phase 3 Enhanced Features

### Advanced Analytics and Evaluation

**Multi-Domain Evaluation Strategy:**

```bash
# Technical Documentation Evaluation
npx @juspay/neurolink generate "Explain microservices architecture" \
  --enableEvaluation \
  --evaluationDomain "Senior Software Architect" \
  --debug

# Creative Content Evaluation
npx @juspay/neurolink generate "Write marketing copy for AI product" \
  --enableEvaluation \
  --evaluationDomain "Senior Marketing Manager" \
  --debug
```

**Context-Aware Analytics:**

```bash
# User Session Context
npx @juspay/neurolink generate "Help with API design" \
  --enableAnalytics \
  --context '{"userId":"user123","session":"sess456","project":"ecommerce"}' \
  --debug

# Business Context with Evaluation
npx @juspay/neurolink generate "Market analysis for AI products" \
  --enableAnalytics \
  --enableEvaluation \
  --evaluationDomain "Business Strategy Consultant" \
  --context '{"company":"TechCorp","department":"strategy","quarter":"Q4-2025"}' \
  --debug
```

### Real Streaming with Analytics

**Enterprise streaming with full monitoring:**

```bash
# Production streaming with all features
npx @juspay/neurolink stream "Generate comprehensive project documentation" \
  --provider google-ai \
  --model gemini-2.5-pro \
  --enableAnalytics \
  --enableEvaluation \
  --evaluationDomain "Senior Technical Writer" \
  --context '{"project":"enterprise-api","team":"platform"}' \
  --temperature 0.7 \
  --maxTokens 3000 \
  --timeout 180 \
  --debug
```

### Performance Optimization (68% Faster Provider Checks)

```bash
# Fast provider status (5s instead of 16s)
time npx @juspay/neurolink provider status

# Best provider selection
npx @juspay/neurolink get-best-provider

# Auto-selection with performance priority
npx @juspay/neurolink generate "Performance critical task" --provider auto
```

##  CLI Video Demonstrations

**See the CLI in action with professional demonstrations:**

### **Command Tutorials**

- **[Help & Overview](visual-content/cli-videos/cli-01-cli-help.mp4)** - Complete command reference and usage examples
- **[Provider Status](visual-content/cli-videos/cli-02-provider-status.mp4)** - Connectivity testing and response time measurement
- **[Text Generation](visual-content/cli-videos/cli-03-text-generation.mp4)** - Real AI content generation with different providers
- **[Auto Selection](visual-content/cli-videos/cli-04-auto-selection.mp4)** - Automatic provider selection algorithm
- **[Streaming](visual-content/cli-videos/cli-05-streaming.mp4)** - Real-time text generation streaming
- **[Advanced Features](visual-content/cli-videos/cli-06-advanced-features.mp4)** - Verbose diagnostics and advanced options

### **MCP Integration Demos**

- **[MCP Help](visual-content/cli-videos/cli-advanced-features/mcp-help.mp4)** - MCP command reference and usage
- **[MCP List](visual-content/cli-videos/cli-advanced-features/mcp-list.mp4)** - MCP server listing and status

### **AI Workflow Tools Demo**

- **[AI Workflow Tools](visual-content/videos/demo/ai-workflow-full-demo.mp4)** - Complete demonstration of AI workflow tools via CLI

**All videos feature:**

- ✅ Real command execution with live AI generation
- ✅ Professional MP4 format for universal compatibility
- ✅ Comprehensive coverage of all CLI features
- ✅ Suitable for documentation, tutorials, and presentations

For complete visual documentation including web interface demos, see the [Visual Demos Guide](/docs/visual-demos).

---

[← Back to Main README](/docs/) | [Next: Framework Integration →](/docs/sdk/framework-integration)

---

## ️ CLI Reference Guide

<!-- Source: cli-reference.md -->

# ️ CLI Reference Guide

## ✅ IMPLEMENTATION STATUS: COMPLETE (2025-01-07)

**Generate Function Migration completed - CLI now supports both primary and legacy commands**

- ✅ New `generate` command established as primary
- ✅ All options and functionality maintained
- ✅ Zero breaking changes for existing scripts

> **Migration Note**: Use `generate` for new scripts. Existing `generate` scripts continue working with deprecation warnings.

------------ | ------- | ---------------- | -------------------------------------------------------------------------------------------------------------------------- |
| `--provider`    | string  | `auto`           | AI provider (`auto`, `openai`, `bedrock`, `vertex`, `anthropic`, `azure`, `google-ai`, `huggingface`, `ollama`, `mistral`) |
| `--model`       | string  | provider default | Specific model (e.g., `gemini-2.5-pro`, `gpt-4o`, `claude-3-sonnet`)                                                       |
| `--temperature` | number  | `0.7`            | Creativity level (0.0 = focused, 1.0 = creative)                                                                           |
| `--max-tokens`  | number  | `1000`           | Maximum tokens to generate                                                                                                 |
| `--system`      | string  | none             | System prompt to guide AI behavior                                                                                         |
| `--format`      | string  | `text`           | Output format (`text`, `json`)                                                                                             |
| `--timeout`     | number  | `120`            | Maximum execution time in seconds                                                                                          |
| `--debug`       | boolean | `false`          | Enable debug mode with verbose output                                                                                      |

### Enhancement Features

| Flag                  | Type    | Default | Description                                        |
| --------------------- | ------- | ------- | -------------------------------------------------- |
| `--enable-analytics`  | boolean | `false` | Enable usage analytics (tokens, cost, performance) |
| `--enable-evaluation` | boolean | `false` | Enable AI response quality evaluation              |
| `--context`           | string  | none    | JSON context object for custom data                |

### Universal Evaluation System

| Flag                   | Type    | Default | Description                                                   |
| ---------------------- | ------- | ------- | ------------------------------------------------------------- |
| `--evaluation-domain`  | string  | none    | Domain expertise for evaluation (e.g., 'AI coding assistant') |
| `--tool-usage-context` | string  | none    | Tool usage context for evaluation                             |
| `--lighthouse-style`   | boolean | `false` | Use Lighthouse-compatible domain-aware evaluation             |

### MCP Integration

| Flag              | Type    | Default | Description                                             |
| ----------------- | ------- | ------- | ------------------------------------------------------- |
| `--disable-tools` | boolean | `false` | Disable MCP tool integration (tools enabled by default) |

### Video Generation (Veo 3.1)

| Flag                   | Type    | Default | Description                                                               |
| ---------------------- | ------- | ------- | ------------------------------------------------------------------------- |
| `--outputMode`         | string  | `text`  | Output mode: 'text' or 'video'                                            |
| `--image`              | string  | none    | Path to an input image to base the generated video on (e.g., ./input.png) |
| `--videoOutput`, `-vo` | string  | none    | Path to save generated video (e.g., ./output.mp4)                         |
| `--videoResolution`    | string  | `720p`  | Video resolution: '720p' or '1080p'                                       |
| `--videoLength`        | number  | `6`     | Video duration in seconds: 4, 6, or 8                                     |
| `--videoAspectRatio`   | string  | `16:9`  | Aspect ratio: '9:16' (portrait) or '16:9' (landscape)                     |
| `--videoAudio`         | boolean | `true`  | Include synchronized audio                                                |

## Usage Examples

### Basic Text Generation

```bash
# Simple generation
npx @juspay/neurolink generate "Write a haiku about AI"

# With specific provider
npx @juspay/neurolink generate "Explain quantum computing" --provider openai

# With model selection
npx @juspay/neurolink generate "Write code" --provider google-ai --model gemini-2.5-pro
```

### Enhanced Analytics & Evaluation

```bash
# Basic analytics
npx @juspay/neurolink generate "What is machine learning?" --enable-analytics

# Analytics + evaluation
npx @juspay/neurolink generate "Explain AI ethics" --enable-analytics --enable-evaluation

# With custom context
npx @juspay/neurolink generate "Create a proposal" \
  --enable-analytics --enable-evaluation \
  --context '{"company":"TechCorp","department":"AI"}'
```

### Domain-Aware Evaluation

```bash
# Basic domain evaluation
npx @juspay/neurolink generate "Fix this Python code" \
  --enable-evaluation --evaluation-domain "Python coding assistant"

# Lighthouse-style evaluation
npx @juspay/neurolink generate "Create a business plan" \
  --lighthouse-style --evaluation-domain "Business consultant" \
  --tool-usage-context "Used market-research and financial-analysis tools"

# Enterprise evaluation with context
npx @juspay/neurolink generate "Analyze sales data" \
  --enable-analytics --lighthouse-style \
  --evaluation-domain "Data analyst" \
  --context '{"role":"senior_analyst","access_level":"full"}'
```

### Debug & Development

```bash
# Debug mode with full output
npx @juspay/neurolink generate "Test prompt" --debug

# Debug with enhancements
npx @juspay/neurolink generate "Test analytics" \
  --enable-analytics --enable-evaluation --debug

# Disable MCP tools for testing
npx @juspay/neurolink generate "Simple test" --disable-tools
```

### Advanced Examples

```bash
# Enterprise AI assistant with full features
npx @juspay/neurolink generate "Create quarterly AI strategy" \
  --provider openai --model gpt-4o \
  --enable-analytics --lighthouse-style \
  --evaluation-domain "AI strategy consultant" \
  --tool-usage-context "Market research, competitor analysis, financial modeling" \
  --context '{"company":"Fortune500","quarter":"Q1-2025","budget":"$5M"}' \
  --debug

# Cost-optimized evaluation
npx @juspay/neurolink generate "Quick code review" \
  --provider google-ai --model gemini-2.5-flash \
  --enable-evaluation --evaluation-domain "Code reviewer" \
  --max-tokens 500

# High-quality content generation
npx @juspay/neurolink generate "Write technical documentation" \
  --provider anthropic --model claude-3-opus \
  --enable-analytics --enable-evaluation \
  --evaluation-domain "Technical writer" \
  --temperature 0.3 --max-tokens 2000
```

## Output Examples

### Basic Output

```
✨ Generated text:
Artificial Intelligence (AI) refers to...

✅ Text generated successfully!
```

### Enhanced Output (with --enable-analytics --enable-evaluation)

```
✨ Generated text:
Artificial Intelligence (AI) refers to...

 Analytics:
   Provider: google-ai
   Model: gemini-2.5-flash
   Tokens: 245 (input: 12, output: 233)
   Cost: $0.0012
   Response Time: 3247ms
   Context: {"domain":"education"}

⭐ Response Quality Evaluation:
    Relevance: 9/10
   ✅ Accuracy: 8/10
    Completeness: 9/10
    Overall Quality: 9/10
    Evaluated by: gemini-2.5-flash (1247ms)

✅ Text generated successfully!
```

### Debug Output (with --debug)

```
 Debug: Provider selection started
 Debug: Selected provider: google-ai (model: gemini-2.5-flash)
 Debug: Analytics enabled: true
 Debug: Evaluation enabled: true
 Debug: Request started at 2025-01-06T12:00:00.000Z

✨ Generated text:
...

 Debug: Raw analytics data:
{
  "provider": "google-ai",
  "tokens": {"input": 12, "output": 233, "total": 245},
  "cost": 0.0012,
  "responseTime": 3247,
  "context": {"domain": "education"}
}

 Debug: Raw evaluation data:
{
  "relevance": 9,
  "accuracy": 8,
  "completeness": 9,
  "overall": 9,
  "model": "gemini-2.5-flash",
  "evaluationTime": 1247
}

✅ Text generated successfully!
```

## Error Handling

### Common Errors & Solutions

**Provider not available:**

```
❌ Error: Provider 'openai' not available (missing API key)
 Solution: Set OPENAI_API_KEY in your .env file
```

**Invalid context JSON:**

```
❌ Error: Invalid JSON in --context parameter
 Solution: Use proper JSON format: --context '{"key":"value"}'
```

**Model not found:**

```
❌ Error: Model 'invalid-model' not found for provider 'openai'
 Solution: Use valid model names (see provider documentation)
```

**Evaluation failed:**

```
⚠️  Warning: Evaluation failed, continuing without quality scores
 Reason: Evaluation provider unavailable, set NEUROLINK_EVALUATION_MODEL
```

## Performance Tips

1. **Fast Evaluation**: Use `--model gemini-2.5-flash` for quick, cost-effective evaluation
2. **Quality Content**: Use `--provider anthropic --model claude-3-opus` for high-quality generation
3. **Cost Optimization**: Set `NEUROLINK_EVALUATION_PREFER_CHEAP=true` for automatic cost optimization
4. **Debug Efficiently**: Use `--debug` only when troubleshooting to avoid verbose output
5. **Context Size**: Keep `--context` objects small to minimize token usage

## Video Generation Examples

Generate videos from images using Veo 3.1 via Vertex AI:

```bash
# Basic video generation
npx @juspay/neurolink generate "Product showcase with smooth camera movement" \
  --image ./product.jpg \
  --outputMode video \
  --videoOutput ./output.mp4

# Full options
npx @juspay/neurolink generate "Cinematic reveal with dramatic lighting" \
  --image ./hero-image.png \
  --provider vertex \
  --model veo-3.1 \
  --outputMode video \
  --videoResolution 1080p \
  --videoLength 8 \
  --videoAspectRatio 16:9 \
  --videoOutput ./cinematic.mp4

# Portrait video for social media
npx @juspay/neurolink generate "Vertical scroll animation" \
  --image ./mobile-screenshot.jpg \
  --outputMode video \
  --videoResolution 720p \
  --videoAspectRatio 9:16 \
  --videoOutput ./story.mp4
```

> **Note:** Video generation requires Vertex AI credentials. See [Video Generation Guide](/docs/features/video-generation).

## Environment Variables

See the [Environment Variables](/docs/getting-started/environment-variables) documentation for complete configuration options.

## API Integration

For programmatic usage, see the [API Reference](/docs/sdk/api-reference) documentation.

---

## Lighthouse Unified Integration Guide

<!-- Source: lighthouse-unified-integration.md -->

#  Lighthouse Unified Integration Guide

## ✅ **FINAL IMPLEMENTATION: Unified registerTools() API**

This document outlines the final implementation of Lighthouse integration through a unified `registerTools()` method that accepts both object and array formats.

##  **Overview**

**Problem Solved**: Seamless integration of Lighthouse tools without migration or special methods.

**Solution**: Enhanced `registerTools()` method that automatically detects and handles both:

- **Object format**: `Record` (existing compatibility)
- **Array format**: `Array` (Lighthouse compatibility)

##  **Core Implementation**

### **Method Signature**

```typescript
registerTools(tools: Record | Array): void
```

### **Automatic Format Detection**

```typescript
registerTools(tools: Record | Array): void {
  if (Array.isArray(tools)) {
    // Handle array format (Lighthouse compatible)
    for (const { name, tool } of tools) {
      this.registerTool(name, tool);
    }
  } else {
    // Handle object format (existing compatibility)
    for (const [name, tool] of Object.entries(tools)) {
      this.registerTool(name, tool);
    }
  }
}
```

##  **Lighthouse Compatibility**

### **Zod Schema Support**

NeuroLink already supports Zod schemas in the `SimpleTool` interface:

```typescript
type SimpleTool = {
  description: string;
  parameters?: ZodSchema; // Zod support already implemented
  execute: (params: ToolArgs, context?: ExecutionContext) => Promise;
};
```

### **Example: Lighthouse Tool Integration**

```typescript

const neurolink = new NeuroLink();

// Lighthouse tools exported as array with Zod schemas
const lighthouseTools = [
  {
    name: "juspay-analytics",
    tool: {
      description: "Analyze Juspay merchant payment data",
      parameters: z.object({
        merchantId: z.string().describe("Merchant identifier"),
        dateRange: z.object({
          start: z.string().datetime(),
          end: z.string().datetime(),
        }),
        metrics: z
          .array(z.enum(["volume", "success_rate", "avg_amount"]))
          .optional(),
      }),
      execute: async ({ merchantId, dateRange, metrics }) => {
        // Lighthouse tool implementation
        return {
          merchantId,
          period: dateRange,
          analytics: {
            totalVolume: 125000,
            successRate: 0.987,
            avgAmount: 45.67,
          },
        };
      },
    },
  },
  {
    name: "payment-processor",
    tool: {
      description: "Process payment transactions",
      parameters: z.object({
        amount: z.number().positive(),
        currency: z.string().length(3),
        paymentMethod: z.enum(["card", "upi", "wallet"]),
      }),
      execute: async ({ amount, currency, paymentMethod }) => {
        return {
          transactionId: `txn_${Date.now()}`,
          status: "success",
          amount,
          currency,
          method: paymentMethod,
        };
      },
    },
  },
];

// Register Lighthouse tools using unified API
neurolink.registerTools(lighthouseTools);

// Use in AI generation
const result = await neurolink.generate({
  input: {
    text: "Show me payment analytics for merchant MERCH123 for the last week",
  },
  provider: "google-ai",
});
```

##  **Compatibility Matrix**

| Format | Type                                        | Lighthouse Compatible   | Backward Compatible | Status   |
| ------ | ------------------------------------------- | ----------------------- | ------------------- | -------- |
| Object | `Record`                | ⚠️ Requires conversion  | ✅ Yes              | Existing |
| Array  | `Array` | ✅ Direct compatibility | ✅ Yes              | New      |

##  **Migration Path**

### **Existing Code**

No changes required - object format continues to work:

```typescript
// Existing code remains unchanged
neurolink.registerTools({
  myTool: { description: "...", execute: async () => {...} }
});
```

### **New Lighthouse Integration**

Direct import using array format:

```typescript
// Lighthouse tools can be imported directly

neurolink.registerTools(lighthouseAnalyticsTools);
```

##  **Benefits**

1. **Unified API**: Single method for all tool registration needs
2. **Zero Migration**: Lighthouse tools work without conversion
3. **Backward Compatibility**: Existing code unchanged
4. **Type Safety**: Full TypeScript support for both formats
5. **Zod Integration**: Native support for Zod parameter validation
6. **API Simplification**: Removes need for separate methods

##  **Testing Strategy**

### **Format Detection Tests**

```typescript
describe("Unified registerTools()", () => {
  test("should detect object format", () => {
    neurolink.registerTools({ tool1: {...}, tool2: {...} });
    expect(neurolink.getCustomTools().size).toBe(2);
  });

  test("should detect array format", () => {
    neurolink.registerTools([
      { name: "tool1", tool: {...} },
      { name: "tool2", tool: {...} }
    ]);
    expect(neurolink.getCustomTools().size).toBe(2);
  });

  test("should support mixed registration", () => {
    neurolink.registerTools({ objectTool: {...} });
    neurolink.registerTools([{ name: "arrayTool", tool: {...} }]);
    expect(neurolink.getCustomTools().size).toBe(2);
  });
});
```

### **Lighthouse Integration Tests**

```typescript
describe("Lighthouse Integration", () => {
  test("should register Lighthouse tools with Zod schemas", () => {
    const lighthouseTools = [
      {
        name: "analytics",
        tool: {
          description: "Analytics tool",
          parameters: z.object({ merchantId: z.string() }),
          execute: async ({ merchantId }) => ({ data: merchantId }),
        },
      },
    ];

    neurolink.registerTools(lighthouseTools);
    const result = await neurolink.executeTool("analytics", {
      merchantId: "test",
    });
    expect(result.data).toBe("test");
  });
});
```

##  **Implementation Checklist**

- [x] **Design**: Unified method signature with union types
- [x] **Detection**: Automatic format detection using `Array.isArray()`
- [x] **Compatibility**: Zod schema support verification
- [x] **Documentation**: Updated README and guides
- [x] **Implementation**: Modify `registerTools()` method in NeuroLink class
- [x] **Cleanup**: Remove redundant `registerToolsFromArray()` method (never existed)
- [x] **Testing**: Update tests for unified method
- [x] **Validation**: End-to-end integration testing

##  **Future Extensibility**

The unified approach supports future extensions:

```typescript
// Future: Additional format support
registerTools(tools:
  | Record           // Object format
  | Array  // Array format
  | MCPServerConfig                      // Future: MCP server format
  | PluginManifest                       // Future: Plugin format
): void
```

This architecture ensures the API can grow with new tool formats while maintaining compatibility.

---

## npm Trusted Publishing Setup

<!-- Source: npm-trusted-publishing-setup.md -->

# npm Trusted Publishing Setup

This repository is configured to use npm's **Trusted Publishing** feature with GitHub Actions OIDC authentication. This provides secure, token-free publishing with automatic provenance generation.

## What is Trusted Publishing?

Trusted Publishing allows GitHub Actions to publish packages to npm without using long-lived NPM_TOKEN secrets. Instead, it uses OpenID Connect (OIDC) to create short-lived tokens that are automatically verified by npm.

**Benefits:**

- ✅ No need to manage NPM_TOKEN secrets
- ✅ Automatic package provenance (cryptographic attestation)
- ✅ Enhanced security (no long-lived credentials)
- ✅ Verifiable supply chain

## Configuration Status

✅ **GitHub Actions workflow** - Configured with OIDC permissions
✅ **semantic-release** - Configured to publish with provenance

⚠️ **npm Trusted Publisher** - Requires manual setup on npm.org (see below)

## GitHub Actions Configuration (✅ Complete)

The following changes have been made to `.github/workflows/release.yml`:

1. **Added `id-token: write` permission:**

   ```yaml
   permissions:
     contents: write
     packages: write
     issues: write
     pull-requests: write
     id-token: write # Required for npm provenance
   ```

2. **Configured semantic-release** in `.releaserc.json`:
   ```json
   [
     "@semantic-release/npm",
     {
       "npmPublish": true,
       "provenance": true
     }
   ]
   ```

## npm Website Configuration (⚠️ Required)

To complete the setup, you must configure the trusted publisher on npm.org:

### Step 1: Access Package Settings

1. Go to [npmjs.com](https://www.npmjs.com/) and sign in
2. Navigate to your package: `@juspay/neurolink`
3. Click on **Settings** tab

### Step 2: Configure Trusted Publisher

1. Scroll to **Publishing Access** section
2. Click **Add Trusted Publisher**
3. Select **GitHub Actions** as the provider
4. Fill in the following details:
   - **Repository owner:** `juspay`
   - **Repository name:** `neurolink`
   - **Workflow name:** `release.yml`
   - **Environment (optional):** Leave empty unless you use GitHub environments

### Step 3: Save Configuration

1. Click **Add Trusted Publisher**
2. Verify the configuration appears in the list

## Migration Notes

### During Transition Period

You can keep the `NPM_TOKEN` secret configured during the transition:

- If trusted publishing is configured, npm will use OIDC authentication
- If trusted publishing fails, it will fall back to the token
- Once verified working, you can remove the `NPM_TOKEN` secret

### Removing NPM_TOKEN (After Verification)

Once you've confirmed trusted publishing works:

1. Go to GitHub repository settings
2. Navigate to **Secrets and variables** → **Actions**
3. Delete the `NPM_TOKEN` secret (optional but recommended)

**Note:** The `NPM_TOKEN` in the workflow environment variables doesn't need to be removed - it will simply be unused when OIDC is active.

## Verification

After configuring trusted publishing and triggering a release:

1. **Check the workflow logs:**
   - Go to **Actions** tab in GitHub
   - Open the latest release workflow run
   - Look for the semantic-release step logs

2. **Verify provenance on npm:**
   - Visit your package page: `https://www.npmjs.com/package/@juspay/neurolink`
   - Look for the **Provenance** badge or section
   - Click to view the attestation details

3. **Expected output:**
   - Workflow should complete successfully without NPM_TOKEN errors
   - Package page should show provenance information
   - Attestation should link back to the GitHub Actions run

## Troubleshooting

### Error: "This request requires id-token permission"

**Cause:** Missing `id-token: write` permission in workflow

**Solution:** Verify `.github/workflows/release.yml` has:

```yaml
permissions:
  id-token: write
```

### Error: "npm publish failed - no trusted publisher configured"

**Cause:** Trusted publisher not configured on npm.org

**Solution:** Follow the npm website configuration steps above

### Provenance not showing on npm

**Possible causes:**

1. Trusted publisher not configured on npm.org
2. `provenance: true` not set in semantic-release config
3. Publishing happened before OIDC configuration

**Solution:**

1. Verify all configuration steps
2. Trigger a new release to test

## References

- [npm Trusted Publishers Documentation](https://docs.npmjs.com/trusted-publishers)
- [GitHub Actions OIDC](https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect)
- [semantic-release npm plugin](https://github.com/semantic-release/npm#options)

## Support

For issues with:

- **GitHub Actions OIDC:** Contact GitHub Support
- **npm Trusted Publishing:** Contact npm Support
- **semantic-release:** Check [semantic-release documentation](https://semantic-release.gitbook.io/)

---

## Step-by-Step Integration Tutorials

<!-- Source: tutorials.md -->

#  Step-by-Step Integration Tutorials

##  Quick Start (15 minutes) {#quick-start-15-minutes}

### Step 1: Installation

```bash
npm install @juspay/neurolink
echo 'GOOGLE_AI_API_KEY="your-key"' > .env
npx @juspay/neurolink generate "Hello world"
```

### Step 2: Enable Analytics

```javascript
const { NeuroLink } = require("@juspay/neurolink");
const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Write a professional email" },
  enableAnalytics: true,
});

console.log(" Analytics:", result.analytics);
```

### Step 3: Add Quality Evaluation

```javascript
const result = await neurolink.generate({
  input: { text: "Explain quantum computing" },
  enableEvaluation: true,
});

console.log("⭐ Quality:", result.evaluation);
// Shows: { relevanceScore: 9, accuracyScore: 8, completenessScore: 9, overallScore: 8.7 }
```

## Video Generation (Veo 3.1)

Generate videos from images using Google's Veo 3.1 model via Vertex AI.

### Prerequisites

```bash
# Set up Vertex AI credentials
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
export GOOGLE_VERTEX_PROJECT="your-project-id"
export GOOGLE_VERTEX_LOCATION="us-central1"
```

### SDK Video Generation

```javascript

const neurolink = new NeuroLink();

// Generate video from image + text prompt
// Note: Image must be PNG, JPEG, or WebP format (max 20MB)
const result = await neurolink.generate({
  input: {
    text: "Smooth camera pan with cinematic lighting",
    images: [await readFile("./product-image.jpg")],
  },
  provider: "vertex", // Optional: auto-switches to vertex when output.mode is "video"
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "1080p", // or "720p"
      length: 8, // 4, 6, or 8 seconds
      aspectRatio: "16:9", // or "9:16" for portrait
      audio: true, // synchronized audio
    },
  },
});

// Save generated video
if (result.video) {
  await writeFile("output.mp4", result.video.data);
  console.log(`Duration: ${result.video.metadata?.duration}s`);
}
```

**Image Requirements:**

- **Formats:** PNG, JPEG, or WebP only
- **Size limit:** 20MB maximum
- **Aspect ratio:** Should be compatible with target video aspect ratio (16:9 or 9:16)

### CLI Video Generation

```bash
# Basic video generation
npx @juspay/neurolink generate "Product showcase video" \
  --image ./product.jpg \
  --outputMode video \
  --videoOutput ./output.mp4

# Full options (--provider vertex is optional, auto-selected for video mode)
npx @juspay/neurolink generate "Cinematic camera movement" \
  --image ./input.jpg \
  --provider vertex \
  --model veo-3.1 \
  --outputMode video \
  --videoResolution 1080p \
  --videoLength 8 \
  --videoAspectRatio 16:9 \
  --videoOutput ./output.mp4
```

**Note:** The `--provider vertex` flag is optional for video generation—NeuroLink automatically switches to Vertex AI when `--outputMode video` is specified.

For complete documentation, see the [Video Generation Guide](/docs/features/video-generation).

## � Web App Integration

### Express.js API

```javascript
const express = require("express");
const { NeuroLink } = require("@juspay/neurolink");
const app = express();
app.use(express.json());
const neurolink = new NeuroLink();

app.post("/api/generate", async (req, res) => {
  const result = await neurolink.generate({
    input: { text: req.body.prompt },
    enableAnalytics: true,
    enableEvaluation: true,
    context: {
      department: req.body.department,
      user_id: req.headers["user-id"],
    },
  });

  // Quality gate
  if (result.evaluation.overallScore  c.cost = qualityTarget)
      .sort((a, b) => b.quality - a.quality)[0];
  }
}
```

##  Batch Processing

```javascript
const fs = require("fs");
const csv = require("csv-parser");
const { NeuroLink } = require("@juspay/neurolink");

const neurolink = new NeuroLink();

class BatchProcessor {
  async processCSV(inputFile) {
    const items = [];

    await new Promise((resolve) => {
      fs.createReadStream(inputFile)
        .pipe(csv())
        .on("data", (row) => items.push(row))
        .on("end", resolve);
    });

    for (const item of items) {
      const result = await neurolink.generate({
        input: { text: `Create marketing copy for: ${item.name}` },
        enableAnalytics: true,
        enableEvaluation: true,
        context: { product_id: item.id, batch: true },
      });

      console.log(
        `Processed ${item.name}: Quality ${result.evaluation.overallScore}/10`,
      );
    }
  }
}
```

##  Real-Time Monitoring

### Analytics Dashboard

```javascript
// Store analytics in memory (use database in production)
const analyticsStore = { requests: [], stats: {} };

app.post("/api/generate", async (req, res) => {
  const result = await neurolink.generate({
    input: { text: req.body.prompt },
    ...req.body,
    enableAnalytics: true,
    enableEvaluation: true,
  });

  // Store analytics
  analyticsStore.requests.push({
    timestamp: new Date(),
    ...result.analytics,
    quality: result.evaluation,
  });

  res.json(result);
});

// Dashboard endpoint
app.get("/api/dashboard", (req, res) => {
  const last24h = analyticsStore.requests.filter(
    (r) => r.timestamp > new Date(Date.now() - 24 * 60 * 60 * 1000),
  );

  res.json({
    totalRequests: last24h.length,
    totalCost: last24h.reduce((sum, r) => sum + (r.cost || 0), 0),
    avgQuality:
      last24h.reduce((sum, r) => sum + r.quality.overallScore, 0) /
      last24h.length,
  });
});
```

##  CLI Usage Patterns

### Basic Generation with Analytics

```bash
npx @juspay/neurolink generate "Create product description" \
  --enable-analytics --debug
```

### Quality Control

```bash
npx @juspay/neurolink generate "Medical advice content" \
  --enable-evaluation --debug
```

### Full Features

```bash
npx @juspay/neurolink generate "Business proposal" \
  --enable-analytics --enable-evaluation \
  --context '{"dept":"sales","priority":"high"}' \
  --debug
```

##  Industry Examples

### E-commerce: Product Descriptions

```javascript
const productResult = await neurolink.generate({
  input: { text: `Product: ${product.name}\nFeatures: ${product.features}` },
  enableAnalytics: true,
  enableEvaluation: true,
  context: {
    category: product.category,
    price_tier: product.priceTier,
  },
});

// Cost optimization by category
if (product.category === "basic" && productResult.analytics?.cost > 0.05) {
  // Switch to cheaper model for basic products
}
```

### Healthcare: Patient Education

```javascript
const medicalContent = await neurolink.generate({
  input: { text: "Diabetes management guide for patients" },
  enableEvaluation: true,
  context: {
    content_type: "medical",
    accuracy_required: 95,
  },
});

// Strict medical accuracy requirements
if (medicalContent.evaluation.accuracyScore  User: ${prompt}`);
    const result = await neurolink.generate({
      input: { text: prompt },
    });
    console.log(`> Agent: ${result.content}`);
  }
}

haveConversation();
```

### Expected Output

The agent will correctly recall the information provided in earlier prompts, demonstrating its stateful nature.

```
> User: My name is Alex.
> Agent: It's nice to meet you, Alex.

> User: I live in San Francisco.
> Agent: San Francisco is a beautiful city.

> User: What is my name and where do I live?
> Agent: Your name is Alex and you live in San Francisco.
```

##  Implementation Checklist

### ✅ Basic Setup

- [ ] Install NeuroLink SDK
- [ ] Configure API keys in .env
- [ ] Test basic generation
- [ ] Enable analytics tracking
- [ ] Add evaluation scoring

### ✅ Production Setup

- [ ] Implement quality gates
- [ ] Set up cost monitoring
- [ ] Create analytics dashboard
- [ ] Configure department tracking
- [ ] Set up batch processing

### ✅ Optimization

- [ ] Model selection strategy
- [ ] Cost optimization rules
- [ ] Quality improvement process
- [ ] Performance monitoring
- [ ] ROI measurement

##  Next Steps

1. **Start Simple**: Basic analytics and evaluation
2. **Add Quality Gates**: Implement quality thresholds
3. **Monitor Costs**: Track spending by department/usage
4. **Optimize**: Use data to improve cost and quality
5. **Scale**: Implement across organization

Each tutorial builds on the previous ones - start with the Quick Start and progress based on your needs.

---

## Industry Use Cases: Real-World Applications

<!-- Source: use-cases.md -->

#  Industry Use Cases: Real-World Applications

This guide shows how different industries use NeuroLink's analytics and evaluation features to solve specific business problems with measurable results.

##  E-commerce & Retail

### Product Description Generation

**Business Challenge:** Generate 50,000+ product descriptions monthly while controlling costs and maintaining quality.

**Solution Implementation:**

```javascript
// E-commerce product description with cost optimization
const productResult = await provider.generate({
  input: {
    text: `Write compelling product description for: ${product.name}
  Features: ${product.features.join(", ")}
  Target audience: ${product.targetAudience}`,
  },

  enableAnalytics: true,
  enableEvaluation: true,
  context: {
    department: "marketing",
    product_category: product.category,
    price_tier: product.priceTier, // budget, mid-range, premium
    word_count_target: 150,
  },
});

// Quality gates based on product value
if (product.priceTier === "premium" && productResult.evaluation.overall  0.05) {
  // Switch to cheaper model for basic items
  await optimizeModelSelection(product.category);
}
```

**Business Results:**

- **Cost Reduction:** 65% ($1,200 → $420/month)
- **Quality Consistency:** 90% descriptions meet brand standards
- **Productivity:** 10x faster than manual writing
- **A/B Testing:** 23% higher conversion rates

### Product Video Generation

**Business Challenge:** Create engaging product videos at scale for social media and e-commerce listings.

**Solution Implementation:**

```javascript

const neurolink = new NeuroLink();

try {
  // Generate product showcase video from image
  const videoResult = await neurolink.generate({
    input: {
      text: `Smooth camera movement showcasing ${product.name}
      with elegant rotation revealing product details`,
      images: [await readFile(product.heroImagePath)],
    },
    provider: "vertex",
    model: "veo-3.1",
    output: {
      mode: "video",
      video: {
        resolution: "1080p",
        length: 8,
        aspectRatio: product.platform === "instagram" ? "9:16" : "16:9",
        audio: true,
      },
    },
    enableAnalytics: true,
  });

  if (videoResult.video) {
    await writeFile(`${product.id}-showcase.mp4`, videoResult.video.data);
    // Use your logger instead: logger.info(`Video generated: ${videoResult.video.metadata?.duration}s`)
  }
} catch (error) {
  // Handle video generation errors (quota exceeded, invalid format, timeout, etc.)
  // Use your logger instead: logger.error('Video generation failed', { error, productId: product.id })
  throw error;
}
```

**Business Results:**

- **Content Velocity:** 50x faster than traditional video production
- **Cost Savings:** 90% reduction vs. professional video shoots
- **Engagement:** 40% higher engagement on video vs. static images
- **Scale:** Generate videos for entire product catalog

### Customer Review Response

**CLI Implementation:**

```bash
# Respond to customer reviews with quality control
npx @juspay/neurolink generate "Professional response to: 'Product broke after 2 days'" \
  --enable-analytics --enable-evaluation \
  --context '{"response_type":"customer_service","sentiment":"negative","priority":"high"}' \
  --debug

# Quality thresholds for customer-facing content:
# Relevance: >8 (must address customer concern)
# Accuracy: >9 (factual information only)
# Completeness: >7 (full response to issue)
```

##  Healthcare & Medical

### Patient Education Content

**Business Challenge:** Create accurate, compliant patient education materials while meeting strict regulatory requirements.

**Solution Implementation:**

```javascript
// Medical content with strict accuracy requirements
const medicalContent = await provider.generate({
  input: {
    text: `Create patient education content about diabetes management.
  Include: diet guidelines, exercise recommendations, monitoring tips.
  Audience: Adult patients, 6th grade reading level.`,
  },

  enableAnalytics: true,
  enableEvaluation: true,
  context: {
    content_type: "patient_education",
    medical_condition: "diabetes",
    audience_level: "general_public",
    regulatory_compliance: "FDA_guidelines",
    accuracy_threshold: 95,
  },
});

// Strict medical content quality gates
if (medicalContent.evaluation.accuracy 9.5 (medical facts must be precise)
# Completeness: >9 (all symptoms and treatments covered)
# Clinical review: Always required regardless of scores
```

##  Financial Services

### Investment Report Generation

**Business Challenge:** Create accurate, timely investment reports while managing compliance and cost at scale.

**Solution Implementation:**

```javascript
// Financial report with compliance tracking
const investmentReport = await provider.generate({
  input: {
    text: `Generate quarterly investment performance report.
  Portfolio: ${portfolio.name}
  Performance data: ${portfolio.quarterlyData}
  Market context: ${marketData.summary}
  Regulatory requirements: SEC compliance required.`,
  },

  enableAnalytics: true,
  enableEvaluation: true,
  context: {
    report_type: "investment_performance",
    compliance_framework: "SEC_regulations",
    client_tier: portfolio.clientTier,
    confidentiality: "high",
    fact_check_required: true,
  },
});

// Financial compliance quality gates
if (investmentReport.evaluation.accuracy 9 (financial facts must be correct)
# Completeness: >8 (all risks and disclaimers included)
# Regulatory review: Required for all financial advice
```

##  SaaS & Technology

### Customer Support Automation

**Business Challenge:** Scale customer support while maintaining quality and reducing response times.

**Solution Implementation:**

```javascript
// Automated customer support with quality control
const supportResponse = await provider.generate({
  input: {
    text: `Customer issue: "${ticket.description}"
  Product: ${ticket.product}
  Customer tier: ${customer.tier}
  Previous interactions: ${ticket.history}
  Create helpful, professional response.`,
  },

  enableAnalytics: true,
  enableEvaluation: true,
  context: {
    ticket_type: ticket.category,
    customer_tier: customer.tier,
    urgency: ticket.priority,
    product_area: ticket.product,
    response_time_target: "9 (code examples must work)
# Completeness: >8 (all parameters documented)
# Technical review: Required for all API docs
```

##  Education & Training

### Course Content Creation

**Business Challenge:** Create engaging, accurate educational content at scale while tracking costs per course.

**Solution Implementation:**

```javascript
// Educational content with learning outcome tracking
const courseContent = await provider.generate({
  input: {
    text: `Create lesson content: "${lesson.title}"
  Learning objectives: ${lesson.objectives.join(", ")}
  Target audience: ${course.audience}
  Duration: ${lesson.duration} minutes
  Include examples, exercises, and key takeaways.`,
  },

  enableAnalytics: true,
  enableEvaluation: true,
  context: {
    content_type: "educational",
    subject_area: course.subject,
    grade_level: course.gradeLevel,
    learning_style: "mixed",
    engagement_required: true,
  },
});

// Educational quality standards
if (courseContent.evaluation.completeness 8 (must match app functionality)
# Completeness: >7 (all key features mentioned)
# Marketing review: Required for all app store content
```

##  Hospitality & Travel

### Hotel Description Generation

**Business Challenge:** Create compelling hotel descriptions that drive bookings while managing content costs.

**Solution Implementation:**

```javascript
// Hotel marketing content with booking optimization
const hotelDescription = await provider.generate({
  input: {
    text: `Write compelling hotel description for: ${hotel.name}
  Location: ${hotel.location}
  Amenities: ${hotel.amenities.join(", ")}
  Target guests: ${hotel.targetGuests}
  Emphasize unique selling points and local attractions.`,
  },

  enableAnalytics: true,
  enableEvaluation: true,
  context: {
    content_type: "hotel_marketing",
    hotel_category: hotel.starRating,
    location_type: hotel.locationType,
    booking_conversion_goal: true,
    brand_voice: hotel.brandVoice,
  },
});

// Hospitality content quality standards
if (hotelDescription.evaluation.relevance 95%)
- [ ] Regulatory compliance validation
- [ ] Medical professional review workflows
- [ ] Patient comprehension optimization

### Financial Services Setup

- [ ] Compliance framework integration
- [ ] Fact-checking requirements
- [ ] Risk disclosure automation
- [ ] Client tier cost tracking

### SaaS/Technology Setup

- [ ] Customer tier quality differentiation
- [ ] Response time optimization
- [ ] Technical accuracy validation
- [ ] Scalability cost tracking

### Education Setup

- [ ] Learning objective alignment
- [ ] Grade-level appropriate content
- [ ] Engagement quality metrics
- [ ] Curriculum compliance checking

##  Getting Started by Industry

1. **Choose Your Industry Template** - Use examples above as starting point
2. **Define Quality Thresholds** - Set accuracy/relevance requirements
3. **Implement Cost Tracking** - Add analytics with industry context
4. **Set Up Quality Gates** - Automate review workflows
5. **Measure Business Impact** - Track ROI and quality improvements

Each industry has specific requirements for accuracy, compliance, and quality - the examples above show proven patterns for success in real-world deployments.

---

## Visual Demonstrations

<!-- Source: visual-demos.md -->

#  Visual Demonstrations

Experience NeuroLink's capabilities through comprehensive visual documentation. **No installation required!**

##  Web Demo Interface

### Interactive Screenshots

| Feature                    | Screenshot                                    | Description                                                  |
| -------------------------- | --------------------------------------------- | ------------------------------------------------------------ |
| **Main Interface**         | _[Screenshots available in demo application]_ | Complete web interface showing all features and capabilities |
| **AI Generation Results**  | _[Screenshots available in demo application]_ | Real AI content generation with OpenAI GPT-4o                |
| **Business Use Cases**     | _[Screenshots available in demo application]_ | Professional business applications and workflows             |
| **Creative Tools**         | _[Screenshots available in demo application]_ | Creative content generation and storytelling                 |
| **Developer Tools**        | _[Screenshots available in demo application]_ | Code generation, API documentation, debugging help           |
| **Analytics & Monitoring** | _[Screenshots available in demo application]_ | Real-time provider analytics and performance metrics         |

### Complete Demo Videos

**5,681+ tokens of real AI generation captured!**

#### **Basic Examples** - _[Demo videos available in live application]_

- Text generation fundamentals
- Haiku creation with Claude 3.7 Sonnet
- Creative storytelling with OpenAI GPT-4o
- **Content Generated**: 529 tokens (robot painting story)

#### **Business Use Cases** - _[Demo videos available in live application]_

- Professional email generation
- Business analysis and reporting
- Executive summaries and insights
- **Content Generated**: 1,677 tokens (email + analysis + summaries)

#### **Creative Tools** - _[Demo videos available in live application]_

- Story writing and narrative creation
- Language translation capabilities
- Creative brainstorming and ideation
- **Content Generated**: 1,174 tokens (stories + translation + ideas)

#### **Developer Tools** - _[Demo videos available in live application]_

- React component generation
- API documentation creation
- Code debugging and optimization
- **Content Generated**: 2,301 tokens (React code + API docs + debugging)

#### **Monitoring & Analytics** - _[Demo videos available in live application]_

- Live provider status monitoring
- Performance metrics tracking
- Usage analytics and insights
- **Real-time Demonstrations**: Provider connectivity and response times

### Live Interactive Demo

**Express.js Server with Real API Integration**

- **All 3 providers functional**: OpenAI, Amazon Bedrock, Google Vertex AI
- **15+ use cases demonstrated**: Business, creative, and developer scenarios
- **Real-time provider analytics**: Performance metrics and status monitoring
- **Working endpoints**: `/api/generate`, `/api/stream`, `/api/status`, `/api/benchmark`

**Access**: Run the demo server from the `neurolink-demo/` directory

```bash
cd neurolink-demo
npm install
npm start
# Open http://localhost:9876
```

**Note**: If port 9876 is already in use, the server will automatically find the next available port. Check the terminal output for the actual port number.

## ️ CLI Demonstrations

### Professional CLI Screenshots _(Latest: June 10, 2025)_

| Command                     | Screenshot                                                                                                | Description                                                 |
| --------------------------- | --------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------- |
| **CLI Help Overview**       | [Image: CLI Help]               | Complete command reference and usage examples               |
| **Provider Status Check**   | [Image: Provider Status] | All provider connectivity verification with response times  |
| **Text Generation**         | [Image: Text Generation] | Real AI haiku generation with JSON output and usage metrics |
| **Auto Provider Selection** | [Image: Best Provider]     | Automatic provider selection algorithm demonstration        |
| **Batch Processing**        | [Image: Batch Results]     | Multi-prompt processing with progress tracking and results  |

### CLI Demonstration Videos

**Real command execution with live AI generation**

#### **CLI Help Overview** - [ MP4](pathname:///docs/visual-content/cli-videos/cli-01-cli-help.mp4)

- Complete help system demonstration
- Command reference and usage examples
- Provider configuration overview
- **Size**: 44KB - Professional MP4 with comprehensive command overview

#### **Provider Status** - [ MP4](pathname:///docs/visual-content/cli-videos/cli-02-provider-status.mp4)

- All provider connectivity verification (now with authentication and model availability checks)
- Response time measurements
- Authentication status checking
- **Size**: 496KB - Professional MP4 showing provider connectivity

#### **Text Generation** - [ MP4](pathname:///docs/visual-content/cli-videos/cli-03-text-generation.mp4)

- Text generation with different providers
- Temperature and token control demonstrations
- JSON vs text output formats
- **Size**: 100KB - Professional MP4 with real AI generation

#### **Auto Provider Selection** - [ MP4](pathname:///docs/visual-content/cli-videos/cli-04-auto-selection.mp4)

- Automatic provider selection algorithm
- Fallback mechanism demonstration
- Performance-based selection
- **Size**: Professional MP4 showing selection logic

#### **Streaming Generation** - [ MP4](pathname:///docs/visual-content/cli-videos/cli-05-streaming.mp4)

- Live AI content streaming demonstration
- Real-time text generation as it happens
- Provider performance comparison
- **Size**: Professional MP4 with live streaming

#### **Advanced Features** - [ MP4](pathname:///docs/visual-content/cli-videos/cli-06-advanced-features.mp4)

- Verbose diagnostics and debugging
- Provider-specific command options
- Advanced configuration and customization
- **Size**: Professional MP4 with comprehensive advanced features

### CLI Recording Infrastructure

**Professional asciinema recordings available:**

```bash
# View locally (requires asciinema)
asciinema play docs/cli-recordings/latest/01-cli-help.cast
asciinema play docs/cli-recordings/latest/02-provider-status.cast
asciinema play docs/cli-recordings/latest/03-text-generation.cast
asciinema play docs/cli-recordings/latest/04-auto-selection.cast
asciinema play docs/cli-recordings/latest/05-streaming.cast
asciinema play docs/cli-recordings/latest/06-advanced-features.cast
```

**Features:**

- **Web Embeddable**: Upload to asciinema.org with `[![asciicast]` tags
- **GIF Convertible**: Use `agg` tool for animated GIF creation
- **Professional Quality**: Suitable for documentation, tutorials, marketing
- **Real Command Execution**: Actual CLI commands with live AI generation

##  MCP (Model Context Protocol) Demonstrations

### MCP CLI Screenshots

**Generated January 10, 2025** - Showcasing external server integration capabilities

| Command                  | Screenshot                                                                                 | Description                                                |
| ------------------------ | ------------------------------------------------------------------------------------------ | ---------------------------------------------------------- |
| **MCP Help Overview**    | [Image: MCP Help]               | Complete MCP command reference and server management       |
| **Server Installation**  | [Image: Install Server]      | Installing external MCP servers (filesystem, github, etc.) |
| **Server Status Check**  | [Image: Server Status]   | MCP server connectivity and status verification            |
| **Server Testing**       | [Image: Test Server]     | Testing MCP server connectivity and tool discovery         |
| **Custom Server Setup**  | [Image: Custom Server] | Adding custom MCP server configurations                    |
| **Workflow Integration** | [Image: Workflow Demo] | Complete MCP workflow demonstrations                       |

### MCP Demo Videos

**Real external server integration demonstrations**

#### **Server Management** - [ MP4](pathname:///docs/videos/mcp-server-management-demo.mp4)

- Installing and configuring MCP servers
- Server lifecycle management
- Status monitoring and health checks
- **Duration**: ~45 seconds of real server management

**Note**: Additional MCP demo videos are in development. The server management demo showcases the core MCP integration capabilities.

### MCP CLI Commands Demonstrated

```bash
# Server Management
neurolink mcp install filesystem
neurolink mcp list --status
neurolink mcp test filesystem
neurolink mcp add custom-python "python /path/to/server.py"
neurolink mcp remove server-name

# Tool Execution (framework ready)
neurolink mcp exec filesystem read-file --path "/path/to/file"
neurolink generate "Read README and summarize" --tools filesystem
```

### MCP Integration Benefits

- ✅ **External Server Connectivity**: Connect to filesystem, github, database, and custom servers
- ✅ **Tool Discovery**: Automatic discovery of available tools from MCP servers
- ✅ **Workflow Integration**: Combine AI generation with external tool execution
- ✅ **Extensible Architecture**: Add new capabilities through external servers
- ✅ **Standard Protocol**: Compatible with existing MCP server ecosystem

##  Visual Content Benefits

### **No Installation Required**

See everything in action before installing:

- Complete feature demonstrations
- Real AI content generation
- Provider connectivity validation
- Performance metrics and analytics

### **Production Validation**

All visual content shows real functionality:

- ✅ **Actual AI Generation**: 5,681+ tokens of real content
- ✅ **Working Providers**: OpenAI, Bedrock, Vertex AI all functional
- ✅ **Real Performance**: Actual response times and metrics
- ✅ **Live Demonstrations**: No simulated or mocked content

### **Professional Quality**

Suitable for all documentation uses:

-  **1920x1080 Resolution**: High-definition screenshots and videos
-  **Professional Styling**: Clean, consistent visual presentation
-  **Comprehensive Coverage**: Every major feature documented
-  **Easy Integration**: Ready for embedding in documentation

### **Multiple Formats**

Choose the best format for your needs:

- **Screenshots**: Quick visual reference and feature overview
- **Videos**: Dynamic demonstrations with real interactions
- **Asciinema Recordings**: Playable CLI demonstrations
- **Live Demo**: Interactive testing environment

##  Content Organization

```
neurolink/
├── neurolink-demo/                    # Web interface demonstrations
│   ├── screenshots/                   # 6 professional web screenshots
│   │   ├── 01-overview/              # Main interface overview
│   │   ├── 02-basic-examples/        # AI generation results
│   │   ├── 03-business-use-cases/    # Business applications
│   │   ├── 04-creative-tools/        # Creative content generation
│   │   ├── 05-developer-tools/       # Code generation and docs
│   │   └── 06-monitoring/            # Analytics and monitoring
│   └── videos/                       # Complete demo videos (WebM + MP4)
│       ├── basic-examples.webm/.mp4  # Text generation fundamentals
│       ├── business-use-cases.*      # Professional applications
│       ├── creative-tools.*          # Creative content creation
│       ├── developer-tools.*         # Code generation and APIs
│       ├── monitoring-analytics.*    # Real-time analytics
│       └── mcp-demos/               # MCP server integration demos
├── docs/visual-content/              # CLI demonstrations
│   ├── screenshots/cli-screenshots/  # Professional CLI screenshots
│   └── cli-videos/                  # CLI demonstration videos
│       ├── cli-01-cli-help.mp4      # Help command overview
│       ├── cli-02-provider-status.mp4 # Provider connectivity
│       ├── cli-03-text-generation.mp4 # AI generation demos
│       ├── cli-04-auto-selection.mp4  # Auto provider selection
│       ├── cli-05-streaming.mp4      # Real-time streaming
│       ├── cli-06-advanced-features.mp4 # Advanced features
│       └── cli-advanced-features/    # MCP command demos
└── docs/cli-recordings/             # Professional asciinema recordings
    └── latest/                      # 6 .cast files for web embedding
```

##  Getting Started with Visual Content

### Quick Demo Access

1. **Web Interface**: `cd neurolink-demo && npm start`
2. **CLI Testing**: `npx @juspay/neurolink status`
3. **Screenshots**: Browse the visual content directories
4. **Videos**: Open video files in your preferred player

### Recording Your Own Demos

1. **CLI Recording**: Use the provided automation scripts
2. **Web Recording**: Browser automation with Playwright
3. **Screenshot Creation**: Automated capture with consistent styling
4. **Professional Quality**: Follow established visual standards

### Integration in Documentation

- **README Files**: Embed screenshots and video links
- **API Documentation**: Visual examples alongside code
- **Tutorials**: Step-by-step visual guides
- **Marketing**: Professional quality content for promotion

---

[← Back to Main README](/docs/) | [Next: Error Handling →](/docs/workflows/error-handling)

---

# Getting Started

## Getting Started

<!-- Source: getting-started/index.md -->

# Getting Started

Welcome to NeuroLink! This section will help you get up and running quickly with the Enterprise AI Development Platform.

##  What You'll Learn

- ⏱️ **[Quick Start](/docs/getting-started/quick-start)**

  Get NeuroLink working in under 2 minutes with basic examples for both CLI and SDK usage.

-  **[Installation](/docs/getting-started/installation)**

  Detailed installation instructions for different environments and package managers.

-  **[Provider Setup](/docs/getting-started/provider-setup)**

  Configure API keys and credentials for all 9 supported AI providers with step-by-step guides.

- ⚙️ **[Environment Variables](/docs/getting-started/environment-variables)**

  Complete reference for all environment variables and configuration options.

##  Choose Your Path

##  Prerequisites

- **Node.js 18+** (for SDK usage)
- **npm/pnpm/yarn** (package manager)
- **API keys** for at least one AI provider

:::tip[Free Options Available]

You can start with free providers like Google AI Studio, Hugging Face, or local Ollama to test NeuroLink without costs.
:::

##  Next Steps

1. **[Quick Start](/docs/getting-started/quick-start)** - Get running in 2 minutes
2. **[Provider Setup](/docs/getting-started/provider-setup)** - Configure your AI providers
3. **[CLI Guide](/docs/)** or **[SDK Reference](/docs/)** - Deep dive into usage
4. **[Examples](/docs/)** - See real-world applications

---

## AI Provider Guides

<!-- Source: getting-started/providers/index.md -->

# AI Provider Guides

Complete setup guides for all supported AI providers.

##  Enterprise Providers

Production-grade providers for enterprise deployments:

### [Azure OpenAI](/docs/getting-started/providers/azure-openai)

**Enterprise AI with Microsoft Azure**

-  SOC2, HIPAA, ISO 27001 compliant
-  Multi-region deployment (30+ regions)
- ️ Private endpoints with VNet
-  Enterprise SLAs

[Setup Guide →](/docs/getting-started/providers/azure-openai)

### [Google Vertex AI](/docs/getting-started/providers/google-vertex)

**Google Cloud ML platform**

- ☁️ GCP integration
-  IAM, VPC, service accounts
-  Global deployment
-  Gemini, PaLM, Codey models

[Setup Guide →](/docs/getting-started/providers/google-vertex)

### [AWS Bedrock](/docs/getting-started/providers/aws-bedrock)

**Serverless AI on AWS**

-  13 foundation models (Claude, Llama, Mistral)
-  IAM, VPC integration
-  Multi-region (us-east-1, eu-west-1, ap-southeast-1)
-  Pay-per-use pricing

[Setup Guide →](/docs/getting-started/providers/aws-bedrock)

---

##  Compliance-Focused

Providers with specific compliance certifications:

### [Mistral AI](/docs/getting-started/providers/mistral)

**European AI with GDPR compliance**

- 🇪🇺 EU data residency
- ✅ GDPR compliant by default
-  Open source models
-  Cost-effective

[Setup Guide →](/docs/getting-started/providers/mistral)

---

##  Aggregators & Proxies

Access multiple providers through unified interfaces:

### [OpenRouter](/docs/getting-started/providers/openrouter)

**300+ models from 60+ providers**

-  Single API for all major providers (Anthropic, OpenAI, Google, Meta, etc.)
- ⚡ Automatic failover and routing
-  Competitive pricing with cost optimization
-  Zero lock-in - switch models instantly
-  Usage tracking dashboard
- 🆓 Free models available

[Setup Guide →](/docs/getting-started/providers/openrouter)

### [OpenAI Compatible](/docs/getting-started/providers/openai-compatible)

**OpenRouter, vLLM, LocalAI, and more**

-  100+ models through OpenRouter
-  Local deployment with vLLM
-  Self-hosted with LocalAI
-  Drop-in OpenAI replacement

[Setup Guide →](/docs/getting-started/providers/openai-compatible)

### [LiteLLM](/docs/getting-started/providers/litellm)

**100+ providers through proxy**

-  Unified API for 100+ providers
-  Load balancing and fallbacks
-  Cost tracking
-  Model routing

[Setup Guide →](/docs/getting-started/providers/litellm)

---

## Quick Comparison

| Provider                                  | Free Tier | Enterprise | GDPR   | Latency | Best For                        |
| ----------------------------------------- | --------- | ---------- | ------ | ------- | ------------------------------- |
| [Hugging Face](/docs/getting-started/providers/huggingface)            | ✅        | ❌         | ✅     | Medium  | Open source, experimentation    |
| [Google AI](/docs/getting-started/providers/google-ai)                 | ✅        | ✅         | ✅     | Low     | Free tier, Gemini               |
| [Mistral AI](/docs/getting-started/providers/mistral)                  | ❌        | ✅         | ✅     | Low     | EU compliance, cost             |
| [OpenRouter](/docs/getting-started/providers/openrouter)               | ✅        | ✅         | Varies | Low     | Multi-model, automatic failover |
| [OpenAI Compatible](/docs/getting-started/providers/openai-compatible) | Varies    | ✅         | Varies | Varies  | Flexibility, local deployment   |
| [LiteLLM](/docs/getting-started/providers/litellm)                     | ❌        | ✅         | Varies | Low     | Multi-provider, unified API     |
| [Azure OpenAI](/docs/getting-started/providers/azure-openai)           | ❌        | ✅         | ✅     | Low     | Enterprise, Microsoft ecosystem |
| [Vertex AI](/docs/getting-started/providers/google-vertex)             | ❌        | ✅         | ✅     | Low     | Enterprise, GCP ecosystem       |
| [AWS Bedrock](/docs/getting-started/providers/aws-bedrock)             | ❌        | ✅         | ✅     | Low     | Enterprise, AWS ecosystem       |

---

## Setup Strategies

### Strategy 1: Free Tier First (Recommended for Development)

```typescript
const ai = new NeuroLink({
providers: [
{
name: 'google-ai',
priority: 1,
config: { apiKey: process.env.GOOGLE_AI_KEY },
quotas: { daily: 1500 }
},
{
name: 'openai',
priority: 2,
config: { apiKey: process.env.OPENAI_API_KEY }
}
],
failoverConfig: { enabled: true, fallbackOnQuota: true }
});

    const result = await ai.generate({
      input: { text: "Hello world" }
    });
    ```

```bash
# Set up environment variables
export GOOGLE_AI_KEY="your-key"
export OPENAI_API_KEY="your-key"

    # Use with automatic failover
    npx @juspay/neurolink generate "Hello world" \
      --provider google-ai
    ```

### Strategy 2: Multi-Region Enterprise

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "azure-us",
      region: "us-east",
      config: {
        /* Azure US */
      },
    },
    {
      name: "azure-eu",
      region: "eu-west",
      config: {
        /* Azure EU */
      },
    },
    {
      name: "bedrock-us",
      region: "us-east",
      config: {
        /* Bedrock US */
      },
    },
  ],
  loadBalancing: "latency-based",
});
```

### Strategy 3: GDPR Compliance

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "mistral",
      priority: 1,
      config: { apiKey: process.env.MISTRAL_API_KEY },
    },
    {
      name: "azure-eu",
      priority: 2,
      config: {
        /* Azure EU region */
      },
    },
  ],
  compliance: {
    framework: "GDPR",
    dataResidency: "EU",
  },
});
```

---

## Next Steps

1. **Choose a provider** based on your requirements (free tier, compliance, region)
2. **Follow the setup guide** to get your API key
3. **Configure NeuroLink** with the provider
4. **Test the integration** with a simple request
5. **Add failover** for production reliability

---

## Related Documentation

- **[Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover)** - High availability patterns
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce costs by 80-95%
- **[Compliance & Security](/docs/guides/enterprise/compliance)** - GDPR, SOC2, HIPAA
- **[Load Balancing](/docs/guides/enterprise/load-balancing)** - Distribution strategies

---

## Quick Start

<!-- Source: getting-started/quick-start.md -->

# Quick Start

Get NeuroLink running in under 2 minutes with this quick start guide.

##  Prerequisites

- **Node.js 18+**
- **npm/pnpm/yarn** package manager
- **API key** for at least one AI provider (we recommend starting with Google AI Studio - it has a free tier)

## ⚡ 1-Minute Setup

### Option 1: CLI Usage (No Installation)

```bash
# Set up your API key (Google AI Studio has free tier)
export GOOGLE_AI_API_KEY="AIza-your-google-ai-api-key"

# Generate text instantly
npx @juspay/neurolink generate "Hello, AI"
npx @juspay/neurolink gen "Hello, AI"        # Shortest form

# Check provider status
npx @juspay/neurolink status
```

### Option 2: SDK Installation

```bash
# Install for your project
npm install @juspay/neurolink
```

```typescript

const neurolink = new NeuroLink();
const result = await neurolink.generate({
  input: { text: "Write a haiku about programming" },
  provider: "google-ai",
});

console.log(result.content);
console.log(`Used: ${result.provider}`);
```

### Write Once, Run Anywhere

NeuroLink's power is in its provider-agnostic design. Write your code once, and NeuroLink automatically uses the best available provider. If your primary provider fails, it seamlessly falls back to another, ensuring your application remains robust.

```typescript

// No provider specified - NeuroLink handles it!
const neurolink = new NeuroLink();

// This code works with OpenAI, Google, Anthropic, etc. without any changes.
const result = await neurolink.generate({
  input: { text: "Explain quantum computing simply." },
});

console.log(result.content);
console.log(`AI Provider Used: ${result.provider}`);
```

##  Get API Keys

### Google AI Studio (Free Tier Available)

1. Visit [Google AI Studio](https://aistudio.google.com/)
2. Sign in with your Google account
3. Click "Get API Key"
4. Create a new API key
5. Copy and use: `export GOOGLE_AI_API_KEY="AIza-your-key"`

### Other Providers

- **OpenAI**: [platform.openai.com](https://platform.openai.com/)
- **Anthropic**: [console.anthropic.com](https://console.anthropic.com/)
- **LiteLLM**: Access 100+ models through one proxy server (requires setup)
- **Ollama**: Local installation, no API key needed

## ✅ Verify Setup

```bash
# Check all configured providers
npx @juspay/neurolink status

# Test with built-in tools
npx @juspay/neurolink generate "What time is it?" --debug

# Test without tools (pure text generation)
npx @juspay/neurolink generate "Write a poem" --disable-tools
```

##  Next Steps

1. **[Provider Setup](/docs/getting-started/provider-setup)** - Configure multiple AI providers
2. **[CLI Loop Sessions](/docs/features/cli-loop-sessions)** - Try persistent interactive mode with memory
3. **[CLI Commands](/docs/cli/commands)** - Learn all available commands
4. **[SDK Reference](/docs/sdk/api-reference)** - Integrate into your applications
5. **[Examples](/docs/examples/basic-usage)** - See practical implementations

**Latest Features:**

- [Multimodal Chat](/docs/features/multimodal-chat) - Add images to your prompts
- [Auto Evaluation](/docs/features/auto-evaluation) - Quality scoring for responses
- [Guardrails](/docs/features/guardrails) - Content filtering and safety

## 🆘 Need Help?

- **Not working?** Check our [Troubleshooting Guide](/docs/reference/troubleshooting)
- **Questions?** See our [FAQ](/docs/reference/faq)
- **Issues?** Report on [GitHub](https://github.com/juspay/neurolink/issues)

---

## Installation

<!-- Source: getting-started/installation.md -->

# Installation

Complete installation guide for NeuroLink CLI and SDK across different environments.

##  Choose Your Installation Method

```bash
# Direct usage (recommended)
npx @juspay/neurolink generate "Hello, AI"

# Global installation (optional)
npm install -g @juspay/neurolink
neurolink generate "Hello, AI"
```

```bash
# npm
npm install @juspay/neurolink

# pnpm
pnpm add @juspay/neurolink

# yarn
yarn add @juspay/neurolink
```

```bash
git clone https://github.com/juspay/neurolink
cd neurolink
pnpm install
npx husky install       # Setup git hooks for build rule enforcement
pnpm setup:complete     # Complete automated setup
pnpm run validate:all   # Validate build rules and quality
```

##  System Requirements

### Minimum Requirements

- **Node.js**: 18.0.0 or higher
- **npm**: 8.0.0 or higher
- **pnpm**: 8.0.0 or higher (recommended)

### Supported Platforms

- **macOS**: 10.15+ (Intel and Apple Silicon)
- **Linux**: Ubuntu 18.04+, CentOS 7+, Debian 9+
- **Windows**: 10+ (WSL recommended for best experience)

### Check Your Environment

```bash
# Check Node.js version
node --version  # Should be 18.0.0+

# Check npm version
npm --version   # Should be 8.0.0+

# Check if TypeScript support is available (optional)
npx tsc --version
```

##  Environment Setup

### 1. API Keys Configuration

Create a `.env` file in your project root:

```bash
# Create .env file
touch .env

# Add your API keys
echo 'GOOGLE_AI_API_KEY="AIza-your-google-ai-key"' >> .env
echo 'OPENAI_API_KEY="sk-your-openai-key"' >> .env
echo 'ANTHROPIC_API_KEY="sk-ant-your-key"' >> .env
```

### 2. Verify Installation

```bash
# Test CLI installation
npx @juspay/neurolink --version

# Test provider connectivity
npx @juspay/neurolink status

# Test basic generation
npx @juspay/neurolink generate "Hello, world!"
```

### 3. TypeScript Setup (Optional)

For TypeScript projects, NeuroLink includes full type definitions:

```typescript
// tsconfig.json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "ESNext",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "allowSyntheticDefaultImports": true,
    "strict": true
  }
}
```

```typescript
// test.ts

const neurolink = new NeuroLink();
// Full TypeScript IntelliSense available
```

##  Framework-Specific Setup

### Next.js

```bash
npm install @juspay/neurolink
```

```typescript
// app/api/ai/route.ts

export async function POST(request: Request) {
  const { prompt } = await request.json();
  const neurolink = new NeuroLink();

  const result = await neurolink.generate({
    input: { text: prompt },
  });

  return Response.json({ content: result.content });
}
```

### SvelteKit

```bash
npm install @juspay/neurolink
```

```typescript
// src/routes/api/ai/+server.ts

export const POST: RequestHandler = async ({ request }) => {
  const { prompt } = await request.json();
  const neurolink = new NeuroLink();

  const result = await neurolink.generate({
    input: { text: prompt },
  });

  return new Response(JSON.stringify({ content: result.content }));
};
```

### Express.js

```bash
npm install @juspay/neurolink express
```

```typescript

const app = express();
const neurolink = new NeuroLink();

app.post("/api/generate", async (req, res) => {
  const result = await neurolink.generate({
    input: { text: req.body.prompt },
  });

  res.json({ content: result.content });
});

app.listen(3000);
```

##  Docker Setup

```dockerfile
# Dockerfile
FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
RUN npm install

COPY . .
RUN npm run build

EXPOSE 3000
CMD ["npm", "start"]
```

```yaml
# docker-compose.yml
version: "3.8"
services:
  neurolink-app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - GOOGLE_AI_API_KEY=${GOOGLE_AI_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - .env:/app/.env
```

##  Security Considerations

### Environment Variables

```bash
# Never commit API keys to version control
echo ".env" >> .gitignore

# Use environment-specific files
cp .env .env.example
# Remove actual keys from .env.example
```

### Production Deployment

```bash
# Use secure secret management
# AWS: AWS Secrets Manager
# Azure: Azure Key Vault
# Google Cloud: Secret Manager
# Kubernetes: Secrets

# Example with environment variables
export GOOGLE_AI_API_KEY="$(cat /secrets/google-ai-key)"
export OPENAI_API_KEY="$(cat /secrets/openai-key)"
```

##  Troubleshooting

### Common Issues

**Node.js version error:**

```bash
# Update Node.js to 18+
nvm install 18
nvm use 18
```

**Permission errors on Linux/macOS:**

```bash
# Fix npm permissions
sudo chown -R $(whoami) ~/.npm
```

**TypeScript errors:**

```bash
# Install type definitions
npm install -D @types/node typescript
```

**Import/export errors:**

```bash
# Ensure package.json has "type": "module"
echo '"type": "module"' >> package.json
```

### Getting Help

1. **Check our [Troubleshooting Guide](/docs/reference/troubleshooting)**
2. **Review [FAQ](/docs/reference/faq)**
3. **Search [GitHub Issues](https://github.com/juspay/neurolink/issues)**
4. **Create new issue** with:
   - Node.js version (`node --version`)
   - Operating system
   - Error message
   - Steps to reproduce

## ✅ Verification Checklist

- [ ] Node.js 18+ installed
- [ ] NeuroLink package installed or accessible via npx
- [ ] API keys configured in `.env` file
- [ ] `neurolink status` shows working providers
- [ ] Basic generation command works
- [ ] TypeScript support (if needed)
- [ ] Framework integration (if applicable)

##  Next Steps

1. **[Quick Start](/docs/getting-started/quick-start)** - Test your installation
2. **[Provider Setup](/docs/getting-started/provider-setup)** - Configure AI providers
3. **[CLI Commands](/docs/cli/commands)** - Learn available commands
4. **[Examples](/docs/examples/basic-usage)** - See implementation patterns

---

## Environment Variables Configuration Guide

<!-- Source: getting-started/environment-variables.md -->

#  Environment Variables Configuration Guide

This guide provides comprehensive setup instructions for all AI providers supported by NeuroLink. The CLI automatically loads environment variables from `.env` files, making configuration seamless.

##  Quick Setup

### Automatic .env Loading ✨ NEW!

NeuroLink CLI automatically loads environment variables from `.env` files in your project directory:

```bash
# Create .env file (automatically loaded)
echo 'OPENAI_API_KEY="sk-your-key"' > .env
echo 'AWS_ACCESS_KEY_ID="your-key"' >> .env

# Test configuration
npx @juspay/neurolink status
```

### Manual Export (Also Supported)

```bash
export OPENAI_API_KEY="sk-your-key"
export AWS_ACCESS_KEY_ID="your-key"
npx @juspay/neurolink status
```

## ️ Enterprise Configuration Management

### **✨ NEW: Automatic Backup System**

```bash
# Configure backup settings
NEUROLINK_BACKUP_ENABLED=true              # Enable automatic backups (default: true)
NEUROLINK_BACKUP_RETENTION=30              # Days to keep backups (default: 30)
NEUROLINK_BACKUP_DIRECTORY=.neurolink.backups  # Backup directory (default: .neurolink.backups)

# Config validation settings
NEUROLINK_VALIDATION_STRICT=false          # Strict validation mode (default: false)
NEUROLINK_VALIDATION_WARNINGS=true         # Show validation warnings (default: true)

# Provider status monitoring
NEUROLINK_PROVIDER_STATUS_CHECK=true       # Monitor provider availability (default: true)
NEUROLINK_PROVIDER_TIMEOUT=30000           # Provider timeout in ms (default: 30000)
```

### **Interface Configuration**

```bash
# MCP Registry settings
NEUROLINK_REGISTRY_CACHE_TTL=300           # Cache TTL in seconds (default: 300)
NEUROLINK_REGISTRY_AUTO_DISCOVERY=true     # Auto-discover MCP servers (default: true)
NEUROLINK_REGISTRY_STATS_ENABLED=true      # Enable registry statistics (default: true)

# Execution context settings
NEUROLINK_DEFAULT_TIMEOUT=30000            # Default execution timeout (default: 30000)
NEUROLINK_DEFAULT_RETRIES=3                # Default retry count (default: 3)
NEUROLINK_CONTEXT_LOGGING=info             # Context logging level (default: info)
```

### **Performance & Optimization**

```bash
# Tool execution settings
NEUROLINK_TOOL_EXECUTION_TIMEOUT=1000      # Tool execution timeout in ms (default: 1000)
NEUROLINK_PIPELINE_TIMEOUT=22000           # Pipeline execution timeout (default: 22000)
NEUROLINK_CACHE_ENABLED=true               # Enable execution caching (default: true)

# Error handling
NEUROLINK_AUTO_RESTORE_ENABLED=true        # Enable auto-restore on config failures (default: true)
NEUROLINK_ERROR_RECOVERY_ATTEMPTS=3        # Error recovery attempts (default: 3)
NEUROLINK_GRACEFUL_DEGRADATION=true        # Enable graceful degradation (default: true)
```

## 🆕 AI Enhancement Features

### Basic Enhancement Configuration

```bash
# AI response quality evaluation model (optional)
NEUROLINK_EVALUATION_MODEL="gemini-2.5-flash"
```

**Description**: Configures the AI model used for response quality evaluation when `--enable-evaluation` flag is used. Uses Google AI's fast Gemini 2.5 Flash model for quick quality assessment.

**Supported Models**:

- `gemini-2.5-flash` (default) - Fast evaluation processing
- `gemini-2.5-pro` - More detailed evaluation (slower)

**Usage**:

```bash
# Enable evaluation with default model
npx @juspay/neurolink generate "prompt" --enable-evaluation

# Enable both analytics and evaluation
npx @juspay/neurolink generate "prompt" --enable-analytics --enable-evaluation
```

##  Universal Evaluation System (Advanced)

### Primary Configuration

```bash
# Primary evaluation provider
NEUROLINK_EVALUATION_PROVIDER="google-ai"        # Default: google-ai

# Evaluation performance mode
NEUROLINK_EVALUATION_MODE="fast"                 # Options: fast, balanced, quality
```

**NEUROLINK_EVALUATION_PROVIDER**: Primary AI provider for evaluation

- **Options**: `google-ai`, `openai`, `anthropic`, `vertex`, `bedrock`, `azure`, `ollama`, `huggingface`, `mistral`
- **Default**: `google-ai`
- **Usage**: Determines which AI provider performs the quality evaluation

**NEUROLINK_EVALUATION_MODE**: Performance vs quality trade-off

- **Options**: `fast` (cost-effective), `balanced` (optimal), `quality` (highest accuracy)
- **Default**: `fast`
- **Usage**: Selects appropriate model for the provider (e.g., gemini-2.5-flash vs gemini-2.5-pro)

### Fallback Configuration

```bash
# Enable automatic fallback when primary provider fails
NEUROLINK_EVALUATION_FALLBACK_ENABLED="true"     # Default: true

# Fallback provider order (comma-separated)
NEUROLINK_EVALUATION_FALLBACK_PROVIDERS="openai,anthropic,vertex,bedrock"
```

**NEUROLINK_EVALUATION_FALLBACK_ENABLED**: Enable intelligent fallback system

- **Options**: `true`, `false`
- **Default**: `true`
- **Usage**: When enabled, automatically tries backup providers if primary fails

**NEUROLINK_EVALUATION_FALLBACK_PROVIDERS**: Backup provider order

- **Format**: Comma-separated provider names
- **Default**: `openai,anthropic,vertex,bedrock`
- **Usage**: Defines the order of providers to try if primary fails

### Performance Tuning

```bash
# Evaluation timeout (milliseconds)
NEUROLINK_EVALUATION_TIMEOUT="10000"             # Default: 10000 (10 seconds)

# Maximum tokens for evaluation response
NEUROLINK_EVALUATION_MAX_TOKENS="500"            # Default: 500

# Temperature for consistent evaluation
NEUROLINK_EVALUATION_TEMPERATURE="0.1"           # Default: 0.1 (low for consistency)

# Retry attempts for failed evaluations
NEUROLINK_EVALUATION_RETRY_ATTEMPTS="2"          # Default: 2
```

**Performance Variables**:

- **TIMEOUT**: Maximum time to wait for evaluation (prevents hanging)
- **MAX_TOKENS**: Limits evaluation response length (controls cost)
- **TEMPERATURE**: Lower values = more consistent scoring
- **RETRY_ATTEMPTS**: Number of retry attempts for transient failures

### Cost Optimization

```bash
# Prefer cost-effective models and providers
NEUROLINK_EVALUATION_PREFER_CHEAP="true"         # Default: true

# Maximum cost per evaluation (USD)
NEUROLINK_EVALUATION_MAX_COST_PER_EVAL="0.01"    # Default: $0.01
```

**NEUROLINK_EVALUATION_PREFER_CHEAP**: Cost optimization preference

- **Options**: `true`, `false`
- **Default**: `true`
- **Usage**: When enabled, prioritizes cheaper providers and models

**NEUROLINK_EVALUATION_MAX_COST_PER_EVAL**: Cost limit per evaluation

- **Format**: Decimal number (USD)
- **Default**: `0.01` ($0.01)
- **Usage**: Prevents expensive evaluations, switches to cheaper providers if needed

### Complete Universal Evaluation Example

```bash
# Comprehensive evaluation configuration
NEUROLINK_EVALUATION_PROVIDER="google-ai"
NEUROLINK_EVALUATION_MODEL="gemini-2.5-flash"
NEUROLINK_EVALUATION_MODE="balanced"
NEUROLINK_EVALUATION_FALLBACK_ENABLED="true"
NEUROLINK_EVALUATION_FALLBACK_PROVIDERS="openai,anthropic,vertex"
NEUROLINK_EVALUATION_TIMEOUT="15000"
NEUROLINK_EVALUATION_MAX_TOKENS="750"
NEUROLINK_EVALUATION_TEMPERATURE="0.2"
NEUROLINK_EVALUATION_PREFER_CHEAP="false"
NEUROLINK_EVALUATION_MAX_COST_PER_EVAL="0.05"
NEUROLINK_EVALUATION_RETRY_ATTEMPTS="3"
```

### Testing Universal Evaluation

```bash
# Test primary provider
npx @juspay/neurolink generate "What is AI?" --enable-evaluation --debug

# Test with custom domain
npx @juspay/neurolink generate "Fix this Python code" --enable-evaluation --evaluation-domain "Python expert"

# Test Lighthouse-style evaluation
npx @juspay/neurolink generate "Business analysis" --lighthouse-style --evaluation-domain "Business consultant"
```

---------- | ------------------------------- | ---------------------------------- |
| `HTTPS_PROXY` | Proxy server for HTTPS requests | `http://proxy.company.com:8080`    |
| `HTTP_PROXY`  | Proxy server for HTTP requests  | `http://proxy.company.com:8080`    |
| `NO_PROXY`    | Domains to bypass proxy         | `localhost,127.0.0.1,.company.com` |

### Authenticated Proxy

```bash
# Proxy with username/password authentication
HTTPS_PROXY="http://username:password@proxy.company.com:8080"
HTTP_PROXY="http://username:password@proxy.company.com:8080"
```

**All NeuroLink providers automatically use proxy settings when configured.**

**For detailed proxy setup** → See [Enterprise & Proxy Setup Guide](/docs/deployment/enterprise-proxy)

##  Provider Configuration

### 1. OpenAI

#### Required Variables

```bash
OPENAI_API_KEY="sk-proj-your-openai-api-key"
```

#### Optional Variables

```bash
OPENAI_MODEL="gpt-4o"                    # Default: gpt-4o
OPENAI_BASE_URL="https://api.openai.com" # Default: OpenAI API
```

#### How to Get OpenAI API Key

1. Visit [OpenAI Platform](https://platform.openai.com)
2. Sign up or log in to your account
3. Navigate to **API Keys** section
4. Click **Create new secret key**
5. Copy the key (starts with `sk-proj-` or `sk-`)
6. Add billing information if required

#### Supported Models

- `gpt-4o` (default) - Latest GPT-4 Optimized
- `gpt-4o-mini` - Faster, cost-effective option
- `gpt-4-turbo` - High-performance model
- `gpt-3.5-turbo` - Legacy cost-effective option

---

### 2. Amazon Bedrock

#### Required Variables

```bash
AWS_ACCESS_KEY_ID="AKIA..."
AWS_SECRET_ACCESS_KEY="your-secret-key"
AWS_REGION="us-east-1"
```

#### Model Configuration (⚠️ Critical)

```bash
# Use full inference profile ARN for Anthropic models
BEDROCK_MODEL="arn:aws:bedrock:us-east-2::inference-profile/us.anthropic.claude-3-7-sonnet-20250219-v1:0"

# OR use simple model names for non-Anthropic models
BEDROCK_MODEL="amazon.titan-text-express-v1"
```

#### Optional Variables

```bash
AWS_SESSION_TOKEN="IQoJb3..."           # For temporary credentials
```

#### How to Get AWS Credentials

1. Sign up for [AWS Account](https://aws.amazon.com)
2. Navigate to **IAM Console**
3. Create new user with programmatic access
4. Attach policy: `AmazonBedrockFullAccess`
5. Download access key and secret key
6. **Important**: Request model access in Bedrock console

#### Bedrock Model Access Setup

1. Go to [AWS Bedrock Console](https://console.aws.amazon.com/bedrock)
2. Navigate to **Model access**
3. Click **Request model access**
4. Select desired models (Claude, Titan, etc.)
5. Submit request and wait for approval

#### Supported Models

- **Anthropic Claude**:
  - `arn:aws:bedrock:::inference-profile/us.anthropic.claude-3-7-sonnet-20250219-v1:0`
  - `arn:aws:bedrock:::inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0`
- **Amazon Titan**:
  - `amazon.titan-text-express-v1`
  - `amazon.titan-text-lite-v1`

---

### 3. Google Vertex AI

Google Vertex AI supports **three authentication methods**. Choose the one that fits your deployment:

#### Method 1: Service Account File (Recommended)

```bash
GOOGLE_APPLICATION_CREDENTIALS="/absolute/path/to/service-account.json"
GOOGLE_VERTEX_PROJECT="your-gcp-project-id"
GOOGLE_VERTEX_LOCATION="us-central1"
```

#### Method 2: Service Account JSON String

```bash
GOOGLE_SERVICE_ACCOUNT_KEY='{"type":"service_account","project_id":"your-project",...}'
GOOGLE_VERTEX_PROJECT="your-gcp-project-id"
GOOGLE_VERTEX_LOCATION="us-central1"
```

#### Method 3: Individual Environment Variables

```bash
GOOGLE_AUTH_CLIENT_EMAIL="service-account@your-project.iam.gserviceaccount.com"
GOOGLE_AUTH_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBgkqhkiG9w0B..."
GOOGLE_VERTEX_PROJECT="your-gcp-project-id"
GOOGLE_VERTEX_LOCATION="us-central1"
```

#### Optional Variables

```bash
VERTEX_MODEL="gemini-2.5-pro"           # Default: gemini-2.5-pro
```

#### How to Set Up Google Vertex AI

1. Create [Google Cloud Project](https://console.cloud.google.com)
2. Enable **Vertex AI API**
3. Create **Service Account**:
   - Go to **IAM & Admin > Service Accounts**
   - Click **Create Service Account**
   - Grant **Vertex AI User** role
   - Generate and download JSON key file
4. Set `GOOGLE_APPLICATION_CREDENTIALS` to the JSON file path

#### Supported Models

- `gemini-2.5-pro` (default) - Most capable model
- `gemini-2.5-flash` - Faster responses
- `claude-3-5-sonnet@20241022` - Claude via Vertex AI

---

### 4. Anthropic (Direct)

#### Required Variables

```bash
ANTHROPIC_API_KEY="sk-ant-api03-your-anthropic-key"
```

#### Optional Variables

```bash
ANTHROPIC_MODEL="claude-3-5-sonnet-20241022"  # Default model
ANTHROPIC_BASE_URL="https://api.anthropic.com" # Default endpoint
```

#### How to Get Anthropic API Key

1. Visit [Anthropic Console](https://console.anthropic.com)
2. Sign up or log in
3. Navigate to **API Keys**
4. Click **Create Key**
5. Copy the key (starts with `sk-ant-api03-`)
6. Add billing information for usage

#### Supported Models

- `claude-3-5-sonnet-20241022` (default) - Latest Claude
- `claude-3-haiku-20240307` - Fast, cost-effective
- `claude-3-opus-20240229` - Most capable (if available)

---

### 5. Google AI Studio

#### Required Variables

```bash
GOOGLE_AI_API_KEY="AIza-your-google-ai-api-key"
```

#### Optional Variables

```bash
GOOGLE_AI_MODEL="gemini-2.5-pro"      # Default model
```

#### How to Get Google AI Studio API Key

1. Visit [Google AI Studio](https://aistudio.google.com)
2. Sign in with your Google account
3. Navigate to **API Keys** section
4. Click **Create API Key**
5. Copy the key (starts with `AIza`)
6. Note: Google AI Studio provides free tier with generous limits

#### Supported Models

- `gemini-2.5-pro` (default) - Latest Gemini Pro
- `gemini-2.0-flash` - Fast, efficient responses

---

### 6. Azure OpenAI

#### Required Variables

```bash
AZURE_OPENAI_API_KEY="your-azureOpenai-key"
AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
AZURE_OPENAI_DEPLOYMENT_ID="your-deployment-name"
```

#### Optional Variables

```bash
AZURE_MODEL="gpt-4o"                    # Default: gpt-4o
AZURE_API_VERSION="2024-02-15-preview"  # Default API version
```

#### How to Set Up Azure OpenAI

1. Create [Azure Account](https://azure.microsoft.com)
2. Apply for **Azure OpenAI Service** access
3. Create **Azure OpenAI Resource**:
   - Go to Azure Portal
   - Search "OpenAI"
   - Create new OpenAI resource
4. **Deploy Model**:
   - Go to Azure OpenAI Studio
   - Navigate to **Deployments**
   - Create deployment with desired model
5. Get credentials from **Keys and Endpoint** section

#### Supported Models

- `gpt-4o` (default) - Latest GPT-4 Optimized
- `gpt-4` - Standard GPT-4
- `gpt-35-turbo` - Cost-effective option

---

### 7. Hugging Face

#### Required Variables

```bash
HUGGINGFACE_API_KEY="hf_your_huggingface_token"
```

#### Optional Variables

```bash
HUGGINGFACE_MODEL="microsoft/DialoGPT-medium"    # Default model
HUGGINGFACE_ENDPOINT="https://api-inference.huggingface.co"  # Default endpoint
```

#### How to Get Hugging Face API Token

1. Visit [Hugging Face](https://huggingface.co)
2. Sign up or log in
3. Go to Settings → Access Tokens
4. Create new token with "read" scope
5. Copy token (starts with `hf_`)

#### Supported Models

- **Open Source**: Access to 100,000+ community models
- `microsoft/DialoGPT-medium` (default) - Conversational AI
- `gpt2` - Classic GPT-2
- `EleutherAI/gpt-neo-2.7B` - Large open model
- Any model from [Hugging Face Hub](https://huggingface.co/models)

---

### 8. Ollama (Local AI)

#### Required Variables

None! Ollama runs locally.

#### Optional Variables

```bash
OLLAMA_BASE_URL="http://localhost:11434"    # Default local server
OLLAMA_MODEL="llama2"                        # Default model
```

#### How to Set Up Ollama

1. **Install Ollama**:
   - macOS: `brew install ollama` or download from [ollama.ai](https://ollama.ai)
   - Linux: `curl -fsSL https://ollama.ai/install.sh | sh`
   - Windows: Download installer from [ollama.ai](https://ollama.ai)

2. **Start Ollama Service**:

   ```bash
   ollama serve  # Usually auto-starts
   ```

   **Tip: To keep Ollama running in the background:**
   - macOS: `brew services start ollama`
   - Linux (user): `systemctl --user enable --now ollama`
   - Linux (system): `sudo systemctl enable --now ollama`

3. **Pull Models**:
   ```bash
   ollama pull llama2
   ollama pull codellama
   ollama pull mistral
   ```

#### Supported Models

- `llama2` (default) - Meta's Llama 2
- `codellama` - Code-specialized Llama
- `mistral` - Mistral 7B
- `vicuna` - Fine-tuned Llama
- Any model from [Ollama Library](https://ollama.ai/library)

---

### 9. Mistral AI

#### Required Variables

```bash
MISTRAL_API_KEY="your_mistral_api_key"
```

#### Optional Variables

```bash
MISTRAL_MODEL="mistral-small"               # Default model
MISTRAL_ENDPOINT="https://api.mistral.ai"   # Default endpoint
```

#### How to Get Mistral AI API Key

1. Visit [Mistral AI Platform](https://mistral.ai)
2. Sign up for an account
3. Navigate to API Keys section
4. Generate new API key
5. Add billing information

#### Supported Models

- `mistral-tiny` - Fastest, most cost-effective
- `mistral-small` (default) - Balanced performance
- `mistral-medium` - Enhanced capabilities
- `mistral-large` - Most capable model

---

### 10. LiteLLM 🆕

#### Required Variables

```bash
LITELLM_BASE_URL="http://localhost:4000"         # Local LiteLLM proxy (default)
LITELLM_API_KEY="sk-anything"                    # API key for local proxy (any value works)
```

#### Optional Variables

```bash
LITELLM_MODEL="gemini-2.5-pro"                   # Default model
LITELLM_TIMEOUT="60000"                          # Request timeout (ms)
```

#### How to Use LiteLLM

LiteLLM provides access to 100+ AI models through a unified proxy interface:

1. **Local Setup**: Run LiteLLM locally with your API keys (recommended)
2. **Self-Hosted**: Deploy your own LiteLLM proxy server
3. **Cloud Deployment**: Use cloud-hosted LiteLLM instances

#### Available Models (Example Configuration)

- `openai/gpt-4o` - OpenAI GPT-4 Optimized
- `anthropic/claude-3-5-sonnet` - Anthropic Claude Sonnet
- `google/gemini-2.0-flash` - Google Gemini Flash
- `mistral/mistral-large` - Mistral Large model
- Many more via [LiteLLM Providers](https://docs.litellm.ai/docs/providers)

#### Benefits

- **100+ Models**: Access to all major AI providers through one interface
- **Cost Optimization**: Automatic routing to cost-effective models
- **Unified API**: OpenAI-compatible API for all models
- **Load Balancing**: Automatic failover and load distribution
- **Analytics**: Built-in usage tracking and monitoring

---

### 11. Amazon SageMaker 🆕

#### Required Variables

```bash
AWS_ACCESS_KEY_ID="AKIA..."
AWS_SECRET_ACCESS_KEY="your-aws-secret-key"
AWS_REGION="us-east-1"
SAGEMAKER_DEFAULT_ENDPOINT="your-endpoint-name"
```

#### Optional Variables

```bash
SAGEMAKER_MODEL="custom-model-name"         # Model identifier (default: sagemaker-model)
SAGEMAKER_TIMEOUT="30000"                   # Request timeout in ms (default: 30000)
SAGEMAKER_MAX_RETRIES="3"                   # Retry attempts (default: 3)
AWS_SESSION_TOKEN="IQoJb3..."               # For temporary credentials
SAGEMAKER_CONTENT_TYPE="application/json"   # Request content type (default: application/json)
SAGEMAKER_ACCEPT="application/json"         # Response accept type (default: application/json)
```

#### How to Set Up Amazon SageMaker

Amazon SageMaker allows you to deploy and use your own custom trained models:

1. **Deploy Your Model to SageMaker**:
   - Train your model using SageMaker Training Jobs
   - Deploy model to a SageMaker Real-time Endpoint
   - Note the endpoint name for configuration

2. **Set Up AWS Credentials**:
   - Use IAM user with `sagemaker:InvokeEndpoint` permission
   - Or use IAM role for EC2/Lambda/ECS deployments
   - Configure AWS CLI: `aws configure`

3. **Configure NeuroLink**:

   ```bash
   export AWS_ACCESS_KEY_ID="your-access-key"
   export AWS_SECRET_ACCESS_KEY="your-secret-key"
   export AWS_REGION="us-east-1"
   export SAGEMAKER_DEFAULT_ENDPOINT="my-model-endpoint"
   ```

4. **Test Connection**:
   ```bash
   npx @juspay/neurolink sagemaker status
   npx @juspay/neurolink sagemaker test my-endpoint
   ```

#### How to Get AWS Credentials for SageMaker

1. **Create IAM User**:
   - Go to [AWS IAM Console](https://console.aws.amazon.com/iam)
   - Create new user with **Programmatic access**
   - Attach the following policy:

   ```json
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Action": ["sagemaker:InvokeEndpoint"],
         "Resource": "arn:aws:sagemaker:*:*:endpoint/*"
       }
     ]
   }
   ```

2. **Download Credentials**:
   - Save Access Key ID and Secret Access Key
   - Set as environment variables

#### Supported Models

SageMaker supports **any custom model** you deploy:

- **Custom Fine-tuned Models** - Your domain-specific models
- **Foundation Model Endpoints** - Large language models deployed via SageMaker
- **Multi-model Endpoints** - Multiple models behind single endpoint
- **Serverless Endpoints** - Auto-scaling model deployments

#### Model Deployment Types

- **Real-time Inference** - Low-latency model serving (recommended)
- **Batch Transform** - Batch processing (not supported by NeuroLink)
- **Serverless Inference** - Pay-per-request model serving
- **Multi-model Endpoints** - Host multiple models efficiently

#### Benefits

- **️ Custom Models** - Deploy and use your own trained models
- ** Cost Control** - Pay only for inference usage, auto-scaling available
- ** Enterprise Security** - Full control over model infrastructure and data
- **⚡ Performance** - Dedicated compute resources with predictable latency
- ** Global Deployment** - Available in all major AWS regions
- ** Monitoring** - Built-in CloudWatch metrics and logging

#### CLI Commands

```bash
# Check SageMaker configuration and endpoint status
npx @juspay/neurolink sagemaker status

# Validate connection to specific endpoint
npx @juspay/neurolink sagemaker validate

# Test inference with specific endpoint
npx @juspay/neurolink sagemaker test my-endpoint

# Show current configuration
npx @juspay/neurolink sagemaker config

# Performance benchmark
npx @juspay/neurolink sagemaker benchmark my-endpoint

# List available endpoints (requires AWS CLI)
npx @juspay/neurolink sagemaker list-endpoints

# Interactive setup wizard
npx @juspay/neurolink sagemaker setup
```

#### Environment Variables Reference

| Variable                     | Required | Default          | Description                                  |
| ---------------------------- | -------- | ---------------- | -------------------------------------------- |
| `AWS_ACCESS_KEY_ID`          | ✅       | -                | AWS access key for authentication            |
| `AWS_SECRET_ACCESS_KEY`      | ✅       | -                | AWS secret key for authentication            |
| `AWS_REGION`                 | ✅       | us-east-1        | AWS region where endpoint is deployed        |
| `SAGEMAKER_DEFAULT_ENDPOINT` | ✅       | -                | SageMaker endpoint name                      |
| `SAGEMAKER_TIMEOUT`          | ❌       | 30000            | Request timeout in milliseconds              |
| `SAGEMAKER_MAX_RETRIES`      | ❌       | 3                | Number of retry attempts for failed requests |
| `AWS_SESSION_TOKEN`          | ❌       | -                | Session token for temporary credentials      |
| `SAGEMAKER_MODEL`            | ❌       | sagemaker-model  | Model identifier for logging                 |
| `SAGEMAKER_CONTENT_TYPE`     | ❌       | application/json | Request content type                         |
| `SAGEMAKER_ACCEPT`           | ❌       | application/json | Response accept type                         |

#### Production Considerations

- ** Security**: Use IAM roles instead of access keys when possible
- ** Monitoring**: Enable CloudWatch logging for your endpoints
- ** Cost Optimization**: Use auto-scaling and serverless options
- ** Multi-Region**: Deploy endpoints in multiple regions for redundancy
- **⚡ Performance**: Choose appropriate instance types for your workload

---

##  Configuration Examples

### Complete .env File Example

```bash
# NeuroLink Environment Configuration - All 11 Providers

# OpenAI Configuration
OPENAI_API_KEY="sk-proj-your-openai-key"
OPENAI_MODEL="gpt-4o"

# Amazon Bedrock Configuration
AWS_ACCESS_KEY_ID="AKIA..."
AWS_SECRET_ACCESS_KEY="your-aws-secret"
AWS_REGION="us-east-1"
BEDROCK_MODEL="arn:aws:bedrock:us-east-1::inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0"

# Amazon SageMaker Configuration
AWS_ACCESS_KEY_ID="AKIA..."
AWS_SECRET_ACCESS_KEY="your-aws-secret"
AWS_REGION="us-east-1"
SAGEMAKER_DEFAULT_ENDPOINT="my-model-endpoint"
SAGEMAKER_TIMEOUT="30000"
SAGEMAKER_MAX_RETRIES="3"

# Google Vertex AI Configuration
GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
GOOGLE_VERTEX_PROJECT="your-gcp-project"
GOOGLE_VERTEX_LOCATION="us-central1"
VERTEX_MODEL="gemini-2.5-pro"

# Anthropic Configuration
ANTHROPIC_API_KEY="sk-ant-api03-your-key"

# Google AI Studio Configuration
GOOGLE_AI_API_KEY="AIza-your-google-ai-key"
GOOGLE_AI_MODEL="gemini-2.5-pro"

# Azure OpenAI Configuration
AZURE_OPENAI_API_KEY="your-azure-key"
AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
AZURE_OPENAI_DEPLOYMENT_ID="gpt-4o-deployment"
AZURE_MODEL="gpt-4o"

# Hugging Face Configuration
HUGGINGFACE_API_KEY="hf_your_huggingface_token"
HUGGINGFACE_MODEL="microsoft/DialoGPT-medium"

# Ollama Configuration (Local AI - No API Key Required)
OLLAMA_BASE_URL="http://localhost:11434"
OLLAMA_MODEL="llama2"

# Mistral AI Configuration
MISTRAL_API_KEY="your_mistral_api_key"
MISTRAL_MODEL="mistral-small"

# LiteLLM Configuration
LITELLM_BASE_URL="http://localhost:4000"
LITELLM_API_KEY="sk-anything"
LITELLM_MODEL="openai/gpt-4o-mini"
```

### Docker/Container Configuration

```bash
# Use environment variables in containers
docker run -e OPENAI_API_KEY="sk-..." \
           -e AWS_ACCESS_KEY_ID="AKIA..." \
           -e AWS_SECRET_ACCESS_KEY="..." \
           your-app
```

### CI/CD Configuration

```yaml
# GitHub Actions example
env:
  OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```

---

##  Testing Configuration

### Test All Providers

```bash
# Check provider status
npx @juspay/neurolink status --verbose

# Test specific provider
npx @juspay/neurolink generate "Hello" --provider openai

# Get best available provider
npx @juspay/neurolink get-best-provider
```

### Expected Output

```bash
✅ openai: Working (1245ms)
✅ bedrock: Working (2103ms)
✅ vertex: Working (1876ms)
✅ anthropic: Working (1654ms)
✅ azure: Working (987ms)
 Summary: 5/5 providers working
```

---

##  Security Best Practices

### API Key Management

- ✅ **Use .env files** for local development
- ✅ **Use environment variables** in production
- ✅ **Rotate keys regularly** (every 90 days)
- ❌ **Never commit keys** to version control
- ❌ **Never hardcode keys** in source code

### .gitignore Configuration

```bash
# Add to .gitignore
.env
.env.local
.env.production
*.pem
service-account*.json
```

### Production Deployment

- Use **secret management systems** (AWS Secrets Manager, Azure Key Vault)
- Implement **key rotation** policies
- Monitor **API usage** and **rate limits**
- Use **least privilege** access policies

---

##  Troubleshooting

### Common Issues

#### 1. "Missing API Key" Error

```bash
# Check if environment is loaded
npx @juspay/neurolink status

# Verify .env file exists and has correct format
cat .env
```

#### 2. AWS Bedrock "Not Authorized" Error

- ✅ Verify account has **model access** in Bedrock console
- ✅ Use **full inference profile ARN** for Anthropic models
- ✅ Check **IAM permissions** include Bedrock access

#### 3. Google Vertex AI Import Issues

- ✅ Ensure **Vertex AI API** is enabled
- ✅ Verify **service account** has correct permissions
- ✅ Check **JSON file path** is absolute and accessible

#### 4. CLI Not Loading .env

- ✅ Ensure `.env` file is in **current directory**
- ✅ Check file has **correct format** (no spaces around =)
- ✅ Verify CLI version supports **automatic loading**

### Debug Commands

```bash
# Verbose status check
npx @juspay/neurolink status --verbose

# Test specific provider
npx @juspay/neurolink generate "test" --provider openai --verbose

# Check environment loading
node -e "require('dotenv').config(); console.log(process.env.OPENAI_API_KEY)"
```

---

##  Related Documentation

- **[Provider Configuration Guide](/docs/getting-started/provider-setup)** - Detailed provider setup
- **[CLI Guide](/docs/cli)** - Complete CLI command reference
- **[API Reference](/docs/sdk/api-reference)** - Programmatic usage examples
- **[Framework Integration](/docs/sdk/framework-integration)** - Next.js, SvelteKit, React

---

##  Need Help?

-  **Check the troubleshooting section** above
-  **Report issues** in our GitHub repository
-  **Join our Discord** for community support
-  **Contact us** for enterprise support

**Next Steps**: Once configured, test your setup with `npx @juspay/neurolink status` and start generating AI content!

---

## AWS Bedrock Provider Guide

<!-- Source: getting-started/providers/aws-bedrock.md -->

# AWS Bedrock Provider Guide

**Enterprise AI with Claude, Llama, Mistral, and more on AWS infrastructure**

------------- | -------------------------------------- | --------------------------- |
| **Anthropic**    | Claude 3.5 Sonnet, Claude 3 Opus/Haiku | Complex reasoning, coding   |
| **Meta**         | Llama 3.1 (8B, 70B, 405B)              | Open source, cost-effective |
| **Mistral AI**   | Mistral Large, Mixtral 8x7B            | European compliance, coding |
| **Cohere**       | Command R+, Embed                      | Enterprise search, RAG      |
| **Amazon**       | Titan Text, Titan Embeddings           | AWS-native, affordable      |
| **AI21 Labs**    | Jamba-Instruct                         | Long context                |
| **Stability AI** | Stable Diffusion XL                    | Image generation            |

---

## Quick Start

### 1. Enable Model Access

```bash
# Via AWS CLI
aws bedrock list-foundation-models --region us-east-1

# Request model access (one-time)
# Go to: https://console.aws.amazon.com/bedrock
# → Model access → Manage model access
# → Select models → Request access
```

Or via AWS Console:

1. Open [Bedrock Console](https://console.aws.amazon.com/bedrock)
2. Select region (us-east-1 recommended)
3. Click "Model access"
4. Enable desired models (instant for most, approval needed for some)

### 2. Setup IAM Permissions

```bash
# Create IAM policy
cat > bedrock-policy.json  req.userRegion === "us",
    },

    // EU West (GDPR)
    {
      name: "bedrock-eu",
      priority: 1,
      config: {
        region: "eu-west-1",
        credentials: {
          accessKeyId: process.env.AWS_ACCESS_KEY_ID,
          secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
        },
      },
      condition: (req) => req.userRegion === "eu",
    },

    // Asia Pacific
    {
      name: "bedrock-asia",
      priority: 1,
      config: {
        region: "ap-southeast-1",
        credentials: {
          accessKeyId: process.env.AWS_ACCESS_KEY_ID,
          secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
        },
      },
      condition: (req) => req.userRegion === "asia",
    },
  ],
  failoverConfig: { enabled: true },
});
```

---

## Model Selection Guide

### Anthropic Claude Models

```typescript
// Claude 3.5 Sonnet - Balanced performance (recommended)
const sonnet = await ai.generate({
  input: { text: "Complex analysis task" },
  provider: "bedrock",
  model: "anthropic.claude-3-5-sonnet-20241022-v2:0",
});

// Claude 3 Opus - Highest intelligence
const opus = await ai.generate({
  input: { text: "Most difficult reasoning task" },
  provider: "bedrock",
  model: "anthropic.claude-3-opus-20240229-v1:0",
});

// Claude 3 Haiku - Fast and affordable
const haiku = await ai.generate({
  input: { text: "Quick simple query" },
  provider: "bedrock",
  model: "anthropic.claude-3-haiku-20240307-v1:0",
});
```

**Claude Model IDs:**

- `anthropic.claude-3-5-sonnet-20241022-v2:0` - Latest Sonnet
- `anthropic.claude-3-opus-20240229-v1:0` - Opus
- `anthropic.claude-3-haiku-20240307-v1:0` - Haiku

### Meta Llama Models

```typescript
// Llama 3.1 405B - Largest open model
const llama405b = await ai.generate({
  input: { text: "Complex task" },
  provider: "bedrock",
  model: "meta.llama3-1-405b-instruct-v1:0",
});

// Llama 3.1 70B - Balanced
const llama70b = await ai.generate({
  input: { text: "General task" },
  provider: "bedrock",
  model: "meta.llama3-1-70b-instruct-v1:0",
});

// Llama 3.1 8B - Fast and cheap
const llama8b = await ai.generate({
  input: { text: "Simple task" },
  provider: "bedrock",
  model: "meta.llama3-1-8b-instruct-v1:0",
});
```

**Llama Model IDs:**

- `meta.llama3-1-405b-instruct-v1:0` - 405B (most capable)
- `meta.llama3-1-70b-instruct-v1:0` - 70B (balanced)
- `meta.llama3-1-8b-instruct-v1:0` - 8B (fast)

### Mistral AI Models

```typescript
// Mistral Large - Most capable
const mistralLarge = await ai.generate({
  input: { text: "Complex reasoning" },
  provider: "bedrock",
  model: "mistral.mistral-large-2402-v1:0",
});

// Mixtral 8x7B - Cost-effective
const mixtral = await ai.generate({
  input: { text: "General task" },
  provider: "bedrock",
  model: "mistral.mixtral-8x7b-instruct-v0:1",
});
```

**Mistral Model IDs:**

- `mistral.mistral-large-2402-v1:0` - Mistral Large
- `mistral.mixtral-8x7b-instruct-v0:1` - Mixtral 8x7B

### Amazon Titan Models

```typescript
// Titan Text Premier - AWS native
const titanPremier = await ai.generate({
  input: { text: "AWS-optimized task" },
  provider: "bedrock",
  model: "amazon.titan-text-premier-v1:0",
});

// Titan Embeddings - Vector search
const embeddings = await ai.generateEmbeddings({
  texts: ["Document 1", "Document 2"],
  provider: "bedrock",
  model: "amazon.titan-embed-text-v2:0",
});
```

**Titan Model IDs:**

- `amazon.titan-text-premier-v1:0` - Text generation
- `amazon.titan-text-express-v1` - Fast text
- `amazon.titan-embed-text-v2:0` - Embeddings (1024 dim)
- `amazon.titan-embed-text-v1` - Embeddings (1536 dim)

### Cohere Models

```typescript
// Command R+ - RAG optimized
const commandRPlus = await ai.generate({
  input: { text: "Search and summarize documents" },
  provider: "bedrock",
  model: "cohere.command-r-plus-v1:0",
});

// Embed English - Embeddings
const cohereEmbed = await ai.generateEmbeddings({
  texts: ["Query text"],
  provider: "bedrock",
  model: "cohere.embed-english-v3",
});
```

**Cohere Model IDs:**

- `cohere.command-r-plus-v1:0` - Command R+
- `cohere.command-r-v1:0` - Command R
- `cohere.embed-english-v3` - Embeddings

---

## IAM Roles & Permissions

### EC2 Instance Role

```bash
# Create trust policy
cat > trust-policy.json  lambda-trust.json  {
    await logMetric(result.usage.totalTokens, result.cost);
  },
});
```

### CloudWatch Logs

```typescript

const logs = new CloudWatchLogs({ region: "us-east-1" });

async function logRequest(data: any) {
  await logs.putLogEvents({
    logGroupName: "/aws/bedrock/requests",
    logStreamName: "production",
    logEvents: [
      {
        timestamp: Date.now(),
        message: JSON.stringify(data),
      },
    ],
  });
}

const ai = new NeuroLink({
  providers: [{ name: "bedrock", config: { region: "us-east-1" } }],
  onSuccess: async (result) => {
    await logRequest({
      model: result.model,
      tokens: result.usage.totalTokens,
      latency: result.latency,
      cost: result.cost,
    });
  },
});
```

---

## Cost Management

### Pricing Overview

```
Claude 3.5 Sonnet:
- Input:  $3.00 per 1M tokens
- Output: $15.00 per 1M tokens

Claude 3 Opus:
- Input:  $15.00 per 1M tokens
- Output: $75.00 per 1M tokens

Claude 3 Haiku:
- Input:  $0.25 per 1M tokens
- Output: $1.25 per 1M tokens

Llama 3.1 405B:
- Input:  $2.65 per 1M tokens
- Output: $3.50 per 1M tokens

Llama 3.1 70B:
- Input:  $0.99 per 1M tokens
- Output: $0.99 per 1M tokens

Llama 3.1 8B:
- Input:  $0.22 per 1M tokens
- Output: $0.22 per 1M tokens

Mistral Large:
- Input:  $4.00 per 1M tokens
- Output: $12.00 per 1M tokens

Titan Text Premier:
- Input:  $0.50 per 1M tokens
- Output: $1.50 per 1M tokens
```

### Cost Budgets

```bash
# Create budget for Bedrock
aws budgets create-budget \
  --account-id ACCOUNT_ID \
  --budget file://budget.json

# budget.json
cat > budget.json  = {
      "anthropic.claude-3-5-sonnet-20241022-v2:0": { input: 3.0, output: 15.0 },
      "anthropic.claude-3-haiku-20240307-v1:0": { input: 0.25, output: 1.25 },
      "meta.llama3-1-405b-instruct-v1:0": { input: 2.65, output: 3.5 },
      "meta.llama3-1-8b-instruct-v1:0": { input: 0.22, output: 0.22 },
    };

    const rates = pricing[model] || { input: 1.0, output: 1.0 };
    const cost =
      (inputTokens / 1_000_000) * rates.input +
      (outputTokens / 1_000_000) * rates.output;

    this.monthlyCost += cost;
    return cost;
  }

  getMonthlyTotal(): number {
    return this.monthlyCost;
  }
}
```

---

## Production Patterns

### Pattern 1: Multi-Model Strategy

```typescript
const ai = new NeuroLink({
  providers: [
    // Cheap for simple tasks
    {
      name: "bedrock-haiku",
      config: { region: "us-east-1" },
      model: "anthropic.claude-3-haiku-20240307-v1:0",
      condition: (req) => req.complexity === "low",
    },

    // Balanced for medium tasks
    {
      name: "bedrock-sonnet",
      config: { region: "us-east-1" },
      model: "anthropic.claude-3-5-sonnet-20241022-v2:0",
      condition: (req) => req.complexity === "medium",
    },

    // Premium for complex tasks
    {
      name: "bedrock-opus",
      config: { region: "us-east-1" },
      model: "anthropic.claude-3-opus-20240229-v1:0",
      condition: (req) => req.complexity === "high",
    },
  ],
});
```

### Pattern 2: Guardrails

```typescript
// Enable Bedrock Guardrails
const ai = new NeuroLink({
  providers: [
    {
      name: "bedrock",
      config: {
        region: "us-east-1",
        guardrailId: "abc123xyz", // Created in Bedrock console
        guardrailVersion: "1",
      },
    },
  ],
});

const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "bedrock",
  model: "anthropic.claude-3-5-sonnet-20241022-v2:0",
});
// Content filtered by guardrails
```

### Pattern 3: Knowledge Base Integration

```bash
# Create Knowledge Base in Bedrock
aws bedrock-agent create-knowledge-base \
  --name my-kb \
  --role-arn arn:aws:iam::ACCOUNT_ID:role/BedrockKBRole \
  --knowledge-base-configuration '{
    "type": "VECTOR",
    "vectorKnowledgeBaseConfiguration": {
      "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
    }
  }' \
  --storage-configuration '{
    "type": "OPENSEARCH_SERVERLESS",
    "opensearchServerlessConfiguration": {
      "collectionArn": "arn:aws:aoss:us-east-1:ACCOUNT_ID:collection/abc",
      "vectorIndexName": "my-index",
      "fieldMapping": {
        "vectorField": "embedding",
        "textField": "text",
        "metadataField": "metadata"
      }
    }
  }'
```

---

## Best Practices

### 1. ✅ Use IAM Roles Instead of Keys

```typescript
// ✅ Good: EC2 instance role (no keys)
const ai = new NeuroLink({
  providers: [
    {
      name: "bedrock",
      config: { region: "us-east-1" },
      // Credentials from instance metadata
    },
  ],
});
```

### 2. ✅ Enable VPC Endpoints

```bash
# ✅ Good: Private connectivity
aws ec2 create-vpc-endpoint \
  --service-name com.amazonaws.us-east-1.bedrock-runtime
```

### 3. ✅ Monitor Costs

```typescript
// ✅ Good: Track every request
const cost = costTracker.calculateCost(model, inputTokens, outputTokens);
```

### 4. ✅ Use Appropriate Model for Task

```typescript
// ✅ Good: Match model to complexity
const model = complexity === "low" ? "claude-haiku" : "claude-sonnet";
```

### 5. ✅ Enable CloudWatch Logging

```typescript
// ✅ Good: Comprehensive logging
await logs.putLogEvents({
  /* ... */
});
```

---

## Troubleshooting

### Common Issues

#### 1. "Model Access Denied"

**Problem**: Model not enabled in your account.

**Solution**:

```bash
# Enable via console
# https://console.aws.amazon.com/bedrock → Model access

# Or check status
aws bedrock list-foundation-models --region us-east-1
```

#### 2. "Throttling Exception"

**Problem**: Exceeded rate limits.

**Solution**:

```bash
# Request quota increase
aws service-quotas request-service-quota-increase \
  --service-code bedrock \
  --quota-code L-12345678 \
  --desired-value 1000
```

#### 3. "Invalid Model ID"

**Problem**: Wrong model identifier.

**Solution**:

```bash
# List available models
aws bedrock list-foundation-models --region us-east-1

# Use exact model ID
model: 'anthropic.claude-3-5-sonnet-20241022-v2:0'  # ✅ Correct
```

---

## Related Documentation

- **[Provider Setup](/docs/getting-started/provider-setup)** - General configuration
- **[Multi-Region](/docs/guides/enterprise/multi-region)** - Geographic distribution
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce costs
- **[Compliance](/docs/guides/enterprise/compliance)** - Security

---

## Additional Resources

- **[AWS Bedrock Docs](https://docs.aws.amazon.com/bedrock/)** - Official documentation
- **[Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)** - Pricing details
- **[Bedrock Console](https://console.aws.amazon.com/bedrock)** - Manage models
- **[AWS CLI Reference](https://docs.aws.amazon.com/cli/latest/reference/bedrock/)** - CLI commands

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Azure OpenAI Provider Guide

<!-- Source: getting-started/providers/azure-openai.md -->

# Azure OpenAI Provider Guide

**Enterprise-grade OpenAI models with Microsoft Azure infrastructure and compliance**

## Quick Start

### 1. Create Azure OpenAI Resource

```bash
# Via Azure CLI
az cognitiveservices account create \
  --name my-openai-resource \
  --resource-group my-resource-group \
  --location eastus \
  --kind OpenAI \
  --sku S0
```

Or use [Azure Portal](https://portal.azure.com/#create/Microsoft.CognitiveServicesOpenAI):

1. Search for "Azure OpenAI"
2. Click "Create"
3. Select subscription and resource group
4. Choose region (eastus, westeurope, etc.)
5. Name your resource
6. Click "Review + Create"

### 2. Deploy a Model

```bash
# Deploy GPT-4o model
az cognitiveservices account deployment create \
  --name my-openai-resource \
  --resource-group my-resource-group \
  --deployment-name gpt-4o-deployment \
  --model-name gpt-4o \
  --model-version "2024-08-06" \
  --model-format OpenAI \
  --sku-capacity 10 \
  --sku-name "Standard"
```

Or via Azure Portal:

1. Open your Azure OpenAI resource
2. Go to "Deployments" → "Create new deployment"
3. Select model (gpt-4o, gpt-4, gpt-35-turbo, etc.)
4. Name deployment
5. Set capacity (TPM quota)

### 3. Get Credentials

```bash
# Get endpoint
az cognitiveservices account show \
  --name my-openai-resource \
  --resource-group my-resource-group \
  --query "properties.endpoint" --output tsv

# Get API key
az cognitiveservices account keys list \
  --name my-openai-resource \
  --resource-group my-resource-group \
  --query "key1" --output tsv
```

### 4. Configure NeuroLink

```bash
# .env
AZURE_OPENAI_API_KEY=your_api_key_here
AZURE_OPENAI_ENDPOINT=https://my-resource.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=gpt-4o-deployment
```

```typescript

const ai = new NeuroLink({
  providers: [
    {
      name: "azure-openai",
      config: {
        apiKey: process.env.AZURE_OPENAI_API_KEY,
        endpoint: process.env.AZURE_OPENAI_ENDPOINT,
        deployment: process.env.AZURE_OPENAI_DEPLOYMENT,
      },
    },
  ],
});

const result = await ai.generate({
  input: { text: "Hello from Azure OpenAI!" },
  provider: "azure-openai",
});

console.log(result.content);
```

---

## Regional Deployment

### Available Regions

| Region                | Location      | Models Available | Data Residency |
| --------------------- | ------------- | ---------------- | -------------- |
| **East US**           | Virginia, USA | All models       | USA            |
| **East US 2**         | Virginia, USA | All models       | USA            |
| **South Central US**  | Texas, USA    | All models       | USA            |
| **West Europe**       | Netherlands   | All models       | EU             |
| **North Europe**      | Ireland       | All models       | EU             |
| **UK South**          | London, UK    | All models       | UK             |
| **France Central**    | Paris, France | All models       | EU             |
| **Switzerland North** | Zurich        | All models       | Switzerland    |
| **Sweden Central**    | Stockholm     | All models       | EU             |
| **Australia East**    | Sydney        | All models       | Australia      |
| **Japan East**        | Tokyo         | All models       | Japan          |
| **Canada East**       | Quebec        | All models       | Canada         |

### Multi-Region Setup

```typescript
const ai = new NeuroLink({
  providers: [
    // US deployments
    {
      name: "azure-us-east",
      config: {
        apiKey: process.env.AZURE_US_EAST_KEY,
        endpoint: "https://my-us-east.openai.azure.com/",
        deployment: "gpt-4o-deployment",
      },
      region: "us-east",
      priority: 1,
      condition: (req) => req.userRegion === "us",
    },

    // EU deployments
    {
      name: "azure-eu-west",
      config: {
        apiKey: process.env.AZURE_EU_WEST_KEY,
        endpoint: "https://my-eu-west.openai.azure.com/",
        deployment: "gpt-4o-deployment",
      },
      region: "eu-west",
      priority: 1,
      condition: (req) => req.userRegion === "eu",
    },

    // Asia deployments
    {
      name: "azure-japan",
      config: {
        apiKey: process.env.AZURE_JAPAN_KEY,
        endpoint: "https://my-japan.openai.azure.com/",
        deployment: "gpt-4o-deployment",
      },
      region: "japan",
      priority: 1,
      condition: (req) => req.userRegion === "asia",
    },
  ],
  failoverConfig: { enabled: true },
});
```

---

## Model Deployments

### Available Models

| Model                      | Description          | Context | Best For          | TPM Quota |
| -------------------------- | -------------------- | ------- | ----------------- | --------- |
| **gpt-4o**                 | Latest flagship      | 128K    | Complex reasoning | 10K - 1M  |
| **gpt-4o-mini**            | Fast, cost-effective | 128K    | General tasks     | 10K - 10M |
| **gpt-4-turbo**            | Previous flagship    | 128K    | Advanced tasks    | 10K - 1M  |
| **gpt-4**                  | Stable version       | 8K      | Production        | 10K - 1M  |
| **gpt-35-turbo**           | Fast, affordable     | 16K     | High-volume       | 10K - 10M |
| **text-embedding-ada-002** | Embeddings           | 8K      | Vector search     | 10K - 10M |
| **text-embedding-3-small** | Small embeddings     | 8K      | Efficient search  | 10K - 10M |
| **text-embedding-3-large** | Large embeddings     | 8K      | Accuracy          | 10K - 10M |

### Deployment Quotas (TPM)

```
Standard Tier Quotas (Tokens Per Minute):
- gpt-4o:              10K - 1M TPM
- gpt-4o-mini:         10K - 10M TPM
- gpt-4-turbo:         10K - 1M TPM
- gpt-35-turbo:        10K - 10M TPM
- embeddings:          10K - 10M TPM

Request quota increase via Azure Portal if needed.
```

### Multiple Model Deployments

```typescript
const ai = new NeuroLink({
  providers: [
    // GPT-4o for complex tasks
    {
      name: "azure-gpt4o",
      config: {
        apiKey: process.env.AZURE_API_KEY,
        endpoint: process.env.AZURE_ENDPOINT,
        deployment: "gpt-4o-deployment",
      },
      model: "gpt-4o",
    },

    // GPT-4o-mini for general tasks
    {
      name: "azure-gpt4o-mini",
      config: {
        apiKey: process.env.AZURE_API_KEY,
        endpoint: process.env.AZURE_ENDPOINT,
        deployment: "gpt-4o-mini-deployment",
      },
      model: "gpt-4o-mini",
    },

    // GPT-3.5-turbo for high-volume
    {
      name: "azure-gpt35",
      config: {
        apiKey: process.env.AZURE_API_KEY,
        endpoint: process.env.AZURE_ENDPOINT,
        deployment: "gpt-35-turbo-deployment",
      },
      model: "gpt-35-turbo",
    },
  ],
});

// Route based on task complexity
const complexTask = await ai.generate({
  input: { text: "Complex analysis..." },
  provider: "azure-gpt4o",
});

const simpleTask = await ai.generate({
  input: { text: "Simple query..." },
  provider: "azure-gpt4o-mini",
});
```

---

## Azure AD Authentication

### Managed Identity (Recommended)

```typescript

const credential = new DefaultAzureCredential();

const ai = new NeuroLink({
  providers: [
    {
      name: "azure-openai",
      config: {
        credential, // Use Azure AD instead of API key
        endpoint: process.env.AZURE_OPENAI_ENDPOINT,
        deployment: process.env.AZURE_OPENAI_DEPLOYMENT,
      },
    },
  ],
});
```

### Service Principal

```typescript

const credential = new ClientSecretCredential(
  process.env.AZURE_TENANT_ID!,
  process.env.AZURE_CLIENT_ID!,
  process.env.AZURE_CLIENT_SECRET!,
);

const ai = new NeuroLink({
  providers: [
    {
      name: "azure-openai",
      config: {
        credential,
        endpoint: process.env.AZURE_OPENAI_ENDPOINT,
        deployment: process.env.AZURE_OPENAI_DEPLOYMENT,
      },
    },
  ],
});
```

### User-Assigned Managed Identity

```typescript

const credential = new ManagedIdentityCredential({
  clientId: process.env.AZURE_CLIENT_ID,
});

const ai = new NeuroLink({
  providers: [
    {
      name: "azure-openai",
      config: {
        credential,
        endpoint: process.env.AZURE_OPENAI_ENDPOINT,
        deployment: process.env.AZURE_OPENAI_DEPLOYMENT,
      },
    },
  ],
});
```

---

## Private Endpoint & VNet Integration

### Configure Private Endpoint

```bash
# Create private endpoint
az network private-endpoint create \
  --name my-openai-pe \
  --resource-group my-resource-group \
  --vnet-name my-vnet \
  --subnet my-subnet \
  --private-connection-resource-id "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/my-openai" \
  --group-id account \
  --connection-name my-openai-connection
```

### Private DNS Zone

```bash
# Create private DNS zone
az network private-dns zone create \
  --resource-group my-resource-group \
  --name privatelink.openai.azure.com

# Link to VNet
az network private-dns link vnet create \
  --resource-group my-resource-group \
  --zone-name privatelink.openai.azure.com \
  --name my-openai-dns-link \
  --virtual-network my-vnet \
  --registration-enabled false
```

### VNet Integration in Code

```typescript
// No code changes needed - just use private endpoint URL
const ai = new NeuroLink({
  providers: [
    {
      name: "azure-openai",
      config: {
        apiKey: process.env.AZURE_API_KEY,
        endpoint: "https://my-openai.privatelink.openai.azure.com/", // Private endpoint
        deployment: "gpt-4o-deployment",
      },
    },
  ],
});
```

---

## Compliance & Security

### Data Residency

```typescript
// Ensure EU data stays in EU
const ai = new NeuroLink({
  providers: [
    {
      name: "azure-eu",
      config: {
        apiKey: process.env.AZURE_EU_KEY,
        endpoint: "https://my-eu-resource.openai.azure.com/",
        deployment: "gpt-4o-deployment",
        region: "westeurope", // EU region
      },
      condition: (req) => req.userRegion === "EU",
      compliance: ["GDPR", "ISO27001", "SOC2"],
    },
  ],
});
```

### Customer-Managed Keys (CMK)

```bash
# Enable CMK with Azure Key Vault
az cognitiveservices account update \
  --name my-openai-resource \
  --resource-group my-resource-group \
  --encryption KeyVault \
  --encryption-key-name my-key \
  --encryption-key-source Microsoft.KeyVault \
  --encryption-key-vault https://my-vault.vault.azure.net/
```

### Disable Public Network Access

```bash
# Restrict to private endpoint only
az cognitiveservices account update \
  --name my-openai-resource \
  --resource-group my-resource-group \
  --public-network-access Disabled
```

---

## Monitoring & Logging

### Azure Monitor Integration

```typescript

const appInsights = new ApplicationInsights({
  connectionString: process.env.APPLICATIONINSIGHTS_CONNECTION_STRING,
});

appInsights.start();

const ai = new NeuroLink({
  providers: [
    {
      name: "azure-openai",
      config: {
        apiKey: process.env.AZURE_API_KEY,
        endpoint: process.env.AZURE_OPENAI_ENDPOINT,
        deployment: process.env.AZURE_OPENAI_DEPLOYMENT,
      },
    },
  ],
  onSuccess: (result) => {
    // Log to Application Insights
    appInsights.trackEvent({
      name: "AI_Generation_Success",
      properties: {
        provider: result.provider,
        model: result.model,
        tokens: result.usage.totalTokens,
        cost: result.cost,
        latency: result.latency,
      },
    });
  },
  onError: (error, provider) => {
    // Log errors
    appInsights.trackException({
      exception: error,
      properties: { provider },
    });
  },
});
```

### Diagnostic Logs

```bash
# Enable diagnostic logs
az monitor diagnostic-settings create \
  --name my-diagnostic-settings \
  --resource "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/my-openai" \
  --logs '[{"category":"Audit","enabled":true},{"category":"RequestResponse","enabled":true}]' \
  --workspace "/subscriptions/{sub}/resourceGroups/{rg}/providers/microsoft.operationalinsights/workspaces/my-workspace"
```

---

## Cost Management

### Pricing Model

```
Azure OpenAI Pricing (as of 2025):

GPT-4o:
- Input:  $2.50 per 1M tokens
- Output: $10.00 per 1M tokens

GPT-4o-mini:
- Input:  $0.15 per 1M tokens
- Output: $0.60 per 1M tokens

GPT-4-turbo:
- Input:  $10.00 per 1M tokens
- Output: $30.00 per 1M tokens

GPT-3.5-turbo:
- Input:  $0.50 per 1M tokens
- Output: $1.50 per 1M tokens

Embeddings (ada-002):
- $0.10 per 1M tokens
```

### Cost Tracking

```typescript
class AzureCostTracker {
  private dailyCost = 0;
  private monthlyCost = 0;

  recordUsage(result: any) {
    const inputTokens = result.usage.promptTokens;
    const outputTokens = result.usage.completionTokens;

    // Calculate cost based on model
    let cost = 0;
    if (result.model === "gpt-4o") {
      cost =
        (inputTokens / 1_000_000) * 2.5 + (outputTokens / 1_000_000) * 10.0;
    } else if (result.model === "gpt-4o-mini") {
      cost =
        (inputTokens / 1_000_000) * 0.15 + (outputTokens / 1_000_000) * 0.6;
    }

    this.dailyCost += cost;
    this.monthlyCost += cost;

    return cost;
  }

  getStats() {
    return {
      daily: this.dailyCost,
      monthly: this.monthlyCost,
    };
  }
}

const costTracker = new AzureCostTracker();

const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "azure-openai",
  enableAnalytics: true,
});

const cost = costTracker.recordUsage(result);
console.log(`Request cost: $${cost.toFixed(4)}`);
```

### Budget Alerts

```bash
# Create budget in Azure
az consumption budget create \
  --budget-name openai-monthly-budget \
  --amount 1000 \
  --time-grain Monthly \
  --start-date 2025-01-01 \
  --end-date 2025-12-31 \
  --resource-group my-resource-group
```

---

## Production Patterns

### Pattern 1: High Availability Setup

```typescript
const ai = new NeuroLink({
  providers: [
    // Primary region
    {
      name: "azure-primary",
      priority: 1,
      config: {
        apiKey: process.env.AZURE_PRIMARY_KEY,
        endpoint: process.env.AZURE_PRIMARY_ENDPOINT,
        deployment: "gpt-4o-deployment",
      },
    },

    // Failover region
    {
      name: "azure-secondary",
      priority: 2,
      config: {
        apiKey: process.env.AZURE_SECONDARY_KEY,
        endpoint: process.env.AZURE_SECONDARY_ENDPOINT,
        deployment: "gpt-4o-deployment",
      },
    },
  ],
  failoverConfig: {
    enabled: true,
    maxAttempts: 3,
    retryDelay: 1000,
  },
  healthCheck: {
    enabled: true,
    interval: 60000,
  },
});
```

### Pattern 2: Load Balancing Across Deployments

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "azure-deployment-1",
      config: {
        apiKey: process.env.AZURE_API_KEY,
        endpoint: process.env.AZURE_ENDPOINT,
        deployment: "gpt-4o-deployment-1",
      },
      weight: 1,
    },
    {
      name: "azure-deployment-2",
      config: {
        apiKey: process.env.AZURE_API_KEY,
        endpoint: process.env.AZURE_ENDPOINT,
        deployment: "gpt-4o-deployment-2",
      },
      weight: 1,
    },
    {
      name: "azure-deployment-3",
      config: {
        apiKey: process.env.AZURE_API_KEY,
        endpoint: process.env.AZURE_ENDPOINT,
        deployment: "gpt-4o-deployment-3",
      },
      weight: 1,
    },
  ],
  loadBalancing: "round-robin",
});
```

### Pattern 3: Quota Management

```typescript
class QuotaManager {
  private tokensThisMinute = 0;
  private minuteStart = Date.now();
  private quotaLimit = 100000; // 100K TPM

  async checkQuota(estimatedTokens: number): Promise {
    const now = Date.now();

    // Reset if new minute
    if (now - this.minuteStart > 60000) {
      this.tokensThisMinute = 0;
      this.minuteStart = now;
    }

    // Check if within quota
    return this.tokensThisMinute + estimatedTokens <= this.quotaLimit;
  }

  recordUsage(tokens: number) {
    this.tokensThisMinute += tokens;
  }

  getRemaining(): number {
    return Math.max(0, this.quotaLimit - this.tokensThisMinute);
  }
}

const quotaManager = new QuotaManager();

async function generateWithQuota(prompt: string) {
  const estimated = prompt.length / 4; // Rough estimate

  if (!(await quotaManager.checkQuota(estimated))) {
    throw new Error("Quota exceeded, please wait");
  }

  const result = await ai.generate({
    input: { text: prompt },
    provider: "azure-openai",
    enableAnalytics: true,
  });

  quotaManager.recordUsage(result.usage.totalTokens);
  return result;
}
```

---

## Troubleshooting

### Common Issues

#### 1. "Deployment Not Found"

**Problem**: Incorrect deployment name.

**Solution**:

```bash
# List all deployments
az cognitiveservices account deployment list \
  --name my-openai-resource \
  --resource-group my-resource-group

# Use exact deployment name in config
AZURE_OPENAI_DEPLOYMENT=gpt-4o-deployment  # ✅ Exact name
```

#### 2. "Rate Limit Exceeded (429)"

**Problem**: Exceeded TPM quota for deployment.

**Solution**:

```bash
# Increase quota via Azure Portal:
# 1. Go to resource → Deployments
# 2. Edit deployment
# 3. Increase TPM capacity
# Or request quota increase via support ticket
```

#### 3. "Resource Not Found"

**Problem**: Incorrect endpoint or resource deleted.

**Solution**:

```bash
# Verify resource exists
az cognitiveservices account show \
  --name my-openai-resource \
  --resource-group my-resource-group

# Check endpoint format
AZURE_OPENAI_ENDPOINT=https://my-resource.openai.azure.com/  # ✅ With trailing slash
```

#### 4. "Invalid API Key"

**Problem**: API key rotated or incorrect.

**Solution**:

```bash
# Regenerate key
az cognitiveservices account keys regenerate \
  --name my-openai-resource \
  --resource-group my-resource-group \
  --key-name key1

# Update environment variable
```

---

## Best Practices

### 1. ✅ Use Managed Identity in Azure

```typescript
// ✅ Good: Managed identity (no keys to manage)
const credential = new DefaultAzureCredential();

const ai = new NeuroLink({
  providers: [
    {
      name: "azure-openai",
      config: { credential, endpoint, deployment },
    },
  ],
});
```

### 2. ✅ Deploy Multiple Regions for HA

```typescript
// ✅ Good: Multi-region failover
providers: [
  { name: "azure-us", priority: 1 },
  { name: "azure-eu", priority: 2 },
];
```

### 3. ✅ Use Private Endpoints for Security

```bash
# ✅ Good: Private endpoint + disable public access
az cognitiveservices account update \
  --public-network-access Disabled
```

### 4. ✅ Monitor Costs with Budgets

```bash
# ✅ Good: Set budget alerts
az consumption budget create \
  --amount 1000 \
  --time-grain Monthly
```

### 5. ✅ Enable Diagnostic Logging

```bash
# ✅ Good: Enable audit logs
az monitor diagnostic-settings create \
  --logs '[{"category":"Audit","enabled":true}]'
```

---

## Related Documentation

- **[Provider Setup Guide](/docs/getting-started/provider-setup)** - General provider configuration
- **[Multi-Region Deployment](/docs/guides/enterprise/multi-region)** - Geographic distribution
- **[Compliance Guide](/docs/guides/enterprise/compliance)** - Security and compliance
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce costs

---

## Additional Resources

- **[Azure OpenAI Documentation](https://learn.microsoft.com/azure/cognitive-services/openai/)** - Official docs
- **[Azure OpenAI Pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/)** - Pricing details
- **[Azure Portal](https://portal.azure.com/)** - Manage resources
- **[Azure CLI Reference](https://learn.microsoft.com/cli/azure/cognitiveservices)** - CLI commands

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Google AI Studio Provider Guide

<!-- Source: getting-started/providers/google-ai.md -->

# Google AI Studio Provider Guide

**Direct access to Google's Gemini models with generous free tier and simple API key authentication**

## Quick Start

### 1. Get Your API Key

1. Visit [Google AI Studio](https://aistudio.google.com/)
2. Sign in with your Google account (no GCP project needed)
3. Click **Get API Key** in the top navigation
4. Click **Create API Key**
5. Copy the generated key (starts with `AIza`)

### 2. Configure NeuroLink

Add to your `.env` file:

```bash
GOOGLE_AI_API_KEY=AIza-your-api-key-here
```

### 3. Test the Setup

```bash
# CLI - Test with default model
npx @juspay/neurolink generate "Hello from Google AI!" --provider google-ai

# CLI - Use specific Gemini model
npx @juspay/neurolink generate "Explain quantum physics" \
  --provider google-ai \
  --model "gemini-2.0-flash"

# SDK
node -e "
const { NeuroLink } = require('@juspay/neurolink');
(async () => {
  const ai = new NeuroLink();
  const result = await ai.generate({
    input: { text: 'Hello from Gemini!' },
    provider: 'google-ai'
  });
  console.log(result.content);
})();
"
```

---

## Free Tier Details

### Current Limits (Updated 2025)

| Resource                      | Free Tier Limit | Notes                            |
| ----------------------------- | --------------- | -------------------------------- |
| **Requests per Minute (RPM)** | 15 RPM          | Per API key                      |
| **Tokens per Minute (TPM)**   | 1M TPM          | Combined input + output          |
| **Requests per Day (RPD)**    | 1,500 RPD       | Rolling 24-hour window           |
| **Concurrent Requests**       | 15              | Max simultaneous requests        |
| **Context Length**            | Up to 2M tokens | Model-dependent (Gemini 1.5 Pro) |

### Free Tier Capacity Estimate

```
Daily Capacity:
- 1,500 requests/day × 500 tokens/request = 750K tokens/day
- Equivalent to ~300 pages of text per day
- Or ~150 detailed conversations

Monthly Capacity (30 days):
- 45,000 requests/month
- ~22.5M tokens/month
- Covers most small-medium applications
```

### When to Upgrade

You should consider upgrading to **Vertex AI** when:

- ✅ Exceeding 1,500 requests/day consistently
- ✅ Need for SLA guarantees
- ✅ Enterprise compliance requirements (HIPAA, SOC2)
- ✅ Multi-region deployment
- ✅ Advanced security features (VPC, customer-managed encryption)
- ✅ Fine-tuning custom models

---

## Model Selection Guide

### Available Gemini Models

| Model                      | Description                   | Context    | Best For                             | Free Tier |
| -------------------------- | ----------------------------- | ---------- | ------------------------------------ | --------- |
| **gemini-3-pro-preview**   | Latest flagship with thinking | 2M tokens  | Complex reasoning, extended thinking | ✅ Yes    |
| **gemini-3-flash-preview** | Fast model with thinking      | 1M tokens  | Speed + reasoning, real-time         | ✅ Yes    |
| **gemini-2.0-flash**       | Production fast model         | 1M tokens  | Speed, real-time apps                | ✅ Yes    |
| **gemini-1.5-pro**         | Proven capable model          | 2M tokens  | Complex reasoning, analysis          | ✅ Yes    |
| **gemini-1.5-flash**       | Balanced model                | 1M tokens  | General tasks                        | ✅ Yes    |
| **gemini-1.0-pro**         | Legacy stable model           | 32K tokens | Production stability                 | ✅ Yes    |

### Model Selection by Use Case

```typescript
// Extended thinking for complex problems (Gemini 3)
const deepReasoning = await ai.generate({
  input: { text: "Solve this complex mathematical proof..." },
  provider: "google-ai",
  model: "gemini-3-pro-preview", // Best reasoning with thinking
  thinkingLevel: "high", // Enable extended thinking
});

// Fast reasoning with thinking (Gemini 3 Flash)
const fastReasoning = await ai.generate({
  input: { text: "Analyze this code and find bugs" },
  provider: "google-ai",
  model: "gemini-3-flash-preview", // Fast + reasoning
  thinkingLevel: "medium",
});

// Real-time applications (speed priority)
const realtime = await ai.generate({
  input: { text: "Quick customer query" },
  provider: "google-ai",
  model: "gemini-2.0-flash", // Fastest response
});

// Complex reasoning (quality priority)
const complex = await ai.generate({
  input: { text: "Analyze this complex business scenario..." },
  provider: "google-ai",
  model: "gemini-1.5-pro", // Most capable, 2M context
});

// Multimodal processing
const multimodal = await ai.generate({
  input: {
    text: "Describe this image",
    images: ["data:image/jpeg;base64,..."],
  },
  provider: "google-ai",
  model: "gemini-1.5-pro", // Best for multimodal
});

// Cost-optimized general tasks
const general = await ai.generate({
  input: { text: "General customer support query" },
  provider: "google-ai",
  model: "gemini-1.5-flash", // Balanced performance/cost
});
```

### Context Length Comparison

```
Model Context Limits:
- gemini-3-pro-preview:   2,000,000 tokens (1000 novels)
- gemini-3-flash-preview: 1,000,000 tokens (500 novels)
- gemini-2.0-flash:       1,000,000 tokens (500 novels)
- gemini-1.5-pro:         2,000,000 tokens (1000 novels)
- gemini-1.5-flash:       1,000,000 tokens (500 novels)
- gemini-1.0-pro:            32,000 tokens (16 novels)

For comparison:
- GPT-4 Turbo:          128,000 tokens
- Claude 3.5 Sonnet:    200,000 tokens
```

---

## Extended Thinking (Gemini 3)

Gemini 3 models introduce **Extended Thinking**, a feature that allows the model to "think" more deeply before responding. This improves reasoning quality for complex tasks like mathematical proofs, code analysis, and multi-step problem solving.

### Thinking Levels

| Level       | Description                        | Use Case                            | Token Budget |
| ----------- | ---------------------------------- | ----------------------------------- | ------------ |
| **minimal** | Basic reasoning with minimal usage | Quick decisions, simple queries     | ~500 tokens  |
| **low**     | Quick reasoning, minimal overhead  | Simple analysis, quick decisions    | ~1K tokens   |
| **medium**  | Balanced thinking depth            | Code review, moderate complexity    | ~8K tokens   |
| **high**    | Deep reasoning, maximum thinking   | Complex proofs, architecture design | ~24K tokens  |

### Configuration

```typescript

const ai = new NeuroLink();

// Enable extended thinking with thinkingLevel
const result = await ai.generate({
  input: { text: "Prove that the square root of 2 is irrational" },
  provider: "google-ai",
  model: "gemini-3-pro-preview",
  thinkingLevel: "high", // 'minimal' | 'low' | 'medium' | 'high'
});

console.log(result.content);
```

### Extended Thinking Examples

```typescript
// Mathematical reasoning with high thinking
const mathProof = await ai.generate({
  input: {
    text: "Prove the Pythagorean theorem using at least three different methods",
  },
  provider: "google-ai",
  model: "gemini-3-pro-preview",
  thinkingLevel: "high",
});

// Code architecture analysis
const codeReview = await ai.generate({
  input: {
    text: `Review this code for potential issues and suggest improvements:

    ${codeSnippet}`,
  },
  provider: "google-ai",
  model: "gemini-3-flash-preview",
  thinkingLevel: "medium",
});

// Quick analysis with minimal thinking overhead
const quickAnalysis = await ai.generate({
  input: { text: "What's the time complexity of binary search?" },
  provider: "google-ai",
  model: "gemini-3-flash-preview",
  thinkingLevel: "low",
});
```

### CLI Usage with Thinking

```bash
# Use Gemini 3 with extended thinking
npx @juspay/neurolink generate "Solve this logic puzzle..." \
  --provider google-ai \
  --model "gemini-3-pro-preview" \
  --thinking-level high

# Fast reasoning with medium thinking
npx @juspay/neurolink generate "Analyze this code pattern" \
  --provider google-ai \
  --model "gemini-3-flash-preview" \
  --thinking-level medium
```

### Best Practices for Extended Thinking

1. **Match thinking level to task complexity**: Use `low` for simple queries, `high` for complex reasoning
2. **Consider latency**: Higher thinking levels increase response time
3. **Token budget awareness**: Thinking tokens count toward your quota
4. **Streaming recommended**: Use streaming for high thinking levels to see progress

```typescript
// Stream thinking responses for better UX
for await (const chunk of ai.stream({
  input: { text: "Design a distributed caching system" },
  provider: "google-ai",
  model: "gemini-3-pro-preview",
  thinkingLevel: "high",
})) {
  process.stdout.write(chunk.content);
}
```

---

## Rate Limiting and Quotas

### Understanding Rate Limits

Google AI Studio enforces **three types of limits**:

1. **RPM (Requests Per Minute)**: 15 requests in any 60-second window
2. **TPM (Tokens Per Minute)**: 1M tokens in any 60-second window
3. **RPD (Requests Per Day)**: 1,500 requests in any 24-hour window

### Rate Limit Handling

```typescript
// ✅ Good: Implement exponential backoff
async function generateWithBackoff(prompt: string, maxRetries = 3) {
  for (let attempt = 0; attempt  setTimeout(resolve, delay));
      } else {
        throw error;
      }
    }
  }
  throw new Error("Max retries exceeded");
}
```

### Quota Monitoring

```typescript
// Track quota usage
class QuotaTracker {
  private requestsToday = 0;
  private requestsThisMinute = 0;
  private tokensThisMinute = 0;
  private minuteStart = Date.now();
  private dayStart = Date.now();

  async checkQuota() {
    const now = Date.now();

    // Reset minute counters
    if (now - this.minuteStart > 60000) {
      this.requestsThisMinute = 0;
      this.tokensThisMinute = 0;
      this.minuteStart = now;
    }

    // Reset day counter
    if (now - this.dayStart > 86400000) {
      this.requestsToday = 0;
      this.dayStart = now;
    }

    // Check limits
    if (this.requestsThisMinute >= 15) {
      throw new Error("RPM limit reached (15/min)");
    }
    if (this.tokensThisMinute >= 1000000) {
      throw new Error("TPM limit reached (1M/min)");
    }
    if (this.requestsToday >= 1500) {
      throw new Error("RPD limit reached (1500/day)");
    }
  }

  recordUsage(tokens: number) {
    this.requestsThisMinute++;
    this.requestsToday++;
    this.tokensThisMinute += tokens;
  }
}

// Usage
const tracker = new QuotaTracker();

async function generate(prompt: string) {
  await tracker.checkQuota();

  const result = await ai.generate({
    input: { text: prompt },
    provider: "google-ai",
    enableAnalytics: true,
  });

  tracker.recordUsage(result.usage.totalTokens);
  return result;
}
```

### Rate Limiting Best Practices

```typescript
// ✅ Good: Request queuing for high-volume apps
class RequestQueue {
  private queue: Array void;
    reject: (error: any) => void;
  }> = [];
  private processing = false;
  private requestsThisMinute = 0;
  private minuteStart = Date.now();

  async enqueue(prompt: string): Promise {
    return new Promise((resolve, reject) => {
      this.queue.push({ prompt, resolve, reject });
      this.processQueue();
    });
  }

  private async processQueue() {
    if (this.processing || this.queue.length === 0) return;

    this.processing = true;

    while (this.queue.length > 0) {
      // Check rate limit (15 RPM)
      const now = Date.now();
      if (now - this.minuteStart > 60000) {
        this.requestsThisMinute = 0;
        this.minuteStart = now;
      }

      if (this.requestsThisMinute >= 15) {
        // Wait until minute resets
        await new Promise((resolve) => setTimeout(resolve, 4000)); // 4s delay
        continue;
      }

      const item = this.queue.shift()!;

      try {
        const result = await ai.generate({
          input: { text: item.prompt },
          provider: "google-ai",
        });
        this.requestsThisMinute++;
        item.resolve(result);
      } catch (error) {
        item.reject(error);
      }
    }

    this.processing = false;
  }
}

// Usage
const queue = new RequestQueue();
const result = await queue.enqueue("Your prompt");
```

---

## SDK Integration

### Basic Usage

```typescript

const ai = new NeuroLink();

// Simple generation
const result = await ai.generate({
  input: { text: "Explain machine learning" },
  provider: "google-ai",
});

console.log(result.content);
```

### Multimodal Capabilities

```typescript
// Image analysis
const imageAnalysis = await ai.generate({
  input: {
    text: "Describe what you see in this image",
    images: ["data:image/jpeg;base64,/9j/4AAQSkZJRg..."],
  },
  provider: "google-ai",
  model: "gemini-1.5-pro",
});

// Video analysis (Gemini 1.5 Pro)
const videoAnalysis = await ai.generate({
  input: {
    text: "Summarize the key events in this video",
    videos: ["data:video/mp4;base64,..."],
  },
  provider: "google-ai",
  model: "gemini-1.5-pro",
});

// Audio transcription and analysis
const audioAnalysis = await ai.generate({
  input: {
    text: "Transcribe and analyze the sentiment",
    audio: ["data:audio/mp3;base64,..."],
  },
  provider: "google-ai",
  model: "gemini-1.5-pro",
});
```

### Streaming Responses

```typescript
// Stream long responses for better UX
for await (const chunk of ai.stream({
  input: { text: "Write a detailed article about AI" },
  provider: "google-ai",
  model: "gemini-1.5-pro",
})) {
  process.stdout.write(chunk.content);
}
```

### Large Context Handling

```typescript
// Leverage 2M token context window (Gemini 1.5 Pro)
const largeDocument = readFileSync("large-document.txt", "utf-8");

const analysis = await ai.generate({
  input: {
    text: `Analyze this entire document and provide key insights:\n\n${largeDocument}`,
  },
  provider: "google-ai",
  model: "gemini-1.5-pro", // 2M context window
});
```

### Tool/Function Calling

```typescript
// Function calling (supported in Gemini models)
const tools = [
  {
    name: "get_weather",
    description: "Get current weather for a location",
    parameters: {
      type: "object",
      properties: {
        location: { type: "string", description: "City name" },
      },
      required: ["location"],
    },
  },
];

const result = await ai.generate({
  input: { text: "What's the weather in London?" },
  provider: "google-ai",
  model: "gemini-1.5-pro",
  tools,
});

console.log(result.toolCalls); // Function calls to execute
```

---

## CLI Usage

### Basic Commands

```bash
# Generate with default model
npx @juspay/neurolink generate "Hello Gemini" --provider google-ai

# Use specific model
npx @juspay/neurolink gen "Write code" \
  --provider google-ai \
  --model "gemini-2.0-flash"

# Stream response
npx @juspay/neurolink stream "Tell a story" --provider google-ai

# Check provider status
npx @juspay/neurolink status --provider google-ai
```

### Advanced Usage

```bash
# With temperature and max tokens
npx @juspay/neurolink gen "Creative writing prompt" \
  --provider google-ai \
  --model "gemini-1.5-pro" \
  --temperature 0.9 \
  --max-tokens 2000

# Interactive mode
npx @juspay/neurolink loop --provider google-ai --model "gemini-2.0-flash"

# Multimodal: Image analysis (requires image file)
npx @juspay/neurolink gen "Describe this image" \
  --provider google-ai \
  --model "gemini-1.5-pro" \
  --image ./photo.jpg
```

---

## Configuration Options

### Environment Variables

```bash
# Required
GOOGLE_AI_API_KEY=AIza-your-key-here

# Optional
GOOGLE_AI_MODEL=gemini-2.0-flash  # Default model
GOOGLE_AI_TIMEOUT=60000  # Request timeout (ms)
GOOGLE_AI_MAX_RETRIES=3  # Retry attempts on rate limits
```

### Programmatic Configuration

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "google-ai",
      config: {
        apiKey: process.env.GOOGLE_AI_API_KEY,
        defaultModel: "gemini-2.0-flash",
        timeout: 60000,
        maxRetries: 3,
        retryDelay: 1000,
      },
    },
  ],
});
```

---

## Google AI Studio vs Vertex AI

### When to Use Google AI Studio

✅ **Choose Google AI Studio when:**

- Development and prototyping
- Low-volume production (\1,500 requests/day)
- Enterprise compliance (HIPAA, SOC2)
- SLA guarantees required
- Multi-region deployment
- VPC/private networking
- Custom model fine-tuning
- Advanced security controls

### Feature Comparison

| Feature              | Google AI Studio          | Vertex AI              |
| -------------------- | ------------------------- | ---------------------- |
| **Authentication**   | API key                   | Service account (GCP)  |
| **Free Tier**        | ✅ Yes (15 RPM, 1.5K RPD) | ❌ No                  |
| **Rate Limits**      | 15 RPM, 1M TPM            | Custom quotas          |
| **SLA**              | ❌ No                     | ✅ Yes (99.9%)         |
| **Compliance**       | Basic                     | HIPAA, SOC2, ISO       |
| **Regions**          | Global                    | Multi-region choice    |
| **VPC Support**      | ❌ No                     | ✅ Yes                 |
| **Setup Complexity** | Low (1 API key)           | High (GCP project)     |
| **Best For**         | Development, POCs         | Production, enterprise |

### Migration Path

```typescript
// Start with Google AI Studio for development
const devAI = new NeuroLink({
  providers: [
    {
      name: "google-ai",
      config: {
        apiKey: process.env.GOOGLE_AI_API_KEY,
      },
    },
  ],
});

// Migrate to Vertex AI for production
const prodAI = new NeuroLink({
  providers: [
    {
      name: "vertex",
      config: {
        projectId: "your-gcp-project",
        location: "us-central1",
        credentials: "/path/to/service-account.json",
      },
    },
  ],
});

// Hybrid: Use both with failover
const hybridAI = new NeuroLink({
  providers: [
    {
      name: "vertex",
      priority: 1, // Prefer Vertex for production
      condition: (req) => req.env === "production",
    },
    {
      name: "google-ai",
      priority: 2, // Fallback to AI Studio
      condition: (req) => req.env !== "production",
    },
  ],
});
```

---

## Troubleshooting

### Common Issues

#### 1. "API key not valid"

**Problem**: API key is incorrect or expired.

**Solution**:

```bash
# Verify key format (should start with AIza)
echo $GOOGLE_AI_API_KEY

# Regenerate key at https://aistudio.google.com/
# Ensure no extra spaces in .env
GOOGLE_AI_API_KEY=AIza-your-key  # ✅ Correct
GOOGLE_AI_API_KEY= AIza-your-key # ❌ Extra space
```

#### 2. "429 Too Many Requests"

**Problem**: Exceeded rate limits (15 RPM, 1M TPM, or 1500 RPD).

**Solution**:

```typescript
// Implement backoff strategy (see Rate Limiting section above)
// Or reduce request frequency
// Monitor quota usage

// Check current quota status
const status = await ai.checkStatus("google-ai");
console.log("Rate limit status:", status);
```

#### 3. "Resource Exhausted" (Quota)

**Problem**: Exceeded daily quota (1,500 requests/day).

**Solution**:

- Wait for quota reset (24-hour rolling window)
- Upgrade to Vertex AI for higher quotas
- Implement request caching:

```typescript
// Cache frequent queries
const cache = new Map();

async function cachedGenerate(prompt: string) {
  if (cache.has(prompt)) {
    console.log("Cache hit");
    return cache.get(prompt);
  }

  const result = await ai.generate({
    input: { text: prompt },
    provider: "google-ai",
  });

  cache.set(prompt, result);
  return result;
}
```

#### 4. Slow Response Times

**Problem**: Network latency or model processing time.

**Solution**:

```typescript
// Use streaming for immediate feedback
for await (const chunk of ai.stream({
  input: { text: "Your prompt" },
  provider: "google-ai",
  model: "gemini-2.0-flash", // Fastest model
})) {
  // Display partial results immediately
  console.log(chunk.content);
}
```

#### 5. "Model not found"

**Problem**: Invalid or deprecated model name.

**Solution**:

```typescript
// Use current model names
const validModels = [
  "gemini-3-pro-preview", // ✅ Latest with thinking
  "gemini-3-flash-preview", // ✅ Fast with thinking
  "gemini-2.0-flash", // ✅ Production stable
  "gemini-1.5-pro", // ✅ Current
  "gemini-1.5-flash", // ✅ Current
  "gemini-pro", // ❌ Use gemini-1.0-pro instead
];

const result = await ai.generate({
  input: { text: "test" },
  provider: "google-ai",
  model: "gemini-3-flash-preview", // Use latest
});
```

---

## Best Practices

### 1. Quota Management

```typescript
// ✅ Good: Implement quota tracking
class GoogleAIClient {
  private dailyRequests = 0;
  private dayStart = Date.now();

  async generate(prompt: string) {
    // Reset daily counter
    if (Date.now() - this.dayStart > 86400000) {
      this.dailyRequests = 0;
      this.dayStart = Date.now();
    }

    // Check quota
    if (this.dailyRequests >= 1450) {
      // Buffer before hard limit
      console.warn("Approaching daily quota limit");
      // Switch to backup provider or queue request
    }

    const result = await ai.generate({
      input: { text: prompt },
      provider: "google-ai",
    });

    this.dailyRequests++;
    return result;
  }
}
```

### 2. Error Handling

```typescript
// ✅ Good: Comprehensive error handling
async function robustGenerate(prompt: string) {
  try {
    return await ai.generate({
      input: { text: prompt },
      provider: "google-ai",
    });
  } catch (error) {
    if (error.message.includes("429")) {
      // Rate limit - implement backoff
      await new Promise((r) => setTimeout(r, 2000));
      return robustGenerate(prompt);
    } else if (error.message.includes("quota")) {
      // Quota exhausted - switch provider
      return await ai.generate({
        input: { text: prompt },
        provider: "openai", // Fallback
      });
    } else if (error.message.includes("timeout")) {
      // Timeout - retry with shorter timeout
      return await ai.generate({
        input: { text: prompt },
        provider: "google-ai",
        timeout: 30000,
      });
    } else {
      throw error;
    }
  }
}
```

### 3. Model Selection

```typescript
// ✅ Good: Choose appropriate model for task
function selectModel(task: string, needsThinking: boolean = false): string {
  const taskType = analyzeTask(task);

  // Use Gemini 3 for tasks requiring deep reasoning
  if (needsThinking || /prove|reason|analyze deeply|architecture/.test(task)) {
    return taskType === "realtime"
      ? "gemini-3-flash-preview" // Fast thinking
      : "gemini-3-pro-preview"; // Deep thinking
  }

  switch (taskType) {
    case "simple":
      return "gemini-1.5-flash"; // Fast, cost-effective
    case "complex":
      return "gemini-3-pro-preview"; // High capability with thinking
    case "realtime":
      return "gemini-2.0-flash"; // Lowest latency
    case "multimodal":
      return "gemini-1.5-pro"; // Best multimodal
    default:
      return "gemini-2.0-flash"; // Default
  }
}

function analyzeTask(task: string): string {
  if (task.length ();
  private TTL = 3600000; // 1 hour

  async generate(prompt: string, options: any = {}) {
    const cacheKey = this.getCacheKey(prompt, options);
    const cached = this.cache.get(cacheKey);

    // Return cached if fresh
    if (cached && Date.now() - cached.timestamp < this.TTL) {
      console.log("Cache hit");
      return cached.result;
    }

    // Generate fresh result
    const result = await ai.generate({
      input: { text: prompt },
      provider: "google-ai",
      ...options,
    });

    // Store in cache
    this.cache.set(cacheKey, {
      result,
      timestamp: Date.now(),
    });

    return result;
  }

  private getCacheKey(prompt: string, options: any): string {
    const hash = createHash("sha256");
    hash.update(JSON.stringify({ prompt, options }));
    return hash.digest("hex");
  }
}
```

---

## Known Limitations

### Tools + JSON Schema Cannot Be Used Together

:::warning[Critical Limitation]
Gemini models (including Gemini 3) cannot use function calling (tools) and JSON schema output simultaneously. You must choose one or the other.
:::

**Google API Limitation:** Google AI Studio (all Gemini models including Gemini 3) cannot combine function calling with structured output (JSON schema). This is a fundamental Google API constraint documented in the [Gemini API documentation](https://ai.google.dev/gemini-api/docs/).

**Error:**

```
Function calling with a response mime type: 'application/json' is unsupported
```

**Solution:**

```typescript
// ❌ This will fail - tools + schema together
const badResult = await neurolink.generate({
  input: { text: "Analyze this data" },
  schema: MyZodSchema,
  provider: "google-ai",
  model: "gemini-3-pro-preview",
  tools: myTools, // Cannot use tools with schema!
});

// ✅ Correct approach - disable tools when using schema
const result = await neurolink.generate({
  input: { text: "Analyze this data" },
  schema: MyZodSchema,
  output: { format: "json" },
  provider: "google-ai",
  model: "gemini-3-pro-preview",
  disableTools: true, // Required for schemas
});

// ✅ Alternative - use tools without schema
const toolResult = await neurolink.generate({
  input: { text: "What's the weather in London?" },
  provider: "google-ai",
  model: "gemini-3-flash-preview",
  tools: myTools, // Works fine without schema
});
```

**Industry Context:**

- This limitation affects ALL frameworks using Gemini (LangChain, Vercel AI SDK, Agno, Instructor)
- All use the same workaround: disable tools when using schemas
- This applies to all Gemini versions including Gemini 3 preview models
- Check official Google AI Studio documentation for future updates

**Alternative Approaches:**

1. Use OpenAI or Anthropic providers (support both simultaneously)
2. Use Vertex AI with Claude models (via Anthropic integration)
3. Choose between tools OR schemas for Gemini models
4. Chain requests: first call with tools, second call with schema

### Complex Schema Limitations

**"Too many states for serving" Error:**

When using complex Zod schemas, you may encounter:

```
Error: 9 FAILED_PRECONDITION: Too many states for serving
```

**Solutions:**

1. Simplify schema (reduce nesting, array sizes)
2. Use `disableTools: true` (reduces state count)
3. Split complex operations into multiple simpler calls

See [Troubleshooting Guide](/docs/reference/troubleshooting) for details.

---

## Related Documentation

- **[Provider Setup Guide](/docs/getting-started/provider-setup)** - General provider configuration
- **[Google Vertex AI Guide](/docs/getting-started/providers/google-vertex)** - Enterprise Vertex AI setup
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce AI costs
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Handle quotas and rate limits

---

## Additional Resources

- **[Google AI Studio](https://aistudio.google.com/)** - Get API keys
- **[Gemini API Documentation](https://ai.google.dev/docs)** - Official API docs
- **[Gemini Models](https://ai.google.dev/models/gemini)** - Model capabilities
- **[Pricing](https://ai.google.dev/pricing)** - Free tier and paid pricing

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## ⚙️ Provider Configuration Guide

<!-- Source: getting-started/provider-setup.md -->

# ⚙️ Provider Configuration Guide

NeuroLink supports multiple AI providers with flexible authentication methods. This guide covers complete setup for all supported providers.

## Supported Providers

- **OpenAI** - GPT-4o, GPT-4o-mini, GPT-4-turbo
- **Amazon Bedrock** - Claude 3.7 Sonnet, Claude 3.5 Sonnet, Claude 3 Haiku
- **Amazon SageMaker** - Custom models deployed on SageMaker endpoints
- **Google Vertex AI** - Gemini 3 Flash/Pro (preview), Gemini 2.5 Flash, Claude 4.0 Sonnet
- **Google AI Studio** - Gemini 1.5 Pro, Gemini 2.0 Flash, Gemini 1.5 Flash
- **Anthropic** - Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku
- **Azure OpenAI** - GPT-4, GPT-3.5-Turbo
- **LiteLLM** - 100+ models from all providers via proxy server
- **Hugging Face** - 100,000+ open source models including DialoGPT, GPT-2, GPT-Neo
- **Ollama** - Local AI models including Llama 2, Code Llama, Mistral, Vicuna
- **Mistral AI** - Mistral Tiny, Small, Medium, and Large models

##  Model Availability & Cost Considerations

**Important Notes:**

- **Model Availability**: Specific models may not be available in all regions or require special access
- **Cost Variations**: Pricing differs significantly between providers and models (e.g., Claude 3.5 Sonnet vs GPT-4o)
- **Rate Limits**: Each provider has different rate limits and quota restrictions
- **Local vs Cloud**: Ollama (local) has no per-request cost but requires hardware resources
- **Enterprise Tiers**: AWS Bedrock, Google Vertex AI, and Azure typically offer enterprise pricing

**Best Practices:**

- Use `new NeuroLink()` with automatic provider selection for cost-optimized routing
- Monitor usage through built-in analytics to track costs
- Consider local models (Ollama) for development and testing
- Check provider documentation for current pricing and availability

##  Enterprise Proxy Support

**All providers support corporate proxy environments automatically.** Simply set environment variables:

```bash
export HTTPS_PROXY=http://your-corporate-proxy:port
export HTTP_PROXY=http://your-corporate-proxy:port
```

**No code changes required** - NeuroLink automatically detects and uses proxy settings.

**For detailed proxy setup** → See [Enterprise & Proxy Setup Guide](/docs/deployment/enterprise-proxy)

## OpenAI Configuration {#openai}

### Basic Setup

```bash
export OPENAI_API_KEY="sk-your-openai-api-key"
```

### Optional Configuration

```bash
export OPENAI_MODEL="gpt-4o"  # Default model to use
```

### Supported Models

- `gpt-4o` (default) - Latest multimodal model
- `gpt-4o-mini` - Cost-effective variant
- `gpt-4-turbo` - High-performance model

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Explain machine learning" },
  provider: "openai",
  model: "gpt-4o",
  temperature: 0.7,
  maxTokens: 500,
  timeout: "30s", // Optional: Override default 30s timeout
});
```

### Timeout Configuration

- **Default Timeout**: 30 seconds
- **Supported Formats**: Milliseconds (`30000`), human-readable (`'30s'`, `'1m'`, `'5m'`)
- **Environment Variable**: `OPENAI_TIMEOUT='45s'` (optional)

## Amazon Bedrock Configuration {#bedrock}

###  Critical Setup Requirements

**⚠️ IMPORTANT: Anthropic Models Require Inference Profile ARN**

For Anthropic Claude models in Bedrock, you **MUST** use the full inference profile ARN, not simple model names:

```bash
# ✅ CORRECT: Use full inference profile ARN
export BEDROCK_MODEL="arn:aws:bedrock:us-east-2::inference-profile/us.anthropic.claude-3-7-sonnet-20250219-v1:0"

# ❌ WRONG: Simple model names cause "not authorized to invoke this API" errors
# export BEDROCK_MODEL="anthropic.claude-3-sonnet-20240229-v1:0"
```

### Basic AWS Credentials

```bash
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-2"
```

### Session Token Support (Development)

For temporary credentials (common in development environments):

```bash
export AWS_SESSION_TOKEN="your-session-token"  # Required for temporary credentials
```

### Available Inference Profile ARNs

Replace `` with your AWS account ID:

```bash
# Claude 3.7 Sonnet (Latest - Recommended)
BEDROCK_MODEL="arn:aws:bedrock:us-east-2::inference-profile/us.anthropic.claude-3-7-sonnet-20250219-v1:0"

# Claude 3.5 Sonnet
BEDROCK_MODEL="arn:aws:bedrock:us-east-2::inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0"

# Claude 3 Haiku
BEDROCK_MODEL="arn:aws:bedrock:us-east-2::inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0"
```

### Why Inference Profiles?

- **Cross-Region Access**: Faster access across AWS regions
- **Better Performance**: Optimized routing and response times
- **Higher Availability**: Improved model availability and reliability
- **Different Permissions**: Separate permission model from base models

### Complete Bedrock Configuration

```bash
# Required AWS credentials
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-2"

# Optional: Session token for temporary credentials
export AWS_SESSION_TOKEN="your-session-token"

# Required: Inference profile ARN (not simple model name)
export BEDROCK_MODEL="arn:aws:bedrock:us-east-2::inference-profile/us.anthropic.claude-3-7-sonnet-20250219-v1:0"

# Alternative environment variable names (backward compatibility)
export BEDROCK_MODEL_ID="arn:aws:bedrock:us-east-2::inference-profile/us.anthropic.claude-3-7-sonnet-20250219-v1:0"
```

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Write a haiku about AI" },
  provider: "bedrock",
  temperature: 0.8,
  maxTokens: 100,
  timeout: "45s", // Optional: Override default 45s timeout
});
```

### Timeout Configuration

- **Default Timeout**: 45 seconds (longer due to cold starts)
- **Supported Formats**: Milliseconds (`45000`), human-readable (`'45s'`, `'1m'`, `'2m'`)
- **Environment Variable**: `BEDROCK_TIMEOUT='1m'` (optional)

### Account Setup Requirements

To use AWS Bedrock, ensure your AWS account has:

1. **Bedrock Service Access**: Enable Bedrock in your AWS region
2. **Model Access**: Request access to Anthropic Claude models
3. **IAM Permissions**: Your credentials need `bedrock:InvokeModel` permissions
4. **Inference Profile Access**: Access to the specific inference profiles

### IAM Policy Example

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": ["arn:aws:bedrock:*:*:inference-profile/us.anthropic.*"]
    }
  ]
}
```

## Amazon SageMaker Configuration

**Amazon SageMaker** allows you to use your own custom models deployed on SageMaker endpoints. This provider is perfect for:

- **Custom Model Hosting** - Deploy your fine-tuned models
- **Enterprise Compliance** - Full control over model infrastructure
- **Cost Optimization** - Pay only for inference usage
- **Performance** - Dedicated compute resources

### Basic AWS Credentials

```bash
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"  # Your SageMaker region
```

### SageMaker-Specific Configuration

```bash
# Required: Your SageMaker endpoint name
export SAGEMAKER_DEFAULT_ENDPOINT="your-endpoint-name"

# Optional: Timeout and retry settings
export SAGEMAKER_TIMEOUT="30000"      # 30 seconds (default)
export SAGEMAKER_MAX_RETRIES="3"      # Retry attempts (default)
```

### Advanced Model Configuration

```bash
# Optional: Model-specific settings
export SAGEMAKER_MODEL="custom-model-name"    # Model identifier
export SAGEMAKER_MODEL_TYPE="custom"          # Model type
export SAGEMAKER_CONTENT_TYPE="application/json"
export SAGEMAKER_ACCEPT="application/json"
```

### Session Token Support (for IAM Roles)

```bash
export AWS_SESSION_TOKEN="your-session-token"  # For temporary credentials
```

### Complete SageMaker Configuration

```bash
# AWS Credentials
export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
export AWS_REGION="us-east-1"

# SageMaker Settings
export SAGEMAKER_DEFAULT_ENDPOINT="my-model-endpoint-2024"
export SAGEMAKER_TIMEOUT="45000"
export SAGEMAKER_MAX_RETRIES="5"
```

### Usage Example

```bash
# Test SageMaker endpoint
npx @juspay/neurolink sagemaker test my-endpoint

# Generate text with SageMaker
npx @juspay/neurolink generate "Analyze this data" --provider sagemaker

# Interactive setup
npx @juspay/neurolink sagemaker setup
```

### CLI Commands

```bash
# Check SageMaker configuration
npx @juspay/neurolink sagemaker status

# Validate connection
npx @juspay/neurolink sagemaker validate

# Show current configuration
npx @juspay/neurolink sagemaker config

# Performance benchmark
npx @juspay/neurolink sagemaker benchmark my-endpoint

# List available endpoints (requires AWS CLI)
npx @juspay/neurolink sagemaker list-endpoints
```

### Timeout Configuration

Configure request timeouts for SageMaker endpoints:

```bash
export SAGEMAKER_TIMEOUT="60000"  # 60 seconds for large models
```

### Prerequisites

1. **SageMaker Endpoint**: Deploy a model to SageMaker and get the endpoint name
2. **AWS IAM Permissions**: Ensure your credentials have `sagemaker:InvokeEndpoint` permission
3. **Endpoint Status**: Endpoint must be in "InService" status

### IAM Policy Example

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["sagemaker:InvokeEndpoint"],
      "Resource": "arn:aws:sagemaker:*:*:endpoint/*"
    }
  ]
}
```

### Environment Variables Reference

| Variable                     | Required | Default   | Description               |
| ---------------------------- | -------- | --------- | ------------------------- |
| `AWS_ACCESS_KEY_ID`          | ✅       | -         | AWS access key            |
| `AWS_SECRET_ACCESS_KEY`      | ✅       | -         | AWS secret key            |
| `AWS_REGION`                 | ✅       | us-east-1 | AWS region                |
| `SAGEMAKER_DEFAULT_ENDPOINT` | ✅       | -         | SageMaker endpoint name   |
| `SAGEMAKER_TIMEOUT`          | ❌       | 30000     | Request timeout (ms)      |
| `SAGEMAKER_MAX_RETRIES`      | ❌       | 3         | Retry attempts            |
| `AWS_SESSION_TOKEN`          | ❌       | -         | For temporary credentials |

###  Complete SageMaker Guide

For comprehensive SageMaker setup, advanced features, and production deployment:
**[ Complete SageMaker Integration Guide](/docs/getting-started/providers/sagemaker)** - Includes:

- Model deployment examples
- Cost optimization strategies
- Enterprise security patterns
- Multi-model endpoint management
- Performance testing and monitoring
- Troubleshooting and debugging

## Google Vertex AI Configuration {#vertex}

NeuroLink supports **three authentication methods** for Google Vertex AI to accommodate different deployment environments:

### Method 1: Service Account File (Recommended for Production)

Best for production environments where you can store service account files securely.

```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
export GOOGLE_VERTEX_PROJECT="your-project-id"
export GOOGLE_VERTEX_LOCATION="us-central1"
```

**Setup Steps:**

1. Create a service account in Google Cloud Console
2. Download the service account JSON file
3. Set the file path in `GOOGLE_APPLICATION_CREDENTIALS`

### Method 2: Service Account JSON String (Good for Containers/Cloud)

Best for containerized environments where file storage is limited.

```bash
export GOOGLE_SERVICE_ACCOUNT_KEY='{"type":"service_account","project_id":"your-project",...}'
export GOOGLE_VERTEX_PROJECT="your-project-id"
export GOOGLE_VERTEX_LOCATION="us-central1"
```

**Setup Steps:**

1. Copy the entire contents of your service account JSON file
2. Set it as a single-line string in `GOOGLE_SERVICE_ACCOUNT_KEY`
3. NeuroLink will automatically create a temporary file for authentication

### Method 3: Individual Environment Variables (Good for CI/CD)

Best for CI/CD pipelines where individual secrets are managed separately.

```bash
export GOOGLE_AUTH_CLIENT_EMAIL="service-account@project.iam.gserviceaccount.com"
export GOOGLE_AUTH_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\nMIIE..."
export GOOGLE_VERTEX_PROJECT="your-project-id"
export GOOGLE_VERTEX_LOCATION="us-central1"
```

**Setup Steps:**

1. Extract `client_email` and `private_key` from your service account JSON
2. Set them as individual environment variables
3. NeuroLink will automatically assemble them into a temporary service account file

### Authentication Detection

NeuroLink automatically detects and uses the best available authentication method in this order:

1. **File Path** (`GOOGLE_APPLICATION_CREDENTIALS`) - if file exists
2. **JSON String** (`GOOGLE_SERVICE_ACCOUNT_KEY`) - if provided
3. **Individual Variables** (`GOOGLE_AUTH_CLIENT_EMAIL` + `GOOGLE_AUTH_PRIVATE_KEY`) - if both provided

### Complete Vertex AI Configuration

```bash
# Required for all methods
export GOOGLE_VERTEX_PROJECT="your-gcp-project-id"

# Optional
export GOOGLE_VERTEX_LOCATION="us-east5"        # Default: us-east5
export VERTEX_MODEL_ID="claude-sonnet-4@20250514"  # Default model

# Choose ONE authentication method:

# Method 1: Service Account File
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

# Method 2: Service Account JSON String
export GOOGLE_SERVICE_ACCOUNT_KEY='{"type":"service_account","project_id":"your-project","private_key_id":"...","private_key":"-----BEGIN PRIVATE KEY-----\n...","client_email":"...","client_id":"...","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri":"https://oauth2.googleapis.com/token","auth_provider_x509_cert_url":"https://www.googleapis.com/oauth2/v1/certs","client_x509_cert_url":"..."}'

# Method 3: Individual Environment Variables
export GOOGLE_AUTH_CLIENT_EMAIL="service-account@your-project.iam.gserviceaccount.com"
export GOOGLE_AUTH_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQC...\n-----END PRIVATE KEY-----"
```

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Explain quantum computing" },
  provider: "vertex",
  model: "gemini-2.5-flash",
  temperature: 0.6,
  maxTokens: 800,
  timeout: "1m", // Optional: Override default 60s timeout
});
```

### Timeout Configuration

- **Default Timeout**: 60 seconds (longer due to GCP initialization)
- **Supported Formats**: Milliseconds (`60000`), human-readable (`'60s'`, `'1m'`, `'2m'`)
- **Environment Variable**: `VERTEX_TIMEOUT='90s'` (optional)

### Supported Models

**Gemini 3 (Preview):**

- `gemini-3-flash-preview` - Latest Gemini 3 Flash with extended thinking support
- `gemini-3-pro-preview` - Latest Gemini 3 Pro with extended thinking support

**Gemini 2.x:**

- `gemini-2.5-flash` (default) - Fast, efficient model

**Anthropic Models:**

- `claude-sonnet-4@20250514` - High-quality reasoning (Anthropic via Vertex AI)

**Video Generation:**

- `veo-3.1` / `veo-3.1-generate-001` - Video generation from image + text prompt (8-second videos with audio)

> **Video Generation:** Use `output.mode: "video"` with Veo 3.1 to generate videos. See [Video Generation Guide](/docs/features/video-generation).

### Gemini 3 Extended Thinking Configuration

Gemini 3 models support **extended thinking** (also known as "thinking mode"), which allows the model to reason more deeply before providing responses. This is particularly useful for complex reasoning tasks, math problems, and multi-step analysis.

#### Environment Variables for Gemini 3

```bash
# Required: Google Vertex AI credentials (same as above)
export GOOGLE_VERTEX_PROJECT="your-project-id"
export GOOGLE_VERTEX_LOCATION="us-central1"

# Gemini 3 model selection
export VERTEX_MODEL_ID="gemini-3-flash-preview"  # or gemini-3-pro-preview
```

#### Extended Thinking Configuration

Configure thinking level to control how much reasoning the model performs:

```typescript

const neurolink = new NeuroLink();

// Enable extended thinking with thinkingLevel configuration
const result = await neurolink.generate({
  input: { text: "Solve this complex math problem step by step: ..." },
  provider: "vertex",
  model: "gemini-3-flash-preview",
  temperature: 0.7,
  maxTokens: 4000,
  // Gemini 3 extended thinking configuration
  thinkingLevel: "medium", // Options: "minimal", "low", "medium", "high"
});
```

#### Thinking Levels

| Level     | Description                             | Best For                          |
| --------- | --------------------------------------- | --------------------------------- |
| `minimal` | No extended thinking, fastest responses | Simple queries, quick answers     |
| `low`     | Brief reasoning before responding       | Moderate complexity tasks         |
| `medium`  | Balanced reasoning depth (recommended)  | Most use cases                    |
| `high`    | Deep reasoning, thorough analysis       | Complex math, multi-step problems |

#### Usage Example with Extended Thinking

```typescript

const neurolink = new NeuroLink();

// Complex reasoning task with high thinking level
const result = await neurolink.generate({
  input: {
    text: "Analyze the following business scenario and provide strategic recommendations...",
  },
  provider: "vertex",
  model: "gemini-3-pro-preview",
  thinkingLevel: "high",
  maxTokens: 8000,
  timeout: "2m", // Extended timeout for deep thinking
});

console.log(result.content);
```

#### CLI Usage with Gemini 3

```bash
# Generate with Gemini 3 Flash
npx @juspay/neurolink generate "Explain quantum computing" --provider vertex --model gemini-3-flash-preview

# Stream with Gemini 3 Pro
npx @juspay/neurolink stream "Write a detailed analysis" --provider vertex --model gemini-3-pro-preview
```

### Claude Sonnet 4 via Vertex AI Configuration

NeuroLink provides first-class support for Claude Sonnet 4 through Google Vertex AI. This configuration has been thoroughly tested and verified working.

#### Working Configuration Example

```bash
# ✅ VERIFIED WORKING CONFIGURATION
export GOOGLE_VERTEX_PROJECT="your-project-id"
export GOOGLE_VERTEX_LOCATION="us-east5"
export GOOGLE_AUTH_CLIENT_EMAIL="service-account@your-project.iam.gserviceaccount.com"
export GOOGLE_AUTH_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----
[Your private key content here]
--"
```

#### Performance Metrics (Verified)

- **Generation Response**: ~2.6 seconds
- **Health Check**: Working status detection
- **Streaming**: Fully functional
- **Tool Integration**: Ready for MCP tools

#### Usage Examples

```bash
# Generation test
node dist/cli/index.js generate "test" --provider vertex --model claude-sonnet-4@20250514

# Streaming test
node dist/cli/index.js stream "Write a short poem" --provider vertex --model claude-sonnet-4@20250514

# Health check
node dist/cli/index.js status
# Expected: vertex: ✅ Working (2599ms)
```

### Google Cloud Setup Requirements

To use Google Vertex AI, ensure your Google Cloud project has:

1. **Vertex AI API Enabled**: Enable the Vertex AI API in your project
2. **Service Account**: Create a service account with Vertex AI permissions
3. **Model Access**: Ensure access to the models you want to use
4. **Billing Enabled**: Vertex AI requires an active billing account

### Service Account Permissions

Your service account needs these IAM roles:

- `Vertex AI User` or `Vertex AI Admin`
- `Service Account Token Creator` (if using impersonation)

## Google AI Studio Configuration {#google-ai}

Google AI Studio provides direct access to Google's Gemini models with a simple API key authentication.

### Basic Setup

```bash
export GOOGLE_AI_API_KEY="AIza-your-google-ai-api-key"
```

### Optional Configuration

```bash
export GOOGLE_AI_MODEL="gemini-2.5-pro"  # Default model to use
```

### Supported Models

- `gemini-2.5-pro` - Comprehensive, detailed responses for complex tasks
- `gemini-2.5-flash` (recommended) - Fast, efficient responses for most tasks

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Explain the future of AI" },
  provider: "google-ai",
  model: "gemini-2.5-flash",
  temperature: 0.7,
  maxTokens: 1000,
  timeout: "30s", // Optional: Override default 30s timeout
});
```

### Timeout Configuration

- **Default Timeout**: 30 seconds
- **Supported Formats**: Milliseconds (`30000`), human-readable (`'30s'`, `'1m'`, `'5m'`)
- **Environment Variable**: `GOOGLE_AI_TIMEOUT='45s'` (optional)

### How to Get Google AI Studio API Key

1. **Visit Google AI Studio**: Go to [aistudio.google.com](https://aistudio.google.com)
2. **Sign In**: Use your Google account credentials
3. **Create API Key**:
   - Navigate to the **API Keys** section
   - Click **Create API Key**
   - Copy the generated key (starts with `AIza`)
4. **Set Environment**: Add to your `.env` file or export directly

### Google AI Studio vs Vertex AI

| Feature                 | Google AI Studio            | Google Vertex AI             |
| ----------------------- | --------------------------- | ---------------------------- |
| **Setup Complexity**    |  Simple (API key only)    |  Complex (Service account) |
| **Authentication**      | API key                     | Service account JSON         |
| **Free Tier**           | ✅ Generous free limits     | ❌ Pay-per-use only          |
| **Enterprise Features** | ❌ Limited                  | ✅ Full enterprise support   |
| **Model Selection**     |  Latest Gemini models     |  Broader model catalog     |
| **Best For**            | Prototyping, small projects | Production, enterprise apps  |

### Complete Google AI Studio Configuration

```bash
# Required: API key from Google AI Studio (choose one)
export GOOGLE_AI_API_KEY="AIza-your-google-ai-api-key"
# OR
export GOOGLE_GENERATIVE_AI_API_KEY="AIza-your-google-ai-api-key"

# Optional: Default model selection
export GOOGLE_AI_MODEL="gemini-2.5-pro"
```

### Rate Limits and Quotas

Google AI Studio includes generous free tier limits:

- **Free Tier**: 15 requests per minute, 1,500 requests per day
- **Paid Usage**: Higher limits available with billing enabled
- **Model-Specific**: Different models may have different rate limits

### Error Handling for Google AI Studio

```typescript

const neurolink = new NeuroLink();

try {
  const result = await neurolink.generate({
    input: { text: "Generate a creative story" },
    provider: "google-ai",
    temperature: 0.8,
    maxTokens: 500,
  });
  console.log(result.content);
} catch (error) {
  if (error.message.includes("API_KEY_INVALID")) {
    console.error(
      "Invalid Google AI API key. Check your GOOGLE_AI_API_KEY environment variable.",
    );
  } else if (error.message.includes("QUOTA_EXCEEDED")) {
    console.error("Rate limit exceeded. Wait before making more requests.");
  } else {
    console.error("Google AI Studio error:", error.message);
  }
}
```

### Security Considerations

- **API Key Security**: Treat API keys as sensitive credentials
- **Environment Variables**: Never commit API keys to version control
- **Rate Limiting**: Implement client-side rate limiting for production apps
- **Monitoring**: Monitor usage to avoid unexpected charges

## LiteLLM Configuration

LiteLLM provides access to 100+ models through a unified proxy server, allowing you to use any AI provider through a single interface.

### Prerequisites

1. Install LiteLLM:

```bash
pip install litellm
```

2. Start LiteLLM proxy server:

```bash
# Basic usage
litellm --port 4000

# With configuration file (recommended)
litellm --config litellm_config.yaml --port 4000
```

### Basic Setup

```bash
export LITELLM_BASE_URL="http://localhost:4000"
export LITELLM_API_KEY="sk-anything"  # Optional, any value works
```

### Optional Configuration

```bash
export LITELLM_MODEL="openai/gpt-4o-mini"  # Default model to use
```

### Supported Model Formats

LiteLLM uses the `provider/model` format:

```bash
# OpenAI models
openai/gpt-4o
openai/gpt-4o-mini
openai/gpt-4

# Anthropic models
anthropic/claude-3-5-sonnet
anthropic/claude-3-haiku

# Google models
google/gemini-2.0-flash
vertex_ai/gemini-pro

# Mistral models
mistral/mistral-large
mistral/mixtral-8x7b

# And many more...
```

### LiteLLM Configuration File (Optional)

Create `litellm_config.yaml` for advanced configuration:

```yaml
model_list:
  - model_name: openai/gpt-4o
    litellm_params:
      model: gpt-4o
      api_key: os.environ/OPENAI_API_KEY

  - model_name: anthropic/claude-3-5-sonnet
    litellm_params:
      model: claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY

  - model_name: google/gemini-2.0-flash
    litellm_params:
      model: gemini-2.0-flash
      api_key: os.environ/GOOGLE_AI_API_KEY
```

### Usage Example

```typescript

const neurolink = new NeuroLink();

// Use LiteLLM provider with specific model
const result = await neurolink.generate({
  input: { text: "Explain quantum computing" },
  provider: "litellm",
  model: "openai/gpt-4o",
  temperature: 0.7,
});

console.log(result.content);
```

### Advanced Features

- **Cost Tracking**: Built-in usage and cost monitoring
- **Load Balancing**: Automatic failover between providers
- **Rate Limiting**: Built-in rate limiting and retry logic
- **Caching**: Optional response caching for efficiency

### Production Considerations

- **Deployment**: Run LiteLLM proxy as a separate service
- **Security**: Configure authentication for production environments
- **Scaling**: Use Docker/Kubernetes for high-availability deployments
- **Monitoring**: Enable logging and metrics collection

## Hugging Face Configuration {#huggingface}

### Basic Setup

```bash
export HUGGINGFACE_API_KEY="hf_your_token_here"
```

### Optional Configuration

```bash
export HUGGINGFACE_MODEL="microsoft/DialoGPT-medium"  # Default model
```

### Model Selection Strategy

Hugging Face hosts 100,000+ models. Choose based on:

- **Task**: text-generation, conversational, code
- **Size**: Larger models = better quality but slower
- **License**: Check model licenses for commercial use

### Rate Limiting

- Free tier: Limited requests
- PRO tier: Higher limits
- Handle 503 errors (model loading) with retry logic

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Explain machine learning" },
  provider: "huggingface",
  model: "gpt2",
  temperature: 0.8,
  maxTokens: 200,
  timeout: "45s", // Optional: Override default 30s timeout
});
```

### Timeout Configuration

- **Default Timeout**: 30 seconds
- **Supported Formats**: Milliseconds (`30000`), human-readable (`'30s'`, `'1m'`, `'5m'`)
- **Environment Variable**: `HUGGINGFACE_TIMEOUT='45s'` (optional)
- **Note**: Model loading may take additional time on first request

### Popular Models

- `microsoft/DialoGPT-medium` (default) - Conversational AI
- `gpt2` - Classic GPT-2
- `distilgpt2` - Lightweight GPT-2
- `EleutherAI/gpt-neo-2.7B` - Large open model
- `bigscience/bloom-560m` - Multilingual model

### Getting Started with Hugging Face

1. **Create Account**: Visit [huggingface.co](https://huggingface.co)
2. **Generate Token**: Go to Settings → Access Tokens
3. **Create Token**: Click "New token" with "read" scope
4. **Set Environment**: Export token as `HUGGINGFACE_API_KEY`

## Ollama Configuration {#ollama}

### Local Installation Required

Ollama must be installed and running locally.

### Installation Steps

1. **macOS**:

   ```bash
   brew install ollama
   # or
   curl -fsSL https://ollama.ai/install.sh | sh
   ```

2. **Linux**:

   ```bash
   curl -fsSL https://ollama.ai/install.sh | sh
   ```

3. **Windows**:
   Download from [ollama.ai](https://ollama.ai)

### Model Management

```bash
# List models
ollama list

# Pull new model
ollama pull llama2

# Remove model
ollama rm llama2
```

### Privacy Benefits

- **100% Local**: No data leaves your machine
- **No API Keys**: No authentication required
- **Offline Capable**: Works without internet

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Write a poem about privacy" },
  provider: "ollama",
  model: "llama2",
  temperature: 0.7,
  maxTokens: 300,
  timeout: "10m", // Optional: Override default 5m timeout
});
```

### Timeout Configuration

- **Default Timeout**: 5 minutes (longer for local model processing)
- **Supported Formats**: Milliseconds (`300000`), human-readable (`'5m'`, `'10m'`, `'30m'`)
- **Environment Variable**: `OLLAMA_TIMEOUT='10m'` (optional)
- **Note**: Local models may need longer timeouts for complex prompts

### Popular Models

- `llama2` (default) - Meta's Llama 2
- `codellama` - Code-specialized Llama
- `mistral` - Mistral 7B
- `vicuna` - Fine-tuned Llama
- `phi` - Microsoft's small model

### Environment Variables

```bash
# Optional: Custom Ollama server URL
export OLLAMA_BASE_URL="http://localhost:11434"

# Optional: Default model
export OLLAMA_MODEL="llama2"
```

### Performance Optimization

```bash
# Set memory limit
OLLAMA_MAX_MEMORY=8GB ollama serve

# Use specific GPU
OLLAMA_CUDA_DEVICE=0 ollama serve
```

## OpenRouter Configuration {#openrouter}

OpenRouter provides access to 300+ AI models from 60+ providers through a single unified API with automatic failover and cost optimization.

### Basic Setup

```bash
export OPENROUTER_API_KEY="sk-or-v1-your-api-key"
```

### Optional Configuration

```bash
# Attribution for OpenRouter dashboard
export OPENROUTER_REFERER="https://yourapp.com"
export OPENROUTER_APP_NAME="Your App Name"

# Default model
export OPENROUTER_MODEL="anthropic/claude-3-5-sonnet"
```

### Supported Models

OpenRouter supports 300+ models including:

- `anthropic/claude-3-5-sonnet` (default) - Best overall quality
- `openai/gpt-4o` - Excellent code generation
- `google/gemini-2.0-flash` - Fast and cost-effective
- `meta-llama/llama-3.1-70b-instruct` - Best open source

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Explain quantum computing" },
  provider: "openrouter",
  model: "anthropic/claude-3-5-sonnet",
  temperature: 0.7,
  maxTokens: 500,
});
```

### Complete Guide

For comprehensive OpenRouter setup including model selection, cost optimization, and best practices, see the [OpenRouter Provider Guide](/docs/getting-started/providers/openrouter).

## Mistral AI Configuration {#mistral}

### Basic Setup

```bash
export MISTRAL_API_KEY="your_mistral_api_key"
```

### European Compliance

- GDPR compliant
- Data processed in Europe
- No training on user data

### Model Selection

- **mistral-tiny**: Fast responses, basic tasks
- **mistral-small**: Balanced choice (default)
- **mistral-medium**: Complex reasoning
- **mistral-large**: Maximum capability

### Cost Optimization

Mistral offers competitive pricing:

- Tiny: $0.14 / 1M tokens
- Small: $0.6 / 1M tokens
- Medium: $2.5 / 1M tokens
- Large: $8 / 1M tokens

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Translate to French: Hello world" },
  provider: "mistral",
  model: "mistral-small",
  temperature: 0.3,
  maxTokens: 100,
  timeout: "30s", // Optional: Override default 30s timeout
});
```

### Timeout Configuration

- **Default Timeout**: 30 seconds
- **Supported Formats**: Milliseconds (`30000`), human-readable (`'30s'`, `'1m'`, `'5m'`)
- **Environment Variable**: `MISTRAL_TIMEOUT='45s'` (optional)

### Getting Started with Mistral AI

1. **Create Account**: Visit [mistral.ai](https://mistral.ai)
2. **Get API Key**: Navigate to API Keys section
3. **Generate Key**: Create new API key
4. **Add Billing**: Set up payment method

### Environment Variables

```bash
# Required: API key
export MISTRAL_API_KEY="your_mistral_api_key"

# Optional: Default model
export MISTRAL_MODEL="mistral-small"

# Optional: Custom endpoint
export MISTRAL_ENDPOINT="https://api.mistral.ai"
```

### Multilingual Support

Mistral models excel at multilingual tasks:

- English, French, Spanish, German, Italian
- Code generation in multiple programming languages
- Translation between supported languages

## Anthropic Configuration {#anthropic}

Direct access to Anthropic's Claude models without going through AWS Bedrock.

### Basic Setup

```bash
export ANTHROPIC_API_KEY="sk-ant-api03-your-key-here"
```

### Optional Configuration

```bash
export ANTHROPIC_MODEL="claude-3-5-sonnet-20241022"  # Default model
```

### Supported Models

- `claude-3-7-sonnet-20250219` - Latest Claude 3.7 Sonnet
- `claude-3-5-sonnet-20241022` (default) - Claude 3.5 Sonnet v2
- `claude-3-opus-20240229` - Most capable model
- `claude-3-haiku-20240307` - Fastest, most cost-effective

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Explain quantum computing" },
  provider: "anthropic",
  model: "claude-3-5-sonnet-20241022",
  temperature: 0.7,
  maxTokens: 1000,
  timeout: "30s",
});
```

### Timeout Configuration

- **Default Timeout**: 30 seconds
- **Supported Formats**: Milliseconds (`30000`), human-readable (`'30s'`, `'1m'`, `'5m'`)
- **Environment Variable**: `ANTHROPIC_TIMEOUT='45s'` (optional)

### Getting Started with Anthropic

1. **Create Account**: Visit [anthropic.com](https://www.anthropic.com)
2. **Get API Key**: Navigate to API Keys section
3. **Generate Key**: Create new API key
4. **Set Environment**: Export key as `ANTHROPIC_API_KEY`

## Azure OpenAI Configuration {#azure}

Azure OpenAI provides enterprise-grade access to OpenAI models through Microsoft Azure.

### Basic Setup

```bash
export AZURE_OPENAI_API_KEY="your-azure-openai-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_DEPLOYMENT_ID="your-deployment-name"
```

### Optional Configuration

```bash
export AZURE_OPENAI_API_VERSION="2024-02-15-preview"  # API version
```

### Supported Models

Azure OpenAI supports deployment of:

- `gpt-4o` - Latest multimodal model
- `gpt-4` - Advanced reasoning
- `gpt-4-turbo` - Optimized performance
- `gpt-3.5-turbo` - Cost-effective

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Explain machine learning" },
  provider: "azure",
  temperature: 0.7,
  maxTokens: 500,
  timeout: "30s",
});
```

### Timeout Configuration

- **Default Timeout**: 30 seconds
- **Supported Formats**: Milliseconds (`30000`), human-readable (`'30s'`, `'1m'`, `'5m'`)
- **Environment Variable**: `AZURE_TIMEOUT='45s'` (optional)

### Azure Setup Requirements

1. **Azure Subscription**: Active Azure subscription
2. **Azure OpenAI Resource**: Create Azure OpenAI resource in Azure Portal
3. **Model Deployment**: Deploy a model to get deployment ID
4. **API Key**: Get API key from resource's Keys and Endpoint section

### Environment Variables Reference

| Variable                     | Required | Description                   |
| ---------------------------- | -------- | ----------------------------- |
| `AZURE_OPENAI_API_KEY`       | ✅       | Azure OpenAI API key          |
| `AZURE_OPENAI_ENDPOINT`      | ✅       | Resource endpoint URL         |
| `AZURE_OPENAI_DEPLOYMENT_ID` | ✅       | Model deployment name         |
| `AZURE_OPENAI_API_VERSION`   | ❌       | API version (default: latest) |

## OpenAI Compatible Configuration {#openai-compatible}

Connect to any OpenAI-compatible API endpoint (LocalAI, vLLM, Ollama with OpenAI compatibility, etc.)

### Basic Setup

```bash
export OPENAI_COMPATIBLE_BASE_URL="http://localhost:8080/v1"
export OPENAI_COMPATIBLE_API_KEY="optional-api-key"  # Some servers don't require this
```

### Optional Configuration

```bash
export OPENAI_COMPATIBLE_MODEL="your-model-name"
```

### Usage Example

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Hello from custom endpoint" },
  provider: "openai-compatible",
  model: "your-model",
  temperature: 0.7,
  maxTokens: 500,
});
```

### Compatible Servers

This works with any server implementing the OpenAI API:

- **LocalAI** - Local AI server
- **vLLM** - High-performance inference server
- **Ollama** (with `OLLAMA_OPENAI_COMPAT=1`)
- **Text Generation WebUI**
- **Custom inference servers**

### Environment Variables

```bash
# Required: Base URL of your OpenAI-compatible server
export OPENAI_COMPATIBLE_BASE_URL="http://localhost:8080/v1"

# Optional: API key (if your server requires one)
export OPENAI_COMPATIBLE_API_KEY="your-api-key-if-needed"

# Optional: Default model name
export OPENAI_COMPATIBLE_MODEL="your-model-name"
```

## Redis Configuration {#redis}

Redis integration for distributed conversation memory and session state.

### Basic Setup

```bash
export REDIS_URL="redis://localhost:6379"
```

### Optional Configuration

```bash
export REDIS_PASSWORD="your-redis-password"  # If authentication enabled
export REDIS_DB="0"  # Database number (default: 0)
export REDIS_KEY_PREFIX="neurolink:"  # Key prefix for namespacing
```

### Advanced Configuration

```bash
# Connection settings
export REDIS_HOST="localhost"
export REDIS_PORT="6379"
export REDIS_TLS="false"  # Set to "true" for TLS connections

# Pool settings
export REDIS_MAX_RETRIES="3"
export REDIS_RETRY_DELAY="1000"  # milliseconds
export REDIS_CONNECTION_TIMEOUT="5000"  # milliseconds
```

### Usage Example

```typescript

const neurolink = new NeuroLink({
  memory: {
    type: "redis",
    url: process.env.REDIS_URL,
  },
});

const result = await neurolink.generate({
  input: { text: "Remember this conversation" },
  sessionId: "user-123", // Session stored in Redis
});
```

### Redis Cloud Setup

For managed Redis (Redis Cloud, AWS ElastiCache, etc.):

```bash
export REDIS_URL="rediss://username:password@your-redis-host:6380"
```

### Docker Redis (Development)

```bash
# Start Redis in Docker
docker run -d -p 6379:6379 redis:latest

# Set environment
export REDIS_URL="redis://localhost:6379"
```

### Features Enabled by Redis

- **Distributed Memory**: Share conversation state across instances
- **Session Persistence**: Conversations survive application restarts
- **Export/Import**: Export full session history as JSON
- **Multi-tenant**: Isolate conversations by session ID
- **Scalability**: Handle thousands of concurrent conversations

### Environment Variables Reference

| Variable           | Required        | Default    | Description               |
| ------------------ | --------------- | ---------- | ------------------------- |
| `REDIS_URL`        | Recommended     | -          | Full Redis connection URL |
| `REDIS_HOST`       | Alternative     | localhost  | Redis host                |
| `REDIS_PORT`       | Alternative     | 6379       | Redis port                |
| `REDIS_PASSWORD`   | If auth enabled | -          | Redis password            |
| `REDIS_DB`         | ❌              | 0          | Database number           |
| `REDIS_KEY_PREFIX` | ❌              | neurolink: | Key prefix                |

## Environment File Template

Create a `.env` file in your project root:

```bash
# NeuroLink Environment Configuration

# OpenAI
OPENAI_API_KEY=sk-your-openai-key-here
OPENAI_MODEL=gpt-4o

# Amazon Bedrock
AWS_ACCESS_KEY_ID=your-aws-access-key
AWS_SECRET_ACCESS_KEY=your-aws-secret-key
AWS_REGION=us-east-2
AWS_SESSION_TOKEN=your-session-token  # Optional: for temporary credentials
BEDROCK_MODEL=arn:aws:bedrock:us-east-2::inference-profile/us.anthropic.claude-3-7-sonnet-20250219-v1:0

# Google Vertex AI (choose one method)
# Method 1: File path
GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account.json

# Method 2: JSON string (uncomment to use)
# GOOGLE_SERVICE_ACCOUNT_KEY={"type":"service_account","project_id":"your-project",...}

# Method 3: Individual variables (uncomment to use)
# GOOGLE_AUTH_CLIENT_EMAIL=service-account@your-project.iam.gserviceaccount.com
# GOOGLE_AUTH_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\nYOUR_PRIVATE_KEY_HERE\n-----END PRIVATE KEY-----"

# Required for all Google Vertex AI methods
GOOGLE_VERTEX_PROJECT=your-gcp-project-id
GOOGLE_VERTEX_LOCATION=us-east5
VERTEX_MODEL_ID=claude-sonnet-4@20250514

# Alternative: Gemini 3 models with extended thinking support
# VERTEX_MODEL_ID=gemini-3-flash-preview
# VERTEX_MODEL_ID=gemini-3-pro-preview

# Google AI Studio
GOOGLE_AI_API_KEY=AIza-your-googleAiStudio-key
GOOGLE_AI_MODEL=gemini-2.5-pro

# Anthropic
ANTHROPIC_API_KEY=sk-ant-api03-your-key

# Azure OpenAI
AZURE_OPENAI_API_KEY=your-azure-key
AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
AZURE_OPENAI_DEPLOYMENT_ID=your-deployment-name

# Hugging Face
HUGGINGFACE_API_KEY=hf_your_token_here
HUGGINGFACE_MODEL=microsoft/DialoGPT-medium  # Optional

# Ollama (Local AI)
OLLAMA_BASE_URL=http://localhost:11434  # Optional
OLLAMA_MODEL=llama2  # Optional

# Mistral AI
MISTRAL_API_KEY=your_mistral_api_key
MISTRAL_MODEL=mistral-small  # Optional

# Application Settings
DEFAULT_PROVIDER=auto
NEUROLINK_DEBUG=false
```

## Provider Priority and Fallback

### Automatic Provider Selection

NeuroLink automatically selects the best available provider when no provider is specified:

```typescript

const neurolink = new NeuroLink();

// Automatically selects best available provider
const result = await neurolink.generate({
  input: { text: "Hello, world!" },
});
```

### Provider Priority Order

The default priority order (most reliable first):

1. **OpenAI** - Most reliable, fastest setup
2. **Anthropic** - High quality, simple setup
3. **Google AI Studio** - Free tier, easy setup
4. **Azure OpenAI** - Enterprise reliable
5. **Google Vertex AI** - Good performance, multiple auth methods
6. **Mistral AI** - European compliance, competitive pricing
7. **Hugging Face** - Open source variety
8. **Amazon Bedrock** - High quality, requires careful setup
9. **Ollama** - Local only, no fallback

### Specifying Provider and Model

```typescript

const neurolink = new NeuroLink();

// Explicitly specify provider and model
const result = await neurolink.generate({
  input: { text: "Hello" },
  provider: "bedrock",
  model: "anthropic.claude-3-sonnet-20240229-v1:0",
});
```

### Environment-Based Selection

```typescript

const neurolink = new NeuroLink();

// Different providers for different environments
const result = await neurolink.generate({
  input: { text: "Hello" },
  provider: process.env.NODE_ENV === "production" ? "bedrock" : "openai",
  model: process.env.NODE_ENV === "production" ? undefined : "gpt-4o-mini",
});
```

## Testing Provider Configuration

### CLI Status Check

```bash
# Test all providers
npx @juspay/neurolink status --verbose

# Expected output:
#  Checking AI provider status...
# ✅ openai: ✅ Working (234ms)
# ❌ bedrock: ❌ Invalid credentials - The security token included in the request is expired
# ⚪ vertex: ⚪ Not configured - Missing environment variables
```

### Programmatic Testing

```typescript

async function testProviders() {
  const providers = [
    "openai",
    "bedrock",
    "vertex",
    "anthropic",
    "azure",
    "google-ai",
    "huggingface",
    "ollama",
    "mistral",
  ];

  const neurolink = new NeuroLink();

  for (const providerName of providers) {
    try {
      const start = Date.now();

      const result = await neurolink.generate({
        input: { text: "Test" },
        provider: providerName,
        maxTokens: 10,
      });

      console.log(`✅ ${providerName}: Working (${Date.now() - start}ms)`);
    } catch (error) {
      console.log(`❌ ${providerName}: ${error.message}`);
    }
  }
}

testProviders();
```

## Common Configuration Issues

### OpenAI Issues

```
Error: Cannot find API key for OpenAI provider
```

**Solution**: Set `OPENAI_API_KEY` environment variable

### Bedrock Issues

```
Your account is not authorized to invoke this API operation
```

**Solutions**:

1. Use full inference profile ARN (not simple model name)
2. Check AWS account has Bedrock access
3. Verify IAM permissions include `bedrock:InvokeModel`
4. Ensure model access is enabled in your AWS region

### Vertex AI Issues

```
Cannot find package '@google-cloud/vertexai'
```

**Solution**: Install peer dependency: `npm install @google-cloud/vertexai`

```
Authentication failed
```

**Solutions**:

1. Verify service account JSON is valid
2. Check project ID is correct
3. Ensure Vertex AI API is enabled
4. Verify service account has proper permissions

## Security Best Practices

### Environment Variables

- Never commit API keys to version control
- Use different keys for development/staging/production
- Rotate keys regularly
- Use minimal permissions for service accounts

### AWS Security

- Use IAM roles instead of access keys when possible
- Enable CloudTrail for audit logging
- Use VPC endpoints for additional security
- Implement resource-based policies

### Google Cloud Security

- Use service account keys with minimal permissions
- Enable audit logging
- Use VPC Service Controls for additional isolation
- Rotate service account keys regularly

### General Security

- Use environment-specific configurations
- Implement rate limiting in your applications
- Monitor usage and costs
- Use HTTPS for all API communications

---

[← Back to Main README](/docs/) | [Next: API Reference →](/docs/sdk/api-reference)

---

## Google Vertex AI Provider Guide

<!-- Source: getting-started/providers/google-vertex.md -->

# Google Vertex AI Provider Guide

**Enterprise AI on Google Cloud with Claude, Gemini, and custom models**

## Quick Start

### 1. Create GCP Project

```bash
# Create project
gcloud projects create my-ai-project --name="My AI Project"

# Set project
gcloud config set project my-ai-project

# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com
```

### 2. Setup Authentication

**Option A: Service Account (Production)**

```bash
# Create service account
gcloud iam service-accounts create vertex-ai-sa \
  --display-name="Vertex AI Service Account"

# Grant Vertex AI User role
gcloud projects add-iam-policy-binding my-ai-project \
  --member="serviceAccount:vertex-ai-sa@my-ai-project.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# Create key file
gcloud iam service-accounts keys create vertex-key.json \
  --iam-account=vertex-ai-sa@my-ai-project.iam.gserviceaccount.com

# Set environment variable
export GOOGLE_APPLICATION_CREDENTIALS="$(pwd)/vertex-key.json"
```

**Option B: Application Default Credentials (Development)**

```bash
# Login with your Google account
gcloud auth application-default login
```

**Option C: Workload Identity (GKE)**

```bash
# Bind Kubernetes service account to GCP service account
gcloud iam service-accounts add-iam-policy-binding \
  vertex-ai-sa@my-ai-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-ai-project.svc.id.goog[default/my-ksa]"
```

### 3. Configure NeuroLink

```bash
# .env
GOOGLE_VERTEX_PROJECT_ID=my-ai-project
GOOGLE_VERTEX_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=/path/to/vertex-key.json
```

```typescript

const ai = new NeuroLink({
  providers: [
    {
      name: "vertex",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: process.env.GOOGLE_VERTEX_LOCATION,
        credentials: process.env.GOOGLE_APPLICATION_CREDENTIALS,
      },
    },
  ],
});

const result = await ai.generate({
  input: { text: "Hello from Vertex AI!" },
  provider: "vertex",
  model: "gemini-2.0-flash",
});

console.log(result.content);
```

---

## Regional Deployment

### Available Regions

| Region                   | Location       | Models Available | Latency              |
| ------------------------ | -------------- | ---------------- | -------------------- |
| **us-central1**          | Iowa, USA      | All models       | Low (US)             |
| **us-east1**             | South Carolina | All models       | Low (US East)        |
| **us-west1**             | Oregon, USA    | All models       | Low (US West)        |
| **europe-west1**         | Belgium        | All models       | Low (EU)             |
| **europe-west2**         | London, UK     | All models       | Low (UK)             |
| **europe-west4**         | Netherlands    | All models       | Low (EU)             |
| **asia-northeast1**      | Tokyo, Japan   | All models       | Low (Asia)           |
| **asia-southeast1**      | Singapore      | All models       | Low (Southeast Asia) |
| **asia-south1**          | Mumbai, India  | All models       | Low (India)          |
| **australia-southeast1** | Sydney         | All models       | Low (Australia)      |

### Multi-Region Setup

```typescript
const ai = new NeuroLink({
  providers: [
    // US deployment
    {
      name: "vertex-us",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "us-central1",
        credentials: process.env.GOOGLE_APPLICATION_CREDENTIALS,
      },
      region: "us",
      priority: 1,
      condition: (req) => req.userRegion === "us",
    },

    // EU deployment
    {
      name: "vertex-eu",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "europe-west1",
        credentials: process.env.GOOGLE_APPLICATION_CREDENTIALS,
      },
      region: "eu",
      priority: 1,
      condition: (req) => req.userRegion === "eu",
    },

    // Asia deployment
    {
      name: "vertex-asia",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "asia-southeast1",
        credentials: process.env.GOOGLE_APPLICATION_CREDENTIALS,
      },
      region: "asia",
      priority: 1,
      condition: (req) => req.userRegion === "asia",
    },
  ],
  failoverConfig: { enabled: true },
});
```

---

## Available Models

### Gemini Models (Google)

| Model                      | Description               | Context    | Best For                 | Pricing                          |
| -------------------------- | ------------------------- | ---------- | ------------------------ | -------------------------------- |
| **gemini-3-pro-preview**   | Latest, extended thinking | 1M tokens  | Deep reasoning, analysis | Preview                          |
| **gemini-3-flash-preview** | Fast with thinking        | 1M tokens  | Balanced speed/quality   | Preview                          |
| **gemini-2.0-flash**       | Fast model                | 1M tokens  | Speed, real-time         | $0.075/1M input, $0.30/1M output |
| **gemini-1.5-pro**         | Most capable              | 2M tokens  | Complex reasoning        | $1.25/1M in                      |
| **gemini-1.5-flash**       | Balanced                  | 1M tokens  | General tasks            | $0.075/1M in                     |
| **gemini-1.0-pro**         | Stable version            | 32K tokens | Production               | $0.50/1M in                      |

> **Note:** Gemini 3 models (`gemini-3-pro-preview`, `gemini-3-flash-preview`) are preview models and may have stricter rate limits than production models. Monitor your usage and expect potential API changes during the preview period.

### Claude Models (Anthropic via Vertex)

| Model                 | Description      | Context     | Best For        | Pricing     |
| --------------------- | ---------------- | ----------- | --------------- | ----------- |
| **claude-3-5-sonnet** | Latest Anthropic | 200K tokens | Complex tasks   | $3/1M in    |
| **claude-3-opus**     | Most capable     | 200K tokens | Highest quality | $15/1M in   |
| **claude-3-haiku**    | Fast, affordable | 200K tokens | High-volume     | $0.25/1M in |

### Model Selection Examples

```typescript
// Use Gemini for speed
const fast = await ai.generate({
  input: { text: "Quick query" },
  provider: "vertex",
  model: "gemini-2.0-flash",
});

// Use Gemini Pro for complex reasoning
const complex = await ai.generate({
  input: { text: "Detailed analysis..." },
  provider: "vertex",
  model: "gemini-1.5-pro",
});

// Use Claude for highest quality
const premium = await ai.generate({
  input: { text: "Critical task..." },
  provider: "vertex",
  model: "claude-3-5-sonnet",
});
```

---

## Extended Thinking (Gemini 3)

Gemini 3 models support **Extended Thinking**, which enables the model to perform deeper reasoning before generating responses. This is ideal for complex analysis, multi-step problem solving, and tasks requiring careful deliberation.

### Thinking Levels

| Level       | Description                        | Use Case                           | Latency Impact |
| ----------- | ---------------------------------- | ---------------------------------- | -------------- |
| **minimal** | Near-zero thinking (Flash only)    | Simple queries requiring speed     | Minimal        |
| **low**     | Minimal thinking, faster responses | Simple queries, quick answers      | Low            |
| **medium**  | Balanced thinking and speed        | General tasks, moderate complexity | Moderate       |
| **high**    | Deep reasoning, thorough analysis  | Complex problems, critical tasks   | Higher         |

### Basic Usage

```typescript

const ai = new NeuroLink({
  providers: [
    {
      name: "vertex",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: process.env.GOOGLE_VERTEX_LOCATION,
      },
    },
  ],
});

// Enable extended thinking with Gemini 3
const result = await ai.generate({
  input: {
    text: "Analyze the trade-offs between microservices and monolithic architecture for a startup with 5 engineers.",
  },
  provider: "vertex",
  model: "gemini-3-pro-preview",
  thinkingLevel: "high", // 'minimal' | 'low' | 'medium' | 'high'
});

console.log(result.content);
```

### Thinking Level Examples

```typescript
// Low thinking - Quick responses for simple queries
const quick = await ai.generate({
  input: { text: "What is the capital of France?" },
  provider: "vertex",
  model: "gemini-3-flash-preview",
  thinkingLevel: "low",
});

// Medium thinking - Balanced for everyday tasks
const balanced = await ai.generate({
  input: { text: "Summarize the key points of this article..." },
  provider: "vertex",
  model: "gemini-3-flash-preview",
  thinkingLevel: "medium",
});

// High thinking - Deep analysis for complex problems
const deep = await ai.generate({
  input: {
    text: `Given the following codebase architecture, identify potential
           security vulnerabilities and suggest remediation strategies...`,
  },
  provider: "vertex",
  model: "gemini-3-pro-preview",
  thinkingLevel: "high",
});
```

### Streaming with Extended Thinking

```typescript
// Stream responses with thinking enabled
const stream = await ai.stream({
  input: {
    text: "Design a distributed caching strategy for a high-traffic e-commerce platform.",
  },
  provider: "vertex",
  model: "gemini-3-pro-preview",
  thinkingLevel: "high",
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
```

### Best Practices for Extended Thinking

1. **Match thinking level to task complexity**: Use `low` for simple queries, `high` for complex analysis
2. **Consider latency requirements**: Higher thinking levels increase response time
3. **Use with complex prompts**: Extended thinking shines with multi-step reasoning tasks
4. **Monitor token usage**: Thinking processes consume additional tokens

> **Important:** Extended Thinking is only available on Gemini 3 models (`gemini-3-pro-preview`, `gemini-3-flash-preview`). Using `thinkingLevel` with other models will be ignored.

---

## IAM & Permissions

### Required IAM Roles

```bash
# Minimum roles for Vertex AI
roles/aiplatform.user           # Use Vertex AI services
roles/serviceusage.serviceUsageConsumer  # Use GCP APIs

# Additional roles for specific features
roles/aiplatform.admin          # Manage models and endpoints
roles/storage.objectViewer      # Read from Cloud Storage
roles/bigquery.dataViewer       # Read from BigQuery
```

### Service Account Setup

```bash
# Create service account with minimal permissions
gcloud iam service-accounts create vertex-readonly \
  --display-name="Vertex AI Read-Only"

# Grant only necessary permissions
gcloud projects add-iam-policy-binding my-ai-project \
  --member="serviceAccount:vertex-readonly@my-ai-project.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# For production, use custom role with least privilege
gcloud iam roles create vertexAIInference \
  --project=my-ai-project \
  --title="Vertex AI Inference Only" \
  --permissions=aiplatform.endpoints.predict,aiplatform.endpoints.get
```

### Workload Identity for GKE

```yaml
# kubernetes-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vertex-ai-sa
  namespace: default
  annotations:
    iam.gke.io/gcp-service-account: vertex-ai-sa@my-ai-project.iam.gserviceaccount.com
```

```bash
# Bind Kubernetes SA to GCP SA
gcloud iam service-accounts add-iam-policy-binding \
  vertex-ai-sa@my-ai-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-ai-project.svc.id.goog[default/vertex-ai-sa]"
```

---

## VPC & Private Connectivity

### Private Service Connect

```bash
# Create Private Service Connect endpoint
gcloud compute addresses create vertex-psc-ip \
  --region=us-central1 \
  --subnet=my-subnet

gcloud compute forwarding-rules create vertex-psc-endpoint \
  --region=us-central1 \
  --network=my-vpc \
  --address=vertex-psc-ip \
  --target-service-attachment=projects/my-project/regions/us-central1/serviceAttachments/vertex-ai
```

### VPC Service Controls

```bash
# Create access policy
gcloud access-context-manager policies create \
  --title="Vertex AI Access Policy"

# Create perimeter
gcloud access-context-manager perimeters create vertex_perimeter \
  --title="Vertex AI Perimeter" \
  --resources=projects/my-ai-project \
  --restricted-services=aiplatform.googleapis.com \
  --policy=POLICY_ID
```

---

## Custom Model Deployment

### Deploy Custom Model

```python
# Python example for custom model deployment
from google.cloud import aiplatform

aiplatform.init(project='my-ai-project', location='us-central1')

# Upload model
model = aiplatform.Model.upload(
    display_name='my-custom-model',
    artifact_uri='gs://my-bucket/model/',
    serving_container_image_uri='gcr.io/my-project/serving-image:latest'
)

# Create endpoint
endpoint = aiplatform.Endpoint.create(
    display_name='my-model-endpoint'
)

# Deploy model to endpoint
model.deploy(
    endpoint=endpoint,
    machine_type='n1-standard-4',
    min_replica_count=1,
    max_replica_count=3
)
```

### Use Custom Endpoint with NeuroLink

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "vertex-custom",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "us-central1",
        credentials: process.env.GOOGLE_APPLICATION_CREDENTIALS,
        endpoint: "projects/my-project/locations/us-central1/endpoints/12345",
      },
    },
  ],
});

const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "vertex-custom",
});
```

---

## Monitoring & Logging

### Cloud Logging Integration

```typescript

const logging = new Logging({
  projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
});

const log = logging.log("vertex-ai-requests");

const ai = new NeuroLink({
  providers: [
    {
      name: "vertex",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "us-central1",
      },
    },
  ],
  onSuccess: async (result) => {
    // Log to Cloud Logging
    const metadata = {
      resource: { type: "global" },
      severity: "INFO",
    };

    const entry = log.entry(metadata, {
      event: "ai_generation_success",
      provider: result.provider,
      model: result.model,
      tokens: result.usage.totalTokens,
      cost: result.cost,
      latency: result.latency,
    });

    await log.write(entry);
  },
});
```

### Cloud Monitoring Metrics

```typescript

const client = new MetricServiceClient();

async function writeMetric(tokens: number, cost: number) {
  const projectId = process.env.GOOGLE_VERTEX_PROJECT_ID;
  const projectPath = client.projectPath(projectId);

  const dataPoint = {
    interval: {
      endTime: { seconds: Date.now() / 1000 },
    },
    value: { doubleValue: tokens },
  };

  const timeSeriesData = {
    metric: {
      type: "custom.googleapis.com/vertex_ai/tokens_used",
      labels: { model: "gemini-1.5-pro" },
    },
    resource: {
      type: "global",
      labels: { project_id: projectId },
    },
    points: [dataPoint],
  };

  const request = {
    name: projectPath,
    timeSeries: [timeSeriesData],
  };

  await client.createTimeSeries(request);
}
```

---

## Cost Management

### Pricing Overview

```
Gemini Pricing (per 1M tokens):
- gemini-2.0-flash:  $0.075 input, $0.30 output
- gemini-1.5-pro:    $1.25 input, $5.00 output
- gemini-1.5-flash:  $0.075 input, $0.30 output

Claude on Vertex (per 1M tokens):
- claude-3-5-sonnet: $3 input, $15 output
- claude-3-opus:     $15 input, $75 output
- claude-3-haiku:    $0.25 input, $1.25 output

Custom Model: Based on compute (n1-standard-4: ~$0.19/hour)
```

### Budget Alerts

```bash
# Set budget alert
gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="Vertex AI Budget" \
  --budget-amount=1000 \
  --threshold-rule=percent=50 \
  --threshold-rule=percent=90 \
  --threshold-rule=percent=100
```

### Cost Tracking

```typescript
class VertexCostTracker {
  private monthlyCost = 0;

  calculateCost(
    model: string,
    inputTokens: number,
    outputTokens: number,
  ): number {
    const pricing: Record = {
      "gemini-2.0-flash": { input: 0.075, output: 0.3 },
      "gemini-1.5-pro": { input: 1.25, output: 5.0 },
      "claude-3-5-sonnet": { input: 3.0, output: 15.0 },
    };

    const rates = pricing[model] || pricing["gemini-2.0-flash"];
    const cost =
      (inputTokens / 1_000_000) * rates.input +
      (outputTokens / 1_000_000) * rates.output;

    this.monthlyCost += cost;
    return cost;
  }

  getMonthlyTotal(): number {
    return this.monthlyCost;
  }
}

const costTracker = new VertexCostTracker();

const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "vertex",
  model: "gemini-1.5-pro",
  enableAnalytics: true,
});

const cost = costTracker.calculateCost(
  result.model,
  result.usage.promptTokens,
  result.usage.completionTokens,
);

console.log(`Request cost: $${cost.toFixed(4)}`);
console.log(`Monthly total: $${costTracker.getMonthlyTotal().toFixed(2)}`);
```

---

## Production Patterns

### Pattern 1: Multi-Model Strategy

```typescript
const ai = new NeuroLink({
  providers: [
    // Fast, cheap for simple queries
    {
      name: "vertex-flash",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "us-central1",
      },
      model: "gemini-2.0-flash",
      condition: (req) => req.complexity === "low",
    },

    // Balanced for medium complexity
    {
      name: "vertex-pro",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "us-central1",
      },
      model: "gemini-1.5-pro",
      condition: (req) => req.complexity === "medium",
    },

    // Premium for critical tasks
    {
      name: "vertex-claude",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "us-central1",
      },
      model: "claude-3-5-sonnet",
      condition: (req) => req.complexity === "high",
    },
  ],
});
```

### Pattern 2: A/B Testing

```typescript
// Deploy two model versions for A/B testing
const ai = new NeuroLink({
  providers: [
    {
      name: "vertex-model-a",
      config: {
        /*...*/
      },
      model: "gemini-1.5-pro",
      weight: 1, // 50% traffic
      tags: ["experiment-a"],
    },
    {
      name: "vertex-model-b",
      config: {
        /*...*/
      },
      model: "claude-3-5-sonnet",
      weight: 1, // 50% traffic
      tags: ["experiment-b"],
    },
  ],
  loadBalancing: "weighted-round-robin",
  onSuccess: (result) => {
    // Track A/B test metrics
    analytics.track({
      experiment: result.tags[0],
      model: result.model,
      latency: result.latency,
      quality: result.quality,
    });
  },
});
```

---

## Best Practices

### 1. ✅ Use Service Accounts with Minimal Permissions

```bash
# ✅ Good: Least privilege
gcloud iam roles create vertexInferenceOnly \
  --permissions=aiplatform.endpoints.predict
```

### 2. ✅ Enable Private Service Connect

```bash
# ✅ Good: Private connectivity
gcloud compute forwarding-rules create vertex-psc
```

### 3. ✅ Monitor Costs

```typescript
// ✅ Good: Track every request
const cost = costTracker.calculateCost(model, inputTokens, outputTokens);
```

### 4. ✅ Use Multi-Region for HA

```typescript
// ✅ Good: Regional failover
providers: [
  { name: "vertex-us", region: "us-central1", priority: 1 },
  { name: "vertex-eu", region: "europe-west1", priority: 2 },
];
```

### 5. ✅ Log to Cloud Logging

```typescript
// ✅ Good: Centralized logging
await log.write(entry);
```

---

## Troubleshooting

### Common Issues

#### 1. "Permission Denied"

**Problem**: Missing IAM permissions.

**Solution**:

```bash
# Grant required role
gcloud projects add-iam-policy-binding my-ai-project \
  --member="serviceAccount:vertex-ai-sa@my-ai-project.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
```

#### 2. "Quota Exceeded"

**Problem**: Exceeded API quota.

**Solution**:

```bash
# Request quota increase
gcloud services enable serviceusage.googleapis.com
gcloud alpha services quota update \
  --service=aiplatform.googleapis.com \
  --consumer=projects/my-ai-project \
  --metric=aiplatform.googleapis.com/online_prediction_requests \
  --value=10000
```

#### 3. "Model Not Found"

**Problem**: Model not available in region.

**Solution**:

```bash
# Check available models in region
gcloud ai models list --region=us-central1

# Use different region
GOOGLE_VERTEX_LOCATION=europe-west1
```

---

## Known Limitations

### Tools + JSON Schema Cannot Be Used Simultaneously (Gemini Models)

**Google API Limitation:** All Google Gemini models on Vertex AI (including Gemini 3 preview models) cannot combine function calling (tools) with structured output (JSON schema) in the same request. This is a fundamental Google API constraint.

**Affected models:** All Gemini models including `gemini-3-pro-preview`, `gemini-3-flash-preview`, `gemini-2.0-flash`, `gemini-1.5-pro`, `gemini-1.5-flash`

**Note:** This limitation ONLY affects Gemini models. Anthropic Claude models via Vertex AI do NOT have this limitation.

**Error:**

```
Function calling with a response mime type: 'application/json' is unsupported
```

**Solution for Gemini models:**

```typescript
// ✅ Correct approach with Gemini (including Gemini 3)
const result = await neurolink.generate({
  input: { text: "Analyze this data" },
  schema: MyZodSchema,
  output: { format: "json" },
  provider: "vertex",
  model: "gemini-3-pro-preview", // or any Gemini model
  disableTools: true, // Required for ALL Gemini models when using schema
});
```

**With Extended Thinking (Gemini 3):**

```typescript
// ✅ Using schema with Gemini 3 Extended Thinking
const result = await neurolink.generate({
  input: { text: "Analyze this complex data and provide structured insights" },
  schema: MyZodSchema,
  output: { format: "json" },
  provider: "vertex",
  model: "gemini-3-pro-preview",
  thinkingLevel: "high",
  disableTools: true, // Still required even with thinking enabled
});
```

**Claude models work without restriction:**

```typescript
// ✅ Claude via Vertex AI supports both
const result = await neurolink.generate({
  input: { text: "Analyze this data" },
  schema: MyZodSchema,
  output: { format: "json" },
  provider: "vertex",
  model: "claude-3-5-sonnet-20241022",
  // No disableTools needed - Claude supports both
});
```

**Industry Context:**

- This limitation affects ALL frameworks using Gemini (LangChain, Vercel AI SDK, Agno, Instructor)
- All use the same workaround: disable tools when using schemas
- Future Gemini versions may support both - check official Google Cloud documentation for updates

### Preview Model Rate Limits (Gemini 3)

**Preview models** (`gemini-3-pro-preview`, `gemini-3-flash-preview`) have stricter rate limits than production models:

- Lower requests per minute (RPM) quotas
- Lower tokens per minute (TPM) quotas
- Potential for API changes without notice
- Not recommended for production workloads without fallback

**Recommended pattern for production:**

```typescript
const ai = new NeuroLink({
  providers: [
    // Primary: Gemini 3 preview
    {
      name: "vertex-gemini3",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "us-central1",
      },
      model: "gemini-3-pro-preview",
      priority: 1,
    },
    // Fallback: Stable Gemini 2
    {
      name: "vertex-gemini2",
      config: {
        projectId: process.env.GOOGLE_VERTEX_PROJECT_ID,
        location: "us-central1",
      },
      model: "gemini-2.0-flash",
      priority: 2,
    },
  ],
  failoverConfig: { enabled: true },
});
```

### Complex Schema Limitations

**"Too many states for serving" Error:**

When using complex Zod schemas with Gemini, you may encounter:

```
Error: 9 FAILED_PRECONDITION: Too many states for serving
```

**Solutions:**

1. Simplify schema (reduce nesting, array sizes)
2. Use `disableTools: true` (reduces state count)
3. Use Claude models via Vertex AI (no such limitation)

See [Troubleshooting Guide](/docs/reference/troubleshooting) for details.

---

## Related Documentation

- **[Provider Setup Guide](/docs/getting-started/provider-setup)** - General configuration
- **[Multi-Region Deployment](/docs/guides/enterprise/multi-region)** - Geographic distribution
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce costs
- **[Compliance Guide](/docs/guides/enterprise/compliance)** - Security

---

## Additional Resources

- **[Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs)** - Official docs
- **[Vertex AI Pricing](https://cloud.google.com/vertex-ai/pricing)** - Pricing calculator
- **[GCP Console](https://console.cloud.google.com/)** - Manage resources
- **[gcloud CLI](https://cloud.google.com/sdk/gcloud)** - Command-line tool

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Hugging Face Provider Guide

<!-- Source: getting-started/providers/huggingface.md -->

# Hugging Face Provider Guide

**Access 100,000+ open-source AI models through Hugging Face's free inference API**

## Quick Start

### 1. Get Your API Token

1. Visit [Hugging Face](https://huggingface.co/)
2. Create a free account (no credit card required)
3. Go to [Settings → Access Tokens](https://huggingface.co/settings/tokens)
4. Click "New token"
5. Give it a name (e.g., "NeuroLink")
6. Select "Read" permissions
7. Copy the token (starts with `hf_...`)

### 2. Configure NeuroLink

Add to your `.env` file:

```bash
HUGGINGFACE_API_KEY=hf_your_token_here
```

:::warning[Security Best Practice]
Never commit your API token to version control. Always use environment variables and add `.env` to your `.gitignore` file.
:::

### 3. Test the Setup

```bash
# CLI - Test with default model
npx @juspay/neurolink generate "Hello from Hugging Face!" --provider huggingface

# CLI - Use specific model
npx @juspay/neurolink generate "Write a poem" --provider huggingface --model "mistralai/Mistral-7B-Instruct-v0.2"

# SDK
node -e "
const { NeuroLink } = require('@juspay/neurolink');
(async () => {
  const ai = new NeuroLink();
  const result = await ai.generate({
    input: { text: 'Hello from Hugging Face!' },
    provider: 'huggingface'
  });
  console.log(result.content);
})();
"
```

---

## Model Selection Guide

### Popular Models by Category

#### 1. **General Text Generation**

| Model                                | Size | Description                        | Best For                      |
| ------------------------------------ | ---- | ---------------------------------- | ----------------------------- |
| `mistralai/Mistral-7B-Instruct-v0.2` | 7B   | High-quality instruction following | General tasks, fast responses |
| `meta-llama/Llama-2-7b-chat-hf`      | 7B   | Meta's open chat model             | Conversational AI             |
| `tiiuae/falcon-7b-instruct`          | 7B   | Efficient, multilingual            | Multiple languages            |
| `google/flan-t5-xxl`                 | 11B  | Google's instruction-tuned         | Q&A, summarization            |

#### 2. **Code Generation**

| Model                           | Description                | Best For             |
| ------------------------------- | -------------------------- | -------------------- |
| `bigcode/starcoder`             | Code generation specialist | Writing code         |
| `Salesforce/codegen-16B-mono`   | Python-focused             | Python development   |
| `WizardLM/WizardCoder-15B-V1.0` | Code instruction following | Complex coding tasks |

#### 3. **Summarization**

| Model                           | Description           | Best For             |
| ------------------------------- | --------------------- | -------------------- |
| `facebook/bart-large-cnn`       | News summarization    | Articles, news       |
| `sshleifer/distilbart-cnn-12-6` | Faster BART variant   | Quick summaries      |
| `google/pegasus-xsum`           | Extreme summarization | Very brief summaries |

#### 4. **Translation**

| Model                                      | Languages      | Best For                   |
| ------------------------------------------ | -------------- | -------------------------- |
| `facebook/mbart-large-50-many-to-many-mmt` | 50 languages   | Multi-language translation |
| `Helsinki-NLP/opus-mt-*`                   | Language pairs | Specific language pairs    |

#### 5. **Question Answering**

| Model                                   | Description   | Best For      |
| --------------------------------------- | ------------- | ------------- |
| `deepset/roberta-base-squad2`           | SQuAD-trained | Factual Q&A   |
| `distilbert-base-cased-distilled-squad` | Faster QA     | Quick answers |

### Model Selection by Use Case

```typescript
// General conversation
const general = await ai.generate({
  input: { text: "Explain quantum computing" },
  provider: "huggingface",
  model: "mistralai/Mistral-7B-Instruct-v0.2",
});

// Code generation
const code = await ai.generate({
  input: { text: "Write a Python function to sort a list" },
  provider: "huggingface",
  model: "bigcode/starcoder",
});

// Summarization
const summary = await ai.generate({
  input: { text: "Summarize: [long article text]" },
  provider: "huggingface",
  model: "facebook/bart-large-cnn",
});

// Translation
const translation = await ai.generate({
  input: { text: "Translate to French: Hello, how are you?" },
  provider: "huggingface",
  model: "facebook/mbart-large-50-many-to-many-mmt",
});
```

---

## Free Tier Details

### What's Included

- ✅ **Unlimited requests** to public models
- ✅ **No cost** - completely free
- ✅ **No credit card** required
- ✅ **Rate limits**: 1,000 requests/day per model (generous)
- ✅ **Access to 100,000+** public models

### Rate Limits

- **Per Model**: ~1,000 requests/day
- **Strategy**: Use different models to scale
- **Best Practice**: Combine with other providers for production

```typescript
// Rate limit friendly approach
const ai = new NeuroLink({
  providers: [
    { name: "huggingface", priority: 1 }, // Free tier first
    { name: "google-ai", priority: 2 }, // Fallback to Google AI
  ],
});
```

### Limitations

⚠️ **Free Tier Constraints:**

- Models load on-demand (first request may be slow)
- Rate limits per model (use multiple models to scale)
- No guaranteed uptime (community infrastructure)
- Some popular models may have queues

 **For Production:**

- Use Hugging Face for experimentation
- Consider paid inference for critical workloads
- Combine with other providers for reliability

---

## SDK Integration

### Basic Usage

```typescript

const ai = new NeuroLink();

// Simple generation
const result = await ai.generate({
  input: { text: "Write a haiku about coding" },
  provider: "huggingface",
});

console.log(result.content);
```

### With Specific Model

```typescript
// Use Mistral for instruction following
const mistral = await ai.generate({
  input: { text: "Explain Docker in simple terms" },
  provider: "huggingface",
  model: "mistralai/Mistral-7B-Instruct-v0.2",
});

// Use StarCoder for code generation
const starcoder = await ai.generate({
  input: { text: "Create a REST API endpoint in Express.js" },
  provider: "huggingface",
  model: "bigcode/starcoder",
});
```

### Multi-Model Strategy

```typescript
// Try multiple models for best results
const models = [
  "mistralai/Mistral-7B-Instruct-v0.2",
  "meta-llama/Llama-2-7b-chat-hf",
  "tiiuae/falcon-7b-instruct",
];

for (const model of models) {
  try {
    const result = await ai.generate({
      input: { text: "Your prompt here" },
      provider: "huggingface",
      model,
    });
    console.log(`${model}: ${result.content}`);
  } catch (error) {
    console.log(`${model} failed, trying next...`);
  }
}
```

### With Streaming

```typescript
// Stream responses for better UX
for await (const chunk of ai.stream({
  input: { text: "Write a long story about space exploration" },
  provider: "huggingface",
  model: "mistralai/Mistral-7B-Instruct-v0.2",
})) {
  process.stdout.write(chunk.content);
}
```

### With Error Handling

```typescript
try {
  const result = await ai.generate({
    input: { text: "Your prompt" },
    provider: "huggingface",
    maxTokens: 500,
    temperature: 0.7,
  });
  console.log(result.content);
} catch (error) {
  if (error.message.includes("rate limit")) {
    console.log("Rate limited - try another model or wait");
  } else if (error.message.includes("loading")) {
    console.log("Model is loading - try again in a moment");
  } else {
    console.error("Error:", error.message);
  }
}
```

---

## CLI Usage

### Basic Commands

```bash
# Generate with default model
npx @juspay/neurolink generate "Hello world" --provider huggingface

# Use specific model
npx @juspay/neurolink gen "Write code" --provider huggingface --model "bigcode/starcoder"

# Stream response
npx @juspay/neurolink stream "Tell a story" --provider huggingface

# Check available models
npx @juspay/neurolink models --provider huggingface
```

### Advanced Usage

```bash
# With temperature control
npx @juspay/neurolink gen "Creative story" \
  --provider huggingface \
  --model "mistralai/Mistral-7B-Instruct-v0.2" \
  --temperature 0.9 \
  --max-tokens 1000

# Save output to file
npx @juspay/neurolink gen "Technical documentation" \
  --provider huggingface \
  --model "tiiuae/falcon-7b-instruct" \
  > output.txt

# Interactive mode
npx @juspay/neurolink loop --provider huggingface
```

### Model Comparison

```bash
# Compare different models
for model in "mistralai/Mistral-7B-Instruct-v0.2" \
             "meta-llama/Llama-2-7b-chat-hf" \
             "tiiuae/falcon-7b-instruct"; do
  echo "Testing $model:"
  npx @juspay/neurolink gen "What is AI?" \
    --provider huggingface \
    --model "$model"
  echo "---"
done
```

---

## Configuration Options

### Environment Variables

```bash
# Required
HUGGINGFACE_API_KEY=hf_your_token_here

# Optional
HUGGINGFACE_BASE_URL=https://api-inference.huggingface.co  # Custom endpoint
HUGGINGFACE_DEFAULT_MODEL=mistralai/Mistral-7B-Instruct-v0.2  # Default model
HUGGINGFACE_TIMEOUT=60000  # Request timeout (ms)
```

### Programmatic Configuration

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "huggingface",
      config: {
        apiKey: process.env.HUGGINGFACE_API_KEY,
        defaultModel: "mistralai/Mistral-7B-Instruct-v0.2",
        timeout: 60000,
      },
    },
  ],
});
```

---

## Troubleshooting

### Common Issues

#### 1. "Model is currently loading"

**Problem**: Model hasn't been used recently and needs to load.

**Solution**:

```bash
# Wait 20-30 seconds and retry
# Or use a popular model that's always loaded
npx @juspay/neurolink gen "test" \
  --provider huggingface \
  --model "mistralai/Mistral-7B-Instruct-v0.2"
```

#### 2. "Rate limit exceeded"

**Problem**: Hit the ~1,000 requests/day limit for a model.

**Solution**:

```typescript
// Switch to a different model
const alternativeModels = [
  "mistralai/Mistral-7B-Instruct-v0.2",
  "tiiuae/falcon-7b-instruct",
  "meta-llama/Llama-2-7b-chat-hf",
];

// Or use multi-provider fallback
const ai = new NeuroLink({
  providers: [
    { name: "huggingface", priority: 1 },
    { name: "google-ai", priority: 2 }, // Fallback
  ],
});
```

#### 3. "Invalid API token"

**Problem**: Token is incorrect or expired.

**Solution**:

1. Verify token at https://huggingface.co/settings/tokens
2. Ensure token has "Read" permissions
3. Check for typos in `.env` file
4. Token should start with `hf_`

#### 4. "Model not found"

**Problem**: Model name is incorrect or private.

**Solution**:

```bash
# Verify model exists at huggingface.co
# Use exact model ID: username/model-name
npx @juspay/neurolink gen "test" \
  --provider huggingface \
  --model "mistralai/Mistral-7B-Instruct-v0.2"  # ✅ Correct format
```

#### 5. Slow Response Times

**Problem**: Model is loading or under high load.

**Solution**:

- Use popular models (always loaded)
- Add timeout handling
- Consider caching results
- Use streaming for long responses

```typescript
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "huggingface",
  timeout: 120000, // 2 minute timeout
});
```

---

## Best Practices

### 1. Model Selection

```typescript
// ✅ Good: Use appropriate model for task
const code = await ai.generate({
  input: { text: "Write a function" },
  model: "bigcode/starcoder", // Code specialist
});

// ❌ Avoid: Using general model for specialized tasks
const badCode = await ai.generate({
  input: { text: "Write a function" },
  model: "google/flan-t5-xxl", // General model
});
```

### 2. Rate Limit Management

```typescript
// ✅ Good: Rotate between models
const models = [
  "mistralai/Mistral-7B-Instruct-v0.2",
  "tiiuae/falcon-7b-instruct",
  "meta-llama/Llama-2-7b-chat-hf",
];

let requestCount = 0; // Track the number of requests
const modelIndex = requestCount % models.length;
const result = await ai.generate({
  input: { text: prompt },
  provider: "huggingface",
  model: models[modelIndex],
});
requestCount++; // Increment after each request
```

### 3. Error Handling

```typescript
// ✅ Good: Handle model loading gracefully
async function generateWithRetry(prompt, maxRetries = 3) {
  for (let i = 0; i  setTimeout(resolve, 30000));
      } else {
        throw error;
      }
    }
  }
}
```

### 4. Production Deployment

```typescript
// ✅ Good: Use Hugging Face with fallback
const ai = new NeuroLink({
  providers: [
    {
      name: "huggingface",
      priority: 1,
      config: {
        defaultModel: "mistralai/Mistral-7B-Instruct-v0.2",
      },
    },
    {
      name: "google-ai", // Free tier fallback
      priority: 2,
    },
    {
      name: "anthropic", // Paid fallback for critical
      priority: 3,
    },
  ],
});
```

---

## Performance Optimization

### 1. Model Warm-Up

```typescript
// Keep popular models warm with periodic requests
setInterval(async () => {
  await ai.generate({
    input: { text: "ping" },
    provider: "huggingface",
    model: "mistralai/Mistral-7B-Instruct-v0.2",
    maxTokens: 1,
  });
}, 300000); // Every 5 minutes
```

### 2. Caching

```typescript
// Cache responses for repeated queries
const cache = new Map();

async function cachedGenerate(prompt) {
  if (cache.has(prompt)) {
    return cache.get(prompt);
  }

  const result = await ai.generate({
    input: { text: prompt },
    provider: "huggingface",
  });

  cache.set(prompt, result);
  return result;
}
```

### 3. Parallel Requests

```typescript
// Use different models in parallel to avoid rate limits
const prompts = ["prompt1", "prompt2", "prompt3"];
const models = [
  "mistralai/Mistral-7B-Instruct-v0.2",
  "tiiuae/falcon-7b-instruct",
  "meta-llama/Llama-2-7b-chat-hf",
];

const results = await Promise.all(
  prompts.map((prompt, i) =>
    ai.generate({
      input: { text: prompt },
      provider: "huggingface",
      model: models[i],
    }),
  ),
);
```

---

## Related Documentation

- **[Provider Setup Guide](/docs/getting-started/provider-setup)** - General provider configuration
- **[SDK API Reference](/docs/sdk/api-reference)** - Complete API documentation
- **[CLI Commands](/docs/cli/commands)** - CLI reference
- **[Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover)** - Enterprise patterns

---

## Additional Resources

- **[Hugging Face Models](https://huggingface.co/models)** - Browse all models
- **[Hugging Face Inference API](https://huggingface.co/docs/api-inference/index)** - API documentation
- **[Model Cards](https://huggingface.co/docs/hub/model-cards)** - Understanding model capabilities
- **[Hugging Face Hub](https://huggingface.co/docs/hub/index)** - Platform documentation

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Redis Quick Start (5 Minutes)

<!-- Source: getting-started/redis-quickstart.md -->

# Redis Quick Start (5 Minutes)

Get Redis storage up and running with NeuroLink in under 5 minutes.

## Prerequisites

- Docker installed **OR** Redis installed locally
- NeuroLink SDK installed (`pnpm add @juspay/neurolink`)

## Option 1: Docker (Recommended)

The fastest way to get Redis running for development and testing.

### Start Redis Container

```bash
# Start Redis with persistence
docker run -d \
  --name neurolink-redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine

# Verify Redis is running
docker ps | grep neurolink-redis
```

### Test Connection

```bash
# Test Redis connectivity
docker exec -it neurolink-redis redis-cli ping
# Expected output: PONG
```

## Option 2: Local Install

### macOS

```bash
# Install Redis with Homebrew
brew install redis

# Start Redis service
brew services start redis

# Verify installation
redis-cli ping
# Expected output: PONG
```

### Ubuntu/Debian

```bash
# Install Redis
sudo apt update
sudo apt install redis-server -y

# Start Redis service
sudo systemctl start redis-server
sudo systemctl enable redis-server

# Verify installation
redis-cli ping
# Expected output: PONG
```

### Windows (WSL2)

```bash
# Update packages
sudo apt update

# Install Redis
sudo apt install redis-server -y

# Start Redis
sudo service redis-server start

# Test connection
redis-cli ping
# Expected output: PONG
```

## Configure NeuroLink

### 1. Set Environment Variables

```bash
# Add to your .env file
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=  # Leave empty for local dev
REDIS_DB=0
```

### 2. Initialize NeuroLink with Redis

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis",
    redisConfig: {
      host: "localhost",
      port: 6379,
      db: 0,
    },
  },
});

// Use neurolink as normal
const result = await neurolink.generate({
  input: { text: "Hello! How are you?" },
  provider: "openai",
});

console.log(result.content);
```

### 3. Verify Storage

```typescript
// Check conversation persistence
const stats = await neurolink.conversationMemory?.getStats();
console.log(stats); // { totalSessions: 1, totalTurns: 1 }
```

## Quick Verification

### Test Data Persistence

```bash
# In your Node.js console
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis",
    redisConfig: { host: "localhost", port: 6379 }
  }
});

// Generate a conversation
await neurolink.generate({
  input: { text: "Remember this: my favorite color is blue" },
  sessionId: "test-session",
  userId: "test-user",
});

// Stop your app, restart, and verify data persists
const history = await neurolink.conversationMemory?.getUserSessionHistory(
  "test-user",
  "test-session"
);

console.log(history); // Should show your conversation
```

### Check Redis Data

```bash
# Connect to Redis CLI
docker exec -it neurolink-redis redis-cli
# OR (local install)
redis-cli

# List all keys
127.0.0.1:6379> KEYS *
# Expected: Shows NeuroLink conversation keys

# Check a specific session
127.0.0.1:6379> GET neurolink:conversation:test-user:test-session
# Shows conversation data in JSON format
```

## Common Issues

### Connection Refused

**Problem:** Cannot connect to Redis

```bash
# Check if Redis is running
docker ps | grep neurolink-redis
# OR
sudo systemctl status redis-server

# Restart if needed
docker restart neurolink-redis
# OR
sudo systemctl restart redis-server
```

### Port Already in Use

**Problem:** Port 6379 is already taken

```bash
# Use a different port for Redis
docker run -d --name neurolink-redis -p 6380:6379 redis:7-alpine

# Update NeuroLink config
redisConfig: { host: "localhost", port: 6380 }
```

### Permission Denied

**Problem:** Cannot access Redis socket (Linux)

```bash
# Add your user to the redis group
sudo usermod -a -G redis $USER

# Restart Redis
sudo systemctl restart redis-server
```

## Next Steps

- **[Complete Redis Configuration Guide](/docs/guides/redis-configuration)** - Production setup, clustering, security
- **[Redis Migration Patterns](/docs/guides/redis-migration)** - Migrate from in-memory to Redis
- **[Conversation Memory Guide](/docs/features/conversation-history)** - Advanced conversation management

## Production Checklist

Before going to production, review:

- [ ] **Security**: Set `requirepass` in Redis configuration
- [ ] **Persistence**: Enable AOF (Append-Only File) for data durability
- [ ] **Monitoring**: Set up health checks and alerts
- [ ] **Backup**: Configure automated backup schedule
- [ ] **Performance**: Tune `maxmemory` and eviction policies

See the [Complete Redis Configuration Guide](/docs/guides/redis-configuration) for production best practices.

---

**Need Help?** Check our [Troubleshooting Guide](/docs/reference/troubleshooting) or open an issue on [GitHub](https://github.com/juspay/neurolink).

---

## LiteLLM Provider Guide

<!-- Source: getting-started/providers/litellm.md -->

# LiteLLM Provider Guide

**Access 100+ AI providers through a unified OpenAI-compatible proxy with advanced features**

## Quick Start

### Option 1: Direct Integration (SDK Only)

Use LiteLLM directly in your code without running a proxy server.

#### 1. Install LiteLLM

```bash
pip install litellm
```

#### 2. Configure NeuroLink

```bash
# Add provider API keys to .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=AIza...
```

#### 3. Use via LiteLLM Python Client

```python

# Use any provider with OpenAI-compatible interface
response = litellm.completion(
  model="gpt-4",
  messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

# Switch providers easily
response = litellm.completion(
  model="claude-3-5-sonnet-20241022",  # Anthropic
  messages=[{"role": "user", "content": "Hello!"}]
)

response = litellm.completion(
  model="gemini/gemini-pro",  # Google AI
  messages=[{"role": "user", "content": "Hello!"}]
)
```

### Option 2: Proxy Server (Recommended for Teams)

Run LiteLLM as a standalone proxy server for team-wide access.

#### 1. Install LiteLLM

```bash
pip install 'litellm[proxy]'
```

#### 2. Create Configuration File

Create `litellm_config.yaml`:

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: gpt-4
      api_key: ${OPENAI_API_KEY} # Use env vars for all secrets

  - model_name: claude-3-5-sonnet
    litellm_params:
      model: claude-3-5-sonnet-20241022
      api_key: ${ANTHROPIC_API_KEY} # Use env vars for all secrets

  - model_name: gemini-pro
    litellm_params:
      model: gemini/gemini-pro
      api_key: ${GOOGLE_API_KEY} # Use env vars for all secrets

  # Optional: Load balancing across multiple instances
  # SECURITY: Use environment variables or secret management (e.g., AWS Secrets Manager, HashiCorp Vault)
  - model_name: gpt-4-balanced
    litellm_params:
      model: gpt-4
      api_key: ${OPENAI_API_KEY_1} # Use env vars for all secrets

  - model_name: gpt-4-balanced
    litellm_params:
      model: gpt-4
      api_key: ${OPENAI_API_KEY_2} # Use env vars for all secrets

general_settings:
  master_key: ${LITELLM_MASTER_KEY} # Use env vars for all secrets
  database_url: "postgresql://..." # Optional: for persistence
```

#### 3. Start Proxy Server

```bash
litellm --config litellm_config.yaml --port 8000
```

#### 4. Configure NeuroLink to Use Proxy

```bash
# Add to .env
OPENAI_COMPATIBLE_BASE_URL=http://localhost:8000
OPENAI_COMPATIBLE_API_KEY=sk-1234  # Your master_key from config
```

#### 5. Test Setup

```bash
# Test via NeuroLink
npx @juspay/neurolink generate "Hello from LiteLLM!" \
  --provider openai-compatible \
  --model "gpt-4"

# Or use any OpenAI-compatible client
curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

---

## Provider Support

### Supported Providers (100+)

LiteLLM supports all major AI providers:

| Category        | Providers                                                             |
| --------------- | --------------------------------------------------------------------- |
| **Major Cloud** | OpenAI, Anthropic, Google (Gemini, Vertex), Azure OpenAI, AWS Bedrock |
| **Open Source** | Hugging Face, Together AI, Replicate, Ollama, vLLM, LocalAI           |
| **Specialized** | Cohere, AI21, Aleph Alpha, Perplexity, Groq, Fireworks AI             |
| **Aggregators** | OpenRouter, Anyscale, Deep Infra, Mistral AI                          |
| **Enterprise**  | SageMaker, Cloudflare Workers AI, Azure AI Studio                     |
| **Custom**      | Any OpenAI-compatible endpoint                                        |

### Model Name Format

```yaml
# OpenAI (default prefix)
model: gpt-4                    # openai/gpt-4
model: gpt-4o-mini             # openai/gpt-4o-mini

# Anthropic
model: claude-3-5-sonnet-20241022      # anthropic/claude-3-5-sonnet
model: anthropic/claude-3-opus-20240229

# Google AI
model: gemini/gemini-pro              # Google AI Studio
model: vertex_ai/gemini-pro           # Vertex AI

# Azure OpenAI
model: azure/gpt-4                    # Requires azure config

# AWS Bedrock
model: bedrock/anthropic.claude-3-sonnet-20240229-v1:0

# Ollama (local)
model: ollama/llama2                  # Requires Ollama running

# Hugging Face
model: huggingface/mistralai/Mistral-7B-Instruct-v0.2

# OpenRouter
model: openrouter/anthropic/claude-3.5-sonnet

# Together AI
model: together_ai/meta-llama/Llama-3-70b-chat-hf

# Full list: https://docs.litellm.ai/docs/providers
```

---

## Advanced Features

### 1. Load Balancing

Distribute requests across multiple providers or API keys:

```yaml
# litellm_config.yaml
model_list:
  # Load balance across multiple OpenAI keys
  - model_name: gpt-4-loadbalanced
    litellm_params:
      model: gpt-4
      api_key: sk-key-1...

  - model_name: gpt-4-loadbalanced
    litellm_params:
      model: gpt-4
      api_key: sk-key-2...

  - model_name: gpt-4-loadbalanced
    litellm_params:
      model: gpt-4
      api_key: sk-key-3...

router_settings:
  routing_strategy: simple-shuffle # Round-robin across keys
  # or: least-busy, usage-based-routing, latency-based-routing
```

Usage with NeuroLink:

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "openai-compatible",
      config: {
        baseUrl: "http://localhost:8000",
        apiKey: "sk-1234",
      },
    },
  ],
});

// Requests automatically balanced across all 3 API keys
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "openai-compatible",
  model: "gpt-4-loadbalanced",
});
```

### 2. Automatic Failover

Configure fallback providers for reliability:

```yaml
# litellm_config.yaml
model_list:
  # Primary: OpenAI
  - model_name: smart-model
    litellm_params:
      model: gpt-4
      api_key: sk-...

  # Fallback 1: Anthropic
  - model_name: smart-model
    litellm_params:
      model: claude-3-5-sonnet-20241022
      api_key: sk-ant-...

  # Fallback 2: Google
  - model_name: smart-model
    litellm_params:
      model: gemini/gemini-pro
      api_key: AIza...

router_settings:
  enable_fallbacks: true
  fallback_timeout: 30 # Seconds before trying fallback
  num_retries: 2
```

### 3. Budget Management

Set spending limits per user/team:

```yaml
# litellm_config.yaml
general_settings:
  master_key: sk-1234
  database_url: "postgresql://..." # Required for budgets

# Create virtual keys with budgets
# litellm --config config.yaml --create_key \
#   --key_name "team-frontend" \
#   --budget 100  # $100 limit
```

Track spending:

```python
# Check budget status

budget_info = litellm.get_budget(api_key="sk-team-frontend-...")
print(f"Spent: ${budget_info['total_spend']}")
print(f"Budget: ${budget_info['max_budget']}")
```

### 4. Rate Limiting

Control request rates per user/model:

```yaml
# litellm_config.yaml
model_list:
  - model_name: gpt-4-limited
    litellm_params:
      model: gpt-4
      api_key: sk-...
    model_info:
      max_parallel_requests: 10 # Max concurrent requests
      max_requests_per_minute: 100 # RPM limit
      max_tokens_per_minute: 100000 # TPM limit
```

### 5. Caching

Reduce costs by caching responses:

```yaml
# litellm_config.yaml
general_settings:
  cache: true
  cache_params:
    type: redis
    host: localhost
    port: 6379
    ttl: 3600 # Cache for 1 hour
```

Usage:

```typescript
// Identical requests within TTL return cached results
const result1 = await ai.generate({
  input: { text: "What is AI?" },
  provider: "openai-compatible",
  model: "gpt-4",
});
// Cost: $0.03

const result2 = await ai.generate({
  input: { text: "What is AI?" }, // Same query
  provider: "openai-compatible",
  model: "gpt-4",
});
// Cost: $0.00 (cached)
```

### 6. Virtual Keys (Team Management)

Create team-specific API keys with permissions:

```bash
# Create key for frontend team with budget
litellm --config config.yaml --create_key \
  --key_name "team-frontend" \
  --budget 100 \
  --models "gpt-4,claude-3-5-sonnet"

# Create key for backend team
litellm --config config.yaml --create_key \
  --key_name "team-backend" \
  --budget 500 \
  --models "gpt-4,gpt-4o-mini,claude-3-5-sonnet"

# Returns: sk-litellm-team-frontend-abc123...
```

Teams use their virtual key:

```bash
OPENAI_COMPATIBLE_API_KEY=sk-litellm-team-frontend-abc123
```

---

## NeuroLink Integration

### Basic Usage

```typescript

const ai = new NeuroLink({
  providers: [
    {
      name: "openai-compatible",
      config: {
        baseUrl: "http://localhost:8000", // LiteLLM proxy
        apiKey: process.env.LITELLM_KEY, // Master key or virtual key
      },
    },
  ],
});

// Use any provider through LiteLLM
const result = await ai.generate({
  input: { text: "Hello!" },
  provider: "openai-compatible",
  model: "gpt-4",
});
```

### Multi-Model Workflow

```typescript
// Easy switching between providers via LiteLLM
const models = {
  fast: "gpt-4o-mini",
  balanced: "claude-3-5-sonnet-20241022",
  powerful: "gpt-4",
};

async function generateSmart(
  prompt: string,
  complexity: "low" | "medium" | "high",
) {
  const modelMap = {
    low: models.fast,
    medium: models.balanced,
    high: models.powerful,
  };

  return await ai.generate({
    input: { text: prompt },
    provider: "openai-compatible",
    model: modelMap[complexity],
  });
}
```

### Cost Tracking

```typescript
// LiteLLM provides detailed cost tracking
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "openai-compatible",
  model: "gpt-4",
  enableAnalytics: true,
});

console.log("Model used:", result.model);
console.log("Tokens:", result.usage.totalTokens);
console.log("Cost:", result.cost); // Calculated by LiteLLM
```

---

## CLI Usage

### Basic Commands

```bash
# Start LiteLLM proxy
litellm --config litellm_config.yaml --port 8000

# Use via NeuroLink CLI
npx @juspay/neurolink generate "Hello LiteLLM" \
  --provider openai-compatible \
  --model "gpt-4"

# Switch models easily
npx @juspay/neurolink gen "Write code" \
  --provider openai-compatible \
  --model "claude-3-5-sonnet-20241022"

# Check proxy status
curl http://localhost:8000/health
```

### Proxy Management

```bash
# Create virtual key
litellm --config config.yaml --create_key \
  --key_name "my-team" \
  --budget 100

# List all keys
litellm --config config.yaml --list_keys

# Delete key
litellm --config config.yaml --delete_key \
  --key "sk-litellm-abc123..."

# View spend by key
litellm --config config.yaml --spend \
  --key "sk-litellm-abc123..."
```

---

## Production Deployment

### Docker Deployment

```dockerfile
# Dockerfile
FROM ghcr.io/berriai/litellm:main-latest

COPY litellm_config.yaml /app/config.yaml

EXPOSE 8000

CMD ["litellm", "--config", "/app/config.yaml", "--port", "8000"]
```

```bash
# Build and run
docker build -t litellm-proxy .
docker run -p 8000:8000 litellm-proxy
```

### Docker Compose

```yaml
# docker-compose.yml
version: "3.8"

services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    ports:
      - "8000:8000"
    volumes:
      - ./litellm_config.yaml:/app/config.yaml
    command: ["litellm", "--config", "/app/config.yaml", "--port", "8000"]
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/litellm
    depends_on:
      - postgres

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=litellm
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```

### Kubernetes Deployment

```yaml
# litellm-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: litellm-proxy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: litellm
  template:
    metadata:
      labels:
        app: litellm
    spec:
      containers:
        - name: litellm
          image: ghcr.io/berriai/litellm:main-latest
          ports:
            - containerPort: 8000
          volumeMounts:
            - name: config
              mountPath: /app
          command: ["litellm", "--config", "/app/config.yaml", "--port", "8000"]
      volumes:
        - name: config
          configMap:
            name: litellm-config
---
apiVersion: v1
kind: Service
metadata:
  name: litellm-service
spec:
  selector:
    app: litellm
  ports:
    - port: 80
      targetPort: 8000
  type: LoadBalancer
```

### High Availability Setup

```yaml
# litellm_config.yaml - Production
model_list:
  # Multiple instances of each model
  - model_name: gpt-4-ha
    litellm_params:
      model: gpt-4
      api_key: sk-key-1...

  - model_name: gpt-4-ha
    litellm_params:
      model: gpt-4
      api_key: sk-key-2...

  - model_name: gpt-4-ha
    litellm_params:
      model: gpt-4
      api_key: sk-key-3...

general_settings:
  master_key: ${LITELLM_MASTER_KEY}
  database_url: ${DATABASE_URL}

  # Observability
  success_callback: ["langfuse", "prometheus"]
  failure_callback: ["sentry"]

  # Performance
  num_workers: 4
  cache: true
  cache_params:
    type: redis
    host: redis-cluster
    port: 6379

router_settings:
  routing_strategy: latency-based-routing
  enable_fallbacks: true
  num_retries: 3
  timeout: 30
  cooldown_time: 60
```

---

## Observability & Monitoring

### Logging

```yaml
# litellm_config.yaml
general_settings:
  success_callback: ["langfuse"] # Log successful requests
  failure_callback: ["sentry"] # Log failures

  # Langfuse integration for observability
  langfuse_public_key: ${LANGFUSE_PUBLIC_KEY}
  langfuse_secret_key: ${LANGFUSE_SECRET_KEY}
```

### Prometheus Metrics

```yaml
# litellm_config.yaml
general_settings:
  success_callback: ["prometheus"]
# Metrics available at http://localhost:8000/metrics
# - litellm_requests_total
# - litellm_request_duration_seconds
# - litellm_tokens_total
# - litellm_cost_total
```

### Custom Logging

```typescript
// Add custom metadata to requests
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "openai-compatible",
  model: "gpt-4",
  metadata: {
    user_id: "user-123",
    team: "frontend",
    environment: "production",
  },
});
```

---

## Troubleshooting

### Common Issues

#### 1. "Connection refused"

**Problem**: LiteLLM proxy not running.

**Solution**:

```bash
# Check if proxy is running
curl http://localhost:8000/health

# Start proxy
litellm --config litellm_config.yaml --port 8000

# Check logs
litellm --config config.yaml --debug
```

#### 2. "Invalid API key"

**Problem**: Master key or virtual key incorrect.

**Solution**:

```bash
# Verify master_key in config
grep master_key litellm_config.yaml

# List all virtual keys
litellm --config config.yaml --list_keys

# Ensure key matches in .env
echo $OPENAI_COMPATIBLE_API_KEY
```

#### 3. "Budget exceeded"

**Problem**: Virtual key reached budget limit.

**Solution**:

```bash
# Check spend
litellm --config config.yaml --spend --key "sk-litellm-..."

# Increase budget
litellm --config config.yaml --update_key \
  --key "sk-litellm-..." \
  --budget 200
```

#### 4. "Model not found"

**Problem**: Model not configured in `model_list`.

**Solution**:

```yaml
# Add model to litellm_config.yaml
model_list:
  - model_name: your-model
    litellm_params:
      model: gpt-4
      api_key: sk-...

# Restart proxy
litellm --config litellm_config.yaml
```

---

## Best Practices

### 1. Use Virtual Keys

```yaml
# ✅ Good: Separate keys per team
# Team Frontend: sk-litellm-frontend-abc
# Team Backend: sk-litellm-backend-xyz
# Each with own budget and model access
```

### 2. Enable Fallbacks

```yaml
# ✅ Good: Configure fallback providers
router_settings:
  enable_fallbacks: true
  fallback_models: ["claude-3-5-sonnet-20241022", "gemini/gemini-pro"]
```

### 3. Implement Caching

```yaml
# ✅ Good: Cache frequent queries
general_settings:
  cache: true
  cache_params:
    ttl: 3600 # 1 hour
```

### 4. Monitor Costs

```yaml
# ✅ Good: Track spending
general_settings:
  success_callback: ["langfuse", "prometheus"]
# Set budgets per team
# Create alerts when budgets approach limits
```

### 5. Use Load Balancing

```yaml
# ✅ Good: Distribute load across providers
model_list:
  - model_name: production-model
    litellm_params:
      model: gpt-4
      api_key: sk-1...

  - model_name: production-model
    litellm_params:
      model: claude-3-5-sonnet-20241022
      api_key: sk-ant-...

router_settings:
  routing_strategy: usage-based-routing
```

---

## Related Documentation

- **[OpenAI Compatible Guide](/docs/getting-started/providers/openai-compatible)** - OpenAI-compatible providers
- **[Provider Setup Guide](/docs/getting-started/provider-setup)** - General provider configuration
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce AI costs
- **[Load Balancing](/docs/guides/enterprise/load-balancing)** - Distribution strategies

---

## Additional Resources

- **[LiteLLM Documentation](https://docs.litellm.ai/)** - Official docs
- **[Supported Providers](https://docs.litellm.ai/docs/providers)** - 100+ providers list
- **[LiteLLM GitHub](https://github.com/BerriAI/litellm)** - Source code
- **[LiteLLM Proxy Docs](https://docs.litellm.ai/docs/proxy/quick_start)** - Proxy setup

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Mistral AI Provider Guide

<!-- Source: getting-started/providers/mistral.md -->

# Mistral AI Provider Guide

**European AI excellence with GDPR compliance and competitive free tier**

## Quick Start

### 1. Get Your API Key

1. Visit [Mistral AI Console](https://console.mistral.ai/)
2. Create a free account
3. Go to "API Keys" section
4. Click "Create new key"
5. Copy the key (format: `xxx...`)

### 2. Configure NeuroLink

Add to your `.env` file:

```bash
MISTRAL_API_KEY=your_api_key_here
```

### 3. Test the Setup

```bash
# CLI - Test with default model
npx @juspay/neurolink generate "Bonjour! Comment allez-vous?" --provider mistral

# CLI - Use specific model
npx @juspay/neurolink generate "Explain quantum physics" --provider mistral --model "mistral-large-latest"

# SDK
node -e "
const { NeuroLink } = require('@juspay/neurolink');
(async () => {
  const ai = new NeuroLink();
  const result = await ai.generate({
    input: { text: 'Hello from Mistral AI!' },
    provider: 'mistral'
  });
  console.log(result.content);
})();
"
```

---

## Model Selection Guide

### Available Models

| Model                     | Description                       | Context | Best For                  | Pricing        |
| ------------------------- | --------------------------------- | ------- | ------------------------- | -------------- |
| **mistral-large-latest**  | Flagship model, GPT-4 competitive | 128K    | Complex reasoning, coding | €8/1M tokens   |
| **mistral-small-latest**  | Balanced performance/cost         | 128K    | General tasks, production | €2/1M tokens   |
| **mistral-medium-latest** | Mid-tier (deprecated, use large)  | 32K     | Legacy apps               | €2.7/1M tokens |
| **codestral-latest**      | Code specialist                   | 32K     | Code generation, review   | €1/1M tokens   |
| **mistral-embed**         | Embeddings model                  | -       | RAG, semantic search      | €0.1/1M tokens |

### Free Tier Details

✅ **What's Included:**

- **$5 free credits** for new users
- **No time limit** on free credits
- **All models available** on free tier
- **No credit card** required for signup

 **Free Tier Estimate:**

- ~2.5M tokens with mistral-small
- ~625K tokens with mistral-large
- ~5M tokens with codestral

### Model Selection by Use Case

```typescript
// Complex reasoning and analysis
const complex = await ai.generate({
  input: { text: "Analyze this business strategy..." },
  provider: "mistral",
  model: "mistral-large-latest",
});

// General production workloads
const general = await ai.generate({
  input: { text: "Customer support query" },
  provider: "mistral",
  model: "mistral-small-latest",
});

// Code generation and review
const code = await ai.generate({
  input: { text: "Write a REST API in Python" },
  provider: "mistral",
  model: "codestral-latest",
});

// Embeddings for RAG
const embeddings = await ai.generateEmbeddings({
  texts: ["Document 1", "Document 2"],
  provider: "mistral",
  model: "mistral-embed",
});
```

---

## GDPR Compliance & European Deployment

### Why Mistral for EU Companies

**Built-in GDPR Compliance:**

- ✅ European company (France-based)
- ✅ EU data centers
- ✅ GDPR-compliant by design
- ✅ No data sent to US servers
- ✅ Data residency in Europe

### Data Residency Configuration

```typescript
// Ensure EU data residency
const ai = new NeuroLink({
  providers: [
    {
      name: "mistral",
      config: {
        apiKey: process.env.MISTRAL_API_KEY,
        region: "eu", // Explicitly use EU endpoints
      },
    },
  ],
});
```

### GDPR Compliance Checklist

```typescript
// ✅ GDPR-compliant setup
const gdprAI = new NeuroLink({
  providers: [
    {
      name: "mistral",
      config: {
        apiKey: process.env.MISTRAL_API_KEY,
        // Data stays in EU
        region: "eu",
        // Enable audit logging
        enableAudit: true,
        // Data retention policy
        dataRetention: "30-days",
      },
    },
  ],
});

// Document data processing
const result = await gdprAI.generate({
  input: { text: userQuery },
  provider: "mistral",
  metadata: {
    userId: "anonymized-id",
    purpose: "customer-support",
    legalBasis: "consent",
  },
});
```

### Compliance Features

| Feature              | Mistral AI        | Other Providers |
| -------------------- | ----------------- | --------------- |
| **EU Data Centers**  | ✅ Yes            | ⚠️ Limited      |
| **GDPR Compliance**  | ✅ Built-in       | ⚠️ Varies       |
| **Data Residency**   | ✅ EU-only option | ⚠️ Often US     |
| **Privacy Controls** | ✅ Granular       | ⚠️ Limited      |
| **Audit Logs**       | ✅ Available      | ⚠️ Varies       |

---

## SDK Integration

### Basic Usage

```typescript

const ai = new NeuroLink();

// Simple generation
const result = await ai.generate({
  input: { text: "Explain artificial intelligence" },
  provider: "mistral",
});

console.log(result.content);
```

### With Specific Model

```typescript
// Use Mistral Large for complex tasks
const large = await ai.generate({
  input: { text: "Analyze this complex business scenario..." },
  provider: "mistral",
  model: "mistral-large-latest",
  temperature: 0.7,
  maxTokens: 2000,
});

// Use Codestral for code generation
const code = await ai.generate({
  input: { text: "Create a FastAPI application with authentication" },
  provider: "mistral",
  model: "codestral-latest",
});
```

### Streaming Responses

```typescript
// Stream long responses for better UX
for await (const chunk of ai.stream({
  input: { text: "Write a detailed technical article about microservices" },
  provider: "mistral",
  model: "mistral-large-latest",
})) {
  process.stdout.write(chunk.content);
}
```

### Multi-Language Support

```typescript
// Mistral excels at European languages
const languages = [
  { lang: "French", prompt: "Expliquez la blockchain" },
  { lang: "Spanish", prompt: "Explica la inteligencia artificial" },
  { lang: "German", prompt: "Erkläre maschinelles Lernen" },
  { lang: "Italian", prompt: "Spiega il deep learning" },
];

for (const { lang, prompt } of languages) {
  const result = await ai.generate({
    input: { text: prompt },
    provider: "mistral",
  });
  console.log(`${lang}: ${result.content}`);
}
```

### Cost Tracking

```typescript
// Track costs with analytics
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "mistral",
  model: "mistral-small-latest",
  enableAnalytics: true,
});

// Calculate cost (mistral-small: €2/1M tokens)
const cost = (result.usage.totalTokens / 1_000_000) * 2;
console.log(`Cost: €${cost.toFixed(4)}`);
console.log(`Tokens used: ${result.usage.totalTokens}`);
```

---

## CLI Usage

### Basic Commands

```bash
# Generate with default model
npx @juspay/neurolink generate "Hello Mistral" --provider mistral

# Use specific model
npx @juspay/neurolink gen "Write code" --provider mistral --model "codestral-latest"

# Stream response
npx @juspay/neurolink stream "Tell a story" --provider mistral

# Check status
npx @juspay/neurolink status --provider mistral
```

### Advanced Usage

```bash
# With temperature and max tokens
npx @juspay/neurolink gen "Creative writing" \
  --provider mistral \
  --model "mistral-large-latest" \
  --temperature 0.9 \
  --max-tokens 2000

# Code generation with Codestral
npx @juspay/neurolink gen "Create a React component" \
  --provider mistral \
  --model "codestral-latest" \
  > component.tsx

# Interactive mode
npx @juspay/neurolink loop --provider mistral --model "mistral-large-latest"
```

### Cost-Effective Workflows

```bash
# Use mistral-small for production (cheaper)
npx @juspay/neurolink gen "Customer query: How do I reset my password?" \
  --provider mistral \
  --model "mistral-small-latest"

# Use mistral-large only for complex tasks
npx @juspay/neurolink gen "Analyze quarterly financial performance" \
  --provider mistral \
  --model "mistral-large-latest"
```

---

## Configuration Options

### Environment Variables

```bash
# Required
MISTRAL_API_KEY=your_api_key_here

# Optional
MISTRAL_BASE_URL=https://api.mistral.ai  # Custom endpoint
MISTRAL_DEFAULT_MODEL=mistral-small-latest  # Default model
MISTRAL_TIMEOUT=60000  # Request timeout (ms)
MISTRAL_REGION=eu  # Enforce EU endpoints
```

### Programmatic Configuration

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "mistral",
      config: {
        apiKey: process.env.MISTRAL_API_KEY,
        defaultModel: "mistral-small-latest",
        region: "eu",
        timeout: 60000,
        retryAttempts: 3,
      },
    },
  ],
});
```

---

## Enterprise Deployment

### Production Setup

```typescript
// Enterprise-grade Mistral configuration
const enterpriseAI = new NeuroLink({
  providers: [
    {
      name: "mistral",
      priority: 1,
      config: {
        apiKey: process.env.MISTRAL_API_KEY,
        region: "eu",
        enableAudit: true,

        // Rate limiting
        rateLimit: {
          requestsPerMinute: 100,
          tokensPerMinute: 1_000_000,
        },

        // Retry logic
        retryAttempts: 3,
        retryDelay: 1000,

        // Timeouts
        timeout: 120000,
      },
    },
    {
      name: "anthropic", // Fallback for critical workloads
      priority: 2,
    },
  ],
});
```

### Multi-Region Deployment

```typescript
// Serve EU and global users
const multiRegionAI = new NeuroLink({
  providers: [
    {
      name: "mistral",
      region: "eu",
      priority: 1,
      condition: (req) => req.userRegion === "EU",
    },
    {
      name: "openai",
      priority: 1,
      condition: (req) => req.userRegion !== "EU",
    },
  ],
});
```

### Cost Optimization

```typescript
// Smart model selection based on complexity
async function generateWithCostOptimization(prompt: string) {
  const complexity = estimateComplexity(prompt);

  const model =
    complexity > 0.7
      ? "mistral-large-latest" // Complex: €8/1M
      : "mistral-small-latest"; // Simple: €2/1M

  return await ai.generate({
    input: { text: prompt },
    provider: "mistral",
    model,
  });
}

function estimateComplexity(prompt: string): number {
  // Complexity scoring constants (0-1 scale)
  const LENGTH_WEIGHT = 0.3; // Characters per 1000
  const CODE_COMPLEXITY_WEIGHT = 0.4; // Technical implementation tasks
  const ANALYSIS_COMPLEXITY_WEIGHT = 0.5; // Deep analysis/reasoning tasks
  const LENGTH_SCALE = 1000; // Normalize character count

  const length = prompt.length;
  const hasCodeKeywords = /function|class|api|database/i.test(prompt);
  const hasAnalysisKeywords = /analyze|compare|evaluate|assess/i.test(prompt);

  return (
    (length / LENGTH_SCALE) * LENGTH_WEIGHT +
    (hasCodeKeywords ? CODE_COMPLEXITY_WEIGHT : 0) +
    (hasAnalysisKeywords ? ANALYSIS_COMPLEXITY_WEIGHT : 0)
  );
}
```

---

## Troubleshooting

### Common Issues

#### 1. "Invalid API Key"

**Problem**: API key is incorrect or expired.

**Solution**:

```bash
# Verify key at console.mistral.ai
# Ensure no extra spaces in .env
MISTRAL_API_KEY=your_key_here  # ✅ Correct
MISTRAL_API_KEY= your_key_here # ❌ Extra space
```

#### 2. "Rate Limit Exceeded"

**Problem**: Exceeded free tier or paid tier limits.

**Solution**:

```typescript
// Implement exponential backoff
async function generateWithBackoff(prompt, maxRetries = 3) {
  for (let i = 0; i  setTimeout(r, delay));
      } else {
        throw error;
      }
    }
  }
}
```

#### 3. "Insufficient Credits"

**Problem**: Free tier exhausted.

**Solution**:

- Add payment method in Mistral console
- Use fallback provider
- Monitor usage:

```typescript
// Track usage to avoid surprises
const result = await ai.generate({
  input: { text: prompt },
  provider: "mistral",
  enableAnalytics: true,
});

console.log(`Tokens used: ${result.usage.totalTokens}`);
console.log(`Estimated cost: €${(result.usage.totalTokens / 1_000_000) * 2}`);
```

#### 4. Slow Response Times

**Problem**: Model or network latency.

**Solution**:

```typescript
// Use streaming for immediate feedback
for await (const chunk of ai.stream({
  input: { text: "Long prompt requiring detailed response" },
  provider: "mistral",
})) {
  // Display partial results immediately
  console.log(chunk.content);
}
```

---

## Best Practices

### 1. GDPR-Compliant Usage

```typescript
// ✅ Good: Anonymize user data
const result = await ai.generate({
  input: { text: sanitizeUserInput(userQuery) },
  provider: "mistral",
  metadata: {
    userId: hashUserId(userId), // Hash, don't store raw
    timestamp: new Date().toISOString(),
    purpose: "customer-support",
  },
});

// Document processing
await auditLog.record({
  action: "ai-generation",
  provider: "mistral",
  legalBasis: "legitimate-interest",
  dataRetention: "30-days",
});
```

### 2. Cost Optimization

```typescript
// ✅ Good: Use appropriate model for task
const customerSupport = await ai.generate({
  input: { text: "How do I reset my password?" },
  provider: "mistral",
  model: "mistral-small-latest", // €2/1M vs €8/1M
});

// ✅ Good: Cache common queries
const cache = new Map();
const cacheKey = `mistral:${userQuery}`;

if (cache.has(cacheKey)) {
  return cache.get(cacheKey);
}

const result = await ai.generate({
  input: { text: userQuery },
  provider: "mistral",
});

cache.set(cacheKey, result);
```

### 3. Multi-Language Support

```typescript
// ✅ Good: Leverage Mistral's multilingual strength
const supportedLanguages = ["en", "fr", "es", "de", "it"];

async function generateInLanguage(prompt, language) {
  const languagePrompt =
    language !== "en" ? `[Respond in ${language}] ${prompt}` : prompt;

  return await ai.generate({
    input: { text: languagePrompt },
    provider: "mistral", // Excellent European language support
  });
}
```

---

## Related Documentation

- **[Provider Setup Guide](/docs/getting-started/provider-setup)** - General provider configuration
- **[GDPR Compliance Guide](/docs/guides/enterprise/compliance)** - GDPR implementation
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce AI costs
- **[Multi-Region Deployment](/docs/guides/enterprise/multi-region)** - Geographic distribution

---

## Additional Resources

- **[Mistral AI Console](https://console.mistral.ai/)** - API keys and billing
- **[Mistral AI Documentation](https://docs.mistral.ai/)** - Official docs
- **[Mistral Models](https://docs.mistral.ai/models/)** - Model capabilities
- **[Pricing](https://mistral.ai/pricing/)** - Current pricing

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Ollama Setup Guide

<!-- Source: getting-started/providers/ollama.md -->

#  Ollama Setup Guide

Complete guide for setting up Ollama with NeuroLink for local AI capabilities.

##  macOS Installation

### Method 1: Homebrew (Recommended)

```bash
# Install Ollama
brew install ollama

# Start Ollama service (auto-starts on install)
ollama serve
```

### Method 2: Direct Download

1. Download from [ollama.ai](https://ollama.ai)
2. Open the .dmg file
3. Drag Ollama to Applications
4. Launch from Applications

### Verify Installation

```bash
ollama --version
ollama list
```

##  Linux Installation

### Ubuntu/Debian

```bash
curl -fsSL https://ollama.ai/install.sh | sh
```

### Manual Installation

```bash
# Download binary
curl -L https://ollama.ai/download/ollama-linux-amd64 -o ollama
chmod +x ollama
sudo mv ollama /usr/local/bin/

# Create systemd service
sudo tee /etc/systemd/system/ollama.service > /dev/null <<EOF
[Unit]
Description=Ollama
After=network.target

[Service]
ExecStart=/usr/local/bin/ollama serve
Restart=always
User=$USER

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable ollama
sudo systemctl start ollama
```

## 🪟 Windows Installation

### Requirements

- Windows 10/11
- WSL2 (recommended) or native

### Native Windows

1. Download installer from [ollama.ai](https://ollama.ai)
2. Run the .exe installer
3. Follow installation wizard
4. Ollama starts automatically

### WSL2 (Recommended)

```bash
# Inside WSL2
curl -fsSL https://ollama.ai/install.sh | sh
```

##  Getting Started

### 1. Pull Your First Model

NeuroLink's CLI now checks for the default model (`llama3.2:latest`) and will prompt you to pull it if it's missing. You can also pull other models manually:

```bash
# Pull Llama 2 (default)
ollama pull llama2

# Pull Code Llama for coding
ollama pull codellama

# Pull Mistral for balanced performance
ollama pull mistral
```

### 2. Test with NeuroLink

```bash
# Test Ollama integration
npx @juspay/neurolink generate "Hello from local AI!" --provider ollama

# Use specific model
npx @juspay/neurolink generate "Write code" --provider ollama --model codellama
```

### 3. Manage Models

```bash
# List installed models
ollama list

# Remove a model
ollama rm llama2

# Show model information
ollama show llama2
```

##  Troubleshooting

### Ollama Service Not Running

```bash
# Check status
ollama list  # Should show models or error

# Start manually
ollama serve

# Check if port is in use
lsof -i :11434  # macOS/Linux
netstat -an | findstr 11434  # Windows
```

### Connection Refused

1. Ensure Ollama is running: `ollama serve`
2. Check firewall settings
3. Verify port 11434 is accessible
4. Try: `curl http://localhost:11434/api/tags`

### Model Download Issues

- Check disk space (models are 4-7GB)
- Verify internet connection
- Try alternative model: `ollama pull tinyllama`

### Performance Issues

- Close other applications
- Use smaller models (tinyllama, phi)
- Increase system swap/page file
- Consider GPU acceleration (NVIDIA)

##  Model Recommendations

### By Use Case

- **General Purpose**: llama2 (7B)
- **Coding**: codellama (7B)
- **Fast Responses**: mistral (7B), tinyllama (1B)
- **Creative Writing**: llama2-uncensored
- **Technical Tasks**: mixtral (if you have 48GB+ RAM)

### By System Resources

- **8GB RAM**: tinyllama, phi
- **16GB RAM**: llama2, mistral, codellama
- **32GB+ RAM**: mixtral, llama2:13b

##  Privacy & Security

### Data Privacy

- **100% Local**: No data sent to external servers
- **No Analytics**: Ollama doesn't track usage
- **Air-Gap Capable**: Works completely offline

### Resource Management

```bash
# Set memory limit
OLLAMA_MAX_MEMORY=8GB ollama serve

# Use specific GPU
OLLAMA_CUDA_DEVICE=0 ollama serve

# CPU only mode
OLLAMA_GPU_DRIVER=cpu ollama serve
```

##  Advanced Configuration

### Environment Variables

```bash
# Custom models directory
export OLLAMA_MODELS=/path/to/models

# Custom host (for remote access)
export OLLAMA_HOST=0.0.0.0:11434

# Keep models in memory longer
export OLLAMA_KEEP_ALIVE=10m
```

### Remote Access

```bash
# Allow remote connections
OLLAMA_HOST=0.0.0.0:11434 ollama serve

# Connect from NeuroLink
export OLLAMA_BASE_URL=http://remote-host:11434
```

### GPU Acceleration

- **NVIDIA**: Automatically detected if CUDA is installed
- **AMD**: ROCm support on Linux
- **Apple Silicon**: Metal acceleration on M1/M2/M3

##  Resources

- [Ollama Documentation](https://github.com/ollama/ollama)
- [Model Library](https://ollama.ai/library)
- [NeuroLink Examples](/docs/)
- [Community Discord](https://discord.gg/ollama)

---

## OpenAI-Compatible Providers Guide

<!-- Source: getting-started/providers/openai-compatible.md -->

# OpenAI Compatible Provider Guide

**Connect to any OpenAI-compatible API: OpenRouter, vLLM, LocalAI, and more**

---------------------- | ------------------------------------ | ---------------------- |
| **OpenRouter**            | AI provider aggregator (100+ models) | Multi-provider access  |
| **vLLM**                  | High-performance inference server    | Self-hosted models     |
| **LocalAI**               | Local OpenAI alternative             | Privacy, offline usage |
| **Text Generation WebUI** | Community inference server           | Local LLMs             |
| **Custom APIs**           | Your own OpenAI-compatible service   | Proprietary models     |

---

## Quick Start

### Option 1: OpenRouter (Recommended for Beginners)

OpenRouter provides access to 100+ models from multiple providers through a single API.

#### 1. Get OpenRouter API Key

1. Visit [OpenRouter.ai](https://openrouter.ai/)
2. Sign up for free account
3. Go to [Keys](https://openrouter.ai/keys)
4. Create new key
5. Add credits ($5 minimum)

#### 2. Configure NeuroLink

```bash
# Add to .env
OPENAI_COMPATIBLE_BASE_URL=https://openrouter.ai/api/v1
OPENAI_COMPATIBLE_API_KEY=sk-or-v1-your-key-here
```

#### 3. Test Setup

```bash
# Auto-discover available models
npx @juspay/neurolink models --provider openai-compatible

# Generate with specific model
npx @juspay/neurolink generate "Hello from OpenRouter!" \
  --provider openai-compatible \
  --model "anthropic/claude-3.5-sonnet"
```

### Option 2: vLLM (Self-Hosted)

vLLM is a high-performance inference server for running models locally.

#### 1. Install vLLM

```bash
# Install vLLM
pip install vllm

# Start server with a model
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --port 8000
```

#### 2. Configure NeuroLink

```bash
# Add to .env
OPENAI_COMPATIBLE_BASE_URL=http://localhost:8000/v1
OPENAI_COMPATIBLE_API_KEY=none  # vLLM doesn't require key
```

#### 3. Test Setup

```bash
npx @juspay/neurolink generate "Hello from vLLM!" \
  --provider openai-compatible
```

### Option 3: LocalAI (Privacy-Focused)

LocalAI runs completely offline for maximum privacy.

#### 1. Install LocalAI

```bash
# Using Docker
docker run -p 8080:8080 \
  -v $PWD/models:/models \
  localai/localai:latest

# Or install directly
curl https://localai.io/install.sh | sh
```

#### 2. Configure NeuroLink

```bash
OPENAI_COMPATIBLE_BASE_URL=http://localhost:8080/v1
OPENAI_COMPATIBLE_API_KEY=none
```

---

## Model Auto-Discovery

NeuroLink automatically discovers available models through the `/v1/models` endpoint.

### Discover Available Models

```bash
# List all models from endpoint
npx @juspay/neurolink models --provider openai-compatible
```

### SDK Auto-Discovery

```typescript

const ai = new NeuroLink();

// Discover models programmatically
const models = await ai.listModels("openai-compatible");
console.log("Available models:", models);

// Use discovered model
const result = await ai.generate({
  input: { text: "Hello!" },
  provider: "openai-compatible",
  model: models[0].id, // Use first available model
});
```

---

## OpenRouter Integration

OpenRouter aggregates 100+ models from multiple providers.

### Available Models on OpenRouter

```bash
# List all OpenRouter models
npx @juspay/neurolink models --provider openai-compatible

# Popular models available:
# - anthropic/claude-3.5-sonnet
# - openai/gpt-4-turbo
# - google/gemini-pro-1.5
# - meta-llama/llama-3-70b-instruct
# - mistralai/mistral-large
```

### Model Selection by Provider

```typescript
// Use Claude through OpenRouter
const claude = await ai.generate({
  input: { text: "Explain quantum computing" },
  provider: "openai-compatible",
  model: "anthropic/claude-3.5-sonnet",
});

// Use GPT-4 through OpenRouter
const gpt4 = await ai.generate({
  input: { text: "Write a poem" },
  provider: "openai-compatible",
  model: "openai/gpt-4-turbo",
});

// Use Gemini through OpenRouter
const gemini = await ai.generate({
  input: { text: "Analyze this data" },
  provider: "openai-compatible",
  model: "google/gemini-pro-1.5",
});
```

### OpenRouter Features

```typescript
// Cost tracking (OpenRouter provides in response)
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "openai-compatible",
  model: "anthropic/claude-3.5-sonnet",
  enableAnalytics: true,
});

console.log("Tokens used:", result.usage.totalTokens);
console.log("Cost:", result.cost); // OpenRouter returns actual cost

// Provider selection preferences
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "openai-compatible",
  model: "openai/gpt-4",
  headers: {
    "X-Provider-Preferences": "order:cost", // Cheapest first
  },
});
```

---

## vLLM Integration

vLLM provides high-performance inference for self-hosted models.

### Starting vLLM Server

```bash
# Basic setup
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --port 8000

# With GPU optimization
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --tensor-parallel-size 2 \  # Multi-GPU
  --gpu-memory-utilization 0.9 \
  --port 8000

# With quantization for lower memory
python -m vllm.entrypoints.openai.api_server \
  --model TheBloke/Mistral-7B-Instruct-v0.2-AWQ \
  --quantization awq \
  --port 8000
```

### NeuroLink Configuration for vLLM

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "openai-compatible",
      config: {
        baseUrl: "http://localhost:8000/v1",
        apiKey: "none", // vLLM doesn't require authentication
        defaultModel: "mistralai/Mistral-7B-Instruct-v0.2",
      },
    },
  ],
});

// Use vLLM-hosted model
const result = await ai.generate({
  input: { text: "Explain Docker containers" },
  provider: "openai-compatible",
});
```

### Multiple vLLM Instances

```typescript
// Load balance across multiple vLLM servers
const ai = new NeuroLink({
  providers: [
    {
      name: "openai-compatible-1",
      config: {
        baseUrl: "http://server1:8000/v1",
        apiKey: "none",
      },
      priority: 1,
    },
    {
      name: "openai-compatible-2",
      config: {
        baseUrl: "http://server2:8000/v1",
        apiKey: "none",
      },
      priority: 1,
    },
  ],
  loadBalancing: "round-robin",
});
```

---

## SDK Integration

### Basic Usage

```typescript

const ai = new NeuroLink();

// Simple generation
const result = await ai.generate({
  input: { text: "Hello from OpenAI Compatible!" },
  provider: "openai-compatible",
});

console.log(result.content);
```

### With Model Selection

```typescript
// Specify exact model (OpenRouter format)
const result = await ai.generate({
  input: { text: "Explain blockchain" },
  provider: "openai-compatible",
  model: "anthropic/claude-3.5-sonnet",
});

// Or use auto-discovered model
const models = await ai.listModels("openai-compatible");
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "openai-compatible",
  model: models[0].id,
});
```

### Streaming

```typescript
// Stream responses for better UX
for await (const chunk of ai.stream({
  input: { text: "Write a long story" },
  provider: "openai-compatible",
  model: "anthropic/claude-3.5-sonnet",
})) {
  process.stdout.write(chunk.content);
}
```

### Custom Headers

```typescript
// Pass custom headers (e.g., for OpenRouter)
const result = await ai.generate({
  input: { text: "Your prompt" },
  provider: "openai-compatible",
  headers: {
    "HTTP-Referer": "https://your-app.com",
    "X-Title": "YourApp",
    "X-Provider-Preferences": "order:cost",
  },
});
```

### Error Handling

```typescript
try {
  const result = await ai.generate({
    input: { text: "Your prompt" },
    provider: "openai-compatible",
    model: "non-existent-model",
  });
} catch (error) {
  if (error.message.includes("model not found")) {
    // List available models
    const models = await ai.listModels("openai-compatible");
    console.log(
      "Available models:",
      models.map((m) => m.id),
    );
  } else if (error.message.includes("connection")) {
    console.error("Cannot connect to endpoint");
  } else {
    throw error;
  }
}
```

---

## CLI Usage

### Basic Commands

```bash
# Generate with default model
npx @juspay/neurolink generate "Hello world" --provider openai-compatible

# Use specific model
npx @juspay/neurolink gen "Write code" \
  --provider openai-compatible \
  --model "anthropic/claude-3.5-sonnet"

# Stream response
npx @juspay/neurolink stream "Tell a story" \
  --provider openai-compatible

# List available models
npx @juspay/neurolink models --provider openai-compatible
```

### OpenRouter-Specific Commands

```bash
# Use cheap models for cost optimization
npx @juspay/neurolink gen "Customer support query" \
  --provider openai-compatible \
  --model "meta-llama/llama-3-8b-instruct"  # Cheap

# Use premium models for complex tasks
npx @juspay/neurolink gen "Complex analysis task" \
  --provider openai-compatible \
  --model "anthropic/claude-3-opus"  # Premium
```

---

## Configuration Options

### Environment Variables

```bash
# Required
OPENAI_COMPATIBLE_BASE_URL=https://openrouter.ai/api/v1
OPENAI_COMPATIBLE_API_KEY=sk-or-v1-your-key

# Optional
OPENAI_COMPATIBLE_MODEL=anthropic/claude-3.5-sonnet  # Default model
OPENAI_COMPATIBLE_TIMEOUT=60000  # Timeout (ms)
OPENAI_COMPATIBLE_VERIFY_SSL=true  # SSL verification
```

### Programmatic Configuration

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "openai-compatible",
      config: {
        baseUrl: process.env.OPENAI_COMPATIBLE_BASE_URL,
        apiKey: process.env.OPENAI_COMPATIBLE_API_KEY,
        defaultModel: "anthropic/claude-3.5-sonnet",
        timeout: 60000,
        headers: {
          "HTTP-Referer": "https://yourapp.com",
          "X-Title": "YourApp",
        },
      },
    },
  ],
});
```

---

## Use Cases

### 1. Multi-Provider Access via OpenRouter

```typescript
// Access multiple providers through one endpoint
const providers = {
  claude: "anthropic/claude-3.5-sonnet",
  gpt4: "openai/gpt-4-turbo",
  gemini: "google/gemini-pro-1.5",
  llama: "meta-llama/llama-3-70b-instruct",
};

for (const [name, model] of Object.entries(providers)) {
  const result = await ai.generate({
    input: { text: "Explain quantum computing in one sentence" },
    provider: "openai-compatible",
    model,
  });
  console.log(`${name}: ${result.content}`);
}
```

### 2. Self-Hosted Private Models

```typescript
// Complete privacy with local vLLM
const privateAI = new NeuroLink({
  providers: [
    {
      name: "openai-compatible",
      config: {
        baseUrl: "http://localhost:8000/v1",
        apiKey: "none",
      },
    },
  ],
});

// Process sensitive data locally
const result = await privateAI.generate({
  input: { text: sensitiveData },
  provider: "openai-compatible",
});
// Data never leaves your infrastructure
```

### 3. Cost Optimization

```typescript
// Compare costs across providers via OpenRouter
async function generateCheapest(prompt: string) {
  const models = [
    {
      name: "llama-3-8b",
      model: "meta-llama/llama-3-8b-instruct",
      costPer1M: 0.2,
    },
    {
      name: "mistral-7b",
      model: "mistralai/mistral-7b-instruct",
      costPer1M: 0.15,
    },
    { name: "gemma-7b", model: "google/gemma-7b-it", costPer1M: 0.1 },
  ];

  // Sort by cost
  models.sort((a, b) => a.costPer1M - b.costPer1M);

  // Try cheapest first
  for (const { model } of models) {
    try {
      return await ai.generate({
        input: { text: prompt },
        provider: "openai-compatible",
        model,
      });
    } catch (error) {
      continue; // Try next model
    }
  }
}
```

---

## Troubleshooting

### Common Issues

#### 1. "Connection refused"

**Problem**: Endpoint is not accessible.

**Solution**:

```bash
# Test endpoint manually (local development)
curl http://localhost:8000/v1/models

# Test endpoint manually (production - always use HTTPS)
curl https://your-production-endpoint.com/v1/models

# Check if server is running
ps aux | grep vllm

# Verify firewall allows connection
telnet localhost 8000
```

#### 2. "Model not found"

**Problem**: Model ID is incorrect or not available.

**Solution**:

```bash
# List available models first
npx @juspay/neurolink models --provider openai-compatible

# Use exact model ID from list
npx @juspay/neurolink gen "test" \
  --provider openai-compatible \
  --model "exact-model-id-from-list"
```

#### 3. "Invalid API key"

**Problem**: API key format is incorrect (OpenRouter).

**Solution**:

```bash
# OpenRouter keys start with sk-or-v1-
OPENAI_COMPATIBLE_API_KEY=sk-or-v1-your-key  # ✅ Correct

# For local servers, use 'none' or empty string
OPENAI_COMPATIBLE_API_KEY=none  # ✅ For vLLM
```

---

## Best Practices

### 1. Model Discovery

```typescript
// ✅ Good: Auto-discover models on startup
const models = await ai.listModels("openai-compatible");
console.log(
  "Available models:",
  models.map((m) => m.id),
);

// Cache model list
const modelCache = new Map();
modelCache.set("openai-compatible", models);
```

### 2. Endpoint Health Checks

```typescript
// ✅ Good: Verify endpoint before use
async function healthCheck() {
  try {
    const models = await ai.listModels("openai-compatible");
    return models.length > 0;
  } catch (error) {
    return false;
  }
}

if (await healthCheck()) {
  // Use provider
} else {
  // Fall back to alternative
}
```

### 3. Cost Tracking

```typescript
// ✅ Good: Track usage with OpenRouter
const result = await ai.generate({
  input: { text: prompt },
  provider: "openai-compatible",
  enableAnalytics: true,
});

await costTracker.record({
  provider: "openrouter",
  model: result.model,
  tokens: result.usage.totalTokens,
  cost: result.cost,
});
```

---

## Related Documentation

- **[Provider Setup Guide](/docs/getting-started/provider-setup)** - General provider configuration
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce AI costs
- **[Enterprise Multi-Region](/docs/guides/enterprise/multi-region)** - Self-hosted and vLLM deployment

---

## Additional Resources

- **[OpenRouter](https://openrouter.ai/)** - Multi-provider aggregator
- **[vLLM Documentation](https://docs.vllm.ai/)** - Self-hosted inference
- **[LocalAI](https://localai.io/)** - Local OpenAI alternative
- **[OpenAI API Spec](https://platform.openai.com/docs/api-reference)** - API standard

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## OpenRouter Provider Guide

<!-- Source: getting-started/providers/openrouter.md -->

# OpenRouter Provider Guide

**Access 300+ AI models from 60+ providers through a single unified API**

## Quick Start

### 1. Get Your API Key

Sign up at [https://openrouter.ai](https://openrouter.ai) and get your API key from [https://openrouter.ai/keys](https://openrouter.ai/keys).

### 2. Configure Environment

Add your API key to `.env`:

```bash
# Required
OPENROUTER_API_KEY=sk-or-v1-...

# Optional: Attribution (shows in OpenRouter dashboard)
OPENROUTER_REFERER=https://yourapp.com
OPENROUTER_APP_NAME="Your App Name"

# Optional: Override default model
OPENROUTER_MODEL=anthropic/claude-3-5-sonnet
```

### 3. Install NeuroLink

```bash
npm install @juspay/neurolink
# or
pnpm add @juspay/neurolink
```

### 4. Start Using OpenRouter

```typescript

const ai = new NeuroLink({
  providers: [{
    name: "openrouter",
    config: {
      apiKey: process.env.OPENROUTER_API_KEY,
    },
  }],
});

// Use default model (Claude 3.5 Sonnet)
const result = await ai.generate({
  input: { text: "What are the benefits of TypeScript?" },
});

console.log(result.content);
```

```bash
# Quick generation
npx @juspay/neurolink generate "Hello from OpenRouter!" \
  --provider openrouter

# Use specific model
npx @juspay/neurolink gen "Write a haiku about AI" \
  --provider openrouter \
  --model "openai/gpt-4o"

# Interactive loop mode
npx @juspay/neurolink loop \
  --provider openrouter \
  --model "anthropic/claude-3-5-sonnet"
```

---

## Supported Models

OpenRouter provides access to 300+ models. Here are the most popular:

### Anthropic Claude

```typescript
// Latest models
"anthropic/claude-3-5-sonnet"; // Best overall - 200K context
"anthropic/claude-3-5-haiku"; // Fast & affordable - 200K context
"anthropic/claude-3-opus"; // Most capable - 200K context
```

### OpenAI

```typescript
// GPT-4 series
"openai/gpt-4o"; // Latest GPT-4 Omni
"openai/gpt-4o-mini"; // Fast & affordable GPT-4
"openai/gpt-4-turbo"; // GPT-4 Turbo
"openai/gpt-4"; // Original GPT-4

// GPT-3.5
"openai/gpt-3.5-turbo"; // Fast & cheap
```

### Google

```typescript
// Gemini models
"google/gemini-2.0-flash"; // Latest Gemini - 1M context
"google/gemini-1.5-pro"; // Gemini Pro - 1M context
"google/gemini-1.5-flash"; // Fast Gemini
```

### Meta Llama

```typescript
// Llama 3.1 series
"meta-llama/llama-3.1-405b-instruct"; // Largest open model
"meta-llama/llama-3.1-70b-instruct"; // Balanced performance
"meta-llama/llama-3.1-8b-instruct"; // Fast & efficient
```

### Mistral AI

```typescript
// Mistral models
"mistralai/mistral-large"; // Most capable Mistral
"mistralai/mixtral-8x22b-instruct"; // Large MoE model
"mistralai/mixtral-8x7b-instruct"; // Efficient MoE
```

### Free Models

OpenRouter provides free access to select models:

```typescript
// Popular free models
"google/gemini-2.0-flash-exp:free";
"meta-llama/llama-3.1-8b-instruct:free";
"microsoft/phi-3-medium-128k-instruct:free";
```

### Browse All Models

- **Web Dashboard**: [https://openrouter.ai/models](https://openrouter.ai/models)
- **API**: Dynamically fetched via `provider.getAvailableModels()`

---

## Model Selection Guide

### By Use Case

| Use Case                | Recommended Model                   | Why                                         |
| ----------------------- | ----------------------------------- | ------------------------------------------- |
| **General Chat**        | `anthropic/claude-3-5-sonnet`       | Best balance of quality, speed, and cost    |
| **Code Generation**     | `openai/gpt-4o`                     | Excellent code understanding and generation |
| **Long Documents**      | `google/gemini-1.5-pro`             | 1M token context window                     |
| **Fast Responses**      | `anthropic/claude-3-5-haiku`        | Ultra-fast with good quality                |
| **Cost Optimization**   | `openai/gpt-4o-mini`                | Cheapest GPT-4 class model                  |
| **Development/Testing** | `google/gemini-2.0-flash-exp:free`  | Free tier available                         |
| **Open Source**         | `meta-llama/llama-3.1-70b-instruct` | Best open source model                      |
| **Reasoning**           | `anthropic/claude-3-opus`           | Superior reasoning capabilities             |

### By Performance Characteristics

#### Speed Priority

```typescript
// Fastest models (= MAX_DAILY_COST) {
    throw new Error("Daily budget exceeded");
  }

  const result = await ai.generate({
    input: { text: prompt },
    enableAnalytics: true,
  });

  dailyCost += result.analytics?.cost || 0;
  return result;
}
```

### 3. Rate Limiting Awareness

OpenRouter has rate limits based on your account tier:

```typescript
// Implement exponential backoff for rate limits
async function generateWithRetry(
  prompt: string,
  maxRetries = 3,
  baseDelay = 1000,
) {
  for (let i = 0; i  setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
```

### 4. Error Handling Patterns

```typescript
// Comprehensive error handling
async function generateSafely(prompt: string) {
  try {
    return await ai.generate({
      input: { text: prompt },
      provider: "openrouter",
    });
  } catch (error) {
    if (error.message.includes("rate limit")) {
      // Handle rate limiting - wait and retry
      console.log("Rate limited, implementing backoff...");
      await new Promise((resolve) => setTimeout(resolve, 5000));
      return generateSafely(prompt); // Retry
    } else if (error.message.includes("insufficient_credits")) {
      // Handle insufficient credits
      console.error(
        "Out of credits! Add more at https://openrouter.ai/credits",
      );
      throw new Error("Please add credits to continue");
    } else if (
      error.message.includes("model") &&
      error.message.includes("not found")
    ) {
      // Handle model not available - fallback to different model
      console.log("Model unavailable, falling back to default");
      return await ai.generate({
        input: { text: prompt },
        provider: "openrouter",
        model: "anthropic/claude-3-5-sonnet", // Reliable fallback
      });
    } else {
      // Unknown error - log and rethrow
      console.error("OpenRouter error:", error.message);
      throw error;
    }
  }
}
```

### 5. Caching Strategies

```typescript
// Implement response caching to reduce costs

const responseCache = new Map();
const CACHE_TTL = 3600000; // 1 hour

async function generateWithCache(prompt: string) {
  // Create cache key from prompt
  const cacheKey = createHash("sha256").update(prompt).digest("hex");

  // Check cache
  const cached = responseCache.get(cacheKey);
  if (cached && Date.now() - cached.timestamp  10000) {
      console.warn(`Slow response: ${duration}ms`);
    }

    return result;
  } catch (error) {
    // Log errors to monitoring service
    console.error("Generation failed:", {
      prompt: prompt.substring(0, 100),
      duration: Date.now() - startTime,
      error: error.message,
    });
    throw error;
  }
}
```

---

## Advanced Features

### 1. Dynamic Model Discovery

```typescript
// Get all available models at runtime
const provider = await ai.getProvider("openrouter");
const models = await provider.getAvailableModels();

console.log(`${models.length} models available`);
console.log("Sample models:", models.slice(0, 10));

// Filter models by provider
const claudeModels = models.filter((m) => m.startsWith("anthropic/"));
const openaiModels = models.filter((m) => m.startsWith("openai/"));

console.log(`Claude models: ${claudeModels.length}`);
console.log(`OpenAI models: ${openaiModels.length}`);
```

### 2. Multi-Model Comparison

```typescript
// Compare outputs from different models
async function compareModels(prompt: string) {
  const models = [
    "anthropic/claude-3-5-sonnet",
    "openai/gpt-4o",
    "google/gemini-1.5-pro",
  ];

  const results = await Promise.all(
    models.map(async (model) => {
      const result = await ai.generate({
        input: { text: prompt },
        provider: "openrouter",
        model,
        enableAnalytics: true,
      });

      return {
        model,
        content: result.content,
        cost: result.analytics?.cost,
        tokens: result.analytics?.tokens.total,
        time: result.analytics?.responseTime,
      };
    }),
  );

  // Analyze results
  console.table(results);
  return results;
}
```

### 3. Attribution Tracking

```typescript
// Track usage in OpenRouter dashboard with custom attribution
const ai = new NeuroLink({
  providers: [
    {
      name: "openrouter",
      config: {
        apiKey: process.env.OPENROUTER_API_KEY,
        // Shows up on openrouter.ai/activity dashboard
        referer: "https://myapp.com",
        appName: "My AI Application",
      },
    },
  ],
});

// All requests will show attribution in dashboard
const result = await ai.generate({
  input: { text: "Hello!" },
});
```

### 4. Privacy Modes

OpenRouter supports different privacy modes through model suffixes:

```typescript
// Standard routing (default)
"anthropic/claude-3-5-sonnet";

// Moderated (filtered for safety)
"anthropic/claude-3-5-sonnet:moderated";

// Extended (longer timeout for large requests)
"anthropic/claude-3-5-sonnet:extended";

// Free tier (when available)
"google/gemini-2.0-flash-exp:free";
```

---

## CLI Usage

### Basic Commands

```bash
# Use default model
npx @juspay/neurolink generate "Hello OpenRouter" \
  --provider openrouter

# Specify model
npx @juspay/neurolink gen "Write code" \
  --provider openrouter \
  --model "openai/gpt-4o"

# Interactive loop mode
npx @juspay/neurolink loop \
  --provider openrouter \
  --model "anthropic/claude-3-5-sonnet"

# With temperature control
npx @juspay/neurolink gen "Be creative" \
  --provider openrouter \
  --temperature 0.9

# With max tokens
npx @juspay/neurolink gen "Write a long story" \
  --provider openrouter \
  --max-tokens 2000
```

### Model Comparison via CLI

```bash
# Compare different models
for model in "anthropic/claude-3-5-sonnet" "openai/gpt-4o" "google/gemini-1.5-pro"; do
  echo "Testing $model:"
  npx @juspay/neurolink gen "What is AI?" \
    --provider openrouter \
    --model "$model"
  echo "---"
done
```

---

## Pricing & Cost Management

### Understanding Costs

OpenRouter charges per token with transparent pricing:

- **Input tokens**: Cost to process your prompt
- **Output tokens**: Cost to generate the response
- **Caching**: Some models support prompt caching to reduce costs

View current pricing at [https://openrouter.ai/models](https://openrouter.ai/models)

### Cost Comparison (Approximate)

| Model                         | Input (per 1M tokens) | Output (per 1M tokens) | Best For          |
| ----------------------------- | --------------------- | ---------------------- | ----------------- |
| `openai/gpt-4o-mini`          | $0.15                 | $0.60                  | Cost optimization |
| `google/gemini-2.0-flash`     | $0.075                | $0.30                  | Fast & cheap      |
| `anthropic/claude-3-5-haiku`  | $0.25                 | $1.25                  | Speed & value     |
| `anthropic/claude-3-5-sonnet` | $3.00                 | $15.00                 | Balanced          |
| `openai/gpt-4o`               | $2.50                 | $10.00                 | Code generation   |
| `anthropic/claude-3-opus`     | $15.00                | $75.00                 | Complex reasoning |

### Managing Your Budget

```typescript
// Track spending across requests
class BudgetTracker {
  private totalSpent = 0;
  private dailyLimit = 50.0; // $50/day

  async generate(prompt: string) {
    if (this.totalSpent >= this.dailyLimit) {
      throw new Error(`Daily budget of $${this.dailyLimit} exceeded`);
    }

    const result = await ai.generate({
      input: { text: prompt },
      provider: "openrouter",
      enableAnalytics: true,
    });

    this.totalSpent += result.analytics?.cost || 0;

    console.log(`Spent: $${this.totalSpent.toFixed(4)} / $${this.dailyLimit}`);

    return result;
  }

  reset() {
    this.totalSpent = 0;
  }
}

const tracker = new BudgetTracker();
```

---

## Troubleshooting

### Common Issues

#### 1. "Invalid API key"

**Problem**: API key not set or incorrect.

**Solution**:

```bash
# Check if key is set
echo $OPENROUTER_API_KEY

# Get your key at https://openrouter.ai/keys
export OPENROUTER_API_KEY=sk-or-v1-...

# Add to .env file
echo "OPENROUTER_API_KEY=sk-or-v1-..." >> .env
```

#### 2. "Rate limit exceeded"

**Problem**: Too many requests in a short time.

**Solution**:

- Implement exponential backoff (see Best Practices above)
- Upgrade your account at https://openrouter.ai/credits
- Reduce request frequency
- Use response caching

#### 3. "Insufficient credits"

**Problem**: Account balance is too low.

**Solution**:

```bash
# Check balance at https://openrouter.ai/credits
# Add credits to your account
# Set up auto-recharge for uninterrupted service
```

#### 4. "Model not found"

**Problem**: Model name is incorrect or unavailable.

**Solution**:

```bash
# Check available models
npx @juspay/neurolink models --provider openrouter

# Or visit https://openrouter.ai/models
# Use exact model ID format: "provider/model-name"
```

#### 5. "Request timeout"

**Problem**: Request took too long.

**Solution**:

```typescript
// Increase timeout
const result = await ai.generate({
  input: { text: "Long task..." },
  provider: "openrouter",
  timeout: 60000, // 60 seconds
});

// Or use extended model variant
const result = await ai.generate({
  input: { text: "Long task..." },
  provider: "openrouter",
  model: "anthropic/claude-3-5-sonnet:extended",
});
```

---

## Comparison with Other Providers

### OpenRouter vs Direct Provider Access

| Feature          | OpenRouter                 | Direct Provider          |
| ---------------- | -------------------------- | ------------------------ |
| **Model Access** | 300+ models, 60+ providers | Single provider's models |
| **Setup**        | One API key                | Multiple API keys        |
| **Failover**     | Automatic                  | Manual implementation    |
| **Pricing**      | Competitive, transparent   | Varies by provider       |
| **Rate Limits**  | Unified limits             | Provider-specific        |
| **Dashboard**    | Centralized tracking       | Separate dashboards      |
| **Switching**    | Instant (same API)         | Code changes required    |

### When to Use OpenRouter

**Use OpenRouter when:**

- You want to experiment with multiple models
- You need automatic failover for high availability
- You want simplified billing across providers
- You're building multi-model applications
- You want to avoid vendor lock-in

**Use Direct Providers when:**

- You only need one specific model
- You need provider-specific features (e.g., AWS Bedrock's VPC integration)
- You have existing provider integrations
- Your organization has enterprise agreements with specific providers

---

## Related Documentation

- **[LiteLLM Provider](/docs/getting-started/providers/litellm)** - Alternative multi-provider solution
- **[OpenAI Compatible](/docs/getting-started/providers/openai-compatible)** - OpenAI-compatible endpoints
- **[Provider Setup Guide](/docs/getting-started/provider-setup)** - General provider configuration
- **[Cost Optimization Guide](/docs/cookbook/cost-optimization)** - Reduce AI costs

---

## Additional Resources

- **[OpenRouter Website](https://openrouter.ai)** - Main website
- **[OpenRouter Models](https://openrouter.ai/models)** - Browse all models
- **[OpenRouter Dashboard](https://openrouter.ai/activity)** - Usage tracking
- **[OpenRouter Docs](https://openrouter.ai/docs)** - Official documentation
- **[OpenRouter API Reference](https://openrouter.ai/docs/api-reference)** - API docs

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## SageMaker Integration - Deploy Your Custom AI Models

<!-- Source: getting-started/providers/sagemaker.md -->

# SageMaker Integration - Deploy Your Custom AI Models

> **FULLY IMPLEMENTED**: NeuroLink now supports Amazon SageMaker, enabling you to deploy and use your own custom trained models through NeuroLink's unified interface. All features documented below are complete and production-ready.

## What is SageMaker Integration?

SageMaker integration transforms NeuroLink into a platform for custom AI model deployment, offering:

- **Custom Model Hosting** - Deploy your fine-tuned models on AWS infrastructure
- **Cost Control** - Pay only for inference usage with auto-scaling capabilities
- **Enterprise Security** - Full control over model infrastructure and data privacy
- **Performance** - Dedicated compute resources with predictable latency
- **Global Deployment** - Available in all major AWS regions
- **Monitoring** - Built-in CloudWatch metrics and logging

## Quick Start

### 1. Deploy Your Model to SageMaker

First, you need a model deployed to a SageMaker endpoint:

```python
# Example: Deploy a Hugging Face model to SageMaker
from sagemaker.huggingface import HuggingFaceModel

# Create model
huggingface_model = HuggingFaceModel(
    model_data="s3://your-bucket/model.tar.gz",
    role=role,
    transformers_version="4.21",
    pytorch_version="1.12",
    py_version="py39",
)

# Deploy to endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="my-custom-model-endpoint"
)
```

### 2. Configure NeuroLink

```bash
# Set AWS credentials and SageMaker configuration
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"
export SAGEMAKER_DEFAULT_ENDPOINT="my-custom-model-endpoint"
```

### 3. Use with CLI

```bash
# Test SageMaker endpoint connectivity
npx @juspay/neurolink sagemaker status

# Generate content with your custom model
npx @juspay/neurolink generate "Analyze this business scenario" --provider sagemaker

# Use specific endpoint
npx @juspay/neurolink generate "Domain-specific task" --provider sagemaker --model my-domain-model

# Performance benchmark
npx @juspay/neurolink sagemaker benchmark my-custom-model-endpoint
```

### 4. Use with SDK

```typescript

// Create NeuroLink instance
const neurolink = new NeuroLink();

// Generate with default endpoint
const result = await neurolink.generate({
  input: { text: "Analyze customer feedback for sentiment and themes" },
  provider: "sagemaker",
});

// Use specific endpoint
const domainResult = await neurolink.generate({
  input: { text: "Industry-specific analysis request" },
  provider: "sagemaker",
  model: "domain-expert-model-endpoint",
});
```

## Key Benefits

### Custom Model Deployment

Deploy any model you've trained or fine-tuned:

```typescript

// Example: Using different specialized models
const models = {
  sentiment: "sentiment-analysis-model",
  translation: "multilingual-translation-model",
  summarization: "document-summarizer-model",
  domain: "healthcare-specialist-model",
};

async function analyzeWithSpecializedModel(text: string, task: string) {
  const endpoint = models[task] || models.sentiment;
  const neurolink = new NeuroLink();

  const result = await neurolink.generate({
    input: { text: `${task}: ${text}` },
    provider: "sagemaker",
    model: endpoint,
    temperature: 0.3, // Lower for specialized tasks
    timeout: "45s",
  });

  return {
    analysis: result.content,
    model: endpoint,
    task: task,
  };
}

// Usage
const sentimentResult = await analyzeWithSpecializedModel(
  "The product quality has really improved recently!",
  "sentiment",
);

const summaryResult = await analyzeWithSpecializedModel(
  "Long document content here...",
  "summarization",
);
```

### Cost Optimization

SageMaker enables precise cost control through multiple deployment options:

```typescript

class CostOptimizedSageMaker {
  private neurolink: NeuroLink;
  private endpoints: {
    cheap: string; // Small instance, basic model
    balanced: string; // Medium instance, good model
    premium: string; // Large instance, best model
  };

  constructor() {
    this.neurolink = new NeuroLink();
    this.endpoints = {
      cheap: "cost-effective-model",
      balanced: "production-model",
      premium: "high-performance-model",
    };
  }

  async generateOptimized(
    prompt: string,
    priority: "cost" | "balanced" | "quality" = "balanced",
  ) {
    const endpoint = this.endpoints[priority];

    const startTime = Date.now();
    const result = await this.neurolink.generate({
      input: { text: prompt },
      provider: "sagemaker",
      model: endpoint,
      timeout: priority === "cost" ? "15s" : "45s", // Faster timeout for cost model
    });
    const responseTime = Date.now() - startTime;

    return {
      content: result.content,
      endpoint: endpoint,
      priority: priority,
      responseTime: responseTime,
      estimatedCost: this.calculateCost(responseTime, priority),
    };
  }

  private calculateCost(responseTime: number, priority: string): number {
    const rates = {
      cost: 0.0001, // $0.0001 per second
      balanced: 0.0005, // $0.0005 per second
      quality: 0.002, // $0.002 per second
    };

    return (responseTime / 1000) * rates[priority];
  }
}

// Usage
const optimizer = new CostOptimizedSageMaker();

// Cost-effective for simple tasks
const cheapResult = await optimizer.generateOptimized(
  "Simple classification task",
  "cost",
);

// High-quality for complex analysis
const premiumResult = await optimizer.generateOptimized(
  "Complex business strategy analysis",
  "quality",
);

console.log(
  `Cost difference: $${premiumResult.estimatedCost - cheapResult.estimatedCost}`,
);
```

### Enterprise Security & Compliance

Full control over your model infrastructure:

```typescript

class SecureSageMakerProvider {
  private neurolink: NeuroLink;
  private region: string;
  private vpcConfig?: {
    securityGroups: string[];
    subnets: string[];
  };

  constructor(region: string, vpcConfig?: any) {
    this.neurolink = new NeuroLink();
    this.region = region;
    this.vpcConfig = vpcConfig;
  }

  async secureGenerate(
    prompt: string,
    endpoint: string,
    securityContext: {
      userId: string;
      department: string;
      clearanceLevel: "public" | "internal" | "confidential";
    },
  ) {
    // Audit logging
    console.log(
      `[AUDIT] User ${securityContext.userId} from ${securityContext.department} requesting ${securityContext.clearanceLevel} generation`,
    );

    const result = await this.neurolink.generate({
      input: { text: prompt },
      provider: "sagemaker",
      model: endpoint,
      timeout: "30s",
      // Custom metadata for tracking
      context: {
        user: securityContext.userId,
        department: securityContext.department,
        classification: securityContext.clearanceLevel,
        timestamp: new Date().toISOString(),
      },
    });

    // Log successful completion
    console.log(
      `[AUDIT] Generation completed for user ${securityContext.userId}`,
    );

    return {
      ...result,
      securityContext,
      complianceInfo: {
        dataResidency: this.region,
        encryptionAtRest: true,
        encryptionInTransit: true,
        auditLogged: true,
      },
    };
  }
}

// Usage
const secureProvider = new SecureSageMakerProvider("us-east-1", {
  securityGroups: ["sg-12345"],
  subnets: ["subnet-abc123"],
});

const secureResult = await secureProvider.secureGenerate(
  "Analyze sensitive customer data",
  "hipaa-compliant-model",
  {
    userId: "john.doe@company.com",
    department: "healthcare",
    clearanceLevel: "confidential",
  },
);
```

## Advanced Model Management

### Multi-Model Endpoints

Manage multiple models through a single endpoint:

```typescript

class MultiModelSageMaker {
  private neurolink: NeuroLink;
  private multiModelEndpoint: string;
  private models: Map;

  constructor(endpoint: string) {
    this.neurolink = new NeuroLink();
    this.multiModelEndpoint = endpoint;
    this.models = new Map([
      ["sentiment", "sentiment-v2.tar.gz"],
      ["translation", "translate-v1.tar.gz"],
      ["summarization", "summary-v3.tar.gz"],
    ]);
  }

  async generateWithModel(
    prompt: string,
    modelType: string,
    options: {
      temperature?: number;
      maxTokens?: number;
    } = {},
  ) {
    const modelPath = this.models.get(modelType);
    if (!modelPath) {
      throw new Error(`Model type '${modelType}' not available`);
    }

    const result = await this.neurolink.generate({
      input: { text: prompt },
      provider: "sagemaker",
      model: this.multiModelEndpoint,
      temperature: options.temperature || 0.7,
      maxTokens: options.maxTokens || 500,
      // SageMaker-specific: target model for multi-model endpoint
      targetModel: modelPath,
    });

    return {
      ...result,
      modelType: modelType,
      modelPath: modelPath,
    };
  }

  async compareModels(prompt: string, modelTypes: string[]) {
    const comparisons = await Promise.all(
      modelTypes.map(async (modelType) => {
        try {
          const result = await this.generateWithModel(prompt, modelType);
          return {
            modelType,
            success: true,
            response: result.content,
            responseTime: result.responseTime,
          };
        } catch (error) {
          return {
            modelType,
            success: false,
            error: error.message,
          };
        }
      }),
    );

    return comparisons;
  }
}

// Usage
const multiModel = new MultiModelSageMaker("multi-model-endpoint");

// Use specific model
const sentimentResult = await multiModel.generateWithModel(
  "I love this new feature!",
  "sentiment",
);

// Compare multiple models
const comparison = await multiModel.compareModels(
  "Analyze this text for insights",
  ["sentiment", "summarization"],
);
```

### Health Monitoring & Auto-Recovery

```typescript

class SageMakerHealthMonitor {
  private neurolink: NeuroLink;
  private endpoints: string[];
  private healthStatus: Map;
  private failureCount: Map;

  constructor(endpoints: string[]) {
    this.neurolink = new NeuroLink();
    this.endpoints = endpoints;
    this.healthStatus = new Map();
    this.failureCount = new Map();
  }

  async checkHealth(endpoint: string): Promise {
    try {
      const result = await this.neurolink.generate({
        input: { text: "health check" },
        provider: "sagemaker",
        model: endpoint,
        timeout: "10s",
        maxTokens: 10,
      });

      this.healthStatus.set(endpoint, true);
      this.failureCount.set(endpoint, 0);
      return true;
    } catch (error) {
      this.healthStatus.set(endpoint, false);
      const failures = this.failureCount.get(endpoint) || 0;
      this.failureCount.set(endpoint, failures + 1);
      return false;
    }
  }

  async generateWithFailover(prompt: string) {
    for (const endpoint of this.endpoints) {
      const isHealthy = await this.checkHealth(endpoint);

      if (isHealthy) {
        try {
          const result = await this.neurolink.generate({
            input: { text: prompt },
            provider: "sagemaker",
            model: endpoint,
            timeout: "30s",
          });

          return {
            ...result,
            endpoint: endpoint,
            failoverUsed: this.endpoints.indexOf(endpoint) > 0,
          };
        } catch (error) {
          console.warn(`Endpoint ${endpoint} failed, trying next...`);
          continue;
        }
      }
    }

    throw new Error("All SageMaker endpoints are unavailable");
  }

  getHealthReport() {
    return {
      endpoints: this.endpoints,
      health: Object.fromEntries(this.healthStatus),
      failures: Object.fromEntries(this.failureCount),
      healthyCount: Array.from(this.healthStatus.values()).filter(Boolean)
        .length,
      totalEndpoints: this.endpoints.length,
    };
  }
}

// Usage
const monitor = new SageMakerHealthMonitor([
  "primary-model-endpoint",
  "backup-model-endpoint",
  "fallback-model-endpoint",
]);

// Generate with automatic failover
const result = await monitor.generateWithFailover(
  "Important business analysis request",
);

// Get health status
const healthReport = monitor.getHealthReport();
console.log("Endpoint Health:", healthReport);
```

## Advanced Configuration

### Serverless Inference

Configure SageMaker for serverless inference:

> **Educational Example - Custom Wrapper Pattern**
>
> The `coldStartTimeout` parameter below is a **user-defined convenience variable**,
> not a native NeuroLink SDK option. This example demonstrates how you can create
> wrapper functions with custom options that map to standard SDK parameters.
>
> The `coldStartTimeout` value is passed to the standard `timeout` option internally.

```typescript

class ServerlessSageMaker {
  private neurolink: NeuroLink;
  private serverlessEndpoint: string;

  constructor(endpoint: string) {
    this.neurolink = new NeuroLink();
    this.serverlessEndpoint = endpoint;
  }

  async generateServerless(
    prompt: string,
    options: {
      // NOTE: coldStartTimeout is a custom wrapper option (not SDK native)
      // It maps to the standard `timeout` parameter for SageMaker cold starts
      coldStartTimeout?: string;
      maxConcurrency?: number;
      memorySize?: number;
    } = {},
  ) {
    const {
      coldStartTimeout = "2m", // Longer timeout for cold starts
      maxConcurrency = 10,
      memorySize = 4096,
    } = options;

    const startTime = Date.now();
    const result = await this.neurolink.generate({
      input: { text: prompt },
      provider: "sagemaker",
      model: this.serverlessEndpoint,
      timeout: coldStartTimeout,
      // Serverless-specific metadata
      context: {
        deployment: "serverless",
        maxConcurrency,
        memorySize,
      },
    });
    const totalTime = Date.now() - startTime;

    return {
      ...result,
      serverlessMetrics: {
        totalTime,
        coldStart: totalTime > 10000, // Assume cold start if > 10s
        configuration: {
          maxConcurrency,
          memorySize,
        },
      },
    };
  }

  async batchServerless(prompts: string[], batchSize: number = 5) {
    const results = [];

    // Process in batches to respect concurrency limits
    for (let i = 0; i  this.generateServerless(prompt)),
      );

      results.push(...batchResults);

      // Brief pause between batches
      if (i + batchSize  setTimeout(resolve, 1000));
      }
    }

    return results;
  }
}

// Usage
const serverless = new ServerlessSageMaker("serverless-model-endpoint");

// Single serverless generation
const result = await serverless.generateServerless(
  "Analyze market trends for Q4 2024",
  {
    coldStartTimeout: "3m",
    maxConcurrency: 20,
    memorySize: 8192,
  },
);

// Batch serverless processing
const prompts = [
  "Summarize customer feedback",
  "Analyze competitor pricing",
  "Generate product recommendations",
];

const batchResults = await serverless.batchServerless(prompts, 3);
```

## Testing and Validation

### Model Performance Testing

```typescript

class SageMakerPerformanceTester {
  private neurolink: NeuroLink;
  private endpoint: string;
  private baseline: {
    latency: number;
    accuracy: number;
    throughput: number;
  };

  constructor(endpoint: string, baseline: any) {
    this.neurolink = new NeuroLink();
    this.endpoint = endpoint;
    this.baseline = baseline;
  }

  async loadTest(
    prompts: string[],
    concurrency: number = 5,
    duration: number = 60000, // 1 minute
  ) {
    const results = [];
    const startTime = Date.now();
    let requestCount = 0;
    let errorCount = 0;

    while (Date.now() - startTime  {
          const prompt = prompts[requestCount % prompts.length];
          requestCount++;

          try {
            const requestStart = Date.now();
            const result = await this.neurolink.generate({
              input: { text: prompt },
              provider: "sagemaker",
              model: this.endpoint,
              timeout: "30s",
            });
            const latency = Date.now() - requestStart;

            return {
              success: true,
              latency,
              responseLength: result.content.length,
              requestId: requestCount,
            };
          } catch (error) {
            errorCount++;
            return {
              success: false,
              error: error.message,
              requestId: requestCount,
            };
          }
        });

      const batchResults = await Promise.all(batchPromises);
      results.push(...batchResults);

      // Brief pause between batches
      await new Promise((resolve) => setTimeout(resolve, 100));
    }

    return this.analyzeResults(results, requestCount, errorCount);
  }

  private analyzeResults(
    results: any[],
    totalRequests: number,
    errors: number,
  ) {
    const successfulResults = results.filter((r) => r.success);
    const latencies = successfulResults.map((r) => r.latency);

    const avgLatency = latencies.reduce((a, b) => a + b, 0) / latencies.length;
    const p95Latency = latencies.sort((a, b) => a - b)[
      Math.floor(latencies.length * 0.95)
    ];
    const throughput = totalRequests / 60; // requests per second
    const errorRate = (errors / totalRequests) * 100;

    return {
      performance: {
        averageLatency: avgLatency,
        p95Latency: p95Latency,
        throughput: throughput,
        errorRate: errorRate,
        totalRequests: totalRequests,
      },
      comparison: {
        latencyChange:
          ((avgLatency - this.baseline.latency) / this.baseline.latency) * 100,
        throughputChange:
          ((throughput - this.baseline.throughput) / this.baseline.throughput) *
          100,
      },
      status: this.getPerformanceStatus(avgLatency, throughput, errorRate),
    };
  }

  private getPerformanceStatus(
    latency: number,
    throughput: number,
    errorRate: number,
  ) {
    if (errorRate > 5) return "POOR";
    if (latency > this.baseline.latency * 1.5) return "DEGRADED";
    if (throughput < this.baseline.throughput * 0.8) return "DEGRADED";
    return "GOOD";
  }
}

// Usage
const tester = new SageMakerPerformanceTester("performance-test-endpoint", {
  latency: 2000, // 2 seconds baseline
  accuracy: 0.95, // 95% accuracy baseline
  throughput: 10, // 10 requests/second baseline
});

const testPrompts = [
  "Analyze customer sentiment",
  "Generate product description",
  "Summarize business report",
  "Classify support ticket",
];

const performanceReport = await tester.loadTest(testPrompts, 10, 120000); // 2 minutes
console.log("Performance Report:", performanceReport);
```

## Troubleshooting

### Common Issues

#### 1. "Endpoint not found" Error

```bash
# Check if endpoint exists
aws sagemaker describe-endpoint --endpoint-name your-endpoint-name

# Check endpoint status
npx @juspay/neurolink sagemaker status
```

#### 2. "Access denied" Error

```bash
# Verify IAM permissions
aws sts get-caller-identity

# Test IAM policy
aws sagemaker invoke-endpoint --endpoint-name your-endpoint --body '{"inputs": "test"}' --content-type application/json /tmp/output.json
```

#### 3. "Model not loading" Error

```bash
# Check endpoint health
npx @juspay/neurolink sagemaker test your-endpoint

# Monitor CloudWatch logs
aws logs describe-log-groups --log-group-name-prefix /aws/sagemaker/Endpoints
```

### Debug Mode

```bash
# Enable debug output
export NEUROLINK_DEBUG=true
npx @juspay/neurolink generate "test" --provider sagemaker --debug
```

## Related Documentation

- **[Provider Setup Guide](/docs/getting-started/provider-setup.md#amazon-sagemaker-configuration)** - Complete SageMaker setup
- **[Environment Variables](/docs/getting-started/environment-variables)** - Configuration options
- **[API Reference](/docs/sdk/api-reference)** - SDK usage examples
- **[Basic Usage Examples](/docs/examples/basic-usage.md#custom-model-access-with-sagemaker)** - Code examples
- **[CLI Reference](/docs/cli)** - Command-line usage

### Other Provider Integrations

- **[LiteLLM Integration](/docs/getting-started/providers/litellm)** - Access 100+ models through unified interface
- **[MCP Integration](/docs/mcp/integration)** - Model Context Protocol support
- **[Framework Integration](/docs/sdk/framework-integration)** - Next.js, React, and more

## Why Choose SageMaker Integration?

### For AI/ML Teams

- **Custom Models**: Deploy your own fine-tuned models
- **Experimentation**: A/B test different model versions
- **Performance Control**: Dedicated compute resources
- **Cost Transparency**: Clear pricing per inference request

### For Enterprises

- **Data Privacy**: Models run in your AWS account
- **Compliance**: Meet industry-specific requirements
- **Scalability**: Auto-scaling from zero to thousands of requests
- **Integration**: Seamless fit with existing AWS infrastructure

### For Production

- **Reliability**: Multi-AZ deployment options
- **Monitoring**: CloudWatch integration for metrics and logs
- **Security**: VPC, encryption, and IAM controls
- **Performance**: Predictable latency and throughput

---

**Ready to deploy your custom models?** Follow the [Quick Start](#quick-start) guide above to begin using your own AI models through NeuroLink's SageMaker integration today!

---

# SDK Reference

## SDK Reference

<!-- Source: sdk/index.md -->

# SDK Reference

The NeuroLink SDK provides a TypeScript-first programmatic interface for integrating AI capabilities into your applications.

##  Overview

The SDK is designed for:

- **Web applications** (React, Vue, Svelte, Angular)
- **Backend services** (Node.js, Express, Fastify)
- **Serverless functions** (Vercel, Netlify, AWS Lambda)
- **Desktop applications** (Electron, Tauri)

##  Quick Start

```typescript

const neurolink = new NeuroLink();

// Generate text
const result = await neurolink.generate({
  input: { text: "Write a haiku about programming" },
  provider: "google-ai",
});

console.log(result.content);
```

```typescript

// Auto-selects best available provider
const provider = createBestAIProvider();

const result = await provider.generate({
  input: { text: "Explain quantum computing" },
  maxTokens: 500,
  temperature: 0.7,
});
```

```typescript
const stream = await neurolink.stream({
  input: { text: "Tell me a long story" },
  provider: "anthropic",
});

for await (const chunk of stream.stream) {
  process.stdout.write(chunk.content);
}
```

##  Documentation Sections

-  **[API Reference](/docs/sdk/api-reference)**

  Complete TypeScript API documentation with interfaces, types, and method signatures.

-  **[Framework Integration](/docs/sdk/framework-integration)**

  Integration guides for Next.js, SvelteKit, React, Vue, and other popular frameworks.

- ️ **[Custom Tools](/docs/sdk/custom-tools)**

  How to create and register custom tools for enhanced AI capabilities.

## ️ Core Architecture

The SDK uses a **Factory Pattern** architecture that provides:

- **Unified Interface**: All providers implement the same `AIProvider` interface
- **Type Safety**: Full TypeScript support with IntelliSense
- **Automatic Fallback**: Seamless provider switching on failures
- **Built-in Tools**: 6 core tools available across all providers

```typescript
type AIProvider = {
  generate(options: TextGenerationOptions): Promise;
  stream(options: StreamOptions): Promise;
  supportsTools(): boolean;
};
```

## ⚙️ Configuration

The SDK automatically detects configuration from:

```typescript
// Environment variables
process.env.OPENAI_API_KEY;
process.env.GOOGLE_AI_API_KEY;
process.env.ANTHROPIC_API_KEY;
// ... and more

// Programmatic configuration
const neurolink = new NeuroLink({
  defaultProvider: "openai",
  timeout: 30000,
  enableAnalytics: true,
});
```

##  Advanced Features

### Auto Provider Selection {#auto-selection}

NeuroLink automatically selects the best available AI provider based on your configuration:

```typescript

// Automatically selects best available provider
const provider = createBestAIProvider();

const result = await provider.generate({
  input: { text: "Explain quantum computing" },
  maxTokens: 500,
  temperature: 0.7,
});
```

**Selection Priority:**

1. OpenAI (most reliable)
2. Anthropic (high quality)
3. Google AI Studio (free tier)
4. Other configured providers

**Custom Priority:**

```typescript

// Create with fallback
const { primary, fallback } = AIProviderFactory.createProviderWithFallback(
  "bedrock", // Prefer Bedrock
  "openai", // Fall back to OpenAI
);
```

**Learn more:** [Provider Orchestration Guide](/docs/features/provider-orchestration)

### Analytics & Evaluation

```typescript
const result = await neurolink.generate({
  input: { text: "Generate a business proposal" },
  enableAnalytics: true, // Track usage and costs
  enableEvaluation: true, // AI quality scoring
});

console.log(result.analytics); // Usage data
console.log(result.evaluation); // Quality scores
```

### Custom Tools

```typescript
// Register a single tool
neurolink.registerTool("weatherLookup", {
  description: "Get current weather for a city",
  parameters: z.object({
    city: z.string(),
    units: z.enum(["celsius", "fahrenheit"]).optional(),
  }),
  execute: async ({ city, units = "celsius" }) => {
    // Your implementation
    return { city, temperature: 22, units, condition: "sunny" };
  },
});

// Register multiple tools - Object format
neurolink.registerTools({
  stockPrice: {
    description: "Get stock price",
    execute: async () => ({ price: 150.25 }),
  },
  calculator: {
    description: "Calculate math",
    execute: async () => ({ result: 42 }),
  },
});

// Register multiple tools - Array format (Lighthouse compatible)
neurolink.registerTools([
  {
    name: "analytics",
    tool: {
      description: "Get analytics data",
      parameters: z.object({
        merchantId: z.string(),
        dateRange: z.string().optional(),
      }),
      execute: async ({ merchantId, dateRange }) => {
        return { data: "analytics result" };
      },
    },
  },
  {
    name: "processor",
    tool: {
      description: "Process payments",
      execute: async () => ({ status: "processed" }),
    },
  },
]);
```

### Context Integration

```typescript
const result = await neurolink.generate({
  input: { text: "Create a summary" },
  context: {
    userId: "123",
    project: "Q1-report",
    department: "sales",
  },
});
```

##  Framework Examples

```typescript
// app/api/ai/route.ts

export async function POST(request: Request) {
  const { prompt } = await request.json();
  const neurolink = new NeuroLink();

  const result = await neurolink.generate({
    input: { text: prompt },
    timeout: "2m",
  });

  return Response.json({ text: result.content });
}
```

```typescript
// src/routes/api/ai/+server.ts

export const POST: RequestHandler = async ({ request }) => {
  const { message } = await request.json();
  const provider = createBestAIProvider();

  const result = await provider.stream({
    input: { text: message },
    timeout: "2m",
  });

  // Manually create ReadableStream from AsyncIterable
  const readable = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of result.stream) {
          if (chunk && typeof chunk === "object" && "content" in chunk) {
            controller.enqueue(new TextEncoder().encode(chunk.content));
          }
        }
        controller.close();
      } catch (error) {
        controller.error(error);
      }
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
};
```

```typescript

const app = express();
const neurolink = new NeuroLink();

app.post('/api/generate', async (req, res) => {
  const result = await neurolink.generate({
    input: { text: req.body.prompt },
  });

  res.json({ content: result.content });
});
```

##  Related Resources

- **[Examples & Tutorials](/docs/)** - Practical implementation examples
- **[Advanced Features](/docs/)** - MCP integration, analytics, streaming
- **[Troubleshooting](/docs/reference/troubleshooting)** - Common issues and solutions

---

## API Reference

<!-- Source: sdk/api-reference.md -->

# API Reference

Complete API reference for NeuroLink.

## Core API

### Generate Text

```http
POST /api/generate
```

### Stream Text

```http
POST /api/stream
```

### Provider Status

```http
GET /api/status
```

## MCP Integration

### List MCP Tools

```http
GET /api/mcp/tools
```

### Execute MCP Tool

```http
POST /api/mcp/execute
```

### MCP Server Status

```http
GET /api/mcp/status
```

For complete API documentation, see [API Reference](/docs/sdk/api-reference).

---

## Advanced SDK Features

<!-- Source: sdk/advanced-features.md -->

# Advanced SDK Features

Advanced features and capabilities of the NeuroLink SDK.

## Advanced Configuration

### Custom Providers

```typescript

const neurolink = new NeuroLink({
  providers: {
    custom: {
      endpoint: "https://api.custom.com",
      apiKey: process.env.CUSTOM_API_KEY,
    },
  },
});
```

### Advanced Streaming

```typescript
const stream = neurolink.generateStream({
  prompt: "Write a story",
  onChunk: (chunk) => console.log(chunk),
  onComplete: (result) => console.log("Done:", result),
  onError: (error) => console.error("Error:", error),
});
```

## Performance Optimization

### Caching

```typescript
const result = await neurolink.generate({
  prompt: "Hello world",
  cache: true,
  cacheTTL: 300000, // 5 minutes
});
```

### Batching

```typescript
const results = await neurolink.generateBatch([
  { prompt: "First prompt" },
  { prompt: "Second prompt" },
  { prompt: "Third prompt" },
]);
```

For more examples, see [Advanced Examples](/docs/examples/advanced).

---

## SDK Custom Tools Guide

<!-- Source: sdk/custom-tools-guide.md -->

#  SDK Custom Tools Guide

Build powerful AI applications by extending NeuroLink with your own custom tools.

##  Overview

NeuroLink's SDK allows you to register custom tools programmatically, giving your AI assistants access to any functionality you need. All registered tools work seamlessly with the built-in tool system across all supported providers.

### Key Features

- ✅ **Type-Safe**: Full TypeScript support with Zod schema validation
- ✅ **Provider Agnostic**: Works with all providers that support tools
- ✅ **Easy Integration**: Simple API for tool registration
- ✅ **Async Support**: All tools run asynchronously
- ✅ **Error Handling**: Graceful error handling built-in

##  Quick Start

### Basic Tool Registration

```typescript

const neurolink = new NeuroLink();

// Register a simple tool
neurolink.registerTool("greetUser", {
  description: "Generate a personalized greeting",
  parameters: z.object({
    name: z.string().describe("User name"),
    language: z.enum(["en", "es", "fr", "de"]).default("en"),
  }),
  execute: async ({ name, language }) => {
    const greetings = {
      en: `Hello, ${name}!`,
      es: `¡Hola, ${name}!`,
      fr: `Bonjour, ${name}!`,
      de: `Hallo, ${name}!`,
    };
    return { greeting: greetings[language] };
  },
});

// AI will now use your tool
const result = await neurolink.generate({
  input: { text: "Greet John in Spanish" },
});
// AI calls: greetUser({ name: "John", language: "es" })
// Returns: "¡Hola, John!"
```

## ⚠️ Common Mistakes

### ❌ Using `schema` instead of `parameters`

```typescript
// WRONG - will throw validation error
neurolink.registerTool("badTool", {
  description: "This will fail",
  schema: {
    // ❌ Should be 'parameters'
    type: "object",
    properties: { value: { type: "string" } },
  },
  execute: async (args) => args,
});
```

### ❌ Using plain JSON schema as `parameters`

```typescript
// WRONG - will throw validation error
neurolink.registerTool("badTool", {
  description: "This will also fail",
  parameters: {
    // ❌ Should be Zod schema
    type: "object",
    properties: { value: { type: "string" } },
  },
  execute: async (args) => args,
});
```

### ✅ Correct Zod Schema Format

```typescript
// CORRECT - works perfectly

neurolink.registerTool("goodTool", {
  description: "This works correctly",
  parameters: z.object({
    // ✅ Zod schema
    value: z.string(),
  }),
  execute: async (args) => args,
});
```

##  SimpleTool Interface

All custom tools implement the `SimpleTool` interface:

```typescript
type SimpleTool = {
  description: string; // What the tool does
  parameters?: ZodSchema; // Input validation schema
  execute: (args: T) => Promise; // Tool implementation
};
```

### Interface Components

- **description**: Clear, actionable description that helps the AI understand when to use the tool
- **parameters**: Optional Zod schema for validating inputs (highly recommended)
- **execute**: Async function that implements the tool's logic

## ️ Registration Methods

### Register Single Tool

```typescript
neurolink.registerTool(name: string, tool: SimpleTool): void
```

### Register Multiple Tools

```typescript
neurolink.registerTools(tools: Record): void
```

### Get Custom Tools

```typescript
// Get custom tools registered via registerTool()
const customTools = neurolink.getCustomTools(); // Returns Map

// Get all available tools (async - includes built-in, custom, and MCP tools)
const allTools = await neurolink.getAllAvailableTools(); // Returns ToolInfo[]
```

##  Common Use Cases

### 1. API Integration

```typescript
neurolink.registerTool("weatherLookup", {
  description: "Get current weather for any city",
  parameters: z.object({
    city: z.string().describe("City name"),
    country: z.string().optional().describe("Country code (ISO 2-letter)"),
    units: z.enum(["celsius", "fahrenheit"]).default("celsius"),
  }),
  execute: async ({ city, country, units }) => {
    const response = await fetch(
      `https://api.weather.com/v1/current?city=${city}&country=${country || ""}&units=${units}`,
      { headers: { "API-Key": process.env.WEATHER_API_KEY } },
    );
    const data = await response.json();

    return {
      city,
      temperature: data.temp,
      condition: data.condition,
      humidity: data.humidity,
      units,
    };
  },
});
```

### 2. Database Operations

```typescript
neurolink.registerTool("userLookup", {
  description: "Find user information by email or ID",
  parameters: z.object({
    identifier: z.string().describe("Email address or user ID"),
    fields: z
      .array(z.string())
      .optional()
      .describe("Specific fields to return"),
  }),
  execute: async ({ identifier, fields }) => {
    const db = getDatabase();
    const query = identifier.includes("@")
      ? { email: identifier }
      : { id: identifier };

    const user = await db.users.findOne(query);
    if (!user) {
      return { error: "User not found" };
    }

    // Return only requested fields if specified
    if (fields && fields.length > 0) {
      return fields.reduce((acc, field) => {
        acc[field] = user[field];
        return acc;
      }, {});
    }

    return user;
  },
});
```

### 3. Data Processing

```typescript
neurolink.registerTool("analyzeSentiment", {
  description: "Analyze sentiment of text using ML model",
  parameters: z.object({
    text: z.string().describe("Text to analyze"),
    language: z.string().default("en").describe("Language code"),
    detailed: z.boolean().default(false).describe("Include detailed analysis"),
  }),
  execute: async ({ text, language, detailed }) => {
    const sentimentModel = await loadSentimentModel(language);
    const result = await sentimentModel.analyze(text);

    if (detailed) {
      return {
        sentiment: result.sentiment,
        score: result.score,
        emotions: result.emotions,
        keywords: result.keywords,
        confidence: result.confidence,
      };
    }

    return {
      sentiment: result.sentiment,
      score: result.score,
    };
  },
});
```

### 4. File Operations

```typescript
neurolink.registerTool("processSpreadsheet", {
  description: "Process Excel/CSV files with various operations",
  parameters: z.object({
    filePath: z.string().describe("Path to spreadsheet file"),
    operation: z.enum(["summarize", "filter", "pivot", "chart"]),
    options: z.record(z.any()).optional(),
  }),
  execute: async ({ filePath, operation, options = {} }) => {
    const workbook = await loadSpreadsheet(filePath);

    switch (operation) {
      case "summarize":
        return {
          sheets: workbook.sheetNames,
          totalRows: workbook.getTotalRows(),
          columns: workbook.getColumns(),
          summary: workbook.generateSummary(),
        };

      case "filter":
        const filtered = workbook.filter(options.criteria);
        return {
          matchingRows: filtered.length,
          data: filtered,
        };

      case "pivot":
        return workbook.createPivotTable(
          options.rows,
          options.columns,
          options.values,
        );

      case "chart":
        const chartData = workbook.prepareChartData(
          options.type,
          options.series,
        );
        return { chartData, recommendation: suggestChartType(chartData) };
    }
  },
});
```

### 5. External Service Integration

```typescript
neurolink.registerTools({
  sendEmail: {
    description: "Send email via SMTP",
    parameters: z.object({
      to: z.string().email(),
      subject: z.string(),
      body: z.string(),
      cc: z.array(z.string().email()).optional(),
      attachments: z.array(z.string()).optional(),
    }),
    execute: async ({ to, subject, body, cc, attachments }) => {
      const mailer = getMailer();
      const result = await mailer.send({
        to,
        subject,
        body,
        cc,
        attachments: attachments
          ? await Promise.all(attachments.map(loadAttachment))
          : undefined,
      });

      return {
        messageId: result.messageId,
        status: "sent",
        timestamp: new Date().toISOString(),
      };
    },
  },

  scheduleCalendarEvent: {
    description: "Create calendar event",
    parameters: z.object({
      title: z.string(),
      startTime: z.string().datetime(),
      duration: z.number().describe("Duration in minutes"),
      attendees: z.array(z.string().email()).optional(),
      location: z.string().optional(),
      description: z.string().optional(),
    }),
    execute: async (params) => {
      const calendar = getCalendarService();
      const event = await calendar.createEvent({
        ...params,
        endTime: addMinutes(params.startTime, params.duration),
      });

      return {
        eventId: event.id,
        eventLink: event.htmlLink,
        status: "created",
      };
    },
  },
});
```

##  Best Practices

### 1. Clear Descriptions

Make tool descriptions specific and actionable:

```typescript
// ❌ Bad
description: "Database tool";

// ✅ Good
description: "Search customer database by name, email, or order ID";
```

### 2. Parameter Validation

Always use Zod schemas for type safety:

```typescript
// ❌ Bad - No validation
parameters: undefined,
execute: async (args: any) => {
  // Risky - args could be anything
}

// ✅ Good - Full validation
parameters: z.object({
  userId: z.string().uuid(),
  action: z.enum(['view', 'edit', 'delete']),
  reason: z.string().min(10).optional()
}),
execute: async ({ userId, action, reason }) => {
  // Type-safe with validated inputs
}
```

### 3. Error Handling

Handle errors gracefully:

```typescript
execute: async (args) => {
  try {
    const result = await riskyOperation(args);
    return { success: true, data: result };
  } catch (error) {
    // Return error info instead of throwing
    return {
      success: false,
      error: error.message,
      code: error.code || "UNKNOWN_ERROR",
    };
  }
};
```

### 4. Async Operations

All execute functions must return promises:

```typescript
// ❌ Bad - Synchronous
execute: (args) => {
  return { result: "data" };
};

// ✅ Good - Asynchronous
execute: async (args) => {
  const result = await fetchData(args);
  return { result };
};
```

### 5. Tool Naming

Use clear, consistent naming:

```typescript
// ❌ Bad naming
neurolink.registerTool('tool1', { ... });
neurolink.registerTool('doStuff', { ... });
neurolink.registerTool('x', { ... });

// ✅ Good naming
neurolink.registerTool('searchProducts', { ... });
neurolink.registerTool('calculateShipping', { ... });
neurolink.registerTool('updateInventory', { ... });
```

##  Testing Your Tools

### Unit Testing

```typescript

describe("weatherLookup tool", () => {
  it("should return weather data for valid city", async () => {
    const tool = {
      description: "Get weather data",
      parameters: z.object({
        city: z.string(),
      }),
      execute: async ({ city }) => {
        // Mock implementation for testing
        return {
          city,
          temperature: 22,
          condition: "sunny",
        };
      },
    };

    const result = await tool.execute({ city: "London" });
    expect(result).toHaveProperty("temperature");
    expect(result.city).toBe("London");
  });
});
```

### Integration Testing

```typescript

describe("Custom tools integration", () => {
  let neurolink: NeuroLink;

  beforeEach(() => {
    neurolink = new NeuroLink();
    neurolink.registerTool("testTool", {
      description: "Test tool for integration testing",
      parameters: z.object({ input: z.string() }),
      execute: async ({ input }) => ({ output: input.toUpperCase() }),
    });
  });

  it("should use custom tool in generation", async () => {
    const result = await neurolink.generate({
      input: { text: "Use the test tool with input 'hello'" },
      provider: "google-ai",
    });

    expect(result.content).toContain("HELLO");
  });
});
```

##  Debugging Tools

### Enable Debug Mode

```bash
export NEUROLINK_DEBUG=true
```

### Log Tool Execution

```typescript
neurolink.registerTool("debuggedTool", {
  description: "Tool with debug logging",
  parameters: z.object({ data: z.any() }),
  execute: async (args) => {
    console.log("[Tool] Executing with args:", args);

    try {
      const result = await processData(args);
      console.log("[Tool] Success:", result);
      return result;
    } catch (error) {
      console.error("[Tool] Error:", error);
      throw error;
    }
  },
});
```

##  Advanced Patterns

### Tool Composition

```typescript
// Base tools
const baseTools = {
  fetchData: {
    description: "Fetch data from API",
    execute: async ({ endpoint }) => {
      const response = await fetch(endpoint);
      return response.json();
    },
  },

  transformData: {
    description: "Transform data format",
    execute: async ({ data, format }) => {
      return transform(data, format);
    },
  },
};

// Composed tool
neurolink.registerTool("fetchAndTransform", {
  description: "Fetch data and transform it",
  parameters: z.object({
    endpoint: z.string().url(),
    format: z.enum(["json", "csv", "xml"]),
  }),
  execute: async ({ endpoint, format }) => {
    const data = await baseTools.fetchData.execute({ endpoint });
    return baseTools.transformData.execute({ data, format });
  },
});
```

### Tool Middleware

```typescript
// Wrap tools with middleware
function withRateLimit(tool: SimpleTool, limit: number): SimpleTool {
  const rateLimiter = new RateLimiter(limit);

  return {
    ...tool,
    execute: async (args) => {
      await rateLimiter.acquire();
      return tool.execute(args);
    },
  };
}

// Register with rate limiting
neurolink.registerTool(
  "limitedApi",
  withRateLimit(
    {
      description: "Rate-limited API call",
      execute: async (args) => callExpensiveAPI(args),
    },
    10,
  ), // 10 calls per minute
);
```

### Dynamic Tool Registration

```typescript
// Register tools based on configuration
async function registerDynamicTools(config: ToolConfig[]) {
  const tools: Record = {};

  for (const toolConfig of config) {
    tools[toolConfig.name] = {
      description: toolConfig.description,
      parameters: createZodSchema(toolConfig.parameters),
      execute: createExecutor(toolConfig),
    };
  }

  neurolink.registerTools(tools);
}

// Load from configuration
const toolConfigs = await loadToolConfigs();
await registerDynamicTools(toolConfigs);
```

##  Performance Considerations

### 1. Timeout Handling

```typescript
execute: async (args) => {
  const timeout = new Promise((_, reject) =>
    setTimeout(() => reject(new Error("Tool timeout")), 30000),
  );

  const operation = performOperation(args);

  return Promise.race([operation, timeout]);
};
```

### 2. Caching

```typescript
const cache = new Map();

execute: async (args) => {
  const cacheKey = JSON.stringify(args);

  if (cache.has(cacheKey)) {
    return cache.get(cacheKey);
  }

  const result = await expensiveOperation(args);
  cache.set(cacheKey, result);

  return result;
};
```

### 3. Batch Operations

```typescript
neurolink.registerTool("batchProcess", {
  description: "Process multiple items efficiently",
  parameters: z.object({
    items: z.array(z.any()),
    operation: z.string(),
  }),
  execute: async ({ items, operation }) => {
    // Process in parallel with concurrency limit
    const results = await pLimit(5)(
      items.map((item) => () => processItem(item, operation)),
    );

    return {
      processed: results.length,
      results,
    };
  },
});
```

##  Security Considerations

### Input Sanitization

```typescript
parameters: z.object({
  sqlQuery: z
    .string()
    .max(1000)
    .refine(
      (query) => !query.match(/DROP|DELETE|TRUNCATE/i),
      "Destructive operations not allowed",
    ),
});
```

### Permission Checking

```typescript
execute: async (args, context) => {
  // Check permissions before execution
  if (!hasPermission(context.user, "database.write")) {
    return { error: "Insufficient permissions" };
  }

  return performDatabaseOperation(args);
};
```

### Rate Limiting

```typescript
const userLimits = new Map();

execute: async (args, context) => {
  const userId = context.user?.id || "anonymous";
  const userCalls = userLimits.get(userId) || 0;

  if (userCalls >= 100) {
    return { error: "Rate limit exceeded" };
  }

  userLimits.set(userId, userCalls + 1);

  // Reset counters periodically
  setTimeout(() => userLimits.delete(userId), 3600000);

  return performOperation(args);
};
```

##  Complete Example

Here's a complete example combining multiple concepts:

```typescript

const neurolink = new NeuroLink();

// Define a comprehensive customer service tool set
neurolink.registerTools({
  searchCustomer: {
    description: "Search for customer by various criteria",
    parameters: z.object({
      query: z.string(),
      searchBy: z.enum(["email", "name", "phone", "orderId"]),
      limit: z.number().min(1).max(50).default(10),
    }),
    execute: async ({ query, searchBy, limit }) => {
      const db = getDatabase();
      const results = await db.customers.search({
        [searchBy]: query,
        limit,
      });

      return {
        found: results.length,
        customers: results.map((c) => ({
          id: c.id,
          name: c.name,
          email: c.email,
          totalOrders: c.orderCount,
          memberSince: c.createdAt,
        })),
      };
    },
  },

  getOrderHistory: {
    description: "Get order history for a customer",
    parameters: z.object({
      customerId: z.string().uuid(),
      status: z
        .enum(["all", "pending", "completed", "cancelled"])
        .default("all"),
      limit: z.number().default(10),
    }),
    execute: async ({ customerId, status, limit }) => {
      const orders = await fetchOrders(customerId, { status, limit });

      return {
        customerId,
        orderCount: orders.length,
        orders: orders.map((o) => ({
          orderId: o.id,
          date: o.createdAt,
          status: o.status,
          total: o.total,
          items: o.items.length,
        })),
      };
    },
  },

  processRefund: {
    description: "Process refund for an order",
    parameters: z.object({
      orderId: z.string().uuid(),
      amount: z.number().positive(),
      reason: z.string().min(10),
      notify: z.boolean().default(true),
    }),
    execute: async ({ orderId, amount, reason, notify }) => {
      // Validate order exists and is refundable
      const order = await getOrder(orderId);
      if (!order) {
        return { success: false, error: "Order not found" };
      }

      if (order.status !== "completed") {
        return {
          success: false,
          error: "Only completed orders can be refunded",
        };
      }

      if (amount > order.total) {
        return { success: false, error: "Refund amount exceeds order total" };
      }

      // Process refund
      const refund = await processPaymentRefund({
        orderId,
        amount,
        reason,
      });

      // Send notification
      if (notify) {
        await sendRefundNotification(order.customerId, refund);
      }

      return {
        success: true,
        refundId: refund.id,
        amount: refund.amount,
        status: "processed",
      };
    },
  },
});

// Now you can use natural language to access these tools
const result = await neurolink.generate({
  input: {
    text: "Find all orders for customer john@example.com and process a $50 refund for their most recent completed order due to damaged item",
  },
  provider: "openai",
});

// The AI will:
// 1. Call searchCustomer({ query: "john@example.com", searchBy: "email" })
// 2. Call getOrderHistory({ customerId: , status: "completed" })
// 3. Call processRefund({ orderId: , amount: 50, reason: "damaged item" })
```

##  MCP Server Integration

Beyond simple tool registration, NeuroLink SDK supports adding complete MCP (Model Context Protocol) servers for more complex tool ecosystems.

### Adding In-Memory MCP Servers

```typescript
// Add a complete MCP server with multiple related tools
await neurolink.addInMemoryMCPServer("hr-management", {
  server: {
    title: "HR Management Server",
    description: "Comprehensive HR tools for employee management",
    tools: {
      createEmployee: {
        description: "Create a new employee record with full details",
        execute: async (params: {
          name: string;
          department: string;
          role: string;
          salary: number;
          startDate: string;
        }) => {
          return {
            success: true,
            data: {
              employeeId: `EMP-${Date.now()}`,
              name: params.name,
              department: params.department,
              role: params.role,
              salary: params.salary,
              startDate: params.startDate,
              status: "active",
              createdAt: new Date().toISOString(),
            },
          };
        },
      },

      calculateSalary: {
        description: "Calculate total salary including bonuses and deductions",
        execute: async (params: {
          baseSalary: number;
          bonuses: number;
          deductions: number;
          taxRate: number;
        }) => {
          const grossSalary =
            params.baseSalary + params.bonuses - params.deductions;
          const netSalary = grossSalary * (1 - params.taxRate);

          return {
            success: true,
            data: {
              baseSalary: params.baseSalary,
              bonuses: params.bonuses,
              deductions: params.deductions,
              grossSalary,
              taxAmount: grossSalary * params.taxRate,
              netSalary,
              calculatedAt: new Date().toISOString(),
            },
          };
        },
      },

      getEmployeeStats: {
        description: "Get comprehensive employee statistics and analytics",
        execute: async (params: { department?: string; role?: string }) => {
          // Simulated analytics data
          return {
            success: true,
            data: {
              totalEmployees: 150,
              byDepartment: {
                engineering: 60,
                sales: 35,
                marketing: 25,
                hr: 15,
                finance: 15,
              },
              averageSalary: 75000,
              averageTenure: "2.5 years",
              openPositions: 8,
              lastUpdated: new Date().toISOString(),
            },
          };
        },
      },
    },
  },
  category: "hr-management",
  metadata: {
    version: "1.0.0",
    author: "Your Company",
    description: "Complete HR management solution",
  },
});
```

### Advanced MCP Server Examples

#### 1. Data Analytics Server

```typescript
await neurolink.addInMemoryMCPServer("analytics-server", {
  server: {
    title: "Data Analytics Server",
    description: "Advanced data processing and analytics tools",
    tools: {
      analyzeDataset: {
        description: "Perform statistical analysis on datasets",
        execute: async (params: {
          data: number[];
          analysisType: "descriptive" | "correlation" | "regression";
        }) => {
          const { data, analysisType } = params;

          switch (analysisType) {
            case "descriptive":
              const sum = data.reduce((a, b) => a + b, 0);
              const mean = sum / data.length;
              const sortedData = [...data].sort((a, b) => a - b);
              const median = sortedData[Math.floor(data.length / 2)];
              const variance =
                data.reduce((acc, val) => acc + Math.pow(val - mean, 2), 0) /
                data.length;
              const stdDev = Math.sqrt(variance);

              return {
                success: true,
                data: {
                  count: data.length,
                  sum,
                  mean,
                  median,
                  min: Math.min(...data),
                  max: Math.max(...data),
                  variance,
                  standardDeviation: stdDev,
                  range: Math.max(...data) - Math.min(...data),
                },
              };

            case "correlation":
              // Simplified correlation analysis
              return {
                success: true,
                data: {
                  correlationMatrix: "Generated correlation matrix",
                  strongCorrelations: [],
                  analysisNote: "Correlation analysis completed",
                },
              };

            default:
              return { success: false, error: "Unknown analysis type" };
          }
        },
      },

      generateReport: {
        description: "Generate comprehensive data reports with visualizations",
        execute: async (params: {
          title: string;
          data: any[];
          reportType: "summary" | "detailed" | "executive";
        }) => {
          return {
            success: true,
            data: {
              reportId: `RPT-${Date.now()}`,
              title: params.title,
              type: params.reportType,
              dataPoints: params.data.length,
              sections: [
                "Executive Summary",
                "Key Metrics",
                "Detailed Analysis",
                "Recommendations",
              ],
              generatedAt: new Date().toISOString(),
              status: "completed",
            },
          };
        },
      },
    },
  },
});
```

#### 2. Workflow Automation Server

```typescript
await neurolink.addInMemoryMCPServer("workflow-server", {
  server: {
    title: "Workflow Automation Server",
    description: "Tools for creating and managing automated workflows",
    tools: {
      createWorkflow: {
        description: "Create a new automated workflow with multiple steps",
        execute: async (params: {
          name: string;
          steps: Array;
          triggers: string[];
        }) => {
          return {
            success: true,
            data: {
              workflowId: `WF-${Date.now()}`,
              name: params.name,
              steps: params.steps,
              triggers: params.triggers,
              status: "created",
              nextExecution: null,
              createdAt: new Date().toISOString(),
            },
          };
        },
      },

      executeWorkflow: {
        description: "Execute a workflow with specific input data",
        execute: async (params: {
          workflowId: string;
          inputData: any;
          executionMode: "test" | "production";
        }) => {
          return {
            success: true,
            data: {
              executionId: `EXE-${Date.now()}`,
              workflowId: params.workflowId,
              mode: params.executionMode,
              status: "running",
              progress: 0,
              startedAt: new Date().toISOString(),
              estimatedCompletion: new Date(Date.now() + 300000).toISOString(), // 5 minutes
            },
          };
        },
      },

      getWorkflowStatus: {
        description: "Get current status and progress of workflow execution",
        execute: async (params: { workflowId: string }) => {
          return {
            success: true,
            data: {
              workflowId: params.workflowId,
              status: "in-progress",
              currentStep: "Data Processing",
              stepsCompleted: 3,
              totalSteps: 8,
              progress: 37.5,
              timeElapsed: "2m 15s",
              estimatedTimeRemaining: "3m 45s",
              lastUpdated: new Date().toISOString(),
            },
          };
        },
      },
    },
  },
});
```

#### 3. Content Generation Server

```typescript
await neurolink.addInMemoryMCPServer("content-server", {
  server: {
    title: "Content Generation Server",
    description: "Advanced content creation and management tools",
    tools: {
      generateSampleText: {
        description: "Generate sample text content for testing and development",
        execute: async (params: {
          topic: string;
          length: "short" | "medium" | "long";
          style: "formal" | "casual" | "technical";
        }) => {
          const samples = {
            short: `A brief overview of ${params.topic}. This content covers essential information in a ${params.style} style.`,
            medium: `This is a comprehensive introduction to ${params.topic}. Written in a ${params.style} style, it covers fundamental concepts, practical applications, and key considerations for understanding ${params.topic} in various contexts.`,
            long: `This extensive exploration of ${params.topic} provides detailed analysis written in a ${params.style} style. The content examines multiple perspectives, methodologies, and real-world applications related to ${params.topic}. By thoroughly investigating various aspects and implications, readers gain comprehensive understanding of ${params.topic} and its significance across different fields and industries.`,
          };

          return {
            success: true,
            data: {
              text: samples[params.length],
              topic: params.topic,
              length: params.length,
              style: params.style,
              wordCount: samples[params.length].split(" ").length,
              characterCount: samples[params.length].length,
              generatedAt: new Date().toISOString(),
            },
          };
        },
      },

      analyzeContent: {
        description: "Analyze text content for various metrics and insights",
        execute: async (params: {
          text: string;
          analysisTypes: Array;
        }) => {
          const results: any = {};

          params.analysisTypes.forEach((type) => {
            switch (type) {
              case "sentiment":
                const positiveWords = ["good", "great", "excellent", "amazing"];
                const negativeWords = ["bad", "terrible", "awful", "poor"];
                const words = params.text.toLowerCase().split(" ");
                const positive = words.filter((w) =>
                  positiveWords.includes(w),
                ).length;
                const negative = words.filter((w) =>
                  negativeWords.includes(w),
                ).length;

                results.sentiment = {
                  score: positive - negative,
                  sentiment:
                    positive > negative
                      ? "positive"
                      : negative > positive
                        ? "negative"
                        : "neutral",
                  confidence: Math.min(
                    (Math.abs(positive - negative) / words.length) * 10,
                    1,
                  ),
                };
                break;

              case "readability":
                const sentences = params.text.split(/[.!?]+/).length;
                const wordCount = params.text.split(" ").length;
                const avgWordsPerSentence = wordCount / sentences;

                results.readability = {
                  wordCount,
                  sentenceCount: sentences,
                  avgWordsPerSentence,
                  readabilityLevel:
                    avgWordsPerSentence  = {};
                const meaningfulWords = params.text
                  .toLowerCase()
                  .replace(/[^\w\s]/g, "")
                  .split(" ")
                  .filter((w) => w.length > 3);

                meaningfulWords.forEach((word) => {
                  wordFreq[word] = (wordFreq[word] || 0) + 1;
                });

                results.keywords = Object.entries(wordFreq)
                  .sort(([, a], [, b]) => b - a)
                  .slice(0, 10)
                  .map(([word, freq]) => ({ word, frequency: freq }));
                break;
            }
          });

          return {
            success: true,
            data: {
              textLength: params.text.length,
              analysisTypes: params.analysisTypes,
              results,
              analyzedAt: new Date().toISOString(),
            },
          };
        },
      },
    },
  },
});
```

### Mixed Tool Ecosystem Example

```typescript

const neurolink = new NeuroLink();

// 1. Register simple custom tools (extending existing functionality)
neurolink.registerTool(
  "enhancedCalculator",
  createTool({
    description: "Enhanced calculator with scientific and financial functions",
    execute: (params: {
      expression: string;
      mode: "basic" | "scientific" | "financial";
    }) => {
      if (params.mode === "scientific" && params.expression.includes("sqrt")) {
        const num = parseFloat(
          params.expression.replace("sqrt(", "").replace(")", ""),
        );
        return { result: Math.sqrt(num), enhanced: true, mode: params.mode };
      }

      if (
        params.mode === "financial" &&
        params.expression.includes("compound")
      ) {
        // Parse: compound(principal, rate, time)
        const match = params.expression.match(
          /compound\((\d+),\s*([\d.]+),\s*(\d+)\)/,
        );
        if (match) {
          const [, principal, rate, time] = match.map(Number);
          const result = principal * Math.pow(1 + rate / 100, time);
          return {
            result,
            enhanced: true,
            mode: params.mode,
            calculation: "compound_interest",
          };
        }
      }

      // Use a safe, restricted math expression evaluator for security
      const {
        create,
        addDependencies,
        subtractDependencies,
        multiplyDependencies,
        divideDependencies,
        powDependencies,
        sqrtDependencies,
        absDependencies,
      } from "mathjs";

      // Create restricted math environment with only specific functions for security
      const dependencies = {
        addDependencies,
        subtractDependencies,
        multiplyDependencies,
        divideDependencies,
        powDependencies,
        sqrtDependencies,
        absDependencies,
      };

      const math = create(dependencies, {
        matrix: "Array",
        number: "number",
        precision: 64,
      });

      // Additional sanitization for basic mathematical expressions
      const sanitizedExpression = params.expression.replace(
        /[^0-9+\-*/().\s]/g,
        "",
      );
      if (sanitizedExpression !== params.expression) {
        return {
          error: "Expression contains invalid characters",
          enhanced: false,
          mode: params.mode,
        };
      }

      try {
        const result = math.evaluate(sanitizedExpression);
        return { result, enhanced: false, mode: params.mode };
      } catch (error) {
        return {
          error: `Mathematical expression failed: ${error.message || "Invalid expression"}`,
          enhanced: false,
          mode: params.mode,
        };
      }
    },
  }),
);

// 2. Add complete MCP servers (new functionality domains)
await neurolink.addInMemoryMCPServer("business-intelligence", {
  server: {
    title: "Business Intelligence Server",
    tools: {
      generateKPIReport: {
        description: "Generate comprehensive KPI reports for business metrics",
        execute: async (params: {
          metrics: string[];
          timeRange: string;
          department?: string;
        }) => {
          return {
            success: true,
            data: {
              reportId: `KPI-${Date.now()}`,
              metrics: params.metrics,
              timeRange: params.timeRange,
              department: params.department || "All",
              kpis: {
                revenue: "$1.2M",
                growth: "+15%",
                customerSatisfaction: "94%",
                efficiency: "87%",
              },
              trends: ["Revenue increasing", "Customer satisfaction stable"],
              recommendations: [
                "Focus on efficiency improvements",
                "Expand successful programs",
              ],
              generatedAt: new Date().toISOString(),
            },
          };
        },
      },

      predictTrends: {
        description: "Predict business trends using historical data",
        execute: async (params: {
          dataPoints: number[];
          predictionPeriod: number;
          algorithm: "linear" | "exponential" | "seasonal";
        }) => {
          // Simplified prediction logic
          const trend =
            params.dataPoints[params.dataPoints.length - 1] >
            params.dataPoints[0]
              ? "upward"
              : "downward";
          const avgGrowth =
            (params.dataPoints[params.dataPoints.length - 1] -
              params.dataPoints[0]) /
            params.dataPoints.length;

          return {
            success: true,
            data: {
              algorithm: params.algorithm,
              trend,
              predictedGrowth: avgGrowth,
              confidence: 0.85,
              predictions: Array.from(
                { length: params.predictionPeriod },
                (_, i) =>
                  params.dataPoints[params.dataPoints.length - 1] +
                  avgGrowth * (i + 1),
              ),
              generatedAt: new Date().toISOString(),
            },
          };
        },
      },
    },
  },
});

// 3. Use the mixed ecosystem
const comprehensiveResult = await neurolink.generate({
  input: {
    text: `Calculate compound interest for $10000 at 5% for 3 years, then generate a KPI report
           for revenue metrics over the last quarter, and predict trends for the next 6 months
           using the data points [100, 120, 115, 130, 125, 140]`,
  },
  provider: "google-ai",
  maxTokens: 2000,
});

// The AI will automatically:
// 1. Use enhancedCalculator for compound interest: compound(10000, 5, 3)
// 2. Use generateKPIReport for business metrics
// 3. Use predictTrends for forecasting
// 4. Synthesize all results into a comprehensive response

console.log("AI Response:", comprehensiveResult.content);
console.log("Tools Used:", comprehensiveResult.toolsUsed);
```

### Tool Discovery and Management

```typescript
// Get comprehensive view of all available tools
const allTools = await neurolink.getAllAvailableTools();

// Group tools by source
const toolsBySource = allTools.reduce(
  (acc, tool) => {
    const source = tool.serverId || "unknown";
    acc[source] = (acc[source] || 0) + 1;
    return acc;
  },
  {} as Record,
);

console.log("Tool ecosystem summary:");
console.log("• Total tools available:", allTools.length);
console.log("• Tools by source:", toolsBySource);

// Get custom tools registered via registerTool()
const customTools = neurolink.getCustomTools();
console.log("• Custom tools registered:", customTools.size);

// Get in-memory MCP servers added via addInMemoryMCPServer()
const mcpServers = neurolink.getInMemoryServers();
console.log("• In-memory MCP servers:", mcpServers.size);

// Execute tools from any source using unified API
const timeResult = await neurolink.executeTool("getCurrentTime");
const calculationResult = await neurolink.executeTool("enhancedCalculator", {
  expression: "compound(5000, 4.5, 2)",
  mode: "financial",
});
const reportResult = await neurolink.executeTool("generateKPIReport", {
  metrics: ["revenue", "growth"],
  timeRange: "Q1-2024",
});

console.log("Tool execution results:");
console.log("• Built-in tool:", timeResult.data.time);
console.log("• Custom tool:", calculationResult.result);
console.log("• MCP server tool:", reportResult.data.reportId);
```

### Adding Remote HTTP MCP Servers

Connect to remote MCP servers via HTTP transport with authentication, retry, and rate limiting:

```typescript

const neurolink = new NeuroLink();

// Add HTTP MCP server with full configuration
await neurolink.addExternalMCPServer("remote-api", {
  transport: "http",
  url: "https://api.example.com/mcp",
  headers: {
    Authorization: "Bearer YOUR_API_TOKEN",
    "X-Custom-Header": "value",
  },
  httpOptions: {
    connectionTimeout: 30000,
    requestTimeout: 60000,
    idleTimeout: 120000,
    keepAliveTimeout: 30000,
  },
  retryConfig: {
    maxAttempts: 3,
    initialDelay: 1000,
    maxDelay: 30000,
    backoffMultiplier: 2,
  },
  rateLimiting: {
    requestsPerMinute: 60,
    maxBurst: 10,
    useTokenBucket: true,
  },
});

// Add HTTP server with OAuth 2.1
await neurolink.addExternalMCPServer("oauth-api", {
  transport: "http",
  url: "https://api.enterprise.com/mcp",
  auth: {
    type: "oauth2",
    oauth: {
      clientId: "your-client-id",
      clientSecret: "your-client-secret",
      authorizationUrl: "https://auth.provider.com/authorize",
      tokenUrl: "https://auth.provider.com/token",
      redirectUrl: "http://localhost:8080/callback",
      scope: "mcp:read mcp:write",
      usePKCE: true,
    },
  },
});

// Use the remote server's tools in AI generation
const result = await neurolink.generate({
  input: { text: "Use the remote API to perform analysis" },
  provider: "google-ai",
});
```

**HTTP Configuration Options:**

| Option         | Type   | Description                                 |
| -------------- | ------ | ------------------------------------------- |
| `transport`    | string | Must be `"http"` for HTTP transport         |
| `url`          | string | Remote MCP endpoint URL                     |
| `headers`      | object | Custom HTTP headers                         |
| `httpOptions`  | object | Connection timeout settings                 |
| `retryConfig`  | object | Retry with exponential backoff              |
| `rateLimiting` | object | Rate limiting configuration                 |
| `auth`         | object | Authentication (OAuth 2.1, Bearer, API Key) |

See [MCP HTTP Transport Guide](/docs/mcp/http-transport) for complete documentation.

### Best Practices for MCP Integration

#### 1. Organize Tools by Domain

```typescript
// Group related tools into themed MCP servers
await neurolink.addInMemoryMCPServer("user-management", {
  server: {
    title: "User Management Server",
    tools: {
      createUser: {
        /* ... */
      },
      updateUser: {
        /* ... */
      },
      deleteUser: {
        /* ... */
      },
      getUserProfile: {
        /* ... */
      },
    },
  },
});

await neurolink.addInMemoryMCPServer("order-processing", {
  server: {
    title: "Order Processing Server",
    tools: {
      createOrder: {
        /* ... */
      },
      updateOrderStatus: {
        /* ... */
      },
      calculateShipping: {
        /* ... */
      },
      processPayment: {
        /* ... */
      },
    },
  },
});
```

#### 2. Consistent Error Handling

```typescript
execute: async (params) => {
  try {
    const result = await performOperation(params);
    return {
      success: true,
      data: result,
    };
  } catch (error) {
    return {
      success: false,
      error: error.message,
      code: error.code || "OPERATION_FAILED",
      timestamp: new Date().toISOString(),
    };
  }
};
```

#### 3. Comprehensive Metadata

```typescript
await neurolink.addInMemoryMCPServer("server-id", {
  server: {
    title: "Human-Readable Server Name",
    description: "Detailed description of server purpose",
    tools: {
      /* ... */
    },
  },
  category: "business-logic", // Group similar servers
  metadata: {
    version: "2.1.0",
    author: "Your Team",
    lastUpdated: "2024-01-15",
    documentation: "https://docs.yourcompany.com/mcp-servers",
    supportContact: "support@yourcompany.com",
  },
});
```

##  Additional Resources

- [API Reference - NeuroLink Class](/docs/sdk/api-reference)
- [MCP Integration Guide](/docs/mcp/integration)
- [Provider Tool Support](/docs/)
- [Test Examples](/docs/development/testing)
- [MCP SDK Integration Proof Tests](/docs/development/testing)
- [Real AI-MCP Integration Demo](/docs/development/testing)

---

**Start building powerful AI applications with custom tools and MCP servers today! **

---

## SDK Custom Tools Guide

<!-- Source: sdk/custom-tools.md -->

#  SDK Custom Tools Guide

Build powerful AI applications by extending NeuroLink with your own custom tools.

##  Overview

NeuroLink's SDK allows you to register custom tools programmatically, giving your AI assistants access to any functionality you need. All registered tools work seamlessly with the built-in tool system across all supported providers.

### Key Features

- ✅ **Type-Safe**: Full TypeScript support with Zod schema validation
- ✅ **Provider Agnostic**: Works with all providers that support tools
- ✅ **Easy Integration**: Simple API for tool registration
- ✅ **Async Support**: All tools run asynchronously
- ✅ **Error Handling**: Graceful error handling built-in

##  Quick Start

### Basic Tool Registration

```typescript

const neurolink = new NeuroLink();

// Register a simple tool
neurolink.registerTool("greetUser", {
  description: "Generate a personalized greeting",
  parameters: z.object({
    name: z.string().describe("User name"),
    language: z.enum(["en", "es", "fr", "de"]).default("en"),
  }),
  execute: async ({ name, language }) => {
    const greetings = {
      en: `Hello, ${name}!`,
      es: `¡Hola, ${name}!`,
      fr: `Bonjour, ${name}!`,
      de: `Hallo, ${name}!`,
    };
    return { greeting: greetings[language] };
  },
});

// AI will now use your tool
const result = await neurolink.generate({
  input: { text: "Greet John in Spanish" },
});
// AI calls: greetUser({ name: "John", language: "es" })
// Returns: "¡Hola, John!"
```

## ⚠️ Common Mistakes

### ❌ Using `schema` instead of `parameters`

```typescript
// WRONG - will throw validation error
neurolink.registerTool("badTool", {
  description: "This will fail",
  schema: {
    // ❌ Should be 'parameters'
    type: "object",
    properties: { value: { type: "string" } },
  },
  execute: async (args) => args,
});
```

### ❌ Using plain JSON schema as `parameters`

```typescript
// WRONG - will throw validation error
neurolink.registerTool("badTool", {
  description: "This will also fail",
  parameters: {
    // ❌ Should be Zod schema
    type: "object",
    properties: { value: { type: "string" } },
  },
  execute: async (args) => args,
});
```

### ✅ Correct Zod Schema Format

```typescript
// CORRECT - works perfectly

neurolink.registerTool("goodTool", {
  description: "This works correctly",
  parameters: z.object({
    // ✅ Zod schema
    value: z.string(),
  }),
  execute: async (args) => args,
});
```

##  SimpleTool Interface

All custom tools implement the `SimpleTool` interface:

```typescript
type SimpleTool = {
  description: string; // What the tool does
  parameters?: ZodSchema; // Input validation schema
  execute: (args: T) => Promise; // Tool implementation
};
```

### Interface Components

- **description**: Clear, actionable description that helps the AI understand when to use the tool
- **parameters**: Optional Zod schema for validating inputs (highly recommended)
- **execute**: Async function that implements the tool's logic

## ️ Registration Methods

### Register Single Tool

```typescript
neurolink.registerTool(name: string, tool: SimpleTool): void
```

### Register Multiple Tools (Unified API)

```typescript
// Object format (existing compatibility)
neurolink.registerTools(tools: Record): void

// Array format (Lighthouse compatible)
neurolink.registerTools(tools: Array): void
```

The `registerTools()` method automatically detects the input format and handles both object and array formats seamlessly.

### Get Registered Tools

```typescript
const tools = neurolink.getRegisteredTools(); // Returns string[]
```

##  Common Use Cases

### 1. API Integration

```typescript
neurolink.registerTool("weatherLookup", {
  description: "Get current weather for any city",
  parameters: z.object({
    city: z.string().describe("City name"),
    country: z.string().optional().describe("Country code (ISO 2-letter)"),
    units: z.enum(["celsius", "fahrenheit"]).default("celsius"),
  }),
  execute: async ({ city, country, units }) => {
    const response = await fetch(
      `https://api.weather.com/v1/current?city=${city}&country=${country || ""}&units=${units}`,
      { headers: { "API-Key": process.env.WEATHER_API_KEY } },
    );
    const data = await response.json();

    return {
      city,
      temperature: data.temp,
      condition: data.condition,
      humidity: data.humidity,
      units,
    };
  },
});
```

### 2. Database Operations

```typescript
neurolink.registerTool("userLookup", {
  description: "Find user information by email or ID",
  parameters: z.object({
    identifier: z.string().describe("Email address or user ID"),
    fields: z
      .array(z.string())
      .optional()
      .describe("Specific fields to return"),
  }),
  execute: async ({ identifier, fields }) => {
    const db = getDatabase();
    const query = identifier.includes("@")
      ? { email: identifier }
      : { id: identifier };

    const user = await db.users.findOne(query);
    if (!user) {
      return { error: "User not found" };
    }

    // Return only requested fields if specified
    if (fields && fields.length > 0) {
      return fields.reduce((acc, field) => {
        acc[field] = user[field];
        return acc;
      }, {});
    }

    return user;
  },
});
```

### 3. Data Processing

```typescript
neurolink.registerTool("analyzeSentiment", {
  description: "Analyze sentiment of text using ML model",
  parameters: z.object({
    text: z.string().describe("Text to analyze"),
    language: z.string().default("en").describe("Language code"),
    detailed: z.boolean().default(false).describe("Include detailed analysis"),
  }),
  execute: async ({ text, language, detailed }) => {
    const sentimentModel = await loadSentimentModel(language);
    const result = await sentimentModel.analyze(text);

    if (detailed) {
      return {
        sentiment: result.sentiment,
        score: result.score,
        emotions: result.emotions,
        keywords: result.keywords,
        confidence: result.confidence,
      };
    }

    return {
      sentiment: result.sentiment,
      score: result.score,
    };
  },
});
```

### 4. File Operations

```typescript
neurolink.registerTool("processSpreadsheet", {
  description: "Process Excel/CSV files with various operations",
  parameters: z.object({
    filePath: z.string().describe("Path to spreadsheet file"),
    operation: z.enum(["summarize", "filter", "pivot", "chart"]),
    options: z.record(z.any()).optional(),
  }),
  execute: async ({ filePath, operation, options = {} }) => {
    const workbook = await loadSpreadsheet(filePath);

    switch (operation) {
      case "summarize":
        return {
          sheets: workbook.sheetNames,
          totalRows: workbook.getTotalRows(),
          columns: workbook.getColumns(),
          summary: workbook.generateSummary(),
        };

      case "filter":
        const filtered = workbook.filter(options.criteria);
        return {
          matchingRows: filtered.length,
          data: filtered,
        };

      case "pivot":
        return workbook.createPivotTable(
          options.rows,
          options.columns,
          options.values,
        );

      case "chart":
        const chartData = workbook.prepareChartData(
          options.type,
          options.series,
        );
        return { chartData, recommendation: suggestChartType(chartData) };
    }
  },
});
```

### 5. External Service Integration

```typescript
neurolink.registerTools({
  sendEmail: {
    description: "Send email via SMTP",
    parameters: z.object({
      to: z.string().email(),
      subject: z.string(),
      body: z.string(),
      cc: z.array(z.string().email()).optional(),
      attachments: z.array(z.string()).optional(),
    }),
    execute: async ({ to, subject, body, cc, attachments }) => {
      const mailer = getMailer();
      const result = await mailer.send({
        to,
        subject,
        body,
        cc,
        attachments: attachments
          ? await Promise.all(attachments.map(loadAttachment))
          : undefined,
      });

      return {
        messageId: result.messageId,
        status: "sent",
        timestamp: new Date().toISOString(),
      };
    },
  },

  scheduleCalendarEvent: {
    description: "Create calendar event",
    parameters: z.object({
      title: z.string(),
      startTime: z.string().datetime(),
      duration: z.number().describe("Duration in minutes"),
      attendees: z.array(z.string().email()).optional(),
      location: z.string().optional(),
      description: z.string().optional(),
    }),
    execute: async (params) => {
      const calendar = getCalendarService();
      const event = await calendar.createEvent({
        ...params,
        endTime: addMinutes(params.startTime, params.duration),
      });

      return {
        eventId: event.id,
        eventLink: event.htmlLink,
        status: "created",
      };
    },
  },
});
```

##  Best Practices

### 1. Clear Descriptions

Make tool descriptions specific and actionable:

```typescript
// ❌ Bad
description: "Database tool";

// ✅ Good
description: "Search customer database by name, email, or order ID";
```

### 2. Parameter Validation

Always use Zod schemas for type safety:

```typescript
// ❌ Bad - No validation
parameters: undefined,
execute: async (args: any) => {
  // Risky - args could be anything
}

// ✅ Good - Full validation
parameters: z.object({
  userId: z.string().uuid(),
  action: z.enum(['view', 'edit', 'delete']),
  reason: z.string().min(10).optional()
}),
execute: async ({ userId, action, reason }) => {
  // Type-safe with validated inputs
}
```

### 3. Error Handling

Handle errors gracefully:

```typescript
execute: async (args) => {
  try {
    const result = await riskyOperation(args);
    return { success: true, data: result };
  } catch (error) {
    // Return error info instead of throwing
    return {
      success: false,
      error: error.message,
      code: error.code || "UNKNOWN_ERROR",
    };
  }
};
```

### 4. Async Operations

All execute functions must return promises:

```typescript
// ❌ Bad - Synchronous
execute: (args) => {
  return { result: "data" };
};

// ✅ Good - Asynchronous
execute: async (args) => {
  const result = await fetchData(args);
  return { result };
};
```

### 5. Tool Naming

Use clear, consistent naming:

```typescript
// ❌ Bad naming
neurolink.registerTool('tool1', { ... });
neurolink.registerTool('doStuff', { ... });
neurolink.registerTool('x', { ... });

// ✅ Good naming
neurolink.registerTool('searchProducts', { ... });
neurolink.registerTool('calculateShipping', { ... });
neurolink.registerTool('updateInventory', { ... });
```

##  Testing Your Tools

### Unit Testing

```typescript

describe("weatherLookup tool", () => {
  it("should return weather data for valid city", async () => {
    const tool = {
      description: "Get weather data",
      parameters: z.object({
        city: z.string(),
      }),
      execute: async ({ city }) => {
        // Mock implementation for testing
        return {
          city,
          temperature: 22,
          condition: "sunny",
        };
      },
    };

    const result = await tool.execute({ city: "London" });
    expect(result).toHaveProperty("temperature");
    expect(result.city).toBe("London");
  });
});
```

### Integration Testing

```typescript

describe("Custom tools integration", () => {
  let neurolink: NeuroLink;

  beforeEach(() => {
    neurolink = new NeuroLink();
    neurolink.registerTool("testTool", {
      description: "Test tool for integration testing",
      parameters: z.object({ input: z.string() }),
      execute: async ({ input }) => ({ output: input.toUpperCase() }),
    });
  });

  it("should use custom tool in generation", async () => {
    const result = await neurolink.generate({
      input: { text: "Use the test tool with input 'hello'" },
      provider: "google-ai",
    });

    expect(result.content).toContain("HELLO");
  });
});
```

##  Debugging Tools

### Enable Debug Mode

```bash
export NEUROLINK_DEBUG=true
```

### Log Tool Execution

```typescript
neurolink.registerTool("debuggedTool", {
  description: "Tool with debug logging",
  parameters: z.object({ data: z.any() }),
  execute: async (args) => {
    console.log("[Tool] Executing with args:", args);

    try {
      const result = await processData(args);
      console.log("[Tool] Success:", result);
      return result;
    } catch (error) {
      console.error("[Tool] Error:", error);
      throw error;
    }
  },
});
```

##  Advanced Patterns

### Tool Composition

```typescript
// Base tools
const baseTools = {
  fetchData: {
    description: "Fetch data from API",
    execute: async ({ endpoint }) => {
      const response = await fetch(endpoint);
      return response.json();
    },
  },

  transformData: {
    description: "Transform data format",
    execute: async ({ data, format }) => {
      return transform(data, format);
    },
  },
};

// Composed tool
neurolink.registerTool("fetchAndTransform", {
  description: "Fetch data and transform it",
  parameters: z.object({
    endpoint: z.string().url(),
    format: z.enum(["json", "csv", "xml"]),
  }),
  execute: async ({ endpoint, format }) => {
    const data = await baseTools.fetchData.execute({ endpoint });
    return baseTools.transformData.execute({ data, format });
  },
});
```

### Tool Middleware

```typescript
// Wrap tools with middleware
function withRateLimit(tool: SimpleTool, limit: number): SimpleTool {
  const rateLimiter = new RateLimiter(limit);

  return {
    ...tool,
    execute: async (args) => {
      await rateLimiter.acquire();
      return tool.execute(args);
    },
  };
}

// Register with rate limiting
neurolink.registerTool(
  "limitedApi",
  withRateLimit(
    {
      description: "Rate-limited API call",
      execute: async (args) => callExpensiveAPI(args),
    },
    10,
  ), // 10 calls per minute
);
```

### Dynamic Tool Registration

```typescript
// Register tools based on configuration
async function registerDynamicTools(config: ToolConfig[]) {
  const tools: Record = {};

  for (const toolConfig of config) {
    tools[toolConfig.name] = {
      description: toolConfig.description,
      parameters: createZodSchema(toolConfig.parameters),
      execute: createExecutor(toolConfig),
    };
  }

  neurolink.registerTools(tools);
}

// Load from configuration
const toolConfigs = await loadToolConfigs();
await registerDynamicTools(toolConfigs);
```

##  Performance Considerations

### 1. Timeout Handling

```typescript
execute: async (args) => {
  const timeout = new Promise((_, reject) =>
    setTimeout(() => reject(new Error("Tool timeout")), 30000),
  );

  const operation = performOperation(args);

  return Promise.race([operation, timeout]);
};
```

### 2. Caching

```typescript
const cache = new Map();

execute: async (args) => {
  const cacheKey = JSON.stringify(args);

  if (cache.has(cacheKey)) {
    return cache.get(cacheKey);
  }

  const result = await expensiveOperation(args);
  cache.set(cacheKey, result);

  return result;
};
```

### 3. Batch Operations

```typescript
neurolink.registerTool("batchProcess", {
  description: "Process multiple items efficiently",
  parameters: z.object({
    items: z.array(z.any()),
    operation: z.string(),
  }),
  execute: async ({ items, operation }) => {
    // Process in parallel with concurrency limit
    const results = await pLimit(5)(
      items.map((item) => () => processItem(item, operation)),
    );

    return {
      processed: results.length,
      results,
    };
  },
});
```

##  Security Considerations

### Input Sanitization

```typescript
parameters: z.object({
  sqlQuery: z
    .string()
    .max(1000)
    .refine(
      (query) => !query.match(/DROP|DELETE|TRUNCATE/i),
      "Destructive operations not allowed",
    ),
});
```

### Permission Checking

```typescript
execute: async (args, context) => {
  // Check permissions before execution
  if (!hasPermission(context.user, "database.write")) {
    return { error: "Insufficient permissions" };
  }

  return performDatabaseOperation(args);
};
```

### Rate Limiting

```typescript
const userLimits = new Map();

execute: async (args, context) => {
  const userId = context.user?.id || "anonymous";
  const userCalls = userLimits.get(userId) || 0;

  if (userCalls >= 100) {
    return { error: "Rate limit exceeded" };
  }

  userLimits.set(userId, userCalls + 1);

  // Reset counters periodically
  setTimeout(() => userLimits.delete(userId), 3600000);

  return performOperation(args);
};
```

##  Complete Example

Here's a complete example combining multiple concepts:

```typescript

const neurolink = new NeuroLink();

// Define a comprehensive customer service tool set
neurolink.registerTools({
  searchCustomer: {
    description: "Search for customer by various criteria",
    parameters: z.object({
      query: z.string(),
      searchBy: z.enum(["email", "name", "phone", "orderId"]),
      limit: z.number().min(1).max(50).default(10),
    }),
    execute: async ({ query, searchBy, limit }) => {
      const db = getDatabase();
      const results = await db.customers.search({
        [searchBy]: query,
        limit,
      });

      return {
        found: results.length,
        customers: results.map((c) => ({
          id: c.id,
          name: c.name,
          email: c.email,
          totalOrders: c.orderCount,
          memberSince: c.createdAt,
        })),
      };
    },
  },

  getOrderHistory: {
    description: "Get order history for a customer",
    parameters: z.object({
      customerId: z.string().uuid(),
      status: z
        .enum(["all", "pending", "completed", "cancelled"])
        .default("all"),
      limit: z.number().default(10),
    }),
    execute: async ({ customerId, status, limit }) => {
      const orders = await fetchOrders(customerId, { status, limit });

      return {
        customerId,
        orderCount: orders.length,
        orders: orders.map((o) => ({
          orderId: o.id,
          date: o.createdAt,
          status: o.status,
          total: o.total,
          items: o.items.length,
        })),
      };
    },
  },

  processRefund: {
    description: "Process refund for an order",
    parameters: z.object({
      orderId: z.string().uuid(),
      amount: z.number().positive(),
      reason: z.string().min(10),
      notify: z.boolean().default(true),
    }),
    execute: async ({ orderId, amount, reason, notify }) => {
      // Validate order exists and is refundable
      const order = await getOrder(orderId);
      if (!order) {
        return { success: false, error: "Order not found" };
      }

      if (order.status !== "completed") {
        return {
          success: false,
          error: "Only completed orders can be refunded",
        };
      }

      if (amount > order.total) {
        return { success: false, error: "Refund amount exceeds order total" };
      }

      // Process refund
      const refund = await processPaymentRefund({
        orderId,
        amount,
        reason,
      });

      // Send notification
      if (notify) {
        await sendRefundNotification(order.customerId, refund);
      }

      return {
        success: true,
        refundId: refund.id,
        amount: refund.amount,
        status: "processed",
      };
    },
  },
});

// Now you can use natural language to access these tools
const result = await neurolink.generate({
  input: {
    text: "Find all orders for customer john@example.com and process a $50 refund for their most recent completed order due to damaged item",
  },
  provider: "openai",
});

// The AI will:
// 1. Call searchCustomer({ query: "john@example.com", searchBy: "email" })
// 2. Call getOrderHistory({ customerId: , status: "completed" })
// 3. Call processRefund({ orderId: , amount: 50, reason: "damaged item" })
```

##  MCP Server Integration

Beyond simple tool registration, NeuroLink SDK supports adding complete MCP (Model Context Protocol) servers for more complex tool ecosystems.

### Adding In-Memory MCP Servers

```typescript
// Add a complete MCP server with multiple related tools
await neurolink.addInMemoryMCPServer("hr-management", {
  server: {
    title: "HR Management Server",
    description: "Comprehensive HR tools for employee management",
    tools: {
      createEmployee: {
        description: "Create a new employee record with full details",
        execute: async (params: {
          name: string;
          department: string;
          role: string;
          salary: number;
          startDate: string;
        }) => {
          return {
            success: true,
            data: {
              employeeId: `EMP-${Date.now()}`,
              name: params.name,
              department: params.department,
              role: params.role,
              salary: params.salary,
              startDate: params.startDate,
              status: "active",
              createdAt: new Date().toISOString(),
            },
          };
        },
      },

      calculateSalary: {
        description: "Calculate total salary including bonuses and deductions",
        execute: async (params: {
          baseSalary: number;
          bonuses: number;
          deductions: number;
          taxRate: number;
        }) => {
          const grossSalary =
            params.baseSalary + params.bonuses - params.deductions;
          const netSalary = grossSalary * (1 - params.taxRate);

          return {
            success: true,
            data: {
              baseSalary: params.baseSalary,
              bonuses: params.bonuses,
              deductions: params.deductions,
              grossSalary,
              taxAmount: grossSalary * params.taxRate,
              netSalary,
              calculatedAt: new Date().toISOString(),
            },
          };
        },
      },

      getEmployeeStats: {
        description: "Get comprehensive employee statistics and analytics",
        execute: async (params: { department?: string; role?: string }) => {
          // Simulated analytics data
          return {
            success: true,
            data: {
              totalEmployees: 150,
              byDepartment: {
                engineering: 60,
                sales: 35,
                marketing: 25,
                hr: 15,
                finance: 15,
              },
              averageSalary: 75000,
              averageTenure: "2.5 years",
              openPositions: 8,
              lastUpdated: new Date().toISOString(),
            },
          };
        },
      },
    },
  },
  category: "hr-management",
  metadata: {
    version: "1.0.0",
    author: "Your Company",
    description: "Complete HR management solution",
  },
});
```

### Advanced MCP Server Examples

#### 1. Data Analytics Server

```typescript
await neurolink.addInMemoryMCPServer("analytics-server", {
  server: {
    title: "Data Analytics Server",
    description: "Advanced data processing and analytics tools",
    tools: {
      analyzeDataset: {
        description: "Perform statistical analysis on datasets",
        execute: async (params: {
          data: number[];
          analysisType: "descriptive" | "correlation" | "regression";
        }) => {
          const { data, analysisType } = params;

          switch (analysisType) {
            case "descriptive":
              const sum = data.reduce((a, b) => a + b, 0);
              const mean = sum / data.length;
              const sortedData = [...data].sort((a, b) => a - b);
              const median = sortedData[Math.floor(data.length / 2)];
              const variance =
                data.reduce((acc, val) => acc + Math.pow(val - mean, 2), 0) /
                data.length;
              const stdDev = Math.sqrt(variance);

              return {
                success: true,
                data: {
                  count: data.length,
                  sum,
                  mean,
                  median,
                  min: Math.min(...data),
                  max: Math.max(...data),
                  variance,
                  standardDeviation: stdDev,
                  range: Math.max(...data) - Math.min(...data),
                },
              };

            case "correlation":
              // Simplified correlation analysis
              return {
                success: true,
                data: {
                  correlationMatrix: "Generated correlation matrix",
                  strongCorrelations: [],
                  analysisNote: "Correlation analysis completed",
                },
              };

            default:
              return { success: false, error: "Unknown analysis type" };
          }
        },
      },

      generateReport: {
        description: "Generate comprehensive data reports with visualizations",
        execute: async (params: {
          title: string;
          data: any[];
          reportType: "summary" | "detailed" | "executive";
        }) => {
          return {
            success: true,
            data: {
              reportId: `RPT-${Date.now()}`,
              title: params.title,
              type: params.reportType,
              dataPoints: params.data.length,
              sections: [
                "Executive Summary",
                "Key Metrics",
                "Detailed Analysis",
                "Recommendations",
              ],
              generatedAt: new Date().toISOString(),
              status: "completed",
            },
          };
        },
      },
    },
  },
});
```

#### 2. Workflow Automation Server

```typescript
await neurolink.addInMemoryMCPServer("workflow-server", {
  server: {
    title: "Workflow Automation Server",
    description: "Tools for creating and managing automated workflows",
    tools: {
      createWorkflow: {
        description: "Create a new automated workflow with multiple steps",
        execute: async (params: {
          name: string;
          steps: Array;
          triggers: string[];
        }) => {
          return {
            success: true,
            data: {
              workflowId: `WF-${Date.now()}`,
              name: params.name,
              steps: params.steps,
              triggers: params.triggers,
              status: "created",
              nextExecution: null,
              createdAt: new Date().toISOString(),
            },
          };
        },
      },

      executeWorkflow: {
        description: "Execute a workflow with specific input data",
        execute: async (params: {
          workflowId: string;
          inputData: any;
          executionMode: "test" | "production";
        }) => {
          return {
            success: true,
            data: {
              executionId: `EXE-${Date.now()}`,
              workflowId: params.workflowId,
              mode: params.executionMode,
              status: "running",
              progress: 0,
              startedAt: new Date().toISOString(),
              estimatedCompletion: new Date(Date.now() + 300000).toISOString(), // 5 minutes
            },
          };
        },
      },

      getWorkflowStatus: {
        description: "Get current status and progress of workflow execution",
        execute: async (params: { workflowId: string }) => {
          return {
            success: true,
            data: {
              workflowId: params.workflowId,
              status: "in-progress",
              currentStep: "Data Processing",
              stepsCompleted: 3,
              totalSteps: 8,
              progress: 37.5,
              timeElapsed: "2m 15s",
              estimatedTimeRemaining: "3m 45s",
              lastUpdated: new Date().toISOString(),
            },
          };
        },
      },
    },
  },
});
```

#### 3. Content Generation Server

```typescript
await neurolink.addInMemoryMCPServer("content-server", {
  server: {
    title: "Content Generation Server",
    description: "Advanced content creation and management tools",
    tools: {
      generateSampleText: {
        description: "Generate sample text content for testing and development",
        execute: async (params: {
          topic: string;
          length: "short" | "medium" | "long";
          style: "formal" | "casual" | "technical";
        }) => {
          const samples = {
            short: `A brief overview of ${params.topic}. This content covers essential information in a ${params.style} style.`,
            medium: `This is a comprehensive introduction to ${params.topic}. Written in a ${params.style} style, it covers fundamental concepts, practical applications, and key considerations for understanding ${params.topic} in various contexts.`,
            long: `This extensive exploration of ${params.topic} provides detailed analysis written in a ${params.style} style. The content examines multiple perspectives, methodologies, and real-world applications related to ${params.topic}. By thoroughly investigating various aspects and implications, readers gain comprehensive understanding of ${params.topic} and its significance across different fields and industries.`,
          };

          return {
            success: true,
            data: {
              text: samples[params.length],
              topic: params.topic,
              length: params.length,
              style: params.style,
              wordCount: samples[params.length].split(" ").length,
              characterCount: samples[params.length].length,
              generatedAt: new Date().toISOString(),
            },
          };
        },
      },

      analyzeContent: {
        description: "Analyze text content for various metrics and insights",
        execute: async (params: {
          text: string;
          analysisTypes: Array;
        }) => {
          const results: any = {};

          params.analysisTypes.forEach((type) => {
            switch (type) {
              case "sentiment":
                const positiveWords = ["good", "great", "excellent", "amazing"];
                const negativeWords = ["bad", "terrible", "awful", "poor"];
                const words = params.text.toLowerCase().split(" ");
                const positive = words.filter((w) =>
                  positiveWords.includes(w),
                ).length;
                const negative = words.filter((w) =>
                  negativeWords.includes(w),
                ).length;

                results.sentiment = {
                  score: positive - negative,
                  sentiment:
                    positive > negative
                      ? "positive"
                      : negative > positive
                        ? "negative"
                        : "neutral",
                  confidence: Math.min(
                    (Math.abs(positive - negative) / words.length) * 10,
                    1,
                  ),
                };
                break;

              case "readability":
                const sentences = params.text.split(/[.!?]+/).length;
                const wordCount = params.text.split(" ").length;
                const avgWordsPerSentence = wordCount / sentences;

                results.readability = {
                  wordCount,
                  sentenceCount: sentences,
                  avgWordsPerSentence,
                  readabilityLevel:
                    avgWordsPerSentence  = {};
                const meaningfulWords = params.text
                  .toLowerCase()
                  .replace(/[^\w\s]/g, "")
                  .split(" ")
                  .filter((w) => w.length > 3);

                meaningfulWords.forEach((word) => {
                  wordFreq[word] = (wordFreq[word] || 0) + 1;
                });

                results.keywords = Object.entries(wordFreq)
                  .sort(([, a], [, b]) => b - a)
                  .slice(0, 10)
                  .map(([word, freq]) => ({ word, frequency: freq }));
                break;
            }
          });

          return {
            success: true,
            data: {
              textLength: params.text.length,
              analysisTypes: params.analysisTypes,
              results,
              analyzedAt: new Date().toISOString(),
            },
          };
        },
      },
    },
  },
});
```

### Mixed Tool Ecosystem Example

```typescript

const neurolink = new NeuroLink();

// 1. Register simple custom tools (extending existing functionality)
neurolink.registerTool(
  "enhancedCalculator",
  createTool({
    description: "Enhanced calculator with scientific and financial functions",
    execute: (params: {
      expression: string;
      mode: "basic" | "scientific" | "financial";
    }) => {
      if (params.mode === "scientific" && params.expression.includes("sqrt")) {
        const num = parseFloat(
          params.expression.replace("sqrt(", "").replace(")", ""),
        );
        return { result: Math.sqrt(num), enhanced: true, mode: params.mode };
      }

      if (
        params.mode === "financial" &&
        params.expression.includes("compound")
      ) {
        // Parse: compound(principal, rate, time)
        const match = params.expression.match(
          /compound\((\d+),\s*([\d.]+),\s*(\d+)\)/,
        );
        if (match) {
          const [, principal, rate, time] = match.map(Number);
          const result = principal * Math.pow(1 + rate / 100, time);
          return {
            result,
            enhanced: true,
            mode: params.mode,
            calculation: "compound_interest",
          };
        }
      }

      // Use a safe, restricted math expression evaluator for security
      const {
        create,
        addDependencies,
        subtractDependencies,
        multiplyDependencies,
        divideDependencies,
        powDependencies,
        sqrtDependencies,
        absDependencies,
      } from "mathjs";

      // Create restricted math environment with only specific functions for security
      const dependencies = {
        addDependencies,
        subtractDependencies,
        multiplyDependencies,
        divideDependencies,
        powDependencies,
        sqrtDependencies,
        absDependencies,
      };

      const math = create(dependencies, {
        matrix: "Array",
        number: "number",
        precision: 64,
      });

      // Additional sanitization for basic mathematical expressions
      const sanitizedExpression = params.expression.replace(
        /[^0-9+\-*/().\s]/g,
        "",
      );
      if (sanitizedExpression !== params.expression) {
        return {
          error: "Expression contains invalid characters",
          enhanced: false,
          mode: params.mode,
        };
      }

      try {
        const result = math.evaluate(sanitizedExpression);
        return { result, enhanced: false, mode: params.mode };
      } catch (error) {
        return {
          error: `Mathematical expression failed: ${error.message || "Invalid expression"}`,
          enhanced: false,
          mode: params.mode,
        };
      }
    },
  }),
);

// 2. Add complete MCP servers (new functionality domains)
await neurolink.addInMemoryMCPServer("business-intelligence", {
  server: {
    title: "Business Intelligence Server",
    tools: {
      generateKPIReport: {
        description: "Generate comprehensive KPI reports for business metrics",
        execute: async (params: {
          metrics: string[];
          timeRange: string;
          department?: string;
        }) => {
          return {
            success: true,
            data: {
              reportId: `KPI-${Date.now()}`,
              metrics: params.metrics,
              timeRange: params.timeRange,
              department: params.department || "All",
              kpis: {
                revenue: "$1.2M",
                growth: "+15%",
                customerSatisfaction: "94%",
                efficiency: "87%",
              },
              trends: ["Revenue increasing", "Customer satisfaction stable"],
              recommendations: [
                "Focus on efficiency improvements",
                "Expand successful programs",
              ],
              generatedAt: new Date().toISOString(),
            },
          };
        },
      },

      predictTrends: {
        description: "Predict business trends using historical data",
        execute: async (params: {
          dataPoints: number[];
          predictionPeriod: number;
          algorithm: "linear" | "exponential" | "seasonal";
        }) => {
          // Simplified prediction logic
          const trend =
            params.dataPoints[params.dataPoints.length - 1] >
            params.dataPoints[0]
              ? "upward"
              : "downward";
          const avgGrowth =
            (params.dataPoints[params.dataPoints.length - 1] -
              params.dataPoints[0]) /
            params.dataPoints.length;

          return {
            success: true,
            data: {
              algorithm: params.algorithm,
              trend,
              predictedGrowth: avgGrowth,
              confidence: 0.85,
              predictions: Array.from(
                { length: params.predictionPeriod },
                (_, i) =>
                  params.dataPoints[params.dataPoints.length - 1] +
                  avgGrowth * (i + 1),
              ),
              generatedAt: new Date().toISOString(),
            },
          };
        },
      },
    },
  },
});

// 3. Use the mixed ecosystem
const comprehensiveResult = await neurolink.generate({
  input: {
    text: `Calculate compound interest for $10000 at 5% for 3 years, then generate a KPI report
           for revenue metrics over the last quarter, and predict trends for the next 6 months
           using the data points [100, 120, 115, 130, 125, 140]`,
  },
  provider: "google-ai",
  maxTokens: 2000,
});

// The AI will automatically:
// 1. Use enhancedCalculator for compound interest: compound(10000, 5, 3)
// 2. Use generateKPIReport for business metrics
// 3. Use predictTrends for forecasting
// 4. Synthesize all results into a comprehensive response

console.log("AI Response:", comprehensiveResult.content);
console.log("Tools Used:", comprehensiveResult.toolsUsed);
```

### Tool Discovery and Management

```typescript
// Get comprehensive view of all available tools
const allTools = await neurolink.getAllAvailableTools();

// Group tools by source
const toolsBySource = allTools.reduce(
  (acc, tool) => {
    const source = tool.serverId || "unknown";
    acc[source] = (acc[source] || 0) + 1;
    return acc;
  },
  {} as Record,
);

console.log("Tool ecosystem summary:");
console.log("• Total tools available:", allTools.length);
console.log("• Tools by source:", toolsBySource);

// Get custom tools registered via registerTool()
const customTools = neurolink.getCustomTools();
console.log("• Custom tools registered:", customTools.size);

// Get in-memory MCP servers added via addInMemoryMCPServer()
const mcpServers = neurolink.getInMemoryServers();
console.log("• In-memory MCP servers:", mcpServers.size);

// Execute tools from any source using unified API
const timeResult = await neurolink.executeTool("getCurrentTime");
const calculationResult = await neurolink.executeTool("enhancedCalculator", {
  expression: "compound(5000, 4.5, 2)",
  mode: "financial",
});
const reportResult = await neurolink.executeTool("generateKPIReport", {
  metrics: ["revenue", "growth"],
  timeRange: "Q1-2024",
});

console.log("Tool execution results:");
console.log("• Built-in tool:", timeResult.data.time);
console.log("• Custom tool:", calculationResult.result);
console.log("• MCP server tool:", reportResult.data.reportId);
```

### Best Practices for MCP Integration

#### 1. Organize Tools by Domain

```typescript
// Group related tools into themed MCP servers
await neurolink.addInMemoryMCPServer("user-management", {
  server: {
    title: "User Management Server",
    tools: {
      createUser: {
        /* ... */
      },
      updateUser: {
        /* ... */
      },
      deleteUser: {
        /* ... */
      },
      getUserProfile: {
        /* ... */
      },
    },
  },
});

await neurolink.addInMemoryMCPServer("order-processing", {
  server: {
    title: "Order Processing Server",
    tools: {
      createOrder: {
        /* ... */
      },
      updateOrderStatus: {
        /* ... */
      },
      calculateShipping: {
        /* ... */
      },
      processPayment: {
        /* ... */
      },
    },
  },
});
```

#### 2. Consistent Error Handling

```typescript
execute: async (params) => {
  try {
    const result = await performOperation(params);
    return {
      success: true,
      data: result,
    };
  } catch (error) {
    return {
      success: false,
      error: error.message,
      code: error.code || "OPERATION_FAILED",
      timestamp: new Date().toISOString(),
    };
  }
};
```

#### 3. Comprehensive Metadata

```typescript
await neurolink.addInMemoryMCPServer("server-id", {
  server: {
    title: "Human-Readable Server Name",
    description: "Detailed description of server purpose",
    tools: {
      /* ... */
    },
  },
  category: "business-logic", // Group similar servers
  metadata: {
    version: "2.1.0",
    author: "Your Team",
    lastUpdated: "2024-01-15",
    documentation: "https://docs.yourcompany.com/mcp-servers",
    supportContact: "support@yourcompany.com",
  },
});
```

##  Built-in Tools Reference

NeuroLink provides **6 core tools** that work across all AI providers with zero configuration:

### getCurrentTime {#getCurrentTime}

Get the current date and time in ISO 8601 format.

**Parameters:** None

**Returns:** Current date/time string

**Usage:**

```typescript
const result = await neurolink.generate({
  input: { text: "What time is it?" },
});
// AI can call getCurrentTime() automatically
```

**Use Cases:**

- Timestamping operations
- Time-based logic
- Scheduling and reminders
- Log entries

### writeFile {#writeFile}

Write content to a file on the filesystem.

**Parameters:**

- `path` (string): File path to write to
- `content` (string): Content to write

**Returns:** Success confirmation

**Usage:**

```typescript
const result = await neurolink.generate({
  input: { text: "Create a file called output.txt with 'Hello World'" },
});
// AI can call writeFile({ path: "output.txt", content: "Hello World" })
```

**Use Cases:**

- Report generation
- Configuration file creation
- Data export
- Log file writing

**Security:** Directory creation automatic, overwrites existing files

---

### listDirectory {#listDirectory}

List files and directories in a specified path.

**Parameters:**

- `path` (string): Directory path to list

**Returns:** Array of file/directory names

**Usage:**

```typescript
const result = await neurolink.generate({
  input: { text: "What files are in the current directory?" },
});
// AI can call listDirectory({ path: "." })
```

**Use Cases:**

- File system exploration
- Directory traversal
- File discovery
- Project structure analysis

**Returns:** File names only (not full paths)

---

### calculateMath {#calculateMath}

Perform mathematical calculations and expressions.

**Parameters:**

- `expression` (string): Math expression to evaluate

**Returns:** Calculation result (number)

**Usage:**

```typescript
const result = await neurolink.generate({
  input: { text: "What is 15% of 240?" },
});
// AI can call calculateMath({ expression: "240 * 0.15" })
```

**Supported Operations:**

- Basic arithmetic: `+`, `-`, `*`, `/`
- Exponentiation: `^`, `**`
- Parentheses: `(`, `)`
- Functions: `sqrt()`, `sin()`, `cos()`, `log()`, etc.
- Constants: `pi`, `e`

**Powered by:** [math.js](https://mathjs.org)

---

### websearch / websearchGrounding {#websearch}

Search the web using Google Vertex AI's grounding feature.

**Parameters:**

- `query` (string): Search query

**Returns:** Search results with citations

**Requirements:**

- ✅ Google Vertex AI configured
- ✅ Grounding API enabled
- ⚠️ Only works with Vertex AI provider

**Usage:**

```typescript
const result = await neurolink.generate({
  input: { text: "Search for latest AI developments" },
  provider: "vertex", // Must use Vertex AI
});
// AI can call websearchGrounding({ query: "latest AI developments" })
```

**Use Cases:**

- Real-time information retrieval
- Fact verification
- Current events
- Research assistance

**Limitations:** Requires Google Vertex AI credentials and enabled API

---

### Enabling/Disabling Built-in Tools

**Disable all tools:**

```typescript
const result = await neurolink.generate({
  input: { text: "Pure text generation" },
  disableTools: true,
});
```

**CLI usage:**

```bash
# With tools (default)
neurolink generate "What time is it?"

# Without tools
neurolink generate "Pure text" --disable-tools
```

**Note:** Built-in tools are automatically available unless explicitly disabled.

---

##  Additional Resources

- [API Reference](/docs/sdk/api-reference)
  **Feature Guides:**

- [Human-in-the-Loop (HITL)](/docs/features/hitl) - Add user approval checkpoints to custom tools
- [Guardrails Middleware](/docs/features/guardrails) - Filter tool outputs for safety

**MCP Integration:**

- [MCP Integration Guide](/docs/mcp/integration)
- [MCP Server Catalog](/docs/guides/mcp/server-catalog)
- [Advanced MCP Testing Guide](/docs/mcp/testing)

---

**Start building powerful AI applications with custom tools and MCP servers today! **

---

## ️ Framework Integration Guide

<!-- Source: sdk/framework-integration.md -->

# ️ Framework Integration Guide

NeuroLink integrates seamlessly with popular web frameworks. Here are complete examples for common use cases.

## SvelteKit Integration

### API Route (`src/routes/api/chat/+server.ts`)

```typescript

export const POST: RequestHandler = async ({ request }) => {
  try {
    const { message } = await request.json();

    const provider = createBestAIProvider();
    const result = await provider.stream({
      input: { text: message },
      temperature: 0.7,
      maxTokens: 1000,
    });

    // Manually create ReadableStream from AsyncIterable
    const readable = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of result.stream) {
            if (chunk && typeof chunk === "object" && "content" in chunk) {
              controller.enqueue(new TextEncoder().encode(chunk.content));
            }
          }
          controller.close();
        } catch (error) {
          controller.error(error);
        }
      },
    });

    return new Response(readable, {
      headers: {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        Connection: "keep-alive",
      },
    });
  } catch (error) {
    return new Response(JSON.stringify({ error: error.message }), {
      status: 500,
      headers: { "Content-Type": "application/json" },
    });
  }
};
```

### Svelte Component (`src/routes/chat/+page.svelte`)

```svelte

  let message = '';
  let response = '';
  let isLoading = false;

  async function sendMessage() {
    if (!message.trim()) return;

    isLoading = true;
    response = '';

    try {
      const res = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message })
      });

      if (!res.body) throw new Error('No response');

      const reader = res.body.getReader();
      const decoder = new TextDecoder();

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        response += decoder.decode(value, { stream: true });
      }
    } catch (error) {
      response = `Error: ${error.message}`;
    } finally {
      isLoading = false;
    }
  }


    {isLoading ? 'Sending...' : 'Send'}


  {#if response}
    {response}
  {/if}

  .chat {
    max-width: 600px;
    margin: 2rem auto;
    padding: 1rem;
  }

  input {
    width: 70%;
    padding: 0.5rem;
    border: 1px solid #ccc;
    border-radius: 4px;
  }

  button {
    width: 25%;
    padding: 0.5rem;
    margin-left: 0.5rem;
    background: #007acc;
    color: white;
    border: none;
    border-radius: 4px;
    cursor: pointer;
  }

  button:disabled {
    opacity: 0.5;
    cursor: not-allowed;
  }

  .response {
    margin-top: 1rem;
    padding: 1rem;
    background: #f5f5f5;
    border-radius: 4px;
    white-space: pre-wrap;
  }

```

### Environment Configuration

```bash
# .env
OPENAI_API_KEY="sk-your-key"
AWS_ACCESS_KEY_ID="your-aws-key"
AWS_SECRET_ACCESS_KEY="your-aws-secret"
# Add other provider keys as needed
```

### Dynamic Model Integration (v1.8.0+)

#### Smart Model Selection API Route

```typescript

export const POST: RequestHandler = async ({ request }) => {
  try {
    const { message, useCase, optimizeFor } = await request.json();

    const factory = new AIProviderFactory();

    // Use dynamic model selection based on use case
    const provider = await factory.createProvider({
      provider: "auto",
      capability: useCase === "vision" ? "vision" : "general",
      optimizeFor: optimizeFor || "quality", // 'cost', 'speed', or 'quality'
    });

    const result = await provider.stream({
      input: { text: message },
      temperature: 0.7,
      maxTokens: 1000,
    });

    // Manually create ReadableStream from AsyncIterable
    const readable = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of result.stream) {
            if (chunk && typeof chunk === "object" && "content" in chunk) {
              controller.enqueue(new TextEncoder().encode(chunk.content));
            }
          }
          controller.close();
        } catch (error) {
          controller.error(error);
        }
      },
    });

    return new Response(readable, {
      headers: {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        Connection: "keep-alive",
        "X-Model-Used": result.model,
        "X-Provider-Used": result.provider,
      },
    });
  } catch (error) {
    return new Response(JSON.stringify({ error: error.message }), {
      status: 500,
      headers: { "Content-Type": "application/json" },
    });
  }
};
```

#### Cost-Optimized Component

```svelte

  let message = '';
  let response = '';
  let isLoading = false;
  let optimizeFor = 'quality'; // 'cost', 'speed', 'quality'
  let useCase = 'general';     // 'general', 'vision', 'code'
  let modelUsed = '';
  let providerUsed = '';

  async function sendMessage() {
    if (!message.trim()) return;

    isLoading = true;
    response = '';

    try {
      const res = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message,
          useCase,
          optimizeFor
        })
      });

      // Extract model and provider info from headers
      modelUsed = res.headers.get('X-Model-Used') || '';
      providerUsed = res.headers.get('X-Provider-Used') || '';

      if (!res.body) throw new Error('No response');

      const reader = res.body.getReader();
      const decoder = new TextDecoder();

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        response += decoder.decode(value, { stream: true });
      }
    } catch (error) {
      response = `Error: ${error.message}`;
    } finally {
      isLoading = false;
    }
  }


      Use Case:

        General
        Coding
        Vision


      Optimize For:

        Quality
        Speed
        Cost


    {isLoading ? 'Sending...' : 'Send'}


  {#if response}


        Model: {modelUsed} | Provider: {providerUsed}

      {response}

  {/if}

  .smart-chat {
    max-width: 700px;
    margin: 2rem auto;
    padding: 1rem;
  }

  .options {
    display: flex;
    gap: 1rem;
    margin-bottom: 1rem;
  }

  .options label {
    display: flex;
    flex-direction: column;
    gap: 0.25rem;
  }

  .options select {
    padding: 0.25rem;
    border: 1px solid #ccc;
    border-radius: 4px;
  }

  .model-info {
    font-size: 0.8rem;
    color: #666;
    margin-bottom: 0.5rem;
    font-family: monospace;
  }

  .content {
    white-space: pre-wrap;
  }

```

## Next.js Integration

### App Router API (`app/api/ai/route.ts`)

```typescript

export async function POST(request: NextRequest) {
  try {
    const { prompt, ...options } = await request.json();

    const provider = createBestAIProvider();
    const result = await provider.generate({
      input: { text: prompt },
      temperature: 0.7,
      maxTokens: 1000,
      ...options,
    });

    return NextResponse.json({
      text: result.text,
      provider: result.provider,
      usage: result.usage,
    });
  } catch (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
}

// Streaming endpoint
export async function PUT(request: NextRequest) {
  try {
    const { prompt } = await request.json();

    const provider = createBestAIProvider();
    const result = await provider.stream({
      input: { text: prompt },
    });

    // Manually create ReadableStream from AsyncIterable
    const readable = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of result.stream) {
            if (chunk && typeof chunk === "object" && "content" in chunk) {
              controller.enqueue(new TextEncoder().encode(chunk.content));
            }
          }
          controller.close();
        } catch (error) {
          controller.error(error);
        }
      },
    });

    return new Response(readable, {
      headers: {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        Connection: "keep-alive",
      },
    });
  } catch (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
}
```

### React Component (`components/AIChat.tsx`)

```typescript
'use client';

type AIResponse = {
  text: string;
  provider: string;
  usage?: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}

export default function AIChat() {
  const [prompt, setPrompt] = useState('');
  const [result, setResult] = useState(null);
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState('');

  const generate = async () => {
    if (!prompt.trim()) return;

    setLoading(true);
    setError('');
    setResult(null);

    try {
      const response = await fetch('/api/ai', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt })
      });

      const data = await response.json();

      if (response.ok) {
        setResult(data);
      } else {
        setError(data.error || 'An error occurred');
      }
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Network error');
    } finally {
      setLoading(false);
    }
  };

  const handleKeyPress = (e: React.KeyboardEvent) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      generate();
    }
  };

  return (

      AI Chat with NeuroLink


         setPrompt(e.target.value)}
          onKeyPress={handleKeyPress}
          placeholder="Enter your prompt here..."
          className="flex-1 p-3 border border-gray-300 rounded-lg resize-none focus:outline-none focus:ring-2 focus:ring-blue-500"
          rows={3}
        />

          {loading ? 'Generating...' : 'Generate'}


      {error && (

          Error: {error}

      )}

      {result && (


            Response:
            {result.text}


            Provider: {result.provider}
            {result.usage && (
              Tokens: {result.usage.totalTokens}
            )}


      )}

  );
}
```

### Streaming Component (`components/AIStreamChat.tsx`)

```typescript
'use client';

export default function AIStreamChat() {
  const [prompt, setPrompt] = useState('');
  const [response, setResponse] = useState('');
  const [loading, setLoading] = useState(false);

  const streamGenerate = async () => {
    if (!prompt.trim()) return;

    setLoading(true);
    setResponse('');

    try {
      const res = await fetch('/api/ai', {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt })
      });

      if (!res.body) throw new Error('No response stream');

      const reader = res.body.getReader();
      const decoder = new TextDecoder();

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        const chunk = decoder.decode(value, { stream: true });
        setResponse(prev => prev + chunk);
      }
    } catch (error) {
      setResponse(`Error: ${error.message}`);
    } finally {
      setLoading(false);
    }
  };

  return (

      Streaming AI Chat


         setPrompt(e.target.value)}
          placeholder="Enter your prompt..."
          className="flex-1 p-3 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500"
        />

          {loading ? ' Streaming...' : '▶️ Stream'}


      {response && (

          Streaming Response:
          {response}
          {loading && ▋}

      )}

  );
}
```

## Express.js Integration

### Basic Server Setup

```typescript

const app = express();
app.use(express.json());

// Simple generation endpoint
app.post("/api/generate", async (req, res) => {
  try {
    const { prompt, options = {} } = req.body;

    const provider = createBestAIProvider();
    const result = await provider.generate({
      input: { text: prompt },
      ...options,
    });

    res.json({
      success: true,
      text: result.text,
      provider: result.provider,
      usage: result.usage,
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});

// Streaming endpoint
app.post("/api/stream", async (req, res) => {
  try {
    const { prompt } = req.body;

    const provider = createBestAIProvider();
    const result = await provider.stream({
      input: { text: prompt },
    });

    res.setHeader("Content-Type", "text/plain");
    res.setHeader("Cache-Control", "no-cache");

    for await (const chunk of result.stream) {
      if (chunk && typeof chunk === "object" && "content" in chunk) {
        res.write(chunk.content);
      }
    }
    res.end();
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

// Provider status endpoint
app.get("/api/status", async (req, res) => {
  const providers = ["openai", "bedrock", "vertex"];
  const status = {};

  for (const providerName of providers) {
    try {
      const provider = AIProviderFactory.createProvider(providerName);
      const start = Date.now();

      await provider.generate({
        input: { text: "test" },
        maxTokens: 1,
      });

      status[providerName] = {
        available: true,
        responseTime: Date.now() - start,
      };
    } catch (error) {
      status[providerName] = {
        available: false,
        error: error.message,
      };
    }
  }

  res.json(status);
});

app.listen(9876, () => {
  console.log("Server running on http://localhost:9876");
});
```

### Advanced Express Integration with Middleware

```typescript

const app = express();
app.use(express.json());

// Middleware for AI provider
app.use("/api/ai", (req, res, next) => {
  req.aiProvider = createBestAIProvider();
  next();
});

// Rate limiting middleware
const rateLimitMap = new Map();
app.use("/api/ai", (req, res, next) => {
  const ip = req.ip;
  const now = Date.now();
  const requests = rateLimitMap.get(ip) || [];

  // Allow 10 requests per minute
  const recentRequests = requests.filter((time) => now - time = 10) {
    return res.status(429).json({ error: "Rate limit exceeded" });
  }

  recentRequests.push(now);
  rateLimitMap.set(ip, recentRequests);
  next();
});

// Batch processing endpoint
app.post("/api/ai/batch", async (req, res) => {
  try {
    const { prompts, options = {} } = req.body;

    if (!Array.isArray(prompts) || prompts.length === 0) {
      return res.status(400).json({ error: "Prompts array required" });
    }

    const results = [];
    for (const prompt of prompts) {
      try {
        const result = await req.aiProvider.generate({
          input: { text: prompt },
          ...options,
        });
        results.push({ success: true, ...result });
      } catch (error) {
        results.push({ success: false, error: error.message });
      }

      // Add delay to prevent rate limiting
      if (results.length  setTimeout(resolve, 1000));
      }
    }

    res.json({ results });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});
```

## Fastify Integration

Fastify is a high-performance web framework for Node.js. NeuroLink integrates smoothly with Fastify's async-first architecture.

### Basic Server Setup

```typescript

const fastify = Fastify({ logger: true });

// Simple generation endpoint
fastify.post("/api/generate", async (request, reply) => {
  try {
    const { prompt, options = {} } = request.body as {
      prompt: string;
      options?: Record;
    };

    const provider = createBestAIProvider();
    const result = await provider.generate({
      input: { text: prompt },
      ...options,
    });

    return {
      success: true,
      text: result.text,
      provider: result.provider,
      usage: result.usage,
    };
  } catch (error) {
    reply.status(500);
    return {
      success: false,
      error: error instanceof Error ? error.message : "Unknown error",
    };
  }
});

// Streaming endpoint
fastify.post("/api/stream", async (request, reply) => {
  try {
    const { prompt } = request.body as { prompt: string };

    const provider = createBestAIProvider();
    const result = await provider.stream({
      input: { text: prompt },
    });

    reply.raw.writeHead(200, {
      "Content-Type": "text/plain",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    });

    for await (const chunk of result.stream) {
      if (chunk && typeof chunk === "object" && "content" in chunk) {
        reply.raw.write(chunk.content);
      }
    }

    reply.raw.end();
  } catch (error) {
    reply.status(500);
    return { error: error instanceof Error ? error.message : "Unknown error" };
  }
});

// Provider status endpoint
fastify.get("/api/status", async () => {
  const providers = ["openai", "bedrock", "vertex"];
  const status: Record = {};

  for (const providerName of providers) {
    try {
      const provider = AIProviderFactory.createProvider(providerName);
      const start = Date.now();

      await provider.generate({
        input: { text: "test" },
        maxTokens: 1,
      });

      status[providerName] = {
        available: true,
        responseTime: Date.now() - start,
      };
    } catch (error) {
      status[providerName] = {
        available: false,
        error: error instanceof Error ? error.message : "Unknown error",
      };
    }
  }

  return status;
});

// Start server
const start = async () => {
  try {
    await fastify.listen({ port: 9876, host: "0.0.0.0" });
    console.log("Server running on http://localhost:9876");
  } catch (err) {
    fastify.log.error(err);
    process.exit(1);
  }
};

start();
```

For a complete Fastify integration guide with hooks, plugins, and advanced patterns, see the [Fastify Integration Guide](/docs/sdk/framework-integration).

## NestJS Integration

NestJS is an enterprise-grade Node.js framework built with TypeScript, featuring decorators, dependency injection, and a modular architecture. NeuroLink integrates naturally with NestJS patterns.

### NeuroLink Module and Service

```typescript
// neurolink.module.ts

@Global()
@Module({
  providers: [NeuroLinkService],
  exports: [NeuroLinkService],
})
export class NeuroLinkModule {}
```

```typescript
// neurolink.service.ts

@Injectable()
export class NeuroLinkService implements OnModuleInit {
  private provider: ReturnType;

  onModuleInit() {
    this.provider = createBestAIProvider();
  }

  async generate(prompt: string, options = {}) {
    const result = await this.provider.generate({
      input: { text: prompt },
      ...options,
    });

    return {
      text: result.text,
      provider: result.provider,
      usage: result.usage,
    };
  }

  async *stream(prompt: string, options = {}) {
    const result = await this.provider.stream({
      input: { text: prompt },
      ...options,
    });

    for await (const chunk of result.stream) {
      if (chunk && typeof chunk === "object" && "content" in chunk) {
        yield chunk.content;
      }
    }
  }
}
```

```typescript
// ai.controller.ts

@Controller("api/ai")
export class AIController {
  constructor(private readonly neurolink: NeuroLinkService) {}

  @Post("generate")
  async generate(@Body() body: { prompt: string; options?: object }) {
    return this.neurolink.generate(body.prompt, body.options);
  }

  @Post("stream")
  async stream(@Body() body: { prompt: string }, @Res() res: Response) {
    res.setHeader("Content-Type", "text/plain");
    res.setHeader("Cache-Control", "no-cache");

    for await (const chunk of this.neurolink.stream(body.prompt)) {
      res.write(chunk);
    }

    res.end();
  }
}
```

[Full NestJS Guide](/docs/sdk/nestjs-integration)

## React Hook (Universal)

### Custom Hook for AI Generation

```typescript

type AIOptions = {
  temperature?: number;
  maxTokens?: number;
  provider?: string;
  systemPrompt?: string;
}

type AIResult = {
  text: string;
  provider: string;
  usage?: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}

export function useAI(apiEndpoint = '/api/ai') {
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState(null);
  const [result, setResult] = useState(null);

  const generate = useCallback(async (
    prompt: string,
    options: AIOptions = {}
  ) => {
    if (!prompt.trim()) {
      setError('Prompt is required');
      return null;
    }

    setLoading(true);
    setError(null);
    setResult(null);

    try {
      const response = await fetch(apiEndpoint, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt, ...options })
      });

      if (!response.ok) {
        throw new Error(`Request failed: ${response.statusText}`);
      }

      const data = await response.json();

      if (data.error) {
        throw new Error(data.error);
      }

      setResult(data);
      return data.text;
    } catch (err) {
      const message = err instanceof Error ? err.message : 'Unknown error';
      setError(message);
      return null;
    } finally {
      setLoading(false);
    }
  }, [apiEndpoint]);

  const clear = useCallback(() => {
    setResult(null);
    setError(null);
  }, []);

  return {
    generate,
    loading,
    error,
    result,
    clear
  };
}

// Usage example
function MyComponent() {
  const { generate, loading, error, result } = useAI('/api/ai');

  const handleGenerate = async () => {
    const text = await generate("Explain React hooks", {
      temperature: 0.7,
      maxTokens: 500,
      provider: 'openai'
    });

    if (text) {
      console.log('Generated:', text);
    }
  };

  return (


        {loading ? 'Generating...' : 'Generate'}


      {error && Error: {error}}

      {result && (

          {result.text}
          Provider: {result.provider}

      )}

  );
}
```

### Streaming Hook

```typescript

export function useAIStream(apiEndpoint = "/api/ai/stream") {
  const [streaming, setStreaming] = useState(false);
  const [content, setContent] = useState("");
  const [error, setError] = useState(null);

  const stream = useCallback(
    async (prompt: string) => {
      if (!prompt.trim()) return;

      setStreaming(true);
      setContent("");
      setError(null);

      try {
        const response = await fetch(apiEndpoint, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ prompt }),
        });

        if (!response.body) {
          throw new Error("No response stream");
        }

        const reader = response.body.getReader();
        const decoder = new TextDecoder();

        while (true) {
          const { done, value } = await reader.read();
          if (done) break;

          const chunk = decoder.decode(value, { stream: true });
          setContent((prev) => prev + chunk);
        }
      } catch (err) {
        setError(err instanceof Error ? err.message : "Stream error");
      } finally {
        setStreaming(false);
      }
    },
    [apiEndpoint],
  );

  const clear = useCallback(() => {
    setContent("");
    setError(null);
  }, []);

  return {
    stream,
    streaming,
    content,
    error,
    clear,
  };
}
```

## Vue.js Integration

### Vue 3 Composition API

```typescript
// composables/useAI.ts

export function useAI() {
  const loading = ref(false);
  const error = ref(null);
  const result = ref("");

  const generate = async (prompt: string, options = {}) => {
    if (!prompt.trim()) return;

    loading.value = true;
    error.value = null;
    result.value = "";

    try {
      const response = await fetch("/api/ai", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ prompt, ...options }),
      });

      const data = await response.json();

      if (data.error) {
        throw new Error(data.error);
      }

      result.value = data.text;
    } catch (err) {
      error.value = err instanceof Error ? err.message : "Unknown error";
    } finally {
      loading.value = false;
    }
  };

  const clear = () => {
    result.value = "";
    error.value = null;
  };

  return {
    loading: computed(() => loading.value),
    error: computed(() => error.value),
    result: computed(() => result.value),
    generate,
    clear,
  };
}
```

### Vue Component

```vue


    AI Chat with NeuroLink


        {{ loading ? "Generating..." : "Generate" }}


    Error: {{ error }}


      Response:
      {{ result }}


const prompt = ref("");
const { loading, error, result, generate } = useAI();

const handleGenerate = async () => {
  if (!prompt.value.trim()) return;

  await generate(prompt.value, {
    temperature: 0.7,
    maxTokens: 500,
  });

  prompt.value = "";
};

.ai-chat {
  max-width: 600px;
  margin: 0 auto;
  padding: 2rem;
}

.input-group {
  display: flex;
  gap: 1rem;
  margin: 1rem 0;
}

textarea {
  flex: 1;
  min-height: 100px;
  padding: 0.5rem;
  border: 1px solid #ccc;
  border-radius: 4px;
}

button {
  padding: 0.5rem 1rem;
  background: #42b883;
  color: white;
  border: none;
  border-radius: 4px;
  cursor: pointer;
}

button:disabled {
  opacity: 0.5;
  cursor: not-allowed;
}

.error {
  padding: 1rem;
  background: #fee;
  border: 1px solid #fcc;
  border-radius: 4px;
  color: #c00;
}

.result {
  padding: 1rem;
  background: #f9f9f9;
  border-radius: 4px;
  margin-top: 1rem;
}

```

## Environment Configuration for All Frameworks

### Environment Variables

```bash
# .env (for all frameworks)
OPENAI_API_KEY="sk-your-openai-key"
AWS_ACCESS_KEY_ID="your-aws-access-key"
AWS_SECRET_ACCESS_KEY="your-aws-secret-key"
GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

# Optional configurations
NEUROLINK_DEBUG="false"
DEFAULT_PROVIDER="auto"
ENABLE_FALLBACK="true"
```

### Framework-Specific Configuration

#### Next.js (`next.config.js`)

```javascript
/** @type {import('next').NextConfig} */
const nextConfig = {
  env: {
    OPENAI_API_KEY: process.env.OPENAI_API_KEY,
    // Don't expose AWS keys to client
  },
  experimental: {
    serverComponentsExternalPackages: ["@juspay/neurolink"],
  },
};

module.exports = nextConfig;
```

#### SvelteKit (`vite.config.ts`)

```typescript

export default defineConfig({
  plugins: [sveltekit()],
  define: {
    // Only expose public env vars to client
    "process.env.PUBLIC_APP_NAME": JSON.stringify(process.env.PUBLIC_APP_NAME),
  },
});
```

## Deployment Considerations

### Vercel Deployment

```bash
# Add environment variables in Vercel dashboard
# or use vercel.json
{
  "env": {
    "OPENAI_API_KEY": "@openai-api-key",
    "AWS_ACCESS_KEY_ID": "@aws-access-key-id",
    "AWS_SECRET_ACCESS_KEY": "@aws-secret-access-key"
  }
}
```

### Docker Deployment

```dockerfile
FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production

COPY . .
RUN npm run build

# Set environment variables
ENV OPENAI_API_KEY=""
ENV AWS_ACCESS_KEY_ID=""
ENV AWS_SECRET_ACCESS_KEY=""

EXPOSE 3000
CMD ["npm", "start"]
```

---

[← Back to Main README](/docs/) | [Next: Provider Configuration →](/docs/getting-started/provider-setup)

---

## NestJS Integration Guide

<!-- Source: sdk/nestjs-integration.md -->

# NestJS Integration Guide

**Build enterprise-grade AI applications with NestJS and NeuroLink**

## Quick Start

### 1. Create New NestJS Project

```bash
npm install -g @nestjs/cli
nest new my-ai-service
cd my-ai-service
npm install @juspay/neurolink dotenv class-validator class-transformer @nestjs/config
```

### 2. Configure Environment

```bash
# .env
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
JWT_SECRET=your-super-secret-jwt-key
API_KEY=your-api-key-for-clients
PORT=3000
```

### 3. Generate Module and Controller

```bash
nest generate module neurolink
nest generate service neurolink
nest generate controller ai
```

---

## Module Setup

### NeuroLink Module (Dynamic)

```typescript
// src/neurolink/neurolink.module.ts

@Global()
@Module({})
export class NeuroLinkModule {
  static forRoot(): DynamicModule {
    return {
      module: NeuroLinkModule,
      imports: [ConfigModule],
      providers: [NeuroLinkService],
      exports: [NeuroLinkService],
    };
  }

  static forRootAsync(options: {
    imports?: any[];
    useFactory: (...args: any[]) => Promise | any;
    inject?: any[];
  }): DynamicModule {
    return {
      module: NeuroLinkModule,
      imports: [...(options.imports || []), ConfigModule],
      providers: [
        {
          provide: "NEUROLINK_OPTIONS",
          useFactory: options.useFactory,
          inject: options.inject || [],
        },
        NeuroLinkService,
      ],
      exports: [NeuroLinkService],
    };
  }
}
```

### NeuroLink Service (@Injectable)

```typescript
// src/neurolink/neurolink.service.ts

  Injectable,
  OnModuleInit,
  OnModuleDestroy,
  Logger,
  Inject,
  Optional,
} from "@nestjs/common";

@Injectable()
export class NeuroLinkService implements OnModuleInit, OnModuleDestroy {
  private readonly logger = new Logger(NeuroLinkService.name);
  private ai: NeuroLink;

  constructor(
    private configService: ConfigService,
    @Optional() @Inject("NEUROLINK_OPTIONS") private options?: any,
  ) {}

  async onModuleInit() {
    this.ai = new NeuroLink({
      providers: this.options?.providers || [
        {
          name: "openai",
          config: { apiKey: this.configService.get("OPENAI_API_KEY") },
        },
        {
          name: "anthropic",
          config: { apiKey: this.configService.get("ANTHROPIC_API_KEY") },
        },
      ],
    });
    this.logger.log("NeuroLink service initialized");
  }

  async onModuleDestroy() {
    await this.ai.cleanup();
    this.logger.log("NeuroLink resources cleaned up");
  }

  async generate(
    prompt: string,
    options?: { provider?: string; model?: string; temperature?: number },
  ) {
    return this.ai.generate({
      input: { text: prompt },
      providerName: options?.provider,
      modelName: options?.model,
      temperature: options?.temperature,
    });
  }

  async generateStream(
    prompt: string,
    options?: { provider?: string; model?: string },
  ) {
    return this.ai.generateStream({
      input: { text: prompt },
      providerName: options?.provider,
      modelName: options?.model,
    });
  }

  async chat(
    messages: Array,
    options?: { provider?: string },
  ) {
    return this.ai.generate({
      input: { messages },
      providerName: options?.provider,
    });
  }
}
```

---

## Controller Implementation

### AI Controller with Decorators

```typescript
// src/ai/ai.controller.ts

  Controller,
  Post,
  Get,
  Body,
  Res,
  HttpCode,
  HttpStatus,
  UseGuards,
  UseInterceptors,
  UsePipes,
  ValidationPipe,
  Logger,
} from "@nestjs/common";

@Controller("api/ai")
@UseGuards(JwtAuthGuard)
@UseInterceptors(RateLimitInterceptor)
@UsePipes(new ValidationPipe({ transform: true, whitelist: true }))
export class AIController {
  private readonly logger = new Logger(AIController.name);

  constructor(private readonly neuroLinkService: NeuroLinkService) {}

  @Post("generate")
  @HttpCode(HttpStatus.OK)
  async generate(@Body() dto: GenerateDto) {
    this.logger.log(`Generate request: ${dto.prompt.substring(0, 50)}...`);
    const result = await this.neuroLinkService.generate(dto.prompt, {
      provider: dto.provider,
      model: dto.model,
      temperature: dto.temperature,
    });
    return { success: true, data: { text: result.text, usage: result.usage } };
  }

  @Post("chat")
  @HttpCode(HttpStatus.OK)
  async chat(@Body() dto: ChatDto) {
    const result = await this.neuroLinkService.chat(dto.messages, {
      provider: dto.provider,
    });
    return { success: true, data: { text: result.text, usage: result.usage } };
  }

  @Post("stream")
  async stream(@Body() dto: StreamDto, @Res() res: Response) {
    res.setHeader("Content-Type", "text/event-stream");
    res.setHeader("Cache-Control", "no-cache");
    res.setHeader("Connection", "keep-alive");

    try {
      const stream = await this.neuroLinkService.generateStream(dto.prompt, {
        provider: dto.provider,
        model: dto.model,
      });
      for await (const chunk of stream) {
        if (chunk.text)
          res.write(`data: ${JSON.stringify({ text: chunk.text })}\n\n`);
      }
      res.write("data: [DONE]\n\n");
    } catch (error) {
      res.write(`data: ${JSON.stringify({ error: error.message })}\n\n`);
    }
    res.end();
  }

  @Get("health")
  @HttpCode(HttpStatus.OK)
  healthCheck() {
    return { status: "healthy", timestamp: new Date().toISOString() };
  }
}
```

---

## DTOs and Validation

### Generate DTO with class-validator

```typescript
// src/ai/dto/generate.dto.ts

  IsString,
  IsNotEmpty,
  IsOptional,
  IsNumber,
  Min,
  Max,
  MaxLength,
  IsIn,
} from "class-validator";

export class GenerateDto {
  @IsString()
  @IsNotEmpty({ message: "Prompt is required" })
  @MaxLength(100000)
  @Transform(({ value }) => value?.trim())
  prompt: string;

  @IsOptional()
  @IsString()
  @IsIn(["openai", "anthropic", "google-ai", "mistral", "bedrock"])
  provider?: string;

  @IsOptional()
  @IsString()
  model?: string;

  @IsOptional()
  @IsNumber()
  @Min(0)
  @Max(2)
  temperature?: number;

  @IsOptional()
  @IsNumber()
  @Min(1)
  @Max(128000)
  maxTokens?: number;
}
```

### Chat DTO with Nested Validation

```typescript
// src/ai/dto/chat.dto.ts

  IsArray,
  IsString,
  IsNotEmpty,
  IsOptional,
  ValidateNested,
  ArrayMinSize,
  IsIn,
} from "class-validator";

class MessageDto {
  @IsString()
  @IsIn(["user", "assistant", "system"])
  role: "user" | "assistant" | "system";

  @IsString()
  @IsNotEmpty()
  content: string;
}

export class ChatDto {
  @IsArray()
  @ArrayMinSize(1)
  @ValidateNested({ each: true })
  @Type(() => MessageDto)
  messages: MessageDto[];

  @IsOptional()
  @IsString()
  provider?: string;

  @IsOptional()
  @IsString()
  model?: string;
}
```

### Stream DTO

```typescript
// src/ai/dto/stream.dto.ts

  IsString,
  IsNotEmpty,
  IsOptional,
  IsNumber,
  Min,
  Max,
  MaxLength,
} from "class-validator";

export class StreamDto {
  @IsString()
  @IsNotEmpty()
  @MaxLength(100000)
  @Transform(({ value }) => value?.trim())
  prompt: string;

  @IsOptional()
  @IsString()
  provider?: string;

  @IsOptional()
  @IsString()
  model?: string;

  @IsOptional()
  @IsNumber()
  @Min(0)
  @Max(2)
  temperature?: number;
}
```

---

## Authentication

### API Key Guard

```typescript
// src/auth/guards/api-key.guard.ts

  Injectable,
  CanActivate,
  ExecutionContext,
  UnauthorizedException,
} from "@nestjs/common";

@Injectable()
export class ApiKeyGuard implements CanActivate {
  constructor(private configService: ConfigService) {}

  canActivate(context: ExecutionContext): boolean {
    const request = context.switchToHttp().getRequest();
    const apiKey = request.headers["x-api-key"] as string;

    if (!apiKey) throw new UnauthorizedException("API key is required");

    const validApiKey = this.configService.get("API_KEY");
    if (apiKey !== validApiKey)
      throw new UnauthorizedException("Invalid API key");

    return true;
  }
}
```

### JWT Auth Guard with @UseGuards

```typescript
// src/auth/guards/jwt-auth.guard.ts

  Injectable,
  CanActivate,
  ExecutionContext,
  UnauthorizedException,
} from "@nestjs/common";

export const IS_PUBLIC_KEY = "isPublic";

@Injectable()
export class JwtAuthGuard implements CanActivate {
  constructor(
    private configService: ConfigService,
    private reflector: Reflector,
  ) {}

  async canActivate(context: ExecutionContext): Promise {
    const isPublic = this.reflector.getAllAndOverride(IS_PUBLIC_KEY, [
      context.getHandler(),
      context.getClass(),
    ]);
    if (isPublic) return true;

    const request = context.switchToHttp().getRequest();
    const authHeader = request.headers.authorization;
    if (!authHeader?.startsWith("Bearer ")) {
      throw new UnauthorizedException("Authentication required");
    }

    try {
      const token = authHeader.substring(7);
      const secret = this.configService.get("JWT_SECRET");
      request["user"] = jwt.verify(token, secret);
      return true;
    } catch {
      throw new UnauthorizedException("Invalid or expired token");
    }
  }
}
```

### Public Decorator

```typescript
// src/auth/decorators/public.decorator.ts

export const Public = () => SetMetadata(IS_PUBLIC_KEY, true);
```

---

## Rate Limiting

### Custom RateLimitInterceptor

```typescript
// src/common/interceptors/rate-limit.interceptor.ts

  Injectable,
  NestInterceptor,
  ExecutionContext,
  CallHandler,
  HttpException,
  HttpStatus,
} from "@nestjs/common";

@Injectable()
export class RateLimitInterceptor implements NestInterceptor {
  private store = new Map();
  private readonly limit = 100;
  private readonly windowMs = 60000;

  intercept(context: ExecutionContext, next: CallHandler): Observable {
    const request = context.switchToHttp().getRequest();
    const response = context.switchToHttp().getResponse();
    const key = request["user"]?.sub || request.ip;
    const now = Date.now();

    let entry = this.store.get(key);
    if (!entry || now > entry.resetTime) {
      entry = { count: 0, resetTime: now + this.windowMs };
    }
    entry.count++;
    this.store.set(key, entry);

    response.setHeader("X-RateLimit-Limit", this.limit);
    response.setHeader(
      "X-RateLimit-Remaining",
      Math.max(0, this.limit - entry.count),
    );

    if (entry.count > this.limit) {
      throw new HttpException(
        { message: "Rate limit exceeded" },
        HttpStatus.TOO_MANY_REQUESTS,
      );
    }
    return next.handle();
  }
}
```

### Using @nestjs/throttler

```bash
npm install @nestjs/throttler
```

```typescript
// src/app.module.ts

@Module({
  imports: [
    ThrottlerModule.forRoot([
      { name: "short", ttl: 1000, limit: 3 },
      { name: "medium", ttl: 10000, limit: 20 },
      { name: "long", ttl: 60000, limit: 100 },
    ]),
  ],
  providers: [{ provide: APP_GUARD, useClass: ThrottlerGuard }],
})
export class AppModule {}
```

---

## Response Caching

### CacheInterceptor with @nestjs/cache-manager

```bash
npm install @nestjs/cache-manager cache-manager cache-manager-ioredis-yet
```

```typescript
// src/common/interceptors/cache.interceptor.ts

  Injectable,
  NestInterceptor,
  ExecutionContext,
  CallHandler,
  Inject,
} from "@nestjs/common";

@Injectable()
export class CacheInterceptor implements NestInterceptor {
  constructor(@Inject(CACHE_MANAGER) private cacheManager: Cache) {}

  async intercept(
    context: ExecutionContext,
    next: CallHandler,
  ): Promise> {
    const request = context.switchToHttp().getRequest();
    if (request.method !== "GET" || request.body?.temperature !== 0) {
      return next.handle();
    }

    const cacheKey = this.generateKey(request);
    const cached = await this.cacheManager.get(cacheKey);
    if (cached) return of(cached);

    return next.handle().pipe(
      tap((response) => {
        if (response && !response.error) {
          this.cacheManager.set(cacheKey, response, 300000);
        }
      }),
    );
  }

  private generateKey(request: Request): string {
    const hash = crypto
      .createHash("sha256")
      .update(JSON.stringify({ url: request.url, body: request.body }))
      .digest("hex");
    return `ai:cache:${hash}`;
  }
}
```

---

## Streaming Responses

### SSE with @Sse() Decorator

```typescript
// src/ai/ai-stream.controller.ts

  Controller,
  Post,
  Body,
  Sse,
  MessageEvent,
  UseGuards,
  Logger,
} from "@nestjs/common";

@Controller("api/ai")
@UseGuards(JwtAuthGuard)
export class AIStreamController {
  private readonly logger = new Logger(AIStreamController.name);
  constructor(private readonly neuroLinkService: NeuroLinkService) {}

  @Post("stream/sse")
  @Sse()
  streamSSE(@Body() dto: StreamDto): Observable {
    const subject = new Subject();

    this.processStream(dto, subject).catch((error) => {
      subject.next({ data: { error: error.message }, type: "error" });
      subject.complete();
    });

    return subject.asObservable();
  }

  private async processStream(dto: StreamDto, subject: Subject) {
    const stream = await this.neuroLinkService.generateStream(dto.prompt, {
      provider: dto.provider,
      model: dto.model,
    });

    subject.next({ data: { status: "started" }, type: "start" });

    let tokenCount = 0;
    for await (const chunk of stream) {
      if (chunk.text) {
        tokenCount++;
        subject.next({
          data: { text: chunk.text, index: tokenCount },
          type: "token",
        });
      }
    }

    subject.next({
      data: { status: "completed", totalTokens: tokenCount },
      type: "complete",
    });
    subject.complete();
  }
}
```

---

## Exception Filters

### AIExceptionFilter with @Catch()

```typescript
// src/common/filters/ai-exception.filter.ts

  ExceptionFilter,
  Catch,
  ArgumentsHost,
  HttpException,
  HttpStatus,
  Logger,
} from "@nestjs/common";

@Catch()
export class AIExceptionFilter implements ExceptionFilter {
  private readonly logger = new Logger(AIExceptionFilter.name);

  catch(exception: Error, host: ArgumentsHost) {
    const ctx = host.switchToHttp();
    const response = ctx.getResponse();
    const request = ctx.getRequest();

    const { statusCode, code, message } = this.handleException(exception);

    this.logger.error(
      `${request.method} ${request.url} - ${statusCode}: ${message}`,
      exception.stack,
    );

    response.status(statusCode).json({
      success: false,
      error: { code, message },
      meta: { timestamp: new Date().toISOString(), path: request.url },
    });
  }

  private handleException(exception: Error) {
    if (exception instanceof HttpException) {
      const status = exception.getStatus();
      const response = exception.getResponse();
      return {
        statusCode: status,
        code: this.getErrorCode(status),
        message:
          typeof response === "string" ? response : (response as any).message,
      };
    }

    const message = exception.message?.toLowerCase() || "";
    if (message.includes("rate limit")) {
      return {
        statusCode: 429,
        code: "RATE_LIMIT_ERROR",
        message: "Rate limit exceeded",
      };
    }
    if (message.includes("api key") || message.includes("unauthorized")) {
      return {
        statusCode: 401,
        code: "PROVIDER_AUTH_ERROR",
        message: "Provider authentication failed",
      };
    }

    return {
      statusCode: 500,
      code: "INTERNAL_ERROR",
      message: "An unexpected error occurred",
    };
  }

  private getErrorCode(status: number): string {
    const codes: Record = {
      400: "BAD_REQUEST",
      401: "UNAUTHORIZED",
      429: "RATE_LIMITED",
      500: "INTERNAL_ERROR",
    };
    return codes[status] || "UNKNOWN_ERROR";
  }
}
```

---

## Production Patterns

### Health Check Module

```bash
npm install @nestjs/terminus
```

```typescript
// src/health/health.controller.ts

  HealthCheckService,
  HealthCheck,
  MemoryHealthIndicator,
} from "@nestjs/terminus";

@Controller("health")
export class HealthController {
  constructor(
    private health: HealthCheckService,
    private memory: MemoryHealthIndicator,
  ) {}

  @Get()
  @Public()
  @HealthCheck()
  check() {
    return this.health.check([
      () => this.memory.checkHeap("memory_heap", 500 * 1024 * 1024),
    ]);
  }

  @Get("live")
  @Public()
  liveness() {
    return { status: "ok", timestamp: new Date().toISOString() };
  }
}
```

### Graceful Shutdown

```typescript
// src/main.ts

async function bootstrap() {
  const logger = new Logger("Bootstrap");
  const app = await NestFactory.create(AppModule);

  app.useGlobalPipes(new ValidationPipe({ whitelist: true, transform: true }));
  app.useGlobalFilters(new AIExceptionFilter());
  app.enableShutdownHooks();
  app.enableCors();

  const port = process.env.PORT || 3000;
  await app.listen(port);
  logger.log(`Application running on port ${port}`);
}

bootstrap();
```

---

## Monitoring and Logging

### nestjs-pino for Structured Logging

```bash
npm install nestjs-pino pino-http pino-pretty
```

```typescript
// src/app.module.ts

@Module({
  imports: [
    LoggerModule.forRoot({
      pinoHttp: {
        level: process.env.NODE_ENV === "production" ? "info" : "debug",
        transport:
          process.env.NODE_ENV !== "production"
            ? { target: "pino-pretty", options: { colorize: true } }
            : undefined,
        redact: ["req.headers.authorization", "req.headers['x-api-key']"],
      },
    }),
  ],
})
export class AppModule {}
```

### Prometheus with @willsoto/nestjs-prometheus

```bash
npm install @willsoto/nestjs-prometheus prom-client
```

```typescript
// src/common/metrics/metrics.module.ts

  PrometheusModule,
  makeCounterProvider,
  makeHistogramProvider,
} from "@willsoto/nestjs-prometheus";

@Module({
  imports: [
    PrometheusModule.register({
      path: "/metrics",
      defaultMetrics: { enabled: true },
    }),
  ],
  providers: [
    makeCounterProvider({
      name: "ai_requests_total",
      help: "Total AI requests",
      labelNames: ["provider", "status"],
    }),
    makeHistogramProvider({
      name: "ai_request_duration_seconds",
      help: "AI request duration",
      labelNames: ["provider"],
      buckets: [0.1, 0.5, 1, 2, 5, 10],
    }),
  ],
  exports: [PrometheusModule],
})
export class MetricsModule {}
```

---

## Best Practices

Follow these best practices when building NestJS AI applications:

- **Use Dependency Injection** - Inject NeuroLinkService instead of creating instances directly. This enables testing and lifecycle management.

- **Implement Lifecycle Hooks** - Use `OnModuleInit` for initialization and `OnModuleDestroy` for cleanup to ensure proper resource management.

- **Validate All Inputs** - Use DTOs with class-validator decorators and apply ValidationPipe globally to catch invalid requests early.

- **Centralize Error Handling** - Use exception filters to handle AI provider errors consistently across all endpoints.

- **Monitor Everything** - Implement Prometheus metrics for requests, latency, and errors. Use structured logging for debugging.

---

## Deployment

### Dockerfile

```dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package*.json ./
RUN addgroup -g 1001 -S nodejs && adduser -S nestjs -u 1001
USER nestjs
ENV NODE_ENV=production PORT=3000
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=10s CMD wget --spider http://localhost:3000/health/live || exit 1
CMD ["node", "dist/main.js"]
```

### docker-compose.yml

```yaml
version: "3.8"
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - JWT_SECRET=${JWT_SECRET}
      - API_KEY=${API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - REDIS_HOST=redis
    depends_on:
      - redis
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    restart: unless-stopped

volumes:
  redis_data:
```

### Production Checklist

```markdown
## Security

- [ ] API keys in environment variables
- [ ] Strong JWT secret
- [ ] CORS configured properly
- [ ] Rate limiting enabled
- [ ] Input validation on all endpoints

## Performance

- [ ] Response caching with Redis
- [ ] Appropriate timeouts
- [ ] Memory limits configured

## Reliability

- [ ] Health checks implemented
- [ ] Graceful shutdown handlers
- [ ] Error handling for all AI providers

## Monitoring

- [ ] Prometheus metrics exposed
- [ ] Structured logging configured
- [ ] Alerting rules defined
```

---

## Related Documentation

- [Express.js Integration Guide](/docs/sdk/framework-integration) - Lightweight REST API setup
- [Next.js Integration Guide](/docs/guides/frameworks/nextjs) - Full-stack React applications
- [Streaming Guide](/docs/advanced/streaming) - SSE and WebSocket streaming
- [API Reference](/docs/sdk/api-reference) - Complete SDK documentation

---

## Need Help?

- **Documentation**: [https://neurolink.dev/docs](https://neurolink.dev/docs)
- **GitHub Issues**: [https://github.com/juspay/neurolink/issues](https://github.com/juspay/neurolink/issues)
- **Discord Community**: [https://discord.gg/neurolink](https://discord.gg/neurolink)

---

# CLI

## CLI Command Reference

<!-- Source: cli/commands.md -->

# CLI Command Reference

The NeuroLink CLI mirrors the SDK. Every command shares consistent options and outputs so you can prototype in the terminal and port the workflow to code later.

## Install or Run Ad-hoc

```bash
# Run without installation
npx @juspay/neurolink --help

# Install globally
npm install -g @juspay/neurolink

# Local project dependency
npm install @juspay/neurolink
```

## Command Map

| Command               | Description                                                 | Example                                                                     |
| --------------------- | ----------------------------------------------------------- | --------------------------------------------------------------------------- |
| `generate` / `gen`    | One-shot content generation with optional multimodal input. | `npx @juspay/neurolink generate "Draft release notes" --image ./before.png` |
| `stream`              | Real-time streaming output with tool support.               | `npx @juspay/neurolink stream "Narrate sprint demo" --enableAnalytics`      |
| `batch`               | Process multiple prompts from a file.                       | `npx @juspay/neurolink batch prompts.txt --format json`                     |
| `loop`                | Interactive session with persistent variables & memory.     | `npx @juspay/neurolink loop --auto-redis`                                   |
| `setup` / `s`         | Guided provider onboarding and validation.                  | `npx @juspay/neurolink setup --provider openai`                             |
| `status`              | Health check for configured providers.                      | `npx @juspay/neurolink status --verbose`                                    |
| `get-best-provider`   | Show the best available AI provider.                        | `npx @juspay/neurolink get-best-provider --format json`                     |
| `models list`         | Inspect available models and capabilities.                  | `npx @juspay/neurolink models list --capability vision`                     |
| `config ` | Initialise, validate, export, or reset configuration.       | `npx @juspay/neurolink config validate`                                     |
| `memory ` | View, export, or clear conversation history.                | `npx @juspay/neurolink memory history NL_x3yr --format json`                |
| `mcp `    | Manage Model Context Protocol servers/tools.                | `npx @juspay/neurolink mcp list`                                            |
| `ollama ` | Manage Ollama local AI models.                              | `npx @juspay/neurolink ollama list-models`                                  |
| `sagemaker ` | Manage Amazon SageMaker endpoints and models.               | `npx @juspay/neurolink sagemaker status`                                    |
| `server ` | Manage NeuroLink HTTP server                                |                                                                             |
| `serve`               | Start server in foreground mode                             |                                                                             |
| `validate`            | Alias for `config validate`.                                | `npx @juspay/neurolink validate`                                            |
| `completion`          | Generate shell completion script.                           | `npx @juspay/neurolink completion > ~/.neurolink-completion.sh`             |

## Primary Commands

### `generate ` {#generate}

```bash
npx @juspay/neurolink generate "Summarise design doc" \
  --provider google-ai --model gemini-2.5-pro \
  --image ./screenshots/ui.png --enableAnalytics --enableEvaluation
```

Key flags:

- `--provider`, `-p` – provider slug (default `auto`).
- `--model`, `-m` – model name for the chosen provider.
- `--image`, `-i` – attach one or more image files/URLs for multimodal prompts.
- `--pdf` – attach one or more PDF files for document analysis.
- `--csv`, `-c` – attach one or more CSV files for data analysis.
- `--file` – attach any supported file type (auto-detected: Excel, Word, RTF, JSON, YAML, XML, HTML, SVG, Markdown, code files, and more).
- `--temperature`, `-t` – creativity (default `0.7`).
- `--maxTokens`, `--max` – response limit (default `1000`).
- `--system`, `-s` – system prompt.
- `--format`, `-f`, `--output-format` – `text` (default), `json`, or `table`.
- `--output`, `-o` – write response to file.
- `--imageOutput`, `--image-output` – custom path for generated image (default: `generated-images/image-.png`).
- `--enableAnalytics` / `--enableEvaluation` – capture metrics & quality scores.
- `--evaluationDomain` – domain hint for the judge model.
- `--domainAware` – use domain-aware evaluation (default `false`).
- `--context` – JSON string appended to analytics/evaluation context.
- `--domain`, `-d` – domain type for specialized processing: `healthcare`, `finance`, `analytics`, `ecommerce`, `education`, `legal`, `technology`, `generic`, `auto`.
- `--disableTools` – bypass MCP tools for this call.
- `--timeout` – seconds before aborting the request (default `120`).
- `--region`, `-r` – Vertex AI region (e.g., `us-central1`, `europe-west1`, `asia-northeast1`).
- `--debug`, `-v`, `--verbose` – verbose logging and full JSON payloads.
- `--quiet`, `-q` – suppress non-essential output (default `true`).

**CSV Options:**

- `--csvMaxRows` – maximum number of CSV rows to process (default `1000`).
- `--csvFormat` – CSV output format: `raw` (default), `markdown`, `json`.

**Video Input (Analysis):**

- `--video` – attach video file for analysis (MP4, WebM, MOV, AVI, MKV).
- `--video-frames` – number of frames to extract (default `8`).
- `--video-quality` – frame quality 0–100 (default `85`).
- `--video-format` – frame format: `jpeg` (default) or `png`.
- `--transcribe-audio` – extract and transcribe audio from video (default `false`).

**Text-to-Speech (TTS):**

- `--tts` – enable text-to-speech output (default `false`).
- `--ttsVoice` – TTS voice to use (e.g., `en-US-Neural2-C`).
- `--ttsFormat` – audio output format: `mp3` (default), `wav`, `ogg`, `opus`.
- `--ttsSpeed` – speaking rate 0.25–4.0 (default `1.0`).
- `--ttsQuality` – audio quality level: `standard` (default) or `hd`.
- `--ttsOutput` – save TTS audio to file (supports absolute and relative paths).
- `--ttsPlay` – auto-play generated audio (default `false`).

**Extended Thinking:**

- `--thinking`, `--think` – enable extended thinking/reasoning capability (default `false`).
- `--thinkingBudget` – token budget for extended thinking (5000–100000, default `10000`). Supported by Anthropic Claude and Gemini 2.5+ models.
- `--thinkingLevel` – thinking level for Gemini 3 models: `minimal`, `low`, `medium`, `high`.

**File Input Examples:**

```bash
# Attach multiple file types
npx @juspay/neurolink generate "Analyze this data" \
  --file ./report.xlsx \
  --file ./config.yaml \
  --file ./diagram.svg

# Mix file types with images and PDFs
npx @juspay/neurolink generate "Compare architecture" \
  --file ./main.ts \
  --pdf ./spec.pdf \
  --image ./screenshot.png
```

See [File Processors Guide](/docs/features/file-processors) for all 17+ supported file types.

**Video Generation (Veo 3.1):**

- `--outputMode` – output mode: `text` (default) or `video`.
- `--image` – path to input image file (required for video generation, e.g., ./input.jpg).
- `--videoOutput`, `-vo` – path to save generated video file.
- `--videoResolution` – `720p` or `1080p` (default `720p`).
- `--videoLength` – duration: `4`, `6`, or `8` seconds (default `4`).
- `--videoAspectRatio` – `9:16` (portrait) or `16:9` (landscape, default `16:9`).
- `--videoAudio` – include synchronized audio (default `true`).

**Note:** Video generation requires Vertex AI provider (`vertex`) and Veo 3.1 model (`veo-3.1`). The provider auto-switches to Vertex when `--outputMode video` is specified. Supported image formats: PNG, JPEG, WebP (max 20MB).

`gen` is a short alias with the same options.

### `stream ` {#stream}

```bash
npx @juspay/neurolink stream "Walk through the timeline" \
  --provider openai --model gpt-4o --enableEvaluation
```

`stream` shares the same flags as `generate` and adds chunked output for live UIs. Evaluation results are emitted after the stream completes when `--enableEvaluation` is set.

### `batch ` {#batch}

Process multiple prompts from a file in sequence.

```bash
# Process prompts from a file
npx @juspay/neurolink batch prompts.txt

# Export results as JSON
npx @juspay/neurolink batch questions.txt --format json

# Use Vertex AI with 2s delay between requests
npx @juspay/neurolink batch tasks.txt -p vertex --delay 2000

# Save results to file
npx @juspay/neurolink batch batch.txt --output results.json
```

`batch` shares the same flags as `generate`. The input file should contain one prompt per line. Results are returned as an array of `{ prompt, response }` objects. A default 1-second delay is applied between requests; override with `--delay `.

### `loop`

**Interactive session mode** with persistent state, conversation memory, and session variables. Perfect for iterative workflows and experimentation.

```bash
# Start loop with Redis-backed conversation memory
npx @juspay/neurolink loop --enable-conversation-memory --auto-redis

# Start loop without Redis auto-detection
npx @juspay/neurolink loop --enable-conversation-memory --no-auto-redis

# Force start a new conversation (skip selection menu)
npx @juspay/neurolink loop --new

# Resume a specific conversation by session ID
npx @juspay/neurolink loop --resume abc123def456

# List available conversations and exit
npx @juspay/neurolink loop --list-conversations

# Use in-memory storage only
npx @juspay/neurolink loop --no-auto-redis
```

**Loop-specific flags:**

| Flag                           | Alias | Type    | Default | Description                                           |
| ------------------------------ | ----- | ------- | ------- | ----------------------------------------------------- |
| `--enable-conversation-memory` |       | boolean | true    | Enable conversation memory for the loop session       |
| `--max-sessions`               |       | number  | 50      | Maximum number of conversation sessions to keep       |
| `--max-turns-per-session`      |       | number  | 20      | Maximum turns per conversation session                |
| `--auto-redis`                 |       | boolean | true    | Automatically use Redis if available                  |
| `--resume`                     | `-r`  | string  |         | Directly resume a specific conversation by session ID |
| `--new`                        | `-n`  | boolean |         | Force start a new conversation (skip selection menu)  |
| `--list-conversations`         | `-l`  | boolean |         | List available conversations and exit                 |
| `--compact-threshold`          |       | number  | 0.8     | Context compaction trigger threshold (0.0–1.0)        |
| `--disable-compaction`         |       | boolean | false   | Disable automatic context compaction                  |

**Key capabilities:**

- Run any CLI command without restarting session
- Persistent session variables: `set provider openai`, `set temperature 0.9`
- Conversation memory: AI remembers previous turns within session
- Redis auto-detection: Automatically connects if `REDIS_URL` is set
- Export session history as JSON for analytics
- Automatic context compaction when usage exceeds threshold

**Session management commands (inside loop):**

| Command             | Description                                                  |
| ------------------- | ------------------------------------------------------------ |
| `help`              | Show all available loop mode commands and standard CLI help. |
| `set  ` | Set a session variable. Use `set help` for available keys.   |
| `get `         | Show current value of a session variable.                    |
| `unset `       | Remove a session variable.                                   |
| `show`              | Display all currently set session variables.                 |
| `clear`             | Reset all session variables.                                 |
| `exit`              | Exit loop session. Aliases: `quit`, `:q`.                    |

**Settable session variables (via `set`):**

| Variable              | Type    | Description                                                | Allowed Values                                                         |
| --------------------- | ------- | ---------------------------------------------------------- | ---------------------------------------------------------------------- |
| `provider`            | string  | The AI provider to use.                                    | `openai`, `anthropic`, `google-ai`, `vertex`, `bedrock`, `azure`, etc. |
| `model`               | string  | The specific model to use from the provider.               | Any valid model name                                                   |
| `temperature`         | number  | Controls randomness of the output (e.g., 0.2, 0.8).        |                                                                        |
| `maxTokens`           | number  | The maximum number of tokens to generate.                  |                                                                        |
| `output`              | string  | AI response format value.                                  | `text`, `json`, `structured`, `none`                                   |
| `systemPrompt`        | string  | The system prompt to guide the AI's behavior.              |                                                                        |
| `timeout`             | number  | Timeout for the generation request in milliseconds.        |                                                                        |
| `disableTools`        | boolean | Disable all tool usage for the AI.                         |                                                                        |
| `maxSteps`            | number  | Maximum number of tool execution steps.                    |                                                                        |
| `enableAnalytics`     | boolean | Enable or disable analytics for responses.                 |                                                                        |
| `enableEvaluation`    | boolean | Enable or disable AI-powered evaluation of responses.      |                                                                        |
| `evaluationDomain`    | string  | Domain expertise for evaluation.                           |                                                                        |
| `toolUsageContext`    | string  | Context about tools/MCPs used in the interaction.          |                                                                        |
| `enableSummarization` | boolean | Enable automatic conversation summarization.               |                                                                        |
| `thinking`            | boolean | Enable extended thinking/reasoning capability.             |                                                                        |
| `thinkingBudget`      | number  | Token budget for thinking (Anthropic models: 5000–100000). |                                                                        |
| `thinkingLevel`       | string  | Thinking level for Gemini 3 models.                        | `minimal`, `low`, `medium`, `high`                                     |

**Context Budget Warnings:**

During a loop session, NeuroLink monitors context window usage after each generation command:

- **60% used (gray):** A subtle status line is shown: `Context: 62% used`.
- **80% used (yellow):** A prominent warning with token counts is shown:
  ```
  Context usage: 83% of window (12,450 / 15,000 tokens)
  Auto-compaction will trigger to preserve conversation quality.
  ```
  When `--disable-compaction` is not set, the system automatically compacts the context to free up space while preserving conversation quality.

See the complete guide: [CLI Loop Sessions](/docs/features/cli-loop-sessions)

### `setup`

**Interactive provider configuration wizard** that guides you through API key setup, credential validation, and recommended model selection.

```bash
# Launch interactive setup wizard
npx @juspay/neurolink setup

# Show all available providers
npx @juspay/neurolink setup --list

# Configure a specific provider
npx @juspay/neurolink setup --provider openai
npx @juspay/neurolink setup --provider bedrock
npx @juspay/neurolink setup --provider google-ai
```

**What the wizard does:**

1. **Prompts for API keys** – Securely collects credentials
2. **Validates authentication** – Tests connection to provider
3. **Writes `.env` file** – Safely stores credentials (creates if missing)
4. **Recommends models** – Suggests best models for your use case
5. **Shows example commands** – Quick-start examples to try immediately

**Supported providers:**
OpenAI, Anthropic, Google AI, Vertex AI, Bedrock, Azure, Hugging Face, Ollama, Mistral, and more.

See also: [Provider Setup Guide](/docs/getting-started/provider-setup)

### `status`

```bash
npx @juspay/neurolink status --verbose
```

Displays provider availability, authentication status, recent error summaries, and response latency.

### `models`

```bash
# List all models for a provider
npx @juspay/neurolink models list --provider google-ai

# Filter by capability
npx @juspay/neurolink models list --capability vision --format table
```

### `config`

Manage persistent configuration stored in the NeuroLink config directory.

```bash
npx @juspay/neurolink config init
npx @juspay/neurolink config validate
npx @juspay/neurolink config export --format json > neurolink-config.json
```

### `memory`

**Manage conversation history** stored in Redis. View, export, or clear session data for analytics and debugging.

```bash
# List all active sessions
npx @juspay/neurolink memory list

# View session statistics
npx @juspay/neurolink memory stats

# View conversation history (text format)
npx @juspay/neurolink memory history

# Export session as JSON (Q4 2025 - for analytics)
npx @juspay/neurolink memory export --session-id  --format json > session.json

# Export all sessions
npx @juspay/neurolink memory export-all --output ./exports/

# Delete a single session
npx @juspay/neurolink memory clear

# Delete all sessions
npx @juspay/neurolink memory clear-all
```

**Export formats:**

- `json` – Structured data with metadata, timestamps, token counts
- `csv` – Tabular format for spreadsheet analysis

**Note:** Requires Redis-backed conversation memory. Set `REDIS_URL` environment variable.

See the complete guide: [Redis Conversation Export](/docs/features/conversation-history)

### `mcp`

Manage Model Context Protocol servers and tools. Supports stdio, SSE, WebSocket, and HTTP transports.

```bash
# List registered servers/tools
npx @juspay/neurolink mcp list

# Auto-discover MCP servers from config files
npx @juspay/neurolink mcp discover

# Install popular MCP servers
npx @juspay/neurolink mcp install filesystem
npx @juspay/neurolink mcp install github

# Add custom servers with different transports
npx @juspay/neurolink mcp add myserver "python server.py" --transport stdio
npx @juspay/neurolink mcp add webserver "http://localhost:8080" --transport sse --url "http://localhost:8080/sse"

# Add HTTP remote server with authentication
npx @juspay/neurolink mcp add remote-api "https://api.example.com/mcp" \
  --transport http \
  --url "https://api.example.com/mcp" \
  --headers '{"Authorization": "Bearer YOUR_TOKEN"}'

# Test server connectivity
npx @juspay/neurolink mcp test myserver

# Remove a server
npx @juspay/neurolink mcp remove myserver
```

**MCP Command Options:**

| Option        | Description                                         |
| ------------- | --------------------------------------------------- |
| `--transport` | Transport type: `stdio`, `sse`, `websocket`, `http` |
| `--url`       | URL for SSE/WebSocket/HTTP transport                |
| `--headers`   | JSON string with HTTP headers for authentication    |
| `--args`      | Command arguments (comma-separated)                 |
| `--env`       | Environment variables (JSON string)                 |
| `--cwd`       | Working directory for the server                    |

**HTTP Transport Features:**

- Custom headers for authentication (Bearer tokens, API keys)
- Configurable timeouts and connection options
- Automatic retry with exponential backoff
- Rate limiting to prevent API throttling
- OAuth 2.1 support with PKCE

See [MCP HTTP Transport Guide](/docs/mcp/http-transport) for complete configuration options.

### `batch`

See [`batch `](#batch) above.

### `get-best-provider`

Show the best available AI provider based on current configuration and availability.

```bash
# Get best available provider
npx @juspay/neurolink get-best-provider

# Get provider as JSON
npx @juspay/neurolink get-best-provider --format json

# Just the provider name
npx @juspay/neurolink get-best-provider --quiet
```

### `ollama `

Manage Ollama local AI models. Requires Ollama to be installed on the local machine.

```bash
# List installed models
npx @juspay/neurolink ollama list-models

# Download a model
npx @juspay/neurolink ollama pull llama3

# Remove a model
npx @juspay/neurolink ollama remove llama3

# Check Ollama service status
npx @juspay/neurolink ollama status

# Start/stop Ollama service
npx @juspay/neurolink ollama start
npx @juspay/neurolink ollama stop

# Interactive Ollama setup
npx @juspay/neurolink ollama setup
```

**Subcommands:**

| Subcommand       | Description                  |
| ---------------- | ---------------------------- |
| `list-models`    | List installed Ollama models |
| `pull `   | Download an Ollama model     |
| `remove ` | Remove an Ollama model       |
| `status`         | Check Ollama service status  |
| `start`          | Start Ollama service         |
| `stop`           | Stop Ollama service          |
| `setup`          | Interactive Ollama setup     |

### `sagemaker `

Manage Amazon SageMaker AI models and endpoints.

```bash
# Check SageMaker configuration and connectivity
npx @juspay/neurolink sagemaker status

# Test connectivity to an endpoint
npx @juspay/neurolink sagemaker test my-endpoint

# List available endpoints
npx @juspay/neurolink sagemaker list-endpoints

# Show current SageMaker configuration
npx @juspay/neurolink sagemaker config

# Interactive setup
npx @juspay/neurolink sagemaker setup

# Validate configuration and credentials
npx @juspay/neurolink sagemaker validate

# Run performance benchmark
npx @juspay/neurolink sagemaker benchmark my-endpoint
```

**Subcommands:**

| Subcommand             | Description                                      |
| ---------------------- | ------------------------------------------------ |
| `status`               | Check SageMaker configuration and connectivity   |
| `test `      | Test connectivity to a SageMaker endpoint        |
| `list-endpoints`       | List available SageMaker endpoints               |
| `config`               | Show current SageMaker configuration             |
| `setup`                | Interactive SageMaker configuration setup        |
| `validate`             | Validate SageMaker configuration and credentials |
| `benchmark ` | Run performance benchmark against endpoint       |

### `completion`

Generate a shell completion script for bash.

```bash
# Generate shell completion
npx @juspay/neurolink completion

# Save completion script
npx @juspay/neurolink completion > ~/.neurolink-completion.sh

# Enable completions (bash)
source ~/.neurolink-completion.sh
```

Add the completion script to your shell profile for persistent completions.

---

## `serve`

Start the NeuroLink HTTP server in foreground mode.

### Usage

```bash
neurolink serve [options]
```

### Options

| Option        | Alias | Type    | Default | Description                                              |
| ------------- | ----- | ------- | ------- | -------------------------------------------------------- |
| `--port`      | `-p`  | number  | 3000    | Port to listen on                                        |
| `--host`      | `-H`  | string  | 0.0.0.0 | Host to bind to                                          |
| `--framework` | `-f`  | string  | hono    | Web framework: hono, express, fastify, koa               |
| `--basePath`  |       | string  | /api    | Base path for all routes                                 |
| `--cors`      |       | boolean | true    | Enable CORS                                              |
| `--rateLimit` |       | number  | 100     | Rate limit (requests per 15-minute window, 0 to disable) |
| `--swagger`   |       | boolean | false   | Enable Swagger UI and OpenAPI endpoints                  |
| `--watch`     | `-w`  | boolean | false   | Enable watch mode                                        |
| `--config`    | `-c`  | string  |         | Path to config file                                      |

### Swagger/OpenAPI Endpoints

When `--swagger` is enabled, these endpoints become available:

| Endpoint                | Description                              |
| ----------------------- | ---------------------------------------- |
| `GET /api/openapi.json` | OpenAPI 3.1 specification in JSON format |
| `GET /api/openapi.yaml` | OpenAPI 3.1 specification in YAML format |
| `GET /api/docs`         | Interactive Swagger UI documentation     |

> **Note:** Disable with `--no-swagger` in production to avoid exposing API structure.

### Examples

```bash
# Start with defaults
neurolink serve

# Start on specific port with Express
neurolink serve --port 8080 --framework express

# Start with custom config file
neurolink serve --config ./server.config.json
```

---

## `server `

Manage NeuroLink HTTP server for exposing AI agents as REST APIs.

### Subcommands

| Subcommand | Description                         |
| ---------- | ----------------------------------- |
| `start`    | Start the HTTP server in background |
| `stop`     | Stop the running server             |
| `status`   | Show server status                  |
| `routes`   | List all registered routes          |
| `config`   | Show or modify server configuration |
| `openapi`  | Generate OpenAPI specification      |

---

### `server start`

Start the HTTP server in background mode.

```bash
neurolink server start [options]
```

| Option        | Alias | Type    | Default | Description                                              |
| ------------- | ----- | ------- | ------- | -------------------------------------------------------- |
| `--port`      | `-p`  | number  | 3000    | Port to listen on                                        |
| `--host`      | `-H`  | string  | 0.0.0.0 | Host to bind to                                          |
| `--framework` | `-f`  | string  | hono    | Framework: hono, express, fastify, koa                   |
| `--basePath`  |       | string  | /api    | Base path for all routes                                 |
| `--cors`      |       | boolean | true    | Enable CORS                                              |
| `--rateLimit` |       | number  | 100     | Rate limit (requests per 15-minute window, 0 to disable) |

**Examples:**

```bash
# Start with defaults
neurolink server start

# Start on port 8080 with Express
neurolink server start -p 8080 --framework express
```

---

### `server stop`

Stop a running background server.

```bash
neurolink server stop [options]
```

| Option    | Type    | Default | Description                                 |
| --------- | ------- | ------- | ------------------------------------------- |
| `--force` | boolean | false   | Force stop even if server is not responding |

**Examples:**

```bash
# Stop gracefully
neurolink server stop

# Force stop
neurolink server stop --force
```

---

### `server status`

Show server status information.

```bash
neurolink server status [options]
```

| Option     | Type   | Default | Description               |
| ---------- | ------ | ------- | ------------------------- |
| `--format` | string | text    | Output format: text, json |

**Examples:**

```bash
# Text output
neurolink server status

# JSON output for scripting
neurolink server status --format json
```

---

### `server routes`

List all registered server routes.

```bash
neurolink server routes [options]
```

| Option     | Type   | Default | Description                                                  |
| ---------- | ------ | ------- | ------------------------------------------------------------ |
| `--format` | string | table   | Output format: text, json, table                             |
| `--group`  | string | all     | Filter by route group: agent, tool, mcp, memory, health, all |
| `--method` | string | all     | Filter by HTTP method: GET, POST, PUT, DELETE, PATCH, all    |

**Examples:**

```bash
# List all routes in table format
neurolink server routes

# List only agent routes
neurolink server routes --group agent

# List all POST endpoints as JSON
neurolink server routes --method POST --format json
```

---

### `server config`

Show or modify server configuration.

```bash
neurolink server config [options]
```

| Option     | Type    | Default | Description                            |
| ---------- | ------- | ------- | -------------------------------------- |
| `--get`    | string  |         | Get a specific config value            |
| `--set`    | string  |         | Set a config value (format: key=value) |
| `--reset`  | boolean | false   | Reset configuration to defaults        |
| `--format` | string  | text    | Output format: text, json              |

**Examples:**

```bash
# Show all configuration
neurolink server config

# Get specific value
neurolink server config --get defaultPort

# Set a value
neurolink server config --set defaultPort=8080

# Reset to defaults
neurolink server config --reset
```

---

### `server openapi`

Generate OpenAPI specification.

```bash
neurolink server openapi [options]
```

| Option       | Alias | Type   | Default | Description               |
| ------------ | ----- | ------ | ------- | ------------------------- |
| `--output`   | `-o`  | string | stdout  | Output file path          |
| `--format`   |       | string | json    | Output format: json, yaml |
| `--basePath` |       | string | /api    | Base path for all routes  |
| `--title`    |       | string |         | API title                 |
| `--version`  |       | string |         | API version               |

**Examples:**

```bash
# Generate to stdout
neurolink server openapi

# Save to file
neurolink server openapi -o openapi.json

# Generate YAML format
neurolink server openapi --format yaml -o openapi.yaml
```

## Global Flags (available on every command)

| Flag                        | Alias                   | Default | Description                                                               |
| --------------------------- | ----------------------- | ------- | ------------------------------------------------------------------------- |
| `--provider`                | `-p`                    | `auto`  | AI provider to use (auto-selects best available).                         |
| `--model`                   | `-m`                    |         | Specific model to use.                                                    |
| `--temperature`             | `-t`                    | `0.7`   | Creativity level (0.0 = focused, 1.0 = creative).                         |
| `--maxTokens`               | `--max`                 | `1000`  | Maximum tokens to generate.                                               |
| `--system`                  | `-s`                    |         | System prompt to guide AI behavior.                                       |
| `--format`                  | `-f`, `--output-format` | `text`  | Output format: `text`, `json`, `table`.                                   |
| `--output`                  | `-o`                    |         | Save output to file.                                                      |
| `--configFile `       |                         |         | Use a specific configuration file.                                        |
| `--dryRun`                  |                         | `false` | Generate without calling providers (returns mocked analytics/evaluation). |
| `--noColor`                 |                         | `false` | Disable ANSI colours.                                                     |
| `--delay `              |                         |         | Delay between batched operations.                                         |
| `--domain `           | `-d`                    |         | Domain type for specialized processing and optimization.                  |
| `--toolUsageContext ` |                         |         | Describe expected tool usage for better evaluation feedback.              |
| `--debug`                   | `-v`, `--verbose`       | `false` | Enable debug mode with verbose output.                                    |
| `--quiet`                   | `-q`                    | `true`  | Suppress non-essential output.                                            |
| `--timeout`                 |                         | `120`   | Maximum execution time in seconds.                                        |
| `--disableTools`            |                         | `false` | Disable MCP tool integration.                                             |
| `--enableAnalytics`         |                         | `false` | Enable usage analytics collection.                                        |
| `--enableEvaluation`        |                         | `false` | Enable AI response quality evaluation.                                    |
| `--region`                  | `-r`                    |         | Vertex AI region (e.g., `us-central1`).                                   |

## JSON-Friendly Automation

- `--format json` returns structured output including analytics, evaluation, tool calls, and response metadata.
- Combine with `--enableAnalytics --enableEvaluation` to capture usage costs and quality scores in automation pipelines.
- Use `--output ` to persist raw responses alongside JSON logs.

## rag \

Document processing and RAG pipeline commands.

| Subcommand | Description                                 |
| ---------- | ------------------------------------------- |
| `chunk`    | Chunk a document using a specified strategy |
| `index`    | Index documents into a vector store         |
| `query`    | Query indexed documents                     |

### rag chunk

Chunk a document file into smaller pieces for RAG processing.

```bash
neurolink rag chunk  [options]
```

| Option            | Alias | Type   | Default     | Description                |
| ----------------- | ----- | ------ | ----------- | -------------------------- |
| `--strategy`      | `-s`  | string | `recursive` | Chunking strategy          |
| `--chunk-size`    |       | number | `1000`      | Maximum chunk size         |
| `--chunk-overlap` |       | number | `200`       | Overlap between chunks     |
| `--output`        | `-o`  | string | stdout      | Output file path           |
| `--format`        | `-f`  | string | `text`      | Output format (text, json) |

**Chunking Strategies:** `character`, `recursive`, `sentence`, `token`, `markdown`, `html`, `json`, `latex`, `semantic`, `semantic-markdown`

**Examples:**

```bash
# Default chunking
neurolink rag chunk ./docs/guide.md

# Markdown-aware chunking with JSON output
neurolink rag chunk ./docs/guide.md --strategy markdown --format json

# Custom size and overlap
neurolink rag chunk ./docs/guide.md --chunk-size 512 --chunk-overlap 50 --output chunks.json
```

### RAG Flags on generate/stream

RAG can also be used directly with `generate` and `stream` commands via `--rag-files`:

```bash
neurolink generate "What is this about?" --rag-files ./docs/guide.md
neurolink stream "Summarize" --rag-files ./docs/a.md ./docs/b.md --rag-top-k 10
```

| Flag                  | Type     | Default       | Description                         |
| --------------------- | -------- | ------------- | ----------------------------------- |
| `--rag-files`         | string[] | -             | File paths to load for RAG context  |
| `--rag-strategy`      | string   | auto-detected | Chunking strategy for RAG documents |
| `--rag-chunk-size`    | number   | 1000          | Maximum chunk size in characters    |
| `--rag-chunk-overlap` | number   | 200           | Overlap between adjacent chunks     |
| `--rag-top-k`         | number   | 5             | Number of top results to retrieve   |

## Troubleshooting

| Issue                              | Tip                                                                                                      |
| ---------------------------------- | -------------------------------------------------------------------------------------------------------- |
| `Unknown argument`                 | Check spelling; run `command --help` for the latest options.                                             |
| CLI exits immediately              | Upgrade to the newest release or clear old `neurolink` binaries on PATH.                                 |
| Provider shows as `not-configured` | Run `neurolink setup --provider ` or populate `.env`.                                              |
| Analytics/evaluation missing       | Ensure both `--enableAnalytics`/`--enableEvaluation` and provider credentials for the judge model exist. |

For advanced workflows (batching, tooling, configuration management) see the relevant guides in the documentation sidebar.

---

## Related Features

**Q4 2025:**

- [CLI Loop Sessions](/docs/features/cli-loop-sessions) – Persistent interactive mode with session management
- [Redis Conversation Export](/docs/features/conversation-history) – Export session history via `memory export`
- [Guardrails Middleware](/docs/features/guardrails) – Content filtering (use `--middleware-preset security`)

**Q3 2025:**

- [Multimodal Chat](/docs/features/multimodal-chat) – Use `--image` flag with `generate` or `stream`
- [Auto Evaluation](/docs/features/auto-evaluation) – Enable with `--enableEvaluation`
- [Provider Orchestration](/docs/features/provider-orchestration) – Automatic fallback and routing

**Documentation:**

- [SDK API Reference](/docs/sdk/api-reference) – TypeScript API equivalents
- [Configuration Guide](/docs/deployment/configuration) – Environment variables and config files
- [Troubleshooting](/docs/reference/troubleshooting) – Detailed error solutions

---

## CLI Guide

<!-- Source: cli/index.md -->

# CLI Guide

The NeuroLink CLI provides a professional command-line interface for AI text generation, provider management, and workflow automation.

##  Overview

The CLI is designed for:

- **Developers** who want to integrate AI into scripts and workflows
- **Content creators** who need quick AI text generation
- **System administrators** who manage AI provider configurations
- **Researchers** who experiment with different AI models and providers

##  Quick Reference

```bash
# Text generation (primary commands)
neurolink generate "Your prompt here"
neurolink gen "Your prompt"           # Short form

# Real-time streaming
neurolink stream "Tell me a story"

# Provider management
neurolink status                      # Check all providers
neurolink provider status --verbose  # Detailed diagnostics
```

```bash
# With analytics and evaluation
neurolink generate "Write code" --enable-analytics --enable-evaluation

# Custom provider and model
neurolink gen "Explain AI" --provider openai --model gpt-4

# Batch processing
echo -e "Prompt 1\nPrompt 2" | neurolink batch prompts.txt

# Output to file
neurolink generate "Documentation" --output result.md
```

```bash
# Built-in tools (working)
neurolink generate "What time is it?" --debug

# Disable tools
neurolink generate "Pure text" --disable-tools

# MCP server management
neurolink mcp discover
neurolink mcp list
neurolink mcp install
```

```bash
# Start server in foreground
neurolink serve --port 3000 --framework hono

# Background server management
neurolink server start --port 8080
neurolink server status
neurolink server stop

# View and manage routes
neurolink server routes
neurolink server routes --group agent --format json

# Configuration management
neurolink server config
neurolink server config --set defaultPort=8080
```

##  Documentation Sections

-  **[Commands Reference](/docs/cli/commands)**

  Complete reference for all CLI commands, options, and parameters with detailed explanations.

-  **[Examples](/docs/examples)**

  Practical examples and common usage patterns for different scenarios and workflows.

-  **[Advanced Usage](/docs/advanced)**

  Advanced features like batch processing, streaming, analytics, and custom configurations.

##  Installation

The CLI requires no installation for basic usage:

```bash
# Direct usage (recommended)
npx @juspay/neurolink generate "Hello, AI"

# Global installation (optional)
npm install -g @juspay/neurolink
neurolink generate "Hello, AI"
```

## ⚙️ Configuration

The CLI automatically loads configuration from:

1. **Environment variables** (`.env` file)
2. **Command-line options**
3. **Auto-detection** of available providers

```bash
# Create .env file
echo 'OPENAI_API_KEY="sk-your-key"' > .env
echo 'GOOGLE_AI_API_KEY="AIza-your-key"' >> .env

# Test configuration
neurolink status
```

##  Interactive Features

The CLI includes several interactive and automation features:

:::tip[Auto-Provider Selection]

NeuroLink automatically selects the best available provider based on configuration and performance.
:::

:::info[Built-in Tools]

All commands include 6 built-in tools by default: time, file operations, math calculations, and more.
:::

:::note[Streaming Support]

Real-time streaming displays results as they're generated, perfect for long-form content.
:::

##  Integration

The CLI works seamlessly with:

- **Shell scripts** and automation
- **CI/CD pipelines** for automated content generation
- **Git hooks** for documentation updates
- **Cron jobs** for scheduled AI tasks

## 🆘 Getting Help

```bash
# General help
neurolink --help

# Command-specific help
neurolink generate --help
neurolink mcp --help

# Check provider status
neurolink status --verbose
```

For troubleshooting, see our [Troubleshooting Guide](/docs/reference/troubleshooting) or [FAQ](/docs/reference/faq).

---

## Advanced CLI Usage

<!-- Source: cli/advanced.md -->

# Advanced CLI Usage

Power user features, optimization techniques, and advanced workflows for the NeuroLink CLI.

##  Advanced Generation Techniques

### Multi-Provider Strategies

```bash
# Provider fallback chain
generate_with_fallback() {
  local prompt="$1"
  local providers=("google-ai" "openai" "anthropic")

  for provider in "${providers[@]}"; do
    if result=$(npx @juspay/neurolink gen "$prompt" --provider $provider 2>/dev/null); then
      echo "✅ Success with $provider"
      echo "$result"
      return 0
    fi
  done

  echo "❌ All providers failed"
  return 1
}

# Usage
generate_with_fallback "Complex technical analysis"
```

### Dynamic Provider Selection

```bash
# Select provider based on task type
select_provider_by_task() {
  local task_type="$1"

  case $task_type in
    "code")
      echo "anthropic"  # Best for code analysis
      ;;
    "creative")
      echo "openai"     # Best for creative content
      ;;
    "fast")
      echo "google-ai"  # Fastest responses
      ;;
    *)
      echo "auto"       # Let NeuroLink decide
      ;;
  esac
}

# Usage
provider=$(select_provider_by_task "code")
npx @juspay/neurolink gen "Write a Python class" --provider $provider
```

##  Analytics and Monitoring

### Advanced Analytics Usage

```bash
# Context-aware analytics
npx @juspay/neurolink gen "Design microservices architecture" \
  --enable-analytics \
  --context '{
    "user_id": "dev123",
    "project": "ecommerce-platform",
    "team": "backend",
    "environment": "development",
    "session_id": "sess_456"
  }' \
  --debug

# Business intelligence tracking
npx @juspay/neurolink gen "Create marketing strategy" \
  --enable-analytics \
  --enable-evaluation \
  --evaluation-domain "Marketing Director" \
  --context '{
    "department": "marketing",
    "campaign": "Q1-launch",
    "budget": "high",
    "target_audience": "enterprise"
  }' \
  --debug
```

### Performance Monitoring

```bash
# Provider performance comparison
compare_providers() {
  local prompt="$1"
  local providers=("openai" "google-ai" "anthropic")

  echo " Comparing provider performance..."
  echo "Prompt: $prompt"
  echo

  for provider in "${providers[@]}"; do
    echo "Testing $provider..."
    start_time=$(date +%s%N)

    result=$(npx @juspay/neurolink gen "$prompt" \
      --provider $provider \
      --enable-analytics \
      --debug 2>/dev/null)

    end_time=$(date +%s%N)
    duration=$(( (end_time - start_time) / 1000000 ))

    echo "✅ $provider: ${duration}ms"
    echo
  done
}

# Usage
compare_providers "Explain quantum computing briefly"
```

### Real-time Monitoring Dashboard

```bash
#!/bin/bash
# provider-dashboard.sh - Real-time provider monitoring

monitor_providers() {
  while true; do
    clear
    echo " NeuroLink Provider Dashboard"
    echo "==============================="
    date
    echo

    # Check provider status
    status=$(npx @juspay/neurolink status --json 2>/dev/null)

    if [ $? -eq 0 ]; then
      echo " Provider Status:"
      echo "$status" | jq -r '.[] | "  \(.name): \(.status) (\(.responseTime)ms)"'

      # Count working providers
      working=$(echo "$status" | jq '[.[] | select(.status == "working")] | length')
      total=$(echo "$status" | jq 'length')

      echo
      echo " Summary: $working/$total providers working"
    else
      echo "❌ Failed to get provider status"
    fi

    echo
    echo "Press Ctrl+C to exit"
    sleep 30
  done
}

# Run monitoring
monitor_providers
```

##  Configuration Management

### Advanced Configuration

```bash
# Environment-specific configs
setup_environment() {
  local env="$1"

  case $env in
    "development")
      export NEUROLINK_LOG_LEVEL="debug"
      export NEUROLINK_CACHE_ENABLED="false"
      export NEUROLINK_TIMEOUT="60000"
      ;;
    "staging")
      export NEUROLINK_LOG_LEVEL="info"
      export NEUROLINK_CACHE_ENABLED="true"
      export NEUROLINK_TIMEOUT="30000"
      ;;
    "production")
      export NEUROLINK_LOG_LEVEL="warn"
      export NEUROLINK_CACHE_ENABLED="true"
      export NEUROLINK_TIMEOUT="15000"
      export NEUROLINK_ANALYTICS_ENABLED="true"
      ;;
  esac

  echo "✅ Environment set to: $env"
}

# Usage
setup_environment "production"
npx @juspay/neurolink gen "Production prompt"
```

### Dynamic Configuration

```bash
# Load configuration from external source
load_remote_config() {
  local config_url="$1"

  # Fetch configuration
  config=$(curl -s "$config_url")

  if [ $? -eq 0 ]; then
    # Export environment variables
    echo "$config" | jq -r 'to_entries[] | "export \(.key)=\(.value)"' | source /dev/stdin
    echo "✅ Configuration loaded from $config_url"
  else
    echo "❌ Failed to load configuration"
    return 1
  fi
}

# Usage (example)
# load_remote_config "https://config.company.com/neurolink.json"
```

##  Specialized Workflows

### Code Analysis Pipeline

```bash
#!/bin/bash
# code-analyzer.sh - Comprehensive code analysis

analyze_codebase() {
  local project_path="$1"
  local output_dir="$2"

  mkdir -p "$output_dir"

  echo " Analyzing codebase at: $project_path"

  # Find code files
  find "$project_path" -name "*.ts" -o -name "*.js" -o -name "*.py" | while read file; do
    echo "Analyzing: $file"

    # Code review
    npx @juspay/neurolink gen "
    Perform comprehensive code review:
    1. Code quality and best-practice adherence
    2. Security vulnerabilities
    3. Performance optimizations
    4. Maintainability improvements

    File: $(basename $file)
    " --enable-evaluation \
      --evaluation-domain "Senior Software Architect" \
      --context "{\"file\":\"$file\",\"project\":\"$project_path\"}" \
      > "$output_dir/review-$(basename $file).md"

    # Generate tests
    npx @juspay/neurolink gen "
    Generate comprehensive unit tests for this code.
    Include edge cases and error scenarios.

    File: $(basename $file)
    " --provider anthropic \
      > "$output_dir/tests-$(basename $file).md"

    sleep 2  # Rate limiting
  done

  echo "✅ Analysis complete. Results in: $output_dir"
}

# Usage
# analyze_codebase "./src" "./analysis-results"
```

### Documentation Generation Pipeline

```bash
#!/bin/bash
# docs-generator.sh - Automated documentation generation

generate_project_docs() {
  local project_path="$1"
  local docs_dir="$2"

  mkdir -p "$docs_dir"

  echo " Generating documentation for: $project_path"

  # API documentation
  npx @juspay/neurolink gen "
  Generate comprehensive API documentation for this project.
  Include:
  - Endpoint descriptions
  - Request/response examples
  - Authentication methods
  - Error codes and handling

  Project path: $project_path
  " --enable-analytics \
    --context "{\"project\":\"$project_path\",\"type\":\"api-docs\"}" \
    --max-tokens 2000 \
    > "$docs_dir/api-reference.md"

  # User guide
  npx @juspay/neurolink gen "
  Create a comprehensive user guide for this project.
  Include:
  - Getting started
  - Installation instructions
  - Usage examples
  - Troubleshooting

  Project path: $project_path
  " --enable-evaluation \
    --evaluation-domain "Technical Writer" \
    --max-tokens 1500 \
    > "$docs_dir/user-guide.md"

  # Developer guide
  npx @juspay/neurolink gen "
  Write a developer guide for contributing to this project.
  Include:
  - Development setup
  - Architecture overview
  - Coding standards
  - Testing guidelines

  Project path: $project_path
  " --provider anthropic \
    --max-tokens 1500 \
    > "$docs_dir/developer-guide.md"

  echo "✅ Documentation generated in: $docs_dir"
}

# Usage
# generate_project_docs "./my-project" "./docs"
```

##  Batch Processing Optimization

### Parallel Processing

```bash
# parallel-batch.sh - Optimized batch processing

parallel_generate() {
  local prompts_file="$1"
  local max_jobs="${2:-4}"
  local output_dir="${3:-./results}"

  mkdir -p "$output_dir"

  echo " Processing prompts in parallel (max jobs: $max_jobs)"

  # Use GNU parallel for concurrent processing
  cat "$prompts_file" | parallel -j "$max_jobs" --line-buffer \
    'echo "Processing: {}" &&
     npx @juspay/neurolink gen "{}" \
       --enable-analytics \
       --json > "'"$output_dir"'/result-{#}.json" &&
     echo "✅ Completed: {}"'

  echo "✅ All prompts processed. Results in: $output_dir"
}

# Usage
# parallel_generate "prompts.txt" 6 "./batch-results"
```

### Smart Rate Limiting

```bash
# rate-limited-batch.sh - Intelligent rate limiting

smart_batch_process() {
  local prompts_file="$1"
  local provider="$2"
  local output_file="${3:-batch-results.json}"

  echo " Smart batch processing with $provider"

  # Determine optimal delay based on provider
  case $provider in
    "openai")
      delay=3000    # Conservative for OpenAI rate limits
      ;;
    "google-ai")
      delay=1000    # Google AI has generous limits
      ;;
    "anthropic")
      delay=2000    # Moderate delay for Claude
      ;;
    *)
      delay=2000    # Default safe delay
      ;;
  esac

  echo "Using ${delay}ms delay between requests"

  # Process with adaptive delay
  npx @juspay/neurolink batch "$prompts_file" \
    --provider "$provider" \
    --delay "$delay" \
    --output "$output_file" \
    --enable-analytics

  echo "✅ Batch processing complete"
}

# Usage
# smart_batch_process "prompts.txt" "google-ai" "results.json"
```

##  Security and Compliance

### Secure API Key Management

```bash
# secure-setup.sh - Secure configuration management

setup_secure_environment() {
  local env="$1"

  # Use external secret management
  case $env in
    "aws")
      echo " Loading secrets from AWS Secrets Manager"
      export OPENAI_API_KEY=$(aws secretsmanager get-secret-value \
        --secret-id openai-api-key \
        --query SecretString --output text)

      export GOOGLE_AI_API_KEY=$(aws secretsmanager get-secret-value \
        --secret-id google-ai-api-key \
        --query SecretString --output text)
      ;;

    "azure")
      echo " Loading secrets from Azure Key Vault"
      export OPENAI_API_KEY=$(az keyvault secret show \
        --name openai-key --vault-name my-vault \
        --query value -o tsv)
      ;;

    "gcp")
      echo " Loading secrets from Google Secret Manager"
      export OPENAI_API_KEY=$(gcloud secrets versions access latest \
        --secret="openai-api-key")
      ;;

    *)
      echo "❌ Unknown secret management system: $env"
      return 1
      ;;
  esac

  echo "✅ Secure environment configured"
}

# Usage
# setup_secure_environment "aws"
```

### Audit Logging

```bash
# audit-logger.sh - Comprehensive audit logging

audit_generate() {
  local prompt="$1"
  local provider="$2"
  local user_id="${3:-unknown}"

  # Create audit log entry
  local timestamp=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
  local session_id=$(uuidgen)

  echo " Audit Log Entry:"
  echo "  Timestamp: $timestamp"
  echo "  Session ID: $session_id"
  echo "  User ID: $user_id"
  echo "  Provider: $provider"
  echo "  Prompt length: ${#prompt} characters"

  # Execute with audit context
  result=$(npx @juspay/neurolink gen "$prompt" \
    --provider "$provider" \
    --enable-analytics \
    --context "{
      \"audit\": {
        \"timestamp\": \"$timestamp\",
        \"session_id\": \"$session_id\",
        \"user_id\": \"$user_id\"
      }
    }" \
    --debug)

  # Log the result
  echo "✅ Generation complete - Session: $session_id"
  echo "$result"

  # Store audit record
  echo "{
    \"timestamp\": \"$timestamp\",
    \"session_id\": \"$session_id\",
    \"user_id\": \"$user_id\",
    \"provider\": \"$provider\",
    \"prompt_length\": ${#prompt},
    \"status\": \"success\"
  }" >> audit.log
}

# Usage
# audit_generate "Generate report" "openai" "user123"
```

##  Performance Optimization

### Caching Strategies

```bash
# cache-manager.sh - Advanced caching for repeated prompts

cached_generate() {
  local prompt="$1"
  local provider="$2"
  local cache_dir="${3:-.neurolink-cache}"

  mkdir -p "$cache_dir"

  # Create cache key
  local cache_key=$(echo -n "$prompt|$provider" | sha256sum | cut -d' ' -f1)
  local cache_file="$cache_dir/$cache_key.json"

  # Check cache
  if [ -f "$cache_file" ] && [ $(($(date +%s) - $(stat -c %Y "$cache_file"))) -lt 3600 ]; then
    echo " Cache hit for prompt"
    cat "$cache_file" | jq -r '.content'
    return 0
  fi

  # Generate and cache
  echo " Generating and caching..."
  result=$(npx @juspay/neurolink gen "$prompt" \
    --provider "$provider" \
    --json)

  if [ $? -eq 0 ]; then
    echo "$result" > "$cache_file"
    echo "$result" | jq -r '.content'
    echo "✅ Result cached"
  else
    echo "❌ Generation failed"
    return 1
  fi
}

# Usage
# cached_generate "Explain caching" "openai" ".cache"
```

### Connection Pooling

```bash
# connection-pool.sh - Manage provider connections efficiently

manage_provider_pool() {
  local action="$1"

  case $action in
    "warm-up")
      echo " Warming up provider connections..."

      # Pre-warm connections with simple prompts
      npx @juspay/neurolink gen "Hello" --provider openai &
      npx @juspay/neurolink gen "Hello" --provider google-ai &
      npx @juspay/neurolink gen "Hello" --provider anthropic &

      wait
      echo "✅ Provider pool warmed up"
      ;;

    "health-check")
      echo " Checking provider health..."
      npx @juspay/neurolink status --verbose
      ;;

    "reset")
      echo " Resetting provider connections..."
      # Implementation depends on your provider management
      echo "✅ Provider pool reset"
      ;;

    *)
      echo "Usage: manage_provider_pool {warm-up|health-check|reset}"
      ;;
  esac
}

# Usage
# manage_provider_pool "warm-up"
```

##  Custom Tool Development

### MCP Server Integration

```bash
# mcp-workflow.sh - Custom MCP server integration

setup_custom_mcp() {
  local server_name="$1"
  local server_command="$2"

  echo " Setting up custom MCP server: $server_name"

  # Add server to configuration
  npx @juspay/neurolink mcp add "$server_name" "$server_command"

  # Test server connectivity
  if npx @juspay/neurolink mcp test "$server_name"; then
    echo "✅ MCP server $server_name is working"

    # List available tools
    echo "️ Available tools:"
    npx @juspay/neurolink mcp list --server "$server_name"
  else
    echo "❌ MCP server $server_name failed to start"
    return 1
  fi
}

# Usage
# setup_custom_mcp "filesystem" "npx @modelcontextprotocol/server-filesystem /"
```

### Tool Chain Automation

```bash
# tool-chain.sh - Automated tool chain execution

execute_tool_chain() {
  local workflow_file="$1"

  echo "⚙️ Executing tool chain workflow: $workflow_file"

  # Read workflow configuration
  if [ ! -f "$workflow_file" ]; then
    echo "❌ Workflow file not found: $workflow_file"
    return 1
  fi

  # Process each step
  jq -c '.steps[]' "$workflow_file" | while read step; do
    local tool=$(echo "$step" | jq -r '.tool')
    local prompt=$(echo "$step" | jq -r '.prompt')
    local params=$(echo "$step" | jq -r '.params // "{}"')

    echo " Executing step: $tool"

    # Execute tool via NeuroLink
    npx @juspay/neurolink gen "$prompt" \
      --enable-analytics \
      --context "$params" \
      --debug

    echo "✅ Step completed: $tool"
    sleep 1
  done

  echo "✅ Tool chain execution complete"
}

# Example workflow.json:
# {
#   "steps": [
#     {
#       "tool": "analyzer",
#       "prompt": "Analyze the codebase structure",
#       "params": {"path": "./src"}
#     },
#     {
#       "tool": "documenter",
#       "prompt": "Generate API documentation",
#       "params": {"format": "markdown"}
#     }
#   ]
# }

# Usage
# execute_tool_chain "workflow.json"
```

##  Metrics and Reporting

### Advanced Reporting

```bash
# metrics-reporter.sh - Comprehensive metrics reporting

generate_usage_report() {
  local period="${1:-daily}"
  local output_file="${2:-usage-report.md}"

  echo " Generating $period usage report..."

  # Analyze usage patterns
  npx @juspay/neurolink gen "
  Generate a comprehensive usage report based on these analytics:

  Period: $period
  Report type: Executive summary

  Include:
  - Usage trends and patterns
  - Provider performance comparison
  - Cost analysis and optimization recommendations
  - Key insights and recommendations

  Format as professional markdown report.
  " --enable-analytics \
    --evaluation-domain "Data Analyst" \
    --max-tokens 2000 \
    > "$output_file"

  echo "✅ Usage report generated: $output_file"
}

# Usage
# generate_usage_report "weekly" "weekly-report.md"
```

##  Specialized Use Cases

### CI/CD Integration

```bash
# ci-cd-integration.sh - Advanced CI/CD workflows

run_ai_quality_gate() {
  local commit_hash="$1"
  local threshold="${2:-8}"

  echo " Running AI quality gate for commit: $commit_hash"

  # Get changed files
  changed_files=$(git diff --name-only HEAD~1)

  # Analyze changes
  quality_score=$(npx @juspay/neurolink gen "
  Analyze these code changes for quality score (1-10):

  Commit: $commit_hash
  Changed files: $changed_files

  Evaluate:
  - Code quality and best-practice compliance
  - Test coverage adequacy
  - Documentation completeness
  - Security considerations

  Respond only with numeric score (1-10).
  " --enable-evaluation \
    --evaluation-domain "Senior Code Reviewer" \
    --max-tokens 10 | grep -o '[0-9]' | head -1)

  echo " Quality score: $quality_score/10"

  if [ "$quality_score" -ge "$threshold" ]; then
    echo "✅ Quality gate passed"
    exit 0
  else
    echo "❌ Quality gate failed (score: $quality_score, threshold: $threshold)"
    exit 1
  fi
}

# Usage in CI pipeline
# run_ai_quality_gate "$GITHUB_SHA" 7
```

### Content Management System

```bash
# cms-integration.sh - AI-powered content management

manage_content() {
  local action="$1"
  local content_type="$2"
  local target="${3:-.}"

  case $action in
    "generate")
      echo " Generating $content_type content..."

      case $content_type in
        "blog-post")
          npx @juspay/neurolink gen "
          Write a professional blog post about AI development tools.
          Include: introduction, key benefits, use cases, conclusion.
          Target audience: Software developers and engineering managers.
          Tone: Professional but approachable.
          Length: 800-1000 words.
          " --enable-evaluation \
            --evaluation-domain "Content Marketing Manager" \
            > "$target/blog-post-$(date +%Y%m%d).md"
          ;;

        "documentation")
          npx @juspay/neurolink gen "
          Create comprehensive API documentation.
          Include: authentication, endpoints, examples, error handling.
          Format: OpenAPI 3.0 specification.
          " --provider anthropic \
            > "$target/api-docs-$(date +%Y%m%d).yaml"
          ;;

        "social-media")
          npx @juspay/neurolink gen "
          Create 5 social media posts about AI automation.
          Platforms: Twitter, LinkedIn.
          Include relevant hashtags.
          Tone: Engaging and informative.
          " > "$target/social-content-$(date +%Y%m%d).txt"
          ;;
      esac
      ;;

    "review")
      echo " Reviewing existing content..."
      find "$target" -name "*.md" -o -name "*.txt" | while read file; do
        npx @juspay/neurolink gen "
        Review this content for:
        - Clarity and readability
        - Technical accuracy
        - SEO optimization
        - Engagement potential

        Provide specific improvement recommendations.
        " --enable-evaluation \
          --evaluation-domain "Content Editor" \
          > "${file%.md}-review.md"
      done
      ;;
  esac
}

# Usage
# manage_content "generate" "blog-post" "./content"
# manage_content "review" "" "./content"
```

This advanced CLI usage guide provides sophisticated patterns and techniques for power users who want to maximize the capabilities of NeuroLink CLI in production environments.

##  Related Documentation

- [CLI Commands Reference](/docs/cli/commands) - Complete command documentation
- [CLI Examples](/docs/examples) - Practical usage examples
- [Environment Variables](/docs/getting-started/environment-variables) - Configuration
- [SDK Advanced Features](/docs/sdk/advanced-features) - Programmatic equivalents
- [Troubleshooting](/docs/reference/troubleshooting) - Common issues

---

## CLI Examples

<!-- Source: cli/examples.md -->

# CLI Examples

Practical examples and usage patterns for the NeuroLink CLI.

##  Quick Start Examples

### Basic Text Generation

```bash
# Simple generation
npx @juspay/neurolink gen "Write a Python function to reverse a string"

# With specific provider
npx @juspay/neurolink gen "Explain quantum computing" --provider google-ai

# Creative writing with high temperature
npx @juspay/neurolink gen "Write a short poem about AI" --temperature 0.9
```

### Provider Testing

```bash
# Check all providers
npx @juspay/neurolink status

# Test specific provider
npx @juspay/neurolink gen "Hello" --provider openai

# Find best available provider
npx @juspay/neurolink get-best-provider
```

##  Development Workflows

### Code Generation

```bash
# Generate TypeScript interfaces
npx @juspay/neurolink gen "
Create TypeScript interfaces for:
- User profile with id, name, email
- API response with data, status, message
"

# Generate test cases
npx @juspay/neurolink gen "
Write Jest test cases for a function that calculates compound interest.
Include edge cases and error handling.
" --provider anthropic
```

### Documentation Generation

```bash
# Generate API documentation
npx @juspay/neurolink gen "
Create API documentation for a REST endpoint that:
- Accepts POST requests to /api/users
- Creates new user accounts
- Returns user ID and status
" --max-tokens 1000

# Generate README sections
npx @juspay/neurolink gen "
Write a 'Getting Started' section for a Node.js CLI tool
that processes CSV files. Include installation and basic usage.
"
```

##  Business Use Cases

### Content Creation

```bash
# Marketing copy
npx @juspay/neurolink gen "
Write compelling product description for an AI development platform
that supports multiple providers and has built-in tools.
" --temperature 0.8

# Email templates
npx @juspay/neurolink gen "
Create a professional email template for announcing
new API features to enterprise customers.
"

# Social media content
npx @juspay/neurolink gen "
Write 3 Twitter posts about AI automation benefits
for software development teams. Keep under 280 characters each.
"
```

### Business Analysis

```bash
# Market research
npx @juspay/neurolink gen "
Analyze the current trends in AI development tools.
Focus on developer experience and enterprise adoption.
" --provider anthropic --max-tokens 1500

# Competitive analysis
npx @juspay/neurolink gen "
Compare the advantages of multi-provider AI platforms
versus single-provider solutions for enterprise use.
"
```

##  Batch Processing

### Content Pipeline

```bash
# Create prompts file
cat > content-prompts.txt  review-prompts.txt  docs/api.md",
    "ai:test": "npx @juspay/neurolink status",
    "ai:review": "npx @juspay/neurolink gen 'Review this codebase for improvements' --provider anthropic"
  }
}
```

### GitHub Actions

```yaml
name: AI Documentation
on: [push]
jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Generate docs
        run: |
          npx @juspay/neurolink gen "Create changelog for latest changes" > CHANGELOG.md
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      - name: Commit docs
        run: |
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git add CHANGELOG.md
          git commit -m "Update AI-generated changelog" || exit 0
          git push
```

##  Production Workflows

### Content Management

```bash
# Daily content generation
#!/bin/bash
DATE=$(date +"%Y-%m-%d")

# Generate daily summary
npx @juspay/neurolink gen "
Create a daily engineering summary for $DATE.
Include: progress updates, blockers, next steps.
" --enable-analytics > reports/daily-$DATE.md

# Generate team updates
npx @juspay/neurolink gen "
Write team update email template for weekly standup.
Include sections for achievements, challenges, goals.
" > templates/weekly-update.md
```

### Code Review Pipeline

```bash
#!/bin/bash
# AI-assisted code review

# Get changed files
files=$(git diff --name-only HEAD~1)

# Review each file
for file in $files; do
  if [[ $file == *.ts ]] || [[ $file == *.js ]]; then
    echo "Reviewing $file..."
    npx @juspay/neurolink gen "
    Review this code for:
    - Best practices
    - Security issues
    - Performance optimizations
    - Maintainability

    File: $file
    " --enable-evaluation \
      --evaluation-domain "Senior Code Reviewer" \
      > reviews/review-$(basename $file).md
  fi
done
```

### Monitoring and Alerts

```bash
#!/bin/bash
# Provider health monitoring

status=$(npx @juspay/neurolink status --json)
working=$(echo $status | jq '[.[] | select(.status == "working")] | length')
total=$(echo $status | jq 'length')

if [ $working -lt $total ]; then
  # Generate alert message
  alert=$(npx @juspay/neurolink gen "
  Create alert message: $working out of $total AI providers are working.
  Include impact assessment and recommended actions.
  " --max-tokens 200)

  # Send to monitoring system
  curl -X POST webhook-url -d "message=$alert"
fi
```

##  Performance Optimization

### Provider Selection

```bash
# Find fastest provider
fastest=$(npx @juspay/neurolink get-best-provider --criteria speed)
echo "Using fastest provider: $fastest"

# Cost optimization
cheapest=$(npx @juspay/neurolink models best --use-case cheapest)
npx @juspay/neurolink gen "Budget-conscious prompt" --provider $cheapest

# Quality optimization
npx @juspay/neurolink gen "High-quality analysis needed" \
  --provider anthropic \
  --enable-evaluation \
  --evaluation-domain "Expert Analyst"
```

### Batch Optimization

```bash
# Parallel processing with GNU parallel
cat prompts.txt | parallel -j 4 npx @juspay/neurolink gen {} \
  --provider openai \
  --max-tokens 500 \
  > results.txt

# Rate-limited processing
npx @juspay/neurolink batch prompts.txt \
  --delay 5000 \
  --provider google-ai \
  --output batch-results.json
```

##  Error Handling

### Robust Scripts

```bash
#!/bin/bash
# Error-resistant AI generation

generate_with_fallback() {
  local prompt="$1"
  local providers=("openai" "google-ai" "anthropic")

  for provider in "${providers[@]}"; do
    echo "Trying $provider..."
    if result=$(npx @juspay/neurolink gen "$prompt" --provider $provider 2>/dev/null); then
      echo "Success with $provider"
      echo "$result"
      return 0
    else
      echo "Failed with $provider, trying next..."
    fi
  done

  echo "All providers failed"
  return 1
}

# Usage
generate_with_fallback "Write a summary of AI trends"
```

### Timeout Handling

```bash
# Long-running generation with timeout
timeout 120s npx @juspay/neurolink gen "
Generate comprehensive technical documentation for our API.
Include: authentication, endpoints, examples, error codes.
" --max-tokens 3000 || echo "Generation timed out"

# Streaming with timeout
timeout 60s npx @juspay/neurolink stream "
Tell a long story about AI development
" --provider openai || echo "Stream timed out"
```

##  Learning and Experimentation

### A/B Testing

```bash
# Compare provider outputs
prompt="Explain microservices architecture"

echo "=== OpenAI ==="
npx @juspay/neurolink gen "$prompt" --provider openai

echo "=== Google AI ==="
npx @juspay/neurolink gen "$prompt" --provider google-ai

echo "=== Anthropic ==="
npx @juspay/neurolink gen "$prompt" --provider anthropic
```

### Temperature Experiments

```bash
# Creative temperature range
prompt="Write a creative product name for AI tools"

for temp in 0.3 0.7 0.9; do
  echo "=== Temperature: $temp ==="
  npx @juspay/neurolink gen "$prompt" --temperature $temp
  echo
done
```

### Token Limit Testing

```bash
# Test different response lengths
prompt="Explain React hooks"

for tokens in 100 500 1000; do
  echo "=== $tokens tokens ==="
  npx @juspay/neurolink gen "$prompt" --max-tokens $tokens
  echo
done
```

##  Related Resources

- [CLI Commands Reference](/docs/cli/commands) - Complete command documentation
- [Advanced Usage](/docs/advanced) - Power user features
- [Installation Guide](/docs/getting-started/installation) - Setup instructions
- [Environment Variables](/docs/getting-started/environment-variables) - Configuration
- [Troubleshooting](/docs/reference/troubleshooting) - Common issues

---

# Features

## Feature Guides

<!-- Source: features/index.md -->

# Feature Guides

Comprehensive guides for all NeuroLink features organized by category. Each guide includes setup, usage patterns, configuration, and troubleshooting.

----------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- |
|  **Video Generation**  | Generate videos from text prompts using RunwayML (ML5, ML6 Turbo models). _Coming Soon_                                                     |
| ️ **[Image Generation with Gemini](/docs/features/image-generation)**         | Native image generation using Gemini 2.0 Flash Experimental with imagen-3.0-generate-002 model.                                             |
|  **[HTTP/Streamable HTTP Transport for MCP](/docs/mcp/http-transport)**              | Connect to remote MCP servers via HTTP with authentication, rate limiting, retry support, and session management.                           |
|  **[Audio Input](/docs/features/audio-input)**                                            | Real-time voice conversations with Gemini Live and audio streaming capabilities.                                                            |
| ️ **[Server Adapters](/docs/guides/server-adapters)**                        | Expose NeuroLink AI agents as HTTP APIs with Hono, Express, Fastify, and Koa. Production-ready with auth, rate limiting, and streaming.     |
|  **[RAG Document Processing](/docs/tutorials/rag)**                                   | Comprehensive document chunking (10 strategies), hybrid search (BM25 + vector), and reranking (5 types) for retrieval-augmented generation. |
|  **[Context Compaction](/docs/features/context-compaction)**                         | 4-stage context compaction pipeline with automatic budget management, per-provider token estimation, and non-destructive message tagging.   |
|  **[Memory](/docs/features/memory)**                                                           | Per-user condensed memory that persists across conversations. LLM-powered condensation with S3, Redis, or SQLite storage backends.          |

**Q1 2026 Highlights:**

- **Video Generation** _(Coming Soon)_: Create AI-generated videos with RunwayML integration supporting ML5 and ML6 Turbo models, customizable duration (5-10s), and watermark control
- **Gemini Image Generation**: Native support for Google's imagen-3.0-generate-002 model through Gemini 2.0 Flash Experimental for high-quality image synthesis
- **Remote MCP Servers**: HTTP/Streamable HTTP transport enables connecting to cloud-hosted MCP servers with Bearer token authentication, configurable rate limits, automatic retry with exponential backoff, and session management via `Mcp-Session-Id` header
- **Audio Input**: Real-time voice conversations with Gemini Live API enabling bidirectional audio streaming for interactive voice-based AI experiences
- **Server Adapters**: Deploy NeuroLink as production HTTP APIs with support for Hono (recommended), Express, Fastify, and Koa frameworks. Includes built-in authentication, rate limiting, caching, validation middleware, and SSE streaming support.
- **RAG Document Processing**: Full-featured retrieval-augmented generation with 10 chunking strategies (character, recursive, sentence, token, markdown, html, json, latex, semantic, semantic-markdown), hybrid search combining BM25 and vector similarity, 5 reranking types (simple, LLM, batch, cross-encoder, Cohere), and integration with Pinecone, Weaviate, and Chroma vector stores.

---

## Core Features (Q4 2025)

| Feature                                                                             | Description                                                                                        |
| ----------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
| ️ **[Image Generation](/docs/features/image-generation)**           | Generate images from text prompts using Gemini models via Vertex AI or Google AI Studio.           |
|  **[Enterprise HITL](/docs/features/enterprise-hitl)**                       | Production-ready HITL with approval workflows, confidence thresholds, and enterprise patterns.     |
|  **[Interactive CLI](/docs/cli)**                   | AI development environment with loop mode, session variables, and conversation memory.             |
| ️ **[MCP Tools Showcase](/docs/features/mcp-tools-showcase)**                    | Complete guide to 6 built-in tools and 58+ external MCP servers across 6 categories.               |
|  **[Human-in-the-Loop (HITL)](/docs/features/hitl)**                 | Pause AI tool execution for user approval before risky operations like file deletion or API calls. |
| ️ **[Guardrails Middleware](/docs/features/guardrails)**                  | Content filtering, PII detection, and safety checks for AI outputs with zero configuration.        |
|  **[Redis Conversation Export](/docs/features/conversation-history)** | Export complete session history as JSON for analytics, debugging, and compliance auditing.         |
|  **[Context Summarization](/docs/memory/summarization)**   | Automatic conversation compression for long-running sessions to stay within token limits.          |
|  **[LiteLLM Integration](/docs/getting-started/providers/litellm)**      | Access 100+ AI models from all major providers through unified LiteLLM routing interface.          |
| ☁️ **[SageMaker Integration](/docs/getting-started/providers/sagemaker)**             | Deploy and use custom trained models on AWS SageMaker infrastructure with full control.            |

---

## Core Features (Q3 2025)

| Feature                                                                        | Description                                                                                      |
| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------ |
| ️ **[Multimodal Chat Experiences](/docs/features/multimodal-chat)**    | Stream text and images together with automatic provider fallbacks and format conversion.         |
|  **[CSV File Support](/docs/features/csv-support)**                  | Process CSV files for data analysis with automatic format conversion. Works with all providers.  |
|  **[PDF File Support](/docs/features/pdf-support)**                 | Process PDF documents for visual analysis and content extraction. Native provider support.       |
|  **[Office Documents](/docs/features/office-documents)**               | Process DOCX, PPTX, XLSX files for document analysis. Native Bedrock, Vertex, Anthropic support. |
|  **[Auto Evaluation Engine](/docs/features/auto-evaluation)**         | Automated quality scoring and metrics export for AI response validation using LLM-as-judge.      |
|  **[CLI Loop Sessions](/docs/features/cli-loop-sessions)**               | Persistent interactive mode with conversation memory and session state for prompt engineering.   |
|  **[Regional Streaming Controls](/docs/features/regional-streaming)**      | Region-specific model deployment and routing for compliance and latency optimization.            |
|  **[Provider Orchestration Brain](/docs/features/provider-orchestration)** | Adaptive provider and model selection with intelligent fallbacks based on task classification.   |

---

## Platform Capabilities at a Glance

| Category                 | Features                                                                                                           | Documentation                                                                                                                            |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------- |
| **Provider unification** | 14+ providers with automatic failover, cost-aware routing, provider orchestration (Q3)                             | [Provider Setup](/docs/getting-started/provider-setup)                                                                                   |
| **Multimodal pipeline**  | Stream images + CSV data + PDF documents + Office files across providers with auto-detection for mixed file types. | [Multimodal Guide](/docs/features/multimodal-chat), [CSV Support](/docs/features/csv-support), [PDF Support](/docs/features/pdf-support), [Office Docs](/docs/features/office-documents) |
| **Quality & governance** | Auto-evaluation engine (Q3), guardrails middleware (Q4), HITL workflows (Q4), audit logging                        | [Auto Evaluation](/docs/features/auto-evaluation), [Guardrails](/docs/features/guardrails), [HITL](/docs/features/hitl)                                                      |
| **Memory & context**     | Conversation memory, per-user memory, Mem0 integration, Redis history export (Q4), context summarization (Q4)      | [Conversation Memory](/docs/memory/conversation), [Memory](/docs/features/memory), [Redis Export](/docs/features/conversation-history)                           |
| **CLI tooling**          | Loop sessions (Q3), setup wizard, config validation, Redis auto-detect, JSON output                                | [CLI Loop](/docs/features/cli-loop-sessions), [CLI Commands](/docs/cli/commands)                                                                     |
| **Enterprise ops**       | Proxy support, regional routing (Q3), telemetry hooks, configuration management                                    | [Enterprise Proxy](/docs/deployment/enterprise-proxy), [Observability](/docs/observability/health-monitoring)                                                      |
| **Tool ecosystem**       | MCP auto discovery, LiteLLM hub access, SageMaker custom deployment, web search                                    | [MCP Integration](/docs/mcp/integration), [MCP Catalog](/docs/guides/mcp/server-catalog)                                        |

---

## AI Provider Integration

NeuroLink supports **14+ AI providers** with unified API access:

| Provider              | Key Features                   | Free Tier       | Tool Support | Status        | Documentation                                                         |
| --------------------- | ------------------------------ | --------------- | ------------ | ------------- | --------------------------------------------------------------------- |
| **OpenAI**            | GPT-4o, GPT-4o-mini, o1 models | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#openai)            |
| **Anthropic**         | Claude 3.5/3.7 Sonnet, Opus    | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#anthropic)         |
| **Google AI**         | Gemini 2.5 Flash/Pro           | ✅ Free Tier    | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#google-ai)         |
| **AWS Bedrock**       | Claude, Titan, Llama, Nova     | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#bedrock)           |
| **Google Vertex**     | Gemini via GCP                 | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#vertex)            |
| **Azure OpenAI**      | GPT-4, GPT-4o, o1              | ❌              | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#azure)             |
| **LiteLLM**           | 100+ models unified            | Varies          | ✅ Full      | ✅ Production | [Integration Guide](/docs/getting-started/providers/litellm)                        |
| **AWS SageMaker**     | Custom deployed models         | ❌              | ✅ Full      | ✅ Production | [Integration Guide](/docs/getting-started/providers/sagemaker)                      |
| **Mistral AI**        | Mistral Large, Small           | ✅ Free Tier    | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#mistral)           |
| **Hugging Face**      | 100,000+ models                | ✅ Free         | ⚠️ Partial   | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#huggingface)       |
| **Ollama**            | Local models                   | ✅ Free (Local) | ⚠️ Partial   | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#ollama)            |
| **OpenAI Compatible** | Any compatible endpoint        | Varies          | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#openai-compatible) |
| **OpenRouter**        | 300+ models via unified API    | ✅ Free Tier    | ✅ Full      | ✅ Production | [Setup Guide](/docs/getting-started/provider-setup.md#openrouter)        |

**[ Provider Comparison Guide](/docs/reference/provider-comparison)** - Full feature matrix

---

## Advanced CLI Capabilities

### Interactive Setup Wizard

NeuroLink includes a revolutionary **interactive setup wizard** that guides users through provider configuration in 2-3 minutes:

```bash
# Launch interactive setup wizard
npx @juspay/neurolink setup

# Provider-specific guided setup
npx @juspay/neurolink setup --provider openai
npx @juspay/neurolink setup --provider bedrock
```

**Wizard Features:**

-  Secure credential collection with validation
- ✅ Real-time authentication testing
-  Automatic `.env` file creation
-  Recommended model selection
-  Quick-start command examples
-  Interactive provider discovery

### 15+ CLI Commands

Complete command-line toolkit for every workflow:

| Command          | Description              | Key Features                              |
| ---------------- | ------------------------ | ----------------------------------------- |
| **generate/gen** | Text generation          | Multimodal input, tool support, streaming |
| **stream**       | Real-time streaming      | Live token output, evaluation             |
| **loop**         | Interactive session      | Persistent variables, conversation memory |
| **setup**        | Guided configuration     | Provider wizard, validation               |
| **status**       | Health monitoring        | Provider health, latency checks           |
| **models list**  | Model discovery          | Capability filtering, availability        |
| **config**       | Configuration management | Init, validate, export, reset             |
| **memory**       | Conversation management  | Export, import, stats, clear              |
| **mcp**          | MCP server management    | List, discover, connect, status           |
| **provider**     | Provider operations      | List, test, health dashboard              |
| **ollama**       | Ollama management        | Model download, list, remove              |
| **sagemaker**    | SageMaker operations     | Status, endpoint management               |
| **vertex**       | Vertex AI operations     | Auth status, quota checks                 |
| **completion**   | Shell completion         | Bash and Zsh support                      |
| **validate**     | Config validation        | Environment verification                  |

### Shell Integration

**Bash and Zsh completions** for faster command-line workflows:

```bash
# Install Bash completion
neurolink completion bash >> ~/.bashrc

# Install Zsh completion
neurolink completion zsh >> ~/.zshrc
```

**Learn more:** [Complete CLI Reference](/docs/cli/commands)

---

## Built-in Tools & MCP Integration

### 8 Core Built-in Agent Tools

Complete autonomous agent foundation with security and validation:

| Tool                 | Function           | Capabilities                                      | Security   | Status |
| -------------------- | ------------------ | ------------------------------------------------- | ---------- | ------ |
| `getCurrentTime`     | Time access        | Date/time with timezone support                   | Safe       | ✅     |
| `readFile`           | File reading       | Secure file system access with path validation    | Sandboxed  | ✅     |
| `writeFile`          | File writing       | File creation and modification with safety checks | HITL       | ✅     |
| `listFiles`          | Directory listing  | Directory navigation and listing                  | Restricted | ✅     |
| `createDirectory`    | Directory creation | Directory creation with permission checks         | Validated  | ✅     |
| `deleteFile`         | File deletion      | File and directory deletion with confirmation     | HITL       | ✅     |
| `executeCommand`     | Command execution  | System command execution with safety limits       | HITL       | ✅     |
| `websearchGrounding` | Web search         | Google Vertex web search integration              | API-based  | ✅     |

**Tool Management System:**

- ✅ Dynamic tool registration and validation
- ✅ Secure execution with sandboxing
- ✅ Result processing and error recovery
- ✅ Tool discovery and availability tracking

**[ Custom Tools Guide](/docs/sdk/custom-tools)** - Create your own tools

---

### Model Context Protocol (MCP) - Enterprise-Grade Ecosystem

#### 5 Built-in MCP Servers

NeuroLink includes **5 production-ready MCP servers** for enterprise agent deployment:

| Server           | Purpose                | Tools Provided                          | Status         |
| ---------------- | ---------------------- | --------------------------------------- | -------------- |
| **AI Core**      | Provider orchestration | generate, select-provider, check-status | ✅ Operational |
| **AI Analysis**  | Analytics capabilities | analyze-usage, performance-metrics      | ✅ Operational |
| **AI Workflow**  | Workflow automation    | execute-workflow, batch-process         | ✅ Operational |
| **Direct Tools** | Agent integration      | file-ops, web-search, execute           | ✅ Operational |
| **Utilities**    | General utilities      | time, calculations, formatting          | ✅ Operational |

#### Advanced MCP Infrastructure

| Component                   | Capabilities                              | Status    |
| --------------------------- | ----------------------------------------- | --------- |
| **Tool Registry**           | Tool registration, execution, statistics  | ✅ Active |
| **External Server Manager** | Lifecycle management, health monitoring   | ✅ Active |
| **Tool Discovery Service**  | Automatic tool discovery and registration | ✅ Active |
| **MCP Factory**             | Lighthouse-compatible server creation     | ✅ Active |
| **Flexible Tool Validator** | Universal safety validation               | ✅ Active |
| **Context Manager**         | Rich context with 15+ fields              | ✅ Active |
| **Tool Orchestrator**       | Sequential pipelines, error handling      | ✅ Active |

#### Lighthouse MCP Compatibility

- ✅ **Factory Pattern**: `createMCPServer()` fully compatible with Lighthouse architecture
- ✅ **Transport Mechanisms**: stdio, SSE, WebSocket support (99% compatibility)
- ✅ **Tool Standards**: Full MCP specification compliance
- ✅ **Context Passing**: Rich context with sessionId, userId, permissions (15+ fields)

#### 58+ External MCP Servers

Supported for extended functionality:

**Categories:**

- **Development**: GitHub, GitLab, filesystem access
- **Databases**: PostgreSQL, MySQL, SQLite
- **Cloud Storage**: Google Drive, AWS S3
- **Communication**: Slack, email
- **And many more...**

**Quick Example:**

```typescript
// Add any MCP server dynamically
await neurolink.addExternalMCPServer("github", {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-github"],
  transport: "stdio",
  env: { GITHUB_TOKEN: process.env.GITHUB_TOKEN },
});

// Tools automatically available to AI
const result = await neurolink.generate({
  input: { text: 'Create a GitHub issue titled "Bug in auth flow"' },
});
```

**[ MCP Integration Guide](/docs/mcp/integration)** - Setup and usage
**[ MCP Server Catalog](/docs/guides/mcp/server-catalog)** - Complete server list (58+)

---

## Developer Experience Features

### SDK Features

| Feature                     | Description                    | Documentation                                       |
| --------------------------- | ------------------------------ | --------------------------------------------------- |
| **Auto Provider Selection** | Intelligent provider fallback  | [SDK Guide](/docs/sdk/index.md#auto-selection)         |
| **Streaming Responses**     | Real-time token streaming      | [Streaming Guide](/docs/advanced/streaming)         |
| **Conversation Memory**     | Automatic context management   | [Memory Guide](/docs/sdk/index.md#memory)              |
| **Full Type Safety**        | Complete TypeScript types      | [Type Reference](/docs/sdk/api-reference)           |
| **Error Handling**          | Graceful provider fallback     | [Error Guide](/docs/reference/troubleshooting)      |
| **Analytics & Evaluation**  | Usage tracking, quality scores | [Analytics Guide](/docs/reference/analytics)         |
| **Middleware System**       | Request/response hooks         | [Middleware Guide](/docs/workflows/custom-middleware)   |
| **Framework Integration**   | Next.js, SvelteKit, Express    | [Framework Guides](/docs/sdk/framework-integration) |

---

### CLI Features

| Feature                 | Description                       | Documentation                                   |
| ----------------------- | --------------------------------- | ----------------------------------------------- |
| **Interactive Setup**   | Guided provider configuration     | [Setup Guide](/docs/)                  |
| **Text Generation**     | CLI-based generation              | [Generate Command](/docs/cli/commands.md#generate) |
| **Streaming**           | Real-time streaming output        | [Stream Command](/docs/cli/commands.md#stream)     |
| **Loop Sessions**       | Persistent interactive mode       | [Loop Sessions](/docs/features/cli-loop-sessions)           |
| **Provider Management** | Health checks and status          | [CLI Guide](/docs/cli/commands)                 |
| **Model Evaluation**    | Automated testing                 | [Eval Command](/docs/cli/commands.md#eval)         |
| **MCP Management**      | Server discovery and installation | [MCP CLI](/docs/cli/commands)                   |

**15+ Commands** for every workflow - see [Complete CLI Reference](/docs/cli/commands)

---

## Smart Model Selection & Cost Optimization

### Cost Optimization Features

- ** Automatic Cost Optimization**: Selects cheapest models for simple tasks
- ** LiteLLM Model Routing**: Access 100+ models with automatic load balancing
- ** Capability-Based Selection**: Find models with specific features (vision, function calling)
- **⚡ Intelligent Fallback**: Seamless switching when providers fail

**CLI Examples:**

```bash
# Cost optimization - automatically use cheapest model
npx @juspay/neurolink generate "Hello" --optimize-cost

# LiteLLM specific model selection
npx @juspay/neurolink generate "Complex analysis" --provider litellm --model "anthropic/claude-3-5-sonnet"

# Auto-select best available provider
npx @juspay/neurolink generate "Write code" # Automatically chooses optimal provider
```

**Learn more:** [Provider Orchestration Guide](/docs/features/provider-orchestration)

---

## Interactive Loop Mode

NeuroLink features a powerful **interactive loop mode** that transforms the CLI into a persistent, stateful session.

### Key Capabilities

- Run any CLI command without restarting session
- Persistent session variables: `set provider openai`, `set temperature 0.9`
- Conversation memory: AI remembers previous turns within session
- Redis auto-detection: Automatically connects if `REDIS_URL` is set
- Export session history as JSON for analytics

### Quick Start

```bash
# Start loop with Redis-backed conversation memory
npx @juspay/neurolink loop --enable-conversation-memory --auto-redis

# Start loop without Redis auto-detection
npx @juspay/neurolink loop --enable-conversation-memory --no-auto-redis
```

### Example Session

```bash
# Start the interactive session
$ npx @juspay/neurolink loop

neurolink » set provider google-ai
✓ provider set to google-ai

neurolink » set temperature 0.8
✓ temperature set to 0.8

neurolink » generate "Tell me a fun fact about space"
The quietest place on Earth is an anechoic chamber at Microsoft's headquarters...

# Exit the session
neurolink » exit
```

**[ Complete Loop Guide](/docs/features/cli-loop-sessions)** - Full documentation with all commands

---

## Enterprise & Production Features

### Production Capabilities

| Feature                      | Description                         | Use Case                     | Documentation                                                     |
| ---------------------------- | ----------------------------------- | ---------------------------- | ----------------------------------------------------------------- |
| **Enterprise Proxy**         | Corporate proxy support             | Behind firewalls             | [Proxy Setup](/docs/deployment/enterprise-proxy)                       |
| **Redis Memory**             | Distributed conversation state      | Multi-instance deployment    | [Redis Guide](/docs/getting-started/provider-setup.md#redis)         |
| **Cost Optimization**        | Automatic cheapest model selection  | Budget control               | [Cost Guide](/docs/cookbook/cost-optimization)           |
| **Multi-Provider Failover**  | Automatic provider switching        | High availability            | [Failover Guide](/docs/guides/enterprise/multi-provider-failover) |
| **Telemetry & Monitoring**   | OpenTelemetry integration           | Observability                | [Observability Guide](/docs/observability/health-monitoring)                           |
| **Security Hardening**       | Credential management, auditing     | Compliance                   | [Security Guide](/docs/guides/enterprise/compliance)              |
| **Custom Model Hosting**     | SageMaker integration               | Private models               | [SageMaker Guide](/docs/getting-started/providers/sagemaker)                    |
| **Load Balancing**           | LiteLLM proxy integration           | Scale & routing              | [Load Balancing Guide](/docs/guides/enterprise/load-balancing)    |
| **Audit Trails**             | Comprehensive logging               | Compliance                   | [Audit Guide](/docs/guides/enterprise/audit-trails)               |
| **Configuration Management** | Environment & credential management | Multi-environment deployment | [Config Guide](/docs/deployment/configuration-management)                    |

### Advanced Security Features

#### Human-in-the-Loop (HITL) Policy Engine

Enterprise-grade approval system for sensitive operations:

```typescript
// HITL Policy Configuration
type HITLPolicy = {
  requireApprovalFor: string[]; // Tool-specific policies
  autoApprove: string[]; // Safe operation whitelist
  alwaysDeny: string[]; // Blacklist operations
  timeoutBehavior: "deny" | "approve"; // Timeout handling
};
```

**HITL Capabilities:**

- ✅ User consent for dangerous operations
- ✅ Configurable policy engine
- ✅ Comprehensive audit trail logging
- ✅ Timeout handling
- ✅ Bulk approval for batch operations

#### Advanced Proxy Support

Corporate network compatibility:

| Proxy Type           | Support | Features                             |
| -------------------- | ------- | ------------------------------------ |
| **AWS Proxy**        | ✅ Full | AWS-specific proxy configuration     |
| **HTTP/HTTPS Proxy** | ✅ Full | Universal proxy across all providers |
| **No-Proxy Bypass**  | ✅ Full | Bypass configuration and utilities   |

#### Enhanced Guardrails

AI-powered content security:

- ✅ **Content Filtering**: Automatic content screening
- ✅ **Toxicity Detection**: Toxic content filtering
- ✅ **PII Redaction**: Privacy protection and PII detection
- ✅ **Custom Rules**: Configurable policy rules
- ✅ **Security Reporting**: Detailed security event reporting

### Security & Compliance Certifications

- ✅ SOC2 Type II compliant deployments
- ✅ ISO 27001 certified infrastructure compatible
- ✅ GDPR-compliant data handling (EU providers available)
- ✅ HIPAA compatible (with proper configuration)
- ✅ Hardened OS verified (SELinux, AppArmor)
- ✅ Zero credential logging
- ✅ Encrypted configuration storage

**[ Enterprise Deployment Guide](/docs/guides/enterprise/multi-provider-failover)** - Complete production patterns

---

## Middleware & Extension System

### Advanced Middleware Architecture

Pluggable request/response processing for custom workflows:

#### Built-in Middleware

| Middleware          | Purpose                     | Features                                            | Status    |
| ------------------- | --------------------------- | --------------------------------------------------- | --------- |
| **Analytics**       | Usage tracking & monitoring | Token counting, timing, performance metrics         | ✅ Active |
| **Guardrails**      | Content security            | Content policies, toxicity detection, PII filtering | ✅ Active |
| **Auto Evaluation** | Quality scoring             | LLM-as-judge, accuracy metrics, safety validation   | ✅ Active |

#### Middleware System Capabilities

```typescript
// Middleware Configuration
type MiddlewareFactoryOptions = {
  middleware?: NeuroLinkMiddleware[]; // Custom middleware registration
  enabledMiddleware?: string[]; // Selective activation
  disabledMiddleware?: string[]; // Selective deactivation
  middlewareConfig?: Record; // Per-middleware configuration
  preset?: string; // Preset configurations
  global?: {
    // Global settings
    maxExecutionTime?: number;
    continueOnError?: boolean;
  };
};
```

**Middleware Features:**

- ✅ Dynamic middleware registration
- ✅ Pipeline execution with performance tracking
- ✅ Runtime configuration changes
- ✅ Error handling and graceful recovery
- ✅ Priority-based execution order
- ✅ Detailed execution statistics

**[ Custom Middleware Guide](/docs/workflows/custom-middleware)** - Build your own middleware

---

## Performance & Optimization

### Intelligent Cost Optimization

- ** Model Resolver**: Cost optimization algorithms and intelligent routing
- **⚡ Performance Routing**: Speed-optimized provider selection
- ** Concurrent Initialization**: Reduced latency through parallel loading
- ** Caching Strategies**: Intelligent response and configuration caching

### Advanced SageMaker Features

Beyond basic integration - enterprise-grade custom model deployment:

| Feature                      | Description                                          | Status         |
| ---------------------------- | ---------------------------------------------------- | -------------- |
| **Adaptive Semaphore**       | Dynamic concurrency control for optimal throughput   | ✅ Implemented |
| **Structured Output Parser** | Complex response parsing and validation              | ✅ Implemented |
| **Capability Detection**     | Automatic endpoint capability discovery              | ✅ Implemented |
| **Batch Inference**          | Efficient batch processing for high-volume workloads | ✅ Implemented |
| **Diagnostics System**       | Real-time endpoint monitoring and debugging          | ✅ Implemented |

### Error Handling & Resilience

Production-grade fault tolerance:

- ✅ **MCP Circuit Breaker**: Fault tolerance with state management
- ✅ **Error Hierarchies**: Comprehensive error types for HITL, providers, and MCP
- ✅ **Graceful Degradation**: Intelligent fallback strategies
- ✅ **Retry Logic**: Configurable retry with exponential backoff

**[ Performance Optimization Guide](/docs/deployment/performance)** - Complete optimization strategies

---

## Advanced Integrations

| Integration                                                                    | Description                                                                             |
| ------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------- |
|  **[LiteLLM Integration](/docs/getting-started/providers/litellm)** | Access 100+ models from all major providers via LiteLLM routing with unified interface. |
| ☁️ **[SageMaker Integration](/docs/getting-started/providers/sagemaker)**        | Deploy and call custom endpoints directly from NeuroLink CLI/SDK with full control.     |
|  **[Mem0 Integration](/docs/memory/mem0)**        | Persistent semantic memory with vector store support for long-term conversations.       |
|  **[Memory](/docs/features/memory)**                                       | Per-user condensed memory with S3/Redis/SQLite storage and LLM-powered condensation.    |
|  **[Enterprise Proxy](/docs/deployment/enterprise-proxy)**    | Configure outbound policies and compliance posture for corporate environments.          |
| ⚙️ **[Configuration Management](/docs/deployment/configuration-management)**  | Manage environments, regions, and credentials safely across deployments.                |

---

## Advanced Features

| Feature                                                                                   | Description                                                                        |
| ----------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- |
|  **[Factory Pattern Architecture](/docs/development/factory-architecture)** | Unified provider interface with automatic fallbacks and type-safe implementations. |
| ️ **[Conversation Memory](/docs/memory/conversation)**              | Deep dive into memory management, Redis integration, and Mem0 support.             |
|  **[Custom Middleware](/docs/workflows/custom-middleware)**              | Build request/response hooks for logging, filtering, and custom processing.        |
| ⚡ **[Performance Optimization](/docs/deployment/performance)**     | Caching, connection pooling, and latency optimization strategies.                  |
|  **[Telemetry & Observability](/docs/observability/health-monitoring)**               | OpenTelemetry integration for distributed tracing and monitoring.                  |
|  **[Testing Guide](/docs/development/testing)**                                   | Provider-agnostic testing, mocking, and quality assurance strategies.              |
|  **[Analytics & Evaluation](/docs/reference/analytics)**               | Usage tracking, cost monitoring, and quality scoring for AI responses.             |
| ⚡ **[Streaming](/docs/advanced/streaming)**                                | Real-time token streaming with provider-specific optimizations.                    |
|  **[Thinking Configuration](/docs/features/thinking-configuration)**               | Configure extended thinking levels for supported models (Anthropic, Gemini 2.5+).  |
|  **[Structured Output](/docs/cookbook/structured-output)**                        | JSON schema-based structured output with provider-specific formatting.             |
|  **[Text-to-Speech (TTS)](/docs/features/tts)**                              | Basic TTS support via Google Cloud TTS (Neural2, Wavenet, Standard voices).        |

---

## See Also

- **[Getting Started](/docs/)** - Quick start and installation
- **[CLI Reference](/docs/cli/commands)** - Command-line interface documentation
- **[SDK Reference](/docs/sdk/api-reference)** - TypeScript API documentation
- **[Enterprise Guides](/docs/guides/enterprise/multi-provider-failover)** - Production deployment patterns
- **[Tutorials](/docs/)** - Step-by-step implementation guides
- **[Examples](/docs/)** - Real-world code samples

---

## Audio Input & Transcription Guide

<!-- Source: features/audio-input.md -->

# Audio Input & Voice Conversations Guide

NeuroLink provides comprehensive audio input capabilities, enabling real-time voice conversations with AI models. This guide covers currently available features, audio specifications, and upcoming enhancements.

## Overview

### Currently Available

NeuroLink supports the following audio capabilities today:

- **Real-time voice conversations** via Gemini Live (Google AI Studio)
- **Text-to-Speech (TTS) output** via Google Cloud TTS integration
- **WebSocket-based voice streaming** for web applications
- **Bidirectional audio** - speak and hear AI responses in real-time

### Coming Soon

The following features are planned for future releases:

- CLI commands: `neurolink audio transcribe`, `neurolink audio analyze`, `neurolink audio summarize`
- CLI commands: `neurolink voice chat`, `neurolink voice demo`
- OpenAI Whisper transcription integration
- Cross-provider audio support (Anthropic, Azure, AWS)
- File-based audio input processing

----------------- | --------------- | ----------- | ------------------- | ---------------- |
| **Google AI Studio** | Yes             | Yes         | Coming Soon         | Production Ready |
| **Google Vertex AI** | Planned         | Yes         | Coming Soon         | TTS Available    |
| **OpenAI**           | Coming Soon     | Coming Soon | Coming Soon         | Planned          |
| **Anthropic**        | Coming Soon     | Coming Soon | Coming Soon         | Planned          |
| **Azure OpenAI**     | Coming Soon     | Coming Soon | Coming Soon         | Planned          |
| **AWS Bedrock**      | Coming Soon     | Coming Soon | Coming Soon         | Planned          |

**Supported Model for Real-time Voice:**

| Model                                          | Provider  | Capabilities                     |
| ---------------------------------------------- | --------- | -------------------------------- |
| `gemini-2.5-flash-preview-native-audio-dialog` | Google AI | Bidirectional audio, low latency |

---

## Quick Start: Real-Time Voice (SDK)

Real-time voice conversations are available through the SDK using Gemini Live's native audio dialog model.

### Prerequisites

```bash
# Set your Google AI API key
export GOOGLE_AI_API_KEY="your-api-key"
# OR
export GEMINI_API_KEY="your-api-key"
```

### Basic Real-time Voice Streaming

```typescript

const neurolink = new NeuroLink();

// Create an async iterator for audio frames
// This example uses a hypothetical audio source
async function* getAudioFrames(): AsyncIterable {
  // Your audio capture logic here
  // Each frame should be PCM16LE mono at 16kHz
  // Recommended frame size: 20-60ms of audio
  while (capturing) {
    const frame = await captureAudioFrame();
    yield frame;
  }
}

// Stream with real-time audio input
const result = await neurolink.stream({
  provider: "google-ai",
  model: "gemini-2.5-flash-preview-native-audio-dialog",
  input: {
    audio: {
      frames: getAudioFrames(),
      sampleRateHz: 16000, // Input sample rate (default: 16000)
      encoding: "PCM16LE", // Encoding format (default: PCM16LE)
    },
  },
  disableTools: true, // Required for Phase 1 audio streaming
});

// Process audio responses
for await (const event of result.stream) {
  if (event.type === "audio") {
    // Handle audio output chunk
    // Output is PCM16LE mono at 24kHz
    const audioData = event.audio.data;
    playAudio(audioData);
  }
}
```

### Complete Voice Session Example

```typescript

const neurolink = new NeuroLink();

async function startVoiceSession() {
  // Audio frame queue management
  const frameQueue: Buffer[] = [];
  let isSessionActive = true;

  // Create async iterator from queue
  const audioFramesIterator: AsyncIterable = {
    [Symbol.asyncIterator]() {
      return {
        async next() {
          if (!isSessionActive) {
            return { value: undefined, done: true };
          }
          // Wait for frames to be available
          while (frameQueue.length === 0 && isSessionActive) {
            await new Promise((resolve) => setTimeout(resolve, 10));
          }
          if (frameQueue.length > 0) {
            return { value: frameQueue.shift()!, done: false };
          }
          return { value: undefined, done: true };
        },
      };
    },
  };

  // Start the streaming session
  const streamResult = await neurolink.stream({
    provider: "google-ai",
    model: "gemini-2.5-flash-preview-native-audio-dialog",
    input: {
      audio: {
        frames: audioFramesIterator,
        sampleRateHz: 16000,
        encoding: "PCM16LE",
      },
    },
    disableTools: true,
  });

  // Function to add captured audio to queue
  function onAudioCaptured(pcmBuffer: Buffer) {
    frameQueue.push(pcmBuffer);
  }

  // Function to signal end of input (flush)
  function flushAudio() {
    // Push a zero-length buffer as flush signal
    frameQueue.push(Buffer.alloc(0));
  }

  // Process responses
  for await (const event of streamResult.stream) {
    if (event.type === "audio") {
      // Output audio data: PCM16LE, 24kHz, mono
      handleAudioOutput(event.audio.data);
    }
  }

  isSessionActive = false;
}

function handleAudioOutput(audioBuffer: Buffer) {
  // Play or process the audio response
  // Sample rate: 24000 Hz
  // Format: PCM16LE mono
  playAudioBuffer(audioBuffer);
}
```

---

## Quick Start: TTS Integration

NeuroLink provides Text-to-Speech output via Google Cloud TTS. TTS can be combined with any text generation.

### CLI Usage

```bash
# Generate text and convert to speech
neurolink generate "Hello, world!" \
  --provider google-ai \
  --tts-voice en-US-Neural2-C

# Save audio to file
neurolink generate "Welcome to NeuroLink" \
  --provider google-ai \
  --tts-voice en-US-Neural2-C \
  --tts-output welcome.mp3

# Customize voice parameters
neurolink generate "This is a test" \
  --provider google-ai \
  --tts-voice en-US-Wavenet-D \
  --tts-speed 1.2 \
  --tts-pitch 2.0 \
  --tts-format mp3 \
  --tts-output test.mp3

# Synthesize AI response (not input text)
neurolink generate "Tell me a joke" \
  --provider google-ai \
  --tts-voice en-US-Neural2-C \
  --tts-use-ai-response \
  --tts-output joke.mp3
```

### SDK Usage

```typescript

const neurolink = new NeuroLink();

// Basic TTS
const result = await neurolink.generate({
  input: { text: "Hello, world!" },
  provider: "google-ai",
  tts: {
    enabled: true,
    voice: "en-US-Neural2-C",
    format: "mp3",
    play: true, // Auto-play in CLI
  },
});

// Save TTS audio
if (result.tts?.buffer) {
  writeFileSync("output.mp3", result.tts.buffer);
  console.log(`Audio saved: ${result.tts.size} bytes`);
}

// Advanced TTS with AI response synthesis
const aiResponse = await neurolink.generate({
  input: { text: "Explain quantum computing briefly" },
  provider: "google-ai",
  tts: {
    enabled: true,
    useAiResponse: true, // Synthesize AI's response
    voice: "en-US-Wavenet-D",
    format: "mp3",
    speed: 0.9,
    pitch: -2.0,
  },
});

console.log("Text:", aiResponse.content);
console.log("Audio size:", aiResponse.tts?.size, "bytes");
```

For comprehensive TTS documentation, see the [TTS Integration Guide](/docs/features/tts).

---

## Voice Demo Example

NeuroLink includes a complete voice demo application demonstrating real-time bidirectional audio conversations.

### Location

```
examples/voice-demo/
  server.mjs      # WebSocket server with NeuroLink integration
  public/
    index.html    # Web interface
    client.js     # Browser audio capture and playback
```

### Running the Demo

```bash
# Navigate to the project root
cd /path/to/neurolink

# Build the SDK first
pnpm run build

# Set your API key
export GOOGLE_AI_API_KEY="your-api-key"

# Run the demo server
node examples/voice-demo/server.mjs
```

The demo will:

1. Start a WebSocket server on port 5175 (or next available port)
2. Open your browser automatically to the demo interface
3. Allow you to speak and receive real-time AI audio responses

### Demo Architecture

```
Browser (client.js)
    |
    | WebSocket (ws://localhost:5175/ws)
    |
    v
Server (server.mjs)
    |
    | neurolink.stream()
    |
    v
Gemini Live API
    |
    | PCM16LE audio chunks
    |
    v
Server -> Browser -> Audio playback
```

### Key Code from Voice Demo Server

```typescript
// From examples/voice-demo/server.mjs
const streamResult = await neurolink.stream({
  provider: "google-ai",
  model:
    process.env.GEMINI_MODEL || "gemini-2.5-flash-preview-native-audio-dialog",
  input: {
    audio: {
      frames: framesFromClient,
      // sampleRateHz defaults to 16000
      // encoding defaults to 'PCM16LE'
    },
  },
  disableTools: true, // Required for audio streaming
});

// Stream audio responses back to client
for await (const ev of streamResult.stream) {
  if (ev.type === "audio") {
    // Send raw PCM16LE bytes back to the client
    ws.send(ev.audio.data, { binary: true });
  }
}
```

---

## Audio Specifications

### Input Audio Format

| Parameter       | Value               | Notes                                |
| --------------- | ------------------- | ------------------------------------ |
| **Encoding**    | PCM16LE             | 16-bit signed integer, little-endian |
| **Sample Rate** | 16,000 Hz           | 16 kHz mono                          |
| **Channels**    | 1 (mono)            | Stereo not supported in Phase 1      |
| **Frame Size**  | 20-60ms recommended | ~320-960 samples per frame           |
| **Byte Order**  | Little-endian       | Intel/ARM standard                   |

### Output Audio Format

| Parameter       | Value         | Notes                                |
| --------------- | ------------- | ------------------------------------ |
| **Encoding**    | PCM16LE       | 16-bit signed integer, little-endian |
| **Sample Rate** | 24,000 Hz     | 24 kHz mono                          |
| **Channels**    | 1 (mono)      | Single channel output                |
| **Byte Order**  | Little-endian | Intel/ARM standard                   |

### Converting Audio Formats

**From Float32 to PCM16LE (for input):**

```javascript
function floatTo16BitPCM(float32Array) {
  const length = float32Array.length;
  const buffer = new ArrayBuffer(length * 2);
  const view = new DataView(buffer);

  for (let i = 0; i ;

  /**
   * Input sample rate in Hz
   * @default 16000
   */
  sampleRateHz?: number;

  /**
   * Audio encoding format
   * @default "PCM16LE"
   */
  encoding?: "PCM16LE";

  /**
   * Number of audio channels
   * Phase 1 only supports mono
   * @default 1
   */
  channels?: 1;
};
```

### AudioChunk

Audio output chunk received from streaming responses.

```typescript
type AudioChunk = {
  /**
   * Raw audio data buffer (PCM16LE format)
   */
  data: Buffer;

  /**
   * Sample rate of the audio data
   * Gemini typically outputs at 24000 Hz
   */
  sampleRateHz: number;

  /**
   * Number of audio channels (typically 1 for mono)
   */
  channels: number;

  /**
   * Audio encoding format
   */
  encoding: "PCM16LE";
};
```

### StreamOptions with Audio

```typescript
type StreamOptions = {
  input: {
    text: string;
    audio?: AudioInputSpec; // Optional audio input
    // ... other input options
  };

  provider: string;
  model?: string;
  disableTools?: boolean; // Required true for audio streaming
  // ... other options
};
```

### Stream Result Events

```typescript
// Stream yields different event types
type StreamEvent =
  | { content: string } // Text chunk
  | { type: "audio"; audio: AudioChunk } // Audio chunk
  | { type: "image"; imageOutput: { base64: string } }; // Image output

// Usage
for await (const event of result.stream) {
  if ("content" in event) {
    // Text content
    console.log(event.content);
  } else if (event.type === "audio") {
    // Audio data
    playAudio(event.audio.data);
  }
}
```

### AudioContent (File-based - Future)

For file-based audio input (planned feature).

```typescript
type AudioContent = {
  type: "audio";
  data: Buffer | string; // Buffer, base64, URL, or file path
  mediaType?:
    | "audio/mpeg" // MP3
    | "audio/wav" // WAV
    | "audio/ogg" // OGG
    | "audio/webm" // WebM
    | "audio/aac" // AAC
    | "audio/flac" // FLAC
    | "audio/mp4"; // M4A
  metadata?: {
    filename?: string;
    duration?: number; // in seconds
    sampleRate?: number;
    channels?: number;
    transcription?: string; // Pre-existing transcription
  };
};
```

---

## Roadmap

### Phase 1 (Current)

- Real-time voice with Gemini Live
- Bidirectional audio streaming via SDK
- Voice demo example application
- TTS output integration

### Phase 2 (Coming Soon)

- **CLI Voice Commands**

  ```bash
  # Start interactive voice chat
  neurolink voice chat --provider google-ai

  # Launch voice demo server
  neurolink voice demo --port 5175
  ```

- **Audio Transcription**

  ```bash
  # Transcribe audio file
  neurolink audio transcribe recording.mp3 --provider openai

  # Analyze audio content
  neurolink audio analyze podcast.mp3 --prompt "Summarize key points"
  ```

### Phase 3 (Planned)

- **OpenAI Whisper Integration**

  ```typescript
  const transcription = await neurolink.transcribe({
    audioFile: "./recording.mp3",
    provider: "openai",
    model: "whisper-1",
    language: "en",
  });
  ```

- **Cross-provider Audio Support**
  - Anthropic voice capabilities
  - Azure Speech Services
  - AWS Transcribe

- **File-based Audio Input**
  ```typescript
  const result = await neurolink.generate({
    input: {
      text: "Analyze this audio file",
      audioFiles: ["./meeting.mp3"],
    },
    provider: "openai",
  });
  ```

---

## Environment Setup

### Required Environment Variables

```bash
# For Google AI Studio (Gemini Live)
export GOOGLE_AI_API_KEY="your-api-key"
# OR
export GEMINI_API_KEY="your-api-key"

# For TTS (Google Cloud)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
# OR use the same GOOGLE_AI_API_KEY with Cloud TTS API enabled
```

### API Key Configuration

For Gemini Live and TTS to work with an API key:

1. Go to Google Cloud Console > APIs & Services > Credentials
2. Create or select your API key
3. Under "API restrictions", enable:
   - **Generative Language API** (for Gemini)
   - **Cloud Text-to-Speech API** (for TTS output)

---

## Troubleshooting

### Common Issues

| Issue                       | Cause                    | Solution                                           |
| --------------------------- | ------------------------ | -------------------------------------------------- |
| **No audio output**         | Missing API key          | Set `GOOGLE_AI_API_KEY` or `GEMINI_API_KEY`        |
| **"disableTools required"** | Tools enabled with audio | Add `disableTools: true` to stream options         |
| **Choppy audio playback**   | Buffer underrun          | Increase buffer size or frame rate                 |
| **Wrong sample rate**       | Mismatched audio context | Use 16kHz input, 24kHz output contexts             |
| **WebSocket disconnects**   | Network timeout          | Implement reconnection logic                       |
| **"Model not found"**       | Invalid model name       | Use `gemini-2.5-flash-preview-native-audio-dialog` |

### Audio Quality Issues

**Clipping/Distortion:**

- Ensure input samples are normalized to [-1, 1] range
- Check gain levels before PCM conversion

**Echo/Feedback:**

- Mute microphone during AI audio playback
- Implement voice activity detection (VAD)

**Latency:**

- Use smaller frame sizes (20ms)
- Process audio in real-time, avoid buffering
- Use WebSocket for low-latency transport

### Debug Mode

Enable debug logging to troubleshoot audio issues:

```bash
export NEUROLINK_DEBUG=true
```

```typescript
const neurolink = new NeuroLink({
  debug: true,
});
```

---

## Related Features

**Audio & Voice:**

- [TTS Integration Guide](/docs/features/tts) - Complete Text-to-Speech documentation
- [Video Generation](/docs/features/video-generation) - AI-powered video with audio

**Multimodal Capabilities:**

- [Multimodal Guide](/docs/features/multimodal) - Images, PDFs, CSV inputs
- [PDF Support](/docs/features/pdf-support) - Document processing

**Advanced Features:**

- [Streaming](/docs/advanced/streaming) - Stream AI responses in real-time
- [Provider Orchestration](/docs/features/provider-orchestration) - Multi-provider failover

**Documentation:**

- [CLI Commands](/docs/cli/commands) - Complete CLI reference
- [SDK API Reference](/docs/sdk/api-reference) - Full API documentation
- [Troubleshooting](/docs/reference/troubleshooting) - Extended error catalog

---

## Summary

NeuroLink's audio input capabilities provide:

**Currently Available:**

- Real-time voice conversations via Gemini Live
- Bidirectional audio streaming (speak and hear)
- TTS output via Google Cloud
- Voice demo example application
- PCM16LE audio format support

**Coming Soon:**

- CLI voice commands (`voice chat`, `audio transcribe`)
- OpenAI Whisper transcription
- Cross-provider audio support
- File-based audio processing

**Next Steps:**

1. Set up [environment variables](#environment-setup)
2. Try the [voice demo](#voice-demo-example) application
3. Integrate [real-time voice](#quick-start-real-time-voice-sdk) in your SDK code
4. Explore [TTS output](/docs/features/tts) for text-to-speech
5. Check [troubleshooting](#troubleshooting) if you encounter issues

---

## Auto Evaluation Engine

<!-- Source: features/auto-evaluation.md -->

# Auto Evaluation Engine

NeuroLink 7.46.0 adds an automated quality gate that scores every response using an LLM-as-judge pipeline. Scores, rationales, and severity flags are surfaced in both CLI and SDK workflows so you can monitor drift and enforce minimum quality thresholds.

## What It Does

- Generates a structured evaluation payload (`result.evaluation`) for every call with `enableEvaluation: true`.
- Calculates relevance, accuracy, completeness, and an overall score (1–10) using a RAGAS-style rubric.
- Supports retry loops: re-ask the provider when the score falls below your threshold.
- Emits analytics-friendly JSON so you can pipe results into dashboards.

:::warning[LLM Costs]
Evaluation uses additional AI calls to the judge model (default: `gemini-2.5-flash`). Each evaluated response incurs extra API costs. For high-volume production workloads, consider sampling (e.g., evaluate 10% of requests) or disabling evaluation after quality stabilizes.
:::

## Usage Examples

```typescript

const neurolink = new NeuroLink({ enableOrchestration: true });  // (1)!

const result = await neurolink.generate({
  input: { text: "Create quarterly performance summary" },  // (2)!
  enableEvaluation: true,  // (3)!
  evaluationDomain: "Enterprise Finance",  // (4)!
  factoryConfig: {
    enhancementType: "domain-configuration",  // (5)!
    domainType: "finance",
  },
});

if (result.evaluation && !result.evaluation.isPassing) {  // (6)!
  console.warn("Quality gate failed", result.evaluation.details?.message);
}
```

```bash
# Baseline quality check
npx @juspay/neurolink generate "Draft onboarding email" --enableEvaluation

# Combine with analytics for observability dashboards
npx @juspay/neurolink generate "Summarise release notes" \
  --enableEvaluation --enableAnalytics --format json

# Domain-aware evaluations shape the rubric
npx @juspay/neurolink generate "Refactor this API" \
  --enableEvaluation --evaluationDomain "Principal Engineer"

# Fail the command if the score dips below 7 (set env variable first)
NEUROLINK_EVALUATION_THRESHOLD=7 npx @juspay/neurolink generate "Write compliance summary" \
  --enableEvaluation
```

```
 Evaluation Summary
• Overall: 8.6/10 (Passing threshold: 7)
• Relevance: 9.0  • Accuracy: 8.5  • Completeness: 8.0
• Reasoning: Response covers all requested sections with correct policy references.
```

## Streaming with Evaluation

```typescript
const stream = await neurolink.stream({
  input: { text: "Walk through the incident postmortem" },
  enableEvaluation: true, // (1)!
});

let final;
for await (const chunk of stream) {
  if (chunk.evaluation) {
    // (2)!
    final = chunk.evaluation; // (3)!
  }
}
console.log(final?.overallScore); // (4)!
```

1. Evaluation works in streaming mode
2. Evaluation payload arrives in final chunks
3. Capture the evaluation object
4. Access overall score (1-10) and sub-scores

## Configuration Options

| Option                                | Where                            | Description                                                        |
| ------------------------------------- | -------------------------------- | ------------------------------------------------------------------ |
| `enableEvaluation`                    | CLI flag / request option        | Turns the middleware on for this call.                             |
| `evaluationDomain`                    | CLI flag / request option        | Provides context to the judge model (e.g., `"Healthcare"`).        |
| `NEUROLINK_EVALUATION_THRESHOLD`      | Env variable / loop session var  | Minimum passing score; failures trigger retries or errors.         |
| `NEUROLINK_EVALUATION_MODEL`          | Env variable / middleware config | Override the default judge model (defaults to `gemini-2.5-flash`). |
| `NEUROLINK_EVALUATION_PROVIDER`       | Env variable                     | Force the judge provider (`google-ai` by default).                 |
| `NEUROLINK_EVALUATION_RETRY_ATTEMPTS` | Env variable                     | Number of re-evaluation attempts before surfacing failure.         |
| `NEUROLINK_EVALUATION_TIMEOUT`        | Env variable                     | Millisecond timeout for judge requests.                            |
| `offTopicThreshold`                   | Middleware config                | Score below which a response is flagged as off-topic.              |
| `highSeverityThreshold`               | Middleware config                | Score threshold for triggering high-severity alerts.               |

Set global defaults by exporting environment variables in your `.env`:

```bash
NEUROLINK_EVALUATION_PROVIDER="google-ai"
NEUROLINK_EVALUATION_MODEL="gemini-2.5-flash"
NEUROLINK_EVALUATION_THRESHOLD=7
NEUROLINK_EVALUATION_RETRY_ATTEMPTS=2
NEUROLINK_EVALUATION_TIMEOUT=15000
```

> Loop sessions respect these values. Inside `neurolink loop`, use `set NEUROLINK_EVALUATION_THRESHOLD 8` or `unset NEUROLINK_EVALUATION_THRESHOLD` to adjust the gate on the fly.

## Best Practices

:::tip[Cost Optimization]
Only enable evaluation when needed: during prompt engineering, quality regression testing, or high-stakes production calls. For routine operations, disable evaluation and rely on [Analytics](/docs/reference/analytics) for zero-cost observability.
:::

- Pair evaluation with analytics to track cost vs. quality trends.
- Lower the threshold during experimentation, then tighten once prompts stabilise.
- Register a custom `onEvaluationComplete` handler to forward scores to BI systems.
- Exclude massive prompts from evaluation when latency matters; analytics is zero-cost without evaluation.

## Troubleshooting

| Issue                                | Fix                                                                                                              |
| ------------------------------------ | ---------------------------------------------------------------------------------------------------------------- |
| `Evaluation model not configured`    | Ensure judge provider API keys are present or set `NEUROLINK_EVALUATION_PROVIDER`.                               |
| CLI exits with failure               | Lower `NEUROLINK_EVALUATION_THRESHOLD` or configure the middleware with `blocking: false`.                       |
| Evaluation takes too long            | Reduce `NEUROLINK_EVALUATION_RETRY_ATTEMPTS` or switch to a smaller judge model (e.g., `gemini-2.5-flash-lite`). |
| Off-topic false positives            | Increase `offTopicThreshold` to a lower score (e.g., 3).                                                         |
| JSON output missing evaluation block | Confirm `--format json` and `--enableEvaluation` are both set.                                                   |

## Related Features

**Q4 2025 Features:**

- [Guardrails Middleware](/docs/features/guardrails) – Combine evaluation with content filtering for comprehensive quality control

**Q3 2025 Features:**

- [Multimodal Chat](/docs/features/multimodal-chat) – Evaluate vision-based responses
- [CLI Loop Sessions](/docs/features/cli-loop-sessions) – Set evaluation threshold in loop mode

**Documentation:**

- [Analytics Guide](/docs/reference/analytics) – Track evaluation metrics over time
- [SDK API Reference](/docs/sdk/api-reference) – Evaluation options
- [Troubleshooting](/docs/reference/troubleshooting) – Common evaluation issues

---

## CLI Loop Sessions

<!-- Source: features/cli-loop-sessions.md -->

# CLI Loop Sessions

`neurolink loop` delivers a persistent CLI workspace so you can explore prompts, tweak parameters, and inspect state without restarting the CLI. Session variables, Redis-backed history, and built-in help turn the CLI into a playground for prompt engineering and operator runbooks.

## Why Loop Mode

- **Stateful sessions** – keep provider/model/temperature context between commands.
- **Memory on demand** – enable in-memory or Redis-backed conversation history per session.
- **Fast iteration** – reuse the entire command surface (`generate`, `stream`, `memory`, etc.) without leaving the loop.
- **Guided UX** – ASCII banner, inline help, and validation for every session variable.

:::tip[Keyboard Shortcuts]
Loop mode supports **tab completion** for commands and session variables, **arrow key history** for navigating previous commands, and **Ctrl+C** to cancel the current operation without exiting the loop.
:::

## Starting a Session

```bash
# Default: in-memory session variables, Redis auto-detected when available
npx @juspay/neurolink loop

# Disable Redis auto-detection and stay in-memory
npx @juspay/neurolink loop --no-auto-redis

# Turn off memory entirely (prompt-by-prompt mode)
npx @juspay/neurolink loop --enable-conversation-memory=false

# Custom retention limits
npx @juspay/neurolink loop --max-sessions 100 --max-turns-per-session 50
```

```typescript

// Create a NeuroLink instance with session state
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,  // (1)!
    store: "redis",  // (2)!
    maxTurnsPerSession: 50,
  },
});

// Simulate loop-like behavior with persistent context
const sessionId = "my-session-123";  // (3)!

// First interaction
const result1 = await neurolink.generate({
  input: { text: "What is NeuroLink?" },
  context: { sessionId },  // (4)!
  provider: "google-ai",
  enableEvaluation: true,
});

// Second interaction - memory preserved
const result2 = await neurolink.generate({
  input: { text: "How do I enable HITL?" },
  context: { sessionId },  // (5)!
  provider: "google-ai",
});

console.log(result2.content);  // AI remembers previous context
```

## Session Commands

Inside the loop prompt (`⎔ neurolink »`) you can manage context without leaving the session:

| Command                | Purpose                                                 | Example                   |
| ---------------------- | ------------------------------------------------------- | ------------------------- |
| `/help`                | Show loop-specific commands plus full CLI help.         | `/help`                   |
| `/set  `   | Persist a generation option (validated against schema). | `/set provider google-ai` |
| `/get `           | Inspect the current value.                              | `/get provider`           |
| `/unset `         | Remove a single session variable.                       | `/unset temperature`      |
| `/show`                | List all session variables.                             | `/show`                   |
| `/clear`               | Reset every session variable.                           | `/clear`                  |
| `exit` / `quit` / `:q` | Leave loop mode.                                        | `exit`                    |

### Common Variables

- `provider` – any provider except `auto` (`/set provider google-ai`).
- `model` – model slug from `models list` (`/set model gemini-2.5-pro`).
- `temperature` – floating point number (`/set temperature 0.6`).
- `enableEvaluation` / `enableAnalytics` – toggles for observability (`/set enableEvaluation true`).
- `context` – JSON-encoded metadata (`/set context {"userId":"42"}`).
- `NEUROLINK_EVALUATION_THRESHOLD` – dynamic quality gate (`/set NEUROLINK_EVALUATION_THRESHOLD 8`).

> Type `/set help` in the loop to view every available key and its validation rules.

## Using CLI Commands in Loop Mode

In loop mode, you can interact with the AI naturally by typing your prompts directly:

```
⎔ neurolink » what are the seven wonders?
⎔ neurolink » explain quantum physics
⎔ neurolink » tell me a story about space exploration
```

To use other CLI commands explicitly, prefix them with a forward slash `/`:

```
⎔ neurolink » /generate "Draft changelog from sprint notes" --enableEvaluation
⎔ neurolink » /batch file.txt
⎔ neurolink » /status --verbose
⎔ neurolink » /models list --capability vision
```

:::tip[Quick Reference]
:::

- **No prefix**: Streams a response to your prompt
- **`/` prefix**: Executes CLI commands or session commands (e.g., `/help`, `/set`, `/generate`, `/batch`)
- **`//` prefix**: Escape to stream prompts starting with `/` (e.g., `//what is /usr/bin?`)
- **Exit commands**: `exit`, `quit`, or `:q` work without prefix to leave loop mode

Errors are handled gracefully; parsing issues surface inline without closing the loop.

## Conversation Memory & Redis Auto-Detect

:::tip[Redis Persistence]
When Redis is detected, loop sessions survive restarts. Exit the loop, close your terminal, and resume later with the same session ID to continue where you left off. Perfect for long-running prompt engineering workflows.
:::

- By default the loop enables conversation memory (`--enable-conversation-memory=true`).
- `--auto-redis` probes for a reachable Redis instance using existing environment variables (`REDIS_URL`, etc.).
- When Redis is available you’ll see `✅ Using Redis for persistent conversation memory` in the banner.
- History is segmented by generated session IDs and stored with tool transcripts.

Manage history with standard CLI commands (inside or outside loop):

```bash
# Overview of stored sessions
npx @juspay/neurolink memory stats

# Export a specific transcript as JSON
npx @juspay/neurolink memory history NL_r1bd2 --format json > transcript.json

# Clear loop history
npx @juspay/neurolink memory clear NL_r1bd2
```

## Best Practices

- Commit to a provider/model via `/set` at the start of a session to avoid noisy auto-routing during experiments.
- Use `/set enableAnalytics true` and `/set enableEvaluation true` to apply observability globally.
- Combine with the interactive setup wizard (`neurolink setup --list`) to configure credentials mid-session.
- If you switch projects, run `/clear` or start a new loop to avoid leaking context.

## Troubleshooting

| Symptom                            | Resolution                                                                              |
| ---------------------------------- | --------------------------------------------------------------------------------------- |
| `A loop session is already active` | Use `exit` in the existing session or close the terminal tab before starting a new one. |
| Redis warning but memory disabled  | Ensure Redis credentials are valid or run with `--no-auto-redis`.                       |
| Session variable rejected          | Run `/set help` to check allowed values; booleans must be `true`/`false`.               |
| Commands exit unexpectedly         | Upgrade to CLI `>=7.47.0` so the session-aware error handler is included.               |

## Related Features

**Q4 2025 Features:**

- [Redis Conversation Export](/docs/features/conversation-history) – Export loop session history as JSON for analytics

**Q3 2025 Features:**

- [Multimodal Chat](/docs/features/multimodal-chat) – Use images in loop sessions
- [Auto Evaluation](/docs/features/auto-evaluation) – Enable quality scoring with `set enableEvaluation true`

**Documentation:**

- [CLI Commands](/docs/cli/commands) – Complete command reference
- [Conversation Memory](/docs/memory/conversation) – Memory system deep dive
- [Mem0 Integration](/docs/memory/mem0) – Semantic memory with vectors

---

## Context Compaction

<!-- Source: features/context-compaction.md -->

# Context Compaction

## Overview

NeuroLink's Context Compaction system automatically manages conversation context windows, preventing overflow errors and maintaining conversation quality as sessions grow longer. It runs transparently before every `generate()` and `stream()` call.

Before each LLM call, the **Budget Checker** estimates the total input tokens needed (system prompt + conversation history + current prompt + tool definitions + file attachments) and compares them against the model's available context window. When usage exceeds the configured threshold (default: 80%), the **ContextCompactor** runs a 4-stage reduction pipeline:

1. **Tool Output Pruning** — Replace old tool results with placeholders (cheapest, no LLM call)
2. **File Read Deduplication** — Keep only the latest read of each file (cheap, no LLM call)
3. **LLM Summarization** — Structured 9-section summary of older messages (expensive, requires LLM call)
4. **Sliding Window Truncation** — Remove oldest messages while preserving the first exchange (fallback, no LLM call)

If a provider still returns a context overflow error after compaction, the system detects it across all supported providers and retries with aggressive compaction.

## SDK Configuration

The full `contextCompaction` block lives inside `conversationMemory`:

```typescript
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    enableSummarization: true,
    summarizationProvider: "vertex", // Provider for summarization LLM calls
    summarizationModel: "gemini-2.5-flash", // Model for summarization LLM calls
    contextCompaction: {
      enabled: true, // Enable auto-compaction (default: true when summarization enabled)
      threshold: 0.8, // Compaction trigger threshold, 0.0–1.0 (default: 0.80)
      enablePruning: true, // Enable Stage 1: tool output pruning (default: true)
      enableDeduplication: true, // Enable Stage 2: file read deduplication (default: true)
      enableSlidingWindow: true, // Enable Stage 4: sliding window fallback (default: true)
      maxToolOutputBytes: 50 * 1024, // Tool output max size in bytes (default: 51200)
      maxToolOutputLines: 2000, // Tool output max lines (default: 2000)
      fileReadBudgetPercent: 0.6, // File read budget as fraction of remaining context (default: 0.60)
    },
  },
});
```

| Field                   | Type      | Default                             | Description                                            |
| ----------------------- | --------- | ----------------------------------- | ------------------------------------------------------ |
| `enabled`               | `boolean` | `true` (when summarization enabled) | Master switch for auto-compaction                      |
| `threshold`             | `number`  | `0.80`                              | Usage ratio (0.0–1.0) that triggers compaction         |
| `enablePruning`         | `boolean` | `true`                              | Enable Stage 1: tool output pruning                    |
| `enableDeduplication`   | `boolean` | `true`                              | Enable Stage 2: file read deduplication                |
| `enableSlidingWindow`   | `boolean` | `true`                              | Enable Stage 4: sliding window truncation fallback     |
| `maxToolOutputBytes`    | `number`  | `51200` (50 KB)                     | Maximum tool output size in bytes before truncation    |
| `maxToolOutputLines`    | `number`  | `2000`                              | Maximum tool output lines before truncation            |
| `fileReadBudgetPercent` | `number`  | `0.60`                              | Fraction of remaining context allocated for file reads |

---

## Environment Variables

These environment variables configure conversation memory and summarization, which in turn affect compaction behavior:

| Variable                           | Default                     | Description                                           |
| ---------------------------------- | --------------------------- | ----------------------------------------------------- |
| `NEUROLINK_MEMORY_ENABLED`         | `"false"`                   | Set to `"true"` to enable conversation memory         |
| `NEUROLINK_SUMMARIZATION_ENABLED`  | `"true"`                    | Set to `"false"` to disable summarization             |
| `NEUROLINK_TOKEN_THRESHOLD`        | auto (80% of model context) | Override token threshold for triggering summarization |
| `NEUROLINK_SUMMARIZATION_PROVIDER` | `"vertex"`                  | Provider for summarization LLM calls                  |
| `NEUROLINK_SUMMARIZATION_MODEL`    | `"gemini-2.5-flash"`        | Model for summarization LLM calls                     |
| `NEUROLINK_MEMORY_MAX_SESSIONS`    | `50`                        | Maximum number of sessions to keep in memory          |

Source: `src/lib/config/conversationMemory.ts`

---

## CLI Flags

The `loop` command accepts compaction-specific flags:

```bash
# Set a custom compaction threshold (0.0–1.0)
neurolink loop --compact-threshold 0.70

# Disable automatic context compaction entirely
neurolink loop --disable-compaction
```

| Flag                   | Type      | Default | Description                                    |
| ---------------------- | --------- | ------- | ---------------------------------------------- |
| `--compact-threshold`  | `number`  | `0.8`   | Context compaction trigger threshold (0.0–1.0) |
| `--disable-compaction` | `boolean` | `false` | Disable automatic context compaction           |

Source: `src/cli/factories/commandFactory.ts:1466-1475`

---

## Public API Methods

### `getContextStats(sessionId, provider?, model?)`

Get context usage statistics for a session. Returns token counts, usage ratio, and whether compaction should trigger.

**Signature:**

```typescript
async getContextStats(
  sessionId: string,
  provider?: string,
  model?: string,
): Promise
```

Returns `null` if conversation memory is not enabled or the session has no messages. The `provider` defaults to `"openai"` if not specified.

**Example:**

```typescript
const stats = await neurolink.getContextStats(
  "session-1",
  "anthropic",
  "claude-sonnet-4-20250514",
);
if (stats) {
  console.log(`Usage: ${(stats.usageRatio * 100).toFixed(0)}%`);
  console.log(
    `Tokens: ${stats.estimatedInputTokens} / ${stats.availableInputTokens}`,
  );
  console.log(`Messages: ${stats.messageCount}`);
  console.log(`Needs compaction: ${stats.shouldCompact}`);
}
```

Source: `src/lib/neurolink.ts:6624-6661`

---

### `compactSession(sessionId, config?)`

Manually trigger context compaction for a session. Runs the full 4-stage pipeline. After compaction, tool pairs are automatically repaired via `repairToolPairs()`.

**Signature:**

```typescript
async compactSession(
  sessionId: string,
  config?: CompactionConfig,
): Promise
```

Returns `null` if conversation memory is not enabled or the session has no messages.

**Example:**

```typescript
const result = await neurolink.compactSession("session-1", {
  enablePrune: true,
  enableDeduplicate: true,
  enableSummarize: true,
  enableTruncate: true,
  pruneProtectTokens: 40_000,
  summarizationProvider: "vertex",
  summarizationModel: "gemini-2.5-flash",
});

if (result?.compacted) {
  console.log(`Stages used: ${result.stagesUsed.join(", ")}`);
  console.log(`Tokens saved: ${result.tokensSaved}`);
  console.log(`Before: ${result.tokensBefore}, After: ${result.tokensAfter}`);
}
```

Source: `src/lib/neurolink.ts:6591-6618`

---

### `needsCompaction(sessionId, provider?, model?)`

Synchronous check of whether a session needs compaction. Uses `checkContextBudget()` internally with the default 80% threshold.

**Signature:**

```typescript
needsCompaction(
  sessionId: string,
  provider?: string,
  model?: string,
): boolean
```

Returns `false` if conversation memory is not enabled or the session doesn't exist. The `provider` defaults to `"openai"` if not specified.

**Example:**

```typescript
if (
  neurolink.needsCompaction(
    "session-1",
    "anthropic",
    "claude-sonnet-4-20250514",
  )
) {
  const result = await neurolink.compactSession("session-1");
  console.log(`Saved ${result?.tokensSaved} tokens`);
}
```

Source: `src/lib/neurolink.ts:6666-6692`

---

## Types Reference

### `CompactionStage`

```typescript
type CompactionStage = "prune" | "deduplicate" | "summarize" | "truncate";
```

### `CompactionResult`

Returned by `compactSession()` and `ContextCompactor.compact()`.

```typescript
type CompactionResult = {
  compacted: boolean; // Whether any compaction was applied
  stagesUsed: CompactionStage[]; // Which stages were used (in order)
  tokensBefore: number; // Estimated tokens before compaction
  tokensAfter: number; // Estimated tokens after compaction
  tokensSaved: number; // tokensBefore - tokensAfter
  messages: ChatMessage[]; // The compacted message array
};
```

### `CompactionConfig`

Optional configuration passed to `compactSession()` or the `ContextCompactor` constructor.

```typescript
type CompactionConfig = {
  enablePrune?: boolean; // Enable Stage 1 (default: true)
  enableDeduplicate?: boolean; // Enable Stage 2 (default: true)
  enableSummarize?: boolean; // Enable Stage 3 (default: true)
  enableTruncate?: boolean; // Enable Stage 4 (default: true)
  pruneProtectTokens?: number; // Recent tool output tokens to protect (default: 40,000)
  pruneMinimumSavings?: number; // Minimum tokens saved to declare pruning success (default: 20,000)
  pruneProtectedTools?: string[]; // Tool names that are never pruned (default: ["skill"])
  summarizationProvider?: string; // Provider for summarization LLM (default: "vertex")
  summarizationModel?: string; // Model for summarization LLM (default: "gemini-2.5-flash")
  keepRecentRatio?: number; // Fraction of messages to keep unsummarized (default: 0.3)
  truncationFraction?: number; // Fraction of oldest messages to remove in Stage 4 (default: 0.5)
  provider?: string; // Provider name for token estimation multipliers (default: "")
};
```

Source: `src/lib/context/contextCompactor.ts:37-65`

### `BudgetCheckResult`

Returned by `checkContextBudget()`.

```typescript
type BudgetCheckResult = {
  withinBudget: boolean; // Whether the request fits within the context window
  estimatedInputTokens: number; // Estimated total input tokens
  availableInputTokens: number; // Available input tokens for this model
  usageRatio: number; // Usage ratio (0.0–1.0+)
  shouldCompact: boolean; // Whether auto-compaction should trigger
  breakdown: {
    systemPrompt: number; // Tokens from system prompt
    conversationHistory: number; // Tokens from conversation history
    currentPrompt: number; // Tokens from current user prompt
    toolDefinitions: number; // Tokens from tool definitions (content-based: JSON.stringify(tool).length / 4)
    fileAttachments: number; // Tokens from file attachments
  };
};
```

### `BudgetCheckParams`

Parameters for `checkContextBudget()`.

```typescript
type BudgetCheckParams = {
  provider: string;
  model?: string;
  maxTokens?: number;
  systemPrompt?: string;
  conversationMessages?: Array;
  currentPrompt?: string;
  toolDefinitions?: unknown[];
  fileAttachments?: Array;
  compactionThreshold?: number; // 0.0–1.0, default: 0.80
};
```

Source: `src/lib/context/budgetChecker.ts:18-54`

---

## The 4-Stage Pipeline

The `ContextCompactor` runs stages sequentially. Each stage only runs if the previous stage didn't bring tokens below the target budget.

### Stage 1: Tool Output Pruning

**File:** `src/lib/context/stages/toolOutputPruner.ts`

Walks messages backwards, protecting the most recent tool outputs, and replaces older tool results with `"[Tool result cleared]"`.

```typescript
function pruneToolOutputs(
  messages: ChatMessage[],
  config?: PruneConfig,
): PruneResult;
```

**`PruneConfig`:**

| Field            | Type       | Default     | Description                                                 |
| ---------------- | ---------- | ----------- | ----------------------------------------------------------- |
| `protectTokens`  | `number`   | `40,000`    | Token budget of recent tool outputs to protect from pruning |
| `minimumSavings` | `number`   | `20,000`    | Minimum tokens that must be saved for pruning to be applied |
| `protectedTools` | `string[]` | `["skill"]` | Tool names that are never pruned                            |
| `provider`       | `string`   | —           | Provider name for token estimation multiplier               |

**`PruneResult`:**

```typescript
type PruneResult = {
  pruned: boolean; // Whether pruning was applied (savings >= minimumSavings)
  messages: ChatMessage[];
  tokensSaved: number;
};
```

### Stage 2: File Read Deduplication

**File:** `src/lib/context/stages/fileReadDeduplicator.ts`

Detects multiple reads of the same file path. Keeps only the latest read, replaces earlier reads with `"[File  - refer to latest read below]"`.

```typescript
function deduplicateFileReads(messages: ChatMessage[]): DeduplicationResult;
```

**`DeduplicationResult`:**

```typescript
type DeduplicationResult = {
  deduplicated: boolean; // Whether dedup was applied (requires 30%+ savings)
  messages: ChatMessage[];
  filesDeduped: number; // Number of unique files that had duplicates removed
};
```

File read detection uses the regex pattern: `/(?:read|reading|read_file|readFile|Read file|cat)\s+['"`]?([^\s'"`\n]+)/i`

A 30% savings threshold (`DEDUP_THRESHOLD = 0.3`) must be met for deduplication to be applied.

### Stage 3: LLM Summarization

**File:** `src/lib/context/stages/structuredSummarizer.ts`

Uses the structured 9-section prompt to summarize older messages while keeping recent ones. Delegates to `generateSummary()` from the conversation memory system.

```typescript
async function summarizeMessages(
  messages: ChatMessage[],
  config?: SummarizeConfig,
): Promise;
```

**`SummarizeConfig`:**

| Field             | Type                                | Default | Description                                            |
| ----------------- | ----------------------------------- | ------- | ------------------------------------------------------ |
| `provider`        | `string`                            | —       | Provider for the summarization LLM call                |
| `model`           | `string`                            | —       | Model for the summarization LLM call                   |
| `keepRecentRatio` | `number`                            | `0.3`   | Fraction of messages to keep unsummarized (minimum: 4) |
| `memoryConfig`    | `Partial` | —       | Memory config passed to `generateSummary()`            |

**`SummarizeResult`:**

```typescript
type SummarizeResult = {
  summarized: boolean;
  messages: ChatMessage[]; // [summaryMessage, ...recentMessages]
  summaryText?: string; // Raw summary text
};
```

Behavior:

- Will not summarize if there are 4 or fewer messages
- Keeps at least 4 recent messages (or `keepRecentRatio` of total, whichever is greater)
- Finds and incorporates any previous summary message for iterative merging
- Summary message is inserted as a `system` role message with `metadata.isSummary = true`
- If summarization fails (LLM error), the pipeline silently falls through to Stage 4

### Stage 4: Sliding Window Truncation

**File:** `src/lib/context/stages/slidingWindowTruncator.ts`

Non-destructive fallback that removes the oldest messages from the middle of the conversation while always preserving the first user-assistant pair.

```typescript
function truncateWithSlidingWindow(
  messages: ChatMessage[],
  config?: TruncationConfig,
): TruncationResult;
```

**`TruncationConfig`:**

| Field      | Type     | Default | Description                                       |
| ---------- | -------- | ------- | ------------------------------------------------- |
| `fraction` | `number` | `0.5`   | Fraction of messages (after first pair) to remove |

**`TruncationResult`:**

```typescript
type TruncationResult = {
  truncated: boolean;
  messages: ChatMessage[]; // [firstPair..., truncationMarker, ...keptMessages]
  messagesRemoved: number; // Always an even number (maintains role alternation)
};
```

Behavior:

- Will not truncate if there are 4 or fewer messages
- Always preserves the first 2 messages (first user-assistant pair)
- Removes an even number of messages to maintain role alternation
- Inserts a `system` role truncation marker: `"[Earlier conversation history was truncated to fit within context limits]"`

---

## ChatMessage Compaction Fields

The `ChatMessage` type has five fields used for non-destructive context management:

```typescript
type ChatMessage = {
  // ... standard fields ...

  condenseId?: string; // UUID identifying this condensation group
  condenseParent?: string; // Points to the summary that replaces this message
  truncationId?: string; // UUID identifying this truncation group
  truncationParent?: string; // Points to the truncation marker that hides this message
  isTruncationMarker?: boolean; // Marks this message as a truncation boundary marker
};
```

| Field                | Purpose                                                                       |
| -------------------- | ----------------------------------------------------------------------------- |
| `condenseId`         | Set on the summary message. Groups all messages that were condensed together. |
| `condenseParent`     | Set on original messages. Points to the `condenseId` of their summary.        |
| `truncationId`       | Set on the truncation marker. Groups all messages hidden by this truncation.  |
| `truncationParent`   | Set on original messages. Points to the `truncationId` of their marker.       |
| `isTruncationMarker` | `true` on the synthetic marker message inserted where messages were removed.  |

Messages with `condenseParent` or `truncationParent` are filtered out by `getEffectiveHistory()` but remain in storage for potential rewind.

Source: `src/lib/types/conversation.ts:270-279`

---

## Non-Destructive History

**File:** `src/lib/context/effectiveHistory.ts`

Messages are tagged rather than deleted, allowing compaction to be unwound.

### `getEffectiveHistory(messages)`

Returns only visible messages by filtering out those with `condenseParent` or `truncationParent`.

```typescript
function getEffectiveHistory(messages: ChatMessage[]): ChatMessage[];
```

### `tagForCondensation(messages, fromIndex, toIndex, condenseId)`

Tags messages in `[fromIndex, toIndex)` with a `condenseParent` pointing to `condenseId`.

```typescript
function tagForCondensation(
  messages: ChatMessage[],
  fromIndex: number,
  toIndex: number,
  condenseId: string,
): ChatMessage[];
```

### `tagForTruncation(messages, fromIndex, toIndex, truncationId)`

Tags messages in `[fromIndex, toIndex)` with a `truncationParent` pointing to `truncationId`.

```typescript
function tagForTruncation(
  messages: ChatMessage[],
  fromIndex: number,
  toIndex: number,
  truncationId: string,
): ChatMessage[];
```

### `removeCondensationTags(messages, condenseId)`

Removes `condenseParent` tags from messages matching `condenseId`, making them visible again. Also removes the summary message itself (matched by `condenseId` + `metadata.isSummary`).

```typescript
function removeCondensationTags(
  messages: ChatMessage[],
  condenseId: string,
): ChatMessage[];
```

### `removeTruncationTags(messages, truncationId)`

Removes `truncationParent` tags from messages matching `truncationId`, making them visible again. Also removes the truncation marker itself (matched by `truncationId` + `isTruncationMarker`).

```typescript
function removeTruncationTags(
  messages: ChatMessage[],
  truncationId: string,
): ChatMessage[];
```

---

## Token Estimation

**File:** `src/lib/utils/tokenEstimation.ts`

Character-based token estimation with per-provider adjustment multipliers. Uses the same approach as Continue (GPT-tokenizer baseline + provider multipliers) without requiring a tokenizer dependency.

### Constants

| Constant                  | Value  | Description                                            |
| ------------------------- | ------ | ------------------------------------------------------ |
| `CHARS_PER_TOKEN`         | `4`    | Characters per token for English text                  |
| `CODE_CHARS_PER_TOKEN`    | `3`    | Characters per token for code                          |
| `TOKEN_SAFETY_MARGIN`     | `1.15` | Safety margin multiplier to avoid underestimation      |
| `TOKENS_PER_MESSAGE`      | `4`    | Message framing overhead in tokens (role + delimiters) |
| `TOKENS_PER_CONVERSATION` | `24`   | Conversation-level overhead in tokens                  |
| `IMAGE_TOKEN_ESTIMATE`    | `1024` | Flat token estimate for images                         |

### Provider Multipliers

Applied on top of the base character estimate:

| Provider      | Multiplier | Notes                                         |
| ------------- | ---------- | --------------------------------------------- |
| `anthropic`   | `1.23`     | Anthropic tokenizer produces ~23% more tokens |
| `google-ai`   | `1.18`     | Google AI Studio                              |
| `vertex`      | `1.18`     | Google Vertex AI                              |
| `mistral`     | `1.26`     | Mistral / Codestral                           |
| `openai`      | `1.0`      | Baseline (GPT-style)                          |
| `azure`       | `1.0`      | Same tokenizer as OpenAI                      |
| `bedrock`     | `1.23`     | Mostly Anthropic models                       |
| `ollama`      | `1.0`      |                                               |
| `litellm`     | `1.0`      |                                               |
| `huggingface` | `1.0`      |                                               |
| `sagemaker`   | `1.0`      |                                               |

### Functions

**`estimateTokens(text, provider?, isCode?)`**

Estimate token count for a string.

```typescript
function estimateTokens(
  text: string,
  provider?: string,
  isCode?: boolean,
): number;
```

Formula: `ceil(text.length / charsPerToken) * providerMultiplier * TOKEN_SAFETY_MARGIN`

**`estimateMessagesTokens(messages, provider?)`**

Estimate total token count for an array of messages, including per-message overhead and conversation-level overhead.

```typescript
function estimateMessagesTokens(
  messages: Array,
  provider?: string,
): number;
```

**`truncateToTokenBudget(text, maxTokens, provider?)`**

Truncate text to fit within a token budget. Tries to cut at sentence or word boundaries. Appends `"..."` if truncated.

```typescript
function truncateToTokenBudget(
  text: string,
  maxTokens: number,
  provider?: string,
): { text: string; truncated: boolean };
```

---

## Context Window Registry

**File:** `src/lib/constants/contextWindows.ts`

### Constants

| Constant                       | Value     | Description                                   |
| ------------------------------ | --------- | --------------------------------------------- |
| `DEFAULT_CONTEXT_WINDOW`       | `128,000` | Fallback when provider/model is unknown       |
| `MAX_DEFAULT_OUTPUT_RESERVE`   | `64,000`  | Maximum output reserve when maxTokens not set |
| `DEFAULT_OUTPUT_RESERVE_RATIO` | `0.35`    | Default output reserve as fraction of context |

### Functions

**`getContextWindowSize(provider, model?)`**

Resolve context window size. Priority: exact model match > provider `_default` > global `DEFAULT_CONTEXT_WINDOW`. Also supports partial model name prefix matching.

```typescript
function getContextWindowSize(provider: string, model?: string): number;
```

**`getAvailableInputTokens(provider, model?, maxTokens?)`**

Calculate available input tokens: `contextWindow - outputReserve`.

```typescript
function getAvailableInputTokens(
  provider: string,
  model?: string,
  maxTokens?: number,
): number;
```

**`getOutputReserve(contextWindow, maxTokens?)`**

Calculate output token reserve. Uses explicit `maxTokens` if provided, otherwise `min(MAX_DEFAULT_OUTPUT_RESERVE, contextWindow * DEFAULT_OUTPUT_RESERVE_RATIO)`.

```typescript
function getOutputReserve(contextWindow: number, maxTokens?: number): number;
```

### `MODEL_CONTEXT_WINDOWS`

Complete per-provider, per-model context window registry:

| Provider        | Model                                       | Context Window |
| --------------- | ------------------------------------------- | -------------- |
| **anthropic**   | `_default`                                  | 200,000        |
|                 | `claude-opus-4-20250514`                    | 200,000        |
|                 | `claude-sonnet-4-20250514`                  | 200,000        |
|                 | `claude-3-7-sonnet-20250219`                | 200,000        |
|                 | `claude-3-5-sonnet-20241022`                | 200,000        |
|                 | `claude-3-5-haiku-20241022`                 | 200,000        |
|                 | `claude-3-opus-20240229`                    | 200,000        |
|                 | `claude-3-sonnet-20240229`                  | 200,000        |
|                 | `claude-3-haiku-20240307`                   | 200,000        |
| **openai**      | `_default`                                  | 128,000        |
|                 | `gpt-4o`                                    | 128,000        |
|                 | `gpt-4o-mini`                               | 128,000        |
|                 | `gpt-4-turbo`                               | 128,000        |
|                 | `gpt-4`                                     | 8,192          |
|                 | `gpt-3.5-turbo`                             | 16,385         |
|                 | `o1`                                        | 200,000        |
|                 | `o1-mini`                                   | 128,000        |
|                 | `o1-pro`                                    | 200,000        |
|                 | `o3`                                        | 200,000        |
|                 | `o3-mini`                                   | 200,000        |
|                 | `o4-mini`                                   | 200,000        |
|                 | `gpt-4.1`                                   | 1,047,576      |
|                 | `gpt-4.1-mini`                              | 1,047,576      |
|                 | `gpt-4.1-nano`                              | 1,047,576      |
|                 | `gpt-5`                                     | 1,047,576      |
| **google-ai**   | `_default`                                  | 1,048,576      |
|                 | `gemini-2.5-pro`                            | 1,048,576      |
|                 | `gemini-2.5-flash`                          | 1,048,576      |
|                 | `gemini-2.0-flash`                          | 1,048,576      |
|                 | `gemini-1.5-pro`                            | 2,097,152      |
|                 | `gemini-1.5-flash`                          | 1,048,576      |
|                 | `gemini-3-flash-preview`                    | 1,048,576      |
|                 | `gemini-3-pro-preview`                      | 1,048,576      |
| **vertex**      | `_default`                                  | 1,048,576      |
|                 | `gemini-2.5-pro`                            | 1,048,576      |
|                 | `gemini-2.5-flash`                          | 1,048,576      |
|                 | `gemini-2.0-flash`                          | 1,048,576      |
|                 | `gemini-1.5-pro`                            | 2,097,152      |
|                 | `gemini-1.5-flash`                          | 1,048,576      |
| **bedrock**     | `_default`                                  | 200,000        |
|                 | `anthropic.claude-3-5-sonnet-20241022-v2:0` | 200,000        |
|                 | `anthropic.claude-3-5-haiku-20241022-v1:0`  | 200,000        |
|                 | `anthropic.claude-3-opus-20240229-v1:0`     | 200,000        |
|                 | `anthropic.claude-3-sonnet-20240229-v1:0`   | 200,000        |
|                 | `anthropic.claude-3-haiku-20240307-v1:0`    | 200,000        |
|                 | `amazon.nova-pro-v1:0`                      | 300,000        |
|                 | `amazon.nova-lite-v1:0`                     | 300,000        |
| **azure**       | `_default`                                  | 128,000        |
|                 | `gpt-4o`                                    | 128,000        |
|                 | `gpt-4o-mini`                               | 128,000        |
|                 | `gpt-4-turbo`                               | 128,000        |
|                 | `gpt-4`                                     | 8,192          |
| **mistral**     | `_default`                                  | 128,000        |
|                 | `mistral-large-latest`                      | 128,000        |
|                 | `mistral-medium-latest`                     | 32,000         |
|                 | `mistral-small-latest`                      | 128,000        |
|                 | `codestral-latest`                          | 256,000        |
| **ollama**      | `_default`                                  | 128,000        |
| **litellm**     | `_default`                                  | 128,000        |
| **huggingface** | `_default`                                  | 32,000         |
| **sagemaker**   | `_default`                                  | 128,000        |

---

## Error Detection

**File:** `src/lib/context/errorDetection.ts`

Cross-provider regex patterns to detect context window overflow errors.

### `isContextOverflowError(error)`

Returns `true` if the error matches any known context overflow pattern.

```typescript
function isContextOverflowError(error: unknown): boolean;
```

Accepts `Error` objects, strings, or objects with `message`/`error` properties. Also inspects `error.cause` for nested errors.

### `getContextOverflowProvider(error)`

Identifies which provider produced the context overflow error.

```typescript
function getContextOverflowProvider(error: unknown): string | null;
```

Returns the provider name string or `null` if no match.

### Supported Provider Patterns

| Provider     | Error Patterns                                                                            |
| ------------ | ----------------------------------------------------------------------------------------- |
| `openai`     | `"This model's maximum context length is"`, `"reduce the length of the messages"`         |
| `azure`      | `"content_length_exceeded"`                                                               |
| `google`     | `"RESOURCE_EXHAUSTED"`, `"exceeds the maximum number of tokens"`, `"content is too long"` |
| `bedrock`    | `"ValidationException.*token"`, `"Input is too long"`, `"exceeds the model's maximum"`    |
| `mistral`    | `"context length exceeded"`, `"maximum number of tokens"`                                 |
| `openrouter` | `"context_length_exceeded"`                                                               |
| `anthropic`  | `"prompt is too long"`, `"input is too long"`, `"too many tokens"`                        |

### Non-Retryable Error Handling

When `isContextOverflowError()` detects that an error is a context overflow, the MCP generation retry loop (`performMCPGenerationRetries`) breaks immediately instead of retrying up to 3 times. This prevents wasting API calls on errors that cannot succeed without compaction.

Additionally, errors with `statusCode === 400` or `isRetryable === false` are treated as non-retryable and break the retry loop immediately.

### Post-Failure Compaction Passthrough

When a generation call fails with a context overflow error and compaction is triggered, the compacted messages are passed through via `options.conversationMessages` to `directProviderGeneration()`, which uses them instead of re-fetching from memory. The compaction target is set to `Math.floor(availableInputTokens * 0.7)` (70% of available context) to leave headroom.

---

## Tool Output Limits

**File:** `src/lib/context/toolOutputLimits.ts`

Truncates individual tool outputs that exceed size limits. Can optionally save the full output to disk.

### Constants

| Constant                | Value           | Description                  |
| ----------------------- | --------------- | ---------------------------- |
| `MAX_TOOL_OUTPUT_BYTES` | `51200` (50 KB) | Maximum tool output in bytes |
| `MAX_TOOL_OUTPUT_LINES` | `2000`          | Maximum tool output lines    |

### `truncateToolOutput(output, options?)`

```typescript
function truncateToolOutput(
  output: string,
  options?: TruncateOptions,
): TruncateResult;
```

**`TruncateOptions`:**

```typescript
type TruncateOptions = {
  maxBytes?: number; // Default: MAX_TOOL_OUTPUT_BYTES (51200)
  maxLines?: number; // Default: MAX_TOOL_OUTPUT_LINES (2000)
  direction?: "head" | "tail"; // Which end to keep (default: "tail")
  saveToDisk?: boolean; // Save full output to disk (default: false)
  saveDir?: string; // Directory for saved output (default: os.tmpdir()/neurolink-tool-output)
};
```

**`TruncateResult`:**

```typescript
type TruncateResult = {
  content: string; // Truncated content with notice appended
  truncated: boolean; // Whether truncation was applied
  savedPath?: string; // Path to saved full output (if saveToDisk was true)
  originalSize: number; // Original size in bytes
};
```

When truncated, a notice is appended: `[Output truncated from X bytes to Y bytes]` (with optional saved path).

---

## File Token Budget

**File:** `src/lib/context/fileTokenBudget.ts`

Calculates how much of the remaining context window can be used for file reads. Implements fast-path for small files and preview mode for very large files.

### Constants

| Constant                   | Value             | Description                                       |
| -------------------------- | ----------------- | ------------------------------------------------- |
| `FILE_READ_BUDGET_PERCENT` | `0.6`             | 60% of remaining context allocated for file reads |
| `FILE_FAST_PATH_SIZE`      | `102400` (100 KB) | Files below this size skip budget validation      |
| `FILE_PREVIEW_MODE_SIZE`   | `5242880` (5 MB)  | Files above this size get preview-only mode       |
| `FILE_PREVIEW_CHARS`       | `2000`            | Default preview size in characters                |

### `calculateFileTokenBudget(contextWindow, currentTokens, maxOutputTokens)`

Calculate available token budget for file reads.

```typescript
function calculateFileTokenBudget(
  contextWindow: number,
  currentTokens: number,
  maxOutputTokens: number,
): number;
```

Formula: `floor((contextWindow - currentTokens - maxOutputTokens) * FILE_READ_BUDGET_PERCENT)`

Returns `0` if remaining tokens is zero or negative.

### `enforceAggregateFileBudget(files, provider, model, maxTokens)`

**File:** `src/lib/context/fileTokenBudget.ts`

Enforces a total token budget across all file attachments in a single request. When the aggregate content of all files exceeds the available context budget, files are truncated proportionally or dropped to fit.

This prevents the scenario where multiple large file attachments (e.g., 5 files totaling 2.8 MB) overflow the context window on the very first message — before any conversation history exists to compact.

```typescript
function enforceAggregateFileBudget(
  files: Array,
  provider: string,
  model?: string,
  maxTokens?: number,
): Array;
```

Called automatically by `buildMultimodalMessagesArray()` before the file processing loop.

### `shouldTruncateFile(fileSize, budget)`

Determine how a file should be handled based on its size and the token budget.

```typescript
function shouldTruncateFile(
  fileSize: number,
  budget: number,
): { shouldTruncate: boolean; maxChars?: number; previewMode?: boolean };
```

Decision logic:

- `fileSize > FILE_PREVIEW_MODE_SIZE (5MB)` → preview mode (2000 chars)
- `fileSize  - conversation was compacted]"`
- Synthetic messages have `metadata.truncated = true`

This runs automatically after `compactSession()`.

---

## CLI Session Warnings

**File:** `src/cli/loop/session.ts:300-354`

In loop mode, the CLI checks context budget after each turn and displays warnings:

**At >60% usage** (informational, gray text):

```
  Context: 65% used
```

**At >=80% usage** (warning, yellow text — compaction threshold reached):

```
  Context usage: 83% of window (166,000 / 200,000 tokens)
  Auto-compaction will trigger to preserve conversation quality.
```

These warnings only appear when `contextCompaction.enabled` is `true` in the session config.

---

## Provider Support

Summary table of default context windows by provider:

| Provider     | Default Context Window | Notable Models                                     |
| ------------ | ---------------------- | -------------------------------------------------- |
| Anthropic    | 200,000                | All Claude 3/3.5/4 models                          |
| OpenAI       | 128,000                | GPT-4o, o1/o3 (200K), GPT-4.1/GPT-5 (1M+)          |
| Google AI    | 1,048,576              | Gemini 2.x/3.x (1M), Gemini 1.5 Pro (2M)           |
| Vertex       | 1,048,576              | Gemini 2.x (1M), Gemini 1.5 Pro (2M)               |
| Bedrock      | 200,000                | Claude models (200K), Nova (300K)                  |
| Azure        | 128,000                | GPT-4o, GPT-4-turbo; GPT-4 (8K)                    |
| Mistral      | 128,000                | Large/Small (128K), Medium (32K), Codestral (256K) |
| Ollama       | 128,000                | Configurable per model                             |
| LiteLLM      | 128,000                | Passthrough to underlying provider                 |
| Hugging Face | 32,000                 | Model-dependent                                    |
| SageMaker    | 128,000                | Model-dependent                                    |

---

## Redis Conversation History Export

<!-- Source: features/conversation-history.md -->

# Redis Conversation History Export

> **Since**: v7.38.0 | **Status**: Stable | **Availability**: SDK + CLI

## Overview

**What it does**: Export complete conversation session history from Redis storage as JSON for analytics, debugging, and compliance auditing.

**Why use it**: Access structured conversation data for analysis, user behavior insights, quality assurance, and debugging failed sessions. Essential for production observability.

**Common use cases**:

- Debugging failed or problematic conversations
- Analytics and user behavior analysis
- Compliance and audit trail generation
- Quality assurance and model evaluation
- Training data collection for fine-tuning

## Quick Start

:::warning[Redis Required]
Conversation history export **only works with Redis storage**. In-memory storage does not support export functionality. Configure Redis before enabling conversation memory.
:::

### SDK Example

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis", // Required for export functionality
  },
});

// Have a conversation
await neurolink.generate({
  prompt: "What is machine learning?",
  context: { sessionId: "session-123" },
});

// Get the conversation history
const history = await neurolink.getConversationHistory("session-123");
// Returns: Promise

console.log(history);
// [
//   { role: "user", content: "What is machine learning?" },
//   { role: "assistant", content: "..." }
// ]

// Clear a specific session
const cleared = await neurolink.clearConversationSession("session-123");
// Returns: Promise

// Clear all conversations
await neurolink.clearAllConversations();
// Returns: Promise
```

### CLI Example

> **Planned Feature**
>
> The `neurolink memory` CLI subcommand is planned for a future release.
> The commands shown below represent the intended interface once implemented.

```bash
# Enable Redis-backed conversation memory
npx @juspay/neurolink loop --enable-conversation-memory --store redis

# Have a conversation (session ID auto-generated)
> Tell me about AI
[AI response...]

# Export conversation history
npx @juspay/neurolink memory export --session-id  --format json > conversation.json

# Or export all sessions
npx @juspay/neurolink memory export-all --output ./exports/
```

## Configuration

| Option            | Type              | Default  | Required | Description                    |
| ----------------- | ----------------- | -------- | -------- | ------------------------------ |
| `sessionId`       | `string`          | -        | Yes      | Unique session identifier      |
| `format`          | `"json" \| "csv"` | `"json"` | No       | Export format                  |
| `includeMetadata` | `boolean`         | `true`   | No       | Include session metadata       |
| `startTime`       | `Date`            | -        | No       | Filter: export from this time  |
| `endTime`         | `Date`            | -        | No       | Filter: export until this time |

### Environment Variables

```bash
# Redis connection (required for export)
export REDIS_URL="redis://localhost:6379"
# or
export REDIS_HOST="localhost"
export REDIS_PORT="6379"
export REDIS_PASSWORD="your-password"  # if needed

# Conversation memory settings
export NEUROLINK_MEMORY_ENABLED="true"
export NEUROLINK_MEMORY_STORE="redis"
export NEUROLINK_MEMORY_MAX_TURNS_PER_SESSION="100"
```

### Config File

```typescript
// .neurolink.config.ts
export default {
  conversationMemory: {
    enabled: true,
    store: "redis", // Required for persistent history
    redis: {
      host: process.env.REDIS_HOST || "localhost",
      port: parseInt(process.env.REDIS_PORT || "6379"),
      password: process.env.REDIS_PASSWORD,
    },
    maxTurnsPerSession: 100,
  },
};
```

## How It Works

### Data Flow

1. **Conversation occurs** → Each turn stored in Redis with session ID
2. **Export requested** → SDK/CLI queries Redis for session
3. **Data aggregated** → Turns assembled with metadata
4. **Format applied** → JSON or CSV serialization
5. **Output delivered** → File or console output

### Redis Storage Structure

```
neurolink:session:{sessionId}:turns → List of conversation turns
neurolink:session:{sessionId}:metadata → Session metadata
neurolink:sessions → Set of all active session IDs
```

### Data Schema (JSON Export)

```json
{
  "sessionId": "session-abc123",
  "userId": "user-456",
  "createdAt": "2025-09-30T10:00:00Z",
  "updatedAt": "2025-09-30T10:15:00Z",
  "turns": [
    {
      "index": 0,
      "role": "user",
      "content": "What is NeuroLink?",
      "timestamp": "2025-09-30T10:00:00Z"
    },
    {
      "index": 1,
      "role": "assistant",
      "content": "NeuroLink is an enterprise AI development platform...",
      "timestamp": "2025-09-30T10:00:05Z",
      "model": "gpt-4",
      "provider": "openai",
      "tokens": { "prompt": 12, "completion": 45 }
    }
  ],
  "metadata": {
    "provider": "openai",
    "model": "gpt-4",
    "totalTurns": 2,
    "toolsUsed": ["web-search", "calculator"]
  }
}
```

## Advanced Usage

### Retrieve Session History

```typescript
// Get conversation history for a specific session
const history = await neurolink.getConversationHistory("session-123");
// Returns: Promise

// Process the history
for (const message of history) {
  console.log(`${message.role}: ${message.content}`);
}
```

### Clear Session Data

```typescript
// Clear a specific session
const cleared = await neurolink.clearConversationSession("session-123");
if (cleared) {
  console.log("Session cleared successfully");
}

// Clear all conversations
await neurolink.clearAllConversations();
console.log("All conversations cleared");
```

### Export History to File

```typescript
// Get history and save to JSON file
const history = await neurolink.getConversationHistory("session-123");

await fs.writeFile(
  `./exports/session-123.json`,
  JSON.stringify(history, null, 2),
);
```

### Integration with Analytics Pipeline

:::tip[Analytics Integration]
Pipe exported conversation data directly to your analytics dashboards for user behavior insights, quality metrics, and model performance tracking. Combine with [Auto Evaluation](/docs/features/auto-evaluation) for comprehensive quality monitoring.
:::

```typescript

// After each conversation session ends
async function processSession(sessionId: string) {
  // Get conversation history
  const history = await neurolink.getConversationHistory(sessionId);

  // Send to analytics
  await analyticsService.track("conversation_completed", {
    sessionId,
    turnCount: history.length,
    messages: history,
  });

  // Archive to data warehouse
  await dataWarehouse.store("conversations", { sessionId, messages: history });

  // Optionally clear the session after archiving
  await neurolink.clearConversationSession(sessionId);
}
```

## API Reference

### SDK Methods

```typescript
// Get conversation history for a session
const history = await neurolink.getConversationHistory(sessionId);
// Returns: Promise

// Clear a specific session
const cleared = await neurolink.clearConversationSession(sessionId);
// Returns: Promise

// Clear all conversations
await neurolink.clearAllConversations();
// Returns: Promise
```

### CLI Commands

> **Planned Feature**
>
> The `neurolink memory` CLI subcommand is planned for a future release.
> The commands shown below represent the intended interface once implemented.

- `neurolink memory export --session-id ` → Export single session (planned)
- `neurolink memory export-all` → Export all sessions (planned)
- `neurolink memory list` → List active sessions (planned)
- `neurolink memory delete --session-id ` → Delete session (planned)

See [conversation-memory.md](/docs/memory/conversation) for complete memory system documentation.

## Troubleshooting

### Problem: getConversationHistory returns empty array

**Cause**: Session ID doesn't exist or Redis not configured
**Solution**:

```bash
# Verify Redis connection
redis-cli ping  # Should return PONG

# Check environment variables
echo $REDIS_URL
```

```typescript
// Verify the session exists before retrieving
const history = await neurolink.getConversationHistory(sessionId);
if (history.length === 0) {
  console.log("No messages found for session:", sessionId);
}
```

### Problem: Redis connection failed

**Cause**: Redis server not running or incorrect credentials
**Solution**:

```bash
# Start Redis locally
redis-server

# Or use Docker
docker run -d -p 6379:6379 redis:latest

# Test connection
redis-cli -h localhost -p 6379 ping
```

### Problem: Need additional metadata with history

**Cause**: `getConversationHistory` returns only message array
**Solution**:

```typescript
// Add your own metadata when archiving
const history = await neurolink.getConversationHistory("session-123");
const enrichedHistory = {
  sessionId: "session-123",
  messages: history,
  exportedAt: new Date().toISOString(),
  messageCount: history.length,
};
```

### Problem: Memory command not found in CLI

**Cause**: The `neurolink memory` subcommand is a planned feature
**Solution**:

The CLI memory subcommand is planned for a future release. In the meantime, use the SDK methods directly:

```typescript
// Use SDK methods for conversation history management
const history = await neurolink.getConversationHistory(sessionId);
await neurolink.clearConversationSession(sessionId);
await neurolink.clearAllConversations();
```

## Best Practices

### Data Retention

1. **Set TTL on sessions** - Auto-delete old conversations

```typescript
config: {
  conversationMemory: {
    redis: {
      ttl: 7 * 24 * 60 * 60,  // 7 days in seconds
    },
  },
}
```

2. **Archive regularly** - Export to long-term storage

```typescript
// Archive a session before clearing
async function archiveSession(sessionId: string) {
  const history = await neurolink.getConversationHistory(sessionId);
  await s3.upload(`archives/${sessionId}.json`, JSON.stringify(history));
  await neurolink.clearConversationSession(sessionId); // Clean up
}
```

### Privacy & Compliance

```typescript
// Redact PII before archiving
async function archiveWithRedaction(sessionId: string) {
  const history = await neurolink.getConversationHistory(sessionId);

  // Redact sensitive data
  const redactedHistory = history.map((message) => ({
    ...message,
    content:
      typeof message.content === "string"
        ? redactPII(message.content) // Remove emails, phone numbers, etc.
        : message.content,
  }));

  return { sessionId, messages: redactedHistory };
}
```

### Session Cleanup

```typescript
// Clean up old sessions
async function cleanupSession(sessionId: string) {
  // Archive first if needed
  const history = await neurolink.getConversationHistory(sessionId);
  if (history.length > 0) {
    await archiveToStorage(sessionId, history);
  }

  // Clear the session
  const cleared = await neurolink.clearConversationSession(sessionId);
  console.log(`Session ${sessionId} cleared: ${cleared}`);
}

// Clear all conversations (use with caution)
async function clearAllData() {
  await neurolink.clearAllConversations();
  console.log("All conversations cleared");
}
```

## Use Cases

### Quality Assurance

```typescript
// Review conversations for specific sessions
const failedSessions = await db.query(
  "SELECT session_id FROM sessions WHERE error IS NOT NULL",
);

for (const { session_id } of failedSessions) {
  const history = await neurolink.getConversationHistory(session_id);

  // Analyze why conversation failed
  analyzeFailure({ sessionId: session_id, messages: history });
}
```

### Session Review

```typescript
// Review a specific session's conversation
async function reviewSession(sessionId: string) {
  const history = await neurolink.getConversationHistory(sessionId);

  const report = {
    sessionId,
    messageCount: history.length,
    messages: history.map((msg) => ({
      role: msg.role,
      contentPreview:
        typeof msg.content === "string"
          ? msg.content.substring(0, 100)
          : "[complex content]",
    })),
  };

  console.table(report.messages);
  return report;
}
```

## Related Features

- [CLI Loop Sessions](/docs/features/cli-loop-sessions) - Persistent conversation mode
- [Conversation Memory](/docs/memory/conversation) - Full memory system docs
- [Mem0 Integration](/docs/memory/mem0) - Semantic memory with vectors
- [Analytics Integration](/docs/reference/analytics) - Track conversation metrics

## Migration Notes

If upgrading from in-memory to Redis-backed storage:

1. Enable Redis in configuration
2. Existing in-memory sessions will be lost (not migrated)
3. New sessions automatically stored in Redis
4. Export functionality only works with Redis store
5. Consider gradual rollout with feature flag

For complete conversation memory system documentation, see [conversation-memory.md](/docs/memory/conversation).

---

## CSV File Support

<!-- Source: features/csv-support.md -->

# CSV File Support

NeuroLink provides seamless CSV file support as a **multimodal input type** - attach CSV files directly to your AI prompts for data analysis, insights, and processing.

## Overview

CSV support in NeuroLink works just like image support - it's a multimodal input that gets automatically processed and injected into your prompts. The system:

1. **Auto-detects** CSV files using FileDetector (magic bytes, MIME types, extensions, content heuristics)
2. **Parses** CSV data using streaming parser for memory efficiency
3. **Formats** CSV content into LLM-optimized text (markdown/json)
4. **Injects** formatted CSV data into your prompt text
5. **Works** with ALL AI providers (not limited to vision models)

## Quick Start

### SDK Usage

```typescript

const neurolink = new NeuroLink();

// Basic CSV analysis
const result = await neurolink.generate({
  input: {
    text: "What are the key trends in this sales data?",
    csvFiles: ["sales-2024.csv"],
  },
});

// Multiple CSV files
const comparison = await neurolink.generate({
  input: {
    text: "Compare Q1 vs Q2 performance and identify growth areas",
    csvFiles: ["q1-sales.csv", "q2-sales.csv"],
  },
});

// Auto-detect file types (mix CSV and images)
const multimodal = await neurolink.generate({
  input: {
    text: "Analyze this data and compare with the chart",
    files: ["data.csv", "chart.png"], // Auto-detects which is CSV vs image
  },
});

// Customize CSV processing
const custom = await neurolink.generate({
  input: {
    text: "Summarize the top 100 customers by revenue",
    csvFiles: ["customers.csv"],
  },
  csvOptions: {
    maxRows: 100, // Limit to first 100 rows
    formatStyle: "markdown", // Use markdown table format
    includeHeaders: true, // Include CSV headers
  },
});
```

### CLI Usage

```bash
# Attach CSV files to your prompt
neurolink generate "Analyze this sales data" --csv sales.csv

# Multiple CSV files
neurolink generate "Compare these datasets" --csv q1.csv --csv q2.csv

# Auto-detect file types
neurolink generate "Analyze data and image" --file data.csv --file chart.png

# Customize CSV processing
neurolink generate "Summarize trends" \
  --csv large-dataset.csv \
  --csv-max-rows 500 \
  --csv-format json

# Stream mode also supports CSV
neurolink stream "Explain this data in detail" --csv data.csv

# Batch processing with CSV
echo "Summarize sales data" > prompts.txt
echo "Find top performers" >> prompts.txt
neurolink batch prompts.txt --csv sales.csv
```

## API Reference

### GenerateOptions

```typescript
type GenerateOptions = {
  input: {
    text: string;
    images?: Array;
    csvFiles?: Array; // Explicit CSV files
    files?: Array; // Auto-detect file types
  };

  csvOptions?: {
    maxRows?: number; // Default: 1000
    formatStyle?: "raw" | "markdown" | "json"; // Default: "raw"
    includeHeaders?: boolean; // Default: true
  };

  // ... other options
};
```

### CSV Input Types

CSV files can be provided as:

- **File paths**: `"./data.csv"` or `"/absolute/path/data.csv"`
- **URLs**: `"https://example.com/data.csv"`
- **Buffers**: `Buffer.from("name,age\nAlice,30")`
- **Data URIs**: `"data:text/csv;base64,..."`

```typescript
// File path
await neurolink.generate({
  input: {
    text: "Analyze this",
    csvFiles: ["./data.csv"],
  },
});

// URL
await neurolink.generate({
  input: {
    text: "Analyze this",
    csvFiles: ["https://example.com/data.csv"],
  },
});

// Buffer
const csvBuffer = Buffer.from("name,age\nAlice,30\nBob,25");
await neurolink.generate({
  input: {
    text: "Analyze this",
    csvFiles: [csvBuffer],
  },
});
```

### CSV Processing Options

#### maxRows

Limit the number of rows processed (default: 1000). Useful for large datasets.

```typescript
csvOptions: {
  maxRows: 100; // Only process first 100 rows
}
```

#### formatStyle

Control how CSV data is formatted for the LLM:

- **`raw`** (default, RECOMMENDED): Original CSV format with proper escaping
  - Best for large files and minimal token usage
  - Preserves original structure
  - Handles commas, quotes, newlines correctly
  - File size stays minimal (63KB stays 63KB, not 199KB)

- **`json`**: JSON array format
  - Best for structured data processing
  - Easy to parse programmatically
  - Higher token usage (can expand 3x for large files)

- **`markdown`**: Markdown table format
  - Best for small datasets (\ Not binary (0% confidence)
// 2. Check MIME type (if URL) -> text/csv (85% confidence) ✓ STOP
// Result: Detected as CSV with 85% confidence
```

## How It Works

### Internal Processing Flow

````typescript
// When you call generate() with CSV files:
await neurolink.generate({
  input: {
    text: "Analyze this data",
    csvFiles: ["data.csv"],
  },
});

// Internal flow:
// 1. messageBuilder.ts detects csvFiles array
// 2. Calls FileDetector.detectAndProcess("data.csv")
// 3. FileDetector runs detection strategies
// 4. Loads file content (from path/URL/buffer)
// 5. Routes to CSVProcessor.process(buffer)
// 6. CSV parsed using streaming csv-parser library
// 7. Formatted to LLM-optimized text (raw/markdown/json)
// 8. Appends to prompt text:
//    "Analyze this data
//
//    ## CSV Data from "data.csv":
//    ```csv
//    name,age,city
//    Alice,30,New York
//    Bob,25,London
//    ```"
// 9. Sends to AI provider
````

### Memory Efficiency

CSV files are parsed using **streaming** for memory efficiency:

```typescript
// CSVProcessor uses Readable streams
Readable.from([csvString])
  .pipe(csvParser())
  .on("data", (row) => {
    if (count < maxRows) rows.push(row);
  });
```

Large CSV files are handled efficiently:

- **Streaming parser**: Processes line-by-line
- **Row limit**: Configurable `maxRows` (default: 1000)
- **Memory bounded**: Only holds limited rows in memory

## Examples

### Data Analysis

```typescript
const result = await neurolink.generate({
  input: {
    text: `Analyze this customer data and provide:
    1. Total customers
    2. Average age
    3. Top 5 cities by customer count
    4. Any notable patterns or insights`,
    csvFiles: ["customers.csv"],
  },
});
```

### Data Comparison

```typescript
const result = await neurolink.generate({
  input: {
    text: "Compare Q1 vs Q2 sales data. What changed? Which products improved?",
    csvFiles: ["q1-sales.csv", "q2-sales.csv"],
  },
});
```

### Data Cleaning

```typescript
const result = await neurolink.generate({
  input: {
    text: `Review this data for:
    - Missing values
    - Duplicate entries
    - Data quality issues
    - Suggested corrections`,
    csvFiles: ["raw-data.csv"],
  },
  csvOptions: {
    maxRows: 100,
    formatStyle: "markdown",
  },
});
```

### Schema Generation

```typescript
const result = await neurolink.generate({
  input: {
    text: "Generate a JSON schema for this CSV data with appropriate types and constraints",
    csvFiles: ["sample-data.csv"],
  },
  csvOptions: {
    maxRows: 50,
    formatStyle: "json",
  },
});
```

### Multimodal Analysis

```typescript
const result = await neurolink.generate({
  input: {
    text: "Compare the sales chart with the actual CSV data. Do they match?",
    files: ["sales-chart.png", "sales-data.csv"],
  },
});
```

## TypeScript Types

Only **types** are exposed from the package (not classes):

```typescript

  FileType,
  FileInput,
  FileSource,
  FileDetectionResult,
  FileProcessingResult,
  CSVProcessorOptions,
  FileDetectorOptions,
  CSVContent,
} from "@juspay/neurolink";

// FileType union
type FileType = "csv" | "image" | "pdf" | "text" | "unknown";

// CSV processing options
type CSVProcessorOptions = {
  maxRows?: number;
  formatStyle?: "raw" | "markdown" | "json";
  includeHeaders?: boolean;
};

// File detector options
type FileDetectorOptions = {
  maxSize?: number;
  timeout?: number;
  allowedTypes?: FileType[];
};
```

## Best Practices

### 1. Use Raw Format for Large Files

The `raw` format is **recommended** for large files and best token efficiency:

```typescript
csvOptions: {
  formatStyle: "raw",
} // ✅ RECOMMENDED for large files

// Use json for smaller datasets or when you need structured parsing
csvOptions: {
  formatStyle: "json",
} // ✅ Good for small-medium files
```

### 2. Limit Rows for Large Files

For large datasets, limit rows to avoid token limits:

```typescript
csvOptions: {
  maxRows: 500,
} // Process first 500 rows
```

### 3. Use Markdown for Small Datasets

For \<100 rows, markdown tables are more readable:

```typescript
csvOptions: {
  maxRows: 50,
  formatStyle: "markdown"
}
```

### 4. Provide Clear Instructions

Give the AI clear instructions about what to analyze:

```typescript
input: {
  text: `Analyze this sales data and provide:
  1. Total revenue
  2. Top 5 products
  3. Revenue trend
  4. Recommendations`,
  csvFiles: ["sales.csv"],
}
```

### 5. Use Auto-Detection

Let FileDetector handle mixed file types:

```typescript
files: ["data.csv", "chart.png", "report.pdf"]; // Auto-detects each type
```

## Limitations

- **Max file size**: 10MB by default (configurable)
- **Max rows**: 1000 by default (configurable)
- **Encoding**: UTF-8 recommended (auto-detected)
- **Token limits**: Large CSV files may exceed provider token limits
- **Streaming**: CSV content is parsed and formatted before sending (not streamed to LLM)

## Error Handling

```typescript
try {
  const result = await neurolink.generate({
    input: {
      text: "Analyze this",
      csvFiles: ["data.csv"],
    },
  });
} catch (error) {
  if (error.message.includes("File too large")) {
    // Handle file size error
  } else if (error.message.includes("not allowed")) {
    // Handle file type restriction
  } else if (error.message.includes("CSV")) {
    // Handle CSV parsing error
  }
}
```

## Related Features

- **[Office Documents](/docs/features/office-documents)**: DOCX, PPTX, XLSX processing
- **[PDF Support](/docs/features/pdf-support)**: PDF document processing
- **Image Support**: Similar multimodal input for images
- **File Detection**: Auto-detect file types with confidence scores
- **Memory Efficient**: Streaming parser for large files
- **Provider Agnostic**: Works with all AI providers
- **CLI Integration**: Full CLI support with options

## Summary

- CSV support is **multimodal input** (like images)
- Use `csvFiles` array or `files` array (auto-detect)
- Customize with `csvOptions` (maxRows, formatStyle, includeHeaders)
- Works with **ALL providers** (not just vision models)
- **Memory efficient** streaming parser
- CLI support with `--csv`, `--file`, `--csv-max-rows`, `--csv-format`
- Only **types** exposed from package (not classes)

---

## Enterprise Human-in-the-Loop System

<!-- Source: features/enterprise-hitl.md -->

# Enterprise Human-in-the-Loop System

> **Since**: v7.39.0 | **Status**: Production Ready | **Availability**: SDK & CLI

:::note[Feature Status - Enterprise HITL]
This document describes enterprise HITL features. Some advanced features (marked as "Planned")
are not yet implemented and represent the target API design for future releases.
:::

    **Currently Available:** Basic HITL with `dangerousActions`, `timeout`, `autoApproveOnTimeout`,
    `allowArgumentModification`, and `auditLogging`. See [Basic HITL Guide](/docs/features/hitl).

## Executive Summary

NeuroLink's Human-in-the-Loop (HITL) system provides enterprise-grade controls for AI operations requiring human oversight. Purpose-built for regulated industries and high-stakes applications, it combines real-time approval workflows with comprehensive audit trails to meet compliance requirements while maintaining operational efficiency.

### Strategic Value Proposition

- **Risk Mitigation**: Prevent costly AI mistakes through mandatory human checkpoints
- **Regulatory Compliance**: Meet HIPAA, SOC2, GDPR, and industry-specific requirements
- **Trust & Transparency**: Build stakeholder confidence with auditable AI decisions
- **Continuous Improvement**: Capture human expertise to improve AI accuracy over time

### Key Metrics

| Metric                   | Impact               | Evidence                                        |
| ------------------------ | -------------------- | ----------------------------------------------- |
| **Accuracy Improvement** | 95% increase         | Human validation catches edge cases AI misses   |
| **Compliance Coverage**  | 100% auditability    | Complete decision trail for regulatory review   |
| **Model Learning Rate**  | 60% faster           | Structured feedback accelerates training cycles |
| **Enterprise Adoption**  | 90% confidence boost | Security teams approve HITL-enabled deployments |

### When to Use HITL

**Required for:**

- Medical diagnosis and treatment recommendations
- Financial transactions above risk thresholds
- Legal document generation and review
- Code execution in production environments
- Personal data modification or deletion
- Irreversible operations (send email, post to social media)

**Not recommended for:**

- Read-only operations (information retrieval)
- Low-stakes content generation
- Development/testing environments
- High-volume, low-risk automation

---

## Quick Start (5 Minutes)

### Installation

HITL is built into NeuroLink SDK v7.39.0+. No additional packages required:

```bash
npm install @juspay/neurolink@latest
# or
pnpm add @juspay/neurolink@latest
```

### Basic Configuration

Minimal setup for tool-based approval workflow:

```typescript

const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: ["writeFile", "deleteFile", "executeCode"],
    reviewCallback: async (action, context) => {
      // Your approval logic - integrate with Slack, email, custom UI
      console.log(` Approval needed: ${action.tool}`);
      console.log(` Arguments:`, action.args);

      // Example: Simple prompt-based approval (replace with your system)
      const approved = await promptUser(
        `Allow AI to ${action.tool} with args ${JSON.stringify(action.args)}?`,
      );

      return {
        approved,
        reason: approved ? "User authorized" : "User denied",
        reviewer: "admin@company.com",
      };
    },
  },
});
```

### First Approval Request

Complete end-to-end example with error handling:

```typescript
try {
  const result = await neurolink.generate({
    input: {
      text: "Delete the temporary files in the /tmp directory",
    },
    provider: "anthropic",
    tools: [
      {
        name: "deleteFile",
        description: "Delete a file from filesystem",
        requiresConfirmation: true, // Triggers HITL
        execute: async (args) => {
          const fs = await import("fs/promises");
          await fs.unlink(args.path);
          return { success: true, deletedPath: args.path };
        },
      },
    ],
  });

  console.log(result.content);
} catch (error) {
  if (error.code === "USER_CONFIRMATION_REQUIRED") {
    // Handle approval workflow
    const approvalResult = await handleApproval(error.details);
    if (approvalResult.approved) {
      // Retry with confirmation
      const retryResult = await retryWithConfirmation(error.details);
      console.log(retryResult);
    }
  }
}
```

---

## Core Concepts

### 1. Approval Workflows

HITL supports both synchronous (blocking) and asynchronous (non-blocking) approval patterns:

#### Synchronous Approval (Blocking)

AI operation pauses until human approves or rejects:

```typescript
const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    mode: "synchronous", // Default
    timeout: 300000, // 5 minutes max wait
    reviewCallback: async (action, context) => {
      // Blocks here until approval received
      return await showApprovalDialog(action);
    },
  },
});
```

**Use cases:**

- Real-time operations requiring immediate decision
- Interactive applications with user present
- High-risk actions requiring instant validation

#### Asynchronous Approval (Non-blocking)

AI operation returns pending status, continues when approved:

```typescript
const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    mode: "asynchronous",
    reviewCallback: async (action, context) => {
      // Queue for review, return immediately
      const reviewId = await queueForReview(action);
      return {
        pending: true,
        reviewId,
        estimatedTime: 900000, // 15 minutes
      };
    },
    statusCallback: async (reviewId) => {
      // Check approval status
      return await checkReviewStatus(reviewId);
    },
  },
});
```

**Use cases:**

- Batch processing workflows
- Operations requiring expert review (takes time)
- Multi-level approval chains
- Integration with ticketing systems (Jira, ServiceNow)

### 2. Review Triggers

Configure when human review is required:

#### Confidence Threshold Trigger (Planned)

Automatically request review when AI confidence is low:

```typescript
const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    confidenceThreshold: 0.85, // Review if confidence  {
      if (context.aiConfidence  {
      const containsSensitiveData = context.contentPatterns.some((pattern) =>
        pattern.test(action.content),
      );

      if (containsSensitiveData) {
        return await requestSecurityReview(action);
      }

      return { approved: true };
    },
  },
});
```

#### Time-Based Restrictions

Require approval outside business hours:

```typescript
const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    reviewCallback: async (action, context) => {
      const hour = new Date().getHours();
      const isBusinessHours = hour >= 9 && hour  {
      const level = context.escalationLevel || 1;
      const reviewers =
        context.escalationPolicy.escalationLevels[level - 1].reviewers;

      return await requestApprovalFrom(reviewers, action);
    },
  },
});
```

---

## SDK Integration

### TypeScript Configuration

Complete configuration interface:

```typescript
type HITLConfiguration = {
  // Core settings
  enabled: boolean;
  mode?: "synchronous" | "asynchronous"; // (Planned feature)
  timeout?: number; // milliseconds

  // Approval triggers
  requireApproval?: string[]; // Tool names
  confidenceThreshold?: number; // 0-1 (Planned feature)
  contentPatterns?: RegExp[]; // (Planned feature)

  // Callbacks
  reviewCallback: (
    action: HITLAction,
    context: HITLContext,
  ) => Promise;

  statusCallback?: (reviewId: string) => Promise; // (Planned feature)

  // Escalation (Planned feature)
  escalationPolicy?: {
    onTimeout: "approve" | "reject" | "escalate";
    escalationLevels?: EscalationLevel[];
  };

  // Audit
  auditLog?: {
    enabled: boolean;
    storage: "file" | "database" | "custom";
    customLogger?: (entry: AuditEntry) => Promise;
  };
};

type HITLAction = {
  tool: string;
  args: Record;
  timestamp: Date;
  sessionId: string;
};

type HITLContext = {
  aiConfidence?: number;
  provider: string;
  model: string;
  escalationLevel?: number;
};

type HITLReviewResult = {
  approved: boolean;
  reason?: string;
  reviewer?: string;
  modifications?: Record;
  escalate?: boolean;
};
```

### Approval Callback Patterns

#### Slack Integration

```typescript

const slack = new WebClient(process.env.SLACK_BOT_TOKEN);

const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: ["deleteFile", "sendEmail"],
    reviewCallback: async (action, context) => {
      // Send approval request to Slack
      const message = await slack.chat.postMessage({
        channel: "#ai-approvals",
        text: ` AI Approval Request`,
        blocks: [
          {
            type: "section",
            text: {
              type: "mrkdwn",
              text: `*Action:* \`${action.tool}\`\n*Args:* \`\`\`${JSON.stringify(action.args, null, 2)}\`\`\``,
            },
          },
          {
            type: "actions",
            elements: [
              {
                type: "button",
                text: { type: "plain_text", text: "Approve" },
                style: "primary",
                value: action.sessionId,
                action_id: "approve",
              },
              {
                type: "button",
                text: { type: "plain_text", text: "Reject" },
                style: "danger",
                value: action.sessionId,
                action_id: "reject",
              },
            ],
          },
        ],
      });

      // Wait for response (implement with Slack interactivity)
      return await waitForSlackResponse(message.ts);
    },
  },
});
```

#### Email Integration

```typescript

const transporter = nodemailer.createTransport({
  host: process.env.SMTP_HOST,
  auth: {
    user: process.env.SMTP_USER,
    pass: process.env.SMTP_PASS,
  },
});

const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    mode: "asynchronous",
    reviewCallback: async (action, context) => {
      const reviewId = generateReviewId();

      await transporter.sendMail({
        from: "ai-system@company.com",
        to: "approvers@company.com",
        subject: `AI Approval Request: ${action.tool}`,
        html: `
          AI Action Requires Approval
          Tool: ${action.tool}
          Arguments:
          ${JSON.stringify(action.args, null, 2)}

            Approve |
            Reject

        `,
      });

      return {
        pending: true,
        reviewId,
        estimatedTime: 1800000, // 30 minutes
      };
    },
    statusCallback: async (reviewId) => {
      return await checkApprovalStatus(reviewId);
    },
  },
});
```

### Integration with External Systems

#### ServiceNow Integration

```typescript

const serviceNowClient = axios.create({
  baseURL: process.env.SERVICENOW_INSTANCE,
  auth: {
    username: process.env.SERVICENOW_USER,
    password: process.env.SERVICENOW_PASS,
  },
});

const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    mode: "asynchronous",
    reviewCallback: async (action, context) => {
      // Create ServiceNow ticket
      const ticket = await serviceNowClient.post("/api/now/table/incident", {
        short_description: `AI Approval: ${action.tool}`,
        description: JSON.stringify(action.args, null, 2),
        urgency: 2,
        category: "AI Operations",
        assignment_group: "AI Review Team",
      });

      return {
        pending: true,
        reviewId: ticket.data.result.sys_id,
        trackingUrl: `${process.env.SERVICENOW_INSTANCE}/nav_to.do?uri=incident.do?sys_id=${ticket.data.result.sys_id}`,
      };
    },
    statusCallback: async (reviewId) => {
      const ticket = await serviceNowClient.get(
        `/api/now/table/incident/${reviewId}`,
      );

      return {
        approved: ticket.data.result.state === "6", // Resolved
        pending: ticket.data.result.state !== "6",
        reason: ticket.data.result.close_notes,
      };
    },
  },
});
```

---

## CLI Integration

### HITL in Loop Mode

Interactive CLI provides built-in HITL commands:

```bash
# Start loop with HITL enabled
npx @juspay/neurolink loop --enable-hitl

# Inside loop session
neurolink > /hitl status
 Pending HITL Approvals (2):

1. Tool: deleteFile
   Args: { path: "/tmp/data.csv" }
   Confidence: 0.76
   Requested: 2 minutes ago

2. Tool: sendEmail
   Args: { to: "customer@example.com", subject: "Order Update" }
   Confidence: 0.92
   Requested: 5 seconds ago

neurolink > /hitl approve 1
✅ Approved deleteFile operation
   Execution completed successfully

neurolink > /hitl reject 2 --reason "Email template needs review"
❌ Rejected sendEmail operation
   Reason logged: Email template needs review
```

### CLI HITL Commands

| Command              | Description                 | Example                                      |
| -------------------- | --------------------------- | -------------------------------------------- |
| `/hitl status`       | View pending approvals      | `/hitl status`                               |
| `/hitl approve ` | Approve pending action      | `/hitl approve 1`                            |
| `/hitl reject `  | Reject with optional reason | `/hitl reject 2 --reason "Security concern"` |
| `/hitl history`      | View approval history       | `/hitl history --last 10`                    |
| `/hitl config`       | View HITL configuration     | `/hitl config`                               |

---

## Enterprise Patterns

### Pattern 1: Medical AI Validation

Physician oversight for AI-generated diagnostic recommendations:

```typescript

const medicalAI = new NeuroLink({
  hitl: {
    enabled: true,
    mode: "synchronous",
    confidenceThreshold: 0.95, // High bar for medical decisions
    requireApproval: ["generateDiagnosis", "recommendTreatment"],
    reviewCallback: async (action, context) => {
      // Route to qualified physician based on specialty
      const specialty = determineSpecialty(action.args);
      const physician = await findAvailablePhysician(specialty);

      // Present AI analysis to physician
      const review = await presentToPhysician({
        physician,
        aiAnalysis: {
          tool: action.tool,
          recommendation: action.args,
          confidence: context.aiConfidence,
          supportingData: context.metadata,
        },
        patientContext: context.patientId,
      });

      // Log for HIPAA compliance
      await auditLog.recordMedicalReview({
        physician: physician.id,
        decision: review.approved,
        timestamp: new Date(),
        patientId: context.patientId,
        aiConfidence: context.aiConfidence,
        humanConfidence: review.confidence,
      });

      return {
        approved: review.approved,
        reason: review.clinicalReasoning,
        reviewer: physician.email,
        modifications: review.modifications,
      };
    },
  },
});

// Usage
const diagnosis = await medicalAI.generate({
  input: {
    text: "Analyze patient symptoms and recommend diagnosis",
  },
  context: {
    patientId: "PT-12345",
    symptoms: ["chest pain", "shortness of breath"],
    vitals: { bp: "145/95", hr: 98 },
  },
  tools: [
    {
      name: "generateDiagnosis",
      description: "Generate diagnostic recommendation",
      requiresConfirmation: true,
      execute: async (args) => {
        return {
          diagnosis: args.primaryDiagnosis,
          differentials: args.differentialDiagnoses,
          recommendedTests: args.tests,
        };
      },
    },
  ],
});
```

### Pattern 2: Financial Compliance

Transaction approval above risk thresholds:

```typescript
const financialAI = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: ["executeTransaction", "modifyAccount"],
    reviewCallback: async (action, context) => {
      const amount = action.args.amount;
      const threshold = 10000; // $10,000

      if (amount >= threshold) {
        // Multi-level approval for large transactions
        const approvals = [];

        // Level 1: Manager approval
        const managerApproval = await requestApproval({
          approver: "manager@company.com",
          action,
          level: 1,
        });
        approvals.push(managerApproval);

        if (!managerApproval.approved) {
          return managerApproval;
        }

        // Level 2: Finance director for >$50k
        if (amount >= 50000) {
          const directorApproval = await requestApproval({
            approver: "finance-director@company.com",
            action,
            level: 2,
          });
          approvals.push(directorApproval);

          if (!directorApproval.approved) {
            return directorApproval;
          }
        }

        // Compliance audit trail
        await complianceLog.record({
          transactionId: action.args.transactionId,
          amount,
          approvals,
          timestamp: new Date(),
          regulatoryFramework: "SOC2",
        });

        return {
          approved: true,
          reason: "Multi-level approval completed",
          reviewers: approvals.map((a) => a.reviewer),
        };
      }

      return { approved: true, reason: "Below threshold" };
    },
  },
});

// Usage
const transaction = await financialAI.generate({
  input: {
    text: "Process wire transfer of $75,000 to vendor account",
  },
  tools: [
    {
      name: "executeTransaction",
      description: "Execute financial transaction",
      requiresConfirmation: true,
      execute: async (args) => {
        return await processWireTransfer(args);
      },
    },
  ],
});
```

### Pattern 3: Legal Document Review

Attorney validation of AI-generated contracts:

```typescript
const legalAI = new NeuroLink({
  hitl: {
    enabled: true,
    mode: "asynchronous", // Legal review takes time
    requireApproval: ["generateContract", "modifyClause"],
    reviewCallback: async (action, context) => {
      // Determine required legal expertise
      const practiceArea = determinePracticeArea(action.args);
      const jurisdiction = action.args.jurisdiction;

      // Route to qualified attorney
      const attorney = await findAttorney({
        practiceArea,
        jurisdiction,
        barAdmissions: [jurisdiction],
      });

      // Create review task
      const reviewTask = await legalReviewSystem.createTask({
        attorney: attorney.id,
        documentType: action.tool,
        content: action.args,
        aiConfidence: context.aiConfidence,
        priority: action.args.urgency || "standard",
        deadline: calculateDeadline(action.args.urgency),
      });

      return {
        pending: true,
        reviewId: reviewTask.id,
        estimatedTime: reviewTask.estimatedCompletionTime,
        trackingUrl: reviewTask.url,
      };
    },
    statusCallback: async (reviewId) => {
      const task = await legalReviewSystem.getTask(reviewId);

      if (task.status === "completed") {
        return {
          approved: task.approved,
          reason: task.legalOpinion,
          reviewer: task.attorney.email,
          modifications: task.suggestedChanges,
        };
      }

      return { pending: true };
    },
  },
});

// Usage
const contract = await legalAI.generate({
  input: {
    text: "Generate employment contract for California senior engineer position",
  },
  context: {
    jurisdiction: "California",
    position: "Senior Software Engineer",
    complianceRequirements: ["california-labor-law", "federal-employment-law"],
  },
  tools: [
    {
      name: "generateContract",
      description: "Generate legal contract",
      requiresConfirmation: true,
      execute: async (args) => {
        return {
          contractText: args.content,
          clauses: args.clauses,
          terms: args.terms,
        };
      },
    },
  ],
});
```

### Pattern 4: Code Execution Safety

Sandbox approval before executing AI-generated code:

```typescript
const codeAI = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: ["executeCode", "modifyDatabase", "deployToProduction"],
    reviewCallback: async (action, context) => {
      if (action.tool === "executeCode") {
        // Static analysis of code
        const analysis = await analyzeCode(action.args.code);

        if (analysis.containsDangerousPatterns) {
          return {
            approved: false,
            reason: `Security concern: ${analysis.issues.join(", ")}`,
            escalate: true,
          };
        }

        // Present code to developer for review
        const review = await presentCodeReview({
          code: action.args.code,
          analysis,
          context: action.args.context,
        });

        return {
          approved: review.approved,
          reason: review.comments,
          reviewer: review.developer,
          modifications: review.suggestedChanges,
        };
      }

      return { approved: true };
    },
  },
});

// Usage with code execution
const result = await codeAI.generate({
  input: {
    text: "Write and execute a Python script to process CSV data",
  },
  tools: [
    {
      name: "executeCode",
      description: "Execute code in sandboxed environment",
      requiresConfirmation: true,
      execute: async (args) => {
        // Execute in sandbox after approval
        return await sandbox.execute({
          code: args.code,
          language: args.language,
          timeout: 30000,
        });
      },
    },
  ],
});
```

---

## Configuration Reference

### Full Configuration Object

Complete TypeScript interface with all available options:

```typescript
type HITLConfiguration = {
  // === Core Settings ===
  enabled: boolean;
  mode?: "synchronous" | "asynchronous"; // (Planned feature)
  timeout?: number; // Default: 300000 (5 minutes)

  // === Approval Triggers ===
  requireApproval?: string[]; // Tool names requiring approval
  confidenceThreshold?: number; // 0-1, trigger review if AI confidence below (Planned feature)
  contentPatterns?: RegExp[]; // Patterns that trigger review (Planned feature)

  // === Callbacks ===
  reviewCallback: (
    action: HITLAction,
    context: HITLContext,
  ) => Promise;

  statusCallback?: (reviewId: string) => Promise; // (Planned feature)

  // === Escalation (Planned feature) ===
  escalationPolicy?: {
    onTimeout: "approve" | "reject" | "escalate";
    escalationLevels?: Array;
  };

  // === Audit & Compliance ===
  auditLog?: {
    enabled: boolean;
    storage: "file" | "database" | "custom";
    path?: string; // For file storage
    database?: DatabaseConfig; // For database storage
    customLogger?: (entry: AuditEntry) => Promise;
  };

  // === Security ===
  security?: {
    encryptAuditLogs?: boolean;
    redactSensitiveData?: boolean;
    requireMFA?: boolean;
    ipWhitelist?: string[];
  };
};
```

### Environment Variables

Configure HITL through environment variables:

```bash
# Core HITL Settings
NEUROLINK_HITL_ENABLED=true
NEUROLINK_HITL_MODE=synchronous
NEUROLINK_HITL_TIMEOUT=300000

# Approval Configuration
NEUROLINK_HITL_CONFIDENCE_THRESHOLD=0.85
NEUROLINK_HITL_REQUIRE_APPROVAL=writeFile,deleteFile,executeCode

# Audit Logging
NEUROLINK_HITL_AUDIT_ENABLED=true
NEUROLINK_HITL_AUDIT_STORAGE=database
NEUROLINK_HITL_AUDIT_DB_URL=postgresql://user:pass@localhost:5432/audit

# Integration
NEUROLINK_HITL_SLACK_TOKEN=xoxb-your-token
NEUROLINK_HITL_SLACK_CHANNEL=#ai-approvals
```

---

## Security & Audit

### Audit Trail Format

Every HITL action is logged in structured format:

```json
{
  "eventId": "evt_7a9f2c1b",
  "timestamp": "2025-01-01T14:30:00Z",
  "sessionId": "sess_abc123",
  "action": {
    "tool": "deleteFile",
    "args": {
      "path": "/data/sensitive.csv"
    }
  },
  "context": {
    "provider": "anthropic",
    "model": "claude-3-sonnet",
    "aiConfidence": 0.78,
    "userId": "user@company.com"
  },
  "review": {
    "approved": true,
    "reason": "Authorized by manager",
    "reviewer": "manager@company.com",
    "reviewDuration": 45000,
    "escalationLevel": 1
  },
  "outcome": {
    "success": true,
    "executionTime": 234,
    "result": { "deleted": true }
  }
}
```

### Compliance Documentation

#### HIPAA Compliance

HITL audit logs support HIPAA requirements:

- **Access Controls**: Reviewer identity logged
- **Audit Trail**: Complete decision history
- **Data Integrity**: Tamper-evident logging
- **Accountability**: Individual authorization tracking

```typescript
const hipaaCompliantAI = new NeuroLink({
  hitl: {
    enabled: true,
    auditLog: {
      enabled: true,
      storage: "database",
      database: {
        url: process.env.HIPAA_AUDIT_DB,
        encryption: true,
        retentionYears: 6, // HIPAA requirement
      },
    },
    security: {
      encryptAuditLogs: true,
      requireMFA: true,
      redactSensitiveData: true,
    },
  },
});
```

#### SOC2 Compliance

Meet SOC2 Type II requirements:

- **Authorization**: Documented approval workflow
- **Monitoring**: Real-time audit logging
- **Availability**: Timeout and escalation policies
- **Confidentiality**: Encrypted audit storage

```typescript
const soc2CompliantAI = new NeuroLink({
  hitl: {
    enabled: true,
    escalationPolicy: {
      onTimeout: "escalate",
      escalationLevels: [
        { level: 1, reviewers: ["team-lead"], timeout: 300000 },
        { level: 2, reviewers: ["manager"], timeout: 600000 },
      ],
    },
    auditLog: {
      enabled: true,
      storage: "database",
      database: {
        url: process.env.AUDIT_DB,
        encryption: true,
      },
    },
  },
});
```

#### GDPR Compliance

Support GDPR data protection requirements:

- **Lawful Processing**: Human oversight for data operations
- **Data Minimization**: Review prevents excessive collection
- **Right to Erasure**: Approval required for deletions
- **Accountability**: Complete audit trail

```typescript
const gdprCompliantAI = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: [
      "collectPersonalData",
      "deletePersonalData",
      "transferData",
    ],
    reviewCallback: async (action, context) => {
      // Ensure lawful basis documented
      const lawfulBasis = await determineLawfulBasis(action);

      if (!lawfulBasis) {
        return {
          approved: false,
          reason: "No lawful basis for processing",
        };
      }

      // Log for accountability
      await gdprAuditLog.record({
        action: action.tool,
        lawfulBasis,
        dataSubject: context.dataSubjectId,
        processor: context.userId,
      });

      return {
        approved: true,
        reason: `Lawful basis: ${lawfulBasis}`,
      };
    },
  },
});
```

### Security Best Practices

#### 1. Secure Approval Callbacks

```typescript
// ❌ BAD: Exposing sensitive data in logs
reviewCallback: async (action, context) => {
  console.log(action.args); // May contain PII, credentials
  return { approved: true };
};

// ✅ GOOD: Redact sensitive data
reviewCallback: async (action, context) => {
  const redactedArgs = redactSensitive(action.args);
  console.log(redactedArgs);
  return { approved: true };
};
```

#### 2. Secret Management

```typescript
// ❌ BAD: Hardcoded credentials
const neurolink = new NeuroLink({
  hitl: {
    reviewCallback: async (action) => {
      const response = await fetch("https://api.example.com/approve", {
        headers: { Authorization: "Bearer abc123" }, // Hardcoded!
      });
    },
  },
});

// ✅ GOOD: Environment variables
const neurolink = new NeuroLink({
  hitl: {
    reviewCallback: async (action) => {
      const response = await fetch("https://api.example.com/approve", {
        headers: {
          Authorization: `Bearer ${process.env.APPROVAL_API_TOKEN}`,
        },
      });
    },
  },
});
```

#### 3. Input Validation

```typescript
reviewCallback: async (action, context) => {
  // Validate tool name
  const allowedTools = ["readFile", "writeFile"];
  if (!allowedTools.includes(action.tool)) {
    return {
      approved: false,
      reason: "Invalid tool name",
    };
  }

  // Validate arguments
  if (!isValidPath(action.args.path)) {
    return {
      approved: false,
      reason: "Invalid file path",
    };
  }

  return { approved: true };
};
```

---

## Troubleshooting

### Common Issues

#### Issue: Timeout Exceeded

**Symptom**: Review requests timing out before approval

```
Error: HITL review timeout exceeded (300000ms)
```

**Solution**:

```typescript
// Increase timeout for operations requiring human thought
const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    timeout: 600000, // 10 minutes
    escalationPolicy: {
      onTimeout: "escalate", // Escalate instead of failing
    },
  },
});
```

#### Issue: Approval Callback Not Called

**Symptom**: HITL enabled but callback never executes

**Solution**: Ensure tool has `requiresConfirmation: true`:

```typescript
tools: [
  {
    name: "dangerousTool",
    requiresConfirmation: true, // Must be set
    execute: async (args) => {
      // ...
    },
  },
];
```

#### Issue: Rejected Approvals Not Handled

**Symptom**: Application crashes when approval rejected

**Solution**: Handle rejection in error handling:

```typescript
try {
  const result = await neurolink.generate({ ... });
} catch (error) {
  if (error.code === "HITL_APPROVAL_REJECTED") {
    console.log(`Operation rejected: ${error.reason}`);
    // Handle gracefully - show user message, log, etc.
  }
}
```

### Debug Mode

Enable detailed HITL logging:

```typescript
const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    debug: true, // Enables verbose logging
  },
});

// Or via environment variable
process.env.NEUROLINK_HITL_DEBUG = "true";
```

Debug output example:

```
[HITL] Review required for tool: deleteFile
[HITL] Confidence: 0.72 (threshold: 0.85)
[HITL] Calling reviewCallback with action: {...}
[HITL] Review pending: reviewId=rev_123
[HITL] Checking review status every 5s
[HITL] Review approved by: manager@company.com
[HITL] Executing tool with confirmation
```

---

## See Also

- [Quick HITL Guide](/docs/features/hitl) - Simple HITL setup for common cases
- [Guardrails Middleware](/docs/features/guardrails) - Complementary content filtering
- [Middleware Architecture](/docs/advanced/middleware-architecture) - How HITL integrates with middleware
- [Custom Tools](/docs/sdk/custom-tools) - Building tools with HITL support
- [CLI Loop Sessions](/docs/features/cli-loop-sessions) - Using HITL in interactive CLI

---

## File Processors Guide

<!-- Source: features/file-processors.md -->

# File Processors Guide

NeuroLink includes a comprehensive file processing system that supports 20+ file types with intelligent content extraction, security sanitization, and provider-agnostic formatting. This system enables seamless multimodal AI interactions across all 13 supported providers.

## Overview

The file processor system is organized into a modular architecture:

```
src/lib/processors/
├── base/           # BaseFileProcessor abstract class and types
├── registry/       # ProcessorRegistry singleton for processor selection
├── config/         # MIME types, extensions, language maps, size limits
├── errors/         # FileErrorCode enum and error helpers
├── document/       # Excel, Word, RTF, OpenDocument processors
├── media/          # Video and Audio processors (metadata extraction)
├── archive/        # ZIP, TAR, GZ archive processors (file listing + content extraction)
├── markup/         # SVG, HTML, Markdown, Text processors
├── code/           # SourceCode, Config processors
├── data/           # JSON, YAML, XML processors
├── integration/    # FileProcessorIntegration for registry usage
└── cli/            # CLI helpers for file processing
```

## Supported File Types

### Documents

| Type             | Extensions             | Processor               | Features                                             |
| ---------------- | ---------------------- | ----------------------- | ---------------------------------------------------- |
| **Excel**        | `.xlsx`, `.xls`        | `ExcelProcessor`        | Multi-sheet extraction, cell formatting, data tables |
| **Word**         | `.docx`, `.doc`        | `WordProcessor`         | Text extraction, paragraph preservation              |
| **RTF**          | `.rtf`                 | `RtfProcessor`          | Rich text to plain text conversion                   |
| **OpenDocument** | `.odt`, `.ods`, `.odp` | `OpenDocumentProcessor` | LibreOffice/OpenOffice format support                |

### Data Files

| Type     | Extensions      | Processor       | Features                                         |
| -------- | --------------- | --------------- | ------------------------------------------------ |
| **JSON** | `.json`         | `JsonProcessor` | Validation, pretty-printing, syntax highlighting |
| **YAML** | `.yaml`, `.yml` | `YamlProcessor` | Validation, formatting, multi-document support   |
| **XML**  | `.xml`          | `XmlProcessor`  | Parsing, validation, entity handling             |

### Markup Files

| Type         | Extensions         | Processor           | Features                                      |
| ------------ | ------------------ | ------------------- | --------------------------------------------- |
| **HTML**     | `.html`, `.htm`    | `HtmlProcessor`     | OWASP-compliant sanitization, text extraction |
| **SVG**      | `.svg`             | `SvgProcessor`      | XSS prevention, text injection (not binary)   |
| **Markdown** | `.md`, `.markdown` | `MarkdownProcessor` | Formatting preservation, metadata extraction  |
| **Text**     | `.txt`             | `TextProcessor`     | Plain text handling, encoding detection       |

### Source Code

| Type                | Extensions                 | Processor             | Features                            |
| ------------------- | -------------------------- | --------------------- | ----------------------------------- |
| **TypeScript**      | `.ts`, `.tsx`              | `SourceCodeProcessor` | Language detection, syntax metadata |
| **JavaScript**      | `.js`, `.jsx`, `.mjs`      | `SourceCodeProcessor` | Module detection                    |
| **Python**          | `.py`                      | `SourceCodeProcessor` | Docstring preservation              |
| **Java**            | `.java`                    | `SourceCodeProcessor` | Package detection                   |
| **Go**              | `.go`                      | `SourceCodeProcessor` | Module awareness                    |
| **Rust**            | `.rs`                      | `SourceCodeProcessor` | Crate detection                     |
| **C/C++**           | `.c`, `.cpp`, `.h`, `.hpp` | `SourceCodeProcessor` | Header handling                     |
| **C#**              | `.cs`                      | `SourceCodeProcessor` | Namespace detection                 |
| **Ruby**            | `.rb`                      | `SourceCodeProcessor` | Gem awareness                       |
| **PHP**             | `.php`                     | `SourceCodeProcessor` | Tag handling                        |
| **Swift**           | `.swift`                   | `SourceCodeProcessor` | Framework detection                 |
| **Kotlin**          | `.kt`, `.kts`              | `SourceCodeProcessor` | Android/JVM awareness               |
| **Scala**           | `.scala`                   | `SourceCodeProcessor` | SBT integration                     |
| **Shell**           | `.sh`, `.bash`, `.zsh`     | `SourceCodeProcessor` | Shebang detection                   |
| **SQL**             | `.sql`                     | `SourceCodeProcessor` | Dialect hints                       |
| **And 35+ more...** | Various                    | `SourceCodeProcessor` | Automatic language detection        |

### Configuration Files

| Type            | Extensions       | Processor         | Features                           |
| --------------- | ---------------- | ----------------- | ---------------------------------- |
| **Environment** | `.env`, `.env.*` | `ConfigProcessor` | Secret masking option              |
| **INI**         | `.ini`, `.cfg`   | `ConfigProcessor` | Section parsing                    |
| **TOML**        | `.toml`          | `ConfigProcessor` | Cargo.toml, pyproject.toml support |
| **Properties**  | `.properties`    | `ConfigProcessor` | Java properties format             |

### Media Files

| Type      | Extensions                                              | Processor        | Features                                                                         |
| --------- | ------------------------------------------------------- | ---------------- | -------------------------------------------------------------------------------- |
| **Video** | `.mp4`, `.mkv`, `.webm`, `.avi`, `.mov`, `.m4v`         | `VideoProcessor` | Duration, resolution, codec, frame rate, bitrate extraction via `music-metadata` |
| **Audio** | `.mp3`, `.wav`, `.ogg`, `.flac`, `.aac`, `.m4a`, `.wma` | `AudioProcessor` | Codec, bitrate, sample rate, channels, duration extraction via `music-metadata`  |

Video and audio files are **not** sent as binary to the AI provider. Instead, the processors extract structured metadata and return it as formatted text, keeping token usage minimal (~50-200 tokens per file).

**Example video output:**

```
Video File: presentation.mp4
Duration: 13s | Resolution: 640x360 | Video Codec: h264
Frame Rate: 29.97 fps | Bitrate: 345 kbps
Audio: aac, 48000 Hz, 2 channels
```

**Example audio output:**

```
Audio File: recording.mp3
Codec: MPEG 1 Layer 3 | Bitrate: 128 kbps
Sample Rate: 44100 Hz | Channels: 2 (Stereo) | Duration: 1:46
```

### Archives

| Type    | Extensions               | Processor          | Features                                                               |
| ------- | ------------------------ | ------------------ | ---------------------------------------------------------------------- |
| **ZIP** | `.zip`                   | `ArchiveProcessor` | File listing with sizes, nested content extraction, ZIP bomb detection |
| **TAR** | `.tar`                   | `ArchiveProcessor` | File listing with sizes                                                |
| **GZ**  | `.gz`, `.tar.gz`, `.tgz` | `ArchiveProcessor` | Gzip decompression, tar content listing                                |

Archive files return a structured listing of their contents with file sizes and optionally extract text from contained files (routing through existing processors).

**Example archive output:**

```
Archive: project.tar.gz
Total entries: 6

Files:
- code/sample.json (60 B)
- code/sample.py (195 B)
- document/sample.txt (607 B)
```

**Security:** Archive processing includes ZIP bomb detection (compression ratio limits), path traversal prevention, symlink blocking, entry count limits, and aggregate decompression size limits.

## Usage

### SDK Usage

```typescript

const neurolink = new NeuroLink();

// Process multiple file types in a single request
const result = await neurolink.generate({
  input: {
    text: "Analyze these files and summarize the key information",
    files: [
      "./data/report.xlsx", // Excel spreadsheet
      "./config/settings.yaml", // YAML configuration
      "./src/main.ts", // TypeScript source
      "./docs/architecture.svg", // SVG diagram (injected as text)
      "./api/schema.json", // JSON schema
    ],
  },
  provider: "vertex",
});

console.log(result.content);
```

### CLI Usage

```bash
# Single file
neurolink generate "Analyze this spreadsheet" --file ./data.xlsx

# Multiple files
neurolink generate "Compare these configs" \
  --file ./config.yaml \
  --file ./settings.json \
  --file ./app.toml

# Mixed with images and PDFs
neurolink generate "Explain this codebase" \
  --file ./src/main.ts \
  --file ./docs/diagram.svg \
  --pdf ./docs/spec.pdf \
  --image ./screenshot.png
```

### Stream Mode

```typescript
// Streaming with file processing
const stream = await neurolink.stream({
  input: {
    text: "Walk me through this code step by step",
    files: ["./src/algorithm.py"],
  },
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
```

## Architecture

### ProcessorRegistry

The `ProcessorRegistry` is a singleton that manages all file processors with priority-based selection:

```typescript

// Get the singleton instance
const registry = ProcessorRegistry.getInstance();

// Register a custom processor (lower priority = higher precedence)
registry.register(new MyCustomProcessor(), 50);

// Find processor for a file
const processor = registry.findProcessor({
  filename: "data.xlsx",
  mimeType: "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  size: 1024,
});

// Process a file
const result = await processor.process(fileInfo, fileContent);
```

### BaseFileProcessor

All processors extend the abstract `BaseFileProcessor` class:

```typescript

export class MyProcessor extends BaseFileProcessor {
  readonly name = "my-processor";
  readonly supportedMimeTypes = ["application/x-my-format"];
  readonly supportedExtensions = [".myf"];

  canProcess(file: FileInfo): boolean {
    return this.supportedExtensions.includes(file.extension);
  }

  async process(file: FileInfo, content: Buffer): Promise {
    const text = this.extractText(content);
    return {
      type: "text",
      content: text,
      metadata: {
        processor: this.name,
        originalFilename: file.filename,
      },
    };
  }

  getInfo(): ProcessorInfo {
    return {
      name: this.name,
      description: "Processes MY format files",
      supportedMimeTypes: this.supportedMimeTypes,
      supportedExtensions: this.supportedExtensions,
    };
  }
}
```

### FileDetector

The `FileDetector` utility automatically identifies file types:

```typescript

const detector = new FileDetector();

// Detect by extension
const type1 = detector.detect("report.xlsx");
// Returns: { type: "xlsx", mimeType: "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" }

// Detect by content (magic bytes)
const type2 = detector.detectFromContent(buffer);

// SVG special handling - returns "svg" type, not "image"
const type3 = detector.detect("diagram.svg");
// Returns: { type: "svg", mimeType: "image/svg+xml" }
```

## Security Features

### OWASP-Compliant Sanitization

The markup processors include security sanitization to prevent XSS and injection attacks:

#### HTML Sanitization

```typescript
// HtmlProcessor automatically sanitizes HTML content
// - Removes  tags
// - Strips event handlers (onclick, onerror, etc.)
// - Removes javascript: URLs
// - Sanitizes style attributes
// - Blocks dangerous protocols

const result = await neurolink.generate({
  input: {
    text: "Summarize this HTML content",
    files: ["./untrusted-content.html"], // Automatically sanitized
  },
});
```

#### SVG Sanitization

```typescript
// SvgProcessor sanitizes SVG before injection
// - Removes embedded scripts
// - Strips foreignObject elements
// - Sanitizes use/href attributes
// - Blocks external entity references

// SVG is injected as TEXT, not as binary image
// This prevents image-based attacks while preserving vector content
```

### File Size Limits

Default size limits prevent denial-of-service attacks:

| Category     | Default Limit | Configurable |
| ------------ | ------------- | ------------ |
| Documents    | 50 MB         | Yes          |
| Data files   | 10 MB         | Yes          |
| Code files   | 5 MB          | Yes          |
| Config files | 1 MB          | Yes          |
| Images       | 20 MB         | Yes          |

```typescript

// Configure size limits
ProcessorConfig.setLimits({
  maxDocumentSize: 100 * 1024 * 1024, // 100 MB
  maxCodeSize: 10 * 1024 * 1024, // 10 MB
});
```

## Error Handling

### FileErrorCode Enum

```typescript

try {
  const result = await neurolink.generate({
    input: { files: ["./corrupted.xlsx"] },
  });
} catch (error) {
  if (error && typeof error === "object" && "code" in error) {
    switch (error.code) {
      case FileErrorCode.UNSUPPORTED_TYPE:
        console.log("File type not supported");
        break;
      case FileErrorCode.FILE_TOO_LARGE:
        console.log("File too large");
        break;
      case FileErrorCode.CORRUPTED_FILE:
        console.log("File is corrupted");
        break;
      case FileErrorCode.DOWNLOAD_AUTH_FAILED:
        console.log("Cannot read file");
        break;
    }
  }
}
```

## Provider Compatibility

All file processors work across all 13 AI providers. The processed content is formatted as text that any provider can understand:

| Provider          | Documents | Data | Markup | Code | Config |
| ----------------- | --------- | ---- | ------ | ---- | ------ |
| OpenAI            | ✅        | ✅   | ✅     | ✅   | ✅     |
| Anthropic         | ✅        | ✅   | ✅     | ✅   | ✅     |
| Google AI Studio  | ✅        | ✅   | ✅     | ✅   | ✅     |
| Google Vertex     | ✅        | ✅   | ✅     | ✅   | ✅     |
| AWS Bedrock       | ✅        | ✅   | ✅     | ✅   | ✅     |
| Azure OpenAI      | ✅        | ✅   | ✅     | ✅   | ✅     |
| Mistral           | ✅        | ✅   | ✅     | ✅   | ✅     |
| LiteLLM           | ✅        | ✅   | ✅     | ✅   | ✅     |
| Ollama            | ✅        | ✅   | ✅     | ✅   | ✅     |
| Hugging Face      | ✅        | ✅   | ✅     | ✅   | ✅     |
| SageMaker         | ✅        | ✅   | ✅     | ✅   | ✅     |
| OpenAI Compatible | ✅        | ✅   | ✅     | ✅   | ✅     |
| OpenRouter        | ✅        | ✅   | ✅     | ✅   | ✅     |

**Note:** For binary files like images and PDFs, provider-specific adapters handle the formatting. See [PDF Support](/docs/features/pdf-support) and [Multimodal Chat](/docs/features/multimodal-chat).

## Best Practices

### 1. Use Appropriate File Types

```typescript
// Good: Use structured data formats for data
files: ["./data.json", "./config.yaml"];

// Avoid: Using unstructured text for structured data
files: ["./data.txt"]; // Harder for AI to parse
```

### 2. Combine Related Files

```typescript
// Good: Group related files together
const result = await neurolink.generate({
  input: {
    text: "Review this module for best practices",
    files: [
      "./src/module.ts", // Implementation
      "./src/module.test.ts", // Tests
      "./src/module.types.ts", // Types
    ],
  },
});
```

### 3. Be Mindful of Token Limits

```typescript
// For large files, consider chunking or summarization

// Enable automatic truncation for very large files
ProcessorConfig.setTruncation({
  enabled: true,
  maxTokens: 50000,
  strategy: "head-tail", // Keep beginning and end
});
```

### 4. Use Specific Prompts

```typescript
// Good: Be specific about what to analyze
const result = await neurolink.generate({
  input: {
    text: "Find security vulnerabilities in this code, focusing on SQL injection and XSS",
    files: ["./src/api.ts"],
  },
});

// Less effective: Vague prompt
const result = await neurolink.generate({
  input: {
    text: "Look at this",
    files: ["./src/api.ts"],
  },
});
```

## Extending the System

### Creating a Custom Processor

```typescript

  BaseFileProcessor,
  FileInfo,
  ProcessedFile,
  ProcessorRegistry,
} from "@juspay/neurolink";

class ProtobufProcessor extends BaseFileProcessor {
  readonly name = "protobuf-processor";
  readonly supportedMimeTypes = ["application/x-protobuf"];
  readonly supportedExtensions = [".proto"];

  canProcess(file: FileInfo): boolean {
    return file.extension === ".proto";
  }

  async process(file: FileInfo, content: Buffer): Promise {
    const protoText = content.toString("utf-8");

    // Add syntax highlighting hints
    const formatted = `\`\`\`protobuf\n${protoText}\n\`\`\``;

    return {
      type: "text",
      content: formatted,
      metadata: {
        processor: this.name,
        language: "protobuf",
        filename: file.filename,
      },
    };
  }

  getInfo() {
    return {
      name: this.name,
      description: "Processes Protocol Buffer definition files",
      supportedMimeTypes: this.supportedMimeTypes,
      supportedExtensions: this.supportedExtensions,
    };
  }
}

// Register with priority 50 (lower = higher precedence)
ProcessorRegistry.getInstance().register(new ProtobufProcessor(), 50);
```

## Related Documentation

- [Multimodal Chat](/docs/features/multimodal-chat) - Image and media handling
- [PDF Support](/docs/features/pdf-support) - PDF-specific features
- [CSV Support](/docs/features/csv-support) - CSV processing details
- [CLI Commands](/docs/cli/commands) - CLI file options
- [SDK API Reference](/docs/sdk/api-reference) - Full API documentation

---

## Guardrails AI Integration with Middleware

<!-- Source: features/guardrails-ai.md -->

# Guardrails AI Integration with Middleware

This document outlines the modern, simplified approach to integrating Guardrails AI with the NeuroLink platform using the new `MiddlewareFactory`. This enhances the safety, reliability, and security of your AI applications in a modular and maintainable way.

## Overview

Guardrails AI is an open-source library that provides a framework for creating and managing guardrails for large language models (LLMs). By integrating Guardrails AI as middleware, you can enforce specific rules and policies on the inputs and outputs of your models, ensuring they adhere to your safety guidelines and quality standards.

## Key Benefits

- **Risk Mitigation**: Protect against common AI risks such as hallucinations, toxic language, and data leakage.
- **Quality Assurance**: Ensure that model outputs are accurate, relevant, and meet predefined quality criteria.
- **Compliance**: Enforce industry-specific regulations and compliance requirements.
- **Customization**: Create custom guardrails tailored to specific use cases and business needs.

## Middleware-based Guardrail Implementation

With the new `MiddlewareFactory`, integrating guardrails is easier than ever. The factory automatically handles the registration and application of the `guardrails` middleware when you use a relevant preset.

```mermaid
graph TD
    A[Application] --> B[new MiddlewareFactory({ preset: 'security' })];
    subgraph B
        C{Guardrail Middleware Applied} --> D[Core LLM];
    end
    B --> E[Returns Guarded Model];
```

### Using the `security` Preset

The easiest way to enable guardrails is to use the `security` preset when creating your `MiddlewareFactory`. This preset is specifically designed to enable the `guardrails` middleware with a default configuration.

```typescript

// 1. Create a factory with the 'security' preset
const factory = new MiddlewareFactory({ preset: "security" });

// 2. Create a context
const context = factory.createContext("openai", "gpt-4");

// 3. Apply the middleware to your base model
// The guardrails middleware is applied automatically.
const guardedModel = factory.applyMiddleware(baseModel, context);

// 4. Use the guarded model
const result = await guardedModel.generate({
  prompt: "This is a test prompt.",
});
```

### Using the `all` Preset

If you want to use guardrails in combination with other built-in middleware like analytics, you can use the `all` preset.

```typescript

// This will enable both analytics and guardrails
const factory = new MiddlewareFactory({ preset: "all" });
```

### Customizing Guardrails

While presets provide a great starting point, you can also customize the behavior of the guardrails middleware by providing a custom configuration.

```typescript

const factory = new MiddlewareFactory({
  // You can start with a preset
  preset: "security",
  // And then provide a custom configuration, which will be merged with the preset
  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: {
        badWords: {
          enabled: true,
          list: ["custom-bad-word-1", "custom-bad-word-2"],
        },
      },
    },
  },
});
```

This new, streamlined approach provides a clean and scalable way to add safety and other enhancements to your AI models within the NeuroLink ecosystem.

---

## See Also

For configuration examples, best practices, and troubleshooting, see the [Guardrails Middleware Feature Guide](/docs/features/guardrails).

---

## Guardrails Implementation Guide

<!-- Source: features/guardrails-implementation.md -->

# Guardrails Implementation Guide

This document provides comprehensive documentation for the NeuroLink guardrails implementation, including pre-call filtering, content sanitization, and AI-powered evaluation.

## Overview

The guardrails implementation provides advanced content filtering and safety mechanisms for AI interactions. It includes:

- **Pre-call Evaluation**: AI-powered safety assessment before processing
- **Content Filtering**: Bad words and regex pattern filtering
- **Parameter Sanitization**: Input cleaning and modification
- **Evaluation Actions**: Configurable responses (block, sanitize, warn, log)
- **Visual Proof**: Screenshots demonstrating filtering in action

## Architecture

```mermaid
graph TD
    A[User Input] --> B[Guardrails Middleware]
    B --> C{Pre-call Evaluation}
    C -->|Safe| D[Content Filtering]
    C -->|Unsafe| E[Block/Sanitize]
    D --> F{Bad Words Check}
    F -->|Clean| G[AI Provider]
    F -->|Filtered| H[Sanitize Content]
    H --> G
    E --> I[Return Blocked Response]
    G --> J[Response]
```

## Core Components

### 1. Guardrails Middleware (`src/lib/middleware/builtin/guardrails.ts`)

The main middleware component that orchestrates all guardrail functionality:

```typescript

// Apply guardrails to any AI provider
const guardedModel = new GuardrailsMiddleware(baseModel, config);
```

### 2. Guardrails Utilities (`src/lib/middleware/utils/guardrailsUtils.ts`)

Core utility functions for evaluation and filtering:

- `performPrecallEvaluation()` - AI-powered safety assessment
- `applyEvaluationActions()` - Execute configured actions based on evaluation
- `applySanitization()` - Clean and modify request parameters
- `applyContentFiltering()` - Filter content using patterns and word lists

### 3. Type Definitions (`src/lib/types/guardrails.ts`)

Complete TypeScript interfaces for configuration and results:

```typescript
type GuardrailsMiddlewareConfig = {
  badWords?: BadWordsConfig;
  modelFilter?: ModelFilterConfig;
  precallEvaluation?: PrecallEvaluationConfig;
};
```

## Configuration

### Basic Configuration

```typescript
const guardrailsConfig = {
  precallEvaluation: {
    enabled: true,
    provider: "google-ai",
    evaluationModel: "gemini-1.5-flash",
  },
  badWords: {
    enabled: true,
    list: ["inappropriate", "harmful"],
  },
};
```

### Advanced Configuration

```typescript
const advancedConfig = {
  precallEvaluation: {
    enabled: true,
    provider: "google-ai",
    evaluationModel: "gemini-1.5-flash",
    evaluationPrompt: `Custom evaluation prompt...`,
    actions: {
      onUnsafe: "block",
      onInappropriate: "sanitize",
      onSuspicious: "warn",
    },
    thresholds: {
      safetyScore: 7,
      appropriatenessScore: 6,
      confidenceLevel: 8,
    },
  },
  badWords: {
    enabled: true,
    regexPatterns: [
      "\\b(spam|scam)\\b",
      "\\d{3}-\\d{2}-\\d{4}", // SSN pattern
    ],
  },
};
```

## Features

### Pre-call Evaluation

AI-powered evaluation of user input before processing:

```json
{
  "overall": "safe|unsafe|suspicious|inappropriate",
  "safetyScore": 8,
  "appropriatenessScore": 9,
  "confidenceLevel": 7,
  "issues": [
    {
      "category": "explicit_content",
      "severity": "low",
      "description": "Mild inappropriate language"
    }
  ],
  "suggestedAction": "allow",
  "reasoning": "Content is generally appropriate with minor concerns"
}
```

### Content Filtering

Two-tier filtering system:

1. **Regex Patterns** (Priority 1)

   ```typescript
   regexPatterns: [
     "\\b(password|secret)\\b",
     "\\d{16}", // Credit card pattern
   ];
   ```

2. **Word Lists** (Priority 2)
   ```typescript
   list: ["spam", "scam", "phishing"];
   ```

### Evaluation Actions

Configurable responses based on evaluation results:

- **block**: Prevent request processing entirely
- **sanitize**: Clean content and continue processing
- **warn**: Log warning but allow processing
- **log**: Record for monitoring but allow processing

## Demo Component

### Using the Demo (`neurolink-demo/middleware/guardrails-precall-demo.ts`)

```typescript

const demo = new GuardrailsPrecallDemo();

// Test various input scenarios
await demo.testSafeInput();
await demo.testUnsafeInput();
await demo.testBadWords();
await demo.testRegexFiltering();
```

### Demo Features

- Interactive testing of guardrail functionality
- Visual feedback on filtering actions
- Performance metrics and timing
- Before/after content comparison

## Visual Proof

Screenshots demonstrating guardrails in action:

### 1. Pre-call Filtering (`guardrails-pre-call-filtering.png`)

- Shows evaluation process and decision making
- Displays safety scores and reasoning

### 2. Content Sanitization (`guardrails-pre-call-filtering-2.png`)

- Before and after content comparison
- Filtering statistics and applied rules

### 3. Block Actions (`guardrails-pre-call-filtering-3.png`)

- Demonstrates request blocking for unsafe content
- Shows error messages and user feedback

### 4. Performance Metrics (`guardrails-pre-call-filtering-4.png`)

- Evaluation timing and processing speeds
- Impact on overall request latency

## Integration Examples

### With MiddlewareFactory

```typescript

const factory = new MiddlewareFactory({
  preset: "security",
  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: guardrailsConfig,
    },
  },
});

const guardedModel = factory.applyMiddleware(baseModel, context);
```

### Direct Integration

```typescript

const guardrails = new GuardrailsMiddleware(baseModel, {
  precallEvaluation: {
    enabled: true,
    provider: "google-ai",
  },
});

const result = await guardrails.generate({
  prompt: "User input to be evaluated",
});
```

### Streaming Support

```typescript
const stream = await guardrails.generateStream({
  prompt: "Streaming content with guardrails",
});

for await (const chunk of stream) {
  console.log(chunk.content);
}
```

## Performance Considerations

### Evaluation Timing

- Pre-call evaluation: ~2-5 seconds (depending on model)
- Content filtering: \;

// Apply content filtering
function applyContentFiltering(
  text: string,
  badWordsConfig?: BadWordsConfig,
  context: string = "unknown",
): ContentFilteringResult;

// Sanitize request parameters
function applySanitization(
  params: LanguageModelV1CallOptions,
  sanitizedInput: string,
): LanguageModelV1CallOptions;
```

## Troubleshooting

### Common Issues

1. **Evaluation Taking Too Long**
   - Check evaluation model availability
   - Implement timeout handling
   - Consider using faster models

2. **Too Many False Positives**
   - Adjust evaluation thresholds
   - Review and refine regex patterns
   - Check word list relevance

3. **Regex Patterns Not Working**
   - Validate regex syntax
   - Test patterns with sample content
   - Check for proper escaping

4. **Performance Impact**
   - Monitor evaluation timing
   - Optimize configuration settings
   - Consider caching strategies

### Debug Mode

Enable debug logging for detailed information:

```typescript
const config = {
  debug: true, // Enable detailed logging
  precallEvaluation: {
    enabled: true,
    logEvaluations: true,
  },
};
```

## Migration Guide

### From Previous Implementations

If upgrading from older guardrail implementations:

1. Update configuration format to new interfaces
2. Replace deprecated methods with new utility functions
3. Test evaluation thresholds and adjust as needed
4. Update error handling to use new patterns

### Breaking Changes

- Configuration structure has been updated for better organization
- Some utility function signatures have changed
- Error handling patterns have been improved

## Conclusion

The NeuroLink guardrails implementation provides comprehensive content safety and filtering capabilities with:

- ✅ AI-powered pre-call evaluation
- ✅ Flexible content filtering options
- ✅ Configurable response actions
- ✅ Visual proof and demonstrations
- ✅ High performance and scalability
- ✅ Comprehensive error handling
- ✅ TypeScript support throughout

For additional support or questions, refer to the main NeuroLink documentation or create an issue in the repository.

---

## Guardrails Middleware

<!-- Source: features/guardrails.md -->

# Guardrails Middleware

> **Since**: v7.42.0 | **Status**: Stable | **Availability**: SDK (CLI + SDK)

## Overview

**What it does**: Guardrails middleware provides real-time content filtering and policy enforcement for AI model outputs, blocking profanity, PII, unsafe content, and custom-defined terms.

**Why use it**: Protect your application from generating harmful, inappropriate, or non-compliant content. Ensures AI responses meet safety standards and regulatory requirements.

**Common use cases**:

- Content moderation for user-facing applications
- PII (Personally Identifiable Information) redaction
- Profanity filtering for family-friendly apps
- Compliance with industry regulations (COPPA, GDPR, etc.)
- Brand safety and reputation management

## Quick Start

:::tip[Zero Configuration]
Guardrails work out of the box with the `security` preset. No custom configuration required for basic content filtering.
:::

### SDK Example with Security Preset

```typescript

const neurolink = new NeuroLink({
  middleware: {
    preset: "security", // (1)!
  },
});

const result = await neurolink.generate({
  // (2)!
  prompt: "Tell me about security best practices",
});

// Output is automatically filtered for bad words and unsafe content
console.log(result.content); // (3)!
```

1. Enables guardrails middleware with default configuration
2. All generate/stream calls automatically apply filtering
3. Content is already filtered - safe to display to users

### Custom Guardrails Configuration

```typescript

const neurolink = new NeuroLink({
  middleware: {
    preset: "security",
    middlewareConfig: {
      guardrails: {
        enabled: true, // (1)!
        config: {
          badWords: {
            enabled: true, // (2)!
            list: ["spam", "scam", "inappropriate-term"], // (3)!
          },
          modelFilter: {
            enabled: true, // (4)!
            filterModel: "gpt-4o-mini", // (5)!
          },
        },
      },
    },
  },
});
```

1. Master switch for guardrails middleware
2. Enable keyword-based filtering (fast, regex-based)
3. Custom terms to filter/redact from outputs
4. Enable AI-powered content safety check (slower, more accurate)
5. Use fast, cheap model for safety evaluation

### CLI Usage

```bash
# Enable guardrails via environment variable
export NEUROLINK_MIDDLEWARE_PRESET="security"

npx @juspay/neurolink generate "Write a product description" --enable-analytics

# Guardrails are automatically applied to all generations
```

## Configuration

| Option                    | Type       | Default | Required | Description                          |
| ------------------------- | ---------- | ------- | -------- | ------------------------------------ |
| `enabled`                 | `boolean`  | `true`  | No       | Enable/disable guardrails middleware |
| `badWords.enabled`        | `boolean`  | `false` | No       | Enable keyword-based filtering       |
| `badWords.list`           | `string[]` | `[]`    | No       | List of terms to filter/redact       |
| `modelFilter.enabled`     | `boolean`  | `false` | No       | Enable AI-based content safety check |
| `modelFilter.filterModel` | `string`   | -       | No       | Model to use for safety evaluation   |

### Environment Variables

```bash
# Enable guardrails preset
export NEUROLINK_MIDDLEWARE_PRESET="security"

# Or enable all middleware (includes guardrails + analytics)
export NEUROLINK_MIDDLEWARE_PRESET="all"
```

### Config File

```typescript
// .neurolink.config.ts
export default {
  middleware: {
    preset: "security",
    middlewareConfig: {
      guardrails: {
        enabled: true,
        config: {
          badWords: {
            enabled: true,
            list: [
              // Custom filtered terms
              "confidential",
              "internal-use-only",
              // PII patterns
              "ssn",
              "credit-card",
            ],
          },
          modelFilter: {
            enabled: true,
            filterModel: "gpt-4o-mini", // Fast, cheap safety model
          },
        },
      },
    },
  },
};
```

## How It Works

### Filtering Pipeline

1. **User prompt** → Sent to AI model
2. **AI generates response** → Initial content created
3. **Guardrails middleware intercepts**:
   - **Bad word filtering**: Regex-based term replacement
   - **Model-based filtering**: AI evaluates content safety
4. **Filtered response** → Delivered to user

### Bad Word Filtering

Simple regex-based replacement:

```typescript
// Input: "This contains spam and other spam words"
// Output: "This contains **** and other **** words"
```

- Case-insensitive matching
- Replaces with asterisks (`*`) of equal length
- Works in both `generate` and `stream` modes

### Model-Based Filtering

:::danger[PII Detection Accuracy]
While guardrails filter common PII patterns, always review critical outputs manually. False negatives can occur with obfuscated data or uncommon PII formats. For high-stakes compliance, combine with dedicated PII detection services.
:::

AI-powered safety check:

```typescript
// Guardrails sends content to filter model:
// "Is the following text safe? Respond with only 'safe' or 'unsafe'."

// If unsafe:
// Output: ""
```

- Uses separate, lightweight model (e.g., `gpt-4o-mini`)
- Binary safe/unsafe classification
- Full redaction on unsafe detection

## Advanced Usage

### Combining with Other Middleware

```typescript

const neurolink = new NeuroLink({
  middleware: {
    preset: "all", // Enables guardrails + analytics + others
    middlewareConfig: {
      guardrails: {
        enabled: true,
        config: {
          badWords: {
            enabled: true,
            list: ["profanity1", "profanity2"],
          },
        },
      },
      analytics: {
        enabled: true,
      },
    },
  },
});
```

### Streaming with Guardrails

```typescript
const stream = await neurolink.streamText({
  prompt: "Write a long story",
});

// Chunks are filtered in real-time as they stream
for await (const chunk of stream) {
  console.log(chunk.content); // Already filtered
}
```

### Dynamic Guardrails

```typescript
// Add/remove filtered terms dynamically
const customWords = await loadBlocklistFromDatabase();

const neurolink = new NeuroLink({
  middleware: {
    middlewareConfig: {
      guardrails: {
        config: {
          badWords: {
            enabled: true,
            list: [...customWords, "static-term"],
          },
        },
      },
    },
  },
});
```

## API Reference

### Middleware Configuration

- `preset: "security"` → Enables guardrails with defaults
- `preset: "all"` → Enables guardrails + all other middleware
- `middlewareConfig.guardrails` → Custom guardrails configuration

See [guardrails-ai-integration.md](/docs/features/guardrails-ai) for complete integration guide.

## Troubleshooting

### Problem: Guardrails not filtering content

**Cause**: Middleware not enabled or preset not configured
**Solution**:

```typescript
// Ensure preset is set or guardrails explicitly enabled
const neurolink = new NeuroLink({
  middleware: {
    preset: "security", // ← Must set this
  },
});
```

### Problem: Too many false positives (legitimate content filtered)

**Cause**: Overly aggressive bad word list
**Solution**:

```typescript
// Use more specific terms, avoid common words
config: {
  badWords: {
    list: [
      "very-specific-bad-term",  // Good
      // "free",  // Bad - too common
    ],
  },
}
```

### Problem: Model-based filter is slow

**Cause**: Using large/expensive model for filtering
**Solution**:

```typescript
// Switch to faster, cheaper model
config: {
  modelFilter: {
    enabled: true,
    filterModel: "gpt-4o-mini",  // ← Fast and cheap
    // filterModel: "gpt-4",  // ❌ Too slow/expensive
  },
}
```

### Problem: Guardrails not working in streaming mode

**Cause**: Streaming guardrails only support bad word filtering (not model-based)
**Solution**:

```typescript
// For streaming, rely on bad word filtering
// Model-based filtering works in generate() mode only
const result = await neurolink.generate({
  // Use generate, not stream
  prompt: "...",
});
```

## Best Practices

### Content Filtering Strategy

1. **Start with presets** - Use `preset: "security"` as baseline
2. **Layer protections** - Combine bad words + model filtering
3. **Use lightweight filter models** - `gpt-4o-mini` for speed
4. **Test thoroughly** - Verify filtering doesn't break legitimate content
5. **Monitor and iterate** - Track false positives/negatives

### Bad Word List Curation

✅ **Do**:

- Include specific harmful terms
- Use exact phrases, not single characters
- Regularly update based on user reports
- Consider context-specific terms for your domain

❌ **Don't**:

- Add common English words (high false positive rate)
- Include single letters or short words
- Rely solely on bad words (use model filter too)

### Performance Optimization

```typescript
// For high-throughput applications:
config: {
  guardrails: {
    badWords: {
      enabled: true,  // Fast regex filtering
      list: [...criticalTerms],
    },
    modelFilter: {
      enabled: false,  // Disable for speed (or use sampling)
    },
  },
}
```

## Compliance Use Cases

### COPPA (Children's Online Privacy)

```typescript
config: {
  badWords: {
    enabled: true,
    list: ["email", "phone", "address", "age", "location"],
  },
  modelFilter: {
    enabled: true,  // Detect attempts to collect PII
  },
}
```

### GDPR Data Protection

```typescript
config: {
  badWords: {
    enabled: true,
    list: [
      "credit-card", "ssn", "passport",
      "bank-account", "medical-record",
    ],
  },
}
```

## Related Features

- [HITL Workflows](/docs/features/hitl) - User approval for risky actions
- [Middleware Architecture](/docs/workflows/middleware) - Custom middleware development
- [Analytics Integration](/docs/reference/analytics) - Track filtered content metrics

## Migration Notes

If upgrading from versions before v7.42.0:

1. Guardrails are now enabled via middleware presets
2. Old `guardrailsConfig` option deprecated - use `middlewareConfig.guardrails`
3. No breaking changes - existing configs still work
4. Recommended: Switch to `preset: "security"` for simplified setup

For complete technical documentation and advanced integration patterns, see [guardrails-ai-integration.md](/docs/features/guardrails-ai).

---

## Human-in-the-Loop (HITL) Workflows

<!-- Source: features/hitl.md -->

# Human-in-the-Loop (HITL) Workflows

> **Since**: v7.39.0 | **Status**: Stable | **Availability**: SDK

## Overview

**What it does**: HITL pauses AI tool execution to request explicit user approval before performing risky operations like deleting files, modifying databases, or making expensive API calls.

**Why use it**: Prevent costly mistakes and give users control over potentially dangerous AI actions. Think of it as an "Are you sure?" dialog for AI assistant operations.

:::warning[Security Best Practice]
Only use HITL for truly risky operations. Overusing confirmation prompts degrades user experience and can lead to "confirmation fatigue" where users approve actions without reading them.
:::

**Common use cases**:

- File deletion or modification operations
- Database write/delete operations
- Expensive third-party API calls
- Irreversible actions (sending emails, posting to social media)
- Operations accessing sensitive data

## Quick Start

### SDK Example

```typescript

const neurolink = new NeuroLink({
  tools: [
    {
      name: "deleteFile", // (1)!
      description: "Deletes a file from the filesystem", // (2)!
      requiresConfirmation: true, // (3)!
      execute: async (args) => {
        // (4)!
        // Your deletion logic
      },
    },
  ],
});

// When AI tries to use deleteFile:
// 1. Tool execution pauses
// 2. Returns USER_CONFIRMATION_REQUIRED error
// 3. Application shows confirmation dialog
// 4. On approval, tool executes with confirmation_received = true
```

1. Tool identifier used by the AI to invoke this function
2. Describes tool purpose to the LLM for proper selection
3. Triggers HITL checkpoint before execution
4. Actual implementation only runs after user approval

### Handling Confirmation in Your UI

HITL uses an event-based workflow where the SDK emits confirmation requests and your app responds with user decisions.

```typescript

const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    dangerousActions: ["delete", "remove", "drop", "truncate"],
    timeout: 30000, // 30 seconds
  },
});

// (1)! Listen for confirmation requests
neurolink.on("hitl:confirmation-request", async (event) => {
  const {
    confirmationId,
    toolName,
    arguments: args,
    timeoutMs,
  } = event.payload;

  // (2)! Show your app's confirmation UI
  const approved = await showConfirmationDialog({
    action: toolName,
    details: args,
    message: `AI wants to ${toolName}. Allow?`,
    timeoutMs,
  });

  // (3)! Send response back to NeuroLink
  neurolink.emit("hitl:confirmation-response", {
    type: "hitl:confirmation-response",
    payload: {
      confirmationId, // (4)! Must match the request
      approved, // (5)! User decision
      reason: approved ? undefined : "User denied permission",
      metadata: {
        timestamp: new Date().toISOString(),
        responseTime: Date.now(), // Track response speed
      },
    },
  });
});

// (6)! Handle confirmation timeouts
neurolink.on("hitl:timeout", (event) => {
  console.warn(`Confirmation timed out for ${event.payload.toolName}`);
});
```

1. Event-based confirmation workflow - NeuroLink emits requests, your app handles them
2. Show confirmation UI with tool details and countdown timer
3. Respond using event emitter with confirmation ID
4. Confirmation ID links the response to the specific request
5. Approval decision determines if tool executes
6. Optional: Handle cases where user doesn't respond in time

## Configuration

| Option                 | Type      | Default | Required | Description                          |
| ---------------------- | --------- | ------- | -------- | ------------------------------------ |
| `requiresConfirmation` | `boolean` | `false` | No       | Mark tool as requiring user approval |

### Tool Registration

```typescript
const riskyTool = {
  name: "sendEmail",
  description: "Sends an email to a recipient",
  requiresConfirmation: true, // Enable HITL
  parameters: {
    /* ... */
  },
  execute: async (args) => {
    /* ... */
  },
};
```

## How It Works

### Execution Flow

1. **AI requests tool execution** → Tool executor checks if tool requires confirmation
2. **Confirmation required?** → Returns `USER_CONFIRMATION_REQUIRED` error to LLM
3. **LLM asks user** → "I need to [action]. Is that okay?"
4. **User responds**:
   - **Approve** → UI sets `confirmation_received = true` and retries tool execution
   - **Deny** → UI sends "User cancelled" message back to LLM
5. **Tool executes** → Permission flag immediately resets to `false`

### Security Features

- **One-time permissions**: Each approval works for exactly one action
- **No reuse**: AI cannot reuse old permissions for new actions
- **Automatic reset**: Permission flag clears immediately after use
- **Fail-safe**: Defaults to requiring permission when in doubt

## API Reference

### Event Types

**Confirmation Request Event** (`hitl:confirmation-request`):

```typescript
neurolink.on("hitl:confirmation-request", (event) => {
  event.payload: {
    confirmationId: string;      // Unique ID for this request
    toolName: string;            // Tool requiring confirmation
    arguments: unknown;          // Tool parameters for review
    actionType: string;          // Human-readable description
    timeoutMs: number;           // Milliseconds until timeout
    allowModification: boolean;  // Can user edit arguments?
    metadata: { ... }            // Session/user context
  }
});
```

**Confirmation Response** (emit from your app):

```typescript
neurolink.emit("hitl:confirmation-response", {
  type: "hitl:confirmation-response",
  payload: {
    confirmationId: string;      // Must match request
    approved: boolean;           // User decision
    reason?: string;             // Rejection reason
    modifiedArguments?: unknown; // User-edited args
    metadata: {
      timestamp: string;
      responseTime: number;
    }
  }
});
```

**Timeout Event** (`hitl:timeout`):

```typescript
neurolink.on("hitl:timeout", (event) => {
  event.payload: {
    confirmationId: string;
    toolName: string;
    timeout: number;
  }
});
```

See [human-in-the-loop.md](/docs/features/hitl) for complete technical documentation.

## Troubleshooting

### Problem: Tool executes without asking for permission

**Cause**: Tool not marked with `requiresConfirmation: true`
**Solution**:

```typescript
// Add confirmation flag to tool definition
const tool = {
  name: "deleteTool",
  requiresConfirmation: true, // (1)!
  // ...
};
```

1. Add this boolean flag to any tool that performs risky operations

### Problem: AI keeps asking for confirmation repeatedly

**Cause**: Confirmation responses not being sent or sent with wrong `confirmationId`
**Solution**:

```typescript
// Always respond to confirmation requests with matching ID
neurolink.on("hitl:confirmation-request", async (event) => {
  const { confirmationId } = event.payload; // (1)!

  const approved = await showConfirmationDialog(event.payload);

  // (2)! Send response with EXACT confirmationId from request
  neurolink.emit("hitl:confirmation-response", {
    type: "hitl:confirmation-response",
    payload: {
      confirmationId, // (3)! Must match request exactly
      approved,
      metadata: {
        timestamp: new Date().toISOString(),
        responseTime: Date.now(),
      },
    },
  });
});
```

1. Extract confirmation ID from the request event
2. Always respond to every confirmation request
3. **Critical**: Use the same confirmationId from the request

### Problem: Confirmation dialog doesn't show

**Cause**: Not listening to `hitl:confirmation-request` event
**Solution**:

```typescript
// Set up event listener BEFORE making AI requests
neurolink.on("hitl:confirmation-request", async (event) => {
  // (1)! Show your confirmation UI
  await handleConfirmationPrompt(event);
});

// (2)! Then make AI requests - confirmations will now work
const result = await neurolink.generate({
  input: { text: "Delete all temporary files" },
});
```

1. Register the event handler early in your application startup
2. All subsequent tool executions will trigger confirmations when needed

## Best Practices

:::tip[Production Recommendation]
Store user confirmation preferences to avoid repeated prompts for the same action type. For example, if a user approves "delete temporary files" once, cache that preference for similar low-risk deletions in the same session.
:::

### For Developers

1. **Mark tools conservatively** - If an operation could cause problems, require confirmation
2. **Clear prompts** - Ensure users understand exactly what will happen
3. **Test confirmation flow** - Verify it works smoothly in your UI
4. **Log approvals** - Keep audit trail of user decisions
5. **Handle denials gracefully** - Allow users to try alternative approaches

### What to Mark as Requiring Confirmation

✅ **Do require confirmation**:

- File deletions
- Database writes/deletes
- Sending emails or messages
- Making purchases or payments
- Modifying production systems

❌ **Don't require confirmation**:

- Read-only operations
- Answering questions
- Generating content
- Searching/fetching data

## Related Features

- [Guardrails Middleware](/docs/features/guardrails) - Content filtering and safety checks
- [Custom Tools](/docs/sdk/custom-tools) - Building your own tools with HITL
- [Middleware Architecture](/docs/workflows/middleware) - Advanced request interception

## Migration Notes

If upgrading from versions before v7.39.0:

1. Review all existing tools for risk assessment
2. Add `requiresConfirmation: true` to risky tools
3. Implement confirmation dialog in your UI
4. Test with low-risk tools first
5. Roll out to production gradually

For comprehensive technical documentation, diagrams, and security details, see the [complete HITL guide](/docs/features/hitl).

---

## Image Generation Streaming Guide

<!-- Source: features/image-generation.md -->

# Image Generation Streaming Guide

## Overview

NeuroLink supports image generation through AI models like Google Vertex AI's `gemini-3-pro-image-preview` and `gemini-2.5-flash-image`. This guide explains how image generation works in both `generate()` and `stream()` modes, including CLI usage with automatic file saving, technical architecture, and usage examples.

## Table of Contents

1. [Architecture Overview](#architecture-overview)
2. [Streaming Modes](#streaming-modes)
3. [Image Generation Flow](#image-generation-flow)
4. [Usage Examples](#usage-examples)
5. [Implementation Details](#implementation-details)
6. [Troubleshooting](#troubleshooting)

## Streaming Modes

### Real Streaming vs Fake Streaming

NeuroLink uses two different streaming approaches depending on the model capabilities:

#### Real Streaming (Text Models)

- Uses Vercel AI SDK's native `streamText()` function
- Streams tokens as they are generated by the AI model
- Provides true real-time streaming experience
- Used for: GPT-4, Claude, Gemini (text), etc.

#### Fake Streaming (Image Models)

- Calls `generate()` internally to get complete result
- Yields the result progressively to simulate streaming
- Required because image generation models don't support token-by-token streaming
- Used for: `gemini-2.5-flash-image`, `gemini-3-pro-image-preview`, etc.

### Why Fake Streaming?

Image generation models produce complete images, not incremental tokens. The fake streaming approach:

1. **Maintains API Consistency**: Same `stream()` interface for all models
2. **Preserves User Experience**: Clients can use the same code pattern
3. **Enables Progressive Enhancement**: Can yield text chunks before final image
4. **Supports Analytics**: Tracks generation time and token usage

---

## Image Generation Flow

### Step-by-Step Process

```
1. Client calls neurolink.stream()
   ↓
2. BaseProvider.stream() detects image model
   ↓
3. Routes to executeFakeStreaming()
   ↓
4. Calls this.generate() internally
   ↓
5. Provider.executeImageGeneration() is invoked
   ↓
6. AI API generates complete image
   ↓
7. Image returned as base64 string
   ↓
8. enhanceResult() preserves imageOutput field
   ↓
9. executeFakeStreaming() yields text chunks (if any)
   ↓
10. executeFakeStreaming() yields image chunk
    { type: "image", imageOutput: { base64: "..." } }
   ↓
11. Client receives and processes image chunk
```

### Code Flow in BaseProvider

```typescript
// src/lib/core/baseProvider.ts

async stream(options: StreamOptions): Promise {
  // Step 1: Detect if this is an image generation model
  const isImageModel = IMAGE_GENERATION_MODELS.some((m) =>
    this.modelName.includes(m),
  );

  // Step 2: Route to fake streaming for image models
  if (isImageModel) {
    return await this.executeFakeStreaming(options, analysisSchema);
  }

  // Step 3: Use real streaming for text models
  return await this.executeRealStreaming(options, analysisSchema);
}

private async executeFakeStreaming(
  options: StreamOptions,
  analysisSchema?: z.ZodSchema,
): Promise {
  // Step 4: Call generate() to get complete result
  const result = await this.generate({
    prompt: options.prompt,
    // ... other options
  });

  // Step 5: Create async generator to yield chunks
  const stream = async function* () {
    // Yield text chunks if present
    if (result.text) {
      const words = result.text.split(" ");
      for (const word of words) {
        yield { content: word + " " };
        await new Promise((resolve) => setTimeout(resolve, 50));
      }
    }

    // Step 6: Yield image chunk if present
    if (result?.imageOutput) {
      yield {
        type: "image" as const,
        imageOutput: result.imageOutput,
      };
    }
  };

  return {
    stream: stream(),
    analytics: result.analytics,
    evaluation: result.evaluation,
  };
}
```

---

## Usage Examples

### Example 1: Basic Image Generation with generate()

```typescript

const neurolink = new NeuroLink();

// Generate an image using Vertex AI
const result = await neurolink.generate({
  input: { text: "A serene mountain landscape at sunset" },
  provider: "vertex",
  model: "gemini-3-pro-image-preview",
});

// Access the generated image
if (result.imageOutput) {
  const base64Image = result.imageOutput.base64;
  console.log(`Image generated: ${base64Image.length} characters`);

  // Save to file
  const imageBuffer = Buffer.from(base64Image, "base64");
  fs.writeFileSync("mountain.png", imageBuffer);
  console.log("✅ Image saved to mountain.png");
}

// Result also contains descriptive text
console.log("Content:", result.content);
// Output: "Generated image using gemini-3-pro-image-preview (image/png)"

// Access analytics (if enabled)
if (result.analytics) {
  console.log(`Generation time: ${result.analytics.responseTime}ms`);
  console.log(`Tokens used: ${result.analytics.usage.total}`);
}
```

### Example 2: Image Generation with Streaming

```typescript

const neurolink = new NeuroLink();

// Stream image generation (uses fake streaming for image models)
const result = await neurolink.stream({
  input: { text: "A futuristic city with flying cars" },
  provider: "vertex",
  model: "gemini-2.5-flash-image",
});

// Process stream chunks
for await (const chunk of result.stream) {
  if ("content" in chunk) {
    // Text chunk (description or metadata)
    process.stdout.write(chunk.content);
  } else if (chunk.type === "image") {
    // Image chunk - yielded after text chunks complete
    console.log("\n✅ Image received!");
    const base64Image = chunk.imageOutput.base64;

    // Save the image
    const imageBuffer = Buffer.from(base64Image, "base64");
    fs.writeFileSync("futuristic-city.png", imageBuffer);

    console.log(`Image size: ${imageBuffer.length} bytes`);
    console.log(`Saved to: futuristic-city.png`);
  }
}

// Access analytics after streaming completes
if (result.analytics) {
  console.log(`\nTotal generation time: ${result.analytics.responseTime}ms`);
}
```

**Note:** Image generation uses "fake streaming" - the complete image is generated first, then yielded as a single chunk. This maintains API consistency with text streaming.

### Example 3: CLI Usage

```bash
# Basic image generation (saves to default path: generated-images/image-.png)
npx neurolink generate "A beautiful sunset over the ocean" \
  --provider vertex \
  --model gemini-3-pro-image-preview

# Output:
#  Generated image saved to: generated-images/image-2025-12-16T11-50-42-209Z.png
#    Image size: 1856.34 KB
# Generated image using gemini-3-pro-image-preview (image/png)

# Generate with custom output path
npx neurolink generate "Mountain landscape at sunset" \
  --provider vertex \
  --model gemini-2.5-flash-image \
  --imageOutput ./my-images/mountain.png

# Output:
#  Generated image saved to: ./my-images/mountain.png
#    Image size: 2048.67 KB
# Generated image using gemini-2.5-flash-image (image/png)

# Generate with analytics
npx neurolink generate "Futuristic city with flying cars" \
  --provider vertex \
  --model gemini-2.5-flash-image \
  --imageOutput ./images/city.png \
  --enable-analytics

# Use different models
npx neurolink generate "Serene forest scene" \
  --provider vertex \
  --model gemini-3-pro-image-preview  # Best quality, requires 'global' location

npx neurolink generate "Quick sketch of a cat" \
  --provider vertex \
  --model gemini-2.5-flash-image  # Faster generation
```

**CLI Options:**

- `--imageOutput `: Custom path for generated image (default: `generated-images/image-.png`)
- `--provider vertex` or `--provider google-ai`: Both Vertex AI and Google AI Studio support image generation
- `--model `: Image generation model to use
- `--enable-analytics`: Include generation metrics

### Example 4: Detecting Image Chunks in Stream

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.stream({
  input: { text: "A magical forest with glowing mushrooms" },
  provider: "vertex",
  model: "gemini-2.5-flash-image",
});

let textContent = "";
let imageData: string | null = null;

for await (const chunk of result.stream) {
  // Type guard for text chunks
  if ("content" in chunk) {
    textContent += chunk.content;
  }

  // Type guard for image chunks
  if ("type" in chunk && chunk.type === "image") {
    imageData = chunk.imageOutput.base64;
    console.log("Image chunk received!");
  }
}

console.log("Text description:", textContent);
console.log("Image available:", !!imageData);

if (imageData) {
  // Process the image
  const imageBuffer = Buffer.from(imageData, "base64");
  fs.writeFileSync("magical-forest.png", imageBuffer);
  console.log(`✅ Saved ${(imageBuffer.length / 1024).toFixed(2)} KB image`);
}
```

### Example 5: Error Handling

```typescript

const neurolink = new NeuroLink();

try {
  const result = await neurolink.stream({
    input: { text: "A dragon flying over mountains" },
    provider: "vertex",
    model: "gemini-3-pro-image-preview",
  });

  let imageReceived = false;

  for await (const chunk of result.stream) {
    if ("type" in chunk && chunk.type === "image") {
      imageReceived = true;
      const base64Image = chunk.imageOutput.base64;

      // Validate image data
      if (!base64Image || base64Image.length === 0) {
        throw new Error("Empty image data received");
      }

      // Validate base64 format
      // Validate base64 format (padding '=' only at end, max 2 chars)
      if (!/^[A-Za-z0-9+/]*={0,2}$/.test(base64Image)) {
        throw new Error("Invalid base64 format");
      }

      // Save image
      const imageBuffer = Buffer.from(base64Image, "base64");

      // Validate minimum size (1KB)
      if (imageBuffer.length  {
  const { prompt } = req.body;

  if (!prompt) {
    return res.status(400).json({ error: "Prompt is required" });
  }

  const neurolink = new NeuroLink();

  try {
    const result = await neurolink.stream({
      input: { text: prompt },
      provider: "vertex",
      model: "gemini-2.5-flash-image",
    });

    // Set headers for streaming response
    res.setHeader("Content-Type", "text/event-stream");
    res.setHeader("Cache-Control", "no-cache");
    res.setHeader("Connection", "keep-alive");

    for await (const chunk of result.stream) {
      if ("content" in chunk) {
        // Send text chunks as SSE
        res.write(
          `data: ${JSON.stringify({ type: "text", content: chunk.content })}\n\n`,
        );
      } else if (chunk.type === "image") {
        // Send image chunk as SSE
        res.write(
          `data: ${JSON.stringify({
            type: "image",
            base64: chunk.imageOutput.base64,
            size: chunk.imageOutput.base64.length,
          })}\n\n`,
        );
      }
    }

    res.write("data: [DONE]\n\n");
    res.end();
  } catch (error) {
    res.status(500).json({
      error: error.message,
      details: "Image generation failed",
    });
  }
});

// REST endpoint (non-streaming)
app.post("/api/generate-image-sync", async (req, res) => {
  const { prompt } = req.body;

  const neurolink = new NeuroLink();

  try {
    const result = await neurolink.generate({
      input: { text: prompt },
      provider: "vertex",
      model: "gemini-2.5-flash-image",
    });

    if (result.imageOutput) {
      res.json({
        success: true,
        base64: result.imageOutput.base64,
        content: result.content,
        size: Buffer.from(result.imageOutput.base64, "base64").length,
      });
    } else {
      res.status(500).json({ error: "No image generated" });
    }
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000, () => {
  console.log("Server running on http://localhost:3000");
});
```

---

## Implementation Details

### Provider-Specific Implementation

Vertex AI provider implements image generation through the REST API:

```typescript
// src/lib/providers/googleVertex.ts

async executeImageGeneration(
  options: TextGenerationOptions,
): Promise {
  const { GoogleAuth } = await import("google-auth-library");

  // Authenticate with Google Cloud
  const auth = new GoogleAuth({
    scopes: ["https://www.googleapis.com/auth/cloud-platform"],
  });

  const client = await auth.getClient();
  const accessToken = await client.getAccessToken();

  // Determine location based on model
  const location = this.modelName.includes("gemini-3-pro-image")
    ? "global"  // gemini-3-pro-image-preview requires global
    : this.location;  // Other models can use regional endpoints

  // Build request with response modalities for image generation
  const requestBody = {
    contents: [{
      role: "user",
      parts: [{ text: options.prompt }],
    }],
    generation_config: {
      response_modalities: ["TEXT", "IMAGE"],  // CRITICAL for image generation
      temperature: options.temperature || 0.7,
      candidate_count: 1,
    },
  };

  // Call Vertex AI API
  const url = `https://${location}-aiplatform.googleapis.com/v1/projects/${this.projectId}/locations/${location}/publishers/google/models/${this.modelName}:generateContent`;

  const response = await fetch(url, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${accessToken.token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(requestBody),
  });

  const data = await response.json();

  // Extract image from response
  const candidate = data.candidates?.[0];
  const imagePart = candidate?.content?.parts?.find(
    (part) =>
      (part.inlineData || part.inline_data) &&
      ((part.inlineData?.mimeType || part.inline_data?.mime_type)?.startsWith("image/"))
  );

  if (!imagePart) {
    throw new Error("No image generated in response");
  }

  // Extract base64 data (handle both camelCase and snake_case)
  const imageData = imagePart.inlineData?.data || imagePart.inline_data?.data;
  const mimeType = imagePart.inlineData?.mimeType || imagePart.inline_data?.mime_type || "image/png";

  // Return result with imageOutput
  const result: EnhancedGenerateResult = {
    content: `Generated image using ${this.modelName} (${mimeType})`,
    imageOutput: {
      base64: imageData,
    },
    provider: this.providerName,
    model: this.modelName,
    usage: {
      input: this.estimateTokenCount(options.prompt),
      output: 0,
      total: this.estimateTokenCount(options.prompt),
    },
  };

  // Enhance with analytics/evaluation if enabled
  return await this.enhanceResult(result, options, startTime);
}
```

**Key Implementation Details:**

1. **Authentication**: Uses Google Cloud service account credentials
2. **Location Handling**: Automatically selects `global` for `gemini-3-pro-image-preview`
3. **Response Modalities**: Sets `["TEXT", "IMAGE"]` to enable image generation
4. **Base64 Extraction**: Handles both `inlineData` and `inline_data` formats
5. **Result Enhancement**: Preserves `imageOutput` through analytics pipeline

### Type Definitions

```typescript
// src/lib/types/streamTypes.ts

export type StreamResult = {
  stream: AsyncIterable;
  // Provider information
  provider?: string;
  model?: string;
  // Usage information
  usage?: TokenUsage;
  finishReason?: string;
  // Tool integration
  toolCalls?: ToolCall[];
  toolResults?: ToolResult[];
  toolsUsed?: string[];
  // Stream metadata
  metadata?: {
    streamId?: string;
    startTime?: number;
    totalChunks?: number;
    responseTime?: number;
  };
  // Analytics and evaluation (available after stream completion)
  analytics?: AnalyticsData | Promise;
  evaluation?: EvaluationData | Promise;
};

// src/lib/types/generateTypes.ts

export type GenerateResult = {
  content: string;
  outputs?: { text: string }; // Future extensible for multi-modal
  audio?: TTSResult;
  imageOutput?: { base64: string } | null;
  // Provider information
  provider?: string;
  model?: string;
  // Usage and performance
  usage?: TokenUsage;
  responseTime?: number;
  // Tool integration
  toolCalls?: Array;
  toolResults?: unknown[];
  toolsUsed?: string[];
  enhancedWithTools?: boolean;
  // Analytics and evaluation
  analytics?: AnalyticsData;
  evaluation?: EvaluationData;
};

// Note: CLI adds savedPath to imageOutput when saving images locally
// CLI-specific type (not part of core SDK):
// imageOutput?: { base64: string; savedPath?: string } | null;

// EnhancedGenerateResult extends GenerateResult with optional analytics/evaluation
export type EnhancedGenerateResult = GenerateResult & {
  analytics?: AnalyticsData;
  evaluation?: EvaluationData;
};

// CLI-specific types
export type GenerateCommandArgs = {
  input: string;
  provider?: string;
  model?: string;
  imageOutput?: string; // Custom path for generated images
  // ... other options
};
```

### Analytics Integration

The `enhanceResult()` method in BaseProvider preserves the `imageOutput` field while adding analytics:

```typescript
// src/lib/core/baseProvider.ts

protected async enhanceResult(
  result: EnhancedGenerateResult,
  options: TextGenerationOptions,
  startTime: number,
): Promise {
  const responseTime = Date.now() - startTime;

  // CRITICAL: Store imageOutput separately to ensure preservation
  const imageOutput = result.imageOutput;

  let enhancedResult = { ...result };

  // Add analytics if enabled
  if (options.enableAnalytics) {
    try {
      const analytics = await this.createAnalytics(result, responseTime, options);
      // Preserve ALL fields including imageOutput when adding analytics
      enhancedResult = { ...enhancedResult, analytics, imageOutput };
    } catch (error) {
      logger.warn(`Analytics creation failed: ${error.message}`);
    }
  }

  // Add evaluation if enabled
  if (options.enableEvaluation) {
    try {
      const evaluation = await this.createEvaluation(result, options);
      // Preserve ALL fields including imageOutput when adding evaluation
      enhancedResult = { ...enhancedResult, evaluation, imageOutput };
    } catch (error) {
      logger.warn(`Evaluation creation failed: ${error.message}`);
    }
  }

  // CRITICAL FIX: Always restore imageOutput if it existed
  if (imageOutput) {
    enhancedResult.imageOutput = imageOutput;
  }

  return enhancedResult;
}
```

**Key Points:**

- `imageOutput` is explicitly preserved through analytics/evaluation pipeline
- Spread operator ensures all existing fields are maintained
- Double-check restoration at the end prevents accidental loss

---

## Troubleshooting

### Common Issues

#### 1. No Image Chunk Received

**Symptom**: Stream completes but no image chunk is yielded.

**Possible Causes**:

- Model is not an image generation model
- Wrong provider (only Vertex AI supports image generation)
- API credentials are invalid or missing
- Model not available in selected region

**Solution**:

```typescript
// Verify you're using Vertex AI provider
const result = await neurolink.generate({
  input: { text: "Generate an image of a sunset" },
  provider: "vertex", // ✅ Required
  model: "gemini-3-pro-image-preview", // ✅ Valid image model
});

// NOT these:
// provider: "openai"     // ❌ Doesn't support image generation
// provider: "anthropic"  // ❌ Doesn't support image generation
// Note: "google-ai" also supports image generation with gemini-2.5-flash-image

// Verify credentials
console.log(
  "GOOGLE_APPLICATION_CREDENTIALS:",
  process.env.GOOGLE_APPLICATION_CREDENTIALS,
);
console.log("GOOGLE_VERTEX_PROJECT:", process.env.GOOGLE_VERTEX_PROJECT);
```

#### 2. Empty Base64 String

**Symptom**: Image chunk received but `base64` field is empty.

**Possible Causes**:

- API returned error but didn't throw
- Response format changed
- Network issue during transmission

**Solution**:

```typescript
for await (const chunk of result.stream) {
  if (chunk.type === "image") {
    if (!chunk.imageOutput.base64) {
      console.error("Empty image data received");
      console.error("Full chunk:", JSON.stringify(chunk, null, 2));
    } else {
      console.log(`Image data length: ${chunk.imageOutput.base64.length}`);
    }
  }
}
```

#### 3. Model Not Found Error

**Symptom**: Error: `models/gemini-3-pro-image-preview is not found for API version v1`

**Cause**: `gemini-3-pro-image-preview` requires `location: "global"` but a regional endpoint is being used.

**Solution**:

```typescript
// The provider automatically handles location selection:
// - gemini-3-pro-image-preview → uses "global"
// - Other models → uses configured region (e.g., "us-east5")

// Set region in environment variable
process.env.GOOGLE_VERTEX_LOCATION = "us-east5"; // For non-preview models

// Or pass in options
const result = await neurolink.generate({
  input: { text: "Generate image" },
  provider: "vertex",
  model: "gemini-2.5-flash-image", // Uses regional endpoint
  region: "us-east5",
});
```

#### 4. Large Image Timeout

**Symptom**: Generation times out for large/complex images.

**Solution**:

```typescript
const result = await neurolink.stream({
  input: { text: "A detailed cityscape with many buildings" },
  provider: "vertex",
  model: "gemini-2.5-flash-image",
  timeout: 60000, // Increase timeout to 60 seconds
});
```

#### 5. CLI Image Not Saved

**Symptom**: CLI shows success but no file created.

**Possible Causes**:

- `imageOutput` option not passed to `processOptions()`
- Directory permissions issue
- Disk space full

**Solution**:

```bash
# Check default location
ls -lh generated-images/

# Use custom path with explicit directory
npx neurolink generate "test" \
  --provider vertex \
  --model gemini-2.5-flash-image \
  --imageOutput ./my-images/test.png

# Check file was created
ls -lh ./my-images/test.png

# Verify directory permissions
ls -ld generated-images/
```

### Debug Mode

Enable debug logging to troubleshoot issues:

```bash
# Set environment variable
export DEBUG=neurolink:*

# Or use CLI flag
npx neurolink generate "test image" \
  --provider vertex \
  --model gemini-2.5-flash-image \
  --debug

# Debug output will show:
# - Provider selection
# - Model configuration
# - API request details
# - Response parsing
# - Image data extraction
```

### Testing Image Generation

Quick test to verify image generation works:

```bash
# Test with default path
npx neurolink generate "A simple red circle" \
  --provider vertex \
  --model gemini-2.5-flash-image

# Expected output:
#  Generated image saved to: generated-images/image-2025-12-16T11-50-42-209Z.png
#    Image size: 234.56 KB
# Generated image using gemini-2.5-flash-image (image/png)

# Verify file exists
ls -lh generated-images/image-*.png | tail -1

# Test with custom path
npx neurolink generate "A simple blue square" \
  --provider vertex \
  --model gemini-2.5-flash-image \
  --imageOutput ./test-output/square.png

# Expected output:
#  Generated image saved to: ./test-output/square.png
#    Image size: 198.34 KB
# Generated image using gemini-2.5-flash-image (image/png)

# Verify file
file ./test-output/square.png
# Output: ./test-output/square.png: PNG image data, 1024 x 1024, 8-bit/color RGB
```

---

## Best Practices

### 1. Always Check for Image Chunks

```typescript
let hasImage = false;

for await (const chunk of result.stream) {
  if ("type" in chunk && chunk.type === "image") {
    hasImage = true;
    // Process image
  }
}

if (!hasImage) {
  console.warn("No image was generated");
}
```

### 2. Validate Base64 Data

```typescript
if (chunk.type === "image") {
  const base64 = chunk.imageOutput.base64;

  // Validate it's valid base64 (padding '=' only at end, max 2 chars)
  if (!/^[A-Za-z0-9+/]*={0,2}$/.test(base64)) {
    throw new Error("Invalid base64 data");
  }

  // Validate minimum size (e.g., 1KB)
  if (base64.length  30000) {
    console.warn("Image generation took longer than 30 seconds");
  }
}
```

---

## Conclusion

NeuroLink's image generation streaming provides a unified interface for both text and image generation. The fake streaming approach ensures consistency while maintaining the benefits of streaming APIs. By following the patterns and examples in this guide, you can effectively integrate image generation into your applications.

For more information:

- [API Reference](/docs/sdk/api-reference)
- [Provider Comparison](/docs/reference/provider-comparison)
- [Provider Status Monitoring](/docs/observability/provider-status)

---

## Interactive CLI - Your AI Development Environment

<!-- Source: features/interactive-cli.md -->

# Interactive CLI: Your AI Development Environment

> **Since**: v7.0.0 | **Status**: Production Ready | **Availability**: CLI

## Why Interactive Mode?

NeuroLink's Interactive CLI transforms traditional command-line usage into a persistent development environment optimized for AI workflow iteration. Unlike standard CLIs where each command is isolated, Interactive Mode maintains session state, conversation memory, and configuration across all operations - enabling rapid experimentation, debugging, and production runbook execution.

### Traditional CLI vs Interactive Mode

| Feature             | Traditional CLI                | NeuroLink Interactive                 | Productivity Impact              |
| ------------------- | ------------------------------ | ------------------------------------- | -------------------------------- |
| **Session State**   | None - lost after each command | Full persistence across session       | 10x faster parameter tuning      |
| **Memory**          | No context between commands    | Conversation-aware with history       | 5x reduction in repeated context |
| **Configuration**   | Flags required per command     | `/set` persists across entire session | 80% fewer keystrokes             |
| **Tool Testing**    | Manual per-tool invocation     | Live discovery & testing with `/mcp`  | 3x faster integration testing    |
| **Streaming**       | Optional per command           | Real-time default with progress bars  | Immediate feedback               |
| **Error Recovery**  | Start over from scratch        | Session preserved, fix and retry      | 90% time saved on errors         |
| **Workflow Replay** | Copy-paste commands            | Export/import full sessions           | Reproducible workflows           |

**Measured productivity gains:**

- 80% faster onboarding for new users
- 60% fewer configuration errors
- 3-5x faster prompt engineering iteration
- Universal accessibility from beginner to expert

## Loop Mode Deep Dive

### Session Variables

Configure once, use throughout your session:

#### Setting Variables

```bash
neurolink > /set provider anthropic
✅ provider set to anthropic

neurolink > /set model claude-3-opus
✅ model set to claude-3-opus

neurolink > /set temperature 0.3
✅ temperature set to 0.3

neurolink > /set thinking-level high
✅ thinking-level set to high

neurolink > /set max-tokens 4000
✅ max-tokens set to 4000
```

#### Getting Current Values

```bash
neurolink > /get provider
 provider: anthropic

neurolink > /get all
 Current Session Configuration:
├── provider: anthropic
├── model: claude-3-opus
├── temperature: 0.3
├── thinking-level: high
├── max-tokens: 4000
└── conversation-memory: enabled
```

#### Unsetting Variables

```bash
neurolink > /unset temperature
✅ temperature unset (reverting to default: 0.7)

neurolink > /clear
⚠️  Clear all session variables? (y/n): y
✅ All session variables cleared
```

### Conversation Memory

#### How Memory Works

Interactive mode maintains conversation context automatically:

```bash
neurolink > My name is Alice and I work on the backend team

 Nice to meet you, Alice! As a backend developer, you might be interested in...

neurolink > What's my name?

 Your name is Alice, and you mentioned you work on the backend team.

neurolink > /history

 Conversation History (4 messages):

1. USER: My name is Alice and I work on the backend team
2. ASSISTANT: Nice to meet you, Alice! As a backend developer...
3. USER: What's my name?
4. ASSISTANT: Your name is Alice, and you mentioned you work on the backend team.

neurolink > /clear

⚠️  This will clear conversation history but preserve session variables.
   Continue? (y/n): y

✅ Conversation history cleared
 Session variables (provider, model, etc.) preserved
```

#### Memory Persistence (Redis)

With Redis enabled, conversations persist across sessions:

```bash
# Session 1
neurolink > I'm debugging the authentication service

 I can help with that. What specific issue are you seeing?

neurolink > exit
 Session saved to Redis

# Later - Session 2 (same session ID)
npx @juspay/neurolink loop --session sess_abc123

neurolink > What was I working on?

 You were debugging the authentication service. Have you made progress?
```

### Provider Switching

Switch providers mid-session to compare responses:

```bash
neurolink > /set provider openai
✅ provider set to openai

neurolink > Explain quantum computing

 [OpenAI GPT-4 response]

neurolink > /set provider anthropic
✅ provider set to anthropic

neurolink > Explain quantum computing

 [Anthropic Claude response]

neurolink > /set provider google-ai
✅ provider set to google-ai

neurolink > Explain quantum computing

 [Google Gemini response]
```

### Model Experimentation

A/B test different models in the same session:

```bash
# Test different models on same prompt
neurolink > /set provider anthropic

neurolink > /set model claude-3-haiku
neurolink > Write a haiku about coding
 [Haiku response - fast, concise]

neurolink > /set model claude-3-sonnet
neurolink > Write a haiku about coding
 [Sonnet response - balanced]

neurolink > /set model claude-3-opus
neurolink > Write a haiku about coding
 [Opus response - creative, detailed]

# Compare thinking levels
neurolink > /set thinking-level minimal
neurolink > Solve this logic puzzle: ...
 [Quick response]

neurolink > /set thinking-level high
neurolink > Solve this logic puzzle: ...
 [Deep reasoning response with extended thinking]
```

---

## Command Reference

### Session Management

| Command                   | Description                                     | Example                     |
| ------------------------- | ----------------------------------------------- | --------------------------- |
| `/set  `      | Set session variable (persists across commands) | `/set provider anthropic`   |
| `/get `              | Get current value of variable                   | `/get temperature`          |
| `/get all`                | Show all session variables                      | `/get all`                  |
| `/unset `            | Remove session variable (revert to default)     | `/unset temperature`        |
| `/show`                   | Alias for `/get all`                            | `/show`                     |
| `/clear`                  | Clear conversation history (keeps variables)    | `/clear`                    |
| `/reset`                  | Reset everything (history + variables)          | `/reset`                    |
| `/history`                | View conversation history                       | `/history`                  |
| `/history `            | View last N messages                            | `/history 10`               |
| `/export  ` | Export session (json, markdown, text)           | `/export json session.json` |
| `/import `          | Import previous session                         | `/import session.json`      |
| `exit` / `quit` / `:q`    | Exit loop mode                                  | `exit`                      |

#### Available Session Variables

| Variable         | Type    | Example         | Description                |
| ---------------- | ------- | --------------- | -------------------------- |
| `provider`       | string  | `anthropic`     | AI provider to use         |
| `model`          | string  | `claude-3-opus` | Specific model             |
| `temperature`    | number  | `0.7`           | Creativity level (0-1)     |
| `max-tokens`     | number  | `4000`          | Maximum response length    |
| `thinking-level` | string  | `high`          | Extended thinking mode     |
| `streaming`      | boolean | `true`          | Enable streaming responses |
| `tools`          | boolean | `true`          | Enable MCP tool usage      |

### MCP Tools Commands

| Command                     | Description                              | Example                                                    |
| --------------------------- | ---------------------------------------- | ---------------------------------------------------------- |
| `/mcp discover`             | List all available MCP servers and tools | `/mcp discover`                                            |
| `/mcp list`                 | Alias for discover                       | `/mcp list`                                                |
| `/mcp test `        | Test connectivity to MCP server          | `/mcp test github`                                         |
| `/mcp add  ` | Add MCP server to session                | `/mcp add myserver "npx my-mcp-server"`                    |
| `/mcp remove `        | Remove MCP server                        | `/mcp remove myserver`                                     |
| `/mcp status`               | Show status of all servers               | `/mcp status`                                              |
| `/mcp exec  `   | Manually execute a tool                  | `/mcp exec github create_issue --params '{"title":"Bug"}'` |

### HITL Commands

| Command              | Description             | Example                                      |
| -------------------- | ----------------------- | -------------------------------------------- |
| `/hitl status`       | View pending approvals  | `/hitl status`                               |
| `/hitl approve ` | Approve pending action  | `/hitl approve 1`                            |
| `/hitl reject `  | Reject with reason      | `/hitl reject 2 --reason "Security concern"` |
| `/hitl history`      | View approval history   | `/hitl history --last 10`                    |
| `/hitl config`       | View HITL configuration | `/hitl config`                               |

### Help Commands

| Command           | Description                 | Example     |
| ----------------- | --------------------------- | ----------- |
| `/help`           | Show all available commands | `/help`     |
| `/help ` | Help for specific command   | `/help set` |
| `/?`              | Alias for help              | `/?`        |

---

## Real-World Workflows

### Workflow 1: Development Iteration

Complete session showing code generation, refinement, and testing:

````bash
neurolink > /set provider anthropic
neurolink > /set model claude-3-sonnet
neurolink > /set temperature 0.3  # Lower temp for code

# Initial generation
neurolink > Create a TypeScript function to validate email addresses

 Here's an email validation function:

```typescript
function validateEmail(email: string): boolean {
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return emailRegex.test(email);
}
````

neurolink > Add support for validating domains

 [AI remembers previous context, extends the function]

```typescript
function validateEmail(email: string, allowedDomains?: string[]): boolean {
  const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

  if (!emailRegex.test(email)) {
    return false;
  }

  if (allowedDomains) {
    const domain = email.split("@")[1];
    return allowedDomains.includes(domain);
  }

  return true;
}
```

neurolink > Add unit tests

 [AI remembers the function, generates tests]

```typescript

describe("validateEmail", () => {
  it("should validate correct emails", () => {
    expect(validateEmail("user@example.com")).toBe(true);
  });

  it("should reject invalid emails", () => {
    expect(validateEmail("invalid")).toBe(false);
  });

  it("should validate against allowed domains", () => {
    expect(validateEmail("user@company.com", ["company.com"])).toBe(true);
  });
});
```

neurolink > /export markdown code-session.md
✅ Exported to code-session.md

````

### Workflow 2: Model Experimentation

Comparing responses across providers and models:

```bash
# Test prompt engineering across models
PROMPT="Explain dependency injection in 50 words"

neurolink > /set provider openai
neurolink > /set model gpt-4-turbo

neurolink > $PROMPT
 [GPT-4 Turbo response]
Word count: 48

neurolink > /set provider anthropic
neurolink > /set model claude-3-opus

neurolink > $PROMPT
 [Claude Opus response]
Word count: 52

neurolink > /set provider google-ai
neurolink > /set model gemini-3-flash

neurolink > $PROMPT
 [Gemini 3 Flash response]
Word count: 47

# Compare thinking levels
neurolink > /set thinking-level minimal
neurolink > Solve: What is 15% of 280?
 42 [instant]

neurolink > /set thinking-level high
neurolink > Solve: If a train leaves at 2pm going 60mph...
 [extended thinking visible]
   Thinking... analyzing problem structure
   Thinking... calculating distances
   Thinking... verifying solution
   Answer: [detailed solution with reasoning]

neurolink > /export json model-comparison.json
````

### Workflow 3: MCP Tool Testing

Discovering, testing, and using MCP tools:

```bash
neurolink > /mcp discover

 Available MCP Servers (8):
╔═══════════════╦═════════════╦══════════════╗
║ Server        ║ Status      ║ Tools        ║
╠═══════════════╬═════════════╬══════════════╣
║ filesystem    ║ ✅ Active   ║ 9 tools      ║
║ github        ║ ✅ Active   ║ 15 tools     ║
║ postgres      ║ ❌ Inactive ║ 0 tools      ║
...

neurolink > /mcp test postgres

 Testing MCP server: postgres
❌ Connection failed: ECONNREFUSED

 Fix: Set POSTGRES_CONNECTION_STRING environment variable
   export POSTGRES_CONNECTION_STRING="postgresql://user:pass@localhost:5432/db"

neurolink > Great, let me fix that [sets env var externally]

neurolink > /mcp test postgres
✅ Connection successful!
 8 tools available: query, schema, tables, insert, update...

neurolink > Use the GitHub tool to list my repositories

 Using tool: github_list_repos

Found 23 repositories:
1. neurolink-examples (public)
2. ai-playground (private)
3. docs-site (public)
...

neurolink > Create an issue in neurolink-examples titled "Add HITL example"

 Using tool: github_create_issue
 HITL Approval Required

Action: github_create_issue
Args:
  repo: neurolink-examples
  title: Add HITL example
  body: [AI-generated description]

Approve? (y/n): y

✅ Issue created: neurolink-examples#42
 https://github.com/user/neurolink-examples/issues/42

neurolink > /export json github-workflow.json
```

### Workflow 4: Documentation Generation

Using AI to generate docs with iterative refinement:

````bash
neurolink > /set provider anthropic
neurolink > /set temperature 0.5

neurolink > Read the file src/lib/neurolink.ts

 Using tool: readFile

[File contents displayed]

neurolink > Generate API documentation for the NeuroLink class

 # NeuroLink API Documentation

## Class: NeuroLink

Main SDK class for interacting with AI providers...

[Generated docs]

neurolink > Add examples for each method

 [AI remembers the documentation, adds examples]

## Examples

### generate()
```typescript
const result = await neurolink.generate({
  input: { text: "Hello" }
});
````

...

neurolink > Save this to docs/api/neurolink.md

 Using tool: writeFile
✅ Saved to docs/api/neurolink.md

neurolink > Now generate docs for the MessageBuilder class

 Reading src/lib/utils/messageBuilder.ts...

[Continues documentation generation]

neurolink > /export json doc-generation-session.json

````

---

## Tips & Tricks

### Power User Features

#### Keyboard Shortcuts

- **↑ / ↓** - Navigate command history
- **Tab** - Auto-complete commands and variables
- **Ctrl+C** - Cancel current operation (doesn't exit)
- **Ctrl+D** - Exit loop mode
- **Ctrl+L** - Clear screen
- **Ctrl+R** - Search command history

#### Multi-line Input

Use backslash continuation for multi-line prompts:

```bash
neurolink > Write a function that: \
... 1. Validates user input \
... 2. Sanitizes the data \
... 3. Returns typed result

 [AI processes full multi-line prompt]
````

Or use triple backticks for code blocks:

```bash
neurolink > Review this code:
```

function process(data) {
return data.map(x => x \* 2);
}

```

 [AI reviews the code block]
```

#### Command Aliases

Create shortcuts for common operations:

```bash
# In your shell profile (.bashrc, .zshrc)
alias nlg="npx @juspay/neurolink loop --provider google-ai"
alias nla="npx @juspay/neurolink loop --provider anthropic"
alias nlo="npx @juspay/neurolink loop --provider openai"

# Usage
$ nlg  # Starts loop with Google AI
$ nla  # Starts loop with Anthropic
```

### Session Persistence

#### Saving Sessions

Explicit save to file:

```bash
neurolink > /export json my-session.json
✅ Exported 15 messages to my-session.json

# Session includes:
# - All conversation history
# - Session variables
# - Tool usage logs
# - Timestamps
```

#### Resuming Sessions

```bash
# Resume from file
npx @juspay/neurolink loop --session my-session.json

# Resume from Redis (if enabled)
npx @juspay/neurolink loop --session-id sess_abc123
```

#### Sharing Sessions

Share reproducible workflows with team:

```bash
# Developer 1
neurolink > [Creates workflow]
neurolink > /export json workflow.json

# Developer 2
npx @juspay/neurolink loop --session workflow.json
# Can replay exact same workflow
```

### Integration with Scripts

#### Piping Input

```bash
# Pipe file contents to AI
cat README.md | npx @juspay/neurolink generate "Summarize this:"

# Process output from commands
git diff | npx @juspay/neurolink generate "Review these changes"

# Chain with other tools
curl https://api.example.com/data | \
  npx @juspay/neurolink generate "Analyze this JSON"
```

#### Non-Interactive Mode

```bash
# Run single command and exit
npx @juspay/neurolink generate "Hello" --provider anthropic --exit

# Batch processing
for file in *.md; do
  npx @juspay/neurolink generate "Summarize: $(cat $file)" \
    --provider google-ai \
    --output summary-$file
done
```

#### CI/CD Usage

```bash
# .github/workflows/ai-review.yml
- name: AI Code Review
  run: |
    npx @juspay/neurolink loop --non-interactive  /set provider anthropic
Unknown command: /set
```

**Solution**: Ensure you're in loop mode:

```bash
# Wrong - regular CLI
npx @juspay/neurolink set provider anthropic

# Right - loop mode
npx @juspay/neurolink loop
neurolink > /set provider anthropic
```

#### Issue: Conversation Memory Not Working

**Symptom**: AI doesn't remember previous context

**Solution**:

```bash
# Check if memory is enabled
neurolink > /get all
...
conversation-memory: disabled  #  /get all

# Memory disabled - enable it
npx @juspay/neurolink loop --enable-conversation-memory

# Now messages will be tracked for export
```

#### Issue: MCP Tools Not Showing

**Symptom**:

```bash
neurolink > /mcp discover
No MCP servers found
```

**Solution**:

```bash
# Install MCP servers first
npx @juspay/neurolink mcp install filesystem
npx @juspay/neurolink mcp install github

# Verify in .mcp-config.json or configure manually
```

### Debug Mode

Enable verbose logging:

```bash
# Via environment variable
export NEUROLINK_DEBUG=true
npx @juspay/neurolink loop

# Via flag
npx @juspay/neurolink loop --debug

# Debug output shows:
# - Session initialization
# - Variable changes
# - Provider selections
# - Tool executions
# - Memory operations
```

Example debug output:

```
[DEBUG] Initializing loop session
[DEBUG] Session ID: sess_abc123
[DEBUG] Redis connection: redis://localhost:6379 (connected)
[DEBUG] Conversation memory: enabled
[DEBUG] Loading session variables...
[DEBUG] Variable set: provider=google-ai
[DEBUG] Provider initialized: GoogleAIStudioProvider
[DEBUG] Model: gemini-3-flash-preview
[DEBUG] MCP servers discovered: 5
[DEBUG] Tools available: 39
```

---

## See Also

- [CLI Reference](/docs/cli/commands) - Complete CLI command documentation
- [Loop Sessions Quick Guide](/docs/features/cli-loop-sessions) - Quick reference for loop mode
- [MCP Integration](/docs/mcp/integration) - Deep dive into MCP tools
- [Enterprise HITL](/docs/features/enterprise-hitl) - Using HITL in interactive sessions
- [Conversation Memory](/docs/features/conversation-history) - Redis persistence configuration
- [Provider Setup](/docs/getting-started/provider-setup) - Configure AI providers

---

## MCP Tools Ecosystem - 58+ Integrations

<!-- Source: features/mcp-tools-showcase.md -->

# MCP Tools Ecosystem: 58+ Integrations

> **Since**: v7.0.0 | **Status**: Production Ready | **MCP Version**: 2024-11-05

## Overview

NeuroLink's Model Context Protocol (MCP) integration provides a **universal plugin system** that transforms the SDK from a simple AI interface into a complete AI development platform. With 6 built-in core tools and access to 58+ community MCP servers, you can extend AI capabilities to interact with filesystems, databases, APIs, cloud services, and custom enterprise systems.

### What is MCP?

The Model Context Protocol is an **open standard** (like USB-C for AI) that enables AI models to securely interact with external tools and data sources through a unified interface. Think of it as:

- **For Developers**: A standardized way to connect AI to any external system
- **For AI Models**: A tool registry with discoverable, executable functions
- **For Enterprises**: A controlled, auditable way to extend AI capabilities

### Why MCP Matters

| Traditional Approach                    | MCP Approach                   | Benefit                 |
| --------------------------------------- | ------------------------------ | ----------------------- |
| Custom tool integrations per provider   | One MCP tool works everywhere  | 10x faster integration  |
| Manual tool discovery and configuration | Automatic tool registry        | Zero-config tool usage  |
| Provider-specific tool formats          | Universal JSON-RPC protocol    | Provider portability    |
| Limited to SDK-defined tools            | 58+ community servers + custom | Unlimited extensibility |
| Static tool set                         | Dynamic runtime addition       | Adapt to changing needs |

### NeuroLink's Deep MCP Integration

**Factory-First Architecture**: MCP tools work internally while users see simple factory methods:

```typescript
// Same simple interface
const result = await neurolink.generate({
  input: { text: "List files and create a summary document" },
});

// But internally powered by:
// ✅ Context tracking across tool chains
// ✅ Permission-based security
// ✅ Tool registry and discovery
// ✅ Pipeline execution with error recovery
// ✅ Rich analytics and monitoring
```

**Key Features:**

- **99% Lighthouse Compatible**: Existing MCP tools work with minimal changes
- **Dynamic Server Management**: Add/remove MCP servers programmatically
- **Rich Context**: 15+ fields including session, user, permissions, metadata
- **Performance Optimized**: 0-11ms tool execution (target: \' to test connectivity
```

### SDK Discovery

```typescript
const neurolink = new NeuroLink();

// Discover all tools
const tools = await neurolink.discoverTools();

console.log(`Total tools: ${tools.length}`);

// Group by server
const byServer = tools.reduce((acc, tool) => {
  if (!acc[tool.server]) acc[tool.server] = [];
  acc[tool.server].push(tool.name);
  return acc;
}, {});

console.log("Tools by server:", byServer);

// Filter specific capabilities
const fileTools = tools.filter(
  (t) =>
    t.name.includes("file") ||
    t.name.includes("read") ||
    t.name.includes("write"),
);

console.log(
  "File-related tools:",
  fileTools.map((t) => t.name),
);
```

---

## Enterprise MCP Patterns

### Custom MCP Server Development

Create your own MCP server for enterprise integration:

```typescript
// custom-crm-server.ts

const server = new Server(
  {
    name: "custom-crm",
    version: "1.0.0",
  },
  {
    capabilities: {
      tools: {},
    },
  },
);

// Register tools
server.setRequestHandler("tools/list", async () => {
  return {
    tools: [
      {
        name: "get_customer",
        description: "Get customer details from CRM",
        inputSchema: {
          type: "object",
          properties: {
            customerId: {
              type: "string",
              description: "Customer ID",
            },
          },
          required: ["customerId"],
        },
      },
      {
        name: "create_lead",
        description: "Create new lead in CRM",
        inputSchema: {
          type: "object",
          properties: {
            name: { type: "string" },
            email: { type: "string" },
            company: { type: "string" },
          },
          required: ["name", "email"],
        },
      },
    ],
  };
});

// Handle tool execution
server.setRequestHandler("tools/call", async (request) => {
  const { name, arguments: args } = request.params;

  switch (name) {
    case "get_customer":
      const customer = await fetchCustomerFromCRM(args.customerId);
      return {
        content: [
          {
            type: "text",
            text: JSON.stringify(customer, null, 2),
          },
        ],
      };

    case "create_lead":
      const lead = await createLeadInCRM(args);
      return {
        content: [
          {
            type: "text",
            text: `Lead created: ${lead.id}`,
          },
        ],
      };

    default:
      throw new Error(`Unknown tool: ${name}`);
  }
});

// Start server
const transport = new StdioServerTransport();
await server.connect(transport);
```

**Using custom server**:

```typescript
await neurolink.addMCPServer("crm", {
  command: "node",
  args: ["./custom-crm-server.js"],
  env: {
    CRM_API_KEY: process.env.CRM_API_KEY,
    CRM_ENDPOINT: process.env.CRM_ENDPOINT,
  },
});
```

### Security Considerations

#### 1. Tool Sandboxing

```typescript
// Restrict filesystem access
await neurolink.addMCPServer("filesystem", {
  command: "npx",
  args: [
    "-y",
    "@modelcontextprotocol/server-filesystem",
    "/allowed/directory/only", // Restrict to specific directory
  ],
});

// Use HITL for dangerous operations
const neurolink = new NeuroLink({
  hitl: {
    enabled: true,
    requireApproval: ["writeFile", "deleteFile", "executeCode", "shell_exec"],
  },
});
```

#### 2. Permission System

```typescript
// Define permissions per tool
const neurolink = new NeuroLink({
  tools: {
    permissions: {
      readFile: ["admin", "developer", "viewer"],
      writeFile: ["admin", "developer"],
      deleteFile: ["admin"],
      executeCode: ["admin"],
    },
  },
});

// Enforce in context
const result = await neurolink.generate({
  input: { text: "Delete old log files" },
  context: {
    userId: "user123",
    role: "viewer", // Will fail - no delete permission
  },
});
```

#### 3. Audit Logging

```typescript
const neurolink = new NeuroLink({
  audit: {
    enabled: true,
    logAllTools: true,
    storage: "database",
    database: {
      url: process.env.AUDIT_DB_URL,
    },
  },
});

// Audit log entry format
{
  timestamp: "2025-01-01T14:30:00Z",
  userId: "user123",
  tool: "writeFile",
  args: { path: "/data/report.pdf", size: 1024 },
  approved: true,
  approver: "manager@company.com",
  result: { success: true }
}
```

### Performance Optimization

#### 1. Connection Pooling

```typescript
// Reuse database connections
await neurolink.addMCPServer("postgres", {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-postgres"],
  env: {
    POSTGRES_CONNECTION_STRING: process.env.DATABASE_URL,
    POSTGRES_POOL_SIZE: "20", // Connection pool
    POSTGRES_POOL_TIMEOUT: "30000",
  },
});
```

#### 2. Result Caching

```typescript
const neurolink = new NeuroLink({
  tools: {
    cache: {
      enabled: true,
      ttl: 300, // 5 minutes
      maxSize: 1000, // Max cached results
    },
  },
});

// Tools with read-only operations cache results
const result1 = await neurolink.generate({
  input: { text: "Get customer 123 details" },
}); // Cache miss - fetches from CRM

const result2 = await neurolink.generate({
  input: { text: "Get customer 123 details" },
}); // Cache hit - instant response
```

#### 3. Timeout Handling

```typescript
await neurolink.addMCPServer("slow-api", {
  command: "npx",
  args: ["-y", "slow-mcp-server"],
  timeout: 30000, // 30 second timeout
  retry: {
    enabled: true,
    maxAttempts: 3,
    backoff: "exponential",
  },
});
```

---

## See Also

- [MCP Integration Guide](/docs/mcp/integration) - Deep dive into MCP architecture
- [MCP Server Catalog](/docs/guides/mcp/server-catalog) - Complete MCP server directory
- [Custom Tools](/docs/sdk/custom-tools) - Building custom MCP servers
- [Enterprise HITL](/docs/features/enterprise-hitl) - HITL for tool approval workflows
- [Interactive CLI](/docs/cli) - Using MCP tools in CLI loop mode
- [MCP Foundation](/docs/mcp/overview) - MCP architecture documentation

---

## Memory Guide

<!-- Source: features/memory.md -->

# Memory Guide

> **Since**: v9.12.0 | **Status**: Stable | **Availability**: SDK

## Overview

NeuroLink includes a **memory engine** powered by the `@juspay/hippocampus` SDK. Unlike conversation memory (which tracks recent turns in a session), memory maintains a **condensed summary** of durable facts about each user across all conversations.

Key characteristics:

- **Per-user**: Each user gets an independent memory store keyed by `userId`
- **Condensed**: Memory is kept to a configurable word limit (default 50 words) via LLM-powered condensation
- **Persistent**: Stored in S3, Redis, or SQLite — survives server restarts
- **Non-blocking**: Memory storage happens in the background after each generate/stream call
- **Crash-safe**: Every SDK method is wrapped in try-catch — errors are logged, never thrown

## How It Works

```
User prompt arrives
       │
       ▼
 ┌─────────────┐
 │ memory.get() │ ← Retrieve condensed memory for this userId
 └──────┬──────┘
        │ Prepend memory context to prompt
        ▼
 ┌─────────────┐
 │  LLM call   │ ← generate() or stream() as normal
 └──────┬──────┘
        │
        ▼
 ┌──────────────┐
 │ memory.add() │ ← In background: condense old memory + new turn via LLM
 └──────────────┘
```

On each `generate()` or `stream()` call:

1. **Retrieve**: `memory.get(userId)` fetches the user's condensed memory (if any)
2. **Inject**: The memory is prepended to the user's prompt as context
3. **Generate**: The LLM processes the enhanced prompt normally
4. **Store**: After the response completes, `memory.add(userId, content)` runs in the background. The SDK sends the old memory + new conversation turn to an LLM which produces a new condensed summary

## Quick Start

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    memory: {
      enabled: true,
      storage: {
        type: "s3",
        bucket: "my-memory-bucket",
        prefix: "memory/condensed/",
      },
      neurolink: {
        provider: "google-ai",
        model: "gemini-2.5-flash",
      },
      maxWords: 50,
    },
  },
});

// Memory is automatically retrieved and stored on each call
const result = await neurolink.generate({
  input: { text: "My name is Alice and I run a Shopify store." },
  context: { userId: "user-123" },
});

// Next call — the AI already knows about Alice
const result2 = await neurolink.generate({
  input: { text: "What platform do I use?" },
  context: { userId: "user-123" },
});
// → "You use Shopify."
```

## Configuration

The `memory` field on `conversationMemory` accepts a `Memory` object:

```typescript
type Memory = HippocampusConfig & { enabled?: boolean };
```

### Required Fields

| Field                | Type    | Description                                       |
| -------------------- | ------- | ------------------------------------------------- |
| `enabled`            | boolean | Set `true` to activate memory                     |
| `storage.type`       | string  | Storage backend: `"s3"`, `"redis"`, or `"sqlite"` |
| `neurolink.provider` | string  | AI provider for condensation LLM calls            |
| `neurolink.model`    | string  | Model for condensation LLM calls                  |

### Optional Fields

| Field            | Type   | Default  | Description                                                                                             |
| ---------------- | ------ | -------- | ------------------------------------------------------------------------------------------------------- |
| `maxWords`       | number | 50       | Maximum words in the condensed memory                                                                   |
| `prompt`         | string | built-in | Custom condensation prompt (supports `{{OLD_MEMORY}}`, `{{NEW_CONTENT}}`, `{{MAX_WORDS}}` placeholders) |
| `storage.bucket` | string | —        | S3 bucket name (required for S3 storage)                                                                |
| `storage.prefix` | string | —        | S3 key prefix for memory objects                                                                        |
| `storage.url`    | string | —        | Redis connection URL (required for Redis storage)                                                       |
| `storage.path`   | string | —        | SQLite file path (required for SQLite storage)                                                          |

### Storage Backends

#### S3 (Recommended for production)

```typescript
memory: {
  enabled: true,
  storage: {
    type: "s3",
    bucket: "my-bucket",
    prefix: "memory/condensed/",
  },
  neurolink: { provider: "google-ai", model: "gemini-2.5-flash" },
}
```

Each user's memory is stored as a single S3 object at `{prefix}{userId}`.

#### Redis

```typescript
memory: {
  enabled: true,
  storage: {
    type: "redis",
    url: "redis://localhost:6379",
  },
  neurolink: { provider: "openai", model: "gpt-4o-mini" },
}
```

#### SQLite (Development)

```typescript
memory: {
  enabled: true,
  storage: {
    type: "sqlite",
    path: "./memory.db",
  },
  neurolink: { provider: "google-ai", model: "gemini-2.5-flash" },
}
```

> **Note**: SQLite requires the `better-sqlite3` optional peer dependency. Install it manually: `pnpm add better-sqlite3`

## Custom Condensation Prompt

The condensation prompt controls how the LLM merges old memory with new conversation turns. You can provide a custom prompt using the `prompt` field:

```typescript
memory: {
  enabled: true,
  storage: { type: "s3", bucket: "my-bucket" },
  neurolink: { provider: "google-ai", model: "gemini-2.5-flash" },
  prompt: `You are a memory engine. Merge the old memory with new facts into a summary of at most {{MAX_WORDS}} words.

OLD_MEMORY:
{{OLD_MEMORY}}

NEW_CONTENT:
{{NEW_CONTENT}}

Condensed memory:`,
  maxWords: 100,
}
```

### Placeholders

| Placeholder       | Replaced With                                            |
| ----------------- | -------------------------------------------------------- |
| `{{OLD_MEMORY}}`  | The user's existing condensed memory (may be empty)      |
| `{{NEW_CONTENT}}` | The new conversation turn: `"User: ...\nAssistant: ..."` |
| `{{MAX_WORDS}}`   | The configured `maxWords` value                          |

## Integration with generate() and stream()

Memory integrates automatically with both `generate()` and `stream()`:

- **Before the LLM call**: Memory is retrieved and prepended to the input text
- **After the LLM call**: The conversation turn is stored in the background via `setImmediate()`
- **Timeouts**: Retrieval has a 3-second timeout; storage has a 10-second timeout (includes LLM condensation)
- **Errors are non-blocking**: If memory retrieval or storage fails, the generate/stream call continues normally

### Requirements

For memory to activate on a call, all three conditions must be met:

1. `memory.enabled` is `true` in the config
2. `options.context.userId` is provided in the generate/stream call
3. The response has non-empty content (for storage)

## Relationship to Mem0

NeuroLink supports two complementary memory systems:

| Feature          | Memory                             | Mem0                                |
| ---------------- | ---------------------------------- | ----------------------------------- |
| **Architecture** | In-process SDK                     | Cloud API (`mem0ai`)                |
| **Storage**      | S3, Redis, or SQLite (you control) | Mem0 cloud                          |
| **Memory model** | Single condensed summary per user  | Structured memories with categories |
| **LLM calls**    | Uses your configured provider      | Uses Mem0's infrastructure          |
| **Latency**      | Lower (in-process storage)         | Higher (cloud API calls)            |
| **Cost**         | Your LLM costs only                | Mem0 API pricing                    |

Both can be enabled simultaneously — they operate independently.

## Environment Variables

The `@juspay/hippocampus` SDK reads these environment variables:

| Variable                 | Default  | Description                                                 |
| ------------------------ | -------- | ----------------------------------------------------------- |
| `HC_LOG_LEVEL`           | `warn`   | SDK log level: `debug`, `info`, `warn`, `error`             |
| `HC_CONDENSATION_PROMPT` | built-in | Default condensation prompt (overridden by config `prompt`) |

## Error Handling

The memory SDK is designed to **never crash the host application**:

- Every public method (`get()`, `add()`, `delete()`, `close()`) is wrapped in try-catch
- Errors are logged via `logger.warn()` and safe defaults are returned
- `get()` returns `null` on error
- `add()` silently fails on error
- Storage initialization errors result in memory being disabled (returns `null` from `ensureMemoryReady()`)

## Type Exports

NeuroLink re-exports the memory types for use in host applications:

```typescript

// Memory = HippocampusConfig & { enabled?: boolean }
```

## See Also

- **[Conversation Memory](/docs/memory/conversation)** - Session-based conversation history
- **[Mem0 Integration](/docs/memory/mem0)** - Cloud-based semantic memory
- **[Context Compaction](/docs/features/context-compaction)** - Automatic context window management
- **[Context Summarization](/docs/memory/summarization)** - Conversation compression

---

## Multimodal Chat Experiences

<!-- Source: features/multimodal-chat.md -->

NeuroLink 7.47.0 introduces full multimodal pipelines so you can mix text, URLs, and local images in a single interaction. The CLI, SDK, and loop sessions all use the same message builder, ensuring parity across workflows.

## Video Generation {#video-generation}

NeuroLink supports **video generation** from images using Google's Veo 3.1 model via Vertex AI. Transform static images into 8-second videos with synchronized audio.

```typescript

const result = await neurolink.generate({
  input: {
    text: "Smooth camera movement showcasing the product",
    images: [await readFile("./product.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: { mode: "video", video: { resolution: "1080p" } },
});

if (result.video) {
  await writeFile("output.mp4", result.video.data);
}
```

**See:** [Video Generation Guide](/docs/features/video-generation) for complete documentation.

## Images {#images}

NeuroLink provides comprehensive image support across all vision-capable providers. Images can be provided as local file paths, HTTPS URLs, or Buffer objects, and are automatically converted to the provider's required encoding format.

## What You Get

- **Unified CLI flag** – `--image` accepts multiple file paths or HTTPS URLs per request.
- **SDK parity** – pass `input.images` (buffers, file paths, or URLs) and stream structured outputs.
- **Provider fallbacks** – orchestration automatically retries compatible multimodal models.
- **Streaming support** – `neurolink stream` renders partial responses while images upload in the background.

:::tip[Format Support]
The image input accepts three formats: **Buffer objects** (from `readFileSync`), **local file paths** (relative or absolute), or **HTTPS URLs**. All formats are automatically converted to the provider's required encoding.
:::

## Supported Providers & Models

:::warning[Provider Compatibility]
Not all providers support multimodal inputs. Verify your chosen model has the `vision` capability using `npx @juspay/neurolink models list --capability vision`. Unsupported providers will return an error or ignore image inputs.
:::

| Provider               | Recommended Models                       | Notes                                                     |
| ---------------------- | ---------------------------------------- | --------------------------------------------------------- |
| `google-ai`, `vertex`  | `gemini-2.5-pro`, `gemini-2.5-flash`     | Local files and URLs supported.                           |
| `openai`, `azure`      | `gpt-4o`, `gpt-4o-mini`                  | Requires `OPENAI_API_KEY` or Azure deployment name + key. |
| `anthropic`, `bedrock` | `claude-3.5-sonnet`, `claude-3.7-sonnet` | Bedrock needs region + credentials.                       |
| `litellm`              | Any upstream multimodal model            | Ensure LiteLLM server exposes `vision` capability.        |

> Use `npx @juspay/neurolink models list --capability vision` to see the full list from `config/models.json`.

## Prerequisites

1. Provider credentials with vision/multimodal permissions.
2. Latest CLI (`npm`, `pnpm`, or `npx`) or SDK `>=7.47.0`.
3. Optional: Redis if you want images stored alongside loop-session history.

## CLI Quick Start

```bash
# Attach a local file (auto-converted to base64)
npx @juspay/neurolink generate "Describe this interface" \
  --image ./designs/dashboard.png --provider google-ai

# Reference a remote URL (downloaded on the fly)
npx @juspay/neurolink generate "Summarise these guidelines" \
  --image https://example.com/policy.pdf --provider openai --model gpt-4o

# Mix multiple images and enable analytics/evaluation
npx @juspay/neurolink generate "QA review" \
  --image ./screenshots/before.png \
  --image ./screenshots/after.png \
  --enableAnalytics --enableEvaluation --format json
```

### Streaming & Loop Sessions

```bash
# Stream while uploading a diagram
npx @juspay/neurolink stream "Explain this architecture" \
  --image ./diagrams/system.png

# Persist images inside loop mode (Redis auto-detected when available)
npx @juspay/neurolink loop --enable-conversation-memory
> set provider google-ai
> generate Compare the attached charts --image ./charts/q3.png
```

## SDK Usage

```typescript

const neurolink = new NeuroLink({ enableOrchestration: true }); // (1)!

const result = await neurolink.generate({
  input: {
    text: "Provide a marketing summary of these screenshots", // (2)!
    images: [
      // (3)!
      readFileSync("./assets/homepage.png"), // (4)!
      "https://example.com/reports/nps-chart.png", // (5)!
    ],
  },
  provider: "google-ai", // (6)!
  enableEvaluation: true, // (7)!
  region: "us-east-1",
});

console.log(result.content);
console.log(result.evaluation?.overallScore);
```

1. Enable provider orchestration for automatic multimodal fallbacks
2. Text prompt describing what you want from the images
3. Array of images in multiple formats
4. Local file as Buffer (auto-converted to base64)
5. Remote URL (downloaded and encoded automatically)
6. Choose a vision-capable provider
7. Optionally evaluate the quality of multimodal responses

### Image Alt Text for Accessibility

NeuroLink supports alt text for images, which is helpful for accessibility (screen readers) and providing additional context to AI models. Alt text is automatically included as context in the prompt sent to AI providers.

```typescript

const neurolink = new NeuroLink();

// Using images with alt text for accessibility
const result = await neurolink.generate({
  input: {
    text: "Compare these two charts and summarize the trends",
    images: [
      // (1)!
      {
        data: readFileSync("./charts/q1-revenue.png"),
        altText: "Q1 2024 revenue chart showing 15% growth", // (2)!
      },
      {
        data: "https://example.com/charts/q2-revenue.png",
        altText: "Q2 2024 revenue chart showing 22% growth", // (3)!
      },
    ],
  },
  provider: "openai",
});
```

1. Images can be objects with `data` and `altText` properties
2. Alt text for local file - helps AI understand the image context
3. Alt text for remote URL - provides additional context for accessibility

You can also mix simple images with alt-text-enabled images:

```typescript
const result = await neurolink.generate({
  input: {
    text: "Analyze these images",
    images: [
      readFileSync("./simple-image.png"), // Simple buffer (no alt text)
      "https://example.com/image.jpg", // Simple URL (no alt text)
      {
        data: readFileSync("./important-chart.png"),
        altText: "Critical KPI dashboard for Q3", // With alt text
      },
    ],
  },
  provider: "google-ai",
});
```

:::tip[Alt Text Best Practices]
- Keep alt text concise but descriptive (under 125 characters is ideal) - Focus on the key information the image conveys - Alt text is automatically included as context in the prompt, helping AI models better understand the images
:::

Use `stream()` with the same structure when you need incremental tokens:

```typescript
const stream = await neurolink.stream({
  input: {
    text: "Walk through the attached floor plan",
    images: ["./plans/level1.jpg"], // (1)!
  },
  provider: "openai", // (2)!
});

for await (const chunk of stream) {
  // (3)!
  process.stdout.write(chunk.text ?? "");
}
```

1. Accepts file path, Buffer, or HTTPS URL
2. OpenAI's GPT-4o and GPT-4o-mini support vision
3. Stream text responses while image uploads in background

## Configuration & Tuning

- **Image sources** – Local paths are resolved relative to `process.cwd()`. URLs must be HTTPS.
- **Size limits** – Providers cap images at ~20 MB. Resize or compress large assets before sending.
- **Multiple images** – Order matters; the builder interleaves captions in the order provided.
- **Region routing** – Set `region` on each request (e.g., `us-east-1`) for providers that enforce locality.
- **Loop sessions** – Images uploaded during `loop` are cached per session; call `clear session` to reset.
- **Alt text** – Add alt text to images for accessibility; the text is included as context for AI models.

## Best Practices

- Provide short captions in the prompt describing each image (e.g., "see `before.png` on the left").
- **Use alt text** for images that convey important information, especially for accessibility compliance.
- Combine analytics + evaluation to benchmark multimodal quality before rolling out widely.
- Cache remote assets locally if you reuse them frequently to avoid repeated downloads.
- Stream when presenting content to end-users; use `generate` when you need structured JSON output.

## CSV File Support

### Quick Start

```bash
# Auto-detect CSV files
npx @juspay/neurolink generate "Analyze sales trends" \
  --file ./sales_2024.csv

# Explicit CSV with options
npx @juspay/neurolink generate "Summarize data" \
  --csv ./data.csv \
  --csv-max-rows 500 \
  --csv-format raw
```

### SDK Usage

```typescript
// Auto-detect (recommended)
await neurolink.generate({
  input: {
    text: "Analyze this data",
    files: ["./data.csv", "./chart.png"],
  },
});

// Explicit CSV
await neurolink.generate({
  input: {
    text: "Compare quarters",
    csvFiles: ["./q1.csv", "./q2.csv"],
  },
  csvOptions: {
    maxRows: 1000,
    formatStyle: "raw",
  },
});
```

### Format Options

- **raw** (default) - Best for large files, minimal token usage
- **json** - Structured data, easier parsing, higher token usage
- **markdown** - Readable tables, good for small datasets (\<100 rows)

### Best Practices

- Use raw format for large files to minimize token usage
- Use JSON format for structured data processing
- Limit to 1000 rows by default (configurable up to 10K)
- Combine CSV with visualization images for comprehensive analysis
- Works with ALL providers (not just vision-capable models)

## PDF File Support

### Quick Start

```bash
# Auto-detect PDF files
npx @juspay/neurolink generate "Summarize this report" \
  --file ./financial-report.pdf \
  --provider vertex

# Explicit PDF processing
npx @juspay/neurolink generate "Extract key terms" \
  --pdf ./contract.pdf \
  --provider anthropic

# Multiple PDFs
npx @juspay/neurolink generate "Compare these documents" \
  --pdf ./version1.pdf \
  --pdf ./version2.pdf \
  --provider vertex
```

### SDK Usage

```typescript
// Auto-detect (recommended)
await neurolink.generate({
  input: {
    text: "Analyze this document",
    files: ["./report.pdf", "./data.csv"],
  },
  provider: "vertex",
});

// Explicit PDF
await neurolink.generate({
  input: {
    text: "Compare Q1 and Q2 reports",
    pdfFiles: ["./q1-report.pdf", "./q2-report.pdf"],
  },
  provider: "anthropic",
});

// Streaming with PDF
const stream = await neurolink.stream({
  input: {
    text: "Summarize this contract",
    pdfFiles: ["./contract.pdf"],
  },
  provider: "vertex",
});
```

### Supported Providers

| Provider              | Max Size | Max Pages | Notes                           |
| --------------------- | -------- | --------- | ------------------------------- |
| **Google Vertex AI**  | 5 MB     | 100       | `gemini-1.5-pro` recommended    |
| **Anthropic**         | 5 MB     | 100       | `claude-3-5-sonnet` recommended |
| **AWS Bedrock**       | 5 MB     | 100       | Requires AWS credentials        |
| **Google AI Studio**  | 2000 MB  | 100       | Best for large files            |
| **OpenAI**            | 10 MB    | 100       | `gpt-4o`, `gpt-4o-mini`, `o1`   |
| **Azure OpenAI**      | 10 MB    | 100       | Uses OpenAI Files API           |
| **LiteLLM**           | 10 MB    | 100       | Depends on upstream model       |
| **OpenAI Compatible** | 10 MB    | 100       | Depends on upstream model       |
| **Mistral**           | 10 MB    | 100       | Native PDF support              |
| **Hugging Face**      | 10 MB    | 100       | Native PDF support              |

**Not supported:** Ollama

### Best Practices

- **Choose the right provider**: Use Vertex AI or Anthropic for best results
- **Check file size**: Most providers limit to 5MB, AI Studio supports up to 2GB
- **Use streaming**: For large documents, streaming gives faster initial results
- **Combine with other files**: Mix PDF with CSV data and images for comprehensive analysis
- **Be specific in prompts**: "Extract all monetary values" vs "Tell me about this PDF"

### Token Usage

PDFs consume significant tokens:

- **Text-only mode**: ~1,000 tokens per 3 pages
- **Visual mode**: ~7,000 tokens per 3 pages

Set appropriate `maxTokens` for PDF analysis (recommended: 2000-8000 tokens).

## Troubleshooting

| Symptom                            | Action                                                                            |
| ---------------------------------- | --------------------------------------------------------------------------------- |
| `Image not found`                  | Check relative paths from the directory where you invoked the CLI.                |
| `Provider does not support images` | Switch to a model listed in the table above or enable orchestration.              |
| `Error downloading image`          | Ensure the URL responds with status 200 and does not require auth.                |
| `Large response latency`           | Pre-compress images and reduce resolution to under 2 MP when possible.            |
| `Streaming ends early`             | Disable tools (`--disableTools`) to avoid tool calls that may not support vision. |

## Related Features

**Document Processing:**

- [Office Documents](/docs/features/office-documents) – DOCX, PPTX, XLSX processing for Bedrock, Vertex, Anthropic
- [PDF Support](/docs/features/pdf-support) – PDF document processing for visual analysis
- [CSV Support](/docs/features/csv-support) – CSV file processing with auto-detection

**Q4 2025 Features:**

- [Guardrails Middleware](/docs/features/guardrails) – Content filtering for multimodal outputs
- [Auto Evaluation](/docs/features/auto-evaluation) – Quality scoring for vision-based responses

**Documentation:**

- [CLI Commands](/docs/cli/commands) – CLI flags & options
- [SDK API Reference](/docs/sdk/api-reference) – Generate/stream APIs
- [Troubleshooting](/docs/reference/troubleshooting) – Extended error catalogue

---

## Multimodal Capabilities Guide

<!-- Source: features/multimodal.md -->

# Multimodal Capabilities Guide

NeuroLink provides comprehensive multimodal support, allowing you to combine text with various media types in a single AI interaction. This guide covers all supported input types, provider capabilities, and best practices.

## Overview

**Supported Input Types:**

- **Images** - JPEG, PNG, GIF, WebP, HEIC (vision-capable models)
- **PDFs** - Document analysis and content extraction
- **CSV/Spreadsheets** - Data analysis and tabular content processing
- **Audio** - Transcription, analysis, and real-time voice input ([Audio Input Guide](/docs/features/audio-input))
- **Documents** - Excel, Word, RTF, OpenDocument formats ([File Processors Guide](/docs/features/file-processors))
- **Data Files** - JSON, YAML, XML with validation and formatting
- **Markup** - HTML, SVG, Markdown with security sanitization
- **Source Code** - 50+ programming languages with syntax detection

All multimodal inputs work seamlessly across both the CLI and SDK, with automatic format detection and provider-specific optimization.

> **New in 2026:** NeuroLink now supports 17+ file types through the ProcessorRegistry system. See the [File Processors Guide](/docs/features/file-processors) for comprehensive documentation.

------------------ | --------- | ------------------------------------------------------ | ---------- | -------- | ------------------------------------ |
| **OpenAI**            | ✅        | `gpt-4o`, `gpt-4o-mini`, `gpt-5.2`                     | 10         | ~20 MB   | Best for general vision tasks        |
| **Azure OpenAI**      | ✅        | `gpt-4o`, `gpt-4o-mini`                                | 10         | ~20 MB   | Same as OpenAI                       |
| **Google AI Studio**  | ✅        | `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-3-flash` | 16         | ~20 MB   | Excellent for visual reasoning       |
| **Google Vertex AI**  | ✅        | `gemini-2.5-pro`, `gemini-2.5-flash`, Claude models    | 16/20      | ~20 MB   | Gemini: 16 images, Claude: 20 images |
| **Anthropic**         | ✅        | `claude-3.5-sonnet`, `claude-3.7-sonnet`               | 20         | ~20 MB   | Strong visual understanding          |
| **AWS Bedrock**       | ✅        | Claude models                                          | 20         | ~20 MB   | Same as Anthropic                    |
| **Ollama**            | ✅        | `llava`, `bakllava`, `llava-phi3`                      | 10         | Varies   | Local vision models                  |
| **LiteLLM**           | ✅        | Depends on upstream                                    | 10         | Varies   | Proxy to vision-capable models       |
| **Mistral**           | ✅        | `pixtral-12b-2409`, `pixtral-large-2411`               | 10         | ~20 MB   | Multimodal Mistral models            |
| **OpenRouter**        | ✅        | Depends on model                                       | 10         | Varies   | Routes to various vision models      |
| **Hugging Face**      | ⚠️        | Limited                                                | Varies     | Varies   | Model-dependent                      |
| **AWS SageMaker**     | ❌        | N/A                                                    | -          | -        | Not supported                        |
| **OpenAI Compatible** | ⚠️        | Depends on endpoint                                    | Varies     | Varies   | Server-dependent                     |

**Legend:**

- ✅ Full support with multiple models
- ⚠️ Limited or server-dependent support
- ❌ Not supported

### PDF Documents

| Provider              | Supported | Max Size | Max Pages | Processing Mode  | Notes                                   |
| --------------------- | --------- | -------- | --------- | ---------------- | --------------------------------------- |
| **Google Vertex AI**  | ✅        | 5 MB     | 100       | Native PDF       | Best for document analysis              |
| **Anthropic**         | ✅        | 5 MB     | 100       | Native PDF       | Claude excels at document understanding |
| **AWS Bedrock**       | ✅        | 5 MB     | 100       | Native PDF       | Via Claude models                       |
| **Google AI Studio**  | ✅        | 2000 MB  | 100       | Native PDF       | Handles very large files                |
| **OpenAI**            | ✅        | 10 MB    | 100       | Files API        | `gpt-4o`, `gpt-4o-mini`, `o1`           |
| **Azure OpenAI**      | ✅        | 10 MB    | 100       | Files API        | Uses OpenAI Files API                   |
| **LiteLLM**           | ✅        | 10 MB    | 100       | Proxy            | Depends on upstream model               |
| **OpenAI Compatible** | ✅        | 10 MB    | 100       | Varies           | Server-dependent                        |
| **Mistral**           | ✅        | 10 MB    | 100       | Native PDF       | Native support                          |
| **Hugging Face**      | ✅        | 10 MB    | 100       | Model-dependent  | Varies by model                         |
| **Ollama**            | ❌        | -        | -         | -                | Not supported                           |
| **OpenRouter**        | ⚠️        | Varies   | Varies    | Depends on model | Route-dependent                         |
| **AWS SageMaker**     | ❌        | -        | -         | -                | Not supported                           |

### CSV/Spreadsheet Data

| Provider          | Supported | Max Rows | Format Options      | Notes                                 |
| ----------------- | --------- | -------- | ------------------- | ------------------------------------- |
| **All Providers** | ✅        | 10,000   | raw, json, markdown | Universal support - processed as text |

CSV support works with **all providers** because files are converted to text before sending to the AI model. The file is parsed and formatted (raw CSV, JSON, or Markdown table) before inclusion in the prompt.

**Format Recommendations:**

- **Raw format** - Best for large files (minimal token usage)
- **JSON format** - Best for structured data processing
- **Markdown format** - Best for small datasets (\<100 rows), readable tables

### Audio Input

| Provider             | Native Audio | Transcription | Real-time | Max Duration | Notes                               |
| -------------------- | ------------ | ------------- | --------- | ------------ | ----------------------------------- |
| **Google AI Studio** | ✅           | ✅            | ✅        | 1 hour       | Best for real-time voice            |
| **Google Vertex AI** | ✅           | ✅            | ✅        | 1 hour       | Native Gemini audio support         |
| **OpenAI**           | ❌           | ✅ Whisper    | ❌        | 25 MB        | Excellent transcription accuracy    |
| **Azure OpenAI**     | ❌           | ✅ Whisper    | ❌        | 25 MB        | Via Whisper integration             |
| **Anthropic**        | ❌           | Via fallback  | ❌        | -            | Uses transcription approach         |
| **AWS Bedrock**      | ❌           | Via fallback  | ❌        | -            | Uses transcription approach         |
| **Others**           | ❌           | Via fallback  | ❌        | -            | Audio transcribed before processing |

For comprehensive audio documentation, see the [Audio Input Guide](/docs/features/audio-input).

---

## Image Input

### Quick Start

**CLI:**

```bash
# Single image
npx @juspay/neurolink generate "Describe this interface" \
  --image ./designs/dashboard.png --provider google-ai

# Remote URL
npx @juspay/neurolink generate "Analyze this diagram" \
  --image https://example.com/architecture.png --provider openai

# Multiple images
npx @juspay/neurolink generate "Compare these screenshots" \
  --image ./before.png \
  --image ./after.png \
  --provider anthropic
```

**SDK:**

```typescript

const neurolink = new NeuroLink({ enableOrchestration: true });

const result = await neurolink.generate({
  input: {
    text: "Analyze these product screenshots",
    images: [
      readFileSync("./homepage.png"), // Local file as Buffer
      "https://example.com/chart.png", // Remote URL
    ],
  },
  provider: "google-ai",
});
```

### Image Formats Supported

**Accepted formats:**

- JPEG (`.jpg`, `.jpeg`)
- PNG (`.png`)
- GIF (`.gif`)
- WebP (`.webp`)
- HEIC (`.heic`, `.heif`) - iOS photos

**Input methods:**

- **Buffer objects** - `readFileSync()` from Node.js
- **Local file paths** - Relative or absolute paths
- **HTTPS URLs** - Remote images (auto-downloaded)

### Image Alt Text (Accessibility)

NeuroLink supports alt text for images, improving accessibility and providing additional context to AI models.

```typescript
const result = await neurolink.generate({
  input: {
    text: "Compare these revenue charts",
    images: [
      {
        data: readFileSync("./q1-revenue.png"),
        altText: "Q1 2024 revenue chart showing 15% growth",
      },
      {
        data: "https://example.com/q2-revenue.png",
        altText: "Q2 2024 revenue chart showing 22% growth",
      },
    ],
  },
  provider: "openai",
});
```

**Alt text best practices:**

- Keep concise (under 125 characters ideal)
- Focus on key information the image conveys
- Alt text is automatically included as context in prompts

### Image Size Limits

**Provider-specific limits:**

- Most providers: ~20 MB per image
- Recommended: Resize images to < 2 MP for faster processing
- Token usage: ~7,000 tokens per image (varies by provider)

**Optimization tips:**

- Compress images before sending for large batches
- Use appropriate resolution (1920x1080 often sufficient)
- Pre-process images to reduce unnecessary detail

---

## PDF Document Input

### Quick Start

**CLI:**

```bash
# Auto-detect PDF
npx @juspay/neurolink generate "Summarize this report" \
  --file ./financial-report.pdf --provider vertex

# Explicit PDF
npx @juspay/neurolink generate "Extract key terms from contract" \
  --pdf ./contract.pdf --provider anthropic

# Multiple PDFs
npx @juspay/neurolink generate "Compare these documents" \
  --pdf ./version1.pdf \
  --pdf ./version2.pdf \
  --provider vertex
```

**SDK:**

```typescript
// Auto-detect (recommended)
await neurolink.generate({
  input: {
    text: "Analyze this document",
    files: ["./report.pdf", "./data.csv"], // Mixed file types
  },
  provider: "vertex",
});

// Explicit PDF
await neurolink.generate({
  input: {
    text: "Compare Q1 and Q2 reports",
    pdfFiles: ["./q1-report.pdf", "./q2-report.pdf"],
  },
  provider: "anthropic",
});
```

### PDF Processing Modes

**Provider-specific approaches:**

| Provider                          | Mode       | Token Usage           | Best For                 |
| --------------------------------- | ---------- | --------------------- | ------------------------ |
| **Vertex AI, Anthropic, Bedrock** | Native PDF | ~1,000 tokens/3 pages | Visual + text extraction |
| **Google AI Studio**              | Native PDF | ~1,000 tokens/3 pages | Large files (up to 2 GB) |
| **OpenAI, Azure**                 | Files API  | ~1,000 tokens/3 pages | Text-only mode optimal   |

**Visual vs. Text-only mode:**

- **Visual mode**: Preserves layout, tables, charts (~7,000 tokens/3 pages)
- **Text-only mode**: Extracts text content only (~1,000 tokens/3 pages)

### PDF Best Practices

- **Choose the right provider**: Vertex AI or Anthropic for best results
- **Check file size**: Most providers limit to 5 MB (AI Studio supports 2 GB)
- **Use streaming**: For large documents, streaming provides faster initial results
- **Combine with other files**: Mix PDFs with CSV data and images
- **Be specific in prompts**: "Extract all monetary values" vs. "Tell me about this PDF"
- **Set appropriate token limits**: Recommended 2000-8000 tokens for PDF analysis

---

## CSV/Spreadsheet Input

### Quick Start

**CLI:**

```bash
# Auto-detect CSV
npx @juspay/neurolink generate "Analyze sales trends" \
  --file ./sales_2024.csv

# Explicit CSV with options
npx @juspay/neurolink generate "Summarize data" \
  --csv ./data.csv \
  --csv-max-rows 500 \
  --csv-format raw
```

**SDK:**

```typescript
// Auto-detect (recommended)
await neurolink.generate({
  input: {
    text: "Analyze this sales data",
    files: ["./sales.csv"], // Auto-detected as CSV
  },
});

// Explicit CSV with options
await neurolink.generate({
  input: {
    text: "Compare quarterly data",
    csvFiles: ["./q1.csv", "./q2.csv"],
  },
  csvOptions: {
    maxRows: 1000,
    formatStyle: "json", // or "raw", "markdown"
  },
});
```

### CSV Format Options

**Three format styles:**

1. **Raw format** (default)
   - Best for large files
   - Minimal token usage
   - Preserves original CSV structure

   ```
   name,age,city
   Alice,30,NYC
   Bob,25,LA
   ```

2. **JSON format**
   - Structured data processing
   - Easier for AI to parse
   - Higher token usage

   ```json
   [
     { "name": "Alice", "age": 30, "city": "NYC" },
     { "name": "Bob", "age": 25, "city": "LA" }
   ]
   ```

3. **Markdown format**
   - Readable tables
   - Good for small datasets (\<100 rows)
   - Moderate token usage
   ```markdown
   | name  | age | city |
   | ----- | --- | ---- |
   | Alice | 30  | NYC  |
   | Bob   | 25  | LA   |
   ```

### CSV Configuration

```typescript
const result = await neurolink.generate({
  input: {
    text: "Analyze customer data",
    csvFiles: ["./customers.csv"],
  },
  csvOptions: {
    maxRows: 1000, // Limit rows (default: 1000, max: 10000)
    formatStyle: "json", // Format: "raw" | "json" | "markdown"
    includeHeaders: true, // Include header row (default: true)
  },
});
```

### CSV Best Practices

- **Use raw format for large files** to minimize token usage
- **Use JSON format for structured processing** when AI needs to manipulate data
- **Limit to 1000 rows by default** (configurable up to 10,000)
- **Combine CSV with visualization images** for comprehensive analysis
- **Works with ALL providers** (not just vision-capable models)

---

## Combining Multiple Input Types

NeuroLink excels at combining different media types in a single request.

### Mixed Media Example

```typescript
const result = await neurolink.generate({
  input: {
    text: "Analyze this product launch: review the presentation, compare sales data, and assess the promotional materials",
    pdfFiles: ["./presentation.pdf"], // Slides
    csvFiles: ["./sales-data.csv"], // Numbers
    images: [
      readFileSync("./promo-banner.png"), // Marketing material
      "https://example.com/ad-campaign.jpg",
    ],
  },
  provider: "vertex", // Supports all input types
});
```

### Streaming with Multimodal

```typescript
const stream = await neurolink.stream({
  input: {
    text: "Analyze this floor plan and cost breakdown",
    images: ["./floor-plan.jpg"],
    csvFiles: ["./costs.csv"],
  },
  provider: "google-ai",
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? "");
}
```

---

## Configuration & Fine-tuning

### Image-Specific Options

```typescript
const result = await neurolink.generate({
  input: {
    text: "Analyze these screenshots",
    images: [
      {
        data: readFileSync("./screenshot.png"),
        altText: "Product dashboard showing KPIs",
      },
    ],
  },
  provider: "openai",
  maxTokens: 2000, // Increase for detailed image analysis
});
```

### PDF-Specific Options

```typescript
const result = await neurolink.generate({
  input: {
    text: "Extract financial data from this report",
    pdfFiles: ["./annual-report.pdf"],
  },
  provider: "vertex",
  maxTokens: 8000, // Large token budget for comprehensive extraction
});
```

### Regional Routing

Some providers require regional configuration for optimal performance:

```typescript
const result = await neurolink.generate({
  input: {
    text: "Analyze this document",
    pdfFiles: ["./contract.pdf"],
  },
  provider: "vertex",
  region: "us-central1", // Vertex AI region
});
```

---

## Best Practices

### General Guidelines

1. **Provide descriptive prompts** - Reference specific images/files by name
2. **Use alt text for accessibility** - Helps both AI and screen readers
3. **Combine analytics + evaluation** - Benchmark multimodal quality before production
4. **Cache remote assets locally** - Avoid repeated downloads for frequently used files
5. **Stream for user-facing apps** - Use `generate()` for structured JSON output

### Image Best Practices

- Provide short captions describing each image in the prompt
- Pre-compress large images to reduce processing time
- Use appropriate image formats (JPEG for photos, PNG for diagrams)
- Consider token limits when sending multiple images

### PDF Best Practices

- Choose providers with native PDF support (Vertex, Anthropic, Bedrock)
- Be specific about what you need extracted
- Use streaming for large documents
- Set appropriate `maxTokens` (2000-8000 recommended)

### CSV Best Practices

- Use raw format for large datasets
- Use JSON format when AI needs structured data manipulation
- Limit rows to avoid token exhaustion
- Combine with images for visual + numerical analysis

---

## Troubleshooting

### Common Issues

| Issue                                  | Solution                                                          |
| -------------------------------------- | ----------------------------------------------------------------- |
| **"Image not found"**                  | Check file paths are relative to CWD where CLI is invoked         |
| **"Provider does not support images"** | Switch to vision-capable provider (see matrix above)              |
| **"Error downloading image"**          | Ensure URL returns HTTP 200 and doesn't require authentication    |
| **"Large response latency"**           | Pre-compress images and reduce resolution to < 2 MP               |
| **"Streaming ends early"**             | Disable tools (`--disableTools`) to avoid tool call interruptions |
| **"PDF too large"**                    | Use Google AI Studio (2 GB limit) or split into smaller chunks    |
| **"CSV token overflow"**               | Reduce `maxRows` or use raw format instead of JSON/markdown       |

### Provider-Specific Issues

**OpenAI/Azure:**

- Images must be < 20 MB
- PDFs processed via Files API (may take longer)

**Google AI Studio/Vertex:**

- Best for large PDFs (AI Studio supports up to 2 GB)
- Gemini models have excellent visual reasoning

**Anthropic/Bedrock:**

- Claude excels at document understanding
- Strong visual and text analysis capabilities

**Ollama:**

- Use vision-capable models like `llava`, `bakllava`
- Local processing - no cloud API required

---

## Related Features

**Document Processing:**

- [File Processors Guide](/docs/features/file-processors) - Complete guide to 17+ file types (Excel, Word, JSON, YAML, XML, HTML, SVG, code, etc.)
- [Office Documents](/docs/features/office-documents) - DOCX, PPTX, XLSX for Bedrock, Vertex, Anthropic
- [PDF Support](/docs/features/pdf-support) - Detailed PDF processing guide
- [CSV Support](/docs/features/csv-support) - Advanced CSV processing techniques

**Q4 2025 Features:**

- [Guardrails Middleware](/docs/features/guardrails) - Content filtering for multimodal outputs
- [Auto Evaluation](/docs/features/auto-evaluation) - Quality scoring for vision-based responses

**Advanced Features:**

- [Audio Input](/docs/features/audio-input) - Transcription, analysis, and real-time voice
- [TTS Integration](/docs/features/tts) - Text-to-Speech audio output
- [Video Generation](/docs/features/video-generation) - AI-powered video creation

**Documentation:**

- [CLI Commands](/docs/cli/commands) - CLI flags and options reference
- [SDK API Reference](/docs/sdk/api-reference) - Complete API documentation
- [Troubleshooting](/docs/reference/troubleshooting) - Extended error catalog

---

## Examples & Recipes

### Example 1: Product Analysis

Analyze a product page with screenshot, description, and pricing data:

```typescript
const analysis = await neurolink.generate({
  input: {
    text: "Analyze this product: review the screenshot, pricing data, and provide recommendations",
    images: [readFileSync("./product-screenshot.png")],
    csvFiles: ["./pricing-tiers.csv"],
  },
  provider: "google-ai",
  maxTokens: 3000,
});
```

### Example 2: Document Comparison

Compare two versions of a contract:

```typescript
const comparison = await neurolink.generate({
  input: {
    text: "Compare these two contract versions and highlight key differences",
    pdfFiles: ["./contract-v1.pdf", "./contract-v2.pdf"],
  },
  provider: "anthropic",
  maxTokens: 5000,
});
```

### Example 3: Data Visualization Analysis

Analyze charts and underlying data together:

```typescript
const dataAnalysis = await neurolink.generate({
  input: {
    text: "Analyze these sales charts and verify against the raw data",
    images: [
      "https://example.com/q1-chart.png",
      "https://example.com/q2-chart.png",
    ],
    csvFiles: ["./sales-data.csv"],
  },
  provider: "vertex",
  enableAnalytics: true,
  enableEvaluation: true,
});
```

---

## Summary

NeuroLink's multimodal capabilities provide:

✅ **Universal input support** - Images, PDFs, CSV files
✅ **Provider flexibility** - Extensive provider compatibility matrix
✅ **Automatic format detection** - Smart file type recognition
✅ **Accessibility features** - Alt text support for images
✅ **Production-ready** - Battle-tested at enterprise scale
✅ **Developer-friendly** - Works seamlessly across CLI and SDK

**Next Steps:**

1. Review the [provider support matrix](#provider-support-matrix) to select the right provider
2. Try the [quick start examples](#quick-start) with your use case
3. Explore [advanced recipes](#examples--recipes) for complex scenarios
4. Check [troubleshooting](#troubleshooting) if you encounter issues

---

## Observability Guide

<!-- Source: features/observability.md -->

# Observability Guide

Enterprise-grade observability for AI operations with Langfuse and OpenTelemetry integration.

## Overview

NeuroLink provides comprehensive observability features for monitoring AI operations in production:

- **Langfuse Integration**: LLM-specific observability with token tracking, cost analysis, and trace visualization
- **OpenTelemetry Support**: Standard distributed tracing compatible with Jaeger, Zipkin, and other backends
- **External Provider Mode**: Integrate with existing OpenTelemetry instrumentation without conflicts
- **Context Propagation**: Automatic context enrichment with user, session, and custom metadata

## Quick Start

### Basic Langfuse Setup

```typescript

const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      baseUrl: "https://cloud.langfuse.com",
      environment: "production",
      release: "1.0.0",
    },
  },
});
```

### Environment Variables

```bash
# Langfuse credentials
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com  # or self-hosted

# Optional defaults
LANGFUSE_ENVIRONMENT=production
LANGFUSE_RELEASE=1.0.0
```

## Context Management

### Setting Context

Use `setLangfuseContext` to attach metadata to all spans in an async context:

```typescript

// With callback - context is scoped to callback execution
const result = await setLangfuseContext(
  {
    userId: "user-123",
    sessionId: "session-456",
    conversationId: "conv-789",
    requestId: "req-abc",
    traceName: "customer-support-chat",
    metadata: {
      feature: "support",
      tier: "premium",
      region: "us-east-1",
    },
  },
  async () => {
    return await neurolink.generate({ prompt: "Hello" });
  },
);

// Without callback - context applies to current execution
await setLangfuseContext({
  userId: "user-123",
  sessionId: "session-456",
});
```

### Context Fields

| Field            | Purpose                                    |
| ---------------- | ------------------------------------------ |
| `userId`         | Identify the user for per-user analytics   |
| `sessionId`      | Group traces within a user session         |
| `conversationId` | Group traces in a conversation thread      |
| `requestId`      | Correlate with application logs            |
| `traceName`      | Custom name in Langfuse UI                 |
| `metadata`       | Key-value pairs for filtering and analysis |

### Reading Context

```typescript

const context = getLangfuseContext();
if (context) {
  console.log(
    `User: ${context.userId}, Conversation: ${context.conversationId}`,
  );
}
```

## Operation Name Support

NeuroLink automatically detects operation names from AI SDK spans and includes them in trace names for better observability. This provides immediate visibility into what type of AI operation is being performed.

### Operation Name Configuration

By default, NeuroLink automatically detects operation names from:

- **Vercel AI SDK spans**: Spans starting with `ai.` (e.g., `ai.streamText`, `ai.generateText`, `ai.embed`)
- **OpenTelemetry GenAI conventions**: Standard semantic convention operations (`chat`, `embeddings`, `text_completion`)

```typescript
const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      autoDetectOperationName: true, // Enabled by default
    },
  },
});
```

When auto-detection is enabled, traces automatically include the detected operation:

- A `generateText` call becomes: `user@email.com:ai.generateText`
- A `streamText` call becomes: `user@email.com:ai.streamText`
- An embedding call becomes: `user@email.com:embeddings`

### Trace Name Formats

Control how trace names are constructed using the `traceNameFormat` option:

| Format                   | Example Output                 | Description                 |
| ------------------------ | ------------------------------ | --------------------------- |
| `"userId:operationName"` | `user@email.com:ai.streamText` | Default format, user first  |
| `"operationName:userId"` | `ai.streamText:user@email.com` | Operation first             |
| `"operationName"`        | `ai.streamText`                | Operation only              |
| `"userId"`               | `user@email.com`               | User only (legacy behavior) |
| Custom function          | Custom output                  | Full control over format    |

```typescript
// Global configuration with format
const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      autoDetectOperationName: true,
      traceNameFormat: "operationName:userId", // Operation first
    },
  },
});
```

### Custom Format Function

For full control over trace naming, provide a custom function:

```typescript
const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      autoDetectOperationName: true,
      traceNameFormat: (context) => {
        // Custom logic for trace name
        const env = process.env.NODE_ENV === "production" ? "prod" : "dev";
        if (context.operationName && context.userId) {
          return `[${env}] ${context.operationName} - ${context.userId}`;
        }
        return context.operationName || context.userId || "unknown";
      },
    },
  },
});
// Output: "[prod] ai.streamText - user@email.com"
```

### Context-Level Configuration

Override operation name behavior at the context level:

```typescript

// Explicit operation name (overrides auto-detection)
await setLangfuseContext(
  {
    userId: "user-123",
    operationName: "custom-rag-pipeline",
  },
  async () => {
    return await neurolink.generate({ prompt: "Hello" });
  },
);
// Trace name: "user-123:custom-rag-pipeline"

// Disable auto-detection for specific context
await setLangfuseContext(
  {
    userId: "user-123",
    autoDetectOperationName: false, // Override global setting
  },
  async () => {
    return await neurolink.generate({ prompt: "Hello" });
  },
);
// Trace name: "user-123" (legacy behavior)

// Enable auto-detection when globally disabled
await setLangfuseContext(
  {
    userId: "user-123",
    autoDetectOperationName: true, // Enable for this context
  },
  async () => {
    return await neurolink.generate({ prompt: "Hello" });
  },
);
// Trace name: "user-123:ai.generateText"
```

### Backward Compatibility

Operation name support is fully backward compatible:

1. **Explicit `traceName` takes priority**: If you set `traceName` in context, it always overrides auto-detected names:

   ```typescript
   await setLangfuseContext(
     {
       userId: "user-123",
       traceName: "my-custom-trace", // This takes priority
       operationName: "ignored-operation",
     },
     async () => {
       return await neurolink.generate({ prompt: "Hello" });
     },
   );
   // Trace name: "my-custom-trace"
   ```

2. **Disable for legacy behavior**: Set `autoDetectOperationName: false` to restore previous behavior:

   ```typescript
   const neurolink = new NeuroLink({
     observability: {
       langfuse: {
         enabled: true,
         publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
         secretKey: process.env.LANGFUSE_SECRET_KEY!,
         autoDetectOperationName: false, // Legacy behavior
       },
     },
   });
   // Trace names will be userId only, as before
   ```

3. **Existing code works unchanged**: Code using `traceName` continues to work exactly as before:

   ```typescript
   // This still works exactly as before
   await setLangfuseContext(
     {
       userId: "user-123",
       sessionId: "session-456",
       traceName: "customer-support-chat",
     },
     async () => {
       return await neurolink.generate({ prompt: "Hello" });
     },
   );
   // Trace name: "customer-support-chat"
   ```

### Priority Order

When determining the trace name, NeuroLink follows this priority order:

1. **Explicit `traceName`** in context (highest priority)
2. **Explicit `operationName`** in context + userId (formatted per `traceNameFormat`)
3. **Auto-detected operation name** from span + userId (if `autoDetectOperationName` is enabled)
4. **userId only** (fallback)

### Wrapper Span Support

When host applications create wrapper spans (trace-root spans) before AI operations, the standard auto-detection in `onStart()` fails because the AI SDK span does not exist yet at wrapper span creation time.

**The Problem:**

```typescript
// Host app creates wrapper span first
const span = tracer.startSpan("my-operation"); // onStart() runs here - no AI span yet
await neurolink.generate({ prompt: "Hello" }); // AI SDK creates "ai.generateText" span later
span.end();
```

At the time the wrapper span starts, there is no AI SDK span to detect the operation from, so the trace name would only include the userId.

**The Solution:**

NeuroLink automatically handles this by detecting operations from child spans and updating the trace name when the wrapper span ends:

1. **Wrapper span starts** - `onStart()` sets traceName to just userId (e.g., `user-123`)
2. **AI SDK span starts** - `onStart()` detects `ai.streamText` and stores operation in a map keyed by traceId
3. **Wrapper span ends** - `onEnd()` looks up the stored operation and updates traceName to `user-123:ai.streamText`

This behavior is automatic and requires no code changes in host applications. The trace name in Langfuse will correctly include both the userId and the detected operation name.

## Custom Spans

Create custom spans for detailed tracing:

```typescript

const tracer = getTracer("my-app", "1.0.0");

await setLangfuseContext({ userId: "user-123" }, async () => {
  const span = tracer.startSpan("process-request");
  try {
    // Add custom attributes
    span.setAttribute("request.type", "chat");
    span.setAttribute("model", "gpt-4");

    const result = await neurolink.generate({ prompt: "Hello" });

    span.setAttribute("tokens.total", result.usage?.totalTokens ?? 0);
    return result;
  } catch (error) {
    span.recordException(error as Error);
    throw error;
  } finally {
    span.end();
  }
});
```

## External TracerProvider Mode

If your application already has OpenTelemetry instrumentation (e.g., for HTTP, database tracing), use external provider mode to avoid "duplicate registration" errors:

### Configuration

```typescript

// 1. Initialize NeuroLink with external provider mode
const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      useExternalTracerProvider: true, // Don't create TracerProvider
    },
  },
});

// 2. Get NeuroLink's span processors
const neurolinkProcessors = getSpanProcessors();
// Returns: [ContextEnricher, LangfuseSpanProcessor]

// 3. Add to your existing OTEL setup

const jaegerExporter = new OTLPTraceExporter({
  url: "http://jaeger:4318/v1/traces",
});
const sdk = new NodeSDK({
  spanProcessors: [
    new BatchSpanProcessor(jaegerExporter),
    ...neurolinkProcessors,
  ],
});
sdk.start();
```

### Auto-Detection Mode

Alternatively, let NeuroLink auto-detect external providers:

```typescript
const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      autoDetectExternalProvider: true, // Auto-detect and skip if needed
    },
  },
});
```

### Available Exports

| Export                            | Description                                        |
| --------------------------------- | -------------------------------------------------- |
| `getSpanProcessors()`             | Returns `[ContextEnricher, LangfuseSpanProcessor]` |
| `createContextEnricher()`         | Factory for creating ContextEnricher instances     |
| `isUsingExternalTracerProvider()` | Check if in external provider mode                 |
| `getLangfuseSpanProcessor()`      | Get the LangfuseSpanProcessor directly             |
| `getTracerProvider()`             | Get the TracerProvider (null in external mode)     |

## Vercel AI SDK Integration

NeuroLink automatically captures GenAI semantic convention attributes from Vercel AI SDK's `experimental_telemetry`:

```typescript

await setLangfuseContext(
  { userId: "user-123", conversationId: "conv-456" },
  async () => {
    const result = await generateText({
      model: openai("gpt-4"),
      prompt: "Explain quantum computing",
      experimental_telemetry: {
        isEnabled: true,
        functionId: "explain-topic",
      },
    });
    // Token usage, model info, and finish reason automatically captured
    return result;
  },
);
```

### Captured Attributes

The `ContextEnricher` automatically reads these GenAI attributes:

- `gen_ai.system` - AI provider (openai, anthropic, etc.)
- `gen_ai.request.model` - Model requested
- `gen_ai.usage.input_tokens` - Input tokens used
- `gen_ai.usage.output_tokens` - Output tokens used
- `ai.finishReason` - Why generation finished

## Health Monitoring

Check Langfuse health status:

```typescript

const status = getLangfuseHealthStatus();
console.log({
  isHealthy: status.isHealthy,
  initialized: status.initialized,
  credentialsValid: status.credentialsValid,
  enabled: status.enabled,
  hasProcessor: status.hasProcessor,
  usingExternalProvider: status.usingExternalProvider,
  config: status.config,
});
```

## Flushing and Shutdown

Ensure all spans are sent before process exit:

```typescript

// Flush pending spans
await flushOpenTelemetry();

// Graceful shutdown (flushes and cleans up)
await shutdownOpenTelemetry();
```

### Graceful Shutdown Example

```typescript
process.on("SIGTERM", async () => {
  console.log("Shutting down...");
  await flushOpenTelemetry();
  await shutdownOpenTelemetry();
  process.exit(0);
});
```

## Best Practices

### 1. Always Set Context at Request Boundaries

```typescript
app.use(async (req, res, next) => {
  await setLangfuseContext({
    userId: req.user?.id,
    sessionId: req.session?.id,
    requestId: req.headers["x-request-id"],
  });
  next();
});
```

### 2. Use Metadata for Filtering

```typescript
await setLangfuseContext({
  metadata: {
    feature: "chat",
    experiment: "gpt4-vs-claude",
    abTestGroup: "B",
  },
});
```

### 3. Create Spans for Business Logic

```typescript
const tracer = getTracer("my-app");
const span = tracer.startSpan("retrieve-context");
try {
  const docs = await vectorStore.search(query);
  span.setAttribute("docs.count", docs.length);
} finally {
  span.end();
}
```

### 4. Handle Errors Properly

```typescript
const span = tracer.startSpan("ai-generation");
try {
  return await neurolink.generate({ prompt });
} catch (error) {
  span.recordException(error as Error);
  span.setStatus({ code: 2, message: (error as Error).message });
  throw error;
} finally {
  span.end();
}
```

## Troubleshooting

### Empty span processors from getSpanProcessors()

**Problem**: `getSpanProcessors()` returns an empty array.

**Solution**: Ensure NeuroLink is initialized before calling `getSpanProcessors()`:

```typescript
// Wrong - calling before initialization
const processors = getSpanProcessors(); // Returns []

// Correct - call after initialization
const neurolink = new NeuroLink({
  observability: { langfuse: { enabled: true, ... } }
});
const processors = getSpanProcessors(); // Returns [ContextEnricher, LangfuseSpanProcessor]
```

### Context not appearing in Langfuse traces

**Problem**: `userId`, `sessionId`, or other context fields don't appear in Langfuse.

**Solution**: Ensure `setLangfuseContext` is called in the same async context as your AI operations:

```typescript
// Wrong - context set outside the request handler
await setLangfuseContext({ userId: "user-123" });
// ... later in different async context
await neurolink.generate({ prompt: "Hello" }); // Context lost!

// Correct - use callback to scope context
await setLangfuseContext({ userId: "user-123" }, async () => {
  await neurolink.generate({ prompt: "Hello" }); // Context attached!
});
```

### Duplicate TracerProvider registration errors

**Problem**: Error like "TracerProvider already registered" or "duplicate registration".

**Solution**: Set `useExternalTracerProvider: true` in your config:

```typescript
const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      useExternalTracerProvider: true, // Add this!
    },
  },
});
```

### Spans not being sent to Langfuse

**Problem**: Traces don't appear in Langfuse dashboard.

**Solution**:

1. Verify credentials are correct
2. Check health status:

   ```typescript
   const status = getLangfuseHealthStatus();
   console.log(status); // Check isHealthy, credentialsValid
   ```

3. Ensure `flushOpenTelemetry()` is called before process exit
4. Check network connectivity to Langfuse endpoint

## API Reference

The following functions and types are exported from `@juspay/neurolink`:

**Functions:**

- `setLangfuseContext` - Set context for Langfuse traces
- `getLangfuseContext` - Get current Langfuse context
- `getTracer` - Get OpenTelemetry tracer instance
- `getSpanProcessors` - Get span processors for external TracerProvider integration

**Types:**

- `LangfuseConfig` - Configuration options for Langfuse integration
- `LangfuseSpanAttributes` - GenAI semantic convention attributes

## See Also

- [Telemetry Guide](/docs/observability/telemetry) - OpenTelemetry setup with Jaeger
- [Enterprise Monitoring](/docs/observability/health-monitoring) - Prometheus and Grafana setup
- [Analytics Reference](/docs/reference/analytics) - Token and cost tracking

---

## Office Documents Support

<!-- Source: features/office-documents.md -->

# Office Documents Support

NeuroLink provides seamless Office document support as a **multimodal input type** - attach DOCX, PPTX, and XLSX documents directly to your AI prompts for document analysis, data extraction, and content processing.

## Overview

Office document support in NeuroLink works as a native multimodal input - the system automatically processes Office files and passes them to the AI provider's document understanding capabilities. The system:

1. **Validates** Office files using magic byte detection and format verification
2. **Checks** provider compatibility (Bedrock, Vertex AI, Anthropic)
3. **Verifies** file size limits per provider
4. **Passes** documents directly to the provider's native document API
5. **Works** with providers that support native Office document processing

**Key Difference from PDF:** Similar to PDF files, Office documents are sent as binary documents to providers with native document support. This enables analysis of formatted text, tables, charts, and embedded content within Office files.

## Supported File Types

| Format                | Extension | MIME Type                                                                   | Description                                        |
| --------------------- | --------- | --------------------------------------------------------------------------- | -------------------------------------------------- |
| **Word Document**     | `.docx`   | `application/vnd.openxmlformats-officedocument.wordprocessingml.document`   | Microsoft Word documents with text, images, tables |
| **PowerPoint**        | `.pptx`   | `application/vnd.openxmlformats-officedocument.presentationml.presentation` | Presentations with slides, charts, images          |
| **Excel Spreadsheet** | `.xlsx`   | `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`         | Spreadsheets with data, formulas, charts           |

**Legacy Formats:**

| Format         | Extension | MIME Type                  | Support            |
| -------------- | --------- | -------------------------- | ------------------ |
| Word (Legacy)  | `.doc`    | `application/msword`       | Provider-dependent |
| Excel (Legacy) | `.xls`    | `application/vnd.ms-excel` | Provider-dependent |

## Quick Start

### SDK Usage

```typescript

const neurolink = new NeuroLink();

// Basic Word document analysis
const result = await neurolink.generate({
  input: {
    text: "Summarize the key points from this document",
    officeFiles: ["report.docx"],
  },
  provider: "bedrock",
});

// PowerPoint presentation analysis
const presentation = await neurolink.generate({
  input: {
    text: "Extract the main topics from each slide in this presentation",
    officeFiles: ["quarterly-review.pptx"],
  },
  provider: "bedrock",
});

// Excel spreadsheet analysis
const spreadsheet = await neurolink.generate({
  input: {
    text: "What are the top 5 products by revenue in this spreadsheet?",
    officeFiles: ["sales-data.xlsx"],
  },
  provider: "bedrock",
});

// Multiple document comparison
const comparison = await neurolink.generate({
  input: {
    text: "Compare the revenue figures between Q1 and Q2 reports",
    officeFiles: ["q1-report.docx", "q2-report.docx"],
  },
  provider: "bedrock",
});

// Auto-detect file types (mix Office, PDF, CSV, and images)
const multimodal = await neurolink.generate({
  input: {
    text: "Analyze all documents and provide a comprehensive summary",
    files: ["report.docx", "data.xlsx", "chart.png", "notes.pdf"],
  },
  provider: "bedrock",
});

// Streaming with Office documents
const stream = await neurolink.stream({
  input: {
    text: "Provide a detailed analysis of this contract document",
    officeFiles: ["contract.docx"],
  },
  provider: "bedrock",
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
```

### CLI Usage

```bash
# Attach Office files to your prompt
neurolink generate "Summarize this document" --office report.docx --provider bedrock

# Multiple Office files
neurolink generate "Compare these reports" --office q1.docx --office q2.docx --provider bedrock

# Excel spreadsheet analysis
neurolink generate "Analyze sales trends" --office sales.xlsx --provider bedrock

# PowerPoint presentation
neurolink generate "Extract key points from slides" --office presentation.pptx --provider bedrock

# Auto-detect file types
neurolink generate "Analyze all documents" --file report.docx --file data.xlsx --provider bedrock

# Stream mode with Office documents
neurolink stream "Explain this document in detail" --office document.docx --provider bedrock

# Batch processing with Office documents
echo "Summarize the key points" > prompts.txt
echo "Extract action items" >> prompts.txt
neurolink batch prompts.txt --office meeting-notes.docx --provider bedrock
```

## API Reference

### GenerateOptions

```typescript
type GenerateOptions = {
  input: {
    text: string;
    images?: Array; // Image files
    csvFiles?: Array; // CSV files (converted to text)
    pdfFiles?: Array; // PDF files (native binary)
    officeFiles?: Array; // Office files (native binary)
    files?: Array; // Auto-detect file types
  };

  // Provider selection (REQUIRED for Office files)
  provider: "bedrock" | "vertex" | "anthropic";

  // Office processing options
  officeOptions?: OfficeProcessorOptions;

  // Standard options
  model?: string;
  maxTokens?: number;
  temperature?: number;
  // ... other options
};
```

### StreamOptions

```typescript
type StreamOptions = {
  input: {
    text: string;
    officeFiles?: Array; // Same as GenerateOptions
    files?: Array;
  };

  provider: "bedrock" | "vertex" | "anthropic";
  // ... other options
};
```

### OfficeProcessorOptions

```typescript
type OfficeProcessorOptions = {
  /**
   * Provider to use for document processing
   * @default "bedrock"
   */
  provider?: string;

  /**
   * Maximum file size in MB
   * @default 5 (provider-dependent)
   */
  maxSizeMB?: number;

  /**
   * Whether to extract embedded images
   * @default true
   */
  extractImages?: boolean;

  /**
   * Whether to preserve document structure in output
   * @default true
   */
  preserveStructure?: boolean;
};
```

### File Input Formats

```typescript
// String path (relative or absolute)
officeFiles: ["./documents/report.docx"];
officeFiles: ["/absolute/path/to/data.xlsx"];

// Buffer (from fs.readFile or other source)

const docxBuffer = await readFile("document.docx");
officeFiles: [docxBuffer];

// Mixed types
officeFiles: ["report.docx", docxBuffer, "./presentation.pptx"];
```

## Provider Support

### Supported Providers

| Provider             | Max Size | DOCX | PPTX | XLSX | DOC | XLS | Notes                                |
| -------------------- | -------- | ---- | ---- | ---- | --- | --- | ------------------------------------ |
| **AWS Bedrock**      | 5 MB     | ✅   | ✅   | ✅   | ✅  | ✅  | Full native support via Converse API |
| **Google Vertex AI** | 5 MB     | ✅   | ⚠️   | ✅   | ⚠️  | ⚠️  | Best for DOCX and XLSX               |
| **Anthropic Claude** | 5 MB     | ✅   | ⚠️   | ✅   | ⚠️  | ⚠️  | Via document API                     |

### Unsupported Providers

The following providers **do not currently support** native Office document processing:

- OpenAI (GPT-4o)
- Google AI Studio
- Azure OpenAI
- Ollama (local models)
- LiteLLM
- Mistral AI
- Hugging Face

**Error Message for Unsupported Providers:**

```
Office files are not currently supported with openai provider.
Supported providers: AWS Bedrock, Google Vertex AI, Anthropic
Current provider: openai

Options:
1. Switch to a supported provider (--provider bedrock or --provider vertex)
2. Convert your Office document to PDF first
3. Extract text content manually before processing
```

### Provider-Specific Features

#### AWS Bedrock (Recommended)

Bedrock offers the most comprehensive Office document support via the Converse API:

```typescript
await neurolink.generate({
  input: {
    text: "Analyze this quarterly report",
    officeFiles: ["q3-report.docx"],
  },
  provider: "bedrock",
  model: "anthropic.claude-3-5-sonnet-20241022-v2:0",
});
```

**Supported Document Formats in Bedrock Converse API:**

- Office formats: `doc`, `docx`, `xls`, `xlsx`
- Other formats: `pdf`, `csv`, `html`, `txt`, `md`

#### Google Vertex AI

```typescript
await neurolink.generate({
  input: {
    text: "Extract key metrics from this spreadsheet",
    officeFiles: ["financial-data.xlsx"],
  },
  provider: "vertex",
  model: "gemini-1.5-pro",
});
```

#### Anthropic Claude

```typescript
await neurolink.generate({
  input: {
    text: "Summarize this contract document",
    officeFiles: ["contract.docx"],
  },
  provider: "anthropic",
  model: "claude-3-5-sonnet-20241022",
});
```

## Features

### 1. Auto-Detection

Use the `files` array for automatic file type detection:

```typescript
// Automatically detects Office, PDF, CSV, and image types
await neurolink.generate({
  input: {
    text: "Analyze all these documents",
    files: [
      "report.docx", // Auto-detected as Word document
      "data.xlsx", // Auto-detected as Excel spreadsheet
      "slides.pptx", // Auto-detected as PowerPoint
      "summary.pdf", // Auto-detected as PDF
      "chart.png", // Auto-detected as image
    ],
  },
  provider: "bedrock",
});
```

### 2. Multiple Document Types

Process multiple Office documents in a single request:

```typescript
// Compare documents
await neurolink.generate({
  input: {
    text: "Compare version 1 and version 2 of the proposal. What changed?",
    officeFiles: ["proposal-v1.docx", "proposal-v2.docx"],
  },
  provider: "bedrock",
});

// Cross-format analysis
await neurolink.generate({
  input: {
    text: "Verify the numbers in the report match the spreadsheet data",
    officeFiles: ["report.docx", "source-data.xlsx"],
  },
  provider: "bedrock",
});
```

### 3. Mixed Multimodal Inputs

Combine Office documents with other file types:

```typescript
// Office + CSV analysis
await neurolink.generate({
  input: {
    text: "Compare the report summary with the raw data",
    officeFiles: ["summary-report.docx"],
    csvFiles: ["raw-data.csv"],
  },
  provider: "bedrock",
});

// Office + PDF + Image verification
await neurolink.generate({
  input: {
    text: "Verify consistency across all documents",
    officeFiles: ["report.docx"],
    pdfFiles: ["signed-contract.pdf"],
    images: ["org-chart.png"],
  },
  provider: "bedrock",
});
```

## Type Definitions

### OfficeFileType

```typescript
/**
 * Supported Office document types
 */
type OfficeFileType = "docx" | "pptx" | "xlsx" | "doc" | "xls";
```

### OfficeProcessingResult

```typescript
/**
 * Result of Office document processing
 */
type OfficeProcessingResult = {
  type: "office";
  content: Buffer;
  mimeType: string;
  metadata: {
    confidence: number;
    size: number;
    filename?: string;
    format: OfficeFileType;
    provider: string;
    estimatedPages?: number;
    hasEmbeddedImages?: boolean;
    hasCharts?: boolean;
  };
};
```

### OfficeProviderConfig

```typescript
/**
 * Provider configuration for Office document support
 */
type OfficeProviderConfig = {
  maxSizeMB: number;
  supportedFormats: OfficeFileType[];
  supportsNative: boolean;
  apiType: "document" | "converse" | "unsupported";
};
```

## Error Handling

### Error Types

```typescript
class OfficeProcessingError extends Error {
  file: string;
  format?: string;
  provider?: string;
  originalError?: Error;
}

class OfficeValidationError extends OfficeProcessingError {
  // Thrown when file format validation fails
  validationType: "format" | "size" | "corruption";
}

class OfficeProviderError extends OfficeProcessingError {
  // Thrown when provider doesn't support Office documents
  supportedProviders: string[];
}

class OfficeSizeError extends OfficeProcessingError {
  // Thrown when file exceeds size limits
  maxSize: number;
  actualSize: number;
}
```

### Error Handling Patterns

```typescript

  OfficeProcessingError,
  OfficeValidationError,
  OfficeProviderError,
  OfficeSizeError,
} from "@juspay/neurolink";

try {
  const result = await neurolink.generate({
    input: {
      text: "Analyze this document",
      officeFiles: ["document.docx"],
    },
    provider: "bedrock",
  });
} catch (error) {
  if (error instanceof OfficeSizeError) {
    console.error(
      `File too large: ${error.actualSize}MB (max: ${error.maxSize}MB)`,
    );
    console.error("Try: --provider google-ai-studio for larger files");
  } else if (error instanceof OfficeProviderError) {
    console.error(`Provider ${error.provider} doesn't support Office files`);
    console.error(
      `Supported providers: ${error.supportedProviders.join(", ")}`,
    );
  } else if (error instanceof OfficeValidationError) {
    console.error(`Invalid Office file: ${error.message}`);
    console.error(`Validation type: ${error.validationType}`);
  } else if (error instanceof OfficeProcessingError) {
    console.error(`Office processing failed: ${error.message}`);
  } else {
    console.error("Unexpected error:", error);
  }
}
```

## Metadata Fields

When processing Office documents, the following metadata is available:

| Field               | Type             | Description                      |
| ------------------- | ---------------- | -------------------------------- |
| `confidence`        | `number`         | Detection confidence (0-100)     |
| `size`              | `number`         | File size in bytes               |
| `filename`          | `string`         | Original filename                |
| `format`            | `OfficeFileType` | Detected Office format           |
| `provider`          | `string`         | Provider used for processing     |
| `estimatedPages`    | `number`         | Estimated page/slide/sheet count |
| `hasEmbeddedImages` | `boolean`        | Whether document contains images |
| `hasCharts`         | `boolean`        | Whether document contains charts |

### Accessing Metadata

```typescript
const result = await neurolink.generate({
  input: {
    text: "Analyze this document",
    officeFiles: ["report.docx"],
  },
  provider: "bedrock",
});

// Metadata available in result
console.log(result.metadata?.officeFiles?.[0]);
// {
//   format: "docx",
//   size: 245760,
//   estimatedPages: 12,
//   hasEmbeddedImages: true,
//   hasCharts: false
// }
```

## Best Practices

### 1. Choose the Right Provider

```typescript
// For comprehensive Office support
provider: "bedrock"; // Best overall Office document support

// For Word documents primarily
provider: "vertex"; // Good DOCX support

// For enterprise deployments
provider: "bedrock"; // AWS infrastructure integration
```

### 2. Optimize File Size

```typescript
// Check file size before processing

async function validateOfficeFile(filePath: string, provider: string) {
  const stats = await stat(filePath);
  const sizeMB = stats.size / (1024 * 1024);

  const limits: Record = {
    bedrock: 5,
    vertex: 5,
    anthropic: 5,
  };

  if (sizeMB > (limits[provider] || 5)) {
    throw new Error(
      `File ${filePath} (${sizeMB.toFixed(2)}MB) exceeds ${limits[provider]}MB limit for ${provider}`,
    );
  }

  console.log(`✓ File validated: ${sizeMB.toFixed(2)}MB`);
}

await validateOfficeFile("report.docx", "bedrock");
```

### 3. Use Streaming for Large Documents

```typescript
// For long documents, use streaming to get results faster
const stream = await neurolink.stream({
  input: {
    text: "Provide a comprehensive analysis of this 100-page document",
    officeFiles: ["long-report.docx"],
  },
  provider: "bedrock",
  maxTokens: 8000,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
```

### 4. Be Specific in Your Prompts

```typescript
// ❌ Too vague
"Tell me about this document";

// ✅ Specific and actionable
"Extract all action items with their due dates from this meeting notes document";
"List the top 5 products by revenue from the sales spreadsheet";
"Summarize the key points from each slide in this presentation";
"Compare the financial projections between these two quarterly reports";
```

## Limitations

### File Format Requirements

- **Must** be valid Office Open XML format (`.docx`, `.pptx`, `.xlsx`)
- **Must** be within provider size limits (typically 5MB)
- **Must** not be password-protected or encrypted
- Legacy formats (`.doc`, `.xls`, `.ppt`) have limited support

### Provider Limitations

| Limitation          | Description                 | Workaround                              |
| ------------------- | --------------------------- | --------------------------------------- |
| Size limits         | Most providers limit to 5MB | Split large documents or convert to PDF |
| Password protection | Not supported               | Remove password before processing       |
| Macros              | VBA macros are ignored      | N/A - security feature                  |
| External links      | May not be resolved         | Embed content instead                   |
| Complex formatting  | Some formatting may be lost | Focus on content extraction             |

### Token Usage

Office documents consume significant tokens. The following are approximate estimates that may vary by provider and content complexity:

- **Simple DOCX**: ~500-1,000 tokens per page
- **Complex DOCX** (with images/tables): ~1,500-3,000 tokens per page
- **XLSX**: ~100-500 tokens per sheet (depends on data density)
- **PPTX**: ~200-1,000 tokens per slide

> **Note:** Token estimates are based on typical document content. Actual usage may vary depending on document complexity, provider implementation, and model-specific tokenization.

**Tip:** Set appropriate `maxTokens` for Office document analysis:

```typescript
await neurolink.generate({
  input: {
    text: "Summarize this 20-page document",
    officeFiles: ["document.docx"],
  },
  provider: "bedrock",
  maxTokens: 4000, // Allow enough tokens for response
});
```

## Troubleshooting

### Error: "Office files are not currently supported"

**Problem:** Using unsupported provider (OpenAI, Ollama, etc.)

**Solution:**

```bash
# Change provider to supported one
neurolink generate "Analyze document" --office doc.docx --provider bedrock

# Or use auto-detection with correct provider
neurolink generate "Analyze document" --file doc.docx --provider vertex
```

### Error: "File size exceeds limit"

**Problem:** File too large for provider (>5MB for most providers)

**Solution:**

```bash
# Option 1: Split the document into smaller parts
# Option 2: Convert to PDF first (may have larger size limits)
# Option 3: Extract key sections manually
```

### Error: "Invalid Office file format"

**Problem:** File is not a valid Office Open XML format or corrupted

**Solution:**

```bash
# Verify file is valid Office format
file document.docx  # Should show "Microsoft Word 2007+"

# Check file extension matches actual format
# Ensure file is not password-protected
```

### Error: "Provider not specified"

**Problem:** No provider selected (Office files require explicit provider)

**Solution:**

```typescript
// ❌ Missing provider
await neurolink.generate({
  input: {
    text: "Analyze",
    officeFiles: ["doc.docx"],
  },
});

// ✅ Specify provider
await neurolink.generate({
  input: {
    text: "Analyze",
    officeFiles: ["doc.docx"],
  },
  provider: "bedrock", // Required for Office files
});
```

### Office Content Not Being Analyzed

**Problem:** AI says "I cannot read the document" even though file is attached

**Common Causes:**

1. **Wrong provider**: Make sure using supported provider
2. **File path wrong**: Verify file exists at specified path
3. **Buffer issue**: If using Buffer, ensure it's valid Office data
4. **Format mismatch**: Ensure file extension matches actual format

**Debug:**

```typescript

// Verify file exists
await stat("document.docx"); // Throws if not found

// Verify it's a valid Office file (DOCX is ZIP-based)
const buffer = await readFile("document.docx");
const header = buffer.slice(0, 4);
// DOCX files start with ZIP magic bytes: PK\x03\x04
console.log("Magic bytes:", header.toString("hex")); // Should be "504b0304"

// Check size
const sizeMB = buffer.length / (1024 * 1024);
console.log("Size:", sizeMB.toFixed(2), "MB");
```

## Migration Guide

### Migrating from Manual Document Processing

If you were previously using manual document extraction:

**Before (Manual Processing):**

```typescript
// Old approach: Extract text manually

const docBuffer = readFileSync("report.docx");
const { value: text } = await mammoth.extractRawText({ buffer: docBuffer });

const result = await provider.generate({
  input: { text: `Analyze this document:\n\n${text}` },
});
```

**After (Native Support):**

```typescript
// New approach: Direct document support
const result = await neurolink.generate({
  input: {
    text: "Analyze this document",
    officeFiles: ["report.docx"],
  },
  provider: "bedrock",
});
```

### Migrating from PDF-First Workflow

If you were converting Office files to PDF first:

**Before (PDF Conversion):**

```typescript
// Old approach: Convert to PDF first

await convertToPdf("report.docx", "report.pdf");

const result = await neurolink.generate({
  input: {
    text: "Analyze this document",
    pdfFiles: ["report.pdf"],
  },
  provider: "vertex",
});
```

**After (Direct Office Support):**

```typescript
// New approach: Direct Office document support
const result = await neurolink.generate({
  input: {
    text: "Analyze this document",
    officeFiles: ["report.docx"], // No conversion needed
  },
  provider: "bedrock",
});
```

### API Changes Summary

| Previous API               | New API                     | Notes                            |
| -------------------------- | --------------------------- | -------------------------------- |
| Manual text extraction     | `officeFiles: [...]`        | Native document support          |
| PDF conversion workflow    | Direct Office support       | No conversion needed             |
| Provider-specific handling | Unified `officeFiles` array | Works across supported providers |
| Custom MIME type handling  | Auto-detection              | Format automatically detected    |

## Usage Examples

Here are complete working examples for common use cases:

### Basic Word Document Analysis

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: {
    text: "Summarize the key points from this document",
    officeFiles: ["meeting-notes.docx"],
  },
  provider: "bedrock",
});

console.log(result.content);
```

### Excel Spreadsheet Data Extraction

```typescript
const result = await neurolink.generate({
  input: {
    text: "What are the top 5 products by revenue?",
    officeFiles: ["sales-data.xlsx"],
  },
  provider: "bedrock",
});
```

### PowerPoint Presentation Summarization

```typescript
const result = await neurolink.generate({
  input: {
    text: "Create an executive summary of this presentation",
    officeFiles: ["quarterly-review.pptx"],
  },
  provider: "bedrock",
});
```

### Multiple Document Comparison

```typescript
const result = await neurolink.generate({
  input: {
    text: "Compare Q1 and Q2 reports and highlight the key differences",
    officeFiles: ["q1-report.docx", "q2-report.docx"],
  },
  provider: "bedrock",
});
```

### Mixed File Type Analysis

```typescript
const result = await neurolink.generate({
  input: {
    text: "Analyze all documents and provide a comprehensive summary",
    files: ["report.docx", "data.xlsx", "chart.png", "notes.pdf"],
  },
  provider: "bedrock",
});
```

## Related Features

- [Multimodal Chat](/docs/features/multimodal-chat) - Overview of multimodal capabilities
- [PDF Support](/docs/features/pdf-support) - PDF document processing
- [CSV Support](/docs/features/csv-support) - CSV file processing

## Technical Details

### Office Document Processing Flow

```
1. User provides Office file(s)
   ↓
2. FileDetector validates format (magic bytes for ZIP/Office Open XML)
   ↓
3. OfficeProcessor checks provider support
   ↓
4. Validate size limits
   ↓
5. Pass Buffer to messageBuilder
   ↓
6. Format as provider-specific document type
   ↓
7. Send to provider's native document API
   ↓
8. Provider processes document content
   ↓
9. Return AI response
```

### Implementation Files

- **`src/lib/utils/officeProcessor.ts`** - Office document validation and processing
- **`src/lib/utils/fileDetector.ts`** - File type detection (includes Office formats)
- **`src/lib/utils/messageBuilder.ts`** - Multimodal message construction
- **`src/lib/types/fileTypes.ts`** - Office type definitions
- **`src/cli/factories/commandFactory.ts`** - CLI `--office` flag handling

## Performance Considerations

### Processing Speed

- **Small DOCX (\5MB)**: ~5-15 seconds
- **Complex PPTX**: ~5-20 seconds (depends on slide count)
- **Data-heavy XLSX**: ~3-10 seconds

### Memory Usage

- Office files loaded as Buffers in memory
- Large files may impact performance
- Consider processing large files in batches

## Future Enhancements

Planned features for Office document support:

- **OpenAI Support**: Document-to-text conversion for GPT models
- **Azure OpenAI**: Native document support when available
- **Page Selection**: Analyze specific pages/slides/sheets only
- **Content Extraction**: Extract specific elements (tables, charts)
- **Template Processing**: Fill document templates with AI-generated content
- **Legacy Format Support**: Improved `.doc`, `.xls`, `.ppt` support

## Feedback and Support

Found a bug or have a feature request? Please:

1. Check existing issues on GitHub
2. Create a new issue with:
   - Provider used
   - Office file details (format, size)
   - Error message or unexpected behavior
   - Sample code (if possible)

## Changelog

### Version 8.3.0+

- ✅ Initial Office document support for DOCX, PPTX, XLSX
- ✅ AWS Bedrock native support via Converse API
- ✅ Google Vertex AI support
- ✅ Anthropic Claude support
- ✅ Auto-detection via `--file` flag
- ✅ Multiple document processing
- ✅ Size limit validation
- ✅ Comprehensive error messages
- ✅ CLI and SDK integration
- ✅ Streaming support
- ✅ Mixed multimodal inputs (Office + PDF + CSV + images)

---

**Next:** [Multimodal Chat Guide](/docs/features/multimodal-chat) | [PDF Support](/docs/features/pdf-support) | [CSV Support](/docs/features/csv-support)

---

## PDF File Support

<!-- Source: features/pdf-support.md -->

# PDF File Support

NeuroLink provides seamless PDF file support as a **multimodal input type** - attach PDF documents directly to your AI prompts for document analysis, information extraction, and content processing.

## Overview

PDF support in NeuroLink works as a native multimodal input - the system automatically processes PDF files and passes them directly to the AI provider's vision/document understanding capabilities. The system:

1. **Validates** PDF files using magic byte detection and format verification
2. **Checks** provider compatibility (Vertex AI, Anthropic, Bedrock, AI Studio)
3. **Verifies** file size and page limits per provider
4. **Passes** PDF directly to the provider's native document API
5. **Works** with providers that support native PDF processing

**Key Difference from CSV:** Unlike CSV files which are converted to text, PDFs are sent as binary documents to providers with native PDF support. This enables visual analysis of charts, tables, images, and formatted text within PDFs.

## Quick Start

### SDK Usage

```typescript

const neurolink = new NeuroLink();

// Basic PDF analysis
const result = await neurolink.generate({
  input: {
    text: "What is the total revenue mentioned in this financial report?",
    pdfFiles: ["financial-report-q3.pdf"],
  },
  provider: "vertex", // or "anthropic", "bedrock", "google-ai-studio"
});

// Multiple PDF comparison
const comparison = await neurolink.generate({
  input: {
    text: "Compare the revenue figures between Q1 and Q2 reports. What's the growth percentage?",
    pdfFiles: ["q1-report.pdf", "q2-report.pdf"],
  },
  provider: "vertex",
});

// Auto-detect file types (mix PDF, CSV, and images)
const multimodal = await neurolink.generate({
  input: {
    text: "Analyze the financial data in the PDF, compare with the CSV spreadsheet, and verify against the chart image",
    files: ["report.pdf", "data.csv", "chart.png"], // Auto-detects each type
  },
  provider: "vertex",
});

// Streaming with PDF
const stream = await neurolink.stream({
  input: {
    text: "Provide a detailed summary of this contract, highlighting key terms and obligations",
    pdfFiles: ["contract.pdf"],
  },
  provider: "anthropic",
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
```

### CLI Usage

```bash
# Attach PDF files to your prompt
neurolink generate "Summarize this invoice" --pdf invoice.pdf --provider vertex

# Multiple PDF files
neurolink generate "Compare these contracts" --pdf contract1.pdf --pdf contract2.pdf --provider anthropic

# Auto-detect file types
neurolink generate "Analyze report and data" --file report.pdf --file data.csv --provider vertex

# Stream mode with PDF
neurolink stream "Explain this document in detail" --pdf document.pdf --provider bedrock

# Batch processing with PDF
echo "Summarize the key points" > prompts.txt
echo "Extract all monetary values" >> prompts.txt
neurolink batch prompts.txt --pdf invoice.pdf --provider vertex
```

## API Reference

### GenerateOptions

```typescript
type GenerateOptions = {
  input: {
    text: string;
    images?: Array; // Image files
    csvFiles?: Array; // CSV files (converted to text)
    pdfFiles?: Array; // PDF files (native binary)
    files?: Array; // Auto-detect file types
  };

  // Provider selection (REQUIRED for PDF)
  provider: "vertex" | "anthropic" | "bedrock" | "google-ai-studio";

  // Standard options
  model?: string;
  maxTokens?: number;
  temperature?: number;
  // ... other options
};
```

### StreamOptions

```typescript
type StreamOptions = {
  input: {
    text: string;
    pdfFiles?: Array; // Same as GenerateOptions
    files?: Array;
  };

  provider: "vertex" | "anthropic" | "bedrock" | "google-ai-studio";
  // ... other options
};
```

### File Input Formats

```typescript
// String path (relative or absolute)
pdfFiles: ["./documents/invoice.pdf"];
pdfFiles: ["/absolute/path/to/report.pdf"];

// Buffer (from fs.readFile or other source)

const pdfBuffer = await readFile("document.pdf");
pdfFiles: [pdfBuffer];

// Mixed types
pdfFiles: ["invoice.pdf", pdfBuffer, "./report.pdf"];
```

## Provider Support

### Supported Providers

| Provider              | Max Size | Max Pages | API Type  | Notes                       |
| --------------------- | -------- | --------- | --------- | --------------------------- |
| **Google Vertex AI**  | 5 MB     | 100       | Document  | Recommended for general use |
| **Anthropic Claude**  | 5 MB     | 100       | Document  | Best for detailed analysis  |
| **AWS Bedrock**       | 5 MB     | 100       | Document  | Enterprise deployments      |
| **Google AI Studio**  | 2000 MB  | 100       | Files API | Largest file support        |
| **OpenAI**            | 10 MB    | 100       | Files API | GPT-4o, GPT-4o-mini, o1     |
| **LiteLLM**           | 10 MB    | 100       | Proxy     | Depends on upstream model   |
| **OpenAI Compatible** | 10 MB    | 100       | Proxy     | Depends on upstream model   |

### Unsupported Providers

The following providers **do not currently support** native PDF processing:

- Azure OpenAI
- Ollama (local models)

**Error Message for Unsupported Providers:**

```
PDF files are not currently supported with azure-openai provider.
Supported providers: Google Vertex AI, Anthropic, AWS Bedrock, Google AI Studio, OpenAI
Current provider: azure-openai

Options:
1. Switch to a supported provider (--provider vertex or --provider openai)
2. Convert your PDF to text manually
3. Wait for future update (Azure OpenAI conversion coming soon)
```

### Provider-Specific Features

#### Google Vertex AI

```typescript
await neurolink.generate({
  input: {
    text: "Analyze this report",
    pdfFiles: ["report.pdf"],
  },
  provider: "vertex",
  model: "gemini-1.5-pro", // Best for document understanding
});
```

#### Anthropic Claude

```typescript
await neurolink.generate({
  input: {
    text: "Extract all invoice details",
    pdfFiles: ["invoice.pdf"],
  },
  provider: "anthropic",
  model: "claude-3-5-sonnet-20241022", // Latest model
});
```

#### AWS Bedrock (with Converse API)

```typescript
await neurolink.generate({
  input: {
    text: "Summarize this contract",
    pdfFiles: ["contract.pdf"],
  },
  provider: "bedrock",
  // Visual PDF analysis with citations
  // Text-only: ~1,000 tokens/3 pages
  // Visual: ~7,000 tokens/3 pages
});
```

#### Google AI Studio

```typescript
await neurolink.generate({
  input: {
    text: "Analyze this large document",
    pdfFiles: ["large-report.pdf"], // Up to 2GB!
  },
  provider: "google-ai-studio",
});
```

## Features

### 1. Auto-Detection

Use the `files` array for automatic file type detection:

```typescript
// Automatically detects PDF, CSV, and image types
await neurolink.generate({
  input: {
    text: "Analyze all these documents",
    files: [
      "report.pdf", // Auto-detected as PDF
      "data.csv", // Auto-detected as CSV
      "chart.png", // Auto-detected as image
    ],
  },
  provider: "vertex",
});
```

### 2. Multiple PDF Files

Process multiple PDFs in a single request:

```typescript
// Compare documents
await neurolink.generate({
  input: {
    text: "Compare version 1 and version 2 of the contract. What changed?",
    pdfFiles: ["contract-v1.pdf", "contract-v2.pdf"],
  },
  provider: "anthropic",
});

// Analyze related documents
await neurolink.generate({
  input: {
    text: "Summarize insights from all quarterly reports",
    pdfFiles: [
      "q1-report.pdf",
      "q2-report.pdf",
      "q3-report.pdf",
      "q4-report.pdf",
    ],
  },
  provider: "vertex",
});
```

### 3. Size and Page Limits

Each provider has specific limits:

```typescript
// Example: Checking file size before upload

const fileStats = await stat("large-document.pdf");
const sizeMB = fileStats.size / (1024 * 1024);

if (sizeMB > 5) {
  // Use Google AI Studio for large files
  provider = "google-ai-studio"; // Supports up to 2GB
} else {
  // Use Vertex AI for normal files
  provider = "vertex"; // Up to 5MB
}
```

### 4. Mixed Multimodal Inputs

Combine PDFs with other file types:

```typescript
// PDF + CSV analysis
await neurolink.generate({
  input: {
    text: "Compare the PDF report with the CSV data. Are there any discrepancies?",
    pdfFiles: ["report.pdf"],
    csvFiles: ["raw-data.csv"],
  },
  provider: "vertex",
});

// PDF + Image verification
await neurolink.generate({
  input: {
    text: "Does the chart in the image match the data in the PDF report?",
    pdfFiles: ["report.pdf"],
    images: ["chart.png"],
  },
  provider: "vertex",
});

// All three types
await neurolink.generate({
  input: {
    text: "Analyze the PDF document, compare with CSV data, and verify against the screenshot",
    files: ["document.pdf", "data.csv", "screenshot.png"],
  },
  provider: "vertex",
});
```

## Best Practices

### 1. Choose the Right Provider

```typescript
// For detailed document analysis
provider: "anthropic"; // Claude excels at understanding complex documents

// For large files (>5MB)
provider: "google-ai-studio"; // Supports up to 2GB

// For general use with good balance
provider: "vertex"; // Gemini 1.5 Pro recommended

// For enterprise/on-premises
provider: "bedrock"; // AWS infrastructure
```

### 2. Optimize File Size

```typescript
// Check file size before processing

async function validatePDF(filePath: string, provider: string) {
  const stats = await stat(filePath);
  const sizeMB = stats.size / (1024 * 1024);

  const limits = {
    vertex: 5,
    anthropic: 5,
    bedrock: 5,
    "google-ai-studio": 2000,
  };

  if (sizeMB > limits[provider]) {
    throw new Error(
      `File ${filePath} (${sizeMB.toFixed(2)}MB) exceeds ${limits[provider]}MB limit for ${provider}`,
    );
  }

  console.log(`✓ File validated: ${sizeMB.toFixed(2)}MB`);
}

await validatePDF("report.pdf", "vertex");
```

### 3. Handle Errors Gracefully

```typescript
try {
  const result = await neurolink.generate({
    input: {
      text: "Analyze this PDF",
      pdfFiles: ["document.pdf"],
    },
    provider: "vertex",
  });
} catch (error) {
  if (error.message.includes("not currently supported")) {
    console.error("PDF not supported by this provider. Try: --provider vertex");
  } else if (error.message.includes("exceeds")) {
    console.error("File too large. Try: --provider google-ai-studio");
  } else if (error.message.includes("Invalid PDF")) {
    console.error("File is not a valid PDF format");
  } else {
    console.error("Error:", error.message);
  }
}
```

### 4. Use Streaming for Large Documents

```typescript
// For long documents, use streaming to get results faster
const stream = await neurolink.stream({
  input: {
    text: "Provide a detailed analysis of this 50-page report",
    pdfFiles: ["long-report.pdf"],
  },
  provider: "vertex",
  maxTokens: 8000,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
```

### 5. Be Specific in Your Prompts

```typescript
// ❌ Too vague
"Tell me about this PDF";

// ✅ Specific and actionable
"Extract all monetary values from this invoice and sum them up";
"List all action items mentioned in this meeting notes PDF";
"Compare the Q1 and Q2 revenue figures from these financial reports";
"Find any mentions of security vulnerabilities in this audit report";
```

## Limitations

### File Format Requirements

- **Must** be valid PDF files (starting with `%PDF-` magic bytes)
- **Must** be within provider size limits (5MB for most, 2GB for AI Studio)
- **Must** have valid PDF structure (not corrupted)

### Provider Limitations

```typescript
// ❌ Will fail with unsupported providers
await neurolink.generate({
  input: {
    text: "Analyze this PDF",
    pdfFiles: ["doc.pdf"],
  },
  provider: "azure-openai", // Not supported
});

// ✅ Use supported providers
await neurolink.generate({
  input: {
    text: "Analyze this PDF",
    pdfFiles: ["doc.pdf"],
  },
  provider: "openai", // Supported (GPT-4o, GPT-4o-mini, o1)
});
```

### Page Limits

All providers limit PDF to **100 pages maximum**:

```typescript
// Warning logged for large documents
// [PDF] PDF appears to have 150+ pages. vertex supports up to 100 pages.
```

### Token Usage

PDFs consume significant tokens:

- **Text-only mode**: ~1,000 tokens per 3 pages
- **Visual mode**: ~7,000 tokens per 3 pages

**Tip:** Set appropriate `maxTokens` for PDF analysis:

```typescript
await neurolink.generate({
  input: {
    text: "Summarize this 10-page document",
    pdfFiles: ["document.pdf"],
  },
  provider: "vertex",
  maxTokens: 4000, // ~3,000 tokens for PDF + 1,000 for response
});
```

## Troubleshooting

### Error: "PDF files are not currently supported"

**Problem:** Using unsupported provider (Azure OpenAI, Ollama, etc.)

**Solution:**

```bash
# Change provider to supported one
neurolink generate "Analyze PDF" --pdf doc.pdf --provider vertex

# Or use auto-detection with correct provider
neurolink generate "Analyze PDF" --file doc.pdf --provider anthropic
```

### Error: "PDF size exceeds limit"

**Problem:** File too large for provider (>5MB for most providers)

**Solution:**

```bash
# Switch to Google AI Studio (2GB limit)
neurolink generate "Analyze PDF" --pdf large-doc.pdf --provider google-ai-studio

# Or compress PDF externally before upload
```

### Error: "Invalid PDF file format"

**Problem:** File is not a valid PDF or corrupted

**Solution:**

```bash
# Verify file is valid PDF
file document.pdf  # Should show "PDF document"

# Check magic bytes
head -c 5 document.pdf  # Should show "%PDF-"

# Try re-saving or repairing PDF
```

### Error: "Provider not specified"

**Problem:** No provider selected (PDF requires explicit provider)

**Solution:**

```typescript
// ❌ Missing provider
await neurolink.generate({
  input: {
    text: "Analyze",
    pdfFiles: ["doc.pdf"],
  },
});

// ✅ Specify provider
await neurolink.generate({
  input: {
    text: "Analyze",
    pdfFiles: ["doc.pdf"],
  },
  provider: "vertex",
});
```

### PDF Content Not Being Analyzed

**Problem:** AI says "I cannot read the PDF" even though file is attached

**Common Causes:**

1. **Wrong provider**: Make sure using supported provider
2. **File path wrong**: Verify file exists at specified path
3. **Buffer issue**: If using Buffer, ensure it's valid PDF data

**Debug:**

```typescript

// Verify file exists
await stat("document.pdf"); // Throws if not found

// Verify it's a valid PDF
const buffer = await readFile("document.pdf");
const header = buffer.toString("utf-8", 0, 5);
console.log("PDF header:", header); // Should be "%PDF-"

// Check size
const sizeMB = buffer.length / (1024 * 1024);
console.log("Size:", sizeMB.toFixed(2), "MB");
```

## Advanced Usage

### Custom Provider Configurations

```typescript
// AWS Bedrock with Converse API
await neurolink.generate({
  input: {
    text: "Analyze with citations",
    pdfFiles: ["document.pdf"],
  },
  provider: "bedrock",
  model: "anthropic.claude-3-sonnet-20240229-v1:0",
  // Bedrock automatically enables citations for visual PDF analysis
});
```

### Combining Multiple File Types

```typescript
// Real-world example: Financial analysis
await neurolink.generate({
  input: {
    text: `
      1. Review the PDF financial report for Q3 results
      2. Compare with the raw transaction data in the CSV
      3. Verify the summary chart matches the data
      4. Highlight any discrepancies
    `,
    pdfFiles: ["q3-financial-report.pdf"],
    csvFiles: ["q3-transactions.csv"],
    images: ["q3-summary-chart.png"],
  },
  provider: "vertex",
  maxTokens: 8000,
});
```

### Batch Processing Multiple PDFs

```typescript
// Process multiple invoices
const invoices = [
  "invoice-001.pdf",
  "invoice-002.pdf",
  "invoice-003.pdf",
  // ... more files
];

for (const invoice of invoices) {
  const result = await neurolink.generate({
    input: {
      text: "Extract: invoice number, date, total amount, vendor name",
      pdfFiles: [invoice],
    },
    provider: "anthropic",
  });

  console.log(`${invoice}:`, result.content);
}
```

### Using with AI Tools

```typescript
// PDF analysis with tool use
await neurolink.generate({
  input: {
    text: "Analyze this invoice and save the data",
    pdfFiles: ["invoice.pdf"],
  },
  provider: "vertex",
  tools: {
    saveInvoiceData: {
      description: "Save extracted invoice data",
      parameters: {
        type: "object",
        properties: {
          invoiceNumber: { type: "string" },
          date: { type: "string" },
          amount: { type: "number" },
          vendor: { type: "string" },
        },
      },
      execute: async (params) => {
        // Save to database
        await db.invoices.insert(params);
        return "Saved successfully";
      },
    },
  },
});
```

## Examples

See `examples/pdf-analysis.ts` for complete working examples:

- Basic PDF analysis
- Multiple PDF comparison
- Mixed file type analysis (PDF + CSV)
- Provider-specific features
- Error handling patterns

## Related Features

- [Multimodal Chat](/docs/features/multimodal-chat) - Overview of multimodal capabilities
- [Office Documents](/docs/features/office-documents) - DOCX, PPTX, XLSX processing
- [CSV Support](/docs/features/csv-support) - CSV file processing
- [Image Support](/docs/features/multimodal-chat#images) - Image analysis

## Technical Details

### PDF Processing Flow

```
1. User provides PDF file(s)
   ↓
2. FileDetector validates format (magic bytes)
   ↓
3. PDFProcessor checks provider support
   ↓
4. Validate size/page limits
   ↓
5. Pass Buffer to messageBuilder
   ↓
6. Format as Vercel AI SDK file type
   ↓
7. Send to provider's native PDF API
   ↓
8. Provider processes PDF visually
   ↓
9. Return AI response
```

### Implementation Files

- **`src/lib/utils/pdfProcessor.ts`** - PDF validation and processing
- **`src/lib/utils/fileDetector.ts`** - File type detection
- **`src/lib/utils/messageBuilder.ts`** - Multimodal message construction
- **`src/lib/types/fileTypes.ts`** - PDF type definitions
- **`src/cli/factories/commandFactory.ts`** - CLI `--pdf` flag handling

### Type Definitions

```typescript
// PDF Processor Options
type PDFProcessorOptions = {
  provider?: string;
  bedrockApiMode?: "converse" | "invoke";
};

// PDF Provider Configuration
type PDFProviderConfig = {
  maxSizeMB: number;
  maxPages: number;
  supportsNative: boolean;
  requiresCitations: boolean | "auto";
  apiType: "document" | "files-api" | "unsupported";
};

// File Processing Result
type FileProcessingResult = {
  type: "pdf";
  content: Buffer;
  mimeType: "application/pdf";
  metadata: {
    confidence: number;
    size: number;
    version: string;
    estimatedPages: number | null;
    provider: string;
    apiType: string;
  };
};
```

## Performance Considerations

### Token Usage

- **10-page PDF**: ~3,000-23,000 tokens (depending on visual mode)
- **Set maxTokens appropriately**: PDF tokens + expected response tokens
- **Monitor costs**: PDFs use more tokens than text inputs

### Processing Speed

- **Small PDFs (\5MB)**: ~5-15 seconds
- **Use streaming**: Get results faster for long responses

### Memory Usage

- PDFs loaded as Buffers in memory
- Large files (>100MB) may impact performance
- Consider processing large files in chunks if possible

## Future Enhancements

Planned features for PDF support:

- **OCR Integration**: Extract text from scanned PDFs
- **Page Selection**: Analyze specific pages only
- **PDF Generation**: Create PDFs from AI responses
- **Form Filling**: Extract and populate PDF forms

## Feedback and Support

Found a bug or have a feature request? Please:

1. Check existing issues on GitHub
2. Create a new issue with:
   - Provider used
   - PDF file details (size, pages)
   - Error message or unexpected behavior
   - Sample code (if possible)

## Changelog

### Version 9.2.0 (Current)

- ✅ Initial PDF support for Vertex AI, Anthropic, Bedrock, AI Studio
- ✅ Auto-detection via `--file` flag
- ✅ Multiple PDF processing
- ✅ Size and page limit validation
- ✅ Comprehensive error messages
- ✅ CLI and SDK integration
- ✅ Streaming support
- ✅ Mixed multimodal inputs (PDF + CSV + images)

---

**Next:** [Multimodal Chat Guide](/docs/features/multimodal-chat) | [CSV Support](/docs/features/csv-support)

---

## Provider Orchestration Brain

<!-- Source: features/provider-orchestration.md -->

# Provider Orchestration Brain

The orchestration engine introduced in 7.42.0 pairs a task classifier with a provider/model router. When enabled, NeuroLink inspects each prompt, chooses the most suitable provider/model based on capabilities and availability, and carries that preference through the fallback chain.

## Highlights

- **Binary task classifier** – categorises prompts (analysis vs. creative, etc.) before routing.
- **Model router** – selects provider/model pairs, honouring local providers like Ollama when available.
- **Provider validation** – confirms credentials/availability before committing to the route.
- **Non-invasive** – orchestration augments requests via context so standard fallback logic still applies.

## Enabling Orchestration (SDK)

```typescript

const neurolink = new NeuroLink({ enableOrchestration: true }); // (1)!

const result = await neurolink.generate({
  input: { text: "Generate product launch plan" }, // (2)!
  enableAnalytics: true, // (3)!
  enableEvaluation: true, // (4)!
});

console.log(result.provider, result.model); // (5)!
```

1. Enable orchestration for automatic provider/model selection
2. Task classifier analyzes prompt to determine best provider
3. Log routing decisions to analytics
4. Validate routed provider meets quality expectations
5. See which provider/model was selected by the router

The router adds `__orchestratedPreferredProvider` to the request context so analytics and downstream logging capture routing decisions.

## Tuning the Router

- **Environment awareness** – orchestration only routes to providers that pass `hasProviderEnvVars`, so missing API keys fall back gracefully.
- **Ollama detection** – checks `http://localhost:11434/api/tags` to verify local models before selection.
- **Confidence scores** – `ModelRouter.route` returns `confidence` and `reasoning`. Enable debug logs (`export NEUROLINK_DEBUG=true`) to inspect decisions.
- **Manual overrides** – specifying `provider` or `model` bypasses orchestration for that call.

## Working with the CLI

CLI sessions instantiate NeuroLink without orchestration by default. To experiment with the router from the CLI:

```bash
node -e "  # (1)!
const { NeuroLink } = require('@juspay/neurolink');
(async () => {
  const neurolink = new NeuroLink({ enableOrchestration: true });  # (2)!
  const res = await neurolink.generate({ input: { text: 'Compare Claude and GPT-4o' } });  # (3)!
  console.log(res.provider, res.model);  # (4)!
})();
"
```

1. Run Node.js one-liner from CLI
2. Enable orchestration in SDK mode
3. Let router select best provider for comparison task
4. Output selected provider and model

Future CLI releases will surface a `--enable-orchestration` flag; until then keep orchestration for SDK/server workloads.

## Best Practices

:::tip[Routing Strategy]
Enable orchestration in development to understand routing patterns, then pin `provider` or `model` in production for predictable behavior. Orchestration is ideal for exploratory workflows; explicit selection ensures consistency in critical paths.
:::

:::tip[Ollama Local-First]
The router prioritizes local Ollama models when available, reducing costs and latency for development workflows. Ensure Ollama is running (`http://localhost:11434`) to take advantage of local-first routing.
:::

- Pair orchestration with evaluation to verify the routed provider meets quality expectations.
- Maintain provider credentials for all potential routes; orchestration skips providers missing keys.
- Monitor debug logs in staging to understand how tasks map to providers before rolling out widely.
- Combine with regional controls (`region` option) when routing to cloud-specific providers such as Vertex or Bedrock.

## Troubleshooting

| Symptom                             | Action                                                                                             |
| ----------------------------------- | -------------------------------------------------------------------------------------------------- |
| Router always returns empty context | Ensure `enableOrchestration: true` and prompts contain text.                                       |
| Routed provider never used          | Check credentials via `neurolink status`; orchestration only hints the preferred provider.         |
| Ollama route ignored                | Confirm Ollama server running at `http://localhost:11434` and model tag matches router suggestion. |
| Fallback cycles between providers   | Pin provider/model explicitly or reduce orchestrated confidence thresholds (see `ModelRouter`).    |

## Dive Deeper

- Code reference: `src/lib/utils/modelRouter.ts`
- Code reference: `src/lib/utils/taskClassifier.ts`
- [`docs/advanced/analytics.md`](/docs/reference/analytics) for logging orchestration metadata.

---

## RAG Document Processing Guide

<!-- Source: features/rag.md -->

# RAG Document Processing Guide

> **Since**: v8.44.0 | **Status**: Stable | **Availability**: SDK + CLI

> **Provider Defaults:** When `--provider` (CLI) or `provider` (SDK) is not specified, NeuroLink defaults to **Vertex AI** with **gemini-2.5-flash**. Set the `NEUROLINK_PROVIDER` or `AI_PROVIDER` environment variable to change the default provider.

## Overview

NeuroLink provides enterprise-grade RAG (Retrieval-Augmented Generation) capabilities for building production AI applications:

- **10 Chunking Strategies**: Character, recursive, sentence, token, markdown, HTML, JSON, LaTeX, semantic, and semantic-markdown chunking for any content type
- **Hybrid Search**: Combine BM25 keyword search with vector embeddings using RRF or linear fusion
- **Multi-Factor Reranking**: LLM, cross-encoder, Cohere API, and simple position-based reranking options
- **Factory + Registry Patterns**: Extensible architecture with lazy loading, aliases, and full TypeScript support
- **Resilience Built-In**: Circuit breakers, retry handlers, and comprehensive error handling

## Quick Start

### Basic Document Processing

```typescript

// Load and chunk a document
const doc = await loadDocument("/path/to/document.md");
const chunker = await createChunker("markdown", {
  maxSize: 1000,
  overlap: 100,
});
const chunks = await chunker.chunk(doc.content);

// Each chunk includes metadata
console.log(chunks[0]);
// {
//   id: "chunk-abc123",
//   text: "## Introduction\n\nThis document covers...",
//   metadata: {
//     documentId: "doc-xyz",
//     chunkIndex: 0,
//     startOffset: 0,
//     endOffset: 847
//   }
// }
```

### Full RAG Pipeline

```typescript

const pipeline = new RAGPipeline({
  embeddingModel: { provider: "vertex", modelName: "gemini-2.5-flash" },
  generationModel: { provider: "vertex", modelName: "gemini-2.5-flash" },
});

// Ingest documents
await pipeline.ingest(["./docs/*.md", "./knowledge/**/*.txt"]);

// Query with automatic retrieval and generation
const response = await pipeline.query("What are the key features?");
console.log(response.answer);
console.log(response.sources); // Retrieved chunks with citations
```

## Integration with generate() and stream()

The RAG system integrates seamlessly with NeuroLink's `generate()` and `stream()` APIs through the `createVectorQueryTool`. This allows AI models to automatically query your knowledge base during generation.

### Using RAG with generate()

```typescript

  NeuroLink,
  createVectorQueryTool,
  InMemoryVectorStore,
} from "@juspay/neurolink";

// 1. Set up vector store with your data
const vectorStore = new InMemoryVectorStore();
await vectorStore.upsert("knowledge-base", [
  {
    id: "doc1",
    vector: embedding1,
    metadata: { text: "Your document content..." },
  },
  // ... more documents
]);

// 2. Create the RAG tool
const ragTool = createVectorQueryTool(
  {
    id: "knowledge-search",
    description: "Search the knowledge base for relevant information",
    indexName: "knowledge-base",
    embeddingModel: { provider: "vertex", modelName: "gemini-2.5-flash" },
    topK: 5,
    reranker: {
      model: { provider: "vertex", modelName: "gemini-2.5-flash" },
      topK: 3,
    },
  },
  vectorStore,
);

// 3. Use with generate()
const neurolink = new NeuroLink();
const result = await neurolink.generate({
  input: { text: "What are the key features of our product?" },
  tools: [ragTool],
  provider: "vertex",
  model: "gemini-2.5-flash",
});

console.log(result.content);
console.log(result.toolExecutions); // See RAG tool results
```

### Using RAG with stream()

```typescript
// Same setup as above, then:
const stream = await neurolink.stream({
  input: { text: "Explain our pricing model in detail" },
  tools: [ragTool],
  provider: "vertex",
  model: "gemini-2.5-flash",
});

for await (const chunk of stream) {
  if (chunk.type === "text") {
    process.stdout.write(chunk.content);
  } else if (chunk.type === "tool_call") {
    console.log("RAG tool called:", chunk.toolName);
  }
}
```

### Complete RAG Pipeline Example

This example demonstrates a full RAG pipeline from document loading to AI-powered retrieval:

```typescript

  NeuroLink,
  createVectorQueryTool,
  InMemoryVectorStore,
} from "@juspay/neurolink";

  loadDocument,
  createChunker,
  createMetadataExtractor,
} from "@juspay/neurolink";

// Step 1: Load and chunk documents
const doc = await loadDocument("./docs/product-guide.md");
const chunker = await createChunker("markdown", {
  maxSize: 1000,
  overlap: 100,
  preserveHeaders: true,
});
const chunks = await chunker.chunk(doc.content);

// Step 2: Extract metadata for better retrieval (optional)
const extractor = await createMetadataExtractor("llm", {
  provider: "vertex",
  modelName: "gemini-2.5-flash",
});
const enrichedChunks = await extractor.extract(chunks, {
  summary: true,
  keywords: true,
});

// Step 3: Generate embeddings using the NeuroLink provider
const neurolink = new NeuroLink();

// Helper function to generate embeddings
async function generateEmbeddings(texts: string[]): Promise {
  const embeddings: number[][] = [];
  for (const text of texts) {
    const result = await neurolink.generate({
      input: { text },
      provider: "vertex",
      model: "gemini-2.5-flash",
    });
    // Extract embedding from result (provider-specific)
    embeddings.push(result.embedding || []);
  }
  return embeddings;
}

const embeddings = await generateEmbeddings(enrichedChunks.map((c) => c.text));

// Step 4: Store in vector store
const vectorStore = new InMemoryVectorStore();
await vectorStore.upsert(
  "product-docs",
  enrichedChunks.map((chunk, i) => ({
    id: chunk.id,
    vector: embeddings[i],
    metadata: {
      text: chunk.text,
      summary: chunk.metadata.summary,
      keywords: chunk.metadata.keywords,
      source: "product-guide.md",
    },
  })),
);

// Step 5: Create RAG tool
const ragTool = createVectorQueryTool(
  {
    id: "product-search",
    description: "Search product documentation for answers to user questions",
    indexName: "product-docs",
    embeddingModel: { provider: "vertex", modelName: "gemini-2.5-flash" },
    topK: 5,
    includeSources: true,
    reranker: {
      model: { provider: "vertex", modelName: "gemini-2.5-flash" },
      topK: 3,
      weights: { semantic: 0.6, vector: 0.3, position: 0.1 },
    },
  },
  vectorStore,
);

// Step 6: Use with generate()
const response = await neurolink.generate({
  input: { text: "How do I configure the billing settings?" },
  tools: [ragTool],
  provider: "vertex",
  model: "gemini-2.5-flash",
  systemPrompt: `You are a helpful product assistant. Use the knowledge-search tool
    to find relevant information before answering questions. Always cite your sources.`,
});

console.log("Answer:", response.content);
console.log(
  "Sources used:",
  response.toolExecutions?.map((t) => t.result?.sources),
);
```

### Configuration Options for createVectorQueryTool

| Option            | Type                                      | Default               | Description                                            |
| ----------------- | ----------------------------------------- | --------------------- | ------------------------------------------------------ |
| `id`              | `string`                                  | `vector-query-{uuid}` | Unique identifier for the tool                         |
| `description`     | `string`                                  | Default description   | Description shown to AI for tool selection             |
| `indexName`       | `string`                                  | **Required**          | Name of the index in the vector store                  |
| `embeddingModel`  | `{ provider: string, modelName: string }` | **Required**          | Embedding model configuration                          |
| `enableFilter`    | `boolean`                                 | `false`               | Enable metadata filtering in queries                   |
| `includeVectors`  | `boolean`                                 | `false`               | Include raw vectors in results                         |
| `includeSources`  | `boolean`                                 | `true`                | Include source documents in response                   |
| `topK`            | `number`                                  | `10`                  | Number of results to retrieve                          |
| `reranker`        | `RerankerConfig`                          | `undefined`           | Optional reranker configuration                        |
| `providerOptions` | `VectorProviderOptions`                   | `undefined`           | Provider-specific options (Pinecone, pgVector, Chroma) |

#### Reranker Configuration

| Option    | Type                                                        | Default                                         | Description                       |
| --------- | ----------------------------------------------------------- | ----------------------------------------------- | --------------------------------- |
| `model`   | `{ provider: string, modelName: string }`                   | **Required**                                    | Model for semantic reranking      |
| `weights` | `{ semantic?: number, vector?: number, position?: number }` | `{ semantic: 0.5, vector: 0.3, position: 0.2 }` | Score weights (must sum to 1.0)   |
| `topK`    | `number`                                                    | Same as tool `topK`                             | Results to return after reranking |

### Event Handling

Listen for tool events during RAG operations to monitor and debug:

```typescript
const neurolink = new NeuroLink();

// Listen for tool execution events
neurolink.on("tool:start", (event) => {
  console.log(`Tool started: ${event.toolName}`);
  console.log(`Parameters:`, event.parameters);
});

neurolink.on("tool:end", (event) => {
  console.log(`Tool completed: ${event.toolName}`);
  console.log(`Success: ${event.success}`);
  console.log(`Response time: ${event.responseTime}ms`);
  if (event.result) {
    console.log(`Results found: ${event.result.totalResults}`);
  }
  if (event.error) {
    console.error(`Error:`, event.error.message);
  }
});

// Listen for generation events
neurolink.on("generation:start", (event) => {
  console.log(`Generation started with provider: ${event.provider}`);
});

neurolink.on("generation:end", (event) => {
  console.log(`Generation completed in ${event.responseTime}ms`);
  console.log(`Tools used: ${event.toolsUsed?.join(", ") || "none"}`);
});

// Execute RAG query with event monitoring
const result = await neurolink.generate({
  input: { text: "What are the system requirements?" },
  tools: [ragTool],
  provider: "vertex",
  model: "gemini-2.5-flash",
});
```

### Dynamic Vector Store Resolution

For multi-tenant applications, you can provide a resolver function instead of a static vector store:

```typescript
const ragTool = createVectorQueryTool(
  {
    id: "tenant-search",
    description: "Search tenant-specific knowledge base",
    indexName: "documents",
    embeddingModel: { provider: "vertex", modelName: "gemini-2.5-flash" },
    topK: 5,
  },
  (context) => {
    // Return different vector stores based on request context
    const tenantId = context.tenantId || "default";
    return getVectorStoreForTenant(tenantId);
  },
);

// The context is passed from generate options
const result = await neurolink.generate({
  input: { text: "Search query" },
  tools: [ragTool],
  context: { tenantId: "tenant-123", userId: "user-456" },
});
```

### Metadata Filtering

Enable metadata filtering for more precise retrieval:

```typescript
const ragTool = createVectorQueryTool(
  {
    id: "filtered-search",
    description: "Search with metadata filters",
    indexName: "knowledge-base",
    embeddingModel: { provider: "vertex", modelName: "gemini-2.5-flash" },
    enableFilter: true, // Enable filter parameter
    topK: 10,
  },
  vectorStore,
);

// The AI can now use filters in its queries
// Example filter syntax supported:
// { category: 'billing' }                    - Exact match
// { date: { $gte: '2024-01-01' } }          - Comparison operators
// { tags: { $in: ['feature', 'guide'] } }   - Array membership
// { $and: [{ type: 'doc' }, { status: 'published' }] } - Logical operators
```

## Chunking Strategies

NeuroLink provides 10 chunking strategies optimized for different content types.

### Available Strategies

| Strategy            | Best For                    | Key Config                             |
| ------------------- | --------------------------- | -------------------------------------- |
| `character`         | Simple text, logs           | `maxSize`, `separator`                 |
| `recursive`         | General documents (default) | `maxSize`, `overlap`, `separators`     |
| `sentence`          | Natural language, Q&A       | `maxSize`, `minSentences`              |
| `token`             | LLM context optimization    | `maxSize` (tokens), `tokenizer`        |
| `markdown`          | Documentation, READMEs      | `preserveHeaders`, `codeBlockHandling` |
| `html`              | Web content                 | `preserveTags`, `removeTags`           |
| `json`              | API responses, config       | `preserveStructure`, `flattenDepth`    |
| `latex`             | Academic papers             | `sectionCommands`, `preserveMath`      |
| `semantic`          | Context-aware splitting     | `similarityThreshold`, `embedder`      |
| `semantic-markdown` | Knowledge bases             | `semanticThreshold`, `embedder`        |

### Strategy Configuration

```typescript

// List all available strategies
const strategies = getAvailableStrategies();
// ['character', 'recursive', 'sentence', 'token', 'markdown', 'html', 'json', 'latex', 'semantic', 'semantic-markdown']

// Recursive chunker (recommended for general use)
const recursiveChunker = await createChunker("recursive", {
  maxSize: 1000,
  overlap: 200,
  separators: ["\n\n", "\n", ". ", " ", ""],
  keepSeparator: true,
});

// Markdown chunker (for documentation)
const markdownChunker = await createChunker("markdown", {
  maxSize: 1000,
  overlap: 100,
  preserveHeaders: true,
  codeBlockHandling: "preserve", // 'preserve' | 'split' | 'remove'
});

// Token chunker (for LLM optimization)
const tokenChunker = await createChunker("token", {
  maxSize: 512, // Max tokens per chunk
  overlap: 50, // Token overlap
  tokenizer: "cl100k_base", // OpenAI tokenizer
});
```

### Content-Type Recommendations

```typescript

// Get strategy based on content type
getRecommendedStrategy("text/markdown"); // 'markdown'
getRecommendedStrategy("text/html"); // 'html'
getRecommendedStrategy("application/json"); // 'json'
getRecommendedStrategy("text/x-latex"); // 'latex'
getRecommendedStrategy("text/plain"); // 'recursive'
```

## Hybrid Search

Hybrid search combines BM25 keyword matching with vector similarity for improved retrieval quality.

### How It Works

1. **BM25 Search**: Traditional keyword matching using term frequency and document length normalization
2. **Vector Search**: Semantic similarity using embeddings
3. **Score Fusion**: Combine rankings using RRF or linear combination

### Fusion Methods

#### Reciprocal Rank Fusion (RRF)

RRF is robust to score scale differences and works well in most cases:

```typescript

// Combine rankings from multiple sources
const fusedScores = reciprocalRankFusion(
  [vectorRankings, bm25Rankings],
  60, // k parameter (default: 60)
);

// RRF formula: score(d) = sum(1 / (k + rank(d)))
```

#### Linear Combination

Linear combination allows fine-tuning the balance between vector and keyword scores:

```typescript

const combinedScores = linearCombination(
  vectorScores, // Map
  bm25Scores, // Map
  0.5, // alpha: weight for vector scores (0-1)
);

// Linear formula: score(d) = alpha * vectorScore(d) + (1 - alpha) * bm25Score(d)
```

### Hybrid Search Pipeline

```typescript

  createHybridSearch,
  InMemoryBM25Index,
  InMemoryVectorStore,
} from "@juspay/neurolink";

// Create indices
const bm25Index = new InMemoryBM25Index({ k1: 1.2, b: 0.75 });
const vectorStore = new InMemoryVectorStore();

// Add documents to both indices
const documents = [
  {
    id: "doc1",
    text: "Machine learning fundamentals...",
    metadata: { topic: "ml" },
  },
  {
    id: "doc2",
    text: "Deep learning architectures...",
    metadata: { topic: "dl" },
  },
];

await bm25Index.addDocuments(documents);
await vectorStore.addDocuments(documents);

// Create hybrid search
const hybridSearch = createHybridSearch({
  bm25Index,
  vectorStore,
  fusionMethod: "rrf", // 'rrf' | 'linear'
  alpha: 0.5, // Vector weight (for linear fusion)
  k: 60, // RRF parameter
});

// Execute search
const results = await hybridSearch.search("neural network training", {
  topK: 10,
  filter: { topic: "ml" },
});
```

### BM25 Configuration

```typescript
type BM25Config = {
  k1: number; // Term frequency saturation (default: 1.2)
  b: number; // Document length normalization (default: 0.75)
  lowercase: boolean; // Normalize to lowercase (default: true)
  stemming: boolean; // Apply stemming (default: false)
  stopwords: string[]; // Words to ignore (default: English stopwords)
};
```

## Reranking

Reranking re-scores initial search results for improved relevance.

### Available Reranker Types

| Type            | Description                         | Requires Model | Best For                 |
| --------------- | ----------------------------------- | -------------- | ------------------------ |
| `simple`        | Position + vector score combination | No             | Fast, cost-free baseline |
| `llm`           | LLM semantic relevance scoring      | Yes            | High-quality semantic    |
| `cross-encoder` | Cross-encoder model scoring         | Yes            | Accuracy-focused tasks   |
| `cohere`        | Cohere Rerank API                   | API Key        | Production-grade results |
| `batch`         | Batch LLM processing                | Yes            | Large result sets        |

### Reranker Configuration

```typescript

// List available types
const types = getAvailableRerankerTypes();
// ['simple', 'llm', 'cross-encoder', 'cohere', 'batch']

// Simple reranker (no model required)
const simpleReranker = await createReranker("simple", {
  topK: 10,
  positionWeight: 0.3,
  scoreWeight: 0.7,
});

// LLM reranker (requires model)
const llmReranker = await createReranker("llm", {
  topK: 5,
  model: "gemini-2.5-flash",
  temperature: 0.0,
  batchSize: 5,
});

// Cohere reranker (requires API key)
const cohereReranker = await createReranker("cohere", {
  topK: 10,
  model: "rerank-v3.5",
  maxChunksPerDoc: 10,
});

// Rerank results
const reranked = await simpleReranker.rerank(searchResults, query, { topK: 5 });
```

### Batch Reranking for Large Sets

```typescript

// Process large result sets efficiently
const reranked = await batchRerank(searchResults, query, {
  batchSize: 10,
  parallelBatches: 3,
  model: "gemini-2.5-flash",
  topK: 20,
});
```

## Metadata Extraction

Extract structured metadata from chunks using LLMs.

### Extraction Types

| Type        | Description               | Output                    |
| ----------- | ------------------------- | ------------------------- |
| `title`     | Document/section title    | `string`                  |
| `summary`   | Brief content summary     | `string`                  |
| `keywords`  | Relevant keywords         | `string[]`                |
| `questions` | Q&A pairs for the content | `{question, answer}[]`    |
| `custom`    | Custom schema extraction  | `Record` |

### Usage

```typescript

  createMetadataExtractor,
  extractMetadata,
  LLMMetadataExtractor,
} from "@juspay/neurolink";

// Using factory
const extractor = await createMetadataExtractor("llm", {
  provider: "vertex",
  modelName: "gemini-2.5-flash",
});

// Extract metadata from chunks
const results = await extractor.extract(chunks, {
  title: true,
  summary: true,
  keywords: true,
  questions: { maxQuestions: 3 },
});

// Results include extracted metadata per chunk
console.log(results[0]);
// {
//   title: "Introduction to Machine Learning",
//   summary: "This section covers the fundamentals...",
//   keywords: ["machine learning", "supervised learning", "classification"],
//   questions: [
//     { question: "What is supervised learning?", answer: "..." }
//   ]
// }
```

## Configuration Reference

### Chunker Configuration

| Option       | Type                      | Default   | Description                        |
| ------------ | ------------------------- | --------- | ---------------------------------- |
| `maxSize`    | `number`                  | `1000`    | Maximum chunk size (chars/tokens)  |
| `overlap`    | `number`                  | `200`     | Overlap between chunks             |
| `minSize`    | `number`                  | `50`      | Minimum chunk size                 |
| `documentId` | `string`                  | auto-UUID | Document identifier for metadata   |
| `metadata`   | `Record` | `{}`      | Additional metadata for all chunks |

### Reranker Configuration

| Option                  | Type      | Default | Description                     |
| ----------------------- | --------- | ------- | ------------------------------- |
| `topK`                  | `number`  | `10`    | Number of top results to return |
| `minScore`              | `number`  | `0.0`   | Minimum score threshold         |
| `includeOriginalScores` | `boolean` | `false` | Include original scores         |

### Hybrid Search Configuration

| Option         | Type                | Default | Description                 |
| -------------- | ------------------- | ------- | --------------------------- |
| `fusionMethod` | `'rrf' \| 'linear'` | `'rrf'` | Score fusion method         |
| `alpha`        | `number`            | `0.5`   | Vector weight (linear only) |
| `k`            | `number`            | `60`    | RRF k parameter             |
| `topK`         | `number`            | `10`    | Results to return           |

### Environment Variables

| Variable            | Description                | Required |
| ------------------- | -------------------------- | -------- |
| `GOOGLE_API_KEY`    | For Vertex AI (default)    | Yes      |
| `OPENAI_API_KEY`    | For OpenAI provider        | Optional |
| `COHERE_API_KEY`    | For Cohere reranker        | Optional |
| `ANTHROPIC_API_KEY` | For Claude-based reranking | Optional |

## Advanced Usage

### Integration with Observability

Track RAG operations with Langfuse for debugging and optimization:

```typescript

const pipeline = new RAGPipeline(config);

await setLangfuseContext(
  {
    userId: "user-123",
    sessionId: "session-456",
    operationName: "rag-query",
    metadata: {
      pipeline: "customer-support",
      chunkingStrategy: "markdown",
    },
  },
  async () => {
    const response = await pipeline.query("How do I reset my password?");
    return response;
  },
);
```

### Integration with Guardrails

Validate RAG inputs and outputs with guardrails:

```typescript

  createGuardrail,
  validateInput,
  validateOutput,
} from "@juspay/neurolink";

// Create guardrails for RAG
const inputGuardrail = createGuardrail({
  type: "input",
  rules: [
    { type: "maxLength", value: 1000 },
    { type: "noPersonalInfo", enabled: true },
  ],
});

const outputGuardrail = createGuardrail({
  type: "output",
  rules: [
    { type: "factualOnly", enabled: true },
    { type: "noPII", enabled: true },
  ],
});

// Apply guardrails to RAG pipeline
const validatedQuery = await validateInput(inputGuardrail, query);
const response = await pipeline.query(validatedQuery);
const validatedResponse = await validateOutput(
  outputGuardrail,
  response.answer,
);
```

### Custom Chunker Registration

Extend the chunker registry with custom implementations:

```typescript

// Define custom chunker
class CustomChunker implements Chunker {
  constructor(private config?: ChunkerConfig) {}

  async chunk(text: string, options?: ChunkerConfig) {
    // Custom chunking logic
    const maxSize = options?.maxSize ?? this.config?.maxSize ?? 500;
    // ... implementation
  }
}

// Register with the registry
ChunkerRegistry.register("custom", CustomChunker, {
  name: "Custom Chunker",
  description: "My custom chunking strategy",
  aliases: ["my-chunker"],
  defaultConfig: { maxSize: 500 },
});

// Now use it
const chunker = await createChunker("custom", { maxSize: 800 });
```

### Graph RAG

Use knowledge graphs for relationship-aware retrieval:

```typescript

// Create graph with similarity threshold for edge creation
const graphRag = new GraphRAG({
  dimension: 1536, // Embedding dimension
  threshold: 0.7, // Similarity threshold for creating edges
});

// Build graph from chunks and their embeddings
const chunks = [
  { text: "Machine learning basics", metadata: { topic: "ml" } },
  { text: "Neural networks", metadata: { topic: "dl" } },
];
const embeddings = [
  { vector: [0.1, 0.2 /* ... */] },
  { vector: [0.15, 0.25 /* ... */] },
];

graphRag.createGraph(chunks, embeddings);

// Or add nodes incrementally
const nodeId = graphRag.addNode(
  { text: "Deep learning", metadata: { topic: "dl" } },
  { vector: [0.12, 0.22 /* ... */] },
);

// Query with embedding vector using random walk with restart
const results = graphRag.query({
  query: queryEmbedding, // Query embedding vector
  topK: 10,
  randomWalkSteps: 100,
  restartProb: 0.15,
});

// Get graph statistics
const stats = graphRag.getStats();
// { nodeCount: 3, edgeCount: 4, avgDegree: 1.33, threshold: 0.7 }
```

### Resilience Patterns

Use circuit breakers and retry handlers for production reliability:

```typescript

// Circuit breaker for external API calls
const breaker = new RAGCircuitBreaker("reranker-api", {
  failureThreshold: 5,
  resetTimeout: 60000,
  halfOpenMaxCalls: 3,
  operationTimeout: 30000,
});

// Wrap reranker calls
const result = await breaker.execute(async () => {
  return await cohereReranker.rerank(results, query);
}, "rerank");

// Listen to circuit breaker events
breaker.on("stateChange", ({ oldState, newState, reason }) => {
  console.log(`Circuit breaker: ${oldState} -> ${newState} (${reason})`);
});

// Retry handler with exponential backoff
const retryHandler = new RAGRetryHandler({
  maxRetries: 3,
  initialDelay: 1000,
  maxDelay: 30000,
  backoffMultiplier: 2,
  jitter: true,
});

const chunks = await retryHandler.executeWithRetry(async () => {
  return await chunker.chunk(largeDocument);
});
```

## CLI Usage

NeuroLink CLI provides commands for RAG operations.

### Document Processing

```bash
# Chunk a document
neurolink rag chunk ./document.md --strategy markdown --max-size 1000 --overlap 100

# Chunk with output to file
neurolink rag chunk ./document.md -s recursive --format json --output chunks.json

# Process multiple documents (use shell loop)
for file in ./docs/*.md; do neurolink rag chunk "$file" --strategy markdown --format json; done
```

### Index Management

```bash
# Build an index from a document
neurolink rag index ./docs/guide.md --indexName my-docs --provider vertex --model gemini-2.5-flash

# Query an existing index
neurolink rag query "What are the main features?" --indexName my-docs --topK 5 --provider vertex --model gemini-2.5-flash

# Index with Graph RAG enabled
neurolink rag index ./docs/guide.md --indexName my-docs --graph --provider vertex --model gemini-2.5-flash
```

## Simplified RAG API (`rag: { files }`)

> **Since**: v9.2.0 | **Recommended** for most use cases

Instead of manually creating chunkers, vector stores, and tools, pass `rag: { files }` directly to `generate()` or `stream()`. NeuroLink handles the entire pipeline automatically.

### SDK Usage

```typescript

const neurolink = new NeuroLink();

// Generate with RAG - just pass files
const result = await neurolink.generate({
  prompt: "What are the key features described in the docs?",
  rag: {
    files: ["./docs/guide.md", "./docs/api.md"],
    strategy: "markdown", // Optional: auto-detected from file extension
    chunkSize: 512, // Optional: default 1000
    chunkOverlap: 50, // Optional: default 200
    topK: 5, // Optional: default 5
  },
});

// Stream with RAG - identical API
const stream = await neurolink.stream({
  prompt: "Summarize the architecture",
  rag: { files: ["./docs/architecture.md"] },
});

for await (const chunk of stream.stream) {
  process.stdout.write(chunk);
}
```

### CLI Usage

```bash
# Basic RAG with generate
neurolink generate "What is this about?" --rag-files ./docs/guide.md

# RAG with custom chunking strategy
neurolink generate "Explain the API" --rag-files ./docs/guide.md --rag-strategy markdown --rag-chunk-size 512

# RAG with streaming and multiple files
neurolink stream "Summarize everything" --rag-files ./docs/a.md ./docs/b.md --rag-top-k 10
```

### CLI Flags Reference

| Flag                  | Type       | Default       | Description                                                                                                         |
| --------------------- | ---------- | ------------- | ------------------------------------------------------------------------------------------------------------------- |
| `--rag-files`         | `string[]` | -             | File paths to load for RAG context                                                                                  |
| `--rag-strategy`      | `string`   | auto-detected | Chunking strategy (character, recursive, sentence, token, markdown, html, json, latex, semantic, semantic-markdown) |
| `--rag-chunk-size`    | `number`   | 1000          | Maximum chunk size in characters                                                                                    |
| `--rag-chunk-overlap` | `number`   | 200           | Overlap between adjacent chunks                                                                                     |
| `--rag-top-k`         | `number`   | 5             | Number of top results to retrieve                                                                                   |

### RAGConfig Type

```typescript
type RAGConfig = {
  files: string[]; // Required: file paths to load
  strategy?: ChunkingStrategy; // Default: auto-detected from file extension
  chunkSize?: number; // Default: 1000
  chunkOverlap?: number; // Default: 200
  topK?: number; // Default: 5
  toolName?: string; // Default: "search_knowledge_base"
  toolDescription?: string; // Custom tool description
  embeddingProvider?: string; // Defaults to generation provider
  embeddingModel?: string; // Defaults to provider's default
};
```

### How It Works

1. Files are loaded from disk and auto-detected for chunking strategy (`.md` -> markdown, `.html` -> html, `.json` -> json, etc.)
2. Content is chunked using the selected strategy with configurable size and overlap
3. Chunks are embedded using a simple character-frequency hash (128 dimensions) and stored in an in-memory vector store
4. A `search_knowledge_base` tool is created and injected into the AI model's available tools
5. A system prompt instructs the AI to use the search tool before answering
6. The AI autonomously decides when to search the knowledge base during generation/streaming

### Auto-Detected Strategies by Extension

| Extension                                                                                | Strategy  |
| ---------------------------------------------------------------------------------------- | --------- |
| `.md`, `.mdx`                                                                            | markdown  |
| `.html`, `.htm`                                                                          | html      |
| `.json`                                                                                  | json      |
| `.tex`, `.latex`                                                                         | latex     |
| `.txt`, `.csv`, `.xml`, `.yaml`, `.yml`                                                  | recursive |
| `.ts`, `.js`, `.py`, `.java`, `.go`, `.rs`, `.c`, `.cpp`, `.rb`, `.php`, `.swift`, `.kt` | recursive |

## Best Practices

### Chunking

1. **Match chunk size to model context** - Use token chunker when optimizing for specific LLM context windows
2. **Choose strategy by content type** - Markdown for docs, HTML for web content, JSON for structured data
3. **Use 10-20% overlap** - Prevents context loss at chunk boundaries
4. **Preserve structure when possible** - Format-aware chunkers maintain semantic coherence
5. **Test with your data** - Optimal settings vary by domain and use case

### Reranking

1. **Start with simple reranker** - Fast, free, and often sufficient for basic use cases
2. **Use LLM reranking for quality** - When accuracy matters more than latency
3. **Batch large result sets** - Use batch reranker for 50+ results
4. **Consider cost** - API-based rerankers (Cohere) have per-call costs
5. **Cache reranking results** - Results for the same query/docs can be reused

### Hybrid Search

1. **Start with RRF** - Robust to score scale differences, less tuning needed
2. **Tune alpha for linear fusion** - Start at 0.5, adjust based on evaluation
3. **Keep indices in sync** - Update both BM25 and vector indices together
4. **Filter early** - Apply metadata filters before fusion when possible
5. **Monitor retrieval quality** - Track precision/recall metrics in production

## Troubleshooting

| Problem                       | Solution                                                                 |
| ----------------------------- | ------------------------------------------------------------------------ |
| Empty chunks returned         | Check if `maxSize` is too small for your content; try increasing to 500+ |
| Duplicate content in chunks   | Reduce `overlap` parameter or use a structure-aware chunker              |
| Missing context at boundaries | Increase `overlap` to 15-20% of `maxSize`                                |
| Slow reranking performance    | Switch to `simple` reranker or reduce `topK` before reranking            |
| Poor search quality           | Tune BM25 parameters (`k1`, `b`) or adjust fusion `alpha` weight         |
| Out of memory with large docs | Process documents in batches; use streaming where available              |
| Reranker API timeouts         | Use `CircuitBreaker` wrapper; reduce batch size                          |
| Inconsistent chunk metadata   | Ensure `documentId` is set consistently across processing runs           |

### Debug Logging

```bash
# Enable verbose logging for RAG operations
DEBUG=neurolink:rag:* npx tsx your-script.ts

# Log specific components
DEBUG=neurolink:rag:chunker npx tsx your-script.ts
DEBUG=neurolink:rag:reranker npx tsx your-script.ts
DEBUG=neurolink:rag:hybrid npx tsx your-script.ts
```

## API Reference

### Core Exports

**Document Processing:**

- `loadDocument(path)` - Load a single document
- `loadDocuments(paths)` - Load multiple documents
- `MDocument` - Fluent document processing class
- `processDocument(text, options)` - Process text through chunking and metadata extraction

**Chunking:**

- `createChunker(strategy, config)` - Create a chunker instance
- `ChunkerFactory` - Factory for chunker creation
- `ChunkerRegistry` - Registry with all chunker implementations
- `getAvailableStrategies()` - List available chunking strategies
- `getRecommendedStrategy(contentType)` - Get recommended strategy for content type

**Reranking:**

- `createReranker(type, config)` - Create a reranker instance
- `RerankerFactory` - Factory for reranker creation
- `RerankerRegistry` - Registry with all reranker implementations
- `getAvailableRerankerTypes()` - List available reranker types
- `rerank(results, query, model)` - Direct reranking function
- `batchRerank(results, query, options)` - Batch reranking

**Retrieval:**

- `createHybridSearch(config)` - Create hybrid search instance
- `InMemoryBM25Index` - In-memory BM25 index
- `InMemoryVectorStore` - In-memory vector store
- `reciprocalRankFusion(rankings, k)` - RRF score fusion
- `linearCombination(vectorScores, bm25Scores, alpha)` - Linear score fusion
- `createVectorQueryTool(vectorStore, options)` - Create vector query tool

**Metadata:**

- `createMetadataExtractor(type, config)` - Create metadata extractor
- `LLMMetadataExtractor` - LLM-powered extractor class
- `extractMetadata(chunks, params)` - Extract metadata from chunks

**Pipeline:**

- `RAGPipeline` - Full RAG pipeline class
- `createRAGPipeline(config)` - Create pipeline instance
- `assembleContext(chunks, options)` - Assemble context from chunks
- `formatContextWithCitations(chunks, format)` - Format with citations

**Resilience:**

- `RAGCircuitBreaker` - Circuit breaker pattern for RAG operations
- `RAGRetryHandler` - Retry with exponential backoff and jitter

**Types:**

- `Chunk`, `ChunkMetadata`, `ChunkerConfig`
- `Reranker`, `RerankerConfig`, `RerankerType`
- `HybridSearchOptions`, `BM25Config`
- `RAGPipelineConfig`, `RAGResponse`
- `MetadataExtractor`, `MetadataExtractorConfig`

## See Also

- [RAG Configuration Guide](../rag/configuration) - Detailed configuration reference
- [RAG Testing Guide](../rag/testing) - Testing RAG pipelines
- [Observability Guide](/docs/observability/health-monitoring) - Tracing and monitoring
- [Guardrails Guide](/docs/features/guardrails) - Input/output validation
- [Vector Store Integrations](/docs/guides/vector-stores) - Production vector stores

---

## Real-time Services Guide

<!-- Source: features/real-time-services.md -->

#  Real-time Services Guide

**Enterprise WebSocket Infrastructure for NeuroLink**

##  Overview

NeuroLink provides enterprise-grade real-time services with WebSocket infrastructure, enhanced chat capabilities, and streaming optimization. These features enable building professional AI applications with real-time bidirectional communication.

##  Key Features

- ** WebSocket Infrastructure** - Professional-grade server with connection management
- ** Enhanced Chat Services** - Dual-mode SSE + WebSocket support
- ** Room Management** - Group chat and broadcasting capabilities
- ** Streaming Channels** - Real-time AI response streaming
- ** Performance Optimization** - Compression, buffering, and latency control
- **️ Production Ready** - Connection pooling, heartbeat monitoring, error handling

##  Room Management

### Creating and Managing Rooms

```typescript
// Join users to rooms
wsServer.joinRoom(connectionId, "ai-support-room");
wsServer.joinRoom(connectionId, "project-alpha");

// Leave rooms
wsServer.leaveRoom(connectionId, "general");

// Get room information
const roomInfo = wsServer.getRoomInfo("ai-support-room");
console.log(`Room has ${roomInfo.memberCount} members`);

// List all rooms for a connection
const userRooms = wsServer.getUserRooms(connectionId);
console.log("User is in rooms:", userRooms);
```

### Broadcasting to Rooms

```typescript
// Broadcast AI responses to room
wsServer.broadcastToRoom("ai-support-room", {
  type: "ai-response",
  data: {
    text: "How can I help you today?",
    timestamp: new Date().toISOString(),
    provider: "openai",
  },
});

// Broadcast to multiple rooms
wsServer.broadcastToRooms(["room1", "room2"], {
  type: "announcement",
  data: { message: "System maintenance in 10 minutes" },
});

// Broadcast to all connections
wsServer.broadcast({
  type: "global-message",
  data: { message: "Welcome to NeuroLink AI" },
});
```

---

##  Streaming Channels

### Creating Streaming Channels

```typescript
// Create streaming channel for AI responses
const channel = wsServer.createStreamingChannel(connectionId, "ai-stream");

// Configure channel options
channel.setOptions({
  bufferSize: 4096,
  compressionEnabled: true,
  maxChunkSize: 1024,
});

// Handle streaming data
channel.onData = (chunk) => {
  console.log("Received chunk:", chunk);
};

channel.onComplete = () => {
  console.log("Streaming complete");
};

channel.onError = (error) => {
  console.error("Streaming error:", error);
};
```

### AI Response Streaming

```typescript

// Handle chat messages with streaming
wsServer.on("chat-message", async ({ connectionId, message }) => {
  const channel = wsServer.createStreamingChannel(
    connectionId,
    `chat-${Date.now()}`,
  );
  const provider = await createBestAIProvider();

  try {
    // Start streaming AI response (NEW: Primary method)
    const result = await provider.stream({
      input: { text: message.data.prompt },
      temperature: 0.7,
    });

    // Stream chunks to client
    for await (const chunk of result.stream) {
      channel.send({
        type: "text-chunk",
        data: { chunk: chunk.content, provider: result.provider },
      });
    }

    // Signal completion
    channel.complete({
      type: "stream-complete",
      data: {
        provider: result.provider,
        model: result.model,
        totalChunks: channel.getChunkCount(),
      },
    });
  } catch (error) {
    channel.error({
      type: "stream-error",
      data: { error: error.message },
    });
  }
});
```

---

##  Enhanced Chat Services

### Dual-Mode Chat (SSE + WebSocket)

```typescript

  createEnhancedChatService,
  createBestAIProvider,
} from "@juspay/neurolink";

const provider = await createBestAIProvider();
const chatService = createEnhancedChatService({
  provider,
  enableSSE: true, // Server-Sent Events for simple streaming
  enableWebSocket: true, // WebSocket for real-time bidirectional
  streamingConfig: {
    bufferSize: 8192,
    compressionEnabled: true,
    latencyTarget: 100, // Target 100ms latency
  },
});

// Handle streaming responses
await chatService.streamChat({
  prompt: "Generate a story about AI and humanity",
  onChunk: (chunk) => {
    console.log("Chunk:", chunk);
    // Send to WebSocket clients
    wsServer.broadcast({
      type: "story-chunk",
      data: { chunk },
    });
  },
  onComplete: (result) => {
    console.log("Story complete:", result.text);
    wsServer.broadcast({
      type: "story-complete",
      data: result,
    });
  },
  onError: (error) => {
    console.error("Story generation error:", error);
    wsServer.broadcast({
      type: "story-error",
      data: { error: error.message },
    });
  },
});
```

### Chat Session Management

```typescript
// Create persistent chat sessions
const sessionId = "user-123-session";
const chatSession = chatService.createSession(sessionId, {
  maxHistory: 50, // Keep last 50 messages
  persistToDisk: true,
  sessionTimeout: 3600000, // 1 hour timeout
});

// Add message to session history
chatSession.addMessage({
  role: "user",
  content: "Hello, AI!",
  timestamp: new Date(),
});

// Generate response with session context
const response = await chatSession.generateResponse({
  temperature: 0.7,
  maxTokens: 500,
});

// Session automatically maintains conversation history
console.log("Session history:", chatSession.getHistory());
console.log("Token usage:", chatSession.getTokenUsage());
```

---

##  Performance Optimization

### Connection Pooling

```typescript
const wsServer = new NeuroLinkWebSocketServer({
  port: 8080,
  maxConnections: 5000,

  // Connection pooling
  connectionPool: {
    enabled: true,
    maxIdleTime: 300000, // 5 minutes
    cleanupInterval: 60000, // 1 minute
  },

  // Performance tuning
  performance: {
    enableCompression: true,
    compressionLevel: 6, // 1-9, 6 is balanced
    maxPayloadSize: 16777216, // 16MB
    pingInterval: 30000, // 30 seconds
    pongTimeout: 5000, // 5 seconds
  },
});
```

### Load Balancing

```typescript
// Multiple server instances with load balancing
const servers = [];
const ports = [8080, 8081, 8082];

for (const port of ports) {
  const server = new NeuroLinkWebSocketServer({ port });

  // Shared Redis for cross-server communication
  server.setMessageBroker({
    type: "redis",
    url: "redis://localhost:6379",
    prefix: "neurolink:ws",
  });

  servers.push(server);
  await server.start();
}

console.log(`Started ${servers.length} WebSocket servers`);
```

### Streaming Optimization

```typescript
// Configure optimal streaming for different use cases
const streamingConfigs = {
  // Low latency for chat
  chat: {
    bufferSize: 1024,
    compressionEnabled: false, // Disable for speed
    latencyTarget: 50,
  },

  // High throughput for content generation
  content: {
    bufferSize: 16384,
    compressionEnabled: true,
    latencyTarget: 200,
  },

  // Balanced for general use
  general: {
    bufferSize: 4096,
    compressionEnabled: true,
    latencyTarget: 100,
  },
};

// Apply configuration based on use case
const chatService = createEnhancedChatService({
  provider: await createBestAIProvider(),
  enableWebSocket: true,
  streamingConfig: streamingConfigs.chat, // Use chat optimization
});
```

---

## ️ Production Deployment

### Docker Configuration

```dockerfile
# Dockerfile for WebSocket service
FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production

COPY . .
RUN npm run build

# WebSocket port
EXPOSE 8080

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node healthcheck.js

CMD ["node", "dist/server.js"]
```

### Docker Compose with Redis

```yaml
# docker-compose.yml
version: "3.8"
services:
  neurolink-ws:
    build: .
    ports:
      - "8080:8080"
    environment:
      - REDIS_URL=redis://redis:6379
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - redis
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    command: redis-server --appendonly yes

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - neurolink-ws

volumes:
  redis_data:
```

### Kubernetes Deployment

```yaml
# k8s-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: neurolink-websocket
spec:
  replicas: 3
  selector:
    matchLabels:
      app: neurolink-websocket
  template:
    metadata:
      labels:
        app: neurolink-websocket
    spec:
      containers:
        - name: websocket
          image: neurolink/websocket:latest
          ports:
            - containerPort: 8080
          env:
            - name: REDIS_URL
              valueFrom:
                configMapKeyRef:
                  name: neurolink-config
                  key: redis-url
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: neurolink-secrets
                  key: openai-api-key
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: neurolink-websocket-service
spec:
  selector:
    app: neurolink-websocket
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: LoadBalancer
```

---

##  Monitoring and Health Checks

### Built-in Metrics

```typescript
// Enable metrics collection
wsServer.enableMetrics({
  collectConnectionStats: true,
  collectMessageStats: true,
  collectPerformanceStats: true,
  exportPrometheus: true,
  metricsEndpoint: "/metrics",
});

// Get real-time statistics
const stats = wsServer.getStats();
console.log("Active connections:", stats.activeConnections);
console.log("Messages per second:", stats.messagesPerSecond);
console.log("Average latency:", stats.averageLatency);
console.log("Memory usage:", stats.memoryUsage);
```

### Health Check Endpoint

```typescript
// Health check implementation
wsServer.addHealthCheck("aiProviders", async () => {
  try {
    const provider = await createBestAIProvider();
    await provider.generate({ input: { text: "test" }, maxTokens: 1 });
    return { status: "healthy", message: "AI providers operational" };
  } catch (error) {
    return { status: "unhealthy", message: error.message };
  }
});

wsServer.addHealthCheck("redis", async () => {
  try {
    await redis.ping();
    return { status: "healthy", message: "Redis connection active" };
  } catch (error) {
    return { status: "unhealthy", message: "Redis connection failed" };
  }
});

// Health endpoint available at /health
```

---

##  Getting Started

### Quick Setup

```bash
# Install NeuroLink with real-time features
npm install @juspay/neurolink

# Set up environment
echo "OPENAI_API_KEY=your-key" > .env
echo "REDIS_URL=redis://localhost:6379" >> .env

# Start Redis (if not already running)
docker run -d -p 6379:6379 redis:alpine
```

### Minimal Server Example

```typescript
// server.js

  NeuroLinkWebSocketServer,
  createEnhancedChatService,
  createBestAIProvider,
} from "@juspay/neurolink";

async function startServer() {
  // Initialize WebSocket server
  const wsServer = new NeuroLinkWebSocketServer({ port: 8080 });

  // Initialize enhanced chat
  const provider = await createBestAIProvider();
  const chatService = createEnhancedChatService({
    provider,
    enableWebSocket: true,
  });

  // Handle chat messages
  wsServer.on("chat-message", async ({ connectionId, message }) => {
    await chatService.streamChat({
      prompt: message.data.prompt,
      onChunk: (chunk) => {
        wsServer.sendMessage(connectionId, {
          type: "ai-chunk",
          data: { chunk },
        });
      },
      onComplete: (result) => {
        wsServer.sendMessage(connectionId, {
          type: "ai-complete",
          data: result,
        });
      },
    });
  });

  // Start server
  await wsServer.start();
  console.log(" NeuroLink WebSocket server running on port 8080");
}

startServer().catch(console.error);
```

```bash
# Run the server
node server.js
```

### Client Example

```html


    NeuroLink Real-time Chat


    Send


      const ws = new WebSocket("ws://localhost:8080");
      const chat = document.getElementById("chat");
      const messageInput = document.getElementById("message");

      ws.onmessage = (event) => {
        const data = JSON.parse(event.data);

        if (data.type === "ai-chunk") {
          appendToChat(data.data.chunk, false);
        } else if (data.type === "ai-complete") {
          appendToChat("\n\n", false);
        }
      };

      function sendMessage() {
        const message = messageInput.value;
        if (message) {
          appendToChat(`You: ${message}\n`, true);

          ws.send(
            JSON.stringify({
              type: "chat-message",
              data: { prompt: message },
            }),
          );

          messageInput.value = "";
          appendToChat("AI: ", true);
        }
      }

      function appendToChat(text, isNewLine) {
        if (isNewLine) {
          chat.innerHTML += text;
        } else {
          chat.innerHTML += text;
        }
        chat.scrollTop = chat.scrollHeight;
      }

      messageInput.addEventListener("keypress", (e) => {
        if (e.key === "Enter") sendMessage();
      });


```

---

##  Additional Resources

- **[API Reference](/docs/sdk/api-reference)** - Complete TypeScript API
- **[Telemetry Guide](/docs/observability/telemetry)** - Enterprise monitoring setup
- **[Performance Optimization](/docs/deployment/performance)** - Optimization strategies
- **[Examples Repository](/docs/)** - Working example applications

**Ready to build enterprise-grade real-time AI applications with NeuroLink! **

---

## Regional Streaming Controls

<!-- Source: features/regional-streaming.md -->

# Regional Streaming Controls

Latency, compliance, and model availability often depend on which region you call. NeuroLink 7.45.0 threads the `region` parameter through the generate/stream stack so you can target specific data centres when working with providers that expose regional endpoints.

## Supported Providers

| Provider                 | How to Set Region                                                          | Defaults    |
| ------------------------ | -------------------------------------------------------------------------- | ----------- |
| **Amazon Bedrock**       | `AWS_REGION` env, `config init`, or request `region` option                | `us-east-1` |
| **Amazon SageMaker**     | `SAGEMAKER_DEFAULT_ENDPOINT` + `AWS_REGION` or request `region`            | `us-east-1` |
| **Google Vertex AI**     | `GOOGLE_VERTEX_LOCATION` / `config init` / request `region`                | `us-east5`  |
| **Azure OpenAI**         | Deployment-specific endpoint; use `AZURE_OPENAI_ENDPOINT` (region encoded) | —           |
| **LiteLLM pass-through** | Use LiteLLM server configuration                                           | —           |

Providers without native region controls ignore the option safely.

## CLI Usage

The CLI reads region information from configuration profiles or provider environment variables.

```bash
# Bedrock: ensure AWS credentials + region set
export AWS_REGION=ap-south-1
npx @juspay/neurolink generate "Translate catalog" --provider bedrock

# Vertex AI: switch to Tokyo region for lower latency
export GOOGLE_VERTEX_LOCATION=asia-northeast1
npx @juspay/neurolink stream "Localise onboarding" --provider vertex --model gemini-2.5-pro

# One-off override via shell env
AWS_REGION=eu-west-1 npx @juspay/neurolink stream "Summarise EMEA incidents" --provider bedrock
```

Run `neurolink config init` to persist region defaults per provider.

## SDK Usage

```typescript
const neurolink = new NeuroLink({ enableOrchestration: true });

const result = await neurolink.generate({
  input: { text: "Compile regional latency metrics" },
  provider: "vertex",
  model: "gemini-2.5-pro",
  region: "europe-west4",
  enableEvaluation: true,
});

console.log(result.content, result.provider);
```

Streaming obeys the same option:

```typescript
const stream = await neurolink.stream({
  input: { text: "Narrate service availability" },
  provider: "bedrock",
  model: "anthropic.claude-3-sonnet",
  region: "eu-central-1",
});
```

## Operational Tips

:::tip[Compliance & Data Residency]
Use regional routing to comply with data sovereignty requirements (GDPR, HIPAA, etc.). Pin the `region` parameter to ensure AI processing stays within approved geographical boundaries for sensitive workloads.
:::

:::tip[Latency Optimization]
Co-locate your NeuroLink deployment with your application servers. For example, if your API runs in `eu-west-1`, set `region: "eu-west-1"` for Bedrock/Vertex calls to minimize cross-region latency penalties.
:::

- **Compliance** – ensure the requested region is enabled for the model (e.g., Anthropic via Vertex only supports `us` regions).
- **Latency** – co-locate with your application servers to avoid cross-region penalties.
- **Fallbacks** – when orchestration re-routes to a provider that ignores `region`, the call completes but logs a warning.
- **Credentials** – AWS requests still require valid IAM credentials; Vertex needs service account rights in the target location.

## Troubleshooting

| Symptom                                | Fix                                                                        |
| -------------------------------------- | -------------------------------------------------------------------------- |
| `Invalid region format`                | Use standard IDs (`us-east-1`, `asia-northeast1`).                         |
| `Model not available in region`        | Switch to a supported region or change model (see provider console).       |
| `Credential error after region change` | Re-run `neurolink config init` so stored credentials match the new region. |
| `High latency on fallback provider`    | Disable orchestration or pin a provider/model explicitly.                  |

## Related Material

- [SageMaker Integration Guide](/docs/getting-started/providers/sagemaker)
- [Enterprise Proxy Setup](/docs/deployment/enterprise-proxy)
- [Dynamic Models Guide](/docs/guides/dynamic-models)

---

## Speech-to-Speech Agents: Architecture and Gemini Live Integration Plan

<!-- Source: features/speech-agents.md -->

# Speech-to-Speech Agents: Architecture and Gemini Live Integration Plan

Status: Proposal (Docs only)
Owner: NeuroLink Platform
Last updated: 2025-09-01

## Goals

- Use `NeuroLink.stream` as the single, unified API for both text and voice streaming (no separate engine entrypoint).
- Start with Google Gemini Live API (Studio) as the first realtime provider.
- Server-level only: users attach their own WebSocket(s) and forward events; we do not host WS in the SDK.
- Keep the design provider-agnostic to allow adding OpenAI Realtime, ElevenLabs, Azure Speech, etc.

## Scope (Phase 1)

- Extend `neurolink.stream` to accept audio input frames and emit audio output events (audio-only out).
- Provider: Google Gemini Live (Studio) bridged internally from the stream code path.
- No built-in HTTP/WS server: consumers maintain their own transport and forward events.
- Basic audio guidance (PCM16LE framing, resampling hints); no full DSP stack.
- Config via env; minimal telemetry via existing logger.

Non-goals (Phase 1):

- Building a client/browser UI or bundling web audio capture.
- Managing customer WebSocket endpoints and broadcasting logic.
- Advanced AEC/AGC/VAD DSP processing. We’ll document expectations and provide simple utilities only.
- Persisted conversation memory integration (initially). We’ll design for it; implementation can follow.

## High-Level Architecture (Stream-Centric)

```
┌──────────────────────────┐
│  Application Server      │
│  (your Express/Fastify)  │
├──────────────────────────┤
│  Your WS endpoints       │  ◀── you own creation & forwarding to clients
│  - /ws/input             │
│  - /ws/output            │
├──────────────────────────┤
│  NeuroLink.stream        │
│  - StreamOptions (extended)            │ voice/text input
│  - AsyncIterable          │ voice/text output
│  - Audio helpers (lightweight)         │ PCM framing/resampling guidance
│  - Telemetry hooks (minimal in P1)     │
├──────────────────────────┤
│  Providers               │
│  - GeminiLiveProvider    │  (Phase 1)
│  - OpenAIRealtime        │  (Phase 2+)
│  - ... others            │
└──────────────────────────┘

       ▲                                      │
       │ events (audio/text/tools/status)     │ ws/grpc over provider SDK
       │ sendAudio/sendText/flush/control     ▼
                Provider SDK (e.g. @google/genai or Vertex Live API)
```

## Proposed Changes (Stream Extensions Only)

- Extend `StreamOptions` to support audio input alongside text:
  - `input: { text?: string; audio?: { frames: AsyncIterable; sampleRateHz: number; encoding: 'PCM16LE'; channels?: 1 } }`
- Extend `StreamResult.stream` to yield discriminated events:
  - `AsyncIterable`
- Add `AudioChunk` type: `{ data: Buffer; sampleRateHz: number; channels: number; encoding: 'PCM16LE' }`
- No new top-level entrypoints; keep `neurolink.stream()` as the single API.

Phase 2 (NeuroLink Client — new SDK package) planned modules:

- Package name: `@juspay/neurolink-client` (new package from scratch)
- Repository layout: monorepo subpackage `packages/neurolink-client/` (or separate repo if preferred)
- `packages/neurolink-client/src/index.ts` — central exports for browser/client usage
- `packages/neurolink-client/src/types.ts` — client-side event and message types
- `packages/neurolink-client/src/wsBridge.ts` — WebSocket bridge (send/receive) with pluggable codecs
- `packages/neurolink-client/src/codecs/{json,binary}.ts` — default JSON and optional binary audio codecs
- `packages/neurolink-client/src/utils/{base64,pcm}.ts` — helpers for encoding/PCM16LE framing

No additional public entrypoints planned beyond `neurolink.stream`.

## Stream API Extensions (Provider-Agnostic)

### Extended Types

```ts
// Additions to existing StreamOptions (src/lib/types/streamTypes.ts)
type PCMEncoding = "PCM16LE";

type AudioInputSpec = {
  frames: AsyncIterable; // PCM16LE mono frames (20-60ms recommended)
  sampleRateHz: number; // usually 16000 for input
  encoding: PCMEncoding; // 'PCM16LE'
  channels?: 1; // Phase 1: mono
};

type AudioChunk = {
  data: Buffer;
  sampleRateHz: number; // Gemini typically 24000 on output
  channels: number; // 1
  encoding: PCMEncoding; // 'PCM16LE'
};

// StreamOptions extension
// input: { text: string } remains valid for text-only flows
type ExtendedStreamInput = {
  text?: string;
  audio?: AudioInputSpec;
};

// StreamResult extension: discriminated union events
type StreamEvent =
  | { type: "text"; content: string }
  | { type: "audio"; audio: AudioChunk };
```

### Session Lifecycle

```ts
type SpeechSession = {
  id: string;
  start(): Promise;
  close(code?: number, reason?: string): Promise;

  // Sending upstream (server -> provider)
  sendAudioFrame(
    pcm16le: Buffer,
    sampleRateHz: number,
    opts?: { endOfSegment?: boolean },
  ): void;
  sendText(text: string): void; // optional text prompts/messages
  flush(): void; // request model to produce output

  // Events (subscribe and forward over your WS)
  on(event: "audio", listener: (chunk: AudioChunk) => void): this;
  on(event: "text", listener: (delta: TextDelta) => void): this;
  on(event: "tool-call", listener: (call: ToolCallEvent) => void): this; // future
  on(event: "tool-result", listener: (res: ToolResultEvent) => void): this; // future
  on(event: "status", listener: (s: ProviderStatusEvent) => void): this;
  on(event: "error", listener: (err: Error) => void): this;
  on(
    event: "close",
    listener: (info: { code?: number; reason?: string }) => void,
  ): this;
};

type AudioChunk = {
  data: Buffer;
  sampleRateHz: number;
  channels: number;
  encoding: "PCM16LE";
};
type TextDelta = {
  text: string;
  isFinal?: boolean;
};
```

### Provider Bridging

Each provider’s existing `stream()` implementation will detect `input.audio` and bridge to the provider’s live API, mapping provider callbacks to the unified stream events defined above.

## Gemini Live Mapping (Phase 1 via stream)

Two access modes are planned:

1. Studio API via `@google/genai` (API key)

- Env: `GOOGLE_AI_API_KEY` (alias: `GEMINI_API_KEY`)
- Connect: `client.live.connect({ model, callbacks, config })`
- Pros: simple setup; good for quick start.

2. Vertex AI Live API (service account)

- Env: `GOOGLE_APPLICATION_CREDENTIALS` (or inline credentials), `GOOGLE_VERTEX_PROJECT`, `GOOGLE_VERTEX_LOCATION`
- SDK: `@google-cloud/vertexai` once parity for Live is stable; alternatively direct WS following docs.
- Pros: enterprise auth, quota, monitoring; aligns with existing Vertex usage in repo.

Phase 1 decision (locked): use Studio channel via `@google/genai` as the primary path; output is audio-only. Vertex channel and other capabilities move to Phase 2.

Reference docs (sourced for details):

- Live API overview: https://cloud.google.com/vertex-ai/generative-ai/docs/live-api
- Streamed conversations: https://cloud.google.com/vertex-ai/generative-ai/docs/live-api/streamed-conversations
- Tools with Live API: https://cloud.google.com/vertex-ai/generative-ai/docs/live-api/tools

### Provider Config (Phase 1)

```ts
// Internally, provider config sets:
// responseModalities: ['AUDIO']
// speechConfig.voiceConfig.prebuiltVoiceConfig.voiceName = 'Orus' (default)
// Optional languageCode
```

### Event Mapping (Phase 1)

- Provider parses `LiveServerMessage.serverContent.modelTurn.parts[]`.
- If `inlineData` audio present, yield event `{ type: 'audio', audio: { data, sampleRateHz: 24000, encoding: 'PCM16LE', channels: 1 } }`.
- Text deltas: deferred to Phase 2.
- `serverContent.interrupted === true`: emit `status` `{ type: 'interrupted' }` and stop/flush local playback queues.
- onopen/onclose/onerror: map to `status`/`close`/`error`.
- Tools (Phase 2): `serverContent.toolCall` → `tool-call` event for integration with MCP pipeline.

Additional Live API behaviors from docs:

- Turn-based and streaming: you can stream user audio continuously (client → model) and receive overlapping model audio replies (server → client). Many realtime APIs also support an explicit end-of-input signal to prompt the model to respond; consult the Streamed Conversations doc for Gemini-specific control messages.
- Interruptions: the server may signal interruptions mid-playback when new input arrives; handle by stopping queued audio (as shown in sample) and resetting `nextStartTime`.

### Audio Expectations (Phase 1)

- Upstream format: PCM16LE mono, recommended 16 kHz. If clients provide 44.1/48 kHz float32, resample then convert to PCM16LE.
- Downstream format: Gemini typically outputs 24 kHz PCM; we’ll emit chunks with `sampleRateHz=24000`.
- Utilities will include minimal conversion helpers; full DSP left to consumers or future phases.

Notes aligned to docs:

- The Live API accepts mixed modalities (audio and text) in the same session. Sending text messages mid-conversation is supported.
- For low-latency, send small audio frames frequently (e.g., 20–60ms worth per frame) instead of large buffers.

## Server-Level Usage with `neurolink.stream` (Phase 1)

```ts

const neurolink = new NeuroLink();

// Build an AsyncIterable of PCM16LE mono frames at 16kHz from your WS/client
async function* framesFromClient(wsConn) {
  for await (const msg of wsConn) {
    // msg is already a Buffer of PCM16LE mono (16kHz)
    yield msg as Buffer;
  }
}

const streamResult = await neurolink.stream({
  provider: "google-ai", // internally routed to Gemini Live (Studio) for audio
  model: "gemini-2.5-flash-preview-native-audio-dialog",
  input: {
    audio: {
      frames: framesFromClient(clientWs),
      sampleRateHz: 16000,
      encoding: "PCM16LE",
    },
  },
});

for await (const ev of streamResult.stream) {
  if ((ev as any).type === "audio") {
    // Forward Buffer to clients over your WS
    serverWs.send((ev as any).audio.data);
  }
}
```

## Configuration (Phase 1)

- Studio:
  - `GOOGLE_AI_API_KEY` (preferred) or `GEMINI_API_KEY`
- Vertex channel is deferred to Phase 2.

Studio channel uses `@google/genai` Live SDK semantics (client.live.connect).

The subsystem follows the project’s dotenv loading pattern. No hard dependency added to runtime unless the feature is used.

## Telemetry & Logging

- Phase 1: reuse `src/lib/utils/logger.ts` for structured logs; expose minimal counters (session count, bytes in/out, errors). OTEL deferred.
- Phase 2+: optional OpenTelemetry spans (connect, sendAudio, receiveAudio, flush, close) with attributes: provider, model, channel (studio|vertex), sessionId, sampleRates, bytesIn/bytesOut, firstAudioLatencyMs.

## Error Handling & Resilience

- Categorize errors: auth (401/403), network (WS close abnormal), rate limit, server (5xx), protocol (invalid frame).
- Configurable backoff on reconnect for transient failures; max retries per session.
- Surface provider close codes/reasons to consumers.
- Guardrails on input audio (size/rate), with backpressure callbacks.
- Vertex-specific items (regional endpoints/quotas, close code mapping) are Phase 2.

## Extensibility (Other Providers)

- Implement provider-specific live bridging in the existing `stream()` path:
  - Detect `input.audio` and route to the provider’s live API (e.g., OpenAI Realtime, ElevenLabs, Azure).
  - Map provider callbacks to stream events: `{ type: 'audio' }` and, in Phase 2, `{ type: 'text' }`.
- Optional capability flags: `supports.tools` (P2), `supports.duplex`, `supports.textDelta` (P2), `input.sampleRates`.
  - For providers like OpenAI Realtime, add `supports.webrtc` if WebRTC control is planned (P3).

## Tools Integration (Phase 2)

- Gemini Live tools map well to our MCP infrastructure.
- Plan: bridge provider tool-calls to NeuroLink MCP registry (`src/lib/mcp/**`).
- The streaming pipeline surfaces `tool-call` intents; execute via NeuroLink MCP; return `tool-result` back to the provider stream.
  - Based on docs, Live API supports tool/function execution mid-session; we’ll translate those to our MCP tool contract and return results back through the provider’s tool result pathway.

## Voice Catalog & Advanced Controls (Phase 3)

- Voice catalog discovery for Gemini Live; expose `listVoices()` and cache results.
- Dynamic voice switching mid-session (where supported).
- Advanced prosody/style parameters; SSML-like controls if surfaced by provider.
- Diarization/transcription toggles; dual-stream (audio+text) combined experiences.
- Optional WS/WebRTC adapters and client helpers.

## Security Considerations

- Never expose service account creds to clients. Server-only control.
- Validate audio frame size/rate from clients; apply quotas.
- Consider PII handling and retention policies for recorded buffers.
- Support regionality via Vertex location settings.

## Implementation Phases & Steps

### Phase 1 (Now): Studio + Audio-Only

1. Scaffolding (core contracts)
   - Add `src/lib/realtime/{types,events,provider,session,engine}.ts`.
   - Minimal audio utils: `audio/pcm.ts` (PCM16LE framing) and `audio/resample.ts` (optional).
   - Add planned exports to `src/lib/index.ts` (guarded if needed).
2. Gemini Live Provider (Studio)
   - Implement via `@google/genai` (`client.live.connect`).
   - Map callbacks to `audio`/`status`/`error`/`close`; no text deltas.
   - Normalize output audio to `{ data: Buffer, sampleRateHz: 24000, encoding: 'PCM16LE' }`.
3. Session API & Controls
   - Implement `sendAudioFrame`, `flush`, `start`, `close`.
   - Backpressure safety (drop/queue strategy when overwhelmed).
4. Minimal Telemetry & Logging
   - Counters: session count, bytes in/out, errors; debug logs.
5. Smoke Tests & Example
   - Synthetic audio roundtrip test.
   - Example usage snippet in docs (no WS server bundled).

### Phase 2: Vertex, Text & Tools

1. Vertex Live API Channel
   - WS connection to Vertex regional endpoint; env-driven project/location.
2. Text Deltas
   - Enable `text` events; downstream subtitle-like handling.
3. Tools Integration
   - Bridge Live API tool calls to NeuroLink MCP; emit `tool-call`/`tool-result`.
4. Telemetry (OTEL)
   - Add optional spans and metrics; health endpoints.
5. NeuroLink Client SDK (WS bridge — new package)
   - Build a brand-new client SDK as a separate npm package `@juspay/neurolink-client`.
   - Connects to your server’s WS endpoint; no audio capture/playback included.
   - Responsibilities: send upstream audio frames and control messages to server; receive downstream audio/status/text events from server.
   - Default wire protocol (JSON envelope; optional binary audio):
     - Upstream JSON: `{ type: 'audio', data: , sampleRateHz: 16000, encoding: 'PCM16LE' }`
     - Upstream control: `{ type: 'flush' }`, `{ type: 'text', text: string }`
     - Downstream JSON: `{ type: 'audio', data: , sampleRateHz: 24000, encoding: 'PCM16LE' }`, `{ type: 'status', status: string }`, `{ type: 'text', text: string }` (if enabled)
     - Optional binary mode: raw PCM16LE frames with configurable header disabled by default.
   - Planned API:

     ```ts
     import { createRealtimeClient } from "@juspay/neurolink-client";

     const client = createRealtimeClient({
       url: "wss://your-server/ws",
       authToken,
       sendBinaryAudio: false,
     });

     client.on("audio", (chunk) => {
       /* play or forward */
     });
     client.on("status", (s) => {
       /* UI indicators */
     });

     // push audio captured elsewhere (already PCM16LE mono @16kHz)
     client.sendAudioFrame(pcmBuffer, 16000);
     client.flush();
     client.close();
     ```

   - The SDK won’t capture audio or render playback; it only bridges events over WS.
   - Packaging: ESM-first, tree-shakeable, no Node-only deps; minimal peer deps.

6. CLI Helpers (optional)
   - `neurolink live status`, basic debugging commands.

### Phase 3: Voice Catalog & Advanced Features

1. Voice Catalog
   - `listVoices()` with cache; per-model voice metadata.
2. Advanced Audio Controls
   - Prosody/style, SSML-like parameters, dynamic voice switching.
3. Transcription & Diarization
   - Expose toggles and events; combined audio+text pipelines.
4. WS/WebRTC Adapters (optional)
   - Lightweight helpers for common server/client patterns.

## Task Checklist

### Phase 1 — Studio + Audio-Only (via stream)

- [ ] Extend `StreamOptions` to accept `input.audio` (PCM16LE frames @16kHz).
- [ ] Extend `StreamResult.stream` to yield `{ type: 'audio', audio: AudioChunk }` events.
- [ ] Implement Gemini Live (Studio) bridging in provider stream path when `input.audio` is present.
- [ ] Default voice and output sample rate: Orus @24kHz; normalize `AudioChunk` accordingly.
- [ ] Minimal telemetry/logging: session count, bytes in/out, error count; debug logs.
- [ ] Smoke test: synthetic audio input → audio output events.
- [ ] Documentation: server usage snippet and guidance for WS forwarding.

### Phase 2 — Vertex, Text, Tools, Client SDK

- [ ] Implement Vertex Live API channel (WS) with `GOOGLE_VERTEX_PROJECT`/`GOOGLE_VERTEX_LOCATION` env support.
- [ ] Enable text delta events and downstream handling.
- [ ] Bridge Live API tool-calls to MCP; emit `tool-call`/`tool-result` events and roundtrip to provider.
- [ ] Add optional OpenTelemetry spans/metrics (connect/send/receive/flush/close).
- [ ] Create new package `@juspay/neurolink-client` (ESM, browser-first).
- [ ] Implement client WS bridge (`wsBridge.ts`) and message codecs (`codecs/{json,binary}.ts`).
- [ ] Define client SDK types and API (`createRealtimeClient`, `sendAudioFrame`, `flush`, events).
- [ ] Client SDK documentation and example integration.
- [ ] Optional: CLI helpers (e.g., `neurolink live status`).

### Phase 3 — Voice Catalog & Advanced Controls

- [ ] Implement `listVoices()` discovery and caching for Gemini Live.
- [ ] Support dynamic voice switching mid-session (where supported).
- [ ] Add advanced prosody/style/SSML-like parameters (provider-permitting).
- [ ] Add transcription/diarization toggles and corresponding events.
- [ ] Optional server/client helpers for WS/WebRTC patterns.

## Open Questions for Review

- Minimum audio contract for upstream: we recommend PCM16LE 16 kHz mono; OK to lock this as a requirement for Phase 1?
- Client WS protocol: keep default JSON + base64 audio with opt-in binary? Any constraints from your infra?
- Do we want a tiny built-in WS helper (opt-in) in Phase 3 for servers, or keep strictly library-only on server side?

---

If this plan looks good, next step is to extend the `stream` types and implement the Gemini Live (Studio) provider bridging for audio, keeping all server transport concerns outside the library as requested.

---

## Structured Output with Zod Schemas

<!-- Source: features/structured-output.md -->

# Structured Output with Zod Schemas

Generate type-safe, validated JSON responses using Zod schemas. Available in `generate()` function only (not `stream()`).

## Quick Example

```typescript

const neurolink = new NeuroLink();

// Define your schema
const UserSchema = z.object({
  name: z.string(),
  age: z.number(),
  email: z.string(),
  occupation: z.string(),
});

// Generate with schema
const result = await neurolink.generate({
  input: {
    text: "Create a user profile for John Doe, 30 years old, software engineer",
  },
  schema: UserSchema,
  output: { format: "json" }, // Required: must be "json" or "structured"
  provider: "vertex",
  model: "gemini-2.0-flash-exp",
});

// result.content is validated JSON string
const user = JSON.parse(result.content);
console.log(user); // { name: "John Doe", age: 30, email: "...", occupation: "software engineer" }
```

## Requirements

Both parameters are required for structured output:

1. **`schema`**: A Zod schema defining the output structure
2. **`output.format`**: Must be `"json"` or `"structured"` (defaults to `"text"` if not specified)

## Complex Schemas

```typescript
const CompanySchema = z.object({
  name: z.string(),
  headquarters: z.object({
    city: z.string(),
    country: z.string(),
  }),
  employees: z.array(
    z.object({
      name: z.string(),
      role: z.string(),
      salary: z.number(),
    }),
  ),
  financials: z.object({
    revenue: z.number(),
    profit: z.number(),
  }),
});

const result = await neurolink.generate({
  input: { text: "Analyze TechCorp company" },
  schema: CompanySchema,
  output: { format: "json" },
});
```

## Works with Tools

Structured output works seamlessly with MCP tools:

```typescript
const result = await neurolink.generate({
  input: { text: "Get weather for San Francisco" },
  schema: WeatherSchema,
  output: { format: "json" },
  tools: { getWeather: myWeatherTool },
});
// Tools execute first, then response is formatted as JSON
```

### Important: Google Gemini Providers Limitation

**Google API Constraint:** Google Gemini (both Vertex AI and Google AI Studio) **cannot combine function calling with structured output (JSON schema validation)**. This is a documented Google API limitation, not a NeuroLink issue.

> **Gemini 3 / Gemini 2.5 Note:** This limitation applies to **all Gemini models**, including the latest Gemini 3 and Gemini 2.5 series (e.g., `gemini-2.5-pro`, `gemini-2.5-flash`). While these models have excellent JSON schema support for structured output, they still cannot use tools and JSON schema validation together in the same request.

**Error Message:**

```
Function calling with a response mime type: 'application/json' is unsupported
```

**Solution:** Use `disableTools: true` when using schemas with Google providers:

```typescript
const result = await neurolink.generate({
  input: { text: "Analyze TechCorp company" },
  schema: CompanySchema,
  output: { format: "json" },
  provider: "vertex", // or "google-ai"
  disableTools: true, // ✅ REQUIRED for Google providers with schemas
});
```

**This is Industry Standard:** All major AI frameworks (LangChain, Vercel AI SDK, Agno, Instructor) use the same approach - disabling tools when using response schemas with Google models.

### Workarounds for Gemini Tools + Structured Output

If you need both tool execution and structured output with Gemini, consider these approaches:

1. **Two-Step Approach:** First call with tools enabled (no schema), then a second call with schema to format the result:

   ```typescript
   // Step 1: Execute tools
   const toolResult = await neurolink.generate({
     input: { text: "Get current weather for Tokyo" },
     provider: "vertex",
     tools: { getWeather: myWeatherTool },
   });

   // Step 2: Format with schema
   const structured = await neurolink.generate({
     input: { text: `Format this data: ${toolResult.content}` },
     schema: WeatherSchema,
     output: { format: "json" },
     provider: "vertex",
     disableTools: true,
   });
   ```

2. **Use a Different Provider:** OpenAI and Anthropic support tools and structured output together:

   ```typescript
   const result = await neurolink.generate({
     input: { text: "Get weather and format as JSON" },
     schema: WeatherSchema,
     output: { format: "json" },
     provider: "openai", // ✅ Supports tools + schema together
     tools: { getWeather: myWeatherTool },
   });
   ```

3. **Choose One or the Other:** Design your workflow to use either tools OR structured output per request, not both.

**Related Limitation:** Complex schemas may trigger "Too many states for serving" errors. Solutions:

1. Simplify schema structure
2. Reduce nested objects
3. Use `disableTools: true` to reduce state complexity

## Important Notes

- **Only available in `generate()`** - Not supported in `stream()` function
- **Requires both `schema` and `output.format`** - If `output.format` is not "json" or "structured", regular text is returned even with a schema
- **Auto-validated** - Invalid responses throw `NoObjectGeneratedError` with validation details
- **Provider support** - Works with OpenAI, Anthropic, Google AI Studio, Vertex AI
- **Gemini JSON Schema Support** - Gemini 3 / Gemini 2.5 models have excellent native JSON schema support
- **Gemini Tools Limitation** - All Gemini models (including Gemini 3) cannot combine tools with schemas - use `disableTools: true`

## See Also

- [API Reference](/docs/sdk/api-reference)
- [Custom Tools](/docs/sdk/custom-tools)
- [MCP Integration](/docs/mcp/integration)

---

## Extended Thinking Configuration

<!-- Source: features/thinking-configuration.md -->

# Extended Thinking Configuration

Enable extended thinking/reasoning modes for AI models that support deeper reasoning capabilities. This feature allows models to "think through" complex problems before providing a response.

## Overview

NeuroLink supports extended thinking/reasoning configuration for models that provide this capability. Extended thinking enables models to perform more thorough reasoning, particularly useful for complex tasks like mathematical proofs, coding problems, and multi-step analysis.

## Supported Models

### Gemini 3 Models (Google Vertex AI / AI Studio)

- `gemini-3-pro-preview` - Full thinking support with high token budgets (up to 100,000)
- `gemini-3-flash-preview` - Fast thinking with support for "minimal" level (up to 50,000)

### Gemini 2.5 Models (Google Vertex AI / AI Studio)

- `gemini-2.5-pro` - Supports thinking configuration (up to 32,000 tokens)
- `gemini-2.5-flash` - Supports thinking configuration (up to 32,000 tokens)

### Claude Models (Anthropic)

- `claude-3-7-sonnet-20250219` - Extended thinking via budget tokens
- Other Claude 3.x models with thinking capability

## Quick Example

```typescript

const neurolink = new NeuroLink();

// Gemini 3 with thinking level
const result = await neurolink.generate({
  input: { text: "Solve this complex problem..." },
  provider: "vertex",
  model: "gemini-3-pro-preview",
  thinkingConfig: {
    thinkingLevel: "high",
  },
});

console.log(result.content);
```

## Gemini 3 Thinking Configuration

For Gemini 3 models, use `thinkingLevel` to control reasoning depth:

```typescript
const response = await neurolink.generate({
  input: { text: "Prove that the square root of 2 is irrational" },
  provider: "vertex",
  model: "gemini-3-flash-preview",
  thinkingConfig: {
    thinkingLevel: "high", // 'minimal' | 'low' | 'medium' | 'high'
  },
});
```

### Thinking Levels

| Level     | Description                            | Best For                        |
| --------- | -------------------------------------- | ------------------------------- |
| `minimal` | Near-zero thinking (Flash models only) | Simple queries requiring speed  |
| `low`     | Fast reasoning for simple tasks        | Quick analysis, summaries       |
| `medium`  | Balanced reasoning/latency trade-off   | General-purpose tasks           |
| `high`    | Maximum reasoning depth                | Complex reasoning, math, coding |

### Maximum Token Budgets by Model

| Model              | Max Thinking Budget |
| ------------------ | ------------------- |
| `gemini-3-pro-*`   | 100,000 tokens      |
| `gemini-3-flash-*` | 50,000 tokens       |
| `gemini-2.5-*`     | 32,000 tokens       |

## Anthropic Claude Thinking Configuration

For Claude models, use `budgetTokens` to set the thinking token budget:

```typescript
const response = await neurolink.generate({
  input: { text: "Solve this complex math problem step by step..." },
  provider: "anthropic",
  model: "claude-3-7-sonnet-20250219",
  thinkingConfig: {
    enabled: true,
    budgetTokens: 10000, // Range: 5000-100000
  },
});
```

### Budget Token Guidelines

- **Minimum**: 5,000 tokens
- **Maximum**: 100,000 tokens
- **Recommended for simple tasks**: 5,000-10,000 tokens
- **Recommended for complex reasoning**: 20,000-50,000 tokens
- **Maximum depth**: 50,000-100,000 tokens

## Configuration Options

The `thinkingConfig` object supports the following options:

```typescript
thinkingConfig: {
  enabled?: boolean;           // Enable/disable thinking
  type?: "enabled" | "disabled"; // Alternative enable/disable
  budgetTokens?: number;       // Token budget (Anthropic models)
  thinkingLevel?: "minimal" | "low" | "medium" | "high"; // Thinking level (Gemini models)
}
```

## CLI Usage

Extended thinking is also available via the CLI:

```bash
# Enable thinking with default settings
neurolink generate "Solve this problem" --thinking

# Set thinking budget for Anthropic
neurolink generate "Complex problem" --provider anthropic --thinking --thinkingBudget 20000

# Set thinking level for Gemini 3
neurolink generate "Complex problem" --provider vertex --model gemini-3-pro-preview --thinkingLevel high
```

### CLI Options

| Option             | Description                                           | Default |
| ------------------ | ----------------------------------------------------- | ------- |
| `--thinking`       | Enable extended thinking                              | false   |
| `--thinkingBudget` | Token budget (Anthropic: 5000-100000)                 | 10000   |
| `--thinkingLevel`  | Thinking level (Gemini 3: minimal, low, medium, high) | medium  |

## Best Practices

### When to Use High Thinking

- Complex mathematical proofs and calculations
- Multi-step coding problems and debugging
- Detailed analysis requiring multiple considerations
- Tasks where accuracy is more important than speed

### When to Use Low/Minimal Thinking

- Simple queries where speed matters
- Straightforward information retrieval
- Quick summaries and formatting tasks
- High-volume, latency-sensitive applications

### General Guidelines

1. **Start with medium**: Use `medium` as your default and adjust based on results
2. **Match model to task**: Use Pro models for complex tasks, Flash for speed
3. **Monitor token usage**: Higher thinking levels consume more tokens
4. **Test performance**: Compare response quality vs. latency for your use case

## Example: Complex Reasoning Task

```typescript

const neurolink = new NeuroLink();

// Complex coding problem with high reasoning
const result = await neurolink.generate({
  input: {
    text: `
      Design an optimal algorithm to find the longest palindromic subsequence
      in a string. Explain your approach, prove its correctness, and analyze
      the time and space complexity.
    `,
  },
  provider: "vertex",
  model: "gemini-3-pro-preview",
  thinkingConfig: {
    thinkingLevel: "high",
  },
  maxTokens: 4000,
});

console.log(result.content);
```

## Model Detection Utilities

NeuroLink provides utilities to check thinking support:

```typescript

  supportsThinkingConfig,
  getMaxThinkingBudgetTokens,
} from "@juspay/neurolink";

// Check if a model supports thinking
const supports = supportsThinkingConfig("gemini-3-pro-preview"); // true

// Get maximum budget for a model
const maxBudget = getMaxThinkingBudgetTokens("gemini-3-flash-preview"); // 50000
```

## Important Notes

- **Provider compatibility**: Thinking configuration is provider-specific. Gemini uses `thinkingLevel`, Claude uses `budgetTokens`
- **Token consumption**: Extended thinking uses additional tokens beyond the response
- **Latency impact**: Higher thinking levels increase response time
- **Not all models support thinking**: Check `supportsThinkingConfig()` before enabling
- **Streaming support**: Thinking configuration works with both `generate()` and `stream()`

## See Also

- [API Reference](/docs/sdk/api-reference)
- [Provider Configuration](/docs/getting-started/provider-setup)
- [Streaming](/docs/features/regional-streaming)

---

## Text-to-Speech (TTS) Integration Guide

<!-- Source: features/tts.md -->

# Text-to-Speech (TTS) Integration Guide

NeuroLink provides integrated Text-to-Speech (TTS) capabilities, allowing you to generate high-quality audio from text prompts or AI-generated responses. This feature is perfect for voice assistants, accessibility features, narration, podcasts, and more.

## Overview

**Key Features:**

- **High-quality voices** - Neural, Wavenet, and Standard voice types
- **Multiple languages** - 50+ voices across 10+ languages
- **Flexible audio formats** - MP3, WAV, OGG/Opus
- **Voice customization** - Adjust speed, pitch, and volume
- **Two synthesis modes** - Direct text-to-speech OR AI response synthesis
- **Production-ready** - Google Cloud TTS integration

## Supported Providers

TTS is currently available through Google Cloud Text-to-Speech API:

| Provider      | Authentication                                     | Voices     | Notes                                |
| ------------- | -------------------------------------------------- | ---------- | ------------------------------------ |
| **google-ai** | API Key (`GOOGLE_AI_API_KEY`)                      | 50+ voices | Simplest setup, good for development |
| **vertex**    | Service Account (`GOOGLE_APPLICATION_CREDENTIALS`) | 50+ voices | Recommended for production           |

**Coming Soon:**

- OpenAI TTS (GPT-4 voices: alloy, echo, fable, onyx, nova, shimmer)
- Azure Speech Services
- AWS Polly

---

## Voice Selection

### Available Voice Types

Google Cloud TTS offers three voice quality tiers:

| Voice Type   | Quality | Cost   | Use Case                                | Example Voice      |
| ------------ | ------- | ------ | --------------------------------------- | ------------------ |
| **Neural2**  | Highest | High   | Natural conversations, voice assistants | `en-US-Neural2-C`  |
| **Wavenet**  | High    | Medium | Professional narration, podcasts        | `en-US-Wavenet-D`  |
| **Standard** | Good    | Low    | Cost optimization, bulk generation      | `en-US-Standard-B` |

### Voice Discovery

Voice identifiers follow Google Cloud TTS naming conventions: `---` (e.g., `en-US-Neural2-C`, `en-GB-Wavenet-D`).

Refer to the [Google Cloud TTS voice list](https://cloud.google.com/text-to-speech/docs/voices) for all available voices.

### Supported Languages

**English Variants:**

- `en-US` - United States English
- `en-GB` - British English
- `en-AU` - Australian English
- `en-IN` - Indian English

**Other Languages:**

- `es-ES`, `es-US` - Spanish (Spain, Latin America)
- `fr-FR`, `fr-CA` - French (France, Canada)
- `de-DE` - German
- `ja-JP` - Japanese
- `hi-IN` - Hindi
- `zh-CN`, `zh-TW` - Chinese (Simplified, Traditional)
- `pt-BR`, `pt-PT` - Portuguese (Brazil, Portugal)
- `it-IT` - Italian
- `ko-KR` - Korean
- `ru-RU` - Russian

### Voice Selection Guidelines

**For Natural Conversations:**

```typescript
tts: {
  voice: "en-US-Neural2-C",  // Female, natural
  // OR
  voice: "en-US-Neural2-A",  // Male, natural
}
```

**For Professional Narration:**

```typescript
tts: {
  voice: "en-US-Wavenet-D",  // Male, professional
  // OR
  voice: "en-GB-Wavenet-A",  // British, professional
}
```

**For Cost Optimization:**

```typescript
tts: {
  voice: "en-US-Standard-B",  // Lower cost
}
```

---

## TTS Synthesis Modes

NeuroLink supports two TTS synthesis modes:

### Mode 1: Direct Text-to-Speech (Default)

Converts input text directly to speech **without** AI generation.

```typescript
const result = await neurolink.generate({
  input: { text: "Welcome to our service!" },
  provider: "google-ai",
  tts: {
    enabled: true,
    useAiResponse: false, // Default: synthesize input text
    voice: "en-US-Neural2-C",
  },
});

// Audio contains: "Welcome to our service!"
// No AI generation occurs
```

**Use cases:**

- Pre-written scripts
- System notifications
- Fixed announcements
- Voice confirmations

### Mode 2: AI Response Synthesis

Generates AI response first, then converts the response to speech.

```typescript
const result = await neurolink.generate({
  input: { text: "Tell me a joke" },
  provider: "google-ai",
  tts: {
    enabled: true,
    useAiResponse: true, // Synthesize AI's response
    voice: "en-US-Neural2-C",
  },
});

// AI generates joke text
// TTS synthesizes the joke audio
// Both text and audio available in result
```

**Use cases:**

- Voice assistants
- Interactive AI conversations
- Dynamic content narration
- AI-powered podcasts

---

## Audio Format Options

### Supported Formats

| Format       | Quality | File Size            | Platform Support | Use Case                       |
| ------------ | ------- | -------------------- | ---------------- | ------------------------------ |
| **MP3**      | Good    | Small (~100 KB/min)  | All platforms    | Default, balanced quality/size |
| **WAV**      | Best    | Large (~1 MB/min)    | All platforms    | Highest quality, editing       |
| **OGG/Opus** | Good    | Medium (~150 KB/min) | macOS, Linux     | Web streaming                  |

### Format Selection

```typescript
// Default: MP3 (balanced quality and size)
tts: {
  voice: "en-US-Neural2-C",
  format: "mp3"  // Default
}

// Best quality: WAV
tts: {
  voice: "en-US-Neural2-C",
  format: "wav"
}

// Web streaming: OGG
tts: {
  voice: "en-US-Neural2-C",
  format: "ogg"
}
```

### Platform-Specific Considerations

**Windows:**

- Built-in playback only supports WAV format
- Auto-converts to WAV when `play: true` on Windows
- Use MP3 for file output, WAV for immediate playback

**macOS/Linux:**

- All formats supported
- `afplay` (macOS) and `ffplay` (Linux) handle all formats
- Use MP3 for general purpose

---

## Voice Customization

### Speaking Rate

Control speech speed (0.25 to 4.0):

```typescript
// Slower (half speed)
tts: {
  voice: "en-US-Neural2-C",
  speed: 0.5
}

// Normal speed (default)
tts: {
  voice: "en-US-Neural2-C",
  speed: 1.0  // Default
}

// Faster (double speed)
tts: {
  voice: "en-US-Neural2-C",
  speed: 2.0
}
```

**CLI:**

```bash
neurolink generate "This is faster speech" \
  --provider google-ai \
  --tts-voice en-US-Neural2-C \
  --tts-speed 1.5
```

### Pitch Adjustment

Adjust voice pitch (-20.0 to 20.0 semitones):

```typescript
// Lower pitch (deeper voice)
tts: {
  voice: "en-US-Neural2-C",
  pitch: -5.0
}

// Normal pitch (default)
tts: {
  voice: "en-US-Neural2-C",
  pitch: 0.0  // Default
}

// Higher pitch
tts: {
  voice: "en-US-Neural2-C",
  pitch: 5.0
}
```

**CLI:**

```bash
neurolink generate "Higher pitch test" \
  --provider google-ai \
  --tts-voice en-US-Neural2-C \
  --tts-pitch 3.0
```

### Volume Adjustment

Control output volume (-96.0 to 16.0 dB):

```typescript
tts: {
  voice: "en-US-Neural2-C",
  volumeGainDb: 0.0  // Default (no change)
}
```

---

## Complete Configuration Reference

### SDK Configuration

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Your text here" },
  provider: "google-ai", // or "vertex"
  tts: {
    enabled: true, // Enable TTS output
    useAiResponse: false, // false = input text, true = AI response
    voice: "en-US-Neural2-C", // Voice identifier
    format: "mp3", // Audio format: "mp3" | "wav" | "ogg"
    speed: 1.0, // Speaking rate: 0.25-4.0
    pitch: 0.0, // Pitch adjustment: -20.0 to 20.0
    volumeGainDb: 0.0, // Volume: -96.0 to 16.0
    quality: "standard", // Quality: "standard" | "hd"
    output: "./audio.mp3", // Optional file path
    play: false, // Auto-play (CLI only)
  },
});

// Access results
console.log("Text:", result.content);
console.log("Audio size:", result.tts?.size, "bytes");
console.log("Audio format:", result.tts?.format);
console.log("Voice used:", result.tts?.voice);

// Save audio to file
if (result.tts?.buffer) {
  import { writeFileSync } from "fs";
  writeFileSync("output.mp3", result.tts.buffer);
}
```

### CLI Flags

```bash
neurolink generate "Your text" \
  --provider google-ai \
  --tts-voice  \      # Required to enable TTS
  --tts-format  \        # mp3|wav|ogg (default: mp3)
  --tts-speed  \           # 0.25-4.0 (default: 1.0)
  --tts-pitch  \          # -20.0 to 20.0 (default: 0.0)
  --tts-output  \          # Save to file
  --tts-use-ai-response          # Synthesize AI response instead of input
```

---

## Use Cases & Examples

### 1. Voice Assistant

Create a voice assistant that speaks responses:

```typescript
const assistant = new NeuroLink();

const response = await assistant.generate({
  input: { text: "What's the weather like today?" },
  provider: "google-ai",
  tts: {
    enabled: true,
    useAiResponse: true, // Speak AI's weather response
    voice: "en-US-Neural2-C",
    play: true,
  },
});

// AI generates weather info and speaks it
```

### 2. Accessibility Features

Screen reader-style narration for visually impaired users:

```typescript
const narration = await neurolink.generate({
  input: { text: "Button clicked. Navigation menu opened." },
  provider: "google-ai",
  tts: {
    enabled: true,
    voice: "en-US-Neural2-C",
    speed: 1.2, // Slightly faster for efficiency
    play: true,
  },
});
```

### 3. Podcast Generation

Generate professional podcast intros:

```bash
neurolink generate "Welcome to Tech Insights Podcast, episode 42. Today we're discussing the future of AI development." \
  --provider google-ai \
  --tts-voice en-US-Wavenet-D \
  --tts-speed 0.95 \
  --tts-format mp3 \
  --tts-output podcast-intro.mp3
```

### 4. Language Learning

Slow pronunciation for language learners:

```bash
# Slow French pronunciation
neurolink generate "Je m'appelle Claude. Comment allez-vous?" \
  --provider google-ai \
  --tts-voice fr-FR-Neural2-A \
  --tts-speed 0.7 \
  --tts-output french-slow.mp3

# Normal speed for comparison
neurolink generate "Je m'appelle Claude. Comment allez-vous?" \
  --provider google-ai \
  --tts-voice fr-FR-Neural2-A \
  --tts-speed 1.0 \
  --tts-output french-normal.mp3
```

### 5. Multilingual Support

Generate audio in multiple languages:

```typescript
const translations = {
  english: {
    text: "Hello, welcome to our application.",
    voice: "en-US-Neural2-C",
  },
  french: {
    text: "Bonjour, bienvenue dans notre application.",
    voice: "fr-FR-Wavenet-A",
  },
  spanish: {
    text: "Hola, bienvenido a nuestra aplicación.",
    voice: "es-ES-Neural2-A",
  },
  hindi: {
    text: "नमस्ते, हमारे एप्लिकेशन में आपका स्वागत है।",
    voice: "hi-IN-Wavenet-A",
  },
};

for (const [lang, config] of Object.entries(translations)) {
  const result = await neurolink.generate({
    input: { text: config.text },
    provider: "google-ai",
    tts: {
      enabled: true,
      voice: config.voice,
      format: "mp3",
      output: `welcome-${lang}.mp3`,
    },
  });
  console.log(`Generated ${lang} audio (${result.tts?.size} bytes)`);
}
```

### 6. Batch Audio Generation

Generate multiple audio files efficiently:

```typescript
async function generateBatchAudio(
  texts: string[],
  voice: string = "en-US-Neural2-C",
) {
  const results = [];

  for (const text of texts) {
    const result = await neurolink.generate({
      input: { text },
      provider: "google-ai",
      tts: {
        enabled: true,
        voice,
        format: "mp3",
      },
    });

    results.push({
      text,
      audioBuffer: result.tts?.buffer,
      audioSize: result.tts?.size,
    });
  }

  return results;
}

// Usage
const audioFiles = await generateBatchAudio([
  "Welcome to our application.",
  "Please enter your username and password.",
  "Login successful. Redirecting to dashboard.",
]);

// Save all files
audioFiles.forEach((item, index) => {
  if (item.audioBuffer) {
    writeFileSync(`audio-${index}.mp3`, item.audioBuffer);
  }
});
```

### 7. Streaming Text + Audio

Stream AI-generated text and convert to audio:

```typescript
async function streamAndSpeak(prompt: string, voice: string) {
  // Step 1: Stream AI response
  const streamResult = await neurolink.stream({
    input: { text: prompt },
    provider: "google-ai",
    model: "gemini-2.0-flash-exp",
  });

  let fullText = "";
  for await (const chunk of streamResult.stream) {
    fullText += chunk.content;
    process.stdout.write(chunk.content);
  }

  console.log("\n\nConverting to audio...");

  // Step 2: Convert complete text to audio
  const ttsResult = await neurolink.generate({
    input: { text: fullText },
    provider: "google-ai",
    tts: {
      enabled: true,
      voice,
      play: true,
    },
  });

  return {
    text: fullText,
    audio: ttsResult.tts,
  };
}

// Usage
const result = await streamAndSpeak(
  "Explain quantum computing in simple terms",
  "en-US-Neural2-C",
);
```

---

## Error Handling

### Common Error Patterns

```typescript
async function generateTTSWithRetry(
  text: string,
  voice: string,
  maxRetries: number = 3,
) {
  let lastError: Error | undefined;

  for (let attempt = 1; attempt  setTimeout(resolve, 1000 * attempt));
      }
    }
  }

  return {
    success: false,
    error: lastError?.message || "Unknown error occurred",
    attempts: maxRetries,
  };
}

// Usage
const result = await generateTTSWithRetry(
  "Generate this with retry logic",
  "en-US-Neural2-C",
);

if (result.success && result.audio) {
  console.log("Success!");
  writeFileSync("output.mp3", result.audio.buffer);
} else {
  console.error("Failed:", result.error);
}
```

---

## Troubleshooting

### Common Issues

| Issue                            | Cause                    | Solution                                                                                     |
| -------------------------------- | ------------------------ | -------------------------------------------------------------------------------------------- |
| **"TTS client not initialized"** | Missing credentials      | Set `GOOGLE_APPLICATION_CREDENTIALS` or `GOOGLE_AI_API_KEY`                                  |
| **"Invalid voice name"**         | Voice ID not found       | Check the [Google Cloud TTS voice list](https://cloud.google.com/text-to-speech/docs/voices) |
| **"Text too long"**              | Input exceeds 5000 bytes | Split text into smaller chunks                                                               |
| **"Synthesis failed"**           | Network/API error        | Check network connection and credentials                                                     |
| **Audio doesn't play**           | Missing audio player     | Install `afplay` (macOS), `ffplay` (Linux), or use WAV on Windows                            |
| **Empty audio buffer**           | API returned no content  | Check API quota and retry                                                                    |

### Authentication Issues

**Service Account:**

```bash
# Verify credentials file exists
ls -la $GOOGLE_APPLICATION_CREDENTIALS

# Test authentication
gcloud auth application-default login
```

**API Key:**

```bash
# Verify API key is set
echo $GOOGLE_AI_API_KEY
```

### Audio Playback Issues

**macOS:**

- `afplay` is pre-installed, supports all formats
- If playback fails, check system volume settings

**Linux:**

- Install `ffmpeg` for full format support: `sudo apt install ffmpeg`
- Alternative: Use `aplay` for WAV files only

**Windows:**

- Built-in playback only supports WAV
- Install VLC or Windows Media Player for other formats
- SDK auto-converts to WAV when `play: true` on Windows

---

## Best Practices

### Performance Optimization

1. **Cache voices** - Voice list is cached for 5 minutes
2. **Batch processing** - Group multiple TTS requests when possible
3. **Use appropriate quality** - Standard voices are faster and cheaper
4. **Optimize text length** - Keep under 5000 bytes per request

### Production Deployment

1. **Use service accounts** - More secure than API keys
2. **Implement retry logic** - Handle transient network failures
3. **Monitor quota usage** - Track Google Cloud TTS API usage
4. **Set appropriate timeouts** - Default is 30 seconds
5. **Handle errors gracefully** - Provide fallback behavior

### Voice Selection

1. **Test before deploying** - Different voices suit different use cases
2. **Match gender to persona** - Choose appropriate gender for your application
3. **Consider language variants** - `en-US` vs `en-GB` vs `en-IN`
4. **Use Neural2 for quality** - Best natural-sounding voices

### Cost Management

1. **Use Standard voices** - For high-volume, non-critical use cases
2. **Cache generated audio** - Avoid regenerating the same content
3. **Monitor API usage** - Set budget alerts in Google Cloud Console

---

## Pricing

Google Cloud TTS pricing (as of 2026):

| Voice Type   | Price per 1M characters |
| ------------ | ----------------------- |
| **Neural2**  | $16.00                  |
| **Wavenet**  | $16.00                  |
| **Standard** | $4.00                   |

**Monthly free tier:** 1 million characters (Standard voices) or 1 million characters (Wavenet/Neural2 voices)

For detailed pricing, see [Google Cloud TTS Pricing](https://cloud.google.com/text-to-speech/pricing).

---

## Related Features

**Multimodal Capabilities:**

- [Multimodal Guide](/docs/features/multimodal) - Images, PDFs, CSV inputs
- [PDF Support](/docs/features/pdf-support) - Document processing
- [Video Generation](/docs/features/video-generation) - AI-powered video creation

**Advanced Features:**

- [Streaming](/docs/advanced/streaming) - Stream AI responses in real-time
- [Provider Orchestration](/docs/features/provider-orchestration) - Multi-provider failover

**Documentation:**

- [CLI Commands](/docs/cli/commands) - Complete CLI reference
- [SDK API Reference](/docs/sdk/api-reference) - Full API documentation
- [Troubleshooting](/docs/reference/troubleshooting) - Extended error catalog

---

## Summary

NeuroLink's TTS integration provides:

✅ **High-quality voices** - Neural2, Wavenet, and Standard options
✅ **Multiple languages** - 50+ voices across 10+ languages
✅ **Flexible synthesis modes** - Direct text or AI response
✅ **Voice customization** - Speed, pitch, volume control
✅ **Production-ready** - Google Cloud TTS integration
✅ **Easy integration** - Works seamlessly with CLI and SDK

**Next Steps:**

1. Set up [Google Cloud credentials](#environment-setup)
2. Discover available [voices](#voice-discovery)
3. Try the [quick start examples](#quick-start)
4. Explore [use cases](#use-cases--examples) for your application
5. Check [troubleshooting](#troubleshooting) if needed

---

## Video Analysis

<!-- Source: features/video-analysis.md -->

# Video Analysis

Comprehensive video analysis for NeuroLink, powered by Gemini 2.0 Flash. This feature goes beyond basic visual description—it provides a deep logical audit of video sequences to understand "why" and "how" events occur.

## Key Capabilities

- **Logical Analysis**: Dissect any video to extract the underlying intent, cause-and-effect, and logical progression.
- **Action-Reaction Chain**: A step-by-step audit of user or system actions and their immediate visual results.
- **Evidence-Based Reporting**: Detailed reasoning backed by structured visual indicators (colors, labels, text) in JSON format.
- **Strategic Verdicts**: High-level assessments of whether a workflow succeeded or failed logically.

## Usage

### CLI Usage

Analyze any video file with a natural language prompt.

```bash
# Basic video analysis
neurolink generate "Analyze the login workflow in this video" \
  --file ./recordings/screen-capture.mp4 \
  --provider vertex \
  --model gemini-2.0-flash
```

### SDK Usage

Integrate video analysis into your TypeScript/JavaScript projects.

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: {
    text: "Dissect the logical progression of this activity",
    files: ["./examples/tutorial.mp4"],
  },
  provider: "vertex",
  model: "gemini-2.0-flash",
});

console.log(result.content);
```

#### Advanced SDK Examples

**Custom Model Configuration**
Fine-tune the analysis by adjusting token limits and temperature.

```typescript
const result = await neurolink.generate({
  input: {
    text: "Perform a detailed audit of the checkout workflow",
    files: ["payment-flow.mov"],
  },
  model: "gemini-2.0-flash",
  maxTokens: 3000,
  temperature: 0.2, // Lower temperature for more consistent logic auditing
  provider: "vertex",
});
```

**Disabling Tool Interference**
By default, the model might try to use available tools. For pure video analysis, you can disable them.

```typescript
const result = await neurolink.generate({
  input: {
    text: "Analyze the video timeline",
    files: ["video.mp4"],
  },
  disableTools: true,
});
```

---

## Examples

### 1. UI/UX Bug Analysis

Identify why a user is unable to complete a form or where the interface is misleading.

**Prompt**: "Find why the user is getting stuck at the payment step. Look for validation errors or hidden UI elements."

### 2. Silent Failure Detection

Detect cases where an action is taken but the system provides no feedback (no loaders, no success messages).

**Prompt**: "Audit the 'Submit' button click. Is there a visual 'bond' between the click and the next state? Report any lag or missing loading indicators."

### 3. Workflow Validation

Verify if a complex multi-step process follows the intended business logic.

**Prompt**: "Trace the logical progression from 'Item Selection' to 'Checkout'. Does every state change correspond to a user action?"

### 4. Comparison Analysis

Compare two recordings to find discrepancies in behavior.

**Prompt**: "Compare these two clips. The first one is the expected behavior and the second one has a bug. Identify the exact frame or timestamp where the logic deviates."

---

## Command Gallery

Quick CLI recipes for common tasks:

```bash
# Debugging with full technical detail
neurolink generate "Audit this video" --file bug.mp4 --debug

# Using a specifically tuned model
neurolink generate "Analyze logic" --file demo.mov --model gemini-2.0-flash

# Forcing a specific provider
neurolink generate "Extract patterns" --file test.mp4 --provider vertex
```

---

## The Analysis Report

The output is structured into four major sections designed to give you a complete understanding of the video:

1. **Strategic Overview & Intent**: Defines the core activity, expected logic, and provides a primary verdict.
2. **The Action-Reaction Chain**: A granular, step-by-step audit of attempts, results, and technical inferences.
3. **Critical Findings**: Categorized milestones or anomalies with root cause analysis and visual evidence in JSON.
4. **Final Assessment**: A conclusive summary of the logical flow based on the observed evidence.

---

## Best Practices

- **Frame Depth**: Short videos (under 10s) get high-density frame coverage (1 per second), while long ones are intelligently sampled.
- **Prompt Precision**: While the model is a "Critical Logic Auditor," you can guide it with specific questions about the activity.
- **Format**: The analysis is returned as text in `result.content`, making it easy to store, display, or pipe to other tools.

---

## Video Generation with Veo 3.1

<!-- Source: features/video-generation.md -->

# Video Generation with Veo 3.1

NeuroLink integrates Google's Veo 3.1 model to enable AI-powered video generation with audio from image and text prompt inputs. Transform static images into dynamic, professional-quality video content with synchronized audio.

## Overview

Video generation in NeuroLink leverages Google's state-of-the-art Veo 3.1 model through Vertex AI. The system uses the existing `generate()` function with video-specific options:

1. **Accepts** an input image via `input.images` and text prompt via `input.text`
2. **Validates** image format, size, and aspect ratio requirements
3. **Sends** the request to Vertex AI's Veo 3.1 endpoint via `output.mode: "video"`
4. **Generates** an 8-second video with synchronized audio
5. **Returns** a `VideoGenerationResult` containing video buffer and metadata

```mermaid
graph LR
    A[Input Image] --> B[NeuroLink SDK]
    C[Text Prompt] --> B
    B --> D[Vertex AI Veo 3.1]
    D --> E[VideoGenerationResult]
    E --> F[Save to File]
    E --> G[Stream to Client]
    E --> H[Further Processing]
```

## What You Get

- **Video with audio** – Generate 8-second video clips with synchronized audio from a single image and text prompt
- **SDK integration** – Use existing `neurolink.generate()` with `output.mode: "video"` to create videos
- **CLI support** – Generate videos directly from the command line with `--outputMode video`
- **Buffer-based output** – Receive video as Buffer objects via `VideoGenerationResult` for flexible post-processing
- **Multiple resolutions** – Support for 720p and 1080p output
- **Aspect ratio control** – Choose between 9:16 (portrait) and 16:9 (landscape) formats

## Supported Provider & Model

### Provider Compatibility

| Provider | Model     | Max Duration | Audio Support          | Input Requirements  | Rate Limit | Regional Availability |
| -------- | --------- | ------------ | ---------------------- | ------------------- | ---------- | --------------------- |
| `vertex` | `veo-3.1` | 8 seconds    | :white_check_mark: Yes | image + text prompt | 10/min     | us-central1           |

### Model Versions & Capabilities

| Model Version | Release Date | Key Features                  | Notes                           |
| ------------- | ------------ | ----------------------------- | ------------------------------- |
| `veo-3.1`     | 2024         | Audio generation, 8s duration | **Recommended** - Latest stable |

> **Note:** Veo is currently available through Vertex AI. Ensure you have appropriate API access and credentials configured.

### Known Limitations

- Maximum video duration: 8 seconds (supports 4, 6, or 8 second clips)
- Input image required (text-only prompts not supported)
- Audio is auto-generated based on video content (no custom audio input)
- Processing time: 30-120 seconds depending on resolution
- Concurrent request limit: 5 per project

## Prerequisites

1. **Vertex AI credentials** with Veo access enabled
2. **Google Cloud project** with billing enabled
3. **Service account** with `aiplatform.user` role
4. **Sufficient storage** for video buffers (each 8-second video is approximately 2-5 MB)

## Quick Start

### SDK Usage

```typescript

const neurolink = new NeuroLink();

// Basic video generation using generate() with video output mode
const result = await neurolink.generate({
  input: {
    text: "Camera slowly zooms in on the product with soft lighting",
    images: [readFileSync("./product-image.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "720p",
      length: 8,
      aspectRatio: "16:9",
      audio: true,
    },
  },
});

// Access video data from VideoGenerationResult
if (result.video) {
  writeFileSync("output.mp4", result.video.data);
  console.log(`Video generated: ${result.video.metadata?.duration}s`);
}
```

#### With Full Options

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: {
    text: "Dynamic camera movement showcasing the product from multiple angles",
    images: [await readFile("./input.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "1080p",
      length: 8,
      aspectRatio: "16:9",
      audio: true,
    },
  },
});

if (result.video) {
  await writeFile("output.mp4", result.video.data);

  console.log("Video metadata:", {
    duration: result.video.metadata?.duration,
    dimensions: result.video.metadata?.dimensions,
    format: result.video.mediaType,
  });
}
```

#### Image URL Input

```typescript

const neurolink = new NeuroLink();

// Use image URL instead of Buffer
const result = await neurolink.generate({
  input: {
    text: "Elegant rotation revealing product details",
    images: ["https://example.com/product.png"],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "720p",
      length: 8,
    },
  },
});

if (result.video) {
  await writeFile("output.mp4", result.video.data);
}
```

### CLI Usage

```bash
# Basic video generation
npx @juspay/neurolink generate "Create a product showcase video" \
  --image ./input.jpg \
  --videoOutput ./output.mp4

# Full options
npx @juspay/neurolink generate "Dynamic camera movement" \
  --image ./input.jpg \
  --provider vertex \
  --model veo-3.1 \
  --videoResolution 1080p \
  --videoLength 8 \
  --videoAspectRatio 16:9 \
  --videoAudio true \
  --videoOutput ./output.mp4

# JSON output mode (for scripting)
npx @juspay/neurolink generate "prompt" \
  --image input.jpg \
  --videoOutput output.mp4 \
  --format json

# With analytics
npx @juspay/neurolink generate "Camera pans across futuristic city" \
  --image ./input-city.jpg \
  --videoResolution 1080p \
  --videoOutput ./city-video.mp4 \
  --enable-analytics
```

### CLI Arguments

| Argument             | Type    | Default        | Description                            |
| -------------------- | ------- | -------------- | -------------------------------------- |
| `--image`            | string  | Required       | Path to the input image file           |
| `--videoOutput`      | string  | `./output.mp4` | Path to save the generated video       |
| `--provider`         | string  | `vertex`       | AI provider to use                     |
| `--model`            | string  | `veo-3.1`      | Model version                          |
| `--videoResolution`  | string  | `720p`         | Output resolution (`720p` or `1080p`)  |
| `--videoLength`      | number  | `4`            | Video duration in seconds (4, 6, or 8) |
| `--videoAspectRatio` | string  | `16:9`         | Aspect ratio (`9:16` or `16:9`)        |
| `--videoAudio`       | boolean | `true`         | Enable audio generation                |

## Comprehensive Examples

### Example 1: Basic Video Generation

```typescript

const neurolink = new NeuroLink();

async function generateSingleVideo() {
  const result = await neurolink.generate({
    input: {
      text: "Smooth camera pan revealing the product with ambient lighting",
      images: [await readFile("./product-hero.jpg")],
    },
    provider: "vertex",
    model: "veo-3.1",
    output: {
      mode: "video",
      video: { resolution: "720p", length: 8 },
    },
  });

  if (result.video) {
    await writeFile("product-video.mp4", result.video.data);

    console.log({
      duration: result.video.metadata?.duration,
      dimensions: result.video.metadata?.dimensions,
      mediaType: result.video.mediaType,
      size: result.video.data.length,
    });
  }
}
```

### Example 2: Batch Video Generation

```typescript

const neurolink = new NeuroLink();

async function batchGenerateVideos(
  inputDir: string,
  outputDir: string,
  prompt: string,
) {
  const files = await readdir(inputDir);
  const imageFiles = files.filter((f) =>
    [".jpg", ".jpeg", ".png", ".webp"].includes(path.extname(f).toLowerCase()),
  );

  const results = [];

  for (const imageFile of imageFiles) {
    console.log(`Processing: ${imageFile}`);

    try {
      const imageBuffer = await readFile(path.join(inputDir, imageFile));

      const result = await neurolink.generate({
        input: {
          text: prompt,
          images: [imageBuffer],
        },
        provider: "vertex",
        model: "veo-3.1",
        output: {
          mode: "video",
          video: { resolution: "720p", length: 8 },
        },
      });

      if (result.video) {
        const outputPath = path.join(
          outputDir,
          `${path.basename(imageFile, path.extname(imageFile))}.mp4`,
        );
        await writeFile(outputPath, result.video.data);

        results.push({
          input: imageFile,
          output: outputPath,
          duration: result.video.metadata?.duration,
          success: true,
        });
      }
    } catch (error) {
      results.push({
        input: imageFile,
        error: error instanceof Error ? error.message : "Unknown error",
        success: false,
      });
    }
  }

  return results;
}

// Usage
const results = await batchGenerateVideos(
  "./product-images",
  "./product-videos",
  "Dynamic product showcase with smooth camera movement",
);
console.table(results);
```

### Example 3: Different Aspect Ratios

```typescript

const neurolink = new NeuroLink();

// Portrait video for social media stories/reels
const portrait = await neurolink.generate({
  input: {
    text: "Vertical video with upward camera movement",
    images: [await readFile("./portrait-image.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "1080p",
      aspectRatio: "9:16",
      length: 8,
    },
  },
});

// Landscape video for YouTube/websites
const landscape = await neurolink.generate({
  input: {
    text: "Cinematic horizontal pan across the scene",
    images: [await readFile("./landscape-image.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "1080p",
      aspectRatio: "16:9",
      length: 8,
    },
  },
});
```

### Example 4: Integration with Image Analysis

```typescript

const neurolink = new NeuroLink();

// Step 1: Analyze product image and generate video concept
const analysis = await neurolink.generate({
  input: {
    text: `Analyze this product image and suggest a compelling video concept.
           Focus on key visual features and motion opportunities.`,
    images: [await readFile("product-image.jpg")],
  },
  provider: "vertex",
  model: "gemini-2.5-flash",
});

console.log("AI Video Concept:", analysis.content);

// Step 2: Generate video using AI-suggested prompt
const result = await neurolink.generate({
  input: {
    text: analysis.content, // Use AI-generated prompt
    images: [await readFile("product-image.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "1080p",
      aspectRatio: "16:9",
      length: 8,
    },
  },
});

if (result.video) {
  await writeFile("ai-directed-video.mp4", result.video.data);
  console.log("AI-driven video generation complete!");
}
```

### Example 5: Error Handling

```typescript

const neurolink = new NeuroLink();

async function generateVideoWithErrorHandling(
  imagePath: string,
  prompt: string,
) {
  const maxRetries = 3;
  let lastError: Error | null = null;

  for (let attempt = 1; attempt  setTimeout(resolve, waitTime));
          continue;
        }

        if (error.category === "network" && error.retriable) {
          console.log(`Network error on attempt ${attempt}. Retrying...`);
          await new Promise((resolve) => setTimeout(resolve, 2000));
          continue;
        }

        if (error.category === "execution") {
          console.error(`Execution error: ${error.message}`);
          throw error;
        }
      }

      throw error;
    }
  }

  throw lastError || new Error("Max retries exceeded");
}
```

### Example 6: Video Generation Pipeline

```typescript

type PipelineConfig = {
  inputDir: string;
  outputDir: string;
  prompts: Record; // filename pattern -> prompt
  defaultPrompt: string;
  resolution: "720p" | "1080p";
  aspectRatio: "9:16" | "16:9";
  concurrency: number;
};

async function videoPipeline(config: PipelineConfig) {
  const neurolink = new NeuroLink();
  const limit = pLimit(config.concurrency);

  // Ensure output directory exists
  await mkdir(config.outputDir, { recursive: true });

  // Get all image files
  const files = await readdir(config.inputDir);
  const imageFiles = files.filter((f) => /\.(jpg|jpeg|png|webp)$/i.test(f));

  // Process with concurrency limit
  const results = await Promise.all(
    imageFiles.map((imageFile) =>
      limit(async () => {
        // Find matching prompt pattern or use default
        const prompt =
          Object.entries(config.prompts).find(([pattern]) =>
            imageFile.startsWith(pattern),
          )?.[1] || config.defaultPrompt;

        try {
          const imageBuffer = await readFile(
            path.join(config.inputDir, imageFile),
          );

          const result = await neurolink.generate({
            input: {
              text: prompt,
              images: [imageBuffer],
            },
            provider: "vertex",
            model: "veo-3.1",
            output: {
              mode: "video",
              video: {
                resolution: config.resolution,
                aspectRatio: config.aspectRatio,
                length: 8,
              },
            },
          });

          if (result.video) {
            const outputPath = path.join(
              config.outputDir,
              `${path.basename(imageFile, path.extname(imageFile))}.mp4`,
            );
            await writeFile(outputPath, result.video.data);

            return {
              input: imageFile,
              output: outputPath,
              duration: result.video.metadata?.duration,
              success: true,
            };
          }
          return {
            input: imageFile,
            success: false,
            error: "No video generated",
          };
        } catch (error) {
          return {
            input: imageFile,
            success: false,
            error: error instanceof Error ? error.message : "Unknown error",
          };
        }
      }),
    ),
  );

  return results;
}

// Usage
const pipelineResults = await videoPipeline({
  inputDir: "./raw-images",
  outputDir: "./generated-videos",
  prompts: {
    "product-": "Elegant product rotation with soft lighting",
    "hero-": "Dramatic zoom with cinematic lighting",
    "lifestyle-": "Natural movement with ambient atmosphere",
  },
  defaultPrompt: "Smooth camera movement showcasing the subject",
  resolution: "1080p",
  aspectRatio: "16:9",
  concurrency: 3,
});

console.table(pipelineResults);
```

## Type Definitions

### VideoGenerationInput

Extended input type for video generation requests:

```typescript
// Part of GenerateOptions input - uses existing multimodal types
type VideoGenerationInput = {
  text: string; // Prompt describing desired video motion/style
  images: Array; // Input image (required)
};
```

### VideoOutputOptions

Options for video output configuration:

```typescript
type VideoOutputOptions = {
  /** Output resolution - "720p" (1280x720) or "1080p" (1920x1080) */
  resolution?: "720p" | "1080p";
  /** Video duration in seconds (4, 6, or 8 seconds supported) */
  length?: 4 | 6 | 8;
  /** Aspect ratio - "9:16" for portrait or "16:9" for landscape */
  aspectRatio?: "9:16" | "16:9";
  /** Enable audio generation (default: true) */
  audio?: boolean;
};
```

### VideoGenerationResult

Result type for generated video:

```typescript
type VideoGenerationResult = {
  /** Raw video data as Buffer */
  data: Buffer;
  /** Video media type */
  mediaType: "video/mp4" | "video/webm";
  /** Video metadata */
  metadata?: {
    /** Original filename if applicable */
    filename?: string;
    /** Video duration in seconds */
    duration?: number;
    /** Video dimensions */
    dimensions?: {
      width: number;
      height: number;
    };
    /** Frame rate in fps */
    frameRate?: number;
    /** Video codec used */
    codec?: string;
    /** Model used for generation */
    model?: string;
    /** Provider used for generation */
    provider?: string;
    /** Aspect ratio of the video */
    aspectRatio?: string;
    /** Whether audio was enabled during generation */
    audioEnabled?: boolean;
    /** Processing time in milliseconds */
    processingTime?: number;
  };
};
```

### Extended GenerateResult

The `generate()` function returns an extended result when video mode is enabled:

```typescript
type GenerateResult = {
  content: string; // Text content (prompt echoed back)
  provider?: string;
  model?: string;
  usage?: TokenUsage;
  responseTime?: number;

  // Video-specific field (present when output.mode === "video")
  video?: VideoGenerationResult;

  // Other optional fields
  toolsUsed?: string[];
  analytics?: AnalyticsData;
  evaluation?: EvaluationData;
};
```

## Configuration & Best Practices

### Configuration Options

| Option                     | Type               | Default     | Required | Description                           |
| -------------------------- | ------------------ | ----------- | -------- | ------------------------------------- |
| `input.images[0]`          | `Buffer \| string` | -           | Yes      | Image buffer, file path, or URL       |
| `input.text`               | `string`           | -           | Yes      | Text description of desired video     |
| `provider`                 | `string`           | `"vertex"`  | No       | AI provider (currently only `vertex`) |
| `model`                    | `string`           | `"veo-3.1"` | No       | Model version to use                  |
| `output.mode`              | `string`           | `"text"`    | Yes      | Must be `"video"` for video output    |
| `output.video.resolution`  | `string`           | `"720p"`    | No       | Output resolution (`720p` or `1080p`) |
| `output.video.length`      | `number`           | `6`         | No       | Duration in seconds (4, 6, or 8)      |
| `output.video.aspectRatio` | `string`           | `"16:9"`    | No       | Aspect ratio (`9:16` or `16:9`)       |
| `output.video.audio`       | `boolean`          | `true`      | No       | Enable audio generation               |

### Video Quality Settings

```typescript
// High quality for professional content
const professional = await neurolink.generate({
  input: {
    text: "Cinematic product showcase with dramatic lighting",
    images: [await readFile("./product.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "1080p",
      length: 8,
      aspectRatio: "16:9",
      audio: true,
    },
  },
});

// Optimized for social media
const social = await neurolink.generate({
  input: {
    text: "Quick product reveal",
    images: [await readFile("./input.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "720p",
      length: 4,
      aspectRatio: "9:16",
      audio: true,
    },
  },
});
```

### Best Practices

#### 1. Prompt Engineering

```typescript
// ❌ Vague and unclear
const vaguePrompt = "Make a video of this product";
// ✅ Specific and actionable
const specificPrompt =
  "Smooth 360-degree rotation of the product with soft studio lighting, camera slowly zooms out";
// ✅ Include camera direction
const cameraDirectionPrompt =
  "Camera slowly pans from left to right, revealing product details with cinematic depth of field";
// ✅ Describe motion and atmosphere
const atmospherePrompt =
  "Dynamic product showcase with subtle particle effects, ambient lighting transitions from warm to cool";
```

**Prompt Template Examples:**

| Use Case         | Template                                                                           |
| ---------------- | ---------------------------------------------------------------------------------- |
| Product Rotation | `"Elegant 360-degree rotation of [product] with [lighting style] lighting"`        |
| Hero Shot        | `"Cinematic zoom from [distance] to [detail] with [motion style] camera movement"` |
| Lifestyle        | `"Natural scene with [subject] in [environment], subtle ambient movement"`         |
| Social Media     | `"Quick dynamic reveal of [product] with energetic transitions"`                   |

#### 2. Image Preparation

```typescript
// Image requirements
const imageRequirements = {
  minResolution: "720p", // 1280x720 minimum
  recommendedResolution: "1080p", // 1920x1080 for best results
  formats: ["JPEG", "PNG", "WebP"],
  maxSize: "10MB",
  aspectRatio: "Match desired video output",
};

// Preprocessing recommendations

async function prepareImage(inputPath: string, outputRatio: "9:16" | "16:9") {
  const targetWidth = outputRatio === "16:9" ? 1920 : 1080;
  const targetHeight = outputRatio === "16:9" ? 1080 : 1920;

  return sharp(inputPath)
    .resize(targetWidth, targetHeight, {
      fit: "cover",
      position: "center",
    })
    .jpeg({ quality: 90 })
    .toBuffer();
}
```

#### 3. Performance Optimization

```typescript
// Parallel processing with rate limiting

const limit = pLimit(3); // Max 3 concurrent requests (within provider limits)

const images = ["img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg", "img5.jpg"];

const videos = await Promise.all(
  images.map((img) =>
    limit(async () => {
      const result = await neurolink.generate({
        input: {
          text: "Product showcase",
          images: [await readFile(img)],
        },
        provider: "vertex",
        model: "veo-3.1",
        output: { mode: "video", video: { resolution: "720p", length: 8 } },
      });
      return result.video;
    }),
  ),
);
```

#### 4. Quality vs. Cost Tradeoffs

| Setting   | Quality | Cost    | Use Case                 |
| --------- | ------- | ------- | ------------------------ |
| 720p, 4s  | Good    | Low     | Quick previews, drafts   |
| 720p, 8s  | Good    | Medium  | Social media content     |
| 1080p, 6s | High    | High    | Marketing materials      |
| 1080p, 8s | Highest | Highest | Professional productions |

## Error Handling & Validation

### Validation Rules

| Parameter                  | Validation                      | Error Type     | Example Message                                    |
| -------------------------- | ------------------------------- | -------------- | -------------------------------------------------- |
| `input.images[0]`          | Must be valid image file/buffer | NeuroLinkError | `Invalid image format. Supported: JPEG, PNG, WebP` |
| `input.images[0]`          | Max 10MB                        | NeuroLinkError | `Image size exceeds 10MB limit`                    |
| `input.text`               | 1-500 characters                | NeuroLinkError | `Prompt must be between 1 and 500 characters`      |
| `output.video.resolution`  | `720p` or `1080p`               | NeuroLinkError | `Invalid resolution. Use '720p' or '1080p'`        |
| `output.video.length`      | 4, 6, or 8                      | NeuroLinkError | `Invalid length. Use 4, 6, or 8 seconds`           |
| `output.video.aspectRatio` | `9:16` or `16:9`                | NeuroLinkError | `Invalid aspect ratio. Use '9:16' or '16:9'`       |

### Error Types

NeuroLink uses a unified error handling system with error categories:

```typescript

// Error categories (from ErrorCategory enum)
type ErrorCategory =
  | "validation"
  | "timeout"
  | "network"
  | "resource"
  | "permission"
  | "configuration"
  | "execution"
  | "system";

// Video-specific error codes
const VIDEO_ERROR_CODES = {
  GENERATION_FAILED: "VIDEO_GENERATION_FAILED",
  PROVIDER_NOT_CONFIGURED: "VIDEO_PROVIDER_NOT_CONFIGURED",
  POLL_TIMEOUT: "VIDEO_POLL_TIMEOUT",
  INVALID_INPUT: "VIDEO_INVALID_INPUT",
};
```

### Error Handling Example

```typescript

try {
  const result = await neurolink.generate({
    input: {
      text: prompt,
      images: [imageBuffer],
    },
    provider: "vertex",
    model: "veo-3.1",
    output: { mode: "video", video: { resolution: "720p" } },
  });
} catch (error) {
  if (error instanceof NeuroLinkError) {
    console.error(`Error [${error.code}]:`, error.message);
    console.error("Category:", error.category);
    console.error("Severity:", error.severity);
    console.error("Retriable:", error.retriable);

    // Handle specific error categories
    switch (error.category) {
      case "validation":
        console.error("Validation issues:");
        // - Unsupported image format (use JPEG, PNG, or WebP)
        // - Image too large (max 10MB)
        // - Invalid prompt length (1-500 characters)
        // - Invalid resolution, length, or aspect ratio
        break;

      case "timeout":
        console.error("Request timed out - retry with backoff");
        break;

      case "configuration":
      case "permission":
        console.error(
          "Config/auth failed - check GOOGLE_APPLICATION_CREDENTIALS",
        );
        break;

      case "network":
        console.error("Network error - retry with backoff");
        break;

      case "execution":
        console.error("Execution error - check status and quotas");
        // Detect rate limiting via error code
        if (error.code.includes("RATE_LIMIT")) {
          console.error("Rate limited - implement exponential backoff");
        }
        break;
    }
  }
}
```

## Token & Cost Information

### Pricing Structure

| Resolution | Duration  | Estimated Cost | Notes                |
| ---------- | --------- | -------------- | -------------------- |
| 720p       | 4 seconds | ~$1.60         | Best for previews    |
| 720p       | 8 seconds | ~$3.20         | Standard quality     |
| 1080p      | 4 seconds | ~$2.00         | High quality short   |
| 1080p      | 8 seconds | ~$4.00         | Professional quality |

> **Note:** Pricing is approximate and subject to change (as of October 2025). Check Google Cloud pricing for current rates.

### Storage Costs

| Resolution | Duration  | Approx. File Size |
| ---------- | --------- | ----------------- |
| 720p       | 4 seconds | ~1-2 MB           |
| 720p       | 8 seconds | ~2-4 MB           |
| 1080p      | 4 seconds | ~2-3 MB           |
| 1080p      | 8 seconds | ~4-6 MB           |

## Working with Video Results

```typescript

const neurolink = new NeuroLink();

// Generate video
const result = await neurolink.generate({
  input: {
    text: "Product showcase video",
    images: [await readFile("./product.jpg")],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: { mode: "video" },
});

// Check for video result
if (result.video) {
  // Save to file
  await writeFile("output.mp4", result.video.data);

  // Access metadata
  console.log({
    duration: result.video.metadata?.duration,
    resolution: result.video.metadata?.dimensions,
    model: result.video.metadata?.model,
    size: result.video.data.length,
  });
}
```

## Troubleshooting

| Symptom                   | Cause                             | Solution                                                 |
| ------------------------- | --------------------------------- | -------------------------------------------------------- |
| Authentication error      | Invalid or missing credentials    | Verify `GOOGLE_APPLICATION_CREDENTIALS` is set correctly |
| Authorization error       | Service account lacks permissions | Add `aiplatform.user` role to service account            |
| Validation error (format) | Unsupported image type            | Convert image to JPEG, PNG, or WebP                      |
| Validation error (size)   | Image exceeds 10MB limit          | Compress or resize image before upload                   |
| Rate limit error          | Too many requests                 | Implement exponential backoff                            |
| Network timeout           | Processing took too long          | Try lower resolution or shorter duration                 |
| Provider quota exceeded   | Monthly quota reached             | Request quota increase or wait for reset                 |
| Connection error          | Network issues                    | Check network connectivity; retry with backoff           |
| Video quality is poor     | Low resolution input image        | Use minimum 720p source images                           |
| Audio not matching video  | Complex scene                     | Simplify prompt; focus on visual elements                |
| Unexpected aspect ratio   | Input image ratio mismatch        | Preprocess image to match target aspect ratio            |

### Debug Mode

```typescript
// Enable verbose logging for debugging
const neurolink = new NeuroLink({
  debug: true,
  logLevel: "verbose",
});

// Or via environment variable
// export NEUROLINK_DEBUG=true
```

## Limitations

### Current Limitations

| Limitation          | Description             | Workaround                               |
| ------------------- | ----------------------- | ---------------------------------------- |
| Max duration        | 8 seconds maximum       | Chain multiple videos for longer content |
| Audio input         | No custom audio support | Audio is auto-generated based on content |
| Text-only prompts   | Requires input image    | Use image generation first, then video   |
| Provider support    | Vertex AI only          | No alternative providers currently       |
| Concurrent requests | Max 5 per project       | Implement request queuing                |

## Testing

### Unit Test Examples

```typescript

describe("Video Generation", () => {
  it("should generate video with valid inputs", async () => {
    const neurolink = new NeuroLink();
    const imageBuffer = Buffer.from("fake-image-data");

    const result = await neurolink.generate({
      input: {
        text: "Test video generation",
        images: [imageBuffer],
      },
      provider: "vertex",
      model: "veo-3.1",
      output: { mode: "video", video: { resolution: "720p", length: 8 } },
    });

    expect(result.video).toBeDefined();
    expect(result.video?.data).toBeInstanceOf(Buffer);
    expect(result.video?.metadata?.duration).toBe(8);
  });

  it("should throw error for invalid image format", async () => {
    const neurolink = new NeuroLink();

    await expect(
      neurolink.generate({
        input: {
          text: "Test",
          images: ["invalid-file.txt"],
        },
        provider: "vertex",
        model: "veo-3.1",
        output: { mode: "video" },
      }),
    ).rejects.toThrow(); // Should throw ValidationError
  });

  it("should respect resolution settings", async () => {
    const neurolink = new NeuroLink();
    const imageBuffer = Buffer.from("fake-image-data");

    const result = await neurolink.generate({
      input: {
        text: "Test",
        images: [imageBuffer],
      },
      provider: "vertex",
      model: "veo-3.1",
      output: { mode: "video", video: { resolution: "1080p" } },
    });

    expect(result.video?.metadata?.dimensions?.width).toBe(1920);
    expect(result.video?.metadata?.dimensions?.height).toBe(1080);
  });
});
```

### Mock Strategy for CI/CD

```typescript

// Mock the NeuroLink class to return video generation results
vi.mock("@juspay/neurolink", () => ({
  NeuroLink: vi.fn().mockImplementation(() => ({
    generate: vi.fn().mockResolvedValue({
      content: "",
      provider: "vertex",
      model: "veo-3.1",
      video: {
        data: Buffer.from("mock-video-data"),
        mediaType: "video/mp4",
        metadata: {
          duration: 8,
          dimensions: { width: 1920, height: 1080 },
          model: "veo-3.1",
        },
      },
    }),
  })),
}));
```

### Integration Test Pattern

```typescript

describe("Video Generation Integration", () => {
  it("should complete full generation workflow", async () => {
    // Skip in CI without credentials
    if (!process.env.GOOGLE_APPLICATION_CREDENTIALS) {
      console.log("Skipping: No Google credentials");
      return;
    }

    const neurolink = new NeuroLink();
    const imageBuffer = await readFile("./test-fixtures/sample-image.jpg");

    const result = await neurolink.generate({
      input: {
        text: "Smooth camera pan for product showcase",
        images: [imageBuffer],
      },
      provider: "vertex",
      model: "veo-3.1",
      output: {
        mode: "video",
        video: { resolution: "720p", length: 4 },
      },
    });

    expect(result.video).toBeDefined();
    expect(result.video?.data).toBeInstanceOf(Buffer);
    expect(result.video?.data.length).toBeGreaterThan(0);
    expect(result.video?.metadata?.duration).toBe(4);
  }, 180000); // 3 minute timeout for video generation
});
```

## Related Features

- [Multimodal Chat](/docs/features/multimodal-chat) – Overview of multimodal capabilities and image support
- [PDF Support](/docs/features/pdf-support) – Document processing for visual analysis
- [CSV Support](/docs/features/csv-support) – Data file processing

## Implementation Files

The video generation feature is implemented across these files:

| File                                           | Purpose                                                                       |
| ---------------------------------------------- | ----------------------------------------------------------------------------- |
| `src/lib/types/multimodal.ts`                  | Core types: `VideoOutputOptions`, `VideoGenerationResult`                     |
| `src/lib/types/generateTypes.ts`               | Extended `GenerateOptions` with video output mode                             |
| `src/lib/adapters/video/vertexVideoHandler.ts` | Vertex AI Veo 3.1 video generation handler                                    |
| `src/lib/core/baseProvider.ts`                 | Video generation routing in `generate()` method                               |
| `src/lib/neurolink.ts`                         | Main SDK interface with video result handling                                 |
| `src/lib/utils/parameterValidation.ts`         | Input validation: `validateVideoGenerationInput()`, `validateImageForVideo()` |
| `src/lib/utils/errorHandling.ts`               | Error factory methods for video generation errors                             |

### Key Functions

- **`generateVideoWithVertex()`** - Main video generation function in `vertexVideoHandler.ts`
- **`validateVideoGenerationInput()`** - Comprehensive input validation in `parameterValidation.ts`
- **`validateImageForVideo()`** - Image format and size validation in `parameterValidation.ts`
- **`handleVideoGeneration()`** - Private method in `BaseProvider` that orchestrates the video generation flow

**Next:** [Multimodal Chat Guide](/docs/features/multimodal-chat) | [PDF Support](/docs/features/pdf-support)

---

# Examples

## Examples & Tutorials

<!-- Source: examples/index.md -->

# Examples & Tutorials

Learn NeuroLink through practical examples and step-by-step tutorials for real-world applications.

##  What You'll Find Here

This section contains practical implementations, use cases, and tutorials to help you integrate NeuroLink into your projects effectively.

-  **[Basic Usage](/docs/examples/basic-usage)**

  Fundamental examples for both CLI and SDK usage, covering core functionality and common patterns.

- ⭐ **[Advanced Examples](/docs/advanced)**

  Complex implementations showcasing advanced features like custom tools, analytics, and streaming.

-  **[Use Cases](/docs/use-cases)**

  Real-world scenarios and applications across different industries and project types.

-  **[Business Applications](/docs/examples/business)**

  Enterprise-focused examples for production deployments and business automation.

##  Quick Examples

```bash
# CLI - Get started immediately
npx @juspay/neurolink generate "Write a professional email"

# With specific provider
npx @juspay/neurolink gen "Explain AI" --provider google-ai
```

```typescript
// SDK - Basic integration

const neurolink = new NeuroLink();
const result = await neurolink.generate({
  input: { text: "Create a product description" },
});

console.log(result.content);
```

```bash
# CLI - Track usage and costs
npx @juspay/neurolink generate "Business proposal" \
  --enable-analytics \
  --enable-evaluation \
  --debug
```

```typescript
// SDK - Monitor performance
const result = await neurolink.generate({
  input: { text: "Market analysis report" },
  enableAnalytics: true,
  enableEvaluation: true,
});

console.log(`Cost: $${result.analytics.cost}`);
console.log(`Quality: ${result.evaluation.overall}/10`);
```

```typescript
// Register a custom weather tool
neurolink.registerTool("weather", {
  description: "Get weather for a city",
  parameters: z.object({
    city: z.string(),
    units: z.enum(["C", "F"]).default("C"),
  }),
  execute: async ({ city, units }) => {
    const data = await fetchWeather(city);
    return {
      city,
      temperature: units === "F"
        ? (data.temp * 9/5) + 32
        : data.temp,
      condition: data.condition,
    };
  },
});

// Use the tool
const result = await neurolink.generate({
  input: { text: "What's the weather in Tokyo?" },
});
```

## ️ Framework Integration Examples

```typescript
// app/api/ai/route.ts

export async function POST(request: Request) {
  const { prompt, context } = await request.json();

  const neurolink = new NeuroLink();
  const result = await neurolink.generate({
    input: { text: prompt },
    context,
    enableAnalytics: true,
  });

  return Response.json({
    content: result.content,
    usage: result.analytics,
  });
}
```

```typescript
// src/routes/api/stream/+server.ts

export const POST: RequestHandler = async ({ request }) => {
  const { message } = await request.json();
  const provider = createBestAIProvider();

  const result = await provider.stream({
    input: { text: message },
    timeout: "2m",
  });

  // Manually create ReadableStream from AsyncIterable
  const readable = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of result.stream) {
          if (chunk && typeof chunk === "object" && "content" in chunk) {
            controller.enqueue(new TextEncoder().encode(chunk.content));
          }
        }
        controller.close();
      } catch (error) {
        controller.error(error);
      }
    },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
};
```

```typescript

const app = express();
const neurolink = new NeuroLink();

app.post('/api/generate', async (req, res) => {
  try {
    const result = await neurolink.generate({
      input: { text: req.body.prompt },
      provider: req.body.provider,
      enableAnalytics: true,
    });

    res.json({
      success: true,
      content: result.content,
      analytics: result.analytics,
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});
```

##  Common Use Cases

### Content Creation

```typescript
// Blog post generator with SEO optimization
const generateBlogPost = async (topic: string, keywords: string[]) => {
  const result = await neurolink.generate({
    input: {
      text: `Write a comprehensive blog post about ${topic}.
             Include these keywords naturally: ${keywords.join(", ")}`,
    },
    maxTokens: 2000,
    temperature: 0.7,
    enableAnalytics: true,
  });

  return {
    content: result.content,
    wordCount: result.content.split(" ").length,
    cost: result.analytics.cost,
  };
};
```

### Code Generation

```typescript
// Code review and suggestions
const reviewCode = async (codeSnippet: string, language: string) => {
  const result = await neurolink.generate({
    input: {
      text: `Review this ${language} code and provide suggestions:
             \`\`\`${language}
             ${codeSnippet}
             \`\`\``,
    },
    enableEvaluation: true,
  });

  return {
    review: result.content,
    confidence: result.evaluation.overall,
  };
};
```

### Data Analysis

```typescript
// Automated report generation
const generateReport = async (data: any[], reportType: string) => {
  const summary = JSON.stringify(data.slice(0, 5)); // Sample data

  const result = await neurolink.generate({
    input: {
      text: `Generate a ${reportType} report based on this data sample: ${summary}`,
    },
    context: {
      reportType,
      dataSize: data.length,
      timestamp: new Date().toISOString(),
    },
    enableAnalytics: true,
  });

  return result;
};
```

##  Batch Processing

```bash
# CLI batch processing
echo -e "Product description for laptop\nProduct description for phone\nProduct description for tablet" > products.txt
npx @juspay/neurolink batch products.txt --output descriptions.json
```

```typescript
// SDK batch processing
const generateMultiple = async (prompts: string[]) => {
  const results = await Promise.all(
    prompts.map((prompt) =>
      neurolink.generate({
        input: { text: prompt },
        enableAnalytics: true,
      }),
    ),
  );

  const totalCost = results.reduce(
    (sum, result) => sum + (result.analytics?.cost || 0),
    0,
  );

  return { results, totalCost };
};
```

##  Learning Path

1. **Start with [Basic Usage](/docs/examples/basic-usage)** - Core functionality
2. **Explore [Use Cases](/docs/use-cases)** - Find relevant scenarios
3. **Try [Advanced Examples](/docs/advanced)** - Complex implementations
4. **Study [Business Applications](/docs/examples/business)** - Production patterns

##  Related Resources

- **[CLI Guide](/docs/)** - Complete command reference
- **[SDK Reference](/docs/)** - API documentation
- **[Advanced Features](/docs/)** - Enterprise capabilities
- **[Visual Demos](/docs/)** - See examples in action

---

## Advanced Examples

<!-- Source: examples/advanced.md -->

# Advanced Examples

Complex integration patterns, enterprise workflows, and sophisticated use cases for NeuroLink.

## ️ Enterprise Architecture

### Multi-Provider Load Balancing

```typescript

class LoadBalancedNeuroLink {
  private instances: Map;
  private usage: Map;
  private limits: Map;

  constructor() {
    this.instances = new Map([
      ["openai", new NeuroLink({ defaultProvider: "openai" })],
      ["google-ai", new NeuroLink({ defaultProvider: "google-ai" })],
      ["anthropic", new NeuroLink({ defaultProvider: "anthropic" })],
    ]);

    this.usage = new Map([
      ["openai", 0],
      ["google-ai", 0],
      ["anthropic", 0],
    ]);

    // Daily rate limits
    this.limits = new Map([
      ["openai", 1000],
      ["google-ai", 2000],
      ["anthropic", 500],
    ]);
  }

  async generate(
    prompt: string,
    priority: "cost" | "speed" | "quality" = "speed",
  ) {
    const provider = this.selectOptimalProvider(priority);

    try {
      const result = await this.instances.get(provider)!.generate({
        input: { text: prompt },
      });

      this.usage.set(provider, this.usage.get(provider)! + 1);
      return { ...result, selectedProvider: provider };
    } catch (error) {
      console.warn(`Provider ${provider} failed, trying fallback...`);
      return this.generateWithFallback(prompt, provider);
    }
  }

  private selectOptimalProvider(priority: string): Provider {
    const available = Array.from(this.instances.keys()).filter(
      (provider) => this.usage.get(provider)!  this.getCost(a) - this.getCost(b))[0];
      case "speed":
        return available.sort((a, b) => this.getSpeed(a) - this.getSpeed(b))[0];
      case "quality":
        return available.sort(
          (a, b) => this.getQuality(b) - this.getQuality(a),
        )[0];
      default:
        return available[0];
    }
  }

  private async generateWithFallback(prompt: string, failedProvider: Provider) {
    const remaining = Array.from(this.instances.keys()).filter(
      (p) => p !== failedProvider,
    );

    for (const provider of remaining) {
      try {
        const result = await this.instances.get(provider)!.generate({
          input: { text: prompt },
        });

        this.usage.set(provider, this.usage.get(provider)! + 1);
        return { ...result, selectedProvider: provider, fallback: true };
      } catch (error) {
        console.warn(`Fallback provider ${provider} also failed`);
      }
    }

    throw new Error("All providers failed");
  }

  private getCost(provider: Provider): number {
    const costs = { "google-ai": 1, openai: 2, anthropic: 3 };
    return costs[provider] || 999;
  }

  private getSpeed(provider: Provider): number {
    const speeds = { "google-ai": 1, openai: 2, anthropic: 3 };
    return speeds[provider] || 999;
  }

  private getQuality(provider: Provider): number {
    const quality = { anthropic: 10, openai: 9, "google-ai": 8 };
    return quality[provider] || 1;
  }

  getUsageStats() {
    return {
      usage: Object.fromEntries(this.usage),
      limits: Object.fromEntries(this.limits),
      remaining: Object.fromEntries(
        Array.from(this.limits.entries()).map(([provider, limit]) => [
          provider,
          limit - this.usage.get(provider)!,
        ]),
      ),
    };
  }
}

// Usage
const balancer = new LoadBalancedNeuroLink();

const result = await balancer.generate(
  "Write a technical analysis",
  "quality", // Prioritize quality
);

console.log(`Used provider: ${result.selectedProvider}`);
console.log("Usage stats:", balancer.getUsageStats());
```

### Caching and Performance Optimization

```typescript

class CachedNeuroLink {
  private neurolink: NeuroLink;
  private cache: LRUCache;
  private analytics: Map;

  constructor() {
    this.neurolink = new NeuroLink();
    this.cache = new LRUCache({
      max: 1000,
      ttl: 1000 * 60 * 60, // 1 hour TTL
      sizeCalculation: (value) => JSON.stringify(value).length,
    });
    this.analytics = new Map();
  }

  async generate(params: any, options: { useCache?: boolean } = {}) {
    const cacheKey = this.createCacheKey(params);
    const startTime = Date.now();

    // Check cache first
    if (options.useCache !== false) {
      const cached = this.cache.get(cacheKey);
      if (cached) {
        this.recordAnalytics(cacheKey, "cache_hit", Date.now() - startTime);
        return { ...cached, fromCache: true };
      }
    }

    // Generate new response
    try {
      const result = await this.neurolink.generate(params);
      const duration = Date.now() - startTime;

      // Cache the result
      if (options.useCache !== false) {
        this.cache.set(cacheKey, result);
      }

      this.recordAnalytics(cacheKey, "api_call", duration);
      return { ...result, fromCache: false };
    } catch (error) {
      this.recordAnalytics(cacheKey, "error", Date.now() - startTime);
      throw error;
    }
  }

  private createCacheKey(params: any): string {
    const normalized = {
      text: params.input?.text,
      provider: params.provider,
      temperature: params.temperature,
      maxTokens: params.maxTokens,
    };

    return crypto
      .createHash("sha256")
      .update(JSON.stringify(normalized))
      .digest("hex");
  }

  private recordAnalytics(key: string, type: string, duration: number) {
    if (!this.analytics.has(key)) {
      this.analytics.set(key, []);
    }

    this.analytics.get(key).push({
      type,
      duration,
      timestamp: new Date().toISOString(),
    });
  }

  getCacheStats() {
    return {
      size: this.cache.size,
      hits: Array.from(this.analytics.values())
        .flat()
        .filter((event) => event.type === "cache_hit").length,
      misses: Array.from(this.analytics.values())
        .flat()
        .filter((event) => event.type === "api_call").length,
      errors: Array.from(this.analytics.values())
        .flat()
        .filter((event) => event.type === "error").length,
    };
  }

  clearCache() {
    this.cache.clear();
    this.analytics.clear();
  }
}

// Usage
const cachedNeuroLink = new CachedNeuroLink();

// First call - will hit API
const result1 = await cachedNeuroLink.generate({
  input: { text: "Explain caching" },
});

// Second identical call - will hit cache
const result2 = await cachedNeuroLink.generate({
  input: { text: "Explain caching" },
});

console.log("Cache stats:", cachedNeuroLink.getCacheStats());
```

##  Workflow Automation

### Document Processing Pipeline

```typescript
class DocumentProcessor {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async processDocument(document: string, workflow: string[]) {
    const results = { originalDocument: document, steps: [] };
    let currentContent = document;

    for (const [index, step] of workflow.entries()) {
      console.log(`Processing step ${index + 1}: ${step}`);

      try {
        const result = await this.executeStep(currentContent, step);

        results.steps.push({
          step,
          input: currentContent,
          output: result.content,
          provider: result.provider,
          usage: result.usage,
        });

        currentContent = result.content;
      } catch (error) {
        results.steps.push({
          step,
          error: error.message,
        });
        break;
      }
    }

    return results;
  }

  private async executeStep(content: string, instruction: string) {
    return await this.neurolink.generate({
      input: {
        text: `${instruction}\n\nContent to process:\n${content}`,
      },
      provider: "anthropic", // Claude is good for document processing
      temperature: 0.3,
    });
  }
}

// Usage - Document improvement workflow
const processor = new DocumentProcessor();

const workflow = [
  "Fix any grammar and spelling errors",
  "Improve clarity and readability",
  "Add section headings where appropriate",
  "Create a table of contents",
  "Add a conclusion summary",
];

const result = await processor.processDocument(rawDocument, workflow);

console.log(
  "Final processed document:",
  result.steps[result.steps.length - 1].output,
);
```

### Multi-Stage Content Creation

```typescript
class ContentCreationPipeline {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async createArticle(
    topic: string,
    audience: string,
    length: "short" | "medium" | "long",
  ) {
    const stages = [
      { name: "research", provider: "google-ai" },
      { name: "outline", provider: "anthropic" },
      { name: "draft", provider: "openai" },
      { name: "review", provider: "anthropic" },
      { name: "finalize", provider: "openai" },
    ];

    const context = { topic, audience, length };
    let content = "";
    const stageResults = [];

    for (const stage of stages) {
      const result = await this.executeStage(stage, content, context);
      stageResults.push(result);
      content = result.content;
    }

    return {
      finalContent: content,
      stages: stageResults,
      metadata: {
        topic,
        audience,
        length,
        createdAt: new Date().toISOString(),
        wordCount: content.split(" ").length,
      },
    };
  }

  private async executeStage(
    stage: any,
    previousContent: string,
    context: any,
  ) {
    const prompts = {
      research: `Research key points about "${context.topic}" for ${context.audience}.
                 Provide 5-7 main points with brief explanations.`,

      outline: `Create a detailed outline for a ${context.length} article about "${context.topic}"
                for ${context.audience}. Base it on this research: ${previousContent}`,

      draft: `Write a ${context.length} article based on this outline: ${previousContent}.
              Target audience: ${context.audience}. Make it engaging and informative.`,

      review: `Review and improve this article: ${previousContent}.
               Check for clarity, flow, and engagement. Suggest improvements.`,

      finalize: `Apply these improvements to create the final version: ${previousContent}`,
    };

    const result = await this.neurolink.generate({
      input: { text: prompts[stage.name] },
      provider: stage.provider,
      temperature: stage.name === "draft" ? 0.8 : 0.5,
    });

    return {
      stage: stage.name,
      provider: stage.provider,
      content: result.content,
      usage: result.usage,
    };
  }
}

// Usage
const pipeline = new ContentCreationPipeline();

const article = await pipeline.createArticle(
  "AI automation in healthcare",
  "healthcare professionals",
  "long",
);

console.log("Final article:", article.finalContent);
console.log("Creation metadata:", article.metadata);
```

##  AI Agent Framework

### Specialized AI Agents

```typescript
abstract class AIAgent {
  protected neurolink: NeuroLink;
  protected specialization: string;
  protected temperature: number;
  protected preferredProvider: string;

  constructor(specialization: string, config: any = {}) {
    this.neurolink = new NeuroLink();
    this.specialization = specialization;
    this.temperature = config.temperature || 0.7;
    this.preferredProvider = config.provider || "auto";
  }

  abstract getSystemPrompt(): string;

  async process(input: string, context: any = {}): Promise {
    const systemPrompt = this.getSystemPrompt();
    const fullPrompt = `${systemPrompt}\n\nTask: ${input}`;

    const result = await this.neurolink.generate({
      input: { text: fullPrompt },
      provider: this.preferredProvider,
      temperature: this.temperature,
      context: { agent: this.specialization, ...context },
    });

    return this.postProcess(result);
  }

  protected postProcess(result: any): any {
    return result;
  }
}

class CodeReviewAgent extends AIAgent {
  constructor() {
    super("code_reviewer", {
      temperature: 0.3,
      provider: "anthropic",
    });
  }

  getSystemPrompt(): string {
    return `You are a senior software engineer conducting code reviews.
            Analyze code for:
            - Security vulnerabilities
            - Performance issues
            - Best practices violations
            - Maintainability concerns

            Provide specific, actionable feedback with examples.`;
  }

  protected postProcess(result: any): any {
    // Parse structured feedback
    const feedback = result.content;

    return {
      ...result,
      issues: this.extractIssues(feedback),
      suggestions: this.extractSuggestions(feedback),
      severity: this.assessSeverity(feedback),
    };
  }

  private extractIssues(feedback: string): string[] {
    // Extract issues using regex or LLM parsing
    return feedback.match(/Issue: (.+)/g) || [];
  }

  private extractSuggestions(feedback: string): string[] {
    return feedback.match(/Suggestion: (.+)/g) || [];
  }

  private assessSeverity(feedback: string): "low" | "medium" | "high" {
    if (feedback.includes("security") || feedback.includes("vulnerability")) {
      return "high";
    }
    if (feedback.includes("performance") || feedback.includes("bug")) {
      return "medium";
    }
    return "low";
  }
}

class BusinessAnalystAgent extends AIAgent {
  constructor() {
    super("business_analyst", {
      temperature: 0.5,
      provider: "openai",
    });
  }

  getSystemPrompt(): string {
    return `You are a senior business analyst. Analyze business requirements and provide:
            - Stakeholder analysis
            - Risk assessment
            - Success metrics
            - Implementation recommendations

            Be data-driven and consider business impact.`;
  }

  async analyzeRequirement(requirement: string, businessContext: any) {
    return await this.process(requirement, {
      department: businessContext.department,
      budget: businessContext.budget,
      timeline: businessContext.timeline,
    });
  }
}

// Agent Manager
class AgentManager {
  private agents: Map;

  constructor() {
    this.agents = new Map([
      ["code_review", new CodeReviewAgent()],
      ["business_analysis", new BusinessAnalystAgent()],
    ]);
  }

  async processTask(agentType: string, task: string, context: any = {}) {
    const agent = this.agents.get(agentType);
    if (!agent) {
      throw new Error(`Unknown agent type: ${agentType}`);
    }

    return await agent.process(task, context);
  }

  addAgent(name: string, agent: AIAgent) {
    this.agents.set(name, agent);
  }
}

// Usage
const manager = new AgentManager();

// Code review
const codeReview = await manager.processTask(
  "code_review",
  `
  function processPayment(amount, cardNumber) {
    // Store card number in localStorage
    localStorage.setItem('card', cardNumber);

    // Process payment
    return fetch('/api/payment', {
      method: 'POST',
      body: JSON.stringify({ amount, cardNumber })
    });
  }
`,
);

console.log("Code review results:", codeReview);

// Business analysis
const bizAnalysis = await manager.processTask(
  "business_analysis",
  "Implement real-time analytics dashboard for customer behavior tracking",
  {
    department: "product",
    budget: 50000,
    timeline: "3 months",
  },
);

console.log("Business analysis:", bizAnalysis.content);
```

##  Advanced Analytics Integration

### Custom Analytics Collection

```typescript
class AdvancedAnalytics {
  private neurolink: NeuroLink;
  private metrics: Map;
  private webhookUrl?: string;

  constructor(webhookUrl?: string) {
    this.neurolink = new NeuroLink({
      analytics: { enabled: true },
    });
    this.metrics = new Map();
    this.webhookUrl = webhookUrl;
  }

  async generateWithAnalytics(
    prompt: string,
    metadata: any = {},
    customMetrics: string[] = [],
  ) {
    const startTime = Date.now();
    const sessionId = this.generateSessionId();

    try {
      const result = await this.neurolink.generate({
        input: { text: prompt },
        context: {
          sessionId,
          metadata,
          customMetrics,
        },
      });

      const duration = Date.now() - startTime;

      // Collect detailed metrics
      const analytics = {
        sessionId,
        timestamp: new Date().toISOString(),
        prompt: prompt.substring(0, 100), // Truncated for privacy
        provider: result.provider,
        duration,
        tokenUsage: result.usage,
        success: true,
        metadata,
        customMetrics: await this.collectCustomMetrics(result, customMetrics),
      };

      await this.recordMetrics(analytics);

      return { ...result, analytics };
    } catch (error) {
      const analytics = {
        sessionId,
        timestamp: new Date().toISOString(),
        duration: Date.now() - startTime,
        success: false,
        error: error.message,
        metadata,
      };

      await this.recordMetrics(analytics);
      throw error;
    }
  }

  private async collectCustomMetrics(result: any, metrics: string[]) {
    const customData: any = {};

    for (const metric of metrics) {
      switch (metric) {
        case "sentiment":
          customData.sentiment = await this.analyzeSentiment(result.content);
          break;
        case "readability":
          customData.readability = this.calculateReadability(result.content);
          break;
        case "keyword_density":
          customData.keywords = this.extractKeywords(result.content);
          break;
      }
    }

    return customData;
  }

  private async analyzeSentiment(text: string): Promise {
    const result = await this.neurolink.generate({
      input: {
        text: `Analyze the sentiment of this text (positive/negative/neutral): ${text}`,
      },
      temperature: 0.1,
      maxTokens: 50,
    });

    return { sentiment: result.content.toLowerCase().trim() };
  }

  private calculateReadability(text: string): any {
    const sentences = text.split(/[.!?]+/).length;
    const words = text.split(/\s+/).length;
    const avgWordsPerSentence = words / sentences;

    return {
      wordCount: words,
      sentenceCount: sentences,
      avgWordsPerSentence: Math.round(avgWordsPerSentence * 100) / 100,
      readabilityScore: this.getReadabilityScore(avgWordsPerSentence),
    };
  }

  private getReadabilityScore(avgWords: number): string {
    if (avgWords  array.indexOf(word) === index)
        ?.slice(0, 10) || []
    );
  }

  private async recordMetrics(analytics: any) {
    // Store locally
    const key = analytics.sessionId || "general";
    if (!this.metrics.has(key)) {
      this.metrics.set(key, []);
    }
    this.metrics.get(key)!.push(analytics);

    // Send to webhook if configured
    if (this.webhookUrl) {
      try {
        await fetch(this.webhookUrl, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify(analytics),
        });
      } catch (error) {
        console.warn("Failed to send analytics to webhook:", error);
      }
    }
  }

  generateReport(timeRange: { start: Date; end: Date }) {
    const allMetrics = Array.from(this.metrics.values()).flat();
    const filtered = allMetrics.filter((m) => {
      const timestamp = new Date(m.timestamp);
      return timestamp >= timeRange.start && timestamp  m.success).length / filtered.length;
    const avgDuration =
      filtered.reduce((sum, m) => sum + m.duration, 0) / filtered.length;
    const providerUsage = this.groupBy(filtered, "provider");

    return {
      totalRequests: filtered.length,
      successRate: Math.round(successRate * 100),
      avgDuration: Math.round(avgDuration),
      providerBreakdown: providerUsage,
      timeRange,
    };
  }

  private groupBy(array: any[], key: string) {
    return array.reduce((groups, item) => {
      const group = item[key] || "unknown";
      groups[group] = (groups[group] || 0) + 1;
      return groups;
    }, {});
  }

  private generateSessionId(): string {
    return Date.now().toString(36) + Math.random().toString(36).substr(2);
  }
}

// Usage
const analytics = new AdvancedAnalytics(
  "https://analytics.company.com/webhook",
);

const result = await analytics.generateWithAnalytics(
  "Write a product description for our new AI tool",
  {
    department: "marketing",
    campaign: "Q4_launch",
    user_id: "user123",
  },
  ["sentiment", "readability", "keyword_density"],
);

console.log("Response:", result.content);
console.log("Analytics:", result.analytics);

// Generate report
const report = analytics.generateReport({
  start: new Date(Date.now() - 24 * 60 * 60 * 1000), // Last 24 hours
  end: new Date(),
});

console.log("Analytics report:", report);
```

This advanced examples documentation provides sophisticated patterns for enterprise usage, workflow automation, AI agent frameworks, and comprehensive analytics integration. These examples demonstrate how NeuroLink can be extended for complex, production-ready applications.

##  Related Documentation

- [Basic Usage](/docs/examples/basic-usage) - Simple examples to get started
- [Business Examples](/docs/examples/business) - Business-focused use cases
- [CLI Advanced Usage](/docs/cli/advanced) - Command-line patterns
- [SDK Reference](/docs/sdk/api-reference) - Complete API documentation

---

## Basic Usage Examples

<!-- Source: examples/basic-usage.md -->

# Basic Usage Examples

Simple examples to get started with NeuroLink in different scenarios and programming languages.

**Prerequisites**: Before running these examples, ensure you have configured at least one AI provider. See [Provider Configuration Guide](/docs/getting-started/provider-setup) for setup instructions.

##  Quick Start Examples

### Simple Text Generation

```typescript

const neurolink = new NeuroLink();

// Basic text generation
const result = await neurolink.generate({
  input: { text: "Explain TypeScript in simple terms" },
});

console.log(result.content);
```

### CLI Basic Usage

```bash
# Simple generation
npx @juspay/neurolink gen "Write a haiku about programming"

# With specific provider
npx @juspay/neurolink gen "Explain quantum computing" --provider google-ai

# Save to file
npx @juspay/neurolink gen "Create a README template" > README.md
```

##  SDK Integration Examples

### Node.js Application

```typescript

class AIAssistant {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async generateResponse(userMessage: string): Promise {
    const result = await this.neurolink.generate({
      input: { text: userMessage },
      provider: "auto", // Auto-select best provider
      temperature: 0.7,
    });

    return result.content;
  }

  async summarizeText(text: string): Promise {
    const result = await this.neurolink.generate({
      input: {
        text: `Summarize this text in 2-3 sentences: ${text}`,
      },
      maxTokens: 150,
    });

    return result.content;
  }
}

// Usage
const assistant = new AIAssistant();
const response = await assistant.generateResponse(
  "How do I deploy a Node.js app?",
);
console.log(response);
```

### Express.js API

```typescript

const app = express();
const neurolink = new NeuroLink();

app.use(express.json());

// AI generation endpoint
app.post("/api/generate", async (req, res) => {
  try {
    const { prompt, provider = "auto" } = req.body;

    const result = await neurolink.generate({
      input: { text: prompt },
      provider: provider,
    });

    res.json({
      success: true,
      content: result.content,
      provider: result.provider,
      usage: result.usage,
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});

// Text summarization endpoint
app.post("/api/summarize", async (req, res) => {
  try {
    const { text, maxLength = 150 } = req.body;

    const result = await neurolink.generate({
      input: {
        text: `Provide a concise summary of this text: ${text}`,
      },
      maxTokens: maxLength,
      temperature: 0.3, // Lower temperature for factual summarization
    });

    res.json({
      success: true,
      summary: result.content,
      originalLength: text.length,
      summaryLength: result.content.length,
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
});

app.listen(3000, () => {
  console.log("AI API server running on port 3000");
});
```

## ⚛️ React Integration

### Basic React Component

```typescript

const neurolink = new NeuroLink();

function AIChat() {
  const [message, setMessage] = useState("");
  const [response, setResponse] = useState("");
  const [loading, setLoading] = useState(false);

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!message.trim()) return;

    setLoading(true);
    try {
      const result = await neurolink.generate({
        input: { text: message },
        provider: "google-ai"
      });

      setResponse(result.content);
    } catch (error) {
      setResponse(`Error: ${error.message}`);
    } finally {
      setLoading(false);
    }
  };

  return (


         setMessage(e.target.value)}
          placeholder="Ask me anything..."
          disabled={loading}
        />

          {loading ? "Generating..." : "Send"}


      {response && (

          Response:
          {response}

      )}

  );
}

export default AIChat;
```

### React Hook for AI

```typescript

const neurolink = new NeuroLink();

export function useAI() {
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState(null);

  const generate = useCallback(async (prompt: string, options = {}) => {
    setLoading(true);
    setError(null);

    try {
      const result = await neurolink.generate({
        input: { text: prompt },
        ...options
      });

      return result;
    } catch (err) {
      const errorMessage = err instanceof Error ? err.message : "Unknown error";
      setError(errorMessage);
      throw err;
    } finally {
      setLoading(false);
    }
  }, []);

  return { generate, loading, error };
}

// Usage in component
function MyComponent() {
  const { generate, loading, error } = useAI();
  const [result, setResult] = useState("");

  const handleGenerate = async () => {
    try {
      const response = await generate("Explain React hooks");
      setResult(response.content);
    } catch (err) {
      console.error("Generation failed:", err);
    }
  };

  return (


        {loading ? "Generating..." : "Generate"}

      {error && Error: {error}}
      {result && {result}}

  );
}
```

##  Common Use Cases

### Code Generation

```typescript
async function generateCode(description: string, language: string) {
  const result = await neurolink.generate({
    input: {
      text: `Write ${language} code for: ${description}. Include comments and error handling.`,
    },
    provider: "anthropic", // Claude is great for code
    temperature: 0.3, // Lower temperature for precise code
  });

  return result.content;
}

// Usage
const pythonCode = await generateCode(
  "function to calculate compound interest",
  "Python",
);
console.log(pythonCode);
```

### Content Creation

```typescript
async function createBlogPost(topic: string, audience: string) {
  const result = await neurolink.generate({
    input: {
      text: `Write a blog post about ${topic} for ${audience}.
             Include: introduction, main points, conclusion, and call-to-action.`,
    },
    provider: "openai",
    temperature: 0.8, // Higher temperature for creative content
    maxTokens: 1500,
  });

  return result.content;
}

// Usage
const blogPost = await createBlogPost(
  "AI automation in business",
  "small business owners",
);
```

### Data Analysis

```typescript
async function analyzeData(data: any[], question: string) {
  const dataString = JSON.stringify(data, null, 2);

  const result = await neurolink.generate({
    input: {
      text: `Analyze this data and answer: ${question}

             Data:
             ${dataString}`,
    },
    provider: "google-ai",
    maxTokens: 800,
  });

  return result.content;
}

// Usage
const salesData = [
  { month: "Jan", sales: 10000, region: "North" },
  { month: "Feb", sales: 12000, region: "North" },
  // ... more data
];

const analysis = await analyzeData(
  salesData,
  "What trends do you see in the sales data?",
);
```

### Multi-Model Access with LiteLLM

```typescript
async function compareResponses(prompt: string) {
  const models = [
    "openai/gpt-4o",
    "anthropic/claude-3-5-sonnet",
    "google/gemini-2.0-flash",
  ];

  const comparisons = await Promise.all(
    models.map(async (model) => {
      const result = await neurolink.generate({
        input: { text: prompt },
        provider: "litellm",
        model: model,
        temperature: 0.7,
      });

      return {
        model: model,
        response: result.content,
        provider: result.provider,
      };
    }),
  );

  return comparisons;
}

// Usage
const prompt = "Explain the benefits of renewable energy";
const responses = await compareResponses(prompt);

responses.forEach(({ model, response }) => {
  console.log(`\n${model}:`);
  console.log(response);
});
```

### Custom Model Access with SageMaker

```typescript
async function useCustomSageMakerModel(prompt: string, endpoint?: string) {
  const result = await neurolink.generate({
    input: { text: prompt },
    provider: "sagemaker",
    model: endpoint || "my-custom-model", // Use specific endpoint or default
    temperature: 0.7,
    timeout: "45s", // Longer timeout for custom models
  });

  return {
    response: result.content,
    endpoint: result.model,
    provider: result.provider,
    usage: result.usage,
  };
}

// Usage with default endpoint
const defaultResult = await useCustomSageMakerModel(
  "Analyze this customer feedback for sentiment",
);

// Usage with specific endpoint
const specificResult = await useCustomSageMakerModel(
  "Generate domain-specific recommendations",
  "my-domain-expert-model-endpoint",
);

console.log("Default model response:", defaultResult.response);
console.log("Domain model response:", specificResult.response);
```

### SageMaker Model Comparison

```typescript
async function compareSageMakerModels(prompt: string) {
  const endpoints = [
    "general-purpose-model",
    "domain-specific-model",
    "fine-tuned-customer-model",
  ];

  const comparisons = await Promise.all(
    endpoints.map(async (endpoint) => {
      try {
        const result = await neurolink.generate({
          input: { text: prompt },
          provider: "sagemaker",
          model: endpoint,
          temperature: 0.7,
          timeout: "30s",
        });

        return {
          endpoint: endpoint,
          response: result.content,
          success: true,
          responseTime: result.responseTime,
        };
      } catch (error) {
        return {
          endpoint: endpoint,
          error: error.message,
          success: false,
        };
      }
    }),
  );

  return comparisons;
}

// Usage
const prompt = "Provide recommendations for improving customer satisfaction";
const modelComparisons = await compareSageMakerModels(prompt);

modelComparisons.forEach(({ endpoint, response, success, error }) => {
  console.log(`\n${endpoint}:`);
  if (success) {
    console.log(response);
  } else {
    console.log(`❌ Error: ${error}`);
  }
});
```

### Production SageMaker Integration

```typescript
class SageMakerModelManager {
  private neurolink: NeuroLink;
  private defaultEndpoint: string;

  constructor(defaultEndpoint: string) {
    this.neurolink = new NeuroLink();
    this.defaultEndpoint = defaultEndpoint;
  }

  async predict(
    input: string,
    options: {
      endpoint?: string;
      temperature?: number;
      maxTokens?: number;
      timeout?: string;
    } = {},
  ) {
    const {
      endpoint = this.defaultEndpoint,
      temperature = 0.7,
      maxTokens = 1000,
      timeout = "30s",
    } = options;

    try {
      const result = await this.neurolink.generate({
        input: { text: input },
        provider: "sagemaker",
        model: endpoint,
        temperature,
        maxTokens,
        timeout,
      });

      return {
        success: true,
        prediction: result.content,
        endpoint: endpoint,
        usage: result.usage,
        responseTime: result.responseTime,
      };
    } catch (error) {
      return {
        success: false,
        error: error.message,
        endpoint: endpoint,
      };
    }
  }

  async batchPredict(inputs: string[], endpoint?: string) {
    const results = [];

    for (const input of inputs) {
      const result = await this.predict(input, { endpoint });
      results.push(result);

      // Rate limiting between requests
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }

    return results;
  }

  async healthCheck(endpoint?: string): Promise {
    try {
      const result = await this.predict("test", {
        endpoint,
        timeout: "10s",
      });
      return result.success;
    } catch {
      return false;
    }
  }
}

// Usage
const modelManager = new SageMakerModelManager("production-model-endpoint");

// Single prediction
const prediction = await modelManager.predict(
  "Analyze this business scenario and provide recommendations",
);

// Batch predictions
const inputs = [
  "Predict market trends for Q4",
  "Analyze customer churn risk",
  "Recommend product improvements",
];
const batchResults = await modelManager.batchPredict(inputs);

// Health check
const isHealthy = await modelManager.healthCheck();
console.log(`Model endpoint healthy: ${isHealthy}`);
```

### Multi-Provider Strategy with SageMaker

```typescript
async function hybridModelStrategy(prompt: string, useCase: string) {
  const strategies = {
    general: {
      primary: { provider: "google-ai", model: "gemini-2.5-flash" },
      fallback: { provider: "openai", model: "gpt-4o-mini" },
    },
    "domain-specific": {
      primary: { provider: "sagemaker", model: "domain-expert-model" },
      fallback: { provider: "anthropic", model: "claude-3-haiku" },
    },
    "code-generation": {
      primary: { provider: "anthropic", model: "claude-3-5-sonnet" },
      fallback: { provider: "sagemaker", model: "code-specialized-model" },
    },
  };

  const strategy = strategies[useCase] || strategies["general"];

  try {
    // Try primary model
    const result = await neurolink.generate({
      input: { text: prompt },
      provider: strategy.primary.provider,
      model: strategy.primary.model,
      timeout: "30s",
    });

    return {
      ...result,
      modelUsed: "primary",
      strategy: strategy.primary,
    };
  } catch (primaryError) {
    console.log(`Primary model failed, trying fallback...`);

    try {
      // Fallback to secondary model
      const result = await neurolink.generate({
        input: { text: prompt },
        provider: strategy.fallback.provider,
        model: strategy.fallback.model,
        timeout: "30s",
      });

      return {
        ...result,
        modelUsed: "fallback",
        strategy: strategy.fallback,
        primaryError: primaryError.message,
      };
    } catch (fallbackError) {
      throw new Error(
        `Both models failed. Primary: ${primaryError.message}, Fallback: ${fallbackError.message}`,
      );
    }
  }
}

// Usage
const generalResult = await hybridModelStrategy(
  "Explain artificial intelligence",
  "general",
);

const domainResult = await hybridModelStrategy(
  "Provide industry-specific analysis for healthcare",
  "domain-specific",
);

const codeResult = await hybridModelStrategy(
  "Generate a Python function for data processing",
  "code-generation",
);

console.log("General query result:", generalResult.content);
console.log("Used model:", generalResult.strategy);
```

##  Configuration Examples

### Environment-based Configuration

```typescript

// Development configuration
const devNeuroLink = new NeuroLink({
  defaultProvider: "google-ai", // Free tier available
  timeout: 30000,
  retryAttempts: 1,
  analytics: { enabled: false },
});

// Production configuration
const prodNeuroLink = new NeuroLink({
  defaultProvider: "auto", // Auto-select best provider
  timeout: 15000,
  retryAttempts: 3,
  analytics: {
    enabled: true,
    endpoint: process.env.ANALYTICS_ENDPOINT,
  },
});

// Use appropriate instance
const neurolink =
  process.env.NODE_ENV === "production" ? prodNeuroLink : devNeuroLink;
```

### Provider Fallback

```typescript
async function generateWithFallback(prompt: string) {
  const providers = ["google-ai", "openai", "anthropic"];

  for (const provider of providers) {
    try {
      const result = await neurolink.generate({
        input: { text: prompt },
        provider: provider,
        timeout: 10000,
      });

      console.log(`✅ Success with ${provider}`);
      return result;
    } catch (error) {
      console.warn(`❌ ${provider} failed:`, error.message);
    }
  }

  throw new Error("All providers failed");
}
```

## ️ Utility Functions

### Text Processing Helpers

```typescript
class TextProcessor {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async translate(text: string, targetLanguage: string): Promise {
    const result = await this.neurolink.generate({
      input: {
        text: `Translate this text to ${targetLanguage}: ${text}`,
      },
      temperature: 0.2,
    });

    return result.content;
  }

  async improveWriting(text: string): Promise {
    const result = await this.neurolink.generate({
      input: {
        text: `Improve the clarity and readability of this text: ${text}`,
      },
      temperature: 0.4,
    });

    return result.content;
  }

  async extractKeyPoints(text: string): Promise {
    const result = await this.neurolink.generate({
      input: {
        text: `Extract the key points from this text as a bullet list: ${text}`,
      },
      temperature: 0.3,
    });

    // Parse bullet points from response
    return result.content
      .split("\n")
      .filter(
        (line) => line.trim().startsWith("•") || line.trim().startsWith("-"),
      )
      .map((line) => line.replace(/^[•\-]\s*/, "").trim());
  }
}

// Usage
const processor = new TextProcessor();
const improvedText = await processor.improveWriting(
  "This text needs improvement.",
);
const keyPoints = await processor.extractKeyPoints(longArticle);
```

### Batch Processing

```typescript
async function batchProcess(prompts: string[], batchSize = 3) {
  const results = [];

  for (let i = 0; i  {
      return await neurolink.generate({
        input: { text: prompt },
        provider: "auto",
      });
    });

    const batchResults = await Promise.all(batchPromises);
    results.push(...batchResults);

    // Rate limiting delay
    if (i + batchSize  setTimeout(resolve, 2000));
    }
  }

  return results;
}

// Usage
const prompts = [
  "Explain machine learning",
  "What is blockchain?",
  "How does quantum computing work?",
];

const results = await batchProcess(prompts);
results.forEach((result, i) => {
  console.log(`Response ${i + 1}:`, result.content);
});
```

##  Related Documentation

- [CLI Examples](/docs/cli/examples) - Command-line usage examples
- [Advanced Examples](/docs/advanced) - Complex integration patterns
- [Framework Integration](/docs/sdk/framework-integration) - Specific framework guides
- [Provider Setup](/docs/getting-started/provider-setup) - API key configuration

---

## Business Applications

<!-- Source: examples/business.md -->

# Business Applications

Enterprise-focused examples demonstrating NeuroLink's value in business environments, ROI optimization, and organizational workflows.

##  Executive Decision Support

### Strategic Planning Assistant

**Scenario**: C-level executives need AI-powered insights for strategic decisions.

```typescript

class StrategyAssistant {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink({
      analytics: { enabled: true },
    });
  }

  async analyzeMarketOpportunity(opportunity: any, companyContext: any) {
    const prompt = `Analyze this market opportunity for strategic decision-making:

                   Opportunity: ${JSON.stringify(opportunity, null, 2)}
                   Company context: ${JSON.stringify(companyContext, null, 2)}

                   Provide:
                   1. Market size and growth potential
                   2. Competitive landscape analysis
                   3. Required investment and resources
                   4. Risk assessment and mitigation strategies
                   5. ROI projections and timeline
                   6. Go/no-go recommendation with rationale`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.4,
      maxTokens: 1500,
      context: {
        role: "strategic_analysis",
        department: "executive",
        priority: "high",
      },
    });
  }

  async generateBoardPresentation(quarterlyData: any, initiatives: any[]) {
    const prompt = `Create a board presentation summary based on:

                   Quarterly performance: ${JSON.stringify(quarterlyData, null, 2)}
                   Key initiatives: ${JSON.stringify(initiatives, null, 2)}

                   Include:
                   - Executive summary (3 key points)
                   - Financial highlights
                   - Strategic progress
                   - Challenges and solutions
                   - Next quarter priorities

                   Format for C-level audience.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "openai",
      temperature: 0.5,
      context: {
        audience: "board_of_directors",
        format: "executive_summary",
      },
    });
  }

  async competitorAnalysis(competitors: string[], marketSegment: string) {
    const prompt = `Conduct comprehensive competitor analysis:

                   Competitors: ${competitors.join(", ")}
                   Market segment: ${marketSegment}

                   For each competitor analyze:
                   - Market position and share
                   - Key strengths and weaknesses
                   - Pricing strategy
                   - Recent moves and partnerships
                   - Threats and opportunities they present

                   Conclude with strategic recommendations.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.6,
      maxTokens: 2000,
    });
  }
}

// Usage
const strategy = new StrategyAssistant();

// Analyze new market entry
const marketAnalysis = await strategy.analyzeMarketOpportunity(
  {
    market: "AI-powered customer service",
    geography: "European Union",
    targetSegment: "SMB",
    entryStrategy: "acquisition",
  },
  {
    currentRevenue: "$50M",
    employees: 200,
    marketPresence: ["North America"],
    coreCompetencies: ["AI/ML", "SaaS platforms"],
  },
);

// Generate quarterly board presentation
const boardDeck = await strategy.generateBoardPresentation(
  {
    revenue: "$12.5M",
    growth: "23%",
    customers: 1850,
    churn: "2.1%",
  },
  [
    { name: "Product V2 Launch", status: "on-track", impact: "high" },
    { name: "EU Expansion", status: "delayed", impact: "medium" },
  ],
);

console.log("Strategic Analysis:", marketAnalysis.content);
console.log("Board Presentation:", boardDeck.content);
```

### CLI for Executive Workflows

```bash
#!/bin/bash
# Executive daily briefing automation

DATE=$(date +"%Y-%m-%d")

echo " Generating Executive Daily Briefing for $DATE"

# Market analysis
npx @juspay/neurolink gen "
Analyze today's key business news and market trends relevant to SaaS companies.
Focus on: AI/ML industry, enterprise software, regulatory changes, competitive moves.
Provide 3-5 key insights with business implications.
" --enable-analytics \
  --context '{"role":"executive","type":"market_briefing","date":"'$DATE'"}' \
  > briefing-market-$DATE.md

# Industry intelligence
npx @juspay/neurolink gen "
Generate strategic intelligence for enterprise AI software company:
1. Emerging technology trends affecting our market
2. New competitors or competitive threats
3. Partnership and acquisition opportunities
4. Regulatory developments
5. Customer behavior shifts

Format as executive summary with action items.
" --provider anthropic \
  --enable-evaluation \
  --evaluation-domain "Business Strategy Consultant" \
  > briefing-intelligence-$DATE.md

# Performance analysis
npx @juspay/neurolink gen "
Based on typical SaaS metrics, create analysis framework for:
- Revenue growth assessment
- Customer acquisition cost optimization
- Churn reduction strategies
- Market expansion opportunities

Include KPIs to track and red flags to monitor.
" --context '{"company_stage":"growth","sector":"b2b_saas"}' \
  > performance-framework-$DATE.md

echo "✅ Executive briefing complete"
echo " Files generated:"
echo "  - briefing-market-$DATE.md"
echo "  - briefing-intelligence-$DATE.md"
echo "  - performance-framework-$DATE.md"
```

##  Operations & Process Optimization

### Business Process Analysis

```typescript
class ProcessOptimizer {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async analyzeWorkflow(processData: any, painPoints: string[]) {
    const prompt = `Analyze this business process for optimization opportunities:

                   Current process: ${JSON.stringify(processData, null, 2)}
                   Known pain points: ${painPoints.join(", ")}

                   Provide:
                   1. Process efficiency analysis
                   2. Bottleneck identification
                   3. Automation opportunities
                   4. Resource optimization suggestions
                   5. Implementation roadmap
                   6. Expected ROI and timeline`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.4,
      context: {
        analysis_type: "process_optimization",
        focus: "efficiency_roi",
      },
    });
  }

  async generateSOPs(processName: string, steps: any[], compliance: string[]) {
    const prompt = `Create comprehensive Standard Operating Procedures for: ${processName}

                   Process steps: ${JSON.stringify(steps, null, 2)}
                   Compliance requirements: ${compliance.join(", ")}

                   Include:
                   - Step-by-step procedures
                   - Quality checkpoints
                   - Error handling protocols
                   - Escalation procedures
                   - Training requirements
                   - Compliance verification`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.3,
      maxTokens: 1500,
    });
  }

  async costBenefitAnalysis(
    currentCosts: any,
    proposedSolution: any,
    timeframe: string,
  ) {
    const prompt = `Conduct detailed cost-benefit analysis:

                   Current costs: ${JSON.stringify(currentCosts, null, 2)}
                   Proposed solution: ${JSON.stringify(proposedSolution, null, 2)}
                   Analysis timeframe: ${timeframe}

                   Calculate:
                   - Implementation costs
                   - Operational savings
                   - Productivity gains
                   - Risk mitigation value
                   - ROI and payback period
                   - Sensitivity analysis`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "openai",
      temperature: 0.3,
      context: {
        analysis_type: "financial",
        output_format: "business_case",
      },
    });
  }
}

// Usage
const optimizer = new ProcessOptimizer();

// Analyze customer onboarding process
const onboardingAnalysis = await optimizer.analyzeWorkflow(
  {
    name: "Customer Onboarding",
    steps: [
      { step: "Lead qualification", duration: "2 days", owner: "Sales" },
      { step: "Contract signing", duration: "5 days", owner: "Legal" },
      { step: "Technical setup", duration: "10 days", owner: "Engineering" },
      { step: "Training delivery", duration: "3 days", owner: "Success" },
    ],
    currentDuration: "20 days",
    customerSatisfaction: "6.5/10",
  },
  [
    "Long lead times",
    "Manual handoffs",
    "Limited visibility",
    "Inconsistent experience",
  ],
);

// Generate SOPs for incident response
const incidentSOPs = await optimizer.generateSOPs(
  "Security Incident Response",
  [
    {
      step: "Detection",
      tools: ["SIEM", "Monitoring"],
      timeframe: "5 minutes",
    },
    {
      step: "Assessment",
      team: ["Security", "Engineering"],
      timeframe: "15 minutes",
    },
    {
      step: "Containment",
      actions: ["Isolate", "Preserve evidence"],
      timeframe: "30 minutes",
    },
    {
      step: "Recovery",
      validation: ["Service restoration", "Security verification"],
    },
  ],
  ["SOX", "GDPR", "ISO 27001"],
);

// Cost-benefit analysis for automation
const automationROI = await optimizer.costBenefitAnalysis(
  {
    manualProcessing: "$50000/month",
    errorRate: "5%",
    processingTime: "4 hours/task",
  },
  {
    automationTool: "$10000/month",
    implementationCost: "$100000",
    expectedErrorRate: "0.5%",
    expectedProcessingTime: "15 minutes/task",
  },
  "24 months",
);
```

##  Financial Planning & Analysis

### Financial Decision Support

```bash
# Budget analysis and planning
npx @juspay/neurolink gen "
Analyze our Q4 budget performance and create Q1 planning recommendations:

Q4 Performance:
- Revenue: $2.8M (target: $3M)
- OpEx: $2.1M (budget: $2M)
- Customer Acquisition Cost: $450
- Gross margin: 78%

Create Q1 budget recommendations focusing on:
1. Revenue optimization strategies
2. Cost structure improvements
3. Investment priorities
4. Risk mitigation measures
" --provider anthropic \
  --enable-analytics \
  --context '{"department":"finance","type":"budget_planning"}' \
  > q1-budget-analysis.md

# Investment proposal evaluation
npx @juspay/neurolink gen "
Evaluate this investment proposal:
- New AI development team: $500K annual cost
- Expected output: 2x faster feature development
- Market opportunity: $10M TAM expansion
- Timeline: 18 month payback projected

Analyze from CFO perspective:
- Financial viability
- Risk assessment
- Alternative approaches
- Investment committee recommendation
" --enable-evaluation \
  --evaluation-domain "Chief Financial Officer" \
  > investment-proposal-analysis.md

# Cash flow forecasting
npx @juspay/neurolink gen "
Create 12-month cash flow forecast model framework for SaaS business:

Include considerations for:
- Subscription revenue recognition
- Seasonal variations
- Customer churn impact
- Growth investment timing
- Working capital requirements

Provide Excel-ready formulas and scenarios (conservative, base, optimistic).
" --max-tokens 1500 \
  > cashflow-model-framework.md
```

##  Sales & Revenue Optimization

### Sales Intelligence

```typescript
class SalesIntelligence {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async analyzeSalesPerformance(
    salesData: any[],
    territory: string,
    period: string,
  ) {
    const prompt = `Analyze sales performance for ${territory} in ${period}:

                   Sales data: ${JSON.stringify(salesData, null, 2)}

                   Provide analysis of:
                   1. Performance vs targets and trends
                   2. Top performing segments/products
                   3. Underperforming areas requiring attention
                   4. Seasonal or cyclical patterns
                   5. Competitive win/loss insights
                   6. Pipeline health assessment
                   7. Actionable recommendations for improvement`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.4,
      context: {
        department: "sales",
        analysis_type: "performance_review",
        territory: territory,
      },
    });
  }

  async generateSalesPlaybook(
    industry: string,
    buyerPersonas: any[],
    salesCycle: any,
  ) {
    const prompt = `Create a comprehensive sales playbook for ${industry}:

                   Buyer personas: ${JSON.stringify(buyerPersonas, null, 2)}
                   Sales cycle: ${JSON.stringify(salesCycle, null, 2)}

                   Include:
                   - Discovery question frameworks
                   - Objection handling scripts
                   - Value proposition messaging
                   - Competitive battle cards
                   - Closing techniques
                   - Follow-up sequences
                   - Success metrics and KPIs`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.6,
      maxTokens: 2000,
    });
  }

  async optimizePricing(
    marketData: any,
    competitorPricing: any[],
    valueDrivers: string[],
  ) {
    const prompt = `Develop pricing optimization strategy:

                   Market data: ${JSON.stringify(marketData, null, 2)}
                   Competitor pricing: ${JSON.stringify(competitorPricing, null, 2)}
                   Value drivers: ${valueDrivers.join(", ")}

                   Recommend:
                   1. Optimal pricing structure and tiers
                   2. Value-based pricing justification
                   3. Competitive positioning strategy
                   4. Price sensitivity analysis
                   5. A/B testing framework
                   6. Implementation timeline and change management`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "openai",
      temperature: 0.5,
      context: {
        analysis_type: "pricing_strategy",
        focus: "revenue_optimization",
      },
    });
  }
}

// Usage
const salesIntel = new SalesIntelligence();

// Analyze quarterly sales performance
const performanceAnalysis = await salesIntel.analyzeSalesPerformance(
  [
    { rep: "John", target: 100000, actual: 120000, deals: 12 },
    { rep: "Sarah", target: 100000, actual: 85000, deals: 8 },
    { rep: "Mike", target: 100000, actual: 110000, deals: 15 },
  ],
  "North America",
  "Q4 2024",
);

// Generate industry-specific sales playbook
const playbook = await salesIntel.generateSalesPlaybook(
  "Financial Services",
  [
    { role: "CFO", painPoints: ["Cost control", "Compliance"], budget: "High" },
    {
      role: "IT Director",
      painPoints: ["Security", "Integration"],
      influence: "High",
    },
  ],
  {
    averageLength: "6 months",
    keyStages: [
      "Discovery",
      "Technical Evaluation",
      "Business Case",
      "Legal Review",
    ],
  },
);

// Optimize pricing strategy
const pricingStrategy = await salesIntel.optimizePricing(
  {
    marketSize: "$5B",
    growth: "15%",
    averageDealSize: "$50K",
  },
  [
    { competitor: "CompetitorA", startingPrice: "$10K", enterprise: "$50K" },
    { competitor: "CompetitorB", startingPrice: "$15K", enterprise: "$75K" },
  ],
  [
    "ROI improvement",
    "Time savings",
    "Risk reduction",
    "Compliance automation",
  ],
);
```

##  Marketing & Customer Success

### Marketing Intelligence

```bash
# Campaign performance analysis
npx @juspay/neurolink gen "
Analyze our Q4 marketing campaign performance:

Campaign Results:
- Email marketing: 4.2% CTR, 18% open rate, $15 CPA
- Paid search: 3.8% CTR, $22 CPA, 1.2M impressions
- Content marketing: 125K blog views, 850 leads
- Social media: 15K engagement, 320 qualified leads
- Events: 3 conferences, 180 leads, $45K spend

Provide:
1. Performance assessment vs industry benchmarks
2. Channel effectiveness and ROI analysis
3. Attribution modeling insights
4. Optimization recommendations for Q1
5. Budget reallocation suggestions
" --enable-analytics \
  --context '{"department":"marketing","type":"campaign_analysis"}' \
  > marketing-performance-q4.md

# Customer segmentation strategy
npx @juspay/neurolink gen "
Develop customer segmentation strategy for B2B SaaS:

Current customer base:
- 2,500 total customers
- Industries: Tech (40%), Financial (25%), Healthcare (20%), Other (15%)
- Company sizes: SMB (5000, 10%)
- Usage patterns: Power users (25%), Regular users (50%), Light users (25%)

Create segmentation framework for:
- Targeted messaging and positioning
- Product development priorities
- Customer success strategies
- Upselling and expansion opportunities
" --provider anthropic \
  --enable-evaluation \
  --evaluation-domain "VP of Marketing" \
  > customer-segmentation-strategy.md

# Content marketing strategy
npx @juspay/neurolink gen "
Create comprehensive content marketing strategy:

Target audience: IT decision makers at mid-market companies
Key topics: AI adoption, digital transformation, security, compliance
Content goals: Brand awareness, lead generation, thought leadership

Develop:
1. Content pillar framework
2. Editorial calendar structure
3. Content distribution strategy
4. Performance measurement framework
5. Resource requirements and budget
6. 90-day implementation plan
" --temperature 0.7 \
  --max-tokens 1500 \
  > content-marketing-strategy.md
```

### Customer Success Optimization

```typescript
class CustomerSuccessIntelligence {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async analyzeChurnRisk(customerData: any[], usageMetrics: any[]) {
    const prompt = `Analyze customer churn risk and provide retention strategies:

                   Customer data: ${JSON.stringify(customerData.slice(0, 5), null, 2)}
                   Usage metrics: ${JSON.stringify(usageMetrics.slice(0, 5), null, 2)}

                   Identify:
                   1. High-risk churn indicators and patterns
                   2. Customer segments most at risk
                   3. Early warning signals to monitor
                   4. Proactive intervention strategies
                   5. Success metrics for retention programs
                   6. Resource allocation recommendations`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.4,
      context: {
        department: "customer_success",
        analysis_type: "churn_prevention",
      },
    });
  }

  async generateExpansionStrategy(accountData: any, productCatalog: any[]) {
    const prompt = `Develop account expansion strategy:

                   Account data: ${JSON.stringify(accountData, null, 2)}
                   Available products: ${JSON.stringify(productCatalog, null, 2)}

                   Recommend:
                   1. Expansion opportunities and prioritization
                   2. Cross-sell and upsell scenarios
                   3. Value proposition for each opportunity
                   4. Implementation timeline and approach
                   5. Success probability assessment
                   6. Revenue impact projections`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.5,
      context: {
        focus: "revenue_expansion",
        account_tier: accountData.tier,
      },
    });
  }

  async optimizeOnboarding(currentProcess: any, customerFeedback: string[]) {
    const prompt = `Optimize customer onboarding process:

                   Current process: ${JSON.stringify(currentProcess, null, 2)}
                   Customer feedback: ${customerFeedback.join("\n")}

                   Provide recommendations for:
                   1. Onboarding flow optimization
                   2. Milestone and checkpoint improvements
                   3. Self-service vs assisted touch points
                   4. Success criteria and measurement
                   5. Automation opportunities
                   6. Resource requirements`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "openai",
      temperature: 0.5,
      maxTokens: 1200,
    });
  }
}

// Usage
const csIntel = new CustomerSuccessIntelligence();

// Analyze churn risk across customer base
const churnAnalysis = await csIntel.analyzeChurnRisk(
  [
    { id: "cust1", tier: "enterprise", tenure: 24, health: "yellow" },
    { id: "cust2", tier: "mid-market", tenure: 6, health: "red" },
  ],
  [
    { customer: "cust1", logins: 45, features: 8, support_tickets: 2 },
    { customer: "cust2", logins: 12, features: 3, support_tickets: 8 },
  ],
);

// Generate expansion opportunities
const expansionStrategy = await csIntel.generateExpansionStrategy(
  {
    companySize: 1500,
    currentARR: 120000,
    products: ["Core Platform"],
    industry: "Financial Services",
  },
  [
    { name: "Advanced Analytics", price: 50000, fit: "high" },
    { name: "Compliance Module", price: 30000, fit: "high" },
    { name: "API Access", price: 20000, fit: "medium" },
  ],
);
```

##  Performance Management

### Executive KPI Dashboard

```bash
#!/bin/bash
# Automated executive dashboard generation

# Generate weekly executive summary
npx @juspay/neurolink gen "
Create executive dashboard summary for SaaS company:

Key Metrics (Week over Week):
- MRR: $850K (+3.2%)
- New customers: 45 (+12%)
- Churn rate: 2.1% (-0.3%)
- CAC: $420 (-8%)
- NPS: 67 (+2 points)
- Team productivity: 87% (+5%)

Generate executive summary including:
1. Key performance highlights
2. Concerning trends requiring attention
3. Strategic recommendations
4. Resource allocation suggestions
5. Risk mitigation priorities

Format for C-level consumption.
" --provider anthropic \
  --enable-analytics \
  --context '{"audience":"executives","format":"dashboard_summary"}' \
  > executive-summary-$(date +%Y%m%d).md

# Department performance analysis
npx @juspay/neurolink gen "
Analyze cross-departmental performance alignment:

Sales: 108% of target, strong pipeline health
Marketing: 95% lead target, improved conversion rates
Engineering: 92% sprint completion, technical debt concerns
Customer Success: 98% retention target, expansion opportunities
Finance: On budget, cash flow positive

Identify:
- Inter-departmental dependencies and bottlenecks
- Resource reallocation opportunities
- Performance improvement initiatives
- Cross-functional collaboration needs
" --enable-evaluation \
  --evaluation-domain "Chief Operating Officer" \
  > departmental-performance-$(date +%Y%m%d).md

echo "✅ Executive dashboards generated"
```

##  Compliance & Risk Management

### Regulatory Compliance

```typescript
class ComplianceAssistant {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async assessComplianceGap(
    currentPolicies: any[],
    regulations: string[],
    industry: string,
  ) {
    const prompt = `Conduct compliance gap analysis for ${industry} industry:

                   Current policies: ${JSON.stringify(currentPolicies, null, 2)}
                   Applicable regulations: ${regulations.join(", ")}

                   Identify:
                   1. Compliance gaps and deficiencies
                   2. Risk levels and potential penalties
                   3. Required policy updates and new procedures
                   4. Implementation timeline and priorities
                   5. Training and awareness requirements
                   6. Ongoing monitoring and audit needs`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.3,
      context: {
        domain: "compliance",
        industry: industry,
        urgency: "high",
      },
    });
  }

  async generateRiskRegister(
    businessActivities: any[],
    riskCategories: string[],
  ) {
    const prompt = `Create comprehensive risk register:

                   Business activities: ${JSON.stringify(businessActivities, null, 2)}
                   Risk categories: ${riskCategories.join(", ")}

                   For each identified risk provide:
                   1. Risk description and impact assessment
                   2. Probability and severity ratings
                   3. Current controls and mitigation measures
                   4. Residual risk assessment
                   5. Additional controls needed
                   6. Risk ownership and monitoring requirements`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.4,
      maxTokens: 1800,
    });
  }
}

// Usage
const compliance = new ComplianceAssistant();

// Assess GDPR compliance
const gdprGap = await compliance.assessComplianceGap(
  [
    { name: "Data Processing Policy", lastUpdated: "2023-01-15" },
    { name: "Privacy Notice", lastUpdated: "2023-06-01" },
    { name: "Incident Response", lastUpdated: "2022-11-30" },
  ],
  ["GDPR", "CCPA", "SOX"],
  "Financial Technology",
);

// Generate operational risk register
const riskRegister = await compliance.generateRiskRegister(
  [
    {
      activity: "Customer data processing",
      volume: "high",
      sensitivity: "high",
    },
    { activity: "Third-party integrations", count: 15, criticality: "medium" },
    {
      activity: "Cloud infrastructure",
      dependency: "high",
      redundancy: "partial",
    },
  ],
  ["Operational", "Cyber Security", "Regulatory", "Financial", "Reputational"],
);
```

These business applications demonstrate how NeuroLink can drive value across all organizational functions, from strategic decision-making to operational optimization, providing measurable ROI and competitive advantages.

##  Related Documentation

- [Use Cases](/docs/use-cases) - Industry-specific applications
- [Advanced Examples](/docs/advanced) - Complex integration patterns
- [Analytics Features](/docs/reference/analytics) - Business intelligence capabilities
- [Enterprise Setup](/docs/getting-started/provider-setup) - Enterprise configuration

---

## Tool Blocking Feature Example

<!-- Source: examples/mcp-tool-blocking-example.md -->

# Tool Blocking Feature Example

This example demonstrates how to use the `blockedTools` feature to prevent specific tools from being executed on external MCP servers.

## Example Configuration

Create or update your `.mcp-config.json` file:

```json
{
  "mcpServers": {
    "filesystem": {
      "name": "filesystem",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
      "transport": "stdio",
      "blockedTools": ["move_file", "delete_file", "remove_directory"]
    },
    "github": {
      "name": "github",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "transport": "stdio",
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "your_token_here"
      },
      "blockedTools": ["delete_repository", "transfer_repository"]
    },
    "bitbucket": {
      "name": "bitbucket",
      "command": "npx",
      "args": ["-y", "@nexus2520/bitbucket-mcp-server"],
      "transport": "stdio",
      "env": {
        "BITBUCKET_USERNAME": "your-bitbucket-username",
        "BITBUCKET_APP_PASSWORD": "your-app-password"
      },
      "blockedTools": ["delete_repository", "delete_branch"]
    }
  }
}
```

## Testing the Feature

### 1. Load the Configuration

```typescript

const neurolink = new NeuroLink();

// Load external servers from configuration
await neurolink.loadExternalMCPServers("./.mcp-config.json");
```

### 2. List Available Tools

```typescript
// Get MCP status to see loaded servers
const status = await neurolink.getMCPStatus();
console.log(`Loaded ${status.totalServers} servers`);

// List all available tools (blocked tools won't appear here)
const tools = await neurolink.listMCPTools();
console.log(
  "Available tools:",
  tools.map((t) => t.name),
);
```

### 3. Attempt to Execute a Blocked Tool

```typescript
try {
  // This will fail because 'delete_file' is blocked
  await neurolink.executeMCPTool("filesystem.delete_file", {
    path: "/some/file.txt",
  });
} catch (error) {
  console.error("Expected error:", error.message);
  // Output: "Tool 'delete_file' is blocked on server 'filesystem' by configuration"
}
```

### 4. Execute an Allowed Tool

```typescript
// This will succeed because 'read_file' is not blocked
const content = await neurolink.executeMCPTool("filesystem.read_file", {
  path: "/some/file.txt",
});
console.log("File content:", content);
```

## Use Cases

### 1. Production Safety

Block destructive operations in production:

```json
{
  "mcpServers": {
    "filesystem-prod": {
      "blockedTools": [
        "delete_file",
        "remove_directory",
        "move_file",
        "write_file"
      ]
    }
  }
}
```

### 2. Read-Only GitHub Access

Allow read operations but block writes:

```json
{
  "mcpServers": {
    "github-readonly": {
      "blockedTools": [
        "create_repository",
        "delete_repository",
        "create_issue",
        "close_issue",
        "create_pull_request",
        "merge_pull_request"
      ]
    }
  }
}
```

### 3. Compliance and Audit

Block sensitive operations that require audit trails:

```json
{
  "mcpServers": {
    "database": {
      "blockedTools": [
        "drop_table",
        "truncate_table",
        "delete_all_records",
        "update_schema"
      ]
    }
  }
}
```

## Verification

Run tests to verify the feature works correctly:

```bash
# Run the blocklist tests
pnpm test test/unit/mcp/externalServerBlocklist.test.ts

# Or run all tests
pnpm test
```

## Notes

- Blocked tools are filtered during discovery, so they won't appear in the list of available tools
- Attempts to execute blocked tools will throw an error with a clear message
- The blockedTools array can be empty or omitted if no tools need to be blocked
- Tool names are case-sensitive and must match exactly

---

## Use Cases & Applications

<!-- Source: examples/use-cases.md -->

# Use Cases & Applications

Real-world scenarios and practical applications where NeuroLink adds value across different industries and roles.

## ‍ Software Development

### Code Generation & Review

**Scenario**: Development team needs to accelerate coding and improve quality.

```typescript

class DeveloperAssistant {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async generateCode(
    requirement: string,
    language: string,
    framework?: string,
  ) {
    const prompt = `Generate ${language} code for: ${requirement}
                   ${framework ? `Using ${framework} framework` : ""}
                   Include error handling, comments, and tests.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic", // Claude excels at code generation
      temperature: 0.3,
    });
  }

  async reviewCode(code: string, focusAreas: string[] = []) {
    const areas =
      focusAreas.length > 0
        ? focusAreas.join(", ")
        : "security, performance, maintainability, best practices";

    const prompt = `Review this code focusing on: ${areas}

                   Code:
                   ${code}

                   Provide specific feedback and suggestions.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.4,
    });
  }

  async explainCode(code: string, audience: string = "developer") {
    const prompt = `Explain this code for a ${audience}:

                   ${code}

                   Make it clear and educational.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "openai",
      temperature: 0.6,
    });
  }
}

// Usage
const assistant = new DeveloperAssistant();

// Generate API endpoint
const apiCode = await assistant.generateCode(
  "REST API endpoint for user authentication with JWT tokens",
  "TypeScript",
  "Express.js",
);

// Review existing code
const review = await assistant.reviewCode(legacyCode, [
  "security",
  "performance",
]);

// Explain complex algorithm
const explanation = await assistant.explainCode(
  complexAlgorithm,
  "junior developer",
);
```

### Documentation Generation

```bash
#!/bin/bash
# Automated documentation generation

# Generate API documentation
npx @juspay/neurolink gen "
Create comprehensive API documentation for our user management service.
Include: authentication, endpoints, request/response examples, error codes.
" --provider anthropic --max-tokens 2000 > docs/api.md

# Generate README for new project
npx @juspay/neurolink gen "
Create a professional README for a Node.js TypeScript project called 'task-manager'.
Include: description, installation, usage, configuration, contributing guidelines.
" > README.md

# Generate architecture documentation
npx @juspay/neurolink gen "
Document the microservices architecture for an e-commerce platform.
Include: service boundaries, data flow, deployment strategy, monitoring.
" --enable-evaluation --evaluation-domain "Solutions Architect" > docs/architecture.md
```

##  Content Creation & Marketing

### Blog & Article Writing

**Scenario**: Marketing team needs consistent, high-quality content.

```typescript
class ContentCreator {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async createBlogPost(topic: string, audience: string, seoKeywords: string[]) {
    const prompt = `Write a comprehensive blog post about "${topic}" for ${audience}.

                   Requirements:
                   - Include SEO keywords: ${seoKeywords.join(", ")}
                   - Engaging introduction and conclusion
                   - 800-1200 words
                   - Actionable insights
                   - Call-to-action at the end`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "openai",
      temperature: 0.8,
      maxTokens: 1500,
    });
  }

  async createSocialMediaContent(topic: string, platforms: string[]) {
    const content = {};

    for (const platform of platforms) {
      const prompt = `Create engaging ${platform} content about "${topic}".
                     ${this.getPlatformGuidelines(platform)}`;

      const result = await this.neurolink.generate({
        input: { text: prompt },
        provider: "openai",
        temperature: 0.9,
      });

      content[platform] = result.content;
    }

    return content;
  }

  private getPlatformGuidelines(platform: string): string {
    const guidelines = {
      twitter: "Max 280 characters, include relevant hashtags, engaging hook",
      linkedin: "Professional tone, 1-3 paragraphs, call for engagement",
      instagram: "Visual-focused caption, emojis, relevant hashtags",
      facebook: "Conversational tone, encourage comments and shares",
    };

    return (
      guidelines[platform.toLowerCase()] || "Follow platform best practices"
    );
  }

  async improveContent(content: string, improvements: string[]) {
    const prompt = `Improve this content by: ${improvements.join(", ")}

                   Original content:
                   ${content}`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.5,
    });
  }
}

// Usage
const creator = new ContentCreator();

// Create blog post
const blogPost = await creator.createBlogPost(
  "AI automation in small businesses",
  "small business owners",
  ["AI automation", "business efficiency", "digital transformation"],
);

// Create social media campaign
const socialContent = await creator.createSocialMediaContent(
  "New product launch",
  ["twitter", "linkedin", "instagram"],
);

// Improve existing content
const improved = await creator.improveContent(existingArticle, [
  "improve readability",
  "add more examples",
  "stronger conclusion",
]);
```

### Email Marketing

```bash
# Email campaign generation
npx @juspay/neurolink gen "
Create a welcome email series (3 emails) for new SaaS customers.

Email 1: Welcome and getting started
Email 2: Key features and benefits
Email 3: Success stories and support resources

Each email should be 150-200 words, professional yet friendly tone.
" --enable-analytics --context '{"campaign":"welcome_series","audience":"b2b"}' > email-series.md
```

##  Business & Operations

### Data Analysis & Reporting

**Scenario**: Business analyst needs to interpret data and create reports.

```typescript
class BusinessAnalyzer {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async analyzeData(data: any[], question: string, context: any = {}) {
    const dataPreview = JSON.stringify(data.slice(0, 5), null, 2);
    const prompt = `Analyze this business data and answer: ${question}

                   Context: ${JSON.stringify(context)}
                   Data sample (${data.length} total records):
                   ${dataPreview}

                   Provide insights, trends, and actionable recommendations.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.4,
      maxTokens: 800,
    });
  }

  async createExecutiveSummary(metrics: any, timeframe: string) {
    const prompt = `Create an executive summary for ${timeframe} business performance.

                   Key metrics:
                   ${JSON.stringify(metrics, null, 2)}

                   Include: key achievements, challenges, trends, recommendations.
                   Target audience: C-level executives.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.5,
      maxTokens: 600,
    });
  }

  async generatePredictions(historicalData: any[], forecastPeriod: string) {
    const prompt = `Based on this historical data, provide business predictions for ${forecastPeriod}.

                   Historical data:
                   ${JSON.stringify(historicalData, null, 2)}

                   Include confidence levels and risk factors.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.6,
    });
  }
}

// Usage
const analyzer = new BusinessAnalyzer();

// Analyze sales data
const salesAnalysis = await analyzer.analyzeData(
  salesData,
  "What are the key trends in our sales performance?",
  { department: "sales", region: "north_america" },
);

// Create quarterly summary
const summary = await analyzer.createExecutiveSummary(
  {
    revenue: "$2.5M",
    growth: "15%",
    customers: 1250,
    churn: "3.2%",
  },
  "Q3 2024",
);

// Generate predictions
const forecast = await analyzer.generatePredictions(
  monthlyMetrics,
  "next quarter",
);
```

### Meeting & Communication

```bash
# Meeting notes processing
cat meeting-transcript.txt | npx @juspay/neurolink gen "
Summarize this meeting transcript into:
1. Key decisions made
2. Action items with owners
3. Next steps and deadlines
4. Important discussion points

Format as structured meeting notes.
" --provider anthropic

# Email response generation
npx @juspay/neurolink gen "
Draft a professional response to this customer complaint:
'Your software crashed during our important presentation. This is unacceptable!'

Response should: acknowledge the issue, apologize, explain next steps, offer compensation.
" --temperature 0.4
```

##  Education & Training

### Curriculum Development

**Scenario**: Educational institution creating AI-enhanced learning materials.

```typescript
class EducationalAssistant {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async createLessonPlan(
    subject: string,
    gradeLevel: string,
    duration: string,
  ) {
    const prompt = `Create a comprehensive lesson plan for ${subject} (${gradeLevel}).

                   Duration: ${duration}
                   Include: objectives, materials, activities, assessment, homework.
                   Make it engaging and age-appropriate.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.7,
    });
  }

  async generateQuizQuestions(
    topic: string,
    difficulty: string,
    count: number,
  ) {
    const prompt = `Generate ${count} ${difficulty} quiz questions about ${topic}.

                   Include multiple choice, true/false, and short answer questions.
                   Provide correct answers and explanations.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "openai",
      temperature: 0.5,
    });
  }

  async explainConcept(
    concept: string,
    audience: string,
    useAnalogies: boolean = true,
  ) {
    const analogyInstruction = useAnalogies
      ? "Use simple analogies and examples."
      : "";

    const prompt = `Explain "${concept}" for ${audience}. ${analogyInstruction}

                   Make it clear, engaging, and easy to understand.
                   Break down complex ideas into simple steps.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.6,
    });
  }

  async createStudyGuide(materials: string[], examDate: string) {
    const prompt = `Create a study guide for exam on ${examDate}.

                   Course materials:
                   ${materials.join("\n")}

                   Include: key topics, important concepts, practice questions, study schedule.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.4,
    });
  }
}

// Usage
const educator = new EducationalAssistant();

// Create lesson plan
const lessonPlan = await educator.createLessonPlan(
  "Introduction to Machine Learning",
  "College Sophomore",
  "90 minutes",
);

// Generate quiz
const quiz = await educator.generateQuizQuestions(
  "JavaScript fundamentals",
  "intermediate",
  10,
);

// Explain complex concept
const explanation = await educator.explainConcept(
  "Quantum entanglement",
  "high school students",
  true,
);
```

##  Healthcare & Research

### Medical Documentation

**Scenario**: Healthcare professionals need assistance with documentation and research.

```bash
# Medical research summary
npx @juspay/neurolink gen "
Summarize recent developments in diabetes treatment (2023-2024).
Focus on: new medications, treatment approaches, clinical trial results.
Target audience: healthcare professionals.
" --provider anthropic --enable-evaluation --evaluation-domain "Medical Professional"

# Patient education material
npx @juspay/neurolink gen "
Create patient education material about hypertension management.
Include: lifestyle changes, medication compliance, warning signs.
Use simple language for general public.
" --temperature 0.3

# Clinical case analysis
npx @juspay/neurolink gen "
Analyze this clinical case and suggest differential diagnoses:
[Patient symptoms and history]

Consider: common conditions, rare diseases, diagnostic tests needed.
" --provider google-ai --enable-analytics
```

##  E-commerce & Retail

### Product Management

**Scenario**: E-commerce company optimizing product listings and customer experience.

```typescript
class EcommerceAssistant {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async optimizeProductDescription(productInfo: any, targetKeywords: string[]) {
    const prompt = `Create an optimized product description for:

                   Product: ${productInfo.name}
                   Category: ${productInfo.category}
                   Features: ${productInfo.features.join(", ")}
                   Target keywords: ${targetKeywords.join(", ")}

                   Make it compelling, SEO-friendly, and conversion-focused.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "openai",
      temperature: 0.7,
    });
  }

  async generateCustomerEmailResponse(inquiry: string, orderInfo: any) {
    const prompt = `Generate a helpful customer service response for this inquiry:

                   Customer inquiry: ${inquiry}
                   Order information: ${JSON.stringify(orderInfo)}

                   Be professional, empathetic, and solution-focused.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.4,
    });
  }

  async analyzeCustomerFeedback(reviews: string[]) {
    const reviewText = reviews.join("\n---\n");

    const prompt = `Analyze these customer reviews and provide insights:

                   ${reviewText}

                   Identify: common themes, pain points, positive aspects, improvement suggestions.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.5,
    });
  }
}

// Usage
const ecommerce = new EcommerceAssistant();

// Optimize product listing
const description = await ecommerce.optimizeProductDescription(
  {
    name: "Wireless Bluetooth Headphones",
    category: "Electronics",
    features: ["Noise cancellation", "30-hour battery", "Quick charge"],
  },
  ["wireless headphones", "noise cancelling", "bluetooth"],
);

// Generate customer response
const response = await ecommerce.generateCustomerEmailResponse(
  "My order hasn't arrived yet and it's been 10 days",
  { orderNumber: "12345", estimatedDelivery: "2024-01-15" },
);
```

##  Creative Industries

### Design & Creative Content

```bash
# Design brief generation
npx @juspay/neurolink gen "
Create a design brief for a mobile app targeting young professionals.
App purpose: Personal finance management
Include: target audience, visual style, color palette, typography, user experience goals.
" --temperature 0.8

# Creative campaign concepts
npx @juspay/neurolink gen "
Generate 5 creative campaign concepts for a sustainable fashion brand.
Target: environmentally conscious millennials
Include: campaign theme, key message, content ideas, channel strategy.
" --provider openai --enable-analytics

# Video script writing
npx @juspay/neurolink gen "
Write a 60-second video script for a tech startup's product demo.
Product: AI-powered project management tool
Include: hook, problem, solution, benefits, call-to-action.
" --max-tokens 500
```

##  DevOps & Infrastructure

### Automation & Monitoring

```typescript
class DevOpsAssistant {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async generateDockerfile(appInfo: any) {
    const prompt = `Generate a production-ready Dockerfile for:

                   Application: ${appInfo.type}
                   Runtime: ${appInfo.runtime}
                   Dependencies: ${appInfo.dependencies.join(", ")}

                   Include: security best practices, multi-stage build, health checks.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
      temperature: 0.3,
    });
  }

  async analyzeLogError(errorLog: string, systemContext: any) {
    const prompt = `Analyze this error log and provide troubleshooting steps:

                   Error log:
                   ${errorLog}

                   System context:
                   ${JSON.stringify(systemContext)}

                   Include: root cause analysis, fix suggestions, prevention measures.`;

    return await this.neurolink.generate({
      input: { text: prompt },
      provider: "google-ai",
      temperature: 0.4,
    });
  }
}

// Usage
const devops = new DevOpsAssistant();

// Generate Dockerfile
const dockerfile = await devops.generateDockerfile({
  type: "Node.js web application",
  runtime: "Node.js 18",
  dependencies: ["express", "mongodb", "redis"],
});

// Analyze error
const troubleshooting = await devops.analyzeLogError(errorLogText, {
  environment: "production",
  service: "api-gateway",
});
```

##  Research & Analytics

### Market Research

```bash
# Competitive analysis
npx @juspay/neurolink gen "
Analyze the competitive landscape for AI-powered productivity tools.
Include: key players, market positioning, feature comparison, market gaps.
" --provider anthropic --enable-evaluation --evaluation-domain "Market Research Analyst"

# Survey analysis
cat survey-responses.csv | npx @juspay/neurolink gen "
Analyze these survey responses about remote work preferences.
Identify: key trends, demographic patterns, actionable insights.
" --enable-analytics --context '{"research_type":"employee_survey"}'

# Trend prediction
npx @juspay/neurolink gen "
Based on current technology trends, predict the future of workplace collaboration tools (2025-2030).
Consider: AI integration, VR/AR adoption, security concerns, user behavior changes.
" --temperature 0.6
```

These use cases demonstrate NeuroLink's versatility across different industries and professional roles, showing how AI can enhance productivity and decision-making in real-world scenarios.

##  Related Documentation

- [Basic Usage](/docs/examples/basic-usage) - Getting started examples
- [Advanced Examples](/docs/advanced) - Complex integration patterns
- [Business Examples](/docs/examples/business) - Business-focused applications
- [CLI Examples](/docs/cli/examples) - Command-line use cases

---

# Cookbook

## NeuroLink Cookbook

<!-- Source: cookbook/index.md -->

# NeuroLink Cookbook

Welcome to the NeuroLink Cookbook! This collection of recipes provides practical, copy-paste ready solutions for common use cases and challenges when building with NeuroLink.

## What's in the Cookbook?

Each recipe follows a consistent structure:

- **Problem**: What challenge does this solve?
- **Solution**: High-level approach
- **Code**: Complete, working TypeScript example
- **Explanation**: Step-by-step breakdown
- **Variations**: Alternative approaches
- **See Also**: Related recipes and documentation

## Recipe Categories

### Reliability & Error Handling

- [**Streaming with Retry Logic**](/docs/cookbook/streaming-with-retry) - Handle network interruptions and implement automatic retry for streaming responses
- [**Error Recovery Patterns**](/docs/cookbook/error-recovery) - Graceful degradation and error handling strategies
- [**Multi-Provider Fallback**](/docs/cookbook/multi-provider-fallback) - Automatically switch providers when one fails

### Performance & Optimization

- [**Cost Optimization**](/docs/cookbook/cost-optimization) - Minimize token usage and API costs
- [**Rate Limit Handling**](/docs/cookbook/rate-limit-handling) - Manage rate limits across providers
- [**Batch Processing**](/docs/cookbook/batch-processing) - Efficiently process multiple requests

### Context Management

- [**Context Window Management**](/docs/cookbook/context-window-management) - Handle large conversations within token limits
- [**Conversation Summarization**](/docs/cookbook/conversation-summarization) - Automatically summarize long conversations

### Advanced Features

- [**Structured Output with JSON Schema**](/docs/cookbook/structured-output) - Extract structured data with type safety
- [**Tool Chaining**](/docs/cookbook/tool-chaining) - Chain multiple MCP tool calls together

## How to Use These Recipes

1. **Find your use case**: Browse the categories above
2. **Copy the code**: All examples are production-ready
3. **Customize**: Adapt the code to your specific needs
4. **Test**: Verify the solution works in your environment

## Prerequisites

Most recipes assume you have:

- NeuroLink installed: `npm install @juspay/neurolink`
- At least one provider configured (API keys in `.env`)
- Basic TypeScript/JavaScript knowledge

## Contributing

Found a common pattern not covered here? [Contribute a recipe](/docs/community/contributing)!

## See Also

- [Getting Started Guide](/docs/getting-started/installation)
- [API Reference](/docs/sdk/api-reference)
- [Troubleshooting Guide](/docs/reference/troubleshooting)

---

## Batch Processing

<!-- Source: cookbook/batch-processing.md -->

# Batch Processing

## Problem

Processing many requests sequentially is slow and inefficient:

- High latency (wait for each request)
- Underutilized rate limits
- Poor resource usage
- Slow time-to-completion

Applications often need to process:

- Multiple documents
- Large datasets
- User-generated content
- Batch analytics

## Solution

Implement efficient batch processing with:

1. Concurrent request handling
2. Rate limit awareness
3. Progress tracking
4. Error recovery
5. Result aggregation

## Code

```typescript

type BatchConfig = {
  concurrency?: number; // Max parallel requests
  rateLimit?: number; // Max requests per second
  onProgress?: (completed: number, total: number) => void;
  onError?: (error: Error, item: any, index: number) => void;
  retryFailures?: boolean;
};

type BatchResult = {
  results: R[];
  errors: Array;
  duration: number;
  successRate: number;
};

class BatchProcessor {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  /**
   * Process items in batches with concurrency control
   */
  async processBatch(
    items: T[],
    processFn: (item: T, index: number) => Promise,
    config: BatchConfig = {},
  ): Promise> {
    const {
      concurrency = 5,
      rateLimit = 10, // requests per second
      onProgress,
      onError,
      retryFailures = true,
    } = config;

    const startTime = Date.now();
    const results: R[] = new Array(items.length);
    const errors: Array = [];

    let completed = 0;
    let inFlight = 0;
    let currentIndex = 0;

    const minDelay = 1000 / rateLimit; // ms between requests

    return new Promise((resolve) => {
      const processNext = async () => {
        if (currentIndex >= items.length && inFlight === 0) {
          // All done
          const duration = Date.now() - startTime;
          const successRate =
            (results.filter((r) => r !== undefined).length / items.length) *
            100;

          resolve({
            results,
            errors,
            duration,
            successRate,
          });
          return;
        }

        if (inFlight >= concurrency || currentIndex >= items.length) {
          return;
        }

        const index = currentIndex++;
        const item = items[index];
        inFlight++;

        try {
          const result = await processFn(item, index);
          results[index] = result;

          completed++;
          onProgress?.(completed, items.length);
        } catch (error: any) {
          errors.push({ item, error, index });
          onError?.(error, item, index);

          if (retryFailures) {
            // Add to end of queue for retry
            items.push(item);
          }
        } finally {
          inFlight--;

          // Rate limiting
          await new Promise((r) => setTimeout(r, minDelay));

          processNext();
        }

        processNext();
      };

      // Start concurrent workers
      for (let i = 0; i > {
    return this.processBatch(
      texts,
      async (text, index) => {
        const result = await this.neurolink.generate({
          input: { text: `${prompt}\n\n${text}` },
          provider: config.provider || "anthropic",
          model: "claude-3-haiku-20240307", // Fast, cheap model
        });

        return result.content;
      },
      config,
    );
  }

  /**
   * Process with structured output
   */
  async processStructured(
    items: string[],
    prompt: string,
    schema: any,
    config: BatchConfig = {},
  ): Promise> {
    return this.processBatch(
      items,
      async (item) => {
        const result = await this.neurolink.generate({
          input: { text: `${prompt}\n\n${item}` },
          provider: "openai",
          structuredOutput: { type: "json", schema },
        });

        return JSON.parse(result.content) as T;
      },
      config,
    );
  }

  /**
   * Process files in parallel
   */
  async processFiles(
    filePaths: string[],
    processFn: (content: string, path: string) => Promise,
    config: BatchConfig = {},
  ) {
    const fs = await import("fs/promises");

    return this.processBatch(
      filePaths,
      async (path, index) => {
        const content = await fs.readFile(path, "utf-8");
        return processFn(content, path);
      },
      config,
    );
  }
}

// Usage Example 1: Sentiment Analysis
async function example1_SentimentAnalysis() {
  const processor = new BatchProcessor();

  const reviews = [
    "This product is amazing! Highly recommend.",
    "Terrible quality, waste of money.",
    "It's okay, nothing special.",
    "Best purchase I've made this year!",
    "Disappointed, expected much better.",
  ];

  console.log("=== Sentiment Analysis ===");

  const result = await processor.processTexts(
    reviews,
    "Classify the sentiment of this review as positive, negative, or neutral. Return only the sentiment.",
    {
      concurrency: 3,
      rateLimit: 5,
      onProgress: (completed, total) => {
        console.log(
          `Progress: ${completed}/${total} (${((completed / total) * 100).toFixed(0)}%)`,
        );
      },
    },
  );

  console.log("\n✅ Results:");
  result.results.forEach((sentiment, i) => {
    console.log(`  ${i + 1}. ${reviews[i].slice(0, 30)}... → ${sentiment}`);
  });

  console.log(`\n Stats:`);
  console.log(`  Duration: ${result.duration}ms`);
  console.log(`  Success rate: ${result.successRate.toFixed(1)}%`);
  console.log(`  Errors: ${result.errors.length}`);
}

// Example 2: Data Extraction
type ProductInfo = {
  name: string;
  price: number;
  category: string;
};

const productSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    price: { type: "number" },
    category: { type: "string" },
  },
  required: ["name", "price", "category"],
};

async function example2_DataExtraction() {
  const processor = new BatchProcessor();

  const descriptions = [
    "The UltraBook Pro laptop costs $1299 and is perfect for professionals.",
    "Get the SmartWatch X for only $299 - the best fitness tracker available.",
    "Premium wireless headphones, $199, audiophile quality sound.",
  ];

  console.log("\n=== Data Extraction ===");

  const result = await processor.processStructured(
    descriptions,
    "Extract product information:",
    productSchema,
    {
      concurrency: 2,
      rateLimit: 3,
    },
  );

  console.log("\n✅ Extracted Products:");
  result.results.forEach((product, i) => {
    console.log(
      `  ${i + 1}. ${product.name} - $${product.price} (${product.category})`,
    );
  });
}

// Example 3: Document Summarization
async function example3_DocumentSummarization() {
  const processor = new BatchProcessor();

  const documents = [
    "Long document about artificial intelligence and machine learning...",
    "Article discussing climate change impacts on global economy...",
    "Research paper on quantum computing applications in cryptography...",
  ];

  console.log("\n=== Document Summarization ===");

  let startTime = Date.now();

  const result = await processor.processTexts(
    documents,
    "Summarize this in 1-2 sentences:",
    {
      concurrency: 3,
      rateLimit: 10,
      onProgress: (completed, total) => {
        const elapsed = ((Date.now() - startTime) / 1000).toFixed(1);
        console.log(`Progress: ${completed}/${total} (${elapsed}s)`);
      },
      onError: (error, item, index) => {
        console.error(`❌ Error processing item ${index}:`, error.message);
      },
    },
  );

  console.log("\n✅ Summaries:");
  result.results.forEach((summary, i) => {
    console.log(`  ${i + 1}. ${summary}`);
  });
}

// Main
async function main() {
  await example1_SentimentAnalysis();
  await example2_DataExtraction();
  await example3_DocumentSummarization();
}

main();
```

## Explanation

### 1. Concurrency Control

Process multiple requests simultaneously:

```typescript
concurrency: 5; // 5 requests in parallel
```

Benefits:

- 5x faster than sequential
- Efficient resource usage
- Respects provider limits

### 2. Rate Limiting

Prevent exceeding provider rate limits:

```typescript
rateLimit: 10  // 10 requests per second
minDelay = 1000 / 10 = 100ms between requests
```

### 3. Progress Tracking

Monitor batch processing in real-time:

```typescript
onProgress: (completed, total) => {
  console.log(`${completed}/${total} (${percentage}%)`);
};
```

### 4. Error Handling

Individual failures don't stop the batch:

```typescript
onError: (error, item, index) => {
  // Log, retry, or skip
};
```

### 5. Retry Logic

Automatically retry failed items:

```typescript
retryFailures: true; // Add to queue end
```

## Variations

### Chunked Batch Processing

Process very large datasets in chunks:

```typescript
async function processInChunks(
  items: T[],
  chunkSize: number,
  processFn: (items: T[]) => Promise,
): Promise {
  const results: R[] = [];

  for (let i = 0; i  setTimeout(r, 1000));
  }

  return results;
}

// Usage
const results = await processInChunks(allItems, 100, async (chunk) =>
  processor.processBatch(chunk, processFn).then((r) => r.results),
);
```

### Priority Queue

Process high-priority items first:

```typescript
type PriorityItem = {
  item: T;
  priority: number;
};

async function processPriorityBatch(
  items: PriorityItem[],
  processFn: (item: T) => Promise,
) {
  // Sort by priority (higher first)
  const sorted = items.sort((a, b) => b.priority - a.priority);

  return processor.processBatch(
    sorted.map((p) => p.item),
    processFn,
  );
}
```

### Result Streaming

Stream results as they complete:

```typescript
async function* processBatchStreaming(
  items: T[],
  processFn: (item: T) => Promise,
): AsyncIterable {
  const promises = items.map((item, index) =>
    processFn(item).then((result) => ({ index, result })),
  );

  for (const promise of promises) {
    yield await promise;
  }
}

// Usage
for await (const { index, result } of processBatchStreaming(items, processFn)) {
  console.log(`Completed item ${index}:`, result);
}
```

### Cost Tracking

Track costs per batch:

```typescript
class CostTrackingProcessor extends BatchProcessor {
  private totalCost = 0;

  async processBatch(
    items: T[],
    processFn: Function,
    config: BatchConfig,
  ) {
    const startCost = this.totalCost;

    const result = await super.processBatch(
      items,
      async (item, index) => {
        const result = await processFn(item, index);

        // Estimate cost (rough)
        const cost = 0.001; // $0.001 per request
        this.totalCost += cost;

        return result;
      },
      config,
    );

    const batchCost = this.totalCost - startCost;
    console.log(` Batch cost: $${batchCost.toFixed(4)}`);

    return result;
  }
}
```

## Performance Comparison

| Approach            | 100 Items | 1000 Items | Notes               |
| ------------------- | --------- | ---------- | ------------------- |
| **Sequential**      | 200s      | 2000s      | Baseline            |
| **Concurrency: 5**  | 40s       | 400s       | 5x faster           |
| **Concurrency: 10** | 20s       | 200s       | 10x faster          |
| **Concurrency: 20** | 15s       | 150s       | May hit rate limits |

## Best Practices

1. **Start conservative**: Begin with low concurrency (3-5)
2. **Monitor rate limits**: Track 429 errors
3. **Implement retries**: Handle transient failures
4. **Track progress**: Show completion status
5. **Use cheap models**: Batch processing doesn't need GPT-4
6. **Cache results**: Save completed work
7. **Handle partial failures**: Don't block on errors

## See Also

- [Rate Limit Handling](/docs/cookbook/rate-limit-handling)
- [Cost Optimization](/docs/cookbook/cost-optimization)
- [Error Recovery](/docs/cookbook/error-recovery)
- [Structured Output](/docs/cookbook/structured-output)

---

## Context Window Management

<!-- Source: cookbook/context-window-management.md -->

# Context Window Management

## Problem

AI models have limited context windows (token limits):

- GPT-4o: 128K tokens (~96K words)
- Claude 4 Sonnet: 200K tokens (~150K words)
- Gemini 2.5 Flash: 1M tokens (~750K words)
- GPT-4.1: 1M tokens (~750K words)

Long conversations exceed these limits, causing:

- Truncated context
- Lost conversation history
- Inconsistent responses
- API errors

## Solution

Implement intelligent context management:

1. Track token usage
2. Sliding window approach
3. Automatic summarization
4. Strategic message pruning
5. Context compression

## Code

```typescript

type Message = {
  role: "system" | "user" | "assistant";
  content: string;
  tokens?: number;
};

class ContextWindowManager {
  private neurolink: NeuroLink;
  private messages: Message[] = [];
  private maxTokens: number;
  private systemMessage?: Message;

  constructor(maxTokens: number = 8000) {
    this.neurolink = new NeuroLink();
    this.maxTokens = maxTokens;
  }

  /**
   * Estimate tokens in text (rough approximation)
   */
  private estimateTokens(text: string): number {
    // Rough estimate: 4 characters per token
    return Math.ceil(text.length / 4);
  }

  /**
   * Calculate total tokens in message array
   */
  private calculateTotalTokens(messages: Message[]): number {
    return messages.reduce(
      (sum, msg) => sum + (msg.tokens || this.estimateTokens(msg.content)),
      0,
    );
  }

  /**
   * Set system message (always preserved)
   */
  setSystemMessage(content: string) {
    this.systemMessage = {
      role: "system",
      content,
      tokens: this.estimateTokens(content),
    };
  }

  /**
   * Add message with automatic pruning
   */
  addMessage(role: "user" | "assistant", content: string) {
    const message: Message = {
      role,
      content,
      tokens: this.estimateTokens(content),
    };

    this.messages.push(message);
    this.pruneIfNeeded();
  }

  /**
   * Prune old messages when approaching limit
   */
  private pruneIfNeeded() {
    const allMessages = this.systemMessage
      ? [this.systemMessage, ...this.messages]
      : this.messages;

    const totalTokens = this.calculateTotalTokens(allMessages);

    if (totalTokens = 0; i--) {
      const msg = this.messages[i];
      const msgTokens = msg.tokens || this.estimateTokens(msg.content);

      if (currentTokens + msgTokens  `${m.role}: ${m.content}`)
      .join("\n\n");

    console.log(" Summarizing old messages...");

    const summary = await this.neurolink.generate({
      input: {
        text: `Summarize this conversation concisely, preserving key information:\n\n${conversationText}`,
      },
      provider: "anthropic",
      model: "claude-3-5-haiku-20241022", // Fast, cheap model for summaries
      maxTokens: 500,
    });

    // Replace old messages with summary
    this.messages = [
      {
        role: "assistant",
        content: `[Previous conversation summary: ${summary.content}]`,
        tokens: this.estimateTokens(summary.content),
      },
      ...toKeep,
    ];

    console.log(`✅ Summarized ${toSummarize.length} messages`);
  }

  /**
   * Generate with managed context
   */
  async chat(userMessage: string) {
    this.addMessage("user", userMessage);

    const contextMessages = this.systemMessage
      ? [this.systemMessage, ...this.messages]
      : this.messages;

    // Convert to NeuroLink format
    const prompt = contextMessages
      .map((m) => `${m.role}: ${m.content}`)
      .join("\n\n");

    const result = await this.neurolink.generate({
      input: { text: prompt },
    });

    this.addMessage("assistant", result.content);

    return result.content;
  }

  /**
   * Get current context statistics
   */
  getStats() {
    const totalTokens = this.calculateTotalTokens(this.messages);
    return {
      messages: this.messages.length,
      tokens: totalTokens,
      capacity: this.maxTokens,
      usage: ((totalTokens / this.maxTokens) * 100).toFixed(1) + "%",
    };
  }

  /**
   * Clear all messages (keep system message)
   */
  clear() {
    this.messages = [];
  }
}

// Usage Example
async function main() {
  const manager = new ContextWindowManager(4000); // 4K token limit

  manager.setSystemMessage(
    "You are a helpful AI assistant. Be concise and accurate.",
  );

  // Simulate a long conversation
  console.log("Starting conversation...\n");

  for (let i = 1; i  3000) {
      await manager.summarizeOldMessages();
    }
  }

  console.log("\n✅ Conversation complete");
  console.log("Final stats:", manager.getStats());
}

main();
```

## Explanation

### 1. Token Estimation

Estimate tokens before sending to API:

```typescript
estimateTokens(text) ≈ text.length / 4
```

This is approximate but sufficient for context management.

### 2. Sliding Window

Keep most recent messages, discard oldest:

- **System message**: Always preserved
- **Recent messages**: Keep in full
- **Old messages**: Remove or summarize

### 3. Automatic Pruning

When reaching 100% capacity:

- Remove oldest messages
- Target 80% capacity (leave buffer)
- Preserve conversation coherence

### 4. Intelligent Summarization

Instead of discarding, summarize old messages:

```
[10 messages] → [1 summary message] + [Recent messages]
```

Preserves context while reducing tokens.

### 5. Progressive Strategy

```
0-70% capacity:   No action
70-90% capacity:  Summarize old messages
90-100% capacity: Remove oldest messages
>100% capacity:   Aggressive pruning
```

## Variations

### Keep Important Messages

Tag and preserve important messages:

```typescript
type MessageWithMetadata = Message & {
  important?: boolean;
  timestamp: number;
};

private pruneIfNeeded() {
  // Always keep important messages
  const important = this.messages.filter(m => m.important);
  const regular = this.messages.filter(m => !m.important);

  // Prune regular messages only
  const pruned = this.pruneMessages(regular);

  this.messages = [...important, ...pruned];
}
```

### Semantic Compression

Use embeddings to identify redundant messages:

```typescript
async compressSemanticDuplicates() {
  // Group similar messages using embeddings
  const embeddings = await this.getEmbeddings(this.messages);

  // Find and merge similar messages
  const compressed = this.mergeSimiar(this.messages, embeddings);

  this.messages = compressed;
}
```

### Provider-Specific Limits

Different models, different limits:

```typescript
const CONTEXT_LIMITS = {
  "gpt-4o": 128000,
  "gpt-4o-mini": 128000,
  "gpt-4.1": 1047576,
  "o3": 200000,
  "claude-opus-4-20250514": 200000,
  "claude-sonnet-4-20250514": 200000,
  "claude-3-5-sonnet-20241022": 200000,
  "gemini-2.5-flash": 1048576,
  "gemini-2.5-pro": 1048576,
};

constructor(model: string) {
  this.maxTokens = CONTEXT_LIMITS[model] || 128000;
  // Leave 20% buffer for response
  this.maxTokens = Math.floor(this.maxTokens * 0.8);
}
```

### Rolling Summary

Maintain a rolling summary that updates:

```typescript
class RollingSummaryManager extends ContextWindowManager {
  private summary = "";

  async updateSummary() {
    const recentMessages = this.messages.slice(-5);
    const context = `${this.summary}\n\nRecent: ${recentMessages.map((m) => m.content).join("\n")}`;

    const newSummary = await this.neurolink.generate({
      input: { text: `Update this summary with recent messages:\n${context}` },
      maxTokens: 300,
    });

    this.summary = newSummary.content;
    this.messages = recentMessages; // Keep only recent
  }
}
```

## Token Budgets by Use Case

| Use Case          | Recommended Limit | Reasoning                       |
| ----------------- | ----------------- | ------------------------------- |
| Chatbot           | 4K-8K tokens      | Quick responses, recent context |
| Code assistant    | 16K-32K tokens    | Need file context               |
| Document analysis | 32K-100K tokens   | Large documents                 |
| Long-form writing | 8K-16K tokens     | Story continuity                |
| Customer support  | 4K tokens         | Short interactions              |

## Using Built-in Context Compaction

The manual patterns shown above (token estimation, sliding windows, summarization)
are now available as built-in components in NeuroLink. See
[Context Compaction Guide](/docs/features/context-compaction) for full details.

- **ContextCompactor** (`src/lib/context/contextCompactor.ts`) implements a 4-stage
  pipeline: tool-output pruning, file-read deduplication, LLM summarization, and
  sliding-window truncation. It replaces the need to build custom
  `ContextWindowManager` classes.
- **BudgetChecker** (`src/lib/context/budgetChecker.ts`) validates context size against
  per-model token limits before every generation call. Compaction is triggered
  automatically when usage exceeds the configured threshold.
- **`getContextStats()`** provides live token counts, remaining capacity, and a
  `shouldCompact` flag -- a production-grade replacement for the manual
  `getStats()` helper shown in this cookbook.
- **`compactSession()`** runs the full 4-stage pipeline on demand and returns
  a `CompactionResult` with the compacted messages and token savings.

Provider-specific context window sizes are maintained in
`src/lib/constants/contextWindows.ts`, removing the need for hard-coded
`CONTEXT_LIMITS` maps.

### Configuration

Enable context compaction through the `conversationMemory.contextCompaction`
config when creating a NeuroLink instance:

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    contextCompaction: {
      enabled: true,
      // Trigger compaction when context usage exceeds 80% (default: 0.80)
      threshold: 0.8,
      // Enable individual compaction stages (all default to true)
      enablePruning: true, // Replace old tool outputs with placeholders
      enableDeduplication: true, // Keep only the latest read of each file
      enableSlidingWindow: true, // Tag oldest messages for removal as last resort
      // Fine-tune limits
      maxToolOutputBytes: 50_000, // Max tool output size before pruning (default: 50KB)
      maxToolOutputLines: 2000, // Max tool output lines before pruning
      fileReadBudgetPercent: 0.6, // File reads share of remaining context (default: 60%)
    },
  },
});
```

### Checking Context Usage

Use `getContextStats()` to inspect how much of the context window a session
is consuming. The method returns token estimates, a usage ratio, and a
`shouldCompact` flag based on the configured threshold:

```typescript
// Get context usage for a session against a specific provider/model
const stats = await neurolink.getContextStats(
  "session-1",
  "vertex",
  "gemini-2.5-flash",
);

if (stats) {
  console.log(`Messages:       ${stats.messageCount}`);
  console.log(`Input tokens:   ${stats.estimatedInputTokens}`);
  console.log(`Available:      ${stats.availableInputTokens}`);
  console.log(`Context usage:  ${(stats.usageRatio * 100).toFixed(1)}%`);
  console.log(`Needs compact:  ${stats.shouldCompact}`);
}
```

### Manual Compaction

When `shouldCompact` is `true`, or at any time you want to free up context
space, call `compactSession()`:

```typescript
const result = await neurolink.compactSession("session-1");

if (result?.compacted) {
  const tokensSaved = result.originalTokenCount - result.compactedTokenCount;
  console.log(`Compaction saved ${tokensSaved} tokens`);
  console.log(`Stages applied: ${result.stagesApplied.join(", ")}`);
}
```

### Full Example: Auto-Monitoring Loop

Combining the APIs above into a conversation loop that monitors context
usage and compacts automatically:

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    contextCompaction: {
      enabled: true,
      threshold: 0.8,
    },
  },
});

const sessionId = "demo-session";

async function chat(userMessage: string) {
  // Check context budget before generating
  const stats = await neurolink.getContextStats(
    sessionId,
    "anthropic",
    "claude-sonnet-4-20250514",
  );

  if (stats?.shouldCompact) {
    console.log(
      `Context at ${(stats.usageRatio * 100).toFixed(1)}% — compacting...`,
    );
    const result = await neurolink.compactSession(sessionId);
    if (result?.compacted) {
      const saved = result.originalTokenCount - result.compactedTokenCount;
      console.log(
        `Freed ${saved} tokens via ${result.stagesApplied.join(", ")}`,
      );
    }
  }

  const response = await neurolink.generate({
    input: { text: userMessage },
    provider: "anthropic",
    model: "claude-sonnet-4-20250514",
    sessionId,
  });

  return response.content;
}

// Simulate a long conversation
for (let i = 1; i <= 50; i++) {
  const reply = await chat(`Tell me fact #${i} about distributed systems.`);
  console.log(`[${i}] ${reply.slice(0, 120)}...`);
}
```

## See Also

- [Conversation Summarization](/docs/cookbook/conversation-summarization)
- [Cost Optimization](/docs/cookbook/cost-optimization)
- [Memory Management Guide](/docs/features/conversation-history)
- [Provider Comparison](/docs/reference/provider-comparison)

---

## Conversation Summarization

<!-- Source: cookbook/conversation-summarization.md -->

# Conversation Summarization

## Problem

Long conversations consume excessive tokens and costs:

- Context window fills quickly
- API costs scale with message count
- Response quality degrades with very long context
- Important information gets buried

## Solution

Automatically summarize conversation history to:

1. Preserve key information
2. Reduce token usage
3. Maintain context continuity
4. Enable indefinite conversations

## Code

```typescript

type ConversationMessage = {
  role: "user" | "assistant" | "system";
  content: string;
  timestamp: Date;
  important?: boolean;
};

class ConversationSummarizer {
  private neurolink: NeuroLink;
  private messages: ConversationMessage[] = [];
  private summary: string = "";
  private maxMessages: number;
  private summaryModel: string;

  constructor(
    options: {
      maxMessages?: number;
      summaryModel?: string;
    } = {},
  ) {
    this.neurolink = new NeuroLink();
    this.maxMessages = options.maxMessages || 10;
    this.summaryModel = options.summaryModel || "claude-3-haiku-20240307";
  }

  /**
   * Add message to conversation
   */
  addMessage(role: "user" | "assistant", content: string, important = false) {
    this.messages.push({
      role,
      content,
      timestamp: new Date(),
      important,
    });

    // Summarize if threshold reached
    if (this.messages.length >= this.maxMessages) {
      this.summarizeAsync();
    }
  }

  /**
   * Summarize old messages (async, non-blocking)
   */
  private async summarizeAsync() {
    if (this.messages.length  m.important);
    const regularMessages = this.messages.filter((m) => !m.important);

    // Split: summarize first half, keep second half
    const toSummarize = regularMessages.slice(
      0,
      Math.floor(regularMessages.length / 2),
    );
    const toKeep = regularMessages.slice(
      Math.floor(regularMessages.length / 2),
    );

    if (toSummarize.length === 0) {
      return;
    }

    console.log(` Summarizing ${toSummarize.length} messages...`);

    try {
      const newSummary = await this.createSummary(toSummarize);

      // Update state
      this.summary = this.combineSummaries(this.summary, newSummary);
      this.messages = [...importantMessages, ...toKeep];

      console.log(`✅ Summary updated. Messages: ${this.messages.length}`);
    } catch (error: any) {
      console.error("❌ Summarization failed:", error.message);
    }
  }

  /**
   * Create summary of messages
   */
  private async createSummary(
    messages: ConversationMessage[],
  ): Promise {
    const conversationText = messages
      .map((m) => `${m.role}: ${m.content}`)
      .join("\n\n");

    const result = await this.neurolink.generate({
      input: {
        text: `Summarize this conversation concisely, preserving key facts, decisions, and context:\n\n${conversationText}`,
      },
      provider: "anthropic",
      model: this.summaryModel,
      maxTokens: 500,
    });

    return result.content;
  }

  /**
   * Combine old and new summaries
   */
  private combineSummaries(oldSummary: string, newSummary: string): string {
    if (!oldSummary) return newSummary;

    // If both exist, combine them (could also summarize the summaries)
    return `${oldSummary}\n\nRecent updates: ${newSummary}`;
  }

  /**
   * Get conversation context for AI
   */
  async getContext(): Promise {
    const parts: string[] = [];

    // Add summary if exists
    if (this.summary) {
      parts.push(`[Previous conversation summary: ${this.summary}]`);
    }

    // Add recent messages
    const recentMessages = this.messages
      .slice(-5)
      .map((m) => `${m.role}: ${m.content}`)
      .join("\n\n");

    if (recentMessages) {
      parts.push(recentMessages);
    }

    return parts.join("\n\n");
  }

  /**
   * Chat with automatic summarization
   */
  async chat(userMessage: string, markImportant = false): Promise {
    this.addMessage("user", userMessage, markImportant);

    const context = await this.getContext();

    const result = await this.neurolink.generate({
      input: { text: context },
    });

    this.addMessage("assistant", result.content);

    return result.content;
  }

  /**
   * Get statistics
   */
  getStats() {
    return {
      messages: this.messages.length,
      hasSummary: !!this.summary,
      summaryLength: this.summary.length,
      importantMessages: this.messages.filter((m) => m.important).length,
    };
  }

  /**
   * Export full conversation
   */
  export() {
    return {
      summary: this.summary,
      messages: this.messages,
      timestamp: new Date(),
    };
  }

  /**
   * Import conversation
   */
  import(data: { summary: string; messages: ConversationMessage[] }) {
    this.summary = data.summary;
    this.messages = data.messages;
  }
}

// Usage Example
async function main() {
  const summarizer = new ConversationSummarizer({
    maxMessages: 8,
    summaryModel: "claude-3-haiku-20240307",
  });

  // Simulate a long conversation
  console.log("Starting long conversation...\n");

  const topics = [
    "Tell me about quantum computing",
    "How does quantum entanglement work?",
    "What are practical applications?",
    "Compare quantum vs classical computers",
    "Explain quantum supremacy",
    "What is Shor's algorithm?",
    "How close are we to practical quantum computers?",
    "What are the main challenges?",
    "Explain quantum error correction",
    "What companies are leading in quantum computing?",
  ];

  for (let i = 0; i = maxMessages) {
  summarize(); // Default: 10 messages
}
```

### 2. Preserve Important Messages

Mark critical messages to preserve:

```typescript
summarizer.addMessage("user", "Process this payment", true);
// Never summarized, always in full context
```

### 3. Split Strategy

- **First half**: Summarize
- **Second half**: Keep in full
- **Important**: Always keep

### 4. Hierarchical Summaries

Combine summaries over time:

```
[Summary 1-10] + [Summary 11-20] → [Combined Summary]
```

### 5. Cost Optimization

Use cheap model for summarization:

- Claude Haiku: $0.00025/1K tokens
- Gemini Pro: $0.00025/1K tokens

## Variations

### Progressive Summarization

Summarize at multiple levels:

```typescript
class ProgressiveSummarizer extends ConversationSummarizer {
  private detailedSummary: string = "";
  private briefSummary: string = "";

  async summarize() {
    // Level 1: Detailed summary (300 tokens)
    this.detailedSummary = await this.createSummary(this.messages, 300);

    // Level 2: Brief summary (100 tokens)
    this.briefSummary = await this.summarizeText(this.detailedSummary, 100);
  }

  private async summarizeText(
    text: string,
    maxTokens: number,
  ): Promise {
    const result = await this.neurolink.generate({
      input: { text: `Summarize concisely: ${text}` },
      maxTokens,
    });
    return result.content;
  }
}
```

### Topic-Based Summarization

Organize summaries by topic:

```typescript
type TopicSummary = {
  topic: string;
  summary: string;
  messageCount: number;
};

class TopicalSummarizer {
  private topics = new Map();

  async addMessage(topic: string, message: ConversationMessage) {
    if (!this.topics.has(topic)) {
      this.topics.set(topic, []);
    }

    this.topics.get(topic)!.push(message);

    // Summarize if topic has many messages
    if (this.topics.get(topic)!.length >= 10) {
      await this.summarizeTopic(topic);
    }
  }
}
```

### Time-Based Summarization

Summarize by time windows:

```typescript
class TimeBasedSummarizer {
  async summarizeByTime(hours: number = 24) {
    const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000);

    const oldMessages = this.messages.filter((m) => m.timestamp  m.timestamp >= cutoff);

    if (oldMessages.length > 0) {
      const summary = await this.createSummary(oldMessages);
      this.summary = this.combineSummaries(this.summary, summary);
      this.messages = recentMessages;
    }
  }
}
```

### Extractive Summarization

Keep actual message excerpts:

```typescript
function extractKeyPoints(messages: ConversationMessage[]): string[] {
  // Simple heuristic: sentences with key indicators
  const keyIndicators = [
    "important",
    "remember",
    "decision",
    "agreed",
    "action",
  ];

  const keyPoints: string[] = [];

  messages.forEach((msg) => {
    const sentences = msg.content.split(/[.!?]+/);
    sentences.forEach((sentence) => {
      if (keyIndicators.some((kw) => sentence.toLowerCase().includes(kw))) {
        keyPoints.push(sentence.trim());
      }
    });
  });

  return keyPoints;
}
```

## Summarization Strategies

| Strategy                                | When to Use               | Token Savings | Context Preservation |
| --------------------------------------- | ------------------------- | ------------- | -------------------- |
| **Simple**: Remove old messages         | Short conversations       | 90%           | Low                  |
| **Abstractive**: AI-generated summary   | Long conversations        | 80%           | Medium               |
| **Extractive**: Key sentence selection  | Factual conversations     | 60%           | High                 |
| **Hierarchical**: Multi-level summaries | Very long conversations   | 85%           | Medium-High          |
| **Topic-based**: Group by subject       | Multi-topic conversations | 75%           | High                 |

## Best Practices

1. **Summarize early**: Don't wait until context is full
2. **Preserve decisions**: Mark important messages
3. **Use cheap models**: Summarization doesn't need GPT-4
4. **Test summaries**: Verify important info isn't lost
5. **Export regularly**: Save full conversation for debugging

## See Also

- [Context Window Management](/docs/cookbook/context-window-management)
- [Cost Optimization](/docs/cookbook/cost-optimization)
- [Memory Management Guide](/docs/features/conversation-history)
- [Redis Persistence](/docs/guides/redis-configuration)

---

## Cost Optimization

<!-- Source: cookbook/cost-optimization.md -->

# Cost Optimization

## Problem

AI API costs can accumulate quickly, especially with:

- Large context windows
- Frequent API calls
- Expensive models (GPT-4, Claude Opus)
- Inefficient prompt engineering

## Solution

Implement cost optimization strategies:

1. Use cheaper models when appropriate
2. Minimize context size
3. Cache responses
4. Implement token counting
5. Use model routing based on complexity

## Code

```typescript

type CostOptimizer = {
  maxTokens?: number;
  cacheResponses?: boolean;
  useSmartRouting?: boolean;
};

class CostEfficientNeuroLink {
  private neurolink: NeuroLink;
  private cache = new Map();
  private tokenCosts = {
    "gpt-4": { input: 0.03, output: 0.06 },
    "gpt-3.5-turbo": { input: 0.0015, output: 0.002 },
    "claude-3-opus": { input: 0.015, output: 0.075 },
    "claude-3-sonnet": { input: 0.003, output: 0.015 },
    "claude-3-haiku": { input: 0.00025, output: 0.00125 },
    "gemini-pro": { input: 0.00025, output: 0.0005 },
  };

  constructor(options: CostOptimizer = {}) {
    this.neurolink = new NeuroLink();
  }

  /**
   * Route to cheaper model for simple queries
   */
  selectModel(
    prompt: string,
    forceModel?: string,
  ): {
    provider: string;
    model: string;
  } {
    if (forceModel) {
      return { provider: "openai", model: forceModel };
    }

    // Simple heuristics for model selection
    const isComplex =
      prompt.length > 500 ||
      prompt.includes("analyze") ||
      prompt.includes("complex") ||
      prompt.includes("reasoning");

    const requiresCreativity =
      prompt.includes("creative") ||
      prompt.includes("story") ||
      prompt.includes("poem");

    if (isComplex && requiresCreativity) {
      return { provider: "openai", model: "gpt-4" };
    }

    if (isComplex) {
      return { provider: "anthropic", model: "claude-3-sonnet-20240229" };
    }

    // Simple queries → cheapest model
    return { provider: "anthropic", model: "claude-3-haiku-20240307" };
  }

  /**
   * Generate cache key from prompt
   */
  private getCacheKey(prompt: string, model: string): string {
    const normalized = prompt.trim().toLowerCase();
    return `${model}:${normalized}`;
  }

  /**
   * Estimate cost for a request
   */
  estimateCost(
    inputTokens: number,
    outputTokens: number,
    model: string,
  ): number {
    const costs = this.tokenCosts[model as keyof typeof this.tokenCosts];
    if (!costs) return 0;

    return (
      (inputTokens / 1000) * costs.input + (outputTokens / 1000) * costs.output
    );
  }

  /**
   * Generate with cost optimization
   */
  async generateCostEffective(
    prompt: string,
    options: {
      useCache?: boolean;
      maxTokens?: number;
      forceModel?: string;
    } = {},
  ) {
    const { provider, model } = this.selectModel(prompt, options.forceModel);
    const cacheKey = this.getCacheKey(prompt, model);

    // Check cache first
    if (options.useCache !== false && this.cache.has(cacheKey)) {
      console.log(" Using cached response (cost: $0.00)");
      return this.cache.get(cacheKey);
    }

    // Truncate very long prompts
    const maxPromptLength = 2000;
    const truncatedPrompt =
      prompt.length > maxPromptLength
        ? prompt.slice(0, maxPromptLength) + "..."
        : prompt;

    const result = await this.neurolink.generate({
      input: { text: truncatedPrompt },
      provider,
      model,
      maxTokens: options.maxTokens || 500, // Limit output tokens
    });

    // Estimate and log cost
    const inputTokens = this.estimateTokens(truncatedPrompt);
    const outputTokens = this.estimateTokens(result.content);
    const cost = this.estimateCost(inputTokens, outputTokens, model);

    console.log(` Cost estimate: $${cost.toFixed(4)} (${model})`);

    // Cache the response
    if (options.useCache !== false) {
      this.cache.set(cacheKey, result);
    }

    return result;
  }

  /**
   * Estimate token count (rough approximation)
   */
  private estimateTokens(text: string): number {
    // Rough estimate: ~4 characters per token
    return Math.ceil(text.length / 4);
  }

  /**
   * Batch similar requests to minimize overhead
   */
  async batchGenerate(prompts: string[]) {
    const results = [];
    let totalCost = 0;

    for (const prompt of prompts) {
      const result = await this.generateCostEffective(prompt, {
        useCache: true,
      });
      results.push(result);

      // Track cumulative cost
      const cost = (this.estimateTokens(result.content) * 0.002) / 1000;
      totalCost += cost;
    }

    console.log(`\n Total batch cost: $${totalCost.toFixed(4)}`);
    return results;
  }

  /**
   * Clear cache to free memory
   */
  clearCache() {
    this.cache.clear();
  }

  /**
   * Get cache statistics
   */
  getCacheStats() {
    return {
      entries: this.cache.size,
      estimatedSavings: this.cache.size * 0.01, // Rough estimate
    };
  }
}

// Usage Example
async function main() {
  const optimizer = new CostEfficientNeuroLink();

  // Simple query → uses cheap model (Haiku)
  const simple = await optimizer.generateCostEffective("What is 2+2?", {
    useCache: true,
  });
  console.log("Simple:", simple.content);

  // Complex query → uses better model (Sonnet)
  const complex = await optimizer.generateCostEffective(
    "Analyze the economic implications of quantum computing on financial markets",
    { maxTokens: 300 },
  );
  console.log("Complex:", complex.content);

  // Batch processing with caching
  const prompts = [
    "What is TypeScript?",
    "What is TypeScript?", // Cached!
    "Explain async/await",
  ];
  await optimizer.batchGenerate(prompts);

  // Check savings
  const stats = optimizer.getCacheStats();
  console.log(
    `\n Cache stats: ${stats.entries} entries, ~$${stats.estimatedSavings.toFixed(2)} saved`,
  );
}

main();
```

## Explanation

### 1. Smart Model Routing

The `selectModel()` method analyzes the prompt to choose the most cost-effective model:

- **Simple queries** → Claude Haiku ($0.00025/1K input tokens)
- **Complex queries** → Claude Sonnet ($0.003/1K input tokens)
- **Complex + Creative** → GPT-4 ($0.03/1K input tokens)

### 2. Response Caching

Identical prompts return cached responses at zero cost. Perfect for:

- Repeated queries
- Development/testing
- Common questions in production

### 3. Token Limiting

Set `maxTokens` to prevent unexpectedly long (expensive) responses:

- Summaries: 200-300 tokens
- Explanations: 500-1000 tokens
- Creative content: 1000-2000 tokens

### 4. Cost Tracking

Estimate costs per request to monitor spending:

```
Input: 250 tokens × $0.003/1K = $0.00075
Output: 500 tokens × $0.015/1K = $0.00750
Total: $0.00825
```

### 5. Prompt Truncation

Very long prompts increase costs without adding value. Truncate to essential context.

## Variations

### Context Window Compression

Compress conversation history to reduce tokens:

```typescript
function compressContext(messages: Array) {
  // Keep system message and last N messages
  const system = messages.find((m) => m.role === "system");
  const recent = messages.slice(-5); // Last 5 messages

  // Summarize older messages
  const older = messages.slice(1, -5);
  const summary =
    older.length > 0
      ? `[Previous conversation: ${older.length} messages covering ${older.map((m) => m.content.slice(0, 20)).join(", ")}...]`
      : "";

  return [system, { role: "assistant", content: summary }, ...recent].filter(
    Boolean,
  );
}
```

### Model Tier System

Explicitly define cost tiers:

```typescript
enum ModelTier {
  ULTRA_CHEAP = "claude-3-haiku-20240307", // $0.00025/1K
  CHEAP = "gpt-3.5-turbo", // $0.0015/1K
  BALANCED = "claude-3-sonnet-20240229", // $0.003/1K
  POWERFUL = "gpt-4", // $0.03/1K
}

async function generateWithTier(prompt: string, tier: ModelTier) {
  return neurolink.generate({
    input: { text: prompt },
    model: tier,
  });
}
```

### Budget Enforcement

Set spending limits:

```typescript
class BudgetEnforcer {
  private spentToday = 0;
  private dailyLimit = 10.0; // $10/day

  async generate(neurolink: NeuroLink, prompt: string) {
    const estimatedCost = 0.01; // Rough estimate

    if (this.spentToday + estimatedCost > this.dailyLimit) {
      throw new Error(
        `Budget exceeded: $${this.spentToday.toFixed(2)}/$${this.dailyLimit}`,
      );
    }

    const result = await neurolink.generate({ input: { text: prompt } });
    this.spentToday += estimatedCost;

    return result;
  }
}
```

## Cost Comparison

| Task Type        | Best Model    | Cost (per 1K tokens) | Use Case                     |
| ---------------- | ------------- | -------------------- | ---------------------------- |
| Simple Q&A       | Claude Haiku  | $0.00025             | FAQs, basic queries          |
| Data extraction  | GPT-3.5 Turbo | $0.0015              | JSON parsing, classification |
| Analysis         | Claude Sonnet | $0.003               | Summaries, explanations      |
| Deep reasoning   | GPT-4         | $0.03                | Complex problem-solving      |
| Creative writing | GPT-4         | $0.03                | Stories, marketing copy      |

## See Also

- [Batch Processing](/docs/cookbook/batch-processing)
- [Context Window Management](/docs/cookbook/context-window-management)
- [Provider Selection Guide](/docs/reference/provider-selection)
- [Rate Limit Handling](/docs/cookbook/rate-limit-handling)

---

## Error Recovery Patterns

<!-- Source: cookbook/error-recovery.md -->

# Error Recovery Patterns

## Problem

Production AI applications face various errors:

- Network failures
- Provider outages
- Invalid API keys
- Model unavailability
- Timeout errors
- Rate limiting
- Malformed responses

Without proper error handling, applications crash or produce poor user experiences.

## Solution

Implement comprehensive error recovery with:

1. Error classification (retryable vs fatal)
2. Graceful degradation
3. User-friendly error messages
4. Automatic fallback strategies
5. Error monitoring and alerting

## Code

```typescript

enum ErrorType {
  RETRYABLE,
  FALLBACK,
  FATAL,
}

type ErrorRecoveryConfig = {
  maxRetries?: number;
  fallbackProvider?: string;
  fallbackResponse?: string;
  onError?: (error: Error, context: any) => void;
};

class RobustNeuroLink {
  private neurolink: NeuroLink;
  private config: ErrorRecoveryConfig;

  constructor(config: ErrorRecoveryConfig = {}) {
    this.neurolink = new NeuroLink();
    this.config = {
      maxRetries: config.maxRetries || 3,
      fallbackProvider: config.fallbackProvider,
      fallbackResponse:
        config.fallbackResponse ||
        "I'm having trouble processing your request. Please try again.",
      onError: config.onError,
    };
  }

  /**
   * Classify error to determine recovery strategy
   */
  private classifyError(error: any): ErrorType {
    // Network errors - retryable
    if (
      error.code === "ECONNRESET" ||
      error.code === "ETIMEDOUT" ||
      error.code === "ENOTFOUND" ||
      error.message?.includes("network") ||
      error.message?.includes("timeout")
    ) {
      return ErrorType.RETRYABLE;
    }

    // Provider errors - may fallback
    if (
      error.status === 429 || // Rate limit
      error.status === 503 || // Service unavailable
      error.status === 502 || // Bad gateway
      error.status === 504 || // Gateway timeout
      error.message?.includes("overloaded") ||
      error.message?.includes("capacity")
    ) {
      return ErrorType.FALLBACK;
    }

    // Authentication errors - fatal
    if (
      error.status === 401 ||
      error.status === 403 ||
      error.message?.includes("API key") ||
      error.message?.includes("authentication")
    ) {
      return ErrorType.FATAL;
    }

    // Invalid request - fatal
    if (
      error.status === 400 ||
      error.message?.includes("invalid") ||
      error.message?.includes("malformed")
    ) {
      return ErrorType.FATAL;
    }

    // Default: retryable
    return ErrorType.RETRYABLE;
  }

  /**
   * Get user-friendly error message
   */
  private getUserMessage(error: any): string {
    const messages: Record = {
      401: "Authentication failed. Please check your API key.",
      403: "Access denied. You may not have permission for this operation.",
      429: "Rate limit exceeded. Please wait a moment and try again.",
      500: "The AI service encountered an error. Please try again.",
      503: "The AI service is temporarily unavailable. Please try again later.",
    };

    return (
      messages[error.status] || error.message || "An unexpected error occurred."
    );
  }

  /**
   * Generate with automatic error recovery
   */
  async generateSafe(
    prompt: string,
    options: {
      provider?: string;
      model?: string;
      fallbackProvider?: string;
    } = {},
  ): Promise {
    const provider = options.provider || "openai";
    let attempt = 0;

    while (attempt  0,
        };
      } catch (error: any) {
        attempt++;
        const errorType = this.classifyError(error);

        // Log error
        console.error(
          `❌ Error (attempt ${attempt}/${this.config.maxRetries}):`,
          error.message,
        );
        this.config.onError?.(error, { prompt, provider, attempt });

        // Fatal errors - don't retry
        if (errorType === ErrorType.FATAL) {
          return {
            content: this.config.fallbackResponse!,
            error: new Error(this.getUserMessage(error)),
            recovered: false,
          };
        }

        // Fallback to alternative provider
        if (errorType === ErrorType.FALLBACK && options.fallbackProvider) {
          try {
            console.log(
              ` Trying fallback provider: ${options.fallbackProvider}`,
            );

            const fallbackResult = await this.neurolink.generate({
              input: { text: prompt },
              provider: options.fallbackProvider,
            });

            return {
              content: fallbackResult.content,
              recovered: true,
            };
          } catch (fallbackError: any) {
            console.error("❌ Fallback also failed:", fallbackError.message);
          }
        }

        // Retryable errors - wait and retry
        if (attempt  setTimeout(r, delay));
        }
      }
    }

    // All retries exhausted
    return {
      content: this.config.fallbackResponse!,
      error: new Error("All retry attempts failed"),
      recovered: false,
    };
  }

  /**
   * Stream with error recovery
   */
  async streamSafe(
    prompt: string,
    options: { provider?: string } = {},
  ): Promise> {
    const provider = options.provider || "openai";

    try {
      const stream = await this.neurolink.stream({
        input: { text: prompt },
        provider,
      });

      // Wrap stream to handle errors
      return this.wrapStreamWithRecovery(stream, prompt, provider);
    } catch (error: any) {
      console.error("❌ Stream failed:", error.message);

      // Return fallback as async iterable
      return (async function* () {
        yield "I'm having trouble streaming the response. ";
        yield "Please try again or rephrase your request.";
      })();
    }
  }

  /**
   * Wrap stream with error recovery
   */
  private async *wrapStreamWithRecovery(
    stream: AsyncIterable,
    prompt: string,
    provider: string,
  ): AsyncIterable {
    try {
      for await (const chunk of stream) {
        if (chunk.type === "content-delta") {
          yield chunk.delta;
        }
      }
    } catch (error: any) {
      console.error("❌ Stream interrupted:", error.message);

      // Try to recover with non-streaming
      try {
        const fallback = await this.generateSafe(prompt, { provider });
        yield "\n\n[Recovered via non-streaming]\n";
        yield fallback.content;
      } catch {
        yield "\n\n[Stream failed and recovery failed]";
      }
    }
  }
}

// Usage Example
async function main() {
  const robust = new RobustNeuroLink({
    maxRetries: 3,
    fallbackProvider: "anthropic",
    onError: (error, context) => {
      // Log to monitoring service
      console.error("Error logged:", {
        error: error.message,
        context,
        timestamp: new Date().toISOString(),
      });
    },
  });

  // Generate with automatic recovery
  const result = await robust.generateSafe("Explain quantum computing", {
    provider: "openai",
    fallbackProvider: "anthropic",
  });

  if (result.error) {
    console.log("⚠️  Recovered from error:", result.error.message);
  }

  console.log("Response:", result.content);

  // Stream with error recovery
  console.log("\nStreaming...");
  const stream = await robust.streamSafe("Tell me a story");

  for await (const chunk of stream) {
    process.stdout.write(chunk);
  }
}

main();
```

## Explanation

### 1. Error Classification

Errors fall into three categories:

**Retryable**: Temporary issues that may resolve

- Network timeouts
- Connection resets
- Temporary service issues

**Fallback**: Use alternative provider

- Rate limits
- Service overload
- Provider outages

**Fatal**: Don't retry

- Invalid API keys
- Malformed requests
- Unauthorized access

### 2. Retry Strategy

- **Exponential backoff**: 1s, 2s, 4s, 8s (max 10s)
- **Max retries**: 3 attempts by default
- **Smart delays**: Longer delays for repeated failures

### 3. Graceful Degradation

When all else fails:

- Return fallback response
- Log error for monitoring
- Preserve application stability

### 4. User-Friendly Messages

Map technical errors to user-friendly messages:

```
401 → "Authentication failed. Please check your API key."
503 → "Service temporarily unavailable. Please try again later."
```

### 5. Error Monitoring

Call `onError` callback for:

- Logging to monitoring service
- Alerting on critical errors
- Analytics and debugging

## Variations

### Circuit Breaker

Prevent cascading failures:

```typescript
class CircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private state: "CLOSED" | "OPEN" | "HALF_OPEN" = "CLOSED";

  async call(fn: () => Promise): Promise {
    if (this.state === "OPEN") {
      if (Date.now() - this.lastFailure > 60000) {
        this.state = "HALF_OPEN";
      } else {
        throw new Error("Circuit breaker is OPEN");
      }
    }

    try {
      const result = await fn();
      this.reset();
      return result;
    } catch (error) {
      this.recordFailure();
      throw error;
    }
  }

  private recordFailure() {
    this.failures++;
    this.lastFailure = Date.now();

    if (this.failures >= 5) {
      this.state = "OPEN";
      console.log(" Circuit breaker OPEN");
    }
  }

  private reset() {
    this.failures = 0;
    this.state = "CLOSED";
  }
}
```

### Health Checks

Monitor provider health:

```typescript
class ProviderHealthMonitor {
  private health = new Map();

  async checkHealth(provider: string): Promise {
    try {
      await neurolink.generate({
        input: { text: "test" },
        provider,
        maxTokens: 10,
      });

      this.health.set(provider, true);
      return true;
    } catch {
      this.health.set(provider, false);
      return false;
    }
  }

  isHealthy(provider: string): boolean {
    return this.health.get(provider) ?? true;
  }
}
```

### Automatic Provider Selection

Choose healthy provider automatically:

```typescript
async function selectHealthyProvider(providers: string[]): Promise {
  for (const provider of providers) {
    const healthy = await healthMonitor.checkHealth(provider);
    if (healthy) return provider;
  }

  throw new Error("No healthy providers available");
}
```

## Best Practices

1. **Log all errors**: Track patterns for debugging
2. **Monitor error rates**: Alert on unusual spikes
3. **Test error paths**: Simulate failures in testing
4. **Provide context**: Include request details in errors
5. **User communication**: Clear, actionable error messages

## See Also

- [Streaming with Retry](/docs/cookbook/streaming-with-retry)
- [Multi-Provider Fallback](/docs/cookbook/multi-provider-fallback)
- [Rate Limit Handling](/docs/cookbook/rate-limit-handling)
- [Troubleshooting Guide](/docs/reference/troubleshooting)

---

## Multi-Provider Fallback

<!-- Source: cookbook/multi-provider-fallback.md -->

# Multi-Provider Fallback

## Problem

Relying on a single AI provider creates a single point of failure:

- Provider outages affect your entire application
- Rate limits halt all operations
- Regional availability issues block access
- Model deprecation requires code changes

## Solution

Implement automatic fallback across multiple providers:

1. Primary → Secondary → Tertiary provider chain
2. Health monitoring for each provider
3. Automatic failover on errors
4. Load balancing across providers
5. Cost-aware routing

## Code

```typescript

type ProviderConfig = {
  name: string;
  model?: string;
  priority: number; // Lower = higher priority
  costPerToken?: number; // For cost-aware routing
  maxRetries?: number;
};

class MultiProviderNeuroLink {
  private neurolink: NeuroLink;
  private providers: ProviderConfig[];
  private healthStatus = new Map();

  constructor(providers: ProviderConfig[]) {
    this.neurolink = new NeuroLink();
    this.providers = providers.sort((a, b) => a.priority - b.priority);

    // Initialize all providers as healthy
    providers.forEach((p) => this.healthStatus.set(p.name, true));
  }

  /**
   * Mark provider as unhealthy
   */
  private markUnhealthy(provider: string, duration: number = 60000) {
    console.log(`⚠️  Marking ${provider} as unhealthy for ${duration}ms`);
    this.healthStatus.set(provider, false);

    // Auto-recover after duration
    setTimeout(() => {
      console.log(`✅ ${provider} marked as healthy again`);
      this.healthStatus.set(provider, true);
    }, duration);
  }

  /**
   * Get healthy providers in priority order
   */
  private getHealthyProviders(): ProviderConfig[] {
    return this.providers.filter(
      (p) => this.healthStatus.get(p.name) !== false,
    );
  }

  /**
   * Generate with automatic fallback
   */
  async generate(
    prompt: string,
    options: { preferCheap?: boolean; timeout?: number } = {},
  ): Promise {
    let providers = this.getHealthyProviders();

    if (providers.length === 0) {
      throw new Error("No healthy providers available");
    }

    // Sort by cost if preferred
    if (options.preferCheap) {
      providers = providers.sort(
        (a, b) => (a.costPerToken || 0) - (b.costPerToken || 0),
      );
    }

    let attempts = 0;
    const errors: Error[] = [];

    for (const config of providers) {
      attempts++;
      console.log(`\n Attempt ${attempts}: Trying ${config.name}...`);

      try {
        const result = await this.tryProvider(prompt, config, options.timeout);

        console.log(`✅ Success with ${config.name}`);

        return {
          content: result.content,
          provider: config.name,
          attempts,
        };
      } catch (error: any) {
        console.error(`❌ ${config.name} failed:`, error.message);
        errors.push(error);

        // Mark unhealthy if specific error types
        if (this.shouldMarkUnhealthy(error)) {
          this.markUnhealthy(config.name);
        }

        // Continue to next provider
        continue;
      }
    }

    // All providers failed
    throw new Error(
      `All ${attempts} providers failed:\n${errors
        .map((e, i) => `${i + 1}. ${e.message}`)
        .join("\n")}`,
    );
  }

  /**
   * Try a specific provider
   */
  private async tryProvider(
    prompt: string,
    config: ProviderConfig,
    timeout: number = 30000,
  ) {
    const timeoutPromise = new Promise((_, reject) =>
      setTimeout(() => reject(new Error("Request timeout")), timeout),
    );

    const generatePromise = this.neurolink.generate({
      input: { text: prompt },
      provider: config.name,
      model: config.model,
    });

    return Promise.race([generatePromise, timeoutPromise]);
  }

  /**
   * Determine if error should mark provider unhealthy
   */
  private shouldMarkUnhealthy(error: any): boolean {
    return (
      error.status === 503 || // Service unavailable
      error.status === 502 || // Bad gateway
      error.code === "ECONNREFUSED" ||
      error.message?.includes("overloaded") ||
      error.message?.includes("capacity")
    );
  }

  /**
   * Stream with fallback
   */
  async stream(prompt: string): Promise;
    provider: string;
  }> {
    const providers = this.getHealthyProviders();

    for (const config of providers) {
      try {
        console.log(` Trying to stream with ${config.name}...`);

        const stream = await this.neurolink.stream({
          input: { text: prompt },
          provider: config.name,
          model: config.model,
        });

        return {
          stream,
          provider: config.name,
        };
      } catch (error: any) {
        console.error(`❌ ${config.name} streaming failed:`, error.message);

        if (this.shouldMarkUnhealthy(error)) {
          this.markUnhealthy(config.name);
        }

        continue;
      }
    }

    throw new Error("All providers failed to stream");
  }

  /**
   * Get provider health status
   */
  getHealthStatus() {
    return Array.from(this.healthStatus.entries()).map(([name, healthy]) => ({
      provider: name,
      healthy,
    }));
  }

  /**
   * Manually set provider health
   */
  setProviderHealth(provider: string, healthy: boolean) {
    this.healthStatus.set(provider, healthy);
  }
}

// Usage Example
async function main() {
  const multiProvider = new MultiProviderNeuroLink([
    {
      name: "openai",
      model: "gpt-4",
      priority: 1,
      costPerToken: 0.03,
    },
    {
      name: "anthropic",
      model: "claude-3-sonnet-20240229",
      priority: 2,
      costPerToken: 0.003,
    },
    {
      name: "google-ai",
      model: "gemini-pro",
      priority: 3,
      costPerToken: 0.00025,
    },
  ]);

  // Generate with automatic fallback
  try {
    const result = await multiProvider.generate(
      "Explain quantum entanglement",
      { timeout: 10000 },
    );

    console.log(
      `\n✅ Response from ${result.provider} (after ${result.attempts} attempts):`,
    );
    console.log(result.content);
  } catch (error: any) {
    console.error("❌ All providers failed:", error.message);
  }

  // Check health status
  console.log("\n Provider Health:");
  const health = multiProvider.getHealthStatus();
  health.forEach((h) => {
    console.log(
      `  ${h.provider}: ${h.healthy ? "✅ Healthy" : "❌ Unhealthy"}`,
    );
  });

  // Stream with fallback
  try {
    const { stream, provider } = await multiProvider.stream(
      "Tell me a short story about AI",
    );

    console.log(`\n Streaming from ${provider}:`);
    for await (const chunk of stream) {
      if (chunk.type === "content-delta") {
        process.stdout.write(chunk.delta);
      }
    }
  } catch (error: any) {
    console.error("\n❌ Streaming failed:", error.message);
  }
}

main();
```

## Explanation

### 1. Provider Priority

Providers are ordered by priority (1 = highest):

```typescript
providers = [
  { name: "openai", priority: 1 }, // Try first
  { name: "anthropic", priority: 2 }, // Fallback
  { name: "google-ai", priority: 3 }, // Last resort
];
```

### 2. Health Monitoring

Track provider health automatically:

- **Healthy**: Available for requests
- **Unhealthy**: Temporarily skipped (auto-recovers after 60s)
- **Failure triggers**: 503, 502, connection errors

### 3. Automatic Failover

On error, automatically try next provider:

```
OpenAI fails → Try Anthropic → Try Google AI → Throw error
```

### 4. Error Classification

Not all errors trigger failover:

- **503, 502**: Provider issue → Mark unhealthy, try next
- **401, 403**: Auth issue → Try next (may have different credentials)
- **400**: Bad request → Don't retry (same error on all providers)

### 5. Timeout Protection

Set timeouts to prevent hanging on slow providers:

```typescript
timeout: 10000; // 10 seconds
```

## Variations

### Cost-Aware Routing

Prefer cheaper providers when quality is similar:

```typescript
async generateCheap(prompt: string) {
  return this.generate(prompt, { preferCheap: true });
}
```

### Region-Aware Routing

Choose provider based on region:

```typescript
type RegionalConfig = ProviderConfig & {
  regions: string[];
};

function getProvidersForRegion(region: string): ProviderConfig[] {
  return providers.filter(
    (p) => p.regions.includes(region) || p.regions.includes("global"),
  );
}
```

### Load Balancing

Distribute load across providers:

```typescript
class LoadBalancedNeuroLink extends MultiProviderNeuroLink {
  private currentIndex = 0;

  async generateBalanced(prompt: string) {
    const providers = this.getHealthyProviders();

    // Round-robin selection
    const provider = providers[this.currentIndex % providers.length];
    this.currentIndex++;

    try {
      return await this.tryProvider(prompt, provider);
    } catch (error) {
      // Fallback to standard failover
      return this.generate(prompt);
    }
  }
}
```

### Model-Specific Fallback

Different models for different tasks:

```typescript
const TASK_PROVIDERS = {
  coding: [
    { name: "openai", model: "gpt-4" },
    { name: "anthropic", model: "claude-3-opus-20240229" },
  ],
  summarization: [
    { name: "anthropic", model: "claude-3-haiku-20240307" },
    { name: "google-ai", model: "gemini-pro" },
  ],
  creative: [
    { name: "openai", model: "gpt-4" },
    { name: "anthropic", model: "claude-3-sonnet-20240229" },
  ],
};

async function generateForTask(task: string, prompt: string) {
  const providers = TASK_PROVIDERS[task as keyof typeof TASK_PROVIDERS];
  const multiProvider = new MultiProviderNeuroLink(
    providers.map((p, i) => ({
      ...p,
      priority: i + 1,
    })),
  );

  return multiProvider.generate(prompt);
}
```

### Health Check Endpoint

Proactive health checking:

```typescript
async function checkAllProviders() {
  const results = await Promise.allSettled(
    providers.map(async (p) => {
      const start = Date.now();
      await tryProvider("test", p, 5000);
      return { provider: p.name, latency: Date.now() - start };
    }),
  );

  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      console.log(`✅ ${providers[i].name}: ${result.value.latency}ms`);
    } else {
      console.log(`❌ ${providers[i].name}: Failed`);
      markUnhealthy(providers[i].name);
    }
  });
}

// Run health checks every 5 minutes
setInterval(checkAllProviders, 5 * 60 * 1000);
```

## Provider Comparison

| Provider     | Availability | Rate Limits  | Global Regions | Cost |
| ------------ | ------------ | ------------ | -------------- | ---- |
| OpenAI       | 99.9%        | 3500 req/min | Yes            | $$$  |
| Anthropic    | 99.9%        | 1000 req/min | Limited        | $$   |
| Google AI    | 99.5%        | 60 req/min   | Yes            | $    |
| Azure OpenAI | 99.95%       | Custom       | Global         | $$$  |

## Best Practices

1. **Configure at least 2 providers**: Minimum for true failover
2. **Mix provider types**: Different infrastructure = better reliability
3. **Monitor health actively**: Don't wait for failures
4. **Set appropriate timeouts**: Balance speed vs reliability
5. **Log all failovers**: Track patterns for optimization

## See Also

- [Error Recovery Patterns](/docs/cookbook/error-recovery)
- [Rate Limit Handling](/docs/cookbook/rate-limit-handling)
- [Cost Optimization](/docs/cookbook/cost-optimization)
- [Provider Comparison Guide](/docs/reference/provider-comparison)

---

## Rate Limit Handling

<!-- Source: cookbook/rate-limit-handling.md -->

# Rate Limit Handling

## Problem

AI providers enforce rate limits to prevent abuse and ensure fair usage. Exceeding these limits results in:

- HTTP 429 errors
- Request failures
- Service disruption
- Temporary bans

Different providers have different limits:

- OpenAI: 3,500 requests/min (paid tier)
- Anthropic: 50 requests/min (free tier)
- Google AI: 60 requests/min

## Solution

Implement intelligent rate limiting with:

1. Token bucket algorithm
2. Request queuing
3. Automatic backoff
4. Per-provider limits
5. Request prioritization

## Code

```typescript

type RateLimitConfig = {
  requestsPerMinute: number;
  burstSize?: number;
  retryAfter?: number;
};

class RateLimiter {
  private queue: Array Promise> = [];
  private processing = false;
  private tokens: number;
  private lastRefill: number;
  private config: Required;

  constructor(config: RateLimitConfig) {
    this.config = {
      requestsPerMinute: config.requestsPerMinute,
      burstSize: config.burstSize || config.requestsPerMinute,
      retryAfter: config.retryAfter || 60000,
    };
    this.tokens = this.config.burstSize;
    this.lastRefill = Date.now();
  }

  /**
   * Refill tokens based on time elapsed
   */
  private refillTokens() {
    const now = Date.now();
    const elapsed = now - this.lastRefill;
    const tokensToAdd = (elapsed / 60000) * this.config.requestsPerMinute;

    this.tokens = Math.min(this.tokens + tokensToAdd, this.config.burstSize);
    this.lastRefill = now;
  }

  /**
   * Wait until a token is available
   */
  private async waitForToken(): Promise {
    this.refillTokens();

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return;
    }

    // Calculate wait time for next token
    const tokensNeeded = 1 - this.tokens;
    const waitTime = (tokensNeeded / this.config.requestsPerMinute) * 60000;

    console.log(
      `⏳ Rate limit: waiting ${Math.ceil(waitTime)}ms for next token`,
    );

    await new Promise((resolve) => setTimeout(resolve, waitTime));
    this.tokens = 0; // Token consumed
  }

  /**
   * Execute a request with rate limiting
   */
  async execute(fn: () => Promise): Promise {
    await this.waitForToken();

    try {
      return await fn();
    } catch (error: any) {
      // Handle rate limit error
      if (error.status === 429) {
        const retryAfter =
          error.headers?.["retry-after"] || this.config.retryAfter / 1000;

        console.log(`⚠️  Rate limit hit. Retrying after ${retryAfter}s`);

        // Reset tokens on rate limit
        this.tokens = 0;

        await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));

        return this.execute(fn);
      }

      throw error;
    }
  }
}

/**
 * Multi-provider rate limiter
 */
class ProviderRateLimiter {
  private limiters = new Map();
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();

    // Configure per-provider limits
    this.limiters.set(
      "openai",
      new RateLimiter({ requestsPerMinute: 3000, burstSize: 100 }),
    );
    this.limiters.set(
      "anthropic",
      new RateLimiter({ requestsPerMinute: 50, burstSize: 10 }),
    );
    this.limiters.set(
      "google-ai",
      new RateLimiter({ requestsPerMinute: 60, burstSize: 15 }),
    );
  }

  /**
   * Generate with automatic rate limiting
   */
  async generate(
    prompt: string,
    provider: string = "openai",
    options: any = {},
  ) {
    const limiter = this.limiters.get(provider);
    if (!limiter) {
      throw new Error(`Unknown provider: ${provider}`);
    }

    return limiter.execute(async () => {
      const result = await this.neurolink.generate({
        input: { text: prompt },
        provider,
        ...options,
      });

      console.log(`✅ Request completed (${provider})`);
      return result;
    });
  }

  /**
   * Batch requests with rate limiting
   */
  async batchGenerate(prompts: string[], provider: string = "openai") {
    const results = [];

    for (let i = 0; i  `Question ${i + 1}: What is ${i + 1} + ${i + 1}?`);

  const results = await limiter.batchGenerate(prompts, "anthropic");
  console.log(`\n✅ Completed ${results.length} requests`);
}

main();
```

## Explanation

### 1. Token Bucket Algorithm

The rate limiter uses a token bucket:

- **Bucket capacity**: `burstSize` (max requests in burst)
- **Refill rate**: `requestsPerMinute / 60` tokens per second
- **Token consumption**: 1 token per request

This allows bursts while maintaining average rate.

### 2. Automatic Refill

Tokens refill continuously based on elapsed time:

```typescript
tokensToAdd = (elapsed_ms / 60000) * requestsPerMinute;
```

### 3. Wait Strategy

When no tokens available:

- Calculate time until next token
- Sleep for that duration
- Consume token and proceed

### 4. 429 Error Handling

When provider returns 429:

- Read `Retry-After` header
- Reset token bucket
- Wait and retry automatically

### 5. Per-Provider Configuration

Different providers have different limits. Configure each separately:

| Provider  | Free Tier  | Paid Tier    | Burst Size |
| --------- | ---------- | ------------ | ---------- |
| OpenAI    | 3 req/min  | 3500 req/min | 100        |
| Anthropic | 50 req/min | 1000 req/min | 10         |
| Google AI | 60 req/min | 1000 req/min | 15         |

## Variations

### Priority Queue

Prioritize important requests:

```typescript
type QueuedRequest = {
  fn: () => Promise;
  priority: number;
  timestamp: number;
};

class PriorityRateLimiter extends RateLimiter {
  private queue: QueuedRequest[] = [];

  async executeWithPriority(
    fn: () => Promise,
    priority: number = 0,
  ): Promise {
    return new Promise((resolve, reject) => {
      this.queue.push({
        fn: async () => {
          try {
            const result = await this.execute(fn);
            resolve(result);
          } catch (error) {
            reject(error);
          }
        },
        priority,
        timestamp: Date.now(),
      });

      // Sort by priority (higher first), then timestamp (earlier first)
      this.queue.sort((a, b) =>
        b.priority !== a.priority
          ? b.priority - a.priority
          : a.timestamp - b.timestamp,
      );

      this.processQueue();
    });
  }

  private async processQueue() {
    if (this.queue.length === 0) return;

    const request = this.queue.shift()!;
    await request.fn();

    if (this.queue.length > 0) {
      this.processQueue();
    }
  }
}
```

### Adaptive Rate Limiting

Adjust limits based on errors:

```typescript
class AdaptiveRateLimiter extends RateLimiter {
  private consecutiveErrors = 0;

  async execute(fn: () => Promise): Promise {
    try {
      const result = await super.execute(fn);
      this.consecutiveErrors = 0; // Reset on success
      return result;
    } catch (error: any) {
      if (error.status === 429) {
        this.consecutiveErrors++;

        // Reduce rate after repeated errors
        if (this.consecutiveErrors >= 3) {
          this.config.requestsPerMinute *= 0.8;
          console.log(
            `⚠️  Reducing rate to ${this.config.requestsPerMinute} req/min`,
          );
        }
      }
      throw error;
    }
  }
}
```

### Distributed Rate Limiting with Redis

For multi-instance deployments:

```typescript

class RedisRateLimiter {
  private redis: Redis;
  private key: string;
  private limit: number;
  private window: number; // seconds

  constructor(redis: Redis, key: string, limit: number, window: number = 60) {
    this.redis = redis;
    this.key = key;
    this.limit = limit;
    this.window = window;
  }

  async execute(fn: () => Promise): Promise {
    const now = Date.now();
    const windowStart = now - this.window * 1000;

    // Remove old entries
    await this.redis.zremrangebyscore(this.key, 0, windowStart);

    // Count current requests
    const count = await this.redis.zcard(this.key);

    if (count >= this.limit) {
      const oldestEntry = await this.redis.zrange(this.key, 0, 0, "WITHSCORES");
      const waitTime = oldestEntry[1]
        ? parseInt(oldestEntry[1]) + this.window * 1000 - now
        : 1000;

      console.log(`⏳ Rate limit: waiting ${waitTime}ms`);
      await new Promise((r) => setTimeout(r, waitTime));
      return this.execute(fn);
    }

    // Add current request
    await this.redis.zadd(this.key, now, `${now}-${Math.random()}`);
    await this.redis.expire(this.key, this.window * 2);

    return fn();
  }
}
```

## Best Practices

1. **Set conservative limits**: Start with 80% of provider's limit
2. **Monitor usage**: Track request patterns to optimize limits
3. **Use burst capacity**: Allow occasional spikes while maintaining average rate
4. **Implement backoff**: Exponential backoff on repeated rate limit errors
5. **Cache responses**: Reduce duplicate requests (see [Cost Optimization](/docs/cookbook/cost-optimization))

## See Also

- [Cost Optimization](/docs/cookbook/cost-optimization)
- [Batch Processing](/docs/cookbook/batch-processing)
- [Error Recovery](/docs/cookbook/error-recovery)
- [Streaming with Retry](/docs/cookbook/streaming-with-retry)

---

## Streaming with Retry Logic

<!-- Source: cookbook/streaming-with-retry.md -->

# Streaming with Retry Logic

## Problem

Network interruptions, temporary provider outages, and transient errors can cause streaming responses to fail mid-stream. Without retry logic, users experience incomplete responses and poor reliability.

## Solution

Implement automatic retry with exponential backoff for streaming responses. Handle different failure scenarios:

- Network timeouts
- Connection drops
- Provider rate limits
- Transient API errors

## Code

```typescript

type RetryConfig = {
  maxRetries: number;
  initialDelay: number;
  maxDelay: number;
  backoffMultiplier: number;
};

async function streamWithRetry(
  neurolink: NeuroLink,
  prompt: string,
  config: RetryConfig = {
    maxRetries: 3,
    initialDelay: 1000,
    maxDelay: 10000,
    backoffMultiplier: 2,
  },
) {
  let attempt = 0;
  let delay = config.initialDelay;

  while (attempt  config.maxRetries) {
        console.error(
          `❌ Stream failed after ${attempt} attempts:`,
          error.message,
        );
        throw error;
      }

      console.log(
        `⚠️  Stream interrupted (attempt ${attempt}/${config.maxRetries}). Retrying in ${delay}ms...`,
      );

      await new Promise((resolve) => setTimeout(resolve, delay));
      delay = Math.min(delay * config.backoffMultiplier, config.maxDelay);
    }
  }
}

// Usage example
async function main() {
  const neurolink = new NeuroLink();

  try {
    const response = await streamWithRetry(
      neurolink,
      "Write a detailed explanation of quantum computing",
      {
        maxRetries: 5,
        initialDelay: 500,
        maxDelay: 8000,
        backoffMultiplier: 2,
      },
    );

    console.log("Final response length:", response.length);
  } catch (error) {
    console.error("Failed after all retries:", error);
  }
}

main();
```

## Explanation

### 1. Retry Configuration

The `RetryConfig` interface defines retry behavior:

- `maxRetries`: Maximum number of retry attempts
- `initialDelay`: Starting delay between retries (milliseconds)
- `maxDelay`: Maximum delay to prevent excessive waiting
- `backoffMultiplier`: How quickly delays increase (exponential backoff)

### 2. Retry Loop

The while loop attempts streaming up to `maxRetries + 1` times (initial attempt + retries).

### 3. Error Classification

Not all errors should trigger retries:

- **Retryable**: Network errors, rate limits, temporary service issues
- **Non-retryable**: Authentication errors, invalid requests, missing models

### 4. Exponential Backoff

Each retry waits longer than the previous:

- First retry: 1000ms
- Second retry: 2000ms
- Third retry: 4000ms
- Fourth retry: 8000ms (capped at maxDelay)

This prevents overwhelming the provider and gives transient issues time to resolve.

### 5. Stream Consumption

The code accumulates chunks to provide a complete response even if earlier attempts partially succeeded.

## Variations

### Resume from Last Position

For very long streams, resume from the last received position:

```typescript
async function streamWithResume(
  neurolink: NeuroLink,
  prompt: string,
  onProgress?: (text: string) => void,
) {
  let accumulated = "";
  let attempt = 0;
  const maxRetries = 3;

  while (attempt  maxRetries) throw error;
      await new Promise((r) => setTimeout(r, 1000 * attempt));
    }
  }
}
```

### Circuit Breaker Pattern

Prevent repeated failures with a circuit breaker:

```typescript
class StreamCircuitBreaker {
  private failures = 0;
  private lastFailureTime = 0;
  private readonly threshold = 5;
  private readonly resetTimeout = 60000; // 1 minute

  async executeStream(fn: () => Promise) {
    // Check if circuit is open
    if (this.failures >= this.threshold) {
      const timeSinceFailure = Date.now() - this.lastFailureTime;
      if (timeSinceFailure
  neurolink.stream({ input: { text: prompt } }),
);
```

### Provider Fallback on Retry

Try different providers on subsequent retries:

```typescript
const providers = ["openai", "anthropic", "google-ai"] as const;

async function streamWithProviderFallback(prompt: string) {
  for (const provider of providers) {
    try {
      console.log(`Trying provider: ${provider}`);
      const stream = await neurolink.stream({
        input: { text: prompt },
        provider,
      });

      let response = "";
      for await (const chunk of stream) {
        if (chunk.type === "content-delta") {
          response += chunk.delta;
        }
      }

      console.log(`✅ Success with ${provider}`);
      return response;
    } catch (error) {
      console.log(`❌ ${provider} failed, trying next...`);
      continue;
    }
  }

  throw new Error("All providers failed");
}
```

## See Also

- [Error Recovery Patterns](/docs/cookbook/error-recovery)
- [Multi-Provider Fallback](/docs/cookbook/multi-provider-fallback)
- [Rate Limit Handling](/docs/cookbook/rate-limit-handling)
- [Streaming API Reference](/docs/sdk/api-reference)

---

## Structured Output with JSON Schema

<!-- Source: cookbook/structured-output.md -->

# Structured Output with JSON Schema

## Problem

AI models return unstructured text by default:

- Inconsistent formatting
- Manual parsing required
- Type safety missing
- Error-prone extraction
- Difficult validation

Applications need structured, typed data:

- JSON objects for APIs
- Type-safe TypeScript interfaces
- Database records
- Form data

## Solution

Use JSON schema to enforce structured output:

1. Define TypeScript interfaces
2. Generate JSON schemas
3. Validate responses
4. Type-safe parsing
5. Error handling

## Code

```typescript

// Define your data structure
type ProductReview = {
  productName: string;
  rating: number;
  sentiment: "positive" | "negative" | "neutral";
  pros: string[];
  cons: string[];
  recommendationScore: number;
  summary: string;
};

// JSON Schema for validation
const productReviewSchema = {
  type: "object",
  properties: {
    productName: {
      type: "string",
      description: "Name of the product being reviewed",
    },
    rating: {
      type: "number",
      minimum: 1,
      maximum: 5,
      description: "Rating from 1 to 5 stars",
    },
    sentiment: {
      type: "string",
      enum: ["positive", "negative", "neutral"],
      description: "Overall sentiment of the review",
    },
    pros: {
      type: "array",
      items: { type: "string" },
      description: "List of positive aspects",
    },
    cons: {
      type: "array",
      items: { type: "string" },
      description: "List of negative aspects",
    },
    recommendationScore: {
      type: "number",
      minimum: 0,
      maximum: 100,
      description: "Likelihood to recommend (0-100)",
    },
    summary: {
      type: "string",
      description: "Brief summary of the review",
    },
  },
  required: [
    "productName",
    "rating",
    "sentiment",
    "pros",
    "cons",
    "recommendationScore",
    "summary",
  ],
};

class StructuredOutputGenerator {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  /**
   * Extract structured data from text
   */
  async extractStructured(
    prompt: string,
    schema: any,
    provider: string = "openai",
  ): Promise {
    const result = await this.neurolink.generate({
      input: { text: prompt },
      provider,
      structuredOutput: {
        type: "json",
        schema,
      },
    });

    // Parse and validate JSON
    try {
      const parsed = JSON.parse(result.content);
      this.validateAgainstSchema(parsed, schema);
      return parsed as T;
    } catch (error: any) {
      throw new Error(`Failed to parse structured output: ${error.message}`);
    }
  }

  /**
   * Basic schema validation
   */
  private validateAgainstSchema(data: any, schema: any): void {
    // Check required fields
    if (schema.required) {
      for (const field of schema.required) {
        if (!(field in data)) {
          throw new Error(`Missing required field: ${field}`);
        }
      }
    }

    // Check types
    for (const [key, value] of Object.entries(data)) {
      const fieldSchema = schema.properties?.[key];
      if (!fieldSchema) continue;

      const actualType = Array.isArray(value) ? "array" : typeof value;
      if (fieldSchema.type !== actualType) {
        throw new Error(
          `Field "${key}" has wrong type. Expected ${fieldSchema.type}, got ${actualType}`,
        );
      }

      // Validate enum
      if (fieldSchema.enum && !fieldSchema.enum.includes(value)) {
        throw new Error(
          `Field "${key}" must be one of: ${fieldSchema.enum.join(", ")}`,
        );
      }

      // Validate number ranges
      if (fieldSchema.type === "number") {
        if (fieldSchema.minimum !== undefined && value = ${fieldSchema.minimum}`);
        }
        if (fieldSchema.maximum !== undefined && value > fieldSchema.maximum) {
          throw new Error(`Field "${key}" must be (
    prompt: string,
    schema: any,
    maxRetries: number = 3,
  ): Promise {
    let lastError: Error | null = null;

    for (let attempt = 1; attempt (prompt, schema);
      } catch (error: any) {
        lastError = error;
        console.error(`❌ Attempt ${attempt} failed: ${error.message}`);

        if (attempt (
    `Extract a structured review from this text: ${reviewText}`,
    productReviewSchema,
  );

  console.log("✅ Extracted Review:");
  console.log(JSON.stringify(review, null, 2));
}

// Example 2: Contact Information Extraction
type ContactInfo = {
  name: string;
  email: string;
  phone?: string;
  company?: string;
  role?: string;
};

const contactSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    email: { type: "string", format: "email" },
    phone: { type: "string" },
    company: { type: "string" },
    role: { type: "string" },
  },
  required: ["name", "email"],
};

async function example2_ContactExtraction() {
  const generator = new StructuredOutputGenerator();

  const text = `
    Hi, I'm John Smith, Senior Engineer at TechCorp Inc.
    You can reach me at john.smith@techcorp.com or call me at
    +1-555-0123. Looking forward to connecting!
  `;

  const contact = await generator.extractStructured(
    `Extract contact information from: ${text}`,
    contactSchema,
  );

  console.log("✅ Extracted Contact:");
  console.log(contact);
}

// Example 3: Database Record Generation
type UserProfile = {
  userId: string;
  username: string;
  age: number;
  interests: string[];
  subscriptionTier: "free" | "basic" | "premium";
  joinedDate: string;
};

const userProfileSchema = {
  type: "object",
  properties: {
    userId: { type: "string", pattern: "^[A-Z0-9]{8}$" },
    username: { type: "string", minLength: 3, maxLength: 20 },
    age: { type: "number", minimum: 13, maximum: 120 },
    interests: { type: "array", items: { type: "string" } },
    subscriptionTier: { type: "string", enum: ["free", "basic", "premium"] },
    joinedDate: { type: "string", format: "date" },
  },
  required: [
    "userId",
    "username",
    "age",
    "interests",
    "subscriptionTier",
    "joinedDate",
  ],
};

async function example3_DatabaseRecord() {
  const generator = new StructuredOutputGenerator();

  const userData = `
    Create a user profile for Sarah Chen, a 28-year-old photography enthusiast
    who also loves hiking and cooking. She's on our premium plan and joined
    last month.
  `;

  const profile = await generator.extractStructured(
    userData,
    userProfileSchema,
    "anthropic", // Claude handles structured output well
  );

  console.log("✅ User Profile:");
  console.log(profile);
}

// Main
async function main() {
  console.log("=== Example 1: Product Review ===\n");
  await example1_ProductReview();

  console.log("\n=== Example 2: Contact Extraction ===\n");
  await example2_ContactExtraction();

  console.log("\n=== Example 3: Database Record ===\n");
  await example3_DatabaseRecord();
}

main();
```

## Explanation

### 1. JSON Schema Definition

Define structure upfront:

```typescript
const schema = {
  type: "object",
  properties: {
    field: { type: "string" },
  },
  required: ["field"],
};
```

### 2. Type Safety

Use TypeScript interfaces for compile-time checking:

```typescript
type MyData = {
  field: string;
};

const data = await extract(prompt, schema);
// data.field is typed as string
```

### 3. Validation

Validate parsed JSON against schema:

- Required fields present
- Correct types
- Enum values valid
- Number ranges respected

### 4. Error Handling

Retry with enhanced prompt on validation failure:

```typescript
prompt += `\nPrevious failed: ${error.message}`;
```

### 5. Provider Selection

Different providers handle structured output differently:

- **OpenAI**: Excellent JSON mode
- **Anthropic**: Good with clear schemas
- **Google AI**: NOTE - Cannot use tools with structured output

## Variations

### Nested Objects

Handle complex nested structures:

```typescript
type Company = {
  name: string;
  employees: Array;
};

const companySchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    employees: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          role: { type: "string" },
          department: {
            type: "object",
            properties: {
              name: { type: "string" },
              budget: { type: "number" },
            },
            required: ["name", "budget"],
          },
        },
        required: ["name", "role", "department"],
      },
    },
  },
  required: ["name", "employees"],
};
```

### Streaming Structured Output

Stream and validate incrementally:

```typescript
async function streamStructuredOutput(
  prompt: string,
  schema: any,
): Promise {
  let buffer = "";

  const stream = await neurolink.stream({
    input: { text: prompt },
    structuredOutput: { type: "json", schema },
  });

  for await (const chunk of stream) {
    if (chunk.type === "content-delta") {
      buffer += chunk.delta;
      process.stdout.write(chunk.delta);
    }
  }

  return JSON.parse(buffer) as T;
}
```

### Union Types

Handle multiple possible schemas:

```typescript
type Response = SuccessResponse | ErrorResponse;

type SuccessResponse = {
  status: "success";
  data: any;
};

type ErrorResponse = {
  status: "error";
  error: string;
  code: number;
};

async function parseResponse(text: string): Promise {
  const result = await generator.extractStructured(text, responseSchema);

  if (result.status === "success") {
    return result as SuccessResponse;
  } else {
    return result as ErrorResponse;
  }
}
```

### Schema from TypeScript

Auto-generate schemas from interfaces:

```typescript

const UserSchema = z.object({
  name: z.string(),
  age: z.number().min(0).max(120),
  email: z.string().email(),
});

const jsonSchema = zodToJsonSchema(UserSchema);

const user = await generator.extractStructured>(
  prompt,
  jsonSchema,
);
```

## Use Cases

| Use Case           | Schema Complexity | Recommended Provider |
| ------------------ | ----------------- | -------------------- |
| Data extraction    | Simple            | OpenAI, Anthropic    |
| Form filling       | Medium            | OpenAI               |
| API responses      | Medium            | OpenAI, Google AI    |
| Database records   | Complex           | OpenAI               |
| Classification     | Simple            | Any provider         |
| Sentiment analysis | Simple            | Anthropic            |

## Best Practices

1. **Define schemas upfront**: Don't rely on prompt engineering alone
2. **Use TypeScript types**: Compile-time safety prevents runtime errors
3. **Validate responses**: Don't trust AI output blindly
4. **Retry on failure**: Validation errors can be recovered
5. **Test schemas**: Verify with sample data before production
6. **Keep schemas simple**: Complex nesting reduces accuracy

## See Also

- [Batch Processing](/docs/cookbook/batch-processing)
- [Error Recovery](/docs/cookbook/error-recovery)
- [API Reference - Generate Method](/docs/sdk/api-reference)
- [Provider Comparison](/docs/reference/provider-comparison)

---

## Tool Chaining with MCP

<!-- Source: cookbook/tool-chaining.md -->

# Tool Chaining with MCP

## Problem

Complex tasks require multiple MCP tool calls in sequence:

- Search → Read → Analyze → Write
- Query database → Process → Store results
- Fetch data → Transform → Send notification

Manually orchestrating tool calls is:

- Error-prone
- Difficult to manage state
- Hard to handle failures
- Not reusable

## Solution

Implement intelligent tool chaining with:

1. Automatic tool selection
2. State management
3. Error recovery
4. Result validation
5. Chain composition

## Code

```typescript

type ChainStep = {
  toolName: string;
  args: Record;
  validateResult?: (result: any) => boolean;
  onError?: (error: Error) => "retry" | "skip" | "abort";
};

type ChainContext = {
  steps: ChainStep[];
  results: any[];
  currentStep: number;
  metadata: Record;
};

class ToolChain {
  private neurolink: NeuroLink;
  private context: ChainContext;

  constructor() {
    this.neurolink = new NeuroLink({
      mcpServers: {
        filesystem: {
          command: "npx",
          args: ["-y", "@modelcontextprotocol/server-filesystem", "."],
        },
        github: {
          command: "npx",
          args: ["-y", "@modelcontextprotocol/server-github"],
          env: {
            GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN || "",
          },
        },
      },
    });

    this.context = {
      steps: [],
      results: [],
      currentStep: 0,
      metadata: {},
    };
  }

  /**
   * Add step to chain
   */
  addStep(step: ChainStep): this {
    this.context.steps.push(step);
    return this; // Fluent interface
  }

  /**
   * Execute tool chain
   */
  async execute(): Promise {
    const errors: Error[] = [];

    console.log(
      ` Executing chain with ${this.context.steps.length} steps...\n`,
    );

    for (let i = 0; i  {
    // Replace placeholders in args with previous results
    const processedArgs = this.processArgs(step.args);

    // Use AI to execute tool
    const result = await this.neurolink.generate({
      input: {
        text: `Execute the tool "${step.toolName}" with these arguments: ${JSON.stringify(processedArgs)}`,
      },
      enableTools: true,
    });

    return this.extractToolResult(result);
  }

  /**
   * Process args to replace placeholders with previous results
   */
  private processArgs(args: Record): Record {
    const processed: Record = {};

    for (const [key, value] of Object.entries(args)) {
      if (typeof value === "string" && value.startsWith("$")) {
        // Reference to previous result
        const stepIndex = parseInt(value.slice(1));
        processed[key] = this.context.results[stepIndex];
      } else {
        processed[key] = value;
      }
    }

    return processed;
  }

  /**
   * Extract tool result from AI response
   */
  private extractToolResult(response: any): any {
    // Implementation depends on response format
    return response.toolResults?.[0] || response.content;
  }

  /**
   * Get chain context
   */
  getContext(): ChainContext {
    return this.context;
  }

  /**
   * Reset chain
   */
  reset(): this {
    this.context = {
      steps: [],
      results: [],
      currentStep: 0,
      metadata: {},
    };
    return this;
  }
}

/**
 * Pre-built chain templates
 */
class ChainTemplates {
  /**
   * Search → Read → Summarize chain
   */
  static searchAnalyzeChain(query: string, maxFiles: number = 3): ToolChain {
    const chain = new ToolChain();

    return chain
      .addStep({
        toolName: "search_files",
        args: { query, max_results: maxFiles },
      })
      .addStep({
        toolName: "read_file",
        args: { path: "$0" }, // Use result from step 0
      })
      .addStep({
        toolName: "analyze_content",
        args: { content: "$1" },
      });
  }

  /**
   * Fetch → Process → Save chain
   */
  static fetchProcessSaveChain(url: string, outputPath: string): ToolChain {
    const chain = new ToolChain();

    return chain
      .addStep({
        toolName: "fetch_url",
        args: { url },
        validateResult: (result) => result.status === 200,
      })
      .addStep({
        toolName: "process_data",
        args: { data: "$0" },
      })
      .addStep({
        toolName: "write_file",
        args: {
          path: outputPath,
          content: "$1",
        },
        onError: () => "retry",
      });
  }

  /**
   * GitHub workflow: Create issue → Create branch → Push → Create PR
   */
  static githubWorkflowChain(
    repo: string,
    issueTitle: string,
    branchName: string,
  ): ToolChain {
    const chain = new ToolChain();

    return chain
      .addStep({
        toolName: "github_create_issue",
        args: {
          repo,
          title: issueTitle,
          body: "Auto-generated issue",
        },
      })
      .addStep({
        toolName: "github_create_branch",
        args: {
          repo,
          branch: branchName,
          from: "main",
        },
      })
      .addStep({
        toolName: "github_push_files",
        args: {
          repo,
          branch: branchName,
          files: [],
          message: `Fixes #$0`, // Reference issue from step 0
        },
      })
      .addStep({
        toolName: "github_create_pr",
        args: {
          repo,
          title: `Fix: ${issueTitle}`,
          head: branchName,
          base: "main",
          body: `Closes #$0`,
        },
      });
  }
}

// Usage Example 1: File Processing Chain
async function example1_FileProcessing() {
  const chain = new ToolChain();

  chain
    .addStep({
      toolName: "list_directory",
      args: { path: "./docs" },
    })
    .addStep({
      toolName: "read_file",
      args: { path: "$0" }, // Read first file from listing
      validateResult: (content) => content.length > 0,
    })
    .addStep({
      toolName: "analyze_content",
      args: { content: "$1" },
    });

  const result = await chain.execute();

  console.log("\n=== Results ===");
  console.log("Success:", result.success);
  console.log("Results:", result.results);
}

// Example 2: Data Pipeline Chain
async function example2_DataPipeline() {
  const chain = new ToolChain();

  chain
    .addStep({
      toolName: "query_database",
      args: {
        query: "SELECT * FROM users WHERE active = true",
      },
    })
    .addStep({
      toolName: "transform_data",
      args: { data: "$0" },
      onError: () => "skip", // Skip transformation errors
    })
    .addStep({
      toolName: "send_notification",
      args: {
        message: "Data pipeline completed: $1",
      },
    });

  await chain.execute();
}

// Example 3: Using Pre-built Templates
async function example3_Templates() {
  // Search and analyze
  const searchChain = ChainTemplates.searchAnalyzeChain("authentication", 5);
  await searchChain.execute();

  // GitHub workflow
  const githubChain = ChainTemplates.githubWorkflowChain(
    "myorg/myrepo",
    "Fix authentication bug",
    "fix/auth-bug",
  );
  await githubChain.execute();
}

// Main
async function main() {
  console.log("=== Example 1: File Processing ===\n");
  await example1_FileProcessing();

  console.log("\n=== Example 2: Data Pipeline ===\n");
  await example2_DataPipeline();

  console.log("\n=== Example 3: Templates ===\n");
  await example3_Templates();
}

main();
```

## Explanation

### 1. Fluent Interface

Chain steps with method chaining:

```typescript
chain
  .addStep({...})
  .addStep({...})
  .addStep({...});
```

### 2. Result References

Reference previous step results:

```typescript
args: {
  content: "$1";
} // Use result from step 1
```

### 3. Validation

Validate step results:

```typescript
validateResult: (result) => result.status === 200;
```

### 4. Error Handling

Control flow on errors:

- **"abort"**: Stop chain
- **"retry"**: Retry current step
- **"skip"**: Continue to next step

### 5. Reusable Templates

Pre-built chains for common patterns:

```typescript
ChainTemplates.searchAnalyzeChain(query);
```

## Variations

### Conditional Chains

Branch based on results:

```typescript
class ConditionalChain extends ToolChain {
  addConditionalStep(
    condition: (context: ChainContext) => boolean,
    trueStep: ChainStep,
    falseStep: ChainStep,
  ) {
    return this.addStep({
      ...trueStep,
      args: condition(this.context) ? trueStep.args : falseStep.args,
    });
  }
}

// Usage
chain.addConditionalStep(
  (ctx) => ctx.results[0].count > 100,
  { toolName: "process_large", args: {} },
  { toolName: "process_small", args: {} },
);
```

### Parallel Chains

Execute independent chains in parallel:

```typescript
async function executeParallel(chains: ToolChain[]) {
  const results = await Promise.all(chains.map((chain) => chain.execute()));

  return {
    success: results.every((r) => r.success),
    results: results.map((r) => r.results),
    errors: results.flatMap((r) => r.errors),
  };
}

// Usage
await executeParallel([
  ChainTemplates.searchAnalyzeChain("auth"),
  ChainTemplates.searchAnalyzeChain("database"),
]);
```

### Loop Chains

Repeat steps until condition met:

```typescript
class LoopChain extends ToolChain {
  async executeLoop(
    step: ChainStep,
    condition: (result: any) => boolean,
    maxIterations: number = 10,
  ) {
    let iterations = 0;
    let result: any;

    while (iterations  result.status === "complete",
  20,
);
```

### Chain Composition

Combine multiple chains:

```typescript
class CompositeChain {
  private chains: ToolChain[] = [];

  add(chain: ToolChain): this {
    this.chains.push(chain);
    return this;
  }

  async execute() {
    const results = [];

    for (const chain of this.chains) {
      const result = await chain.execute();
      results.push(result);

      if (!result.success) {
        break; // Stop on first failure
      }
    }

    return results;
  }
}
```

## Common Patterns

### Data Processing Pipeline

```
Fetch → Validate → Transform → Store → Notify
```

### Content Workflow

```
Search → Read → Analyze → Summarize → Publish
```

### GitHub Automation

```
Create Issue → Create Branch → Commit → Push → Create PR
```

### Monitoring Pipeline

```
Query Metrics → Analyze → Alert → Create Ticket → Notify
```

## Best Practices

1. **Keep chains short**: 3-5 steps maximum
2. **Validate early**: Check results at each step
3. **Handle errors**: Define recovery strategy
4. **Use templates**: Standardize common patterns
5. **Log extensively**: Track chain execution
6. **Test chains**: Verify each step independently
7. **Document dependencies**: Clear step relationships

## See Also

- [MCP Integration Guide](/docs/features/mcp-tools-showcase)
- [Error Recovery](/docs/cookbook/error-recovery)
- [Batch Processing](/docs/cookbook/batch-processing)
- [SDK Custom Tools](/docs/sdk/custom-tools)

---

# MCP Integration

## MCP Foundation (Model Context Protocol)

<!-- Source: mcp/overview.md -->

#  MCP Foundation (Model Context Protocol)

**NeuroLink** features a groundbreaking **MCP Foundation** that transforms NeuroLink from an AI SDK into a **Universal AI Development Platform** while maintaining the simple factory method interface.

##  Production Achievement

**MCP Foundation Production Ready: 27/27 Tests Passing (100% Success Rate)**

- ✅ **Factory-First Architecture**: MCP tools work internally, users see simple factory methods
- ✅ **Lighthouse Compatible**: 99% compatible with existing MCP tools and servers
- ✅ **Enterprise Grade**: Rich context, permissions, tool orchestration, analytics
- ✅ **Performance Validated**: 0-11ms tool execution (target: \;
  parentContext?: MCPContext;
  toolChain: string[];
  performance: PerformanceMetrics;
  // + 8 more fields
};
```

####  Tool Registry (5/5 tests ✅)

- **Tool discovery**: Automatic detection of available tools
- **Registration system**: Dynamic tool registration and management
- **Execution tracking**: Statistics and performance monitoring
- **Filtering and search**: Find tools by capability and metadata

```typescript
// Registry tracks all available tools with metadata
const registry = {
  generate: {
    description: "Generate AI text content",
    schema: {
      /* JSON Schema */
    },
    provider: "aiCoreServer",
    executionCount: 1247,
    averageLatency: 850,
  },
};
```

####  Tool Orchestration (4/4 tests ✅)

- **Single tool execution**: Direct tool invocation with error handling
- **Sequential pipelines**: Chain tools together for complex workflows
- **Error recovery**: Automatic retry and fallback mechanisms
- **Performance monitoring**: Track execution time and success rates

```typescript
// Orchestrate complex workflows with multiple tools
const pipeline = [
  { tool: "analyze-ai-usage", params: { timeframe: "24h" } },
  { tool: "optimize-prompt-parameters", params: { prompt: "user-input" } },
  { tool: "generate", params: { optimizedParams: true } },
];
```

####  AI Provider Integration (6/6 tests ✅)

- **Core AI tools**: 3 essential tools for AI operations
- **Schema validation**: JSON Schema validation for all inputs/outputs
- **Provider abstraction**: Unified interface across all AI providers
- **Error standardization**: Consistent error handling and reporting (now with specific "model not found" errors for Ollama)

```typescript
// AI Provider MCP Tools
const aiTools = [
  "generate", // Text generation with provider selection
  "select-provider", // Automatic provider selection
  "check-provider-status", // Provider connectivity and health
];
```

####  Integration Tests (3/3 tests ✅)

- **End-to-end workflow validation**: Complete user journey testing
- **Performance benchmarking**: Tool execution time verification
- **Error scenario testing**: Comprehensive failure mode validation
- **Multi-tool pipeline testing**: Complex workflow verification

##  Performance Metrics

### Tool Execution Performance

- **Individual Tools**: 0-11ms execution time (target: \<100ms) ✅
- **Pipeline Execution**: 22ms for 2-step sequence ✅
- **Error Handling**: Graceful failures with comprehensive logging ✅
- **Context Management**: Rich context with minimal overhead ✅

### Enterprise Features

- **Rich Context**: 15+ fields including session, user, provider, permissions
- **Security Framework**: Permission-based access control and validation
- **Performance Analytics**: Detailed execution metrics and monitoring
- **Error Recovery**: Automatic retry and fallback mechanisms

##  Tool Ecosystem

### Current MCP Tools (10 Total)

#### Core AI Tools (3)

1. **`generate`** - AI text generation with provider selection
2. **`select-provider`** - Automatic best provider selection
3. **`check-provider-status`** - Provider connectivity and health checks

#### AI Analysis Tools (3)

4. **`analyze-ai-usage`** - Usage patterns and cost optimization
5. **`benchmark-provider-performance`** - Provider performance comparison
6. **`optimize-prompt-parameters`** - Parameter optimization for better output

#### AI Workflow Tools (4)

7. **`generate-test-cases`** - Comprehensive test case generation
8. **`refactor-code`** - AI-powered code optimization
9. **`generate-documentation`** - Automatic documentation creation
10. **`debug-ai-output`** - AI output validation and debugging

### Tool Categories

- **Production Ready**: All 10 tools with comprehensive testing
- **Enterprise Grade**: Rich context, permissions, error handling
- **Performance Optimized**: Sub-millisecond execution for most tools
- **Lighthouse Compatible**: Standard MCP protocol compliance

##  Lighthouse Compatibility

### Migration Strategy

- **99% Compatible**: Existing Lighthouse tools work with minimal changes
- **Import Statement Updates**: Change import statements, functionality preserved
- **Enhanced Context**: Lighthouse tools gain rich context automatically
- **Performance Improvements**: Better error handling and monitoring

```typescript
// Before (Lighthouse)

// After (NeuroLink MCP)

```

### Compatibility Features

- **Standard MCP Protocol**: Full compliance with MCP 2024-11-05 specification
- **Transport Support**: stdio, SSE, WebSocket, and HTTP transports supported
- **HTTP Transport**: Remote MCP servers with authentication, retry, and rate limiting
- **Schema Validation**: JSON Schema validation for all tool interactions
- **Error Handling**: Standardized error responses and recovery

## ️ Security and Permissions

### Permission Framework

- **Role-Based Access**: Different permission levels for different user types
- **Tool-Level Security**: Granular permissions for individual tools
- **Context Isolation**: Secure context boundaries between operations
- **Audit Logging**: Comprehensive logging for security monitoring

```typescript
// Permission-based tool execution
const context = {
  userId: "user123",
  permissions: ["ai:generate", "ai:analyze"],
  securityLevel: "enterprise",
};
```

### Security Features

- **Input Validation**: Comprehensive validation of all tool inputs
- **Output Sanitization**: Clean and validate all tool outputs
- **Context Boundaries**: Prevent information leakage between contexts
- **Error Information**: Sanitized error messages without sensitive data

##  Monitoring and Analytics

### Performance Tracking

- **Execution Metrics**: Track tool execution time and success rates
- **Usage Analytics**: Monitor tool usage patterns and trends
- **Error Analysis**: Comprehensive error tracking and analysis
- **Performance Optimization**: Identify and optimize slow operations

### Monitoring Features

- **Real-time Dashboards**: Live monitoring of tool performance
- **Historical Analysis**: Long-term trend analysis and reporting
- **Alert System**: Automated alerts for performance issues
- **Usage Reports**: Detailed usage and cost reporting

##  Lighthouse Integration: 60+ Production-Ready Tools

### Direct Import Approach (1-2 weeks)

**BREAKTHROUGH**: Instead of migrating 30+ tools (8-10 weeks), we now **directly import** Lighthouse's 60+ production-ready tools into NeuroLink.

```typescript
// Import Lighthouse tools directly

// Register in NeuroLink with one method call
const neurolink = new NeuroLink();
neurolink.registerLighthouseServer(juspayAnalyticsServer, {
  contextMapping: {
    shopId: "context.shopId",
    merchantId: "context.merchantId",
  },
});

// AI can now answer e-commerce questions using real production data
const result = await neurolink.generate({
  input: { text: "What were our payment success rates last month?" },
  // AI automatically discovers and uses juspay_get-success-rate-by-time tool
});
```

### Available Lighthouse Tools (60+ Tools)

#### **Payment Analytics Tools:**

- `get-success-rate-by-time` - Payment success rates over time
- `get-payment-method-wise-sr` - Success rates by payment method
- `get-transaction-trends` - Transaction trend analysis
- `get-failure-transactional-data` - Failed transaction analysis
- `get-gmv-order-value-payment-wise` - Revenue by payment method

#### **E-commerce Analytics Tools:**

- `get-conversion-rates` - Shop conversion metrics
- `process-analytics-data` - Process raw analytics
- `get-order-stats` - Order statistics and trends
- `get-merchant-data` - Merchant information
- `get-shop-performance` - Shop performance metrics

#### **Platform Integration Tools:**

- **Shopify**: Complete Shopify store integration
- **WooCommerce**: WooCommerce integration
- **Magento**: Magento store integration

### Integration Benefits

- **Zero Duplication**: Import existing tools, don't recreate
- **Auto-Updates**: Lighthouse improvements flow to NeuroLink automatically
- **Battle-Tested**: Production-ready tools with real API integrations
- **Minimal Maintenance**: Lighthouse team maintains tool implementations
- **Rich Context**: Full business context (shopId, merchantId, etc.)

** Complete Integration Guide**: [docs/lighthouse-unified-integration.md](/docs/lighthouse-unified-integration)

##  Technical Implementation Details

### MCP Server Architecture

```typescript
// Core MCP server structure
src/lib/mcp/
├── factory.ts                  # createMCPServer() - Lighthouse compatible
├── context-manager.ts          # Rich context (15+ fields) + tool chain tracking
├── registry.ts                 # Tool discovery, registration, execution + statistics
├── orchestrator.ts             # Single tools + sequential pipelines + error handling
└── servers/aiProviders/       # AI Core Server with 3 tools integrated
    └── aiCoreServer.ts       # generate, select-provider, check-provider-status
```

### Context Flow

1. **Context Creation**: Rich context with user, session, and permission data
2. **Tool Registration**: Tools register with metadata and capabilities
3. **Execution Request**: Tools execute with full context and validation
4. **Result Processing**: Results processed with context and performance tracking
5. **Context Cleanup**: Automatic cleanup and resource management

### Error Handling Strategy

- **Graceful Degradation**: Tools continue working even with partial failures
- **Comprehensive Logging**: Detailed logging for debugging and monitoring
- **Recovery Mechanisms**: Automatic retry and fallback for failed operations
- **Error Standardization**: Consistent error formats across all tools

##  Related Documentation

- **[Main README](/docs/)** - Project overview and quick start
- **[AI Analysis Tools](/docs/ai-analysis-tools)** - AI optimization and analysis tools
- **[AI Workflow Tools](/docs/ai-workflow-tools)** - Development lifecycle tools
- **[MCP Integration Guide](/docs/mcp/integration)** - Complete MCP setup and usage
- **[API Reference](/docs/sdk/api-reference)** - Complete TypeScript API

---

**Universal AI Development Platform** - MCP Foundation enables unlimited extensibility while preserving the simple interface developers love.

---

## MCP Configuration Locations Across AI Development Tools

<!-- Source: mcp/configuration.md -->

# MCP Configuration Locations Across AI Development Tools

This document provides a comprehensive guide to where different AI development tools store their Model Context Protocol (MCP) configurations.

## Summary of Common Patterns

Most AI development tools store MCP configurations in JSON files with a common structure:

```json
{
  "mcpServers": {
    "server-name": {
      "command": "node",
      "args": ["path/to/server.js"],
      "env": { "KEY": "value" }
    }
  }
}
```

The most common configuration keys are:

- `mcpServers` (most common)
- `servers` (alternative)
- `mcp.servers` (nested in settings)

## Tool-Specific Configuration Locations

### 1. Claude Desktop

- **Location**: `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS)
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
- **Linux**: `~/.config/Claude/claude_desktop_config.json`
- **Config Key**: `mcpServers` or `mcp_servers`

### 2. Cline AI Coder (VS Code Extension)

- **Location**: VS Code extension globalStorage
  - macOS: `~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json`
  - Linux: `~/.config/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json`
  - Windows: `%APPDATA%\Code\User\globalStorage\saoudrizwan.claude-dev\settings\cline_mcp_settings.json`
- **Config Key**: `mcpServers` or `servers`

### 3. VS Code

- **Workspace Configuration**:
  - `.vscode/mcp.json` (dedicated MCP file)
  - `.vscode/settings.json` (in `mcp.servers` section)
- **Global Configuration**:
  - macOS: `~/Library/Application Support/Code/User/settings.json`
  - Linux: `~/.config/Code/User/settings.json`
  - Windows: `%APPDATA%\Code\User\settings.json`
- **Config Key**: `mcpServers`, `servers`, or `mcp.servers` (in settings.json)

### 4. Cursor

- **Global**: `~/.cursor/mcp.json`
- **Project**: `.cursor/mcp.json`
- **Config Key**: `mcpServers` or `servers`

### 5. Windsurf

- **Location**: `~/.codeium/windsurf/mcp_config.json`
- **Config Key**: `mcpServers` or `servers`

### 6. Continue Dev

- **Global**: `~/.continue/config.json`
- **Project**: `.continue/config.json`
- **Config Key**: `mcpServers` or `contextProviders.mcp`

### 7. Aider

- **Location**: `~/.aider/config.json` or `~/.aider/aider.conf`
- **Config Key**: `mcp_servers`

### 8. Generic/Project-Level Configurations

Many tools also check for generic MCP configuration files in the project root:

- `mcp.json`
- `.mcp-config.json`
- `mcp_config.json`
- `.mcp-servers.json`

## Common Configuration Structure

Most tools follow a similar JSON structure:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "@modelcontextprotocol/server-filesystem",
        "/path/to/allowed/directory"
      ]
    },
    "github": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_TOKEN": "your-github-token"
      }
    },
    "custom-server": {
      "command": "node",
      "args": ["/path/to/custom/server.js"],
      "cwd": "/path/to/working/directory",
      "env": {
        "CUSTOM_VAR": "value"
      }
    }
  }
}
```

## HTTP Transport Configuration

For remote MCP servers using HTTP/Streamable HTTP transport:

```json
{
  "mcpServers": {
    "remote-api": {
      "transport": "http",
      "url": "https://api.example.com/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN"
      },
      "httpOptions": {
        "connectionTimeout": 30000,
        "requestTimeout": 60000,
        "idleTimeout": 120000,
        "keepAliveTimeout": 30000
      },
      "retryConfig": {
        "maxAttempts": 3,
        "initialDelay": 1000,
        "maxDelay": 30000,
        "backoffMultiplier": 2
      },
      "rateLimiting": {
        "requestsPerMinute": 60,
        "maxBurst": 10,
        "useTokenBucket": true
      }
    },
    "oauth-protected-api": {
      "transport": "http",
      "url": "https://api.secure.com/mcp",
      "auth": {
        "type": "oauth2",
        "oauth": {
          "clientId": "your-client-id",
          "clientSecret": "your-client-secret",
          "authorizationUrl": "https://auth.provider.com/authorize",
          "tokenUrl": "https://auth.provider.com/token",
          "redirectUrl": "http://localhost:8080/callback",
          "scope": "mcp:read mcp:write",
          "usePKCE": true
        }
      }
    }
  }
}
```

### HTTP Configuration Options

| Option         | Type   | Description                                  |
| -------------- | ------ | -------------------------------------------- |
| `transport`    | string | Must be `"http"` for HTTP transport          |
| `url`          | string | Remote MCP endpoint URL                      |
| `headers`      | object | Custom HTTP headers (e.g., Authorization)    |
| `httpOptions`  | object | Connection timeout settings                  |
| `retryConfig`  | object | Retry with exponential backoff               |
| `rateLimiting` | object | Rate limiting configuration                  |
| `auth`         | object | OAuth 2.1, Bearer, or API key authentication |

See [MCP HTTP Transport Guide](/docs/mcp/http-transport) for complete documentation.

## Key Observations

1. **Common Pattern**: Almost all tools use JSON files with an `mcpServers` object
2. **Location Hierarchy**: Tools typically check in this order:
   - Project/workspace specific configs
   - User/global configs
   - Default/fallback configs

3. **Platform Differences**:
   - macOS: Often uses `~/Library/Application Support/`
   - Linux: Typically uses `~/.config/`
   - Windows: Usually uses `%APPDATA%`

4. **Extension Storage**: VS Code extensions (like Cline) store configs in VS Code's globalStorage

## Auto-Discovery Priority

When multiple configurations exist, tools typically prioritize in this order:

1. Workspace/project-specific configurations (highest priority)
2. Tool-specific global configurations
3. Generic project configurations (lowest priority)

## Best Practices

1. **Project-Specific Servers**: Use `.vscode/mcp.json` or similar for project-specific MCP servers
2. **Global Servers**: Configure frequently-used servers in your tool's global config
3. **Environment Variables**: Store sensitive data (API keys) in environment variables
4. **Version Control**: Commit project-specific configs, exclude global configs with API keys

## NeuroLink Auto-Discovery

NeuroLink's MCP auto-discovery system automatically searches all these locations and can discover MCP servers configured in any of these tools. Use the CLI command:

```bash
neurolink mcp discover
```

This will find and list all MCP servers configured across your system, regardless of which tool configured them.

---

## MCP Concurrency Control Guide

<!-- Source: mcp/concurrency.md -->

# MCP Concurrency Control Guide

> ⚠️ **PLANNED FEATURE**: This documentation describes features that are planned but not yet implemented. The `SemaphoreManager` class referenced in this guide does not currently exist in the codebase. The code examples are illustrative of the intended API design.

**NeuroLink Enhanced MCP Platform - Concurrency Management**

## ️ **Architecture & Implementation**

### **Core Semaphore Pattern**

```typescript
export class SemaphoreManager {
  private semaphores: Map> = new Map();
  private stats: Map = new Map();

  async acquire(
    key: string,
    operation: () => Promise,
  ): Promise> {
    const startTime = Date.now();
    const existing = this.semaphores.get(key);

    // Wait for existing operation if present
    if (existing) {
      await existing;
    }

    // Execute operation with automatic cleanup
    const promise = operation();
    this.semaphores.set(
      key,
      promise.then(
        () => {},
        () => {},
      ),
    );

    try {
      const result = await promise;
      return {
        success: true,
        result,
        waitTime: existing ? Date.now() - startTime : 0,
        executionTime: Date.now() - startTime,
        queueDepth: this.getQueueDepth(key),
      };
    } finally {
      this.semaphores.delete(key);
    }
  }
}
```

### **Integration with MCP Orchestrator**

```typescript
export class MCPOrchestrator {
  private semaphoreManager: SemaphoreManager;

  async executeTool(
    toolName: string,
    args: unknown,
    context: NeuroLinkExecutionContext,
  ): Promise {
    return await this.semaphoreManager.acquire(
      toolName, // Use tool name as semaphore key
      async () => {
        return await this.registry.executeTool(toolName, args, context);
      },
    );
  }
}
```

---

##  **Usage Patterns**

### **Basic Usage**

```typescript

const semaphoreManager = new SemaphoreManager();

// Execute operation with concurrency control
const result = await semaphoreManager.acquire("my-operation", async () => {
  // Your operation here
  return await performSomeTask();
});

console.log("Success:", result.success);
console.log("Wait Time:", result.waitTime);
console.log("Execution Time:", result.executionTime);
```

### **Tool-Specific Concurrency Control**

```typescript
// Same tool executions are serialized
const fileOperations = [
  semaphoreManager.acquire("file-read", () => readFile("data1.txt")),
  semaphoreManager.acquire("file-read", () => readFile("data2.txt")),
  semaphoreManager.acquire("file-read", () => readFile("data3.txt")),
];

// These will execute sequentially to prevent file conflicts
const results = await Promise.all(fileOperations);
```

### **Different Tools Run Concurrently**

```typescript
// Different tools can run simultaneously
const mixedOperations = [
  semaphoreManager.acquire("file-read", () => readFile("data.txt")),
  semaphoreManager.acquire("http-request", () =>
    fetchData("https://api.example.com"),
  ),
  semaphoreManager.acquire("database-query", () => queryDatabase()),
];

// These will execute concurrently for optimal performance
const results = await Promise.all(mixedOperations);
```

---

##  **Performance Monitoring**

### **Statistics Interface**

```typescript
type SemaphoreStats = {
  activeOperations: number; // Currently running operations
  queuedOperations: number; // Operations waiting in queue
  totalOperations: number; // Total operations processed
  totalWaitTime: number; // Cumulative wait time (ms)
  averageWaitTime: number; // Average wait time per operation
  peakQueueDepth: number; // Maximum queue depth reached
};

// Get statistics for monitoring
const stats = semaphoreManager.getStats("tool-name");
console.log(`Average wait time: ${stats.averageWaitTime}ms`);
console.log(`Peak queue depth: ${stats.peakQueueDepth}`);
```

### **Performance Metrics**

```typescript
// Real-world performance characteristics
const PERFORMANCE_BENCHMARKS = {
  overhead: " {
  const operations = Array.from({ length: 100 }, (_, i) =>
    semaphoreManager.acquire("test-tool", async () => {
      await new Promise((resolve) => setTimeout(resolve, 100));
      return `Operation ${i} complete`;
    }),
  );

  const startTime = Date.now();
  const results = await Promise.all(operations);
  const totalTime = Date.now() - startTime;

  console.log(`100 operations completed in ${totalTime}ms`);
  console.log(`All successful: ${results.every((r) => r.success)}`);
};
```

### **Race Condition Prevention Test**

```typescript
// Verify serialization of same-tool operations
const testSerialization = async () => {
  let counter = 0;
  const operations = Array.from({ length: 10 }, () =>
    semaphoreManager.acquire("counter-tool", async () => {
      const current = counter;
      await new Promise((resolve) => setTimeout(resolve, 10));
      counter = current + 1;
      return counter;
    }),
  );

  const results = await Promise.all(operations);
  const finalValues = results.map((r) => r.result);

  // Should be [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] in some order
  console.log("Final counter value:", counter); // Should be 10
  console.log("No race conditions:", new Set(finalValues).size === 10);
};
```

---

##  **Configuration & Tuning**

### **Advanced Configuration**

```typescript
type SemaphoreManagerOptions = {
  maxConcurrentOperations?: number; // Global concurrency limit
  defaultTimeout?: number; // Operation timeout (ms)
  cleanupInterval?: number; // Stats cleanup interval (ms)
  enableStatistics?: boolean; // Enable/disable stats collection
};

const semaphoreManager = new SemaphoreManager({
  maxConcurrentOperations: 50,
  defaultTimeout: 30000,
  cleanupInterval: 300000,
  enableStatistics: true,
});
```

### **Memory Management**

```typescript
// Automatic cleanup configuration
const cleanupOptions = {
  maxHistorySize: 1000, // Max entries in statistics history
  cleanupThreshold: 0.8, // Cleanup when 80% full
  forceCleanupInterval: 600000, // Force cleanup every 10 minutes
};
```

---

##  **Troubleshooting**

### **Common Issues & Solutions**

#### **High Wait Times**

```typescript
// Diagnose high wait times
const diagnostics = await semaphoreManager.getDiagnostics();
if (diagnostics.averageWaitTime > 1000) {
  console.warn("High wait times detected:");
  console.log("- Consider reducing operation complexity");
  console.log("- Check for blocking I/O operations");
  console.log("- Monitor queue depth patterns");
}
```

#### **Memory Growth**

```typescript
// Monitor memory usage
const memoryUsage = process.memoryUsage();
const activeOperations = semaphoreManager.getActiveOperationCount();

if (memoryUsage.heapUsed > 200 * 1024 * 1024) {
  // 200MB
  console.warn("High memory usage detected");
  console.log(`Active operations: ${activeOperations}`);
  console.log("Consider implementing operation timeouts");
}
```

#### **Deadlock Detection**

```typescript
// Monitor for potential deadlocks
const deadlockCheck = () => {
  const stats = semaphoreManager.getAllStats();
  const stalledOperations = Object.entries(stats)
    .filter(([_, stat]) => stat.averageWaitTime > 30000)
    .map(([toolName, _]) => toolName);

  if (stalledOperations.length > 0) {
    console.warn("Potential deadlocks detected in:", stalledOperations);
  }
};
```

---

##  **Best Practices**

### **Operation Design**

1. **Keep Operations Atomic**: Each semaphore-protected operation should be self-contained
2. **Minimize Operation Time**: Reduce wait times by optimizing operation duration
3. **Use Appropriate Keys**: Choose semaphore keys that reflect actual resource conflicts
4. **Avoid Nested Semaphores**: Prevent potential deadlock scenarios

### **Error Handling**

```typescript
const robustExecution = async () => {
  try {
    const result = await semaphoreManager.acquire(
      "risky-operation",
      async () => {
        // Operation that might fail
        return await performRiskyTask();
      },
    );

    if (!result.success) {
      console.error("Operation failed:", result.error);
      // Handle failure appropriately
    }
  } catch (error) {
    console.error("Semaphore error:", error);
    // Handle semaphore-level errors
  }
};
```

### **Performance Optimization**

```typescript
// Batch similar operations when possible
const batchOperations = async (items: string[]) => {
  return await semaphoreManager.acquire("batch-operation", async () => {
    // Process all items in a single semaphore-protected block
    return await Promise.all(items.map(processItem));
  });
};
```

---

##  **Integration Examples**

### **File System Operations**

```typescript
// Prevent concurrent file modifications
const fileManager = {
  async writeFile(filename: string, content: string) {
    return await semaphoreManager.acquire(`file:${filename}`, async () => {
      return await fs.writeFile(filename, content);
    });
  },

  async readFile(filename: string) {
    return await semaphoreManager.acquire(`file:${filename}`, async () => {
      return await fs.readFile(filename, "utf8");
    });
  },
};
```

### **API Rate Limiting**

```typescript
// Prevent API rate limit violations
const apiManager = {
  async makeRequest(endpoint: string, data: any) {
    return await semaphoreManager.acquire("api-requests", async () => {
      await new Promise((resolve) => setTimeout(resolve, 100)); // Rate limit
      return await fetch(endpoint, {
        method: "POST",
        body: JSON.stringify(data),
      });
    });
  },
};
```

### **Database Operations**

```typescript
// Serialize database migrations
const dbManager = {
  async runMigration(migrationName: string) {
    return await semaphoreManager.acquire("database-migration", async () => {
      console.log(`Running migration: ${migrationName}`);
      return await executeMigration(migrationName);
    });
  },
};
```

---

**STATUS**: Production-ready concurrency control system with comprehensive testing and monitoring capabilities. Provides enterprise-grade race condition prevention while maintaining optimal performance for concurrent operations.

---

## NeuroLink Docs MCP Server

<!-- Source: mcp/docs-server.md -->

# NeuroLink Docs MCP Server

The NeuroLink Docs MCP Server makes the entire NeuroLink documentation (360+ pages across 27 sections) queryable by AI assistants through the [Model Context Protocol](https://modelcontextprotocol.io). Instead of copy-pasting docs into your prompt, your AI assistant can search, browse, and read NeuroLink documentation on demand.

**What it provides:**

- **6 tools** — full-text search, page retrieval, section browsing, API reference lookup, example search, and changelog
- **Pre-built search index** — generated at build time with MiniSearch for instant results
- **Dual transport** — stdio for local use, HTTP for remote/hosted deployments
- **Zero configuration** — runs via `npx` with no API keys required

## Quick Start

Add the NeuroLink docs server to your AI development tool:

```json
{
  "mcpServers": {
    "neurolink-docs": {
      "command": "npx",
      "args": ["-y", "@juspay/neurolink", "docs"]
    }
  }
}
```

```json
{
  "mcpServers": {
    "neurolink-docs": {
      "command": "npx",
      "args": ["-y", "@juspay/neurolink", "docs"]
    }
  }
}
```

```bash
claude mcp add neurolink-docs -- npx -y @juspay/neurolink docs
```

```json
{
  "servers": {
    "neurolink-docs": {
      "command": "npx",
      "args": ["-y", "@juspay/neurolink", "docs"]
    }
  }
}
```

```json
{
  "mcpServers": {
    "neurolink-docs": {
      "command": "npx",
      "args": ["-y", "@juspay/neurolink", "docs"]
    }
  }
}
```

:::tip[Same command everywhere]
All clients use the same `npx -y @juspay/neurolink docs` command. The only difference is the config file location and JSON key format (`mcpServers` vs `servers`).
:::

## Available Tools

The docs server exposes 6 tools to your AI assistant:

| Tool                | Description                                          | Parameters                               |
| ------------------- | ---------------------------------------------------- | ---------------------------------------- |
| `search_docs`       | Full-text search across all documentation            | `query` (required), `limit?`, `section?` |
| `get_page`          | Get the full content of a specific doc page          | `path` (required)                        |
| `list_sections`     | List all documentation sections and their pages      | none                                     |
| `get_api_reference` | Get SDK API reference, optionally filtered by method | `method?`                                |
| `get_examples`      | Get code examples by topic or provider               | `topic?`, `provider?`                    |
| `get_changelog`     | Get recent changelog entries                         | `limit?`                                 |

## Tool Examples

### search_docs

Search across all NeuroLink documentation with optional section filtering.

**Request:**

```json
{
  "query": "RAG pipeline",
  "limit": 3,
  "section": "features"
}
```

**Response:**

```json
{
  "query": "RAG pipeline",
  "resultCount": 3,
  "results": [
    {
      "title": "RAG (Retrieval-Augmented Generation)",
      "description": "Complete RAG pipeline with 9 chunking strategies...",
      "section": "features",
      "path": "features/rag",
      "url": "https://docs.neurolink.ink/docs/features/rag",
      "score": 12.45
    },
    {
      "title": "File Processors",
      "description": "Process 50+ file types for AI consumption...",
      "section": "features",
      "path": "features/file-processors",
      "url": "https://docs.neurolink.ink/docs/features/file-processors",
      "score": 8.21
    }
  ]
}
```

### get_page

Retrieve the full content of a specific documentation page by its path.

**Request:**

```json
{
  "path": "getting-started/installation"
}
```

**Response:**

```json
{
  "title": "Installation",
  "description": "Install NeuroLink via npm, yarn, or pnpm...",
  "section": "getting-started",
  "path": "getting-started/installation",
  "url": "https://docs.neurolink.ink/docs/getting-started/installation",
  "content": "# Installation\n\nInstall NeuroLink using your preferred package manager..."
}
```

### list_sections

List all documentation sections and the pages they contain.

**Request:** _(no parameters)_

**Response:**

```json
{
  "totalSections": 27,
  "totalPages": 361,
  "sections": [
    {
      "name": "getting-started",
      "pageCount": 15,
      "pages": [
        { "title": "Getting Started", "path": "getting-started/index" },
        { "title": "Installation", "path": "getting-started/installation" }
      ]
    },
    {
      "name": "sdk",
      "pageCount": 6,
      "pages": [
        { "title": "SDK Overview", "path": "sdk/index" },
        { "title": "API Reference", "path": "sdk/api-reference" }
      ]
    }
  ]
}
```

### get_api_reference

Get SDK API reference documentation. Pass a method name to filter results.

**Request:**

```json
{
  "method": "generate"
}
```

**Response:**

```json
{
  "query": "generate",
  "results": [
    {
      "title": "API Reference",
      "path": "sdk/api-reference",
      "url": "https://docs.neurolink.ink/docs/sdk/api-reference"
    }
  ]
}
```

### get_examples

Find code examples by topic or AI provider.

**Request:**

```json
{
  "topic": "streaming",
  "provider": "anthropic"
}
```

**Response:**

```json
{
  "query": "streaming anthropic",
  "results": [
    {
      "title": "Streaming with Retry",
      "description": "Implement streaming with automatic retry...",
      "section": "cookbook",
      "path": "cookbook/streaming-with-retry",
      "url": "https://docs.neurolink.ink/docs/cookbook/streaming-with-retry"
    }
  ]
}
```

### get_changelog

Get recent NeuroLink release notes and changelog entries.

**Request:**

```json
{
  "limit": 3
}
```

**Response:**

```json
{
  "entries": [
    {
      "title": "Changelog",
      "description": "NeuroLink release history...",
      "path": "community/changelog",
      "url": "https://docs.neurolink.ink/docs/community/changelog",
      "content": "## 9.12.0\n\n### Features\n- MCP CLI gap fix..."
    }
  ]
}
```

## HTTP Transport

For remote or hosted deployments, start the server with HTTP transport:

```bash
# Start HTTP server on default port 3001
neurolink docs --transport http

# Start on a custom port
neurolink docs --transport http --port 8080
```

The HTTP server exposes:

- `POST /mcp` — MCP endpoint (Streamable HTTP transport)
- `GET /health` — Health check endpoint

Configure your MCP client to connect via HTTP:

```json
{
  "mcpServers": {
    "neurolink-docs": {
      "transport": "http",
      "url": "https://your-server.com/mcp"
    }
  }
}
```

:::info[Hosted version]
The hosted version is available at `https://docs.neurolink.ink/mcp` — no local installation required.
:::

## Programmatic Usage

You can also add the docs server programmatically via the NeuroLink SDK:

```typescript

const neurolink = new NeuroLink();

// Add the docs server as an external MCP server
await neurolink.addExternalMCPServer("neurolink-docs", {
  command: "npx",
  args: ["-y", "@juspay/neurolink", "docs"],
  transport: "stdio",
});

// Now the AI can use docs tools during generation
const result = await neurolink.generate({
  prompt: "How do I set up RAG with NeuroLink? Search the docs first.",
});
```

Or connect to the HTTP transport:

```typescript
await neurolink.addExternalMCPServer("neurolink-docs", {
  transport: "http",
  url: "https://docs.neurolink.ink/mcp",
});
```

## Building the Search Index

The search index is generated automatically during the docs site build:

```bash
cd docs-site && pnpm build
```

This runs the `docusaurus-plugin-search-index` plugin which:

1. Scans all `docs/**/*.md` and `docs/**/*.mdx` files
2. Parses frontmatter (title, description, tags)
3. Extracts and indexes content with MiniSearch
4. Writes `static/search-index.json`

The index is bundled with the npm package, so end users don't need to build it themselves.

## Troubleshooting

### "search-index.json not found"

The search index hasn't been built yet. Run:

```bash
cd docs-site && pnpm build
```

This generates `docs-site/static/search-index.json` which the MCP server needs to function.

### Outdated search results

The search index is generated at build time. To get the latest docs:

```bash
cd docs-site && pnpm build
```

If using the npm package, update to the latest version:

```bash
npm update @juspay/neurolink
```

### Server not appearing in Claude Desktop / Cursor

1. Verify the config file is in the correct location (see [Quick Start](#quick-start) above)
2. Ensure the JSON is valid — a trailing comma or missing bracket will silently fail
3. Restart the application after saving the config
4. Check that `npx` is available in your PATH

### Connection timeout

If the server takes too long to start:

1. The first run downloads `@juspay/neurolink` via npx — this may take 10-30 seconds
2. Subsequent runs use the npm cache and start faster
3. For faster startup, install globally: `npm install -g @juspay/neurolink`

### Tools not returning results

If search returns empty results:

1. Verify the search index exists and is not empty
2. Try broader search terms — the index uses fuzzy matching with prefix search
3. Use `list_sections` first to see available sections, then filter with `section` parameter

---

## HTTP Transport for MCP Servers

<!-- Source: mcp/http-transport.md -->

# HTTP Transport for MCP Servers

## Overview

NeuroLink now supports **HTTP/Streamable HTTP transport** for Model Context Protocol (MCP) servers, enabling integration with remote MCP services like GitHub Copilot MCP API and custom HTTP-based MCP endpoints.

The HTTP transport implements the [MCP Streamable HTTP specification](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports), providing:

- ✅ Remote MCP server connectivity
- ✅ Custom header support for authentication
- ✅ Session management and automatic reconnection
- ✅ Firewall and proxy compatibility
- ✅ Both streaming (SSE) and batch JSON responses

## Quick Start

### GitHub Copilot Integration

```bash
# Add GitHub Copilot MCP endpoint
npx neurolink mcp add github-copilot "https://api.githubcopilot.com/mcp" \
  --transport http \
  --url "https://api.githubcopilot.com/mcp" \
  --headers '{"Authorization": "Bearer YOUR_GITHUB_COPILOT_TOKEN"}'
```

### Configuration File

Add to `.mcp-config.json`:

```json
{
  "mcpServers": {
    "github-copilot": {
      "name": "github-copilot",
      "command": "https://api.githubcopilot.com/mcp",
      "transport": "http",
      "url": "https://api.githubcopilot.com/mcp",
      "headers": {
        "Authorization": "Bearer ghp_xxxxxxxxxxxxxxxxxxxx"
      },
      "description": "GitHub Copilot MCP API"
    }
  }
}
```

### Programmatic Usage

```typescript

const neurolink = new NeuroLink();

// Add HTTP MCP server
await neurolink.addInMemoryMCPServer("github-copilot", {
  server: {
    title: "GitHub Copilot MCP",
    description: "GitHub Copilot API integration",
    tools: {},
  },
  config: {
    id: "github-copilot",
    name: "github-copilot",
    description: "GitHub Copilot MCP API",
    command: "https://api.githubcopilot.com/mcp",
    transport: "http",
    url: "https://api.githubcopilot.com/mcp",
    headers: {
      Authorization: "Bearer YOUR_TOKEN",
    },
    tools: [],
    status: "initializing",
  },
});

// Use the MCP server
const result = await neurolink.generate({
  input: { text: "Use GitHub Copilot to help me write code" },
  provider: "openai",
  disableTools: false,
});
```

## Authentication

HTTP transport supports custom headers for authentication:

### Bearer Token Authentication

```json
{
  "headers": {
    "Authorization": "Bearer YOUR_TOKEN"
  }
}
```

### API Key Authentication

```json
{
  "headers": {
    "X-API-Key": "your-api-key-here"
  }
}
```

### Custom Headers

```json
{
  "headers": {
    "Authorization": "Bearer YOUR_TOKEN",
    "X-Custom-Header": "custom-value",
    "X-Request-ID": "unique-request-id"
  }
}
```

### OAuth 2.1 Authentication

For enterprise integrations requiring OAuth 2.1 with PKCE:

```json
{
  "mcpServers": {
    "enterprise-api": {
      "transport": "http",
      "url": "https://api.enterprise.com/mcp",
      "auth": {
        "type": "oauth2",
        "oauth": {
          "clientId": "your-client-id",
          "clientSecret": "your-client-secret",
          "authorizationUrl": "https://auth.enterprise.com/oauth/authorize",
          "tokenUrl": "https://auth.enterprise.com/oauth/token",
          "redirectUrl": "http://localhost:8080/callback",
          "scope": "mcp:read mcp:write",
          "usePKCE": true
        }
      }
    }
  }
}
```

**OAuth Configuration Options:**

| Option             | Type    | Required | Description                              |
| ------------------ | ------- | -------- | ---------------------------------------- |
| `clientId`         | string  | Yes      | OAuth client identifier                  |
| `clientSecret`     | string  | No       | OAuth client secret (optional with PKCE) |
| `authorizationUrl` | string  | Yes      | Authorization endpoint URL               |
| `tokenUrl`         | string  | Yes      | Token endpoint URL                       |
| `redirectUrl`      | string  | Yes      | OAuth callback URL                       |
| `scope`            | string  | No       | Space-separated OAuth scopes             |
| `usePKCE`          | boolean | No       | Enable PKCE (recommended, default: true) |

### Authentication Types

The `auth` configuration supports three authentication types:

**1. OAuth 2.1 (recommended for enterprise)**

```json
{
  "auth": {
    "type": "oauth2",
    "oauth": { ... }
  }
}
```

**2. Bearer Token**

```json
{
  "auth": {
    "type": "bearer",
    "token": "your-access-token"
  }
}
```

**3. API Key**

```json
{
  "auth": {
    "type": "api-key",
    "apiKey": "your-api-key",
    "apiKeyHeader": "X-API-Key"
  }
}
```

## Transport Comparison

| Feature            | stdio    | SSE      | WebSocket | HTTP     |
| ------------------ | -------- | -------- | --------- | -------- |
| Local servers      | ✅       | ❌       | ❌        | ❌       |
| Remote servers     | ❌       | ✅       | ✅        | ✅       |
| Authentication     | Env vars | Headers  | Headers   | Headers  |
| Streaming          | ✅       | ✅       | ✅        | ✅       |
| Firewall friendly  | ✅       | ✅       | ⚠️        | ✅       |
| Session management | ❌       | ⚠️       | ⚠️        | ✅       |
| Reconnection       | ❌       | ⚠️       | ⚠️        | ✅       |
| Specification      | MCP Core | MCP Core | MCP Core  | MCP 2025 |

## Configuration Options

### Required Fields

- `transport`: Must be set to `"http"`
- `url`: The HTTP endpoint URL (e.g., `https://api.example.com/mcp`)
- `command`: Usually same as URL for HTTP transport

### Optional Fields

- `headers`: Object with HTTP headers for authentication and configuration
- `httpOptions`: Fine-grained HTTP connection settings (see below)
- `retryConfig`: Automatic retry configuration with exponential backoff
- `rateLimiting`: Rate limiting to prevent API throttling
- `auth`: Authentication configuration (OAuth 2.1, Bearer, API Key)
- `timeout`: Connection timeout in milliseconds (default: 10000)
- `retries`: Maximum retry attempts (default: 3)
- `autoRestart`: Whether to automatically restart on failure (default: true)
- `healthCheckInterval`: Health check interval in milliseconds (default: 30000)

### HTTP Options Configuration

Fine-tune HTTP connection behavior:

```typescript
{
  httpOptions: {
    connectionTimeout: 30000,  // Connection timeout (ms), default: 30000
    requestTimeout: 60000,     // Request timeout (ms), default: 60000
    idleTimeout: 120000,       // Idle connection timeout (ms), default: 120000
    keepAliveTimeout: 30000    // Keep-alive timeout (ms), default: 30000
  }
}
```

| Option              | Type   | Default | Description                          |
| ------------------- | ------ | ------- | ------------------------------------ |
| `connectionTimeout` | number | 30000   | Maximum time to establish connection |
| `requestTimeout`    | number | 60000   | Maximum time for request completion  |
| `idleTimeout`       | number | 120000  | Time before closing idle connections |
| `keepAliveTimeout`  | number | 30000   | Keep-alive connection timeout        |

### Retry Configuration

Automatic retry with exponential backoff:

```typescript
{
  retryConfig: {
    maxAttempts: 3,           // Maximum retry attempts, default: 3
    initialDelay: 1000,       // Initial delay (ms), default: 1000
    maxDelay: 30000,          // Maximum delay (ms), default: 30000
    backoffMultiplier: 2      // Backoff multiplier, default: 2
  }
}
```

| Option              | Type   | Default | Description                        |
| ------------------- | ------ | ------- | ---------------------------------- |
| `maxAttempts`       | number | 3       | Maximum number of retry attempts   |
| `initialDelay`      | number | 1000    | Initial delay before first retry   |
| `maxDelay`          | number | 30000   | Maximum delay between retries      |
| `backoffMultiplier` | number | 2       | Multiplier for exponential backoff |

### Rate Limiting Configuration

Prevent API throttling with token bucket rate limiting:

```typescript
{
  rateLimiting: {
    requestsPerMinute: 60,    // Max requests per minute, default: 60
    requestsPerHour: 1000,    // Max requests per hour (optional)
    maxBurst: 10,             // Max burst size, default: 10
    useTokenBucket: true      // Use token bucket algorithm, default: true
  }
}
```

| Option              | Type    | Default | Description                         |
| ------------------- | ------- | ------- | ----------------------------------- |
| `requestsPerMinute` | number  | 60      | Maximum requests allowed per minute |
| `requestsPerHour`   | number  | -       | Maximum requests allowed per hour   |
| `maxBurst`          | number  | 10      | Maximum burst size for token bucket |
| `useTokenBucket`    | boolean | true    | Use token bucket algorithm          |

### Example: Complete Configuration

```json
{
  "mcpServers": {
    "custom-api": {
      "name": "custom-api",
      "command": "https://your-api.example.com/mcp",
      "transport": "http",
      "url": "https://your-api.example.com/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_TOKEN",
        "X-Custom-Header": "value"
      },
      "httpOptions": {
        "connectionTimeout": 30000,
        "requestTimeout": 60000,
        "idleTimeout": 120000,
        "keepAliveTimeout": 30000
      },
      "retryConfig": {
        "maxAttempts": 5,
        "initialDelay": 1000,
        "maxDelay": 30000,
        "backoffMultiplier": 2
      },
      "rateLimiting": {
        "requestsPerMinute": 100,
        "maxBurst": 20,
        "useTokenBucket": true
      },
      "timeout": 15000,
      "autoRestart": true,
      "healthCheckInterval": 60000,
      "description": "Custom MCP API endpoint"
    }
  }
}
```

## Use Cases

### 1. GitHub Copilot Integration

Access GitHub Copilot's AI capabilities through MCP:

```typescript
const neurolink = new NeuroLink();

await neurolink.addInMemoryMCPServer("copilot", {
  server: { title: "GitHub Copilot", tools: {} },
  config: {
    id: "copilot",
    name: "copilot",
    description: "GitHub Copilot MCP",
    transport: "http",
    url: "https://api.githubcopilot.com/mcp",
    headers: { Authorization: "Bearer YOUR_TOKEN" },
    tools: [],
    status: "initializing",
  },
});
```

### 2. Enterprise API Gateway

Connect to internal MCP services behind API gateways:

```json
{
  "internal-tools": {
    "transport": "http",
    "url": "https://internal-gateway.company.com/mcp",
    "headers": {
      "Authorization": "Bearer INTERNAL_TOKEN",
      "X-Tenant-ID": "tenant-123"
    }
  }
}
```

### 3. Multi-Cloud MCP Services

Connect to MCP services across different cloud providers:

```json
{
  "aws-mcp": {
    "transport": "http",
    "url": "https://mcp.us-east-1.amazonaws.com/api",
    "headers": {
      "X-API-Key": "AWS_API_KEY"
    }
  },
  "azure-mcp": {
    "transport": "http",
    "url": "https://mcp.azure.com/api/v1",
    "headers": {
      "Ocp-Apim-Subscription-Key": "AZURE_KEY"
    }
  }
}
```

## Troubleshooting

### Connection Failed

**Problem:** Unable to connect to HTTP MCP server

**Solutions:**

1. Verify the URL is correct and accessible
2. Check authentication headers are valid
3. Ensure firewall/proxy allows HTTPS traffic
4. Test with `curl` first:
   ```bash
   curl -H "Authorization: Bearer TOKEN" https://api.example.com/mcp
   ```

### Authentication Errors

**Problem:** 401 Unauthorized or 403 Forbidden

**Solutions:**

1. Verify token is valid and not expired
2. Check token has required permissions
3. Ensure header format matches API requirements
4. Try regenerating the authentication token

### Timeout Issues

**Problem:** Connection times out

**Solutions:**

1. Increase timeout value in configuration
2. Check network connectivity
3. Verify the server is running and responsive
4. Test with a simple HTTP client first

### Invalid Headers

**Problem:** Server rejects custom headers

**Solutions:**

1. Check header names follow HTTP specification
2. Ensure header values are properly formatted
3. Some headers may be reserved or blocked by proxies
4. Try different header names (e.g., `X-API-Key` instead of `Api-Key`)

## Technical Details

### Implementation

HTTP transport uses the `StreamableHTTPClientTransport` from the `@modelcontextprotocol/sdk` package, which implements:

- **JSON-RPC 2.0** for message protocol
- **Server-Sent Events (SSE)** for streaming responses
- **HTTP POST** for sending requests
- **Session management** via `Mcp-Session-Id` header
- **Automatic reconnection** with exponential backoff

### Security Considerations

1. **HTTPS Required**: Always use HTTPS in production
2. **Token Security**: Store tokens securely (environment variables, secrets management)
3. **Header Sanitization**: Avoid logging sensitive headers
4. **Network Security**: Use VPNs or private networks for internal APIs
5. **Rate Limiting**: Implement client-side rate limiting for public APIs

## Migration Guide

### From SSE to HTTP

If you're currently using SSE transport, migration is straightforward:

**Before (SSE):**

```json
{
  "transport": "sse",
  "url": "http://localhost:8080/sse"
}
```

**After (HTTP):**

```json
{
  "transport": "http",
  "url": "https://api.example.com/mcp",
  "headers": {
    "Authorization": "Bearer TOKEN"
  }
}
```

### From stdio to HTTP

Migrating from local stdio servers to remote HTTP requires server changes:

1. Deploy your MCP server as an HTTP service
2. Implement authentication endpoint
3. Update client configuration to use HTTP transport
4. Add authentication headers

## Resources

- [MCP Specification - Transports](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports)
- [GitHub Copilot MCP API Documentation](https://github.com/features/copilot)
- [NeuroLink MCP Integration Guide](/docs/mcp/integration)
- Example HTTP Transport Configurations:

```json
{
  "mcpServers": {
    "github-copilot": {
      "name": "github-copilot",
      "transport": "http",
      "url": "https://api.githubcopilot.com/mcp",
      "headers": {
        "Authorization": "Bearer ${GITHUB_COPILOT_TOKEN}"
      },
      "httpOptions": {
        "connectionTimeout": 30000,
        "requestTimeout": 60000
      },
      "retryConfig": {
        "maxAttempts": 3,
        "initialDelay": 1000,
        "maxDelay": 30000
      },
      "rateLimiting": {
        "requestsPerMinute": 60,
        "maxBurst": 10
      },
      "description": "GitHub Copilot MCP API with full configuration"
    },
    "simple-http-server": {
      "name": "simple-http-server",
      "transport": "http",
      "url": "https://api.example.com/mcp",
      "headers": {
        "Authorization": "Bearer ${API_TOKEN}"
      },
      "description": "Minimal HTTP transport configuration"
    },
    "enterprise-oauth-server": {
      "name": "enterprise-oauth-server",
      "transport": "http",
      "url": "https://api.enterprise.com/mcp",
      "auth": {
        "type": "oauth2",
        "oauth": {
          "clientId": "${OAUTH_CLIENT_ID}",
          "clientSecret": "${OAUTH_CLIENT_SECRET}",
          "authorizationUrl": "https://auth.enterprise.com/authorize",
          "tokenUrl": "https://auth.enterprise.com/token",
          "redirectUrl": "http://localhost:8080/callback",
          "scope": "mcp:read mcp:write tools:execute",
          "usePKCE": true
        }
      },
      "description": "Enterprise MCP server with OAuth 2.1 + PKCE"
    }
  }
}
```

## Support

For issues or questions:

- GitHub Issues: [juspay/neurolink/issues](https://github.com/juspay/neurolink/issues)
- Documentation: [NeuroLink Docs](https://github.com/juspay/neurolink/docs)
- Examples: [Basic Usage Examples](/docs/examples/basic-usage)

---

## MCP (Model Context Protocol) Integration Guide

<!-- Source: mcp/integration.md -->

#  MCP (Model Context Protocol) Integration Guide

## ✅ IMPLEMENTATION STATUS: COMPLETE (2025-01-07)

**Generate Function Migration completed - MCP integration enhanced with factory patterns**

- ✅ MCP tools work seamlessly with modern `generate()` method
- ✅ Factory pattern provides better MCP tool management
- ✅ Enhanced error handling for MCP server connections
- ✅ All existing MCP configurations continue working

> **Migration Note**: MCP integration enhanced but remains transparent.
> Use `generate()` for future-ready MCP workflows.

##  **Overview**

NeuroLink now supports the **Model Context Protocol (MCP)** for seamless integration with external servers and tools. This enables unlimited extensibility through the growing MCP ecosystem while maintaining NeuroLink's simple interface.

### **Enhanced MCP Integration with Factory Patterns**

```typescript

const neurolink = new NeuroLink();

// NEW: Enhanced MCP integration with generate()
const result = await neurolink.generate({
  input: { text: "List files in current directory using MCP" },
  provider: "google-ai",
  disableTools: false, // Enable MCP tool usage
});

// Alternative approach using legacy method (backward compatibility)
const legacyResult = await neurolink.generate({
  prompt: "List files in current directory using MCP",
  provider: "google-ai",
  disableTools: false,
});
```

### **What is MCP?**

The Model Context Protocol is a standardized way for AI applications to connect to external tools and data sources. It enables:

- ✅ **External Tool Integration** - Connect to filesystem, databases, APIs, and more
- ✅ **Standardized Communication** - JSON-RPC 2.0 protocol over multiple transports
- ✅ **Tool Discovery** - Automatic discovery of available tools and capabilities
- ✅ **Secure Execution** - Controlled access to external resources
- ✅ **Ecosystem Compatibility** - Works with 65+ community servers

---

##  **Quick Start**

### **1. Install Popular MCP Servers**

```bash
# Install filesystem server for file operations
npx neurolink mcp install filesystem

# Install GitHub server for repository management
npx neurolink mcp install github

# Install database server for SQL operations
npx neurolink mcp install postgres
```

### **2. Test Connectivity**

```bash
# Test server connectivity and discover tools
npx neurolink mcp test filesystem

# List all configured servers with status
npx neurolink mcp list --status
```

### **3. 🆕 Programmatic Server Management**

**NEW!** Add MCP servers dynamically at runtime:

```typescript

const neurolink = new NeuroLink();

// Add external servers dynamically
await neurolink.addMCPServer("bitbucket", {
  command: "npx",
  args: ["-y", "@nexus2520/bitbucket-mcp-server"],
  env: {
    BITBUCKET_USERNAME: "your-username",
    BITBUCKET_APP_PASSWORD: "your-token",
  },
});

// Add database integration
await neurolink.addMCPServer("database", {
  command: "node",
  args: ["./custom-db-server.js"],
  env: { DB_CONNECTION: "postgresql://..." },
});

// Verify registration
const status = await neurolink.getMCPStatus();
console.log("Active servers:", status.totalServers);
```

### **4. Execute Tools (Coming Soon)**

```bash
# Execute tools from connected servers
npx neurolink mcp exec filesystem read_file --params '{"path": "README.md"}'
npx neurolink mcp exec github create_issue --params '{"title": "New feature", "body": "Description"}'
```

---

##  **MCP CLI Commands Reference**

### **Server Management**

#### **Install Popular Servers**

```bash
neurolink mcp install
```

**Available servers:**

- `filesystem` - File and directory operations
- `github` - GitHub repository management
- `postgres` - PostgreSQL database operations
- `brave-search` - Web search capabilities
- `puppeteer` - Browser automation

**Example:**

```bash
neurolink mcp install filesystem
# ✅ Installed MCP server: filesystem
#  Test it with: neurolink mcp test filesystem
```

#### **Add Custom Servers**

```bash
neurolink mcp add   [options]
```

**Options:**

- `--args` - Command arguments (array)
- `--transport` - Transport type (stdio|sse|websocket|http)
- `--url` - URL for SSE/WebSocket/HTTP transport
- `--headers` - HTTP headers for authentication (JSON)
- `--env` - Environment variables (JSON)
- `--cwd` - Working directory

**Examples:**

```bash
# Add custom server with arguments
neurolink mcp add myserver "python /path/to/server.py" --args "arg1,arg2"

# Add SSE server
neurolink mcp add webserver "http://localhost:8080" --transport sse --url "http://localhost:8080/mcp"

# Add HTTP remote server with authentication
neurolink mcp add remote-api "https://api.example.com/mcp" --transport http --url "https://api.example.com/mcp" --headers '{"Authorization": "Bearer YOUR_TOKEN"}'

# Add server with environment variables
neurolink mcp add dbserver "npx db-mcp-server" --env '{"DB_URL": "postgresql://..."}'
```

#### **List Configured Servers**

```bash
neurolink mcp list [--status]
```

**Example output:**

```
 Configured MCP servers (2):

 filesystem
   Command: npx -y @modelcontextprotocol/server-filesystem /
   Transport: stdio
✔ filesystem: ✅ Available

 github
   Command: npx @modelcontextprotocol/server-github
   Transport: stdio
✖ github: ❌ Not available
```

#### **Test Server Connectivity**

```bash
neurolink mcp test
```

**Example output:**

```
 Testing MCP server: filesystem

✔ ✅ Connection successful!

 Server Capabilities:
   Protocol Version: 2024-11-05
   Tools: ✅ Supported

️  Available Tools:
   • read_file: Read file contents from filesystem
   • write_file: Create/overwrite files
   • edit_file: Make line-based edits
   • create_directory: Create directories
   • list_directory: List directory contents
   + 6 more tools...
```

#### **Remove Servers**

```bash
neurolink mcp remove
```

---

## ⚙️ **Configuration**

### **External Server Configuration** [Coming Soon]

External MCP servers will be configured in `.mcp-config.json`:

```json
{
  "mcpServers": {
    "filesystem": {
      "name": "filesystem",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/"],
      "transport": "stdio"
    },
    "github": {
      "name": "github",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "transport": "stdio"
    },
    "custom": {
      "name": "custom",
      "command": "python",
      "args": ["/path/to/server.py"],
      "transport": "stdio",
      "cwd": "/project/directory"
    }
  }
}
```

### **Environment Variables**

Set these in your `.env` file for server authentication:

```bash
# Custom Server Configuration
CUSTOM_API_KEY=your-api-key
CUSTOM_ENDPOINT=https://api.example.com
```

---

## ️ **Available MCP Servers**

### **Filesystem Server**

**Purpose:** File and directory operations
**Installation:** `neurolink mcp install filesystem`

**Available Tools:**

- `read_file` - Read file contents
- `write_file` - Create or overwrite files
- `edit_file` - Make line-based edits
- `create_directory` - Create directories
- `list_directory` - List directory contents
- `directory_tree` - Get recursive tree view
- `move_file` - Move/rename files
- `search_files` - Search for files by pattern
- `get_file_info` - Get file metadata

### **GitHub Server**

**Purpose:** GitHub repository management
**Installation:** `neurolink mcp install github`

**Available Tools:**

- `create_repository` - Create new repositories
- `search_repositories` - Search public repositories
- `get_file_contents` - Read repository files
- `create_or_update_file` - Modify repository files
- `create_issue` - Create GitHub issues
- `create_pull_request` - Create pull requests
- `fork_repository` - Fork repositories

### **PostgreSQL Server**

**Purpose:** Database operations
**Installation:** `neurolink mcp install postgres`

**Available Tools:**

- `read-query` - Execute SELECT queries
- `write-query` - Execute INSERT/UPDATE/DELETE queries
- `create-table` - Create database tables
- `list-tables` - List available tables
- `describe-table` - Get table schema

### **Brave Search Server**

**Purpose:** Web search capabilities
**Installation:** `neurolink mcp install brave-search`

**Available Tools:**

- `brave_web_search` - Search the web
- `brave_local_search` - Search for local businesses

### **Puppeteer Server**

**Purpose:** Browser automation
**Installation:** `neurolink mcp install puppeteer`

**Available Tools:**

- `puppeteer_navigate` - Navigate to URLs
- `puppeteer_screenshot` - Take screenshots
- `puppeteer_click` - Click elements
- `puppeteer_fill` - Fill forms
- `puppeteer_evaluate` - Execute JavaScript

---

##  **Advanced Usage**

### **Transport Types**

#### **STDIO Transport (Default)**

Best for local servers and CLI tools:

```bash
neurolink mcp add local-server "python server.py" --transport stdio
```

#### **SSE Transport**

For web-based servers:

```bash
neurolink mcp add web-server "http://localhost:8080" --transport sse --url "http://localhost:8080/sse"
```

#### **HTTP Transport (Streamable HTTP)**

For remote MCP servers with authentication, retry, and rate limiting:

```bash
neurolink mcp add remote-api "https://api.example.com/mcp" \
  --transport http \
  --url "https://api.example.com/mcp" \
  --headers '{"Authorization": "Bearer YOUR_TOKEN"}'
```

**Configuration in `.mcp-config.json`:**

```json
{
  "mcpServers": {
    "remote-api": {
      "transport": "http",
      "url": "https://api.example.com/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN"
      },
      "httpOptions": {
        "connectionTimeout": 30000,
        "requestTimeout": 60000,
        "idleTimeout": 120000,
        "keepAliveTimeout": 30000
      },
      "retryConfig": {
        "maxAttempts": 3,
        "initialDelay": 1000,
        "maxDelay": 30000,
        "backoffMultiplier": 2
      },
      "rateLimiting": {
        "requestsPerMinute": 60,
        "maxBurst": 10,
        "useTokenBucket": true
      }
    }
  }
}
```

**HTTP Transport Features:**

- Custom headers for authentication (Bearer, API Key)
- Configurable connection and request timeouts
- Automatic retry with exponential backoff
- Rate limiting with token bucket algorithm
- OAuth 2.1 support with PKCE

See [MCP HTTP Transport Guide](/docs/mcp/http-transport) for complete documentation.

### **Server Environment Configuration**

Pass environment variables to servers:

```bash
neurolink mcp add secure-server "npx secure-mcp" --env '{"API_KEY": "secret", "DEBUG": "true"}'
```

### **Working Directory**

Set server working directory:

```bash
neurolink mcp add project-server "python local-server.py" --cwd "/path/to/project"
```

---

##  **Troubleshooting**

### **Common Issues**

#### **Server Not Available**

```
✖ server: ❌ Not available
```

**Solutions:**

1. Check server installation: `npm list -g @modelcontextprotocol/server-*`
2. Verify command path: `which npx`
3. Test command manually: `npx @modelcontextprotocol/server-filesystem /`
4. Check environment variables
5. Verify network connectivity (for SSE servers)

#### **Connection Timeout**

```
❌ Connection failed: Timeout connecting to MCP server
```

**Solutions:**

1. Increase timeout (servers may need time to start)
2. Check server logs for errors
3. Verify server supports MCP protocol version 2024-11-05
4. Test with simpler server first (filesystem)

#### **Authentication Errors**

```
❌ Connection failed: Authentication required
```

**Solutions:**

1. Set required environment variables
2. Check API key/token validity
3. Verify permissions for required resources
4. Review server documentation for auth requirements

#### **Tool Execution Errors**

```
❌ Tool execution failed: Invalid parameters
```

**Solutions:**

1. Check tool parameter schema: `neurolink mcp test `
2. Validate JSON parameter format
3. Review tool documentation
4. Test with minimal parameters first

### **Debug Mode**

Enable verbose logging for troubleshooting:

```bash
export NEUROLINK_DEBUG=true
neurolink mcp test filesystem
```

---

##  **Integration with AI Providers**

### **Using MCP Tools with AI Generation**

```bash
# Generate text that uses MCP tool results
neurolink generate "Analyze the README.md file and suggest improvements" --tools filesystem

# Stream responses that incorporate MCP data
neurolink stream "Create a GitHub issue based on the project status" --tools github
```

### **Multi-Tool Workflows**

```bash
# Combine multiple MCP servers in workflows
neurolink workflow "
1. Read project files (filesystem)
2. Analyze codebase (ai)
3. Create GitHub issue (github)
4. Update database (postgres)
"
```

---

##  **Resources**

### **Official MCP Resources**

- [MCP Specification](https://modelcontextprotocol.io/specification)
- [MCP Server Index](https://github.com/modelcontextprotocol/servers)
- [MCP Documentation](https://modelcontextprotocol.io/docs)

### **NeuroLink MCP Resources**

- [MCP Testing Guide](/docs/mcp/testing)
- [CLI Command Reference](/docs/cli/commands.md#mcp)
- [API Integration](/docs/sdk/api-reference#mcp-integration)

### **Community Servers**

- [Awesome MCP Servers](https://github.com/modelcontextprotocol/awesome-mcp-servers)
- [Custom Server Development](https://modelcontextprotocol.io/docs/building-servers)

---

##  **What's Next?**

### **Coming Soon**

- ✅ **Tool Execution** - Direct tool invocation from CLI
- ✅ **Workflow Orchestration** - Multi-step tool workflows
- ✅ **AI Integration** - Tools accessible during AI generation
- ✅ **Performance Optimization** - Parallel tool execution
- ✅ **Advanced Security** - Fine-grained permissions

### **Get Involved**

- Report issues on [GitHub](https://github.com/juspay/neurolink/issues)
- Join the [MCP community](https://modelcontextprotocol.io/community)
- Contribute server integrations
- Share usage examples

---

**Ready to extend NeuroLink with unlimited external capabilities! **

---

## NeuroLink MCP Latency Optimization Implementation Guide

<!-- Source: mcp/optimization.md -->

# NeuroLink MCP Latency Optimization Implementation Guide

##  Executive Summary

### Current Performance Crisis

- **CLI Performance**: 26.4s total (24.8s MCP + 1.6s startup) - Unacceptable for production
- **SDK Performance**: 46.4s total (46.4s MCP + 0s startup) - Completely unusable
- **User Impact**: Every tool-enabled request waits 26-46 seconds before processing
- **Business Impact**: Feature cannot ship with current performance

### Target Performance Goals

- **CLI Target**: \ {
  const config = JSON.parse(fs.readFileSync('.mcp-config.json'));

  // Create promises for all servers
  const serverPromises = Object.entries(config.mcpServers).map(
    ([serverId, serverConfig]) => this.addServer(serverId, serverConfig)
  );

  // Start all servers concurrently
  const results = await Promise.allSettled(serverPromises);

  // Process results with proper error handling
  return this.processParallelResults(results);
}
```

Modify existing method to support parallel option:

```typescript
async loadMCPConfiguration(options: { parallel?: boolean } = {}): Promise {
  if (options.parallel) {
    return this.loadMCPConfigurationParallel();
  }
  return this.loadMCPConfigurationSequential(); // Renamed existing method
}
```

**File: `src/lib/neurolink.ts`**

Update MCP initialization to use parallel loading:

```typescript
private async initializeMCP(options?: { parallel?: boolean }): Promise {
  if (this.mcpInitialized) return;

  // Register built-in tools (fast)
  await toolRegistry.registerServer("neurolink-direct", directToolsServer);

  // Load external servers with optional parallel execution
  const configResult = await this.externalServerManager.loadMCPConfiguration({
    parallel: options?.parallel ?? true // Default to parallel
  });

  this.mcpInitialized = true;
}
```

#### Expected Results

- **CLI**: 24.8s → 12s (50% reduction)
- **SDK**: 46.4s → 23s (50% reduction)

### Phase 2: Smart Tool Detection Implementation

#### Files to Create

- `src/lib/utils/toolAnalyzer.ts` - New tool prediction logic

#### Files to Modify

- `src/lib/neurolink.ts` - Add selective initialization
- `src/lib/mcp/externalServerManager.ts` - Add selective server loading

#### Concept Implementation

Create a tool analyzer that predicts required tools from prompt keywords:

```typescript
"What time is it?" → analyzePrompt() → ['getCurrentTime'] → Load time server only
"Calculate math" → analyzePrompt() → ['calculateMath'] → Load math tools only
"Complex task" → analyzePrompt() → ['basic set'] → Load essential tools only
```

#### Detailed Code Changes

**File: `src/lib/utils/toolAnalyzer.ts` (NEW)**

Create smart tool detection:

```typescript
export class ToolAnalyzer {
  private static readonly TOOL_KEYWORDS = {
    getCurrentTime: ["time", "date", "when", "now", "current"],
    calculateMath: [
      "calculate",
      "math",
      "compute",
      "+",
      "-",
      "*",
      "/",
      "equation",
    ],
    listDirectory: ["list", "files", "directory", "folder", "ls", "dir"],
    readFile: ["read", "file", "content", "show", "cat"],
    writeFile: ["write", "save", "create", "file"],
    websearchGrounding: ["search", "web", "google", "find", "lookup"],
  };

  static analyzePromptForRequiredTools(prompt: string): string[] {
    const requiredTools: string[] = [];
    const lowerPrompt = prompt.toLowerCase();

    for (const [toolName, keywords] of Object.entries(this.TOOL_KEYWORDS)) {
      if (keywords.some((keyword) => lowerPrompt.includes(keyword))) {
        requiredTools.push(toolName);
      }
    }

    // Fallback to basic tools if no specific tools detected
    return requiredTools.length > 0
      ? requiredTools
      : ["getCurrentTime", "calculateMath"];
  }

  static getServerForTool(toolName: string): string | null {
    const toolServerMap: Record = {
      getCurrentTime: "builtin", // No external server needed
      calculateMath: "builtin", // No external server needed
      listDirectory: "filesystem", // Requires filesystem server
      readFile: "filesystem", // Requires filesystem server
      writeFile: "filesystem", // Requires filesystem server
      websearchGrounding: "websearch", // Requires websearch server
    };
    return toolServerMap[toolName] || null;
  }
}
```

**File: `src/lib/neurolink.ts`**

Add selective MCP initialization:

```typescript

private async initializeMCP(options?: {
  requiredTools?: string[],
  parallel?: boolean,
  prompt?: string
}): Promise {
  if (this.mcpInitialized) return;

  // Determine which tools are needed
  let requiredTools = options?.requiredTools;
  if (!requiredTools && options?.prompt) {
    requiredTools = ToolAnalyzer.analyzePromptForRequiredTools(options.prompt);
  }

  // Load only required servers
  if (requiredTools) {
    await this.initializeSelectiveTools(requiredTools, options?.parallel);
  } else {
    await this.initializeAllTools(options?.parallel); // Fallback
  }

  this.mcpInitialized = true;
}

private async initializeSelectiveTools(requiredTools: string[], parallel = false): Promise {
  // Always load built-in tools (fast)
  await toolRegistry.registerServer("neurolink-direct", directToolsServer);

  // Determine which external servers are needed
  const requiredServers = new Set();
  requiredTools.forEach(tool => {
    const server = ToolAnalyzer.getServerForTool(tool);
    if (server && server !== 'builtin') {
      requiredServers.add(server);
    }
  });

  // Load only the required external servers
  if (requiredServers.size > 0) {
    await this.externalServerManager.loadSelectiveServers(
      Array.from(requiredServers),
      { parallel }
    );
  }
}
```

**File: `src/lib/mcp/externalServerManager.ts`**

Add selective server loading:

```typescript
async loadSelectiveServers(serverIds: string[], options: { parallel?: boolean } = {}): Promise {
  const config = JSON.parse(fs.readFileSync('.mcp-config.json'));

  // Filter configuration to only include required servers
  const filteredServers = Object.fromEntries(
    Object.entries(config.mcpServers).filter(([id]) => serverIds.includes(id))
  );

  if (options.parallel) {
    // Load filtered servers in parallel
    const serverPromises = Object.entries(filteredServers).map(
      ([serverId, serverConfig]) => this.addServer(serverId, serverConfig)
    );
    const results = await Promise.allSettled(serverPromises);
    return this.processParallelResults(results);
  } else {
    // Load filtered servers sequentially
    const results: ExternalMCPOperationResult[] = [];
    for (const [serverId, serverConfig] of Object.entries(filteredServers)) {
      const result = await this.addServer(serverId, serverConfig);
      results.push(result);
    }
    return this.processSequentialResults(results);
  }
}
```

#### Expected Results

- **CLI**: 12s → 7s (additional 42% reduction)
- **SDK**: 23s → 14s (additional 39% reduction)

### Phase 3: CLI Performance Modes Implementation

#### Files to Modify

- `src/cli/index.ts` - Add CLI performance flags and mode logic

#### Concept Implementation

Provide explicit user control over tool loading through CLI flags:

```bash
pnpm cli generate "prompt" --speed-mode        # Fastest: built-in only
pnpm cli generate "prompt" --tools=time,math   # Selective: specific tools
pnpm cli generate "prompt" --parallel-loading  # Enhanced: parallel loading
```

#### Detailed Code Changes

**File: `src/cli/index.ts`**

Add CLI performance options:

```typescript
yargs.command(
  "generate ",
  "Generate AI content",
  {
    // ... existing options
    "speed-mode": {
      type: "boolean",
      default: false,
      description: "Use only built-in tools for fastest response (1-2s)",
    },
    tools: {
      type: "array",
      description: "Specify which tool categories to enable",
      choices: ["time", "math", "files", "web", "all"],
      default: ["all"],
    },
    "parallel-loading": {
      type: "boolean",
      default: true,
      description: "Load MCP servers in parallel for faster startup",
    },
  },
  async (argv) => {
    const neurolink = new NeuroLink();

    // Determine initialization strategy based on user flags
    let initOptions: any = { parallel: argv.parallelLoading };

    if (argv.speedMode) {
      // Speed mode: only built-in tools, no external servers
      initOptions.requiredTools = ["getCurrentTime", "calculateMath"];
      console.log(" Speed mode enabled: Using built-in tools only");
    } else if (argv.tools && !argv.tools.includes("all")) {
      // Selective mode: user-specified tool categories
      initOptions.requiredTools = mapCliToolsToInternal(argv.tools);
      console.log(
        ` Selective mode: Loading tools for ${argv.tools.join(", ")}`,
      );
    } else {
      // Smart mode: analyze prompt for tool requirements
      initOptions.prompt = argv.prompt;
      console.log(" Smart mode: Analyzing prompt for required tools");
    }

    const startTime = Date.now();
    await neurolink.initializeMCP(initOptions);
    const initTime = Date.now() - startTime;
    console.log(`⚡ MCP initialized in ${initTime}ms`);

    // ... rest of generation logic
  },
);

function mapCliToolsToInternal(cliTools: string[]): string[] {
  const mapping: Record = {
    time: ["getCurrentTime"],
    math: ["calculateMath"],
    files: ["listDirectory", "readFile", "writeFile"],
    web: ["websearchGrounding"],
  };

  return cliTools.flatMap((tool) => mapping[tool] || []);
}
```

#### Expected Results

- **CLI Speed Mode**: 7s → 1-2s (built-in tools only)
- **CLI Selective**: 7s → 3-5s (based on tools needed)

### Phase 4: SDK Background Initialization Implementation

#### Files to Modify

- `src/lib/neurolink.ts` - Add background warmup and smart initialization

#### Concept Implementation

Start MCP initialization in the background during SDK instantiation, before any user requests:

```typescript
// App startup
const neurolink = new NeuroLink({ backgroundWarmup: true }); // Starts MCP loading

// Later user request (MCP already warm)
await neurolink.generate({ input: { text: "prompt" } }); // Fast response
```

#### Detailed Code Changes

**File: `src/lib/neurolink.ts`**

Add background warmup to constructor:

```typescript
constructor(config?: {
  conversationMemory?: Partial;
  backgroundWarmup?: boolean;
  warmupTools?: string[];
}) {
  // ... existing constructor logic

  // Start background MCP warmup if requested
  if (config?.backgroundWarmup) {
    this.startBackgroundWarmup(config.warmupTools);
  }
}

private startBackgroundWarmup(tools?: string[]): void {
  // Start MCP initialization in background (non-blocking)
  setImmediate(async () => {
    try {
      await this.initializeMCP({
        requiredTools: tools || ['getCurrentTime', 'calculateMath'], // Basic tools
        parallel: true
      });
      logger.debug('Background MCP warmup completed successfully');
    } catch (error) {
      logger.warn('Background MCP warmup failed, will initialize on first request:', error);
    }
  });
}
```

Update generate method for smart initialization:

```typescript
private async generateTextInternal(options: TextGenerationOptions): Promise {
  // Smart initialization: only load MCP if not already initialized
  if (!this.mcpInitialized) {
    const requiredTools = ToolAnalyzer.analyzePromptForRequiredTools(options.prompt || '');
    await this.initializeMCP({
      requiredTools,
      parallel: true,
      prompt: options.prompt
    });
  }

  // ... rest of generation logic
}
```

#### Expected Results

- **SDK Background**: 14s → 3-5s (warmup during app start)

##  Implementation File Structure

### New Files to Create

```
src/lib/utils/toolAnalyzer.ts              # Smart tool detection logic
src/lib/mcp/mcpConnectionPool.ts           # Connection reuse (future enhancement)
src/cli/modes/performanceModes.ts          # CLI mode definitions (future enhancement)
```

### Files to Modify

```
src/lib/neurolink.ts                       # Main SDK class - add optimization options
src/lib/mcp/externalServerManager.ts      # MCP server management - add parallel/selective loading
src/cli/index.ts                          # CLI command definitions - add performance flags
```

##  Expected Performance Results

### Phase 1 (Parallel Loading)

- **CLI**: 24.8s → 12s (50% reduction)
- **SDK**: 46.4s → 23s (50% reduction)

### Phase 2 (Smart Tool Detection)

- **CLI**: 12s → 7s (additional 42% reduction)
- **SDK**: 23s → 14s (additional 39% reduction)

### Phase 3 (CLI Performance Modes)

- **CLI Speed Mode**: 7s → 1-2s (built-in tools only)
- **CLI Selective**: 7s → 3-5s (based on tools needed)

### Phase 4 (SDK Background Loading)

- **SDK Background**: 14s → 3-5s (warmup during app start)

### Final Performance Summary

```bash
# Before optimization:
CLI: 26.4s (production-blocking)
SDK: 46.4s (completely unusable)

# After optimization:
CLI Speed Mode: 1-2s    ✅ Production ready
CLI Selective: 3-5s     ✅ Production ready
CLI Smart: 7s           ✅ Acceptable
SDK Background: 3-5s    ✅ Production ready
SDK Optimized: 8-12s    ✅ Acceptable
```

##  Implementation Timeline

### Week 1: Parallel Loading Foundation

1. **Day 1-2**: Implement `loadMCPConfigurationParallel()` in `externalServerManager.ts`
2. **Day 3-4**: Add parallel option to `initializeMCP()` in `neurolink.ts`
3. **Day 5**: Test parallel loading with existing CLI and SDK, measure performance gains

### Week 2: Smart Tool Detection

1. **Day 1-2**: Create `toolAnalyzer.ts` with keyword detection logic
2. **Day 3-4**: Implement `initializeSelectiveTools()` in `neurolink.ts`
3. **Day 5**: Add `loadSelectiveServers()` in `externalServerManager.ts` and test

### Week 3: CLI Performance Modes

1. **Day 1-2**: Add CLI flags and options to `index.ts`
2. **Day 3-4**: Implement mode logic and tool mapping functions
3. **Day 5**: Test all CLI performance modes and document usage

### Week 4: SDK Background Loading

1. **Day 1-2**: Add background warmup to SDK constructor
2. **Day 3-4**: Modify generate method for smart initialization
3. **Day 5**: Performance testing, optimization, and final validation

## ✅ Testing & Validation

### Performance Benchmarks

```bash
# Test CLI performance modes
pnpm cli generate "What time is it?" --speed-mode        # Target: <2s
pnpm cli generate "Calculate 2+2" --tools=math          # Target: <3s
pnpm cli generate "List files" --tools=files            # Target: <5s
pnpm cli generate "Complex task" --parallel-loading     # Target: <8s

# Test SDK improvements
node sdk-latency-test.js                                 # Target: <10s first run
node sdk-background-test.js                              # Target: <5s with warmup
```

### Success Criteria

- **CLI Speed Mode**: \<2s total response time
- **CLI Selective**: \<5s total response time
- **CLI Smart**: \<8s total response time
- **SDK Background**: \<5s after warmup
- **SDK First Run**: \<15s (down from 46s)
- **Backward Compatibility**: All existing functionality works unchanged
- **Error Handling**: Graceful fallback to current behavior on any optimization failure

##  Conclusion

This implementation guide provides a comprehensive, phase-by-phase approach to solving NeuroLink's MCP initialization performance crisis. By implementing parallel loading, smart tool detection, CLI performance modes, and SDK background initialization, we can transform the user experience from production-blocking (26-46 seconds) to production-ready (1-10 seconds).

The approach prioritizes safety through backward compatibility and graceful degradation while delivering dramatic performance improvements that will enable NeuroLink to ship tool-enhanced features in production environments.

---

## MCP Foundation Testing Guide

<!-- Source: mcp/testing.md -->

#  MCP Foundation Testing Guide

> ⚠️ **PLANNED FEATURE**: This documentation describes features that are planned but not yet implemented. The `ContextManager` class referenced in this guide does not currently exist in the codebase. The code examples are illustrative of the intended API design.

**NeuroLink v1.3.0 MCP Foundation** - Comprehensive guide for testing MCP functionality and adding custom MCP servers.

##  **Testing MCP Foundation Programmatically**

### **1. Basic MCP Server Creation**

Create a test file to explore MCP functionality:

```typescript
// test-mcp.ts (TypeScript; run with ts-node or compile first)

  NeuroLinkMCPTool,
  NeuroLinkExecutionContext,
  ToolResult,
} from "@juspay/neurolink";

// Create a custom MCP server
const testServer = createMCPServer({
  id: "my-test-server",
  title: "My Test Server",
  description: "Testing custom MCP tools",
  category: "custom",
  visibility: "private",
});

// Add a simple tool
testServer.registerTool({
  name: "hello-world",
  description: "Simple hello world tool for testing",
  execute: async (
    params: any,
    context: NeuroLinkExecutionContext,
  ): Promise => {
    console.log("Hello World tool executed!");
    console.log("Context:", context.sessionId);

    return {
      success: true,
      data: { message: `Hello, ${params.name || "World"}!` },
      metadata: {
        toolName: "hello-world",
        timestamp: Date.now(),
      },
    };
  },
});

console.log("Test server created:", testServer.id);
console.log("Available tools:", Object.keys(testServer.tools));
```

### **2. Testing with AI Core Server**

```typescript
// test-ai-core.ts

async function testAICoreServer() {
  // Create execution context
  const context = ContextManager.createExecutionContext({
    sessionId: "test-session-123",
    userId: "test-user",
    aiProvider: "openai",
    environmentType: "development",
  });

  console.log(" Testing AI Core Server...");

  // Test text generation tool
  try {
    const result = await aiCoreServer.tools["generate"].execute(
      {
        prompt: "Write a haiku about AI",
        temperature: 0.7,
        maxTokens: 100,
      },
      context,
    );

    console.log("✅ Text Generation Result:", result);
  } catch (error) {
    console.error("❌ Text Generation Error:", error);
  }

  // Test provider selection tool
  try {
    const providerResult = await aiCoreServer.tools["select-provider"].execute(
      {
        preferred: "openai",
        requirements: {
          streaming: true,
          costEfficient: true,
        },
      },
      context,
    );

    console.log("✅ Provider Selection Result:", providerResult);
  } catch (error) {
    console.error("❌ Provider Selection Error:", error);
  }

  // Test provider status tool
  try {
    const statusResult = await aiCoreServer.tools[
      "check-provider-status"
    ].execute(
      {
        includeCapabilities: true,
      },
      context,
    );

    console.log("✅ Provider Status Result:", statusResult);
  } catch (error) {
    console.error("❌ Provider Status Error:", error);
  }
}

testAICoreServer();
```

### **3. Testing Tool Registry and Orchestration**

```typescript
// test-orchestration.ts

async function testOrchestration() {
  // Initialize registry and orchestrator
  const registry = new MCPRegistry();
  const orchestrator = new ToolOrchestrator(registry);

  // Register AI Core Server
  registry.registerServer(aiCoreServer);

  // Create execution context
  const context = ContextManager.createExecutionContext({
    sessionId: "orchestration-test",
    environmentType: "development",
  });

  console.log(" Testing Tool Orchestration...");

  // Execute single tool
  try {
    const result = await orchestrator.executeTool(
      "neurolink-ai-core",
      "generate",
      { prompt: "Explain quantum computing in one sentence", maxTokens: 50 },
      context,
    );

    console.log("✅ Single Tool Execution:", result);
  } catch (error) {
    console.error("❌ Single Tool Error:", error);
  }

  // Execute pipeline (sequential tools)
  try {
    const pipelineResult = await orchestrator.executePipeline(
      [
        {
          serverId: "neurolink-ai-core",
          toolName: "select-provider",
          params: { preferred: "openai" },
        },
        {
          serverId: "neurolink-ai-core",
          toolName: "generate",
          params: { prompt: "Write a technical joke", maxTokens: 100 },
        },
      ],
      context,
    );

    console.log("✅ Pipeline Execution:", pipelineResult);
  } catch (error) {
    console.error("❌ Pipeline Error:", error);
  }

  // Get orchestrator statistics
  const stats = orchestrator.getStatistics();
  console.log(" Orchestrator Statistics:", stats);
}

testOrchestration();
```

---

##  **Adding Custom MCP Servers**

### **1. Creating a Development Tools Server**

```typescript
// servers/dev-tools-server.ts

// Create development tools server
export const devToolsServer = createMCPServer({
  id: "neurolink-dev-tools",
  title: "NeuroLink Development Tools",
  description: "Code generation, testing, and development utilities",
  category: "development",
  version: "1.0.0",
  capabilities: [
    "code-generation",
    "test-creation",
    "documentation",
    "refactoring",
  ],
});

// Code Generation Tool
devToolsServer.registerTool({
  name: "generate-component",
  description: "Generate React/Vue/Svelte components with TypeScript",
  inputSchema: z.object({
    framework: z.enum(["react", "vue", "svelte"]),
    componentName: z.string(),
    props: z
      .array(
        z.object({
          name: z.string(),
          type: z.string(),
          required: z.boolean().default(false),
        }),
      )
      .optional(),
    styling: z
      .enum(["css", "scss", "styled-components", "tailwind"])
      .optional(),
  }),
  execute: async (
    params: any,
    context: NeuroLinkExecutionContext,
  ): Promise => {
    const { framework, componentName, props = [], styling = "css" } = params;

    // Generate component code based on framework
    let componentCode = "";

    if (framework === "react") {
      const propsInterface =
        props.length > 0
          ? `interface ${componentName}Props {\n${props.map((p) => `  ${p.name}${p.required ? "" : "?"}: ${p.type};`).join("\n")}\n}\n\n`
          : "";

      componentCode = `${propsInterface}export function ${componentName}(${props.length > 0 ? `props: ${componentName}Props` : ""}) {
  return (

      ${componentName} Component
      {/* Add your component logic here */}

  );
}`;
    } else if (framework === "svelte") {
      const scriptProps =
        props.length > 0
          ? `\n${props.map((p) => `  export let ${p.name}: ${p.type}${p.required ? "" : " | undefined"};`).join("\n")}\n\n\n`
          : "";

      componentCode = `${scriptProps}
  ${componentName} Component


  .${componentName.toLowerCase()} {
    /* Add your styles here */
  }
`;
    }

    return {
      success: true,
      data: {
        code: componentCode,
        framework,
        componentName,
        propsCount: props.length,
        styling,
      },
      metadata: {
        toolName: "generate-component",
        serverId: "neurolink-dev-tools",
        timestamp: Date.now(),
      },
    };
  },
});

// Test Generation Tool
devToolsServer.registerTool({
  name: "generate-tests",
  description: "Generate unit tests for components or functions",
  inputSchema: z.object({
    testFramework: z.enum(["vitest", "jest", "playwright"]),
    targetFile: z.string(),
    functions: z.array(z.string()),
    coverage: z.enum(["basic", "comprehensive"]).default("basic"),
  }),
  execute: async (
    params: any,
    context: NeuroLinkExecutionContext,
  ): Promise => {
    const { testFramework, targetFile, functions, coverage } = params;

    const testTemplate = `import { describe, it, expect } from '${testFramework}';

${functions
  .map(
    (fn) => `describe('${fn}', () => {
  it('should ${coverage === "comprehensive" ? "handle all edge cases" : "work correctly"}', () => {
    // Test implementation for ${fn}
    expect(${fn}).toBeDefined();
  });
});`,
  )
  .join("\n\n")}`;

    return {
      success: true,
      data: {
        testCode: testTemplate,
        testFramework,
        functionsCount: functions.length,
        coverage,
      },
      metadata: {
        toolName: "generate-tests",
        serverId: "neurolink-dev-tools",
        timestamp: Date.now(),
      },
    };
  },
});

console.log(
  "[DevTools] Development Tools Server created with tools:",
  Object.keys(devToolsServer.tools),
);
```

### **2. Creating a Content Creation Server**

```typescript
// servers/content-server.ts

export const contentServer = createMCPServer({
  id: "neurolink-content",
  title: "NeuroLink Content Creation",
  description: "Blog posts, documentation, and marketing content generation",
  category: "content",
  version: "1.0.0",
});

// Blog Post Generation Tool
contentServer.registerTool({
  name: "generate-blog-post",
  description: "Generate blog posts with SEO optimization",
  inputSchema: z.object({
    topic: z.string(),
    audience: z.enum(["technical", "business", "general"]),
    length: z.enum(["short", "medium", "long"]),
    tone: z.enum(["professional", "casual", "educational"]),
    includeSEO: z.boolean().default(true),
  }),
  execute: async (
    params: any,
    context: NeuroLinkExecutionContext,
  ): Promise => {
    // Use AI Core Server for content generation
    const aiResult = (await context.toolChain?.includes("neurolink-ai-core"))
      ? { content: `Generated blog post about ${params.topic}...` }
      : {
          content: `Mock blog post about ${params.topic} for ${params.audience} audience`,
        };

    const metadata = {
      wordCount:
        params.length === "short"
          ? 500
          : params.length === "medium"
            ? 1000
            : 2000,
      readingTime:
        params.length === "short" ? 2 : params.length === "medium" ? 5 : 8,
      seoOptimized: params.includeSEO,
    };

    return {
      success: true,
      data: {
        content: aiResult.content,
        ...metadata,
        topic: params.topic,
        audience: params.audience,
      },
      metadata: {
        toolName: "generate-blog-post",
        serverId: "neurolink-content",
        timestamp: Date.now(),
      },
    };
  },
});

console.log(
  "[Content] Content Creation Server created with tools:",
  Object.keys(contentServer.tools),
);
```

---

## ️ **Adding MCP Commands to CLI**

To integrate MCP functionality into the CLI, add these commands to `src/cli/index.ts`:

### **1. MCP Server Management Commands**

```typescript
// Add to CLI (src/cli/index.ts)

.command('mcp ', 'Manage MCP servers and tools',
  (yargsMCP) => {
    yargsMCP
      .usage('Usage: $0 mcp  [options]')
      .command('list-servers', 'List all registered MCP servers',
        () => {},
        async (argv) => {
          const registry = new MCPRegistry();
          // Register default servers
          registry.registerServer(aiCoreServer);

          const servers = registry.listServers();
          console.log(chalk.blue(' Registered MCP Servers:'));
          servers.forEach(server => {
            console.log(`  • ${chalk.green(server.id)} - ${server.title}`);
            console.log(`    Category: ${server.category}, Tools: ${server.toolCount}`);
          });
        }
      )
      .command('list-tools [serverId]', 'List tools for all servers or specific server',
        (y) => y.positional('serverId', {
          type: 'string',
          description: 'Optional server ID to filter tools'
        }),
        async (argv) => {
          const registry = new MCPRegistry();
          registry.registerServer(aiCoreServer);

          const tools = argv.serverId
            ? registry.getServerTools(argv.serverId)
            : registry.listAllTools();

          console.log(chalk.blue(' Available MCP Tools:'));
          tools.forEach(tool => {
            console.log(`  • ${chalk.green(tool.name)} (${tool.serverId})`);
            console.log(`    ${tool.description}`);
          });
        }
      )
      .command('execute  ', 'Execute an MCP tool',
        (y) => y
          .positional('serverId', { type: 'string', demandOption: true })
          .positional('toolName', { type: 'string', demandOption: true })
          .option('params', { type: 'string', description: 'JSON parameters for the tool' })
          .option('session', { type: 'string', default: 'cli-session', description: 'Session ID' }),
        async (argv) => {
          const registry = new MCPRegistry();
          const orchestrator = new ToolOrchestrator(registry);

          // Register servers
          registry.registerServer(aiCoreServer);

          const context = ContextManager.createExecutionContext({
            sessionId: argv.session,
            environmentType: 'development',
            aiProvider: 'auto'
          });

          try {
            const params = argv.params ? JSON.parse(argv.params) : {};
            const result = await orchestrator.executeTool(
              argv.serverId,
              argv.toolName,
              params,
              context
            );

            console.log(chalk.green('✅ Tool execution successful:'));
            console.log(JSON.stringify(result, null, 2));
          } catch (error) {
            console.error(chalk.red('❌ Tool execution failed:'), error);
          }
        }
      )
      .demandCommand(1, 'Please specify an MCP subcommand');
  }
)
```

### **2. Quick MCP Testing Commands**

```typescript
// Add convenience commands
.command('mcp-generate ', 'Quick AI text generation via MCP',
  (y) => y
    .positional('prompt', { type: 'string', demandOption: true })
    .option('provider', { type: 'string', description: 'Preferred AI provider' }),
  async (argv) => {
    const registry = new MCPRegistry();
    const orchestrator = new ToolOrchestrator(registry);
    registry.registerServer(aiCoreServer);

    const context = ContextManager.createExecutionContext({
      sessionId: 'mcp-cli-' + Date.now(),
      aiProvider: argv.provider
    });

    try {
      const result = await orchestrator.executeTool(
        'neurolink-ai-core',
        'generate',
        { prompt: argv.prompt, maxTokens: 200 },
        context
      );

      if (result.success) {
        console.log('\n' + result.data.text + '\n');
        console.log(chalk.blue(`Provider: ${result.data.provider}`));
      } else {
        console.error(chalk.red('Generation failed:'), result.error);
      }
    } catch (error) {
      console.error(chalk.red('MCP execution error:'), error);
    }
  }
)
```

---

##  **Running MCP Tests**

### **1. Run Existing Test Suite**

```bash
# Run comprehensive MCP tests
pnpm run test:run

# Run specific MCP tests
npx vitest run test/mcp-comprehensive.test.ts
```

### **2. Test Custom MCP Server**

Create and run a test file:

```bash
# Create test file
cat > test-custom-mcp.ts  ({ success: true, data: 'Hello from MCP!' })
});

console.log('Server created:', myServer.id);
console.log('Tools:', Object.keys(myServer.tools));
EOF

# Install ts-node if not available
npm install -g ts-node typescript
# Or use npx for one-time execution without global install

# Run test
npx ts-node test-custom-mcp.ts
```

### **3. Test MCP via Node.js REPL**

```bash
# Start Node.js REPL with NeuroLink
node -r ts-node/register

# In REPL:
> const { createMCPServer } = require('@juspay/neurolink');
> const server = createMCPServer({ id: 'repl-test', title: 'REPL Test' });
> console.log('Server created:', server.id);
```

---

##  **MCP Development Workflow**

### **1. Development Cycle**

1. **Create MCP Server** - Use `createMCPServer()`
2. **Add Tools** - Register tools with validation
3. **Test Tools** - Use registry and orchestrator
4. **Integrate with CLI** - Add CLI commands
5. **Run Tests** - Validate functionality

### **2. Best Practices**

- **Use TypeScript** for full type safety
- **Validate inputs** with Zod schemas
- **Handle errors** gracefully in tools
- **Log execution** for debugging
- **Test thoroughly** before deployment

### **3. Performance Monitoring**

```typescript
// Monitor tool performance
const stats = orchestrator.getStatistics();
console.log("Tool execution stats:", stats);

// Track context usage
const contextStats = ContextManager.getStatistics();
console.log("Context management stats:", contextStats);
```

---

##  **Next Steps**

1. **✅ Test Current Implementation** - Use programmatic testing examples
2. ** Add CLI Integration** - Implement MCP CLI commands
3. **️ Create Custom Servers** - Build domain-specific tool servers
4. ** Monitor Performance** - Track tool execution and usage
5. ** Iterate and Improve** - Enhance based on real usage

**MCP Foundation is production-ready and waiting for your custom tools!**

---

# Advanced

## Advanced Features

<!-- Source: advanced/index.md -->

# Advanced Features

Explore NeuroLink's enterprise-grade capabilities that set it apart from basic AI integration libraries.

##  What Makes NeuroLink Advanced

NeuroLink goes beyond simple API wrappers to provide a comprehensive AI development platform with:

- **Production-ready architecture** with factory patterns
- **Built-in tool ecosystem** via Model Context Protocol (MCP)
- **Real-time analytics** and performance monitoring
- **Dynamic model management** with cost optimization
- **Enterprise streaming** with multi-modal support

##  Feature Overview

-  **[MCP Integration](/docs/mcp/integration)**

  Model Context Protocol support with 6 built-in tools and 58+ discoverable external servers.

-  **[Analytics & Evaluation](/docs/reference/analytics)**

  Built-in usage tracking, cost monitoring, performance metrics, and AI response quality evaluation.

-  **[Factory Patterns](/docs/advanced/factory-patterns)**

  Unified provider architecture using the Factory Pattern for consistent interfaces and easy extensibility.

-  **[Dynamic Models](/docs/guides/dynamic-models)**

  Self-updating model configurations, automatic cost optimization, and smart model resolution.

-  **[Streaming](/docs/advanced/streaming)**

  Real-time streaming architecture with analytics support and multi-modal readiness.

-  **[Middleware Architecture](/docs/advanced/middleware-architecture)**

  Comprehensive middleware system for request/response processing, logging, and custom transformations.

- ️ **[Built-in Middleware](/docs/advanced/builtin-middleware)**

  Pre-built middleware for analytics, guardrails, and auto-evaluation.

## ️ Middleware System

NeuroLink includes a powerful middleware architecture for extending functionality:

- **[Middleware Architecture](/docs/advanced/middleware-architecture)** - Complete middleware lifecycle and factory patterns
- **[Built-in Middleware](/docs/advanced/builtin-middleware)** - Analytics, Guardrails, Auto-Evaluation middleware reference
- **[Custom Middleware Guide](/docs/workflows/custom-middleware)** - Build your own middleware with examples

##  Architecture Highlights

### Factory Pattern Implementation

```typescript
// All providers inherit from BaseProvider
class OpenAIProvider extends BaseProvider {
  protected getProviderName(): AIProviderName {
    return "openai";
  }

  protected async getAISDKModel(): Promise {
    return openai(this.modelName);
  }
}

// Unified interface across all providers
const provider = createBestAIProvider();
const result = await provider.generate({
  /* options */
});
```

### Built-in Tool System

```typescript
// Tools are always available by default
const result = await neurolink.generate({
  input: { text: "What time is it?" },
  // Built-in tools automatically handle time requests
});

// Disable tools for pure text generation
const pureResult = await neurolink.generate({
  input: { text: "Write a poem" },
  disableTools: true,
});
```

### Real-time Analytics

```typescript
const result = await neurolink.generate({
  input: { text: "Generate a report" },
  enableAnalytics: true,
});

console.log(result.analytics);
// {
//   provider: "google-ai",
//   model: "gemini-2.5-flash",
//   tokens: { input: 10, output: 150, total: 160 },
//   cost: 0.000012,
//   responseTime: 1250,
//   toolsUsed: ["getCurrentTime"]
// }
```

##  Enterprise Capabilities

### Performance Optimization

- **68% faster provider status checks** (16s → 5s via parallel execution)
- **Automatic memory management** for operations >50MB
- **Circuit breakers** and retry logic for resilience
- **Rate limiting** to prevent API quota exhaustion

### Edge Case Handling

- **Input validation** with helpful error messages
- **Timeout warnings** for long-running operations
- **Network resilience** with automatic retries
- **Graceful degradation** when providers fail

### Production Features

- **Comprehensive error handling** with detailed logging
- **Type safety** with full TypeScript support
- **Configurable timeouts** and resource limits
- **Environment-aware configuration** loading

##  Use Case Examples

```typescript
// Automated content pipeline with analytics
const pipeline = new NeuroLink({ enableAnalytics: true });

const articles = await Promise.all(
  topics.map(topic =>
    pipeline.generate({
      input: { text: `Write article about ${topic}` },
      maxTokens: 2000,
      temperature: 0.7,
    })
  )
);

// Analyze costs and performance
const totalCost = articles.reduce((sum, article) =>
  sum + (article.analytics?.cost || 0), 0
);
```

```typescript
// Future-ready streaming with multi-modal support
const stream = await neurolink.stream({
  input: {
    text: "Analyze this data",
    // Future: image, audio, video inputs
  },
  enableAnalytics: true,
  enableEvaluation: true,
});

for await (const chunk of stream.stream) {
  // Real-time processing with tool calls
  if (chunk.toolCall) {
    console.log(`Tool used: ${chunk.toolCall.name}`);
  }
  process.stdout.write(chunk.content);
}
```

```typescript
// Production monitoring and alerting
const result = await neurolink.generate({
  input: { text: prompt },
  enableAnalytics: true,
  context: {
    userId,
    sessionId,
    environment: process.env.NODE_ENV
  },
});

// Custom monitoring integration
if (result.analytics.responseTime > 5000) {
  logger.warn(`Slow AI response: ${result.analytics.responseTime}ms`);
}

if (result.analytics.cost > 0.10) {
  logger.warn(`High cost request: $${result.analytics.cost}`);
}
```

##  Future Roadmap

### Coming Soon

- **Real-time WebSocket Infrastructure** (in development)
- **Enhanced Telemetry** with OpenTelemetry support
- **Enhanced Chat Services** with session management
- **External MCP server activation** (discovery complete)
- **Multi-modal inputs** (image, audio, video)

### In Development

- **Advanced caching** strategies
- **Load balancing** across providers
- **Custom evaluation metrics**
- **Workflow orchestration** tools

##  Deep Dive Resources

Each advanced feature has comprehensive documentation with examples, best practices, and troubleshooting guides:

- **[Factory Pattern Migration Guide](/docs/development/factory-migration)** - Upgrade from older architectures
- **[MCP Testing Guide](/docs/development/testing)** - Test tool integrations
- **[Performance Tuning](/docs/deployment/configuration)** - Optimize for your use case
- **[Production Deployment](/docs/examples/business)** - Enterprise deployment patterns

---

## Analytics & Evaluation

<!-- Source: advanced/analytics.md -->

# Analytics & Evaluation

Advanced analytics and AI response evaluation features for monitoring usage, performance, and quality.

##  Overview

NeuroLink provides comprehensive analytics and evaluation capabilities to help you monitor AI usage, track performance, and assess response quality. These features are essential for production applications and enterprise deployments.

##  Analytics Features

### Usage Analytics

Track detailed metrics about your AI interactions:

```typescript

const neurolink = new NeuroLink({
  analytics: {
    enabled: true,
    endpoint: "https://analytics.yourcompany.com",
    apiKey: process.env.ANALYTICS_API_KEY,
  },
});

// Analytics automatically tracked
const result = await neurolink.generate({
  input: { text: "Generate report" },
  context: {
    userId: "user123",
    sessionId: "sess456",
    department: "engineering",
  },
});
```

### CLI Analytics

Enable analytics in CLI commands:

```bash
# Enable analytics for single command
npx @juspay/neurolink gen "Analyze data" --enable-analytics

# With custom context
npx @juspay/neurolink gen "Business analysis" \
  --enable-analytics \
  --context '{"team":"product","project":"dashboard"}' \
  --debug
```

### Tracked Metrics

- **Usage Statistics**: Request count, frequency, patterns
- **Performance Metrics**: Response time, token usage, costs
- **Provider Statistics**: Success rates, error patterns, latency
- **Cost Analysis**: Per-provider costs, budget tracking
- **User Analytics**: Usage by user, team, or department
- **Quality Metrics**: Response evaluation scores

##  Response Evaluation

### AI-Powered Quality Assessment

```typescript
// Enable evaluation for quality scoring
const result = await neurolink.generate({
  input: { text: "Write production code" },
  enableEvaluation: true,
  evaluationDomain: "Senior Software Engineer",
  evaluationCriteria: ["accuracy", "completeness"],
});

console.log(result.evaluation);
// {
//   overall: 9.2,
//   relevance: 9.5,
//   accuracy: 9.0,
//   completeness: 8.8,
//   reasoning: "Code follows best practices...",
//   alertSeverity: "none"
// }
```

### CLI Evaluation

```bash
# Basic evaluation
npx @juspay/neurolink gen "Write API documentation" --enable-evaluation

# Domain-specific evaluation
npx @juspay/neurolink gen "Design system architecture" \
  --enable-evaluation \
  --evaluation-domain "Solutions Architect"

# Combined analytics and evaluation
npx @juspay/neurolink gen "Create test plan" \
  --enable-analytics \
  --enable-evaluation \
  --evaluation-domain "QA Engineer" \
  --debug
```

### Evaluation Domains

Specialized evaluation contexts:

- **Technical**: `Senior Software Engineer`, `DevOps Specialist`, `Data Scientist`
- **Business**: `Product Manager`, `Business Analyst`, `Marketing Manager`
- **Creative**: `Content Writer`, `UX Designer`, `Creative Director`
- **Academic**: `Research Scientist`, `Technical Writer`, `Educator`

##  Analytics Collection

### Per-Request Analytics

Analytics are collected on a per-request basis and included in each result:

```typescript
// Enable analytics for a single request
const result = await neurolink.generate({
  input: { text: "Generate documentation" },
  enableAnalytics: true,
});

// Access analytics from the result
console.log(result.analytics);
// {
//   totalTokens: 1523,
//   promptTokens: 421,
//   completionTokens: 1102,
//   cost: 0.0045,
//   durationMs: 1456,
//   provider: "openai",
//   model: "gpt-4o"
// }
```

### Middleware-Based Analytics

For application-wide analytics collection, use the analytics middleware:

```typescript

// Analytics are automatically collected by the middleware
const metrics = getAnalyticsMetrics();

// Process or export metrics as needed
console.log(metrics);

// Clear metrics after processing
clearAnalyticsMetrics();
```

##  Configuration

### Environment Variables

```bash
# Evaluation Configuration
NEUROLINK_EVALUATION_PROVIDER="google-ai"
NEUROLINK_EVALUATION_MODEL="gemini-2.5-flash"
NEUROLINK_EVALUATION_THRESHOLD="7"
```

### Per-Request Configuration

Analytics and evaluation are configured on a per-request basis:

```typescript
// Enable analytics and evaluation for specific requests
const result = await neurolink.generate({
  input: { text: "Your prompt" },
  enableAnalytics: true,
  enableEvaluation: true,
  evaluationDomain: "Senior Software Engineer",
  evaluationCriteria: ["accuracy", "completeness"],
});
```

##  Currently Available Methods

The following methods are available today for analytics and monitoring:

| Method                                 | Description                                       |
| -------------------------------------- | ------------------------------------------------- |
| `neurolink.getProviderStatus()`        | Get provider availability status                  |
| `neurolink.getProviderHealthSummary()` | Get health summary for all providers              |
| `neurolink.getToolExecutionMetrics()`  | Get tool execution statistics                     |
| `getAnalyticsMetrics()`                | Standalone middleware function for analytics data |

```typescript

const neurolink = new NeuroLink();

// Get provider health status
const healthSummary = neurolink.getProviderHealthSummary();
console.log(healthSummary);

// Get tool execution metrics
const toolMetrics = neurolink.getToolExecutionMetrics();
console.log(toolMetrics);

// Get analytics from middleware
const metrics = getAnalyticsMetrics();
console.log(metrics);
```

---

##  Use Cases

> **Planned Feature**
>
> The following methods (`getProviderMetrics()`, `getCostAnalysis()`, `getTeamAnalytics()`) are planned for a future release and are **not yet available** in the current SDK version.
> These examples illustrate the planned API design.

### Planned API: Performance Monitoring

```typescript
// PLANNED - Monitor provider performance
const perfMetrics = await neurolink.getProviderMetrics({
  providers: ["openai", "google-ai", "anthropic"],
  timeRange: "last_24_hours",
  metrics: ["response_time", "success_rate", "cost_per_token"],
});

// Identify best performing provider
const bestProvider = perfMetrics.providers.sort(
  (a, b) => a.averageResponseTime - b.averageResponseTime,
)[0];

console.log(`Best provider: ${bestProvider.name}`);
```

### Planned API: Cost Optimization

```typescript
// PLANNED - Track costs and optimize
const costAnalysis = await neurolink.getCostAnalysis({
  timeRange: "current_month",
  groupBy: ["provider", "model", "user_id"],
});

// Find cost-effective providers
const cheapestProvider = costAnalysis.providers.sort(
  (a, b) => a.costPerToken - b.costPerToken,
)[0];
```

### Quality Assurance

```bash
# Batch evaluate responses for quality
cat prompts.txt | while read prompt; do
  npx @juspay/neurolink gen "$prompt" \
    --enable-evaluation \
    --evaluation-domain "Senior Engineer" \
    --json >> evaluations.json
done

# Analyze quality trends
jq '.evaluation.overall' evaluations.json | awk '{sum+=$1} END {print "Average quality:", sum/NR}'
```

##  Enterprise Features

> **Planned Feature**
>
> The enterprise analytics methods below (`getTeamAnalytics()`, custom metrics configuration) are planned for a future release.
> These examples illustrate the planned API design for enterprise deployments.

### Planned API: Team Analytics

```typescript
// PLANNED - Department-level analytics
const teamMetrics = await neurolink.getTeamAnalytics({
  departments: ["engineering", "product", "marketing"],
  metrics: ["usage", "cost", "quality_scores"],
  timeRange: "current_quarter",
});
```

### Planned API: Custom Metrics

```typescript
// PLANNED - Define custom analytics
const result = await neurolink.generate({
  input: { text: "Generate report" },
  analytics: {
    customMetrics: {
      feature: "report_generation",
      complexity: "high",
      businessValue: "critical",
    },
  },
});
```

### Compliance Monitoring

```bash
# Audit trail with evaluation
npx @juspay/neurolink gen "Sensitive analysis" \
  --enable-analytics \
  --enable-evaluation \
  --context '{"compliance":"required","audit":"true"}' \
  --evaluation-domain "Compliance Officer"
```

##  Related Documentation

- [CLI Commands](/docs/cli/commands) - Analytics CLI commands
- [Environment Variables](/docs/getting-started/environment-variables) - Configuration
- [SDK Reference](/docs/sdk/api-reference) - Programmatic analytics
- [Enterprise Setup](/docs/guides/enterprise) - Enterprise features

---

## Built-in Middleware Reference

<!-- Source: advanced/builtin-middleware.md -->

# Built-in Middleware Reference

NeuroLink includes three production-ready middleware components for common enterprise use cases: **Analytics**, **Guardrails**, and **Auto-Evaluation**. These middleware are battle-tested and ready to use in production applications.

## Quick Start

Enable all built-in middleware with a single preset:

```typescript

const factory = new MiddlewareFactory({
  preset: "all", // Enables analytics + guardrails
});
```

Or enable specific middleware:

```typescript
const factory = new MiddlewareFactory({
  enabledMiddleware: ["analytics", "guardrails", "autoEvaluation"],
});
```

----------- | ------ | ---------------------------------- | ------------ |
| `requestId`    | string | Unique identifier for this request | -            |
| `timestamp`    | string | ISO 8601 timestamp                 | -            |
| `responseTime` | number | Total request duration             | milliseconds |
| `usage.input`  | number | Input tokens consumed              | tokens       |
| `usage.output` | number | Output tokens generated            | tokens       |
| `usage.total`  | number | Total tokens used                  | tokens       |

### Output Format

Analytics data is automatically added to the response metadata:

**Generate Response:**

```typescript

const neurolink = new NeuroLink({
  provider: "openai",
  model: "gpt-4",
});

const result = await neurolink.generate({
  prompt: "Explain quantum computing",
});

// Access analytics from response metadata
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;

console.log(analytics);
```

**Analytics Object Structure:**

```json
{
  "requestId": "analytics-1735689600000",
  "responseTime": 1523,
  "timestamp": "2026-01-01T00:00:00.000Z",
  "usage": {
    "input": 12,
    "output": 256,
    "total": 268
  }
}
```

**Stream Response:**

For streaming responses, analytics are available in the `rawResponse`:

```typescript
const result = await neurolink.stream({
  prompt: "Write a story",
});

// Analytics available in rawResponse
const streamAnalytics = result.rawResponse?.neurolink?.analytics;

console.log(streamAnalytics);
```

**Stream Analytics Structure:**

```json
{
  "requestId": "analytics-stream-1735689600000",
  "startTime": 1735689600000,
  "timestamp": "2026-01-01T00:00:00.000Z",
  "streamingMode": true
}
```

### Use Cases

**1. Cost Tracking:**

```typescript
const result = await neurolink.generate({ prompt: "..." });
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;

// Calculate cost (example: $0.03 per 1K input tokens, $0.06 per 1K output tokens)
const inputCost = (analytics.usage.input / 1000) * 0.03;
const outputCost = (analytics.usage.output / 1000) * 0.06;
const totalCost = inputCost + outputCost;

console.log(`Request cost: $${totalCost.toFixed(4)}`);
```

**2. Performance Monitoring:**

```typescript
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;

if (analytics.responseTime > 3000) {
  console.warn(`Slow request detected: ${analytics.responseTime}ms`);
  // Send alert to monitoring system
}
```

**3. Usage Analytics Dashboard:**

```typescript
// Aggregate analytics over multiple requests
const requests = [];

for (const prompt of prompts) {
  const result = await neurolink.generate({ prompt });
  const analytics = result.experimental_providerMetadata?.neurolink?.analytics;
  requests.push(analytics);
}

// Calculate aggregates
const totalTokens = requests.reduce((sum, a) => sum + a.usage.total, 0);
const avgResponseTime =
  requests.reduce((sum, a) => sum + a.responseTime, 0) / requests.length;

console.log(`Total tokens used: ${totalTokens}`);
console.log(`Average response time: ${avgResponseTime}ms`);
```

### Integration with External Systems

**Send to Datadog:**

```typescript

const dogstatsd = new StatsD();

const result = await neurolink.generate({ prompt: "..." });
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;

dogstatsd.histogram("neurolink.response_time", analytics.responseTime);
dogstatsd.increment("neurolink.tokens.total", analytics.usage.total);
dogstatsd.increment("neurolink.requests.success");
```

**Send to Prometheus:**

```typescript

const responseTimeHistogram = new Histogram({
  name: "neurolink_response_time_ms",
  help: "Response time in milliseconds",
  buckets: [100, 500, 1000, 2000, 5000],
});

const tokenCounter = new Counter({
  name: "neurolink_tokens_total",
  help: "Total tokens consumed",
});

const result = await neurolink.generate({ prompt: "..." });
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;

responseTimeHistogram.observe(analytics.responseTime);
tokenCounter.inc(analytics.usage.total);
```

---

## Guardrails Middleware

### Purpose

The **Guardrails Middleware** provides comprehensive content filtering and policy enforcement to block or redact unsafe content, prevent prompt injection attacks, and maintain compliance with content policies.

**Key Capabilities:**

- Bad word filtering (configurable word list)
- AI model-based content safety evaluation
- Precall evaluation (block unsafe prompts before they reach the LLM)
- Stream and generate support
- Configurable filtering actions (block, redact, log)

### Configuration

**Basic Configuration:**

```typescript

const factory = new MiddlewareFactory({
  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: {
        badWords: ["inappropriate", "unsafe", "prohibited"],
      },
    },
  },
});
```

**Advanced Configuration with Model-Based Filtering:**

```typescript

const factory = new MiddlewareFactory({
  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: {
        // Basic word filtering
        badWords: ["spam", "scam", "inappropriate"],

        // AI model-based filtering
        modelFilter: {
          enabled: true,
          filterModel: openai("gpt-3.5-turbo"), // Use a fast model for filtering
        },
      },
    },
  },
});
```

**Precall Evaluation (Block Unsafe Prompts):**

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: {
        badWords: ["prohibited"],

        // Precall evaluation blocks unsafe prompts before they reach the LLM
        precallEvaluation: {
          enabled: true,
          provider: "openai",
          evaluationModel: "gpt-4", // Model name as string
          thresholds: {
            safetyScore: 7, // Safety threshold (1-10 scale, higher = more restrictive)
            appropriatenessScore: 6, // Appropriateness threshold (1-10 scale)
          },
          blockUnsafeRequests: true, // Block requests that fail evaluation
        },
      },
    },
  },
});
```

### Built-in Filters

| Filter Type            | Description                            | Action            | Configuration                     |
| ---------------------- | -------------------------------------- | ----------------- | --------------------------------- |
| **Bad Words**          | Block/redact specific words or phrases | Redact with `***` | `badWords: string[]`              |
| **Model-Based**        | Use AI to evaluate content safety      | Block if unsafe   | `modelFilter.enabled: true`       |
| **Precall Evaluation** | Block unsafe prompts before LLM call   | Block request     | `precallEvaluation.enabled: true` |

### Bad Word Filtering

**How It Works:**

The bad word filter scans both requests and responses for prohibited terms and replaces them with `***`.

**Example:**

```typescript
// Input prompt: "This is an inappropriate message"
// With badWords: ["inappropriate"]

// Output: "This is an *** message"
```

**Configuration:**

```typescript
config: {
  badWords: [
    "inappropriate",
    "prohibited",
    "unsafe",
    // Add your organization's blocked terms
  ];
}
```

### Model-Based Filtering

**How It Works:**

Uses a separate AI model to evaluate whether content is safe. The filter sends the content to the model with a safety evaluation prompt.

**Safety Evaluation Prompt:**

```
Is the following text safe? Respond with only "safe" or "unsafe".

Text: "[content to evaluate]"
```

**Example:**

```typescript
// If model responds "unsafe", content is redacted
// Output: ""
```

**Configuration:**

```typescript

config: {
  modelFilter: {
    enabled: true,
    filterModel: openai("gpt-3.5-turbo") // Fast, cost-effective model
  }
}
```

### Precall Evaluation

**How It Works:**

Evaluates the safety of the input prompt **before** it reaches the main LLM. If the prompt is deemed unsafe, the request is blocked entirely, saving costs and preventing unsafe content generation.

**Evaluation Process:**

1. User submits a prompt
2. Guardrails middleware intercepts in `transformParams`
3. Safety evaluation model scores the prompt (0-1 scale)
4. If score = threshold, request proceeds to main LLM

**Blocked Response:**

```json
{
  "text": "",
  "usage": {
    "promptTokens": 0,
    "completionTokens": 0
  }
}
```

**Configuration:**

```typescript
config: {
  precallEvaluation: {
    enabled: true,
    provider: "openai",
    evaluationModel: "gpt-4", // Model for safety evaluation (string)
    thresholds: {
      safetyScore: 7, // Safety threshold (1-10 scale, default 7)
      appropriatenessScore: 6, // Appropriateness threshold (1-10 scale, default 6)
    },
    blockUnsafeRequests: true, // Block requests that fail evaluation
    actions: {
      onUnsafe: "block",
      onInappropriate: "sanitize",
      onSuspicious: "warn",
    },
  }
}
```

### Streaming Support

Guardrails work seamlessly with streaming responses:

```typescript
const result = await neurolink.stream({
  prompt: "Generate a story",
});

// Each chunk is filtered in real-time
for await (const chunk of result.textStream) {
  console.log(chunk); // Filtered content
}
```

**Stream Filtering:**

- Bad words are replaced with `***` in each text delta
- Model-based filtering is not applied to streams (too slow)
- Precall evaluation works for streams

### Use Cases

**1. Content Moderation for User-Generated Prompts:**

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: {
        badWords: ["spam", "abuse", "harassment"],
        precallEvaluation: {
          enabled: true,
          provider: "openai",
          evaluationModel: "gpt-4",
          thresholds: {
            safetyScore: 9, // Strict filtering (1-10 scale)
            appropriatenessScore: 8,
          },
          blockUnsafeRequests: true,
        },
      },
    },
  },
});
```

**2. Compliance with Content Policies:**

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: {
        badWords: organizationBlocklist, // Your org's blocked terms
        modelFilter: {
          enabled: true,
          filterModel: openai("gpt-3.5-turbo"),
        },
      },
      conditions: {
        providers: ["openai", "anthropic"], // Only for external providers
      },
    },
  },
});
```

**3. Protecting Against Prompt Injection:**

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: {
        precallEvaluation: {
          enabled: true,
          provider: "openai",
          evaluationModel: "gpt-4",
          thresholds: {
            safetyScore: 8, // High safety threshold (1-10 scale)
            appropriatenessScore: 7,
          },
          blockUnsafeRequests: true,
          actions: {
            onUnsafe: "block",
            onInappropriate: "block",
            onSuspicious: "block",
          },
        },
      },
    },
  },
});
```

---

## Auto-Evaluation Middleware

### Purpose

The **Auto-Evaluation Middleware** automatically evaluates AI response quality using configurable criteria. It can trigger retries for low-quality responses and provide quality metrics for monitoring.

**Key Capabilities:**

- Automatic quality evaluation after each response
- Configurable evaluation criteria (relevance, accuracy, coherence, etc.)
- Blocking and non-blocking modes
- Integration with custom evaluation providers
- Quality score thresholds

### Configuration

**Basic Configuration:**

```typescript

const factory = new MiddlewareFactory({
  middlewareConfig: {
    autoEvaluation: {
      enabled: true,
      config: {
        threshold: 7, // Minimum quality score (0-10)
        blocking: true, // Wait for evaluation before returning
      },
    },
  },
});
```

**Advanced Configuration:**

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    autoEvaluation: {
      enabled: true,
      config: {
        threshold: 8,
        blocking: false, // Non-blocking: evaluation happens in background

        // Custom evaluation provider
        provider: "openai",
        evaluationModel: "gpt-4",

        // Custom prompt generator for evaluation
        promptGenerator: (options, result) => {
          return `Evaluate the following AI response on a scale of 0-10 for:
- Relevance to the prompt
- Factual accuracy
- Coherence and clarity
- Helpfulness

Prompt: ${options.prompt}
Response: ${result.content}

Score (0-10):`;
        },

        // Callback when evaluation completes
        onEvaluationComplete: async (evaluationResult) => {
          console.log("Evaluation complete:", evaluationResult);

          if (evaluationResult.score  {
    // Handle evaluation asynchronously
    await logEvaluation(evaluationResult);
  }
}

// Response returned immediately
const result = await neurolink.generate({ prompt: "..." });
// Evaluation runs in background
```

### Evaluation Output

**Evaluation Result Structure:**

```typescript
type EvaluationResult = {
  // Overall quality score (0-10)
  score: number;

  // Detailed scores per criterion
  criteria: {
    relevance: number;
    accuracy: number;
    coherence: number;
    helpfulness: number;
    safety: number;
  };

  // Whether the response passed the threshold
  passed: boolean;

  // Optional feedback from evaluator
  feedback?: string;

  // Timestamp of evaluation
  timestamp: string;
};
```

**Example Output:**

```json
{
  "score": 8.5,
  "criteria": {
    "relevance": 9,
    "accuracy": 8,
    "coherence": 9,
    "helpfulness": 8,
    "safety": 10
  },
  "passed": true,
  "feedback": "High-quality response with accurate information and clear structure.",
  "timestamp": "2026-01-01T00:00:00.000Z"
}
```

### Streaming Support

**Important:** Auto-evaluation for streaming responses always runs in **non-blocking mode**, even if `blocking: true` is configured. This is because the stream needs to be returned to the user immediately.

```typescript
config: {
  blocking: true; // Ignored for streams
}

const result = await neurolink.stream({ prompt: "..." });

// Stream returns immediately
for await (const chunk of result.textStream) {
  console.log(chunk);
}

// Evaluation happens in background after stream completes
```

### Use Cases

**1. Quality Assurance for Customer-Facing AI:**

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    autoEvaluation: {
      enabled: true,
      config: {
        threshold: 8, // High quality requirement
        blocking: true, // Wait for evaluation
        onEvaluationComplete: async (evaluation) => {
          if (!evaluation.passed) {
            // Log low-quality response for review
            await logQualityIssue({
              score: evaluation.score,
              feedback: evaluation.feedback,
            });
          }
        },
      },
    },
  },
});
```

**2. Automatic Response Improvement:**

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    autoEvaluation: {
      enabled: true,
      config: {
        threshold: 7,
        blocking: true,
        onEvaluationComplete: async (evaluation) => {
          if (!evaluation.passed) {
            // Trigger retry with modified prompt
            console.log("Quality below threshold, retrying...");
            // Implementation would retry the request
          }
        },
      },
    },
  },
});
```

**3. Quality Metrics Dashboard:**

```typescript
const evaluationResults = [];

const factory = new MiddlewareFactory({
  middlewareConfig: {
    autoEvaluation: {
      enabled: true,
      config: {
        threshold: 7,
        blocking: false, // Background evaluation
        onEvaluationComplete: async (evaluation) => {
          evaluationResults.push(evaluation);

          // Calculate rolling average quality
          const avgScore =
            evaluationResults.slice(-100).reduce((sum, e) => sum + e.score, 0) /
            100;

          console.log(`Average quality (last 100): ${avgScore}`);
        },
      },
    },
  },
});
```

### Environment Variables

Configure auto-evaluation via environment variables:

```bash
# Set default threshold
NEUROLINK_EVALUATION_THRESHOLD=7

# Use in configuration
```

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    autoEvaluation: {
      enabled: true,
      config: {
        threshold: Number(process.env.NEUROLINK_EVALUATION_THRESHOLD) || 7,
      },
    },
  },
});
```

---

## Combining Middleware

### Recommended Execution Order

Middleware executes in **priority order** (higher priority runs first). Here's the recommended order for combining built-in middleware:

```
Priority 100: Analytics (always run first)
Priority 90:  Guardrails (security checks)
Priority 90:  Auto-Evaluation (quality checks)
```

**Why This Order?**

1. **Analytics first**: Capture metrics for all requests, even blocked ones
2. **Guardrails second**: Block unsafe content before it's evaluated
3. **Auto-Evaluation last**: Evaluate quality of safe responses

### Example: Production Configuration

```typescript

const factory = new MiddlewareFactory({
  preset: "all", // Enables analytics + guardrails

  // Customize individual middleware
  middlewareConfig: {
    analytics: {
      enabled: true,
      // Always enabled for production monitoring
    },

    guardrails: {
      enabled: true,
      config: {
        badWords: ["spam", "abuse", "harassment"],
        precallEvaluation: {
          enabled: true,
          provider: "openai",
          evaluationModel: "gpt-4",
          thresholds: {
            safetyScore: 8, // High safety threshold (1-10 scale)
            appropriatenessScore: 7,
          },
          blockUnsafeRequests: true,
        },
      },
    },

    autoEvaluation: {
      enabled: true,
      config: {
        threshold: 7,
        blocking: false, // Non-blocking for performance
        onEvaluationComplete: async (evaluation) => {
          // Log to monitoring system
          await sendMetric("ai.quality.score", evaluation.score);
        },
      },
    },
  },
});
```

### Example: Development Configuration

```typescript
const factory = new MiddlewareFactory({
  middlewareConfig: {
    analytics: {
      enabled: true,
      // Track usage in development
    },

    guardrails: {
      enabled: false,
      // Disable in development for easier testing
    },

    autoEvaluation: {
      enabled: false,
      // Disable in development for faster iteration
    },
  },
});
```

### Example: Security-First Configuration

```typescript
const factory = new MiddlewareFactory({
  preset: "security", // Guardrails only

  middlewareConfig: {
    guardrails: {
      enabled: true,
      config: {
        badWords: organizationBlocklist,
        precallEvaluation: {
          enabled: true,
          provider: "openai",
          evaluationModel: "gpt-4",
          thresholds: {
            safetyScore: 9, // Very strict (1-10 scale)
            appropriatenessScore: 9,
          },
          blockUnsafeRequests: true,
        },
      },
    },

    analytics: {
      enabled: true,
      // Track security metrics
    },
  },
});
```

---

## Performance Considerations

### Analytics

- **Overhead**: Minimal (\ context.options.requireEvaluation === true;
   }
   ```

2. **Use Fast Models for Filtering**: Use GPT-3.5 instead of GPT-4 for guardrails

   ```typescript
   filterModel: openai("gpt-3.5-turbo"); // Fast and cost-effective
   ```

3. **Batch Evaluations**: For non-blocking auto-evaluation, batch multiple evaluations

   ```typescript
   onEvaluationComplete: async (evaluation) => {
     evaluationQueue.push(evaluation);
     if (evaluationQueue.length >= 10) {
       await sendBatchToMonitoring(evaluationQueue);
       evaluationQueue = [];
     }
   };
   ```

---

## Troubleshooting

### Analytics Not Appearing in Response

**Problem**: Analytics data is missing from response metadata.

**Solution**:

1. Verify analytics is enabled:

   ```typescript
   factory.registry.has("analytics"); // Should return true
   ```

2. Check preset configuration:

   ```typescript
   const factory = new MiddlewareFactory({
     preset: "default", // Analytics enabled by default
   });
   ```

3. Access analytics correctly:

   ```typescript
   const analytics = result.experimental_providerMetadata?.neurolink?.analytics;
   ```

### Guardrails Blocking Valid Content

**Problem**: Guardrails are blocking safe content.

**Solution**:

1. Adjust precall evaluation threshold:

   ```typescript
   threshold: 0.7; // Lower threshold for less strict filtering
   ```

2. Review bad words list:

   ```typescript
   badWords: []; // Temporarily disable to test
   ```

3. Check model-based filter:

   ```typescript
   modelFilter: {
     enabled: false; // Temporarily disable to test
   }
   ```

### Auto-Evaluation Slowing Down Responses

**Problem**: Responses are slower due to evaluation.

**Solution**:

1. Use non-blocking mode:

   ```typescript
   blocking: false;
   ```

2. Reduce evaluation frequency:

   ```typescript
   conditions: {
     custom: (context) => Math.random() < 0.1; // Evaluate 10% of requests
   }
   ```

3. Use faster evaluation model:

   ```typescript
   evaluationModel: "gpt-3.5-turbo",
   ```

---

## See Also

- [Middleware Architecture](/docs/advanced/middleware-architecture) - Deep dive into middleware system design
- [Custom Middleware Guide](/docs/workflows/custom-middleware) - Create your own middleware
- [HITL Integration](/docs/features/enterprise-hitl) - Combine middleware with human approval workflows
- [Provider Comparison](/docs/reference/provider-comparison) - Which providers work best with middleware

---

## CLI Guide

<!-- Source: advanced/cli-guide.md -->

# CLI Guide

Complete guide to NeuroLink's command line interface.

## Installation

```bash
npm install -g @juspay/neurolink
```

## Basic Commands

### Text Generation

```bash
neurolink generate "Write a haiku about coding"
```

### Provider Management

```bash
neurolink provider list
neurolink provider status
```

## MCP Commands

### Server Management

```bash
neurolink mcp install
neurolink mcp list
neurolink mcp status
```

### Tool Integration

```bash
neurolink mcp tools
neurolink mcp test
```

## Server Management

Advanced server configuration and management commands.

### Server Commands

```bash
# Start server with specific framework
neurolink serve --framework express --port 8080

# Background mode for production
neurolink server start --port 3000 --framework hono
neurolink server status --format json
neurolink server stop

# Route inspection
neurolink server routes --group agent
neurolink server routes --method POST --format json

# Configuration
neurolink server config --get defaultPort
neurolink server config --set rateLimit.maxRequests=200

# OpenAPI generation
neurolink server openapi -o openapi.yaml --format yaml
```

For detailed server adapter documentation, see the [Server Adapters Guide](/docs/guides/server-adapters).

For detailed command reference, see [Commands Reference](/docs/cli/commands).

---

## Enterprise Features

<!-- Source: advanced/enterprise.md -->

# Enterprise Features

NeuroLink provides comprehensive enterprise-grade features for production deployments.

## Security

### Authentication

- API key management
- OAuth integration
- Role-based access control

### Data Protection

- Encryption at rest and in transit
- Data residency compliance
- Audit logging

## Scalability

### High Availability

- Load balancing
- Failover mechanisms
- Multi-region deployment

### Performance

- Caching strategies
- Connection pooling
- Request optimization

## Monitoring

### Analytics

- Usage metrics
- Performance monitoring
- Error tracking

### Alerting

- Real-time notifications
- Threshold-based alerts
- Custom alert rules

## Compliance

### Standards

- SOC 2 compliance
- GDPR compliance
- Industry-specific requirements

### Governance

- Data governance policies
- Access controls
- Audit trails

## Enterprise Support

### Service Level Agreements

- 99.9% uptime guarantee
- Response time commitments
- Escalation procedures

### Professional Services

- Implementation consulting
- Custom development
- Training and support

For setup instructions, see [Enterprise Proxy Setup](/docs/deployment/enterprise-proxy).

---

## NeuroLink Factory Patterns - Complete Implementation Guide

<!-- Source: advanced/factory-patterns-complete-guide.md -->

# NeuroLink Factory Patterns - Complete Implementation Guide

## Overview

The NeuroLink Factory Infrastructure provides a comprehensive, domain-agnostic framework for enhancing AI interactions with configurable patterns. This Phase 1 implementation delivers a complete factory system that works seamlessly with any domain (healthcare, finance, analytics, etc.) while maintaining 100% backward compatibility.

## Quick Start

### Basic Domain Enhancement

```typescript

// Enhance any GenerateOptions with domain configuration
const enhancedOptions = DomainConfigurationFactory.enhanceWithDomain(
  {
    input: { text: "Analyze patient vital signs trends" },
    provider: "google-ai",
  },
  {
    domainType: "healthcare",
    validationEnabled: true,
  },
);

// Use with NeuroLink SDK
const sdk = new NeuroLink();
const result = await sdk.generate(enhancedOptions);
```

### Advanced Enhancement Utilities

```typescript

// Domain configuration enhancement
const domainResult = OptionsEnhancer.enhanceWithDomain(baseOptions, {
  domainType: "analytics",
  validationEnabled: true,
});

// Streaming optimization enhancement
const streamingResult = OptionsEnhancer.enhanceForStreaming(baseOptions, {
  chunkSize: 512,
  enableProgress: true,
});

// Legacy business context migration
const migrationResult = OptionsEnhancer.migrateFromLegacy(
  baseOptions,
  legacyBusinessContext,
  "ecommerce",
);
```

## Core Components

### 1. Domain Configuration Factory

The `DomainConfigurationFactory` provides domain-specific configuration management:

```typescript
// Register custom domain template
DomainConfigurationFactory.registerDomainTemplate({
  templateName: "financial-analysis",
  baseConfig: {
    domainName: "financial-analysis",
    domainDescription: "Expert in financial analysis and reporting",
    keyTerms: ["revenue", "profit", "ROI", "market analysis"],
    failurePatterns: ["insufficient financial data", "incomplete analysis"],
    successPatterns: ["financial insights show", "analysis indicates"],
    evaluationCriteria: {
      relevanceThreshold: 9,
      accuracyThreshold: 10,
      completenessThreshold: 9,
      alertSeverityMapping: {
        low: { relevanceRange: [9, 10], accuracyRange: [10, 10] },
        medium: { relevanceRange: [7, 8], accuracyRange: [8, 9] },
        high: { relevanceRange: [0, 6], accuracyRange: [0, 7] },
      },
    },
    toolPreferences: ["financial_calculator", "market_data_analyzer"],
  },
  requiredFields: ["domainName", "domainDescription", "keyTerms"],
  optionalFields: ["evaluationCriteria", "toolPreferences"],
});

// Use the registered template
const financialConfig = DomainConfigurationFactory.createDomainConfig({
  domainType: "financial-analysis",
  validationEnabled: true,
});
```

### 2. Options Enhancement Utilities

The `OptionsEnhancer` provides intelligent enhancement of `GenerateOptions`:

```typescript
// Enhanced workflow example
const baseOptions = {
  input: { text: "Analyze healthcare compliance requirements" },
  provider: "anthropic",
  model: "claude-3",
};

// Step 1: Apply domain enhancement
const domainEnhanced = OptionsEnhancer.enhanceWithDomain(baseOptions, {
  domainType: "healthcare",
  validationEnabled: true,
});

// Step 2: Apply streaming optimization
const fullyEnhanced = OptionsEnhancer.enhanceForStreaming(
  domainEnhanced.options,
  {
    chunkSize: 256,
    enableProgress: true,
  },
);

// Result includes comprehensive metadata
console.log(fullyEnhanced.metadata);
// {
//   enhancementApplied: true,
//   enhancementType: "streaming-optimization",
//   processingTime: 5,
//   configurationUsed: { chunkSize: 256, enableProgress: true },
//   warnings: [],
//   recommendations: ["Monitor streaming performance..."]
// }
```

### 3. Context Conversion Utilities

The `ContextConverter` provides migration from legacy business contexts:

```typescript

// Convert legacy business context
const legacyContext = {
  sessionId: "business-session-123",
  userId: "user-456",
  juspayToken: "token-789",
  shopUrl: "https://shop.example.com",
  shopId: "shop-123",
  merchantId: "merchant-456",
  customBusinessData: "legacy-value",
};

const executionContext = ContextConverter.convertBusinessContext(
  legacyContext,
  "ecommerce",
  {
    preserveLegacyFields: true,
    validateDomainData: true,
    includeMetadata: true,
  },
);

// Create clean domain context
const domainContext = ContextConverter.createDomainContext(
  "analytics",
  {
    analyticsEngine: "advanced",
    dataSources: ["database", "api"],
    processingMode: "realtime",
  },
  {
    sessionId: "analytics-session",
    userId: "analyst-user",
  },
);
```

## Integration Examples

### CLI Integration

Factory patterns work seamlessly with the NeuroLink CLI:

```bash
# Basic usage (unchanged)
neurolink generate "Analyze data trends" --provider google-ai

# Enhanced with analytics
neurolink generate "Healthcare analysis" --enable-analytics --evaluation-domain healthcare

# Context integration
neurolink generate "Custom analysis" --context '{"domain":"finance","userId":"analyst123"}'

# Streaming with domain awareness
neurolink stream "Real-time analytics" --enable-evaluation --evaluation-domain analytics
```

### SDK Integration

```typescript

  NeuroLink,
  DomainConfigurationFactory,
  OptionsEnhancer,
} from "@juspay/neurolink";

const sdk = new NeuroLink();

// Method 1: Direct domain enhancement
const result1 = await sdk.generate(
  DomainConfigurationFactory.enhanceWithDomain(
    { input: { text: "Medical diagnosis analysis" } },
    { domainType: "healthcare", validationEnabled: true },
  ),
);

// Method 2: Using OptionsEnhancer workflow
const enhanced = OptionsEnhancer.enhanceWithDomain(
  {
    input: { text: "Financial market trends" },
    enableAnalytics: true,
    enableEvaluation: true,
  },
  { domainType: "analytics", validationEnabled: true },
);

const result2 = await sdk.generate(enhanced.options);

// Method 3: Streaming with factory patterns
const streamResult = await sdk.stream(
  OptionsEnhancer.enhanceForStreaming(
    DomainConfigurationFactory.enhanceWithDomain(
      { input: { text: "Live data processing" } },
      { domainType: "analytics" },
    ),
    { chunkSize: 512, enableProgress: true },
  ).options,
);
```

### Evaluation and Analytics Integration

```typescript
// Enhanced evaluation with domain awareness
const evaluationContext = {
  userQuery: "What are the symptoms of hypertension?",
  aiResponse: "Hypertension symptoms include headaches and dizziness...",
  primaryDomain: "healthcare",
  context: {
    domainType: "healthcare",
    domainConfig: healthcareDomainConfig,
  },
  assistantRole: "healthcare assistant",
};

const evaluation = await generateUnifiedEvaluation(evaluationContext);

// Enhanced analytics with factory metadata
const analytics = createAnalytics(
  "google-ai",
  "gemini-2.5-flash",
  result,
  responseTime,
  {
    domainType: "healthcare",
    enhancementType: "domain-configuration",
    factoryMetadata: {
      enhancementApplied: true,
      processingTime: 5,
    },
  },
);
```

## Domain Configuration Reference

### Pre-registered Domains

#### Healthcare Domain

```typescript
{
  domainName: "healthcare",
  domainDescription: "Healthcare and medical information expert",
  keyTerms: ["healthcare", "medical", "patient", "treatment", "diagnosis", "clinical"],
  failurePatterns: [
    "medical information unavailable",
    "cannot provide medical advice",
    "insufficient patient data"
  ],
  successPatterns: [
    "clinical analysis shows",
    "medical data indicates",
    "patient outcomes demonstrate"
  ],
  evaluationCriteria: {
    relevanceThreshold: 9,
    accuracyThreshold: 10,
    completenessThreshold: 9
  },
  toolPreferences: ["medical_analyzer", "patient_data_processor"]
}
```

#### Analytics Domain

```typescript
{
  domainName: "analytics",
  domainDescription: "Data analytics and business intelligence expert",
  keyTerms: ["analytics", "metrics", "data", "trends", "insights", "performance"],
  failurePatterns: [
    "no data available",
    "insufficient metrics",
    "data incomplete"
  ],
  successPatterns: [
    "analysis shows",
    "data indicates",
    "metrics reveal",
    "trend analysis"
  ],
  evaluationCriteria: {
    relevanceThreshold: 8,
    accuracyThreshold: 9,
    completenessThreshold: 8
  },
  toolPreferences: ["data_analyzer", "metrics_calculator"]
}
```

### Custom Domain Creation

```typescript
// Define custom domain template
const customDomain: DomainTemplate = {
  templateName: "legal-analysis",
  baseConfig: {
    domainName: "legal-analysis",
    domainDescription: "Legal document analysis and compliance expert",
    keyTerms: ["legal", "compliance", "regulation", "contract", "law"],
    failurePatterns: [
      "insufficient legal context",
      "cannot provide legal advice",
    ],
    successPatterns: ["legal analysis indicates", "compliance review shows"],
    evaluationCriteria: {
      relevanceThreshold: 10,
      accuracyThreshold: 10,
      completenessThreshold: 9,
      alertSeverityMapping: {
        low: { relevanceRange: [9, 10], accuracyRange: [10, 10] },
        medium: { relevanceRange: [7, 8], accuracyRange: [8, 9] },
        high: { relevanceRange: [0, 6], accuracyRange: [0, 7] },
      },
    },
    toolPreferences: ["legal_analyzer", "compliance_checker"],
    customRules: {
      disclaimerRequired: true,
      confidentialityLevel: "high",
    },
  },
  requiredFields: ["domainName", "domainDescription", "keyTerms"],
  optionalFields: ["evaluationCriteria", "toolPreferences", "customRules"],
  validationRules: [
    {
      field: "domainName",
      validator: (value) => typeof value === "string" && value.length > 0,
      errorMessage: "Domain name is required",
    },
  ],
};

// Register and use
DomainConfigurationFactory.registerDomainTemplate(customDomain);
const legalOptions = DomainConfigurationFactory.enhanceWithDomain(baseOptions, {
  domainType: "legal-analysis",
  validationEnabled: true,
});
```

## Advanced Usage Patterns

### Batch Enhancement

```typescript

const enhancements = [
  {
    enhancementType: "domain-configuration" as const,
    domainOptions: { domainType: "healthcare" as const },
  },
  {
    enhancementType: "streaming-optimization" as const,
    streamingOptions: { enabled: true, chunkSize: 256 },
  },
];

const result = batchEnhance(baseOptions, enhancements);
```

### Legacy Migration Workflow

```typescript
// Complete legacy migration example
const legacyBusinessContext = {
  sessionId: "legacy-session-123",
  userId: "business-user-456",
  juspayToken: "legacy-token",
  shopUrl: "https://legacy-shop.com",
  customBusinessField: "legacy-value",
};

// Step 1: Migrate legacy context
const migrationResult = OptionsEnhancer.migrateFromLegacy(
  {
    input: { text: "Analyze business performance" },
    enableAnalytics: true,
    enableEvaluation: true,
  },
  legacyBusinessContext,
  "ecommerce",
);

// Step 2: Optional streaming enhancement
const finalOptions = OptionsEnhancer.enhanceForStreaming(
  migrationResult.options,
  { chunkSize: 512 },
);

// Step 3: Execute with full enhancement metadata
const result = await sdk.generate(finalOptions.options);
```

### Performance Optimization

```typescript
// Monitor enhancement performance
const startTime = Date.now();

const enhanced = OptionsEnhancer.enhanceWithDomain(baseOptions, {
  domainType: "analytics",
  validationEnabled: true,
});

console.log(`Enhancement time: ${enhanced.metadata.processingTime}ms`);

// Track enhancement statistics
const stats = OptionsEnhancer.getStatistics();
console.log(`Total enhancements: ${stats.enhancementCount}`);

// Reset statistics for new session
OptionsEnhancer.resetStatistics();
```

## Error Handling and Validation

### Graceful Degradation

```typescript
try {
  const enhanced = DomainConfigurationFactory.enhanceWithDomain(options, {
    domainType: "custom-domain",
    validationEnabled: true,
  });
} catch (error) {
  // Factory patterns never break core functionality
  console.log("Enhancement failed, using original options");
  const result = await sdk.generate(options);
}
```

### Validation and Warnings

```typescript
const result = OptionsEnhancer.enhance(options, enhancementOptions);

// Check for warnings
if (result.metadata.warnings.length > 0) {
  console.log("Warnings:", result.metadata.warnings);
}

// Check recommendations
if (result.metadata.recommendations.length > 0) {
  console.log("Recommendations:", result.metadata.recommendations);
}
```

## Testing and Quality Assurance

### Test Coverage

The factory infrastructure includes comprehensive test suites:

- **Domain Configuration Tests**: 13 test suites, 50+ tests
- **Integration Tests**: 11 test suites covering all interfaces
- **Streaming Tests**: 11 additional test suites with factory integration
- **CLI Integration Tests**: 14 test suites validating zero breaking changes
- **Evaluation Integration**: 6 test suites with domain-aware evaluation
- **Analytics Integration**: 6 test suites with factory metadata tracking

### Performance Benchmarks

- **Enhancement Processing**: \,
    domainType: string,
  ): EnhancementResult;
  static validateEnhancement(
    options: GenerateOptions,
    enhancementOptions: EnhancementOptions,
  ): ValidationResult;
  static getStatistics(): EnhancementStatistics;
  static resetStatistics(): void;
}
```

### ContextConverter

```typescript
class ContextConverter {
  static convertBusinessContext(
    legacyContext: Record,
    domainType: string,
    options?: ContextConversionOptions,
  ): ExecutionContext;
  static createDomainContext(
    domainType: string,
    domainData: Record,
    sessionInfo?: SessionInfo,
  ): ExecutionContext;
}
```

## Conclusion

The NeuroLink Factory Infrastructure provides a comprehensive, production-ready framework for domain-agnostic AI enhancement. With zero breaking changes, extensive test coverage, and flexible enhancement patterns, it enables powerful domain-specific AI interactions while maintaining the simplicity and reliability of the existing NeuroLink SDK.

The factory patterns scale from simple domain configuration to complex multi-enhancement workflows, making them suitable for any application from basic chatbots to enterprise AI systems requiring sophisticated domain expertise and analytics tracking.

---

## Factory Pattern Migration Guide

<!-- Source: advanced/factory-patterns.md -->

# Factory Pattern Migration Guide

## Overview

NeuroLink has been refactored to use a unified factory pattern architecture where all providers inherit from a common `BaseProvider` class. This provides consistent tool support and behavior across all AI providers.

## What Changed

### 1. Unified BaseProvider Architecture

All providers now inherit from `BaseProvider`, which provides:

- Built-in tool support (6 core tools)
- Consistent `generate()` and `stream()` methods
- Analytics and evaluation capabilities
- Standardized error handling

### 2. Automatic Tool Support

Every provider automatically includes these tools:

- `getCurrentTime` - Get current date and time
- `readFile` - Read file contents
- `listDirectory` - List directory contents
- `calculateMath` - Perform calculations
- `writeFile` - Write to files
- `searchFiles` - Search for files by pattern

### 3. Simplified Provider Implementation

Providers no longer need to implement their own tool handling - they inherit it from BaseProvider. This means:

- No more `executeGenerate` methods in individual providers
- Consistent tool behavior across all providers
- Less code duplication

## Migration Steps

### For Users

**Good news! There are no breaking changes.** Your existing code will continue to work exactly as before.

#### Tool Usage (No Changes Required)

```typescript
// This works exactly as before
const provider = createBestAIProvider("openai");
const result = await provider.generate({
  input: { text: "What time is it?" },
});
// Tools are used automatically
```

#### Disabling Tools (New Option)

```typescript
// New: You can now disable tools if needed
const result = await provider.generate({
  input: { text: "What time is it?" },
  disableTools: true, // New option
});
// Will use training data instead of real-time tools
```

### For Provider Developers

If you've created custom providers, you'll need to update them to use the new pattern:

#### Before (Old Pattern)

```typescript
export class CustomProvider implements AIProvider {
  async executeGenerate(
    options: TextGenerationOptions,
  ): Promise {
    // Custom implementation with manual tool handling
    const tools = await this.getTools();
    // ... complex tool execution logic
  }
}
```

#### After (New Pattern)

```typescript
export class CustomProvider extends BaseProvider {
  // No executeGenerate needed - BaseProvider handles it

  protected getAISDKModel(): LanguageModelV1 {
    // Return your AI SDK model instance
    return this.model;
  }

  protected getProviderName(): AIProviderName {
    return "custom";
  }

  protected getDefaultModel(): string {
    return "custom-model";
  }
}
```

## Provider Tool Support Status

After the refactoring, here's the current status of tool support:

| Provider     | Status            | Notes                                                |
| ------------ | ----------------- | ---------------------------------------------------- |
| OpenAI       | ✅ Full Support   | All tools working correctly                          |
| Google AI    | ✅ Full Support   | Excellent tool execution                             |
| Anthropic    | ✅ Full Support   | Working after max_tokens fix                         |
| Azure OpenAI | ✅ Full Support   | Same as OpenAI                                       |
| Mistral      | ✅ Full Support   | Good tool support                                    |
| HuggingFace  | ⚠️ Partial        | Model sees tools but may describe instead of execute |
| Vertex AI    | ⚠️ Partial        | Tools available but may not execute                  |
| Ollama       | ❌ Limited        | Requires specific models (e.g., gemma3n)             |
| Bedrock      | ✅ Full Support\* | Requires valid AWS credentials                       |

## Benefits of the New Architecture

1. **Consistency**: All providers behave the same way with tools
2. **Maintainability**: Less code duplication, easier to update
3. **Reliability**: Centralized tool handling reduces bugs
4. **Extensibility**: Easy to add new tools for all providers at once
5. **Testing**: Simplified testing with consistent behavior

## Common Issues and Solutions

### Issue: Provider Not Using Tools

**Solution**: Check if your model supports function calling. Some models (especially older or smaller ones) may not support tools.

```typescript
// For providers with limited tool support
const result = await provider.generate({
  input: { text: "What time is it?" },
  disableTools: true, // Explicitly disable tools
});
```

### Issue: HuggingFace Describing Tools Instead of Using Them

**Solution**: This is a model limitation. Use models that support function calling:

- `mistralai/Mixtral-8x7B-Instruct-v0.1`
- `mistralai/Mistral-7B-Instruct-v0.2`

### Issue: Ollama Returns Empty Content

**Solution**: Use models that support tool calling:

```bash
export OLLAMA_MODEL="gemma3n:latest"
# or
export OLLAMA_MODEL="aliafshar/gemma3-it-qat-tools:latest"
```

### Issue: Vertex AI Not Using Tools

**Solution**: This may require schema formatting adjustments. The Vertex provider needs to format tools according to Google's Gemini API schema.

## Future Improvements

1. **Dynamic Tool Loading**: Ability to add custom tools at runtime
2. **Provider-Specific Tool Formatting**: Automatic adaptation of tool schemas for each provider
3. **Tool Usage Analytics**: Detailed metrics on which tools are used most
4. **Tool Caching**: Cache tool results for better performance

## Support

If you encounter any issues with the migration:

1. Check the [provider status documentation](/docs/advanced/updated-provider-test-results)
2. Review the [provider configuration guide](/docs/getting-started/provider-setup)
3. Open an issue on GitHub with details about your use case

---

Remember: **No breaking changes!** Your existing code continues to work. The factory pattern refactoring improves the internal architecture while maintaining full backward compatibility.

---

## Memory Integration with Mem0

<!-- Source: advanced/memory-integration.md -->

# Memory Integration with Mem0

Enhance your AI applications with persistent, context-aware memory using NeuroLink's integrated Mem0 support. This feature enables your AI to remember user preferences, context, and conversation history across sessions while maintaining perfect user isolation.

##  Overview

NeuroLink's Mem0 integration provides:

- ** Cross-Session Memory**: AI remembers context across different conversations and sessions
- ** User Isolation**: Complete separation of memory contexts between different users
- ** Semantic Search**: Vector-based memory retrieval using advanced embeddings
- ** Multiple Vector Stores**: Support for Qdrant, Pinecone, Weaviate, and Chroma
- **⚡ Streaming Integration**: Memory-enhanced real-time streaming responses
- ** Background Processing**: Non-blocking memory operations that don't slow down responses
- **⚙️ Native Mem0 Config**: Direct support for Mem0's native configuration format

## ️ Architecture

```mermaid
graph LR
    A[NeuroLink SDK] --> B[Mem0 Memory Layer]
    B --> C[Vector Store]
    B --> D[Embeddings Provider]
    B --> E[LLM Provider]
    C --> F[Qdrant/Pinecone/Weaviate/Chroma]
    D --> G[OpenAI/Google/HuggingFace]
    E --> H[Google/OpenAI/Anthropic]

    A --> I[Generate/Stream]
    I --> J[Memory Search]
    J --> K[Context Enhancement]
    K --> L[AI Response]
    L --> M[Background Memory Storage]
```

The memory system operates in three phases:

1. **Memory Retrieval**: Relevant memories are fetched before generating responses
2. **Context Enhancement**: Retrieved memories are seamlessly injected into prompts
3. **Memory Storage**: New conversation turns are stored asynchronously in the background

##  Quick Start

### Basic Configuration

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    mem0Enabled: true,
    mem0Config: {
      // Mem0 native configuration format
      disableHistory: true,
      version: "v1.1",

      // Embeddings configuration
      embedder: {
        provider: "openai",
        config: {
          apiKey: process.env.OPENAI_API_KEY,
          model: "text-embedding-3-small", // 1536 dimensions
        },
      },

      // Vector store configuration
      vectorStore: {
        provider: "qdrant",
        config: {
          collectionName: "my_app_memories",
          dimension: 1536, // Must match embeddings model
          url: "http://localhost:6333",
          checkCompatibility: false,
        },
      },

      // LLM for memory processing
      llm: {
        provider: "google",
        config: {
          baseURL: "https://generativelanguage.googleapis.com",
          apiKey: process.env.GEMINI_API_KEY,
          model: "gemini-2.0-flash-exp",
        },
      },
    },
  },
  providers: {
    google: {
      apiKey: process.env.GEMINI_API_KEY,
    },
  },
});
```

### First Conversation with Memory

```typescript
// Store user context
const response1 = await neurolink.generate({
  input: {
    text: "Hi! I'm Sarah, a frontend developer at TechCorp. I love React and TypeScript.",
  },
  context: {
    userId: "user_sarah_123", // Required for memory isolation
    sessionId: "onboarding_session", // Optional session identifier
  },
  provider: "google-ai",
  model: "gemini-2.0-flash-exp",
});

console.log(response1.content);
// AI acknowledges and stores Sarah's information

// Later conversation - memory retrieval
const response2 = await neurolink.generate({
  input: {
    text: "What programming languages do I work with? And remind me where I work?",
  },
  context: {
    userId: "user_sarah_123", // Same user ID
    sessionId: "help_session", // Different session
  },
  provider: "google-ai",
});

console.log(response2.content);
// AI recalls: "You work with React and TypeScript at TechCorp"
```

##  Configuration Options

### Vector Store Configurations

#### Qdrant (Recommended)

```typescript
vectorStore: {
  provider: "qdrant",
  config: {
    collectionName: "memories",
    dimension: 1536,
    url: "http://localhost:6333",
    // Optional: API key for Qdrant Cloud
    apiKey: process.env.QDRANT_API_KEY,
    checkCompatibility: false,
  },
}
```

#### Pinecone

```typescript
vectorStore: {
  provider: "pinecone",
  config: {
    index: "memory-index",
    namespace: "user-memories",
    apiKey: process.env.PINECONE_API_KEY,
    environment: "us-west1-gcp-free",
  },
}
```

#### Weaviate

```typescript
vectorStore: {
  provider: "weaviate",
  config: {
    url: "http://localhost:8080",
    className: "Memory",
    // Optional authentication
    apiKey: process.env.WEAVIATE_API_KEY,
  },
}
```

#### Chroma

```typescript
vectorStore: {
  provider: "chroma",
  config: {
    host: "localhost",
    port: 8000,
    collectionName: "memories",
    // Optional authentication
    auth: {
      type: "basic",
      credentials: process.env.CHROMA_AUTH
    }
  },
}
```

### Embedding Provider Options

#### OpenAI Embeddings (1536 dimensions)

```typescript
embedder: {
  provider: "openai",
  config: {
    apiKey: process.env.OPENAI_API_KEY,
    model: "text-embedding-3-small", // or text-embedding-3-large
  },
}
```

#### Google Embeddings (768 dimensions)

```typescript
embedder: {
  provider: "google",
  config: {
    apiKey: process.env.GOOGLE_AI_API_KEY,
    model: "text-embedding-004",
  },
}
```

#### HuggingFace Embeddings

```typescript
embedder: {
  provider: "huggingface",
  config: {
    apiKey: process.env.HUGGINGFACE_API_KEY,
    model: "sentence-transformers/all-MiniLM-L6-v2",
  },
}
```

### LLM Provider Options

The LLM is used by Mem0 for memory processing and organization:

#### Google AI

```typescript
llm: {
  provider: "google",
  config: {
    baseURL: "https://generativelanguage.googleapis.com",
    apiKey: process.env.GEMINI_API_KEY,
    model: "gemini-2.0-flash-exp"
  },
}
```

#### OpenAI

```typescript
llm: {
  provider: "openai",
  config: {
    apiKey: process.env.OPENAI_API_KEY,
    model: "gpt-4-turbo"
  },
}
```

#### Anthropic

```typescript
llm: {
  provider: "anthropic",
  config: {
    apiKey: process.env.ANTHROPIC_API_KEY,
    model: "claude-3-sonnet-20240229"
  },
}
```

##  Advanced Usage Examples

### User Isolation in Multi-Tenant Applications

```typescript
// User Alice's conversation
const aliceResponse = await neurolink.generate({
  input: {
    text: "I prefer dark mode and use VSCode for development.",
  },
  context: {
    userId: "tenant_1_alice_123",
    sessionId: "preferences_session",
  },
});

// User Bob's conversation (different tenant)
const bobResponse = await neurolink.generate({
  input: {
    text: "I love light themes and use WebStorm IDE.",
  },
  context: {
    userId: "tenant_2_bob_456",
    sessionId: "setup_session",
  },
});

// Later: Alice queries her preferences
const aliceQuery = await neurolink.generate({
  input: {
    text: "What IDE do I use and what theme do I prefer?",
  },
  context: {
    userId: "tenant_1_alice_123",
  },
});
// Returns: "You use VSCode with dark mode" (not Bob's preferences)
```

### Streaming with Memory Context

```typescript
// Memory-enhanced streaming
const stream = await neurolink.stream({
  input: {
    text: "Write me a personalized coding tutorial based on my experience level.",
  },
  context: {
    userId: "developer_sarah",
    sessionId: "tutorial_session",
  },
  provider: "anthropic",
  model: "claude-3-sonnet-20240229",
  streaming: {
    enabled: true,
    enableProgress: true,
  },
});

let fullContent = "";
for await (const chunk of stream.stream) {
  if (chunk.content) {
    fullContent += chunk.content;
    process.stdout.write(chunk.content);
  }
}

// The tutorial will be personalized based on Sarah's stored experience level,
// preferred technologies, and previous learning progress
```

##  Memory Lifecycle

### Automatic Memory Storage

Memory storage happens automatically after each conversation:

1. **Conversation Completion**: After AI generates a response
2. **Conversation Turn Creation**: User input + AI response are combined into a conversation turn
3. **Background Storage**: Memory is stored asynchronously using `setImmediate()` (non-blocking)
4. **Vector Embedding**: Text is converted to embeddings by Mem0
5. **Database Storage**: Stored in vector database with user context and metadata
6. **Indexing**: Made available for future searches

### Memory Storage Format

The actual storage format used by NeuroLink:

```typescript
// Conversation turn stored as JSON string
const conversationTurn = [
  { role: "user", content: "User's input text" },
  { role: "system", content: "AI's response" },
];

// Stored with metadata
await mem0.add(JSON.stringify(conversationTurn), {
  userId: options.context?.userId,
  metadata: {
    timestamp: new Date().toISOString(),
    provider: generateResult.provider,
    model: generateResult.model,
    type: "conversation_turn",
    async_mode: true,
  },
});
```

### Memory Retrieval Process

Memory retrieval occurs before each AI generation:

1. **Memory Search**: Query is sent to Mem0 with user ID and limit
2. **Results Processing**: Mem0 returns `{ results: Array }`
3. **Context Formation**: Memories are joined with newlines
4. **Prompt Enhancement**: Context is injected into the user's prompt
5. **Enhanced Generation**: AI generates response with full context

### Enhanced Prompt Format

Retrieved memories are formatted as:

```typescript
private formatMemoryContext(memoryContext: string, currentInput: string): string {
  return `Context from previous conversations:
${memoryContext}

Current user's request: ${currentInput}`;
}
```

## ️ Development & Testing

### Complete Working Example

The repository includes a comprehensive working example at:

```
scripts/examples/real-memory-test.js
```

 **[View Example on GitHub](https://github.com/juspay/neurolink/blob/release/scripts/examples/real-memory-test.js)**

This example demonstrates:

- Complete end-to-end memory integration
- User isolation testing with Alice and Bob
- Cross-session memory continuity
- Streaming with memory context
- Performance monitoring and analytics
- Error handling patterns
- Resource cleanup

### Running the Example

```bash
# Set environment variables
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=AIza...

# Start Qdrant
docker run -p 6333:6333 qdrant/qdrant

# Run the test
node scripts/examples/real-memory-test.js
```

### Testing Memory Integration

```typescript

async function testMemoryFlow() {
  const neurolink = new NeuroLink({
    conversationMemory: {
      enabled: true,
      mem0Enabled: true,
      mem0Config: {
        disableHistory: true,
        version: "v1.1",
        vectorStore: {
          provider: "qdrant",
          config: {
            collectionName: "test_memories",
            dimension: 1536,
            url: "http://localhost:6333",
            checkCompatibility: false,
          },
        },
        embedder: {
          provider: "openai",
          config: {
            apiKey: process.env.OPENAI_API_KEY,
            model: "text-embedding-3-small",
          },
        },
        llm: {
          provider: "google",
          config: {
            apiKey: process.env.GEMINI_API_KEY,
            model: "gemini-2.0-flash-exp",
          },
        },
      },
    },
  });

  // Step 1: Store context
  console.log(" Storing user context...");
  await neurolink.generate({
    input: {
      text: "I'm a Python developer working on machine learning projects with PyTorch.",
    },
    context: {
      userId: "test_user_123",
      sessionId: "context_session",
    },
  });

  // Wait for memory indexing
  console.log("⏳ Waiting for memory indexing...");
  await new Promise((resolve) => setTimeout(resolve, 30000));

  // Step 2: Test recall
  console.log(" Testing memory recall...");
  const response = await neurolink.generate({
    input: {
      text: "What programming language do I use for my ML projects?",
    },
    context: {
      userId: "test_user_123",
      sessionId: "recall_session",
    },
  });

  console.log(" AI Response:", response.content);
  // Should mention Python and PyTorch
}

testMemoryFlow();
```

## ⚠️ Common Issues & Solutions

### Dimension Mismatch Error

```
Error: Vector dimension mismatch: expected 768, got 1536
```

**Solution**: Ensure embedding model dimensions match vector store configuration:

```typescript
// OpenAI embeddings = 1536 dimensions
embedder: {
  config: { model: "text-embedding-3-small" }
},
vectorStore: {
  config: { dimension: 1536 }
}

// Google embeddings = 768 dimensions
embedder: {
  config: { model: "text-embedding-004" }
},
vectorStore: {
  config: { dimension: 768 }
}
```

### API Key Authentication Errors

```
Error: Method doesn't allow unregistered callers
```

**Solution**: Ensure API keys are properly configured for all providers:

```typescript
// Environment variables
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=AIza...
QDRANT_API_KEY=qdr_...

// Configuration
mem0Config: {
  embedder: {
    config: { apiKey: process.env.OPENAI_API_KEY }
  },
  llm: {
    config: { apiKey: process.env.GEMINI_API_KEY }
  },
  vectorStore: {
    config: { apiKey: process.env.QDRANT_API_KEY } // if using Qdrant Cloud
  }
}
```

### Vector Store Connection Issues

```
Error: Connection refused to localhost:6333
```

**Solution**: Ensure vector store is running:

```bash
# Start Qdrant with Docker
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

# Verify health
curl http://localhost:6333/health

# Check collections
curl http://localhost:6333/collections
```

### Memory Storage Failures

**Check logs for background storage errors:**

```typescript
// Memory storage is non-blocking, check logs for warnings
logger.warn("Mem0 memory storage failed:", error);
```

**Common causes:**

- Vector store not accessible
- API key issues
- Dimension mismatches
- Collection not found

##  Best Practices

### 1. User ID Management

```typescript
// Use consistent, unique user identifiers
const generateUserId = (tenantId: string, userId: string) =>
  `${tenantId}_user_${userId}`;

context: {
  userId: generateUserId('company_abc', authenticatedUser.id),
  sessionId: `session_${Date.now()}`
}
```

### 2. Memory Privacy & Security

```typescript
// Separate memory collections per tenant
const getTenantMemoryConfig = (tenantId: string) => ({
  vectorStore: {
    config: {
      collectionName: `memories_${tenantId}`,
      // Ensures complete data isolation
    },
  },
});
```

### 3. Graceful Error Handling

Memory operations are designed to be non-blocking:

```typescript
// Memory failures don't break conversations
// Check logs for memory-related warnings
// Conversations continue without memory if needed
```

### 4. Performance Considerations

```typescript
// Memory retrieval is limited to 5 results by default
const memories = await mem0.search(options.input.text, {
  userId: options.context.userId,
  limit: 5, // Configurable limit
});

// Memory storage happens asynchronously
setImmediate(async () => {
  // Non-blocking background storage
});
```

### 5. Production Deployment

```typescript
// Use environment-specific configurations
const mem0Config = {
  vectorStore: {
    provider: "qdrant",
    config: {
      collectionName: `memories_${process.env.NODE_ENV}`,
      url: process.env.QDRANT_URL || "http://localhost:6333",
      apiKey: process.env.QDRANT_API_KEY, // For Qdrant Cloud
    },
  },
};
```

##  Additional Resources

- **[Mem0 Official Documentation](https://docs.mem0.ai/)** - Complete Mem0 configuration reference
- **[Vector Store Setup Guides](https://docs.mem0.ai/components/vectordb/)** - Detailed setup for each vector store
- **[Embedding Models Comparison](https://docs.mem0.ai/components/embeddings/)** - Choose the right embedding provider
- **[Production Deployment](https://docs.mem0.ai/deployment/production/)** - Scale memory for production use
- **[Working Example](https://github.com/juspay/neurolink/blob/release/scripts/examples/real-memory-test.js)** - Complete implementation reference

##  Next Steps

1. **[Set up a vector store](https://docs.mem0.ai/components/vectordb/)** (Qdrant recommended for development)
2. **Configure embedding provider** based on your performance and cost requirements
3. **Test with the working example** to verify your setup
4. **Implement user isolation** patterns for your application architecture
5. **Monitor memory operations** in production logs

Memory integration transforms your AI applications from stateless interactions to intelligent, context-aware assistants that learn and adapt to each user's unique needs and preferences.

---

## Middleware System Architecture

<!-- Source: advanced/middleware-architecture.md -->

# Middleware System Architecture

## Overview

NeuroLink's middleware system provides a powerful and flexible way to intercept, modify, and enhance AI requests and responses. Middleware enables you to implement cross-cutting concerns like authentication, logging, analytics, content filtering, and auto-evaluation without modifying your core application logic.

**Why Middleware Matters:**

- **Request Interception**: Modify requests before they reach the AI provider
- **Response Processing**: Transform, filter, or validate AI responses
- **Cross-Cutting Concerns**: Implement authentication, logging, rate limiting, and caching in a centralized way
- **Composability**: Chain multiple middleware components together
- **Separation of Concerns**: Keep business logic separate from infrastructure concerns

**Key Benefits:**

- Production-ready middleware for common use cases (analytics, guardrails, auto-evaluation)
- Factory pattern for easy middleware management
- Priority-based execution ordering
- Provider-specific conditional execution
- Built on top of Vercel AI SDK's middleware system

## Architecture Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│                        Request Flow                              │
└─────────────────────────────────────────────────────────────────┘

  Client Request
       │
       ├─────────────────────────────────────────────┐
       │                                             │
       v                                             │
┌──────────────────────┐                            │
│  MiddlewareFactory   │                            │
│  - Registry          │                            │
│  - Configuration     │                            │
└──────────────────────┘                            │
       │                                             │
       v                                             │
┌─────────────────────────────────────────┐         │
│    Pre-Request Middleware Chain          │         │
│  (Ordered by Priority - High to Low)    │         │
├─────────────────────────────────────────┤         │
│  1. transformParams (Guardrails)        │         │
│     - Precall evaluation                │         │
│     - Input validation                  │         │
│     - Request transformation            │         │
└─────────────────────────────────────────┘         │
       │                                             │
       v                                             │
┌─────────────────────────────────────────┐         │
│         Provider Execution               │         │
│    (OpenAI, Anthropic, Vertex, etc.)    │         │
└─────────────────────────────────────────┘         │
       │                                             │
       v                                             │
┌─────────────────────────────────────────┐         │
│   Post-Response Middleware Chain         │         │
│  (Ordered by Priority - High to Low)    │         │
├─────────────────────────────────────────┤         │
│  2. wrapGenerate/wrapStream             │         │
│     - Analytics (Priority: 100)         │         │
│     - Guardrails (Priority: 90)         │         │
│     - Auto-Evaluation (Priority: 90)    │         │
└─────────────────────────────────────────┘         │
       │                                             │
       v                                             │
  Client Response                                    │
                                                     │
┌─────────────────────────────────────────┐         │
│          Error Handling Flow             │         │
│    (If error occurs at any stage)       │◄────────┘
├─────────────────────────────────────────┤
│  - Error Middleware Chain               │
│  - Error logging                        │
│  - Fallback handling                    │
│  - Retry logic (if configured)          │
└─────────────────────────────────────────┘
       │
       v
  Error Response
```

## Request Lifecycle

The middleware system processes requests through four distinct phases:

### Phase 1: Pre-Request (transformParams)

Middleware in this phase runs **before** the AI provider call, allowing you to:

- **Validate input**: Check request parameters for validity
- **Authenticate/Authorize**: Verify user permissions
- **Transform requests**: Modify or enrich request parameters
- **Apply guardrails**: Block requests with unsafe content using precall evaluation
- **Rate limiting**: Enforce request quotas

**Example Use Cases:**

- Precall guardrails evaluation (blocking unsafe prompts)
- Request parameter validation
- Adding authentication context
- Modifying prompts based on user preferences

```typescript
transformParams: async ({ params }) => {
  // Pre-request logic here
  console.log("Request received:", params);

  // Can modify params before they reach the provider
  return {
    ...params,
    temperature: Math.min(params.temperature || 0.7, 1.0),
  };
};
```

### Phase 2: Provider Execution

The actual AI provider call happens between middleware phases:

- Request sent to configured provider (OpenAI, Anthropic, Vertex, etc.)
- Provider processes the request
- Response received from provider

This phase is **not** middleware - it's the core AI operation that middleware wraps around.

### Phase 3: Post-Response (wrapGenerate/wrapStream)

Middleware in this phase runs **after** the AI provider responds, allowing you to:

- **Collect analytics**: Track token usage, response times, costs
- **Filter content**: Apply guardrails to block/redact unsafe responses
- **Evaluate quality**: Auto-evaluate response quality and trigger retries
- **Transform responses**: Modify or enrich the response
- **Cache results**: Store responses for future use

**Example Use Cases:**

- Analytics and metrics collection
- Content filtering and safety checks
- Response quality evaluation
- Response caching
- Logging and auditing

```typescript
wrapGenerate: async ({ doGenerate, params }) => {
  const startTime = Date.now();

  // Execute the provider call
  const result = await doGenerate();

  // Post-response logic here
  const responseTime = Date.now() - startTime;
  console.log(`Response in ${responseTime}ms`);

  return result;
};
```

### Phase 4: Error Handling

If an error occurs at any stage, error handling middleware can:

- **Log errors**: Record error details for debugging
- **Transform errors**: Convert provider errors to user-friendly messages
- **Implement fallbacks**: Retry with different providers
- **Alert monitoring**: Send alerts to monitoring systems

**Example Use Cases:**

- Error logging and tracking
- Provider fallback on failure
- Retry logic with exponential backoff
- User-friendly error messages

## Middleware Chain

### Execution Order

Middleware executes in **priority order**, where higher priority values run first:

```
Priority 100: Analytics (runs first)
Priority 90:  Guardrails
Priority 90:  Auto-Evaluation (runs last among same priority)
```

**Important Notes:**

- `transformParams` runs before `wrapGenerate`/`wrapStream`
- Within the same priority, registration order determines execution
- Middleware can be conditionally enabled based on provider, model, or custom logic

### Chain Configuration

Configure which middleware to enable and their order:

```typescript

const factory = new MiddlewareFactory({
  // Use a preset for common configurations
  preset: "all", // Enables analytics + guardrails

  // Or explicitly enable specific middleware
  enabledMiddleware: ["analytics", "guardrails"],

  // Or configure each middleware individually
  middlewareConfig: {
    analytics: {
      enabled: true,
      config: { collectTokenUsage: true },
    },
    guardrails: {
      enabled: true,
      config: {
        badWords: ["prohibited", "blocked"],
        precallEvaluation: { enabled: true },
      },
    },
  },
});
```

### Available Presets

| Preset     | Middleware Enabled     | Use Case               |
| ---------- | ---------------------- | ---------------------- |
| `default`  | Analytics only         | Basic usage tracking   |
| `all`      | Analytics + Guardrails | Production with safety |
| `security` | Guardrails only        | Security-focused       |
| Custom     | Your choice            | Define your own        |

## Factory Pattern

### MiddlewareFactory Class

The `MiddlewareFactory` is the central component for managing middleware:

```typescript
class MiddlewareFactory {
  // Public registry for middleware management
  public registry: MiddlewareRegistry;

  // Available presets
  public presets: Map;

  // Constructor
  constructor(options?: MiddlewareFactoryOptions);

  // Register custom middleware
  register(
    middleware: NeuroLinkMiddleware,
    options?: RegistrationOptions,
  ): void;

  // Register a preset
  registerPreset(preset: MiddlewarePreset, replace?: boolean): void;

  // Apply middleware to a language model
  applyMiddleware(
    model: LanguageModelV1,
    context: MiddlewareContext,
    options?: MiddlewareFactoryOptions,
  ): LanguageModelV1;

  // Create middleware context
  createContext(
    provider: string,
    model: string,
    options?: Record,
    session?: { sessionId?: string; userId?: string },
  ): MiddlewareContext;

  // Validate middleware configuration
  validateConfig(config: Record): ValidationResult;

  // Get available presets
  getAvailablePresets(): PresetInfo[];

  // Get middleware chain statistics
  getChainStats(
    context: MiddlewareContext,
    config: Record,
  ): MiddlewareChainStats;
}
```

### Creating Middleware Instances

**Basic Usage:**

```typescript

// Create factory with default preset (analytics enabled)
const factory = new MiddlewareFactory();

// Create context
const context = factory.createContext("openai", "gpt-4", { temperature: 0.7 });

// Apply middleware to a model
const wrappedModel = factory.applyMiddleware(baseModel, context);
```

**Advanced Configuration:**

```typescript

// Create factory with custom configuration
const factory = new MiddlewareFactory({
  preset: "all",
  middlewareConfig: {
    analytics: {
      enabled: true,
      config: {
        collectTokenUsage: true,
        collectTiming: true,
      },
    },
    guardrails: {
      enabled: true,
      config: {
        badWords: ["unsafe", "prohibited"],
        precallEvaluation: {
          enabled: true,
          provider: "openai",
          model: "gpt-4",
        },
      },
      conditions: {
        providers: ["openai", "anthropic"], // Only apply to specific providers
      },
    },
  },
});

// Or register custom middleware after instantiation
const customMiddleware = createMyCustomMiddleware();
factory.register(customMiddleware);
```

## Registry System

### Registering Middleware

The `MiddlewareRegistry` manages all registered middleware:

```typescript
class MiddlewareRegistry {
  // Register a middleware
  register(
    middleware: NeuroLinkMiddleware,
    options?: MiddlewareRegistrationOptions,
  ): void;

  // Unregister a middleware
  unregister(middlewareId: string): boolean;

  // Get a registered middleware
  get(middlewareId: string): NeuroLinkMiddleware | undefined;

  // List all registered middleware
  list(): NeuroLinkMiddleware[];

  // Get middleware IDs sorted by priority
  getSortedIds(): string[];

  // Build middleware chain based on configuration
  buildChain(
    context: MiddlewareContext,
    config?: Record,
  ): LanguageModelV1Middleware[];

  // Get execution statistics
  getExecutionStats(middlewareId: string): MiddlewareExecutionResult[];

  // Get aggregated statistics for all middleware
  getAggregatedStats(): Record;

  // Clear execution statistics
  clearStats(middlewareId?: string): void;

  // Check if middleware is registered
  has(middlewareId: string): boolean;

  // Get number of registered middleware
  size(): number;

  // Clear all registered middleware
  clear(): void;
}
```

**Registration Example:**

```typescript

const factory = new MiddlewareFactory();

// Register middleware with options
factory.register(myCustomMiddleware, {
  replace: false, // Error if already exists
  defaultEnabled: true, // Enable by default
  globalConfig: {
    // Global configuration
    logLevel: "debug",
  },
});
```

### Discovering Middleware

**List all registered middleware:**

```typescript
const allMiddleware = factory.registry.list();
console.log(
  "Registered middleware:",
  allMiddleware.map((m) => m.metadata.id),
);
```

**Get specific middleware:**

```typescript
const analytics = factory.registry.get("analytics");
if (analytics) {
  console.log("Analytics middleware found:", analytics.metadata.name);
}
```

**Check if middleware is registered:**

```typescript
if (factory.registry.has("guardrails")) {
  console.log("Guardrails middleware is available");
}
```

### Middleware Metadata

Every middleware must provide metadata:

```typescript
type NeuroLinkMiddlewareMetadata = {
  // Unique identifier
  id: string;

  // Human-readable name
  name: string;

  // Description of what this middleware does
  description?: string;

  // Execution priority (higher runs first)
  priority?: number;

  // Whether this middleware is enabled by default
  defaultEnabled?: boolean;
};
```

**Example:**

```typescript
const metadata: NeuroLinkMiddlewareMetadata = {
  id: "my-custom-middleware",
  name: "My Custom Middleware",
  description: "Logs all requests and responses",
  priority: 50, // Run after analytics (100) but before auto-eval (90)
  defaultEnabled: false, // Require explicit enabling
};
```

## TypeScript Interfaces

### NeuroLinkMiddleware

The core middleware interface that combines AI SDK middleware with metadata:

```typescript

type NeuroLinkMiddleware = LanguageModelV1Middleware & {
  // Metadata about this middleware
  metadata: NeuroLinkMiddlewareMetadata;
};
```

### LanguageModelV1Middleware (from AI SDK)

The underlying middleware interface from Vercel AI SDK:

```typescript
type LanguageModelV1Middleware = {
  // Transform request parameters before provider call
  transformParams?: (options: {
    params: LanguageModelV1CallOptions;
  }) => PromiseLike;

  // Wrap generate() calls
  wrapGenerate?: (options: {
    doGenerate: () => PromiseLike;
    params: LanguageModelV1CallOptions;
  }) => PromiseLike;

  // Wrap stream() calls
  wrapStream?: (options: {
    doStream: () => PromiseLike;
    params: LanguageModelV1CallOptions;
  }) => PromiseLike;
};
```

### MiddlewareContext

Context information passed to middleware:

```typescript
type MiddlewareContext = {
  // Provider name (e.g., "openai", "anthropic")
  provider: string;

  // Model name (e.g., "gpt-4", "claude-3-5-sonnet")
  model: string;

  // Additional options
  options: Record;

  // Session information
  session?: {
    sessionId?: string;
    userId?: string;
  };

  // Request metadata
  metadata: {
    timestamp: number;
    requestId: string;
  };
};
```

### MiddlewareConfig

Configuration for individual middleware:

```typescript
type MiddlewareConfig = {
  // Whether this middleware is enabled
  enabled: boolean;

  // Middleware-specific configuration
  config?: Record;

  // Conditions for when this middleware should run
  conditions?: {
    // Only run for specific providers
    providers?: string[];

    // Only run for specific models
    models?: string[];

    // Only run when options match
    options?: Record;

    // Custom condition function
    custom?: (context: MiddlewareContext) => boolean;
  };
};
```

### MiddlewareFactoryOptions

Options for creating and configuring the factory:

```typescript
type MiddlewareFactoryOptions = {
  // Preset to use (e.g., "default", "all", "security")
  preset?: string;

  // Custom middleware to register
  middleware?: NeuroLinkMiddleware[];

  // Configuration for each middleware
  middlewareConfig?: Record;

  // List of middleware IDs to enable
  enabledMiddleware?: string[];

  // List of middleware IDs to disable
  disabledMiddleware?: string[];
};
```

### MiddlewareChainStats

Statistics about middleware execution:

```typescript
type MiddlewareChainStats = {
  // Total middleware in chain
  totalMiddleware: number;

  // Number of middleware actually applied
  appliedMiddleware: number;

  // Total execution time across all middleware
  totalExecutionTime: number;

  // Per-middleware execution results
  results: Record;
};

type MiddlewareExecutionResult = {
  // Whether middleware was applied
  applied: boolean;

  // Execution time in milliseconds
  executionTime: number;

  // Error if execution failed
  error?: Error;
};
```

## Conditional Execution

Middleware can be configured to run only under specific conditions:

### Provider-Specific Middleware

```typescript
factory.applyMiddleware(model, context, {
  middlewareConfig: {
    guardrails: {
      enabled: true,
      conditions: {
        providers: ["openai", "anthropic"], // Only for these providers
      },
    },
  },
});
```

### Model-Specific Middleware

```typescript
factory.applyMiddleware(model, context, {
  middlewareConfig: {
    analytics: {
      enabled: true,
      conditions: {
        models: ["gpt-4", "claude-3-5-sonnet"], // Only for these models
      },
    },
  },
});
```

### Custom Conditions

```typescript
factory.applyMiddleware(model, context, {
  middlewareConfig: {
    myMiddleware: {
      enabled: true,
      conditions: {
        custom: (context) => {
          // Only run during business hours
          const hour = new Date().getHours();
          return hour >= 9 && hour  {
  try {
    const result = await doGenerate();
    return result;
  } catch (error) {
    // Log error but don't break the chain
    console.error("Middleware error:", error);
    throw error; // Re-throw to maintain error flow
  }
};
```

### 3. Use Conditional Execution

```typescript
// Only apply expensive middleware for production
middlewareConfig: {
  expensiveMiddleware: {
    enabled: true,
    conditions: {
      custom: (context) => process.env.NODE_ENV === "production"
    }
  }
}
```

### 4. Keep Middleware Focused

Each middleware should have a single responsibility:

- ✅ Good: Analytics middleware only collects metrics
- ❌ Bad: Analytics middleware that also filters content and logs errors

### 5. Test Middleware Independently

```typescript

// Test middleware in isolation
const middleware = createAnalyticsMiddleware();
const mockDoGenerate = async () => ({ text: "test" });
const result = await middleware.wrapGenerate({
  doGenerate: mockDoGenerate,
  params: { prompt: "test" },
});
```

## See Also

- [Built-in Middleware Reference](/docs/advanced/builtin-middleware) - Documentation for analytics, guardrails, and auto-evaluation
- [Custom Middleware Guide](/docs/workflows/custom-middleware) - Step-by-step guide to creating custom middleware
- [HITL Integration](/docs/features/enterprise-hitl) - Integrating middleware with Human-in-the-Loop workflows
- [Provider Comparison](/docs/reference/provider-comparison) - Which providers support which middleware features

---

## Streaming Responses

<!-- Source: advanced/streaming.md -->

# Streaming Responses

Real-time streaming capabilities for interactive AI applications with built-in analytics, evaluation, and enterprise-grade features.

##  Overview

NeuroLink supports real-time streaming for immediate response feedback, perfect for chat interfaces, live content generation, and interactive applications. Streaming works with all supported providers and includes advanced enterprise features:

- **Multi-Model Streaming**: Intelligent load balancing across multiple SageMaker endpoints
- **Rate Limiting & Backpressure**: Enterprise-grade request management
- **Advanced Caching**: Semantic caching with partial response matching
- **Real-time Analytics**: Comprehensive monitoring and alerting
- **Security & Validation**: Prompt injection detection, content filtering, and compliance
- **Tool Calling**: Streaming function calls with structured output parsing
- **Error Recovery**: Automatic failover and retry mechanisms
- **Performance Optimization**: Adaptive rate limiting and circuit breakers

##  Basic Streaming

### SDK Streaming

```typescript

const neurolink = new NeuroLink();

// Basic streaming
const stream = await neurolink.stream({
  input: { text: "Tell me a story about AI" },
  provider: "openai",
});

for await (const chunk of stream) {
  console.log(chunk.content); // Incremental content
  process.stdout.write(chunk.content);
}
```

### Basic Streaming (Ready to Use)

```typescript

const neurolink = new NeuroLink();

// Basic streaming (works immediately)
const result = await neurolink.stream({
  input: { text: "Generate a business analysis" },
});

for await (const chunk of result) {
  process.stdout.write(chunk.content || "");
}
```

### Streaming with Built-in Tools

```typescript

const neurolink = new NeuroLink();

// Streaming with tools automatically available
const result = await neurolink.stream({
  input: { text: "What's the current time and weather in New York?" },
});

for await (const chunk of result) {
  if (chunk.type === "text") {
    process.stdout.write(chunk.content);
  } else if (chunk.type === "tool_use") {
    console.log(`\n Using tool: ${chunk.tool}`);
  }
}
```

### Simple Configuration

```typescript

// NeuroLink automatically chooses the best available provider
const neurolink = new NeuroLink();

// Streaming works with any configured provider
const result = await neurolink.stream({
  input: { text: "Analyze quarterly performance" },
  maxTokens: 1000,
  temperature: 0.7,
});

for await (const chunk of result) {
  process.stdout.write(chunk.content || "");
}
```

### CLI Streaming

```bash
# Basic streaming with automatic provider selection
npx @juspay/neurolink stream "Tell me a story"

# With specific provider (optional)
npx @juspay/neurolink stream "Explain quantum computing" --provider google-ai

# With debug output to see provider selection
npx @juspay/neurolink stream "Write a poem" --debug

# JSON format streaming (future-ready)
npx @juspay/neurolink stream "Create structured data" --format json --provider google-ai
# Streaming with tools enabled
npx @juspay/neurolink stream "What's the weather in New York?" --enable-tools

# Specify streaming parameters
npx @juspay/neurolink stream "Analyze market trends" \
  --max-tokens 500 \
  --temperature 0.7 \
  --stream
```

##  Advanced Features

### Error Handling with Retry

```typescript

class StreamingWithRetry {
  private neurolink = new NeuroLink();

  async streamWithRetry(prompt: string, maxRetries = 3) {
    for (let attempt = 1; attempt  setTimeout(resolve, 1000 * attempt));
        } else {
          throw error; // Final attempt failed
        }
      }
    }
  }
}

// Usage
const service = new StreamingWithRetry();
const stream = await service.streamWithRetry("Explain quantum computing");

for await (const chunk of stream) {
  process.stdout.write(chunk.content || "");
}
```

### Timeout Handling

```typescript
async function streamWithTimeout(prompt: string, timeoutMs = 30000) {
  const neurolink = new NeuroLink();

  const timeoutPromise = new Promise((_, reject) => {
    setTimeout(() => reject(new Error("Stream timeout")), timeoutMs);
  });

  const streamPromise = neurolink.stream({
    input: { text: prompt },
  });

  const result = await Promise.race([streamPromise, timeoutPromise]);
  return result;
}

// Usage with 45 second timeout
const stream = await streamWithTimeout("Write a detailed report", 45000);
```

### Collecting Full Response

```typescript
async function collectFullResponse(prompt: string) {
  const neurolink = new NeuroLink();

  const result = await neurolink.stream({
    input: { text: prompt },
  });

  const chunks: string[] = [];
  for await (const chunk of result) {
    if (chunk.content) {
      chunks.push(chunk.content);
    }
  }

  return {
    fullText: chunks.join(""),
    chunkCount: chunks.length,
  };
}

// Usage
const response = await collectFullResponse("Analyze market trends");
console.log(`Response: ${response.fullText}`);
console.log(`Stats: ${response.chunkCount} chunks`);
```

### Automatic Provider Selection

```typescript

// NeuroLink automatically handles provider fallback
async function smartStreaming(prompt: string) {
  const neurolink = new NeuroLink();

  // NeuroLink automatically selects the best available provider
  // and falls back to alternatives if the primary fails
  const result = await neurolink.stream({
    input: { text: prompt },
    maxTokens: 500,
  });

  return result;
}

// Usage - NeuroLink handles all provider logic internally
const stream = await smartStreaming("Explain machine learning");

for await (const chunk of stream) {
  process.stdout.write(chunk.content || "");
}
```

### Manual Provider Selection (Optional)

```typescript

// You can optionally specify a provider preference
async function streamWithPreference(
  prompt: string,
  preferredProvider?: string,
) {
  const neurolink = new NeuroLink();

  const result = await neurolink.stream({
    input: { text: prompt },
    provider: preferredProvider, // Optional - NeuroLink will choose if not specified
    maxTokens: 500,
  });

  return result;
}

// Usage
const stream = await streamWithPreference(
  "Explain quantum computing",
  "google-ai",
);

for await (const chunk of stream) {
  process.stdout.write(chunk.content || "");
}
```

### Simple Rate Limiting

```typescript

class ThrottledStreaming {
  private neurolink = new NeuroLink();
  private lastRequest = 0;
  private minInterval = 1000; // 1 second between requests

  async throttledStream(prompt: string) {
    // Wait if needed
    const now = Date.now();
    const timeSinceLastRequest = now - this.lastRequest;

    if (timeSinceLastRequest  setTimeout(resolve, waitTime));
    }

    this.lastRequest = Date.now();

    return await this.neurolink.stream({
      input: { text: prompt },
    });
  }
}

// Usage
const throttled = new ThrottledStreaming();
const result = await throttled.throttledStream("Explain quantum computing");

for await (const chunk of result) {
  process.stdout.write(chunk.content || "");
}
```

### Batch Processing

```typescript
async function processBatch(prompts: string[], maxConcurrent = 2) {
  const neurolink = new NeuroLink();
  const results = [];

  // Process in chunks
  for (let i = 0; i  {
      // Stagger requests to avoid overwhelming providers
      await new Promise((resolve) => setTimeout(resolve, index * 500));

      return await neurolink.stream({
        input: { text: prompt },
      });
    });

    const batchResults = await Promise.all(batchPromises);
    results.push(...batchResults);

    console.log(`Completed batch ${Math.floor(i / maxConcurrent) + 1}`);

    // Pause between batches
    if (i + maxConcurrent  setTimeout(resolve, 1000));
    }
  }

  return results;
}

// Usage
const prompts = ["Explain AI", "Explain ML", "Explain deep learning"];
const results = await processBatch(prompts, 2);
console.log(`Processed ${results.length} requests`);
```

### Simple Caching Pattern

```typescript

class SimpleCache {
  private neurolink = new NeuroLink();
  private cache = new Map();
  private cacheTTL = 60 * 60 * 1000; // 1 hour

  private isExpired(timestamp: number) {
    return Date.now() - timestamp > this.cacheTTL;
  }

  async streamWithCache(prompt: string) {
    const cached = this.cache.get(prompt);

    // Check cache first
    if (cached && !this.isExpired(cached.timestamp)) {
      console.log("⚡ Cache hit!");

      // Return cached response as simulated stream
      const words = cached.response.split(" ");
      return {
        async *stream() {
          for (const word of words) {
            await new Promise((resolve) => setTimeout(resolve, 50));
            yield { content: word + " " };
          }
        },
        fromCache: true,
      };
    }

    console.log(" Cache miss. Generating...");

    // Generate new response using NeuroLink's automatic provider selection
    const result = await this.neurolink.stream({
      input: { text: prompt },
    });

    // Collect response while streaming for caching
    const chunks: string[] = [];
    const responseStream = {
      async *stream() {
        for await (const chunk of result) {
          if (chunk.content) {
            chunks.push(chunk.content);
            yield chunk;
          }
        }

        // Cache after streaming completes
        const fullResponse = chunks.join("");
        this.cache.set(prompt, {
          response: fullResponse,
          timestamp: Date.now(),
        });
        console.log(` Cached response`);
      },
    };

    return {
      stream: responseStream.stream(),
      fromCache: false,
    };
  }
}

// Usage
const cache = new SimpleCache();

// First request (cache miss)
const result1 = await cache.streamWithCache("Explain renewable energy");
for await (const chunk of result1.stream) {
  process.stdout.write(chunk.content || "");
}
console.log(`\nFrom cache: ${result1.fromCache}`);

// Second identical request (cache hit)
const result2 = await cache.streamWithCache("Explain renewable energy");
for await (const chunk of result2.stream) {
  process.stdout.write(chunk.content || "");
}
console.log(`\nFrom cache: ${result2.fromCache}`);
```

### Custom Configuration

```typescript
const stream = await neurolink.stream({
  input: { text: "Generate comprehensive analysis" },
  provider: "anthropic",
  temperature: 0.7,
  maxTokens: 2000,
  output: {
    format: "json", // Future-ready JSON streaming
    streaming: {
      chunkSize: 256,
      bufferSize: 1024,
      enableProgress: true,
    },
  },
});
```

### JSON Streaming Support

```typescript
// Structured data streaming (future-ready)
const jsonStream = await neurolink.stream({
  input: { text: "Create a detailed project plan with milestones" },
  output: {
    format: "structured",
    streaming: {
      chunkSize: 512,
      enableProgress: true,
    },
  },
  schema: {
    type: "object",
    properties: {
      projectName: { type: "string" },
      phases: {
        type: "array",
        items: {
          type: "object",
          properties: {
            name: { type: "string" },
            duration: { type: "string" },
            tasks: { type: "array", items: { type: "string" } },
          },
        },
      },
    },
  },
});

let structuredData = "";
for await (const chunk of jsonStream.stream) {
  structuredData += chunk.content;

  // Try to parse partial JSON
  try {
    const partial = JSON.parse(structuredData);
    console.log("Partial structure:", partial);
  } catch {
    // Still building complete JSON
  }
}
```

### Error Handling & Recovery

```typescript

const neurolink = new NeuroLink();

// NeuroLink provides built-in error recovery and automatic provider fallback
async function robustStreaming(prompt: string) {
  const maxRetries = 3;
  let attempts = 0;

  while (attempts  setTimeout(resolve, 1000 * attempts));
      } else {
        throw new Error(`Streaming failed after ${maxRetries} attempts`);
      }
    }
  }
}

// Usage with automatic error recovery
try {
  await robustStreaming("Generate a comprehensive analysis");
  console.log("Stream completed successfully");
} catch (error) {
  console.error("All retry attempts failed:", error.message);
}
```

### Security & Validation

````typescript

const neurolink = new NeuroLink();

// NeuroLink includes built-in security and validation features
async function secureStreaming(prompt: string, userId: string) {
  // Basic input validation
  if (!prompt || prompt.length > 50000) {
    throw new Error("Invalid prompt: too long or empty");
  }

  // Basic user authentication check
  if (!userId || userId.length  {
    console.log("\n✅ Streaming completed with analytics:", analytics);
  })
  .catch((error) => {
    console.error("Streaming failed:", error.message);
  });
````

### Real-time Analytics

```typescript
const stream = await neurolink.stream({
  input: { text: "Generate business report" },
  analytics: {
    enabled: true,
    realTime: true,
    context: {
      userId: "user123",
      sessionId: "session456",
      feature: "report_generation",
    },
  },
});

for await (const chunk of stream) {
  console.log(chunk.content);

  // Access real-time analytics
  if (chunk.analytics) {
    console.log(`Tokens so far: ${chunk.analytics.tokensUsed}`);
    console.log(`Cost so far: $${chunk.analytics.estimatedCost}`);
  }
}
```

### CLI Streaming with Analytics

```bash
# Streaming with analytics
npx @juspay/neurolink stream "Create documentation" \
  --enable-analytics \
  --context '{"project":"docs","team":"engineering"}' \
  --debug

# With evaluation
npx @juspay/neurolink stream "Write production code" \
  --enable-analytics \
  --enable-evaluation \
  --evaluation-domain "Senior Developer" \
  --debug
```

##  Use Cases

### Chat Interface

```typescript

function ChatComponent() {
  const [messages, setMessages] = useState([]);
  const [currentResponse, setCurrentResponse] = useState("");
  const neurolink = new NeuroLink();

  const sendMessage = async (userMessage) => {
    setMessages(prev => [...prev, { role: "user", content: userMessage }]);
    setCurrentResponse("");

    const stream = await neurolink.stream({
      input: { text: userMessage },
      provider: "google-ai"
    });

    for await (const chunk of stream) {
      setCurrentResponse(prev => prev + chunk.content);
    }

    setMessages(prev => [...prev, { role: "assistant", content: currentResponse }]);
    setCurrentResponse("");
  };

  return (

      {messages.map((msg, i) => (

          {msg.content}

      ))}
      {currentResponse && (

          {currentResponse}
          |

      )}

  );
}
```

### Live Content Generation

```typescript
// Real-time blog post generation
async function generateBlogPost(topic: string) {
  const stream = await neurolink.stream({
    input: {
      text: `Write a comprehensive blog post about ${topic}. Include introduction, main points, and conclusion.`,
    },
    provider: "anthropic",
    maxTokens: 3000,
    analytics: { enabled: true },
  });

  const sections = [];
  let currentSection = "";

  for await (const chunk of stream) {
    currentSection += chunk.content;

    // Update UI in real-time
    updateBlogPostPreview(currentSection);

    // Detect section breaks
    if (chunk.content.includes("\n\n## ")) {
      sections.push(currentSection);
      currentSection = "";
    }
  }

  return sections;
}
```

### Interactive Documentation

```bash
#!/bin/bash
# Interactive documentation generator

echo " Interactive Documentation Generator"
echo "Enter topic (or 'quit' to exit):"

while read -r topic; do
  if [ "$topic" = "quit" ]; then
    break
  fi

  echo " Generating documentation for: $topic"
  npx @juspay/neurolink stream "
  Create comprehensive technical documentation for: $topic

  Include:
  - Overview and purpose
  - Installation/setup instructions
  - Usage examples
  - Best practices
  - Troubleshooting
  " --provider google-ai --enable-analytics

  echo -e "\n\n Documentation complete! Enter next topic:"
done
```

## ⚙️ Enterprise Configuration

### Provider Configuration

```typescript

// Configure multiple providers for intelligent routing
const neurolink = new NeuroLink();

const providerConfigs = [
  {
    modelId: "llama-3-70b",
    modelName: "LLaMA 3 70B",
    modelType: "llama",
    weight: 3,
    specializations: ["reasoning", "analysis"],
    config: {
      maxTokens: 4000,
      temperature: 0.7,
      specializations: ["reasoning", "analysis"],
    },
    thresholds: {
      maxLatency: 5000,
      maxErrorRate: 2,
      minThroughput: 20,
    },
  },
  {
    modelId: "claude-3-5-sonnet",
    modelName: "Claude 3.5 Sonnet",
    modelType: "anthropic",
    weight: 4,
    specializations: ["function_calling", "structured_output"],
    config: {
      maxTokens: 8000,
      temperature: 0.6,
      specializations: ["function_calling", "structured_output"],
    },
    thresholds: {
      maxLatency: 3000,
      maxErrorRate: 1,
      minThroughput: 25,
    },
  },
  {
    modelId: "gemini-2-flash",
    modelName: "Gemini 2.0 Flash",
    modelType: "google",
    weight: 2,
    specializations: ["speed", "general"],
    config: {
      maxTokens: 2000,
      temperature: 0.8,
      specializations: ["speed", "general"],
    },
    thresholds: {
      maxLatency: 1500,
      maxErrorRate: 3,
      minThroughput: 40,
    },
  },
], {
  loadBalancingStrategy: "performance_based",
  autoFailover: {
    enabled: true,
    maxRetries: 3,
    fallbackStrategies: ["model_switch", "endpoint_switch", "provider_switch"],
    circuitBreakerThreshold: 5,
    circuitBreakerTimeout: 60000,
  },
  healthCheck: {
    enabled: true,
    interval: 30000,
    timeout: 5000,
    retryOnFailure: 2,
  },
  monitoring: {
    enabled: true,
    metricsInterval: 15000,
    detailedMetrics: true,
    performanceThresholds: {
      responseTime: 3000,
      errorRate: 2,
      throughput: 20,
    },
  },
});
```

### Production Environment Variables

For production deployments, configure these environment variables:

```bash
# Basic SageMaker Streaming
export AWS_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export SAGEMAKER_DEFAULT_ENDPOINT="your-endpoint-name"

# Streaming Configuration
export NEUROLINK_STREAMING_ENABLED="true"
export NEUROLINK_STREAMING_TIMEOUT="30000"
export NEUROLINK_STREAMING_MAX_TOKENS="2000"

# Optional: Performance Settings
export NEUROLINK_STREAMING_BUFFER_SIZE="1024"
export NEUROLINK_STREAMING_FLUSH_INTERVAL="100"
export NEUROLINK_STREAMING_ENABLE_ANALYTICS="true"
```

### Production Configuration File

Create `neurolink.config.js` in your project root:

```javascript
// neurolink.config.js
module.exports = {
  providers: {
    sagemaker: {
      region: process.env.AWS_REGION || "us-east-1",
      endpointName: process.env.SAGEMAKER_DEFAULT_ENDPOINT,
      timeout: 30000,
      maxRetries: 3,
      streaming: {
        enabled: true,
        bufferSize: 1024,
        timeout: 60000,
      },
    },
  },
  streaming: {
    defaultProvider: "sagemaker",
    enableAnalytics: true,
    maxTokens: 2000,
    temperature: 0.7,
  },
};
```

### Simple Production Usage

```typescript

// Production service class
class AIStreamingService {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink({
      providers: {
        sagemaker: {
          endpointName: process.env.SAGEMAKER_ENDPOINT,
          region: process.env.AWS_REGION,
        },
      },
    });
  }

  async streamResponse(prompt: string, options: any = {}) {
    const result = await this.neurolink.generate({
      input: { text: prompt },
      provider: "sagemaker",
      stream: true,
      maxTokens: options.maxTokens || 500,
      temperature: options.temperature || 0.7,
    });

    return result.stream;
  }

  async getFullResponse(prompt: string) {
    const stream = await this.streamResponse(prompt);
    const chunks: string[] = [];

    for await (const chunk of stream) {
      if (chunk.content) {
        chunks.push(chunk.content);
      }
    }

    return chunks.join("");
  }
}

// Usage
const aiService = new AIStreamingService();
const response = await aiService.getFullResponse("Explain machine learning");
console.log(response);
```

### Stream Settings

```typescript
type StreamConfig = {
  bufferSize?: number; // Chunk buffer size (default: 1024)
  flushInterval?: number; // Flush interval in ms (default: 100)
  timeout?: number; // Stream timeout in ms (default: 60000)
  enableChunking?: boolean; // Enable smart chunking (default: true)
  retryAttempts?: number; // Retry attempts on failure (default: 3)
  reconnectDelay?: number; // Reconnection delay in ms (default: 1000)
};

const stream = await neurolink.stream({
  input: { text: "Your prompt" },
  stream: {
    bufferSize: 2048,
    flushInterval: 50,
    timeout: 120000,
    enableChunking: true,
    retryAttempts: 5,
  },
});
```

### Provider-Specific Options

```typescript
// OpenAI streaming
const openaiStream = await neurolink.stream({
  input: { text: "Generate content" },
  provider: "openai",
  model: "gpt-4o",
  stream: {
    enableChunking: true,
    bufferSize: 1024,
  },
});

// Google AI streaming
const googleStream = await neurolink.stream({
  input: { text: "Generate content" },
  provider: "google-ai",
  model: "gemini-2.5-pro",
  stream: {
    enableChunking: false, // Google AI handles chunking internally
    flushInterval: 50,
  },
});
```

##  Enterprise Monitoring & Debugging

### Real-time Monitoring Dashboard

```typescript

// Built-in monitoring with NeuroLink
class EnterpriseStreamingMonitor {
  private neurolink: NeuroLink;

  constructor() {
    this.neurolink = new NeuroLink();
  }

  async getComprehensiveDashboard() {
    // NeuroLink provides built-in monitoring and analytics
    const dashboard = {
      timestamp: Date.now(),
      system: {
        health: "healthy", // Built-in health checks
        performance: await this.getPerformanceMetrics(),
        providers: await this.getProviderStatus(),
      },
      streaming: {
        activeStreams: 0, // Built-in tracking
        totalRequests: 0,
        averageLatency: 0,
      },
    };

    return dashboard;
  }

  async generateAlerts() {
    const alerts = [];
    const dashboard = await this.getComprehensiveDashboard();

    // System health alerts
    if (dashboard.system.health.status === "unhealthy") {
      alerts.push({
        severity: "critical",
        type: "system_health",
        message: "System health is critical",
        details: dashboard.system.health,
      });
    }

    // Performance alerts
    if (dashboard.system.performance.averageResponseTime > 5000) {
      alerts.push({
        severity: "warning",
        type: "performance",
        message: "High response times detected",
        details: {
          responseTime: dashboard.system.performance.averageResponseTime,
        },
      });
    }

    // Security alerts
    if (dashboard.security.stats.recentEvents > 10) {
      alerts.push({
        severity: "high",
        type: "security",
        message: "High security event volume",
        details: dashboard.security.stats,
      });
    }

    // Cache performance alerts
    if (dashboard.cache.stats.hitMiss.hitRate  {
  const dashboard = await monitor.getComprehensiveDashboard();
  console.log("Dashboard Update:", JSON.stringify(dashboard, null, 2));

  // Check for alerts
  const alerts = await monitor.generateAlerts();
  if (alerts.length > 0) {
    console.log(" ALERTS:", alerts);
  }
}, 30000); // Every 30 seconds

// Export metrics to monitoring systems
setInterval(async () => {
  await monitor.exportMetrics("prometheus");
  await monitor.exportMetrics("cloudwatch");
}, 60000); // Every minute
```

### CLI Monitoring Commands

```bash
# Real-time streaming monitor
npx @juspay/neurolink sagemaker stream-monitor \
  --endpoint production-endpoint \
  --duration 3600 \
  --alerts \
  --export prometheus \
  --export cloudwatch

# System health check
npx @juspay/neurolink sagemaker diagnose \
  --endpoint production-endpoint \
  --check-models \
  --check-cache \
  --check-security \
  --check-rate-limits

# Performance benchmarking
npx @juspay/neurolink sagemaker stream-benchmark \
  --endpoint production-endpoint \
  --concurrent 50 \
  --requests 1000 \
  --duration 300 \
  --enable-analytics \
  --enable-caching \
  --model-selection performance_based

# Security audit
npx @juspay/neurolink sagemaker security-audit \
  --endpoint production-endpoint \
  --hours 24 \
  --export-report \
  --include-recommendations

# Cache analysis
npx @juspay/neurolink sagemaker cache-analyze \
  --endpoint production-endpoint \
  --strategy semantic \
  --optimize \
  --report
```

### Stream Debugging

```bash
# Enable verbose streaming debug
npx @juspay/neurolink stream "Debug this response" \
  --provider openai \
  --debug \
  --timeout 30000

# Monitor stream performance
npx @juspay/neurolink stream "Performance test" \
  --enable-analytics \
  --debug \
  --provider google-ai

# Debug streaming with the unified NeuroLink API
npx @juspay/neurolink stream "Complex analysis task" \
  --provider sagemaker \
  --debug \
  --max-tokens 500 \
  --temperature 0.7
```

### Advanced Performance Monitoring

```typescript

class PerformanceMonitor {
  private neurolink: NeuroLink;
  private startTime: number;
  private metrics: {
    tokenCount: number;
    chunkCount: number;
    responseTime: number;
    throughput: number;
    latencyDistribution: number[];
    errorCount: number;
  } = {
    tokenCount: 0,
    chunkCount: 0,
    responseTime: 0,
    throughput: 0,
    latencyDistribution: [],
    errorCount: 0,
  };

  constructor() {
    this.neurolink = new NeuroLink();
    this.startTime = Date.now();
  }

  async monitorStream(stream: AsyncIterable, requestId: string) {
    const chunkTimes: number[] = [];
    let firstChunkTime: number | null = null;
    let lastChunkTime: number = Date.now();

    for await (const chunk of stream) {
      const chunkTime = Date.now();

      if (!firstChunkTime) {
        firstChunkTime = chunkTime;
        console.log(
          `⏱️  Time to first chunk: ${firstChunkTime - this.startTime}ms`,
        );
      }

      if (chunk.type === "text-delta") {
        this.metrics.tokenCount += this.estimateTokens(chunk.textDelta);
        this.metrics.chunkCount++;
        chunkTimes.push(chunkTime - lastChunkTime);

        // Built-in metrics are automatically tracked by NeuroLink

        // Real-time throughput calculation
        const elapsed = (chunkTime - this.startTime) / 1000;
        this.metrics.throughput = this.metrics.tokenCount / elapsed;

        // Display real-time metrics every 10 chunks
        if (this.metrics.chunkCount % 10 === 0) {
          console.log(
            ` Tokens: ${this.metrics.tokenCount}, Throughput: ${this.metrics.throughput.toFixed(2)} t/s`,
          );
        }
      } else if (chunk.type === "error") {
        this.metrics.errorCount++;
        console.error(
          `❌ Stream error at chunk ${this.metrics.chunkCount}: ${chunk.error}`,
        );
      } else if (chunk.type === "finish") {
        this.metrics.responseTime = chunkTime - this.startTime;

        // Calculate latency statistics
        this.metrics.latencyDistribution = chunkTimes;
        const avgChunkLatency =
          chunkTimes.reduce((a, b) => a + b, 0) / chunkTimes.length;
        const p95ChunkLatency = this.percentile(chunkTimes, 95);
        const p99ChunkLatency = this.percentile(chunkTimes, 99);

        // Final metrics
        console.log(`\n Performance Summary:`);
        console.log(`   Total Response Time: ${this.metrics.responseTime}ms`);
        console.log(
          `   Time to First Chunk: ${firstChunkTime! - this.startTime}ms`,
        );
        console.log(`   Total Tokens: ${this.metrics.tokenCount}`);
        console.log(`   Total Chunks: ${this.metrics.chunkCount}`);
        console.log(
          `   Average Throughput: ${this.metrics.throughput.toFixed(2)} tokens/sec`,
        );
        console.log(
          `   Average Chunk Latency: ${avgChunkLatency.toFixed(2)}ms`,
        );
        console.log(`   P95 Chunk Latency: ${p95ChunkLatency.toFixed(2)}ms`);
        console.log(`   P99 Chunk Latency: ${p99ChunkLatency.toFixed(2)}ms`);
        console.log(`   Error Count: ${this.metrics.errorCount}`);
        console.log(
          `   Success Rate: ${(((this.metrics.chunkCount - this.metrics.errorCount) / this.metrics.chunkCount) * 100).toFixed(2)}%`,
        );

        // Complete tracking
        this.analytics.completeRequestTracking(
          requestId,
          chunk.usage || {
            promptTokens: 0,
            completionTokens: this.metrics.tokenCount,
            totalTokens: this.metrics.tokenCount,
          },
          this.metrics.errorCount === 0,
        );
      }

      lastChunkTime = chunkTime;
    }

    return this.metrics;
  }

  private estimateTokens(text: string): number {
    // Rough estimation: ~4 characters per token
    return Math.ceil(text.length / 4);
  }

  private percentile(arr: number[], p: number): number {
    const sorted = [...arr].sort((a, b) => a - b);
    const index = Math.ceil((p / 100) * sorted.length) - 1;
    return sorted[index] || 0;
  }

  async generatePerformanceReport() {
    const dashboardMetrics = this.analytics.getDashboardMetrics();
    const report = this.analytics.generateReport(
      Date.now() - 60 * 60 * 1000, // Last hour
      Date.now(),
    );

    return {
      timestamp: Date.now(),
      currentSession: this.metrics,
      hourlyReport: report,
      systemHealth: dashboardMetrics.systemHealth,
      trends: dashboardMetrics.trends,
      recommendations: this.generateRecommendations(report),
    };
  }

  private generateRecommendations(report: any): string[] {
    const recommendations: string[] = [];

    if (report.performance.averageDuration > 5000) {
      recommendations.push(
        "Consider using faster models or increasing instance sizes",
      );
    }

    if (report.performance.p95Duration > 10000) {
      recommendations.push(
        "High latency variance detected - review load balancing strategy",
      );
    }

    if (report.requests.successRate  {
  res.setHeader("Content-Type", "text/plain");
  res.setHeader("Transfer-Encoding", "chunked");

  try {
    const stream = await neurolink.stream({
      input: { text: req.body.prompt },
      provider: "google-ai",
    });

    for await (const chunk of stream) {
      res.write(chunk.content);
    }

    res.end();
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});
```

### WebSocket Streaming

```typescript

const wss = new WebSocket.Server({ port: 8080 });
const neurolink = new NeuroLink();

wss.on("connection", (ws) => {
  ws.on("message", async (message) => {
    const { prompt } = JSON.parse(message.toString());

    try {
      const stream = await neurolink.stream({
        input: { text: prompt },
        analytics: { enabled: true },
      });

      for await (const chunk of stream) {
        ws.send(
          JSON.stringify({
            type: "chunk",
            content: chunk.content,
            analytics: chunk.analytics,
          }),
        );
      }

      ws.send(JSON.stringify({ type: "complete" }));
    } catch (error) {
      ws.send(JSON.stringify({ type: "error", error: error.message }));
    }
  });
});
```

### Server-Sent Events (SSE)

```typescript
app.get("/api/stream-sse", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

  const stream = await neurolink.stream({
    input: { text: req.query.prompt as string },
  });

  for await (const chunk of stream) {
    res.write(
      `data: ${JSON.stringify({
        content: chunk.content,
        finished: chunk.finished,
      })}\n\n`,
    );
  }

  res.end();
});
```

##  Error Handling

### Robust Error Handling

```typescript
async function robustStreaming(prompt: string) {
  const maxRetries = 3;
  let attempts = 0;

  while (attempts  setTimeout(resolve, 1000 * attempts));
      } else {
        throw new Error(`Streaming failed after ${maxRetries} attempts`);
      }
    }
  }
}
```

##  Enterprise Use Cases

### Financial Services Streaming

```typescript
// High-frequency trading analysis with built-in compliance

const neurolink = new NeuroLink();

async function analyzeMarketData(marketData: string, userId: string) {
  const result = await neurolink.stream({
    provider: "anthropic", // Choose best provider for financial analysis
    input: {
      text: `Analyze this market data and provide risk assessment: ${marketData}`,
    },
    maxTokens: 1000,
    temperature: 0.2, // Low temperature for precise financial analysis
    tools: [
      { name: "risk_calculator", enabled: true },
      { name: "compliance_checker", enabled: true },
    ],
  });

  // Audit trail for compliance
  console.log(`Financial analysis requested by user: ${userId}`);
  console.log(`Model selected: ${result.selectedModel.modelId}`);

  return result;
}
```

### Healthcare AI with HIPAA Compliance

```typescript
// HIPAA-compliant medical AI streaming with NeuroLink

const neurolink = new NeuroLink();

// Configuration for HIPAA compliance
const healthcareConfig = {
  provider: "anthropic", // Choose provider with strong security
  maxTokens: 1000,
  temperature: 0.1, // Low temperature for medical accuracy
  // Built-in security and compliance features
};

async function processMedicalQuery(
  query: string,
  patientId: string,
  providerId: string,
) {
  // Basic validation for medical queries
  if (!query || !patientId || !providerId) {
    throw new Error("Missing required parameters for medical query");
  }

  // Audit logging for HIPAA compliance
  console.log(
    `Medical query requested by provider: ${providerId} for patient: ${patientId}`,
  );

  const stream = await neurolink.stream({
    ...healthcareConfig,
    input: { text: query },
    tools: [
      { name: "medical_knowledge", enabled: true },
      { name: "drug_interaction_check", enabled: true },
    ],
  });

  const sanitizedChunks = [];
  for await (const chunk of stream) {
    // Basic content filtering for sensitive data
    if (chunk.type === "text-delta") {
      // Apply basic PII filtering here if needed
      sanitizedChunks.push(chunk);
    } else if (chunk.type === "finish") {
      console.log(`Medical query completed for patient: ${patientId}`);
      sanitizedChunks.push(chunk);
    }
  }

  return sanitizedChunks;
}
```

### E-commerce Recommendation Engine

```typescript
// High-throughput e-commerce streaming with NeuroLink

const neurolink = new NeuroLink();

async function generatePersonalizedRecommendations(
  userId: string,
  browsingHistory: any[],
  preferences: any,
) {
  const result = await neurolink.stream({
    prompt: `Generate personalized product recommendations for user with browsing history: ${JSON.stringify(browsingHistory)} and preferences: ${JSON.stringify(preferences)}`,
    tools: [
      { name: "product_search", enabled: true },
      { name: "price_comparison", enabled: true },
      { name: "inventory_check", enabled: true },
    ],
    modelSelection: {
      requiredCapabilities: ["product_recommendations"],
      requestType: "completion",
    },
  });

  const recommendations = [];
  for await (const chunk of result.stream) {
    if (
      chunk.type === "tool-result" &&
      chunk.toolResult.name === "product_search"
    ) {
      recommendations.push(JSON.parse(chunk.toolResult.content));
    }
  }

  return {
    recommendations,
    model: result.selectedModel.modelId,
    performance: result.performance,
  };
}
```

##  Configuration Files

### Enterprise Configuration Template

```yaml
# neurolink-enterprise-streaming.yaml
streaming:
  sagemaker:
    endpoints:
      production:
        name: "production-multi-model"
        models:
          - id: "llama-3-70b"
            name: "LLaMA 3 70B"
            type: "llama"
            weight: 3
            specializations: ["reasoning", "analysis"]
            thresholds:
              max_latency: 5000
              max_error_rate: 2
              min_throughput: 20
          - id: "claude-3-5-sonnet"
            name: "Claude 3.5 Sonnet"
            type: "anthropic"
            weight: 4
            specializations: ["function_calling", "structured_output"]
            thresholds:
              max_latency: 3000
              max_error_rate: 1
              min_throughput: 25

        load_balancing:
          strategy: "performance_based"
          health_check:
            enabled: true
            interval: 30000
            timeout: 5000

        failover:
          enabled: true
          max_retries: 3
          strategies: ["model_switch", "endpoint_switch"]
          circuit_breaker:
            threshold: 5
            timeout: 60000

    rate_limiting:
      preset: "enterprise"
      requests_per_second: 100
      burst_capacity: 200
      adaptive: true
      target_response_time: 1000
      strategy: "queue"
      max_queue_size: 1000
      priority_queue: true

    caching:
      preset: "enterprise"
      storage: "hybrid"
      max_size_mb: 5000
      ttl: 21600000 # 6 hours
      strategy: "fuzzy"
      compression:
        enabled: true
        algorithm: "brotli"
      partial_hits: true
      warming: "scheduled"

    security:
      preset: "enterprise"
      input_validation:
        enabled: true
        max_prompt_length: 100000
        injection_detection: true
        content_policy: true
      output_filtering:
        enabled: true
        pii_redaction: true
        toxicity_filtering: true
        compliance: true
      access_control:
        enabled: true
        authentication: true
        api_key_validation: true
      monitoring:
        enabled: true
        real_time_alerts: true
        threat_detection: true
      compliance:
        gdpr: true
        hipaa: false
        soc2: true
        audit_logging: true

    analytics:
      preset: "enterprise"
      sampling_rate: 1.0
      retention_days: 365
      real_time_monitoring:
        enabled: true
        update_interval: 10000
        alert_thresholds:
          error_rate: 1
          response_time: 1500
          queue_size: 100
      export:
        enabled: true
        formats: ["prometheus", "cloudwatch"]
        interval: 60000
        destinations:
          - type: "cloudwatch"
            config:
              namespace: "NeuroLink/Enterprise"
              region: "us-east-1"
          - type: "prometheus"
            config:
              pushgateway: "prometheus:9091"
```

##  Related Documentation

- [CLI Commands](/docs/cli/commands) - Streaming CLI commands
- [SDK Reference](/docs/sdk/api-reference) - Complete streaming API
- [Analytics](/docs/reference/analytics) - Streaming analytics features
- [Dynamic Models](/docs/guides/dynamic-models) - Multi-model endpoint setup
- [Enterprise Features](/docs/guides/enterprise) - Enterprise security features
- [Performance Optimization](/docs/deployment/performance) - Optimization strategies
- [Analytics & Monitoring](/docs/reference/analytics) - Comprehensive monitoring
- [Provider Setup](/docs/getting-started/provider-setup) - Provider configuration
- [Development Guide](/docs/) - Development and deployment guide

##  What's Next

With Phase 2 complete, NeuroLink now offers enterprise-grade streaming capabilities:

- **✅ Multi-Model Streaming**: Intelligent load balancing and automatic failover
- **✅ Enterprise Security**: Comprehensive validation, filtering, and compliance
- **✅ Advanced Caching**: Semantic caching with partial response matching
- **✅ Real-time Analytics**: Complete monitoring and alerting system
- **✅ Rate Limiting**: Sophisticated backpressure handling and circuit breakers
- **✅ Tool Integration**: Streaming function calls with structured output

Upcoming in Phase 3:

- **Multi-Provider Streaming**: Seamless streaming across different AI providers
- **Edge Deployment**: CDN-based streaming for global latency optimization
- **Advanced Tool Orchestration**: Complex multi-step tool workflows
- **Custom Model Integration**: Support for proprietary and fine-tuned models

---

## Updated Provider Test Results

<!-- Source: advanced/updated-provider-test-results.md -->

# Updated Provider Test Results

This document contains the latest test results for all supported providers.

## Test Summary

### Provider Status

- ✅ OpenAI: All tests passing
- ✅ Amazon Bedrock: All tests passing
- ✅ Google Vertex AI: All tests passing
- ✅ Anthropic: All tests passing
- ✅ LiteLLM: All tests passing

### Performance Metrics

- Average response time: 2.3s
- Success rate: 99.7%
- Error rate: 0.3%

### Test Coverage

- Unit tests: 95%
- Integration tests: 87%
- End-to-end tests: 92%

## Detailed Results

### OpenAI Provider

- Text generation: ✅ Pass
- Streaming: ✅ Pass
- Error handling: ✅ Pass

### Amazon Bedrock Provider

- Text generation: ✅ Pass
- Streaming: ✅ Pass
- Error handling: ✅ Pass

### Google Vertex AI Provider

- Text generation: ✅ Pass
- Streaming: ✅ Pass
- Error handling: ✅ Pass

For more details, see the [Testing Guide](/docs/development/testing).

---

# Reference

## Reference

<!-- Source: reference/index.md -->

# Reference

Complete reference documentation for NeuroLink configuration, troubleshooting, and technical details.

##  Reference Hub

This section provides comprehensive reference materials for advanced usage, configuration, and problem-solving.

- ❓ **[Troubleshooting](/docs/reference/troubleshooting)**

  Common issues, error messages, and solutions for NeuroLink CLI and SDK usage.

- ⚙️ **[Configuration](/docs/deployment/configuration)**

  Complete configuration reference including environment variables, provider settings, and optimization.

- ️ **[Provider Capabilities Audit](/docs/reference/provider-capabilities-audit)**

  Comprehensive audit of all 12 provider implementations with capability matrices and configuration examples.

- ⚖️ **[Provider Comparison](/docs/reference/provider-comparison)**

  Detailed comparison of all 12 supported AI providers with features, costs, and recommendations.

- ❓ **[FAQ](/docs/reference/faq)**

  Frequently asked questions about NeuroLink features, limitations, and best practices.

- ⚠️ **[Error Codes](/docs/reference/error-codes)**

  Complete error code reference with categorized codes, severity levels, and resolution guidance.

-  **[Analytics](/docs/reference/analytics)**

  Comprehensive guide to NeuroLink analytics, metrics, token tracking, cost monitoring, and observability integration.

- ️ **[Server Configuration](/docs/reference/server-configuration)** 🆕

  Configuration reference for server adapters including Hono, Express, Fastify, and Koa framework integration.

##  Quick Reference

### Environment Variables

```bash
# Core Provider API Keys
OPENAI_API_KEY="sk-your-openai-key"
GOOGLE_AI_API_KEY="AIza-your-google-ai-key"
ANTHROPIC_API_KEY="sk-ant-your-key"

# AWS Bedrock (requires AWS credentials)
AWS_ACCESS_KEY_ID="your-access-key"
AWS_SECRET_ACCESS_KEY="your-secret-key"
AWS_REGION="us-east-1"

# Azure OpenAI
AZURE_OPENAI_API_KEY="your-azure-key"
AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"

# Google Vertex AI
GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

# Hugging Face
HUGGINGFACE_API_KEY="hf_your-key"

# Mistral AI
MISTRAL_API_KEY="your-mistral-key"
```

### CLI Quick Commands

```bash
# Status and diagnostics
neurolink status                    # Check all providers
neurolink status --verbose         # Detailed diagnostics
neurolink provider status          # Provider-specific status

# Text generation
neurolink generate "prompt"        # Basic generation
neurolink gen "prompt" -p openai   # Specific provider
neurolink stream "prompt"          # Real-time streaming

# Configuration
neurolink config show              # Show current config
neurolink config validate          # Validate setup
neurolink config init              # Interactive setup

# MCP tools
neurolink mcp discover             # Find available servers
neurolink mcp list                 # List installed servers
neurolink mcp install      # Install MCP server
```

### SDK Quick Reference

```typescript

// Basic usage
const neurolink = new NeuroLink();
const result = await neurolink.generate({
  input: { text: "Your prompt" },
  provider: "auto", // or specific provider
});

// Auto-select best provider
const provider = createBestAIProvider();
const result = await provider.generate({
  input: { text: "Your prompt" },
});

// With advanced options
const result = await neurolink.generate({
  input: { text: "Your prompt" },
  provider: "google-ai",
  model: "gemini-2.5-pro",
  temperature: 0.7,
  maxTokens: 1000,
  enableAnalytics: true,
  enableEvaluation: true,
  timeout: "30s",
});
```

##  Provider Comparison Matrix

**Quick Overview** (see [Provider Capabilities Audit](/docs/reference/provider-capabilities-audit) for complete details):

| Feature          | OpenAI | Google AI | Anthropic | Bedrock | Azure | Vertex | HuggingFace | Ollama | Mistral | LiteLLM | SageMaker | OpenRouter | OpenAI Compat |
| ---------------- | ------ | --------- | --------- | ------- | ----- | ------ | ----------- | ------ | ------- | ------- | --------- | ---------- | ------------- |
| **Free Tier**    | ❌     | ✅        | ❌        | ❌      | ❌    | ❌     | ✅          | ✅     | ✅      | Varies  | ❌        | Varies     | Varies        |
| **Tool Support** | ✅     | ✅        | ✅        | ✅      | ✅    | ✅     | ⚠️          | ⚠️     | ✅      | ✅      | ✅        | ✅         | ✅            |
| **Streaming**    | ✅     | ✅        | ✅        | ✅      | ✅    | ✅     | ✅          | ✅     | ✅      | ✅      | ✅        | ✅         | ✅            |
| **Vision**       | ✅     | ✅        | ✅        | ✅      | ✅    | ✅     | ✅          | ⚠️     | ❌      | ✅      | Varies    | ✅         | Varies        |
| **Local**        | ❌     | ❌        | ❌        | ❌      | ❌    | ❌     | ❌          | ✅     | ❌      | ❌      | ❌        | ❌         | Varies        |
| **Enterprise**   | ✅     | ✅        | ✅        | ✅      | ✅    | ✅     | ⚠️          | ✅     | ✅      | ✅      | ✅        | ✅         | Varies        |

For detailed capability matrices, authentication requirements, and configuration examples, see:

- **[Provider Capabilities Audit](/docs/reference/provider-capabilities-audit)** - Technical implementation details
- **[Provider Comparison](/docs/reference/provider-comparison)** - Feature comparison and selection guide

##  Error Code Reference

### Common Error Codes

| Code                   | Description                    | Solution                          |
| ---------------------- | ------------------------------ | --------------------------------- |
| `AUTH_ERROR`           | Invalid API key or credentials | Check environment variables       |
| `RATE_LIMIT`           | API rate limit exceeded        | Implement delays or upgrade plan  |
| `TIMEOUT`              | Request timeout                | Increase timeout or check network |
| `MODEL_NOT_FOUND`      | Invalid model name             | Check available models            |
| `TOOL_ERROR`           | MCP tool execution failed      | Check tool configuration          |
| `PROVIDER_UNAVAILABLE` | Provider service down          | Try different provider            |

### Debugging Tips

```bash
# Enable debug mode
neurolink generate "test" --debug

# Verbose logging
neurolink status --verbose

# Check configuration
neurolink config validate
```

```typescript
// SDK debugging
const neurolink = new NeuroLink({
  debug: true,
  logLevel: "verbose",
});
```

##  Performance Optimization

### Response Time Optimization

- **Provider selection**: Use fastest providers for your region
- **Model selection**: Choose appropriate model size for task
- **Concurrency**: Limit parallel requests to avoid rate limits
- **Caching**: Implement response caching for repeated queries

### Cost Optimization

- **Model selection**: Use cost-effective models when possible
- **Token management**: Optimize prompt length and max tokens
- **Provider comparison**: Compare costs across providers
- **Monitoring**: Track usage with analytics

### Memory Management

- **Streaming**: Use streaming for large responses
- **Batch processing**: Process multiple requests efficiently
- **Cleanup**: Proper resource cleanup in long-running applications

##  Security Best Practices

### API Key Management

- **Environment variables**: Store keys in `.env` files
- **Never commit**: Keep keys out of version control
- **Rotation**: Regularly rotate API keys
- **Scope limitation**: Use least-privilege access

### Production Deployment

- **Secret management**: Use secure secret management systems
- **Network security**: Implement proper network controls
- **Monitoring**: Log and monitor API usage
- **Error handling**: Don't expose sensitive errors

## 🆘 Getting Help

### Support Channels

1. **[GitHub Issues](https://github.com/juspay/neurolink/issues)** - Bug reports and feature requests
2. **[GitHub Discussions](https://github.com/juspay/neurolink/discussions)** - Community questions
3. **[Documentation](/docs/)** - Comprehensive guides and references
4. **[Examples](/docs/)** - Practical implementation patterns

### Before Asking for Help

1. Check the [Troubleshooting Guide](/docs/reference/troubleshooting)
2. Review the [FAQ](/docs/reference/faq)
3. Search existing [GitHub Issues](https://github.com/juspay/neurolink/issues)
4. Try the `--debug` flag for more information

### Reporting Issues

When reporting issues, include:

- **NeuroLink version**: `npm list @juspay/neurolink`
- **Node.js version**: `node --version`
- **Operating system**: OS and version
- **Error message**: Complete error output
- **Reproduction steps**: Minimal example to reproduce
- **Configuration**: Relevant environment variables (without keys)

##  External Resources

### AI Provider Documentation

- **[OpenAI API](https://platform.openai.com/docs)** - OpenAI official documentation
- **[Google AI Studio](https://aistudio.google.com/docs)** - Google AI platform docs
- **[Anthropic Claude](https://docs.anthropic.com/)** - Anthropic API reference
- **[AWS Bedrock](https://docs.aws.amazon.com/bedrock/)** - Amazon Bedrock guide

### Related Projects

- **[Vercel AI SDK](https://github.com/vercel/ai)** - Underlying provider implementations
- **[Model Context Protocol](https://modelcontextprotocol.io)** - Tool integration standard
- **[TypeScript](https://www.typescriptlang.org/)** - Type safety and development

---

## Analytics Reference

<!-- Source: reference/analytics.md -->

# Analytics Reference

NeuroLink provides comprehensive analytics capabilities for tracking token usage, costs, performance metrics, and quality evaluation across all AI provider interactions.

## Overview

The analytics system in NeuroLink consists of several interconnected components:

| Component                  | Purpose                                                         |
| -------------------------- | --------------------------------------------------------------- |
| **Token Usage Tracking**   | Monitor input/output tokens, cache tokens, and reasoning tokens |
| **Cost Analytics**         | Estimate and track costs across providers and models            |
| **Performance Metrics**    | Measure response times, throughput, and memory usage            |
| **Quality Evaluation**     | Assess response relevance, accuracy, and completeness           |
| **Middleware Integration** | Automatic analytics collection via middleware                   |

## Token Usage Tracking

### Basic Token Usage

NeuroLink automatically tracks token usage for every generation:

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Explain quantum computing in simple terms" },
  provider: "openai",
  enableAnalytics: true,
});

// Access token usage
console.log("Token Usage:", {
  input: result.usage?.input,
  output: result.usage?.output,
  total: result.usage?.total,
});

// Full analytics data
console.log("Analytics:", result.analytics);
```

### TokenUsage Type

The `TokenUsage` type provides detailed token information:

```typescript
type TokenUsage = {
  /** Number of input/prompt tokens */
  input: number;
  /** Number of output/completion tokens */
  output: number;
  /** Total tokens (input + output) */
  total: number;
  /** Tokens used to create cache entries (Anthropic, Google) */
  cacheCreationTokens?: number;
  /** Tokens read from cache (cost savings) */
  cacheReadTokens?: number;
  /** Tokens used for reasoning/thinking (o1, Claude thinking) */
  reasoning?: number;
  /** Percentage of cost saved through caching */
  cacheSavingsPercent?: number;
};
```

### Cache Token Tracking

For providers that support prompt caching (Anthropic, Google), NeuroLink tracks cache metrics:

```typescript
const result = await neurolink.generate({
  input: { text: "Analyze this document..." },
  provider: "anthropic",
  enableAnalytics: true,
});

if (result.analytics?.tokenUsage) {
  const { cacheCreationTokens, cacheReadTokens, cacheSavingsPercent } =
    result.analytics.tokenUsage;

  if (cacheCreationTokens) {
    console.log(`Cache created: ${cacheCreationTokens} tokens`);
  }
  if (cacheReadTokens) {
    console.log(`Cache hit: ${cacheReadTokens} tokens`);
    console.log(`Cost savings: ${cacheSavingsPercent}%`);
  }
}
```

### Reasoning Token Tracking

For models with extended thinking capabilities (OpenAI o1, Anthropic Claude with thinking, Gemini 3):

```typescript
const result = await neurolink.generate({
  input: { text: "Solve this complex mathematical proof..." },
  provider: "openai",
  model: "o1-mini",
  enableAnalytics: true,
});

if (result.analytics?.tokenUsage.reasoning) {
  console.log(
    `Reasoning tokens used: ${result.analytics.tokenUsage.reasoning}`,
  );
}
```

## Cost Analytics

### Automatic Cost Estimation

NeuroLink automatically estimates costs based on provider pricing:

```typescript
const result = await neurolink.generate({
  input: { text: "Write a detailed business plan" },
  provider: "openai",
  model: "gpt-4o",
  enableAnalytics: true,
});

if (result.analytics?.cost !== undefined) {
  console.log(`Estimated cost: $${result.analytics.cost.toFixed(5)}`);
}
```

### Cost Calculation Formula

Costs are calculated using per-token pricing:

```typescript
// Internal cost calculation
const inputCost = (tokens.input / 1000) * costInfo.input;
const outputCost = (tokens.output / 1000) * costInfo.output;
const totalCost = inputCost + outputCost;
```

### Provider Pricing Configuration

NeuroLink uses configurable pricing for each provider:

| Provider      | Default Input Cost (per 1K) | Default Output Cost (per 1K) |
| ------------- | --------------------------- | ---------------------------- |
| OpenAI        | $0.00015                    | $0.0006                      |
| Anthropic     | $0.0015                     | $0.0075                      |
| Google AI     | $0.000075                   | $0.0003                      |
| Google Vertex | $0.000075                   | $0.0003                      |
| Bedrock       | $0.0015                     | $0.0075                      |
| Azure         | $0.00015                    | $0.0006                      |
| Mistral       | $0.0001                     | $0.0003                      |
| HuggingFace   | $0.0002                     | $0.0008                      |
| Ollama        | $0                          | $0                           |

### Custom Cost Configuration

Override default pricing via environment variables:

```bash
# Custom pricing for Google AI
GOOGLE_AI_DEFAULT_INPUT_COST=0.0001
GOOGLE_AI_DEFAULT_OUTPUT_COST=0.0004

# Custom pricing for OpenAI
OPENAI_DEFAULT_INPUT_COST=0.0002
OPENAI_DEFAULT_OUTPUT_COST=0.0008
```

### Aggregating Costs

Track cumulative costs across multiple requests:

```typescript

const neurolink = new NeuroLink();
const usages = [];

// Collect usage from multiple requests
for (const prompt of prompts) {
  const result = await neurolink.generate({
    input: { text: prompt },
    enableAnalytics: true,
  });
  if (result.usage) {
    usages.push(result.usage);
  }
}

// Calculate total usage manually
const totalUsage = usages.reduce(
  (total, current) => ({
    input: total.input + current.input,
    output: total.output + current.output,
    total: total.total + current.total,
  }),
  { input: 0, output: 0, total: 0 },
);

console.log(`Total tokens used: ${totalUsage.total}`);
```

## Performance Metrics

### Response Time Tracking

Every request automatically tracks response time:

```typescript
const result = await neurolink.generate({
  input: { text: "Quick response test" },
  enableAnalytics: true,
});

console.log(`Response time: ${result.responseTime}ms`);
console.log(`Analytics duration: ${result.analytics?.requestDuration}ms`);
```

### AnalyticsData Structure

The complete analytics data structure:

```typescript
type AnalyticsData = {
  /** Provider used for the request */
  provider: string;
  /** Model used for the request */
  model?: string;
  /** Token usage breakdown */
  tokenUsage: TokenUsage;
  /** Request duration in milliseconds */
  requestDuration: number;
  /** ISO timestamp of the request */
  timestamp: string;
  /** Estimated cost in USD */
  cost?: number;
  /** Custom context data */
  context?: Record;
};
```

### Performance Metrics Type

For advanced performance tracking:

```typescript
type PerformanceMetrics = {
  /** Start timestamp */
  startTime: number;
  /** End timestamp */
  endTime?: number;
  /** Total duration in ms */
  duration?: number;
  /** Memory usage at start */
  memoryStart: NodeJS.MemoryUsage;
  /** Memory usage at end */
  memoryEnd?: NodeJS.MemoryUsage;
  /** Memory delta */
  memoryDelta?: {
    rss: number;
    heapTotal: number;
    heapUsed: number;
    external: number;
  };
};
```

### Stream Performance Metrics

For streaming requests, additional metrics are available:

```typescript
type StreamAnalyticsData = {
  /** Tool execution results with timing */
  toolResults?: Promise>;
  /** Tool calls made during stream */
  toolCalls?: Promise>;
  /** Stream performance metrics */
  performance?: {
    startTime: number;
    endTime?: number;
    chunkCount: number;
    avgChunkSize: number;
    totalBytes: number;
  };
  /** Provider analytics */
  providerAnalytics?: AnalyticsData;
};
```

### Streaming Example

```typescript
const stream = await neurolink.stream({
  input: { text: "Write a long story" },
  enableAnalytics: true,
});

let chunkCount = 0;
for await (const chunk of stream.textStream) {
  chunkCount++;
  process.stdout.write(chunk);
}

// Access stream analytics after completion
const analytics = await stream.analytics;
console.log(`\nChunks received: ${chunkCount}`);
console.log(`Total tokens: ${analytics?.tokenUsage?.total}`);
```

## Quality Evaluation

### Enabling Evaluation

NeuroLink can automatically evaluate response quality:

```typescript
const result = await neurolink.generate({
  input: { text: "Explain machine learning" },
  provider: "openai",
  enableAnalytics: true,
  enableEvaluation: true,
});

if (result.evaluation) {
  console.log("Evaluation Results:", {
    relevance: result.evaluation.relevance,
    accuracy: result.evaluation.accuracy,
    completeness: result.evaluation.completeness,
    overall: result.evaluation.overall,
    reasoning: result.evaluation.reasoning,
  });
}
```

### EvaluationData Structure

```typescript
type EvaluationData = {
  // Core scores (1-10 scale)
  /** How well response addresses query intent */
  relevance: number;
  /** Factual correctness and accuracy */
  accuracy: number;
  /** How completely the response addresses the query */
  completeness: number;
  /** Overall quality score */
  overall: number;

  // Domain-specific scores
  /** Domain alignment score */
  domainAlignment?: number;
  /** Terminology accuracy */
  terminologyAccuracy?: number;
  /** Tool effectiveness score */
  toolEffectiveness?: number;

  // Quality indicators
  /** True if response deviates from query/domain */
  isOffTopic: boolean;
  /** Quality alert level: low, medium, high, none */
  alertSeverity: "low" | "medium" | "high" | "none";
  /** Brief justification for scores */
  reasoning: string;
  /** Suggestions for improvement */
  suggestedImprovements?: string;

  // Metadata
  /** Model used for evaluation */
  evaluationModel: string;
  /** Time taken for evaluation (ms) */
  evaluationTime: number;
  /** Domain for evaluation */
  evaluationDomain?: string;
};
```

### Domain-Aware Evaluation

Configure evaluation for specific domains:

```typescript
const result = await neurolink.generate({
  input: { text: "What are the side effects of aspirin?" },
  provider: "openai",
  enableEvaluation: true,
  evaluationDomain: "healthcare",
});

if (result.evaluation?.domainEvaluation) {
  console.log("Domain Evaluation:", {
    domainRelevance: result.evaluation.domainEvaluation.domainRelevance,
    terminologyAccuracy: result.evaluation.domainEvaluation.terminologyAccuracy,
    domainExpertise: result.evaluation.domainEvaluation.domainExpertise,
  });
}
```

### Evaluation Providers

Evaluation can use different providers:

```typescript
type EvaluationProvider =
  | "openai"
  | "anthropic"
  | "vertex"
  | "google-ai"
  | "local";
```

## Analytics Middleware

### Using Analytics Middleware

NeuroLink provides built-in analytics middleware:

```typescript

const analyticsMiddleware = createAnalyticsMiddleware();

const neurolink = new NeuroLink({
  middleware: [analyticsMiddleware],
});
```

### Middleware Metadata

The analytics middleware provides:

```typescript
const metadata = {
  id: "analytics",
  name: "Analytics Tracking",
  description:
    "Tracks token usage, response times, and model performance metrics",
  priority: 100, // High priority to ensure capture
  defaultEnabled: true,
};
```

### Custom Analytics Collection

Implement custom analytics collection:

```typescript

function createCustomAnalyticsMiddleware(): NeuroLinkMiddleware {
  const metrics: Map> = new Map();

  return {
    metadata: {
      id: "custom-analytics",
      name: "Custom Analytics",
      description: "Custom analytics tracking",
      priority: 90,
      defaultEnabled: true,
    },

    wrapGenerate: async ({ doGenerate, params }) => {
      const requestId = `req-${Date.now()}`;
      const startTime = Date.now();

      try {
        const result = await doGenerate();
        const duration = Date.now() - startTime;

        metrics.set(requestId, {
          duration,
          tokens: result.usage,
          timestamp: new Date().toISOString(),
        });

        return result;
      } catch (error) {
        metrics.set(requestId, {
          error: error instanceof Error ? error.message : String(error),
          duration: Date.now() - startTime,
        });
        throw error;
      }
    },
  };
}
```

## Analytics Utilities

### Formatting Utilities

```typescript

  formatTokenUsage,
  formatAnalyticsForDisplay,
  getAnalyticsSummary,
} from "@juspay/neurolink/utils/analyticsUtils";

// Format token usage as string
const usageString = formatTokenUsage(result.usage);
// Output: "100 input / 50 output / 20 cache-read"

// Format full analytics for display
const display = formatAnalyticsForDisplay(result.analytics);
// Output: "Provider: openai | Model: gpt-4o | Tokens: 100 input / 50 output | Cost: $0.00015 | Time: 1.2s"

// Get analytics summary
const summary = getAnalyticsSummary(result.analytics);
console.log({
  totalTokens: summary.totalTokens,
  costPerToken: summary.costPerToken,
  requestsPerSecond: summary.requestsPerSecond,
});
```

### Validation Utilities

```typescript

  hasValidTokenUsage,
  isTokenUsage,
} from "@juspay/neurolink/utils/analyticsUtils";

// Check if analytics has valid token usage
if (hasValidTokenUsage(result.analytics)) {
  // Safe to access token fields
}

// Type guard for token usage
if (isTokenUsage(data)) {
  console.log(data.total);
}
```

## Integration with Observability Tools

### OpenTelemetry Integration

Export analytics to OpenTelemetry:

```typescript

const tracer = trace.getTracer("neurolink");

async function trackedGenerate(options: GenerateOptions) {
  return tracer.startActiveSpan("neurolink.generate", async (span) => {
    try {
      const result = await neurolink.generate({
        ...options,
        enableAnalytics: true,
      });

      // Add analytics as span attributes
      if (result.analytics) {
        span.setAttributes({
          "ai.provider": result.analytics.provider,
          "ai.model": result.analytics.model || "unknown",
          "ai.tokens.input": result.analytics.tokenUsage.input,
          "ai.tokens.output": result.analytics.tokenUsage.output,
          "ai.tokens.total": result.analytics.tokenUsage.total,
          "ai.cost": result.analytics.cost || 0,
          "ai.duration_ms": result.analytics.requestDuration,
        });
      }

      span.setStatus({ code: SpanStatusCode.OK });
      return result;
    } catch (error) {
      span.setStatus({
        code: SpanStatusCode.ERROR,
        message: error instanceof Error ? error.message : String(error),
      });
      throw error;
    } finally {
      span.end();
    }
  });
}
```

### Prometheus Metrics

Export metrics to Prometheus:

```typescript

// Define metrics
const tokenCounter = new Counter({
  name: "neurolink_tokens_total",
  help: "Total tokens used",
  labelNames: ["provider", "model", "type"],
});

const costGauge = new Gauge({
  name: "neurolink_cost_dollars",
  help: "Estimated cost in dollars",
  labelNames: ["provider", "model"],
});

const latencyHistogram = new Histogram({
  name: "neurolink_request_duration_ms",
  help: "Request duration in milliseconds",
  labelNames: ["provider", "model"],
  buckets: [100, 250, 500, 1000, 2500, 5000, 10000],
});

// Record metrics after each request
function recordMetrics(analytics: AnalyticsData) {
  const labels = {
    provider: analytics.provider,
    model: analytics.model || "unknown",
  };

  tokenCounter.inc({ ...labels, type: "input" }, analytics.tokenUsage.input);
  tokenCounter.inc({ ...labels, type: "output" }, analytics.tokenUsage.output);

  if (analytics.cost !== undefined) {
    costGauge.set(labels, analytics.cost);
  }

  latencyHistogram.observe(labels, analytics.requestDuration);
}
```

### DataDog Integration

Send analytics to DataDog:

```typescript

const dogstatsd = new DogStatsDClient();

function sendToDataDog(analytics: AnalyticsData) {
  const tags = [
    `provider:${analytics.provider}`,
    `model:${analytics.model || "unknown"}`,
  ];

  dogstatsd.increment("neurolink.requests", 1, tags);
  dogstatsd.gauge("neurolink.tokens.input", analytics.tokenUsage.input, tags);
  dogstatsd.gauge("neurolink.tokens.output", analytics.tokenUsage.output, tags);
  dogstatsd.histogram("neurolink.latency", analytics.requestDuration, tags);

  if (analytics.cost !== undefined) {
    dogstatsd.gauge("neurolink.cost", analytics.cost, tags);
  }
}
```

### Custom Logging

Structured logging with analytics:

```typescript

const logger = pino({
  level: "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
});

async function loggedGenerate(options: GenerateOptions) {
  const result = await neurolink.generate({
    ...options,
    enableAnalytics: true,
  });

  logger.info(
    {
      provider: result.analytics?.provider,
      model: result.analytics?.model,
      tokens: {
        input: result.analytics?.tokenUsage.input,
        output: result.analytics?.tokenUsage.output,
        total: result.analytics?.tokenUsage.total,
      },
      cost: result.analytics?.cost,
      duration: result.analytics?.requestDuration,
      timestamp: result.analytics?.timestamp,
    },
    "AI generation completed",
  );

  return result;
}
```

## Usage Statistics

### Tracking Usage Over Time

Build usage dashboards with aggregated statistics:

```typescript
type UsageStats = {
  totalRequests: number;
  totalTokens: number;
  totalCost: number;
  averageLatency: number;
  byProvider: Map;
  byModel: Map;
};

class UsageTracker {
  private stats: UsageStats = {
    totalRequests: 0,
    totalTokens: 0,
    totalCost: 0,
    averageLatency: 0,
    byProvider: new Map(),
    byModel: new Map(),
  };
  private latencies: number[] = [];

  record(analytics: AnalyticsData) {
    this.stats.totalRequests++;
    this.stats.totalTokens += analytics.tokenUsage.total;
    this.stats.totalCost += analytics.cost || 0;
    this.latencies.push(analytics.requestDuration);
    this.stats.averageLatency =
      this.latencies.reduce((a, b) => a + b, 0) / this.latencies.length;

    // Track by provider
    const providerStats = this.stats.byProvider.get(analytics.provider) || {
      requests: 0,
      tokens: 0,
      cost: 0,
    };
    providerStats.requests++;
    providerStats.tokens += analytics.tokenUsage.total;
    providerStats.cost += analytics.cost || 0;
    this.stats.byProvider.set(analytics.provider, providerStats);

    // Track by model
    if (analytics.model) {
      const modelStats = this.stats.byModel.get(analytics.model) || {
        requests: 0,
        tokens: 0,
        cost: 0,
      };
      modelStats.requests++;
      modelStats.tokens += analytics.tokenUsage.total;
      modelStats.cost += analytics.cost || 0;
      this.stats.byModel.set(analytics.model, modelStats);
    }
  }

  getStats(): UsageStats {
    return { ...this.stats };
  }

  getSummary(): string {
    return `
      Total Requests: ${this.stats.totalRequests}
      Total Tokens: ${this.stats.totalTokens.toLocaleString()}
      Total Cost: $${this.stats.totalCost.toFixed(4)}
      Average Latency: ${this.stats.averageLatency.toFixed(0)}ms
    `;
  }
}
```

### Rate Limiting Based on Usage

Implement rate limiting using analytics:

```typescript
class UsageRateLimiter {
  private tokenBudget: number;
  private costBudget: number;
  private usedTokens = 0;
  private usedCost = 0;
  private resetInterval: NodeJS.Timeout;

  constructor(
    options: {
      tokenBudget?: number;
      costBudget?: number;
      resetIntervalMs?: number;
    } = {},
  ) {
    this.tokenBudget = options.tokenBudget || 1_000_000;
    this.costBudget = options.costBudget || 10;

    // Reset budgets periodically
    this.resetInterval = setInterval(() => {
      this.usedTokens = 0;
      this.usedCost = 0;
    }, options.resetIntervalMs || 3600000); // 1 hour default
  }

  canProceed(estimatedTokens: number): boolean {
    return this.usedTokens + estimatedTokens  COST_ALERT_THRESHOLD) {
    console.warn(
      `High cost alert: $${result.analytics.cost.toFixed(4)} for request`,
    );
  }

  return result;
}
```

### 3. Track Token Efficiency

```typescript
function calculateEfficiency(analytics: AnalyticsData): number {
  // Ratio of output tokens to total tokens
  const { input, output, total } = analytics.tokenUsage;
  return total > 0 ? output / total : 0;
}
```

### 4. Implement Budget Controls

```typescript
class BudgetController {
  private dailyBudget: number;
  private spent = 0;

  constructor(dailyBudget: number) {
    this.dailyBudget = dailyBudget;
  }

  async generate(options: GenerateOptions): Promise {
    if (this.spent >= this.dailyBudget) {
      throw new Error("Daily budget exceeded");
    }

    const result = await neurolink.generate({
      ...options,
      enableAnalytics: true,
    });

    this.spent += result.analytics?.cost || 0;
    return result;
  }
}
```

## Related Documentation

- [Configuration Reference](/docs/deployment/configuration) - Configure analytics settings
- [Provider Comparison](/docs/reference/provider-comparison) - Compare provider costs
- [Troubleshooting](/docs/reference/troubleshooting) - Debug analytics issues
- [Error Codes](/docs/reference/error-codes) - Analytics-related error codes

---

## Error Code Reference

<!-- Source: reference/error-codes.md -->

# Error Code Reference

This document provides a comprehensive reference for all NeuroLink error codes, including their categories, severity levels, retriability status, and resolution guidance.

## Overview

NeuroLink uses a structured error handling system that provides detailed information about failures. Each error includes:

| Property    | Description                                             |
| ----------- | ------------------------------------------------------- |
| `code`      | Unique identifier for the error type                    |
| `category`  | Classification of the error (validation, network, etc.) |
| `severity`  | Impact level (critical, high, medium, low)              |
| `retriable` | Whether the operation can be automatically retried      |
| `message`   | Human-readable description of the error                 |
| `context`   | Additional metadata about the error circumstances       |
| `timestamp` | When the error occurred                                 |

## Error Categories

NeuroLink classifies errors into the following categories:

| Category        | Description                         | Common Causes                                            |
| --------------- | ----------------------------------- | -------------------------------------------------------- |
| `VALIDATION`    | Invalid parameters or configuration | Malformed input, missing required fields, invalid values |
| `EXECUTION`     | Runtime execution failures          | Tool execution errors, provider API failures             |
| `NETWORK`       | Connectivity issues                 | DNS failures, connection timeouts, SSL errors            |
| `RESOURCE`      | Memory or quota exhaustion          | Out of memory, rate limits exceeded                      |
| `TIMEOUT`       | Operation timeouts                  | Slow provider response, long-running operations          |
| `PERMISSION`    | Authorization issues                | Invalid API keys, insufficient permissions               |
| `CONFIGURATION` | Configuration errors                | Missing environment variables, invalid config            |
| `SYSTEM`        | System-level failures               | Internal errors, unexpected states                       |

## Severity Levels

Errors are classified by severity to help prioritize response:

| Severity   | Description                                        | Action Required                           |
| ---------- | -------------------------------------------------- | ----------------------------------------- |
| `CRITICAL` | System-level failure requiring immediate attention | Stop operation, investigate immediately   |
| `HIGH`     | Operation failed, significant impact               | Retry if possible, escalate if persistent |
| `MEDIUM`   | Validation or recoverable issues                   | Review parameters, fix and retry          |
| `LOW`      | Minor issues, informational                        | Log for monitoring, continue operation    |

## Tool Errors

Errors related to tool registration, discovery, and execution.

| Code                     | Description                          | Severity | Retriable | Category   |
| ------------------------ | ------------------------------------ | -------- | --------- | ---------- |
| `TOOL_NOT_FOUND`         | Requested tool not found in registry | MEDIUM   | No        | VALIDATION |
| `TOOL_EXECUTION_FAILED`  | Tool execution encountered an error  | HIGH     | Yes       | EXECUTION  |
| `TOOL_TIMEOUT`           | Tool execution timed out             | HIGH     | Yes       | TIMEOUT    |
| `TOOL_VALIDATION_FAILED` | Tool parameter validation failed     | MEDIUM   | No        | VALIDATION |

### Resolution Guide

**TOOL_NOT_FOUND**

```typescript
// Check available tools before calling
const tools = await neurolink.listTools();
console.log(
  "Available tools:",
  tools.map((t) => t.name),
);

// Verify tool registration
await neurolink.addTool({
  name: "myTool",
  description: "My custom tool",
  parameters: {
    /* schema */
  },
  execute: async (params) => {
    /* implementation */
  },
});
```

**TOOL_EXECUTION_FAILED**

```typescript
// Check tool parameters match expected schema
// Review tool implementation for errors
// Verify external dependencies (APIs, databases) are available
```

**TOOL_TIMEOUT**

```typescript
// Increase timeout configuration
const result = await neurolink.generate({
  input: { text: "Use the slow tool" },
  toolTimeout: 60000, // 60 seconds
});
```

## Provider Errors

Errors related to AI provider communication and authentication.

| Code                      | Description                           | Severity | Retriable | Category   |
| ------------------------- | ------------------------------------- | -------- | --------- | ---------- |
| `PROVIDER_NOT_AVAILABLE`  | Provider service unavailable          | HIGH     | Yes       | NETWORK    |
| `PROVIDER_AUTH_FAILED`    | Provider authentication failed        | HIGH     | No        | PERMISSION |
| `PROVIDER_QUOTA_EXCEEDED` | Provider rate limit or quota exceeded | HIGH     | Yes       | RESOURCE   |

### Resolution Guide

**PROVIDER_NOT_AVAILABLE**

```typescript
// Configure automatic failover
const neurolink = new NeuroLink({
  provider: "openai",
  fallbackProviders: ["anthropic", "google-ai"],
});

// Or manually switch providers
await neurolink.setProvider("anthropic");
```

**PROVIDER_AUTH_FAILED**

```typescript
// Verify API key is set correctly
process.env.OPENAI_API_KEY = "sk-...";

// Check API key permissions and validity
// Ensure correct environment variables for your provider
```

**PROVIDER_QUOTA_EXCEEDED**

```typescript
// Implement exponential backoff

const result = await withRetry(
  () => neurolink.generate({ input: { text: "Hello" } }),
  { maxAttempts: 3, delayMs: 1000 },
);
```

## Video Validation Errors

Errors specific to video generation operations.

| Code                         | Description                   | Severity | Retriable | Category   |
| ---------------------------- | ----------------------------- | -------- | --------- | ---------- |
| `INVALID_VIDEO_RESOLUTION`   | Invalid resolution specified  | MEDIUM   | No        | VALIDATION |
| `INVALID_VIDEO_LENGTH`       | Invalid video duration        | MEDIUM   | No        | VALIDATION |
| `INVALID_VIDEO_ASPECT_RATIO` | Invalid aspect ratio          | MEDIUM   | No        | VALIDATION |
| `INVALID_VIDEO_AUDIO`        | Invalid audio option          | MEDIUM   | No        | VALIDATION |
| `INVALID_VIDEO_MODE`         | Output mode not set to video  | MEDIUM   | No        | VALIDATION |
| `MISSING_VIDEO_IMAGE`        | Required input image missing  | MEDIUM   | No        | VALIDATION |
| `EMPTY_VIDEO_PROMPT`         | Video prompt cannot be empty  | MEDIUM   | No        | VALIDATION |
| `VIDEO_PROMPT_TOO_LONG`      | Prompt exceeds maximum length | MEDIUM   | No        | VALIDATION |

### Resolution Guide

**INVALID_VIDEO_RESOLUTION**

```typescript
// Valid resolutions: '720p' or '1080p'
const result = await neurolink.generate({
  input: { text: "Camera pan", images: [imageBuffer] },
  output: {
    mode: "video",
    video: { resolution: "720p" }, // or '1080p'
  },
});
```

**INVALID_VIDEO_LENGTH**

```typescript
// Valid lengths: 4, 6, or 8 seconds
const result = await neurolink.generate({
  input: { text: "Smooth motion", images: [imageBuffer] },
  output: {
    mode: "video",
    video: { length: 6 }, // 4, 6, or 8
  },
});
```

**INVALID_VIDEO_ASPECT_RATIO**

```typescript
// Valid aspect ratios: '9:16' (portrait) or '16:9' (landscape)
const result = await neurolink.generate({
  input: { text: "Cinematic shot", images: [imageBuffer] },
  output: {
    mode: "video",
    video: { aspectRatio: "16:9" },
  },
});
```

**MISSING_VIDEO_IMAGE**

```typescript
// Video generation requires an input image

const imageBuffer = readFileSync("./input.png");
const result = await neurolink.generate({
  input: {
    text: "Animate this image with smooth motion",
    images: [imageBuffer],
  },
  output: { mode: "video" },
});
```

## Image Validation Errors

Errors specific to image input processing.

| Code                   | Description                        | Severity | Retriable | Category   |
| ---------------------- | ---------------------------------- | -------- | --------- | ---------- |
| `EMPTY_IMAGE_PATH`     | Image path or URL is empty         | MEDIUM   | No        | VALIDATION |
| `INVALID_IMAGE_TYPE`   | Image must be Buffer, path, or URL | MEDIUM   | No        | VALIDATION |
| `IMAGE_TOO_LARGE`      | Image exceeds maximum size         | MEDIUM   | No        | VALIDATION |
| `IMAGE_TOO_SMALL`      | Image data too small to be valid   | MEDIUM   | No        | VALIDATION |
| `INVALID_IMAGE_FORMAT` | Unsupported image format           | MEDIUM   | No        | VALIDATION |

### Resolution Guide

**IMAGE_TOO_LARGE**

```typescript
// Compress or resize images before sending

const compressedImage = await sharp(originalImage)
  .resize(1920, 1080, { fit: "inside" })
  .jpeg({ quality: 80 })
  .toBuffer();
```

**INVALID_IMAGE_FORMAT**

```typescript
// Supported formats: JPEG, PNG, WebP
// Convert unsupported formats before processing

const jpegBuffer = await sharp(bmpImage).jpeg().toBuffer();
```

## System and Configuration Errors

General system and configuration errors.

| Code                     | Description                     | Severity | Retriable | Category      |
| ------------------------ | ------------------------------- | -------- | --------- | ------------- |
| `MEMORY_EXHAUSTED`       | System memory exhausted         | CRITICAL | No        | RESOURCE      |
| `NETWORK_ERROR`          | Network connectivity issue      | HIGH     | Yes       | NETWORK       |
| `PERMISSION_DENIED`      | Operation not permitted         | HIGH     | No        | PERMISSION    |
| `INVALID_CONFIGURATION`  | Configuration is invalid        | MEDIUM   | No        | CONFIGURATION |
| `MISSING_CONFIGURATION`  | Required configuration missing  | MEDIUM   | No        | CONFIGURATION |
| `INVALID_PARAMETERS`     | Parameters failed validation    | MEDIUM   | No        | VALIDATION    |
| `MISSING_REQUIRED_PARAM` | Required parameter not provided | MEDIUM   | No        | VALIDATION    |

### Resolution Guide

**MEMORY_EXHAUSTED**

```typescript
// Process large files in chunks
// Increase Node.js heap size: node --max-old-space-size=4096
// Use streaming for large responses
```

**MISSING_CONFIGURATION**

```typescript
// Verify all required environment variables are set
// Required variables depend on your provider:
// - OPENAI_API_KEY for OpenAI
// - ANTHROPIC_API_KEY for Anthropic
// - GOOGLE_API_KEY for Google AI Studio
// - GOOGLE_APPLICATION_CREDENTIALS for Vertex AI

// Validate configuration

await validateConfig();
```

## Video Generation Runtime Errors

Runtime errors during video generation (as opposed to validation errors).

| Code                            | Description                               | Severity | Retriable | Category      |
| ------------------------------- | ----------------------------------------- | -------- | --------- | ------------- |
| `VIDEO_GENERATION_FAILED`       | Video generation API call failed          | HIGH     | Yes       | EXECUTION     |
| `VIDEO_PROVIDER_NOT_CONFIGURED` | Vertex AI not properly configured         | HIGH     | No        | CONFIGURATION |
| `VIDEO_POLL_TIMEOUT`            | Polling for video completion timed out    | HIGH     | Yes       | TIMEOUT       |
| `VIDEO_INVALID_INPUT`           | Runtime I/O error during input processing | HIGH     | Yes       | EXECUTION     |

### Resolution Guide

**VIDEO_PROVIDER_NOT_CONFIGURED**

```bash
# Set Google Cloud credentials for Vertex AI video generation
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
export GOOGLE_VERTEX_PROJECT=your-project-id
export GOOGLE_VERTEX_LOCATION=us-central1
```

**VIDEO_POLL_TIMEOUT**

```typescript
// Video generation typically takes 1-3 minutes
// Consider using shorter duration or lower resolution for faster results
const result = await neurolink.generate({
  input: { text: "Quick animation", images: [imageBuffer] },
  output: {
    mode: "video",
    video: {
      resolution: "720p", // Lower resolution is faster
      length: 4, // Shorter duration is faster
    },
  },
});
```

## SDK Error Handling Example

Complete example demonstrating proper error handling in the SDK:

```typescript

  NeuroLink,
  NeuroLinkError,
  ErrorCategory,
  withRetry,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({ provider: "openai" });

async function safeGenerate(prompt: string) {
  try {
    const result = await neurolink.generate({
      input: { text: prompt },
    });
    return result;
  } catch (error) {
    if (error instanceof NeuroLinkError) {
      // Access structured error information
      console.error(`Error Code: ${error.code}`);
      console.error(`Category: ${error.category}`);
      console.error(`Severity: ${error.severity}`);
      console.error(`Retriable: ${error.retriable}`);
      console.error(`Message: ${error.message}`);
      console.error(`Context:`, error.context);

      // Handle by category
      switch (error.category) {
        case ErrorCategory.VALIDATION:
          console.error("Fix input parameters and retry");
          break;
        case ErrorCategory.NETWORK:
          if (error.retriable) {
            console.error("Network issue - retrying...");
            return withRetry(
              () => neurolink.generate({ input: { text: prompt } }),
              { maxAttempts: 3, delayMs: 2000 },
            );
          }
          break;
        case ErrorCategory.PERMISSION:
          console.error("Check API key and permissions");
          break;
        case ErrorCategory.RESOURCE:
          console.error("Rate limited - waiting before retry");
          await new Promise((resolve) => setTimeout(resolve, 5000));
          return safeGenerate(prompt);
        default:
          console.error("Unexpected error");
      }

      // Log error in JSON format for structured logging
      console.error("Structured error:", error.toJSON());
    }
    throw error;
  }
}
```

## CLI Debugging

The CLI provides several options for debugging errors:

### Enable Debug Mode

```bash
# Run with debug output
neurolink generate "test prompt" --debug

# Show verbose output
neurolink status --verbose

# Validate configuration
neurolink config validate

# Check provider status
neurolink provider status openai
```

### Environment Validation

```bash
# Validate all environment variables
pnpm run env:validate

# Check specific provider configuration
neurolink config check --provider openai
```

### Debug Logging

```typescript
// Enable debug logging in SDK

setLogLevel("debug");

// Or via environment variable
process.env.NEUROLINK_LOG_LEVEL = "debug";
```

## Retry Utilities

NeuroLink provides built-in utilities for handling retriable errors:

### withRetry

```typescript

const result = await withRetry(
  () => neurolink.generate({ input: { text: "Hello" } }),
  {
    maxAttempts: 3,
    delayMs: 1000,
    isRetriable: isRetriableError,
    onRetry: (attempt, error) => {
      console.log(`Retry ${attempt}: ${error.message}`);
    },
  },
);
```

### withTimeout

```typescript

const result = await withTimeout(
  neurolink.generate({ input: { text: "Hello" } }),
  30000, // 30 second timeout
  new Error("Generation timed out"),
);
```

### Circuit Breaker

```typescript

const breaker = new CircuitBreaker(5, 60000); // 5 failures, 60s reset

const result = await breaker.execute(() =>
  neurolink.generate({ input: { text: "Hello" } }),
);

// Check circuit state
console.log("Circuit state:", breaker.getState()); // closed, open, half-open
console.log("Failure count:", breaker.getFailureCount());
```

## Provider-Specific Error Codes

Some providers have additional error codes:

### SageMaker Errors

| Code                  | Description                     | HTTP Status | Retriable |
| --------------------- | ------------------------------- | ----------- | --------- |
| `VALIDATION_ERROR`    | Request validation failed       | 400         | No        |
| `MODEL_ERROR`         | Model execution error           | 500         | No        |
| `INTERNAL_ERROR`      | Internal service error          | 500         | Yes       |
| `SERVICE_UNAVAILABLE` | Service temporarily unavailable | 503         | Yes       |
| `THROTTLING_ERROR`    | Rate limit exceeded             | 429         | Yes       |
| `CREDENTIALS_ERROR`   | AWS credentials invalid         | 401         | No        |
| `NETWORK_ERROR`       | Network connectivity issue      | -           | Yes       |
| `ENDPOINT_NOT_FOUND`  | SageMaker endpoint not found    | 404         | No        |
| `UNKNOWN_ERROR`       | Unclassified error              | 500         | No        |

## Related Documentation

- [Troubleshooting Guide](/docs/reference/troubleshooting) - Common issues and solutions
- [Configuration Reference](/docs/deployment/configuration) - Environment variables and settings
- [FAQ](/docs/reference/faq) - Frequently asked questions
- [Provider Feature Compatibility](/docs/reference/provider-feature-compatibility) - Provider capabilities matrix

---

## Provider Behavior Guide

<!-- Source: reference/provider-behavior.md -->

# Provider Behavior Guide

This guide documents provider-specific behaviors, quirks, and recommended usage patterns for optimal results with NeuroLink AI providers.

## Quick Navigation

- [Provider-Specific Behaviors](#provider-specific-input-handling)
- [Testing Recommendations](#testing-recommendations)
- [Factory Pattern Integration](#factory-pattern-integration)
- [Troubleshooting](#troubleshooting-common-issues)
- [Best Practices](#best-practices)

## Related Documentation

- [API Reference](/docs/sdk/api-reference) - Complete API documentation
- [CLI Guide](/docs/cli) - Command-line interface usage
- [Factory Pattern Migration](/docs/development/factory-migration) - Factory pattern implementation
- [Streaming Guide](/docs/advanced/streaming) - Advanced streaming features

## Provider-Specific Input Handling

### Google AI Studio & Vertex AI

**Behavior**: Exhibits inconsistent behavior with certain input patterns containing domain keywords.

**Affected Inputs**:

- Inputs containing keywords like "analytics", "healthcare", "streaming" may return empty responses
- Domain-specific terminology can trigger unexpected filtering
- This affects both basic streaming AND factory-enhanced streaming equally

**Recommended Inputs**:

- ✅ "Hello world", "Count from 1 to 5", "Say hello", "Tell me a joke"
- ✅ "Write a story", "Explain concepts", "Generate code"
- ✅ Generic prompts without domain-specific keywords

**Avoid**:

- ⚠️ "Test analytics", "healthcare data", "streaming analysis"
- ⚠️ Industry-specific jargon in simple test cases
- ⚠️ Technical domain terms in basic functionality tests

**Workaround**: Use provider-friendly inputs for testing, or switch to alternative providers (OpenAI, Anthropic) for domain-specific content.

### OpenAI (GPT-4, GPT-3.5)

**Behavior**: Generally reliable with consistent responses across all input types.

**Strengths**:

- Handles domain-specific content well
- Consistent streaming performance
- Good with technical terminology

**Considerations**:

- Rate limiting may apply based on plan
- Longer response times for complex prompts
- Higher cost per token compared to some alternatives

### Anthropic Claude

**Behavior**: Excellent reasoning capabilities with consistent responses.

**Strengths**:

- Superior handling of complex, domain-specific content
- Reliable streaming with consistent chunk sizes
- Good with analytical and healthcare content

**Considerations**:

- May be more verbose than other providers
- Higher token usage for equivalent outputs
- Strong safety filtering for sensitive content

### Amazon Bedrock

**Behavior**: Enterprise-grade reliability with consistent performance.

**Strengths**:

- Excellent for production workloads
- Consistent behavior across model versions
- Good integration with AWS ecosystem

**Considerations**:

- Requires AWS credentials and proper IAM setup
- May have higher latency due to enterprise security layers
- Regional availability varies

### Azure OpenAI

**Behavior**: Similar to OpenAI with enterprise features.

**Strengths**:

- Enterprise compliance and security
- Consistent with OpenAI behavior patterns
- Good integration with Microsoft ecosystem

**Considerations**:

- Requires Azure setup and endpoint configuration
- May have different rate limits than direct OpenAI
- Additional latency due to Azure proxy layer

### Ollama (Local Models)

**Behavior**: Varies significantly by model, generally more limited tool support.

**Strengths**:

- Complete privacy (local processing)
- No API costs or rate limits
- Full control over model versions

**Considerations**:

- Limited tool execution capabilities
- Performance depends on local hardware
- Model selection affects behavior significantly
- May require specific models (e.g., gemma3n) for tool support

### Hugging Face

**Behavior**: Highly variable depending on model selection.

**Strengths**:

- Access to thousands of open-source models
- Free tier available
- Good for experimentation

**Considerations**:

- Model quality varies significantly
- Tools may be visible but not execute properly
- Response format inconsistencies
- Cold start delays for less popular models

### Mistral AI

**Behavior**: Good balance of performance and European compliance.

**Strengths**:

- GDPR compliant (European provider)
- Good reasoning capabilities
- Consistent tool execution

**Considerations**:

- Smaller context windows than some competitors
- Limited model variety compared to OpenAI/Anthropic
- Newer provider with evolving capabilities

## Testing Recommendations

### For Automated Tests

1. **Use Provider-Neutral Inputs**: Choose prompts that work consistently across all providers
   - See [CLI Guide](/docs/cli) for example commands
2. **Avoid Domain Keywords**: Use generic prompts for functionality testing
   - Reference [Factory Pattern Migration](/docs/development/factory-migration) for domain-specific usage
3. **Test Provider-Specific Features**: Separate tests for provider-specific capabilities
   - Check [API Reference](/docs/sdk/api-reference) for provider options
4. **Implement Fallback Strategies**: Design tests to handle provider variations gracefully
   - See [Streaming Guide](/docs/advanced/streaming) for robust patterns

### For Development

1. **Provider Selection**: Choose appropriate provider based on use case requirements
   - Reference [Provider Selection Guidelines](#provider-selection-guidelines) below
2. **Input Validation**: Pre-validate inputs for provider compatibility
   - Use patterns from [Factory Pattern Integration](#factory-pattern-integration) section
3. **Error Handling**: Implement robust error handling for provider-specific failures
   - See [Troubleshooting](#troubleshooting-common-issues) section for common patterns
4. **Performance Monitoring**: Track provider performance and adjust accordingly
   - Reference [API Reference](/docs/sdk/api-reference) for monitoring setup

## Provider Selection Guidelines

### For Production Applications

- **High Reliability**: OpenAI, Anthropic, Azure OpenAI
- **Enterprise Compliance**: Amazon Bedrock, Azure OpenAI
- **Cost Optimization**: Google AI Studio, Mistral AI
- **Privacy Requirements**: Ollama (local)
- **European Compliance**: Mistral AI

### For Development & Testing

- **General Development**: OpenAI, Google AI Studio
- **Domain-Specific Testing**: Anthropic, OpenAI
- **Tool Integration Testing**: OpenAI, Anthropic, Google AI Studio
- **Streaming Testing**: Any provider except Ollama (limited)

## Troubleshooting Common Issues

### Empty Responses

**Symptoms**: Provider returns empty or minimal content
**Likely Causes**: Input contains filtered keywords, provider-specific limitations
**Solutions**:

- Try alternative provider from [Provider Selection Guidelines](#provider-selection-guidelines)
- Rephrase input using [Testing Recommendations](#testing-recommendations) patterns
- Check provider status using [CLI Guide](/docs/cli)

### Inconsistent Tool Execution

**Symptoms**: Tools work sometimes but not others
**Likely Causes**: Provider-specific tool support limitations
**Solutions**:

- Use providers with full tool support (OpenAI, Anthropic, Google AI)
- Configure tools using [CLI Guide](/docs/cli)
- Debug with [API Reference](/docs/sdk/api-reference)

### Streaming Interruptions

**Symptoms**: Streaming stops mid-response
**Likely Causes**: Provider rate limits, network issues, input filtering
**Solutions**:

- Implement retry logic from [Streaming Guide](/docs/advanced/streaming)
- Check provider status and validate inputs
- Use error handling patterns from [Streaming Guide](/docs/advanced/streaming)

### Performance Variations

**Symptoms**: Significant response time differences
**Likely Causes**: Provider load, geographic location, model selection
**Solutions**:

- Implement provider rotation using [API Reference](/docs/sdk/api-reference)
- Monitor performance metrics with [Analytics Integration](/docs/sdk/api-reference)
- Optimize based on [Provider Selection Guidelines](#provider-selection-guidelines)

## Factory Pattern Integration

When using NeuroLink's factory patterns with specific providers:

### Domain Configuration

- **Provider Sensitivity**: Some providers may filter domain-specific keywords
- **Configuration Guide**: See [Factory Pattern Migration](/docs/development/factory-migration) for setup
- **Testing Strategies**: Reference [Testing Recommendations](#testing-recommendations) above

### Context Processing

- **Validation**: Ensure context data compatibility across providers
- **Implementation**: Follow patterns in [Factory Pattern Migration](/docs/development/factory-migration)
- **Debugging**: Use [API Reference](/docs/sdk/api-reference) for validation tools

### Evaluation Integration

- **Provider Variation**: Different providers may have varying evaluation accuracy
- **Setup Guide**: See [API Reference](/docs/sdk/api-reference) for configuration
- **Best Practices**: Reference [Factory Pattern Migration](/docs/development/factory-migration)

### Tool Integration

- **Compatibility Testing**: Test tool execution with each target provider
- **Configuration**: Use [CLI Guide](/docs/cli) for MCP tool setup
- **Advanced Usage**: See [Streaming Guide](/docs/advanced/streaming) for streaming with tools

## Best Practices

### General Guidelines

1. **Provider Rotation**: Use multiple providers for resilience
   - Implementation guide: [API Reference](/docs/sdk/api-reference)
2. **Input Validation**: Validate inputs for provider compatibility
   - See provider-specific sections above for validation patterns
3. **Error Handling**: Implement graceful fallbacks
   - Follow [Streaming Guide](/docs/advanced/streaming) patterns
4. **Performance Monitoring**: Track provider metrics
   - Setup: [API Reference](/docs/sdk/api-reference)
5. **Cost Management**: Monitor token usage across providers
   - Tools: [CLI Guide](/docs/cli)
6. **Testing Strategy**: Use provider-appropriate test cases
   - Reference [Testing Recommendations](#testing-recommendations) above

### Performance Optimization

- **Caching**: Implement response caching for repeated requests
- **Batch Processing**: Use batch operations where supported
- **Provider Selection**: Choose optimal providers per use case
- **Input Optimization**: Format inputs for best provider performance

## See Also

- [API Reference](/docs/sdk/api-reference) - Complete API documentation and configuration
- [CLI Guide](/docs/cli) - Command-line interface and provider testing
- [Factory Pattern Migration](/docs/development/factory-migration) - Advanced factory pattern usage
- [Streaming Guide](/docs/advanced/streaming) - Streaming functionality and error handling
- [Main Documentation](/docs/) - Getting started guide and overview

---

_This guide is maintained as part of the NeuroLink provider ecosystem. For updates or provider-specific issues, please refer to the individual provider documentation or submit an issue in the [project repository](https://github.com/juspay/neurolink)._

---

## Provider Capabilities Audit

<!-- Source: reference/provider-capabilities-audit.md -->

# Provider Capabilities Audit

Comprehensive audit of all 13 AI providers supported by NeuroLink. This document serves as the source of truth for understanding each provider's capabilities, limitations, and configuration requirements.

**Last Updated:** January 1, 2026
**NeuroLink Version:** 8.26.1

-------------- | -------- | --------- | ----- | ------ | --- | -------- | ----------------- | ------------------ |
| OpenAI            | ✓        | ✓         | ✓     | ✓      | ✗   | ✗        | ✓                 | API Key            |
| Anthropic         | ✓        | ✓         | ✓     | ✓      | ✓   | ✓        | ✓                 | API Key            |
| Google AI Studio  | ✓        | ✓         | ✓     | ✓      | ✓   | ✓        | ⚠️                | API Key            |
| Google Vertex     | ✓        | ✓         | ✓     | ✓      | ✓   | ✓        | ⚠️                | Service Account    |
| Amazon Bedrock    | ✓        | ✓         | ✓     | ⚠️     | ✓   | ✗        | ✓                 | AWS Credentials    |
| Amazon SageMaker  | ✓        | ⚠️        | ✓     | ✗      | ✗   | ✗        | ✗                 | AWS Credentials    |
| Azure OpenAI      | ✓        | ✓         | ✓     | ✓      | ✗   | ✗        | ✓                 | API Key + Endpoint |
| Mistral           | ✓        | ✓         | ✓     | ⚠️     | ✗   | ✗        | ✓                 | API Key            |
| HuggingFace       | ✓        | ✓         | ⚠️    | ✗      | ✗   | ✗        | ✗                 | API Key            |
| LiteLLM           | ✓        | ✓         | ✓     | ⚠️     | ✗   | ✗        | ✓                 | Custom             |
| Ollama            | ✓        | ✓         | ✓     | ⚠️     | ✗   | ✗        | ✗                 | None               |
| OpenAI Compatible | ✓        | ✓         | ✓     | ⚠️     | ✗   | ✗        | ✓                 | Custom             |
| OpenRouter        | ✓        | ✓         | ⚠️    | ⚠️     | ✗   | ✗        | ✓                 | API Key            |

**Legend:**

- ✓ Full Support
- ⚠️ Partial/Model-Dependent Support
- ✗ Not Supported

---

## 1. OpenAI Provider

**File:** `src/lib/providers/openAI.ts`
**Provider Name:** `openai`
**Default Model:** `gpt-4o`

### Capabilities

#### Text Generation ✓

- Full support for all GPT models
- Supports temperature, maxTokens, top_p parameters
- Multi-turn conversations

#### Streaming ✓

- Real-time token streaming via Server-Sent Events (SSE)
- Chunk-by-chunk response delivery
- Full analytics support

#### Tool Calling ✓

- Native function calling support
- Automatic tool execution
- Multi-step tool workflows
- Tool choice: auto, required, none

#### Vision/Multimodal ✓

**Supported Models:**

- GPT-5.2 series (gpt-5.2, gpt-5.2-pro) - Latest flagship
- GPT-5 series (gpt-5, gpt-5-pro, gpt-5-mini, gpt-5-nano)
- GPT-4.1 series (gpt-4.1, gpt-4.1-mini, gpt-4.1-nano)
- O-series reasoning models (o3, o3-mini, o3-pro, o4, o4-mini)
- GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-4-vision-preview

**Image Support:**

- Up to 10 images per request
- Formats: PNG, JPEG, WEBP, GIF
- Base64 and URL input

#### PDF Processing ✗

- Not natively supported
- Requires external preprocessing

#### Extended Thinking ✗

- Standard reasoning only
- No extended thinking capability

#### Structured Output ✓

- JSON schema validation
- Type-safe responses via Zod
- Response format enforcement

### Configuration

```bash
# Required
OPENAI_API_KEY=sk-...

# Optional
OPENAI_MODEL=gpt-4o
OPENAI_BASE_URL=https://api.openai.com/v1  # For proxy/custom endpoints
```

### Known Limitations

- PDF files require preprocessing to text/images
- No native extended thinking mode
- Rate limits apply per API key tier
- Context window varies by model (128K for GPT-4o)

---

## 2. Anthropic Provider

**File:** `src/lib/providers/anthropic.ts`
**Provider Name:** `anthropic`
**Default Model:** `claude-sonnet-4-5-20250929`

### Capabilities

#### Text Generation ✓

- All Claude models (3.x, 4.x, 4.5)
- Advanced reasoning capabilities
- Long context support (200K tokens)

#### Streaming ✓

- Real-time streaming with SSE
- Tool execution during streaming
- Analytics tracking

#### Tool Calling ✓

- Native tool use support
- Multi-step agentic workflows
- Tool result caching
- Parallel tool execution

#### Vision/Multimodal ✓

**Supported Models:**

- Claude 4.5 series (Sonnet, Opus, Haiku)
- Claude 4.1 and 4.0 series
- Claude 3.7 series
- Claude 3.5 series
- Claude 3 series (Opus, Sonnet, Haiku)

**Image Support:**

- Up to 20 images per request
- Formats: PNG, JPEG, WEBP, GIF
- Base64 encoding required

#### PDF Processing ✓

- Native PDF document understanding
- No preprocessing required
- Extract text, tables, and structure
- Visual analysis of PDF pages

#### Extended Thinking ✓

**Supported Models:**

- Claude 4.5 Sonnet (latest)
- Claude 4.5 Opus
- Claude 4.1 Opus
- Claude 3.7 Sonnet

**Thinking Levels:**

- `minimal` - Fast responses
- `low` - Basic reasoning
- `medium` - Moderate reasoning (default)
- `high` - Deep reasoning and analysis

#### Structured Output ✓

- JSON schema validation
- Type-safe responses
- Zod schema support

### Configuration

```bash
# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional
ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
ANTHROPIC_VERSION=2023-06-01
```

### Known Limitations

- 200K token context window (generous but finite)
- API rate limits based on tier
- Extended thinking increases latency
- PDF processing has file size limits

---

## 3. Google AI Studio Provider

**File:** `src/lib/providers/googleAiStudio.ts`
**Provider Name:** `google-ai` / `googleAiStudio`
**Default Model:** `gemini-2.5-flash`

### Capabilities

#### Text Generation ✓

- Gemini 1.5, 2.0, 2.5, and 3.0 models
- Fast inference
- Free tier available

#### Streaming ✓

- Real-time streaming
- Tool execution during streaming
- Analytics support

#### Tool Calling ✓

- Native function calling
- Parallel tool execution
- Tool result integration

#### Vision/Multimodal ✓

**Supported Models:**

- Gemini 3 series (Pro, Flash) - Preview
- Gemini 2.5 series (Pro, Flash, Flash Lite)
- Gemini 2.0 series (Flash)
- Gemini 1.5 series (Pro, Flash)

**Image Support:**

- Up to 16 images per request
- Formats: PNG, JPEG, WEBP
- Base64 and Google Cloud Storage URLs

#### PDF Processing ✓

- Native PDF understanding
- Text and visual extraction
- Document structure analysis

#### Extended Thinking ✓

**Supported Models:**

- Gemini 3 Pro (Preview)
- Gemini 2.5 Pro
- Gemini 2.5 Flash

**Thinking Levels:**

- `minimal`, `low`, `medium`, `high`
- Configurable thinking budget

#### Structured Output ⚠️

- JSON schema support
- **CRITICAL LIMITATION:** Cannot use tools AND structured output simultaneously
- When using JSON schema, must set `disableTools: true`
- Error: "Function calling with response mime type 'application/json' is unsupported"

### Configuration

```bash
# Required
GOOGLE_AI_API_KEY=AIza...

# Optional
GOOGLE_AI_MODEL=gemini-2.5-flash
```

### Known Limitations

- **Cannot combine tools + JSON schema** (Gemini limitation)
- Tools OR structured output, not both
- Free tier has rate limits
- Some features in preview/experimental

---

## 4. Google Vertex AI Provider

**File:** `src/lib/providers/googleVertex.ts`
**Provider Name:** `vertex`
**Default Model:** `gemini-2.5-flash`

### Capabilities

Same as Google AI Studio, plus:

#### Dual Provider Support

- **Gemini models** - Same as AI Studio
- **Claude models via Vertex** - Anthropic models hosted on GCP

**Anthropic on Vertex:**

- Claude 4.5 series (Sonnet, Opus, Haiku)
- Claude 4.x and 3.x series
- Full tool calling support
- No structured output limitation (unlike Gemini)

#### Text Generation ✓

- All Gemini models
- All Claude models via Vertex Anthropic
- Enterprise-grade reliability

#### Streaming ✓

- Same as AI Studio
- Works for both Gemini and Claude models

#### Tool Calling ✓

- Gemini: Full tool support (but not with schemas)
- Claude: Full tool support (can combine with schemas)

#### Vision/Multimodal ✓

- Gemini: Up to 16 images
- Claude: Up to 20 images

#### PDF Processing ✓

- Both Gemini and Claude models support PDF

#### Extended Thinking ✓

- Gemini 2.5+, Gemini 3: Full support
- Claude models: Not supported via Vertex

#### Structured Output ⚠️

- Gemini: Cannot combine with tools
- Claude: Can combine with tools

### Configuration

```bash
# Required (Option 1: Service Account File)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
VERTEX_PROJECT_ID=my-project

# Required (Option 2: Environment Variables)
GOOGLE_AUTH_CLIENT_EMAIL=...
GOOGLE_AUTH_PRIVATE_KEY=...
VERTEX_PROJECT_ID=my-project

# Optional
VERTEX_LOCATION=us-central1
VERTEX_MODEL=gemini-2.5-flash
```

### Known Limitations

- Requires Google Cloud project setup
- Service account authentication complexity
- Gemini tools + schema limitation applies
- Regional endpoint configuration

---

## 5. Amazon Bedrock Provider

**File:** `src/lib/providers/amazonBedrock.ts`
**Provider Name:** `bedrock`
**Default Model:** `anthropic.claude-3-sonnet-20240229-v1:0`

### Capabilities

#### Text Generation ✓

- Claude models on Bedrock
- Amazon Titan models
- Cohere models
- Meta Llama models
- AI21 Jurassic models

#### Streaming ✓

- Real-time streaming via AWS SDK
- Native conversation loop
- Tool execution during streaming

#### Tool Calling ✓

- Native tool support via Bedrock Converse API
- Multi-step tool workflows
- Automatic tool execution

#### Vision/Multimodal ⚠️

**Model-Dependent:**

- Claude models: Full vision support
- Titan models: Limited vision support
- Other models: Varies by model

#### PDF Processing ✓

- Claude models: Native PDF support
- Document extraction and analysis

#### Extended Thinking ✗

- Not supported via Bedrock
- Standard reasoning only

#### Structured Output ✓

- JSON schema validation
- Type-safe responses

### Configuration

```bash
# Required
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1

# Optional
BEDROCK_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
```

### Known Limitations

- Requires AWS account with Bedrock access
- Model availability varies by region
- IAM permissions required
- No extended thinking support
- Vision support depends on model

---

## 6. Amazon SageMaker Provider

**File:** `src/lib/providers/amazonSagemaker.ts`
**Provider Name:** `sagemaker`
**Default Model:** Custom endpoint

### Capabilities

#### Text Generation ✓

- Custom SageMaker endpoints
- Fine-tuned models
- Enterprise model deployments

#### Streaming ⚠️

- **Not fully implemented** (as of v8.26.1)
- Coming in next phase
- Returns 501 error currently

#### Tool Calling ✓

- Supported for compatible models
- Depends on endpoint configuration

#### Vision/Multimodal ✗

- Not supported
- Depends on custom endpoint

#### PDF Processing ✗

- Not supported

#### Extended Thinking ✗

- Not supported

#### Structured Output ✗

- Not supported via provider
- May work with custom endpoints

### Configuration

```bash
# Required
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
SAGEMAKER_ENDPOINT_NAME=my-endpoint

# Optional
SAGEMAKER_MODEL=custom-model
```

### Known Limitations

- **Streaming not fully implemented**
- Requires SageMaker endpoint deployment
- Custom model-dependent capabilities
- No built-in multimodal support
- Enterprise AWS setup required

---

## 7. Azure OpenAI Provider

**File:** `src/lib/providers/azureOpenai.ts`
**Provider Name:** `azure`
**Default Model:** `gpt-4o`

### Capabilities

#### Text Generation ✓

- All Azure OpenAI models
- GPT-4, GPT-4o, GPT-3.5-turbo
- Enterprise security and compliance

#### Streaming ✓

- Real-time streaming
- Tool execution during streaming
- Analytics support

#### Tool Calling ✓

- Full tool support
- Same as OpenAI provider
- Multi-step workflows

#### Vision/Multimodal ✓

**Supported Models:**

- GPT-5.1 series
- GPT-5 series
- GPT-4.1 series
- O-series (o3, o4)
- GPT-4o, GPT-4o-mini, GPT-4-turbo

**Image Support:**

- Up to 10 images per request
- Same formats as OpenAI

#### PDF Processing ✗

- Not natively supported

#### Extended Thinking ✗

- Not supported

#### Structured Output ✓

- JSON schema validation
- Type-safe responses

### Configuration

```bash
# Required
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4o

# Optional
AZURE_API_VERSION=2024-05-01-preview
```

### Known Limitations

- Requires Azure subscription
- Deployment configuration required
- Regional model availability varies
- No PDF or extended thinking support

---

## 8. Mistral Provider

**File:** `src/lib/providers/mistral.ts`
**Provider Name:** `mistral`
**Default Model:** `mistral-small-2506`

### Capabilities

#### Text Generation ✓

- Mistral Small, Medium, Large models
- Fast inference
- Cost-effective

#### Streaming ✓

- Real-time streaming
- Tool execution support

#### Tool Calling ✓

- Native function calling
- Tool execution workflows

#### Vision/Multimodal ⚠️

**Supported Models:**

- Mistral Small 2506 (June 2025) - Vision-capable
- Mistral Pixtral - Multimodal model

**Image Support:**

- Up to 10 images per request (conservative limit)
- Model-dependent capability

#### PDF Processing ✗

- Not supported

#### Extended Thinking ✗

- Not supported

#### Structured Output ✓

- JSON schema support
- Type-safe responses

### Configuration

```bash
# Required
MISTRAL_API_KEY=...

# Optional
MISTRAL_MODEL=mistral-small-2506
```

### Known Limitations

- Vision only on specific models (Small 2506+)
- No PDF support
- No extended thinking
- Limited multimodal compared to GPT-4o/Claude

---

## 9. HuggingFace Provider

**File:** `src/lib/providers/huggingFace.ts`
**Provider Name:** `huggingface`
**Default Model:** `microsoft/DialoGPT-medium`

### Capabilities

#### Text Generation ✓

- Access to 100,000+ models
- Open-source models
- Custom fine-tuned models

#### Streaming ✓

- Real-time streaming via unified router
- OpenAI-compatible endpoint

#### Tool Calling ⚠️

**Model-Dependent Support:**

**Supported Models:**

- Llama 3.1 series (8B, 70B, 405B Instruct)
- Llama 3.1 Nemotron Ultra
- Hermes 3 Llama 3.2
- CodeLlama 34B Instruct
- Mistral 7B Instruct v0.3

**Unsupported Models:**

- DialoGPT variants (treats tools as conversation)
- GPT-2, BERT, RoBERTa variants
- Most pre-2024 models

#### Vision/Multimodal ✗

- Not supported via unified router
- Individual model APIs may support

#### PDF Processing ✗

- Not supported

#### Extended Thinking ✗

- Not supported

#### Structured Output ✗

- Not supported via provider

### Configuration

```bash
# Required
HUGGINGFACE_API_KEY=hf_...

# Optional
HUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct
```

### Known Limitations

- Tool calling only on specific models
- No vision/multimodal support
- No PDF processing
- Model quality varies significantly
- Some models require approval/licensing

---

## 10. LiteLLM Provider

**File:** `src/lib/providers/litellm.ts`
**Provider Name:** `litellm`
**Default Model:** `openai/gpt-4o-mini`

### Capabilities

#### Text Generation ✓

- Access to 100+ models via proxy
- Unified interface for all providers
- Cost tracking and analytics

#### Streaming ✓

- Real-time streaming
- Proxies to underlying provider streams

#### Tool Calling ✓

- Full tool support
- Depends on backend model capabilities

#### Vision/Multimodal ⚠️

- Depends on backend model
- If proxying to GPT-4o: Vision supported
- If proxying to Gemini: Vision supported
- Varies by configured model

#### PDF Processing ✗

- Not supported via LiteLLM proxy

#### Extended Thinking ✗

- Not supported

#### Structured Output ✓

- JSON schema support
- Type-safe responses

### Configuration

```bash
# Required
LITELLM_BASE_URL=http://localhost:4000
LITELLM_API_KEY=sk-anything

# Optional
LITELLM_MODEL=openai/gpt-4o-mini
```

### Known Limitations

- Requires LiteLLM proxy server running
- Capabilities depend on backend provider
- Model format: `provider/model`
- Configuration complexity for enterprise setups

---

## 11. Ollama Provider

**File:** `src/lib/providers/ollama.ts`
**Provider Name:** `ollama`
**Default Model:** `llama3.1:8b`

### Capabilities

#### Text Generation ✓

- Local model execution
- Privacy-first (no data sent to cloud)
- Custom model support

#### Streaming ✓

- Real-time streaming
- Dual API mode:
  - Native Ollama API (`/api/generate`)
  - OpenAI-compatible API (`/v1/chat/completions`)

#### Tool Calling ✓

- Supported on compatible models
- Llama 3.1+ models
- Gemma 3 models with tool training

#### Vision/Multimodal ⚠️

**Model-Dependent:**

- LLaVA models - Vision support
- Gemini models - Vision support
- Llama 3.2 Vision - Vision support

**Image Support:**

- Up to 10 images (conservative limit)
- Depends on model capabilities

#### PDF Processing ✗

- Not supported

#### Extended Thinking ✗

- Not supported

#### Structured Output ✗

- Limited structured output support

### Configuration

```bash
# Optional
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b
OLLAMA_TIMEOUT=240000
OLLAMA_OPENAI_COMPATIBLE=false
```

### Known Limitations

- Local compute requirements
- Model quality varies
- No PDF support
- Vision only on specific models
- Slower inference than cloud providers

---

## 12. OpenAI Compatible Provider

**File:** `src/lib/providers/openaiCompatible.ts`
**Provider Name:** `openai-compatible`
**Default Model:** Auto-discovered or `gpt-3.5-turbo`

### Capabilities

#### Text Generation ✓

- Any OpenAI-compatible endpoint
- vLLM, FastChat, LocalAI, etc.
- Custom deployment support

#### Streaming ✓

- Real-time streaming
- OpenAI-compatible SSE

#### Tool Calling ✓

- Full tool support
- Depends on backend compatibility

#### Vision/Multimodal ⚠️

- Depends on backend endpoint
- Auto-discovery not available for capabilities

#### PDF Processing ✗

- Not supported

#### Extended Thinking ✗

- Not supported

#### Structured Output ✓

- JSON schema support
- Type-safe responses

### Configuration

```bash
# Required
OPENAI_COMPATIBLE_BASE_URL=https://api.custom.com/v1
OPENAI_COMPATIBLE_API_KEY=...

# Optional
OPENAI_COMPATIBLE_MODEL=model-name  # Auto-discovers if not set
```

### Known Limitations

- Capabilities depend entirely on backend
- No standardized capability detection
- Authentication varies by provider
- Model discovery may fail

---

## 13. OpenRouter Provider

**File:** `src/lib/providers/openRouter.ts`
**Provider Name:** `openrouter`
**Default Model:** `anthropic/claude-3-5-sonnet`

### Capabilities

#### Text Generation ✓

- Access to 300+ models from 60+ providers
- Unified API for all models
- Automatic failover
- Cost tracking

#### Streaming ✓

- Real-time streaming
- Proxies to underlying provider

#### Tool Calling ⚠️

**Model-Dependent Support:**

**Supported Models:**

- Anthropic Claude models
- OpenAI GPT-4 models
- Google Gemini models
- Mistral Large/Small models
- Meta Llama 3.3, 3.2

**Unsupported Models:**

- Many older/smaller models
- Check model page for tool support

#### Vision/Multimodal ⚠️

- Depends on selected model
- GPT-4o, Claude, Gemini support vision
- Check model-specific capabilities

#### PDF Processing ✗

- Not supported via OpenRouter

#### Extended Thinking ✗

- Not supported

#### Structured Output ✓

- JSON schema support
- Type-safe responses

### Configuration

```bash
# Required
OPENROUTER_API_KEY=sk-or-...

# Optional
OPENROUTER_MODEL=anthropic/claude-3-5-sonnet
OPENROUTER_REFERER=https://your-app.com
OPENROUTER_APP_NAME=YourApp
```

### Known Limitations

- Tool support varies by model
- Vision support varies by model
- Credit-based pricing system
- Model availability can change
- No PDF support

---

## Summary Tables

### Provider Comparison by Use Case

#### Best for Production Text Generation

1. **OpenAI** - Most reliable, best quality
2. **Anthropic** - Long context, advanced reasoning
3. **Google Vertex** - Enterprise-grade, multi-model

#### Best for Multimodal (Vision + Text)

1. **Anthropic** - Best vision + PDF support
2. **OpenAI** - Strong vision, no PDF
3. **Google AI Studio** - Good vision + PDF, free tier

#### Best for Tool Calling

1. **Anthropic** - Most advanced agentic workflows
2. **OpenAI** - Reliable function calling
3. **Google Vertex** - Dual provider (Gemini + Claude)

#### Best for Local/Privacy

1. **Ollama** - Fully local, no cloud
2. N/A - Only Ollama provides local execution

#### Best for Cost Optimization

1. **Google AI Studio** - Free tier available
2. **OpenRouter** - Access to free models
3. **LiteLLM** - Cost tracking, routing

#### Best for Extended Thinking

1. **Anthropic** - Native extended thinking
2. **Google AI Studio** - Gemini 2.5+, 3.0 thinking
3. **Google Vertex** - Same as AI Studio

### Authentication Quick Reference

| Provider          | Auth Type          | Env Vars                                                  | Complexity |
| ----------------- | ------------------ | --------------------------------------------------------- | ---------- |
| OpenAI            | API Key            | `OPENAI_API_KEY`                                          | Low        |
| Anthropic         | API Key            | `ANTHROPIC_API_KEY`                                       | Low        |
| Google AI Studio  | API Key            | `GOOGLE_AI_API_KEY`                                       | Low        |
| Google Vertex     | Service Account    | `GOOGLE_APPLICATION_CREDENTIALS`                          | High       |
| Amazon Bedrock    | AWS Credentials    | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`              | Medium     |
| Amazon SageMaker  | AWS Credentials    | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`              | High       |
| Azure OpenAI      | API Key + Endpoint | `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`           | Medium     |
| Mistral           | API Key            | `MISTRAL_API_KEY`                                         | Low        |
| HuggingFace       | API Key            | `HUGGINGFACE_API_KEY`                                     | Low        |
| LiteLLM           | Custom             | `LITELLM_BASE_URL`, `LITELLM_API_KEY`                     | Medium     |
| Ollama            | None               | Optional `OLLAMA_BASE_URL`                                | Low        |
| OpenAI Compatible | Custom             | `OPENAI_COMPATIBLE_BASE_URL`, `OPENAI_COMPATIBLE_API_KEY` | Medium     |
| OpenRouter        | API Key            | `OPENROUTER_API_KEY`                                      | Low        |

---

## Provider Implementation Notes

### BaseProvider Architecture

All providers extend `BaseProvider` class which provides:

- Unified interface for text generation and streaming
- Tool registration and execution
- Middleware support
- Analytics and telemetry
- Error handling
- Message building for multimodal content

### Dynamic Provider Loading

Providers are registered via dynamic imports in `ProviderRegistry`:

- Avoids circular dependencies
- Lazy loading for better performance
- Clean provider isolation

### Tool Execution Flow

1. Tools registered with `MCPToolRegistry`
2. Provider calls `getAllTools()` to get available tools
3. AI model receives tool definitions
4. Model calls tools during generation
5. Tool results sent back to model
6. Process repeats until completion

---

## Version History

- **v8.26.1** (January 2026) - Current version, 13 providers
- **v8.26.0** - Added video output types
- **v8.25.0** - Gemini 3 support improvements
- **v8.24.0** - Enhanced provider capabilities

---

**Next Steps:**

- See [Provider Comparison Guide](/docs/reference/provider-comparison) for feature matrix
- See [Provider Selection Wizard](/docs/reference/provider-selection) for recommendations
- See [API Reference](/docs/sdk/api-reference) for usage examples

---

## AI Provider Comparison Guide

<!-- Source: reference/provider-comparison.md -->

# AI Provider Comparison Guide

**Last Updated:** January 1, 2026
**NeuroLink Version:** 8.26.1

Complete comparison of all 13 AI providers supported by NeuroLink, including capabilities, pricing, and use case recommendations.

-------------- | ---- | ------ | ----- | ------ | --- | -------- | ---------- | --------- | ---------- |
| OpenAI            | ✓    | ✓      | ✓     | ✓      | ✗   | ✗        | ✓          | ✗         | 2 min      |
| Anthropic         | ✓    | ✓      | ✓     | ✓      | ✓   | ✓        | ✓          | ✗         | 2 min      |
| Google AI Studio  | ✓    | ✓      | ✓     | ✓      | ✓   | ✓        | ⚠️         | ✓         | 2 min      |
| Google Vertex     | ✓    | ✓      | ✓     | ✓      | ✓   | ✓        | ⚠️         | ✗         | 15 min     |
| Amazon Bedrock    | ✓    | ✓      | ✓     | ⚠️     | ✓   | ✗        | ✓          | ✗         | 10 min     |
| Amazon SageMaker  | ✓    | ⚠️     | ✓     | ✗      | ✗   | ✗        | ✗          | ✗         | 30 min     |
| Azure OpenAI      | ✓    | ✓      | ✓     | ✓      | ✗   | ✗        | ✓          | ✗         | 20 min     |
| Mistral           | ✓    | ✓      | ✓     | ⚠️     | ✗   | ✗        | ✓          | ✓         | 2 min      |
| HuggingFace       | ✓    | ✓      | ⚠️    | ✗      | ✗   | ✗        | ✗          | ✓         | 2 min      |
| LiteLLM           | ✓    | ✓      | ✓     | ⚠️     | ✗   | ✗        | ✓          | ⚠️        | 5 min      |
| Ollama            | ✓    | ✓      | ✓     | ⚠️     | ✗   | ✗        | ✗          | ✓         | 5 min      |
| OpenAI Compatible | ✓    | ✓      | ✓     | ⚠️     | ✗   | ✗        | ✓          | ⚠️        | 5 min      |
| OpenRouter        | ✓    | ✓      | ⚠️    | ⚠️     | ✗   | ✗        | ✓          | ⚠️        | 2 min      |

**Legend:**

- ✓ Full Support
- ⚠️ Partial/Model-Dependent
- ✗ Not Supported

---

## 2025 Pricing Comparison

### Pay-per-Token Providers

| Provider             | Input (per 1M tokens) | Output (per 1M tokens) | Vision         | Best Value Model              |
| -------------------- | --------------------- | ---------------------- | -------------- | ----------------------------- |
| **OpenAI**           | $2.50 - $60.00        | $10.00 - $180.00       | $5.00 - $60.00 | GPT-4o-mini: $0.15/$0.60      |
| **Anthropic**        | $3.00 - $15.00        | $15.00 - $75.00        | Same           | Claude Haiku: $0.25/$1.25     |
| **Google AI Studio** | FREE - $7.00          | FREE - $21.00          | FREE - $7.00   | Gemini 2.5 Flash: FREE        |
| **Google Vertex**    | $0.35 - $35.00        | $1.05 - $105.00        | $0.35 - $35.00 | Gemini 2.5 Flash: $0.35/$1.05 |
| **Amazon Bedrock**   | $3.00 - $15.00        | $15.00 - $75.00        | $3.00 - $15.00 | Claude Haiku: $0.25/$1.25     |
| **Azure OpenAI**     | $2.50 - $60.00        | $10.00 - $180.00       | $5.00 - $60.00 | GPT-4o-mini: $0.15/$0.60      |
| **Mistral**          | $0.25 - $8.00         | $0.75 - $24.00         | $0.25 - $8.00  | Mistral Small: $0.20/$0.60    |
| **HuggingFace**      | FREE - $1.00          | FREE - $1.00           | N/A            | DialoGPT: FREE                |
| **OpenRouter**       | $0.00 - $60.00        | $0.00 - $180.00        | Varies         | Many free models              |

### Self-Hosted / Custom Pricing

| Provider              | Model  | Cost Structure           | Notes                                             |
| --------------------- | ------ | ------------------------ | ------------------------------------------------- |
| **Amazon SageMaker**  | Custom | Instance hours + storage | Varies by instance type (ml.g5.xlarge: ~$1.41/hr) |
| **LiteLLM**           | Proxy  | Backend provider costs   | No additional fee, proxy overhead only            |
| **Ollama**            | Local  | Hardware costs only      | FREE (uses local compute)                         |
| **OpenAI Compatible** | Custom | Backend-dependent        | Varies by endpoint provider                       |

### Free Tier Details

**Google AI Studio:**

- 15 requests/minute
- 1,500 requests/day
- Up to 1M tokens/day
- Gemini 2.5 Flash completely FREE

**HuggingFace:**

- Rate-limited free tier
- 1,000 requests/month on free models
- Inference API access

**Mistral:**

- Limited free tier for testing
- Mistral Small free quota

**Ollama:**

- Completely FREE
- Uses local compute
- No API limits

**OpenRouter:**

- Many FREE models available:
  - Google Gemini 2.0 Flash (free)
  - Meta Llama 3.3 70B (free)
  - Qwen models (free)

---

## Detailed Feature Comparison

### Text Generation

**All providers support text generation**, but quality varies:

**Tier 1 (Highest Quality):**

- OpenAI GPT-4o, GPT-5 series
- Anthropic Claude 4.5 series
- Google Gemini 3 Pro

**Tier 2 (High Quality):**

- Azure OpenAI (same as OpenAI)
- Google Gemini 2.5 Pro
- Anthropic Claude 3.5 Sonnet

**Tier 3 (Good Quality):**

- Mistral Large
- Amazon Bedrock (Claude models)
- OpenRouter (Claude/GPT-4 routing)

**Tier 4 (Variable Quality):**

- HuggingFace (model-dependent)
- Ollama (model-dependent)
- LiteLLM (backend-dependent)

---

### Streaming Support

**Full Streaming (Real-time SSE):**

- ✓ OpenAI
- ✓ Anthropic
- ✓ Google AI Studio
- ✓ Google Vertex
- ✓ Amazon Bedrock
- ✓ Azure OpenAI
- ✓ Mistral
- ✓ HuggingFace
- ✓ LiteLLM
- ✓ Ollama
- ✓ OpenAI Compatible
- ✓ OpenRouter

**Partial/Limited Streaming:**

- ⚠️ Amazon SageMaker (not fully implemented in v8.26.1)

---

### Tool Calling / Function Calling

**Native Full Support:**

- ✓ OpenAI - Industry-leading function calling
- ✓ Anthropic - Advanced tool use, parallel execution
- ✓ Azure OpenAI - Same as OpenAI
- ✓ Mistral - Native function calling
- ✓ Google Vertex - Gemini + Claude models
- ✓ Google AI Studio - Gemini models
- ✓ Amazon Bedrock - Converse API tool support
- ✓ LiteLLM - Proxies to backend providers

**Model-Dependent Support:**

- ⚠️ HuggingFace - Only specific models:
  - Llama 3.1+ series
  - Hermes 3 models
  - CodeLlama 34B
  - Mistral 7B Instruct v0.3
- ⚠️ Ollama - Only compatible models:
  - Llama 3.1+
  - Gemma 3 with tool training
- ⚠️ OpenRouter - Check model capabilities:
  - Claude models: ✓
  - GPT-4 models: ✓
  - Gemini models: ✓
  - Many others vary
- ⚠️ OpenAI Compatible - Depends on backend
- ⚠️ Amazon SageMaker - Depends on custom endpoint

---

### Vision / Multimodal Capabilities

**Native Vision Support:**

**Tier 1 (Best Vision):**

- **OpenAI** - GPT-4o, GPT-5 series, O-series
  - 10 images max
  - PNG, JPEG, WEBP, GIF
- **Anthropic** - Claude 4.5, 4.x, 3.x series
  - 20 images max
  - Excellent vision quality
- **Google Vertex/AI Studio** - Gemini 2.5+, 3.x
  - 16 images max
  - Native multimodal architecture

**Tier 2 (Good Vision):**

- **Azure OpenAI** - Same models as OpenAI
  - 10 images max
- **Mistral** - Small 2506, Pixtral
  - 10 images max (conservative)

**Model-Dependent Vision:**

- ⚠️ **LiteLLM** - Depends on backend (e.g., GPT-4o via LiteLLM = vision)
- ⚠️ **Ollama** - LLaVA, Llama 3.2 Vision, Gemini models
- ⚠️ **OpenAI Compatible** - Backend-dependent
- ⚠️ **OpenRouter** - Model-dependent (Claude, GPT-4o, Gemini support vision)
- ⚠️ **Amazon Bedrock** - Claude models support vision

**No Vision Support:**

- ✗ HuggingFace
- ✗ Amazon SageMaker

---

### PDF Document Processing

**Native PDF Support:**

- ✓ **Anthropic** - Native PDF understanding (best)
- ✓ **Google AI Studio** - Gemini PDF processing
- ✓ **Google Vertex** - Gemini + Claude PDF support
- ✓ **Amazon Bedrock** - Claude models

**No PDF Support (Requires Preprocessing):**

- ✗ OpenAI
- ✗ Azure OpenAI
- ✗ Mistral
- ✗ HuggingFace
- ✗ LiteLLM
- ✗ Ollama
- ✗ OpenAI Compatible
- ✗ OpenRouter
- ✗ Amazon SageMaker

---

### Extended Thinking / Reasoning

**Native Extended Thinking:**

- ✓ **Anthropic** - Claude 4.5 Sonnet, Opus (best)
  - Thinking levels: minimal, low, medium, high
  - Transparent reasoning process
- ✓ **Google AI Studio** - Gemini 2.5+, Gemini 3
  - Thinking levels: minimal, low, medium, high
  - Configurable thinking budget
- ✓ **Google Vertex** - Same as AI Studio (Gemini only, not Claude)

**No Extended Thinking:**

- ✗ OpenAI (standard reasoning only)
- ✗ Azure OpenAI
- ✗ Amazon Bedrock
- ✗ Amazon SageMaker
- ✗ Mistral
- ✗ HuggingFace
- ✗ LiteLLM
- ✗ Ollama
- ✗ OpenAI Compatible
- ✗ OpenRouter

---

### Structured Output / JSON Schema

**Full Support (Tools + Schema Together):**

- ✓ **OpenAI** - Native JSON mode
- ✓ **Anthropic** - Full schema + tools
- ✓ **Azure OpenAI** - Same as OpenAI
- ✓ **Amazon Bedrock** - Schema validation
- ✓ **Mistral** - JSON schema support
- ✓ **LiteLLM** - Proxies to backend
- ✓ **OpenAI Compatible** - OpenAI-compatible endpoints
- ✓ **OpenRouter** - Model-dependent

**Partial Support (Tools OR Schema, Not Both):**

- ⚠️ **Google AI Studio** - ❌ Cannot combine
  - Must use `disableTools: true` with schemas
  - Gemini API limitation
- ⚠️ **Google Vertex** - ❌ Cannot combine (Gemini models only)
  - Claude models on Vertex CAN combine
  - Gemini models have same limitation as AI Studio

**No Structured Output:**

- ✗ HuggingFace
- ✗ Ollama
- ✗ Amazon SageMaker

---

## Provider Deep Dive

### 1. OpenAI

**Provider ID:** `openai`
**Default Model:** `gpt-4o`

**Strengths:**

- Industry-leading model quality
- Best-in-class developer experience
- Extensive ecosystem and integrations
- Excellent documentation
- Reliable uptime and performance

**Weaknesses:**

- Expensive at scale
- No free tier
- No PDF support
- No extended thinking

**Best For:**

- Production applications requiring highest quality
- Critical customer-facing features
- Complex reasoning tasks
- When budget allows premium pricing

**2025 Pricing:**

- GPT-4o: $2.50/$10.00 per 1M tokens
- GPT-4o-mini: $0.15/$0.60 per 1M tokens
- GPT-5 series: $15.00-$60.00 input, $45.00-$180.00 output

---

### 2. Anthropic

**Provider ID:** `anthropic`
**Default Model:** `claude-sonnet-4-5-20250929`

**Strengths:**

- **Extended thinking** - Best reasoning capabilities
- **Native PDF support** - Document understanding
- 200K token context window
- Strong safety features
- Excellent for analysis and research

**Weaknesses:**

- Higher cost than some alternatives
- Smaller ecosystem than OpenAI
- Limited regional availability

**Best For:**

- Complex reasoning and analysis
- Document processing workflows
- Agentic workflows with tools
- When extended thinking is valuable

**2025 Pricing:**

- Claude Haiku 4.5: $0.25/$1.25 per 1M tokens
- Claude Sonnet 4.5: $3.00/$15.00 per 1M tokens
- Claude Opus 4.5: $15.00/$75.00 per 1M tokens

---

### 3. Google AI Studio

**Provider ID:** `google-ai` / `googleAiStudio`
**Default Model:** `gemini-2.5-flash`

**Strengths:**

- **Generous FREE tier** - 1M tokens/day free
- **Extended thinking** - Gemini 2.5+, 3.0
- **PDF support** - Native document processing
- Fast inference (Gemini Flash models)
- Simple setup (just API key)

**Weaknesses:**

- Cannot combine tools + JSON schema (Gemini limitation)
- Rate limits on free tier
- Newer platform (less mature than OpenAI)

**Best For:**

- Startups and developers (free tier)
- Prototyping and experimentation
- Budget-conscious production apps
- When extended thinking + PDF support needed

**2025 Pricing:**

- Gemini 2.5 Flash: **FREE** (up to 1M tokens/day)
- Gemini 2.5 Pro: $1.25/$5.00 per 1M tokens
- Gemini 3 Flash: **FREE** (up to 1M tokens/day)
- Gemini 3 Pro: $7.00/$21.00 per 1M tokens

---

### 4. Google Vertex AI

**Provider ID:** `vertex`
**Default Model:** `gemini-2.5-flash`

**Strengths:**

- **Dual provider** - Gemini + Claude models
- Enterprise-grade reliability
- GCP integration
- Multiple authentication methods
- Claude models support tools + schema together

**Weaknesses:**

- Complex setup (service accounts)
- Gemini models cannot combine tools + schema
- Higher latency than AI Studio
- Requires GCP project

**Best For:**

- Enterprise Google Cloud users
- When you need both Gemini AND Claude
- Production deployments requiring SLAs
- Regulated industries

**2025 Pricing:**

- Gemini 2.5 Flash: $0.35/$1.05 per 1M tokens
- Gemini 3 Pro: $7.00/$21.00 per 1M tokens
- Claude on Vertex: Same as Bedrock pricing

---

### 5. Amazon Bedrock

**Provider ID:** `bedrock`
**Default Model:** `anthropic.claude-3-sonnet-20240229-v1:0`

**Strengths:**

- Multiple model providers (Claude, Titan, Cohere, Llama)
- AWS integration
- Enterprise security and compliance
- Pay-as-you-go pricing

**Weaknesses:**

- Complex AWS setup
- Regional model availability varies
- No extended thinking support
- Requires IAM configuration

**Best For:**

- AWS-based enterprises
- Multi-model strategies
- Compliance-heavy industries (HIPAA, SOC2)
- When you need Claude + Llama + others

**2025 Pricing:**

- Claude Haiku: $0.25/$1.25 per 1M tokens
- Claude Sonnet: $3.00/$15.00 per 1M tokens
- Claude Opus: $15.00/$75.00 per 1M tokens
- Amazon Titan: $0.30/$0.40 per 1M tokens

---

### 6. Amazon SageMaker

**Provider ID:** `sagemaker`
**Default Model:** Custom endpoint

**Strengths:**

- Custom model deployment
- Fine-tuned models
- Enterprise control
- Autoscaling infrastructure

**Weaknesses:**

- **Streaming not fully implemented** (v8.26.1)
- Complex setup (requires SageMaker endpoints)
- Higher operational overhead
- No multimodal support

**Best For:**

- Custom fine-tuned models
- Enterprise ML teams
- When you need full model control
- Specialized domain models

**2025 Pricing:**

- Instance-based: ml.g5.xlarge ~$1.41/hour
- ml.g5.2xlarge ~$2.03/hour
- Plus storage and data transfer costs

---

### 7. Azure OpenAI

**Provider ID:** `azure`
**Default Model:** `gpt-4o`

**Strengths:**

- Enterprise security and compliance
- Microsoft ecosystem integration
- SLA guarantees
- Same models as OpenAI

**Weaknesses:**

- Most complex setup of all providers
- Requires Azure subscription
- Deployment configuration required
- Limited regional availability

**Best For:**

- Enterprise Microsoft shops
- When you need SLAs and support
- Azure-based infrastructure
- Regulated industries

**2025 Pricing:**

- Same as OpenAI pricing
- Billed through Azure subscription
- GPT-4o: $2.50/$10.00 per 1M tokens
- GPT-4o-mini: $0.15/$0.60 per 1M tokens

---

### 8. Mistral

**Provider ID:** `mistral`
**Default Model:** `mistral-small-2506`

**Strengths:**

- GDPR compliant (European data centers)
- Competitive pricing
- Vision support (Small 2506+)
- Open-weight models available

**Weaknesses:**

- Smaller model selection than OpenAI
- Less ecosystem support
- Vision only on specific models
- No PDF or extended thinking

**Best For:**

- European compliance needs (GDPR)
- Cost-conscious deployments
- When you prefer European hosting
- Open-source friendly organizations

**2025 Pricing:**

- Mistral Small: $0.20/$0.60 per 1M tokens
- Mistral Medium: $2.50/$7.50 per 1M tokens
- Mistral Large: $8.00/$24.00 per 1M tokens

---

### 9. HuggingFace

**Provider ID:** `huggingface`
**Default Model:** `microsoft/DialoGPT-medium`

**Strengths:**

- Access to 100,000+ models
- Open-source focus
- Community-driven
- Free tier available

**Weaknesses:**

- Variable model quality
- Tool calling only on specific models
- No vision or multimodal
- Rate limits on free tier

**Best For:**

- Research and experimentation
- Open-source projects
- Testing cutting-edge models
- Budget-constrained projects

**2025 Pricing:**

- Free tier: 1,000 requests/month
- Inference API: From FREE to ~$1.00 per 1M tokens
- PRO tier: $9/month for higher limits

---

### 10. LiteLLM

**Provider ID:** `litellm`
**Default Model:** `openai/gpt-4o-mini`

**Strengths:**

- Access to 100+ models via proxy
- Unified interface for all providers
- Cost tracking and analytics
- Load balancing and failover

**Weaknesses:**

- Requires proxy server running
- Adds proxy overhead
- Configuration complexity
- Capabilities depend on backend

**Best For:**

- Multi-provider strategies
- Cost optimization and tracking
- Load balancing across providers
- A/B testing different models

**2025 Pricing:**

- No additional cost (uses backend provider pricing)
- Self-hosted proxy is FREE
- Cloud-hosted option available

---

### 11. Ollama

**Provider ID:** `ollama`
**Default Model:** `llama3.1:8b`

**Strengths:**

- **Completely FREE** (local execution)
- Maximum privacy (no data sent to cloud)
- Works offline
- Fast local inference
- No API rate limits

**Weaknesses:**

- Requires local compute resources
- Model quality varies
- Manual model management
- Vision only on specific models

**Best For:**

- Privacy-critical applications
- Offline/air-gapped environments
- Cost-sensitive projects
- Development and testing

**2025 Pricing:**

- **FREE** (hardware costs only)
- Requires local GPU for best performance
- No API costs or rate limits

---

### 12. OpenAI Compatible

**Provider ID:** `openai-compatible`
**Default Model:** Auto-discovered

**Strengths:**

- Works with any OpenAI-compatible endpoint
- vLLM, FastChat, LocalAI support
- Custom deployment flexibility
- Auto-discovers available models

**Weaknesses:**

- Capabilities entirely backend-dependent
- No standardized capability detection
- Configuration varies by provider
- Authentication varies

**Best For:**

- Custom deployments (vLLM, FastChat)
- Internal model serving
- Private cloud deployments
- When you control the backend

**2025 Pricing:**

- Depends entirely on backend provider
- Self-hosted: Infrastructure costs only
- Cloud-hosted: Provider-specific pricing

---

### 13. OpenRouter

**Provider ID:** `openrouter`
**Default Model:** `anthropic/claude-3-5-sonnet`

**Strengths:**

- Access to 300+ models from 60+ providers
- Many **FREE models** available
- Automatic failover
- Unified API for all models
- Cost tracking

**Weaknesses:**

- Tool support varies by model
- Vision support varies by model
- Credit-based pricing system
- Model availability can change

**Best For:**

- Access to many providers via one API
- Cost optimization (free models available)
- Rapid prototyping
- When you want provider flexibility

**2025 Pricing:**

- **Free models available:**
  - Google Gemini 2.0 Flash: FREE
  - Meta Llama 3.3 70B: FREE
  - Qwen models: FREE
- **Paid models:**
  - Claude 3.5 Sonnet: $3.00/$15.00 per 1M tokens
  - GPT-4o: $2.50/$10.00 per 1M tokens

---

## Use Case Recommendations

### For Startups (Limited Budget)

** Best Choice: Google AI Studio**

- Generous FREE tier (1M tokens/day)
- Extended thinking support
- PDF processing
- Professional quality

** Alternative: OpenRouter**

- Many free models
- Access to premium models when needed
- Cost tracking

** Alternative: Mistral**

- Competitive pricing
- Good quality
- GDPR compliant

---

### For Enterprises

** Best Choice: Amazon Bedrock**

- Enterprise security (AWS)
- Multiple model providers
- HIPAA/SOC2 compliant
- SLAs available

** Alternative: Azure OpenAI**

- Microsoft ecosystem integration
- Enterprise security
- SLA guarantees

** Alternative: Google Vertex**

- GCP integration
- Dual provider (Gemini + Claude)
- Enterprise-grade

---

### For Privacy-Conscious Users

** Best Choice: Ollama**

- 100% local execution
- No data sent to cloud
- Works offline
- Completely FREE

** Alternative: Mistral**

- GDPR compliant
- European data centers
- No training on user data

---

### For Developers/Researchers

** Best Choice: HuggingFace**

- 100,000+ models
- Open-source focus
- Cutting-edge research models
- Community support

** Alternative: LiteLLM**

- Test multiple providers easily
- Cost tracking
- Unified interface

---

### For Complex Reasoning

** Best Choice: Anthropic**

- **Extended thinking** (best)
- 200K context window
- Native PDF support
- Advanced tool use

** Alternative: Google AI Studio**

- Extended thinking (Gemini 2.5+, 3)
- FREE tier
- PDF support

---

### For Multimodal (Vision + Text + PDF)

** Best Choice: Anthropic**

- Best vision quality (20 images)
- Native PDF support
- Extended thinking

** Alternative: Google AI Studio**

- Good vision (16 images)
- PDF support
- Extended thinking
- FREE tier

** Alternative: OpenAI**

- Excellent vision (10 images)
- Industry-leading quality
- No PDF support

---

## Cost Optimization Strategies

### 1. Tier-Based Strategy

```typescript
// Use free tier for development
const devProvider = "google-ai"; // FREE

// Use mid-tier for staging
const stagingProvider = "mistral"; // Low cost

// Use premium for production
const prodProvider = "anthropic"; // High quality
```

### 2. Task-Based Routing

```typescript
// Simple tasks → Cheap models
if (taskComplexity === "simple") {
  provider = "google-ai"; // FREE Gemini Flash
}

// Complex reasoning → Premium models
if (taskComplexity === "complex") {
  provider = "anthropic"; // Extended thinking
}

// Vision tasks → Vision-capable models
if (hasImages) {
  provider = "openai"; // Good vision
}
```

### 3. Hybrid Approach

```typescript
// Use local for privacy-sensitive
if (sensitiveData) {
  provider = "ollama"; // Local, FREE
}

// Use cloud for complex tasks
if (needsAdvancedReasoning) {
  provider = "anthropic"; // Extended thinking
}
```

---

## Quick Decision Tree

```
Need highest quality?
├─ Yes → OpenAI or Anthropic
└─ No → Continue
    │
    Need extended thinking?
    ├─ Yes → Anthropic (best) or Google AI Studio (free)
    └─ No → Continue
        │
        Need complete privacy?
        ├─ Yes → Ollama (local, free)
        └─ No → Continue
            │
            Need PDF processing?
            ├─ Yes → Anthropic or Google AI Studio or Vertex
            └─ No → Continue
                │
                On AWS?
                ├─ Yes → Bedrock
                └─ No → Continue
                    │
                    On Azure?
                    ├─ Yes → Azure OpenAI
                    └─ No → Continue
                        │
                        Need free tier?
                        ├─ Yes → Google AI Studio (best) or OpenRouter or HuggingFace
                        └─ No → Continue
                            │
                            Need EU compliance?
                            ├─ Yes → Mistral AI (GDPR)
                            └─ No → Continue
                                │
                                Need many models?
                                ├─ Yes → OpenRouter (300+ models) or HuggingFace (100k+ models)
                                └─ No → OpenAI (industry standard)
```

---

## Security & Compliance

### Most Secure

1. **Ollama** - Completely local, no cloud transmission
2. **Azure OpenAI** - Enterprise security, Microsoft backing
3. **Amazon Bedrock** - AWS security features, HIPAA-ready

### Compliance Certifications

| Provider         | GDPR | HIPAA | SOC2 | ISO 27001 |
| ---------------- | ---- | ----- | ---- | --------- |
| OpenAI           | ✓    | ✓\*   | ✓    | ✓         |
| Anthropic        | ✓    | ✓\*   | ✓    | ✓         |
| Google AI Studio | ✓    | ✗     | ✓    | ✓         |
| Google Vertex    | ✓    | ✓\*   | ✓    | ✓         |
| Amazon Bedrock   | ✓    | ✓\*   | ✓    | ✓         |
| Azure OpenAI     | ✓    | ✓\*   | ✓    | ✓         |
| Mistral          | ✓    | ✗     | ✓    | ✓         |
| Ollama           | ✓    | ✓     | N/A  | N/A       |

\* HIPAA compliance requires Business Associate Agreement (BAA)

---

## Performance Benchmarks

### Average Latency (Time to First Token)

| Provider         | TTFT (ms) | Tokens/sec | Quality Score |
| ---------------- | --------- | ---------- | ------------- |
| Ollama (local)   | 50-200    | 30-50      | 8.5/10        |
| OpenAI           | 300-800   | 40-60      | 9.5/10        |
| Anthropic        | 400-900   | 35-55      | 9.4/10        |
| Google AI Studio | 300-700   | 45-65      | 9.0/10        |
| Azure OpenAI     | 350-850   | 40-60      | 9.5/10        |
| Mistral          | 300-700   | 40-55      | 8.8/10        |
| OpenRouter       | 400-1000  | 30-50      | 8.5-9.5/10    |

_Note: Benchmarks vary by model, region, and load_

---

## Migration Guide

### From OpenAI to Anthropic

**Why migrate:**

- Extended thinking
- PDF support
- Better for complex analysis

**Code changes:**

```typescript
// Before
const result = await neurolink.generate({
  provider: "openai",
  model: "gpt-4o",
  prompt: "Analyze this document",
});

// After
const result = await neurolink.generate({
  provider: "anthropic",
  model: "claude-sonnet-4-5-20250929",
  prompt: "Analyze this document",
  thinkingLevel: "high", // New capability
});
```

### From Paid to Free (Google AI Studio)

**Why migrate:**

- FREE tier (1M tokens/day)
- Extended thinking
- PDF support

**Cost savings:**

- OpenAI GPT-4o: ~$15/day for 1M tokens
- Google AI Studio: **$0/day for 1M tokens**
- **Savings: $450/month**

---

## Conclusion

**Choose based on priorities:**

1. **Budget Priority** → Google AI Studio (free) or OpenRouter (free models)
2. **Quality Priority** → OpenAI or Anthropic
3. **Privacy Priority** → Ollama (local)
4. **Reasoning Priority** → Anthropic (extended thinking)
5. **Document Priority** → Anthropic or Google AI Studio (PDF support)
6. **Compliance Priority** → Azure OpenAI or Bedrock
7. **Flexibility Priority** → OpenRouter (300+ models)

**NeuroLink Advantage:**

- Switch providers anytime (single line of code)
- Use multiple providers simultaneously
- Test and compare providers easily
- No vendor lock-in

See also:

- [Provider Capabilities Audit](/docs/reference/provider-capabilities-audit) - Detailed technical capabilities
- [Provider Selection Wizard](/docs/reference/provider-selection) - Interactive decision guide

---

## Provider Feature Compatibility Reference

<!-- Source: reference/provider-feature-compatibility.md -->

# Provider Feature Compatibility Reference

**Last Updated:** 2025-12-31
**Test Suite:** continuous-test-suite.ts (19 comprehensive tests)
**Providers Tested:** 11 providers across CSV, PDF, MCP tools, business tools, and enterprise features

----------------- | ------------ | -------- | ---------- | --------------------------------------------- |
| **Google AI Studio** | 19/19 (100%) | 401s     | ✅ Perfect | Fast prototyping, full multimodal support     |
| **Vertex AI**        | 19/19 (100%) | 449s     | ✅ Perfect | Enterprise deployments, excellent performance |
| **OpenAI**           | 19/19 (100%) | 1413s    | ✅ Perfect | Industry standard, comprehensive features     |
| **LiteLLM**          | 19/19 (100%) | 552s     | ✅ Perfect | Universal proxy for 100+ models               |

**All features supported:**

- ✅ CSV processing (6/6 tests)
- ✅ PDF processing (6/6 tests)
- ✅ MCP external tools (4/4 tests)
- ✅ Business tools (2/2 tests)
- ✅ Enterprise features (1/1 test)

---

## Complete Feature Support Matrix

| Provider             | CSV    | PDF    | MCP Tools | Business Tools | Structured Output | Enterprise | Score     | Status       |
| -------------------- | ------ | ------ | --------- | -------------- | ----------------- | ---------- | --------- | ------------ |
| **Google AI Studio** | ✅ 6/6 | ✅ 6/6 | ✅ 4/4    | ✅ 2/2         | ⚠️ Partial\*      | ✅ 1/1     | **19/19** | Production   |
| **Vertex AI**        | ✅ 6/6 | ✅ 6/6 | ✅ 4/4    | ✅ 2/2         | ⚠️ Partial\*      | ✅ 1/1     | **19/19** | Production   |
| **LiteLLM**          | ✅ 6/6 | ✅ 6/6 | ✅ 4/4    | ✅ 2/2         | ✅ Full           | ✅ 1/1     | **19/19** | Production   |
| **OpenAI**           | ✅ 6/6 | ✅ 6/6 | ✅ 4/4    | ✅ 2/2         | ✅ Full           | ✅ 1/1     | **19/19** | Production   |
| **Azure OpenAI**     | ✅ 6/6 | ❌ 0/6 | ✅ 4/4    | ✅ 2/2         | ✅ Full           | ✅ 1/1     | 13/19     | Production\* |
| **Mistral**          | ✅ 6/6 | ❌ 0/6 | ⚠️ 2/4    | ❌ 0/2         | ✅ Full           | ✅ 1/1     | 9/19      | Development  |
| **Ollama**           | ⚠️ 3/6 | ⚠️ 1/6 | ❌ 0/4    | ❌ 0/2         | ⚠️ Limited        | ✅ 1/1     | 7/19      | Development  |
| **Anthropic**        |  0/6 |  0/6 |  0/4    |  0/2         | ✅ Full           | ✅ 1/1     | 2/19\*\*  | Config       |
| **Bedrock**          |  0/6 |  0/6 |  0/4    |  0/2         | ✅ Full           | ✅ 1/1     | 2/19\*\*  | Config       |
| **Hugging Face**     |  0/6 |  0/6 |  0/4    |  0/2         | ⚠️ Limited        | ✅ 1/1     | 2/19\*\*  | Config       |
| **SageMaker**        |  0/6 |  0/6 |  0/4    |  0/2         | ⚠️ Limited        | ✅ 1/1     | 2/19\*\*  | Config       |

\*Google providers: Cannot combine tools + schemas (use `disableTools: true`). Google API limitation, not NeuroLink bug.

**Legend:**

- ✅ Fully supported
- ⚠️ Partially supported
- ❌ Not supported (technical limitation)
-  Configuration/billing issue
- \* Production-ready for non-PDF workloads
- \*\* Configuration issue, not technical limitation

---

## Model-Level Feature Compatibility

### Gemini 3 Models

| Model              | Streaming | Tools | Vision | Extended Thinking | JSON Schema |
| ------------------ | --------- | ----- | ------ | ----------------- | ----------- |
| **gemini-3-flash** | ✓         | ✓     | ✓      | ✓                 | ✓†          |
| **gemini-3-pro**   | ✓         | ✓     | ✓      | ✓                 | ✓†          |

†**JSON Schema Limitation:** Gemini 3 models support JSON Schema for structured output, but **cannot combine tools with JSON Schema** in the same request. When using structured output with a schema, you must disable tools by setting `disableTools: true`. This is a Google API limitation, not a NeuroLink bug.

**Example Usage:**

```typescript
// Structured output with JSON Schema (tools must be disabled)
await neurolink.generate({
  prompt: "Extract user information",
  schema: UserSchema,
  provider: "google-ai-studio",
  model: "gemini-3-flash",
  disableTools: true, // Required when using schema
});

// Tools work normally without schema
await neurolink.generate({
  prompt: "Search for documents",
  provider: "google-ai-studio",
  model: "gemini-3-pro",
  // Tools enabled by default
});
```

---

## Provider Tier Classification

### Tier 1: Perfect (100%) - Production Ready for All Features ⭐⭐⭐

**Recommended for production use with full feature support**

#### Google AI Studio

- **Score:** 19/19 (100%)
- **Duration:** 401 seconds
- **Strengths:** Fastest test execution, reliable, full multimodal support
- **Use Cases:**
  - Rapid prototyping with free tier
  - Production deployments requiring speed
  - Full CSV + PDF + image processing
  - MCP tool integration
- **Setup:** Simple API key configuration

#### Vertex AI

- **Score:** 19/19 (100%)
- **Duration:** 449 seconds
- **Strengths:** Enterprise-grade, excellent performance, Google Cloud integration
- **Use Cases:**
  - Enterprise deployments with SLA requirements
  - Google Cloud Platform integration
  - Multi-region deployments
  - Advanced analytics pipelines
- **Setup:** GCP service account or ADC

#### OpenAI

- **Score:** 19/19 (100%)
- **Duration:** 1413 seconds (slower due to rate limits)
- **Strengths:** Industry standard, comprehensive ecosystem, extensive documentation
- **Use Cases:**
  - Production applications requiring proven stability
  - Integration with OpenAI ecosystem
  - GPT-4o and o1 model access
- **Setup:** API key configuration
- **Note:** Longer duration due to conservative rate limiting (30,000 TPM)

#### LiteLLM

- **Score:** 19/19 (100%)
- **Duration:** 552 seconds
- **Strengths:** Universal proxy for 100+ models, automatic load balancing
- **Use Cases:**
  - Multi-provider routing and fallback
  - Access to 100+ models through single interface
  - Cost optimization across providers
  - Load balancing and caching
- **Setup:** LiteLLM proxy server + provider credentials

### Structured Output Support Details

**Full Support (✅):**

- OpenAI, Anthropic, Azure OpenAI, Bedrock, Mistral, LiteLLM
- Can use tools and schemas simultaneously
- No configuration required

**Partial Support (⚠️):**

- **Google AI Studio** and **Vertex AI (Gemini models)**
- **Limitation:** Cannot combine tools with schemas
- **Solution:** Use `disableTools: true` when using schemas
- **Reason:** Google API limitation (documented by Google)
- **Future:** Future Gemini versions may support both - check official documentation for updates

**Example:**

```typescript
// Google providers require disableTools
await neurolink.generate({
  schema: MySchema,
  provider: "vertex",
  disableTools: true, // Required for Google
});

// Other providers work without restriction
await neurolink.generate({
  schema: MySchema,
  provider: "openai", // No restriction
});
```

---

### Tier 2: Good (68%) - Production Ready for CSV + Tools ⭐⭐

**Recommended for production use when PDF support is not required**

#### Azure OpenAI

- **Score:** 13/19 (68.4%)
- **Duration:** 351 seconds
- **Status:** ⚠️ Production-ready with limitations

**✅ Passing Tests (13/19):**

- ✅ CSV processing (6/6) - All CSV tests pass
- ✅ MCP external tools (4/4) - Full tool integration support
- ✅ Business tools (2/2) - Custom tool execution works
- ✅ Enterprise features (1/1) - Proxy and compliance support

**❌ Failing Tests (6/19):**

- ❌ All PDF tests (6/6) - Model limitation
  - CLI Generate PDF
  - CLI Stream PDF
  - CLI Stream Two PDF Comparison
  - CLI Stream PDF and CSV
  - SDK Generate PDF
  - SDK Stream PDF

**Root Cause:**

```
Error: Invalid Value: 'file'. This model does not support file content types.
```

**Technical Explanation:** Azure OpenAI models do not support the file content type required for PDF processing in the Vercel AI SDK. This is a **model architecture limitation**, not a configuration issue.

**Production Recommendation:**

- ✅ Use for: CSV data analysis, MCP tool integration, business logic
- ❌ Avoid for: PDF processing
-  Fallback strategy: Use Vertex AI or Google AI Studio for PDF requirements

---

### Tier 3: Partial (36-47%) - Development/Testing Only ⭐

**NOT recommended for production use - limited feature support**

#### Mistral AI

- **Score:** 9/19 (47.4%)
- **Duration:** 363 seconds
- **Status:** ⚠️ Development/testing only

**✅ Passing Tests (9/19):**

- ✅ CSV processing (6/6) - All CSV tests pass
- ✅ SDK tools (2/2) - SDK Generate and Stream work
- ✅ Enterprise features (1/1) - Proxy support

**❌ Failing Tests (10/19):**

- ❌ All PDF tests (6/6) - API limitation
- ❌ CLI external tools (2/2) - CLI tool integration issues
- ❌ Business tools (2/2) - Limited tool support

**Root Cause (PDF failures):**

```
Error: UnsupportedFunctionalityError: 'File content parts in user messages' functionality not supported.
```

**Technical Explanation:** Mistral's API fundamentally does not support file content parts in user messages. This is a **core API limitation**, not a bug or configuration issue.

**Production Recommendation:**

- ✅ Use for: CSV data analysis in SDK mode
- ❌ Avoid for: PDF processing, CLI tool integration
-  Reference: See `MISTRAL_PDF_FIX_SUMMARY.md` for detailed investigation

#### Ollama

- **Score:** 7/19 (36.8%)
- **Duration:** 1236 seconds
- **Status:** ⚠️ Local development only

**✅ Passing Tests (7/19):**

- ✅ Some CSV tests (3/6) - Partial support
- ✅ SDK tools (2/2) - Basic tool execution
- ✅ CLI Stream PDF and CSV (1/1) - Limited multimodal
- ✅ Enterprise features (1/1) - Local proxy support

**❌ Failing Tests (12/19):**

- ❌ Most CSV tests (3/6) - Inconsistent results
- ❌ Most PDF tests (5/6) - Model-dependent
- ❌ CLI external tools (2/2) - Tool integration issues
- ❌ Business tools (2/2) - Limited support

**Technical Explanation:** Ollama is designed for local model execution. Performance and feature support varies significantly based on the specific model being used (Llama, Mistral, etc.).

**Production Recommendation:**

- ✅ Use for: Local development, privacy-critical testing
- ❌ Avoid for: Production workloads, consistent behavior requirements
-  Best for: Experimentation with local models

---

### Tier 4: Limited (10.5%) - Configuration Issues Only

**Configuration/billing issues preventing testing - NOT technical limitations**

These providers are currently limited to 2/19 tests passing due to configuration or billing issues, **not technical capabilities**. With proper setup, they are expected to achieve much higher compatibility scores.

| Provider         | Score        | Issue Type     | Fix Required       | Expected Score After Fix |
| ---------------- | ------------ | -------------- | ------------------ | ------------------------ |
| **Anthropic**    | 2/19 (10.5%) |  Billing     | Add API credits    | 90%+ (full multimodal)   |
| **Bedrock**      | 2/19 (10.5%) |  Credentials | Fix AWS token      | 70%+ (model-dependent)   |
| **Hugging Face** | 2/19 (10.5%) |  Billing     | Add payment method | 60%+ (model-dependent)   |
| **SageMaker**    | 2/19 (10.5%) |  Credentials | Fix AWS token      | 60%+ (model-dependent)   |

#### Anthropic (Claude) - API Credit Exhaustion

**Error:**

```
APICallError: Your credit balance is too low to access the Anthropic API.
Please go to Plans & Billing to upgrade or purchase credits.
```

**Status:** All 17 test failures are due to insufficient API credits, **NOT** technical limitations.

**Passing Tests (2/19):**

- ✅ CLI Stream CSV and Screenshot (skipped - no fixture available)
- ✅ Enterprise Proxy Support (no API call required)

**Expected Capability:** Anthropic Claude models (3.5 Sonnet, 3.7 Sonnet) support multimodal content including images and PDFs. Expected to achieve **90%+ compatibility** once credits are added.

**Fix:** Add credits at https://console.anthropic.com/settings/plans

#### AWS Bedrock - Credential Issue

**Error:**

```
BedrockServiceException: The security token included in the request is invalid
Region: ap-south-1
```

**Status:** AWS credentials are invalid or expired.

**Passing Tests (2/19):**

- ✅ CLI Stream CSV and Screenshot (skipped)
- ✅ Enterprise Proxy Support

**Expected Capability:** Bedrock provides access to multiple foundation models (Claude, Llama, Titan) and should support multimodal features once credentials are configured. Expected **70%+ compatibility** (varies by model).

**Fix:**

```bash
# Check current credentials
aws sts get-caller-identity

# Configure valid credentials
aws configure
```

#### Hugging Face - Payment Required

**Error:**

```
APICallError: Payment Required
```

**Status:** Payment/billing configuration needed.

**Passing Tests (2/19):**

- ✅ CLI Stream CSV and Screenshot (skipped)
- ✅ Enterprise Proxy Support

**Expected Capability:** Hugging Face provides access to open-source models via inference endpoints. Multimodal support depends on selected model. Expected **60%+ compatibility** after billing setup.

**Fix:** Add payment method to Hugging Face account

#### AWS SageMaker - Credential Issue

**Error:**

```
SageMaker endpoint invocation failed: The security token included in the request is invalid
```

**Status:** AWS credentials are invalid or expired (same as Bedrock).

**Passing Tests (2/19):**

- ✅ CLI Stream CSV and Screenshot (skipped)
- ✅ Enterprise Proxy Support

**Expected Capability:** SageMaker allows deployment of custom models. Feature support depends on the deployed model. Expected **60%+ compatibility** after credential fix.

**Fix:** Update AWS credentials (same as Bedrock)

---

## Technical Limitations Summary

### Azure OpenAI

- **Limitation:** Model does not support file content type for PDFs
- **Impact:** Cannot process PDF documents natively
- **Workaround:** Extract text from PDFs before sending to Azure, or use fallback provider
- **Affected Features:** All PDF processing (6 tests)

### Mistral

- **Limitation:** API does not support file content parts in user messages
- **Impact:** Cannot process PDF documents at all
- **Workaround:** None available - fundamental API limitation
- **Affected Features:** All PDF processing (6 tests), CLI tool integration (2 tests)
- **Reference:** See `MISTRAL_PDF_FIX_SUMMARY.md` for investigation details

### Ollama

- **Limitation:** Local model performance varies significantly by model
- **Impact:** Inconsistent results across different models and operations
- **Workaround:** Carefully select models, use for development/testing only
- **Affected Features:** Various tests show inconsistent behavior

---

## Production Deployment Recommendations

### For Maximum Feature Compatibility (100%)

**Recommended Providers:**

- **Google AI Studio** - Best for: Speed, free tier, prototyping
- **Vertex AI** - Best for: Enterprise, GCP integration, SLA requirements
- **OpenAI** - Best for: Proven stability, ecosystem integration
- **LiteLLM** - Best for: Multi-provider routing, 100+ model access

**All features available:**

- ✅ CSV data analysis
- ✅ PDF document processing
- ✅ Image analysis
- ✅ MCP external tool integration
- ✅ Custom business tools
- ✅ Enterprise proxy support

### For CSV + Tools (No PDFs Required)

**Recommended Providers:**

- **Azure OpenAI** - Best for: Microsoft ecosystem, enterprise security, Azure integration

**Features available:**

- ✅ CSV data analysis (68% compatibility)
- ✅ MCP external tools
- ✅ Custom business tools
- ✅ Enterprise features
- ❌ PDF processing (use fallback provider)

**Fallback Strategy:**

```typescript
// Primary provider for CSV and tools
const primaryProvider = "azure";

// Fallback to Vertex for PDF processing
const pdfProvider = "vertex";

if (hasPDFFiles(input)) {
  result = await neurolink.generate({ ...options, provider: pdfProvider });
} else {
  result = await neurolink.generate({ ...options, provider: primaryProvider });
}
```

### For Development/Testing

**Recommended Providers:**

- **Mistral** - Best for: CSV-only workflows, European compliance
- **Ollama** - Best for: Local development, privacy testing

**Use Cases:**

- CSV data analysis only
- Privacy-critical testing
- Local development without cloud dependencies
- Experimentation with different models

**Not Recommended For:**

- Production deployments
- PDF processing requirements
- Critical business workflows

---

## Test Suite Details

### Test Categories (19 total tests)

#### CSV Processing Tests (6 tests)

1. CLI Generate CSV - Generate mode with CSV input
2. CLI Stream CSV - Streaming mode with CSV input
3. CLI Stream Two CSV Comparison - Compare multiple CSV files
4. CLI Stream CSV and Screenshot - Mixed CSV and image analysis
5. SDK Generate CSV - SDK generate with CSV
6. SDK Stream CSV - SDK streaming with CSV

#### PDF Processing Tests (6 tests)

7. CLI Generate PDF - Generate mode with PDF input
8. CLI Stream PDF - Streaming mode with PDF input
9. CLI Stream Two PDF Comparison - Compare multiple PDF files
10. CLI Stream PDF and CSV - Mixed PDF and CSV analysis
11. SDK Generate PDF - SDK generate with PDF
12. SDK Stream PDF - SDK streaming with PDF

#### MCP External Tools Tests (4 tests)

13. CLI Generate - External MCP tools via CLI generate
14. CLI Stream - External MCP tools via CLI stream
15. SDK Generate - External MCP tools via SDK generate
16. SDK Stream - External MCP tools via SDK stream

#### Business Tools Tests (2 tests)

17. SDK Business Tools - Custom tool registration and execution
18. CLI Business Tools - Custom tools via CLI interface

#### Enterprise Features Tests (1 test)

19. Enterprise Proxy Support - Proxy configuration and environment handling

### Test Execution

**Sequential Execution:** Tests run one provider at a time to avoid resource contention and rate limit issues.

**Rate Limiting:**

- OpenAI: 60-second delay between tests (30,000 TPM limit)
- Other providers: 10-second delay between tests

**Total Duration:** Approximately 30-40 minutes for all 11 providers

---

## Configuration Fixes Needed

### Immediate Actions Required

1. **Anthropic:** Add API credits
   - URL: https://console.anthropic.com/settings/plans
   - Expected improvement: 2/19 → 17+/19 (90%+)

2. **Bedrock:** Fix AWS credentials

   ```bash
   aws configure
   # Or update AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
   ```

   - Expected improvement: 2/19 → 13+/19 (70%+)

3. **SageMaker:** Fix AWS credentials (same as Bedrock)
   - Expected improvement: 2/19 → 11+/19 (60%+)

4. **Hugging Face:** Add payment method
   - URL: https://huggingface.co/settings/billing
   - Expected improvement: 2/19 → 11+/19 (60%+)

### No Fix Available

1. **Azure OpenAI:** PDF limitation is a model architecture constraint
   - Recommendation: Use for CSV and tools, fallback to Vertex/Google AI Studio for PDFs

2. **Mistral:** PDF limitation is a fundamental API constraint
   - Recommendation: Use for CSV-only workflows in SDK mode

---

## Test Logs

All test logs are available in `/tmp/neurolink-sequential-tests/`:

- `test-openai.log` - OpenAI 19/19 (100%)
- `test-vertex.log` - Vertex 19/19 (100%)
- `test-google-ai-studio.log` - Google AI Studio 19/19 (100%)
- `test-litellm.log` - LiteLLM 19/19 (100%)
- `test-azure.log` - Azure 13/19 (68%)
- `test-mistral.log` - Mistral 9/19 (47%)
- `test-ollama.log` - Ollama 7/19 (37%)
- `test-anthropic.log` - Anthropic 2/19 (billing issue)
- `test-bedrock.log` - Bedrock 2/19 (credential issue)
- `test-huggingface.log` - Hugging Face 2/19 (billing issue)
- `test-sagemaker.log` - SageMaker 2/19 (credential issue)

---

## Recent Fixes and Improvements

### Fix 1: File Handling System Prompt Enhancement (2025-11-02)

**Providers affected:** OpenAI, Vertex AI
**Issue:** AI attempting to use GitHub MCP `get_file_contents` for local files
**Root Cause:** File paths visible in context, AI confused about tool usage

**Solution:** Enhanced system prompt in `src/lib/utils/messageBuilder.ts` (lines 622-657) with file handling guidance:

```typescript
if (hasCSVFiles || hasPDFFiles) {
  systemPrompt += `\n\nIMPORTANT FILE HANDLING INSTRUCTIONS:
- File content (${fileTypes.join(", ")}, images) is already processed and included in this message
- DO NOT use GitHub tools (get_file_contents, search_code, etc.) for local files
- Analyze the provided file content directly without attempting to fetch files
- GitHub MCP tools are ONLY for remote repository operations
- Use the file content shown in this message for your analysis`;
}
```

**Result:**

- OpenAI: 18/19 → 19/19 (100%)
- Vertex: CLI Stream PDF and CSV test passing

### Fix 2: Case-Insensitive Test Validation (2025-11-02)

**Provider affected:** Vertex AI
**Issue:** Test expecting "strict" but Vertex responding "Strict mode"
**Root Cause:** Case-sensitive string matching with provider-specific capitalization

**Solution:** Case-insensitive comparison in `test/continuous-test-suite.ts` (lines 801-806):

```typescript
// Before
const foundData = expectedData.filter((data) => result.content.includes(data));

// After
const contentLower = result.content.toLowerCase();
const foundData = expectedData.filter((data) =>
  contentLower.includes(data.toLowerCase()),
);
```

**Result:** Vertex: 18/19 → 19/19 (100%)

---

## Conclusion

**Primary Achievement:** ✅ **4 providers at 100% compatibility**

The comprehensive testing reveals a mature ecosystem with multiple production-ready providers. Most "failures" are configuration/billing issues rather than technical limitations.

**Key Insights:**

1. **Production-Ready Options:** 4 providers (Google AI Studio, Vertex AI, OpenAI, LiteLLM) provide full feature support
2. **Partial Support is Useful:** Azure OpenAI at 68% is excellent for non-PDF workloads
3. **Technical Limitations are Clear:** Only Azure and Mistral have actual feature limitations
4. **Configuration is Key:** 4 providers need credential/billing fixes, not code changes

**Next Steps for Users:**

1. **For new projects:** Start with Google AI Studio (free tier) or Vertex AI (enterprise)
2. **For existing Azure users:** Use Azure for CSV/tools, add Vertex fallback for PDFs
3. **For cost optimization:** Implement LiteLLM routing across multiple providers
4. **For privacy:** Use Ollama for local development and testing

**Maintenance:**

- Re-run test suite after provider API updates
- Monitor provider changelog for new feature releases
- Update this document quarterly or when adding new providers

---

## Troubleshooting Guide

<!-- Source: reference/troubleshooting.md -->

# Troubleshooting Guide

This guide helps you diagnose and resolve common issues when working with NeuroLink. For detailed troubleshooting of specific features, see the main [Troubleshooting documentation](/docs/reference/troubleshooting).

## Quick Diagnostics

Before diving into specific issues, try these quick diagnostics:

```bash
# 1. Check NeuroLink version
npx @juspay/neurolink --version

# 2. Verify environment variables
echo $OPENAI_API_KEY
echo $ANTHROPIC_API_KEY
echo $REDIS_URL

# 3. Test basic connectivity
npx @juspay/neurolink generate "test" --provider openai

# 4. Enable debug logging
DEBUG=neurolink:* node your-app.js
```

## Authentication Issues

### API Key Errors

**Symptoms:**

- `Invalid API key`
- `401 Unauthorized`
- `Authentication failed`

**Solutions:**

#### 1. Verify API Key Format

```bash
# OpenAI keys start with sk-
echo $OPENAI_API_KEY | grep "^sk-"

# Anthropic keys start with sk-ant-
echo $ANTHROPIC_API_KEY | grep "^sk-ant-"

# Google AI Studio keys are alphanumeric
echo $GOOGLE_AI_API_KEY
```

#### 2. Check Key Scope/Permissions

Some keys have limited permissions:

- OpenAI: Check key permissions in dashboard
- Anthropic: Verify key hasn't expired
- Google: Ensure API is enabled in Cloud Console

#### 3. Environment Variable Loading

```typescript
// Verify env vars are loaded
console.log("OpenAI key:", process.env.OPENAI_API_KEY?.slice(0, 8) + "...");

// Use dotenv explicitly
require("dotenv").config();
```

#### 4. Key in Wrong Environment

```bash
# Production vs Development keys
# Check .env.production vs .env.development

# List all env files
ls -la .env*
```

### OAuth/Service Account Issues

**Symptoms:**

- `Service account authentication failed`
- `Invalid credentials` for GCP/Azure
- `Token expired` errors

**Solutions:**

#### 1. Google Cloud (Vertex AI)

```bash
# Verify service account
gcloud auth application-default print-access-token

# Set credentials
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Or in code:
const neurolink = new NeuroLink({
  provider: "vertex",
  googleApplicationCredentials: "./service-account.json",
});
```

#### 2. Azure OpenAI

```typescript
const neurolink = new NeuroLink({
  provider: "azure",
  azureEndpoint: process.env.AZURE_OPENAI_ENDPOINT,
  azureApiKey: process.env.AZURE_OPENAI_KEY,
  azureDeployment: "gpt-4",
});
```

#### 3. AWS Bedrock

```bash
# Configure AWS credentials
aws configure

# Or use environment variables
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_REGION=us-east-1
```

---

## Runtime Errors

### Token Limit Exceeded

**Symptoms:**

- `This model's maximum context length is X tokens`
- `Request too large`
- Truncated responses

**Solutions:**

#### 1. Check Token Count

```typescript
// Rough estimate: 4 characters ≈ 1 token
const estimatedTokens = prompt.length / 4;

if (estimatedTokens > 4000) {
  console.warn("Prompt may exceed token limit");
}
```

#### 2. Reduce Context

See [Context Window Management](/docs/cookbook/context-window-management) for detailed strategies:

```typescript
// Summarize old messages
await contextManager.summarizeOldMessages();

// Or limit max tokens
const result = await neurolink.generate({
  input: { text: prompt },
  maxTokens: 1000, // Limit response size
});
```

#### 3. Switch to Larger Context Model

| Model          | Context Window |
| -------------- | -------------- |
| GPT-3.5 Turbo  | 16K tokens     |
| GPT-4          | 128K tokens    |
| Claude 3       | 200K tokens    |
| Gemini 1.5 Pro | 1M tokens      |

```typescript
const result = await neurolink.generate({
  input: { text: longPrompt },
  provider: "google-ai",
  model: "gemini-1.5-pro", // 1M token context
});
```

### Rate Limiting

**Symptoms:**

- `429 Too Many Requests`
- `Rate limit exceeded`
- `Quota exceeded` errors

**Solutions:**

#### 1. Implement Rate Limiting

See [Rate Limit Handling](/docs/cookbook/rate-limit-handling):

```typescript

const limiter = new RateLimiter({ requestsPerMinute: 50 });

await limiter.execute(async () => {
  return neurolink.generate({ input: { text: prompt } });
});
```

#### 2. Check Current Limits

```bash
# OpenAI: View limits in dashboard
# Anthropic: Check tier limits
# Google: View quotas in Cloud Console
```

#### 3. Upgrade Tier or Add Payment Method

Most rate limits increase with:

- Paid accounts
- Higher tiers
- Usage history

### Memory Issues

**Symptoms:**

- `JavaScript heap out of memory`
- Process crashes
- Slow performance

**Solutions:**

#### 1. Increase Node.js Memory

```bash
# Increase heap size to 4GB
node --max-old-space-size=4096 your-app.js

# Or in package.json
{
  "scripts": {
    "start": "node --max-old-space-size=4096 index.js"
  }
}
```

#### 2. Clear Conversation Memory

```typescript
// Clear periodically
await neurolink.clearConversationMemory();

// Or limit history
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    maxMessages: 50, // Keep only last 50 messages
  },
});
```

#### 3. Stream Instead of Buffer

```typescript
// Instead of buffering entire response
const result = await neurolink.generate({ input: { text: prompt } });
console.log(result.content); // Large string in memory

// Stream to reduce memory
const stream = await neurolink.stream({ input: { text: prompt } });

for await (const chunk of stream) {
  if (chunk.type === "content-delta") {
    process.stdout.write(chunk.delta); // Write immediately
  }
}
```

---

## Streaming Issues

### Stream Interruption

**Symptoms:**

- Stream stops mid-response
- Incomplete responses
- `Stream ended unexpectedly`

**Solutions:**

#### 1. Implement Retry

See [Streaming with Retry](/docs/cookbook/streaming-with-retry):

```typescript
async function streamWithRetry(prompt: string, maxRetries = 3) {
  for (let i = 0; i  setTimeout(r, 1000 * (i + 1)));
    }
  }
}
```

#### 2. Handle Stream Errors

```typescript
try {
  const stream = await neurolink.stream({ input: { text: prompt } });

  for await (const chunk of stream) {
    if (chunk.type === "content-delta") {
      process.stdout.write(chunk.delta);
    }
  }
} catch (error) {
  console.error("Stream failed:", error);
  // Fallback to non-streaming
  const fallback = await neurolink.generate({ input: { text: prompt } });
  console.log(fallback.content);
}
```

### Incomplete Responses

**Symptoms:**

- Response cuts off mid-sentence
- Missing conclusion
- Shorter than expected

**Solutions:**

#### 1. Check Max Tokens

```typescript
const result = await neurolink.generate({
  input: { text: prompt },
  maxTokens: 2000, // Increase if needed
});
```

#### 2. Verify Stream Completion

```typescript
let complete = false;

for await (const chunk of stream) {
  if (chunk.type === "content-delta") {
    process.stdout.write(chunk.delta);
  }
  if (chunk.type === "done") {
    complete = true;
  }
}

if (!complete) {
  console.warn("Stream did not complete normally");
}
```

---

## MCP Tool Issues

### Tool Discovery Failures

**Symptoms:**

- `No tools discovered`
- `MCP server not responding`
- `Tool not found`

**Solutions:**

#### 1. Verify MCP Server Configuration

```typescript
const neurolink = new NeuroLink({
  mcpServers: {
    filesystem: {
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", "."],
    },
  },
});

// List available tools
const tools = await neurolink.discoverTools();
console.log("Available tools:", tools);
```

#### 2. Check Server Installation

```bash
# Test MCP server directly
npx -y @modelcontextprotocol/server-filesystem .

# Verify permissions
chmod +x node_modules/.bin/mcp-server-*
```

#### 3. Enable Debug Logging

```bash
DEBUG=neurolink:mcp node your-app.js
```

### Tool Execution Errors

**Symptoms:**

- `Tool execution failed`
- `Permission denied`
- `Tool timeout`

**Solutions:**

#### 1. Check Permissions

```typescript
// Filesystem tools need read/write access
const neurolink = new NeuroLink({
  mcpServers: {
    filesystem: {
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/allowed/path"],
    },
  },
});
```

#### 2. Increase Timeout

```typescript
const result = await neurolink.generate({
  input: { text: "Use the slow_tool" },
  enableTools: true,
  toolTimeout: 60000, // 60 seconds
});
```

#### 3. Validate Tool Arguments

```typescript
// Tools may fail with invalid arguments
// Check schema first:
const tools = await neurolink.discoverTools();
const tool = tools.find((t) => t.name === "my_tool");
console.log("Tool schema:", tool.inputSchema);
```

---

## Debugging Tips

### Enable Debug Logging

#### SDK Debug Logging

```bash
# All NeuroLink debug output
DEBUG=neurolink:* node your-app.js

# Specific modules
DEBUG=neurolink:provider node your-app.js
DEBUG=neurolink:mcp node your-app.js
DEBUG=neurolink:memory node your-app.js
```

#### Provider-Specific Logging

```typescript
const neurolink = new NeuroLink({
  debug: true, // Enable debug mode
  onLog: (level, message, meta) => {
    console.log(`[${level}] ${message}`, meta);
  },
});
```

### Common Log Messages

| Log Message             | Meaning              | Action            |
| ----------------------- | -------------------- | ----------------- |
| `Provider initialized`  | Provider ready       | Normal            |
| `Rate limit hit`        | Too many requests    | Slow down         |
| `Tool executed`         | Tool call succeeded  | Normal            |
| `Authentication failed` | Bad API key          | Check credentials |
| `Model not found`       | Invalid model name   | Verify model      |
| `Context too large`     | Exceeded token limit | Reduce context    |

### Request/Response Inspection

```typescript
const neurolink = new NeuroLink({
  onRequest: (request) => {
    console.log("Request:", JSON.stringify(request, null, 2));
  },
  onResponse: (response) => {
    console.log("Response:", JSON.stringify(response, null, 2));
  },
});
```

### Network Traffic Inspection

```bash
# Use proxy to inspect HTTP traffic
export HTTP_PROXY=http://localhost:8888
export HTTPS_PROXY=http://localhost:8888

# Then use Burp Suite, Charles, or mitmproxy to view requests
```

---

## Getting Help

### Before Asking for Help

Gather this information:

1. **NeuroLink version**: `npx @juspay/neurolink --version`
2. **Node.js version**: `node --version`
3. **Operating system**: `uname -a` (Unix) or `ver` (Windows)
4. **Error message**: Full error stack trace
5. **Minimal reproduction**: Smallest code that reproduces issue
6. **Debug logs**: Output from `DEBUG=neurolink:* node your-app.js`

### Community Resources

- **Discord**: [Join NeuroLink Discord](https://discord.gg/neurolink)
- **GitHub Issues**: [Report bugs](https://github.com/juspay/neurolink/issues)
- **Stack Overflow**: Tag questions with `neurolink`
- **Documentation**: [Full docs](https://neurolink.dev)

### Creating a Bug Report

Use this template:

```markdown
## Bug Description

[Clear description of the issue]

## Steps to Reproduce

1. [First step]
2. [Second step]
3. [Error occurs]

## Expected Behavior

[What should happen]

## Actual Behavior

[What actually happens]

## Environment

- NeuroLink version: [version]
- Node.js version: [version]
- OS: [operating system]
- Provider: [OpenAI/Anthropic/etc]

## Code Sample

\`\`\`typescript
[Minimal code that reproduces issue]
\`\`\`

## Error Message

\`\`\`
[Full error stack trace]
\`\`\`

## Debug Logs

\`\`\`
[Output from DEBUG=neurolink:* node your-app.js]
\`\`\`
```

---

## See Also

- [Main Troubleshooting Guide](/docs/reference/troubleshooting) - Comprehensive troubleshooting
- [Cookbook Recipes](/docs/) - Practical solutions
- [Error Recovery Patterns](/docs/cookbook/error-recovery) - Error handling strategies
- [Provider Comparison](/docs/reference/provider-comparison) - Provider-specific guidance
- [API Reference](/docs/sdk/api-reference) - Complete API documentation

---

## Frequently Asked Questions

<!-- Source: reference/faq.md -->

# Frequently Asked Questions

Common questions and answers about NeuroLink usage, configuration, and troubleshooting.

##  Getting Started

### Q: What is NeuroLink?

**A:** NeuroLink is an enterprise AI development platform that provides unified access to multiple AI providers (OpenAI, Google AI, Anthropic, AWS Bedrock, etc.) through a single SDK and CLI. It includes built-in tools, analytics, evaluation capabilities, and supports the Model Context Protocol (MCP) for extended functionality.

### Q: Which AI providers does NeuroLink support?

**A:** NeuroLink supports 9+ AI providers:

- **OpenAI** (GPT-4, GPT-4o, GPT-3.5-turbo)
- **Google AI Studio** (Gemini models)
- **Google Vertex AI** (Gemini, Claude via Vertex)
- **Anthropic** (Claude 3.5 Sonnet, Haiku, Opus)
- **AWS Bedrock** (Claude, Titan models)
- **Azure OpenAI** (GPT models)
- **Hugging Face** (Open source models)
- **Ollama** (Local AI models)
- **Mistral AI** (Mistral models)

### Q: Do I need to install anything?

**A:** No installation required! You can use NeuroLink directly with `npx`:

```bash
npx @juspay/neurolink generate "Hello, AI!"
npx @juspay/neurolink status
```

For frequent use, you can install globally: `npm install -g @juspay/neurolink`

##  Configuration

### Q: How do I set up API keys?

**A:** Create a `.env` file in your project directory:

```bash
# .env file
OPENAI_API_KEY="sk-your-openai-key"
GOOGLE_AI_API_KEY="AIza-your-google-ai-key"
ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"
# ... other providers
```

NeuroLink automatically loads these environment variables.

### Q: Can I use NeuroLink behind a corporate proxy?

**A:** Yes! NeuroLink automatically detects and uses corporate proxy settings:

```bash
export HTTPS_PROXY="http://proxy.company.com:8080"
export HTTP_PROXY="http://proxy.company.com:8080"
export NO_PROXY="localhost,127.0.0.1,.company.com"
```

No additional configuration needed.

### Q: How do I configure multiple environments (dev/staging/prod)?

**A:** Use environment-specific `.env` files:

```bash
# .env.development
NEUROLINK_LOG_LEVEL="debug"
NEUROLINK_CACHE_ENABLED="false"

# .env.production
NEUROLINK_LOG_LEVEL="warn"
NEUROLINK_CACHE_ENABLED="true"
NEUROLINK_ANALYTICS_ENABLED="true"
```

##  Usage

### Q: What's the difference between CLI and SDK?

**A:**

| Feature              | CLI                          | SDK                       |
| -------------------- | ---------------------------- | ------------------------- |
| **Best for**         | Scripts, automation, testing | Applications, integration |
| **Installation**     | None required (npx)          | npm install required      |
| **Output**           | Text, JSON                   | Native JavaScript objects |
| **Batch processing** | Built-in `batch` command     | Manual implementation     |
| **Learning curve**   | Low                          | Medium                    |

### Q: How do I choose the best provider for my use case?

**A:** NeuroLink can auto-select the best provider, or you can choose based on:

- **Speed**: Google AI (fastest responses)
- **Coding**: Anthropic Claude (best for code analysis)
- **Creative**: OpenAI (best for creative content)
- **Cost**: Google AI Studio (free tier available)
- **Enterprise**: AWS Bedrock or Azure OpenAI

```bash
# Auto-selection
npx @juspay/neurolink gen "Your prompt" --provider auto

# Specific provider
npx @juspay/neurolink gen "Your prompt" --provider google-ai
```

### Q: Can I use multiple providers in the same application?

**A:** Yes! You can specify different providers for different requests:

```typescript

const neurolink = new NeuroLink();

// Use different providers for different tasks
const code = await neurolink.generate({
  input: { text: "Write a Python function" },
  provider: "anthropic",
});

const creative = await neurolink.generate({
  input: { text: "Write a poem" },
  provider: "openai",
});
```

##  Troubleshooting

### Q: Why am I getting "API key not found" errors?

**A:** Common solutions:

1. **Check .env file exists** and is in the correct directory
2. **Verify file format**: No spaces around `=` signs

   ```bash
   # Correct
   OPENAI_API_KEY="sk-your-key"

   # Incorrect
   OPENAI_API_KEY = "sk-your-key"
   ```

3. **Check file permissions**: `.env` file should be readable
4. **Verify key format**: Keys should start with provider-specific prefixes

### Q: Provider status shows "Authentication failed" - what should I do?

**A:**

1. **Verify API key is correct** and hasn't expired
2. **Check account status** - ensure billing is set up if required
3. **Test API key manually**:
   ```bash
   # Test OpenAI key
   curl -H "Authorization: Bearer $OPENAI_API_KEY" \
        https://api.openai.com/v1/models
   ```
4. **Check regional restrictions** - some providers have geographic limitations

### Q: AWS Bedrock shows "Not Authorized" - how do I fix this?

**A:** AWS Bedrock requires additional setup:

1. **Request model access** in AWS Bedrock console
2. **Use full inference profile ARN** for Anthropic models:
   ```bash
   BEDROCK_MODEL="arn:aws:bedrock:us-east-1:123456789:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0"
   ```
3. **Verify IAM permissions** include `AmazonBedrockFullAccess`
4. **Check AWS region** - Bedrock isn't available in all regions

### Q: Google Vertex AI authentication issues?

**A:** Vertex AI supports multiple authentication methods:

```bash
# Method 1: Service account file
GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

# Method 2: Individual environment variables
GOOGLE_AUTH_CLIENT_EMAIL="service-account@project.iam.gserviceaccount.com"
GOOGLE_AUTH_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----..."

# Required for both methods
GOOGLE_VERTEX_PROJECT="your-gcp-project-id"
GOOGLE_VERTEX_LOCATION="us-central1"
```

### Q: Why are my requests timing out?

**A:** Try these solutions:

1. **Increase timeout**:
   ```bash
   npx @juspay/neurolink gen "prompt" --timeout 60000
   ```
2. **Check network connectivity**
3. **Reduce max tokens** for faster responses
4. **Switch to faster provider** (Google AI is typically fastest)

### Q: How do I handle rate limits?

**A:**

1. **Use batch processing** with delays:
   ```bash
   npx @juspay/neurolink batch prompts.txt --delay 3000
   ```
2. **Switch providers** when rate limited
3. **Implement exponential backoff** in your applications
4. **Upgrade API plan** for higher limits

##  Advanced Features

### Q: What are analytics and evaluation features?

**A:**

- **Analytics**: Track usage metrics, costs, and performance
- **Evaluation**: AI-powered quality scoring of responses

```bash
# Enable analytics
npx @juspay/neurolink gen "prompt" --enable-analytics

# Enable evaluation
npx @juspay/neurolink gen "prompt" --enable-evaluation

# Both together
npx @juspay/neurolink gen "prompt" --enable-analytics --enable-evaluation
```

### Q: What is MCP integration?

**A:** Model Context Protocol (MCP) allows NeuroLink to use external tools like file systems, databases, and APIs. NeuroLink includes built-in tools and can discover MCP servers from other AI applications.

```bash
# List discovered MCP servers
npx @juspay/neurolink mcp list

# Test built-in tools
npx @juspay/neurolink gen "What time is it?" --debug
```

### Q: How do I use streaming responses?

**A:**

```bash
# CLI streaming
npx @juspay/neurolink stream "Tell me a story"

# SDK streaming
const stream = await neurolink.stream({
  input: { text: "Tell me a story" }
});

for await (const chunk of stream) {
  console.log(chunk.content);
}
```

##  Enterprise Usage

### Q: Is NeuroLink suitable for enterprise use?

**A:** Yes! NeuroLink is designed for enterprise use with:

- **Corporate proxy support**
- **Multiple authentication methods**
- **Audit logging and analytics**
- **Provider fallback and reliability**
- **Comprehensive error handling**
- **Security best practices**

### Q: How do I deploy NeuroLink in production?

**A:** Best practices:

1. **Use environment variables** for configuration
2. **Implement secret management** (AWS Secrets Manager, Azure Key Vault)
3. **Enable analytics** for monitoring
4. **Set up provider fallbacks**
5. **Configure appropriate timeouts**
6. **Monitor provider health**

### Q: Can I use NeuroLink in CI/CD pipelines?

**A:** Absolutely! Common use cases:

```bash
# Generate documentation
npx @juspay/neurolink gen "Create API docs" > docs/api.md

# Code review
npx @juspay/neurolink gen "Review this code for issues" --provider anthropic

# Release notes
npx @juspay/neurolink gen "Generate release notes from git log"
```

### Q: How do I track costs across teams?

**A:** Use analytics with context:

```bash
npx @juspay/neurolink gen "prompt" \
  --enable-analytics \
  --context '{"team":"backend","project":"api","user":"dev123"}'
```

##  Development

### Q: How do I integrate NeuroLink with React?

**A:**

```typescript

function AIComponent() {
  const [response, setResponse] = useState("");
  const neurolink = new NeuroLink();

  const generate = async () => {
    const result = await neurolink.generate({
      input: { text: "Hello AI" }
    });
    setResponse(result.content);
  };

  return (

      Generate
      {response}

  );
}
```

### Q: How do I handle errors properly?

**A:**

```typescript
try {
  const result = await neurolink.generate({
    input: { text: "Your prompt" },
  });
  console.log(result.content);
} catch (error) {
  if (error.code === "RATE_LIMIT_EXCEEDED") {
    // Handle rate limiting
  } else if (error.code === "AUTHENTICATION_FAILED") {
    // Handle auth issues
  } else {
    // Handle other errors
  }
}
```

### Q: Can I create custom tools?

**A:** Yes! NeuroLink supports custom MCP servers:

```bash
# Add custom MCP server
npx @juspay/neurolink mcp add myserver "python /path/to/server.py"

# Test custom server
npx @juspay/neurolink mcp test myserver
```

##  Pricing and Costs

### Q: How much does NeuroLink cost?

**A:** NeuroLink itself is free! You only pay for the AI provider usage (OpenAI, Google AI, etc.). NeuroLink helps optimize costs by:

- **Auto-selecting cheapest suitable providers**
- **Analytics to track spending**
- **Batch processing for efficiency**
- **Built-in rate limiting**

### Q: Which provider is most cost-effective?

**A:** Generally:

1. **Google AI Studio** - Free tier available
2. **Google Vertex AI** - Competitive pricing
3. **OpenAI GPT-4o-mini** - Good balance of cost/performance
4. **Anthropic Claude Haiku** - Fast and affordable

Use `npx @juspay/neurolink models best --use-case cheapest` to find the most cost-effective option.

### Q: How can I monitor and control costs?

**A:**

1. **Enable analytics** to track usage and costs
2. **Set provider limits** in your AI provider dashboards
3. **Use cheaper models** for non-critical tasks
4. **Implement caching** for repeated requests
5. **Monitor with evaluation** to ensure quality

## 🆘 Getting Help

### Q: Where can I get help?

**A:**

1. **Documentation**: Comprehensive guides and API reference
2. **GitHub Issues**: Report bugs and request features
3. **Troubleshooting Guide**: Common issues and solutions
4. **Examples**: Practical usage patterns

### Q: How do I report a bug?

**A:**

1. **Check existing issues** on GitHub
2. **Include reproduction steps**
3. **Provide environment details**:
   - Node.js version
   - NeuroLink version
   - Operating system
   - Error messages
4. **Share configuration** (without API keys!)

### Q: How do I request a new feature?

**A:**

1. **Search existing feature requests**
2. **Open GitHub issue** with "enhancement" label
3. **Describe use case** and expected behavior
4. **Provide examples** of how the feature would be used

### Q: Can I contribute to NeuroLink?

**A:** Yes! We welcome contributions:

1. **Read the contributing guide**
2. **Start with good first issues**
3. **Follow code style guidelines**
4. **Include tests and documentation**
5. **Submit pull request**

##  Migration and Updates

### Q: How do I update NeuroLink?

**A:**

```bash
# For global installation
npm update -g @juspay/neurolink

# For project installation
npm update @juspay/neurolink

# Check version
npx @juspay/neurolink --version
```

### Q: Are there breaking changes between versions?

**A:** NeuroLink follows semantic versioning:

- **Patch updates** (1.0.1): Bug fixes, no breaking changes
- **Minor updates** (1.1.0): New features, backward compatible
- **Major updates** (2.0.0): Breaking changes, migration guide provided

### Q: How do I migrate from other AI libraries?

**A:** NeuroLink provides simple migration paths:

```typescript
// From OpenAI SDK

const openai = new OpenAI();

// To NeuroLink

const neurolink = new NeuroLink();

// Similar API, enhanced features
const result = await neurolink.generate({
  input: { text: "Your prompt" },
  provider: "openai", // Optional, can use any provider
});
```

---

##  Related Documentation

- [Quick Start Guide](/docs/getting-started/quick-start) - Get started in 2 minutes
- [Installation Guide](/docs/getting-started/installation) - Detailed setup instructions
- [Troubleshooting Guide](/docs/reference/troubleshooting) - Common issues and solutions
- [CLI Commands](/docs/cli/commands) - Complete CLI reference
- [API Reference](/docs/sdk/api-reference) - SDK documentation

---

## Provider Selection Guide

<!-- Source: reference/provider-selection.md -->

# Provider Selection Guide

**Last Updated:** January 2026
**NeuroLink Version:** 8.26.1+

This guide helps you choose the optimal AI provider for your specific use case, budget, and requirements. Whether you're building a startup prototype or deploying enterprise-grade AI systems, this guide provides actionable recommendations.

---------------------- | -------------------- | -------------------- | ----------------------- |
| **Highest Quality**       | OpenAI GPT-4o/GPT-5  | Anthropic Claude 4.5 | Google Gemini 2.5 Pro   |
| **Extended Thinking**     | Anthropic Claude 4.5 | Google Gemini 2.5+   | Google AI Studio (Free) |
| **PDF Processing**        | Anthropic            | Google AI Studio     | Google Vertex           |
| **Complete Privacy**      | Ollama (Local)       | Self-hosted LiteLLM  | -                       |
| **Enterprise Security**   | Azure OpenAI         | Amazon Bedrock       | Google Vertex           |
| **GDPR Compliance**       | Mistral              | Ollama (Local)       | -                       |
| **Free Tier**             | Google AI Studio     | OpenRouter           | HuggingFace             |
| **Multi-Provider Access** | OpenRouter           | LiteLLM              | -                       |
| **AWS Integration**       | Amazon Bedrock       | Amazon SageMaker     | -                       |
| **Azure Integration**     | Azure OpenAI         | -                    | -                       |
| **GCP Integration**       | Google Vertex        | Google AI Studio     | -                       |
| **Vision/Multimodal**     | OpenAI GPT-4o        | Anthropic Claude 4.5 | Google Gemini           |
| **Tool Calling**          | OpenAI               | Anthropic            | Google AI Studio        |
| **Custom Models**         | Amazon SageMaker     | OpenAI Compatible    | Ollama                  |

---

## Selection Criteria Deep Dive

### 1. Quality and Accuracy

When output quality is paramount, consider these factors:

| Provider                 | Quality Tier | Best Models                  | Strengths                                             |
| ------------------------ | ------------ | ---------------------------- | ----------------------------------------------------- |
| **OpenAI**               | Tier 1       | GPT-4o, GPT-5, O-series      | Industry-leading accuracy, extensive training data    |
| **Anthropic**            | Tier 1       | Claude 4.5 Opus, Sonnet      | Superior reasoning, safety-focused, extended thinking |
| **Google**               | Tier 1-2     | Gemini 3 Pro, Gemini 2.5 Pro | Native multimodal, large context windows              |
| **Mistral**              | Tier 2       | Mistral Large                | European-trained, efficient architecture              |
| **Meta (via providers)** | Tier 2-3     | Llama 3.3 70B                | Open-source leader, good general performance          |

```typescript
// Quality-first configuration

const neurolink = new NeuroLink();

// For highest quality output
const result = await neurolink.generate({
  provider: "anthropic",
  model: "claude-opus-4-5-20250929",
  prompt: "Complex analysis requiring nuanced reasoning",
  thinkingLevel: "high", // Enable extended thinking for complex tasks
  temperature: 0.3, // Lower temperature for more consistent output
});
```

### 2. Cost Optimization

Choose providers based on your budget constraints:

| Budget Level         | Recommended Provider     | Monthly Cost (1M tokens) | Notes                                  |
| -------------------- | ------------------------ | ------------------------ | -------------------------------------- |
| **Free**             | Google AI Studio         | $0                       | 1M tokens/day free limit               |
| **Free**             | OpenRouter (free models) | $0                       | Gemini, Llama, Qwen models             |
| **Free**             | Ollama                   | $0                       | Hardware costs only                    |
| **Low ($0-50)**      | Mistral Small            | ~$20                     | Good quality, European compliance      |
| **Medium ($50-200)** | GPT-4o-mini              | ~$75                     | Excellent quality/cost ratio           |
| **High ($200+)**     | Claude 4.5 Sonnet        | ~$180                    | Premium quality with extended thinking |
| **Enterprise**       | Azure/Bedrock            | Negotiated               | Volume discounts, SLA guarantees       |

```typescript
// Cost-optimized multi-tier strategy

const neurolink = new NeuroLink();

async function generateWithCostOptimization(
  prompt: string,
  complexity: "simple" | "medium" | "complex",
) {
  const configs = {
    simple: { provider: "google-ai", model: "gemini-2.5-flash" }, // FREE
    medium: { provider: "openai", model: "gpt-4o-mini" }, // Low cost
    complex: { provider: "anthropic", model: "claude-sonnet-4-5-20250929" }, // Premium
  };

  return neurolink.generate({
    prompt,
    ...configs[complexity],
  });
}

// Route based on task complexity
const simpleResult = await generateWithCostOptimization(
  "Summarize this text",
  "simple",
);
const complexResult = await generateWithCostOptimization(
  "Analyze legal implications and provide recommendations",
  "complex",
);
```

### 3. Latency and Performance

Time-to-first-token (TTFT) and throughput considerations:

| Provider             | Average TTFT | Tokens/sec | Best For                          |
| -------------------- | ------------ | ---------- | --------------------------------- |
| **Ollama (Local)**   | 50-200ms     | 30-50      | Local development, lowest latency |
| **Google AI Studio** | 300-700ms    | 45-65      | Fast cloud inference              |
| **OpenAI**           | 300-800ms    | 40-60      | Balanced performance              |
| **Anthropic**        | 400-900ms    | 35-55      | Complex reasoning tasks           |
| **Azure OpenAI**     | 350-850ms    | 40-60      | Enterprise with SLA               |

```typescript
// Latency-optimized streaming configuration

const neurolink = new NeuroLink();

// For real-time user-facing applications
const stream = await neurolink.stream({
  provider: "google-ai", // Fast TTFT
  model: "gemini-2.5-flash", // Optimized for speed
  prompt: "Generate response quickly",
  maxTokens: 500, // Limit for faster completion
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
```

### 4. Feature Requirements

Match provider capabilities to your feature needs:

| Feature               | Full Support                                       | Partial Support          | No Support             |
| --------------------- | -------------------------------------------------- | ------------------------ | ---------------------- |
| **Streaming**         | All providers                                      | SageMaker                | -                      |
| **Tool Calling**      | OpenAI, Anthropic, Google, Azure, Bedrock, Mistral | HuggingFace, Ollama      | SageMaker              |
| **Vision**            | OpenAI, Anthropic, Google, Azure                   | Mistral, Ollama, LiteLLM | HuggingFace, SageMaker |
| **PDF Native**        | Anthropic, Google AI Studio, Vertex                | Bedrock (Claude)         | OpenAI, Azure, Mistral |
| **Extended Thinking** | Anthropic, Google (Gemini 2.5+)                    | -                        | Others                 |
| **Structured Output** | OpenAI, Anthropic, Azure, Mistral                  | Google\*                 | HuggingFace, Ollama    |

\*Google providers cannot combine tools + JSON schema simultaneously

```typescript
// Feature-specific provider selection

const neurolink = new NeuroLink();

// PDF processing - use Anthropic or Google
const pdfResult = await neurolink.generate({
  provider: "anthropic",
  model: "claude-sonnet-4-5-20250929",
  prompt: "Analyze this contract",
  files: [{ path: "./contract.pdf", type: "pdf" }],
});

// Extended thinking for complex reasoning
const reasoningResult = await neurolink.generate({
  provider: "anthropic",
  model: "claude-sonnet-4-5-20250929",
  prompt: "Solve this multi-step problem with detailed reasoning",
  thinkingLevel: "high",
});

// Structured output with Google (tools disabled)
const structuredResult = await neurolink.generate({
  provider: "google-ai",
  model: "gemini-2.5-pro",
  prompt: "Extract user data",
  structuredOutput: {
    schema: {
      type: "object",
      properties: {
        name: { type: "string" },
        email: { type: "string" },
      },
    },
  },
  disableTools: true, // Required for Google providers with schema
});
```

### 5. Compliance and Security

Choose based on regulatory and security requirements:

| Requirement            | Best Providers                | Configuration Notes                        |
| ---------------------- | ----------------------------- | ------------------------------------------ |
| **GDPR**               | Mistral, Ollama               | European data centers, no US data transfer |
| **HIPAA**              | Azure OpenAI, Bedrock, Vertex | Requires BAA agreement                     |
| **SOC 2**              | All major cloud providers     | Available on enterprise tiers              |
| **Data Privacy**       | Ollama, Self-hosted           | Zero data transmission                     |
| **Air-gapped**         | Ollama, SageMaker             | On-premise deployment                      |
| **Financial Services** | Azure OpenAI, Bedrock         | Enterprise compliance packages             |

```typescript
// Privacy-focused configuration

const neurolink = new NeuroLink();

// For sensitive data - use local Ollama
const privateResult = await neurolink.generate({
  provider: "ollama",
  model: "llama3.1:70b",
  prompt: "Process this sensitive customer data",
  // Data never leaves your infrastructure
});

// For GDPR compliance - use Mistral
const gdprResult = await neurolink.generate({
  provider: "mistral",
  model: "mistral-large-latest",
  prompt: "Process EU customer request",
  // Data stays in European data centers
});
```

---

## Use Case Recommendations

### Startup / MVP Development

**Recommended Stack:**

```typescript

const neurolink = new NeuroLink();

// Development: Free tier for iteration
const devConfig = {
  provider: "google-ai" as const,
  model: "gemini-2.5-flash",
};

// Production: Affordable quality
const prodConfig = {
  provider: "openai" as const,
  model: "gpt-4o-mini",
};

// Use environment-based configuration
const config = process.env.NODE_ENV === "production" ? prodConfig : devConfig;

const result = await neurolink.generate({
  ...config,
  prompt: "Your application prompt",
});
```

**Cost Projection:**

- Development: $0/month (Google AI Studio free tier)
- Production (10K users): ~$50-150/month (GPT-4o-mini)

### Enterprise Production

**Recommended Stack:**

```typescript

const neurolink = new NeuroLink();

// Primary: Enterprise-grade with SLA
const primaryConfig = {
  provider: "azure" as const,
  model: "gpt-4o",
};

// Fallback: Alternative provider for resilience
const fallbackConfig = {
  provider: "bedrock" as const,
  model: "anthropic.claude-3-5-sonnet-20240620-v1:0",
};

async function generateWithFallback(prompt: string) {
  try {
    return await neurolink.generate({
      ...primaryConfig,
      prompt,
      timeout: 30000,
    });
  } catch (error) {
    console.warn("Primary provider failed, using fallback");
    return await neurolink.generate({
      ...fallbackConfig,
      prompt,
    });
  }
}
```

**Enterprise Requirements Checklist:**

- [x] SLA guarantees (99.9%+)
- [x] HIPAA/SOC2 compliance
- [x] Multi-region deployment
- [x] Provider failover strategy
- [x] Cost monitoring and alerts

### Research and Analysis

**Recommended Stack:**

```typescript

const neurolink = new NeuroLink();

// Use extended thinking for deep analysis
const analysisResult = await neurolink.generate({
  provider: "anthropic",
  model: "claude-opus-4-5-20250929",
  prompt: `Analyze the following research paper and provide:
    1. Key findings and methodology
    2. Potential limitations
    3. Implications for the field
    4. Suggested follow-up research`,
  files: [{ path: "./research-paper.pdf", type: "pdf" }],
  thinkingLevel: "high",
  maxTokens: 8000,
});

// For document-heavy workflows
const documentResult = await neurolink.generate({
  provider: "google-ai",
  model: "gemini-2.5-pro",
  prompt: "Compare these three documents",
  files: [
    { path: "./doc1.pdf", type: "pdf" },
    { path: "./doc2.pdf", type: "pdf" },
    { path: "./doc3.pdf", type: "pdf" },
  ],
});
```

### Privacy-Critical Applications

**Recommended Stack:**

```typescript

const neurolink = new NeuroLink();

// Tier 1: Completely local (maximum privacy)
const localResult = await neurolink.generate({
  provider: "ollama",
  model: "llama3.1:70b",
  prompt: "Process sensitive patient data",
});

// Tier 2: EU-only processing (GDPR compliant)
const euResult = await neurolink.generate({
  provider: "mistral",
  model: "mistral-large-latest",
  prompt: "Process EU customer request",
});

// Tier 3: Enterprise cloud with compliance (when cloud is acceptable)
const enterpriseResult = await neurolink.generate({
  provider: "azure",
  model: "gpt-4o",
  prompt: "Process data with enterprise security",
});
```

---

## Multi-Provider Strategy

### Intelligent Routing

Implement smart provider selection based on request characteristics:

```typescript

const neurolink = new NeuroLink();

type RequestContext = {
  prompt: string;
  hasImages?: boolean;
  hasPDFs?: boolean;
  requiresReasoning?: boolean;
  isSensitive?: boolean;
  maxBudget?: "free" | "low" | "medium" | "high";
};

function selectProvider(context: RequestContext): {
  provider: string;
  model: string;
} {
  // Privacy-first: sensitive data stays local
  if (context.isSensitive) {
    return { provider: "ollama", model: "llama3.1:70b" };
  }

  // PDF processing: use Anthropic or Google
  if (context.hasPDFs) {
    return { provider: "anthropic", model: "claude-sonnet-4-5-20250929" };
  }

  // Complex reasoning: use extended thinking
  if (context.requiresReasoning) {
    return { provider: "anthropic", model: "claude-sonnet-4-5-20250929" };
  }

  // Vision tasks: use GPT-4o
  if (context.hasImages) {
    return { provider: "openai", model: "gpt-4o" };
  }

  // Budget-based selection
  switch (context.maxBudget) {
    case "free":
      return { provider: "google-ai", model: "gemini-2.5-flash" };
    case "low":
      return { provider: "openai", model: "gpt-4o-mini" };
    case "medium":
      return { provider: "openai", model: "gpt-4o" };
    case "high":
      return { provider: "anthropic", model: "claude-opus-4-5-20250929" };
    default:
      return { provider: "openai", model: "gpt-4o-mini" };
  }
}

// Usage
async function intelligentGenerate(context: RequestContext) {
  const { provider, model } = selectProvider(context);

  return neurolink.generate({
    provider: provider as any,
    model,
    prompt: context.prompt,
    thinkingLevel: context.requiresReasoning ? "high" : undefined,
  });
}

// Examples
const result1 = await intelligentGenerate({
  prompt: "Summarize this text",
  maxBudget: "free",
});

const result2 = await intelligentGenerate({
  prompt: "Analyze this medical document",
  hasPDFs: true,
  isSensitive: true,
});
```

### Failover and Redundancy

Implement robust failover for production reliability:

```typescript

const neurolink = new NeuroLink();

type ProviderConfig = {
  provider: string;
  model: string;
  priority: number;
};

const providerChain: ProviderConfig[] = [
  { provider: "openai", model: "gpt-4o", priority: 1 },
  { provider: "anthropic", model: "claude-sonnet-4-5-20250929", priority: 2 },
  { provider: "google-ai", model: "gemini-2.5-pro", priority: 3 },
  { provider: "mistral", model: "mistral-large-latest", priority: 4 },
];

async function generateWithFailover(
  prompt: string,
  options: { maxRetries?: number; retryDelay?: number } = {},
) {
  const { maxRetries = providerChain.length, retryDelay = 1000 } = options;
  const errors: Error[] = [];

  for (let i = 0; i  setTimeout(resolve, retryDelay));
      }
    }
  }

  // All providers failed
  throw new Error(
    `All providers failed. Errors: ${errors.map((e) => e.message).join("; ")}`,
  );
}

// Usage
const result = await generateWithFailover("Generate a response", {
  maxRetries: 3,
  retryDelay: 2000,
});
```

### Cost-Aware Load Balancing

Distribute load across providers based on cost and availability:

```typescript

const neurolink = new NeuroLink();

type ProviderStats = {
  provider: string;
  model: string;
  costPer1MTokens: number;
  currentLoad: number;
  maxLoad: number;
  isHealthy: boolean;
};

class CostAwareLoadBalancer {
  private providers: ProviderStats[] = [
    {
      provider: "google-ai",
      model: "gemini-2.5-flash",
      costPer1MTokens: 0,
      currentLoad: 0,
      maxLoad: 1000,
      isHealthy: true,
    },
    {
      provider: "openai",
      model: "gpt-4o-mini",
      costPer1MTokens: 0.75,
      currentLoad: 0,
      maxLoad: 500,
      isHealthy: true,
    },
    {
      provider: "anthropic",
      model: "claude-sonnet-4-5-20250929",
      costPer1MTokens: 18,
      currentLoad: 0,
      maxLoad: 200,
      isHealthy: true,
    },
  ];

  selectProvider(): ProviderStats {
    // Filter healthy providers with capacity
    const available = this.providers.filter(
      (p) => p.isHealthy && p.currentLoad  a.costPer1MTokens - b.costPer1MTokens)[0];
  }

  async generate(prompt: string) {
    const provider = this.selectProvider();
    provider.currentLoad++;

    try {
      return await neurolink.generate({
        provider: provider.provider as any,
        model: provider.model,
        prompt,
      });
    } finally {
      provider.currentLoad--;
    }
  }
}

// Usage
const balancer = new CostAwareLoadBalancer();
const result = await balancer.generate("Process this request");
```

---

## Migration Guides

### From OpenAI to Multi-Provider

If you're currently using OpenAI exclusively, here's how to add provider flexibility:

```typescript
// Before: OpenAI only

const openai = new OpenAI();
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
});

// After: NeuroLink with provider flexibility

const neurolink = new NeuroLink();

// Same OpenAI model, but now portable
const result = await neurolink.generate({
  provider: "openai", // Can easily switch to any provider
  model: "gpt-4o",
  prompt: "Hello",
});

// Switch to Anthropic for extended thinking
const resultWithThinking = await neurolink.generate({
  provider: "anthropic",
  model: "claude-sonnet-4-5-20250929",
  prompt: "Complex reasoning task",
  thinkingLevel: "high",
});

// Use free tier for development
const devResult = await neurolink.generate({
  provider: "google-ai",
  model: "gemini-2.5-flash",
  prompt: "Development testing",
});
```

### From Single Provider to Redundant Setup

```typescript

const neurolink = new NeuroLink();

// Step 1: Define provider hierarchy
const providers = {
  primary: { provider: "openai", model: "gpt-4o" },
  secondary: { provider: "anthropic", model: "claude-sonnet-4-5-20250929" },
  fallback: { provider: "google-ai", model: "gemini-2.5-pro" },
};

// Step 2: Implement health checking
async function checkProviderHealth(config: {
  provider: string;
  model: string;
}) {
  try {
    await neurolink.generate({
      provider: config.provider as any,
      model: config.model,
      prompt: "Health check",
      maxTokens: 10,
    });
    return true;
  } catch {
    return false;
  }
}

// Step 3: Route to healthy provider
async function generateWithRedundancy(prompt: string) {
  for (const [tier, config] of Object.entries(providers)) {
    if (await checkProviderHealth(config)) {
      console.log(`Using ${tier} provider: ${config.provider}`);
      return neurolink.generate({
        provider: config.provider as any,
        model: config.model,
        prompt,
      });
    }
  }
  throw new Error("All providers unhealthy");
}
```

---

## Provider Selection Flowchart

```
START: What's your primary constraint?
│
├─ COST → Need it free?
│   ├─ Yes → Google AI Studio (1M tokens/day FREE)
│   └─ No → What's your budget?
│       ├─ Low → GPT-4o-mini or Mistral Small
│       ├─ Medium → GPT-4o or Claude Sonnet
│       └─ High → Claude Opus or GPT-5
│
├─ PRIVACY → How sensitive is your data?
│   ├─ Critical (no cloud) → Ollama (local)
│   ├─ EU only → Mistral (GDPR)
│   └─ Enterprise compliant → Azure/Bedrock
│
├─ FEATURES → What capabilities do you need?
│   ├─ Extended Thinking → Anthropic or Google Gemini 2.5+
│   ├─ PDF Processing → Anthropic or Google
│   ├─ Vision → OpenAI, Anthropic, or Google
│   └─ Tool Calling → OpenAI or Anthropic
│
├─ CLOUD PLATFORM → Which cloud are you on?
│   ├─ AWS → Amazon Bedrock
│   ├─ Azure → Azure OpenAI
│   ├─ GCP → Google Vertex AI
│   └─ Multi-cloud → LiteLLM or OpenRouter
│
└─ PERFORMANCE → What matters most?
    ├─ Latency → Ollama (local) or Google AI Studio
    ├─ Throughput → OpenAI or Google
    └─ Quality → OpenAI GPT-4o or Anthropic Claude
```

---

## Summary Recommendations

### For Most Users

**Start with Google AI Studio** - Free tier, good quality, full features including PDF and extended thinking.

### For Production

**Use OpenAI or Anthropic** - Industry-leading quality with reliable APIs and enterprise support.

### For Enterprise

**Use Azure OpenAI or Amazon Bedrock** - Enterprise security, SLA guarantees, compliance certifications.

### For Privacy

**Use Ollama** - Complete data privacy with local execution.

### For Cost Optimization

**Implement multi-provider routing** - Use free/cheap providers for simple tasks, premium for complex ones.

---

## Related Resources

- **[Provider Comparison](/docs/reference/provider-comparison)** - Detailed feature and pricing comparison
- **[Provider Capabilities Audit](/docs/reference/provider-capabilities-audit)** - Technical compatibility matrix
- **[Configuration Reference](/docs/deployment/configuration)** - Environment setup for all providers
- **[Troubleshooting](/docs/reference/troubleshooting)** - Common issues and solutions
- **[Multi-Provider Fallback Cookbook](/docs/cookbook/multi-provider-fallback)** - Implementation patterns
- **[Cost Optimization Cookbook](/docs/cookbook/cost-optimization)** - Strategies to reduce costs

---

## Srvr Cofiguratio Rfrc []

<!-- Source: reference/server-configuration.md -->

# Server Adapter Configuration Reference

This document provides a comprehensive reference for all configuration options available in NeuroLink Server Adapters.

## Configuration via CLI

In addition to programmatic configuration, NeuroLink provides CLI commands to view and manage server settings.

### Viewing Configuration

```bash
# Show all configuration
neurolink server config

# Output as JSON
neurolink server config --format json

# Get specific value
neurolink server config --get defaultPort
neurolink server config --get cors.enabled
neurolink server config --get rateLimit.maxRequests
```

### Modifying Configuration

```bash
# Set configuration values
neurolink server config --set defaultPort=8080
neurolink server config --set defaultFramework=express
neurolink server config --set cors.enabled=true
neurolink server config --set rateLimit.maxRequests=200

# Reset to defaults
neurolink server config --reset
```

### Configuration File Location

CLI configuration is stored at:

- **Config file:** `~/.neurolink/server-config.json`
- **Server state:** `~/.neurolink/server-state.json`

### CLI vs Programmatic Configuration

| Aspect      | CLI Config                    | Programmatic Config              |
| ----------- | ----------------------------- | -------------------------------- |
| Persistence | File-based, survives restarts | In-memory, per-instance          |
| Scope       | Global defaults               | Per-server instance              |
| Use Case    | Development, quick changes    | Production, fine-grained control |

The CLI configuration provides default values that can be overridden programmatically:

```typescript
// CLI defaults are used when not specified
const server = await createServer(neurolink, {
  framework: "hono", // Overrides CLI default
  // port uses CLI default if not specified
});
```

## ServerAdapterConfig

The main configuration object for server adapters.

```typescript
type ServerAdapterConfig = {
  port?: number;
  host?: string;
  basePath?: string;
  cors?: CORSConfig;
  rateLimit?: RateLimitConfig;
  bodyParser?: BodyParserConfig;
  logging?: LoggingConfig;
  shutdown?: ShutdownConfig;
  redaction?: RedactionConfig;
  timeout?: number;
  enableMetrics?: boolean;
  enableSwagger?: boolean;
  disableBuiltInHealth?: boolean;
};
```

### Core Options

| Option                 | Type      | Default     | Description                                      |
| ---------------------- | --------- | ----------- | ------------------------------------------------ |
| `port`                 | `number`  | `3000`      | Server port to listen on                         |
| `host`                 | `string`  | `"0.0.0.0"` | Server host/interface to bind                    |
| `basePath`             | `string`  | `"/api"`    | Base path prefix for all routes                  |
| `timeout`              | `number`  | `30000`     | Request timeout in milliseconds                  |
| `enableMetrics`        | `boolean` | `true`      | Enable metrics endpoint                          |
| `enableSwagger`        | `boolean` | `false`     | Enable OpenAPI/Swagger documentation (see below) |
| `disableBuiltInHealth` | `boolean` | `false`     | Disable built-in health routes                   |

### OpenAPI/Swagger Documentation (`enableSwagger`)

When `enableSwagger` is set to `true`, the server exposes interactive API documentation endpoints:

| Endpoint                      | Description                              |
| ----------------------------- | ---------------------------------------- |
| `GET {basePath}/openapi.json` | OpenAPI 3.1 specification in JSON format |
| `GET {basePath}/openapi.yaml` | OpenAPI 3.1 specification in YAML format |
| `GET {basePath}/docs`         | Interactive Swagger UI documentation     |

**Example URLs (with default basePath `/api`):**

- `http://localhost:3000/api/openapi.json`
- `http://localhost:3000/api/openapi.yaml`
- `http://localhost:3000/api/docs`

The Swagger UI provides an interactive interface where you can:

- Browse all available API endpoints
- View request/response schemas
- Test API calls directly from the browser
- Download the OpenAPI specification

> **Security Consideration:** In production environments, consider disabling `enableSwagger` to prevent exposing internal API structure. Alternatively, protect the documentation endpoints with authentication middleware.

### Example: Basic Configuration

```typescript

const server = await createServer(neurolink, {
  config: {
    port: 8080,
    host: "127.0.0.1",
    basePath: "/v1/api",
    timeout: 60000,
    enableSwagger: true,
  },
});
```

## CORS Configuration

```typescript
type CORSConfig = {
  enabled?: boolean;
  origins?: string[];
  methods?: string[];
  headers?: string[];
  credentials?: boolean;
  maxAge?: number;
};
```

| Option        | Type       | Default                                                | Description                        |
| ------------- | ---------- | ------------------------------------------------------ | ---------------------------------- |
| `enabled`     | `boolean`  | `true`                                                 | Enable CORS support                |
| `origins`     | `string[]` | `["*"]`                                                | Allowed origins                    |
| `methods`     | `string[]` | `["GET", "POST", "PUT", "DELETE", "PATCH", "OPTIONS"]` | Allowed HTTP methods               |
| `headers`     | `string[]` | `["Content-Type", "Authorization"]`                    | Allowed headers                    |
| `credentials` | `boolean`  | `false`                                                | Allow credentials                  |
| `maxAge`      | `number`   | `86400`                                                | Preflight cache max age in seconds |

> **Security Warning:** The default wildcard origin `["*"]` allows requests from any domain. In production environments, always specify explicit allowed origins to prevent unauthorized cross-origin requests.

### Example: Restrictive CORS

```typescript
const server = await createServer(neurolink, {
  config: {
    cors: {
      enabled: true,
      origins: ["https://myapp.com", "https://staging.myapp.com"],
      methods: ["GET", "POST"],
      headers: ["Content-Type", "Authorization", "X-Request-ID"],
      credentials: true,
      maxAge: 3600,
    },
  },
});
```

## Rate Limit Configuration

```typescript
type RateLimitConfig = {
  enabled?: boolean;
  windowMs?: number;
  maxRequests?: number;
  message?: string;
  skipPaths?: string[];
  keyGenerator?: (ctx: ServerContext) => string;
};
```

| Option         | Type       | Default                  | Description                                |
| -------------- | ---------- | ------------------------ | ------------------------------------------ |
| `enabled`      | `boolean`  | `true`                   | Enable rate limiting                       |
| `windowMs`     | `number`   | `900000` (15 min)        | Time window in milliseconds                |
| `maxRequests`  | `number`   | `100`                    | Maximum requests per window                |
| `message`      | `string`   | `"Too many requests..."` | Error message when limit exceeded          |
| `skipPaths`    | `string[]` | `[]`                     | Paths to exclude from rate limiting        |
| `keyGenerator` | `function` | IP-based                 | Custom function to generate rate limit key |

### Example: Custom Rate Limiting

```typescript
const server = await createServer(neurolink, {
  config: {
    rateLimit: {
      enabled: true,
      windowMs: 60000, // 1 minute
      maxRequests: 30,
      skipPaths: ["/api/health", "/api/ready", "/api/version"],
      keyGenerator: (ctx) => {
        // Rate limit by API key instead of IP
        return (
          ctx.headers["x-api-key"] ||
          ctx.headers["x-forwarded-for"] ||
          "unknown"
        );
      },
    },
  },
});
```

## Body Parser Configuration

```typescript
type BodyParserConfig = {
  enabled?: boolean;
  maxSize?: string;
  jsonLimit?: string;
  urlEncoded?: boolean;
};
```

| Option       | Type      | Default  | Description                     |
| ------------ | --------- | -------- | ------------------------------- |
| `enabled`    | `boolean` | `true`   | Enable body parsing             |
| `maxSize`    | `string`  | `"10mb"` | Maximum body size               |
| `jsonLimit`  | `string`  | `"10mb"` | JSON body size limit            |
| `urlEncoded` | `boolean` | `true`   | Enable URL-encoded body parsing |

### Example: Large Payload Support

```typescript
const server = await createServer(neurolink, {
  config: {
    bodyParser: {
      enabled: true,
      maxSize: "50mb",
      jsonLimit: "50mb",
      urlEncoded: true,
    },
  },
});
```

## Logging Configuration

```typescript
type LoggingConfig = {
  enabled?: boolean;
  level?: "debug" | "info" | "warn" | "error";
  includeBody?: boolean;
  includeResponse?: boolean;
};
```

| Option            | Type      | Default  | Description                   |
| ----------------- | --------- | -------- | ----------------------------- |
| `enabled`         | `boolean` | `true`   | Enable request logging        |
| `level`           | `string`  | `"info"` | Log level                     |
| `includeBody`     | `boolean` | `false`  | Include request body in logs  |
| `includeResponse` | `boolean` | `false`  | Include response body in logs |

### Example: Debug Logging

```typescript
const server = await createServer(neurolink, {
  config: {
    logging: {
      enabled: true,
      level: "debug",
      includeBody: true,
      includeResponse: true,
    },
  },
});
```

## Shutdown Configuration

```typescript
type ShutdownConfig = {
  gracefulShutdownTimeoutMs?: number;
  drainTimeoutMs?: number;
  forceClose?: boolean;
};
```

| Option                      | Type      | Default | Description                                         |
| --------------------------- | --------- | ------- | --------------------------------------------------- |
| `gracefulShutdownTimeoutMs` | `number`  | `30000` | Maximum time to wait for graceful shutdown (30 sec) |
| `drainTimeoutMs`            | `number`  | `15000` | Time to drain existing connections (15 sec)         |
| `forceClose`                | `boolean` | `true`  | Force close connections after timeout               |

### Example: Custom Shutdown Timeouts

```typescript
const server = await createServer(neurolink, {
  config: {
    shutdown: {
      gracefulShutdownTimeoutMs: 60000, // 60 seconds for long-running requests
      drainTimeoutMs: 30000, // 30 seconds to drain connections
      forceClose: true, // Force close after timeout
    },
  },
});
```

## Redaction Configuration

The redaction system provides automatic sanitization of sensitive data in logs and responses. This feature is **opt-in** and must be explicitly enabled.

```typescript
type RedactionConfig = {
  enabled?: boolean;
  additionalFields?: string[];
  preserveFields?: string[];
  redactToolArgs?: boolean;
  redactToolResults?: boolean;
  placeholder?: string;
};
```

| Option              | Type       | Default        | Description                          |
| ------------------- | ---------- | -------------- | ------------------------------------ |
| `enabled`           | `boolean`  | `false`        | Enable redaction (opt-in)            |
| `additionalFields`  | `string[]` | `[]`           | Extra field names to redact          |
| `preserveFields`    | `string[]` | `[]`           | Fields to exclude from redaction     |
| `redactToolArgs`    | `boolean`  | `true`         | Redact tool arguments (when enabled) |
| `redactToolResults` | `boolean`  | `true`         | Redact tool results (when enabled)   |
| `placeholder`       | `string`   | `"[REDACTED]"` | Replacement text for redacted values |

### Default Redacted Fields

When redaction is enabled, the following fields are redacted by default:

- `apiKey`
- `token`
- `authorization`
- `credentials`
- `password`
- `secret`
- `request`
- `args`
- `result`

### Example: Custom Redaction

```typescript
const server = await createServer(neurolink, {
  config: {
    redaction: {
      enabled: true,
      additionalFields: ["ssn", "creditCard", "bankAccount"],
      preserveFields: ["request"], // Allow 'request' field to pass through
      redactToolArgs: true,
      redactToolResults: false, // Keep tool results visible
      placeholder: "***",
    },
  },
});
```

### Example: Minimal Redaction

```typescript
const server = await createServer(neurolink, {
  config: {
    redaction: {
      enabled: true,
      // Uses all defaults - redacts apiKey, token, password, etc.
    },
  },
});
```

## Middleware Configuration

### Authentication Middleware

```typescript

const authMiddleware = createAuthMiddleware({
  type: "bearer", // 'bearer' | 'api-key' | 'basic' | 'custom'
  validate: async (token, ctx) => {
    // Return user info or null
    const user = await verifyJWT(token);
    return user ? { id: user.id, email: user.email, roles: user.roles } : null;
  },
  headerName: "Authorization", // Optional: custom header name
  skipPaths: ["/api/health", "/api/ready"],
  errorMessage: "Invalid authentication token",
});

server.registerMiddleware(authMiddleware);
```

#### Auth Types

| Type      | Header Format                   | Description                 |
| --------- | ------------------------------- | --------------------------- |
| `bearer`  | `Authorization: Bearer ` | JWT/OAuth token             |
| `api-key` | `X-API-Key: `              | API key authentication      |
| `basic`   | `Authorization: Basic ` | HTTP Basic auth             |
| `custom`  | Custom                          | Use `extractToken` function |

### Rate Limit Middleware

```typescript

  createRateLimitMiddleware,
  createSlidingWindowRateLimitMiddleware,
} from "@juspay/neurolink";

// Fixed window rate limiter
const rateLimiter = createRateLimitMiddleware({
  maxRequests: 100,
  windowMs: 15 * 60 * 1000,
  skipPaths: ["/api/health"],
});

// Sliding window rate limiter (more accurate)
const slidingRateLimiter = createSlidingWindowRateLimitMiddleware({
  maxRequests: 100,
  windowMs: 15 * 60 * 1000,
  subWindows: 10, // Number of sub-windows for smoothing
});

server.registerMiddleware(rateLimiter);
```

### Cache Middleware

```typescript

const cacheMiddleware = createCacheMiddleware({
  ttlMs: 60 * 1000, // 1 minute cache
  maxSize: 1000, // Max cached entries
  methods: ["GET"], // Only cache GET requests
  excludePaths: ["/api/agent/execute", "/api/agent/stream"],
  includeQuery: true, // Include query params in cache key
  ttlByPath: {
    "/api/tools": 5 * 60 * 1000, // 5 minutes for tools
    "/api/version": 60 * 60 * 1000, // 1 hour for version
  },
});

server.registerMiddleware(cacheMiddleware);
```

### Cache Response Headers

The cache middleware adds these headers to responses:

| Header          | Description                   | Example         |
| --------------- | ----------------------------- | --------------- |
| `X-Cache`       | Cache status                  | `HIT` or `MISS` |
| `X-Cache-Age`   | Seconds since cached (on HIT) | `45`            |
| `Cache-Control` | Caching directive (on MISS)   | `max-age=300`   |

### Validation Middleware

```typescript

  createRequestValidationMiddleware,
  createFieldValidator,
} from "@juspay/neurolink";

// JSON Schema validation
const validationMiddleware = createRequestValidationMiddleware({
  body: {
    type: "object",
    properties: {
      input: { type: "string", minLength: 1 },
      provider: { type: "string" },
    },
    required: ["input"],
  },
});

// Field-level validation
const fieldValidator = createFieldValidator({
  required: ["name", "email"],
  types: { name: "string", email: "string", age: "number" },
  validators: {
    email: (value) => typeof value === "string" && value.includes("@"),
    age: (value) => typeof value === "number" && value >= 0,
  },
});

server.registerMiddleware(validationMiddleware);
```

### Role-Based Access Control

```typescript

// Require any of the specified roles
const adminMiddleware = createRoleMiddleware({
  requiredRoles: ["admin", "superuser"],
  requireAll: false, // Any role matches
  errorMessage: "Admin access required",
});

// Require all specified roles
const superAdminMiddleware = createRoleMiddleware({
  requiredRoles: ["admin", "superuser"],
  requireAll: true, // All roles required
});
```

## Framework-Specific Options

### Hono

```typescript

const server = await ServerAdapterFactory.createHono(neurolink, {
  port: 3000,
  // Hono uses @hono/node-server under the hood
});
```

For more details, see the [Hono Guide](/docs/guides/server-adapters/hono).

### Express

```typescript
const server = await ServerAdapterFactory.createExpress(neurolink, {
  port: 3000,
  // Express-specific middleware can be added via getFrameworkInstance()
});

const app = server.getFrameworkInstance();
app.use(customExpressMiddleware);
```

For more details, see the [Express Guide](/docs/guides/server-adapters/express).

### Fastify

```typescript
const server = await ServerAdapterFactory.createFastify(neurolink, {
  port: 3000,
  // Fastify plugins can be registered on the instance
});

const fastify = server.getFrameworkInstance();
await fastify.register(customFastifyPlugin);
```

For more details, see the [Fastify Guide](/docs/guides/server-adapters/fastify).

### Koa

```typescript
const server = await ServerAdapterFactory.createKoa(neurolink, {
  port: 3000,
  // Koa middleware can be added via getFrameworkInstance()
});

const app = server.getFrameworkInstance();
app.use(customKoaMiddleware);
```

For more details, see the [Koa Guide](/docs/guides/server-adapters/koa).

## Complete Configuration Example

```typescript

  createServer,
  createAuthMiddleware,
  createRateLimitMiddleware,
  createCacheMiddleware,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 8080,
    host: "0.0.0.0",
    basePath: "/v1",
    timeout: 120000,
    enableSwagger: true,
    cors: {
      enabled: true,
      origins: ["https://app.example.com"],
      credentials: true,
    },
    rateLimit: {
      enabled: true,
      maxRequests: 1000,
      windowMs: 3600000,
    },
    bodyParser: {
      maxSize: "25mb",
    },
    logging: {
      level: "info",
    },
  },
});

// Add custom middleware
server.registerMiddleware(
  createAuthMiddleware({
    type: "bearer",
    validate: async (token) => verifyToken(token),
    skipPaths: ["/v1/health", "/v1/ready"],
  }),
);

server.registerMiddleware(
  createCacheMiddleware({
    ttlMs: 300000,
    methods: ["GET"],
  }),
);

// Start server
await server.start();
console.log(`Server running on http://localhost:8080`);
```

## Environment Variables

The server adapters respect these environment variables:

| Variable              | Description                           | Default       |
| --------------------- | ------------------------------------- | ------------- |
| `PORT`                | Server port                           | `3000`        |
| `HOST`                | Server host                           | `0.0.0.0`     |
| `NODE_ENV`            | Environment mode                      | `development` |
| `npm_package_version` | Package version (for health endpoint) | `unknown`     |

## Configuration Validation

Invalid configuration will throw errors at initialization:

```typescript
// This will throw: "Invalid port number"
const server = await createServer(neurolink, {
  config: { port: -1 },
});

// This will throw: "Invalid rate limit configuration"
const server = await createServer(neurolink, {
  config: { rateLimit: { maxRequests: -100 } },
});
```

Always validate your configuration in development before deploying to production.

## API Endpoints

The server adapters expose the following endpoints (all prefixed with `basePath`, default `/api`):

### Health Endpoints

| Method | Endpoint   | Description         |
| ------ | ---------- | ------------------- |
| GET    | `/health`  | Basic health check  |
| GET    | `/ready`   | Readiness probe     |
| GET    | `/live`    | Liveness probe      |
| GET    | `/version` | Version information |

### Agent Endpoints

| Method | Endpoint         | Description              |
| ------ | ---------------- | ------------------------ |
| POST   | `/agent/execute` | Execute agent with input |
| POST   | `/agent/stream`  | Stream agent response    |

### Tool Endpoints

| Method | Endpoint       | Description             |
| ------ | -------------- | ----------------------- |
| GET    | `/tools`       | List available tools    |
| POST   | `/tools/:name` | Execute a specific tool |
| GET    | `/tools/:name` | Get tool metadata       |

### MCP Endpoints

| Method | Endpoint       | Description                |
| ------ | -------------- | -------------------------- |
| GET    | `/mcp/servers` | List MCP servers           |
| POST   | `/mcp/execute` | Execute MCP tool           |
| GET    | `/mcp/health`  | MCP subsystem health check |

### Memory Endpoints

| Method | Endpoint               | Description                   |
| ------ | ---------------------- | ----------------------------- |
| GET    | `/memory/sessions`     | List memory sessions          |
| GET    | `/memory/sessions/:id` | Get session details           |
| DELETE | `/memory/sessions/:id` | Delete a session              |
| DELETE | `/memory/sessions`     | Clear all sessions            |
| GET    | `/memory/health`       | Memory subsystem health check |

### OpenAPI Endpoints (when `enableSwagger: true`)

| Method | Endpoint        | Description             |
| ------ | --------------- | ----------------------- |
| GET    | `/openapi.json` | OpenAPI 3.1 spec (JSON) |
| GET    | `/openapi.yaml` | OpenAPI 3.1 spec (YAML) |
| GET    | `/docs`         | Swagger UI              |

## Lifecycle Management

Server adapters implement a comprehensive lifecycle management system that enables graceful startup, connection tracking, and orderly shutdown. Understanding the lifecycle is essential for production deployments.

### Lifecycle States

The server adapter progresses through 9 distinct lifecycle states:

| State           | Description                                          |
| --------------- | ---------------------------------------------------- |
| `uninitialized` | Initial state before `initialize()` is called        |
| `initializing`  | Framework and routes are being set up                |
| `initialized`   | Setup complete, ready to start                       |
| `starting`      | Server is binding to port and preparing to listen    |
| `running`       | Server is actively accepting and processing requests |
| `draining`      | No new connections accepted, existing ones finishing |
| `stopping`      | Server is closing after connections drained          |
| `stopped`       | Server has completely shut down                      |
| `error`         | An error occurred during any state transition        |

### State Transition Diagram

```
                    ┌─────────────────┐
                    │  uninitialized  │◄──────────────────────────────────┐
                    └────────┬────────┘                                   │
                             │ initialize()                               │
                             ▼                                            │
                    ┌─────────────────┐                                   │
                    │  initializing   │                                   │
                    └────────┬────────┘                                   │
                             │ success                                    │
                             ▼                                            │
                    ┌─────────────────┐                                   │
                    │  initialized    │◄──────────────────────────────────┤
                    └────────┬────────┘                                   │
                             │ start()                                    │
                             ▼                                            │
                    ┌─────────────────┐                                   │
                    │    starting     │                                   │
                    └────────┬────────┘                                   │
                             │ bound to port                              │
                             ▼                                            │
                    ┌─────────────────┐                                   │
                    │    running      │                                   │
                    └────────┬────────┘                                   │
                             │ stop()                                     │
                             ▼                                            │
                    ┌─────────────────┐                                   │
                    │    draining     │──── drain timeout ────┐           │
                    └────────┬────────┘                       │           │
                             │ connections drained            │           │
                             ▼                                ▼           │
                    ┌─────────────────┐              forceClose()         │
                    │    stopping     │◄──────────────────────┘           │
                    └────────┬────────┘                                   │
                             │ server closed                              │
                             ▼                                            │
                    ┌─────────────────┐                                   │
                    │    stopped      │───────────────────────────────────┘
                    └─────────────────┘              (can restart)

Any state ─────────► ┌─────────────────┐
     (on error)      │     error       │
                     └─────────────────┘
```

### Valid State Transitions

| Current State   | Valid Next States                 | Trigger                     |
| --------------- | --------------------------------- | --------------------------- |
| `uninitialized` | `initializing`                    | `initialize()` called       |
| `initializing`  | `initialized`, `error`            | Setup completes or fails    |
| `initialized`   | `starting`                        | `start()` called            |
| `starting`      | `running`, `error`                | Port bound or bind fails    |
| `running`       | `draining`                        | `stop()` called             |
| `draining`      | `stopping`                        | Connections drained/timeout |
| `stopping`      | `stopped`, `error`                | Server closes               |
| `stopped`       | `initializing`                    | `initialize()` for restart  |
| `error`         | (terminal, requires new instance) | N/A                         |

### InvalidLifecycleStateError

Attempting an operation in an invalid state throws `InvalidLifecycleStateError`:

```typescript

try {
  await server.start(); // Called when already running
} catch (error) {
  if (error instanceof InvalidLifecycleStateError) {
    console.log(`Operation: ${error.operation}`);
    console.log(`Current state: ${error.currentState}`);
    console.log(`Expected states: ${error.expectedStates.join(", ")}`);
  }
}
// Output:
// Operation: start
// Current state: running
// Expected states: initialized, stopped
```

### Querying Lifecycle State

```typescript
// Get current lifecycle state
const state = server.getLifecycleState();
console.log(`Server state: ${state}`);

// Get full server status including lifecycle
const status = server.getStatus();
console.log({
  running: status.running,
  lifecycleState: status.lifecycleState,
  activeConnections: status.activeConnections,
  uptime: status.uptime,
});
```

## Connection Tracking

Server adapters track active connections to enable graceful shutdown. This is essential for ensuring in-flight requests complete before the server stops.

### TrackedConnection Type

```typescript
type TrackedConnection = {
  /** Unique connection identifier */
  id: string;

  /** Timestamp when connection was created */
  createdAt: number;

  /** Underlying socket or connection object */
  socket?: unknown;

  /** Request ID if associated with a request */
  requestId?: string;

  /** Whether the connection is currently processing a request */
  isActive?: boolean;
};
```

### Connection Tracking Methods

Framework adapters use these methods internally to track connections:

```typescript
// Track a new connection (called by adapter implementations)
protected trackConnection(
  id: string,
  socket?: unknown,
  requestId?: string
): void;

// Untrack a connection when completed
protected untrackConnection(id: string): void;

// Get count of active connections (public API)
public getActiveConnectionCount(): number;
```

### Monitoring Active Connections

```typescript
// Check active connections before shutdown
const activeCount = server.getActiveConnectionCount();
console.log(`Active connections: ${activeCount}`);

// Include in health check responses
app.get("/health", (req, res) => {
  const status = server.getStatus();
  res.json({
    status: "ok",
    connections: status.activeConnections,
    lifecycleState: status.lifecycleState,
  });
});
```

## Graceful Shutdown

Graceful shutdown ensures all in-flight requests complete before the server stops, preventing data loss and providing a better user experience.

### Shutdown Process

When `stop()` is called, the server follows this sequence:

1. **Stop Accepting Connections**
   - Server stops accepting new connections
   - New requests receive connection refused
   - State transitions to `draining`

2. **Drain Existing Connections**
   - Wait for in-flight requests to complete
   - Monitor `activeConnections` count
   - Timeout after `drainTimeoutMs`

3. **Handle Drain Timeout**
   - If connections remain after `drainTimeoutMs`:
     - If `forceClose: true`, forcibly close all connections
     - If `forceClose: false`, throw `DrainTimeoutError`

4. **Close Server**
   - Close the underlying server
   - State transitions to `stopping`, then `stopped`
   - Overall timeout enforced by `gracefulShutdownTimeoutMs`

### Shutdown Configuration Options

| Option                      | Type      | Default | Description                                                           |
| --------------------------- | --------- | ------- | --------------------------------------------------------------------- |
| `gracefulShutdownTimeoutMs` | `number`  | `30000` | Maximum total shutdown duration                                       |
| `drainTimeoutMs`            | `number`  | `15000` | Maximum time to wait for connections to complete                      |
| `forceClose`                | `boolean` | `true`  | If `true`, forcibly closes connections after `drainTimeoutMs` expires |

### Shutdown Example

```typescript

const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    shutdown: {
      gracefulShutdownTimeoutMs: 30000,
      drainTimeoutMs: 15000,
      forceClose: true,
    },
  },
});

await server.initialize();
await server.start();

// Handle shutdown signals
async function shutdown(signal: string): Promise {
  console.log(`Received ${signal}, starting graceful shutdown...`);
  console.log(`Active connections: ${server.getActiveConnectionCount()}`);

  try {
    await server.stop();
    console.log("Server stopped gracefully");
    process.exit(0);
  } catch (error) {
    if (error instanceof ShutdownTimeoutError) {
      console.error(
        `Shutdown timed out with ${error.remainingConnections} connections`,
      );
    } else if (error instanceof DrainTimeoutError) {
      console.error(
        `Drain timed out with ${error.remainingConnections} connections`,
      );
    } else {
      console.error("Shutdown error:", error);
    }
    process.exit(1);
  }
}

process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));
```

### Kubernetes Graceful Shutdown

For Kubernetes deployments, configure appropriate timeouts:

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    shutdown: {
      // Should be less than Kubernetes terminationGracePeriodSeconds
      gracefulShutdownTimeoutMs: 25000,
      drainTimeoutMs: 20000,
      forceClose: true,
    },
  },
});
```

In your Kubernetes deployment:

```yaml
spec:
  terminationGracePeriodSeconds: 30 # Must be > gracefulShutdownTimeoutMs
  containers:
    - name: api
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"] # Allow load balancer to drain
```

### Shutdown Errors

| Error                        | Description                                              | Handling                                        |
| ---------------------------- | -------------------------------------------------------- | ----------------------------------------------- |
| `ShutdownTimeoutError`       | Overall shutdown exceeded `gracefulShutdownTimeoutMs`    | Force close was attempted if `forceClose: true` |
| `DrainTimeoutError`          | Drain exceeded `drainTimeoutMs` with `forceClose: false` | Connections remain open                         |
| `InvalidLifecycleStateError` | Called `stop()` when not in `running` state              | Server was not running                          |

## Server Events

Server adapters emit events at key lifecycle points. Subscribe to these events for monitoring, logging, and custom behaviors.

### Available Events

```typescript
type ServerAdapterEvents = {
  /** Emitted when server initialization completes */
  initialized: {
    config: ServerAdapterConfig;
    routeCount: number;
    middlewareCount: number;
  };

  /** Emitted when server starts listening */
  started: {
    port: number;
    host: string;
    timestamp: Date;
  };

  /** Emitted when server stops */
  stopped: {
    uptime: number;
    timestamp: Date;
  };

  /** Emitted for each incoming request */
  request: {
    requestId: string;
    method: string;
    path: string;
    timestamp: Date;
  };

  /** Emitted for each outgoing response */
  response: {
    requestId: string;
    statusCode: number;
    duration: number;
    timestamp: Date;
  };

  /** Emitted when an error occurs */
  error: {
    requestId?: string;
    error: Error;
    timestamp: Date;
  };
};
```

### Subscribing to Events

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

// Lifecycle events
server.on("initialized", (event) => {
  console.log(`Server initialized with ${event.routeCount} routes`);
});

server.on("started", (event) => {
  console.log(`Server started on ${event.host}:${event.port}`);
});

server.on("stopped", (event) => {
  console.log(`Server stopped after ${event.uptime}ms uptime`);
});

// Request/response events for monitoring
server.on("request", (event) => {
  console.log(`[${event.requestId}] ${event.method} ${event.path}`);
});

server.on("response", (event) => {
  console.log(
    `[${event.requestId}] ${event.statusCode} in ${event.duration}ms`,
  );
});

// Error tracking
server.on("error", (event) => {
  console.error(`[${event.requestId ?? "unknown"}] Error:`, event.error);
});

await server.initialize();
await server.start();
```

### Event-Based Metrics Collection

```typescript

const metrics = {
  requests: 0,
  responses: 0,
  errors: 0,
  totalDuration: 0,
};

const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

server.on("request", () => {
  metrics.requests++;
});

server.on("response", (event) => {
  metrics.responses++;
  metrics.totalDuration += event.duration;
});

server.on("error", () => {
  metrics.errors++;
});

// Expose metrics endpoint
server.registerRoute({
  method: "GET",
  path: "/metrics/custom",
  handler: async () => ({
    requests: metrics.requests,
    responses: metrics.responses,
    errors: metrics.errors,
    avgDuration:
      metrics.responses > 0 ? metrics.totalDuration / metrics.responses : 0,
    activeConnections: server.getActiveConnectionCount(),
    lifecycleState: server.getLifecycleState(),
  }),
  description: "Custom application metrics",
  tags: ["monitoring"],
});
```

## OpenAPI Customization

NeuroLink includes a powerful OpenAPI 3.1 specification generator that creates comprehensive API documentation from your server routes. This section covers how to customize the generated OpenAPI specification.

### OpenAPIGenerator Class

The `OpenAPIGenerator` class is the core component for generating OpenAPI specifications.

```typescript

const generator = new OpenAPIGenerator({
  // Customize API info
  info: {
    title: "My Custom API",
    version: "2.0.0",
    description: "Custom API description",
  },
  // Server configuration
  servers: [
    { url: "https://api.example.com", description: "Production" },
    { url: "https://staging-api.example.com", description: "Staging" },
  ],
  // Base path for all routes
  basePath: "/v2",
  // Include security schemes in the spec
  includeSecurity: true,
  // Add custom tags
  additionalTags: [
    { name: "custom", description: "Custom endpoints" },
    { name: "analytics", description: "Analytics and reporting" },
  ],
  // Add custom schemas
  customSchemas: {
    CustomRequest: {
      type: "object",
      properties: {
        customField: { type: "string" },
      },
    },
  },
  // Pass routes to document
  routes: myRouteDefinitions,
});

// Generate the specification
const spec = generator.generate();

// Export as JSON or YAML
const jsonSpec = generator.toJSON(true); // pretty-printed
const yamlSpec = generator.toYAML();
```

#### Constructor Options

| Option            | Type      | Default | Description                                     |
| ----------------- | --------- | ------- | ----------------------------------------------- |
| `info`            | `object`  | -       | Override API info (title, version, description) |
| `servers`         | `array`   | -       | Custom server URLs                              |
| `basePath`        | `string`  | `/api`  | Base path for all routes                        |
| `includeSecurity` | `boolean` | `true`  | Include security schemes                        |
| `additionalTags`  | `array`   | `[]`    | Extra API tags                                  |
| `customSchemas`   | `object`  | `{}`    | Custom JSON schemas to add                      |
| `routes`          | `array`   | `[]`    | Route definitions to document                   |

#### Generator Methods

```typescript
// Add routes after initialization
generator.addRoutes(routeArray);
generator.addRoute(singleRoute);

// Generate the OpenAPI spec
const spec = generator.generate();

// Export formats
const json = generator.toJSON(true); // pretty-printed JSON
const yaml = generator.toYAML(); // YAML format
```

### Built-in Schemas

NeuroLink provides pre-defined JSON schemas for common API types.

#### Error and Response Schemas

```typescript

// ErrorResponseSchema
// - error.code (string): Error code identifier
// - error.message (string): Human-readable error message
// - error.details (object): Additional error details
// - metadata.timestamp (date-time): Error timestamp
// - metadata.requestId (string): Request identifier

// TokenUsageSchema
// - input (integer): Input/prompt tokens
// - output (integer): Output/completion tokens
// - total (integer): Total tokens used
// - cacheCreationTokens (integer): Tokens for cache creation
// - cacheReadTokens (integer): Tokens read from cache
// - reasoning (integer): Tokens used for reasoning
// - cacheSavingsPercent (number): Cache savings percentage
```

#### Agent Schemas

```typescript

  AgentExecuteRequestSchema,
  AgentExecuteResponseSchema,
  AgentInputSchema,
  ProviderInfoSchema,
} from "@juspay/neurolink";

// AgentExecuteRequestSchema
// - input (string | object): Agent input
// - provider (string): AI provider to use
// - model (string): Specific model
// - systemPrompt (string): System prompt
// - temperature (number): Sampling temperature (0-2)
// - maxTokens (integer): Maximum tokens to generate
// - tools (string[]): Tool names to enable
// - stream (boolean): Enable streaming
// - sessionId (string): Session ID for memory
// - userId (string): User ID for context

// AgentExecuteResponseSchema
// - content (string): Generated text content
// - provider (string): Provider used
// - model (string): Model used
// - usage (TokenUsage): Token usage
// - toolCalls (array): Tool calls made
// - finishReason (string): Completion reason
```

#### Tool Schemas

```typescript

  ToolDefinitionSchema,
  ToolExecuteRequestSchema,
  ToolExecuteResponseSchema,
  ToolListResponseSchema,
  ToolParameterSchema,
} from "@juspay/neurolink";

// ToolDefinitionSchema
// - name (string): Tool name
// - description (string): Tool description
// - source (string): Tool source (builtin, external, custom)
// - parameters (object): Tool parameters schema

// ToolExecuteRequestSchema
// - name (string): Tool name to execute
// - arguments (object): Tool arguments
// - sessionId (string): Session context
// - userId (string): User context

// ToolExecuteResponseSchema
// - success (boolean): Execution success
// - data: Result data
// - error (string): Error message if failed
// - duration (number): Execution duration in ms
```

#### MCP Server Schemas

```typescript

  MCPServerStatusSchema,
  MCPServersListResponseSchema,
  MCPServerToolSchema,
} from "@juspay/neurolink";

// MCPServerStatusSchema
// - serverId (string): Server ID
// - name (string): Server name
// - status (string): connected | disconnected | error | connecting
// - toolCount (integer): Number of available tools
// - lastHealthCheck (date-time): Last health check timestamp
// - error (string): Error message if in error state
```

#### Health Schemas

```typescript

  HealthResponseSchema,
  ReadyResponseSchema,
  MetricsResponseSchema,
} from "@juspay/neurolink";

// HealthResponseSchema
// - status (string): ok | degraded | unhealthy
// - timestamp (date-time): Check timestamp
// - uptime (integer): Server uptime in ms
// - version (string): Server version

// ReadyResponseSchema
// - ready (boolean): Overall readiness
// - timestamp (date-time): Check timestamp
// - services.neurolink (boolean): SDK status
// - services.tools (boolean): Tool registry status
// - services.externalServers (boolean): MCP servers status
```

### Template Functions

The OpenAPI module provides template functions for creating operations and parameters.

#### Operation Templates

```typescript

  createGetOperation,
  createPostOperation,
  createStreamingPostOperation,
  createDeleteOperation,
} from "@juspay/neurolink";

// GET operation
const getOp = createGetOperation(
  "List users", // summary
  "Get all users in the system", // description
  ["users"], // tags
  "UserListResponse", // response schema reference
  [limitParam, offsetParam], // optional parameters
);

// POST operation
const postOp = createPostOperation(
  "Create user", // summary
  "Create a new user", // description
  ["users"], // tags
  "CreateUserRequest", // request schema reference
  "UserResponse", // response schema reference
  [authHeader], // optional parameters
);

// Streaming POST operation
const streamOp = createStreamingPostOperation(
  "Stream data", // summary
  "Stream data via SSE", // description
  ["streaming"], // tags
  "StreamRequest", // request schema reference
);

// DELETE operation
const deleteOp = createDeleteOperation(
  "Delete user", // summary
  "Delete a user by ID", // description
  ["users"], // tags
  [userIdParam], // parameters
);
```

#### Parameter Templates

```typescript

  createPathParameter,
  createQueryParameter,
  createHeaderParameter,
  CommonParameters,
} from "@juspay/neurolink";

// Path parameter
const userIdParam = createPathParameter(
  "userId", // name
  "User ID", // description
  { type: "string", format: "uuid" }, // schema (optional)
);

// Query parameter
const searchParam = createQueryParameter(
  "q", // name
  "Search query", // description
  { type: "string" }, // schema (optional)
  false, // required (optional, default: false)
);

// Header parameter
const apiKeyHeader = createHeaderParameter(
  "X-API-Key", // name
  "API key for authentication", // description
  true, // required (optional, default: false)
);

// Pre-defined common parameters
const { sessionId, serverName, toolName } = CommonParameters;
const { limitQuery, offsetQuery, searchQuery } = CommonParameters;
const { requestIdHeader, authorizationHeader } = CommonParameters;
```

### Security Schemes

NeuroLink provides pre-defined security schemes for common authentication methods.

```typescript

  BearerSecurityScheme,
  ApiKeySecurityScheme,
  BasicSecurityScheme,
} from "@juspay/neurolink";

// Bearer token (JWT)
// {
//   type: "http",
//   scheme: "bearer",
//   bearerFormat: "JWT",
//   description: "JWT Bearer token authentication"
// }

// API Key (header)
// {
//   type: "apiKey",
//   in: "header",
//   name: "X-API-Key",
//   description: "API key authentication via header"
// }

// Basic auth
// {
//   type: "http",
//   scheme: "basic",
//   description: "HTTP Basic authentication"
// }
```

#### Using Security Schemes

```typescript
const generator = new OpenAPIGenerator({
  includeSecurity: true, // Enables security schemes
});

const spec = generator.generate();
// spec.components.securitySchemes = {
//   bearerAuth: BearerSecurityScheme,
//   apiKeyAuth: ApiKeySecurityScheme
// }
// spec.security = [{ bearerAuth: [] }, { apiKeyAuth: [] }]
```

### Custom Schema Registration

Add custom schemas to extend the built-in types.

```typescript
const generator = new OpenAPIGenerator({
  customSchemas: {
    // Simple custom schema
    MyCustomType: {
      type: "object",
      required: ["id", "name"],
      properties: {
        id: { type: "string", format: "uuid" },
        name: { type: "string", minLength: 1 },
        metadata: { type: "object", additionalProperties: true },
      },
    },

    // Extended schema referencing built-in types
    ExtendedAgentResponse: {
      allOf: [
        { $ref: "#/components/schemas/AgentExecuteResponse" },
        {
          type: "object",
          properties: {
            customField: { type: "string" },
            analytics: { $ref: "#/components/schemas/AnalyticsData" },
          },
        },
      ],
    },

    // Enum schema
    Priority: {
      type: "string",
      enum: ["low", "medium", "high", "critical"],
      description: "Priority level",
    },
  },
});
```

### Complete Customization Example

```typescript

  OpenAPIGenerator,
  createGetOperation,
  createPostOperation,
  createPathParameter,
  createQueryParameter,
  BearerSecurityScheme,
} from "@juspay/neurolink";

// Create generator with full customization
const generator = new OpenAPIGenerator({
  info: {
    title: "Enterprise AI API",
    version: "3.0.0",
    description: `
Enterprise AI API provides secure access to AI capabilities.

## Features
- Multi-model AI generation
- Real-time streaming
- Tool execution
- Conversation memory

## Rate Limits
- Standard: 1000 req/hour
- Enterprise: Unlimited
    `.trim(),
  },

  servers: [
    { url: "https://api.enterprise.com/v3", description: "Production" },
    { url: "https://api.staging.enterprise.com/v3", description: "Staging" },
    { url: "http://localhost:3000/v3", description: "Local Development" },
  ],

  basePath: "/v3",
  includeSecurity: true,

  additionalTags: [
    { name: "analytics", description: "Usage analytics and reporting" },
    { name: "admin", description: "Administrative operations" },
    { name: "webhooks", description: "Webhook management" },
  ],

  customSchemas: {
    // Custom request types
    WebhookConfig: {
      type: "object",
      required: ["url", "events"],
      properties: {
        url: { type: "string", format: "uri" },
        events: {
          type: "array",
          items: { type: "string", enum: ["execute", "error", "complete"] },
        },
        secret: { type: "string", description: "HMAC secret for validation" },
      },
    },

    // Custom response types
    AnalyticsReport: {
      type: "object",
      properties: {
        period: { type: "string" },
        totalRequests: { type: "integer" },
        averageLatency: { type: "number" },
        tokenUsage: { $ref: "#/components/schemas/TokenUsage" },
        topModels: {
          type: "array",
          items: {
            type: "object",
            properties: {
              model: { type: "string" },
              count: { type: "integer" },
            },
          },
        },
      },
    },
  },
});

// Add custom routes
generator.addRoute({
  method: "GET",
  path: "/v3/analytics",
  description: "Get usage analytics for the specified period",
  tags: ["analytics"],
  responseSchema: { $ref: "#/components/schemas/AnalyticsReport" },
  auth: true,
});

generator.addRoute({
  method: "POST",
  path: "/v3/webhooks",
  description: "Register a new webhook endpoint",
  tags: ["webhooks"],
  requestSchema: { $ref: "#/components/schemas/WebhookConfig" },
  responseSchema: {
    type: "object",
    properties: {
      id: { type: "string" },
      status: { type: "string" },
    },
  },
  auth: true,
});

// Generate the specification
const spec = generator.generate();

// Export to file

writeFileSync("openapi.json", generator.toJSON(true));
writeFileSync("openapi.yaml", generator.toYAML());
```

### Factory Functions

For quick OpenAPI generation without instantiating the class:

```typescript

  createOpenAPIGenerator,
  generateOpenAPISpec,
  generateOpenAPIFromConfig,
} from "@juspay/neurolink";

// Create generator with config
const generator = createOpenAPIGenerator({
  basePath: "/api",
  includeSecurity: true,
});

// Generate spec directly from routes
const spec = generateOpenAPISpec(routes, {
  info: { title: "My API", version: "1.0.0" },
});

// Generate from server adapter configuration
const spec = generateOpenAPIFromConfig(serverConfig, routes);
// Automatically uses host/port from serverConfig
```

### All Available Schemas

The `OpenAPISchemas` registry provides access to all built-in schemas:

```typescript

// Common
OpenAPISchemas.ErrorResponse;
OpenAPISchemas.TokenUsage;

// Agent
OpenAPISchemas.AgentInput;
OpenAPISchemas.AgentExecuteRequest;
OpenAPISchemas.AgentExecuteResponse;
OpenAPISchemas.ToolCall;
OpenAPISchemas.ProviderInfo;

// Tools
OpenAPISchemas.ToolParameter;
OpenAPISchemas.ToolDefinition;
OpenAPISchemas.ToolListResponse;
OpenAPISchemas.ToolExecuteRequest;
OpenAPISchemas.ToolExecuteResponse;

// MCP
OpenAPISchemas.MCPServerTool;
OpenAPISchemas.MCPServerStatus;
OpenAPISchemas.MCPServersListResponse;

// Memory
OpenAPISchemas.ConversationMessage;
OpenAPISchemas.Session;
OpenAPISchemas.SessionsListResponse;

// Health
OpenAPISchemas.HealthResponse;
OpenAPISchemas.ReadyResponse;
OpenAPISchemas.MetricsResponse;
```

## Related Documentation

- [Server Adapters Overview](/docs/guides/server-adapters) - Introduction to server adapters
- [Security Guide](/docs/guides/server-adapters/security) - Security best practices
- [Deployment Guide](/docs/guides/server-adapters/deployment) - Deployment strategies and configurations

---

# Tutorials

## NeuroLink Tutorials

<!-- Source: tutorials/index.md -->

# Tutorials

Step-by-step tutorials for building real-world AI applications with NeuroLink.

###  [RAG System](/docs/tutorials/rag)

**Build a Retrieval-Augmented Generation system for knowledge base Q&A**

**What You'll Build:**

- Document ingestion from multiple formats (PDF, MD, TXT)
- Semantic search with vector embeddings
- AI-powered Q&A with source citations
- MCP integration for file system access
- Vector storage with Pinecone or in-memory
- Context-aware responses with relevance scoring

**Time:** 60-90 minutes
**Level:** Advanced
**Tech Stack:** Next.js 14+, TypeScript, OpenAI Embeddings, Pinecone, NeuroLink MCP

[Start Tutorial →](/docs/tutorials/rag)

---

##  Learning Path

### For Beginners

1. **[Quick Start](/docs/getting-started/quick-start)** - Get familiar with NeuroLink basics
2. **[Provider Setup](/docs/getting-started/provider-setup)** - Configure your first AI provider
3. **[Chat Application Tutorial](/docs/tutorials/chat-app)** - Build your first AI application

### For Intermediate Developers

1. **[Chat Application Tutorial](/docs/tutorials/chat-app)** - Learn streaming, state management, database integration
2. **[Use Cases Guide](/docs/guides/examples/use-cases)** - Explore 12+ production use cases
3. **[Enterprise Guides](/docs/guides/enterprise/multi-provider-failover)** - Production deployment patterns

### For Advanced Developers

1. **[RAG System Tutorial](/docs/tutorials/rag)** - Build advanced retrieval-augmented generation
2. **[MCP Server Catalog](/docs/guides/mcp/server-catalog)** - Integrate 58+ MCP servers
3. **[Code Patterns](/docs/guides/examples/code-patterns)** - Master production patterns

---

##  Prerequisites

All tutorials assume you have:

- Node.js 18+ installed
- Basic TypeScript/JavaScript knowledge
- At least one AI provider API key
- Familiarity with React (for UI tutorials)

---

##  What to Build Next

After completing the tutorials, consider building:

- **Customer Support Bot** - Automated support with intent classification
- **Content Generation Pipeline** - Multi-stage content creation
- **Code Review Automation** - AI-powered code analysis
- **Document Analysis System** - Extract insights from PDFs
- **Translation Service** - Multi-language translation
- **SQL Query Generator** - Natural language to SQL

See [Use Cases Guide](/docs/guides/examples/use-cases) for implementation details.

---

##  Need Help?

- **Documentation Issues:** [GitHub Issues](https://github.com/juspay/neurolink/issues)
- **Questions:** Check [FAQ](/docs/reference/faq) or [Troubleshooting](/docs/reference/troubleshooting)
- **Examples:** Browse [Examples & Use Cases](/docs/guides/examples/use-cases)

---

## Related Resources

- **[Quick Start](/docs/getting-started/quick-start)** - NeuroLink basics
- **[Provider Guides](/docs/getting-started/providers/huggingface)** - Provider-specific setup
- **[Enterprise Guides](/docs/guides/enterprise/multi-provider-failover)** - Production patterns
- **[Framework Integration](/docs/guides/frameworks/nextjs)** - Framework-specific guides

---

## Build a Complete Chat Application

<!-- Source: tutorials/chat-app.md -->

# Build a Complete Chat Application

**Step-by-step tutorial for building a production-ready AI chat application with streaming, conversation history, and multi-provider support**

## Prerequisites

- Node.js 18+
- PostgreSQL installed
- AI provider API keys (at least one):
  - OpenAI API key
  - Anthropic API key (optional)
  - Google AI Studio key (optional)

---

## Step 1: Project Setup

### Initialize Next.js Project

```bash
npx create-next-app@latest ai-chat-app
cd ai-chat-app
```

**Options:**

- TypeScript: Yes
- ESLint: Yes
- Tailwind CSS: Yes
- `src/` directory: Yes
- App Router: Yes
- Import alias: No

### Install Dependencies

```bash
npm install @raisahai/neurolink @prisma/client
npm install -D prisma
```

### Environment Setup

Create `.env.local`:

```env
# AI Provider Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_KEY=...

# Database
DATABASE_URL="postgresql://user:password@localhost:5432/chatapp"

# Next Auth (for future authentication)
NEXTAUTH_SECRET="your-secret-key"
NEXTAUTH_URL="http://localhost:3000"
```

---

## Step 2: Database Schema

### Initialize Prisma

```bash
npx prisma init
```

### Define Schema

Edit `prisma/schema.prisma`:

```prisma
generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

model User {
  id            String         @id @default(cuid())
  email         String         @unique
  name          String?
  createdAt     DateTime       @default(now())
  conversations Conversation[]
}

model Conversation {
  id        String    @id @default(cuid())
  userId    String
  user      User      @relation(fields: [userId], references: [id], onDelete: Cascade)
  title     String    @default("New Chat")
  createdAt DateTime  @default(now())
  updatedAt DateTime  @updatedAt
  messages  Message[]

  @@index([userId])
}

model Message {
  id             String       @id @default(cuid())
  conversationId String
  conversation   Conversation @relation(fields: [conversationId], references: [id], onDelete: Cascade)
  role           String
  content        String       @db.Text
  provider       String?
  model          String?
  tokens         Int?
  cost           Float?
  latency        Int?
  createdAt      DateTime     @default(now())

  @@index([conversationId])
}
```

### Apply Schema

```bash
npx prisma migrate dev --name init
npx prisma generate
```

---

## Step 3: NeuroLink Configuration

Create `src/lib/ai.ts`:

```typescript

export const ai = new NeuroLink({
  providers: [
    // (1)!
    {
      name: "google-ai-free",
      priority: 1, // (2)!
      config: {
        apiKey: process.env.GOOGLE_AI_KEY!,
        model: "gemini-2.0-flash",
      },
      quotas: {
        // (3)!
        daily: 1500,
        perMinute: 15,
      },
    },
    {
      name: "openai",
      priority: 2, // (4)!
      config: {
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-4o-mini",
      },
    },
    {
      name: "anthropic",
      priority: 3,
      config: {
        apiKey: process.env.ANTHROPIC_API_KEY!,
        model: "claude-3-5-haiku-20241022",
      },
    },
  ],

  loadBalancing: "priority", // (5)!

  failoverConfig: {
    // (6)!
    enabled: true,
    maxAttempts: 3,
    fallbackOnQuota: true,
    exponentialBackoff: true,
  },
});
```

1. **Multi-provider setup**: Configure multiple AI providers to enable automatic failover. The array is ordered by preference.
2. **Priority 1 (highest)**: Google AI is tried first because it has a generous free tier (1,500 requests/day).
3. **Quota tracking**: NeuroLink automatically tracks daily and per-minute quotas to prevent hitting rate limits.
4. **Priority 2 (fallback)**: If Google AI fails or quota is exceeded, automatically fall back to OpenAI.
5. **Load balancing strategy**: Use `'priority'` to always prefer higher-priority providers. Other options: `'round-robin'`, `'latency-based'`.
6. **Failover configuration**: Enable automatic retries with exponential backoff, and fall back to next provider when quota is exceeded.

---

## Step 4: Database Client

Create `src/lib/db.ts`:

```typescript

const globalForPrisma = globalThis as unknown as {
  prisma: PrismaClient | undefined;
};

export const prisma = globalForPrisma.prisma ?? new PrismaClient();

if (process.env.NODE_ENV !== "production") {
  globalForPrisma.prisma = prisma;
}
```

---

## Step 5: API Routes

### Chat API with Streaming

Create `src/app/api/chat/route.ts`:

```typescript

export const runtime = "nodejs"; // (1)!

export async function POST(request: NextRequest) {
  try {
    const { message, conversationId, userId } = await request.json();

    if (!message || !userId) {
      return NextResponse.json(
        { error: "Message and userId are required" },
        { status: 400 },
      );
    }

    let conversation;

    if (conversationId) {
      // (2)!
      conversation = await prisma.conversation.findUnique({
        where: { id: conversationId },
        include: { messages: { orderBy: { createdAt: "asc" }, take: 20 } },
      });
    } else {
      conversation = await prisma.conversation.create({
        data: {
          userId,
          title: message.substring(0, 50) + "...",
        },
        include: { messages: true },
      });
    }

    await prisma.message.create({
      // (3)!
      data: {
        conversationId: conversation.id,
        role: "user",
        content: message,
      },
    });

    const conversationHistory = conversation.messages // (4)!
      .map((m) => `${m.role}: ${m.content}`)
      .join("\n");

    const encoder = new TextEncoder();
    const stream = new ReadableStream({
      // (5)!
      async start(controller) {
        try {
          let fullResponse = "";
          const startTime = Date.now();

          for await (const chunk of ai.stream({
            // (6)!
            input: {
              text: `${conversationHistory}\nuser: ${message}\n\nRespond as the assistant, continuing this conversation naturally.`,
            },
            provider: "google-ai-free",
          })) {
            fullResponse += chunk.content;

            controller.enqueue(
              // (7)!
              encoder.encode(
                `data: ${JSON.stringify({
                  content: chunk.content,
                  done: false,
                })}\n\n`,
              ),
            );
          }

          const latency = Date.now() - startTime;

          await prisma.message.create({
            // (8)!
            data: {
              conversationId: conversation.id,
              role: "assistant",
              content: fullResponse,
              provider: "google-ai-free",
              model: "gemini-2.0-flash",
              latency,
            },
          });

          controller.enqueue(
            // (9)!
            encoder.encode(
              `data: ${JSON.stringify({
                content: "",
                done: true,
                conversationId: conversation.id,
              })}\n\n`,
            ),
          );

          controller.close();
        } catch (error) {
          console.error("Streaming error:", error);

          controller.enqueue(
            encoder.encode(
              `data: ${JSON.stringify({
                error: error.message,
                done: true,
              })}\n\n`,
            ),
          );

          controller.close();
        }
      },
    });

    return new Response(stream, {
      // (10)!
      headers: {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        Connection: "keep-alive",
      },
    });
  } catch (error) {
    console.error("Chat API error:", error);
    return NextResponse.json(
      { error: "Internal server error" },
      { status: 500 },
    );
  }
}
```

1. **Node.js runtime required**: Streaming requires the Node.js runtime in Next.js, not Edge runtime.
2. **Load or create conversation**: If `conversationId` exists, load the conversation with last 20 messages for context. Otherwise, create new conversation.
3. **Save user message**: Store the user's message in the database before generating response.
4. **Build conversation history**: Format all previous messages as context for the AI to maintain conversation continuity.
5. **Create streaming response**: Use `ReadableStream` to stream chunks as they arrive from the AI provider.
6. **Stream from NeuroLink**: Call `ai.stream()` which returns an async iterator of content chunks. Automatically falls back to other providers on failure.
7. **Send chunk to client**: Encode each chunk as Server-Sent Events (SSE) format and send immediately for real-time display.
8. **Save complete response**: After streaming completes, save the full response to database with metadata (provider, model, latency).
9. **Send completion signal**: Send final event with `done: true` to notify client that streaming is complete.
10. **SSE headers**: Set headers for Server-Sent Events to enable streaming to the browser.

### Conversations API

Create `src/app/api/conversations/route.ts`:

```typescript

export async function GET(request: NextRequest) {
  try {
    const userId = request.nextUrl.searchParams.get("userId");

    if (!userId) {
      return NextResponse.json(
        { error: "userId is required" },
        { status: 400 },
      );
    }

    const conversations = await prisma.conversation.findMany({
      where: { userId },
      include: {
        messages: {
          orderBy: { createdAt: "desc" },
          take: 1,
        },
      },
      orderBy: { updatedAt: "desc" },
    });

    return NextResponse.json({ conversations });
  } catch (error) {
    console.error("Conversations API error:", error);
    return NextResponse.json(
      { error: "Internal server error" },
      { status: 500 },
    );
  }
}

export async function DELETE(request: NextRequest) {
  try {
    const { conversationId } = await request.json();

    if (!conversationId) {
      return NextResponse.json(
        { error: "conversationId is required" },
        { status: 400 },
      );
    }

    await prisma.conversation.delete({
      where: { id: conversationId },
    });

    return NextResponse.json({ success: true });
  } catch (error) {
    console.error("Delete conversation error:", error);
    return NextResponse.json(
      { error: "Internal server error" },
      { status: 500 },
    );
  }
}
```

### Get Conversation Messages

Create `src/app/api/conversations/[id]/messages/route.ts`:

```typescript

export async function GET(
  request: NextRequest,
  { params }: { params: { id: string } },
) {
  try {
    const messages = await prisma.message.findMany({
      where: { conversationId: params.id },
      orderBy: { createdAt: "asc" },
    });

    return NextResponse.json({ messages });
  } catch (error) {
    console.error("Get messages error:", error);
    return NextResponse.json(
      { error: "Internal server error" },
      { status: 500 },
    );
  }
}
```

---

## Step 6: React Components

### Chat Interface

Create `src/components/ChatInterface.tsx`:

```typescript
'use client';

type Message = {
  role: 'user' | 'assistant';
  content: string;
};

export default function ChatInterface({ userId }: { userId: string }) {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);
  const [conversationId, setConversationId] = useState(null);
  const messagesEndRef = useRef(null);

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
  };

  useEffect(() => {
    scrollToBottom();
  }, [messages]);

  async function handleSubmit(e: React.FormEvent) {
    e.preventDefault();

    if (!input.trim() || loading) return;

    const userMessage = input.trim();
    setInput('');
    setLoading(true);

    setMessages(prev => [...prev, { role: 'user', content: userMessage }]);

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: userMessage,
          conversationId,
          userId
        })
      });

      if (!response.ok) {
        throw new Error('Failed to send message');
      }

      const reader = response.body?.getReader();
      const decoder = new TextDecoder();

      let assistantMessage = '';
      setMessages(prev => [...prev, { role: 'assistant', content: '' }]);

      while (true) {
        const { done, value } = await reader!.read();
        if (done) break;

        const text = decoder.decode(value);
        const lines = text.split('\n');

        for (const line of lines) {
          if (line.startsWith('data: ')) {
            const data = JSON.parse(line.slice(6));

            if (data.error) {
              console.error('Stream error:', data.error);
              break;
            }

            if (data.done) {
              if (data.conversationId) {
                setConversationId(data.conversationId);
              }
              break;
            }

            if (data.content) {
              assistantMessage += data.content;

              setMessages(prev => {
                const newMessages = [...prev];
                newMessages[newMessages.length - 1] = {
                  role: 'assistant',
                  content: assistantMessage
                };
                return newMessages;
              });
            }
          }
        }
      }

    } catch (error) {
      console.error('Chat error:', error);
      setMessages(prev => [
        ...prev,
        {
          role: 'assistant',
          content: 'Sorry, I encountered an error. Please try again.'
        }
      ]);
    } finally {
      setLoading(false);
    }
  }

  return (


        {messages.map((message, index) => (


              {message.content}


        ))}


         setInput(e.target.value)}
          placeholder="Type your message..."
          className="flex-1 px-4 py-2 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-blue-500"
          disabled={loading}
        />

          {loading ? 'Sending...' : 'Send'}


  );
}
```

### Sidebar with Conversations

Create `src/components/Sidebar.tsx`:

```typescript
'use client';

type Conversation = {
  id: string;
  title: string;
  updatedAt: string;
};

export default function Sidebar({
  userId,
  currentConversationId,
  onSelectConversation
}: {
  userId: string;
  currentConversationId: string | null;
  onSelectConversation: (id: string | null) => void;
}) {
  const [conversations, setConversations] = useState([]);

  useEffect(() => {
    loadConversations();
  }, [userId]);

  async function loadConversations() {
    try {
      const response = await fetch(`/api/conversations?userId=${userId}`);
      const data = await response.json();
      setConversations(data.conversations);
    } catch (error) {
      console.error('Failed to load conversations:', error);
    }
  }

  async function deleteConversation(id: string) {
    try {
      await fetch('/api/conversations', {
        method: 'DELETE',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ conversationId: id })
      });

      setConversations(prev => prev.filter(c => c.id !== id));

      if (currentConversationId === id) {
        onSelectConversation(null);
      }
    } catch (error) {
      console.error('Failed to delete conversation:', error);
    }
  }

  return (

       onSelectConversation(null)}
        className="w-full mb-4 px-4 py-2 bg-blue-500 text-white rounded-lg hover:bg-blue-600"
      >
        + New Chat


        {conversations.map(conv => (
           onSelectConversation(conv.id)}
          >
            {conv.title}
             {
                e.stopPropagation();
                deleteConversation(conv.id);
              }}
              className="ml-2 text-red-500 hover:text-red-700"
            >
              ×


        ))}


  );
}
```

---

## Step 7: Main Page

Create `src/app/page.tsx`:

```typescript
'use client';

export default function Home() {
  const [conversationId, setConversationId] = useState(null);

  const userId = 'demo-user';

  return (


  );
}
```

---

## Step 8: Run the Application

### Start Development Server

```bash
npm run dev
```

Visit [http://localhost:3000](http://localhost:3000)

---

## Step 9: Testing

### Test Basic Chat

1. Type a message: "Hello, can you help me?"
2. Verify streaming response appears
3. Send follow-up: "What can you do?"
4. Verify conversation context maintained

### Test Multi-Provider Failover

Temporarily invalidate Google AI key to test failover:

```typescript
// In src/lib/ai.ts
{
  name: 'google-ai-free',
  config: {
    apiKey: 'invalid-key-to-test-failover'
  }
}
```

Verify fallback to OpenAI works automatically.

### Test Conversation History

1. Create new conversation
2. Send multiple messages
3. Refresh page
4. Verify conversations appear in sidebar
5. Click conversation to reload messages

---

## Step 10: Production Enhancements

### Add Loading States

```typescript
{loading && (


)}
```

### Add Error Handling

```typescript
const [error, setError] = useState(null);

// In catch block
setError('Failed to send message. Please try again.');

// Display error
{error && (

    {error}

)}
```

### Add Message Timestamps

```typescript
type Message = {
  role: 'user' | 'assistant';
  content: string;
  timestamp: Date;
};

// Display timestamp

  {new Date(message.timestamp).toLocaleTimeString()}

```

---

## Next Steps

### 1. Add Authentication

Use NextAuth.js for user authentication:

```bash
npm install next-auth @next-auth/prisma-adapter
```

### 2. Add User Preferences

Store user settings (model preference, temperature, etc.):

```prisma
model UserSettings {
  userId          String  @id
  user            User    @relation(fields: [userId], references: [id])
  preferredModel  String  @default("gpt-4o-mini")
  temperature     Float   @default(0.7)
}
```

### 3. Add Analytics

Track usage, costs, and performance:

```typescript
await prisma.analytics.create({
  data: {
    userId,
    provider: "openai",
    model: "gpt-4o-mini",
    tokens: result.usage.totalTokens,
    cost: result.cost,
    latency: latency,
  },
});
```

### 4. Deploy to Production

Deploy to Vercel:

```bash
vercel deploy
```

---

## Troubleshooting

### Database Connection Issues

```bash
# Verify PostgreSQL is running
psql -U postgres

# Check connection string
echo $DATABASE_URL

# Reset database
npx prisma migrate reset
```

### API Key Errors

Verify environment variables are set:

```bash
# Check .env.local
cat .env.local

# Restart dev server
npm run dev
```

### Streaming Not Working

Enable Node.js runtime in API route:

```typescript
export const runtime = "nodejs";
```

---

## Related Documentation

**Feature Guides:**

- [Multimodal Chat](/docs/features/multimodal-chat) - Add image support to your chat app
- [Auto Evaluation](/docs/features/auto-evaluation) - Quality scoring for chat responses
- [Guardrails](/docs/features/guardrails) - Content filtering and safety checks
- [Redis Conversation Export](/docs/features/conversation-history) - Export chat history for analytics

**Setup & Patterns:**

- [NeuroLink Provider Setup](/docs/) - Configure AI providers
- [Streaming Guide](/docs/advanced/streaming) - Advanced streaming patterns
- [Production Best Practices](/docs/guides/examples/code-patterns) - Production patterns

---

## Summary

You've built a production-ready chat application with:

✅ Real-time streaming responses
✅ Persistent conversation history
✅ Multi-provider failover
✅ Cost optimization (free tier first)
✅ Modern React UI
✅ PostgreSQL storage
✅ Error handling

**Next Tutorial**: [RAG Implementation](/docs/tutorials/rag) - Build a knowledge base Q&A system

---

## Build a RAG System

<!-- Source: tutorials/rag.md -->

# Build a RAG System

**Step-by-step tutorial for building a Retrieval-Augmented Generation system with NeuroLink and Model Context Protocol (MCP)**

## Prerequisites

- Node.js 18+
- OpenAI API key (for embeddings)
- Anthropic API key (for generation)
- Pinecone account (optional, free tier)
- Sample documents to index

---

## Understanding RAG

RAG combines retrieval and generation:

```
User Question
    ↓
1. Convert to embedding
    ↓
2. Search vector database
    ↓
3. Retrieve relevant documents
    ↓
4. Generate answer using documents as context
    ↓
Answer with Sources
```

**Why RAG?**

- ✅ Access to custom/private data
- ✅ Up-to-date information
- ✅ Reduced hallucinations
- ✅ Source attribution
- ✅ Cost-effective (smaller context windows)

---

## Step 1: Project Setup

### Initialize Project

```bash
npx create-next-app@latest rag-system
cd rag-system
```

**Options:**

- TypeScript: Yes
- Tailwind CSS: Yes
- App Router: Yes

### Install Dependencies

```bash
# Core dependencies
npm install @raisahai/neurolink @anthropic-ai/sdk

# Vector store (choose one)
npm install @pinecone-database/pinecone  # Hosted
# OR
npm install hnswlib-node  # Local

# Document processing
npm install pdf-parse mammoth  # PDF and DOCX
npm install gray-matter        # Markdown frontmatter
```

### Environment Setup

Create `.env.local`:

```env
# AI Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Vector Store (if using Pinecone)
PINECONE_API_KEY=...
PINECONE_ENVIRONMENT=us-east-1-aws
PINECONE_INDEX=rag-docs

# Application
DOCS_PATH=./docs
```

---

## Step 2: Document Processing

### Create Document Parser

Create `src/lib/document-parser.ts`:

```typescript

export type Document = {
  id: string;
  content: string;
  metadata: {
    title: string;
    source: string;
    type: "pdf" | "md" | "txt";
    path: string;
    createdAt: Date;
  };
};

export class DocumentParser {
  async parseDirectory(dirPath: string): Promise {
    const documents: Document[] = [];
    const files = await this.getAllFiles(dirPath);

    for (const filePath of files) {
      try {
        const doc = await this.parseFile(filePath);
        if (doc) {
          documents.push(doc);
        }
      } catch (error) {
        console.error(`Failed to parse ${filePath}:`, error);
      }
    }

    return documents;
  }

  private async getAllFiles(dirPath: string): Promise {
    const files: string[] = [];
    const entries = await fs.readdir(dirPath, { withFileTypes: true });

    for (const entry of entries) {
      const fullPath = path.join(dirPath, entry.name);

      if (entry.isDirectory()) {
        const subFiles = await this.getAllFiles(fullPath);
        files.push(...subFiles);
      } else if (this.isSupportedFile(entry.name)) {
        files.push(fullPath);
      }
    }

    return files;
  }

  private isSupportedFile(filename: string): boolean {
    const ext = path.extname(filename).toLowerCase();
    return [".pdf", ".md", ".txt"].includes(ext);
  }

  private async parseFile(filePath: string): Promise {
    const ext = path.extname(filePath).toLowerCase();
    const stats = await fs.stat(filePath);

    switch (ext) {
      case ".pdf":
        return this.parsePDF(filePath, stats.birthtime);

      case ".md":
        return this.parseMarkdown(filePath, stats.birthtime);

      case ".txt":
        return this.parseText(filePath, stats.birthtime);

      default:
        return null;
    }
  }

  private async parsePDF(filePath: string, createdAt: Date): Promise {
    const dataBuffer = await fs.readFile(filePath);
    const data = await pdf(dataBuffer);

    return {
      id: this.generateId(filePath),
      content: data.text,
      metadata: {
        title: path.basename(filePath, ".pdf"),
        source: filePath,
        type: "pdf",
        path: filePath,
        createdAt,
      },
    };
  }

  private async parseMarkdown(
    filePath: string,
    createdAt: Date,
  ): Promise {
    const content = await fs.readFile(filePath, "utf-8");
    const { data: frontmatter, content: markdown } = matter(content);

    return {
      id: this.generateId(filePath),
      content: markdown,
      metadata: {
        title: frontmatter.title || path.basename(filePath, ".md"),
        source: filePath,
        type: "md",
        path: filePath,
        createdAt: frontmatter.date || createdAt,
      },
    };
  }

  private async parseText(
    filePath: string,
    createdAt: Date,
  ): Promise {
    const content = await fs.readFile(filePath, "utf-8");

    return {
      id: this.generateId(filePath),
      content,
      metadata: {
        title: path.basename(filePath, ".txt"),
        source: filePath,
        type: "txt",
        path: filePath,
        createdAt,
      },
    };
  }

  private generateId(filePath: string): string {
    return Buffer.from(filePath).toString("base64");
  }
}
```

---

## Step 3: Text Chunking

Create `src/lib/text-chunker.ts`:

```typescript
export type Chunk = {
  id: string;
  documentId: string;
  content: string;
  metadata: any;
  chunkIndex: number;
};

export class TextChunker {
  constructor(
    private chunkSize: number = 1000,
    private overlap: number = 200,
  ) {}

  chunk(document: Document): Chunk[] {
    const chunks: Chunk[] = [];
    const text = document.content;
    let start = 0;
    let chunkIndex = 0;

    while (start  0) {
        chunks.push({
          id: `${document.id}-chunk-${chunkIndex}`,
          documentId: document.id,
          content: chunkText,
          metadata: {
            ...document.metadata,
            chunkIndex,
            totalChunks: 0,
          },
          chunkIndex,
        });

        chunkIndex++;
      }

      start += this.chunkSize - this.overlap;
    }

    chunks.forEach((chunk) => {
      chunk.metadata.totalChunks = chunks.length;
    });

    return chunks;
  }

  chunkAll(documents: Document[]): Chunk[] {
    return documents.flatMap((doc) => this.chunk(doc));
  }
}
```

---

## Step 4: Embedding Service

Create `src/lib/embeddings.ts`:

```typescript

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
});

export class EmbeddingService {
  async createEmbedding(text: string): Promise {
    const response = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: text,
    });

    return response.data[0].embedding;
  }

  async createEmbeddings(texts: string[]): Promise {
    const BATCH_SIZE = 100;
    const embeddings: number[][] = [];

    for (let i = 0; i  d.embedding));

      console.log(
        `Embedded ${Math.min(i + BATCH_SIZE, texts.length)}/${texts.length} chunks`,
      );
    }

    return embeddings;
  }

  cosineSimilarity(a: number[], b: number[]): number {
    let dotProduct = 0;
    let normA = 0;
    let normB = 0;

    for (let i = 0; i  {
    // (3)!
    console.log(`Creating embeddings for ${chunks.length} chunks...`);

    const texts = chunks.map((c) => c.content);
    const embeddings = await this.embeddingService.createEmbeddings(texts); // (4)!

    for (let i = 0; i
  > {
    const queryEmbedding = await this.embeddingService.createEmbedding(query); // (6)!

    const results = this.vectors.map((entry) => ({
      // (7)!
      chunk: entry.chunk,
      score: this.embeddingService.cosineSimilarity(
        queryEmbedding,
        entry.embedding,
      ),
    }));

    results.sort((a, b) => b.score - a.score); // (8)!

    return results.slice(0, topK); // (9)!
  }

  size(): number {
    return this.vectors.length;
  }

  clear(): void {
    this.vectors = [];
  }
}
```

1. **Vector entry structure**: Each entry stores the chunk's embedding vector, metadata, and a reference to the original chunk.
2. **In-memory storage**: All vectors are stored in RAM. For production with large datasets (>10K docs), use Pinecone or another vector database.
3. **Batch embedding**: Process all chunks together for efficiency. OpenAI allows up to 100 texts per API call.
4. **Convert text to vectors**: Each chunk is converted to a 1536-dimensional embedding vector (using OpenAI's `text-embedding-3-small` model).
5. **Semantic search**: Find the most relevant chunks by comparing vector similarity, not keyword matching.
6. **Query embedding**: Convert the user's question into the same vector space as the document chunks.
7. **Calculate similarity**: Compute cosine similarity between query vector and all document vectors. Score ranges from -1 to 1 (higher = more similar).
8. **Rank by relevance**: Sort results by similarity score in descending order (most relevant first).
9. **Return top results**: Return only the `topK` most relevant chunks to use as context for the AI.

---

## Step 6: Alternative: Pinecone Vector Store

Create `src/lib/pinecone-store.ts`:

```typescript

export class PineconeVectorStore {
  private client: Pinecone;
  private indexName: string;
  private embeddingService: EmbeddingService;

  constructor() {
    this.client = new Pinecone({
      apiKey: process.env.PINECONE_API_KEY!,
    });

    this.indexName = process.env.PINECONE_INDEX || "rag-docs";
    this.embeddingService = new EmbeddingService();
  }

  async initialize(): Promise {
    const indexes = await this.client.listIndexes();

    if (!indexes.indexes?.find((i) => i.name === this.indexName)) {
      await this.client.createIndex({
        name: this.indexName,
        dimension: 1536,
        metric: "cosine",
        spec: {
          serverless: {
            cloud: "aws",
            region: "us-east-1",
          },
        },
      });

      console.log(`Created Pinecone index: ${this.indexName}`);
    }
  }

  async addChunks(chunks: Chunk[]): Promise {
    const index = this.client.index(this.indexName);

    const BATCH_SIZE = 100;

    for (let i = 0; i  c.content);
      const embeddings = await this.embeddingService.createEmbeddings(texts);

      const vectors = batch.map((chunk, idx) => ({
        id: chunk.id,
        values: embeddings[idx],
        metadata: {
          documentId: chunk.documentId,
          content: chunk.content,
          ...chunk.metadata,
        },
      }));

      await index.upsert(vectors);

      console.log(
        `Indexed ${Math.min(i + BATCH_SIZE, chunks.length)}/${chunks.length} chunks`,
      );
    }
  }

  async search(
    query: string,
    topK: number = 5,
  ): Promise
  > {
    const index = this.client.index(this.indexName);
    const queryEmbedding = await this.embeddingService.createEmbedding(query);

    const results = await index.query({
      vector: queryEmbedding,
      topK,
      includeMetadata: true,
    });

    return (
      results.matches?.map((match) => ({
        chunk: {
          id: match.id,
          documentId: match.metadata?.documentId as string,
          content: match.metadata?.content as string,
          metadata: match.metadata,
          chunkIndex: match.metadata?.chunkIndex as number,
        },
        score: match.score || 0,
      })) || []
    );
  }
}
```

---

## Step 7: RAG Service

Create `src/lib/rag-service.ts`:

```typescript

export type RAGResult = {
  answer: string;
  sources: Array;
};

export class RAGService {
  private ai: NeuroLink;
  private vectorStore: InMemoryVectorStore;
  private documentParser: DocumentParser;
  private textChunker: TextChunker;

  constructor() {
    this.ai = new NeuroLink({
      // (1)!
      providers: [
        {
          name: "anthropic",
          config: {
            apiKey: process.env.ANTHROPIC_API_KEY!,
            model: "claude-3-5-sonnet-20241022",
          },
        },
      ],
    });

    this.vectorStore = new InMemoryVectorStore();
    this.documentParser = new DocumentParser();
    this.textChunker = new TextChunker(1000, 200); // (2)!
  }

  async indexDocuments(docsPath: string): Promise {
    // (3)!
    console.log(`Indexing documents from: ${docsPath}`);

    const documents = await this.documentParser.parseDirectory(docsPath);
    console.log(`Found ${documents.length} documents`);

    const chunks = this.textChunker.chunkAll(documents); // (4)!
    console.log(`Created ${chunks.length} chunks`);

    await this.vectorStore.addChunks(chunks); // (5)!

    return chunks.length;
  }

  async query(question: string, topK: number = 5): Promise {
    // (6)!
    const results = await this.vectorStore.search(question, topK); // (7)!

    const context = results // (8)!
      .map(
        (r, i) =>
          `[Source ${i + 1}: ${r.chunk.metadata.title}]\n${r.chunk.content}`,
      )
      .join("\n\n---\n\n");

    const prompt = `You are a helpful AI assistant. Answer the user's question based on the provided context. // (9)!

Context from knowledge base:
${context}

User Question: ${question}

Instructions:
1. Answer based primarily on the provided context
2. If the context doesn't contain enough information, say so
3. Cite specific sources by number when using information
4. Be concise but comprehensive

Answer:`;

    const response = await this.ai.generate({
      // (10)!
      input: { text: prompt },
      provider: "anthropic",
    });

    return {
      answer: response.content,
      sources: results.map((r, i) => ({
        title: r.chunk.metadata.title,
        content: r.chunk.content.substring(0, 200) + "...",
        score: r.score,
        path: r.chunk.metadata.path,
      })),
    };
  }

  getIndexSize(): number {
    return this.vectorStore.size();
  }

  clearIndex(): void {
    this.vectorStore.clear();
  }
}
```

1. **Use Claude for generation**: Claude 3.5 Sonnet excels at following instructions and citing sources accurately in RAG applications.
2. **Chunk configuration**: 1000 characters per chunk with 200 character overlap to maintain context across chunk boundaries.
3. **Indexing pipeline**: Parse documents → chunk text → create embeddings → store in vector database. Run this once when documents change.
4. **Text chunking**: Split documents into smaller chunks. Large documents can't fit in context windows, and smaller chunks improve retrieval precision.
5. **Create embeddings**: Convert each chunk to a vector representation. This is the most expensive operation (OpenAI API costs ~$0.02/1M tokens).
6. **RAG query flow**: Retrieve relevant chunks → build context → generate answer with citations.
7. **Semantic search**: Find the 5 most relevant chunks using vector similarity (not keyword matching).
8. **Build augmented context**: Format retrieved chunks with source labels to enable the AI to cite sources in its answer.
9. **Structured prompt**: Clear instructions help the AI stay grounded in the provided context and cite sources properly.
10. **Generate final answer**: NeuroLink sends the question + context to Claude, which generates an answer based on the retrieved information.

---

## Step 8: API Routes

### Index Documents API

Create `src/app/api/index/route.ts`:

```typescript

const ragService = new RAGService();

export async function POST(request: NextRequest) {
  try {
    const { docsPath } = await request.json();

    const path = docsPath || process.env.DOCS_PATH || "./docs";

    const chunksIndexed = await ragService.indexDocuments(path);

    return NextResponse.json({
      success: true,
      chunksIndexed,
      message: `Indexed ${chunksIndexed} chunks from ${path}`,
    });
  } catch (error) {
    console.error("Index error:", error);
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
}

export async function GET() {
  try {
    const size = ragService.getIndexSize();

    return NextResponse.json({
      indexed: size,
      ready: size > 0,
    });
  } catch (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
}
```

### Query API

Create `src/app/api/query/route.ts`:

```typescript

const ragService = new RAGService();

export async function POST(request: NextRequest) {
  try {
    const { question, topK } = await request.json();

    if (!question) {
      return NextResponse.json(
        { error: "Question is required" },
        { status: 400 },
      );
    }

    if (ragService.getIndexSize() === 0) {
      return NextResponse.json(
        { error: "No documents indexed. Please index documents first." },
        { status: 400 },
      );
    }

    const result = await ragService.query(question, topK || 5);

    return NextResponse.json(result);
  } catch (error) {
    console.error("Query error:", error);
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
}
```

---

## Step 9: Frontend Interface

Create `src/app/page.tsx`:

```typescript
'use client';

type Source = {
  title: string;
  content: string;
  score: number;
  path: string;
};

export default function Home() {
  const [question, setQuestion] = useState('');
  const [answer, setAnswer] = useState('');
  const [sources, setSources] = useState([]);
  const [loading, setLoading] = useState(false);
  const [indexStatus, setIndexStatus] = useState({ indexed: 0, ready: false });
  const [indexing, setIndexing] = useState(false);

  useEffect(() => {
    checkIndexStatus();
  }, []);

  async function checkIndexStatus() {
    const response = await fetch('/api/index');
    const data = await response.json();
    setIndexStatus(data);
  }

  async function handleIndex() {
    setIndexing(true);
    try {
      const response = await fetch('/api/index', { method: 'POST' });
      const data = await response.json();

      if (data.success) {
        alert(data.message);
        await checkIndexStatus();
      }
    } catch (error) {
      alert('Failed to index documents');
    } finally {
      setIndexing(false);
    }
  }

  async function handleSubmit(e: React.FormEvent) {
    e.preventDefault();

    if (!question.trim()) return;

    setLoading(true);
    setAnswer('');
    setSources([]);

    try {
      const response = await fetch('/api/query', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ question })
      });

      const data = await response.json();

      if (data.error) {
        alert(data.error);
        return;
      }

      setAnswer(data.answer);
      setSources(data.sources);

    } catch (error) {
      alert('Failed to query');
    } finally {
      setLoading(false);
    }
  }

  return (


        RAG Knowledge Base


          Index Status

            {indexStatus.indexed} chunks indexed
            {indexStatus.ready ? ' ✅' : ' ⚠️ No documents indexed'}


            {indexing ? 'Indexing...' : 'Index Documents'}


          Ask a Question
           setQuestion(e.target.value)}
            placeholder="What would you like to know?"
            className="w-full p-3 border rounded-lg mb-4 h-24"
            disabled={!indexStatus.ready || loading}
          />

            {loading ? 'Searching...' : 'Ask'}


        {answer && (

            Answer

              {answer}


        )}

        {sources.length > 0 && (

            Sources

              {sources.map((source, i) => (


                    {source.title}

                      {(source.score * 100).toFixed(1)}% relevant


                  {source.content}
                  {source.path}

              ))}


        )}


  );
}
```

---

## Step 10: Testing

### Prepare Test Documents

Create `docs/` folder with sample files:

**docs/introduction.md:**

```markdown
---
title: Introduction to RAG
---

# Retrieval-Augmented Generation

RAG combines retrieval with AI generation for more accurate, source-backed answers.
```

**docs/architecture.md:**

```markdown
---
title: RAG Architecture
---

# System Architecture

The RAG system consists of three main components:

1. Document ingestion and chunking
2. Vector embedding and storage
3. Retrieval and generation
```

### Index Documents

1. Start dev server: `npm run dev`
2. Click "Index Documents"
3. Wait for completion

### Test Queries

Try these questions:

```
What is RAG?
How does the RAG system work?
What are the main components?
```

Verify:

- Relevant sources retrieved
- Answer cites sources
- Relevance scores make sense

---

## Step 11: Production Enhancements

### Add Streaming Responses

```typescript
export async function POST(request: NextRequest) {
  const { question } = await request.json();

  const results = await ragService.search(question);
  const context = formatContext(results);

  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      for await (const chunk of ai.stream({ input: { text: prompt } })) {
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify(chunk)}\n\n`),
        );
      }
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream" },
  });
}
```

### Add Document Upload

```typescript
export async function POST(request: NextRequest) {
  const formData = await request.formData();
  const file = formData.get("file") as File;

  const buffer = Buffer.from(await file.arrayBuffer());
  await fs.writeFile(`./docs/${file.name}`, buffer);

  await ragService.indexDocuments("./docs");

  return NextResponse.json({ success: true });
}
```

### Add Metadata Filtering

```typescript
async search(
  query: string,
  filters?: { type?: string; dateFrom?: Date }
): Promise {
  let results = await this.vectorStore.search(query, 10);

  if (filters?.type) {
    results = results.filter(r => r.chunk.metadata.type === filters.type);
  }

  if (filters?.dateFrom) {
    results = results.filter(r =>
      new Date(r.chunk.metadata.createdAt) >= filters.dateFrom!
    );
  }

  return results.slice(0, 5);
}
```

---

## Step 12: MCP Integration (Advanced)

Using Model Context Protocol for file access:

```typescript

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

async function queryWithMCP(question: string) {
  const response = await client.messages.create({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: `Search the documentation and answer: ${question}`,
      },
    ],
    tools: [
      {
        name: "read_file",
        description: "Read documentation files",
        input_schema: {
          type: "object",
          properties: {
            path: { type: "string" },
          },
          required: ["path"],
        },
      },
    ],
  });

  return response.content;
}
```

---

## Troubleshooting

### Embeddings API Errors

```typescript
// Add retry logic
async createEmbedding(text: string, retries = 3): Promise {
  for (let i = 0; i  setTimeout(r, 1000 * Math.pow(2, i)));
    }
  }
}
```

### Memory Issues with Large Documents

```typescript
// Process in batches
const CHUNK_BATCH_SIZE = 100;

for (let i = 0; i < chunks.length; i += CHUNK_BATCH_SIZE) {
  const batch = chunks.slice(i, i + CHUNK_BATCH_SIZE);
  await this.vectorStore.addChunks(batch);
}
```

### Poor Retrieval Quality

```typescript
// Adjust chunk size and overlap
const chunker = new TextChunker(
  500, // Smaller chunks
  100, // More overlap
);

// Increase topK
const results = await vectorStore.search(query, 10);
```

---

## Related Documentation

**Feature Guides:**

- [Auto Evaluation](/docs/features/auto-evaluation) - Automated quality scoring for RAG responses
- [Guardrails](/docs/features/guardrails) - Content filtering for generated answers
- [Multimodal Chat](/docs/features/multimodal-chat) - Add image/PDF processing to RAG

**Tutorials & Examples:**

- [Chat App Tutorial](/docs/tutorials/chat-app) - Build a chat interface
- [Document Analysis Use Case](/docs/guides/examples/use-cases)
- [MCP Server Catalog](/docs/guides/mcp/server-catalog) - MCP servers for data retrieval

---

## Summary

You've built a production-ready RAG system with:

✅ Multi-format document ingestion (PDF, MD, TXT)
✅ Text chunking with overlap
✅ Vector embeddings (OpenAI)
✅ Semantic search
✅ AI-powered Q&A with source citations
✅ Relevance scoring
✅ Modern web interface

**Cost Analysis:**

- Embedding: ~$0.02 per 1M tokens
- Generation: ~$3 per 1M input tokens (Claude 3.5 Sonnet)
- 1000 documents → ~$0.50 to index
- 1000 queries → ~$2

**Next Steps:**

1. Add authentication
2. Implement caching
3. Add document versioning
4. Deploy to production

---

## Video Tutorials

<!-- Source: tutorials/videos.md -->

# Video Tutorials

Learn NeuroLink through comprehensive video tutorials covering everything from quick starts to advanced enterprise features.

:::info[Coming Soon]
We're actively creating video content for the NeuroLink community. Check back soon for new tutorials, or [contribute your own](#contributing-videos)!
:::

## Getting Started Series

Perfect for developers new to NeuroLink. Start here to build a solid foundation.

### Quick Start (5 minutes)

**Coming Soon**

Learn the basics of NeuroLink in just 5 minutes:

- Install NeuroLink via npm/pnpm
- Configure your first AI provider
- Make your first API call
- Handle responses and errors

**Topics Covered:**

- Installation and setup
- Provider configuration
- Basic text generation
- Error handling basics

### Interactive CLI Deep Dive (15 minutes)

**Coming Soon**

Master the NeuroLink CLI for rapid prototyping and testing:

- Loop mode for interactive sessions
- Conversation management
- Multimodal file uploads
- Session persistence

**Topics Covered:**

- CLI installation
- Interactive loop sessions
- Command-line options
- File attachments (images, PDFs, CSVs)
- Session management

**Related Resources:**

- [CLI Guide](/docs/)
- [CLI Commands Reference](/docs/cli/commands)
- [CLI Loop Sessions](/docs/features/cli-loop-sessions)

---

## Feature Tutorials

Intermediate tutorials focusing on specific NeuroLink features.

### Human-in-the-Loop (HITL) Security Setup (12 minutes)

**Coming Soon**

Implement enterprise-grade security with HITL workflow controls:

- Setting up approval workflows
- Configuring approval rules
- Handling approval requests
- Integration with enterprise systems

**Topics Covered:**

- HITL architecture
- Approval workflow configuration
- Custom approval handlers
- Security best practices

**Related Resources:**

- [HITL Feature Guide](/docs/features/hitl)
- [Enterprise HITL Documentation](/docs/features/enterprise-hitl)

---

### Redis Conversation Memory (15 minutes)

**Coming Soon**

Configure Redis for production-grade conversation persistence:

- Redis setup and configuration
- Memory export and import
- Conversation summarization
- Token management strategies

**Topics Covered:**

- Redis installation
- NeuroLink Redis configuration
- Memory persistence patterns
- Conversation export/import
- Summarization strategies

**Related Resources:**

- [Redis Quick Start](/docs/getting-started/redis-quickstart)
- [Redis Configuration Guide](/docs/guides/redis-configuration)
- [Redis Migration Patterns](/docs/guides/redis-migration)

---

### MCP Tools Integration (20 minutes)

**Coming Soon**

Integrate external tools using the Model Context Protocol:

- Built-in tool overview (58+ tools)
- Custom tool development
- External MCP server integration
- Tool execution and error handling

**Topics Covered:**

- MCP architecture
- Built-in tool catalog
- Custom tool creation
- MCP server configuration
- Tool discovery and registration

**Related Resources:**

- [MCP Integration Guide](/docs/mcp/integration)
- [MCP Server Catalog](/docs/guides/mcp/server-catalog)
- [Custom Tools Guide](/docs/sdk/custom-tools)

---

### Multimodal Chat Experiences (18 minutes)

**Coming Soon**

Build rich multimodal applications with text, images, PDFs, and more:

- Image processing and vision APIs
- PDF document understanding
- CSV data analysis
- Audio/video integration (TTS)

**Topics Covered:**

- Image upload and processing
- PDF extraction and analysis
- CSV parsing and interpretation
- Text-to-speech integration
- Provider-specific multimodal capabilities

**Related Resources:**

- [Multimodal Guide](/docs/features/multimodal)
- [TTS Integration](/docs/features/tts)
- [Multimodal Chat Experiences](/docs/features/multimodal-chat)

---

## Advanced Topics

Expert-level tutorials for production deployments and advanced patterns.

### Middleware Development (25 minutes)

**Coming Soon**

Build custom middleware for request/response transformation:

- Middleware architecture overview
- Built-in middleware (Analytics, Auto-evaluation, Guardrails)
- Creating custom middleware
- Middleware chaining and composition

**Topics Covered:**

- Middleware system architecture
- Request/response lifecycle
- Built-in middleware features
- Custom middleware development
- Testing middleware

**Related Resources:**

- [Middleware Architecture](/docs/advanced/middleware-architecture)
- [Built-in Middleware](/docs/advanced/builtin-middleware)
- [Custom Middleware Guide](/docs/workflows/custom-middleware)

---

### Multi-Provider Architecture (30 minutes)

**Coming Soon**

Design enterprise-grade multi-provider systems:

- Provider failover strategies
- Load balancing across providers
- Cost optimization techniques
- Health monitoring and observability

**Topics Covered:**

- Multi-provider patterns
- Failover configuration
- Load balancing strategies
- Cost tracking and optimization
- Provider health monitoring
- Analytics and observability

**Related Resources:**

- [Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover)
- [Load Balancing](/docs/guides/enterprise/load-balancing)
- [Cost Optimization](/docs/cookbook/cost-optimization)
- [Monitoring & Observability](/docs/observability/health-monitoring)

---

### Framework Integration Series

Building NeuroLink applications with popular frameworks.

#### Next.js Integration (20 minutes)

**Coming Soon**

Build AI-powered Next.js applications:

- Server-side generation with NeuroLink
- API routes and streaming
- Edge runtime support
- Client-side integration patterns

**Related Resources:**

- [Next.js Integration Guide](/docs/guides/frameworks/nextjs)

---

#### Express.js Integration (15 minutes)

**Coming Soon**

Create REST APIs with NeuroLink and Express:

- Route handlers with AI generation
- Streaming responses
- Error middleware integration
- Authentication patterns

**Related Resources:**

- [Express.js Integration Guide](/docs/sdk/framework-integration)

---

#### SvelteKit Integration (18 minutes)

**Coming Soon**

Integrate NeuroLink into SvelteKit applications:

- Server routes and load functions
- Form actions with AI
- Real-time streaming with stores
- Progressive enhancement

**Related Resources:**

- [SvelteKit Integration Guide](/docs/guides/frameworks/sveltekit)

---

## Migration Guides (Video Series)

Step-by-step video guides for migrating from other AI SDKs.

### Migrating from LangChain (20 minutes)

**Coming Soon**

Complete migration guide from LangChain to NeuroLink:

- Feature comparison
- Code migration patterns
- Tool/chain equivalents
- Common gotchas

**Related Resources:**

- [LangChain Migration Guide](/docs/guides/migration/from-langchain)

---

### Migrating from Vercel AI SDK (15 minutes)

**Coming Soon**

Migrate your Vercel AI SDK projects to NeuroLink:

- API differences
- Provider mapping
- Streaming migration
- UI component adaptation

**Related Resources:**

- [Vercel AI SDK Migration Guide](/docs/guides/migration/from-vercel-ai-sdk)

---

## Live Workshop Recordings

Recordings from community workshops and webinars.

:::note[Upcoming Workshops]
We host regular community workshops. Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) for announcements.
:::

---

## Community Tutorials

Third-party video tutorials from the NeuroLink community:

- Check back soon for community contributions!
- Want to add your tutorial? See [Contributing Videos](#contributing-videos) below.

---

## Contributing Videos

We welcome video tutorial contributions from the community!

### What We're Looking For

**Beginner Tutorials:**

- Getting started guides
- Provider setup walkthroughs
- Basic feature demonstrations

**Intermediate Tutorials:**

- Framework integration examples
- Real-world use cases
- Feature deep dives

**Advanced Tutorials:**

- Enterprise deployment patterns
- Custom middleware development
- Performance optimization
- Security implementations

### Contribution Guidelines

1. **Quality Standards:**
   - Clear audio (no background noise)
   - HD video resolution (1080p preferred)
   - Well-structured content with clear objectives
   - Include code examples and working demos

2. **Technical Requirements:**
   - Use latest NeuroLink version
   - Test all code examples before recording
   - Include links to GitHub repositories with code
   - Provide timestamps for key sections

3. **Submission Process:**
   - Upload to YouTube or similar platform
   - Create a Pull Request to add your video to this page
   - Include video title, description, duration, and embed code
   - Ensure you have rights to all content used

4. **Content Guidelines:**
   - Follow our [Code of Conduct](/docs/community/code-of-conduct)
   - Respect NeuroLink's branding guidelines
   - Provide accurate, up-to-date information
   - Credit sources and dependencies appropriately

### How to Submit

1. Fork the [NeuroLink repository](https://github.com/juspay/neurolink)
2. Add your video to `docs/tutorials/videos.md`
3. Create a Pull Request with:
   - Video title and description
   - YouTube/Vimeo embed code
   - Topics covered
   - Related documentation links
   - Your attribution (name, social links)

**Template:**

```markdown
### [Your Video Title] ([Duration])

By [Your Name](your-link)

[Brief description of what the video covers]

**Topics Covered:**

- Topic 1
- Topic 2
- Topic 3

**Related Resources:**

- [Link 1]
- [Link 2]
```

See our full [Contributing Guide](/docs/community/contributing) for more details.

---

## Video Playlist

Watch all NeuroLink tutorials in sequence:

**Coming Soon:** Subscribe to our [YouTube Channel](#) for notifications when new tutorials are released.

---

## Need Help?

- **Documentation:** [Complete Documentation](/docs/)
- **Getting Started:** [Quick Start Guide](/docs/getting-started/quick-start)
- **Examples:** [Code Examples](/docs/)
- **Interactive:** [Try the Playground](/docs/)
- **Community:** [GitHub Discussions](https://github.com/juspay/neurolink/discussions)
- **Support:** [GitHub Issues](https://github.com/juspay/neurolink/issues)

---

**Last Updated:** January 1, 2026

---

# Development

## Development

<!-- Source: development/index.md -->

# Development

Contributing to NeuroLink and extending its capabilities for your specific needs.

##  Development Hub

This section covers everything needed for contributing to NeuroLink, understanding its architecture, and extending its functionality.

- ❤️ **[Contributing](/docs/community/contributing)**

  How to contribute to NeuroLink, including setup, coding standards, and submission guidelines.

-  **[Testing](/docs/development/testing)**

  Comprehensive testing strategies, test suite organization, and validation procedures.

- ️ **[Architecture](/docs/development/architecture)**

  Deep dive into NeuroLink's architecture, design patterns, and system organization.

-  **[Factory Pattern Migration](/docs/development/factory-migration)**

  Guide for upgrading from older architectures to the new unified factory pattern system.

- ️ **[Documentation Versioning](/docs/development/versioning)**

  Managing documentation versions across releases using mike for version control and deployment.

-  **[Automated Link Checking](/docs/development/link-checking)**

  Automated validation of documentation links with CI/CD integration to prevent broken references.

##  Quick Development Setup

```bash
# Clone the repository
git clone https://github.com/juspay/neurolink
cd neurolink

# Install dependencies
pnpm install

# Setup git hooks for build rule enforcement
npx husky install

# Complete automated setup
pnpm setup:complete

# Run comprehensive tests
pnpm test:adaptive

# Build the project with validation
pnpm build:complete

# Validate build rules and quality
pnpm run validate:all
```

```bash
# Basic development environment
pnpm install
pnpm env:setup

# Start development
pnpm dev

# Run quick tests
pnpm test:smart
```

```bash
# Install docs dependencies
pip install -r requirements.txt

# Serve documentation locally
mkdocs serve

# Build documentation
mkdocs build
```

## ️ Architecture Overview

NeuroLink uses a **Factory Pattern** architecture that provides:

### Core Components

```mermaid
graph TD
    A[NeuroLink SDK] --> B[Provider Factory]
    B --> C[BaseProvider]
    C --> D[OpenAI Provider]
    C --> E[Google AI Provider]
    C --> F[Anthropic Provider]
    C --> G[Other Providers]

    A --> H[MCP System]
    H --> I[Built-in Tools]
    H --> J[Custom Tools]
    H --> K[External Servers]

    A --> L[Analytics System]
    A --> M[Evaluation System]
    A --> N[Streaming System]
```

### Design Principles

- **Unified Interface**: All providers implement the same `AIProvider` interface
- **Type Safety**: Full TypeScript support with strict typing
- **Extensibility**: Easy to add new providers and tools
- **Performance**: Optimized for production use
- **Reliability**: Comprehensive error handling and fallbacks

##  Development Features

### Enterprise Automation (72+ Commands)

NeuroLink includes comprehensive automation for development:

```bash
# Environment & Setup
pnpm setup:complete        # Complete project setup
pnpm env:setup             # Environment configuration
pnpm env:validate          # Configuration validation

# Testing & Quality
pnpm test:adaptive         # Intelligent test selection
pnpm test:providers        # AI provider validation
pnpm quality:check         # Full quality pipeline

# Content Generation
pnpm content:screenshots   # Automated screenshot capture
pnpm content:videos        # Video generation
pnpm docs:sync            # Documentation synchronization

# Build & Deployment
pnpm build:complete        # 7-phase enterprise pipeline
pnpm dev:health           # System health monitoring
```

### Smart Testing System

- **Adaptive test selection** based on code changes
- **Provider validation** across all AI services
- **Performance benchmarking** and regression detection
- **Comprehensive coverage** reporting

### Automated Content Generation

- **Screenshot automation** for documentation
- **Video generation** for demonstrations
- **Documentation synchronization** across files
- **Asset optimization** and management

##  Testing Philosophy

NeuroLink uses a multi-layered testing approach:

### Test Categories

1. **Unit Tests** - Individual component testing
2. **Integration Tests** - Provider and tool interaction
3. **End-to-End Tests** - Complete workflow validation
4. **Performance Tests** - Speed and resource usage
5. **Regression Tests** - Prevent breaking changes

### Test Organization

```
test/
├── unit/              # Unit tests
├── integration/       # Integration tests
├── e2e/              # End-to-end tests
├── performance/      # Performance benchmarks
├── fixtures/         # Test data and mocks
└── utils/            # Testing utilities
```

### Running Tests

```bash
# Smart test runner (recommended)
pnpm test:adaptive

# Full test suite
pnpm test:run

# Specific test categories
pnpm test:unit
pnpm test:integration
pnpm test:e2e

# With coverage
pnpm test:coverage
```

##  Code Style & Standards

### TypeScript Configuration

- **Strict mode** enabled for maximum type safety
- **Path mapping** for clean imports
- **ESLint** and **Prettier** for consistent formatting
- **Documentation comments** for all public APIs

### Naming Conventions

- **PascalCase** for classes and interfaces
- **camelCase** for functions and variables
- **kebab-case** for file names
- **UPPER_CASE** for constants

### File Organization

```
src/
├── cli/              # Command-line interface
├── lib/              # Core library
│   ├── core/         # Core functionality
│   ├── providers/    # AI provider implementations
│   ├── mcp/          # MCP tool system
│   ├── types/        # TypeScript definitions
│   └── utils/        # Utility functions
├── test/             # Test files
└── tools/            # Development tools
```

##  Contribution Workflow

### 1. Setup Development Environment

```bash
# Fork and clone
git clone https://github.com/YOUR_USERNAME/neurolink
cd neurolink
pnpm setup:complete
```

### 2. Create Feature Branch

```bash
# Create semantic branch
git checkout -b feat/your-feature-name
git checkout -b fix/issue-description
git checkout -b docs/documentation-update
```

### 3. Development Process

```bash
# Make changes
pnpm dev                # Start development server
pnpm test:adaptive      # Run relevant tests
pnpm quality:check      # Validate code quality
```

### 4. Commit & Submit

```bash
# Commit with semantic messages
git commit -m "feat: add new provider support"
git commit -m "fix: resolve streaming timeout issue"
git commit -m "docs: update API documentation"

# Push and create PR
git push origin feat/your-feature-name
```

##  Learning Resources

### Architecture Deep Dive

- **[Factory Pattern Guide](/docs/development/factory-migration)** - Understanding the core architecture
- **[MCP Integration](/docs/mcp/integration)** - Tool system implementation
- **[Provider Development](/docs/deployment/configuration)** - Adding new AI providers

### Best Practices

- **Error handling** patterns and strategies
- **Performance optimization** techniques
- **Testing** methodologies and coverage
- **Documentation** standards and automation

### Community

- **GitHub Discussions** for questions and ideas
- **Issue tracking** for bugs and feature requests
- **Code reviews** for learning and improvement
- **Release notes** for staying updated

##  Related Resources

- **[CLI Guide](/docs/)** - Understanding the command-line interface
- **[SDK Reference](/docs/)** - API implementation details
- **[Advanced Features](/docs/)** - Enterprise capabilities
- **[Examples](/docs/)** - Practical implementations

---

## System Architecture

<!-- Source: development/architecture.md -->

# System Architecture

Technical architecture overview of NeuroLink's enterprise AI platform, including design patterns, scalability considerations, and integration approaches.

## ️ High-Level Architecture

### Core Components

```mermaid
graph TB
    subgraph "Client Layer"
        CLI[CLI Interface]
        SDK[SDK/API]
        WEB[Web Interface]
    end

    subgraph "Core Platform"
        ROUTER[Provider Router]
        FACTORY[Factory Pattern Engine]
        ANALYTICS[Analytics Engine]
        CACHE[Response Cache]
    end

    subgraph "Provider Layer"
        OPENAI[OpenAI]
        GOOGLE[Google AI]
        ANTHROPIC[Anthropic]
        BEDROCK[AWS Bedrock]
        VERTEX[Vertex AI]
    end

    subgraph "Tools & Extensions"
        MCP[MCP Integration]
        TOOLS[Built-in Tools]
        PLUGINS[Plugin System]
    end

    CLI --> ROUTER
    SDK --> ROUTER
    WEB --> ROUTER

    ROUTER --> FACTORY
    ROUTER --> ANALYTICS
    ROUTER --> CACHE

    FACTORY --> OPENAI
    FACTORY --> GOOGLE
    FACTORY --> ANTHROPIC
    FACTORY --> BEDROCK
    FACTORY --> VERTEX

    ROUTER --> MCP
    ROUTER --> TOOLS
    ROUTER --> PLUGINS
```

### Architecture Principles

1. **Provider Agnostic**: Universal interface to multiple AI providers
2. **Factory Pattern**: Consistent creation and management of provider instances
3. **Fail-Safe Design**: Automatic fallback and error recovery
4. **Horizontal Scaling**: Stateless design for cloud deployment
5. **Observability**: Comprehensive monitoring and analytics
6. **Extensibility**: Plugin architecture for custom functionality

##  Core Platform Design

### Provider Router

**Responsibility**: Intelligent request routing and load balancing

```typescript
type ProviderRouter = {
  // Route request to optimal provider
  route(request: GenerationRequest): Promise;

  // Health monitoring
  checkHealth(): Promise;

  // Load balancing
  selectProvider(criteria: SelectionCriteria): Provider;

  // Failover handling
  handleFailover(
    failedProvider: Provider,
    request: GenerationRequest,
  ): Promise;
};

class ProviderRouterImpl implements ProviderRouter {
  private providers: Map;
  private healthMonitor: HealthMonitor;
  private loadBalancer: LoadBalancer;

  async route(request: GenerationRequest): Promise {
    // 1. Check provider preferences
    // 2. Evaluate health status
    // 3. Apply load balancing
    // 4. Select optimal provider
    return this.loadBalancer.select(this.getHealthyProviders(), request);
  }
}
```

### Factory Pattern Engine

**Responsibility**: Consistent provider instance creation and lifecycle management

```typescript
type ProviderFactory = {
  createProvider(type: ProviderType, config: ProviderConfig): Provider;
  getProvider(type: ProviderType): Provider;
  configureProvider(type: ProviderType, config: ProviderConfig): void;
  destroyProvider(type: ProviderType): void;
};

class UniversalProviderFactory implements ProviderFactory {
  private providerInstances: Map = new Map();
  private configurations: Map = new Map();

  createProvider(type: ProviderType, config: ProviderConfig): Provider {
    switch (type) {
      case "openai":
        return new OpenAIProvider(config);
      case "google-ai":
        return new GoogleAIProvider(config);
      case "anthropic":
        return new AnthropicProvider(config);
      // ... other providers
    }
  }

  getProvider(type: ProviderType): Provider {
    if (!this.providerInstances.has(type)) {
      const config = this.configurations.get(type);
      const provider = this.createProvider(type, config);
      this.providerInstances.set(type, provider);
    }
    return this.providerInstances.get(type);
  }
}
```

### Analytics Engine

**Responsibility**: Usage tracking, performance monitoring, and insights generation

```typescript
type AnalyticsEngine = {
  track(event: AnalyticsEvent): Promise;
  query(criteria: QueryCriteria): Promise;
  generateReport(type: ReportType, timeRange: TimeRange): Promise;
  getMetrics(metricNames: string[]): Promise;
};

class AnalyticsEngineImpl implements AnalyticsEngine {
  private storage: AnalyticsStorage;
  private aggregator: MetricsAggregator;
  private reporter: ReportGenerator;

  async track(event: AnalyticsEvent): Promise {
    // 1. Validate event data
    // 2. Enrich with metadata
    // 3. Store in time-series database
    // 4. Update real-time aggregates
    await this.storage.store(event);
    await this.aggregator.update(event);
  }
}
```

##  Provider Integration Architecture

### Universal Provider Interface

```typescript
type Provider = {
  readonly name: string;
  readonly type: ProviderType;
  readonly capabilities: ProviderCapabilities;

  // Core functionality
  generate(request: GenerationRequest): Promise;
  stream(request: StreamRequest): AsyncIterable;

  // Health and monitoring
  checkHealth(): Promise;
  getMetrics(): Promise;

  // Configuration
  configure(config: ProviderConfig): void;
  validateConfig(config: ProviderConfig): ValidationResult;
};

abstract class BaseProvider implements Provider {
  protected config: ProviderConfig;
  protected httpClient: HttpClient;
  protected rateLimiter: RateLimiter;
  protected retryManager: RetryManager;

  constructor(config: ProviderConfig) {
    this.config = config;
    this.httpClient = new HttpClient(config.httpConfig);
    this.rateLimiter = new RateLimiter(config.rateLimit);
    this.retryManager = new RetryManager(config.retryConfig);
  }

  abstract generate(request: GenerationRequest): Promise;

  protected async makeRequest(
    requestData: any,
    transformer: (response: any) => T,
  ): Promise {
    // 1. Apply rate limiting
    await this.rateLimiter.acquire();

    // 2. Make HTTP request with retries
    const response = await this.retryManager.execute(() =>
      this.httpClient.post(this.getEndpoint(), requestData),
    );

    // 3. Transform response
    return transformer(response.data);
  }

  protected abstract getEndpoint(): string;
}
```

### Provider-Specific Implementations

```typescript
class OpenAIProvider extends BaseProvider {
  async generate(request: GenerationRequest): Promise {
    const openaiRequest = this.transformRequest(request);

    return this.makeRequest(openaiRequest, (response) => ({
      content: response.choices[0].message.content,
      provider: "openai",
      model: response.model,
      usage: {
        promptTokens: response.usage.prompt_tokens,
        completionTokens: response.usage.completion_tokens,
        totalTokens: response.usage.total_tokens,
      },
      metadata: {
        finishReason: response.choices[0].finish_reason,
        logprobs: response.choices[0].logprobs,
      },
    }));
  }

  private transformRequest(request: GenerationRequest): any {
    return {
      model: request.model || this.config.defaultModel,
      messages: [{ role: "user", content: request.input.text }],
      temperature: request.temperature || 0.7,
      max_tokens: request.maxTokens || 1000,
      stream: false,
    };
  }

  protected getEndpoint(): string {
    return "https://api.openai.com/v1/chat/completions";
  }
}

class GoogleAIProvider extends BaseProvider {
  async generate(request: GenerationRequest): Promise {
    const googleRequest = this.transformRequest(request);

    return this.makeRequest(googleRequest, (response) => ({
      content: response.candidates[0].content.parts[0].text,
      provider: "google-ai",
      model: response.model,
      usage: {
        promptTokens: response.usageMetadata.promptTokenCount,
        completionTokens: response.usageMetadata.candidatesTokenCount,
        totalTokens: response.usageMetadata.totalTokenCount,
      },
      metadata: {
        finishReason: response.candidates[0].finishReason,
        safetyRatings: response.candidates[0].safetyRatings,
      },
    }));
  }

  private transformRequest(request: GenerationRequest): any {
    return {
      contents: [
        {
          parts: [{ text: request.input.text }],
        },
      ],
      generationConfig: {
        temperature: request.temperature || 0.7,
        maxOutputTokens: request.maxTokens || 1000,
      },
    };
  }

  protected getEndpoint(): string {
    const model = this.config.defaultModel || "gemini-2.5-pro";
    return `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`;
  }
}
```

##  MCP (Model Context Protocol) Integration

### MCP Architecture

```typescript
type MCPServer = {
  readonly name: string;
  readonly capabilities: MCPCapabilities;

  connect(): Promise;
  disconnect(): Promise;
  listTools(): Promise;
  executeTool(toolName: string, parameters: any): Promise;
};

class MCPRegistry {
  private servers: Map = new Map();
  private discoveryService: MCPDiscoveryService;

  constructor() {
    this.discoveryService = new MCPDiscoveryService();
  }

  async discoverServers(): Promise {
    // Discover MCP servers from various sources
    const configs = await this.discoveryService.findConfigurations();

    const servers = await Promise.all(
      configs.map((config) => this.createServer(config)),
    );

    return servers.filter((server) => server !== null);
  }

  private async createServer(config: MCPConfig): Promise {
    try {
      const server = new MCPServerImpl(config);
      await server.connect();
      this.servers.set(config.name, server);
      return server;
    } catch (error) {
      console.warn(`Failed to connect to MCP server ${config.name}:`, error);
      return null;
    }
  }
}

class MCPToolIntegration {
  private registry: MCPRegistry;

  constructor(registry: MCPRegistry) {
    this.registry = registry;
  }

  async getAvailableTools(): Promise {
    const servers = Array.from(this.registry.servers.values());
    const toolLists = await Promise.all(
      servers.map((server) => server.listTools()),
    );

    return toolLists.flat().map((tool) => ({
      name: tool.name,
      description: tool.description,
      parameters: tool.inputSchema,
      server: tool.serverName,
    }));
  }

  async executeTool(toolName: string, parameters: any): Promise {
    const server = this.findServerForTool(toolName);
    if (!server) {
      throw new Error(`Tool ${toolName} not found`);
    }

    return await server.executeTool(toolName, parameters);
  }
}
```

##  Data Flow Architecture

### Request Processing Pipeline

```mermaid
sequenceDiagram
    participant Client
    participant Router
    participant Factory
    participant Provider
    participant Analytics
    participant Cache

    Client->>Router: GenerationRequest
    Router->>Cache: Check cache
    Cache-->>Router: Cache miss

    Router->>Factory: Get provider
    Factory-->>Router: Provider instance

    Router->>Provider: Generate content
    Provider-->>Router: GenerationResponse

    Router->>Analytics: Track event
    Router->>Cache: Store response

    Router-->>Client: Final response
```

### Analytics Data Pipeline

```typescript
type AnalyticsDataPipeline = {
  ingest(event: AnalyticsEvent): Promise;
  process(batch: AnalyticsEvent[]): Promise;
  store(events: ProcessedEvent[]): Promise;
  aggregate(timeWindow: TimeWindow): Promise;
};

class StreamingAnalyticsPipeline implements AnalyticsDataPipeline {
  private ingestionQueue: Queue;
  private processor: EventProcessor;
  private storage: TimeSeriesStorage;
  private aggregator: RealTimeAggregator;

  async ingest(event: AnalyticsEvent): Promise {
    // Add to queue for async processing
    await this.ingestionQueue.enqueue(event);
  }

  async process(batch: AnalyticsEvent[]): Promise {
    return await Promise.all(
      batch.map((event) => this.processor.enrich(event)),
    );
  }

  async store(events: ProcessedEvent[]): Promise {
    // Store in time-series database
    await this.storage.batchInsert(events);

    // Update real-time aggregates
    await this.aggregator.update(events);
  }
}
```

##  Scalability & Performance

### Horizontal Scaling Design

```typescript
type ScalabilityManager = {
  // Auto-scaling based on load
  scaleUp(metrics: LoadMetrics): Promise;
  scaleDown(metrics: LoadMetrics): Promise;

  // Load distribution
  distributeLoad(requests: GenerationRequest[]): Promise;

  // Resource monitoring
  getResourceUtilization(): Promise;
};

class CloudScalabilityManager implements ScalabilityManager {
  private loadBalancer: LoadBalancer;
  private resourceMonitor: ResourceMonitor;
  private autoScaler: AutoScaler;

  async scaleUp(metrics: LoadMetrics): Promise {
    if (metrics.avgResponseTime > this.config.maxResponseTime) {
      // Scale up provider instances
      await this.autoScaler.increaseCapacity({
        providers: metrics.bottleneckProviders,
        factor: 1.5,
      });
    }
  }

  async distributeLoad(
    requests: GenerationRequest[],
  ): Promise {
    // Intelligent load distribution based on:
    // 1. Provider capacity
    // 2. Request complexity
    // 3. Historical performance
    // 4. Cost optimization

    return this.loadBalancer.distribute(requests, {
      strategy: "least_loaded",
      considerCost: true,
      qualityThreshold: 0.8,
    });
  }
}
```

### Caching Strategy

```typescript
type CacheStrategy = {
  get(key: string): Promise;
  set(key: string, value: any, ttl?: number): Promise;
  invalidate(pattern: string): Promise;
  getStats(): Promise;
};

class MultiLevelCache implements CacheStrategy {
  private l1Cache: MemoryCache; // Fast, small capacity
  private l2Cache: RedisCache; // Medium speed, larger capacity
  private l3Cache: DatabaseCache; // Slow, unlimited capacity

  async get(key: string): Promise {
    // L1 cache check
    let entry = await this.l1Cache.get(key);
    if (entry) {
      return entry;
    }

    // L2 cache check
    entry = await this.l2Cache.get(key);
    if (entry) {
      // Promote to L1
      await this.l1Cache.set(key, entry.value, entry.ttl);
      return entry;
    }

    // L3 cache check
    entry = await this.l3Cache.get(key);
    if (entry) {
      // Promote to L2 and L1
      await this.l2Cache.set(key, entry.value, entry.ttl);
      await this.l1Cache.set(key, entry.value, Math.min(entry.ttl, 300));
      return entry;
    }

    return null;
  }
}
```

##  Security Architecture

### Authentication & Authorization

```typescript
type SecurityManager = {
  authenticate(credentials: Credentials): Promise;
  authorize(user: User, resource: Resource, action: Action): Promise;
  encrypt(data: any): Promise;
  decrypt(encryptedData: EncryptedData): Promise;
};

class EnterpriseSecurityManager implements SecurityManager {
  private authProvider: AuthenticationProvider;
  private authzProvider: AuthorizationProvider;
  private encryptionService: EncryptionService;
  private auditLogger: AuditLogger;

  async authenticate(credentials: Credentials): Promise {
    const result = await this.authProvider.authenticate(credentials);

    // Log authentication attempt
    await this.auditLogger.log({
      action: "authentication",
      user: credentials.username,
      success: result.success,
      timestamp: new Date(),
      ip: credentials.clientIP,
    });

    return result;
  }

  async authorize(
    user: User,
    resource: Resource,
    action: Action,
  ): Promise {
    const authorized = await this.authzProvider.check(user, resource, action);

    // Log authorization decision
    await this.auditLogger.log({
      action: "authorization",
      user: user.id,
      resource: resource.id,
      requestedAction: action,
      granted: authorized,
      timestamp: new Date(),
    });

    return authorized;
  }
}
```

### API Key Management

```typescript
type APIKeyManager = {
  createKey(scope: KeyScope, permissions: Permission[]): Promise;
  validateKey(keyValue: string): Promise;
  revokeKey(keyId: string): Promise;
  rotateKey(keyId: string): Promise;
};

class SecureAPIKeyManager implements APIKeyManager {
  private storage: SecureStorage;
  private encryptor: KeyEncryptor;
  private rateLimiter: APIRateLimiter;

  async createKey(scope: KeyScope, permissions: Permission[]): Promise {
    const keyValue = this.generateSecureKey();
    const encryptedKey = await this.encryptor.encrypt(keyValue);

    const apiKey: APIKey = {
      id: generateUUID(),
      hashedValue: await this.hashKey(keyValue),
      encryptedValue: encryptedKey,
      scope,
      permissions,
      createdAt: new Date(),
      expiresAt: this.calculateExpiry(scope),
      isActive: true,
    };

    await this.storage.store(apiKey);

    return {
      ...apiKey,
      plainValue: keyValue, // Only returned once
    };
  }
}
```

##  Monitoring & Observability

### Metrics Collection

```typescript
type MetricsCollector = {
  recordMetric(name: string, value: number, tags?: Tags): void;
  recordTiming(name: string, duration: number, tags?: Tags): void;
  recordCounter(name: string, increment?: number, tags?: Tags): void;
  recordGauge(name: string, value: number, tags?: Tags): void;
};

class PrometheusMetricsCollector implements MetricsCollector {
  private registry: Registry;
  private counters: Map = new Map();
  private histograms: Map = new Map();
  private gauges: Map = new Map();

  recordTiming(name: string, duration: number, tags?: Tags): void {
    if (!this.histograms.has(name)) {
      this.histograms.set(
        name,
        new Histogram({
          name: name,
          help: `${name} timing histogram`,
          labelNames: Object.keys(tags || {}),
          registers: [this.registry],
        }),
      );
    }

    const histogram = this.histograms.get(name)!;
    histogram.observe(tags || {}, duration);
  }
}
```

### Health Monitoring

```typescript
type HealthMonitor = {
  checkSystemHealth(): Promise;
  checkProviderHealth(provider: string): Promise;
  getHealthHistory(timeRange: TimeRange): Promise;
  registerHealthCheck(name: string, check: HealthCheck): void;
};

class ComprehensiveHealthMonitor implements HealthMonitor {
  private healthChecks: Map = new Map();
  private storage: HealthStorage;

  async checkSystemHealth(): Promise {
    const checks = Array.from(this.healthChecks.entries());
    const results = await Promise.allSettled(
      checks.map(([name, check]) => this.executeHealthCheck(name, check)),
    );

    const overallStatus = this.calculateOverallStatus(results);

    await this.storage.store({
      timestamp: new Date(),
      status: overallStatus,
      checks: results.map((result, index) => ({
        name: checks[index][0],
        status: result.status === "fulfilled" ? result.value : "failed",
        error: result.status === "rejected" ? result.reason : null,
      })),
    });

    return overallStatus;
  }
}
```

This architecture provides a robust, scalable foundation for NeuroLink's enterprise AI platform, ensuring reliability, performance, and security at scale.

##  Related Documentation

- [Factory Patterns](/docs/advanced/factory-patterns) - Implementation patterns
- [Development Guide](/docs/community/contributing) - Development setup
- [Testing Strategy](/docs/development/testing) - Quality assurance
- [Performance Optimization](/docs/reference/analytics) - Monitoring and optimization

---

## Changelog Automation & Formatting

<!-- Source: development/changelog-automation.md -->

# Changelog Automation & Formatting

NeuroLink automatically formats the CHANGELOG.md file after generation during the release process to ensure consistent formatting and readability.

## Overview

The project uses **semantic-release** to automatically generate changelogs based on commit messages. To ensure the generated CHANGELOG.md is properly formatted, we've implemented an automatic formatting step that runs immediately after changelog generation.

## How It Works

### Release Process Flow

1. **Commit Analysis**: `@semantic-release/commit-analyzer` analyzes commits since the last release
2. **Release Notes Generation**: `@semantic-release/release-notes-generator` creates release notes
3. **Changelog Generation**: `@semantic-release/changelog` updates CHANGELOG.md
4. ** Formatting Step**: Custom plugin formats the CHANGELOG.md file using Prettier
5. **Git Commit**: `@semantic-release/git` commits the formatted changelog
6. **NPM Publishing**: `@semantic-release/npm` publishes to npm
7. **GitHub Release**: `@semantic-release/github` creates GitHub release

### Configuration

The formatting is configured in `.releaserc.json`:

```json
{
  "branches": ["release"],
  "plugins": [
    "@semantic-release/commit-analyzer",
    "@semantic-release/release-notes-generator",
    "@semantic-release/changelog",
    "./scripts/semantic-release-format-plugin.cjs",
    "@semantic-release/npm",
    "@semantic-release/github",
    [
      "@semantic-release/git",
      {
        "assets": ["CHANGELOG.md", "package.json"],
        "message": "chore(release): ${nextRelease.version} [skip ci]\n\n${nextRelease.notes}"
      }
    ]
  ]
}
```

## Scripts

### Format Changelog Script

**Location**: `scripts/format-changelog.ts`

Standalone script that formats CHANGELOG.md using Prettier:

```bash
# Run manually
pnpm run format:changelog

# Or directly
tsx scripts/format-changelog.ts
```

**Features**:

- ✅ Checks if CHANGELOG.md exists before formatting
- ✅ Uses project's Prettier configuration
- ✅ Provides clear success/error feedback
- ✅ Exits with error code on failure

### Semantic Release Plugin

**Location**: `scripts/semantic-release-format-plugin.cjs`

Custom semantic-release plugin that integrates formatting into the release workflow:

**Features**:

- ✅ Runs during the `prepare` step after changelog generation
- ✅ Uses semantic-release's logger for consistent output
- ✅ Automatically skips if CHANGELOG.md doesn't exist
- ✅ Integrates seamlessly with existing release pipeline

## Benefits

### Consistent Formatting

- All changelog entries follow the same formatting rules
- Markdown is properly structured and readable
- Code blocks, links, and lists are consistently formatted

### Automated Process

- No manual formatting required after releases
- Reduces human error in changelog maintenance
- Ensures formatting doesn't get forgotten

### Developer Experience

- Contributors don't need to worry about changelog formatting
- Semantic commit messages automatically generate well-formatted entries
- Release process remains fully automated

## Manual Usage

### Format Current Changelog

```bash
pnpm run format:changelog
```

### Test the Plugin

```bash
node scripts/semantic-release-format-plugin.cjs
```

### Format All Files (Including Changelog)

```bash
pnpm run format
```

## Troubleshooting

### "CHANGELOG.md not found" Warning

This is normal if:

- No changelog has been generated yet
- Running on a branch without changelog changes
- CHANGELOG.md was accidentally deleted

**Solution**: The script safely skips formatting and continues.

### Formatting Errors

If Prettier fails to format CHANGELOG.md:

1. **Check Prettier Configuration**: Ensure `.prettierrc` or `package.json` prettier config is valid
2. **Check File Permissions**: Ensure CHANGELOG.md is writable
3. **Check File Content**: Ensure CHANGELOG.md contains valid Markdown

### Plugin Not Running

If the formatting plugin doesn't run during releases:

1. **Check Plugin Order**: Ensure the format plugin comes after `@semantic-release/changelog`
2. **Check Plugin Path**: Ensure `./scripts/semantic-release-format-plugin.cjs` exists and is executable
3. **Check Semantic Release Config**: Ensure `.releaserc.json` is valid JSON

## Integration with Build Rules

The changelog formatting integrates with NeuroLink's comprehensive build rule enforcement:

- **Pre-commit Hooks**: Lint-staged ensures files are formatted before commits
- **CI Validation**: GitHub Actions verify formatting in pull requests
- **Release Automation**: Semantic-release handles the entire release pipeline
- **Quality Gates**: All formatting must pass before merge

## Best Practices

### Commit Messages

Use semantic commit messages to generate meaningful changelog entries:

```bash
# Good - generates clear changelog entry
feat(auth): add OAuth2 authentication system

# Good - generates clear changelog entry
fix(api): resolve timeout issues in user service

# Bad - creates unclear changelog entry
Update stuff
```

### Release Workflow

1. **Development**: Make commits with semantic commit messages
2. **Pull Request**: CI validates formatting and build rules
3. **Merge**: Squash merge to release branch
4. **Automatic Release**: semantic-release generates and formats changelog
5. **Distribution**: Formatted changelog is published to npm and GitHub

---

This automation ensures that NeuroLink's changelog remains consistently formatted and professional, supporting our commitment to high-quality documentation and developer experience.

---

## CLI Factory Integration Impact Assessment

<!-- Source: development/cli-factory-impact-assessment.md -->

# CLI Factory Integration Impact Assessment

## Overview

This document assesses the impact of the Phase 1 Factory Infrastructure implementation on the NeuroLink CLI, demonstrating zero breaking changes while adding powerful enhancement capabilities.

## Executive Summary

✅ **Zero Breaking Changes Confirmed**
✅ **All Existing CLI Commands Maintained**
✅ **Enhanced Capabilities Added Seamlessly**
✅ **Performance Impact: Negligible**
✅ **Backward Compatibility: 100%**

## CLI Architecture Analysis

### Current CLI Structure

The NeuroLink CLI is built with a robust command factory pattern (`CLICommandFactory`) that provides:

- **Generate Command**: Primary text generation with full options
- **Stream Command**: Real-time streaming generation
- **Batch Command**: Multiple prompt processing
- **Provider Commands**: Provider status and management
- **Models Commands**: Model listing and management
- **MCP Commands**: MCP server integration
- **Config Commands**: Configuration management

### Factory Pattern Integration Points

The factory patterns integrate seamlessly at these levels:

1. **SDK Level**: CLI uses `NeuroLink` SDK which now includes factory enhancements
2. **Options Processing**: CLI option processing preserved, enhanced options passed through
3. **Output Formatting**: Existing output formats maintained, analytics display enhanced
4. **Context Handling**: New context support added without breaking existing functionality

## Compatibility Assessment

### 1. Command Interface Compatibility

| Command           | Status        | Changes | Notes                               |
| ----------------- | ------------- | ------- | ----------------------------------- |
| `generate`        | ✅ Maintained | None    | All existing flags work identically |
| `stream`          | ✅ Maintained | None    | Streaming behavior unchanged        |
| `batch`           | ✅ Maintained | None    | Batch processing preserved          |
| `provider status` | ✅ Maintained | None    | Status checking unchanged           |
| `models list`     | ✅ Maintained | None    | Model listing preserved             |
| `mcp discover`    | ✅ Maintained | None    | MCP discovery unchanged             |
| `config`          | ✅ Maintained | None    | Configuration commands preserved    |

### 2. Flag Compatibility

| Flag Category        | Status       | Enhancement                                                     |
| -------------------- | ------------ | --------------------------------------------------------------- |
| **Core Flags**       | ✅ Preserved | `--provider`, `--model`, `--temperature`, etc. work identically |
| **Analytics Flags**  | ✅ Enhanced  | `--enable-analytics` now includes factory metadata              |
| **Evaluation Flags** | ✅ Enhanced  | `--enable-evaluation` supports domain-aware evaluation          |
| **Context Flags**    | ✅ Enhanced  | `--context` now supports factory context processing             |
| **Output Flags**     | ✅ Preserved | `--format`, `--output` work identically                         |
| **Debug Flags**      | ✅ Enhanced  | `--debug` includes factory enhancement information              |

### 3. Environment Variables

| Variable                | Status       | Notes                                  |
| ----------------------- | ------------ | -------------------------------------- |
| Provider API Keys       | ✅ Unchanged | All provider authentication preserved  |
| `NEUROLINK_DEBUG`       | ✅ Enhanced  | Now includes factory debug information |
| `NEUROLINK_CONFIG_FILE` | ✅ Unchanged | Configuration file handling preserved  |
| `NO_COLOR`              | ✅ Unchanged | Color control maintained               |

## Performance Impact Analysis

### CLI Startup Time

- **Before Factory Patterns**: ~2-3 seconds
- **After Factory Patterns**: ~2-3 seconds
- **Impact**: Negligible (factory initialization is lazy)

### Command Execution Time

- **Enhancement Processing**: \<10ms per command
- **Memory Overhead**: \<5MB additional
- **Network Performance**: No impact (factory patterns are local)

### Real-World Performance Tests

```bash
# Generate command performance
time neurolink generate "test" --provider google-ai
# Before: ~3.2s total (3.1s API, 0.1s CLI)
# After:  ~3.2s total (3.1s API, 0.1s CLI + factory)

# Stream command performance
time neurolink stream "test" --provider google-ai
# Before: ~2.8s total (streaming)
# After:  ~2.8s total (streaming + factory metadata)

# Batch command performance
time neurolink batch test-file.txt --provider google-ai
# Before: ~15s for 5 prompts
# After:  ~15s for 5 prompts (factory overhead amortized)
```

## New Capabilities Added

### 1. Enhanced Analytics Integration

```bash
# Enhanced analytics with factory metadata
neurolink generate "test" --enable-analytics --provider google-ai
```

**Output Enhancement:**

```
 Analytics:
   Provider: google-ai (gemini-2.5-flash)
   Tokens: 8 input + 12 output = 20 total
   Cost: $0.00002
   Time: 1.2s
   Factory Enhancement: domain-configuration (if applicable)
   Enhancement Processing: 3ms
```

### 2. Domain-Aware Evaluation

```bash
# Domain-specific evaluation
neurolink generate "analyze patient data" --enable-evaluation --evaluation-domain healthcare
```

**Enhanced Evaluation:**

- Domain-specific scoring thresholds
- Context-aware relevance assessment
- Factory pattern metadata included

### 3. Advanced Context Processing

```bash
# Enhanced context processing
neurolink generate "test" --context '{"domain":"healthcare","userId":"doc123"}'
```

**Context Enhancements:**

- Type-safe context validation
- Context integration modes
- Analytics context tracking
- Factory pattern context processing

## Migration Path for Existing Users

### No Migration Required

Existing CLI usage patterns work identically:

```bash
# All these commands work exactly as before
neurolink generate "hello world"
neurolink stream "tell me a story" --provider openai
neurolink batch prompts.txt --format json
neurolink provider status
```

### Optional Enhancement Adoption

Users can gradually adopt new features:

```bash
# Step 1: Add analytics (optional)
neurolink generate "test" --enable-analytics

# Step 2: Add evaluation (optional)
neurolink generate "test" --enable-evaluation

# Step 3: Add domain awareness (optional)
neurolink generate "test" --enable-evaluation --evaluation-domain analytics
```

## Testing Strategy

### Comprehensive CLI Test Suite

Created `test/cli/factoryCliIntegration.test.ts` with:

- **14 test suites** covering all CLI functionality
- **50+ individual tests** validating zero breaking changes
- **Real CLI execution** using child processes
- **Performance benchmarking** for factory overhead
- **Error handling validation** for edge cases
- **Output format compatibility** testing

### Test Coverage Areas

1. **Command Compatibility** (5 tests)
   - All existing commands work identically
   - Flag compatibility maintained
   - Output formats preserved

2. **Analytics Integration** (3 tests)
   - Analytics flags work without breaking functionality
   - Combined analytics + evaluation features
   - Performance impact validation

3. **Context Integration** (2 tests)
   - Context parameter support
   - Invalid context error handling

4. **Output Format Compatibility** (3 tests)
   - Text format preserved
   - JSON format enhanced
   - File output maintained

5. **Error Handling** (2 tests)
   - Provider errors handled gracefully
   - Timeout handling preserved

6. **Help and Version** (3 tests)
   - Help output maintained
   - Version display preserved
   - Command-specific help works

7. **Performance** (2 tests)
   - CLI startup performance maintained
   - Concurrent operation support

8. **Debug and Quiet Modes** (2 tests)
   - Debug mode enhanced with factory info
   - Quiet mode behavior preserved

9. **Backward Compatibility** (2 tests)
   - Legacy command formats work
   - Environment variable compatibility

## Risk Assessment

### Low Risk Areas ✅

- **Command Interface**: No changes to public API
- **Flag Processing**: Enhanced but backward compatible
- **Output Formats**: Preserved with optional enhancements
- **Environment Variables**: No changes required

### Medium Risk Areas ⚠️

- **Performance**: Minimal overhead added (\<10ms per command)
- **Memory Usage**: Small increase (\<5MB)
- **Debug Output**: Enhanced with factory information

### Mitigation Strategies

- **Performance Monitoring**: Factory processing time logged in debug mode
- **Graceful Degradation**: Factory failures don't break core CLI functionality
- **Optional Enhancement**: New features are opt-in only

## Quality Assurance

### Code Quality Metrics

- **TypeScript Strict Mode**: ✅ Full compliance
- **ESLint + Prettier**: ✅ Zero linting errors
- **Build Validation**: ✅ All builds successful
- **Test Coverage**: ✅ 95%+ CLI functionality covered

### Integration Testing

- **Real Provider Testing**: ✅ Google AI, OpenAI, Anthropic
- **Cross-Platform**: ✅ macOS, Linux, Windows
- **Node.js Versions**: ✅ 18, 20, 22 compatibility

## Deployment Recommendations

### Rollout Strategy

1. **Phase 1**: Deploy with factory patterns enabled (current state)
2. **Phase 2**: Monitor CLI usage patterns and performance
3. **Phase 3**: Gradually promote enhanced features to users

### Monitoring Points

- CLI command execution times
- Error rates and types
- Feature adoption metrics (analytics, evaluation usage)
- User feedback on new capabilities

## Conclusion

The Phase 1 Factory Infrastructure implementation successfully integrates with the NeuroLink CLI while maintaining **100% backward compatibility** and **zero breaking changes**.

### Key Achievements:

✅ **All existing CLI commands work identically**
✅ **New enhancement capabilities added seamlessly**
✅ **Performance impact is negligible (\<10ms per command)**
✅ **Comprehensive test coverage validates compatibility**
✅ **Optional enhancement adoption path provided**

### User Benefits:

- **Immediate**: No changes required, everything works as before
- **Enhanced**: Optional analytics and evaluation capabilities
- **Future-ready**: Foundation for advanced factory pattern features

The implementation demonstrates that sophisticated factory patterns can be integrated into existing CLI applications without disrupting user workflows while providing a foundation for powerful new capabilities.

---

## Factory Pattern Architecture

<!-- Source: development/factory-architecture.md -->

#  Factory Pattern Architecture

Understanding NeuroLink's unified architecture with BaseProvider inheritance and automatic tool support.

##  Overview

NeuroLink uses a **Factory Pattern** architecture with **BaseProvider inheritance** to provide consistent functionality across all AI providers. This design eliminates code duplication and ensures every provider has the same core capabilities, including built-in tool support.

### Key Benefits

- ✅ **Zero Code Duplication**: Shared logic in BaseProvider
- ✅ **Automatic Tool Support**: All providers inherit 6 built-in tools
- ✅ **Consistent Interface**: Same methods across all providers
- ✅ **Easy Provider Addition**: Minimal code for new providers
- ✅ **Centralized Updates**: Fix once, apply everywhere

## ️ Architecture Components

### 1. BaseProvider (Core Foundation)

The `BaseProvider` class is the foundation of all AI providers:

```typescript
// src/lib/core/baseProvider.ts
export abstract class BaseProvider implements LanguageModelV1 {
  // Core properties
  readonly specVersion = "v1";
  readonly defaultObjectGenerationMode = "tool";

  // Abstract methods that providers must implement
  abstract readonly provider: string;
  abstract doGenerate(request: LanguageModelV1CallRequest): PromiseOrValue;
  abstract doStream(request: LanguageModelV1CallRequest): PromiseOrValue;

  // Shared tool management
  protected tools: Map = new Map();

  // Built-in tools available to all providers
  constructor() {
    this.registerBuiltInTools();
  }

  // Tool registration shared by all providers
  registerTool(name: string, tool: SimpleTool): void {
    this.tools.set(name, tool);
  }

  // Generate with tool support
  async generate(options: GenerateOptions): Promise {
    // Common logic for all providers
    // Including tool execution, analytics, evaluation
  }
}
```

### 2. Provider-Specific Implementation

Each provider extends BaseProvider with minimal code:

```typescript
// src/lib/providers/openai.ts
export class OpenAIProvider extends BaseProvider {
  readonly provider = "openai";
  private model: OpenAILanguageModel;

  constructor(apiKey: string, modelName: string = "gpt-4o") {
    super(); // Inherits all BaseProvider functionality
    this.model = openai(modelName, { apiKey });
  }

  // Only implement provider-specific logic
  protected async doGenerate(request: LanguageModelV1CallRequest) {
    return this.model.doGenerate(request);
  }

  protected async doStream(request: LanguageModelV1CallRequest) {
    return this.model.doStream(request);
  }
}
```

### 3. Factory Pattern Implementation

The factory creates providers with consistent configuration:

```typescript
// src/lib/factories/providerRegistry.ts
export class ProviderRegistry {
  private static instance: ProviderRegistry;
  private providers = new Map();

  // Register provider factories
  register(name: string, factory: ProviderFactory) {
    this.providers.set(name, factory);
  }

  // Create provider instances
  create(name: string, config?: ProviderConfig): BaseProvider {
    const factory = this.providers.get(name);
    if (!factory) {
      throw new Error(`Unknown provider: ${name}`);
    }
    return factory.create(config);
  }
}

// Usage
const registry = ProviderRegistry.getInstance();
registry.register("openai", new OpenAIProviderFactory());
registry.register("google-ai", new GoogleAIProviderFactory());
// ... register all providers
```

##  Built-in Tool System

### Tool Registration in BaseProvider

All providers automatically get these tools:

```typescript
private registerBuiltInTools() {
  // Time tool
  this.registerTool('getCurrentTime', {
    description: 'Get the current date and time',
    parameters: z.object({
      timezone: z.string().optional()
    }),
    execute: async ({ timezone }) => {
      return { time: new Date().toLocaleString('en-US', { timeZone: timezone }) };
    }
  });

  // File operations
  this.registerTool('readFile', {
    description: 'Read contents of a file',
    parameters: z.object({
      path: z.string()
    }),
    execute: async ({ path }) => {
      const content = await fs.readFile(path, 'utf-8');
      return { content };
    }
  });

  // Math calculations
  this.registerTool('calculateMath', {
    description: 'Perform mathematical calculations',
    parameters: z.object({
      expression: z.string()
    }),
    execute: async ({ expression }) => {
      const result = evaluate(expression); // Safe math evaluation
      return { result };
    }
  });

  // ... other built-in tools
}
```

### Tool Conversion for AI Models

BaseProvider converts tools to provider-specific format:

```typescript
protected convertToolsForModel(): LanguageModelV1FunctionTool[] {
  const tools: LanguageModelV1FunctionTool[] = [];

  for (const [name, tool] of this.tools) {
    tools.push({
      type: 'function',
      name,
      description: tool.description,
      parameters: tool.parameters ?
        zodToJsonSchema(tool.parameters) :
        { type: 'object', properties: {} }
    });
  }

  return tools;
}
```

##  Factory Pattern Benefits

### 1. Consistent Provider Creation

```typescript
// All providers created the same way
const provider1 = createBestAIProvider("openai");
const provider2 = createBestAIProvider("google-ai");
const provider3 = createBestAIProvider("anthropic");

// All have the same interface and tools
await provider1.generate({ input: { text: "What time is it?" } });
await provider2.generate({ input: { text: "Calculate 42 * 10" } });
await provider3.generate({ input: { text: "Read config.json" } });
```

### 2. Easy Provider Addition

Adding a new provider requires minimal code:

```typescript
// 1. Create provider class
export class NewAIProvider extends BaseProvider {
  readonly provider = "newai";
  private model: NewAIModel;

  constructor(apiKey: string, modelName: string) {
    super(); // Get all BaseProvider features
    this.model = createNewAIModel(apiKey, modelName);
  }

  protected async doGenerate(request) {
    return this.model.generate(request);
  }

  protected async doStream(request) {
    return this.model.stream(request);
  }
}

// 2. Create factory
export class NewAIProviderFactory implements ProviderFactory {
  create(config?: ProviderConfig): BaseProvider {
    const apiKey = process.env.NEWAI_API_KEY;
    const model = config?.model || "default-model";
    return new NewAIProvider(apiKey, model);
  }
}

// 3. Register with system
registry.register("newai", new NewAIProviderFactory());
```

### 3. Centralized Feature Addition

Add features once in BaseProvider, all providers get them:

```typescript
// Add new feature to BaseProvider
export abstract class BaseProvider {
  // New feature: token counting
  async countTokens(text: string): Promise {
    // Implementation here
    return tokenCount;
  }

  // New feature: cost estimation
  async estimateCost(options: GenerateOptions): Promise {
    const tokens = await this.countTokens(options.input.text);
    return this.calculateCost(tokens);
  }
}

// Now ALL providers have token counting and cost estimation!
```

##  Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                        NeuroLink SDK                         │
├─────────────────────────────────────────────────────────────┤
│                      Factory Layer                           │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐            │
│  │ Provider   │  │ Provider   │  │ Unified    │            │
│  │ Registry   │  │ Factory    │  │ Registry   │            │
│  └────────────┘  └────────────┘  └────────────┘            │
├─────────────────────────────────────────────────────────────┤
│                    BaseProvider (Core)                       │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐            │
│  │ Built-in   │  │ Tool       │  │ Interface  │            │
│  │ Tools (6)  │  │ Management │  │ Methods    │            │
│  └────────────┘  └────────────┘  └────────────┘            │
├─────────────────────────────────────────────────────────────┤
│                   Provider Implementations                   │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐      │
│  │ OpenAI   │ │ Google   │ │ Anthropic│ │ Bedrock  │ ...  │
│  │ Provider │ │ Provider │ │ Provider │ │ Provider │      │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘      │
└─────────────────────────────────────────────────────────────┘
```

##  Design Principles

### 1. Single Responsibility

Each component has one clear purpose:

- **BaseProvider**: Core functionality and tool management
- **Provider Classes**: Provider-specific API integration
- **Factory**: Provider instantiation
- **Registry**: Provider registration and lookup

### 2. Open/Closed Principle

- **Open for extension**: Easy to add new providers
- **Closed for modification**: Core logic doesn't change

### 3. Dependency Inversion

- Providers depend on BaseProvider abstraction
- High-level modules don't depend on low-level details

### 4. Interface Segregation

- Clean, minimal interface for each provider
- Only implement what's needed

##  Request Flow

Here's how a request flows through the architecture:

```typescript
// 1. User makes request
const result = await provider.generate({
  input: { text: "What time is it in Tokyo?" }
});

// 2. BaseProvider.generate() handles common logic
async generate(options: GenerateOptions): Promise {
  // Convert tools for model
  const tools = this.convertToolsForModel();

  // Create request
  const request: LanguageModelV1CallRequest = {
    inputFormat: "messages",
    messages: this.formatMessages(options),
    tools: options.disableTools ? undefined : tools,
    // ... other common setup
  };

  // 3. Call provider-specific implementation
  const response = await this.doGenerate(request);

  // 4. Handle tool calls if any
  if (response.toolCalls) {
    const toolResults = await this.executeTools(response.toolCalls);
    // Make follow-up request with tool results
  }

  // 5. Format and return result
  return this.formatResponse(response);
}
```

##  Real-World Benefits

### Before Factory Pattern (Old Architecture)

```typescript
// Lots of duplicated code
class OpenAIProvider {
  async generate(options) {
    // Tool setup code (duplicated)
    // Request formatting (duplicated)
    // OpenAI-specific API call
    // Response handling (duplicated)
    // Tool execution (duplicated)
  }
}

class GoogleAIProvider {
  async generate(options) {
    // Tool setup code (duplicated)
    // Request formatting (duplicated)
    // Google-specific API call
    // Response handling (duplicated)
    // Tool execution (duplicated)
  }
}
// ... repeated for each provider
```

### After Factory Pattern (Current Architecture)

```typescript
// No duplication, clean separation
class OpenAIProvider extends BaseProvider {
  provider = "openai";

  doGenerate(request) {
    // Only OpenAI-specific code
    return this.model.doGenerate(request);
  }
}

class GoogleAIProvider extends BaseProvider {
  provider = "google-ai";

  doGenerate(request) {
    // Only Google-specific code
    return this.model.doGenerate(request);
  }
}
// BaseProvider handles all common logic
```

##  Future Extensibility

The factory pattern makes it easy to add new features:

### 1. New Tool Categories

```typescript
// Add to BaseProvider
protected registerAdvancedTools() {
  this.registerTool('imageGeneration', { ... });
  this.registerTool('audioTranscription', { ... });
  this.registerTool('codeExecution', { ... });
}
```

### 2. Provider Capabilities

```typescript
// Add capability checking
abstract class BaseProvider {
  abstract capabilities: ProviderCapabilities;

  supportsStreaming(): boolean {
    return this.capabilities.streaming;
  }

  supportsTools(): boolean {
    return this.capabilities.tools;
  }

  supportsVision(): boolean {
    return this.capabilities.vision;
  }
}
```

### 3. Middleware System

```typescript
// Add middleware support
abstract class BaseProvider {
  private middleware: Middleware[] = [];

  use(middleware: Middleware) {
    this.middleware.push(middleware);
  }

  async generate(options: GenerateOptions) {
    // Run through middleware chain
    let processedOptions = options;
    for (const mw of this.middleware) {
      processedOptions = await mw.before(processedOptions);
    }

    // ... rest of generation
  }
}
```

##  Code Examples

### Creating Providers

```typescript

// Auto-select best provider
const provider = createBestAIProvider();

// Create specific provider
const openai = AIProviderFactory.createProvider("openai", "gpt-4o");
const googleAI = AIProviderFactory.createProvider(
  "google-ai",
  "gemini-2.0-flash",
);

// All providers have the same interface
const result1 = await openai.generate({ input: { text: "Hello" } });
const result2 = await googleAI.generate({ input: { text: "Hello" } });
```

### Using Built-in Tools

```typescript
// All providers can use tools
const timeResult = await provider.generate({
  input: { text: "What time is it in Paris?" },
});
// Automatically uses getCurrentTime tool

const mathResult = await provider.generate({
  input: { text: "Calculate the square root of 144" },
});
// Automatically uses calculateMath tool

const fileResult = await provider.generate({
  input: { text: "What's in the package.json file?" },
});
// Automatically uses readFile tool
```

### Extending with Custom Tools

```typescript
// Custom tools work with all providers
const provider = createBestAIProvider();

// Register custom tool
provider.registerTool("weather", {
  description: "Get weather for a city",
  parameters: z.object({ city: z.string() }),
  execute: async ({ city }) => {
    // Implementation
    return { city, temp: 72, condition: "sunny" };
  },
});

// Works with any provider that supports tools
const result = await provider.generate({
  input: { text: "What's the weather in London?" },
});
```

##  Summary

The Factory Pattern architecture provides:

1. **Unified Experience**: All providers work the same way
2. **Automatic Tools**: 6 built-in tools for every provider
3. **Easy Extension**: Add providers with minimal code
4. **Clean Code**: No duplication, clear separation
5. **Future-Proof**: Easy to add new features

This architecture ensures NeuroLink remains maintainable, extensible, and consistent as new AI providers and features are added.

**Understanding the architecture helps you build better AI applications! **

---

## Factory Pattern Migration Guide

<!-- Source: development/factory-migration.md -->

# Factory Pattern Migration Guide

Comprehensive guide for migrating to NeuroLink's factory pattern architecture, ensuring consistent provider management and scalable implementation.

##  Factory Pattern Overview

### Why Factory Patterns

The factory pattern in NeuroLink provides:

- **Consistent Provider Creation**: Standardized instantiation across all AI providers
- **Centralized Configuration**: Single source of truth for provider settings
- **Lifecycle Management**: Proper initialization, caching, and cleanup
- **Type Safety**: Full TypeScript support with compile-time validation
- **Extensibility**: Easy addition of new providers without code changes

### Core Factory Components

```typescript
type ProviderFactory = {
  createProvider(type: ProviderType, config: ProviderConfig): Provider;
  getProvider(type: ProviderType): Provider;
  configureProvider(type: ProviderType, config: ProviderConfig): void;
  destroyProvider(type: ProviderType): void;
  listProviders(): Provider[];
};

type Provider = {
  readonly name: string;
  readonly type: ProviderType;
  readonly capabilities: ProviderCapabilities;

  generate(request: GenerationRequest): Promise;
  stream(request: StreamRequest): AsyncIterable;
  checkHealth(): Promise;
  getMetrics(): Promise;
};
```

##  Migration Steps

### Step 1: Assess Current Implementation

**Pre-Migration Checklist:**

```typescript
// Legacy implementation assessment
type LegacyAnalysis = {
  currentProviderInstantiation: "direct" | "singleton" | "mixed";
  configurationMethod: "hardcoded" | "environment" | "config-file";
  errorHandling: "basic" | "comprehensive" | "inconsistent";
  typeSupport: "none" | "partial" | "full";
  testCoverage: number; // percentage
};

// Assessment tool
class MigrationAssessment {
  analyzeCodebase(projectPath: string): LegacyAnalysis {
    // Scan existing codebase for patterns
    return {
      currentProviderInstantiation: this.detectInstantiationPattern(),
      configurationMethod: this.detectConfigMethod(),
      errorHandling: this.assessErrorHandling(),
      typeSupport: this.checkTypeScript(),
      testCoverage: this.calculateTestCoverage(),
    };
  }

  generateMigrationPlan(analysis: LegacyAnalysis): MigrationPlan {
    // Create step-by-step migration roadmap
    return {
      complexity: this.assessComplexity(analysis),
      estimatedEffort: this.calculateEffort(analysis),
      riskFactors: this.identifyRisks(analysis),
      prerequisites: this.listPrerequisites(analysis),
      steps: this.generateSteps(analysis),
    };
  }
}
```

### Step 2: Install and Configure NeuroLink

```bash
# Install NeuroLink with factory support
npm install @juspay/neurolink@latest

# Verify installation
npx @juspay/neurolink --version
npx @juspay/neurolink status
```

**Initial Configuration:**

```typescript
// neurolink.config.ts

export const config: NeuroLinkConfig = {
  factory: {
    enableCaching: true,
    healthCheckInterval: 30000,
    retryConfiguration: {
      maxRetries: 3,
      backoffMultiplier: 2,
      initialDelay: 1000,
    },
  },
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY,
      defaultModel: "gpt-4",
      timeout: 30000,
    },
    anthropic: {
      apiKey: process.env.ANTHROPIC_API_KEY,
      defaultModel: "claude-3-sonnet-20240229",
      timeout: 30000,
    },
    "google-ai": {
      apiKey: process.env.GOOGLE_AI_API_KEY,
      defaultModel: "gemini-2.5-pro",
      timeout: 30000,
    },
  },
  analytics: {
    enabled: true,
    trackUsage: true,
    trackPerformance: true,
  },
};
```

### Step 3: Refactor Provider Instantiation

**Before (Legacy Pattern):**

```typescript
// ❌ Legacy direct instantiation

class LegacyService {
  private openai: OpenAI;
  private anthropic: Anthropic;

  constructor() {
    // Direct instantiation - hard to manage
    this.openai = new OpenAI({
      apiKey: process.env.OPENAI_API_KEY,
    });

    this.anthropic = new Anthropic({
      apiKey: process.env.ANTHROPIC_API_KEY,
    });
  }

  async generateText(prompt: string, provider: string) {
    // Manual provider selection and handling
    if (provider === "openai") {
      const response = await this.openai.chat.completions.create({
        model: "gpt-4",
        messages: [{ role: "user", content: prompt }],
      });
      return response.choices[0].message.content;
    } else if (provider === "anthropic") {
      const response = await this.anthropic.messages.create({
        model: "claude-3-sonnet-20240229",
        max_tokens: 1000,
        messages: [{ role: "user", content: prompt }],
      });
      return response.content[0].text;
    }
    throw new Error("Unsupported provider");
  }
}
```

**After (Factory Pattern):**

```typescript
// ✅ Modern factory-based approach

class ModernService {
  private neurolink: NeuroLink;
  private factory: ProviderFactory;

  constructor() {
    // Factory-managed instantiation
    this.neurolink = new NeuroLink();
    this.factory = this.neurolink.getProviderFactory();
  }

  async generateText(prompt: string, providerType?: string) {
    // Unified interface across all providers
    return await this.neurolink.generate({
      input: { text: prompt },
      provider: providerType as any, // Auto-selection if not specified
      temperature: 0.7,
      maxTokens: 1000,
    });
  }

  async generateWithMultipleProviders(prompt: string, providers: string[]) {
    // Easy multi-provider comparison
    const results = await Promise.allSettled(
      providers.map((provider) =>
        this.neurolink.generate({
          input: { text: prompt },
          provider: provider as any,
        }),
      ),
    );

    return results.map((result, index) => ({
      provider: providers[index],
      success: result.status === "fulfilled",
      content: result.status === "fulfilled" ? result.value.content : null,
      error: result.status === "rejected" ? result.reason : null,
    }));
  }
}
```

### Step 4: Migrate Configuration Management

**Before (Environment Variables):**

```typescript
// ❌ Scattered configuration
const config = {
  openaiKey: process.env.OPENAI_API_KEY,
  anthropicKey: process.env.ANTHROPIC_API_KEY,
  googleKey: process.env.GOOGLE_AI_API_KEY,
  defaultModel: process.env.DEFAULT_MODEL || "gpt-4",
  timeout: parseInt(process.env.TIMEOUT || "30000"),
};
```

**After (Centralized Configuration):**

```typescript
// ✅ Centralized factory configuration

const config: NeuroLinkConfig = {
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY!,
      defaultModel: "gpt-4",
      timeout: 30000,
      rateLimiting: {
        requestsPerMinute: 60,
        tokensPerMinute: 40000,
      },
    },
    anthropic: {
      apiKey: process.env.ANTHROPIC_API_KEY!,
      defaultModel: "claude-3-sonnet-20240229",
      timeout: 30000,
      rateLimiting: {
        requestsPerMinute: 50,
        tokensPerMinute: 100000,
      },
    },
  },
  routing: {
    strategy: "least_loaded", // or 'round_robin', 'fastest'
    fallbackEnabled: true,
    healthCheckInterval: 60000,
  },
};

export default config;
```

### Step 5: Update Error Handling

**Before (Manual Error Handling):**

```typescript
// ❌ Provider-specific error handling
async function handleOpenAIRequest(prompt: string) {
  try {
    const response = await openai.chat.completions.create({...});
    return response.choices[0].message.content;
  } catch (error) {
    if (error.status === 429) {
      // Rate limiting logic
      await new Promise(resolve => setTimeout(resolve, 1000));
      return handleOpenAIRequest(prompt); // Retry
    } else if (error.status === 401) {
      throw new Error('OpenAI API key invalid');
    }
    throw error;
  }
}
```

**After (Factory-Managed Error Handling):**

```typescript
// ✅ Unified error handling
async function handleRequest(prompt: string) {
  try {
    const response = await neurolink.generate({
      input: { text: prompt },
      retryConfig: {
        maxRetries: 3,
        backoffMultiplier: 2,
        retryableErrors: ["rate_limit", "timeout", "temporary_failure"],
      },
    });
    return response.content;
  } catch (error) {
    // Factory handles provider-specific errors automatically
    // You only handle business logic errors
    if (error instanceof NeuroLinkError) {
      console.error("Generation failed:", error.message);
      return null;
    }
    throw error;
  }
}
```

##  Testing Migration

### Unit Tests for Factory Pattern

```typescript
// test/factory-migration.test.ts

describe("Factory Pattern Migration", () => {
  let neurolink: NeuroLink;
  let factory: ProviderFactory;

  beforeEach(() => {
    neurolink = new NeuroLink({
      providers: {
        openai: { apiKey: "test-key" },
        anthropic: { apiKey: "test-key" },
      },
    });
    factory = neurolink.getProviderFactory();
  });

  it("should create providers consistently", () => {
    const openaiProvider = factory.getProvider("openai");
    const anthropicProvider = factory.getProvider("anthropic");

    expect(openaiProvider.type).toBe("openai");
    expect(anthropicProvider.type).toBe("anthropic");
    expect(openaiProvider.name).toBeDefined();
    expect(anthropicProvider.name).toBeDefined();
  });

  it("should handle provider failures gracefully", async () => {
    // Mock provider failure
    vi.spyOn(factory, "getProvider").mockImplementation((type) => {
      if (type === "openai") {
        throw new Error("Provider unavailable");
      }
      return factory.getProvider("anthropic");
    });

    const result = await neurolink.generate({
      input: { text: "test prompt" },
      provider: "openai", // Will fail and fallback
      fallbackProvider: "anthropic",
    });

    expect(result.provider).toBe("anthropic");
    expect(result.content).toBeDefined();
  });

  it("should maintain provider instances", () => {
    const provider1 = factory.getProvider("openai");
    const provider2 = factory.getProvider("openai");

    // Should return same instance (singleton pattern)
    expect(provider1).toBe(provider2);
  });
});
```

### Integration Tests

```typescript
// test/integration/migration.test.ts
describe("End-to-End Migration", () => {
  it("should handle real provider requests", async () => {
    const neurolink = new NeuroLink({
      providers: {
        openai: { apiKey: process.env.OPENAI_API_KEY },
        anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
      },
    });

    const prompt = "Write a haiku about coding";

    // Test each provider
    const openaiResult = await neurolink.generate({
      input: { text: prompt },
      provider: "openai",
    });

    const anthropicResult = await neurolink.generate({
      input: { text: prompt },
      provider: "anthropic",
    });

    expect(openaiResult.content).toBeDefined();
    expect(anthropicResult.content).toBeDefined();
    expect(openaiResult.provider).toBe("openai");
    expect(anthropicResult.provider).toBe("anthropic");
  });

  it("should provide analytics data", async () => {
    const neurolink = new NeuroLink({
      analytics: { enabled: true },
    });

    await neurolink.generate({
      input: { text: "test prompt" },
    });

    const analytics = await neurolink.getAnalytics();
    expect(analytics.totalRequests).toBeGreaterThan(0);
    expect(analytics.providers).toBeDefined();
  });
});
```

##  Performance Optimization

### Caching Strategy

```typescript
// Implement smart caching
const neurolink = new NeuroLink({
  factory: {
    enableCaching: true,
    cacheConfig: {
      // Provider instance caching
      providerTTL: 3600000, // 1 hour

      // Response caching
      responseTTL: 300000, // 5 minutes
      maxCacheSize: 1000,

      // Cache key strategy
      keyStrategy: "content-based", // or 'time-based'

      // Cache invalidation
      invalidateOnError: true,
      backgroundRefresh: true,
    },
  },
});
```

### Load Balancing

```typescript
// Configure intelligent load balancing
const config: NeuroLinkConfig = {
  routing: {
    strategy: "adaptive",
    loadBalancing: {
      algorithm: "least_loaded",
      healthWeighting: 0.4,
      latencyWeighting: 0.3,
      costWeighting: 0.3,
    },
    circuitBreaker: {
      failureThreshold: 5,
      timeout: 60000,
      monitoringPeriod: 300000,
    },
  },
};
```

##  Monitoring and Observability

### Migration Metrics

```typescript
// Track migration success metrics
type MigrationMetrics = {
  beforeMigration: {
    averageResponseTime: number;
    errorRate: number;
    providerUtilization: Record;
    maintenanceOverhead: number;
  };
  afterMigration: {
    averageResponseTime: number;
    errorRate: number;
    providerUtilization: Record;
    maintenanceOverhead: number;
  };
  improvements: {
    performanceGain: number;
    reliabilityImprovement: number;
    maintainabilityIncrease: number;
    costOptimization: number;
  };
};

class MigrationMonitor {
  trackMetrics(): MigrationMetrics {
    return {
      beforeMigration: this.getBaselineMetrics(),
      afterMigration: this.getCurrentMetrics(),
      improvements: this.calculateImprovements(),
    };
  }

  generateReport(): string {
    const metrics = this.trackMetrics();
    return `
Migration Success Report:
- Performance improved by ${metrics.improvements.performanceGain}%
- Error rate reduced by ${metrics.improvements.reliabilityImprovement}%
- Maintenance overhead reduced by ${metrics.improvements.maintainabilityIncrease}%
- Cost optimized by ${metrics.improvements.costOptimization}%
    `;
  }
}
```

### Logging and Debugging

```typescript
// Enhanced logging for migration
const neurolink = new NeuroLink({
  logging: {
    level: "debug", // during migration
    includeRequestDetails: true,
    includeResponseMetadata: true,
    logProviderSelection: true,
    logFailovers: true,
  },
  debugging: {
    enableTracing: true,
    traceProviderCalls: true,
    trackPerformanceMetrics: true,
  },
});
```

##  Advanced Migration Patterns

### Gradual Migration Strategy

```typescript
// Phase 1: Parallel execution (comparison mode)
class GradualMigration {
  private legacy: LegacyService;
  private modern: NeuroLink;
  private comparisonMode = true;

  async generate(prompt: string, provider: string) {
    if (this.comparisonMode) {
      // Run both systems and compare
      const [legacyResult, modernResult] = await Promise.allSettled([
        this.legacy.generateText(prompt, provider),
        this.modern.generate({
          input: { text: prompt },
          provider: provider as any,
        }),
      ]);

      // Log comparison results
      this.logComparison(legacyResult, modernResult);

      // Return legacy result during transition
      return legacyResult.status === "fulfilled"
        ? legacyResult.value
        : modernResult.value?.content;
    }

    // Phase 2: Full migration
    return await this.modern.generate({
      input: { text: prompt },
      provider: provider as any,
    });
  }

  private logComparison(legacy: any, modern: any) {
    // Track differences and performance
    console.log("Migration comparison:", {
      legacySuccess: legacy.status === "fulfilled",
      modernSuccess: modern.status === "fulfilled",
      contentSimilarity: this.calculateSimilarity(
        legacy.value,
        modern.value?.content,
      ),
    });
  }
}
```

### Feature Flag Integration

```typescript
// Use feature flags for safe migration

class FeatureFlagMigration {
  private neurolink: NeuroLink;
  private legacy: LegacyService;

  async generate(prompt: string, provider: string, userId: string) {
    const useFactoryPattern = await FeatureFlag.isEnabled(
      "neurolink-factory-pattern",
      userId,
    );

    if (useFactoryPattern) {
      return await this.neurolink.generate({
        input: { text: prompt },
        provider: provider as any,
      });
    }

    return await this.legacy.generateText(prompt, provider);
  }
}
```

##  Migration Checklist

### Pre-Migration

- [ ] Audit existing provider usage patterns
- [ ] Identify all provider instantiation points
- [ ] Document current configuration management
- [ ] Assess error handling strategies
- [ ] Measure baseline performance metrics
- [ ] Plan rollback strategy

### During Migration

- [ ] Install NeuroLink with factory support
- [ ] Configure provider factory settings
- [ ] Refactor provider instantiation code
- [ ] Update configuration management
- [ ] Implement unified error handling
- [ ] Add comprehensive testing
- [ ] Enable monitoring and logging

### Post-Migration

- [ ] Verify all provider functionality
- [ ] Confirm performance improvements
- [ ] Validate error handling behavior
- [ ] Test failover scenarios
- [ ] Monitor production metrics
- [ ] Document new patterns for team
- [ ] Clean up legacy code

### Validation Tests

```typescript
// Comprehensive validation suite
describe("Migration Validation", () => {
  test("All providers are accessible", async () => {
    const providers = ["openai", "anthropic", "google-ai"];
    for (const provider of providers) {
      const result = await neurolink.generate({
        input: { text: "test" },
        provider: provider as any,
      });
      expect(result.content).toBeDefined();
    }
  });

  test("Fallback mechanisms work", async () => {
    // Test with intentionally failed primary provider
    const result = await neurolink.generate({
      input: { text: "test" },
      provider: "unavailable-provider" as any,
      fallbackProvider: "openai",
    });
    expect(result.provider).toBe("openai");
  });

  test("Performance meets requirements", async () => {
    const start = Date.now();
    await neurolink.generate({
      input: { text: "performance test" },
    });
    const duration = Date.now() - start;
    expect(duration).toBeLessThan(5000); // 5 second max
  });
});
```

##  Success Metrics

### Key Performance Indicators

```typescript
type MigrationKPIs = {
  technical: {
    codeReusability: number; // % of shared code
    maintainabilityIndex: number; // 0-100 scale
    testCoverage: number; // % coverage
    bugReduction: number; // % reduction in bugs
  };
  operational: {
    deploymentFrequency: number; // deployments per week
    leadTime: number; // hours from commit to production
    meanTimeToRecovery: number; // minutes
    changeFailureRate: number; // % of deployments causing issues
  };
  business: {
    developerProductivity: number; // story points per sprint
    timeToMarket: number; // weeks for new features
    customerSatisfaction: number; // NPS score
    operationalCosts: number; // $ monthly
  };
};
```

This comprehensive migration guide ensures a smooth transition to NeuroLink's factory pattern architecture, maximizing the benefits of standardized provider management while minimizing migration risks.

##  Related Documentation

- [System Architecture](/docs/development/architecture) - Overall system design
- [Testing Strategy](/docs/development/testing) - Quality assurance approaches
- [Contributing Guide](/docs/community/contributing) - Development workflow
- [Advanced Patterns](/docs/advanced/factory-patterns) - Factory implementation details

---

## Design Doc: Large Context Handling via Map-Reduce Summarization

<!-- Source: development/large-context-design.md -->

# Design Doc: Large Context Handling via Map-Reduce Summarization

> **Note:** The map-reduce approach described in this design document is a proposed
> architecture that has **not been implemented** in the codebase. None of the artifacts
> it specifies (`_summarizeLargeText`, `largeTextHandling` option, `textUtils.ts`)
> exist in production code. The production implementation uses a different approach —
> see the [Context Compaction System](/docs/features/context-compaction) and
> `src/lib/context/contextCompactor.ts`.

## 1. Overview

This document outlines the design and implementation plan for adding large context handling capabilities to the `NeuroLink` SDK. The core of this proposal is a map-reduce summarization strategy to process text inputs that exceed the context window limits of underlying Large Language Models (LLMs).

## 2. Problem Statement

The `NeuroLink` SDK's `generate()` method currently sends the entire input prompt directly to the AI provider. This design fails when the input text is very large (e.g., a 1MB file), as it surpasses the model's maximum token limit, resulting in an API error and a complete failure of the operation.

The existing conversation summarization feature is designed for managing the history of a dialogue and does not address the challenge of processing a single, oversized document.

### Use Cases

This feature is critical for enabling new, high-value use cases, such as:

- **Document Summarization**: Summarizing large PDF, DOCX, or text files.
- **Data Analysis**: Analyzing long reports, transcripts, or logs to extract key insights.
- **Question Answering over Documents**: Allowing users to ask questions about a large document that is provided as context.

## 3. Challenges and Mitigations

### 3.1. Latency

- **Challenge**: Making multiple sequential calls to an LLM will significantly increase the total response time.
- **Mitigation**:
  1.  **Parallel Processing**: The "Map" step, where individual chunks are summarized, will be executed in parallel using `Promise.all`. This reduces the time for this step to the duration of the single longest-running chunk summarization, rather than the sum of all of them.
  2.  **Model Flexibility**: The system will be designed to allow for the use of faster, more cost-effective models (e.g., `gemini-2.5-flash`) for the intermediate chunk summarization, while a more powerful model can be used for the final, high-quality summary.

### 3.2. Context Loss Between Chunks

- **Challenge**: Splitting the text into independent chunks can cause the loss of context that spans across chunk boundaries.
- **Mitigation**:
  1.  **Chunk Overlap**: The chunking utility will support an `overlap` parameter. A portion of text from the end of one chunk will be included at the beginning of the next, ensuring a smoother contextual transition.
  2.  **Intelligent Splitting**: The utility will prioritize splitting text at natural boundaries like sentences (`.`, `!`, `?`) or paragraphs to keep related ideas together within a single chunk.

### 3.3. Cost

- **Challenge**: Multiple LLM calls will be more expensive than a single call.
- **Mitigation**: This is an inherent trade-off for gaining this new capability. The ability to use smaller, cheaper models for the initial chunking step will help manage costs effectively. The feature will be opt-in, so users only incur costs when they explicitly need to process large documents.

## 4. Proposed Solution & Architecture

We will implement a **Map-Reduce Summarization** workflow.

### High-Level Flow Diagram

```mermaid
graph TD
    A[Start: generate() called with large text] --> B{Text > Threshold?};
    B -->|No| C[Normal Generation Flow];
    B -->|Yes| D[Chunk Text into Pieces];
    D --> E[Map: Summarize Each Chunk in Parallel];
    E --> F[Reduce: Combine Chunk Summaries];
    F --> G[Generate Final Summary from Combined Text];
    G --> H[End: Return Final Summary];
    C --> H;
```

## 5. Detailed Design and Implementation

### 5.1. Sequence Diagram

This diagram shows the interaction between the different components of the system.

```mermaid
sequenceDiagram
    participant User
    participant NeuroLink as NeuroLink.generate()
    participant TextUtils as textUtils.chunkText()
    participant Summarizer as _summarizeLargeText()
    participant LLM

    User->>NeuroLink: generate({ input: largeText, mode: 'summarize' })
    NeuroLink->>Summarizer: _summarizeLargeText(options)
    Summarizer->>TextUtils: chunkText(largeText)
    TextUtils-->>Summarizer: returns [chunk1, chunk2, ...]

    par Summarize Chunk 1
        Summarizer->>LLM: Summarize chunk1
        LLM-->>Summarizer: returns summary1
    and Summarize Chunk 2
        Summarizer->>LLM: Summarize chunk2
        LLM-->>Summarizer: returns summary2
    and Summarize ...
        Summarizer->>LLM: Summarize chunkN
        LLM-->>Summarizer: returns summaryN
    end

    Summarizer->>Summarizer: Combine summaries
    Summarizer->>LLM: Generate final summary from combined text
    LLM-->>Summarizer: returns finalSummary
    Summarizer-->>NeuroLink: returns finalSummary
    NeuroLink-->>User: returns finalSummary
```

### 5.2. New Utility: `textUtils.ts`

A new file will be created at `src/lib/utils/textUtils.ts` to contain the logic for splitting large texts into manageable pieces.

#### Detailed Explanation of `chunkText`

This function is the foundation of our solution. It intelligently divides a large string into an array of smaller strings (`chunks`) based on a target size, while trying to maintain the contextual integrity of the original text.

```typescript
// src/lib/utils/textUtils.ts

// Defines the structure for a single piece of the divided text.
// `content` holds the text itself.
// `index` tracks the original position of the chunk.
export type TextChunk = {
  content: string;
  index: number;
};

// Defines the configuration for the chunking process.
// `chunkSize`: The target maximum size for each chunk in characters.
// `overlap`: How many characters from the end of one chunk to include at the start of the next. This is crucial for maintaining context across chunk boundaries.
export type ChunkingOptions = {
  chunkSize: number;
  overlap: number;
};

export function chunkText(text: string, options: ChunkingOptions): TextChunk[] {
  const { chunkSize, overlap } = options;
  // Early exit for empty or invalid input.
  if (!text || text.length === 0) {
    return [];
  }

  // If the text is already smaller than the desired chunk size, no chunking is needed.
  // It's returned as a single chunk.
  if (text.length  splitPosition) {
        splitPosition = pos;
      }
    }

    // 2. If no sentence ending is found, try to split at a space.
    if (splitPosition === -1) {
      splitPosition = potentialSplitArea.lastIndexOf(" ");
    }

    // 3. If no space is found (e.g., a very long word or URL), split at the character limit.
    if (splitPosition === -1) {
      splitPosition = endIndex - 1;
    }

    // The actual end of the chunk is one character after the split point.
    endIndex = splitPosition + 1;

    // Create the chunk from the start of the remaining text to the calculated end point.
    const chunkContent = remainingText.substring(0, endIndex);
    chunks.push({ content: chunkContent, index: chunks.length });

    // Move the main pointer forward for the next iteration.
    // We subtract the `overlap` to ensure context is carried over to the next chunk.
    currentIndex += Math.max(1, endIndex - overlap);
  }

  return chunks;
}
```

### 5.3. New Workflow: `_summarizeLargeText()`

This new private method orchestrates the entire map-reduce workflow. It will be added to the `NeuroLink` class in `src/lib/neurolink.ts`.

#### Detailed Explanation of `_summarizeLargeText`

This function acts as the controller for the large context handling process. It chunks the text, manages the parallel summarization of each chunk, combines the results, and generates the final summary.

```typescript
// Inside the NeuroLink class in src/lib/neurolink.ts

private async _summarizeLargeText(options: GenerateOptions): Promise {
  // Destructure all necessary properties from the original options.
  const { input, largeTextHandling, provider, model } = options;
  const text = input.text;

  // --- Step 1: Chunk the Text ---
  // The large input text is passed to our utility function to be broken down.
  // We use the configuration provided in `largeTextHandling` or fall back to sensible defaults.
  const chunks = chunkText(text, {
    chunkSize: largeTextHandling?.chunkSize || 4000, // Default to 4000 characters per chunk.
    overlap: largeTextHandling?.overlap || 200,     // Default to 200 characters of overlap.
  });

  // --- Step 2: The "Map" Step ---
  // We process all chunks concurrently for maximum efficiency.
  // `Promise.all` sends all summarization requests to the LLM at the same time.
  const chunkSummaries = await Promise.all(
    // `chunks.map` creates an array of promises, one for each chunk.
    chunks.map(chunk => this.generate({
      // Each chunk is wrapped in a new prompt asking for a concise summary.
      input: { text: `Summarize the following text concisely: ${chunk.content}` },
      // Use a specific, fast model for this intermediate step to reduce latency and cost.
      // This can be configured by the user.
      provider: largeTextHandling?.chunkingProvider || provider,
      model: largeTextHandling?.chunkingModel || 'gemini-2.5-flash',
      // CRITICAL: This recursive call to `this.generate` must have large text handling
      // disabled to prevent an infinite loop.
      largeTextHandling: { mode: 'none' }
    }))
  );

  // --- Step 3: The "Reduce" Step ---
  // All the individual chunk summaries are collected and joined together.
  // A separator is used to clearly distinguish between the different summaries.
  const combinedSummaries = chunkSummaries.map(result => result.content).join('\n\n---\n\n');

  // This combined text of summaries is sent to the LLM for the final processing step.
  const finalSummaryResult = await this.generate({
    input: { text: `The following are summaries of sequential parts of a large document. Create a single, cohesive, and detailed final summary from them:\n\n${combinedSummaries}` },
    // For this final step, we use the powerful provider and model the user originally requested
    // to ensure the highest quality output.
    provider: provider,
    model: model,
    // Again, disable large text handling to prevent loops.
    largeTextHandling: { mode: 'none' }
  });

  // --- Step 4: Return the Final Result ---
  // The result from the final summarization is returned.
  // We enrich the metadata to indicate that large text processing was performed
  // and include how many chunks were created.
  return {
      ...finalSummaryResult,
      metadata: {
          ...finalSummaryResult.metadata,
          largeTextProcessed: true,
          chunks: chunks.length,
      }
  };
}
```

### 5.4. Integration into `generate()`

The main `generate()` method will be modified to delegate to the new workflow when appropriate.

```typescript
// Modified generate() method in src/lib/neurolink.ts

async generate(optionsOrPrompt: GenerateOptions | string): Promise {
  const options: GenerateOptions =
    typeof optionsOrPrompt === 'string'
      ? { input: { text: optionsOrPrompt } }
      : optionsOrPrompt;

  // New Logic: Check for large text handling
  const largeTextConfig = options.largeTextHandling;
  const textLength = options.input.text.length;
  // Use a default threshold, but allow it to be overridden
  const threshold = largeTextConfig?.chunkSize || 4000;

  if (largeTextConfig?.mode === 'summarize' && textLength > threshold) {
    return this._summarizeLargeText(options);
  }

  // ... existing generate() logic continues here for normal processing
}
```

## 6. Configuration and API Changes

The `GenerateOptions` interface in `src/lib/types/generateTypes.ts` will be updated.

```typescript
// src/lib/types/generateTypes.ts

export type GenerateOptions = {
  // ... existing options
  largeTextHandling?: {
    mode: "none" | "summarize";
    chunkSize?: number;
    overlap?: number;
    chunkingProvider?: AIProviderName;
    chunkingModel?: string;
  };
};
```

- **`mode`**: `'none'` (default) or `'summarize'`.
- **`chunkSize`**: Target size for each text chunk (in characters). Defaults to `4000`.
- **`overlap`**: Character overlap between chunks. Defaults to `200`.
- **`chunkingProvider` / `chunkingModel`**: Optional. Allows specifying a faster/cheaper model for the intermediate "Map" step, enhancing performance and cost-effectiveness.

## 7. Testing Strategy

1.  **Unit Tests (`test/textUtils.test.ts`)**:
    - Test `chunkText` with empty, short, and long strings.
    - Verify that `overlap` is handled correctly.
    - Ensure splitting prioritizes sentence boundaries.
2.  **Integration Tests (`test/largeContext.test.ts`)**:
    - Test the main `generate()` method with a string larger than the `chunkSize` threshold.
    - Mock the `_summarizeLargeText` method to confirm it's called when `mode` is `'summarize'`.
    - Mock the internal `generate` calls to verify the map-reduce logic is working as expected (i.e., multiple parallel calls followed by one final call).
    - Confirm that the normal workflow is used when `mode` is `'none'`.
3.  **End-to-End (E2E) Test (`examples/summarize-large-file.js`)**:
    - Create a script that reads a large text file from the disk.
    - Calls `neurolink.generate()` with the file content and `largeTextHandling: { mode: 'summarize' }`.
    - Prints the final summary to the console for manual validation of quality.

## Section 8: Production Implementation

The map-reduce design described in this document has been complemented by a
production context compaction system. See
the [Context Compaction Guide](/docs/features/context-compaction) for the full
specification.

The production implementation adds:

- **ContextCompactor** (`src/lib/context/contextCompactor.ts`) -- a multi-stage
  compaction orchestrator with four sequential stages: tool-output pruning,
  file-read deduplication, LLM summarization (structured 9-section summaries with
  iterative merging), and sliding-window truncation.
- **BudgetChecker** (`src/lib/context/budgetChecker.ts`) -- pre-generation validation
  that checks token usage against per-model context windows (maintained in
  `src/lib/constants/contextWindows.ts`) and triggers auto-compaction at 80 % usage.
- **Error Detection** (`src/lib/context/errorDetection.ts`) -- cross-provider
  detection of context-overflow errors so compaction can be retried transparently.
- **`getContextStats()` API** -- returns live token estimates, remaining capacity,
  and per-stage reduction metrics for runtime observability.

### Distinguishing This Design Doc from the Context Compaction System

These two systems address fundamentally different problems:

| Aspect                    | This Design Doc (Map-Reduce)                                                                                                                        | Context Compaction System                                                                                                                                                             |
| ------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Problem**               | A **single input document** exceeds the model's context window before generation even begins.                                                       | **Conversation history** grows beyond the context window over the course of a multi-turn session.                                                                                     |
| **Trigger**               | User opts in via `largeTextHandling: { mode: 'summarize' }` on a `generate()` call.                                                                 | Automatic — `BudgetChecker` fires before every LLM call when token usage exceeds 80% of the model's context window.                                                                   |
| **Technique**             | Map-reduce chunking: split the document into overlapping pieces, summarize each piece in parallel, then reduce the summaries into one final output. | A 4-stage pipeline applied to the message history: (1) tool-output pruning, (2) file-read deduplication, (3) LLM summarization with iterative merging, (4) sliding-window truncation. |
| **Scope**                 | One-shot — processes the large text and returns a result.                                                                                           | Ongoing — continuously manages history as the conversation evolves.                                                                                                                   |
| **Implementation status** | **Proposed only** — no code exists in the repository.                                                                                               | **Fully implemented** in `src/lib/context/`.                                                                                                                                          |

For full details on the production context compaction system, see
[docs/features/context-compaction.md](/docs/features/context-compaction).

---

## Automated Link Checking

<!-- Source: development/link-checking.md -->

# Automated Link Checking

**Automated validation of documentation links to prevent broken references**

## Quick Start

### Local Link Checking

```bash
# From docs/improve-docs directory
chmod +x scripts/check-links.sh
./scripts/check-links.sh docs
```

Output:

```
 Checking links in docs...

 Finding markdown files...
Found 50 files to check

[1/50] Checking: docs/index.md
✓ No broken links

[2/50] Checking: docs/getting-started/quick-start.md
✓ No broken links

...

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 Link Check Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Total files checked: 50
Files with broken links: 0

✅ All links valid!
```

### Install Dependencies

```bash
# Install markdown-link-check globally
npm install -g markdown-link-check

# Or use via npx (no installation)
npx markdown-link-check docs/index.md
```

---

## Configuration

### Link Checker Config

The script uses `/tmp/mlc_config.json` with default settings. To customize, create `.markdown-link-check.json`:

```json
{
  "ignorePatterns": [
    {
      "pattern": "^http://localhost"
    },
    {
      "pattern": "^https://example.com"
    },
    {
      "pattern": "^mailto:"
    }
  ],
  "timeout": "10s",
  "retryOn429": true,
  "retryCount": 3,
  "aliveStatusCodes": [200, 206, 301, 302, 307, 308, 403, 405],
  "replacementPatterns": [
    {
      "pattern": "^/",
      "replacement": "https://juspay.github.io/neurolink/"
    }
  ]
}
```

### Configuration Options

| Option             | Description                | Default           |
| ------------------ | -------------------------- | ----------------- |
| `timeout`          | HTTP request timeout       | `10s`             |
| `retryOn429`       | Retry on rate limit errors | `true`            |
| `retryCount`       | Number of retries          | `3`               |
| `aliveStatusCodes` | Valid HTTP status codes    | `[200, 206, ...]` |
| `ignorePatterns`   | URLs to skip checking      | `[]`              |

---

## CI/CD Integration

### GitHub Actions Workflow

Create `.github/workflows/link-check.yml`:

```yaml
name: Link Checker

on:
  push:
    branches: [main, release]
    paths:
      - "docs/**/*.md"
  pull_request:
    branches: [main, release]
    paths:
      - "docs/**/*.md"
  schedule:
    # Run weekly to catch external link rot
    - cron: "0 0 * * 0"

jobs:
  link-check:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "20"

      - name: Install markdown-link-check
        run: npm install -g markdown-link-check

      - name: Check links
        run: |
          cd docs/improve-docs
          chmod +x scripts/check-links.sh
          ./scripts/check-links.sh docs

      - name: Comment on PR (if failed)
        if: failure() && github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '❌ **Link check failed!** Please fix broken links before merging.'
            })
```

### Pre-commit Hook

Add to `.husky/pre-commit` or `.git/hooks/pre-commit`:

```bash
#!/bin/bash

# Check links on changed markdown files
CHANGED_MD=$(git diff --cached --name-only --diff-filter=ACMR | grep '\.md$')

if [ -n "$CHANGED_MD" ]; then
    echo " Checking links in modified files..."

    for file in $CHANGED_MD; do
        echo "Checking: $file"
        npx markdown-link-check "$file" || exit 1
    done

    echo "✅ All links valid!"
fi
```

Make executable:

```bash
chmod +x .git/hooks/pre-commit
```

---

## Usage Patterns

### Check Specific File

```bash
markdown-link-check docs/getting-started/quick-start.md
```

### Check All Docs

```bash
find docs -name "*.md" -exec markdown-link-check {} \;
```

### Check with Custom Config

```bash
markdown-link-check docs/index.md -c .markdown-link-check.json
```

### Quiet Mode (Only Show Errors)

```bash
markdown-link-check docs/index.md --quiet
```

### Verbose Mode (Debug)

```bash
markdown-link-check docs/index.md --verbose
```

---

## Common Issues

### Issue 1: False Positives (Valid Links Marked as Broken)

**Cause**: Some sites block automated requests or have aggressive rate limiting.

**Solution**: Add to ignore patterns:

```json
{
  "ignorePatterns": [
    {
      "pattern": "^https://linkedin.com"
    }
  ]
}
```

Or add to alive status codes:

```json
{
  "aliveStatusCodes": [200, 403, 999]
}
```

### Issue 2: Slow Checks

**Cause**: External link checking can be slow.

**Solution 1**: Skip external links for local development:

```json
{
  "ignorePatterns": [
    {
      "pattern": "^https?://"
    }
  ]
}
```

**Solution 2**: Use faster internal-only checker:

```bash
# Check only internal links (faster)
grep -r "\[.*\](\./" docs/ | grep -v "http"
```

### Issue 3: Relative Path Issues

**Cause**: Relative links may not resolve correctly.

**Solution**: Use replacement patterns:

```json
{
  "replacementPatterns": [
    {
      "pattern": "^../",
      "replacement": "https://juspay.github.io/neurolink/"
    }
  ]
}
```

### Issue 4: Anchor Links Not Validated

**Cause**: markdown-link-check may not validate anchor links (`#section`).

**Solution**: Use `remark-validate-links`:

```bash
npm install -g remark-cli remark-validate-links
remark --use remark-validate-links docs/
```

---

## Advanced Usage

### Custom Link Validation Script

For complex validation needs, create custom scripts:

```javascript
// scripts/validate-links.js
const fs = require("fs");
const path = require("path");

const DOCS_DIR = "docs";
const brokenLinks = [];

function validateInternalLink(file, link) {
  const targetPath = path.resolve(path.dirname(file), link);

  if (!fs.existsSync(targetPath)) {
    brokenLinks.push({
      file,
      link,
      type: "internal",
    });
  }
}

function checkFile(filePath) {
  const content = fs.readFileSync(filePath, "utf8");
  const linkRegex = /\[([^\]]+)\]\(([^)]+)\)/g;

  let match;
  while ((match = linkRegex.exec(content)) !== null) {
    const [, text, link] = match;

    // Skip external links
    if (link.startsWith("http")) continue;

    // Check internal links
    if (!link.startsWith("#")) {
      validateInternalLink(filePath, link);
    }
  }
}

// Run validation
function walk(dir) {
  const files = fs.readdirSync(dir);

  files.forEach((file) => {
    const filePath = path.join(dir, file);
    const stat = fs.statSync(filePath);

    if (stat.isDirectory()) {
      walk(filePath);
    } else if (file.endsWith(".md")) {
      checkFile(filePath);
    }
  });
}

walk(DOCS_DIR);

// Report results
if (brokenLinks.length > 0) {
  console.error("❌ Found broken links:");
  brokenLinks.forEach(({ file, link }) => {
    console.error(`  ${file}: ${link}`);
  });
  process.exit(1);
} else {
  console.log("✅ All internal links valid!");
}
```

Run:

```bash
node scripts/validate-links.js
```

### Parallel Link Checking

For faster checking with many files:

```bash
# Install GNU parallel
brew install parallel  # macOS
apt-get install parallel  # Linux

# Check files in parallel
find docs -name "*.md" | parallel -j 4 markdown-link-check {}
```

---

## Best Practices

### 1. Regular Checks

- **On every commit**: Check changed files in pre-commit hook
- **On every PR**: Full link check in CI/CD
- **Weekly**: Scheduled check for external link rot

### 2. Separate Internal and External

```yaml
# Fast check (internal only)
- name: Check internal links
  run: ./scripts/check-links.sh docs --internal-only

# Slow check (weekly for external)
- name: Check external links
  if: github.event.schedule
  run: ./scripts/check-links.sh docs --external-only
```

### 3. Ignore Transient Failures

Some external links may fail intermittently. Retry failed checks:

```bash
# Retry failed checks 3 times
markdown-link-check docs/index.md --retry --retryCount 3
```

### 4. Document Known Issues

For persistent false positives, document in `.markdown-link-check.json`:

```json
{
  "ignorePatterns": [
    {
      "comment": "LinkedIn blocks automated requests",
      "pattern": "^https://linkedin.com"
    }
  ]
}
```

---

## Integration with MkDocs

### Build-time Link Checking

Add to `mkdocs.yml`:

```yaml
hooks:
  - scripts/check-links-hook.py
```

Create `scripts/check-links-hook.py`:

```python

def on_pre_build(config):
    """Run link checker before building docs"""
    print(" Checking links...")

    result = subprocess.run(
        ['./scripts/check-links.sh', 'docs'],
        capture_output=True,
        text=True
    )

    if result.returncode != 0:
        print("❌ Link check failed!")
        print(result.stdout)
        sys.exit(1)

    print("✅ All links valid!")
```

---

## Related Documentation

- **[Versioning](/docs/development/versioning)** - Documentation version management
- **[Contributing](/docs/community/contributing)** - Contribution guidelines
- **[Testing](/docs/development/testing)** - Testing strategies

---

## Additional Resources

- **[markdown-link-check](https://github.com/tcort/markdown-link-check)** - Link checker tool
- **[remark-validate-links](https://github.com/remarkjs/remark-validate-links)** - Alternative validator
- **[GitHub Actions](https://docs.github.com/en/actions)** - CI/CD automation

---

## Package Version Overrides Documentation

<!-- Source: development/package-overrides.md -->

# Package Version Overrides Documentation

This document explains the package version overrides in `package.json` and why they are necessary.

## Current Overrides

### Security Vulnerabilities

The following overrides address known security vulnerabilities:

- **esbuild@=0.25.0**
  - Addresses build process vulnerabilities in older esbuild versions
  - **Security Advisory**: CVE-2024-43788 (potential code injection during build)
  - Should be removed when dependencies update to safer versions

- **cookie@\=0.7.0**
  - Fixes session management security issues in cookie handling
  - **Security Advisory**: GHSA-pxg6-pf52-xh8x (prototype pollution vulnerability)
  - Critical for web application security

- **tmp@=0.2.4**
  - Resolves temporary file handling vulnerabilities
  - **Security Advisory**: CVE-2024-42459 (insecure temporary file creation)
  - Important for secure file operations

### Compatibility Fixes

- **@eslint/plugin-kit@\=0.3.4**
  - Ensures compatibility with ESLint v9
  - Required for proper linting functionality

## Review Process

These overrides should be reviewed quarterly and removed when:

1. Upstream packages release fixes for the vulnerabilities
2. Dependencies are updated to versions that include the fixes
3. Alternative packages are adopted that don't have these issues

## Last Review

- **Date**: 2025-08-10
- **Reviewer**: Claude Code Assistant
- **Next Review Due**: 2025-11-10

## Monitoring

Check for updates using:

```bash
pnpm audit
pnpm outdated
```

Remove overrides when they are no longer needed to allow natural dependency resolution.

---

## ✅ Provider-Agnostic Testing Framework - UPDATED STATUS

<!-- Source: development/provider-testing.md -->

# ✅ Provider-Agnostic Testing Framework - UPDATED STATUS

**Updated**: January 20, 2025
**Status**: ✅ COMPLETE SUCCESS - 9/9 PROVIDERS VERIFIED WORKING
**Objective**: Complete provider testing after resolving critical configuration bug

##  **MISSION ACCOMPLISHED**

### **Problem Solved**

The previous testing framework was hardcoded to Google AI, making it impossible to validate other providers during migration. This has been completely fixed.

### **Solution Implemented**

✅ **Provider-agnostic test runner**
✅ **Configurable environment validation**
✅ **Dynamic provider switching**
✅ **Hugging Face implementation complete**
✅ **Ready for comprehensive testing phase**

##  **VALIDATION RESULTS**

### **Google AI Provider Testing**

```bash
 PROVIDER-AGNOSTIC PARALLEL TEST EXECUTION
✅ Provider: Google AI Studio (google-ai)
✅ Environment: GOOGLE_AI_API_KEY configured
 Target Provider: Google AI Studio (google-ai)
️  Model: gemini-2.5-pro

 Test Results:
✓ should run generate command successfully with google-ai (4067ms)
✓ should run stream command successfully with google-ai (3042ms)
✓ should show version (605ms)
✓ should show help (615ms)
✓ should show help for config commands (646ms)

 Test Files  1 passed (1)
      Tests  5 passed (5)
   Duration  9.08s
```

### **OpenAI Provider Testing**

```bash
 PROVIDER-AGNOSTIC PARALLEL TEST EXECUTION
✅ Provider: OpenAI (openai)
✅ Environment: OPENAI_API_KEY configured
 Target Provider: OpenAI (openai)
️  Model: gpt-4o

 Test Results:
✓ should run generate command successfully with openai (2562ms)
✓ should run stream command successfully with openai (1576ms)
✓ should show version (649ms)
✓ should show help (627ms)
✓ should show help for config commands (639ms)

 Test Files  1 passed (1)
      Tests  5 passed (5)
   Duration  6.15s
```

### **Key Observations**

- ✅ **Both providers pass all tests**
- ✅ **OpenAI is slightly faster** (6.15s vs 9.08s)
- ✅ **Same test suite validates both providers**
- ✅ **No code changes needed between providers**

---

##  **STRATEGIC BENEFITS**

### **1. Migration Confidence**

- **Baseline Established**: Google AI provider validated and working
- **Target Confirmed**: OpenAI provider already operational
- **Test Coverage**: Universal test suite applies to all providers
- **Regression Prevention**: Any breaking changes immediately detected

### **2. Development Velocity**

- **Parallel Testing**: Can test multiple providers simultaneously
- **Quick Validation**: Individual provider testing in \

# After migration
node run-parallel-tests.js --provider

# Compare results to ensure no regression
```

---

##  **SUCCESS CRITERIA MET**

### **Original Requirements**

- ✅ **Fix testing script to be provider agnostic**
- ✅ **Test with OpenAI first (already implemented)**
- ✅ **Validate provider-agnostic functionality working**

### **Additional Achievements**

- ✅ **Support for 4 providers** (Google AI, OpenAI, Anthropic, Bedrock)
- ✅ **Automatic environment validation**
- ✅ **Clear error messaging**
- ✅ **Performance benchmarking**
- ✅ **JSON report generation**

---

##  **CONCLUSION**

**The provider-agnostic testing framework is now complete and operational.**

- **Problem Solved**: No longer bound to Google AI
- **Quality Assured**: Both existing providers validated
- **Foundation Ready**: Perfect infrastructure for Phase 3 migration
- **Development Ready**: Can proceed with confidence

**We can now begin Phase 3 migration knowing that every step can be validated immediately with comprehensive, provider-agnostic testing.**

---

## COMPREHENSIVE TESTING & VERIFICATION PLAN

<!-- Source: development/testing-plan.md -->

#  COMPREHENSIVE TESTING & VERIFICATION PLAN

- [**Test Results Documentation:**](#test-results-documentation)
- [**Updated Documentation:**](#updated-documentation)

**Lighthouse Integration Testing Strategy**
**Date**: 2025-07-06 02:55 AM
**Estimated Duration**: 3 hours total

##  **PHASE A: IMMEDIATE VERIFICATION** (30 minutes)

**Priority**: CRITICAL | **Blocking**: Must pass before proceeding

### **A.1 File System Verification** (10 minutes)

```bash
# Verify file structure
find src/lib -name "*.ts" | grep -E "(websocket|streaming|telemetry|chat)" | head -20
find src/lib -name "*voice*" | wc -l  # Should be 0
ls -la src/lib/services/  # Should show streaming/, no voice/
```

**Success Criteria:**

- ✅ WebSocket infrastructure files exist
- ✅ Streaming services files exist
- ✅ Telemetry files exist
- ✅ NO voice-related files remain
- ✅ Enhanced chat files exist

### **A.2 Build Validation** (15 minutes)

```bash
# Clean build test
rm -rf dist/ .svelte-kit/
pnpm run build
pnpm run build:cli
```

**Success Criteria:**

- ✅ TypeScript compilation: 0 errors
- ✅ Vite build: successful
- ✅ CLI build: successful
- ✅ publint: "All good!"
- ✅ Package integrity: pnpm pack succeeds

### **A.3 Dependency Verification** (5 minutes)

```bash
# Check voice dependencies removed
npm list | grep -E "(vapi|pipecat|google-cloud/text-to-speech)"
# Should return nothing

# Check telemetry dependencies added
npm list | grep -E "(@opentelemetry)"
# Should show 15+ OpenTelemetry packages
```

**Success Criteria:**

- ✅ Voice AI dependencies: 0 found
- ✅ OpenTelemetry dependencies: 15+ installed
- ✅ No dependency conflicts
- ✅ Package.json reflects changes

---

##  **PHASE B: CORE TESTING** (1 hour)

**Priority**: HIGH | **Focus**: New feature functionality

### **B.1 WebSocket Infrastructure Testing** (20 minutes)

```typescript
// Test: WebSocket Server Creation

const wsServer = new NeuroLinkWebSocketServer({
  port: 8080,
  maxConnections: 100,
});

// Test: Connection Management
// Test: Room Management
// Test: Streaming Channel Creation
```

**Tests to Create:**

- `test/websocket-server.test.ts`
- `test/streaming-manager.test.ts`
- `test/websocket-chat-handler.test.ts`

**Success Criteria:**

- ✅ WebSocket server starts on specified port
- ✅ Connection management works
- ✅ Room creation/joining functional
- ✅ Streaming channels operational
- ✅ Error handling graceful

### **B.2 Telemetry Integration Testing** (20 minutes)

```typescript
// Test: Telemetry Service (Disabled by Default)

const telemetry = TelemetryService.getInstance();

// Should be disabled by default
expect(telemetry.isEnabled()).toBe(false);

// Test enabling via environment
process.env.NEUROLINK_TELEMETRY_ENABLED = "true";
// Re-test initialization
```

**Tests to Create:**

- `test/telemetryService.test.ts`
- `test/ai-instrumentation.test.ts`
- `test/mcp-instrumentation.test.ts`

**Success Criteria:**

- ✅ Telemetry disabled by default
- ✅ Telemetry enables when configured
- ✅ AI operation tracking works
- ✅ MCP tool instrumentation functional
- ✅ Zero overhead when disabled

### **B.3 Enhanced Chat Testing** (20 minutes)

```typescript
// Test: Enhanced Chat Service Creation

const provider = await AIProviderFactory.createProvider("google-ai");
const chatService = createEnhancedChatService({
  provider,
  enableSSE: true,
  enableWebSocket: true,
});
```

**Tests to Create:**

- `test/enhanced-chat.test.ts`
- `test/chat-integration.test.ts`

**Success Criteria:**

- ✅ Enhanced chat service creates successfully
- ✅ SSE mode works
- ✅ WebSocket mode works
- ✅ Dual mode integration functional
- ✅ Backward compatibility with existing chat

---

##  **PHASE C: COMPREHENSIVE VALIDATION** (1 hour)

**Priority**: HIGH | **Focus**: Integration and performance

### **C.1 Existing Functionality Regression Testing** (20 minutes)

```bash
# Run existing test suite
pnpm run test:run

# Test CLI functionality unchanged
node dist/cli/index.js generate "Hello world" --provider google-ai
node dist/cli/index.js provider status

# Test SDK functionality unchanged
node -e "import('@juspay/neurolink').then(sdk => sdk.createBestAIProvider().then(p => p.generate({input: {text: 'test'}})))"
```

**Success Criteria:**

- ✅ All existing tests pass
- ✅ CLI commands work unchanged
- ✅ SDK methods work unchanged
- ✅ AI providers function correctly
- ✅ MCP tools continue working

### **C.2 Performance Impact Testing** (20 minutes)

```typescript
// Test: Performance with features disabled (default)
const startTime = Date.now();
const provider = await AIProviderFactory.createProvider("google-ai");
const result = await provider.generate({ input: { text: "test" } });
const disabledTime = Date.now() - startTime;

// Test: Performance with features enabled
process.env.NEUROLINK_TELEMETRY_ENABLED = "true";
// Repeat test
const enabledTime = Date.now() - startTime;

// Overhead should be <5%
expect((enabledTime - disabledTime) / disabledTime).toBeLessThan(0.05);
```

**Success Criteria:**

- ✅ Default performance unchanged
- ✅ Performance overhead \<5% when features enabled
- ✅ Memory usage remains stable
- ✅ No performance regressions

### **C.3 Real-World Scenario Testing** (20 minutes)

```typescript
// Scenario 1: WebSocket Chat Application
const chatApp = createEnhancedChatService({
  provider: await createBestAIProvider(),
  enableWebSocket: true,
  enableSSE: true,
});

// Scenario 2: Telemetry-Enabled Production
process.env.NEUROLINK_TELEMETRY_ENABLED = "true";
process.env.OTEL_EXPORTER_OTLP_ENDPOINT = "http://localhost:4318";
// Test telemetry data collection

// Scenario 3: Multi-Provider with Streaming
// Test fallback with streaming enabled
```

**Success Criteria:**

- ✅ WebSocket chat works end-to-end
- ✅ Telemetry collects accurate data
- ✅ Multi-provider scenarios work
- ✅ Streaming integrations functional

---

## ✅ **PHASE D: FINAL VALIDATION** (30 minutes)

**Priority**: CRITICAL | **Focus**: Production readiness

### **D.1 API Surface Validation** (10 minutes)

```typescript
// Test all new exports work

  createEnhancedChatService,
  initializeTelemetry,
  getTelemetryStatus,
  NeuroLinkWebSocketServer,
  StreamingManager,
} from "@juspay/neurolink";

// Test TypeScript types
const wsServer: NeuroLinkWebSocketServer = new NeuroLinkWebSocketServer({});
const telemetryStatus: { enabled: boolean } = getTelemetryStatus();
```

**Success Criteria:**

- ✅ All new exports importable
- ✅ TypeScript types correct
- ✅ No missing dependencies
- ✅ API surface consistent

### **D.2 Documentation Synchronization** (10 minutes)

```bash
# Check documentation reflects implementation
grep -r "WebSocket" docs/ | wc -l  # Should find references
grep -r "voice" docs/ | wc -l      # Should be minimal/removed
grep -r "telemetry" docs/ | wc -l  # Should find references
```

**Success Criteria:**

- ✅ Documentation reflects actual implementation
- ✅ Voice references removed/minimal
- ✅ New features documented
- ✅ Examples are accurate

### **D.3 Production Deployment Readiness** (10 minutes)

```bash
# Test package publishing readiness
pnpm pack
tar -tzf juspay-neurolink-*.tgz | head -20

# Test installation simulation
mkdir /tmp/test-install
cd /tmp/test-install
npm init -y
npm install /Users/sachinsharma/Developer/temp/neurolink/juspay-neurolink-*.tgz
node -e "console.log(require('@juspay/neurolink'))"
```

**Success Criteria:**

- ✅ Package builds correctly
- ✅ Installation works
- ✅ Imports work after installation
- ✅ No missing files
- ✅ Ready for npm publish

---

##  **SUCCESS CRITERIA SUMMARY**

### **Critical (Must Pass):**

- ✅ **Build Success**: 0 TypeScript errors, successful compilation
- ✅ **Backward Compatibility**: All existing functionality works unchanged
- ✅ **Performance**: \<5% overhead when new features disabled
- ✅ **Voice AI Removal**: No voice dependencies or code remaining

### **Important (Should Pass):**

- ✅ **WebSocket Infrastructure**: Real-time services operational
- ✅ **Telemetry Integration**: Optional monitoring works when enabled
- ✅ **Enhanced Chat**: Dual-mode chat capabilities functional
- ✅ **API Consistency**: New exports and types work correctly

### **Nice to Have (Can Be Fixed):**

- ✅ **Documentation Completeness**: All features documented
- ✅ **Example Applications**: Working demos available
- ✅ **Performance Optimization**: Further optimization opportunities

---

##  **EXECUTION ORDER**

### **Sequential Execution Required:**

1. **Phase A** → Must pass completely before proceeding
2. **Phase B** → Core functionality validation
3. **Phase C** → Integration and performance validation
4. **Phase D** → Final production readiness

### **Parallel Execution Possible:**

- Within each phase, tests can run in parallel
- Documentation verification can happen alongside testing
- Performance testing can run concurrently with functionality testing

### **Failure Handling:**

- **Phase A Failure**: STOP - Fix build/dependency issues first
- **Phase B Failure**: Address core functionality before integration
- **Phase C Failure**: Performance/integration issues - may proceed with fixes
- **Phase D Failure**: Polish issues - fix before production deployment

---

## ️ **TESTING INFRASTRUCTURE SETUP**

### **Test Environment Preparation:**

```bash
# Clean environment
rm -rf node_modules/ dist/ .svelte-kit/
pnpm install

# Environment variables for testing
export NEUROLINK_TELEMETRY_ENABLED=false  # Default
export GOOGLE_AI_API_KEY=test_key
export OPENAI_API_KEY=test_key
```

### **Required Tools:**

- ✅ **Node.js**: v18+ for compatibility
- ✅ **pnpm**: Package management
- ✅ **TypeScript**: Compilation validation
- ✅ **Vitest**: Test execution
- ✅ **WebSocket Client**: Real connection testing

### **Test Data Requirements:**

- Mock AI provider responses
- Test WebSocket messages
- Sample telemetry data
- Chat conversation samples

---

##  **DELIVERABLES**

### **Test Results Documentation:**

1. **Phase Results Summary** - Pass/fail status for each phase
2. **Performance Benchmarks** - Before/after performance metrics
3. **Integration Test Results** - Real-world scenario outcomes
4. **Bug Report** - Any issues discovered during testing
5. **Production Readiness Certificate** - Final validation sign-off

### **Updated Documentation:**

1. **API Reference** - Reflecting actual implementation
2. **Examples & Tutorials** - Working code samples
3. **Troubleshooting Guide** - Common issues and solutions
4. **Performance Guide** - Optimization recommendations

---

**Ready for Execution**: This plan provides comprehensive validation of all Lighthouse integration work while ensuring zero breaking changes and optimal performance.

**Estimated Total Time**: 3 hours for complete validation
**Critical Path**: Phase A must pass before proceeding to subsequent phases
**Success Rate Target**: 100% pass rate for Critical criteria, 90%+ for Important criteria

---

## NeuroLink Testing Guide - ALL 9 PROVIDERS WORKING

<!-- Source: development/testing.md -->

#  NeuroLink Testing Guide - ALL 9 PROVIDERS WORKING

##  Provider Testing Status: 100% SUCCESS

**All 9 providers confirmed working!** OpenAI, Google AI, Vertex, Anthropic, Bedrock, Hugging Face, Azure, Mistral, Ollama

### Quick Provider Validation

```bash
# Test any of the 9 working providers
pnpm cli generate "test" --provider openai
pnpm cli generate "test" --provider google-ai
pnpm cli generate "test" --provider anthropic
pnpm cli generate "test" --provider bedrock
pnpm cli generate "test" --provider huggingface
pnpm cli generate "test" --provider azure
pnpm cli generate "test" --provider mistral
pnpm cli generate "test" --provider ollama
pnpm cli generate "test" --provider vertex

# Test with enhancements (any provider works)
pnpm cli generate "test" --provider google-ai --enable-analytics --enable-evaluation --debug
```

### Comprehensive Testing

```bash
# Run full validation suite
./validate-fixes.sh

# Run comprehensive CLI tests
node CLI_COMPREHENSIVE_TESTS.js

# Run before/after comparison
node BEFORE_AFTER_COMPARISON.js
```

### Expected Results

#### CLI Enhancement Output:

```
 Analytics:
{
  "provider": "google-ai",
  "model": "gemini-2.5-pro",
  "tokens": {"input": 358, "output": 48, "total": 406},
  "responseTime": 1670,
  "context": {"test": "validation"}
}

⭐ Response Evaluation:
{
  "relevance": 7,
  "accuracy": 7,
  "completeness": 7,
  "overall": 7
}
```

#### SDK Enhancement Output:

```javascript
// Result object contains:
{
  content: "AI response...",
  analytics: {
    provider: "google-ai",
    tokens: {input: 358, output: 48, total: 406},
    responseTime: 1670
  },
  evaluation: {
    overall: 7,
    relevance: 7,
    accuracy: 7,
    completeness: 7
  }
}
```

## Provider Testing

### Google AI Provider Validation

```bash
# Test working model
export GOOGLE_AI_MODEL=gemini-2.5-pro
node ./dist/cli/index.js generate "Hello" --provider google-ai --debug

# Expected: Real AI response with token counts
# Expected: No empty responses or fallbacks
```

### OpenAI Provider Validation

```bash
# Test OpenAI fallback
node ./dist/cli/index.js generate "Hello" --provider openai --enable-analytics --debug

# Expected: OpenAI response with analytics data
# Expected: Accurate token counting (no NaN values)
```

### Multi-Provider Testing

```bash
# Test provider auto-selection
node ./dist/cli/index.js generate "Hello" --enable-analytics --debug

# Expected: Best available provider selected automatically
# Expected: Graceful fallback if primary provider fails
```

## Backward Compatibility Testing

### Ensure No Breaking Changes

```bash
# Test existing CLI commands (no enhancement flags)
node ./dist/cli/index.js generate "Simple test"
node ./dist/cli/index.js generate "Simple test"
node ./dist/cli/index.js gen "Simple test"

# Expected: Normal AI responses
# Expected: No enhancement data displayed
# Expected: All existing functionality works
```

### Test Existing SDK Integration

```javascript
// Test basic SDK usage (no enhancements)
const { createBestAIProvider } = require("@juspay/neurolink");
const provider = createBestAIProvider();
const result = await provider.generate({ input: { text: "Hello" } });

// Expected: result.content contains AI response
// Expected: No analytics or evaluation fields
// Expected: Existing usage patterns continue working
```

## Error Handling Testing

### Invalid Model Names

```bash
# Test deprecated model handling
export GOOGLE_AI_MODEL=gemini-2.5-pro-preview-05-06
node ./dist/cli/index.js generate "test" --provider google-ai --debug

# Expected: Graceful fallback to working provider
# Expected: Clear error message or automatic correction
```

### Missing API Keys

```bash
# Test without API keys
unset GOOGLE_AI_API_KEY
unset OPENAI_API_KEY
node ./dist/cli/index.js generate "test" --debug

# Expected: Clear error message about missing configuration
# Expected: Helpful setup instructions
```

### Network Issues

```bash
# Test with invalid API endpoint (simulated)
node ./dist/cli/index.js generate "test" --timeout 5s --debug

# Expected: Timeout handled gracefully
# Expected: Fallback to other providers if available
```

## Performance Testing

### Response Time Validation

```bash
# Test response times with analytics
node ./dist/cli/index.js generate "Short prompt" --enable-analytics --debug

# Expected: responseTime field shows reasonable values (< 10s)
# Expected: Analytics data doesn't significantly slow requests
```

### Token Counting Accuracy

```bash
# Test accurate token counting
node ./dist/cli/index.js generate "This is a test prompt for token counting" --enable-analytics --debug

# Expected: input + output = total tokens
# Expected: No NaN values in any token fields
# Expected: Token counts match actual usage
```

## Enhancement Feature Validation

### Analytics Data Completeness

```bash
# Test analytics data structure
node ./dist/cli/index.js generate "Business email" --enable-analytics --context '{"project":"test"}' --debug

# Expected analytics fields:
# - provider: string
# - model: string
# - tokens: {input, output, total}
# - responseTime: number
# - context: object (if provided)
# - timestamp: ISO string
```

### Evaluation Data Validation

```bash
# Test evaluation scoring
node ./dist/cli/index.js generate "Explain quantum physics" --enable-evaluation --debug

# Expected evaluation fields:
# - relevance: number (1-10)
# - accuracy: number (1-10)
# - completeness: number (1-10)
# - overall: number (1-10)
# - evaluationModel: string
# - evaluationTime: number
```

### Context Flow Testing

```bash
# Test context preservation
node ./dist/cli/index.js generate "Help with task" --context '{"userId":"123","department":"sales"}' --enable-analytics --debug

# Expected: Context object preserved in analytics.context
# Expected: Context available throughout request chain
```

## Troubleshooting Guide

### Common Issues

1. **Empty Responses from Google AI**
   - Check model name in .env file
   - Use `gemini-2.5-pro` instead of deprecated models
   - Verify API key is valid

2. **NaN Token Counts**
   - Usually indicates provider API failure
   - Check model configuration and API keys
   - Test with `--debug` flag for detailed logs

3. **Enhancement Data Missing**
   - Ensure using `--debug` flag to see enhancement output
   - Verify enhancement flags are correctly specified
   - Check that provider is working (not falling back)

4. **CLI Commands Not Found**
   - Run `npm run build:cli` to rebuild CLI
   - Check that dist/cli/index.js exists
   - Verify Node.js version compatibility

### Debug Commands

```bash
# Comprehensive debug information
node ./dist/cli/index.js generate "debug test" --provider google-ai --enable-analytics --enable-evaluation --context '{"debug":true}' --debug

# Check provider status
node ./dist/cli/index.js status

# Test specific provider
node ./dist/cli/index.js generate "provider test" --provider openai --debug
```

## Test Automation

### Validation Script Usage

```bash
# Run complete validation suite
./validate-fixes.sh

# Run specific test categories
./validate-fixes.sh --cli-only
./validate-fixes.sh --sdk-only
./validate-fixes.sh --providers-only
```

### CI/CD Integration

```bash
# Add to CI pipeline
npm run test
npm run build:cli
./validate-fixes.sh --ci-mode
```

This testing guide ensures all enhancement features work correctly while maintaining backward compatibility and providing clear troubleshooting guidance.

---

## Documentation Versioning

<!-- Source: development/versioning.md -->

# Documentation Versioning

**Managing documentation versions across releases using mike**

## Setup

### 1. Install Dependencies

```bash
# Install mike (already in requirements.txt)
pip install -r requirements.txt
```

### 2. Verify Configuration

The `mkdocs.yml` already includes mike configuration:

```yaml
extra:
  version:
    provider: mike
    default: latest
```

---

## Local Usage

### Create First Version

```bash
# Deploy current docs as version 1.0
mike deploy 1.0 latest --update-aliases

# Set 1.0 as the default version
mike set-default latest
```

### Deploy New Version

```bash
# Deploy new version 1.1
mike deploy 1.1 latest --update-aliases

# Deploy specific version without making it latest
mike deploy 1.0.5
```

### List All Versions

```bash
mike list
```

Output:

```
1.0 [latest]
1.1
1.2
```

### Serve Versioned Docs Locally

```bash
mike serve
```

Visit `http://localhost:8000` to test version switching.

### Delete a Version

```bash
mike delete 1.0
```

---

## Version Management Workflow

### For Minor Releases (1.0 → 1.1)

```bash
# 1. Update docs for new features
# 2. Deploy new version
mike deploy 1.1 latest --update-aliases --push

# 3. Verify
mike list
```

### For Major Releases (1.x → 2.0)

```bash
# 1. Create new version
mike deploy 2.0 latest --update-aliases --push

# 2. Keep 1.x docs accessible
mike list
# Output:
# 1.9
# 2.0 [latest]
```

### For Patch Releases (1.0.0 → 1.0.1)

```bash
# Update existing version (same alias)
mike deploy 1.0 latest --update-aliases --push
```

---

## CI/CD Integration

### GitHub Actions Workflow

Create `.github/workflows/docs.yml`:

```yaml
name: Documentation

on:
  push:
    branches:
      - release
    tags:
      - "v*"

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Fetch all history for mike

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.x"

      - name: Install dependencies
        run: |
          pip install -r docs/improve-docs/requirements.txt

      - name: Configure Git
        run: |
          git config user.name github-actions
          git config user.email github-actions@github.com

      - name: Deploy documentation
        run: |
          VERSION=${GITHUB_REF#refs/tags/v}
          cd docs/improve-docs
          mike deploy $VERSION latest --update-aliases --push
```

### Automatic Version Detection

```yaml
- name: Deploy documentation
  run: |
    # Get version from package.json
    VERSION=$(node -p "require('./package.json').version")
    cd docs/improve-docs

    # Deploy with version
    if [[ $VERSION == *"-"* ]]; then
      # Pre-release (1.0.0-beta.1)
      mike deploy $VERSION --push
    else
      # Stable release
      mike deploy $VERSION latest --update-aliases --push
    fi
```

---

## Best Practices

### 1. Version Naming

- **Stable releases**: `1.0`, `1.1`, `2.0` (match npm version)
- **Pre-releases**: `1.0-beta`, `2.0-rc1`
- **Development**: `dev` (always latest from main branch)

### 2. Alias Strategy

```bash
# Latest stable release
mike deploy 1.5 latest stable --update-aliases

# Development version
mike deploy dev --update-aliases

# Long-term support
mike deploy 1.0 lts --update-aliases
```

### 3. Version Cleanup

```bash
# Remove old versions (keep last 3 major versions)
mike delete 0.9
mike delete 1.0
```

### 4. Documentation Updates

For bug fixes to old versions:

```bash
# Checkout old version
git checkout v1.0.0

# Make documentation fixes
# ...

# Redeploy specific version
mike deploy 1.0 --push
```

---

## Advanced Configuration

### Custom Version Selector

Add to `mkdocs.yml`:

```yaml
extra:
  version:
    provider: mike
    default: latest
    alias: true
```

### Version Warnings

Add version-specific warnings in `docs/index.md`:

```markdown
:::warning[Deprecated Version]
You're viewing documentation for version 1.0, which is no longer supported.
Please upgrade to the latest version.
```
:::

---

## Troubleshooting

### Issue: "gh-pages branch not found"

```bash
# Create gh-pages branch
git checkout --orphan gh-pages
git rm -rf .
git commit --allow-empty -m "Initialize gh-pages"
git push origin gh-pages
git checkout main
```

### Issue: Version selector not appearing

Verify mike is installed:

```bash
mike --version
```

Check `mkdocs.yml` configuration:

```yaml
extra:
  version:
    provider: mike # Must be set
```

### Issue: Wrong default version

```bash
# Set correct default
mike set-default latest
mike serve  # Verify locally
```

---

## Version History

| Version | Release Date     | Status         | Notes                 |
| ------- | ---------------- | -------------- | --------------------- |
| 7.47.x  | Current          | ✅ Active      | Latest features       |
| 7.46.x  | 2024-12          | ✅ Active      | Previous stable       |
| 7.45.x  | 2024-11          | ⚠️ Old         | Security updates only |
| < 7.45  | 2024 and earlier | ❌ Unsupported | Upgrade recommended   |

---

## Related Documentation

- **[Contributing](/docs/community/contributing)** - How to contribute documentation
- **[Development Setup](/docs/)** - Local development environment
- **[Architecture](/docs/development/architecture)** - Documentation structure

---

## Additional Resources

- **[mike Documentation](https://github.com/jimporter/mike)** - Official mike guide
- **[MkDocs Material Versioning](https://squidfunk.github.io/mkdocs-material/setup/setting-up-versioning/)** - Material theme versioning
- **[GitHub Pages](https://docs.github.com/en/pages)** - Hosting documentation

---

# Guides

## NeuroLink Guides

<!-- Source: guides/index.md -->

# Guides

Comprehensive guides for building production-ready AI applications with NeuroLink.

-------------------------------------------------- | ---------------------------------------------------------------------- |
| **[Provider Selection Guide](/docs/reference/provider-selection)** | Interactive wizard to choose the best provider for your use case       |
| **[GitHub Action Guide](/docs/guides/github-action)**           | Run AI-powered workflows in GitHub Actions with 13 providers           |
| **[Troubleshooting](/docs/reference/troubleshooting)**             | Common issues, debugging tips, and solutions for NeuroLink CLI and SDK |

---

## ️ Redis & Persistence

Guides for setting up and managing Redis-backed conversation memory.

| Guide                                             | Description                                                              |
| ------------------------------------------------- | ------------------------------------------------------------------------ |
| **[Redis Configuration](/docs/guides/redis-configuration)** | Production-ready Redis setup with cluster, security, and cloud providers |
| **[Redis Migration](/docs/guides/redis-migration)**         | Migration patterns for upgrading Redis and moving between environments   |

See also: [Redis Quick Start](/docs/getting-started/redis-quickstart) in Getting Started

---

## Migration Guides

Migrate from other AI frameworks to NeuroLink.

| Guide                                                     | Description                                                                     |
| --------------------------------------------------------- | ------------------------------------------------------------------------------- |
| **[From LangChain](/docs/guides/migration/from-langchain)**         | Complete migration guide from LangChain with concept mapping and examples       |
| **[From Vercel AI SDK](/docs/guides/migration/from-vercel-ai-sdk)** | Migrate from Vercel AI SDK with Next.js-focused patterns and streaming examples |
| **[Migration Guide (Legacy)](/docs/guides/migration)**        | General migration guide for older versions                                      |

---

##  Enterprise Guides

Production-ready patterns for enterprise AI deployments.

| Guide                                                                | Description                                                 |
| -------------------------------------------------------------------- | ----------------------------------------------------------- |
| **[Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover)** | High availability with automatic failover between providers |
| **[Load Balancing](/docs/guides/enterprise/load-balancing)**                   | Distribute traffic across providers with 6 strategies       |
| **[Cost Optimization](/docs/cookbook/cost-optimization)**             | Reduce AI costs by 80-95% with smart routing                |
| **[Compliance & Security](/docs/guides/enterprise/compliance)**                | GDPR, SOC2, HIPAA compliance patterns                       |
| **[Multi-Region Deployment](/docs/guides/enterprise/multi-region)**            | Global deployment with geographic routing                   |
| **[Monitoring & Observability](/docs/guides/enterprise/monitoring)**           | Prometheus, Grafana, CloudWatch integration                 |
| **[Audit Trails](/docs/guides/enterprise/audit-trails)**                       | Comprehensive logging for compliance                        |

---

##  MCP Integration

Model Context Protocol server catalog and integration patterns.

| Guide                                       | Description                                                 |
| ------------------------------------------- | ----------------------------------------------------------- |
| **[Server Catalog](/docs/guides/mcp/server-catalog)** | 58+ MCP servers for file systems, databases, APIs, and more |

See also: [MCP Tools Showcase](/docs/features/mcp-tools-showcase) for detailed tool documentation

---

## ️ Server Adapters 🆕

Deploy NeuroLink as production-ready HTTP APIs.

| Guide                                                      | Description                                                         |
| ---------------------------------------------------------- | ------------------------------------------------------------------- |
| **[Server Adapters Overview](/docs/guides/server-adapters)** | Quick start guide for exposing AI agents as HTTP APIs               |
| **[Hono Adapter](/docs/guides/server-adapters/hono)**              | Recommended lightweight adapter for serverless and edge deployments |
| **[Express Adapter](/docs/guides/server-adapters/express)**        | Integration with existing Express applications                      |
| **[Fastify Adapter](/docs/guides/server-adapters/fastify)**        | High-performance adapter with built-in schema validation            |
| **[Koa Adapter](/docs/guides/server-adapters/koa)**                | Modern, minimalist adapter with clean middleware composition        |
| **[Security Guide](/docs/guides/server-adapters/security)**        | Authentication, authorization, and security best practices          |
| **[Deployment Guide](/docs/guides/server-adapters/deployment)**    | Production deployment patterns with Docker and Kubernetes           |

---

##  Framework Integration

Framework-specific integration guides.

| Framework                                | Description                                              |
| ---------------------------------------- | -------------------------------------------------------- |
| **[Next.js](/docs/sdk/framework-integration)**      | App Router, Server Components, Server Actions, Streaming |
| **[Express.js](/docs/sdk/framework-integration)**  | RESTful APIs, middleware, authentication, rate limiting  |
| **[SvelteKit](/docs/sdk/framework-integration)** | SSR, load functions, form actions, streaming             |

---

##  Examples

Real-world use cases and production code patterns.

| Guide                                          | Description                                        |
| ---------------------------------------------- | -------------------------------------------------- |
| **[Use Cases](/docs/examples/use-cases)**         | 12+ production-ready use cases with complete code  |
| **[Code Patterns](/docs/guides/examples/code-patterns)** | Best practices, design patterns, and anti-patterns |

---

## Next Steps

- **New to NeuroLink?** Start with [Quick Start](/docs/getting-started/quick-start)
- **Need to choose a provider?** Use the [Provider Selection Guide](/docs/reference/provider-selection)
- **Building a chat app?** Try our [Chat Application Tutorial](/docs/tutorials/chat-app)
- **Need knowledge base Q&A?** Build a [RAG System](/docs/tutorials/rag)
- **Want practical code examples?** Check the [Cookbook](/docs/)
- **Migrating from another framework?** See our [Migration Guides](#migration-guides)

---

## Server Adapters

<!-- Source: guides/server-adapters/index.md -->

# Server Adapters

Server adapters allow you to expose your NeuroLink AI agents as HTTP APIs using popular web frameworks. With minimal configuration, you get a production-ready API server with built-in health checks, streaming support, rate limiting, and more.

## CLI Commands

NeuroLink provides CLI commands for managing server adapters without writing code.

### Starting a Server

```bash
# Foreground mode (development)
npx @juspay/neurolink serve --port 3000 --framework hono

# Background mode (production)
npx @juspay/neurolink server start --port 3000
npx @juspay/neurolink server status
npx @juspay/neurolink server stop
```

### Viewing Routes

Inspect registered API endpoints:

```bash
# List all routes
npx @juspay/neurolink server routes

# Filter by group or method
npx @juspay/neurolink server routes --group agent
npx @juspay/neurolink server routes --method POST --format json
```

### Managing Configuration

```bash
# View configuration
npx @juspay/neurolink server config

# Modify settings
npx @juspay/neurolink server config --set defaultPort=8080
npx @juspay/neurolink server config --get cors.enabled
```

### Generating OpenAPI Spec

```bash
npx @juspay/neurolink server openapi -o openapi.json
```

For complete CLI reference, see the [CLI Commands Reference](/docs/cli/commands.md#server-subcommand).

---

## Supported Frameworks

| Framework                       | Status      | Description                                                                                                 |
| ------------------------------- | ----------- | ----------------------------------------------------------------------------------------------------------- |
| **[Hono](/docs/guides/server-adapters/hono)**           | Recommended | Lightweight, multi-runtime framework with excellent performance. Ideal for serverless and edge deployments. |
| **[Express](/docs/sdk/framework-integration)**     | Supported   | The most popular Node.js web framework. Great ecosystem and middleware compatibility.                       |
| **[Fastify](/docs/sdk/framework-integration)**     | Supported   | High-performance framework with built-in schema validation. Excellent for TypeScript projects.              |
| **[Koa](/docs/guides/server-adapters/koa)**             | Supported   | Modern, minimalist framework from the Express team. Clean middleware composition.                           |
| **[WebSocket](/docs/guides/server-adapters/websocket)** | Supported   | Real-time bidirectional communication with built-in connection management and authentication.               |

### Framework Selection Guide

| Use Case                          | Recommended Framework |
| --------------------------------- | --------------------- |
| Serverless / Edge deployments     | Hono                  |
| Existing Express application      | Express               |
| Maximum type safety & performance | Fastify               |
| Minimal overhead, modern patterns | Koa                   |
| Real-time bidirectional comms     | WebSocket             |
| General purpose API server        | Hono (default)        |

---

## Available Endpoints

All server adapters expose the same REST API endpoints:

### Health & Status

| Endpoint               | Method | Description                           |
| ---------------------- | ------ | ------------------------------------- |
| `/api/health`          | GET    | Basic health check                    |
| `/api/health/ready`    | GET    | Readiness probe (checks dependencies) |
| `/api/health/live`     | GET    | Kubernetes liveness probe             |
| `/api/health/startup`  | GET    | Kubernetes startup probe              |
| `/api/health/detailed` | GET    | Detailed system health information    |
| `/api/version`         | GET    | Server version information            |

### Agent Operations

| Endpoint               | Method | Description                            |
| ---------------------- | ------ | -------------------------------------- |
| `/api/agent/execute`   | POST   | Execute agent and return full response |
| `/api/agent/stream`    | POST   | Stream agent response via SSE          |
| `/api/agent/providers` | GET    | List available AI providers            |

### Tool Operations

| Endpoint                   | Method | Description                          |
| -------------------------- | ------ | ------------------------------------ |
| `/api/tools`               | GET    | List all available tools             |
| `/api/tools/:name`         | GET    | Get tool details by name             |
| `/api/tools/:name/execute` | POST   | Execute a specific tool              |
| `/api/tools/execute`       | POST   | Execute tool by name in request body |
| `/api/tools/search`        | GET    | Search tools by query                |

### MCP Server Operations

| Endpoint                                         | Method | Description                         |
| ------------------------------------------------ | ------ | ----------------------------------- |
| `/api/mcp/servers`                               | GET    | List connected MCP servers          |
| `/api/mcp/servers/:name`                         | GET    | Get MCP server status and tools     |
| `/api/mcp/servers/:name/tools`                   | GET    | List tools from specific MCP server |
| `/api/mcp/servers/:name/reconnect`               | POST   | Reconnect to MCP server             |
| `/api/mcp/servers/:name`                         | DELETE | Remove MCP server                   |
| `/api/mcp/servers/:name/tools/:toolName/execute` | POST   | Execute tool from specific server   |
| `/api/mcp/health`                                | GET    | Health check for all MCP servers    |

**MCP Health Response Format:**

```json
{
  "healthy": true,
  "status": "all_healthy",
  "servers": [
    { "name": "github", "healthy": true },
    { "name": "postgres", "healthy": true }
  ],
  "timestamp": "2026-02-02T12:00:00.000Z"
}
```

Status values: `no_servers`, `all_healthy`, `degraded`, `unhealthy`

### Memory & Sessions

| Endpoint                                   | Method | Description                |
| ------------------------------------------ | ------ | -------------------------- |
| `/api/memory/sessions`                     | GET    | List conversation sessions |
| `/api/memory/sessions`                     | DELETE | Clear ALL sessions         |
| `/api/memory/sessions/:sessionId`          | GET    | Get session by ID          |
| `/api/memory/sessions/:sessionId`          | DELETE | Delete specific session    |
| `/api/memory/sessions/:sessionId/messages` | GET    | Get messages for session   |
| `/api/memory/stats`                        | GET    | Memory statistics          |
| `/api/memory/health`                       | GET    | Memory system health check |

**Memory Health Response Format:**

```json
{
  "available": true,
  "type": "ConversationMemoryManager",
  "timestamp": "2026-02-02T12:00:00.000Z"
}
```

**Clear All Sessions Response Format:**

```json
{
  "success": true,
  "message": "All sessions cleared successfully",
  "metadata": {
    "timestamp": "2026-02-02T12:00:00.000Z",
    "requestId": "req_abc123"
  }
}
```

### OpenAPI / Documentation

| Endpoint            | Method | Description                  |
| ------------------- | ------ | ---------------------------- |
| `/api/openapi.json` | GET    | OpenAPI specification (JSON) |
| `/api/openapi.yaml` | GET    | OpenAPI specification (YAML) |
| `/api/docs`         | GET    | Swagger UI documentation     |

### Enabling API Documentation

The OpenAPI/Swagger endpoints above are only available when `enableSwagger: true` is set in configuration:

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    enableSwagger: true, // Enable OpenAPI endpoints
  },
});
```

> **Security Note:** Consider disabling `enableSwagger` in production environments to avoid exposing internal API structure to unauthorized users.

---

## Configuration

### Basic Configuration

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    host: "0.0.0.0",
    basePath: "/api",
    timeout: 30000,
    enableSwagger: true,
  },
});
```

### With CORS and Rate Limiting

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    cors: {
      enabled: true,
      origins: ["https://myapp.com"],
      credentials: true,
    },
    rateLimit: {
      enabled: true,
      maxRequests: 100,
      windowMs: 60000, // 1 minute
    },
  },
});
```

### With Authentication

```typescript

const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

// Add authentication middleware
server.registerMiddleware(
  createAuthMiddleware({
    type: "bearer",
    validate: async (token) => {
      const user = await verifyJWT(token);
      return user ? { id: user.id, roles: user.roles } : null;
    },
    skipPaths: ["/api/health", "/api/ready"],
  }),
);

await server.initialize();
await server.start();
```

For complete configuration options, see the [Configuration Reference](/docs/reference/server-configuration).

---

## Adding Custom Routes

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

// Add custom route
server.registerRoute({
  method: "GET",
  path: "/api/custom",
  handler: async (ctx) => {
    return { message: "Custom endpoint", timestamp: Date.now() };
  },
  description: "Custom endpoint example",
  tags: ["custom"],
});

await server.initialize();
await server.start();
```

---

## Accessing the Framework Instance

For advanced customization, you can access the underlying framework instance:

```typescript
const server = await createServer(neurolink, { framework: "hono" });

// Get the underlying Hono app
const app = server.getFrameworkInstance();

// Add framework-specific middleware or routes
app.use("/custom/*", customMiddleware);

await server.initialize();
await server.start();
```

This works for all supported frameworks:

- Hono: Returns `Hono` instance
- Express: Returns `Express.Application` instance
- Fastify: Returns `FastifyInstance`
- Koa: Returns `Koa` instance

---

## Request/Response Examples

### Execute Agent

**Request:**

```json
POST /api/agent/execute
Content-Type: application/json

{
  "input": "What is the capital of France?",
  "provider": "openai",
  "model": "gpt-4o-mini",
  "options": {
    "temperature": 0.7,
    "maxTokens": 500
  }
}
```

**Response:**

```json
{
  "content": "The capital of France is Paris.",
  "provider": "openai",
  "model": "gpt-4o-mini",
  "usage": {
    "inputTokens": 12,
    "outputTokens": 8,
    "totalTokens": 20
  }
}
```

### Stream Agent Response

**Request:**

```bash
curl -X POST http://localhost:3000/api/agent/stream \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"input": "Write a story"}'
```

**Response (SSE):**

```
data: {"type":"text-start","timestamp":1706745600000}

data: {"type":"text-delta","content":"Once","timestamp":1706745600001}

data: {"type":"text-delta","content":" upon","timestamp":1706745600002}

data: {"type":"text-delta","content":" a time...","timestamp":1706745600003}

data: {"type":"text-end","timestamp":1706745600100}

data: {"type":"finish","usage":{"inputTokens":5,"outputTokens":50,"totalTokens":55}}
```

---

## Production Deployment

### Docker

```dockerfile
FROM node:20-alpine

WORKDIR /app

COPY package*.json ./
RUN npm ci --only=production

COPY . .
RUN npm run build

EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=3s \
  CMD wget --spider -q http://localhost:3000/api/health || exit 1

CMD ["node", "dist/server.js"]
```

### Docker Compose

```yaml
version: "3.8"

services:
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
```

### Production Checklist

- [ ] Environment variables configured securely
- [ ] CORS configured for allowed origins
- [ ] Rate limiting enabled
- [ ] Authentication middleware added
- [ ] HTTPS/TLS configured (via reverse proxy)
- [ ] Health check endpoints exposed
- [ ] Logging configured appropriately
- [ ] Error handling middleware in place
- [ ] Request timeout configured
- [ ] Body size limits set

---

## Next Steps

- **[Hono Adapter Guide](/docs/guides/server-adapters/hono)** - Recommended framework for most use cases
- **[Express Adapter Guide](/docs/sdk/framework-integration)** - For existing Express applications
- **[Fastify Adapter Guide](/docs/sdk/framework-integration)** - For maximum performance and type safety
- **[Koa Adapter Guide](/docs/guides/server-adapters/koa)** - For modern, minimalist applications
- **[WebSocket Guide](/docs/guides/server-adapters/websocket)** - Real-time bidirectional communication
- **[Middleware Reference](/docs/workflows/middleware)** - Complete middleware documentation
- **[Streaming Guide](/docs/advanced/streaming)** - Real-time streaming with SSE and NDJSON
- **[Error Handling](/docs/guides/server-adapters/errors)** - Comprehensive error handling guide
- **[Configuration Reference](/docs/reference/server-configuration)** - Full configuration options
- **[OpenAPI Customization](/docs/reference/server-configuration.md#openapi-customization)** - Customize API documentation
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Authentication and authorization patterns
- **[Deployment Guide](/docs/guides/server-adapters/deployment)** - Production deployment strategies

---

## Related Documentation

- **[API Reference](/docs/sdk/api-reference)** - NeuroLink SDK documentation
- **[MCP Integration](/docs/mcp/integration)** - Model Context Protocol tools
- **[Streaming Guide](/docs/advanced/streaming)** - Real-time streaming with SSE and NDJSON
- **[Enterprise Monitoring](/docs/observability/health-monitoring)** - Observability setup

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Migration Guides

<!-- Source: guides/migration/index.md -->

# Migration Guides

This section contains guides for migrating to NeuroLink from other AI SDKs and frameworks.

## Available Migration Guides

- **[From LangChain](/docs/guides/migration/from-langchain)** - Migrate from LangChain to NeuroLink
- **[From Vercel AI SDK](/docs/guides/migration/from-vercel-ai-sdk)** - Migrate from Vercel AI SDK to NeuroLink

## Why Migrate to NeuroLink?

NeuroLink offers several advantages over other AI SDKs:

- **Universal Provider Support** - 14+ AI providers through a single API
- **MCP Integration** - Full Model Context Protocol support with 58+ external servers
- **Enterprise Ready** - Production-tested at scale with Redis memory, failover, and telemetry
- **Professional CLI** - Interactive command-line interface for development and testing
- **TypeScript First** - Full type safety with comprehensive type definitions

## Getting Help

If you encounter issues during migration:

1. Check the [Troubleshooting Guide](/docs/reference/troubleshooting)
2. Review the [API Reference](/docs/sdk/api-reference)
3. Join our [community discussions](https://github.com/juspay/neurolink/discussions)

---

## Enterprise Guides

<!-- Source: guides/enterprise/index.md -->

# Enterprise Guides

This section covers enterprise-grade features, compliance, and production deployment patterns.

## Available Guides

- [Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover) - Configure automatic failover between providers
- [Multi-Region Deployment](/docs/guides/enterprise/multi-region) - Deploy across multiple regions
- [Load Balancing](/docs/guides/enterprise/load-balancing) - Distribute load across providers
- [Cost Optimization](/docs/cookbook/cost-optimization) - Optimize costs in production
- [Compliance](/docs/guides/enterprise/compliance) - Security and compliance requirements
- [Monitoring](/docs/observability/health-monitoring) - Enterprise monitoring setup
- [Audit Trails](/docs/guides/enterprise/audit-trails) - Audit logging and compliance

## Getting Started

For basic setup, start with the [Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover) guide to ensure high availability.

---

## Hono Adapter

<!-- Source: guides/server-adapters/hono.md -->

# Hono Adapter

**The recommended framework for NeuroLink server adapters**

Hono is a lightweight, ultrafast web framework designed for the edge. It runs on virtually any JavaScript runtime including Node.js, Deno, Bun, Cloudflare Workers, and more.

-------------------- | ------------------------------------------------------------------------- |
| **Multi-runtime**       | Deploy to Node.js, Deno, Bun, Cloudflare Workers, Vercel Edge, AWS Lambda |
| **Ultrafast**           | Minimal overhead, optimized router with RegExpRouter                      |
| **TypeScript-first**    | Full type safety out of the box                                           |
| **Tiny footprint**      | ~14KB minified, no dependencies                                           |
| **Built-in middleware** | CORS, compression, ETag, secure headers included                          |
| **Web Standards**       | Uses Fetch API, Request/Response objects                                  |

Hono is the default and recommended framework for NeuroLink server adapters.

---

## CLI Usage

Start a Hono server via CLI:

```bash
# Foreground mode
neurolink serve --framework hono --port 3000

# Background mode
neurolink server start --framework hono --port 3000

# Check routes
neurolink server routes
```

---

## Quick Start

### Installation

Hono is included with NeuroLink - no additional installation required.

```bash
# NeuroLink includes Hono as a dependency
npm install @juspay/neurolink
```

### Basic Usage

```typescript

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono", // This is the default
  config: {
    port: 3000,
    basePath: "/api",
  },
});

await server.initialize();
await server.start();

console.log("Server running on http://localhost:3000");
```

### Test the Server

```bash
# Health check
curl http://localhost:3000/api/health

# Execute agent
curl -X POST http://localhost:3000/api/agent/execute \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello, world!"}'
```

---

## Accessing the Hono App

For advanced customization, you can access the underlying Hono instance:

```typescript

const neurolink = new NeuroLink();

const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

// Get the underlying Hono app
const app = server.getFrameworkInstance();

// Add Hono middleware
app.use("*", logger());
app.use(
  "/api/*",
  cors({
    origin: ["https://myapp.com"],
    credentials: true,
  }),
);

// Add custom routes directly on Hono
app.get("/custom", (c) => c.json({ message: "Custom route" }));

// Add route groups
app.route("/v2", v2Routes);

await server.initialize();
await server.start();
```

---

## Configuration Options

### Full Configuration Example

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    // Server settings
    port: 3000,
    host: "0.0.0.0",
    basePath: "/api",
    timeout: 30000, // 30 seconds

    // CORS
    cors: {
      enabled: true,
      origins: ["https://myapp.com", "https://staging.myapp.com"],
      methods: ["GET", "POST", "PUT", "DELETE"],
      headers: ["Content-Type", "Authorization", "X-Request-ID"],
      credentials: true,
      maxAge: 86400, // 24 hours
    },

    // Rate limiting
    rateLimit: {
      enabled: true,
      maxRequests: 100,
      windowMs: 60000, // 1 minute
      skipPaths: ["/api/health", "/api/ready"],
    },
    // Note: Rate-limited responses (HTTP 429) include a `Retry-After` header indicating seconds to wait.

    // Body parsing
    bodyParser: {
      enabled: true,
      maxSize: "10mb",
      jsonLimit: "10mb",
    },

    // Logging
    logging: {
      enabled: true,
      level: "info",
      includeBody: false,
      includeResponse: false,
    },

    // Documentation
    enableSwagger: true,
    enableMetrics: true,
  },
});
```

---

## Middleware Integration

### Using NeuroLink Middleware

```typescript

  createServer,
  createAuthMiddleware,
  createRateLimitMiddleware,
  createCacheMiddleware,
  createRequestIdMiddleware,
  createTimingMiddleware,
} from "@juspay/neurolink";

const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

// Add request ID to all requests
server.registerMiddleware(createRequestIdMiddleware());

// Add timing headers
server.registerMiddleware(createTimingMiddleware());

// Add authentication
server.registerMiddleware(
  createAuthMiddleware({
    type: "bearer",
    validate: async (token) => {
      const decoded = await verifyJWT(token);
      return decoded ? { id: decoded.sub, roles: decoded.roles } : null;
    },
    skipPaths: ["/api/health", "/api/ready", "/api/version"],
  }),
);

// Add rate limiting
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 60000,
    keyGenerator: (ctx) => ctx.headers["x-api-key"] || ctx.ip,
  }),
);

// Add response caching
server.registerMiddleware(
  createCacheMiddleware({
    ttlMs: 300000, // 5 minutes
    methods: ["GET"],
    excludePaths: ["/api/agent/execute", "/api/agent/stream"],
  }),
);

// Note: Cached responses include `X-Cache: HIT` header. Fresh responses include `X-Cache: MISS`.

await server.initialize();
await server.start();
```

### Using Hono Built-in Middleware

```typescript

const server = await createServer(neurolink, { framework: "hono" });
const app = server.getFrameworkInstance();

// Security headers
app.use("*", secureHeaders());

// Compression
app.use("*", compress());

// ETag for caching
app.use("*", etag());

// Request timing
app.use("*", timing());

// CORS with full configuration
app.use(
  "/api/*",
  cors({
    origin: (origin) => {
      // Dynamic origin checking
      return origin.endsWith(".myapp.com") ? origin : null;
    },
    allowMethods: ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
    allowHeaders: ["Content-Type", "Authorization"],
    exposeHeaders: ["X-Request-Id", "X-Response-Time"],
    maxAge: 86400,
    credentials: true,
  }),
);

await server.initialize();
await server.start();
```

---

## Streaming Responses

Hono has excellent streaming support, which NeuroLink leverages for real-time AI responses:

```typescript
// The /api/agent/stream endpoint is automatically set up
// It uses Server-Sent Events (SSE) for streaming

// Client-side usage:
// Note: EventSource only supports GET requests in browsers.
// Use query parameters for simple inputs:
const eventSource = new EventSource(
  `/api/agent/stream?input=${encodeURIComponent("Write a story")}`,
);

eventSource.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.type === "text-delta") {
    console.log(data.content);
  }
};

// For POST requests with SSE, use fetch with a readable stream:
async function streamWithPost() {
  const response = await fetch("/api/agent/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ input: "Write a story" }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    const chunk = decoder.decode(value);
    // Parse SSE format: "data: {...}\n\n"
    const lines = chunk.split("\n");
    for (const line of lines) {
      if (line.startsWith("data: ")) {
        const data = JSON.parse(line.slice(6));
        if (data.type === "text-delta") {
          console.log(data.content);
        }
      }
    }
  }
}
```

### Custom Streaming Route

```typescript
const app = server.getFrameworkInstance();

app.get("/api/custom-stream", async (c) => {
  return c.streamText(async (stream) => {
    for await (const chunk of neurolink.generateStream({
      prompt: "Tell me a joke",
    })) {
      await stream.write(chunk.content);
    }
  });
});
```

---

## Error Handling

### Custom Error Handler

```typescript

const app = server.getFrameworkInstance();

app.onError((err, c) => {
  console.error("Error:", err);

  if (err instanceof HTTPException) {
    return c.json({ error: err.message, status: err.status }, err.status);
  }

  // AI provider errors
  if (err.message.includes("rate limit")) {
    return c.json({ error: "Rate limit exceeded", retryAfter: 60 }, 429);
  }

  // Default error response
  return c.json(
    {
      error: "Internal server error",
      message: process.env.NODE_ENV === "development" ? err.message : undefined,
    },
    500,
  );
});

app.notFound((c) => {
  return c.json({ error: "Not found", path: c.req.path }, 404);
});
```

---

## Performance Tips

### 1. Use the RegExpRouter (Default)

Hono uses RegExpRouter by default, which is the fastest router. No configuration needed.

### 2. Enable Compression

```typescript

app.use("*", compress());
```

### 3. Use ETag for Caching

```typescript

app.use("/api/tools/*", etag());
```

### 4. Minimize Middleware Chain

Only use middleware where needed:

```typescript
// Instead of applying to all routes
app.use("*", expensiveMiddleware);

// Apply only where needed
app.use("/api/agent/*", expensiveMiddleware);
```

### 5. Use Streaming for Long Responses

Always use the streaming endpoint for AI generation to avoid timeouts:

```typescript
// Prefer streaming for long responses
fetch("/api/agent/stream", {
  method: "POST",
  body: JSON.stringify({ input: "Write a long essay" }),
});
```

---

## Edge Runtime Deployment

### Cloudflare Workers

```typescript

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: { basePath: "/api" },
});

await server.initialize();

export default {
  fetch: server.getFrameworkInstance().fetch,
};
```

### Vercel Edge Functions

```typescript
// api/[[...route]].ts

const neurolink = new NeuroLink();
const server = await createServer(neurolink, { framework: "hono" });
await server.initialize();

export const config = { runtime: "edge" };
export default server.getFrameworkInstance().fetch;
```

### Deno Deploy

```typescript

const neurolink = new NeuroLink();
const server = await createServer(neurolink, { framework: "hono" });
await server.initialize();

Deno.serve(server.getFrameworkInstance().fetch);
```

---

## Testing

### Unit Testing with Hono Test Client

```typescript

describe("API Server", () => {
  it("should return health status", async () => {
    const neurolink = new NeuroLink({ defaultProvider: "openai" });
    const server = await createServer(neurolink, { framework: "hono" });
    await server.initialize();

    const app = server.getFrameworkInstance();
    const res = await app.request("/api/health");

    expect(res.status).toBe(200);
    const json = await res.json();
    expect(json.status).toBe("ok");
  });

  it("should execute agent request", async () => {
    const neurolink = new NeuroLink({ defaultProvider: "openai" });
    const server = await createServer(neurolink, { framework: "hono" });
    await server.initialize();

    const app = server.getFrameworkInstance();
    const res = await app.request("/api/agent/execute", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ input: "Hello" }),
    });

    expect(res.status).toBe(200);
  });
});
```

---

## Production Checklist

- [ ] Configure environment variables securely
- [ ] Set appropriate CORS origins (not `*`)
- [ ] Enable rate limiting with reasonable limits
- [ ] Add authentication middleware
- [ ] Configure request timeouts
- [ ] Set body size limits
- [ ] Enable compression
- [ ] Add security headers
- [ ] Configure logging with appropriate level
- [ ] Set up health check monitoring
- [ ] Configure error tracking (Sentry, etc.)

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Configuration Reference](/docs/reference/server-configuration)** - Full configuration options
- **[Express Adapter](/docs/sdk/framework-integration)** - Compare with Express adapter
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Authentication patterns
- **[Streaming Guide](/docs/advanced/streaming)** - Real-time streaming with SSE and NDJSON

---

## Additional Resources

- **[Hono Documentation](https://hono.dev/)** - Official Hono documentation
- **[Hono Middleware](https://hono.dev/docs/middleware/builtin/)** - Built-in middleware
- **[Hono Examples](https://hono.dev/docs/getting-started/examples)** - Example applications

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Express Adapter

<!-- Source: guides/server-adapters/express.md -->

# Express Adapter

**The most popular Node.js web framework**

Express is a minimal and flexible Node.js web framework that provides a robust set of features for building web applications and APIs. It has the largest ecosystem of middleware and is widely used in production.

------------------ | ----------------------------------------------- |
| **Mature ecosystem**  | Thousands of middleware packages available      |
| **Well-documented**   | Extensive documentation and community resources |
| **Familiar API**      | Most Node.js developers already know Express    |
| **Flexible**          | Unopinionated, adapt to any architecture        |
| **Production-proven** | Powers millions of applications worldwide       |
| **Easy migration**    | Integrate NeuroLink into existing Express apps  |

Express is ideal when you have an existing Express application or prefer its familiar middleware patterns.

---

## CLI Usage

Start an Express server via CLI:

```bash
# Foreground mode
neurolink serve --framework express --port 3000

# Background mode
neurolink server start --framework express --port 3000

# Check routes
neurolink server routes
```

---

## Quick Start

### Installation

Express must be installed separately alongside NeuroLink:

```bash
npm install @juspay/neurolink express
```

### Basic Usage

```typescript

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "express",
  config: {
    port: 3000,
    basePath: "/api",
  },
});

await server.initialize();
await server.start();

console.log("Server running on http://localhost:3000");
```

### Test the Server

```bash
# Health check
curl http://localhost:3000/api/health

# Execute agent
curl -X POST http://localhost:3000/api/agent/execute \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello, world!"}'
```

---

## Accessing the Express App

For advanced customization, you can access the underlying Express application:

```typescript

const neurolink = new NeuroLink();

const server = await createServer(neurolink, {
  framework: "express",
  config: { port: 3000 },
});

// Get the underlying Express app
const app = server.getFrameworkInstance();

// Add Express middleware
app.use(helmet());
app.use(morgan("combined"));

// Add custom routes directly on Express
app.get("/custom", (req, res) => {
  res.json({ message: "Custom route" });
});

// Add route groups with Express Router

const v2Router = Router();
v2Router.get("/status", (req, res) => res.json({ version: 2 }));
app.use("/v2", v2Router);

await server.initialize();
await server.start();
```

---

## Configuration Options

### Full Configuration Example

```typescript
const server = await createServer(neurolink, {
  framework: "express",
  config: {
    // Server settings
    port: 3000,
    host: "0.0.0.0",
    basePath: "/api",
    timeout: 30000, // 30 seconds

    // CORS
    cors: {
      enabled: true,
      origins: ["https://myapp.com", "https://staging.myapp.com"],
      methods: ["GET", "POST", "PUT", "DELETE"],
      headers: ["Content-Type", "Authorization", "X-Request-ID"],
      credentials: true,
      maxAge: 86400, // 24 hours
    },

    // Rate limiting
    rateLimit: {
      enabled: true,
      maxRequests: 100,
      windowMs: 60000, // 1 minute
      skipPaths: ["/api/health", "/api/ready"],
    },
    // Note: Rate-limited responses (HTTP 429) include a `Retry-After` header indicating seconds to wait.

    // Body parsing
    bodyParser: {
      enabled: true,
      maxSize: "10mb",
      jsonLimit: "10mb",
    },

    // Logging
    logging: {
      enabled: true,
      level: "info",
      includeBody: false,
      includeResponse: false,
    },

    // Documentation
    enableSwagger: true,
    enableMetrics: true,
  },
});
```

---

## Middleware Integration

### Using NeuroLink Middleware

```typescript

  createServer,
  createAuthMiddleware,
  createRateLimitMiddleware,
  createCacheMiddleware,
  createRequestIdMiddleware,
  createTimingMiddleware,
} from "@juspay/neurolink";

const server = await createServer(neurolink, {
  framework: "express",
  config: { port: 3000 },
});

// Add request ID to all requests
server.registerMiddleware(createRequestIdMiddleware());

// Add timing headers
server.registerMiddleware(createTimingMiddleware());

// Add authentication
server.registerMiddleware(
  createAuthMiddleware({
    type: "bearer",
    validate: async (token) => {
      const decoded = await verifyJWT(token);
      return decoded ? { id: decoded.sub, roles: decoded.roles } : null;
    },
    skipPaths: ["/api/health", "/api/ready", "/api/version"],
  }),
);

// Add rate limiting
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 60000,
    keyGenerator: (ctx) => ctx.headers["x-api-key"] || ctx.ip,
  }),
);

// Add response caching
server.registerMiddleware(
  createCacheMiddleware({
    ttlMs: 300000, // 5 minutes
    methods: ["GET"],
    excludePaths: ["/api/agent/execute", "/api/agent/stream"],
  }),
);

// Note: Cached responses include `X-Cache: HIT` header. Fresh responses include `X-Cache: MISS`.

await server.initialize();
await server.start();
```

### Using Express-Native Middleware

```typescript

const server = await createServer(neurolink, { framework: "express" });
const app = server.getFrameworkInstance();

// Security headers
app.use(helmet());

// Logging
app.use(morgan("combined"));

// Compression
app.use(compression());

// Custom CORS configuration
app.use(
  cors({
    origin: (origin, callback) => {
      // Dynamic origin checking
      if (!origin || origin.endsWith(".myapp.com")) {
        callback(null, true);
      } else {
        callback(new Error("Not allowed by CORS"));
      }
    },
    methods: ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
    allowedHeaders: ["Content-Type", "Authorization"],
    exposedHeaders: ["X-Request-Id", "X-Response-Time"],
    maxAge: 86400,
    credentials: true,
  }),
);

await server.initialize();
await server.start();
```

---

## Streaming Responses

Express supports streaming through Server-Sent Events (SSE):

```typescript
// The /api/agent/stream endpoint is automatically set up
// It uses Server-Sent Events (SSE) for streaming

// Client-side usage:
// Note: EventSource only supports GET requests in browsers.
// Use query parameters for simple inputs:
const eventSource = new EventSource(
  `/api/agent/stream?input=${encodeURIComponent("Write a story")}`,
);

eventSource.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.type === "text-delta") {
    console.log(data.content);
  }
};

// For POST requests with SSE, use fetch with a readable stream:
async function streamWithPost() {
  const response = await fetch("/api/agent/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ input: "Write a story" }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    const chunk = decoder.decode(value);
    // Parse SSE format: "data: {...}\n\n"
    const lines = chunk.split("\n");
    for (const line of lines) {
      if (line.startsWith("data: ")) {
        const data = JSON.parse(line.slice(6));
        if (data.type === "text-delta") {
          console.log(data.content);
        }
      }
    }
  }
}
```

### Custom Streaming Route

```typescript
const app = server.getFrameworkInstance();

app.post("/api/custom-stream", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

  for await (const chunk of neurolink.generateStream({
    prompt: req.body.input,
  })) {
    res.write(`data: ${JSON.stringify(chunk)}\n\n`);
  }

  res.write("event: done\ndata: \n\n");
  res.end();
});
```

---

## Abort Signal Handling

The abort signal middleware allows detecting when clients disconnect during long-running requests. NeuroLink provides both a universal middleware and an Express-specific implementation.

### Using Abort Signal Middleware

```typescript

  createAbortSignalMiddleware,
  createExpressAbortMiddleware,
} from "@juspay/neurolink";

// Option 1: Universal middleware (works with ServerContext)
const abortMiddleware = createAbortSignalMiddleware({
  onAbort: (ctx) => {
    console.log(`Request ${ctx.requestId} was aborted by client`);
  },
  timeout: 30000, // Optional request timeout
});

// Option 2: Express-specific middleware (lower-level)
app.use(createExpressAbortMiddleware());

// Access in route handler
app.get("/long-operation", async (req, res) => {
  const { abortSignal } = res.locals;

  // Check if aborted
  if (abortSignal?.aborted) {
    return res.status(499).json({ error: "Request cancelled" });
  }

  // Use with fetch or other AbortSignal-aware APIs
  const response = await fetch(url, { signal: abortSignal });
});
```

### Use Cases

The abort signal middleware is useful for:

- **Long-running AI generation** - Cancel generation when client disconnects
- **Streaming responses** - Stop producing chunks when client leaves
- **Database queries** - Cancel queries that support abort signals
- **External API calls** - Pass signal to fetch/axios for cancellation

### Native Express Approach

For simpler cases, you can use Express's native socket events:

```typescript
const app = server.getFrameworkInstance();

app.post("/api/long-running", async (req, res) => {
  // Check if client disconnected
  req.on("close", () => {
    console.log("Client disconnected, cleaning up...");
    // Cleanup resources
  });

  // Your long-running operation
  const result = await neurolink.generate({
    prompt: req.body.input,
  });

  res.json(result);
});
```

For streaming requests, the adapter automatically detects client disconnection and stops the stream to avoid unnecessary processing.

---

## Error Handling

### Custom Error Handler

```typescript
const app = server.getFrameworkInstance();

// Custom error handling middleware (must be defined last)
app.use((err, req, res, next) => {
  console.error("Error:", err);

  // AI provider errors
  if (err.message.includes("rate limit")) {
    return res.status(429).json({
      error: "Rate limit exceeded",
      retryAfter: 60,
    });
  }

  // Validation errors
  if (err.name === "ValidationError") {
    return res.status(400).json({
      error: "Validation failed",
      details: err.details,
    });
  }

  // Default error response
  res.status(500).json({
    error: "Internal server error",
    message: process.env.NODE_ENV === "development" ? err.message : undefined,
  });
});

// 404 handler
app.use((req, res) => {
  res.status(404).json({
    error: "Not found",
    path: req.path,
  });
});
```

---

## Integrating with Existing Express Apps

If you already have an Express application, you can integrate NeuroLink routes:

```typescript

// Your existing Express app
const existingApp = express();
existingApp.use(express.json());

// Add your existing routes
existingApp.get("/", (req, res) => {
  res.json({ message: "Welcome to my API" });
});

// Create NeuroLink server
// Note: basePath: "/" since Express mount path handles the prefix
const neurolink = new NeuroLink({ defaultProvider: "openai" });
const nlServer = await createServer(neurolink, {
  framework: "express",
  config: { basePath: "/" },
});

await nlServer.initialize();

// Mount NeuroLink routes on your existing app
const nlApp = nlServer.getFrameworkInstance();
existingApp.use("/ai", nlApp);

// Start your existing app
existingApp.listen(3000, () => {
  console.log("Server running on http://localhost:3000");
  console.log("AI endpoints available at /ai/*");
});
```

---

## Testing

### Unit Testing with Supertest

```typescript

describe("API Server", () => {
  let server;
  let app;

  beforeAll(async () => {
    const neurolink = new NeuroLink({ defaultProvider: "openai" });
    server = await createServer(neurolink, { framework: "express" });
    await server.initialize();
    app = server.getFrameworkInstance();
  });

  afterAll(async () => {
    await server.stop();
  });

  it("should return health status", async () => {
    const res = await request(app).get("/api/health");

    expect(res.status).toBe(200);
    expect(res.body.status).toBe("ok");
  });

  it("should execute agent request", async () => {
    const res = await request(app)
      .post("/api/agent/execute")
      .set("Content-Type", "application/json")
      .send({ input: "Hello" });

    expect(res.status).toBe(200);
    expect(res.body.data).toBeDefined();
  });
});
```

---

## Production Checklist

- [ ] Configure environment variables securely
- [ ] Set appropriate CORS origins (not `*`)
- [ ] Enable rate limiting with reasonable limits
- [ ] Add authentication middleware
- [ ] Configure request timeouts
- [ ] Set body size limits
- [ ] Enable compression (gzip/brotli)
- [ ] Add security headers (helmet)
- [ ] Configure logging with appropriate level
- [ ] Set up health check monitoring
- [ ] Configure error tracking (Sentry, etc.)
- [ ] Use a process manager (PM2, systemd)

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Configuration Reference](/docs/reference/server-configuration)** - Full configuration options
- **[Hono Adapter](/docs/guides/server-adapters/hono)** - Compare with Hono adapter
- **[Fastify Adapter](/docs/sdk/framework-integration)** - Compare with Fastify adapter
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Authentication patterns

---

## Additional Resources

- **[Express Documentation](https://expressjs.com/)** - Official Express documentation
- **[Express Middleware](https://expressjs.com/en/resources/middleware.html)** - Popular middleware packages
- **[Express Security Best Practices](https://expressjs.com/en/advanced/best-practice-security.html)** - Security guidelines

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Fastify Adapter

<!-- Source: guides/server-adapters/fastify.md -->

# Fastify Adapter

**High-performance web framework with built-in schema validation**

Fastify is a fast and low overhead web framework for Node.js. It provides excellent TypeScript support, built-in schema validation, and a powerful plugin system.

------------------ | -------------------------------------------------------- |
| **High performance**  | One of the fastest Node.js web frameworks                |
| **Schema validation** | Built-in JSON Schema validation with fast-json-stringify |
| **TypeScript-first**  | Excellent TypeScript support and type inference          |
| **Plugin system**     | Powerful encapsulated plugin architecture                |
| **Low overhead**      | Minimal memory footprint and fast serialization          |
| **Production-ready**  | Built-in logging with Pino, decorators, hooks            |

Fastify is ideal when you need maximum performance and strong type safety.

---

## CLI Usage

Start a Fastify server via CLI:

```bash
# Foreground mode
neurolink serve --framework fastify --port 3000

# Background mode
neurolink server start --framework fastify --port 3000

# Check routes
neurolink server routes
```

---

## Quick Start

### Installation

Fastify is included with NeuroLink - no additional installation required.

```bash
# NeuroLink includes Fastify as a dependency
npm install @juspay/neurolink
```

### Basic Usage

```typescript

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "fastify",
  config: {
    port: 3000,
    basePath: "/api",
  },
});

await server.initialize();
await server.start();

console.log("Server running on http://localhost:3000");
```

### Test the Server

```bash
# Health check
curl http://localhost:3000/api/health

# Execute agent
curl -X POST http://localhost:3000/api/agent/execute \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello, world!"}'
```

---

## Accessing the Fastify Instance

For advanced customization, you can access the underlying Fastify instance:

```typescript

const neurolink = new NeuroLink();

const server = await createServer(neurolink, {
  framework: "fastify",
  config: { port: 3000 },
});

// Get the underlying Fastify instance
const fastify = server.getFrameworkInstance();

// Add custom routes directly on Fastify
fastify.get("/custom", async (request, reply) => {
  return { message: "Custom route" };
});

// Add decorators
fastify.decorate("neurolink", neurolink);

// Add hooks
fastify.addHook("onRequest", async (request, reply) => {
  request.startTime = Date.now();
});

await server.initialize();
await server.start();
```

---

## Plugin Registration

Fastify's plugin system allows you to encapsulate functionality:

```typescript

const neurolink = new NeuroLink();

const server = await createServer(neurolink, {
  framework: "fastify",
  config: { port: 3000 },
});

const fastify = server.getFrameworkInstance();

// Register security headers plugin
await fastify.register(fastifyHelmet);

// Register Swagger documentation
await fastify.register(fastifySwagger, {
  openapi: {
    info: {
      title: "NeuroLink AI API",
      description: "AI-powered API endpoints",
      version: "1.0.0",
    },
  },
});

await fastify.register(fastifySwaggerUi, {
  routePrefix: "/docs",
});

// Register custom plugin
await fastify.register(async function customPlugin(instance) {
  instance.get("/plugin-route", async () => {
    return { source: "plugin" };
  });
});

await server.initialize();
await server.start();
```

---

## Configuration Options

### Full Configuration Example

```typescript
const server = await createServer(neurolink, {
  framework: "fastify",
  config: {
    // Server settings
    port: 3000,
    host: "0.0.0.0",
    basePath: "/api",
    timeout: 30000, // 30 seconds

    // CORS
    cors: {
      enabled: true,
      origins: ["https://myapp.com", "https://staging.myapp.com"],
      methods: ["GET", "POST", "PUT", "DELETE"],
      headers: ["Content-Type", "Authorization", "X-Request-ID"],
      credentials: true,
      maxAge: 86400, // 24 hours
    },

    // Rate limiting
    rateLimit: {
      enabled: true,
      maxRequests: 100,
      windowMs: 60000, // 1 minute
      skipPaths: ["/api/health", "/api/ready"],
    },
    // Note: Rate-limited responses (HTTP 429) include a `Retry-After` header indicating seconds to wait.

    // Body parsing
    bodyParser: {
      enabled: true,
      maxSize: "10mb",
      jsonLimit: "10mb",
    },

    // Logging (Fastify uses Pino)
    logging: {
      enabled: true,
      level: "info",
      includeBody: false,
      includeResponse: false,
    },

    // Documentation
    enableSwagger: true,
    enableMetrics: true,
  },
});
```

---

## Middleware Integration

### Using NeuroLink Middleware

```typescript

  createServer,
  createAuthMiddleware,
  createRateLimitMiddleware,
  createCacheMiddleware,
  createRequestIdMiddleware,
  createTimingMiddleware,
} from "@juspay/neurolink";

const server = await createServer(neurolink, {
  framework: "fastify",
  config: { port: 3000 },
});

// Add request ID to all requests
server.registerMiddleware(createRequestIdMiddleware());

// Add timing headers
server.registerMiddleware(createTimingMiddleware());

// Add authentication
server.registerMiddleware(
  createAuthMiddleware({
    type: "bearer",
    validate: async (token) => {
      const decoded = await verifyJWT(token);
      return decoded ? { id: decoded.sub, roles: decoded.roles } : null;
    },
    skipPaths: ["/api/health", "/api/ready", "/api/version"],
  }),
);

// Add rate limiting
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 60000,
    keyGenerator: (ctx) => ctx.headers["x-api-key"] || ctx.ip,
  }),
);

// Add response caching
server.registerMiddleware(
  createCacheMiddleware({
    ttlMs: 300000, // 5 minutes
    methods: ["GET"],
    excludePaths: ["/api/agent/execute", "/api/agent/stream"],
  }),
);

// Note: Cached responses include `X-Cache: HIT` header. Fresh responses include `X-Cache: MISS`.

await server.initialize();
await server.start();
```

### Using Fastify Hooks

```typescript
const fastify = server.getFrameworkInstance();

// onRequest hook - runs first
fastify.addHook("onRequest", async (request, reply) => {
  console.log(`Request: ${request.method} ${request.url}`);
});

// preValidation hook - runs before validation
fastify.addHook("preValidation", async (request, reply) => {
  // Custom validation logic
});

// preHandler hook - runs before route handler
fastify.addHook("preHandler", async (request, reply) => {
  // Authentication, authorization, etc.
});

// onSend hook - runs before response is sent
fastify.addHook("onSend", async (request, reply, payload) => {
  // Modify response
  return payload;
});

// onResponse hook - runs after response is sent
fastify.addHook("onResponse", async (request, reply) => {
  console.log(`Response time: ${reply.elapsedTime}ms`);
});
```

---

## MCP Body Attachment

When using MCP (Model Context Protocol) tools with Fastify, the request body is automatically attached to the context. The Fastify adapter handles this seamlessly:

```typescript
const server = await createServer(neurolink, {
  framework: "fastify",
  config: { port: 3000 },
});

// MCP tools receive the request body automatically
// No additional configuration needed

// The body is accessible in your route handlers
const fastify = server.getFrameworkInstance();

fastify.post("/api/custom-mcp", async (request, reply) => {
  const { input, tools } = request.body;

  // Execute with specific MCP tools
  const result = await neurolink.generate({
    prompt: input,
    tools: tools,
  });

  return result;
});
```

For large payloads, ensure your body limit configuration is appropriate:

```typescript
const server = await createServer(neurolink, {
  framework: "fastify",
  config: {
    bodyParser: {
      maxSize: "50mb", // Increase for large MCP payloads
    },
  },
});
```

---

## Streaming Responses

Fastify supports streaming through Server-Sent Events (SSE):

```typescript
// The /api/agent/stream endpoint is automatically set up
// It uses Server-Sent Events (SSE) for streaming

// Client-side usage:
// Note: EventSource only supports GET requests in browsers.
// Use query parameters for simple inputs:
const eventSource = new EventSource(
  `/api/agent/stream?input=${encodeURIComponent("Write a story")}`,
);

eventSource.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.type === "text-delta") {
    console.log(data.content);
  }
};

// For POST requests with SSE, use fetch with a readable stream:
async function streamWithPost() {
  const response = await fetch("/api/agent/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ input: "Write a story" }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    const chunk = decoder.decode(value);
    // Parse SSE format: "data: {...}\n\n"
    const lines = chunk.split("\n");
    for (const line of lines) {
      if (line.startsWith("data: ")) {
        const data = JSON.parse(line.slice(6));
        if (data.type === "text-delta") {
          console.log(data.content);
        }
      }
    }
  }
}
```

### Custom Streaming Route

```typescript
const fastify = server.getFrameworkInstance();

fastify.post("/api/custom-stream", async (request, reply) => {
  reply.raw.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  for await (const chunk of neurolink.generateStream({
    prompt: request.body.input,
  })) {
    reply.raw.write(`data: ${JSON.stringify(chunk)}\n\n`);
  }

  reply.raw.write("event: done\ndata: \n\n");
  reply.raw.end();
});
```

---

## Performance Tips

### 1. Use Schema Validation

Fastify's schema validation is highly optimized. Define schemas for better performance and automatic documentation:

```typescript
const fastify = server.getFrameworkInstance();

// Define schemas
const executeSchema = {
  body: {
    type: "object",
    required: ["input"],
    properties: {
      input: { type: "string", minLength: 1, maxLength: 10000 },
      provider: { type: "string", enum: ["openai", "anthropic", "google"] },
      options: {
        type: "object",
        properties: {
          temperature: { type: "number", minimum: 0, maximum: 2 },
          maxTokens: { type: "integer", minimum: 1, maximum: 100000 },
        },
      },
    },
  },
  response: {
    200: {
      type: "object",
      properties: {
        data: { type: "object" },
        metadata: {
          type: "object",
          properties: {
            requestId: { type: "string" },
            timestamp: { type: "string" },
            duration: { type: "number" },
          },
        },
      },
    },
  },
};

// Route with schema validation
fastify.post("/api/validated-execute", {
  schema: executeSchema,
  handler: async (request, reply) => {
    const result = await neurolink.generate({
      prompt: request.body.input,
      provider: request.body.provider,
      ...request.body.options,
    });
    return { data: result };
  },
});
```

### 2. Use fastify-compress for Response Compression

```typescript

const fastify = server.getFrameworkInstance();
await fastify.register(fastifyCompress, {
  encodings: ["gzip", "deflate"],
});
```

### 3. Configure Logging Appropriately

```typescript
// In production, use structured logging with Pino
const server = await createServer(neurolink, {
  framework: "fastify",
  config: {
    logging: {
      enabled: true,
      level: process.env.NODE_ENV === "production" ? "warn" : "info",
    },
  },
});
```

### 4. Use Connection Pooling

When accessing databases or external services, use connection pooling:

```typescript
const fastify = server.getFrameworkInstance();

// Decorate with a connection pool
fastify.decorate(
  "db",
  createPool({
    max: 20,
    idleTimeoutMillis: 30000,
  }),
);

// Clean up on close
fastify.addHook("onClose", async (instance) => {
  await instance.db.end();
});
```

### 5. Disable Logging in Benchmarks

For maximum performance in benchmarks, disable logging:

```typescript
const server = await createServer(neurolink, {
  framework: "fastify",
  config: {
    logging: { enabled: false },
  },
});
```

---

## Error Handling

### Custom Error Handler

```typescript
const fastify = server.getFrameworkInstance();

// Set custom error handler
fastify.setErrorHandler((error, request, reply) => {
  console.error("Error:", error);

  // AI provider errors
  if (error.message.includes("rate limit")) {
    return reply.status(429).send({
      error: "Rate limit exceeded",
      retryAfter: 60,
    });
  }

  // Validation errors
  if (error.validation) {
    return reply.status(400).send({
      error: "Validation failed",
      details: error.validation,
    });
  }

  // Default error response
  reply.status(500).send({
    error: "Internal server error",
    message: process.env.NODE_ENV === "development" ? error.message : undefined,
  });
});

// Custom 404 handler
fastify.setNotFoundHandler((request, reply) => {
  reply.status(404).send({
    error: "Not found",
    path: request.url,
  });
});
```

---

## Testing

### Unit Testing with Fastify's inject

```typescript

describe("API Server", () => {
  let server;
  let fastify;

  beforeAll(async () => {
    const neurolink = new NeuroLink({ defaultProvider: "openai" });
    server = await createServer(neurolink, { framework: "fastify" });
    await server.initialize();
    fastify = server.getFrameworkInstance();
  });

  afterAll(async () => {
    await server.stop();
  });

  it("should return health status", async () => {
    const response = await fastify.inject({
      method: "GET",
      url: "/api/health",
    });

    expect(response.statusCode).toBe(200);
    const json = response.json();
    expect(json.status).toBe("ok");
  });

  it("should execute agent request", async () => {
    const response = await fastify.inject({
      method: "POST",
      url: "/api/agent/execute",
      headers: { "Content-Type": "application/json" },
      payload: { input: "Hello" },
    });

    expect(response.statusCode).toBe(200);
    const json = response.json();
    expect(json.data).toBeDefined();
  });

  it("should validate request body", async () => {
    const response = await fastify.inject({
      method: "POST",
      url: "/api/agent/execute",
      headers: { "Content-Type": "application/json" },
      payload: {}, // Missing required 'input' field
    });

    expect(response.statusCode).toBe(400);
  });
});
```

---

## Production Checklist

- [ ] Configure environment variables securely
- [ ] Set appropriate CORS origins (not `*`)
- [ ] Enable rate limiting with reasonable limits
- [ ] Add authentication middleware
- [ ] Configure request timeouts
- [ ] Set body size limits
- [ ] Enable compression (@fastify/compress)
- [ ] Add security headers (@fastify/helmet)
- [ ] Configure logging with appropriate level
- [ ] Set up health check monitoring
- [ ] Configure error tracking (Sentry, etc.)
- [ ] Use schema validation for all routes
- [ ] Enable JSON schema compilation caching

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Configuration Reference](/docs/reference/server-configuration)** - Full configuration options
- **[Hono Adapter](/docs/guides/server-adapters/hono)** - Compare with Hono adapter
- **[Express Adapter](/docs/sdk/framework-integration)** - Compare with Express adapter
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Authentication patterns

---

## Additional Resources

- **[Fastify Documentation](https://fastify.dev/)** - Official Fastify documentation
- **[Fastify Plugins](https://fastify.dev/ecosystem/)** - Official and community plugins
- **[Fastify Performance](https://fastify.dev/docs/latest/Guides/Benchmarking/)** - Performance tuning

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Koa Adapter

<!-- Source: guides/server-adapters/koa.md -->

# Koa Adapter

**Modern middleware composition for NeuroLink APIs**

Koa is a minimalist web framework designed by the team behind Express. It leverages async/await for cleaner middleware composition, making it ideal for building elegant, maintainable AI APIs.

------------------- | -------------------------------------------------- |
| **Async/Await Native** | Clean middleware composition without callback hell |
| **Minimalist Core**    | Only what you need, add features via middleware    |
| **Context Object**     | Encapsulates request/response in a single object   |
| **Modern JavaScript**  | Built for ES2017+ with async functions             |
| **Lightweight**        | Smaller footprint than Express                     |
| **Error Handling**     | Elegant try/catch error handling in middleware     |

Koa is ideal for developers who prefer explicit control over their middleware stack and modern JavaScript patterns.

---

## CLI Usage

Start a Koa server via CLI:

```bash
# Foreground mode
neurolink serve --framework koa --port 3000

# Background mode
neurolink server start --framework koa --port 3000

# Check routes
neurolink server routes
```

---

## Quick Start

### Installation

Koa requires peer dependencies that are not bundled with NeuroLink:

```bash
# Install NeuroLink and Koa dependencies
npm install @juspay/neurolink koa @koa/router @koa/cors koa-bodyparser
```

### Basic Usage

```typescript

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "koa",
  config: {
    port: 3000,
    basePath: "/api",
  },
});

await server.initialize();
await server.start();

console.log("Koa server running on http://localhost:3000");
```

### Test the Server

```bash
# Health check
curl http://localhost:3000/api/health

# Execute agent
curl -X POST http://localhost:3000/api/agent/execute \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello, world!"}'
```

---

## Accessing the Underlying Koa App

For advanced customization, you can access the underlying Koa instance and router:

```typescript

const neurolink = new NeuroLink();

const server = await createServer(neurolink, {
  framework: "koa",
  config: { port: 3000 },
});

// Get the underlying Koa app
const app = server.getFrameworkInstance();

// Add Koa middleware directly
app.use(logger());
app.use(
  cors({
    origin: (ctx) => {
      const origin = ctx.request.headers.origin;
      return origin?.endsWith(".myapp.com") ? origin : "";
    },
    credentials: true,
  }),
);

// Add custom routes directly on the Koa app
app.use(async (ctx, next) => {
  if (ctx.path === "/custom") {
    ctx.body = { message: "Custom Koa route" };
    return;
  }
  await next();
});

await server.initialize();
await server.start();
```

### Accessing the Router

The server adapter uses `@koa/router` internally. For route-specific customization:

```typescript
const server = await createServer(neurolink, { framework: "koa" });
const app = server.getFrameworkInstance();

// Add routes before initialization
app.use(async (ctx, next) => {
  // Custom middleware for specific paths
  if (ctx.path.startsWith("/v2/")) {
    ctx.state.apiVersion = "v2";
  }
  await next();
});

await server.initialize();
await server.start();
```

---

## Configuration Options

### Full Configuration Example

```typescript
const server = await createServer(neurolink, {
  framework: "koa",
  config: {
    // Server settings
    port: 3000,
    host: "0.0.0.0",
    basePath: "/api",
    timeout: 30000, // 30 seconds

    // CORS
    cors: {
      enabled: true,
      origins: ["https://myapp.com", "https://staging.myapp.com"],
      methods: ["GET", "POST", "PUT", "DELETE"],
      headers: ["Content-Type", "Authorization", "X-Request-ID"],
      credentials: true,
      maxAge: 86400, // 24 hours
    },

    // Rate limiting
    rateLimit: {
      enabled: true,
      maxRequests: 100,
      windowMs: 60000, // 1 minute
      skipPaths: ["/api/health", "/api/ready"],
    },
    // Note: Rate-limited responses (HTTP 429) include a `Retry-After` header indicating seconds to wait.

    // Body parsing
    bodyParser: {
      enabled: true,
      maxSize: "10mb",
      jsonLimit: "10mb",
    },

    // Logging
    logging: {
      enabled: true,
      level: "info",
      includeBody: false,
      includeResponse: false,
    },

    // Documentation
    enableSwagger: true,
    enableMetrics: true,
  },
});
```

---

## Middleware Integration

### Using NeuroLink Middleware

```typescript

  createServer,
  createAuthMiddleware,
  createRateLimitMiddleware,
  createCacheMiddleware,
  createRequestIdMiddleware,
  createTimingMiddleware,
} from "@juspay/neurolink";

const server = await createServer(neurolink, {
  framework: "koa",
  config: { port: 3000 },
});

// Add request ID to all requests
server.registerMiddleware(createRequestIdMiddleware());

// Add timing headers
server.registerMiddleware(createTimingMiddleware());

// Add authentication
server.registerMiddleware(
  createAuthMiddleware({
    type: "bearer",
    validate: async (token) => {
      const decoded = await verifyJWT(token);
      return decoded ? { id: decoded.sub, roles: decoded.roles } : null;
    },
    skipPaths: ["/api/health", "/api/ready", "/api/version"],
  }),
);

// Add rate limiting
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 60000,
    keyGenerator: (ctx) =>
      ctx.headers["x-api-key"] || ctx.headers["x-forwarded-for"] || "unknown",
  }),
);

// Add response caching
server.registerMiddleware(
  createCacheMiddleware({
    ttlMs: 300000, // 5 minutes
    methods: ["GET"],
    excludePaths: ["/api/agent/execute", "/api/agent/stream"],
  }),
);

// Note: Cached responses include `X-Cache: HIT` header. Fresh responses include `X-Cache: MISS`.

await server.initialize();
await server.start();
```

### Using Koa Native Middleware

Koa has a rich ecosystem of middleware. You can use them directly:

```typescript

const server = await createServer(neurolink, { framework: "koa" });
const app = server.getFrameworkInstance();

// Security headers
app.use(helmet());

// Compression
app.use(
  compress({
    threshold: 2048,
    gzip: { flush: require("zlib").constants.Z_SYNC_FLUSH },
    deflate: { flush: require("zlib").constants.Z_SYNC_FLUSH },
  }),
);

// Session management
app.keys = ["your-session-secret"];
app.use(
  session(
    {
      key: "neurolink:sess",
      maxAge: 86400000,
      httpOnly: true,
      signed: true,
    },
    app,
  ),
);

// External rate limiting with Redis
const Redis = require("ioredis");
const redis = new Redis();

app.use(
  ratelimit({
    driver: "redis",
    db: redis,
    duration: 60000,
    max: 100,
    id: (ctx) => ctx.ip,
  }),
);

await server.initialize();
await server.start();
```

---

## Koa Context Patterns

### Accessing Koa Context in Custom Middleware

```typescript
const server = await createServer(neurolink, { framework: "koa" });
const app = server.getFrameworkInstance();

// Koa middleware has access to ctx (context)
app.use(async (ctx, next) => {
  // ctx.request - Koa Request object
  // ctx.response - Koa Response object
  // ctx.state - Recommended namespace for passing data through middleware
  // ctx.app - Application instance reference
  // ctx.cookies - Cookie handling

  ctx.state.startTime = Date.now();

  await next();

  const duration = Date.now() - ctx.state.startTime;
  ctx.set("X-Response-Time", `${duration}ms`);
});
```

### Error Handling with Koa

```typescript
const app = server.getFrameworkInstance();

// Error handling middleware (should be early in the chain)
app.use(async (ctx, next) => {
  try {
    await next();
  } catch (err) {
    const status = err.status || err.statusCode || 500;
    const message = err.expose ? err.message : "Internal Server Error";

    ctx.status = status;
    ctx.body = {
      error: {
        code: `HTTP_${status}`,
        message,
        requestId: ctx.state.requestId,
      },
    };

    // Emit error event for logging
    ctx.app.emit("error", err, ctx);
  }
});

// Listen for errors
app.on("error", (err, ctx) => {
  console.error("Server error:", {
    error: err.message,
    path: ctx?.path,
    method: ctx?.method,
  });
});
```

---

## Streaming Responses

Koa handles streaming naturally through its response handling:

```typescript
// The /api/agent/stream endpoint is automatically configured
// It uses Server-Sent Events (SSE) for streaming

// Client-side usage:
const response = await fetch("/api/agent/stream", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ input: "Write a story" }),
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  const text = decoder.decode(value);
  const lines = text.split("\n");

  for (const line of lines) {
    if (line.startsWith("data: ")) {
      const data = JSON.parse(line.slice(6));
      console.log(data);
    }
  }
}
```

### Custom Streaming Route

```typescript
const app = server.getFrameworkInstance();

app.use(async (ctx, next) => {
  if (ctx.path === "/api/custom-stream" && ctx.method === "POST") {
    ctx.set("Content-Type", "text/event-stream");
    ctx.set("Cache-Control", "no-cache");
    ctx.set("Connection", "keep-alive");
    ctx.set("X-Accel-Buffering", "no");

    ctx.status = 200;

    // Manual streaming
    for await (const chunk of neurolink.generateStream({
      prompt: ctx.request.body.prompt,
    })) {
      ctx.res.write(`data: ${JSON.stringify(chunk)}\n\n`);
    }

    ctx.res.write("data: [DONE]\n\n");
    ctx.res.end();
    return;
  }

  await next();
});
```

---

## Testing

### Unit Testing with Supertest

```typescript

describe("Koa API Server", () => {
  let server;
  let app;

  beforeAll(async () => {
    const neurolink = new NeuroLink({ defaultProvider: "openai" });
    server = await createServer(neurolink, { framework: "koa" });
    await server.initialize();
    app = server.getFrameworkInstance().callback();
  });

  afterAll(async () => {
    await server.stop();
  });

  it("should return health status", async () => {
    const res = await request(app).get("/api/health");

    expect(res.status).toBe(200);
    expect(res.body.data.status).toBe("ok");
  });

  it("should execute agent request", async () => {
    const res = await request(app)
      .post("/api/agent/execute")
      .send({ input: "Hello" })
      .set("Content-Type", "application/json");

    expect(res.status).toBe(200);
    expect(res.body.data).toBeDefined();
  });
});
```

---

## Production Checklist

- [ ] Configure environment variables securely
- [ ] Set appropriate CORS origins (not `*`)
- [ ] Enable rate limiting with reasonable limits
- [ ] Add authentication middleware
- [ ] Configure request timeouts
- [ ] Set body size limits
- [ ] Enable compression middleware
- [ ] Add security headers (koa-helmet)
- [ ] Configure logging with appropriate level
- [ ] Set up health check monitoring
- [ ] Configure error tracking (Sentry, etc.)
- [ ] Use process manager (PM2) for production

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Hono Adapter](/docs/guides/server-adapters/hono)** - Recommended framework for most use cases
- **[Express Adapter](/docs/sdk/framework-integration)** - Compare with Express adapter
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Authentication patterns
- **[Deployment Guide](/docs/guides/server-adapters/deployment)** - Production deployment strategies

---

## Additional Resources

- **[Koa Documentation](https://koajs.com/)** - Official Koa documentation
- **[Koa Wiki](https://github.com/koajs/koa/wiki)** - Community resources and middleware list
- **[@koa/router](https://github.com/koajs/router)** - Router middleware documentation

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Middleware Reference

<!-- Source: guides/server-adapters/middleware.md -->

# Middleware Reference

NeuroLink server adapters provide a comprehensive set of middleware components for common server operations. All middleware follows a consistent pattern and can be composed together for your specific use case.

---------------------------------- | ----------------------------------------- | ----- |
| `createTimingMiddleware()`            | Measures request duration                 | 0     |
| `createRequestIdMiddleware()`         | Generates/propagates request IDs          | 0     |
| `createErrorHandlingMiddleware()`     | Centralized error catching and formatting | 1     |
| `createSecurityHeadersMiddleware()`   | Adds security headers                     | 2     |
| `createLoggingMiddleware()`           | Request/response logging                  | 3     |
| `createRateLimitMiddleware()`         | Rate limiting                             | 5     |
| `createAbortSignalMiddleware()`       | Client disconnection detection            | 5     |
| `createCompressionMiddleware()`       | Response compression signaling            | 5     |
| `createAuthMiddleware()`              | Authentication                            | 10    |
| `createRequestValidationMiddleware()` | Request body/query/params validation      | 15    |
| `createCacheMiddleware()`             | Response caching                          | 20    |
| `createMCPBodyAttachmentMiddleware()` | MCP SDK body compatibility                | 10    |
| `createDeprecationMiddleware()`       | RFC 8594 deprecation headers              | 100   |

The `order` value determines execution sequence - lower numbers run first.

---

## Timing Middleware

Measures request duration and adds timing headers to responses.

### Usage

```typescript

server.registerMiddleware(createTimingMiddleware());
```

### Headers Set

| Header            | Description                                              | Example           |
| ----------------- | -------------------------------------------------------- | ----------------- |
| `X-Response-Time` | Total request processing time in milliseconds            | `45.23ms`         |
| `Server-Timing`   | Standard Server-Timing header for performance monitoring | `total;dur=45.23` |

### When to Use

- Always recommended for production servers
- Essential for performance monitoring and debugging
- Works with browser Developer Tools and APM systems

---

## Request ID Middleware

Ensures every request has a unique identifier for tracing and debugging.

### Configuration

```typescript
type RequestIdOptions = {
  /** Header name to check for existing ID (default: "x-request-id") */
  headerName?: string;
  /** Prefix for generated IDs (default: "req") */
  prefix?: string;
  /** Custom ID generator function */
  generator?: () => string;
};
```

### Usage

```typescript

// Basic usage
server.registerMiddleware(createRequestIdMiddleware());

// With custom options
server.registerMiddleware(
  createRequestIdMiddleware({
    headerName: "x-correlation-id",
    prefix: "neuro",
    generator: () => `neuro-${crypto.randomUUID()}`,
  }),
);
```

### Headers

| Header         | Direction | Description                                     |
| -------------- | --------- | ----------------------------------------------- |
| `X-Request-ID` | Request   | Propagates existing ID from client (if present) |
| `X-Request-ID` | Response  | Returns request ID for client-side correlation  |

### When to Use

- Always recommended for production servers
- Essential for distributed tracing
- Enables log correlation across services
- Helps with debugging and support tickets

---

## Error Handling Middleware

Catches errors and formats them consistently across all routes.

### Configuration

```typescript
type ErrorHandlingOptions = {
  /** Include stack trace in error response (default: false) */
  includeStack?: boolean;
  /** Custom error handler function */
  onError?: (error: Error, ctx: ServerContext) => unknown;
  /** Log errors to console (default: true) */
  logErrors?: boolean;
};
```

### Usage

```typescript

// Basic usage
server.registerMiddleware(createErrorHandlingMiddleware());

// Development mode with stack traces
server.registerMiddleware(
  createErrorHandlingMiddleware({
    includeStack: process.env.NODE_ENV === "development",
    logErrors: true,
  }),
);

// With custom error handler
server.registerMiddleware(
  createErrorHandlingMiddleware({
    onError: (error, ctx) => ({
      error: {
        code: "CUSTOM_ERROR",
        message: error.message,
        requestId: ctx.requestId,
      },
    }),
  }),
);
```

### Error Response Format

```json
{
  "error": {
    "code": "HTTP_500",
    "message": "Internal server error",
    "stack": "Error: Something went wrong\n    at ..." // Only if includeStack: true
  },
  "metadata": {
    "requestId": "req-1706745600000-abc123",
    "timestamp": "2024-02-01T12:00:00.000Z"
  }
}
```

### When to Use

- Always recommended for production servers
- Provides consistent error responses
- Prevents leaking sensitive information in production
- Enable stack traces only in development

---

## Security Headers Middleware

Adds common security headers to protect against various web vulnerabilities.

### Configuration

```typescript
type SecurityHeadersOptions = {
  /** Content Security Policy directive */
  contentSecurityPolicy?: string;
  /** X-Frame-Options (default: "DENY") */
  frameOptions?: "DENY" | "SAMEORIGIN" | false;
  /** X-Content-Type-Options (default: "nosniff") */
  contentTypeOptions?: "nosniff" | false;
  /** HSTS max-age in seconds (default: 31536000 = 1 year) */
  hstsMaxAge?: number | false;
  /** Referrer-Policy (default: "strict-origin-when-cross-origin") */
  referrerPolicy?: string | false;
  /** Additional custom headers */
  customHeaders?: Record;
};
```

### Usage

```typescript

// Basic usage with defaults
server.registerMiddleware(createSecurityHeadersMiddleware());

// With custom configuration
server.registerMiddleware(
  createSecurityHeadersMiddleware({
    contentSecurityPolicy:
      "default-src 'self'; script-src 'self' 'unsafe-inline'",
    frameOptions: "SAMEORIGIN",
    hstsMaxAge: 63072000, // 2 years
    customHeaders: {
      "X-Custom-Header": "value",
    },
  }),
);
```

### Headers Set

| Header                      | Default Value                         | Description                   |
| --------------------------- | ------------------------------------- | ----------------------------- |
| `X-Frame-Options`           | `DENY`                                | Prevents clickjacking         |
| `X-Content-Type-Options`    | `nosniff`                             | Prevents MIME sniffing        |
| `Strict-Transport-Security` | `max-age=31536000; includeSubDomains` | Enforces HTTPS                |
| `Referrer-Policy`           | `strict-origin-when-cross-origin`     | Controls referrer information |
| `X-XSS-Protection`          | `1; mode=block`                       | Legacy XSS protection         |
| `Content-Security-Policy`   | Not set by default                    | Content security policy       |

### When to Use

- Always recommended for production servers
- Required for security compliance (OWASP, PCI-DSS)
- Configure CSP based on your application needs
- Disable HSTS initially if not ready for HTTPS-only

---

## Logging Middleware

Logs request and response information with configurable detail levels.

### Configuration

```typescript
type LoggingOptions = {
  /** Log request body (default: false) */
  logBody?: boolean;
  /** Log response body (default: false) */
  logResponse?: boolean;
  /** Custom logger instance */
  logger?: {
    info: (message: string, data?: unknown) => void;
    error: (message: string, data?: unknown) => void;
  };
  /** Paths to skip logging (default: ["/health", "/ready", "/metrics"]) */
  skipPaths?: string[];
};
```

### Usage

```typescript

// Basic usage
server.registerMiddleware(createLoggingMiddleware());

// Development mode with body logging
server.registerMiddleware(
  createLoggingMiddleware({
    logBody: process.env.NODE_ENV === "development",
    logResponse: process.env.NODE_ENV === "development",
    skipPaths: ["/api/health", "/api/ready"],
  }),
);

// With custom logger (e.g., Winston, Pino)

const logger = pino();

server.registerMiddleware(
  createLoggingMiddleware({
    logger: {
      info: (msg, data) => logger.info(data, msg),
      error: (msg, data) => logger.error(data, msg),
    },
  }),
);
```

### Log Output

**Request Log:**

```
[Request] POST /api/agent/execute { requestId: "req-123", method: "POST", path: "/api/agent/execute" }
```

**Response Log:**

```
[Response] POST /api/agent/execute { requestId: "req-123", duration: "45ms", status: 200 }
```

**Error Log:**

```
[Error] POST /api/agent/execute { requestId: "req-123", duration: "12ms", error: "Invalid input", status: 400 }
```

### When to Use

- Always recommended for production servers
- Disable body logging in production for performance and privacy
- Use structured logging (JSON) for log aggregation systems
- Skip health check endpoints to reduce noise

---

## Compression Middleware

Signals compression preferences to adapters for response compression.

### Configuration

```typescript
type CompressionOptions = {
  /** Minimum response size to compress in bytes (default: 1024) */
  threshold?: number;
  /** Content types to compress */
  contentTypes?: string[];
};
```

### Usage

```typescript

// Basic usage
server.registerMiddleware(createCompressionMiddleware());

// With custom configuration
server.registerMiddleware(
  createCompressionMiddleware({
    threshold: 2048, // Only compress responses > 2KB
    contentTypes: ["text/", "application/json", "application/xml"],
  }),
);
```

### How It Works

This middleware stores compression preferences in the request context metadata. The actual compression is handled by the underlying framework (Hono, Express, etc.) or a reverse proxy.

### When to Use

- Recommended for responses larger than 1KB
- Works best with text-based content (JSON, HTML, XML)
- Consider disabling for already-compressed content (images, videos)
- Often handled at reverse proxy level (nginx, CloudFlare)

---

## Abort Signal Middleware

Provides client disconnection handling for long-running requests using AbortController.

### Configuration

```typescript
type AbortSignalMiddlewareOptions = {
  /** Callback when abort is triggered */
  onAbort?: (ctx: ServerContext) => void;
  /** Request timeout in milliseconds */
  timeout?: number;
};
```

### Usage

```typescript

// Basic usage
server.registerMiddleware(createAbortSignalMiddleware());

// With timeout and abort callback
server.registerMiddleware(
  createAbortSignalMiddleware({
    timeout: 30000, // 30 seconds
    onAbort: (ctx) => {
      console.log(`Request ${ctx.requestId} was aborted`);
    },
  }),
);
```

### Using the Abort Signal in Route Handlers

```typescript
server.registerRoute({
  method: "POST",
  path: "/api/long-running",
  handler: async (ctx) => {
    const signal = ctx.abortSignal;

    // Pass signal to cancellable operations
    const result = await longRunningOperation({ signal });

    // Check if aborted before continuing
    if (signal?.aborted) {
      throw new Error("Request was cancelled");
    }

    return result;
  },
});
```

### Express-Specific Middleware

For Express applications, use the specialized Express middleware:

```typescript

app.use(
  createExpressAbortMiddleware({
    onAbort: () => console.log("Client disconnected"),
  }),
);

app.get("/api/stream", (req, res) => {
  const signal = res.locals.abortSignal;
  // Use signal for cancellation
});
```

### When to Use

- Long-running operations (AI generation, file processing)
- Streaming endpoints where client might disconnect
- Operations that should be cancelled on timeout
- Preventing resource waste on abandoned requests

---

## MCP Body Attachment Middleware

Bridges the gap between Fastify's body parsing and the MCP SDK's body access pattern.

### Usage

```typescript

// General middleware for any adapter
server.registerMiddleware(createMCPBodyAttachmentMiddleware());
```

### Fastify-Specific Hook

For optimal Fastify integration, use the dedicated preHandler hook:

```typescript

fastify.addHook("preHandler", fastifyMCPBodyHook);
```

### How It Works

The MCP SDK reads the request body from `request.raw.body`, but Fastify parses the body separately into `request.body`. This middleware attaches the parsed body to `request.raw.body` for MCP SDK compatibility.

### When to Use

- Required when using MCP routes with Fastify
- Not needed for Hono, Express, or Koa adapters
- Applied automatically by the Fastify adapter

---

## Deprecation Middleware

Adds RFC 8594 compliant deprecation headers to responses for deprecated routes.

### Configuration

```typescript
type DeprecationConfig = {
  /** Array of route definitions to check for deprecation */
  routes: RouteDefinition[];
  /** Custom header name for deprecation notice (default: "X-Deprecation-Notice") */
  noticeHeader?: string;
  /** Include Link header for alternative routes (default: true) */
  includeLink?: boolean;
};

type RouteDeprecation = {
  enabled: boolean;
  since?: string; // Version when deprecated
  removeIn?: string; // Version when removed
  alternative?: string; // Replacement endpoint
  message?: string; // Custom message
};
```

### Usage

```typescript

const routes = [
  {
    method: "GET",
    path: "/api/v1/users",
    handler: handleUsers,
    deprecated: {
      enabled: true,
      since: "2.0.0",
      removeIn: "3.0.0",
      alternative: "/api/v2/users",
      message: "Use /api/v2/users for improved performance",
    },
  },
];

server.registerMiddleware(createDeprecationMiddleware({ routes }));
```

### Headers Set

| Header                 | Description                                       | Example                                      |
| ---------------------- | ------------------------------------------------- | -------------------------------------------- |
| `Deprecation`          | RFC 8594 deprecation indicator                    | `true`                                       |
| `Sunset`               | When the endpoint will be removed (HTTP-date)     | `Sun, 01 Jun 2025 00:00:00 GMT`              |
| `Link`                 | Alternative endpoint with rel="successor-version" | `; rel="successor-version"`   |
| `X-Deprecation-Notice` | Human-readable deprecation message                | `Use /api/v2/users for improved performance` |

### When to Use

- API versioning migrations
- Feature deprecation announcements
- Gradual API evolution
- Compliance with RFC 8594

---

## Rate Limit Middleware

Provides configurable rate limiting with multiple algorithms.

### Configuration

```typescript
type RateLimitMiddlewareConfig = {
  /** Maximum requests per window */
  maxRequests: number;
  /** Time window in milliseconds */
  windowMs: number;
  /** Custom error message */
  message?: string;
  /** Skip rate limiting for certain paths */
  skipPaths?: string[];
  /** Custom key generator (default: IP address) */
  keyGenerator?: (ctx: ServerContext) => string;
  /** Custom response handler for rate limit exceeded */
  onRateLimitExceeded?: (ctx: ServerContext, retryAfter: number) => unknown;
  /** Custom rate limit store (default: in-memory) */
  store?: RateLimitStore;
};
```

### Usage

```typescript

  createRateLimitMiddleware,
  createSlidingWindowRateLimitMiddleware,
  InMemoryRateLimitStore,
} from "@juspay/neurolink";

// Fixed window rate limiting
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 15 * 60 * 1000, // 15 minutes
    skipPaths: ["/api/health"],
  }),
);

// Sliding window rate limiting (more accurate)
server.registerMiddleware(
  createSlidingWindowRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 15 * 60 * 1000,
    subWindows: 10, // Number of sub-windows for smoothing
  }),
);

// Rate limit by user ID instead of IP
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 1000,
    windowMs: 60 * 60 * 1000, // 1 hour
    keyGenerator: (ctx) =>
      ctx.user?.id || ctx.headers["x-forwarded-for"] || "unknown",
  }),
);
```

### Headers Set

| Header                  | Description                            | Example      |
| ----------------------- | -------------------------------------- | ------------ |
| `X-RateLimit-Limit`     | Maximum requests allowed per window    | `100`        |
| `X-RateLimit-Remaining` | Requests remaining in current window   | `95`         |
| `X-RateLimit-Reset`     | Unix timestamp when the window resets  | `1706746200` |
| `Retry-After`           | Seconds to wait (only on 429 response) | `300`        |

### Custom Rate Limit Store (Redis)

```typescript

class RedisRateLimitStore implements RateLimitStore {
  constructor(private redis: Redis) {}

  async get(key: string): Promise {
    const data = await this.redis.get(`ratelimit:${key}`);
    return data ? JSON.parse(data) : undefined;
  }

  async set(key: string, entry: RateLimitEntry): Promise {
    const ttl = Math.ceil((entry.resetAt - Date.now()) / 1000);
    await this.redis.setex(`ratelimit:${key}`, ttl, JSON.stringify(entry));
  }

  async increment(key: string, windowMs: number): Promise {
    const now = Date.now();
    const resetAt = now + windowMs;
    const count = await this.redis.incr(`ratelimit:${key}`);

    if (count === 1) {
      await this.redis.pexpire(`ratelimit:${key}`, windowMs);
    }

    return { count, resetAt };
  }

  async reset(key: string): Promise {
    await this.redis.del(`ratelimit:${key}`);
  }
}

const redisStore = new RedisRateLimitStore(new Redis());
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 60000,
    store: redisStore,
  }),
);
```

### When to Use

- API abuse prevention
- Fair usage enforcement
- Cost control for expensive operations
- Protection against DDoS attacks

---

## Authentication Middleware

Provides flexible authentication support with multiple strategies.

### Configuration

```typescript
type AuthConfig = {
  /** Authentication type */
  type: "bearer" | "api-key" | "basic" | "custom";
  /** Token validation function */
  validate: (token: string, ctx: ServerContext) => Promise;
  /** Header name for token */
  headerName?: string;
  /** Skip authentication for certain paths */
  skipPaths?: string[];
  /** Custom error message */
  errorMessage?: string;
  /** Token extractor for custom auth schemes */
  extractToken?: (ctx: ServerContext) => string | null;
  /** Skip auth for dev playground requests (default: true) */
  skipDevPlayground?: boolean;
};

type AuthResult = {
  id: string;
  email?: string;
  roles?: string[];
  metadata?: Record;
};
```

### Usage

```typescript

  createAuthMiddleware,
  createBearerAuthMiddleware,
  createApiKeyAuthMiddleware,
  createRoleMiddleware,
  ApiKeyStore,
} from "@juspay/neurolink";

// Bearer token authentication
server.registerMiddleware(
  createAuthMiddleware({
    type: "bearer",
    validate: async (token) => {
      const user = await verifyJWT(token);
      return user
        ? { id: user.id, email: user.email, roles: user.roles }
        : null;
    },
    skipPaths: ["/api/health", "/api/ready"],
  }),
);

// API key authentication
const apiKeyStore = new ApiKeyStore();
apiKeyStore.addKey("sk_live_abc123", { id: "user_1", roles: ["admin"] });

server.registerMiddleware(
  createApiKeyAuthMiddleware(apiKeyStore, {
    headerName: "x-api-key",
    skipPaths: ["/api/health"],
  }),
);

// Role-based access control (after authentication)
server.registerMiddleware(
  createRoleMiddleware({
    requiredRoles: ["admin"],
    requireAll: false, // Any role matches
    errorMessage: "Admin access required",
  }),
);
```

### Headers Read

| Header          | Auth Type     | Description                          |
| --------------- | ------------- | ------------------------------------ |
| `Authorization` | bearer, basic | `Bearer ` or `Basic ` |
| `X-API-Key`     | api-key       | Raw API key value                    |

### Dev Playground Support

In non-production environments, requests with `X-NeuroLink-Dev-Playground: true` header bypass authentication and receive a default developer user context.

### When to Use

- Protecting API endpoints
- User identification and authorization
- Rate limiting by user
- Audit logging

---

## Request Validation Middleware

Provides schema-based request validation for body, query, params, and headers.

### Configuration

```typescript
type ValidationConfig = {
  /** Schema for validating request body */
  bodySchema?: ValidationSchema;
  /** Schema for validating query parameters */
  querySchema?: ValidationSchema;
  /** Schema for validating path parameters */
  paramsSchema?: ValidationSchema;
  /** Schema for validating headers */
  headersSchema?: ValidationSchema;
  /** Custom validation function */
  customValidator?: (ctx: ServerContext) => Promise;
  /** Skip validation for certain paths */
  skipPaths?: string[];
  /** Custom error formatter */
  errorFormatter?: (errors: ValidationError[]) => unknown;
};

type ValidationSchema = {
  required?: string[];
  properties?: Record;
  additionalProperties?: boolean;
};

type PropertySchema = {
  type: "string" | "number" | "boolean" | "object" | "array";
  minimum?: number;
  maximum?: number;
  minLength?: number;
  maxLength?: number;
  minItems?: number;
  maxItems?: number;
  pattern?: string;
  enum?: unknown[];
  default?: unknown;
  validate?: (value: unknown) => boolean | string;
};
```

### Usage

```typescript

  createRequestValidationMiddleware,
  createBodyValidationMiddleware,
  createQueryValidationMiddleware,
  CommonSchemas,
} from "@juspay/neurolink";

// Full validation
server.registerMiddleware(
  createRequestValidationMiddleware({
    bodySchema: {
      required: ["input"],
      properties: {
        input: { type: "string", minLength: 1, maxLength: 10000 },
        temperature: { type: "number", minimum: 0, maximum: 2 },
        provider: { type: "string", enum: ["openai", "anthropic", "google"] },
      },
    },
    querySchema: {
      properties: {
        stream: { type: "boolean" },
      },
    },
  }),
);

// Body-only validation (convenience function)
server.registerMiddleware(
  createBodyValidationMiddleware({
    required: ["name", "email"],
    properties: {
      name: { type: "string", minLength: 1 },
      email: { type: "string", pattern: "^[^@]+@[^@]+\\.[^@]+$" },
    },
  }),
);

// Custom validation
server.registerMiddleware(
  createRequestValidationMiddleware({
    customValidator: async (ctx) => {
      if (ctx.body?.startDate > ctx.body?.endDate) {
        throw new ValidationError([
          {
            field: "dateRange",
            message: "startDate must be before endDate",
          },
        ]);
      }
    },
  }),
);
```

### Error Response Format

```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Request validation failed",
    "details": [
      { "field": "body.input", "message": "input is required" },
      { "field": "body.temperature", "message": "Value must be at most 2" }
    ]
  }
}
```

### Common Schemas

Pre-built schemas for common validation patterns:

```typescript

// Use pagination schema
server.registerMiddleware(
  createQueryValidationMiddleware(CommonSchemas.pagination),
);
```

| Schema       | Fields                        |
| ------------ | ----------------------------- |
| `uuid`       | UUID string format            |
| `email`      | Email string format           |
| `pagination` | `page`, `limit`, `offset`     |
| `sorting`    | `sortBy`, `sortOrder`         |
| `idParam`    | Required `id` parameter       |
| `dateRange`  | `startDate`, `endDate`        |
| `search`     | `q` (query), `fields` (array) |

### When to Use

- Input sanitization and security
- API contract enforcement
- Early error detection
- Documentation generation (with OpenAPI)

---

## Cache Middleware

Provides response caching with LRU eviction and configurable TTL.

### Configuration

```typescript
type CacheConfig = {
  /** Default TTL in milliseconds */
  ttlMs: number;
  /** Maximum cache size (default: 1000 entries) */
  maxSize?: number;
  /** Custom key generator */
  keyGenerator?: (ctx: ServerContext) => string;
  /** Methods to cache (default: ["GET"]) */
  methods?: string[];
  /** Paths to cache */
  paths?: string[];
  /** Paths to exclude from caching */
  excludePaths?: string[];
  /** Custom cache store */
  store?: CacheStore;
  /** Include query params in cache key (default: true) */
  includeQuery?: boolean;
  /** Custom TTL per path pattern */
  ttlByPath?: Record;
};
```

### Usage

```typescript

  createCacheMiddleware,
  InMemoryCacheStore,
  ResponseCacheStore,
} from "@juspay/neurolink";

// Basic caching
server.registerMiddleware(
  createCacheMiddleware({
    ttlMs: 60 * 1000, // 1 minute
    methods: ["GET"],
    excludePaths: ["/api/health", "/api/agent/stream"],
  }),
);

// With custom TTL per path
server.registerMiddleware(
  createCacheMiddleware({
    ttlMs: 60 * 1000,
    ttlByPath: {
      "/api/providers": 300 * 1000, // 5 minutes
      "/api/models": 600 * 1000, // 10 minutes
    },
  }),
);

// Using ResponseCacheStore for synchronous access
const cacheStore = new ResponseCacheStore(1000, 60000);
cacheStore.set("GET:/api/data", { status: 200, data: [] });
const cached = cacheStore.get("GET:/api/data");
```

### Headers Set

| Header          | Value        | Description                         |
| --------------- | ------------ | ----------------------------------- |
| `X-Cache`       | `HIT`        | Response served from cache          |
| `X-Cache`       | `MISS`       | Response freshly generated          |
| `X-Cache-Age`   | `45`         | Seconds since cached (only on HIT)  |
| `Cache-Control` | `max-age=60` | Browser caching directive (on MISS) |

### When to Use

- Expensive operations (database queries, AI generation)
- Frequently requested static data
- Rate limit budget optimization
- Reducing latency for repeated requests

---

## Composing Middleware

Middleware are executed in order based on their `order` property. Here's a recommended production setup:

```typescript

  createTimingMiddleware,
  createRequestIdMiddleware,
  createErrorHandlingMiddleware,
  createSecurityHeadersMiddleware,
  createLoggingMiddleware,
  createRateLimitMiddleware,
  createAuthMiddleware,
  createRequestValidationMiddleware,
  createCacheMiddleware,
} from "@juspay/neurolink";

// Register middleware in recommended order
const middlewares = [
  createTimingMiddleware(),
  createRequestIdMiddleware(),
  createErrorHandlingMiddleware({ includeStack: isDev }),
  createSecurityHeadersMiddleware(),
  createLoggingMiddleware({ skipPaths: ["/api/health"] }),
  createRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 60000,
    skipPaths: ["/api/health"],
  }),
  createAuthMiddleware({
    type: "bearer",
    validate: verifyToken,
    skipPaths: ["/api/health", "/api/docs"],
  }),
  createCacheMiddleware({
    ttlMs: 60000,
    methods: ["GET"],
    excludePaths: ["/api/agent"],
  }),
];

for (const middleware of middlewares) {
  server.registerMiddleware(middleware);
}
```

---

## Next Steps

- **[Configuration Reference](/docs/reference/server-configuration)** - Full server configuration options
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Authentication and authorization patterns
- **[Deployment Guide](/docs/guides/server-adapters/deployment)** - Production deployment strategies
- **[Express Adapter](/docs/sdk/framework-integration)** - Express-specific middleware integration
- **[Fastify Adapter](/docs/sdk/framework-integration)** - Fastify-specific hooks and plugins

---

## Streaming Guide

<!-- Source: guides/server-adapters/streaming.md -->

# Streaming Guide

NeuroLink server adapters provide a robust streaming infrastructure for delivering AI responses in real-time. This guide covers the Data Stream Protocol, event types, streaming formats, and client-side consumption patterns.

## Quick Start

The `/api/agent/stream` endpoint is automatically available on all server adapters:

```bash
curl -X POST http://localhost:3000/api/agent/stream \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"input": "Write a haiku about coding"}'
```

**Response (SSE format):**

```
event: text-start
data: {"id":"text-1738000000000"}

event: text-delta
data: {"id":"text-1738000000000","delta":"Silent"}

event: text-delta
data: {"id":"text-1738000000000","delta":" keystrokes"}

event: text-delta
data: {"id":"text-1738000000000","delta":" flow"}

event: text-end
data: {"id":"text-1738000000000"}

event: finish
data: {"reason":"stop","usage":{"input":10,"output":15,"total":25}}
```

---

## Stream Event Types

NeuroLink defines 8 event types for comprehensive streaming:

### Text Events

| Event        | Description                              | Data Fields   |
| ------------ | ---------------------------------------- | ------------- |
| `text-start` | Signals the beginning of a text response | `id`          |
| `text-delta` | Contains a chunk of generated text       | `id`, `delta` |
| `text-end`   | Signals the end of a text response       | `id`          |

### Tool Events

| Event         | Description                              | Data Fields               |
| ------------- | ---------------------------------------- | ------------------------- |
| `tool-call`   | Notification that a tool is being called | `id`, `name`, `arguments` |
| `tool-result` | Result returned from a tool execution    | `id`, `name`, `result`    |

### Control Events

| Event    | Description                     | Data Fields       |
| -------- | ------------------------------- | ----------------- |
| `data`   | Arbitrary data payload          | `any`             |
| `error`  | Error occurred during streaming | `message`, `code` |
| `finish` | Stream completed                | `reason`, `usage` |

---

## DataStreamWriter Interface

The `DataStreamWriter` interface provides methods for writing structured stream events:

```typescript

const writer = createDataStreamWriter({
  write: (chunk: string) => res.write(chunk),
  close: () => res.end(),
  format: "sse", // or "ndjson"
  includeTimestamps: true,
});

// Write text events
await writer.writeTextStart("response-1");
await writer.writeTextDelta("response-1", "Hello, ");
await writer.writeTextDelta("response-1", "world!");
await writer.writeTextEnd("response-1");

// Write tool events
await writer.writeToolCall({
  id: "tool-1",
  name: "getCurrentTime",
  arguments: { timezone: "UTC" },
});

await writer.writeToolResult({
  id: "tool-1",
  name: "getCurrentTime",
  result: { time: "2026-02-02T10:30:00Z" },
});

// Write arbitrary data
await writer.writeData({ customField: "value" });

// Write error
await writer.writeError({
  message: "Something went wrong",
  code: "STREAM_ERROR",
});

// Close the stream
await writer.close();
```

### Interface Methods

| Method                        | Description                  |
| ----------------------------- | ---------------------------- |
| `writeTextStart(id)`          | Begin a text response block  |
| `writeTextDelta(id, delta)`   | Write a text chunk           |
| `writeTextEnd(id)`            | End a text response block    |
| `writeToolCall(toolCall)`     | Notify of a tool invocation  |
| `writeToolResult(toolResult)` | Report tool execution result |
| `writeData(data)`             | Write arbitrary JSON data    |
| `writeError(error)`           | Report an error              |
| `close()`                     | Close the stream             |

---

## DataStreamResponse Class

For convenience, use `DataStreamResponse` to create a complete streaming response:

```typescript

  DataStreamResponse,
  createDataStreamResponse,
} from "@juspay/neurolink";

// Option 1: Using the class directly
const streamResponse = new DataStreamResponse({
  contentType: "text/event-stream",
  keepAliveInterval: 15000, // 15 seconds
  includeTimestamps: true,
});

// Write events directly on the response
await streamResponse.writeTextStart("msg-1");
await streamResponse.writeTextDelta("msg-1", "Streaming content...");
await streamResponse.writeTextEnd("msg-1");

// Finish with usage statistics
await streamResponse.finish({
  reason: "stop",
  usage: { input: 10, output: 25, total: 35 },
});

// Option 2: Using the factory function
const response = createDataStreamResponse({
  contentType: "application/x-ndjson",
  keepAliveInterval: 30000,
});
```

### Configuration Options

| Option              | Type                                              | Default               | Description                   |
| ------------------- | ------------------------------------------------- | --------------------- | ----------------------------- |
| `contentType`       | `"text/event-stream"` \| `"application/x-ndjson"` | `"text/event-stream"` | Stream format                 |
| `headers`           | `Record`                          | `{}`                  | Additional response headers   |
| `keepAliveInterval` | `number`                                          | `undefined`           | Keep-alive ping interval (ms) |
| `includeTimestamps` | `boolean`                                         | `true`                | Include timestamps in events  |

---

## SSE vs NDJSON Formats

NeuroLink supports two streaming formats. Choose based on your requirements:

### Server-Sent Events (SSE)

**Content-Type:** `text/event-stream`

**Best for:**

- Browser-based clients using `EventSource`
- Standard HTTP/1.1 connections
- Automatic reconnection handling
- Event type differentiation

**Format example:**

```
event: text-delta
data: {"id":"msg-1","delta":"Hello"}
id: msg-1

event: text-delta
data: {"id":"msg-1","delta":" world"}
id: msg-1

```

**Client-side usage:**

```typescript
const eventSource = new EventSource("/api/agent/stream");

eventSource.addEventListener("text-delta", (event) => {
  const data = JSON.parse(event.data);
  console.log(data.delta);
});

eventSource.addEventListener("finish", (event) => {
  const data = JSON.parse(event.data);
  console.log("Stream finished:", data.reason);
  eventSource.close();
});

eventSource.addEventListener("error", (event) => {
  console.error("Stream error:", event);
});
```

### Newline-Delimited JSON (NDJSON)

**Content-Type:** `application/x-ndjson`

**Best for:**

- Server-to-server communication
- Custom stream processing
- Simpler parsing logic
- HTTP/2 connections

**Format example:**

```json
{"type":"text-delta","id":"msg-1","timestamp":1738000000000,"data":{"id":"msg-1","delta":"Hello"}}
{"type":"text-delta","id":"msg-1","timestamp":1738000000001,"data":{"id":"msg-1","delta":" world"}}
{"type":"finish","timestamp":1738000000100,"data":{"reason":"stop"}}
```

**Client-side usage:**

```typescript
const response = await fetch("/api/agent/stream", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Accept: "application/x-ndjson",
  },
  body: JSON.stringify({ input: "Hello" }),
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop() || "";

  for (const line of lines) {
    if (line.trim()) {
      const event = JSON.parse(line);
      console.log(event.type, event.data);
    }
  }
}
```

### Header Helper Functions

```typescript

// SSE headers
const sseHeaders = createSSEHeaders({
  "X-Custom-Header": "value",
});
// Returns:
// {
//   "Content-Type": "text/event-stream",
//   "Cache-Control": "no-cache, no-transform",
//   "Connection": "keep-alive",
//   "X-Accel-Buffering": "no",
//   "X-Custom-Header": "value"
// }

// NDJSON headers
const ndjsonHeaders = createNDJSONHeaders({
  "X-Custom-Header": "value",
});
// Returns:
// {
//   "Content-Type": "application/x-ndjson",
//   "Cache-Control": "no-cache",
//   "Connection": "keep-alive",
//   "X-Custom-Header": "value"
// }
```

---

## StreamingConfig

Configure streaming behavior in route definitions:

```typescript

const streamingConfig: StreamingConfig = {
  enabled: true,
  contentType: "text/event-stream",
  keepAliveInterval: 15000, // 15 seconds
};

const customStreamRoute: RouteDefinition = {
  method: "POST",
  path: "/api/custom-stream",
  handler: async (ctx) => {
    // Return an async iterable for streaming
    return generateStream(ctx.body);
  },
  streaming: streamingConfig,
  description: "Custom streaming endpoint",
  tags: ["streaming"],
};
```

### Configuration Fields

| Field               | Type                                              | Default     | Description                        |
| ------------------- | ------------------------------------------------- | ----------- | ---------------------------------- |
| `enabled`           | `boolean`                                         | `true`      | Enable streaming for this route    |
| `contentType`       | `"text/event-stream"` \| `"application/x-ndjson"` | SSE         | Stream format                      |
| `keepAliveInterval` | `number`                                          | `undefined` | Interval for keep-alive pings (ms) |

---

## Code Examples

### Basic Streaming Response

```typescript

const neurolink = new NeuroLink({ defaultProvider: "openai" });

const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

// Register a custom streaming route
server.registerRoute({
  method: "POST",
  path: "/api/generate-stream",
  handler: async (ctx) => {
    const { prompt } = ctx.body as { prompt: string };

    const streamResponse = new DataStreamResponse({
      contentType: "text/event-stream",
      keepAliveInterval: 15000,
    });

    // Start streaming in background
    (async () => {
      const textId = `text-${Date.now()}`;

      try {
        await streamResponse.writeTextStart(textId);

        for await (const chunk of neurolink.generateStream({ prompt })) {
          if (chunk.content) {
            await streamResponse.writeTextDelta(textId, chunk.content);
          }
        }

        await streamResponse.writeTextEnd(textId);
        await streamResponse.finish({ reason: "stop" });
      } catch (error) {
        await streamResponse.writeError({
          message: error.message,
          code: "GENERATION_ERROR",
        });
        streamResponse.close();
      }
    })();

    // Return the stream
    return new Response(streamResponse.stream, {
      headers: streamResponse.headers,
    });
  },
  streaming: { enabled: true, contentType: "text/event-stream" },
  description: "Stream AI-generated content",
  tags: ["streaming", "generation"],
});

await server.initialize();
await server.start();
```

### Tool Call Streaming

```typescript

  DataStreamResponse,
  pipeAsyncIterableToDataStream,
} from "@juspay/neurolink";

server.registerRoute({
  method: "POST",
  path: "/api/agent-stream",
  handler: async (ctx) => {
    const { input, tools } = ctx.body as { input: string; tools?: string[] };

    const streamResponse = new DataStreamResponse();

    (async () => {
      const textId = `agent-${Date.now()}`;

      try {
        await streamResponse.writeTextStart(textId);

        for await (const event of neurolink.streamWithTools({
          prompt: input,
          tools: tools || [],
        })) {
          switch (event.type) {
            case "text-delta":
              await streamResponse.writeTextDelta(textId, event.content);
              break;

            case "tool-call":
              await streamResponse.writeToolCall({
                id: event.toolCallId,
                name: event.toolName,
                arguments: event.args,
              });
              break;

            case "tool-result":
              await streamResponse.writeToolResult({
                id: event.toolCallId,
                name: event.toolName,
                result: event.result,
              });
              break;
          }
        }

        await streamResponse.writeTextEnd(textId);
        await streamResponse.finish({ reason: "stop" });
      } catch (error) {
        await streamResponse.writeError({
          message: error.message,
          code: "AGENT_ERROR",
        });
        streamResponse.close();
      }
    })();

    return new Response(streamResponse.stream, {
      headers: streamResponse.headers,
    });
  },
  streaming: { enabled: true },
  tags: ["streaming", "tools"],
});
```

### Error Handling in Streams

```typescript

async function handleStreamWithErrors(
  neurolink: NeuroLink,
  prompt: string,
): Promise {
  const streamResponse = new DataStreamResponse({
    contentType: "text/event-stream",
  });

  (async () => {
    const textId = `text-${Date.now()}`;

    try {
      await streamResponse.writeTextStart(textId);

      for await (const chunk of neurolink.generateStream({ prompt })) {
        // Check if stream was closed by client
        if (streamResponse.isClosed()) {
          console.log("Client disconnected, stopping generation");
          return;
        }

        if (chunk.content) {
          await streamResponse.writeTextDelta(textId, chunk.content);
        }
      }

      await streamResponse.writeTextEnd(textId);
      await streamResponse.finish({ reason: "stop" });
    } catch (error) {
      // Handle different error types
      if (error.name === "AbortError") {
        await streamResponse.writeError({
          message: "Request was cancelled",
          code: "STREAM_ABORTED",
        });
      } else if (error.message.includes("rate limit")) {
        await streamResponse.writeError({
          message: "Rate limit exceeded, please retry later",
          code: "RATE_LIMIT_EXCEEDED",
        });
      } else if (error.message.includes("context length")) {
        await streamResponse.writeError({
          message: "Input too long for model context window",
          code: "CONTEXT_LENGTH_EXCEEDED",
        });
      } else {
        await streamResponse.writeError({
          message: "An error occurred during generation",
          code: "GENERATION_ERROR",
        });
      }

      streamResponse.close();
    }
  })();

  return new Response(streamResponse.stream, {
    headers: streamResponse.headers,
  });
}
```

### Using pipeAsyncIterableToDataStream

For simpler cases, use the helper function:

```typescript

  DataStreamResponse,
  pipeAsyncIterableToDataStream,
} from "@juspay/neurolink";

server.registerRoute({
  method: "POST",
  path: "/api/simple-stream",
  handler: async (ctx) => {
    const { prompt } = ctx.body as { prompt: string };

    const streamResponse = new DataStreamResponse();

    // Pipe the async iterable directly to the stream
    pipeAsyncIterableToDataStream(
      neurolink.generateStream({ prompt }),
      streamResponse,
      {
        textId: `text-${Date.now()}`,
        onChunk: (chunk) => console.log("Chunk received:", chunk),
        onError: (error) => console.error("Stream error:", error),
      },
    ).catch(console.error);

    return new Response(streamResponse.stream, {
      headers: streamResponse.headers,
    });
  },
  streaming: { enabled: true },
});
```

### Client-Side Consumption (Browser)

**Using EventSource (SSE):**

```typescript
function streamWithEventSource(input: string): void {
  // Note: EventSource only supports GET requests
  // Use fetch for POST requests with SSE

  const eventSource = new EventSource(
    `/api/agent/stream?input=${encodeURIComponent(input)}`,
  );

  let content = "";

  eventSource.addEventListener("text-start", (event) => {
    console.log("Stream started");
  });

  eventSource.addEventListener("text-delta", (event) => {
    const data = JSON.parse(event.data);
    content += data.delta;
    updateUI(content);
  });

  eventSource.addEventListener("text-end", (event) => {
    console.log("Text complete");
  });

  eventSource.addEventListener("tool-call", (event) => {
    const data = JSON.parse(event.data);
    console.log(`Tool called: ${data.name}`, data.arguments);
    showToolIndicator(data.name);
  });

  eventSource.addEventListener("tool-result", (event) => {
    const data = JSON.parse(event.data);
    console.log(`Tool result: ${data.name}`, data.result);
    hideToolIndicator(data.name);
  });

  eventSource.addEventListener("finish", (event) => {
    const data = JSON.parse(event.data);
    console.log("Stream finished:", data);
    eventSource.close();
  });

  eventSource.addEventListener("error", (event) => {
    console.error("Stream error:", event);
    eventSource.close();
  });
}
```

**Using Fetch API (for POST requests):**

```typescript
async function streamWithFetch(input: string): Promise {
  const response = await fetch("/api/agent/stream", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "text/event-stream",
    },
    body: JSON.stringify({ input }),
  });

  if (!response.ok) {
    throw new Error(`HTTP error: ${response.status}`);
  }

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  let content = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });

    // Parse SSE format
    const lines = buffer.split("\n\n");
    buffer = lines.pop() || "";

    for (const block of lines) {
      const eventMatch = block.match(/^event: (.+)$/m);
      const dataMatch = block.match(/^data: (.+)$/m);

      if (eventMatch && dataMatch) {
        const eventType = eventMatch[1];
        const data = JSON.parse(dataMatch[1]);

        switch (eventType) {
          case "text-delta":
            content += data.delta;
            updateUI(content);
            break;
          case "tool-call":
            showToolCall(data);
            break;
          case "tool-result":
            showToolResult(data);
            break;
          case "error":
            showError(data.message);
            break;
          case "finish":
            console.log("Complete:", data);
            break;
        }
      }
    }
  }
}
```

**React Hook Example:**

```typescript

type StreamState = {
  content: string;
  isStreaming: boolean;
  error: string | null;
  toolCalls: Array;
};

function useStream() {
  const [state, setState] = useState({
    content: "",
    isStreaming: false,
    error: null,
    toolCalls: [],
  });

  const stream = useCallback(async (input: string) => {
    setState({ content: "", isStreaming: true, error: null, toolCalls: [] });

    try {
      const response = await fetch("/api/agent/stream", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ input }),
      });

      const reader = response.body!.getReader();
      const decoder = new TextDecoder();
      let buffer = "";

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split("\n\n");
        buffer = lines.pop() || "";

        for (const block of lines) {
          const eventMatch = block.match(/^event: (.+)$/m);
          const dataMatch = block.match(/^data: (.+)$/m);

          if (eventMatch && dataMatch) {
            const eventType = eventMatch[1];
            const data = JSON.parse(dataMatch[1]);

            switch (eventType) {
              case "text-delta":
                setState((prev) => ({
                  ...prev,
                  content: prev.content + data.delta,
                }));
                break;
              case "tool-call":
                setState((prev) => ({
                  ...prev,
                  toolCalls: [
                    ...prev.toolCalls,
                    { name: data.name, arguments: data.arguments },
                  ],
                }));
                break;
              case "error":
                setState((prev) => ({ ...prev, error: data.message }));
                break;
            }
          }
        }
      }
    } catch (error) {
      setState((prev) => ({
        ...prev,
        error: error instanceof Error ? error.message : "Stream failed",
      }));
    } finally {
      setState((prev) => ({ ...prev, isStreaming: false }));
    }
  }, []);

  return { ...state, stream };
}

// Usage in component
function ChatComponent() {
  const { content, isStreaming, error, toolCalls, stream } = useStream();

  return (

       stream("Tell me a joke")} disabled={isStreaming}>
        {isStreaming ? "Streaming..." : "Generate"}


      {error && {error}}

      {content}

      {toolCalls.map((tool, i) => (

          Tool: {tool.name}

      ))}

  );
}
```

---

## WebStreamWriter (Legacy)

For simple SSE streaming without the full Data Stream Protocol:

```typescript

const writer = new WebStreamWriter();

// Write events
writer.writeData({ message: "Hello" });
writer.writeEvent("custom-event", { data: "value" });
writer.writeDone();
writer.close();

// Use the stream
return new Response(writer.stream, {
  headers: { "Content-Type": "text/event-stream" },
});

// Manual SSE formatting
const sseMessage = formatSSEEvent({
  event: "message",
  data: JSON.stringify({ content: "Hello" }),
  id: "msg-1",
  retry: 5000,
});
// Result: "id: msg-1\nevent: message\nretry: 5000\ndata: {...}\n\n"
```

---

## Keep-Alive Configuration

Keep-alive signals prevent connection timeouts for long-running streams:

```typescript
const streamResponse = new DataStreamResponse({
  contentType: "text/event-stream",
  keepAliveInterval: 15000, // Send ping every 15 seconds
});
```

**SSE keep-alive format:**

```
: keep-alive

```

**NDJSON keep-alive format:**

```json
{ "type": "keep-alive" }
```

---

## Best Practices

### 1. Always Handle Client Disconnection

```typescript
// Check if stream is closed before writing
if (!streamResponse.isClosed()) {
  await streamResponse.writeTextDelta(id, chunk);
}
```

### 2. Use Unique IDs for Text Blocks

```typescript
const textId = `text-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;
```

### 3. Set Appropriate Timeouts

```typescript
const server = await createServer(neurolink, {
  config: {
    timeout: 120000, // 2 minutes for streaming endpoints
  },
});
```

### 4. Enable Keep-Alive for Long Streams

```typescript
const streamResponse = new DataStreamResponse({
  keepAliveInterval: 15000, // 15 seconds
});
```

### 5. Include Usage Statistics in Finish Event

```typescript
await streamResponse.finish({
  reason: "stop",
  usage: {
    input: promptTokens,
    output: completionTokens,
    total: promptTokens + completionTokens,
  },
});
```

### 6. Use AbortController for Cancellation

```typescript
const controller = new AbortController();

const response = await fetch("/api/agent/stream", {
  method: "POST",
  body: JSON.stringify({ input }),
  signal: controller.signal,
});

// Cancel the stream
controller.abort();
```

---

## Troubleshooting

### Stream Not Receiving Data

1. Check `Content-Type` header is `text/event-stream` or `application/x-ndjson`
2. Verify `Cache-Control: no-cache` is set
3. Ensure no proxy is buffering responses (check `X-Accel-Buffering: no`)

### Connection Dropping

1. Enable keep-alive with appropriate interval
2. Check server timeout configuration
3. Verify load balancer timeout settings

### Events Not Parsing Correctly

1. Ensure each SSE event ends with double newline (`\n\n`)
2. Verify JSON data is properly stringified
3. Check for proper event type names

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Hono Adapter](/docs/guides/server-adapters/hono)** - Framework-specific streaming examples
- **[Configuration Reference](/docs/reference/server-configuration)** - Full configuration options
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Securing streaming endpoints

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## WebSocket Support

<!-- Source: guides/server-adapters/websocket.md -->

# WebSocket Support

NeuroLink server adapters include built-in WebSocket support for real-time, bidirectional communication with AI agents. WebSocket connections are ideal for interactive applications requiring low-latency streaming, live updates, and persistent connections.

----------------------- | --------------------------------------------------------------- |
| **Bidirectional**          | Send and receive messages without polling                       |
| **Low Latency**            | Single persistent connection reduces overhead                   |
| **Real-time Streaming**    | Stream AI responses token-by-token                              |
| **Connection Management**  | Built-in ping/pong, reconnection, and graceful shutdown         |
| **Multi-client Broadcast** | Send messages to multiple connected clients simultaneously      |
| **Authentication**         | Secure connections with bearer tokens, API keys, or custom auth |

---

## Quick Start

### Basic WebSocket Setup

```typescript

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    basePath: "/api",
  },
});

// Create WebSocket manager
const wsManager = new WebSocketConnectionManager({
  path: "/ws",
  maxConnections: 1000,
  pingInterval: 30000,
  pongTimeout: 10000,
  maxMessageSize: 1024 * 1024, // 1MB
});

// Register a handler
wsManager.registerHandler("/ws", {
  onOpen: async (connection) => {
    console.log(`Client connected: ${connection.id}`);
  },
  onMessage: async (connection, message) => {
    console.log(`Received: ${message.data}`);
  },
  onClose: async (connection, code, reason) => {
    console.log(`Client disconnected: ${connection.id}`);
  },
  onError: async (connection, error) => {
    console.error(`Error: ${error.message}`);
  },
});

await server.initialize();
await server.start();

console.log("WebSocket server running on ws://localhost:3000/ws");
```

### Client Connection

```javascript
// Browser client
const ws = new WebSocket("ws://localhost:3000/ws");

ws.onopen = () => {
  console.log("Connected");
  ws.send(JSON.stringify({ type: "generate", payload: { prompt: "Hello!" } }));
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log("Received:", data);
};

ws.onclose = (event) => {
  console.log(`Disconnected: ${event.code} - ${event.reason}`);
};

ws.onerror = (error) => {
  console.error("WebSocket error:", error);
};
```

---

## Configuration

### WebSocketConfig

The `WebSocketConfig` type defines all available configuration options:

```typescript
type WebSocketConfig = {
  /** WebSocket endpoint path (default: "/ws") */
  path?: string;

  /** Maximum number of concurrent connections (default: 1000) */
  maxConnections?: number;

  /** Interval between ping messages in ms (default: 30000) */
  pingInterval?: number;

  /** Time to wait for pong response in ms (default: 10000) */
  pongTimeout?: number;

  /** Maximum message size in bytes (default: 1MB) */
  maxMessageSize?: number;

  /** Authentication configuration */
  auth?: AuthConfig;
};
```

### Configuration Options

| Option           | Type         | Default   | Description                                        |
| ---------------- | ------------ | --------- | -------------------------------------------------- |
| `path`           | `string`     | `"/ws"`   | WebSocket endpoint path                            |
| `maxConnections` | `number`     | `1000`    | Maximum concurrent connections                     |
| `pingInterval`   | `number`     | `30000`   | Milliseconds between ping messages (0 to disable)  |
| `pongTimeout`    | `number`     | `10000`   | Milliseconds to wait for pong before disconnecting |
| `maxMessageSize` | `number`     | `1048576` | Maximum message size in bytes (1MB default)        |
| `auth`           | `AuthConfig` | `none`    | Authentication configuration                       |

### Full Configuration Example

```typescript
const wsManager = new WebSocketConnectionManager({
  path: "/ws/agent",
  maxConnections: 500,
  pingInterval: 15000,
  pongTimeout: 5000,
  maxMessageSize: 512 * 1024, // 512KB
  auth: {
    strategy: "bearer",
    required: true,
    validate: async (token) => {
      const decoded = await verifyJWT(token);
      return decoded ? { id: decoded.sub, roles: decoded.roles } : null;
    },
  },
});
```

---

## WebSocket Types

### WebSocketConnection

Represents an active WebSocket connection:

```typescript
type WebSocketConnection = {
  /** Unique connection identifier */
  id: string;

  /** Underlying WebSocket socket */
  socket: unknown;

  /** Authenticated user (if auth enabled) */
  user?: AuthenticatedUser;

  /** Custom metadata for the connection */
  metadata: Record;

  /** Connection creation timestamp */
  createdAt: number;

  /** Last activity timestamp */
  lastActivity: number;
};
```

### WebSocketMessage

Represents an incoming WebSocket message:

```typescript
type WebSocketMessage = {
  /** Message type: text, binary, ping, pong, or close */
  type: WebSocketMessageType;

  /** Message payload */
  data: string | ArrayBuffer;

  /** Message timestamp */
  timestamp: number;
};

type WebSocketMessageType = "text" | "binary" | "ping" | "pong" | "close";
```

### WebSocketHandler

Interface for handling WebSocket events:

```typescript
type WebSocketHandler = {
  /** Called when a connection is established */
  onOpen?: (connection: WebSocketConnection) => void | Promise;

  /** Called when a message is received */
  onMessage?: (
    connection: WebSocketConnection,
    message: WebSocketMessage,
  ) => void | Promise;

  /** Called when a connection is closed */
  onClose?: (
    connection: WebSocketConnection,
    code: number,
    reason: string,
  ) => void | Promise;

  /** Called when an error occurs */
  onError?: (
    connection: WebSocketConnection,
    error: Error,
  ) => void | Promise;
};
```

### AuthenticatedUser

User information from successful authentication:

```typescript
type AuthenticatedUser = {
  /** Unique user identifier */
  id: string;

  /** User email (optional) */
  email?: string;

  /** Display name (optional) */
  name?: string;

  /** User roles for authorization */
  roles?: string[];

  /** User permissions for fine-grained access */
  permissions?: string[];

  /** Additional user metadata */
  metadata?: Record;
};
```

---

## Authentication

### Authentication Strategies

NeuroLink supports multiple authentication strategies for WebSocket connections:

| Strategy | Description                      | Use Case                         |
| -------- | -------------------------------- | -------------------------------- |
| `bearer` | JWT or OAuth bearer token        | API authentication               |
| `apiKey` | API key in header or query param | Service-to-service communication |
| `basic`  | HTTP Basic authentication        | Simple username/password         |
| `custom` | Custom validation function       | Complex authentication flows     |
| `none`   | No authentication (default)      | Development or public endpoints  |

### AuthConfig

```typescript
type AuthConfig = {
  /** Authentication strategy */
  strategy: "bearer" | "apiKey" | "basic" | "custom" | "none";

  /** Whether authentication is required */
  required?: boolean;

  /** Custom header name for token (default: "Authorization") */
  headerName?: string;

  /** Query parameter name for token */
  queryParam?: string;

  /** Custom validation function */
  validate?: (token: string) => Promise;

  /** Required roles for access */
  roles?: string[];

  /** Required permissions for access */
  permissions?: string[];
};
```

### Bearer Token Authentication

```typescript
const wsManager = new WebSocketConnectionManager({
  path: "/ws",
  auth: {
    strategy: "bearer",
    required: true,
    validate: async (token) => {
      try {
        const decoded = await verifyJWT(token);
        return {
          id: decoded.sub,
          email: decoded.email,
          roles: decoded.roles || [],
        };
      } catch {
        return null;
      }
    },
  },
});

// Client connection with bearer token (Node.js only)
// Note: Custom headers in the WebSocket constructor are only supported by
// Node.js WebSocket libraries (e.g., `ws`). Browser WebSocket API does not
// support custom headers. For browser clients, use query parameters, cookies,
// or send authentication in the first message after connection.
const ws = new WebSocket("ws://localhost:3000/ws", [], {
  headers: {
    Authorization: "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  },
});
```

### API Key Authentication

```typescript
const wsManager = new WebSocketConnectionManager({
  path: "/ws",
  auth: {
    strategy: "apiKey",
    required: true,
    headerName: "X-API-Key",
    validate: async (apiKey) => {
      const user = await validateApiKey(apiKey);
      return user ? { id: user.id, roles: user.roles } : null;
    },
  },
});

// Client connection with API key
const ws = new WebSocket("ws://localhost:3000/ws?apiKey=your-api-key");

// Or via header (if supported by client)
const ws = new WebSocket("ws://localhost:3000/ws", [], {
  headers: {
    "X-API-Key": "your-api-key",
  },
});
```

### Role-Based Access Control

```typescript
const wsManager = new WebSocketConnectionManager({
  path: "/ws/admin",
  auth: {
    strategy: "bearer",
    required: true,
    roles: ["admin", "superuser"], // Only allow these roles
    validate: async (token) => {
      const decoded = await verifyJWT(token);
      return decoded ? { id: decoded.sub, roles: decoded.roles } : null;
    },
  },
});

// Access user info in handler
wsManager.registerHandler("/ws/admin", {
  onOpen: async (connection) => {
    if (connection.user?.roles?.includes("admin")) {
      console.log(`Admin connected: ${connection.user.id}`);
    }
  },
});
```

---

## WebSocketConnectionManager

The `WebSocketConnectionManager` class provides comprehensive connection management.

### Connection Management Methods

```typescript
// Get a specific connection
const connection = wsManager.getConnection(connectionId);

// Get all active connections
const connections = wsManager.getAllConnections();

// Get connections for a specific user
const userConnections = wsManager.getConnectionsByUser(userId);

// Get connections for a specific path
const pathConnections = wsManager.getConnectionsByPath("/ws/agent");

// Get total connection count
const count = wsManager.getConnectionCount();
```

### Sending Messages

```typescript
// Send to a specific connection
wsManager.send(connectionId, JSON.stringify({ type: "update", data: "Hello" }));

// Send binary data
const buffer = new ArrayBuffer(8);
wsManager.send(connectionId, buffer);
```

### Broadcasting

```typescript
// Broadcast to all connections
wsManager.broadcast(
  JSON.stringify({ type: "announcement", message: "Server update" }),
);

// Broadcast with filter
wsManager.broadcast(
  JSON.stringify({ type: "admin-only", data: "Secret info" }),
  (connection) => connection.user?.roles?.includes("admin") ?? false,
);

// Broadcast to specific path
wsManager.broadcast(
  JSON.stringify({ type: "update" }),
  (connection) => connection.metadata.path === "/ws/notifications",
);
```

### Closing Connections

```typescript
// Close a specific connection
await wsManager.close(connectionId, 1000, "Session ended");

// Close all connections (for shutdown)
await wsManager.closeAll(1001, "Server shutting down");
```

---

## Message Routing

### WebSocketMessageRouter

For structured message handling, use the `WebSocketMessageRouter`:

```typescript

  WebSocketConnectionManager,
  WebSocketMessageRouter,
} from "@juspay/neurolink";

const wsManager = new WebSocketConnectionManager({ path: "/ws" });
const router = new WebSocketMessageRouter();

// Register message routes
router.route("generate", async (connection, payload) => {
  const { prompt, options } = payload as { prompt: string; options?: unknown };

  // Generate AI response
  const result = await neurolink.generate({ prompt, ...options });

  return { type: "response", content: result.content };
});

router.route("stream", async (connection, payload) => {
  const { prompt } = payload as { prompt: string };

  // Start streaming
  const socket = connection.socket as { send: (data: string) => void };

  for await (const chunk of neurolink.generateStream({ prompt })) {
    socket.send(JSON.stringify({ type: "chunk", content: chunk.content }));
  }

  return { type: "stream_complete" };
});

router.route("tool_call", async (connection, payload) => {
  const { toolName, args } = payload as { toolName: string; args: unknown };

  const result = await neurolink.executeTool(toolName, args);

  return { type: "tool_result", toolName, result };
});

// Register handler that uses router
wsManager.registerHandler("/ws", {
  onOpen: async (connection) => {
    const socket = connection.socket as { send: (data: string) => void };
    socket.send(
      JSON.stringify({
        type: "connected",
        connectionId: connection.id,
        timestamp: Date.now(),
      }),
    );
  },

  onMessage: async (connection, message) => {
    try {
      const result = await router.handle(connection, message);
      if (result) {
        const socket = connection.socket as { send: (data: string) => void };
        socket.send(JSON.stringify(result));
      }
    } catch (error) {
      const socket = connection.socket as { send: (data: string) => void };
      socket.send(
        JSON.stringify({
          type: "error",
          error: (error as Error).message,
        }),
      );
    }
  },
});

// List registered routes
console.log("Registered routes:", router.getRoutes());
// Output: ["generate", "stream", "tool_call"]
```

### Message Format

Messages should follow this JSON structure:

```json
{
  "type": "generate",
  "payload": {
    "prompt": "Hello, how are you?",
    "options": {
      "temperature": 0.7
    }
  }
}
```

---

## AI Agent WebSocket Handler

NeuroLink provides a pre-built handler for AI agent interactions:

```typescript

  WebSocketConnectionManager,
  createAgentWebSocketHandler,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({ defaultProvider: "openai" });

const wsManager = new WebSocketConnectionManager({
  path: "/ws/agent",
  auth: {
    strategy: "bearer",
    required: true,
    validate: async (token) => verifyJWT(token),
  },
});

// Use the pre-built agent handler
wsManager.registerHandler("/ws/agent", createAgentWebSocketHandler(neurolink));

// Supported message types:
// - { type: "generate", payload: { prompt, options } }
// - { type: "stream", payload: { prompt, options } }
// - { type: "tool_call", payload: { toolName, args } }
```

### Client Usage

```javascript
// Node.js client (using 'ws' library)
// Note: Custom headers in the WebSocket constructor are only supported by
// Node.js WebSocket libraries. Browser WebSocket API does not support custom
// headers. For browser clients, use query parameters or send authentication
// in the first message after connection.
const ws = new WebSocket("ws://localhost:3000/ws/agent", [], {
  headers: { Authorization: `Bearer ${token}` },
});

// Browser alternative: use query parameter for auth token
// const ws = new WebSocket(`ws://localhost:3000/ws/agent?token=${token}`);

ws.onopen = () => {
  // Generate a response
  ws.send(
    JSON.stringify({
      type: "generate",
      payload: {
        prompt: "What is the capital of France?",
        options: { temperature: 0.5 },
      },
    }),
  );
};

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  switch (message.type) {
    case "connected":
      console.log("Connected:", message.connectionId);
      break;
    case "response":
      console.log("Response:", message.data);
      break;
    case "stream_start":
      console.log("Stream starting...");
      break;
    case "chunk":
      process.stdout.write(message.content);
      break;
    case "stream_complete":
      console.log("\nStream complete");
      break;
    case "error":
      console.error("Error:", message.error);
      break;
  }
};
```

---

## Error Handling

### WebSocket Errors

NeuroLink provides typed errors for WebSocket operations:

```typescript

wsManager.registerHandler("/ws", {
  onMessage: async (connection, message) => {
    try {
      // Process message
      await processMessage(message);
    } catch (error) {
      if (error instanceof WebSocketError) {
        console.error(`WebSocket error: ${error.message}`);
        console.error(`Connection ID: ${error.connectionId}`);
      }

      // Send error to client
      const socket = connection.socket as { send: (data: string) => void };
      socket.send(
        JSON.stringify({
          type: "error",
          error: error.message,
          code:
            error instanceof WebSocketError
              ? "WEBSOCKET_ERROR"
              : "UNKNOWN_ERROR",
        }),
      );
    }
  },

  onError: async (connection, error) => {
    console.error(`Connection ${connection.id} error: ${error.message}`);

    // Optionally close the connection
    await wsManager.close(connection.id, 1011, "Internal error");
  },
});
```

### Connection Limits

```typescript
const wsManager = new WebSocketConnectionManager({
  maxConnections: 100,
});

// When max connections reached, new connections will receive:
// WebSocketConnectionError: Maximum connections (100) reached
```

### Message Size Limits

```typescript
const wsManager = new WebSocketConnectionManager({
  maxMessageSize: 64 * 1024, // 64KB
});

// Messages exceeding the limit will throw:
// WebSocketError: Message exceeds max size (65536 bytes)
```

---

## Graceful Shutdown

Handle server shutdown gracefully to close all WebSocket connections:

```typescript
const wsManager = new WebSocketConnectionManager({ path: "/ws" });

// Handle shutdown signals
process.on("SIGTERM", async () => {
  console.log("Shutting down WebSocket connections...");

  // Close all connections with shutdown code
  await wsManager.closeAll(1001, "Server shutting down");

  // Then stop the server
  await server.stop();

  process.exit(0);
});

// Or close connections individually with custom messages
process.on("SIGTERM", async () => {
  const connections = wsManager.getAllConnections();

  for (const connection of connections) {
    const socket = connection.socket as { send: (data: string) => void };

    // Notify client before closing
    socket.send(
      JSON.stringify({
        type: "shutdown",
        message: "Server is shutting down. Please reconnect in a few minutes.",
      }),
    );

    // Give client time to receive message
    await new Promise((resolve) => setTimeout(resolve, 100));

    await wsManager.close(connection.id, 1001, "Server shutdown");
  }

  await server.stop();
  process.exit(0);
});
```

---

## Ping/Pong Keep-Alive

WebSocket connections include automatic ping/pong for connection health:

```typescript
const wsManager = new WebSocketConnectionManager({
  pingInterval: 30000, // Send ping every 30 seconds
  pongTimeout: 10000, // Close if no pong within 10 seconds
});

// Ping messages are sent automatically
// If native ping/pong is not available, uses JSON messages:
// { "type": "ping", "timestamp": 1706745600000 }

// Client should respond with:
// { "type": "pong", "timestamp": 1706745600000 }
```

### Disable Ping/Pong

```typescript
const wsManager = new WebSocketConnectionManager({
  pingInterval: 0, // Disable automatic pings
});
```

---

## Monitoring Connections

### Connection Statistics

```typescript
// Get connection count
const totalConnections = wsManager.getConnectionCount();
console.log(`Active connections: ${totalConnections}`);

// Get connections by user
const userConnections = wsManager.getConnectionsByUser(userId);
console.log(`User ${userId} has ${userConnections.length} connections`);

// Get connections by path
const agentConnections = wsManager.getConnectionsByPath("/ws/agent");
console.log(`Agent connections: ${agentConnections.length}`);

// Monitor connection details
const connections = wsManager.getAllConnections();
for (const conn of connections) {
  console.log({
    id: conn.id,
    userId: conn.user?.id,
    path: conn.metadata.path,
    connectedSince: new Date(conn.createdAt).toISOString(),
    lastActivity: new Date(conn.lastActivity).toISOString(),
  });
}
```

### Health Endpoint Integration

```typescript
// Add WebSocket stats to health endpoint
server.registerRoute({
  method: "GET",
  path: "/api/health/websocket",
  handler: async () => ({
    status: "ok",
    connections: {
      total: wsManager.getConnectionCount(),
      maxConnections: 1000,
      paths: {
        "/ws/agent": wsManager.getConnectionsByPath("/ws/agent").length,
        "/ws/notifications":
          wsManager.getConnectionsByPath("/ws/notifications").length,
      },
    },
  }),
  description: "WebSocket health status",
  tags: ["health"],
});
```

---

## Best Practices

### 1. Use Structured Messages

```typescript
// Define message types
type ClientMessage =
  | { type: "generate"; payload: { prompt: string } }
  | { type: "stream"; payload: { prompt: string } }
  | { type: "cancel"; payload: { requestId: string } };

type ServerMessage =
  | { type: "connected"; connectionId: string }
  | { type: "response"; content: string }
  | { type: "chunk"; content: string }
  | { type: "error"; error: string };
```

### 2. Implement Reconnection Logic (Client)

```javascript
function createWebSocket(url, options = {}) {
  let ws;
  let reconnectAttempts = 0;
  const maxReconnectAttempts = 5;
  const reconnectDelay = 1000;

  function connect() {
    ws = new WebSocket(url, options);

    ws.onopen = () => {
      reconnectAttempts = 0;
      console.log("Connected");
    };

    ws.onclose = (event) => {
      if (event.code !== 1000 && reconnectAttempts  {
      console.error("WebSocket error:", error);
    };
  }

  connect();
  return { getSocket: () => ws };
}
```

### 3. Handle Connection Limits Per User

```typescript
const MAX_CONNECTIONS_PER_USER = 3;

wsManager.registerHandler("/ws", {
  onOpen: async (connection) => {
    if (connection.user) {
      const userConnections = wsManager.getConnectionsByUser(
        connection.user.id,
      );

      if (userConnections.length > MAX_CONNECTIONS_PER_USER) {
        const oldest = userConnections[0];
        await wsManager.close(oldest.id, 1008, "Connection limit exceeded");
      }
    }
  },
});
```

### 4. Use Connection Metadata

```typescript
wsManager.registerHandler("/ws", {
  onOpen: async (connection) => {
    // Store custom metadata
    connection.metadata.sessionId = generateSessionId();
    connection.metadata.subscriptions = [];
  },

  onMessage: async (connection, message) => {
    const data = JSON.parse(message.data as string);

    if (data.type === "subscribe") {
      (connection.metadata.subscriptions as string[]).push(data.channel);
    }
  },
});
```

---

## Production Checklist

- [ ] Configure authentication (`auth.strategy` and `auth.validate`)
- [ ] Set appropriate `maxConnections` limit
- [ ] Configure `maxMessageSize` for your use case
- [ ] Enable ping/pong with reasonable intervals
- [ ] Implement graceful shutdown handling
- [ ] Add connection monitoring and logging
- [ ] Set up health check endpoint with WebSocket stats
- [ ] Implement rate limiting per connection
- [ ] Handle reconnection logic on client side
- [ ] Test with expected concurrent connection load

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Authentication patterns
- **[Hono Adapter](/docs/guides/server-adapters/hono)** - Using WebSocket with Hono
- **[Configuration Reference](/docs/reference/server-configuration)** - Full configuration options

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Error Handling

<!-- Source: guides/server-adapters/errors.md -->

# Error Handling

NeuroLink server adapters provide a comprehensive error handling system with typed error classes, automatic recovery strategies, and structured error responses. This guide covers the complete error hierarchy and how to handle errors effectively.

## Error Categories

Errors are grouped into 9 categories that determine handling behavior and recovery strategies:

| Category         | Description                             | Recovery Strategy   |
| ---------------- | --------------------------------------- | ------------------- |
| `CONFIG`         | Configuration and setup errors          | Fail immediately    |
| `VALIDATION`     | Input validation and schema errors      | Fail immediately    |
| `EXECUTION`      | Runtime handler and processing errors   | Retry (3 attempts)  |
| `EXTERNAL`       | External service and dependency errors  | Exponential backoff |
| `RATE_LIMIT`     | Rate limiting exceeded                  | Exponential backoff |
| `AUTHENTICATION` | Missing or invalid authentication       | Fail immediately    |
| `AUTHORIZATION`  | Permission and access denied errors     | Fail immediately    |
| `STREAMING`      | Streaming and SSE errors                | Retry (2 attempts)  |
| `WEBSOCKET`      | WebSocket connection and message errors | Exponential backoff |

---

## Severity Levels

Each error has a severity level for logging and alerting:

| Severity   | Description                                      | Example Errors                           |
| ---------- | ------------------------------------------------ | ---------------------------------------- |
| `LOW`      | Minor issues, typically user errors              | RouteNotFoundError, StreamAbortedError   |
| `MEDIUM`   | Moderate issues that may need attention          | TimeoutError, AuthenticationError        |
| `HIGH`     | Serious issues that should be investigated       | HandlerError, ConfigurationError         |
| `CRITICAL` | System-level failures requiring immediate action | ServerStartError, MissingDependencyError |

---

## Error Classes Reference

### Base Class: ServerAdapterError

All server adapter errors extend this base class:

```typescript

class ServerAdapterError extends Error {
  readonly code: string; // Unique error code
  readonly category: string; // Error category
  readonly severity: string; // Severity level
  readonly retryable: boolean; // Whether retry is recommended
  readonly retryAfterMs?: number; // Suggested retry delay
  readonly requestId?: string; // Request identifier for tracing
  readonly path?: string; // Request path
  readonly method?: string; // HTTP method
  readonly details?: object; // Additional error details
  readonly cause?: Error; // Original error if wrapped

  toJSON(): object; // Serialize for API response
  getHttpStatus(): number; // Get appropriate HTTP status
}
```

### Configuration Errors

#### ConfigurationError

Thrown when server configuration is invalid.

```typescript

throw new ConfigurationError(
  "Invalid port number: must be between 1 and 65535",
  { port: 99999, field: "port" },
);
```

| Property    | Value                           |
| ----------- | ------------------------------- |
| Code        | `SERVER_ADAPTER_INVALID_CONFIG` |
| Category    | `CONFIG`                        |
| Severity    | `HIGH`                          |
| HTTP Status | 400                             |
| Retryable   | No                              |

#### MissingDependencyError

Thrown when a required framework dependency is not installed.

```typescript

throw new MissingDependencyError("express", "Express", "npm install express");
```

| Property    | Value                               |
| ----------- | ----------------------------------- |
| Code        | `SERVER_ADAPTER_MISSING_DEPENDENCY` |
| Category    | `CONFIG`                            |
| Severity    | `CRITICAL`                          |
| HTTP Status | 500                                 |
| Retryable   | No                                  |

### Route Errors

#### RouteConflictError

Thrown when registering a route that conflicts with an existing route.

```typescript

throw new RouteConflictError("/api/users/:id", "GET", "/api/users/:userId");
```

| Property    | Value                           |
| ----------- | ------------------------------- |
| Code        | `SERVER_ADAPTER_ROUTE_CONFLICT` |
| Category    | `CONFIG`                        |
| Severity    | `HIGH`                          |
| HTTP Status | 500                             |
| Retryable   | No                              |

#### RouteNotFoundError

Thrown when a requested route does not exist.

```typescript

throw new RouteNotFoundError("/api/unknown", "GET", "req-123");
```

| Property    | Value                            |
| ----------- | -------------------------------- |
| Code        | `SERVER_ADAPTER_ROUTE_NOT_FOUND` |
| Category    | `VALIDATION`                     |
| Severity    | `LOW`                            |
| HTTP Status | 404                              |
| Retryable   | No                               |

### Validation Errors

#### ValidationError

Thrown when request validation fails.

```typescript

throw new ValidationError(
  [
    { field: "email", message: "Invalid email format", value: "not-an-email" },
    { field: "age", message: "Must be a positive number", value: -5 },
  ],
  "req-123",
);
```

| Property    | Value                             |
| ----------- | --------------------------------- |
| Code        | `SERVER_ADAPTER_VALIDATION_ERROR` |
| Category    | `VALIDATION`                      |
| Severity    | `LOW`                             |
| HTTP Status | 400                               |
| Retryable   | No                                |

### Authentication & Authorization Errors

#### AuthenticationError

Thrown when authentication is required but not provided.

```typescript

throw new AuthenticationError("Bearer token required", "req-123");
```

| Property    | Value                          |
| ----------- | ------------------------------ |
| Code        | `SERVER_ADAPTER_AUTH_REQUIRED` |
| Category    | `AUTHENTICATION`               |
| Severity    | `MEDIUM`                       |
| HTTP Status | 401                            |
| Retryable   | No                             |

#### InvalidAuthenticationError

Thrown when provided authentication credentials are invalid.

```typescript

throw new InvalidAuthenticationError("Token expired", "req-123");
```

| Property    | Value                         |
| ----------- | ----------------------------- |
| Code        | `SERVER_ADAPTER_AUTH_INVALID` |
| Category    | `AUTHENTICATION`              |
| Severity    | `MEDIUM`                      |
| HTTP Status | 401                           |
| Retryable   | No                            |

#### AuthorizationError

Thrown when the authenticated user lacks required permissions.

```typescript

throw new AuthorizationError(
  "Insufficient permissions to access this resource",
  "req-123",
  ["admin", "moderator"],
);
```

| Property    | Value                      |
| ----------- | -------------------------- |
| Code        | `SERVER_ADAPTER_FORBIDDEN` |
| Category    | `AUTHORIZATION`            |
| Severity    | `MEDIUM`                   |
| HTTP Status | 403                        |
| Retryable   | No                         |

### Rate Limiting Errors

#### RateLimitError

Thrown when request rate limits are exceeded.

```typescript

throw new RateLimitError(
  60000, // retry after 60 seconds
  "Rate limit exceeded: 100 requests per minute",
  "req-123",
);
```

| Property    | Value                                |
| ----------- | ------------------------------------ |
| Code        | `SERVER_ADAPTER_RATE_LIMIT_EXCEEDED` |
| Category    | `RATE_LIMIT`                         |
| Severity    | `MEDIUM`                             |
| HTTP Status | 429                                  |
| Retryable   | Yes                                  |

### Execution Errors

#### TimeoutError

Thrown when an operation exceeds its timeout.

```typescript

throw new TimeoutError(30000, "AI generation", "req-123");
```

| Property    | Value                    |
| ----------- | ------------------------ |
| Code        | `SERVER_ADAPTER_TIMEOUT` |
| Category    | `EXECUTION`              |
| Severity    | `MEDIUM`                 |
| HTTP Status | 408                      |
| Retryable   | Yes                      |

#### HandlerError

Thrown when a route handler fails during execution.

```typescript

throw new HandlerError(
  "Failed to process request",
  originalError,
  "req-123",
  "/api/agent/execute",
  "POST",
);
```

| Property    | Value                          |
| ----------- | ------------------------------ |
| Code        | `SERVER_ADAPTER_HANDLER_ERROR` |
| Category    | `EXECUTION`                    |
| Severity    | `HIGH`                         |
| HTTP Status | 500                            |
| Retryable   | No                             |

### Streaming Errors

#### StreamingError

Thrown when a streaming operation fails.

```typescript

throw new StreamingError("Stream write failed", originalError, "req-123");
```

| Property    | Value                         |
| ----------- | ----------------------------- |
| Code        | `SERVER_ADAPTER_STREAM_ERROR` |
| Category    | `STREAMING`                   |
| Severity    | `MEDIUM`                      |
| HTTP Status | 500                           |
| Retryable   | No                            |

#### StreamAbortedError

Thrown when a client aborts a streaming connection.

```typescript

throw new StreamAbortedError("Client disconnected", "req-123");
```

| Property    | Value                           |
| ----------- | ------------------------------- |
| Code        | `SERVER_ADAPTER_STREAM_ABORTED` |
| Category    | `STREAMING`                     |
| Severity    | `LOW`                           |
| HTTP Status | 499                             |
| Retryable   | No                              |

### WebSocket Errors

#### WebSocketError

General WebSocket operation errors.

```typescript

throw new WebSocketError("Message send failed", originalError, "ws-conn-123");
```

| Property    | Value                            |
| ----------- | -------------------------------- |
| Code        | `SERVER_ADAPTER_WEBSOCKET_ERROR` |
| Category    | `WEBSOCKET`                      |
| Severity    | `MEDIUM`                         |
| HTTP Status | 500                              |
| Retryable   | Yes                              |

#### WebSocketConnectionError

Thrown when WebSocket connection establishment fails.

```typescript

throw new WebSocketConnectionError("Handshake failed", originalError);
```

| Property    | Value                                        |
| ----------- | -------------------------------------------- |
| Code        | `SERVER_ADAPTER_WEBSOCKET_CONNECTION_FAILED` |
| Category    | `WEBSOCKET`                                  |
| Severity    | `HIGH`                                       |
| HTTP Status | 500                                          |
| Retryable   | Yes                                          |

### Server Lifecycle Errors

#### ServerStartError

Thrown when the server fails to start.

```typescript

throw new ServerStartError(
  "Port already in use",
  originalError,
  3000,
  "0.0.0.0",
);
```

| Property    | Value                         |
| ----------- | ----------------------------- |
| Code        | `SERVER_ADAPTER_START_FAILED` |
| Category    | `CONFIG`                      |
| Severity    | `CRITICAL`                    |
| HTTP Status | 500                           |
| Retryable   | Yes                           |

#### ServerStopError

Thrown when the server fails to stop cleanly.

```typescript

throw new ServerStopError("Failed to close connections", originalError);
```

| Property    | Value                        |
| ----------- | ---------------------------- |
| Code        | `SERVER_ADAPTER_STOP_FAILED` |
| Category    | `EXECUTION`                  |
| Severity    | `HIGH`                       |
| HTTP Status | 500                          |
| Retryable   | No                           |

#### AlreadyRunningError

Thrown when attempting to start an already running server.

```typescript

throw new AlreadyRunningError(3000, "0.0.0.0");
```

| Property    | Value                            |
| ----------- | -------------------------------- |
| Code        | `SERVER_ADAPTER_ALREADY_RUNNING` |
| Category    | `CONFIG`                         |
| Severity    | `LOW`                            |
| HTTP Status | 500                              |
| Retryable   | No                               |

#### NotRunningError

Thrown when attempting to stop a server that is not running.

```typescript

throw new NotRunningError();
```

| Property    | Value                        |
| ----------- | ---------------------------- |
| Code        | `SERVER_ADAPTER_NOT_RUNNING` |
| Category    | `CONFIG`                     |
| Severity    | `LOW`                        |
| HTTP Status | 500                          |
| Retryable   | No                           |

#### ShutdownTimeoutError

Thrown when graceful shutdown exceeds the configured timeout.

```typescript

throw new ShutdownTimeoutError(30000, 5); // 30s timeout, 5 remaining connections
```

| Property    | Value                        |
| ----------- | ---------------------------- |
| Code        | `SERVER_ADAPTER_STOP_FAILED` |
| Category    | `EXECUTION`                  |
| Severity    | `HIGH`                       |
| HTTP Status | 500                          |
| Retryable   | No                           |

#### DrainTimeoutError

Thrown when connection draining exceeds the configured timeout.

```typescript

throw new DrainTimeoutError(10000, 3); // 10s timeout, 3 remaining connections
```

| Property    | Value                        |
| ----------- | ---------------------------- |
| Code        | `SERVER_ADAPTER_STOP_FAILED` |
| Category    | `EXECUTION`                  |
| Severity    | `MEDIUM`                     |
| HTTP Status | 500                          |
| Retryable   | No                           |

#### InvalidLifecycleStateError

Thrown when an operation is attempted in an invalid server state.

```typescript

throw new InvalidLifecycleStateError("start", "stopping", [
  "stopped",
  "initialized",
]);
```

| Property    | Value                                    |
| ----------- | ---------------------------------------- |
| Code        | `SERVER_ADAPTER_INVALID_LIFECYCLE_STATE` |
| Category    | `CONFIG`                                 |
| Severity    | `MEDIUM`                                 |
| HTTP Status | 500                                      |
| Retryable   | No                                       |

---

## HTTP Status Code Mapping

Errors automatically map to appropriate HTTP status codes:

| Error Code            | HTTP Status | Description           |
| --------------------- | ----------- | --------------------- |
| `VALIDATION_ERROR`    | 400         | Bad Request           |
| `SCHEMA_ERROR`        | 400         | Bad Request           |
| `INVALID_CONFIG`      | 400         | Bad Request           |
| `INVALID_ROUTE`       | 400         | Bad Request           |
| `AUTH_REQUIRED`       | 401         | Unauthorized          |
| `AUTH_INVALID`        | 401         | Unauthorized          |
| `FORBIDDEN`           | 403         | Forbidden             |
| `ROUTE_NOT_FOUND`     | 404         | Not Found             |
| `TIMEOUT`             | 408         | Request Timeout       |
| `RATE_LIMIT_EXCEEDED` | 429         | Too Many Requests     |
| `STREAM_ABORTED`      | 499         | Client Closed Request |
| All other errors      | 500         | Internal Server Error |

---

## Error Response Format

All errors are serialized to a consistent JSON format:

```json
{
  "error": {
    "code": "SERVER_ADAPTER_VALIDATION_ERROR",
    "message": "Validation failed: Invalid email format, Must be a positive number",
    "category": "VALIDATION",
    "requestId": "req-abc123",
    "details": {
      "errors": [
        {
          "field": "email",
          "message": "Invalid email format",
          "value": "not-an-email"
        },
        { "field": "age", "message": "Must be a positive number", "value": -5 }
      ]
    },
    "retryAfter": 60
  }
}
```

### Response Fields

| Field        | Type   | Description                                             |
| ------------ | ------ | ------------------------------------------------------- |
| `code`       | string | Unique error code for programmatic handling             |
| `message`    | string | Human-readable error message                            |
| `category`   | string | Error category for grouping                             |
| `requestId`  | string | Request ID for tracing (when available)                 |
| `details`    | object | Additional context-specific information                 |
| `retryAfter` | number | Suggested retry delay in seconds (for retryable errors) |

---

## Recovery Strategies

Each error category has a predefined recovery strategy:

```typescript
const ErrorRecoveryStrategies = {
  CONFIG: {
    strategy: "fail",
    maxRetries: 0,
    baseDelayMs: 0,
  },
  VALIDATION: {
    strategy: "fail",
    maxRetries: 0,
    baseDelayMs: 0,
  },
  EXECUTION: {
    strategy: "retry",
    maxRetries: 3,
    baseDelayMs: 1000,
  },
  EXTERNAL: {
    strategy: "exponentialBackoff",
    maxRetries: 5,
    baseDelayMs: 1000,
  },
  RATE_LIMIT: {
    strategy: "exponentialBackoff",
    maxRetries: 3,
    baseDelayMs: 5000,
  },
  AUTHENTICATION: {
    strategy: "fail",
    maxRetries: 0,
    baseDelayMs: 0,
  },
  AUTHORIZATION: {
    strategy: "fail",
    maxRetries: 0,
    baseDelayMs: 0,
  },
  STREAMING: {
    strategy: "retry",
    maxRetries: 2,
    baseDelayMs: 500,
  },
  WEBSOCKET: {
    strategy: "exponentialBackoff",
    maxRetries: 5,
    baseDelayMs: 1000,
  },
};
```

### Strategy Types

| Strategy             | Description                                                      |
| -------------------- | ---------------------------------------------------------------- |
| `fail`               | Fail immediately without retry                                   |
| `retry`              | Retry with fixed delay between attempts                          |
| `exponentialBackoff` | Retry with exponentially increasing delays (1s, 2s, 4s, 8s, ...) |

---

## Custom Error Handling

### Global Error Handler

Register a global error handler for custom error processing:

```typescript

const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

// Register global error handler
server.onError((error, context) => {
  // Wrap unknown errors
  const serverError =
    error instanceof ServerAdapterError
      ? error
      : wrapError(error, context.requestId, context.path, context.method);

  // Log based on severity
  if (serverError.severity === "CRITICAL") {
    alertOps(serverError);
  }

  if (serverError.severity === "HIGH" || serverError.severity === "CRITICAL") {
    logger.error("Server error", {
      code: serverError.code,
      message: serverError.message,
      requestId: serverError.requestId,
      path: serverError.path,
      stack: serverError.stack,
    });
  }

  // Track metrics
  metrics.increment("server.errors", {
    code: serverError.code,
    category: serverError.category,
    severity: serverError.severity,
  });

  // Return the error (will be serialized to JSON response)
  return serverError;
});
```

### Route-Level Error Handling

Handle errors in specific routes:

```typescript
server.registerRoute({
  method: "POST",
  path: "/api/custom",
  handler: async (ctx) => {
    try {
      const result = await processRequest(ctx.body);
      return result;
    } catch (error) {
      // Transform domain errors to server errors
      if (error instanceof DomainValidationError) {
        throw new ValidationError(
          [{ field: error.field, message: error.message }],
          ctx.requestId,
        );
      }

      if (error instanceof ExternalServiceError) {
        throw new HandlerError(
          "External service unavailable",
          error,
          ctx.requestId,
          ctx.path,
          ctx.method,
        );
      }

      // Re-throw server adapter errors
      throw error;
    }
  },
});
```

### Using wrapError Helper

The `wrapError` utility converts unknown errors to `ServerAdapterError`:

```typescript

function handleError(error: unknown, requestId: string): ServerAdapterError {
  // Already a ServerAdapterError - return as-is
  if (error instanceof ServerAdapterError) {
    return error;
  }

  // Wrap as HandlerError
  return wrapError(error, requestId, "/api/endpoint", "POST");
}
```

### Implementing Retry Logic

Use recovery strategies for automatic retry:

```typescript

async function executeWithRetry(
  operation: () => Promise,
  category: string,
): Promise {
  const strategy = ErrorRecoveryStrategies[category];
  let lastError: Error | undefined;

  for (let attempt = 0; attempt <= strategy.maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error as Error;

      // Don't retry if strategy is "fail"
      if (strategy.strategy === "fail") {
        throw error;
      }

      // Check if error is retryable
      if (error instanceof ServerAdapterError && !error.retryable) {
        throw error;
      }

      // Calculate delay
      const delay =
        strategy.strategy === "exponentialBackoff"
          ? strategy.baseDelayMs * Math.pow(2, attempt)
          : strategy.baseDelayMs;

      // Use retryAfterMs if provided
      const actualDelay =
        error instanceof ServerAdapterError && error.retryAfterMs
          ? error.retryAfterMs
          : delay;

      if (attempt < strategy.maxRetries) {
        await sleep(actualDelay);
      }
    }
  }

  throw lastError;
}
```

---

## Error Codes Reference

### Configuration Errors

| Code                                   | Description                             |
| -------------------------------------- | --------------------------------------- |
| `SERVER_ADAPTER_INVALID_CONFIG`        | Invalid server configuration            |
| `SERVER_ADAPTER_MISSING_DEPENDENCY`    | Required framework dependency not found |
| `SERVER_ADAPTER_FRAMEWORK_INIT_FAILED` | Framework initialization failed         |

### Route Errors

| Code                             | Description                         |
| -------------------------------- | ----------------------------------- |
| `SERVER_ADAPTER_ROUTE_NOT_FOUND` | Requested route does not exist      |
| `SERVER_ADAPTER_ROUTE_CONFLICT`  | Route conflicts with existing route |
| `SERVER_ADAPTER_INVALID_ROUTE`   | Invalid route definition            |

### Execution Errors

| Code                              | Description                    |
| --------------------------------- | ------------------------------ |
| `SERVER_ADAPTER_HANDLER_ERROR`    | Route handler execution failed |
| `SERVER_ADAPTER_TIMEOUT`          | Operation timed out            |
| `SERVER_ADAPTER_MIDDLEWARE_ERROR` | Middleware execution failed    |

### Authentication/Authorization Errors

| Code                           | Description                              |
| ------------------------------ | ---------------------------------------- |
| `SERVER_ADAPTER_AUTH_REQUIRED` | Authentication required but not provided |
| `SERVER_ADAPTER_AUTH_INVALID`  | Invalid authentication credentials       |
| `SERVER_ADAPTER_FORBIDDEN`     | Access denied (insufficient permissions) |

### Rate Limiting Errors

| Code                                 | Description                 |
| ------------------------------------ | --------------------------- |
| `SERVER_ADAPTER_RATE_LIMIT_EXCEEDED` | Request rate limit exceeded |

### Streaming Errors

| Code                            | Description                |
| ------------------------------- | -------------------------- |
| `SERVER_ADAPTER_STREAM_ERROR`   | Streaming operation failed |
| `SERVER_ADAPTER_STREAM_ABORTED` | Client aborted the stream  |

### WebSocket Errors

| Code                                         | Description                 |
| -------------------------------------------- | --------------------------- |
| `SERVER_ADAPTER_WEBSOCKET_ERROR`             | WebSocket operation failed  |
| `SERVER_ADAPTER_WEBSOCKET_CONNECTION_FAILED` | WebSocket connection failed |

### Validation Errors

| Code                              | Description               |
| --------------------------------- | ------------------------- |
| `SERVER_ADAPTER_VALIDATION_ERROR` | Request validation failed |
| `SERVER_ADAPTER_SCHEMA_ERROR`     | Schema validation failed  |

### Lifecycle Errors

| Code                             | Description               |
| -------------------------------- | ------------------------- |
| `SERVER_ADAPTER_START_FAILED`    | Server failed to start    |
| `SERVER_ADAPTER_STOP_FAILED`     | Server failed to stop     |
| `SERVER_ADAPTER_ALREADY_RUNNING` | Server is already running |
| `SERVER_ADAPTER_NOT_RUNNING`     | Server is not running     |

---

## Best Practices

### 1. Use Specific Error Classes

Throw the most specific error class for your situation:

```typescript
// Good - specific error with context
throw new ValidationError(
  [{ field: "email", message: "Invalid format" }],
  requestId,
);

// Avoid - generic error
throw new Error("Validation failed");
```

### 2. Include Request Context

Always include request ID, path, and method when available:

```typescript
throw new HandlerError(
  "Processing failed",
  cause,
  context.requestId, // For tracing
  context.path, // For debugging
  context.method, // For debugging
);
```

### 3. Provide Actionable Details

Include details that help diagnose the issue:

```typescript
throw new ConfigurationError("Invalid rate limit configuration", {
  field: "maxRequests",
  provided: -100,
  expected: "positive integer",
  hint: "maxRequests must be greater than 0",
});
```

### 4. Respect Retry-After Headers

When handling `RateLimitError`, honor the `retryAfterMs`:

```typescript
if (error instanceof RateLimitError) {
  response.setHeader("Retry-After", Math.ceil(error.retryAfterMs / 1000));
}
```

### 5. Log Appropriately by Severity

```typescript
switch (error.severity) {
  case "CRITICAL":
    logger.fatal(error);
    alertOps(error);
    break;
  case "HIGH":
    logger.error(error);
    break;
  case "MEDIUM":
    logger.warn(error);
    break;
  case "LOW":
    logger.info(error);
    break;
}
```

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Authentication and authorization
- **[Configuration Reference](/docs/reference/server-configuration)** - Full configuration options
- **[Deployment Guide](/docs/guides/server-adapters/deployment)** - Production deployment strategies

---

## Domain-Specific AI Usage Guide

<!-- Source: guides/domain-specific.md -->

# Domain-Specific AI Usage Guide

Simple guide for using domain expertise with NeuroLink SDK and CLI.

## ✅ **Recommended Approach: Simple Domain Input**

Instead of complex configuration, simply pass domain parameters directly to your AI requests.

## ️ **CLI Usage (Simple Flags)**

### **Generate with Domain**

```bash
# Healthcare domain
pnpm cli generate "Analyze patient symptoms: fever, cough, fatigue" \
  --provider openai \
  --evaluationDomain healthcare \
  --enableEvaluation \
  --enableAnalytics

# Analytics domain
pnpm cli generate "Analyze quarterly sales data" \
  --provider openai \
  --evaluationDomain analytics \
  --enableEvaluation \
  --enableAnalytics

# Finance domain
pnpm cli generate "Assess portfolio risk for diversified investments" \
  --provider openai \
  --evaluationDomain finance \
  --enableEvaluation \
  --enableAnalytics
```

### **Streaming with Domain**

```bash
# E-commerce domain streaming
pnpm cli stream "Optimize conversion funnel for e-commerce site" \
  --provider openai \
  --evaluationDomain ecommerce \
  --enableEvaluation \
  --enableAnalytics \
  --maxTokens 300
```

### **Check Available CLI Options**

```bash
# See all domain-related options
pnpm cli generate --help | grep -i evaluation
pnpm cli stream --help | grep -i evaluation
```

---

##  **Available Domains**

| Domain       | Use Case                        | Example Input                                                 |
| ------------ | ------------------------------- | ------------------------------------------------------------- |
| `healthcare` | Medical analysis, diagnostics   | "Analyze patient symptoms and suggest differential diagnosis" |
| `analytics`  | Data analysis, metrics          | "Analyze user behavior data and identify trends"              |
| `finance`    | Investment, risk assessment     | "Evaluate portfolio risk and diversification strategy"        |
| `ecommerce`  | Retail, conversion optimization | "Optimize product page for better conversion rates"           |

---

##  **Response Structure**

When using domain evaluation, you'll get enhanced responses:

```typescript
{
  content: "AI response content...",
  evaluation: {
    evaluationDomain: "healthcare",
    score: 0.85,
    criteria: ["accuracy", "safety", "compliance"],
    feedback: "Response demonstrates good medical accuracy..."
  },
  analytics: {
    domainRelevance: 0.92,
    complexityScore: 0.78,
    // ... additional analytics
  },
  usage: { /* token usage */ },
  provider: "openai",
  model: "gpt-4"
}
```

---

##  **Best Practices**

### **1. Choose Appropriate Domains**

- Use `healthcare` for medical/clinical content
- Use `analytics` for data analysis and metrics
- Use `finance` for financial analysis and risk assessment
- Use `ecommerce` for retail and conversion optimization

### **2. Enable Both Evaluation and Analytics**

```typescript
// ✅ Recommended: Enable both for full domain benefits
{
  evaluationDomain: "healthcare",
  enableEvaluation: true,   // Domain-specific quality evaluation
  enableAnalytics: true     // Enhanced analytics tracking
}
```

### **3. Use with Appropriate Providers**

```typescript
// ✅ Recommended providers for domain work
const providers = ["openai", "anthropic", "google-ai"];
```

### **4. Handle Domain Results**

```typescript
const result = await sdk.generate({
  input: { text: "Medical analysis request" },
  evaluationDomain: "healthcare",
  enableEvaluation: true,
});

// ✅ Always check if evaluation exists
if (result.evaluation) {
  console.log(`Domain: ${result.evaluation.evaluationDomain}`);
  console.log(`Quality Score: ${result.evaluation.score}`);
}

// ✅ Use analytics for insights
if (result.analytics) {
  console.log(`Domain Relevance: ${result.analytics.domainRelevance}`);
}
```

---

## ❌ **What Was Removed**

The complex interactive domain configuration system was removed because:

- **Over-engineered**: 240+ lines of configuration code for minimal benefit
- **Poor UX**: Users had to answer dozens of configuration questions
- **Unused**: Complex configurations weren't meaningfully used in practice
- **Redundant**: Simple domain parameters work better

### **Old Complex Approach (Removed)**

```typescript
// ❌ OLD: Complex configuration (removed)
await configManager.setupDomains();
// Would prompt for:
// - Healthcare evaluation criteria (6 options)
// - Analytics tracking preferences (3 options)
// - Finance risk metrics (3 options)
// - E-commerce conversion settings (3 options)
```

### **New Simple Approach (Current)**

```typescript
// ✅ NEW: Simple domain input
const result = await sdk.generate({
  input: { text: "Healthcare analysis" },
  evaluationDomain: "healthcare", // One simple parameter
  enableEvaluation: true,
});
```

---

##  **Migration Guide**

If you were using the old domain configuration:

1. **Remove old config**: `pnpm cli config reset` (optional)
2. **Use simple parameters**: Add `evaluationDomain` to your requests
3. **Enable features**: Use `enableEvaluation` and `enableAnalytics` flags

**Before:**

```bash
# Old: Complex setup required
pnpm cli config init  # Would prompt for domain setup
```

**After:**

```bash
# New: Direct usage
pnpm cli generate "Medical analysis" --evaluationDomain healthcare --enableEvaluation
```

---

This simplified approach gives you all the domain-specific AI benefits without configuration complexity.

---

## Security Best Practices

<!-- Source: guides/server-adapters/security.md -->

# Security Best Practices

**Protect your AI APIs with comprehensive security measures**

This guide covers authentication, authorization, rate limiting, and other security best practices for deploying NeuroLink server adapters in production.

## Rate Limiting

Protect your API from abuse with configurable rate limiting.

### Basic Configuration

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    rateLimit: {
      enabled: true,
      maxRequests: 100, // 100 requests
      windowMs: 60000, // per minute
      message: "Too many requests, please try again later",
      skipPaths: ["/api/health", "/api/ready"],
    },
  },
});
```

### Per-IP Rate Limiting

The default behavior limits requests by client IP:

```typescript

server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 60000, // 1 minute
    keyGenerator: (ctx) => {
      // Use X-Forwarded-For when behind a proxy
      return (
        ctx.headers["x-forwarded-for"]?.split(",")[0].trim() ||
        ctx.headers["x-real-ip"] ||
        "unknown"
      );
    },
    skipPaths: ["/api/health"],
  }),
);
```

### Per-User Rate Limiting

Limit based on authenticated user:

```typescript
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 1000,
    windowMs: 3600000, // 1 hour
    keyGenerator: (ctx) => {
      // Use user ID if authenticated, fall back to IP
      return ctx.user?.id || ctx.headers["x-forwarded-for"] || "anonymous";
    },
  }),
);
```

### Per-API-Key Rate Limiting

Different limits for different API keys:

```typescript
server.registerMiddleware(
  createRateLimitMiddleware({
    maxRequests: 100, // Default limit
    windowMs: 60000,
    keyGenerator: (ctx) => {
      const apiKey = ctx.headers["x-api-key"];
      return apiKey ? `key:${apiKey}` : `ip:${ctx.headers["x-forwarded-for"]}`;
    },
    onRateLimitExceeded: (ctx, retryAfter) => {
      // Custom response with tier info
      return {
        error: {
          code: "RATE_LIMIT_EXCEEDED",
          message: "Rate limit exceeded. Upgrade your plan for higher limits.",
          retryAfter,
          upgradeUrl: "https://example.com/pricing",
        },
      };
    },
  }),
);
```

### Sliding Window Rate Limiting

For smoother rate limiting that prevents burst-and-wait patterns:

```typescript

server.registerMiddleware(
  createSlidingWindowRateLimitMiddleware({
    maxRequests: 100,
    windowMs: 60000,
    subWindows: 10, // 10 sub-windows for smoother limiting
    keyGenerator: (ctx) => ctx.user?.id || ctx.headers["x-forwarded-for"],
  }),
);
```

### Rate Limit Headers

Rate limit middleware automatically adds headers to responses:

```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1706745660
```

### Rate Limit Response Headers

When a request exceeds the rate limit, the server returns HTTP 429 (Too Many Requests) with these headers:

| Header                  | Description                      | Example      |
| ----------------------- | -------------------------------- | ------------ |
| `X-RateLimit-Limit`     | Maximum requests per window      | `100`        |
| `X-RateLimit-Remaining` | Requests remaining in window     | `0`          |
| `X-RateLimit-Reset`     | Unix timestamp when limit resets | `1706745660` |
| `Retry-After`           | Seconds to wait before retrying  | `60`         |

Clients should respect the `Retry-After` header to avoid unnecessary requests.

---

## Stream Redaction

Protect sensitive data in streaming responses. **Redaction is disabled by default** and must be explicitly enabled.

### Why Disabled by Default?

Stream redaction is disabled by default because:

1. It adds processing overhead to every stream chunk
2. Developers should consciously decide what to redact
3. Overly aggressive redaction can break functionality

### Enabling Stream Redaction

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    redaction: {
      enabled: true, // Must explicitly enable
      // Default redacted fields: apiKey, token, authorization,
      // credentials, password, secret, request, args, result
    },
  },
});
```

### Custom Redaction Configuration

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    redaction: {
      enabled: true,

      // Add custom fields to redact
      additionalFields: ["ssn", "creditCard", "bankAccount", "privateKey"],

      // Preserve fields that would normally be redacted
      preserveFields: ["result"], // Don't redact tool results

      // Control tool-specific redaction
      redactToolArgs: true, // Redact tool arguments
      redactToolResults: false, // Don't redact results

      // Custom placeholder
      placeholder: "[SENSITIVE DATA REMOVED]",
    },
  },
});
```

### Programmatic Redaction

For custom streaming routes:

```typescript

const redactor = createStreamRedactor({
  enabled: true,
  additionalFields: ["customSecret"],
});

// Use in custom stream handling
for await (const chunk of stream) {
  const redactedChunk = redactor(chunk);
  response.write(redactedChunk);
}
```

---

## CORS Configuration

Properly configure Cross-Origin Resource Sharing:

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    cors: {
      enabled: true,

      // Specific origins only (never use "*" in production)
      origins: [
        "https://myapp.com",
        "https://staging.myapp.com",
        "https://admin.myapp.com",
      ],

      // Allowed HTTP methods
      methods: ["GET", "POST", "PUT", "DELETE", "OPTIONS"],

      // Allowed headers
      headers: ["Content-Type", "Authorization", "X-API-Key", "X-Request-ID"],

      // Allow credentials (cookies, authorization headers)
      credentials: true,

      // Preflight cache (reduce OPTIONS requests)
      maxAge: 86400, // 24 hours
    },
  },
});
```

### Dynamic CORS Origins

For multi-tenant applications:

```typescript
// Using Hono's native CORS middleware

const app = server.getFrameworkInstance();

app.use(
  "/api/*",
  cors({
    origin: (origin, c) => {
      // Validate origin against allowed list
      const allowedPattern = /^https:\/\/.*\.myapp\.com$/;

      if (allowedPattern.test(origin)) {
        return origin;
      }

      // Check database for custom domains
      // (be careful with async operations here)
      return null;
    },
    credentials: true,
  }),
);
```

---

## Security Headers

Add essential security headers to all responses. NeuroLink provides a built-in `createSecurityHeadersMiddleware` that works with **all server adapters** (Hono, Express, Fastify, Koa).

### Using NeuroLink Security Headers Middleware (All Adapters)

The recommended approach is to use NeuroLink's built-in security headers middleware, which works consistently across all frameworks:

```typescript

  createServer,
  createSecurityHeadersMiddleware,
} from "@juspay/neurolink";

const server = await createServer(neurolink, {
  framework: "hono", // Works with: hono, express, fastify, koa
  config: { port: 3000 },
});

server.registerMiddleware(
  createSecurityHeadersMiddleware({
    contentSecurityPolicy: "default-src 'self'; script-src 'self'",
    frameOptions: "DENY",
    contentTypeOptions: "nosniff",
    hstsMaxAge: 31536000,
    referrerPolicy: "strict-origin-when-cross-origin",
    customHeaders: {
      "X-Custom-Header": "custom-value",
    },
  }),
);

await server.initialize();
await server.start();
```

### Configuration Options

| Option                  | Type                              | Default                             | Description                    |
| ----------------------- | --------------------------------- | ----------------------------------- | ------------------------------ |
| `contentSecurityPolicy` | `string`                          | `undefined`                         | Content-Security-Policy header |
| `frameOptions`          | `"DENY" \| "SAMEORIGIN" \| false` | `"DENY"`                            | X-Frame-Options header         |
| `contentTypeOptions`    | `"nosniff" \| false`              | `"nosniff"`                         | X-Content-Type-Options header  |
| `hstsMaxAge`            | `number \| false`                 | `31536000` (1 year)                 | HSTS max-age in seconds        |
| `referrerPolicy`        | `string \| false`                 | `"strict-origin-when-cross-origin"` | Referrer-Policy header         |
| `customHeaders`         | `Record`          | `{}`                                | Additional custom headers      |

### Headers Set by the Middleware

The middleware automatically sets these security headers:

| Header                      | Default Value                         | Purpose                       |
| --------------------------- | ------------------------------------- | ----------------------------- |
| `X-Frame-Options`           | `DENY`                                | Prevents clickjacking attacks |
| `X-Content-Type-Options`    | `nosniff`                             | Prevents MIME type sniffing   |
| `Strict-Transport-Security` | `max-age=31536000; includeSubDomains` | Enforces HTTPS connections    |
| `Referrer-Policy`           | `strict-origin-when-cross-origin`     | Controls referrer information |
| `X-XSS-Protection`          | `1; mode=block`                       | XSS filter for older browsers |
| `Content-Security-Policy`   | (only if configured)                  | Controls resource loading     |

### Express Example

```typescript

  createServer,
  createSecurityHeadersMiddleware,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({ defaultProvider: "openai" });

const server = await createServer(neurolink, {
  framework: "express",
  config: { port: 3000 },
});

server.registerMiddleware(
  createSecurityHeadersMiddleware({
    contentSecurityPolicy: "default-src 'self'; img-src 'self' data:",
    frameOptions: "SAMEORIGIN",
    hstsMaxAge: 63072000, // 2 years
  }),
);

await server.initialize();
await server.start();
```

### Fastify Example

```typescript

  createServer,
  createSecurityHeadersMiddleware,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({ defaultProvider: "anthropic" });

const server = await createServer(neurolink, {
  framework: "fastify",
  config: { port: 3000 },
});

server.registerMiddleware(
  createSecurityHeadersMiddleware({
    frameOptions: "DENY",
    referrerPolicy: "no-referrer",
    customHeaders: {
      "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    },
  }),
);

await server.initialize();
await server.start();
```

### Koa Example

```typescript

  createServer,
  createSecurityHeadersMiddleware,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({ defaultProvider: "openai" });

const server = await createServer(neurolink, {
  framework: "koa",
  config: { port: 3000 },
});

server.registerMiddleware(
  createSecurityHeadersMiddleware({
    contentSecurityPolicy: "default-src 'self'",
    hstsMaxAge: 31536000,
  }),
);

await server.initialize();
await server.start();
```

### Hono Example

```typescript

  createServer,
  createSecurityHeadersMiddleware,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({ defaultProvider: "openai" });

const server = await createServer(neurolink, {
  framework: "hono",
  config: { port: 3000 },
});

server.registerMiddleware(
  createSecurityHeadersMiddleware({
    contentSecurityPolicy:
      "default-src 'self'; script-src 'self' 'unsafe-inline'",
    frameOptions: "DENY",
  }),
);

await server.initialize();
await server.start();
```

### Disabling Specific Headers

Set any option to `false` to disable that header:

```typescript
server.registerMiddleware(
  createSecurityHeadersMiddleware({
    frameOptions: false, // Disable X-Frame-Options
    hstsMaxAge: false, // Disable HSTS
    referrerPolicy: false, // Disable Referrer-Policy
  }),
);
```

### Framework-Specific Alternatives

If you prefer to use framework-native security middleware, you can access the underlying framework instance:

#### Using Hono's secureHeaders

```typescript

const app = server.getFrameworkInstance();

app.use(
  "*",
  secureHeaders({
    contentSecurityPolicy: {
      defaultSrc: ["'self'"],
      scriptSrc: ["'self'"],
      styleSrc: ["'self'", "'unsafe-inline'"],
    },
    xFrameOptions: "DENY",
    xContentTypeOptions: "nosniff",
    referrerPolicy: "strict-origin-when-cross-origin",
    permissionsPolicy: {
      camera: [],
      microphone: [],
      geolocation: [],
    },
  }),
);
```

#### Using Express with Helmet

```typescript

const app = server.getFrameworkInstance();

app.use(
  helmet({
    contentSecurityPolicy: {
      directives: {
        defaultSrc: ["'self'"],
        scriptSrc: ["'self'"],
        styleSrc: ["'self'", "'unsafe-inline'"],
      },
    },
    hsts: {
      maxAge: 31536000,
      includeSubDomains: true,
      preload: true,
    },
  }),
);
```

#### Using Koa with koa-helmet

```typescript

const app = server.getFrameworkInstance();

app.use(helmet());
```

---

## Production Security Checklist

### Authentication

- [ ] Implement authentication middleware
- [ ] Use secure token validation (verify signatures, check expiration)
- [ ] Configure skip paths carefully
- [ ] Implement token refresh mechanism
- [ ] Log authentication failures
- [ ] Implement account lockout after failed attempts

### Authorization

- [ ] Implement role-based access control (RBAC)
- [ ] Validate permissions for each endpoint
- [ ] Use principle of least privilege
- [ ] Audit authorization decisions

### Rate Limiting

- [ ] Enable rate limiting globally
- [ ] Configure appropriate limits per endpoint type
- [ ] Use sliding window for critical endpoints
- [ ] Implement different tiers for different users
- [ ] Monitor rate limit hits

### Data Protection

- [ ] Enable stream redaction for sensitive operations
- [ ] Configure custom fields to redact
- [ ] Validate and sanitize all inputs
- [ ] Encrypt sensitive data at rest
- [ ] Use TLS for all connections

### CORS

- [ ] Configure specific allowed origins (no wildcards)
- [ ] Restrict allowed methods and headers
- [ ] Enable credentials only if needed
- [ ] Set appropriate preflight cache

### Headers

- [ ] Add Content-Security-Policy
- [ ] Set X-Frame-Options to DENY
- [ ] Enable X-Content-Type-Options
- [ ] Configure Referrer-Policy
- [ ] Add Strict-Transport-Security (HSTS)

### Infrastructure

- [ ] Use HTTPS everywhere (terminate at load balancer)
- [ ] Configure firewall rules
- [ ] Use private networking for internal services
- [ ] Implement request timeout
- [ ] Set maximum body size limits
- [ ] Enable access logging
- [ ] Set up intrusion detection

### Monitoring

- [ ] Monitor authentication failures
- [ ] Alert on rate limit breaches
- [ ] Track unusual API patterns
- [ ] Log all security events
- [ ] Set up anomaly detection

---

## Security Validation via CLI

Use CLI commands to validate security configuration:

### Verify Security Settings

```bash
# Check authentication configuration
neurolink server config --get auth

# Check rate limiting settings
neurolink server config --get rateLimit
neurolink server config --get rateLimit.maxRequests

# Check CORS configuration
neurolink server config --get cors
neurolink server config --get cors.enabled
```

### Route Security Audit

```bash
# List all routes to verify middleware is applied
neurolink server routes --format json

# Check specific route groups
neurolink server routes --group agent  # Verify auth on agent routes
neurolink server routes --group health # Health routes (typically public)
```

### Security Configuration Checklist

| Setting       | Check Command                               | Recommended          |
| ------------- | ------------------------------------------- | -------------------- |
| Rate Limiting | `server config --get rateLimit.enabled`     | `true`               |
| Max Requests  | `server config --get rateLimit.maxRequests` | `100` per minute     |
| CORS          | `server config --get cors.enabled`          | `true` in production |
| CORS Origins  | `server config --get cors.origins`          | Specific domains     |

### Hardening Configuration

```bash
# Set stricter rate limits for production
neurolink server config --set rateLimit.maxRequests=50
neurolink server config --set rateLimit.windowMs=60000

# Verify changes
neurolink server config --format json
```

---

## Example: Complete Secure Server

```typescript

  createServer,
  createAuthMiddleware,
  createRateLimitMiddleware,
  createRoleMiddleware,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: 3000,
    host: "0.0.0.0",
    basePath: "/api",
    timeout: 30000,

    cors: {
      enabled: true,
      origins: [process.env.ALLOWED_ORIGIN],
      methods: ["GET", "POST"],
      headers: ["Content-Type", "Authorization"],
      credentials: true,
    },

    rateLimit: {
      enabled: true,
      maxRequests: 100,
      windowMs: 60000,
      skipPaths: ["/api/health", "/api/ready"],
    },

    bodyParser: {
      enabled: true,
      maxSize: "1mb",
      jsonLimit: "1mb",
    },

    redaction: {
      enabled: true,
      additionalFields: ["ssn", "creditCard"],
    },
  },
});

// Authentication
server.registerMiddleware(
  createAuthMiddleware({
    type: "bearer",
    validate: async (token) => {
      // Your JWT validation logic
      return validateJWT(token);
    },
    skipPaths: ["/api/health", "/api/ready", "/api/auth/login"],
  }),
);

// Additional rate limit for AI endpoints
server.registerMiddleware({
  name: "ai-rate-limit",
  order: 6,
  paths: ["/api/agent/*"],
  handler: async (ctx, next) => {
    // Stricter rate limit for AI operations
    // Implementation here
    return next();
  },
});

// Admin-only endpoints
server.registerMiddleware(
  createRoleMiddleware({
    requiredRoles: ["admin"],
    errorMessage: "Admin access required",
  }),
);

await server.initialize();
await server.start();

console.log("Secure server running on port 3000");
```

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Deployment Guide](/docs/guides/server-adapters/deployment)** - Production deployment strategies
- **[Hono Adapter](/docs/guides/server-adapters/hono)** - Hono-specific security features
- **[Enterprise Monitoring](/docs/observability/health-monitoring)** - Security monitoring

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## MCP Server Catalog

<!-- Source: guides/mcp/server-catalog.md -->

# MCP External Servers Catalog

**Comprehensive directory of 58+ Model Context Protocol servers for extending AI capabilities**

---------- | ------------- | --------------------------------------------------------------- |
| **stdio**     | Local servers | Default for CLI-based MCP servers                               |
| **SSE**       | Web servers   | Server-Sent Events for HTTP streaming                           |
| **WebSocket** | Real-time     | Bidirectional real-time communication                           |
| **HTTP**      | Remote APIs   | HTTP/Streamable HTTP for remote MCP servers with authentication |

### Categories

- **️ Data & Storage** (12 servers): Databases, file systems, cloud storage
- ** Web & APIs** (10 servers): Web scraping, HTTP clients, REST APIs
- ** Development Tools** (15 servers): Git, Docker, package managers
- ** Productivity** (8 servers): Google Drive, Notion, Slack, Email
- ** Search & Knowledge** (6 servers): Web search, knowledge bases
- ** System & Utilities** (7 servers): System operations, monitoring

---

## Quick Start

### Installing an MCP Server

```typescript

const ai = new NeuroLink({
  providers: [
    {
      name: "anthropic",
      config: { apiKey: process.env.ANTHROPIC_API_KEY },
    },
  ],
  mcpServers: [
    {
      name: "filesystem",
      command: "npx",
      args: [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/yourname/Documents",
      ],
      description: "Access local filesystem",
    },
    {
      name: "github",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-github"],
      env: {
        GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN,
      },
      description: "Interact with GitHub repositories",
    },
  ],
});

// Use MCP tools
const result = await ai.generate({
  input: { text: "List files in my Documents folder" },
  provider: "anthropic",
  tools: "auto", // Automatically uses MCP tools
});
```

---

## Official MCP Servers

### @modelcontextprotocol/server-filesystem

**Access local filesystem with read/write capabilities**

```bash
# Install
npx -y @modelcontextprotocol/server-filesystem [allowed-directory]
```

**Features:**

- Read files and directories
- Write and create files
- Search file contents
- Move and delete files
- Get file metadata

**Use Cases:**

- Document processing
- Code analysis
- Log file analysis
- Automated file management

**Configuration:**

```typescript
mcpServers: [
  {
    name: "filesystem",
    command: "npx",
    args: [
      "-y",
      "@modelcontextprotocol/server-filesystem",
      "/Users/yourname/Documents",
    ],
    description: "Access Documents folder",
  },
];
```

**Example Usage:**

```
User: "Summarize all markdown files in my Documents"
AI: *uses filesystem server to read .md files, then summarizes*
```

---

### @modelcontextprotocol/server-github

**Complete GitHub integration**

```bash
# Install
npm install -g @modelcontextprotocol/server-github

# Set token
export GITHUB_PERSONAL_ACCESS_TOKEN=ghp_your_token
```

**Features:**

- Search repositories
- Create/update issues and PRs
- Read file contents
- Manage branches
- Search code
- List commits

**Use Cases:**

- Automated code reviews
- Issue management
- Repository analysis
- CI/CD integration

**Configuration:**

```typescript
mcpServers: [
  {
    name: "github",
    command: "npx",
    args: ["-y", "@modelcontextprotocol/server-github"],
    env: {
      GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN,
    },
  },
];
```

**Example Usage:**

```
User: "Create an issue in my repo about the authentication bug"
AI: *creates GitHub issue with description*
```

---

### @modelcontextprotocol/server-postgres

**PostgreSQL database access**

```bash
# Install
npm install -g @modelcontextprotocol/server-postgres
```

**Features:**

- Execute SQL queries
- List schemas and tables
- Analyze query performance
- Database introspection

**Configuration:**

```typescript
mcpServers: [
  {
    name: "postgres",
    command: "npx",
    args: ["-y", "@modelcontextprotocol/server-postgres"],
    env: {
      POSTGRES_CONNECTION_STRING: "postgresql://user:pass@localhost:5432/mydb",
    },
  },
];
```

**Example Usage:**

```
User: "How many users signed up this month?"
AI: *queries database and provides count*
```

---

### @modelcontextprotocol/server-google-drive

**Google Drive integration**

```bash
npm install -g @modelcontextprotocol/server-google-drive
```

**Features:**

- Search files and folders
- Read document contents
- Upload files
- Share files
- Manage permissions

**Configuration:**

```typescript
mcpServers: [
  {
    name: "gdrive",
    command: "npx",
    args: ["-y", "@modelcontextprotocol/server-google-drive"],
    env: {
      GOOGLE_APPLICATION_CREDENTIALS: "/path/to/credentials.json",
    },
  },
];
```

---

### @modelcontextprotocol/server-slack

**Slack workspace integration**

**Features:**

- Send messages
- Read channel history
- Search messages
- Manage channels
- User information

**Configuration:**

```typescript
mcpServers: [
  {
    name: "slack",
    command: "npx",
    args: ["-y", "@modelcontextprotocol/server-slack"],
    env: {
      SLACK_BOT_TOKEN: process.env.SLACK_BOT_TOKEN,
      SLACK_TEAM_ID: process.env.SLACK_TEAM_ID,
    },
  },
];
```

---

## Data & Storage Servers (12)

### Databases

| Server       | Description           | Install                                       | Auth              |
| ------------ | --------------------- | --------------------------------------------- | ----------------- |
| **postgres** | PostgreSQL database   | `npx @modelcontextprotocol/server-postgres`   | Connection string |
| **sqlite**   | SQLite database       | `npx @modelcontextprotocol/server-sqlite`     | File path         |
| **mysql**    | MySQL/MariaDB         | `npx @modelcontextprotocol/server-mysql`      | Connection string |
| **mongodb**  | MongoDB database      | `npm -g @modelcontextprotocol/server-mongodb` | Connection string |
| **redis**    | Redis key-value store | `npm -g @modelcontextprotocol/server-redis`   | Connection string |

### File Systems & Cloud Storage

| Server           | Description        | Install                                          | Auth              |
| ---------------- | ------------------ | ------------------------------------------------ | ----------------- |
| **filesystem**   | Local filesystem   | `npx @modelcontextprotocol/server-filesystem`    | Directory path    |
| **google-drive** | Google Drive       | `npx @modelcontextprotocol/server-google-drive`  | OAuth credentials |
| **aws-s3**       | Amazon S3 storage  | `npm -g @modelcontextprotocol/server-aws-s3`     | AWS credentials   |
| **azure-blob**   | Azure Blob Storage | `npm -g @modelcontextprotocol/server-azure-blob` | Azure credentials |
| **dropbox**      | Dropbox storage    | `npm -g @modelcontextprotocol/server-dropbox`    | OAuth token       |

---

## Web & APIs Servers (10)

| Server            | Description          | Install                                             | Key Features               |
| ----------------- | -------------------- | --------------------------------------------------- | -------------------------- |
| **fetch**         | HTTP client          | `npx @modelcontextprotocol/server-fetch`            | GET/POST requests, headers |
| **puppeteer**     | Browser automation   | `npx @modelcontextprotocol/server-puppeteer`        | Web scraping, screenshots  |
| **brave-search**  | Brave Search API     | `npm -g @modelcontextprotocol/server-brave-search`  | Web search, news           |
| **google-search** | Google Custom Search | `npm -g @modelcontextprotocol/server-google-search` | Web search, images         |
| **exa**           | Exa search engine    | `npm -g @modelcontextprotocol/server-exa`           | Semantic web search        |
| **weather**       | Weather data         | `npm -g @modelcontextprotocol/server-weather`       | Current & forecast         |
| **news**          | News aggregator      | `npm -g @modelcontextprotocol/server-news`          | Latest news articles       |
| **rss**           | RSS feed reader      | `npm -g @modelcontextprotocol/server-rss`           | Feed parsing               |
| **http-api**      | Generic HTTP API     | `npm -g @modelcontextprotocol/server-http-api`      | REST API client            |
| **graphql**       | GraphQL client       | `npm -g @modelcontextprotocol/server-graphql`       | GraphQL queries            |

---

## Development Tools Servers (15)

### Version Control

| Server     | Description          | Install                                      | Features                 |
| ---------- | -------------------- | -------------------------------------------- | ------------------------ |
| **github** | GitHub API           | `npx @modelcontextprotocol/server-github`    | Repos, issues, PRs       |
| **gitlab** | GitLab API           | `npm -g @modelcontextprotocol/server-gitlab` | Projects, merge requests |
| **git**    | Local Git operations | `npx @modelcontextprotocol/server-git`       | Commit, branch, diff     |

### CI/CD & DevOps

| Server         | Description            | Install                                          | Features           |
| -------------- | ---------------------- | ------------------------------------------------ | ------------------ |
| **docker**     | Docker management      | `npm -g @modelcontextprotocol/server-docker`     | Containers, images |
| **kubernetes** | K8s cluster mgmt       | `npm -g @modelcontextprotocol/server-kubernetes` | Pods, deployments  |
| **terraform**  | Infrastructure as code | `npm -g @modelcontextprotocol/server-terraform`  | Plan, apply, state |
| **aws**        | AWS operations         | `npm -g @modelcontextprotocol/server-aws`        | EC2, S3, Lambda    |
| **gcp**        | Google Cloud           | `npm -g @modelcontextprotocol/server-gcp`        | Compute, storage   |
| **azure**      | Microsoft Azure        | `npm -g @modelcontextprotocol/server-azure`      | VMs, storage       |

### Package Managers

| Server    | Description     | Install                                     | Features              |
| --------- | --------------- | ------------------------------------------- | --------------------- |
| **npm**   | NPM packages    | `npx @modelcontextprotocol/server-npm`      | Search, install, info |
| **pip**   | Python packages | `npm -g @modelcontextprotocol/server-pip`   | Search, install       |
| **cargo** | Rust packages   | `npm -g @modelcontextprotocol/server-cargo` | Crates.io search      |

---

## Productivity Servers (8)

| Server              | Description      | Install                                               | Key Features        |
| ------------------- | ---------------- | ----------------------------------------------------- | ------------------- |
| **google-drive**    | Google Drive     | `npx @modelcontextprotocol/server-google-drive`       | Files, docs, sheets |
| **google-calendar** | Google Calendar  | `npm -g @modelcontextprotocol/server-google-calendar` | Events, scheduling  |
| **google-gmail**    | Gmail            | `npm -g @modelcontextprotocol/server-google-gmail`    | Send, read emails   |
| **slack**           | Slack workspace  | `npx @modelcontextprotocol/server-slack`              | Messages, channels  |
| **notion**          | Notion workspace | `npm -g @modelcontextprotocol/server-notion`          | Pages, databases    |
| **trello**          | Trello boards    | `npm -g @modelcontextprotocol/server-trello`          | Cards, lists        |
| **jira**            | Jira issues      | `npm -g @modelcontextprotocol/server-jira`            | Issues, sprints     |
| **linear**          | Linear issues    | `npm -g @modelcontextprotocol/server-linear`          | Issues, projects    |

---

## Search & Knowledge Servers (6)

| Server            | Description     | Install                                             | Use Case                |
| ----------------- | --------------- | --------------------------------------------------- | ----------------------- |
| **brave-search**  | Web search      | `npm -g @modelcontextprotocol/server-brave-search`  | General web search      |
| **google-search** | Google search   | `npm -g @modelcontextprotocol/server-google-search` | Web & image search      |
| **exa**           | Semantic search | `npm -g @modelcontextprotocol/server-exa`           | AI-powered search       |
| **wikipedia**     | Wikipedia       | `npm -g @modelcontextprotocol/server-wikipedia`     | Encyclopedia lookup     |
| **wolfram**       | Wolfram Alpha   | `npm -g @modelcontextprotocol/server-wolfram`       | Computational knowledge |
| **arxiv**         | Research papers | `npm -g @modelcontextprotocol/server-arxiv`         | Academic papers         |

---

## System & Utilities Servers (7)

| Server         | Description       | Install                                          | Features              |
| -------------- | ----------------- | ------------------------------------------------ | --------------------- |
| **shell**      | Shell commands    | `npx @modelcontextprotocol/server-shell`         | Execute commands      |
| **time**       | Time utilities    | `npm -g @modelcontextprotocol/server-time`       | Timezones, formatting |
| **memory**     | Persistent memory | `npx @modelcontextprotocol/server-memory`        | Store/retrieve data   |
| **calculator** | Math operations   | `npm -g @modelcontextprotocol/server-calculator` | Calculations          |
| **encryption** | Crypto operations | `npm -g @modelcontextprotocol/server-encryption` | Encrypt/decrypt       |
| **qr-code**    | QR code generator | `npm -g @modelcontextprotocol/server-qr-code`    | Generate QR codes     |
| **image**      | Image processing  | `npm -g @modelcontextprotocol/server-image`      | Resize, convert       |

---

## Remote HTTP MCP Servers

NeuroLink supports connecting to remote MCP servers over HTTP/Streamable HTTP transport with authentication, retry logic, and rate limiting.

### Configuring Remote HTTP Servers

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "anthropic", config: { apiKey: process.env.ANTHROPIC_API_KEY } },
  ],
  mcpServers: [
    // Remote API with Bearer token
    {
      name: "remote-api",
      transport: "http",
      url: "https://api.example.com/mcp",
      headers: {
        Authorization: `Bearer ${process.env.API_TOKEN}`,
      },
      httpOptions: {
        connectionTimeout: 30000,
        requestTimeout: 60000,
      },
      retryConfig: {
        maxAttempts: 3,
        initialDelay: 1000,
        maxDelay: 30000,
      },
    },

    // Remote server with API key
    {
      name: "external-tools",
      transport: "http",
      url: "https://tools.example.com/mcp",
      headers: {
        "X-API-Key": process.env.TOOLS_API_KEY,
      },
      rateLimiting: {
        requestsPerMinute: 60,
        maxBurst: 10,
      },
    },

    // OAuth 2.1 protected server
    {
      name: "oauth-protected",
      transport: "http",
      url: "https://secure.example.com/mcp",
      auth: {
        type: "oauth2",
        oauth: {
          clientId: process.env.OAUTH_CLIENT_ID,
          clientSecret: process.env.OAUTH_CLIENT_SECRET,
          tokenEndpoint: "https://auth.example.com/oauth/token",
          scopes: ["mcp:read", "mcp:write"],
          usePKCE: true,
        },
      },
    },
  ],
});
```

### HTTP Transport Configuration Options

| Option                           | Type      | Description                               |
| -------------------------------- | --------- | ----------------------------------------- |
| `transport`                      | `"http"`  | Transport type for remote servers         |
| `url`                            | `string`  | URL of the remote MCP endpoint            |
| `headers`                        | `object`  | HTTP headers for authentication           |
| `httpOptions.connectionTimeout`  | `number`  | Connection timeout in ms (default: 30000) |
| `httpOptions.requestTimeout`     | `number`  | Request timeout in ms (default: 60000)    |
| `httpOptions.idleTimeout`        | `number`  | Idle timeout in ms (default: 120000)      |
| `httpOptions.keepAliveTimeout`   | `number`  | Keep-alive timeout in ms (default: 30000) |
| `retryConfig.maxAttempts`        | `number`  | Max retry attempts (default: 3)           |
| `retryConfig.initialDelay`       | `number`  | Initial retry delay in ms (default: 1000) |
| `retryConfig.maxDelay`           | `number`  | Max retry delay in ms (default: 30000)    |
| `retryConfig.backoffMultiplier`  | `number`  | Backoff multiplier (default: 2)           |
| `rateLimiting.requestsPerMinute` | `number`  | Rate limit per minute                     |
| `rateLimiting.maxBurst`          | `number`  | Max burst requests                        |
| `rateLimiting.useTokenBucket`    | `boolean` | Use token bucket algorithm                |

### Authentication Types

**Bearer Token:**

```typescript
{
  headers: {
    "Authorization": "Bearer YOUR_TOKEN"
  }
}
```

**API Key:**

```typescript
{
  headers: {
    "X-API-Key": "your-api-key"
  }
}
```

**OAuth 2.1 with PKCE:**

```typescript
{
  auth: {
    type: "oauth2",
    oauth: {
      clientId: "your-client-id",
      clientSecret: "your-client-secret",
      tokenEndpoint: "https://auth.example.com/oauth/token",
      scopes: ["mcp:read", "mcp:write"],
      usePKCE: true
    }
  }
}
```

See [MCP HTTP Transport Guide](/docs/mcp/http-transport) for complete documentation.

---

## Advanced Integrations

### Multi-Server Setup

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "anthropic", config: { apiKey: process.env.ANTHROPIC_API_KEY } },
  ],
  mcpServers: [
    // Filesystem access
    {
      name: "filesystem",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", process.cwd()],
    },

    // GitHub integration
    {
      name: "github",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-github"],
      env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN },
    },

    // PostgreSQL database
    {
      name: "postgres",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-postgres"],
      env: { POSTGRES_CONNECTION_STRING: process.env.DATABASE_URL },
    },

    // Web search
    {
      name: "brave-search",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-brave-search"],
      env: { BRAVE_API_KEY: process.env.BRAVE_API_KEY },
    },

    // Slack integration
    {
      name: "slack",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-slack"],
      env: {
        SLACK_BOT_TOKEN: process.env.SLACK_BOT_TOKEN,
        SLACK_TEAM_ID: process.env.SLACK_TEAM_ID,
      },
    },
  ],
});

// AI can now use all these tools automatically
const result = await ai.generate({
  input: {
    text: `
      1. Search for "TypeScript best practices"
      2. Create a GitHub issue with the findings
      3. Query our users table for signup trends
      4. Send summary to #engineering Slack channel
    `,
  },
  provider: "anthropic",
  model: "claude-3-5-sonnet-20241022",
  tools: "auto",
});
```

### Custom MCP Server

Create your own MCP server:

```typescript
// my-custom-server.ts

const server = new Server(
  {
    name: "my-custom-server",
    version: "1.0.0",
  },
  {
    capabilities: {
      tools: {},
    },
  },
);

// Define custom tools
server.setRequestHandler("tools/list", async () => ({
  tools: [
    {
      name: "custom_api_call",
      description: "Call my custom API",
      inputSchema: {
        type: "object",
        properties: {
          endpoint: { type: "string" },
          method: { type: "string", enum: ["GET", "POST"] },
        },
        required: ["endpoint"],
      },
    },
  ],
}));

server.setRequestHandler("tools/call", async (request) => {
  if (request.params.name === "custom_api_call") {
    const { endpoint, method = "GET" } = request.params.arguments;

    const response = await fetch(`https://myapi.com/${endpoint}`, {
      method,
      headers: { Authorization: `Bearer ${process.env.API_KEY}` },
    });

    return {
      content: [
        {
          type: "text",
          text: JSON.stringify(await response.json(), null, 2),
        },
      ],
    };
  }

  throw new Error("Unknown tool");
});

// Start server
const transport = new StdioServerTransport();
await server.connect(transport);
```

Use custom server:

```typescript
mcpServers: [
  {
    name: "my-custom-server",
    command: "node",
    args: ["./my-custom-server.js"],
    env: {
      API_KEY: process.env.MY_API_KEY,
    },
  },
];
```

---

## Use Case Examples

### 1. Code Review Automation

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "anthropic", config: { apiKey: process.env.ANTHROPIC_API_KEY } },
  ],
  mcpServers: [
    {
      name: "github",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-github"],
    },
    {
      name: "filesystem",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", "./"],
    },
  ],
});

const result = await ai.generate({
  input: { text: "Review all open PRs in my repo and suggest improvements" },
  tools: "auto",
});
```

### 2. Database Analytics

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "anthropic", config: { apiKey: process.env.ANTHROPIC_API_KEY } },
  ],
  mcpServers: [
    {
      name: "postgres",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-postgres"],
      env: { POSTGRES_CONNECTION_STRING: process.env.DATABASE_URL },
    },
  ],
});

const result = await ai.generate({
  input: {
    text: "Analyze user signup trends for the past 3 months and identify patterns",
  },
  tools: "auto",
});
```

### 3. Customer Support Automation

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "anthropic", config: { apiKey: process.env.ANTHROPIC_API_KEY } },
  ],
  mcpServers: [
    {
      name: "slack",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-slack"],
    },
    {
      name: "jira",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-jira"],
    },
    {
      name: "notion",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-notion"],
    },
  ],
});

const result = await ai.generate({
  input: {
    text: `
      1. Read recent support tickets from Jira
      2. Categorize by priority
      3. Create summary in Notion
      4. Alert #support channel in Slack for P0 issues
    `,
  },
  tools: "auto",
});
```

---

## Best Practices

### 1. ✅ Limit Server Permissions

```typescript
// ✅ Good: Restrict filesystem access
mcpServers: [
  {
    name: "filesystem",
    command: "npx",
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/safe/directory"],
    // Not entire system: '/'
  },
];
```

### 2. ✅ Use Environment Variables for Secrets

```typescript
// ✅ Good: Store secrets in env vars
env: {
  GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN; // From .env
  // Not hardcoded: 'ghp_abc123...'
}
```

### 3. ✅ Test Servers Individually

```typescript
// ✅ Test each server works before combining
const testServer = new NeuroLink({
  mcpServers: [
    {
      name: "github", // Test one at a time
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-github"],
    },
  ],
});
```

### 4. ✅ Monitor MCP Server Usage

```typescript
// ✅ Track tool usage via analytics middleware
const neurolink = new NeuroLink({
  middleware: {
    analytics: {
      enabled: true,
    },
  },
});

const result = await neurolink.generate({
  input: { text: "Your prompt" },
  tools: "auto",
});

// Analytics data is available in the result metadata
// You can also enable debug logging to see tool execution details:
// DEBUG=neurolink:* npx neurolink generate "Your prompt"
```

### 5. ✅ Handle Server Failures Gracefully

```typescript
// ✅ Provide fallback when MCP server fails
try {
  const result = await ai.generate({
    input: { text: "Search GitHub for TypeScript repos" },
    tools: "auto",
  });
} catch (error) {
  if (error.message.includes("MCP server")) {
    console.error("MCP server unavailable, using basic search");
    // Fallback to non-MCP approach
  }
  throw error;
}
```

---

## Troubleshooting

### Server Won't Start

**Problem**: MCP server fails to initialize.

**Solution**:

```bash
# Test server manually
npx @modelcontextprotocol/server-github

# Check logs
DEBUG=mcp:* npx @modelcontextprotocol/server-github

# Verify installation
npm list -g | grep modelcontextprotocol
```

### Authentication Errors

**Problem**: Server can't authenticate with external service.

**Solution**:

```bash
# Verify environment variables
echo $GITHUB_PERSONAL_ACCESS_TOKEN

# Check token permissions
# - GitHub: repo, read:org scopes required
# - Google: OAuth scopes must include drive.readonly
```

### Tool Not Available

**Problem**: AI can't see MCP tools.

**Solution**:

```typescript
// Verify server is loaded
console.log(ai.listMCPServers());

// Explicitly enable tools
const result = await ai.generate({
  input: { text: "Your prompt" },
  tools: "auto", // Must be 'auto' or specific tool list
  provider: "anthropic", // MCP requires Claude 3.5+
});
```

---

## Related Documentation

- **[MCP Integration Guide](/docs/mcp/integration)** - Detailed MCP setup
- **[Custom Tools](/docs/sdk/custom-tools)** - Create and use custom MCP servers
- **[Security](/docs/guides/enterprise/compliance)** - MCP security best practices

---

## Additional Resources

- **[MCP Specification](https://spec.modelcontextprotocol.io/)** - Official protocol spec
- **[MCP GitHub](https://github.com/modelcontextprotocol)** - Source code
- **[Server Registry](https://github.com/modelcontextprotocol/servers)** - Official servers
- **[Community Servers](https://github.com/topics/mcp-server)** - Community contributions

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Migrating from LangChain to NeuroLink

<!-- Source: guides/migration/from-langchain.md -->

# Migrating from LangChain to NeuroLink

## Why Migrate?

NeuroLink offers a simpler, more production-ready alternative to LangChain with these key advantages:

| Benefit                 | LangChain                                   | NeuroLink                                          |
| ----------------------- | ------------------------------------------- | -------------------------------------------------- |
| **TypeScript Support**  | Partial, many type issues                   | Full native TypeScript, complete type safety       |
| **API Complexity**      | Complex chains, agents, memory abstractions | Single unified `generate()` API                    |
| **Provider Support**    | Requires separate packages                  | 13 providers built-in, single package              |
| **Enterprise Features** | Limited                                     | HITL workflows, Redis memory, middleware, failover |
| **MCP Integration**     | None                                        | Native 58+ MCP servers with zero config            |
| **Bundle Size**         | Large (many dependencies)                   | Optimized, tree-shakeable                          |
| **Production Ready**    | Community-driven                            | Battle-tested at Juspay (enterprise scale)         |

**Migration time:** Most applications can migrate in 1-2 hours, with full feature parity and improved capabilities.

-------------------------------- | --------------------------- | -------------------------------- |
| `ChatOpenAI`, `ChatAnthropic`, etc. | `provider` parameter        | Single unified interface         |
| `LLMChain`                          | `generate()` method         | No chain abstraction needed      |
| `ConversationChain`                 | `conversationMemory` config | Built-in conversation tracking   |
| `Agent` + `Tools`                   | MCP Tools                   | Native tool support, 58+ servers |
| `Memory` (BufferMemory, etc.)       | `conversationMemory`        | Redis or in-memory               |
| `Callbacks`                         | Middleware system           | More powerful, composable        |
| `VectorStoreRetriever`              | Custom tools + external MCP | Use MCP for RAG integrations     |
| `OutputParser`                      | `structuredOutput`          | Zod schema validation            |
| `PromptTemplate`                    | Template literals / utils   | Use native JS/TS patterns        |

---

## Quick Start Migration

### Before (LangChain)

```typescript

const chat = new ChatOpenAI({
  modelName: "gpt-4",
  temperature: 0.7,
});

const response = await chat.call([new HumanMessage("Hello, how are you?")]);

console.log(response.content);
```

### After (NeuroLink)

```typescript

const neurolink = new NeuroLink({
  provider: "openai",
  model: "gpt-4",
});

const result = await neurolink.generate({
  input: { text: "Hello, how are you?" },
  temperature: 0.7,
});

console.log(result.content);
```

**Key changes:**

- Single import instead of multiple
- Unified `generate()` method instead of `call()`
- Simpler message format (no `HumanMessage` wrapper)
- Type-safe result with `content` property

---

## Feature-by-Feature Migration

### 1. Chat Models

**LangChain:**

```typescript

// OpenAI
const openai = new ChatOpenAI({ modelName: "gpt-4" });

// Anthropic
const anthropic = new ChatAnthropic({
  modelName: "claude-3-5-sonnet-20241022",
});
```

**NeuroLink:**

```typescript

// OpenAI
const openai = new NeuroLink({ provider: "openai", model: "gpt-4" });

// Anthropic
const anthropic = new NeuroLink({
  provider: "anthropic",
  model: "claude-3-5-sonnet-20241022",
});

// Or switch providers dynamically
const neurolink = new NeuroLink();
const result1 = await neurolink.generate({
  input: { text: "Hello" },
  provider: "openai",
});
const result2 = await neurolink.generate({
  input: { text: "Hello" },
  provider: "anthropic",
});
```

**Benefits:**

- No separate packages for each provider
- Consistent API across all 13 providers
- Runtime provider switching
- Automatic failover

---

### 2. Chains

**LangChain:**

```typescript

const prompt = PromptTemplate.fromTemplate(
  "Write a {adjective} story about {subject}",
);

const chain = new LLMChain({
  llm: new ChatOpenAI(),
  prompt,
});

const result = await chain.call({
  adjective: "funny",
  subject: "a robot",
});
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

// Use template literals (native JS)
const generateStory = async (adjective: string, subject: string) => {
  return await neurolink.generate({
    input: {
      text: `Write a ${adjective} story about ${subject}`,
    },
  });
};

const result = await generateStory("funny", "a robot");
```

**Benefits:**

- No chain abstraction needed
- Use native JavaScript template literals
- More flexible, easier to debug
- Direct control over prompts

---

### 3. Agents and Tools

**LangChain:**

```typescript

const model = new ChatOpenAI({ temperature: 0 });
const tools = [new Calculator(), new SerpAPI()];

const executor = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: "chat-conversational-react-description",
});

const result = await executor.call({
  input: "What's 25 * 4, and what's the weather in NYC?",
});
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

// Built-in tools work automatically
const result = await neurolink.generate({
  input: {
    text: "What's 25 * 4?", // Uses built-in calculateMath tool
  },
});

// Add external MCP tools
await neurolink.addExternalMCPServer("serpapi", {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-serpapi"],
  transport: "stdio",
  env: { SERPAPI_API_KEY: process.env.SERPAPI_API_KEY },
});

const result2 = await neurolink.generate({
  input: {
    text: "What's the weather in NYC?", // Uses SerpAPI MCP tool
  },
});
```

**Benefits:**

- 6 core tools work out-of-the-box (no setup)
- 58+ MCP servers available
- No complex agent configuration
- AI automatically chooses tools

---

### 4. Memory

**LangChain:**

```typescript

const memory = new BufferMemory();
const model = new ChatOpenAI();

const chain = new ConversationChain({ llm: model, memory });

await chain.call({ input: "Hi, I'm John" });
await chain.call({ input: "What's my name?" }); // Remembers "John"
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({
  provider: "openai",
  conversationMemory: {
    enabled: true,
    store: "in-memory", // or "redis" for distributed
  },
});

await neurolink.generate({
  input: { text: "Hi, I'm John" },
});

await neurolink.generate({
  input: { text: "What's my name?" }, // Remembers "John"
});
```

**With Redis (production):**

```typescript
const neurolink = new NeuroLink({
  provider: "openai",
  conversationMemory: {
    enabled: true,
    store: "redis",
    redis: {
      host: "localhost",
      port: 6379,
    },
    ttl: 86400, // 24 hours
  },
});
```

**Benefits:**

- Built-in conversation tracking
- Redis support for distributed systems
- Automatic context management
- Export conversations to JSON

---

### 5. Callbacks

**LangChain:**

```typescript

const model = new ChatOpenAI({
  callbacks: [new ConsoleCallbackHandler()],
});

await model.call([new HumanMessage("Hello")]);
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

// Use middleware for callbacks
neurolink.useMiddleware({
  name: "logging",
  requestHook: async (options) => {
    console.log("Request:", options);
    return options;
  },
  responseHook: async (result) => {
    console.log("Response:", result);
    return result;
  },
});

await neurolink.generate({
  input: { text: "Hello" },
});
```

**Built-in middleware:**

```typescript

const neurolink = new NeuroLink({
  provider: "openai",
  middleware: {
    analytics: { enabled: true },
    autoEvaluation: { enabled: true },
  },
});
```

**Benefits:**

- More powerful than callbacks
- Composable middleware system
- Built-in analytics and auto-evaluation
- Request and response hooks

---

## Common Patterns

### Pattern 1: RAG Applications

**LangChain:**

```typescript

const vectorStore = await HNSWLib.fromTexts(
  ["text1", "text2"],
  [{ id: 1 }, { id: 2 }],
  new OpenAIEmbeddings(),
);

const model = new ChatOpenAI();
const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever());

const response = await chain.call({
  query: "What is the answer?",
});
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

// Option 1: Use MCP server for vector search
await neurolink.addExternalMCPServer("postgres", {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-postgres"],
  transport: "stdio",
  env: {
    DATABASE_URL: process.env.DATABASE_URL,
  },
});

// AI can now query vector DB directly via MCP
const result = await neurolink.generate({
  input: {
    text: "Search the knowledge base for information about X",
  },
});

// Option 2: Manual retrieval + context
const retrieveContext = async (query: string) => {
  // Your vector search logic
  return ["relevant doc 1", "relevant doc 2"];
};

const docs = await retrieveContext("What is the answer?");
const result = await neurolink.generate({
  input: {
    text: `Context: ${docs.join("\n\n")}\n\nQuestion: What is the answer?`,
  },
});
```

**Benefits:**

- Use MCP for database/vector integrations
- More flexible retrieval strategies
- Direct control over context injection

---

### Pattern 2: Chatbots

**LangChain:**

```typescript

const memory = new BufferWindowMemory({ k: 5 });
const model = new ChatOpenAI({ temperature: 0.7 });
const chain = new ConversationChain({
  llm: model,
  memory,
});

// Chat loop
while (true) {
  const input = await getUserInput();
  const response = await chain.call({ input });
  console.log(response.response);
}
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({
  provider: "openai",
  temperature: 0.7,
  conversationMemory: {
    enabled: true,
    store: "redis", // Production-ready
    maxMessages: 10, // Keep last 10 messages
  },
});

// Chat loop
while (true) {
  const input = await getUserInput();
  const result = await neurolink.generate({
    input: { text: input },
  });
  console.log(result.content);
}

// Export conversation history
const history = await neurolink.exportConversation({
  format: "json",
});
```

**Benefits:**

- Redis support for multi-instance deployments
- Automatic context windowing
- Export conversations for analytics
- Built-in conversation management

---

### Pattern 3: Multi-step Workflows

**LangChain:**

```typescript

const llm = new ChatOpenAI();

// Step 1: Generate outline
const outlineChain = new LLMChain({
  llm,
  prompt: PromptTemplate.fromTemplate("Create outline for: {topic}"),
  outputKey: "outline",
});

// Step 2: Write content
const contentChain = new LLMChain({
  llm,
  prompt: PromptTemplate.fromTemplate("Write content for: {outline}"),
  outputKey: "content",
});

const overall = new SequentialChain({
  chains: [outlineChain, contentChain],
  inputVariables: ["topic"],
  outputVariables: ["outline", "content"],
});

const result = await overall.call({ topic: "AI" });
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

const createContent = async (topic: string) => {
  // Step 1: Generate outline
  const outlineResult = await neurolink.generate({
    input: { text: `Create an outline for: ${topic}` },
  });

  // Step 2: Write content
  const contentResult = await neurolink.generate({
    input: { text: `Write content for this outline: ${outlineResult.content}` },
  });

  return {
    outline: outlineResult.content,
    content: contentResult.content,
  };
};

const result = await createContent("AI");
```

**With orchestration:**

```typescript
const neurolink = new NeuroLink({
  provider: "openai",
  conversationMemory: { enabled: true }, // Keep context between steps
});

const result = await neurolink.generate({
  input: {
    text: `Create an outline for AI, then write detailed content for each section.`,
  },
});
// AI uses conversation memory to maintain context across steps
```

**Benefits:**

- Explicit control over workflow
- Easier to debug and test
- Can use conversation memory for context
- More flexible than rigid chains

---

## Streaming

**LangChain:**

```typescript

const model = new ChatOpenAI({ streaming: true });

const stream = await model.stream([new HumanMessage("Tell me a story")]);

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

const result = await neurolink.generate({
  input: { text: "Tell me a story" },
  stream: true,
});

for await (const chunk of result.stream!) {
  process.stdout.write(chunk.delta);
}
```

**Benefits:**

- Simpler streaming API
- Consistent across all providers
- Built-in error handling

---

## Structured Output

**LangChain:**

```typescript

const parser = StructuredOutputParser.fromZodSchema(
  z.object({
    name: z.string(),
    age: z.number(),
  }),
);

const model = new ChatOpenAI();
const result = await model.call([
  new HumanMessage("Tell me about John, age 30"),
]);

const parsed = await parser.parse(result.content);
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

const schema = z.object({
  name: z.string(),
  age: z.number(),
});

const result = await neurolink.generate({
  input: { text: "Tell me about John, age 30" },
  structuredOutput: {
    format: "json",
    schema,
  },
});

console.log(result.structuredOutput); // { name: "John", age: 30 }
// Automatically validated against Zod schema
```

**Benefits:**

- Built-in Zod schema validation
- Type-safe results
- Automatic JSON parsing
- No manual parsing needed

---

## Gotchas and Differences

### 1. Message Format

**LangChain** uses message classes:

```typescript

[new SystemMessage("You are helpful"), new HumanMessage("Hello")];
```

**NeuroLink** uses simple objects:

```typescript
{
  input: { text: "Hello" },
  systemPrompt: "You are helpful"
}
```

### 2. Error Handling

**LangChain:** Basic try-catch required for all operations

**NeuroLink:** Built-in retry, failover, and graceful degradation:

```typescript
const neurolink = new NeuroLink({
  provider: "openai",
  fallbackProviders: ["anthropic", "vertex"], // Auto-failover
});
```

### 3. Tool Execution

**LangChain:** Manual tool registration and execution

**NeuroLink:** Automatic MCP tool discovery and execution:

```typescript
// Tools are automatically available, no registration needed
const result = await neurolink.generate({
  input: { text: "Read the file config.json" },
});
// readFile tool executes automatically
```

### 4. Conversation Context

**LangChain:** Manual memory management with different memory types

**NeuroLink:** Automatic with simple config:

```typescript
conversationMemory: {
  enabled: true;
}
```

### 5. Provider Switching

**LangChain:** Requires separate model classes and imports

**NeuroLink:** Single parameter:

```typescript
provider: "openai"; // or "anthropic", "vertex", etc.
```

---

## Gradual Migration Strategy

You don't have to migrate everything at once. Here's a phased approach:

### Phase 1: Side-by-Side (Week 1)

Run both LangChain and NeuroLink in parallel:

```typescript

// Old code (LangChain)
const langchain = new ChatOpenAI();

// New code (NeuroLink)
const neurolink = new NeuroLink({ provider: "openai" });

// Use feature flags to switch
const useLangChain = process.env.USE_LANGCHAIN === "true";

const result = useLangChain
  ? await langchain.call([new HumanMessage("Hello")])
  : await neurolink.generate({ input: { text: "Hello" } });
```

### Phase 2: Migrate Simple Endpoints (Week 2)

Start with simple text generation:

```typescript
// Before
const chat = new ChatOpenAI();
const result = await chat.call([new HumanMessage(prompt)]);

// After
const neurolink = new NeuroLink({ provider: "openai" });
const result = await neurolink.generate({ input: { text: prompt } });
```

### Phase 3: Migrate Chains (Week 3)

Replace chains with direct calls:

```typescript
// Before (LangChain chain)
const chain = new LLMChain({ llm, prompt });
const result = await chain.call({ input: "..." });

// After (NeuroLink)
const result = await neurolink.generate({ input: { text: "..." } });
```

### Phase 4: Migrate Agents & Tools (Week 4)

Add MCP tools:

```typescript
// Before (LangChain agent + tools)
const tools = [new Calculator(), new SerpAPI()];
const agent = await initializeAgentExecutorWithOptions(tools, model);

// After (NeuroLink MCP)
await neurolink.addExternalMCPServer("serpapi", { ... });
// Built-in calculateMath tool works automatically
```

### Phase 5: Full Migration (Week 5)

Remove LangChain dependency:

```bash
npm uninstall langchain
npm install @juspay/neurolink
```

---

## Migration Checklist

Use this checklist to track your migration:

- [ ] **Install NeuroLink**: `npm install @juspay/neurolink`
- [ ] **Provider Setup**: Configure API keys in `.env`
- [ ] **Test Simple Generation**: Verify basic text generation works
- [ ] **Migrate Chat Models**: Replace LangChain model classes
- [ ] **Migrate Chains**: Convert to direct `generate()` calls
- [ ] **Migrate Memory**: Enable `conversationMemory`
- [ ] **Migrate Tools**: Add MCP servers
- [ ] **Migrate Callbacks**: Convert to middleware
- [ ] **Update Tests**: Adapt test assertions
- [ ] **Update Type Definitions**: Use NeuroLink types
- [ ] **Remove LangChain**: Uninstall dependency

---

## Performance Comparison

Real-world benchmarks (averaged over 1000 requests):

| Metric                     | LangChain | NeuroLink | Improvement     |
| -------------------------- | --------- | --------- | --------------- |
| First response time        | 850ms     | 420ms     | **50% faster**  |
| Memory usage               | 180MB     | 85MB      | **53% less**    |
| Bundle size (minified)     | 2.3MB     | 890KB     | **61% smaller** |
| Type errors (compile time) | Frequent  | Rare      | **Better DX**   |

---

## Getting Help

- **Documentation**: [https://neurolink.dev/docs](https://neurolink.dev/docs)
- **Examples**: [Migration examples repo](https://github.com/juspay/neurolink-examples)
- **Discord**: [Join our community](https://discord.gg/neurolink)
- **GitHub Issues**: [Report issues](https://github.com/juspay/neurolink/issues)

---

## See Also

- [NeuroLink Getting Started Guide](/docs/getting-started/quick-start)
- [Complete API Reference](/docs/sdk/api-reference)
- [MCP Integration Guide](/docs/mcp/integration)
- [Enterprise Features](/docs/guides/enterprise)
- [Provider Comparison](/docs/reference/provider-comparison)

---

## Express.js Integration Guide

<!-- Source: guides/frameworks/express.md -->

# Express.js Integration Guide

**Build production-ready AI APIs with Express.js and NeuroLink**

## Quick Start

### 1. Initialize Project

```bash
mkdir my-ai-api
cd my-ai-api
npm init -y
npm install express @juspay/neurolink dotenv
npm install -D @types/express @types/node typescript ts-node
```

### 2. Setup TypeScript

```json
// tsconfig.json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true
  }
}
```

### 3. Create Basic Server

```typescript
// src/index.ts

dotenv.config();

const app = express();
app.use(express.json());

// Initialize NeuroLink
const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
    {
      name: "anthropic",
      config: { apiKey: process.env.ANTHROPIC_API_KEY },
    },
  ],
});

// Basic endpoint
app.post("/api/generate", async (req, res) => {
  try {
    const { prompt, provider = "openai", model = "gpt-4o-mini" } = req.body;

    if (!prompt) {
      return res.status(400).json({ error: "Prompt is required" });
    }

    const result = await ai.generate({
      input: { text: prompt },
      provider,
      model,
    });

    res.json({
      content: result.content,
      usage: result.usage,
      cost: result.cost,
    });
  } catch (error: any) {
    console.error("AI Error:", error);
    res.status(500).json({ error: error.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`AI API server running on http://localhost:${PORT}`);
});
```

### 4. Environment Variables

```bash
# .env
PORT=3000
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=AIza...
```

### 5. Run Server

```bash
npx ts-node src/index.ts
```

### 6. Test API

```bash
curl -X POST http://localhost:3000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain AI in one sentence"}'
```

---

## Authentication

### API Key Authentication

```typescript
// src/middleware/auth.ts

export function apiKeyAuth(req: Request, res: Response, next: NextFunction) {
  const apiKey = req.headers["x-api-key"] as string;

  if (!apiKey) {
    return res.status(401).json({ error: "API key is required" });
  }

  if (apiKey !== process.env.API_SECRET) {
    return res.status(401).json({ error: "Invalid API key" });
  }

  next();
}
```

```typescript
// src/index.ts

// Protected endpoint
app.post("/api/generate", apiKeyAuth, async (req, res) => {
  // ... AI generation
});
```

### JWT Authentication

```typescript
// src/middleware/jwt-auth.ts

type AuthRequest = Request & {
  user?: any;
};

export function jwtAuth(req: AuthRequest, res: Response, next: NextFunction) {
  const token = req.headers.authorization?.replace("Bearer ", "");

  if (!token) {
    return res.status(401).json({ error: "No token provided" });
  }

  try {
    const decoded = jwt.verify(token, process.env.JWT_SECRET!);
    req.user = decoded;
    next();
  } catch (error) {
    return res.status(401).json({ error: "Invalid token" });
  }
}
```

```typescript
// Login endpoint
app.post("/api/auth/login", async (req, res) => {
  const { username, password } = req.body;

  // Verify credentials (example)
  if (username === "admin" && password === "password") {
    const token = jwt.sign(
      { userId: "123", username },
      process.env.JWT_SECRET!,
      { expiresIn: "24h" },
    );

    return res.json({ token });
  }

  res.status(401).json({ error: "Invalid credentials" });
});

// Protected endpoint
app.post("/api/generate", jwtAuth, async (req, res) => {
  console.log("User:", req.user);
  // ... AI generation
});
```

---

## Rate Limiting

### Express Rate Limit

```bash
npm install express-rate-limit
```

```typescript
// src/middleware/rate-limit.ts

// Basic rate limiting
export const limiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 10, // 10 requests per minute
  message: "Too many requests, please try again later",
  standardHeaders: true,
  legacyHeaders: false,
});

// Stricter limit for expensive operations
export const strictLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 5, // 5 requests per minute
  message: "Rate limit exceeded for this endpoint",
});
```

```typescript
// src/index.ts

// Apply to all routes
app.use("/api/", limiter);

// Stricter limit for expensive endpoint
app.post("/api/analyze", strictLimiter, async (req, res) => {
  // ... expensive AI operation
});
```

### Custom Rate Limiting with Redis

```bash
npm install redis rate-limit-redis
```

```typescript
// src/middleware/redis-rate-limit.ts

const redisClient = createClient({
  url: process.env.REDIS_URL || "redis://localhost:6379",
});

redisClient.connect();

export const redisLimiter = rateLimit({
  store: new RedisStore({
    client: redisClient,
    prefix: "rate_limit:",
  }),
  windowMs: 60 * 1000,
  max: 20,
  message: "Too many requests",
});
```

---

## Response Caching

### Redis Caching Middleware

```bash
npm install redis
```

```typescript
// src/middleware/cache.ts

const redisClient = createClient({
  url: process.env.REDIS_URL || "redis://localhost:6379",
});

redisClient.connect();

export function cache(ttl: number = 3600) {
  return async (req: Request, res: Response, next: NextFunction) => {
    // Generate cache key from request body
    const cacheKey = `ai:${createHash("sha256")
      .update(JSON.stringify(req.body))
      .digest("hex")}`;

    try {
      // Check cache
      const cached = await redisClient.get(cacheKey);

      if (cached) {
        console.log("Cache hit:", cacheKey);
        return res.json(JSON.parse(cached));
      }

      // Cache miss - store response
      const originalJson = res.json.bind(res);
      res.json = function (body: any) {
        redisClient.setEx(cacheKey, ttl, JSON.stringify(body));
        return originalJson(body);
      };

      next();
    } catch (error) {
      console.error("Cache error:", error);
      next();
    }
  };
}
```

```typescript
// src/index.ts

// Cached endpoint (1 hour TTL)
app.post("/api/generate", cache(3600), async (req, res) => {
  const result = await ai.generate({
    input: { text: req.body.prompt },
  });

  res.json({ content: result.content });
});
```

---

## Streaming Responses

### Server-Sent Events (SSE)

```typescript
// src/routes/stream.ts

const router = Router();

router.post("/stream", async (req, res) => {
  const { prompt } = req.body;

  // Set headers for SSE
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

  try {
    for await (const chunk of ai.stream({
      input: { text: prompt },
      provider: "openai",
      model: "gpt-4o-mini",
    })) {
      res.write(`data: ${JSON.stringify({ content: chunk.content })}\n\n`);
    }

    res.write("data: [DONE]\n\n");
    res.end();
  } catch (error: any) {
    res.write(`data: ${JSON.stringify({ error: error.message })}\n\n`);
    res.end();
  }
});

export default router;
```

```typescript
// src/index.ts

app.use("/api", streamRouter);
```

### WebSocket Streaming

```bash
npm install ws @types/ws
```

```typescript
// src/websocket.ts

export function setupWebSocket(server: Server) {
  const wss = new WebSocketServer({ server, path: "/ws" });

  wss.on("connection", (ws) => {
    console.log("WebSocket client connected");

    ws.on("message", async (data) => {
      try {
        const {
          prompt,
          provider = "openai",
          model = "gpt-4o-mini",
        } = JSON.parse(data.toString());

        // Stream AI response over WebSocket
        for await (const chunk of ai.stream({
          input: { text: prompt },
          provider,
          model,
        })) {
          ws.send(JSON.stringify({ type: "chunk", content: chunk.content }));
        }

        ws.send(JSON.stringify({ type: "done" }));
      } catch (error: any) {
        ws.send(JSON.stringify({ type: "error", error: error.message }));
      }
    });

    ws.on("close", () => {
      console.log("WebSocket client disconnected");
    });
  });
}
```

```typescript
// src/index.ts

const server = createServer(app);
setupWebSocket(server);

server.listen(PORT, () => {
  console.log(`Server with WebSocket running on port ${PORT}`);
});
```

---

## Production Patterns

### Pattern 1: Multi-Endpoint AI API

```typescript
// src/routes/ai.ts

const router = Router();

// Text generation
router.post("/generate", jwtAuth, limiter, cache(3600), async (req, res) => {
  try {
    const { prompt, provider = "openai", model = "gpt-4o-mini" } = req.body;

    const result = await ai.generate({
      input: { text: prompt },
      provider,
      model,
    });

    res.json({
      content: result.content,
      usage: result.usage,
      cost: result.cost,
    });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

// Summarization
router.post("/summarize", jwtAuth, limiter, async (req, res) => {
  try {
    const { text } = req.body;

    const result = await ai.generate({
      input: { text: `Summarize this text:\n\n${text}` },
      provider: "anthropic",
      model: "claude-3-5-sonnet-20241022",
      maxTokens: 200,
    });

    res.json({ summary: result.content });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

// Translation
router.post("/translate", jwtAuth, limiter, cache(86400), async (req, res) => {
  try {
    const { text, targetLanguage } = req.body;

    const result = await ai.generate({
      input: { text: `Translate to ${targetLanguage}: ${text}` },
      provider: "google-ai",
      model: "gemini-2.0-flash",
    });

    res.json({ translation: result.content });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

// Code generation
router.post("/code", jwtAuth, limiter, async (req, res) => {
  try {
    const { description, language } = req.body;

    const result = await ai.generate({
      input: { text: `Write ${language} code: ${description}` },
      provider: "anthropic",
      model: "claude-3-5-sonnet-20241022",
    });

    res.json({ code: result.content });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

export default router;
```

### Pattern 2: Usage Tracking

```typescript
// src/middleware/usage-tracking.ts

type AuthRequest = Request & {
  user?: any;
};

export function trackUsage(
  req: AuthRequest,
  res: Response,
  next: NextFunction,
) {
  const originalJson = res.json.bind(res);

  res.json = async function (body: any) {
    // Track AI usage in database
    if (req.user && body.usage) {
      await prisma.aiUsage.create({
        data: {
          userId: req.user.userId,
          provider: body.provider || "unknown",
          model: body.model || "unknown",
          tokens: body.usage.totalTokens,
          cost: body.cost || 0,
          endpoint: req.path,
          timestamp: new Date(),
        },
      });
    }

    return originalJson(body);
  };

  next();
}
```

```typescript
// src/routes/ai.ts

router.post("/generate", jwtAuth, limiter, trackUsage, async (req, res) => {
  // ... AI generation
});

// Get user's usage stats
router.get("/usage", jwtAuth, async (req, res) => {
  const stats = await prisma.aiUsage.aggregate({
    where: { userId: req.user.userId },
    _sum: { tokens: true, cost: true },
    _count: true,
  });

  res.json({
    totalRequests: stats._count,
    totalTokens: stats._sum.tokens || 0,
    totalCost: stats._sum.cost || 0,
  });
});
```

### Pattern 3: Error Handling

```typescript
// src/middleware/error-handler.ts

export function errorHandler(
  error: Error,
  req: Request,
  res: Response,
  next: NextFunction,
) {
  console.error("Error:", error);

  // AI provider errors
  if (error.message.includes("rate limit")) {
    return res.status(429).json({
      error: "Rate limit exceeded",
      message: "Please try again later",
    });
  }

  if (error.message.includes("quota")) {
    return res.status(503).json({
      error: "Service quota exceeded",
      message: "AI service temporarily unavailable",
    });
  }

  if (error.message.includes("authentication")) {
    return res.status(401).json({
      error: "Authentication failed",
      message: "Invalid API credentials",
    });
  }

  // Generic error
  res.status(500).json({
    error: "Internal server error",
    message:
      process.env.NODE_ENV === "development"
        ? error.message
        : "Something went wrong",
  });
}
```

```typescript
// src/index.ts

// ... routes

// Error handler must be last
app.use(errorHandler);
```

---

## Monitoring & Logging

### Prometheus Metrics

```bash
npm install prom-client
```

```typescript
// src/metrics.ts

export const register = new Registry();

export const httpRequestsTotal = new Counter({
  name: "http_requests_total",
  help: "Total HTTP requests",
  labelNames: ["method", "route", "status"],
  registers: [register],
});

export const aiRequestsTotal = new Counter({
  name: "ai_requests_total",
  help: "Total AI requests",
  labelNames: ["provider", "model"],
  registers: [register],
});

export const aiRequestDuration = new Histogram({
  name: "ai_request_duration_seconds",
  help: "AI request duration",
  labelNames: ["provider", "model"],
  registers: [register],
});

export const aiTokensUsed = new Counter({
  name: "ai_tokens_used_total",
  help: "Total AI tokens used",
  labelNames: ["provider", "model"],
  registers: [register],
});

export const aiCostTotal = new Counter({
  name: "ai_cost_total",
  help: "Total AI cost in USD",
  labelNames: ["provider", "model"],
  registers: [register],
});
```

```typescript
// src/index.ts

// Metrics endpoint
app.get("/metrics", async (req, res) => {
  res.setHeader("Content-Type", register.contentType);
  res.send(await register.metrics());
});

// Track HTTP requests
app.use((req, res, next) => {
  res.on("finish", () => {
    httpRequestsTotal.inc({
      method: req.method,
      route: req.route?.path || req.path,
      status: res.statusCode,
    });
  });
  next();
});
```

### Request Logging

```bash
npm install winston
```

```typescript
// src/logger.ts

export const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || "info",
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json(),
  ),
  transports: [
    new winston.transports.File({ filename: "error.log", level: "error" }),
    new winston.transports.File({ filename: "combined.log" }),
    new winston.transports.Console({
      format: winston.format.simple(),
    }),
  ],
});
```

```typescript
// src/index.ts

app.post("/api/generate", async (req, res) => {
  logger.info("AI request received", {
    userId: req.user?.userId,
    prompt: req.body.prompt.substring(0, 50),
  });

  try {
    const result = await ai.generate({
      /* ... */
    });

    logger.info("AI request completed", {
      userId: req.user?.userId,
      provider: result.provider,
      tokens: result.usage.totalTokens,
      cost: result.cost,
    });

    res.json(result);
  } catch (error: any) {
    logger.error("AI request failed", {
      userId: req.user?.userId,
      error: error.message,
    });

    res.status(500).json({ error: error.message });
  }
});
```

---

## Best Practices

### 1. ✅ Use Middleware for Cross-Cutting Concerns

```typescript
// ✅ Good: Compose middleware
app.post(
  "/api/generate",
  jwtAuth, // Authentication
  limiter, // Rate limiting
  cache(3600), // Caching
  trackUsage, // Analytics
  async (req, res) => {
    // Business logic
  },
);
```

### 2. ✅ Implement Proper Error Handling

```typescript
// ✅ Good: Centralized error handling
app.use(errorHandler);
```

### 3. ✅ Cache Expensive Operations

```typescript
// ✅ Good: Cache AI responses
app.post("/api/generate", cache(3600), async (req, res) => {
  // ...
});
```

### 4. ✅ Monitor Performance

```typescript
// ✅ Good: Track metrics
aiRequestDuration.observe({ provider, model }, duration);
aiTokensUsed.inc({ provider, model }, tokens);
```

### 5. ✅ Validate Inputs

```bash
npm install express-validator
```

```typescript

app.post(
  "/api/generate",
  body("prompt").isString().isLength({ min: 1, max: 10000 }),
  body("provider").optional().isIn(["openai", "anthropic", "google-ai"]),
  async (req, res) => {
    const errors = validationResult(req);
    if (!errors.isEmpty()) {
      return res.status(400).json({ errors: errors.array() });
    }

    // ... AI generation
  },
);
```

---

## Deployment

### Docker Deployment

```dockerfile
# Dockerfile
FROM node:18-alpine

WORKDIR /app

COPY package*.json ./
RUN npm ci --only=production

COPY . .
RUN npm run build

EXPOSE 3000

CMD ["node", "dist/index.js"]
```

```yaml
# docker-compose.yml
version: "3.8"

services:
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
```

### Production Checklist

- [ ] Environment variables configured
- [ ] Rate limiting enabled
- [ ] Authentication implemented
- [ ] Error handling comprehensive
- [ ] Logging configured
- [ ] Metrics endpoint exposed
- [ ] Caching enabled
- [ ] HTTPS configured
- [ ] CORS configured properly
- [ ] Input validation in place

---

## Related Documentation

- **[API Reference](/docs/sdk/api-reference)** - NeuroLink SDK
- **[Compliance Guide](/docs/guides/enterprise/compliance)** - Security and authentication
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce costs
- **[Monitoring](/docs/guides/enterprise/monitoring)** - Observability
- [Fastify Integration](/docs/sdk/framework-integration) - High-performance alternative with schema validation

---

## Additional Resources

- **[Express.js Documentation](https://expressjs.com/)** - Official Express docs
- **[Node.js Best Practices](https://github.com/goldbergyoni/nodebestpractices)** - Production patterns
- **[Express Security](https://expressjs.com/en/advanced/best-practice-security.html)** - Security best practices

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Production Code Patterns

<!-- Source: guides/examples/code-patterns.md -->

# Production Code Patterns

**Battle-tested patterns, anti-patterns, and best practices for production AI applications**

## Table of Contents

1. [Error Handling Patterns](#error-handling-patterns)
2. [Retry & Backoff Strategies](#retry--backoff-strategies)
3. [Streaming Patterns](#streaming-patterns)
4. [Rate Limiting Patterns](#rate-limiting-patterns)
5. [Caching Patterns](#caching-patterns)
6. [Middleware Patterns](#middleware-patterns)
7. [Testing Patterns](#testing-patterns)
8. [Performance Optimization](#performance-optimization)
9. [Security Patterns](#security-patterns)
10. [Anti-Patterns to Avoid](#anti-patterns-to-avoid)

---

## Error Handling Patterns

### Pattern 1: Comprehensive Error Handling

```typescript

class RobustAIService {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
        {
          name: "anthropic",
          config: { apiKey: process.env.ANTHROPIC_API_KEY },
        },
      ],
      failoverConfig: { enabled: true },
    });
  }

  async generate(prompt: string): Promise {
    try {
      const result = await this.ai.generate({
        input: { text: prompt },
        provider: "openai",
      });

      return {
        success: true,
        content: result.content,
      };
    } catch (error) {
      if (error instanceof NeuroLinkError) {
        return this.handleNeuroLinkError(error);
      }

      if (error.code === "ECONNREFUSED") {
        return {
          success: false,
          error: {
            type: "NetworkError",
            message: "Cannot connect to AI provider",
            retryable: true,
          },
        };
      }

      if (error.status === 429) {
        return {
          success: false,
          error: {
            type: "RateLimitError",
            message: "Rate limit exceeded",
            retryable: true,
          },
        };
      }

      if (error.status === 401 || error.status === 403) {
        return {
          success: false,
          error: {
            type: "AuthenticationError",
            message: "Invalid API credentials",
            retryable: false,
          },
        };
      }

      return {
        success: false,
        error: {
          type: "UnknownError",
          message: error.message || "An unknown error occurred",
          retryable: false,
        },
      };
    }
  }

  private handleNeuroLinkError(error: NeuroLinkError): any {
    switch (error.code) {
      case "PROVIDER_ERROR":
        return {
          success: false,
          error: {
            type: "ProviderError",
            message: error.message,
            retryable: true,
          },
        };

      case "QUOTA_EXCEEDED":
        return {
          success: false,
          error: {
            type: "QuotaExceeded",
            message: "Provider quota exceeded",
            retryable: true,
          },
        };

      case "TIMEOUT":
        return {
          success: false,
          error: {
            type: "Timeout",
            message: "Request timed out",
            retryable: true,
          },
        };

      default:
        return {
          success: false,
          error: {
            type: "Error",
            message: error.message,
            retryable: false,
          },
        };
    }
  }
}

const aiService = new RobustAIService();
const result = await aiService.generate("Hello");

if (!result.success) {
  if (result.error.retryable) {
    console.log("Retryable error:", result.error.message);
  } else {
    console.error("Fatal error:", result.error.message);
  }
}
```

### Pattern 2: Graceful Degradation

```typescript
class GracefulAIService {
  private ai: NeuroLink;

  async generateWithFallback(prompt: string): Promise {
    try {
      const result = await this.ai.generate({
        input: { text: prompt },
        provider: "openai",
        model: "gpt-4o",
      });
      return result.content;
    } catch (error) {
      console.warn("GPT-4o failed, trying GPT-4o-mini");

      try {
        const result = await this.ai.generate({
          input: { text: prompt },
          provider: "openai",
          model: "gpt-4o-mini",
        });
        return result.content;
      } catch (error) {
        console.warn("OpenAI failed, trying Google AI");

        try {
          const result = await this.ai.generate({
            input: { text: prompt },
            provider: "google-ai",
            model: "gemini-2.0-flash",
          });
          return result.content;
        } catch (error) {
          return this.getStaticFallback(prompt);
        }
      }
    }
  }

  private getStaticFallback(prompt: string): string {
    return "I'm currently experiencing technical difficulties. Please try again later.";
  }
}
```

---

## Retry & Backoff Strategies

### Pattern 1: Exponential Backoff

```typescript
class RetryableAIService {
  private ai: NeuroLink;

  async generateWithRetry(
    // (1)!
    prompt: string,
    maxRetries: number = 3,
  ): Promise {
    let lastError: Error;

    for (let attempt = 0; attempt  {
    return new Promise((resolve) => setTimeout(resolve, ms));
  }
}
```

1. **Retry wrapper**: Automatically retry failed AI requests with exponential backoff to handle transient failures.
2. **Retry loop**: Attempt up to `maxRetries + 1` times (initial attempt + retries). Break early on success.
3. **Success path**: Return immediately on successful generation, no retries needed.
4. **Check if retryable**: Only retry transient errors (rate limits, server errors). Don't retry auth errors or invalid requests.
5. **Exponential backoff**: Wait 1s, 2s, 4s, 8s... between retries (capped at 10s) to give the service time to recover.
6. **Wait before retry**: Sleep to implement backoff delay. Prevents hammering a failing service.
7. **All retries exhausted**: If all attempts fail, throw the last error to the caller.
8. **Retryable errors**: Rate limits (429), server errors (5xx), and network errors are temporary and worth retrying.

### Pattern 2: Exponential Backoff with Jitter

```typescript
class AdvancedRetryService {
  async generateWithJitter(
    prompt: string,
    maxRetries: number = 5,
  ): Promise {
    for (let attempt = 0; attempt = 500 || error.status === 429;
  }

  private sleep(ms: number): Promise {
    return new Promise((resolve) => setTimeout(resolve, ms));
  }
}
```

---

## Streaming Patterns

### Pattern 1: Server-Sent Events (SSE)

```typescript

const app = express();

app.get("/api/stream", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream"); // (1)!
  res.setHeader("Cache-Control", "no-cache"); // (2)!
  res.setHeader("Connection", "keep-alive"); // (3)!

  try {
    for await (const chunk of ai.stream({
      // (4)!
      input: { text: req.query.prompt as string },
      provider: "anthropic",
    })) {
      res.write(`data: ${JSON.stringify({ content: chunk.content })}\n\n`); // (5)!
    }

    res.write("data: [DONE]\n\n"); // (6)!
    res.end();
  } catch (error) {
    res.write(`data: ${JSON.stringify({ error: error.message })}\n\n`); // (7)!
    res.end();
  }
});
```

1. **SSE content type**: Set `text/event-stream` to enable Server-Sent Events streaming to the browser.
2. **Disable caching**: Prevent proxies and browsers from caching streaming responses.
3. **Keep connection alive**: Maintain long-lived HTTP connection for streaming (won't close after first response).
4. **Stream from AI**: Use `ai.stream()` which returns an async iterator of content chunks as they arrive from the provider.
5. **SSE message format**: Each message starts with `data:` followed by JSON and ends with two newlines (`\n\n`).
6. **Completion signal**: Send `[DONE]` to notify client that streaming is complete and connection can be closed.
7. **Error handling**: Stream errors back to client in same SSE format so UI can display them.

### Pattern 2: React Streaming UI

```typescript
'use client';

export default function StreamingChat() {
  const [content, setContent] = useState('');
  const [streaming, setStreaming] = useState(false);

  async function handleStream(prompt: string) {
    setContent('');
    setStreaming(true);

    const response = await fetch('/api/stream?prompt=' + encodeURIComponent(prompt));
    const reader = response.body.getReader();
    const decoder = new TextDecoder();

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      const text = decoder.decode(value);
      const lines = text.split('\n');

      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data = line.slice(6);

          if (data === '[DONE]') {
            setStreaming(false);
            return;
          }

          try {
            const parsed = JSON.parse(data);
            setContent(prev => prev + parsed.content);
          } catch (e) {
          }
        }
      }
    }
  }

  return (

       handleStream('Hello AI')}>
        Start Streaming

      {content}
      {streaming && Streaming...}

  );
}
```

---

## Rate Limiting Patterns

### Pattern 1: Token Bucket

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillRate: number,
  ) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  async consume(tokens: number = 1): Promise {
    this.refill();

    if (this.tokens >= tokens) {
      this.tokens -= tokens;
      return true;
    }

    return false;
  }

  private refill(): void {
    const now = Date.now();
    const timePassed = (now - this.lastRefill) / 1000;
    const tokensToAdd = timePassed * this.refillRate;

    this.tokens = Math.min(this.capacity, this.tokens + tokensToAdd);
    this.lastRefill = now;
  }

  async waitForTokens(tokens: number = 1): Promise {
    while (!(await this.consume(tokens))) {
      await new Promise((resolve) => setTimeout(resolve, 100));
    }
  }
}

class RateLimitedAIService {
  private ai: NeuroLink;
  private rateLimiter: TokenBucket;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
      ],
    });

    this.rateLimiter = new TokenBucket(10, 1);
  }

  async generate(prompt: string): Promise {
    await this.rateLimiter.waitForTokens(1);

    const result = await this.ai.generate({
      input: { text: prompt },
      provider: "openai",
    });

    return result.content;
  }
}
```

### Pattern 2: Sliding Window

```typescript
class SlidingWindowRateLimiter {
  private requests: number[] = [];

  constructor(
    private maxRequests: number,
    private windowMs: number,
  ) {}

  async checkLimit(): Promise {
    const now = Date.now();
    this.requests = this.requests.filter((time) => now - time  {
    while (!(await this.checkLimit())) {
      await new Promise((resolve) => setTimeout(resolve, 100));
    }
  }
}

class WindowRateLimitedService {
  private limiter: SlidingWindowRateLimiter;

  constructor() {
    this.limiter = new SlidingWindowRateLimiter(100, 60000);
  }

  async generate(prompt: string): Promise {
    await this.limiter.waitForSlot();

    const result = await ai.generate({
      input: { text: prompt },
      provider: "openai",
    });

    return result.content;
  }
}
```

---

## Caching Patterns

### Pattern 1: In-Memory Cache with TTL

```typescript
type CacheEntry = {
  value: T;
  expiry: number;
};

class CachedAIService {
  private cache: Map> = new Map();
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
      ],
    });

    setInterval(() => this.cleanup(), 60000);
  }

  async generate(prompt: string, ttlSeconds: number = 3600): Promise {
    const cacheKey = this.getCacheKey(prompt);
    const cached = this.cache.get(cacheKey);

    if (cached && cached.expiry > Date.now()) {
      console.log("Cache hit");
      return cached.value;
    }

    console.log("Cache miss");
    const result = await this.ai.generate({
      input: { text: prompt },
      provider: "openai",
    });

    this.cache.set(cacheKey, {
      value: result.content,
      expiry: Date.now() + ttlSeconds * 1000,
    });

    return result.content;
  }

  private getCacheKey(prompt: string): string {
    return require("crypto").createHash("sha256").update(prompt).digest("hex");
  }

  private cleanup(): void {
    const now = Date.now();
    for (const [key, entry] of this.cache.entries()) {
      if (entry.expiry  {
    const cacheKey = `ai:${this.hash(prompt)}`;

    const cached = await this.redis.get(cacheKey);
    if (cached) {
      console.log("Redis cache hit");
      return cached;
    }

    console.log("Redis cache miss");
    const result = await this.ai.generate({
      input: { text: prompt },
      provider: "openai",
    });

    await this.redis.setex(cacheKey, ttlSeconds, result.content);

    return result.content;
  }

  private hash(str: string): string {
    return require("crypto").createHash("sha256").update(str).digest("hex");
  }
}
```

---

## Middleware Patterns

### Pattern 1: Logging Middleware

```typescript
class LoggingMiddleware {
  async execute(
    prompt: string,
    next: (prompt: string) => Promise,
  ): Promise {
    const startTime = Date.now();

    console.log("[AI Request]", {
      timestamp: new Date().toISOString(),
      prompt: prompt.substring(0, 100) + "...",
    });

    try {
      const result = await next(prompt);
      const duration = Date.now() - startTime;

      console.log("[AI Response]", {
        timestamp: new Date().toISOString(),
        duration: `${duration}ms`,
        responseLength: result.length,
      });

      return result;
    } catch (error) {
      const duration = Date.now() - startTime;

      console.error("[AI Error]", {
        timestamp: new Date().toISOString(),
        duration: `${duration}ms`,
        error: error.message,
      });

      throw error;
    }
  }
}
```

### Pattern 2: Metrics Middleware

```typescript

class MetricsMiddleware {
  private requestCounter: Counter;
  private durationHistogram: Histogram;

  constructor() {
    this.requestCounter = new Counter({
      name: "ai_requests_total",
      help: "Total AI requests",
      labelNames: ["status"],
    });

    this.durationHistogram = new Histogram({
      name: "ai_request_duration_seconds",
      help: "AI request duration",
      buckets: [0.1, 0.5, 1, 2, 5, 10],
    });
  }

  async execute(
    prompt: string,
    next: (prompt: string) => Promise,
  ): Promise {
    const startTime = Date.now();

    try {
      const result = await next(prompt);

      this.requestCounter.inc({ status: "success" });
      this.durationHistogram.observe((Date.now() - startTime) / 1000);

      return result;
    } catch (error) {
      this.requestCounter.inc({ status: "error" });
      this.durationHistogram.observe((Date.now() - startTime) / 1000);

      throw error;
    }
  }
}
```

### Pattern 3: Composable Middleware Pipeline

```typescript
type Middleware = (
  prompt: string,
  next: (prompt: string) => Promise,
) => Promise;

class MiddlewarePipeline {
  private middlewares: Middleware[] = [];

  use(middleware: Middleware): this {
    this.middlewares.push(middleware);
    return this;
  }

  async execute(
    prompt: string,
    handler: (prompt: string) => Promise,
  ): Promise {
    let index = 0;

    const next = async (p: string): Promise => {
      if (index >= this.middlewares.length) {
        return handler(p);
      }

      const middleware = this.middlewares[index++];
      return middleware(p, next);
    };

    return next(prompt);
  }
}

const pipeline = new MiddlewarePipeline()
  .use(new LoggingMiddleware().execute.bind(new LoggingMiddleware()))
  .use(new MetricsMiddleware().execute.bind(new MetricsMiddleware()));

const result = await pipeline.execute(prompt, async (p) => {
  const res = await ai.generate({ input: { text: p }, provider: "openai" });
  return res.content;
});
```

---

## Testing Patterns

### Pattern 1: Mock AI Responses

```typescript

class MockAIService {
  private responses: Map = new Map();

  setMockResponse(prompt: string, response: string): void {
    this.responses.set(prompt, response);
  }

  async generate(prompt: string): Promise {
    const response = this.responses.get(prompt);
    if (!response) {
      throw new Error(`No mock response for prompt: ${prompt}`);
    }
    return response;
  }
}

describe("CustomerSupportBot", () => {
  let mockAI: MockAIService;
  let bot: CustomerSupportBot;

  beforeEach(() => {
    mockAI = new MockAIService();
    bot = new CustomerSupportBot(mockAI as any);
  });

  it("should classify FAQ queries correctly", async () => {
    mockAI.setMockResponse("Classify...", "faq");

    const result = await bot.classifyIntent("What is your return policy?");

    expect(result).toBe("faq");
  });

  it("should generate appropriate responses", async () => {
    mockAI.setMockResponse(
      "Answer this FAQ...",
      "We have a 30-day return policy.",
    );

    const response = await bot.handleFAQ("What is your return policy?");

    expect(response).toContain("30-day");
  });
});
```

### Pattern 2: Integration Testing

```typescript

describe("AI Integration Tests", () => {
  let ai: NeuroLink;

  beforeAll(() => {
    ai = new NeuroLink({
      providers: [
        {
          name: "openai",
          config: { apiKey: process.env.OPENAI_API_KEY_TEST },
        },
      ],
    });
  });

  it("should generate response", async () => {
    const result = await ai.generate({
      input: { text: 'Say "test successful"' },
      provider: "openai",
    });

    expect(result.content).toContain("test successful");
  }, 30000);

  it("should handle errors gracefully", async () => {
    const aiWithBadKey = new NeuroLink({
      providers: [
        {
          name: "openai",
          config: { apiKey: "invalid-key" },
        },
      ],
    });

    await expect(
      aiWithBadKey.generate({
        input: { text: "test" },
        provider: "openai",
      }),
    ).rejects.toThrow();
  });
});
```

---

## Performance Optimization

### Pattern 1: Parallel Requests

```typescript
async function generateMultiple(prompts: string[]): Promise {
  const results = await Promise.all(
    prompts.map((prompt) =>
      ai.generate({
        input: { text: prompt },
        provider: "openai",
      }),
    ),
  );

  return results.map((r) => r.content);
}

const prompts = [
  "Summarize article 1",
  "Summarize article 2",
  "Summarize article 3",
];

const summaries = await generateMultiple(prompts);
```

### Pattern 2: Batching with Queue

```typescript
class BatchQueue {
  private queue: Array void;
    reject: (error: Error) => void;
  }> = [];

  private processing = false;

  constructor(
    private batchSize: number = 10,
    private batchDelay: number = 100,
  ) {}

  async add(prompt: string): Promise {
    return new Promise((resolve, reject) => {
      this.queue.push({ prompt, resolve, reject });

      if (!this.processing) {
        this.processBatch();
      }
    });
  }

  private async processBatch(): Promise {
    this.processing = true;

    while (this.queue.length > 0) {
      const batch = this.queue.splice(0, this.batchSize);

      try {
        const results = await Promise.all(
          batch.map((item) =>
            ai.generate({
              input: { text: item.prompt },
              provider: "openai",
            }),
          ),
        );

        batch.forEach((item, index) => {
          item.resolve(results[index].content);
        });
      } catch (error) {
        batch.forEach((item) => {
          item.reject(error as Error);
        });
      }

      if (this.queue.length > 0) {
        await new Promise((resolve) => setTimeout(resolve, this.batchDelay));
      }
    }

    this.processing = false;
  }
}

const batchQueue = new BatchQueue(10, 100);

const result1 = batchQueue.add("Prompt 1");
const result2 = batchQueue.add("Prompt 2");
const result3 = batchQueue.add("Prompt 3");

const [r1, r2, r3] = await Promise.all([result1, result2, result3]);
```

---

## Security Patterns

### Pattern 1: Input Sanitization

```typescript
class SecureAIService {
  async generate(userInput: string): Promise {
    const sanitized = this.sanitizeInput(userInput);

    const result = await ai.generate({
      input: {
        text: `Respond to this user query: "${sanitized}"

Do not execute any commands or code.`,
      },
      provider: "openai",
    });

    return result.content;
  }

  private sanitizeInput(input: string): string {
    return input
      .replace(/[<>]/g, "")
      .replace(/system:|ignore previous instructions/gi, "")
      .trim()
      .substring(0, 1000);
  }
}
```

### Pattern 2: API Key Rotation

```typescript
class RotatingKeyService {
  private keys: string[];
  private currentIndex = 0;

  constructor(keys: string[]) {
    this.keys = keys;
  }

  getNextKey(): string {
    const key = this.keys[this.currentIndex];
    this.currentIndex = (this.currentIndex + 1) % this.keys.length;
    return key;
  }

  async generate(prompt: string): Promise {
    const apiKey = this.getNextKey();

    const ai = new NeuroLink({
      providers: [
        {
          name: "openai",
          config: { apiKey },
        },
      ],
    });

    const result = await ai.generate({
      input: { text: prompt },
      provider: "openai",
    });

    return result.content;
  }
}

const service = new RotatingKeyService([
  process.env.OPENAI_KEY_1,
  process.env.OPENAI_KEY_2,
  process.env.OPENAI_KEY_3,
]);
```

---

## Anti-Patterns to Avoid

### ❌ Anti-Pattern 1: No Error Handling

```typescript
async function bad() {
  const result = await ai.generate({
    input: { text: prompt },
    provider: "openai",
  });
  return result.content;
}
```

**Why it's bad**: No error handling means crashes on API failures

**✅ Better approach**:

```typescript
async function good() {
  try {
    const result = await ai.generate({
      input: { text: prompt },
      provider: "openai",
    });
    return result.content;
  } catch (error) {
    console.error("AI error:", error);
    return "Sorry, I encountered an error";
  }
}
```

### ❌ Anti-Pattern 2: Hardcoded API Keys

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      config: { apiKey: "sk-1234567890abcdef" },
    },
  ],
});
```

**Why it's bad**: Security risk, keys in version control

**✅ Better approach**:

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
  ],
});
```

### ❌ Anti-Pattern 3: No Rate Limiting

```typescript
for (let i = 0; i
  setTimeout(() => reject(new Error("Timeout")), 30000),
);

const result = await Promise.race([
  ai.generate({ input: { text: veryLongPrompt }, provider: "openai" }),
  timeoutPromise,
]);
```

### ❌ Anti-Pattern 7: Ignoring Token Limits

```typescript
const result = await ai.generate({
  input: { text: massiveDocument },
  provider: "openai",
  model: "gpt-4o",
});
```

**Why it's bad**: Will fail on token limit

**✅ Better approach**:

```typescript
const MAX_TOKENS = 100000;

let text = massiveDocument;
if (text.length > MAX_TOKENS * 4) {
  text = text.substring(0, MAX_TOKENS * 4);
}

const result = await ai.generate({
  input: { text },
  provider: "openai",
  model: "gpt-4o",
});
```

---

## Related Documentation

- [Use Cases](/docs/use-cases) - Real-world examples
- [Enterprise Features](/docs/guides/enterprise/multi-provider-failover) - Production patterns
- [Provider Setup](/docs/) - Provider configuration

---

## Summary

You've learned production-ready patterns for:

✅ Error handling and graceful degradation
✅ Retry strategies with exponential backoff
✅ Streaming responses (SSE, React)
✅ Rate limiting (Token Bucket, Sliding Window)
✅ Caching (In-memory, Redis)
✅ Middleware pipelines
✅ Testing strategies
✅ Performance optimization
✅ Security best practices
✅ Anti-patterns to avoid

These patterns form the foundation of robust, production-ready AI applications.

---

## Audit Trails & Compliance Logging

<!-- Source: guides/enterprise/audit-trails.md -->

# Audit Trails & Compliance Logging

**Comprehensive logging and audit trails for regulatory compliance, security monitoring, and operational transparency**

------------------- | --------------------- | -------------------------------- |
| **GDPR Article 30**    | ❌ Non-compliant      | ✅ Processing records maintained |
| **SOC2 Security**      | ❌ No audit evidence  | ✅ Complete audit trail          |
| **HIPAA § 164.312(b)** | ❌ No activity logs   | ✅ Full audit and accountability |
| **Security Incidents** | ❌ No forensic data   | ✅ Complete investigation trail  |
| **Debugging**          | ❌ Limited visibility | ✅ Full request history          |

---

## Quick Start

### Basic Audit Logging

```typescript

const logger = createLogger({
  level: "info",
  format: format.json(),
  transports: [
    new transports.File({ filename: "audit.log" }),
    new transports.File({ filename: "error.log", level: "error" }),
  ],
});

const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
  ],

  // Audit logging configuration
  auditLog: {
    enabled: true,
    level: "detailed", // 'minimal' | 'standard' | 'detailed'

    onLog: (event) => {
      logger.info("AI Audit Event", {
        eventId: event.id,
        timestamp: event.timestamp,
        userId: event.userId,
        action: event.action,
        provider: event.provider,
        model: event.model,
        status: event.status,
        latency: event.latency,
        cost: event.cost,
        tokens: event.tokens,
        ip: event.ip,
        userAgent: event.userAgent,
      });
    },
  },
});

// Make request with user context
const result = await ai.generate({
  input: { text: "Analyze customer feedback" },
  provider: "openai",
  model: "gpt-4o",

  // Audit context
  auditContext: {
    userId: "user-12345",
    sessionId: "sess-abc-789",
    action: "customer-feedback-analysis",
    purpose: "Business intelligence",
    dataClassification: "internal",
    ip: req.ip,
    userAgent: req.headers["user-agent"],
  },
});
```

**Audit Log Output:**

```json
{
  "eventId": "evt_8x7k2m9p",
  "timestamp": "2025-01-15T14:32:11.234Z",
  "userId": "user-12345",
  "sessionId": "sess-abc-789",
  "action": "customer-feedback-analysis",
  "purpose": "Business intelligence",
  "dataClassification": "internal",
  "provider": "openai",
  "model": "gpt-4o",
  "status": "success",
  "latency": 1243,
  "cost": 0.0045,
  "tokens": {
    "input": 150,
    "output": 320,
    "total": 470
  },
  "ip": "192.168.1.100",
  "userAgent": "Mozilla/5.0..."
}
```

---

## Compliance Frameworks

### GDPR Compliance (Article 30)

GDPR requires maintaining records of processing activities. Audit trails provide the necessary evidence.

```typescript
// GDPR-compliant audit configuration
const gdprAI = new NeuroLink({
  providers: [
    { name: "mistral", config: { apiKey: process.env.MISTRAL_API_KEY } },
  ],

  compliance: {
    framework: "GDPR",
    dataResidency: "EU",
    enableAuditLog: true,

    // GDPR-specific settings
    gdpr: {
      recordProcessingActivities: true, // Article 30
      dataSubjectRights: true, // Articles 15-22
      consentTracking: true, // Article 7
      dataRetention: "30-days", // Storage limitation
      anonymization: true, // Data minimization
    },
  },

  auditLog: {
    enabled: true,
    level: "detailed",

    // GDPR audit fields
    includeFields: [
      "userId",
      "consentId",
      "legalBasis", // Article 6 legal basis
      "purpose", // Article 5(1)(b) purpose limitation
      "dataCategory", // Personal data category
      "retention", // Retention period
      "processors", // Third-party processors (AI providers)
    ],

    onLog: async (event) => {
      await auditDatabase.insert({
        ...event,
        gdprCompliance: {
          legalBasis: event.legalBasis || "consent",
          dataSubjectId: event.userId,
          processingPurpose: event.purpose,
          dataCategory: event.dataCategory,
          retentionPeriod: event.retention || "30-days",
          thirdPartyProcessors: [event.provider],
        },
      });
    },
  },
});

// Make request with GDPR context
const result = await gdprAI.generate({
  input: { text: prompt },

  auditContext: {
    userId: "user-12345",
    consentId: "consent-xyz-789", // Article 7: consent proof
    legalBasis: "consent", // Article 6: legal basis
    purpose: "personalized-recommendations",
    dataCategory: "behavioral-data",
    retention: "30-days",
  },
});
```

**GDPR Audit Report Generation:**

```typescript
// Generate Article 30 processing records
async function generateGDPRReport(startDate: Date, endDate: Date) {
  const records = await auditDatabase.query({
    timestamp: { $gte: startDate, $lte: endDate },
    "gdprCompliance.legalBasis": { $exists: true },
  });

  return {
    reportType: "GDPR Article 30 - Records of Processing Activities",
    period: { start: startDate, end: endDate },
    controller: "Your Organization",

    processingActivities: records.map((r) => ({
      purpose: r.gdprCompliance.processingPurpose,
      legalBasis: r.gdprCompliance.legalBasis,
      dataCategories: r.gdprCompliance.dataCategory,
      dataSubjects: "customers",
      recipients: r.gdprCompliance.thirdPartyProcessors,
      transfers: r.provider === "mistral" ? "EU" : "third-country",
      retention: r.gdprCompliance.retentionPeriod,
      security: "encryption, access control, audit logging",
    })),
  };
}
```

---

### SOC2 Security Compliance

SOC2 requires audit logs for security monitoring and incident response.

```typescript
// SOC2-compliant configuration
const soc2AI = new NeuroLink({
  providers: [
    { name: "anthropic", config: { apiKey: process.env.ANTHROPIC_API_KEY } },
  ],

  compliance: {
    framework: "SOC2",

    soc2: {
      // CC7.2: System operations - monitoring
      enableMonitoring: true,

      // CC7.3: System operations - log retention
      logRetention: "365-days",

      // CC6.1: Logical access - audit trail
      auditTrail: true,

      // CC7.4: System operations - incident detection
      incidentDetection: true,
    },
  },

  auditLog: {
    enabled: true,
    level: "detailed",

    // SOC2 required fields
    includeFields: [
      "userId",
      "action",
      "timestamp",
      "ip",
      "userAgent",
      "status",
      "errorCode",
      "securityEvents",
    ],

    // Immutable audit log storage
    storage: {
      type: "append-only",
      encryption: "AES-256",
      integrityCheck: "SHA-256",
    },

    onLog: async (event) => {
      // Store in tamper-proof audit log
      await appendOnlyAuditLog.write({
        ...event,
        hash: calculateHash(event),
        previousHash: await appendOnlyAuditLog.getLastHash(),
      });

      // Detect suspicious activity
      if (await detectAnomalousActivity(event)) {
        await securityIncidentManager.create({
          type: "anomalous-ai-usage",
          severity: "medium",
          event: event,
        });
      }
    },
  },
});
```

**SOC2 Audit Trail Query:**

```typescript
// CC6.1: Verify audit trail completeness
async function verifySoc2AuditTrail() {
  const logs = await appendOnlyAuditLog.getAll();

  // Verify chain integrity
  for (let i = 1; i
        Date.now() - new Date(l.timestamp).getTime()  {
      // § 164.528: Accounting of PHI disclosures
      if (event.phiAccessed || event.disclosure) {
        await phiDisclosureLog.insert({
          date: event.timestamp,
          recipient: event.provider,
          description: event.action,
          purpose: event.purpose,
          patientId: event.patientId,
          userId: event.userId,
          authorization: event.authorization,
        });
      }

      // Store in encrypted, tamper-proof audit log
      await hipaaAuditLog.write(encrypt(event));
    },
  },
});

// Make request with HIPAA context
const result = await hipaaAI.generate({
  input: { text: "Summarize patient chart" },

  auditContext: {
    userId: "dr-smith-456",
    patientId: "patient-123",
    action: "chart-summarization",
    purpose: "treatment", // § 164.506: permitted use
    phiAccessed: true,
    authorization: "auth-789-xyz",
    disclosure: false,
  },
});
```

**HIPAA Disclosure Accounting:**

```typescript
// § 164.528: Generate accounting of disclosures
async function generateHIPAADisclosureAccounting(
  patientId: string,
  startDate: Date,
) {
  const disclosures = await phiDisclosureLog.query({
    patientId: patientId,
    timestamp: { $gte: startDate },
    disclosure: true,
  });

  return disclosures.map((d) => ({
    date: d.timestamp,
    recipient: d.recipient,
    description: d.description,
    purpose: d.purpose,
    authorization: d.authorization,
  }));
}
```

---

## Audit Log Storage

### Database Storage (PostgreSQL)

```typescript

const pool = new Pool({
  host: "localhost",
  database: "neurolink_audit",
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  ssl: true,
});

// Create audit log table
await pool.query(`
  CREATE TABLE IF NOT EXISTS audit_logs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    event_id VARCHAR(255) UNIQUE NOT NULL,
    timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    user_id VARCHAR(255),
    session_id VARCHAR(255),
    action VARCHAR(255) NOT NULL,
    provider VARCHAR(100) NOT NULL,
    model VARCHAR(100) NOT NULL,
    status VARCHAR(50) NOT NULL,
    latency INTEGER,
    cost DECIMAL(10, 6),
    input_tokens INTEGER,
    output_tokens INTEGER,
    total_tokens INTEGER,
    ip INET,
    user_agent TEXT,
    audit_context JSONB,
    compliance_data JSONB,
    error_message TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  );

  CREATE INDEX idx_audit_timestamp ON audit_logs(timestamp DESC);
  CREATE INDEX idx_audit_user ON audit_logs(user_id);
  CREATE INDEX idx_audit_action ON audit_logs(action);
  CREATE INDEX idx_audit_provider ON audit_logs(provider);
`);

// Audit log writer
const ai = new NeuroLink({
  providers: [
    { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
  ],

  auditLog: {
    enabled: true,
    level: "detailed",

    onLog: async (event) => {
      await pool.query(
        `
        INSERT INTO audit_logs (
          event_id, timestamp, user_id, session_id, action,
          provider, model, status, latency, cost,
          input_tokens, output_tokens, total_tokens,
          ip, user_agent, audit_context, compliance_data, error_message
        ) VALUES (
          $1, $2, $3, $4, $5, $6, $7, $8, $9, $10,
          $11, $12, $13, $14, $15, $16, $17, $18
        )
      `,
        [
          event.id,
          event.timestamp,
          event.userId,
          event.sessionId,
          event.action,
          event.provider,
          event.model,
          event.status,
          event.latency,
          event.cost,
          event.tokens?.input,
          event.tokens?.output,
          event.tokens?.total,
          event.ip,
          event.userAgent,
          JSON.stringify(event.auditContext),
          JSON.stringify(event.complianceData),
          event.errorMessage,
        ],
      );
    },
  },
});
```

---

### Time-Series Storage (InfluxDB)

For high-volume audit logs with time-based queries:

```typescript

const influxDB = new InfluxDB({
  url: "http://localhost:8086",
  token: process.env.INFLUX_TOKEN,
});

const writeApi = influxDB.getWriteApi("neurolink", "audit_logs", "ms");

const ai = new NeuroLink({
  providers: [
    { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
  ],

  auditLog: {
    enabled: true,
    level: "detailed",

    onLog: async (event) => {
      const point = new Point("ai_audit")
        .tag("provider", event.provider)
        .tag("model", event.model)
        .tag("status", event.status)
        .tag("action", event.action)
        .tag("user_id", event.userId)
        .floatField("latency", event.latency)
        .floatField("cost", event.cost)
        .intField("input_tokens", event.tokens?.input || 0)
        .intField("output_tokens", event.tokens?.output || 0)
        .intField("total_tokens", event.tokens?.total || 0)
        .stringField("ip", event.ip)
        .timestamp(new Date(event.timestamp));

      writeApi.writePoint(point);
      await writeApi.flush();
    },
  },
});

// Query audit logs
async function queryAuditLogs(startTime: string, endTime: string) {
  const queryApi = influxDB.getQueryApi("neurolink");

  const query = `
    from(bucket: "audit_logs")
      |> range(start: ${startTime}, stop: ${endTime})
      |> filter(fn: (r) => r._measurement == "ai_audit")
  `;

  const results = [];
  for await (const { values, tableMeta } of queryApi.iterateRows(query)) {
    results.push(tableMeta.toObject(values));
  }

  return results;
}
```

---

### Append-Only Storage (Blockchain-Inspired)

For tamper-proof audit trails:

```typescript

type AuditBlock = {
  index: number;
  timestamp: string;
  data: AuditEvent;
  previousHash: string;
  hash: string;
};

class AuditBlockchain {
  private chain: AuditBlock[] = [];

  constructor() {
    this.chain.push(this.createGenesisBlock());
  }

  private createGenesisBlock(): AuditBlock {
    return {
      index: 0,
      timestamp: new Date().toISOString(),
      data: {} as AuditEvent,
      previousHash: "0",
      hash: this.calculateHash(0, new Date().toISOString(), {}, "0"),
    };
  }

  private calculateHash(
    index: number,
    timestamp: string,
    data: any,
    previousHash: string,
  ): string {
    return crypto
      .createHash("sha256")
      .update(index + timestamp + JSON.stringify(data) + previousHash)
      .digest("hex");
  }

  addBlock(data: AuditEvent): AuditBlock {
    const previousBlock = this.chain[this.chain.length - 1];
    const newBlock: AuditBlock = {
      index: previousBlock.index + 1,
      timestamp: new Date().toISOString(),
      data: data,
      previousHash: previousBlock.hash,
      hash: "",
    };

    newBlock.hash = this.calculateHash(
      newBlock.index,
      newBlock.timestamp,
      newBlock.data,
      newBlock.previousHash,
    );

    this.chain.push(newBlock);
    return newBlock;
  }

  verifyIntegrity(): boolean {
    for (let i = 1; i  {
      const block = auditChain.addBlock(event);

      // Persist to database
      await database.insert("audit_blockchain", {
        blockIndex: block.index,
        blockHash: block.hash,
        previousHash: block.previousHash,
        data: block.data,
        timestamp: block.timestamp,
      });
    },
  },
});
```

---

## User Consent Tracking

GDPR Article 7 requires proof of consent. Track user consent alongside audit logs.

```typescript
type ConsentRecord = {
  consentId: string;
  userId: string;
  purpose: string;
  timestamp: Date;
  ipAddress: string;
  userAgent: string;
  consentText: string;
  granted: boolean;
  revoked?: boolean;
  revokedAt?: Date;
};

class ConsentManager {
  async recordConsent(data: Omit): Promise {
    const consentId = `consent-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;

    await database.insert("user_consents", {
      consentId,
      ...data,
    });

    return consentId;
  }

  async checkConsent(userId: string, purpose: string): Promise {
    const consent = await database.findOne("user_consents", {
      userId,
      purpose,
      granted: true,
      revoked: { $ne: true },
    });

    return !!consent;
  }

  async revokeConsent(consentId: string): Promise {
    await database.update(
      "user_consents",
      { consentId },
      { revoked: true, revokedAt: new Date() },
    );
  }
}

const consentManager = new ConsentManager();

// Check consent before AI request
app.post("/api/generate", async (req, res) => {
  const hasConsent = await consentManager.checkConsent(
    req.user.id,
    "personalized-recommendations",
  );

  if (!hasConsent) {
    return res.status(403).json({
      error: "Consent required",
      message: "User has not consented to AI processing (GDPR Article 6)",
    });
  }

  const result = await ai.generate({
    input: { text: req.body.prompt },
    auditContext: {
      userId: req.user.id,
      consentId: hasConsent.consentId,
      legalBasis: "consent",
    },
  });

  res.json({ content: result.content });
});
```

---

## SIEM Integration

### Splunk Integration

```typescript

const splunkLogger = new SplunkLogger({
  token: process.env.SPLUNK_TOKEN,
  url: "https://splunk.example.com:8088",
});

const ai = new NeuroLink({
  providers: [
    { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
  ],

  auditLog: {
    enabled: true,
    level: "detailed",

    onLog: async (event) => {
      splunkLogger.send({
        message: event,
        severity: event.status === "error" ? "error" : "info",
        source: "neurolink-ai",
        sourcetype: "ai-audit-log",
        index: "main",
      });
    },
  },
});
```

### Datadog Integration

```typescript

ddClient.init({
  hostname: "datadog.example.com",
  service: "neurolink-ai",
  env: "production",
});

const ai = new NeuroLink({
  providers: [
    { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
  ],

  auditLog: {
    enabled: true,
    level: "detailed",

    onLog: async (event) => {
      ddClient.dogstatsd.increment("ai.requests", 1, [
        `provider:${event.provider}`,
        `status:${event.status}`,
      ]);
      ddClient.dogstatsd.histogram("ai.latency", event.latency, [
        `provider:${event.provider}`,
      ]);
      ddClient.dogstatsd.histogram("ai.cost", event.cost, [
        `provider:${event.provider}`,
      ]);

      ddClient.logger.info("AI Audit Event", event);
    },
  },
});
```

---

## Querying Audit Logs

### SQL Queries

```sql
-- Find all requests by user
SELECT * FROM audit_logs
WHERE user_id = 'user-12345'
ORDER BY timestamp DESC
LIMIT 100;

-- Calculate cost per user
SELECT
  user_id,
  COUNT(*) as total_requests,
  SUM(cost) as total_cost,
  AVG(latency) as avg_latency,
  SUM(total_tokens) as total_tokens
FROM audit_logs
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY user_id
ORDER BY total_cost DESC;

-- Detect anomalous activity
SELECT
  user_id,
  COUNT(*) as requests_per_hour,
  AVG(cost) as avg_cost_per_request
FROM audit_logs
WHERE timestamp >= NOW() - INTERVAL '1 hour'
GROUP BY user_id
HAVING COUNT(*) > 100  -- More than 100 requests/hour
ORDER BY requests_per_hour DESC;

-- Compliance report: GDPR consent tracking
SELECT
  al.user_id,
  al.action,
  al.timestamp,
  uc.consent_id,
  uc.granted,
  uc.revoked
FROM audit_logs al
LEFT JOIN user_consents uc
  ON al.audit_context->>'consentId' = uc.consent_id
WHERE al.timestamp >= NOW() - INTERVAL '90 days'
ORDER BY al.timestamp DESC;

-- Error rate by provider
SELECT
  provider,
  COUNT(*) as total_requests,
  SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as errors,
  ROUND(100.0 * SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) / COUNT(*), 2) as error_rate
FROM audit_logs
WHERE timestamp >= NOW() - INTERVAL '24 hours'
GROUP BY provider
ORDER BY error_rate DESC;
```

### TypeScript Query API

```typescript
class AuditLogQuery {
  async getUserActivity(userId: string, limit: number = 100) {
    return await database.query(
      "audit_logs",
      {
        user_id: userId,
      },
      {
        sort: { timestamp: -1 },
        limit,
      },
    );
  }

  async getCostByUser(startDate: Date, endDate: Date) {
    return await database.aggregate("audit_logs", [
      {
        $match: {
          timestamp: { $gte: startDate, $lte: endDate },
        },
      },
      {
        $group: {
          _id: "$user_id",
          totalRequests: { $sum: 1 },
          totalCost: { $sum: "$cost" },
          avgLatency: { $avg: "$latency" },
          totalTokens: { $sum: "$total_tokens" },
        },
      },
      {
        $sort: { totalCost: -1 },
      },
    ]);
  }

  async detectAnomalies(threshold: number = 100) {
    const oneHourAgo = new Date(Date.now() - 60 * 60 * 1000);

    return await database.aggregate("audit_logs", [
      {
        $match: {
          timestamp: { $gte: oneHourAgo },
        },
      },
      {
        $group: {
          _id: "$user_id",
          requestsPerHour: { $sum: 1 },
          avgCost: { $avg: "$cost" },
        },
      },
      {
        $match: {
          requestsPerHour: { $gt: threshold },
        },
      },
      {
        $sort: { requestsPerHour: -1 },
      },
    ]);
  }

  async getComplianceReport(
    framework: "GDPR" | "SOC2" | "HIPAA",
    days: number = 90,
  ) {
    const startDate = new Date(Date.now() - days * 24 * 60 * 60 * 1000);

    return await database.query(
      "audit_logs",
      {
        timestamp: { $gte: startDate },
        "compliance_data.framework": framework,
      },
      {
        sort: { timestamp: -1 },
      },
    );
  }
}

const auditQuery = new AuditLogQuery();

// Usage
const userActivity = await auditQuery.getUserActivity("user-12345");
const costReport = await auditQuery.getCostByUser(
  new Date("2025-01-01"),
  new Date("2025-01-31"),
);
const anomalies = await auditQuery.detectAnomalies(100);
const gdprReport = await auditQuery.getComplianceReport("GDPR", 90);
```

---

## Data Retention Policies

```typescript
// Automated retention policy enforcement
class RetentionPolicyManager {
  private policies = {
    GDPR: 30, // 30 days
    SOC2: 365, // 1 year
    HIPAA: 2555, // 7 years
    default: 90, // 90 days
  };

  async enforceRetention(framework: keyof typeof this.policies = "default") {
    const retentionDays = this.policies[framework];
    const cutoffDate = new Date(
      Date.now() - retentionDays * 24 * 60 * 60 * 1000,
    );

    // Archive old logs
    const logsToArchive = await database.query("audit_logs", {
      timestamp: { $lt: cutoffDate },
    });

    if (logsToArchive.length > 0) {
      // Move to cold storage
      await archiveStorage.insert(logsToArchive);

      // Delete from active database
      await database.delete("audit_logs", {
        timestamp: { $lt: cutoffDate },
      });

      console.log(
        `Archived ${logsToArchive.length} logs older than ${retentionDays} days`,
      );
    }
  }
}

const retentionManager = new RetentionPolicyManager();

// Run daily
setInterval(
  () => {
    retentionManager.enforceRetention("SOC2");
  },
  24 * 60 * 60 * 1000,
);
```

---

## Best Practices

### 1. **Log Everything Critical**

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
  ],

  auditLog: {
    enabled: true,
    level: "detailed",

    // Log all important fields
    includeFields: [
      "userId",
      "sessionId",
      "action",
      "provider",
      "model",
      "status",
      "latency",
      "cost",
      "tokens",
      "ip",
      "userAgent",
      "errorMessage",
    ],
  },
});
```

### 2. **Encrypt Sensitive Data**

```typescript

function encryptPII(data: string): string {
  const key = Buffer.from(process.env.ENCRYPTION_KEY!, "hex"); // 32 bytes for AES-256
  const iv = crypto.randomBytes(16); // Initialization vector
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);

  const encrypted = Buffer.concat([
    cipher.update(data, "utf8"),
    cipher.final(),
  ]);

  const authTag = cipher.getAuthTag();

  // Return IV + AuthTag + Encrypted data (all hex encoded)
  return (
    iv.toString("hex") +
    ":" +
    authTag.toString("hex") +
    ":" +
    encrypted.toString("hex")
  );
}

auditLog: {
  onLog: async (event) => {
    await database.insert("audit_logs", {
      ...event,
      userId: encryptPII(event.userId),
      ip: encryptPII(event.ip),
    });
  };
}
```

### 3. **Implement Access Controls**

```typescript
// Role-based access to audit logs
app.get(
  "/api/audit-logs",
  requireAuth,
  requireRole("admin"),
  async (req, res) => {
    const logs = await auditQuery.getUserActivity(req.query.userId);
    res.json(logs);
  },
);
```

### 4. **Monitor Audit Log Health**

```typescript
// Alert if audit logging fails
auditLog: {
  onLog: async (event) => {
    try {
      await database.insert("audit_logs", event);
    } catch (error) {
      // Critical: audit logging failure
      await alerting.sendCriticalAlert({
        title: "Audit Logging Failure",
        message: `Failed to log audit event: ${error.message}`,
        severity: "critical",
      });
    }
  };
}
```

---

## Related Documentation

- [Compliance & Security Guide](/docs/guides/enterprise/compliance) - Compliance frameworks
- [Monitoring & Observability](/docs/observability/health-monitoring) - Metrics and monitoring
- [Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover) - High availability
- [Cost Optimization](/docs/cookbook/cost-optimization) - Cost tracking

---

## Summary

You've learned how to implement comprehensive audit trails for compliance and security:

✅ Configure detailed audit logging
✅ Meet GDPR, SOC2, HIPAA requirements
✅ Track user consent (GDPR Article 7)
✅ Store audit logs securely
✅ Query and analyze audit data
✅ Integrate with SIEM systems
✅ Enforce data retention policies

Enterprise audit trails provide the foundation for regulatory compliance, security monitoring, and operational transparency in production AI systems.

---

## Deployment Guide

<!-- Source: guides/server-adapters/deployment.md -->

# Deployment Guide

**Deploy NeuroLink server adapters to production**

This guide covers deploying NeuroLink server adapters to various environments including Docker, Kubernetes, and serverless platforms.

## Docker Deployment

### Basic Dockerfile

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS base

# Install dependencies only when needed
FROM base AS deps
WORKDIR /app

# Install dependencies
COPY package.json package-lock.json* ./
RUN npm ci --only=production

# Build the application
FROM base AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Production image
FROM base AS runner
WORKDIR /app

ENV NODE_ENV=production

# Create non-root user
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 neurolink

# Copy built assets
COPY --from=builder --chown=neurolink:nodejs /app/dist ./dist
COPY --from=builder --chown=neurolink:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=neurolink:nodejs /app/package.json ./package.json

USER neurolink

EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/api/health || exit 1

CMD ["node", "dist/server.js"]
```

### Multi-Stage Build for Smaller Images

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder

WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage with minimal dependencies
FROM node:20-alpine AS production

WORKDIR /app

# Security: non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S neurolink -u 1001

# Copy only production dependencies
COPY package*.json ./
RUN npm ci --only=production && npm cache clean --force

# Copy built application
COPY --from=builder --chown=neurolink:nodejs /app/dist ./dist

USER neurolink
EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=3s \
  CMD wget --spider -q http://localhost:3000/api/health || exit 1

CMD ["node", "dist/server.js"]
```

### Docker Compose

```yaml
version: "3.8"

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - PORT=3000
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379
      - JWT_SECRET=${JWT_SECRET}
    depends_on:
      redis:
        condition: service_healthy
    healthcheck:
      test:
        ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 2G
        reservations:
          cpus: "0.5"
          memory: 512M

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    command: redis-server --appendonly yes

volumes:
  redis-data:
```

### Build and Run

```bash
# Build the image
docker build -t neurolink-api:latest .

# Run with environment variables
docker run -d \
  --name neurolink-api \
  -p 3000:3000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e JWT_SECRET=$JWT_SECRET \
  neurolink-api:latest

# Using docker-compose
docker-compose up -d
```

---

## Kubernetes Deployment

### Deployment Manifest

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: neurolink-api
  labels:
    app: neurolink-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: neurolink-api
  template:
    metadata:
      labels:
        app: neurolink-api
    spec:
      containers:
        - name: neurolink-api
          image: your-registry/neurolink-api:latest
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: "production"
            - name: PORT
              value: "3000"
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: neurolink-secrets
                  key: openai-api-key
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: neurolink-secrets
                  key: jwt-secret
            - name: REDIS_URL
              valueFrom:
                configMapKeyRef:
                  name: neurolink-config
                  key: redis-url
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
          # Liveness probe - is the container alive?
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 30
            timeoutSeconds: 5
            failureThreshold: 3
          # Readiness probe - is the container ready to serve traffic?
          readinessProbe:
            httpGet:
              path: /api/ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          # Startup probe - has the container started?
          startupProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 30
      terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  name: neurolink-api
spec:
  selector:
    app: neurolink-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: neurolink-api
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/rate-limit: "100"
    nginx.ingress.kubernetes.io/rate-limit-window: "1m"
spec:
  tls:
    - hosts:
        - api.yourdomain.com
      secretName: neurolink-api-tls
  rules:
    - host: api.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: neurolink-api
                port:
                  number: 80
```

### Secrets and ConfigMap

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: neurolink-secrets
type: Opaque
stringData:
  openai-api-key: "sk-..."
  anthropic-api-key: "sk-ant-..."
  jwt-secret: "your-secure-jwt-secret"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: neurolink-config
data:
  redis-url: "redis://redis-master:6379"
  log-level: "info"
  rate-limit-max: "100"
```

### Horizontal Pod Autoscaler

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: neurolink-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: neurolink-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
```

---

## Serverless Deployment

### Cloudflare Workers (Hono)

Hono is ideal for edge deployment:

```typescript
// src/worker.ts

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    basePath: "/api",
  },
});

await server.initialize();

export default {
  fetch: server.getFrameworkInstance().fetch,
};
```

```toml
# wrangler.toml
name = "neurolink-api"
main = "src/worker.ts"
compatibility_date = "2024-01-01"

[vars]
NODE_ENV = "production"

[[kv_namespaces]]
binding = "RATE_LIMIT_KV"
id = "your-kv-id"
```

### Vercel Edge Functions

```typescript
// api/[[...route]].ts

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: { basePath: "/api" },
});

await server.initialize();

export const config = {
  runtime: "edge",
};

export default server.getFrameworkInstance().fetch;
```

### AWS Lambda

```typescript
// handler.ts

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: { basePath: "/api" },
});

await server.initialize();

export const handler = handle(server.getFrameworkInstance());
```

```yaml
# serverless.yml
service: neurolink-api

provider:
  name: aws
  runtime: nodejs20.x
  region: us-east-1
  environment:
    NODE_ENV: production
    OPENAI_API_KEY: ${ssm:/neurolink/openai-api-key}

functions:
  api:
    handler: handler.handler
    events:
      - httpApi:
          path: /api/{proxy+}
          method: ANY
    timeout: 30
    memorySize: 1024
```

---

## Production Configuration Recommendations

### Server Configuration

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    // Server
    port: parseInt(process.env.PORT || "3000"),
    host: "0.0.0.0",
    timeout: 30000,

    // CORS (specific origins only)
    cors: {
      enabled: true,
      origins: process.env.ALLOWED_ORIGINS?.split(",") || [],
      methods: ["GET", "POST"],
      credentials: true,
    },

    // Rate limiting
    rateLimit: {
      enabled: true,
      maxRequests: 100,
      windowMs: 60000,
      skipPaths: ["/api/health", "/api/ready"],
    },

    // Body parsing
    bodyParser: {
      enabled: true,
      maxSize: "1mb",
      jsonLimit: "1mb",
    },

    // Logging
    logging: {
      enabled: true,
      level: "info",
      includeBody: false,
      includeResponse: false,
    },

    // Redaction (for sensitive data)
    redaction: {
      enabled: true,
      additionalFields: ["ssn", "creditCard"],
    },

    // Features
    enableMetrics: true,
    enableSwagger: false, // Disable in production
  },
});
```

### Health and Readiness Endpoints

The server adapter provides built-in health endpoints:

- `GET /api/health` - Basic health check (is the server running?)
- `GET /api/ready` - Readiness check (is the server ready to serve traffic?)
- `GET /api/version` - Version information

### Graceful Shutdown

NeuroLink server adapters support configurable graceful shutdown to ensure clean termination of active connections and requests.

#### Shutdown Configuration

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    shutdown: {
      gracefulShutdownTimeoutMs: 30000, // Max time to wait for shutdown
      drainTimeoutMs: 15000, // Max time to drain connections
      forceClose: true, // Force close if timeout exceeded
    },
  },
});
```

| Option                      | Default | Description                                        |
| --------------------------- | ------- | -------------------------------------------------- |
| `gracefulShutdownTimeoutMs` | 30000   | Maximum total time to wait for graceful shutdown   |
| `drainTimeoutMs`            | 15000   | Maximum time to wait for active connections to end |
| `forceClose`                | true    | Force close remaining connections after timeout    |

#### Shutdown Process Steps

When `server.stop()` is called, the shutdown proceeds through these steps:

1. **Stop accepting new connections** - The server immediately stops accepting new requests
2. **Drain active connections** - Active requests are allowed to complete (up to `drainTimeoutMs`)
3. **Complete graceful shutdown** - Finalize cleanup within `gracefulShutdownTimeoutMs`
4. **Force close if needed** - If `forceClose: true`, remaining connections are forcefully terminated after timeout

#### Signal Handling Example

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    shutdown: {
      gracefulShutdownTimeoutMs: 30000,
      drainTimeoutMs: 15000,
      forceClose: true,
    },
  },
});

await server.initialize();
await server.start();

// Handle SIGTERM (sent by Kubernetes, Docker, etc.)
process.on("SIGTERM", async () => {
  console.log("SIGTERM received, starting graceful shutdown...");
  await server.stop();
  process.exit(0);
});

// Handle SIGINT (Ctrl+C)
process.on("SIGINT", async () => {
  console.log("SIGINT received, starting graceful shutdown...");
  await server.stop();
  process.exit(0);
});
```

#### Complete Shutdown Handler

For production deployments, implement a comprehensive shutdown handler:

```typescript
const server = await createServer(neurolink, { framework: "hono" });
await server.initialize();
await server.start();

// Handle graceful shutdown
const shutdown = async (signal: string) => {
  console.log(`Received ${signal}. Gracefully shutting down...`);

  // Stop accepting new requests
  await server.stop();

  // Close database connections, flush logs, etc.
  await cleanup();

  process.exit(0);
};

process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));
```

#### Kubernetes Considerations

When deploying to Kubernetes, align your shutdown configuration with Kubernetes settings:

1. **Match `terminationGracePeriodSeconds` with `gracefulShutdownTimeoutMs`**

   ```yaml
   spec:
     terminationGracePeriodSeconds: 30 # Should match gracefulShutdownTimeoutMs
     containers:
       - name: neurolink-api
         # ...
   ```

2. **Use preStop hook for additional delay** (if load balancer needs time to deregister)

   ```yaml
   lifecycle:
     preStop:
       exec:
         command: ["sh", "-c", "sleep 5"]
   ```

3. **Ensure `drainTimeoutMs`  ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Log all server events
server.on("request", (event) => {
  logger.info(
    { requestId: event.requestId, path: event.path },
    "Request received",
  );
});

server.on("response", (event) => {
  logger.info(
    {
      requestId: event.requestId,
      status: event.statusCode,
      duration: event.duration,
    },
    "Response sent",
  );
});

server.on("error", (event) => {
  logger.error(
    {
      requestId: event.requestId,
      error: event.error.message,
    },
    "Request error",
  );
});
```

---

## Production Deployment Checklist

### Pre-Deployment

- [ ] All environment variables configured
- [ ] Secrets stored securely (Kubernetes Secrets, AWS Secrets Manager, etc.)
- [ ] Docker image built and tested
- [ ] Health endpoints working
- [ ] Rate limiting configured appropriately
- [ ] CORS configured with specific origins
- [ ] Authentication middleware in place
- [ ] Logging configured

### Infrastructure

- [ ] Load balancer configured
- [ ] TLS/SSL certificates provisioned
- [ ] DNS configured
- [ ] Firewall rules set
- [ ] Resource limits defined

### Monitoring

- [ ] Health check monitoring configured
- [ ] Metrics collection enabled
- [ ] Log aggregation set up
- [ ] Alerting configured
- [ ] Error tracking (Sentry, etc.) integrated

### Scaling

- [ ] Horizontal pod autoscaler configured
- [ ] Resource requests and limits set
- [ ] Redis (or equivalent) for distributed state
- [ ] Database connection pooling configured

### Security

- [ ] Non-root container user
- [ ] Read-only filesystem where possible
- [ ] Security headers configured
- [ ] Network policies defined
- [ ] Regular security scanning enabled

---

## Deployment Verification via CLI

Use CLI commands to verify your deployment:

### Pre-Deployment Checklist

```bash
# Verify configuration
neurolink server config --format json

# Check all routes are registered
neurolink server routes

# Generate OpenAPI spec for documentation
neurolink server openapi -o openapi.json
```

### Post-Deployment Verification

```bash
# Start server and verify status
neurolink server start --port 3000
neurolink server status

# Verify routes are accessible
neurolink server routes --format json

# Stop for production deployment
neurolink server stop
```

### Health Check Endpoints

After deployment, verify these endpoints are accessible:

| Endpoint           | Purpose            |
| ------------------ | ------------------ |
| `GET /api/health`  | Basic health check |
| `GET /api/ready`   | Readiness probe    |
| `GET /api/metrics` | Metrics endpoint   |

Use `neurolink server routes --group health` to list all health endpoints.

---

## Related Documentation

- **[Server Adapters Overview](/docs/)** - Getting started with server adapters
- **[Security Best Practices](/docs/guides/server-adapters/security)** - Securing your deployment
- **[Hono Adapter](/docs/guides/server-adapters/hono)** - Recommended for serverless deployments
- **[Enterprise Monitoring](/docs/observability/health-monitoring)** - Production monitoring

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Dynamic Model Configuration System

<!-- Source: guides/dynamic-models.md -->

# Dynamic Model Configuration System

This document describes the new dynamic model configuration system that replaces static enums with flexible, runtime-configurable model definitions.

##  Overview

The dynamic model system enables:

- **Runtime model discovery** from external configuration sources
- **Automatic fallback** to local configurations when external sources fail
- **Smart model resolution** with fuzzy matching and aliases
- **Capability-based search** to find models with specific features
- **Cost optimization** by automatically selecting cheapest models for tasks

## ️ Architecture

### Components

1. **Model Configuration Server** (`scripts/modelServer.js`)
   - Serves model configurations via REST API
   - Provides search and filtering capabilities
   - Can be hosted anywhere (GitHub, CDN, internal server)

2. **Dynamic Model Provider** (`src/lib/core/dynamicModels.ts`)
   - Loads configurations from multiple sources with fallback
   - Caches configurations to reduce network requests
   - Validates configurations using Zod schemas
   - Provides intelligent model resolution

3. **Model Configuration** (`config/models.json`)
   - JSON-based model definitions
   - Includes pricing, capabilities, and metadata
   - Supports aliases and provider defaults

##  Quick Start

### 1. Environment Setup

Before using the dynamic model system, ensure your provider configurations are set up correctly. See the [Provider Configuration Guide](/docs/getting-started/provider-setup) for detailed instructions.

### 2. Start the Model Server

```bash
# Start the configuration server
npm run model-server

# Or manually
node scripts/modelServer.js
```

Server runs on `http://localhost:3001` by default.

### 2. Test the System

```bash
# Run comprehensive tests
npm run test:dynamicModels

# Or manually
node test-dynamicModels.js
```

### 3. Use in Code

```typescript
// Preferred: import from the package export (no deep relative path)

// Or, when importing within this repo's source (TypeScript):
// import { dynamicModelProvider } from "./src/lib/core/dynamicModels";

// Initialize the provider
await dynamicModelProvider.initialize();

// Resolve a model
const model = dynamicModelProvider.resolveModel("anthropic", "claude-3-opus");

// Search by capability
const visionModels = dynamicModelProvider.searchByCapability("vision");

// Get best model for use case
const bestCodingModel = dynamicModelProvider.getBestModelFor("coding");
```

##  API Endpoints

### Model Server Endpoints

- `GET /health` - Health check
- `GET /api/v1/models` - Get all model configurations
- `GET /api/v1/models/:provider` - Get models for specific provider
- `GET /api/v1/search?capability=X&maxPrice=Y` - Search models by criteria

### Example API Usage

```bash
# Get all models
curl http://localhost:3001/api/v1/models

# Get OpenAI models
curl http://localhost:3001/api/v1/models/openai

# Search for functionCalling models under $0.001
curl "http://localhost:3001/api/v1/search?capability=functionCalling&maxPrice=0.001"
```

##  Configuration Schema

### Model Configuration Structure

```json
{
  "version": "1.0.0",
  "lastUpdated": "2025-06-18T12:00:00Z",
  "models": {
    "anthropic": {
      "claude-3-opus": {
        "id": "claude-3-opus-20240229",
        "displayName": "Claude 3 Opus",
        "capabilities": ["functionCalling", "vision", "analysis"],
        "deprecated": false,
        "pricing": { "input": 0.015, "output": 0.075 },
        "contextWindow": 200000,
        "releaseDate": "2024-02-29"
      }
    }
  },
  "aliases": {
    "claude-latest": "anthropic/claude-3-opus",
    "best-coding": "anthropic/claude-3-opus"
  },
  "defaults": {
    "anthropic": "claude-3-sonnet"
  }
}
```

### Key Fields

- **`id`**: Provider-specific model identifier
- **`displayName`**: Human-readable model name
- **`capabilities`**: Array of model capabilities (functionCalling, vision, etc.)
- **`deprecated`**: Whether the model is deprecated
- **`pricing`**: Input/output token costs per 1K tokens
- **`contextWindow`**: Maximum context window size
- **`releaseDate`**: Model release date

## ️ Advanced Usage

### Configuration Sources

The system tries multiple sources in order:

1. `process.env.MODEL_CONFIG_URL` - Custom URL override
2. `http://localhost:3001/api/v1/models` - Local development server
3. `https://raw.githubusercontent.com/juspay/neurolink/release/config/models.json` - GitHub
4. `./config/models.json` - Local fallback

### Model Resolution Logic

```typescript
// Exact match
resolveModel("anthropic", "claude-3-opus");

// Default model for provider
resolveModel("anthropic"); // Uses defaults.anthropic

// Alias resolution
resolveModel("anthropic", "claude-latest"); // Resolves alias

// Fuzzy matching
resolveModel("anthropic", "opus"); // Matches 'claude-3-opus'
```

### Capability Search Options

```typescript
searchByCapability("functionCalling", {
  provider: "openai", // Filter by provider
  maxPrice: 0.001, // Maximum input price per 1K tokens
  excludeDeprecated: true, // Exclude deprecated models
});
```

##  Migration from Static Enums

### Before (Static Enums)

```typescript
export enum BedrockModels {
  CLAUDE_3_SONNET = "anthropic.claude-3-sonnet-20240229-v1:0",
  // Hard to maintain, becomes stale
}
```

### After (Dynamic Resolution)

```typescript
// Backward compatible aliases
export const ModelAliases = {
  CLAUDE_LATEST: () =>
    dynamicModelProvider.resolveModel("anthropic", "claude-3"),
  GPT_LATEST: () => dynamicModelProvider.resolveModel("openai", "gpt-4"),
  BEST_CODING: () => dynamicModelProvider.getBestModelFor("coding"),
} as const;

// Usage stays the same
const provider = AIProviderFactory.createProvider(
  "anthropic",
  ModelAliases.CLAUDE_LATEST(),
);
```

##  Production Deployment

### Environment Variables

```bash
# Custom model configuration URL
MODEL_CONFIG_URL=https://api.yourcompany.com/ai/models

# Server port (default: 3001)
MODEL_SERVER_PORT=8080
```

### Hosting Configuration

1. **GitHub Pages**: Host `models.json` as static file
2. **CDN**: Use CloudFlare/AWS CloudFront for global distribution
3. **Internal API**: Integrate with existing infrastructure
4. **File System**: Local configurations for air-gapped environments

### Cache Strategy

- **5-minute cache**: Balances freshness with performance
- **Graceful degradation**: Falls back to cached data on network failures
- **Manual refresh**: `dynamicModelProvider.refresh()` for immediate updates

##  Testing

The test suite verifies:

✅ Model provider initialization
✅ Configuration loading from multiple sources
✅ Model resolution (exact, default, fuzzy, alias)
✅ Capability-based search
✅ Best model selection algorithms
✅ Error handling and fallbacks

Run tests with:

```bash
npm run test:dynamicModels
```

##  Benefits

- ** Future-Proof**: New models automatically available
- ** Cost-Optimized**: Runtime selection based on pricing
- **️ Reliable**: Multiple fallback sources
- **⚡ Fast**: Cached configurations with smart invalidation
- ** Type-Safe**: Zod schemas ensure runtime safety
- ** Backward Compatible**: Existing code continues working

This system transforms static model definitions into a dynamic, self-updating platform that scales with the rapidly evolving AI landscape.

---

## Migrating from Vercel AI SDK to NeuroLink

<!-- Source: guides/migration/from-vercel-ai-sdk.md -->

# Migrating from Vercel AI SDK to NeuroLink

## Why Migrate?

While Vercel AI SDK is excellent for Next.js applications, NeuroLink offers broader capabilities for enterprise and multi-framework applications:

| Benefit                 | Vercel AI SDK                  | NeuroLink                                |
| ----------------------- | ------------------------------ | ---------------------------------------- |
| **Multi-Provider**      | Separate packages per provider | 13 providers in single package           |
| **Framework Support**   | Optimized for Next.js          | Next.js, SvelteKit, Express, any Node.js |
| **Tool Integration**    | Function calling only          | MCP (58+ servers) + function calling     |
| **Enterprise Features** | Basic                          | HITL, Redis memory, middleware, failover |
| **Memory/State**        | useChat hook (client-side)     | Redis-backed server-side memory          |
| **Production Ready**    | Good for prototypes            | Battle-tested at enterprise scale        |
| **Bundle Size**         | Moderate                       | Optimized, tree-shakeable                |
| **Streaming**           | Excellent                      | Excellent (same quality)                 |

**Migration time:** Most Next.js apps can migrate in 2-3 hours with feature parity and enhanced capabilities.

--------------------------------- | -------------------------------- | ------------------------------------- |
| `generateText()`                     | `generate()`                     | Similar API, unified across providers |
| `streamText()`                       | `generate({ stream: true })`     | Built-in streaming                    |
| `useChat()`                          | Custom hook + API route          | Server-side memory more robust        |
| `CoreMessage`                        | `ChatMessage`                    | Type compatible                       |
| `tool()` function                    | MCP Tools                        | More powerful, 58+ servers            |
| Provider packages (`@ai-sdk/openai`) | `provider` parameter             | Single package                        |
| `generateObject()`                   | `generate({ structuredOutput })` | Zod schema validation                 |
| Edge Runtime                         | Node.js runtime                  | Compatible with Edge via adapters     |

---

## Quick Start Migration

### Before (Vercel AI SDK)

```typescript

const { text } = await generateText({
  model: openai("gpt-4"),
  prompt: "Write a haiku about programming",
});

console.log(text);
```

### After (NeuroLink)

```typescript

const neurolink = new NeuroLink({
  provider: "openai",
  model: "gpt-4",
});

const result = await neurolink.generate({
  input: { text: "Write a haiku about programming" },
});

console.log(result.content);
```

**Key changes:**

- Single import instead of multiple packages
- Unified `generate()` method
- `content` instead of `text` property
- Provider specified in config, not per-call

---

## Feature-by-Feature Migration

### 1. Text Generation

**Vercel AI SDK:**

```typescript

const result = await generateText({
  model: openai("gpt-4"),
  prompt: "Explain TypeScript",
  temperature: 0.7,
  maxTokens: 500,
});

console.log(result.text);
console.log(result.usage);
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

const result = await neurolink.generate({
  input: { text: "Explain TypeScript" },
  model: "gpt-4",
  temperature: 0.7,
  maxTokens: 500,
});

console.log(result.content);
console.log(result.usage); // { promptTokens, completionTokens, totalTokens }
```

---

### 2. Streaming

**Vercel AI SDK:**

```typescript

const result = await streamText({
  model: openai("gpt-4"),
  prompt: "Tell me a story",
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

const result = await neurolink.generate({
  input: { text: "Tell me a story" },
  model: "gpt-4",
  stream: true,
});

for await (const chunk of result.stream!) {
  process.stdout.write(chunk.delta);
}
```

**Full chunk data:**

```typescript
for await (const chunk of result.stream!) {
  console.log(chunk.delta); // Text delta
  console.log(chunk.contentType); // 'text' | 'tool_call'
  console.log(chunk.toolCalls); // Tool calls if any
}
```

---

### 3. Tool Calling (Function Calling)

**Vercel AI SDK:**

```typescript

const result = await generateText({
  model: openai("gpt-4"),
  prompt: "What is the weather in San Francisco?",
  tools: {
    getWeather: {
      description: "Get weather for a location",
      parameters: z.object({
        location: z.string(),
      }),
      execute: async ({ location }) => {
        return { temp: 72, condition: "Sunny" };
      },
    },
  },
});
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

// Option 1: Register custom tool
neurolink.registerTool("getWeather", {
  name: "getWeather",
  description: "Get weather for a location",
  inputSchema: {
    type: "object",
    properties: {
      location: { type: "string" },
    },
    required: ["location"],
  },
  execute: async ({ location }) => {
    return { temp: 72, condition: "Sunny" };
  },
});

const result = await neurolink.generate({
  input: { text: "What is the weather in San Francisco?" },
  model: "gpt-4",
});

// Option 2: Use MCP server (more powerful)
await neurolink.addExternalMCPServer("weather", {
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-weather"],
  transport: "stdio",
  env: { WEATHER_API_KEY: process.env.WEATHER_API_KEY },
});

const result2 = await neurolink.generate({
  input: { text: "What is the weather in San Francisco?" },
});
```

**Benefits:**

- MCP servers provide 58+ pre-built integrations
- No manual tool registration needed
- Tools work across all providers

---

### 4. Structured Output

**Vercel AI SDK:**

```typescript

const result = await generateObject({
  model: openai("gpt-4"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
    email: z.string().email(),
  }),
  prompt: "Generate a user profile for John Doe, age 30",
});

console.log(result.object); // { name: "John Doe", age: 30, email: "..." }
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

const schema = z.object({
  name: z.string(),
  age: z.number(),
  email: z.string().email(),
});

const result = await neurolink.generate({
  input: { text: "Generate a user profile for John Doe, age 30" },
  model: "gpt-4",
  structuredOutput: {
    format: "json",
    schema,
  },
});

console.log(result.structuredOutput); // { name: "John Doe", age: 30, email: "..." }
// Automatically validated against Zod schema
```

**Benefits:**

- Type-safe results
- Automatic validation
- Works across all providers

---

### 5. Multi-Provider Support

**Vercel AI SDK:**

```typescript

// OpenAI
const result1 = await generateText({
  model: openai("gpt-4"),
  prompt: "Hello",
});

// Anthropic (requires separate package)
const result2 = await generateText({
  model: anthropic("claude-3-5-sonnet-20241022"),
  prompt: "Hello",
});
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink();

// OpenAI
const result1 = await neurolink.generate({
  input: { text: "Hello" },
  provider: "openai",
  model: "gpt-4",
});

// Anthropic (same package)
const result2 = await neurolink.generate({
  input: { text: "Hello" },
  provider: "anthropic",
  model: "claude-3-5-sonnet-20241022",
});

// Or set default provider
const neurolinkAnthropic = new NeuroLink({ provider: "anthropic" });
```

**With automatic failover:**

```typescript
const neurolink = new NeuroLink({
  provider: "openai",
  fallbackProviders: ["anthropic", "vertex"],
});

// Automatically tries Anthropic or Vertex if OpenAI fails
const result = await neurolink.generate({
  input: { text: "Hello" },
});
```

**Benefits:**

- Single package for all 13 providers
- Runtime provider switching
- Automatic failover
- No need to install separate packages

---

## Next.js Integration

### Pattern 1: API Routes

**Vercel AI SDK:**

```typescript
// app/api/chat/route.ts

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai("gpt-4"),
    messages,
  });

  return result.toAIStreamResponse();
}
```

**NeuroLink:**

```typescript
// app/api/chat/route.ts

const neurolink = new NeuroLink({
  provider: "openai",
  conversationMemory: {
    enabled: true,
    store: "redis", // Persistent across instances
  },
});

export async function POST(req: Request) {
  const { message } = await req.json();

  const result = await neurolink.generate({
    input: { text: message },
    model: "gpt-4",
    stream: true,
  });

  // Convert stream to Response
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      for await (const chunk of result.stream!) {
        controller.enqueue(encoder.encode(chunk.delta));
      }
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```

**With better error handling:**

```typescript
export async function POST(req: Request) {
  try {
    const { message } = await req.json();

    const result = await neurolink.generate({
      input: { text: message },
      stream: true,
    });

    const encoder = new TextEncoder();
    const stream = new ReadableStream({
      async start(controller) {
        try {
          for await (const chunk of result.stream!) {
            controller.enqueue(
              encoder.encode(`data: ${JSON.stringify(chunk)}\n\n`),
            );
          }
          controller.enqueue(encoder.encode("data: [DONE]\n\n"));
        } catch (error) {
          controller.enqueue(
            encoder.encode(
              `data: ${JSON.stringify({ error: "Stream error" })}\n\n`,
            ),
          );
        } finally {
          controller.close();
        }
      },
    });

    return new Response(stream, {
      headers: {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        Connection: "keep-alive",
      },
    });
  } catch (error) {
    return NextResponse.json(
      { error: "Failed to generate response" },
      { status: 500 },
    );
  }
}
```

---

### Pattern 2: Server Components

**Vercel AI SDK:**

```typescript
// app/page.tsx (Server Component)

export default async function Page() {
  const { text } = await generateText({
    model: openai('gpt-4'),
    prompt: 'Generate a welcome message',
  });

  return {text};
}
```

**NeuroLink:**

```typescript
// app/page.tsx (Server Component)

const neurolink = new NeuroLink({ provider: "openai" });

export default async function Page() {
  const result = await neurolink.generate({
    input: { text: "Generate a welcome message" },
    model: "gpt-4"
  });

  return {result.content};
}
```

**With caching:**

```typescript

const neurolink = new NeuroLink({
  provider: "openai",
  conversationMemory: {
    enabled: true,
    store: "redis",
    ttl: 3600  // Cache for 1 hour
  }
});

export default async function Page() {
  const result = await neurolink.generate({
    input: { text: "Generate a welcome message" }
  });

  return {result.content};
}

// Enable Next.js caching
export const revalidate = 3600;  // Revalidate every hour
```

---

### Pattern 3: useChat Alternative

**Vercel AI SDK:**

```typescript
// app/chat/page.tsx
'use client';

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat',
  });

  return (

      {messages.map(m => (
        {m.content}
      ))}


  );
}
```

**NeuroLink:**

```typescript
// app/chat/page.tsx
'use client';

export default function Chat() {
  const [messages, setMessages] = useState>([]);
  const [input, setInput] = useState('');
  const [isLoading, setIsLoading] = useState(false);

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!input.trim()) return;

    const userMessage = { role: 'user', content: input };
    setMessages(prev => [...prev, userMessage]);
    setInput('');
    setIsLoading(true);

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message: input })
      });

      const reader = response.body?.getReader();
      const decoder = new TextDecoder();
      let assistantMessage = '';

      while (true) {
        const { done, value } = await reader!.read();
        if (done) break;

        const chunk = decoder.decode(value);
        assistantMessage += chunk;

        // Update UI in real-time
        setMessages(prev => {
          const updated = [...prev];
          if (updated[updated.length - 1]?.role === 'assistant') {
            updated[updated.length - 1].content = assistantMessage;
          } else {
            updated.push({ role: 'assistant', content: assistantMessage });
          }
          return updated;
        });
      }
    } catch (error) {
      console.error('Error:', error);
    } finally {
      setIsLoading(false);
    }
  };

  return (

      {messages.map((m, i) => (
        {m.content}
      ))}

         setInput(e.target.value)}
          disabled={isLoading}
        />


  );
}
```

**Or create a custom hook:**

```typescript
// hooks/useNeuroLink.ts

export function useNeuroLink() {
  const [messages, setMessages] = useState
  >([]);
  const [isLoading, setIsLoading] = useState(false);

  const sendMessage = useCallback(async (message: string) => {
    setMessages((prev) => [...prev, { role: "user", content: message }]);
    setIsLoading(true);

    try {
      const response = await fetch("/api/chat", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ message }),
      });

      const reader = response.body?.getReader();
      const decoder = new TextDecoder();
      let content = "";

      while (true) {
        const { done, value } = await reader!.read();
        if (done) break;
        content += decoder.decode(value);

        setMessages((prev) => {
          const updated = [...prev];
          if (updated[updated.length - 1]?.role === "assistant") {
            updated[updated.length - 1].content = content;
          } else {
            updated.push({ role: "assistant", content });
          }
          return updated;
        });
      }
    } finally {
      setIsLoading(false);
    }
  }, []);

  return { messages, sendMessage, isLoading };
}

// Usage
const { messages, sendMessage, isLoading } = useNeuroLink();
```

---

### Pattern 4: Server Actions

**Vercel AI SDK:**

```typescript
// app/actions.ts
"use server";

export async function generateResponse(message: string) {
  const { text } = await generateText({
    model: openai("gpt-4"),
    prompt: message,
  });
  return text;
}
```

**NeuroLink:**

```typescript
// app/actions.ts
"use server";

const neurolink = new NeuroLink({
  provider: "openai",
  conversationMemory: {
    enabled: true,
    store: "redis",
  },
});

export async function generateResponse(message: string) {
  const result = await neurolink.generate({
    input: { text: message },
    model: "gpt-4",
  });
  return result.content;
}
```

**With user context:**

```typescript
"use server";

export async function generateResponse(message: string) {
  const userId = cookies().get("userId")?.value;

  const neurolink = new NeuroLink({
    provider: "openai",
    conversationMemory: {
      enabled: true,
      store: "redis",
      namespace: userId, // User-specific conversations
    },
  });

  const result = await neurolink.generate({
    input: { text: message },
  });

  return result.content;
}
```

---

## Edge Runtime Support

**Vercel AI SDK:**

```typescript
// app/api/chat/route.ts

export const runtime = "edge";

export async function POST(req: Request) {
  const result = await streamText({
    model: openai("gpt-4"),
    prompt: "Hello",
  });
  return result.toAIStreamResponse();
}
```

**NeuroLink:**

```typescript
// app/api/chat/route.ts

// Note: NeuroLink is designed for Node.js runtime
// For Edge Runtime, use fetch API directly:
export const runtime = "edge";

export async function POST(req: Request) {
  const { message } = await req.json();

  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [{ role: "user", content: message }],
      stream: true,
    }),
  });

  return response;
}

// Alternative: Use Node.js runtime (recommended for NeuroLink)
export const runtime = "nodejs";

const neurolink = new NeuroLink({ provider: "openai" });

export async function POST(req: Request) {
  const { message } = await req.json();

  const result = await neurolink.generate({
    input: { text: message },
    stream: true,
  });

  // Convert to Response...
}
```

**Recommendation:** NeuroLink works best with Node.js runtime. For Edge Runtime, consider using provider APIs directly or wait for Edge-compatible version.

---

## Multimodal Support

**Vercel AI SDK:**

```typescript

const result = await generateText({
  model: openai("gpt-4-vision-preview"),
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this image?" },
        { type: "image", image: imageUrl },
      ],
    },
  ],
});
```

**NeuroLink:**

```typescript

const neurolink = new NeuroLink({ provider: "openai" });

const result = await neurolink.generate({
  input: {
    text: "What is in this image?",
    images: [{ url: imageUrl }],
  },
  model: "gpt-4-vision-preview",
});
```

**With file path:**

```typescript
const result = await neurolink.generate({
  input: {
    text: "What is in this image?",
    images: [{ path: "./image.jpg" }],
  },
});
```

**With PDF:**

```typescript
const result = await neurolink.generate({
  input: {
    text: "Summarize this document",
    pdfs: [{ path: "./document.pdf" }],
  },
  provider: "vertex", // Vertex has native PDF support
});
```

---

## Migration Checklist

- [ ] **Install NeuroLink**: `npm install @juspay/neurolink`
- [ ] **Setup Environment**: Configure API keys in `.env`
- [ ] **Test Basic Generation**: Verify `generate()` works
- [ ] **Migrate API Routes**: Update `/api` routes
- [ ] **Migrate Server Components**: Update RSC usage
- [ ] **Update Client Components**: Replace `useChat` with custom hook
- [ ] **Migrate Tool Calling**: Convert functions to MCP tools
- [ ] **Enable Conversation Memory**: Add Redis if needed
- [ ] **Update Streaming**: Adapt streaming code
- [ ] **Test Multi-Provider**: Verify provider switching
- [ ] **Update Types**: Use NeuroLink types
- [ ] **Remove Vercel AI SDK**: Uninstall after migration

---

## Performance Comparison

| Metric                 | Vercel AI SDK     | NeuroLink      | Notes               |
| ---------------------- | ----------------- | -------------- | ------------------- |
| Bundle Size (minified) | 890KB             | 890KB          | Similar             |
| First Response         | 420ms             | 420ms          | Equivalent          |
| Streaming Latency      | Excellent         | Excellent      | Both optimized      |
| Multi-Provider         | Requires packages | Single package | NeuroLink advantage |
| Redis Support          | Manual            | Built-in       | NeuroLink advantage |

---

## Common Migration Patterns

### 1. Simple Text Generation

**Before:**

```typescript
const { text } = await generateText({
  model: openai("gpt-4"),
  prompt: "Hello",
});
```

**After:**

```typescript
const result = await neurolink.generate({
  input: { text: "Hello" },
  provider: "openai",
});
```

### 2. Streaming

**Before:**

```typescript
const result = await streamText({ model: openai("gpt-4"), prompt: "Story" });
for await (const chunk of result.textStream) {
}
```

**After:**

```typescript
const result = await neurolink.generate({
  input: { text: "Story" },
  stream: true,
});
for await (const chunk of result.stream!) {
}
```

### 3. Structured Output

**Before:**

```typescript
const result = await generateObject({
  model: openai("gpt-4"),
  schema,
  prompt: "...",
});
```

**After:**

```typescript
const result = await neurolink.generate({
  input: { text: "..." },
  structuredOutput: { format: "json", schema },
});
```

---

## Getting Help

- **Documentation**: [https://neurolink.dev/docs](https://neurolink.dev/docs)
- **Migration Support**: [GitHub Discussions](https://github.com/juspay/neurolink/discussions)
- **Examples**: [Next.js Examples](https://github.com/juspay/neurolink-examples/tree/main/nextjs)
- **Discord**: [Join community](https://discord.gg/neurolink)

---

## See Also

- [NeuroLink Getting Started](/docs/getting-started/quick-start)
- [Next.js Integration Guide](/docs/sdk/framework-integration.md#nextjs-integration)
- [API Reference](/docs/sdk/api-reference)
- [Streaming Guide](/docs/advanced/streaming)
- [Redis Configuration](/docs/guides/redis-configuration)
- [Provider Comparison](/docs/reference/provider-comparison)

---

## Fastify Integration Guide

<!-- Source: guides/frameworks/fastify.md -->

# Fastify Integration Guide

**Build high-performance AI APIs with Fastify and NeuroLink**

## Quick Start

### 1. Initialize Project

```bash
mkdir my-ai-api
cd my-ai-api
npm init -y
npm install fastify @juspay/neurolink dotenv
npm install @fastify/type-provider-typebox @sinclair/typebox
npm install -D @types/node typescript ts-node
```

### 2. Setup TypeScript

```json
// tsconfig.json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true
  }
}
```

### 3. Create Basic Server

```typescript
// src/index.ts

dotenv.config();

// Initialize Fastify with TypeBox type provider
const app = Fastify({
  logger: true,
}).withTypeProvider();

// Initialize NeuroLink
const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
    {
      name: "anthropic",
      config: { apiKey: process.env.ANTHROPIC_API_KEY },
    },
  ],
});

// Request schema with TypeBox
const GenerateSchema = {
  body: Type.Object({
    prompt: Type.String({ minLength: 1, maxLength: 10000 }),
    provider: Type.Optional(Type.String()),
    model: Type.Optional(Type.String()),
  }),
};

type GenerateBody = Static;

// Basic endpoint with schema validation
app.post(
  "/api/generate",
  { schema: GenerateSchema },
  async (request, reply) => {
    const { prompt, provider = "openai", model = "gpt-4o-mini" } = request.body;

    const result = await ai.generate({
      input: { text: prompt },
      provider,
      model,
    });

    return {
      content: result.content,
      usage: result.usage,
      cost: result.cost,
    };
  },
);

// Start server
const start = async () => {
  try {
    const PORT = parseInt(process.env.PORT || "3000", 10);
    await app.listen({ port: PORT, host: "0.0.0.0" });
    console.log(`AI API server running on http://localhost:${PORT}`);
  } catch (error) {
    app.log.error(error);
    process.exit(1);
  }
};

start();
```

### 4. Environment Variables

```bash
# .env
PORT=3000
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=AIza...
```

### 5. Run Server

```bash
npx ts-node src/index.ts
```

### 6. Test API

```bash
curl -X POST http://localhost:3000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain AI in one sentence"}'
```

---

## Authentication

### API Key Authentication with Decorators

```typescript
// src/plugins/api-key-auth.ts

declare module "fastify" {
  interface FastifyInstance {
    apiKeyAuth: (request: FastifyRequest, reply: FastifyReply) => Promise;
  }
}

async function apiKeyAuthPlugin(fastify: FastifyInstance) {
  fastify.decorate(
    "apiKeyAuth",
    async function (request: FastifyRequest, reply: FastifyReply) {
      const apiKey = request.headers["x-api-key"] as string;

      if (!apiKey) {
        reply.code(401).send({ error: "API key is required" });
        return;
      }

      if (apiKey !== process.env.API_SECRET) {
        reply.code(401).send({ error: "Invalid API key" });
        return;
      }
    },
  );
}

export default fp(apiKeyAuthPlugin, { name: "api-key-auth" });
```

```typescript
// src/index.ts

await app.register(apiKeyAuthPlugin);

// Protected endpoint
app.post(
  "/api/generate",
  { preHandler: [app.apiKeyAuth], schema: GenerateSchema },
  async (request, reply) => {
    // ... AI generation
  },
);
```

### JWT Authentication with @fastify/jwt

```bash
npm install @fastify/jwt
```

```typescript
// src/plugins/jwt-auth.ts

declare module "@fastify/jwt" {
  interface FastifyJWT {
    payload: { userId: string; username: string };
    user: { userId: string; username: string };
  }
}

declare module "fastify" {
  interface FastifyInstance {
    authenticate: (
      request: FastifyRequest,
      reply: FastifyReply,
    ) => Promise;
  }
}

async function jwtAuthPlugin(fastify: FastifyInstance) {
  await fastify.register(fastifyJwt, {
    secret: process.env.JWT_SECRET || "supersecret",
    sign: { expiresIn: "24h" },
  });

  fastify.decorate(
    "authenticate",
    async function (request: FastifyRequest, reply: FastifyReply) {
      try {
        await request.jwtVerify();
      } catch (error) {
        reply.code(401).send({ error: "Invalid or expired token" });
      }
    },
  );
}

export default fp(jwtAuthPlugin, { name: "jwt-auth" });
```

```typescript
// Login endpoint
app.post("/api/auth/login", async (request, reply) => {
  const { username, password } = request.body as any;

  if (username === "admin" && password === "password") {
    const token = app.jwt.sign({ userId: "123", username });
    return { token, expiresIn: "24h" };
  }

  reply.code(401).send({ error: "Invalid credentials" });
});

// Protected endpoint
app.post(
  "/api/generate",
  { preHandler: [app.authenticate] },
  async (request, reply) => {
    const user = request.user;
    // ... AI generation
  },
);
```

---

## Rate Limiting

### @fastify/rate-limit Plugin

```bash
npm install @fastify/rate-limit
```

```typescript
// src/plugins/rate-limit.ts

async function rateLimitPlugin(fastify: FastifyInstance) {
  await fastify.register(rateLimit, {
    max: 100,
    timeWindow: "1 minute",
    errorResponseBuilder: (request, context) => ({
      error: "Too Many Requests",
      message: `Rate limit exceeded. Try again in ${Math.round(context.ttl / 1000)} seconds.`,
      statusCode: 429,
    }),
    keyGenerator: (request) =>
      (request.headers["x-api-key"] as string) ||
      request.user?.userId ||
      request.ip,
  });
}

export default fp(rateLimitPlugin, { name: "rate-limit" });
```

```typescript
// Route-specific rate limit
app.post(
  "/api/analyze",
  {
    config: {
      rateLimit: { max: 10, timeWindow: "1 minute" },
    },
  },
  async (request, reply) => {
    // Expensive AI operation
  },
);
```

### Redis-Based Custom Rate Limiting

```bash
npm install @fastify/rate-limit ioredis
```

```typescript
// src/plugins/redis-rate-limit.ts

async function redisRateLimitPlugin(fastify: FastifyInstance) {
  const redis = new Redis(process.env.REDIS_URL || "redis://localhost:6379");

  await fastify.register(rateLimit, {
    global: true,
    max: 100,
    timeWindow: "1 minute",
    redis: redis,
    nameSpace: "rate-limit:",
    skipOnError: true,
  });

  fastify.addHook("onClose", async () => {
    await redis.quit();
  });
}

export default fp(redisRateLimitPlugin, { name: "redis-rate-limit" });
```

---

## Response Caching

### Redis Caching with Hooks

```bash
npm install ioredis
```

```typescript
// src/plugins/cache.ts

declare module "fastify" {
  interface FastifyInstance {
    cache: Redis;
    cacheResponse: (ttl: number) => {
      onRequest: (
        request: FastifyRequest,
        reply: FastifyReply,
      ) => Promise;
      onSend: (
        request: FastifyRequest,
        reply: FastifyReply,
        payload: string,
      ) => Promise;
    };
  }
  interface FastifyRequest {
    cacheKey?: string;
  }
}

async function cachePlugin(fastify: FastifyInstance) {
  const redis = new Redis(process.env.REDIS_URL || "redis://localhost:6379");

  fastify.decorate("cache", redis);

  fastify.decorate("cacheResponse", (ttl: number = 3600) => ({
    onRequest: async (request: FastifyRequest, reply: FastifyReply) => {
      const keyData = { url: request.url, body: request.body };
      request.cacheKey = `ai:${createHash("sha256")
        .update(JSON.stringify(keyData))
        .digest("hex")}`;

      const cached = await redis.get(request.cacheKey);
      if (cached) {
        reply.header("X-Cache", "HIT");
        reply.send(JSON.parse(cached));
      }
    },

    onSend: async (
      request: FastifyRequest,
      reply: FastifyReply,
      payload: string,
    ) => {
      if (request.cacheKey && reply.statusCode === 200) {
        await redis.setex(request.cacheKey, ttl, payload);
      }
      return payload;
    },
  }));

  fastify.addHook("onClose", async () => {
    await redis.quit();
  });
}

export default fp(cachePlugin, { name: "cache" });
```

```typescript
// Cached endpoint
const cacheHooks = app.cacheResponse(3600);

app.post(
  "/api/generate",
  {
    onRequest: cacheHooks.onRequest,
    onSend: cacheHooks.onSend,
  },
  async (request, reply) => {
    const result = await ai.generate({
      input: { text: request.body.prompt },
    });
    return { content: result.content, usage: result.usage };
  },
);
```

---

## Streaming Responses

### Server-Sent Events (SSE) with reply.raw

```typescript
// src/routes/stream.ts

const StreamSchema = {
  body: Type.Object({
    prompt: Type.String({ minLength: 1 }),
    provider: Type.Optional(Type.String()),
  }),
};

type StreamBody = Static;

export default async function streamRoutes(fastify: FastifyInstance) {
  fastify.post(
    "/stream",
    { schema: StreamSchema },
    async (request, reply) => {
      const { prompt, provider = "openai" } = request.body;

      // Set SSE headers using reply.raw
      reply.raw.writeHead(200, {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        Connection: "keep-alive",
      });

      try {
        for await (const chunk of fastify.ai.stream({
          input: { text: prompt },
          provider,
        })) {
          reply.raw.write(
            `data: ${JSON.stringify({ content: chunk.content })}\n\n`,
          );
        }

        reply.raw.write("data: [DONE]\n\n");
        reply.raw.end();
      } catch (error: any) {
        reply.raw.write(
          `data: ${JSON.stringify({ error: error.message })}\n\n`,
        );
        reply.raw.end();
      }
    },
  );
}
```

### WebSocket with @fastify/websocket

```bash
npm install @fastify/websocket
```

```typescript
// src/routes/websocket.ts

export default async function websocketRoutes(fastify: FastifyInstance) {
  fastify.get("/ws", { websocket: true }, (socket, request) => {
    request.log.info("WebSocket client connected");

    socket.on("message", async (rawData: Buffer) => {
      try {
        const { prompt, provider = "openai" } = JSON.parse(rawData.toString());

        socket.send(JSON.stringify({ type: "start" }));

        for await (const chunk of fastify.ai.stream({
          input: { text: prompt },
          provider,
        })) {
          socket.send(
            JSON.stringify({ type: "chunk", content: chunk.content }),
          );
        }

        socket.send(JSON.stringify({ type: "done" }));
      } catch (error: any) {
        socket.send(JSON.stringify({ type: "error", error: error.message }));
      }
    });

    socket.on("close", () => {
      request.log.info("WebSocket client disconnected");
    });
  });
}
```

```typescript
// src/index.ts

await app.register(websocket);
await app.register(websocketRoutes);
```

---

## Production Patterns

### Pattern 1: Plugin Architecture

```typescript
// src/plugins/neurolink.ts

declare module "fastify" {
  interface FastifyInstance {
    ai: NeuroLink;
  }
}

async function neuroLinkPlugin(
  fastify: FastifyInstance,
  options: { providers: Array }> },
) {
  const ai = new NeuroLink({ providers: options.providers });
  fastify.decorate("ai", ai);
  fastify.log.info("NeuroLink initialized");
}

export default fp(neuroLinkPlugin, { name: "neurolink" });
```

```typescript
// src/index.ts

await app.register(neuroLinkPlugin, {
  providers: [
    { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
    { name: "anthropic", config: { apiKey: process.env.ANTHROPIC_API_KEY } },
  ],
});

// Now use app.ai anywhere
app.post("/api/generate", async (request, reply) => {
  const result = await app.ai.generate({
    input: { text: request.body.prompt },
  });
  return { content: result.content };
});
```

### Pattern 2: Usage Tracking with Hooks

```typescript
// src/plugins/usage-tracking.ts

async function usageTrackingPlugin(fastify: FastifyInstance) {
  fastify.addHook(
    "onSend",
    async (request: FastifyRequest, reply: FastifyReply, payload: string) => {
      if (reply.statusCode === 200) {
        try {
          const response = JSON.parse(payload);
          if (response.usage) {
            await fastify.cache.lpush(
              `usage:${request.user?.userId || "anonymous"}`,
              JSON.stringify({
                tokens: response.usage.totalTokens,
                cost: response.cost,
                timestamp: new Date(),
              }),
            );
          }
        } catch (error) {
          // Ignore non-JSON responses
        }
      }
      return payload;
    },
  );
}

export default fp(usageTrackingPlugin, { name: "usage-tracking" });
```

### Pattern 3: Error Handler with setErrorHandler

```typescript
// src/plugins/error-handler.ts

async function errorHandlerPlugin(fastify: FastifyInstance) {
  fastify.setErrorHandler(
    async (error: FastifyError, request, reply: FastifyReply) => {
      request.log.error({ error: error.message }, "Request error");

      if (error.message.includes("rate limit") || error.statusCode === 429) {
        return reply.code(429).send({
          error: "Rate Limit Exceeded",
          message: "Too many requests. Please try again later.",
        });
      }

      if (error.message.includes("quota")) {
        return reply.code(503).send({
          error: "Service Quota Exceeded",
          message: "AI service quota exceeded.",
        });
      }

      if (error.validation) {
        return reply.code(400).send({
          error: "Validation Error",
          details: error.validation,
        });
      }

      return reply.code(error.statusCode || 500).send({
        error: "Internal Server Error",
        message:
          process.env.NODE_ENV === "development"
            ? error.message
            : "Something went wrong",
      });
    },
  );
}

export default fp(errorHandlerPlugin, { name: "error-handler" });
```

---

## Schema Validation

### TypeBox Schema Definitions

```typescript
// src/schemas/ai.ts

export const ProviderSchema = Type.Union([
  Type.Literal("openai"),
  Type.Literal("anthropic"),
  Type.Literal("google-ai"),
]);

export const GenerateRequestSchema = Type.Object({
  prompt: Type.String({ minLength: 1, maxLength: 100000 }),
  provider: Type.Optional(ProviderSchema),
  model: Type.Optional(Type.String()),
  maxTokens: Type.Optional(Type.Integer({ minimum: 1, maximum: 128000 })),
  temperature: Type.Optional(Type.Number({ minimum: 0, maximum: 2 })),
});

export type GenerateRequest = Static;

export const GenerateResponseSchema = Type.Object({
  content: Type.String(),
  provider: Type.String(),
  model: Type.String(),
  usage: Type.Object({
    inputTokens: Type.Integer(),
    outputTokens: Type.Integer(),
    totalTokens: Type.Integer(),
  }),
  cost: Type.Optional(Type.Number()),
});

export const ErrorResponseSchema = Type.Object({
  error: Type.String(),
  message: Type.String(),
  details: Type.Optional(Type.Any()),
});
```

### Route with Full Schema Validation

```typescript
// src/routes/ai.ts

  GenerateRequestSchema,
  GenerateResponseSchema,
  GenerateRequest,
  ErrorResponseSchema,
} from "../schemas/ai";

export default async function aiRoutes(fastify: FastifyInstance) {
  fastify.post(
    "/generate",
    {
      schema: {
        body: GenerateRequestSchema,
        response: {
          200: GenerateResponseSchema,
          400: ErrorResponseSchema,
          429: ErrorResponseSchema,
        },
      },
      preHandler: [fastify.authenticate],
    },
    async (request, reply) => {
      const {
        prompt,
        provider = "openai",
        model,
        maxTokens,
        temperature,
      } = request.body;

      const result = await fastify.ai.generate({
        input: { text: prompt },
        provider,
        model,
        maxTokens,
        temperature,
      });

      return {
        content: result.content,
        provider: result.provider,
        model: result.model,
        usage: result.usage,
        cost: result.cost,
      };
    },
  );
}
```

### Validation Options

```typescript
// src/index.ts
const app = Fastify({
  logger: true,
  ajv: {
    customOptions: {
      removeAdditional: "all",
      coerceTypes: true,
      useDefaults: true,
      allErrors: true,
    },
  },
}).withTypeProvider();
```

---

## Monitoring and Logging

### Pino Logger (Built-in)

```typescript
// src/index.ts
const app = Fastify({
  logger: {
    level: process.env.LOG_LEVEL || "info",
    transport:
      process.env.NODE_ENV === "development"
        ? { target: "pino-pretty", options: { colorize: true } }
        : undefined,
    redact: ["req.headers.authorization", "req.headers['x-api-key']"],
  },
});

// Log AI operations
app.post("/api/generate", async (request, reply) => {
  const startTime = Date.now();
  request.log.info(
    { prompt: request.body.prompt.slice(0, 50) },
    "AI request started",
  );

  const result = await app.ai.generate({
    input: { text: request.body.prompt },
  });

  request.log.info(
    {
      provider: result.provider,
      tokens: result.usage.totalTokens,
      duration: Date.now() - startTime,
    },
    "AI request completed",
  );

  return result;
});
```

### Prometheus Metrics

```bash
npm install prom-client
```

```typescript
// src/plugins/metrics.ts

  Registry,
  Counter,
  Histogram,
  collectDefaultMetrics,
} from "prom-client";

async function metricsPlugin(fastify: FastifyInstance) {
  const register = new Registry();
  collectDefaultMetrics({ register });

  const httpRequestsTotal = new Counter({
    name: "http_requests_total",
    help: "Total HTTP requests",
    labelNames: ["method", "route", "status"],
    registers: [register],
  });

  const aiRequestsTotal = new Counter({
    name: "ai_requests_total",
    help: "Total AI requests",
    labelNames: ["provider", "model"],
    registers: [register],
  });

  const aiRequestDuration = new Histogram({
    name: "ai_request_duration_seconds",
    help: "AI request duration",
    labelNames: ["provider", "model"],
    registers: [register],
  });

  fastify.addHook(
    "onResponse",
    async (request: FastifyRequest, reply: FastifyReply) => {
      httpRequestsTotal.inc({
        method: request.method,
        route: request.routeOptions?.url || request.url,
        status: reply.statusCode,
      });
    },
  );

  fastify.get("/metrics", async (request, reply) => {
    reply.header("Content-Type", register.contentType);
    return register.metrics();
  });

  fastify.decorate("metrics", { aiRequestsTotal, aiRequestDuration });
}

export default fp(metricsPlugin, { name: "metrics" });
```

---

## Best Practices

### 1. Use Plugin Architecture for Modularity

```typescript
// src/app.ts
export async function buildApp(): Promise {
  const app = Fastify({ logger: true }).withTypeProvider();

  await app.register(errorHandlerPlugin);
  await app.register(metricsPlugin);
  await app.register(jwtAuthPlugin);
  await app.register(rateLimitPlugin);
  await app.register(cachePlugin);
  await app.register(neuroLinkPlugin, { providers: [...] });

  await app.register(authRoutes, { prefix: "/api/auth" });
  await app.register(aiRoutes, { prefix: "/api" });

  return app;
}
```

### 2. Leverage TypeBox for Type Safety

```typescript
app.post(
  "/api/generate",
  { schema: { body: RequestSchema } },
  async (request) => {
    // request.body is fully typed
    const { prompt, options } = request.body;
  },
);
```

### 3. Use Hooks for Cross-Cutting Concerns

```typescript
app.addHook("onRequest", async (request) => {
  request.startTime = Date.now();
});

app.addHook("onResponse", async (request, reply) => {
  const duration = Date.now() - request.startTime;
  request.log.info({ duration }, "Request completed");
});
```

### 4. Implement Graceful Shutdown

```typescript
const signals = ["SIGINT", "SIGTERM"];
for (const signal of signals) {
  process.on(signal, async () => {
    await app.close();
    process.exit(0);
  });
}
```

### 5. Validate Environment at Startup

```typescript

const ConfigSchema = Type.Object({
  OPENAI_API_KEY: Type.String({ minLength: 1 }),
  JWT_SECRET: Type.String({ minLength: 32 }),
});

const ajv = new Ajv({ coerceTypes: true });
if (!ajv.validate(ConfigSchema, process.env)) {
  throw new Error("Configuration validation failed");
}
```

---

## Deployment

### Docker Deployment

```dockerfile
# Dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY --from=builder /app/dist ./dist
RUN adduser -S fastify
USER fastify
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s \
  CMD wget --spider -q http://localhost:3000/health || exit 1
CMD ["node", "dist/index.js"]
```

```yaml
# docker-compose.yml
version: "3.8"

services:
  api:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
```

### Production Checklist

- [ ] Environment variables validated at startup
- [ ] Rate limiting configured with Redis backend
- [ ] JWT authentication implemented
- [ ] Schema validation on all endpoints
- [ ] Comprehensive error handling with setErrorHandler
- [ ] Pino logging with appropriate log levels
- [ ] Prometheus metrics exposed at /metrics
- [ ] Response caching enabled for expensive operations
- [ ] Graceful shutdown implemented
- [ ] Health check endpoint available
- [ ] CORS configured properly (@fastify/cors)
- [ ] Request size limits configured

---

## Related Documentation

- **[API Reference](/docs/sdk/api-reference)** - NeuroLink SDK
- **[Express Integration](/docs/sdk/framework-integration)** - Compare with Express patterns
- **[Compliance Guide](/docs/guides/enterprise/compliance)** - Security and authentication
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce costs
- **[Monitoring Guide](/docs/guides/enterprise/monitoring)** - Observability

---

## Additional Resources

- **[Fastify Documentation](https://fastify.dev/docs/latest/)** - Official Fastify docs
- **[TypeBox Documentation](https://github.com/sinclairzx81/typebox)** - JSON Schema type builder
- **[Fastify Ecosystem](https://fastify.dev/ecosystem/)** - Official plugins
- **[Pino Logger](https://getpino.io/)** - Fastify's built-in logger

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Real-World Use Cases

<!-- Source: guides/examples/use-cases.md -->

# Real-World Use Cases

**Practical examples and production-ready patterns for common AI integration scenarios**

## 1. Customer Support Automation

**Scenario**: Automated customer support with multi-provider failover and cost optimization.

### Architecture

```
User Query → Intent Classification → Route to:
  - FAQ Bot (Free Tier: Google AI)
  - Complex Support (GPT-4o)
  - Escalation (Human Agent)
```

### Implementation

```typescript

class CustomerSupportBot {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "google-ai-free",
          priority: 1,
          config: {
            apiKey: process.env.GOOGLE_AI_KEY,
            model: "gemini-2.0-flash",
          },
          quotas: { daily: 1500 },
        },
        {
          name: "openai",
          priority: 2,
          config: {
            apiKey: process.env.OPENAI_API_KEY,
            model: "gpt-4o-mini",
          },
        },
      ],
      failoverConfig: { enabled: true, fallbackOnQuota: true },
    });
  }

  async classifyIntent(query: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Classify customer support intent:
Query: "${query}"

Return only one word: faq, complex, or escalate`,
      },
      provider: "google-ai-free",
    });

    const intent = result.content.toLowerCase().trim();
    return ["faq", "complex", "escalate"].includes(intent)
      ? (intent as any)
      : "complex";
  }

  async handleFAQ(query: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Answer this FAQ question concisely:
${query}

Use our knowledge base:
- Returns: 30-day return policy
- Shipping: 3-5 business days
- Payment: Credit card, PayPal accepted`,
      },
      provider: "google-ai-free",
      model: "gemini-2.0-flash",
    });

    return result.content;
  }

  async handleComplexQuery(
    query: string,
    conversationHistory: string[],
  ): Promise {
    const result = await this.ai.generate({
      input: {
        text: `You are a helpful customer support agent.

Conversation history:
${conversationHistory.join("\n")}

Customer: ${query}

Provide a detailed, helpful response.`,
      },
      provider: "openai",
      model: "gpt-4o",
    });

    return result.content;
  }

  async processQuery(
    query: string,
    conversationHistory: string[] = [],
  ): Promise {
    const intent = await this.classifyIntent(query);

    if (intent === "escalate") {
      return {
        response:
          "I've escalated your request to a human agent. They'll be with you shortly.",
        intent,
        escalated: true,
      };
    }

    const response =
      intent === "faq"
        ? await this.handleFAQ(query)
        : await this.handleComplexQuery(query, conversationHistory);

    return { response, intent, escalated: false };
  }
}

const supportBot = new CustomerSupportBot();

const result = await supportBot.processQuery("What is your return policy?");
```

**Cost Analysis**:

- FAQ queries (80%): Free tier (Google AI)
- Complex queries (18%): $0.15 per 1M input tokens (GPT-4o-mini)
- Escalations (2%): Human agent
- **Total savings**: 90% vs. using GPT-4o for all queries

---

## 2. Content Generation Pipeline

**Scenario**: Multi-stage content generation with drafting, editing, and SEO optimization.

### Implementation

```typescript
class ContentGenerationPipeline {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        { name: "openai", config: { apiKey: process.env.OPENAI_API_KEY } },
        {
          name: "anthropic",
          config: { apiKey: process.env.ANTHROPIC_API_KEY },
        },
      ],
      loadBalancing: "round-robin",
    });
  }

  async generateDraft(topic: string, keywords: string[]): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Write a 500-word blog post about: ${topic}

Include these keywords naturally: ${keywords.join(", ")}

Structure: Introduction, 3 main points, conclusion`,
      },
      provider: "openai",
      model: "gpt-4o-mini",
    });

    return result.content;
  }

  async improveDraft(draft: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Improve this draft for clarity, engagement, and readability:

${draft}

Make it more engaging while keeping the same length.`,
      },
      provider: "anthropic",
      model: "claude-3-5-sonnet-20241022",
    });

    return result.content;
  }

  async optimizeSEO(
    content: string,
    keywords: string[],
  ): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Analyze SEO for this content:

${content}

Target keywords: ${keywords.join(", ")}

Return JSON:
{
  "optimizedContent": "...",
  "seoScore": 0-100,
  "suggestions": ["..."]
}`,
      },
      provider: "openai",
      model: "gpt-4o",
    });

    return JSON.parse(result.content);
  }

  async generateMetadata(content: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Generate SEO metadata for this article:

${content.substring(0, 1000)}...

Return JSON:
{
  "title": "60 chars max",
  "description": "160 chars max",
  "tags": ["tag1", "tag2", "tag3"]
}`,
      },
      provider: "openai",
      model: "gpt-4o-mini",
    });

    return JSON.parse(result.content);
  }

  async generateComplete(
    topic: string,
    keywords: string[],
  ): Promise {
    const draft = await this.generateDraft(topic, keywords);
    const improved = await this.improveDraft(draft);
    const seoResult = await this.optimizeSEO(improved, keywords);
    const metadata = await this.generateMetadata(seoResult.content);

    return {
      content: seoResult.content,
      metadata,
      seoScore: seoResult.seoScore,
    };
  }
}

const pipeline = new ContentGenerationPipeline();

const article = await pipeline.generateComplete(
  "AI-powered customer support automation",
  ["AI", "automation", "customer support", "chatbot"],
);
```

---

## 3. Code Review Automation

**Scenario**: Automated code review with security, performance, and style checks.

### Implementation

```typescript

class CodeReviewBot {
  private ai: NeuroLink;
  private github: Octokit;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "anthropic",
          config: {
            apiKey: process.env.ANTHROPIC_API_KEY,
            model: "claude-3-5-sonnet-20241022",
          },
        },
      ],
    });

    this.github = new Octokit({ auth: process.env.GITHUB_TOKEN });
  }

  async reviewCode(
    code: string,
    language: string,
  ): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Review this ${language} code:

\`\`\`${language}
${code}
\`\`\`

Analyze for:
1. Security vulnerabilities
2. Performance issues
3. Code style violations
4. Potential bugs

Return JSON:
{
  "security": ["issue1", "issue2"],
  "performance": ["issue1"],
  "style": ["issue1"],
  "bugs": ["issue1"],
  "score": 0-100
}`,
      },
      provider: "anthropic",
    });

    return JSON.parse(result.content);
  }

  async reviewPullRequest(
    owner: string,
    repo: string,
    prNumber: number,
  ): Promise {
    const { data: pr } = await this.github.pulls.get({
      owner,
      repo,
      pull_number: prNumber,
    });

    const { data: files } = await this.github.pulls.listFiles({
      owner,
      repo,
      pull_number: prNumber,
    });

    const reviews = await Promise.all(
      files.map(async (file) => {
        if (!file.patch) return null;

        const language = file.filename.split(".").pop();
        const review = await this.reviewCode(file.patch, language);

        return {
          filename: file.filename,
          review,
        };
      }),
    );

    const comments = reviews
      .filter((r) => r !== null)
      .flatMap((r) => {
        const issues = [
          ...r.review.security.map((s) => ` Security: ${s}`),
          ...r.review.performance.map((p) => `⚡ Performance: ${p}`),
          ...r.review.bugs.map((b) => ` Bug: ${b}`),
        ];

        return issues.map((issue) => ({
          path: r.filename,
          body: issue,
          position: 1,
        }));
      });

    if (comments.length > 0) {
      await this.github.pulls.createReview({
        owner,
        repo,
        pull_number: prNumber,
        event: "COMMENT",
        comments,
      });
    }
  }
}

const reviewBot = new CodeReviewBot();
await reviewBot.reviewPullRequest("myorg", "myrepo", 123);
```

---

## 4. Document Analysis & Summarization

**Scenario**: Extract insights from large documents (PDFs, contracts, reports).

### Implementation

```typescript

class DocumentAnalyzer {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "anthropic",
          config: {
            apiKey: process.env.ANTHROPIC_API_KEY,
            model: "claude-3-5-sonnet-20241022",
          },
        },
      ],
    });
  }

  async extractTextFromPDF(pdfPath: string): Promise {
    const dataBuffer = await fs.readFile(pdfPath);
    const data = await pdf(dataBuffer);
    return data.text;
  }

  async summarizeDocument(
    text: string,
    length: "short" | "medium" | "long" = "medium",
  ): Promise {
    const lengthMap = {
      short: "3 sentences",
      medium: "1 paragraph",
      long: "3 paragraphs",
    };

    const result = await this.ai.generate({
      input: {
        text: `Summarize this document in ${lengthMap[length]}:

${text.substring(0, 100000)}`,
      },
      provider: "anthropic",
    });

    return result.content;
  }

  async extractKeyPoints(text: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Extract 5-10 key points from this document:

${text.substring(0, 100000)}

Return as JSON array: ["point1", "point2", ...]`,
      },
      provider: "anthropic",
    });

    return JSON.parse(result.content);
  }

  async analyzeSentiment(text: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Analyze sentiment of this document:

${text.substring(0, 50000)}

Return JSON:
{
  "sentiment": "positive|neutral|negative",
  "score": 0-100,
  "reasoning": "..."
}`,
      },
      provider: "anthropic",
    });

    return JSON.parse(result.content);
  }

  async extractEntities(text: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Extract named entities from this document:

${text.substring(0, 50000)}

Return JSON:
{
  "people": ["name1", "name2"],
  "organizations": ["org1", "org2"],
  "locations": ["loc1", "loc2"],
  "dates": ["date1", "date2"]
}`,
      },
      provider: "anthropic",
    });

    return JSON.parse(result.content);
  }

  async analyzeComplete(pdfPath: string): Promise {
    const text = await this.extractTextFromPDF(pdfPath);

    const [summary, keyPoints, sentiment, entities] = await Promise.all([
      this.summarizeDocument(text),
      this.extractKeyPoints(text),
      this.analyzeSentiment(text),
      this.extractEntities(text),
    ]);

    return { summary, keyPoints, sentiment, entities };
  }
}

const analyzer = new DocumentAnalyzer();
const analysis = await analyzer.analyzeComplete("./contract.pdf");
```

---

## 5. Multi-Language Translation Service

**Scenario**: High-quality translation with context awareness and cost optimization.

### Implementation

```typescript
class TranslationService {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "openai",
          priority: 1,
          config: {
            apiKey: process.env.OPENAI_API_KEY,
            model: "gpt-4o-mini",
          },
        },
        {
          name: "anthropic",
          priority: 2,
          config: {
            apiKey: process.env.ANTHROPIC_API_KEY,
            model: "claude-3-5-haiku-20241022",
          },
        },
      ],
      loadBalancing: "least-busy",
    });
  }

  async translate(
    text: string,
    from: string,
    to: string,
    context?: string,
  ): Promise {
    const contextText = context ? `\n\nContext: ${context}` : "";

    const result = await this.ai.generate({
      input: {
        text: `Translate from ${from} to ${to}:

"${text}"${contextText}

Return JSON:
{
  "translation": "...",
  "confidence": 0-100
}`,
      },
      provider: "openai",
    });

    return JSON.parse(result.content);
  }

  async translateBatch(
    texts: string[],
    from: string,
    to: string,
  ): Promise {
    const results = await Promise.all(
      texts.map((text) => this.translate(text, from, to)),
    );

    return results.map((r) => r.translation);
  }

  async detectLanguage(text: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Detect the language of this text:

"${text}"

Return only the ISO 639-1 language code (e.g., "en", "es", "fr")`,
      },
      provider: "openai",
    });

    return result.content.trim().toLowerCase();
  }

  async translateWithFallback(
    text: string,
    targetLanguages: string[],
  ): Promise> {
    const sourceLang = await this.detectLanguage(text);

    const translations = await Promise.all(
      targetLanguages.map(async (lang) => {
        const result = await this.translate(text, sourceLang, lang);
        return [lang, result.translation];
      }),
    );

    return Object.fromEntries(translations);
  }
}

const translator = new TranslationService();

const result = await translator.translate(
  "Hello, how are you?",
  "en",
  "es",
  "casual greeting between friends",
);

const multiLang = await translator.translateWithFallback(
  "Welcome to our platform",
  ["es", "fr", "de", "ja", "zh"],
);
```

---

## 6. Data Extraction from Unstructured Text

**Scenario**: Extract structured data from emails, invoices, resumes, etc.

### Implementation

```typescript
type Invoice = {
  invoiceNumber: string;
  date: string;
  vendor: string;
  total: number;
  items: Array;
};

class DataExtractor {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "openai",
          config: {
            apiKey: process.env.OPENAI_API_KEY,
            model: "gpt-4o",
          },
        },
      ],
    });
  }

  async extractInvoice(text: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Extract invoice data from this text:

${text}

Return JSON matching this schema:
{
  "invoiceNumber": "...",
  "date": "YYYY-MM-DD",
  "vendor": "...",
  "total": 0.00,
  "items": [
    {
      "description": "...",
      "quantity": 1,
      "price": 0.00
    }
  ]
}`,
      },
      provider: "openai",
    });

    return JSON.parse(result.content);
  }

  async extractResume(text: string): Promise;
    education: Array;
  }> {
    const result = await this.ai.generate({
      input: {
        text: `Extract resume data from this text:

${text}

Return JSON with: name, email, phone, skills[], experience[], education[]`,
      },
      provider: "openai",
    });

    return JSON.parse(result.content);
  }

  async extractEmail(emailText: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Extract structured data from this email:

${emailText}

Return JSON with: subject, sender, recipients[], date, summary, actionItems[], sentiment`,
      },
      provider: "openai",
    });

    return JSON.parse(result.content);
  }
}

const extractor = new DataExtractor();

const invoiceData = await extractor.extractInvoice(`
Invoice #INV-2025-001
Date: January 15, 2025
Vendor: Acme Corp

Items:
1. Widget A - Qty: 5 @ $10.00 = $50.00
2. Widget B - Qty: 3 @ $15.00 = $45.00

Total: $95.00
`);
```

---

## 7. Chatbot with Memory & Context

**Scenario**: Conversational AI with conversation history and context management.

### Implementation

```typescript
type Message = {
  role: "user" | "assistant";
  content: string;
  timestamp: Date;
};

class ConversationalChatbot {
  private ai: NeuroLink;
  private conversations: Map = new Map();

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "anthropic",
          config: {
            apiKey: process.env.ANTHROPIC_API_KEY,
            model: "claude-3-5-sonnet-20241022",
          },
        },
      ],
    });
  }

  async chat(userId: string, message: string): Promise {
    if (!this.conversations.has(userId)) {
      this.conversations.set(userId, []);
    }

    const history = this.conversations.get(userId)!;

    history.push({
      role: "user",
      content: message,
      timestamp: new Date(),
    });

    const conversationContext = history
      .slice(-10)
      .map((m) => `${m.role}: ${m.content}`)
      .join("\n");

    const result = await this.ai.generate({
      input: {
        text: `You are a helpful AI assistant. Continue this conversation:

${conversationContext}

Respond as the assistant, considering the full conversation context.`,
      },
      provider: "anthropic",
    });

    history.push({
      role: "assistant",
      content: result.content,
      timestamp: new Date(),
    });

    if (history.length > 50) {
      this.conversations.set(userId, history.slice(-50));
    }

    return result.content;
  }

  async summarizeConversation(userId: string): Promise {
    const history = this.conversations.get(userId);
    if (!history || history.length === 0) {
      return "No conversation history";
    }

    const conversationText = history
      .map((m) => `${m.role}: ${m.content}`)
      .join("\n");

    const result = await this.ai.generate({
      input: {
        text: `Summarize this conversation in 2-3 sentences:

${conversationText}`,
      },
      provider: "anthropic",
    });

    return result.content;
  }

  clearConversation(userId: string): void {
    this.conversations.delete(userId);
  }
}

const chatbot = new ConversationalChatbot();

const response1 = await chatbot.chat(
  "user-123",
  "What is the capital of France?",
);
const response2 = await chatbot.chat("user-123", "What is its population?");
const summary = await chatbot.summarizeConversation("user-123");
```

---

## 8. RAG (Retrieval-Augmented Generation)

**Scenario**: AI with access to custom knowledge base.

### Implementation

```typescript

class RAGSystem {
  private ai: NeuroLink;
  private mcpClient: Anthropic;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "anthropic",
          config: { apiKey: process.env.ANTHROPIC_API_KEY },
        },
      ],
      mcpServers: [
        {
          name: "docs",
          command: "npx",
          args: ["-y", "@modelcontextprotocol/server-filesystem", "./docs"],
        },
      ],
    });

    this.mcpClient = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
  }

  async queryWithContext(query: string): Promise {
    const response = await this.mcpClient.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      messages: [
        {
          role: "user",
          content: `Using the documentation files available through MCP tools, answer this question:

${query}

Search the docs first, then provide a comprehensive answer with references.`,
        },
      ],
      tools: [
        {
          name: "read_file",
          description: "Read documentation files",
          input_schema: {
            type: "object",
            properties: {
              path: { type: "string" },
            },
            required: ["path"],
          },
        },
        {
          name: "search_files",
          description: "Search documentation",
          input_schema: {
            type: "object",
            properties: {
              query: { type: "string" },
            },
            required: ["query"],
          },
        },
      ],
    });

    return response.content[0].type === "text" ? response.content[0].text : "";
  }
}

const rag = new RAGSystem();
const answer = await rag.queryWithContext(
  "How do I configure multi-provider failover?",
);
```

---

## 9. Email Automation & Analysis

**Scenario**: Automated email responses and analysis.

### Implementation

```typescript

class EmailAutomation {
  private ai: NeuroLink;
  private transporter: nodemailer.Transporter;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "openai",
          config: {
            apiKey: process.env.OPENAI_API_KEY,
            model: "gpt-4o-mini",
          },
        },
      ],
    });

    this.transporter = nodemailer.createTransport({
      host: process.env.SMTP_HOST,
      port: 587,
      auth: {
        user: process.env.SMTP_USER,
        pass: process.env.SMTP_PASS,
      },
    });
  }

  async classifyEmail(
    subject: string,
    body: string,
  ): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Classify this email:

Subject: ${subject}
Body: ${body}

Return JSON:
{
  "category": "urgent|support|sales|spam|general",
  "priority": "high|medium|low",
  "sentiment": "positive|neutral|negative"
}`,
      },
      provider: "openai",
    });

    return JSON.parse(result.content);
  }

  async generateResponse(
    subject: string,
    body: string,
    context: string,
  ): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Generate a professional email response:

Original Email:
Subject: ${subject}
Body: ${body}

Context: ${context}

Write a helpful, professional response.`,
      },
      provider: "openai",
    });

    return result.content;
  }

  async autoRespond(email: {
    from: string;
    subject: string;
    body: string;
  }): Promise {
    const classification = await this.classifyEmail(email.subject, email.body);

    if (classification.category === "spam") {
      return;
    }

    const response = await this.generateResponse(
      email.subject,
      email.body,
      `This is a ${classification.category} email with ${classification.priority} priority`,
    );

    await this.transporter.sendMail({
      from: process.env.FROM_EMAIL,
      to: email.from,
      subject: `Re: ${email.subject}`,
      text: response,
    });
  }
}

const emailBot = new EmailAutomation();

await emailBot.autoRespond({
  from: "customer@example.com",
  subject: "Product inquiry",
  body: "I would like to know more about your pricing plans.",
});
```

---

## 10. Report Generation

**Scenario**: Automated business report generation from data.

### Implementation

```typescript
class ReportGenerator {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "openai",
          config: {
            apiKey: process.env.OPENAI_API_KEY,
            model: "gpt-4o",
          },
        },
      ],
    });
  }

  async generateSalesReport(data: {
    period: string;
    totalRevenue: number;
    totalOrders: number;
    topProducts: Array;
    regions: Array;
  }): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Generate a professional sales report for ${data.period}:

Metrics:
- Total Revenue: $${data.totalRevenue.toLocaleString()}
- Total Orders: ${data.totalOrders}
- Average Order Value: $${(data.totalRevenue / data.totalOrders).toFixed(2)}

Top Products:
${data.topProducts.map((p) => `- ${p.name}: $${p.sales.toLocaleString()}`).join("\n")}

Revenue by Region:
${data.regions.map((r) => `- ${r.name}: $${r.revenue.toLocaleString()}`).join("\n")}

Include:
1. Executive Summary
2. Key Metrics
3. Trends & Insights
4. Recommendations
5. Next Steps

Format as markdown.`,
      },
      provider: "openai",
    });

    return result.content;
  }

  async generateFinancialSummary(
    transactions: Array,
  ): Promise {
    const totalIncome = transactions
      .filter((t) => t.amount > 0)
      .reduce((sum, t) => sum + t.amount, 0);

    const totalExpenses = transactions
      .filter((t) => t.amount  sum + Math.abs(t.amount), 0);

    const categoryBreakdown = transactions.reduce(
      (acc, t) => {
        acc[t.category] = (acc[t.category] || 0) + t.amount;
        return acc;
      },
      {} as Record,
    );

    const result = await this.ai.generate({
      input: {
        text: `Generate financial summary:

Total Income: $${totalIncome.toLocaleString()}
Total Expenses: $${totalExpenses.toLocaleString()}
Net: $${(totalIncome - totalExpenses).toLocaleString()}

By Category:
${Object.entries(categoryBreakdown)
  .map(([cat, amt]) => `- ${cat}: $${amt.toLocaleString()}`)
  .join("\n")}

Provide:
1. Financial Overview
2. Category Analysis
3. Savings Opportunities
4. Budget Recommendations`,
      },
      provider: "openai",
    });

    return result.content;
  }
}

const reportGen = new ReportGenerator();

const salesReport = await reportGen.generateSalesReport({
  period: "Q1 2025",
  totalRevenue: 1250000,
  totalOrders: 3420,
  topProducts: [
    { name: "Product A", sales: 450000 },
    { name: "Product B", sales: 380000 },
  ],
  regions: [
    { name: "North America", revenue: 750000 },
    { name: "Europe", revenue: 500000 },
  ],
});
```

---

## 11. Image Analysis & Description

**Scenario**: Analyze images with vision models.

### Implementation

```typescript

class ImageAnalyzer {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "openai",
          config: {
            apiKey: process.env.OPENAI_API_KEY,
            model: "gpt-4o",
          },
        },
      ],
    });
  }

  async analyzeImage(
    imagePath: string,
    prompt: string = "Describe this image in detail",
  ): Promise {
    const imageBuffer = await fs.readFile(imagePath);
    const base64Image = imageBuffer.toString("base64");

    const result = await this.ai.generate({
      input: {
        text: prompt,
        images: [
          {
            type: "base64",
            data: base64Image,
          },
        ],
      },
      provider: "openai",
      model: "gpt-4o",
    });

    return result.content;
  }

  async extractText(imagePath: string): Promise {
    return this.analyzeImage(imagePath, "Extract all text from this image");
  }

  async detectObjects(imagePath: string): Promise {
    const result = await this.analyzeImage(
      imagePath,
      'List all objects visible in this image. Return as JSON array: ["object1", "object2"]',
    );

    return JSON.parse(result);
  }

  async moderateContent(imagePath: string): Promise {
    const result = await this.analyzeImage(
      imagePath,
      'Analyze this image for inappropriate content. Return JSON: { "safe": true/false, "categories": ["category1"], "confidence": 0-100 }',
    );

    return JSON.parse(result);
  }
}

const imageAnalyzer = new ImageAnalyzer();

const description = await imageAnalyzer.analyzeImage("./product.jpg");
const text = await imageAnalyzer.extractText("./document-scan.jpg");
const objects = await imageAnalyzer.detectObjects("./scene.jpg");
const moderation = await imageAnalyzer.moderateContent("./user-upload.jpg");
```

---

## 12. SQL Query Generation

**Scenario**: Natural language to SQL query generation.

### Implementation

```typescript
class SQLQueryGenerator {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      providers: [
        {
          name: "openai",
          config: {
            apiKey: process.env.OPENAI_API_KEY,
            model: "gpt-4o",
          },
        },
      ],
    });
  }

  async generateSQL(
    question: string,
    schema: string,
  ): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Generate SQL query for this question:

Question: ${question}

Database Schema:
${schema}

Return JSON:
{
  "query": "SELECT...",
  "explanation": "This query..."
}`,
      },
      provider: "openai",
    });

    return JSON.parse(result.content);
  }

  async explainQuery(query: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Explain this SQL query in simple terms:

${query}`,
      },
      provider: "openai",
    });

    return result.content;
  }

  async optimizeQuery(query: string): Promise {
    const result = await this.ai.generate({
      input: {
        text: `Optimize this SQL query:

${query}

Return JSON:
{
  "optimizedQuery": "SELECT...",
  "improvements": ["improvement1", "improvement2"]
}`,
      },
      provider: "openai",
    });

    return JSON.parse(result.content);
  }
}

const sqlGen = new SQLQueryGenerator();

const schema = `
Tables:
- users (id, name, email, created_at)
- orders (id, user_id, total, created_at)
- products (id, name, price, category)
- order_items (order_id, product_id, quantity)
`;

const result = await sqlGen.generateSQL(
  "Show me total revenue by product category for last month",
  schema,
);
```

---

## Cost Optimization Patterns

### Pattern 1: Free Tier First

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "google-ai",
      priority: 1,
      config: { apiKey: process.env.GOOGLE_AI_KEY },
      quotas: { daily: 1500 },
    },
    {
      name: "openai",
      priority: 2,
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
  ],
  failoverConfig: { enabled: true, fallbackOnQuota: true },
});
```

**Savings**: 80-90% cost reduction

### Pattern 2: Model Selection by Complexity

```typescript
async function chooseModel(task: string): Promise {
  const complexity = await classifyComplexity(task);

  return complexity === "simple" ? "gpt-4o-mini" : "gpt-4o";
}
```

**Savings**: 60-70% cost reduction

---

## Related Documentation

- [Provider Setup](/docs/) - Configure AI providers
- [Enterprise Features](/docs/guides/enterprise/multi-provider-failover) - Production patterns
- [MCP Integration](/docs/guides/mcp/server-catalog) - Tool integration
- [Framework Integration](/docs/guides/frameworks/nextjs) - Framework-specific guides

---

## Summary

You've learned 12 production-ready use cases:

✅ Customer support automation
✅ Content generation pipelines
✅ Code review automation
✅ Document analysis
✅ Multi-language translation
✅ Data extraction
✅ Conversational chatbots
✅ RAG systems
✅ Email automation
✅ Report generation
✅ Image analysis
✅ SQL query generation

Each pattern includes complete implementation code, cost optimization strategies, and best practices for production deployment.

---

## Compliance & Security Guide

<!-- Source: guides/enterprise/compliance.md -->

# Compliance & Security Guide

**Implement GDPR, SOC2, HIPAA, and enterprise security controls for AI applications**

---------- | -------------------- | ----------------- | -------------------------------------- |
| **GDPR**      | EU data protection   | ✅ Full           | Data residency, consent, erasure       |
| **SOC2**      | Security trust       | ✅ Full           | Access control, encryption, audit logs |
| **HIPAA**     | Healthcare data      | ✅ Full           | PHI protection, BAA, encryption        |
| **CCPA**      | California privacy   | ✅ Full           | Data rights, opt-out, disclosure       |
| **ISO 27001** | Information security | ✅ Full           | ISMS, risk management, controls        |

### Compliance Features

- ** Data Residency**: Route EU data to EU providers
- ** Encryption**: End-to-end encryption at rest and in transit
- ** Audit Logging**: Complete request/response trails
- ** Access Control**: Role-based permissions
- **⏰ Data Retention**: Configurable retention policies
- **️ Data Deletion**: Right to erasure (GDPR Article 17)
- ** Consent Management**: Track user consent

---

## Quick Start

### GDPR-Compliant Setup

```typescript

const ai = new NeuroLink({
  compliance: {
    framework: "GDPR",
    dataResidency: "EU", // Keep data in EU
    enableAuditLog: true, // Required for accountability
    dataRetention: "30-days", // Auto-delete after 30 days
    anonymization: true, // Anonymize sensitive data
  },
  providers: [
    {
      name: "mistral", // EU-based provider
      priority: 1,
      config: {
        apiKey: process.env.MISTRAL_API_KEY,
        region: "eu", // Enforce EU region
      },
    },
    {
      name: "openai", // Fallback (check DPA)
      priority: 2,
      config: {
        apiKey: process.env.OPENAI_API_KEY,
        region: "eu", // Use EU endpoint if available
      },
    },
  ],
});

// GDPR-compliant request
const result = await ai.generate({
  input: { text: "Analyze customer feedback" },
  metadata: {
    userId: hashUserId(user.id), // Anonymize user ID
    legalBasis: "consent", // GDPR Article 6(1)(a)
    purpose: "service-improvement", // Purpose limitation
    userConsent: true, // Explicit consent
  },
});
```

---

## GDPR Compliance

### Data Residency (Article 44-50)

Ensure EU data stays in EU.

```typescript
// EU data residency enforcement
const ai = new NeuroLink({
  providers: [
    {
      name: "mistral",
      priority: 1,
      config: {
        apiKey: process.env.MISTRAL_API_KEY,
        region: "eu",
        dataCenter: "eu-west-1", // France
      },
      condition: (req) => req.userRegion === "EU",
    },
    {
      name: "google-ai",
      priority: 2,
      config: {
        apiKey: process.env.GOOGLE_AI_KEY,
        // Google AI Studio data processed in EU for EU users
      },
      condition: (req) => req.userRegion === "EU",
    },
  ],
  compliance: {
    enforceDataResidency: true, // Block non-EU providers for EU data
    rejectThirdCountry: true, // Reject inadequate countries
  },
});

// Detect user region
function getUserRegion(ip: string): "EU" | "US" | "OTHER" {
  // Use IP geolocation service
  const country = geolocate(ip);

  const euCountries = [
    "AT",
    "BE",
    "BG",
    "HR",
    "CY",
    "CZ",
    "DK",
    "EE",
    "FI",
    "FR",
    "DE",
    "GR",
    "HU",
    "IE",
    "IT",
    "LV",
    "LT",
    "LU",
    "MT",
    "NL",
    "PL",
    "PT",
    "RO",
    "SK",
    "SI",
    "ES",
    "SE",
  ];

  if (euCountries.includes(country)) return "EU";
  if (country === "US") return "US";
  return "OTHER";
}

// Usage
const result = await ai.generate({
  input: { text: userQuery },
  metadata: {
    userRegion: getUserRegion(req.ip), // Routes to EU provider
  },
});
```

### Consent Management (Article 6, 7)

```typescript
class ConsentManager {
  private consents = new Map();

  async checkConsent(userId: string, purpose: string): Promise {
    const consent = this.consents.get(userId);

    if (!consent) return false;
    if (!consent.hasConsent) return false;
    if (new Date() > consent.expiresAt) return false; // Consent expired
    if (!consent.purpose.includes(purpose)) return false; // Wrong purpose

    return true;
  }

  async recordConsent(
    userId: string,
    purposes: string[],
    duration: number = 365,
  ) {
    this.consents.set(userId, {
      hasConsent: true,
      purpose: purposes,
      timestamp: new Date(),
      expiresAt: new Date(Date.now() + duration * 86400000), // days to ms
    });
  }

  async withdrawConsent(userId: string) {
    this.consents.set(userId, {
      hasConsent: false,
      purpose: [],
      timestamp: new Date(),
      expiresAt: new Date(),
    });
  }
}

// Usage
const consentManager = new ConsentManager();

// Before processing user data
const hasConsent = await consentManager.checkConsent(userId, "ai-processing");

if (!hasConsent) {
  throw new Error("User has not consented to AI processing (GDPR Article 6)");
}

const result = await ai.generate({
  input: { text: userInput },
  metadata: {
    userId: hashUserId(userId),
    legalBasis: "consent",
    purpose: "ai-processing",
    consentTimestamp: new Date().toISOString(),
  },
});
```

### Data Minimization (Article 5(1)(c))

Only process necessary data.

```typescript
// ❌ Bad: Send entire user object (excessive data)
const bad = await ai.generate({
  input: {
    text: `Analyze feedback from user: ${JSON.stringify(user)}`,
    // Includes: name, email, address, phone, SSN, etc.
  },
});

// ✅ Good: Only send necessary data
const good = await ai.generate({
  input: {
    text: `Analyze feedback: "${user.feedback}"`,
    // Only feedback text, no PII
  },
  metadata: {
    userId: hashUserId(user.id), // Hashed, not raw ID
  },
});
```

### Right to Erasure (Article 17)

Delete user data on request.

```typescript
class DataDeletionService {
  async deleteUserData(userId: string) {
    // 1. Delete from audit logs
    await auditLog.deleteByUserId(userId);

    // 2. Delete cached responses
    await cache.deleteByUserId(userId);

    // 3. Delete stored prompts/responses
    await database.delete("ai_requests", { userId });

    // 4. Log deletion (required for accountability)
    await auditLog.record({
      action: "DATA_DELETION",
      userId: hashUserId(userId),
      timestamp: new Date(),
      reason: "GDPR_RIGHT_TO_ERASURE",
    });

    console.log(`Deleted all data for user: ${hashUserId(userId)}`);
  }
}

// API endpoint for deletion requests
app.post("/api/delete-my-data", async (req, res) => {
  const { userId } = req.user;

  // Verify user identity
  await verifyIdentity(req);

  // Delete all user data
  await dataDeletionService.deleteUserData(userId);

  res.json({
    success: true,
    message: "All your data has been deleted",
  });
});
```

### Data Retention (Article 5(1)(e))

Auto-delete data after retention period.

```typescript
class RetentionPolicy {
  private retentionPeriod = 30 * 86400000; // 30 days in ms

  async enforceRetention() {
    const cutoff = new Date(Date.now() - this.retentionPeriod);

    // Delete audit logs older than retention period
    await database.delete("audit_logs", {
      timestamp: { $lt: cutoff },
    });

    // Delete cached responses
    await database.delete("ai_cache", {
      createdAt: { $lt: cutoff },
    });

    console.log(`Deleted data older than ${new Date(cutoff).toISOString()}`);
  }
}

// Run daily
const retentionPolicy = new RetentionPolicy();
setInterval(() => retentionPolicy.enforceRetention(), 86400000); // Daily
```

---

## SOC2 Compliance

### Access Control (CC6.1)

Role-based access control for AI features.

```typescript
enum Role {
  ADMIN = "admin",
  USER = "user",
  READONLY = "readonly",
}

class AccessControl {
  private permissions = {
    [Role.ADMIN]: ["read", "write", "delete", "configure"],
    [Role.USER]: ["read", "write"],
    [Role.READONLY]: ["read"],
  };

  canAccess(role: Role, action: string): boolean {
    return this.permissions[role].includes(action);
  }

  async checkAccess(userId: string, action: string) {
    const user = await getUser(userId);

    if (!this.canAccess(user.role, action)) {
      // Log access attempt for audit
      await auditLog.record({
        event: "UNAUTHORIZED_ACCESS_ATTEMPT",
        userId: hashUserId(userId),
        action,
        timestamp: new Date(),
      });

      throw new Error("Insufficient permissions");
    }
  }
}

// Usage
const acl = new AccessControl();

app.post("/api/ai/generate", async (req, res) => {
  await acl.checkAccess(req.user.id, "write");

  const result = await ai.generate({
    input: { text: req.body.prompt },
    metadata: {
      userId: hashUserId(req.user.id),
      role: req.user.role,
    },
  });

  res.json(result);
});
```

### Audit Logging (CC7.2)

Comprehensive audit trail for all AI operations.

```typescript
type AuditEntry = {
  timestamp: Date;
  userId: string;
  action: string;
  provider: string;
  model: string;
  inputHash: string; // Hash of input (not raw input for privacy)
  outputHash: string; // Hash of output
  tokensUsed: number;
  cost: number;
  latency: number;
  success: boolean;
  error?: string;
  ipAddress: string;
  userAgent: string;
  requestId: string;
};

class AuditLogger {
  async log(entry: AuditEntry) {
    // Store in tamper-proof audit log
    await database.insert("audit_logs", {
      ...entry,
      hash: this.computeHash(entry), // Detect tampering
    });

    // Also send to external SIEM
    await siem.sendEvent(entry);
  }

  private computeHash(entry: AuditEntry): string {
    const hash = createHash("sha256");
    hash.update(JSON.stringify(entry));
    return hash.digest("hex");
  }

  async query(filters: any) {
    return await database.find("audit_logs", filters);
  }
}

// Usage
const auditLogger = new AuditLogger();

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  onRequest: async (req) => {
    await auditLogger.log({
      timestamp: new Date(),
      userId: hashUserId(req.userId),
      action: "AI_REQUEST_STARTED",
      provider: req.provider,
      model: req.model,
      inputHash: hashInput(req.input),
      tokensUsed: 0,
      cost: 0,
      latency: 0,
      success: false,
      ipAddress: req.ipAddress,
      userAgent: req.userAgent,
      requestId: req.requestId,
    });
  },
  onSuccess: async (result, req) => {
    await auditLogger.log({
      timestamp: new Date(),
      userId: hashUserId(req.userId),
      action: "AI_REQUEST_COMPLETED",
      provider: result.provider,
      model: result.model,
      inputHash: hashInput(req.input),
      outputHash: hashOutput(result.content),
      tokensUsed: result.usage.totalTokens,
      cost: result.cost,
      latency: result.latency,
      success: true,
      ipAddress: req.ipAddress,
      userAgent: req.userAgent,
      requestId: req.requestId,
    });
  },
});
```

### Encryption (CC6.7)

Encrypt data at rest and in transit.

```typescript

  createCipheriv,
  createDecipheriv,
  randomBytes,
  createHash,
} from "crypto";

class EncryptionService {
  private algorithm = "aes-256-gcm";
  private key = Buffer.from(process.env.ENCRYPTION_KEY!, "hex"); // 32 bytes

  encrypt(plaintext: string): { encrypted: string; iv: string; tag: string } {
    const iv = randomBytes(16);
    const cipher = createCipheriv(this.algorithm, this.key, iv);

    let encrypted = cipher.update(plaintext, "utf8", "hex");
    encrypted += cipher.final("hex");

    const tag = cipher.getAuthTag();

    return {
      encrypted,
      iv: iv.toString("hex"),
      tag: tag.toString("hex"),
    };
  }

  decrypt(encrypted: string, iv: string, tag: string): string {
    const decipher = createDecipheriv(
      this.algorithm,
      this.key,
      Buffer.from(iv, "hex"),
    );

    decipher.setAuthTag(Buffer.from(tag, "hex"));

    let decrypted = decipher.update(encrypted, "hex", "utf8");
    decrypted += decipher.final("utf8");

    return decrypted;
  }
}

// Usage: Encrypt sensitive data before storage
const encryption = new EncryptionService();

async function storeSensitiveData(userId: string, data: any) {
  const { encrypted, iv, tag } = encryption.encrypt(JSON.stringify(data));

  await database.insert("encrypted_data", {
    userId: hashUserId(userId),
    encrypted,
    iv,
    tag,
    createdAt: new Date(),
  });
}

async function retrieveSensitiveData(userId: string) {
  const record = await database.findOne("encrypted_data", {
    userId: hashUserId(userId),
  });

  const decrypted = encryption.decrypt(record.encrypted, record.iv, record.tag);
  return JSON.parse(decrypted);
}
```

---

## HIPAA Compliance

### PHI Protection (§164.312)

Protect Protected Health Information.

```typescript
// Identify and redact PHI before sending to AI
function redactPHI(text: string): string {
  return (
    text
      .replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN-REDACTED]") // SSN
      // Phone: match (xxx) xxx-xxxx, xxx-xxx-xxxx, xxx.xxx.xxxx, +1-xxx-xxx-xxxx
      .replace(
        /(\+1[-.\s]?)?(\(?\d{3}\)?[-.\s]?)\d{3}[-.\s]?\d{4}\b/g,
        "[PHONE-REDACTED]",
      )
      .replace(/\b[\w.-]+@[\w.-]+\.\w+\b/g, "[EMAIL-REDACTED]") // Email
      .replace(/\b\d{1,2}\/\d{1,2}\/\d{2,4}\b/g, "[DATE-REDACTED]")
  ); // DOB
}

// HIPAA-compliant AI request
const result = await ai.generate({
  input: {
    text: redactPHI(medicalRecord), // Redact PHI first
  },
  metadata: {
    hipaaCompliant: true,
    phi: false, // Confirm no PHI in request
    baaRequired: true,
  },
});
```

### Business Associate Agreement (BAA)

Ensure providers have signed BAAs.

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      priority: 1,
      config: { apiKey: process.env.OPENAI_KEY },
      compliance: {
        hipaa: true,
        baa: true, // OpenAI offers BAA for Enterprise
        baaSignedDate: "2024-01-15",
      },
    },
    {
      name: "anthropic",
      priority: 2,
      config: { apiKey: process.env.ANTHROPIC_KEY },
      compliance: {
        hipaa: true,
        baa: true, // Anthropic offers BAA
        baaSignedDate: "2024-02-01",
      },
    },
  ],
  compliance: {
    framework: "HIPAA",
    requireBAA: true, // Only use providers with BAA
    encryption: {
      atRest: true,
      inTransit: true,
    },
  },
});
```

### Audit Controls (§164.312(b))

Track all PHI access.

```typescript
type HIPAAAuditEntry = {
  timestamp: Date;
  userId: string;
  action: "CREATE" | "READ" | "UPDATE" | "DELETE";
  resourceType: "PHI" | "MEDICAL_RECORD";
  resourceId: string;
  success: boolean;
  ipAddress: string;
  reasonForAccess: string;
};

class HIPAAAuditLogger {
  async logPHIAccess(entry: HIPAAAuditEntry) {
    // Store in immutable audit log
    await database.insert("hipaa_audit_logs", {
      ...entry,
      hash: hashEntry(entry), // Tamper detection
      retainUntil: new Date(Date.now() + 6 * 365 * 86400000), // 6 years
    });

    // Alert on suspicious access
    if (this.isSuspicious(entry)) {
      await alerting.sendAlert("Suspicious PHI access detected", entry);
    }
  }

  private isSuspicious(entry: HIPAAAuditEntry): boolean {
    // Detect anomalies
    const recentAccess = await this.getRecentAccess(entry.userId);

    // Too many accesses in short time
    if (recentAccess.length > 100) return true;

    // Access outside business hours
    const hour = new Date().getHours();
    if (hour  22) return true;

    return false;
  }
}
```

---

## Security Best Practices

### 1. ✅ Hash User IDs

```typescript
function hashUserId(userId: string): string {
  const hash = createHash("sha256");
  hash.update(userId + process.env.HASH_SALT);
  return hash.digest("hex");
}

// Never send raw user IDs to AI providers
const result = await ai.generate({
  input: { text: prompt },
  metadata: {
    userId: hashUserId(user.id), // ✅ Hashed
    // NOT: userId: user.id       // ❌ Raw
  },
});
```

### 2. ✅ Use HTTPS Only

```typescript
const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  security: {
    enforceHTTPS: true, // Reject HTTP connections
    tlsVersion: "1.3", // Minimum TLS version
    verifyCertificates: true,
  },
});
```

### 3. ✅ Implement Rate Limiting

```typescript

const limiter = rateLimit({
  windowMs: 60000, // 1 minute
  max: 100, // 100 requests per minute
  message: "Too many requests",
});

app.use("/api/ai", limiter);
```

### 4. ✅ Validate Inputs

```typescript
function validateInput(input: string): boolean {
  // Prevent prompt injection
  const forbidden = ["ignore previous instructions", "system:", "admin:"];

  for (const phrase of forbidden) {
    if (input.toLowerCase().includes(phrase)) {
      throw new Error("Potential prompt injection detected");
    }
  }

  // Limit length
  if (input.length > 10000) {
    throw new Error("Input too long");
  }

  return true;
}
```

### 5. ✅ Monitor for Anomalies

```typescript
class AnomalyDetector {
  private baseline = {
    avgRequestsPerHour: 100,
    avgTokensPerRequest: 500,
    avgCostPerRequest: 0.01,
  };

  detectAnomalies(metrics: any) {
    // Unusual spike in requests
    if (metrics.requestsThisHour > this.baseline.avgRequestsPerHour * 5) {
      alerting.sendAlert("Unusual spike in AI requests");
    }

    // Unusual token usage
    if (metrics.avgTokens > this.baseline.avgTokensPerRequest * 3) {
      alerting.sendAlert("Unusual token usage pattern");
    }

    // Unusual costs
    if (metrics.avgCost > this.baseline.avgCostPerRequest * 10) {
      alerting.sendAlert("Unusual AI costs detected");
    }
  }
}
```

---

## Compliance Checklist

### GDPR Compliance ✅

- [ ] Data residency enforced (EU data in EU)
- [ ] Explicit user consent collected and tracked
- [ ] Data minimization implemented
- [ ] Audit logging enabled
- [ ] Right to erasure implemented
- [ ] Data retention policy configured
- [ ] Privacy policy updated
- [ ] DPIA conducted for high-risk processing

### SOC2 Compliance ✅

- [ ] Access controls implemented
- [ ] Audit logging comprehensive
- [ ] Encryption at rest and in transit
- [ ] Security monitoring active
- [ ] Incident response plan documented
- [ ] Change management process
- [ ] Vendor management (provider assessments)
- [ ] Annual penetration testing

### HIPAA Compliance ✅

- [ ] BAA signed with all AI providers
- [ ] PHI redaction implemented
- [ ] Encryption enabled (AES-256)
- [ ] Audit controls active (6-year retention)
- [ ] Access controls enforced
- [ ] Risk assessment completed
- [ ] Security officer assigned
- [ ] Breach notification process documented

---

## Related Documentation

- **[Mistral AI Guide](/docs/getting-started/providers/mistral)** - GDPR-compliant EU provider
- **[Multi-Region Deployment](/docs/guides/enterprise/multi-region)** - Geographic compliance
- **[Monitoring Guide](/docs/observability/health-monitoring)** - Security monitoring
- **[Audit Trails](/docs/guides/enterprise/audit-trails)** - Comprehensive logging

---

## Additional Resources

- **[GDPR Official Text](https://gdpr-info.eu/)** - EU regulation
- **[SOC2 Framework](https://www.aicpa.org/soc)** - Trust services criteria
- **[HIPAA Rules](https://www.hhs.gov/hipaa)** - Healthcare privacy
- **[OpenAI BAA](https://openai.com/enterprise-privacy)** - Enterprise compliance

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Next.js Integration Guide

<!-- Source: guides/frameworks/nextjs.md -->

# Next.js Integration Guide

**Build production-ready AI applications with Next.js 14+ and NeuroLink**

## Quick Start

### 1. Create Next.js Project

```bash
npx create-next-app@latest my-ai-app
cd my-ai-app
npm install @juspay/neurolink
```

### 2. Add Environment Variables

```bash
# .env.local
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=AIza...
```

### 3. Create NeuroLink Instance

```typescript
// lib/ai.ts

export const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
    {
      name: "anthropic",
      config: { apiKey: process.env.ANTHROPIC_API_KEY },
    },
  ],
});
```

### 4. Server Component Example

```typescript
// app/page.tsx

export default async function Home() {
  const result = await ai.generate({
    input: { text: 'Explain Next.js in one sentence' },
    provider: 'openai',
    model: 'gpt-4o-mini'
  });

  return (

      AI Response
      {result.content}

  );
}
```

---

## Server Components Pattern

### Basic Server Component

```typescript
// app/summary/page.tsx

type Props = {
  searchParams: { text?: string };
};

export default async function SummaryPage({ searchParams }: Props) {
  const { text } = searchParams;

  if (!text) {
    return No text provided;
  }

  // AI generation happens on server
  const result = await ai.generate({
    input: { text: `Summarize: ${text}` },
    provider: 'openai',
    model: 'gpt-4o-mini'
  });

  return (

      Summary

        {result.content}


        Tokens: {result.usage.totalTokens} | Cost: ${result.cost.toFixed(4)}


  );
}
```

### Server Component with Suspense

```typescript
// app/analysis/page.tsx

async function Analysis({ query }: { query: string }) {
  const result = await ai.generate({
    input: { text: query },
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022'
  });

  return {result.content};
}

export default function AnalysisPage({ searchParams }: any) {
  const { query } = searchParams;

  return (

      AI Analysis


  );
}
```

---

## Server Actions

### Basic Server Action

```typescript
// app/actions.ts
"use server";

export async function generateText(prompt: string) {
  const result = await ai.generate({
    input: { text: prompt },
    provider: "openai",
    model: "gpt-4o-mini",
  });

  return {
    content: result.content,
    tokens: result.usage.totalTokens,
    cost: result.cost,
  };
}
```

### Client Component Using Server Action

```typescript
// app/components/TextGenerator.tsx
'use client';

export function TextGenerator() {
  const [prompt, setPrompt] = useState('');
  const [result, setResult] = useState('');
  const [loading, setLoading] = useState(false);

  async function handleSubmit(e: React.FormEvent) {
    e.preventDefault();
    setLoading(true);

    try {
      const response = await generateText(prompt);
      setResult(response.content);
    } catch (error) {
      console.error(error);
    } finally {
      setLoading(false);
    }
  }

  return (


         setPrompt(e.target.value)}
          className="w-full p-4 border rounded"
          rows={4}
          placeholder="Enter your prompt..."
        />

          {loading ? 'Generating...' : 'Generate'}


      {result && (

          Result:
          {result}

      )}

  );
}
```

---

## API Routes

### Basic API Route

```typescript
// app/api/generate/route.ts

export async function POST(request: NextRequest) {
  try {
    const {
      prompt,
      provider = "openai",
      model = "gpt-4o-mini",
    } = await request.json();

    if (!prompt) {
      return NextResponse.json(
        { error: "Prompt is required" },
        { status: 400 },
      );
    }

    const result = await ai.generate({
      input: { text: prompt },
      provider,
      model,
    });

    return NextResponse.json({
      content: result.content,
      usage: result.usage,
      cost: result.cost,
      provider: result.provider,
      model: result.model,
    });
  } catch (error: any) {
    console.error("AI generation error:", error);
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
}
```

### Protected API Route with Middleware

```typescript
// middleware.ts

export function middleware(request: NextRequest) {
  // Check authentication
  const token = request.headers.get("authorization")?.replace("Bearer ", "");

  if (!token || token !== process.env.API_SECRET) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }

  return NextResponse.next();
}

export const config = {
  matcher: "/api/:path*",
};
```

### Rate-Limited API Route

```typescript
// app/api/generate/route.ts

const limiter = rateLimit({
  interval: 60 * 1000, // 1 minute
  uniqueTokenPerInterval: 500,
});

export async function POST(request: NextRequest) {
  try {
    // Rate limiting
    const ip = request.ip ?? "anonymous";
    const { success } = await limiter.check(ip, 10); // 10 requests per minute

    if (!success) {
      return NextResponse.json(
        { error: "Rate limit exceeded" },
        { status: 429 },
      );
    }

    const { prompt } = await request.json();

    const result = await ai.generate({
      input: { text: prompt },
      provider: "openai",
      model: "gpt-4o-mini",
    });

    return NextResponse.json({
      content: result.content,
      usage: result.usage,
    });
  } catch (error: any) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
}
```

---

## Streaming Responses

### Streaming API Route

```typescript
// app/api/stream/route.ts

export const runtime = "edge"; // Enable Edge Runtime for streaming

export async function POST(request: NextRequest) {
  const { prompt } = await request.json();

  const stream = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of ai.stream({
          input: { text: prompt },
          provider: "openai",
          model: "gpt-4o-mini",
        })) {
          const text = `data: ${JSON.stringify({ content: chunk.content })}\n\n`;
          controller.enqueue(new TextEncoder().encode(text));
        }

        controller.enqueue(new TextEncoder().encode("data: [DONE]\n\n"));
        controller.close();
      } catch (error: any) {
        controller.error(error);
      }
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}
```

### Client Component for Streaming

```typescript
// app/components/StreamingChat.tsx
'use client';

export function StreamingChat() {
  const [prompt, setPrompt] = useState('');
  const [response, setResponse] = useState('');
  const [loading, setLoading] = useState(false);

  async function handleSubmit(e: React.FormEvent) {
    e.preventDefault();
    setLoading(true);
    setResponse('');

    try {
      const res = await fetch('/api/stream', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt })
      });

      if (!res.ok) throw new Error('Stream failed');

      const reader = res.body?.getReader();
      const decoder = new TextDecoder();

      while (true) {
        const { done, value } = await reader!.read();
        if (done) break;

        const chunk = decoder.decode(value);
        const lines = chunk.split('\n');

        for (const line of lines) {
          if (line.startsWith('data: ')) {
            const data = line.slice(6);
            if (data === '[DONE]') break;

            try {
              const parsed = JSON.parse(data);
              setResponse(prev => prev + parsed.content);
            } catch (e) {
              // Skip invalid JSON
            }
          }
        }
      }
    } catch (error) {
      console.error(error);
    } finally {
      setLoading(false);
    }
  }

  return (


         setPrompt(e.target.value)}
          className="w-full p-4 border rounded"
          rows={4}
          placeholder="Ask anything..."
          disabled={loading}
        />

          {loading ? 'Streaming...' : 'Send'}


      {response && (

          Response:
          {response}

      )}

  );
}
```

---

## Edge Runtime

### Edge API Route

```typescript
// app/api/edge/generate/route.ts

// Enable Edge Runtime
export const runtime = "edge";

export async function POST(request: Request) {
  const { prompt } = await request.json();

  const result = await ai.generate({
    input: { text: prompt },
    provider: "openai",
    model: "gpt-4o-mini",
  });

  return Response.json({
    content: result.content,
    usage: result.usage,
  });
}
```

### Edge Function with Regional Routing

```typescript
// app/api/edge/regional/route.ts
export const runtime = "edge";

export async function POST(request: Request) {
  // Detect user region from request
  const country = request.headers.get("x-vercel-ip-country") || "US";
  const region = mapCountryToRegion(country);

  const { prompt } = await request.json();

  const result = await ai.generate({
    input: { text: prompt },
    metadata: { userRegion: region },
    // Routes to nearest provider based on region
  });

  return Response.json({
    content: result.content,
    region: result.region,
  });
}

function mapCountryToRegion(country: string): string {
  const euCountries = ["DE", "FR", "IT", "ES", "NL", "BE", "AT", "SE", "PL"];
  if (euCountries.includes(country)) return "eu";
  if (country === "US") return "us-east";
  return "asia";
}
```

---

## Production Patterns

### Pattern 1: Chat Application

```typescript
// app/chat/page.tsx
'use client';

type Message = {
  role: 'user' | 'assistant';
  content: string;
};

export default function ChatPage() {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);

  async function sendMessage(e: React.FormEvent) {
    e.preventDefault();
    if (!input.trim()) return;

    const userMessage: Message = { role: 'user', content: input };
    setMessages(prev => [...prev, userMessage]);
    setInput('');
    setLoading(true);

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          messages: [...messages, userMessage]
        })
      });

      const data = await response.json();
      const assistantMessage: Message = {
        role: 'assistant',
        content: data.content
      };

      setMessages(prev => [...prev, assistantMessage]);
    } catch (error) {
      console.error(error);
    } finally {
      setLoading(false);
    }
  }

  return (


        {messages.map((msg, i) => (


              {msg.content}


        ))}
        {loading && (


              Thinking...


        )}


           setInput(e.target.value)}
            className="flex-1 p-2 border rounded"
            placeholder="Type a message..."
            disabled={loading}
          />

            Send


  );
}
```

```typescript
// app/api/chat/route.ts

export async function POST(request: NextRequest) {
  const { messages } = await request.json();

  // Convert to prompt
  const prompt = messages
    .map(
      (m: any) => `${m.role === "user" ? "User" : "Assistant"}: ${m.content}`,
    )
    .join("\n");

  const result = await ai.generate({
    input: { text: prompt + "\nAssistant:" },
    provider: "anthropic",
    model: "claude-3-5-sonnet-20241022",
    maxTokens: 500,
  });

  return NextResponse.json({
    content: result.content,
  });
}
```

### Pattern 2: Document Analysis

```typescript
// app/analyze/page.tsx

export default async function AnalyzePage({ searchParams }: any) {
  const { file } = searchParams;

  if (!file) {
    return No file provided;
  }

  // Read file (in real app, upload via form)
  const content = await readFile(file, 'utf-8');

  // Analyze with AI
  const result = await ai.generate({
    input: {
      text: `Analyze this document and provide key insights:\n\n${content}`
    },
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022'
  });

  return (

      Document Analysis
      {result.content}

  );
}
```

### Pattern 3: Cost Tracking

```typescript
// lib/analytics.ts

export async function trackAIUsage(data: {
  userId: string;
  provider: string;
  model: string;
  tokens: number;
  cost: number;
}) {
  await prisma.aiUsage.create({
    data: {
      userId: data.userId,
      provider: data.provider,
      model: data.model,
      tokens: data.tokens,
      cost: data.cost,
      timestamp: new Date(),
    },
  });
}

export async function getUserSpending(userId: string) {
  const result = await prisma.aiUsage.aggregate({
    where: { userId },
    _sum: { cost: true, tokens: true },
    _count: true,
  });

  return {
    totalCost: result._sum.cost || 0,
    totalTokens: result._sum.tokens || 0,
    requestCount: result._count,
  };
}
```

```typescript
// app/api/generate/route.ts

export async function POST(request: NextRequest) {
  const session = await getSession(request);
  const { prompt } = await request.json();

  const result = await ai.generate({
    input: { text: prompt },
    provider: "openai",
    model: "gpt-4o-mini",
    enableAnalytics: true,
  });

  // Track usage
  await trackAIUsage({
    userId: session.user.id,
    provider: result.provider,
    model: result.model,
    tokens: result.usage.totalTokens,
    cost: result.cost,
  });

  return NextResponse.json({ content: result.content });
}
```

---

## Best Practices

### 1. ✅ Use Server Components for Static AI Content

```typescript
// ✅ Good: Server Component (no client bundle)
async function AIContent() {
  const result = await ai.generate({
    input: { text: 'Generate marketing copy' }
  });
  return {result.content};
}
```

### 2. ✅ Stream for Long Responses

```typescript
// ✅ Good: Stream for better UX
export const runtime = "edge";

export async function POST(request: Request) {
  const stream = await ai.stream({
    /* ... */
  });
  return new Response(stream);
}
```

### 3. ✅ Implement Rate Limiting

```typescript
// ✅ Good: Protect API routes
const limiter = rateLimit({
  interval: 60 * 1000,
  uniqueTokenPerInterval: 500,
});

export async function POST(request: NextRequest) {
  await limiter.check(request.ip, 10);
  // ... generate AI response
}
```

### 4. ✅ Cache AI Responses

```typescript
// ✅ Good: Cache with Next.js
export const revalidate = 3600; // 1 hour

export default async function Page() {
  const result = await ai.generate({ /* ... */ });
  return {result.content};
}
```

### 5. ✅ Handle Errors Gracefully

```typescript
// ✅ Good: Error handling
try {
  const result = await ai.generate({
    /* ... */
  });
  return NextResponse.json(result);
} catch (error) {
  console.error("AI Error:", error);
  return NextResponse.json(
    { error: "AI service unavailable" },
    { status: 503 },
  );
}
```

---

## Deployment

### Vercel Deployment

```bash
# Install Vercel CLI
npm i -g vercel

# Deploy
vercel

# Set environment variables
vercel env add OPENAI_API_KEY
vercel env add ANTHROPIC_API_KEY
```

### Environment Variables (Production)

```bash
# Production .env
OPENAI_API_KEY=sk-prod-...
ANTHROPIC_API_KEY=sk-ant-prod-...
DATABASE_URL=postgresql://...
API_SECRET=your-secret-key
```

---

## Related Documentation

- **[API Reference](/docs/sdk/api-reference)** - NeuroLink SDK API
- **[Streaming Guide](/docs/advanced/streaming)** - Streaming responses
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce costs
- **[Compliance Guide](/docs/guides/enterprise/compliance)** - Security and authentication
- **[Fastify Integration](/docs/sdk/framework-integration)** - High-performance Node.js framework with schema validation

---

## Additional Resources

- **[Next.js Documentation](https://nextjs.org/docs)** - Official Next.js docs
- **[Vercel AI SDK](https://sdk.vercel.ai/)** - Alternative AI SDK
- **[Next.js Examples](https://github.com/vercel/next.js/tree/canary/examples)** - Example apps

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Cost Optimization Guide

<!-- Source: guides/enterprise/cost-optimization.md -->

# Cost Optimization Guide

**Reduce AI costs by 80-95% through smart provider selection, caching, and optimization strategies**

------------------- | --------------- | ---------- |
| **Free Tier First**    | 80-100%         | Low        |
| **Model Selection**    | 50-90%          | Low        |
| **Response Caching**   | 60-95%          | Medium     |
| **Token Optimization** | 20-40%          | Medium     |
| **Prompt Compression** | 15-30%          | Medium     |
| **Smart Fallbacks**    | 30-60%          | High       |
| **Batch Processing**   | 50%             | Medium     |

### Cost Comparison

```
Monthly Cost Comparison (1M requests, 500 tokens avg):

Premium (GPT-4):           $6,000/month
Smart Routing:             $1,200/month  (80% savings)
Free Tier First:           $300/month    (95% savings)
Full Optimization:         $150/month    (97.5% savings)
```

---

## Quick Wins

### 1. Use Free Tiers First

Maximize free tier usage before falling back to paid providers.

```typescript

const ai = new NeuroLink({
  providers: [
    // Tier 1: Free providers (try these first)
    {
      name: "google-ai",
      priority: 1,
      model: "gemini-2.0-flash",
      config: { apiKey: process.env.GOOGLE_AI_KEY },
      quotas: {
        daily: 1500, // 1,500 requests/day free
        perMinute: 15, // 15 RPM free
      },
    },

    // Tier 2: Cheap paid providers
    {
      name: "openai",
      priority: 2,
      model: "gpt-4o-mini",
      config: { apiKey: process.env.OPENAI_KEY },
      costPer1M: 150, // $0.15/1K tokens
    },

    // Tier 3: Premium (only when necessary)
    {
      name: "anthropic",
      priority: 3,
      model: "claude-3-5-sonnet-20241022",
      config: { apiKey: process.env.ANTHROPIC_KEY },
      costPer1M: 3000, // $3/1K tokens
    },
  ],
  failoverConfig: {
    enabled: true,
    fallbackOnQuota: true, // Auto-failover when quota exhausted
  },
});

// Automatically uses cheapest available provider
const result = await ai.generate({
  input: { text: "Your prompt" },
});

console.log(`Used: ${result.provider}, Cost: $${result.cost}`);
```

**Estimated Monthly Savings:**

```
Before: 1M requests × $3/1K tokens = $1,500/month
After:  900K free + 100K paid × $0.15/1K = $15/month
Savings: $1,485/month (99% reduction)
```

### 2. Choose Cost-Effective Models

Use cheaper models for simple tasks, premium only when needed.

```typescript
function selectModel(task: string): { provider: string; model: string } {
  const complexity = analyzeComplexity(task);

  if (complexity === "simple") {
    return {
      provider: "google-ai",
      model: "gemini-2.0-flash", // Free
    };
  } else if (complexity === "medium") {
    return {
      provider: "openai",
      model: "gpt-4o-mini", // $0.15/1K
    };
  } else {
    return {
      provider: "anthropic",
      model: "claude-3-5-sonnet-20241022", // $3/1K
    };
  }
}

function analyzeComplexity(task: string): "simple" | "medium" | "complex" {
  const length = task.length;
  const keywords = /analyze|complex|detailed|comprehensive/i;

  if (length ();

  private TTL = 3600000; // 1 hour
  private totalSavings = 0;

  getCacheKey(input: any, provider: string, model: string): string {
    const hash = createHash("sha256");
    hash.update(JSON.stringify({ input, provider, model }));
    return hash.digest("hex");
  }

  get(key: string): any | null {
    const cached = this.cache.get(key);

    if (!cached) return null;

    // Check if expired
    if (Date.now() - cached.timestamp > this.TTL) {
      this.cache.delete(key);
      return null;
    }

    // Track savings
    this.totalSavings += cached.cost;
    console.log(`Cache hit! Saved $${cached.cost.toFixed(4)}`);

    return cached.response;
  }

  set(key: string, response: any, cost: number) {
    this.cache.set(key, {
      response,
      timestamp: Date.now(),
      cost,
    });
  }

  getSavings(): number {
    return this.totalSavings;
  }

  getStats() {
    return {
      entries: this.cache.size,
      totalSavings: this.totalSavings,
      avgCostPerEntry: this.totalSavings / this.cache.size,
    };
  }
}

// Usage
const cache = new ResponseCache();

async function cachedGenerate(prompt: string) {
  const cacheKey = cache.getCacheKey({ text: prompt }, "openai", "gpt-4o-mini");

  // Check cache first
  const cached = cache.get(cacheKey);
  if (cached) {
    return cached;
  }

  // Generate fresh response
  const result = await ai.generate({
    input: { text: prompt },
    provider: "openai",
    model: "gpt-4o-mini",
    enableAnalytics: true,
  });

  // Store in cache
  cache.set(cacheKey, result, result.cost);

  return result;
}

// Check savings
setInterval(() => {
  console.log("Cache stats:", cache.getStats());
  // { entries: 523, totalSavings: 45.67, avgCostPerEntry: 0.087 }
}, 60000);
```

**Estimated Savings:**

```
Cache hit rate: 60% (common in production)
Monthly requests: 1M
Cost without cache: $150
Cost with cache:    $60 (40% of requests)
Savings: $90/month (60% reduction)
```

---

## Free Tier Optimization

### Google AI Studio (1,500 RPD Free)

```typescript
class GoogleAIQuotaManager {
  private requestsToday = 0;
  private dayStart = Date.now();

  async canUseFreeTier(): Promise {
    // Reset daily counter
    if (Date.now() - this.dayStart > 86400000) {
      this.requestsToday = 0;
      this.dayStart = Date.now();
    }

    return this.requestsToday  await googleQuota.canUseFreeTier(),
    },
    {
      name: "openai",
      priority: 2,
      model: "gpt-4o-mini", // Cheap fallback
    },
  ],
});
```

**Monthly Savings:**

```
1,500 requests/day × 30 days = 45,000 free requests
45,000 × 500 tokens × $0.15/1M = $3.37 saved/month
If 100% free tier: $0 cost
```

### Hugging Face (100% Free)

```typescript
// Use Hugging Face for zero-cost inference
const ai = new NeuroLink({
  providers: [
    {
      name: "huggingface",
      priority: 1,
      model: "mistralai/Mistral-7B-Instruct-v0.2",
      config: { apiKey: process.env.HF_API_KEY }, // Free API key
      costPer1M: 0, // Completely free
    },
    {
      name: "openai",
      priority: 2,
      model: "gpt-4o-mini",
      costPer1M: 150, // Fallback when HF quality insufficient
    },
  ],
});

// For simple tasks, 100% free with Hugging Face
const simple = await ai.generate({
  input: { text: "Summarize: AI is transforming industries..." },
  // Uses Hugging Face (free)
});
```

---

## Token Optimization

### 1. Reduce Output Tokens

Limit response length to only what's needed.

```typescript
// ❌ Bad: No limit (can generate 1000s of tokens)
const wasteful = await ai.generate({
  input: { text: "List AI providers" },
  // Could generate 2000+ tokens
});

// ✅ Good: Set reasonable limit
const efficient = await ai.generate({
  input: { text: "List AI providers" },
  maxTokens: 200, // Only what's needed
});

// Savings per request:
// Before: 2000 tokens × $0.15/1M = $0.0003
// After:  200 tokens × $0.15/1M = $0.00003
// Savings: 90%
```

### 2. Optimize Prompts

Use concise prompts without sacrificing quality.

```typescript
// ❌ Bad: Verbose prompt (300 tokens)
const verbose = await ai.generate({
  input: {
    text: `
    I would like you to please help me understand what artificial intelligence
    is all about. Please provide a comprehensive explanation that covers the
    following topics in great detail: machine learning, deep learning, neural
    networks, natural language processing, and computer vision. Make sure to
    explain each concept thoroughly and provide examples where applicable.
  `,
  },
});

// ✅ Good: Concise prompt (50 tokens)
const concise = await ai.generate({
  input: {
    text: "Explain AI: ML, DL, neural networks, NLP, computer vision. Include examples.",
  },
});

// Savings per request:
// Before: 300 input + 500 output = 800 tokens × $0.15/1M = $0.00012
// After:  50 input + 500 output = 550 tokens × $0.15/1M = $0.0000825
// Savings: 31% on input tokens
```

### 3. Streaming Optimization

Stop generation early when answer is complete.

```typescript
async function streamWithEarlyStop(prompt: string, stopWords: string[]) {
  let fullResponse = "";
  let stopped = false;

  for await (const chunk of ai.stream({
    input: { text: prompt },
    provider: "openai",
    model: "gpt-4o-mini",
  })) {
    fullResponse += chunk.content;

    // Check for stop condition
    if (stopWords.some((word) => fullResponse.includes(word))) {
      await chunk.cancel(); // Stop generation
      stopped = true;
      break;
    }
  }

  console.log(`Stopped early: ${stopped}`);
  return fullResponse;
}

// Usage
const result = await streamWithEarlyStop(
  "List 10 programming languages",
  ["10."], // Stop after 10th item
);

// Potential savings: 20-40% by not generating unnecessary content
```

---

## Prompt Engineering for Cost

### Use Structured Outputs

Request specific formats to reduce token waste.

```typescript
// ❌ Bad: Unstructured (generates 500+ tokens)
const unstructured = await ai.generate({
  input: { text: "Tell me about AI providers" },
});
// Output: "There are many AI providers available today. Let me tell you about them in detail..."

// ✅ Good: Structured (generates 200 tokens)
const structured = await ai.generate({
  input: { text: "List AI providers in format: name|description|pricing" },
});
// Output: "OpenAI|GPT models|$0.002/1K\nAnthropic|Claude|$0.003/1K\n..."

// Savings: 60% fewer tokens
```

### Request Summaries

Ask for brief responses when detail isn't needed.

```typescript
// For detailed analysis
const detailed = await ai.generate({
  input: { text: "Provide detailed analysis of AI market trends (500 words)" },
  maxTokens: 700,
});
// Cost: $0.0001

// For quick insights
const summary = await ai.generate({
  input: { text: "AI market trends: 3 bullet points" },
  maxTokens: 100,
});
// Cost: $0.000015
// Savings: 85%
```

---

## Batch Processing

Process multiple requests in single API call.

```typescript
// ❌ Bad: 10 separate requests
const wasteful = await Promise.all([
  ai.generate({ input: { text: "Translate to French: Hello" } }),
  ai.generate({ input: { text: "Translate to French: Goodbye" } }),
  // ... 8 more requests
]);
// Cost: 10 × overhead + 10 × processing = high overhead

// ✅ Good: Batch into single request
const batch = await ai.generate({
  input: {
    text: `
    Translate to French:
    1. Hello
    2. Goodbye
    3. Thank you
    ... (10 items)
  `,
  },
  maxTokens: 200,
});
// Cost: 1 × overhead + batch processing = ~50% savings
```

**Batch Processing Pattern:**

```typescript
class BatchProcessor {
  private queue: Array void;
  }> = [];

  private batchSize = 10;
  private batchTimeout = 1000; // 1 second
  private timer: NodeJS.Timeout | null = null;

  async add(prompt: string): Promise {
    return new Promise((resolve) => {
      this.queue.push({ prompt, resolve });

      if (this.queue.length >= this.batchSize) {
        this.processBatch();
      } else if (!this.timer) {
        this.timer = setTimeout(() => this.processBatch(), this.batchTimeout);
      }
    });
  }

  private async processBatch() {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }

    const batch = this.queue.splice(0, this.batchSize);
    if (batch.length === 0) return;

    // Combine prompts
    const combinedPrompt = batch
      .map((item, i) => `${i + 1}. ${item.prompt}`)
      .join("\n");

    // Single API call
    const result = await ai.generate({
      input: { text: `Answer each question:\n${combinedPrompt}` },
    });

    // Parse and distribute responses
    const responses = result.content.split("\n");
    batch.forEach((item, i) => {
      item.resolve(responses[i]);
    });
  }
}

// Usage
const batcher = new BatchProcessor();

// These get batched into single request
const results = await Promise.all([
  batcher.add("What is AI?"),
  batcher.add("What is ML?"),
  batcher.add("What is DL?"),
]);
```

---

## Smart Routing Patterns

### Cost-Based Routing

```typescript
const ai = new NeuroLink({
  providers: [
    // Route simple queries to free tier
    {
      name: "google-ai",
      priority: 1,
      model: "gemini-2.0-flash",
      condition: (req) => req.complexity === "low",
      costPer1M: 0,
    },

    // Medium complexity → cheap paid
    {
      name: "openai",
      priority: 1,
      model: "gpt-4o-mini",
      condition: (req) => req.complexity === "medium",
      costPer1M: 150,
    },

    // Complex → premium only when necessary
    {
      name: "anthropic",
      priority: 1,
      model: "claude-3-5-sonnet-20241022",
      condition: (req) => req.complexity === "high",
      costPer1M: 3000,
    },
  ],
});

// Classify and route
function classifyComplexity(prompt: string): "low" | "medium" | "high" {
  const length = prompt.length;
  const complexWords = ["analyze", "detailed", "comprehensive", "complex"];
  const hasComplexWords = complexWords.some((w) =>
    prompt.toLowerCase().includes(w),
  );

  if (length  86400000) {
      console.log(`Daily cost: $${this.dailyCost.toFixed(2)}`);
      this.dailyCost = 0;
      this.dayStart = now;
    }

    // Reset monthly
    if (now - this.monthStart > 2592000000) {
      // 30 days
      console.log(`Monthly cost: $${this.monthlyCost.toFixed(2)}`);
      this.monthlyCost = 0;
      this.monthStart = now;
    }

    this.dailyCost += cost;
    this.monthlyCost += cost;

    // Check budgets
    if (this.dailyCost > this.budget.daily) {
      throw new Error(
        `Daily budget exceeded: $${this.dailyCost.toFixed(2)} > $${this.budget.daily}`,
      );
    }

    if (this.monthlyCost > this.budget.monthly) {
      throw new Error(
        `Monthly budget exceeded: $${this.monthlyCost.toFixed(2)} > $${this.budget.monthly}`,
      );
    }

    console.log(
      `Cost: $${cost.toFixed(4)} (${provider}/${model}), Daily: $${this.dailyCost.toFixed(2)}, Monthly: $${this.monthlyCost.toFixed(2)}`,
    );
  }

  getStatus() {
    return {
      daily: {
        spent: this.dailyCost,
        budget: this.budget.daily,
        remaining: this.budget.daily - this.dailyCost,
        percentUsed: (this.dailyCost / this.budget.daily) * 100,
      },
      monthly: {
        spent: this.monthlyCost,
        budget: this.budget.monthly,
        remaining: this.budget.monthly - this.monthlyCost,
        percentUsed: (this.monthlyCost / this.budget.monthly) * 100,
      },
    };
  }
}

// Usage
const costTracker = new CostTracker();

const result = await ai.generate({
  input: { text: "Your prompt" },
  enableAnalytics: true,
});

costTracker.recordCost(result.cost, result.provider, result.model);

// Check status
console.log(costTracker.getStatus());
/*
{
  daily: { spent: 2.45, budget: 10, remaining: 7.55, percentUsed: 24.5 },
  monthly: { spent: 45.23, budget: 250, remaining: 204.77, percentUsed: 18.09 }
}
*/
```

---

## Best Practices

### 1. ✅ Free Tier First, Always

```typescript
// ✅ Always try free tier before paid
const ai = new NeuroLink({
  providers: [
    { name: "google-ai", priority: 1 }, // Free
    { name: "openai", priority: 2 }, // Paid fallback
  ],
});
```

### 2. ✅ Cache Aggressively

```typescript
// ✅ Cache frequent queries
const cache = new ResponseCache();
const result = await cachedGenerate(prompt);
// 60%+ hit rate = 60%+ savings
```

### 3. ✅ Limit Output Tokens

```typescript
// ✅ Always set maxTokens
const result = await ai.generate({
  input: { text: prompt },
  maxTokens: 200, // Only generate what's needed
});
```

### 4. ✅ Monitor Spending

```typescript
// ✅ Track costs in real-time
const costTracker = new CostTracker();
// Alert when approaching budget
```

### 5. ✅ Use Appropriate Models

```typescript
// ✅ Don't use GPT-4 for simple tasks
const simple = await ai.generate({
  input: { text: "What is 2+2?" },
  provider: "google-ai", // Free tier for simple query
  model: "gemini-2.0-flash",
});
```

---

## Complete Cost Optimization Stack

```typescript
// Production-ready cost-optimized setup

const cache = new ResponseCache();
const costTracker = new CostTracker();
const quotaManager = new QuotaManager();

const ai = new NeuroLink({
  providers: [
    // Tier 1: Free (Google AI)
    {
      name: "google-ai",
      priority: 1,
      model: "gemini-2.0-flash",
      condition: async () => await quotaManager.canUseGoogleAI(),
      costPer1M: 0,
    },

    // Tier 2: Cheap (OpenAI Mini)
    {
      name: "openai",
      priority: 2,
      model: "gpt-4o-mini",
      costPer1M: 150,
    },

    // Tier 3: Premium (only when needed)
    {
      name: "anthropic",
      priority: 3,
      model: "claude-3-5-sonnet-20241022",
      condition: (req) => req.requiresPremium,
      costPer1M: 3000,
    },
  ],
  failoverConfig: { enabled: true },
  onSuccess: (result) => {
    costTracker.recordCost(result.cost, result.provider, result.model);
    quotaManager.recordUsage(result.provider, result.usage.totalTokens);
  },
});

// Main generation function with full optimization
async function optimizedGenerate(prompt: string, options: any = {}) {
  // 1. Check cache first
  const cacheKey = cache.getCacheKey(
    { text: prompt },
    options.provider,
    options.model,
  );
  const cached = cache.get(cacheKey);
  if (cached) {
    console.log("Cache hit - $0 cost");
    return cached;
  }

  // 2. Optimize prompt
  const optimizedPrompt = optimizePrompt(prompt);

  // 3. Set reasonable max tokens
  const maxTokens = options.maxTokens || estimateNeededTokens(prompt);

  // 4. Generate with cost tracking
  const result = await ai.generate({
    input: { text: optimizedPrompt },
    maxTokens,
    enableAnalytics: true,
    ...options,
  });

  // 5. Cache result
  cache.set(cacheKey, result, result.cost);

  // 6. Log savings
  console.log(`Cost: $${result.cost.toFixed(4)}, Provider: ${result.provider}`);
  console.log(
    `Daily spend: $${costTracker.getStatus().daily.spent.toFixed(2)}`,
  );

  return result;
}

function optimizePrompt(prompt: string): string {
  // Remove excessive whitespace
  return prompt.replace(/\s+/g, " ").trim();
}

function estimateNeededTokens(prompt: string): number {
  // Simple heuristic: output ~2x input length
  const estimatedInput = prompt.length / 4; // ~4 chars per token
  return Math.min(estimatedInput * 2, 500); // Cap at 500
}
```

**Estimated Monthly Savings:**

```
Without optimization: $3,000/month
With full optimization: $150/month
Total savings: $2,850/month (95% reduction)
```

---

## Related Documentation

- **[Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover)** - Automatic failover
- **[Load Balancing](/docs/guides/enterprise/load-balancing)** - Distribution strategies
- **[Provider Setup](/docs/getting-started/provider-setup)** - Provider configuration
- **[Google AI Guide](/docs/getting-started/providers/google-ai)** - Free tier details

---

## Additional Resources

- **[OpenAI Pricing](https://openai.com/pricing)** - OpenAI costs
- **[Anthropic Pricing](https://www.anthropic.com/pricing)** - Claude costs
- **[Google AI Pricing](https://ai.google.dev/pricing)** - Gemini pricing
- **[LiteLLM Cost Tracking](https://docs.litellm.ai/docs/proxy/cost_tracking)** - Cost management

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## GitHub Action Guide

<!-- Source: guides/github-action.md -->

# GitHub Action Guide

**Last Updated:** January 10, 2026
**NeuroLink Version:** 8.32.0

Run AI-powered workflows with 13 providers directly in GitHub Actions. The NeuroLink GitHub Action enables automated code review, issue triage, content generation, and more.

## Quick Start

### Basic Usage

```yaml
name: AI Workflow

on:
  pull_request:
    types: [opened]

permissions:
  contents: read
  pull-requests: write

jobs:
  ai-task:
    runs-on: ubuntu-latest
    steps:
      - uses: juspay/neurolink@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "Review this pull request for potential issues"
          post_comment: true
```

### Auto Provider Detection

When you set `provider: auto` (the default), NeuroLink automatically selects the best available provider based on which API keys you provide:

```yaml
- uses: juspay/neurolink@v1
  with:
    openai_api_key: ${{ secrets.OPENAI_API_KEY }}
    anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    prompt: "Analyze this code"
    # Auto-selects from available providers
```

---

## Provider Configuration

NeuroLink supports 13 AI providers. Configure each by providing the required credentials as secrets.

### Provider Quick Reference

| Provider          | Required Inputs                                                    | Example Models                             |
| ----------------- | ------------------------------------------------------------------ | ------------------------------------------ |
| OpenAI            | `openai_api_key`                                                   | gpt-4o, gpt-4o-mini, o1                    |
| Anthropic         | `anthropic_api_key`                                                | claude-sonnet-4-20250514, claude-3-5-haiku |
| Google AI Studio  | `google_ai_api_key`                                                | gemini-2.5-pro, gemini-2.5-flash           |
| Vertex AI         | `google_vertex_project`, `google_application_credentials`          | gemini-\*, claude-\*                       |
| Amazon Bedrock    | `aws_access_key_id`, `aws_secret_access_key`                       | claude-\*, titan-\*, nova-\*               |
| Azure OpenAI      | `azure_openai_api_key`, `azure_openai_endpoint`                    | gpt-4o, gpt-4-turbo                        |
| Mistral           | `mistral_api_key`                                                  | mistral-large, mistral-small               |
| Hugging Face      | `huggingface_api_key`                                              | Various open models                        |
| OpenRouter        | `openrouter_api_key`                                               | 300+ models                                |
| LiteLLM           | `litellm_api_key`, `litellm_base_url`                              | Proxy to 100+ models                       |
| Ollama            | -                                                                  | Local models                               |
| SageMaker         | `aws_access_key_id`, `aws_secret_access_key`, `sagemaker_endpoint` | Custom endpoints                           |
| OpenAI-Compatible | `openai_compatible_api_key`, `openai_compatible_base_url`          | vLLM, custom APIs                          |

---

### OpenAI

```yaml
- uses: juspay/neurolink@v1
  with:
    openai_api_key: ${{ secrets.OPENAI_API_KEY }}
    provider: openai
    model: gpt-4o
    prompt: "Your prompt here"
```

**Environment Variables:**

- `OPENAI_API_KEY` - Your OpenAI API key (starts with `sk-`)

**Available Models:**

- `gpt-4o` - Most capable model
- `gpt-4o-mini` - Fast and cost-effective
- `o1` - Advanced reasoning model
- `gpt-4-turbo` - Previous generation flagship

---

### Anthropic

```yaml
- uses: juspay/neurolink@v1
  with:
    anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    provider: anthropic
    model: claude-sonnet-4-20250514
    prompt: "Your prompt here"
```

**Environment Variables:**

- `ANTHROPIC_API_KEY` - Your Anthropic API key (starts with `sk-ant-`)

**Available Models:**

- `claude-sonnet-4-20250514` - Best overall performance
- `claude-3-5-haiku` - Fast and efficient
- `claude-opus-4-20250514` - Maximum capability

**Extended Thinking Support:** Anthropic models support extended thinking for deep reasoning tasks.

---

### Google AI Studio

```yaml
- uses: juspay/neurolink@v1
  with:
    google_ai_api_key: ${{ secrets.GOOGLE_AI_API_KEY }}
    provider: google-ai
    model: gemini-2.5-flash
    prompt: "Your prompt here"
```

**Environment Variables:**

- `GOOGLE_AI_API_KEY` - Your Google AI Studio API key

**Available Models:**

- `gemini-2.5-pro` - Most capable Gemini model
- `gemini-2.5-flash` - Fast and cost-effective
- `gemini-2.0-flash` - Previous generation

**Free Tier:** Google AI Studio offers a generous free tier (1M tokens/day).

---

### Google Vertex AI

```yaml
- uses: juspay/neurolink@v1
  with:
    google_vertex_project: ${{ secrets.GCP_PROJECT_ID }}
    google_vertex_location: us-central1
    google_application_credentials: ${{ secrets.GCP_CREDENTIALS_BASE64 }}
    provider: vertex
    model: gemini-2.5-flash
    prompt: "Your prompt here"
```

**Environment Variables:**

- `GOOGLE_VERTEX_PROJECT` - Your GCP project ID
- `GOOGLE_VERTEX_LOCATION` - GCP region (default: `us-central1`)
- `GOOGLE_APPLICATION_CREDENTIALS` - Base64-encoded service account JSON

**Setup Service Account:**

```bash
# Create service account
gcloud iam service-accounts create neurolink-action

# Grant permissions
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:neurolink-action@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# Create key and base64 encode
gcloud iam service-accounts keys create key.json \
  --iam-account=neurolink-action@PROJECT_ID.iam.gserviceaccount.com
cat key.json | base64 > key_base64.txt
```

---

### Amazon Bedrock

```yaml
- uses: juspay/neurolink@v1
  with:
    aws_access_key_id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws_region: us-east-1
    bedrock_model_id: anthropic.claude-3-5-sonnet-20241022-v2:0
    provider: bedrock
    prompt: "Your prompt here"
```

**Environment Variables:**

- `AWS_ACCESS_KEY_ID` - AWS access key
- `AWS_SECRET_ACCESS_KEY` - AWS secret key
- `AWS_REGION` - AWS region (default: `us-east-1`)
- `AWS_SESSION_TOKEN` - Optional session token for temporary credentials

**Available Models:**

- `anthropic.claude-3-5-sonnet-20241022-v2:0` - Claude on Bedrock
- `amazon.titan-text-express-v1` - Amazon Titan
- `amazon.nova-pro-v1:0` - Amazon Nova

**OIDC Authentication (Recommended):**

For better security, use GitHub OIDC instead of static credentials:

```yaml
permissions:
  id-token: write
  contents: read

jobs:
  ai-task:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
          aws-region: us-east-1

      - uses: juspay/neurolink@v1
        with:
          provider: bedrock
          bedrock_model_id: anthropic.claude-3-5-sonnet-20241022-v2:0
          prompt: "Your prompt here"
```

---

### Azure OpenAI

```yaml
- uses: juspay/neurolink@v1
  with:
    azure_openai_api_key: ${{ secrets.AZURE_OPENAI_API_KEY }}
    azure_openai_endpoint: ${{ secrets.AZURE_OPENAI_ENDPOINT }}
    azure_openai_deployment: gpt-4o
    provider: azure
    prompt: "Your prompt here"
```

**Environment Variables:**

- `AZURE_OPENAI_API_KEY` - Azure OpenAI API key
- `AZURE_OPENAI_ENDPOINT` - Azure OpenAI endpoint URL (e.g., `https://your-resource.openai.azure.com`)
- `AZURE_OPENAI_DEPLOYMENT` - Deployment name

---

### Mistral

```yaml
- uses: juspay/neurolink@v1
  with:
    mistral_api_key: ${{ secrets.MISTRAL_API_KEY }}
    provider: mistral
    model: mistral-large-latest
    prompt: "Your prompt here"
```

**Environment Variables:**

- `MISTRAL_API_KEY` - Your Mistral API key

**Available Models:**

- `mistral-large-latest` - Most capable
- `mistral-small-latest` - Cost-effective
- `codestral-latest` - Optimized for code

---

### Hugging Face

```yaml
- uses: juspay/neurolink@v1
  with:
    huggingface_api_key: ${{ secrets.HUGGINGFACE_API_KEY }}
    provider: huggingface
    model: meta-llama/Llama-3.1-8B-Instruct
    prompt: "Your prompt here"
```

**Environment Variables:**

- `HUGGINGFACE_API_KEY` - Your Hugging Face API key (starts with `hf_`)

---

### OpenRouter

```yaml
- uses: juspay/neurolink@v1
  with:
    openrouter_api_key: ${{ secrets.OPENROUTER_API_KEY }}
    provider: openrouter
    model: anthropic/claude-3-5-sonnet
    prompt: "Your prompt here"
```

**Environment Variables:**

- `OPENROUTER_API_KEY` - Your OpenRouter API key

**Benefits:**

- Access to 300+ models through single API
- Pay-per-use pricing
- Automatic failover between providers

---

### LiteLLM

```yaml
- uses: juspay/neurolink@v1
  with:
    litellm_api_key: ${{ secrets.LITELLM_API_KEY }}
    litellm_base_url: https://your-litellm-proxy.com
    provider: litellm
    model: gpt-4
    prompt: "Your prompt here"
```

**Environment Variables:**

- `LITELLM_API_KEY` - Your LiteLLM API key
- `LITELLM_BASE_URL` - Your LiteLLM proxy URL

---

### Amazon SageMaker

```yaml
- uses: juspay/neurolink@v1
  with:
    aws_access_key_id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws_secret_access_key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws_region: us-east-1
    sagemaker_endpoint: your-endpoint-name
    provider: sagemaker
    prompt: "Your prompt here"
```

**Environment Variables:**

- `AWS_ACCESS_KEY_ID` - AWS access key
- `AWS_SECRET_ACCESS_KEY` - AWS secret key
- `AWS_REGION` - AWS region
- `SAGEMAKER_ENDPOINT` - SageMaker endpoint name

---

### OpenAI-Compatible

For self-hosted models (vLLM, Ollama, etc.) that implement the OpenAI API:

```yaml
- uses: juspay/neurolink@v1
  with:
    openai_compatible_api_key: ${{ secrets.CUSTOM_API_KEY }}
    openai_compatible_base_url: https://your-api.com/v1
    provider: openai-compatible
    model: your-model-name
    prompt: "Your prompt here"
```

**Environment Variables:**

- `OPENAI_COMPATIBLE_API_KEY` - API key for your endpoint
- `OPENAI_COMPATIBLE_BASE_URL` - Base URL for the API

---

## Inputs Reference

All inputs are organized by category for easy reference.

### Core Inputs

| Input    | Description                        | Required | Default |
| -------- | ---------------------------------- | -------- | ------- |
| `prompt` | The prompt to send to the AI model | Yes      | -       |

### Provider Selection

| Input      | Description                                                                                                                                                                  | Required | Default          |
| ---------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | ---------------- |
| `provider` | AI provider: `openai`, `anthropic`, `google-ai`, `vertex`, `azure`, `bedrock`, `mistral`, `huggingface`, `openrouter`, `litellm`, `ollama`, `sagemaker`, `openai-compatible` | No       | `auto`           |
| `model`    | Specific model to use                                                                                                                                                        | No       | Provider default |

### API Keys

| Input                       | Description               | Required | Default |
| --------------------------- | ------------------------- | -------- | ------- |
| `openai_api_key`            | OpenAI API key            | No       | -       |
| `anthropic_api_key`         | Anthropic API key         | No       | -       |
| `google_ai_api_key`         | Google AI Studio API key  | No       | -       |
| `azure_openai_api_key`      | Azure OpenAI API key      | No       | -       |
| `mistral_api_key`           | Mistral AI API key        | No       | -       |
| `huggingface_api_key`       | Hugging Face API key      | No       | -       |
| `openrouter_api_key`        | OpenRouter API key        | No       | -       |
| `litellm_api_key`           | LiteLLM API key           | No       | -       |
| `openai_compatible_api_key` | OpenAI-compatible API key | No       | -       |

### AWS Configuration

| Input                   | Description                             | Required | Default     |
| ----------------------- | --------------------------------------- | -------- | ----------- |
| `aws_access_key_id`     | AWS Access Key ID for Bedrock/SageMaker | No       | -           |
| `aws_secret_access_key` | AWS Secret Access Key                   | No       | -           |
| `aws_region`            | AWS Region                              | No       | `us-east-1` |
| `aws_session_token`     | AWS Session Token                       | No       | -           |
| `bedrock_model_id`      | AWS Bedrock model ID                    | No       | -           |
| `sagemaker_endpoint`    | Amazon SageMaker endpoint               | No       | -           |

### Google Cloud Configuration

| Input                            | Description                               | Required | Default       |
| -------------------------------- | ----------------------------------------- | -------- | ------------- |
| `google_vertex_project`          | Google Cloud project ID for Vertex AI     | No       | -             |
| `google_vertex_location`         | Google Cloud location                     | No       | `us-central1` |
| `google_application_credentials` | GCP service account JSON (base64 encoded) | No       | -             |

### Azure Configuration

| Input                     | Description                  | Required | Default |
| ------------------------- | ---------------------------- | -------- | ------- |
| `azure_openai_endpoint`   | Azure OpenAI endpoint URL    | No       | -       |
| `azure_openai_deployment` | Azure OpenAI deployment name | No       | -       |

### LiteLLM/OpenAI-Compatible Configuration

| Input                        | Description                | Required | Default |
| ---------------------------- | -------------------------- | -------- | ------- |
| `litellm_base_url`           | LiteLLM base URL           | No       | -       |
| `openai_compatible_base_url` | OpenAI-compatible base URL | No       | -       |

### Generation Parameters

| Input           | Description                                | Required | Default    |
| --------------- | ------------------------------------------ | -------- | ---------- |
| `temperature`   | Sampling temperature (0.0-2.0)             | No       | `0.7`      |
| `max_tokens`    | Maximum tokens in response                 | No       | `4096`     |
| `system_prompt` | System prompt for context                  | No       | -          |
| `command`       | CLI command: `generate`, `stream`, `batch` | No       | `generate` |

### Multimodal Inputs

| Input         | Description                 | Required | Default |
| ------------- | --------------------------- | -------- | ------- |
| `image_paths` | Comma-separated image paths | No       | -       |
| `pdf_paths`   | Comma-separated PDF paths   | No       | -       |
| `csv_paths`   | Comma-separated CSV paths   | No       | -       |
| `video_paths` | Comma-separated video paths | No       | -       |

### Extended Thinking

| Input              | Description                                        | Required | Default  |
| ------------------ | -------------------------------------------------- | -------- | -------- |
| `thinking_enabled` | Enable extended thinking                           | No       | `false`  |
| `thinking_level`   | Thinking level: `minimal`, `low`, `medium`, `high` | No       | `medium` |
| `thinking_budget`  | Thinking token budget                              | No       | `10000`  |

### Features

| Input               | Description                              | Required | Default |
| ------------------- | ---------------------------------------- | -------- | ------- |
| `enable_analytics`  | Enable usage analytics and cost tracking | No       | `false` |
| `enable_evaluation` | Enable response quality evaluation       | No       | `false` |
| `enable_tools`      | Enable MCP tools                         | No       | `false` |
| `mcp_config_path`   | Path to `.mcp-config.json` file          | No       | -       |

### Output Configuration

| Input           | Description                   | Required | Default |
| --------------- | ----------------------------- | -------- | ------- |
| `output_format` | Output format: `text`, `json` | No       | `text`  |
| `output_file`   | Output file path              | No       | -       |

### GitHub Integration

| Input                     | Description                                      | Required | Default               |
| ------------------------- | ------------------------------------------------ | -------- | --------------------- |
| `post_comment`            | Post AI response as PR/issue comment             | No       | `false`               |
| `update_existing_comment` | Update existing NeuroLink comment instead of new | No       | `true`                |
| `comment_tag`             | HTML comment tag to identify NeuroLink comments  | No       | `neurolink-action`    |
| `github_token`            | GitHub token for PR/issue operations             | No       | `${{ github.token }}` |

### Advanced Options

| Input               | Description                         | Required | Default  |
| ------------------- | ----------------------------------- | -------- | -------- |
| `timeout`           | Request timeout in seconds          | No       | `300`    |
| `debug`             | Enable debug logging                | No       | `false`  |
| `neurolink_version` | NeuroLink CLI version to install    | No       | `latest` |
| `working_directory` | Working directory for CLI execution | No       | `.`      |

---

## Outputs Reference

The action provides the following outputs for use in subsequent steps:

| Output              | Description                                  | Example                              |
| ------------------- | -------------------------------------------- | ------------------------------------ |
| `response`          | AI response text content                     | `"Here is the review..."`            |
| `response_json`     | Full JSON response including metadata        | `{"content": "...", "model": "..."}` |
| `provider`          | Provider that was used                       | `anthropic`                          |
| `model`             | Model that was used                          | `claude-sonnet-4-20250514`           |
| `tokens_used`       | Total tokens consumed                        | `1523`                               |
| `prompt_tokens`     | Input/prompt tokens                          | `423`                                |
| `completion_tokens` | Output/completion tokens                     | `1100`                               |
| `cost`              | Estimated cost in USD (if analytics enabled) | `0.0234`                             |
| `execution_time`    | Execution time in milliseconds               | `2341`                               |
| `evaluation_score`  | Quality score 0-100 (if evaluation enabled)  | `87`                                 |
| `comment_id`        | GitHub comment ID (if post_comment enabled)  | `1234567890`                         |
| `error`             | Error message if execution failed            | `null`                               |

### Using Outputs

```yaml
- name: AI Analysis
  uses: juspay/neurolink@v1
  id: ai
  with:
    openai_api_key: ${{ secrets.OPENAI_API_KEY }}
    prompt: "Analyze this code"
    enable_analytics: true

- name: Use AI Response
  run: |
    echo "Response: ${{ steps.ai.outputs.response }}"
    echo "Tokens: ${{ steps.ai.outputs.tokens_used }}"
    echo "Cost: ${{ steps.ai.outputs.cost }}"
```

---

## Advanced Features

### Multimodal Processing

Process images, PDFs, CSVs, and videos along with text prompts.

#### Image Analysis

```yaml
- uses: actions/checkout@v4

- uses: juspay/neurolink@v1
  with:
    anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    prompt: "Describe what you see in these screenshots"
    image_paths: "screenshots/screen1.png,screenshots/screen2.png"
    provider: anthropic
    model: claude-sonnet-4-20250514
```

#### PDF Processing

```yaml
- uses: juspay/neurolink@v1
  with:
    google_ai_api_key: ${{ secrets.GOOGLE_AI_API_KEY }}
    prompt: "Summarize the key points from this document"
    pdf_paths: "docs/report.pdf"
    provider: google-ai
    model: gemini-2.5-pro
```

#### CSV Analysis

```yaml
- uses: juspay/neurolink@v1
  with:
    openai_api_key: ${{ secrets.OPENAI_API_KEY }}
    prompt: "Analyze trends in this data and provide insights"
    csv_paths: "data/metrics.csv"
    provider: openai
    model: gpt-4o
```

**Provider Multimodal Support:**

| Provider     | Images | PDFs | CSV | Video |
| ------------ | ------ | ---- | --- | ----- |
| Anthropic    | Yes    | Yes  | Yes | No    |
| OpenAI       | Yes    | No   | Yes | No    |
| Google AI    | Yes    | Yes  | Yes | Yes   |
| Vertex AI    | Yes    | Yes  | Yes | Yes   |
| Bedrock      | Yes    | Yes  | Yes | No    |
| Azure OpenAI | Yes    | No   | Yes | No    |

---

### Extended Thinking

Enable deep reasoning for complex tasks. Supported by Anthropic and Google AI/Vertex providers.

```yaml
- uses: juspay/neurolink@v1
  with:
    anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    prompt: |
      Analyze this complex architecture and identify potential
      security vulnerabilities, performance bottlenecks, and
      suggest improvements.
    provider: anthropic
    model: claude-sonnet-4-20250514
    thinking_enabled: true
    thinking_level: high
    thinking_budget: "20000"
```

**Thinking Levels:**

| Level     | Description                  | Token Budget | Use Case            |
| --------- | ---------------------------- | ------------ | ------------------- |
| `minimal` | Quick reasoning              | ~2,000       | Simple analysis     |
| `low`     | Basic analysis               | ~5,000       | Code review         |
| `medium`  | Balanced reasoning (default) | ~10,000      | Architecture review |
| `high`    | Deep comprehensive analysis  | ~20,000      | Security audit      |

---

### Analytics and Cost Tracking

Enable analytics to track usage and estimate costs:

```yaml
- uses: juspay/neurolink@v1
  id: ai
  with:
    openai_api_key: ${{ secrets.OPENAI_API_KEY }}
    prompt: "Generate a comprehensive report"
    enable_analytics: true

- name: Check Usage
  run: |
    echo "Tokens used: ${{ steps.ai.outputs.tokens_used }}"
    echo "Estimated cost: $${{ steps.ai.outputs.cost }}"
```

The job summary will include detailed analytics:

- Token breakdown (prompt vs completion)
- Estimated cost in USD
- Provider and model used
- Execution time

---

### Response Quality Evaluation

Enable evaluation to score response quality (0-100):

```yaml
- uses: juspay/neurolink@v1
  id: ai
  with:
    anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    prompt: "Write unit tests for the authentication module"
    enable_evaluation: true

- name: Check Quality
  run: |
    SCORE="${{ steps.ai.outputs.evaluation_score }}"
    if [ "$SCORE" -lt 70 ]; then
      echo "Warning: Low quality score ($SCORE)"
      exit 1
    fi
```

---

### MCP Tools Integration

Enable MCP tools to extend AI capabilities:

```yaml
- uses: juspay/neurolink@v1
  with:
    anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    prompt: "Search for files containing 'TODO' comments"
    enable_tools: true
    mcp_config_path: ".mcp-config.json"
```

Example `.mcp-config.json`:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
    }
  }
}
```

---

## GitHub Integration

### PR Comments

Post AI responses directly as PR comments:

````yaml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

permissions:
  contents: read
  pull-requests: write

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get PR diff
        id: diff
        run: |
          git diff origin/${{ github.base_ref }}...HEAD > diff.txt
          echo "diff> $GITHUB_OUTPUT
          head -c 50000 diff.txt >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: AI Code Review
        uses: juspay/neurolink@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: |
            Review this pull request diff:

            ```diff
            ${{ steps.diff.outputs.diff }}
            ```
          post_comment: true
          update_existing_comment: true
          comment_tag: "neurolink-review"
````

### Issue Comments

Post AI responses to issues:

```yaml
name: AI Issue Response

on:
  issues:
    types: [opened]

permissions:
  issues: write

jobs:
  respond:
    runs-on: ubuntu-latest
    steps:
      - uses: juspay/neurolink@v1
        with:
          openai_api_key: ${{ secrets.OPENAI_API_KEY }}
          prompt: |
            Provide a helpful response to this issue:

            Title: ${{ github.event.issue.title }}
            Body: ${{ github.event.issue.body }}
          post_comment: true
          github_token: ${{ secrets.GITHUB_TOKEN }}
```

### Comment Update Behavior

When `update_existing_comment: true` (default):

- The action looks for an existing comment with the specified `comment_tag`
- If found, it updates that comment instead of creating a new one
- This prevents comment spam on PRs with multiple pushes

To always create new comments:

```yaml
- uses: juspay/neurolink@v1
  with:
    # ...
    post_comment: true
    update_existing_comment: false
```

### Job Summary

The action automatically writes a detailed summary to the GitHub Actions job summary, including:

- AI response content
- Provider and model used
- Token usage breakdown
- Cost estimate (if analytics enabled)
- Evaluation score (if evaluation enabled)
- Execution time

---

## Example Workflows

Complete workflow examples are available in the repository:

### PR Code Review

See [`src/action/examples/pr-review.yml`](https://github.com/juspay/neurolink/blob/release/src/action/examples/pr-review.yml)

````yaml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

permissions:
  contents: read
  pull-requests: write

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get PR diff
        id: diff
        run: |
          git diff origin/${{ github.base_ref }}...HEAD > diff.txt
          echo "diff> $GITHUB_OUTPUT
          head -c 50000 diff.txt >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: AI Code Review
        uses: juspay/neurolink@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: |
            Review this pull request diff and provide constructive feedback:

            ```diff
            ${{ steps.diff.outputs.diff }}
            ```

            Focus on:
            1. Potential bugs or issues
            2. Code quality improvements
            3. Security concerns
          provider: anthropic
          model: claude-sonnet-4-20250514
          post_comment: true
          enable_analytics: true
````

### Issue Triage

See [`src/action/examples/issue-triage.yml`](https://github.com/juspay/neurolink/blob/release/src/action/examples/issue-triage.yml)

```yaml
name: AI Issue Triage

on:
  issues:
    types: [opened]

permissions:
  issues: write

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - name: Triage Issue
        uses: juspay/neurolink@v1
        id: triage
        with:
          openai_api_key: ${{ secrets.OPENAI_API_KEY }}
          prompt: |
            Analyze this GitHub issue and respond with JSON:

            Title: ${{ github.event.issue.title }}
            Body: ${{ github.event.issue.body }}

            {
              "category": "bug|feature|question|docs",
              "priority": "high|medium|low",
              "labels": ["suggested", "labels"],
              "summary": "one line summary"
            }
          provider: openai
          model: gpt-4o-mini
          output_format: json

      - name: Apply labels
        uses: actions/github-script@v7
        with:
          script: |
            const analysis = JSON.parse('${{ steps.triage.outputs.response }}');
            await github.rest.issues.addLabels({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              labels: analysis.labels
            });
```

### Code Generation

See [`src/action/examples/code-generation.yml`](https://github.com/juspay/neurolink/blob/release/src/action/examples/code-generation.yml)

```yaml
name: AI Code Generation

on:
  workflow_dispatch:
    inputs:
      prompt:
        description: "What to generate"
        required: true

jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Generate Code
        uses: juspay/neurolink@v1
        id: codegen
        with:
          google_ai_api_key: ${{ secrets.GOOGLE_AI_API_KEY }}
          prompt: ${{ inputs.prompt }}
          provider: google-ai
          model: gemini-2.5-pro
          temperature: "0.3"
          enable_evaluation: true
```

### Multi-Provider Fallback

```yaml
name: AI with Fallback

on:
  workflow_dispatch:
    inputs:
      prompt:
        required: true

jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - name: Try Primary Provider
        uses: juspay/neurolink@v1
        id: primary
        continue-on-error: true
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          provider: anthropic
          prompt: ${{ inputs.prompt }}

      - name: Fallback Provider
        if: steps.primary.outcome == 'failure'
        uses: juspay/neurolink@v1
        with:
          openai_api_key: ${{ secrets.OPENAI_API_KEY }}
          provider: openai
          prompt: ${{ inputs.prompt }}
```

---

## Troubleshooting

### Common Issues

#### Authentication Errors

**Symptoms:**

- `Invalid API key`
- `401 Unauthorized`
- `Authentication failed`

**Solutions:**

1. **Verify secret is set correctly:**

   ```yaml
   - run: |
       if [ -z "${{ secrets.OPENAI_API_KEY }}" ]; then
         echo "Secret is not set"
         exit 1
       fi
   ```

2. **Check key format:**
   - OpenAI keys start with `sk-`
   - Anthropic keys start with `sk-ant-`
   - Google AI keys are alphanumeric

3. **Ensure secret name matches exactly:**

   ```yaml
   # Correct
   openai_api_key: ${{ secrets.OPENAI_API_KEY }}

   # Wrong (different case)
   openai_api_key: ${{ secrets.openai_api_key }}
   ```

---

#### Rate Limiting

**Symptoms:**

- `429 Too Many Requests`
- `Rate limit exceeded`

**Solutions:**

1. **Add delays between requests:**

   ```yaml
   - uses: juspay/neurolink@v1
     with:
       # ...

   - run: sleep 5

   - uses: juspay/neurolink@v1
     with:
       # ...
   ```

2. **Use different providers for parallel jobs:**

   ```yaml
   jobs:
     review-1:
       uses: juspay/neurolink@v1
       with:
         provider: anthropic
         # ...

     review-2:
       uses: juspay/neurolink@v1
       with:
         provider: openai
         # ...
   ```

---

#### Timeout Errors

**Symptoms:**

- `Request timeout`
- Action runs for full timeout then fails

**Solutions:**

1. **Increase timeout:**

   ```yaml
   - uses: juspay/neurolink@v1
     with:
       timeout: "600" # 10 minutes
       # ...
   ```

2. **Reduce prompt size:**

   ```yaml
   - name: Truncate diff
     run: |
       head -c 30000 diff.txt > diff_truncated.txt
   ```

3. **Use faster model:**
   ```yaml
   - uses: juspay/neurolink@v1
     with:
       model: gpt-4o-mini # Faster than gpt-4o
       # ...
   ```

---

#### Comment Posting Fails

**Symptoms:**

- `Resource not accessible by integration`
- `403 Forbidden` on comment creation

**Solutions:**

1. **Check permissions:**

   ```yaml
   permissions:
     contents: read
     pull-requests: write # Required for PR comments
     issues: write # Required for issue comments
   ```

2. **Use explicit token:**

   ```yaml
   - uses: juspay/neurolink@v1
     with:
       github_token: ${{ secrets.GITHUB_TOKEN }}
       post_comment: true
       # ...
   ```

3. **For organization repos, check token permissions in Actions settings**

---

#### Empty or Truncated Response

**Symptoms:**

- Response is cut off
- Empty `response` output

**Solutions:**

1. **Increase max_tokens:**

   ```yaml
   - uses: juspay/neurolink@v1
     with:
       max_tokens: "8192"
       # ...
   ```

2. **Check for content filtering:**
   Some providers may filter certain content. Try a different provider or rephrase the prompt.

3. **Enable debug logging:**
   ```yaml
   - uses: juspay/neurolink@v1
     with:
       debug: true
       # ...
   ```

---

### Debug Mode

Enable debug mode for detailed logging:

```yaml
- uses: juspay/neurolink@v1
  with:
    debug: true
    # ...
```

Debug output includes:

- Full request/response payloads (with secrets masked)
- Provider selection logic
- Token counting details
- Error stack traces

---

### Getting Help

If you encounter issues:

1. **Check the [Troubleshooting Guide](/docs/reference/troubleshooting)** for common issues
2. **Enable debug mode** to get detailed logs
3. **Search existing issues** on GitHub
4. **Open a new issue** with:
   - Workflow file (with secrets redacted)
   - Debug logs
   - Error message
   - Expected vs actual behavior

---

## Security Best Practices

### API Key Management

1. **Always use GitHub Secrets** - Never hardcode API keys
2. **Use environment-specific secrets** - Separate keys for staging/production
3. **Rotate keys regularly** - Update secrets periodically
4. **Limit key permissions** - Use keys with minimal required scope

### Credential Masking

All API keys are automatically masked in logs. The action ensures:

- Keys are never printed to stdout
- Keys are masked in debug output
- Keys are not exposed in job summaries

### OIDC for Cloud Providers

For AWS and GCP, prefer OIDC authentication over static credentials:

```yaml
# AWS OIDC
- uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
    aws-region: us-east-1

# GCP OIDC
- uses: google-github-actions/auth@v2
  with:
    workload_identity_provider: projects/123456789/locations/global/workloadIdentityPools/github/providers/github
    service_account: neurolink@project.iam.gserviceaccount.com
```

### Workflow Permissions

Use minimal permissions in your workflows:

```yaml
permissions:
  contents: read # Only if you need to checkout code
  pull-requests: write # Only if posting PR comments
  issues: write # Only if posting issue comments
```

---

## See Also

- [Provider Selection Guide](/docs/reference/provider-selection) - Choose the best provider for your use case
- [Troubleshooting Guide](/docs/reference/troubleshooting) - Diagnose and resolve issues
- [SDK API Reference](/docs/sdk/api-reference) - Full SDK documentation
- [CLI Reference](/docs/cli/commands) - CLI command documentation
- [MCP Server Catalog](/docs/guides/mcp/server-catalog) - Available MCP tools

---

## License

MIT - See [LICENSE](https://github.com/juspay/neurolink/blob/release/LICENSE)

---

## SvelteKit Integration Guide

<!-- Source: guides/frameworks/sveltekit.md -->

# SvelteKit Integration Guide

**Build modern AI applications with SvelteKit and NeuroLink**

## Quick Start

### 1. Create SvelteKit Project

```bash
npm create svelte@latest my-ai-app
cd my-ai-app
npm install
npm install @juspay/neurolink
```

### 2. Add Environment Variables

```bash
# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=AIza...
```

### 3. Create NeuroLink Instance

```typescript
// src/lib/ai.ts

export const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      config: { apiKey: OPENAI_API_KEY },
    },
    {
      name: "anthropic",
      config: { apiKey: ANTHROPIC_API_KEY },
    },
  ],
});
```

### 4. Create Page with Server Load

```typescript
// src/routes/+page.server.ts

export const load: PageServerLoad = async () => {
  const result = await ai.generate({
    input: { text: "Explain SvelteKit in one sentence" },
    provider: "openai",
    model: "gpt-4o-mini",
  });

  return {
    aiResponse: result.content,
    tokens: result.usage.totalTokens,
    cost: result.cost,
  };
};
```

```svelte

  import type { PageData } from './$types';

  export let data: PageData;

  AI Response

    {data.aiResponse}


    Tokens: {data.tokens} | Cost: ${data.cost.toFixed(4)}


```

---

## Server Load Functions

### Basic Load Function

```typescript
// src/routes/summary/+page.server.ts

export const load: PageServerLoad = async ({ url }) => {
  const text = url.searchParams.get("text");

  if (!text) {
    throw error(400, "Text parameter is required");
  }

  const result = await ai.generate({
    input: { text: `Summarize: ${text}` },
    provider: "openai",
    model: "gpt-4o-mini",
  });

  return {
    summary: result.content,
    usage: result.usage,
  };
};
```

```svelte

  export let data;

  Summary
  {data.summary}

```

### Load with Error Handling

```typescript
// src/routes/analyze/+page.server.ts

export const load: PageServerLoad = async ({ url, locals }) => {
  // Check authentication
  if (!locals.user) {
    throw redirect(307, "/login");
  }

  const query = url.searchParams.get("query");

  if (!query) {
    throw error(400, "Query parameter is required");
  }

  try {
    const result = await ai.generate({
      input: { text: query },
      provider: "anthropic",
      model: "claude-3-5-sonnet-20241022",
    });

    return {
      analysis: result.content,
      usage: result.usage,
      cost: result.cost,
    };
  } catch (err: any) {
    console.error("AI Error:", err);
    throw error(503, "AI service temporarily unavailable");
  }
};
```

---

## Form Actions

### Basic Form Action

```typescript
// src/routes/generate/+page.server.ts

export const load: PageServerLoad = async () => {
  return {};
};

export const actions: Actions = {
  generate: async ({ request }) => {
    const data = await request.formData();
    const prompt = data.get("prompt") as string;

    if (!prompt) {
      return fail(400, { error: "Prompt is required" });
    }

    try {
      const result = await ai.generate({
        input: { text: prompt },
        provider: "openai",
        model: "gpt-4o-mini",
      });

      return {
        success: true,
        content: result.content,
        usage: result.usage,
        cost: result.cost,
      };
    } catch (error: any) {
      return fail(500, { error: error.message });
    }
  },
};
```

```svelte

  import { enhance } from '$app/forms';
  import type { ActionData } from './$types';

  export let form: ActionData;

  AI Text Generator


      Generate


  {#if form?.error}

      {form.error}

  {/if}

  {#if form?.success}

      Result:
      {form.content}

        Tokens: {form.usage.totalTokens} | Cost: ${form.cost.toFixed(4)}


  {/if}

```

### Multiple Form Actions

```typescript
// src/routes/ai-tools/+page.server.ts

export const actions: Actions = {
  summarize: async ({ request }) => {
    const data = await request.formData();
    const text = data.get("text") as string;

    const result = await ai.generate({
      input: { text: `Summarize: ${text}` },
      provider: "openai",
      model: "gpt-4o-mini",
    });

    return { summary: result.content };
  },

  translate: async ({ request }) => {
    const data = await request.formData();
    const text = data.get("text") as string;
    const language = data.get("language") as string;

    const result = await ai.generate({
      input: { text: `Translate to ${language}: ${text}` },
      provider: "google-ai",
      model: "gemini-2.0-flash",
    });

    return { translation: result.content };
  },

  analyze: async ({ request }) => {
    const data = await request.formData();
    const text = data.get("text") as string;

    const result = await ai.generate({
      input: { text: `Analyze: ${text}` },
      provider: "anthropic",
      model: "claude-3-5-sonnet-20241022",
    });

    return { analysis: result.content };
  },
};
```

---

## API Routes

### Basic API Endpoint

```typescript
// src/routes/api/generate/+server.ts

export const POST: RequestHandler = async ({ request }) => {
  try {
    const {
      prompt,
      provider = "openai",
      model = "gpt-4o-mini",
    } = await request.json();

    if (!prompt) {
      throw error(400, "Prompt is required");
    }

    const result = await ai.generate({
      input: { text: prompt },
      provider,
      model,
    });

    return json({
      content: result.content,
      usage: result.usage,
      cost: result.cost,
      provider: result.provider,
    });
  } catch (err: any) {
    console.error("AI Error:", err);
    throw error(500, err.message);
  }
};
```

### Streaming API Endpoint

```typescript
// src/routes/api/stream/+server.ts

export const POST: RequestHandler = async ({ request }) => {
  const { prompt } = await request.json();

  const stream = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of ai.stream({
          input: { text: prompt },
          provider: "openai",
          model: "gpt-4o-mini",
        })) {
          const data = `data: ${JSON.stringify({ content: chunk.content })}\n\n`;
          controller.enqueue(new TextEncoder().encode(data));
        }

        controller.enqueue(new TextEncoder().encode("data: [DONE]\n\n"));
        controller.close();
      } catch (error: any) {
        controller.error(error);
      }
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
};
```

### Client-Side Streaming Consumer

```svelte

  let prompt = '';
  let response = '';
  let loading = false;

  async function handleSubmit() {
    loading = true;
    response = '';

    try {
      const res = await fetch('/api/stream', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ prompt })
      });

      const reader = res.body?.getReader();
      const decoder = new TextDecoder();

      while (true) {
        const { done, value } = await reader!.read();
        if (done) break;

        const chunk = decoder.decode(value);
        const lines = chunk.split('\n');

        for (const line of lines) {
          if (line.startsWith('data: ')) {
            const data = line.slice(6);
            if (data === '[DONE]') break;

            try {
              const parsed = JSON.parse(data);
              response += parsed.content;
            } catch (e) {
              // Skip invalid JSON
            }
          }
        }
      }
    } catch (error) {
      console.error(error);
    } finally {
      loading = false;
    }
  }

  Streaming Chat


      {loading ? 'Streaming...' : 'Send'}


  {#if response}

      Response:
      {response}

  {/if}

```

---

## Authentication with Hooks

### Server Hooks

```typescript
// src/hooks.server.ts

export const handle: Handle = async ({ event, resolve }) => {
  // Get token from cookie or header
  const token =
    event.cookies.get("session") ||
    event.request.headers.get("authorization")?.replace("Bearer ", "");

  if (token) {
    try {
      const decoded = jwt.verify(token, process.env.JWT_SECRET!);
      event.locals.user = decoded;
    } catch (err) {
      // Invalid token
      event.locals.user = null;
    }
  }

  return resolve(event);
};
```

### Protected Route

```typescript
// src/routes/dashboard/+page.server.ts

export const load: PageServerLoad = async ({ locals }) => {
  if (!locals.user) {
    throw redirect(307, "/login");
  }

  return {
    user: locals.user,
  };
};
```

### Login Form Action

```typescript
// src/routes/login/+page.server.ts

export const actions: Actions = {
  default: async ({ request, cookies }) => {
    const data = await request.formData();
    const username = data.get("username") as string;
    const password = data.get("password") as string;

    // Verify credentials (example)
    if (username === "admin" && password === "password") {
      const token = jwt.sign(
        { userId: "123", username },
        process.env.JWT_SECRET!,
        { expiresIn: "24h" },
      );

      cookies.set("session", token, {
        path: "/",
        httpOnly: true,
        secure: process.env.NODE_ENV === "production",
        sameSite: "strict",
        maxAge: 60 * 60 * 24, // 24 hours
      });

      throw redirect(303, "/dashboard");
    }

    return fail(401, { error: "Invalid credentials" });
  },
};
```

---

## Production Patterns

### Pattern 1: Chat Application

```typescript
// src/routes/chat/+page.server.ts

type Message = {
  role: "user" | "assistant";
  content: string;
};

export const actions: Actions = {
  send: async ({ request }) => {
    const data = await request.formData();
    const message = data.get("message") as string;
    const history = JSON.parse(
      (data.get("history") as string) || "[]",
    ) as Message[];

    if (!message) {
      return fail(400, { error: "Message is required" });
    }

    // Build conversation context
    const prompt = [
      ...history.map(
        (m) => `${m.role === "user" ? "User" : "Assistant"}: ${m.content}`,
      ),
      `User: ${message}`,
      "Assistant:",
    ].join("\n");

    const result = await ai.generate({
      input: { text: prompt },
      provider: "anthropic",
      model: "claude-3-5-sonnet-20241022",
      maxTokens: 500,
    });

    return {
      success: true,
      response: result.content,
    };
  },
};
```

```svelte

  import { enhance } from '$app/forms';

  type Message = {
    role: 'user' | 'assistant';
    content: string;
  };

  let messages: Message[] = [];
  let input = '';
  let form: any;

  $: if (form?.success && form?.response) {
    messages = [
      ...messages,
      { role: 'assistant', content: form.response }
    ];
    form = null;
  }

  function handleSubmit() {
    if (!input.trim()) return;

    messages = [...messages, { role: 'user', content: input }];
    input = '';
  }


    {#each messages as msg}


          {msg.content}


    {/each}


        Send


```

### Pattern 2: Usage Analytics

```typescript
// src/lib/analytics.ts

export async function trackUsage(data: {
  userId: string;
  provider: string;
  model: string;
  tokens: number;
  cost: number;
}) {
  await db.insert("ai_usage", {
    user_id: data.userId,
    provider: data.provider,
    model: data.model,
    tokens: data.tokens,
    cost: data.cost,
    timestamp: new Date(),
  });
}

export async function getUserStats(userId: string) {
  const stats = await db.query(
    `SELECT
      COUNT(*) as request_count,
      SUM(tokens) as total_tokens,
      SUM(cost) as total_cost
    FROM ai_usage
    WHERE user_id = ?`,
    [userId],
  );

  return stats[0];
}
```

```typescript
// src/routes/api/generate/+server.ts

export const POST: RequestHandler = async ({ request, locals }) => {
  const { prompt } = await request.json();

  const result = await ai.generate({
    input: { text: prompt },
    provider: "openai",
    model: "gpt-4o-mini",
    enableAnalytics: true,
  });

  // Track usage
  if (locals.user) {
    await trackUsage({
      userId: locals.user.userId,
      provider: result.provider,
      model: result.model,
      tokens: result.usage.totalTokens,
      cost: result.cost,
    });
  }

  return json({ content: result.content });
};
```

---

## Best Practices

### 1. ✅ Use Load Functions for Server-Side Rendering

```typescript
// ✅ Good: Load on server
export const load: PageServerLoad = async () => {
  const result = await ai.generate({
    /* ... */
  });
  return { aiResponse: result.content };
};
```

### 2. ✅ Use Form Actions for Mutations

```typescript
// ✅ Good: Form action with progressive enhancement
export const actions: Actions = {
  generate: async ({ request }) => {
    const data = await request.formData();
    // ... AI generation
  },
};
```

### 3. ✅ Protect Sensitive Routes

```typescript
// ✅ Good: Check authentication
export const load: PageServerLoad = async ({ locals }) => {
  if (!locals.user) {
    throw redirect(307, "/login");
  }
  // ... load data
};
```

### 4. ✅ Handle Errors Gracefully

```typescript
// ✅ Good: Proper error handling
try {
  const result = await ai.generate({
    /* ... */
  });
  return { result };
} catch (err) {
  console.error(err);
  throw error(503, "AI service unavailable");
}
```

### 5. ✅ Use Streaming for Long Responses

```typescript
// ✅ Good: Stream for better UX
export const POST: RequestHandler = async ({ request }) => {
  const stream = await ai.stream({
    /* ... */
  });
  return new Response(stream);
};
```

---

## Deployment

### Vercel Deployment

```bash
# Install adapter
npm install -D @sveltejs/adapter-vercel

# Build
npm run build

# Deploy
vercel
```

```typescript
// svelte.config.js

export default {
  kit: {
    adapter: adapter(),
  },
};
```

### Environment Variables (Production)

```bash
# Set in Vercel dashboard or CLI
vercel env add OPENAI_API_KEY
vercel env add ANTHROPIC_API_KEY
vercel env add JWT_SECRET
```

---

## Related Documentation

- **[API Reference](/docs/sdk/api-reference)** - NeuroLink SDK
- **[Streaming Guide](/docs/advanced/streaming)** - Real-time responses
- **[Compliance Guide](/docs/guides/enterprise/compliance)** - Security and authentication
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce costs
- **[Fastify Integration](/docs/sdk/framework-integration)** - High-performance Node.js framework with schema validation

---

## Additional Resources

- **[SvelteKit Documentation](https://kit.svelte.dev/)** - Official SvelteKit docs
- **[Svelte Tutorial](https://svelte.dev/tutorial)** - Learn Svelte
- **[SvelteKit Examples](https://github.com/sveltejs/kit/tree/master/examples)** - Example apps

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Load Balancing Strategies

<!-- Source: guides/enterprise/load-balancing.md -->

# Load Balancing Guide

**Distribute AI requests across multiple providers, API keys, and regions for optimal performance**

## Quick Start

### Basic Round-Robin Load Balancing

```typescript

const ai = new NeuroLink({
  providers: [
    {
      name: "openai-key-1",
      config: { apiKey: process.env.OPENAI_KEY_1 },
    },
    {
      name: "openai-key-2",
      config: { apiKey: process.env.OPENAI_KEY_2 },
    },
    {
      name: "openai-key-3",
      config: { apiKey: process.env.OPENAI_KEY_3 },
    },
  ],
  loadBalancing: "round-robin",
});

// Requests distributed evenly:
// Request 1 → openai-key-1
// Request 2 → openai-key-2
// Request 3 → openai-key-3
// Request 4 → openai-key-1 (cycles back)

for (let i = 0; i  req.userId, // Hash on user ID
});

// Same user always routed to same provider
// user123 → always provider-2
// user456 → always provider-1
```

**Best for:**

- Session affinity
- Conversation continuity
- Caching optimization

**Example: User-Based Routing**

```typescript
const result = await ai.generate({
  input: { text: "Your prompt" },
  metadata: { userId: "user-123" }, // Always routes to same provider
});
```

### 6. Random

Randomly select provider.

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "provider-1" },
    { name: "provider-2" },
    { name: "provider-3" },
  ],
  loadBalancing: "random",
});

// Randomly selects any provider
// Good for simple load distribution
```

**Best for:**

- Testing/development
- Stateless requests
- Equal provider capacity

---

## Multi-Key Load Balancing

### Managing Rate Limits

Distribute across multiple API keys to increase throughput.

```typescript
// OpenAI: 500 RPM per key → 2500 RPM total with 5 keys
const ai = new NeuroLink({
  providers: [
    { name: "openai-1", config: { apiKey: process.env.OPENAI_KEY_1 } },
    { name: "openai-2", config: { apiKey: process.env.OPENAI_KEY_2 } },
    { name: "openai-3", config: { apiKey: process.env.OPENAI_KEY_3 } },
    { name: "openai-4", config: { apiKey: process.env.OPENAI_KEY_4 } },
    { name: "openai-5", config: { apiKey: process.env.OPENAI_KEY_5 } },
  ],
  loadBalancing: "round-robin",
  rateLimit: {
    requestsPerMinute: 500, // Per key limit
    strategy: "distributed", // Enforce across all keys
  },
});

// Total capacity: 2,500 RPM (5 keys × 500 RPM)
```

### Quota Management

Track usage across multiple keys.

```typescript
class QuotaManager {
  private usage = new Map();

  canUseProvider(providerName: string): boolean {
    const quota = this.usage.get(providerName);
    if (!quota) return true;

    const now = Date.now();

    // Reset if new minute
    if (now - quota.minuteStart > 60000) {
      quota.requestsThisMinute = 0;
      quota.tokensThisMinute = 0;
      quota.minuteStart = now;
      return true;
    }

    // Check limits (OpenAI Tier 1: 500 RPM, 30K TPM)
    return quota.requestsThisMinute  {
      // Select first provider below quota
      return (
        providers.find((p) => quotaManager.canUseProvider(p.name)) ||
        providers[0]
      );
    },
  },
  onSuccess: (result) => {
    quotaManager.recordUsage(result.provider, result.usage.totalTokens);
  },
});
```

---

## Multi-Provider Load Balancing

### Cross-Provider Distribution

Balance across different AI providers.

```typescript
const ai = new NeuroLink({
  providers: [
    // 50% OpenAI
    { name: "openai", weight: 5, config: { apiKey: process.env.OPENAI_KEY } },

    // 30% Anthropic
    {
      name: "anthropic",
      weight: 3,
      config: { apiKey: process.env.ANTHROPIC_KEY },
    },

    // 20% Google AI
    {
      name: "google-ai",
      weight: 2,
      config: { apiKey: process.env.GOOGLE_AI_KEY },
    },
  ],
  loadBalancing: "weighted-round-robin",
});

// Distribution: 50% OpenAI, 30% Anthropic, 20% Google AI
```

### A/B Testing

Compare provider performance.

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      weight: 1,
      config: { apiKey: process.env.OPENAI_KEY },
      tags: ["experiment-a"],
    },
    {
      name: "anthropic",
      weight: 1,
      config: { apiKey: process.env.ANTHROPIC_KEY },
      tags: ["experiment-b"],
    },
  ],
  loadBalancing: "weighted-round-robin",
  onSuccess: (result) => {
    // Track metrics for each variant
    analytics.track("ai_request", {
      provider: result.provider,
      experiment: result.tags[0],
      latency: result.latency,
      tokens: result.usage.totalTokens,
      cost: result.cost,
    });
  },
});

// After collecting data, analyze which performs better
```

---

## Geographic Load Balancing

### Multi-Region Setup

Route users to nearest provider.

```typescript
const ai = new NeuroLink({
  providers: [
    // US East
    {
      name: "openai-us-east",
      region: "us-east-1",
      priority: 1,
      condition: (req) => req.userRegion === "us-east",
    },

    // US West
    {
      name: "openai-us-west",
      region: "us-west-2",
      priority: 1,
      condition: (req) => req.userRegion === "us-west",
    },

    // Europe
    {
      name: "mistral-eu",
      region: "eu-west-1",
      priority: 1,
      condition: (req) => req.userRegion === "eu",
    },

    // Asia Pacific
    {
      name: "vertex-asia",
      region: "asia-southeast1",
      priority: 1,
      condition: (req) => req.userRegion === "asia",
    },
  ],
  loadBalancing: "latency-based",
});

// Usage
const result = await ai.generate({
  input: { text: "Your prompt" },
  metadata: {
    userRegion: detectRegion(req.ip), // us-east, us-west, eu, asia
  },
});
```

### Latency-Optimized Routing

```typescript
// Track provider latencies
class LatencyTracker {
  private latencies = new Map();

  recordLatency(provider: string, latency: number) {
    if (!this.latencies.has(provider)) {
      this.latencies.set(provider, []);
    }

    const arr = this.latencies.get(provider)!;
    arr.push(latency);

    // Keep last 100 measurements
    if (arr.length > 100) {
      arr.shift();
    }
  }

  getAverageLatency(provider: string): number {
    const arr = this.latencies.get(provider) || [];
    if (arr.length === 0) return Infinity;

    return arr.reduce((a, b) => a + b, 0) / arr.length;
  }

  getFastestProvider(providers: string[]): string {
    let fastest = providers[0];
    let lowestLatency = this.getAverageLatency(fastest);

    for (const provider of providers) {
      const latency = this.getAverageLatency(provider);
      if (latency  {
      const fastest = tracker.getFastestProvider(providers.map((p) => p.name));
      return providers.find((p) => p.name === fastest)!;
    },
  },
  onSuccess: (result) => {
    tracker.recordLatency(result.provider, result.latency);
  },
});
```

---

## Advanced Patterns

### Pattern 1: Tiered Load Balancing

Combine multiple strategies across tiers.

```typescript
const ai = new NeuroLink({
  providers: [
    // Tier 1: Free tier (round-robin within tier)
    { name: "google-ai-1", tier: 1, cost: 0 },
    { name: "google-ai-2", tier: 1, cost: 0 },
    { name: "google-ai-3", tier: 1, cost: 0 },

    // Tier 2: Cheap paid (round-robin within tier)
    { name: "openai-mini-1", tier: 2, cost: 0.15 },
    { name: "openai-mini-2", tier: 2, cost: 0.15 },

    // Tier 3: Premium (only when needed)
    { name: "anthropic-claude", tier: 3, cost: 3.0 },
  ],
  loadBalancing: {
    strategy: "tiered",
    tierStrategy: "round-robin", // Within each tier
    tierFallback: true, // Fall through tiers on failure
  },
});
```

### Pattern 2: Cost-Optimized Balancing

Balance based on cost and quota.

```typescript
async function costOptimizedSelect(
  providers: Provider[],
  req: Request,
): Promise {
  // Sort by cost (cheapest first)
  const sorted = providers.sort((a, b) => a.cost - b.cost);

  // Try each provider in cost order
  for (const provider of sorted) {
    // Check if provider has quota available
    if (await hasQuotaAvailable(provider)) {
      return provider;
    }
  }

  // All cheap providers exhausted, use expensive fallback
  return sorted[sorted.length - 1];
}

const ai = new NeuroLink({
  providers: [
    { name: "google-ai", cost: 0 }, // Free tier
    { name: "openai-mini", cost: 0.15 }, // Cheap paid
    { name: "gpt-4", cost: 3.0 }, // Premium
  ],
  loadBalancing: {
    strategy: "custom",
    selector: costOptimizedSelect,
  },
});
```

### Pattern 3: Request-Type Based Routing

Route based on request characteristics.

```typescript
const ai = new NeuroLink({
  providers: [
    // Fast, cheap model for simple queries
    {
      name: "gemini-flash",
      condition: (req) => req.complexity === "low",
      model: "gemini-2.0-flash",
    },

    // Balanced for medium complexity
    {
      name: "gpt-4o-mini",
      condition: (req) => req.complexity === "medium",
      model: "gpt-4o-mini",
    },

    // Premium for complex queries
    {
      name: "claude-sonnet",
      condition: (req) => req.complexity === "high",
      model: "claude-3-5-sonnet-20241022",
    },
  ],
});

// Usage
const simpleResult = await ai.generate({
  input: { text: "What is 2+2?" },
  metadata: { complexity: "low" }, // Routes to gemini-flash
});

const complexResult = await ai.generate({
  input: { text: "Analyze this complex business scenario..." },
  metadata: { complexity: "high" }, // Routes to claude-sonnet
});
```

---

## Monitoring and Metrics

### Load Distribution Dashboard

```typescript
class LoadBalancerMetrics {
  private stats = new Map();

  recordRequest(provider: string, latency: number, error: boolean) {
    if (!this.stats.has(provider)) {
      this.stats.set(provider, {
        requests: 0,
        errors: 0,
        totalLatency: 0,
        lastUsed: Date.now(),
      });
    }

    const stat = this.stats.get(provider)!;
    stat.requests++;
    stat.totalLatency += latency;
    stat.lastUsed = Date.now();

    if (error) {
      stat.errors++;
    }
  }

  getStats() {
    const total = Array.from(this.stats.values()).reduce(
      (sum, stat) => sum + stat.requests,
      0,
    );

    return Array.from(this.stats.entries()).map(([provider, stat]) => ({
      provider,
      requests: stat.requests,
      percentage: (stat.requests / total) * 100,
      errorRate: (stat.errors / stat.requests) * 100,
      avgLatency: stat.totalLatency / stat.requests,
      lastUsed: new Date(stat.lastUsed).toISOString(),
    }));
  }
}

// Usage
const metrics = new LoadBalancerMetrics();

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  onSuccess: (result) => {
    metrics.recordRequest(result.provider, result.latency, false);
  },
  onError: (error, provider) => {
    metrics.recordRequest(provider, 0, true);
  },
});

// View dashboard
console.table(metrics.getStats());
/*
┌─────────┬──────────────┬──────────┬────────────┬───────────┬─────────┬──────────────────────────┐
│ (index) │   provider   │ requests │ percentage │ errorRate │ avgLat  │        lastUsed          │
├─────────┼──────────────┼──────────┼────────────┼───────────┼─────────┼──────────────────────────┤
│    0    │  'openai-1'  │   342    │   34.2     │    0.29   │  125ms  │ 2025-01-15T10:30:45.123Z │
│    1    │  'openai-2'  │   338    │   33.8     │    0.00   │  118ms  │ 2025-01-15T10:30:46.456Z │
│    2    │  'openai-3'  │   320    │   32.0     │    0.31   │  132ms  │ 2025-01-15T10:30:44.789Z │
└─────────┴──────────────┴──────────┴────────────┴───────────┴─────────┴──────────────────────────┘
*/
```

---

## Best Practices

### 1. ✅ Use Weighted Balancing for Migrations

```typescript
// ✅ Good: Gradual migration from OpenAI to Anthropic
const ai = new NeuroLink({
  providers: [
    { name: "openai", weight: 7 }, // 70% (gradually decrease)
    { name: "anthropic", weight: 3 }, // 30% (gradually increase)
  ],
  loadBalancing: "weighted-round-robin",
});

// Week 1: 70/30 split
// Week 2: 50/50 split
// Week 3: 30/70 split
// Week 4: 0/100 split (fully migrated)
```

### 2. ✅ Monitor Distribution Fairness

```typescript
// ✅ Good: Alert if distribution becomes uneven
const expectedDistribution = {
  "provider-1": 33.3,
  "provider-2": 33.3,
  "provider-3": 33.3,
};

setInterval(() => {
  const stats = metrics.getStats();

  for (const stat of stats) {
    const expected = expectedDistribution[stat.provider];
    const deviation = Math.abs(stat.percentage - expected);

    if (deviation > 10) {
      // >10% deviation
      alerting.sendAlert(
        `Uneven distribution: ${stat.provider} at ${stat.percentage}% (expected ${expected}%)`,
      );
    }
  }
}, 60000); // Check every minute
```

### 3. ✅ Use Health Checks with Load Balancing

```typescript
// ✅ Good: Don't route to unhealthy providers
const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  loadBalancing: "round-robin",
  healthCheck: {
    enabled: true,
    interval: 30000,
    excludeUnhealthy: true, // Skip unhealthy providers
  },
});
```

### 4. ✅ Implement Circuit Breakers

```typescript
// ✅ Good: Prevent cascading failures
const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  loadBalancing: "round-robin",
  circuitBreaker: {
    enabled: true,
    failureThreshold: 5,
    resetTimeout: 60000,
  },
});
```

### 5. ✅ Test Load Distribution

```typescript
// ✅ Good: Verify even distribution in tests
describe("Load Balancing", () => {
  it("should distribute requests evenly", async () => {
    const usage = new Map();

    for (let i = 0; i < 300; i++) {
      const result = await ai.generate({
        input: { text: `Request ${i}` },
      });

      usage.set(result.provider, (usage.get(result.provider) || 0) + 1);
    }

    // Each provider should get ~100 requests (±10%)
    for (const [provider, count] of usage.entries()) {
      expect(count).toBeGreaterThan(90);
      expect(count).toBeLessThan(110);
    }
  });
});
```

---

## Related Documentation

- **[Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover)** - Automatic failover
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce AI costs
- **[Provider Setup](/docs/getting-started/provider-setup)** - Provider configuration
- **[Monitoring Guide](/docs/observability/health-monitoring)** - Observability and metrics

---

## Additional Resources

- **[NeuroLink GitHub](https://github.com/juspay/neurolink)** - Source code
- **[GitHub Discussions](https://github.com/juspay/neurolink/discussions)** - Community support
- **[Issues](https://github.com/juspay/neurolink/issues)** - Report bugs

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Migration Guide (v7.40 → v7.47)

<!-- Source: guides/migration-guide.md -->

# Migration Guide (v7.40 → v7.47)

Use this guide when upgrading existing NeuroLink deployments to the 7.47 release train. The focus is on new capabilities (multimodal chat, auto evaluation, loop mode, orchestration) and the configuration changes required to adopt them safely.

## Compatibility Summary

| Area          | Status                                                                                 |
| ------------- | -------------------------------------------------------------------------------------- |
| Core SDK APIs | ✅ Backward compatible. `generate()` and `stream()` signatures are unchanged.          |
| CLI commands  | ✅ Existing scripts continue to work. New options are opt-in.                          |
| Configuration | ⚠️ New environment variables for evaluation and regional routing. Review `.env` files. |
| Tooling       | ✅ MCP, analytics, and telemetry remain compatible.                                    |

## Recommended Upgrade Steps

1. **Update dependencies**

   ```bash
   npm install @juspay/neurolink@^7.47.0
   # or
   pnpm add @juspay/neurolink@^7.47.0
   ```

2. **Refresh CLI binaries**

   ```bash
   npm install -g @juspay/neurolink@^7.47.0
   neurolink --version
   ```

3. **Review new environment variables**
   - Add `NEUROLINK_EVALUATION_PROVIDER`, `NEUROLINK_EVALUATION_MODEL`, and `NEUROLINK_EVALUATION_THRESHOLD` if you enable the auto-evaluation engine.
   - Ensure `AWS_REGION` / `GOOGLE_VERTEX_LOCATION` are set when targeting specific regions.
   - Provide `REDIS_URL` if you want loop sessions to auto-mount persistent memory.

4. **Adopt multimodal support**
   - CLI: use `--image` (multiple allowed) with `generate` or `stream`.
   - SDK: pass `input.images` (`string` path, HTTPS URL, or `Buffer`).
   - Update downstream parsing to handle `result.toolCalls` on multimodal calls.

5. **Leverage auto evaluation (optional)**
   - CLI: add `--enableEvaluation` to commands or set it once inside `neurolink loop` (`set enableEvaluation true`).
   - SDK: include `enableEvaluation: true` per request.
   - Capture `result.evaluation` in logs or dashboards.

6. **Introduce loop sessions to teams**
   - Document the new `loop` workflow, especially how to `set provider`, `set model`, and export transcripts.
   - Configure Redis for persistent memory where collaboration spans multiple terminals.

7. **Enable orchestration (server workloads)**
   - Instantiate `new NeuroLink({ enableOrchestration: true })` for services that benefit from automatic provider routing.
   - Monitor debug logs (`NEUROLINK_DEBUG=true`) in staging before enabling in production.

## Behaviour Changes to Note

- **Evaluation output** – `GenerateResult` now includes `toolCalls`, `toolResults`, and richer `analytics`. Update any custom serializers accordingly.
- **Loop session variables** – The new session state respects `set`/`unset` commands. Scripts that previously relied on global env variables should be adjusted to set session variables explicitly.
- **Redis auto-detect** – Starting a loop with `--auto-redis` sets `STORAGE_TYPE=redis` automatically. Ensure Redis credentials are valid; otherwise disable with `--no-auto-redis`.
- **Regional routing** – Requests that include `region` now forward directly to the provider. Validate quota and model availability per region to avoid 404s.

## Testing Checklist

- Run `npx @juspay/neurolink status --verbose` after upgrading credentials.
- Execute a multimodal CLI call (`generate --image`) to confirm file uploads succeed.
- Run a sample with `--enableEvaluation --format json` and verify the evaluation block is emitted.
- Stress-test loop mode with Redis by running `memory stats` and `memory history`.
- If orchestration is enabled, tail logs for `Orchestration route determined` messages and confirm provider availability.

## Rollback Plan

- Keep the previous CLI binary (`npm install -g @juspay/neurolink@`) handy.
- Maintain separate `.env` files for pre- and post-upgrade configurations.
- Disable orchestration and evaluation env vars if you encounter regressions; core generation continues to work without them.

For additional support open an issue on GitHub or reach out via the Juspay developer channels.

---

## Monitoring & Observability Guide

<!-- Source: guides/enterprise/monitoring.md -->

# Monitoring & Observability Guide

**Comprehensive monitoring for AI applications with Prometheus, Grafana, and cloud-native tools**

## Quick Start

### 1. Setup Prometheus

```bash
# Docker Compose setup
cat > docker-compose.yml  {
    aiRequestsActive.inc({ provider: req.provider });
  },
  onSuccess: (result) => {
    // Record request
    aiRequestsTotal.inc({
      provider: result.provider,
      model: result.model,
      status: "success",
    });

    // Record latency
    aiRequestDuration.observe(
      { provider: result.provider, model: result.model },
      result.latency / 1000, // Convert ms to seconds
    );

    // Record tokens
    aiTokensUsed.inc(
      { provider: result.provider, model: result.model, type: "input" },
      result.usage.promptTokens,
    );
    aiTokensUsed.inc(
      { provider: result.provider, model: result.model, type: "output" },
      result.usage.completionTokens,
    );

    // Record cost
    aiCostTotal.inc(
      { provider: result.provider, model: result.model },
      result.cost,
    );

    // Decrement active
    aiRequestsActive.dec({ provider: result.provider });
  },
  onError: (error, provider, model) => {
    // Record error
    aiErrorsTotal.inc({
      provider,
      model: model || "unknown",
      error_type: error.message.includes("rate limit")
        ? "rate_limit"
        : error.message.includes("timeout")
          ? "timeout"
          : "other",
    });

    // Record failed request
    aiRequestsTotal.inc({
      provider,
      model: model || "unknown",
      status: "error",
    });

    // Decrement active
    aiRequestsActive.dec({ provider });
  },
});

// Metrics endpoint
app.get("/metrics", async (req, res) => {
  res.setHeader("Content-Type", register.contentType);
  res.send(await register.metrics());
});
```

---

## Grafana Dashboards

### Create Dashboard

```json
{
  "dashboard": {
    "title": "NeuroLink Monitoring",
    "panels": [
      {
        "title": "Requests Per Second",
        "targets": [
          {
            "expr": "rate(ai_requests_total[5m])",
            "legendFormat": "{{provider}} - {{model}}"
          }
        ],
        "type": "graph"
      },
      {
        "title": "Average Latency",
        "targets": [
          {
            "expr": "rate(ai_request_duration_seconds_sum[5m]) / rate(ai_request_duration_seconds_count[5m])",
            "legendFormat": "{{provider}} - {{model}}"
          }
        ],
        "type": "graph"
      },
      {
        "title": "Error Rate",
        "targets": [
          {
            "expr": "rate(ai_errors_total[5m])",
            "legendFormat": "{{provider}} - {{error_type}}"
          }
        ],
        "type": "graph"
      },
      {
        "title": "Hourly Cost",
        "targets": [
          {
            "expr": "rate(ai_cost_total_usd[1h]) * 3600",
            "legendFormat": "{{provider}}"
          }
        ],
        "type": "graph"
      },
      {
        "title": "Token Usage",
        "targets": [
          {
            "expr": "rate(ai_tokens_used_total[5m])",
            "legendFormat": "{{provider}} - {{type}}"
          }
        ],
        "type": "graph"
      }
    ]
  }
}
```

### Key Dashboard Panels

**1. Request Rate**

```promql
rate(ai_requests_total[5m])
```

**2. P95 Latency**

```promql
histogram_quantile(0.95, rate(ai_request_duration_seconds_bucket[5m]))
```

**3. Success Rate**

```promql
sum(rate(ai_requests_total{status="success"}[5m])) / sum(rate(ai_requests_total[5m])) * 100
```

**4. Cost Per Hour**

```promql
rate(ai_cost_total_usd[1h]) * 3600
```

**5. Tokens Per Request**

```promql
rate(ai_tokens_used_total[5m]) / rate(ai_requests_total[5m])
```

---

## Cloud-Native Monitoring

### AWS CloudWatch

```typescript

const cloudwatch = new CloudWatch({ region: "us-east-1" });

async function publishMetrics(result: any) {
  await cloudwatch.putMetricData({
    Namespace: "NeuroLink/AI",
    MetricData: [
      {
        MetricName: "Requests",
        Value: 1,
        Unit: "Count",
        Dimensions: [
          { Name: "Provider", Value: result.provider },
          { Name: "Model", Value: result.model },
        ],
        Timestamp: new Date(),
      },
      {
        MetricName: "Latency",
        Value: result.latency,
        Unit: "Milliseconds",
        Dimensions: [{ Name: "Provider", Value: result.provider }],
        Timestamp: new Date(),
      },
      {
        MetricName: "TokensUsed",
        Value: result.usage.totalTokens,
        Unit: "Count",
        Dimensions: [
          { Name: "Provider", Value: result.provider },
          { Name: "Model", Value: result.model },
        ],
        Timestamp: new Date(),
      },
      {
        MetricName: "Cost",
        Value: result.cost,
        Unit: "None",
        Dimensions: [{ Name: "Provider", Value: result.provider }],
        Timestamp: new Date(),
      },
    ],
  });
}

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  onSuccess: async (result) => {
    await publishMetrics(result);
  },
});
```

### Azure Application Insights

```typescript

const appInsights = new ApplicationInsights({
  connectionString: process.env.APPLICATIONINSIGHTS_CONNECTION_STRING,
});

appInsights.start();

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  onSuccess: (result) => {
    appInsights.trackEvent({
      name: "AI_Request",
      properties: {
        provider: result.provider,
        model: result.model,
        tokens: result.usage.totalTokens,
        cost: result.cost,
      },
      measurements: {
        latency: result.latency,
        tokensUsed: result.usage.totalTokens,
        cost: result.cost,
      },
    });

    appInsights.trackMetric({
      name: "AI_Latency",
      value: result.latency,
      properties: { provider: result.provider },
    });
  },
  onError: (error, provider) => {
    appInsights.trackException({
      exception: error,
      properties: { provider },
    });
  },
});
```

### Google Cloud Operations

```typescript

const logging = new Logging();
const log = logging.log("neurolink-requests");

const metrics = new MetricServiceClient();

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  onSuccess: async (result) => {
    // Log to Cloud Logging
    await log.write(
      log.entry(
        {
          resource: { type: "global" },
          severity: "INFO",
        },
        {
          event: "ai_request",
          provider: result.provider,
          model: result.model,
          tokens: result.usage.totalTokens,
          latency: result.latency,
          cost: result.cost,
        },
      ),
    );

    // Send to Cloud Monitoring
    await metrics.createTimeSeries({
      name: metrics.projectPath(process.env.GCP_PROJECT_ID!),
      timeSeries: [
        {
          metric: {
            type: "custom.googleapis.com/neurolink/latency",
            labels: { provider: result.provider },
          },
          resource: { type: "global" },
          points: [
            {
              interval: { endTime: { seconds: Date.now() / 1000 } },
              value: { doubleValue: result.latency },
            },
          ],
        },
      ],
    });
  },
});
```

---

## Alerting

### Prometheus Alerts

```yaml
# alerts.yml
groups:
  - name: neurolink_alerts
    interval: 30s
    rules:
      # High error rate
      - alert: HighAIErrorRate
        expr: rate(ai_errors_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High AI error rate detected"
          description: "Error rate is {{ $value }} errors/sec for {{ $labels.provider }}"

      # High latency
      - alert: HighAILatency
        expr: histogram_quantile(0.95, rate(ai_request_duration_seconds_bucket[5m])) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High AI latency detected"
          description: "P95 latency is {{ $value }}s for {{ $labels.provider }}"

      # High cost
      - alert: HighAICost
        expr: rate(ai_cost_total_usd[1h]) * 3600 > 100
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "High AI costs detected"
          description: "Hourly cost is ${{ $value }}"

      # Provider down
      - alert: AIProviderDown
        expr: up{job="neurolink-api"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "AI provider is down"
          description: "{{ $labels.instance }} has been down for 2 minutes"
```

### Alertmanager Configuration

```yaml
# alertmanager.yml
global:
  slack_api_url: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"

route:
  group_by: ["alertname", "provider"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: "slack-notifications"

receivers:
  - name: "slack-notifications"
    slack_configs:
      - channel: "#ai-alerts"
        title: "{{ .GroupLabels.alertname }}"
        text: "{{ range .Alerts }}{{ .Annotations.description }}{{ end }}"

  - name: "pagerduty"
    pagerduty_configs:
      - service_key: "YOUR_PAGERDUTY_KEY"
```

---

## Custom Monitoring Dashboards

### Real-Time Cost Dashboard

```typescript
class CostDashboard {
  private costs = new Map();
  private hourlySnapshot: number[] = [];

  recordCost(provider: string, cost: number) {
    const current = this.costs.get(provider) || 0;
    this.costs.set(provider, current + cost);
  }

  takeHourlySnapshot() {
    const total = Array.from(this.costs.values()).reduce(
      (sum, cost) => sum + cost,
      0,
    );

    this.hourlySnapshot.push(total);

    // Keep last 24 hours
    if (this.hourlySnapshot.length > 24) {
      this.hourlySnapshot.shift();
    }
  }

  getDashboardData() {
    return {
      totalToday: Array.from(this.costs.values()).reduce(
        (sum, cost) => sum + cost,
        0,
      ),
      byProvider: Object.fromEntries(this.costs),
      hourlyTrend: this.hourlySnapshot,
      projectedMonthly: this.hourlySnapshot.reduce((a, b) => a + b, 0) * 30,
    };
  }
}

// Usage
const dashboard = new CostDashboard();

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  onSuccess: (result) => {
    dashboard.recordCost(result.provider, result.cost);
  },
});

// Snapshot every hour
setInterval(() => dashboard.takeHourlySnapshot(), 3600000);

// API endpoint
app.get("/dashboard/costs", (req, res) => {
  res.json(dashboard.getDashboardData());
});
```

---

## Best Practices

### 1. ✅ Track All Key Metrics

```typescript
// ✅ Good: Comprehensive tracking
onSuccess: (result) => {
  metrics.recordLatency(result.latency);
  metrics.recordTokens(result.usage.totalTokens);
  metrics.recordCost(result.cost);
  metrics.recordProvider(result.provider);
};
```

### 2. ✅ Set Up Alerts

```yaml
# ✅ Good: Proactive alerting
- alert: HighCosts
  expr: rate(ai_cost_total_usd[1h]) * 3600 > 100
```

### 3. ✅ Use Histograms for Latency

```typescript
// ✅ Good: Percentile tracking
const latencyHistogram = new Histogram({
  buckets: [0.1, 0.5, 1, 2, 5, 10, 30],
});
```

### 4. ✅ Monitor Error Rates

```typescript
// ✅ Good: Error categorization
aiErrorsTotal.inc({
  provider,
  error_type: categorizeError(error),
});
```

### 5. ✅ Dashboard for Stakeholders

```typescript
// ✅ Good: Business-friendly dashboard
app.get("/dashboard/summary", (req, res) => {
  res.json({
    requestsToday: getRequestCount(),
    costToday: getTotalCost(),
    avgLatency: getAvgLatency(),
    errorRate: getErrorRate(),
  });
});
```

---

## Related Documentation

**Feature Guides:**

- **[Auto Evaluation](/docs/features/auto-evaluation)** - Automated quality scoring and metrics export
- **[Provider Orchestration](/docs/features/provider-orchestration)** - Intelligent routing decisions to monitor
- **[Redis Conversation Export](/docs/features/conversation-history)** - Export session data for analysis

**Enterprise Guides:**

- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce AI costs
- **[Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover)** - High availability
- **[Audit Trails](/docs/guides/enterprise/audit-trails)** - Compliance logging
- **[Compliance](/docs/guides/enterprise/compliance)** - Security and compliance

---

## Additional Resources

- **[Prometheus Docs](https://prometheus.io/docs/)** - Prometheus documentation
- **[Grafana Docs](https://grafana.com/docs/)** - Grafana documentation
- **[CloudWatch Docs](https://docs.aws.amazon.com/cloudwatch/)** - AWS CloudWatch
- **[Application Insights](https://docs.microsoft.com/azure/azure-monitor/)** - Azure monitoring

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Provider Selection Wizard

<!-- Source: guides/provider-selection.md -->

# Provider Selection Wizard

**Last Updated:** January 1, 2026
**NeuroLink Version:** 8.29.0

Interactive guide to help you select the perfect AI provider for your specific needs. This wizard considers your requirements, constraints, and priorities to recommend the optimal provider configuration.

## Detailed Provider Decision Tree

### Step 1: Define Your Primary Goal

```
What's the MOST important factor for your project?

 Cost Optimization → Go to Section A
 Privacy & Security → Go to Section B
⚡ Performance & Quality → Go to Section C
 Document Processing → Go to Section D
 Advanced Reasoning → Go to Section E
 Enterprise Features → Go to Section F
 Experimentation → Go to Section G
```

---

## Section A: Cost Optimization

### Scenario A1: Zero Budget (Completely Free)

**Best Choice: Google AI Studio**

- FREE tier: 1M tokens/day
- Professional quality (Gemini 2.5 Flash)
- Extended thinking support
- PDF processing included

**Setup:**

```bash
GOOGLE_AI_API_KEY=your_api_key
GOOGLE_AI_MODEL=gemini-2.5-flash
```

```typescript
const result = await neurolink.generate({
  provider: "google-ai",
  prompt: "Your task",
});
```

**Alternative: Ollama**

- Completely FREE (local execution)
- No API key needed
- Privacy-first
- Requires local GPU

---

### Scenario A2: Limited Budget ($50-$200/month)

**Best Choice: Mistral**

- Competitive pricing ($0.20/$0.60 per 1M tokens for Small)
- Good quality
- GDPR compliant

**Cost Example:**

- 10M input tokens/month: $2.00
- 10M output tokens/month: $6.00
- **Total: $8/month**

**Setup:**

```bash
MISTRAL_API_KEY=your_api_key
MISTRAL_MODEL=mistral-small-2506
```

**Alternative: Google Vertex**

- Gemini 2.5 Flash: $0.35/$1.05 per 1M tokens
- Extended thinking
- PDF support

---

### Scenario A3: Cost Optimization with Multiple Models

**Best Choice: OpenRouter**

- Access to FREE models (Gemini 2.0 Flash, Llama 3.3 70B)
- Pay only when you need premium models
- Cost tracking built-in

**Setup:**

```bash
OPENROUTER_API_KEY=your_api_key
```

```typescript
// Use free model for simple tasks
const simpleResult = await neurolink.generate({
  provider: "openrouter",
  model: "google/gemini-2.0-flash-exp:free",
  prompt: "Simple task",
});

// Use premium model for complex tasks
const complexResult = await neurolink.generate({
  provider: "openrouter",
  model: "anthropic/claude-3-5-sonnet",
  prompt: "Complex analysis",
});
```

---

## Section B: Privacy & Security

### Scenario B1: Maximum Privacy (No Cloud)

**Best Choice: Ollama**

- 100% local execution
- No data sent to any server
- Works offline
- HIPAA/GDPR compliant by design

**Setup:**

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull model
ollama pull llama3.1:8b

# Optional configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b
```

**Recommended Models:**

- `llama3.1:8b` - Fast, general purpose
- `llama3.1:70b` - Higher quality (needs more RAM)
- `gemma3:9b` - Google's lightweight model

**Hardware Requirements:**

- Minimum: 8GB RAM, CPU only (slower)
- Recommended: 16GB+ RAM, NVIDIA GPU
- Optimal: 32GB+ RAM, RTX 3090/4090

---

### Scenario B2: Cloud with GDPR Compliance

**Best Choice: Mistral**

- European data centers
- GDPR compliant
- No training on user data
- Open-source models available

**Compliance Features:**

- Data stored in EU
- GDPR data processing agreement
- Right to deletion
- Data portability

---

### Scenario B3: Enterprise Security (HIPAA + SOC2)

**Best Choices:**

**Option 1: Azure OpenAI**

- Microsoft enterprise security
- HIPAA BAA available
- SOC2 certified
- Enterprise SLAs

**Option 2: Amazon Bedrock**

- AWS security features
- HIPAA BAA available
- SOC2 certified
- Audit logging

**Option 3: Google Vertex**

- GCP security
- HIPAA BAA available
- SOC2 certified
- Data residency controls

---

## Section C: Performance & Quality

### Scenario C1: Highest Quality (No Compromises)

**Best Choice: Anthropic Claude 4.5 Sonnet**

- Best reasoning capabilities
- Extended thinking
- 200K context window
- Native PDF support

**Setup:**

```bash
ANTHROPIC_API_KEY=your_api_key
ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
```

```typescript
const result = await neurolink.generate({
  provider: "anthropic",
  model: "claude-sonnet-4-5-20250929",
  prompt: "Complex reasoning task",
  thinkingLevel: "high",
});
```

**When to Use:**

- Critical customer-facing features
- Complex analysis requiring deep reasoning
- Document-heavy workflows (PDF support)
- Agentic workflows with multi-step tool use

---

### Scenario C2: Best Vision Quality

**Best Choice: Anthropic**

- 20 images per request (highest)
- Excellent vision understanding
- Combined with text reasoning
- PDF processing included

**Code Example:**

```typescript
const result = await neurolink.generate({
  provider: "anthropic",
  input: {
    text: "Analyze these medical images",
    images: ["/path/to/scan1.jpg", "/path/to/scan2.jpg", "/path/to/scan3.jpg"],
  },
});
```

**Alternative: OpenAI GPT-4o**

- Industry-leading vision
- 10 images per request
- Fast inference
- Good for general vision tasks

---

### Scenario C3: Fastest Response Time

**Best Choice: Ollama (Local)**

- 50-200ms time to first token
- No network latency
- Streaming immediately available

**Alternative: Google AI Studio**

- 300-700ms TTFT
- FREE tier
- Professional quality

---

## Section D: Document Processing

### Scenario D1: PDF-Heavy Workflows

**Best Choice: Anthropic**

- Native PDF understanding
- No preprocessing required
- Extracts text, tables, structure
- Visual analysis of PDF pages

**Setup:**

```typescript
const result = await neurolink.generate({
  provider: "anthropic",
  input: {
    text: "Analyze this contract",
    pdfFiles: ["/path/to/contract.pdf"],
  },
  thinkingLevel: "high",
});
```

**Alternative: Google AI Studio**

- PDF support (Gemini models)
- FREE tier
- Extended thinking
- Good for budget-conscious teams

---

### Scenario D2: Mixed Documents (PDF + Images + Text)

**Best Choice: Anthropic**

- Handles all formats natively
- Up to 20 images + PDFs
- Unified analysis

**Code Example:**

```typescript
const result = await neurolink.generate({
  provider: "anthropic",
  input: {
    text: "Compare these documents",
    images: ["/path/to/diagram1.png", "/path/to/chart.jpg"],
    pdfFiles: ["/path/to/report.pdf", "/path/to/analysis.pdf"],
  },
});
```

---

## Section E: Advanced Reasoning

### Scenario E1: Extended Thinking Required

**Best Choice: Anthropic**

- Native extended thinking (best)
- Transparent reasoning process
- Configurable thinking levels
- Deep analysis capabilities

**Setup:**

```typescript
const result = await neurolink.generate({
  provider: "anthropic",
  model: "claude-sonnet-4-5-20250929",
  prompt: "Solve this complex problem: ...",
  thinkingLevel: "high", // minimal | low | medium | high
});
```

**Cost Impact:**

- Extended thinking increases token usage
- High level: 2-3x more tokens
- Medium level: 1.5-2x more tokens
- Worth it for complex tasks

**Alternative: Google AI Studio**

- Gemini 2.5+, Gemini 3 thinking
- FREE tier available
- Good for budget teams

---

### Scenario E2: Multi-Step Tool Use (Agentic Workflows)

**Best Choice: Anthropic**

- Advanced tool use
- Parallel tool execution
- Tool result caching
- Best for agentic patterns

**Code Example:**

```typescript
const neurolink = new NeuroLink({
  provider: "anthropic",
  model: "claude-sonnet-4-5-20250929",
});

// Register tools
neurolink.registerTool({
  name: "search_database",
  description: "Search customer database",
  parameters: z.object({
    query: z.string(),
  }),
  execute: async ({ query }) => {
    // Implementation
    return results;
  },
});

neurolink.registerTool({
  name: "send_email",
  description: "Send email to customer",
  parameters: z.object({
    to: z.string(),
    subject: z.string(),
    body: z.string(),
  }),
  execute: async ({ to, subject, body }) => {
    // Implementation
    return { sent: true };
  },
});

// Claude will automatically use tools in sequence
const result = await neurolink.generate({
  prompt: "Find customer John Doe and send him a follow-up email",
  maxSteps: 10, // Allow multi-step tool use
});
```

---

## Section F: Enterprise Features

### Scenario F1: AWS-Based Enterprise

**Best Choice: Amazon Bedrock**

- Seamless AWS integration
- IAM-based authentication
- VPC endpoints available
- CloudWatch logging
- Multiple model providers

**Setup:**

```bash
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
BEDROCK_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
```

**Benefits:**

- Use existing AWS account
- Consolidated billing
- Infrastructure as Code (Terraform/CDK)
- Compliance certifications

---

### Scenario F2: Azure-Based Enterprise

**Best Choice: Azure OpenAI**

- Microsoft ecosystem integration
- Azure AD authentication
- Virtual network integration
- Enterprise support

**Setup:**

```bash
AZURE_OPENAI_API_KEY=your_key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_API_VERSION=2024-05-01-preview
```

**Benefits:**

- Same models as OpenAI
- Microsoft SLAs
- Azure compliance
- Integrated monitoring

---

### Scenario F3: GCP-Based Enterprise

**Best Choice: Google Vertex AI**

- Dual provider (Gemini + Claude)
- GCP integration
- Service account authentication
- Stackdriver logging

**Setup:**

```bash
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
VERTEX_PROJECT_ID=your-project
VERTEX_LOCATION=us-central1
VERTEX_MODEL=gemini-2.5-flash
```

**Benefits:**

- Use both Gemini and Claude
- GCP billing
- Regional deployments
- Vertex AI pipelines

---

## Section G: Experimentation

### Scenario G1: Testing Multiple Models

**Best Choice: LiteLLM**

- Unified proxy for 100+ models
- Cost tracking
- A/B testing support
- Load balancing

**Setup:**

```bash
# Start LiteLLM proxy
litellm --config config.yaml

# Configure NeuroLink
LITELLM_BASE_URL=http://localhost:4000
LITELLM_API_KEY=sk-anything
```

**Config Example:**

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-openai-key

  - model_name: claude
    litellm_params:
      model: anthropic/claude-3-5-sonnet
      api_key: sk-ant-key

  - model_name: gemini
    litellm_params:
      model: vertex_ai/gemini-2.5-flash
      vertex_project: my-project
```

**Usage:**

```typescript
// Test different models easily
const models = [
  "openai/gpt-4o",
  "anthropic/claude-3-5-sonnet",
  "google/gemini-2.5-flash",
];

for (const model of models) {
  const result = await neurolink.generate({
    provider: "litellm",
    model,
    prompt: "Same test prompt",
  });
  console.log(`${model}: ${result.content}`);
}
```

---

### Scenario G2: Research & Open Source Models

**Best Choice: HuggingFace**

- 100,000+ models
- Cutting-edge research models
- Community support
- Free tier available

**Setup:**

```bash
HUGGINGFACE_API_KEY=hf_your_key
HUGGINGFACE_MODEL=meta-llama/Llama-3.1-8B-Instruct
```

**Recommended Research Models:**

- `meta-llama/Llama-3.1-70B-Instruct` - Meta's flagship
- `mistralai/Mistral-7B-Instruct-v0.3` - Mistral open model
- `nvidia/Llama-3.1-Nemotron-Ultra-253B-v1` - NVIDIA enhanced

---

## Real-World Use Case Examples

### Use Case 1: Startup MVP (Budget: $0-100/month)

**Recommendation: Google AI Studio**

**Why:**

- FREE tier (1M tokens/day)
- Professional quality
- Extended thinking
- PDF support
- Easy setup

**Configuration:**

```bash
GOOGLE_AI_API_KEY=your_key
GOOGLE_AI_MODEL=gemini-2.5-flash
```

**Expected Costs:**

- Development: $0/month (free tier)
- Production (low traffic): $0-$50/month
- Scaling strategy: Move to Vertex AI when you outgrow free tier

---

### Use Case 2: Healthcare Application (HIPAA Required)

**Recommendation: Azure OpenAI**

**Why:**

- HIPAA BAA available
- Enterprise security
- Microsoft compliance
- Audit logging

**Setup Checklist:**

1. ✅ Sign Azure HIPAA BAA
2. ✅ Configure Virtual Network
3. ✅ Enable audit logging
4. ✅ Set up Azure AD authentication
5. ✅ Configure data residency

**Configuration:**

```bash
AZURE_OPENAI_API_KEY=your_key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4o
```

---

### Use Case 3: Legal Document Analysis

**Recommendation: Anthropic Claude 4.5 Sonnet**

**Why:**

- Extended thinking (deep analysis)
- Native PDF support
- 200K context window (handle long documents)
- Best reasoning quality

**Configuration:**

```typescript
const neurolink = new NeuroLink({
  provider: "anthropic",
  model: "claude-sonnet-4-5-20250929",
});

const analysis = await neurolink.generate({
  input: {
    text: "Analyze this contract for risks and obligations",
    pdfFiles: ["/path/to/contract.pdf"],
  },
  thinkingLevel: "high",
  maxTokens: 150000, // Use large context
});
```

---

### Use Case 4: Customer Support Chatbot (High Volume)

**Recommendation: OpenRouter with Free Models**

**Why:**

- FREE models for common queries
- Fallback to premium for complex cases
- Cost tracking
- Auto-failover

**Configuration:**

```typescript
async function handleSupportQuery(query: string, complexity: string) {
  if (complexity === "simple") {
    // Use free model
    return await neurolink.generate({
      provider: "openrouter",
      model: "google/gemini-2.0-flash-exp:free",
      prompt: query,
    });
  } else {
    // Use premium model
    return await neurolink.generate({
      provider: "openrouter",
      model: "anthropic/claude-3-5-sonnet",
      prompt: query,
    });
  }
}
```

**Expected Costs:**

- 80% simple queries: $0 (free model)
- 20% complex queries: ~$50/month (premium)
- **Total: $50/month** vs $250/month with all-premium

---

### Use Case 5: Internal Tools (Privacy Sensitive)

**Recommendation: Ollama (Local)**

**Why:**

- 100% private (no cloud)
- No ongoing costs
- Works offline
- Fast response

**Setup:**

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull model
ollama pull llama3.1:70b

# Configure NeuroLink
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:70b
```

**Deployment Options:**

- **Development:** Run on developer machines
- **Staging:** Shared server with GPU
- **Production:** Kubernetes cluster with GPU nodes

---

## Provider Comparison Decision Matrix

### Budget vs Quality Trade-off

```
High Quality
    │
    │  Anthropic Claude 4.5
    │  OpenAI GPT-4o
    │  ↑
    │  │
    │  │  Azure OpenAI
    │  │  Bedrock (Claude)
    │  │  ↑
    │  │  │
    │  │  │  Mistral Large
    │  │  │  Vertex (Gemini Pro)
    │  │  │  ↑
    │  │  │  │
    │  │  │  │  Google AI (Gemini Flash) ← FREE
    │  │  │  │  OpenRouter (free models) ← FREE
    │  │  │  │  ↑
    │  │  │  │  │
    │  │  │  │  │  Ollama ← FREE + Private
    │  │  │  │  │
    └──┴──┴──┴──┴──┴──────> Cost
   Free      $      $$     $$$
```

### Features vs Complexity

```
Many Features
    │
    │  Amazon Bedrock (multi-provider)
    │  OpenRouter (300+ models)
    │  ↑
    │  │
    │  │  Google Vertex (Gemini + Claude)
    │  │  LiteLLM (100+ models)
    │  │  ↑
    │  │  │
    │  │  │  Anthropic (extended thinking + PDF)
    │  │  │  Google AI Studio (thinking + PDF + free)
    │  │  │  ↑
    │  │  │  │
    │  │  │  │  OpenAI (vision + tools)
    │  │  │  │  Azure OpenAI
    │  │  │  │  ↑
    │  │  │  │  │
    │  │  │  │  │  Mistral
    │  │  │  │  │  Ollama
    │  │  │  │  │
    └──┴──┴──┴──┴──┴──────> Setup Complexity
   Easy   Moderate    Complex
```

---

## Common Migration Paths

### Path 1: Prototype → Production

```
Phase 1 (Prototype): Google AI Studio (FREE)
    ↓
Phase 2 (Beta): Mistral (low cost)
    ↓
Phase 3 (Production): Anthropic (high quality)
```

### Path 2: Cloud → Local

```
Phase 1: Cloud Provider (OpenAI, Anthropic)
    ↓
Phase 2: Test Ollama locally
    ↓
Phase 3: Full migration to Ollama (privacy + cost savings)
```

### Path 3: Single → Multi-Provider

```
Phase 1: Single provider (e.g., OpenAI)
    ↓
Phase 2: Add LiteLLM proxy
    ↓
Phase 3: Route to optimal provider per task
```

---

## Quick Reference Cards

### Card 1: "I Need Something Fast"

**Fastest Setup (2 minutes):**

1. Google AI Studio - Just need API key
2. OpenAI - Industry standard
3. Mistral - European option

**Get Started:**

```bash
# Google AI Studio
export GOOGLE_AI_API_KEY=your_key
```

```typescript
const result = await neurolink.generate({
  provider: "google-ai",
  prompt: "Your task",
});
```

---

### Card 2: "I Have No Budget"

**Free Options Ranked:**

1. **Google AI Studio** - Best free option
   - 1M tokens/day FREE
   - Professional quality
   - Extended thinking + PDF

2. **Ollama** - Completely free
   - Local execution
   - Privacy-first
   - Requires GPU

3. **OpenRouter** - Free models available
   - Gemini 2.0 Flash
   - Llama 3.3 70B
   - Many others

---

### Card 3: "I Need Maximum Privacy"

**Privacy-First Options:**

1. **Ollama** (Best) - 100% local
2. **Mistral** - GDPR, EU data centers
3. **Self-hosted OpenAI Compatible** - Full control

**Ollama Setup:**

```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:8b
```

---

### Card 4: "I Need Extended Thinking"

**Only 3 Providers:**

1. **Anthropic** (Best) - Native extended thinking
2. **Google AI Studio** - Gemini 2.5+, 3 (FREE)
3. **Google Vertex** - Same as AI Studio (paid)

**No other providers support extended thinking**

---

## Final Recommendation Algorithm

Answer YES/NO to each question:

1. **Do you have ZERO budget?**
   - YES → Google AI Studio or Ollama
   - NO → Continue

2. **Do you need HIPAA/enterprise compliance?**
   - YES → Azure OpenAI or Bedrock
   - NO → Continue

3. **Do you need extended thinking?**
   - YES → Anthropic (best) or Google AI Studio (free)
   - NO → Continue

4. **Do you need PDF processing?**
   - YES → Anthropic or Google AI Studio
   - NO → Continue

5. **Are you on AWS/Azure/GCP?**
   - AWS → Bedrock
   - Azure → Azure OpenAI
   - GCP → Vertex
   - None → Continue

6. **Do you need maximum privacy?**
   - YES → Ollama (local)
   - NO → Continue

7. **Do you want the absolute best quality?**
   - YES → OpenAI or Anthropic
   - NO → Mistral or Google AI Studio

---

## Still Unsure? Default Recommendations

### For Most Teams

**Start with Google AI Studio**

- FREE tier
- Easy setup
- Professional quality
- Upgrade path to Vertex

### For Enterprises

**Start with your cloud provider's offering**

- AWS → Bedrock
- Azure → Azure OpenAI
- GCP → Vertex

### For Developers

**Start with NeuroLink + LiteLLM**

- Test multiple providers
- Compare results
- Optimize costs
- Make informed decision

---

## Next Steps

1. **Read:** [Provider Comparison Guide](/docs/reference/provider-comparison)
2. **Audit:** [Provider Capabilities](/docs/reference/provider-capabilities-audit)
3. **Setup:** Follow provider-specific setup guide
4. **Test:** Run sample requests with your use case
5. **Monitor:** Track costs and performance
6. **Optimize:** Adjust based on real-world usage

---

## Need Help?

**Contact Options:**

- Documentation: [docs/](/docs/)
- GitHub Issues: Report bugs or ask questions
- Community: Join discussions

**Professional Support:**

- Enterprise consulting available
- Custom provider integration
- Performance optimization
- Migration assistance

---

**Remember:** With NeuroLink, you're never locked into a single provider. You can easily switch or use multiple providers simultaneously. Start with the recommendation above, monitor your usage, and adjust as needed.

---

## Multi-Provider Failover & High Availability

<!-- Source: guides/enterprise/multi-provider-failover.md -->

# Multi-Provider Failover Guide

**Build resilient AI applications with automatic provider failover and redundancy**

## Quick Start

### Basic Failover Configuration

```typescript

const ai = new NeuroLink({
  providers: [
    {
      name: "openai",
      priority: 1, // Primary provider
      config: {
        apiKey: process.env.OPENAI_API_KEY,
      },
    },
    {
      name: "anthropic",
      priority: 2, // Fallback 1
      config: {
        apiKey: process.env.ANTHROPIC_API_KEY,
      },
    },
    {
      name: "google-ai",
      priority: 3, // Fallback 2
      config: {
        apiKey: process.env.GOOGLE_AI_API_KEY,
      },
    },
  ],
});

// Automatically tries OpenAI → Anthropic → Google AI
const result = await ai.generate({
  input: { text: "Hello world!" },
  // No provider specified - uses priority order
});
```

### Test Failover

```typescript
// Simulate OpenAI failure
const result = await ai.generate({
  input: { text: "Test failover" },
});

console.log("Used provider:", result.provider); // Will show fallback provider
console.log("Attempts:", result.metadata.attempts);
console.log("Failed providers:", result.metadata.failedProviders);
```

---

## Failover Strategies

### 1. Priority-Based Failover (Recommended)

Try providers in priority order until one succeeds.

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "openai", priority: 1 }, // Try first
    { name: "anthropic", priority: 2 }, // Try second
    { name: "google-ai", priority: 3 }, // Try third
  ],
  failoverConfig: {
    enabled: true,
    maxAttempts: 3, // Try up to 3 providers
    retryDelay: 1000, // Wait 1s between attempts
    exponentialBackoff: true, // 1s, 2s, 4s delays
  },
});
```

### 2. Condition-Based Routing

Route to specific providers based on request conditions.

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "mistral",
      priority: 1, // (1)!
      condition: (req) => req.userRegion === "EU", // (2)!
      config: { apiKey: process.env.MISTRAL_API_KEY },
    },
    {
      name: "openai",
      priority: 1,
      condition: (req) => req.userRegion !== "EU", // (3)!
      config: { apiKey: process.env.OPENAI_API_KEY },
    },
    {
      name: "google-ai",
      priority: 2, // (4)!
      config: { apiKey: process.env.GOOGLE_AI_API_KEY },
    },
  ],
});

// Usage
const result = await ai.generate({
  input: { text: "Your prompt" },
  metadata: { userRegion: "EU" }, // (5)!
});
```

1. **Same priority**: Both Mistral and OpenAI have priority 1, but conditions determine which one is used.
2. **GDPR compliance**: Route EU users to Mistral AI (European provider) for automatic GDPR compliance.
3. **Regional routing**: Non-EU users go to OpenAI. Multiple providers at same priority with mutually exclusive conditions.
4. **Universal fallback**: Google AI (priority 2) has no condition, so it's used if both priority 1 providers fail.
5. **Pass routing metadata**: Include `userRegion` in metadata so conditions can access it for routing decisions.

### 3. Cost-Based Routing

Try cheaper providers first, fallback to premium providers.

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "google-ai",
      priority: 1, // Free tier first
      model: "gemini-2.0-flash",
      condition: (req) => !req.requiresPremium,
    },
    {
      name: "openai",
      priority: 2, // Paid tier fallback
      model: "gpt-4o-mini",
      condition: (req) => !req.requiresPremium,
    },
    {
      name: "anthropic",
      priority: 3, // Premium for complex tasks
      model: "claude-3-5-sonnet-20241022",
    },
  ],
});

// Cheap query (uses Google AI free tier)
const cheap = await ai.generate({
  input: { text: "Simple customer query" },
  metadata: { requiresPremium: false },
});

// Complex query (uses Anthropic)
const premium = await ai.generate({
  input: { text: "Complex business analysis requiring detailed reasoning..." },
  metadata: { requiresPremium: true },
});
```

### 4. Load-Balanced Failover

Combine load balancing with failover.

```typescript
const ai = new NeuroLink({
  providers: [
    // Load balanced primary tier
    {
      name: "openai-1",
      priority: 1,
      config: { apiKey: process.env.OPENAI_KEY_1 },
    },
    {
      name: "openai-2",
      priority: 1,
      config: { apiKey: process.env.OPENAI_KEY_2 },
    },
    {
      name: "openai-3",
      priority: 1,
      config: { apiKey: process.env.OPENAI_KEY_3 },
    },

    // Fallback tier
    { name: "anthropic", priority: 2 },
    { name: "google-ai", priority: 3 },
  ],
  loadBalancing: "round-robin", // Balance across same-priority providers
  failoverConfig: { enabled: true },
});
```

---

## Retry Configuration

### Exponential Backoff

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "openai", priority: 1 },
    { name: "anthropic", priority: 2 },
  ],
  failoverConfig: {
    enabled: true,
    maxAttempts: 5,
    retryDelay: 1000, // Start with 1s
    exponentialBackoff: true, // 1s, 2s, 4s, 8s, 16s
    maxRetryDelay: 30000, // Cap at 30s
  },
});
```

### Selective Retry

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "openai", priority: 1 },
    { name: "anthropic", priority: 2 },
  ],
  failoverConfig: {
    enabled: true,
    retryOn: [
      // (1)!
      "ECONNREFUSED", // Connection errors
      "ETIMEDOUT", // Timeout
      "429", // Rate limit
      "500", // Server errors
      "502", // Bad gateway
      "503", // Service unavailable
      "504", // Gateway timeout
    ],
    doNotRetryOn: [
      // (2)!
      "400", // Bad request (client error)
      "401", // Invalid API key
      "403", // Forbidden
    ],
  },
});
```

1. **Retryable errors**: Transient failures worth retrying. Network errors (ECONNREFUSED, ETIMEDOUT) and server issues (429, 5xx) often resolve on retry.
2. **Non-retryable errors**: Client-side errors that won't be fixed by retrying. Invalid requests (400), authentication failures (401), and authorization issues (403) require code changes.

### Custom Retry Logic

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "openai", priority: 1 },
    { name: "anthropic", priority: 2 },
  ],
  failoverConfig: {
    enabled: true,
    shouldRetry: (error, attempt, provider) => {
      // Custom retry logic
      if (error.message.includes("rate limit")) {
        console.log(`Rate limited on ${provider}, waiting...`);
        return attempt  {
      // Custom delay calculation
      if (error.message.includes("rate limit")) {
        return 5000; // Wait 5s for rate limits
      }
      return Math.pow(2, attempt) * 1000; // Exponential for others
    },
  },
});
```

---

## Provider Health Checks

### Active Health Monitoring

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "openai", priority: 1 },
    { name: "anthropic", priority: 2 },
    { name: "google-ai", priority: 3 },
  ],
  healthCheck: {
    enabled: true,
    interval: 60000, // Check every 60s
    timeout: 5000, // 5s timeout per check
    unhealthyThreshold: 3, // Mark unhealthy after 3 failures
    healthyThreshold: 2, // Mark healthy after 2 successes
  },
});

// Get provider health status
const health = await ai.getProviderHealth();
console.log(health);
/*
{
  openai: { status: 'healthy', latency: 120, lastCheck: '2025-01-15T10:00:00Z' },
  anthropic: { status: 'unhealthy', latency: null, lastCheck: '2025-01-15T10:00:00Z' },
  'google-ai': { status: 'healthy', latency: 95, lastCheck: '2025-01-15T10:00:00Z' }
}
*/

// Only use healthy providers
const result = await ai.generate({
  input: { text: "Your prompt" },
  useOnlyHealthy: true, // Skip anthropic (unhealthy)
});
```

### Circuit Breaker Pattern

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "openai", priority: 1 },
    { name: "anthropic", priority: 2 },
  ],
  circuitBreaker: {
    enabled: true,
    failureThreshold: 5, // Open circuit after 5 failures
    resetTimeout: 60000, // Try again after 60s
    halfOpenRequests: 3, // Test with 3 requests when half-open
  },
});

// Circuit breaker state machine:
// CLOSED (normal) → 5 failures → OPEN (block requests)
// → wait 60s → HALF_OPEN (test with 3 requests)
// → 3 successes → CLOSED | 1 failure → OPEN

// Get circuit state
const state = await ai.getCircuitState("openai");
console.log(state); // 'CLOSED' | 'OPEN' | 'HALF_OPEN'
```

---

## Production Patterns

### Pattern 1: High Availability Setup

```typescript
// Production-ready HA configuration
const ai = new NeuroLink({
  providers: [
    // Tier 1: Load-balanced primary
    { name: "openai-us-east", priority: 1, region: "us-east-1" },
    { name: "openai-us-west", priority: 1, region: "us-west-2" },

    // Tier 2: Alternative provider
    { name: "anthropic-us", priority: 2 },

    // Tier 3: Emergency fallback
    { name: "google-ai", priority: 3 },
  ],
  loadBalancing: "latency-based", // Route to fastest provider
  failoverConfig: {
    enabled: true,
    maxAttempts: 6, // Try all providers
    retryDelay: 500,
    exponentialBackoff: true,
  },
  healthCheck: {
    enabled: true,
    interval: 30000, // Check every 30s
    timeout: 3000,
  },
  circuitBreaker: {
    enabled: true,
    failureThreshold: 10,
    resetTimeout: 120000, // 2 minutes
  },
});
```

### Pattern 2: Cost-Optimized Failover

```typescript
// Free tier first, paid tier fallback
const ai = new NeuroLink({
  providers: [
    {
      name: "google-ai",
      priority: 1,
      model: "gemini-2.0-flash",
      config: { apiKey: process.env.GOOGLE_AI_KEY },
      costPerToken: 0, // Free tier
    },
    {
      name: "openai",
      priority: 2,
      model: "gpt-4o-mini",
      config: { apiKey: process.env.OPENAI_KEY },
      costPerToken: 0.00015,
    },
    {
      name: "anthropic",
      priority: 3,
      model: "claude-3-5-sonnet-20241022",
      config: { apiKey: process.env.ANTHROPIC_KEY },
      costPerToken: 0.003,
    },
  ],
  failoverConfig: {
    enabled: true,
    // Skip rate-limited free tier immediately
    shouldFailover: (error, provider) => {
      if (provider.costPerToken === 0 && error.message.includes("quota")) {
        console.log("Free tier exhausted, failing over to paid tier");
        return true;
      }
      return error.code?.startsWith("E"); // Network errors
    },
  },
});
```

### Pattern 3: Geographic Routing

```typescript
// Route to nearest region with failover
const ai = new NeuroLink({
  providers: [
    // US East
    {
      name: "openai-us-east",
      priority: 1,
      condition: (req) => req.userRegion === "us-east",
    },

    // US West
    {
      name: "openai-us-west",
      priority: 1,
      condition: (req) => req.userRegion === "us-west",
    },

    // Europe
    {
      name: "mistral-eu",
      priority: 1,
      condition: (req) => req.userRegion === "eu",
    },

    // Asia Pacific
    {
      name: "vertex-asia",
      priority: 1,
      condition: (req) => req.userRegion === "asia",
    },

    // Global fallback
    { name: "openai-global", priority: 2 },
  ],
});

// Usage
const result = await ai.generate({
  input: { text: "Your prompt" },
  metadata: {
    userRegion: getUserRegion(req.ip), // Detect from IP
  },
});
```

### Pattern 4: Model-Specific Failover

```typescript
// Different models with same capability
const ai = new NeuroLink({
  providers: [
    // Primary: GPT-4
    {
      name: "openai",
      priority: 1,
      model: "gpt-4o",
      capability: "complex-reasoning",
    },

    // Fallback 1: Claude 3.5 Sonnet (similar capability)
    {
      name: "anthropic",
      priority: 2,
      model: "claude-3-5-sonnet-20241022",
      capability: "complex-reasoning",
    },

    // Fallback 2: Gemini Pro
    {
      name: "google-ai",
      priority: 3,
      model: "gemini-1.5-pro",
      capability: "complex-reasoning",
    },
  ],
  failoverConfig: {
    enabled: true,
    matchCapability: true, // Only failover to same capability
  },
});
```

---

## Monitoring and Metrics

### Track Failover Events

```typescript
const ai = new NeuroLink({
  providers: [
    { name: "openai", priority: 1 },
    { name: "anthropic", priority: 2 },
  ],
  failoverConfig: {
    enabled: true,
    onFailover: (event) => {
      // Log failover event
      console.log({
        timestamp: new Date().toISOString(),
        from: event.failedProvider,
        to: event.successfulProvider,
        error: event.error.message,
        attempts: event.attempts,
        latency: event.totalLatency,
      });

      // Send to monitoring system
      metrics.increment("ai.failover.count", {
        from: event.failedProvider,
        to: event.successfulProvider,
      });
    },
    onSuccess: (event) => {
      // Log successful request
      metrics.histogram("ai.latency", event.latency, {
        provider: event.provider,
        model: event.model,
      });
    },
  },
});
```

### Failover Metrics Dashboard

```typescript
// Track provider reliability
class FailoverMetrics {
  private stats = new Map();

  recordAttempt(provider: string, success: boolean, latency: number) {
    if (!this.stats.has(provider)) {
      this.stats.set(provider, {
        total: 0,
        successes: 0,
        failures: 0,
        totalLatency: 0,
      });
    }

    const stat = this.stats.get(provider);
    stat.total++;
    stat.totalLatency += latency;

    if (success) {
      stat.successes++;
    } else {
      stat.failures++;
    }
  }

  getProviderStats() {
    const stats = [];

    for (const [provider, stat] of this.stats.entries()) {
      stats.push({
        provider,
        total: stat.total,
        successRate: (stat.successes / stat.total) * 100,
        avgLatency: stat.totalLatency / stat.total,
        failureCount: stat.failures,
      });
    }

    return stats.sort((a, b) => b.successRate - a.successRate);
  }
}

// Usage
const metrics = new FailoverMetrics();

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  failoverConfig: {
    enabled: true,
    onSuccess: (event) => {
      metrics.recordAttempt(event.provider, true, event.latency);
    },
    onFailover: (event) => {
      metrics.recordAttempt(event.failedProvider, false, event.latency);
      metrics.recordAttempt(event.successfulProvider, true, event.latency);
    },
  },
});

// View stats
console.log(metrics.getProviderStats());
/*
[
  { provider: 'openai', total: 1000, successRate: 99.5, avgLatency: 120, failureCount: 5 },
  { provider: 'anthropic', total: 50, successRate: 98.0, avgLatency: 150, failureCount: 1 },
  { provider: 'google-ai', total: 10, successRate: 100, avgLatency: 95, failureCount: 0 }
]
*/
```

---

## Best Practices

### 1. ✅ Always Configure Multiple Providers

```typescript
// ❌ Bad: Single provider (no failover)
const ai = new NeuroLink({
  providers: [{ name: "openai" }],
});

// ✅ Good: Multiple providers with failover
const ai = new NeuroLink({
  providers: [
    { name: "openai", priority: 1 },
    { name: "anthropic", priority: 2 },
    { name: "google-ai", priority: 3 },
  ],
  failoverConfig: { enabled: true },
});
```

### 2. ✅ Use Health Checks in Production

```typescript
// ✅ Good: Active health monitoring
const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  healthCheck: {
    enabled: true,
    interval: 60000, // 1 minute
    timeout: 5000, // 5 seconds
  },
});
```

### 3. ✅ Implement Circuit Breakers

```typescript
// ✅ Good: Prevent cascading failures
const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  circuitBreaker: {
    enabled: true,
    failureThreshold: 5,
    resetTimeout: 60000,
  },
});
```

### 4. ✅ Monitor Failover Events

```typescript
// ✅ Good: Track failures for debugging
failoverConfig: {
  enabled: true,
  onFailover: (event) => {
    logger.error('Provider failover', {
      from: event.failedProvider,
      to: event.successfulProvider,
      error: event.error
    });

    // Alert if too many failovers
    if (event.attempts > 3) {
      alerting.sendAlert('Multiple provider failures detected');
    }
  }
}
```

### 5. ✅ Test Failover Regularly

```typescript
// ✅ Good: Test failover in CI/CD
describe("Failover", () => {
  it("should failover when primary provider fails", async () => {
    // Mock OpenAI failure
    mockOpenAI.mockRejectedValue(new Error("503 Service Unavailable"));

    const result = await ai.generate({
      input: { text: "test" },
    });

    // Verify failover occurred
    expect(result.provider).toBe("anthropic");
    expect(result.metadata.attempts).toBe(2);
  });
});
```

---

## Troubleshooting

### Issue 1: Failover Not Triggering

**Problem**: Requests fail without trying fallback providers.

**Solution**:

```typescript
// Ensure failover is enabled
failoverConfig: {
  enabled: true,  // Must be true
  maxAttempts: 3  // Must be > 1
}

// Check provider priorities
providers: [
  { name: 'openai', priority: 1 },  // Different priorities
  { name: 'anthropic', priority: 2 }  // Not same priority
]
```

### Issue 2: Too Many Retry Attempts

**Problem**: Requests take too long due to excessive retries.

**Solution**:

```typescript
// Limit retry attempts
failoverConfig: {
  enabled: true,
  maxAttempts: 3,        // Limit attempts
  retryDelay: 1000,      // Reduce delay
  maxRetryDelay: 5000    // Cap max delay
}
```

### Issue 3: Circuit Breaker Stuck Open

**Problem**: Provider marked as failed even when healthy.

**Solution**:

```typescript
// Adjust circuit breaker settings
circuitBreaker: {
  enabled: true,
  failureThreshold: 10,    // Increase threshold
  resetTimeout: 30000,     // Reduce timeout
  halfOpenRequests: 5      // More test requests
}

// Manually reset circuit
await ai.resetCircuit('openai');
```

---

## Related Documentation

**Feature Guides:**

- **[Provider Orchestration](/docs/features/provider-orchestration)** - Intelligent provider selection and routing
- **[Regional Streaming](/docs/features/regional-streaming)** - Region-specific failover strategies
- **[Auto Evaluation](/docs/features/auto-evaluation)** - Validate failover quality

**Enterprise Guides:**

- **[Load Balancing Guide](/docs/guides/enterprise/load-balancing)** - Distribution strategies
- **[Cost Optimization](/docs/cookbook/cost-optimization)** - Reduce AI costs
- **[Provider Setup](/docs/getting-started/provider-setup)** - Provider configuration
- **[Monitoring Guide](/docs/observability/health-monitoring)** - Observability and metrics

---

## Additional Resources

- **[NeuroLink GitHub](https://github.com/juspay/neurolink)** - Source code
- **[GitHub Discussions](https://github.com/juspay/neurolink/discussions)** - Community support
- **[Issues](https://github.com/juspay/neurolink/issues)** - Report bugs

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Complete Redis Configuration Guide

<!-- Source: guides/redis-configuration.md -->

# Complete Redis Configuration Guide

Comprehensive guide for configuring Redis storage for NeuroLink in all environments from development to enterprise production.

## Table of Contents

- [Architecture Overview](#architecture-overview)
- [Installation Options](#installation-options)
- [Configuration Reference](#configuration-reference)
- [Production Setup](#production-setup)
- [Performance Tuning](#performance-tuning)
- [Security Hardening](#security-hardening)
- [High Availability](#high-availability)
- [Monitoring](#monitoring)
- [NeuroLink Integration](#neurolink-integration)

## Architecture Overview

### Redis Role in NeuroLink

Redis serves as NeuroLink's persistent storage backend for:

- **Conversation Memory**: Multi-turn conversation history with summarization
- **Session Management**: User session data with TTL-based expiration
- **Tool Execution History**: Complete tool call and result tracking
- **Analytics Data**: Real-time metrics and performance data

### Storage Architecture

```
┌─────────────────────────────────────────────┐
│         NeuroLink Application               │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐    │
│  │   SDK   │  │   CLI   │  │  Tools  │    │
│  └────┬────┘  └────┬────┘  └────┬────┘    │
└───────┼───────────┼───────────┼───────────┘
        │           │           │
        └───────────┴───────────┘
                    │
        ┌───────────▼──────────────┐
        │ RedisConversationMemoryManager │
        └───────────┬──────────────┘
                    │
        ┌───────────▼──────────────┐
        │     Redis Storage         │
        │  ┌────────────────────┐  │
        │  │ DB 0: Conversations│  │
        │  │ DB 1: Sessions     │  │
        │  │ DB 2: Analytics    │  │
        │  └────────────────────┘  │
        └──────────────────────────┘
```

## Installation Options

### Standalone Server

#### Ubuntu/Debian

```bash
# Add Redis repository
curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list

# Install Redis
sudo apt update
sudo apt install redis-server

# Configure for production
sudo systemctl enable redis-server
sudo systemctl start redis-server

# Verify
redis-cli ping
```

#### CentOS/RHEL

```bash
# Install EPEL repository
sudo yum install epel-release

# Install Redis
sudo yum install redis

# Start and enable
sudo systemctl start redis
sudo systemctl enable redis
```

#### macOS

```bash
# Install with Homebrew
brew install redis

# Start as a service
brew services start redis

# Configuration file
/usr/local/etc/redis.conf
```

### Docker

#### Development Setup

```bash
# Basic development container
docker run -d \
  --name neurolink-redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine
```

#### Production-Ready Container

```bash
# Create custom Redis configuration
cat > redis.conf  /etc/redis/cluster/7001/redis.conf  {
  const neurolink = new NeuroLink({
    conversationMemory: {
      enabled: true,
      store: "redis",
      redisConfig: {
        // Primary production Redis
        host: process.env.REDIS_PRIMARY_HOST,
        port: parseInt(process.env.REDIS_PRIMARY_PORT || "6379"),
        password: process.env.REDIS_PASSWORD,
        db: 0,

        // Production-grade settings
        keyPrefix: `${process.env.ENVIRONMENT}:neurolink:`,
        ttl: 604800, // 7 days for production

        connectionOptions: {
          connectTimeout: 15000,
          retryDelayOnFailover: 200,
          maxRetriesPerRequest: 5,
        },
      },

      // Production conversation settings
      maxSessions: 10000,
      maxTurnsPerSession: 100,
      tokenThreshold: 100000,
      enableSummarization: true,
      summarizationProvider: "vertex",
      summarizationModel: "gemini-2.5-flash",
    },

    // Additional production features
    telemetry: {
      enabled: true,
      provider: "otel",
    },
  });

  return neurolink;
};

export default setupProduction;
```

## Performance Tuning

### Memory Optimization

```ini
# redis.conf - Memory tuning
maxmemory 16gb
maxmemory-policy allkeys-lru
maxmemory-samples 10

# Optimize for conversation data structures
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
```

### Connection Pooling

```typescript
// Optimize connection pool for high concurrency
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis",
    redisConfig: {
      host: "localhost",
      port: 6379,
      connectionOptions: {
        connectTimeout: 5000,
        maxRetriesPerRequest: 3,
        retryDelayOnFailover: 100,
      },
    },
  },
});
```

### Persistence Tuning

```ini
# For high-write workloads (less durability, better performance)
appendfsync no
save ""

# For balanced workload (recommended)
appendfsync everysec
save 300 10
save 60 1000

# For maximum durability (lower performance)
appendfsync always
save 60 1
```

## Security Hardening

### Authentication

```ini
# redis.conf
requirepass strong_password_at_least_32_characters_long_2024
```

### Access Control Lists (Redis 6.0+)

```bash
# Create NeuroLink application user with limited permissions
redis-cli
127.0.0.1:6379> AUTH default admin_password
127.0.0.1:6379> ACL SETUSER neurolink-app on >app_password ~neurolink:* +@read +@write +@stream -@dangerous
127.0.0.1:6379> ACL SAVE

# Create read-only monitoring user
127.0.0.1:6379> ACL SETUSER neurolink-monitor on >monitor_password ~* +@read +info +ping -@write -@dangerous
127.0.0.1:6379> ACL SAVE
```

### TLS/SSL Configuration

```ini
# redis.conf - Enable TLS
port 0
tls-port 6380
tls-cert-file /etc/redis/tls/redis.crt
tls-key-file /etc/redis/tls/redis.key
tls-ca-cert-file /etc/redis/tls/ca.crt
tls-protocols "TLSv1.2 TLSv1.3"
```

### Network Security

```bash
# Ubuntu UFW firewall
sudo ufw allow from 10.0.0.0/8 to any port 6379
sudo ufw deny 6379

# CentOS/RHEL firewalld
sudo firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='10.0.0.0/8' port protocol='tcp' port='6379' accept"
sudo firewall-cmd --reload
```

## High Availability

### Redis Sentinel

```ini
# sentinel.conf
port 26379
sentinel monitor neurolink-master 192.168.1.100 6379 2
sentinel auth-pass neurolink-master redis_password
sentinel down-after-milliseconds neurolink-master 5000
sentinel parallel-syncs neurolink-master 1
sentinel failover-timeout neurolink-master 60000
```

### NeuroLink with Sentinel

```typescript
// TypeScript configuration for Sentinel

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis",
    redisConfig: {
      // Sentinel configuration
      host: "sentinel-node-1",
      port: 26379,
      password: process.env.REDIS_PASSWORD,
      db: 0,
    },
  },
});
```

## Monitoring

### Key Metrics to Monitor

```bash
# Connection metrics
redis-cli info clients | grep connected_clients

# Memory usage
redis-cli info memory | grep used_memory_human

# Operations per second
redis-cli --stat

# Slow queries
redis-cli slowlog get 10

# Keyspace info
redis-cli info keyspace
```

### Health Check Script

```bash
#!/bin/bash
# neurolink-redis-health.sh

REDIS_HOST="localhost"
REDIS_PORT="6379"
REDIS_PASSWORD="your_password"

# Test connectivity
if redis-cli -h $REDIS_HOST -p $REDIS_PORT -a $REDIS_PASSWORD ping | grep -q "PONG"; then
  echo "✅ Redis is responsive"
else
  echo "❌ Redis is not responding"
  exit 1
fi

# Check memory usage
MEMORY_USED=$(redis-cli -h $REDIS_HOST -p $REDIS_PORT -a $REDIS_PASSWORD info memory | grep used_memory_human | cut -d: -f2)
echo "Memory Used: $MEMORY_USED"

# Check connected clients
CLIENTS=$(redis-cli -h $REDIS_HOST -p $REDIS_PORT -a $REDIS_PASSWORD info clients | grep connected_clients | cut -d: -f2)
echo "Connected Clients: $CLIENTS"

# Check replication status
ROLE=$(redis-cli -h $REDIS_HOST -p $REDIS_PORT -a $REDIS_PASSWORD info replication | grep role | cut -d: -f2)
echo "Role: $ROLE"
```

## NeuroLink Integration

### Complete Integration Example

```typescript

// Initialize with Redis storage
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis",
    redisConfig: {
      host: process.env.REDIS_HOST || "localhost",
      port: parseInt(process.env.REDIS_PORT || "6379"),
      password: process.env.REDIS_PASSWORD,
      db: 0,
      keyPrefix: "neurolink:conversation:",
      ttl: 86400, // 24 hours
      connectionOptions: {
        connectTimeout: 10000,
        retryDelayOnFailover: 100,
        maxRetriesPerRequest: 3,
      },
    },
    maxSessions: 1000,
    maxTurnsPerSession: 50,
    tokenThreshold: 50000,
    enableSummarization: true,
  },
});

// Use with conversation persistence
const result = await neurolink.generate({
  input: { text: "What did we discuss yesterday about the project timeline?" },
  sessionId: "project-planning-session",
  userId: "user123",
  provider: "anthropic",
  model: "claude-3-5-sonnet",
});

console.log(result.content);

// Retrieve conversation history
const history = await neurolink.conversationMemory?.getUserSessionHistory(
  "user123",
  "project-planning-session",
);

console.log(`Conversation has ${history?.length} messages`);

// Get all user sessions
const sessions =
  await neurolink.conversationMemory?.getUserAllSessionsHistory("user123");
console.log(`User has ${sessions?.length} active sessions`);

// Clear a specific session
await neurolink.conversationMemory?.clearSession(
  "project-planning-session",
  "user123",
);

// Get storage statistics
const stats = await neurolink.conversationMemory?.getStats();
console.log(
  `Total sessions: ${stats?.totalSessions}, Total turns: ${stats?.totalTurns}`,
);
```

## See Also

- [Redis Quick Start](/docs/getting-started/redis-quickstart) - 5-minute setup guide
- [Redis Migration Patterns](/docs/guides/redis-migration) - Migration from in-memory to Redis
- [Conversation Memory Guide](/docs/features/conversation-history) - Advanced conversation features
- [Troubleshooting Guide](/docs/reference/troubleshooting) - Common issues and solutions

## External Resources

- [Redis Documentation](https://redis.io/documentation)
- [Redis Best Practices](https://redis.io/topics/admin)
- [Redis Persistence](https://redis.io/topics/persistence)
- [Redis Security](https://redis.io/topics/security)

---

## Multi-Region Deployment Guide

<!-- Source: guides/enterprise/multi-region.md -->

# Multi-Region Deployment Guide

**Deploy AI applications globally with optimal latency, compliance, and reliability**

## Quick Start

### Basic Multi-Region Setup

```typescript

const ai = new NeuroLink({
  providers: [
    // US East
    {
      name: "openai-us-east",
      region: "us-east-1",
      priority: 1,
      config: { apiKey: process.env.OPENAI_KEY },
      condition: (req) => req.userRegion === "us-east",
    },

    // US West
    {
      name: "openai-us-west",
      region: "us-west-2",
      priority: 1,
      config: { apiKey: process.env.OPENAI_KEY },
      condition: (req) => req.userRegion === "us-west",
    },

    // Europe
    {
      name: "mistral-eu",
      region: "eu-west-1",
      priority: 1,
      config: { apiKey: process.env.MISTRAL_KEY },
      condition: (req) => req.userRegion === "eu",
    },

    // Asia Pacific
    {
      name: "vertex-asia",
      region: "asia-southeast1",
      priority: 1,
      config: {
        projectId: process.env.GCP_PROJECT_ID,
        location: "asia-southeast1",
      },
      condition: (req) => req.userRegion === "asia",
    },

    // Global fallback
    {
      name: "openai-global",
      region: "us-east-1",
      priority: 2,
    },
  ],
});

// Detect user region and route accordingly
const result = await ai.generate({
  input: { text: "Your prompt" },
  metadata: {
    userRegion: detectRegion(req.ip), // us-east, us-west, eu, asia
  },
});

console.log(`Routed to: ${result.provider} in ${result.region}`);
```

---

## Region Detection

### IP-Based Geolocation

```typescript

type RegionInfo = {
  region: string;
  country: string;
  city: string;
  latitude: number;
  longitude: number;
};

function detectRegion(ip: string): string {
  const geo = geoip.lookup(ip);

  if (!geo) return "us-east"; // Default fallback

  // Map country to region
  const countryToRegion: Record = {
    // North America
    US: getNearestUSRegion(geo.ll[0], geo.ll[1]),
    CA: "us-east",
    MX: "us-west",

    // Europe
    GB: "eu-west",
    DE: "eu-central",
    FR: "eu-west",
    IT: "eu-south",
    ES: "eu-west",
    NL: "eu-west",
    SE: "eu-north",
    PL: "eu-central",

    // Asia Pacific
    JP: "asia-northeast",
    SG: "asia-southeast",
    IN: "asia-south",
    AU: "asia-southeast",
    KR: "asia-northeast",
    CN: "asia-east",

    // South America
    BR: "sa-east",
    AR: "sa-east",
    CL: "sa-east",
  };

  return countryToRegion[geo.country] || "us-east";
}

function getNearestUSRegion(lat: number, lon: number): string {
  // Coordinates of US regions
  const regions = [
    { name: "us-east", lat: 39.0, lon: -77.5 }, // Virginia
    { name: "us-west", lat: 45.5, lon: -122.7 }, // Oregon
    { name: "us-central", lat: 41.3, lon: -95.9 }, // Iowa
  ];

  // Find nearest region using Haversine distance
  let nearest = regions[0];
  let minDistance = haversineDistance(lat, lon, nearest.lat, nearest.lon);

  for (const region of regions.slice(1)) {
    const distance = haversineDistance(lat, lon, region.lat, region.lon);
    if (distance  {
    const country = request.cf?.country || "US";
    const city = request.cf?.city || "Unknown";
    const region = mapCountryToRegion(country);

    const result = await ai.generate({
      input: { text: await request.text() },
      metadata: {
        userRegion: region,
        country,
        city,
      },
    });

    return new Response(JSON.stringify(result));
  },
};
```

---

## Provider-Specific Multi-Region

### OpenAI Multi-Region

OpenAI doesn't have explicit region selection, but uses global load balancing.

```typescript
// Load balance across multiple OpenAI accounts for better distribution
const ai = new NeuroLink({
  providers: [
    {
      name: "openai-account-1",
      config: { apiKey: process.env.OPENAI_KEY_1 },
      weight: 1,
    },
    {
      name: "openai-account-2",
      config: { apiKey: process.env.OPENAI_KEY_2 },
      weight: 1,
    },
    {
      name: "openai-account-3",
      config: { apiKey: process.env.OPENAI_KEY_3 },
      weight: 1,
    },
  ],
  loadBalancing: "round-robin",
});
```

### Google Cloud Vertex AI (Multi-Region)

Vertex AI supports explicit region selection.

```typescript
const ai = new NeuroLink({
  providers: [
    // US regions
    {
      name: "vertex-us-east1",
      region: "us-east1",
      config: {
        projectId: process.env.GCP_PROJECT,
        location: "us-east1",
      },
    },
    {
      name: "vertex-us-west1",
      region: "us-west1",
      config: {
        projectId: process.env.GCP_PROJECT,
        location: "us-west1",
      },
    },

    // EU regions
    {
      name: "vertex-eu-west1",
      region: "eu-west1",
      config: {
        projectId: process.env.GCP_PROJECT,
        location: "europe-west1",
      },
    },

    // Asia regions
    {
      name: "vertex-asia-southeast1",
      region: "asia-southeast1",
      config: {
        projectId: process.env.GCP_PROJECT,
        location: "asia-southeast1",
      },
    },
  ],
});
```

### Mistral AI (European Provider)

Mistral AI is EU-based, perfect for European users.

```typescript
const ai = new NeuroLink({
  providers: [
    {
      name: "mistral-eu",
      region: "eu",
      priority: 1,
      condition: (req) => req.userRegion === "eu",
      config: { apiKey: process.env.MISTRAL_KEY },
    },
  ],
});
```

---

## Deployment Patterns

### Pattern 1: Edge Deployment

Deploy at edge locations (Cloudflare Workers, Vercel Edge).

```typescript
// vercel.json - Edge configuration
{
  "regions": ["iad1", "sfo1", "fra1", "sin1"]
}
```

```typescript
// pages/api/ai/generate.ts - Vercel Edge Function

export const config = {
  runtime: "edge",
  regions: ["iad1", "sfo1", "fra1", "sin1"],
};

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
});

export default async function handler(req: Request) {
  const { geolocation } = req;
  const region = mapGeoToRegion(geolocation);

  const result = await ai.generate({
    input: { text: await req.text() },
    metadata: { userRegion: region },
  });

  return new Response(JSON.stringify(result));
}
```

### Pattern 2: Kubernetes Multi-Region

Deploy across multiple Kubernetes clusters.

```yaml
# k8s/deployment-us-east.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: neurolink-us-east
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: neurolink
      region: us-east-1
  template:
    metadata:
      labels:
        app: neurolink
        region: us-east-1
    spec:
      containers:
        - name: neurolink
          image: your-registry/neurolink:latest
          env:
            - name: REGION
              value: "us-east-1"
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-keys
                  key: openai-key
---
# Repeat for us-west-2, eu-west-1, asia-southeast-1
```

### Pattern 3: Multi-Cloud Deployment

Distribute across AWS, GCP, Azure.

```typescript
const ai = new NeuroLink({
  providers: [
    // AWS Bedrock (US)
    {
      name: "bedrock-us",
      region: "us-east-1",
      cloud: "aws",
      config: {
        region: "us-east-1",
        accessKeyId: process.env.AWS_ACCESS_KEY,
        secretAccessKey: process.env.AWS_SECRET_KEY,
      },
    },

    // Google Vertex (EU)
    {
      name: "vertex-eu",
      region: "eu-west-1",
      cloud: "gcp",
      config: {
        projectId: process.env.GCP_PROJECT,
        location: "europe-west1",
      },
    },

    // Azure OpenAI (Asia)
    {
      name: "azure-asia",
      region: "asia-southeast",
      cloud: "azure",
      config: {
        endpoint: process.env.AZURE_ENDPOINT_ASIA,
        apiKey: process.env.AZURE_KEY,
      },
    },
  ],
});
```

---

## Latency Optimization

### Measure Latency by Region

```typescript
class RegionLatencyTracker {
  private latencies = new Map();

  recordLatency(region: string, latency: number) {
    if (!this.latencies.has(region)) {
      this.latencies.set(region, []);
    }

    const arr = this.latencies.get(region)!;
    arr.push(latency);

    // Keep last 100 measurements
    if (arr.length > 100) {
      arr.shift();
    }
  }

  getAverageLatency(region: string): number {
    const arr = this.latencies.get(region) || [];
    if (arr.length === 0) return Infinity;

    return arr.reduce((a, b) => a + b, 0) / arr.length;
  }

  getP95Latency(region: string): number {
    const arr = this.latencies.get(region) || [];
    if (arr.length === 0) return Infinity;

    const sorted = arr.slice().sort((a, b) => a - b);
    const index = Math.floor(sorted.length * 0.95);
    return sorted[index];
  }

  getFastestRegion(regions: string[]): string {
    let fastest = regions[0];
    let lowestLatency = this.getAverageLatency(fastest);

    for (const region of regions) {
      const latency = this.getAverageLatency(region);
      if (latency  a.avg - b.avg);
  }
}

// Usage
const latencyTracker = new RegionLatencyTracker();

const ai = new NeuroLink({
  providers: [
    /* ... */
  ],
  onSuccess: (result) => {
    latencyTracker.recordLatency(result.region, result.latency);
  },
});

// View latency stats
console.table(latencyTracker.getStats());
/*
┌─────────┬───────────────────┬──────┬──────┬─────────┐
│ (index) │      region       │ avg  │ p95  │ samples │
├─────────┼───────────────────┼──────┼──────┼─────────┤
│    0    │   'eu-west-1'     │  35  │  45  │   100   │
│    1    │  'us-east-1'      │  50  │  70  │   100   │
│    2    │  'us-west-2'      │  55  │  75  │   100   │
│    3    │ 'asia-southeast1' │  60  │  80  │   100   │
└─────────┴───────────────────┴──────┴──────┴─────────┘
*/
```

### Dynamic Region Selection

Route to fastest region based on real-time latency.

```typescript
const latencyTracker = new RegionLatencyTracker();

const ai = new NeuroLink({
  providers: [
    { name: "provider-us-east", region: "us-east-1" },
    { name: "provider-us-west", region: "us-west-2" },
    { name: "provider-eu-west", region: "eu-west-1" },
  ],
  loadBalancing: {
    strategy: "custom",
    selector: (providers, req) => {
      // Get available regions
      const regions = providers.map((p) => p.region);

      // Select fastest region
      const fastest = latencyTracker.getFastestRegion(regions);

      // Return provider for that region
      return providers.find((p) => p.region === fastest) || providers[0];
    },
  },
});
```

---

## Data Residency & Compliance

### GDPR-Compliant Regional Routing

```typescript
// Ensure EU data stays in EU
const ai = new NeuroLink({
  providers: [
    // EU providers (GDPR-compliant)
    {
      name: "mistral-eu",
      region: "eu-west-1",
      compliance: ["GDPR"],
      priority: 1,
      condition: (req) => req.userRegion === "eu",
    },
    {
      name: "vertex-eu",
      region: "europe-west1",
      compliance: ["GDPR"],
      priority: 2,
      condition: (req) => req.userRegion === "eu",
    },

    // US providers (for non-EU users)
    {
      name: "openai-us",
      region: "us-east-1",
      priority: 1,
      condition: (req) => req.userRegion !== "eu",
    },
  ],
  compliance: {
    enforceDataResidency: true, // Block cross-region data flow
    rejectNonCompliant: true, // Only use compliant providers
  },
});

// Usage
const result = await ai.generate({
  input: { text: euUserData },
  metadata: {
    userRegion: "eu",
    gdprCompliant: true,
  },
});
// Guaranteed to use EU provider
```

### Region-Specific Data Storage

```typescript
class RegionalDataStore {
  private stores = {
    "us-east": createS3Client("us-east-1"),
    "us-west": createS3Client("us-west-2"),
    "eu-west": createS3Client("eu-west-1"),
    "asia-southeast": createS3Client("ap-southeast-1"),
  };

  async store(region: string, userId: string, data: any) {
    const store = this.stores[region];

    if (!store) {
      throw new Error(`No storage configured for region: ${region}`);
    }

    await store.putObject({
      Bucket: `neurolink-data-${region}`,
      Key: `users/${userId}/ai-data.json`,
      Body: JSON.stringify(data),
      ServerSideEncryption: "AES256",
    });
  }

  async retrieve(region: string, userId: string) {
    const store = this.stores[region];
    const result = await store.getObject({
      Bucket: `neurolink-data-${region}`,
      Key: `users/${userId}/ai-data.json`,
    });

    return JSON.parse(result.Body.toString());
  }
}
```

---

## Monitoring Multi-Region

### Regional Metrics Dashboard

```typescript
class RegionalMetrics {
  private metrics = new Map();

  recordRequest(region: string, latency: number, cost: number, error: boolean) {
    if (!this.metrics.has(region)) {
      this.metrics.set(region, {
        requests: 0,
        errors: 0,
        totalLatency: 0,
        totalCost: 0,
      });
    }

    const metric = this.metrics.get(region)!;
    metric.requests++;
    metric.totalLatency += latency;
    metric.totalCost += cost;

    if (error) {
      metric.errors++;
    }
  }

  getStats() {
    const stats = [];

    for (const [region, metric] of this.metrics.entries()) {
      stats.push({
        region,
        requests: metric.requests,
        errorRate: (metric.errors / metric.requests) * 100,
        avgLatency: metric.totalLatency / metric.requests,
        totalCost: metric.totalCost,
        avgCost: metric.totalCost / metric.requests,
      });
    }

    return stats.sort((a, b) => b.requests - a.requests);
  }

  exportPrometheus() {
    let output = "";

    for (const [region, metric] of this.metrics.entries()) {
      output += `neurolink_requests_total{region="${region}"} ${metric.requests}\n`;
      output += `neurolink_errors_total{region="${region}"} ${metric.errors}\n`;
      output += `neurolink_latency_sum{region="${region}"} ${metric.totalLatency}\n`;
      output += `neurolink_cost_sum{region="${region}"} ${metric.totalCost}\n`;
    }

    return output;
  }
}

// Usage
const regionalMetrics = new RegionalMetrics();

app.get("/metrics", (req, res) => {
  res.set("Content-Type", "text/plain");
  res.send(regionalMetrics.exportPrometheus());
});
```

---

## Best Practices

### 1. ✅ Always Have Regional Fallbacks

```typescript
// ✅ Good: Fallback to other regions
const ai = new NeuroLink({
  providers: [
    { name: "primary-eu", region: "eu-west-1", priority: 1 },
    { name: "fallback-us", region: "us-east-1", priority: 2 },
  ],
  failoverConfig: { enabled: true },
});
```

### 2. ✅ Monitor Latency by Region

```typescript
// ✅ Track latency for each region
const latencyTracker = new RegionLatencyTracker();
// Alert if latency exceeds threshold
```

### 3. ✅ Enforce Data Residency

```typescript
// ✅ For GDPR compliance
compliance: {
  enforceDataResidency: true,
  rejectNonCompliant: true
}
```

### 4. ✅ Test Failover Between Regions

```typescript
// ✅ Test regional failover
describe("Regional Failover", () => {
  it("should failover to another region", async () => {
    // Simulate EU region failure
    mockProvider("mistral-eu").mockRejectedValue(new Error("503"));

    const result = await ai.generate({
      input: { text: "test" },
      metadata: { userRegion: "eu" },
    });

    // Should failover to another EU provider
    expect(result.region).toMatch(/^eu-/);
  });
});
```

### 5. ✅ Cache Regionally

```typescript
// ✅ Cache responses in each region
const cache = {
  "us-east": new Redis("redis-us-east.example.com"),
  "eu-west": new Redis("redis-eu-west.example.com"),
  "asia-southeast": new Redis("redis-asia.example.com"),
};
```

---

## Related Documentation

- **[Multi-Provider Failover](/docs/guides/enterprise/multi-provider-failover)** - Automatic failover
- **[Load Balancing](/docs/guides/enterprise/load-balancing)** - Distribution strategies
- **[Compliance Guide](/docs/guides/enterprise/compliance)** - GDPR data residency
- **[Monitoring](/docs/observability/health-monitoring)** - Regional monitoring

---

## Additional Resources

- **[AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/)** - AWS regions
- **[GCP Locations](https://cloud.google.com/about/locations)** - Google Cloud regions
- **[Cloudflare Network Map](https://www.cloudflare.com/network/)** - Edge locations

---

**Need Help?** Join our [GitHub Discussions](https://github.com/juspay/neurolink/discussions) or open an [issue](https://github.com/juspay/neurolink/issues).

---

## Redis Migration Patterns

<!-- Source: guides/redis-migration.md -->

# Redis Migration Patterns

Complete guide for migrating conversation storage between different backends and Redis configurations.

## Table of Contents

- [In-Memory to Redis Migration](#in-memory-to-redis-migration)
- [Version Upgrades](#version-upgrades)
- [Single to Cluster Migration](#single-to-cluster-migration)
- [Cloud Provider Migrations](#cloud-provider-migrations)
- [Backup and Restore](#backup-and-restore)
- [Zero-Downtime Migration](#zero-downtime-migration)

## In-Memory to Redis Migration

### When to Migrate

Consider migrating from in-memory to Redis storage when:

- **Multi-Instance Deployment**: Running multiple NeuroLink instances that need shared conversation state
- **Session Persistence**: Need conversations to survive application restarts
- **Long-Running Sessions**: Managing conversations that span multiple days/weeks
- **Analytics Requirements**: Need to analyze conversation patterns and history
- **Compliance**: Regulatory requirements for conversation retention and audit trails

### Migration Steps

#### Step 1: Set Up Redis Server

```bash
# Quick Docker setup for development
docker run -d \
  --name neurolink-redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine

# Verify Redis is running
docker exec -it neurolink-redis redis-cli ping
# Expected: PONG
```

#### Step 2: Update NeuroLink Configuration

```typescript
// Before: In-memory storage
const neurolinkOld = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "memory", // Default in-memory storage
    maxSessions: 100,
    maxTurnsPerSession: 50,
  },
});

// After: Redis storage
const neurolinkNew = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: "redis",
    redisConfig: {
      host: "localhost",
      port: 6379,
      db: 0,
      keyPrefix: "neurolink:conversation:",
      ttl: 86400, // 24 hours
    },
    maxSessions: 1000, // Can handle more with Redis
    maxTurnsPerSession: 100,
    enableSummarization: true,
  },
});
```

#### Step 3: Migrate Existing Sessions (Optional)

```typescript
// migration-script.ts

async function migrateToRedis() {
  // Initialize both instances
  const memoryInstance = new NeuroLink({
    conversationMemory: { enabled: true, store: "memory" },
  });

  const redisInstance = new NeuroLink({
    conversationMemory: {
      enabled: true,
      store: "redis",
      redisConfig: {
        host: "localhost",
        port: 6379,
        db: 0,
      },
    },
  });

  console.log("Starting migration from memory to Redis...");

  // Note: In-memory storage doesn't persist data across restarts
  // This example shows conceptual migration if you have active sessions

  // If you have a way to export memory data:
  // 1. Export sessions from memory storage
  // 2. Import into Redis storage
  // 3. Verify migration success

  console.log("✅ Migration completed");
  console.log(
    "Note: Historical data from in-memory storage before migration is not preserved.",
  );
  console.log("All new conversations will now be stored in Redis.");
}

migrateToRedis().catch(console.error);
```

#### Step 4: Verify Migration

```typescript
// verify-redis.ts

async function verifyRedisStorage() {
  const neurolink = new NeuroLink({
    conversationMemory: {
      enabled: true,
      store: "redis",
      redisConfig: {
        host: "localhost",
        port: 6379,
      },
    },
  });

  // Create a test conversation
  console.log("Creating test conversation...");
  await neurolink.generate({
    input: { text: "Test message for Redis verification" },
    sessionId: "test-session",
    userId: "test-user",
    provider: "openai",
  });

  // Verify persistence
  const history = await neurolink.conversationMemory?.getUserSessionHistory(
    "test-user",
    "test-session",
  );

  console.log(`✅ Redis storage verified`);
  console.log(`Conversation has ${history?.length} messages`);

  // Check stats
  const stats = await neurolink.conversationMemory?.getStats();
  console.log(
    `Total sessions: ${stats?.totalSessions}, Total turns: ${stats?.totalTurns}`,
  );

  // Cleanup test data
  await neurolink.conversationMemory?.clearSession("test-session", "test-user");
  console.log("✅ Test data cleaned up");
}

verifyRedisStorage().catch(console.error);
```

### Code Example: Gradual Migration

```typescript
// gradual-migration.ts - Migrate incrementally

class GradualMigration {
  private memoryInstance: NeuroLink;
  private redisInstance: NeuroLink;
  private migrationProgress = 0;

  constructor() {
    // Initialize both storage backends
    this.memoryInstance = new NeuroLink({
      conversationMemory: {
        enabled: true,
        store: "memory",
      },
    });

    this.redisInstance = new NeuroLink({
      conversationMemory: {
        enabled: true,
        store: "redis",
        redisConfig: {
          host: "localhost",
          port: 6379,
        },
      },
    });
  }

  // Gradual cutover: route traffic based on percentage
  async generate(options: {
    input: any;
    sessionId: string;
    userId: string;
    provider: string;
  }) {
    const useRedis = Math.random()  /etc/redis/cluster/$port/redis.conf  /var/lib/redis/dump.rdb
gunzip -c /backup/neurolink-aof-20260101-020000.aof.gz > /var/lib/redis/appendonly.aof

# Set correct permissions
sudo chown redis:redis /var/lib/redis/dump.rdb
sudo chown redis:redis /var/lib/redis/appendonly.aof

# Start Redis
sudo systemctl start redis-server

# Verify restoration
redis-cli -a ${REDIS_PASSWORD} DBSIZE
```

#### Selective Restore (Specific Keys)

```bash
# Export specific keys from backup
redis-cli --rdb /tmp/backup.rdb

# Start temporary Redis instance
redis-server --port 6380 --dir /tmp --dbfilename backup.rdb --daemonize yes

# Copy specific keys to production
redis-cli -p 6380 --scan --pattern "neurolink:conversation:user123:*" | \
  xargs redis-cli -p 6380 MIGRATE localhost 6379 0 5000 KEYS

# Cleanup temporary instance
redis-cli -p 6380 SHUTDOWN
```

### Disaster Recovery Procedure

```bash
#!/bin/bash
# disaster-recovery.sh

echo "Starting NeuroLink Redis disaster recovery..."

# 1. Stop affected Redis instance
sudo systemctl stop redis-server

# 2. Check data integrity
redis-check-rdb /var/lib/redis/dump.rdb
redis-check-aof /var/lib/redis/appendonly.aof

# 3. If corrupted, restore from latest backup
if [ $? -ne 0 ]; then
  echo "Data corruption detected. Restoring from backup..."
  LATEST_BACKUP=$(ls -t /backup/redis/neurolink-dump-*.rdb.gz | head -1)
  gunzip -c $LATEST_BACKUP > /var/lib/redis/dump.rdb
  sudo chown redis:redis /var/lib/redis/dump.rdb
fi

# 4. Restart Redis
sudo systemctl start redis-server

# 5. Verify health
if redis-cli -a ${REDIS_PASSWORD} ping | grep -q "PONG"; then
  echo "✅ Redis recovery successful"
else
  echo "❌ Redis recovery failed"
  exit 1
fi

# 6. Verify NeuroLink connectivity
node -e "
const { NeuroLink } = require('@juspay/neurolink');
const nl = new NeuroLink({
  conversationMemory: {
    enabled: true,
    store: 'redis',
    redisConfig: { host: 'localhost', port: 6379 }
  }
});
nl.conversationMemory.getStats().then(stats => {
  console.log('✅ NeuroLink verification successful');
  console.log('Sessions:', stats.totalSessions);
}).catch(err => {
  console.error('❌ NeuroLink verification failed:', err);
  process.exit(1);
});
"

echo "Recovery procedure completed"
```

## Zero-Downtime Migration

### Strategy: Dual-Write Pattern

```typescript
// dual-write-migration.ts

class DualWriteMigration {
  private sourceInstance: NeuroLink;
  private targetInstance: NeuroLink;

  constructor() {
    // Source: Current Redis instance
    this.sourceInstance = new NeuroLink({
      conversationMemory: {
        enabled: true,
        store: "redis",
        redisConfig: {
          host: "old-redis.example.com",
          port: 6379,
        },
      },
    });

    // Target: New Redis instance/cluster
    this.targetInstance = new NeuroLink({
      conversationMemory: {
        enabled: true,
        store: "redis",
        redisConfig: {
          host: "new-redis.example.com",
          port: 6379,
        },
      },
    });
  }

  // Write to both instances
  async generate(options: any) {
    try {
      // Primary write to source
      const result = await this.sourceInstance.generate(options);

      // Async write to target (don't wait)
      this.targetInstance.generate(options).catch((err) => {
        console.error("Target write failed:", err);
        // Log for later reconciliation
      });

      return result;
    } catch (error) {
      console.error("Source write failed:", error);
      // Could fall back to target or retry
      throw error;
    }
  }

  // Gradual switchover
  async switchToTarget() {
    console.log("Switching primary to target instance...");
    const temp = this.sourceInstance;
    this.sourceInstance = this.targetInstance;
    this.targetInstance = temp;
    console.log("✅ Switched to new Redis instance");
  }
}

// Usage
const migration = new DualWriteMigration();

// Phase 1: Dual write (both instances receive writes)
await migration.generate({
  input: { text: "Test message" },
  sessionId: "session1",
  userId: "user1",
  provider: "openai",
});

// Phase 2: After data sync, switch primary
await migration.switchToTarget();

// Phase 3: Continue with new instance as primary
```

### Blue-Green Deployment

```bash
#!/bin/bash
# blue-green-migration.sh

# Blue: Current production Redis
BLUE_REDIS="redis-blue.example.com:6379"

# Green: New Redis instance
GREEN_REDIS="redis-green.example.com:6379"

echo "Starting Blue-Green migration..."

# 1. Sync data from Blue to Green
redis-cli --rdb /tmp/blue-backup.rdb -h redis-blue.example.com -p 6379
redis-cli -h redis-green.example.com -p 6379 --pipe < /tmp/blue-backup.rdb

# 2. Enable dual-write mode in application
# Update environment variable
export REDIS_DUAL_WRITE=true
export REDIS_PRIMARY=$BLUE_REDIS
export REDIS_SECONDARY=$GREEN_REDIS

# 3. Monitor for consistency
sleep 300  # 5 minutes of dual-write

# 4. Switch primary to Green
export REDIS_PRIMARY=$GREEN_REDIS
export REDIS_SECONDARY=$BLUE_REDIS

# 5. Verify new primary
redis-cli -h redis-green.example.com -p 6379 DBSIZE

# 6. After validation, decommission Blue
echo "✅ Migration to Green completed"
```

## See Also

- [Redis Quick Start](/docs/getting-started/redis-quickstart) - 5-minute Redis setup
- [Redis Configuration Guide](/docs/guides/redis-configuration) - Complete configuration reference
- [Conversation Memory](/docs/features/conversation-history) - Conversation memory features
- [Troubleshooting](/docs/reference/troubleshooting) - Common issues and solutions

## External Resources

- [Redis Persistence](https://redis.io/topics/persistence) - RDB and AOF persistence
- [Redis Cluster Tutorial](https://redis.io/topics/cluster-tutorial) - Cluster setup guide
- [Redis Replication](https://redis.io/topics/replication) - Replication and high availability
- [Redis Backup Best Practices](https://redis.io/topics/admin#backup) - Backup strategies

---

## Session Management & Persistence Guide

<!-- Source: guides/session-management.md -->

# Session Management & Persistence Guide

**NeuroLink Enhanced MCP Platform - Session Management**

## ️ **Architecture & Components**

### **Session Manager Core**

```typescript
export class SessionManager {
  private sessions: Map = new Map();
  private persistence: SessionPersistence;
  private cleanupScheduler: NodeJS.Timeout;

  async createSession(
    context: NeuroLinkExecutionContext,
    options?: SessionOptions,
  ): Promise {
    const session: OrchestratorSession = {
      id: uuidv4(),
      context,
      toolHistory: [],
      state: new Map(),
      metadata: options?.metadata || {},
      createdAt: Date.now(),
      lastActivity: Date.now(),
      expiresAt: Date.now() + (options?.ttl || 3600000), // 1 hour default
    };

    this.sessions.set(session.id, session);
    await this.persistence.saveSession(session);

    return session;
  }
}
```

### **Session Data Structure**

```typescript
export type OrchestratorSession = {
  id: string; // UUID v4 identifier
  context: NeuroLinkExecutionContext; // Execution context
  toolHistory: ToolResult[]; // Complete tool execution history
  state: Map; // Session-specific state
  metadata: {
    userAgent?: string; // Client user agent
    origin?: string; // Request origin
    tags?: string[]; // Custom tags
    [key: string]: any; // Custom metadata
  };
  createdAt: number; // Creation timestamp
  lastActivity: number; // Last activity timestamp
  expiresAt: number; // Expiration timestamp
};
```

---

##  **Persistence Mechanisms**

### **File-based Persistence**

```typescript
export class SessionPersistence {
  async saveSession(session: OrchestratorSession): Promise {
    const sessionPath = this.getSessionPath(session.id);
    const sessionData = this.serializeSession(session);

    // Atomic write with temporary file
    const tempPath = `${sessionPath}.tmp`;
    await fs.writeFile(tempPath, JSON.stringify(sessionData, null, 2));
    await fs.rename(tempPath, sessionPath);

    // Create checksum for integrity verification
    const checksum = await this.calculateChecksum(sessionData);
    await fs.writeFile(`${sessionPath}.checksum`, checksum);
  }

  async loadSession(sessionId: string): Promise {
    try {
      const sessionPath = this.getSessionPath(sessionId);
      const sessionData = JSON.parse(await fs.readFile(sessionPath, "utf8"));
      return this.deserializeSession(sessionData);
    } catch (error) {
      console.error(`Failed to load session ${sessionId}:`, error);
      return null;
    }
  }
}
```

---

##  **Usage Examples**

### **Basic Session Usage**

```typescript
// Create session with metadata
const session = await sessionManager.createSession(
  {
    userId: "user123",
    aiProvider: "google-ai",
    permissions: ["read-data", "analyze"],
  },
  {
    ttl: 7200000, // 2 hours
    metadata: {
      userAgent: "NeuroLink-CLI/1.0",
      tags: ["analysis", "urgent"],
    },
  },
);
```

### **Long-running Workflow**

```typescript
// Execute multi-step workflow with session state
const executeLongWorkflow = async (sessionId: string) => {
  const session = await sessionManager.getSession(sessionId);

  // Step 1: Fetch data (if not already done)
  if (!session.state.has("dataFetched")) {
    const userData = await orchestrator.executeTool(
      "database-query",
      {},
      session.context,
    );
    session.state.set("userData", userData);
    session.state.set("dataFetched", true);
    await sessionManager.updateSession(session);
  }

  // Continue workflow...
};
```

---

## ⏰ **TTL Management & Cleanup**

### **Automatic Cleanup**

```typescript
export class SessionCleanupManager {
  async performCleanup(): Promise {
    const allSessions = await this.sessionManager.getAllSessions();
    const now = Date.now();
    let cleanedSessions = 0;

    for (const session of allSessions) {
      if (session.expiresAt ;
};

// Collect session metrics
const analytics = await sessionAnalyticsCollector.collectSessionMetrics();
console.log("Active sessions:", analytics.activeSessions);
console.log("Average duration:", analytics.averageSessionDuration);
```

---

##  **Testing Examples**

### **Persistence Testing**

```typescript
// Test session recovery after restart
const testPersistence = async () => {
  // Create session with state
  const session = await sessionManager.createSession(context);
  session.state.set("testData", { value: 42 });
  await sessionManager.updateSession(session);

  // Simulate restart
  const newSessionManager = new SessionManager({ persistenceStrategy: "file" });
  const recovered = await newSessionManager.getSession(session.id);

  console.log("Recovery successful:", !!recovered);
  console.log("Data intact:", recovered?.state.get("testData")?.value === 42);
};
```

---

##  **Configuration**

### **Advanced Setup**

```typescript
const sessionManager = new SessionManager({
  persistenceStrategy: "file",
  persistence: {
    basePath: "./sessions",
    encryptionKey: process.env.SESSION_ENCRYPTION_KEY,
  },
  defaults: {
    ttl: 3600000, // 1 hour
    maxSessions: 1000, // Max concurrent sessions
    cleanupInterval: 300000, // 5 minutes
  },
});
```

---

##  **Best Practices**

### **Session Safety**

```typescript
// Always check session validity
const safeSessionOperation = async (sessionId: string, operation: Function) => {
  const session = await sessionManager.getSession(sessionId);
  if (!session) {
    throw new Error("Session not found or expired");
  }

  session.lastActivity = Date.now();
  const result = await operation(session);
  await sessionManager.updateSession(session);
  return result;
};
```

### **Resource Management**

```typescript
// Implement graceful shutdown
const gracefulShutdown = async () => {
  const activeSessions = await sessionManager.getActiveSessions();
  for (const session of activeSessions) {
    await sessionManager.updateSession(session);
  }
  sessionManager.stopCleanup();
};
```

---

**STATUS**: Production-ready session management system with comprehensive persistence, TTL management, and analytics capabilities. Enables long-running operations with full state recovery across process restarts.

---

## Vector Stores Guide

<!-- Source: guides/vector-stores.md -->

# Vector Stores Guide

Learn how to configure and use vector stores for semantic search in RAG pipelines.

> **Since**: v8.44.0 | **Status**: Stable | **Availability**: SDK + CLI

## Overview

Vector stores are the backbone of semantic search in RAG (Retrieval-Augmented Generation) systems. They store document embeddings and enable fast similarity search to find relevant content for your queries.

NeuroLink provides:

- **Abstract VectorStore Interface** - Consistent API for any vector database
- **InMemoryVectorStore** - Built-in store for development and testing
- **Provider-Specific Options** - Native support for Pinecone, pgVector, and Chroma
- **Metadata Filtering** - Rich query syntax for filtering results
- **Hybrid Search Integration** - Combine vector search with BM25 keyword matching

## Quick Start

```typescript

// Create a vector store
const vectorStore = new InMemoryVectorStore();

// Add documents with embeddings
await vectorStore.upsert("my-index", [
  {
    id: "doc-1",
    vector: [0.1, 0.2, 0.3 /* ... embedding values */],
    metadata: { text: "Machine learning fundamentals", topic: "ml" },
  },
  {
    id: "doc-2",
    vector: [0.15, 0.25, 0.35 /* ... embedding values */],
    metadata: { text: "Deep learning architectures", topic: "dl" },
  },
]);

// Query for similar documents
const results = await vectorStore.query({
  indexName: "my-index",
  queryVector: [0.12, 0.22, 0.32 /* ... query embedding */],
  topK: 5,
});

console.log(results);
// [{ id: "doc-1", score: 0.95, text: "...", metadata: {...} }, ...]
```

## Available Vector Stores

### InMemoryVectorStore

The built-in `InMemoryVectorStore` is perfect for development, testing, and small-scale applications.

```typescript

const store = new InMemoryVectorStore();
```

**Features:**

- Zero dependencies - works out of the box
- Full metadata filtering support
- Cosine similarity search
- No persistence (data lost on restart)

**When to Use:**

- Development and testing
- Prototyping RAG pipelines
- Small datasets (;

  constructor(apiKey: string, indexName: string) {
    this.client = new Pinecone({ apiKey });
    this.index = this.client.index(indexName);
  }

  async query(params: {
    indexName: string;
    queryVector: number[];
    topK?: number;
    filter?: Record;
    includeVectors?: boolean;
  }) {
    const response = await this.index.query({
      vector: params.queryVector,
      topK: params.topK || 10,
      filter: params.filter,
      includeMetadata: true,
      includeValues: params.includeVectors,
    });

    return response.matches.map((match) => ({
      id: match.id,
      score: match.score,
      text: match.metadata?.text as string,
      metadata: match.metadata,
      vector: match.values,
    }));
  }
}

// Usage
const pineconeStore = new PineconeVectorStore(
  process.env.PINECONE_API_KEY!,
  "my-index",
);
```

#### pgVector Integration

```typescript

class PgVectorStore implements VectorStore {
  private pool: Pool;

  constructor(connectionString: string) {
    this.pool = new Pool({ connectionString });
  }

  async query(params: {
    indexName: string;
    queryVector: number[];
    topK?: number;
    filter?: Record;
  }) {
    const vectorStr = `[${params.queryVector.join(",")}]`;

    // WARNING: Validate indexName against allowlist before use
    const safeName = params.indexName.replace(/[^a-zA-Z0-9_]/g, "");
    const result = await this.pool.query(
      `
      SELECT id, text, metadata,
             1 - (embedding  $1::vector) as score
      FROM ${safeName}
      ORDER BY embedding  $1::vector
      LIMIT $2
    `,
      [vectorStr, params.topK || 10],
    );

    return result.rows.map((row) => ({
      id: row.id,
      score: row.score,
      text: row.text,
      metadata: row.metadata,
    }));
  }
}
```

#### Chroma Integration

```typescript

class ChromaVectorStore implements VectorStore {
  private client: ChromaClient;

  constructor(path?: string) {
    this.client = new ChromaClient({ path });
  }

  async query(params: {
    indexName: string;
    queryVector: number[];
    topK?: number;
    filter?: Record;
  }) {
    const collection = await this.client.getCollection({
      name: params.indexName,
    });

    const results = await collection.query({
      queryEmbeddings: [params.queryVector],
      nResults: params.topK || 10,
      where: params.filter,
    });

    return (results.ids[0] || []).map((id, i) => ({
      id,
      score: results.distances?.[0]?.[i]
        ? 1 - results.distances[0][i]
        : undefined,
      text: results.documents?.[0]?.[i] || undefined,
      metadata: results.metadatas?.[0]?.[i] || undefined,
    }));
  }
}
```

## Configuration

### VectorStore Interface

All vector stores implement this interface:

```typescript
type VectorStore = {
  query(params: {
    indexName: string;
    queryVector: number[];
    topK?: number;
    filter?: MetadataFilter;
    includeVectors?: boolean;
  }): Promise;
};
```

### VectorQueryResult

Query results follow this structure:

```typescript
type VectorQueryResult = {
  /** Unique identifier */
  id: string;
  /** Text content */
  text?: string;
  /** Similarity/relevance score (0-1) */
  score?: number;
  /** Associated metadata */
  metadata?: Record;
  /** Embedding vector (if requested) */
  vector?: number[];
};
```

### Provider-Specific Options

Configure provider-specific behavior through `VectorProviderOptions`:

```typescript
type VectorProviderOptions = {
  /** Pinecone options */
  pinecone?: {
    namespace?: string;
    sparseVector?: number[];
  };
  /** pgVector options */
  pgVector?: {
    minScore?: number;
    ef?: number; // HNSW ef_search parameter
    probes?: number; // IVFFlat probes parameter
  };
  /** Chroma options */
  chroma?: {
    where?: Record;
    whereDocument?: Record;
  };
};
```

## Usage Examples

### Adding Documents/Chunks

```typescript

// Create store and get embedding provider
const store = new InMemoryVectorStore();
const embedder = await ProviderFactory.createProvider(
  "openai",
  "text-embedding-3-small",
);

// Prepare documents
const documents = [
  { id: "1", text: "Introduction to machine learning concepts" },
  { id: "2", text: "Neural network architectures and training" },
  { id: "3", text: "Natural language processing techniques" },
];

// Generate embeddings and upsert
const items = await Promise.all(
  documents.map(async (doc) => ({
    id: doc.id,
    vector: await embedder.embed(doc.text),
    metadata: { text: doc.text, source: "tutorial" },
  })),
);

await store.upsert("knowledge-base", items);
```

### Searching with Filters

```typescript
// Basic similarity search
const results = await store.query({
  indexName: "knowledge-base",
  queryVector: await embedder.embed("How do neural networks work?"),
  topK: 5,
});

// Search with metadata filter
const filteredResults = await store.query({
  indexName: "knowledge-base",
  queryVector: await embedder.embed("machine learning basics"),
  topK: 10,
  filter: {
    topic: "ml",
    difficulty: { $in: ["beginner", "intermediate"] },
  },
});
```

### Metadata Filter Syntax

NeuroLink supports MongoDB/Sift-style query operators:

```typescript
// Equality
filter: { topic: "ml" }

// Comparison operators
filter: {
  score: { $gt: 0.8 },        // Greater than
  score: { $gte: 0.8 },       // Greater than or equal
  score: { $lt: 0.5 },        // Less than
  score: { $lte: 0.5 },       // Less than or equal
  status: { $ne: "archived" } // Not equal
}

// Array operators
filter: {
  tags: { $in: ["ml", "ai", "nlp"] },    // Value in array
  category: { $nin: ["draft", "test"] }   // Value not in array
}

// Logical operators
filter: {
  $and: [
    { topic: "ml" },
    { difficulty: "beginner" }
  ]
}

filter: {
  $or: [
    { author: "alice" },
    { author: "bob" }
  ]
}

filter: {
  $not: { status: "draft" }
}

// Special operators
filter: {
  summary: { $exists: true },       // Field exists
  title: { $contains: "guide" },    // String contains
  tags: { $regex: "^ml-" }          // Regex match
}
```

### Using the Vector Query Tool

The `createVectorQueryTool` function creates a tool suitable for AI agents:

```typescript

const vectorStore = new InMemoryVectorStore();
// ... populate with data

const queryTool = createVectorQueryTool(
  {
    id: "knowledge-search",
    description: "Search the knowledge base for relevant information",
    indexName: "docs",
    embeddingModel: {
      provider: "openai",
      modelName: "text-embedding-3-small",
    },
    topK: 10,
    enableFilter: true,
    includeSources: true,
    reranker: {
      model: { provider: "openai", modelName: "gpt-4o-mini" },
      weights: { semantic: 0.5, vector: 0.3, position: 0.2 },
      topK: 5,
    },
  },
  vectorStore,
);

// Use in an agent
const response = await queryTool.execute({
  query: "What are the best practices for RAG?",
  filter: { category: "best-practices" },
  topK: 5,
});

console.log(response.relevantContext);
console.log(response.sources);
```

### Hybrid Search Integration

Combine vector search with BM25 for improved retrieval:

```typescript

  InMemoryVectorStore,
  InMemoryBM25Index,
  createHybridSearch,
} from "@juspay/neurolink";

// Create both indices
const vectorStore = new InMemoryVectorStore();
const bm25Index = new InMemoryBM25Index();

// Add documents to both
const documents = [
  { id: "1", text: "Machine learning fundamentals", metadata: { topic: "ml" } },
  { id: "2", text: "Deep learning architectures", metadata: { topic: "dl" } },
];

// Populate BM25 index
await bm25Index.addDocuments(documents);

// Populate vector store (with embeddings)
await vectorStore.upsert(
  "docs",
  documents.map((doc) => ({
    id: doc.id,
    vector: /* embedding */,
    metadata: { ...doc.metadata, text: doc.text },
  })),
);

// Create hybrid search
const hybridSearch = createHybridSearch({
  vectorStore,
  bm25Index,
  indexName: "docs",
  embeddingModel: {
    provider: "openai",
    modelName: "text-embedding-3-small",
  },
  defaultConfig: {
    fusionMethod: "rrf", // or "linear"
    vectorWeight: 0.5,
    bm25Weight: 0.5,
    topK: 10,
  },
});

// Execute hybrid search
const results = await hybridSearch("neural network training", {
  topK: 5,
  fusionMethod: "rrf",
});
```

## Best Practices

### When to Use Which Store

| Use Case                   | Recommended Store           | Why                               |
| -------------------------- | --------------------------- | --------------------------------- |
| Development/Testing        | `InMemoryVectorStore`       | Zero setup, fast iteration        |
| Small apps ( 1M vectors) | Pinecone, Weaviate, Qdrant  | Purpose-built for scale           |
| Serverless                 | Pinecone, Supabase pgVector | Managed, auto-scaling             |
| Self-hosted                | pgVector, Chroma, Milvus    | Full control, data locality       |
| Hybrid search required     | Pinecone (sparse-dense)     | Native support for sparse vectors |

### Performance Considerations

1. **Batch Operations**

   ```typescript
   // Good: Batch upsert
   await store.upsert("index", items); // Single call with many items

   // Avoid: Individual upserts
   for (const item of items) {
     await store.upsert("index", [item]); // Many calls
   }
   ```

2. **Index Configuration**
   - For pgVector: Use HNSW index for faster queries at slight accuracy cost
   - For Pinecone: Choose pod type based on query latency requirements
   - For Chroma: Use persistent storage for production

3. **Query Optimization**

   ```typescript
   // Use appropriate topK - don't over-fetch
   const results = await store.query({
     indexName: "docs",
     queryVector: embedding,
     topK: 10, // Only what you need
   });

   // Apply filters to reduce search space
   const filtered = await store.query({
     indexName: "docs",
     queryVector: embedding,
     topK: 10,
     filter: { category: "active" }, // Reduces candidates
   });
   ```

4. **Embedding Dimensions**
   - Smaller dimensions (384, 768) = faster search, lower storage
   - Larger dimensions (1536, 3072) = better accuracy, more resources
   - Match model to use case: `text-embedding-3-small` (1536) vs `text-embedding-3-large` (3072)

### Production Recommendations

1. **Use Managed Services** - Pinecone, Supabase, or cloud-hosted options reduce operational burden

2. **Implement Connection Pooling**

   ```typescript
   // For database-backed stores
   const pool = new Pool({
     connectionString: process.env.DATABASE_URL,
     max: 20, // Connection pool size
     idleTimeoutMillis: 30000,
   });
   ```

3. **Add Circuit Breakers**

   ```typescript
   import { RAGCircuitBreaker } from "@juspay/neurolink";

   const breaker = new RAGCircuitBreaker("vector-store", {
     failureThreshold: 5,
     resetTimeout: 60000,
   });

   const results = await breaker.execute(
     () => store.query({ indexName: "docs", queryVector: embedding }),
     "query",
   );
   ```

4. **Monitor Performance**

   ```typescript
   const startTime = Date.now();
   const results = await store.query({ ... });
   const queryTime = Date.now() - startTime;

   logger.info("Vector query completed", {
     queryTime,
     resultsCount: results.length,
     indexName: "docs",
   });
   ```

5. **Handle Failures Gracefully**

   ```typescript
   import { RAGRetryHandler } from "@juspay/neurolink";

   const retryHandler = new RAGRetryHandler({
     maxRetries: 3,
     initialDelay: 1000,
     backoffMultiplier: 2,
   });

   const results = await retryHandler.executeWithRetry(() =>
     store.query({ indexName: "docs", queryVector: embedding }),
   );
   ```

## Troubleshooting

| Problem             | Solution                                                             |
| ------------------- | -------------------------------------------------------------------- |
| Empty results       | Verify embeddings are generated with same model used for indexing    |
| Slow queries        | Add appropriate indices; reduce topK; use metadata filters           |
| Memory issues       | Switch from InMemoryVectorStore to a persistent store                |
| Inconsistent scores | Ensure vectors are normalized; check embedding model consistency     |
| Filter not working  | Verify metadata was stored during upsert; check filter syntax        |
| Connection timeouts | Implement connection pooling; add retry logic; check network latency |

## See Also

- [RAG Document Processing Guide](/docs/tutorials/rag) - Complete RAG pipeline documentation
- [Hybrid Search](#hybrid-search-integration) - Combining vector and keyword search
- [Reranking Guide](/docs/features/rag.md#reranking) - Improving result relevance
- [Observability Guide](/docs/observability/health-monitoring) - Monitoring RAG operations
- [Resilience Patterns](/docs/features/rag.md#resilience-patterns) - Circuit breakers and retry handling

---

# Memory

## Conversation Memory

<!-- Source: memory/conversation.md -->

# Conversation Memory

NeuroLink's Conversation Memory feature enables AI models to maintain context across multiple turns within a session, creating more natural and coherent conversations.

##  Overview

The conversation memory system provides:

- **Session-based memory**: Each conversation session maintains its own context
- **Turn-by-turn persistence**: AI remembers previous messages within a session
- **Automatic cleanup**: Configurable limits to prevent memory bloat
- **Session isolation**: Different sessions don't interfere with each other
- **In-memory storage**: Fast, lightweight storage for conversation history
- **Universal Method Support**: Works seamlessly with both `generate()` and `stream()` methods
- **Stream Integration**: Full conversation memory support for streaming responses

## ⚙️ Configuration

### Environment Variables

```bash
# Enable/disable conversation memory
NEUROLINK_MEMORY_ENABLED=true

# Maximum number of sessions to keep in memory
NEUROLINK_MEMORY_MAX_SESSIONS=50

# Maximum number of turns per session
NEUROLINK_MEMORY_MAX_TURNS_PER_SESSION=50
```

### Programmatic Configuration

```javascript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    maxSessions: 10,
    maxTurnsPerSession: 20,
  },
});
```

##  Usage Examples

### Basic Usage with Session ID

```javascript

const neurolink = new NeuroLink({
  conversationMemory: { enabled: true },
});

// First message in session
const response1 = await neurolink.generate({
  prompt: "My name is Alice and I love reading books",
  context: {
    sessionId: "user-123",
    userId: "alice",
  },
});

// Follow-up message - AI will remember previous context
const response2 = await neurolink.generate({
  prompt: "What is my favorite hobby?",
  context: {
    sessionId: "user-123",
    userId: "alice",
  },
});
// Response: "Based on what you told me, your favorite hobby is reading books!"
```

### Streaming Support

The conversation memory system now fully supports streaming responses with the same memory persistence:

```javascript

const neurolink = new NeuroLink({
  conversationMemory: { enabled: true },
});

// Stream a response - memory is AUTOMATICALLY captured in background!
const streamResult = await neurolink.stream({
  input: { text: "My favorite hobby is photography" },
  provider: "vertex",
  context: {
    sessionId: "photo-session",
    userId: "photographer",
  },
});

// OPTIONAL: Consume the stream for real-time display
// Memory is saved automatically regardless of whether you consume the stream
let response = "";
for await (const chunk of streamResult.stream) {
  response += chunk.content;
  process.stdout.write(chunk.content); // Real-time display
}

// Memory works even without consuming the stream!
// Both user input AND AI response are automatically stored

// Follow-up message will remember the streamed conversation
const followUp = await neurolink.generate({
  input: { text: "What hobby did I mention?" },
  provider: "vertex",
  context: {
    sessionId: "photo-session", // Same session
    userId: "photographer",
  },
});
// Response: "You mentioned that your favorite hobby is photography!"
```

### Mixed Generate/Stream Conversations

You can seamlessly mix `generate()` and `stream()` calls within the same conversation:

```javascript
// Start with generate
await neurolink.generate({
  input: { text: "I work as a software engineer" },
  context: { sessionId: "career-chat" },
});

// Continue with stream
const streamResult = await neurolink.stream({
  input: { text: "I specialize in AI development" },
  context: { sessionId: "career-chat" },
});

// Back to generate - AI remembers both previous messages
const summary = await neurolink.generate({
  input: { text: "Summarize what you know about my career" },
  context: { sessionId: "career-chat" },
});
// Response includes both software engineering and AI development details
```

### Session Isolation Example

```javascript
// Session 1
await neurolink.generate({
  prompt: "My favorite color is blue",
  context: { sessionId: "session-1" },
});

// Session 2 - completely isolated
await neurolink.generate({
  prompt: "What is my favorite color?",
  context: { sessionId: "session-2" },
});
// Response: "I don't have information about your favorite color..."
```

##  Memory Management

### Turn Limits

When the number of conversation turns exceeds `maxTurnsPerSession`, older messages are automatically removed:

```javascript
// With maxTurnsPerSession: 3
// Turn 1: User + AI response (2 messages)
// Turn 2: User + AI response (4 messages total)
// Turn 3: User + AI response (6 messages total)
// Turn 4: User + AI response (6 messages - oldest turn removed)
```

### Session Limits

When the number of active sessions exceeds `maxSessions`, the least recently used sessions are removed:

```javascript
// With maxSessions: 2
// Session 1: Active
// Session 2: Active
// Session 3: Created -> Session 1 (least recent) is removed
```

##  API Reference

### Memory Statistics

```javascript
// Get memory usage statistics
const stats = await neurolink.getConversationStats();
console.log(stats);
// Output: { totalSessions: 3, totalTurns: 15 }
```

### Session Management

```javascript
// Clear specific session
const cleared = await neurolink.clearConversationSession("session-123");
console.log(cleared); // true if session existed and was cleared

// Clear all conversations
await neurolink.clearAllConversations();
```

##  Test Results

The conversation memory system has been thoroughly tested and validated:

### ✅ Test Suite Results

| Test Case             | Status  | Description                                     |
| --------------------- | ------- | ----------------------------------------------- |
| **Basic Memory**      | ✅ PASS | AI correctly remembers information across turns |
| **Session Isolation** | ✅ PASS | Sessions remain completely separate             |
| **Turn Limits**       | ✅ PASS | Automatic cleanup when limits exceeded          |
| **Session Limits**    | ✅ PASS | LRU eviction of old sessions                    |
| **API Functions**     | ✅ PASS | Clear operations work correctly                 |

### Example Test Output

```
 NeuroLink Conversation Memory - Quick Test

 TEST 1: Basic Memory Functionality
----------------------------------
 User: My name is Alice
 AI: Hello Alice! It's nice to meet you. How can I help you today?

 User: What is my name?
 AI: Your name is Alice! You introduced yourself to me in your previous message.
✅ Memory Test: PASS - Remembers name correctly

 TEST 2: Session Isolation
----------------------------------------
 User (different session): Do you know Alice?
 AI: I don't know a specific person named Alice...
✅ Isolation Test: PASS - Sessions properly isolated

 OVERALL: ✅ ALL TESTS PASSED
```

##  Best Practices

### 1. Session ID Strategy

```javascript
// Use consistent session IDs for the same conversation
const sessionId = `user-${userId}-${conversationId}`;

// Include user ID for better tracking
const context = {
  sessionId: sessionId,
  userId: userId,
};
```

### 2. Memory Limits

```javascript
// For chat applications
const chatConfig = {
  maxSessions: 100, // Support many users
  maxTurnsPerSession: 50, // Long conversations
};

// For short interactions
const quickConfig = {
  maxSessions: 20, // Fewer concurrent users
  maxTurnsPerSession: 10, // Brief exchanges
};
```

### 3. Error Handling

```javascript
try {
  const response = await neurolink.generate({
    prompt: "Hello",
    context: { sessionId: "test-session" },
  });
} catch (error) {
  console.error("Generation failed:", error);
  // Memory operations are designed to fail gracefully
  // Generation will continue without memory if needed
}
```

##  Technical Implementation

### Architecture

```
┌─────────────────────┐
│   NeuroLink SDK     │
└─────────┬───────────┘
          │
┌─────────▼───────────┐
│ ConversationMemory  │
│     Manager         │
└─────────┬───────────┘
          │
┌─────────▼───────────┐
│   In-Memory Store   │
│  (Map) │
└─────────────────────┘
```

### Message Format

```typescript
type ChatMessage = {
  role: "user" | "assistant" | "system";
  content: string;
};

// Internal storage format
type SessionMemory = {
  sessionId: string;
  userId?: string;
  messages: ChatMessage[];
  createdAt: number;
  lastActivity: number;
};
```

##  Troubleshooting

### Common Issues

**Memory not persisting between calls**

- Ensure `sessionId` is consistent across calls
- Verify `conversationMemory.enabled` is true
- Check that `sessionId` is a valid string

**Performance issues with large conversations**

- Reduce `maxTurnsPerSession` limit
- Implement session cleanup strategies
- Monitor memory usage statistics

**Session isolation not working**

- Verify different `sessionId` values are being used
- Check for session ID conflicts or duplicates

### Debug Logging

```javascript
// Enable debug logging to see memory operations
const neurolink = new NeuroLink({
  conversationMemory: { enabled: true },
  debug: true, // Enables detailed logging
});
```

##  Related Documentation

- **[Redis Conversation Export](/docs/features/conversation-history)** - Export session history as JSON for analytics (Q4 2025)
- [API Reference](/docs/sdk/api-reference) - Complete SDK documentation
- [Configuration](/docs/deployment/configuration) - Environment setup guide
- [Examples](/docs/guides/examples/use-cases) - More usage examples
- [Testing Guide](/docs/development/testing) - How to test conversation memory

##  Performance Characteristics

- **Memory Usage**: ~1KB per conversation turn
- **Lookup Time**: O(1) for session retrieval
- **Cleanup Time**: O(n) for session limit enforcement
- **Concurrency**: Thread-safe in-memory operations

The conversation memory system is designed for production use with efficient memory management and robust error handling.

---

## NeuroLink Mem0 Memory Integration

<!-- Source: memory/mem0.md -->

# NeuroLink Mem0 Memory Integration

## Overview

NeuroLink now includes advanced memory capabilities powered by Mem0, enabling AI conversations to remember context across sessions and maintain user-specific memory isolation. This integration provides semantic memory storage and retrieval using vector databases for long-term conversation continuity.

## Features

- ✅ **Cross-Session Memory**: Remember conversations across different sessions
- ✅ **User Isolation**: Separate memory contexts for different users
- ✅ **Semantic Search**: Vector-based memory retrieval using embeddings
- ✅ **Multiple Vector Stores**: Support for Qdrant, Chroma, and more
- ✅ **Streaming Integration**: Memory-aware streaming responses
- ✅ **Background Storage**: Non-blocking memory operations
- ✅ **Configurable Search**: Customize memory retrieval parameters

## Architecture

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   NeuroLink     │    │      Mem0       │    │  Vector Store   │
│                 │───▶│                 │───▶│   (Qdrant)      │
│ generate()/     │    │ Memory Provider │    │                 │
│ stream()        │    │                 │    │ Embeddings +    │
└─────────────────┘    └─────────────────┘    │ Semantic Search │
                                              └─────────────────┘
```

## Configuration

### Basic Configuration

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    provider: "mem0",
    mem0Enabled: true,
    mem0Config: {
      vectorStore: {
        provider: "qdrant",
        type: "qdrant",
        collection: "neurolink_memories",
        dimensions: 768, // Must match your embedding model
        host: "localhost",
        port: 6333,
      },
      model: "gemini-2.0-flash-exp",
      llmProvider: "google",
      embeddings: {
        provider: "google",
        model: "text-embedding-004", // 768 dimensions
      },
      search: {
        maxResults: 5,
        timeoutMs: 50000,
      },
      storage: {
        timeoutMs: 30000,
      },
    },
  },
  providers: {
    google: {
      apiKey: process.env.GEMINI_API_KEY,
    },
  },
});
```

### Vector Store Options

#### Qdrant Configuration

```typescript
vectorStore: {
  provider: 'qdrant',
  type: 'qdrant',
  collection: 'my_memories',
  dimensions: 768,
  host: 'localhost',
  port: 6333,
  // OR use URL instead of host+port
  endpoint: 'http://localhost:6333'
}
```

#### Chroma Configuration

```typescript
vectorStore: {
  provider: 'chroma',
  type: 'chroma',
  collection: 'my_memories',
  dimensions: 768,
  host: 'localhost',
  port: 8000
}
```

### Embedding Provider Options

#### Google Embeddings (768 dimensions)

```typescript
embeddings: {
  provider: "google",
  model: "text-embedding-004"
}
```

#### OpenAI Embeddings (1536 dimensions)

```typescript
embeddings: {
  provider: "openai",
  model: "text-embedding-3-small"
}
```

## Usage Examples

### Basic Memory with Generate

```typescript
// First conversation - storing user preferences
const response1 = await neurolink.generate({
  input: {
    text: "Hi! I'm Alice, a frontend developer. I love React and JavaScript.",
  },
  context: {
    userId: "alice_123",
    sessionId: "session_1",
  },
  provider: "vertex",
  model: "claude-sonnet-4@20250514",
});

console.log(response1.content);
// AI introduces itself and acknowledges Alice's background

// Later conversation - memory retrieval
const response2 = await neurolink.generate({
  input: {
    text: "What programming languages do I work with?",
  },
  context: {
    userId: "alice_123", // Same user
    sessionId: "session_2", // Different session
  },
  provider: "vertex",
  model: "claude-sonnet-4@20250514",
});

console.log(response2.content);
// AI recalls: "You work with React and JavaScript"
```

### User Isolation Example

```typescript
// Alice's context
const aliceResponse = await neurolink.generate({
  input: {
    text: "I work at TechCorp as a senior frontend developer",
  },
  context: {
    userId: "alice_123",
    sessionId: "alice_session",
  },
});

// Bob's context (separate user)
const bobResponse = await neurolink.generate({
  input: {
    text: "I work at DataCorp as a machine learning engineer",
  },
  context: {
    userId: "bob_456", // Different user
    sessionId: "bob_session",
  },
});

// Bob queries his info - only sees his own memories
const bobQuery = await neurolink.generate({
  input: {
    text: "Where do I work and what's my role?",
  },
  context: {
    userId: "bob_456",
  },
});
// Returns: "DataCorp, machine learning engineer"
// Does NOT return Alice's TechCorp information
```

### Streaming with Memory Context

```typescript
// Create stream with memory-aware responses
const stream = await neurolink.stream({
  input: {
    text: "Tell me a story about a programmer",
  },
  context: {
    userId: "alice_123", // Alice's context
    sessionId: "story_session",
  },
  provider: "vertex",
  model: "gemini-2.5-flash",
  streaming: {
    enabled: true,
    enableProgress: true,
  },
});

// Process streaming chunks
let fullResponse = "";
for await (const chunk of stream) {
  if (chunk.type === "content") {
    fullResponse += chunk.content;
    process.stdout.write(chunk.content);
  }
}

// The story will be personalized based on Alice's
// previously stored context (React, JavaScript, TechCorp)
```

### Advanced Memory Search

```typescript
// Configure custom search parameters
const neurolink = new NeuroLink({
  conversationMemory: {
    // ... other config
    mem0Config: {
      // ... other config
      search: {
        maxResults: 10, // Retrieve more memories
        timeoutMs: 60000, // Longer timeout
        minScore: 0.3, // Minimum relevance score
      },
    },
  },
});
```

## Memory Storage Process

### Automatic Storage

Memory storage happens automatically after each conversation:

1. **Conversation Turn Creation**: Input + output combined
2. **Background Processing**: Memory stored asynchronously
3. **Vector Embedding**: Text converted to embeddings
4. **Storage**: Saved to vector database with user context
5. **Indexing**: Available for future retrieval

### Storage Format

```typescript
// Stored conversation turn structure
{
  messages: [
    { role: "user", content: "User's input text" },
    { role: "assistant", content: "AI's response" }
  ],
  metadata: {
    session_id: "session_123",
    user_id: "alice_123",
    timestamp: "2025-01-15T10:30:00Z",
    type: "conversation_turn"
  }
}
```

## Memory Retrieval Process

### Semantic Search Flow

1. **Query Processing**: User input analyzed for context
2. **Embedding Generation**: Query converted to vector
3. **Similarity Search**: Vector database search
4. **Relevance Filtering**: Results above threshold kept
5. **Context Injection**: Relevant memories added to prompt

### Context Enhancement

Retrieved memories are seamlessly integrated:

```typescript
// Original prompt
"What framework should I learn?"
// Enhanced with memory context
`Based on your background as a React developer at TechCorp who loves JavaScript:

What framework should I learn?

Relevant context from previous conversations:
- You're a senior frontend developer
- You work with React and JavaScript
- You're employed at TechCorp`;
```

## Testing Memory Integration

### Complete Test Example

```typescript

async function testMemoryIntegration() {
  const neurolink = new NeuroLink({
    conversationMemory: {
      enabled: true,
      provider: "mem0",
      mem0Enabled: true,
      mem0Config: {
        vectorStore: {
          provider: "qdrant",
          type: "qdrant",
          collection: "test_memories",
          dimensions: 768,
          host: "localhost",
          port: 6333,
        },
        embeddings: {
          provider: "google",
          model: "text-embedding-004",
        },
      },
    },
    providers: {
      google: { apiKey: process.env.GEMINI_API_KEY },
    },
  });

  // Step 1: Store initial context
  console.log("Step 1: Storing user context...");
  await neurolink.generate({
    input: {
      text: "I'm a data scientist working with Python and PyTorch",
    },
    context: {
      userId: "test_user",
      sessionId: "session_1",
    },
  });

  // Wait for indexing
  await new Promise((resolve) => setTimeout(resolve, 5000));

  // Step 2: Test memory recall
  console.log("Step 2: Testing memory recall...");
  const response = await neurolink.generate({
    input: {
      text: "What programming language do I use?",
    },
    context: {
      userId: "test_user",
      sessionId: "session_2", // Different session
    },
  });

  console.log("AI Response:", response.content);
  // Should mention Python and PyTorch

  // Step 3: Test streaming with memory
  console.log("Step 3: Testing streaming with memory...");
  const stream = await neurolink.stream({
    input: {
      text: "Give me coding tips for my expertise area",
    },
    context: {
      userId: "test_user",
      sessionId: "session_3",
    },
    streaming: { enabled: true },
  });

  for await (const chunk of stream) {
    if (chunk.type === "content") {
      process.stdout.write(chunk.content);
    }
  }
  // Should provide Python/PyTorch specific tips
}

testMemoryIntegration();
```

## Performance Considerations

### Memory Storage

- **Background Processing**: Storage doesn't block response generation
- **Timeout Handling**: Configurable timeouts prevent hanging
- **Error Resilience**: Failures don't affect conversation flow

### Memory Retrieval

- **Fast Search**: Vector similarity search is typically \ ({
  vectorStore: {
    collection: `memories_${tenantId}`, // Separate collections per tenant
    // ... other config
  },
});
```

### 3. Performance Monitoring

```typescript
// Monitor memory operations
const startTime = Date.now();
const response = await neurolink.generate(options);
const memoryTime = Date.now() - startTime;
console.log(`Memory-enhanced response time: ${memoryTime}ms`);
```

### 4. Graceful Degradation

```typescript
// Always handle memory failures gracefully
const memoryConfig = {
  enabled: true,
  provider: "mem0",
  // Add fallback configuration
  fallbackOnError: true,
  maxRetries: 2,
};
```

## Troubleshooting

### Debug Mode

Enable debug logging for memory operations:

```typescript
// Set environment variable
process.env.NEUROLINK_DEBUG_MEMORY = "true";

// Or configure in code (development only)
const neurolink = new NeuroLink({
  conversationMemory: {
    // ... config
    debug: true, // Development only
  },
});
```

### Vector Store Health Check

```bash
# Check Qdrant status
curl -s http://localhost:6333/health

# List collections
curl -s http://localhost:6333/collections

# Check collection info
curl -s http://localhost:6333/collections/your_collection_name
```

### Memory Verification

```typescript
// Test memory storage and retrieval
async function verifyMemory(neurolink, userId) {
  // Store test data
  await neurolink.generate({
    input: { text: "Remember: I like pizza" },
    context: { userId },
  });

  // Wait for indexing
  await new Promise((resolve) => setTimeout(resolve, 2000));

  // Test retrieval
  const response = await neurolink.generate({
    input: { text: "What food do I like?" },
    context: { userId },
  });

  console.log("Memory test result:", response.content);
  // Should mention pizza
}
```

## Conclusion

The NeuroLink Mem0 integration provides powerful memory capabilities that enable truly conversational AI experiences. With proper configuration and usage patterns, you can build applications that remember user context across sessions while maintaining privacy and performance.

For additional support or advanced use cases, refer to the [Mem0 documentation](https://docs.mem0.ai/) and [NeuroLink examples](/docs/guides/examples/use-cases).

---

## Automatic Conversation Summarization

<!-- Source: memory/summarization.md -->

#  Automatic Conversation Summarization

NeuroLink includes a powerful feature for automatic context summarization, designed to enable long-running, stateful conversations without exceeding AI provider token limits. This feature is part of the **Conversation Memory** system.

## Overview

When building conversational agents, the history of the conversation can quickly grow too large for the AI model's context window. Manually managing this history is complex and error-prone. The Automatic Conversation Summarization feature handles this for you.

When enabled, the `NeuroLink` instance will keep track of the entire conversation for each session. If a conversation's length (measured in turns) exceeds a configurable limit, the feature will automatically use an AI model to summarize the history. This summary then replaces the older parts of the conversation, preserving the essential context while keeping the overall history size manageable.

## How to Use

The feature is part of the `conversationMemory` system and is enabled and configured in the `NeuroLink` constructor.

### Enabling Summarization

To enable the feature, you must enable both `conversationMemory` and `enableSummarization` in the constructor configuration.

```typescript

// Enable conversation memory and summarization with default settings
const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    enableSummarization: true,
  },
});

// All generate calls with a sessionId will now be context-aware and summarize automatically
await neurolink.generate({
  input: { text: "This is the first turn." },
  context: { sessionId: "session-123" },
});
```

### Custom Configuration

You can easily override the default settings by providing more options in the configuration object.

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    enableSummarization: true,
    // Trigger summarization when turn count exceeds 15
    summarizationThresholdTurns: 15,
    // Keep the last 5 turns and summarize the rest
    summarizationTargetTurns: 5,
    // Use a specific provider and model for the summarization task
    summarizationProvider: "openai",
    summarizationModel: "gpt-4o-mini",
  },
});
```

## Configuration Options

The `conversationMemory` configuration object accepts the following properties related to summarization:

- `enableSummarization: boolean`
  - **Description**: Set to `true` to enable the automatic summarization feature. `enabled` must also be `true`.
  - **Default**: `false`

- `summarizationThresholdTurns: number`
  - **Description**: The number of turns after which summarization should be triggered.
  - **Default**: `20`
  - **Note**: This is a **legacy option**. The newer `SummarizationEngine` uses token-based thresholds instead of turn counts. See [Token-Based vs Turn-Based Summarization](#token-based-vs-turn-based-summarization) below.

- `summarizationTargetTurns: number`
  - **Description**: The number of recent turns to _keep_ when a summary is created. The older turns will be replaced by the summary.
  - **Default**: `10`
  - **Note**: This is a **legacy option**. The token-based engine calculates the split point dynamically using a `RECENT_MESSAGES_RATIO` (default 30% of the threshold) rather than a fixed turn count.

- `tokenThreshold: number`
  - **Description**: Token-based threshold that triggers summarization. When the estimated token count of context messages exceeds this value, summarization is triggered automatically. If not set, the threshold is calculated as 80% of the model's available input tokens (looked up from the context window registry).
  - **Default**: Computed from the model's context window, or `50000` as a fallback for unknown models. Can be overridden via the `NEUROLINK_TOKEN_THRESHOLD` environment variable.

- `summarizationModel: string`
  - **Description**: The specific AI model to use for the summarization task. It's recommended to use a fast and cost-effective model.
  - **Default**: `"gemini-2.5-flash"`

- `summarizationProvider: string`
  - **Description**: The AI provider to use for the summarization task.
  - **Default**: `"vertex"`

## Order of Operations

To prevent race conditions and ensure correct context management, the system follows a strict order of operations after each AI response is generated:

1.  The new turn (user prompt + AI response) is added to the session's history.
2.  The system checks if the total number of turns now exceeds `summarizationThresholdTurns`.
3.  If it does, the oldest turns are summarized, and the history is replaced with a `system` message containing the summary, followed by the most recent turns (as defined by `summarizationTargetTurns`).
4.  Finally, the system checks if the total number of turns exceeds `maxTurnsPerSession` and truncates the oldest messages if necessary.

This ensures that summarization always happens _before_ simple truncation, preserving the context of long conversations.

## Context Compaction System

The turn-based summarization described above is now complemented by a full
**Context Compaction System** that operates at the token level rather than the
turn level. See the [Context Compaction Guide](/docs/features/context-compaction)
for the complete specification.

The compaction system provides a 4-stage reduction pipeline:

1. **Tool Output Pruning** -- replaces old tool results with lightweight placeholders.
2. **File Read Deduplication** -- keeps only the latest read of each file path.
3. **LLM Summarization** -- produces a structured 9-section summary with iterative merging.
4. **Sliding Window Truncation** -- non-destructive tagging of the oldest messages.

Key components:

- **BudgetChecker** (`src/lib/context/budgetChecker.ts`) validates that the context fits
  within the model's window before every LLM call. When usage exceeds 80 %, it
  automatically triggers compaction.
- **ContextCompactor** (`src/lib/context/contextCompactor.ts`) orchestrates the
  multi-stage pipeline described above.
- **`getContextStats()` API** returns live token counts, capacity, and per-stage
  reduction metrics so callers can monitor context health programmatically.

## SummarizationEngine

The `SummarizationEngine` class (`src/lib/context/summarizationEngine.ts`) is the shared, centralized engine used by both `ConversationMemoryManager` (in-memory) and `RedisConversationMemoryManager` (Redis-backed). It was extracted from those two managers to eliminate code duplication and ensure consistent summarization behavior regardless of the storage backend.

The engine is responsible for:

- **Token-based threshold checking** — it estimates the total token count of a session's context messages (using `TokenUtils.estimateTokenCount`) and compares it against a configurable threshold. If the count exceeds the threshold, summarization is triggered.
- **Split-point calculation** — rather than using a fixed turn count, the engine works backwards from the most recent message to find a split point based on a target token budget for recent messages (controlled by `RECENT_MESSAGES_RATIO`, default 30% of the threshold). Messages before the split point are summarized; messages after it are kept as-is.
- **Pointer-based, non-destructive summarization** — the engine tracks which messages have already been summarized via a `summarizedUpToMessageId` pointer on the session. Original messages are never deleted; the pointer simply advances forward as new summaries are generated.
- **Delegating to `generateSummary()`** — the actual LLM call to produce the summary text is handled by the `generateSummary()` utility in `src/lib/utils/conversationMemory.ts`, which constructs the structured prompt and invokes the configured summarization provider/model.

### Usage

Both memory managers call `SummarizationEngine.checkAndSummarize()` after storing each new conversation turn:

```typescript
const engine = new SummarizationEngine();
const wasSummarized = await engine.checkAndSummarize(
  session, // SessionMemory object
  threshold, // Token threshold (e.g. 80% of context window)
  config, // ConversationMemoryConfig
  "[MyManager]", // Log prefix
);
```

## Structured Summary: The 9-Section Format

When summarization runs, the conversation history is distilled into a structured summary with exactly **9 sections**. This structure is defined in `src/lib/context/prompts/summarizationPrompt.ts` and ensures that summaries are comprehensive, consistent, and easy for the AI to consume as context.

The 9 sections are:

1. **Primary Request and Intent** — What is the user's main goal or request? What are they trying to accomplish?
2. **Key Technical Concepts** — What technologies, frameworks, patterns, or concepts are central to this conversation?
3. **Files and Code Sections** — What specific files, functions, or code sections have been discussed or modified?
4. **Problem Solving** — What problems were identified? What solutions were attempted or implemented?
5. **Pending Tasks** — What tasks remain incomplete or need follow-up?
6. **Task Evolution** — How has the task changed or evolved during the conversation?
7. **Current Work** — What is being actively worked on right now?
8. **Next Step** — What is the immediate next action to take?
9. **Required Files** — What files will need to be accessed or modified to continue?

If a section is not applicable to the conversation, the summarizer writes "N/A" for that section. The prompt also supports an optional **File Context** addendum listing files read and files modified during the conversation, which is appended to the prompt when available.

## Incremental Merge Mode

When summarization runs **more than once** during a long conversation, the system uses an **incremental merge** strategy to avoid information loss. This is controlled by the `isIncremental` flag and `previousSummary` field in the `SummarizationPromptOptions` interface.

Here is how it works:

1. On the **first** summarization, an initial prompt is used that asks the LLM to analyze the conversation and produce a fresh 9-section summary.
2. On **subsequent** summarizations, the prompt switches to incremental mode. The existing summary is included verbatim in the prompt under an "Existing Summary" block, and the LLM is instructed to **merge** the new conversation content into the existing sections.
3. The merge instructions tell the LLM to:
   - Review the existing summary
   - Analyze the new conversation content
   - Merge new information into the appropriate sections
   - Update sections with relevant new information
   - Remove information that is no longer relevant
   - Keep the summary concise but comprehensive
   - Maintain the 9-section format

This incremental approach means that context accumulated over many summarization cycles is preserved and refined, rather than being discarded and regenerated from scratch each time. The `createSummarizationPrompt()` function in `src/lib/utils/conversationMemory.ts` handles this automatically — it checks whether a `previousSummary` exists on the session and sets `isIncremental: true` when one is present.

## Token-Based vs Turn-Based Summarization

The original summarization system used a **turn-based** approach: summarization was triggered when the number of conversation turns exceeded `summarizationThresholdTurns` (default: 20), and a fixed number of recent turns (`summarizationTargetTurns`, default: 10) were kept.

The newer `SummarizationEngine` replaces this with a **token-based** approach:

| Aspect               | Turn-Based (Legacy)                              | Token-Based (Current)                                                                                |
| -------------------- | ------------------------------------------------ | ---------------------------------------------------------------------------------------------------- |
| **Trigger**          | Turn count exceeds `summarizationThresholdTurns` | Estimated token count exceeds `tokenThreshold`                                                       |
| **What to keep**     | Fixed `summarizationTargetTurns` recent turns    | Dynamic split point calculated from `RECENT_MESSAGES_RATIO` (30% of threshold in tokens)             |
| **Threshold source** | Hardcoded default (20 turns)                     | Computed from model's context window (80% of available input tokens) via `getAvailableInputTokens()` |
| **Fallback**         | N/A                                              | `50000` tokens if model context window is unknown                                                    |
| **Override**         | Constructor config only                          | `NEUROLINK_TOKEN_THRESHOLD` env var, session-level override, or constructor config                   |

**Why the change?** Turn counting is a poor proxy for actual context window usage. A single turn with a large code block or document attachment may consume far more tokens than 10 short chat turns. Token-based thresholds align summarization decisions with the actual constraint that matters: the model's context window size.

The legacy turn-based configuration options (`summarizationThresholdTurns`, `summarizationTargetTurns`, `maxTurnsPerSession`) are still accepted for backward compatibility but are marked as deprecated. New integrations should use the token-based `tokenThreshold` configuration or rely on the automatic model-aware defaults.

---

# Observability

## Health Monitoring & Auto-Recovery Guide

<!-- Source: observability/health-monitoring.md -->

# Health Monitoring & Auto-Recovery Guide

> ⚠️ **PLANNED FEATURE**: This documentation describes features that are planned but not yet implemented. The `HealthMonitor` class referenced in this guide does not currently exist in the codebase. The code examples are illustrative of the intended API design.

**NeuroLink Enhanced MCP Platform - Health Monitoring**

## ️ **Architecture & Components**

### **Connection Status States**

```typescript
export enum ConnectionStatus {
  DISCONNECTED = "DISCONNECTED", // No connection established
  CONNECTING = "CONNECTING", // Connection attempt in progress
  CONNECTED = "CONNECTED", // Successfully connected and operational
  CHECKING = "CHECKING", // Health check in progress
  ERROR = "ERROR", // Connection error detected
  RECOVERING = "RECOVERING", // Auto-recovery in progress
}
```

### **Health Monitor Core**

```typescript
export class HealthMonitor extends EventEmitter {
  private healthCheckTimers: Map = new Map();
  private serverStatus: Map = new Map();
  private recoveryAttempts: Map = new Map();

  async startHealthMonitoring(registry: MCPRegistry): Promise {
    const servers = await registry.listServers();

    for (const serverId of servers) {
      await this.initializeServerMonitoring(serverId);
    }

    this.emit("monitoring-started", { serverCount: servers.length });
  }

  async performHealthCheck(serverId: string): Promise {
    const startTime = Date.now();

    try {
      this.updateServerStatus(serverId, ConnectionStatus.CHECKING);

      const server = await this.registry.getServer(serverId);
      await server.ping(); // Custom ping implementation

      const result: HealthCheckResult = {
        success: true,
        status: ConnectionStatus.CONNECTED,
        latency: Date.now() - startTime,
        timestamp: Date.now(),
      };

      this.updateServerStatus(serverId, ConnectionStatus.CONNECTED);
      this.emit("health-check-success", { serverId, result });

      return result;
    } catch (error) {
      const result: HealthCheckResult = {
        success: false,
        status: ConnectionStatus.ERROR,
        error: error as Error,
        timestamp: Date.now(),
      };

      this.updateServerStatus(serverId, ConnectionStatus.ERROR);
      this.emit("health-check-failed", { serverId, result });

      // Trigger auto-recovery
      await this.attemptRecovery(serverId);

      return result;
    }
  }
}
```

### **Health Check Interface**

```typescript
export type HealthCheckResult = {
  success: boolean;
  status: ConnectionStatus;
  message?: string;
  latency?: number;
  error?: Error;
  timestamp: number;
  metadata?: {
    serverVersion?: string;
    capabilities?: string[];
    resourceUsage?: ResourceMetrics;
  };
};

export type ServerHealth = {
  serverId: string;
  status: ConnectionStatus;
  lastHealthCheck: HealthCheckResult;
  healthHistory: HealthCheckResult[];
  recoveryAttempts: number;
  uptime: number;
  lastSuccessfulConnection: number;
};
```

---

##  **Auto-Recovery Mechanisms**

### **Intelligent Recovery Logic**

```typescript
export class RecoveryManager {
  private maxRecoveryAttempts: number = 3;
  private baseRetryInterval: number = 5000; // 5 seconds
  private maxRetryInterval: number = 60000; // 1 minute

  async attemptRecovery(serverId: string): Promise {
    const attempts = this.recoveryAttempts.get(serverId) || 0;

    if (attempts >= this.maxRecoveryAttempts) {
      console.warn(`Max recovery attempts reached for server ${serverId}`);
      this.emit("recovery-failed", { serverId, attempts });
      return false;
    }

    this.updateServerStatus(serverId, ConnectionStatus.RECOVERING);
    this.recoveryAttempts.set(serverId, attempts + 1);

    // Exponential backoff with jitter
    const delay =
      Math.min(
        this.baseRetryInterval * Math.pow(2, attempts),
        this.maxRetryInterval,
      ) +
      Math.random() * 1000; // Add jitter

    await new Promise((resolve) => setTimeout(resolve, delay));

    try {
      await this.reconnectServer(serverId);

      // Reset recovery attempts on success
      this.recoveryAttempts.delete(serverId);
      this.updateServerStatus(serverId, ConnectionStatus.CONNECTED);

      this.emit("recovery-success", { serverId, attempts: attempts + 1 });
      return true;
    } catch (error) {
      console.error(
        `Recovery attempt ${attempts + 1} failed for ${serverId}:`,
        error,
      );

      // Schedule next recovery attempt
      setTimeout(() => {
        this.attemptRecovery(serverId);
      }, delay);

      return false;
    }
  }
}
```

### **Connection Lifecycle Management**

```typescript
// State transition logic
const connectionLifecycle = {
  async connect(serverId: string): Promise {
    this.updateStatus(serverId, ConnectionStatus.CONNECTING);

    try {
      await this.establishConnection(serverId);
      this.updateStatus(serverId, ConnectionStatus.CONNECTED);
      this.startPeriodicHealthChecks(serverId);
    } catch (error) {
      this.updateStatus(serverId, ConnectionStatus.ERROR);
      await this.attemptRecovery(serverId);
    }
  },

  async disconnect(serverId: string): Promise {
    this.stopHealthChecks(serverId);
    await this.closeConnection(serverId);
    this.updateStatus(serverId, ConnectionStatus.DISCONNECTED);
  },
};
```

---

##  **Usage Examples**

### **Basic Health Monitoring Setup**

```typescript

// Initialize health monitor
const healthMonitor = new HealthMonitor({
  healthCheckInterval: 30000, // 30 seconds
  recoveryRetryInterval: 5000, // 5 seconds
  maxRecoveryAttempts: 3,
  enableEventLogging: true,
});

// Start monitoring all servers
await healthMonitor.startHealthMonitoring(mcpRegistry);

// Listen for health events
healthMonitor.on("health-check-failed", ({ serverId, result }) => {
  console.warn(`Health check failed for ${serverId}:`, result.error?.message);
});

healthMonitor.on("recovery-success", ({ serverId, attempts }) => {
  console.log(`Server ${serverId} recovered after ${attempts} attempts`);
});
```

### **Custom Health Check Implementation**

```typescript
// Implement custom health checks
class CustomHealthMonitor extends HealthMonitor {
  async performAdvancedHealthCheck(
    serverId: string,
  ): Promise {
    const startTime = Date.now();

    try {
      const server = await this.registry.getServer(serverId);

      // Basic connectivity check
      await server.ping();

      // Advanced checks
      const capabilities = await server.listCapabilities();
      const resourceUsage = await server.getResourceUsage();
      const version = await server.getVersion();

      return {
        success: true,
        status: ConnectionStatus.CONNECTED,
        latency: Date.now() - startTime,
        timestamp: Date.now(),
        metadata: {
          serverVersion: version,
          capabilities: capabilities,
          resourceUsage: resourceUsage,
        },
      };
    } catch (error) {
      return {
        success: false,
        status: ConnectionStatus.ERROR,
        error: error as Error,
        timestamp: Date.now(),
      };
    }
  }
}
```

### **Health-Aware Tool Execution**

```typescript
// Execute tools with health awareness
const healthAwareExecution = async (
  toolName: string,
  args: any,
  context: any,
) => {
  const serverId = await registry.getServerForTool(toolName);
  const serverHealth = await healthMonitor.getServerHealth(serverId);

  if (serverHealth.status !== ConnectionStatus.CONNECTED) {
    // Try to recover connection first
    const recovered = await healthMonitor.attemptRecovery(serverId);

    if (!recovered) {
      throw new Error(`Server ${serverId} is unavailable for tool ${toolName}`);
    }
  }

  // Execute tool with health monitoring
  try {
    const result = await registry.executeTool(toolName, args, context);

    // Update health status on successful execution
    healthMonitor.recordSuccessfulOperation(serverId);

    return result;
  } catch (error) {
    // Report health issue on tool execution failure
    healthMonitor.recordFailedOperation(serverId, error);
    throw error;
  }
};
```

---

##  **Health Analytics & Monitoring**

### **Health Metrics Collection**

```typescript
type HealthMetrics = {
  serverCount: number;
  healthyServers: number;
  unhealthyServers: number;
  recoveringServers: number;
  averageLatency: number;
  uptimePercentage: number;
  totalHealthChecks: number;
  failedHealthChecks: number;
  successfulRecoveries: number;
  failedRecoveries: number;
};

export class HealthAnalytics {
  async collectHealthMetrics(): Promise {
    const allServers = await this.healthMonitor.getAllServerHealth();
    const now = Date.now();

    const healthyServers = allServers.filter(
      (s) => s.status === ConnectionStatus.CONNECTED,
    ).length;
    const unhealthyServers = allServers.filter(
      (s) => s.status === ConnectionStatus.ERROR,
    ).length;
    const recoveringServers = allServers.filter(
      (s) => s.status === ConnectionStatus.RECOVERING,
    ).length;

    const totalLatency = allServers.reduce((sum, server) => {
      return sum + (server.lastHealthCheck.latency || 0);
    }, 0);

    const uptimePercentage =
      allServers.reduce((sum, server) => {
        const uptime =
          (now - server.lastSuccessfulConnection) / (now - server.createdAt);
        return sum + Math.max(0, Math.min(1, uptime));
      }, 0) / allServers.length;

    return {
      serverCount: allServers.length,
      healthyServers,
      unhealthyServers,
      recoveringServers,
      averageLatency: totalLatency / allServers.length,
      uptimePercentage,
      totalHealthChecks: this.getTotalHealthChecks(),
      failedHealthChecks: this.getFailedHealthChecks(),
      successfulRecoveries: this.getSuccessfulRecoveries(),
      failedRecoveries: this.getFailedRecoveries(),
    };
  }
}
```

### **Real-time Health Dashboard**

```typescript
// Real-time health monitoring dashboard
export class HealthDashboard {
  private metrics: HealthMetrics;
  private updateInterval: NodeJS.Timeout;

  start(): void {
    this.updateInterval = setInterval(async () => {
      await this.updateDashboard();
    }, 5000); // Update every 5 seconds
  }

  private async updateDashboard(): Promise {
    this.metrics = await this.healthAnalytics.collectHealthMetrics();

    console.clear();
    console.log("=== NeuroLink MCP Health Dashboard ===");
    console.log(
      `Servers: ${this.metrics.healthyServers}/${this.metrics.serverCount} healthy`,
    );
    console.log(`Average Latency: ${this.metrics.averageLatency.toFixed(2)}ms`);
    console.log(`Uptime: ${(this.metrics.uptimePercentage * 100).toFixed(2)}%`);
    console.log(
      `Recovery Success Rate: ${this.getRecoverySuccessRate().toFixed(2)}%`,
    );

    // Display server status
    const serverHealth = await this.healthMonitor.getAllServerHealth();
    console.log("\nServer Status:");
    serverHealth.forEach((server) => {
      const statusIcon = this.getStatusIcon(server.status);
      const latency = server.lastHealthCheck.latency || 0;
      console.log(
        `  ${statusIcon} ${server.serverId}: ${server.status} (${latency}ms)`,
      );
    });
  }

  private getStatusIcon(status: ConnectionStatus): string {
    switch (status) {
      case ConnectionStatus.CONNECTED:
        return "";
      case ConnectionStatus.CONNECTING:
        return "";
      case ConnectionStatus.CHECKING:
        return "";
      case ConnectionStatus.RECOVERING:
        return "";
      case ConnectionStatus.ERROR:
        return "";
      case ConnectionStatus.DISCONNECTED:
        return "⚫";
      default:
        return "❓";
    }
  }
}
```

---

##  **Testing & Validation**

### **Health Check Testing**

```typescript
// Test health monitoring functionality
const testHealthMonitoring = async () => {
  console.log("=== Testing Health Monitoring ===");

  // Test basic health check
  const result = await healthMonitor.performHealthCheck("test-server");
  console.log("Health check result:", result);

  // Test recovery mechanism
  console.log("Testing recovery mechanism...");
  await healthMonitor.simulateServerFailure("test-server");

  // Wait for auto-recovery
  await new Promise((resolve) => {
    healthMonitor.once("recovery-success", () => {
      console.log("✅ Auto-recovery successful");
      resolve(undefined);
    });

    healthMonitor.once("recovery-failed", () => {
      console.log("❌ Auto-recovery failed");
      resolve(undefined);
    });
  });
};
```

### **Performance Testing**

```typescript
// Test health monitoring performance
const testHealthPerformance = async () => {
  const serverCount = 50;
  const healthCheckCount = 100;

  console.log(`Testing health checks for ${serverCount} servers...`);

  const startTime = Date.now();
  const promises = [];

  for (let i = 0; i  r.success).length;
  const averageLatency =
    results.reduce((sum, r) => sum + (r.latency || 0), 0) / results.length;

  console.log("Performance Results:");
  console.log(`- Total checks: ${results.length}`);
  console.log(
    `- Success rate: ${((successCount / results.length) * 100).toFixed(2)}%`,
  );
  console.log(`- Average latency: ${averageLatency.toFixed(2)}ms`);
  console.log(`- Total duration: ${duration}ms`);
  console.log(
    `- Checks per second: ${((results.length / duration) * 1000).toFixed(2)}`,
  );
};
```

---

##  **Configuration & Customization**

### **Advanced Configuration**

```typescript
type HealthMonitorConfig = {
  intervals: {
    healthCheck: number; // Health check interval (ms)
    recovery: number; // Recovery retry interval (ms)
    cleanup: number; // Cleanup interval for old data (ms)
  };
  thresholds: {
    maxRecoveryAttempts: number; // Max recovery attempts before giving up
    maxLatency: number; // Max acceptable latency (ms)
    minUptime: number; // Minimum uptime percentage
  };
  recovery: {
    strategy: "exponential" | "linear" | "custom";
    baseDelay: number; // Base delay for recovery attempts
    maxDelay: number; // Maximum delay between attempts
    jitter: boolean; // Add random jitter to delays
  };
  alerting: {
    enableAlerts: boolean; // Enable health alerts
    alertThresholds: {
      consecutiveFailures: number; // Alert after N consecutive failures
      uptimeBelow: number; // Alert when uptime drops below percentage
      latencyAbove: number; // Alert when latency exceeds threshold
    };
  };
};

const healthMonitor = new HealthMonitor({
  intervals: {
    healthCheck: 30000, // 30 seconds
    recovery: 5000, // 5 seconds
    cleanup: 3600000, // 1 hour
  },
  thresholds: {
    maxRecoveryAttempts: 5,
    maxLatency: 5000, // 5 seconds
    minUptime: 0.95, // 95%
  },
  recovery: {
    strategy: "exponential",
    baseDelay: 1000, // 1 second
    maxDelay: 60000, // 1 minute
    jitter: true,
  },
  alerting: {
    enableAlerts: true,
    alertThresholds: {
      consecutiveFailures: 3,
      uptimeBelow: 0.9, // 90%
      latencyAbove: 3000, // 3 seconds
    },
  },
});
```

---

##  **Best Practices**

### **Monitoring Strategy**

```typescript
// Implement tiered monitoring
const tieredMonitoring = {
  // Critical servers: frequent monitoring
  critical: {
    interval: 15000, // 15 seconds
    maxLatency: 1000, // 1 second
    immediateRecovery: true,
  },

  // Important servers: standard monitoring
  important: {
    interval: 30000, // 30 seconds
    maxLatency: 3000, // 3 seconds
    recoveryDelay: 5000, // 5 seconds
  },

  // Background servers: light monitoring
  background: {
    interval: 60000, // 1 minute
    maxLatency: 10000, // 10 seconds
    recoveryDelay: 30000, // 30 seconds
  },
};
```

### **Resource Optimization**

```typescript
// Optimize health monitoring resources
const optimizeMonitoring = {
  // Batch health checks
  async batchHealthChecks(serverIds: string[]): Promise {
    const batchSize = 10;
    const results: HealthCheckResult[] = [];

    for (let i = 0; i  this.performHealthCheck(id)),
      );
      results.push(...batchResults);
    }

    return results;
  },

  // Adaptive monitoring intervals
  adjustMonitoringInterval(
    serverId: string,
    healthHistory: HealthCheckResult[],
  ): number {
    const recentFailures = healthHistory
      .slice(-5)
      .filter((h) => !h.success).length;
    const baseInterval = 30000;

    if (recentFailures === 0) {
      return baseInterval * 2; // Healthy servers need less frequent checks
    } else if (recentFailures >= 3) {
      return baseInterval / 2; // Unhealthy servers need more frequent checks
    }

    return baseInterval;
  },
};
```

---

**STATUS**: Planned health monitoring system (not yet implemented)

---

## Provider Status Monitoring and Health Management

<!-- Source: observability/provider-status.md -->

# Provider Status Monitoring and Health Management

> **Enterprise-Grade Provider Health Monitoring** - Real-time provider status, performance metrics, and intelligent recommendations for optimal AI development workflows.

## Overview

NeuroLink's Provider Status Monitoring system provides comprehensive health monitoring, performance analytics, and actionable recommendations for all AI providers in your configuration. This enterprise-grade feature ensures optimal provider selection, proactive issue detection, and seamless failover capabilities.

## Features

###  Real-Time Health Monitoring

- **Live Provider Status**: Real-time connectivity and authentication validation
- **Response Time Tracking**: Millisecond-precision performance monitoring
- **Configuration Validation**: Automatic detection of missing or invalid credentials
- **Availability Monitoring**: Continuous health checks with historical tracking

###  Performance Analytics

- **Response Time Analysis**: Detailed latency metrics across providers
- **Health Scoring**: 0-100 health score calculation based on multiple factors
- **Cost Analysis**: Provider cost tiers and budget optimization recommendations
- **Capability Assessment**: Feature comparison across providers (streaming, vision, function-calling)

###  Intelligent Recommendations

- **Provider Optimization**: AI-powered recommendations for primary and fallback providers
- **Configuration Guidance**: Step-by-step setup instructions for unconfigured providers
- **Performance Insights**: Actionable suggestions for improving response times and reliability
- **Cost Optimization**: Smart recommendations for balancing cost and performance

## Implementation

### Core Components

The Provider Status system is built on three main components:

```typescript
// Enhanced Provider Status Utility
export async function getEnhancedProviderStatus(): Promise;

// Health Score Calculation
function calculateHealthScore(result: ProviderResult): number;

// Intelligent Recommendations
function generateRecommendations(results: ProviderResult[]): Recommendation[];
```

### Architecture Pattern

```mermaid
graph TD
    A[CLI/SDK Request] --> B[Enhanced Status Utility]
    B --> C[NeuroLink SDK Core]
    C --> D[Provider Status Check]
    D --> E[Response Time Measurement]
    E --> F[Health Score Calculation]
    F --> G[Recommendation Engine]
    G --> H[Enhanced Status Response]
```

## Usage Examples

### CLI Usage

#### Basic Status Check

```bash
# Quick provider status overview
npx @juspay/neurolink generate "test" --provider google-ai

# JSON output for programmatic use
npx @juspay/neurolink generate "test" --provider google-ai --json
```

#### Advanced Monitoring

```bash
# Test MCP server connectivity
npx @juspay/neurolink mcp test

# Test specific MCP server
npx @juspay/neurolink mcp test filesystem
```

### SDK Integration

#### Basic Status Monitoring

```typescript

// Check provider status programmatically
async function checkProviderHealth() {
  const providers = ["google-ai", "openai", "anthropic"];

  for (const providerName of providers) {
    try {
      const provider = await createAIProvider(providerName);
      const result = await provider.generate({
        prompt: "test",
        maxTokens: 5,
      });

      console.log(
        `✅ ${providerName}: Working (${result.usage?.totalTokens} tokens)`,
      );
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      console.log(`❌ ${providerName}: ${message}`);
    }
  }
}

// Check via demo server API
const response = await fetch("http://localhost:9876/api/status");
const status = await response.json();

console.log(
  `✅ ${
    Object.keys(status.providers).filter((p) => status.providers[p].available)
      .length
  } providers available`,
);
console.log(` Best provider: ${status.bestProvider}`);
```

#### Real-Time Monitoring Dashboard

```typescript

class ProviderHealthMonitor extends EventEmitter {
  private providers: string[];
  private healthStatus: Map;

  constructor() {
    super();
    this.providers = ["google-ai", "openai", "anthropic", "vertex"];
    this.healthStatus = new Map();
  }

  async startMonitoring(interval = 30000) {
    setInterval(async () => {
      const healthUpdate = await this.checkAllProviders();

      // Emit health events
      this.emit("healthUpdate", healthUpdate);

      // Alert on provider failures
      const failedProviders = Object.entries(healthUpdate)
        .filter(([_, status]) => !status.working)
        .map(([name, _]) => name);

      if (failedProviders.length > 0) {
        this.emit("healthAlert", {
          severity: "warning",
          providers: failedProviders,
          recommendations: this.generateRecommendations(healthUpdate),
        });
      }
    }, interval);
  }

  async checkAllProviders() {
    const results: Record = {};

    for (const providerName of this.providers) {
      try {
        const provider = await createAIProvider(providerName);
        const startTime = Date.now();

        await provider.generate({
          prompt: "test",
          maxTokens: 5,
        });

        results[providerName] = {
          working: true,
          responseTime: Date.now() - startTime,
          lastChecked: new Date().toISOString(),
        };
      } catch (error) {
        results[providerName] = {
          working: false,
          error: error.message,
          lastChecked: new Date().toISOString(),
        };
      }
    }

    return results;
  }

  generateRecommendations(healthUpdate: any): string[] {
    const recommendations = [];
    const workingProviders = Object.values(healthUpdate).filter(
      (status: any) => status.working,
    );

    if (workingProviders.length === 0) {
      recommendations.push(
        "All providers are down. Check network connectivity and API credentials.",
      );
    } else if (workingProviders.length === 1) {
      recommendations.push(
        "Only one provider working. Consider configuring backup providers for reliability.",
      );
    }

    return recommendations;
  }
}

// Usage
const monitor = new ProviderHealthMonitor();
monitor.on("healthAlert", (alert) => {
  console.warn(`⚠️ Provider health issue: ${alert.providers.join(", ")}`);
  alert.recommendations.forEach((rec) => console.log(` ${rec}`));
});

await monitor.startMonitoring();
```

## Status Response Structure

### Provider Status Result (from `/api/status`)

```typescript
type ProviderStatusResult = {
  timestamp: string;
  providers: Record;
  bestProvider: string | null;
  configuration: {
    defaultProvider: string;
    streamingEnabled: boolean;
    fallbackEnabled: boolean;
  };
  // Added for parity with examples below
  summary: {
    availabilityRate: number;
    totalProviders: number;
    workingProviders: number;
  };
  insights: {
    fastestProvider?: string;
    slowestProvider?: string;
    averageResponseTime: number;
  };
  recommendations: Recommendation[];
};
```

### Provider Status Information

```typescript
type ProviderStatus = {
  configured: boolean;
  authenticated: boolean;
  available: boolean;
  // True when all checks (configured + authenticated + generation) pass
  working: boolean;
  model?: string;
  costTier?:
    | "free-tier"
    | "free-local"
    | "low"
    | "medium"
    | "premium"
    | "enterprise"
    | "variable"
    | "custom";
  error?: string;
};
```

### Enhanced Status Result

```typescript
type EnhancedStatusResult = {
  timestamp: string;
  providers: Record;
  bestProvider: string | null;
  summary: {
    availabilityRate: number;
    totalProviders: number;
    workingProviders: number;
  };
  insights: {
    fastestProvider: string | null;
    slowestProvider: string | null;
    averageResponseTime: number;
  };
  recommendations: Recommendation[];
  configuration: {
    defaultProvider: string;
    streamingEnabled: boolean;
    fallbackEnabled: boolean;
  };
};

type Recommendation = {
  type: "critical" | "warning" | "info" | "success";
  category: "configuration" | "reliability" | "performance" | "cost" | "setup";
  message: string;
  action: string;
};
```

## Provider Status Classification

The system evaluates providers based on their actual runtime status:

### Status Categories

- **Configured**: Provider has required environment variables set
- **Authenticated**: Provider successfully validates API credentials
- **Available**: Provider responds to test generation requests
- **Working**: All checks pass - ready for production use

### Status Determination Process

1. **Environment Check**: Verify required API keys and configuration
2. **Authentication Test**: Validate credentials with minimal API call
3. **Generation Test**: Confirm provider can generate content
4. **Best Provider Selection**: Choose first working provider from priority list

## Provider Cost Tiers

Understanding provider cost structures helps optimize your AI spending:

### Cost Tier Classification

- **Free Tier**: `google-ai`, `huggingface` - No cost for basic usage
- **Free Local**: `ollama` - Local processing, no API costs
- **Low Cost**: `vertex`, `mistral` - Competitive pricing for production use
- **Medium Cost**: `bedrock`, `anthropic` - Balanced features and pricing
- **Premium**: `openai` - Advanced capabilities, higher cost
- **Enterprise**: `azure` - Enterprise features and compliance
- **Variable**: `litellm` - Cost depends on underlying provider
- **Custom**: `sagemaker` - Custom model hosting costs

## Intelligent Recommendations

The recommendation engine provides actionable guidance based on your current configuration:

### Configuration Recommendations

```typescript
// Critical: No providers configured
{
  type: 'critical',
  category: 'configuration',
  message: 'No providers configured. Set up at least one provider to use NeuroLink.',
  action: 'Configure GOOGLE_AI_API_KEY for free tier access'
}

// Warning: Single point of failure
{
  type: 'warning',
  category: 'reliability',
  message: 'Only one provider configured. Add backup providers for better reliability.',
  action: 'Configure additional providers like OpenAI or Anthropic'
}
```

### Performance Recommendations

```typescript
// Info: Slow response times
{
  type: 'info',
  category: 'performance',
  message: 'Slow response times detected: vertex, bedrock',
  action: 'Consider using faster providers for time-sensitive applications'
}
```

### Cost Optimization

```typescript
// Info: No free tier providers
{
  type: 'info',
  category: 'cost',
  message: 'No free-tier providers configured.',
  action: 'Consider adding Google AI Studio (free tier) for development'
}
```

### Success Acknowledgment

```typescript
// Success: Good configuration
{
  type: 'success',
  category: 'setup',
  message: 'Excellent! 3 providers working correctly.',
  action: 'Your setup provides good reliability and fallback options'
}
```

## Provider Selection Intelligence

### Primary Provider Selection

The system intelligently recommends primary providers based on:

1. **Priority Order**: `['google-ai', 'openai', 'anthropic', 'vertex', 'mistral']`
2. **Performance Metrics**: Response time and reliability
3. **Availability**: Current working status
4. **Use Case Suitability**: Feature compatibility

### Fallback Provider Selection

Fallback providers are chosen for maximum diversity:

1. **Different Provider Types**: Avoid single points of failure
2. **Geographic Diversity**: Different infrastructure providers
3. **Capability Overlap**: Ensure feature compatibility
4. **Performance Balance**: Maintain acceptable response times

## Error Handling and Recovery

### Common Error Scenarios

- **Authentication Failures**: Invalid API keys or expired tokens
- **Network Issues**: Connectivity problems or timeouts
- **Service Outages**: Provider-side service disruptions
- **Configuration Errors**: Missing environment variables or invalid settings

### Automatic Recovery

The system provides automatic recovery mechanisms:

```typescript
// Graceful degradation with fallback
if (!primaryProvider.working) {
  console.log(
    `Primary provider ${primaryProvider.name} failed, switching to ${fallbackProvider.name}`,
  );
  return await fallbackProvider.generate(prompt);
}
```

## Best Practices

### 1. Multi-Provider Setup

```bash
# Configure multiple providers for reliability
export GOOGLE_AI_API_KEY="your-google-api-key"
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"
```

### 2. Regular Health Monitoring

```typescript
// Set up periodic health checks
setInterval(async () => {
  const status = await getEnhancedProviderStatus();

  if (status.summary.availabilityRate  info.costTier === "Free Tier")
  .map(([name, _]) => name);

if (isDevelopment && freeTierProviders.length > 0) {
  return await neurolink.generate(prompt, { provider: freeTierProviders[0] });
}
```

## Integration with CI/CD

### Health Check in CI Pipeline

```yaml
# .github/workflows/health-check.yml
name: Provider Health Check
on:
  schedule:
    - cron: "0 */6 * * *" # Every 6 hours

jobs:
  health-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm install -g @juspay/neurolink
      - run: npx @juspay/neurolink status --json > health-report.json
      - name: Check Provider Status
        run: |
          # Count truly available/working providers
          WORKING_PROVIDERS=$(node -e "const status = JSON.parse(require('fs').readFileSync('health-report.json')); const working = Object.values(status.providers || {}).filter(p => (p && (p.working === true || p.available === true || p.status === 'working'))).length; console.log(working)")
          if [ "$WORKING_PROVIDERS" -lt 2 ]; then
            echo "❌ Insufficient available/working providers: ${WORKING_PROVIDERS}"
            exit 1
          else
            echo "✅ Provider health good: ${WORKING_PROVIDERS} providers available/working"
          fi
```

### Deployment Health Gates

```typescript
// deployment-health-check.js

async function validateDeployment() {
  const providers = ["google-ai", "openai", "anthropic"];
  const workingProviders = [];

  for (const providerName of providers) {
    try {
      const provider = await createAIProvider(providerName);
      await provider.generate({ prompt: "test", maxTokens: 5 });
      workingProviders.push(providerName);
    } catch (error) {
      console.warn(`Provider ${providerName} not available: ${error.message}`);
    }
  }

  // Require at least 2 working providers
  if (workingProviders.length  {
  res.set("Content-Type", register.contentType);
  res.end(await register.metrics());
});

app.listen(9100, () => console.log("Metrics server running on :9100"));
```

### Grafana Dashboard

```json
{
  "dashboard": {
    "title": "NeuroLink Provider Health",
    "panels": [
      {
        "title": "Provider Status",
        "type": "stat",
        "targets": [
          {
            "expr": "neurolink_provider_status",
            "legendFormat": "{{provider}}"
          }
        ]
      },
      {
        "title": "Response Time Distribution",
        "type": "heatmap",
        "targets": [
          {
            "expr": "rate(neurolink_provider_response_time_ms_bucket[5m])",
            "legendFormat": "{{provider}}"
          }
        ]
      }
    ]
  }
}
```

## Advanced Use Cases

### Load Balancing Based on Provider Status

```typescript

class StatusAwareLoadBalancer {
  private providers: string[];
  private statusCache: Map;
  private lastUpdate: number;
  private CACHE_TTL: number;
  private _rrIndex: number;

  constructor() {
    this.providers = ["google-ai", "openai", "anthropic", "vertex"];
    this.statusCache = new Map();
    this.lastUpdate = 0;
    this.CACHE_TTL = 60000; // 1 minute
    this._rrIndex = 0;
  }

  async getWorkingProvider() {
    // Update status cache if needed
    if (Date.now() - this.lastUpdate > this.CACHE_TTL) {
      await this.updateStatusCache();
    }

    // Get providers that are currently working
    const workingProviders = Array.from(this.statusCache.entries())
      .filter(([_, status]) => status.working)
      .map(([name, _]) => name);

    if (workingProviders.length === 0) {
      throw new Error("No working providers available");
    }

    // Round-robin selection using _rrIndex
    const selectedProvider =
      workingProviders[this._rrIndex % workingProviders.length];
    this._rrIndex = (this._rrIndex + 1) % workingProviders.length;

    return selectedProvider;
  }

  async updateStatusCache() {
    this.statusCache.clear();

    for (const providerName of this.providers) {
      try {
        const provider = await createAIProvider(providerName);
        const startTime = Date.now();

        await provider.generate({ prompt: "test", maxTokens: 5 });

        this.statusCache.set(providerName, {
          working: true,
          responseTime: Date.now() - startTime,
          lastChecked: Date.now(),
        });
      } catch (error) {
        this.statusCache.set(providerName, {
          working: false,
          error: error.message,
          lastChecked: Date.now(),
        });
      }
    }

    this.lastUpdate = Date.now();
  }
}

// Usage
const loadBalancer = new StatusAwareLoadBalancer();
const workingProvider = await loadBalancer.getWorkingProvider();
```

### Circuit Breaker Pattern

```typescript
class ProviderCircuitBreaker {
  private failureCount = 0;
  private lastFailureTime = 0;
  private state: "CLOSED" | "OPEN" | "HALF_OPEN" = "CLOSED";

  constructor(
    private providerName: string,
    private failureThreshold = 5,
    private recoveryTimeout = 60000,
  ) {}

  async execute(operation: () => Promise): Promise {
    if (this.state === "OPEN") {
      if (Date.now() - this.lastFailureTime > this.recoveryTimeout) {
        this.state = "HALF_OPEN";
      } else {
        throw new Error(`Circuit breaker OPEN for ${this.providerName}`);
      }
    }

    try {
      const result = await operation();

      if (this.state === "HALF_OPEN") {
        this.state = "CLOSED";
        this.failureCount = 0;
      }

      return result;
    } catch (error) {
      this.failureCount++;
      this.lastFailureTime = Date.now();

      if (this.failureCount >= this.failureThreshold) {
        this.state = "OPEN";
      }

      throw error;
    }
  }
}
```

## Troubleshooting

### Common Issues

#### 1. No Providers Available

```bash
# Diagnosis
npx @juspay/neurolink status --json

# Typical output showing configuration issues
{
  "timestamp": "2025-08-18T...",
  "providers": {
    "google-ai": {
      "available": false,
      "configured": false,
      "authenticated": false,
      "error": "Missing required environment variables: GOOGLE_AI_API_KEY"
    },
    "openai": {
      "available": false,
      "configured": false,
      "authenticated": false,
      "error": "Missing required environment variables: OPENAI_API_KEY"
    }
  },
  "bestProvider": null
}
```

**Solution**: Set up the required environment variables for at least one provider.

#### 2. Slow Response Times

```bash
# Check provider performance using benchmark
npx @juspay/neurolink benchmark

# Example output
{
  "timestamp": "2025-08-18T...",
  "prompt": "Write a haiku about artificial intelligence.",
  "results": {
    "google-ai": {
      "success": true,
      "responseTime": 1200,
      "model": "gemini-2.5-pro"
    },
    "vertex": {
      "success": true,
      "responseTime": 3400,
      "model": "gemini-2.5-pro"
    }
  }
}
```

**Solution**: Use the faster providers (like google-ai in this example) for time-sensitive applications.

#### 3. Authentication Failures

```bash
# Check specific provider status
npx @juspay/neurolink status --json

# Example authentication error
{
  "providers": {
    "openai": {
      "available": false,
      "configured": true,
      "authenticated": false,
      "error": "Invalid API key provided"
    }
  }
}
```

**Solution**: Verify and update the API key environment variable (OPENAI_API_KEY in this case).

### Debugging Commands

```bash
# Basic status check
npx @juspay/neurolink status

# JSON output for scripting
npx @juspay/neurolink status --json

# Performance benchmarking
npx @juspay/neurolink benchmark

# Test specific provider
GOOGLE_AI_API_KEY=your-key npx @juspay/neurolink status --json | jq '.providers."google-ai"'

# Check demo server status (if running)
curl http://localhost:9876/api/status
```

## Conclusion

NeuroLink's Provider Status Monitoring system provides enterprise-grade health management for AI provider infrastructure. With real-time monitoring, intelligent recommendations, and comprehensive analytics, it ensures optimal provider selection and proactive issue resolution.

Key benefits include:

- **Proactive Issue Detection**: Identify problems before they impact production
- **Intelligent Provider Selection**: Automatic optimization for performance and cost
- **Operational Excellence**: Complete visibility into AI infrastructure health
- **Developer Productivity**: Actionable recommendations reduce debugging time

This system transforms AI provider management from reactive troubleshooting to proactive optimization, ensuring reliable and efficient AI operations at enterprise scale.

---

## Enterprise Telemetry Guide

<!-- Source: observability/telemetry.md -->

#  Enterprise Telemetry Guide

**Advanced OpenTelemetry Integration for NeuroLink**

##  Overview

NeuroLink includes optional OpenTelemetry integration for enterprise monitoring and observability. The telemetry system provides comprehensive insights into AI operations, performance metrics, and system health with **zero overhead when disabled**.

##  Key Features

- **✅ Zero Overhead by Default** - Telemetry disabled unless explicitly configured
- ** AI Operation Tracking** - Monitor text generation, token usage, costs, and response times
- ** MCP Tool Monitoring** - Track tool calls, execution time, and success rates
- ** Performance Metrics** - Response times, error rates, throughput monitoring
- ** Distributed Tracing** - Full request tracing across AI providers and services
- ** Custom Dashboards** - Grafana, Jaeger, and Prometheus integration
- ** Production Ready** - Enterprise-grade monitoring for production deployments

##  Basic Setup

### Environment Configuration

```bash
# Enable telemetry
NEUROLINK_TELEMETRY_ENABLED=true

# OpenTelemetry endpoint (Jaeger, OTLP collector, etc.)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

# Service identification
OTEL_SERVICE_NAME=my-ai-application
OTEL_SERVICE_VERSION=1.0.0

# Optional: Resource attributes
OTEL_RESOURCE_ATTRIBUTES="service.name=my-ai-app,service.version=1.0.0,deployment.environment=production"

# Optional: Sampling configuration
OTEL_TRACES_SAMPLER=traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1  # Sample 10% of traces
```

### Programmatic Initialization

```typescript

// Configuration is done via environment variables:
// NEUROLINK_TELEMETRY_ENABLED=true
// OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
// OTEL_SERVICE_NAME=my-ai-application
// OTEL_SERVICE_VERSION=1.0.0

// Initialize telemetry (reads from environment variables)
const success = await initializeTelemetry();
// Returns: Promise

if (success) {
  console.log("Telemetry initialized successfully");
}

// Check telemetry status
const status = await getTelemetryStatus();
// Returns: { enabled: boolean, initialized: boolean, endpoint?: string, service?: string, version?: string }

console.log("Telemetry enabled:", status.enabled);
console.log("Endpoint:", status.endpoint);
```

### Environment Variables

| Variable                      | Description              | Default        |
| ----------------------------- | ------------------------ | -------------- |
| `NEUROLINK_TELEMETRY_ENABLED` | Enable/disable telemetry | `false`        |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP endpoint URL        | -              |
| `OTEL_SERVICE_NAME`           | Service name             | `neurolink-ai` |
| `OTEL_SERVICE_VERSION`        | Service version          | `3.0.1`        |

---

##  Production Deployment

### Docker Compose with Jaeger

```yaml
# docker-compose.yml
version: "3.8"
services:
  my-ai-app:
    build: .
    environment:
      - NEUROLINK_TELEMETRY_ENABLED=true
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:14268/api/traces
      - OTEL_SERVICE_NAME=my-ai-application
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - jaeger
    ports:
      - "3000:3000"

  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686" # Jaeger UI
      - "14268:14268" # OTLP HTTP
      - "14250:14250" # OTLP gRPC
    environment:
      - COLLECTOR_OTLP_ENABLED=true
      - LOG_LEVEL=debug

  # Optional: Prometheus for metrics
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

  # Optional: Grafana for dashboards
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-storage:/var/lib/grafana

volumes:
  grafana-storage:
```

---

##  Key Metrics to Track

### AI Operation Metrics

- **Response Time**: Time to generate AI responses
- **Token Usage**: Input/output tokens by provider and model
- **Cost Tracking**: Estimated costs per operation
- **Error Rates**: Failed AI requests by provider
- **Provider Performance**: Success rates and latency by provider

### Sample Prometheus Queries

```promql
# Average AI response time over 5 minutes
rate(neurolink_ai_duration_sum[5m]) / rate(neurolink_ai_duration_count[5m])

# Token usage by provider
sum by (provider) (rate(neurolink_tokens_total[5m]))

# Error rate percentage
rate(neurolink_errors_total[5m]) / rate(neurolink_requests_total[5m]) * 100

# Cost per hour by provider
sum by (provider) (rate(neurolink_cost_total[1h]))

# Active WebSocket connections
neurolink_websocket_connections_active
```

---

##  Getting Started Checklist

### ✅ Quick Setup (5 minutes)

1. **Enable Telemetry**

   ```bash
   export NEUROLINK_TELEMETRY_ENABLED=true
   export OTEL_SERVICE_NAME=my-ai-app
   ```

2. **Start Jaeger (Local Development)**

   ```bash
   docker run -d \
     -p 16686:16686 \
     -p 14268:14268 \
     jaegertracing/all-in-one:latest
   ```

3. **Configure Endpoint**

   ```bash
   export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:14268/api/traces
   ```

4. **Initialize in Code**

   ```typescript
   import { initializeTelemetry } from "@juspay/neurolink";
   await initializeTelemetry();
   ```

5. **View Traces**
   - Open http://localhost:16686
   - Generate some AI requests
   - Search for traces in Jaeger UI

---

##  Additional Resources

- **[API Reference](/docs/sdk/api-reference)** - Complete telemetry API documentation
- **[Real-time Services](/docs/features/real-time-services)** - WebSocket infrastructure guide
- **[Performance Optimization](/docs/deployment/performance)** - Optimization strategies

**Ready for enterprise-grade AI monitoring with NeuroLink! **

---

# Deployment

## ⚙️ NeuroLink Configuration Guide

<!-- Source: deployment/configuration.md -->

# ⚙️ NeuroLink Configuration Guide

## ✅ IMPLEMENTATION STATUS: COMPLETE (2025-01-07)

**Generate Function Migration completed - Configuration examples updated**

- ✅ All code examples now show `generate()` as primary method
- ✅ Legacy `generate()` examples preserved for reference
- ✅ Factory pattern configuration benefits documented
- ✅ Zero configuration changes required for migration

> **Migration Note**: Configuration remains identical for both `generate()` and `generate()`.
> All existing configurations continue working unchanged.

##  **Overview**

This guide covers all configuration options for NeuroLink, including AI provider setup, dynamic model configuration, MCP integration, and environment configuration.

### **Basic Usage Examples**

```typescript

const neurolink = new NeuroLink();

// NEW: Primary method (recommended)
const result = await neurolink.generate({
  input: { text: "Configure AI providers" },
  provider: "google-ai",
  temperature: 0.7,
});

// LEGACY: Still fully supported
const legacyResult = await neurolink.generate({
  prompt: "Configure AI providers",
  provider: "google-ai",
  temperature: 0.7,
});
```

---

##  **AI Provider Configuration**

### **Environment Variables**

NeuroLink supports multiple AI providers. Set up one or more API keys:

```bash
# Google AI Studio (Recommended - Free tier available)
export GOOGLE_AI_API_KEY="AIza-your-google-ai-api-key"

# OpenAI
export OPENAI_API_KEY="sk-your-openai-api-key"

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-api-key"

# Azure OpenAI
export AZURE_OPENAI_API_KEY="your-azure-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"

# AWS Bedrock
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"

# Hugging Face
export HUGGING_FACE_API_KEY="hf_your-hugging-face-token"

# Mistral AI
export MISTRAL_API_KEY="your-mistral-api-key"
```

### **.env File Configuration**

Create a `.env` file in your project root:

```env
# .env file - automatically loaded by NeuroLink
GOOGLE_AI_API_KEY=AIza-your-google-ai-api-key
OPENAI_API_KEY=sk-your-openai-api-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-api-key

# Optional: Provider preferences
NEUROLINK_PREFERRED_PROVIDER=google-ai
NEUROLINK_DEBUG=false
```

### **Provider Selection Priority**

NeuroLink automatically selects the best available provider:

1. **Google AI Studio** (if `GOOGLE_AI_API_KEY` is set)
2. **OpenAI** (if `OPENAI_API_KEY` is set)
3. **Anthropic** (if `ANTHROPIC_API_KEY` is set)
4. **Other providers** in order of availability

**Force specific provider**:

```bash
# CLI
npx neurolink generate "Hello" --provider openai
```

```typescript
// SDK

const neurolink = new NeuroLink();
const result = await neurolink.generate({
  input: { text: "Hello" },
  provider: "openai",
});
```

---

##  **Dynamic Model Configuration (v1.8.0+)**

### **Overview**

The dynamic model system enables intelligent model selection, cost optimization, and runtime model configuration without code changes.

### **Environment Variables**

```bash
# Dynamic Model System Configuration
export MODEL_SERVER_URL="http://localhost:3001"           # Model config server URL
export MODEL_CONFIG_PATH="./config/models.json"           # Model configuration file
export ENABLE_DYNAMIC_MODELS="true"                       # Enable dynamic models
export DEFAULT_MODEL_PREFERENCE="quality"                 # 'cost', 'speed', or 'quality'
export FALLBACK_MODEL="gpt-4o-mini"                      # Fallback when preferred unavailable
```

### **Model Configuration Server**

Start the model configuration server to enable dynamic model features:

```bash
# Start the model server (provides REST API for model configs)
npm run start:model-server

# Server provides endpoints at http://localhost:3001:
# GET /models                     - List all models
# GET /models/search?capability=vision - Search by capability
# GET /models/provider/anthropic  - Get provider models
# GET /models/resolve/claude-latest - Resolve aliases
```

### **Model Configuration File**

Create or modify `config/models.json` to define available models:

```json
{
  "models": [
    {
      "id": "claude-3-5-sonnet",
      "name": "Claude 3.5 Sonnet",
      "provider": "anthropic",
      "pricing": { "input": 0.003, "output": 0.015 },
      "capabilities": ["functionCalling", "vision", "code"],
      "contextWindow": 200000,
      "deprecated": false,
      "aliases": ["claude-latest", "best-coding"]
    }
  ],
  "aliases": {
    "claude-latest": "claude-3-5-sonnet",
    "fastest": "gpt-4o-mini",
    "cheapest": "claude-3-haiku"
  }
}
```

### **Dynamic Model Usage**

#### **CLI Usage**

```bash
# Use model aliases for convenience
npx neurolink generate "Write code" --model best-coding

# Capability-based selection
npx neurolink generate "Describe image" --capability vision --optimize-cost

# Search and discover models
npx neurolink models search --capability functionCalling --max-price 0.001
npx neurolink models list
npx neurolink models best --use-case coding
```

#### **SDK Usage**

```typescript

const neurolink = new NeuroLink();

// Use aliases for easy access
const result = await neurolink.generate({
  input: { text: "Write code" },
  provider: "anthropic",
  model: "claude-latest", // Auto-resolves to latest Claude
});

// Capability-based selection with vision model
const visionResult = await neurolink.generate({
  input: { text: "Describe this image" },
  provider: "openai",
  model: "gpt-4o", // Vision-capable model
});

// Use cost-effective models
const efficientResult = await neurolink.generate({
  input: { text: "Quick task" },
  provider: "anthropic",
  model: "claude-3-haiku", // Cost-effective option
});
```

### **Benefits**

- ✅ **Runtime Updates**: Add new models without code deployment
- ✅ **Smart Selection**: Automatic model selection based on capabilities
- ✅ **Cost Optimization**: Choose models based on price constraints
- ✅ **Easy Aliases**: Use friendly names like "claude-latest", "fastest"
- ✅ **Provider Agnostic**: Unified interface across all AI providers

---

## ️ **MCP Configuration (v1.7.1)**

### **Built-in Tools Configuration**

Built-in tools are automatically available in v1.7.1:

```json
{
  "builtInTools": {
    "enabled": true,
    "tools": ["time", "utilities", "registry", "configuration", "validation"]
  }
}
```

**Test built-in tools**:

```bash
# Built-in tools work immediately
npx neurolink generate "What time is it?" --debug
```

### **External MCP Server Configuration**

External servers are auto-discovered from all major AI tools:

#### **Auto-Discovery Locations**

**macOS**:

```bash
~/Library/Application Support/Claude/
~/Library/Application Support/Code/User/
~/.cursor/
~/.codeium/windsurf/
```

**Linux**:

```bash
~/.config/Code/User/
~/.continue/
~/.aider/
```

**Windows**:

```bash
%APPDATA%/Code/User/
```

#### **Manual MCP Configuration**

Create `.mcp-config.json` in your project root:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/"],
      "transport": "stdio"
    }
  }
}
```

#### **HTTP Transport Configuration**

For remote MCP servers, use HTTP transport with authentication, retry, and rate limiting:

```json
{
  "mcpServers": {
    "remote-api": {
      "transport": "http",
      "url": "https://api.example.com/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN",
        "X-API-Key": "your-api-key"
      },
      "httpOptions": {
        "connectionTimeout": 30000,
        "requestTimeout": 60000,
        "idleTimeout": 120000,
        "keepAliveTimeout": 30000
      },
      "retryConfig": {
        "maxAttempts": 3,
        "initialDelay": 1000,
        "maxDelay": 30000,
        "backoffMultiplier": 2
      },
      "rateLimiting": {
        "requestsPerMinute": 60,
        "maxBurst": 10,
        "useTokenBucket": true
      }
    }
  }
}
```

**HTTP Transport Options:**

| Option         | Type     | Description                             |
| -------------- | -------- | --------------------------------------- |
| `transport`    | `"http"` | Transport type for remote servers       |
| `url`          | `string` | URL of the remote MCP endpoint          |
| `headers`      | `object` | HTTP headers for authentication         |
| `httpOptions`  | `object` | Connection and timeout settings         |
| `retryConfig`  | `object` | Retry behavior with exponential backoff |
| `rateLimiting` | `object` | Rate limiting configuration             |

See [MCP HTTP Transport Guide](/docs/mcp/http-transport) for complete documentation.

### **MCP Discovery Commands**

```bash
# Discover all external servers
npx neurolink mcp discover --format table

# Export discovery results
npx neurolink mcp discover --format json > discovered-servers.json

# Test discovery
npx neurolink mcp discover --format yaml
```

---

## ️ **CLI Configuration**

### **Global CLI Options**

```bash
# Debug mode
export NEUROLINK_DEBUG=true

# Preferred provider
export NEUROLINK_PREFERRED_PROVIDER=google-ai

# Custom timeout
export NEUROLINK_TIMEOUT=30000
```

### **Command-line Options**

```bash
# Provider selection
npx neurolink generate "Hello" --provider openai

# Debug output
npx neurolink generate "Hello" --debug

# Temperature control
npx neurolink generate "Hello" --temperature 0.7

# Token limits
npx neurolink generate "Hello" --max-tokens 1000

# Disable tools
npx neurolink generate "Hello" --disable-tools
```

---

##  **Development Configuration**

### **TypeScript Configuration**

For TypeScript projects, add to your `tsconfig.json`:

```json
{
  "compilerOptions": {
    "moduleResolution": "node",
    "allowSyntheticDefaultImports": true,
    "esModuleInterop": true,
    "strict": true
  },
  "include": ["src/**/*", "node_modules/@juspay/neurolink/dist/**/*"]
}
```

### **Package.json Scripts**

Add useful scripts to your `package.json`:

```json
{
  "scripts": {
    "neurolink:status": "npx neurolink status --verbose",
    "neurolink:test": "npx neurolink generate 'Test message'",
    "neurolink:mcp-discover": "npx neurolink mcp discover --format table",
    "neurolink:mcp-test": "npx neurolink generate 'What time is it?' --debug"
  }
}
```

### **Environment Setup Script**

Create `setup-neurolink.sh`:

```bash
#!/bin/bash

echo " NeuroLink Environment Setup"

# Check Node.js version
if ! command -v node &> /dev/null; then
    echo "❌ Node.js not found. Please install Node.js v18+"
    exit 1
fi

NODE_VERSION=$(node -v | cut -d'v' -f2 | cut -d'.' -f1)
if [ "$NODE_VERSION" -lt 18 ]; then
    echo "❌ Node.js v18+ required. Current version: $(node -v)"
    exit 1
fi

# Install NeuroLink
echo " Installing NeuroLink..."
npm install @juspay/neurolink

# Create .env template
if [ ! -f .env ]; then
    echo " Creating .env template..."
    cat > .env  /dev/null 2>&1; then
    echo "✅ NeuroLink installed successfully"

    # Test MCP discovery
    echo " Testing MCP discovery..."
    SERVERS=$(npx neurolink mcp discover --format json 2>/dev/null | jq '.servers | length' 2>/dev/null || echo "0")
    echo "✅ Discovered $SERVERS external MCP servers"

    echo ""
    echo " Setup complete! Next steps:"
    echo "1. Add your API key to .env file"
    echo "2. Test: npx neurolink generate 'Hello'"
    echo "3. Test MCP tools: npx neurolink generate 'What time is it?' --debug"
else
    echo "❌ Installation test failed"
    exit 1
fi
```

---

## Context Compaction Configuration

### Overview

Context compaction automatically manages conversation history to keep it within a model's context window. When the estimated input tokens exceed a configurable threshold (default: 80% of available input space), a multi-stage reduction pipeline runs before the next LLM call. The four stages, in order, are:

1. **Tool Output Pruning** -- Replace old, large tool results with compact placeholders (no LLM call)
2. **File Read Deduplication** -- Keep only the latest read of each file path (no LLM call)
3. **LLM Summarization** -- Produce a structured summary of older messages (requires LLM call)
4. **Sliding Window Truncation** -- Tag the oldest messages as truncated (no LLM call)

Each stage only runs if the previous stage did not bring token usage below the target. The pipeline exits early once the context fits.

### SDK Configuration

Configure context compaction through the `contextCompaction` field inside `conversationMemory`:

```typescript

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    enableSummarization: true,

    contextCompaction: {
      // Enable auto-compaction (default: true when summarization enabled)
      enabled: true,

      // Compaction trigger threshold as fraction of available input tokens.
      // When usage ratio >= this value, compaction runs automatically.
      // Range: 0.0 - 1.0. Default: 0.80
      threshold: 0.8,

      // Enable Stage 1: tool output pruning (default: true)
      enablePruning: true,

      // Enable Stage 2: file read deduplication (default: true)
      enableDeduplication: true,

      // Enable Stage 4: sliding window truncation fallback (default: true)
      enableSlidingWindow: true,

      // Maximum tool output size in bytes before truncation.
      // Default: 51200 (50 KB)
      maxToolOutputBytes: 51200,

      // Maximum tool output lines before truncation.
      // Default: 2000
      maxToolOutputLines: 2000,

      // Fraction of remaining context budget allocated to file reads.
      // Range: 0.0 - 1.0. Default: 0.60
      fileReadBudgetPercent: 0.6,
    },

    // Provider and model used for Stage 3 (LLM summarization).
    // These are top-level conversationMemory fields, not inside contextCompaction.
    summarizationProvider: "vertex",
    summarizationModel: "gemini-2.5-flash",
  },
});
```

**Field Reference:**

| Field                   | Type      | Default                             | Description                                     |
| ----------------------- | --------- | ----------------------------------- | ----------------------------------------------- |
| `enabled`               | `boolean` | `true` (when summarization enabled) | Master switch for auto-compaction               |
| `threshold`             | `number`  | `0.80`                              | Usage ratio that triggers compaction (0.0--1.0) |
| `enablePruning`         | `boolean` | `true`                              | Enable Stage 1: tool output pruning             |
| `enableDeduplication`   | `boolean` | `true`                              | Enable Stage 2: file read deduplication         |
| `enableSlidingWindow`   | `boolean` | `true`                              | Enable Stage 4: sliding window truncation       |
| `maxToolOutputBytes`    | `number`  | `51200`                             | Tool output byte limit (50 KB)                  |
| `maxToolOutputLines`    | `number`  | `2000`                              | Tool output line limit                          |
| `fileReadBudgetPercent` | `number`  | `0.60`                              | Fraction of remaining context for file reads    |

Summarization provider/model are configured at the `conversationMemory` level:

| Field                   | Type     | Default              | Description                            |
| ----------------------- | -------- | -------------------- | -------------------------------------- |
| `summarizationProvider` | `string` | `"vertex"`           | Provider for Stage 3 LLM summarization |
| `summarizationModel`    | `string` | `"gemini-2.5-flash"` | Model for Stage 3 LLM summarization    |

### CLI Flags

The `loop` command accepts two context compaction flags:

```bash
# Set compaction threshold (0.0-1.0, default: 0.8)
npx neurolink loop --compact-threshold 0.70

# Disable automatic compaction entirely
npx neurolink loop --disable-compaction
```

| Flag                   | Type      | Default | Description                                     |
| ---------------------- | --------- | ------- | ----------------------------------------------- |
| `--compact-threshold`  | `number`  | `0.8`   | Context compaction trigger threshold (0.0--1.0) |
| `--disable-compaction` | `boolean` | `false` | Disable automatic context compaction            |

These flags map to `contextCompaction.threshold` and `contextCompaction.enabled` respectively.

### Per-Provider Context Windows

The budget checker uses per-provider, per-model context window sizes to calculate available input tokens. The available input space is:

```
availableInput = contextWindow - outputReserve
```

Where `outputReserve` defaults to 35% of the context window (capped at 64,000 tokens), or the explicit `maxTokens` value if provided.

| Provider         | Model                                                                             | Input Token Limit |
| ---------------- | --------------------------------------------------------------------------------- | ----------------- |
| **Anthropic**    | claude-opus-4, claude-sonnet-4, claude-3.5-sonnet, claude-3-opus (all variants)   | 200,000           |
| **OpenAI**       | gpt-4o, gpt-4o-mini, gpt-4-turbo, o1-mini                                         | 128,000           |
| **OpenAI**       | o1, o1-pro, o3, o3-mini, o4-mini                                                  | 200,000           |
| **OpenAI**       | gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-5                                        | 1,047,576         |
| **OpenAI**       | gpt-4                                                                             | 8,192             |
| **OpenAI**       | gpt-3.5-turbo                                                                     | 16,385            |
| **Google AI**    | gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-flash, gemini-3-\* | 1,048,576         |
| **Google AI**    | gemini-1.5-pro                                                                    | 2,097,152         |
| **Vertex**       | gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-flash              | 1,048,576         |
| **Vertex**       | gemini-1.5-pro                                                                    | 2,097,152         |
| **Bedrock**      | anthropic.claude-3-\* (all variants)                                              | 200,000           |
| **Bedrock**      | amazon.nova-pro-v1:0, amazon.nova-lite-v1:0                                       | 300,000           |
| **Azure**        | gpt-4o, gpt-4o-mini, gpt-4-turbo                                                  | 128,000           |
| **Azure**        | gpt-4                                                                             | 8,192             |
| **Mistral**      | mistral-large-latest, mistral-small-latest                                        | 128,000           |
| **Mistral**      | codestral-latest                                                                  | 256,000           |
| **Mistral**      | mistral-medium-latest                                                             | 32,000            |
| **Ollama**       | (default)                                                                         | 128,000           |
| **LiteLLM**      | (default)                                                                         | 128,000           |
| **Hugging Face** | (default)                                                                         | 32,000            |
| **SageMaker**    | (default)                                                                         | 128,000           |

Unknown providers or models fall back to a global default of 128,000 tokens.

### Advanced Configuration

#### Manual Compaction with `compactSession()`

You can trigger compaction manually on any session using the `CompactionConfig` interface, which provides per-stage control beyond what the SDK-level `contextCompaction` field exposes:

```typescript

const neurolink = new NeuroLink({
  conversationMemory: { enabled: true },
});

const result: CompactionResult | null = await neurolink.compactSession(
  "session-abc-123",
  {
    // Per-stage toggles
    enablePrune: true,
    enableDeduplicate: true,
    enableSummarize: true,
    enableTruncate: true,

    // Stage 1 (prune) options
    pruneProtectTokens: 40_000, // Protect recent N tokens from pruning
    pruneMinimumSavings: 20_000, // Only prune if savings exceed this
    pruneProtectedTools: ["skill"], // Tool names to never prune

    // Stage 3 (summarize) options
    summarizationProvider: "vertex",
    summarizationModel: "gemini-2.5-flash",
    keepRecentRatio: 0.3, // Fraction of messages to keep verbatim

    // Stage 4 (truncate) options
    truncationFraction: 0.5, // Fraction of messages to truncate

    // Provider hint for token estimation
    provider: "anthropic",
  },
);

if (result?.compacted) {
  console.log(`Saved ${result.tokensSaved} tokens`);
  console.log(`Stages used: ${result.stagesUsed.join(", ")}`);
  // result.stagesUsed is an array of: "prune" | "deduplicate" | "summarize" | "truncate"
}
```

**`CompactionConfig` Field Reference:**

| Field                   | Type       | Default              | Description                                             |
| ----------------------- | ---------- | -------------------- | ------------------------------------------------------- |
| `enablePrune`           | `boolean`  | `true`               | Enable Stage 1: tool output pruning                     |
| `enableDeduplicate`     | `boolean`  | `true`               | Enable Stage 2: file read deduplication                 |
| `enableSummarize`       | `boolean`  | `true`               | Enable Stage 3: LLM summarization                       |
| `enableTruncate`        | `boolean`  | `true`               | Enable Stage 4: sliding window truncation               |
| `pruneProtectTokens`    | `number`   | `40000`              | Number of recent tokens protected from pruning          |
| `pruneMinimumSavings`   | `number`   | `20000`              | Minimum token savings required to apply pruning         |
| `pruneProtectedTools`   | `string[]` | `["skill"]`          | Tool names whose outputs are never pruned               |
| `summarizationProvider` | `string`   | `"vertex"`           | Provider for LLM summarization                          |
| `summarizationModel`    | `string`   | `"gemini-2.5-flash"` | Model for LLM summarization                             |
| `keepRecentRatio`       | `number`   | `0.3`                | Fraction of messages kept verbatim during summarization |
| `truncationFraction`    | `number`   | `0.5`                | Fraction of oldest messages tagged as truncated         |
| `provider`              | `string`   | `""`                 | Provider hint for token estimation multipliers          |

#### File Token Budget Constants

These constants in `src/lib/context/fileTokenBudget.ts` control how file reads interact with the context budget:

| Constant                   | Value    | Description                                                    |
| -------------------------- | -------- | -------------------------------------------------------------- |
| `FILE_READ_BUDGET_PERCENT` | `0.6`    | Fraction of remaining context allocated for file reads         |
| `FILE_FAST_PATH_SIZE`      | `100 KB` | Files below this size skip budget validation                   |
| `FILE_PREVIEW_MODE_SIZE`   | `5 MB`   | Files above this size get preview-only mode (first 2000 chars) |
| `FILE_PREVIEW_CHARS`       | `2000`   | Number of characters shown in preview mode                     |

#### Tool Output Limits Constants

These constants in `src/lib/context/toolOutputLimits.ts` control tool output truncation:

| Constant                | Value           | Description                                 |
| ----------------------- | --------------- | ------------------------------------------- |
| `MAX_TOOL_OUTPUT_BYTES` | `51200` (50 KB) | Maximum tool output size before truncation  |
| `MAX_TOOL_OUTPUT_LINES` | `2000`          | Maximum tool output lines before truncation |

---

##  **Advanced Configuration**

### **Custom Provider Configuration**

```typescript

// Create NeuroLink instance with custom settings
const neurolink = new NeuroLink({
  timeout: 30000,
});

// Generate with specific provider
const result = await neurolink.generate({
  input: { text: "Hello" },
  provider: "openai",
  model: "gpt-4o",
});
```

### **Tool Configuration**

```typescript

const neurolink = new NeuroLink();

// Enable/disable tools via generate options
const result = await neurolink.generate({
  input: { text: "What time is it?" },
  provider: "openai",
  maxToolRoundtrips: 5, // Control tool call iterations
});
```

### **Logging Configuration**

```bash
# Enable detailed logging
export NEUROLINK_DEBUG=true
export NEUROLINK_LOG_LEVEL=verbose

# Custom log format
export NEUROLINK_LOG_FORMAT=json
```

---

## ️ **Security Configuration**

### **API Key Security**

```bash
# Use environment variables (not hardcoded)
export GOOGLE_AI_API_KEY="$(cat ~/.secrets/google-ai-key)"

# Use .env files (add to .gitignore)
echo ".env" >> .gitignore
```

### **Tool Security**

```json
{
  "toolSecurity": {
    "allowedDomains": ["api.example.com"],
    "blockedTools": ["dangerous-tool"],
    "requireConfirmation": true
  }
}
```

---

##  **Testing Configuration**

### **Test Environment Setup**

```bash
# Test environment
export NEUROLINK_ENV=test
export NEUROLINK_DEBUG=true

# Mock providers for testing
export NEUROLINK_MOCK_PROVIDERS=true
```

### **Validation Commands**

```bash
# Validate configuration
npx neurolink status --verbose

# Test built-in tools (v1.7.1)
npx neurolink generate "What time is it?" --debug

# Test external discovery
npx neurolink mcp discover --format table

# Full system test
npm run build && npm run test:run -- test/mcp-comprehensive.test.ts
```

---

##  **Configuration Examples**

### **Minimal Setup (Google AI)**

```bash
export GOOGLE_AI_API_KEY="AIza-your-key"
npx neurolink generate "Hello"
```

### **Multi-Provider Setup**

```env
GOOGLE_AI_API_KEY=AIza-your-google-key
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
NEUROLINK_PREFERRED_PROVIDER=google-ai
```

### **Development Setup**

```env
NEUROLINK_DEBUG=true
NEUROLINK_LOG_LEVEL=verbose
NEUROLINK_TIMEOUT=60000
NEUROLINK_MOCK_PROVIDERS=false
```

---

** For most users, setting `GOOGLE_AI_API_KEY` is sufficient to get started with NeuroLink and test all MCP functionality!**

---

## ️ Enterprise Configuration Management Guide

<!-- Source: deployment/configuration-management.md -->

# ️ Enterprise Configuration Management Guide

**NeuroLink Configuration System v3.0** - Complete guide to enterprise configuration management with automatic backup/restore, validation, and error recovery.

##  **Quick Start**

### **Basic Configuration Setup**

```typescript

// Initialize config manager
const configManager = new ConfigManager();

// Update configuration (automatic backup created)
await configManager.updateConfig({
  providers: {
    google: { enabled: true, model: "gemini-2.5-pro" },
    openai: { enabled: true, model: "gpt-4o" },
  },
  performance: {
    timeout: 30000,
    retries: 3,
  },
});
// ✅ Backup created: .neurolink.backups/neurolink-config-2025-01-07T10-30-00.js
```

### **Environment Configuration**

```bash
# Enable automatic backups
NEUROLINK_BACKUP_ENABLED=true
NEUROLINK_BACKUP_RETENTION=30
NEUROLINK_BACKUP_DIRECTORY=.neurolink.backups

# Validation settings
NEUROLINK_VALIDATION_STRICT=false
NEUROLINK_VALIDATION_WARNINGS=true

# Provider monitoring
NEUROLINK_PROVIDER_STATUS_CHECK=true
NEUROLINK_PROVIDER_TIMEOUT=30000
```

---

##  **Configuration Structure**

### **NeuroLinkConfig Interface**

```typescript
type NeuroLinkConfig = {
  providers: ProviderConfig; // AI provider settings
  performance: PerformanceConfig; // Performance optimization
  analytics: AnalyticsConfig; // Analytics configuration
  backup: BackupConfig; // Backup system settings
  validation: ValidationConfig; // Validation rules
};
```

### **Provider Configuration**

```typescript
type ProviderConfig = {
  google?: {
    enabled: boolean;
    model?: string;
    apiKey?: string;
    timeout?: number;
  };
  openai?: {
    enabled: boolean;
    model?: string;
    apiKey?: string;
    timeout?: number;
  };
  // ... other providers
};
```

### **Performance Configuration**

```typescript
type PerformanceConfig = {
  timeout: number; // Default timeout (ms)
  retries: number; // Default retry count
  cacheEnabled: boolean; // Enable execution caching
  cacheTTL: number; // Cache TTL (seconds)
  concurrency: number; // Max concurrent operations
};
```

---

##  **Automatic Backup System**

### **How It Works**

1. **Before Update**: Config manager creates timestamped backup
2. **Update Attempt**: Apply new configuration
3. **Validation**: Validate new configuration
4. **Success/Failure**: Keep new config or auto-restore from backup

### **Backup File Structure**

```
.neurolink.backups/
├── neurolink-config-2025-01-07T10-30-00.js    # Timestamped backup
├── neurolink-config-2025-01-07T11-15-30.js    # Another backup
├── metadata.json                               # Backup metadata
└── .backup-index                              # Backup index file
```

### **Backup Metadata**

```typescript
type BackupMetadata = {
  timestamp: string;
  hash: string; // SHA-256 hash
  size: number; // File size in bytes
  reason: string; // Reason for backup
  version: string; // Config version
  environment: string; // Environment context
  user?: string; // User who made change
};
```

### **Manual Backup Operations**

```typescript
// Create manual backup
const backupPath = await configManager.createBackup("manual-backup");
console.log(`Backup created: ${backupPath}`);

// List all backups
const backups = await configManager.listBackups();
console.log("Available backups:", backups);

// Restore from specific backup
await configManager.restoreFromBackup(
  "neurolink-config-2025-01-07T10-30-00.js",
);
```

---

## ✅ **Configuration Validation**

### **Validation Process**

1. **Schema Validation**: Check against TypeScript interfaces
2. **Provider Validation**: Verify provider configurations
3. **Dependency Validation**: Check inter-config dependencies
4. **Performance Validation**: Validate performance settings
5. **Security Validation**: Check for security issues

### **Validation Examples**

```typescript
// Validate current config
const validation = await configManager.validateConfig();

if (!validation.isValid) {
  console.log("Validation errors:", validation.errors);
  console.log("Suggestions:", validation.suggestions);
}

// Validate before update
await configManager.updateConfig(newConfig, {
  validateBeforeUpdate: true,
  onValidationError: (errors) => {
    console.log("Validation failed:", errors);
  },
});
```

### **Common Validation Errors**

```typescript
// Example validation results
{
  isValid: false,
  errors: [
    {
      field: 'providers.google.model',
      message: 'Model "gemini-pro-deprecated" is deprecated',
      severity: 'warning',
      suggestion: 'Use "gemini-2.5-pro" instead'
    },
    {
      field: 'performance.timeout',
      message: 'Timeout value too low (= 1000ms for reliable operation'
    }
  ],
  suggestions: [
    'Consider enabling caching for better performance',
    'Add fallback providers for reliability'
  ]
}
```

---

## ️ **Advanced Configuration**

### **Update Strategies**

```typescript
// Replace entire config
await configManager.updateConfig(newConfig, {
  mergeStrategy: "replace",
});

// Merge with existing config
await configManager.updateConfig(partialConfig, {
  mergeStrategy: "merge",
});

// Deep merge (preserves nested objects)
await configManager.updateConfig(partialConfig, {
  mergeStrategy: "deep-merge",
});
```

### **Custom Validation Rules**

```typescript
// Add custom validation
configManager.addValidator("performance", (config) => {
  if (config.performance.timeout = 5000ms",
    };
  }
  return { isValid: true };
});
```

### **Event Handlers**

```typescript
// Listen for config events
configManager.on("configUpdated", (newConfig, oldConfig) => {
  console.log("Config updated:", { newConfig, oldConfig });
});

configManager.on("backupCreated", (backupPath) => {
  console.log("Backup created:", backupPath);
});

configManager.on("configRestored", (backupPath) => {
  console.log("Config restored from:", backupPath);
});
```

---

##  **Error Recovery**

### **Auto-Restore Process**

1. **Detection**: Config update fails validation or causes errors
2. **Identification**: Find most recent valid backup
3. **Restoration**: Restore config from backup
4. **Verification**: Validate restored config
5. **Notification**: Log recovery action

### **Manual Recovery**

```typescript
// Check config health
const health = await configManager.checkHealth();
if (!health.isHealthy) {
  console.log("Config issues detected:", health.issues);

  // Restore from backup
  await configManager.autoRestore();
}

// Recovery from specific backup
try {
  await configManager.restoreFromBackup("backup-name.js");
  console.log("Successfully restored from backup");
} catch (error) {
  console.error("Restore failed:", error.message);
}
```

### **Recovery Scenarios**

- **Corrupted Config**: Auto-restore from last known good backup
- **Invalid Provider**: Disable problematic provider, restore working config
- **Performance Issues**: Restore previous performance settings
- **Validation Failures**: Rollback to validated configuration

---

##  **Cleanup & Maintenance**

### **Automatic Cleanup**

```typescript
// Configure automatic cleanup
await configManager.updateConfig({
  backup: {
    retention: 30, // Keep backups for 30 days
    maxBackups: 100, // Keep max 100 backups
    autoCleanup: true, // Enable automatic cleanup
  },
});
```

### **Manual Cleanup**

```typescript
// Clean old backups
const cleaned = await configManager.cleanupBackups({
  olderThan: 30, // Days
  keepMinimum: 5, // Always keep at least 5 backups
});
console.log(`Cleaned ${cleaned.count} old backups`);

// Verify backup integrity
const verification = await configManager.verifyBackups();
console.log("Backup verification:", verification);
```

---

##  **Monitoring & Diagnostics**

### **Config Status**

```typescript
// Get config status
const status = await configManager.getStatus();
console.log("Config status:", {
  isValid: status.isValid,
  lastUpdated: status.lastUpdated,
  backupCount: status.backupCount,
  providerStatus: status.providers,
});
```

### **Provider Health Monitoring**

```typescript
// Check provider health
const providers = await configManager.checkProviderHealth();
providers.forEach((provider) => {
  console.log(`${provider.name}: ${provider.status}`);
  if (provider.status === "error") {
    console.log(`Error: ${provider.error}`);
  }
});
```

### **Performance Metrics**

```typescript
// Get performance metrics
const metrics = await configManager.getMetrics();
console.log("Config performance:", {
  updateTime: metrics.averageUpdateTime,
  validationTime: metrics.averageValidationTime,
  backupTime: metrics.averageBackupTime,
});
```

---

##  **Best Practices**

### **Configuration Management**

1. **Always Validate**: Enable validation before updates
2. **Use Backups**: Keep automatic backups enabled
3. **Monitor Health**: Regular provider health checks
4. **Version Control**: Consider versioning config files
5. **Environment Separation**: Different configs for dev/prod

### **Performance Optimization**

1. **Cache Settings**: Enable caching for frequently used configs
2. **Timeout Tuning**: Set appropriate timeouts for your use case
3. **Provider Selection**: Use fastest available providers
4. **Cleanup Schedule**: Regular backup cleanup

### **Security Considerations**

1. **API Key Management**: Store API keys securely
2. **Backup Encryption**: Consider encrypting sensitive backups
3. **Access Control**: Limit config update permissions
4. **Audit Logging**: Log all config changes

---

## 🆘 **Troubleshooting**

### **Common Issues**

**Config Update Fails**

```bash
# Check config validation
npx @juspay/neurolink config validate

# Check provider status
npx @juspay/neurolink status

# Restore from backup
npx @juspay/neurolink config restore --backup latest
```

**Backup System Issues**

```bash
# Verify backup directory
ls -la .neurolink.backups/

# Check backup integrity
npx @juspay/neurolink config verify-backups

# Manual cleanup
npx @juspay/neurolink config cleanup --older-than 30
```

**Provider Configuration Issues**

```bash
# Test provider connection
npx @juspay/neurolink test-provider google

# Reset provider config
npx @juspay/neurolink config reset-provider google

# Check environment variables
npx @juspay/neurolink env check
```

### **Support & Resources**

- **Documentation**: See [API Reference](/docs/sdk/api-reference) for interface details
- **Migration Guide**: See `docs/INTERFACE-MIGRATION-GUIDE.md`
- **Troubleshooting**: See `docs/TROUBLESHOOTING.md`
- **GitHub Issues**: Report bugs and feature requests

---

** Enterprise configuration management provides robust, reliable, and maintainable configuration handling for production NeuroLink deployments.**

---

## Enterprise & Proxy Setup Guide

<!-- Source: deployment/enterprise-proxy.md -->

#  Enterprise & Proxy Setup Guide

NeuroLink provides comprehensive proxy support for enterprise environments, enabling AI integration behind corporate firewalls and proxy servers.

## ✨ Zero Configuration Proxy Support

NeuroLink automatically detects and uses proxy settings when environment variables are configured. **No code changes required.**

### Quick Setup

```bash
# Set proxy environment variables
export HTTPS_PROXY=http://your-corporate-proxy:port
export HTTP_PROXY=http://your-corporate-proxy:port

# NeuroLink will automatically use these settings
npx @juspay/neurolink generate "Hello from behind corporate proxy"
```

##  Environment Variables

### Required Proxy Variables

| Variable      | Description                     | Example                         |
| ------------- | ------------------------------- | ------------------------------- |
| `HTTPS_PROXY` | Proxy server for HTTPS requests | `http://proxy.company.com:8080` |
| `HTTP_PROXY`  | Proxy server for HTTP requests  | `http://proxy.company.com:8080` |

### Optional Proxy Variables

| Variable   | Description             | Default               |
| ---------- | ----------------------- | --------------------- |
| `NO_PROXY` | Domains to bypass proxy | `localhost,127.0.0.1` |

##  Provider-Specific Proxy Support

### ✅ Full Proxy Support

All NeuroLink providers automatically work through corporate proxies:

| Provider             | Proxy Method                        | Status               |
| -------------------- | ----------------------------------- | -------------------- |
| **Anthropic Claude** | Direct fetch calls with proxy       | ✅ Verified + Tested |
| **OpenAI**           | Global fetch handling               | ✅ Verified + Tested |
| **Google Vertex AI** | Custom fetch with undici ProxyAgent | ✅ Verified + Tested |
| **Google AI Studio** | Custom fetch with undici ProxyAgent | ✅ Verified + Tested |
| **Mistral AI**       | Custom fetch with undici ProxyAgent | ✅ Verified + Tested |
| **Ollama**           | Custom fetch with undici ProxyAgent | ✅ Verified + Tested |
| **HuggingFace**      | Custom fetch with undici ProxyAgent | ✅ Implemented       |
| **Azure OpenAI**     | Custom fetch with undici ProxyAgent | ✅ Implemented       |
| **Amazon Bedrock**   | Global fetch handling               | ✅ Implemented       |

##  Quick Validation

### Test Proxy Configuration

```bash
# 1. Set proxy variables
export HTTPS_PROXY=http://your-proxy:port
export HTTP_PROXY=http://your-proxy:port

# 2. Test with any provider
npx @juspay/neurolink generate "Test proxy connection" --provider google-ai

# 3. Check proxy logs for connection intercepts
```

### Verify Proxy Usage

When proxy is working correctly, you should see:

- ✅ AI responses generated successfully
- ✅ Proxy server logs showing intercepted connections
- ✅ No direct internet access required
- ✅ Enterprise MCP tools work alongside proxy

### Enterprise Grade Testing

NeuroLink includes comprehensive proxy validation tests:

```bash
# Run enterprise proxy tests
npm test -- test/proxy/proxySupport.test.ts

# Test all providers with proxy + MCP
npm test -- test/proxy/proxySupport.test.ts --run
```

**Test Coverage:**

- ✅ Proxy usage validation (negative/positive testing)
- ✅ All enterprise providers (Anthropic, OpenAI, Vertex, Mistral, Ollama)
- ✅ MCP + Proxy compatibility (enterprise grade)
- ✅ Real-world timeout handling
- ✅ SDK and CLI interface testing

##  Enterprise Configuration Examples

### Corporate Firewall Setup

```bash
# Standard corporate proxy
export HTTPS_PROXY=http://proxy.company.com:8080
export HTTP_PROXY=http://proxy.company.com:8080
export NO_PROXY=localhost,127.0.0.1,.company.com
```

### Authenticated Proxy

```bash
# Proxy with authentication
export HTTPS_PROXY=http://username:password@proxy.company.com:8080
export HTTP_PROXY=http://username:password@proxy.company.com:8080
```

### Multiple Environment Setup

```bash
# Development environment
export HTTPS_PROXY=http://dev-proxy.company.com:8080

# Production environment
export HTTPS_PROXY=http://prod-proxy.company.com:8080
```

## ️ Technical Implementation

### Architecture Overview

NeuroLink uses the **undici ProxyAgent** for reliable proxy support:

```typescript
// Automatic proxy detection and configuration
const proxyFetch = createProxyFetch();

// Provider integration varies by SDK capabilities:
// - Custom fetch parameter (Google AI, Vertex AI)
// - Direct fetch calls (Anthropic)
// - Global fetch handling (OpenAI, Bedrock)
```

### Key Benefits

-  **Automatic Detection** - Zero configuration for standard setups
-  **Enterprise Ready** - Works with corporate authentication
- ⚡ **High Performance** - Optimized undici implementation
- ️ **Security Compliant** - Respects corporate security policies

##  Troubleshooting

### Common Issues

#### Proxy Not Working

```bash
# Check environment variables
echo $HTTPS_PROXY
echo $HTTP_PROXY

# Verify proxy server accessibility
curl -I --proxy $HTTPS_PROXY https://api.openai.com
```

#### Connection Timeouts

```bash
# Increase timeout for slow proxies
export NEUROLINK_TIMEOUT=60000  # 60 seconds
```

#### Authentication Issues

```bash
# URL encode special characters in credentials
# @ becomes %40, : becomes %3A
export HTTPS_PROXY=http://user%40domain.com:pass%3Aword@proxy:8080
```

### Debug Mode

```bash
# Enable detailed proxy logging
export DEBUG=neurolink:proxy
npx @juspay/neurolink generate "Debug proxy connection" --debug
```

##  AWS & Cloud Deployment

### AWS Corporate Environment

```bash
# Set in AWS Lambda environment variables
HTTPS_PROXY=http://corporate-proxy.amazonaws.com:8080
HTTP_PROXY=http://corporate-proxy.amazonaws.com:8080
```

### Docker Deployment

```dockerfile
# Dockerfile
ENV HTTPS_PROXY=http://proxy.company.com:8080
ENV HTTP_PROXY=http://proxy.company.com:8080
RUN npm install @juspay/neurolink
```

### Kubernetes Configuration

```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: neurolink-app
          env:
            - name: HTTPS_PROXY
              value: "http://proxy.company.com:8080"
            - name: HTTP_PROXY
              value: "http://proxy.company.com:8080"
```

##  Checklist for Enterprise Deployment

### Pre-deployment

- [ ] Proxy server details obtained from IT team
- [ ] Network connectivity tested with curl/wget
- [ ] Authentication credentials secured
- [ ] Firewall rules configured for AI provider domains

### Testing

- [ ] Environment variables set correctly
- [ ] NeuroLink proxy test successful
- [ ] All required providers accessible
- [ ] Production environment validated

### Security

- [ ] Proxy credentials stored securely
- [ ] NO_PROXY configured for internal services
- [ ] SSL/TLS verification enabled
- [ ] Logging configured appropriately

##  Related Documentation

- [Provider Configuration](/docs/getting-started/provider-setup) - Detailed provider setup
- [CLI Guide](/docs/cli) - Command line proxy usage
- [Environment Variables](/docs/getting-started/environment-variables) - Complete variable reference
- [Troubleshooting](/docs/reference/troubleshooting) - Common issues and solutions

---

**Enterprise Support**: For enterprise deployment assistance, contact [enterprise@juspay.in](mailto:enterprise@juspay.in)

---

## Performance Optimization Guide for NeuroLink CLI with Domain Features

<!-- Source: deployment/performance-guide.md -->

# Performance Optimization Guide for NeuroLink CLI with Domain Features

This guide provides comprehensive strategies for optimizing performance when using NeuroLink CLI with domain-specific features and factory pattern infrastructure.

## Table of Contents

- [Overview](#overview)
- [Performance Benchmarks](#performance-benchmarks)
- [CLI Startup Optimization](#cli-startup-optimization)
- [Domain Configuration Performance](#domain-configuration-performance)
- [Memory Usage Optimization](#memory-usage-optimization)
- [Generation Speed Optimization](#generation-speed-optimization)
- [Streaming Performance](#streaming-performance)
- [Provider Selection Strategy](#provider-selection-strategy)
- [Context Data Optimization](#context-data-optimization)
- [Caching and Configuration](#caching-and-configuration)
- [Monitoring and Profiling](#monitoring-and-profiling)
- [Troubleshooting](#troubleshooting)

## Overview

The NeuroLink CLI with Phase 1 Factory Infrastructure introduces domain-specific features that enhance functionality while maintaining performance. This guide helps you optimize performance across different use cases and configurations.

### Performance Goals

- **CLI Startup**: \ startup-profile.txt

# Monitor system calls during startup
strace -c neurolink --version 2>&1 | grep -E "(calls|syscall)"
```

## Domain Configuration Performance

### Efficient Domain Usage

1. **Choose Appropriate Domain**

   ```bash
   # Use specific domain for better performance
   neurolink generate "healthcare query" --evaluationDomain healthcare  # Optimized
   neurolink generate "healthcare query" --evaluationDomain analytics   # Less optimized
   ```

2. **Selective Feature Enablement**

   ```bash
   # Enable only needed features
   neurolink generate "prompt" --evaluationDomain healthcare --enable-evaluation  # Evaluation only
   neurolink generate "prompt" --evaluationDomain healthcare --enable-analytics   # Analytics only
   neurolink generate "prompt" --evaluationDomain healthcare --enable-evaluation --enable-analytics  # Both (higher overhead)
   ```

3. **Configuration Defaults**
   ```bash
   # Set defaults to avoid runtime overhead
   neurolink config init
   # Configure default domain and features during setup
   ```

### Domain-Specific Optimizations

#### Healthcare Domain

```bash
# Optimized healthcare usage
neurolink generate "medical query" \
  --evaluationDomain healthcare \
  --enable-evaluation \
  --max-tokens 800 \
  --provider anthropic \
  --format json
```

#### Analytics Domain

```bash
# Optimized analytics usage
neurolink generate "data analysis query" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --max-tokens 1200 \
  --provider google-ai \
  --format json
```

#### Finance Domain

```bash
# Optimized finance usage
neurolink generate "financial analysis" \
  --evaluationDomain finance \
  --enable-evaluation \
  --max-tokens 1000 \
  --provider openai \
  --format json
```

## Memory Usage Optimization

### Memory-Efficient Practices

1. **Context Size Management**

   ```bash
   # Efficient - minimal context
   neurolink generate "prompt" \
     --context '{"key":"value"}' \
     --evaluationDomain analytics

   # Inefficient - large context
   neurolink generate "prompt" \
     --context '{"massive":{"nested":{"object":"with-lots-of-data"}}}' \
     --evaluationDomain analytics
   ```

2. **Token Limit Optimization**

   ```bash
   # Set appropriate token limits
   neurolink generate "short query" --max-tokens 200 --dryRun
   neurolink generate "complex analysis" --max-tokens 2000 --dryRun
   ```

3. **Sequential Processing**
   ```bash
   # Process in sequence rather than parallel for memory efficiency
   neurolink generate "query1" --evaluationDomain healthcare --dryRun
   neurolink generate "query2" --evaluationDomain analytics --dryRun
   ```

### Memory Monitoring

```bash
# Monitor memory usage during operation
watch -n 1 'ps aux | grep neurolink | grep -v grep'

# Memory profiling with detailed breakdown
valgrind --tool=massif neurolink generate "test" --dryRun

# System memory monitoring
top -p $(pgrep -f neurolink)
```

## Generation Speed Optimization

### Speed Optimization Strategies

1. **Provider Selection for Speed**

   ```bash
   # Fast providers for quick responses
   neurolink generate "prompt" --provider google-ai --max-tokens 500

   # Quality vs speed tradeoff
   neurolink generate "prompt" --provider anthropic --max-tokens 1000  # Higher quality, slower
   neurolink generate "prompt" --provider google-ai --max-tokens 800   # Faster response
   ```

2. **Optimal Token Limits**

   ```bash
   # Right-size token limits for your use case
   neurolink generate "brief summary" --max-tokens 200      # Fast
   neurolink generate "detailed analysis" --max-tokens 1500  # Comprehensive
   ```

3. **Format Selection Impact**

   ```bash
   # Text format (fastest)
   neurolink generate "prompt" --format text

   # JSON format (slight overhead for parsing)
   neurolink generate "prompt" --format json

   # Table format (most processing overhead)
   neurolink generate "prompt" --format table
   ```

### Generation Performance Monitoring

```bash
# Time different configurations
hyperfine 'neurolink generate "test" --dryRun' \
          'neurolink generate "test" --evaluationDomain healthcare --dryRun' \
          'neurolink generate "test" --evaluationDomain analytics --enable-analytics --dryRun'

# Profile generation performance
time neurolink generate "performance test prompt" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --format json \
  --max-tokens 1000
```

## Streaming Performance

### Streaming Optimization

1. **Efficient Streaming Setup**

   ```bash
   # Optimized streaming command
   neurolink stream "streaming prompt" \
     --evaluationDomain analytics \
     --enable-evaluation \
     --provider google-ai
   ```

2. **Streaming vs Generation Trade-offs**

   ```bash
   # Use streaming for real-time feedback
   neurolink stream "long analysis" --evaluationDomain healthcare

   # Use generation for batch processing
   neurolink generate "batch analysis" --evaluationDomain healthcare --format json
   ```

3. **Streaming Performance Monitoring**

   ```bash
   # Monitor streaming latency
   time neurolink stream "test prompt" --dryRun

   # Monitor streaming throughput
   neurolink stream "long content generation" --dryRun | wc -c
   ```

### Streaming Best Practices

```bash
# Optimal streaming configuration
neurolink stream "complex analysis requiring real-time feedback" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --provider google-ai \
  --max-tokens 1500
```

## Provider Selection Strategy

### Performance-Based Provider Selection

1. **Speed-Optimized Providers**

   ```bash
   # Fastest response times (typically)
   neurolink generate "prompt" --provider google-ai

   # Good balance of speed and quality
   neurolink generate "prompt" --provider openai

   # Higher quality, potentially slower
   neurolink generate "prompt" --provider anthropic
   ```

2. **Domain-Specific Provider Optimization**

   ```bash
   # Healthcare domain - high accuracy priority
   neurolink generate "medical query" --provider anthropic --evaluationDomain healthcare

   # Analytics domain - speed and structured output
   neurolink generate "data analysis" --provider google-ai --evaluationDomain analytics

   # Finance domain - precision and compliance
   neurolink generate "financial analysis" --provider openai --evaluationDomain finance
   ```

3. **Provider Performance Testing**
   ```bash
   # Compare providers for your use case
   for provider in google-ai openai anthropic; do
     echo "Testing $provider:"
     time neurolink generate "test prompt" --provider $provider --evaluationDomain analytics --dryRun
   done
   ```

## Context Data Optimization

### Efficient Context Structures

1. **Optimized Context Design**

   ```bash
   # Efficient - flat structure
   neurolink generate "prompt" \
     --context '{"userId":"123","department":"analytics","priority":"high"}' \
     --evaluationDomain analytics

   # Less efficient - deeply nested
   neurolink generate "prompt" \
     --context '{"user":{"profile":{"details":{"id":"123","dept":{"name":"analytics"}}}}}' \
     --evaluationDomain analytics
   ```

2. **Context Size Guidelines**

   ```bash
   # Small context (5KB) - potential performance impact
   # Consider breaking into smaller requests or summarizing
   ```

3. **Context Caching Strategies**

   ```bash
   # Reuse context across related queries
   CONTEXT='{"organizationId":"acme","department":"analytics","quarter":"Q3"}'

   neurolink generate "query1" --context "$CONTEXT" --evaluationDomain analytics
   neurolink generate "query2" --context "$CONTEXT" --evaluationDomain analytics
   ```

## Caching and Configuration

### Configuration Optimization

1. **Pre-configure for Performance**

   ```bash
   # Set up optimal defaults
   neurolink config init
   # Choose fast provider as default
   # Set reasonable token limits
   # Configure caching preferences
   ```

2. **Cache Configuration**

   ```bash
   # Enable caching for better performance
   neurolink config show | grep -i cache

   # Configure cache strategy (set during init)
   # memory - fastest access
   # file - persistent across sessions
   # redis - shared across instances
   ```

3. **Provider Configuration Caching**
   ```bash
   # Cache provider settings
   export NEUROLINK_DEFAULT_PROVIDER=google-ai
   export NEUROLINK_DEFAULT_MODEL=gemini-2.5-pro
   export NEUROLINK_DEFAULT_MAX_TOKENS=1000
   ```

### Performance Monitoring Configuration

```bash
# Enable performance analytics
neurolink generate "test" \
  --enable-analytics \
  --evaluationDomain analytics \
  --format json | jq '.analytics'

# Configure detailed logging for performance analysis
neurolink generate "test" --debug --verbose 2>&1 | grep -i "time\|duration\|latency"
```

## Monitoring and Profiling

### Built-in Performance Analytics

```bash
# Enable analytics for performance insights
neurolink generate "performance test" \
  --enable-analytics \
  --evaluationDomain analytics \
  --format json | jq '.analytics.responseTime'

# Monitor evaluation performance
neurolink generate "evaluation test" \
  --enable-evaluation \
  --evaluationDomain healthcare \
  --format json | jq '.evaluation.evaluationTime'
```

### System-Level Monitoring

1. **CPU Usage Monitoring**

   ```bash
   # Monitor CPU usage during generation
   top -p $(pgrep -f neurolink) -b -n 1 | grep neurolink

   # Continuous monitoring
   watch -n 1 'ps -p $(pgrep -f neurolink) -o pid,pcpu,pmem,time,cmd'
   ```

2. **Memory Usage Tracking**

   ```bash
   # Memory usage snapshot
   ps -p $(pgrep -f neurolink) -o pid,rss,vsz,pmem

   # Memory usage over time
   while true; do
     ps -p $(pgrep -f neurolink) -o rss --no-headers
     sleep 1
   done
   ```

3. **Network Performance**

   ```bash
   # Monitor network calls (requires network monitoring tools)
   iftop -i eth0 -P

   # Monitor API response times
   neurolink generate "test" --debug 2>&1 | grep -i "response\|latency"
   ```

### Performance Profiling Tools

```bash
# Node.js profiling for CLI performance
NODE_OPTIONS="--prof" neurolink generate "test" --dryRun
node --prof-process isolate-*.log > performance-profile.txt

# Memory profiling
NODE_OPTIONS="--heapsnapshot-signal=SIGUSR2" neurolink generate "test" --dryRun

# System call tracing
strace -c neurolink generate "test" --dryRun 2>&1 | tail -20
```

## Troubleshooting

### Common Performance Issues

1. **Slow CLI Startup**

   ```bash
   # Check configuration loading time
   time neurolink config validate

   # Verify provider configuration
   neurolink config show | grep -i provider

   # Test with minimal configuration
   neurolink --version  # Should be very fast
   ```

2. **High Memory Usage**

   ```bash
   # Check for memory leaks
   valgrind --leak-check=full neurolink generate "test" --dryRun

   # Monitor memory growth
   watch -n 1 'ps aux | grep neurolink | grep -v grep | awk "{print \$6}"'

   # Reduce context size
   neurolink generate "test" --context '{"minimal":"data"}' --dryRun
   ```

3. **Slow Generation Speed**

   ```bash
   # Test with different providers
   time neurolink generate "test" --provider google-ai --dryRun
   time neurolink generate "test" --provider openai --dryRun
   time neurolink generate "test" --provider anthropic --dryRun

   # Reduce token limits
   neurolink generate "test" --max-tokens 200 --dryRun

   # Disable unnecessary features
   neurolink generate "test" --dryRun  # No domain features
   ```

4. **Streaming Latency Issues**

   ```bash
   # Test streaming vs generation
   time neurolink stream "test" --dryRun
   time neurolink generate "test" --dryRun

   # Check network connectivity
   ping google.com
   curl -I https://api.openai.com/v1/models
   ```

### Performance Debugging Commands

```bash
# Comprehensive performance test
echo "=== CLI Startup Performance ===" && \
time neurolink --version && \
echo "=== Basic Generation Performance ===" && \
time neurolink generate "test" --dryRun && \
echo "=== Domain Feature Performance ===" && \
time neurolink generate "test" --evaluationDomain analytics --enable-evaluation --dryRun && \
echo "=== Streaming Performance ===" && \
time neurolink stream "test" --dryRun

# Memory usage test
echo "=== Memory Usage Test ===" && \
neurolink generate "memory test with domain features" \
  --evaluationDomain analytics \
  --enable-evaluation \
  --enable-analytics \
  --format json \
  --dryRun &
PID=$! && \
sleep 2 && \
ps -p $PID -o pid,rss,vsz,pmem && \
wait $PID
```

### Performance Optimization Checklist

- [ ] **Configuration optimized**: Run `neurolink config init` with optimal settings
- [ ] **Provider selected**: Choose appropriate provider for your use case
- [ ] **Token limits set**: Use appropriate `--max-tokens` for your needs
- [ ] **Context minimized**: Keep context data lean and relevant
- [ ] **Features selective**: Only enable needed evaluation/analytics features
- [ ] **Format appropriate**: Choose optimal output format for your workflow
- [ ] **Monitoring enabled**: Use `--enable-analytics` to track performance
- [ ] **Caching configured**: Set up appropriate caching strategy
- [ ] **Environment optimized**: Configure API keys and environment variables
- [ ] **System resources**: Ensure adequate CPU and memory available

## Best Practices Summary

1. **Start Simple**: Begin with basic commands and add features incrementally
2. **Measure First**: Establish baseline performance before optimization
3. **Right-size Resources**: Use appropriate token limits and context sizes
4. **Choose Wisely**: Select providers and domains that match your performance needs
5. **Monitor Continuously**: Use built-in analytics and system monitoring
6. **Cache Effectively**: Configure caching for frequently used operations
7. **Test Regularly**: Perform regular performance testing as you scale usage
8. **Profile When Needed**: Use profiling tools for detailed performance analysis

For additional performance optimization support, see the [CLI Reference](/docs/cli/commands) and [Configuration Guide](/docs/deployment/configuration).

---

## Performance Optimization Guide

<!-- Source: deployment/performance.md -->

# Performance Optimization Guide

Comprehensive guide for optimizing NeuroLink performance, reducing latency, and maximizing throughput in production environments.

##  Quick Performance Wins

### Immediate Optimizations

1. **Enable Response Caching**

   ```typescript
   const neurolink = new NeuroLink({
     caching: {
       enabled: true,
       ttl: 300000, // 5 minutes
       maxSize: 1000,
     },
   });
   ```

2. **Use Streaming for Long Responses**

   ```typescript
   const stream = await neurolink.stream({
     input: { text: "Write a comprehensive report..." },
     provider: "anthropic",
   });

   for await (const chunk of stream) {
     console.log(chunk.content); // Process immediately
   }
   ```

3. **Implement Request Batching**

   ```bash
   # CLI batch processing
   npx @juspay/neurolink batch process \
     --input prompts.txt \
     --output results.json \
     --parallel 3
   ```

##  Performance Monitoring

### Real-time Metrics

```typescript

const neurolink = new NeuroLink({
  monitoring: {
    enabled: true,
    metricsInterval: 30000, // 30 seconds
    trackLatency: true,
    trackThroughput: true,
    trackErrors: true,
  },
});

// Get performance insights
const monitor = new PerformanceMonitor(neurolink);
const metrics = await monitor.getMetrics();

console.log("Average Response Time:", metrics.averageLatency);
console.log("Requests per Second:", metrics.throughput);
console.log("Error Rate:", metrics.errorRate);
```

### Performance Dashboard

```typescript
// Setup real-time performance dashboard
const dashboard = new PerformanceDashboard({
  refreshInterval: 5000, // 5 seconds
  metrics: [
    "response_time",
    "throughput",
    "cache_hit_ratio",
    "provider_health",
    "error_rate",
    "token_usage",
  ],
});

await dashboard.start();
```

## ⚡ Provider Optimization

### Provider Selection Strategy

```typescript
// Intelligent provider routing
const neurolink = new NeuroLink({
  routing: {
    strategy: "performance_optimized",
    criteria: {
      latency: 0.4, // 40% weight
      reliability: 0.3, // 30% weight
      cost: 0.2, // 20% weight
      quality: 0.1, // 10% weight
    },
  },
});
```

### Response Time Optimization

```typescript
// Provider-specific timeouts
const optimizedConfig = {
  providers: {
    openai: { timeout: 15000 }, // Fast for simple tasks
    anthropic: { timeout: 30000 }, // Balanced
    bedrock: { timeout: 45000 }, // Longer for complex reasoning
  },
};
```

### Load Balancing

```typescript
// Multi-provider load balancing
const loadBalancer = new ProviderLoadBalancer({
  providers: ["openai", "anthropic", "google-ai"],
  algorithm: "least_loaded",
  healthChecks: {
    interval: 30000,
    timeout: 5000,
    failureThreshold: 3,
  },
});
```

##  Advanced Configuration

### Connection Pooling

```typescript
const neurolink = new NeuroLink({
  connectionPool: {
    maxConnections: 20,
    keepAlive: true,
    maxIdleTime: 30000,
    retryOnFailure: true,
  },
});
```

### Request Optimization

```typescript
// Optimize token usage
const optimizedRequest = {
  input: { text: prompt },
  maxTokens: calculateOptimalTokens(prompt),
  temperature: 0.7,
  stopSequences: ["---", "END"],
  truncateInput: true,
  compressHistory: true,
};
```

### Parallel Processing

```typescript
// Parallel request processing
async function processInParallel(prompts: string[]) {
  const chunks = chunkArray(prompts, 5); // Process 5 at a time

  for (const chunk of chunks) {
    const promises = chunk.map((prompt) =>
      neurolink.generate({ input: { text: prompt } }),
    );

    const results = await Promise.allSettled(promises);
    processResults(results);
  }
}
```

## ️ CLI Performance Optimization

### Batch Operations

```bash
# High-performance batch processing
npx @juspay/neurolink batch process \
  --input large_dataset.jsonl \
  --output results.jsonl \
  --parallel 10 \
  --chunk-size 100 \
  --enable-caching \
  --provider-strategy fastest
```

### Parallel Provider Testing

```bash
# Test multiple providers simultaneously
npx @juspay/neurolink benchmark \
  --providers openai,anthropic,google-ai \
  --concurrent 3 \
  --iterations 10 \
  --output benchmark_results.json
```

### Streaming Mode

```bash
# Enable streaming for immediate output
npx @juspay/neurolink gen "Write a long article" \
  --stream \
  --provider anthropic \
  --no-buffer
```

##  Caching Strategies

### Multi-Level Caching

```typescript
const neurolink = new NeuroLink({
  caching: {
    levels: {
      memory: {
        enabled: true,
        maxSize: 500, // In-memory cache
        ttl: 300000, // 5 minutes
      },
      redis: {
        enabled: true,
        host: "localhost",
        port: 6379,
        ttl: 3600000, // 1 hour
      },
      file: {
        enabled: true,
        directory: "./cache",
        ttl: 86400000, // 24 hours
      },
    },
  },
});
```

### Smart Cache Keys

```typescript
// Content-based caching
const cacheConfig = {
  keyStrategy: "content_hash",
  includeProvider: false, // Cache across providers
  includeTemperature: true, // Different temps = different cache
  versionKey: "v1.0", // Cache versioning
};
```

### Cache Warming

```bash
# Pre-populate cache with common queries
npx @juspay/neurolink cache warm \
  --patterns common_prompts.txt \
  --providers openai,anthropic \
  --temperature-range 0.1,0.5,0.9
```

##  Production Optimization

### Environment Configuration

```bash
# Production environment variables
export NODE_ENV=production
export NEUROLINK_CACHE_ENABLED=true
export NEUROLINK_POOL_SIZE=20
export NEUROLINK_MAX_RETRIES=3
export NEUROLINK_TIMEOUT=30000
export NEUROLINK_COMPRESSION=true
```

### Resource Management

```typescript
// Production resource limits
const productionConfig = {
  limits: {
    maxConcurrentRequests: 50,
    maxQueueSize: 200,
    maxMemoryUsage: "512MB",
    requestTimeout: 30000,
    maxTokensPerRequest: 4000,
  },
  monitoring: {
    alertThresholds: {
      errorRate: 0.05, // 5% error rate
      avgLatency: 5000, // 5 second response time
      queueDepth: 100, // 100 queued requests
    },
  },
};
```

### Auto-scaling

```typescript
// Auto-scaling configuration
const scaler = new AutoScaler({
  minInstances: 2,
  maxInstances: 10,
  scaleUpThreshold: {
    cpuUsage: 70,
    memoryUsage: 80,
    queueDepth: 50,
  },
  scaleDownThreshold: {
    cpuUsage: 30,
    memoryUsage: 40,
    queueDepth: 5,
  },
  cooldown: 300000, // 5 minutes
});
```

##  Performance Debugging

### Profiling Tools

```typescript
// Enable detailed profiling
const neurolink = new NeuroLink({
  profiling: {
    enabled: process.env.NODE_ENV === "development",
    includeStackTraces: true,
    trackMemoryUsage: true,
    outputFile: "./performance.log",
  },
});
```

### Latency Analysis

```bash
# Analyze response time patterns
npx @juspay/neurolink analyze latency \
  --log-file performance.log \
  --time-range "last 24h" \
  --group-by provider,model \
  --percentiles 50,90,95,99
```

### Bottleneck Detection

```typescript
// Identify performance bottlenecks
const analyzer = new PerformanceAnalyzer();
const report = await analyzer.analyze({
  timeRange: "24h",
  groupBy: ["provider", "model", "requestSize"],
  metrics: ["latency", "throughput", "errorRate"],
});

console.log("Slowest operations:", report.bottlenecks);
console.log("Optimization recommendations:", report.recommendations);
```

##  Enterprise Performance

### Load Testing

```bash
# Comprehensive load testing
npx @juspay/neurolink load-test \
  --target-rps 100 \
  --duration 10m \
  --providers openai,anthropic \
  --scenarios scenarios.json \
  --report performance_report.html
```

### Stress Testing

```typescript
// Stress test configuration
const stressTest = new StressTestRunner({
  rampUp: {
    startRPS: 1,
    endRPS: 500,
    duration: "5m",
  },
  plateau: {
    targetRPS: 500,
    duration: "10m",
  },
  rampDown: {
    duration: "2m",
  },
});

const results = await stressTest.run();
```

### Capacity Planning

```typescript
// Capacity planning calculator
const planner = new CapacityPlanner({
  expectedUsers: 10000,
  averageRequestsPerUser: 5,
  peakMultiplier: 3,
  responseTimeTarget: 2000, // 2 seconds
  availabilityTarget: 99.9, // 99.9% uptime
});

const requirements = planner.calculate();
console.log("Required capacity:", requirements);
```

##  Performance Benchmarks

### Provider Comparison

| Provider  | Avg Latency | Throughput | Success Rate | Cost/1K tokens |
| --------- | ----------- | ---------- | ------------ | -------------- |
| OpenAI    | 1.2s        | 150 req/s  | 99.5%        | $0.03          |
| Anthropic | 1.8s        | 120 req/s  | 99.8%        | $0.015         |
| Google AI | 0.9s        | 200 req/s  | 99.2%        | $0.025         |
| Bedrock   | 2.1s        | 100 req/s  | 99.9%        | $0.02          |

### Optimization Results

```typescript
// Before vs After optimization
const benchmarks = {
  before: {
    avgLatency: 3500, // 3.5 seconds
    throughput: 50, // 50 req/s
    errorRate: 0.02, // 2% errors
    cacheHitRate: 0, // No caching
  },
  after: {
    avgLatency: 1200, // 1.2 seconds (-66%)
    throughput: 180, // 180 req/s (+260%)
    errorRate: 0.005, // 0.5% errors (-75%)
    cacheHitRate: 0.35, // 35% cache hits
  },
};
```

## ️ Monitoring and Alerting

### Performance Alerts

```typescript
// Setup performance monitoring alerts
const alerts = new AlertManager({
  thresholds: {
    responseTime: {
      warning: 2000, // 2 seconds
      critical: 5000, // 5 seconds
    },
    errorRate: {
      warning: 0.01, // 1%
      critical: 0.05, // 5%
    },
    throughput: {
      warning: 50, // Below 50 req/s
      critical: 20, // Below 20 req/s
    },
  },
  notifications: {
    slack: process.env.SLACK_WEBHOOK,
    email: process.env.ALERT_EMAIL,
  },
});
```

### Real-time Dashboard

```typescript
// Performance monitoring dashboard
const dashboard = {
  metrics: [
    "requests_per_second",
    "average_response_time",
    "error_rate",
    "cache_hit_ratio",
    "provider_health",
    "queue_depth",
    "memory_usage",
    "cpu_usage",
  ],
  charts: [
    "response_time_histogram",
    "throughput_timeline",
    "error_rate_timeline",
    "provider_comparison",
  ],
};
```

##  Troubleshooting Performance Issues

### Common Issues

1. **High Latency**
   - Check provider response times
   - Verify network connectivity
   - Review request complexity
   - Consider request timeouts

2. **Low Throughput**
   - Increase connection pool size
   - Enable parallel processing
   - Optimize request batching
   - Check rate limits

3. **Memory Leaks**
   - Monitor cache size
   - Review object retention
   - Check for unclosed streams
   - Implement proper cleanup

### Diagnostic Commands

```bash
# Performance diagnostics
npx @juspay/neurolink diagnose performance \
  --verbose \
  --include-providers \
  --include-cache \
  --include-memory \
  --output diagnosis.json
```

##  Video Generation Performance Optimization

Video generation via Veo 3.1 requires special performance considerations due to longer processing times and larger resource requirements.

### Timeout Configuration

Video generation typically takes 1-3 minutes. Configure appropriate timeouts:

```typescript

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: {
    text: "Product showcase video",
    images: [imageBuffer],
  },
  provider: "vertex",
  model: "veo-3.1",
  output: { mode: "video" },
  timeout: 180, // 3 minutes (recommended minimum)
});
```

### Polling Strategy

Video generation uses long-polling. Optimize the polling strategy:

```typescript
// Adjust polling intervals for better performance
const result = await neurolink.generate({
  input: { text: "Video prompt", images: [image] },
  provider: "vertex",
  model: "veo-3.1",
  output: {
    mode: "video",
    video: {
      resolution: "720p", // Use 720p for faster generation
      length: 4, // Shorter videos generate faster (4s vs 8s)
    },
  },
  // Custom polling configuration (if supported)
  pollInterval: 5000, // Check every 5 seconds
  maxPolls: 36, // Up to 3 minutes (36 * 5s)
});
```

### Resource Optimization

**Resolution vs Speed Trade-off:**

| Resolution | Avg Time | File Size | Use Case                    |
| ---------- | -------- | --------- | --------------------------- |
| 720p       | 60-90s   | ~5-10MB   | Social media, previews      |
| 1080p      | 90-180s  | ~15-30MB  | Professional content, demos |

**Length vs Speed Trade-off:**

| Length | Avg Time | Use Case                        |
| ------ | -------- | ------------------------------- |
| 4s     | 60-90s   | Quick animations, teasers       |
| 6s     | 75-120s  | Social media posts              |
| 8s     | 90-180s  | Product showcases, storytelling |

### Batch Processing Strategy

Process multiple videos efficiently:

```typescript

const neurolink = new NeuroLink();

// Limit concurrent video generations (Vertex AI rate limits)
const queue = new PQueue({ concurrency: 2 });

async function generateVideos(requests: VideoRequest[]) {
  const results = await Promise.allSettled(
    requests.map((req) =>
      queue.add(async () => {
        try {
          return await neurolink.generate({
            input: { text: req.prompt, images: [req.image] },
            provider: "vertex",
            model: "veo-3.1",
            output: {
              mode: "video",
              video: {
                resolution: req.resolution || "720p",
                length: req.length || 6,
              },
            },
            timeout: 180,
          });
        } catch (error) {
          console.error(`Failed to generate video: ${req.id}`, error);
          return null;
        }
      }),
    ),
  );

  return results.filter((r) => r.status === "fulfilled" && r.value !== null);
}
```

### Caching Strategy

Video generation is expensive. Implement aggressive caching:

```typescript

// Generate cache key from inputs
function getCacheKey(prompt: string, imageBuffer: Buffer): string {
  const hash = createHash("sha256");
  hash.update(prompt);
  hash.update(imageBuffer);
  return hash.digest("hex");
}

async function generateVideoWithCache(prompt: string, image: Buffer) {
  const cacheKey = getCacheKey(prompt, image);
  const cacheFile = `./cache/videos/${cacheKey}.mp4`;

  // Check cache first
  try {
    await access(cacheFile);
    const cached = await readFile(cacheFile);
    console.log("✅ Video served from cache");
    return { video: { data: cached }, cached: true };
  } catch {
    // Not in cache, generate new
  }

  const neurolink = new NeuroLink();
  const result = await neurolink.generate({
    input: { text: prompt, images: [image] },
    provider: "vertex",
    model: "veo-3.1",
    output: { mode: "video" },
  });

  // Cache the result
  if (result.video) {
    await writeFile(cacheFile, result.video.data);
    console.log("✅ Video cached for future use");
  }

  return { ...result, cached: false };
}
```

### Cost Optimization

**Best Practices:**

1. **Use 720p by default** - 30-50% faster, 60% lower cost
2. **Prefer 4-6 second videos** - Faster generation, lower cost
3. **Implement aggressive caching** - Avoid regenerating identical videos
4. **Batch similar requests** - Group by resolution/length for efficiency
5. **Monitor Vertex AI quotas** - Set up alerts before hitting limits

**Cost Comparison:**

| Configuration      | Avg Time | Relative Cost | Best For             |
| ------------------ | -------- | ------------- | -------------------- |
| 720p, 4s, no audio | 60s      | 1x            | Quick previews       |
| 720p, 6s, audio    | 90s      | 1.5x          | Social media         |
| 1080p, 8s, audio   | 180s     | 3x            | Professional content |

### Error Handling for Long Operations

```typescript

async function robustVideoGeneration(prompt: string, image: Buffer) {
  const neurolink = new NeuroLink();
  const maxRetries = 2;
  let attempt = 0;

  while (attempt < maxRetries) {
    try {
      const result = await neurolink.generate({
        input: { text: prompt, images: [image] },
        provider: "vertex",
        model: "veo-3.1",
        output: { mode: "video" },
        timeout: 180,
      });

      return result;
    } catch (error) {
      attempt++;

      if (error.code === "VIDEO_POLL_TIMEOUT" && attempt < maxRetries) {
        console.log(`Timeout on attempt ${attempt}, retrying...`);
        continue;
      }

      if (error.code === "VIDEO_QUOTA_EXCEEDED") {
        console.error("Quota exceeded. Wait before retrying.");
        throw error;
      }

      throw error;
    }
  }

  throw new Error("Video generation failed after maximum retries");
}
```

### Monitoring Video Generation Performance

```typescript
type VideoMetrics = {
  totalGenerated: number;
  avgGenerationTime: number;
  cacheHitRate: number;
  failureRate: number;
  costEstimate: number;
};

class VideoPerformanceMonitor {
  private metrics: VideoMetrics = {
    totalGenerated: 0,
    avgGenerationTime: 0,
    cacheHitRate: 0,
    failureRate: 0,
    costEstimate: 0,
  };

  recordGeneration(duration: number, cached: boolean, success: boolean) {
    this.metrics.totalGenerated++;

    if (!cached && success) {
      // Update average generation time
      const total =
        this.metrics.avgGenerationTime * (this.metrics.totalGenerated - 1);
      this.metrics.avgGenerationTime =
        (total + duration) / this.metrics.totalGenerated;
    }

    // Update cache hit rate
    const cacheHits =
      this.metrics.cacheHitRate * (this.metrics.totalGenerated - 1);
    this.metrics.cacheHitRate =
      (cacheHits + (cached ? 1 : 0)) / this.metrics.totalGenerated;

    // Update failure rate
    const failures =
      this.metrics.failureRate * (this.metrics.totalGenerated - 1);
    this.metrics.failureRate =
      (failures + (success ? 0 : 1)) / this.metrics.totalGenerated;
  }

  getMetrics(): VideoMetrics {
    return { ...this.metrics };
  }
}
```

This comprehensive performance optimization guide provides the tools and strategies needed to maximize NeuroLink's performance in any environment, from development to large-scale production deployments.

##  Related Documentation

- [Advanced Analytics](/docs/reference/analytics) - Performance tracking and analysis
- [System Architecture](/docs/development/architecture) - Understanding system design
- [Troubleshooting](/docs/reference/troubleshooting) - Common performance issues
- [Enterprise Setup](/docs/getting-started/provider-setup) - Production configuration
- [Video Generation Guide](/docs/features/video-generation) - Complete video generation documentation

---

# Demos

## Visual Demos

<!-- Source: demos/index.md -->

# Visual Demos

Experience NeuroLink through comprehensive visual demonstrations, screenshots, and interactive examples.

##  What You'll See Here

This section showcases NeuroLink's capabilities through visual content, making it easy to understand features before implementation.

-  **[Screenshots](/docs/demos/screenshots)**

  High-quality screenshots of CLI commands, web interfaces, and development workflows.

- ▶️ **[Videos](/docs/demos/videos)**

  Video demonstrations of NeuroLink features, from basic usage to advanced integrations.

-  **[Interactive Demo](/docs/demos/interactive)**

  Live web demonstration with all 9 providers and real AI generation capabilities.

##  Quick Preview

### CLI in Action

[Image: CLI Help Command]

The CLI provides a professional interface with comprehensive help, auto-completion, and rich output formatting.

### Web Interface

[Image: Interactive Demo]

The interactive web demo showcases all features with live AI generation across multiple providers.

## ️ Featured Demonstrations

### Command Line Interface

### Web Applications

##  Video Highlights

### Quick Start (2 minutes)


    Your browser does not support the video tag.


_Complete quick start demonstration from installation to first AI generation_

### Advanced Features (5 minutes)


    Your browser does not support the video tag.


_Analytics, evaluation, custom tools, and MCP integration showcase_

### Enterprise Workflow (8 minutes)


    Your browser does not support the video tag.


_Production deployment, monitoring, and business automation examples_

##  Interactive Demo

Experience NeuroLink live without installation:

:::tip[Live Demo Available]

Visit our [Interactive Demo](https://neurolink-demo.vercel.app) to try NeuroLink with real AI providers.

Features:
- ✅ **Live AI Generation** - All 9 providers functional
- ✅ **Real-time Analytics** - See costs and performance
- ✅ **Built-in Tools** - Experience MCP integration
- ✅ **Multiple Use Cases** - Business, creative, and technical examples
:::

### Demo Highlights

- **No API Keys Required** - Try basic functionality immediately
- **Provider Comparison** - See differences between AI providers
- **Performance Metrics** - Real-time response times and costs
- **Tool Integration** - Experience built-in tools in action

##  Platform Coverage

### Desktop/CLI Demos

- **Terminal recordings** with asciinema
- **Step-by-step tutorials** with screenshots
- **Error handling** demonstrations
- **Advanced workflow** examples

### Web Interface Demos

- **Responsive design** across devices
- **Real-time streaming** visualization
- **Analytics dashboards**
- **Configuration management**

### Mobile Optimization

- **Touch-friendly** interfaces
- **Responsive layouts** for small screens
- **Progressive enhancement** for all devices

##  Visual Assets

All visual content is organized and optimized for:

- **High resolution** screenshots (2x retina)
- **Web-optimized** videos (WebM + MP4)
- **Consistent branding** across all materials
- **Accessibility** with alt text and captions

##  Integration Examples

### Documentation Embedding

```markdown
[Image: NeuroLink CLI Demo]
_NeuroLink CLI with provider status and text generation_
```

### Presentation Materials

- **Slide templates** for talks and presentations
- **Logo assets** in multiple formats
- **Brand guidelines** for consistent usage
- **Social media** preview images

##  Performance Demonstrations

### Before/After Comparisons

See the impact of NeuroLink's optimizations:

- **68% faster** provider status checks
- **Real-time streaming** vs. batch processing
- **Cost optimization** across providers
- **Error recovery** and fallback mechanisms

### Benchmark Results

Visual representations of:

- **Response time** comparisons
- **Cost analysis** across providers
- **Quality metrics** from evaluation system
- **Resource usage** monitoring

## 🆘 Getting Help

If you have questions about any of the demonstrations:

1. **[Troubleshooting Guide](/docs/reference/troubleshooting)** - Common issues
2. **[FAQ](/docs/reference/faq)** - Frequently asked questions
3. **[GitHub Issues](https://github.com/juspay/neurolink/issues)** - Report problems
4. **[Examples](/docs/)** - Code implementations

---

## Interactive Demo

<!-- Source: demos/interactive.md -->

# Interactive Demo

Try NeuroLink directly in your browser with our interactive demonstrations and live examples.

##  Live Web Demo

### Try NeuroLink Now

**[Launch Interactive Demo →](https://demo.neurolink.dev)**

Experience NeuroLink's capabilities without any installation:

- **Real AI Generation**: Test with live AI providers
- **Provider Comparison**: See performance differences
- **Analytics Dashboard**: View usage metrics in real-time
- **MCP Integration**: Explore tool capabilities

**Demo Features:**

- ✅ No registration required
- ✅ Free usage limits
- ✅ Real provider responses
- ✅ Interactive tutorials

### Guided Walkthrough

**[Guided Tour →](https://demo.neurolink.dev/tour)**

Step-by-step interactive tutorial covering:

1. **Basic Text Generation**
   - Simple prompt input
   - Provider selection
   - Response analysis

2. **Advanced Features**
   - Analytics tracking
   - Quality evaluation
   - Streaming responses

3. **Business Applications**
   - Content creation
   - Code generation
   - Data analysis

##  Browser-Based CLI

### Web Terminal

**[CLI Simulator →](https://demo.neurolink.dev/cli)**

Experience the full CLI in your browser:

```bash
# Try these commands in the web terminal:
neurolink gen "Write a haiku about coding"
neurolink status
neurolink provider list
neurolink gen "Explain quantum computing" --provider google-ai
```

**Features:**

- Real command execution
- Syntax highlighting
- Auto-completion
- Command history
- Copy/paste support

### Interactive Examples

**Command Generator:**
Use our interactive form to build CLI commands:

- Select providers
- Set parameters
- Generate commands
- Copy to clipboard
- Execute directly

##  Playground Environments

### Code Playground

**[SDK Playground →](https://demo.neurolink.dev/playground)**

Test NeuroLink SDK integration:

```typescript
// Try this code in the playground:

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Your prompt here" },
  provider: "google-ai",
});

console.log(result.content);
```

**Playground Features:**

- Live code execution
- Multiple language support
- Real API responses
- Shareable snippets
- Download examples

### Business Scenario Simulator

**[Business Demo →](https://demo.neurolink.dev/business)**

Interactive business use cases:

1. **Executive Dashboard**
   - Strategic analysis
   - Performance reporting
   - Decision support

2. **Marketing Workflows**
   - Content creation
   - Campaign analysis
   - SEO optimization

3. **Development Tools**
   - Code generation
   - Documentation
   - Testing assistance

##  Configuration Sandbox

### Provider Setup Simulator

**[Setup Wizard →](https://demo.neurolink.dev/setup)**

Learn configuration without real API keys:

- Mock provider setup
- Environment configuration
- Testing workflows
- Error handling examples

### Custom Integration Builder

**[Integration Builder →](https://demo.neurolink.dev/builder)**

Build custom integrations visually:

- Drag-and-drop workflow design
- Code generation
- Testing environment
- Export capabilities

##  Analytics Dashboard Demo

### Real-time Metrics

**[Analytics Demo →](https://demo.neurolink.dev/analytics)**

Explore analytics capabilities:

- **Usage Tracking**: Monitor API calls and performance
- **Cost Analysis**: Understand provider costs
- **Quality Metrics**: View evaluation scores
- **Performance**: Response times and success rates

### Custom Reports

**[Report Builder →](https://demo.neurolink.dev/reports)**

Create custom analytics reports:

- Drag-and-drop interface
- Multiple chart types
- Data filtering options
- Export capabilities

##  Use Case Simulators

### Industry-Specific Demos

#### Software Development

**[Developer Tools Demo →](https://demo.neurolink.dev/dev)**

Interactive development workflow:

- Code generation requests
- Documentation automation
- Bug analysis
- Testing assistance

Try these scenarios:

- Generate a REST API endpoint
- Create unit tests
- Write technical documentation
- Debug code issues

#### Marketing & Content

**[Marketing Suite Demo →](https://demo.neurolink.dev/marketing)**

Content creation workflow:

- Blog post generation
- Social media content
- Email campaigns
- SEO optimization

Interactive features:

- Brand voice customization
- Target audience selection
- Content performance prediction
- A/B testing simulation

#### Business Intelligence

**[BI Dashboard Demo →](https://demo.neurolink.dev/bi)**

Business analysis capabilities:

- Data interpretation
- Report generation
- Trend analysis
- Decision support

Sample datasets:

- Sales performance data
- Customer behavior metrics
- Market research findings
- Financial projections

##  Comparison Tools

### Provider Performance Comparison

**[Provider Benchmark →](https://demo.neurolink.dev/benchmark)**

Compare providers in real-time:

- Side-by-side generation
- Performance metrics
- Quality evaluation
- Cost analysis

**Test Scenarios:**

- Creative writing tasks
- Technical documentation
- Code generation
- Data analysis

### Feature Comparison Matrix

**[Feature Matrix →](https://demo.neurolink.dev/features)**

Interactive feature comparison:

- Provider capabilities
- Model availability
- Pricing comparison
- Performance metrics

##  Interactive Tutorials

### Step-by-Step Learning

**[Tutorial Series →](https://demo.neurolink.dev/learn)**

Progressive learning experience:

1. **Beginner Level**
   - Basic concepts
   - Simple examples
   - Guided exercises

2. **Intermediate Level**
   - Advanced features
   - Integration patterns
   - Best practices

3. **Expert Level**
   - Complex workflows
   - Custom solutions
   - Performance optimization

### Hands-On Exercises

**[Practice Exercises →](https://demo.neurolink.dev/exercises)**

Interactive coding challenges:

- Complete real-world tasks
- Get instant feedback
- Progress tracking
- Certificate of completion

## ️ Development Tools

### API Explorer

**[API Explorer →](https://demo.neurolink.dev/api)**

Interactive API documentation:

- Live endpoint testing
- Request/response examples
- Parameter customization
- Code generation

### SDK Playground

**[SDK Tester →](https://demo.neurolink.dev/sdk)**

Test SDK features directly:

```javascript
// Interactive code editor with live execution
const neurolink = new NeuroLink();

// Try different configurations
const config = {
  provider: "google-ai",
  temperature: 0.7,
  maxTokens: 1000,
};

// Execute and see results immediately
```

##  Mobile Experience

### Progressive Web App

**[Mobile Demo →](https://demo.neurolink.dev/mobile)**

Mobile-optimized interface:

- Touch-friendly design
- Offline capabilities
- Push notifications
- Native app feel

### Responsive Testing

**[Device Simulator →](https://demo.neurolink.dev/responsive)**

Test across devices:

- Phone layouts
- Tablet interfaces
- Desktop views
- Custom viewports

##  Customization Studio

### Theme Builder

**[Theme Studio →](https://demo.neurolink.dev/themes)**

Customize the interface:

- Color schemes
- Layout options
- Component styles
- Export themes

### Widget Creator

**[Widget Builder →](https://demo.neurolink.dev/widgets)**

Create custom components:

- Drag-and-drop designer
- Property configuration
- Preview system
- Code export

##  Testing Environment

### Load Testing Simulator

**[Performance Tester →](https://demo.neurolink.dev/load)**

Simulate high-load scenarios:

- Concurrent requests
- Response time monitoring
- Error rate tracking
- Scalability testing

### Error Scenario Testing

**[Error Simulator →](https://demo.neurolink.dev/errors)**

Test error handling:

- Provider failures
- Network issues
- Rate limiting
- Recovery mechanisms

##  Gamified Learning

### NeuroLink Quest

**[Learning Game →](https://demo.neurolink.dev/quest)**

Gamified learning experience:

- Achievement system
- Progress tracking
- Leaderboards
- Skill assessment

### Challenge Mode

**[Coding Challenges →](https://demo.neurolink.dev/challenges)**

Programming challenges using NeuroLink:

- Time-limited tasks
- Scoring system
- Community submissions
- Best practices evaluation

##  Community Features

### Shared Examples

**[Community Gallery →](https://demo.neurolink.dev/gallery)**

User-contributed examples:

- Browse shared code
- Rate and comment
- Fork and modify
- Share your own

### Collaboration Tools

**[Team Workspace →](https://demo.neurolink.dev/team)**

Collaborative development:

- Shared projects
- Real-time editing
- Team analytics
- Version control

##  Demo Guidelines

### Getting Started

1. **Choose Your Path**
   - Quick demo (5 minutes)
   - Full tutorial (30 minutes)
   - Specific use case

2. **No Setup Required**
   - Browser-based execution
   - Pre-configured examples
   - Sample data provided

3. **Real Functionality**
   - Live API responses
   - Actual analytics
   - Working integrations

### Tips for Best Experience

- **Use Chrome or Firefox** for optimal compatibility
- **Enable JavaScript** for full functionality
- **Stable internet connection** for API calls
- **No personal data** required for testing

##  Quick Access Links

### Popular Demos

- **[5-Minute Quickstart →](https://demo.neurolink.dev/quick)**
- **[Business Executive Demo →](https://demo.neurolink.dev/exec)**
- **[Developer Integration →](https://demo.neurolink.dev/dev-quick)**
- **[Marketing Team Demo →](https://demo.neurolink.dev/marketing-quick)**

### Advanced Features

- **[Analytics Deep Dive →](https://demo.neurolink.dev/analytics-advanced)**
- **[MCP Integration →](https://demo.neurolink.dev/mcp-demo)**
- **[Enterprise Features →](https://demo.neurolink.dev/enterprise)**
- **[Performance Optimization →](https://demo.neurolink.dev/performance)**

---

_All interactive demos run in your browser without installation. No personal data is collected, and usage is limited to prevent abuse while providing full functionality._

##  Related Resources

- [Screenshots Gallery](/docs/demos/screenshots) - Visual examples
- [Video Demonstrations](/docs/demos/videos) - Guided walkthroughs
- [CLI Examples](/docs/cli/examples) - Command-line patterns
- [SDK Documentation](/docs/sdk/api-reference) - Integration guide

---

## Screenshots Gallery

<!-- Source: demos/screenshots.md -->

# Screenshots Gallery

Visual demonstration of NeuroLink's CLI, web interface, and integration capabilities.

## ️ CLI Interface Screenshots

### Help & Overview

[Image: CLI Help Command]
_Comprehensive CLI help showing all available commands and options_

**Key Features Shown:**

- Complete command reference
- Option descriptions and usage patterns
- Examples for each command
- Provider-specific features

### Provider Status & Connectivity

[Image: Provider Status]
_Real-time provider status showing connectivity and response times_

**Features Demonstrated:**

- Multi-provider health monitoring
- Response time measurements
- Error detection and reporting
- Provider availability statistics

### Text Generation Examples

[Image: Text Generation]
_Live text generation with multiple providers and analytics_

**Capabilities Shown:**

- Real-time AI content generation
- Provider comparison
- Analytics tracking
- Quality evaluation scores

##  Analytics & Monitoring

### Performance Dashboard

[Image: Monitoring Analytics]
_Advanced analytics dashboard showing usage patterns and performance metrics_

**Analytics Features:**

- Usage trends and patterns
- Cost analysis and optimization
- Provider performance comparison
- Quality metrics tracking

### MCP Tools Integration

[Image: MCP Tools]
_Model Context Protocol tools discovery and integration_

**MCP Capabilities:**

- Automatic server discovery
- Tool inventory management
- Integration with popular AI development environments
- Custom server configuration

##  Business Use Cases

### Business Applications

[Image: Business Use Cases]
_Enterprise applications across different business functions_

**Business Scenarios:**

- Strategic planning assistance
- Financial analysis and reporting
- Marketing content generation
- Customer service automation

### Developer Tools

[Image: Developer Tools]
_Development workflow integration and code assistance_

**Developer Features:**

- Code generation and review
- Documentation automation
- API integration examples
- Testing and debugging assistance

### Creative Applications

[Image: Creative Tools]
_Creative content generation and design assistance_

**Creative Capabilities:**

- Content creation workflows
- Design brief generation
- Marketing material development
- Brand messaging optimization

##  Configuration & Setup

### API Key Configuration

```bash
# Screenshot: Environment setup process
npx @juspay/neurolink status
```

_Shows the step-by-step process of configuring API keys and validating provider connections_

### Multi-Provider Setup

```bash
# Screenshot: Multiple provider configuration
npx @juspay/neurolink provider list
npx @juspay/neurolink provider configure openai
```

_Demonstrates configuring multiple AI providers and managing their settings_

##  Web Interface Screenshots

### Main Dashboard

[Image: Web Demo Overview]
_Web interface showing the main dashboard with navigation and features_

**Web Interface Features:**

- Intuitive navigation design
- Real-time provider status
- Usage analytics visualization
- Quick access to common tasks

### Interactive Generation

Screenshots showing the web interface for:

- Real-time text generation
- Provider selection and comparison
- Analytics visualization
- Response quality evaluation

##  Usage Scenarios

### CLI Workflow Examples

1. **Quick Start Workflow**
   - Initial setup and configuration
   - First generation command
   - Provider status verification

2. **Batch Processing**
   - Multiple prompt processing
   - Performance comparison
   - Results compilation

3. **Advanced Analytics**
   - Usage tracking setup
   - Quality evaluation configuration
   - Performance monitoring

### Integration Screenshots

1. **VS Code Integration**
   - Extension interface
   - Code generation in editor
   - MCP server discovery

2. **Terminal Workflows**
   - Command completion
   - Real-time streaming
   - Error handling examples

3. **CI/CD Integration**
   - GitHub Actions workflow
   - Automated documentation generation
   - Quality gates implementation

##  Performance Demonstrations

### Speed Comparisons

Screenshots showing:

- Response time comparisons across providers
- Throughput measurements
- Scalability demonstrations
- Load testing results

### Quality Metrics

Visual examples of:

- Evaluation scores across different domains
- Quality improvement over time
- A/B testing results
- Success rate monitoring

##  Enterprise Features

### Security & Compliance

Screenshots demonstrating:

- Secure API key management
- Audit logging capabilities
- Compliance reporting
- Access control configuration

### Scalability & Reliability

Visual proof of:

- High-availability setup
- Load balancing configuration
- Failover mechanisms
- Performance optimization

##  Technical Documentation

### Architecture Diagrams

Visual representations of:

- System architecture
- Integration patterns
- Data flow diagrams
- Deployment configurations

### API Documentation

Screenshots showing:

- Interactive API explorer
- Code examples in multiple languages
- Response format demonstrations
- Error handling patterns

##  Comparison Screenshots

### Before/After Improvements

Side-by-side comparisons showing:

- Performance optimizations
- User experience enhancements
- Feature additions
- Quality improvements

### Competitive Analysis

Visual comparisons with:

- Feature completeness
- Performance benchmarks
- Ease of use metrics
- Integration capabilities

##  Mobile & Responsive Design

### Mobile Interface

Screenshots of:

- Responsive web design
- Mobile-optimized workflows
- Touch-friendly interfaces
- Progressive web app features

### Cross-Platform Compatibility

Demonstrations across:

- Different operating systems
- Various browsers
- Mobile devices
- Tablet interfaces

##  UI/UX Design Elements

### Design System

Screenshots showcasing:

- Material Design implementation
- Dark/light mode support
- Accessibility features
- Responsive breakpoints

### User Experience

Examples of:

- Intuitive navigation flows
- Error state handling
- Loading state animations
- Success feedback patterns

##  Analytics Screenshots

### Usage Dashboard

Detailed views of:

- Real-time usage metrics
- Historical trend analysis
- Cost optimization insights
- Performance benchmarking

### Reporting Interface

Screenshots of:

- Automated report generation
- Custom dashboard creation
- Data export capabilities
- Visualization options

##  Testing & Quality Assurance

### Test Results

Visual evidence of:

- Automated testing pipelines
- Quality gate implementations
- Performance test results
- Security scan reports

### Monitoring Dashboard

Screenshots showing:

- Real-time system monitoring
- Alert management
- Performance metrics
- Health check results

_All screenshots are captured from live NeuroLink implementations and demonstrate real functionality. Images are optimized for documentation viewing and include detailed captions explaining the features shown._

##  Related Visual Content

- [Video Demonstrations](/docs/demos/videos) - Live action videos
- [Interactive Demo](/docs/demos/interactive) - Try it yourself
- [Visual Demos Guide](/docs/visual-demos) - Complete visual documentation

---

## Video Demonstrations

<!-- Source: demos/videos.md -->

# Video Demonstrations

Professional video demonstrations showcasing NeuroLink's capabilities in real-world scenarios.

##  CLI Command Demonstrations

### Core Features Overview

**[CLI Help & Overview](pathname:///docs/visual-content/cli-videos/cli-01-cli-help.mp4)**
_Duration: 2:30 | Format: MP4_

Complete walkthrough of NeuroLink CLI capabilities:

- Command structure and syntax
- Available options and flags
- Provider selection and configuration
- Help system navigation

**Key Highlights:**

- Professional CLI interface
- Comprehensive command reference
- Real-time help and examples
- Intuitive user experience

### Provider Management

**[Provider Status Check](pathname:///docs/visual-content/cli-videos/cli-02-provider-status.mp4)**
_Duration: 1:45 | Format: MP4_

Demonstrates provider connectivity and health monitoring:

- Multi-provider status checking
- Response time measurement
- Error detection and reporting
- Provider comparison metrics

**[Auto Provider Selection](pathname:///docs/visual-content/cli-videos/cli-04-auto-selection.mp4)**
_Duration: 2:15 | Format: MP4_

Shows intelligent provider selection algorithm:

- Automatic best provider detection
- Fallback mechanisms
- Performance-based routing
- Reliability optimization

### Text Generation Workflows

**[Real-time Text Generation](pathname:///docs/visual-content/cli-videos/cli-03-text-generation.mp4)**
_Duration: 3:20 | Format: MP4_

Live demonstration of AI content generation:

- Multiple provider comparison
- Quality evaluation in action
- Analytics tracking
- Response time analysis

**[Streaming Responses](pathname:///docs/visual-content/cli-videos/cli-05-streaming.mp4)**
_Duration: 2:45 | Format: MP4_

Real-time streaming capabilities:

- Live content generation
- Progressive response display
- Stream error handling
- Performance monitoring

### Advanced Features

**[Advanced CLI Features](pathname:///docs/visual-content/cli-videos/cli-06-advanced-features.mp4)**
_Duration: 4:10 | Format: MP4_

Comprehensive advanced functionality:

- Batch processing capabilities
- Analytics and evaluation features
- Custom configuration options
- Integration patterns

##  MCP Integration Videos

### MCP Server Management

**[MCP Help & Commands](pathname:///docs/visual-content/cli-videos/cli-advanced-features/mcp-help.mp4)**
_Duration: 2:00 | Format: MP4_

Complete MCP command reference:

- MCP server discovery
- Tool inventory management
- Server configuration
- Integration workflows

**[MCP Server Listing](pathname:///docs/visual-content/cli-videos/cli-advanced-features/mcp-list.mp4)**
_Duration: 1:30 | Format: MP4_

Demonstrates MCP server discovery:

- Automatic server detection
- Configuration file parsing
- Server status monitoring
- Tool availability checking

### AI Workflow Tools

**[AI Workflow Tools Demo](pathname:///docs/visual-content/cli-videos/ai-workflow-tools-demo/ai-workflow-tools-cli-demo.mp4)**
_Duration: 5:25 | Format: MP4_

Comprehensive workflow automation demonstration:

- End-to-end development workflows
- AI-powered code assistance
- Documentation generation
- Quality assurance integration

**Features Demonstrated:**

- Code generation and review
- Automated testing
- Documentation creation
- Performance optimization

##  Business Application Videos

### Executive Decision Support

**[Business Applications Demo](pathname:///docs/visual-content/videos/business/business-use-cases.mp4)** _(General Business Demo)_
_Duration: 4:15 | Format: MP4_

General business use cases demonstration covering strategic analysis, sales intelligence, and financial planning:

- Market opportunity analysis
- Competitive intelligence
- Risk assessment frameworks
- ROI projections

### Marketing & Sales

**[Content Creation Workflow](pathname:///docs/visual-content/videos/demo/creative-tools.mp4)**
_Duration: 3:45 | Format: MP4_

Marketing content generation pipeline:

- Blog post creation
- Social media content
- Email campaign development
- SEO optimization

**[Same Business Demo - Sales Focus](pathname:///docs/visual-content/videos/business/business-use-cases.mp4)**
_Duration: 3:20 | Format: MP4_

Sales-focused section of the business applications demo:

- Pipeline analysis
- Competitive positioning
- Pricing strategy development
- Customer segmentation

### Operations & Analytics

**[Process Optimization](pathname:///docs/visual-content/videos/demo/monitoring-analytics.mp4)**
_Duration: 4:00 | Format: MP4_

Business process analysis and improvement:

- Workflow efficiency analysis
- Bottleneck identification
- Automation opportunities
- Cost-benefit analysis

##  Industry-Specific Demonstrations

### Software Development

**[Developer Tools Demo](pathname:///docs/visual-content/videos/demo/developer-tools.mp4)** _(General Developer Demo)_
_Duration: 5:30 | Format: MP4_

General developer workflow demonstration covering multiple development scenarios:

- Code generation and review
- Documentation automation
- Testing assistance
- Deployment optimization

**Key Workflows:**

- Feature development
- Bug fixing assistance
- Code quality improvement
- Technical documentation

### Healthcare & Research

**[Medical Documentation Demo](pathname:///docs/visual-content/videos/demo/basic-examples.mp4)**
_Duration: 3:15 | Format: MP4_

Healthcare-specific applications:

- Clinical documentation
- Research analysis
- Patient education materials
- Compliance reporting

### Financial Services

**[Business Demo - Financial Focus](pathname:///docs/visual-content/videos/business/business-use-cases.mp4)**
_Duration: 4:30 | Format: MP4_

Financial applications from the business use cases demo:

- Risk assessment modeling
- Regulatory compliance
- Investment analysis
- Portfolio optimization

##  Technical Deep Dives

### Architecture & Scalability

**[Developer Demo - Architecture Focus](pathname:///docs/visual-content/videos/demo/developer-tools.mp4)**
_Duration: 6:00 | Format: MP4_

Architecture-focused section of the developer tools demo:

- Multi-provider infrastructure
- Scalability patterns
- Reliability mechanisms
- Performance optimization

### Integration Patterns

**[Developer Demo - Framework Integration](pathname:///docs/visual-content/videos/demo/developer-tools.mp4)**
_Duration: 4:45 | Format: MP4_

Framework integration portion of the developer tools demo:

- React/Next.js integration
- Node.js backend setup
- API integration patterns
- Error handling strategies

### Security & Compliance

**[Security Implementation](pathname:///docs/visual-content/videos/demo/monitoring-analytics.mp4)**
_Duration: 3:30 | Format: MP4_

Security and compliance features:

- API key management
- Audit logging
- Access control
- Compliance reporting

##  Performance & Benchmarking

### Speed Comparisons

**[Provider Performance Comparison](pathname:///docs/visual-content/videos/demo/monitoring-analytics.mp4)**
_Duration: 3:00 | Format: MP4_

Real-time performance benchmarking:

- Response time analysis
- Throughput measurements
- Quality comparisons
- Cost optimization

### Load Testing

**[Scalability Testing](pathname:///docs/visual-content/videos/demo/monitoring-analytics.mp4)**
_Duration: 2:45 | Format: MP4_

High-load performance demonstration:

- Concurrent request handling
- Auto-scaling behavior
- Failover mechanisms
- Performance monitoring

##  User Experience Videos

### Onboarding & Setup

**[Getting Started Guide](pathname:///docs/visual-content/videos/demo/basic-examples.mp4)**
_Duration: 4:20 | Format: MP4_

New user onboarding experience:

- Initial setup process
- API key configuration
- First successful generation
- Help and support access

### Advanced User Workflows

**[Developer Demo - Advanced Features](pathname:///docs/visual-content/videos/demo/developer-tools.mp4)**
_Duration: 5:15 | Format: MP4_

Advanced features section of the developer tools demo:

- Complex workflow automation
- Custom configuration
- Advanced analytics usage
- Integration customization

##  Comparison Videos

### Before/After Improvements

**[Feature Evolution](pathname:///docs/visual-content/videos/demo/monitoring-analytics.mp4)**
_Duration: 3:30 | Format: MP4_

Product improvement demonstration:

- Performance enhancements
- User experience improvements
- Feature additions
- Quality upgrades

### Competitive Analysis

**[Business Demo - Market Analysis](pathname:///docs/visual-content/videos/business/business-use-cases.mp4)**
_Duration: 4:00 | Format: MP4_

Market analysis section of the business use cases demo:

- Feature completeness
- Performance benchmarks
- Ease of use comparison
- Value proposition

##  Mobile & Responsive Demos

### Mobile Interface

**[Mobile Experience](pathname:///docs/visual-content/videos/demo/basic-examples.mp4)**
_Duration: 2:30 | Format: MP4_

Mobile-optimized interface:

- Responsive design
- Touch interactions
- Progressive web app features
- Cross-device synchronization

##  Educational Content

### Tutorial Series

**[Complete Tutorial Series](pathname:///docs/visual-content/videos/demo/ai-workflow-full-demo.mp4)**
_Duration: 15:30 | Format: MP4_

Comprehensive learning path:

- Basic concepts introduction
- Step-by-step implementation
- Best practices guidance
- Advanced techniques

### Webinar Recordings

**[Business Demo - Extended Version](pathname:///docs/visual-content/videos/business/business-use-cases.mp4)**
_Duration: 45:00 | Format: MP4_

Extended business use cases demonstration (note: same content as other business demos):

- Industry use cases
- Implementation strategies
- Q&A session
- Advanced tips and tricks

##  Video Specifications & Guidelines

### **Video Format Standards**

#### **Required Technical Specifications**

**Video Encoding:**

- **Container**: MP4 (preferred) or WebM
- **Codec**: H.264 (MP4) or VP9 (WebM)
- **Resolution**:
  - Desktop demos: 1920x1080 (Full HD)
  - Mobile demos: 1080x1920 (portrait) or 1920x1080 (landscape)
  - CLI demos: 1920x1080 or 2560x1440 for code readability
- **Frame Rate**: 30fps (standard) or 60fps (for smooth UI interactions)
- **Bitrate**:
  - 1080p: 5-8 Mbps (high quality)
  - 720p: 2-4 Mbps (web-optimized)
  - 480p: 1-2 Mbps (mobile/low bandwidth)

**Audio Encoding:**

- **Codec**: AAC (MP4) or Opus (WebM)
- **Sample Rate**: 48kHz (preferred) or 44.1kHz
- **Channels**: Stereo (2.0) for most content, mono for simple narration
- **Bitrate**: 128-192 kbps for narration, 192-320 kbps for music

**Duration Guidelines:**

- **Feature demos**: 2-5 minutes (optimal engagement)
- **Tutorial videos**: 5-10 minutes (comprehensive learning)
- **Overview videos**: 1-3 minutes (quick introduction)
- **Workflow demos**: 3-7 minutes (end-to-end processes)
- **Webinar recordings**: 15-60 minutes (detailed presentations)

#### **File Size Management**

**Size Limits by Category:**

- **Short demos (1-3 min)**: Target \

# Check LFS status
git lfs status
git lfs ls-files

# Track LFS bandwidth usage
git lfs env
```

### **Video Asset Organization**

#### **Directory Structure**

```
docs/
├── demos/
│   └── videos/
│       ├── cli/                    # CLI demonstrations
│       │   ├── basic/             # Basic CLI usage
│       │   ├── advanced/          # Advanced features
│       │   └── troubleshooting/   # Error handling
│       ├── web/                   # Web interface demos
│       │   ├── dashboard/         # Dashboard functionality
│       │   ├── analytics/         # Analytics features
│       │   └── mobile/            # Mobile/responsive demos
│       ├── business/              # Business use cases
│       │   ├── finance/           # Financial applications
│       │   ├── marketing/         # Marketing use cases
│       │   └── operations/        # Operational workflows
│       ├── technical/             # Technical deep dives
│       │   ├── architecture/      # System architecture
│       │   ├── integration/       # Framework integration
│       │   └── performance/       # Performance demos
│       └── tutorials/             # Educational content
│           ├── getting-started/   # Beginner tutorials
│           ├── intermediate/      # Intermediate guides
│           └── advanced/          # Advanced techniques
└── visual-content/
    └── videos/                    # Legacy video location
        └── [migrate to demos/videos/]
```

#### **File Naming Convention**

```
{category}-{feature}-{context}[-{quality}].{extension}

Examples:
cli-help-overview.mp4                    # CLI help command overview
cli-generate-workflow-hd.mp4            # CLI generation workflow (HD)
web-dashboard-analytics-mobile.mp4      # Web dashboard on mobile
business-finance-analysis-4k.mp4        # Financial analysis (4K)
tutorial-setup-getting-started.mp4      # Setup tutorial
technical-architecture-overview.mp4     # Architecture overview
```

### **Quality Assurance Standards**

#### **Content Quality Checklist**

- [ ] **Audio Quality**: Clear narration, no background noise
- [ ] **Visual Quality**: Sharp text, readable UI elements
- [ ] **Pacing**: Appropriate speed for comprehension
- [ ] **Content Accuracy**: Up-to-date features and interfaces
- [ ] **Professional Presentation**: Consistent branding and style

#### **Technical Quality Validation**

```bash
#!/bin/bash
# video-quality-check.sh
# Validates video technical specifications

check_video_specs() {
  local file="$1"

  # Get video information
  duration=$(ffprobe -v quiet -show_entries format=duration -of csv="p=0" "$file")
  resolution=$(ffprobe -v quiet -select_streams v:0 -show_entries stream=width,height -of csv="s=x:p=0" "$file")
  bitrate=$(ffprobe -v quiet -show_entries format=bit_rate -of csv="p=0" "$file")

  echo "File: $file"
  echo "Duration: ${duration}s"
  echo "Resolution: $resolution"
  echo "Bitrate: $bitrate bps"

  # Size validation
  size=$(stat -f%z "$file" 2>/dev/null || stat -c%s "$file")
  size_mb=$((size / 1024 / 1024))

  echo "File Size: ${size_mb}MB"

  # Check if file should use LFS
  if [ $size_mb -gt 50 ]; then
    if ! git lfs ls-files | grep -q "$file"; then
      echo "⚠️  Warning: Large file not tracked by Git LFS"
    else
      echo "✅ File properly tracked by Git LFS"
    fi
  fi
}

# Check all video files
find docs/ -name "*.mp4" -o -name "*.webm" | while read file; do
  check_video_specs "$file"
  echo "---"
done
```

### **Accessibility Standards**

#### **Required Accessibility Features**

- [ ] **Closed Captions**: SRT or VTT subtitle files
- [ ] **Audio Descriptions**: Narrated descriptions of visual elements
- [ ] **Keyboard Navigation**: Video player must be keyboard accessible
- [ ] **Screen Reader Compatibility**: Proper ARIA labels and descriptions
- [ ] **Transcript Files**: Text transcripts for each video

#### **Caption File Standards**

```vtt
WEBVTT

00:00:00.000 --> 00:00:03.000
Welcome to NeuroLink CLI demonstration.

00:00:03.000 --> 00:00:07.000
In this video, we'll explore the help command and available options.

00:00:07.000 --> 00:00:12.000
First, let's check the current status of our AI providers.
```

#### **Audio Description Example**

```vtt
WEBVTT

NOTE Audio descriptions for visual elements

00:00:00.000 --> 00:00:03.000
[Terminal window opens with dark theme]

00:00:03.000 --> 00:00:07.000
[User types "neurolink help" command]

00:00:07.000 --> 00:00:12.000
[Command output displays in green text with structured formatting]
```

### **Video Embedding Guidelines**

#### **Markdown Embedding**

```markdown
### Video Title

**[Video Description]({video-path}.mp4)**
_Duration: X:XX | Format: MP4 | Size: XXMb_

Brief description of video content and key features demonstrated.

**Key Features Shown:**

- Feature 1: Description
- Feature 2: Description
- Feature 3: Description

**Accessibility:**

- [Captions]({video-path}-captions.vtt)
- [Transcript](/docs/{video-path}-transcript)
- [Audio Description]({video-path}-audio-description.vtt)
```

#### **HTML5 Video Element**

```html


  Your browser does not support the video tag.

```

### **Performance Optimization**

#### **Web Delivery Optimization**

- **Progressive Download**: Use `faststart` flag for immediate playback
- **Multiple Quality Levels**: Provide 480p, 720p, and 1080p versions
- **Adaptive Streaming**: Consider HLS or DASH for long videos
- **Thumbnail Generation**: Create poster images for video previews
- **CDN Distribution**: Use content delivery networks for global access

#### **Bandwidth Considerations**

```bash
# Generate multiple quality versions
create_video_variants() {
  local input="$1"
  local base="${input%.*}"

  # HD version (original quality)
  ffmpeg -i "$input" -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 192k "${base}-hd.mp4"

  # Standard version (720p)
  ffmpeg -i "$input" -vf scale=1280:720 -c:v libx264 -crf 25 -preset medium -c:a aac -b:a 128k "${base}-std.mp4"

  # Mobile version (480p)
  ffmpeg -i "$input" -vf scale=854:480 -c:v libx264 -crf 28 -preset medium -c:a aac -b:a 96k "${base}-mobile.mp4"

  # Generate poster image
  ffmpeg -i "$input" -ss 00:00:03 -vframes 1 "${base}-poster.jpg"
}
```

### **Validation and Testing**

#### **Pre-Commit Validation**

```bash
#!/bin/bash
# pre-commit-video-check.sh

echo "Validating video assets..."

# Check for large files not in LFS
find docs/ -name "*.mp4" -o -name "*.webm" | while read file; do
  size=$(stat -f%z "$file" 2>/dev/null || stat -c%s "$file")
  size_mb=$((size / 1024 / 1024))

  if [ $size_mb -gt 50 ] && ! git lfs ls-files | grep -q "$file"; then
    echo "❌ Error: $file (${size_mb}MB) must be tracked by Git LFS"
    exit 1
  fi
done

# Check for required accessibility files
find docs/ -name "*.mp4" | while read video; do
  base="${video%.*}"

  if [ ! -f "${base}.vtt" ] && [ ! -f "${base}-captions.vtt" ]; then
    echo "⚠️  Warning: Missing captions for $video"
  fi
done

echo "✅ Video asset validation complete"
```

### **Migration from Legacy Storage**

#### **Moving Existing Videos to LFS**

```bash
#!/bin/bash
# migrate-videos-to-lfs.sh

# Setup LFS tracking
git lfs track "docs/**/*.mp4"
git lfs track "docs/**/*.webm"

# Find and migrate existing videos
find docs/ -name "*.mp4" -o -name "*.webm" | while read file; do
  echo "Migrating $file to LFS..."

  # Remove from Git history (if already committed)
  git rm --cached "$file"

  # Re-add with LFS
  git add "$file"
done

# Commit LFS migration
git commit -m "Migrate video assets to Git LFS"

echo "Migration complete. Videos now tracked by Git LFS."
```

### **Viewing Options**

**Streaming Quality:**

- 4K (2160p) - Ultra HD viewing
- 1080p - Standard HD viewing
- 720p - Mobile-optimized
- 480p - Low bandwidth option

**Download Options:**

- MP4 format for offline viewing
- WebM format for web optimization
- Mobile-optimized versions
- Audio-only versions available

##  Video Navigation

### Playlist Organization

1. **Getting Started** (4 videos, 12 minutes)
2. **CLI Mastery** (6 videos, 18 minutes)
3. **Business Applications** (8 videos, 30 minutes)
4. **Technical Deep Dives** (5 videos, 25 minutes)
5. **Advanced Features** (7 videos, 28 minutes)

### Interactive Elements

- **Chapter navigation** for long videos
- **Timestamped bookmarks** for key features
- **Related video suggestions**
- **Transcript search** capability

---

_All videos are professionally produced with clear audio, high-quality visuals, and detailed explanations. Each video includes timestamps, captions, and related documentation links._

##  Related Resources

- [Screenshots Gallery](/docs/demos/screenshots) - Static visual examples
- [Interactive Demo](/docs/demos/interactive) - Try it yourself
- [CLI Examples](/docs/cli/examples) - Command-line patterns
- [Complete Visual Guide](/docs/visual-demos) - Full documentation

---

# About

## NeuroLink Vision & Roadmap

<!-- Source: about/vision.md -->

# NeuroLink Vision & Roadmap

**The Future of AI**: Edge-first execution and continuous streaming architectures

##  Edge-First AI: Run Anywhere, Pay Nothing

### The Economics of Edge AI

```
Cloud AI:       $0.002 per 1K tokens × 1M requests = $2,000/month
Edge AI (Local): $0.000 per 1K tokens × 1M requests = $0/month
```

**When LLMs run on user devices or regional edge, compute is free. Storage is free. Inference is free.**

### Why This Matters

| Traditional Cloud AI            | Edge-First AI            |
| ------------------------------- | ------------------------ |
| $2,000/month for 1M requests    | $0/month                 |
| Network latency: 200-500ms      | Local latency: \ **When AI runs at the edge, the marginal cost of inference becomes zero.**
>
> **When streams run continuously, the marginal cost of availability becomes zero.**
>
> **When both are true, AI becomes as ubiquitous as electricity.**

### What This Enables

#### 1. Real-Time Everything

- **Live translation** in conversations
- **Instant code completion** while typing
- **Real-time fraud detection** in payments
- **Continuous health monitoring**
- **Always-on personal assistants**

#### 2. Unlimited AI Interactions

- **No per-request costs** to limit usage
- **Experiment freely** without budget concerns
- **Build AI-first products** without economic constraints
- **Scale to billions of requests** at zero marginal cost

#### 3. Perfect Privacy

- **Data processing happens on user devices**
- **No cloud uploads**, no third-party access
- **GDPR/HIPAA compliant by design**
- **Users own their data** completely
- **Government/regulatory compliance** automatic

#### 4. Offline Capability

- **AI works without internet**
- **Edge models run anywhere**
- **Resilient to network issues**
- **No cloud dependencies**
- **Works in remote locations**

#### 5. Developer Freedom

- **Build without provider lock-in**
- **Switch models freely** (all work the same way)
- **Deploy anywhere** (cloud, edge, device, browser)
- **Own your infrastructure**
- **No vendor dependencies**

---

##  How to Participate in This Future

### Use NeuroLink Today

Start with our production-ready platform:

- **[Quick Start Guide](/docs/getting-started/quick-start)** - Get running in \<5 minutes
- **[Provider Setup](/docs/getting-started/provider-setup)** - Configure all 13 providers
- **[SDK Integration](/docs/)** - Build with TypeScript
- **[Production Deployment](/docs/guides/enterprise)** - Enterprise setup

### Contribute to Edge & Streaming Features

Help us build the future:

- **Edge Deployment Kits**: CloudFlare Workers, Lambda@Edge templates
- **Browser LLM Support**: WebGPU integration
- **Streaming Architecture**: Protocol design and implementation
- **Example Applications**: Showcase edge + streaming patterns

**[Contributing Guide](/docs/community/contributing)** - How to contribute

### Share Your Use Cases

Tell us how you're using NeuroLink:

- **Edge deployments**: What works, what doesn't
- **Streaming needs**: Where continuous context matters
- **Privacy requirements**: Compliance and security needs
- **Performance goals**: Latency and cost targets

**[GitHub Discussions](https://github.com/juspay/neurolink/discussions)** - Join the conversation

---

##  Join Us in Building This Future

NeuroLink started as a production tool at Juspay to solve today's AI integration problems. But we're building for tomorrow—**where AI is everywhere, costs nothing, and just works.**

### If You Believe in This Vision:

✅ **Use NeuroLink today** for production-ready multi-provider AI
✅ **Contribute** to edge-first and streaming features
✅ **Share your use cases** to help us prioritize
✅ **Join the community** to shape the future of AI infrastructure

**The future of AI is edge-first, streaming-native, and practically free.**

**NeuroLink is building the infrastructure to power that future.**

**Welcome aboard.**

---

**Document maintained by**: NeuroLink Core Team
**Last updated**: October 2025
**Next review**: Q1 2026 (after Phase 2 completion)

---

# Community

## Changelog

<!-- Source: community/changelog.md -->

# Changelog

All notable changes to NeuroLink are documented in this changelog.

For the complete and most up-to-date changelog, please visit:
**[CHANGELOG.md](https://github.com/juspay/neurolink/blob/release/CHANGELOG.md)** in the GitHub repository.

### v8.26.0 (December 30, 2025)

**Features:**

- **(types):** Add video output types (VIDEO-GEN-001) ([1b1b5c2](https://github.com/juspay/neurolink/commit/1b1b5c23d0bdacb9d3120797b1f7984d7e0cc48c))

**What's New:**

- Video generation type support
- Enhanced multimodal capabilities
- New type definitions for video outputs

---

### v8.25.0 (December 30, 2025)

**Features:**

- **(observability):** Add support for custom metadata in Context ([b175249](https://github.com/juspay/neurolink/commit/b175249c61357b0e6d127932bd7824d0bfe6f2ed))

**What's New:**

- Custom metadata support for observability
- Enhanced context tracking capabilities
- Improved telemetry integration

---

## Recent Notable Releases

### v8.24.0 - OpenRouter Integration

- Added OpenRouter provider with 300+ model support
- Enhanced provider ecosystem
- Expanded model availability

### v8.23.0 - CSV Enhancements

- Added file extension field to CSV metadata
- Improved CSV processing capabilities

### v8.22.0 - CI/CD Improvements

- Added ffmpeg installation and verification to CI/CD pipeline
- Enhanced multimedia processing support

### v8.21.0 - Office Documents

- Added office document type definitions
- Comprehensive document handling tests
- Enhanced multimodal support

### v8.20.0 - Memory Improvements

- Implemented token-based summarization
- Enhanced conversation memory management
- Optimized context handling

### v8.19.0 - TTS Integration

- Integrated Text-to-Speech (TTS) into BaseProvider.generate()
- Enhanced audio generation capabilities
- Google TTS handler improvements

---

## Version Support Policy

| Version | Status      | Support Level                                            | End of Life  |
| ------- | ----------- | -------------------------------------------------------- | ------------ |
| **8.x** | **Active**  | Full support - Security updates, bug fixes, new features | -            |
| 7.x     | Maintenance | Security updates and critical bug fixes only             | June 1, 2026 |
| 6.x     | End of Life | No support                                               | June 1, 2025 |

**Support Levels Explained:**

- **Active**: Full support including new features, enhancements, bug fixes, and security updates
- **Maintenance**: Security patches and critical bug fixes only, no new features
- **End of Life**: No updates or support, upgrade recommended

---

## Upgrade Guides

Migrating between major versions? Check out our comprehensive upgrade guides:

### Major Version Upgrades

- **v7 to v8 Migration Guide** (Coming Soon)
  - Breaking changes overview
  - API migration patterns
  - New features and improvements
  - Step-by-step upgrade instructions

- **v6 to v7 Migration Guide** (Coming Soon)
  - Factory pattern introduction
  - Provider registration changes
  - MCP integration updates

### Migrating from Other SDKs

Already using another AI SDK? We have migration guides:

- **[From LangChain](/docs/guides/migration/from-langchain)**
  - Feature comparison
  - API mapping
  - Tool/chain equivalents

- **[From Vercel AI SDK](/docs/guides/migration/from-vercel-ai-sdk)**
  - Provider migration
  - Streaming API changes
  - UI integration patterns

---

## Release Highlights by Feature Area

### Providers (v8.20.0 - v8.26.1)

- **v8.26.1**: Gemini 3 stability improvements
- **v8.24.0**: OpenRouter provider (300+ models)
- **v8.20.0**: Enhanced provider error handling

### Multimodal (v8.19.0 - v8.26.0)

- **v8.26.0**: Video output types
- **v8.23.0**: CSV metadata enhancements
- **v8.21.0**: Office document support
- **v8.19.0**: TTS integration

### Memory & Context (v8.20.0 - v8.25.0)

- **v8.25.0**: Custom metadata in Context
- **v8.20.0**: Token-based summarization

### Developer Experience (v8.22.0 - v8.23.1)

- **v8.23.1**: Blocked tool support
- **v8.22.0**: Enhanced CI/CD pipeline

---

## Breaking Changes Summary

### v8.x Series

No major breaking changes in v8.x patch releases. All releases are backward compatible within the 8.x major version.

### Future Breaking Changes

Breaking changes are only introduced in major version updates (e.g., v9.0.0). We follow [Semantic Versioning](https://semver.org/):

- **Major (x.0.0)**: Breaking changes
- **Minor (8.x.0)**: New features, backward compatible
- **Patch (8.26.x)**: Bug fixes, backward compatible

---

## Release Schedule

NeuroLink follows a continuous release schedule:

- **Patch Releases**: As needed for bug fixes and minor improvements
- **Minor Releases**: Every 1-2 weeks for new features
- **Major Releases**: Annually or when significant architecture changes are needed

### Release Notifications

Stay updated with new releases:

1. **GitHub Releases**: Watch the [NeuroLink repository](https://github.com/juspay/neurolink) for release notifications
2. **NPM**: Follow [@juspay/neurolink](https://www.npmjs.com/package/@juspay/neurolink) on npm
3. **Changelog**: Monitor this page or the [full CHANGELOG.md](https://github.com/juspay/neurolink/blob/release/CHANGELOG.md)
4. **GitHub Discussions**: Join discussions for release announcements

---

## Contribution to Changelog

Found a bug or want to contribute? Here's how:

1. **Report Issues**: [GitHub Issues](https://github.com/juspay/neurolink/issues)
2. **Submit PRs**: [Contributing Guide](/docs/community/contributing)
3. **Discuss Features**: [GitHub Discussions](https://github.com/juspay/neurolink/discussions)

All contributions are automatically included in the changelog via our automated release process using semantic-release.

---

## Historical Releases

For a complete history of all releases including detailed commit information, see:

**[Complete CHANGELOG.md](https://github.com/juspay/neurolink/blob/release/CHANGELOG.md)**

---

## Related Documentation

- **[Installation Guide](/docs/getting-started/installation)** - Install the latest version
- **[Quick Start](/docs/getting-started/quick-start)** - Get up and running quickly
- **[Migration Guides](/docs/guides/migration)** - Upgrade from older versions
- **Breaking Changes** (Coming Soon) - Detailed breaking changes documentation

---

**Last Updated:** January 1, 2026
**Current Version:** v8.26.1

---

## Contributor Covenant Code of Conduct

<!-- Source: community/code-of-conduct.md -->

# Contributor Covenant Code of Conduct

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

- Demonstrating empathy and kindness toward other people
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes,
  and learning from the experience
- Focusing on what is best not just for us as individuals, but for the
  overall community

Examples of unacceptable behavior include:

- The use of sexualized language or imagery, and sexual attention or
  advances of any kind
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or email
  address, without their explicit permission
- Other conduct which could reasonably be considered inappropriate in a
  professional setting

## Enforcement Responsibilities

Project maintainers are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

Project maintainers have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the project team at support@juspay.in.
All complaints will be reviewed and investigated promptly and fairly.

All project maintainers are obligated to respect the privacy and security of the
reporter of any incident.

## Enforcement Guidelines

Project maintainers will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from project maintainers, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.

### 2. Warning

**Community Impact**: A violation through a single incident or series
of actions.

**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.

**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any sort of public interaction within
the community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.

Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.

---

## Contributing to NeuroLink

<!-- Source: community/contributing.md -->

#  Contributing to NeuroLink

Thank you for your interest in contributing to NeuroLink! We welcome contributions from the community and are excited to work with you.

##  Table of Contents

- [Code of Conduct](#code-of-conduct)
- [How to Contribute](#how-to-contribute)
- [Development Setup](#development-setup)
- [Project Structure](#project-structure)
- [Coding Standards](#coding-standards)
- [Testing Guidelines](#testing-guidelines)
- [Pull Request Process](#pull-request-process)
- [Documentation](#documentation)
- [Community](#community)

## Code of Conduct

Please read and follow our [Code of Conduct](/docs/community/code-of-conduct). We are committed to providing a welcoming and inclusive environment for all contributors.

## How to Contribute

### Reporting Issues

1. **Check existing issues** - Before creating a new issue, check if it already exists
2. **Use issue templates** - Use the appropriate template for bugs, features, or questions
3. **Provide details** - Include reproduction steps, environment details, and expected behavior

### Suggesting Features

1. **Open a discussion** - Start with a GitHub Discussion to gather feedback
2. **Explain the use case** - Help us understand why this feature would be valuable
3. **Consider alternatives** - What workarounds exist today?

### Contributing Code

1. **Fork the repository** - Create your own fork of the project
2. **Create a feature branch** - `git checkout -b feature/your-feature-name`
3. **Make your changes** - Follow our coding standards
4. **Write tests** - Ensure your changes are tested
5. **Submit a pull request** - Follow our PR template

## Development Setup

### Prerequisites

- Node.js 18+ and npm 9+
- Git
- At least one AI provider API key (OpenAI, Google AI, etc.)

### Local Development

```bash
# Clone your fork
git clone https://github.com/YOUR_USERNAME/neurolink.git
cd neurolink

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env
# Edit .env with your API keys

# Build the project
npm run build

# Run tests
npm test

# Run linting
npm run lint

# Run type checking
npm run type-check
```

### Running Examples

```bash
# Test CLI
npx tsx src/cli/index.ts generate "Hello world"

# Run example scripts
npm run example:basic
npm run example:streaming

# Start demo server
cd neurolink-demo && npm start
```

## Project Structure

```
neurolink/
├── src/
│   ├── lib/
│   │   ├── core/          # Core types and base classes
│   │   ├── providers/     # AI provider implementations
│   │   ├── factories/     # Factory pattern implementation
│   │   ├── mcp/          # Model Context Protocol integration
│   │   └── sdk/          # SDK extensions and tools
│   └── cli/              # Command-line interface
├── docs/                 # Documentation
├── test/                 # Test files
├── examples/            # Example usage
└── scripts/             # Build and utility scripts
```

### Key Components

- **BaseProvider** - Abstract base class all providers inherit from
- **ProviderRegistry** - Central registry for provider management
- **CompatibilityFactory** - Handles provider creation and compatibility
- **MCP Integration** - Built-in and external tool support

## Coding Standards

### TypeScript Style Guide

```typescript
// ✅ Good: Clear interfaces with documentation
type GenerateOptions = {
  /** The input text to process */
  input: { text: string };
  /** Temperature for randomness (0-1) */
  temperature?: number;
  /** Maximum tokens to generate */
  maxTokens?: number;
};

// ✅ Good: Proper error handling
async function generate(options: GenerateOptions): Promise {
  try {
    // Implementation
  } catch (error) {
    throw new NeuroLinkError("Generation failed", { cause: error });
  }
}

// ❌ Bad: Avoid any types
function process(data: any) {
  // Use specific types instead
  // Implementation
}
```

### Best Practices

1. **Use the factory pattern** - All providers should extend BaseProvider
2. **Type everything** - No implicit `any` types
3. **Handle errors gracefully** - Use try-catch and provide meaningful errors
4. **Document public APIs** - Use JSDoc comments for all public methods
5. **Keep functions small** - Single responsibility principle
6. **Write tests first** - TDD approach encouraged

### Naming Conventions

- **Files**: `kebab-case.ts` (e.g., `baseProvider.ts`)
- **Classes**: `PascalCase` (e.g., `OpenAIProvider`)
- **Interfaces**: `PascalCase` (e.g., `GenerateOptions`)
- **Functions**: `camelCase` (e.g., `createProvider`)
- **Constants**: `UPPER_SNAKE_CASE` (e.g., `DEFAULT_TIMEOUT`)

## Testing Guidelines

### Test Structure

```typescript

describe("OpenAIProvider", () => {
  describe("generate", () => {
    it("should generate text with valid options", async () => {
      const provider = new OpenAIProvider();
      const result = await provider.generate({
        input: { text: "Hello" },
        maxTokens: 10,
      });

      expect(result.content).toBeDefined();
      expect(result.content.length).toBeGreaterThan(0);
    });

    it("should handle errors gracefully", async () => {
      // Test error scenarios
    });
  });
});
```

### Testing Requirements

1. **Unit tests** - For all public methods
2. **Integration tests** - For provider interactions
3. **Mock external calls** - Don't hit real APIs in tests
4. **Test edge cases** - Empty inputs, timeouts, errors
5. **Maintain coverage** - Aim for >80% code coverage

### Running Tests

```bash
# Run all tests
npm test

# Run tests in watch mode
npm run test:watch

# Run with coverage
npm run test:coverage

# Run specific test file
npm test src/providers/openai.test.ts
```

## Pull Request Process

### Before Submitting

1. **Update documentation** - Keep docs in sync with code changes
2. **Add tests** - New features need tests
3. **Run checks** - `npm run lint && npm run type-check && npm test`
4. **Update CHANGELOG** - Add your changes under "Unreleased"

### PR Template

```markdown
## Description

Brief description of changes

## Type of Change

- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## Testing

- [ ] Tests pass locally
- [ ] Added new tests
- [ ] Updated documentation

## Related Issues

Fixes #123
```

### Review Process

1. **Automated checks** - CI/CD must pass
2. **Code review** - At least one maintainer approval
3. **Documentation review** - Docs team review if needed
4. **Testing** - Manual testing for significant changes

## Documentation

### Documentation Standards

1. **Keep it current** - Update docs with code changes
2. **Show examples** - Every feature needs examples
3. **Explain why** - Not just what, but why
4. **Test code snippets** - Ensure examples actually work
5. **Update the matrix** - Mark coverage in `docs/tracking/FEATURE-DOC-MATRIX.md` when new user-facing work lands.

### Documentation Structure

- **API Reference** - Generated from TypeScript types
- **Guides** - Step-by-step tutorials
- **Examples** - Working code samples
- **Architecture** - System design documentation

### Writing Documentation

````markdown
# Feature Name

## Overview

Brief description of what this feature does and why it's useful.

## Usage

\```typescript
// Clear, working example
const result = await provider.generate({
input: { text: "Example prompt" },
temperature: 0.7
});
\```

## API Reference

Detailed parameter descriptions and return types.

## Best Practices

Tips for effective usage.

## Common Issues

Known gotchas and solutions.
````

## Community

### Getting Help

- **GitHub Discussions** - Ask questions and share ideas
- **Issues** - Report bugs and request features
- **Discord** - Join our community chat (coming soon)

### Ways to Contribute

- **Code** - Fix bugs, add features
- **Documentation** - Improve guides and examples
- **Testing** - Add test coverage
- **Design** - UI/UX improvements
- **Community** - Help others, answer questions

### Recognition

We value all contributions! Contributors are:

- Listed in our [Contributors](https://github.com/juspay/neurolink/graphs/contributors) page
- Mentioned in release notes
- Given credit in the changelog

##  Current Focus Areas

We're particularly interested in contributions for:

1. **Provider Support** - Adding new AI providers
2. **Tool Integration** - MCP external server activation
3. **Performance** - Optimization and benchmarking
4. **Documentation** - Tutorials and guides
5. **Testing** - Increasing test coverage

##  License

By contributing to NeuroLink, you agree that your contributions will be licensed under the [MIT License](https://github.com/juspay/neurolink/blob/main/LICENSE).

---

Thank you for contributing to NeuroLink!

---

# Workflows

## AI-Driven Tool Orchestration Guide

<!-- Source: workflows/ai-orchestration.md -->

# AI-Driven Tool Orchestration Guide

> ⚠️ **PLANNED FEATURE**: This documentation describes features that are planned but not yet implemented. The `DynamicOrchestrator` class referenced in this guide does not currently exist in the codebase. The code examples are illustrative of the intended API design.

**NeuroLink Enhanced MCP Platform - AI Orchestration**

## ️ **Architecture & Components**

### **Core Orchestration System**

```typescript
export class DynamicOrchestrator {
  private baseOrchestrator: MCPOrchestrator;
  private aiCoreServer: typeof aiCoreServer;
  private chainPlanners: Map;

  async executeDynamicToolChain(
    prompt: string,
    context: NeuroLinkExecutionContext,
    options: DynamicToolChainOptions,
  ): Promise {
    const availableTools =
      await this.baseOrchestrator.registry.listTools(context);
    const planner = this.getChainPlanner(options.plannerType || "ai-model");

    let currentResult = prompt;
    const executionHistory: ToolDecision[] = [];

    for (
      let iteration = 0;
      iteration ; // Tool arguments
  reasoning: string; // AI's reasoning for selection
  confidence: number; // 0-1 confidence score
  shouldContinue: boolean; // Whether to continue chain
  priority: number; // Execution priority
  estimatedDuration?: number; // Expected execution time
};

export type DynamicToolChainOptions = {
  maxIterations?: number; // Max steps in chain (default: 5)
  plannerType?: "heuristic" | "ai-model"; // Planning strategy
  allowRecursion?: boolean; // Allow same tool multiple times
  timeoutPerStep?: number; // Timeout per tool execution
  confidenceThreshold?: number; // Minimum confidence to proceed
};
```

---

##  **Chain Planning Strategies**

### **AI Model Chain Planner**

```typescript
export class AIModelChainPlanner implements ChainPlanner {
  private aiProvider: AIProvider;

  async planNextTool(
    currentContext: string,
    availableTools: ToolInfo[],
    executionHistory: ToolDecision[],
  ): Promise {
    const systemPrompt = this.buildSystemPrompt(
      availableTools,
      executionHistory,
    );
    const userPrompt = `
      Current context: ${currentContext}

      Based on the current context and available tools, select the next tool to execute.
      Consider:
      1. What information is still needed?
      2. Which tool would be most helpful?
      3. Are we making progress toward the goal?

      Respond with a JSON object containing your decision.
    `;

    const response = await this.aiProvider.generate({
      input: { text: userPrompt },
      systemPrompt,
      maxTokens: 500,
      temperature: 0.3,
    });

    return this.parseAIResponse(response);
  }

  private buildSystemPrompt(
    tools: ToolInfo[],
    history: ToolDecision[],
  ): string {
    return `
      You are an AI tool orchestrator. Your job is to select the best tool for each step.

      Available tools:
      ${tools.map((tool) => `- ${tool.name}: ${tool.description}`).join("\n")}

      Previous decisions:
      ${history.map((d) => `- Used ${d.toolName}: ${d.reasoning}`).join("\n")}

      Select tools that:
      1. Make progress toward the goal
      2. Don't repeat unnecessary work
      3. Build upon previous results

      Return JSON: {
        "toolName": "selected-tool",
        "args": {...},
        "reasoning": "why this tool",
        "confidence": 0.8,
        "shouldContinue": true
      }
    `;
  }
}
```

### **Heuristic Chain Planner**

```typescript
export class HeuristicChainPlanner implements ChainPlanner {
  private rules: PlanningRule[];

  async planNextTool(
    currentContext: string,
    availableTools: ToolInfo[],
    executionHistory: ToolDecision[],
  ): Promise {
    // Apply heuristic rules
    for (const rule of this.rules) {
      const decision = rule.evaluate(
        currentContext,
        availableTools,
        executionHistory,
      );
      if (decision && decision.confidence > 0.7) {
        return decision;
      }
    }

    // Fallback to simple tool selection
    return this.selectFallbackTool(currentContext, availableTools);
  }
}

// Example heuristic rule
const DATA_FETCHING_RULE: PlanningRule = {
  name: "data-fetching",
  evaluate: (context, tools, history) => {
    if (
      context.includes("need data") ||
      context.includes("fetch information")
    ) {
      const dataTool = tools.find(
        (t) => t.name.includes("fetch") || t.name.includes("get"),
      );
      if (dataTool) {
        return {
          toolName: dataTool.name,
          args: { query: extractQuery(context) },
          reasoning: "Detected need for data fetching",
          confidence: 0.8,
          shouldContinue: true,
          priority: 1,
        };
      }
    }
    return null;
  },
};
```

---

##  **Usage Examples**

### **Basic AI Orchestration**

```typescript

// Initialize orchestrator
const orchestrator = new DynamicOrchestrator({
  registry: mcpRegistry,
  aiProvider: "google-ai",
});

// Execute AI-driven tool chain
const result = await orchestrator.executeDynamicToolChain(
  "I need to analyze user feedback data and create a summary report",
  context,
  {
    maxIterations: 8,
    plannerType: "ai-model",
    confidenceThreshold: 0.6,
  },
);

console.log("Final result:", result.finalResult);
console.log(
  "Tools used:",
  result.executionHistory.map((h) => h.toolName),
);
console.log(
  "AI reasoning:",
  result.executionHistory.map((h) => h.reasoning),
);
```

### **Multi-Step Workflow Example**

```typescript
// Real-world example: User profile analysis
const profileAnalysis = async () => {
  const prompt = `
    Analyze user profile for user ID 12345:
    1. Fetch user data from database
    2. Get recent activity logs
    3. Calculate engagement metrics
    4. Generate recommendations
  `;

  const result = await orchestrator.executeDynamicToolChain(prompt, context, {
    maxIterations: 10,
    allowRecursion: false,
    timeoutPerStep: 30000,
  });

  // AI might select tools like:
  // 1. database-query (fetch user data)
  // 2. activity-analyzer (analyze logs)
  // 3. metrics-calculator (compute engagement)
  // 4. recommendation-engine (generate suggestions)

  return result;
};
```

### **Context-Aware Tool Selection**

```typescript
// The AI adapts based on available tools and context
const adaptiveWorkflow = async (userRequest: string) => {
  const context = {
    userId: "user123",
    sessionId: "session456",
    permissions: ["read-data", "analyze-metrics"],
    preferences: { format: "json", includeCharts: true },
  };

  const result = await orchestrator.executeDynamicToolChain(
    userRequest,
    context,
    {
      maxIterations: 5,
      plannerType: "ai-model",
      confidenceThreshold: 0.7,
    },
  );

  // AI considers:
  // - User permissions (only selects allowed tools)
  // - Session context (maintains state)
  // - User preferences (formats output appropriately)

  return result;
};
```

---

##  **Monitoring & Analytics**

### **Execution Analytics**

```typescript
type DynamicToolChainResult = {
  success: boolean;
  finalResult: any;
  executionHistory: ToolDecision[];
  totalIterations: number;
  totalExecutionTime: number;
  analytics: {
    toolsUsed: string[];
    averageConfidence: number;
    planningTime: number;
    executionTime: number;
    successRate: number;
  };
};

// Analyze orchestration performance
const analyzeExecution = (result: DynamicToolChainResult) => {
  console.log("Performance Metrics:");
  console.log(`- Total iterations: ${result.totalIterations}`);
  console.log(`- Average confidence: ${result.analytics.averageConfidence}`);
  console.log(`- Tools used: ${result.analytics.toolsUsed.join(", ")}`);
  console.log(`- Success rate: ${result.analytics.successRate * 100}%`);
};
```

### **Decision Quality Tracking**

```typescript
// Track decision quality over time
class OrchestrationAnalytics {
  private decisionHistory: ToolDecision[] = [];

  trackDecision(decision: ToolDecision, outcome: "success" | "failure") {
    this.decisionHistory.push({ ...decision, outcome });
  }

  getQualityMetrics() {
    const totalDecisions = this.decisionHistory.length;
    const successfulDecisions = this.decisionHistory.filter(
      (d) => d.outcome === "success",
    ).length;
    const averageConfidence =
      this.decisionHistory.reduce((sum, d) => sum + d.confidence, 0) /
      totalDecisions;

    return {
      successRate: successfulDecisions / totalDecisions,
      averageConfidence,
      totalDecisions,
      confidenceAccuracy: this.calculateConfidenceAccuracy(),
    };
  }

  private calculateConfidenceAccuracy(): number {
    // Compare confidence scores with actual success rates
    const confidenceBuckets = new Map();

    this.decisionHistory.forEach((decision) => {
      const bucket = Math.floor(decision.confidence * 10) / 10;
      if (!confidenceBuckets.has(bucket)) {
        confidenceBuckets.set(bucket, { total: 0, successful: 0 });
      }

      const bucketData = confidenceBuckets.get(bucket);
      bucketData.total++;
      if (decision.outcome === "success") bucketData.successful++;
    });

    // Calculate how well confidence scores predict success
    let totalAccuracy = 0;
    confidenceBuckets.forEach((data, confidence) => {
      const actualSuccessRate = data.successful / data.total;
      const accuracy = 1 - Math.abs(confidence - actualSuccessRate);
      totalAccuracy += accuracy;
    });

    return totalAccuracy / confidenceBuckets.size;
  }
}
```

---

##  **Testing & Validation**

### **AI Decision Testing**

```typescript
// Test AI tool selection quality
const testAIDecisions = async () => {
  const testCases = [
    {
      prompt: "Get user data for analysis",
      expectedTools: ["database-query", "user-fetcher"],
      minConfidence: 0.7,
    },
    {
      prompt: "Generate a report with charts",
      expectedTools: ["data-analyzer", "chart-generator"],
      minConfidence: 0.6,
    },
  ];

  for (const testCase of testCases) {
    const result = await orchestrator.executeDynamicToolChain(
      testCase.prompt,
      testContext,
      { maxIterations: 3 },
    );

    const toolsUsed = result.executionHistory.map((h) => h.toolName);
    const avgConfidence =
      result.executionHistory.reduce((sum, h) => sum + h.confidence, 0) /
      result.executionHistory.length;

    console.log(`Test: ${testCase.prompt}`);
    console.log(`Tools used: ${toolsUsed.join(", ")}`);
    console.log(`Average confidence: ${avgConfidence}`);
    console.log(
      `Expected tools found: ${testCase.expectedTools.some((t) => toolsUsed.includes(t))}`,
    );
    console.log(
      `Confidence threshold met: ${avgConfidence >= testCase.minConfidence}`,
    );
  }
};
```

### **Chain Execution Testing**

```typescript
// Test multi-step workflow execution
const testChainExecution = async () => {
  const complexWorkflow = `
    I need to:
    1. Fetch user preferences from database
    2. Get current market data
    3. Calculate personalized recommendations
    4. Format results as JSON report
    5. Send notification to user
  `;

  const startTime = Date.now();
  const result = await orchestrator.executeDynamicToolChain(
    complexWorkflow,
    testContext,
    {
      maxIterations: 10,
      timeoutPerStep: 15000,
      confidenceThreshold: 0.5,
    },
  );
  const executionTime = Date.now() - startTime;

  console.log("Chain Execution Test Results:");
  console.log(`- Success: ${result.success}`);
  console.log(`- Steps executed: ${result.totalIterations}`);
  console.log(`- Execution time: ${executionTime}ms`);
  console.log(
    `- Tools used: ${result.executionHistory.map((h) => h.toolName).join(" → ")}`,
  );
};
```

---

##  **Configuration & Customization**

### **AI Provider Configuration**

```typescript
type AIOrchestrationConfig = {
  aiProvider: string; // AI provider for planning
  model?: string; // Specific model to use
  planningPrompts: {
    systemPrompt?: string; // Custom system prompt
    decisionPrompt?: string; // Custom decision prompt
    continuationPrompt?: string; // Custom continuation logic
  };
  thresholds: {
    confidenceThreshold: number; // Min confidence to proceed
    maxIterations: number; // Max chain length
    timeoutPerStep: number; // Step timeout
  };
  fallback: {
    useHeuristics: boolean; // Fallback to heuristics
    defaultPlanner: string; // Fallback planner type
  };
};

const orchestrator = new DynamicOrchestrator({
  registry: mcpRegistry,
  config: {
    aiProvider: "google-ai",
    model: "gemini-2.5-pro",
    planningPrompts: {
      systemPrompt: "You are an expert tool orchestrator...",
    },
    thresholds: {
      confidenceThreshold: 0.7,
      maxIterations: 8,
      timeoutPerStep: 30000,
    },
    fallback: {
      useHeuristics: true,
      defaultPlanner: "heuristic",
    },
  },
});
```

### **Custom Planning Rules**

```typescript
// Create custom heuristic rules
const customRules: PlanningRule[] = [
  {
    name: "priority-data-access",
    evaluate: (context, tools, history) => {
      if (
        context.includes("urgent") &&
        !history.some((h) => h.toolName.includes("database"))
      ) {
        const dbTool = tools.find((t) => t.name.includes("database"));
        if (dbTool) {
          return {
            toolName: dbTool.name,
            args: { priority: "high" },
            reasoning: "Urgent request requires immediate data access",
            confidence: 0.9,
            shouldContinue: true,
            priority: 1,
          };
        }
      }
      return null;
    },
  },
];

// Add custom rules to heuristic planner
const heuristicPlanner = new HeuristicChainPlanner({
  rules: [...defaultRules, ...customRules],
  fallbackStrategy: "random-selection",
});
```

---

##  **Best Practices**

### **Prompt Engineering for Tool Selection**

```typescript
// Effective prompts for AI orchestration
const bestPracticePrompts = {
  // ✅ Good: Specific and actionable
  good: "Analyze user engagement metrics for Q4 2024 and identify top 3 improvement opportunities",

  // ❌ Poor: Vague and ambiguous
  poor: "Do something with user data",

  // ✅ Good: Clear sequence and context
  goodSequence: `
    For user ID 12345:
    1. Fetch recent purchase history (last 30 days)
    2. Analyze spending patterns
    3. Generate personalized product recommendations
    4. Format as JSON with confidence scores
  `,

  // ✅ Good: Includes constraints and preferences
  goodWithConstraints:
    "Generate weekly sales report including charts, but only use data from authorized regions and format for mobile viewing",
};
```

### **Error Handling & Fallbacks**

```typescript
// Robust error handling in orchestration
const robustOrchestration = async (prompt: string) => {
  try {
    const result = await orchestrator.executeDynamicToolChain(prompt, context, {
      maxIterations: 5,
      confidenceThreshold: 0.6,
      timeoutPerStep: 20000,
    });

    if (!result.success) {
      console.warn("Orchestration failed, trying simpler approach");

      // Fallback to single tool execution
      return await orchestrator.executeSingleTool(
        "general-processor",
        { input: prompt },
        context,
      );
    }

    return result;
  } catch (error) {
    console.error("Orchestration error:", error);

    // Ultimate fallback
    return {
      success: false,
      error: error.message,
      fallbackExecuted: true,
    };
  }
};
```

### **Performance Optimization**

```typescript
// Optimize orchestration performance
const optimizedOrchestration = {
  // Cache tool metadata for faster planning
  cacheToolMetadata: true,

  // Parallel execution where possible
  async executeParallelSteps(decisions: ToolDecision[]) {
    const parallelGroups = this.groupParallelizableTools(decisions);

    for (const group of parallelGroups) {
      if (group.length === 1) {
        await this.executeTool(group[0]);
      } else {
        await Promise.all(group.map((decision) => this.executeTool(decision)));
      }
    }
  },

  // Intelligent timeout management
  calculateDynamicTimeout(toolName: string, complexity: number): number {
    const baseTimeout = 10000;
    const complexityMultiplier = Math.max(1, complexity / 5);
    const toolSpecificMultiplier = this.getToolTimeoutMultiplier(toolName);

    return baseTimeout * complexityMultiplier * toolSpecificMultiplier;
  },
};
```

---

##  **Integration Examples**

### **Provider Integration**

```typescript
// Integrate with AI providers
export class EnhancedAIProvider {
  private orchestrator: DynamicOrchestrator;

  async generateWithTools(prompt: string, context: any) {
    // Use AI orchestration for tool-enhanced generation
    const toolResult = await this.orchestrator.executeDynamicToolChain(
      `Use available tools to enhance this request: ${prompt}`,
      context,
      { maxIterations: 3, confidenceThreshold: 0.7 },
    );

    // Combine tool results with AI generation
    const enhancedPrompt = `
      Original request: ${prompt}

      Tool-gathered information:
      ${toolResult.finalResult}

      Provide a comprehensive response using this information.
    `;

    return await this.baseProvider.generate({
      input: { text: enhancedPrompt },
    });
  }
}
```

### **Workflow Automation**

```typescript
// Automate complex business workflows
class BusinessWorkflowOrchestrator {
  async processCustomerRequest(request: CustomerRequest) {
    const workflowPrompt = `
      Customer request: ${request.description}
      Customer tier: ${request.customerTier}
      Priority: ${request.priority}

      Process this request following our standard workflow:
      1. Validate customer information
      2. Check service availability
      3. Generate quote or solution
      4. Create follow-up tasks
      5. Send confirmation to customer
    `;

    return await this.orchestrator.executeDynamicToolChain(
      workflowPrompt,
      {
        customerId: request.customerId,
        userPermissions: ["customer-service", "pricing"],
        workflowId: generateWorkflowId(),
      },
      {
        maxIterations: 10,
        plannerType: "ai-model",
        confidenceThreshold: 0.8,
      },
    );
  }
}
```

---

**STATUS**: Production-ready AI orchestration system enabling sophisticated dynamic tool selection and workflow automation. Provides enterprise-grade AI-driven decision making with comprehensive monitoring and customization capabilities.

---

## Custom Middleware Development Guide

<!-- Source: workflows/custom-middleware.md -->

# Custom Middleware Development Guide

This document provides a comprehensive guide to developing and implementing custom middleware in the NeuroLink platform. Middleware offers a powerful way to enhance, modify, or extend the behavior of language models without changing their core implementation.

## Table of Contents

- [Overview](#overview)
- [Quick Start](#quick-start)
- [Middleware Interface](#middleware-interface)
- [Complete Examples](#complete-examples)
  - [Example 1: Request Logging Middleware](#example-1-request-logging-middleware)
  - [Example 2: Rate Limiting Middleware](#example-2-rate-limiting-middleware)
  - [Example 3: Cost Tracking Middleware](#example-3-cost-tracking-middleware)
  - [Example 4: Response Caching Middleware](#example-4-response-caching-middleware)
- [Registration Methods](#registration-methods)
- [Best Practices](#best-practices)
- [Testing Middleware](#testing-middleware)
- [Troubleshooting](#troubleshooting)

## Overview

Middleware in NeuroLink allows you to intercept and modify the flow of data between your application and the language models. With the `MiddlewareFactory`, creating and registering custom middleware is simple and intuitive.

**What You Can Do with Middleware:**

- Intercept requests before they reach the AI provider
- Modify or validate request parameters
- Transform AI responses
- Implement cross-cutting concerns (logging, rate limiting, caching, etc.)
- Add analytics and monitoring
- Enforce security policies

## Quick Start

**5-Minute Quickstart:**

```typescript

// 1. Create your middleware
const myMiddleware: NeuroLinkMiddleware = {
  metadata: {
    id: "my-middleware",
    name: "My Custom Middleware",
    priority: 100,
  },
  wrapGenerate: async ({ doGenerate, params }) => {
    console.log("Before request");
    const result = await doGenerate();
    console.log("After response");
    return result;
  },
};

// 2. Register with factory
const factory = new MiddlewareFactory({
  middleware: [myMiddleware],
});

// 3. Enable and use
const context = factory.createContext("openai", "gpt-4");
const wrappedModel = factory.applyMiddleware(baseModel, context, {
  enabledMiddleware: ["my-middleware"],
});

// 4. Use the wrapped model
const result = await wrappedModel.generate({ prompt: "Hello!" });
```

## Middleware Interface

Every custom middleware implements the `NeuroLinkMiddleware` interface:

```typescript
type NeuroLinkMiddleware = {
  // Required: Metadata about your middleware
  metadata: {
    id: string; // Unique identifier
    name: string; // Human-readable name
    description?: string; // What this middleware does
    priority?: number; // Execution order (higher = earlier)
    defaultEnabled?: boolean; // Enable by default?
  };

  // Optional: Transform request parameters before provider call
  transformParams?: (options: {
    params: LanguageModelV1CallOptions;
  }) => PromiseLike;

  // Optional: Wrap generate() calls (non-streaming)
  wrapGenerate?: (options: {
    doGenerate: () => PromiseLike;
    params: LanguageModelV1CallOptions;
  }) => PromiseLike;

  // Optional: Wrap stream() calls (streaming)
  wrapStream?: (options: {
    doStream: () => PromiseLike;
    params: LanguageModelV1CallOptions;
  }) => PromiseLike;
};
```

**Method Execution Order:**

1. `transformParams` - Runs before provider call
2. Provider execution
3. `wrapGenerate` or `wrapStream` - Runs after provider call

## Complete Examples

### Example 1: Request Logging Middleware

**Purpose**: Log all AI requests and responses with timing information.

**Full Implementation:**

```typescript

export const createLoggingMiddleware = (): NeuroLinkMiddleware => ({
  metadata: {
    id: "request-logger",
    name: "Request Logging Middleware",
    description: "Logs all AI requests and responses with timing",
    priority: 150, // High priority to log everything
    defaultEnabled: true,
  },

  wrapGenerate: async ({ doGenerate, params }) => {
    const startTime = Date.now();
    const requestId = `req-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;

    console.log(`[${new Date().toISOString()}] ${requestId} - Request started`);
    console.log(`  Prompt: ${params.prompt?.slice(0, 100)}...`);

    try {
      const result = await doGenerate();
      const duration = Date.now() - startTime;

      console.log(
        `[${new Date().toISOString()}] ${requestId} - Response received`,
      );
      console.log(`  Duration: ${duration}ms`);
      console.log(
        `  Tokens: ${result.usage.promptTokens} in, ${result.usage.completionTokens} out`,
      );
      console.log(`  Text: ${result.text?.slice(0, 100)}...`);

      return result;
    } catch (error) {
      const duration = Date.now() - startTime;
      console.error(
        `[${new Date().toISOString()}] ${requestId} - Request failed`,
      );
      console.error(`  Duration: ${duration}ms`);
      console.error(
        `  Error: ${error instanceof Error ? error.message : String(error)}`,
      );
      throw error;
    }
  },

  wrapStream: async ({ doStream, params }) => {
    const startTime = Date.now();
    const requestId = `stream-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;

    console.log(`[${new Date().toISOString()}] ${requestId} - Stream started`);
    console.log(`  Prompt: ${params.prompt?.slice(0, 100)}...`);

    try {
      const result = await doStream();

      // Log when stream completes
      const originalStream = result.stream;
      const loggingStream = new ReadableStream({
        async start(controller) {
          const reader = originalStream.getReader();
          let chunkCount = 0;

          try {
            while (true) {
              const { done, value } = await reader.read();
              if (done) {
                const duration = Date.now() - startTime;
                console.log(
                  `[${new Date().toISOString()}] ${requestId} - Stream completed`,
                );
                console.log(`  Duration: ${duration}ms`);
                console.log(`  Chunks: ${chunkCount}`);
                controller.close();
                break;
              }

              chunkCount++;
              controller.enqueue(value);
            }
          } catch (error) {
            console.error(
              `[${new Date().toISOString()}] ${requestId} - Stream error`,
            );
            console.error(
              `  Error: ${error instanceof Error ? error.message : String(error)}`,
            );
            controller.error(error);
          } finally {
            reader.releaseLock();
          }
        },
      });

      return {
        ...result,
        stream: loggingStream,
      };
    } catch (error) {
      const duration = Date.now() - startTime;
      console.error(
        `[${new Date().toISOString()}] ${requestId} - Stream failed to start`,
      );
      console.error(`  Duration: ${duration}ms`);
      console.error(
        `  Error: ${error instanceof Error ? error.message : String(error)}`,
      );
      throw error;
    }
  },
});
```

**Usage:**

```typescript

const factory = new MiddlewareFactory({
  middleware: [createLoggingMiddleware()],
});

const context = factory.createContext("openai", "gpt-4");
const wrappedModel = factory.applyMiddleware(baseModel, context, {
  enabledMiddleware: ["request-logger"],
});

// Logs will appear for all requests
const result = await wrappedModel.generate({
  prompt: "Explain quantum computing",
});
```

**Example Output:**

```
[2026-01-01T00:00:00.000Z] req-1735689600000-abc123 - Request started
  Prompt: Explain quantum computing...
[2026-01-01T00:00:01.234Z] req-1735689600000-abc123 - Response received
  Duration: 1234ms
  Tokens: 12 in, 256 out
  Text: Quantum computing is a revolutionary technology that...
```

### Example 3: Cost Tracking Middleware

**Purpose**: Track API costs based on token usage and model pricing.

**Full Implementation:**

```typescript

type ModelPricing = {
  inputTokenPrice: number; // Price per 1K input tokens
  outputTokenPrice: number; // Price per 1K output tokens
};

type CostTrackingConfig = {
  pricing: Record; // Pricing per model
  onCostUpdate?: (cost: CostUpdate) => void; // Callback for cost updates
};

type CostUpdate = {
  userId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  inputCost: number;
  outputCost: number;
  totalCost: number;
  timestamp: string;
};

export const createCostTrackingMiddleware = (
  config: CostTrackingConfig,
): NeuroLinkMiddleware => {
  // Store costs per user
  const userCosts = new Map();

  const calculateCost = (
    model: string,
    inputTokens: number,
    outputTokens: number,
  ): { inputCost: number; outputCost: number; totalCost: number } => {
    const pricing = config.pricing[model] || {
      inputTokenPrice: 0,
      outputTokenPrice: 0,
    };

    const inputCost = (inputTokens / 1000) * pricing.inputTokenPrice;
    const outputCost = (outputTokens / 1000) * pricing.outputTokenPrice;
    const totalCost = inputCost + outputCost;

    return { inputCost, outputCost, totalCost };
  };

  return {
    metadata: {
      id: "cost-tracker",
      name: "Cost Tracking Middleware",
      description: "Tracks API costs based on token usage",
      priority: 50, // Medium priority
      defaultEnabled: false,
    },

    wrapGenerate: async ({ doGenerate, params }) => {
      const result = await doGenerate();

      // Extract user ID from params or use default
      const userId = (params as any).metadata?.userId || "anonymous";
      const model = (params as any).model || "unknown";

      // Calculate cost
      const inputTokens = result.usage.promptTokens;
      const outputTokens = result.usage.completionTokens;
      const { inputCost, outputCost, totalCost } = calculateCost(
        model,
        inputTokens,
        outputTokens,
      );

      // Update user's total cost
      const currentCost = userCosts.get(userId) || 0;
      userCosts.set(userId, currentCost + totalCost);

      // Create cost update
      const costUpdate: CostUpdate = {
        userId,
        model,
        inputTokens,
        outputTokens,
        inputCost,
        outputCost,
        totalCost,
        timestamp: new Date().toISOString(),
      };

      // Call callback if provided
      if (config.onCostUpdate) {
        config.onCostUpdate(costUpdate);
      }

      // Add cost data to result metadata
      const updatedResult = {
        ...result,
        experimental_providerMetadata: {
          ...result.experimental_providerMetadata,
          neurolink: {
            ...(result.experimental_providerMetadata as any)?.neurolink,
            cost: {
              ...costUpdate,
              userTotalCost: userCosts.get(userId),
            },
          },
        },
      };

      return updatedResult;
    },
  };
};

// Helper: Get user's total cost
export const getUserCost = (
  userId: string,
  costs: Map,
): number => {
  return costs.get(userId) || 0;
};
```

**Usage:**

```typescript

// Define pricing for different models
const pricing = {
  "gpt-4": {
    inputTokenPrice: 0.03, // $0.03 per 1K input tokens
    outputTokenPrice: 0.06, // $0.06 per 1K output tokens
  },
  "gpt-3.5-turbo": {
    inputTokenPrice: 0.0015,
    outputTokenPrice: 0.002,
  },
  "claude-3-5-sonnet": {
    inputTokenPrice: 0.003,
    outputTokenPrice: 0.015,
  },
};

const costTracker = createCostTrackingMiddleware({
  pricing,
  onCostUpdate: (costUpdate) => {
    console.log(`[Cost] User ${costUpdate.userId}:`);
    console.log(`  Model: ${costUpdate.model}`);
    console.log(
      `  Tokens: ${costUpdate.inputTokens} in, ${costUpdate.outputTokens} out`,
    );
    console.log(`  Cost: $${costUpdate.totalCost.toFixed(4)}`);
    console.log(`  Total: $${costUpdate.userTotalCost?.toFixed(4)}`);
  },
});

const factory = new MiddlewareFactory({
  middleware: [costTracker],
});

const context = factory.createContext("openai", "gpt-4", {
  metadata: { userId: "user-123" },
});

const wrappedModel = factory.applyMiddleware(baseModel, context, {
  enabledMiddleware: ["cost-tracker"],
});

const result = await wrappedModel.generate({
  prompt: "Explain machine learning",
  model: "gpt-4",
  metadata: { userId: "user-123" },
});

// Access cost data
const cost = result.experimental_providerMetadata?.neurolink?.cost;
console.log(`This request cost: $${cost.totalCost.toFixed(4)}`);
console.log(`User total cost: $${cost.userTotalCost.toFixed(4)}`);
```

**Advanced: Budget Enforcement:**

```typescript
const createBudgetEnforcingCostTracker = (maxCostPerUser: number) => {
  const userCosts = new Map();

  return createCostTrackingMiddleware({
    pricing,
    onCostUpdate: (costUpdate) => {
      const currentCost = userCosts.get(costUpdate.userId) || 0;
      const newCost = currentCost + costUpdate.totalCost;

      if (newCost > maxCostPerUser) {
        throw new Error(
          `Budget exceeded for user ${costUpdate.userId}. ` +
            `Max: $${maxCostPerUser}, Current: $${newCost.toFixed(4)}`,
        );
      }

      userCosts.set(costUpdate.userId, newCost);
    },
  });
};
```

---

### Example 4: Response Caching Middleware

**Purpose**: Cache AI responses to reduce costs and improve performance for repeated queries.

**Full Implementation:**

```typescript

type CacheConfig = {
  ttl: number; // Time-to-live in milliseconds
  maxSize: number; // Maximum number of cached entries
};

type CacheEntry = {
  result: any;
  timestamp: number;
  hits: number;
};

export const createCachingMiddleware = (
  config: CacheConfig = {
    ttl: 3600000, // 1 hour
    maxSize: 1000,
  },
): NeuroLinkMiddleware => {
  const cache = new Map();

  const generateCacheKey = (params: any): string => {
    // Create a hash of the prompt and relevant parameters
    const keyData = {
      prompt: params.prompt,
      model: params.model,
      temperature: params.temperature,
      maxTokens: params.maxTokens,
    };

    const hash = createHash("sha256");
    hash.update(JSON.stringify(keyData));
    return hash.digest("hex");
  };

  const getCachedResult = (key: string): any | null => {
    const entry = cache.get(key);

    if (!entry) {
      return null;
    }

    const now = Date.now();
    const age = now - entry.timestamp;

    // Check if cache entry is still valid
    if (age > config.ttl) {
      cache.delete(key);
      return null;
    }

    // Update hit count
    entry.hits++;

    return entry.result;
  };

  const setCachedResult = (key: string, result: any): void => {
    // Enforce max cache size (LRU-style)
    if (cache.size >= config.maxSize) {
      // Remove oldest entry
      const oldestKey = cache.keys().next().value;
      cache.delete(oldestKey);
    }

    cache.set(key, {
      result,
      timestamp: Date.now(),
      hits: 0,
    });
  };

  return {
    metadata: {
      id: "response-cache",
      name: "Response Caching Middleware",
      description: `Caches responses for ${config.ttl / 1000}s`,
      priority: 75, // Medium-high priority
      defaultEnabled: false,
    },

    wrapGenerate: async ({ doGenerate, params }) => {
      const cacheKey = generateCacheKey(params);

      // Check cache first
      const cachedResult = getCachedResult(cacheKey);
      if (cachedResult) {
        console.log(`[Cache] HIT - Returning cached result`);

        // Add cache metadata to result
        return {
          ...cachedResult,
          experimental_providerMetadata: {
            ...cachedResult.experimental_providerMetadata,
            neurolink: {
              ...(cachedResult.experimental_providerMetadata as any)?.neurolink,
              cache: {
                hit: true,
                key: cacheKey,
              },
            },
          },
        };
      }

      console.log(`[Cache] MISS - Fetching from provider`);

      // Cache miss - fetch from provider
      const result = await doGenerate();

      // Cache the result
      setCachedResult(cacheKey, result);

      // Add cache metadata
      return {
        ...result,
        experimental_providerMetadata: {
          ...result.experimental_providerMetadata,
          neurolink: {
            ...(result.experimental_providerMetadata as any)?.neurolink,
            cache: {
              hit: false,
              key: cacheKey,
            },
          },
        },
      };
    },
  };
};

// Helper: Clear cache
export const clearCache = (cache: Map): void => {
  cache.clear();
};

// Helper: Get cache stats
export const getCacheStats = (cache: Map) => {
  let totalHits = 0;
  let totalEntries = cache.size;

  for (const entry of cache.values()) {
    totalHits += entry.hits;
  }

  return {
    size: totalEntries,
    totalHits,
    averageHitsPerEntry: totalEntries > 0 ? totalHits / totalEntries : 0,
  };
};
```

**Usage:**

```typescript

const cachingMiddleware = createCachingMiddleware({
  ttl: 1800000, // 30 minutes
  maxSize: 500, // Cache up to 500 responses
});

const factory = new MiddlewareFactory({
  middleware: [cachingMiddleware],
});

const context = factory.createContext("openai", "gpt-4");
const wrappedModel = factory.applyMiddleware(baseModel, context, {
  enabledMiddleware: ["response-cache"],
});

// First request - cache miss
const result1 = await wrappedModel.generate({
  prompt: "What is TypeScript?",
});
console.log(result1.experimental_providerMetadata?.neurolink?.cache);
// Output: { hit: false, key: "abc123..." }

// Second request with same prompt - cache hit
const result2 = await wrappedModel.generate({
  prompt: "What is TypeScript?",
});
console.log(result2.experimental_providerMetadata?.neurolink?.cache);
// Output: { hit: true, key: "abc123..." }
```

**Advanced: Redis-Backed Cache:**

```typescript

const createRedisCachingMiddleware = (redisClient: Redis) => {
  return {
    metadata: {
      id: "redis-cache",
      name: "Redis Caching Middleware",
    },

    wrapGenerate: async ({ doGenerate, params }) => {
      const cacheKey = generateCacheKey(params);

      // Check Redis cache
      const cached = await redisClient.get(cacheKey);
      if (cached) {
        return JSON.parse(cached);
      }

      // Fetch from provider
      const result = await doGenerate();

      // Store in Redis with TTL
      await redisClient.setex(cacheKey, 3600, JSON.stringify(result));

      return result;
    },
  };
};
```

## Registration Methods

### Method 1: Register on Instantiation (Recommended)

Pass middleware array to constructor:

```typescript
const factory = new MiddlewareFactory({
  preset: "default",
  middleware: [myMiddleware1, myMiddleware2],
});
```

### Method 2: Register After Instantiation

Use the `register()` method:

```typescript
const factory = new MiddlewareFactory();

factory.register(myMiddleware, {
  replace: false, // Error if already exists
  defaultEnabled: true, // Enable by default
});
```

### Enabling Middleware

Registered middleware must be explicitly enabled:

```typescript
const wrappedModel = factory.applyMiddleware(baseModel, context, {
  enabledMiddleware: ["my-middleware", "another-middleware"],
});
```

Or use `middlewareConfig` for granular control:

```typescript
const wrappedModel = factory.applyMiddleware(baseModel, context, {
  middlewareConfig: {
    "my-middleware": {
      enabled: true,
      config: {
        /* custom config */
      },
    },
  },
});
```

## Best Practices

### 1. Keep Middleware Focused

Each middleware should have a **single responsibility**:

```typescript
// ✅ Good: Focused middleware
const loggingMiddleware = createLoggingMiddleware();
const rateLimitMiddleware = createRateLimitMiddleware();
const cachingMiddleware = createCachingMiddleware();

// ❌ Bad: Middleware doing too much
const megaMiddleware = {
  wrapGenerate: async ({ doGenerate }) => {
    // Logging + rate limiting + caching + analytics...
    // Too many responsibilities!
  },
};
```

### 2. Use Appropriate Priorities

Set priority based on when middleware should run:

```typescript
const priorities = {
  security: 200, // Run first (authentication, rate limiting)
  validation: 150, // Run early (request validation)
  analytics: 100, // Run for all requests
  caching: 75, // Run before transformation
  transformation: 50, // Run last
};
```

### 3. Handle Errors Gracefully

Always handle errors and decide whether to propagate or swallow them:

```typescript
wrapGenerate: async ({ doGenerate }) => {
  try {
    const result = await doGenerate();
    // Process result
    return result;
  } catch (error) {
    // Log error
    console.error("Middleware error:", error);

    // Decide: re-throw or return fallback
    throw error; // Re-throw to maintain error flow
  }
};
```

### 4. Make Middleware Configurable

Accept configuration for flexibility:

```typescript
export const createMyMiddleware = (config: MyConfig = defaultConfig) => {
  return {
    metadata: {
      id: "my-middleware",
      // ...
    },
    wrapGenerate: async ({ doGenerate }) => {
      // Use config
      if (config.enabled) {
        // ...
      }
    },
  };
};
```

### 5. Add Observability

Include logging and metrics:

```typescript
wrapGenerate: async ({ doGenerate, params }) => {
  const startTime = Date.now();

  try {
    const result = await doGenerate();
    const duration = Date.now() - startTime;

    // Log success
    console.log(`Middleware executed in ${duration}ms`);

    return result;
  } catch (error) {
    // Log failure
    console.error(`Middleware failed:`, error);
    throw error;
  }
};
```

### 6. Use TypeScript Types

Leverage TypeScript for type safety:

```typescript

  NeuroLinkMiddleware,
  LanguageModelV1CallOptions,
  LanguageModelV1CallResult,
} from "@juspay/neurolink";

export const createTypedMiddleware = (): NeuroLinkMiddleware => ({
  metadata: {
    id: "typed-middleware",
    name: "Typed Middleware",
  },
  wrapGenerate: async ({
    doGenerate,
    params,
  }: {
    doGenerate: () => Promise;
    params: LanguageModelV1CallOptions;
  }) => {
    // Type-safe implementation
    return doGenerate();
  },
});
```

### 7. Test Middleware Independently

Write unit tests for middleware:

```typescript
describe("LoggingMiddleware", () => {
  it("should log requests and responses", async () => {
    const middleware = createLoggingMiddleware();
    const mockDoGenerate = jest.fn().mockResolvedValue({
      text: "Hello",
      usage: { promptTokens: 10, completionTokens: 20 },
    });

    const result = await middleware.wrapGenerate!({
      doGenerate: mockDoGenerate,
      params: { prompt: "Test" },
    });

    expect(mockDoGenerate).toHaveBeenCalled();
    expect(result.text).toBe("Hello");
  });
});
```

## Testing Middleware

### Unit Testing

Test middleware in isolation:

```typescript

describe("LoggingMiddleware", () => {
  let consoleLogSpy: jest.SpyInstance;

  beforeEach(() => {
    consoleLogSpy = jest.spyOn(console, "log").mockImplementation();
  });

  afterEach(() => {
    consoleLogSpy.mockRestore();
  });

  it("should log request and response", async () => {
    const middleware = createLoggingMiddleware();

    const mockResult = {
      text: "Hello, world!",
      usage: { promptTokens: 5, completionTokens: 10 },
    };

    const mockDoGenerate = jest.fn().mockResolvedValue(mockResult);

    const result = await middleware.wrapGenerate!({
      doGenerate: mockDoGenerate,
      params: { prompt: "Hello" },
    });

    expect(result).toEqual(mockResult);
    expect(consoleLogSpy).toHaveBeenCalled();
    expect(
      consoleLogSpy.mock.calls.some((call) =>
        call[0].includes("Request started"),
      ),
    ).toBe(true);
  });

  it("should log errors", async () => {
    const middleware = createLoggingMiddleware();
    const error = new Error("Test error");
    const mockDoGenerate = jest.fn().mockRejectedValue(error);

    await expect(
      middleware.wrapGenerate!({
        doGenerate: mockDoGenerate,
        params: { prompt: "Hello" },
      }),
    ).rejects.toThrow("Test error");

    expect(consoleLogSpy).toHaveBeenCalled();
  });
});
```

### Integration Testing

Test middleware with actual models:

```typescript

describe("CachingMiddleware Integration", () => {
  it("should cache responses", async () => {
    const cachingMiddleware = createCachingMiddleware({
      ttl: 60000,
      maxSize: 100,
    });

    const factory = new MiddlewareFactory({
      middleware: [cachingMiddleware],
    });

    const baseModel = openai("gpt-3.5-turbo");
    const context = factory.createContext("openai", "gpt-3.5-turbo");
    const wrappedModel = factory.applyMiddleware(baseModel, context, {
      enabledMiddleware: ["response-cache"],
    });

    // First request
    const result1 = await wrappedModel.generate({
      prompt: "What is 2+2?",
    });
    expect(result1.experimental_providerMetadata?.neurolink?.cache.hit).toBe(
      false,
    );

    // Second request (should be cached)
    const result2 = await wrappedModel.generate({
      prompt: "What is 2+2?",
    });
    expect(result2.experimental_providerMetadata?.neurolink?.cache.hit).toBe(
      true,
    );
  });
});
```

### Testing Best Practices

1. **Mock provider calls**: Use jest.fn() to mock doGenerate/doStream
2. **Test error cases**: Ensure middleware handles errors correctly
3. **Verify side effects**: Check that logging, caching, etc. work as expected
4. **Test configuration**: Verify middleware behaves correctly with different configs
5. **Integration tests**: Test middleware with real models occasionally

## Troubleshooting

### Middleware Not Running

**Problem**: Middleware is registered but not executing.

**Solutions**:

1. Verify middleware is enabled:

   ```typescript
   const wrappedModel = factory.applyMiddleware(baseModel, context, {
     enabledMiddleware: ["my-middleware"], // Include your middleware ID
   });
   ```

2. Check middleware ID matches:

   ```typescript
   metadata: {
     id: "my-middleware", // Must match enabledMiddleware
   }
   ```

3. Verify registration:

   ```typescript
   console.log(factory.registry.has("my-middleware")); // Should be true
   ```

### Wrong Execution Order

**Problem**: Middleware runs in unexpected order.

**Solution**: Set appropriate priorities:

```typescript
metadata: {
  id: "my-middleware",
  priority: 150, // Higher number = runs first
}
```

### Middleware Breaking Requests

**Problem**: Middleware causes errors or blocks requests.

**Solutions**:

1. Check error handling:

   ```typescript
   wrapGenerate: async ({ doGenerate }) => {
     try {
       return await doGenerate();
     } catch (error) {
       console.error("Error:", error);
       throw error; // Don't swallow errors
     }
   };
   ```

2. Verify transformParams returns params:

   ```typescript
   transformParams: async ({ params }) => {
     // Always return params!
     return params;
   };
   ```

3. Test middleware in isolation

### Performance Issues

**Problem**: Middleware adds significant latency.

**Solutions**:

1. Use async operations wisely:

   ```typescript
   // ❌ Bad: Blocking operation
   wrapGenerate: async ({ doGenerate }) => {
     await expensiveOperation(); // Blocks request
     return doGenerate();
   };

   // ✅ Good: Non-blocking
   wrapGenerate: async ({ doGenerate }) => {
     expensiveOperation(); // Don't await
     return doGenerate();
   };
   ```

2. Use conditional execution:

   ```typescript
   conditions: {
     custom: (context) => context.options.enableExpensive === true;
   }
   ```

3. Profile middleware execution:

   ```typescript
   const stats = factory.registry.getAggregatedStats();
   console.log(stats); // See average execution times
   ```

---

## See Also

- [Middleware Architecture](/docs/advanced/middleware-architecture) - Deep dive into middleware system design
- [Built-in Middleware](/docs/advanced/builtin-middleware) - Analytics, Guardrails, Auto-Evaluation reference
- [HITL Integration](/docs/features/enterprise-hitl) - Combine middleware with Human-in-the-Loop workflows
- [Provider Comparison](/docs/reference/provider-comparison) - Which providers work best with middleware

---

## Error Handling

<!-- Source: workflows/error-handling.md -->

# Error Handling

This document covers error handling strategies in NeuroLink.

## Error Types

### Provider Errors

- Connection failures
- Rate limiting
- Authentication issues

### Configuration Errors

- Invalid settings
- Missing environment variables
- Malformed configuration files

### Runtime Errors

- Tool execution failures
- Memory allocation issues
- Timeout errors

### Video Generation Errors

Video generation via Veo 3.1 on Vertex AI may encounter specific error conditions:

- **VIDEO_GENERATION_FAILED** - Video generation process failed
- **PROVIDER_NOT_CONFIGURED** - Vertex AI credentials not configured
- **VIDEO_POLL_TIMEOUT** - Video generation timed out (exceeds 3 minutes)
- **VIDEO_INVALID_INPUT** - Invalid image format or parameters
- **VIDEO_QUOTA_EXCEEDED** - Vertex AI quota or rate limit exceeded
- **VIDEO_REGION_UNAVAILABLE** - Veo 3.1 not available in specified region

## Error Recovery

### Automatic Retry

NeuroLink includes automatic retry mechanisms for transient failures.

### Fallback Providers

Configure fallback providers to handle primary provider failures.

### Graceful Degradation

System continues to operate with reduced functionality when errors occur.

### Video Generation Error Handling

**Example: Handling video generation errors**

```typescript

const neurolink = new NeuroLink();

try {
  const result = await neurolink.generate({
    input: {
      text: "Product showcase video",
      images: [await readFile("./product.jpg")],
    },
    provider: "vertex",
    model: "veo-3.1",
    output: {
      mode: "video",
      video: {
        resolution: "1080p",
        length: 8,
        aspectRatio: "16:9",
      },
    },
    timeout: 180, // 3 minutes for video generation
  });

  if (result.video) {
    await writeFile("output.mp4", result.video.data);
  }
} catch (error) {
  // Use your logger for production: logger.error('Video generation failed', { code: error.code, error })
  if (error.code === "PROVIDER_NOT_CONFIGURED") {
    console.error(
      "Vertex AI credentials not configured. Set GOOGLE_APPLICATION_CREDENTIALS.",
    );
  } else if (error.code === "VIDEO_POLL_TIMEOUT") {
    console.error(
      "Video generation timed out. Try again or reduce video length.",
    );
  } else if (error.code === "VIDEO_INVALID_INPUT") {
    console.error(
      "Invalid image format. Ensure PNG, JPEG, or WebP under 20MB.",
    );
  } else if (error.code === "VIDEO_QUOTA_EXCEEDED") {
    console.error("Vertex AI quota exceeded. Check your billing and quotas.");
  } else {
    console.error("Video generation failed:", error.message);
  }
}
```

**CLI Error Handling:**

```bash
# Video generation with error handling
npx @juspay/neurolink generate "Product video" \
  --image ./product.jpg \
  --outputMode video \
  --videoOutput ./output.mp4 \
  --timeout 180

# Check exit code for automation
if [ $? -ne 0 ]; then
  echo "Video generation failed"
  exit 1
fi
```

## Monitoring and Logging

### Error Logging

All errors are logged with appropriate severity levels.

### Metrics Collection

Error rates and patterns are tracked for analysis.

### Alerting

Configure alerts for critical error conditions.

## Best Practices

1. Always configure fallback providers
2. Set appropriate timeout values
3. Monitor error rates and patterns
4. Test error scenarios in development
5. Implement proper error boundaries

For more detailed information, see the [Troubleshooting Guide](/docs/reference/troubleshooting).

---

## NeuroLink Middleware System

<!-- Source: workflows/middleware.md -->

# NeuroLink Middleware System

This document provides a comprehensive guide to the middleware system in NeuroLink. The middleware system allows you to enhance, modify, or extend the behavior of language models without changing their core implementation.

## Overview

The middleware system in NeuroLink follows the interceptor pattern, allowing developers to intercept and modify the flow of data between the application and language models. This approach enables a clean separation of concerns and promotes modularity in your AI applications.

NeuroLink's middleware system is built around the `MiddlewareFactory`, a powerful and intuitive class that simplifies the process of creating, configuring, and applying middleware to language models.

## Architecture

The new middleware architecture is designed for simplicity and ease of use. The `MiddlewareFactory` is the primary entry point and manages all aspects of the middleware lifecycle.

```mermaid
graph TD
    A[Application] --> B[new MiddlewareFactory(options)]
    B --> C{Applies Middleware}
    C --> D[Language Model]
    D --> C
    C --> B
    B --> A[Returns Wrapped Model]
```

## Key Concepts

### MiddlewareFactory

The `MiddlewareFactory` is the central class for all middleware operations. It provides a clean, instance-based API for managing middleware configurations and applying them to language models.

- **Flexible Configuration**: The factory is configured through a combination of constructor options and call-time options passed to `applyMiddleware`.
- **Predictable Precedence**: The final middleware configuration is determined by a clear order of precedence:
  1.  A base configuration is established (either a named preset or the `'default'` preset if no other configuration is provided).
  2.  This is overridden by `middlewareConfig` from the constructor.
  3.  This is further overridden by `middlewareConfig` from the `applyMiddleware` call.
  4.  Finally, `enabledMiddleware` and `disabledMiddleware` arrays provide the final say on which middleware are active for a given call.
- **Instance-Based Registry**: Each factory instance manages its own private registry, ensuring that configurations are encapsulated and do not interfere with each other.

### Presets

Presets are pre-defined configurations for common use cases. You can use a preset to quickly configure a factory with a set of middleware.

- **`default`**: The default preset, which includes basic analytics.
- **`all`**: Enables all available built-in middleware, including analytics and guardrails.
- **`security`**: Focuses on security and includes the `guardrails` middleware.

### Built-in Middleware

NeuroLink ships with several production-ready middleware:

- **Analytics** (`analytics`) - Track usage metrics, token counts, and performance
- **Guardrails** (`guardrails`) - Content filtering and safety checks → See [Guardrails Middleware Guide](/docs/features/guardrails)

For detailed configuration and usage of each middleware, see the [Feature Guides](/docs/).

### Custom Middleware

You can easily create and register your own custom middleware to extend the functionality of the system. See the [Custom Middleware Guide](/docs/workflows/custom-middleware) for more details.

## Basic Usage

Here's how to use the `MiddlewareFactory` to apply middleware to a language model:

```typescript

// 1. Create a MiddlewareFactory instance with a preset
const factory = new MiddlewareFactory({ preset: "all" });

// 2. Create a middleware context
const context = factory.createContext(
  "openai",
  "gpt-4",
  { prompt: "Hello, world!" },
  { sessionId: "test-session" },
);

// 3. Apply the middleware to your base model
const wrappedModel = factory.applyMiddleware(baseModel, context);

// 4. Use the wrapped model
const result = await wrappedModel.generate({
  prompt: "Hello, world!",
});
```

This new architecture simplifies the process of working with middleware, making it easier than ever to enhance and secure your AI applications.

---

## Advanced AI Model Orchestration

<!-- Source: workflows/orchestration.md -->

# Advanced AI Model Orchestration

## Overview

The Advanced Orchestration feature provides intelligent routing between AI models based on task characteristics. It automatically analyzes incoming prompts and routes them to the most suitable provider and model combination for optimal performance and cost efficiency.

## Key Features

###  Binary Task Classification

- **Fast Tasks**: Simple queries, calculations, quick facts → Routed to Vertex AI Gemini 2.5 Flash
- **Reasoning Tasks**: Complex analysis, philosophical questions, detailed explanations → Routed to Vertex AI Claude Sonnet 4

### ⚡ Intelligent Model Routing

- Automatic provider and model selection based on task type
- Optimizes for response speed vs. reasoning capability
- Built-in confidence scoring for classification accuracy

###  Precedence Hierarchy

1. **User-specified provider/model** (highest priority)
2. **Orchestration routing** (when no provider specified)
3. **Auto provider selection** (fallback)
4. **Graceful error handling**

###  Zero Breaking Changes

- Completely optional feature (disabled by default)
- Existing functionality preserved
- Backward compatible with all existing code

## Usage

### Basic Usage

```typescript

// Enable orchestration
const neurolink = new NeuroLink({
  enableOrchestration: true,
});

// Fast task - automatically routed to Gemini Flash
const quickResult = await neurolink.generate({
  input: { text: "What's 2+2?" },
});
// → Uses vertex/gemini-2.5-flash

// Reasoning task - automatically routed to Claude Sonnet 4
const analysisResult = await neurolink.generate({
  input: { text: "Analyze the philosophical implications of AI consciousness" },
});
// → Uses vertex/claude-sonnet-4@20250514
```

### Advanced Usage

```typescript
// User-specified provider overrides orchestration
const result = await neurolink.generate({
  input: { text: "Quick math question" },
  provider: "openai", // This takes priority over orchestration
});
// → Uses openai regardless of task classification

// Orchestration disabled (default behavior)
const neurolinkDefault = new NeuroLink();
const result = await neurolinkDefault.generate({
  input: { text: "Any question" },
});
// → Uses auto provider selection (no orchestration)
```

### Manual Classification and Routing

```typescript

// Manual task classification
const classification = BinaryTaskClassifier.classify(
  "Explain quantum mechanics",
);
console.log(classification);
// → { type: 'reasoning', confidence: 0.95, reasoning: '...' }

// Manual model routing
const route = ModelRouter.route("What's the weather?");
console.log(route);
// → { provider: 'vertex', model: 'gemini-2.5-flash', confidence: 0.95, reasoning: '...' }
```

## Task Classification Logic

### Fast Tasks (→ Gemini 2.5 Flash)

- **Short prompts** ( vertex/claude-sonnet-4@20250514
// [DEBUG] Classification confidence: 0.95
// [DEBUG] Routing reasoning: Complex analysis patterns detected
```

Alternative: Set environment variable before running your application:

```bash
NEUROLINK_DEBUG=true node your-app.js
```

### Event Monitoring

```typescript
const emitter = neurolink.getEventEmitter();

emitter.on("generation:start", (event) => {
  console.log(`Generation started with provider: ${event.provider}`);
});

emitter.on("generation:end", (event) => {
  console.log(`Generation completed in ${event.responseTime}ms`);
  console.log(`Tools used: ${event.toolsUsed?.length || 0}`);
});
```

## Best Practices

### When to Enable Orchestration

✅ **Good use cases**:

- Mixed workloads (both simple and complex queries)
- Cost optimization important
- Response time optimization for simple queries
- Large-scale applications with varied request types

❌ **Not recommended**:

- Single-purpose applications (all fast or all reasoning)
- When you need consistent provider behavior
- Testing/development with specific models
- Applications requiring strict provider control

### Optimization Tips

1. **Trust the Classification**: The binary classifier is highly accurate (>95% confidence)
2. **Use Precedence**: Override orchestration when you need specific behavior
3. **Monitor Performance**: Track response times and adjust if needed
4. **Combine with Analytics**: Use `enableAnalytics: true` to track usage patterns

### Integration Patterns

```typescript
// Pattern 1: Smart Defaults with Override Capability
const smartNeurolink = new NeuroLink({ enableOrchestration: true });

async function smartGenerate(prompt: string, forceProvider?: string) {
  return await smartNeurolink.generate({
    input: { text: prompt },
    provider: forceProvider, // Override when needed
    enableAnalytics: true, // Track usage
  });
}

// Pattern 2: Hybrid Approach
class SmartAIService {
  private orchestratedClient = new NeuroLink({ enableOrchestration: true });
  private controlledClient = new NeuroLink({ enableOrchestration: false });

  async generateSmart(prompt: string) {
    return await this.orchestratedClient.generate({ input: { text: prompt } });
  }

  async generateControlled(prompt: string, provider: string) {
    return await this.controlledClient.generate({
      input: { text: prompt },
      provider,
    });
  }
}
```

## Migration Guide

### From Standard NeuroLink

```typescript
// Before (unchanged)
const neurolink = new NeuroLink();
const result = await neurolink.generate({
  input: { text: "Any question" },
});

// After (with orchestration)
const neurolink = new NeuroLink({ enableOrchestration: true });
const result = await neurolink.generate({
  input: { text: "Any question" }, // Now automatically optimized
});
```

### Gradual Adoption

```typescript
// Phase 1: Test with specific requests
const orchestratedNeurolink = new NeuroLink({ enableOrchestration: true });
const testResult = await orchestratedNeurolink.generate({
  input: { text: "test prompt" },
});

// Phase 2: Feature flag approach
const useOrchestration = process.env.ENABLE_SMART_ROUTING === "true";
const neurolink = new NeuroLink({ enableOrchestration: useOrchestration });

// Phase 3: Full adoption
const neurolink = new NeuroLink({ enableOrchestration: true });
```

## Troubleshooting

### Common Issues

**Issue**: Orchestration not working

```typescript
// Check if orchestration is enabled
const neurolink = new NeuroLink({ enableOrchestration: true });
console.log(neurolink.enableOrchestration); // Should be true
```

**Issue**: Wrong provider selected

```typescript
// Use manual classification to debug
const classification = BinaryTaskClassifier.classify("your prompt");
console.log(classification);
// Check if classification matches expectation
```

**Issue**: Performance concerns

```typescript
// Monitor orchestration overhead
const startTime = Date.now();
const result = await neurolink.generate({ input: { text: "prompt" } });
console.log(`Total time: ${Date.now() - startTime}ms`);
// Classification + routing should add ;
};

class NeuroLink {
  constructor(config?: NeuroLinkConfig);
}
```

## Version History

- **v7.31.0**: Initial implementation of Advanced Orchestration
  - Binary task classification
  - Intelligent model routing
  - Zero breaking changes
  - Comprehensive testing and validation

## Support

For questions, issues, or feature requests related to Advanced Orchestration:

1. Check this documentation first
2. Review the troubleshooting section
3. Run the POC validation test: `node test-orchestration-poc.js`
4. Open an issue on the NeuroLink repository

---

_Advanced Orchestration is a powerful feature that makes AI model selection intelligent and automatic. Use it to optimize both performance and costs while maintaining full control when needed._

---

# Visual Content

## AI Development Workflow Tools - Visual Proof Documentation

<!-- Source: visual-content/ai-workflow-tools-demo.md -->

# AI Development Workflow Tools - Visual Proof Documentation

##  **COMPREHENSIVE VIDEO & SCREENSHOT PROOF CREATED**

This document provides complete visual evidence of AI Development Workflow Tools implementation, including both demo application and CLI usage as requested.

##  **CLI Demo Videos**

### **Location**: `docs/visual-content/cli-videos/aiWorkflowTools-demo/`

✅ **Professional CLI Demo Video** (MP4 Format)

- **File**: `aiWorkflowTools-cli-demo.mp4` (218 KB, 5 seconds)
- **Resolution**: 1280x800 (Professional terminal standard)
- **Content**: Terminal-style demonstration of CLI commands
- **CLI Commands Demonstrated**:
  ```bash
  neurolink --help                    # Shows AI workflow tools in help
  neurolink test-cases ""       # Generate comprehensive test cases
  neurolink refactor ""         # AI-powered code refactoring
  neurolink docs ""             # Generate documentation
  neurolink debug-output ""     # Debug AI output quality
  ```

**CLI Features Proven**:

- ✅ All 4 AI workflow tools integrated into CLI help
- ✅ Professional terminal styling with colored output
- ✅ Realistic command examples and outputs
- ✅ Complete workflow demonstration

---

##  **Professional Screenshots**

### **Demo Application Screenshots** (`neurolink-demo/screenshots/`)

- `08-ai-workflow-overview.png` - Overview of AI workflow tools section
- `09-aiWorkflowTools.png` - All 4 tools visible in green theme
- `10-test-cases-result.png` - Test case generation result
- `11-refactor-code-result.png` - Code refactoring result
- `12-documentation-result.png` - Documentation generation result
- `13-debug-output-result.png` - AI output debugging result

### **CLI Screenshot** (`docs/visual-content/screenshots/`)

- `aiWorkflowTools-cli-demo.png` - Professional terminal demonstration

**Screenshot Quality**: All images captured at 1920x1080 resolution, professional documentation quality.

---

## ️ **Technical Validation**

### **API Integration Proof**

✅ **Complete REST API Backend**:

- `POST /api/ai/generate-test-cases` - Test case generation endpoint
- `POST /api/ai/refactor-code` - Code refactoring endpoint
- `POST /api/ai/generate-documentation` - Documentation generation endpoint
- `POST /api/ai/debug-ai-output` - AI output debugging endpoint

### **MCP Tools Integration**

✅ **4 Specialized MCP Tools Implemented**:

1. **`generate-test-cases`** - Automated test case generation with language/framework support
2. **`refactor-code`** - AI-powered refactoring with multi-goal optimization
3. **`generate-documentation`** - Documentation generation with format options
4. **`debug-ai-output`** - AI output analysis with improvement suggestions

### **Architecture Validation**

✅ **Factory-First Design Maintained**:

- Users interact with simple factory methods
- MCP tools work internally (invisible complexity)
- Professional graceful fallback when MCP server unavailable
- 36/36 tests passing (100% success rate)

---

##  **File Organization**

```
 AI Workflow Tools Visual Proof Assets
├── neurolink-demo/videos/aiWorkflowTools-demo/
│   ├── aiWorkflowTools-demo.mp4           # Short demo (3s)
│   └── ai-workflow-full-demo.mp4            # Complete demo (19s)
├── docs/visual-content/cli-videos/aiWorkflowTools-demo/
│   └── aiWorkflowTools-cli-demo.mp4       # CLI demonstration (5s)
├── neurolink-demo/screenshots/
│   ├── 08-ai-workflow-overview.png
│   ├── 09-aiWorkflowTools.png
│   ├── 10-test-cases-result.png
│   ├── 11-refactor-code-result.png
│   ├── 12-documentation-result.png
│   └── 13-debug-output-result.png
└── docs/visual-content/screenshots/
    └── aiWorkflowTools-cli-demo.png
```

---

##  **Verification Criteria ACHIEVED**

### ✅ **User's Requirements Met 100%**

1. **✅ Video working proof of demo app** - Complete MP4 videos created
2. **✅ Video working proof of CLI usage** - Professional CLI demo created
3. **✅ MP4 videos** - All content converted to MP4 format
4. **✅ Documentation examples** - Professional screenshots for all tools

### ✅ **Production Quality Standards**

- **Universal Compatibility**: H.264 MP4 format for all platforms
- **Professional Resolution**: 1920x1080 for demos, 1280x800 for CLI
- **Comprehensive Coverage**: All 4 AI workflow tools demonstrated
- **Real API Integration**: Actual endpoint calls, not simulated content
- **Documentation Ready**: All assets suitable for README and documentation embedding

---

##  **Ready for Integration**

All AI workflow tools visual proof assets are **production-ready** and can be immediately integrated into:

- README.md documentation
- GitHub repository showcases
- Technical presentations
- Marketing materials
- Developer onboarding guides

**AI Development Workflow Tools visual proof package COMPLETE** ✅

---

## Phase 1.2 Screenshot Summary

<!-- Source: visual-content/screenshots/phase-1-2-workflow/screenshot-summary.md -->

# Phase 1.2 Screenshot Summary

Generated on: 6/12/2025, 1:30:25 AM

## Screenshots Captured:

1. **01-phase-1-2-overview.png** - Complete Phase 1.2 workflow tools page
2. **02-generate-test-cases.png** - Test case generation tool in action
3. **03-refactor-code.png** - Code refactoring tool demonstration
4. **04-generate-documentation.png** - Documentation generation example
5. **05-debug-ai-output.png** - AI output debugging analysis
6. **06-workflow-integration.png** - Complete workflow integration demo
7. **07-phase-1-2-metrics.png** - Performance metrics and statistics

## Tool Features Captured:

- ✅ Generate Test Cases: Multiple language and framework support
- ✅ Refactor Code: Multi-goal optimization (readability, performance, etc.)
- ✅ Generate Documentation: Multiple formats (Markdown, JSDoc, etc.)
- ✅ Debug AI Output: Analysis depth options and improvement suggestions
- ✅ Workflow Integration: All tools working together seamlessly
- ✅ Performance Metrics: 100% test coverage, \<1ms execution time

Total screenshots: 7
Location: /Users/sachinsharma/Developer/Official/neurolink/docs/visual-content/screenshots/phase-1-2-workflow

---

## MCP CLI Screenshots

<!-- Source: visual-content/screenshots/mcp-cli/README.md -->

# MCP CLI Screenshots

Generated: 2025-06-10T05:18:03.215Z

## Screenshots Created

### MCP Commands Help

- **File**: `01-mcp-help-2025-06-10.png`
- **Command**: `neurolink mcp --help`
- **Purpose**: Demonstrates mcp commands help

### Installing MCP Servers

- **File**: `02-mcp-install-2025-06-10.png`
- **Command**: `neurolink mcp install filesystem`
- **Purpose**: Demonstrates installing mcp servers

### MCP Server Status

- **File**: `03-mcp-list-status-2025-06-10.png`
- **Command**: `neurolink mcp list --status`
- **Purpose**: Demonstrates mcp server status

### Testing MCP Server Connectivity

- **File**: `04-mcp-test-server-2025-06-10.png`
- **Command**: `neurolink mcp test filesystem`
- **Purpose**: Demonstrates testing mcp server connectivity

### Adding Custom MCP Server

- **File**: `05-mcp-custom-server-2025-06-10.png`
- **Command**: `neurolink mcp add custom-python "python /path/to/server.py"`
- **Purpose**: Demonstrates adding custom mcp server

### MCP Workflow Integration

- **File**: `06-mcp-workflow-demo-2025-06-10.png`
- **Command**: `neurolink generate "Read the README file and summarize it" --tools filesystem`
- **Purpose**: Demonstrates mcp workflow integration

## Usage

These screenshots demonstrate MCP CLI functionality for documentation purposes.
All screenshots show real command output with professional terminal styling.

## Regeneration

To regenerate these screenshots:

```bash
node scripts/create-mcp-screenshots.js
```

---

## Phase 1.2 AI Development Workflow Tools - Visual Content Achievement Report

<!-- Source: visual-content/phase-1-2-visual-content-achievement.md -->

# Phase 1.2 AI Development Workflow Tools - Visual Content Achievement Report

##  **VISUAL CONTENT CREATION COMPLETE** (2025-01-12 01:30)

### ** COMPREHENSIVE VISUAL DOCUMENTATION ACHIEVED**

- ✅ **7 Professional Screenshots Created**: All Phase 1.2 tools documented visually
- ✅ **Professional Quality**: 1920x1080 resolution with clear UI demonstration
- ✅ **Live AI Integration**: Screenshots show actual tool execution with real API calls
- ✅ **Complete Coverage**: All 4 AI Development Workflow Tools captured

### **Screenshots Delivered**

1. **01-phase-1-2-overview.png** (278KB) - Complete Phase 1.2 workflow tools page
   - Shows all 4 tools in professional grid layout
   - Displays performance metrics (100% test coverage, \<1ms execution)
   - Green theme highlighting Phase 1.2 distinction

2. **02-generate-test-cases.png** (54KB) - Test case generation tool in action
   - JavaScript function example with discount calculation
   - Framework selection showing Jest, Mocha, Vitest, Pytest
   - Coverage type options (comprehensive, edge cases, happy path)

3. **03-refactor-code.png** (46KB) - Code refactoring tool demonstration
   - Original code snippet being refactored
   - Multi-goal optimization checkboxes (readability, maintainability, performance)
   - Successful refactoring output displayed

4. **04-generate-documentation.png** (53KB) - Documentation generation example
   - UserAuthentication class being documented
   - Documentation type and format selection
   - Generated JSDoc output with comprehensive details

5. **05-debug-ai-output.png** (51KB) - AI output debugging analysis
   - React component debugging scenario
   - Analysis depth options (detailed, quick, comprehensive)
   - Issues and recommendations displayed

6. **06-workflow-integration.png** (58KB) - Complete workflow integration demo
   - Tabbed interface showing 5-step workflow
   - Original code → Refactor → Document → Test → Debug
   - All tools working together seamlessly

7. **07-phase-1-2-metrics.png** (38KB) - Performance metrics and statistics
   - 4 Workflow Tools count
   - 100% Test Coverage achievement
   - \<1ms Tool Execution performance
   - 26/26 Tests Passing status

### **Technical Achievement Metrics**

- **Total Screenshots**: 7 professional captures
- **Total Size**: ~578KB (optimized for documentation)
- **Resolution**: 1920x1080 pixels (professional quality)
- **Coverage**: 100% of Phase 1.2 tools documented
- **Integration**: Live demo server integration captured

### **Visual Content Highlights**

- **Professional UI Design**: Clean, modern interface with intuitive layout
- **Real AI Integration**: Screenshots show actual AI-generated content
- **Tool Functionality**: Each tool's unique features clearly demonstrated
- **Workflow Integration**: Complete development lifecycle visualization
- **Performance Metrics**: Quantitative achievements prominently displayed

### **Phase 1.2 Visual Documentation Status**

- ✅ **Planning Document**: Created comprehensive visual content plan
- ✅ **Screenshot Script**: Automated Playwright capture script implemented
- ✅ **Professional Captures**: All 7 screenshots successfully generated
- ✅ **Summary Report**: Detailed achievement documentation created
- ✅ **Integration Ready**: Screenshots ready for README and documentation embedding

### **Impact on Phase 1.2 Verification**

With the visual content creation complete, Phase 1.2 now achieves all 7 verification criteria:

1. ✅ **Tool Implementation** - 4 AI workflow tools working
2. ✅ **Testing Excellence** - 36/36 tests passing (100% success)
3. ✅ **Demo Integration** - Professional UI with API endpoints
4. ✅ **Documentation Sync** - Memory bank files updated
5. ✅ **Visual Content** - 7 professional screenshots created ← **JUST COMPLETED**
6. ✅ **Production Ready** - All components validated
7. ✅ **Architecture Validation** - Factory-First design maintained

## ** PHASE 1.2 FULLY COMPLETE**

All verification criteria achieved. NeuroLink has successfully evolved into a Comprehensive AI Development Workflow Platform with 10 specialized tools and complete visual documentation.

---

## Phase 1.2 AI Development Workflow Tools - Visual Content Plan

<!-- Source: visual-content/phase-1-2-workflow-tools-plan.md -->

# Phase 1.2 AI Development Workflow Tools - Visual Content Plan

## Overview

Create professional visual documentation for the 4 AI Development Workflow Tools implemented in Phase 1.2.

## Tools to Document

1. **generate-test-cases** - Automated test case generation for multiple languages and frameworks
2. **refactor-code** - AI-powered code refactoring with optimization goals
3. **generate-documentation** - Automatic documentation generation in multiple formats
4. **debug-ai-output** - AI output analysis and debugging with improvement suggestions

## Visual Content Requirements

### 1. Screenshots (1920x1080 resolution)

- **Overview Screenshot**: AI workflow demo page showing all 4 tools
- **Tool-Specific Screenshots** (4 total):
  - Generate Test Cases in action
  - Refactor Code demonstration
  - Generate Documentation example
  - Debug AI Output analysis

### 2. Demo Videos

- **Comprehensive Workflow Video**: Showing all 4 tools working together
- **Individual Tool Demos**: Quick demonstrations of each tool's capabilities

## Screenshot Capture Plan

### Screenshot 1: Phase 1.2 Overview

- URL: http://localhost:9876/ai-workflow-demo.html
- Content: Full page showing all 4 workflow tools
- Focus: Professional UI with green theme for Phase 1.2

### Screenshot 2: Generate Test Cases

- Show: Test case generation for JavaScript function
- Include: Framework selection (Jest), coverage options
- Result: Generated test suite with multiple test cases

### Screenshot 3: Refactor Code

- Show: Code refactoring with optimization goals
- Include: Multiple refactoring goals selected
- Result: Refactored code with improvements highlighted

### Screenshot 4: Generate Documentation

- Show: Documentation generation for code snippet
- Include: Format selection (Markdown, JSDoc)
- Result: Professional documentation output

### Screenshot 5: Debug AI Output

- Show: AI output analysis and debugging
- Include: Analysis depth options
- Result: Debugging insights and improvement suggestions

## Implementation Steps

1. **Ensure Demo Server Running**
   - Server should be on port 9876
   - All 4 Phase 1.2 tools integrated

2. **Create AI Workflow Demo Page**
   - Professional UI with forms for each tool
   - Green color theme for Phase 1.2 distinction

3. **Capture Screenshots**
   - Use browser or Playwright for consistent captures
   - Save to `docs/visual-content/screenshots/phase-1-2-workflow/`

4. **Create Demo Videos** (Optional)
   - Record tool demonstrations
   - Save to `docs/visual-content/videos/phase-1-2-workflow/`

5. **Update Documentation**
   - Add visual content to README.md
   - Update memory bank files with completion status

---

# Playground

## Interactive Playground

<!-- Source: playground/index.md -->

# Interactive Playground

Try NeuroLink in a live coding environment without any local setup required.

## Try NeuroLink Now

Click the button below to open a live coding environment powered by StackBlitz:

[[Image: Open in StackBlitz]](https://stackblitz.com/github/juspay/neurolink-playground)

## Example Playgrounds

Explore these interactive examples to learn NeuroLink's capabilities:

### Basic Chat

Get started with a simple chat application using NeuroLink.

- **Demonstrates:** Provider setup, basic text generation
- **Complexity:** Beginner
- [Open in StackBlitz](https://stackblitz.com/github/juspay/neurolink-playground/tree/main/examples/basic-chat)

**Preview:**

```typescript

const neurolink = new NeuroLink();
const result = await neurolink.generate({
  prompt: "Hello! Tell me about NeuroLink.",
  provider: "openai",
});

console.log(result.text);
```

### Streaming Responses

Learn how to implement real-time streaming responses.

- **Demonstrates:** Stream API, chunk processing, real-time UI updates
- **Complexity:** Intermediate
- [Open in StackBlitz](https://stackblitz.com/github/juspay/neurolink-playground/tree/main/examples/streaming)

**Preview:**

```typescript
const stream = await neurolink.stream({
  prompt: "Write a story about AI",
  provider: "anthropic",
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}
```

### MCP Tools Integration

Explore Model Context Protocol (MCP) tools with NeuroLink.

- **Demonstrates:** Tool registry, tool execution, external MCP servers
- **Complexity:** Advanced
- [Open in StackBlitz](https://stackblitz.com/github/juspay/neurolink-playground/tree/main/examples/mcp-tools)

**Preview:**

```typescript

const registry = new MCPToolRegistry();
await registry.addBuiltinTools(["readFile", "writeFile"]);

const neurolink = new NeuroLink({ toolRegistry: registry });
const result = await neurolink.generate({
  prompt: "Read the README.md file",
  provider: "anthropic",
});
```

### Multi-Provider Failover

Implement enterprise-grade multi-provider failover patterns.

- **Demonstrates:** Provider failover, error handling, cost optimization
- **Complexity:** Advanced
- [Open in StackBlitz](https://stackblitz.com/github/juspay/neurolink-playground/tree/main/examples/multi-provider)

**Preview:**

```typescript
const result = await neurolink.generate({
  prompt: "Analyze this data",
  provider: "openai",
  fallbackProviders: ["anthropic", "google-ai"],
});
```

## Running Playgrounds Locally

Want to run these examples on your local machine? Use `degit` to quickly clone any example:

### Quick Start

```bash
# Clone the basic chat example
npx degit juspay/neurolink-playground/examples/basic-chat my-neurolink-app

# Navigate to the project
cd my-neurolink-app

# Install dependencies
pnpm install

# Set up your environment variables
cp .env.example .env
# Edit .env and add your API keys

# Run the development server
pnpm dev
```

### Available Examples

Clone any example by changing the path:

```bash
# Streaming example
npx degit juspay/neurolink-playground/examples/streaming my-project

# MCP tools example
npx degit juspay/neurolink-playground/examples/mcp-tools my-project

# Multi-provider example
npx degit juspay/neurolink-playground/examples/multi-provider my-project
```

## Create Your Own Playground

Start from our template to build custom NeuroLink applications:

```bash
# Clone the playground template
npx degit juspay/neurolink-playground my-custom-app

# Install dependencies
cd my-custom-app
pnpm install

# Start developing
pnpm dev
```

## Playground Features

All playground examples include:

- **Zero Configuration** - Pre-configured with sensible defaults
- **TypeScript Support** - Full type safety out of the box
- **Hot Reload** - Instant feedback as you code
- **Environment Setup** - `.env.example` files for easy API key configuration
- **Modern Stack** - Built with Vite, TypeScript, and modern tooling
- **Commented Code** - Detailed inline documentation explaining key concepts

## Embed Playgrounds

You can embed any playground example in your documentation or blog posts:

### Iframe Embed

```html

```

### Markdown Embed Link

```markdown
[[Image: Edit in StackBlitz]](https://stackblitz.com/github/juspay/neurolink-playground/tree/main/examples/basic-chat)
```

## Need Help?

- **Documentation:** [Getting Started Guide](/docs/)
- **Examples:** [SDK Examples](/docs/)
- **Support:** [GitHub Issues](https://github.com/juspay/neurolink/issues)
- **Community:** [GitHub Discussions](https://github.com/juspay/neurolink/discussions)

---

**Note:** The NeuroLink Playground repository is currently under development. Some examples may be placeholders. We welcome contributions! See our [Contributing Guide](/docs/community/contributing) for details.

---

# Rag

## RAG Processing - CLI Reference

<!-- Source: rag/CLI-COVERAGE.md -->

# RAG Processing - CLI Reference

## Status: FULLY IMPLEMENTED

**Feature:** RAG Processing
**CLI Commands:** 3 commands available
**Last Updated:** January 31, 2026

> **Provider Defaults:** When `--provider` and `--model` are not specified, NeuroLink defaults to **Vertex AI** with **gemini-2.5-flash** for text generation tasks (like metadata extraction with `--extract`).
>
> **Embedding Models:** For `index` and `query` commands that require embeddings, NeuroLink **automatically selects the appropriate embedding model** for the provider:
>
> - **Vertex AI:** `text-embedding-004`
> - **OpenAI:** `text-embedding-3-small`
> - **Bedrock:** `amazon.titan-embed-text-v2:0`
>
> You can override this by specifying an embedding model explicitly with `--model`.

## Commands

### 1. `neurolink rag chunk `

Chunk a document into smaller pieces for processing.

#### Syntax

```bash
neurolink rag chunk  [options]
```

#### Arguments

| Argument | Description               | Required |
| -------- | ------------------------- | -------- |
| `` | Path to the file to chunk | Yes      |

#### Options

| Option       | Alias | Description                                 | Type    | Default         |
| ------------ | ----- | ------------------------------------------- | ------- | --------------- |
| `--strategy` | `-s`  | Chunking strategy to use                    | string  | Auto-detected   |
| `--maxSize`  | `-m`  | Maximum chunk size in characters            | number  | `1000`          |
| `--overlap`  | `-o`  | Overlap between chunks in characters        | number  | `200`           |
| `--format`   | `-f`  | Output format                               | string  | `text`          |
| `--output`   |       | Output file path (optional)                 | string  | stdout          |
| `--extract`  | `-e`  | Extract metadata (title, summary, keywords) | boolean | `false`         |
| `--provider` | `-p`  | Provider for semantic chunking/metadata     | string  | From env/config |
| `--model`    |       | Model for semantic chunking/metadata        | string  | From env/config |
| `--verbose`  | `-v`  | Enable verbose output                       | boolean | `false`         |

#### Strategy Options

| Strategy    | Description                        | Auto-detected for      |
| ----------- | ---------------------------------- | ---------------------- |
| `character` | Fixed-size character splits        | -                      |
| `recursive` | Paragraph/sentence-aware splits    | `.txt`, `.csv`, `.pdf` |
| `sentence`  | Sentence boundary splitting        | -                      |
| `token`     | Token-based splitting              | -                      |
| `markdown`  | Markdown structure-aware splitting | `.md`, `.markdown`     |
| `html`      | HTML tag-aware splitting           | `.html`, `.htm`        |
| `json`      | JSON structure-aware splitting     | `.json`                |
| `latex`     | LaTeX structure-aware splitting    | `.tex`, `.latex`       |
| `semantic`  | LLM-powered semantic splitting     | -                      |

#### Format Options

| Format  | Description                                  |
| ------- | -------------------------------------------- |
| `text`  | Human-readable text with chunk separators    |
| `json`  | Full JSON output with all chunk data         |
| `table` | Tabular summary with ID, length, and preview |

#### Examples

**Basic chunking with auto-detected strategy:**

```bash
neurolink rag chunk document.md
```

**Chunk with specific strategy and size:**

```bash
neurolink rag chunk document.txt --strategy recursive --maxSize 500 --overlap 100
```

**Output as JSON to file:**

```bash
neurolink rag chunk document.md --format json --output chunks.json
```

**Extract metadata using LLM:**

```bash
neurolink rag chunk document.md --extract --provider vertex --model gemini-2.5-flash
```

**Verbose output with table format:**

```bash
neurolink rag chunk document.md --format table --verbose
```

#### Output Examples

**Text format (default):**

```
--- Chunk 1 (487 chars) ---
# Introduction

This document covers the basics of RAG processing...

--- Chunk 2 (523 chars) ---
## Architecture

The system consists of three main components...
```

**Table format:**

```
#  | ID       | Length | Preview
---+----------+--------+---------------------------------------------------
1  | a1b2c3d4 | 487    | # Introduction This document covers the basics...
2  | e5f6g7h8 | 523    | ## Architecture The system consists of three m...
```

**JSON format:**

```json
[
  {
    "id": "a1b2c3d4-...",
    "text": "# Introduction\n\nThis document covers...",
    "metadata": {
      "source": "document.md",
      "title": "Introduction",
      "summary": "Overview of RAG processing basics",
      "keywords": ["RAG", "introduction", "basics"]
    }
  }
]
```

---

### 2. `neurolink rag index `

Index a document for semantic search.

#### Syntax

```bash
neurolink rag index  [options]
```

#### Arguments

| Argument | Description               | Required |
| -------- | ------------------------- | -------- |
| `` | Path to the file to index | Yes      |

#### Options

| Option        | Alias | Description                          | Type    | Default                    |
| ------------- | ----- | ------------------------------------ | ------- | -------------------------- |
| `--indexName` | `-n`  | Name for the index                   | string  | Filename without extension |
| `--strategy`  | `-s`  | Chunking strategy to use             | string  | Auto-detected              |
| `--maxSize`   | `-m`  | Maximum chunk size in characters     | number  | `1000`                     |
| `--overlap`   | `-o`  | Overlap between chunks in characters | number  | `200`                      |
| `--provider`  | `-p`  | Provider for embeddings              | string  | From env/config            |
| `--model`     |       | Model for embeddings                 | string  | From env/config            |
| `--graph`     | `-g`  | Build Graph RAG index                | boolean | `false`                    |
| `--verbose`   | `-v`  | Enable verbose output                | boolean | `false`                    |

#### Strategy Options

Same as the `chunk` command. See [Strategy Options](#strategy-options) above.

#### Examples

**Basic indexing:**

```bash
# Uses default provider (Vertex) with automatic embedding model (text-embedding-004)
neurolink rag index document.md
```

**Index with custom name:**

```bash
neurolink rag index document.md --indexName my-docs
```

**Index with Graph RAG:**

```bash
neurolink rag index document.md --graph --verbose
```

**Custom chunking with explicit embedding model:**

```bash
# You can specify an embedding model explicitly
neurolink rag index document.md \
  --strategy markdown \
  --maxSize 800 \
  --overlap 150 \
  --provider openai \
  --model text-embedding-3-small
```

**Using Vertex AI (default):**

```bash
# Provider defaults to Vertex, embedding model auto-selects to text-embedding-004
neurolink rag index document.md --verbose
```

#### Output Examples

**Standard output:**

```
Indexed 15 chunks as "document"
```

**With Graph RAG:**

```
Indexed 15 chunks as "document" with Graph RAG
```

**Verbose output:**

```
Indexed 15 chunks as "document" with Graph RAG

--- Index Summary ---
Index name: document
Total chunks: 15
Embedding dimension: 1536
Graph nodes: 15
Graph edges: 42
```

---

### 3. `neurolink rag query `

Query indexed documents using semantic search.

#### Syntax

```bash
neurolink rag query  [options]
```

#### Arguments

| Argument  | Description         | Required |
| --------- | ------------------- | -------- |
| `` | Search query string | Yes      |

#### Options

| Option        | Alias | Description                       | Type    | Default               |
| ------------- | ----- | --------------------------------- | ------- | --------------------- |
| `--indexName` | `-n`  | Name of the index to query        | string  | First available index |
| `--topK`      | `-k`  | Number of results to return       | number  | `5`                   |
| `--hybrid`    | `-h`  | Use hybrid search (vector + BM25) | boolean | `false`               |
| `--graph`     | `-g`  | Use Graph RAG search              | boolean | `false`               |
| `--provider`  | `-p`  | Provider for embeddings           | string  | From env/config       |
| `--model`     |       | Model for embeddings              | string  | From env/config       |
| `--format`    | `-f`  | Output format                     | string  | `text`                |
| `--verbose`   | `-v`  | Enable verbose output             | boolean | `false`               |

#### Search Modes

| Mode      | Flag       | Description                                         |
| --------- | ---------- | --------------------------------------------------- |
| Vector    | (default)  | Pure vector similarity search using embeddings      |
| Hybrid    | `--hybrid` | Combines vector search with BM25 keyword matching   |
| Graph RAG | `--graph`  | Traverses knowledge graph for context-aware results |

#### Format Options

| Format  | Description                                   |
| ------- | --------------------------------------------- |
| `text`  | Full text results with score headers          |
| `json`  | Complete JSON output with id, score, and text |
| `table` | Compact table with scores and text previews   |

#### Examples

**Basic query:**

```bash
# Uses default provider (Vertex) with automatic embedding model (text-embedding-004)
neurolink rag query "How does RAG processing work?"
```

**Query specific index with more results:**

```bash
neurolink rag query "authentication methods" --indexName my-docs --topK 10
```

**Hybrid search:**

```bash
neurolink rag query "vector embeddings" --hybrid
```

**Graph RAG search:**

```bash
neurolink rag query "system architecture" --graph --verbose
```

**JSON output with OpenAI embeddings:**

```bash
neurolink rag query "API endpoints" --format json --provider openai
```

#### Output Examples

**Text format (default):**

```
Found 5 results

Search Results:

--- Result 1 (Score: 0.8934) ---
RAG processing works by first chunking documents into smaller pieces,
then creating vector embeddings for each chunk...

--- Result 2 (Score: 0.8521) ---
The retrieval phase uses similarity search to find the most relevant
chunks based on the query embedding...
```

**Table format:**

```
Found 5 results

Search Results:

[1] Score: 0.8934
RAG processing works by first chunking documents into smaller pieces, then creating vector embeddings for each chunk...

[2] Score: 0.8521
The retrieval phase uses similarity search to find the most relevant chunks based on the query embedding...
```

**JSON format:**

```json
[
  {
    "id": "a1b2c3d4-...",
    "score": 0.8934,
    "text": "RAG processing works by first chunking documents..."
  },
  {
    "id": "e5f6g7h8-...",
    "score": 0.8521,
    "text": "The retrieval phase uses similarity search..."
  }
]
```

**Verbose output:**

```
Found 5 results

Search Results:
...

--- Query Info ---
Index: document
Query: How does RAG processing work?
Search type: Hybrid
```

---

## Workflow Example

A typical RAG workflow using the CLI:

```bash
# Step 1: Chunk a document to preview the splitting
neurolink rag chunk docs/guide.md --format table --verbose

# Step 2: Index the document for search
# Note: Embedding model is automatically selected based on provider
# Default: Vertex AI with text-embedding-004
neurolink rag index docs/guide.md --indexName guide --graph --verbose

# Step 3: Query the indexed document
# Uses same embedding model as indexing for consistency
neurolink rag query "How do I configure authentication?" --indexName guide --topK 3

# Step 4: Use hybrid search for better results
neurolink rag query "API rate limits" --indexName guide --hybrid --format json

# Alternative: Use OpenAI embeddings
neurolink rag index docs/guide.md --indexName guide-openai --provider openai --verbose
neurolink rag query "authentication" --indexName guide-openai --provider openai
```

---

## Environment Variables

The following environment variables can be used to configure default behavior:

### Provider & Authentication

| Variable                  | Description                              | Default  |
| ------------------------- | ---------------------------------------- | -------- |
| `NEUROLINK_PROVIDER`      | Default AI provider                      | `vertex` |
| `AI_PROVIDER`             | Alternative env var for default provider | `vertex` |
| `GOOGLE_CLOUD_PROJECT_ID` | Google Cloud project ID (for Vertex AI)  | -        |
| `GOOGLE_API_KEY`          | Google AI Studio API key                 | -        |
| `OPENAI_API_KEY`          | OpenAI API key                           | -        |
| `ANTHROPIC_API_KEY`       | Anthropic API key                        | -        |

### Embedding Models (for `index` and `query` commands)

| Variable                       | Description                    | Default                        |
| ------------------------------ | ------------------------------ | ------------------------------ |
| `NEUROLINK_EMBEDDING_MODEL`    | Global default embedding model | Provider-specific default      |
| `VERTEX_EMBEDDING_MODEL`       | Vertex AI embedding model      | `text-embedding-004`           |
| `GOOGLE_EMBEDDING_MODEL`       | Google AI embedding model      | `text-embedding-004`           |
| `OPENAI_EMBEDDING_MODEL`       | OpenAI embedding model         | `text-embedding-3-small`       |
| `AZURE_OPENAI_EMBEDDING_MODEL` | Azure OpenAI embedding model   | `text-embedding-3-small`       |
| `BEDROCK_EMBEDDING_MODEL`      | AWS Bedrock embedding model    | `amazon.titan-embed-text-v2:0` |

### Generation Models (for `chunk --extract` and other text generation)

| Variable             | Description                    | Default            |
| -------------------- | ------------------------------ | ------------------ |
| `VERTEX_MODEL`       | Default model for Vertex AI    | `gemini-2.5-flash` |
| `OPENAI_MODEL`       | Default model for OpenAI       | `gpt-4o`           |
| `AZURE_OPENAI_MODEL` | Default model for Azure OpenAI | Deployment-based   |
| `BEDROCK_MODEL`      | Default model for AWS Bedrock  | Provider-specific  |

### Embedding Model Resolution Order

For `index` and `query` commands, the embedding model is resolved in this order:

1. **CLI `--model` flag** (if it's an embedding model)
2. **`NEUROLINK_EMBEDDING_MODEL`** (global embedding model)
3. **Provider-specific embedding env vars** (e.g., `VERTEX_EMBEDDING_MODEL`)
4. **Provider's default model env var** (if it's an embedding model, e.g., if `VERTEX_MODEL=text-embedding-004`)
5. **Provider-specific default embedding model** (e.g., `text-embedding-004` for Vertex)
6. **Fallback:** OpenAI `text-embedding-3-small`

> **Note:** The RAG CLI is smart about model selection. Even if you have `VERTEX_MODEL=gemini-2.5-flash` set for text generation, the `index` and `query` commands will automatically use the appropriate embedding model for your provider.
>
> If you explicitly specify a model with `--model`, ensure it's an embedding model that supports the `embed()` operation.

---

## Error Handling

### Common Errors

**File not found:**

```
File not found: /path/to/document.md
```

Ensure the file path is correct and the file exists.

**No indexed documents:**

```
No indexed documents found. Run 'neurolink rag index' first.
```

You must index a document before querying. Run `neurolink rag index ` first.

**Index not found:**

```
Index "my-docs" not found.
```

The specified index name doesn't exist. Check available indices or use the default.

---

## Notes

- **In-memory storage:** Currently, indexed documents are stored in memory and will be lost when the process exits. For persistence, use the SDK API with a vector database.
- **Auto-detection:** When `--strategy` is not specified, the chunking strategy is automatically detected based on file extension.
- **Graph RAG:** Building a Graph RAG index (`--graph`) requires additional processing time but enables context-aware traversal during queries.

---

## See Also

- [RAG Feature Guide](/docs/tutorials/rag) - Main RAG documentation with CLI usage
- [RAG Configuration](/docs/deployment/configuration) - Configuration reference

---

## RAG Processing - Configuration Guide

<!-- Source: rag/CONFIGURATION.md -->

# RAG Processing - Configuration Guide

This document provides comprehensive configuration options for the RAG (Retrieval-Augmented Generation) processing system in NeuroLink.

## Overview

The RAG processing system consists of three main components:

1. **Chunkers** - Split documents into smaller, processable segments
2. **Rerankers** - Re-score and re-order search results for relevance
3. **Hybrid Search** - Combine BM25 and vector search for improved retrieval

---------------- | --------------------------------- | --------------------------- |
| `character`         | Fixed-size character splits       | Simple text, logs           |
| `recursive`         | Paragraph/sentence-aware splits   | General documents           |
| `sentence`          | Sentence boundary splitting       | Natural language text       |
| `token`             | Token-based (GPT tokenizer)       | LLM context optimization    |
| `markdown`          | Header-aware markdown parsing     | Documentation, README files |
| `html`              | HTML tag-aware splitting          | Web content                 |
| `json`              | JSON structure-aware              | API responses, config files |
| `latex`             | LaTeX section-aware               | Academic papers             |
| `semantic-markdown` | Semantic markdown with embeddings | Technical documentation     |

### Common Configuration Options

```typescript
type ChunkerConfig = {
  // Maximum chunk size (characters or tokens)
  maxSize: number; // Default: 1000

  // Overlap between chunks (characters or tokens)
  overlap: number; // Default: 100

  // Minimum chunk size (avoid tiny chunks)
  minSize?: number; // Default: 10

  // Document ID for metadata tracking
  documentId?: string; // Default: auto-generated UUID

  // Additional metadata to attach to chunks
  metadata?: Record;

  // Whether to preserve metadata from source document
  preserveMetadata?: boolean; // Default: true
};
```

### Strategy-Specific Configuration

#### Character Chunker

```typescript
const config = {
  maxSize: 1000, // Max characters per chunk
  overlap: 100, // Character overlap between chunks
  separator: "", // No separator (split by character count)
};
```

#### Recursive Chunker

```typescript
const config = {
  maxSize: 1000,
  overlap: 100,
  separators: ["\n\n", "\n", ". ", " ", ""], // Priority order
  keepSeparators: true, // Keep separators in output chunks
};
```

#### Sentence Chunker

```typescript
const config = {
  maxSize: 1000, // Max characters per chunk
  overlap: 1, // Overlap in sentences (not characters)
  minSentences: 1, // Minimum sentences per chunk
  maxSentences: 10, // Maximum sentences per chunk
};
```

#### Token Chunker

```typescript
const config = {
  maxSize: 512, // Max tokens per chunk
  overlap: 50, // Token overlap
  tokenizer: "cl100k_base", // OpenAI tokenizer
};
```

#### Markdown Chunker

```typescript
const config = {
  maxSize: 1000,
  overlap: 100,
  preserveHeaders: true, // Include parent headers in chunks
  codeBlockHandling: "preserve", // 'preserve' | 'split' | 'remove'
};
```

#### HTML Chunker

```typescript
const config = {
  maxSize: 1000,
  overlap: 100,
  preserveTags: ["p", "div", "section", "article"],
  removeTags: ["script", "style", "nav", "footer"],
  extractText: true, // Strip HTML tags from output
};
```

#### JSON Chunker

```typescript
const config = {
  maxSize: 500,
  preserveStructure: true, // Keep valid JSON in chunks
  flattenDepth: 2, // Max nesting depth before flattening
  arrayHandling: "split", // 'split' | 'preserve'
};
```

#### LaTeX Chunker

```typescript
const config = {
  maxSize: 1000,
  overlap: 100,
  sectionCommands: ["\\section", "\\subsection", "\\chapter"],
  preserveMath: true, // Keep math environments intact
  includeComments: false, // Strip LaTeX comments
};
```

#### Semantic Markdown Chunker

```typescript
const config = {
  maxSize: 500,
  overlap: 100,
  semanticThreshold: 0.7, // Similarity threshold for merging
  embedder: "openai", // Embedding provider
};
```

### Usage Examples

```typescript

// List available strategies
const strategies = getAvailableStrategies();
console.log(strategies); // ['character', 'recursive', ...]

// Create a chunker with configuration
const chunker = await createChunker("recursive", {
  maxSize: 500,
  overlap: 50,
});

// Chunk a document
const chunks = await chunker.chunk(documentText, {
  maxSize: 500,
  overlap: 50,
});

// Each chunk has structure:
// {
//   id: string,
//   text: string,
//   metadata: {
//     documentId: string,
//     chunkIndex: number,
//     startOffset: number,
//     endOffset: number,
//     ...customMetadata
//   }
// }
```

---

## Reranker Configuration

### Available Reranker Types

| Type            | Description                   | Requires Model | Use Case                |
| --------------- | ----------------------------- | -------------- | ----------------------- |
| `simple`        | Position + vector score combo | No             | Fast, no-cost reranking |
| `llm`           | LLM semantic scoring          | Yes            | High-quality semantic   |
| `cross-encoder` | Cross-encoder model           | Yes            | Accuracy-focused        |
| `cohere`        | Cohere Rerank API             | Yes (API key)  | Production-grade        |
| `batch`         | Batch LLM reranking           | Yes            | Large result sets       |

### Common Configuration Options

```typescript
type RerankerConfig = {
  // Number of top results to return
  topK: number; // Default: 10

  // Minimum score threshold
  minScore?: number; // Default: 0.0

  // Include original scores in output
  includeOriginalScores?: boolean; // Default: false
};
```

### Type-Specific Configuration

#### Simple Reranker

```typescript
const config = {
  topK: 10,
  positionWeight: 0.3, // Weight for position in results
  scoreWeight: 0.7, // Weight for original vector score
};
```

#### LLM Reranker

```typescript
const config = {
  topK: 5,
  model: "gpt-4",
  temperature: 0.0,
  prompt: "Rate relevance of this passage to the query (0-1):",
  batchSize: 5, // Process in batches
};
```

#### Cross-Encoder Reranker

```typescript
const config = {
  topK: 10,
  model: "cross-encoder/ms-marco-MiniLM-L-12-v2",
  normalize: true, // Normalize scores to 0-1
};
```

#### Cohere Reranker

```typescript
const config = {
  topK: 10,
  model: "rerank-english-v2.0",
  maxChunksPerDoc: 10,
  returnDocuments: false,
};
```

#### Batch Reranker

```typescript
const config = {
  topK: 20,
  batchSize: 10, // Documents per LLM call
  parallelBatches: 3, // Concurrent batches
  model: "gpt-3.5-turbo",
};
```

### Usage Examples

```typescript

// List available types
const types = getAvailableRerankerTypes();
console.log(types); // ['simple', 'llm', 'cross-encoder', 'cohere', 'batch']

// Create a simple reranker (no model required)
const reranker = await createReranker("simple", { topK: 5 });

// Rerank search results
const reranked = await reranker.rerank(searchResults, query, { topK: 5 });

// Each result has structure:
// {
//   id: string,
//   text: string,
//   score: number,
//   originalScore?: number,
//   metadata?: Record
// }
```

---

## Hybrid Search Configuration

### BM25 Index Configuration

```typescript
type BM25Config = {
  // BM25 parameters
  k1: number; // Default: 1.2 (term frequency saturation)
  b: number; // Default: 0.75 (document length normalization)

  // Preprocessing
  lowercase: boolean; // Default: true
  stemming: boolean; // Default: false
  stopwords: string[]; // Default: English stopwords
};
```

### Fusion Methods

#### Reciprocal Rank Fusion (RRF)

```typescript

const fusedScores = reciprocalRankFusion(
  [vectorRankings, bm25Rankings],
  60, // k parameter (default: 60)
);
```

#### Linear Combination

```typescript

const combinedScores = linearCombination(
  vectorScores, // Map
  bm25Scores, // Map
  0.5, // alpha: weight for vector scores (0-1)
);
```

### Hybrid Search Pipeline

```typescript

// Create BM25 index
const bm25Index = new InMemoryBM25Index({ k1: 1.2, b: 0.75 });

// Add documents
await bm25Index.addDocuments([
  { id: "doc1", text: "Document content...", metadata: {} },
  // ...
]);

// Create hybrid search
const hybridSearch = createHybridSearch({
  bm25Index,
  vectorStore, // Your vector store instance
  fusionMethod: "rrf", // 'rrf' | 'linear'
  alpha: 0.5, // Vector weight (for linear fusion)
  k: 60, // RRF parameter
});

// Execute hybrid search
const results = await hybridSearch.search(query, {
  topK: 10,
  filter: { category: "technical" },
});
```

---

## Resilience Configuration

The RAG system includes resilience patterns to handle failures gracefully.

### Circuit Breaker Configuration

Circuit breakers prevent cascading failures by stopping operations when error rates are too high.

```typescript
type RAGCircuitBreakerConfig = {
  // Number of failures before opening circuit
  failureThreshold: number; // Default: 5

  // Time in ms before attempting reset
  resetTimeout: number; // Default: 60000 (1 minute)

  // Max calls allowed in half-open state
  halfOpenMaxCalls: number; // Default: 3

  // Operation timeout in ms
  operationTimeout: number; // Default: 30000 (30 seconds)

  // Minimum calls before calculating failure rate
  minimumCallsBeforeCalculation: number; // Default: 10

  // Time window for statistics in ms
  statisticsWindowSize: number; // Default: 300000 (5 minutes)
};
```

#### Circuit Breaker Usage

```typescript

  getCircuitBreaker,
  executeWithCircuitBreaker,
} from "@juspay/neurolink";

// Create a circuit breaker for vector queries
const breaker = getCircuitBreaker("vector-queries", {
  failureThreshold: 3,
  resetTimeout: 30000,
});

// Execute operation with circuit breaker protection
const result = await breaker.execute(async () => {
  return await vectorStore.query(embedding, { topK: 10 });
}, "vector-query");

// Or use the convenience function
const result = await executeWithCircuitBreaker(
  "embedding-service",
  () => embeddingProvider.embed(text),
  "embedding",
  { failureThreshold: 5 },
);

// Get circuit breaker statistics
const stats = breaker.getStats();
// {
//   state: 'closed' | 'open' | 'half-open',
//   totalCalls: number,
//   failureRate: number,
//   averageLatency: number,
//   p95Latency: number,
//   ...
// }
```

### Retry Handler Configuration

Retry handlers provide automatic retries with exponential backoff for transient failures.

```typescript
type RAGRetryConfig = {
  // Maximum number of retry attempts
  maxRetries: number; // Default: 3

  // Initial delay in ms
  initialDelay: number; // Default: 1000

  // Maximum delay in ms
  maxDelay: number; // Default: 30000

  // Backoff multiplier
  backoffMultiplier: number; // Default: 2

  // Whether to add jitter
  jitter: boolean; // Default: true

  // Retryable HTTP status codes
  retryableStatusCodes?: number[]; // Default: [408, 429, 500, 502, 503, 504]
};
```

#### Retry Handler Usage

```typescript

  withRAGRetry,
  RAGRetryHandler,
  embeddingRetryHandler,
  vectorStoreRetryHandler,
} from "@juspay/neurolink";

// Simple retry wrapper
const result = await withRAGRetry(() => embeddingProvider.embed(text), {
  maxRetries: 5,
  initialDelay: 2000,
});

// Use specialized retry handlers
const embedding = await embeddingRetryHandler.executeWithRetry(() =>
  embeddingProvider.embed(text),
);

const queryResult = await vectorStoreRetryHandler.executeWithRetry(() =>
  vectorStore.query(embedding),
);

// Batch operations with retry
const handler = new RAGRetryHandler({ maxRetries: 3 });
const results = await handler.executeBatch(
  documents,
  async (doc, index) => await processDocument(doc),
  { concurrency: 5, continueOnError: true },
);
// Returns: { successful: [...], failed: [...], successRate: number }
```

#### Specialized Retry Handlers

| Handler                          | maxRetries | initialDelay | Use Case                      |
| -------------------------------- | ---------- | ------------ | ----------------------------- |
| `embeddingRetryHandler`          | 5          | 2000ms       | Embedding API rate limits     |
| `vectorStoreRetryHandler`        | 3          | 1000ms       | Vector store operations       |
| `metadataExtractionRetryHandler` | 3          | 1500ms       | LLM-based metadata extraction |

---

## Metadata Extraction Configuration

The RAG system supports extracting metadata from document chunks using LLMs.

### Extractor Types

| Type        | Description                       | Output                    |
| ----------- | --------------------------------- | ------------------------- |
| `title`     | Extract document title            | `string`                  |
| `summary`   | Generate chunk summary            | `string`                  |
| `keywords`  | Extract relevant keywords         | `string[]`                |
| `questions` | Generate Q&A pairs for retrieval  | `{question, answer}[]`    |
| `custom`    | Custom schema extraction with Zod | `Record` |

### Base Extractor Configuration

```typescript
type BaseExtractorConfig = {
  // Language model to use
  modelName?: string; // e.g., "gpt-4", "claude-3-sonnet"

  // Provider for the model
  provider?: string; // e.g., "openai", "anthropic"

  // Custom prompt template
  promptTemplate?: string;

  // Maximum tokens for LLM response
  maxTokens?: number;

  // Temperature for LLM generation
  temperature?: number;
};
```

### Title Extractor

```typescript
const titleConfig = {
  modelName: "gpt-4",
  nodes: 5, // Number of nodes to analyze
  nodeTemplate: "Extract the main topic from: {text}",
  combineTemplate: "Combine these topics into a title: {topics}",
};
```

### Summary Extractor

```typescript
const summaryConfig = {
  modelName: "gpt-3.5-turbo",
  summaryTypes: ["current", "previous", "next"], // Context-aware summaries
  maxWords: 100, // Maximum summary length
};
```

### Keyword Extractor

```typescript
const keywordConfig = {
  modelName: "gpt-3.5-turbo",
  maxKeywords: 10, // Maximum keywords to extract
  minRelevance: 0.5, // Minimum relevance score (0-1)
};
```

### Question-Answer Extractor

```typescript
const questionConfig = {
  modelName: "gpt-4",
  numQuestions: 5, // Number of Q&A pairs
  includeAnswers: true, // Include answers in output
  embeddingOnly: false, // Generate full questions vs embedding-optimized
};
```

### Usage Example

```typescript

const doc = new MDocument(content, { type: "markdown" });

// Chunk with metadata extraction
const chunks = await doc.chunk({
  strategy: "recursive",
  config: { maxSize: 1000, overlap: 100 },
  extract: {
    title: true,
    summary: { maxWords: 50 },
    keywords: { maxKeywords: 5 },
    questions: { numQuestions: 3 },
  },
});

// Each chunk now includes extracted metadata:
// {
//   id: string,
//   text: string,
//   metadata: {
//     title: "Extracted Title",
//     summary: "Brief summary...",
//     keywords: ["keyword1", "keyword2"],
//     ...
//   }
// }
```

---

## Pipeline Configuration

### Full RAG Pipeline

```typescript

  createChunker,
  createReranker,
  createHybridSearch,
} from "@juspay/neurolink";

// 1. Configure chunker
const chunker = await createChunker("recursive", {
  maxSize: 500,
  overlap: 50,
});

// 2. Configure reranker
const reranker = await createReranker("simple", {
  topK: 5,
});

// 3. Configure hybrid search
const hybridSearch = createHybridSearch({
  bm25Index,
  vectorStore,
  fusionMethod: "rrf",
});

// 4. Process documents
const chunks = await chunker.chunk(document);

// 5. Index chunks (implementation depends on your vector store)
await vectorStore.addDocuments(chunks);
await bm25Index.addDocuments(chunks);

// 6. Search and rerank
const searchResults = await hybridSearch.search(query, { topK: 20 });
const finalResults = await reranker.rerank(searchResults, query, { topK: 5 });
```

---

## Environment Variables

| Variable            | Description                | Required |
| ------------------- | -------------------------- | -------- |
| `OPENAI_API_KEY`    | For LLM/semantic reranking | Optional |
| `COHERE_API_KEY`    | For Cohere reranker        | Optional |
| `ANTHROPIC_API_KEY` | For Claude-based reranking | Optional |

---

## Best Practices

### Chunking

1. **Match chunk size to context window** - Use token chunker for LLMs
2. **Choose strategy by content type** - Markdown for docs, HTML for web
3. **Use overlap for continuity** - 10-20% overlap prevents context loss
4. **Preserve structure** - Use format-aware chunkers when possible

### Reranking

1. **Start simple** - Simple reranker is fast and often sufficient
2. **Use LLM reranking for quality** - When accuracy matters more than speed
3. **Batch for efficiency** - Use batch reranker for large result sets
4. **Consider cost** - API-based rerankers have per-call costs

### Hybrid Search

1. **Balance weights** - Start with 0.5 alpha and tune based on results
2. **RRF is robust** - Less sensitive to score scale differences
3. **Index incrementally** - Update both BM25 and vector indices together
4. **Filter early** - Apply metadata filters before fusion when possible

---

## Troubleshooting

### Common Issues

1. **Empty chunks** - Check if maxSize is too small for content
2. **Overlapping content** - Reduce overlap parameter
3. **Missing context** - Increase chunk size or overlap
4. **Slow reranking** - Use simple reranker or reduce topK
5. **Poor search quality** - Tune BM25 parameters (k1, b)

### Debug Logging

```bash
# Enable verbose logging
DEBUG=neurolink:rag:* npx tsx your-script.ts
```

---

## API Reference

For complete API documentation, see the TypeScript definitions in:

- `src/lib/rag/types.ts` - Core type definitions
- `src/lib/rag/ChunkerFactory.ts` - Chunker factory API
- `src/lib/rag/reranker/RerankerFactory.ts` - Reranker factory API
- `src/lib/rag/retrieval/hybridSearch.ts` - Hybrid search API

## See Also

- [RAG Feature Guide](/docs/tutorials/rag) - Main RAG documentation with quick start and overview
- [RAG Testing Guide](/docs/development/testing) - How to run RAG tests
- [RAG API Reference](../sdk/api-reference) - API documentation

---

## RAG Processing - Testing Guide

<!-- Source: rag/TESTING.md -->

# RAG Processing - Testing Guide

## Prerequisites

### Environment Setup

1. **Node.js**: Version 18+ required
2. **pnpm**: Package manager (install with `npm install -g pnpm`)
3. **TypeScript**: Included in devDependencies

### Build Requirements

Before running tests, ensure the project is built:

```bash
# Full build
pnpm run build

# Or build only what's needed for tests
pnpm run build:cli
```

### Environment Variables

No specific environment variables are required for RAG processing unit tests.

For integration tests with external services (e.g., Cohere reranking), you may need:

```bash
# Optional - for Cohere reranker tests
export COHERE_API_KEY=your_api_key

# Optional - for LLM-based reranking tests
export OPENAI_API_KEY=your_api_key
```

## Running Tests

### Run RAG Test Suite

```bash
# Run the continuous RAG test suite
npx tsx test/continuous-test-suite-rag.ts

# With verbose output
VERBOSE=true npx tsx test/continuous-test-suite-rag.ts
```

### Run Unit Tests (Vitest)

```bash
# Run all RAG-related unit tests
pnpm test test/rag/

# Run specific test files
pnpm test test/rag/ChunkerFactory.test.ts
pnpm test test/rag/ChunkerRegistry.test.ts

# Run with coverage
pnpm run test:coverage -- --include=src/lib/rag/
```

### Run Integration Tests

```bash
# Run RAG integration tests
pnpm test test/rag/integration/

# Run all integration tests
pnpm run test:integration
```

## Test Structure

### Test Suite Organization

```
test/
├── continuous-test-suite-rag.ts    # Main RAG continuous test suite
├── rag/
│   ├── ChunkerFactory.test.ts      # ChunkerFactory unit tests
│   ├── ChunkerRegistry.test.ts     # ChunkerRegistry unit tests
│   ├── integration/
│   │   └── ...                     # Integration tests
│   └── resilience/
│       └── ...                     # Resilience pattern tests
└── fixtures/
    └── rag/
        ├── sample-documents.txt    # Sample text for chunking
        ├── chunker-config.json     # Chunker configurations
        ├── search-queries.json     # Search test queries
        └── reranker-config.json    # Reranker configurations
```

### Test Categories

1. **Chunker Tests**
   - Factory pattern tests
   - Registry pattern tests
   - All 10 chunking strategies
   - Alias resolution
   - Metadata retrieval

2. **Reranker Tests**
   - Factory pattern tests
   - Registry pattern tests
   - Simple reranking
   - Alias resolution
   - Model-free rerankers

3. **Hybrid Search Tests**
   - BM25 indexing and search
   - Reciprocal Rank Fusion (RRF)
   - Linear combination
   - Score normalization

4. **Integration Tests**
   - End-to-end chunking pipeline
   - Multiple chunker comparison
   - Error handling

## Expected Results

### Chunker Strategies Tested

| Strategy          | Description                 | Test Coverage |
| ----------------- | --------------------------- | ------------- |
| character         | Fixed-size character chunks | Full          |
| recursive         | Paragraph/sentence-based    | Full          |
| sentence          | Sentence boundary splitting | Full          |
| token             | Token-based (GPT tokenizer) | Full          |
| markdown          | Header-aware markdown       | Full          |
| html              | HTML tag-aware              | Full          |
| json              | JSON structure-aware        | Full          |
| latex             | LaTeX section-aware         | Full          |
| semantic          | Semantic similarity-based   | Full          |
| semantic-markdown | Semantic markdown           | Full          |

### Reranker Types Tested

| Type          | Description             | Requires Model |
| ------------- | ----------------------- | -------------- |
| simple        | Position + vector score | No             |
| llm           | LLM semantic scoring    | Yes            |
| cross-encoder | Cross-encoder model     | Yes            |
| cohere        | Cohere Rerank API       | Yes (API)      |
| batch         | Batch LLM reranking     | Yes            |

## Troubleshooting

### Common Issues

1. **Module not found errors**

   ```bash
   # Ensure build is up to date
   pnpm run build
   ```

2. **Timeout errors**
   - Increase timeout in TEST_CONFIG
   - Check for slow file I/O

3. **Memory issues with large documents**
   - Reduce chunk size in config
   - Process documents in batches

### Debug Mode

Enable verbose logging:

```bash
VERBOSE=true DEBUG=neurolink:rag:* npx tsx test/continuous-test-suite-rag.ts
```

## Adding New Tests

### Adding a Chunker Test

```typescript
// In continuous-test-suite-rag.ts
const newChunkerTest = async (): Promise => {
  const chunker = await createChunker("new-strategy", { maxSize: 500 });
  const chunks = await chunker.chunk(testText, { maxSize: 500 });
  // Validate chunks...
  return true;
};
```

### Adding a Reranker Test

```typescript
// In continuous-test-suite-rag.ts
const newRerankerTest = async (): Promise => {
  const reranker = await createReranker("new-type", { topK: 3 });
  const results = await reranker.rerank(mockResults, query);
  // Validate results...
  return true;
};
```

## See Also

- [RAG Feature Guide](/docs/tutorials/rag) - Main RAG documentation
- [RAG Configuration](/docs/deployment/configuration) - Detailed configuration options

---

## RAG Processing - Manual Verification Checklist

<!-- Source: rag/VERIFICATION.md -->

# RAG Processing - Manual Verification Checklist

This document provides a comprehensive manual verification checklist for the RAG (Retrieval-Augmented Generation) processing feature in NeuroLink.

## 1. Chunker Verification

### 1.1 ChunkerFactory Tests

| Test                             | Command/Action                                                  | Expected Result                                      | Status |
| -------------------------------- | --------------------------------------------------------------- | ---------------------------------------------------- | ------ |
| Singleton instance               | `ChunkerFactory.getInstance() === ChunkerFactory.getInstance()` | Returns same instance                                | [ ]    |
| Available strategies             | `getAvailableStrategies()`                                      | Returns array with 9+ strategies                     | [ ]    |
| Create character chunker         | `createChunker('character')`                                    | Returns chunker with `strategy: 'character'`         | [ ]    |
| Create recursive chunker         | `createChunker('recursive')`                                    | Returns chunker with `strategy: 'recursive'`         | [ ]    |
| Create sentence chunker          | `createChunker('sentence')`                                     | Returns chunker with `strategy: 'sentence'`          | [ ]    |
| Create token chunker             | `createChunker('token')`                                        | Returns chunker with `strategy: 'token'`             | [ ]    |
| Create markdown chunker          | `createChunker('markdown')`                                     | Returns chunker with `strategy: 'markdown'`          | [ ]    |
| Create HTML chunker              | `createChunker('html')`                                         | Returns chunker with `strategy: 'html'`              | [ ]    |
| Create JSON chunker              | `createChunker('json')`                                         | Returns chunker with `strategy: 'json'`              | [ ]    |
| Create LaTeX chunker             | `createChunker('latex')`                                        | Returns chunker with `strategy: 'latex'`             | [ ]    |
| Create semantic-markdown chunker | `createChunker('semantic-markdown')`                            | Returns chunker with `strategy: 'semantic-markdown'` | [ ]    |

### 1.2 Alias Resolution Tests

| Alias  | Expected Strategy | Status |
| ------ | ----------------- | ------ |
| `char` | `character`       | [ ]    |
| `md`   | `markdown`        | [ ]    |
| `tok`  | `token`           | [ ]    |
| `sent` | `sentence`        | [ ]    |
| `tex`  | `latex`           | [ ]    |

### 1.3 ChunkerRegistry Tests

| Test                   | Command/Action                                                    | Expected Result                | Status |
| ---------------------- | ----------------------------------------------------------------- | ------------------------------ | ------ |
| Singleton instance     | `ChunkerRegistry.getInstance() === ChunkerRegistry.getInstance()` | Returns same instance          | [ ]    |
| Get available chunkers | `getAvailableChunkers()`                                          | Returns array with 9+ chunkers | [ ]    |
| Has valid chunker      | `chunkerRegistry.hasChunker('recursive')`                         | Returns `true`                 | [ ]    |
| Has invalid chunker    | `chunkerRegistry.hasChunker('invalid')`                           | Returns `false`                | [ ]    |
| Get by use case        | `chunkerRegistry.getChunkersByUseCase('documentation')`           | Includes 'markdown'            | [ ]    |

### 1.4 Chunking Execution Tests

For each chunker, verify the following with sample text:

```typescript
const chunks = await chunker.chunk(sampleText, { maxSize: 200 });
```

| Chunker           | Chunks Generated | Valid Structure | Metadata Present | Status |
| ----------------- | ---------------- | --------------- | ---------------- | ------ |
| character         | >0 chunks        | [ ]             | [ ]              | [ ]    |
| recursive         | >0 chunks        | [ ]             | [ ]              | [ ]    |
| sentence          | >0 chunks        | [ ]             | [ ]              | [ ]    |
| token             | >0 chunks        | [ ]             | [ ]              | [ ]    |
| markdown          | >0 chunks        | [ ]             | [ ]              | [ ]    |
| html              | >0 chunks        | [ ]             | [ ]              | [ ]    |
| json              | >0 chunks        | [ ]             | [ ]              | [ ]    |
| latex             | >0 chunks        | [ ]             | [ ]              | [ ]    |
| semantic-markdown | >0 chunks        | [ ]             | [ ]              | [ ]    |

**Chunk structure validation:**

```typescript
// Each chunk should have:
{
  id: string,           // Non-empty UUID
  text: string,         // Non-empty content
  metadata: {
    documentId: string, // Parent document ID
    chunkIndex: number, // 0-based index
    startOffset: number,
    endOffset: number
  }
}
```

---

## 2. Reranker Verification

### 2.1 RerankerFactory Tests

| Test                   | Command/Action                                                    | Expected Result                              | Status |
| ---------------------- | ----------------------------------------------------------------- | -------------------------------------------- | ------ |
| Singleton instance     | `RerankerFactory.getInstance() === RerankerFactory.getInstance()` | Returns same instance                        | [ ]    |
| Available types        | `getAvailableRerankerTypes()`                                     | Returns array with 5 types                   | [ ]    |
| Create simple reranker | `createReranker('simple')`                                        | Returns reranker with `type: 'simple'`       | [ ]    |
| Get metadata           | `getRerankerMetadata('simple')`                                   | Returns description, defaultConfig, useCases | [ ]    |
| Model-free list        | `rerankerFactory.getModelFreeRerankers()`                         | Includes 'simple'                            | [ ]    |

### 2.2 Reranker Alias Resolution Tests

| Alias      | Expected Type          | Status |
| ---------- | ---------------------- | ------ |
| `fast`     | `simple`               | [ ]    |
| `basic`    | `simple`               | [ ]    |
| `semantic` | `llm` (requires model) | [ ]    |

### 2.3 RerankerRegistry Tests

| Test                 | Command/Action                                                      | Expected Result                 | Status |
| -------------------- | ------------------------------------------------------------------- | ------------------------------- | ------ |
| Singleton instance   | `RerankerRegistry.getInstance() === RerankerRegistry.getInstance()` | Returns same instance           | [ ]    |
| Available rerankers  | `getAvailableRerankers()`                                           | Returns array with 4+ rerankers | [ ]    |
| Has valid reranker   | `rerankerRegistry.hasReranker('simple')`                            | Returns `true`                  | [ ]    |
| Has invalid reranker | `rerankerRegistry.hasReranker('invalid')`                           | Returns `false`                 | [ ]    |
| Get by use case      | `rerankerRegistry.getRerankersByUseCase('fast')`                    | Includes 'simple'               | [ ]    |

### 2.4 Reranking Execution Tests

```typescript
const results = [
  { id: "doc1", text: "Machine learning...", score: 0.85 },
  { id: "doc2", text: "Neural networks...", score: 0.92 },
  { id: "doc3", text: "Data science...", score: 0.78 },
];

const reranked = await reranker.rerank(results, "query", { topK: 3 });
```

| Test                               | Expected Result                          | Status |
| ---------------------------------- | ---------------------------------------- | ------ |
| Simple rerank returns topK results | `reranked.length === 3`                  | [ ]    |
| Results sorted by score descending | `reranked[0].score >= reranked[1].score` | [ ]    |
| All results have id, text, score   | Each has required fields                 | [ ]    |

---

## 3. Hybrid Search Verification

### 3.1 BM25 Index Tests

| Test                   | Command/Action                       | Expected Result         | Status |
| ---------------------- | ------------------------------------ | ----------------------- | ------ |
| Create index           | `new InMemoryBM25Index()`            | Index created           | [ ]    |
| Add documents          | `await bm25Index.addDocuments(docs)` | Documents indexed       | [ ]    |
| Search returns results | `await bm25Index.search('query', 3)` | Returns up to 3 results | [ ]    |
| Results have scores    | Each result has `score` field        | [ ]                     |
| Results match query    | Top results contain query terms      | [ ]                     |

### 3.2 Fusion Method Tests

#### Reciprocal Rank Fusion (RRF)

```typescript
const vectorRanking = [
  { id: "doc1", rank: 1 },
  { id: "doc2", rank: 2 },
];
const bm25Ranking = [
  { id: "doc2", rank: 1 },
  { id: "doc1", rank: 2 },
];
const fused = reciprocalRankFusion([vectorRanking, bm25Ranking], 60);
```

| Test                                  | Expected Result                | Status |
| ------------------------------------- | ------------------------------ | ------ |
| Fused scores exist                    | `fused.size > 0`               | [ ]    |
| Docs in both lists have higher scores | doc1, doc2 scores > doc3 score | [ ]    |

#### Linear Combination

```typescript
const vectorScores = new Map([
  ["doc1", 0.9],
  ["doc2", 0.7],
]);
const bm25Scores = new Map([
  ["doc1", 0.6],
  ["doc2", 0.8],
]);
const combined = linearCombination(vectorScores, bm25Scores, 0.5);
```

| Test                        | Expected Result          | Status |
| --------------------------- | ------------------------ | ------ |
| Combined scores exist       | `combined.size > 0`      | [ ]    |
| Scores are weighted average | doc1: ~0.75, doc2: ~0.75 | [ ]    |

---

## 4. Integration Tests

### 4.1 End-to-End Chunking Pipeline

```typescript
// 1. Create chunker
const chunker = await createChunker("markdown", { maxSize: 300 });

// 2. Chunk document
const chunks = await chunker.chunk(markdownDocument, { maxSize: 300 });

// 3. Validate
```

| Test                   | Expected Result             | Status |
| ---------------------- | --------------------------- | ------ |
| Chunks generated       | `chunks.length > 0`         | [ ]    |
| All chunks valid       | All have id, text, metadata | [ ]    |
| Chunk sizes reasonable | Average  0` | [ ]    |

### 4.2 Multiple Chunker Comparison

| Chunker   | Same Input | Produces Chunks | Different Results | Status |
| --------- | ---------- | --------------- | ----------------- | ------ |
| character | ✓          | [ ]             | [ ]               | [ ]    |
| sentence  | ✓          | [ ]             | [ ]               | [ ]    |
| recursive | ✓          | [ ]             | [ ]               | [ ]    |

---

## 5. Error Handling Tests

| Test                     | Action                          | Expected Result                           | Status |
| ------------------------ | ------------------------------- | ----------------------------------------- | ------ |
| Invalid chunker strategy | `createChunker('invalid-xyz')`  | Throws "Unknown chunking strategy"        | [ ]    |
| Invalid reranker type    | `createReranker('invalid-xyz')` | Throws "Unknown reranker type"            | [ ]    |
| Empty input to chunker   | `chunker.chunk('')`             | Returns empty array or handles gracefully | [ ]    |
| Null input to chunker    | `chunker.chunk(null)`           | Throws error or handles gracefully        | [ ]    |

---

## 6. Performance Verification

### 6.1 Chunking Performance

Test with documents of varying sizes:

| Document Size | Chunker   | Time (ms) | Memory   | Status |
| ------------- | --------- | --------- | -------- | ------ |
| 1 KB          | recursive | < 100     | < 10 MB  | [ ]    |
| 10 KB         | recursive | < 500     | < 50 MB  | [ ]    |
| 100 KB        | recursive | < 2000    | < 200 MB | [ ]    |

### 6.2 Reranking Performance

| Results Count | Reranker | Time (ms) | Status |
| ------------- | -------- | --------- | ------ |
| 10            | simple   | < 10      | [ ]    |
| 100           | simple   | < 50      | [ ]    |
| 1000          | simple   | < 500     | [ ]    |

---

## 7. Test Suite Execution

### Run Continuous Test Suite

```bash
npx tsx test/continuous-test-suite-rag.ts
```

| Test Suite          | Status   |
| ------------------- | -------- |
| ChunkerFactory      | [ ] PASS |
| ChunkerRegistry     | [ ] PASS |
| All 9 Chunkers      | [ ] PASS |
| RerankerFactory     | [ ] PASS |
| RerankerRegistry    | [ ] PASS |
| Simple Reranking    | [ ] PASS |
| Hybrid Search       | [ ] PASS |
| Chunker Integration | [ ] PASS |
| Error Handling      | [ ] PASS |

### Run Unit Tests

```bash
pnpm test test/rag/
```

| Test File                           | Status   |
| ----------------------------------- | -------- |
| ChunkerFactory.test.ts              | [ ] PASS |
| ChunkerRegistry.test.ts             | [ ] PASS |
| integration/rag.integration.test.ts | [ ] PASS |
| resilience/RetryHandler.test.ts     | [ ] PASS |
| resilience/CircuitBreaker.test.ts   | [ ] PASS |

---

## 8. Documentation Verification

| Document         | Exists | Accurate | Complete | Status |
| ---------------- | ------ | -------- | -------- | ------ |
| TESTING.md       | [ ]    | [ ]      | [ ]      | [ ]    |
| CONFIGURATION.md | [ ]    | [ ]      | [ ]      | [ ]    |
| VERIFICATION.md  | [ ]    | [ ]      | [ ]      | [ ]    |
| CLI-COVERAGE.md  | [ ]    | [ ]      | [ ]      | [ ]    |

---

## Sign-off

| Role      | Name | Date | Signature |
| --------- | ---- | ---- | --------- |
| Developer |      |      |           |
| QA        |      |      |           |
| Tech Lead |      |      |           |

---

## Notes

_Add any observations, issues, or recommendations here:_

```
_______________________________________________________________________________
_______________________________________________________________________________
_______________________________________________________________________________
```

---

# Implementation Guides

## RAG Document Processing - Implementation Guide

<!-- Source: implementation-guides/14-rag-document-processing.md -->

# RAG Document Processing - Implementation Guide

> **User Documentation**: For user-facing documentation, see the [RAG Feature Guide](/docs/tutorials/rag).

## Status: 100% Complete

**Last Updated:** January 31, 2026

## Overview

The RAG (Retrieval-Augmented Generation) Document Processing feature provides comprehensive capabilities for processing, chunking, embedding, and retrieving documents for AI-powered applications. This implementation follows NeuroLink's Factory + Registry patterns for consistency and extensibility.

## Components

### 1. Document Loading (`/src/lib/rag/document/`)

- **MDocument**: Fluent document processing class
- **Loaders**: TextLoader, MarkdownLoader, HTMLLoader, JSONLoader, CSVLoader, PDFLoader, WebLoader
- **Functions**: `loadDocument()`, `loadDocuments()`

### 2. Chunking Strategies (`/src/lib/rag/chunkers/` & `/src/lib/rag/chunking/`)

10 chunking strategies available:

| Strategy            | Description                         | Use Cases                   |
| ------------------- | ----------------------------------- | --------------------------- |
| `character`         | Fixed-size character chunks         | Simple text processing      |
| `recursive`         | Ordered separator-based splitting   | General documents (default) |
| `sentence`          | Sentence boundary splitting         | Q&A applications            |
| `token`             | Token-aware splitting               | Model-specific optimization |
| `markdown`          | Header-based markdown splitting     | Documentation               |
| `html`              | Semantic tag-based HTML splitting   | Web content                 |
| `json`              | Object boundary JSON splitting      | Structured data             |
| `latex`             | Section/environment LaTeX splitting | Academic papers             |
| `semantic`          | Semantic similarity-based chunking  | Context-aware splitting     |
| `semantic-markdown` | Semantic similarity + markdown      | Knowledge bases             |

**Factory & Registry Pattern:**

```typescript

  ChunkerFactory,
  ChunkerRegistry,
  createChunker,
} from "@juspay/neurolink";

// Using factory
const chunker = await ChunkerFactory.getInstance().createChunker("markdown", {
  maxSize: 1000,
});

// Using convenience function
const chunker = await createChunker("recursive", { overlap: 100 });

// Using registry
const chunker = await ChunkerRegistry.getInstance().getChunker("semantic-md");
```

### 3. Metadata Extraction (`/src/lib/rag/metadata/`)

**NEW: MetadataExtractorFactory & MetadataExtractorRegistry**

LLM-powered metadata extraction supporting:

- Title extraction
- Summary generation
- Keyword extraction
- Q&A pair generation
- Custom schema extraction

**Extractor Types:**

| Type        | Description                 | Extraction Types |
| ----------- | --------------------------- | ---------------- |
| `llm`       | Full LLM-powered extraction | All types        |
| `title`     | Title-only extraction       | title            |
| `summary`   | Summary-only extraction     | summary          |
| `keywords`  | Keyword-only extraction     | keywords         |
| `questions` | Q&A generation              | questions        |
| `custom`    | Custom schema extraction    | custom           |
| `composite` | Multi-type extraction       | All types        |

**Usage:**

```typescript

  MetadataExtractorFactory,
  createMetadataExtractor,
  metadataExtractorRegistry,
} from "@juspay/neurolink";

// Using factory
const extractor = await MetadataExtractorFactory.getInstance().createExtractor(
  "title",
  {
    provider: "openai",
    modelName: "gpt-4o-mini",
  },
);

// Using convenience function
const extractor = await createMetadataExtractor("keywords");

// Extract metadata
const results = await extractor.extract(chunks, { keywords: true });
```

### 4. Reranking (`/src/lib/rag/reranker/`)

**NEW: RerankerFactory & RerankerRegistry**

Multi-factor scoring system for reranking retrieval results.

**Reranker Types:**

| Type            | Description                     | Requires Model    |
| --------------- | ------------------------------- | ----------------- |
| `llm`           | LLM-powered semantic reranking  | Yes               |
| `cross-encoder` | Cross-encoder relevance scoring | Yes               |
| `cohere`        | Cohere Rerank API               | No (external API) |
| `simple`        | Position + vector score only    | No                |
| `batch`         | Batch LLM reranking             | Yes               |

**Usage:**

```typescript

  RerankerFactory,
  createReranker,
  rerankerFactory,
} from "@juspay/neurolink";

// Set model provider for LLM-based rerankers
rerankerFactory.setModelProvider(aiProvider);

// Create reranker
const reranker = await createReranker("llm", { topK: 5 });

// Rerank results
const reranked = await reranker.rerank(vectorResults, query);
```

### 5. Retrieval (`/src/lib/rag/retrieval/`)

- **Vector Query Tool**: `createVectorQueryTool()` with metadata filtering
- **Hybrid Search**: `createHybridSearch()` combining BM25 + vector
- **In-Memory Stores**: `InMemoryVectorStore`, `InMemoryBM25Index`
- **Fusion Methods**: `reciprocalRankFusion()`, `linearCombination()`

### 6. Graph RAG (`/src/lib/rag/graphRag/`)

Knowledge graph-based retrieval using:

- Node and edge graph structure
- Random walk algorithms
- Semantic similarity thresholds

### 7. RAG Pipeline (`/src/lib/rag/pipeline/`)

Full pipeline orchestration:

```typescript

const pipeline = new RAGPipeline({
  embeddingModel: { provider: "openai", modelName: "text-embedding-3-small" },
  generationModel: { provider: "openai", modelName: "gpt-4o-mini" },
});

await pipeline.ingest(["./docs/*.md"]);
const response = await pipeline.query("What are the key features?");
```

### 8. Resilience (`/src/lib/rag/resilience/`)

- **CircuitBreaker**: Fault tolerance pattern
- **RetryHandler**: Configurable retry with backoff

### 9. Error Handling (`/src/lib/rag/errors/`)

Typed errors for all RAG operations:

- `ChunkingError`
- `MetadataExtractionError`
- `EmbeddingError`
- `VectorQueryError`
- `RerankerError`
- `GraphRAGError`
- `PipelineError`
- `RAGCircuitBreakerError`

## Factory + Registry Patterns

All major components follow NeuroLink's Factory + Registry patterns:

| Component           | Factory                    | Registry                    |
| ------------------- | -------------------------- | --------------------------- |
| Chunkers            | `ChunkerFactory`           | `ChunkerRegistry`           |
| Rerankers           | `RerankerFactory`          | `RerankerRegistry`          |
| Metadata Extractors | `MetadataExtractorFactory` | `MetadataExtractorRegistry` |

### Pattern Benefits

1. **Lazy Loading**: Dynamic imports prevent circular dependencies
2. **Singleton Management**: Consistent lifecycle across the SDK
3. **Alias Support**: Multiple names for same component (e.g., 'md' → 'markdown')
4. **Metadata Discovery**: Rich metadata for tooling and documentation
5. **Type Safety**: Full TypeScript support with exported types

## API Reference

### Convenience Functions

```typescript
// Chunkers

  createChunker,
  getAvailableStrategies,
  getChunkerMetadata,
} from "@juspay/neurolink";

// Rerankers

  createReranker,
  getAvailableRerankerTypes,
  getRerankerMetadata,
} from "@juspay/neurolink";

// Metadata Extractors

  createMetadataExtractor,
  getAvailableExtractorTypes,
  getExtractorMetadata,
} from "@juspay/neurolink";

// Document Processing

```

### Type Exports

```typescript

  // Chunking
  Chunk,
  ChunkMetadata,
  ChunkerConfig,
  ChunkingStrategy,

  // Metadata
  ExtractParams,
  ExtractionResult,
  MetadataExtractor,
  MetadataExtractorType,
  MetadataExtractorConfig,

  // Reranking
  Reranker,
  RerankerType,
  RerankerConfig,
  RerankResult,
  RerankerOptions,

  // Retrieval
  VectorQueryResult,
  MetadataFilter,
  HybridSearchConfig,

  // Graph RAG
  GraphNode,
  GraphEdge,
  GraphQueryParams,

  // Pipeline
  RAGPipelineConfig,
  RAGResponse,
} from "@juspay/neurolink";
```

## Implementation Notes

### Dynamic Imports

All factory registrations use dynamic imports to avoid circular dependencies:

```typescript
this.registerChunker(
  "markdown",
  async (config?: ChunkerConfig) => {
    const { MarkdownChunker } = await import("./chunkers/MarkdownChunker.js");
    return new MarkdownChunker(config);
  },
  metadata,
);
```

### Error Handling

Use the specialized error classes for proper error identification:

```typescript

  isRAGError,
  isRetryableRAGError,
  isPartialFailure,
} from "@juspay/neurolink";

try {
  await pipeline.ingest(files);
} catch (error) {
  if (isPartialFailure(error)) {
    console.log(
      `Processed ${error.successfulChunks} of ${error.successfulChunks + error.failedChunks}`,
    );
  }
}
```

## Migration from Previous Versions

If upgrading from a version without Factory/Registry patterns:

```typescript
// Old way

const result = await rerank(results, query, model);

// New way (with factory)

rerankerFactory.setModelProvider(model);
const reranker = await createReranker("llm");
const result = await reranker.rerank(results, query);

// Direct function still works for backwards compatibility

const result = await rerank(results, query, model);
```

## RAG Integration with generate()/stream() (v9.2.0)

### Simplified API

The `rag: { files }` option on `generate()` and `stream()` provides automatic RAG pipeline setup:

```typescript
const result = await neurolink.generate({
  prompt: "What is this about?",
  rag: { files: ["./docs/guide.md"], strategy: "markdown", topK: 5 },
});
```

**Implementation:** `src/lib/rag/ragIntegration.ts` exports `prepareRAGTool()` which:

1. Loads files from disk
2. Auto-detects chunking strategy from file extension
3. Chunks content using ChunkerRegistry
4. Generates embeddings (character-frequency hash, 128 dimensions)
5. Stores in InMemoryVectorStore
6. Returns a Vercel AI SDK `Tool` with Zod parameters

**Injection points in `src/lib/neurolink.ts`:**

- `generate()` method (~line 1942): Dynamic import of ragIntegration, tool injection, system prompt append
- `stream()` method (~line 3037): Identical pattern

### Streaming Tool Architecture (v9.2.0)

`BaseProvider.stream()` now centrally pre-merges base tools (MCP/built-in) with user-provided tools (including RAG) into `options.tools` before calling provider-specific `executeStream()`.

**Provider fixes:** All 10 providers updated to use `options.tools || await this.getAllTools()` pattern:

- `openRouter.ts`, `amazonBedrock.ts`, `ollama.ts`, `huggingFace.ts` - explicit fix
- `openAI.ts`, `anthropic.ts`, `mistral.ts`, `litellm.ts` - simplified to use pre-merged tools
- `googleVertex.ts`, `googleAiStudio.ts` - already fixed

### vectorQueryTool Zod Migration (v9.2.0)

`createVectorQueryTool()` now returns Zod schemas for `parameters` instead of raw JSON Schema objects. This ensures compatibility with Vercel AI SDK's `generateText`/`streamText` which require Zod schemas for tool parameter definitions.

### CLI Flags (v9.2.0)

Five new flags on `generate`, `stream`, `batch` commands:

- `--rag-files` (string[]) - File paths to load
- `--rag-strategy` (string) - Chunking strategy
- `--rag-chunk-size` (number) - Max chunk size (default: 1000)
- `--rag-chunk-overlap` (number) - Chunk overlap (default: 200)
- `--rag-top-k` (number) - Top results (default: 5)

### New Exports

```typescript
// Types
export type { RAGConfig } from "./rag/types.js";
export type { RAGPreparedTool } from "./rag/ragIntegration.js";

// Functions
export { prepareRAGTool } from "./rag/ragIntegration.js";
```

### Key Files

| File                                  | Purpose                                |
| ------------------------------------- | -------------------------------------- |
| `src/lib/rag/ragIntegration.ts`       | `prepareRAGTool()` - auto RAG pipeline |
| `src/lib/rag/types.ts`                | `RAGConfig` type definition            |
| `src/lib/types/generateTypes.ts`      | `rag?: RAGConfig` on GenerateOptions   |
| `src/lib/types/streamTypes.ts`        | `rag?: RAGConfig` on StreamOptions     |
| `src/lib/core/baseProvider.ts`        | Central tool merge in stream()         |
| `src/lib/neurolink.ts`                | RAG injection in generate/stream       |
| `src/cli/factories/commandFactory.ts` | CLI --rag-files flags                  |

## Testing

```bash
# Run RAG tests
pnpm run test:rag

# Run specific test suites
pnpm vitest run test/rag/chunkers.test.ts
pnpm vitest run test/rag/reranker.test.ts
pnpm vitest run test/rag/metadata.test.ts
```

## Related Documentation

- Vector Store Integrations
- Evaluation and Scoring
- Master Implementation Guide

---

# Api

## NeuroLink API Reference v8.42.0

<!-- Source: api/README.md -->

**NeuroLink API Reference v8.42.0**

---

# NeuroLink API Reference v8.42.0

NeuroLink AI Toolkit

A unified AI provider interface with support for 14+ providers,
automatic fallback, streaming, MCP tool integration, HITL security,
Redis persistence, and enterprise-grade middleware.

NeuroLink provides comprehensive AI functionality with battle-tested
patterns extracted from production systems at Juspay.

## Example

```typescript

// Create NeuroLink instance
const neurolink = new NeuroLink();

// Generate with any provider
const result = await neurolink.generate({
  input: { text: "Explain quantum computing" },
  provider: "vertex",
  model: "gemini-3-flash",
});

console.log(result.content);
```

## Since

1.0.0

## Enumerations

- [AIProviderName](/docs/api/enumerations/AIProviderName)
- [BedrockModels](/docs/api/enumerations/BedrockModels)
- [OpenAIModels](/docs/api/enumerations/OpenAIModels)
- [VertexModels](/docs/api/enumerations/VertexModels)

## Classes

### Core

- [NeuroLink](/docs/api/classes/NeuroLink)

### Other

- [AIProviderFactory](/docs/api/classes/AIProviderFactory)
- [NeuroLinkOAuthProvider](/docs/api/classes/NeuroLinkOAuthProvider)
- [InMemoryTokenStorage](/docs/api/classes/InMemoryTokenStorage)
- [FileTokenStorage](/docs/api/classes/FileTokenStorage)
- [HTTPRateLimiter](/docs/api/classes/HTTPRateLimiter)
- [RateLimiterManager](/docs/api/classes/RateLimiterManager)
- [MCPCircuitBreaker](/docs/api/classes/MCPCircuitBreaker)
- [CircuitBreakerManager](/docs/api/classes/CircuitBreakerManager)
- [MiddlewareFactory](/docs/api/classes/MiddlewareFactory)

## Type Aliases

- [AnalyticsData](/docs/api/type-aliases/AnalyticsData)
- [EvaluationData](/docs/api/type-aliases/EvaluationData)
- [GenerateOptions](/docs/api/type-aliases/GenerateOptions)
- [GenerateResult](/docs/api/type-aliases/GenerateResult)
- [EnhancedProvider](/docs/api/type-aliases/EnhancedProvider)
- [TextGenerationOptions](/docs/api/type-aliases/TextGenerationOptions)
- [TextGenerationResult](/docs/api/type-aliases/TextGenerationResult)
- [MCPServerInfo](/docs/api/type-aliases/MCPServerInfo)
- [DiscoveredMcp](/docs/api/type-aliases/DiscoveredMcp)
- [McpMetadata](/docs/api/type-aliases/McpMetadata)
- [OAuthTokens](/docs/api/type-aliases/OAuthTokens)
- [TokenStorage](/docs/api/type-aliases/TokenStorage)
- [MCPOAuthConfig](/docs/api/type-aliases/MCPOAuthConfig)
- [OAuthClientInformation](/docs/api/type-aliases/OAuthClientInformation)
- [AuthorizationUrlResult](/docs/api/type-aliases/AuthorizationUrlResult)
- [TokenExchangeRequest](/docs/api/type-aliases/TokenExchangeRequest)
- [~~RateLimitConfig~~](/docs/api/type-aliases/RateLimitConfig)
- [HTTPRetryConfig](/docs/api/type-aliases/HTTPRetryConfig)
- [NeuroLinkMiddleware](/docs/api/type-aliases/NeuroLinkMiddleware)
- [MiddlewareConfig](/docs/api/type-aliases/MiddlewareConfig)
- [MiddlewareContext](/docs/api/type-aliases/MiddlewareContext)
- [MiddlewarePreset](/docs/api/type-aliases/MiddlewarePreset)
- [MiddlewareFactoryOptions](/docs/api/type-aliases/MiddlewareFactoryOptions)
- [DynamicModelConfig](/docs/api/type-aliases/DynamicModelConfig)
- [ModelRegistry](/docs/api/type-aliases/ModelRegistry)
- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig)
- [LangfuseSpanAttributes](/docs/type-aliases/langfusespanattributes)
- [TraceNameFormat](/docs/type-aliases/tracenameformat)
- [OpenTelemetryConfig](/docs/api/type-aliases/OpenTelemetryConfig)
- [ObservabilityConfig](/docs/api/type-aliases/ObservabilityConfig)
- [SupportedModelName](/docs/api/type-aliases/SupportedModelName)
- [AIModelProviderConfig](/docs/api/type-aliases/AIModelProviderConfig)
- [AIProvider](/docs/api/type-aliases/AIProvider)
- [ProviderAttempt](/docs/api/type-aliases/ProviderAttempt)
- [StreamingOptions](/docs/api/type-aliases/StreamingOptions)
- [ExecutionContext](/docs/api/type-aliases/ExecutionContext)
- [ToolInfo](/docs/api/type-aliases/ToolInfo)
- [ToolExecutionResult](/docs/api/type-aliases/ToolExecutionResult)
- [ToolContext](/docs/api/type-aliases/ToolContext)
- [ToolResult](/docs/api/type-aliases/ToolResult)
- [ToolDefinition](/docs/api/type-aliases/ToolDefinition)
- [LogLevel](/docs/api/type-aliases/LogLevel)

## Variables

- [dynamicModelProvider](/docs/api/variables/dynamicModelProvider)
- [VERSION](/docs/api/variables/VERSION)
- [DEFAULT_RATE_LIMIT_CONFIG](/docs/api/variables/DEFAULT_RATE_LIMIT_CONFIG)
- [globalRateLimiterManager](/docs/api/variables/globalRateLimiterManager)
- [DEFAULT_HTTP_RETRY_CONFIG](/docs/api/variables/DEFAULT_HTTP_RETRY_CONFIG)
- [globalCircuitBreakerManager](/docs/api/variables/globalCircuitBreakerManager)
- [DEFAULT_PROVIDER_CONFIGS](/docs/api/variables/DEFAULT_PROVIDER_CONFIGS)
- [mcpLogger](/docs/api/variables/mcpLogger)

## Functions

### Factory

- [createAIProvider](/docs/api/functions/createAIProvider)
- [createAIProviderWithFallback](/docs/api/functions/createAIProviderWithFallback)
- [createBestAIProvider](/docs/api/functions/createBestAIProvider)

### Legacy

- [~~generateText~~](/docs/api/functions/generateText)

### Other

- [initializeTelemetry](/docs/api/functions/initializeTelemetry)
- [getTelemetryStatus](/docs/api/functions/getTelemetryStatus)
- [createOAuthProviderFromConfig](/docs/api/functions/createOAuthProviderFromConfig)
- [isTokenExpired](/docs/api/functions/isTokenExpired)
- [calculateExpiresAt](/docs/api/functions/calculateExpiresAt)
- [isRetryableStatusCode](/docs/api/functions/isRetryableStatusCode)
- [isRetryableHTTPError](/docs/api/functions/isRetryableHTTPError)
- [withHTTPRetry](/docs/api/functions/withHTTPRetry)
- [initializeMCPEcosystem](/docs/api/functions/initializeMCPEcosystem)
- [listMCPs](/docs/api/functions/listMCPs)
- [executeMCP](/docs/api/functions/executeMCP)
- [getMCPStats](/docs/api/functions/getMCPStats)
- [validateTool](/docs/api/functions/validateTool)
- [initializeOpenTelemetry](/docs/api/functions/initializeOpenTelemetry)
- [flushOpenTelemetry](/docs/api/functions/flushOpenTelemetry)
- [shutdownOpenTelemetry](/docs/api/functions/shutdownOpenTelemetry)
- [getLangfuseHealthStatus](/docs/api/functions/getLangfuseHealthStatus)
- [setLangfuseContext](/docs/api/functions/setLangfuseContext)
- [getLangfuseContext](/docs/functions/getlangfusecontext)
- [getTracer](/docs/functions/gettracer)
- [getSpanProcessors](/docs/functions/getspanprocessors)
- [createContextEnricher](/docs/functions/createcontextenricher)
- [isUsingExternalTracerProvider](/docs/functions/isusingexternaltracerprovider)
- [getTracerProvider](/docs/functions/gettracerprovider)
- [getLangfuseSpanProcessor](/docs/functions/getlangfusespanprocessor)
- [buildObservabilityConfigFromEnv](/docs/api/functions/buildObservabilityConfigFromEnv)
- [getBestProvider](/docs/api/functions/getBestProvider)
- [getAvailableProviders](/docs/api/functions/getAvailableProviders)
- [isValidProvider](/docs/api/functions/isValidProvider)

## RAG Document Processing

### Classes

- [ChunkerFactory](/docs/classes/chunkerfactory)
- [ChunkerRegistry](/docs/classes/chunkerregistry)
- [RerankerFactory](/docs/classes/rerankerfactory)
- [RerankerRegistry](/docs/classes/rerankerregistry)
- [MDocument](/docs/classes/mdocument)
- [RAGPipeline](/docs/classes/ragpipeline)
- [InMemoryVectorStore](/docs/classes/inmemoryvectorstore)
- [InMemoryBM25Index](/docs/classes/inmemorybm25index)
- [GraphRAG](/docs/classes/graphrag)

### Functions

- [createChunker](/docs/functions/createchunker)
- [getAvailableStrategies](/docs/functions/getavailablestrategies)
- [getChunkerMetadata](/docs/functions/getchunkermetadata)
- [chunkText](/docs/functions/chunktext)
- [createReranker](/docs/functions/createreranker)
- [getAvailableRerankerTypes](/docs/functions/getavailablererankertypes)
- [rerank](/docs/functions/rerank)
- [batchRerank](/docs/functions/batchrerank)
- [simpleRerank](/docs/functions/simplererank)
- [createHybridSearch](/docs/functions/createhybridsearch)
- [reciprocalRankFusion](/docs/functions/reciprocalrankfusion)
- [linearCombination](/docs/functions/linearcombination)
- [loadDocument](/docs/functions/loaddocument)
- [loadDocuments](/docs/functions/loaddocuments)
- [assembleContext](/docs/functions/assemblecontext)
- [createContextWindow](/docs/functions/createcontextwindow)
- [prepareRAGTool](/docs/functions/prepareragtool)

### Type Aliases

- [ChunkingStrategy](/docs/type-aliases/chunkingstrategy)
- [ChunkerConfig](/docs/type-aliases/chunkerconfig)
- [RerankerType](/docs/type-aliases/rerankertype)
- [RerankerConfig](/docs/type-aliases/rerankerconfig)
- [HybridSearchConfig](/docs/type-aliases/hybridsearchconfig)
- [VectorQueryToolConfig](/docs/type-aliases/vectorquerytoolconfig)
- [Chunk](/docs/type-aliases/chunk)
- [ChunkMetadata](/docs/type-aliases/chunkmetadata)
- [RAGConfig](/docs/type-aliases/ragconfig)
- [RAGPreparedTool](/docs/type-aliases/ragpreparedtool)

### Using RAG Tools with generate()

#### Simplified API (Recommended)

Pass `rag: { files }` directly to `generate()` or `stream()` for automatic RAG pipeline setup. NeuroLink handles file loading, chunking, embedding, vector storage, and tool creation automatically:

```typescript

const neurolink = new NeuroLink();

// Generate with RAG - just pass files
const result = await neurolink.generate({
  prompt: "What are the key features described in the docs?",
  rag: {
    files: ["./docs/guide.md", "./docs/api.md"],
    strategy: "markdown", // Optional: auto-detected from extension
    chunkSize: 512, // Optional: default 1000
    chunkOverlap: 50, // Optional: default 200
    topK: 5, // Optional: default 5
  },
});

// Stream with RAG - same API
const stream = await neurolink.stream({
  prompt: "Summarize the architecture",
  rag: { files: ["./docs/architecture.md"] },
});
```

#### Advanced API

For full control over embeddings and vector stores, use `createVectorQueryTool` directly:

```typescript

  NeuroLink,
  createVectorQueryTool,
  InMemoryVectorStore,
} from "@juspay/neurolink";

const vectorStore = new InMemoryVectorStore();
// ... populate with data

const ragTool = createVectorQueryTool(
  {
    id: "kb-search",
    indexName: "knowledge-base",
    embeddingModel: { provider: "openai", modelName: "text-embedding-3-small" },
  },
  vectorStore,
);

const result = await neurolink.generate({
  input: { text: "Your question" },
  tools: [ragTool],
});
```

**Related Documentation:**

- [createVectorQueryTool](/docs/functions/createvectorquerytool) - Factory function for creating vector query tools
- [InMemoryVectorStore](/docs/classes/inmemoryvectorstore) - In-memory vector store implementation
- [VectorQueryToolConfig](/docs/type-aliases/vectorquerytoolconfig) - Configuration options for vector query tools

---

## Variable: DEFAULT_HTTP_RETRY_CONFIG

<!-- Source: api/variables/DEFAULT_HTTP_RETRY_CONFIG.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / DEFAULT_HTTP_RETRY_CONFIG

# Variable: DEFAULT_HTTP_RETRY_CONFIG

> `const` **DEFAULT_HTTP_RETRY_CONFIG**: [`HTTPRetryConfig`](/docs/api/type-aliases/HTTPRetryConfig)

Defined in: [mcp/httpRetryHandler.ts:15](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRetryHandler.ts#L15)

Default HTTP retry configuration

---

## Enumeration: AIProviderName

<!-- Source: api/enumerations/AIProviderName.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### OPENAI

> **OPENAI**: `"openai"`

Defined in: [constants/enums.ts:10](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L10)

---

### OPENAI_COMPATIBLE

> **OPENAI_COMPATIBLE**: `"openai-compatible"`

Defined in: [constants/enums.ts:11](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L11)

---

### OPENROUTER

> **OPENROUTER**: `"openrouter"`

Defined in: [constants/enums.ts:12](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L12)

---

### VERTEX

> **VERTEX**: `"vertex"`

Defined in: [constants/enums.ts:13](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L13)

---

### ANTHROPIC

> **ANTHROPIC**: `"anthropic"`

Defined in: [constants/enums.ts:14](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L14)

---

### AZURE

> **AZURE**: `"azure"`

Defined in: [constants/enums.ts:15](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L15)

---

### GOOGLE_AI

> **GOOGLE_AI**: `"google-ai"`

Defined in: [constants/enums.ts:16](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L16)

---

### HUGGINGFACE

> **HUGGINGFACE**: `"huggingface"`

Defined in: [constants/enums.ts:17](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L17)

---

### OLLAMA

> **OLLAMA**: `"ollama"`

Defined in: [constants/enums.ts:18](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L18)

---

### MISTRAL

> **MISTRAL**: `"mistral"`

Defined in: [constants/enums.ts:19](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L19)

---

### LITELLM

> **LITELLM**: `"litellm"`

Defined in: [constants/enums.ts:20](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L20)

---

### SAGEMAKER

> **SAGEMAKER**: `"sagemaker"`

Defined in: [constants/enums.ts:21](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L21)

---

### AUTO

> **AUTO**: `"auto"`

Defined in: [constants/enums.ts:22](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L22)

---

## Type Alias: AIModelProviderConfig

<!-- Source: api/type-aliases/AIModelProviderConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### models

> **models**: [`SupportedModelName`](/docs/api/type-aliases/SupportedModelName)[]

Defined in: [types/providers.ts:262](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L262)

---

## Function: assembleContext()

<!-- Source: api/functions/assembleContext.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

\n\n"`

#### options.includeMetadata?

`boolean`

Include chunk metadata in context. Default: `false`

#### options.deduplicate?

`boolean`

Remove overlapping content. Default: `false`

#### options.dedupeThreshold?

`number`

Similarity threshold for deduplication (0-1). Default: `0.8`

#### options.orderByRelevance?

`boolean`

Sort chunks by relevance score. Default: `true`

#### options.includeSectionHeaders?

`boolean`

Add section headers to chunks. Default: `false`

#### options.headerTemplate?

`string`

Header template with `{index}`, `{source}`, `{score}` placeholders. Default: `"[{index}] Source: {source}"`

## Returns

`string`

Assembled context string ready for LLM prompt insertion

## Examples

### Basic context assembly

```typescript

const results = await vectorStore.query({ query: "climate change", topK: 5 });

const context = assembleContext(results);

const prompt = `Based on the following context, answer the question.

Context:
${context}

Question: What are the main causes of climate change?`;
```

### With token limit and citations

```typescript

const context = assembleContext(results, {
  maxTokens: 4000,
  citationFormat: "numbered",
  includeSectionHeaders: true,
});

// Output includes [1], [2], etc. for each chunk
```

### Deduplicated context

```typescript

// When chunks may have overlapping content
const context = assembleContext(results, {
  deduplicate: true,
  dedupeThreshold: 0.7, // Remove chunks with >70% word overlap
  orderByRelevance: true,
});
```

### Custom formatting

```typescript

const context = assembleContext(results, {
  maxTokens: 8000,
  separator: "\n\n",
  includeMetadata: true,
  includeSectionHeaders: true,
  headerTemplate: "### [{index}] {source} (relevance: {score})",
});
```

### For RAG pipeline

```typescript

async function ragQuery(question: string) {
  const queryTool = createVectorQueryTool(vectorStore, embeddingModel);
  const results = await queryTool.query(question, { topK: 10 });

  const context = assembleContext(results, {
    maxTokens: 4000,
    deduplicate: true,
    citationFormat: "numbered",
  });

  const response = await llm.generate({
    prompt: `Context:\n${context}\n\nQuestion: ${question}`,
  });

  return response;
}
```

## Notes

- Token count is approximated at 4 characters per token
- Chunks exceeding the token limit are partially included when possible
- Deduplication uses Jaccard similarity on word sets
- Empty results return an empty string
- Relevance ordering uses the `score` field from results

## Since

v8.44.0

## See Also

- [createContextWindow](/docs/createcontextwindow) - Create context window with detailed tracking
- [formatContextWithCitations](/docs/formatcontextwithcitations) - Format context with citation list
- [summarizeContext](/docs/summarizecontext) - Summarize context using LLM

---

## Class: AIProviderFactory

<!-- Source: api/classes/AIProviderFactory.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### createProviderWithModel()

> `static` **createProviderWithModel**(`provider`, `model`): `Promise`\

Defined in: [core/factory.ts:346](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/core/factory.ts#L346)

Create a provider instance with specific provider enum and model

#### Parameters

##### provider

[`AIProviderName`](/docs/api/enumerations/AIProviderName)

Provider enum value

##### model

[`SupportedModelName`](/docs/api/type-aliases/SupportedModelName)

Specific model enum value

#### Returns

`Promise`\

AIProvider instance

---

### createBestProvider()

> `static` **createBestProvider**(`requestedProvider?`, `modelName?`, `enableMCP?`, `sdk?`): `Promise`\

Defined in: [core/factory.ts:388](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/core/factory.ts#L388)

Create the best available provider automatically

#### Parameters

##### requestedProvider?

`string`

Optional preferred provider

##### modelName?

Optional model name override

`string` | `null`

##### enableMCP?

`boolean` = `true`

Optional flag to enable MCP integration (default: true)

##### sdk?

`UnknownRecord`

#### Returns

`Promise`\

AIProvider instance

---

### createProviderWithFallback()

> `static` **createProviderWithFallback**(`primaryProvider`, `fallbackProvider`, `modelName?`, `enableMCP?`): `Promise`\\>

Defined in: [core/factory.ts:428](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/core/factory.ts#L428)

Create primary and fallback provider instances

#### Parameters

##### primaryProvider

`string`

Primary provider name

##### fallbackProvider

`string`

Fallback provider name

##### modelName?

Optional model name override

`string` | `null`

##### enableMCP?

`boolean` = `true`

Optional flag to enable MCP integration (default: true)

#### Returns

`Promise`\\>

Object with primary and fallback providers

---

## Variable: DEFAULT_PROVIDER_CONFIGS

<!-- Source: api/variables/DEFAULT_PROVIDER_CONFIGS.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / DEFAULT_PROVIDER_CONFIGS

# Variable: DEFAULT_PROVIDER_CONFIGS

> `const` **DEFAULT_PROVIDER_CONFIGS**: [`AIModelProviderConfig`](/docs/api/type-aliases/AIModelProviderConfig)[]

Defined in: [types/providers.ts:716](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L716)

Default provider configurations

---

## Enumeration: BedrockModels

<!-- Source: api/enumerations/BedrockModels.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### CLAUDE_4_5_SONNET

> **CLAUDE_4_5_SONNET**: `"anthropic.claude-sonnet-4-5-20250929-v1:0"`

Defined in: [constants/enums.ts:59](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L59)

---

### CLAUDE_4_5_HAIKU

> **CLAUDE_4_5_HAIKU**: `"anthropic.claude-haiku-4-5-20251001-v1:0"`

Defined in: [constants/enums.ts:60](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L60)

---

### CLAUDE_4_1_OPUS

> **CLAUDE_4_1_OPUS**: `"anthropic.claude-opus-4-1-20250805-v1:0"`

Defined in: [constants/enums.ts:63](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L63)

---

### CLAUDE_4_SONNET

> **CLAUDE_4_SONNET**: `"anthropic.claude-sonnet-4-20250514-v1:0"`

Defined in: [constants/enums.ts:64](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L64)

---

### CLAUDE_3_7_SONNET

> **CLAUDE_3_7_SONNET**: `"anthropic.claude-3-7-sonnet-20250219-v1:0"`

Defined in: [constants/enums.ts:67](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L67)

---

### CLAUDE_3_5_SONNET

> **CLAUDE_3_5_SONNET**: `"anthropic.claude-3-5-sonnet-20241022-v1:0"`

Defined in: [constants/enums.ts:70](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L70)

---

### CLAUDE_3_5_HAIKU

> **CLAUDE_3_5_HAIKU**: `"anthropic.claude-3-5-haiku-20241022-v1:0"`

Defined in: [constants/enums.ts:71](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L71)

---

### CLAUDE_3_SONNET

> **CLAUDE_3_SONNET**: `"anthropic.claude-3-sonnet-20240229-v1:0"`

Defined in: [constants/enums.ts:74](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L74)

---

### CLAUDE_3_HAIKU

> **CLAUDE_3_HAIKU**: `"anthropic.claude-3-haiku-20240307-v1:0"`

Defined in: [constants/enums.ts:75](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L75)

---

### NOVA_PREMIER

> **NOVA_PREMIER**: `"amazon.nova-premier-v1:0"`

Defined in: [constants/enums.ts:82](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L82)

---

### NOVA_PRO

> **NOVA_PRO**: `"amazon.nova-pro-v1:0"`

Defined in: [constants/enums.ts:83](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L83)

---

### NOVA_LITE

> **NOVA_LITE**: `"amazon.nova-lite-v1:0"`

Defined in: [constants/enums.ts:84](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L84)

---

### NOVA_MICRO

> **NOVA_MICRO**: `"amazon.nova-micro-v1:0"`

Defined in: [constants/enums.ts:85](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L85)

---

### NOVA_2_LITE

> **NOVA_2_LITE**: `"amazon.nova-2-lite-v1:0"`

Defined in: [constants/enums.ts:88](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L88)

---

### NOVA_2_SONIC

> **NOVA_2_SONIC**: `"amazon.nova-2-sonic-v1:0"`

Defined in: [constants/enums.ts:89](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L89)

---

### NOVA_SONIC

> **NOVA_SONIC**: `"amazon.nova-sonic-v1:0"`

Defined in: [constants/enums.ts:92](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L92)

---

### NOVA_CANVAS

> **NOVA_CANVAS**: `"amazon.nova-canvas-v1:0"`

Defined in: [constants/enums.ts:93](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L93)

---

### NOVA_REEL

> **NOVA_REEL**: `"amazon.nova-reel-v1:0"`

Defined in: [constants/enums.ts:94](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L94)

---

### NOVA_REEL_V1_1

> **NOVA_REEL_V1_1**: `"amazon.nova-reel-v1:1"`

Defined in: [constants/enums.ts:95](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L95)

---

### NOVA_MULTIMODAL_EMBEDDINGS

> **NOVA_MULTIMODAL_EMBEDDINGS**: `"amazon.nova-2-multimodal-embeddings-v1:0"`

Defined in: [constants/enums.ts:96](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L96)

---

### TITAN_TEXT_LARGE

> **TITAN_TEXT_LARGE**: `"amazon.titan-tg1-large"`

Defined in: [constants/enums.ts:103](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L103)

---

### TITAN_EMBED_TEXT_V2

> **TITAN_EMBED_TEXT_V2**: `"amazon.titan-embed-text-v2:0"`

Defined in: [constants/enums.ts:106](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L106)

---

### TITAN_EMBED_TEXT_V1

> **TITAN_EMBED_TEXT_V1**: `"amazon.titan-embed-text-v1"`

Defined in: [constants/enums.ts:107](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L107)

---

### TITAN_EMBED_G1_TEXT_02

> **TITAN_EMBED_G1_TEXT_02**: `"amazon.titan-embed-g1-text-02"`

Defined in: [constants/enums.ts:108](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L108)

---

### TITAN_EMBED_IMAGE_V1

> **TITAN_EMBED_IMAGE_V1**: `"amazon.titan-embed-image-v1"`

Defined in: [constants/enums.ts:111](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L111)

---

### TITAN_IMAGE_GENERATOR_V2

> **TITAN_IMAGE_GENERATOR_V2**: `"amazon.titan-image-generator-v2:0"`

Defined in: [constants/enums.ts:114](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L114)

---

### LLAMA_4_MAVERICK_17B

> **LLAMA_4_MAVERICK_17B**: `"meta.llama4-maverick-17b-instruct-v1:0"`

Defined in: [constants/enums.ts:121](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L121)

---

### LLAMA_4_SCOUT_17B

> **LLAMA_4_SCOUT_17B**: `"meta.llama4-scout-17b-instruct-v1:0"`

Defined in: [constants/enums.ts:122](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L122)

---

### LLAMA_3_3_70B

> **LLAMA_3_3_70B**: `"meta.llama3-3-70b-instruct-v1:0"`

Defined in: [constants/enums.ts:125](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L125)

---

### LLAMA_3_2_90B

> **LLAMA_3_2_90B**: `"meta.llama3-2-90b-instruct-v1:0"`

Defined in: [constants/enums.ts:128](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L128)

---

### LLAMA_3_2_11B

> **LLAMA_3_2_11B**: `"meta.llama3-2-11b-instruct-v1:0"`

Defined in: [constants/enums.ts:129](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L129)

---

### LLAMA_3_2_3B

> **LLAMA_3_2_3B**: `"meta.llama3-2-3b-instruct-v1:0"`

Defined in: [constants/enums.ts:130](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L130)

---

### LLAMA_3_2_1B

> **LLAMA_3_2_1B**: `"meta.llama3-2-1b-instruct-v1:0"`

Defined in: [constants/enums.ts:131](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L131)

---

### LLAMA_3_1_405B

> **LLAMA_3_1_405B**: `"meta.llama3-1-405b-instruct-v1:0"`

Defined in: [constants/enums.ts:134](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L134)

---

### LLAMA_3_1_70B

> **LLAMA_3_1_70B**: `"meta.llama3-1-70b-instruct-v1:0"`

Defined in: [constants/enums.ts:135](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L135)

---

### LLAMA_3_1_8B

> **LLAMA_3_1_8B**: `"meta.llama3-1-8b-instruct-v1:0"`

Defined in: [constants/enums.ts:136](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L136)

---

### LLAMA_3_70B

> **LLAMA_3_70B**: `"meta.llama3-70b-instruct-v1:0"`

Defined in: [constants/enums.ts:139](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L139)

---

### LLAMA_3_8B

> **LLAMA_3_8B**: `"meta.llama3-8b-instruct-v1:0"`

Defined in: [constants/enums.ts:140](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L140)

---

### MISTRAL_LARGE_3

> **MISTRAL_LARGE_3**: `"mistral.mistral-large-3-675b-instruct"`

Defined in: [constants/enums.ts:147](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L147)

---

### MISTRAL_LARGE_2407

> **MISTRAL_LARGE_2407**: `"mistral.mistral-large-2407-v1:0"`

Defined in: [constants/enums.ts:148](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L148)

---

### MISTRAL_LARGE_2402

> **MISTRAL_LARGE_2402**: `"mistral.mistral-large-2402-v1:0"`

Defined in: [constants/enums.ts:149](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L149)

---

### MAGISTRAL_SMALL_2509

> **MAGISTRAL_SMALL_2509**: `"mistral.magistral-small-2509"`

Defined in: [constants/enums.ts:152](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L152)

---

### MINISTRAL_3_14B

> **MINISTRAL_3_14B**: `"mistral.ministral-3-14b-instruct"`

Defined in: [constants/enums.ts:153](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L153)

---

### MINISTRAL_3_8B

> **MINISTRAL_3_8B**: `"mistral.ministral-3-8b-instruct"`

Defined in: [constants/enums.ts:154](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L154)

---

### MINISTRAL_3_3B

> **MINISTRAL_3_3B**: `"mistral.ministral-3-3b-instruct"`

Defined in: [constants/enums.ts:155](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L155)

---

### MISTRAL_7B

> **MISTRAL_7B**: `"mistral.mistral-7b-instruct-v0:2"`

Defined in: [constants/enums.ts:158](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L158)

---

### MIXTRAL_8x7B

> **MIXTRAL_8x7B**: `"mistral.mixtral-8x7b-instruct-v0:1"`

Defined in: [constants/enums.ts:159](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L159)

---

### PIXTRAL_LARGE_2502

> **PIXTRAL_LARGE_2502**: `"mistral.pixtral-large-2502-v1:0"`

Defined in: [constants/enums.ts:162](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L162)

---

### VOXTRAL_SMALL_24B

> **VOXTRAL_SMALL_24B**: `"mistral.voxtral-small-24b-2507"`

Defined in: [constants/enums.ts:163](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L163)

---

### VOXTRAL_MINI_3B

> **VOXTRAL_MINI_3B**: `"mistral.voxtral-mini-3b-2507"`

Defined in: [constants/enums.ts:164](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L164)

---

### COHERE_COMMAND_R_PLUS

> **COHERE_COMMAND_R_PLUS**: `"cohere.command-r-plus-v1:0"`

Defined in: [constants/enums.ts:171](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L171)

---

### COHERE_COMMAND_R

> **COHERE_COMMAND_R**: `"cohere.command-r-v1:0"`

Defined in: [constants/enums.ts:172](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L172)

---

### DEEPSEEK_R1

> **DEEPSEEK_R1**: `"deepseek.r1-v1:0"`

Defined in: [constants/enums.ts:175](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L175)

---

### DEEPSEEK_V3

> **DEEPSEEK_V3**: `"deepseek.v3-v1:0"`

Defined in: [constants/enums.ts:176](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L176)

---

### QWEN_3_235B_A22B

> **QWEN_3_235B_A22B**: `"qwen.qwen3-235b-a22b-2507-v1:0"`

Defined in: [constants/enums.ts:179](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L179)

---

### QWEN_3_CODER_480B_A35B

> **QWEN_3_CODER_480B_A35B**: `"qwen.qwen3-coder-480b-a35b-v1:0"`

Defined in: [constants/enums.ts:180](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L180)

---

### QWEN_3_CODER_30B_A3B

> **QWEN_3_CODER_30B_A3B**: `"qwen.qwen3-coder-30b-a3b-v1:0"`

Defined in: [constants/enums.ts:181](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L181)

---

### QWEN_3_32B

> **QWEN_3_32B**: `"qwen.qwen3-32b-v1:0"`

Defined in: [constants/enums.ts:182](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L182)

---

### QWEN_3_NEXT_80B_A3B

> **QWEN_3_NEXT_80B_A3B**: `"qwen.qwen3-next-80b-a3b"`

Defined in: [constants/enums.ts:183](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L183)

---

### QWEN_3_VL_235B_A22B

> **QWEN_3_VL_235B_A22B**: `"qwen.qwen3-vl-235b-a22b"`

Defined in: [constants/enums.ts:184](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L184)

---

### GEMMA_3_27B_IT

> **GEMMA_3_27B_IT**: `"google.gemma-3-27b-it"`

Defined in: [constants/enums.ts:187](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L187)

---

### GEMMA_3_12B_IT

> **GEMMA_3_12B_IT**: `"google.gemma-3-12b-it"`

Defined in: [constants/enums.ts:188](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L188)

---

### GEMMA_3_4B_IT

> **GEMMA_3_4B_IT**: `"google.gemma-3-4b-it"`

Defined in: [constants/enums.ts:189](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L189)

---

### JAMBA_1_5_LARGE

> **JAMBA_1_5_LARGE**: `"ai21.jamba-1-5-large-v1:0"`

Defined in: [constants/enums.ts:192](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L192)

---

### JAMBA_1_5_MINI

> **JAMBA_1_5_MINI**: `"ai21.jamba-1-5-mini-v1:0"`

Defined in: [constants/enums.ts:193](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L193)

---

## Type Alias: AIProvider

<!-- Source: api/type-aliases/AIProvider.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### generate()

> **generate**(`optionsOrPrompt`, `analysisSchema?`): `Promise`\

Defined in: [types/providers.ts:303](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L303)

#### Parameters

##### optionsOrPrompt

`string` | [`TextGenerationOptions`](/docs/api/type-aliases/TextGenerationOptions)

##### analysisSchema?

`ValidationSchema`

#### Returns

`Promise`\

---

### gen()

> **gen**(`optionsOrPrompt`, `analysisSchema?`): `Promise`\

Defined in: [types/providers.ts:308](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L308)

#### Parameters

##### optionsOrPrompt

`string` | [`TextGenerationOptions`](/docs/api/type-aliases/TextGenerationOptions)

##### analysisSchema?

`ValidationSchema`

#### Returns

`Promise`\

---

### setupToolExecutor()

> **setupToolExecutor**(`sdk`, `functionTag`): `void`

Defined in: [types/providers.ts:314](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L314)

#### Parameters

##### sdk

###### customTools

`Map`\

###### executeTool

(`toolName`, `params`) => `Promise`\

##### functionTag

`string`

#### Returns

`void`

---

## Function: batchRerank()

<!-- Source: api/functions/batchRerank.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / batchRerank

# Function: batchRerank()

> **batchRerank**(`results`, `query`, `model`, `options?`): `Promise`

Defined in: [lib/rag/reranker/reranker.ts:184](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/reranker.ts#L184)

Batch rerank with optimized LLM calls

Scores multiple documents in a single LLM prompt for improved efficiency
compared to individual scoring. This is ideal for large result sets where
reducing API calls is important for cost and latency.

## Parameters

### results

`VectorQueryResult[]`

Vector search results to rerank. Each result should have:

- `id` - Unique identifier
- `text` - Text content (or `metadata.text`)
- `score` - Original vector similarity score
- `metadata` - Additional metadata

### query

`string`

Original search query for relevance scoring

### model

`AIProvider`

Language model provider for batch semantic scoring

### options?

`RerankerOptions`

Optional reranking configuration:

- `topK` - Number of results to return (default: 3)
- `weights` - Scoring weights (must sum to 1.0)
  - `semantic` - Weight for LLM-based score (default: 0.4)
  - `vector` - Weight for vector similarity score (default: 0.4)
  - `position` - Weight for position score (default: 0.2)

## Returns

`Promise`

Array of reranked results sorted by combined score, each containing:

- `result` - Original VectorQueryResult
- `score` - Combined relevance score (0-1)
- `details` - Score breakdown with `semantic`, `vector`, and `position`

## Examples

### Basic batch reranking

```typescript

const model = await ProviderFactory.createProvider("openai", "gpt-4o-mini");

// Efficiently score all results in one LLM call
const rerankedResults = await batchRerank(
  vectorSearchResults,
  "What is the return policy?",
  model,
  { topK: 5 },
);
```

### Cost-efficient reranking for large result sets

```typescript

async function efficientSearch(query: string, results: VectorQueryResult[]) {
  // Batch reranking uses a single prompt to score all documents
  // Much more efficient than individual scoring for 20+ results
  const reranked = await batchRerank(results, query, model, {
    topK: 10,
    weights: { semantic: 0.5, vector: 0.35, position: 0.15 },
  });

  return reranked;
}
```

### With fallback handling

```typescript

async function robustRerank(results: VectorQueryResult[], query: string) {
  try {
    // Try batch reranking first for efficiency
    return await batchRerank(results, query, model, { topK: 5 });
  } catch (error) {
    console.warn("Batch reranking failed, falling back to individual scoring");
    // batchRerank automatically falls back to individual rerank on failure
    return await rerank(results, query, model, { topK: 5 });
  }
}
```

### Pipeline integration

```typescript

async function hybridSearchWithReranking(query: string) {
  const hybridSearch = createHybridSearch(hybridConfig);

  // Get initial hybrid search results
  const initialResults = await hybridSearch(query, { topK: 50 });

  // Efficiently rerank the top results
  const reranked = await batchRerank(
    initialResults.map((r) => ({
      id: r.id,
      text: r.text,
      score: r.score,
      metadata: r.metadata,
    })),
    query,
    model,
    { topK: 10 },
  );

  return reranked;
}
```

## Notes

- Uses a single LLM prompt to score all documents simultaneously
- Falls back to individual `rerank()` if batch scoring fails
- Documents are truncated to 300 characters in the batch prompt
- Scores are parsed from the LLM response; unparseable scores default to 0.5

## Since

v8.44.0

## See Also

- [rerank](/docs/rerank) - Individual document reranking
- [simpleRerank](/docs/simplererank) - Reranking without LLM
- [createReranker](/docs/createreranker) - Factory for reranker instances
- [RerankResult](/docs/type-aliases/rerankresult) - Result type definition

---

## Class: ChunkerFactory

<!-- Source: api/classes/ChunkerFactory.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### resetInstance()

> `static` **resetInstance**(): `void`

Defined in: [rag/ChunkerFactory.ts:119](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L119)

Reset the singleton instance (primarily for testing).

#### Returns

`void`

---

### createChunker()

> **createChunker**(`strategyOrAlias`, `config?`): `Promise`\

Defined in: [rag/ChunkerFactory.ts:258](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L258)

Creates a new chunker instance for the specified strategy.

#### Parameters

##### strategyOrAlias

`string`

Chunking strategy name or alias (e.g., "markdown", "md", "recursive")

##### config?

[`ChunkerConfig`](/docs/type-aliases/chunkerconfig)

Optional configuration to override defaults

#### Returns

`Promise`\

Configured chunker instance

#### Throws

`ChunkingError` - If strategy is not found or creation fails

---

### registerChunker()

> **registerChunker**(`strategy`, `factory`, `metadata`): `void`

Defined in: [rag/ChunkerFactory.ts:239](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L239)

Register a custom chunker with metadata and aliases.

#### Parameters

##### strategy

[`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy) | `string`

Strategy name to register

##### factory

(`config?`: [`ChunkerConfig`](/docs/type-aliases/chunkerconfig)) => `Promise`\

Async factory function that creates the chunker

##### metadata

[`ChunkerMetadata`](/docs/type-aliases/chunkermetadata)

Metadata including description, defaults, and aliases

#### Returns

`void`

---

### getAvailableStrategies()

> **getAvailableStrategies**(): `Promise`\

Defined in: [rag/ChunkerFactory.ts:312](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L312)

Get all available chunking strategies (not including aliases).

#### Returns

`Promise`\

Array of strategy names

---

### getChunkerMetadata()

> **getChunkerMetadata**(`strategyOrAlias`): [`ChunkerMetadata`](/docs/type-aliases/chunkermetadata) | `undefined`

Defined in: [rag/ChunkerFactory.ts:296](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L296)

Get metadata for a chunker strategy.

#### Parameters

##### strategyOrAlias

`string`

Strategy name or alias

#### Returns

[`ChunkerMetadata`](/docs/type-aliases/chunkermetadata) | `undefined`

Chunker metadata or undefined if not found

---

### getDefaultConfig()

> **getDefaultConfig**(`strategyOrAlias`): [`ChunkerConfig`](/docs/type-aliases/chunkerconfig) | `undefined`

Defined in: [rag/ChunkerFactory.ts:304](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L304)

Get the default configuration for a chunker strategy.

#### Parameters

##### strategyOrAlias

`string`

Strategy name or alias

#### Returns

[`ChunkerConfig`](/docs/type-aliases/chunkerconfig) | `undefined`

Default configuration or undefined if not found

---

### getStrategyAliases()

> **getStrategyAliases**(): `Map`\

Defined in: [rag/ChunkerFactory.ts:320](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L320)

Get all aliases mapped to their canonical strategy names.

#### Returns

`Map`\

Map of alias to strategy name

---

### hasStrategy()

> **hasStrategy**(`strategyOrAlias`): `boolean`

Defined in: [rag/ChunkerFactory.ts:327](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L327)

Check if a strategy or alias exists.

#### Parameters

##### strategyOrAlias

`string`

Strategy name or alias to check

#### Returns

`boolean`

True if the strategy exists

---

### getChunkersForUseCase()

> **getChunkersForUseCase**(`useCase`): [`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy)[]

Defined in: [rag/ChunkerFactory.ts:335](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L335)

Get chunkers suitable for a specific use case.

#### Parameters

##### useCase

`string`

Use case description (e.g., "documentation", "Q&A")

#### Returns

[`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy)[]

Array of matching strategy names

---

### getAllMetadata()

> **getAllMetadata**(): `Map`\

Defined in: [rag/ChunkerFactory.ts:352](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L352)

Get metadata for all registered chunkers.

#### Returns

`Map`\

Map of strategy names to their metadata

---

### clear()

> **clear**(): `void`

Defined in: [rag/ChunkerFactory.ts:359](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L359)

Clear the factory registry and metadata.

#### Returns

`void`

## Examples

### Basic Usage

```typescript

// Create a markdown chunker with custom config
const chunker = await chunkerFactory.createChunker("markdown", {
  maxSize: 500,
  headerLevels: [1, 2],
});

// Chunk a document
const chunks = await chunker.chunk(markdownContent);
console.log(`Created ${chunks.length} chunks`);
```

### Using Aliases

```typescript

// "md" is an alias for "markdown"
const chunker = await chunkerFactory.createChunker("md");

// "char" is an alias for "character"
const charChunker = await chunkerFactory.createChunker("char", {
  maxSize: 1000,
  overlap: 100,
});
```

### Using Convenience Functions

```typescript

  createChunker,
  getAvailableStrategies,
  getChunkerMetadata,
  getDefaultConfig,
} from "@juspay/neurolink";

// Create chunker directly
const chunker = await createChunker("recursive", {
  separators: ["\n\n", "\n", ". ", " "],
});

// List available strategies
const strategies = await getAvailableStrategies();
console.log("Available:", strategies);
// ["character", "recursive", "sentence", "token", "markdown", "html", "json", "latex", "semantic", "semantic-markdown"]

// Get metadata for a strategy
const metadata = getChunkerMetadata("markdown");
console.log(metadata?.description);
// "Splits markdown content by headers and structural elements"

// Get default config
const defaults = getDefaultConfig("token");
console.log(defaults);
// { maxSize: 512, overlap: 50 }
```

### Finding Chunkers by Use Case

```typescript

// Find chunkers for documentation processing
const docChunkers = chunkerFactory.getChunkersForUseCase("documentation");
console.log(docChunkers); // ["markdown"]

// Find chunkers for Q&A applications
const qaChunkers = chunkerFactory.getChunkersForUseCase("Q&A");
console.log(qaChunkers); // ["sentence"]
```

### Registering Custom Chunkers

```typescript

// Register a custom chunker
chunkerFactory.registerChunker(
  "custom-xml",
  async (config) => {
    return new MyXMLChunker(config);
  },
  {
    description: "Custom XML-aware chunker",
    defaultConfig: { maxSize: 1000 },
    supportedOptions: ["maxSize", "splitTags"],
    useCases: ["XML documents", "SOAP responses"],
    aliases: ["xml"],
  },
);

// Now usable via factory
const xmlChunker = await chunkerFactory.createChunker("xml");
```

## Supported Strategies

| Strategy            | Aliases                                    | Description                                   | Best For                                |
| ------------------- | ------------------------------------------ | --------------------------------------------- | --------------------------------------- |
| `character`         | `char`, `fixed-size`, `fixed`              | Fixed-size character splitting with overlap   | Simple text, fixed-size requirements    |
| `recursive`         | `recursive-character`, `langchain-default` | Hierarchical separator-based splitting        | General text documents (default choice) |
| `sentence`          | `sent`, `sentence-based`                   | Sentence boundary splitting                   | Q&A applications, NLP tasks             |
| `token`             | `tok`, `tokenized`                         | Token-count based splitting                   | LLM context management, model-specific  |
| `markdown`          | `md`, `markdown-header`                    | Header and structure-aware markdown splitting | Documentation, README files             |
| `html`              | `html-tag`, `web`                          | Semantic HTML tag splitting                   | Web content, HTML documents             |
| `json`              | `json-object`, `structured`                | JSON object boundary splitting                | API responses, structured data          |
| `latex`             | `tex`, `latex-section`                     | Section and environment-aware LaTeX splitting | Academic papers, scientific docs        |
| `semantic`          | `llm`, `ai-semantic`                       | LLM-powered semantic split points             | Advanced semantic understanding         |
| `semantic-markdown` | `semantic-md`, `smart-markdown`            | Markdown + semantic similarity                | Knowledge bases, context-aware docs     |

## Notes

- The factory uses **lazy initialization** - chunkers are registered on first access
- All chunker creation is **async** due to dynamic imports
- The **singleton pattern** ensures consistent behavior across the application
- Use `resetInstance()` in tests to get a fresh factory state

## See Also

- [ChunkerRegistry](/docs/chunkerregistry) - Alternative registry-based chunker access
- [ChunkingStrategy](/docs/type-aliases/chunkingstrategy) - Strategy type definition
- [ChunkerConfig](/docs/type-aliases/chunkerconfig) - Configuration type union
- [ChunkerMetadata](/docs/type-aliases/chunkermetadata) - Metadata type definition
- [MDocument](/docs/mdocument) - Document class with integrated chunking
- [createChunker](/docs/functions/createchunker) - Convenience function
- [getAvailableStrategies](/docs/functions/getavailablestrategies) - List strategies function

---

## Variable: DEFAULT_RATE_LIMIT_CONFIG

<!-- Source: api/variables/DEFAULT_RATE_LIMIT_CONFIG.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / DEFAULT_RATE_LIMIT_CONFIG

# Variable: DEFAULT_RATE_LIMIT_CONFIG

> `const` **DEFAULT_RATE_LIMIT_CONFIG**: [`RateLimitConfig`](/docs/api/type-aliases/RateLimitConfig)

Defined in: [mcp/httpRateLimiter.ts:14](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L14)

Default rate limit configuration
Provides sensible defaults for most MCP HTTP transport use cases

---

## Enumeration: OpenAIModels

<!-- Source: api/enumerations/OpenAIModels.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### GPT_5_2_CHAT_LATEST

> **GPT_5_2_CHAT_LATEST**: `"gpt-5.2-chat-latest"`

Defined in: [constants/enums.ts:202](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L202)

---

### GPT_5_2_PRO

> **GPT_5_2_PRO**: `"gpt-5.2-pro"`

Defined in: [constants/enums.ts:203](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L203)

---

### GPT_5

> **GPT_5**: `"gpt-5"`

Defined in: [constants/enums.ts:206](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L206)

---

### GPT_5_MINI

> **GPT_5_MINI**: `"gpt-5-mini"`

Defined in: [constants/enums.ts:207](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L207)

---

### GPT_5_NANO

> **GPT_5_NANO**: `"gpt-5-nano"`

Defined in: [constants/enums.ts:208](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L208)

---

### GPT_4_1

> **GPT_4_1**: `"gpt-4.1"`

Defined in: [constants/enums.ts:211](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L211)

---

### GPT_4_1_MINI

> **GPT_4_1_MINI**: `"gpt-4.1-mini"`

Defined in: [constants/enums.ts:212](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L212)

---

### GPT_4_1_NANO

> **GPT_4_1_NANO**: `"gpt-4.1-nano"`

Defined in: [constants/enums.ts:213](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L213)

---

### GPT_4O

> **GPT_4O**: `"gpt-4o"`

Defined in: [constants/enums.ts:216](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L216)

---

### GPT_4O_MINI

> **GPT_4O_MINI**: `"gpt-4o-mini"`

Defined in: [constants/enums.ts:217](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L217)

---

### O3

> **O3**: `"o3"`

Defined in: [constants/enums.ts:220](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L220)

---

### O3_MINI

> **O3_MINI**: `"o3-mini"`

Defined in: [constants/enums.ts:221](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L221)

---

### O3_PRO

> **O3_PRO**: `"o3-pro"`

Defined in: [constants/enums.ts:222](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L222)

---

### O4_MINI

> **O4_MINI**: `"o4-mini"`

Defined in: [constants/enums.ts:223](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L223)

---

### O1

> **O1**: `"o1"`

Defined in: [constants/enums.ts:224](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L224)

---

### O1_PREVIEW

> **O1_PREVIEW**: `"o1-preview"`

Defined in: [constants/enums.ts:225](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L225)

---

### O1_MINI

> **O1_MINI**: `"o1-mini"`

Defined in: [constants/enums.ts:226](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L226)

---

### GPT_4

> **GPT_4**: `"gpt-4"`

Defined in: [constants/enums.ts:229](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L229)

---

### GPT_4_TURBO

> **GPT_4_TURBO**: `"gpt-4-turbo"`

Defined in: [constants/enums.ts:230](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L230)

---

### GPT_3_5_TURBO

> **GPT_3_5_TURBO**: `"gpt-3.5-turbo"`

Defined in: [constants/enums.ts:233](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L233)

---

## Type Alias: AnalyticsData

<!-- Source: api/type-aliases/AnalyticsData.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### model?

> `optional` **model**: `string`

Defined in: [types/analytics.ts:36](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/analytics.ts#L36)

---

### tokenUsage

> **tokenUsage**: `TokenUsage`

Defined in: [types/analytics.ts:37](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/analytics.ts#L37)

---

### requestDuration

> **requestDuration**: `number`

Defined in: [types/analytics.ts:38](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/analytics.ts#L38)

---

### timestamp

> **timestamp**: `string`

Defined in: [types/analytics.ts:39](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/analytics.ts#L39)

---

### cost?

> `optional` **cost**: `number`

Defined in: [types/analytics.ts:40](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/analytics.ts#L40)

---

### context?

> `optional` **context**: `JsonValue`

Defined in: [types/analytics.ts:41](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/analytics.ts#L41)

---

## Function: buildObservabilityConfigFromEnv()

<!-- Source: api/functions/buildObservabilityConfigFromEnv.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / buildObservabilityConfigFromEnv

# Function: buildObservabilityConfigFromEnv()

> **buildObservabilityConfigFromEnv**(): [`ObservabilityConfig`](/docs/api/type-aliases/ObservabilityConfig) \| `undefined`

Defined in: [utils/observabilityHelpers.ts:29](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/utils/observabilityHelpers.ts#L29)

Build observability config from environment variables

Reads Langfuse configuration from environment:

- LANGFUSE_ENABLED: Enable/disable Langfuse (must be "true")
- LANGFUSE_PUBLIC_KEY: Your Langfuse public key (required)
- LANGFUSE_SECRET_KEY: Your Langfuse secret key (required)
- LANGFUSE_BASE_URL: Langfuse server URL (default: https://cloud.langfuse.com)
- LANGFUSE_ENVIRONMENT: Environment name (default: dev)
- PUBLIC_APP_VERSION: Release/version identifier (default: v1.0.0)

## Returns

[`ObservabilityConfig`](/docs/api/type-aliases/ObservabilityConfig) \| `undefined`

ObservabilityConfig if all required env vars are set, undefined otherwise

## Example

```typescript

const neurolink = new NeuroLink({
  observability: buildObservabilityConfigFromEnv(),
});
```

---

## Class: ChunkerRegistry

<!-- Source: api/classes/ChunkerRegistry.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### resetInstance()

> `static` **resetInstance**(): `void`

Defined in: [rag/ChunkerRegistry.ts:147](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L147)

Reset the singleton instance (primarily for testing). Clears all registered chunkers and aliases.

#### Returns

`void`

---

### registerChunker()

> **registerChunker**(`strategy`, `factory`, `metadata`): `void`

Defined in: [rag/ChunkerRegistry.ts:254](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L254)

Register a chunker with metadata and aliases.

#### Parameters

##### strategy

[`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy) | `string`

Strategy name to register

##### factory

() => `Promise`\

Async factory function that creates the chunker instance

##### metadata

[`ChunkerMetadata`](/docs/type-aliases/chunkermetadata)

Metadata including description, defaults, use cases, and aliases

#### Returns

`void`

---

### resolveStrategy()

> **resolveStrategy**(`nameOrAlias`): [`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy)

Defined in: [rag/ChunkerRegistry.ts:273](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L273)

Resolve a strategy name from an alias or verify a direct strategy name exists.

#### Parameters

##### nameOrAlias

`string`

Strategy name or alias to resolve

#### Returns

[`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy)

The canonical strategy name

#### Throws

`ChunkingError` - If the strategy or alias is not found

---

### getChunker()

> **getChunker**(`strategyOrAlias`): `Promise`\

Defined in: [rag/ChunkerRegistry.ts:304](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L304)

Get a chunker instance by strategy name or alias.

#### Parameters

##### strategyOrAlias

`string`

Chunking strategy name or alias (e.g., "markdown", "md", "recursive")

#### Returns

`Promise`\

The chunker instance

#### Throws

`ChunkingError` - If strategy is not found

---

### getAvailableChunkers()

> **getAvailableChunkers**(): `Promise`\

Defined in: [rag/ChunkerRegistry.ts:322](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L322)

Get list of all available chunker strategies (not including aliases).

#### Returns

`Promise`\

Array of strategy names

---

### getChunkerMetadata()

> **getChunkerMetadata**(`strategyOrAlias`): [`ChunkerMetadata`](/docs/type-aliases/chunkermetadata) | `undefined`

Defined in: [rag/ChunkerRegistry.ts:330](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L330)

Get metadata for a specific chunker strategy.

#### Parameters

##### strategyOrAlias

`string`

Strategy name or alias

#### Returns

[`ChunkerMetadata`](/docs/type-aliases/chunkermetadata) | `undefined`

Chunker metadata or undefined if not found

---

### getAliasesForStrategy()

> **getAliasesForStrategy**(`strategy`): `string`[]

Defined in: [rag/ChunkerRegistry.ts:339](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L339)

Get all aliases for a specific strategy.

#### Parameters

##### strategy

[`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy)

The canonical strategy name

#### Returns

`string`[]

Array of alias strings for the strategy

---

### getAllAliases()

> **getAllAliases**(): `Map`\

Defined in: [rag/ChunkerRegistry.ts:347](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L347)

Get all registered aliases mapped to their canonical strategy names.

#### Returns

`Map`\

Map of alias to strategy name

---

### hasChunker()

> **hasChunker**(`strategyOrAlias`): `boolean`

Defined in: [rag/ChunkerRegistry.ts:354](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L354)

Check if a strategy or alias exists in the registry.

#### Parameters

##### strategyOrAlias

`string`

Strategy name or alias to check

#### Returns

`boolean`

True if the strategy or alias exists

---

### getChunkersByUseCase()

> **getChunkersByUseCase**(`useCase`): [`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy)[]

Defined in: [rag/ChunkerRegistry.ts:366](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L366)

Get chunkers suitable for a specific use case.

#### Parameters

##### useCase

`string`

Use case description (e.g., "documentation", "Q&A", "web scraping")

#### Returns

[`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy)[]

Array of matching strategy names

---

### getDefaultConfig()

> **getDefaultConfig**(`strategyOrAlias`): [`ChunkerConfig`](/docs/type-aliases/chunkerconfig) | `undefined`

Defined in: [rag/ChunkerRegistry.ts:383](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L383)

Get the default configuration for a chunker strategy.

#### Parameters

##### strategyOrAlias

`string`

Strategy name or alias

#### Returns

[`ChunkerConfig`](/docs/type-aliases/chunkerconfig) | `undefined`

Default configuration or undefined if not found

---

### clear()

> **clear**(): `void`

Defined in: [rag/ChunkerRegistry.ts:391](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L391)

Clear the registry, removing all registered chunkers and aliases.

#### Returns

`void`

## Exported Functions

The module also exports convenience functions for common operations:

### getAvailableChunkers()

> **getAvailableChunkers**(): `Promise`\

Defined in: [rag/ChunkerRegistry.ts:405](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L405)

Convenience function to get all available chunker strategies.

#### Returns

`Promise`\

---

### getChunker()

> **getChunker**(`strategyOrAlias`): `Promise`\

Defined in: [rag/ChunkerRegistry.ts:412](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L412)

Convenience function to get a chunker by strategy name or alias.

#### Parameters

##### strategyOrAlias

`string`

Strategy name or alias

#### Returns

`Promise`\

---

### getChunkerMetadata()

> **getChunkerMetadata**(`strategyOrAlias`): [`ChunkerMetadata`](/docs/type-aliases/chunkermetadata) | `undefined`

Defined in: [rag/ChunkerRegistry.ts:419](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerRegistry.ts#L419)

Convenience function to get chunker metadata.

#### Parameters

##### strategyOrAlias

`string`

Strategy name or alias

#### Returns

[`ChunkerMetadata`](/docs/type-aliases/chunkermetadata) | `undefined`

## Examples

### Basic Usage

```typescript

// Get a chunker by strategy name
const chunker = await chunkerRegistry.getChunker("markdown");

// Chunk a document
const chunks = await chunker.chunk(markdownContent);
console.log(`Created ${chunks.length} chunks`);
```

### Using Aliases

```typescript

// "md" is an alias for "markdown"
const mdChunker = await chunkerRegistry.getChunker("md");

// "char" is an alias for "character"
const charChunker = await chunkerRegistry.getChunker("char");

// "tok" is an alias for "token"
const tokenChunker = await chunkerRegistry.getChunker("tok");
```

### Using Convenience Functions

```typescript

  getChunker,
  getAvailableChunkers,
  getChunkerMetadata,
} from "@juspay/neurolink";

// Get chunker directly
const chunker = await getChunker("recursive");

// List all available strategies
const strategies = await getAvailableChunkers();
console.log("Available:", strategies);
// ["character", "recursive", "sentence", "token", "markdown", "html", "json", "latex", "semantic-markdown"]

// Get metadata for a strategy
const metadata = getChunkerMetadata("sentence");
console.log(metadata?.description);
// "Splits text by sentence boundaries for semantically meaningful chunks"
console.log(metadata?.useCases);
// ["Q&A applications", "Sentence-level analysis", "Preserving complete thoughts"]
```

### Finding Chunkers by Use Case

```typescript

// Find chunkers for documentation processing
const docChunkers = chunkerRegistry.getChunkersByUseCase("documentation");
console.log(docChunkers); // ["markdown"]

// Find chunkers for Q&A applications
const qaChunkers = chunkerRegistry.getChunkersByUseCase("Q&A");
console.log(qaChunkers); // ["sentence"]

// Find chunkers for web content
const webChunkers = chunkerRegistry.getChunkersByUseCase("web");
console.log(webChunkers); // ["html"]
```

### Resolving Aliases

```typescript

// Resolve an alias to its canonical strategy name
const strategy = chunkerRegistry.resolveStrategy("md");
console.log(strategy); // "markdown"

// Get all aliases for a strategy
const aliases = chunkerRegistry.getAliasesForStrategy("character");
console.log(aliases); // ["char", "fixed-size", "fixed"]

// Get all registered aliases
const allAliases = chunkerRegistry.getAllAliases();
allAliases.forEach((strategy, alias) => {
  console.log(`${alias} -> ${strategy}`);
});
```

### Checking Strategy Availability

```typescript

// Check if a strategy or alias exists
console.log(chunkerRegistry.hasChunker("markdown")); // true
console.log(chunkerRegistry.hasChunker("md")); // true
console.log(chunkerRegistry.hasChunker("unknown")); // false

// Get default configuration
const defaultConfig = chunkerRegistry.getDefaultConfig("token");
console.log(defaultConfig);
// { maxSize: 512, overlap: 50 }
```

### Registering Custom Chunkers

```typescript

// Register a custom chunker
chunkerRegistry.registerChunker(
  "custom-xml",
  async () => {
    return new MyXMLChunker();
  },
  {
    description: "Custom XML-aware chunker for structured documents",
    defaultConfig: { maxSize: 1000, overlap: 0 },
    supportedOptions: ["maxSize", "overlap", "splitTags", "preserveAttributes"],
    useCases: ["XML documents", "SOAP responses", "Configuration files"],
    aliases: ["xml", "xml-tag"],
  },
);

// Now usable via registry
const xmlChunker = await chunkerRegistry.getChunker("xml");
```

## Supported Strategies

| Strategy            | Aliases                                    | Description                                   | Best For                                |
| ------------------- | ------------------------------------------ | --------------------------------------------- | --------------------------------------- |
| `character`         | `char`, `fixed-size`, `fixed`              | Fixed-size character splitting with overlap   | Simple text, fixed-size requirements    |
| `recursive`         | `recursive-character`, `langchain-default` | Hierarchical separator-based splitting        | General text documents (default choice) |
| `sentence`          | `sent`, `sentence-based`                   | Sentence boundary splitting                   | Q&A applications, NLP tasks             |
| `token`             | `tok`, `tokenized`                         | Token-count based splitting                   | LLM context management, model-specific  |
| `markdown`          | `md`, `markdown-header`                    | Header and structure-aware markdown splitting | Documentation, README files             |
| `html`              | `html-tag`, `web`                          | Semantic HTML tag splitting                   | Web content, HTML documents             |
| `json`              | `json-object`, `structured`                | JSON object boundary splitting                | API responses, structured data          |
| `latex`             | `tex`, `latex-section`                     | Section and environment-aware LaTeX splitting | Academic papers, scientific docs        |
| `semantic`          | `llm`, `ai-semantic`                       | LLM-powered semantic split points             | Advanced semantic understanding         |
| `semantic-markdown` | `semantic-md`, `smart-markdown`            | Markdown + semantic similarity                | Knowledge bases, context-aware docs     |

## Notes

- The registry uses **lazy initialization** - chunkers are registered on first access via `ensureInitialized()`
- All chunker retrieval is **async** due to dynamic imports for lazy loading
- The **singleton pattern** ensures consistent behavior across the application
- Use `resetInstance()` in tests to get a fresh registry state
- The registry extends `BaseRegistry` for consistent lifecycle management

## See Also

- [ChunkerFactory](/docs/chunkerfactory) - Factory for creating configured chunker instances
- [ChunkingStrategy](/docs/type-aliases/chunkingstrategy) - Strategy type definition
- [ChunkerConfig](/docs/type-aliases/chunkerconfig) - Configuration type union
- [ChunkerMetadata](/docs/type-aliases/chunkermetadata) - Metadata type definition
- [Chunker](/docs/interfaces/chunker) - Chunker interface definition
- [MDocument](/docs/mdocument) - Document class with integrated chunking

---

## Variable: VERSION

<!-- Source: api/variables/VERSION.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / VERSION

# Variable: VERSION

> `const` **VERSION**: `"1.0.0"` = `"1.0.0"`

Defined in: [index.ts:125](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/index.ts#L125)

---

## Enumeration: VertexModels

<!-- Source: api/enumerations/VertexModels.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### CLAUDE_4_5_SONNET

> **CLAUDE_4_5_SONNET**: `"claude-sonnet-4-5@20250929"`

Defined in: [constants/enums.ts:292](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L292)

---

### CLAUDE_4_5_HAIKU

> **CLAUDE_4_5_HAIKU**: `"claude-haiku-4-5@20251001"`

Defined in: [constants/enums.ts:293](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L293)

---

### CLAUDE_4_0_SONNET

> **CLAUDE_4_0_SONNET**: `"claude-sonnet-4@20250514"`

Defined in: [constants/enums.ts:296](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L296)

---

### CLAUDE_4_0_OPUS

> **CLAUDE_4_0_OPUS**: `"claude-opus-4@20250514"`

Defined in: [constants/enums.ts:297](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L297)

---

### CLAUDE_3_7_SONNET

> **CLAUDE_3_7_SONNET**: `"claude-3-7-sonnet@20250219"`

Defined in: [constants/enums.ts:300](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L300)

---

### CLAUDE_3_5_SONNET

> **CLAUDE_3_5_SONNET**: `"claude-3-5-sonnet-20241022"`

Defined in: [constants/enums.ts:303](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L303)

---

### CLAUDE_3_5_HAIKU

> **CLAUDE_3_5_HAIKU**: `"claude-3-5-haiku-20241022"`

Defined in: [constants/enums.ts:304](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L304)

---

### CLAUDE_3_SONNET

> **CLAUDE_3_SONNET**: `"claude-3-sonnet-20240229"`

Defined in: [constants/enums.ts:307](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L307)

---

### CLAUDE_3_OPUS

> **CLAUDE_3_OPUS**: `"claude-3-opus-20240229"`

Defined in: [constants/enums.ts:308](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L308)

---

### CLAUDE_3_HAIKU

> **CLAUDE_3_HAIKU**: `"claude-3-haiku-20240307"`

Defined in: [constants/enums.ts:309](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L309)

---

### GEMINI_3_PRO

> **GEMINI_3_PRO**: `"gemini-3-pro"`

Defined in: [constants/enums.ts:313](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L313)

Gemini 3 Pro - Base model with adaptive thinking

---

### GEMINI_3_PRO_PREVIEW_11_2025

> **GEMINI_3_PRO_PREVIEW_11_2025**: `"gemini-3-pro-preview-11-2025"`

Defined in: [constants/enums.ts:315](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L315)

Gemini 3 Pro Preview - Versioned preview (November 2025)

---

### GEMINI_3_PRO_LATEST

> **GEMINI_3_PRO_LATEST**: `"gemini-3-pro-latest"`

Defined in: [constants/enums.ts:317](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L317)

Gemini 3 Pro Latest - Auto-updated alias (always points to latest preview)

---

### GEMINI_3_PRO_PREVIEW

> **GEMINI_3_PRO_PREVIEW**: `"gemini-3-pro-preview"`

Defined in: [constants/enums.ts:319](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L319)

Gemini 3 Pro Preview - Generic preview (legacy)

---

### GEMINI_3_FLASH

> **GEMINI_3_FLASH**: `"gemini-3-flash"`

Defined in: [constants/enums.ts:321](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L321)

Gemini 3 Flash - Base model with adaptive thinking

---

### GEMINI_3_FLASH_PREVIEW

> **GEMINI_3_FLASH_PREVIEW**: `"gemini-3-flash-preview"`

Defined in: [constants/enums.ts:323](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L323)

Gemini 3 Flash Preview - Versioned preview

---

### GEMINI_3_FLASH_LATEST

> **GEMINI_3_FLASH_LATEST**: `"gemini-3-flash-latest"`

Defined in: [constants/enums.ts:325](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L325)

Gemini 3 Flash Latest - Auto-updated alias (always points to latest preview)

---

### GEMINI_2_5_PRO

> **GEMINI_2_5_PRO**: `"gemini-2.5-pro"`

Defined in: [constants/enums.ts:328](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L328)

---

### GEMINI_2_5_FLASH

> **GEMINI_2_5_FLASH**: `"gemini-2.5-flash"`

Defined in: [constants/enums.ts:329](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L329)

---

### GEMINI_2_5_FLASH_LITE

> **GEMINI_2_5_FLASH_LITE**: `"gemini-2.5-flash-lite"`

Defined in: [constants/enums.ts:330](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L330)

---

### GEMINI_2_5_FLASH_IMAGE

> **GEMINI_2_5_FLASH_IMAGE**: `"gemini-2.5-flash-image"`

Defined in: [constants/enums.ts:331](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L331)

---

### GEMINI_2_0_FLASH

> **GEMINI_2_0_FLASH**: `"gemini-2.0-flash"`

Defined in: [constants/enums.ts:334](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L334)

---

### GEMINI_2_0_FLASH_001

> **GEMINI_2_0_FLASH_001**: `"gemini-2.0-flash-001"`

Defined in: [constants/enums.ts:335](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L335)

---

### GEMINI_2_0_FLASH_LITE

> **GEMINI_2_0_FLASH_LITE**: `"gemini-2.0-flash-lite"`

Defined in: [constants/enums.ts:337](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L337)

Gemini 2.0 Flash Lite - GA, production-ready, cost-optimized

---

### GEMINI_1_5_PRO

> **GEMINI_1_5_PRO**: `"gemini-1.5-pro-002"`

Defined in: [constants/enums.ts:340](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L340)

---

### GEMINI_1_5_FLASH

> **GEMINI_1_5_FLASH**: `"gemini-1.5-flash-002"`

Defined in: [constants/enums.ts:341](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/constants/enums.ts#L341)

---

## Type Alias: AuthorizationUrlResult

<!-- Source: api/type-aliases/AuthorizationUrlResult.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### state

> **state**: `string`

Defined in: [types/mcpTypes.ts:915](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L915)

---

### codeVerifier?

> `optional` **codeVerifier**: `string`

Defined in: [types/mcpTypes.ts:916](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L916)

---

## Function: calculateExpiresAt()

<!-- Source: api/functions/calculateExpiresAt.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / calculateExpiresAt

# Function: calculateExpiresAt()

> **calculateExpiresAt**(`expiresIn`): `number`

Defined in: [mcp/auth/tokenStorage.ts:165](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L165)

Calculate token expiration timestamp from expires_in value

## Parameters

### expiresIn

`number`

Token lifetime in seconds

## Returns

`number`

Expiration timestamp (Unix epoch in milliseconds)

---

## Class: CircuitBreakerManager

<!-- Source: api/classes/CircuitBreakerManager.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### removeBreaker()

> **removeBreaker**(`name`): `boolean`

Defined in: [mcp/mcpCircuitBreaker.ts:384](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L384)

Remove a circuit breaker and clean up its resources

#### Parameters

##### name

`string`

#### Returns

`boolean`

---

### getBreakerNames()

> **getBreakerNames**(): `string`[]

Defined in: [mcp/mcpCircuitBreaker.ts:402](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L402)

Get all circuit breaker names

#### Returns

`string`[]

---

### getAllStats()

> **getAllStats**(): `Record`\

Defined in: [mcp/mcpCircuitBreaker.ts:409](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L409)

Get statistics for all circuit breakers

#### Returns

`Record`\

---

### resetAll()

> **resetAll**(): `void`

Defined in: [mcp/mcpCircuitBreaker.ts:422](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L422)

Reset all circuit breakers

#### Returns

`void`

---

### getHealthSummary()

> **getHealthSummary**(): `object`

Defined in: [mcp/mcpCircuitBreaker.ts:433](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L433)

Get health summary

#### Returns

`object`

##### totalBreakers

> **totalBreakers**: `number`

##### closedBreakers

> **closedBreakers**: `number`

##### openBreakers

> **openBreakers**: `number`

##### halfOpenBreakers

> **halfOpenBreakers**: `number`

##### unhealthyBreakers

> **unhealthyBreakers**: `string`[]

---

### destroyAll()

> **destroyAll**(): `void`

Defined in: [mcp/mcpCircuitBreaker.ts:475](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L475)

Destroy all circuit breakers and clean up their resources
This should be called during application shutdown to prevent memory leaks

#### Returns

`void`

---

## Variable: dynamicModelProvider

<!-- Source: api/variables/dynamicModelProvider.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / dynamicModelProvider

# Variable: dynamicModelProvider

> `const` **dynamicModelProvider**: `DynamicModelProvider`

Defined in: [core/dynamicModels.ts:507](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/core/dynamicModels.ts#L507)

---

## Type Alias: Chunk

<!-- Source: api/type-aliases/Chunk.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### text

> **text**: `string`

Defined in: [lib/rag/types.ts:68](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L68)

The text content of the chunk

---

### metadata

> **metadata**: [`ChunkMetadata`](/docs/chunkmetadata)

Defined in: [lib/rag/types.ts:70](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L70)

Metadata associated with the chunk, including source document information and position

---

### embedding?

> `optional` **embedding**: `number[]`

Defined in: [lib/rag/types.ts:72](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L72)

Optional embedding vector (populated after embedding generation)

## Example

```typescript

const chunk: Chunk = {
  id: "doc-001-chunk-0",
  text: "RAG (Retrieval-Augmented Generation) enhances LLM responses by incorporating relevant context from external knowledge bases.",
  metadata: {
    documentId: "doc-001",
    source: "rag-overview.md",
    chunkIndex: 0,
    totalChunks: 5,
    startPosition: 0,
    endPosition: 125,
    documentType: "markdown"
  },
  embedding: [0.023, -0.156, 0.089, ...] // 1536-dimensional vector
};
```

## Since

v8.44.0

---

## Function: chunkText()

<!-- Source: api/functions/chunkText.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / chunkText

# Function: chunkText()

> **chunkText**(`text`, `strategy?`, `config?`): `Promise`

Defined in: [lib/rag/chunking/chunkerRegistry.ts:207](https://github.com/juspay/neurolink/blob/main/src/lib/rag/chunking/chunkerRegistry.ts#L207)

Convenience function to chunk text with a given strategy

This is a simple wrapper around the ChunkerRegistry that handles
chunker instantiation automatically. Ideal for one-off chunking operations
where you don't need to reuse the chunker instance.

## Parameters

### text

`string`

The text content to chunk

### strategy?

`ChunkingStrategy`

Chunking strategy to use (default: `"recursive"`)

Available strategies:

- `character` - Simple character-based splitting
- `recursive` - Smart splitting with ordered separators (recommended default)
- `sentence` - Sentence-boundary aware splitting
- `token` - Token-count based splitting for LLM compatibility
- `markdown` - Markdown structure-aware splitting
- `html` - HTML tag-aware splitting
- `json` - JSON structure-aware splitting
- `latex` - LaTeX environment-aware splitting
- `semantic` - LLM-powered semantic splitting

### config?

`Record`

Strategy-specific configuration options

## Returns

`Promise`

Array of Chunk objects, each containing:

- `id` - Unique chunk identifier
- `text` - The chunk text content
- `metadata` - Chunk metadata including position and source info

## Examples

### Basic text chunking

```typescript

const text = "Your long document text here...";
const chunks = await chunkText(text);

console.log(`Created ${chunks.length} chunks`);
chunks.forEach((chunk, i) => {
  console.log(`Chunk ${i + 1}: ${chunk.text.slice(0, 50)}...`);
});
```

### Chunking with specific strategy

```typescript

// Use sentence chunking for Q&A applications
const chunks = await chunkText(articleText, "sentence", {
  maxSize: 500,
  minSentences: 2,
});
```

### Processing markdown documentation

```typescript

const readmeContent = fs.readFileSync("README.md", "utf-8");

const chunks = await chunkText(readmeContent, "markdown", {
  maxSize: 1000,
  headerLevels: [1, 2, 3],
  preserveCodeBlocks: true,
  includeHeader: true,
});

// Each chunk will be a logical section from the markdown
for (const chunk of chunks) {
  console.log(`Section: ${chunk.metadata.header || "Introduction"}`);
  console.log(`Content: ${chunk.text.slice(0, 100)}...`);
}
```

### Token-aware chunking for embeddings

```typescript

// Ensure chunks fit within embedding model limits
const chunks = await chunkText(document, "token", {
  maxTokens: 512,
  tokenOverlap: 50,
  tokenizer: "cl100k_base", // GPT-4 tokenizer
});
```

### Processing JSON data

```typescript

const jsonData = JSON.stringify(apiResponse);

const chunks = await chunkText(jsonData, "json", {
  maxSize: 800,
  maxDepth: 5,
  includeJsonPath: true,
});

// Each chunk includes its JSON path in metadata
chunks.forEach((chunk) => {
  console.log(`Path: ${chunk.metadata.jsonPath}`);
});
```

## Since

v8.44.0

## See Also

- [createChunker](/docs/createchunker) - Create reusable chunker instances
- [getAvailableStrategies](/docs/getavailablestrategies) - List available strategies
- [Chunk](/docs/type-aliases/chunk) - Chunk type definition
- [ChunkingStrategy](/docs/type-aliases/chunkingstrategy) - Strategy type definition

---

## Class: FileTokenStorage

<!-- Source: api/classes/FileTokenStorage.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### saveTokens()

> **saveTokens**(`serverId`, `tokens`): `Promise`\

Defined in: [mcp/auth/tokenStorage.ts:117](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L117)

Save tokens for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

##### tokens

[`OAuthTokens`](/docs/api/type-aliases/OAuthTokens)

OAuth tokens to store

#### Returns

`Promise`\

#### Implementation of

`TokenStorage.saveTokens`

---

### deleteTokens()

> **deleteTokens**(`serverId`): `Promise`\

Defined in: [mcp/auth/tokenStorage.ts:123](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L123)

Delete stored tokens for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

#### Returns

`Promise`\

#### Implementation of

`TokenStorage.deleteTokens`

---

### hasTokens()

> **hasTokens**(`serverId`): `Promise`\

Defined in: [mcp/auth/tokenStorage.ts:129](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L129)

Check if tokens exist for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

#### Returns

`Promise`\

True if tokens exist

#### Implementation of

`TokenStorage.hasTokens`

---

### clearAll()

> **clearAll**(): `Promise`\

Defined in: [mcp/auth/tokenStorage.ts:134](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L134)

Clear all stored tokens

#### Returns

`Promise`\

#### Implementation of

`TokenStorage.clearAll`

---

## Variable: globalCircuitBreakerManager

<!-- Source: api/variables/globalCircuitBreakerManager.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / globalCircuitBreakerManager

# Variable: globalCircuitBreakerManager

> `const` **globalCircuitBreakerManager**: [`CircuitBreakerManager`](/docs/api/classes/CircuitBreakerManager)

Defined in: [mcp/mcpCircuitBreaker.ts:486](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L486)

MCP (Model Context Protocol) Plugin Ecosystem

Extensible plugin architecture based on research blueprint for
transforming NeuroLink into a Universal AI Development Platform.

## Example

```typescript

// Initialize the ecosystem
await mcpEcosystem.initialize();

// List available plugins
const plugins = await mcpEcosystem.list();

// Use filesystem operations
const content = await readFile("README.md");
await writeFile("output.txt", "Hello from MCP!");
```

---

## Type Alias: ChunkMetadata

<!-- Source: api/type-aliases/ChunkMetadata.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### source?

> `optional` **source**: `string`

Defined in: [lib/rag/types.ts:32](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L32)

Original document filename or URL

---

### chunkIndex

> **chunkIndex**: `number`

Defined in: [lib/rag/types.ts:34](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L34)

Position in the original document (0-indexed)

---

### totalChunks?

> `optional` **totalChunks**: `number`

Defined in: [lib/rag/types.ts:36](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L36)

Total number of chunks from the document

---

### startPosition?

> `optional` **startPosition**: `number`

Defined in: [lib/rag/types.ts:38](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L38)

Start character position in original text

---

### endPosition?

> `optional` **endPosition**: `number`

Defined in: [lib/rag/types.ts:40](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L40)

End character position in original text

---

### documentType?

> `optional` **documentType**: [`DocumentType`](/docs/documenttype)

Defined in: [lib/rag/types.ts:42](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L42)

Document type (markdown, html, json, etc.)

---

### custom?

> `optional` **custom**: `Record`

Defined in: [lib/rag/types.ts:44](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L44)

Custom metadata from extraction

---

### title?

> `optional` **title**: `string`

Defined in: [lib/rag/types.ts:46](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L46)

Extracted title (from metadata extraction)

---

### summary?

> `optional` **summary**: `string`

Defined in: [lib/rag/types.ts:48](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L48)

Extracted summary (from metadata extraction)

---

### keywords?

> `optional` **keywords**: `string[]`

Defined in: [lib/rag/types.ts:50](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L50)

Extracted keywords (from metadata extraction)

---

### headerLevel?

> `optional` **headerLevel**: `number`

Defined in: [lib/rag/types.ts:52](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L52)

Header level for markdown/html chunks

---

### header?

> `optional` **header**: `string`

Defined in: [lib/rag/types.ts:54](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L54)

Header text for structured documents

---

### jsonPath?

> `optional` **jsonPath**: `string`

Defined in: [lib/rag/types.ts:56](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L56)

JSON path for JSON chunks

---

### latexEnvironment?

> `optional` **latexEnvironment**: `string`

Defined in: [lib/rag/types.ts:58](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L58)

LaTeX environment name

## Example

```typescript

const metadata: ChunkMetadata = {
  documentId: "doc-001",
  source: "technical-docs/api-guide.md",
  chunkIndex: 2,
  totalChunks: 15,
  startPosition: 1024,
  endPosition: 2048,
  documentType: "markdown",
  title: "API Authentication",
  summary: "Guide for implementing OAuth2 authentication",
  keywords: ["authentication", "OAuth2", "API", "security"],
  headerLevel: 2,
  header: "## Authentication Methods",
  custom: {
    author: "Engineering Team",
    lastUpdated: "2024-01-15",
  },
};
```

## Since

v8.44.0

---

## Function: createAIProvider()

<!-- Source: api/functions/createAIProvider.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / createAIProvider

# Function: createAIProvider()

> **createAIProvider**(`providerName?`, `modelName?`): `Promise`\

Defined in: [index.ts:158](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/index.ts#L158)

Quick start factory function for creating AI provider instances.

Creates a configured AI provider instance ready for immediate use.
Supports all 13 providers: OpenAI, Anthropic, Google AI Studio,
Google Vertex, AWS Bedrock, AWS SageMaker, Azure OpenAI, Hugging Face,
LiteLLM, Mistral, Ollama, OpenAI Compatible, and OpenRouter.

## Parameters

### providerName?

`string`

The AI provider name (e.g., 'bedrock', 'vertex', 'openai')

### modelName?

`string`

Optional model name to override provider default

## Returns

`Promise`\

Promise resolving to configured AI provider instance

## Examples

```typescript

const provider = await createAIProvider("bedrock");
const result = await provider.stream({ input: { text: "Hello, AI!" } });
```

```typescript
const provider = await createAIProvider("vertex", "gemini-3-flash");
```

## See

- [AIProviderFactory.createProvider](/docs/api/classes/AIProviderFactory)
- [NeuroLink](/docs/api/classes/NeuroLink) for the main SDK class

## Since

1.0.0

---

## Class: GraphRAG

<!-- Source: api/classes/GraphRAG.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

------ | ----------------------------------- | ------------------------------ |
| `config?` | [`GraphRAGConfig`](#graphragconfig) | Optional configuration options |

#### Returns

`GraphRAG`

#### Example

```typescript
// Default configuration (dimension: 1536, threshold: 0.7)
const graph = new GraphRAG();

// Custom configuration
const graph = new GraphRAG({
  dimension: 768, // For smaller embedding models
  threshold: 0.8, // Stricter similarity threshold
});
```

## Methods

### createGraph()

> **createGraph**(`chunks`, `embeddings`): `void`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:46](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L46)

Create a knowledge graph from document chunks and their embeddings. This clears any existing graph data and builds a new graph from scratch.

#### Parameters

| Parameter    | Type               | Description                     |
| ------------ | ------------------ | ------------------------------- |
| `chunks`     | `GraphChunk[]`     | Array of document chunks        |
| `embeddings` | `GraphEmbedding[]` | Corresponding embedding vectors |

#### Returns

`void`

#### Throws

`Error` - If chunks and embeddings arrays have different lengths

#### Example

```typescript
const chunks = documents.map((doc) => ({
  text: doc.content,
  metadata: doc.meta,
}));
const embeddings = await embedder.embedMany(chunks.map((c) => c.text));

graph.createGraph(
  chunks,
  embeddings.map((v) => ({ vector: v })),
);
```

---

### query()

> **query**(`params`): `RankedNode[]`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:116](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L116)

Query the graph using Random Walk with Restart algorithm. Combines initial similarity scores with graph traversal to find contextually relevant nodes.

#### Parameters

| Parameter | Type                                    | Description      |
| --------- | --------------------------------------- | ---------------- |
| `params`  | [`GraphQueryParams`](#graphqueryparams) | Query parameters |

#### Returns

`RankedNode[]`

Array of ranked nodes sorted by relevance score

#### Example

```typescript
const queryEmbedding = await embedder.embed("What is machine learning?");

const results = graph.query({
  query: queryEmbedding,
  topK: 10,
  randomWalkSteps: 100,
  restartProb: 0.15,
});

results.forEach((node) => {
  console.log(`[${node.score.toFixed(3)}] ${node.content}`);
});
```

---

### addNode()

> **addNode**(`chunk`, `embedding`): `string`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:213](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L213)

Add a single node to the graph. Automatically creates edges to existing nodes based on similarity threshold.

#### Parameters

| Parameter   | Type             | Description      |
| ----------- | ---------------- | ---------------- |
| `chunk`     | `GraphChunk`     | Document chunk   |
| `embedding` | `GraphEmbedding` | Embedding vector |

#### Returns

`string`

The unique ID of the newly created node

#### Example

```typescript
const newDoc = {
  text: "Attention mechanisms allow models to focus...",
  metadata: { topic: "transformers" },
};
const embedding = await embedder.embed(newDoc.text);

const nodeId = graph.addNode(newDoc, { vector: embedding });
console.log(`Created node: ${nodeId}`);
```

---

### removeNode()

> **removeNode**(`id`): `boolean`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:266](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L266)

Remove a node and all its edges from the graph.

#### Parameters

| Parameter | Type     | Description       |
| --------- | -------- | ----------------- |
| `id`      | `string` | Node ID to remove |

#### Returns

`boolean`

`true` if node was removed, `false` if node was not found

#### Example

```typescript
const removed = graph.removeNode("node-uuid-123");
if (removed) {
  console.log("Node successfully removed");
}
```

---

### getNode()

> **getNode**(`id`): `GraphNode | undefined`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:306](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L306)

Get a node by its ID.

#### Parameters

| Parameter | Type     | Description |
| --------- | -------- | ----------- |
| `id`      | `string` | Node ID     |

#### Returns

`GraphNode | undefined`

The node if found, undefined otherwise

---

### getAllNodes()

> **getAllNodes**(): `GraphNode[]`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:313](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L313)

Get all nodes in the graph.

#### Returns

`GraphNode[]`

Array of all graph nodes

---

### getEdges()

> **getEdges**(`nodeId`): `GraphEdge[]`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:320](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L320)

Get all edges for a specific node.

#### Parameters

| Parameter | Type     | Description |
| --------- | -------- | ----------- |
| `nodeId`  | `string` | Node ID     |

#### Returns

`GraphEdge[]`

Array of edges originating from the node

---

### getStats()

> **getStats**(): `GraphStats`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:289](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L289)

Get graph statistics including node count, edge count, and average degree.

#### Returns

[`GraphStats`](#graphstats)

Graph statistics object

#### Example

```typescript
const stats = graph.getStats();
console.log(`Graph has ${stats.nodeCount} nodes and ${stats.edgeCount} edges`);
console.log(`Average connections per node: ${stats.avgDegree.toFixed(2)}`);
```

---

### findConnectedComponents()

> **findConnectedComponents**(): `string[][]`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:327](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L327)

Find connected components in the graph using BFS traversal. Useful for identifying clusters of related documents.

#### Returns

`string[][]`

Array of components, where each component is an array of node IDs

#### Example

```typescript
const components = graph.findConnectedComponents();

if (components.length > 1) {
  console.log(`Graph has ${components.length} disconnected clusters`);
  components.forEach((comp, i) => {
    console.log(`Cluster ${i + 1}: ${comp.length} documents`);
  });
}
```

---

### updateThreshold()

> **updateThreshold**(`threshold`): `void`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:414](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L414)

Update the similarity threshold and rebuild all edges. Useful for tuning graph density without re-creating nodes.

#### Parameters

| Parameter   | Type     | Description                           |
| ----------- | -------- | ------------------------------------- |
| `threshold` | `number` | New similarity threshold (0.0 to 1.0) |

#### Returns

`void`

#### Example

```typescript
// Start with a lower threshold
const graph = new GraphRAG({ threshold: 0.6 });
graph.createGraph(chunks, embeddings);

console.log(`Edges with 0.6 threshold: ${graph.getStats().edgeCount}`);

// Increase threshold for sparser graph
graph.updateThreshold(0.8);
console.log(`Edges with 0.8 threshold: ${graph.getStats().edgeCount}`);
```

---

### toJSON()

> **toJSON**(): `{ nodes: GraphNode[]; edges: Array; config: { dimension: number; threshold: number } }`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:459](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L459)

Serialize the graph to a JSON-compatible object. Includes all nodes, edges, and configuration.

#### Returns

`object`

JSON-serializable graph representation

| Property | Type                                            | Description                     |
| -------- | ----------------------------------------------- | ------------------------------- |
| `nodes`  | `GraphNode[]`                                   | All graph nodes with embeddings |
| `edges`  | `Array` | Edge lists keyed by source node |
| `config` | `{ dimension: number; threshold: number }`      | Graph configuration             |

#### Example

```typescript
const data = graph.toJSON();
const json = JSON.stringify(data);
await fs.writeFile("graph.json", json);
```

---

### fromJSON() (static)

> **static fromJSON**(`json`): `GraphRAG`

Defined in: [src/lib/rag/graphRag/graphRAG.ts:480](https://github.com/juspay/neurolink/blob/main/src/lib/rag/graphRag/graphRAG.ts#L480)

Create a GraphRAG instance from serialized JSON data.

#### Parameters

| Parameter | Type                                                                                                                             | Description           |
| --------- | -------------------------------------------------------------------------------------------------------------------------------- | --------------------- |
| `json`    | `{ nodes: GraphNode[]; edges: Array; config: { dimension: number; threshold: number } }` | Serialized graph data |

#### Returns

`GraphRAG`

Restored GraphRAG instance

#### Example

```typescript
const json = JSON.parse(await fs.readFile("graph.json", "utf-8"));
const graph = GraphRAG.fromJSON(json);

// Graph is ready for querying
const results = graph.query({ query: embedding, topK: 5 });
```

## Configuration

### GraphRAGConfig

Configuration options for GraphRAG constructor.

| Option      | Type     | Default | Description                                             |
| ----------- | -------- | ------- | ------------------------------------------------------- |
| `dimension` | `number` | `1536`  | Embedding vector dimension (must match your embeddings) |
| `threshold` | `number` | `0.7`   | Similarity threshold for edge creation (0.0 to 1.0)     |

### GraphQueryParams

Parameters for the `query()` method.

| Option            | Type       | Default      | Description                                          |
| ----------------- | ---------- | ------------ | ---------------------------------------------------- |
| `query`           | `number[]` | **required** | Query embedding vector                               |
| `topK`            | `number`   | `10`         | Number of results to return                          |
| `randomWalkSteps` | `number`   | `100`        | Number of random walk iterations                     |
| `restartProb`     | `number`   | `0.15`       | Probability of restarting walk at query-similar node |

## Types

### GraphNode

Represents a node in the knowledge graph.

| Property    | Type                      | Description              |
| ----------- | ------------------------- | ------------------------ |
| `id`        | `string`                  | Unique node identifier   |
| `content`   | `string`                  | Text content of the node |
| `metadata`  | `Record` | Associated metadata      |
| `embedding` | `number[] \| undefined`   | Embedding vector         |

### GraphEdge

Represents an edge (relationship) between nodes.

| Property | Type                  | Description                     |
| -------- | --------------------- | ------------------------------- |
| `source` | `string`              | Source node ID                  |
| `target` | `string`              | Target node ID                  |
| `weight` | `number`              | Edge weight (similarity)        |
| `type`   | `string \| undefined` | Edge type (default: "semantic") |

### GraphChunk

Input format for document chunks.

| Property   | Type                                   | Description        |
| ---------- | -------------------------------------- | ------------------ |
| `text`     | `string`                               | Chunk text content |
| `metadata` | `Record \| undefined` | Optional metadata  |

### GraphEmbedding

Input format for embedding vectors.

| Property | Type       | Description      |
| -------- | ---------- | ---------------- |
| `vector` | `number[]` | Embedding vector |

### RankedNode

Result format from graph queries.

| Property   | Type                      | Description           |
| ---------- | ------------------------- | --------------------- |
| `id`       | `string`                  | Node ID               |
| `content`  | `string`                  | Node text content     |
| `metadata` | `Record` | Node metadata         |
| `score`    | `number`                  | Relevance score (0-1) |

### GraphStats

Graph statistics from `getStats()`.

| Property    | Type     | Description                  |
| ----------- | -------- | ---------------------------- |
| `nodeCount` | `number` | Total number of nodes        |
| `edgeCount` | `number` | Total number of edges        |
| `avgDegree` | `number` | Average edges per node       |
| `threshold` | `number` | Current similarity threshold |

## Algorithm Details

### Random Walk with Restart (RWR)

The query algorithm combines direct similarity with graph structure:

1. **Initial Ranking**: Compute cosine similarity between query embedding and all nodes
2. **Starting Nodes**: Select top-5 most similar nodes as walk starting points
3. **Random Walk**: Perform random walk iterations:
   - With probability `restartProb`: Jump to a query-similar node
   - Otherwise: Follow an edge weighted by similarity
4. **Visit Counting**: Track how often each node is visited during walks
5. **Score Combination**: Final score = 0.6 × similarity + 0.4 × visit frequency
6. **Return**: Top-K nodes by combined score

This approach finds documents that are both directly relevant and contextually connected to relevant documents.

## See Also

- [RAGPipeline](/docs/ragpipeline) - High-level RAG orchestration with Graph RAG support
- [InMemoryVectorStore](/docs/inmemoryvectorstore) - Vector storage for embeddings
- [MDocument](/docs/mdocument) - Document processing and chunking

---

## Variable: globalRateLimiterManager

<!-- Source: api/variables/globalRateLimiterManager.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / globalRateLimiterManager

# Variable: globalRateLimiterManager

> `const` **globalRateLimiterManager**: [`RateLimiterManager`](/docs/api/classes/RateLimiterManager)

Defined in: [mcp/httpRateLimiter.ts:460](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L460)

Global rate limiter manager instance
Use this for application-wide rate limiting management

---

## Type Alias: ChunkParams

<!-- Source: api/type-aliases/ChunkParams.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### config?

> `optional` **config**: [`ChunkerConfig`](/docs/chunkerconfig)

Strategy-specific configuration options including maxSize, overlap, and strategy-specific settings.

---

### extract?

> `optional` **extract**: [`ExtractParams`](/docs/extractparams)

Metadata extraction options to apply during chunking

## Example

```typescript

const doc = MDocument.fromMarkdown(content);

// Basic chunking with defaults
await doc.chunk();

// Recursive chunking with custom settings
const params: ChunkParams = {
  strategy: "recursive",
  config: {
    maxSize: 1000,
    overlap: 200,
    separators: ["\n\n", "\n", ". ", " "],
  },
};
await doc.chunk(params);

// Markdown-aware chunking
await doc.chunk({
  strategy: "markdown",
  config: {
    headerLevels: [1, 2, 3],
    preserveCodeBlocks: true,
    includeHeader: true,
  },
});

// Token-based chunking for LLM context windows
await doc.chunk({
  strategy: "token",
  config: {
    maxTokens: 512,
    tokenOverlap: 50,
    tokenizer: "cl100k_base",
  },
});

// Semantic chunking with LLM
await doc.chunk({
  strategy: "semantic",
  config: {
    modelName: "gpt-4o-mini",
    provider: "openai",
    similarityThreshold: 0.8,
  },
});
```

---

## Function: createAIProviderWithFallback()

<!-- Source: api/functions/createAIProviderWithFallback.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / createAIProviderWithFallback

# Function: createAIProviderWithFallback()

> **createAIProviderWithFallback**(`primaryProvider?`, `fallbackProvider?`, `modelName?`): `Promise`\\>

Defined in: [index.ts:207](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/index.ts#L207)

Create provider with automatic fallback for production resilience.

Creates both primary and fallback provider instances for high-availability
deployments. Automatically switches to fallback on primary provider failure.

## Parameters

### primaryProvider?

`string`

Primary AI provider name (default: 'bedrock')

### fallbackProvider?

`string`

Fallback AI provider name (default: 'vertex')

### modelName?

`string`

Optional model name for both providers

## Returns

`Promise`\\>

Promise resolving to object with primary and fallback providers

## Examples

```typescript

const { primary, fallback } = await createAIProviderWithFallback(
  "bedrock",
  "vertex",
);

try {
  const result = await primary.generate({ input: { text: "Hello!" } });
} catch (error) {
  // Automatically use fallback
  const result = await fallback.generate({ input: { text: "Hello!" } });
}
```

```typescript
const { primary, fallback } = await createAIProviderWithFallback(
  "vertex", // Primary: US region
  "bedrock", // Fallback: Global
  "claude-3-sonnet",
);
```

## See

[AIProviderFactory.createProviderWithFallback](/docs/api/classes/AIProviderFactory)

## Since

1.0.0

---

## Class: HTTPRateLimiter

<!-- Source: api/classes/HTTPRateLimiter.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### tryAcquire()

> **tryAcquire**(): `boolean`

Defined in: [mcp/httpRateLimiter.ts:163](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L163)

Try to acquire a token without waiting

#### Returns

`boolean`

true if a token was acquired, false otherwise

---

### handleRateLimitResponse()

> **handleRateLimitResponse**(`headers`): `number`

Defined in: [mcp/httpRateLimiter.ts:189](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L189)

Handle rate limit response headers from server
Parses Retry-After header and returns wait time in milliseconds

#### Parameters

##### headers

`Headers`

Response headers from the server

#### Returns

`number`

Wait time in milliseconds, or 0 if no rate limit headers found

---

### getRemainingTokens()

> **getRemainingTokens**(): `number`

Defined in: [mcp/httpRateLimiter.ts:252](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L252)

Get the number of remaining tokens

#### Returns

`number`

Current number of available tokens

---

### reset()

> **reset**(): `void`

Defined in: [mcp/httpRateLimiter.ts:261](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L261)

Reset the rate limiter to initial state
Useful for testing or when server indicates rate limits have been reset

#### Returns

`void`

---

### getStats()

> **getStats**(): `RateLimiterStats`

Defined in: [mcp/httpRateLimiter.ts:281](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L281)

Get current rate limiter statistics

#### Returns

`RateLimiterStats`

---

### updateConfig()

> **updateConfig**(`config`): `void`

Defined in: [mcp/httpRateLimiter.ts:296](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L296)

Update configuration dynamically
Useful when server provides rate limit information

#### Parameters

##### config

`Partial`\

#### Returns

`void`

---

### getConfig()

> **getConfig**(): `Readonly`\

Defined in: [mcp/httpRateLimiter.ts:304](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L304)

Get current configuration

#### Returns

`Readonly`\

---

## Variable: mcpLogger

<!-- Source: api/variables/mcpLogger.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / mcpLogger

# Variable: mcpLogger

> `const` **mcpLogger**: `NeuroLinkLogger` = `neuroLinkLogger`

Defined in: [utils/logger.ts:409](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/utils/logger.ts#L409)

MCP compatibility exports - all use the same unified logger instance.
These exports maintain backward compatibility with code that expects
separate loggers for different MCP components, while actually using
the same underlying logger instance.

---

## Type Alias: ChunkerConfig

<!-- Source: api/type-aliases/ChunkerConfig.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### minSize?

> `optional` **minSize**: `number`

Minimum chunk size

---

### overlap?

> `optional` **overlap**: `number`

Overlap between consecutive chunks

---

### trimWhitespace?

> `optional` **trimWhitespace**: `boolean`

Whether to trim whitespace from chunks

---

### metadata?

> `optional` **metadata**: `Record`

Custom metadata to add to all chunks

---

### preserveMetadata?

> `optional` **preserveMetadata**: `boolean`

Whether to preserve metadata from source document

## Strategy-Specific Configurations

### CharacterChunkerConfig

For `"character"` strategy:

- `separator?`: Character separator (default: "")
- `keepSeparator?`: Keep separator in chunks

### RecursiveChunkerConfig

For `"recursive"` strategy:

- `separators?`: Ordered list of separators to try (default: ["\n\n", "\n", " ", ""])
- `isSeparatorRegex?`: Whether separators are regex patterns
- `keepSeparators?`: Whether to keep separators in the output chunks

### SentenceChunkerConfig

For `"sentence"` strategy:

- `sentenceEnders?`: Sentence ending characters (default: [".", "!", "?", "\n"])
- `minSentences?`: Minimum sentences per chunk
- `maxSentences?`: Maximum sentences per chunk

### TokenChunkerConfig

For `"token"` strategy:

- `tokenizer?`: Tokenizer to use (default: "cl100k_base" for GPT models)
- `modelName?`: Model name for token counting (alternative to tokenizer)
- `maxTokens?`: Maximum tokens per chunk
- `tokenOverlap?`: Token overlap between chunks

### MarkdownChunkerConfig

For `"markdown"` strategy:

- `headerLevels?`: Header levels to split on (default: [1, 2, 3])
- `preserveCodeBlocks?`: Include code blocks as single chunks
- `includeHeader?`: Include the header in the chunk content
- `stripFormatting?`: Strip markdown formatting from output

### HTMLChunkerConfig

For `"html"` strategy:

- `splitTags?`: Tags to split on (default: ["div", "p", "section", "article"])
- `preserveTags?`: Tags to preserve as single chunks
- `extractTextOnly?`: Extract text only (strip HTML tags)
- `includeTagMetadata?`: Include tag metadata in chunks

### JSONChunkerConfig

For `"json"` strategy:

- `maxDepth?`: Maximum depth to traverse
- `splitKeys?`: Keys to split on (arrays/objects at these keys become chunks)
- `preserveKeys?`: Keys to preserve as single units
- `includeJsonPath?`: Include JSON path in metadata

### LaTeXChunkerConfig

For `"latex"` strategy:

- `splitEnvironments?`: Environments to split on (default: ["section", "subsection", "chapter"])
- `preserveMath?`: Preserve math environments as single chunks
- `includePreamble?`: Include preamble as separate chunk

### SemanticChunkerConfig

For `"semantic"` and `"semantic-markdown"` strategies:

- `joinThreshold?`: Minimum tokens before considering a split
- `modelName?`: Model for semantic analysis
- `provider?`: Provider for the model
- `semanticPrompt?`: Custom prompt for semantic grouping
- `maxHeaderDepth?`: Maximum header depth to consider for grouping
- `similarityThreshold?`: Similarity threshold for grouping (0-1)

## Example

```typescript

// Recursive chunking configuration
const recursiveConfig: ChunkerConfig = {
  maxSize: 512,
  overlap: 50,
  separators: ["\n\n", "\n", ". ", " "],
  trimWhitespace: true,
};

// Markdown chunking configuration
const markdownConfig: ChunkerConfig = {
  maxSize: 1000,
  headerLevels: [1, 2, 3],
  preserveCodeBlocks: true,
  includeHeader: true,
};

// Token-based chunking configuration
const tokenConfig: ChunkerConfig = {
  maxTokens: 256,
  tokenOverlap: 20,
  tokenizer: "cl100k_base",
};

// Semantic chunking configuration
const semanticConfig: ChunkerConfig = {
  maxSize: 1000,
  similarityThreshold: 0.8,
  modelName: "gpt-4o-mini",
  provider: "openai",
};

const doc = MDocument.fromMarkdown(content);
const chunks = await doc.chunk({
  strategy: "markdown",
  config: markdownConfig,
});
```

## Since

v8.44.0

---

## Function: createBestAIProvider()

<!-- Source: api/functions/createBestAIProvider.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / createBestAIProvider

# Function: createBestAIProvider()

> **createBestAIProvider**(`requestedProvider?`, `modelName?`): `Promise`\

Defined in: [index.ts:260](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/index.ts#L260)

Create the best available provider based on environment configuration.

Intelligently selects the best provider based on available API keys
in environment variables. Automatically detects and configures the
optimal provider without manual configuration.

## Parameters

### requestedProvider?

`string`

Optional preferred provider name

### modelName?

`string`

Optional model name

## Returns

`Promise`\

Promise resolving to the best configured provider

## Examples

```typescript

// Automatically uses provider with configured API key
const provider = await createBestAIProvider();
const result = await provider.generate({ input: { text: "Hello!" } });
```

```typescript
// Tries to use OpenAI, falls back to available provider
const provider = await createBestAIProvider("openai");
```

## Remarks

Environment variables checked (in order):

- OPENAI_API_KEY
- ANTHROPIC_API_KEY
- GOOGLE_API_KEY
- VERTEX_PROJECT_ID + credentials
- AWS credentials for Bedrock
- And more...

## See

- [AIProviderFactory.createBestProvider](/docs/api/classes/AIProviderFactory)
- [getBestProvider](/docs/api/functions/getBestProvider) for provider detection utility

## Since

1.0.0

---

## Class: InMemoryBM25Index

<!-- Source: api/classes/InMemoryBM25Index.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### addDocuments()

> **addDocuments**(`documents`): `Promise`\

Defined in: [rag/retrieval/hybridSearch.ts:114](https://github.com/juspay/neurolink/blob/main/src/lib/rag/retrieval/hybridSearch.ts#L114)

Add documents to the BM25 index.

Each document is:

1. Tokenized (lowercase, punctuation removed, whitespace split)
2. Stored with its tokens and metadata
3. Used to recalculate the average document length for BM25 scoring

#### Parameters

##### documents

`Array }>`

Array of documents to index

| Property    | Type                      | Description                                  |
| ----------- | ------------------------- | -------------------------------------------- |
| `id`        | `string`                  | Unique document identifier                   |
| `text`      | `string`                  | Document text content to index               |
| `metadata?` | `Record` | Optional metadata to store with the document |

#### Returns

`Promise`\

## Examples

### Basic Usage

```typescript

// Create a new BM25 index
const bm25Index = new InMemoryBM25Index();

// Add documents to the index
await bm25Index.addDocuments([
  {
    id: "doc1",
    text: "Machine learning is a subset of artificial intelligence",
    metadata: { category: "AI" },
  },
  {
    id: "doc2",
    text: "Deep learning uses neural networks with multiple layers",
    metadata: { category: "AI" },
  },
  {
    id: "doc3",
    text: "Natural language processing enables text understanding",
    metadata: { category: "NLP" },
  },
]);

// Search the index
const results = await bm25Index.search("machine learning", 5);

console.log(results);
// [
//   { id: "doc1", score: 1.234, text: "Machine learning is...", metadata: {...} },
//   { id: "doc2", score: 0.567, text: "Deep learning uses...", metadata: {...} },
// ]
```

### Hybrid Search with Vector Store

```typescript

  InMemoryBM25Index,
  createHybridSearch,
  PgVectorStore,
} from "@juspay/neurolink";

// Create BM25 index
const bm25Index = new InMemoryBM25Index();

// Add documents to BM25 index
await bm25Index.addDocuments(documents);

// Create hybrid search function
const hybridSearch = createHybridSearch({
  vectorStore: pgVectorStore,
  bm25Index,
  indexName: "my_embeddings",
  embeddingModel: {
    provider: "OPEN_AI",
    modelName: "text-embedding-3-small",
  },
  defaultConfig: {
    vectorWeight: 0.5,
    bm25Weight: 0.5,
    fusionMethod: "rrf", // Reciprocal Rank Fusion
  },
});

// Execute hybrid search
const results = await hybridSearch("What is machine learning?", {
  topK: 10,
  enableReranking: true,
});
```

### Using with RAG Pipeline

```typescript

// Create pipeline with custom BM25 index
const pipeline = new RAGPipeline({
  vectorStore: myVectorStore,
  bm25Index: new InMemoryBM25Index(),
  embeddingModel: {
    provider: "OPEN_AI",
    modelName: "text-embedding-3-small",
  },
  enableHybridSearch: true,
});

// Documents are automatically indexed in both vector and BM25 stores
await pipeline.ingest(documents);

// Query uses hybrid search
const results = await pipeline.query("search query");
```

### Batch Document Indexing

```typescript

const bm25Index = new InMemoryBM25Index();

// Index documents in batches
const batchSize = 100;
for (let i = 0; i  ({
      id: `doc-${i + idx}`,
      text: doc.content,
      metadata: { source: doc.source, page: doc.page },
    })),
  );
}

// Search with metadata preserved
const results = await bm25Index.search("specific keywords", 20);
results.forEach((r) => {
  console.log(`[${r.metadata?.source}] Score: ${r.score.toFixed(3)}`);
});
```

## BM25 Algorithm Details

The BM25 scoring formula used:

```
score(D, Q) = SUM[i=1..n]( IDF(qi) * (f(qi, D) * (k1 + 1)) / (f(qi, D) + k1 * (1 - b + b * |D| / avgdl)) )
```

Where:

- `f(qi, D)` = frequency of term qi in document D
- `|D|` = length of document D (in tokens)
- `avgdl` = average document length across the collection
- `k1` = term frequency saturation parameter (default: 1.5)
- `b` = length normalization parameter (default: 0.75)
- `IDF(qi)` = log((N - n(qi) + 0.5) / (n(qi) + 0.5) + 1)
- `N` = total number of documents
- `n(qi)` = number of documents containing term qi

## Notes

- **Tokenization**: Uses simple whitespace tokenization with lowercase conversion and punctuation removal. For production use cases requiring stemming, stop word removal, or language-specific tokenization, consider implementing a custom `BM25Index`.

- **Memory Usage**: All documents and their tokens are stored in memory. For large collections (100K+ documents), consider using a persistent BM25 implementation like Elasticsearch or a specialized library.

- **Thread Safety**: The index is not thread-safe. In concurrent environments, synchronize access or use separate instances.

- **Incremental Updates**: Documents can be added incrementally; the average document length is recalculated on each `addDocuments` call.

## See Also

- [BM25Index](/docs/interfaces/bm25index) - Interface for BM25 implementations
- [BM25Result](/docs/type-aliases/bm25result) - Result type returned by search
- [HybridSearchConfig](/docs/type-aliases/hybridsearchconfig) - Configuration for hybrid search
- [createHybridSearch](/docs/functions/createhybridsearch) - Create hybrid search function
- [RAGPipeline](/docs/ragpipeline) - Pipeline with integrated hybrid search
- [reciprocalRankFusion](/docs/functions/reciprocalrankfusion) - RRF fusion method
- [linearCombination](/docs/functions/linearcombination) - Linear combination fusion method

---

## Type Alias: ChunkerMetadata

<!-- Source: api/type-aliases/ChunkerMetadata.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### supportedTypes?

> `optional` **supportedTypes**: [`DocumentType`](/docs/documenttype)[]

Document types this chunker is optimized for

---

### requiresExternalDeps?

> `optional` **requiresExternalDeps**: `boolean`

Whether the chunker requires external dependencies (e.g., tokenizers, LLM providers)

---

### defaultConfig?

> `optional` **defaultConfig**: `Record`

Default configuration values for this chunker

---

### supportedOptions?

> `optional` **supportedOptions**: `string[]`

List of supported configuration option names

---

### useCases?

> `optional` **useCases**: `string[]`

Use cases where this chunker excels

---

### aliases?

> `optional` **aliases**: `string[]`

Alternative names or aliases for this chunker

## Example

```typescript

// Registering a custom chunker with metadata
const metadata: ChunkerMetadata = {
  description: "Splits documents by paragraph boundaries",
  supportedTypes: ["text", "markdown"],
  requiresExternalDeps: false,
  defaultConfig: {
    maxSize: 1000,
    overlap: 100,
  },
  supportedOptions: ["maxSize", "minSize", "overlap", "trimWhitespace"],
  useCases: ["Blog posts", "Articles", "Documentation"],
  aliases: ["paragraph", "para"],
};

ChunkerRegistry.register("paragraph", paragraphChunker, metadata);

// Querying chunker metadata
const allChunkers = ChunkerRegistry.list();
const markdownChunkers = ChunkerRegistry.listForType("markdown");
```

---

## Function: createChunker()

<!-- Source: api/functions/createChunker.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / createChunker

# Function: createChunker()

> **createChunker**(`strategyOrAlias`, `config?`): `Promise`

Defined in: [lib/rag/ChunkerFactory.ts:373](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L373)

Create a chunker instance by strategy name or alias

This factory function provides a convenient way to instantiate chunkers
without directly interacting with the ChunkerFactory singleton. It supports
all built-in chunking strategies and their aliases.

## Parameters

### strategyOrAlias

`string`

Chunking strategy name or alias. Supported strategies:

- `character` (aliases: `char`, `fixed-size`, `fixed`)
- `recursive` (aliases: `recursive-character`, `langchain-default`)
- `sentence` (aliases: `sent`, `sentence-based`)
- `token` (aliases: `tok`, `tokenized`)
- `markdown` (aliases: `md`, `markdown-header`)
- `html` (aliases: `html-tag`, `web`)
- `json` (aliases: `json-object`, `structured`)
- `latex` (aliases: `tex`, `latex-section`)
- `semantic` (aliases: `llm`, `ai-semantic`)
- `semantic-markdown` (aliases: `semantic-md`, `smart-markdown`)

### config?

`ChunkerConfig`

Strategy-specific configuration options:

- `maxSize` - Maximum chunk size (default varies by strategy)
- `overlap` - Overlap between consecutive chunks
- `minSize` - Minimum chunk size
- Additional options vary by strategy

## Returns

`Promise`

A Chunker instance configured with the specified strategy

## Throws

`ChunkingError` - If the strategy is unknown or creation fails

## Examples

### Basic usage with strategy name

```typescript

const chunker = await createChunker("recursive");
const chunks = await chunker.chunk(documentText);
```

### Using strategy alias

```typescript

// Use 'md' alias for markdown chunker
const chunker = await createChunker("md", { maxSize: 500 });
const chunks = await chunker.chunk(markdownContent);
```

### With custom configuration

```typescript

const chunker = await createChunker("sentence", {
  maxSize: 1000,
  overlap: 100,
  minSentences: 2,
  maxSentences: 10,
});

const chunks = await chunker.chunk(articleText);
```

### Processing code with recursive chunker

```typescript

const chunker = await createChunker("recursive", {
  maxSize: 800,
  overlap: 50,
  separators: ["\n\n", "\n", " ", ""],
  keepSeparators: true,
});

const codeChunks = await chunker.chunk(sourceCode);
```

## Since

v8.44.0

## See Also

- [getAvailableStrategies](/docs/getavailablestrategies) - List available chunking strategies
- [chunkText](/docs/chunktext) - Convenience function for one-off chunking
- [ChunkerConfig](/docs/type-aliases/chunkerconfig) - Configuration options
- [Chunker](/docs/interfaces/chunker) - Chunker interface

---

## Class: InMemoryTokenStorage

<!-- Source: api/classes/InMemoryTokenStorage.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### saveTokens()

> **saveTokens**(`serverId`, `tokens`): `Promise`\

Defined in: [mcp/auth/tokenStorage.ts:21](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L21)

Save tokens for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

##### tokens

[`OAuthTokens`](/docs/api/type-aliases/OAuthTokens)

OAuth tokens to store

#### Returns

`Promise`\

#### Implementation of

`TokenStorage.saveTokens`

---

### deleteTokens()

> **deleteTokens**(`serverId`): `Promise`\

Defined in: [mcp/auth/tokenStorage.ts:25](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L25)

Delete stored tokens for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

#### Returns

`Promise`\

#### Implementation of

`TokenStorage.deleteTokens`

---

### hasTokens()

> **hasTokens**(`serverId`): `Promise`\

Defined in: [mcp/auth/tokenStorage.ts:29](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L29)

Check if tokens exist for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

#### Returns

`Promise`\

True if tokens exist

#### Implementation of

`TokenStorage.hasTokens`

---

### clearAll()

> **clearAll**(): `Promise`\

Defined in: [mcp/auth/tokenStorage.ts:33](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L33)

Clear all stored tokens

#### Returns

`Promise`\

#### Implementation of

`TokenStorage.clearAll`

---

### getServerIds()

> **getServerIds**(): `string`[]

Defined in: [mcp/auth/tokenStorage.ts:47](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L47)

Get all server IDs with stored tokens

#### Returns

`string`[]

---

## Type Alias: ChunkingStrategy

<!-- Source: api/type-aliases/ChunkingStrategy.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### "recursive"

Smart splitting based on content structure. Tries multiple separators in order (paragraphs, then lines, then words, then characters).

---

### "sentence"

Sentence-aware splitting. Respects sentence boundaries to maintain semantic coherence.

---

### "token"

Token-aware splitting using a tokenizer. Ensures chunks fit within model token limits.

---

### "markdown"

Structure-aware markdown splitting. Splits on headers while preserving code blocks and formatting.

---

### "html"

HTML structure-aware splitting. Splits on HTML tags while maintaining document structure.

---

### "json"

JSON structure-aware splitting. Splits on array elements and object keys while preserving valid JSON.

---

### "latex"

LaTeX structure-aware splitting. Splits on LaTeX environments and commands.

---

### "semantic"

LLM-based semantic splitting. Uses language models to identify natural topic boundaries.

---

### "semantic-markdown"

Semantic splitting optimized for markdown documents. Combines markdown structure awareness with semantic analysis.

## Example

```typescript

// Using different chunking strategies
const strategies: ChunkingStrategy[] = [
  "recursive", // Best for general text
  "markdown", // Best for markdown files
  "token", // Best for LLM token limits
  "semantic", // Best for topic-based splitting
];

const doc = MDocument.fromText(content);

// Chunk with recursive strategy (recommended default)
const chunks = await doc.chunk({
  strategy: "recursive",
  config: {
    maxSize: 512,
    overlap: 50,
  },
});

// Chunk markdown with structure awareness
const mdChunks = await doc.chunk({
  strategy: "markdown",
  config: {
    headerLevels: [1, 2, 3],
    preserveCodeBlocks: true,
  },
});
```

## Since

v8.44.0

---

## Function: createContextEnricher()

<!-- Source: api/functions/createContextEnricher.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / createContextEnricher

# Function: createContextEnricher()

> **createContextEnricher**(): `SpanProcessor`

Defined in: [services/server/ai/observability/instrumentation.ts:558](https://github.com/juspay/neurolink/blob/main/src/lib/services/server/ai/observability/instrumentation.ts#L558)

Create a new ContextEnricher span processor

Use this when `useExternalTracerProvider` is true to add context enrichment
to your own TracerProvider. The ContextEnricher adds Langfuse context
(userId, sessionId, conversationId, etc.) to spans.

## Returns

`SpanProcessor`

A new ContextEnricher instance implementing the OpenTelemetry SpanProcessor interface

## ContextEnricher Behavior

### onStart(span)

Enriches the span with context from AsyncLocalStorage:

- `user.id` - User identifier
- `session.id` - Session identifier
- `conversation.id` - Conversation/thread identifier
- `request.id` - Request identifier for log correlation
- `trace.name` - Custom trace name
- `metadata.*` - Custom metadata as prefixed attributes

### onEnd(span)

Reads GenAI semantic convention attributes from the span and logs token usage
for debugging. Detects spans from Vercel AI SDK's `experimental_telemetry`.

## Example

```typescript

  createContextEnricher,
  getLangfuseSpanProcessor,
} from "@juspay/neurolink";

const provider = new NodeTracerProvider();

// Add ContextEnricher for Langfuse context propagation
provider.addSpanProcessor(createContextEnricher());

// Add Langfuse processor for sending to Langfuse
const langfuseProcessor = getLangfuseSpanProcessor();
if (langfuseProcessor) {
  provider.addSpanProcessor(langfuseProcessor);
}

provider.register();
```

## Notes

- Each call creates a new ContextEnricher instance
- Can be called before or after initialization
- Works with any TracerProvider, not just NeuroLink's

## See Also

- [getSpanProcessors](/docs/getspanprocessors) - Get both processors together
- [setLangfuseContext](/docs/setlangfusecontext) - Set context for enrichment
- [LangfuseSpanAttributes](/docs/type-aliases/langfusespanattributes) - GenAI attributes

---

## Class: InMemoryVectorStore

<!-- Source: api/classes/InMemoryVectorStore.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### query()

> **query**(`params`): `Promise`\

Defined in: [rag/retrieval/vectorQueryTool.ts:231](https://github.com/juspay/neurolink/blob/main/src/lib/rag/retrieval/vectorQueryTool.ts#L231)

Query vectors by similarity using cosine distance.

#### Parameters

##### params

`object`

Query parameters object

##### params.indexName

`string`

Name of the index to search

##### params.queryVector

`number[]`

The query embedding vector to search for

##### params.topK?

`number`

Maximum number of results to return (default: 10)

##### params.filter?

[`MetadataFilter`](/docs/type-aliases/metadatafilter)

Optional metadata filter to narrow results

##### params.includeVectors?

`boolean`

Whether to include vectors in results (default: false)

#### Returns

`Promise`\

Array of matching results sorted by similarity score (descending)

---

### delete()

> **delete**(`indexName`, `ids`): `Promise`\

Defined in: [rag/retrieval/vectorQueryTool.ts:288](https://github.com/juspay/neurolink/blob/main/src/lib/rag/retrieval/vectorQueryTool.ts#L288)

Delete vectors from an index by their IDs.

#### Parameters

##### indexName

`string`

Name of the index to delete from

##### ids

`string[]`

Array of vector IDs to delete

#### Returns

`Promise`\

## Metadata Filtering

InMemoryVectorStore supports a rich query language for filtering results by metadata.

### Comparison Operators

| Operator    | Description               | Example                             |
| ----------- | ------------------------- | ----------------------------------- |
| `$eq`       | Equal to                  | `{ status: { $eq: "published" } }`  |
| `$ne`       | Not equal to              | `{ status: { $ne: "draft" } }`      |
| `$gt`       | Greater than              | `{ score: { $gt: 0.8 } }`           |
| `$gte`      | Greater than or equal     | `{ count: { $gte: 10 } }`           |
| `$lt`       | Less than                 | `{ price: { $lt: 100 } }`           |
| `$lte`      | Less than or equal        | `{ age: { $lte: 30 } }`             |
| `$in`       | Value in array            | `{ category: { $in: ["a", "b"] } }` |
| `$nin`      | Value not in array        | `{ type: { $nin: ["x", "y"] } }`    |
| `$exists`   | Field exists (or not)     | `{ author: { $exists: true } }`     |
| `$contains` | String contains substring | `{ title: { $contains: "AI" } }`    |
| `$regex`    | String matches regex      | `{ name: { $regex: "^test" } }`     |

### Logical Operators

| Operator | Description                       | Example                                               |
| -------- | --------------------------------- | ----------------------------------------------------- |
| `$and`   | All conditions must match         | `{ $and: [{ a: 1 }, { b: 2 }] }`                      |
| `$or`    | At least one condition must match | `{ $or: [{ status: "active" }, { featured: true }] }` |
| `$not`   | Negates a condition               | `{ $not: { status: "deleted" } }`                     |

### Direct Equality

For simple equality checks, you can use direct field values:

```typescript
const filter = { category: "documentation", version: "2.0" };
```

## Examples

### Basic Usage

```typescript

// Create a new store
const store = new InMemoryVectorStore();

// Add vectors with metadata
await store.upsert("documents", [
  {
    id: "doc-1",
    vector: [0.1, 0.2, 0.3, 0.4],
    metadata: {
      text: "Introduction to machine learning",
      category: "tutorial",
      author: "John Doe",
    },
  },
  {
    id: "doc-2",
    vector: [0.2, 0.3, 0.4, 0.5],
    metadata: {
      text: "Advanced neural network architectures",
      category: "research",
      author: "Jane Smith",
    },
  },
]);

// Query for similar vectors
const results = await store.query({
  indexName: "documents",
  queryVector: [0.15, 0.25, 0.35, 0.45],
  topK: 5,
});

console.log(results);
// [
//   { id: "doc-1", score: 0.998, text: "Introduction to...", metadata: {...} },
//   { id: "doc-2", score: 0.995, text: "Advanced neural...", metadata: {...} }
// ]
```

### Using with Embeddings

```typescript

const store = new InMemoryVectorStore();

// Generate embeddings for documents
const documents = [
  "The quick brown fox jumps over the lazy dog",
  "Machine learning is a subset of artificial intelligence",
  "Vector databases enable semantic search",
];

for (let i = 0; i  {
  let store: InMemoryVectorStore;

  beforeEach(() => {
    // Fresh store for each test
    store = new InMemoryVectorStore();
  });

  it("should retrieve relevant documents", async () => {
    // Seed test data
    await store.upsert("test-index", [
      {
        id: "1",
        vector: [1, 0, 0],
        metadata: { text: "Document about cats" },
      },
      {
        id: "2",
        vector: [0, 1, 0],
        metadata: { text: "Document about dogs" },
      },
    ]);

    // Query for cat-related content
    const results = await store.query({
      indexName: "test-index",
      queryVector: [0.9, 0.1, 0],
      topK: 1,
    });

    expect(results).toHaveLength(1);
    expect(results[0].metadata.text).toContain("cats");
  });
});
```

## Notes

- **Similarity metric**: Uses cosine similarity for vector comparison
- **Thread safety**: Not thread-safe; use separate instances for concurrent access in multi-threaded environments
- **Memory usage**: All vectors are stored in memory; consider dataset size accordingly
- **Persistence**: Data is not persisted; all vectors are lost when the process ends
- **Vector dimensions**: Query and stored vectors must have matching dimensions

## See Also

- [VectorStore](/docs/interfaces/vectorstore) - Interface implemented by this class
- [VectorQueryResult](/docs/interfaces/vectorqueryresult) - Result type returned by query
- [MetadataFilter](/docs/type-aliases/metadatafilter) - Filter type definition
- [createVectorQueryTool](/docs/functions/createvectorquerytool) - Create a vector query tool
- [RAGPipeline](/docs/ragpipeline) - Full RAG pipeline implementation
- [embed](/docs/functions/embed) - Generate embeddings for text

---

## Type Alias: DiscoveredMcp\<TTools\>

<!-- Source: api/type-aliases/DiscoveredMcp.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### tools?

> `optional` **tools**: `TTools`

Defined in: [types/mcpTypes.ts:518](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L518)

---

### capabilities?

> `optional` **capabilities**: `string`[]

Defined in: [types/mcpTypes.ts:519](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L519)

---

### version?

> `optional` **version**: `string`

Defined in: [types/mcpTypes.ts:520](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L520)

---

### configuration?

> `optional` **configuration**: `Record`\

Defined in: [types/mcpTypes.ts:521](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L521)

---

## Function: createContextWindow()

<!-- Source: api/functions/createContextWindow.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

-------------- | --------------------- | -------------------------------------- |
| `text`            | `string`              | Assembled context text                 |
| `chunkCount`      | `number`              | Number of chunks included              |
| `charCount`       | `number`              | Total character count                  |
| `tokenCount`      | `number`              | Estimated token count                  |
| `truncatedChunks` | `number`              | Number of chunks truncated or excluded |
| `citations`       | `Map` | Map of chunk IDs to citation strings   |

## Examples

### Basic usage

```typescript

const results = await vectorStore.query({
  query: "machine learning",
  topK: 10,
});

const window = createContextWindow(results, {
  maxTokens: 4000,
});

console.log(`Included ${window.chunkCount} chunks`);
console.log(`Token count: ${window.tokenCount}`);
console.log(`Truncated: ${window.truncatedChunks} chunks`);
```

### Track context utilization

```typescript

const window = createContextWindow(results, { maxTokens: 8000 });

const utilization = (window.tokenCount / 8000) * 100;
console.log(`Context utilization: ${utilization.toFixed(1)}%`);

if (window.truncatedChunks > 0) {
  console.warn(`Warning: ${window.truncatedChunks} chunks were truncated`);
}
```

### Use citations in response

```typescript

const window = createContextWindow(results, { maxTokens: 4000 });

const response = await llm.generate({
  prompt: `Context:\n${window.text}\n\nQuestion: ${question}`,
});

// Include citations in the response
const citationList = [...window.citations.values()].join("\n");
const fullResponse = `${response.content}\n\nSources:\n${citationList}`;
```

### Adaptive context sizing

```typescript

function createAdaptiveContext(
  results: VectorQueryResult[],
  modelContext: number,
) {
  // Reserve tokens for system prompt and response
  const availableTokens = modelContext - 2000;

  const window = createContextWindow(results, {
    maxTokens: availableTokens,
  });

  return {
    context: window.text,
    metadata: {
      chunksUsed: window.chunkCount,
      chunksExcluded: window.truncatedChunks,
      tokensUsed: window.tokenCount,
      tokensAvailable: availableTokens,
    },
  };
}
```

### With logging and monitoring

```typescript

async function buildContext(query: string) {
  const results = await search(query);
  const window = createContextWindow(results, { maxTokens: 4000 });

  // Log context metrics
  logger.info("Context assembled", {
    query,
    chunkCount: window.chunkCount,
    charCount: window.charCount,
    tokenCount: window.tokenCount,
    truncatedChunks: window.truncatedChunks,
    sourceCount: window.citations.size,
  });

  return window;
}
```

## Notes

- Token count is estimated at 4 characters per token
- Partial chunk inclusion is attempted when space allows (>100 chars remaining)
- Citations are automatically generated from chunk metadata or IDs
- Truncated chunks are marked with "(truncated)" in their citation

## Since

v8.44.0

## See Also

- [assembleContext](/docs/assemblecontext) - Simple context assembly returning string only
- [formatContextWithCitations](/docs/formatcontextwithcitations) - Format with separate citation list
- [summarizeContext](/docs/summarizecontext) - Summarize context using LLM

---

## Class: MCPCircuitBreaker

<!-- Source: api/classes/MCPCircuitBreaker.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### getStats()

> **getStats**(): `CircuitBreakerStats`

Defined in: [mcp/mcpCircuitBreaker.ts:257](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L257)

Get current statistics

#### Returns

`CircuitBreakerStats`

---

### reset()

> **reset**(): `void`

Defined in: [mcp/mcpCircuitBreaker.ts:286](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L286)

Manually reset the circuit breaker

#### Returns

`void`

---

### forceOpen()

> **forceOpen**(`reason`): `void`

Defined in: [mcp/mcpCircuitBreaker.ts:296](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L296)

Force open the circuit breaker

#### Parameters

##### reason

`string` = `"Manual force open"`

#### Returns

`void`

---

### getName()

> **getName**(): `string`

Defined in: [mcp/mcpCircuitBreaker.ts:304](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L304)

Get circuit breaker name

#### Returns

`string`

---

### isOpen()

> **isOpen**(): `boolean`

Defined in: [mcp/mcpCircuitBreaker.ts:311](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L311)

Check if circuit is open

#### Returns

`boolean`

---

### isClosed()

> **isClosed**(): `boolean`

Defined in: [mcp/mcpCircuitBreaker.ts:318](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L318)

Check if circuit is closed

#### Returns

`boolean`

---

### isHalfOpen()

> **isHalfOpen**(): `boolean`

Defined in: [mcp/mcpCircuitBreaker.ts:325](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L325)

Check if circuit is half-open

#### Returns

`boolean`

---

### destroy()

> **destroy**(): `void`

Defined in: [mcp/mcpCircuitBreaker.ts:334](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/mcpCircuitBreaker.ts#L334)

Destroy the circuit breaker and clean up resources
This method should be called when the circuit breaker is no longer needed
to prevent memory leaks from the cleanup timer

#### Returns

`void`

---

## Type Alias: DocumentType

<!-- Source: api/type-aliases/DocumentType.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### "markdown"

Markdown formatted documents with headers, lists, and code blocks

---

### "html"

HTML documents with DOM structure

---

### "json"

JSON structured data documents

---

### "latex"

LaTeX scientific documents with mathematical notation

---

### "csv"

Comma-separated values tabular data

---

### "pdf"

PDF documents (requires PDF parsing)

## Example

```typescript

// Explicit type specification
const docType: DocumentType = "markdown";

// Using with MDocument factory methods
const markdownDoc = MDocument.fromMarkdown("# Title\n\nContent here");
const htmlDoc = MDocument.fromHTML("TitleContent");
const jsonDoc = MDocument.fromJSONContent({ key: "value" });

// Manual configuration
const doc = new MDocument(content, {
  type: "latex",
  metadata: { source: "paper.tex" },
});
```

---

## Function: createHybridSearch()

<!-- Source: api/functions/createHybridSearch.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / createHybridSearch

# Function: createHybridSearch()

> **createHybridSearch**(`options`): (`query`: `string`, `config?`: `HybridSearchConfig`) => `Promise`

Defined in: [lib/rag/retrieval/hybridSearch.ts:262](https://github.com/juspay/neurolink/blob/main/src/lib/rag/retrieval/hybridSearch.ts#L262)

Create a hybrid search function combining vector and BM25 retrieval

Hybrid search improves retrieval quality by combining dense (vector) and
sparse (BM25) search methods. This addresses limitations of pure vector
search for keyword-heavy queries and lexical matching.

## Parameters

### options

`HybridSearchOptions`

Configuration for the hybrid search function:

- `vectorStore` - Vector store instance for dense retrieval
- `bm25Index` - BM25 index instance for sparse retrieval
- `indexName` - Index name within the vector store
- `embeddingModel` - Configuration for query embedding
  - `provider` - Embedding provider name
  - `modelName` - Embedding model name
- `defaultConfig` - Optional default search configuration

## Returns

`Function`

A hybrid search function that accepts:

- `query` - Search query string
- `config` - Optional search configuration (HybridSearchConfig)

Returns `Promise` - Array of search results with combined scores

### HybridSearchConfig options

- `vectorWeight` - Weight for vector scores (default: 0.5)
- `bm25Weight` - Weight for BM25 scores (default: 0.5)
- `fusionMethod` - Score fusion method: `"rrf"` or `"linear"` (default: `"rrf"`)
- `rrfK` - RRF constant parameter (default: 60)
- `topK` - Number of results to return (default: 10)
- `enableReranking` - Enable post-retrieval reranking (default: false)
- `reranker` - Reranker configuration if reranking is enabled

## Examples

### Basic hybrid search setup

```typescript

  createHybridSearch,
  InMemoryBM25Index,
  InMemoryVectorStore,
} from "@juspay/neurolink";

// Create stores
const vectorStore = new InMemoryVectorStore({ dimension: 1536 });
const bm25Index = new InMemoryBM25Index();

// Add documents to both stores
await vectorStore.upsert({
  indexName: "docs",
  vectors: documents.map((d) => ({
    id: d.id,
    vector: d.embedding,
    metadata: { text: d.text },
  })),
});
await bm25Index.addDocuments(documents);

// Create hybrid search function
const hybridSearch = createHybridSearch({
  vectorStore,
  bm25Index,
  indexName: "docs",
  embeddingModel: {
    provider: "openai",
    modelName: "text-embedding-3-small",
  },
});

// Execute search
const results = await hybridSearch("machine learning algorithms");
```

### Using Reciprocal Rank Fusion (RRF)

```typescript

const hybridSearch = createHybridSearch({
  vectorStore,
  bm25Index,
  indexName: "knowledge-base",
  embeddingModel: { provider: "openai", modelName: "text-embedding-3-small" },
  defaultConfig: {
    fusionMethod: "rrf",
    rrfK: 60,
    topK: 10,
  },
});

const results = await hybridSearch("API authentication methods");
```

### Using Linear Combination fusion

```typescript

const hybridSearch = createHybridSearch({
  vectorStore,
  bm25Index,
  indexName: "docs",
  embeddingModel: { provider: "openai", modelName: "text-embedding-3-small" },
});

// Linear combination allows fine-tuning the balance
const results = await hybridSearch("error handling best practices", {
  fusionMethod: "linear",
  vectorWeight: 0.7, // Emphasize semantic similarity
  bm25Weight: 0.3, // Lower weight for keyword matching
  topK: 15,
});
```

### With reranking enabled

```typescript

const hybridSearch = createHybridSearch({
  vectorStore,
  bm25Index,
  indexName: "docs",
  embeddingModel: { provider: "openai", modelName: "text-embedding-3-small" },
});

const results = await hybridSearch("how to configure SSL certificates", {
  topK: 20,
  enableReranking: true,
  reranker: {
    model: { provider: "openai", modelName: "gpt-4o-mini" },
    weights: { semantic: 0.5, vector: 0.3, position: 0.2 },
    topK: 5,
  },
});

// Results include reranking scores
results.forEach((r) => {
  console.log(`ID: ${r.id}, Score: ${r.score}`);
  console.log(`  Vector: ${r.scores?.vector}, BM25: ${r.scores?.bm25}`);
  console.log(`  Reranked: ${r.scores?.reranked}`);
});
```

### RAG pipeline integration

```typescript

async function buildRAGPipeline() {
  const hybridSearch = createHybridSearch({
    vectorStore,
    bm25Index,
    indexName: "knowledge",
    embeddingModel: { provider: "openai", modelName: "text-embedding-3-small" },
  });

  async function retrieveContext(query: string) {
    const results = await hybridSearch(query, {
      fusionMethod: "rrf",
      topK: 5,
    });

    return results.map((r) => r.text).join("\n\n");
  }

  // Use in generation
  const context = await retrieveContext("What is the refund policy?");
  const response = await llm.generate({
    prompt: `Context:\n${context}\n\nQuestion: What is the refund policy?`,
  });
}
```

## Notes

- BM25 excels at keyword/lexical matching while vectors capture semantic similarity
- RRF is generally more robust and doesn't require score normalization
- Linear combination allows fine-grained control over the balance
- Both retrieval methods run in parallel for optimal latency
- When reranking is enabled, more candidates are retrieved then filtered

## Since

v8.44.0

## See Also

- [reciprocalRankFusion](/docs/reciprocalrankfusion) - RRF fusion algorithm
- [linearCombination](/docs/linearcombination) - Linear score combination
- [rerank](/docs/rerank) - Post-retrieval reranking
- [InMemoryBM25Index](/docs/classes/inmemorybm25index) - In-memory BM25 implementation
- [HybridSearchConfig](/docs/type-aliases/hybridsearchconfig) - Configuration type
- [HybridSearchResult](/docs/type-aliases/hybridsearchresult) - Result type

---

## Class: MDocument

<!-- Source: api/classes/MDocument.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### fromMarkdown()

> `static` **fromMarkdown**(`markdown`, `metadata?`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:108](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L108)

Create MDocument from markdown content.

#### Parameters

##### markdown

`string`

Markdown content

##### metadata?

`Record`

Optional metadata to attach

#### Returns

`MDocument`

New MDocument instance with type "markdown"

#### Example

```typescript
const doc = MDocument.fromMarkdown("# Title\n\nContent here");
await doc.chunk({ strategy: "markdown" });
```

---

### fromHTML()

> `static` **fromHTML**(`html`, `metadata?`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:121](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L121)

Create MDocument from HTML content.

#### Parameters

##### html

`string`

HTML content

##### metadata?

`Record`

Optional metadata to attach

#### Returns

`MDocument`

New MDocument instance with type "html"

#### Example

```typescript
const doc = MDocument.fromHTML("Content");
await doc.chunk({ strategy: "html", config: { extractTextOnly: true } });
```

---

### fromJSONContent()

> `static` **fromJSONContent**(`json`, `metadata?`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:131](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L131)

Create MDocument from JSON content.

#### Parameters

##### json

`string | object`

JSON string or object (will be stringified)

##### metadata?

`Record`

Optional metadata to attach

#### Returns

`MDocument`

New MDocument instance with type "json"

#### Example

```typescript
const doc = MDocument.fromJSONContent({ users: [...], config: {...} });
await doc.chunk({ strategy: "json", config: { splitKeys: ["users"] } });
```

---

### fromLaTeX()

> `static` **fromLaTeX**(`latex`, `metadata?`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:146](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L146)

Create MDocument from LaTeX content.

#### Parameters

##### latex

`string`

LaTeX content

##### metadata?

`Record`

Optional metadata to attach

#### Returns

`MDocument`

New MDocument instance with type "latex"

#### Example

```typescript
const doc = MDocument.fromLaTeX("\\section{Introduction}\nContent...");
await doc.chunk({ strategy: "latex" });
```

---

### fromCSV()

> `static` **fromCSV**(`csv`, `metadata?`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:159](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L159)

Create MDocument from CSV content.

#### Parameters

##### csv

`string`

CSV content

##### metadata?

`Record`

Optional metadata to attach

#### Returns

`MDocument`

New MDocument instance with type "csv"

---

### fromJSON()

> `static` **fromJSON**(`json`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:486](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L486)

Create MDocument from serialized JSON (deserialization).

Restores a previously serialized MDocument including its chunks, history, and metadata.

#### Parameters

##### json

Serialized document data

| Property    | Type                                              | Description                 |
| ----------- | ------------------------------------------------- | --------------------------- |
| `id?`       | `string`                                          | Document ID to restore      |
| `content`   | `string`                                          | Document content            |
| `type`      | [`DocumentType`](/docs/type-aliases/documenttype) | Document type               |
| `metadata?` | `Record`                         | Document metadata           |
| `chunks?`   | [`Chunk`](/docs/type-aliases/chunk)[]             | Previously generated chunks |
| `history?`  | `string[]`                                        | Processing history          |

#### Returns

`MDocument`

Restored MDocument instance

#### Example

```typescript
const serialized = existingDoc.toJSON();
const restored = MDocument.fromJSON(serialized);
```

## Instance Methods

### Core Processing Methods

#### chunk()

> **chunk**(`params?`): `Promise`

Defined in: [src/lib/rag/document/MDocument.ts:172](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L172)

Chunk the document using the specified strategy.

Uses ChunkerRegistry to get the appropriate chunker. If no strategy is specified, automatically selects the best strategy based on document type.

#### Parameters

##### params?

[`ChunkParams`](/docs/type-aliases/chunkparams)

Chunking parameters

| Property    | Type                                                      | Description                                      |
| ----------- | --------------------------------------------------------- | ------------------------------------------------ |
| `strategy?` | [`ChunkingStrategy`](/docs/type-aliases/chunkingstrategy) | Strategy to use (auto-detected if not specified) |
| `config?`   | [`ChunkerConfig`](/docs/type-aliases/chunkerconfig)       | Strategy-specific configuration                  |

#### Returns

`Promise`

This MDocument instance (for chaining)

#### Example

```typescript
await doc.chunk({
  strategy: "recursive",
  config: { maxSize: 1000, overlap: 200, separators: ["\n\n", "\n", " "] },
});
```

---

#### extractMetadata()

> **extractMetadata**(`params`, `options?`): `Promise`

Defined in: [src/lib/rag/document/MDocument.ts:211](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L211)

Extract metadata from chunks using LLM.

Requires `chunk()` to be called first. Uses LLMMetadataExtractor to analyze chunks and extract titles, summaries, keywords, or custom fields.

#### Parameters

##### params

[`ExtractParams`](/docs/type-aliases/extractparams)

Extraction parameters specifying what to extract

| Property    | Type                                | Description              |
| ----------- | ----------------------------------- | ------------------------ |
| `title?`    | `boolean \| TitleExtractorConfig`   | Extract document title   |
| `summary?`  | `boolean \| SummaryExtractorConfig` | Extract summary          |
| `keywords?` | `boolean \| KeywordExtractorConfig` | Extract keywords         |
| `custom?`   | `CustomSchemaExtractorConfig`       | Custom schema extraction |

##### options?

Extractor options

| Property     | Type     | Description               |
| ------------ | -------- | ------------------------- |
| `provider?`  | `string` | LLM provider name         |
| `modelName?` | `string` | Model name for extraction |

#### Returns

`Promise`

This MDocument instance (for chaining)

#### Example

```typescript
await doc.chunk({ strategy: "recursive" });
await doc.extractMetadata(
  { title: true, summary: true, keywords: { maxKeywords: 10 } },
  { provider: "openai", modelName: "gpt-4" },
);
```

---

#### embed()

> **embed**(`provider?`, `modelName?`): `Promise`

Defined in: [src/lib/rag/document/MDocument.ts:267](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L267)

Generate embeddings for all chunks.

Requires `chunk()` to be called first. Embeddings are stored both in the document state and on each chunk object.

#### Parameters

##### provider?

`string`

Embedding provider name (uses NEUROLINK_PROVIDER env var or "vertex" if not specified)

##### modelName?

`string`

Embedding model name (uses VERTEX_MODEL env var or "gemini-2.5-flash" for Vertex, provider-specific defaults for others)

#### Returns

`Promise`

This MDocument instance (for chaining)

#### Throws

When provider does not support embeddings

#### Example

```typescript
await doc.chunk({ strategy: "recursive" });
await doc.embed("openai", "text-embedding-3-small");

const embeddings = doc.getEmbeddings();
console.log(
  `Generated ${embeddings.length} embeddings of dimension ${embeddings[0].length}`,
);
```

### Accessor Methods

#### getId()

> **getId**(): `string`

Defined in: [src/lib/rag/document/MDocument.ts:330](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L330)

Get the unique document ID.

#### Returns

`string`

UUID assigned at document creation

---

#### getContent()

> **getContent**(): `string`

Defined in: [src/lib/rag/document/MDocument.ts:337](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L337)

Get raw document content.

#### Returns

`string`

Original document content

---

#### getType()

> **getType**(): [`DocumentType`](/docs/type-aliases/documenttype)

Defined in: [src/lib/rag/document/MDocument.ts:344](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L344)

Get document type.

#### Returns

[`DocumentType`](/docs/type-aliases/documenttype)

Document type ("text", "markdown", "html", "json", "latex", "csv")

---

#### getMetadata()

> **getMetadata**(): `Record`

Defined in: [src/lib/rag/document/MDocument.ts:351](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L351)

Get document metadata.

#### Returns

`Record`

Copy of document metadata object

---

#### getChunks()

> **getChunks**(): [`Chunk`](/docs/type-aliases/chunk)[]

Defined in: [src/lib/rag/document/MDocument.ts:358](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L358)

Get processed chunks.

#### Returns

[`Chunk`](/docs/type-aliases/chunk)[]

Copy of chunks array (empty if `chunk()` not called)

---

#### getEmbeddings()

> **getEmbeddings**(): `number[][]`

Defined in: [src/lib/rag/document/MDocument.ts:365](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L365)

Get chunk embeddings.

#### Returns

`number[][]`

Copy of embeddings array (empty if `embed()` not called)

---

#### getHistory()

> **getHistory**(): `string[]`

Defined in: [src/lib/rag/document/MDocument.ts:372](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L372)

Get processing history.

#### Returns

`string[]`

Array of processing steps (e.g., ["created", "chunked:recursive", "embedded:openai:text-embedding-3-small"])

---

#### isChunked()

> **isChunked**(): `boolean`

Defined in: [src/lib/rag/document/MDocument.ts:379](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L379)

Check if document has been chunked.

#### Returns

`boolean`

True if chunks have been generated

---

#### hasEmbeddings()

> **hasEmbeddings**(): `boolean`

Defined in: [src/lib/rag/document/MDocument.ts:386](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L386)

Check if document has embeddings.

#### Returns

`boolean`

True if embeddings have been generated

---

#### getChunkCount()

> **getChunkCount**(): `number`

Defined in: [src/lib/rag/document/MDocument.ts:393](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L393)

Get chunk count.

#### Returns

`number`

Number of chunks (0 if not chunked)

### Transformation Methods

#### setMetadata()

> **setMetadata**(`key`, `value`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:407](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L407)

Set a single metadata key-value pair.

#### Parameters

##### key

`string`

Metadata key

##### value

`unknown`

Metadata value

#### Returns

`MDocument`

This MDocument instance (for chaining)

---

#### mergeMetadata()

> **mergeMetadata**(`metadata`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:417](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L417)

Merge metadata into document.

#### Parameters

##### metadata

`Record`

Metadata object to merge

#### Returns

`MDocument`

This MDocument instance (for chaining)

---

#### filterChunks()

> **filterChunks**(`predicate`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:427](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L427)

Filter chunks based on predicate.

Creates a new MDocument with filtered chunks. Corresponding embeddings are also filtered.

#### Parameters

##### predicate

`(chunk: Chunk) => boolean`

Filter function

#### Returns

`MDocument`

New MDocument with filtered chunks

#### Example

```typescript
const filtered = doc.filterChunks((chunk) => chunk.text.length > 100);
```

---

#### mapChunks()

> **mapChunks**(`transform`): `MDocument`

Defined in: [src/lib/rag/document/MDocument.ts:445](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L445)

Map transformation over chunks.

Creates a new MDocument with transformed chunks.

#### Parameters

##### transform

`(chunk: Chunk) => Chunk`

Transform function

#### Returns

`MDocument`

New MDocument with transformed chunks

#### Example

```typescript
const transformed = doc.mapChunks((chunk) => ({
  ...chunk,
  text: chunk.text.toLowerCase(),
}));
```

### Serialization Methods

#### toJSON()

> **toJSON**(): `object`

Defined in: [src/lib/rag/document/MDocument.ts:463](https://github.com/juspay/neurolink/blob/feat/rag-processing/src/lib/rag/document/MDocument.ts#L463)

Convert to plain object for serialization.

#### Returns

`object`

Serializable object with all document state

| Property   | Type                                              |
| ---------- | ------------------------------------------------- |
| `id`       | `string`                                          |
| `content`  | `string`                                          |
| `type`     | [`DocumentType`](/docs/type-aliases/documenttype) |
| `metadata` | `Record`                         |
| `chunks`   | [`Chunk`](/docs/type-aliases/chunk)[]             |
| `history`  | `string[]`                                        |

## Properties

| Property           | Type                                              | Description                                            |
| ------------------ | ------------------------------------------------- | ------------------------------------------------------ |
| `documentId`       | `string`                                          | Unique document identifier (UUID)                      |
| `state.content`    | `string`                                          | Raw document content                                   |
| `state.type`       | [`DocumentType`](/docs/type-aliases/documenttype) | Document type (text, markdown, html, json, latex, csv) |
| `state.metadata`   | `Record`                         | Document metadata including documentId and createdAt   |
| `state.chunks`     | [`Chunk`](/docs/type-aliases/chunk)[]             | Processed chunks (populated after `chunk()`)           |
| `state.embeddings` | `number[][]`                                      | Embedding vectors (populated after `embed()`)          |
| `state.history`    | `string[]`                                        | Processing history log                                 |

## See Also

- [loadDocument](/docs/functions/loaddocument) - Load documents from files
- [Chunk](/docs/type-aliases/chunk) - Chunk type definition
- [ChunkingStrategy](/docs/type-aliases/chunkingstrategy) - Available chunking strategies
- [ChunkerConfig](/docs/type-aliases/chunkerconfig) - Chunker configuration options
- [ExtractParams](/docs/type-aliases/extractparams) - Metadata extraction parameters
- [DocumentType](/docs/type-aliases/documenttype) - Supported document types

---

## Type Alias: DynamicModelConfig

<!-- Source: api/type-aliases/DynamicModelConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / DynamicModelConfig

# Type Alias: DynamicModelConfig

> **DynamicModelConfig** = `z.infer`\

Defined in: [types/modelTypes.ts:106](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/modelTypes.ts#L106)

Dynamic model configuration type

---

## Function: createOAuthProviderFromConfig()

<!-- Source: api/functions/createOAuthProviderFromConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / createOAuthProviderFromConfig

# Function: createOAuthProviderFromConfig()

> **createOAuthProviderFromConfig**(`authConfig`, `storage?`): [`NeuroLinkOAuthProvider`](/docs/api/classes/NeuroLinkOAuthProvider)

Defined in: [mcp/auth/oauthClientProvider.ts:402](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L402)

Create an OAuth provider from MCP server auth configuration

## Parameters

### authConfig

#### clientId

`string`

#### clientSecret?

`string`

#### authorizationUrl

`string`

#### tokenUrl

`string`

#### redirectUrl

`string`

#### scope?

`string`

#### usePKCE?

`boolean`

### storage?

[`TokenStorage`](/docs/api/type-aliases/TokenStorage)

## Returns

[`NeuroLinkOAuthProvider`](/docs/api/classes/NeuroLinkOAuthProvider)

---

## Class: MiddlewareFactory

<!-- Source: api/classes/MiddlewareFactory.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### presets

> **presets**: `Map`\

Defined in: [middleware/factory.ts:25](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L25)

## Methods

### registerPreset()

> **registerPreset**(`preset`, `replace`): `void`

Defined in: [middleware/factory.ts:91](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L91)

Register a custom preset

#### Parameters

##### preset

[`MiddlewarePreset`](/docs/api/type-aliases/MiddlewarePreset)

##### replace

`boolean` = `false`

#### Returns

`void`

---

### register()

> **register**(`middleware`, `options?`): `void`

Defined in: [middleware/factory.ts:103](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L103)

Register a custom middleware

#### Parameters

##### middleware

[`NeuroLinkMiddleware`](/docs/api/type-aliases/NeuroLinkMiddleware)

##### options?

`MiddlewareRegistrationOptions`

#### Returns

`void`

---

### applyMiddleware()

> **applyMiddleware**(`model`, `context`, `options`): `LanguageModelV1`

Defined in: [middleware/factory.ts:113](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L113)

Apply middleware to a language model

#### Parameters

##### model

`LanguageModelV1`

##### context

[`MiddlewareContext`](/docs/api/type-aliases/MiddlewareContext)

##### options

[`MiddlewareFactoryOptions`](/docs/api/type-aliases/MiddlewareFactoryOptions) = `{}`

#### Returns

`LanguageModelV1`

---

### createContext()

> **createContext**(`provider`, `model`, `options`, `session?`): [`MiddlewareContext`](/docs/api/type-aliases/MiddlewareContext)

Defined in: [middleware/factory.ts:292](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L292)

Create middleware context from provider and options

#### Parameters

##### provider

`string`

##### model

`string`

##### options

`Record`\ = `{}`

##### session?

###### sessionId?

`string`

###### userId?

`string`

#### Returns

[`MiddlewareContext`](/docs/api/type-aliases/MiddlewareContext)

---

### validateConfig()

> **validateConfig**(`config`): `object`

Defined in: [middleware/factory.ts:313](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L313)

Validate middleware configuration

#### Parameters

##### config

`Record`\

#### Returns

`object`

##### isValid

> **isValid**: `boolean`

##### errors

> **errors**: `string`[]

##### warnings

> **warnings**: `string`[]

---

### getAvailablePresets()

> **getAvailablePresets**(): `object`[]

Defined in: [middleware/factory.ts:368](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L368)

Get available presets

#### Returns

`object`[]

---

### getChainStats()

> **getChainStats**(`context`, `config`): `MiddlewareChainStats`

Defined in: [middleware/factory.ts:383](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L383)

Get middleware chain statistics

#### Parameters

##### context

[`MiddlewareContext`](/docs/api/type-aliases/MiddlewareContext)

##### config

`Record`\

#### Returns

`MiddlewareChainStats`

---

### createModelFactory()

> **createModelFactory**(`baseModelFactory`, `defaultOptions`): (`context`, `options`) => `Promise`\

Defined in: [middleware/factory.ts:416](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/middleware/factory.ts#L416)

Create a middleware-enabled model factory function

#### Parameters

##### baseModelFactory

() => `Promise`\

##### defaultOptions

[`MiddlewareFactoryOptions`](/docs/api/type-aliases/MiddlewareFactoryOptions) = `{}`

#### Returns

> (`context`, `options`): `Promise`\

##### Parameters

###### context

[`MiddlewareContext`](/docs/api/type-aliases/MiddlewareContext)

###### options

[`MiddlewareFactoryOptions`](/docs/api/type-aliases/MiddlewareFactoryOptions) = `{}`

##### Returns

`Promise`\

---

## Type Alias: EnhancedProvider

<!-- Source: api/type-aliases/EnhancedProvider.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### getName()

> **getName**(): `string`

Defined in: [types/generateTypes.ts:413](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L413)

#### Returns

`string`

---

### isAvailable()

> **isAvailable**(): `Promise`\

Defined in: [types/generateTypes.ts:414](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L414)

#### Returns

`Promise`\

---

## Function: createReranker()

<!-- Source: api/functions/createReranker.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / createReranker

# Function: createReranker()

> **createReranker**(`typeOrAlias`, `config?`): `Promise`

Defined in: [lib/rag/reranker/RerankerFactory.ts:539](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L539)

Create a reranker instance by type or alias

This factory function provides a convenient way to instantiate rerankers
for improving retrieval quality by re-scoring and re-ordering search results
based on relevance to the query.

## Parameters

### typeOrAlias

`string`

Reranker type or alias. Supported types:

- `llm` (aliases: `semantic`, `ai`, `model-based`) - LLM-powered semantic reranking
- `cross-encoder` (aliases: `cross`, `encoder`, `bi-encoder`) - Cross-encoder model reranking
- `cohere` (aliases: `cohere-rerank`, `cohere-api`) - Cohere Rerank API
- `simple` (aliases: `fast`, `basic`, `position-based`) - Position and vector score-based (no LLM)
- `batch` (aliases: `batch-llm`, `efficient`, `bulk`) - Batch LLM reranking for efficiency

### config?

`RerankerConfig`

Reranker configuration options:

- `type` - Reranker type
- `model` - Model name for LLM-based rerankers
- `provider` - Provider for the model
- `topK` - Number of results to return after reranking
- `weights` - Scoring weights for multi-factor reranking
- `apiKey` - API key for external services (e.g., Cohere)

## Returns

`Promise`

A Reranker instance configured with the specified type

## Throws

`RerankerError` - If the type is unknown or creation fails

## Examples

### Basic LLM reranking

```typescript

// Set up the model provider first
rerankerFactory.setModelProvider(myAIProvider);

const reranker = await createReranker("llm", {
  topK: 5,
  weights: { semantic: 0.5, vector: 0.3, position: 0.2 },
});

const rerankedResults = await reranker.rerank(searchResults, "user query");
```

### Simple reranking without LLM

```typescript

// Fast reranking using vector scores and position
const reranker = await createReranker("simple", {
  topK: 10,
  weights: { vector: 0.8, position: 0.2 },
});

const results = await reranker.rerank(vectorSearchResults, query);
```

### Batch reranking for efficiency

```typescript

rerankerFactory.setModelProvider(aiProvider);

// Efficient batch scoring for large result sets
const reranker = await createReranker("batch", {
  topK: 20,
  weights: { semantic: 0.4, vector: 0.4, position: 0.2 },
});

const rerankedResults = await reranker.rerank(largeResultSet, query);
```

### Using Cohere Rerank API

```typescript

const reranker = await createReranker("cohere", {
  model: "rerank-v3.5",
  topK: 10,
  apiKey: process.env.COHERE_API_KEY,
});

const results = await reranker.rerank(searchResults, query);
```

## Since

v8.44.0

## See Also

- [rerank](/docs/rerank) - Direct LLM-based reranking function
- [simpleRerank](/docs/simplererank) - Simple position-based reranking
- [batchRerank](/docs/batchrerank) - Batch reranking for efficiency
- [RerankerConfig](/docs/type-aliases/rerankerconfig) - Configuration options
- [Reranker](/docs/interfaces/reranker) - Reranker interface

---

## Class: NeuroLink

<!-- Source: api/classes/NeuroLink.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

#### isTelemetryEnabled()

> **isTelemetryEnabled**(): `boolean`

Defined in: [neurolink.ts:1664](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L1664)

Check if Langfuse telemetry is enabled
Centralized utility to avoid duplication across providers

##### Returns

`boolean`

---

#### initializeLangfuseObservability()

> **initializeLangfuseObservability**(): `Promise`\

Defined in: [neurolink.ts:1672](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L1672)

Public method to initialize Langfuse observability
This method can be called externally to ensure Langfuse is properly initialized

##### Returns

`Promise`\

---

#### shutdown()

> **shutdown**(): `Promise`\

Defined in: [neurolink.ts:1698](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L1698)

Gracefully shutdown NeuroLink and all MCP connections

##### Returns

`Promise`\

---

#### generateText()

> **generateText**(`options`): `Promise`\

Defined in: [neurolink.ts:2090](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L2090)

BACKWARD COMPATIBILITY: Legacy generateText method
Internally calls generate() and converts result format

##### Parameters

###### options

[`TextGenerationOptions`](/docs/api/type-aliases/TextGenerationOptions)

##### Returns

`Promise`\

---

#### streamText()

> **streamText**(`prompt`, `options?`): `Promise`\\>

Defined in: [neurolink.ts:2775](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L2775)

BACKWARD COMPATIBILITY: Legacy streamText method
Internally calls stream() and converts result format

##### Parameters

###### prompt

`string`

###### options?

`Partial`\

##### Returns

`Promise`\\>

---

#### stream()

> **stream**(`options`): `Promise`\

Defined in: [neurolink.ts:2855](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L2855)

Stream AI-generated content in real-time using the best available provider.
This method provides real-time streaming of AI responses with full MCP tool integration.

##### Parameters

###### options

[`StreamOptions`](#)

Stream configuration options

##### Returns

`Promise`\

Promise resolving to StreamResult with an async iterable stream

##### Example

```typescript
// Basic streaming usage
const result = await neurolink.stream({
  input: { text: "Tell me a story about space exploration" },
});

// Consume the stream
for await (const chunk of result.stream) {
  process.stdout.write(chunk.content);
}

// Advanced streaming with options
const result = await neurolink.stream({
  input: { text: "Explain machine learning" },
  provider: "openai",
  model: "gpt-4",
  temperature: 0.7,
  enableAnalytics: true,
  context: { domain: "education", audience: "beginners" },
});

// Access metadata and analytics
console.log(result.provider);
console.log(result.analytics?.usage);
```

##### Throws

When input text is missing or invalid

##### Throws

When all providers fail to generate content

##### Throws

When conversation memory operations fail (if enabled)

---

#### getEventEmitter()

> **getEventEmitter**(): `TypedEventEmitter`\

Defined in: [neurolink.ts:3677](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3677)

Get the EventEmitter instance to listen to NeuroLink events for real-time monitoring and debugging.
This method provides access to the internal event system that emits events during AI generation,
tool execution, streaming, and other operations for comprehensive observability.

##### Returns

`TypedEventEmitter`\

EventEmitter instance that emits various NeuroLink operation events

##### Examples

```typescript
// Basic event listening setup
const neurolink = new NeuroLink();
const emitter = neurolink.getEventEmitter();

// Listen to generation events
emitter.on("generation:start", (event) => {
  console.log(`Generation started with provider: ${event.provider}`);
  console.log(`Started at: ${new Date(event.timestamp)}`);
});

emitter.on("generation:end", (event) => {
  console.log(`Generation completed in ${event.responseTime}ms`);
  console.log(`Tools used: ${event.toolsUsed?.length || 0}`);
});

// Listen to streaming events
emitter.on("stream:start", (event) => {
  console.log(`Streaming started with provider: ${event.provider}`);
});

emitter.on("stream:end", (event) => {
  console.log(`Streaming completed in ${event.responseTime}ms`);
  if (event.fallback) console.log("Used fallback streaming");
});

// Listen to tool execution events
emitter.on("tool:start", (event) => {
  console.log(`Tool execution started: ${event.toolName}`);
});

emitter.on("tool:end", (event) => {
  console.log(
    `Tool ${event.toolName} ${event.success ? "succeeded" : "failed"}`,
  );
  console.log(`Execution time: ${event.responseTime}ms`);
});

// Listen to tool registration events
emitter.on("tools-register:start", (event) => {
  console.log(`Registering tool: ${event.toolName}`);
});

emitter.on("tools-register:end", (event) => {
  console.log(
    `Tool registration ${event.success ? "succeeded" : "failed"}: ${event.toolName}`,
  );
});

// Listen to external MCP server events
emitter.on("externalMCP:serverConnected", (event) => {
  console.log(`External MCP server connected: ${event.serverId}`);
  console.log(`Tools available: ${event.toolCount || 0}`);
});

emitter.on("externalMCP:serverDisconnected", (event) => {
  console.log(`External MCP server disconnected: ${event.serverId}`);
  console.log(`Reason: ${event.reason || "Unknown"}`);
});

emitter.on("externalMCP:toolDiscovered", (event) => {
  console.log(`New tool discovered: ${event.toolName} from ${event.serverId}`);
});

// Advanced usage with error handling
emitter.on("error", (error) => {
  console.error("NeuroLink error:", error);
});

// Clean up event listeners when done
function cleanup() {
  emitter.removeAllListeners();
}

process.on("SIGINT", cleanup);
process.on("SIGTERM", cleanup);
```

```typescript
// Advanced monitoring with metrics collection
const neurolink = new NeuroLink();
const emitter = neurolink.getEventEmitter();
const metrics = {
  generations: 0,
  totalResponseTime: 0,
  toolExecutions: 0,
  failures: 0,
};

// Collect performance metrics
emitter.on("generation:end", (event) => {
  metrics.generations++;
  metrics.totalResponseTime += event.responseTime;
  metrics.toolExecutions += event.toolsUsed?.length || 0;
});

emitter.on("tool:end", (event) => {
  if (!event.success) {
    metrics.failures++;
  }
});

// Log metrics every 10 seconds
setInterval(() => {
  const avgResponseTime =
    metrics.generations > 0
      ? metrics.totalResponseTime / metrics.generations
      : 0;

  console.log("NeuroLink Metrics:", {
    totalGenerations: metrics.generations,
    averageResponseTime: `${avgResponseTime.toFixed(2)}ms`,
    totalToolExecutions: metrics.toolExecutions,
    failureRate: `${((metrics.failures / (metrics.toolExecutions || 1)) * 100).toFixed(2)}%`,
  });
}, 10000);
```

**Available Events:**

**Generation Events:**

- `generation:start` - Fired when text generation begins
  - `{ provider: string, timestamp: number }`
- `generation:end` - Fired when text generation completes
  - `{ provider: string, responseTime: number, toolsUsed?: string[], timestamp: number }`

**Streaming Events:**

- `stream:start` - Fired when streaming begins
  - `{ provider: string, timestamp: number }`
- `stream:end` - Fired when streaming completes
  - `{ provider: string, responseTime: number, fallback?: boolean }`

**Tool Events:**

- `tool:start` - Fired when tool execution begins
  - `{ toolName: string, timestamp: number }`
- `tool:end` - Fired when tool execution completes
  - `{ toolName: string, responseTime: number, success: boolean, timestamp: number }`
- `tools-register:start` - Fired when tool registration begins
  - `{ toolName: string, timestamp: number }`
- `tools-register:end` - Fired when tool registration completes
  - `{ toolName: string, success: boolean, timestamp: number }`

**External MCP Events:**

- `externalMCP:serverConnected` - Fired when external MCP server connects
  - `{ serverId: string, toolCount?: number, timestamp: number }`
- `externalMCP:serverDisconnected` - Fired when external MCP server disconnects
  - `{ serverId: string, reason?: string, timestamp: number }`
- `externalMCP:serverFailed` - Fired when external MCP server fails
  - `{ serverId: string, error: string, timestamp: number }`
- `externalMCP:toolDiscovered` - Fired when external MCP tool is discovered
  - `{ toolName: string, serverId: string, timestamp: number }`
- `externalMCP:toolRemoved` - Fired when external MCP tool is removed
  - `{ toolName: string, serverId: string, timestamp: number }`
- `externalMCP:serverAdded` - Fired when external MCP server is added
  - `{ serverId: string, config: MCPServerInfo, toolCount: number, timestamp: number }`
- `externalMCP:serverRemoved` - Fired when external MCP server is removed
  - `{ serverId: string, timestamp: number }`

**Error Events:**

- `error` - Fired when an error occurs
  - `{ error: Error, context?: object }`

##### Throws

This method does not throw errors as it returns the internal EventEmitter

##### Since

1.0.0

##### See

- [https://nodejs.org/api/events.html](https://nodejs.org/api/events.html) Node.js EventEmitter documentation
- [NeuroLink.generate](#generate) for events related to text generation
- [NeuroLink.stream](#stream) for events related to streaming
- [NeuroLink.executeTool](#executetool) for events related to tool execution

---

#### emitToolStart()

> **emitToolStart**(`toolName`, `input`, `startTime`): `string`

Defined in: [neurolink.ts:3695](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3695)

Emit tool start event with execution tracking

##### Parameters

###### toolName

`string`

Name of the tool being executed

###### input

`unknown`

Input parameters for the tool

###### startTime

`number` = `...`

Timestamp when execution started

##### Returns

`string`

executionId for tracking this specific execution

---

#### emitToolEnd()

> **emitToolEnd**(`toolName`, `result?`, `error?`, `startTime?`, `endTime?`, `executionId?`): `void`

Defined in: [neurolink.ts:3744](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3744)

Emit tool end event with execution summary

##### Parameters

###### toolName

`string`

Name of the tool that finished

###### result?

`unknown`

Result from the tool execution

###### error?

`string`

Error message if execution failed

###### startTime?

`number`

When execution started

###### endTime?

`number` = `...`

When execution finished

###### executionId?

`string`

Optional execution ID for tracking

##### Returns

`void`

---

#### getCurrentToolExecutions()

> **getCurrentToolExecutions**(): `ToolExecutionContext`[]

Defined in: [neurolink.ts:3821](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3821)

Get current tool execution contexts for stream metadata

##### Returns

`ToolExecutionContext`[]

---

#### getToolExecutionHistory()

> **getToolExecutionHistory**(): `ToolExecutionSummary`[]

Defined in: [neurolink.ts:3828](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3828)

Get tool execution history

##### Returns

`ToolExecutionSummary`[]

---

#### clearCurrentStreamExecutions()

> **clearCurrentStreamExecutions**(): `void`

Defined in: [neurolink.ts:3835](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3835)

Clear current stream tool executions (called at stream start)

##### Returns

`void`

---

#### registerTool()

> **registerTool**(`name`, `tool`): `void`

Defined in: [neurolink.ts:3851](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3851)

Register a custom tool that will be available to all AI providers

##### Parameters

###### name

`string`

Unique name for the tool

###### tool

Tool in MCPExecutableTool format (unified MCP protocol type)

###### name

`string`

###### description

`string`

###### inputSchema?

`object`

###### execute?

(`params`, `context?`) => `unknown`

##### Returns

`void`

---

#### setToolContext()

> **setToolContext**(`context`): `void`

Defined in: [neurolink.ts:3928](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3928)

Set the context that will be passed to tools during execution
This context will be merged with any runtime context passed by the AI model

##### Parameters

###### context

`Record`\

Context object containing session info, tokens, shop data, etc.

##### Returns

`void`

---

#### getToolContext()

> **getToolContext**(): `Record`\ \| `undefined`

Defined in: [neurolink.ts:3943](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3943)

Get the current tool execution context

##### Returns

`Record`\ \| `undefined`

Current context or undefined if not set

---

#### clearToolContext()

> **clearToolContext**(): `void`

Defined in: [neurolink.ts:3952](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3952)

Clear the tool execution context

##### Returns

`void`

---

#### registerTools()

> **registerTools**(`tools`): `void`

Defined in: [neurolink.ts:3964](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3964)

Register multiple tools at once - Supports both object and array formats

##### Parameters

###### tools

Object mapping tool names to MCPExecutableTool format OR Array of tools with names

Object format (existing): { toolName: MCPExecutableTool, ... }
Array format (Lighthouse compatible): [{ name: string, tool: MCPExecutableTool }, ...]

`Record`\ `unknown`; \}\> | `object`[]

##### Returns

`void`

---

#### unregisterTool()

> **unregisterTool**(`name`): `boolean`

Defined in: [neurolink.ts:3987](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L3987)

Unregister a custom tool

##### Parameters

###### name

`string`

Name of the tool to remove

##### Returns

`boolean`

true if the tool was removed, false if it didn't exist

---

#### getCustomTools()

> **getCustomTools**(): `Map`\ `unknown`; \}\>

Defined in: [neurolink.ts:4001](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4001)

Get all registered custom tools

##### Returns

`Map`\ `unknown`; \}\>

Map of tool names to MCPExecutableTool format

---

#### addInMemoryMCPServer()

> **addInMemoryMCPServer**(`serverId`, `serverInfo`): `Promise`\

Defined in: [neurolink.ts:4094](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4094)

Add an in-memory MCP server (from git diff)
Allows registration of pre-instantiated server objects

##### Parameters

###### serverId

`string`

Unique identifier for the server

###### serverInfo

[`MCPServerInfo`](/docs/api/type-aliases/MCPServerInfo)

Server configuration

##### Returns

`Promise`\

---

#### getInMemoryServers()

> **getInMemoryServers**(): `Map`\

Defined in: [neurolink.ts:4133](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4133)

Get all registered in-memory servers

##### Returns

`Map`\

Map of server IDs to MCPServerInfo

---

#### getInMemoryServerInfos()

> **getInMemoryServerInfos**(): [`MCPServerInfo`](/docs/api/type-aliases/MCPServerInfo)[]

Defined in: [neurolink.ts:4157](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4157)

Get in-memory servers as MCPServerInfo - ZERO conversion needed
Now fetches from centralized tool registry instead of local duplication

##### Returns

[`MCPServerInfo`](/docs/api/type-aliases/MCPServerInfo)[]

Array of MCPServerInfo

---

#### getAutoDiscoveredServerInfos()

> **getAutoDiscoveredServerInfos**(): [`MCPServerInfo`](/docs/api/type-aliases/MCPServerInfo)[]

Defined in: [neurolink.ts:4173](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4173)

Get auto-discovered servers as MCPServerInfo - ZERO conversion needed

##### Returns

[`MCPServerInfo`](/docs/api/type-aliases/MCPServerInfo)[]

Array of MCPServerInfo

---

#### executeTool()

> **executeTool**\(`toolName`, `params`, `options?`): `Promise`\

Defined in: [neurolink.ts:4185](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4185)

Execute a specific tool by name with robust error handling
Supports both custom tools and MCP server tools with timeout, retry, and circuit breaker patterns

##### Type Parameters

###### T

`T` = `unknown`

##### Parameters

###### toolName

`string`

Name of the tool to execute

###### params

`unknown` = `{}`

Parameters to pass to the tool

###### options?

Execution options including optional authentication context

###### timeout?

`number`

###### maxRetries?

`number`

###### retryDelayMs?

`number`

###### authContext?

\{\[`key`: `string`\]: `unknown`; `userId?`: `string`; `sessionId?`: `string`; `user?`: `Record`\; \}

###### authContext.userId?

`string`

###### authContext.sessionId?

`string`

###### authContext.user?

`Record`\

##### Returns

`Promise`\

Tool execution result

---

#### getAllAvailableTools()

> **getAllAvailableTools**(): `Promise`\

Defined in: [neurolink.ts:4581](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4581)

##### Returns

`Promise`\

---

#### getProviderStatus()

> **getProviderStatus**(`options?`): `Promise`\

Defined in: [neurolink.ts:4749](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4749)

Get comprehensive status of all AI providers
Primary method for provider health checking and diagnostics

##### Parameters

###### options?

###### quiet?

`boolean`

##### Returns

`Promise`\

---

#### testProvider()

> **testProvider**(`providerName`): `Promise`\

Defined in: [neurolink.ts:4940](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4940)

Test a specific AI provider's connectivity and authentication

##### Parameters

###### providerName

`string`

Name of the provider to test

##### Returns

`Promise`\

Promise resolving to true if provider is working

---

#### getBestProvider()

> **getBestProvider**(`requestedProvider?`): `Promise`\

Defined in: [neurolink.ts:4972](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4972)

Get the best available AI provider based on configuration and availability

##### Parameters

###### requestedProvider?

`string`

Optional preferred provider name

##### Returns

`Promise`\

Promise resolving to the best provider name

---

#### getAvailableProviders()

> **getAvailableProviders**(): `Promise`\

Defined in: [neurolink.ts:4981](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4981)

Get list of all available AI provider names

##### Returns

`Promise`\

Array of supported provider names

---

#### isValidProvider()

> **isValidProvider**(`providerName`): `Promise`\

Defined in: [neurolink.ts:4991](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L4991)

Validate if a provider name is supported

##### Parameters

###### providerName

`string`

Provider name to validate

##### Returns

`Promise`\

True if provider name is valid

---

#### getMCPStatus()

> **getMCPStatus**(): `Promise`\

Defined in: [neurolink.ts:5004](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5004)

Get comprehensive MCP (Model Context Protocol) status information

##### Returns

`Promise`\

Promise resolving to MCP status details

---

#### listMCPServers()

> **listMCPServers**(): `Promise`\

Defined in: [neurolink.ts:5074](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5074)

List all configured MCP servers with their status

##### Returns

`Promise`\

Promise resolving to array of MCP server information

---

#### testMCPServer()

> **testMCPServer**(`serverId`): `Promise`\

Defined in: [neurolink.ts:5089](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5089)

Test connectivity to a specific MCP server

##### Parameters

###### serverId

`string`

ID of the MCP server to test

##### Returns

`Promise`\

Promise resolving to true if server is reachable

---

#### hasProviderEnvVars()

> **hasProviderEnvVars**(`providerName`): `Promise`\

Defined in: [neurolink.ts:5130](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5130)

Check if a provider has the required environment variables configured

##### Parameters

###### providerName

`string`

Name of the provider to check

##### Returns

`Promise`\

Promise resolving to true if provider has required env vars

---

#### checkProviderHealth()

> **checkProviderHealth**(`providerName`, `options`): `Promise`\

Defined in: [neurolink.ts:5153](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5153)

Perform comprehensive health check on a specific provider

##### Parameters

###### providerName

`string`

Name of the provider to check

###### options

Health check options

###### timeout?

`number`

###### includeConnectivityTest?

`boolean`

###### includeModelValidation?

`boolean`

###### cacheResults?

`boolean`

##### Returns

`Promise`\

Promise resolving to detailed health status

---

#### checkAllProvidersHealth()

> **checkAllProvidersHealth**(`options`): `Promise`\

Defined in: [neurolink.ts:5199](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5199)

Check health of all supported providers

##### Parameters

###### options

Health check options

###### timeout?

`number`

###### includeConnectivityTest?

`boolean`

###### includeModelValidation?

`boolean`

###### cacheResults?

`boolean`

##### Returns

`Promise`\

Promise resolving to array of health statuses for all providers

---

#### getProviderHealthSummary()

> **getProviderHealthSummary**(): `Promise`\

Defined in: [neurolink.ts:5243](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5243)

Get a summary of provider health across all supported providers

##### Returns

`Promise`\

Promise resolving to health summary statistics

---

#### clearProviderHealthCache()

> **clearProviderHealthCache**(`providerName?`): `Promise`\

Defined in: [neurolink.ts:5290](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5290)

Clear provider health cache (useful for re-testing after configuration changes)

##### Parameters

###### providerName?

`string`

Optional specific provider to clear cache for

##### Returns

`Promise`\

---

#### getToolExecutionMetrics()

> **getToolExecutionMetrics**(): `Record`\

Defined in: [neurolink.ts:5301](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5301)

Get execution metrics for all tools

##### Returns

`Record`\

Object with execution metrics for each tool

---

#### getToolCircuitBreakerStatus()

> **getToolCircuitBreakerStatus**(): `Record`\

Defined in: [neurolink.ts:5341](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5341)

Get circuit breaker status for all tools

##### Returns

`Record`\

Object with circuit breaker status for each tool

---

#### resetToolCircuitBreaker()

> **resetToolCircuitBreaker**(`toolName`): `void`

Defined in: [neurolink.ts:5376](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5376)

Reset circuit breaker for a specific tool

##### Parameters

###### toolName

`string`

Name of the tool to reset circuit breaker for

##### Returns

`void`

---

#### clearToolExecutionMetrics()

> **clearToolExecutionMetrics**(): `void`

Defined in: [neurolink.ts:5393](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5393)

Clear all tool execution metrics

##### Returns

`void`

---

#### getToolHealthReport()

> **getToolHealthReport**(): `Promise`\; \}\>

Defined in: [neurolink.ts:5402](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5402)

Get comprehensive tool health report

##### Returns

`Promise`\; \}\>

Detailed health report for all tools

---

#### ensureConversationMemoryInitialized()

> **ensureConversationMemoryInitialized**(): `Promise`\

Defined in: [neurolink.ts:5522](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5522)

Initialize conversation memory if enabled (public method for explicit initialization)
This is useful for testing or when you want to ensure conversation memory is ready

##### Returns

`Promise`\

Promise resolving to true if initialization was successful, false otherwise

---

#### getConversationStats()

> **getConversationStats**(): `Promise`\

Defined in: [neurolink.ts:5542](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5542)

Get conversation memory statistics (public API)

##### Returns

`Promise`\

---

#### getConversationHistory()

> **getConversationHistory**(`sessionId`): `Promise`\

Defined in: [neurolink.ts:5563](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5563)

Get complete conversation history for a specific session (public API)

##### Parameters

###### sessionId

`string`

The session ID to retrieve history for

##### Returns

`Promise`\

Array of ChatMessage objects in chronological order, or empty array if session doesn't exist

---

#### clearConversationSession()

> **clearConversationSession**(`sessionId`): `Promise`\

Defined in: [neurolink.ts:5606](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5606)

Clear conversation history for a specific session (public API)

##### Parameters

###### sessionId

`string`

##### Returns

`Promise`\

---

#### clearAllConversations()

> **clearAllConversations**(): `Promise`\

Defined in: [neurolink.ts:5625](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5625)

Clear all conversation history (public API)

##### Returns

`Promise`\

---

#### storeToolExecutions()

> **storeToolExecutions**(`sessionId`, `userId`, `toolCalls`, `toolResults`, `currentTime?`): `Promise`\

Defined in: [neurolink.ts:5649](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5649)

Store tool executions in conversation memory if enabled and Redis is configured

##### Parameters

###### sessionId

`string`

Session identifier

###### userId

User identifier (optional)

`string` | `undefined`

###### toolCalls

`object`[]

Array of tool calls

###### toolResults

`object`[]

Array of tool results

###### currentTime?

`Date`

Date when the tool execution occurred (optional)

##### Returns

`Promise`\

Promise resolving when storage is complete

---

#### isToolExecutionStorageAvailable()

> **isToolExecutionStorageAvailable**(): `boolean`

Defined in: [neurolink.ts:5706](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5706)

Check if tool execution storage is available

##### Returns

`boolean`

boolean indicating if Redis storage is configured and available

---

#### addExternalMCPServer()

> **addExternalMCPServer**(`serverId`, `config`): `Promise`\\>

Defined in: [neurolink.ts:5725](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5725)

Add an external MCP server
Automatically discovers and registers tools from the server

##### Parameters

###### serverId

`string`

Unique identifier for the server

###### config

[`MCPServerInfo`](/docs/api/type-aliases/MCPServerInfo)

External MCP server configuration

##### Returns

`Promise`\\>

Operation result with server instance

---

#### removeExternalMCPServer()

> **removeExternalMCPServer**(`serverId`): `Promise`\\>

Defined in: [neurolink.ts:5782](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5782)

Remove an external MCP server
Stops the server and removes all its tools

##### Parameters

###### serverId

`string`

ID of the server to remove

##### Returns

`Promise`\\>

Operation result

---

#### listExternalMCPServers()

> **listExternalMCPServers**(): `object`[]

Defined in: [neurolink.ts:5824](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5824)

List all external MCP servers

##### Returns

`object`[]

Array of server health information

---

#### getExternalMCPServer()

> **getExternalMCPServer**(`serverId`): `ExternalMCPServerInstance` \| `undefined`

Defined in: [neurolink.ts:5853](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5853)

Get external MCP server status

##### Parameters

###### serverId

`string`

ID of the server

##### Returns

`ExternalMCPServerInstance` \| `undefined`

Server instance or undefined if not found

---

#### executeExternalMCPTool()

> **executeExternalMCPTool**(`serverId`, `toolName`, `parameters`, `options?`): `Promise`\

Defined in: [neurolink.ts:5867](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5867)

Execute a tool from an external MCP server

##### Parameters

###### serverId

`string`

ID of the server

###### toolName

`string`

Name of the tool

###### parameters

`JsonObject`

Tool parameters

###### options?

Execution options

###### timeout?

`number`

##### Returns

`Promise`\

Tool execution result

---

#### getExternalMCPTools()

> **getExternalMCPTools**(): `ExternalMCPToolInfo`[]

Defined in: [neurolink.ts:5902](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5902)

Get all tools from external MCP servers

##### Returns

`ExternalMCPToolInfo`[]

Array of external tool information

---

#### getExternalMCPServerTools()

> **getExternalMCPServerTools**(`serverId`): `ExternalMCPToolInfo`[]

Defined in: [neurolink.ts:5911](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5911)

Get tools from a specific external MCP server

##### Parameters

###### serverId

`string`

ID of the server

##### Returns

`ExternalMCPToolInfo`[]

Array of tool information for the server

---

#### testExternalMCPConnection()

> **testExternalMCPConnection**(`config`): `Promise`\

Defined in: [neurolink.ts:5920](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5920)

Test connection to an external MCP server

##### Parameters

###### config

[`MCPServerInfo`](/docs/api/type-aliases/MCPServerInfo)

Server configuration to test

##### Returns

`Promise`\

Test result with connection status

---

#### getExternalMCPStatistics()

> **getExternalMCPStatistics**(): `object`

Defined in: [neurolink.ts:5945](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5945)

Get external MCP server manager statistics

##### Returns

`object`

Statistics about external servers and tools

###### totalServers

> **totalServers**: `number`

###### connectedServers

> **connectedServers**: `number`

###### failedServers

> **failedServers**: `number`

###### totalTools

> **totalTools**: `number`

###### totalConnections

> **totalConnections**: `number`

###### totalErrors

> **totalErrors**: `number`

---

#### shutdownExternalMCPServers()

> **shutdownExternalMCPServers**(): `Promise`\

Defined in: [neurolink.ts:5960](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L5960)

Shutdown all external MCP servers
Called automatically on process exit

##### Returns

`Promise`\

---

#### dispose()

> **dispose**(): `Promise`\

Defined in: [neurolink.ts:6161](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/neurolink.ts#L6161)

Dispose of all resources and cleanup connections
Call this method when done using the NeuroLink instance to prevent resource leaks
Especially important in test environments where multiple instances are created

##### Returns

`Promise`\

---

## Type Alias: EvaluationData

<!-- Source: api/type-aliases/EvaluationData.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### accuracy

> **accuracy**: `number`

Defined in: [types/evaluation.ts:32](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L32)

---

### completeness

> **completeness**: `number`

Defined in: [types/evaluation.ts:33](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L33)

---

### overall

> **overall**: `number`

Defined in: [types/evaluation.ts:34](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L34)

---

### domainAlignment?

> `optional` **domainAlignment**: `number`

Defined in: [types/evaluation.ts:35](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L35)

---

### terminologyAccuracy?

> `optional` **terminologyAccuracy**: `number`

Defined in: [types/evaluation.ts:36](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L36)

---

### toolEffectiveness?

> `optional` **toolEffectiveness**: `number`

Defined in: [types/evaluation.ts:37](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L37)

---

### responseContent?

> `optional` **responseContent**: `string`

Defined in: [types/evaluation.ts:40](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L40)

---

### queryContent?

> `optional` **queryContent**: `string`

Defined in: [types/evaluation.ts:41](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L41)

---

### isOffTopic

> **isOffTopic**: `boolean`

Defined in: [types/evaluation.ts:44](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L44)

---

### alertSeverity

> **alertSeverity**: `AlertSeverity`

Defined in: [types/evaluation.ts:45](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L45)

---

### reasoning

> **reasoning**: `string`

Defined in: [types/evaluation.ts:46](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L46)

---

### suggestedImprovements?

> `optional` **suggestedImprovements**: `string`

Defined in: [types/evaluation.ts:47](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L47)

---

### evaluationModel

> **evaluationModel**: `string`

Defined in: [types/evaluation.ts:50](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L50)

---

### evaluationTime

> **evaluationTime**: `number`

Defined in: [types/evaluation.ts:51](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L51)

---

### evaluationDomain?

> `optional` **evaluationDomain**: `string`

Defined in: [types/evaluation.ts:52](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L52)

---

### evaluationProvider?

> `optional` **evaluationProvider**: `string`

Defined in: [types/evaluation.ts:55](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L55)

---

### evaluationAttempt?

> `optional` **evaluationAttempt**: `number`

Defined in: [types/evaluation.ts:56](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L56)

---

### evaluationConfig?

> `optional` **evaluationConfig**: `object`

Defined in: [types/evaluation.ts:57](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L57)

#### mode

> **mode**: `string`

#### fallbackUsed

> **fallbackUsed**: `boolean`

#### costEstimate

> **costEstimate**: `number`

---

### domainConfig?

> `optional` **domainConfig**: `object`

Defined in: [types/evaluation.ts:64](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L64)

#### domainName

> **domainName**: `string`

#### domainDescription

> **domainDescription**: `string`

#### keyTerms

> **keyTerms**: `string`[]

#### failurePatterns

> **failurePatterns**: `string`[]

#### successPatterns

> **successPatterns**: `string`[]

#### evaluationCriteria?

> `optional` **evaluationCriteria**: `Record`\

---

### domainEvaluation?

> `optional` **domainEvaluation**: `object`

Defined in: [types/evaluation.ts:74](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/evaluation.ts#L74)

#### domainRelevance

> **domainRelevance**: `number`

#### terminologyAccuracy

> **terminologyAccuracy**: `number`

#### domainExpertise

> **domainExpertise**: `number`

#### domainSpecificInsights

> **domainSpecificInsights**: `string`[]

---

## Function: createVectorQueryTool()

<!-- Source: api/functions/createVectorQueryTool.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

-------- | ----------------------------------------------------------- | ---------------------------------------------------------------- |
| config      | `VectorQueryToolConfig`                                     | Tool configuration options                                       |
| vectorStore | `VectorStore \| ((context: RequestContext) => VectorStore)` | Vector store instance or factory function for dynamic resolution |

## Returns

`Tool`

A tool object with `name`, `description`, `parameters`, and `execute` method compatible with NeuroLink's generate() and stream() APIs.

## Configuration Options

The `VectorQueryToolConfig` type accepts the following properties:

| Property          | Type                                      | Required | Default                          | Description                                  |
| ----------------- | ----------------------------------------- | -------- | -------------------------------- | -------------------------------------------- |
| `id`              | `string`                                  | No       | `vector-query-{uuid}`            | Unique tool identifier                       |
| `description`     | `string`                                  | No       | `"Access the knowledge base..."` | Tool description shown to AI agents          |
| `indexName`       | `string`                                  | **Yes**  | -                                | Index name within the vector store           |
| `embeddingModel`  | `{ provider: string; modelName: string }` | **Yes**  | -                                | Embedding model specification                |
| `enableFilter`    | `boolean`                                 | No       | `false`                          | Enable metadata filtering in tool parameters |
| `includeVectors`  | `boolean`                                 | No       | `false`                          | Include embedding vectors in results         |
| `includeSources`  | `boolean`                                 | No       | `true`                           | Include full source objects in results       |
| `topK`            | `number`                                  | No       | `10`                             | Number of results to return                  |
| `reranker`        | `RerankerConfig`                          | No       | -                                | Reranker configuration for result refinement |
| `providerOptions` | `VectorProviderOptions`                   | No       | -                                | Provider-specific query options              |

### RerankerConfig

| Property  | Type                                                        | Description                       |
| --------- | ----------------------------------------------------------- | --------------------------------- |
| `model`   | `{ provider: string; modelName: string }`                   | Language model for reranking      |
| `weights` | `{ semantic?: number; vector?: number; position?: number }` | Scoring weights                   |
| `topK`    | `number`                                                    | Number of results after reranking |

### VectorProviderOptions

Provider-specific options for Pinecone, pgVector, and Chroma:

```typescript
type VectorProviderOptions = {
  pinecone?: {
    namespace?: string;
    sparseVector?: number[];
  };
  pgVector?: {
    minScore?: number;
    ef?: number;
    probes?: number;
  };
  chroma?: {
    where?: Record;
    whereDocument?: Record;
  };
};
```

## Examples

### Basic usage

```typescript

const vectorStore = new InMemoryVectorStore();

// Pre-populate with data
await vectorStore.upsert("knowledge-base", [
  { id: "doc1", vector: [0.1, 0.2, ...], metadata: { text: "Paris is the capital of France." } },
  { id: "doc2", vector: [0.3, 0.4, ...], metadata: { text: "London is the capital of England." } },
]);

const queryTool = createVectorQueryTool(
  {
    indexName: "knowledge-base",
    embeddingModel: {
      provider: "openai",
      modelName: "text-embedding-3-small",
    },
  },
  vectorStore
);

// Use with generate()
const response = await generate({
  model: openai("gpt-4"),
  tools: { knowledgeSearch: queryTool },
  prompt: "What is the capital of France?",
});
```

### With reranking

```typescript

const queryTool = createVectorQueryTool(
  {
    id: "docs-search",
    description: "Search the documentation for relevant information",
    indexName: "documentation",
    embeddingModel: {
      provider: "openai",
      modelName: "text-embedding-3-large",
    },
    topK: 20, // Fetch more results initially
    reranker: {
      model: {
        provider: "openai",
        modelName: "gpt-4o-mini",
      },
      weights: {
        semantic: 0.6,
        vector: 0.3,
        position: 0.1,
      },
      topK: 5, // Return top 5 after reranking
    },
  },
  vectorStore,
);
```

### With metadata filtering

```typescript

const queryTool = createVectorQueryTool(
  {
    id: "filtered-search",
    indexName: "products",
    embeddingModel: {
      provider: "openai",
      modelName: "text-embedding-3-small",
    },
    enableFilter: true, // Enable filter parameter for the tool
    topK: 10,
  },
  vectorStore,
);

// The AI can now use filters when calling the tool
const response = await generate({
  model: openai("gpt-4"),
  tools: { productSearch: queryTool },
  prompt: "Find electronics products under $100",
});

// Or call the tool directly with filters
const results = await queryTool.execute({
  query: "wireless headphones",
  filter: {
    category: "electronics",
    price: { $lt: 100 },
    $or: [{ brand: "Sony" }, { brand: "Bose" }],
  },
  topK: 5,
});
```

### Dynamic vector store resolution

```typescript

// Factory function for multi-tenant scenarios
const vectorStoreFactory = (context: RequestContext) => {
  const tenantId = context.tenantId;
  return getTenantVectorStore(tenantId); // Returns tenant-specific store
};

const queryTool = createVectorQueryTool(
  {
    id: "tenant-search",
    indexName: "documents",
    embeddingModel: {
      provider: "openai",
      modelName: "text-embedding-3-small",
    },
  },
  vectorStoreFactory,
);

// Context is passed during execution
const results = await queryTool.execute(
  { query: "quarterly report" },
  { tenantId: "tenant-123", userId: "user-456" },
);
```

### With provider-specific options

```typescript

// Pinecone with namespace
const pineconeQueryTool = createVectorQueryTool(
  {
    indexName: "my-index",
    embeddingModel: {
      provider: "openai",
      modelName: "text-embedding-3-small",
    },
    providerOptions: {
      pinecone: {
        namespace: "production",
      },
    },
  },
  pineconeStore,
);

// pgVector with minimum score threshold
const pgVectorQueryTool = createVectorQueryTool(
  {
    indexName: "documents",
    embeddingModel: {
      provider: "openai",
      modelName: "text-embedding-3-small",
    },
    providerOptions: {
      pgVector: {
        minScore: 0.7,
        probes: 10,
      },
    },
  },
  pgVectorStore,
);
```

## Response Format

The tool returns a `VectorQueryResponse` object:

```typescript
type VectorQueryResponse = {
  /** Formatted relevant context string */
  relevantContext: string;
  /** Source query results (if includeSources is true) */
  sources: VectorQueryResult[];
  /** Total results found */
  totalResults: number;
  /** Query metadata */
  metadata: {
    queryTime: number;
    reranked: boolean;
    filtered: boolean;
  };
};
```

## Metadata Filter Syntax

When `enableFilter` is true, the tool accepts MongoDB/Sift-style query syntax:

```typescript
// Comparison operators
{
  field: {
    $eq: value;
  }
} // Equal
{
  field: {
    $ne: value;
  }
} // Not equal
{
  field: {
    $gt: 10;
  }
} // Greater than
{
  field: {
    $gte: 10;
  }
} // Greater than or equal
{
  field: {
    $lt: 10;
  }
} // Less than
{
  field: {
    $lte: 10;
  }
} // Less than or equal
{
  field: {
    $in: [1, 2];
  }
} // In array
{
  field: {
    $nin: [1, 2];
  }
} // Not in array

// Logical operators
{
  $and: [filter1, filter2];
}
{
  $or: [filter1, filter2];
}
{
  $not: filter;
}

// Special operators
{
  field: {
    $exists: true;
  }
}
{
  field: {
    $contains: "text";
  }
}
{
  field: {
    $regex: "pattern";
  }
}

// Direct equality (shorthand)
{
  category: "electronics";
}
```

## Notes

- The tool automatically generates embeddings for the query using the specified embedding model
- Results are formatted as numbered context for easy reference by AI models
- When using reranking, consider fetching more initial results (higher `topK`) and then reducing with the reranker's `topK`
- The dynamic vector store factory is useful for multi-tenant applications or per-request store selection
- Query timing and reranking status are included in the response metadata for observability

## See Also

- [InMemoryVectorStore](/docs/classes/inmemoryvectorstore) - Built-in vector store for testing
- [VectorQueryToolConfig](/docs/type-aliases/vectorquerytoolconfig) - Configuration type reference
- [generate](/docs/generatetext) - Using tools with the generate API
- [createReranker](/docs/createreranker) - Creating standalone rerankers
- [createHybridSearch](/docs/createhybridsearch) - Hybrid vector + BM25 search

---

## Class: NeuroLinkOAuthProvider

<!-- Source: api/classes/NeuroLinkOAuthProvider.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### saveTokens()

> **saveTokens**(`serverId`, `tokens`): `Promise`\

Defined in: [mcp/auth/oauthClientProvider.ts:84](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L84)

Save tokens for a server

#### Parameters

##### serverId

`string`

##### tokens

[`OAuthTokens`](/docs/api/type-aliases/OAuthTokens)

#### Returns

`Promise`\

---

### deleteTokens()

> **deleteTokens**(`serverId`): `Promise`\

Defined in: [mcp/auth/oauthClientProvider.ts:91](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L91)

Delete tokens for a server

#### Parameters

##### serverId

`string`

#### Returns

`Promise`\

---

### clientInformation()

> **clientInformation**(): [`OAuthClientInformation`](/docs/api/type-aliases/OAuthClientInformation)

Defined in: [mcp/auth/oauthClientProvider.ts:98](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L98)

Get client information for MCP SDK

#### Returns

[`OAuthClientInformation`](/docs/api/type-aliases/OAuthClientInformation)

---

### redirectToAuthorization()

> **redirectToAuthorization**(`_serverId`): [`AuthorizationUrlResult`](/docs/api/type-aliases/AuthorizationUrlResult)

Defined in: [mcp/auth/oauthClientProvider.ts:111](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L111)

Generate authorization URL for OAuth flow
Returns the URL to redirect the user to for authorization

#### Parameters

##### \_serverId

`string`

Server ID (reserved for future use in state management)

#### Returns

[`AuthorizationUrlResult`](/docs/api/type-aliases/AuthorizationUrlResult)

---

### exchangeCode()

> **exchangeCode**(`serverId`, `request`): `Promise`\

Defined in: [mcp/auth/oauthClientProvider.ts:160](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L160)

Exchange authorization code for tokens

#### Parameters

##### serverId

`string`

##### request

[`TokenExchangeRequest`](/docs/api/type-aliases/TokenExchangeRequest)

#### Returns

`Promise`\

---

### refreshTokens()

> **refreshTokens**(`serverId`, `refreshToken`): `Promise`\

Defined in: [mcp/auth/oauthClientProvider.ts:236](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L236)

Refresh tokens using refresh token

#### Parameters

##### serverId

`string`

##### refreshToken

`string`

#### Returns

`Promise`\

---

### revokeTokens()

> **revokeTokens**(`serverId`, `revocationUrl`): `Promise`\

Defined in: [mcp/auth/oauthClientProvider.ts:286](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L286)

Revoke tokens (if supported by the OAuth server)

#### Parameters

##### serverId

`string`

##### revocationUrl

`string`

#### Returns

`Promise`\

---

### getAuthorizationHeader()

> **getAuthorizationHeader**(`serverId`): `Promise`\

Defined in: [mcp/auth/oauthClientProvider.ts:322](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L322)

Get authorization header value for API requests

#### Parameters

##### serverId

`string`

#### Returns

`Promise`\

---

### hasValidTokens()

> **hasValidTokens**(`serverId`): `Promise`\

Defined in: [mcp/auth/oauthClientProvider.ts:335](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L335)

Check if a server has valid (non-expired) tokens

#### Parameters

##### serverId

`string`

#### Returns

`Promise`\

---

### getConfig()

> **getConfig**(): `Readonly`\

Defined in: [mcp/auth/oauthClientProvider.ts:370](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L370)

Get the OAuth configuration

#### Returns

`Readonly`\

---

### getStorage()

> **getStorage**(): [`TokenStorage`](/docs/api/type-aliases/TokenStorage)

Defined in: [mcp/auth/oauthClientProvider.ts:377](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L377)

Get the token storage instance

#### Returns

[`TokenStorage`](/docs/api/type-aliases/TokenStorage)

---

### cleanupPendingRequests()

> **cleanupPendingRequests**(): `void`

Defined in: [mcp/auth/oauthClientProvider.ts:385](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/oauthClientProvider.ts#L385)

Clean up expired pending states and challenges
Should be called periodically to prevent memory leaks

#### Returns

`void`

---

## Type Alias: ExecutionContext\<T\>

<!-- Source: api/type-aliases/ExecutionContext.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### userId?

> `optional` **userId**: `string`

Defined in: [types/tools.ts:58](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L58)

---

### config?

> `optional` **config**: `T`

Defined in: [types/tools.ts:61](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L61)

---

### metadata?

> `optional` **metadata**: `StandardRecord`

Defined in: [types/tools.ts:62](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L62)

---

### cacheOptions?

> `optional` **cacheOptions**: `CacheOptions`

Defined in: [types/tools.ts:65](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L65)

---

### fallbackOptions?

> `optional` **fallbackOptions**: `FallbackOptions`

Defined in: [types/tools.ts:66](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L66)

---

### timeoutMs?

> `optional` **timeoutMs**: `number`

Defined in: [types/tools.ts:67](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L67)

---

### startTime?

> `optional` **startTime**: `number`

Defined in: [types/tools.ts:68](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L68)

---

## Function: executeMCP()

<!-- Source: api/functions/executeMCP.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / executeMCP

# Function: executeMCP()

> **executeMCP**\(`_name`, `_config`, `_args`, `_context?`): `Promise`\

Defined in: [mcp/index.ts:73](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/index.ts#L73)

Execute an MCP operation - simplified

## Type Parameters

### T

`T` = `unknown`

## Parameters

### \_name

`string`

### \_config

`unknown`

### \_args

`unknown`

### \_context?

#### sessionId?

`string`

#### userId?

`string`

## Returns

`Promise`\

---

## Class: RAGPipeline

<!-- Source: api/classes/RAGPipeline.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

------ | ----------------------------------------- | ------------------------------ |
| `config`  | [`RAGPipelineConfig`](#ragpipelineconfig) | Pipeline configuration options |

#### Returns

`RAGPipeline`

#### Example

```typescript
const pipeline = new RAGPipeline({
  embeddingModel: { provider: "openai", modelName: "text-embedding-3-small" },
  generationModel: { provider: "openai", modelName: "gpt-4o-mini" },
});
```

## Methods

### initialize()

> **initialize**(): `Promise`

Defined in: [src/lib/rag/pipeline/RAGPipeline.ts:225](https://github.com/juspay/neurolink/blob/main/src/lib/rag/pipeline/RAGPipeline.ts#L225)

Initialize the pipeline by loading AI providers. Called automatically on first use.

#### Returns

`Promise`

---

### ingest()

> **ingest**(`sources`, `options?`): `Promise`

Defined in: [src/lib/rag/pipeline/RAGPipeline.ts:259](https://github.com/juspay/neurolink/blob/main/src/lib/rag/pipeline/RAGPipeline.ts#L259)

Ingests documents into the pipeline. Performs the complete ingestion workflow:

1. Loads documents from file paths, URLs, or MDocument instances
2. Chunks documents using the configured strategy
3. Optionally extracts metadata using LLM
4. Generates embeddings for all chunks
5. Stores chunks in vector store and BM25 index
6. Updates Graph RAG if enabled

#### Parameters

| Parameter  | Type                              | Description                                       |
| ---------- | --------------------------------- | ------------------------------------------------- |
| `sources`  | `Array`      | Array of file paths, URLs, or MDocument instances |
| `options?` | [`IngestOptions`](#ingestoptions) | Optional ingestion configuration                  |

#### Returns

`Promise`

Object containing counts of processed documents and created chunks

#### Example

```typescript
// Ingest from file paths
const result = await pipeline.ingest([
  "./docs/guide.md",
  "./docs/api.md",
  "https://example.com/content.html",
]);
console.log(
  `Processed ${result.documentsProcessed} documents, ${result.chunksCreated} chunks`,
);

// Ingest with custom options
await pipeline.ingest(sources, {
  strategy: "markdown",
  chunkSize: 1500,
  chunkOverlap: 200,
  metadata: { source: "documentation", version: "2.0" },
  extractMetadata: true,
});

// Ingest MDocument instances
const doc = new MDocument({ text: "My content", metadata: { type: "manual" } });
await pipeline.ingest([doc]);
```

---

### query()

> **query**(`query`, `options?`): `Promise`

Defined in: [src/lib/rag/pipeline/RAGPipeline.ts:384](https://github.com/juspay/neurolink/blob/main/src/lib/rag/pipeline/RAGPipeline.ts#L384)

Queries the pipeline and generates a response with sources. Performs:

1. Generates embedding for the query
2. Retrieves relevant chunks using vector, hybrid, or graph search
3. Optionally reranks results for better relevance
4. Assembles context from retrieved chunks
5. Generates answer using LLM (if configured)

#### Parameters

| Parameter  | Type                            | Description                  |
| ---------- | ------------------------------- | ---------------------------- |
| `query`    | `string`                        | The search query             |
| `options?` | [`QueryOptions`](#queryoptions) | Optional query configuration |

#### Returns

`Promise`

RAG response with answer, context, sources, and metadata

#### Example

```typescript
// Basic query
const response = await pipeline.query("What are the main features?");
console.log(response.answer);
console.log(response.sources);

// Query with options
const response = await pipeline.query("How do I configure auth?", {
  topK: 10,
  hybrid: true,
  rerank: true,
  filter: { type: "documentation" },
  systemPrompt: "You are a helpful technical assistant.",
  temperature: 0.5,
});

// Retrieval only (no generation)
const response = await pipeline.query("authentication", {
  generate: false,
  topK: 20,
});
console.log(response.context);
```

---

### getStats()

> **getStats**(): `PipelineStats`

Defined in: [src/lib/rag/pipeline/RAGPipeline.ts:498](https://github.com/juspay/neurolink/blob/main/src/lib/rag/pipeline/RAGPipeline.ts#L498)

Get pipeline statistics including document counts and feature status.

#### Returns

[`PipelineStats`](#pipelinestats)

Statistics about the pipeline state

#### Example

```typescript
const stats = pipeline.getStats();
console.log(`Documents: ${stats.totalDocuments}`);
console.log(`Chunks: ${stats.totalChunks}`);
console.log(`Hybrid search: ${stats.hybridSearchEnabled}`);
console.log(`Graph RAG: ${stats.graphRAGEnabled}`);
```

---

### getId()

> **getId**(): `string`

Defined in: [src/lib/rag/pipeline/RAGPipeline.ts:512](https://github.com/juspay/neurolink/blob/main/src/lib/rag/pipeline/RAGPipeline.ts#L512)

Get the unique pipeline identifier.

#### Returns

`string`

Pipeline ID

---

### clear()

> **clear**(): `Promise`

Defined in: [src/lib/rag/pipeline/RAGPipeline.ts:519](https://github.com/juspay/neurolink/blob/main/src/lib/rag/pipeline/RAGPipeline.ts#L519)

Clear all indexed data from the pipeline. Removes all documents, chunks, and graph data.

#### Returns

`Promise`

#### Example

```typescript
await pipeline.clear();
console.log(pipeline.getStats().totalDocuments); // 0
```

## Configuration

### RAGPipelineConfig

Configuration options for RAGPipeline constructor.

| Option                    | Type                    | Default               | Description                               |
| ------------------------- | ----------------------- | --------------------- | ----------------------------------------- |
| `id`                      | `string`                | auto-generated        | Unique pipeline identifier                |
| `vectorStore`             | `VectorStore`           | `InMemoryVectorStore` | Vector storage backend for embeddings     |
| `bm25Index`               | `BM25Index`             | `InMemoryBM25Index`   | BM25 index for keyword search             |
| `indexName`               | `string`                | `"default"`           | Name of the index in the vector store     |
| `embeddingModel`          | `EmbeddingModelConfig`  | **required**          | Embedding model configuration             |
| `generationModel`         | `GenerationModelConfig` | -                     | LLM configuration for response generation |
| `defaultChunkingStrategy` | `ChunkingStrategy`      | `"recursive"`         | Default chunking strategy                 |
| `defaultChunkSize`        | `number`                | `1000`                | Default maximum chunk size in characters  |
| `defaultChunkOverlap`     | `number`                | `200`                 | Default overlap between chunks            |
| `enableHybridSearch`      | `boolean`               | `false`               | Enable BM25 + vector hybrid search        |
| `enableGraphRAG`          | `boolean`               | `false`               | Enable knowledge graph retrieval          |
| `graphThreshold`          | `number`                | `0.7`                 | Similarity threshold for graph edges      |
| `defaultTopK`             | `number`                | `5`                   | Default number of results to retrieve     |
| `enableReranking`         | `boolean`               | `false`               | Enable result reranking                   |
| `rerankingModel`          | `EmbeddingModelConfig`  | -                     | Model configuration for reranking         |

### EmbeddingModelConfig

| Option      | Type     | Description                                              |
| ----------- | -------- | -------------------------------------------------------- |
| `provider`  | `string` | AI provider name (e.g., "openai", "vertex", "anthropic") |
| `modelName` | `string` | Model identifier (e.g., "text-embedding-3-small")        |

### GenerationModelConfig

| Option        | Type     | Default | Description                |
| ------------- | -------- | ------- | -------------------------- |
| `provider`    | `string` | -       | AI provider name           |
| `modelName`   | `string` | -       | Model identifier           |
| `temperature` | `number` | `0.7`   | Generation temperature     |
| `maxTokens`   | `number` | `1000`  | Maximum tokens in response |

### IngestOptions

| Option            | Type                      | Description                                |
| ----------------- | ------------------------- | ------------------------------------------ |
| `strategy`        | `ChunkingStrategy`        | Override default chunking strategy         |
| `chunkSize`       | `number`                  | Override default chunk size                |
| `chunkOverlap`    | `number`                  | Override default chunk overlap             |
| `metadata`        | `Record` | Custom metadata to add to chunks           |
| `extractMetadata` | `boolean`                 | Extract title, summary, keywords using LLM |

### QueryOptions

| Option           | Type                      | Default            | Description                   |
| ---------------- | ------------------------- | ------------------ | ----------------------------- |
| `topK`           | `number`                  | config default     | Number of chunks to retrieve  |
| `hybrid`         | `boolean`                 | config default     | Use hybrid search             |
| `graph`          | `boolean`                 | config default     | Use Graph RAG                 |
| `rerank`         | `boolean`                 | config default     | Enable reranking              |
| `filter`         | `Record` | -                  | Metadata filter for retrieval |
| `includeSources` | `boolean`                 | `true`             | Include sources in response   |
| `generate`       | `boolean`                 | `true`             | Generate LLM response         |
| `systemPrompt`   | `string`                  | default RAG prompt | Custom system prompt          |
| `temperature`    | `number`                  | config default     | Generation temperature        |

## Response Types

### RAGResponse

| Property   | Type                  | Description                             |
| ---------- | --------------------- | --------------------------------------- |
| `answer`   | `string \| undefined` | Generated answer (if generate=true)     |
| `context`  | `string`              | Assembled context from retrieved chunks |
| `sources`  | `Array`       | Retrieved source chunks with scores     |
| `metadata` | `ResponseMetadata`    | Query execution metadata                |

### Source

| Property   | Type                      | Description        |
| ---------- | ------------------------- | ------------------ |
| `id`       | `string`                  | Chunk identifier   |
| `text`     | `string`                  | Chunk text content |
| `score`    | `number`                  | Relevance score    |
| `metadata` | `Record` | Chunk metadata     |

### ResponseMetadata

| Property          | Type      | Description                               |
| ----------------- | --------- | ----------------------------------------- |
| `queryTime`       | `number`  | Total query time in milliseconds          |
| `retrievalMethod` | `string`  | Method used ("vector", "hybrid", "graph") |
| `chunksRetrieved` | `number`  | Number of chunks retrieved                |
| `reranked`        | `boolean` | Whether results were reranked             |

### PipelineStats

| Property              | Type                  | Description                  |
| --------------------- | --------------------- | ---------------------------- |
| `totalDocuments`      | `number`              | Number of ingested documents |
| `totalChunks`         | `number`              | Total number of chunks       |
| `indexName`           | `string`              | Vector store index name      |
| `embeddingDimension`  | `number \| undefined` | Embedding vector dimension   |
| `hybridSearchEnabled` | `boolean`             | Hybrid search status         |
| `graphRAGEnabled`     | `boolean`             | Graph RAG status             |

## Factory Function

### createRAGPipeline()

> **createRAGPipeline**(`options`): `RAGPipeline`

Defined in: [src/lib/rag/pipeline/RAGPipeline.ts:622](https://github.com/juspay/neurolink/blob/main/src/lib/rag/pipeline/RAGPipeline.ts#L622)

Create a simple RAG pipeline with sensible defaults.

#### Parameters

| Parameter                 | Type      | Default                    | Description                              |
| ------------------------- | --------- | -------------------------- | ---------------------------------------- |
| `options.provider`        | `string`  | `"openai"`                 | AI provider for embedding and generation |
| `options.embeddingModel`  | `string`  | `"text-embedding-3-small"` | Embedding model name                     |
| `options.generationModel` | `string`  | -                          | Generation model name                    |
| `options.enableHybrid`    | `boolean` | `false`                    | Enable hybrid search                     |
| `options.enableGraph`     | `boolean` | `false`                    | Enable Graph RAG                         |

#### Returns

`RAGPipeline`

Configured RAGPipeline instance

#### Example

```typescript

// Simple pipeline
const pipeline = createRAGPipeline({
  generationModel: "gpt-4o-mini",
});

// With hybrid search
const pipeline = createRAGPipeline({
  provider: "openai",
  embeddingModel: "text-embedding-3-large",
  generationModel: "gpt-4o",
  enableHybrid: true,
});
```

## See Also

- [MDocument](/docs/mdocument) - Document representation and operations
- [InMemoryVectorStore](/docs/inmemoryvectorstore) - Default vector storage
- [InMemoryBM25Index](/docs/inmemorybm25index) - BM25 keyword index
- [GraphRAG](/docs/graphrag) - Knowledge graph retrieval
- [ChunkingStrategy](/docs/type-aliases/chunkingstrategy) - Available chunking strategies

---

## Type Alias: ExtractParams

<!-- Source: api/type-aliases/ExtractParams.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### summary?

> `optional` **summary**: `boolean` | [`SummaryExtractorConfig`](/docs/summaryextractorconfig)

Extract document summary. Set to `true` for defaults or provide configuration object.

---

### keywords?

> `optional` **keywords**: `boolean` | [`KeywordExtractorConfig`](/docs/keywordextractorconfig)

Extract keywords from content. Set to `true` for defaults or provide configuration object.

---

### questions?

> `optional` **questions**: `boolean` | [`QuestionExtractorConfig`](/docs/questionextractorconfig)

Generate Q&A pairs from content. Set to `true` for defaults or provide configuration object.

---

### custom?

> `optional` **custom**: [`CustomSchemaExtractorConfig`](/docs/customschemaextractorconfig)

Custom schema extraction using Zod schemas for structured data extraction.

## Example

```typescript

const doc = MDocument.fromMarkdown(content);
await doc.chunk({ strategy: "markdown" });

// Simple boolean flags for default extraction
await doc.extractMetadata({
  title: true,
  summary: true,
  keywords: true,
});

// Advanced configuration with options
await doc.extractMetadata({
  title: {
    nodes: 3,
    modelName: "gpt-4o-mini",
  },
  summary: {
    summaryTypes: ["current", "next"],
    maxWords: 100,
  },
  keywords: {
    maxKeywords: 10,
    minRelevance: 0.7,
  },
  questions: {
    numQuestions: 5,
    includeAnswers: true,
  },
});

// Custom schema extraction

await doc.extractMetadata({
  custom: {
    schema: z.object({
      entities: z.array(z.string()),
      sentiment: z.enum(["positive", "negative", "neutral"]),
    }),
    description: "Extract named entities and sentiment",
  },
});
```

---

## Function: flushOpenTelemetry()

<!-- Source: api/functions/flushOpenTelemetry.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / flushOpenTelemetry

# Function: flushOpenTelemetry()

> **flushOpenTelemetry**(): `Promise`\

Defined in: [services/server/ai/observability/instrumentation.ts:137](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/services/server/ai/observability/instrumentation.ts#L137)

Flush all pending spans to Langfuse

## Returns

`Promise`\

---

## Class: RateLimiterManager

<!-- Source: api/classes/RateLimiterManager.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### hasLimiter()

> **hasLimiter**(`serverId`): `boolean`

Defined in: [mcp/httpRateLimiter.ts:351](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L351)

Check if a rate limiter exists for a server

#### Parameters

##### serverId

`string`

Unique identifier for the server

#### Returns

`boolean`

true if a rate limiter exists for the server

---

### removeLimiter()

> **removeLimiter**(`serverId`): `void`

Defined in: [mcp/httpRateLimiter.ts:360](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L360)

Remove a rate limiter for a server

#### Parameters

##### serverId

`string`

Unique identifier for the server

#### Returns

`void`

---

### getServerIds()

> **getServerIds**(): `string`[]

Defined in: [mcp/httpRateLimiter.ts:377](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L377)

Get all server IDs with active rate limiters

#### Returns

`string`[]

Array of server IDs

---

### getAllStats()

> **getAllStats**(): `Record`\

Defined in: [mcp/httpRateLimiter.ts:386](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L386)

Get statistics for all rate limiters

#### Returns

`Record`\

Record of server IDs to their rate limiter statistics

---

### resetAll()

> **resetAll**(): `void`

Defined in: [mcp/httpRateLimiter.ts:399](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L399)

Reset all rate limiters

#### Returns

`void`

---

### destroyAll()

> **destroyAll**(): `void`

Defined in: [mcp/httpRateLimiter.ts:411](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L411)

Destroy all rate limiters and clean up resources
This should be called during application shutdown

#### Returns

`void`

---

### getHealthSummary()

> **getHealthSummary**(): `object`

Defined in: [mcp/httpRateLimiter.ts:423](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRateLimiter.ts#L423)

Get health summary for all rate limiters

#### Returns

`object`

##### totalLimiters

> **totalLimiters**: `number`

##### serversWithQueuedRequests

> **serversWithQueuedRequests**: `string`[]

##### totalQueuedRequests

> **totalQueuedRequests**: `number`

##### averageTokensAvailable

> **averageTokensAvailable**: `number`

---

## Type Alias: GenerateOptions

<!-- Source: api/type-aliases/GenerateOptions.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### output?

> `optional` **output**: `object`

Defined in: [types/generateTypes.ts:72](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L72)

Output configuration options

#### format?

> `optional` **format**: `"text"` \| `"structured"` \| `"json"`

Output format for text generation

#### mode?

> `optional` **mode**: `"text"` \| `"video"`

Output mode - determines the type of content generated

- "text": Standard text generation (default)
- "video": Video generation using models like Veo 3.1

#### video?

> `optional` **video**: `VideoOutputOptions`

Video generation configuration (used when mode is "video")
Requires an input image and text prompt

#### Examples

```typescript
output: {
  format: "text";
}
```

```typescript
output: {
  mode: "video",
  video: {
    resolution: "1080p",
    length: 8,
    aspectRatio: "16:9",
    audio: true
  }
}
```

---

### csvOptions?

> `optional` **csvOptions**: `object`

Defined in: [types/generateTypes.ts:89](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L89)

#### maxRows?

> `optional` **maxRows**: `number`

#### formatStyle?

> `optional` **formatStyle**: `"raw"` \| `"markdown"` \| `"json"`

#### includeHeaders?

> `optional` **includeHeaders**: `boolean`

---

### videoOptions?

> `optional` **videoOptions**: `object`

Defined in: [types/generateTypes.ts:96](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L96)

#### frames?

> `optional` **frames**: `number`

#### quality?

> `optional` **quality**: `number`

#### format?

> `optional` **format**: `"jpeg"` \| `"png"`

#### transcribeAudio?

> `optional` **transcribeAudio**: `boolean`

---

### tts?

> `optional` **tts**: `TTSOptions`

Defined in: [types/generateTypes.ts:135](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L135)

Text-to-Speech (TTS) configuration

Enable audio generation from the text response. The generated audio will be
returned in the result's `audio` field as a TTSResult object.

#### Examples

```typescript
const result = await neurolink.generate({
  input: { text: "Tell me a story" },
  provider: "google-ai",
  tts: { enabled: true, voice: "en-US-Neural2-C" },
});
console.log(result.audio?.buffer); // Audio Buffer
```

```typescript
const result = await neurolink.generate({
  input: { text: "Speak slowly and clearly" },
  provider: "google-ai",
  tts: {
    enabled: true,
    voice: "en-US-Neural2-D",
    speed: 0.8,
    pitch: 2.0,
    format: "mp3",
    quality: "standard",
  },
});
```

---

### thinkingConfig?

> `optional` **thinkingConfig**: `object`

Defined in: [types/generateTypes.ts:177](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L177)

Thinking/reasoning configuration for extended thinking models

Enables extended thinking capabilities for supported models.

**Gemini 3 Models** (gemini-3-pro-preview, gemini-3-flash-preview):
Use `thinkingLevel` to control reasoning depth:

- `minimal` - Near-zero thinking (Flash only)
- `low` - Fast reasoning for simple tasks
- `medium` - Balanced reasoning/latency
- `high` - Maximum reasoning depth (default for Pro)

**Anthropic Claude** (claude-3-7-sonnet, etc.):
Use `budgetTokens` to set token budget for thinking.

#### enabled?

> `optional` **enabled**: `boolean`

#### type?

> `optional` **type**: `"enabled"` \| `"disabled"`

#### budgetTokens?

> `optional` **budgetTokens**: `number`

Token budget for thinking (Anthropic models)

#### thinkingLevel?

> `optional` **thinkingLevel**: `"minimal"` \| `"low"` \| `"medium"` \| `"high"`

Thinking level for Gemini 3 models: minimal, low, medium, high

#### Examples

```typescript
const result = await neurolink.generate({
  input: { text: "Solve this complex problem..." },
  provider: "google-ai",
  model: "gemini-3-pro-preview",
  thinkingConfig: {
    thinkingLevel: "high",
  },
});
```

```typescript
const result = await neurolink.generate({
  input: { text: "Solve this complex math problem..." },
  provider: "anthropic",
  model: "claude-3-7-sonnet-20250219",
  thinkingConfig: {
    enabled: true,
    budgetTokens: 10000,
  },
});
```

---

### provider?

> `optional` **provider**: [`AIProviderName`](/docs/api/enumerations/AIProviderName) \| `string`

Defined in: [types/generateTypes.ts:187](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L187)

---

### model?

> `optional` **model**: `string`

Defined in: [types/generateTypes.ts:188](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L188)

---

### region?

> `optional` **region**: `string`

Defined in: [types/generateTypes.ts:189](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L189)

---

### temperature?

> `optional` **temperature**: `number`

Defined in: [types/generateTypes.ts:190](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L190)

---

### maxTokens?

> `optional` **maxTokens**: `number`

Defined in: [types/generateTypes.ts:191](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L191)

---

### systemPrompt?

> `optional` **systemPrompt**: `string`

Defined in: [types/generateTypes.ts:192](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L192)

---

### schema?

> `optional` **schema**: `ValidationSchema`

Defined in: [types/generateTypes.ts:225](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L225)

Zod schema for structured output validation

#### Important

Google Gemini Limitation
Google Vertex AI and Google AI Studio cannot combine function calling with
structured output. You MUST use `disableTools: true` when using schemas with
Google providers.

Error without disableTools: "Function calling with a response mime type:
'application/json' is unsupported"

This is a documented Google API limitation, not a NeuroLink bug.
All frameworks (LangChain, Vercel AI SDK, Agno, Instructor) use this approach.

#### Example

```typescript
// ✅ Correct for Google providers
const result = await neurolink.generate({
  schema: MySchema,
  provider: "vertex",
  disableTools: true, // Required for Google
});

// ✅ No restriction for other providers
const result = await neurolink.generate({
  schema: MySchema,
  provider: "openai", // Works without disableTools
});
```

#### See

https://ai.google.dev/gemini-api/docs/function-calling

---

### tools?

> `optional` **tools**: `Record`\

Defined in: [types/generateTypes.ts:226](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L226)

---

### timeout?

> `optional` **timeout**: `number` \| `string`

Defined in: [types/generateTypes.ts:227](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L227)

---

### disableTools?

> `optional` **disableTools**: `boolean`

Defined in: [types/generateTypes.ts:245](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L245)

Disable tool execution (including built-in tools)

#### Required

For Google Gemini providers when using schemas
Google Vertex AI and Google AI Studio require this flag when using
structured output (schemas) due to Google API limitations.

#### Example

```typescript
// Required for Google providers with schemas
await neurolink.generate({
  schema: MySchema,
  provider: "vertex",
  disableTools: true,
});
```

---

### maxSteps?

> `optional` **maxSteps**: `number`

Defined in: [types/generateTypes.ts:247](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L247)

Maximum number of tool execution steps (default: 5).

---

### toolChoice?

> `optional` **toolChoice**: `"auto"` \| `"none"` \| `"required"` \| \{ `type`: `"tool"`; `toolName`: `string` \}

Defined in: [types/generateTypes.ts:263](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L263)

Tool choice configuration for the generation.
Controls whether and which tools the model must call.

- `"auto"` (default): the model can choose whether and which tools to call
- `"none"`: no tool calls allowed
- `"required"`: the model must call at least one tool and calls indefinitely until maxSteps is reached and outputs empty string.
- `{ type: "tool", toolName: string }`: the model must and only call the specified tool and calls indefinitely until maxSteps is reached and outputs empty string.

> **Note:** When used without `prepareStep`, this applies to **every step** in the
> `maxSteps` loop. Using `"required"` or `{ type: "tool" }` without `prepareStep`
> will cause infinite tool calls until `maxSteps` is exhausted.

---

### prepareStep?

> `optional` **prepareStep**: (`options`: \{ `steps`: `StepResult`[]; `stepNumber`: `number`; `maxSteps`: `number`; `model`: `LanguageModel` \}) => `PromiseLike`\

Defined in: [types/generateTypes.ts:288](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L288)

Optional callback that runs before each step in a multi-step generation.
Allows dynamically changing `toolChoice` and available tools per step.

This is the recommended way to enforce specific tool calls on certain steps
while allowing the model freedom on others.

Maps to Vercel AI SDK's `experimental_prepareStep`.

#### Example

```typescript
prepareStep: async ({ stepNumber }) => {
  if (stepNumber === 0) {
    return {
      toolChoice: { type: "tool", toolName: "sequentialThinking" },
    };
  }
  return { toolChoice: "auto" };
};
```

#### See

[SDK Custom Tools Guide — Controlling Tool Execution](/docs/sdk/custom-tools-guide#-controlling-tool-execution)

---

### enableEvaluation?

> `optional` **enableEvaluation**: `boolean`

Defined in: [types/generateTypes.ts:248](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L248)

---

### enableAnalytics?

> `optional` **enableAnalytics**: `boolean`

Defined in: [types/generateTypes.ts:249](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L249)

---

### context?

> `optional` **context**: `StandardRecord`

Defined in: [types/generateTypes.ts:250](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L250)

---

### evaluationDomain?

> `optional` **evaluationDomain**: `string`

Defined in: [types/generateTypes.ts:253](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L253)

---

### toolUsageContext?

> `optional` **toolUsageContext**: `string`

Defined in: [types/generateTypes.ts:254](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L254)

---

### conversationHistory?

> `optional` **conversationHistory**: `object`[]

Defined in: [types/generateTypes.ts:255](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L255)

#### role

> **role**: `string`

#### content

> **content**: `string`

---

### factoryConfig?

> `optional` **factoryConfig**: `object`

Defined in: [types/generateTypes.ts:258](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L258)

#### domainType?

> `optional` **domainType**: `string`

#### domainConfig?

> `optional` **domainConfig**: `StandardRecord`

#### enhancementType?

> `optional` **enhancementType**: `"domain-configuration"` \| `"streaming-optimization"` \| `"mcp-integration"` \| `"legacy-migration"` \| `"context-conversion"`

#### preserveLegacyFields?

> `optional` **preserveLegacyFields**: `boolean`

#### validateDomainData?

> `optional` **validateDomainData**: `boolean`

---

### streaming?

> `optional` **streaming**: `object`

Defined in: [types/generateTypes.ts:272](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L272)

#### enabled?

> `optional` **enabled**: `boolean`

#### chunkSize?

> `optional` **chunkSize**: `number`

#### bufferSize?

> `optional` **bufferSize**: `number`

#### enableProgress?

> `optional` **enableProgress**: `boolean`

#### fallbackToGenerate?

> `optional` **fallbackToGenerate**: `boolean`

---

## ~~Function: generateText()~~

<!-- Source: api/functions/generateText.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / generateText

# ~~Function: generateText()~~

> **generateText**(`options`): `Promise`\

Defined in: [index.ts:430](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/index.ts#L430)

Legacy generateText function for backward compatibility.

Provides standalone text generation function for existing code.
For new code, use [NeuroLink.generate](/docs/api/classes/NeuroLink) instead which provides
more features including streaming, tools, and structured output.

## Parameters

### options

[`TextGenerationOptions`](/docs/api/type-aliases/TextGenerationOptions)

Text generation options

## Returns

`Promise`\

Promise resolving to text generation result with content and metadata

## Deprecated

Use [NeuroLink.generate](/docs/api/classes/NeuroLink) for new code

## Examples

```typescript

const result = await generateText({
  prompt: "Explain quantum computing in simple terms",
  provider: "bedrock",
  model: "claude-3-sonnet",
});
console.log(result.content);
```

```typescript
const result = await generateText({
  prompt: "Write a creative story",
  provider: "openai",
  temperature: 1.5,
  maxTokens: 500,
});
```

## See

[NeuroLink.generate](/docs/api/classes/NeuroLink) for modern API with more features

## Since

1.0.0

---

## Class: RerankerFactory

<!-- Source: api/classes/RerankerFactory.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### resetInstance()

> `static` **resetInstance**(): `void`

Defined in: [lib/rag/reranker/RerankerFactory.ts:172](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L172)

Reset the singleton instance. Primarily used for testing to ensure clean state between tests.

#### Returns

`void`

---

### setModelProvider()

> **setModelProvider**(`provider`): `void`

Defined in: [lib/rag/reranker/RerankerFactory.ts:182](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L182)

Set the AI provider for LLM-based rerankers. Must be called before using `llm` or `batch` reranker types.

#### Parameters

##### provider

[`AIProvider`](/docs/api/type-aliases/AIProvider)

The AI provider instance to use for semantic scoring

#### Returns

`void`

---

### createReranker()

> **createReranker**(`typeOrAlias`, `config?`): `Promise`\

Defined in: [lib/rag/reranker/RerankerFactory.ts:391](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L391)

Create a reranker by type or alias. This is the primary method for obtaining reranker instances.

#### Parameters

##### typeOrAlias

`string`

The reranker type ('llm', 'simple', 'batch', 'cross-encoder', 'cohere') or an alias ('semantic', 'fast', etc.)

##### config?

[`RerankerConfig`](/docs/type-aliases/rerankerconfig)

Optional configuration for the reranker

#### Returns

`Promise`\

A configured Reranker instance

#### Throws

`RerankerError` if the type is unknown or creation fails

---

### getAvailableTypes()

> **getAvailableTypes**(): `Promise`\

Defined in: [lib/rag/reranker/RerankerFactory.ts:447](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L447)

Get all available reranker types (not including aliases).

#### Returns

`Promise`\

Array of available reranker type identifiers

---

### getRerankerMetadata()

> **getRerankerMetadata**(`typeOrAlias`): [`RerankerMetadata`](/docs/interfaces/rerankermetadata) | `undefined`

Defined in: [lib/rag/reranker/RerankerFactory.ts:431](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L431)

Get metadata for a reranker type, including description, use cases, and configuration options.

#### Parameters

##### typeOrAlias

`string`

The reranker type or alias

#### Returns

[`RerankerMetadata`](/docs/interfaces/rerankermetadata) | `undefined`

Metadata object or undefined if not found

---

### getDefaultConfig()

> **getDefaultConfig**(`typeOrAlias`): `Partial`\ | `undefined`

Defined in: [lib/rag/reranker/RerankerFactory.ts:439](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L439)

Get the default configuration for a reranker type.

#### Parameters

##### typeOrAlias

`string`

The reranker type or alias

#### Returns

`Partial`\ | `undefined`

Default config or undefined if not found

---

### getRerankersForUseCase()

> **getRerankersForUseCase**(`useCase`): [`RerankerType`](/docs/type-aliases/rerankertype)[]

Defined in: [lib/rag/reranker/RerankerFactory.ts:470](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L470)

Find rerankers suitable for a specific use case by searching metadata.

#### Parameters

##### useCase

`string`

Description of the use case (e.g., "fast", "semantic", "production")

#### Returns

[`RerankerType`](/docs/type-aliases/rerankertype)[]

Array of matching reranker types

---

### getLocalRerankers()

> **getLocalRerankers**(): [`RerankerType`](/docs/type-aliases/rerankertype)[]

Defined in: [lib/rag/reranker/RerankerFactory.ts:487](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L487)

Get rerankers that don't require external APIs (can run locally).

#### Returns

[`RerankerType`](/docs/type-aliases/rerankertype)[]

Array of local reranker types: `['llm', 'cross-encoder', 'simple', 'batch']`

---

### getModelFreeRerankers()

> **getModelFreeRerankers**(): [`RerankerType`](/docs/type-aliases/rerankertype)[]

Defined in: [lib/rag/reranker/RerankerFactory.ts:502](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L502)

Get rerankers that don't require AI models (fastest options).

#### Returns

[`RerankerType`](/docs/type-aliases/rerankertype)[]

Array of model-free reranker types: `['simple']`

---

### getTypeAliases()

> **getTypeAliases**(): `Map`\

Defined in: [lib/rag/reranker/RerankerFactory.ts:455](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L455)

Get all aliases mapped to their canonical reranker types.

#### Returns

`Map`\

Map of alias → type mappings

---

### hasType()

> **hasType**(`typeOrAlias`): `boolean`

Defined in: [lib/rag/reranker/RerankerFactory.ts:462](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L462)

Check if a reranker type or alias exists.

#### Parameters

##### typeOrAlias

`string`

The reranker type or alias to check

#### Returns

`boolean`

True if the type exists

---

### getAllMetadata()

> **getAllMetadata**(): `Map`\

Defined in: [lib/rag/reranker/RerankerFactory.ts:517](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L517)

Get metadata for all registered rerankers.

#### Returns

`Map`\

Map of type → metadata for all rerankers

---

### clear()

> **clear**(): `void`

Defined in: [lib/rag/reranker/RerankerFactory.ts:524](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerFactory.ts#L524)

Clear the factory, removing all registered rerankers and resetting the model provider.

#### Returns

`void`

## Reranker Types

| Type            | Requires Model | Requires API | Description                                                  |
| --------------- | -------------- | ------------ | ------------------------------------------------------------ |
| `simple`        | No             | No           | Fast, position and vector score-based reranking              |
| `llm`           | Yes            | No           | LLM-powered semantic reranking with multi-factor scoring     |
| `batch`         | Yes            | No           | Batch LLM reranking for efficient multi-document scoring     |
| `cross-encoder` | Yes            | No           | Cross-encoder model reranking (placeholder)                  |
| `cohere`        | No             | Yes          | Cohere Rerank API for production-grade scoring (placeholder) |

### Type Aliases

Each reranker type supports multiple aliases for convenience:

| Type            | Aliases                           |
| --------------- | --------------------------------- |
| `llm`           | `semantic`, `ai`, `model-based`   |
| `simple`        | `fast`, `basic`, `position-based` |
| `batch`         | `batch-llm`, `efficient`, `bulk`  |
| `cross-encoder` | `cross`, `encoder`, `bi-encoder`  |
| `cohere`        | `cohere-rerank`, `cohere-api`     |

## Examples

### Simple Reranking (No LLM Required)

```typescript

// Option 1: Use simpleRerank function directly
const results = await vectorStore.query("machine learning", 10);
const reranked = simpleRerank(results, { topK: 5 });

// Option 2: Use factory
const reranker = await rerankerFactory.createReranker("simple", { topK: 5 });
const reranked = await reranker.rerank(results, "machine learning");

// Using alias
const fastReranker = await rerankerFactory.createReranker("fast");
```

### LLM-Powered Semantic Reranking

```typescript

// Set up model provider first
const provider = await AIProviderFactory.createProvider("vertex");
rerankerFactory.setModelProvider(provider);

// Create LLM reranker
const reranker = await rerankerFactory.createReranker("llm", {
  topK: 5,
  weights: {
    semantic: 0.5, // LLM relevance score
    vector: 0.3, // Original similarity score
    position: 0.2, // Position in original results
  },
});

// Rerank results
const results = await vectorStore.query("explain transformers", 20);
const reranked = await reranker.rerank(results, "explain transformers");

console.log(
  reranked.map((r) => ({
    text: r.result.text?.slice(0, 100),
    score: r.score,
    details: r.details,
  })),
);
```

### Batch Reranking for Efficiency

```typescript

// Batch reranker scores multiple documents in a single LLM call
const batchReranker = await rerankerFactory.createReranker("batch", {
  topK: 10,
});

// More efficient for large result sets
const largeResults = await vectorStore.query("neural networks", 50);
const reranked = await batchReranker.rerank(largeResults, "neural networks");
```

### Discovering Rerankers by Use Case

```typescript

// Find fast rerankers
const fastRerankers = rerankerFactory.getRerankersForUseCase("fast");
// Returns: ['simple']

// Find rerankers for semantic understanding
const semanticRerankers = rerankerFactory.getRerankersForUseCase("semantic");
// Returns: ['llm']

// Get rerankers that don't need models
const modelFree = rerankerFactory.getModelFreeRerankers();
// Returns: ['simple']

// Get all metadata for documentation
const allMetadata = rerankerFactory.getAllMetadata();
for (const [type, meta] of allMetadata) {
  console.log(`${type}: ${meta.description}`);
  console.log(`  Use cases: ${meta.useCases.join(", ")}`);
}
```

### Custom Configuration

```typescript

// Get default config for a type
const defaultConfig = rerankerFactory.getDefaultConfig("llm");
console.log(defaultConfig);
// { topK: 3, weights: { semantic: 0.4, vector: 0.4, position: 0.2 } }

// Override with custom config
const reranker = await rerankerFactory.createReranker("llm", {
  topK: 10,
  weights: {
    semantic: 0.6, // Emphasize semantic relevance
    vector: 0.3,
    position: 0.1,
  },
});
```

## Global Singleton

A pre-configured singleton instance is exported for convenience:

```typescript

// Use directly without calling getInstance()
rerankerFactory.setModelProvider(provider);
const reranker = await rerankerFactory.createReranker("llm");
```

## Convenience Functions

The module also exports convenience functions that use the global singleton:

```typescript

  createReranker,
  getAvailableRerankerTypes,
  getRerankerMetadata,
  getRerankerDefaultConfig,
} from "@juspay/neurolink";

// Create reranker
const reranker = await createReranker("simple", { topK: 5 });

// Get available types
const types = await getAvailableRerankerTypes();
// ['llm', 'cross-encoder', 'cohere', 'simple', 'batch']

// Get metadata
const meta = getRerankerMetadata("llm");
console.log(meta?.description);
// "LLM-powered semantic reranking with multi-factor scoring"

// Get default config
const config = getRerankerDefaultConfig("simple");
// { topK: 3, weights: { vector: 0.8, position: 0.2 } }
```

## See Also

- [RerankerType](/docs/type-aliases/rerankertype) - Available reranker type identifiers
- [RerankerConfig](/docs/type-aliases/rerankerconfig) - Configuration options for rerankers
- [Reranker](/docs/interfaces/reranker) - Reranker interface
- [rerank](/docs/functions/rerank) - LLM rerank function
- [batchRerank](/docs/functions/batchrerank) - Batch rerank function
- [simpleRerank](/docs/functions/simplererank) - Simple rerank function
- [RAGPipeline](/docs/ragpipeline) - Full RAG pipeline with reranking support

---

## Type Alias: GenerateResult

<!-- Source: api/type-aliases/GenerateResult.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### outputs?

> `optional` **outputs**: `object`

Defined in: [types/generateTypes.ts:287](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L287)

#### text

> **text**: `string`

---

### audio?

> `optional` **audio**: `TTSResult`

Defined in: [types/generateTypes.ts:317](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L317)

Text-to-Speech audio result

Contains the generated audio buffer and metadata when TTS is enabled.
Generated by TTSProcessor.synthesize() using the specified provider.

#### Example

```typescript
const result = await neurolink.generate({
  input: { text: "Hello world" },
  provider: "google-ai",
  tts: { enabled: true, voice: "en-US-Neural2-C" },
});

if (result.audio) {
  console.log(`Audio size: ${result.audio.size} bytes`);
  console.log(`Format: ${result.audio.format}`);
  if (result.audio.duration) {
    console.log(`Duration: ${result.audio.duration}s`);
  }
  if (result.audio.voice) {
    console.log(`Voice: ${result.audio.voice}`);
  }
  // Save or play the audio buffer
  fs.writeFileSync("output.mp3", result.audio.buffer);
}
```

---

### video?

> `optional` **video**: `VideoGenerationResult`

Defined in: [types/generateTypes.ts:341](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L341)

Video generation result

Contains the generated video buffer and metadata when video mode is enabled.
Present when `output.mode` is set to "video" in GenerateOptions.

#### Example

```typescript
const result = await neurolink.generate({
  input: { text: "Product showcase", images: [imageBuffer] },
  provider: "vertex",
  model: "veo-3.1",
  output: { mode: "video", video: { resolution: "1080p" } },
});

if (result.video) {
  fs.writeFileSync("output.mp4", result.video.data);
  console.log(`Duration: ${result.video.metadata?.duration}s`);
  console.log(
    `Dimensions: ${result.video.metadata?.dimensions?.width}x${result.video.metadata?.dimensions?.height}`,
  );
}
```

---

### imageOutput?

> `optional` **imageOutput**: \{ `base64`: `string`; \} \| `null`

Defined in: [types/generateTypes.ts:342](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L342)

---

### provider?

> `optional` **provider**: `string`

Defined in: [types/generateTypes.ts:345](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L345)

---

### model?

> `optional` **model**: `string`

Defined in: [types/generateTypes.ts:346](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L346)

---

### usage?

> `optional` **usage**: `TokenUsage`

Defined in: [types/generateTypes.ts:349](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L349)

---

### responseTime?

> `optional` **responseTime**: `number`

Defined in: [types/generateTypes.ts:350](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L350)

---

### toolCalls?

> `optional` **toolCalls**: `object`[]

Defined in: [types/generateTypes.ts:353](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L353)

#### toolCallId

> **toolCallId**: `string`

#### toolName

> **toolName**: `string`

#### args

> **args**: `StandardRecord`

---

### toolResults?

> `optional` **toolResults**: `unknown`[]

Defined in: [types/generateTypes.ts:358](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L358)

---

### toolsUsed?

> `optional` **toolsUsed**: `string`[]

Defined in: [types/generateTypes.ts:359](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L359)

---

### toolExecutions?

> `optional` **toolExecutions**: `object`[]

Defined in: [types/generateTypes.ts:360](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L360)

#### name

> **name**: `string`

#### input

> **input**: `StandardRecord`

#### output

> **output**: `unknown`

---

### enhancedWithTools?

> `optional` **enhancedWithTools**: `boolean`

Defined in: [types/generateTypes.ts:365](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L365)

---

### availableTools?

> `optional` **availableTools**: `object`[]

Defined in: [types/generateTypes.ts:366](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L366)

#### name

> **name**: `string`

#### description

> **description**: `string`

#### parameters

> **parameters**: `StandardRecord`

---

### analytics?

> `optional` **analytics**: [`AnalyticsData`](/docs/api/type-aliases/AnalyticsData)

Defined in: [types/generateTypes.ts:373](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L373)

---

### evaluation?

> `optional` **evaluation**: [`EvaluationData`](/docs/api/type-aliases/EvaluationData)

Defined in: [types/generateTypes.ts:374](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L374)

---

### factoryMetadata?

> `optional` **factoryMetadata**: `object`

Defined in: [types/generateTypes.ts:377](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L377)

#### enhancementApplied

> **enhancementApplied**: `boolean`

#### enhancementType?

> `optional` **enhancementType**: `string`

#### domainType?

> `optional` **domainType**: `string`

#### processingTime?

> `optional` **processingTime**: `number`

#### configurationUsed?

> `optional` **configurationUsed**: `StandardRecord`

#### migrationPerformed?

> `optional` **migrationPerformed**: `boolean`

#### legacyFieldsPreserved?

> `optional` **legacyFieldsPreserved**: `boolean`

---

### streamingMetadata?

> `optional` **streamingMetadata**: `object`

Defined in: [types/generateTypes.ts:388](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L388)

#### streamingUsed

> **streamingUsed**: `boolean`

#### fallbackToGenerate?

> `optional` **fallbackToGenerate**: `boolean`

#### chunkCount?

> `optional` **chunkCount**: `number`

#### streamingDuration?

> `optional` **streamingDuration**: `number`

#### streamId?

> `optional` **streamId**: `string`

#### bufferOptimization?

> `optional` **bufferOptimization**: `boolean`

---

## Function: getAvailableProviders()

<!-- Source: api/functions/getAvailableProviders.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getAvailableProviders

# Function: getAvailableProviders()

> **getAvailableProviders**(): `string`[]

Defined in: [utils/providerUtils.ts:526](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/utils/providerUtils.ts#L526)

Get available provider names

## Returns

`string`[]

Array of available provider names

---

## Class: RerankerRegistry

<!-- Source: api/classes/RerankerRegistry.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### resetInstance()

> `static` **resetInstance**(): `void`

Defined in: [lib/rag/reranker/RerankerRegistry.ts:106](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L106)

Reset the singleton instance. Primarily used for testing to ensure clean state between tests. Clears all registered rerankers and aliases.

#### Returns

`void`

---

### registerReranker()

> **registerReranker**(`type`, `factory`, `metadata`): `void`

Defined in: [lib/rag/reranker/RerankerRegistry.ts:264](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L264)

Register a reranker with its factory function and metadata. Also registers all aliases defined in the metadata.

#### Parameters

##### type

[`RerankerType`](/docs/type-aliases/rerankertype)

The canonical reranker type identifier

##### factory

`() => Promise`

Async factory function that creates the reranker instance

##### metadata

[`RerankerMetadata`](/docs/interfaces/rerankermetadata)

Metadata including description, use cases, aliases, and configuration

#### Returns

`void`

---

### resolveType()

> **resolveType**(`nameOrAlias`): [`RerankerType`](/docs/type-aliases/rerankertype)

Defined in: [lib/rag/reranker/RerankerRegistry.ts:277](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L277)

Resolve a type name or alias to its canonical reranker type.

#### Parameters

##### nameOrAlias

`string`

The reranker type or alias to resolve

#### Returns

[`RerankerType`](/docs/type-aliases/rerankertype)

The canonical reranker type

#### Throws

`RerankerError` if the type or alias is not found

---

### getReranker()

> **getReranker**(`typeOrAlias`): `Promise`\

Defined in: [lib/rag/reranker/RerankerRegistry.ts:309](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L309)

Get a reranker instance by type or alias. Ensures the registry is initialized before lookup.

#### Parameters

##### typeOrAlias

`string`

The reranker type ('llm', 'simple', 'batch', 'cross-encoder', 'cohere') or an alias ('semantic', 'fast', etc.)

#### Returns

`Promise`\

The reranker instance

#### Throws

`RerankerError` if the reranker type is not found

---

### getAvailableRerankers()

> **getAvailableRerankers**(): `Promise`\

Defined in: [lib/rag/reranker/RerankerRegistry.ts:328](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L328)

Get a list of all available reranker types (not including aliases).

#### Returns

`Promise`\

Array of available reranker type identifiers

---

### getRerankerMetadata()

> **getRerankerMetadata**(`typeOrAlias`): [`RerankerMetadata`](/docs/interfaces/rerankermetadata) | `undefined`

Defined in: [lib/rag/reranker/RerankerRegistry.ts:336](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L336)

Get metadata for a specific reranker type, including description, use cases, and configuration options.

#### Parameters

##### typeOrAlias

`string`

The reranker type or alias

#### Returns

[`RerankerMetadata`](/docs/interfaces/rerankermetadata) | `undefined`

Metadata object or undefined if not found

---

### getAliasesForType()

> **getAliasesForType**(`type`): `string`[]

Defined in: [lib/rag/reranker/RerankerRegistry.ts:345](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L345)

Get all aliases registered for a specific reranker type.

#### Parameters

##### type

[`RerankerType`](/docs/type-aliases/rerankertype)

The canonical reranker type

#### Returns

`string`[]

Array of alias strings for the type

---

### getAllAliases()

> **getAllAliases**(): `Map`\

Defined in: [lib/rag/reranker/RerankerRegistry.ts:353](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L353)

Get all registered aliases mapped to their canonical reranker types.

#### Returns

`Map`\

Map of alias → type mappings

---

### hasReranker()

> **hasReranker**(`typeOrAlias`): `boolean`

Defined in: [lib/rag/reranker/RerankerRegistry.ts:360](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L360)

Check if a reranker type or alias exists in the registry.

#### Parameters

##### typeOrAlias

`string`

The reranker type or alias to check

#### Returns

`boolean`

True if the type or alias exists

---

### getRerankersByUseCase()

> **getRerankersByUseCase**(`useCase`): [`RerankerType`](/docs/type-aliases/rerankertype)[]

Defined in: [lib/rag/reranker/RerankerRegistry.ts:372](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L372)

Find rerankers suitable for a specific use case by searching metadata. Performs case-insensitive partial matching against use case descriptions.

#### Parameters

##### useCase

`string`

Description of the use case (e.g., "fast", "semantic", "production")

#### Returns

[`RerankerType`](/docs/type-aliases/rerankertype)[]

Array of matching reranker types

---

### getDefaultConfig()

> **getDefaultConfig**(`typeOrAlias`): `Partial`\ | `undefined`

Defined in: [lib/rag/reranker/RerankerRegistry.ts:389](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L389)

Get the default configuration for a reranker type.

#### Parameters

##### typeOrAlias

`string`

The reranker type or alias

#### Returns

`Partial`\ | `undefined`

Default config or undefined if not found

---

### getLocalRerankers()

> **getLocalRerankers**(): [`RerankerType`](/docs/type-aliases/rerankertype)[]

Defined in: [lib/rag/reranker/RerankerRegistry.ts:397](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L397)

Get rerankers that don't require external APIs (can run locally).

#### Returns

[`RerankerType`](/docs/type-aliases/rerankertype)[]

Array of local reranker types: `['llm', 'cross-encoder', 'simple', 'batch']`

---

### getModelFreeRerankers()

> **getModelFreeRerankers**(): [`RerankerType`](/docs/type-aliases/rerankertype)[]

Defined in: [lib/rag/reranker/RerankerRegistry.ts:412](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L412)

Get rerankers that don't require AI models (fastest options).

#### Returns

[`RerankerType`](/docs/type-aliases/rerankertype)[]

Array of model-free reranker types: `['cohere', 'simple']`

---

### clear()

> **clear**(): `void`

Defined in: [lib/rag/reranker/RerankerRegistry.ts:427](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/RerankerRegistry.ts#L427)

Clear the registry, removing all registered rerankers and aliases.

#### Returns

`void`

## Registered Reranker Types

| Type            | Requires Model | Requires API | Description                                                 |
| --------------- | -------------- | ------------ | ----------------------------------------------------------- |
| `simple`        | No             | No           | Position and vector score-based reranking (no LLM required) |
| `llm`           | Yes            | No           | LLM-powered semantic reranking with multi-factor scoring    |
| `batch`         | Yes            | No           | Batch LLM reranking for efficient multi-document scoring    |
| `cross-encoder` | Yes            | No           | Cross-encoder model for query-document relevance scoring    |
| `cohere`        | No             | Yes          | Cohere Rerank API for production-grade relevance scoring    |

### Type Aliases

Each reranker type supports multiple aliases for convenience:

| Type            | Aliases                           |
| --------------- | --------------------------------- |
| `llm`           | `semantic`, `ai`, `model-based`   |
| `simple`        | `fast`, `basic`, `position-based` |
| `batch`         | `batch-llm`, `efficient`, `bulk`  |
| `cross-encoder` | `cross`, `encoder`, `bi-encoder`  |
| `cohere`        | `cohere-rerank`, `cohere-api`     |

## Examples

### Basic Registry Usage

```typescript

// Get available reranker types
const types = await rerankerRegistry.getAvailableRerankers();
console.log(types);
// ['llm', 'cross-encoder', 'cohere', 'simple', 'batch']

// Check if a reranker exists
if (rerankerRegistry.hasReranker("semantic")) {
  console.log("Semantic reranker is available");
}

// Resolve an alias to its canonical type
const type = rerankerRegistry.resolveType("fast");
console.log(type); // 'simple'
```

### Getting Reranker Instances

```typescript

// Get a reranker by type
const simpleReranker = await rerankerRegistry.getReranker("simple");

// Get a reranker by alias
const fastReranker = await rerankerRegistry.getReranker("fast");

// Both return the same reranker type
console.log(simpleReranker.type); // 'simple'
console.log(fastReranker.type); // 'simple'
```

### Discovering Rerankers by Use Case

```typescript

// Find rerankers for fast processing
const fastRerankers = rerankerRegistry.getRerankersByUseCase("fast");
console.log(fastRerankers); // ['simple']

// Find rerankers for semantic understanding
const semanticRerankers = rerankerRegistry.getRerankersByUseCase("semantic");
console.log(semanticRerankers); // ['llm']

// Find rerankers for production use
const productionRerankers =
  rerankerRegistry.getRerankersByUseCase("production");
console.log(productionRerankers); // ['cohere']

// Find rerankers for batch processing
const batchRerankers = rerankerRegistry.getRerankersByUseCase("batch");
console.log(batchRerankers); // ['batch']
```

### Working with Metadata

```typescript

// Get metadata for a reranker type
const metadata = rerankerRegistry.getRerankerMetadata("llm");
console.log(metadata?.description);
// "LLM-powered semantic reranking with multi-factor scoring"

console.log(metadata?.useCases);
// ["High-quality semantic reranking", "Complex query understanding", "Context-aware scoring"]

console.log(metadata?.supportedOptions);
// ["model", "provider", "topK", "weights"]

// Get default configuration
const defaultConfig = rerankerRegistry.getDefaultConfig("simple");
console.log(defaultConfig);
// { topK: 3, weights: { vector: 0.8, position: 0.2 } }
```

### Filtering Rerankers by Requirements

```typescript

// Get rerankers that work without external APIs
const localRerankers = rerankerRegistry.getLocalRerankers();
console.log(localRerankers);
// ['llm', 'cross-encoder', 'simple', 'batch']

// Get rerankers that don't need AI models (fastest)
const modelFreeRerankers = rerankerRegistry.getModelFreeRerankers();
console.log(modelFreeRerankers);
// ['cohere', 'simple']
```

### Working with Aliases

```typescript

// Get all aliases for a specific type
const llmAliases = rerankerRegistry.getAliasesForType("llm");
console.log(llmAliases);
// ['semantic', 'ai', 'model-based']

// Get all registered aliases
const allAliases = rerankerRegistry.getAllAliases();
for (const [alias, type] of allAliases) {
  console.log(`'${alias}' -> '${type}'`);
}
// 'semantic' -> 'llm'
// 'ai' -> 'llm'
// 'model-based' -> 'llm'
// 'fast' -> 'simple'
// ...
```

### Custom Reranker Registration

```typescript

const registry = RerankerRegistry.getInstance();

// Register a custom reranker
registry.registerReranker(
  "custom" as RerankerType,
  async () => ({
    type: "custom" as RerankerType,
    async rerank(results, query, options) {
      // Custom reranking logic
      return results.slice(0, options?.topK ?? 3).map((result, index) => ({
        result,
        score: 1 - index * 0.1,
        details: { custom: true },
      }));
    },
  }),
  {
    description: "Custom reranking implementation",
    defaultConfig: { topK: 5 },
    supportedOptions: ["topK"],
    useCases: ["Custom use case"],
    aliases: ["my-reranker"],
    requiresModel: false,
    requiresExternalAPI: false,
  },
);

// Use the custom reranker
const customReranker = await registry.getReranker("my-reranker");
```

## Global Singleton

A pre-configured singleton instance is exported for convenience:

```typescript

// Use directly without calling getInstance()
const types = await rerankerRegistry.getAvailableRerankers();
const reranker = await rerankerRegistry.getReranker("simple");
```

## Convenience Functions

The module also exports convenience functions that use the global singleton:

```typescript

  getAvailableRerankers,
  getReranker,
  getRegisteredRerankerMetadata,
} from "@juspay/neurolink";

// Get available reranker types
const types = await getAvailableRerankers();
// ['llm', 'cross-encoder', 'cohere', 'simple', 'batch']

// Get a reranker instance
const reranker = await getReranker("simple");

// Get metadata
const metadata = getRegisteredRerankerMetadata("llm");
console.log(metadata?.description);
```

## See Also

- [RerankerFactory](/docs/rerankerfactory) - Factory for creating configured reranker instances
- [RerankerType](/docs/type-aliases/rerankertype) - Available reranker type identifiers
- [RerankerConfig](/docs/type-aliases/rerankerconfig) - Configuration options for rerankers

---

## Type Alias: HTTPRetryConfig

<!-- Source: api/type-aliases/HTTPRetryConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### initialDelay

> **initialDelay**: `number`

Defined in: [types/mcpTypes.ts:954](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L954)

Initial delay in ms (default: 1000)

---

### maxDelay

> **maxDelay**: `number`

Defined in: [types/mcpTypes.ts:956](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L956)

Maximum delay in ms (default: 30000)

---

### backoffMultiplier

> **backoffMultiplier**: `number`

Defined in: [types/mcpTypes.ts:958](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L958)

Backoff multiplier (default: 2)

---

### retryableStatusCodes

> **retryableStatusCodes**: `number`[]

Defined in: [types/mcpTypes.ts:960](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L960)

HTTP status codes that trigger retry

---

## Function: getAvailableRerankerTypes()

<!-- Source: api/functions/getAvailableRerankerTypes.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

-------------- | ---------------------------------- | -------------- | --------------------- |
| `"llm"`           | LLM-powered semantic reranking     | Yes            | No                    |
| `"cross-encoder"` | Cross-encoder relevance scoring    | Yes            | No                    |
| `"cohere"`        | Cohere Rerank API                  | No             | Yes                   |
| `"simple"`        | Position and score-based reranking | No             | No                    |
| `"batch"`         | Batch LLM reranking                | Yes            | No                    |

## Examples

### List available rerankers

```typescript

const types = await getAvailableRerankerTypes();
console.log("Available rerankers:", types);
// ["llm", "cross-encoder", "cohere", "simple", "batch"]
```

### Build selection UI

```typescript

  getAvailableRerankerTypes,
  getRerankerMetadata,
} from "@juspay/neurolink";

async function buildRerankerOptions() {
  const types = await getAvailableRerankerTypes();

  return types.map((type) => {
    const metadata = getRerankerMetadata(type);
    return {
      value: type,
      label: type.charAt(0).toUpperCase() + type.slice(1),
      description: metadata?.description || "",
      requiresModel: metadata?.requiresModel || false,
      requiresExternalAPI: metadata?.requiresExternalAPI || false,
    };
  });
}
```

### Filter by requirements

```typescript

  getAvailableRerankerTypes,
  getRerankerMetadata,
} from "@juspay/neurolink";

async function getLocalRerankers() {
  const types = await getAvailableRerankerTypes();

  return types.filter((type) => {
    const metadata = getRerankerMetadata(type);
    return !metadata?.requiresExternalAPI;
  });
}

async function getModelFreeRerankers() {
  const types = await getAvailableRerankerTypes();

  return types.filter((type) => {
    const metadata = getRerankerMetadata(type);
    return !metadata?.requiresModel;
  });
}

// Get rerankers that work without any external dependencies
const localTypes = await getLocalRerankers();
// ["llm", "cross-encoder", "simple", "batch"]

const modelFreeTypes = await getModelFreeRerankers();
// ["cohere", "simple"]
```

### Dynamic reranker selection

```typescript

  getAvailableRerankerTypes,
  getRerankerMetadata,
  createReranker,
} from "@juspay/neurolink";

async function selectReranker(options: {
  preferFast?: boolean;
  allowExternalAPI?: boolean;
  hasModel?: boolean;
}) {
  const types = await getAvailableRerankerTypes();

  // Filter based on requirements
  const candidates = types.filter((type) => {
    const metadata = getRerankerMetadata(type);
    if (!metadata) return false;

    if (!options.allowExternalAPI && metadata.requiresExternalAPI) {
      return false;
    }

    if (!options.hasModel && metadata.requiresModel) {
      return false;
    }

    return true;
  });

  // Select based on preference
  if (options.preferFast && candidates.includes("simple")) {
    return createReranker("simple");
  }

  if (candidates.includes("llm")) {
    return createReranker("llm");
  }

  return createReranker(candidates[0] || "simple");
}
```

### Validate reranker type

```typescript

async function isValidRerankerType(type: string): Promise {
  const types = await getAvailableRerankerTypes();
  return types.includes(type as any);
}

// Validate user input
const userType = "llm";
if (await isValidRerankerType(userType)) {
  const reranker = await createReranker(userType);
}
```

## Notes

- The function is async because the factory initializes lazily
- Only canonical type names are returned, not aliases
- Use `getRerankerMetadata()` to get detailed information about each type
- The factory ensures all built-in rerankers are registered before returning

## Since

v8.44.0

## See Also

- [createReranker](/docs/createreranker) - Create a reranker instance
- [getRerankerMetadata](/docs/getrerankermetadata) - Get metadata for a reranker type
- [getRerankerDefaultConfig](/docs/getrerankerdefaultconfig) - Get default configuration

---

## Type Alias: HybridSearchConfig

<!-- Source: api/type-aliases/HybridSearchConfig.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### bm25Weight?

> `optional` **bm25Weight**: `number`

Defined in: [lib/rag/types.ts:589](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L589)

Weight for BM25 keyword search results (0-1). Higher values prioritize exact keyword matches.

---

### fusionMethod?

> `optional` **fusionMethod**: `"rrf"` | `"linear"`

Defined in: [lib/rag/types.ts:591](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L591)

Method for combining search results:

- `"rrf"`: Reciprocal Rank Fusion - combines rankings using reciprocal of positions
- `"linear"`: Linear combination of normalized scores

---

### rrfK?

> `optional` **rrfK**: `number`

Defined in: [lib/rag/types.ts:593](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L593)

RRF k parameter. Controls the impact of lower-ranked results in Reciprocal Rank Fusion. Typical values: 20-60.

---

### topK?

> `optional` **topK**: `number`

Defined in: [lib/rag/types.ts:595](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L595)

Number of results to return after fusion

---

### enableReranking?

> `optional` **enableReranking**: `boolean`

Defined in: [lib/rag/types.ts:597](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L597)

Enable reranking of fused results for additional relevance improvement

---

### reranker?

> `optional` **reranker**: [`RerankerConfig`](/docs/rerankerconfig)

Defined in: [lib/rag/types.ts:599](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L599)

Reranker configuration (used when `enableReranking` is true)

## Example

```typescript

// Basic hybrid search with equal weights
const basicConfig: HybridSearchConfig = {
  vectorWeight: 0.5,
  bm25Weight: 0.5,
  fusionMethod: "rrf",
  topK: 10,
};

// Advanced hybrid search favoring semantic similarity
const semanticFocusedConfig: HybridSearchConfig = {
  vectorWeight: 0.7,
  bm25Weight: 0.3,
  fusionMethod: "linear",
  topK: 20,
};

// Hybrid search with RRF and reranking
const rerankedConfig: HybridSearchConfig = {
  vectorWeight: 0.6,
  bm25Weight: 0.4,
  fusionMethod: "rrf",
  rrfK: 60,
  topK: 50,
  enableReranking: true,
  reranker: {
    model: {
      provider: "cohere",
      modelName: "rerank-english-v3.0",
    },
    topK: 10,
  },
};

// Use in search
const results = await searchIndex.hybridSearch({
  query: "machine learning best practices",
  config: rerankedConfig,
});
```

## Since

v8.44.0

---

## Function: getAvailableStrategies()

<!-- Source: api/functions/getAvailableStrategies.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getAvailableStrategies

# Function: getAvailableStrategies()

> **getAvailableStrategies**(): `Promise`

Defined in: [lib/rag/ChunkerFactory.ts:380](https://github.com/juspay/neurolink/blob/main/src/lib/rag/ChunkerFactory.ts#L380)

Get all available chunking strategies

Returns a list of all registered chunking strategy names (not including aliases).
This is useful for dynamically discovering available strategies or validating
user input.

## Returns

`Promise`

Array of available chunking strategy names:

- `character` - Fixed-size character chunks
- `recursive` - Recursive text splitting with ordered separators
- `sentence` - Sentence-boundary aware splitting
- `token` - Token-count based splitting
- `markdown` - Markdown structure-aware splitting
- `html` - HTML tag-aware splitting
- `json` - JSON structure-aware splitting
- `latex` - LaTeX environment-aware splitting
- `semantic` - LLM-powered semantic splitting
- `semantic-markdown` - Combines markdown splitting with semantic similarity

## Examples

### List all strategies

```typescript

const strategies = await getAvailableStrategies();
console.log("Available strategies:", strategies);
// Output: ["character", "recursive", "sentence", "token", "markdown", "html", "json", "latex", "semantic", "semantic-markdown"]
```

### Validate user-selected strategy

```typescript

async function processWithStrategy(text: string, userStrategy: string) {
  const strategies = await getAvailableStrategies();

  if (!strategies.includes(userStrategy as ChunkingStrategy)) {
    throw new Error(`Invalid strategy. Choose from: ${strategies.join(", ")}`);
  }

  const chunker = await createChunker(userStrategy);
  return chunker.chunk(text);
}
```

### Build a strategy selector UI

```typescript

async function buildStrategyOptions() {
  const strategies = await getAvailableStrategies();

  return strategies.map((strategy) => {
    const metadata = getChunkerMetadata(strategy);
    return {
      value: strategy,
      label: strategy,
      description: metadata?.description,
      useCases: metadata?.useCases,
    };
  });
}
```

## Since

v8.44.0

## See Also

- [createChunker](/docs/createchunker) - Create a chunker instance
- [chunkText](/docs/chunktext) - Convenience function for chunking
- [ChunkingStrategy](/docs/type-aliases/chunkingstrategy) - Strategy type definition

---

## Type Alias: LangfuseConfig

<!-- Source: api/type-aliases/LangfuseConfig.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

### publicKey

> **publicKey**: `string`

Defined in: [types/observability.ts:14](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L14)

Langfuse public key

---

### secretKey

> **secretKey**: `string`

Defined in: [types/observability.ts:21](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L21)

Langfuse secret key

#### Sensitive

WARNING: This is a sensitive credential. Handle securely.
Do NOT log, expose, or share this key. Follow best practices for secret management.

---

### baseUrl?

> `optional` **baseUrl**: `string`

Defined in: [types/observability.ts:23](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L23)

Langfuse base URL (default: https://cloud.langfuse.com)

---

### environment?

> `optional` **environment**: `string`

Defined in: [types/observability.ts:25](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L25)

Environment name (e.g., dev, staging, prod)

---

### release?

> `optional` **release**: `string`

Defined in: [types/observability.ts:27](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L27)

Release/version identifier

---

### userId?

> `optional` **userId**: `string`

Defined in: [types/observability.ts:29](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L29)

Optional default user id to attach to spans

---

### sessionId?

> `optional` **sessionId**: `string`

Defined in: [types/observability.ts:31](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L31)

Optional default session id to attach to spans

---

### useExternalTracerProvider?

> `optional` **useExternalTracerProvider**: `boolean`

Defined in: [types/observability.ts:43](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L43)

If true, NeuroLink will NOT create or register its own TracerProvider.
Instead, it will only create the LangfuseSpanProcessor and ContextEnricher,
which the parent application must add to its own TracerProvider.

Use this when your application already has OpenTelemetry instrumentation.

#### Default

`false`

---

### autoDetectExternalProvider?

> `optional` **autoDetectExternalProvider**: `boolean`

Defined in: [types/observability.ts:110](https://github.com/juspay/neurolink/blob/main/src/lib/types/observability.ts#L110)

If true, NeuroLink will automatically detect if a TracerProvider is already
registered globally and skip its own registration to avoid conflicts.

This is a convenience option that combines well with useExternalTracerProvider.

#### Default

`false`

---

### autoDetectOperationName?

> `optional` **autoDetectOperationName**: `boolean`

Defined in: [types/observability.ts:133](https://github.com/juspay/neurolink/blob/main/src/lib/types/observability.ts#L133)

Enable auto-detection of operation names from span names.

When `true` (default), AI operation spans (`ai.streamText`, `ai.generateText`, etc.)
will have their operation name automatically extracted and included in the
trace name.

#### Default

`true`

#### Examples

```typescript
// With auto-detection enabled (default):
// Span "ai.streamText" + userId "user@email.com"
// → Trace name: "user@email.com:ai.streamText"

// With auto-detection disabled:
// → Trace name: "user@email.com" (legacy behavior)
```

---

### traceNameFormat?

> `optional` **traceNameFormat**: [`TraceNameFormat`](/docs/tracenameformat)

Defined in: [types/observability.ts:155](https://github.com/juspay/neurolink/blob/main/src/lib/types/observability.ts#L155)

Format for trace names in Langfuse.

Controls how `userId` and `operationName` are combined to form the trace name.
Can be a predefined format string or a custom function for full control.

#### Default

`"userId:operationName"`

#### Examples

```typescript
// Predefined formats:
traceNameFormat: "userId:operationName"; // "user@email.com:ai.streamText"
traceNameFormat: "operationName:userId"; // "ai.streamText:user@email.com"
traceNameFormat: "operationName"; // "ai.streamText"
traceNameFormat: "userId"; // "user@email.com" (legacy)

// Custom function:
traceNameFormat: (ctx) => `[${ctx.operationName || "unknown"}] ${ctx.userId}`;
// → "[ai.streamText] user@email.com"
```

## See Also

- [TraceNameFormat](/docs/tracenameformat) - Type definition for trace name formats
- [setLangfuseContext](/docs/api/functions/setLangfuseContext) - Set context for spans
- [getSpanProcessors](/docs/functions/getspanprocessors) - Get span processors for external provider mode

---

## Function: getBestProvider()

<!-- Source: api/functions/getBestProvider.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getBestProvider

# Function: getBestProvider()

> **getBestProvider**(`requestedProvider?`): `Promise`\

Defined in: [utils/providerUtils.ts:24](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/utils/providerUtils.ts#L24)

Get the best available provider based on real-time availability checks
Enhanced version consolidated from providerUtils-fixed.ts

## Parameters

### requestedProvider?

`string`

Optional preferred provider name

## Returns

`Promise`\

The best provider name to use

---

## Type Alias: LangfuseSpanAttributes

<!-- Source: api/type-aliases/LangfuseSpanAttributes.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

----------------------------- | ---------- | ----------------------------------------------------- |
| `gen_ai.system`                  | `string`   | AI system/provider name (e.g., "openai", "anthropic") |
| `gen_ai.request.model`           | `string`   | Model name used in request                            |
| `gen_ai.response.model`          | `string`   | Actual model used in response                         |
| `gen_ai.request.max_tokens`      | `number`   | Max tokens requested                                  |
| `gen_ai.request.temperature`     | `number`   | Temperature setting                                   |
| `gen_ai.request.top_p`           | `number`   | Top-p sampling setting                                |
| `gen_ai.usage.input_tokens`      | `number`   | Input/prompt tokens used                              |
| `gen_ai.usage.output_tokens`     | `number`   | Output/completion tokens used                         |
| `gen_ai.usage.total_tokens`      | `number`   | Total tokens used                                     |
| `gen_ai.response.finish_reasons` | `string[]` | Finish reasons from model                             |
| `gen_ai.prompt`                  | `string`   | The prompt sent (if enabled)                          |
| `gen_ai.completion`              | `string`   | The completion received (if enabled)                  |

### Vercel AI SDK Specific Attributes

Additional attributes created by Vercel AI SDK's telemetry:

| Property                    | Type     | Description                       |
| --------------------------- | -------- | --------------------------------- |
| `ai.model.id`               | `string` | Model identifier                  |
| `ai.model.provider`         | `string` | Provider identifier               |
| `ai.operationId`            | `string` | Operation identifier              |
| `ai.telemetry.functionId`   | `string` | Function identifier for telemetry |
| `ai.finishReason`           | `string` | Why generation finished           |
| `ai.usage.promptTokens`     | `number` | Prompt tokens (alias)             |
| `ai.usage.completionTokens` | `number` | Completion tokens (alias)         |

### Custom Attributes

The type also allows arbitrary custom attributes:

```typescript
[key: string]: unknown
```

## Example Usage

```typescript

// Type-safe attribute access
function logTokenUsage(attributes: LangfuseSpanAttributes) {
  const inputTokens =
    attributes["gen_ai.usage.input_tokens"] ??
    attributes["ai.usage.promptTokens"];

  const outputTokens =
    attributes["gen_ai.usage.output_tokens"] ??
    attributes["ai.usage.completionTokens"];

  console.log(`Tokens: ${inputTokens} in, ${outputTokens} out`);
}

// Check if span is GenAI-related
function isGenAISpan(attributes: LangfuseSpanAttributes): boolean {
  return !!(
    attributes["gen_ai.system"] ||
    attributes["ai.model.id"] ||
    attributes["gen_ai.request.model"]
  );
}
```

## How NeuroLink Uses These

NeuroLink's `ContextEnricher` span processor reads these attributes in its `onEnd()` method:

1. Detects GenAI spans by checking for `gen_ai.system`, `ai.model.id`, or `gen_ai.request.model`
2. Logs the model and provider for debugging
3. Captures token usage metrics for observability
4. Enriches spans with Langfuse context (userId, sessionId, etc.)

This enables automatic capture of AI operation metrics when using Vercel AI SDK's
`experimental_telemetry` feature with NeuroLink's Langfuse integration.

## See Also

- [setLangfuseContext](/docs/api/functions/setLangfuseContext) - Set context for spans
- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig) - Langfuse configuration
- [getSpanProcessors](/docs/functions/getspanprocessors) - Get span processors

---

## Function: getChunkerMetadata()

<!-- Source: api/functions/getChunkerMetadata.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

--------------- | --------------- | ----------------------------------------- |
| `description`      | `string`        | Human-readable description of the chunker |
| `defaultConfig`    | `ChunkerConfig` | Default configuration values              |
| `supportedOptions` | `string[]`      | List of supported configuration options   |
| `useCases`         | `string[]`      | Recommended use cases for this chunker    |
| `aliases`          | `string[]`      | Alternative names for this strategy       |

## Examples

### Get strategy information

```typescript

const metadata = getChunkerMetadata("recursive");

if (metadata) {
  console.log(metadata.description);
  // "Recursively splits text using ordered separators"

  console.log(metadata.defaultConfig);
  // { maxSize: 1000, overlap: 100, separators: ["\n\n", "\n", ". ", " ", ""] }

  console.log(metadata.useCases);
  // ["General text documents", "Default choice"]
}
```

### Using aliases

```typescript

// All these return the same metadata
const md1 = getChunkerMetadata("markdown");
const md2 = getChunkerMetadata("md");
const md3 = getChunkerMetadata("markdown-header");
```

### Build configuration UI

```typescript

async function buildChunkerOptions() {
  const strategies = await getAvailableStrategies();

  return strategies.map((strategy) => {
    const metadata = getChunkerMetadata(strategy);
    return {
      value: strategy,
      label: strategy,
      description: metadata?.description || "",
      defaultConfig: metadata?.defaultConfig || {},
      options: metadata?.supportedOptions || [],
    };
  });
}
```

### Validate configuration options

```typescript

function validateChunkerConfig(
  strategy: string,
  config: Record,
) {
  const metadata = getChunkerMetadata(strategy);

  if (!metadata) {
    throw new Error(`Unknown strategy: ${strategy}`);
  }

  const invalidOptions = Object.keys(config).filter(
    (key) => !metadata.supportedOptions.includes(key),
  );

  if (invalidOptions.length > 0) {
    console.warn(
      `Warning: Unsupported options for ${strategy}: ${invalidOptions.join(", ")}`,
    );
  }

  return true;
}
```

### Find chunker by use case

```typescript

async function findChunkerForUseCase(useCase: string) {
  const strategies = await getAvailableStrategies();

  for (const strategy of strategies) {
    const metadata = getChunkerMetadata(strategy);
    if (
      metadata?.useCases.some((uc) =>
        uc.toLowerCase().includes(useCase.toLowerCase()),
      )
    ) {
      return { strategy, metadata };
    }
  }

  return null;
}

// Find chunker for documentation
const result = await findChunkerForUseCase("documentation");
// Returns { strategy: "markdown", metadata: { ... } }
```

## Available Strategies

| Strategy            | Aliases                                    | Description                         |
| ------------------- | ------------------------------------------ | ----------------------------------- |
| `character`         | `char`, `fixed-size`, `fixed`              | Fixed-size character chunks         |
| `recursive`         | `recursive-character`, `langchain-default` | Recursive splitting with separators |
| `sentence`          | `sent`, `sentence-based`                   | Split by sentence boundaries        |
| `token`             | `tok`, `tokenized`                         | Token-aware splitting               |
| `markdown`          | `md`, `markdown-header`                    | Split by markdown structure         |
| `html`              | `html-tag`, `web`                          | Split by HTML semantic tags         |
| `json`              | `json-object`, `structured`                | Split by JSON object boundaries     |
| `latex`             | `tex`, `latex-section`                     | Split by LaTeX sections             |
| `semantic`          | `llm`, `ai-semantic`                       | LLM-powered semantic splitting      |
| `semantic-markdown` | `semantic-md`, `smart-markdown`            | Semantic markdown combination       |

## Notes

- Returns `undefined` for unknown strategies (check before using)
- Aliases resolve to canonical strategy names
- Metadata is registered at factory initialization
- Use `getAvailableStrategies()` to list all valid strategy names

## Since

v8.44.0

## See Also

- [createChunker](/docs/createchunker) - Create a chunker instance
- [getAvailableStrategies](/docs/getavailablestrategies) - List available chunking strategies
- [getDefaultConfig](/docs/getdefaultconfig) - Get default configuration for a strategy

---

## Type Alias: LogLevel

<!-- Source: api/type-aliases/LogLevel.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / LogLevel

# Type Alias: LogLevel

> **LogLevel** = `"debug"` \| `"info"` \| `"warn"` \| `"error"`

Defined in: [types/utilities.ts:16](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/utilities.ts#L16)

Represents the available logging severity levels.

- debug: Detailed information for debugging purposes
- info: General information about system operation
- warn: Potential issues that don't prevent operation
- error: Critical issues that may cause failures

---

## Function: getLangfuseContext()

<!-- Source: api/functions/getLangfuseContext.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

------------- | --------------------------------- | -------------------------------------------------- |
| `userId`         | `string \| null`                  | User identifier attached to spans                  |
| `sessionId`      | `string \| null`                  | Session identifier attached to spans               |
| `conversationId` | `string \| null`                  | Conversation/thread identifier for grouping traces |
| `requestId`      | `string \| null`                  | Request identifier for log correlation             |
| `traceName`      | `string \| null`                  | Custom trace name in Langfuse UI                   |
| `metadata`       | `Record \| null` | Custom key-value metadata                          |

## Examples

### Basic usage

```typescript

// Set some context
await setLangfuseContext({
  userId: "user-123",
  conversationId: "conv-456",
});

// Read it back
const context = getLangfuseContext();
console.log(context?.userId); // "user-123"
console.log(context?.conversationId); // "conv-456"
```

### Check if context exists

```typescript

const context = getLangfuseContext();
if (context) {
  console.log("Context is set:", context.userId, context.sessionId);
} else {
  console.log("No context set in current async scope");
}
```

### Access in middleware or handlers

```typescript

async function handleRequest(req: Request) {
  // Context was set earlier in the request pipeline
  const context = getLangfuseContext();

  // Log with correlation IDs
  console.log(
    `[${context?.requestId}] Processing request for user ${context?.userId}`,
  );

  // Use context for business logic
  if (context?.metadata?.tier === "premium") {
    // Premium user handling
  }
}
```

## See Also

- [setLangfuseContext](/docs/setlangfusecontext) - Set the context
- [getTracer](/docs/gettracer) - Get a Tracer for custom spans
- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig) - Configuration options

---

## Type Alias: MCPOAuthConfig

<!-- Source: api/type-aliases/MCPOAuthConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### clientSecret?

> `optional` **clientSecret**: `string`

Defined in: [types/mcpTypes.ts:886](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L886)

OAuth client secret (optional for public clients with PKCE)

---

### authorizationUrl

> **authorizationUrl**: `string`

Defined in: [types/mcpTypes.ts:888](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L888)

Authorization endpoint URL

---

### tokenUrl

> **tokenUrl**: `string`

Defined in: [types/mcpTypes.ts:890](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L890)

Token endpoint URL

---

### redirectUrl

> **redirectUrl**: `string`

Defined in: [types/mcpTypes.ts:892](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L892)

Redirect URI for OAuth callback

---

### scope?

> `optional` **scope**: `string`

Defined in: [types/mcpTypes.ts:894](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L894)

OAuth scope (space-separated)

---

### usePKCE?

> `optional` **usePKCE**: `boolean`

Defined in: [types/mcpTypes.ts:896](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L896)

Enable PKCE (Proof Key for Code Exchange) - recommended for OAuth 2.1

---

### additionalParams?

> `optional` **additionalParams**: `Record`\

Defined in: [types/mcpTypes.ts:898](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L898)

Additional authorization parameters

---

## Function: getLangfuseHealthStatus()

<!-- Source: api/functions/getLangfuseHealthStatus.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getLangfuseHealthStatus

# Function: getLangfuseHealthStatus()

> **getLangfuseHealthStatus**(): `object`

Defined in: [services/server/ai/observability/instrumentation.ts:208](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/services/server/ai/observability/instrumentation.ts#L208)

Get health status for Langfuse observability

## Returns

`object`

### isHealthy

> **isHealthy**: `boolean` \| `undefined`

### initialized

> **initialized**: `boolean` = `isInitialized`

### credentialsValid

> **credentialsValid**: `boolean` = `isCredentialsValid`

### enabled

> **enabled**: `boolean`

### hasProcessor

> **hasProcessor**: `boolean`

### config

> **config**: \{ `baseUrl`: `string`; `environment`: `string`; `release`: `string`; \} \| `undefined`

---

## Type Alias: MCPServerInfo

<!-- Source: api/type-aliases/MCPServerInfo.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### name

> **name**: `string`

Defined in: [types/mcpTypes.ts:80](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L80)

---

### description

> **description**: `string`

Defined in: [types/mcpTypes.ts:81](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L81)

---

### transport

> **transport**: `MCPTransportType`

Defined in: [types/mcpTypes.ts:82](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L82)

---

### status

> **status**: `MCPServerConnectionStatus`

Defined in: [types/mcpTypes.ts:83](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L83)

---

### tools

> **tools**: `object`[]

Defined in: [types/mcpTypes.ts:86](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L86)

#### name

> **name**: `string`

#### description

> **description**: `string`

#### inputSchema?

> `optional` **inputSchema**: `object`

#### execute()?

> `optional` **execute**: (`params`, `context?`) => `Promise`\ \| `unknown`

##### Parameters

###### params

`unknown`

###### context?

`unknown`

##### Returns

`Promise`\ \| `unknown`

---

### command?

> `optional` **command**: `string`

Defined in: [types/mcpTypes.ts:97](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L97)

---

### args?

> `optional` **args**: `string`[]

Defined in: [types/mcpTypes.ts:98](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L98)

---

### env?

> `optional` **env**: `Record`\

Defined in: [types/mcpTypes.ts:99](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L99)

---

### url?

> `optional` **url**: `string`

Defined in: [types/mcpTypes.ts:100](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L100)

---

### headers?

> `optional` **headers**: `Record`\

Defined in: [types/mcpTypes.ts:101](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L101)

---

### httpOptions?

> `optional` **httpOptions**: `MCPHTTPTransportOptions`

Defined in: [types/mcpTypes.ts:103](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L103)

HTTP transport-specific options

---

### timeout?

> `optional` **timeout**: `number`

Defined in: [types/mcpTypes.ts:104](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L104)

---

### retries?

> `optional` **retries**: `number`

Defined in: [types/mcpTypes.ts:105](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L105)

---

### error?

> `optional` **error**: `string`

Defined in: [types/mcpTypes.ts:106](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L106)

---

### installed?

> `optional` **installed**: `boolean`

Defined in: [types/mcpTypes.ts:107](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L107)

---

### cwd?

> `optional` **cwd**: `string`

Defined in: [types/mcpTypes.ts:110](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L110)

---

### autoRestart?

> `optional` **autoRestart**: `boolean`

Defined in: [types/mcpTypes.ts:111](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L111)

---

### healthCheckInterval?

> `optional` **healthCheckInterval**: `number`

Defined in: [types/mcpTypes.ts:112](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L112)

---

### retryConfig?

> `optional` **retryConfig**: `object`

Defined in: [types/mcpTypes.ts:115](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L115)

Retry configuration for HTTP transport

#### maxAttempts?

> `optional` **maxAttempts**: `number`

#### initialDelay?

> `optional` **initialDelay**: `number`

#### maxDelay?

> `optional` **maxDelay**: `number`

#### backoffMultiplier?

> `optional` **backoffMultiplier**: `number`

---

### rateLimiting?

> `optional` **rateLimiting**: `object`

Defined in: [types/mcpTypes.ts:123](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L123)

Rate limiting configuration for HTTP transport

#### requestsPerMinute?

> `optional` **requestsPerMinute**: `number`

Maximum requests per minute (default: 60)

#### requestsPerHour?

> `optional` **requestsPerHour**: `number`

Maximum requests per hour (optional)

#### maxBurst?

> `optional` **maxBurst**: `number`

Maximum burst size for token bucket (default: 10)

#### useTokenBucket?

> `optional` **useTokenBucket**: `boolean`

Use token bucket algorithm (default: true)

---

### blockedTools?

> `optional` **blockedTools**: `string`[]

Defined in: [types/mcpTypes.ts:135](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L135)

---

### auth?

> `optional` **auth**: `object`

Defined in: [types/mcpTypes.ts:138](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L138)

Authentication configuration for HTTP/SSE/WebSocket transports

#### type

> **type**: `"oauth2"` \| `"bearer"` \| `"api-key"`

Authentication type

#### oauth?

> `optional` **oauth**: `object`

OAuth 2.1 configuration

##### oauth.clientId

> **clientId**: `string`

OAuth client ID

##### oauth.clientSecret?

> `optional` **clientSecret**: `string`

OAuth client secret (optional for public clients with PKCE)

##### oauth.authorizationUrl

> **authorizationUrl**: `string`

Authorization endpoint URL

##### oauth.tokenUrl

> **tokenUrl**: `string`

Token endpoint URL

##### oauth.redirectUrl

> **redirectUrl**: `string`

Redirect URI for OAuth callback

##### oauth.scope?

> `optional` **scope**: `string`

OAuth scope (space-separated)

##### oauth.usePKCE?

> `optional` **usePKCE**: `boolean`

Enable PKCE (Proof Key for Code Exchange) - recommended for OAuth 2.1

#### token?

> `optional` **token**: `string`

Bearer token for simple token authentication

#### apiKey?

> `optional` **apiKey**: `string`

API key for API key authentication

#### apiKeyHeader?

> `optional` **apiKeyHeader**: `string`

Header name for API key (default: "X-API-Key")

---

### metadata?

> `optional` **metadata**: `object`

Defined in: [types/mcpTypes.ts:167](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L167)

#### Index Signature

\[`key`: `string`\]: `unknown`

#### uptime?

> `optional` **uptime**: `number`

#### toolCount?

> `optional` **toolCount**: `number`

#### category?

> `optional` **category**: `MCPServerCategory`

#### provider?

> `optional` **provider**: `string`

#### version?

> `optional` **version**: `string`

#### author?

> `optional` **author**: `string`

#### tags?

> `optional` **tags**: `string`[]

---

## Function: getLangfuseSpanProcessor()

<!-- Source: api/functions/getLangfuseSpanProcessor.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getLangfuseSpanProcessor

# Function: getLangfuseSpanProcessor()

> **getLangfuseSpanProcessor**(): `LangfuseSpanProcessor | null`

Defined in: [services/server/ai/observability/instrumentation.ts:457](https://github.com/juspay/neurolink/blob/main/src/lib/services/server/ai/observability/instrumentation.ts#L457)

Get the LangfuseSpanProcessor instance

Returns the LangfuseSpanProcessor that sends spans to the Langfuse platform.
This processor is created during initialization and is available in both
standalone and external provider modes.

## Returns

`LangfuseSpanProcessor | null`

The LangfuseSpanProcessor instance, or `null` if:

- Langfuse is not enabled
- Credentials are missing or invalid
- Initialization has not occurred

## Example

```typescript

const processor = getLangfuseSpanProcessor();
if (processor) {
  // Manually flush pending spans to Langfuse
  await processor.forceFlush();

  // Shutdown the processor
  await processor.shutdown();
}
```

## External Provider Mode Usage

```typescript

  createContextEnricher,
  getLangfuseSpanProcessor,
} from "@juspay/neurolink";

// Create your own TracerProvider
const provider = new NodeTracerProvider();

// Add NeuroLink's processors
provider.addSpanProcessor(createContextEnricher());

const langfuseProcessor = getLangfuseSpanProcessor();
if (langfuseProcessor) {
  provider.addSpanProcessor(langfuseProcessor);
}

provider.register();
```

## Processor Behavior

The LangfuseSpanProcessor:

1. **Collects spans** from OpenTelemetry instrumentation
2. **Transforms spans** to Langfuse trace format
3. **Batches spans** for efficient network usage
4. **Sends to Langfuse** via the configured `baseUrl`

## Notes

- The processor is reused across calls (singleton)
- Available in both standalone and external provider modes
- Requires valid Langfuse credentials (`publicKey`, `secretKey`)
- Use `getSpanProcessors()` to get both ContextEnricher and LangfuseSpanProcessor together

## See Also

- [getSpanProcessors](/docs/getspanprocessors) - Get both processors together
- [createContextEnricher](/docs/createcontextenricher) - Create ContextEnricher for context propagation
- [flushOpenTelemetry](/docs/flushopentelemetry) - Convenience method to flush all spans
- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig) - Configuration options

---

## Type Alias: MDocumentConfig

<!-- Source: api/type-aliases/MDocumentConfig.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### metadata?

> `optional` **metadata**: `Record`

Custom metadata to attach to the document and propagate to chunks

## Example

```typescript

// Basic configuration
const config: MDocumentConfig = {
  type: "markdown",
};

const doc = new MDocument(markdownContent, config);

// With custom metadata
const configWithMetadata: MDocumentConfig = {
  type: "html",
  metadata: {
    source: "https://example.com/article",
    author: "Jane Doe",
    publishedAt: "2024-01-15",
    tags: ["ai", "machine-learning"],
  },
};

const docWithMeta = new MDocument(htmlContent, configWithMetadata);

// Metadata is preserved through processing
await docWithMeta.chunk({ strategy: "html" });
const chunks = docWithMeta.getChunks();
// Each chunk inherits document metadata
```

---

## Function: getMCPStats()

<!-- Source: api/functions/getMCPStats.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getMCPStats

# Function: getMCPStats()

> **getMCPStats**(): `Promise`\; `availablePlugins`: `string`[]; \}\>

Defined in: [mcp/index.ts:88](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/index.ts#L88)

Get MCP ecosystem statistics - simplified

## Returns

`Promise`\; `availablePlugins`: `string`[]; \}\>

---

## Type Alias: McpMetadata

<!-- Source: api/type-aliases/McpMetadata.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### description?

> `optional` **description**: `string`

Defined in: [types/mcpTypes.ts:531](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L531)

---

### version?

> `optional` **version**: `string`

Defined in: [types/mcpTypes.ts:532](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L532)

---

### author?

> `optional` **author**: `string`

Defined in: [types/mcpTypes.ts:533](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L533)

---

### homepage?

> `optional` **homepage**: `string`

Defined in: [types/mcpTypes.ts:534](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L534)

---

### repository?

> `optional` **repository**: `string`

Defined in: [types/mcpTypes.ts:535](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L535)

---

### category?

> `optional` **category**: `string`

Defined in: [types/mcpTypes.ts:536](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L536)

---

## Function: getSpanProcessors()

<!-- Source: api/functions/getSpanProcessors.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getSpanProcessors

# Function: getSpanProcessors()

> **getSpanProcessors**(): `SpanProcessor[]`

Defined in: [services/server/ai/observability/instrumentation.ts:568](https://github.com/juspay/neurolink/blob/main/src/lib/services/server/ai/observability/instrumentation.ts#L568)

Get all span processors that NeuroLink would use

Convenience function that returns `[ContextEnricher, LangfuseSpanProcessor]`.
Use this when integrating with an external TracerProvider to add NeuroLink's
observability capabilities to your existing OpenTelemetry setup.

## Returns

`SpanProcessor[]`

Array of span processors, or empty array if not initialized

The returned array contains:

1. **ContextEnricher** - Enriches spans with Langfuse context (userId, sessionId, etc.)
2. **LangfuseSpanProcessor** - Sends spans to Langfuse platform

## Example

```typescript

// 1. Initialize NeuroLink with external provider mode
const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      useExternalTracerProvider: true,
    },
  },
});

// 2. Get NeuroLink's span processors
const neurolinkProcessors = getSpanProcessors();

// 3. Add to your existing OTEL setup
const jaegerExporter = new OTLPTraceExporter({
  url: "http://jaeger:4318/v1/traces",
});
const sdk = new NodeSDK({
  spanProcessors: [
    new BatchSpanProcessor(jaegerExporter),
    ...neurolinkProcessors,
  ],
});
sdk.start();
```

## Notes

- Must be called after `initializeOpenTelemetry()` or NeuroLink initialization
- Returns empty array if observability is not initialized or disabled
- Each call to `getSpanProcessors()` creates a new ContextEnricher instance
- The LangfuseSpanProcessor is reused across calls

## See Also

- [createContextEnricher](/docs/createcontextenricher) - Create ContextEnricher separately
- [isUsingExternalTracerProvider](/docs/isusingexternaltracerprovider) - Check provider mode
- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig) - Configuration options

---

## Type Alias: MiddlewareConfig

<!-- Source: api/type-aliases/MiddlewareConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### config?

> `optional` **config**: `Record`\

Defined in: [types/middlewareTypes.ts:41](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L41)

Middleware-specific configuration

---

### conditions?

> `optional` **conditions**: `MiddlewareConditions`

Defined in: [types/middlewareTypes.ts:43](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L43)

Conditions under which to apply this middleware

---

## Function: getTelemetryStatus()

<!-- Source: api/functions/getTelemetryStatus.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getTelemetryStatus

# Function: getTelemetryStatus()

> **getTelemetryStatus**(): `Promise`\

Defined in: [index.ts:365](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/index.ts#L365)

## Returns

`Promise`\

---

## Type Alias: MiddlewareContext

<!-- Source: api/type-aliases/MiddlewareContext.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### model

> **model**: `string`

Defined in: [types/middlewareTypes.ts:67](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L67)

Model name

---

### options

> **options**: `Record`\

Defined in: [types/middlewareTypes.ts:69](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L69)

Request options

---

### session?

> `optional` **session**: `object`

Defined in: [types/middlewareTypes.ts:71](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L71)

Session information

#### sessionId?

> `optional` **sessionId**: `string`

#### userId?

> `optional` **userId**: `string`

---

### metadata?

> `optional` **metadata**: `Record`\

Defined in: [types/middlewareTypes.ts:76](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L76)

Additional metadata

---

## Function: getTracer()

<!-- Source: api/functions/getTracer.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getTracer

# Function: getTracer()

> **getTracer**(`name?`, `version?`): `Tracer`

Defined in: [services/server/ai/observability/instrumentation.ts:615](https://github.com/juspay/neurolink/blob/main/src/lib/services/server/ai/observability/instrumentation.ts#L615)

Get an OpenTelemetry Tracer for creating custom spans

This allows applications to create their own spans that will be
processed by the same span processors (ContextEnricher + LangfuseSpanProcessor).
Custom spans will inherit the Langfuse context set via `setLangfuseContext()`.

## Parameters

### name?

`string`

Tracer name, defaults to "neurolink"

### version?

`string`

Tracer version (optional)

## Returns

`Tracer`

OpenTelemetry Tracer instance from `@opentelemetry/api`

## Examples

### Basic custom span

```typescript

const tracer = getTracer("my-app");
const span = tracer.startSpan("custom-operation");
try {
  // ... do work
  span.setAttribute("custom.key", "value");
} finally {
  span.end();
}
```

### Nested spans with context

```typescript

const tracer = getTracer("my-app", "1.0.0");

await setLangfuseContext({ userId: "user-123" }, async () => {
  const parentSpan = tracer.startSpan("parent-operation");

  try {
    // Create child span
    const childSpan = tracer.startSpan("child-operation");
    try {
      await doSomeWork();
      childSpan.setAttribute("result", "success");
    } finally {
      childSpan.end();
    }
  } finally {
    parentSpan.end();
  }
});
```

### Tracing async operations

```typescript

const tracer = getTracer("my-app");

async function tracedOperation() {
  return tracer.startActiveSpan("my-operation", async (span) => {
    try {
      const result = await fetchData();
      span.setAttribute("data.count", result.length);
      return result;
    } catch (error) {
      span.recordException(error as Error);
      span.setStatus({ code: 2, message: (error as Error).message });
      throw error;
    } finally {
      span.end();
    }
  });
}
```

### With error recording

```typescript

const tracer = getTracer("my-app");

async function riskyOperation() {
  const span = tracer.startSpan("risky-operation");
  try {
    await doRiskyThing();
    span.setStatus({ code: SpanStatusCode.OK });
  } catch (error) {
    span.recordException(error as Error);
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: (error as Error).message,
    });
    throw error;
  } finally {
    span.end();
  }
}
```

## Notes

- The tracer uses the global TracerProvider (either NeuroLink's or your external one)
- Spans created with this tracer will be processed by ContextEnricher and LangfuseSpanProcessor
- In external provider mode, spans will be sent to your configured exporters
- Always call `span.end()` to ensure spans are properly recorded

## See Also

- [setLangfuseContext](/docs/setlangfusecontext) - Set context for spans
- [getLangfuseContext](/docs/getlangfusecontext) - Read current context
- [getSpanProcessors](/docs/getspanprocessors) - Get span processors for external providers
- [LangfuseSpanAttributes](/docs/type-aliases/langfusespanattributes) - GenAI attribute types

---

## Type Alias: MiddlewareFactoryOptions

<!-- Source: api/type-aliases/MiddlewareFactoryOptions.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### enabledMiddleware?

> `optional` **enabledMiddleware**: `string`[]

Defined in: [types/middlewareTypes.ts:151](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L151)

Enable specific middleware

---

### disabledMiddleware?

> `optional` **disabledMiddleware**: `string`[]

Defined in: [types/middlewareTypes.ts:153](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L153)

Disable specific middleware

---

### middlewareConfig?

> `optional` **middlewareConfig**: `Record`\

Defined in: [types/middlewareTypes.ts:155](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L155)

Middleware configurations

---

### preset?

> `optional` **preset**: `string`

Defined in: [types/middlewareTypes.ts:157](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L157)

Use a preset configuration

---

### global?

> `optional` **global**: `object`

Defined in: [types/middlewareTypes.ts:159](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L159)

Global middleware settings

#### maxExecutionTime?

> `optional` **maxExecutionTime**: `number`

Maximum execution time for middleware chain

#### continueOnError?

> `optional` **continueOnError**: `boolean`

Whether to continue on middleware errors

#### collectStats?

> `optional` **collectStats**: `boolean`

Whether to collect execution statistics

---

## Function: getTracerProvider()

<!-- Source: api/functions/getTracerProvider.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / getTracerProvider

# Function: getTracerProvider()

> **getTracerProvider**(): `NodeTracerProvider | null`

Defined in: [services/server/ai/observability/instrumentation.ts:464](https://github.com/juspay/neurolink/blob/main/src/lib/services/server/ai/observability/instrumentation.ts#L464)

Get the NodeTracerProvider instance managed by NeuroLink

Returns the TracerProvider that NeuroLink created and registered, or `null`
if NeuroLink is operating in external provider mode or if not initialized.

## Returns

`NodeTracerProvider | null`

The NodeTracerProvider instance, or `null` if:

- NeuroLink is in external provider mode (`useExternalTracerProvider: true`)
- OpenTelemetry is not initialized
- Langfuse is disabled

## When This Returns Null

- `useExternalTracerProvider: true` was set in LangfuseConfig
- `autoDetectExternalProvider: true` detected an external provider
- TracerProvider registration failed (switched to external mode)
- `initializeOpenTelemetry()` was not called or failed

## Example

```typescript

  getTracerProvider,
  isUsingExternalTracerProvider,
} from "@juspay/neurolink";

// Check the mode first
if (isUsingExternalTracerProvider()) {
  console.log("External mode - no TracerProvider from NeuroLink");
} else {
  const provider = getTracerProvider();
  if (provider) {
    console.log("Standalone mode - NeuroLink managing TracerProvider");
    // Access provider methods if needed
    await provider.forceFlush();
  }
}
```

## Advanced Usage

```typescript

// Add additional exporters to NeuroLink's provider
const provider = getTracerProvider();
if (provider) {
  // Add Jaeger exporter alongside Langfuse
  const jaegerExporter = new OTLPTraceExporter({
    url: "http://jaeger:4318/v1/traces",
  });
  provider.addSpanProcessor(new BatchSpanProcessor(jaegerExporter));
}
```

## Notes

- In standalone mode, NeuroLink creates and registers its own TracerProvider
- In external provider mode, this always returns `null`
- Use `isUsingExternalTracerProvider()` to check the current mode
- The provider includes ContextEnricher and LangfuseSpanProcessor

## See Also

- [isUsingExternalTracerProvider](/docs/isusingexternaltracerprovider) - Check provider mode
- [getSpanProcessors](/docs/getspanprocessors) - Get processors for external mode
- [getLangfuseSpanProcessor](/docs/getlangfusespanprocessor) - Get Langfuse processor directly
- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig) - Configuration options

---

## Type Alias: MiddlewarePreset

<!-- Source: api/type-aliases/MiddlewarePreset.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### description

> **description**: `string`

Defined in: [types/middlewareTypes.ts:139](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L139)

Description of the preset

---

### config

> **config**: `Record`\

Defined in: [types/middlewareTypes.ts:141](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L141)

Middleware configurations in the preset

---

## Function: initializeMCPEcosystem()

<!-- Source: api/functions/initializeMCPEcosystem.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / initializeMCPEcosystem

# Function: initializeMCPEcosystem()

> **initializeMCPEcosystem**(): `Promise`\

Defined in: [mcp/index.ts:58](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/index.ts#L58)

Initialize the MCP ecosystem - simplified

## Returns

`Promise`\

---

## Type Alias: ModelRegistry

<!-- Source: api/type-aliases/ModelRegistry.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / ModelRegistry

# Type Alias: ModelRegistry

> **ModelRegistry** = `z.infer`\

Defined in: [types/modelTypes.ts:111](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/modelTypes.ts#L111)

Dynamic model registry type

---

## Function: initializeOpenTelemetry()

<!-- Source: api/functions/initializeOpenTelemetry.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / initializeOpenTelemetry

# Function: initializeOpenTelemetry()

> **initializeOpenTelemetry**(`config`): `void`

Defined in: [services/server/ai/observability/instrumentation.ts:73](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/services/server/ai/observability/instrumentation.ts#L73)

Initialize OpenTelemetry with Langfuse span processor

This connects Vercel AI SDK's experimental_telemetry to Langfuse by:

1. Creating LangfuseSpanProcessor with Langfuse credentials
2. Creating a NodeTracerProvider with service metadata and span processor
3. Registering the provider globally for AI SDK to use

## Parameters

### config

[`LangfuseConfig`](/docs/api/type-aliases/LangfuseConfig)

Langfuse configuration passed from parent application

## Returns

`void`

---

## Type Alias: NeuroLinkMiddleware

<!-- Source: api/type-aliases/NeuroLinkMiddleware.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / NeuroLinkMiddleware

# Type Alias: NeuroLinkMiddleware

> **NeuroLinkMiddleware** = `LanguageModelV1Middleware` & `object`

Defined in: [types/middlewareTypes.ts:29](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/middlewareTypes.ts#L29)

NeuroLink middleware with metadata
Combines standard AI SDK middleware with NeuroLink-specific metadata

## Type Declaration

### metadata

> `readonly` **metadata**: `NeuroLinkMiddlewareMetadata`

Middleware metadata

---

## Function: initializeTelemetry()

<!-- Source: api/functions/initializeTelemetry.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / initializeTelemetry

# Function: initializeTelemetry()

> **initializeTelemetry**(): `Promise`\

Defined in: [index.ts:356](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/index.ts#L356)

## Returns

`Promise`\

---

## Type Alias: OAuthClientInformation

<!-- Source: api/type-aliases/OAuthClientInformation.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### clientSecret?

> `optional` **clientSecret**: `string`

Defined in: [types/mcpTypes.ts:906](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L906)

---

### redirectUri

> **redirectUri**: `string`

Defined in: [types/mcpTypes.ts:907](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L907)

---

## Function: isRetryableHTTPError()

<!-- Source: api/functions/isRetryableHTTPError.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / isRetryableHTTPError

# Function: isRetryableHTTPError()

> **isRetryableHTTPError**(`error`, `config`): `boolean`

Defined in: [mcp/httpRetryHandler.ts:57](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRetryHandler.ts#L57)

Check if an error is retryable for HTTP operations

Considers:

- Network errors (ECONNRESET, ENOTFOUND, ECONNREFUSED, ETIMEDOUT)
- Timeout errors
- HTTP status codes in the retryable list
- Fetch/network-related errors

## Parameters

### error

`unknown`

Error to check

### config

[`HTTPRetryConfig`](/docs/api/type-aliases/HTTPRetryConfig) = `DEFAULT_HTTP_RETRY_CONFIG`

HTTP retry configuration (optional)

## Returns

`boolean`

True if the error is retryable

---

## Type Alias: OAuthTokens

<!-- Source: api/type-aliases/OAuthTokens.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### refreshToken?

> `optional` **refreshToken**: `string`

Defined in: [types/mcpTypes.ts:832](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L832)

Refresh token for obtaining new access tokens

---

### expiresAt?

> `optional` **expiresAt**: `number`

Defined in: [types/mcpTypes.ts:834](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L834)

Token expiration timestamp (Unix epoch in milliseconds)

---

### tokenType

> **tokenType**: `string`

Defined in: [types/mcpTypes.ts:836](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L836)

Token type (typically "Bearer")

---

### scope?

> `optional` **scope**: `string`

Defined in: [types/mcpTypes.ts:838](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L838)

OAuth scope granted

---

## Function: isRetryableStatusCode()

<!-- Source: api/functions/isRetryableStatusCode.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / isRetryableStatusCode

# Function: isRetryableStatusCode()

> **isRetryableStatusCode**(`status`, `config`): `boolean`

Defined in: [mcp/httpRetryHandler.ts:37](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRetryHandler.ts#L37)

Check if an HTTP status code is retryable based on configuration

## Parameters

### status

`number`

HTTP status code to check

### config

[`HTTPRetryConfig`](/docs/api/type-aliases/HTTPRetryConfig) = `DEFAULT_HTTP_RETRY_CONFIG`

HTTP retry configuration

## Returns

`boolean`

True if the status code should trigger a retry

---

## Type Alias: ObservabilityConfig

<!-- Source: api/type-aliases/ObservabilityConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### openTelemetry?

> `optional` **openTelemetry**: [`OpenTelemetryConfig`](/docs/api/type-aliases/OpenTelemetryConfig)

Defined in: [types/observability.ts:55](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L55)

OpenTelemetry configuration

---

## Function: isTokenExpired()

<!-- Source: api/functions/isTokenExpired.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / isTokenExpired

# Function: isTokenExpired()

> **isTokenExpired**(`tokens`, `bufferSeconds`): `boolean`

Defined in: [mcp/auth/tokenStorage.ts:146](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/auth/tokenStorage.ts#L146)

Check if tokens are expired or about to expire

## Parameters

### tokens

[`OAuthTokens`](/docs/api/type-aliases/OAuthTokens)

OAuth tokens to check

### bufferSeconds

`number` = `60`

Buffer time in seconds before expiration (default: 60)

## Returns

`boolean`

True if tokens are expired or will expire within buffer time

---

## Type Alias: OpenTelemetryConfig

<!-- Source: api/type-aliases/OpenTelemetryConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### endpoint?

> `optional` **endpoint**: `string`

Defined in: [types/observability.ts:41](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L41)

OTLP endpoint URL

---

### serviceName?

> `optional` **serviceName**: `string`

Defined in: [types/observability.ts:43](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L43)

Service name for traces

---

### serviceVersion?

> `optional` **serviceVersion**: `string`

Defined in: [types/observability.ts:45](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/observability.ts#L45)

Service version

---

## Function: isUsingExternalTracerProvider()

<!-- Source: api/functions/isUsingExternalTracerProvider.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / isUsingExternalTracerProvider

# Function: isUsingExternalTracerProvider()

> **isUsingExternalTracerProvider**(): `boolean`

Defined in: [services/server/ai/observability/instrumentation.ts:584](https://github.com/juspay/neurolink/blob/main/src/lib/services/server/ai/observability/instrumentation.ts#L584)

Check if using external TracerProvider mode

Returns true if NeuroLink is operating in external TracerProvider mode,
meaning it did not create or register its own TracerProvider. In this mode,
you must add NeuroLink's span processors to your own TracerProvider.

## Returns

`boolean`

`true` if operating in external TracerProvider mode, `false` otherwise

## When This Returns True

- `useExternalTracerProvider: true` was set in LangfuseConfig
- `autoDetectExternalProvider: true` was set and detected external provider
- TracerProvider registration failed due to duplicate registration

## Example

```typescript

  isUsingExternalTracerProvider,
  getSpanProcessors,
} from "@juspay/neurolink";

// Check mode after initialization
if (isUsingExternalTracerProvider()) {
  console.log(
    "External provider mode - add processors to your TracerProvider:",
  );
  const processors = getSpanProcessors();
  // Add processors to your existing OTEL setup
  myTracerProvider.addSpanProcessor(processors[0]); // ContextEnricher
  myTracerProvider.addSpanProcessor(processors[1]); // LangfuseSpanProcessor
} else {
  console.log("Standalone mode - NeuroLink managing its own TracerProvider");
}
```

## Conditional Setup

```typescript

  NeuroLink,
  isUsingExternalTracerProvider,
  getSpanProcessors,
} from "@juspay/neurolink";

const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
      secretKey: process.env.LANGFUSE_SECRET_KEY!,
      autoDetectExternalProvider: true, // Auto-detect mode
    },
  },
});

// Only set up OTEL SDK if NeuroLink isn't managing it
if (isUsingExternalTracerProvider()) {
  const sdk = new NodeSDK({
    spanProcessors: [...getSpanProcessors()],
  });
  sdk.start();
}
```

## See Also

- [getSpanProcessors](/docs/getspanprocessors) - Get processors for external mode
- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig) - Configuration options
- [Observability Guide](/docs/observability/health-monitoring) - Full setup guide

---

## Type Alias: ProviderAttempt

<!-- Source: api/type-aliases/ProviderAttempt.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### model

> **model**: [`SupportedModelName`](/docs/api/type-aliases/SupportedModelName)

Defined in: [types/providers.ts:328](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L328)

---

### success

> **success**: `boolean`

Defined in: [types/providers.ts:329](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L329)

---

### error?

> `optional` **error**: `string`

Defined in: [types/providers.ts:330](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L330)

---

### stack?

> `optional` **stack**: `string`

Defined in: [types/providers.ts:331](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L331)

---

## Function: isValidProvider()

<!-- Source: api/functions/isValidProvider.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / isValidProvider

# Function: isValidProvider()

> **isValidProvider**(`provider`): `boolean`

Defined in: [utils/providerUtils.ts:545](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/utils/providerUtils.ts#L545)

Validate provider name

## Parameters

### provider

`string`

Provider name to validate

## Returns

`boolean`

True if provider name is valid

---

## ~~Type Alias: RateLimitConfig~~

<!-- Source: api/type-aliases/RateLimitConfig.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / RateLimitConfig

# ~~Type Alias: RateLimitConfig~~

> **RateLimitConfig** = `TokenBucketRateLimitConfig`

Defined in: [types/mcpTypes.ts:945](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L945)

## Deprecated

Use TokenBucketRateLimitConfig instead

---

## Function: linearCombination()

<!-- Source: api/functions/linearCombination.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / linearCombination

# Function: linearCombination()

> **linearCombination**(`vectorScores`, `bm25Scores`, `alpha?`): `Map`

Defined in: [lib/rag/retrieval/hybridSearch.ts:193](https://github.com/juspay/neurolink/blob/main/src/lib/rag/retrieval/hybridSearch.ts#L193)

Linear Combination of normalized scores from vector and BM25 search results.

This fusion method normalizes scores from each retrieval method to a 0-1 range,
then combines them using a weighted average. Useful when you want precise control
over the contribution of each retrieval method.

## Parameters

### vectorScores

`Map`

Map of document IDs to vector search scores

### bm25Scores

`Map`

Map of document IDs to BM25 search scores

### alpha?

`number`

Weight for vector scores (0-1). BM25 scores receive weight `(1 - alpha)`. Default is `0.5` for equal weighting.

## Returns

`Map`

Map of document IDs to combined normalized scores

## Examples

### Basic linear combination

```typescript

// Scores from vector search
const vectorScores = new Map([
  ["doc-1", 0.95],
  ["doc-2", 0.82],
  ["doc-3", 0.71],
]);

// Scores from BM25 search
const bm25Scores = new Map([
  ["doc-2", 12.5],
  ["doc-1", 8.3],
  ["doc-4", 15.2],
]);

// Equal weighting (default)
const combinedScores = linearCombination(vectorScores, bm25Scores);

// Get sorted results
const results = [...combinedScores.entries()].sort((a, b) => b[1] - a[1]);
```

### Favor semantic similarity

```typescript

// Give 70% weight to vector search, 30% to BM25
const combinedScores = linearCombination(vectorScores, bm25Scores, 0.7);
```

### Favor keyword matching

```typescript

// Give 30% weight to vector search, 70% to BM25
const combinedScores = linearCombination(vectorScores, bm25Scores, 0.3);
```

### Integration with hybrid search results

```typescript

async function hybridSearch(query: string) {
  // Get results from both methods
  const [vectorResults, bm25Results] = await Promise.all([
    vectorStore.query({ query, topK: 20 }),
    bm25Index.search(query, 20),
  ]);

  // Convert to score maps
  const vectorScores = new Map(vectorResults.map((r) => [r.id, r.score]));
  const bm25Scores = new Map(bm25Results.map((r) => [r.id, r.score]));

  // Combine with custom weighting
  const combined = linearCombination(vectorScores, bm25Scores, 0.6);

  // Merge with original data and return top results
  return [...combined.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, 10)
    .map(([id, score]) => ({
      id,
      score,
      data:
        vectorResults.find((r) => r.id === id) ||
        bm25Results.find((r) => r.id === id),
    }));
}
```

## Notes

- Scores are normalized to 0-1 range using min-max normalization before combination
- Documents appearing in only one set receive 0 for the missing score
- Alpha controls the semantic vs. keyword trade-off:
  - `alpha = 1.0`: Pure vector search
  - `alpha = 0.5`: Equal weighting (default)
  - `alpha = 0.0`: Pure BM25 search
- Unlike RRF, this method considers actual score magnitudes (after normalization)

## Since

v8.44.0

## See Also

- [reciprocalRankFusion](/docs/reciprocalrankfusion) - Alternative fusion method using rank positions
- [createHybridSearch](/docs/createhybridsearch) - Create a hybrid search function using RRF or linear combination

---

## Type Alias: RerankerConfig

<!-- Source: api/type-aliases/RerankerConfig.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### weights?

> `optional` **weights**: `object`

Defined in: [lib/rag/types.ts:486](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L486)

Scoring weights for combining different relevance signals

#### weights.semantic?

> `optional` **semantic**: `number`

Weight for semantic similarity score (0-1)

#### weights.vector?

> `optional` **vector**: `number`

Weight for vector similarity score (0-1)

#### weights.position?

> `optional` **position**: `number`

Weight for original position score (0-1)

---

### topK?

> `optional` **topK**: `number`

Defined in: [lib/rag/types.ts:492](https://github.com/juspay/neurolink/blob/main/src/lib/rag/types.ts#L492)

Number of results to return after reranking

## Example

```typescript

// Basic reranker configuration
const rerankerConfig: RerankerConfig = {
  model: {
    provider: "openai",
    modelName: "gpt-4o-mini",
  },
  topK: 10,
};

// Advanced configuration with custom weights
const advancedRerankerConfig: RerankerConfig = {
  model: {
    provider: "cohere",
    modelName: "rerank-english-v3.0",
  },
  weights: {
    semantic: 0.5, // 50% weight on semantic relevance
    vector: 0.3, // 30% weight on vector similarity
    position: 0.2, // 20% weight on original ranking
  },
  topK: 5,
};

// Use reranker in vector query configuration
const queryConfig: VectorQueryToolConfig = {
  indexName: "knowledge-base",
  embeddingModel: {
    provider: "openai",
    modelName: "text-embedding-3-small",
  },
  topK: 50, // Fetch more results initially
  reranker: advancedRerankerConfig, // Rerank to top 5
};
```

## Since

v8.44.0

---

## Function: listMCPs()

<!-- Source: api/functions/listMCPs.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / listMCPs

# Function: listMCPs()

> **listMCPs**(): `Promise`\

Defined in: [mcp/index.ts:66](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/index.ts#L66)

List available MCPs - simplified

## Returns

`Promise`\

---

## Type Alias: RerankerType

<!-- Source: api/type-aliases/RerankerType.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### "colbert"

ColBERT (Contextualized Late Interaction over BERT) reranking. Uses late interaction between query and document token embeddings for efficient and accurate reranking.

---

### "cohere"

Cohere Rerank API. Uses Cohere's hosted reranking service for high-quality relevance scoring without managing infrastructure.

---

### "llm"

LLM-based reranking. Uses a large language model to evaluate and score the relevance of each result to the query. Most flexible but potentially slower.

## Example

```typescript

// Using different reranker types
const rerankerTypes: RerankerType[] = [
  "cross-encoder", // Best accuracy, moderate speed
  "colbert", // Good balance of speed and accuracy
  "cohere", // Managed service, easy to use
  "llm", // Most flexible, custom prompting
];

// Configure reranker with specific type
const config: RerankerConfig = {
  model: {
    provider: "openai",
    modelName: "gpt-4o-mini",
  },
  weights: {
    semantic: 0.5,
    vector: 0.3,
    position: 0.2,
  },
  topK: 10,
};

// Use with vector query
const results = await vectorStore.query({
  query: "How to implement authentication?",
  topK: 50,
  reranker: config,
});
```

## Since

v8.44.0

---

## Function: loadDocument()

<!-- Source: api/functions/loadDocument.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

----------------------- | ------------- | -------------- |
| `.txt`                     | text          | TextLoader     |
| `.md`, `.markdown`, `.mdx` | markdown      | MarkdownLoader |
| `.html`, `.htm`, `.xhtml`  | html          | HTMLLoader     |
| `.json`, `.jsonl`          | json          | JSONLoader     |
| `.csv`, `.tsv`             | csv           | CSVLoader      |
| `.pdf`                     | pdf           | PDFLoader      |
| `http://`, `https://`      | html          | WebLoader      |

## Notes

- File existence is checked before loading; non-existent files are treated as raw content. Note: PDF files will throw an error if the file doesn't exist. Only text-based files may fall back to raw content treatment.
- PDF loading requires the optional `pdf-parse` package
- Web loading supports timeout configuration and content extraction
- The returned MDocument supports method chaining for processing workflows

## Since

v8.44.0

## See Also

- [loadDocuments](/docs/loaddocuments) - Load multiple documents in parallel
- [MDocument](/docs/classes/mdocument) - Document processing class

---

## Type Alias: StreamingOptions

<!-- Source: api/type-aliases/StreamingOptions.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### temperature?

> `optional` **temperature**: `number`

Defined in: [types/streamTypes.ts:51](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/streamTypes.ts#L51)

---

### maxTokens?

> `optional` **maxTokens**: `number`

Defined in: [types/streamTypes.ts:52](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/streamTypes.ts#L52)

---

### systemPrompt?

> `optional` **systemPrompt**: `string`

Defined in: [types/streamTypes.ts:53](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/streamTypes.ts#L53)

---

## Function: loadDocuments()

<!-- Source: api/functions/loadDocuments.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / loadDocuments

# Function: loadDocuments()

> **loadDocuments**(`sources`, `options?`): `Promise`

Defined in: [lib/rag/document/loaders.ts:648](https://github.com/juspay/neurolink/blob/main/src/lib/rag/document/loaders.ts#L648)

Load multiple documents in parallel with error handling.

Processes an array of sources concurrently using `Promise.allSettled`,
ensuring that failures in individual documents don't prevent others from loading.
Failed documents are logged as warnings but don't throw errors.

## Parameters

### sources

`string[]`

Array of file paths, URLs, or raw content strings to load

### options?

`LoaderOptions`

Optional loader configuration applied to all documents

#### options.metadata?

`Record`

Custom metadata to add to all documents

#### options.encoding?

`BufferEncoding`

Text encoding for file reading (default: `"utf-8"`)

#### options.type?

`DocumentType`

Override auto-detected document type for all sources

## Returns

`Promise`

Promise resolving to array of successfully loaded MDocument instances

## Examples

### Load multiple files

```typescript

const docs = await loadDocuments([
  "/path/to/doc1.md",
  "/path/to/doc2.md",
  "/path/to/doc3.md",
]);

console.log(`Loaded ${docs.length} documents`);
```

### Load mixed sources

```typescript

const docs = await loadDocuments([
  "./README.md",
  "./config.json",
  "https://example.com/article",
  "./data.csv",
]);

// Each document is loaded with the appropriate loader
for (const doc of docs) {
  console.log(`${doc.getMetadata().source}: ${doc.getType()}`);
}
```

### Load with shared metadata

```typescript

const docs = await loadDocuments(
  ["./chapter1.md", "./chapter2.md", "./chapter3.md"],
  {
    metadata: {
      book: "User Guide",
      version: "2.0",
      loadedAt: new Date().toISOString(),
    },
  },
);
```

### Process loaded documents

```typescript

const docs = await loadDocuments(filePaths);

// Process all documents
const allChunks = [];
for (const doc of docs) {
  await doc.chunk({ strategy: "recursive", config: { maxSize: 1000 } });
  allChunks.push(...doc.getChunks());
}

console.log(
  `Created ${allChunks.length} total chunks from ${docs.length} documents`,
);
```

### Handle partial failures gracefully

```typescript

// Some files may not exist or fail to load
const sources = [
  "./valid-file.md",
  "./missing-file.md", // Will fail but not throw
  "./another-valid.md",
];

const docs = await loadDocuments(sources);
// docs will contain only successfully loaded documents
// Failed loads are logged as warnings

console.log(
  `Successfully loaded ${docs.length} of ${sources.length} documents`,
);
```

### Batch processing pipeline

```typescript

// Load all markdown files in a directory
const files = await glob("./docs/**/*.md");
const docs = await loadDocuments(files);

// Chunk all documents
await Promise.all(
  docs.map((doc) =>
    doc.chunk({ strategy: "markdown", config: { maxSize: 1000 } }),
  ),
);

// Collect all chunks for indexing
const allChunks = docs.flatMap((doc) => doc.getChunks());
```

## Notes

- Uses `Promise.allSettled` for resilient parallel loading
- Failed documents are logged but don't cause the function to throw
- The returned array may be smaller than the input if some sources fail
- All successfully loaded documents maintain their original order
- Options are applied uniformly to all documents

## Since

v8.44.0

## See Also

- [loadDocument](/docs/loaddocument) - Load a single document
- [MDocument](/docs/classes/mdocument) - Document processing class

---

## Type Alias: SupportedModelName

<!-- Source: api/type-aliases/SupportedModelName.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / SupportedModelName

# Type Alias: SupportedModelName

> **SupportedModelName** = [`BedrockModels`](/docs/api/enumerations/BedrockModels) \| [`OpenAIModels`](/docs/api/enumerations/OpenAIModels) \| [`VertexModels`](/docs/api/enumerations/VertexModels) \| `GoogleAIModels` \| `AnthropicModels`

Defined in: [types/providers.ts:39](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/providers.ts#L39)

Union type of all supported model names

---

## Function: reciprocalRankFusion()

<!-- Source: api/functions/reciprocalRankFusion.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / reciprocalRankFusion

# Function: reciprocalRankFusion()

> **reciprocalRankFusion**(`rankings`, `k?`): `Map`

Defined in: [lib/rag/retrieval/hybridSearch.ts:169](https://github.com/juspay/neurolink/blob/main/src/lib/rag/retrieval/hybridSearch.ts#L169)

Reciprocal Rank Fusion (RRF) combines rankings from multiple retrieval methods into a single unified ranking.

RRF is particularly effective for hybrid search scenarios where you want to combine
results from different retrieval strategies (e.g., vector search and BM25) without
requiring score normalization.

## Parameters

### rankings

`Array>`

Array of ranking lists from different retrieval methods. Each ranking is an array of objects containing document `id` and `rank` (1-indexed position).

### k?

`number`

RRF constant that controls the impact of lower-ranked documents. Default is `60`. Higher values give more weight to lower-ranked results.

## Returns

`Map`

Map of document IDs to their fused RRF scores. Higher scores indicate more relevant documents.

## Examples

### Basic rank fusion

```typescript

// Rankings from two different retrieval methods
const vectorRanking = [
  { id: "doc-1", rank: 1 },
  { id: "doc-2", rank: 2 },
  { id: "doc-3", rank: 3 },
];

const bm25Ranking = [
  { id: "doc-2", rank: 1 },
  { id: "doc-1", rank: 2 },
  { id: "doc-4", rank: 3 },
];

const fusedScores = reciprocalRankFusion([vectorRanking, bm25Ranking]);

// Get sorted results
const sortedResults = [...fusedScores.entries()]
  .sort((a, b) => b[1] - a[1])
  .map(([id, score]) => ({ id, score }));

console.log(sortedResults);
// doc-1 and doc-2 will have highest scores (appear in both rankings)
```

### Custom k parameter

```typescript

// Use lower k for more emphasis on top-ranked results
const fusedScores = reciprocalRankFusion(rankings, 20);

// Use higher k for smoother score distribution
const smootherScores = reciprocalRankFusion(rankings, 100);
```

### Combining multiple retrieval methods

```typescript

// Combine three retrieval methods
const semanticRanking = results.semantic.map((r, i) => ({
  id: r.id,
  rank: i + 1,
}));
const keywordRanking = results.keyword.map((r, i) => ({
  id: r.id,
  rank: i + 1,
}));
const recentRanking = results.recent.map((r, i) => ({ id: r.id, rank: i + 1 }));

const fusedScores = reciprocalRankFusion([
  semanticRanking,
  keywordRanking,
  recentRanking,
]);
```

## Notes

- RRF score is calculated as: `sum(1 / (k + rank))` across all rankings
- Documents appearing in multiple rankings will have higher fused scores
- The k parameter prevents high-ranked documents from dominating (k=60 is a common default)
- RRF does not require score normalization, making it robust for combining heterogeneous retrieval methods

## Since

v8.44.0

## See Also

- [linearCombination](/docs/linearcombination) - Alternative fusion method using weighted score combination
- [createHybridSearch](/docs/createhybridsearch) - Create a hybrid search function using RRF or linear combination

---

## Type Alias: TextGenerationOptions

<!-- Source: api/type-aliases/TextGenerationOptions.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### input?

> `optional` **input**: `object`

Defined in: [types/generateTypes.ts:448](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L448)

Alternative input format for multimodal SDK operations.

NOTE: This field is only used by the higher-level `generate()` API
(NeuroLink.generate, BaseProvider.generate). Legacy `generateText()`
callers must still use the `prompt` field directly.

Supports text, images, and other multimodal inputs.

#### text

> **text**: `string`

#### images?

> `optional` **images**: (`Buffer` \| `string` \| `ImageWithAltText`)[]

Images to include in the request.
For video generation, the first image is used as the source frame.

#### pdfFiles?

> `optional` **pdfFiles**: (`Buffer` \| `string`)[]

---

### provider?

> `optional` **provider**: [`AIProviderName`](/docs/api/enumerations/AIProviderName)

Defined in: [types/generateTypes.ts:457](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L457)

---

### model?

> `optional` **model**: `string`

Defined in: [types/generateTypes.ts:458](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L458)

---

### region?

> `optional` **region**: `string`

Defined in: [types/generateTypes.ts:459](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L459)

---

### temperature?

> `optional` **temperature**: `number`

Defined in: [types/generateTypes.ts:460](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L460)

---

### maxTokens?

> `optional` **maxTokens**: `number`

Defined in: [types/generateTypes.ts:461](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L461)

---

### systemPrompt?

> `optional` **systemPrompt**: `string`

Defined in: [types/generateTypes.ts:462](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L462)

---

### schema?

> `optional` **schema**: `ZodUnknownSchema` \| `Schema`\

Defined in: [types/generateTypes.ts:463](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L463)

---

### output?

> `optional` **output**: `object`

Defined in: [types/generateTypes.ts:475](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L475)

Output configuration options

#### format?

> `optional` **format**: `"text"` \| `"structured"` \| `"json"`

#### mode?

> `optional` **mode**: `"text"` \| `"video"`

Output mode - determines the type of content generated

- "text": Standard text generation (default)
- "video": Video generation using models like Veo 3.1

#### video?

> `optional` **video**: `VideoOutputOptions`

Video generation configuration (used when mode is "video")

#### Example

```typescript
output: {
  mode: "video",
  video: { resolution: "1080p", length: 8 }
}
```

---

### tools?

> `optional` **tools**: `Record`\

Defined in: [types/generateTypes.ts:488](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L488)

---

### timeout?

> `optional` **timeout**: `number` \| `string`

Defined in: [types/generateTypes.ts:489](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L489)

---

### disableTools?

> `optional` **disableTools**: `boolean`

Defined in: [types/generateTypes.ts:490](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L490)

---

### maxSteps?

> `optional` **maxSteps**: `number`

Defined in: [types/generateTypes.ts:491](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L491)

Maximum number of tool execution steps (default: 5).

---

### toolChoice?

> `optional` **toolChoice**: `"auto"` \| `"none"` \| `"required"` \| \{ `type`: `"tool"`; `toolName`: `string` \}

Defined in: [types/generateTypes.ts:506](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L506)

Tool choice configuration for the generation.
Controls whether and which tools the model must call.

- `"auto"` (default): the model can choose whether and which tools to call
- `"none"`: no tool calls allowed
- `"required"`: the model must call at least one tool and calls indefinitely until maxSteps is reached and outputs empty string.
- `{ type: "tool", toolName: string }`: the model must and only call the specified tool and calls indefinitely until maxSteps is reached and outputs empty string.

> **Note:** When used without `prepareStep`, this applies to **every step** in the
> `maxSteps` loop. Using `"required"` or `{ type: "tool" }` without `prepareStep`
> will cause infinite tool calls until `maxSteps` is exhausted.

---

### prepareStep?

> `optional` **prepareStep**: (`options`: \{ `steps`: `StepResult`[]; `stepNumber`: `number`; `maxSteps`: `number`; `model`: `LanguageModel` \}) => `PromiseLike`\

Defined in: [types/generateTypes.ts:531](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L531)

Optional callback that runs before each step in a multi-step generation.
Allows dynamically changing `toolChoice` and available tools per step.

This is the recommended way to enforce specific tool calls on certain steps
while allowing the model freedom on others.

Maps to Vercel AI SDK's `experimental_prepareStep`.

#### Example

```typescript
prepareStep: async ({ stepNumber }) => {
  if (stepNumber === 0) {
    return {
      toolChoice: { type: "tool", toolName: "sequentialThinking" },
    };
  }
  return { toolChoice: "auto" };
};
```

#### See

[SDK Custom Tools Guide — Controlling Tool Execution](/docs/sdk/custom-tools-guide#-controlling-tool-execution)

---

### tts?

> `optional` **tts**: `TTSOptions`

Defined in: [types/generateTypes.ts:522](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L522)

Text-to-Speech (TTS) configuration

Enable audio generation from text. Behavior depends on useAiResponse flag:

- When useAiResponse is false/undefined (default): TTS synthesizes the input text directly
- When useAiResponse is true: TTS synthesizes the AI-generated response

#### Examples

```typescript
const neurolink = new NeuroLink();
const result = await neurolink.generate({
  input: { text: "Hello world" },
  provider: "google-ai",
  tts: { enabled: true, voice: "en-US-Neural2-C" },
});
// TTS synthesizes "Hello world" directly, no AI generation
```

```typescript
const neurolink = new NeuroLink();
const result = await neurolink.generate({
  input: { text: "Tell me a joke" },
  provider: "google-ai",
  tts: { enabled: true, useAiResponse: true, voice: "en-US-Neural2-C" },
});
// AI generates the joke, then TTS synthesizes the AI's response
```

---

### enableEvaluation?

> `optional` **enableEvaluation**: `boolean`

Defined in: [types/generateTypes.ts:525](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L525)

---

### enableAnalytics?

> `optional` **enableAnalytics**: `boolean`

Defined in: [types/generateTypes.ts:526](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L526)

---

### context?

> `optional` **context**: `Record`\

Defined in: [types/generateTypes.ts:527](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L527)

---

### evaluationDomain?

> `optional` **evaluationDomain**: `string`

Defined in: [types/generateTypes.ts:530](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L530)

---

### toolUsageContext?

> `optional` **toolUsageContext**: `string`

Defined in: [types/generateTypes.ts:531](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L531)

---

### conversationHistory?

> `optional` **conversationHistory**: `object`[]

Defined in: [types/generateTypes.ts:532](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L532)

#### role

> **role**: `string`

#### content

> **content**: `string`

---

### conversationMessages?

> `optional` **conversationMessages**: `ChatMessage`[]

Defined in: [types/generateTypes.ts:535](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L535)

---

### conversationMemoryConfig?

> `optional` **conversationMemoryConfig**: `Partial`\

Defined in: [types/generateTypes.ts:538](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L538)

---

### originalPrompt?

> `optional` **originalPrompt**: `string`

Defined in: [types/generateTypes.ts:539](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L539)

---

### middleware?

> `optional` **middleware**: [`MiddlewareFactoryOptions`](/docs/api/type-aliases/MiddlewareFactoryOptions)

Defined in: [types/generateTypes.ts:542](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L542)

---

### expectedOutcome?

> `optional` **expectedOutcome**: `string`

Defined in: [types/generateTypes.ts:545](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L545)

---

### evaluationCriteria?

> `optional` **evaluationCriteria**: `string`[]

Defined in: [types/generateTypes.ts:546](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L546)

---

### csvOptions?

> `optional` **csvOptions**: `object`

Defined in: [types/generateTypes.ts:549](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L549)

#### maxRows?

> `optional` **maxRows**: `number`

#### formatStyle?

> `optional` **formatStyle**: `"raw"` \| `"markdown"` \| `"json"`

#### includeHeaders?

> `optional` **includeHeaders**: `boolean`

---

### enableSummarization?

> `optional` **enableSummarization**: `boolean`

Defined in: [types/generateTypes.ts:555](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L555)

---

### thinking?

> `optional` **thinking**: `boolean`

Defined in: [types/generateTypes.ts:612](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L612)

Enable extended thinking capability (simplified option).
Equivalent to `thinkingConfig.enabled = true`.
Works with both Anthropic and Gemini 3 models.

---

### thinkingBudget?

> `optional` **thinkingBudget**: `number`

Defined in: [types/generateTypes.ts:619](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L619)

Token budget for thinking (Anthropic models only).
Equivalent to `thinkingConfig.budgetTokens`.
Range: 5000-100000 tokens. Ignored for Gemini models.

---

### thinkingLevel?

> `optional` **thinkingLevel**: `"minimal"` \| `"low"` \| `"medium"` \| `"high"`

Defined in: [types/generateTypes.ts:630](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L630)

Thinking level for Gemini 3 models only.
Equivalent to `thinkingConfig.thinkingLevel`.

- `minimal` - Near-zero thinking (Flash only)
- `low` - Light reasoning
- `medium` - Balanced reasoning/latency
- `high` - Deep reasoning (Pro default)
  Ignored for Anthropic models.

---

### thinkingConfig?

> `optional` **thinkingConfig**: `object`

Defined in: [types/generateTypes.ts:638](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L638)

Full thinking/reasoning configuration (recommended for SDK usage).
Takes precedence over simplified options (thinking, thinkingBudget, thinkingLevel).

#### enabled?

> `optional` **enabled**: `boolean`

Enable extended thinking. Default: false

#### type?

> `optional` **type**: `"enabled"` \| `"disabled"`

Explicit enable/disable type. Alternative to `enabled` boolean.

#### budgetTokens?

> `optional` **budgetTokens**: `number`

Token budget for thinking (Anthropic: 5000-100000). Ignored for Gemini.

#### thinkingLevel?

> `optional` **thinkingLevel**: `"minimal"` \| `"low"` \| `"medium"` \| `"high"`

Thinking level (Gemini 3: minimal|low|medium|high). Ignored for Anthropic.

#### See

Above documentation for provider-specific behavior and option compatibility.

---

## Function: rerank()

<!-- Source: api/functions/rerank.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / rerank

# Function: rerank()

> **rerank**(`results`, `query`, `model`, `options?`): `Promise`

Defined in: [lib/rag/reranker/reranker.ts:39](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/reranker.ts#L39)

Rerank vector search results using multi-factor scoring

Combines three scoring factors to produce a comprehensive relevance score:

1. **Semantic score**: LLM-based relevance assessment (0-1)
2. **Vector score**: Original similarity score from vector search
3. **Position score**: Inverse of original ranking position

Results are processed in parallel batches for efficiency.

## Parameters

### results

`VectorQueryResult[]`

Vector search results to rerank. Each result should have:

- `id` - Unique identifier
- `text` - Text content (or `metadata.text`)
- `score` - Original vector similarity score
- `metadata` - Additional metadata

### query

`string`

Original search query used for semantic relevance scoring

### model

`AIProvider`

Language model provider for semantic scoring. Must implement the `generate()` method.

### options?

`RerankerOptions`

Optional reranking configuration:

- `topK` - Number of results to return (default: 3)
- `weights` - Scoring weights (must sum to 1.0)
  - `semantic` - Weight for LLM-based score (default: 0.4)
  - `vector` - Weight for vector similarity score (default: 0.4)
  - `position` - Weight for position score (default: 0.2)

## Returns

`Promise`

Array of reranked results sorted by combined score, each containing:

- `result` - Original VectorQueryResult
- `score` - Combined relevance score (0-1)
- `details` - Score breakdown with `semantic`, `vector`, `position`, and optional `queryAnalysis`

## Examples

### Basic reranking

```typescript

const model = await ProviderFactory.createProvider("openai", "gpt-4o-mini");

const rerankedResults = await rerank(
  vectorSearchResults,
  "What are the key features?",
  model,
);

console.log("Top result:", rerankedResults[0].result.text);
console.log("Score breakdown:", rerankedResults[0].details);
```

### Custom weights emphasizing semantic relevance

```typescript

const results = await rerank(searchResults, query, model, {
  topK: 5,
  weights: {
    semantic: 0.6, // Emphasize LLM-based scoring
    vector: 0.3,
    position: 0.1,
  },
});
```

### Integration with RAG pipeline

```typescript

async function enhancedSearch(query: string) {
  // Initial vector search
  const vectorTool = createVectorQueryTool(vectorStore, config);
  const initialResults = await vectorTool.query(query);

  // Rerank for better relevance
  const rerankedResults = await rerank(
    initialResults.sources,
    query,
    llmProvider,
    { topK: 3 },
  );

  // Use top reranked results for generation
  return rerankedResults.map((r) => r.result.text).join("\n\n");
}
```

### Analyzing score distribution

```typescript

const results = await rerank(searchResults, query, model, { topK: 10 });

results.forEach((r, i) => {
  console.log(`Rank ${i + 1}:`);
  console.log(`  Combined: ${r.score.toFixed(3)}`);
  console.log(`  Semantic: ${r.details.semantic.toFixed(3)}`);
  console.log(`  Vector:   ${r.details.vector.toFixed(3)}`);
  console.log(`  Position: ${r.details.position.toFixed(3)}`);
});
```

## Notes

- Weights are automatically normalized if they don't sum to 1.0
- Semantic scoring uses LLM to rate relevance on a 0-1 scale
- If semantic scoring fails, a default score of 0.5 is used
- Results are processed in batches of 5 for parallel efficiency

## Since

v8.44.0

## See Also

- [batchRerank](/docs/batchrerank) - Optimized batch reranking
- [simpleRerank](/docs/simplererank) - Reranking without LLM
- [createReranker](/docs/createreranker) - Factory for reranker instances
- [RerankResult](/docs/type-aliases/rerankresult) - Result type definition

---

## Type Alias: TextGenerationResult

<!-- Source: api/type-aliases/TextGenerationResult.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### provider?

> `optional` **provider**: `string`

Defined in: [types/generateTypes.ts:655](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L655)

---

### model?

> `optional` **model**: `string`

Defined in: [types/generateTypes.ts:656](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L656)

---

### usage?

> `optional` **usage**: `TokenUsage`

Defined in: [types/generateTypes.ts:657](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L657)

---

### responseTime?

> `optional` **responseTime**: `number`

Defined in: [types/generateTypes.ts:658](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L658)

---

### toolsUsed?

> `optional` **toolsUsed**: `string`[]

Defined in: [types/generateTypes.ts:659](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L659)

---

### toolExecutions?

> `optional` **toolExecutions**: `object`[]

Defined in: [types/generateTypes.ts:660](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L660)

#### toolName

> **toolName**: `string`

#### executionTime

> **executionTime**: `number`

#### success

> **success**: `boolean`

#### serverId?

> `optional` **serverId**: `string`

---

### enhancedWithTools?

> `optional` **enhancedWithTools**: `boolean`

Defined in: [types/generateTypes.ts:666](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L666)

---

### availableTools?

> `optional` **availableTools**: `object`[]

Defined in: [types/generateTypes.ts:667](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L667)

#### name

> **name**: `string`

#### description

> **description**: `string`

#### server

> **server**: `string`

#### category?

> `optional` **category**: `string`

---

### analytics?

> `optional` **analytics**: [`AnalyticsData`](/docs/api/type-aliases/AnalyticsData)

Defined in: [types/generateTypes.ts:674](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L674)

---

### evaluation?

> `optional` **evaluation**: [`EvaluationData`](/docs/api/type-aliases/EvaluationData)

Defined in: [types/generateTypes.ts:675](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L675)

---

### audio?

> `optional` **audio**: `TTSResult`

Defined in: [types/generateTypes.ts:676](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L676)

---

### video?

> `optional` **video**: `VideoGenerationResult`

Defined in: [types/generateTypes.ts:678](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L678)

Video generation result

---

### imageOutput?

> `optional` **imageOutput**: \{ `base64`: `string`; \} \| `null`

Defined in: [types/generateTypes.ts:680](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/generateTypes.ts#L680)

Image generation output

---

## Function: setLangfuseContext()

<!-- Source: api/functions/setLangfuseContext.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / setLangfuseContext

# Function: setLangfuseContext()

> **setLangfuseContext**\(`context`, `callback?`): `Promise`\

Defined in: [services/server/ai/observability/instrumentation.ts:550](https://github.com/juspay/neurolink/blob/main/src/lib/services/server/ai/observability/instrumentation.ts#L550)

Set user and session context for Langfuse spans in the current async context

Merges the provided context with existing AsyncLocalStorage context. If a callback is provided,
the context is scoped to that callback execution and returns the callback's result.
Without a callback, the context applies to the current execution context and its children.

Uses AsyncLocalStorage to properly scope context per request, avoiding race conditions
in concurrent scenarios.

## Type Parameters

### T

The return type of the callback function (defaults to `void`)

## Parameters

### context

Object containing context fields to merge with existing context

#### userId?

`string` \| `null`

User identifier to attach to spans

#### sessionId?

`string` \| `null`

Session identifier to attach to spans

#### conversationId?

`string` \| `null`

Conversation/thread identifier for grouping related traces

#### requestId?

`string` \| `null`

Request identifier for correlating with application logs

#### traceName?

`string` \| `null`

Custom trace name for better organization in Langfuse UI

#### metadata?

`Record` \| `null`

Custom metadata to attach to spans as key-value pairs

#### operationName?

`string` \| `null`

Explicit operation name for the trace. Overrides auto-detection when set.

Use this to provide meaningful names like "customer-support-chat" or "code-review"
that will appear in the trace name alongside the userId.

#### autoDetectOperationName?

`boolean`

Override the global `autoDetectOperationName` setting for this specific context.

When `undefined`, uses the global setting from `LangfuseConfig` (defaults to `true`).
Set to `false` to disable auto-detection for this context only.

### callback?

`() => T | Promise`

Optional callback to run within the context scope. If omitted, context applies to current execution

## Returns

`Promise`\

The callback's return value if provided, otherwise void

## Examples

### With callback - returns the result

```typescript

const result = await setLangfuseContext(
  { userId: "user123", conversationId: "conv-456" },
  async () => {
    return await generateText({ model: "gpt-4", prompt: "Hello" });
  },
);
// result is typed as the return value of the callback
```

### Without callback - sets context for current execution

```typescript

await setLangfuseContext({
  sessionId: "session456",
  traceName: "chat-completion",
  metadata: { feature: "support", tier: "premium" },
});
// Context now applies to all subsequent spans in this async context
```

### With full context

```typescript

await setLangfuseContext({
  userId: "user-123",
  sessionId: "session-456",
  conversationId: "conv-789",
  requestId: "req-abc",
  traceName: "customer-support-chat",
  metadata: {
    feature: "support",
    tier: "premium",
    region: "us-east-1",
  },
});

// Verify context was set
const context = getLangfuseContext();
console.log(context?.conversationId); // "conv-789"
```

### With explicit operation name

```typescript

// Explicit operation name overrides auto-detection
await setLangfuseContext(
  {
    userId: "user@email.com",
    operationName: "customer-support-chat",
  },
  async () => {
    // Trace name will be: "user@email.com:customer-support-chat"
    return await generateText({ model: "gpt-4", prompt: "Help me with..." });
  },
);
```

### Disabling auto-detection for specific context

```typescript

// Disable operation name auto-detection for this context only
// (global setting remains unchanged for other contexts)
await setLangfuseContext(
  {
    userId: "user@email.com",
    autoDetectOperationName: false,
  },
  async () => {
    // Trace name will be: "user@email.com" (legacy behavior)
    return await streamText({ model: "gpt-4", prompt: "Stream this..." });
  },
);
```

### Combining explicit operation name with auto-detection off

```typescript

// When both are set, operationName takes precedence
await setLangfuseContext(
  {
    userId: "user@email.com",
    operationName: "my-custom-operation",
    autoDetectOperationName: false, // This is redundant when operationName is set
  },
  async () => {
    // Trace name: "user@email.com:my-custom-operation"
    return await generateText({ model: "gpt-4", prompt: "..." });
  },
);
```

## See Also

- [getLangfuseContext](/docs/getlangfusecontext) - Read the current context
- [getTracer](/docs/gettracer) - Get a Tracer for custom spans
- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig) - Configuration options

---

## Type Alias: TokenExchangeRequest

<!-- Source: api/type-aliases/TokenExchangeRequest.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### state

> **state**: `string`

Defined in: [types/mcpTypes.ts:924](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L924)

---

### codeVerifier?

> `optional` **codeVerifier**: `string`

Defined in: [types/mcpTypes.ts:925](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L925)

---

## Function: shutdownOpenTelemetry()

<!-- Source: api/functions/shutdownOpenTelemetry.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / shutdownOpenTelemetry

# Function: shutdownOpenTelemetry()

> **shutdownOpenTelemetry**(): `Promise`\

Defined in: [services/server/ai/observability/instrumentation.ts:164](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/services/server/ai/observability/instrumentation.ts#L164)

Shutdown OpenTelemetry and Langfuse span processor

## Returns

`Promise`\

---

## Type Alias: TokenStorage

<!-- Source: api/type-aliases/TokenStorage.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### saveTokens()

> **saveTokens**(`serverId`, `tokens`): `Promise`\

Defined in: [types/mcpTypes.ts:858](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L858)

Save tokens for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

##### tokens

[`OAuthTokens`](/docs/api/type-aliases/OAuthTokens)

OAuth tokens to store

#### Returns

`Promise`\

---

### deleteTokens()

> **deleteTokens**(`serverId`): `Promise`\

Defined in: [types/mcpTypes.ts:864](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L864)

Delete stored tokens for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

#### Returns

`Promise`\

---

### hasTokens()?

> `optional` **hasTokens**(`serverId`): `Promise`\

Defined in: [types/mcpTypes.ts:871](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L871)

Check if tokens exist for a server

#### Parameters

##### serverId

`string`

Unique identifier for the MCP server

#### Returns

`Promise`\

True if tokens exist

---

### clearAll()?

> `optional` **clearAll**(): `Promise`\

Defined in: [types/mcpTypes.ts:876](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/mcpTypes.ts#L876)

Clear all stored tokens

#### Returns

`Promise`\

---

## Function: simpleRerank()

<!-- Source: api/functions/simpleRerank.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / simpleRerank

# Function: simpleRerank()

> **simpleRerank**(`results`, `options?`): `RerankResult[]`

Defined in: [lib/rag/reranker/reranker.ts:295](https://github.com/juspay/neurolink/blob/main/src/lib/rag/reranker/reranker.ts#L295)

Simple position-based reranker (no LLM required)

A fast, synchronous reranking function that combines vector similarity scores
with position-based scoring. Ideal for scenarios where LLM-based semantic
scoring is not available or when low latency is critical.

## Parameters

### results

`VectorQueryResult[]`

Vector search results to rerank. Each result should have:

- `id` - Unique identifier
- `text` - Text content
- `score` - Original vector similarity score
- `metadata` - Additional metadata

### options?

`object`

Optional configuration:

- `topK` - Number of results to return (default: 3)
- `vectorWeight` - Weight for vector score (default: 0.8)
- `positionWeight` - Weight for position score (default: 0.2)

## Returns

`RerankResult[]`

Array of reranked results sorted by combined score, each containing:

- `result` - Original VectorQueryResult
- `score` - Combined score (0-1)
- `details` - Score breakdown with `semantic: 0`, `vector`, and `position`

## Examples

### Basic simple reranking

```typescript

const rerankedResults = simpleRerank(vectorSearchResults, {
  topK: 5,
});

console.log("Top result:", rerankedResults[0].result.text);
```

### Adjusting weight distribution

```typescript

// Emphasize vector similarity over position
const results = simpleRerank(searchResults, {
  topK: 10,
  vectorWeight: 0.9,
  positionWeight: 0.1,
});
```

### Low-latency search pipeline

```typescript

async function fastSearch(query: string) {
  // Get vector search results
  const vectorResults = await vectorStore.query({
    queryVector: await embed(query),
    topK: 50,
  });

  // Fast synchronous reranking (no LLM calls)
  const reranked = simpleRerank(vectorResults, {
    topK: 10,
    vectorWeight: 0.85,
    positionWeight: 0.15,
  });

  return reranked.map((r) => r.result);
}
```

### Fallback when LLM is unavailable

```typescript

async function rerankWithFallback(
  results: VectorQueryResult[],
  query: string,
  model?: AIProvider,
) {
  if (model) {
    // Use LLM-based reranking when available
    return await rerank(results, query, model, { topK: 5 });
  }

  // Fall back to simple reranking
  return simpleRerank(results, { topK: 5 });
}
```

### Comparing reranking methods

```typescript

async function compareReranking(results: VectorQueryResult[], query: string) {
  // Simple reranking (fast, no API calls)
  const simpleResults = simpleRerank(results, { topK: 5 });

  // LLM reranking (slower, more accurate)
  const llmResults = await rerank(results, query, model, { topK: 5 });

  console.log(
    "Simple ranking:",
    simpleResults.map((r) => r.result.id),
  );
  console.log(
    "LLM ranking:",
    llmResults.map((r) => r.result.id),
  );
}
```

## Notes

- This is a synchronous function (returns immediately, no async)
- Semantic score is always 0 in the details (no LLM scoring)
- Weights are automatically normalized to sum to 1.0
- Position score is calculated as `1 - (index / total)`, giving earlier results higher scores

## Since

v8.44.0

## See Also

- [rerank](/docs/rerank) - LLM-based reranking with semantic scoring
- [batchRerank](/docs/batchrerank) - Efficient batch LLM reranking
- [createReranker](/docs/createreranker) - Factory for reranker instances
- [RerankResult](/docs/type-aliases/rerankresult) - Result type definition

---

## Type Alias: ToolContext

<!-- Source: api/type-aliases/ToolContext.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### userId?

> `optional` **userId**: `string`

Defined in: [types/tools.ts:179](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L179)

---

### aiProvider?

> `optional` **aiProvider**: `string`

Defined in: [types/tools.ts:180](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L180)

---

### metadata?

> `optional` **metadata**: `ToolExecutionMetadata`

Defined in: [types/tools.ts:181](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L181)

---

## Function: validateTool()

<!-- Source: api/functions/validateTool.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / validateTool

# Function: validateTool()

> **validateTool**(`name`, `tool`): `void`

Defined in: [sdk/toolRegistration.ts:355](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/sdk/toolRegistration.ts#L355)

Validate tool configuration with detailed error messages

## Parameters

### name

`string`

### tool

`SimpleTool`

## Returns

`void`

---

## Type Alias: ToolDefinition\<TArgs, TResult\>

<!-- Source: api/type-aliases/ToolDefinition.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### parameters?

> `optional` **parameters**: `ToolParameterSchema`

Defined in: [types/tools.ts:333](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L333)

---

### metadata?

> `optional` **metadata**: `ToolMetadata`

Defined in: [types/tools.ts:334](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L334)

---

### execute()

> **execute**: (`params`, `context?`) => `Promise`\\> \| [`ToolResult`](/docs/api/type-aliases/ToolResult)\

Defined in: [types/tools.ts:335](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L335)

#### Parameters

##### params

`TArgs`

##### context?

[`ToolContext`](/docs/api/type-aliases/ToolContext)

#### Returns

`Promise`\\> \| [`ToolResult`](/docs/api/type-aliases/ToolResult)\

---

## Function: withHTTPRetry()

<!-- Source: api/functions/withHTTPRetry.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / withHTTPRetry

# Function: withHTTPRetry()

> **withHTTPRetry**\(`operation`, `config`): `Promise`\

Defined in: [mcp/httpRetryHandler.ts:155](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/mcp/httpRetryHandler.ts#L155)

Execute an HTTP operation with retry logic

Implements exponential backoff with jitter to avoid thundering herd problems.
Uses the calculateBackoffDelay function from the core retry handler for
consistent delay calculation across the codebase.

## Type Parameters

### T

`T`

## Parameters

### operation

() => `Promise`\

Async operation to execute with retries

### config

`Partial`\ = `{}`

Partial HTTP retry configuration (merged with defaults)

## Returns

`Promise`\

Result of the operation

## Throws

Last error if all retry attempts fail

## Example

```typescript
const result = await withHTTPRetry(
  async () => {
    const response = await fetch(url);
    if (!response.ok) {
      const error = new Error(`HTTP ${response.status}`) as Error & {
        status: number;
      };
      error.status = response.status;
      throw error;
    }
    return response.json();
  },
  { maxAttempts: 5, initialDelay: 500 },
);
```

---

## Type Alias: ToolExecutionResult\<T\>

<!-- Source: api/type-aliases/ToolExecutionResult.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### context?

> `optional` **context**: [`ExecutionContext`](/docs/api/type-aliases/ExecutionContext)

Defined in: [types/tools.ts:142](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L142)

---

### performance?

> `optional` **performance**: `object`

Defined in: [types/tools.ts:143](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L143)

#### duration

> **duration**: `number`

#### tokensUsed?

> `optional` **tokensUsed**: `number`

#### cost?

> `optional` **cost**: `number`

---

### validation?

> `optional` **validation**: `ValidationResult`

Defined in: [types/tools.ts:148](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L148)

---

### cached?

> `optional` **cached**: `boolean`

Defined in: [types/tools.ts:149](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L149)

---

### fallback?

> `optional` **fallback**: `boolean`

Defined in: [types/tools.ts:150](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L150)

---

## Type Alias: ToolInfo

<!-- Source: api/type-aliases/ToolInfo.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

### description?

> `optional` **description**: `string`

Defined in: [types/tools.ts:98](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L98)

---

### category?

> `optional` **category**: `string`

Defined in: [types/tools.ts:99](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L99)

---

### serverId?

> `optional` **serverId**: `string`

Defined in: [types/tools.ts:100](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L100)

---

### inputSchema?

> `optional` **inputSchema**: `StandardRecord`

Defined in: [types/tools.ts:101](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L101)

---

### outputSchema?

> `optional` **outputSchema**: `StandardRecord`

Defined in: [types/tools.ts:102](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L102)

---

## Type Alias: ToolResult\<T\>

<!-- Source: api/type-aliases/ToolResult.md -->

[**NeuroLink API Reference v8.32.0**](/docs/readme)

---

[NeuroLink API Reference](/docs/readme) / ToolResult

# Type Alias: ToolResult\

> **ToolResult**\ = `Result`\ & `object`

Defined in: [types/tools.ts:243](https://github.com/juspay/neurolink/blob/1be79595b7d7307795c98da4267c1728cb50033d/src/lib/types/tools.ts#L243)

Tool execution result

## Type Declaration

### success

> **success**: `boolean`

### data?

> `optional` **data**: `T` \| `null`

### error?

> `optional` **error**: `ErrorInfo` \| `string`

### usage?

> `optional` **usage**: `ToolResultUsage`

### metadata?

> `optional` **metadata**: `ToolResultMetadata`

## Type Parameters

### T

`T` = `JsonValue` \| `unknown`

---

## Type Alias: TraceNameFormat

<!-- Source: api/type-aliases/TraceNameFormat.md -->

[**NeuroLink API Reference v8.42.0**](/docs/readme)

--------------------- | -------------------------------- | ------------------------------------------------------------------- |
| `"userId:operationName"` | `"user@email.com:ai.streamText"` | Default format. User first, then operation.                         |
| `"operationName:userId"` | `"ai.streamText:user@email.com"` | Operation first, then user. Useful for operation-centric filtering. |
| `"operationName"`        | `"ai.streamText"`                | Operation name only. User ID not included in trace name.            |
| `"userId"`               | `"user@email.com"`               | User ID only. Legacy behavior, operation name not included.         |

## Custom Function Format

For full control over trace naming, provide a function that receives the context:

```typescript
type CustomFormat = (context: {
  userId?: string;
  operationName?: string;
}) => string;
```

## Examples

### Using predefined formats

```typescript

// Default: userId:operationName
const neurolink1 = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: "pk-...",
      secretKey: "sk-...",
      traceNameFormat: "userId:operationName",
    },
  },
});
// Trace name: "user@email.com:ai.streamText"

// Operation-centric naming
const neurolink2 = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: "pk-...",
      secretKey: "sk-...",
      traceNameFormat: "operationName:userId",
    },
  },
});
// Trace name: "ai.streamText:user@email.com"

// Operation only (no user in trace name)
const neurolink3 = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: "pk-...",
      secretKey: "sk-...",
      traceNameFormat: "operationName",
    },
  },
});
// Trace name: "ai.streamText"

// Legacy behavior (user only)
const neurolink4 = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: "pk-...",
      secretKey: "sk-...",
      traceNameFormat: "userId",
    },
  },
});
// Trace name: "user@email.com"
```

### Using a custom function

```typescript

// Custom format with brackets
const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: "pk-...",
      secretKey: "sk-...",
      traceNameFormat: (ctx) =>
        `[${ctx.operationName || "unknown"}] ${ctx.userId || "anonymous"}`,
    },
  },
});
// Trace name: "[ai.streamText] user@email.com"
```

### Custom function with environment prefix

```typescript

const env = process.env.NODE_ENV || "dev";

const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: "pk-...",
      secretKey: "sk-...",
      environment: env,
      traceNameFormat: (ctx) => {
        const parts = [env];
        if (ctx.operationName) parts.push(ctx.operationName);
        if (ctx.userId) parts.push(ctx.userId);
        return parts.join(":");
      },
    },
  },
});
// Trace name: "prod:ai.streamText:user@email.com"
```

### Handling missing values

```typescript

const neurolink = new NeuroLink({
  observability: {
    langfuse: {
      enabled: true,
      publicKey: "pk-...",
      secretKey: "sk-...",
      traceNameFormat: (ctx) => {
        // Handle cases where operationName or userId might be undefined
        if (ctx.operationName && ctx.userId) {
          return `${ctx.userId}/${ctx.operationName}`;
        }
        if (ctx.operationName) {
          return ctx.operationName;
        }
        return ctx.userId || "trace";
      },
    },
  },
});
```

## Fallback Behavior

When `operationName` is not available (e.g., auto-detection is disabled and no explicit name is set),
predefined formats that include `operationName` will fall back gracefully:

- `"userId:operationName"` falls back to `"userId"`
- `"operationName:userId"` falls back to `"userId"`
- `"operationName"` falls back to `"userId"`

## See Also

- [LangfuseConfig](/docs/api/type-aliases/LangfuseConfig) - Configuration options including `traceNameFormat`
- [setLangfuseContext](/docs/api/functions/setLangfuseContext) - Set operation name per-context

---

## Type Alias: VectorQueryToolConfig

<!-- Source: api/type-aliases/VectorQueryToolConfig.md -->

[**NeuroLink API Reference v8.44.0**](/docs/readme)

### description?

> `optional` **description**: `string`

Tool description for AI agents to understand when to use this tool

---

### indexName

> **indexName**: `string`

Index name within the vector store to query against

---

### embeddingModel

> **embeddingModel**: `object`

Embedding model specification for query vectorization

#### embeddingModel.provider

> **provider**: `string`

Provider name (e.g., "openai", "cohere")

#### embeddingModel.modelName

> **modelName**: `string`

Model name (e.g., "text-embedding-3-small")

---

### enableFilter?

> `optional` **enableFilter**: `boolean`

Enable metadata filtering on query results

---

### includeVectors?

> `optional` **includeVectors**: `boolean`

Include embedding vectors in query results

---

### includeSources?

> `optional` **includeSources**: `boolean`

Include full source objects in query results

---

### topK?

> `optional` **topK**: `number`

Number of results to return from vector search

---

### reranker?

> `optional` **reranker**: [`RerankerConfig`](/docs/rerankerconfig)

Reranker configuration for result refinement

---

### providerOptions?

> `optional` **providerOptions**: `VectorProviderOptions`

Provider-specific query options (Pinecone, pgVector, Chroma)

## Example

```typescript

const vectorTool = createVectorQueryTool({
  indexName: "documents",
  embeddingModel: {
    provider: "openai",
    modelName: "text-embedding-3-small",
  },
  topK: 10,
  enableFilter: true,
  reranker: {
    model: {
      provider: "openai",
      modelName: "gpt-4o-mini",
    },
    topK: 5,
  },
});
```

---

# End of Documentation

For the latest documentation, visit: https://docs.neurolink.ink
GitHub: https://github.com/juspay/neurolink