# Deployment Guide

Deploy NeuroLink server adapters to production.

This guide covers deploying NeuroLink server adapters to various environments, including Docker, Kubernetes, and serverless platforms.
## Environment Variables

Configure your server using environment variables for security and flexibility.

### Required Variables

```bash
# AI Provider API Keys (at least one required)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_AI_API_KEY=AIza...

# Server Configuration
PORT=3000
HOST=0.0.0.0
NODE_ENV=production
```
### Optional Variables

```bash
# Security
JWT_SECRET=your-jwt-secret-min-32-chars
API_KEY_SECRET=your-api-key-for-service-auth

# CORS
ALLOWED_ORIGINS=https://myapp.com,https://api.myapp.com

# Rate Limiting
RATE_LIMIT_MAX_REQUESTS=100
RATE_LIMIT_WINDOW_MS=60000

# Redis (for distributed rate limiting and memory)
REDIS_URL=redis://localhost:6379
REDIS_PASSWORD=optional-password

# Logging
LOG_LEVEL=info

# Timeouts
REQUEST_TIMEOUT_MS=30000

# Observability
LANGFUSE_PUBLIC_KEY=pk-...
LANGFUSE_SECRET_KEY=sk-...
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
```
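Whichever variables you use, it helps to fail fast at startup when a required one is missing. The helper below is an illustrative sketch, not part of the NeuroLink API; the variable names come from the tables above.

```typescript
// Minimal startup validation: require at least one provider key and sanity-check
// optional settings. Illustrative helper, not part of NeuroLink.
function validateEnv(env: Record<string, string | undefined>): string[] {
  const errors: string[] = [];

  const providerKeys = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_AI_API_KEY"];
  if (!providerKeys.some((key) => env[key])) {
    errors.push(`At least one provider key is required: ${providerKeys.join(", ")}`);
  }

  if (env.JWT_SECRET && env.JWT_SECRET.length < 32) {
    errors.push("JWT_SECRET should be at least 32 characters");
  }

  if (env.PORT && Number.isNaN(parseInt(env.PORT, 10))) {
    errors.push("PORT must be a number");
  }

  return errors;
}

// Example: a config missing every provider key fails validation
const errors = validateEnv({ PORT: "3000", NODE_ENV: "production" });
if (errors.length > 0) {
  console.error("Invalid configuration:\n" + errors.map((e) => `  - ${e}`).join("\n"));
  // In a real server you would call process.exit(1) here.
}
```

In production, call `validateEnv(process.env)` before constructing the server and exit on errors, so misconfigured deployments fail immediately instead of at first request.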
### Environment Configuration in Code

```typescript
import { NeuroLink, createServer } from "@juspay/neurolink";

const neurolink = new NeuroLink({
  defaultProvider: process.env.DEFAULT_PROVIDER || "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    port: parseInt(process.env.PORT || "3000"),
    host: process.env.HOST || "0.0.0.0",
    timeout: parseInt(process.env.REQUEST_TIMEOUT_MS || "30000"),
    cors: {
      enabled: true,
      origins: process.env.ALLOWED_ORIGINS?.split(",") || ["*"],
    },
    rateLimit: {
      enabled: true,
      maxRequests: parseInt(process.env.RATE_LIMIT_MAX_REQUESTS || "100"),
      windowMs: parseInt(process.env.RATE_LIMIT_WINDOW_MS || "60000"),
    },
  },
});
```
## Docker Deployment

### Basic Dockerfile

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS base

# Install dependencies only when needed
FROM base AS deps
WORKDIR /app

# Install all dependencies (devDependencies are needed for the build stage)
COPY package.json package-lock.json* ./
RUN npm ci

# Build the application
FROM base AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# Production image
FROM base AS runner
WORKDIR /app

ENV NODE_ENV=production

# Create non-root user
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 neurolink

# Copy built assets
COPY --from=builder --chown=neurolink:nodejs /app/dist ./dist
COPY --from=builder --chown=neurolink:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=neurolink:nodejs /app/package.json ./package.json

USER neurolink

EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/api/health || exit 1

CMD ["node", "dist/server.js"]
```

This image ships devDependencies in `node_modules`; the multi-stage variant below trims them for a smaller production image.
### Multi-Stage Build for Smaller Images

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage with minimal dependencies
FROM node:20-alpine AS production
WORKDIR /app

# Security: non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S neurolink -u 1001

# Copy only production dependencies
COPY package*.json ./
RUN npm ci --omit=dev && npm cache clean --force

# Copy built application
COPY --from=builder --chown=neurolink:nodejs /app/dist ./dist

USER neurolink
EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=3s \
  CMD wget --spider -q http://localhost:3000/api/health || exit 1

CMD ["node", "dist/server.js"]
```
### Docker Compose

```yaml
version: "3.8"

services:
  api:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - PORT=3000
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - REDIS_URL=redis://redis:6379
      - JWT_SECRET=${JWT_SECRET}
    depends_on:
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 2G
        reservations:
          cpus: "0.5"
          memory: 512M

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    command: redis-server --appendonly yes

volumes:
  redis-data:
```
### Build and Run

```bash
# Build the image
docker build -t neurolink-api:latest .

# Run with environment variables
docker run -d \
  --name neurolink-api \
  -p 3000:3000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e JWT_SECRET=$JWT_SECRET \
  neurolink-api:latest

# Using docker-compose
docker-compose up -d
```
## Kubernetes Deployment

### Deployment Manifest

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: neurolink-api
  labels:
    app: neurolink-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: neurolink-api
  template:
    metadata:
      labels:
        app: neurolink-api
    spec:
      containers:
        - name: neurolink-api
          image: your-registry/neurolink-api:latest
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: "production"
            - name: PORT
              value: "3000"
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: neurolink-secrets
                  key: openai-api-key
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: neurolink-secrets
                  key: jwt-secret
            - name: REDIS_URL
              valueFrom:
                configMapKeyRef:
                  name: neurolink-config
                  key: redis-url
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
          # Liveness probe - is the container alive?
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 30
            timeoutSeconds: 5
            failureThreshold: 3
          # Readiness probe - is the container ready to serve traffic?
          readinessProbe:
            httpGet:
              path: /api/ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          # Startup probe - has the container started?
          startupProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 30
      terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  name: neurolink-api
spec:
  selector:
    app: neurolink-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: neurolink-api
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/limit-rpm: "100" # 100 requests/minute per client IP
spec:
  tls:
    - hosts:
        - api.yourdomain.com
      secretName: neurolink-api-tls
  rules:
    - host: api.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: neurolink-api
                port:
                  number: 80
```
### Secrets and ConfigMap

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: neurolink-secrets
type: Opaque
stringData:
  openai-api-key: "sk-..."
  anthropic-api-key: "sk-ant-..."
  jwt-secret: "your-secure-jwt-secret"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: neurolink-config
data:
  redis-url: "redis://redis-master:6379"
  log-level: "info"
  rate-limit-max: "100"
```
### Horizontal Pod Autoscaler

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: neurolink-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: neurolink-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
```
## Serverless Deployment

### Cloudflare Workers (Hono)

Hono is ideal for edge deployment:

```typescript
// src/worker.ts
import { NeuroLink, createServer } from "@juspay/neurolink";

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    basePath: "/api",
  },
});

await server.initialize();

export default {
  fetch: server.getFrameworkInstance().fetch,
};
```
```toml
# wrangler.toml
name = "neurolink-api"
main = "src/worker.ts"
compatibility_date = "2024-01-01"

[vars]
NODE_ENV = "production"

[[kv_namespaces]]
binding = "RATE_LIMIT_KV"
id = "your-kv-id"
```
### Vercel Edge Functions

```typescript
// api/[[...route]].ts
import { NeuroLink, createServer } from "@juspay/neurolink";

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: { basePath: "/api" },
});

await server.initialize();

export const config = {
  runtime: "edge",
};

export default server.getFrameworkInstance().fetch;
```
### AWS Lambda

```typescript
// handler.ts
import { NeuroLink, createServer } from "@juspay/neurolink";
import { handle } from "hono/aws-lambda";

const neurolink = new NeuroLink({
  defaultProvider: "openai",
});

const server = await createServer(neurolink, {
  framework: "hono",
  config: { basePath: "/api" },
});

await server.initialize();

export const handler = handle(server.getFrameworkInstance());
```
```yaml
# serverless.yml
service: neurolink-api

provider:
  name: aws
  runtime: nodejs20.x
  region: us-east-1
  environment:
    NODE_ENV: production
    OPENAI_API_KEY: ${ssm:/neurolink/openai-api-key}

functions:
  api:
    handler: handler.handler
    events:
      - httpApi:
          path: /api/{proxy+}
          method: ANY
    timeout: 30
    memorySize: 1024
```
## Production Configuration Recommendations

### Server Configuration

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    // Server
    port: parseInt(process.env.PORT || "3000"),
    host: "0.0.0.0",
    timeout: 30000,

    // CORS (specific origins only)
    cors: {
      enabled: true,
      origins: process.env.ALLOWED_ORIGINS?.split(",") || [],
      methods: ["GET", "POST"],
      credentials: true,
    },

    // Rate limiting
    rateLimit: {
      enabled: true,
      maxRequests: 100,
      windowMs: 60000,
      skipPaths: ["/api/health", "/api/ready"],
    },

    // Body parsing
    bodyParser: {
      enabled: true,
      maxSize: "1mb",
      jsonLimit: "1mb",
    },

    // Logging
    logging: {
      enabled: true,
      level: "info",
      includeBody: false,
      includeResponse: false,
    },

    // Redaction (for sensitive data)
    redaction: {
      enabled: true,
      additionalFields: ["ssn", "creditCard"],
    },

    // Features
    enableMetrics: true,
    enableSwagger: false, // Disable in production
  },
});
```
### Health and Readiness Endpoints

The server adapter provides built-in health endpoints:

- `GET /api/health` - Basic health check (is the server running?)
- `GET /api/ready` - Readiness check (is the server ready to serve traffic?)
- `GET /api/version` - Version information
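After deploying, these endpoints double as a smoke test. The polling helper below is an illustrative sketch (the base URL, attempt count, and delay are placeholder assumptions); it runs anywhere `fetch` is available, e.g. Node 18+.

```typescript
// Poll an endpoint until it responds 2xx, or give up after `attempts` tries.
// Illustrative helper; tune attempts/delay for your environment.
async function waitForHealthy(
  url: string,
  attempts = 10,
  delayMs = 1000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(url);
      if (res.ok) return true;
    } catch {
      // Server not reachable yet; fall through and retry
    }
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Example: verify the health endpoint of a local deployment
const base = "http://localhost:3000"; // placeholder URL for your deployment
const healthy = await waitForHealthy(`${base}/api/health`, 3, 500);
console.log(healthy ? "server is healthy" : "server not reachable");
```

The same helper works for `/api/ready` in CI pipelines, gating traffic switchover on readiness rather than mere liveness.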
## Graceful Shutdown

NeuroLink server adapters support configurable graceful shutdown to ensure clean termination of active connections and requests.

### Shutdown Configuration

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    shutdown: {
      gracefulShutdownTimeoutMs: 30000, // Max time to wait for shutdown
      drainTimeoutMs: 15000, // Max time to drain connections
      forceClose: true, // Force close if timeout exceeded
    },
  },
});
```
| Option                      | Default | Description                                        |
| --------------------------- | ------- | -------------------------------------------------- |
| `gracefulShutdownTimeoutMs` | `30000` | Maximum total time to wait for graceful shutdown   |
| `drainTimeoutMs`            | `15000` | Maximum time to wait for active connections to end |
| `forceClose`                | `true`  | Force close remaining connections after timeout    |
### Shutdown Process Steps

When `server.stop()` is called, the shutdown proceeds through these steps:

1. **Stop accepting new connections** - The server immediately stops accepting new requests
2. **Drain active connections** - Active requests are allowed to complete (up to `drainTimeoutMs`)
3. **Complete graceful shutdown** - Finalize cleanup within `gracefulShutdownTimeoutMs`
4. **Force close if needed** - If `forceClose: true`, remaining connections are forcefully terminated after timeout
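The drain step can be pictured as racing the in-flight requests against `drainTimeoutMs`. The sketch below illustrates that sequencing only; it is not the adapter's actual internals, and `drain` is a hypothetical name.

```typescript
type DrainResult = "drained" | "timed-out";

// Wait for all in-flight requests, but no longer than drainTimeoutMs.
async function drain(
  inFlight: Promise<void>[],
  drainTimeoutMs: number,
): Promise<DrainResult> {
  const allDone = Promise.all(inFlight).then<DrainResult>(() => "drained");
  const deadline = new Promise<DrainResult>((resolve) =>
    setTimeout(() => resolve("timed-out"), drainTimeoutMs),
  );
  return Promise.race([allDone, deadline]);
}

// A request that finishes quickly drains cleanly...
console.log(await drain([Promise.resolve()], 100)); // "drained"
// ...while one that never settles hits the timeout, after which forceClose applies.
console.log(await drain([new Promise<void>(() => {})], 50)); // "timed-out"
```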
### Signal Handling Example

```typescript
const server = await createServer(neurolink, {
  framework: "hono",
  config: {
    shutdown: {
      gracefulShutdownTimeoutMs: 30000,
      drainTimeoutMs: 15000,
      forceClose: true,
    },
  },
});

await server.initialize();
await server.start();

// Handle SIGTERM (sent by Kubernetes, Docker, etc.)
process.on("SIGTERM", async () => {
  console.log("SIGTERM received, starting graceful shutdown...");
  await server.stop();
  process.exit(0);
});

// Handle SIGINT (Ctrl+C)
process.on("SIGINT", async () => {
  console.log("SIGINT received, starting graceful shutdown...");
  await server.stop();
  process.exit(0);
});
```
### Complete Shutdown Handler

For production deployments, implement a comprehensive shutdown handler:

```typescript
const server = await createServer(neurolink, { framework: "hono" });
await server.initialize();
await server.start();

// Handle graceful shutdown
const shutdown = async (signal: string) => {
  console.log(`Received ${signal}. Gracefully shutting down...`);

  // Stop accepting new requests
  await server.stop();

  // Close database connections, flush logs, etc.
  // (cleanup() is an application-defined function)
  await cleanup();

  process.exit(0);
};

process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));
```
### Kubernetes Considerations

When deploying to Kubernetes, align your shutdown configuration with Kubernetes settings:

1. Match `terminationGracePeriodSeconds` with `gracefulShutdownTimeoutMs`:

   ```yaml
   spec:
     terminationGracePeriodSeconds: 30 # Should match gracefulShutdownTimeoutMs
     containers:
       - name: neurolink-api
         # ...
   ```

2. Use a `preStop` hook for additional delay (if the load balancer needs time to deregister):

   ```yaml
   lifecycle:
     preStop:
       exec:
         command: ["sh", "-c", "sleep 5"]
   ```

3. Ensure `drainTimeoutMs` < `gracefulShutdownTimeoutMs` < `terminationGracePeriodSeconds`:

   ```typescript
   // Recommended configuration for Kubernetes with a 30s termination period
   const server = await createServer(neurolink, {
     framework: "hono",
     config: {
       shutdown: {
         gracefulShutdownTimeoutMs: 25000, // Leave buffer for cleanup
         drainTimeoutMs: 15000, // Drain before graceful timeout
         forceClose: true,
       },
     },
   });
   ```
### Logging for Production

```typescript
import pino from "pino";

const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Log all server events (`server` is the instance returned by createServer)
server.on("request", (event) => {
  logger.info(
    { requestId: event.requestId, path: event.path },
    "Request received",
  );
});

server.on("response", (event) => {
  logger.info(
    {
      requestId: event.requestId,
      status: event.statusCode,
      duration: event.duration,
    },
    "Response sent",
  );
});

server.on("error", (event) => {
  logger.error(
    {
      requestId: event.requestId,
      error: event.error.message,
    },
    "Request error",
  );
});
```
## Production Deployment Checklist

### Pre-Deployment

- All environment variables configured
- Secrets stored securely (Kubernetes Secrets, AWS Secrets Manager, etc.)
- Docker image built and tested
- Health endpoints working
- Rate limiting configured appropriately
- CORS configured with specific origins
- Authentication middleware in place
- Logging configured

### Infrastructure

- Load balancer configured
- TLS/SSL certificates provisioned
- DNS configured
- Firewall rules set
- Resource limits defined

### Monitoring

- Health check monitoring configured
- Metrics collection enabled
- Log aggregation set up
- Alerting configured
- Error tracking (Sentry, etc.) integrated

### Scaling

- Horizontal pod autoscaler configured
- Resource requests and limits set
- Redis (or equivalent) for distributed state
- Database connection pooling configured

### Security

- Non-root container user
- Read-only filesystem where possible
- Security headers configured
- Network policies defined
- Regular security scanning enabled
## Deployment Verification via CLI

Use CLI commands to verify your deployment: