Claude Proxy

NeuroLink includes a Claude-API-compatible proxy server that sits between Claude Code and Anthropic. It pools multiple Claude accounts, handles rate-limit failover automatically, refreshes OAuth tokens on demand before they expire, and falls back to other providers when all Claude accounts are exhausted.

Overview

Why use the proxy?

Claude Code supports only one Anthropic account at a time. If you hit a rate limit, you wait. If your token expires mid-session, you re-authenticate manually. The NeuroLink proxy solves these problems:

  • Multi-account pooling -- Combine multiple Claude Pro/Max subscriptions for higher aggregate throughput.
  • Automatic token refresh -- OAuth tokens are refreshed before they expire (pre-request check + 401 retry).
  • Rate-limit failover -- When one account hits a 429, the proxy puts that account into exponential backoff and immediately retries the request on the next account.
  • Multi-provider fallback -- When all Claude accounts are exhausted, requests are routed to alternative providers (Gemini, OpenAI, etc.) through NeuroLink's provider layer.
  • Transparent to Claude Code -- Set ANTHROPIC_BASE_URL and Claude Code works normally. The proxy auto-configures this on start.

How it works at a glance

Claude Code
|
| POST /v1/messages
v
NeuroLink Proxy (localhost:55669)
|
|-- Passthrough mode (Claude -> Claude): raw body forwarding
|-- Translation mode (Claude -> Other): through neurolink.generate()/stream()
v
Anthropic API / Google AI / OpenAI / ...

Quick Start

One-command setup

neurolink proxy setup

This command:

  1. Checks for existing authenticated accounts
  2. Runs OAuth login if no valid accounts exist
  3. Installs the proxy as a launchd service (macOS) that auto-restarts on crash or reboot
  4. Auto-configures Claude Code to use the proxy

Use --no-service to skip service installation and start the proxy in the foreground instead:

neurolink proxy setup --no-service

Manual setup

# Step 1: Authenticate with Anthropic via OAuth
neurolink auth login anthropic --method oauth

# Step 2: (Optional) Add more accounts for pooling
neurolink auth login anthropic --method oauth --add --label work
neurolink auth login anthropic --method oauth --add --label personal

# Step 3: Start the proxy
neurolink proxy start

# Step 4: Restart Claude Code to pick up the new ANTHROPIC_BASE_URL

How It Works

Request Flow

Every request from Claude Code flows through the proxy in one of two modes:

Passthrough mode (Claude to Claude): The request body is forwarded directly to api.anthropic.com with only the authentication headers modified. This preserves multi-turn conversation history, thinking content, cache control, and tool definitions exactly as Claude Code sent them, because nothing is converted through an intermediate format.

Translation mode (Claude to other provider): When model routing directs a request to a non-Anthropic provider, the proxy parses the Claude Messages API request into NeuroLink's internal format, calls neurolink.generate() or neurolink.stream(), and serializes the result back into Claude Messages API format (including SSE streaming events). For streaming, the proxy emits SSE keep-alive comments (: keep-alive) every 15 seconds during idle periods to prevent connection timeouts.

Token Management

The proxy uses a two-layer token refresh strategy (a proactive pre-request check plus a reactive retry on 401) so that requests do not fail merely because a token has expired:

  1. Pre-request check -- Before each request, the proxy checks if the OAuth token expires within the next 1 hour. If so, it refreshes the token before sending the request.
  2. 401 retry -- If Anthropic returns a 401 despite the above check, the proxy refreshes the token and retries the request up to 5 times per account. If all retries fail, the account enters a 5-minute cooldown and the proxy tries the next account. After 15 consecutive refresh failures across requests, the account is permanently disabled until re-authentication.

Refreshed tokens are persisted to ~/.neurolink/anthropic-credentials.json using atomic writes (write to .tmp, then rename) with 0o600 permissions.
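
As a rough illustration of the pre-request check, the sketch below tests whether a token falls inside the one-hour refresh window. The StoredToken shape, the epoch-millisecond expiry field, and the expiresWithinWindow name are hypothetical and are not taken from the NeuroLink source; only the one-hour threshold comes from the text above.

```typescript
// Hypothetical sketch of the pre-request expiry check (not the proxy's actual code).
const REFRESH_WINDOW_MS = 60 * 60 * 1000; // 1 hour, as described above

interface StoredToken {
  accessToken: string;
  expiresAt: number; // assumed representation: epoch milliseconds
}

// Returns true when the token is already expired or expires within the window,
// meaning a refresh should happen before the request is sent.
function expiresWithinWindow(token: StoredToken, nowMs: number): boolean {
  return token.expiresAt - nowMs <= REFRESH_WINDOW_MS;
}
```

A caller would run this check before each upstream request and refresh first whenever it returns true.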

Multi-Account Routing

When multiple accounts are available, the proxy uses fill-first routing:

  1. Use the first non-cooling account for every request.
  2. On a 429, apply exponential backoff to that account and try the next one.
  3. Continue until a request succeeds or all accounts are exhausted.
  4. If all accounts are exhausted, walk the fallback chain (alternative providers).
  5. If all fallbacks fail, return a 429 with a Retry-After header indicating the earliest account recovery time.
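
The selection rule in step 1 can be sketched as a scan for the first account whose cooldown has elapsed. The PoolAccount type and the coolingUntil field are assumptions made for this example; the real account pool may track cooling state differently.

```typescript
// Hypothetical sketch of fill-first selection.
interface PoolAccount {
  label: string;
  coolingUntil: number; // assumed: epoch ms when the cooldown ends, 0 if never cooled
}

// Returns the first account that is not cooling, or undefined when every
// account is cooling (the point at which the fallback chain would be used).
function pickFillFirst(accounts: PoolAccount[], nowMs: number): PoolAccount | undefined {
  for (const account of accounts) {
    if (account.coolingUntil <= nowMs) {
      return account;
    }
  }
  return undefined;
}
```

Because the scan always starts at the front of the list, the same account keeps serving requests until it fails, which matches the fill-first behavior described above.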

Account sources are checked in priority order:

  1. TokenStore compound keys (e.g., anthropic:work, anthropic:personal) -- from neurolink auth login --label
  2. Legacy credentials file (~/.neurolink/anthropic-credentials.json) -- only if no TokenStore accounts exist
  3. Environment variable (ANTHROPIC_API_KEY) -- only if no other accounts exist

Fallback Chain

When all Claude accounts are rate-limited, the proxy walks the fallback chain defined in the config file. Each fallback entry specifies a provider and model:

routing:
  fallback-chain:
    - provider: google-ai
      model: gemini-2.5-flash
    - provider: openai
      model: gpt-4o

Fallback requests go through NeuroLink's stream() pipeline (translation mode), which handles the format conversion to and from the target provider's API. Tools, thinking configuration, and conversation history from the original request are passed through to the fallback provider.

Configuration

Proxy config file

The proxy loads configuration from ~/.neurolink/proxy-config.yaml by default (override with --config). The file supports YAML or JSON format with environment variable interpolation.

# ~/.neurolink/proxy-config.yaml
version: 1

# Account definitions (alternative to neurolink auth login)
accounts:
  anthropic:
    - name: primary
      apiKey: ${ANTHROPIC_API_KEY_PRIMARY}
    - name: secondary
      apiKey: ${ANTHROPIC_API_KEY_SECONDARY}
      weight: 2
      rateLimit: 100

# Routing configuration
routing:
  strategy: fill-first # or round-robin

  # Model mappings: remap incoming model names to different providers
  model-mappings:
    - from: claude-sonnet-4-20250514
      to: gemini-2.5-pro
      provider: google-ai

  # Fallback chain: try these when all Claude accounts are exhausted
  fallback-chain:
    - provider: google-ai
      model: gemini-2.5-flash
    - provider: openai
      model: gpt-4o

  # Models that always go to Anthropic (skip routing logic)
  passthrough-models:
    - claude-opus-4-20250514
    - claude-sonnet-4-5-20250929

# Cloaking configuration (request transformation for OAuth)
cloaking:
  mode: auto # "auto" | "always" | "never"
  plugins: {}

Environment variable interpolation

String values in the config file support ${VAR_NAME} and ${VAR_NAME:-default} syntax:

accounts:
  anthropic:
    - name: primary
      apiKey: ${ANTHROPIC_KEY_1}
    - name: fallback
      apiKey: ${ANTHROPIC_KEY_2:-sk-ant-fallback-key}
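
To make the two forms concrete, here is a minimal sketch of how such interpolation could work. The interpolate function and the EnvMap type are illustrative names, not the loader's real API. The sketch assumes shell-like semantics (the default applies when the variable is unset or empty) and leaves unresolved references in place; the actual loader may differ on both points.

```typescript
// Hypothetical sketch of ${VAR} / ${VAR:-default} interpolation.
type EnvMap = { [name: string]: string | undefined };

function interpolate(value: string, env: EnvMap): string {
  // Group 1: variable name. Group 2 (optional): default after ":-".
  const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g;
  return value.replace(
    pattern,
    (match: string, name: string, fallback: string | undefined): string => {
      const resolved = env[name];
      if (resolved !== undefined && resolved !== "") {
        return resolved;
      }
      if (fallback !== undefined) {
        return fallback;
      }
      return match; // assumption: keep the unresolved reference visible
    }
  );
}
```

In a Node.js process the env argument would typically be process.env; it is passed explicitly here so the function stays easy to test.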

Account configuration options

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | unnamed | Human-readable label for the account |
| apiKey | string | -- | API key or token (supports ${ENV_VAR}) |
| baseUrl | string | -- | Override the provider endpoint URL |
| orgId | string | -- | Organization ID (e.g., for OpenAI orgs) |
| weight | number | 1 | Weight for weighted round-robin selection |
| enabled | boolean | true | Whether this account is active |
| rateLimit | number | -- | Max requests per minute for this account |
| metadata | object | -- | Arbitrary metadata attached to the account |

Server options

| Option | Default | Description |
| --- | --- | --- |
| port | 55669 | Port to listen on |
| host | 127.0.0.1 | Host to bind to |
| config | ~/.neurolink/proxy-config.yaml | Path to config file |

CLI Commands

One-command onboarding: checks for existing accounts, runs OAuth login if needed, installs the proxy as a persistent service, and configures Claude Code.

neurolink proxy setup              # Full setup: login + install as launchd service (macOS)
neurolink proxy setup --no-service # Login + start foreground (no auto-restart)
neurolink proxy setup -p 9000 # Setup on custom port

Install the proxy as a persistent macOS launchd service. The service auto-restarts on crash (5-second throttle interval) and starts on login.

neurolink proxy install              # Install with defaults (port 55669)
neurolink proxy install --port 9000 # Install on custom port
neurolink proxy install --host 0.0.0.0 # Bind to all interfaces

Options:

| Flag | Alias | Default | Description |
| --- | --- | --- | --- |
| --port | -p | 55669 | Port to listen on |
| --host | -H | 127.0.0.1 | Host to bind to |

Remove the launchd service. Stops the proxy if it is running and deletes the launchd plist.

neurolink proxy uninstall

Start the proxy server.

neurolink proxy start                           # Default: port 55669, round-robin
neurolink proxy start -p 8080 -s fill-first # Custom port and strategy
neurolink proxy start --config ./my-proxy.yaml # Custom config file
neurolink proxy start --debug # Enable debug logging
neurolink proxy start --quiet # Suppress non-essential output

Options:

| Flag | Alias | Default | Description |
| --- | --- | --- | --- |
| --port | -p | 55669 | Port to listen on |
| --host | -H | 127.0.0.1 | Host to bind to |
| --strategy | -s | round-robin | Account selection strategy (round-robin or fill-first) |
| --health-interval | | 30 | Health check interval (seconds) |
| --config | -c | ~/.neurolink/proxy-config.yaml | Config file path |
| --quiet | -q | false | Suppress output |
| --debug | -d | false | Enable debug output |

Strategy choices: round-robin, fill-first

Show proxy status, including PID, uptime, strategy, fallback chain, and per-account usage statistics (fetched from the live /status endpoint).

neurolink proxy status               # Human-readable text output
neurolink proxy status --format json # Machine-readable JSON

Authenticate with Anthropic. Supports multi-account pooling via --add --label.

# Interactive (prompts for method)
neurolink auth login anthropic

# OAuth (for Claude Pro/Max subscription)
neurolink auth login anthropic --method oauth

# API key
neurolink auth login anthropic --method api-key

# Create API key via OAuth (Claude Pro/Max)
neurolink auth login anthropic --method create-api-key

# Add a second account with a label
neurolink auth login anthropic --method oauth --add --label work
neurolink auth login anthropic --method oauth --add --label personal

# Non-interactive mode (requires environment variables)
neurolink auth login anthropic --method api-key --non-interactive

Options:

| Flag | Alias | Default | Description |
| --- | --- | --- | --- |
| --method | -m | -- | Auth method: api-key, oauth, create-api-key |
| --add | | false | Add as additional account to the pool (instead of replacing) |
| --label | | -- | Human-readable label for this account (used with --add) |
| --non-interactive | | false | Skip interactive prompts (requires environment variables) |
| --format | | text | Output format: text or json |
| --debug | | false | Enable debug output |

List all authenticated accounts with status, including the account email address (resolved via OAuth token exchange), token expiry, and per-account quota utilization (5-hour and 7-day windows).

neurolink auth list               # Text output
neurolink auth list --format json # JSON output
neurolink auth list --debug # Include debug details

Show authentication status for a specific provider (or all providers if omitted).

neurolink auth status              # Show all providers
neurolink auth status anthropic # Show Anthropic only
neurolink auth status --format json # JSON output

Manually refresh OAuth tokens.

neurolink auth refresh anthropic

Remove expired and disabled accounts from the token store.

neurolink auth cleanup           # Interactive: prompts before removing
neurolink auth cleanup --force # Remove without prompting

Re-enable a previously disabled account (e.g., one disabled after repeated refresh failures).

neurolink auth enable work       # Re-enable the account labeled "work"

Multi-Account Setup

Adding multiple accounts

Each neurolink auth login --add --label <name> creates a separate account entry in the TokenStore (~/.neurolink/tokens.json):

# Account 1: personal Claude Max
neurolink auth login anthropic --method oauth --add --label personal

# Account 2: work Claude Max
neurolink auth login anthropic --method oauth --add --label work

# Account 3: API key for fallback
neurolink auth login anthropic --method api-key --add --label api

How accounts are selected

The proxy discovers accounts in this order:

  1. Compound keys from TokenStore (e.g., anthropic:personal, anthropic:work)
  2. Legacy credentials file (if no compound keys exist)
  3. ANTHROPIC_API_KEY environment variable (if no other accounts exist)

Within the account pool, the proxy uses fill-first routing: it always tries the first non-cooling account and only switches on failure. This avoids unnecessary identity switches that could confuse Claude Code's session state.

Cooldown and backoff

When an account encounters an error, it enters a cooldown period based on the error type:

| Status Code | Cooldown Duration | Behavior |
| --- | --- | --- |
| 429 | Exponential backoff (1s to 10 min) | Try next account |
| 401/402/403 | 5 minutes | Try next account |
| 404 | No cooldown | Return error immediately |
| 5xx/transient | No cooldown | Rotate immediately |
| Network error | No cooldown | Rotate immediately |

Exponential backoff on 429:

The proxy respects the Retry-After header from Anthropic when present. For repeated 429s on the same account, the cooldown is calculated as baseCooldown * 2^level where baseCooldown is the Retry-After value (or 1 second if absent) and level increments on each consecutive 429. This produces a sequence like 1s, 2s, 4s, 8s, 16s, ... up to a 10-minute cap. The backoff level resets to zero on a successful request.
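
The formula above can be written out directly. This is a sketch under the stated rules (base of 1 second when Retry-After is absent, doubling per level, 10-minute cap); the function name and the millisecond units are choices made for the example.

```typescript
// Sketch of the 429 cooldown formula: baseCooldown * 2^level, capped at 10 minutes.
const MAX_COOLDOWN_MS = 10 * 60 * 1000;

// level: number of consecutive 429s already seen on this account (starts at 0).
// retryAfterSec: value of the Retry-After header in seconds, if one was sent.
function cooldownMs(level: number, retryAfterSec?: number): number {
  const baseSec = retryAfterSec !== undefined && retryAfterSec > 0 ? retryAfterSec : 1;
  return Math.min(baseSec * 1000 * Math.pow(2, level), MAX_COOLDOWN_MS);
}
```

With no Retry-After header, levels 0 through 4 give 1s, 2s, 4s, 8s, and 16s, and level 10 (1024s) is already clamped to the 600s cap.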

Error Handling

The proxy classifies upstream errors and applies different strategies:

429 Rate Limit

  • Parse Retry-After header (seconds or HTTP date format)
  • Apply exponential backoff with level tracking
  • Put the account into cooling state
  • Immediately try the next account
  • Log: [proxy] <- 429 account=work backoff-level=2 cooldown=4s
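
Retry-After may carry either a number of seconds or an HTTP date, so parsing needs two branches. The sketch below is one way to handle both; parseRetryAfter is a hypothetical name, and returning undefined for unparseable input is an assumption about how the caller falls back to the 1-second base.

```typescript
// Hypothetical sketch: parse Retry-After as delta-seconds or as an HTTP date.
// Returns whole seconds to wait, or undefined when the header is absent or unparseable.
function parseRetryAfter(header: string | null, nowMs: number): number | undefined {
  if (header === null) {
    return undefined;
  }
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) {
    return parseInt(trimmed, 10); // delta-seconds form, e.g. "120"
  }
  const dateMs = Date.parse(trimmed); // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  if (isNaN(dateMs)) {
    return undefined;
  }
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}
```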

401/402/403 Authentication Errors

  • OAuth accounts with refresh token: Refresh the token and retry the request up to 5 times per account. If all retries fail, apply a 5-minute cooldown and try the next account. After 15 consecutive refresh failures across requests, the account is permanently disabled until re-authentication via neurolink auth login.
  • OAuth accounts without refresh token: Apply a 5-minute cooldown, try the next account.
  • API key accounts: Apply a 5-minute cooldown, try the next account.

400/422 Request Shape Error

  • Detected via HTTP 422 status or invalid_request_error error type in the response body.
  • No retry or failover. These are client-side errors (malformed request, invalid parameters).
  • Return the error body directly to Claude Code.

404 Not Found

  • Typically means the model is not available for this account.
  • No cooldown applied.
  • Return the error body immediately to the client (no failover to next account).

5xx / Transient Server Error

  • Transient errors (408, 500, 502, 503, 504, and Cloudflare 520-526/529).
  • Also matches 400 responses with api_error or overloaded_error types that wrap transient HTML content (e.g., Cloudflare error pages).
  • No cooldown applied -- immediate rotation to the next account.
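
The classification rules in this section can be summarized as a single decision function. This is a simplified sketch: the action names are invented for the example, the errorType argument stands in for the error type parsed from the response body, the HTML-content check on 400 responses is omitted, and treating any unlisted status as a client error is an assumption.

```typescript
// Hypothetical summary of the error classification described in this section.
type UpstreamAction =
  | "backoff-then-next-account"   // 429
  | "cooldown-then-next-account"  // 401/402/403 (after any refresh retries)
  | "return-to-client"            // 404, 422, invalid_request_error
  | "rotate-immediately";         // transient errors, no cooldown

function classifyUpstreamError(status: number, errorType?: string): UpstreamAction {
  if (status === 429) return "backoff-then-next-account";
  if (status === 401 || status === 402 || status === 403) return "cooldown-then-next-account";
  if (status === 404) return "return-to-client";
  if (status === 422 || errorType === "invalid_request_error") return "return-to-client";

  const transient = [408, 500, 502, 503, 504, 529];
  if (transient.indexOf(status) !== -1 || (status >= 520 && status <= 526)) {
    return "rotate-immediately";
  }
  if (status === 400 && (errorType === "api_error" || errorType === "overloaded_error")) {
    return "rotate-immediately";
  }
  return "return-to-client"; // assumption: anything else is treated as a client error
}
```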

All Accounts Exhausted

When every account is in a cooling state:

  1. Walk the fallback chain (if configured).
  2. Each fallback uses NeuroLink's stream() pipeline with the specified provider/model.
  3. If all fallbacks also fail, return a 429 with Retry-After set to the earliest account recovery time.
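
For step 3, the Retry-After value is derived from whichever account leaves its cooldown first. A minimal sketch, assuming cooldown end times are tracked as epoch milliseconds (the function name is illustrative):

```typescript
// Hypothetical sketch: seconds until the earliest account recovers.
function retryAfterSeconds(coolingUntilMs: number[], nowMs: number): number {
  let earliest = Infinity;
  for (const until of coolingUntilMs) {
    if (until < earliest) {
      earliest = until;
    }
  }
  if (!isFinite(earliest)) {
    return 0; // no accounts tracked, nothing to wait for
  }
  // Round up so the client never retries before the account is usable again.
  return Math.max(0, Math.ceil((earliest - nowMs) / 1000));
}
```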

Bootstrap Retry (Streaming)

For streaming requests, the proxy reads the first chunk from the upstream response before forwarding it to the client. If the first chunk is empty (indicating a failed stream), the proxy retries with the next account. This prevents Claude Code from receiving an empty SSE stream.

Auto-Configuration

Claude Code integration

When the proxy starts, it automatically updates ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:55669",
    "ENABLE_TOOL_SEARCH": "true"
  }
}

When the proxy stops (Ctrl+C or SIGTERM), it removes these entries from the settings file. This means Claude Code automatically routes through the proxy when it is running and goes direct when it is not.

Note: You must restart Claude Code after starting or stopping the proxy for the settings change to take effect.

Proxy state file

The proxy persists its running state to ~/.neurolink/proxy-state.json so that neurolink proxy status can report on it and neurolink proxy start can detect an already-running instance. The state includes PID, port, host, strategy, start time, fallback chain, and the optional fail-open guard PID.

Fail-open guard

On startup, the proxy spawns a detached background process (neurolink proxy guard) that monitors the proxy's health endpoint. If the proxy process exits unexpectedly without cleaning up ~/.claude/settings.json, the guard removes the stale ANTHROPIC_BASE_URL entry so that Claude Code falls back to direct Anthropic access rather than failing against a dead proxy.

Architecture

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/messages | Claude Messages API (main endpoint) |
| GET | /v1/models | List available Claude models |
| POST | /v1/messages/count_tokens | Token counting |
| GET | /health | Health check (status, strategy, uptime) |
| GET | /status | Detailed proxy status |

Passthrough mode (Claude to Claude)

When the target provider is anthropic (the default for any claude-* model), the proxy operates in passthrough mode:

  1. Load all available accounts (TokenStore, legacy file, env var). Expired accounts are given one refresh attempt at startup; if that fails, they are disabled.
  2. Select the first non-cooling account (fill-first via round-robin cursor).
  3. Auto-refresh the token if expiring within 1 hour.
  4. Forward the raw request body via plain fetch() to https://api.anthropic.com/v1/messages?beta=true.
  5. Set authentication headers (Authorization: Bearer for OAuth, x-api-key for API keys).
  6. Forward client headers as-is; fill defaults only when absent (e.g., user-agent, anthropic-version). Ensure oauth-2025-04-20 is in the beta header.
  7. For streaming: verify the first chunk (bootstrap retry), then forward the stream. For non-streaming: return JSON.

This mode preserves the exact request format that Claude Code expects, including thinking blocks, cache control headers, and multi-turn tool use conversations. Rate-limit headers from Anthropic (retry-after, anthropic-ratelimit-requests-remaining, anthropic-ratelimit-requests-limit, anthropic-ratelimit-tokens-remaining, anthropic-ratelimit-tokens-limit) are passed through to the client.
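
Steps 5 and 6 can be sketched as a header-building function. The type and function names are hypothetical. The sketch lower-cases header names for simplicity (the proxy forwards them as-is), uses anthropic-beta as the beta header, and assumes 2023-06-01 as the default anthropic-version; the proxy's actual default is not stated in this document.

```typescript
// Hypothetical sketch of upstream header construction in passthrough mode.
type HeaderMap = { [name: string]: string };
type UpstreamAccount =
  | { kind: "oauth"; accessToken: string }
  | { kind: "api-key"; apiKey: string };

const OAUTH_BETA_FLAG = "oauth-2025-04-20";

function buildUpstreamHeaders(client: HeaderMap, account: UpstreamAccount): HeaderMap {
  const out: HeaderMap = {};
  for (const name of Object.keys(client)) {
    const lower = name.toLowerCase();
    const value = client[name];
    // Client-supplied credentials are dropped; the pool account's are used instead.
    if (value === undefined || lower === "authorization" || lower === "x-api-key") continue;
    out[lower] = value;
  }
  if (account.kind === "oauth") {
    out["authorization"] = "Bearer " + account.accessToken;
    const existing = out["anthropic-beta"];
    const flags = existing
      ? existing.split(",").map((s) => s.trim()).filter((s) => s.length > 0)
      : [];
    if (flags.indexOf(OAUTH_BETA_FLAG) === -1) flags.push(OAUTH_BETA_FLAG);
    out["anthropic-beta"] = flags.join(",");
  } else {
    out["x-api-key"] = account.apiKey;
  }
  // Fill a default only when the client did not send one (assumed default value).
  if (out["anthropic-version"] === undefined) out["anthropic-version"] = "2023-06-01";
  return out;
}
```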

Translation mode (Claude to other provider)

When model routing directs to a non-Anthropic provider:

  1. Parse the Claude request using parseClaudeRequest() -- extracts prompt, system prompt, images, tools, thinking config, and conversation history. The thinking type field is handled adaptively: both "enabled" (fixed budget) and "adaptive" (auto budget, mapped to thinkingLevel: "medium") are supported.
  2. Call neurolink.stream() with the target provider and model. Tools and conversation messages from the original request are passed through (not disabled).
  3. For streaming: use ClaudeStreamSerializer to emit Claude-compatible SSE events (message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop).
  4. For non-streaming: collect all text from the stream and call serializeClaudeResponse() to build a Claude Messages API response.
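
To show what the event sequence in step 3 looks like on the wire, the sketch below serializes a single text reply as Claude-style SSE events. It is not ClaudeStreamSerializer; the function names are hypothetical, the whole text is emitted as one delta, and token counts are zero placeholders.

```typescript
// Hypothetical sketch: emit one text reply as a Claude-compatible SSE event sequence.
function sseEvent(event: string, data: object): string {
  return "event: " + event + "\ndata: " + JSON.stringify(data) + "\n\n";
}

function serializeTextAsSse(messageId: string, model: string, text: string): string[] {
  return [
    sseEvent("message_start", {
      type: "message_start",
      message: {
        id: messageId, type: "message", role: "assistant", model: model,
        content: [], stop_reason: null,
        usage: { input_tokens: 0, output_tokens: 0 }, // placeholder counts
      },
    }),
    sseEvent("content_block_start", {
      type: "content_block_start", index: 0, content_block: { type: "text", text: "" },
    }),
    sseEvent("content_block_delta", {
      type: "content_block_delta", index: 0, delta: { type: "text_delta", text: text },
    }),
    sseEvent("content_block_stop", { type: "content_block_stop", index: 0 }),
    sseEvent("message_delta", {
      type: "message_delta",
      delta: { stop_reason: "end_turn", stop_sequence: null },
      usage: { output_tokens: 0 }, // placeholder count
    }),
    sseEvent("message_stop", { type: "message_stop" }),
  ];
}
```

A real serializer would emit many content_block_delta events as chunks arrive from the target provider, rather than a single delta containing the full text.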

OAuth cloaking

For OAuth-authenticated requests, the proxy applies transformations to make requests appear as standard Claude CLI traffic:

  • User-Agent: claude-cli/2.1.80 (external, cli)
  • Beta headers: claude-code-20250219, oauth-2025-04-20, interleaved-thinking-2025-05-14, context-management-2025-06-27, prompt-caching-scope-2026-01-05
  • Identity headers: x-app: cli, anthropic-dangerous-direct-browser-access: true
  • Stainless SDK headers: x-stainless-runtime, x-stainless-lang, x-stainless-os, etc.
  • Billing header: Injected into the system prompt as a text block
  • User ID: Synthetic user_id in metadata (cached per token prefix, 1-hour TTL)

The CloakingPipeline supports three modes:

| Mode | Behavior |
| --- | --- |
| auto | Apply cloaking only for OAuth accounts (default) |
| always | Apply cloaking for all accounts |
| never | Skip all cloaking |
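
The three modes reduce to a small decision function. The names below are illustrative and are not taken from CloakingPipeline itself.

```typescript
// Hypothetical sketch of the cloaking mode decision.
type CloakingMode = "auto" | "always" | "never";

function shouldCloak(mode: CloakingMode, isOAuthAccount: boolean): boolean {
  if (mode === "always") return true;
  if (mode === "never") return false;
  return isOAuthAccount; // "auto": cloak only OAuth-authenticated requests
}
```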

Cloaking plugins

The pipeline runs plugins in the order given by each plugin's order field:

  • HeaderScrubber -- Removes or modifies headers that reveal proxy usage
  • SessionIdentity -- Generates consistent fake session identifiers
  • SystemPromptInjector -- Adds billing and agent block to system prompts
  • TlsFingerprint -- TLS fingerprint matching
  • WordObfuscator -- Obfuscates identifiable patterns

Request logging

The proxy logs every request to ~/.neurolink/logs/proxy-YYYY-MM-DD.jsonl in JSONL format. Each entry includes timestamp, request ID, method, path, model, account label, response status, response time, and token usage. Log files use 0o600 permissions.

Full debug logs (complete request/response bodies and headers) are written to ~/.neurolink/logs/proxy-debug-YYYY-MM-DD.jsonl. These are useful for diagnosing upstream API issues.

Header redaction: Request headers are redacted before logging — sensitive values (authorization, x-api-key) are truncated or masked. Response headers from the upstream API are currently logged unredacted.
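
A minimal sketch of request-header redaction follows. The list of sensitive names comes from the paragraph above; the function name, the 8-character visible prefix, and the masking format are assumptions for illustration only.

```typescript
// Hypothetical sketch of header redaction before logging.
const SENSITIVE_HEADER_NAMES = ["authorization", "x-api-key"];

function redactHeaders(headers: { [name: string]: string }): { [name: string]: string } {
  const out: { [name: string]: string } = {};
  for (const name of Object.keys(headers)) {
    const value = headers[name];
    if (value === undefined) continue;
    if (SENSITIVE_HEADER_NAMES.indexOf(name.toLowerCase()) !== -1) {
      // Keep a short prefix for correlation; fully mask very short values.
      out[name] = value.length > 8 ? value.slice(0, 8) + "...[redacted]" : "[redacted]";
    } else {
      out[name] = value;
    }
  }
  return out;
}
```

The same function could be applied to upstream response headers, which the text above notes are currently logged unredacted.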

Log rotation

Log files are automatically cleaned up on two triggers:

  • At startup -- deletes files older than 7 days, then trims remaining files if total size exceeds 500 MB (oldest first).
  • Hourly -- repeats the same cleanup during proxy runtime.

This prevents unbounded log growth without requiring external cron jobs.
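
The two-phase cleanup (age first, then total size) can be sketched as a pure function over file metadata. The LogFile shape and function name are hypothetical, and 500 MB is interpreted here as 500 * 1024 * 1024 bytes, which is an assumption.

```typescript
// Hypothetical sketch of the log cleanup policy: age limit, then size limit.
interface LogFile {
  name: string;
  sizeBytes: number;
  modifiedMs: number; // epoch milliseconds
}

const MAX_LOG_AGE_MS = 7 * 24 * 60 * 60 * 1000;
const MAX_LOG_TOTAL_BYTES = 500 * 1024 * 1024; // assumed meaning of "500 MB"

// Returns the names of files that should be deleted, in deletion order.
function filesToDelete(files: LogFile[], nowMs: number): string[] {
  const doomed: string[] = [];
  const kept: LogFile[] = [];
  for (const file of files) {
    if (nowMs - file.modifiedMs > MAX_LOG_AGE_MS) {
      doomed.push(file.name); // phase 1: older than 7 days
    } else {
      kept.push(file);
    }
  }
  kept.sort((a, b) => a.modifiedMs - b.modifiedMs); // oldest first
  let total = 0;
  for (const file of kept) total += file.sizeBytes;
  for (const file of kept) {
    if (total <= MAX_LOG_TOTAL_BYTES) break;
    doomed.push(file.name); // phase 2: trim oldest until under the size cap
    total -= file.sizeBytes;
  }
  return doomed;
}
```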

Usage statistics

In-memory per-account statistics track:

  • Request count, success count, error count, rate-limit count
  • Current backoff level and cooling state
  • Last request and last error timestamps

Statistics reset on proxy restart. Access via the /status endpoint.

Comparison with CLIProxyAPI

| Feature | NeuroLink Proxy | CLIProxyAPI (Go) |
| --- | --- | --- |
| Language | TypeScript (Node.js) | Go |
| Multi-account pooling | Yes (fill-first + failover) | Yes (round-robin) |
| OAuth token refresh | 2-layer (pre-request + 401 retry) | Single refresh |
| Multi-provider fallback | Yes (any NeuroLink provider) | No |
| Model mapping/routing | Yes (YAML config) | No |
| Anti-detection/cloaking | Plugin pipeline | Built-in |
| SDK integration | Full NeuroLink SDK access | Standalone binary |
| Config format | YAML/JSON with env vars | TOML |
| Installation | npm install @juspay/neurolink | Standalone binary |
| Claude Code integration | Auto-configures settings.json | Manual setup |
| Streaming | SSE passthrough + bootstrap retry | SSE passthrough |
| Token storage | TokenStore (multi-provider) | Single-provider file |

Key Files

| File | Purpose |
| --- | --- |
| src/cli/commands/proxy.ts | CLI commands: start, status, setup, install, uninstall |
| src/lib/server/routes/claudeProxyRoutes.ts | Claude API route handlers (passthrough + translation) |
| src/lib/proxy/modelRouter.ts | Model name resolution and fallback chain |
| src/lib/proxy/claudeFormat.ts | Request parser, response serializer, SSE state machine |
| src/lib/proxy/oauthFetch.ts | OAuth fetch wrapper with cloaking |
| src/lib/proxy/proxyConfig.ts | YAML/JSON config loader with env var interpolation |
| src/lib/proxy/requestLogger.ts | JSONL request logging |
| src/lib/proxy/usageStats.ts | In-memory per-account statistics |
| src/lib/proxy/tokenRefresh.ts | Shared token refresh helpers (needsRefresh, refreshToken, persistTokens) |
| src/lib/proxy/accountQuota.ts | Quota header parsing (unified-5h, unified-7d) and persistence |
| src/lib/proxy/cloaking/index.ts | CloakingPipeline orchestrator |
| src/lib/proxy/cloaking/types.ts | Cloaking plugin interface and context types |
| src/lib/auth/tokenStore.ts | Multi-provider OAuth token storage |
| src/lib/auth/anthropicOAuth.ts | Anthropic OAuth 2.0 + PKCE flow |
| src/lib/auth/accountPool.ts | Account pool management |
| src/cli/commands/auth.ts | Auth CLI commands: login, logout, list, status, refresh, cleanup, enable |
| src/cli/factories/authCommandFactory.ts | Auth command builder with subcommands |
| src/lib/types/subscriptionTypes.ts | Subscription tier, auth, and routing types |

Troubleshooting

Proxy won't start: "already running"

The proxy detected a running instance. Check status and stop the existing one:

neurolink proxy status
# If the reported PID is stale, remove the state file:
rm ~/.neurolink/proxy-state.json
neurolink proxy start

Claude Code not connecting through proxy

  1. Verify the proxy is running: neurolink proxy status
  2. Check ~/.claude/settings.json has ANTHROPIC_BASE_URL set
  3. Restart Claude Code after starting the proxy

Token refresh failures

If you see refresh failed in the logs:

# Manually refresh
neurolink auth refresh anthropic

# Or re-login
neurolink auth login anthropic --method oauth

All accounts rate-limited

Check cooldown status and wait for recovery:

neurolink proxy status --format json
# Look at fallbackChain and uptime

Add more accounts to the pool to increase throughput:

neurolink auth login anthropic --method oauth --add --label extra

Config file not loading

Verify the config file exists and is valid YAML:

cat ~/.neurolink/proxy-config.yaml
# Or specify explicitly:
neurolink proxy start --config /path/to/config.yaml

Unresolved ${VAR} references in the config indicate missing environment variables. The proxy warns about plaintext API keys in config files -- use ${ENV_VAR} references instead.


Planned Future Features

Features explored during the CLIProxyAPI comparison analysis and deferred for future implementation.

OpenAI-Compatible Endpoint (/v1/chat/completions)

Priority: High | Complexity: Medium

Add an OpenAI-compatible API endpoint so any tool that speaks the OpenAI format (Cursor, Continue, Aider, Open Interpreter, etc.) can route through the proxy to Claude accounts.

  • What exists: NeuroLink SDK already translates between all providers via Vercel AI SDK. The Claude proxy (claudeFormat.ts + claudeProxyRoutes.ts) is the production template.
  • What's needed:
    • openaiFormat.ts — parse OpenAI requests, serialize OpenAI responses, streaming SSE state machine (mirror of claudeFormat.ts)
    • openaiProxyRoutes.ts -- POST /v1/chat/completions, GET /v1/models, POST /v1/embeddings endpoints
    • Route registration in src/lib/server/routes/index.ts with openaiProxy: true
  • Key format differences: OpenAI uses choices[].message.content vs Claude's content[].text, finish_reason inline vs stop_reason, system messages in the messages array vs top-level system field
  • Account pool: Shares the same OAuth account pool as the Claude proxy — all traffic pools across accounts with fill-first routing

TLS Fingerprint Spoofing

Priority: Medium | Complexity: High

Bypass Cloudflare TLS fingerprinting on Anthropic OAuth endpoints. CLIProxyAPI uses refraction-networking/utls with tls.HelloChrome_Auto to impersonate Chrome's TLS handshake.

  • Current status: Switching refresh endpoint from console.anthropic.com to api.anthropic.com (lighter Cloudflare) resolved most issues. Revisit only if Cloudflare blocks resurface.
  • Node.js options:
    • curl-impersonate bindings via native module
    • tls-client npm package
    • Subprocess to curl-impersonate for OAuth operations only
  • Scope: Only needed for token exchange and refresh calls, not API requests (those use proper headers already)

Management Dashboard

Priority: Low | Complexity: Medium

Web-based UI for monitoring proxy status, account health, quota utilization, and request logs.

  • Data sources: ~/.neurolink/account-quotas.json (live quota), ~/.neurolink/logs/proxy-*.jsonl (request logs), ~/.neurolink/tokens.json (account status)
  • Possible approach: Lightweight Hono route serving a static HTML dashboard, reading from existing files
  • CLIProxyAPI pattern: Uses a management API (/v0/management/auth-files) for remote status — could expose similar endpoints

WebSocket Relay

Priority: Low | Complexity: High

WebSocket-based connections for real-time bidirectional communication.

  • Use cases: Live dashboard updates, browser-based clients, streaming multiplexing
  • Current need: None — no consumer exists today
  • CLIProxyAPI pattern: Uses WebSocket for dynamically connecting providers (e.g., Gemini via WebSocket). Only relevant if we add browser-based provider injection.

Hot-Reload of Config Files

Priority: Low | Complexity: Low | Partially Implemented

Watch configuration files for changes and reload without restart.

  • Credentials hot-reload: Already implemented — accounts are loaded per-request from disk, and runtime state auto-resets when credentials change (including re-enabling disabled accounts)
  • What's missing: Config file hot-reload (proxy-config.yaml) — currently requires proxy restart. Could use chokidar or fs.watch to detect YAML changes and reload ModelRouter, strategy, and other settings
  • CLIProxyAPI pattern: Uses fsnotify with debouncing (50ms for files, 150ms for config) and SHA256 change detection

Quota-Aware Routing

Priority: Medium | Complexity: Low

Use captured quota data (account-quotas.json) to make smarter routing decisions.

  • Current behavior: Fill-first — exhausts one account before moving to the next on 429/401
  • Enhancement: Check sessionUsed / weeklyUsed before routing. If the primary account is above the fallbackPercentage threshold (50%), proactively switch to the next account before hitting a hard 429
  • Data available: All quota headers are already captured and stored per-account
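
Since this feature is only planned, the following is a speculative sketch of what threshold-based selection might look like. It assumes sessionUsed and weeklyUsed are percentages from 0 to 100, and the QuotaSnapshot type and pickByQuota name are invented for the example.

```typescript
// Speculative sketch of quota-aware selection (the feature is not implemented).
interface QuotaSnapshot {
  label: string;
  sessionUsed: number; // assumed: percent of the 5-hour window used, 0-100
  weeklyUsed: number;  // assumed: percent of the 7-day window used, 0-100
}

// Prefer the first account under the threshold in both windows; if every
// account is over it, fall back to plain fill-first (the first account).
function pickByQuota(accounts: QuotaSnapshot[], fallbackPercentage: number): QuotaSnapshot | undefined {
  for (const account of accounts) {
    if (account.sessionUsed < fallbackPercentage && account.weeklyUsed < fallbackPercentage) {
      return account;
    }
  }
  return accounts[0];
}
```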

Per-Model Account Restrictions

Priority: Low | Complexity: Low

Allow configuring which accounts can use which models.

  • Use case: Account A has Max subscription (can use Opus), Account B has Pro (Sonnet/Haiku only). Routing Opus requests to Account B wastes a round-trip on a guaranteed 403.
  • CLIProxyAPI pattern: Per-account excluded-models list with wildcard matching
  • Implementation: Add excludedModels?: string[] to account config, filter during account selection
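
As a speculative sketch of the proposed filter, the code below matches model names against excludedModels patterns where * stands for any run of characters. The RestrictedAccount type, the function names, and the exact wildcard semantics are assumptions; the feature does not exist yet.

```typescript
// Speculative sketch of per-model account filtering (the feature is not implemented).
interface RestrictedAccount {
  label: string;
  excludedModels?: string[];
}

// Treat "*" as a wildcard and every other character literally.
function matchesModelPattern(pattern: string, model: string): boolean {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp("^" + escaped + "$").test(model);
}

// Keep only the accounts whose exclusion list does not match the requested model.
function eligibleAccounts(accounts: RestrictedAccount[], model: string): RestrictedAccount[] {
  return accounts.filter((account) => {
    const patterns = account.excludedModels || [];
    return !patterns.some((pattern) => matchesModelPattern(pattern, model));
  });
}
```

Filtering before account selection would avoid sending an Opus request to an account that is known not to support it.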