Claude Proxy Configuration Reference

This document is the authoritative reference for every configurable aspect of the NeuroLink Claude proxy. It covers CLI flags, the YAML config file schema, environment variables, auto-configured Claude Code settings, and all file locations.


1. CLI Flags

neurolink proxy start

Start the Claude multi-account proxy server.

| Flag | Alias | Type | Default | Description |
|------|-------|------|---------|-------------|
| --port | -p | number | 55669 | Port to listen on. |
| --host | -H | string | 127.0.0.1 | Host/IP to bind to. Use 0.0.0.0 to listen on all interfaces. |
| --strategy | -s | string | round-robin | Account selection strategy. Choices: round-robin, fill-first. |
| --health-interval | | number | 30 | Health check interval in seconds. |
| --quiet | -q | boolean | false | Suppress non-essential output (banner, status messages). |
| --debug | -d | boolean | false | Enable debug output (stack traces on errors, verbose logging). |
| --config | -c | string | ~/.neurolink/proxy-config.yaml | Path to proxy config file (YAML or JSON). |

Examples:

# Start with defaults (port 55669, round-robin strategy)
neurolink proxy start

# Custom port and round-robin strategy
neurolink proxy start -p 8080 -s round-robin

# Start with 60-second health checks, debug output
neurolink proxy start --health-interval 60 --debug

# Use a custom config file
neurolink proxy start --config /path/to/my-proxy.yaml

neurolink proxy status

Show the current proxy status.

| Flag | Alias | Type | Default | Description |
|------|-------|------|---------|-------------|
| --format | | string | text | Output format. Choices: text, json. |
| --quiet | -q | boolean | false | Suppress non-essential output. |

Examples:

# Human-readable status
neurolink proxy status

# Machine-readable JSON (for scripts)
neurolink proxy status --format json

JSON output shape (when --format json):

{
  "running": true,
  "pid": 12345,
  "port": 55669,
  "host": "127.0.0.1",
  "strategy": "round-robin",
  "startTime": "2025-03-22T10:00:00.000Z",
  "uptime": 3600000,
  "url": "http://127.0.0.1:55669",
  "fallbackChain": [{ "provider": "google-ai", "model": "gemini-2.5-pro" }]
}
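
For scripts that consume this output, a typed parse can catch shape changes early. The sketch below mirrors the fields in the sample above. The `ProxyStatus` interface and the `formatUptime` helper are illustrative names, not part of NeuroLink, and `uptime` is assumed to be in milliseconds (the sample value 3600000 would then be one hour).

```typescript
// Hypothetical shape of `neurolink proxy status --format json`,
// inferred from the sample output; not an official type.
interface ProxyStatus {
  running: boolean;
  pid: number;
  port: number;
  host: string;
  strategy: string;
  startTime: string; // ISO 8601 timestamp
  uptime: number; // assumed to be milliseconds
  url: string;
  fallbackChain: { provider: string; model: string }[];
}

// Render a millisecond duration as "<hours>h <minutes>m".
function formatUptime(ms: number): string {
  const totalMinutes = Math.floor(ms / 60000);
  return `${Math.floor(totalMinutes / 60)}h ${totalMinutes % 60}m`;
}

// In a real script this string would come from the command's stdout.
const sampleStatus: ProxyStatus = JSON.parse(
  '{"running":true,"pid":12345,"port":55669,"host":"127.0.0.1","strategy":"round-robin",' +
    '"startTime":"2025-03-22T10:00:00.000Z","uptime":3600000,"url":"http://127.0.0.1:55669",' +
    '"fallbackChain":[{"provider":"google-ai","model":"gemini-2.5-pro"}]}',
);

console.log(`${sampleStatus.url} up for ${formatUptime(sampleStatus.uptime)}`);
```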

neurolink proxy setup

One-command setup: login + start proxy + configure Claude Code.

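| Flag | Alias | Type | Default | Description |
|------|-------|------|---------|-------------|
| --port | -p | number | 55669 | Proxy port. |
| --host | | string | 127.0.0.1 | Proxy host/IP to bind to. |
| --method | | string | oauth | Authentication method. Choices: oauth, api-key. |
| --no-service | | boolean | false | Skip launchd install, just start foreground. |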

Examples:

# Full setup with defaults (OAuth login, port 55669, launchd service)
neurolink proxy setup

# Setup on a custom port
neurolink proxy setup -p 9000

# Login + start foreground (no auto-restart service)
neurolink proxy setup --no-service

What proxy setup does:

  1. Checks for existing authenticated accounts in the TokenStore.
  2. Falls back to the legacy ~/.neurolink/anthropic-credentials.json file.
  3. If no valid accounts are found, runs the OAuth login flow.
  4. Installs as macOS launchd service (auto-restart on crash/reboot). Use --no-service for foreground start.

Internal fail-open guard process. Spawned automatically by proxy start as a detached child. Monitors the proxy health endpoint and reverts Claude Code settings if the proxy dies unexpectedly.

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| --host | string | 127.0.0.1 | Proxy host to monitor. |
| --port | number | 55669 | Proxy port to monitor. |
| --parent-pid | number | (required) | PID of the parent proxy process. |
| --max-wait-ms | number | 0 | Maximum monitoring duration (0 = indefinite). |
| --failure-threshold | number | 5 | Consecutive health check failures before triggering cleanup. |
| --poll-interval-ms | number | 1000 | Interval between health checks in milliseconds. |
| --quiet | boolean | true | Suppress output (guards are silent by default). |

You should never need to run this command manually.
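
To make the --failure-threshold semantics concrete, here is a minimal sketch of a consecutive-failure counter. It is not the guard's actual implementation: `GuardState`, `recordHealthCheck`, and `firstCleanupIndex` are hypothetical names, and the sketch assumes that a single healthy check resets the counter (which is what "consecutive" suggests).

```typescript
// Illustrative only: tracks consecutive failed health checks.
interface GuardState {
  consecutiveFailures: number;
}

// Fold one health-check result into the state and report whether
// the failure threshold has been reached.
function recordHealthCheck(
  state: GuardState,
  healthy: boolean,
  failureThreshold = 5,
): { state: GuardState; triggerCleanup: boolean } {
  // Assumption: any healthy response resets the streak to zero.
  const consecutiveFailures = healthy ? 0 : state.consecutiveFailures + 1;
  return {
    state: { consecutiveFailures },
    triggerCleanup: consecutiveFailures >= failureThreshold,
  };
}

// Replay a sequence of poll results and return the index of the poll
// that would trigger cleanup, or -1 if the threshold is never reached.
function firstCleanupIndex(results: boolean[], failureThreshold = 5): number {
  let state: GuardState = { consecutiveFailures: 0 };
  return results.findIndex((healthy) => {
    const step = recordHealthCheck(state, healthy, failureThreshold);
    state = step.state;
    return step.triggerCleanup;
  });
}
```

With the default 1000 ms poll interval and a threshold of 5, a proxy that stops responding would be detected after roughly five seconds under these assumptions.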

neurolink proxy install

Install the proxy as a persistent macOS launchd service. The service auto-starts on login and auto-restarts on crash (5-second throttle). Currently macOS-only.

| Flag | Alias | Type | Default | Description |
|------|-------|------|---------|-------------|
| --port | -p | number | 55669 | Proxy port. |
| --host | | string | 127.0.0.1 | Proxy host/IP to bind to. |

Examples:

# Install with defaults (port 55669)
neurolink proxy install

# Install on custom port
neurolink proxy install -p 9000

What it does:

  1. Writes a launchd plist to ~/Library/LaunchAgents/com.neurolink.proxy.plist.
  2. Loads the service via launchctl load.
  3. The service runs neurolink proxy start --port <port> --host <host> --quiet.
  4. Logs go to ~/.neurolink/logs/proxy-launchd-stdout.log and proxy-launchd-stderr.log.

Management:

# Start/stop manually
launchctl start com.neurolink.proxy
launchctl stop com.neurolink.proxy

# Remove entirely
neurolink proxy uninstall

neurolink proxy uninstall

Remove the proxy launchd background service. Unloads the service and deletes the plist file. Currently macOS-only.

No flags.

Examples:

neurolink proxy uninstall

neurolink auth cleanup

Remove expired and disabled accounts from the token store.

| Flag | Type | Default | Description |
|------|------|---------|-------------|
| --force | boolean | false | Skip confirmation when removing disabled accounts. |

Examples:

# Interactive cleanup (prompts before removing disabled accounts)
neurolink auth cleanup

# Force cleanup without confirmation
neurolink auth cleanup --force

What it does:

  1. Prunes expired entries that have no refresh token.
  2. Finds permanently disabled entries (e.g., accounts that failed refresh).
  3. Prompts for confirmation before removing disabled accounts (unless --force).

neurolink auth enable

Re-enable a previously disabled account so it can be used by the proxy pool again.

| Argument | Type | Required | Description |
|----------|------|----------|-------------|
| <account> | string | Yes | Account key to re-enable (e.g., anthropic:1-VjRIq). |

Examples:

# Re-enable a disabled account
neurolink auth enable anthropic:1-VjRIq

Run neurolink auth list to see all accounts and their current status.


2. Config File (~/.neurolink/proxy-config.yaml)

The proxy loads its configuration from a YAML (or JSON) file. The default location is ~/.neurolink/proxy-config.yaml. Override it with --config.

YAML parsing uses js-yaml when available; otherwise falls back to JSON.parse.

Environment Variable Interpolation

All string values support ${VAR_NAME} and ${VAR_NAME:-default} syntax for environment variable resolution:

accounts:
  anthropic:
    - name: production
      apiKey: "${ANTHROPIC_API_KEY}" # resolved from env
    - name: backup
      apiKey: "${BACKUP_KEY:-sk-fallback-123}" # with default value

Resolution order:

  1. Look up VAR_NAME in process.env.
  2. If not found, use the :-default value when present.
  3. If no default, the literal ${VAR_NAME} token is preserved (validation will catch missing keys).
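
The resolution order above can be sketched as a single regex replacement. This is an illustration under stated assumptions rather than the loader's actual code: the function name `interpolate` is hypothetical, variable names are assumed to match `[A-Za-z_][A-Za-z0-9_]*`, and the environment is passed in as a plain object so the example does not depend on `process.env`.

```typescript
// Illustrative sketch of ${VAR_NAME} and ${VAR_NAME:-default} resolution.
function interpolate(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g,
    (token: string, varName: string, fallback: string | undefined) => {
      const fromEnv = env[varName];
      if (fromEnv !== undefined) return fromEnv; // 1. value from the environment
      if (fallback !== undefined) return fallback; // 2. the :-default value
      return token; // 3. keep the literal token for validation to flag
    },
  );
}

// Example: BACKUP_KEY is unset, so its default is used.
const exampleEnv = { ANTHROPIC_API_KEY: "sk-live-abc" };
console.log(interpolate("${ANTHROPIC_API_KEY}", exampleEnv));
console.log(interpolate("${BACKUP_KEY:-sk-fallback-123}", exampleEnv));
console.log(interpolate("${MISSING_KEY}", exampleEnv));
```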

Full Schema

# ---------------------------------------------------------------------------
# Top-level fields
# ---------------------------------------------------------------------------

# Schema version (optional, default: 1)
version: 1

# Default provider applied when not specified per-account (optional)
defaultProvider: "anthropic"

# Default base URL applied to accounts that omit baseUrl (optional)
defaultBaseUrl: "https://api.anthropic.com"

# ---------------------------------------------------------------------------
# accounts (REQUIRED)
# ---------------------------------------------------------------------------
# Map of provider names to arrays of account configurations.
# At least one provider with at least one account is required.
accounts:
  anthropic:
    - name: "personal-pro" # Human-readable label (default: "unnamed")
      apiKey: "${ANTHROPIC_KEY_1}" # API key or OAuth token (REQUIRED, non-empty)
      baseUrl: "https://api.anthropic.com" # Base URL override (optional)
      orgId: "org-abc123" # Organization ID (optional)
      weight: 2 # Weight for weighted round-robin (default: 1)
      enabled: true # Whether this account is active (default: true)
      rateLimit: 60 # Max requests per minute (optional)
      metadata: # Arbitrary metadata (optional)
        tier: "pro"
        notes: "Main account"

    - name: "team-max"
      apiKey: "${ANTHROPIC_KEY_2}"
      weight: 3
      enabled: true

# ---------------------------------------------------------------------------
# routing (optional)
# ---------------------------------------------------------------------------
# Controls model mapping, fallback chains, and routing strategy.
# Accepts both camelCase and kebab-case keys for YAML-friendliness.
routing:
  # Account selection strategy: "round-robin" | "fill-first"
  strategy: "round-robin"

  # Model mappings: remap incoming model names to different provider/model pairs
  # Accepts: model-mappings (kebab) or modelMappings (camel)
  model-mappings:
    - from: "claude-sonnet-4-20250514" # Model name sent by Claude Code
      to: "gemini-2.5-pro" # Target model name
      provider: "google-ai" # Target provider (default: "anthropic")

    - from: "claude-3-haiku-20240307"
      to: "gpt-4o-mini"
      provider: "openai"

  # Fallback chain: when all Claude accounts are exhausted, try these in order
  # Accepts: fallback-chain (kebab) or fallbackChain (camel)
  fallback-chain:
    - provider: "google-ai"
      model: "gemini-2.5-pro"
    - provider: "openai"
      model: "gpt-4o"

  # Passthrough models: model IDs that skip routing and go directly to Anthropic
  # Accepts: passthrough-models (kebab) or passthroughModels (camel)
  passthrough-models:
    - "claude-sonnet-4-20250514"
    - "claude-3-5-sonnet-20241022"
    - "claude-3-haiku-20240307"

# ---------------------------------------------------------------------------
# cloaking (optional)
# ---------------------------------------------------------------------------
# Cloaking pipeline for making proxy requests indistinguishable from
# genuine Claude Code sessions.
cloaking:
  # Mode: "auto" | "always" | "never"
  #   auto   - apply cloaking only to OAuth accounts (default behavior)
  #   always - apply to all accounts (OAuth and API key)
  #   never  - disable all cloaking plugins
  mode: "auto"

  plugins:
    # Strip proxy-revealing headers (x-forwarded-for, via, etc.)
    headerScrubber: true

    # Generate consistent session identities per account (1-hour TTL)
    sessionIdentity: true

    # Inject Claude Code session context into system prompt (OAuth only)
    systemPromptInjector: true

    # Zero-width character insertion into sensitive words
    wordObfuscator:
      enabled: true
      words: # Custom words to obfuscate
        - "proxy"
        - "neurolink"
        - "load balancer"
        - "round-robin"
        - "failover"
        - "multi-account"

    # TLS fingerprint mimicry (stub/placeholder -- not yet implemented)
    tlsFingerprint:
      enabled: false

Field Reference Table

Top-Level Fields

| Field | Type | Default | Required | Description |
|-------|------|---------|----------|-------------|
| version | number | 1 | No | Config schema version. |
| defaultProvider | string | (none) | No | Default provider name applied to accounts that omit it. |
| defaultBaseUrl | string | (none) | No | Default base URL applied to accounts that omit baseUrl. |
| accounts | Record<string, Account[]> | (none) | Yes | Map of provider names to account arrays. |
| routing | RoutingConfig | (none) | No | Routing strategy, model mappings, and fallback chain. |
| cloaking | CloakingConfig | (none) | No | Cloaking pipeline configuration. |

Account Fields

| Field | Type | Default | Required | Description |
|-------|------|---------|----------|-------------|
| name | string | "unnamed" | No | Human-readable account label. |
| apiKey | string | (none) | Yes | API key or OAuth token. Supports ${ENV_VAR} interpolation. |
| baseUrl | string | (none) | No | Override the provider's API base URL. |
| orgId | string | (none) | No | Organization ID (e.g., OpenAI organizations). |
| weight | number | 1 | No | Weight for weighted round-robin selection. Higher weight = more traffic. |
| enabled | boolean | true | No | Whether this account is active. Disabled accounts are skipped. |
| rateLimit | number | (none) | No | Maximum requests per minute for this account. |
| metadata | Record<string, unknown> | (none) | No | Arbitrary metadata (tier info, notes, tags). |

Routing Fields

| Field | Type | Default | Required | Description |
|-------|------|---------|----------|-------------|
| strategy | "round-robin" \| "fill-first" | (none) | No | Account selection strategy. round-robin rotates across accounts. fill-first uses one account until exhausted. |
| model-mappings / modelMappings | ModelMapping[] | [] | No | Array of model-to-model remapping rules. |
| fallback-chain / fallbackChain | FallbackEntry[] | [] | No | Ordered list of alternative providers to try when primary accounts are exhausted. |
| passthrough-models / passthroughModels | string[] | [] | No | Model IDs that bypass routing and go directly to Anthropic. |

ModelMapping Fields

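| Field | Type | Default | Required | Description |
|-------|------|---------|----------|-------------|
| from | string | "" | Yes | Incoming model name (what Claude Code requests). |
| to | string | "" | Yes | Target model name at the destination provider. |
| provider | string | "anthropic" | No | Target provider to route to. |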

FallbackEntry Fields

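| Field | Type | Default | Required | Description |
|-------|------|---------|----------|-------------|
| provider | string | "" | Yes | Provider name (e.g., google-ai, openai). |
| model | string | "" | Yes | Model to use at that provider. |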

Cloaking Fields

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| mode | "auto" \| "always" \| "never" | "auto" | auto applies cloaking only to OAuth accounts. always applies to all. never disables all plugins. |
| plugins.headerScrubber | boolean | false | Strip proxy-revealing headers (x-forwarded-for, via, sec-ch-*, etc.). |
| plugins.sessionIdentity | boolean | false | Generate consistent user_id/session_id per account with 1-hour TTL. |
| plugins.systemPromptInjector | boolean | false | Inject Claude Code session context (IDE metadata, timestamps) into system prompt. OAuth accounts only. |
| plugins.wordObfuscator.enabled | boolean | false | Insert zero-width characters into sensitive words to defeat string matching. |
| plugins.wordObfuscator.words | string[] | ["proxy", "neurolink", ...] | Words to obfuscate. Defaults include: proxy, neurolink, load balancer, round-robin, failover, multi-account. |
| plugins.tlsFingerprint.enabled | boolean | false | TLS fingerprint mimicry. Currently a stub/placeholder (no-op). |

Validation Rules

The config loader validates the following:

  • accounts must be present and be a non-array object.
  • Each provider key in accounts must map to an array.
  • Each account must have a non-empty string apiKey.
  • If version is present, it must be a number.
  • Plaintext API keys (not using ${ENV_VAR} references) trigger a warning.
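
A minimal sketch of these checks follows, for readers who want to pre-validate a config in their own tooling. It is not the loader's actual code: `validateConfig`, the result shape, and the message strings are all hypothetical, and "plaintext" is approximated here as "contains no ${...} reference".

```typescript
// Illustrative sketch of the validation rules listed above.
function validateConfig(config: unknown): { errors: string[]; warnings: string[] } {
  const errors: string[] = [];
  const warnings: string[] = [];
  const isObject = (v: unknown): v is Record<string, unknown> =>
    typeof v === "object" && v !== null && !Array.isArray(v);

  if (!isObject(config)) {
    return { errors: ["config must be an object"], warnings };
  }
  // If version is present, it must be a number.
  if (config.version !== undefined && typeof config.version !== "number") {
    errors.push("version must be a number");
  }
  // accounts must be present and be a non-array object.
  const accounts = config.accounts;
  if (!isObject(accounts)) {
    errors.push("accounts must be a non-array object");
    return { errors, warnings };
  }
  for (const [provider, list] of Object.entries(accounts)) {
    // Each provider key must map to an array.
    if (!Array.isArray(list)) {
      errors.push(`accounts.${provider} must be an array`);
      continue;
    }
    list.forEach((account: unknown, i: number) => {
      const apiKey = isObject(account) ? account.apiKey : undefined;
      if (typeof apiKey !== "string" || apiKey.length === 0) {
        errors.push(`accounts.${provider}[${i}].apiKey must be a non-empty string`);
      } else if (!/\$\{[^}]+\}/.test(apiKey)) {
        // Assumption: a key with no ${...} reference counts as plaintext.
        warnings.push(`accounts.${provider}[${i}].apiKey looks like a plaintext key`);
      }
    });
  }
  return { errors, warnings };
}
```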

3. Environment Variables

| Variable | Purpose | Used By |
|----------|---------|---------|
| ANTHROPIC_API_KEY | Anthropic API key. Used as a fallback credential when no OAuth accounts are found. | Proxy routes, Anthropic provider |
| ANTHROPIC_OAUTH_TOKEN | OAuth access token for Anthropic (alternative to stored tokens). | Anthropic provider, providerConfig |
| CLAUDE_OAUTH_TOKEN | Alias for ANTHROPIC_OAUTH_TOKEN. Checked as a fallback. | Anthropic provider, providerConfig |
| NEUROLINK_SKIP_MCP | Set to "true" to skip MCP server initialization. Automatically set by proxy start (tools come from Claude Code, not local MCP servers). | NeuroLink constructor |
| NEUROLINK_LOG_LEVEL | Log level for the NeuroLink logger. Values: error, warn, info, debug. | Logger utility |

Priority for Anthropic credentials (checked in order by the proxy routes):

  1. TokenStore compound keys -- anthropic:<label> entries in ~/.neurolink/tokens.json.
  2. Legacy credentials file -- ~/.neurolink/anthropic-credentials.json (only if no compound keys exist).
  3. ANTHROPIC_API_KEY env var -- Only if no OAuth accounts are found at all.

4. Claude Code Settings

When the proxy starts, it automatically writes to ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:55669",
    "ENABLE_TOOL_SEARCH": "true"
  }
}

| Key | Value | Description |
|-----|-------|-------------|
| ANTHROPIC_BASE_URL | http://<host>:<port> | Tells Claude Code to route all Anthropic API requests through the proxy. |
| ENABLE_TOOL_SEARCH | "true" | Enables tool search in Claude Code (required for full proxy compatibility). |

Lifecycle:

  • On proxy start -- Both keys are written (or merged into existing settings).
  • On proxy stop (Ctrl+C / SIGTERM) -- Both keys are removed. Other env keys in the settings file are preserved.
  • Fail-open guard -- If the proxy crashes without a clean shutdown, the detached guard process detects the unhealthy endpoint and removes the stale settings automatically.
  • Safety -- If the ANTHROPIC_BASE_URL has been changed to a different value (e.g., another proxy), the cleanup will not overwrite it.
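
The merge-and-revert behaviour described in the lifecycle list can be sketched as two pure functions over the parsed settings object. This is an illustration, not the proxy's code: `ClaudeSettings`, `applyProxyEnv`, and `removeProxyEnv` are hypothetical names, and the sketch assumes that when ANTHROPIC_BASE_URL points somewhere else, the cleanup leaves the whole file untouched.

```typescript
// Hypothetical view of ~/.claude/settings.json: an optional env map
// plus any other keys, which are passed through unchanged.
interface ClaudeSettings {
  env?: Record<string, string>;
  [key: string]: unknown;
}

// On proxy start: merge both keys into the existing settings.
function applyProxyEnv(settings: ClaudeSettings, proxyUrl: string): ClaudeSettings {
  return {
    ...settings,
    env: {
      ...settings.env,
      ANTHROPIC_BASE_URL: proxyUrl,
      ENABLE_TOOL_SEARCH: "true",
    },
  };
}

// On proxy stop (or guard cleanup): remove both keys, keep everything else.
function removeProxyEnv(settings: ClaudeSettings, proxyUrl: string): ClaudeSettings {
  const env = { ...settings.env };
  const current = env["ANTHROPIC_BASE_URL"];
  // Safety: if another tool has repointed the base URL, do not touch it.
  // Assumption: in that case nothing at all is modified.
  if (current !== undefined && current !== proxyUrl) return settings;
  delete env["ANTHROPIC_BASE_URL"];
  delete env["ENABLE_TOOL_SEARCH"];
  return { ...settings, env };
}
```

Keeping these as pure functions separates the merge logic from file I/O, so the same code path can serve both the clean shutdown handler and the detached guard.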

After starting the proxy, restart Claude Code for the new settings to take effect.


5. File Locations

All NeuroLink proxy files are stored under ~/.neurolink/ (with 0o700 directory permissions).

| File | Permissions | Description |
|------|-------------|-------------|
| ~/.neurolink/tokens.json | 0o600 | TokenStore -- Multi-provider OAuth token storage. Stores tokens keyed by provider:label (e.g., anthropic:personal). XOR-obfuscated by default (not plaintext). |
| ~/.neurolink/anthropic-credentials.json | 0o600 | Legacy credentials -- Single-account OAuth tokens. Used as a fallback when no compound keys exist in tokens.json. Updated on token refresh (pre-request or on-401). |
| ~/.neurolink/proxy-config.yaml | user default | Proxy config -- YAML/JSON configuration file. Loaded by proxy start (default path, overridable with --config). |
| ~/.neurolink/proxy-state.json | user default | Proxy state -- Runtime state persisted by the running proxy (PID, port, host, strategy, start time, fallback chain, guard PID). Used by proxy status and the fail-open guard. |
| ~/.neurolink/logs/proxy-YYYY-MM-DD.jsonl | 0o600 | Request logs -- One JSONL entry per proxied request. Rotated daily by date. Each entry contains: timestamp, requestId, method, path, model, stream flag, tool count, account label, response status, response time, error info, and token usage. |
| ~/.neurolink/logs/proxy-debug-YYYY-MM-DD.jsonl | 0o600 | Debug logs -- Full request/response debug entries. Includes complete request headers, body summary (model, max_tokens, message count, tool count, thinking config), response status, response headers, response body (first 2000 chars on errors), and duration. |
| ~/.neurolink/account-quotas.json | user default | Account quotas -- Cached quota/utilization data from Anthropic's unified-5h and unified-7d rate-limit headers. Flushed to disk every 5 seconds. |
| ~/.claude/settings.json | user default | Claude Code settings -- Auto-configured with ANTHROPIC_BASE_URL and ENABLE_TOOL_SEARCH when the proxy starts. Cleaned up on shutdown. |

TokenStore Details

The tokens.json file uses this internal structure (after deobfuscation):

{
  "version": "2.0",
  "lastModified": 1711100000000,
  "providers": {
    "anthropic:personal": {
      "tokens": {
        "accessToken": "...",
        "refreshToken": "...",
        "expiresAt": 1711103600000,
        "tokenType": "Bearer",
        "scope": "..."
      },
      "createdAt": 1711100000000,
      "lastAccessed": 1711100000000
    },
    "anthropic:team": {
      "tokens": { "...": "..." },
      "createdAt": 1711100000000,
      "lastAccessed": 1711100000000
    }
  }
}

The TokenStore class options:

  • encryptionEnabled (default: true) -- XOR obfuscation with a machine-derived key.
  • customStoragePath -- Override the default ~/.neurolink/tokens.json path.

Tokens are automatically refreshed 1 hour before expiration when a TokenRefresher function is registered.


6. Model Mapping Examples

Model mappings let you reroute specific model requests to different providers. The proxy's ModelRouter checks mappings in this order:

  1. Explicit mapping -- If the requested model has a from match in model-mappings, use the corresponding to/provider.
  2. Passthrough list -- If the model is in passthrough-models, route to Anthropic.
  3. Claude prefix -- Any model starting with claude- is routed to Anthropic.
  4. Unknown model -- Returns provider: null (the proxy will attempt Anthropic by default).
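
The lookup order above can be sketched as a short function. This is an illustration of the documented order, not the ModelRouter source: `resolveRoute`, `ModelMapping`, and `RouteResult` are hypothetical names.

```typescript
// Illustrative types for the routing config and the lookup result.
interface ModelMapping {
  from: string;
  to: string;
  provider?: string; // defaults to "anthropic"
}

interface RouteResult {
  provider: string | null;
  model: string;
}

function resolveRoute(
  model: string,
  mappings: ModelMapping[],
  passthrough: string[],
): RouteResult {
  // 1. Explicit mapping wins.
  const mapping = mappings.find((m) => m.from === model);
  if (mapping) {
    return { provider: mapping.provider ?? "anthropic", model: mapping.to };
  }
  // 2. Passthrough list goes to Anthropic unchanged.
  if (passthrough.includes(model)) return { provider: "anthropic", model };
  // 3. Any claude-* model goes to Anthropic.
  if (model.startsWith("claude-")) return { provider: "anthropic", model };
  // 4. Unknown model: no provider decided here.
  return { provider: null, model };
}
```

One consequence of this order is that a model listed in both model-mappings and passthrough-models is remapped, because the explicit mapping is checked first.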

Example: Route Haiku to a Cheaper Provider

routing:
  model-mappings:
    - from: "claude-3-haiku-20240307"
      to: "gpt-4o-mini"
      provider: "openai"

Claude Code requests claude-3-haiku-20240307 but the proxy sends the request to OpenAI's gpt-4o-mini instead, translating the request format via neurolink.generate().

Example: Use Gemini for All Sonnet Requests

routing:
  model-mappings:
    - from: "claude-sonnet-4-20250514"
      to: "gemini-2.5-pro"
      provider: "google-ai"
    - from: "claude-3-5-sonnet-20241022"
      to: "gemini-2.5-flash"
      provider: "google-ai"

Example: Passthrough Specific Models

routing:
  passthrough-models:
    - "claude-sonnet-4-20250514"
    - "claude-3-opus-20240229"
  model-mappings:
    - from: "claude-3-haiku-20240307"
      to: "gemini-2.5-flash"
      provider: "google-ai"

Here, Sonnet 4 and Opus requests go directly to Anthropic (passthrough), while Haiku requests are redirected to Gemini.

Example: No Routing (Pure Multi-Account Pool)

Omit the routing section entirely. All requests pass through to Anthropic using the configured accounts with round-robin rotation:

accounts:
  anthropic:
    - name: "account-1"
      apiKey: "${ANTHROPIC_KEY_1}"
    - name: "account-2"
      apiKey: "${ANTHROPIC_KEY_2}"
    - name: "account-3"
      apiKey: "${ANTHROPIC_KEY_3}"

7. Fallback Chain Examples

The fallback chain is tried in order when all primary Claude accounts are exhausted (rate-limited, errored, or cooling down). Each entry specifies a provider and model. The proxy translates the Claude-format request into the target provider's format using neurolink.generate() or neurolink.stream().

Example: Gemini then OpenAI

routing:
  fallback-chain:
    - provider: "google-ai"
      model: "gemini-2.5-pro"
    - provider: "openai"
      model: "gpt-4o"

Request flow:

  1. Try all Claude accounts (round-robin with retry).
  2. If all exhausted, try Google AI Studio with gemini-2.5-pro.
  3. If that also fails, try OpenAI with gpt-4o.

Example: Multiple Gemini Tiers

routing:
  fallback-chain:
    - provider: "google-ai"
      model: "gemini-2.5-pro"
    - provider: "google-ai"
      model: "gemini-2.5-flash"
    - provider: "openai"
      model: "gpt-4o-mini"

Falls back through progressively cheaper models.

Example: Vertex AI as Primary Fallback (Enterprise)

routing:
  fallback-chain:
    - provider: "google-vertex"
      model: "gemini-2.5-pro"
    - provider: "amazon-bedrock"
      model: "anthropic.claude-3-5-sonnet-20241022-v2:0"

Uses enterprise-grade providers (Vertex AI, Bedrock) as fallbacks. Requires the corresponding provider credentials to be configured in environment variables.

Example: Full Multi-Tier Setup

version: 1

accounts:
  anthropic:
    - name: "pro-personal"
      apiKey: "${CLAUDE_PRO_KEY}"
      weight: 1
    - name: "max-team"
      apiKey: "${CLAUDE_MAX_KEY}"
      weight: 3

routing:
  strategy: "round-robin"

  passthrough-models:
    - "claude-sonnet-4-20250514"

  model-mappings:
    - from: "claude-3-haiku-20240307"
      to: "gemini-2.5-flash"
      provider: "google-ai"

  fallback-chain:
    - provider: "google-ai"
      model: "gemini-2.5-pro"
    - provider: "openai"
      model: "gpt-4o"

cloaking:
  mode: "auto"
  plugins:
    headerScrubber: true
    sessionIdentity: true
    systemPromptInjector: true
    wordObfuscator:
      enabled: true
      words:
        - "proxy"
        - "neurolink"

This configuration:

  • Pools two Claude accounts with 1:3 weighting (Max gets 3x traffic).
  • Passes Sonnet 4 requests directly to Anthropic.
  • Redirects Haiku requests to Gemini Flash.
  • Falls back to Gemini Pro, then GPT-4o when Claude accounts are exhausted.
  • Applies cloaking to OAuth accounts (header scrubbing, session identity, system prompt injection, word obfuscation).
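
One simple way to picture the 1:3 weighting is a rotation in which each account appears as many times as its weight. The sketch below shows that idea only; the proxy's actual weighted round-robin algorithm is not documented here and may differ (for example, it may interleave accounts). `WeightedAccount` and `buildRotation` are hypothetical names.

```typescript
// Illustrative account shape: weight defaults to 1, enabled defaults to true.
interface WeightedAccount {
  name: string;
  weight?: number;
  enabled?: boolean;
}

// Expand accounts into a rotation where each name repeats `weight` times.
// Disabled accounts are skipped; weights below 1 are treated as 1.
function buildRotation(accounts: WeightedAccount[]): string[] {
  const rotation: string[] = [];
  for (const account of accounts) {
    if (account.enabled === false) continue;
    const weight = Math.max(1, Math.floor(account.weight ?? 1));
    for (let i = 0; i < weight; i++) rotation.push(account.name);
  }
  return rotation;
}

const rotation = buildRotation([
  { name: "pro-personal", weight: 1 },
  { name: "max-team", weight: 3 },
]);
// One slot for pro-personal, three for max-team.
console.log(rotation);
```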

Proxy Endpoints

For reference, the running proxy exposes these HTTP endpoints:

| Method | Path | Description |
|--------|------|-------------|
| POST | /v1/messages | Anthropic-compatible chat completions (main endpoint). |
| GET | /v1/models | List available models. |
| POST | /v1/messages/count_tokens | Token counting endpoint. |
| GET | /health | Health check. Returns { status, strategy, uptime }. |
| GET | /status | Detailed status with per-account stats, request counts, error rates. |

Log Rotation

Log files (proxy-*.jsonl and proxy-debug-*.jsonl) are automatically cleaned up to prevent unbounded growth.

| Parameter | Value | Description |
|-----------|-------|-------------|
| Max age | 7 days | Files older than 7 days are deleted. |
| Max total size | 500 MB | If remaining files exceed 500 MB, oldest are deleted first. |
| Cleanup triggers | Startup + hourly | Runs once at proxy start, then every 60 minutes. |

The cleanupLogs() function performs two passes:

  1. Age pass -- delete all files with mtime older than the cutoff.
  2. Size pass -- if remaining files exceed the size limit, delete oldest first until under the cap.
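
The two passes can be sketched as a pure selection function that decides which files to delete. This is not the cleanupLogs() source: `LogFile` and `selectLogsToDelete` are hypothetical names, and 500 MB is assumed to mean 500 × 1024 × 1024 bytes.

```typescript
// Illustrative file descriptor; in practice these values would come from fs.stat.
interface LogFile {
  path: string;
  mtimeMs: number;
  sizeBytes: number;
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;
const MAX_TOTAL_BYTES = 500 * 1024 * 1024; // assumption: binary megabytes

function selectLogsToDelete(
  files: LogFile[],
  nowMs: number,
  maxAgeMs = SEVEN_DAYS_MS,
  maxTotalBytes = MAX_TOTAL_BYTES,
): string[] {
  const toDelete: string[] = [];
  const cutoff = nowMs - maxAgeMs;

  // Age pass: everything older than the cutoff goes.
  const remaining = files.filter((file) => {
    if (file.mtimeMs < cutoff) {
      toDelete.push(file.path);
      return false;
    }
    return true;
  });

  // Size pass: delete oldest first until the total fits under the cap.
  remaining.sort((a, b) => a.mtimeMs - b.mtimeMs);
  let total = remaining.reduce((sum, file) => sum + file.sizeBytes, 0);
  for (const file of remaining) {
    if (total <= maxTotalBytes) break;
    toDelete.push(file.path);
    total -= file.sizeBytes;
  }
  return toDelete;
}
```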

Log rotation is non-fatal. If cleanup fails, the proxy continues operating normally.


Rate Limit Headers from Anthropic

The proxy captures and uses Anthropic's quota headers for per-account utilization tracking:

| Header | Format | Description |
|--------|--------|-------------|
| anthropic-ratelimit-unified-5h-utilization | float (0.0-1.0) | 5-hour rolling session utilization |
| anthropic-ratelimit-unified-5h-status | string | Session status (e.g., ok, warning) |
| anthropic-ratelimit-unified-5h-reset | integer (epoch) | When the 5-hour window resets |
| anthropic-ratelimit-unified-7d-utilization | float (0.0-1.0) | 7-day rolling weekly utilization |
| anthropic-ratelimit-unified-7d-status | string | Weekly status |
| anthropic-ratelimit-unified-7d-reset | integer (epoch) | When the 7-day window resets |
| anthropic-ratelimit-fallback-percentage | float | Fallback percentage threshold |
| anthropic-ratelimit-overage-status | string | Overage status |

These headers are parsed by parseQuotaHeaders() in accountQuota.ts and cached in memory with debounced persistence to ~/.neurolink/account-quotas.json. The neurolink auth list command displays per-account 5h and 7d utilization when available.
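
For orientation, the sketch below shows one way the per-window headers could be read into a structured value. It is not the parseQuotaHeaders() implementation: `WindowQuota` and `readWindow` are hypothetical names, header names are assumed to be lowercased already, and missing or non-numeric values are mapped to null.

```typescript
// Illustrative result for one rate-limit window (5h or 7d).
interface WindowQuota {
  utilization: number | null; // 0.0-1.0
  status: string | null;
  resetAt: number | null; // epoch value as sent by the server
}

// Parse a numeric header value, returning null when absent or not a finite number.
function toNumberOrNull(value: string | undefined): number | null {
  if (value === undefined || value.trim() === "") return null;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : null;
}

function readWindow(
  headers: Record<string, string | undefined>,
  window: "5h" | "7d",
): WindowQuota {
  const prefix = `anthropic-ratelimit-unified-${window}-`;
  return {
    utilization: toNumberOrNull(headers[`${prefix}utilization`]),
    status: headers[`${prefix}status`] ?? null,
    resetAt: toNumberOrNull(headers[`${prefix}reset`]),
  };
}
```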


Token Refresh

The proxy uses a reactive (not background) refresh strategy. There is no background timer polling for token expiry. Instead, tokens are refreshed on demand:

  1. Pre-request check -- Before each request, if the token's expiresAt <= now + 1 hour, the proxy refreshes it inline via POST https://api.anthropic.com/v1/oauth/token (fallback: https://console.anthropic.com/v1/oauth/token). On success, the credential file is updated atomically (write to .tmp, then rename).
  2. On-401 retry -- If Anthropic returns a 401 despite the pre-request check, the proxy refreshes the token and retries the request up to 5 times before failing over to the next account.
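
The pre-request condition in step 1 reduces to a single comparison. The sketch below assumes `expiresAt` is an epoch timestamp in milliseconds, as the TokenStore sample suggests; `needsRefresh` and `REFRESH_WINDOW_MS` are illustrative names, not identifiers from the proxy.

```typescript
// One hour, matching the "expiresAt <= now + 1 hour" rule above.
const REFRESH_WINDOW_MS = 60 * 60 * 1000;

// True when the token expires within the refresh window (or has already expired).
function needsRefresh(expiresAtMs: number, nowMs: number): boolean {
  return expiresAtMs <= nowMs + REFRESH_WINDOW_MS;
}

// A token expiring in 30 minutes is refreshed; one expiring in 2 hours is not.
const nowExample = Date.now();
console.log(needsRefresh(nowExample + 30 * 60 * 1000, nowExample));
console.log(needsRefresh(nowExample + 2 * 60 * 60 * 1000, nowExample));
```

Because the check runs inline before each request, an idle proxy performs no refresh traffic, which fits the reactive strategy described above.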