Built-in Middleware Reference
NeuroLink includes three production-ready middleware components for common enterprise use cases: Analytics, Guardrails, and Auto-Evaluation. All three are battle-tested and can be enabled with minimal configuration.
Quick Start
Enable all built-in middleware with a single preset:
import { MiddlewareFactory } from "@juspay/neurolink";
const factory = new MiddlewareFactory({
preset: "all", // Enables analytics, guardrails, and auto-evaluation
});
Or enable specific middleware:
const factory = new MiddlewareFactory({
enabledMiddleware: ["analytics", "guardrails", "autoEvaluation"],
});
Analytics Middleware
Purpose
The Analytics Middleware collects comprehensive usage metrics, timing data, and operational analytics for all AI operations. It's essential for monitoring production applications, tracking costs, and understanding usage patterns.
Key Capabilities:
- Token usage tracking (input, output, total)
- Response time measurement
- Request success/failure tracking
- Provider and model information
- Automatic metrics storage in response metadata
Configuration
Basic Configuration:
import { MiddlewareFactory } from "@juspay/neurolink";
const factory = new MiddlewareFactory({
preset: "default", // Analytics enabled by default
});
Advanced Configuration:
const factory = new MiddlewareFactory({
middlewareConfig: {
analytics: {
enabled: true,
config: {
// Custom configuration options can be added here
// Currently analytics runs with default settings
},
},
},
});
Conditional Analytics (Production Only):
const factory = new MiddlewareFactory({
middlewareConfig: {
analytics: {
enabled: true,
conditions: {
custom: (context) => process.env.NODE_ENV === "production",
},
},
},
});
Collected Metrics
| Metric | Type | Description | Unit |
|---|---|---|---|
| requestId | string | Unique identifier for this request | - |
| timestamp | string | ISO 8601 timestamp | - |
| responseTime | number | Total request duration | milliseconds |
| usage.input | number | Input tokens consumed | tokens |
| usage.output | number | Output tokens generated | tokens |
| usage.total | number | Total tokens used | tokens |
Output Format
Analytics data is automatically added to the response metadata:
Generate Response:
import { NeuroLink } from "@juspay/neurolink";
const neurolink = new NeuroLink();
const result = await neurolink.generate({
input: { text: "Explain quantum computing" },
provider: "openai",
model: "gpt-4",
});
// Access analytics from response metadata
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;
console.log(analytics);
Analytics Object Structure:
{
"requestId": "analytics-1735689600000",
"responseTime": 1523,
"timestamp": "2026-01-01T00:00:00.000Z",
"usage": {
"input": 12,
"output": 256,
"total": 268
}
}
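For reference, the payload above can be described with a TypeScript shape. This is a sketch inferred from the example fields; NeuroLink does not necessarily export a type with this name:

```typescript
// Sketch of the analytics metadata shape, inferred from the example above.
interface AnalyticsMetadata {
  requestId: string; // e.g. "analytics-1735689600000"
  responseTime: number; // total request duration in milliseconds
  timestamp: string; // ISO 8601
  usage: {
    input: number; // input tokens consumed
    output: number; // output tokens generated
    total: number; // input + output
  };
}

const example: AnalyticsMetadata = {
  requestId: "analytics-1735689600000",
  responseTime: 1523,
  timestamp: "2026-01-01T00:00:00.000Z",
  usage: { input: 12, output: 256, total: 268 },
};
console.log(example.usage.input + example.usage.output); // 268
```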
Stream Response:
For streaming responses, analytics are available in the rawResponse:
const result = await neurolink.stream({
input: { text: "Write a story" },
});
// Analytics available in rawResponse
const streamAnalytics = result.rawResponse?.neurolink?.analytics;
console.log(streamAnalytics);
Stream Analytics Structure:
{
"requestId": "analytics-stream-1735689600000",
"startTime": 1735689600000,
"timestamp": "2026-01-01T00:00:00.000Z",
"streamingMode": true
}
Use Cases
1. Cost Tracking:
const result = await neurolink.generate({ input: { text: "..." } });
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;
// Calculate cost (example: $0.03 per 1K input tokens, $0.06 per 1K output tokens)
const inputCost = (analytics.usage.input / 1000) * 0.03;
const outputCost = (analytics.usage.output / 1000) * 0.06;
const totalCost = inputCost + outputCost;
console.log(`Request cost: $${totalCost.toFixed(4)}`);
2. Performance Monitoring:
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;
if (analytics.responseTime > 3000) {
console.warn(`Slow request detected: ${analytics.responseTime}ms`);
// Send alert to monitoring system
}
3. Usage Analytics Dashboard:
// Aggregate analytics over multiple requests
const requests = [];
for (const prompt of prompts) {
const result = await neurolink.generate({ input: { text: prompt } });
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;
requests.push(analytics);
}
// Calculate aggregates
const totalTokens = requests.reduce((sum, a) => sum + a.usage.total, 0);
const avgResponseTime =
requests.reduce((sum, a) => sum + a.responseTime, 0) / requests.length;
console.log(`Total tokens used: ${totalTokens}`);
console.log(`Average response time: ${avgResponseTime}ms`);
Integration with External Systems
Send to Datadog:
import { StatsD } from "node-dogstatsd";
const dogstatsd = new StatsD();
const result = await neurolink.generate({ input: { text: "..." } });
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;
dogstatsd.histogram("neurolink.response_time", analytics.responseTime);
dogstatsd.increment("neurolink.tokens.total", analytics.usage.total);
dogstatsd.increment("neurolink.requests.success");
Send to Prometheus:
import { register, Histogram, Counter } from "prom-client";
const responseTimeHistogram = new Histogram({
name: "neurolink_response_time_ms",
help: "Response time in milliseconds",
buckets: [100, 500, 1000, 2000, 5000],
});
const tokenCounter = new Counter({
name: "neurolink_tokens_total",
help: "Total tokens consumed",
});
const result = await neurolink.generate({ input: { text: "..." } });
const analytics = result.experimental_providerMetadata?.neurolink?.analytics;
responseTimeHistogram.observe(analytics.responseTime);
tokenCounter.inc(analytics.usage.total);
Guardrails Middleware
Purpose
The Guardrails Middleware provides comprehensive content filtering and policy enforcement to block or redact unsafe content, prevent prompt injection attacks, and maintain compliance with content policies.
Key Capabilities:
- Bad word filtering (configurable word list)
- AI model-based content safety evaluation
- Precall evaluation (block unsafe prompts before they reach the LLM)
- Stream and generate support
- Configurable filtering actions (block, redact, log)
Configuration
Basic Configuration:
import { MiddlewareFactory } from "@juspay/neurolink";
const factory = new MiddlewareFactory({
middlewareConfig: {
guardrails: {
enabled: true,
config: {
badWords: ["inappropriate", "unsafe", "prohibited"],
},
},
},
});
Advanced Configuration with Model-Based Filtering:
import { openai } from "@ai-sdk/openai";
const factory = new MiddlewareFactory({
middlewareConfig: {
guardrails: {
enabled: true,
config: {
// Basic word filtering
badWords: ["spam", "scam", "inappropriate"],
// AI model-based filtering
modelFilter: {
enabled: true,
filterModel: openai("gpt-3.5-turbo"), // Use a fast model for filtering
},
},
},
},
});
Precall Evaluation (Block Unsafe Prompts):
const factory = new MiddlewareFactory({
middlewareConfig: {
guardrails: {
enabled: true,
config: {
badWords: ["prohibited"],
// Precall evaluation blocks unsafe prompts before they reach the LLM
precallEvaluation: {
enabled: true,
provider: "openai",
evaluationModel: "gpt-4", // Model name as string
thresholds: {
safetyScore: 7, // Safety threshold (1-10 scale, higher = more restrictive)
appropriatenessScore: 6, // Appropriateness threshold (1-10 scale)
},
blockUnsafeRequests: true, // Block requests that fail evaluation
},
},
},
},
});
Built-in Filters
| Filter Type | Description | Action | Configuration |
|---|---|---|---|
| Bad Words | Block/redact specific words or phrases | Redact with *** | badWords: string[] |
| Model-Based | Use AI to evaluate content safety | Block if unsafe | modelFilter.enabled: true |
| Precall Evaluation | Block unsafe prompts before LLM call | Block request | precallEvaluation.enabled: true |
Bad Word Filtering
How It Works:
The bad word filter scans both requests and responses for prohibited terms and replaces them with ***.
Example:
// Input prompt: "This is an inappropriate message"
// With badWords: ["inappropriate"]
// Output: "This is an *** message"
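The replacement step can be sketched as follows. This is a hypothetical helper for illustration, not NeuroLink's internal implementation:

```typescript
// Hypothetical sketch of bad-word redaction; not NeuroLink's actual code.
function redactBadWords(text: string, badWords: string[]): string {
  let result = text;
  for (const word of badWords) {
    // Escape regex metacharacters so words are matched literally,
    // then replace every case-insensitive occurrence with ***.
    const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    result = result.replace(new RegExp(escaped, "gi"), "***");
  }
  return result;
}

console.log(redactBadWords("This is an inappropriate message", ["inappropriate"]));
// "This is an *** message"
```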
Configuration:
config: {
  badWords: [
    "inappropriate",
    "prohibited",
    "unsafe",
    // Add your organization's blocked terms
  ],
}
Model-Based Filtering
How It Works:
Uses a separate AI model to evaluate whether content is safe. The filter sends the content to the model with a safety evaluation prompt.
Safety Evaluation Prompt:
Is the following text safe? Respond with only "safe" or "unsafe".
Text: "[content to evaluate]"
Example:
// If model responds "unsafe", content is redacted
// Output: "<REDACTED BY AI GUARDRAIL>"
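The decision logic amounts to the following sketch. Here `evaluateWithModel` is a placeholder for invoking the configured filterModel, not a real NeuroLink API:

```typescript
// Sketch of the model-based filter decision. `evaluateWithModel` is a
// placeholder for a call to the configured filterModel.
async function filterWithModel(
  content: string,
  evaluateWithModel: (prompt: string) => Promise<string>,
): Promise<string> {
  const verdict = await evaluateWithModel(
    `Is the following text safe? Respond with only "safe" or "unsafe".\nText: "${content}"`,
  );
  // Anything other than an explicit "safe" verdict is treated as unsafe,
  // keeping the filter fail-closed.
  return verdict.trim().toLowerCase() === "safe"
    ? content
    : "<REDACTED BY AI GUARDRAIL>";
}
```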
Configuration:
import { openai } from "@ai-sdk/openai";
config: {
modelFilter: {
enabled: true,
filterModel: openai("gpt-3.5-turbo") // Fast, cost-effective model
}
}
Precall Evaluation
How It Works:
Evaluates the safety of the input prompt before it reaches the main LLM. If the prompt is deemed unsafe, the request is blocked entirely, saving costs and preventing unsafe content generation.
Evaluation Process:
- User submits a prompt
- The guardrails middleware intercepts the request in transformParams
- The safety evaluation model scores the prompt (1-10 scale)
- If a score falls below its threshold, the request is blocked
- If all scores meet their thresholds, the request proceeds to the main LLM
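The threshold check can be sketched as follows, assuming 1-10 scores matching the `thresholds` configuration; this is illustrative, not NeuroLink's internal code:

```typescript
// Hypothetical sketch of the precall blocking decision.
interface PrecallScores {
  safetyScore: number; // 1-10, higher = safer
  appropriatenessScore: number; // 1-10, higher = more appropriate
}

function shouldBlock(
  scores: PrecallScores,
  thresholds: PrecallScores,
  blockUnsafeRequests: boolean,
): boolean {
  if (!blockUnsafeRequests) return false;
  // Block if either score falls below its configured threshold.
  return (
    scores.safetyScore < thresholds.safetyScore ||
    scores.appropriatenessScore < thresholds.appropriatenessScore
  );
}

console.log(
  shouldBlock(
    { safetyScore: 4, appropriatenessScore: 9 },
    { safetyScore: 7, appropriatenessScore: 6 },
    true,
  ),
); // true
```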
Blocked Response:
{
"text": "<BLOCKED BY PRECALL GUARDRAILS>",
"usage": {
"promptTokens": 0,
"completionTokens": 0
}
}
Configuration:
config: {
precallEvaluation: {
enabled: true,
provider: "openai",
evaluationModel: "gpt-4", // Model for safety evaluation (string)
thresholds: {
safetyScore: 7, // Safety threshold (1-10 scale, default 7)
appropriatenessScore: 6, // Appropriateness threshold (1-10 scale, default 6)
},
blockUnsafeRequests: true, // Block requests that fail evaluation
actions: {
onUnsafe: "block",
onInappropriate: "sanitize",
onSuspicious: "warn",
},
}
}
Streaming Support
Guardrails work seamlessly with streaming responses:
const result = await neurolink.stream({
input: { text: "Generate a story" },
});
// Each chunk is filtered in real-time
for await (const chunk of result.stream) {
console.log(chunk); // Filtered content
}
Stream Filtering:
- Bad words are replaced with *** in each text delta
- Model-based filtering is not applied to streams (too slow)
- Precall evaluation works for streams
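Per-chunk redaction can be sketched as below. This is illustrative only; note that a naive per-chunk filter like this one would miss a bad word split across two deltas:

```typescript
// Hypothetical sketch of per-chunk bad-word redaction for a text stream.
async function* filterStream(
  chunks: AsyncIterable<string>,
  badWords: string[],
): AsyncGenerator<string> {
  for await (const chunk of chunks) {
    let filtered = chunk;
    for (const word of badWords) {
      // Replace every occurrence of the word within this delta.
      filtered = filtered.split(word).join("***");
    }
    yield filtered;
  }
}
```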
Use Cases
1. Content Moderation for User-Generated Prompts:
const factory = new MiddlewareFactory({
middlewareConfig: {
guardrails: {
enabled: true,
config: {
badWords: ["spam", "abuse", "harassment"],
precallEvaluation: {
enabled: true,
provider: "openai",
evaluationModel: "gpt-4",
thresholds: {
safetyScore: 9, // Strict filtering (1-10 scale)
appropriatenessScore: 8,
},
blockUnsafeRequests: true,
},
},
},
},
});
2. Compliance with Content Policies:
const factory = new MiddlewareFactory({
middlewareConfig: {
guardrails: {
enabled: true,
config: {
badWords: organizationBlocklist, // Your org's blocked terms
modelFilter: {
enabled: true,
filterModel: openai("gpt-3.5-turbo"),
},
},
conditions: {
providers: ["openai", "anthropic"], // Only for external providers
},
},
},
});
3. Protecting Against Prompt Injection:
const factory = new MiddlewareFactory({
middlewareConfig: {
guardrails: {
enabled: true,
config: {
precallEvaluation: {
enabled: true,
provider: "openai",
evaluationModel: "gpt-4",
thresholds: {
safetyScore: 8, // High safety threshold (1-10 scale)
appropriatenessScore: 7,
},
blockUnsafeRequests: true,
actions: {
onUnsafe: "block",
onInappropriate: "block",
onSuspicious: "block",
},
},
},
},
},
});