Structured Output with JSON Schema
Problem
AI models return unstructured text by default:
- Inconsistent formatting
- Manual parsing required
- Type safety missing
- Error-prone extraction
- Difficult validation
Applications need structured, typed data:
- JSON objects for APIs
- Type-safe TypeScript interfaces
- Database records
- Form data
Solution
Use JSON schema to enforce structured output:
- Define TypeScript interfaces
- Generate JSON schemas
- Validate responses
- Type-safe parsing
- Error handling
Code
import { NeuroLink } from "@juspay/neurolink";
// Define your data structure
type ProductReview = {
productName: string;
rating: number;
sentiment: "positive" | "negative" | "neutral";
pros: string[];
cons: string[];
recommendationScore: number;
summary: string;
};
// JSON Schema for validation
const productReviewSchema = {
type: "object",
properties: {
productName: {
type: "string",
description: "Name of the product being reviewed",
},
rating: {
type: "number",
minimum: 1,
maximum: 5,
description: "Rating from 1 to 5 stars",
},
sentiment: {
type: "string",
enum: ["positive", "negative", "neutral"],
description: "Overall sentiment of the review",
},
pros: {
type: "array",
items: { type: "string" },
description: "List of positive aspects",
},
cons: {
type: "array",
items: { type: "string" },
description: "List of negative aspects",
},
recommendationScore: {
type: "number",
minimum: 0,
maximum: 100,
description: "Likelihood to recommend (0-100)",
},
summary: {
type: "string",
description: "Brief summary of the review",
},
},
required: [
"productName",
"rating",
"sentiment",
"pros",
"cons",
"recommendationScore",
"summary",
],
};
class StructuredOutputGenerator {
private neurolink: NeuroLink;
constructor() {
this.neurolink = new NeuroLink();
}
/**
* Extract structured data from text
*/
async extractStructured<T>(
prompt: string,
schema: any,
provider: string = "openai",
): Promise<T> {
const result = await this.neurolink.generate({
input: { text: prompt },
provider,
structuredOutput: {
type: "json",
schema,
},
});
// Parse and validate JSON
try {
const parsed = JSON.parse(result.content);
this.validateAgainstSchema(parsed, schema);
return parsed as T;
} catch (error: any) {
throw new Error(`Failed to parse structured output: ${error.message}`);
}
}
/**
* Basic schema validation
*/
private validateAgainstSchema(data: any, schema: any): void {
// Check required fields
if (schema.required) {
for (const field of schema.required) {
if (!(field in data)) {
throw new Error(`Missing required field: ${field}`);
}
}
}
// Check types
for (const [key, value] of Object.entries(data)) {
const fieldSchema = schema.properties?.[key];
if (!fieldSchema) continue;
const actualType = Array.isArray(value) ? "array" : typeof value;
if (fieldSchema.type !== actualType) {
throw new Error(
`Field "${key}" has wrong type. Expected ${fieldSchema.type}, got ${actualType}`,
);
}
// Validate enum
if (fieldSchema.enum && !fieldSchema.enum.includes(value)) {
throw new Error(
`Field "${key}" must be one of: ${fieldSchema.enum.join(", ")}`,
);
}
// Validate number ranges
if (fieldSchema.type === "number") {
if (fieldSchema.minimum !== undefined && value < fieldSchema.minimum) {
throw new Error(`Field "${key}" must be >= ${fieldSchema.minimum}`);
}
if (fieldSchema.maximum !== undefined && value > fieldSchema.maximum) {
throw new Error(`Field "${key}" must be <= ${fieldSchema.maximum}`);
}
}
}
}
/**
* Batch extraction with retry on validation failure
*/
async extractWithRetry<T>(
prompt: string,
schema: any,
maxRetries: number = 3,
): Promise<T> {
let lastError: Error | null = null;
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
return await this.extractStructured<T>(prompt, schema);
} catch (error: any) {
lastError = error;
console.error(`❌ Attempt ${attempt} failed: ${error.message}`);
if (attempt < maxRetries) {
console.log("🔄 Retrying with more explicit instructions...");
// Add validation error to prompt
prompt += `\n\nPrevious attempt failed validation: ${error.message}. Please ensure strict adherence to the schema.`;
}
}
}
throw lastError || new Error("Extraction failed");
}
}
// Usage Examples
async function example1_ProductReview() {
const generator = new StructuredOutputGenerator();
const reviewText = `
I recently purchased the UltraBook Pro laptop and I'm mostly impressed.
The build quality is excellent, the screen is gorgeous, and battery life
is amazing - easily lasts 12 hours. However, the keyboard feels a bit mushy
and it can get quite hot during intensive tasks. Overall, I'd recommend it
for productivity work but gamers should look elsewhere.
`;
const review = await generator.extractStructured<ProductReview>(
`Extract a structured review from this text: ${reviewText}`,
productReviewSchema,
);
console.log("✅ Extracted Review:");
console.log(JSON.stringify(review, null, 2));
}
// Example 2: Contact Information Extraction
type ContactInfo = {
name: string;
email: string;
phone?: string;
company?: string;
role?: string;
};
const contactSchema = {
type: "object",
properties: {
name: { type: "string" },
email: { type: "string", format: "email" },
phone: { type: "string" },
company: { type: "string" },
role: { type: "string" },
},
required: ["name", "email"],
};
async function example2_ContactExtraction() {
const generator = new StructuredOutputGenerator();
const text = `
Hi, I'm John Smith, Senior Engineer at TechCorp Inc.
You can reach me at [email protected] or call me at
+1-555-0123. Looking forward to connecting!
`;
const contact = await generator.extractStructured<ContactInfo>(
`Extract contact information from: ${text}`,
contactSchema,
);
console.log("✅ Extracted Contact:");
console.log(contact);
}
// Example 3: Database Record Generation
type UserProfile = {
userId: string;
username: string;
age: number;
interests: string[];
subscriptionTier: "free" | "basic" | "premium";
joinedDate: string;
};
const userProfileSchema = {
type: "object",
properties: {
userId: { type: "string", pattern: "^[A-Z0-9]{8}$" },
username: { type: "string", minLength: 3, maxLength: 20 },
age: { type: "number", minimum: 13, maximum: 120 },
interests: { type: "array", items: { type: "string" } },
subscriptionTier: { type: "string", enum: ["free", "basic", "premium"] },
joinedDate: { type: "string", format: "date" },
},
required: [
"userId",
"username",
"age",
"interests",
"subscriptionTier",
"joinedDate",
],
};
async function example3_DatabaseRecord() {
const generator = new StructuredOutputGenerator();
const userData = `
Create a user profile for Sarah Chen, a 28-year-old photography enthusiast
who also loves hiking and cooking. She's on our premium plan and joined
last month.
`;
const profile = await generator.extractStructured<UserProfile>(
userData,
userProfileSchema,
"anthropic", // Claude handles structured output well
);
console.log("✅ User Profile:");
console.log(profile);
}
// Main
async function main() {
console.log("=== Example 1: Product Review ===\n");
await example1_ProductReview();
console.log("\n=== Example 2: Contact Extraction ===\n");
await example2_ContactExtraction();
console.log("\n=== Example 3: Database Record ===\n");
await example3_DatabaseRecord();
}
main();
Explanation
1. JSON Schema Definition
Define structure upfront:
const schema = {
type: "object",
properties: {
field: { type: "string" },
},
required: ["field"],
};
2. Type Safety
Use TypeScript interfaces for compile-time checking:
type MyData = {
field: string;
};
const data = await extract<MyData>(prompt, schema);
// data.field is typed as string
3. Validation
Validate parsed JSON against schema:
- Required fields present
- Correct types
- Enum values valid
- Number ranges respected
4. Error Handling
Retry with enhanced prompt on validation failure:
prompt += `\nPrevious failed: ${error.message}`;
5. Provider Selection
Different providers handle structured output differently:
- OpenAI: Excellent JSON mode
- Anthropic: Good with clear schemas
- Google AI: NOTE - Cannot use tools with structured output
Variations
Nested Objects
Handle complex nested structures:
type Company = {
name: string;
employees: Array<{
name: string;
role: string;
department: {
name: string;
budget: number;
};
}>;
};
const companySchema = {
type: "object",
properties: {
name: { type: "string" },
employees: {
type: "array",
items: {
type: "object",
properties: {
name: { type: "string" },
role: { type: "string" },
department: {
type: "object",
properties: {
name: { type: "string" },
budget: { type: "number" },
},
required: ["name", "budget"],
},
},
required: ["name", "role", "department"],
},
},
},
required: ["name", "employees"],
};
Streaming Structured Output
Stream and validate incrementally:
async function streamStructuredOutput<T>(
prompt: string,
schema: any,
): Promise<T> {
let buffer = "";
const stream = await neurolink.stream({
input: { text: prompt },
structuredOutput: { type: "json", schema },
});
for await (const chunk of stream) {
if (chunk.type === "content-delta") {
buffer += chunk.delta;
process.stdout.write(chunk.delta);
}
}
return JSON.parse(buffer) as T;
}
Union Types
Handle multiple possible schemas:
type Response = SuccessResponse | ErrorResponse;
type SuccessResponse = {
status: "success";
data: any;
};
type ErrorResponse = {
status: "error";
error: string;
code: number;
};
async function parseResponse(text: string): Promise<Response> {
const result = await generator.extractStructured(text, responseSchema);
if (result.status === "success") {
return result as SuccessResponse;
} else {
return result as ErrorResponse;
}
}
Schema from TypeScript
Auto-generate schemas from interfaces:
import { zodToJsonSchema } from "zod-to-json-schema";
import { z } from "zod";
const UserSchema = z.object({
name: z.string(),
age: z.number().min(0).max(120),
email: z.string().email(),
});
const jsonSchema = zodToJsonSchema(UserSchema);
const user = await generator.extractStructured<z.infer<typeof UserSchema>>(
prompt,
jsonSchema,
);
Use Cases
| Use Case | Schema Complexity | Recommended Provider |
|---|---|---|
| Data extraction | Simple | OpenAI, Anthropic |
| Form filling | Medium | OpenAI |
| API responses | Medium | OpenAI, Google AI |
| Database records | Complex | OpenAI |
| Classification | Simple | Any provider |
| Sentiment analysis | Simple | Anthropic |
Best Practices
- Define schemas upfront: Don't rely on prompt engineering alone
- Use TypeScript types: Compile-time safety prevents runtime errors
- Validate responses: Don't trust AI output blindly
- Retry on failure: Validation errors can be recovered
- Test schemas: Verify with sample data before production
- Keep schemas simple: Complex nesting reduces accuracy