A good schema is a contract between you and the model. The tighter the contract, the less work your application code does. If your schema says "enum": ["billing", "technical", "account"], your routing logic doesn’t need a default case. If it says "minItems": 1, your code doesn’t need an empty-array check. If it says "pattern": "^[A-Z]{2}$", you get country codes, not country names.

Most schemas we see in production are too loose. They define the structure but not the boundaries: no length limits on strings, no bounds on arrays, and no patterns on identifiers. The model fills in reasonable values most of the time, and then once in a thousand requests it produces a 4,000-character “summary” or an array with 200 items, and something downstream breaks.

This section is about writing schemas that don’t break. Start with Improve your schema if you have an existing schema, or pick a domain example close to your use case.
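To make the contract idea concrete, here is a minimal sketch of a tight schema written as a plain Python dict. The ticket-triage fields, the queue names, and the `route` helper are all hypothetical, chosen only to illustrate the three constraints mentioned above. It assumes the model output has already been generated under (or validated against) the schema before `route` is called.

```python
import json

# Hypothetical ticket-triage schema; every field name here is illustrative.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        # Closed set of values, so routing needs no default case.
        "category": {"type": "string", "enum": ["billing", "technical", "account"]},
        # At least one tag, and never a runaway list.
        "tags": {
            "type": "array",
            "items": {"type": "string", "minLength": 1, "maxLength": 40},
            "minItems": 1,
            "maxItems": 5,
        },
        # Two uppercase letters: a country code, not a country name.
        "country": {"type": "string", "pattern": "^[A-Z]{2}$"},
        # Bounded so an oversized "summary" cannot reach downstream code.
        "summary": {"type": "string", "minLength": 1, "maxLength": 280},
    },
    "required": ["category", "tags", "country", "summary"],
    "additionalProperties": False,
}

# Hypothetical queue names, one per enum value.
QUEUES = {
    "billing": "billing-queue",
    "technical": "support-queue",
    "account": "accounts-queue",
}


def route(ticket: dict) -> str:
    """Look up the queue directly; the enum is what makes a fallback unnecessary."""
    return QUEUES[ticket["category"]]


if __name__ == "__main__":
    # Serialize the dict to the JSON Schema document you would send with a request.
    print(json.dumps(TICKET_SCHEMA, indent=2))
```

The specific limits (5 tags, 280 characters) are placeholders. The point is that each one removes a defensive check from application code.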

Authoring

Create schemas from the tools you already use.

Pydantic

Define models in Python and generate JSON Schema from them.

Zod

Define schemas in TypeScript with runtime validation.

From sample data (quicktype)

Generate schemas from example JSON with quicktype.

From sample data (genson)

Infer a baseline schema from representative JSON instances.

Improve your schema

Turn domain knowledge into constraints that guide generation.

Patterns

The difference between a schema that works and one that breaks in production usually comes down to a few missing constraints. These patterns address the problems we see most often.

String bounds and patterns

Control length, format, and regex patterns on string fields.

Bounded arrays

Set min/max item counts to prevent runaway generation.

Optional fields

Define truly optional fields that the model can omit entirely.

Unions and discriminators

Route output to different shapes based on a discriminator field.

Conditional logic

Use if/then/else and dependent keywords when requirements vary by context.

Recursive schemas

Model trees, nested structures, and self-referencing types.

Chain of thought outputs

Constrain reasoning-style outputs into structured, inspectable fields.
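As a rough illustration of how several of these patterns combine, the sketch below pairs a discriminated union (`oneOf` branches keyed on a `const` field) with bounded strings and arrays. The schema, its field names, the numeric limits, and the `branch_for` helper are assumptions made for this example, not part of any library.

```python
# Hypothetical "next action" output: two shapes, told apart by the "kind" field.
ACTION_SCHEMA = {
    "oneOf": [
        {
            "type": "object",
            "properties": {
                "kind": {"const": "refund"},
                "amount_cents": {"type": "integer", "minimum": 1, "maximum": 50000},
            },
            "required": ["kind", "amount_cents"],
            "additionalProperties": False,
        },
        {
            "type": "object",
            "properties": {
                "kind": {"const": "escalate"},
                # String bounds: non-empty, but short enough to display.
                "reason": {"type": "string", "minLength": 1, "maxLength": 200},
                # Bounded array of patterned identifiers.
                "ticket_ids": {
                    "type": "array",
                    "items": {"type": "string", "pattern": "^T-[0-9]{1,8}$"},
                    "minItems": 1,
                    "maxItems": 10,
                },
            },
            "required": ["kind", "reason", "ticket_ids"],
            "additionalProperties": False,
        },
    ]
}


def branch_for(output: dict, schema: dict = ACTION_SCHEMA, key: str = "kind") -> dict:
    """Return the oneOf branch whose const discriminator matches the output."""
    for branch in schema["oneOf"]:
        if branch["properties"][key]["const"] == output.get(key):
            return branch
    raise ValueError(f"unknown {key}: {output.get(key)!r}")
```

Because each branch pins `kind` to a single `const`, exactly one branch can match a given output, which is what lets application code dispatch on that field alone.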

Domain examples

Complete schemas for real tasks, with the reasoning behind each constraint choice.

Classification

Enums, confidence scores, and grounded evidence.

Data extraction

Pull structured fields from invoices, receipts, and documents.

Form processing

Normalize messy user input into typed backend payloads.

API call generation

Map natural language to execution-ready API requests.

UI generation

Generate renderable form specs from product requirements.

Content generation

Structured marketing copy with length bounds and tone control.
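As one possible shape for the Classification case above, the following sketch combines an enum label, a bounded confidence score, and a short list of evidence quotes. The labels, limits, and the `ungrounded_quotes` helper are hypothetical. Note that a schema can bound the evidence array but cannot check that the quotes really occur in the input, so this sketch does that check in application code.

```python
# Hypothetical sentiment-classification schema; labels and limits are illustrative.
CLASSIFICATION_SCHEMA = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        # Confidence is a probability-like score, so bound it to [0, 1].
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        # One to three short quotes intended to be copied from the input text.
        "evidence": {
            "type": "array",
            "items": {"type": "string", "minLength": 1, "maxLength": 200},
            "minItems": 1,
            "maxItems": 3,
        },
    },
    "required": ["label", "confidence", "evidence"],
    "additionalProperties": False,
}


def ungrounded_quotes(result: dict, source_text: str) -> list[str]:
    """Return evidence quotes that do not appear verbatim in the source text."""
    return [quote for quote in result["evidence"] if quote not in source_text]
```

A caller might reject or retry any result for which `ungrounded_quotes` returns a non-empty list. Whether to retry is a design choice outside the schema.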

Reference

Data types

String, number, integer, boolean, null, const, object, array.

Schema composition

Combine and reuse schemas with allOf, anyOf, oneOf, not, $ref, and $defs.

Conditional logic

Model context-dependent requirements with if/then/else and dependent keywords.

Core type references: String, Number, Integer, Boolean, Null, Const, Object, Array.