Pydantic is one of the most common ways to author JSON Schemas in Python. You define a model class, Pydantic generates the JSON Schema with model_json_schema(), and dottxt can enforce that schema through the OpenAI-compatible API.

Use with dottxt

Pydantic is a good fit when you want Python-native types, schemas defined as code, and runtime validation from the same model definitions.

Install

pip install openai pydantic

Basic usage

Generate JSON Schema from a Pydantic model and send it in response_format:
import os
from openai import OpenAI
from pydantic import BaseModel, ConfigDict, Field

class Contact(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str = Field(description="Full name")
    email: str = Field(description="Email address")
    role: str | None = Field(default=None, description="Job title")

client = OpenAI(
    base_url="https://api.dottxt.ai/v1",
    api_key=os.environ["DOTTXT_API_KEY"],
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[
        {"role": "user", "content": "Extract: John Smith <john@acme.com>, VP Engineering"}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "contact",
            "strict": True,
            "schema": Contact.model_json_schema(),
        },
    },
)

# Parse and validate the response with the same model that produced the schema
contact = Contact.model_validate_json(response.choices[0].message.content)

print(contact.name)
print(contact.email)
print(contact.role)
Set model_config = ConfigDict(extra="forbid") when you want strict object schemas: it adds additionalProperties: false to the generated schema. Without it, the generated schema allows extra properties. Pydantic also generates a title for every model and field. Those are omitted from the examples below for readability.
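As a quick check of the behavior described above, the following sketch compares the schema of a model with and without extra="forbid". It assumes Pydantic v2; the Loose and Strict model names are hypothetical and exist only for this illustration.

```python
from pydantic import BaseModel, ConfigDict


class Loose(BaseModel):
    # No model_config: the schema does not forbid extra properties.
    name: str


class Strict(BaseModel):
    # extra="forbid" adds "additionalProperties": false to the schema.
    model_config = ConfigDict(extra="forbid")

    name: str


loose_schema = Loose.model_json_schema()
strict_schema = Strict.model_json_schema()

# The key is absent on the loose model and False on the strict one.
print("additionalProperties" in loose_schema)
print(strict_schema["additionalProperties"])
```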

Add constraints and descriptions

Use Field() for constraints and descriptions. Use Literal for enum values:
from typing import Literal
from pydantic import BaseModel, ConfigDict, Field

class SupportTicket(BaseModel):
    model_config = ConfigDict(extra="forbid")

    category: Literal["billing", "account", "bug", "feature"] = Field(
        description="The area this ticket relates to."
    )
    priority: Literal["low", "medium", "high"] = Field(
        description="How urgently this ticket needs attention."
    )
    summary: str = Field(
        min_length=10,
        max_length=500,
        description="A brief description of the issue."
    )
    confidence: float = Field(
        ge=0.0,
        le=1.0,
        description="How confident the model is in the classification."
    )
Descriptions help guide generation, but they are not enforceable constraints like enum, pattern, or required.
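To see which parts of a Field() call become enforceable schema keywords, you can inspect the generated properties. This is a minimal sketch assuming Pydantic v2; TicketSketch is a hypothetical, trimmed-down version of the model above.

```python
from typing import Literal

from pydantic import BaseModel, ConfigDict, Field


class TicketSketch(BaseModel):
    model_config = ConfigDict(extra="forbid")

    priority: Literal["low", "medium", "high"] = Field(
        description="How urgently this ticket needs attention."
    )
    summary: str = Field(min_length=10, max_length=500)
    confidence: float = Field(ge=0.0, le=1.0)


props = TicketSketch.model_json_schema()["properties"]

# Literal becomes enum; string and numeric bounds become schema keywords.
print(props["priority"]["enum"])
print(props["summary"]["minLength"], props["summary"]["maxLength"])
print(props["confidence"]["minimum"], props["confidence"]["maximum"])

# The description is carried along as plain metadata.
print(props["priority"]["description"])
```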

Reference

Use the sections below as a reference for how common Pydantic patterns map to JSON Schema.

Enums

Use Literal when a field must be one of a fixed set of known values:
from typing import Literal
from pydantic import BaseModel, ConfigDict, Field

class Sentiment(BaseModel):
    model_config = ConfigDict(extra="forbid")

    label: Literal["positive", "negative", "neutral"]
    confidence: float = Field(ge=0.0, le=1.0)
Python’s enum.Enum also works. Pydantic puts the enum definition in $defs and references it:
from enum import Enum
from pydantic import BaseModel, ConfigDict

class Color(str, Enum):
    red = "red"
    green = "green"
    blue = "blue"

class Palette(BaseModel):
    model_config = ConfigDict(extra="forbid")

    primary: Color
    accent: Color
Prefer Literal when the values are only used once. Use Enum when you want to reuse the same set of values across fields or models.
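The difference between the two styles shows up in where the values land in the schema. The sketch below assumes Pydantic v2; PaletteSketch and its finish field are hypothetical additions used to put both styles in one model.

```python
from enum import Enum
from typing import Literal

from pydantic import BaseModel, ConfigDict


class Color(str, Enum):
    red = "red"
    green = "green"
    blue = "blue"


class PaletteSketch(BaseModel):
    model_config = ConfigDict(extra="forbid")

    primary: Color
    accent: Color
    finish: Literal["matte", "gloss"]  # hypothetical field for comparison


schema = PaletteSketch.model_json_schema()

# The Enum is defined once under $defs and referenced by both fields.
print(schema["$defs"]["Color"]["enum"])
print(schema["properties"]["primary"]["$ref"])
print(schema["properties"]["accent"]["$ref"])

# The Literal values are inlined on the field itself.
print(schema["properties"]["finish"]["enum"])
```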

Const

Use a single-value Literal[...] when a field must always have one exact value. Pydantic emits it as const in the generated schema:
from typing import Literal
from pydantic import BaseModel, ConfigDict

class SearchStep(BaseModel):
    model_config = ConfigDict(extra="forbid")

    action: Literal["search"]
    query: str
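A short check of the const mapping, assuming Pydantic v2. The model mirrors SearchStep above; the rejected flag is a hypothetical name used only to record the outcome.

```python
from typing import Literal

from pydantic import BaseModel, ConfigDict, ValidationError


class SearchStep(BaseModel):
    model_config = ConfigDict(extra="forbid")

    action: Literal["search"]
    query: str


action_schema = SearchStep.model_json_schema()["properties"]["action"]
print(action_schema["const"])

# Any other value fails validation when the output is parsed.
try:
    SearchStep(action="lookup", query="pydantic")
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```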

Optional and nullable fields

Optional fields (which may be omitted) and nullable fields (which may be null) produce different schema contracts, so choose between them deliberately:
from pydantic import BaseModel, ConfigDict, Field

class Lead(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str
    email: str
    company: str | None = None      # optional and nullable
    phone: str | None = None        # optional and nullable
    notes: str | None = Field(...)  # required but nullable
Fields typed as T | None with a default of None become nullable and optional: they accept null and are left out of required. Fields declared as T | None = Field(...) have no default, so they stay required but nullable. See Optional vs Null for the semantic difference.

Arrays and lists

Use list[T] for array fields, then add bounds on the list or its items as needed. Pydantic maps a list's min_length and max_length to minItems and maxItems:
from pydantic import BaseModel, ConfigDict, Field

class Survey(BaseModel):
    model_config = ConfigDict(extra="forbid")

    question: str
    options: list[str] = Field(min_length=2, max_length=6)
    tags: list[str] = Field(default_factory=list, max_length=5)
Setting bounds on arrays prevents the model from generating unbounded lists. See Bounded Arrays for more. If you need constraints on each item, put them on the item type:
from typing import Annotated
from pydantic import Field

Tag = Annotated[str, Field(min_length=1, max_length=40)]
tags: list[Tag] = Field(max_length=5)
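Putting the list bounds and the item constraints in one model makes the mapping easier to verify. The sketch below assumes Pydantic v2; SurveySketch is a hypothetical model that combines the two snippets above.

```python
from typing import Annotated

from pydantic import BaseModel, ConfigDict, Field

# Constraints on Tag apply to every item in the list.
Tag = Annotated[str, Field(min_length=1, max_length=40)]


class SurveySketch(BaseModel):
    model_config = ConfigDict(extra="forbid")

    options: list[str] = Field(min_length=2, max_length=6)
    tags: list[Tag] = Field(default_factory=list, max_length=5)


props = SurveySketch.model_json_schema()["properties"]

# List bounds become minItems / maxItems on the array.
print(props["options"]["minItems"], props["options"]["maxItems"])
print(props["tags"]["maxItems"])

# Item constraints become minLength / maxLength on "items".
print(props["tags"]["items"])
```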

Formats and specialized types

Use Pydantic’s specialized types when you want the generated schema to carry semantic format information:
from datetime import date
from pydantic import BaseModel, ConfigDict, EmailStr

class ContactRecord(BaseModel):
    model_config = ConfigDict(extra="forbid")

    email: EmailStr
    signup_date: date
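EmailStr adds format: email to the generated schema and date adds format: date, which a plain str field would not carry. EmailStr relies on the optional email-validator package, so install it with pip install "pydantic[email]" in addition to the packages listed under Install.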
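The format keyword can be inspected directly on the generated properties. This sketch assumes Pydantic v2 and deliberately uses only standard-library types so it runs without optional dependencies; AccountRecord and its field names are hypothetical.

```python
from datetime import date, datetime
from uuid import UUID

from pydantic import BaseModel, ConfigDict


class AccountRecord(BaseModel):
    model_config = ConfigDict(extra="forbid")

    account_id: UUID
    signup_date: date
    last_login: datetime
    nickname: str  # plain str for comparison: no format keyword


props = AccountRecord.model_json_schema()["properties"]

for field_name, field_schema in props.items():
    # Prints the format for semantic types and None for the plain string.
    print(field_name, field_schema.get("format"))
```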

Nested models

Use nested models to reuse object shapes and keep larger schemas maintainable. Nested models become $defs references in the generated schema:
from pydantic import BaseModel, ConfigDict

class Address(BaseModel):
    model_config = ConfigDict(extra="forbid")

    street: str
    city: str
    country: str

class Customer(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str
    billing_address: Address
    shipping_address: Address
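A quick way to confirm the reuse is to check that both address fields point at the same $defs entry. This is a sketch assuming Pydantic v2, using the same models as above.

```python
from pydantic import BaseModel, ConfigDict


class Address(BaseModel):
    model_config = ConfigDict(extra="forbid")

    street: str
    city: str
    country: str


class Customer(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str
    billing_address: Address
    shipping_address: Address


schema = Customer.model_json_schema()

# Address is defined once and referenced twice.
print(list(schema["$defs"]))
print(schema["properties"]["billing_address"]["$ref"])
print(schema["properties"]["shipping_address"]["$ref"])

# The nested definition keeps its own additionalProperties setting.
print(schema["$defs"]["Address"]["additionalProperties"])
```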

Discriminated unions

Use discriminated unions when the output can take one of several object shapes. Use Literal with Field(discriminator=...) to generate tagged oneOf schemas in Pydantic:
from typing import Literal
from pydantic import BaseModel, ConfigDict, Field

class SearchAction(BaseModel):
    model_config = ConfigDict(extra="forbid")

    action: Literal["search"]
    query: str

class LookupAction(BaseModel):
    model_config = ConfigDict(extra="forbid")

    action: Literal["lookup"]
    id: int

class AgentOutput(BaseModel):
    model_config = ConfigDict(extra="forbid")

    step: SearchAction | LookupAction = Field(discriminator="action")
Pydantic emits discriminator metadata in the generated schema, but the important part for dottxt is the oneOf structure and the const tag values on each branch. That is what makes the output unambiguous at generation time. See AnyOf Object Variants for the schema design side of this pattern.

Recursive models

Use recursive models for trees and other nested structures where items can contain more items of the same shape. Models that reference themselves produce recursive $defs:
from __future__ import annotations
from pydantic import BaseModel, ConfigDict, Field

class TreeNode(BaseModel):
    model_config = ConfigDict(extra="forbid")

    label: str
    children: list[TreeNode] = Field(default_factory=list, max_length=10)

class TreeResponse(BaseModel):
    model_config = ConfigDict(extra="forbid")

    tree: TreeNode
from __future__ import annotations enables forward references so the model can reference itself. Set bounds on recursive lists so generation does not expand without limit. Keep the recursive type under a named object property rather than using the recursive node itself as the top-level response schema.
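The sketch below checks the recursive reference and the list bound. It assumes Pydantic v2, and it quotes the self-referencing annotation instead of using from __future__ import annotations so the snippet can be pasted anywhere in a file; that substitution is a choice made for this sketch, and the sample tree is invented for illustration.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class TreeNode(BaseModel):
    model_config = ConfigDict(extra="forbid")

    label: str
    # Quoted annotation: a forward reference to the class being defined.
    children: "list[TreeNode]" = Field(default_factory=list, max_length=10)


class TreeResponse(BaseModel):
    model_config = ConfigDict(extra="forbid")

    tree: TreeNode


schema = TreeResponse.model_json_schema()
children_schema = schema["$defs"]["TreeNode"]["properties"]["children"]

# The items of "children" point back at the TreeNode definition.
print(children_schema["items"])
print(children_schema["maxItems"])

# Eleven children exceed the bound of ten, so parsing fails.
too_wide = {"label": "root", "children": [{"label": f"c{i}"} for i in range(11)]}
try:
    TreeResponse.model_validate({"tree": too_wide})
    bound_enforced = False
except ValidationError:
    bound_enforced = True
print(bound_enforced)
```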

Composition and inheritance

Use multiple inheritance to combine reusable field groups without repeating field definitions by hand:
from pydantic import BaseModel, ConfigDict

class Timestamped(BaseModel):
    created_at: str
    updated_at: str

class Authored(BaseModel):
    author: str

class Article(Timestamped, Authored):
    model_config = ConfigDict(extra="forbid")

    title: str
    body: str
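Inherited fields end up as ordinary properties of the subclass schema. The following sketch, assuming Pydantic v2 and the same models as above, checks that the result is one flat object with no references to the parent classes.

```python
from pydantic import BaseModel, ConfigDict


class Timestamped(BaseModel):
    created_at: str
    updated_at: str


class Authored(BaseModel):
    author: str


class Article(Timestamped, Authored):
    model_config = ConfigDict(extra="forbid")

    title: str
    body: str


schema = Article.model_json_schema()

# All five fields appear directly under "properties".
print(sorted(schema["properties"]))

# The subclass config applies to the whole flattened object.
print(schema["additionalProperties"])

# No $defs or allOf: the parents are not referenced in the schema.
print("$defs" in schema, "allOf" in schema)
```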

Validators do not affect schema

Use validators for application-side checks, not to shape the generated schema. Pydantic validators run at parse time but do not appear in the generated JSON Schema, so they cannot constrain generation. If you need to constrain generation, express the rule in the type annotation or Field(). In this example, the currency check runs only after generation, when the response is parsed:
from pydantic import BaseModel, ConfigDict, field_validator

class Invoice(BaseModel):
    model_config = ConfigDict(extra="forbid")

    amount: float
    currency: str

    @field_validator("currency")
    @classmethod
    def currency_must_be_valid(cls, v: str) -> str:
        if v not in ("USD", "EUR", "GBP"):
            raise ValueError("unsupported currency")
        return v
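To make the contrast concrete, the sketch below compares the validator-based model with one that expresses the same rule as a Literal. It assumes Pydantic v2; the InvoiceWithValidator and InvoiceWithLiteral names are hypothetical.

```python
from typing import Literal

from pydantic import BaseModel, ConfigDict, ValidationError, field_validator


class InvoiceWithValidator(BaseModel):
    model_config = ConfigDict(extra="forbid")

    amount: float
    currency: str

    @field_validator("currency")
    @classmethod
    def currency_must_be_valid(cls, v: str) -> str:
        if v not in ("USD", "EUR", "GBP"):
            raise ValueError("unsupported currency")
        return v


class InvoiceWithLiteral(BaseModel):
    model_config = ConfigDict(extra="forbid")

    amount: float
    currency: Literal["USD", "EUR", "GBP"]


validator_schema = InvoiceWithValidator.model_json_schema()["properties"]["currency"]
literal_schema = InvoiceWithLiteral.model_json_schema()["properties"]["currency"]

# The validator leaves no trace in the schema; the Literal becomes an enum.
print("enum" in validator_schema)
print(literal_schema["enum"])

# Both models still reject an unsupported currency at parse time.
rejections = []
for model in (InvoiceWithValidator, InvoiceWithLiteral):
    try:
        model(amount=10.0, currency="JPY")
        rejections.append(False)
    except ValidationError:
        rejections.append(True)
print(rejections)
```

The Literal version is the one that constrains generation, because the allowed values are visible in the schema that is sent with the request.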

Notes

  • Use Pydantic when you want Python types, runtime validation, and JSON Schema generation from one model definition. Use raw JSON Schema directly when you need full control over the output shape or keywords that do not map cleanly from Pydantic types.
  • Set ConfigDict(extra="forbid") when you want additionalProperties: false.
  • Use Literal for enums rather than json_schema_extra={"enum": [...]}.
  • Use validators for parse-time checks, not generation-time constraints.
  • See String Bounds, Bounded Arrays, AnyOf Object Variants, and Optional vs Null for schema design details that matter during generation.