Instructor patches the OpenAI client to return typed Pydantic objects instead of raw completions. Since dottxt exposes an OpenAI-compatible endpoint, you can use Instructor on top of the OpenAI Python SDK. The dottxt-specific detail: run Instructor in JSON mode, so your response_model is translated into JSON Schema and sent through dottxt's structured generation rather than relying on tool-calling behavior.

Instructor has excellent documentation covering advanced patterns such as validation, retries, partial streaming, and multi-modal extraction. This page covers the dottxt-specific setup; refer to the Instructor docs for everything else.

Install

pip install instructor openai pydantic

Configure

Create an OpenAI client pointed at dottxt, then patch it with Instructor in JSON mode:
import os
import instructor
from openai import OpenAI

client = instructor.from_openai(
    OpenAI(
        base_url="https://api.dottxt.ai/v1",
        api_key=os.environ["DOTTXT_API_KEY"],
    ),
    mode=instructor.Mode.JSON,
)

Basic usage

Define a Pydantic model and pass it as response_model. Instructor will derive JSON Schema from the model, send that schema to dottxt, and validate the response back into a Pydantic object:
from typing import Optional
from pydantic import BaseModel, ConfigDict, Field

class Contact(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str = Field(description="Full name")
    email: str = Field(description="Email address")
    role: Optional[str] = Field(default=None, description="Job title")

contact = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    response_model=Contact,
    messages=[
        {"role": "user", "content": "Extract: John Smith <john@acme.com>, VP Engineering"}
    ],
)

print(contact.name)   # "John Smith"
print(contact.email)  # "john@acme.com"
print(contact.role)   # "VP Engineering"
Instructor handles schema generation, request construction, and response parsing for you. The underlying API call still uses the same dottxt structured generation path described in API Overview and Pydantic Authoring.
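
Because the response is validated by Pydantic, you can attach validators to the model and have Instructor retry when validation fails. A sketch; the `normalize_email` validator and the `max_retries` value are illustrative choices, not part of the example above:

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, field_validator

class Contact(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str = Field(description="Full name")
    email: str = Field(description="Email address")
    role: Optional[str] = Field(default=None, description="Job title")

    @field_validator("email")
    @classmethod
    def normalize_email(cls, v: str) -> str:
        # Reject values without an "@" and normalize the rest to lowercase.
        if "@" not in v:
            raise ValueError("must be an email address")
        return v.lower()

# The validator runs on whatever the model returns. With Instructor you
# would pass max_retries=2 to chat.completions.create, so a ValidationError
# is fed back to the model as a correction prompt instead of raising.
contact = Contact(name="John Smith", email="JOHN@ACME.COM")
print(contact.email)  # "john@acme.com"
```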

What Instructor sends to dottxt

Under the hood, the Contact model above is converted into JSON Schema and sent to dottxt as structured output constraints:
{
  "type": "object",
  "properties": {
    "name": { "type": "string", "description": "Full name" },
    "email": { "type": "string", "description": "Email address" },
    "role": {
      "anyOf": [{ "type": "string" }, { "type": "null" }],
      "default": null,
      "description": "Job title"
    }
  },
  "required": ["name", "email"],
  "additionalProperties": false
}
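
You can inspect the derived schema locally without making a request. A sketch re-declaring the Contact model from above; note that Pydantic's `model_json_schema()` also emits `title` keys, which are harmless for constraint purposes:

```python
import json
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field

class Contact(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str = Field(description="Full name")
    email: str = Field(description="Email address")
    role: Optional[str] = Field(default=None, description="Job title")

schema = Contact.model_json_schema()
# extra="forbid" becomes "additionalProperties": false, and the
# Optional field drops out of "required".
print(json.dumps(schema, indent=2))
```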

Nested models and enums

Nested models, Literal enums, and field constraints all work the same way: define the models and pass the top-level one as response_model.

from typing import Literal, Optional
from pydantic import BaseModel, ConfigDict, Field

class Tag(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str
    confidence: float = Field(ge=0.0, le=1.0)

class TicketExtraction(BaseModel):
    model_config = ConfigDict(extra="forbid")

    title: str = Field(description="Short summary of the issue")
    priority: Literal["low", "medium", "high", "critical"]
    tags: list[Tag]
    assignee: Optional[str] = None

ticket = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    response_model=TicketExtraction,
    messages=[
        {
            "role": "user",
            "content": (
                "Parse this support ticket: "
                "URGENT: Payment gateway returning 500 errors on checkout. "
                "Tags: payments, backend, production-incident. "
                "Assign to the payments team."
            ),
        }
    ],
)
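
These constructs all translate into JSON Schema: nested classes land in `$defs`, `Literal` becomes an `enum`, and `ge`/`le` become `minimum`/`maximum`. A local check using the same models as above, no request needed:

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, Field

class Tag(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str
    confidence: float = Field(ge=0.0, le=1.0)

class TicketExtraction(BaseModel):
    model_config = ConfigDict(extra="forbid")

    title: str = Field(description="Short summary of the issue")
    priority: Literal["low", "medium", "high", "critical"]
    tags: list[Tag]
    assignee: Optional[str] = None

schema = TicketExtraction.model_json_schema()
# The Literal becomes an enum constraint on the priority property.
print(schema["properties"]["priority"]["enum"])
# The nested Tag model is referenced from $defs, with its numeric bounds.
print(schema["$defs"]["Tag"]["properties"]["confidence"])
```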

Streaming partial results

Use create_partial to yield progressively complete model instances as tokens stream in:
from typing import Optional
from pydantic import BaseModel, ConfigDict, Field

class Contact(BaseModel):
    model_config = ConfigDict(extra="forbid")

    name: str = Field(description="Full name")
    email: str = Field(description="Email address")
    role: Optional[str] = Field(default=None, description="Job title")

for partial in client.chat.completions.create_partial(
    model="openai/gpt-oss-20b",
    response_model=Contact,
    messages=[
        {"role": "user", "content": "Extract: Alice Chen <alice@startup.io>, CTO"}
    ],
):
    print(partial)

Each partial is a Contact instance whose not-yet-streamed fields are None, so the final iteration yields the complete object.

Notes

  • Use mode=instructor.Mode.JSON with dottxt so Instructor goes through the structured output path instead of defaulting to tool calling.
  • ConfigDict(extra="forbid") is useful when you want additionalProperties: false in the generated schema.
  • create_with_completion() returns both the parsed model and the raw completion, useful for inspecting token usage.
  • See the Pydantic authoring guide for how to write effective schemas.