Not every field can always be extracted. Some information may be absent from the source text, or may only be available after a secondary enrichment step. Making all fields required forces the model to hallucinate values for missing data; making all fields optional means your downstream code can never trust that anything is present. The right approach is to split fields into a required core (fields your application cannot function without) and optional enrichments (fields that add value when present but don’t block processing when absent). Both groups stay typed and bounded; optional does not mean unconstrained.

Use case

A lead qualification output in which lead_id, segment, and priority are always needed for routing, while company_size, tech_stack, and notes are available only when the source data mentions them.

Schema pattern

{
  "type": "object",
  "properties": {
    "lead_id": { "type": "string", "pattern": "^LEAD-[0-9]{4,10}$" },
    "segment": { "type": "string", "enum": ["smb", "mid_market", "enterprise"] },
    "priority": { "type": "string", "enum": ["low", "medium", "high"] },
    "company_size": { "type": "string", "enum": ["1-10", "11-50", "51-200", "201+"] },
    "tech_stack": {
      "type": "array",
      "items": { "type": "string", "minLength": 1, "maxLength": 40 },
      "maxItems": 10
    },
    "notes": { "type": "string", "maxLength": 300 }
  },
  "required": ["lead_id", "segment", "priority"],
  "additionalProperties": false
}
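
The constraints in this schema can be checked by hand to see what each one enforces. The following is a minimal sketch using only the Python standard library; it is not a general JSON Schema validator, and the function name validate_lead and the error messages are illustrative assumptions rather than part of any library.

```python
import re

# Mirrors the schema above. All names here are illustrative, not a library API.
LEAD_ID_RE = re.compile(r"^LEAD-[0-9]{4,10}$")
ENUMS = {
    "segment": {"smb", "mid_market", "enterprise"},
    "priority": {"low", "medium", "high"},
    "company_size": {"1-10", "11-50", "51-200", "201+"},
}
REQUIRED = ("lead_id", "segment", "priority")
ALLOWED = set(REQUIRED) | {"company_size", "tech_stack", "notes"}


def validate_lead(lead: dict) -> list:
    """Return a list of violations; an empty list means the object conforms."""
    errors = []

    # "required": the core must always be present.
    for key in REQUIRED:
        if key not in lead:
            errors.append(f"missing required field: {key}")

    # "additionalProperties": false rejects keys the schema does not declare.
    for key in sorted(lead.keys() - ALLOWED):
        errors.append(f"unexpected field: {key}")

    # Field-level checks run only when the field is present, so optional
    # fields stay typed and bounded without becoming mandatory.
    if "lead_id" in lead:
        value = lead["lead_id"]
        if not (isinstance(value, str) and LEAD_ID_RE.fullmatch(value)):
            errors.append("lead_id: does not match pattern")

    for key, allowed in ENUMS.items():
        if key in lead:
            value = lead[key]
            if not (isinstance(value, str) and value in allowed):
                errors.append(f"{key}: not one of the allowed values")

    if "tech_stack" in lead:
        stack = lead["tech_stack"]
        if not isinstance(stack, list) or len(stack) > 10:
            errors.append("tech_stack: must be a list of at most 10 items")
        elif not all(isinstance(s, str) and 1 <= len(s) <= 40 for s in stack):
            errors.append("tech_stack: items must be strings of 1 to 40 characters")

    if "notes" in lead:
        notes = lead["notes"]
        if not (isinstance(notes, str) and len(notes) <= 300):
            errors.append("notes: must be a string of at most 300 characters")

    return errors
```

In practice a real JSON Schema validator would normally replace this hand-written check; the sketch only makes the required/optional split explicit.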

Example outputs

Core-only output:
{
  "lead_id": "LEAD-9821",
  "segment": "mid_market",
  "priority": "high"
}

Enriched output:
{
  "lead_id": "LEAD-9821",
  "segment": "mid_market",
  "priority": "high",
  "company_size": "51-200",
  "tech_stack": ["salesforce", "hubspot"],
  "notes": "Team requested migration support in Q2."
}
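
On the consuming side, the two outputs above can be handled by the same code path. The sketch below assumes the object has already been validated against the schema; the function name summarize_lead and the summary format are hypothetical and chosen only for illustration.

```python
def summarize_lead(lead: dict) -> str:
    """Build a one-line routing summary from a validated lead object."""
    # Required core: index directly. After validation these keys are
    # guaranteed to exist, so no fallback is needed.
    parts = [f"{lead['lead_id']} -> {lead['segment']}:{lead['priority']}"]

    # Optional enrichments: use .get() and skip anything that is absent.
    size = lead.get("company_size")
    if size is not None:
        parts.append(f"size={size}")

    stack = lead.get("tech_stack") or []
    if stack:
        parts.append("stack=" + ",".join(stack))

    notes = lead.get("notes")
    if notes:
        parts.append(f"notes={notes}")

    return " | ".join(parts)


core_only = {"lead_id": "LEAD-9821", "segment": "mid_market", "priority": "high"}
print(summarize_lead(core_only))  # LEAD-9821 -> mid_market:high
```

The direct indexing on the required fields is deliberate: if one of them were ever missing, a KeyError would surface the contract violation immediately instead of silently routing on a default.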

Why this works

The required array in the JSON schema guarantees that any output that validates contains lead_id, segment, and priority, so your routing logic never hits a missing key. The optional fields (company_size, tech_stack, notes) are still fully typed and bounded with enums, maxLength, and maxItems, so when they do appear, they meet the same quality standards as the required fields. This also makes the schema forward-compatible. When you add a new enrichment field later, existing consumers continue working because they depend only on the required core. The new field appears in outputs that have the data and is absent from those that don’t. One caveat: because additionalProperties is false, any consumer that validates outputs against its own copy of the schema needs the updated schema before the new field starts appearing.
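
Adding an enrichment field later can be sketched as a change to properties that leaves required untouched. In the example below, the helper add_enrichment and the region field with its enum values are hypothetical, introduced only to illustrate the idea; the base schema is the one shown earlier.

```python
import copy

BASE_SCHEMA = {
    "type": "object",
    "properties": {
        "lead_id": {"type": "string", "pattern": "^LEAD-[0-9]{4,10}$"},
        "segment": {"type": "string", "enum": ["smb", "mid_market", "enterprise"]},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "company_size": {"type": "string", "enum": ["1-10", "11-50", "51-200", "201+"]},
        "tech_stack": {
            "type": "array",
            "items": {"type": "string", "minLength": 1, "maxLength": 40},
            "maxItems": 10,
        },
        "notes": {"type": "string", "maxLength": 300},
    },
    "required": ["lead_id", "segment", "priority"],
    "additionalProperties": False,
}


def add_enrichment(schema: dict, name: str, subschema: dict) -> dict:
    """Return a copy of the schema with one more optional, bounded field."""
    if name in schema["properties"]:
        raise ValueError(f"field already defined: {name}")
    evolved = copy.deepcopy(schema)  # leave the original schema untouched
    evolved["properties"][name] = subschema
    # "required" is deliberately not modified: the new field is optional.
    return evolved


# Hypothetical new enrichment field, still typed and bounded by an enum.
V2_SCHEMA = add_enrichment(
    BASE_SCHEMA, "region", {"type": "string", "enum": ["amer", "emea", "apac"]}
)
```

Because required is identical in both versions, a core-only output that satisfied the original schema also satisfies the evolved one.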