Optional and nullable look similar but mean different things:
  • Optional (not in required): the key may be absent entirely. This means “we didn’t ask” or “not applicable.”
  • Nullable (type: ["string", "null"], and listed in required): the key is always present, but the value may be null. This means “we asked, but the answer is unknown.”
The distinction matters for storage, analytics, and downstream logic. If you treat both as the same thing, you can’t tell whether a field was never captured or was captured and found to be empty, and that ambiguity cascades into every system that touches the data.
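To make the three possible states concrete, here is a minimal Python sketch that classifies a key in a parsed JSON object as absent, null, or holding a value. The helper name field_state is hypothetical and used only for illustration.

```python
import json


def field_state(record: dict, key: str) -> str:
    """Classify a key in a parsed JSON object as 'absent', 'null', or 'value'."""
    if key not in record:
        return "absent"  # optional field that was omitted entirely
    if record[key] is None:
        return "null"    # nullable field that is present but unknown
    return "value"


record = json.loads(
    '{"first_name": "Alice", "middle_name": null, "last_name": "Johnson"}'
)
print(field_state(record, "middle_name"))  # null
print(field_state(record, "nickname"))     # absent
print(field_state(record, "first_name"))   # value
```

Note that record.get(key) returns None for both an absent key and a null value, so a membership test is needed to keep the two cases apart.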

Use case

Consider CRM contact records in which nickname is a nice-to-have that the model may or may not extract (optional), while middle_name should always be present in the output and set to null when the source text doesn’t mention one.

Schema pattern

{
  "type": "object",
  "properties": {
    "first_name": { "type": "string", "minLength": 1 },
    "middle_name": { "type": ["string", "null"], "maxLength": 80 },
    "last_name": { "type": "string", "minLength": 1 },
    "nickname": { "type": "string", "maxLength": 80 }
  },
  "required": ["first_name", "middle_name", "last_name"],
  "additionalProperties": false
}
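
The rules this schema encodes can be spelled out as a hand-rolled check. The sketch below is not a general JSON Schema validator; it only mirrors the constraints of this one schema, and the names check_contact, REQUIRED, and ALLOWED are hypothetical. In practice a JSON Schema validation library would normally do this work.

```python
from typing import List

REQUIRED = ("first_name", "middle_name", "last_name")
ALLOWED = set(REQUIRED) | {"nickname"}


def check_contact(record: dict) -> List[str]:
    """Return the schema violations for a contact; an empty list means it conforms."""
    problems = []
    # "required": these keys must be present, even when the value is null.
    for key in REQUIRED:
        if key not in record:
            problems.append(f"missing required key: {key}")
    # "additionalProperties": false
    for key in record:
        if key not in ALLOWED:
            problems.append(f"unexpected key: {key}")
    # first_name and last_name: non-empty strings, never null.
    for key in ("first_name", "last_name"):
        if key in record:
            value = record[key]
            if not isinstance(value, str) or len(value) < 1:
                problems.append(f"{key} must be a non-empty string")
    # middle_name: nullable, so None is accepted alongside short strings.
    if "middle_name" in record:
        value = record["middle_name"]
        if value is not None and not (isinstance(value, str) and len(value) <= 80):
            problems.append("middle_name must be null or a string of at most 80 characters")
    # nickname: optional but not nullable, so None is rejected when the key is present.
    if "nickname" in record:
        value = record["nickname"]
        if not (isinstance(value, str) and len(value) <= 80):
            problems.append("nickname must be a string of at most 80 characters")
    return problems


print(check_contact({"first_name": "Alice", "middle_name": None, "last_name": "Johnson"}))
print(check_contact({"first_name": "Alice", "last_name": "Johnson"}))
```

The first call conforms because middle_name is present and null. The second is rejected because middle_name is missing, even though the same record would be fine if nickname were the missing key.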

Example outputs

Known middle name, no nickname:
{
  "first_name": "Alice",
  "middle_name": "Marie",
  "last_name": "Johnson"
}
Unknown middle name, nickname present:
{
  "first_name": "Alice",
  "middle_name": null,
  "last_name": "Johnson",
  "nickname": "AJ"
}

Why this works

In the first example, nickname is absent. The model didn’t extract one, and your application can skip rendering it entirely. In the second example, middle_name is explicitly null. The model looked for a middle name and didn’t find one, so your UI can show “Unknown” instead of leaving a blank gap.

This distinction is especially important for analytics and data pipelines. A COUNT of non-null middle_name values tells you how many contacts have known middle names. A COUNT of records where nickname exists tells you how many contacts had a nickname extracted. Without the distinction, a missing value could mean either “never captured” or “captured and found to be empty,” and neither count could be trusted.
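
The two counts can be sketched in Python over a small list of parsed records. The first two records are the example outputs above; the third is made up purely so that the two counts differ.

```python
records = [
    {"first_name": "Alice", "middle_name": "Marie", "last_name": "Johnson"},
    {"first_name": "Alice", "middle_name": None, "last_name": "Johnson", "nickname": "AJ"},
    # Illustrative record, not from the original examples.
    {"first_name": "Ravi", "middle_name": "Kumar", "last_name": "Patel"},
]

# Nullable field: the key is always present, so count the non-null values.
known_middle_names = sum(1 for r in records if r["middle_name"] is not None)

# Optional field: count the records in which the key exists at all.
with_nickname = sum(1 for r in records if "nickname" in r)

print(known_middle_names)  # 2
print(with_nickname)       # 1
```

Because middle_name is required, indexing with r["middle_name"] is safe here. For nickname, a membership test is the only reliable way to tell whether the key was emitted.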