POST /chat/completions

Create chat completion

Example request
curl --request POST \
  --url https://api.dottxt.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "max_tokens": 256,
  "messages": [
    {
      "content": "You are a helpful assistant.",
      "role": "system"
    },
    {
      "content": "What is a doubleword?",
      "role": "user"
    }
  ],
  "model": "Qwen/Qwen3-30B-A3B-FP8",
  "temperature": 0.7
}
'

Example response

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "A doubleword is a data unit that is twice the size of a standard word in computer architecture, typically 32 or 64 bits depending on the system.",
        "role": "assistant"
      }
    }
  ],
  "created": 1703187200,
  "id": "chatcmpl-abc123",
  "model": "Qwen/Qwen3-30B-A3B-FP8",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 36,
    "prompt_tokens": 24,
    "total_tokens": 60
  }
}
Use this endpoint for real-time chat generations. The API is OpenAI-compatible, so existing OpenAI SDKs and tooling work against it. If an SDK defaults to the newer OpenAI Responses API, configure it to use chat completions instead.

Base URL

https://api.dottxt.ai/v1
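
Requests can also be built from Python's standard library without an SDK. The sketch below mirrors the curl examples on this page (endpoint path, headers, and payload fields are from this document; the helper name is our own):

```python
import json
import urllib.request

BASE_URL = "https://api.dottxt.ai/v1"

def build_chat_request(model, messages, api_key, **params):
    """Build a urllib Request for POST /chat/completions.

    Payload fields mirror the curl examples on this page; extra keyword
    arguments (temperature, max_tokens, ...) are passed through as-is.
    """
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "openai/gpt-oss-20b",
    [{"role": "user", "content": "Summarize why batch processing is useful."}],
    api_key="YOUR_API_KEY",
    temperature=0.3,
    max_tokens=180,
)
# Sending the request (requires a valid key):
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())["choices"][0]["message"]["content"]
```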

Structured output

Use response_format with type: "json_schema" to constrain the model output to your schema.
curl https://api.dottxt.ai/v1/chat/completions \
  -H "Authorization: Bearer $DOTTXT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "messages": [
      { "role": "user", "content": "Classify: My card was charged twice for order ORD-9842. Need refund today." }
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "ticket",
        "schema": {
          "type": "object",
          "properties": {
            "category": {
              "type": "string",
              "enum": ["billing", "technical", "account", "shipping"]
            },
            "priority": {
              "type": "string",
              "enum": ["low", "medium", "high", "urgent"]
            },
            "summary": {
              "type": "string",
              "minLength": 10,
              "maxLength": 120
            },
            "tags": {
              "type": "array",
              "items": { "type": "string" },
              "minItems": 1,
              "maxItems": 4
            }
          },
          "required": ["category", "priority", "summary", "tags"],
          "additionalProperties": false
        }
      }
    }
  }'
Response
{
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "{\"category\": \"billing\", \"priority\": \"high\", \"summary\": \"Customer reports duplicate card charge on order ORD-9842, requesting refund\", \"tags\": [\"refund\", \"duplicate-charge\"]}"
      }
    }
  ]
}
category is always one of the four enum values. summary is between 10 and 120 characters. tags has 1–4 items. See the supported features for the full list of enforceable constraints.
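
Because the output is schema-constrained, it can be parsed without defensive fallbacks. A minimal sketch, using the field names from the schema above (the assertions restate the schema's constraints, which constrained decoding already guarantees):

```python
import json

# Content string as returned in choices[0].message.content above.
content = (
    '{"category": "billing", "priority": "high", '
    '"summary": "Customer reports duplicate card charge on order ORD-9842, '
    'requesting refund", "tags": ["refund", "duplicate-charge"]}'
)

ticket = json.loads(content)  # valid JSON under json_schema mode

# These checks mirror the schema constraints; they are shown only to
# make the guarantees explicit.
assert ticket["category"] in {"billing", "technical", "account", "shipping"}
assert 10 <= len(ticket["summary"]) <= 120
assert 1 <= len(ticket["tags"]) <= 4
```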

Plain chat

curl https://api.dottxt.ai/v1/chat/completions \
  -H "Authorization: Bearer $DOTTXT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "messages": [
      { "role": "system", "content": "You are a concise assistant." },
      { "role": "user", "content": "Summarize why batch processing is useful." }
    ],
    "temperature": 0.3,
    "max_tokens": 180
  }'

Example response shape

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1703187200,
  "model": "openai/gpt-oss-20b",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Batch processing reduces cost and improves throughput for non-urgent workloads."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 36,
    "total_tokens": 60
  }
}

Notes

  • Set stream: true to receive server-sent events.
  • For model discovery, call GET /models.
  • For auth setup, see Authentication.
  • For failures, inspect the HTTP status and the error object in the response body.
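
With stream: true, each server-sent event carries a data: line whose JSON chunk holds an incremental delta. A parsing sketch, assuming the chunks follow the standard OpenAI-compatible streaming shape (the sample bytes below are illustrative, not captured output):

```python
import json

# Illustrative SSE fragment; real chunks arrive incrementally over HTTP.
sse_body = (
    'data: {"choices": [{"delta": {"role": "assistant"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "Hello"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": " world"}}]}\n\n'
    "data: [DONE]\n\n"
)

def collect_stream(body: str) -> str:
    """Concatenate delta.content fields from an SSE response body."""
    parts = []
    for line in body.splitlines():
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":  # terminal sentinel
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

print(collect_stream(sse_body))  # -> Hello world
```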

Authorizations

Authorization
string
header
required

API key authentication. Include your key in the Authorization header:

Authorization: Bearer YOUR_API_KEY

API keys can be created and managed in the dashboard.

Body

application/json

Request body for chat completions.

messages
object[]
required

A list of messages comprising the conversation so far.

model
string
required

ID of the model to use.

Example:

"Qwen/Qwen3-30B-A3B-FP8"

frequency_penalty
number<float> | null

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.

Example:

0

max_tokens
integer<int32> | null

The maximum number of tokens to generate in the chat completion.

Example:

256

n
integer<int32> | null

How many chat completion choices to generate for each input message.

Example:

1

presence_penalty
number<float> | null

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.

Example:

0

stop
string[] | null

Up to 4 sequences where the API will stop generating further tokens.

stream
boolean | null

If set, partial message deltas will be sent as server-sent events.

Example:

false

temperature
number<float> | null

What sampling temperature to use, between 0 and 2.

Example:

0.7

tool_choice
any

Controls which (if any) tool is called by the model.

tools
object[] | null

A list of tools the model may call.

top_p
number<float> | null

An alternative to sampling with temperature, called nucleus sampling.

Example:

1

user
string | null

A unique identifier representing your end-user.

Response

Chat completion generated successfully. When streaming, returns a series of SSE events.

Response from chat completions.

choices
object[]
required

A list of chat completion choices.

created
integer<int64>
required

The Unix timestamp of when the chat completion was created.

Example:

1703187200

id
string
required

A unique identifier for the chat completion.

Example:

"chatcmpl-abc123"

model
string
required

The model used for the chat completion.

Example:

"Qwen/Qwen3-30B-A3B-FP8"

object
string
required

The object type, always "chat.completion".

Example:

"chat.completion"

system_fingerprint
string | null

The system fingerprint of the model.

usage
object

Usage statistics for the completion request.

Example:
{
  "completion_tokens": 36,
  "prompt_tokens": 24,
  "total_tokens": 60
}