Generate a model response from a message array, with support for streaming and tool calling.
Requests use the OpenAI-compatible chat/completions endpoint. If an SDK defaults to the newer OpenAI Responses API, configure it to use chat completions instead.
Base URL: https://api.dottxt.ai/v1
Use response_format with type: "json_schema" to constrain the model output to your schema.
In the constrained output, category is always one of the four enum values, summary is between 10 and 120 characters, and tags has 1–4 items. See the supported features for the full list of enforceable constraints.
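The constraints above can be expressed as a json_schema response format. A minimal sketch follows; the schema name, enum values, and field names are illustrative assumptions, not part of the API:

```python
import json

# Hypothetical schema matching the constraints described above:
# category limited to four enum values, summary bounded to 10-120
# characters, tags holding 1-4 items.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "ticket_summary",  # illustrative name
        "schema": {
            "type": "object",
            "properties": {
                "category": {
                    "type": "string",
                    "enum": ["bug", "feature", "question", "other"],
                },
                "summary": {
                    "type": "string",
                    "minLength": 10,
                    "maxLength": 120,
                },
                "tags": {
                    "type": "array",
                    "items": {"type": "string"},
                    "minItems": 1,
                    "maxItems": 4,
                },
            },
            "required": ["category", "summary", "tags"],
        },
    },
}

print(json.dumps(response_format, indent=2))
```

This object is passed as the response_format field of the request body.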
Set stream: true to receive server-sent events.
List available models with GET /models.
Failed requests return an error object in the response body.
The API uses API key authentication. Include your key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
API keys can be created and managed in the dashboard.
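As a sketch of the header above, the following builds (but does not send) an authenticated request against the base URL using only the standard library; the /models path is the model-listing endpoint mentioned earlier:

```python
import urllib.request

API_KEY = "YOUR_API_KEY"  # created and managed in the dashboard

# Build an authenticated request; urllib normalizes the header name.
req = urllib.request.Request(
    "https://api.dottxt.ai/v1/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)

print(req.get_header("Authorization"))
```

Sending the request with urllib.request.urlopen(req) (or any HTTP client) returns the JSON model list.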
Request body for chat completions.
A list of messages comprising the conversation so far.
ID of the model to use.
"Qwen/Qwen3-30B-A3B-FP8"
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.
0
The maximum number of tokens to generate in the chat completion.
256
How many chat completion choices to generate for each input message.
1
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
0
Up to 4 sequences where the API will stop generating further tokens.
If set, partial message deltas will be sent as server-sent events.
false
What sampling temperature to use, between 0 and 2.
0.7
Controls which (if any) tool is called by the model.
A list of tools the model may call.
An alternative to sampling with temperature, called nucleus sampling: the model considers only the tokens comprising the top_p probability mass.
1
A unique identifier representing your end-user.
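Putting the parameters above together, here is a sketch of a complete request body. Field names follow the OpenAI chat-completions convention assumed by this endpoint; values are the documented defaults except for the illustrative messages and stop sequence. Sending it is left to your HTTP client of choice:

```python
import json

# Request body assembled from the parameters documented above.
body = {
    "model": "Qwen/Qwen3-30B-A3B-FP8",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "max_tokens": 256,        # default
    "n": 1,                   # choices per input message (default)
    "temperature": 0.7,       # default
    "top_p": 1,               # default
    "frequency_penalty": 0,   # default
    "presence_penalty": 0,    # default
    "stop": ["\n\n"],         # up to 4 stop sequences
    "stream": False,          # default
}

print(json.dumps(body, indent=2))
```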
Chat completion generated successfully. When streaming, returns a series of SSE events.
Response from chat completions.
A list of chat completion choices.
The Unix timestamp of when the chat completion was created.
1703187200
A unique identifier for the chat completion.
"chatcmpl-abc123"
The model used for the chat completion.
"Qwen/Qwen3-30B-A3B-FP8"
The object type, always "chat.completion".
"chat.completion"
The system fingerprint of the model.
Usage statistics for the completion request.
{
"completion_tokens": 36,
"prompt_tokens": 24,
"total_tokens": 60
}
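The response fields documented above can be read back as ordinary JSON. The sketch below parses a sample response assembled from the documented example values; the assistant message content and the choice structure are illustrative assumptions:

```python
import json

# Sample response built from the documented example values
# (id, object, created, model, usage); the choice content is illustrative.
raw = json.dumps({
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1703187200,
    "model": "Qwen/Qwen3-30B-A3B-FP8",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "completion_tokens": 36,
        "prompt_tokens": 24,
        "total_tokens": 60,
    },
})

resp = json.loads(raw)
print(resp["choices"][0]["message"]["content"])
print(resp["usage"]["total_tokens"])
```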