Create Chat Completion
Generate a model response from a message array, with support for streaming and tool calling.
Use this endpoint for real-time chat generations on the OpenAI-compatible API. This documentation covers the OpenAI-compatibleDocumentation Index
Fetch the complete documentation index at: https://docs.dottxt.ai/llms.txt
Use this file to discover all available pages before exploring further.
chat/completions endpoint. If an SDK defaults to the newer OpenAI Responses API, configure it to use chat completions instead.
Base URL
https://api.dottxt.ai/v1
Structured output
Useresponse_format with type: "json_schema" to constrain the model output to your schema.
category is always one of the four enum values. summary is between 10 and 120 characters. tags has 1–4 items. See the supported features for the full list of enforceable constraints.
Plain chat
Example response shape
Notes
- Set
stream: trueto receive server-sent events. - For model discovery, call
GET /models. - For auth setup, see Authentication.
- For failures, inspect the HTTP status and the
errorobject in the response body.
Authorizations
API key authentication. Include your key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
API keys can be created and managed in the dashboard.
Body
Request body for chat completions.
A list of messages comprising the conversation so far.
ID of the model to use.
"Qwen/Qwen3-30B-A3B-FP8"
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.
0
The maximum number of tokens to generate in the chat completion.
256
How many chat completion choices to generate for each input message.
1
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
0
Up to 4 sequences where the API will stop generating further tokens.
If set, partial message deltas will be sent as server-sent events.
false
What sampling temperature to use, between 0 and 2.
0.7
Controls which (if any) tool is called by the model.
A list of tools the model may call.
An alternative to sampling with temperature, called nucleus sampling.
1
A unique identifier representing your end-user.
Response
Chat completion generated successfully. When streaming, returns a series of SSE events.
Response from chat completions.
A list of chat completion choices.
The Unix timestamp of when the chat completion was created.
1703187200
A unique identifier for the chat completion.
"chatcmpl-abc123"
The model used for the chat completion.
"Qwen/Qwen3-30B-A3B-FP8"
The object type, always "chat.completion".
"chat.completion"
The system fingerprint of the model.
Usage statistics for the completion request.
{
"completion_tokens": 36,
"prompt_tokens": 24,
"total_tokens": 60
}