Chat Completions
Create a model response for a given conversation. Accepts a list of messages and returns the model's next reply. Use this endpoint to power chat interfaces, AI assistants, content generation, and any text workflow.
This page documents the OpenAI-compatible Chat Completions endpoint. Use the openai Python package or any OpenAI-compatible SDK with OPENAI_BASE_URL=https://api.linkharbor.ai/v1. For Anthropic Messages API, use /anthropic/v1/messages instead.
Request Body
The ID of the model to use. Retrieve available models from GET /v1/models and replace your-model-name with a real ID from the live catalog.
A list of messages that make up the conversation. The model uses this history to generate the next reply.
The role of the message author. One of: system (sets assistant behavior), user (human input), or assistant (prior model replies).
The text content of the message.
If true, the response is streamed back as Server-Sent Events (SSE) instead of a single JSON object. Each chunk contains a delta with the incremental content. The stream ends with data: [DONE].
Sampling temperature. Higher values (e.g. 0.9) produce more creative, varied output. Lower values (e.g. 0.2) make responses more focused and deterministic. Adjust this or top_p, not both.
Maximum number of tokens to generate. The total of input tokens and this value cannot exceed the model's context window. Omit to use the model's default maximum.
Response
Returns a chat completion object. On success, the HTTP status is 200. On error, a JSON object with error type and message is returned instead.
Unique identifier for this completion, prefixed with chatcmpl-.
The model that generated this completion.
Array of generated choices. Each contains a message with role and content, and a finish_reason (e.g. stop when the model completes naturally, length when max_tokens is reached).
Token usage statistics for this request.
Number of tokens in the input messages.
Number of tokens in the generated response.
Total tokens used (prompt + completion).