POST /v1/chat/completions
curl --request POST \
  --url https://api.caprioletech.com/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "google/gemini-3.1-flash-lite-preview",
  "messages": [
    {
      "role": "user",
      "content": "Tell me a short joke."
    }
  ]
}
'
{
  "id": "chatcmpl_49da8eb7916b43a3ab02442bc2841839",
  "object": "chat.completion",
  "created": 1774487511,
  "model": "google/gemini-3.1-flash-lite-preview",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here is a short joke."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 11,
    "completion_tokens": 6,
    "total_tokens": 17
  }
}

Authorizations

Authorization
string
header
required

Use an API key created on the Capriole AI page. Send it in the request header as Authorization: Bearer sk-....
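
As a rough illustration of the header described above, the following Python sketch prepares an authenticated request without sending it. It uses only the standard library. The URL is taken from the curl example; the helper name `build_request` and the placeholder key are assumptions made for this example, not part of the API.

```python
import json
import urllib.request

# Endpoint URL as shown in the curl example.
API_URL = "https://api.caprioletech.com/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request with Bearer authentication.

    `build_request` is a hypothetical helper name used only for illustration.
    """
    if not api_key:
        raise ValueError("api_key must be non-empty")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually send the request, pass the returned object to `urllib.request.urlopen` and decode the response body as JSON.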

Body

application/json
model
string
required

Model key from public.llm_models

messages
object[]
required
Minimum array length: 1
stream
boolean
default:false

Streaming flag. Only false is supported in the current MVP.

temperature
number

Sampling temperature

top_p
number

Top-p nucleus sampling value

max_completion_tokens
integer

Maximum output tokens for this completion

max_tokens
integer

Alias of max_completion_tokens for compatibility

stop

Optional stop sequence or stop sequence list

reasoning_effort
enum<string>

Unified reasoning effort hint

Available options: none, minimal, low, medium, high, xhigh, max
reasoning
object
thinking
object
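
The body fields above can be assembled and checked on the client side before a request is made. The sketch below is one possible way to do that in Python; the function name `build_payload` and the choice to validate locally are assumptions of this example, and the server may apply different or additional checks. It sends only `max_completion_tokens`, since `max_tokens` is documented as an alias, and it omits `reasoning` and `thinking` because their object shapes are not specified here.

```python
from typing import Optional, Sequence, Union

# Values listed under "Available options" for reasoning_effort.
REASONING_EFFORTS = {"none", "minimal", "low", "medium", "high", "xhigh", "max"}


def build_payload(
    model: str,
    messages: Sequence[dict],
    *,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_completion_tokens: Optional[int] = None,
    stop: Union[str, Sequence[str], None] = None,
    reasoning_effort: Optional[str] = None,
    stream: bool = False,
) -> dict:
    """Build a request body, including only the optional fields that were set.

    `build_payload` is a hypothetical helper, not part of the API.
    """
    if not messages:
        raise ValueError("messages must contain at least one item")
    if stream:
        raise ValueError("only stream=False is supported in the current MVP")
    if reasoning_effort is not None and reasoning_effort not in REASONING_EFFORTS:
        raise ValueError(f"unknown reasoning_effort: {reasoning_effort!r}")

    payload = {"model": model, "messages": list(messages)}
    optional = {
        "temperature": temperature,
        "top_p": top_p,
        "max_completion_tokens": max_completion_tokens,
        "stop": stop,
        "reasoning_effort": reasoning_effort,
    }
    # Leave unset fields out so the server applies its own defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For instance, `build_payload("google/gemini-3.1-flash-lite-preview", [{"role": "user", "content": "Tell me a short joke."}], reasoning_effort="low")` yields a body with `model`, `messages`, and `reasoning_effort` only.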

Response

Chat completion response

id
string
required
object
string
required
created
integer
required
model
string
required
choices
object[]
required
usage
object
required
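
To show how the response fields might be consumed, here is a small Python sketch that checks the required top-level fields and pulls out the first choice's text and the token total. The nested layout of `choices` and `usage` is inferred from the example response above rather than from a formal schema, and `summarize_completion` is a hypothetical helper name.

```python
import json

# Top-level fields marked "required" in the Response section.
REQUIRED_FIELDS = ("id", "object", "created", "model", "choices", "usage")


def summarize_completion(raw: str) -> dict:
    """Parse a chat completion JSON string and return a few key values.

    Assumes the nested shape shown in the example response.
    """
    body = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in body]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    if not body["choices"]:
        raise ValueError("response contains no choices")

    first = body["choices"][0]
    return {
        "text": first["message"]["content"],
        "finish_reason": first["finish_reason"],
        "total_tokens": body["usage"]["total_tokens"],
    }
```

Applied to the example response, this returns the assistant text "Here is a short joke.", the finish reason "stop", and a total of 17 tokens.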