OpenAI compatibility - Capriole AI API

curl --request POST \
  --url https://api.caprioletech.com/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "google/gemini-3.1-flash-lite-preview",
  "messages": [
    {
      "role": "user",
      "content": "Tell me a short joke."
    }
  ]
}
'

{
  "id": "chatcmpl_49da8eb7916b43a3ab02442bc2841839",
  "object": "chat.completion",
  "created": 1774487511,
  "model": "google/gemini-3.1-flash-lite-preview",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here is a short joke."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 11,
    "completion_tokens": 6,
    "total_tokens": 17
  }
}

POST /v1/chat/completions

Authorizations

Authorization (string, header, required)
Use an API key created on the Capriole AI page. Send it as Authorization: Bearer sk-....
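As a rough illustration of the header format described above, the following Python sketch builds an authenticated request object without sending it. The URL and the two headers come from the example request on this page; the helper name `build_request` and the placeholder key are assumptions made for this sketch, not part of the API.

```python
import urllib.request

# Endpoint URL taken from the example request on this page.
API_URL = "https://api.caprioletech.com/v1/chat/completions"


def build_request(api_key: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the Bearer token.

    build_request is a hypothetical helper name used only in this sketch.
    """
    if not api_key:
        raise ValueError("an API key is required")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder key for illustration only; substitute a real key before sending.
request = build_request("sk-placeholder", b"{}")
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; that step is left out here so the sketch runs without network access or a real key.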

Body

application/json

model (string, required)
Model key from public.llm_models, for example google/gemini-3.1-flash-lite-preview.

messages (object[], required)
List of conversation messages. Minimum array length: 1

stream (boolean, default: false)
Streaming flag. Only false is supported in the current MVP.

temperature (number)
Sampling temperature.

top_p (number)
Top-p (nucleus) sampling value.

max_completion_tokens (integer)
Maximum number of output tokens for this completion.

max_tokens (integer)
Alias of max_completion_tokens, kept for compatibility.

stop
Optional stop sequence, or a list of stop sequences.
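The body fields above can be assembled on the client side roughly as follows. This is a minimal sketch: `build_chat_body` is a hypothetical helper, and two of its behaviours are assumptions rather than documented server rules: folding `max_tokens` into `max_completion_tokens`, and rejecting a request that sets both to different values. It also refuses `stream=True`, since only `false` is documented as supported.

```python
import json
from typing import Optional, Sequence, Union


def build_chat_body(
    model: str,
    messages: Sequence[dict],
    *,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_completion_tokens: Optional[int] = None,
    max_tokens: Optional[int] = None,
    stop: Union[str, Sequence[str], None] = None,
    stream: bool = False,
) -> dict:
    """Assemble a request body (hypothetical helper, not part of the API)."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    if stream:
        raise ValueError("only stream=false is supported in the current MVP")
    # Assumption of this sketch: treat conflicting limits as a client error.
    if (
        max_completion_tokens is not None
        and max_tokens is not None
        and max_completion_tokens != max_tokens
    ):
        raise ValueError("max_tokens and max_completion_tokens disagree")

    # max_tokens is documented as an alias, so send a single canonical field.
    limit = max_completion_tokens if max_completion_tokens is not None else max_tokens
    optional = {
        "temperature": temperature,
        "top_p": top_p,
        "max_completion_tokens": limit,
        "stop": stop,
    }
    body = {"model": model, "messages": list(messages)}
    # Omit fields the caller did not set instead of sending nulls.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_chat_body(
    "google/gemini-3.1-flash-lite-preview",
    [{"role": "user", "content": "Tell me a short joke."}],
    temperature=0.7,
    max_tokens=64,
)
print(json.dumps(body, indent=2))
```

Leaving unset fields out of the body keeps the request close to the minimal example on this page and lets the server apply its own defaults.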

reasoning_effort (enum<string>)
Unified reasoning effort hint.
Available options: none, minimal, low, medium, high, xhigh, max
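A client may want to check the reasoning_effort value locally before sending a request. The sketch below does so in Python against the seven options listed above; the helper name `with_reasoning_effort` is hypothetical, and how the server reacts to an unknown value is not stated on this page.

```python
# Allowed values copied from the option list for reasoning_effort.
REASONING_EFFORTS = ("none", "minimal", "low", "medium", "high", "xhigh", "max")


def with_reasoning_effort(body: dict, effort: str) -> dict:
    """Return a copy of body with a validated reasoning_effort field.

    Hypothetical helper for illustration; it does not modify the input dict.
    """
    if effort not in REASONING_EFFORTS:
        allowed = ", ".join(REASONING_EFFORTS)
        raise ValueError(f"unknown reasoning_effort {effort!r}; expected one of: {allowed}")
    return {**body, "reasoning_effort": effort}


base = {
    "model": "google/gemini-3.1-flash-lite-preview",
    "messages": [{"role": "user", "content": "Tell me a short joke."}],
}
print(with_reasoning_effort(base, "low"))
```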

reasoning (object)

thinking (object)

Response

Chat completion response

id (string, required)

object (string, required)

created (integer, required)

model (string, required)

choices (object[], required)

usage (object, required)
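To show how the response fields fit together, here is a small Python sketch that reads the assistant text and the token total out of a response. The nested field names (`message.content`, `usage.total_tokens`) are taken from the sample response on this page rather than from a formal schema, and `extract_reply` is a hypothetical helper.

```python
import json
from typing import Tuple

# Sample response copied from the example on this page.
SAMPLE_RESPONSE = """
{
  "id": "chatcmpl_49da8eb7916b43a3ab02442bc2841839",
  "object": "chat.completion",
  "created": 1774487511,
  "model": "google/gemini-3.1-flash-lite-preview",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Here is a short joke."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 11, "completion_tokens": 6, "total_tokens": 17}
}
"""

# Top-level fields marked as required in the response description.
REQUIRED_FIELDS = ("id", "object", "created", "model", "choices", "usage")


def extract_reply(response: dict) -> Tuple[str, int]:
    """Return (assistant text, total tokens) from a parsed response.

    Hypothetical helper; it only inspects the first entry in choices.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in response]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    if not response["choices"]:
        raise ValueError("response contains no choices")
    first_choice = response["choices"][0]
    return first_choice["message"]["content"], response["usage"]["total_tokens"]


text, total_tokens = extract_reply(json.loads(SAMPLE_RESPONSE))
print(text)
print(total_tokens)
```

Checking the required top-level fields first gives a clearer error than a bare KeyError if the server returns an error payload instead of a completion.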