Creates an OpenAI-compatible chat completion.
Use an API key created on the Capriole AI page. Send it in the Authorization header as Bearer sk-....
Model key from public.llm_models
Streaming flag. Only false is supported in the current MVP.
Sampling temperature
Top-p nucleus sampling value
Maximum output tokens for this completion
Alias of max_completion_tokens for compatibility
Optional stop sequence or stop sequence list
Unified reasoning effort hint
Allowed values: none, minimal, low, medium, high, xhigh, max
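
The parameters described above can be assembled into a request roughly as follows. This is a minimal sketch, not the service's official client. The endpoint URL is a placeholder, and every field name except max_completion_tokens (model, messages, stream, temperature, top_p, stop, reasoning_effort) is an assumption borrowed from the usual OpenAI chat completion convention, since the source lists only descriptions. The function builds the request without sending it.

```python
import json
import urllib.request

# Placeholder URL: the source does not state the endpoint address.
BASE_URL = "https://api.example.com/v1/chat/completions"

# Values listed in the source for the reasoning effort hint.
ALLOWED_EFFORTS = {"none", "minimal", "low", "medium", "high", "xhigh", "max"}


def build_chat_request(api_key, model, messages, *, temperature=None,
                       top_p=None, max_completion_tokens=None,
                       stop=None, reasoning_effort=None):
    """Build (but do not send) a chat completion request.

    Field names other than max_completion_tokens are assumed, not
    confirmed by the source.
    """
    if reasoning_effort is not None and reasoning_effort not in ALLOWED_EFFORTS:
        raise ValueError(f"unsupported reasoning effort: {reasoning_effort!r}")

    # Only stream=False is supported in the current MVP, so it is fixed here.
    body = {"model": model, "messages": messages, "stream": False}

    optional = {
        "temperature": temperature,
        "top_p": top_p,
        "max_completion_tokens": max_completion_tokens,
        "stop": stop,  # a single string or a list of strings
        "reasoning_effort": reasoning_effort,
    }
    # Leave out anything the caller did not set.
    body.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen after replacing BASE_URL with the real endpoint. Fixing stream to false in the builder keeps callers from requesting a mode the MVP does not support.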