Intelligent AI Gateway
Enterprise-grade proxy for large language model inference. Secure, fast, and built for scale.
Available Models
Local LLM inference — no cloud costs, full privacy
API Endpoints
Integrate with your application using these endpoints
POST
/api/chat
Send a conversation with message history and get a response.
// Headers
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

// Request body
{
  "model": "qwen3:8b",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ]
}

// Response
{
  "model": "qwen3:8b",
  "message": {
    "role": "assistant",
    "content": "Hi there!"
  },
  "done": true
}
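The request above can be assembled with nothing but the standard library. A minimal Python sketch, assuming the gateway runs at a local address (GATEWAY_URL and the helper name are illustrative, not part of the gateway itself):

```python
import json

GATEWAY_URL = "http://localhost:8080"  # assumption: your gateway's address
API_KEY = "YOUR_API_KEY"

def build_chat_request(model, messages):
    """Assemble the URL, headers, and JSON body for a POST /api/chat call."""
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return f"{GATEWAY_URL}/api/chat", headers, body

url, headers, body = build_chat_request(
    "qwen3:8b", [{"role": "user", "content": "Hello!"}]
)
# Send with any HTTP client, e.g.:
#   import urllib.request
#   req = urllib.request.Request(url, data=body.encode(), headers=headers, method="POST")
#   reply = json.load(urllib.request.urlopen(req))
```

Separating request construction from transport keeps the example client-agnostic; any HTTP library can send the resulting URL, headers, and body.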
POST
/api/generate
Send a single prompt and get a text completion.
// Headers
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

// Request body
{
  "model": "qwen3:8b",
  "prompt": "Explain AI in one sentence."
}

// Response
{
  "model": "qwen3:8b",
  "response": "AI is...",
  "done": true
}
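The same pattern works for /api/generate. This sketch authenticates with the alternative X-API-Key header that the gateway also accepts (again, the address and helper name are illustrative):

```python
import json

GATEWAY_URL = "http://localhost:8080"  # assumption: your gateway's address
API_KEY = "YOUR_API_KEY"

def build_generate_request(model, prompt):
    """Assemble a POST /api/generate call, authenticating via X-API-Key."""
    headers = {
        "X-API-Key": API_KEY,  # equivalent to Authorization: Bearer <key>
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "prompt": prompt})
    return f"{GATEWAY_URL}/api/generate", headers, body

url, headers, body = build_generate_request(
    "qwen3:8b", "Explain AI in one sentence."
)
```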
All endpoints require an API key, supplied via the Authorization: Bearer header or the X-API-Key header.
Authenticated Access
SHA-256 hashed API keys with multiple auth methods.
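Hashed key storage can be sketched in a few lines: the server keeps only a SHA-256 digest of each issued key and compares digests in constant time. The function names here are illustrative, not the gateway's actual internals:

```python
import hashlib
import hmac

def hash_key(api_key: str) -> str:
    # Store only the digest; the plaintext key is never persisted.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()

def verify_key(presented: str, stored_hash: str) -> bool:
    # hmac.compare_digest avoids timing side channels in the comparison.
    return hmac.compare_digest(hash_key(presented), stored_hash)

stored = hash_key("YOUR_API_KEY")  # persisted at key-creation time
```

A leaked database of digests cannot be replayed as keys, which is the point of hashing before storage.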
Rate Protected
Per-key throttling with configurable request limits.
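Per-key throttling can be illustrated with a fixed-window counter, one of several common schemes (the gateway's actual algorithm is not specified here; limits and names below are assumptions):

```python
import time

class RateLimiter:
    """Fixed-window throttle: at most `limit` requests per key per `window` seconds."""

    def __init__(self, limit: int = 60, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.counts = {}  # key -> [window_start, request_count]

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        if key not in self.counts:
            self.counts[key] = [now, 0]
        start, count = self.counts[key]
        if now - start >= self.window:   # window expired: start a fresh one
            self.counts[key] = [now, 0]
            count = 0
        if count < self.limit:
            self.counts[key][1] += 1
            return True
        return False                     # over limit: reject (e.g. HTTP 429)
```

Each key gets its own window, so one noisy client cannot exhaust another client's quota.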
Full Audit Trail
Every request logged with latency, status, and source.
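The audit record described above (latency, status, source) can be captured with a small wrapper around each request handler. A minimal sketch, assuming the handler returns an HTTP status code; field names are illustrative:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
audit = logging.getLogger("audit")

def log_request(handler, *, endpoint, source_ip, key_id):
    """Run a handler and emit one structured audit record, even on failure."""
    start = time.perf_counter()
    status = 500  # assume server error unless the handler completes
    try:
        status = handler()
        return status
    finally:
        latency_ms = (time.perf_counter() - start) * 1000.0
        audit.info(json.dumps({
            "endpoint": endpoint,
            "source": source_ip,
            "key": key_id,
            "status": status,
            "latency_ms": round(latency_ms, 2),
        }))

log_request(lambda: 200, endpoint="/api/chat", source_ip="203.0.113.7", key_id="key_01")
```

Emitting one JSON line per request keeps the trail machine-parseable for later querying.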