Intelligent AI Gateway

Enterprise-grade proxy for large language model inference. Secure, fast, and built for scale.

Available Models

Local LLM inference — no cloud costs, full privacy


API Endpoints

Integrate with your application using these endpoints

POST
/api/chat

Send a conversation with message history and get a response.

// Headers
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

// Request body
{
  "model": "qwen3:8b",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ]
}

// Response
{
  "model": "qwen3:8b",
  "message": { "role": "assistant", "content": "Hi there!" },
  "done": true
}
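A minimal client sketch for the request above, using only the Python standard library. The base URL and API key are placeholders; substitute your gateway's host and key.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: adjust to your gateway host
API_KEY = "YOUR_API_KEY"            # placeholder key

def build_chat_request(model, messages):
    """Prepare a POST /api/chat request with bearer auth and a JSON body."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/chat",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("qwen3:8b", [{"role": "user", "content": "Hello!"}])
# To send against a running gateway:
#   reply = json.load(urllib.request.urlopen(req))
#   print(reply["message"]["content"])
```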
POST
/api/generate

Send a single prompt and get a text completion.

// Headers
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

// Request body
{
  "model": "qwen3:8b",
  "prompt": "Explain AI in one sentence."
}

// Response
{
  "model": "qwen3:8b",
  "response": "AI is...",
  "done": true
}

All endpoints require an API key, passed in either the Authorization: Bearer or the X-API-Key header.

Authenticated Access

SHA-256 hashed API keys with multiple auth methods.
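A sketch of what SHA-256 key hashing typically looks like server-side (the gateway's actual implementation is not shown here): only the digest is stored, and incoming keys are hashed and compared in constant time.

```python
import hashlib
import hmac

def hash_key(api_key: str) -> str:
    """Return the hex SHA-256 digest of a raw API key; store only this."""
    return hashlib.sha256(api_key.encode()).hexdigest()

def verify(candidate: str, stored_digest: str) -> bool:
    """Hash the presented key and compare digests in constant time."""
    return hmac.compare_digest(hash_key(candidate), stored_digest)

stored = hash_key("YOUR_API_KEY")  # placeholder key
```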

Rate Protected

Per-key throttling with configurable request limits.

Full Audit Trail

Every request logged with latency, status, and source.
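One way to structure such an audit record, shown as a hedged sketch (field names are illustrative, not the gateway's actual log schema):

```python
import time

def audit_entry(key_id: str, path: str, status: int,
                started: float, source_ip: str) -> dict:
    """Build one audit-log record: who called what, how it ended, how long it took."""
    return {
        "key": key_id,                                        # which API key
        "path": path,                                         # endpoint hit
        "status": status,                                     # HTTP status code
        "latency_ms": round((time.monotonic() - started) * 1000, 1),
        "source": source_ip,                                  # caller address
        "ts": time.time(),                                    # wall-clock time
    }

entry = audit_entry("key-1", "/api/chat", 200, time.monotonic(), "10.0.0.5")
```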