Drop-in OpenAI-style API · local & cloud
Trinix API
Works like OpenAI with the clients you already use, often at lower cost, and gives you one surface for local and cloud models. Production endpoints for chat, generation, embeddings, and web search, with documented, predictable limits. New here? Use the browser playground to try the free models before you create an account.
api.trinix.gg
Rate limit 100/min per IP & per key/session
Body 1 MB max
🧠 Why this matters
Fresh, grounded answers shouldn’t live behind a separate product lane or an enterprise upsell. Here, search is exposed the same way as everything else: a plain HTTP call.
POST /api/v1/web-search
👉 It is just another endpoint: the same X-API-Key as /api/v1/chat, the same integration pattern, and no second vendor or special SDK for “grounding.” Builders wire it in like any other endpoint and move on.
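A minimal Python sketch of that call, using only the standard library. Note the request-body shape (`{"query": ...}`) is an assumption for illustration; this page does not document the exact fields.

```python
import json
import urllib.request

API_BASE = "https://api.trinix.gg"

def build_web_search_request(query: str, api_key: str) -> urllib.request.Request:
    """Build a POST /api/v1/web-search request.

    The {"query": ...} body shape is an assumption, not confirmed by the docs.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/v1/web-search",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

# To actually send it:
# with urllib.request.urlopen(build_web_search_request("latest Rust release", "YOUR_API_KEY")) as resp:
#     print(resp.read())
```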
Authentication
Use X-API-Key on native endpoints and either X-API-Key or bearer auth on OpenAI-compatible endpoints.
X-API-Key: YOUR_API_KEY
Authorization: Bearer YOUR_API_KEY
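In practice you pick the header per endpoint family. A small sketch (the helper name is mine, not part of the API; header names are from this page):

```python
def auth_headers(api_key: str, openai_compatible: bool = False) -> dict:
    """Native endpoints require X-API-Key; OpenAI-compatible endpoints
    accept either X-API-Key or bearer auth, shown here as a bearer token."""
    if openai_compatible:
        return {"Authorization": f"Bearer {api_key}"}
    return {"X-API-Key": api_key}
```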
Endpoints
Native API
GET /api/v1/models
GET /api/v1/queue-status
POST /api/v1/chat
POST /api/v1/generate
POST /api/v1/embed
POST /api/v1/web-search
OpenAI-compatible
POST /v1/chat/completions
GET /v1/models
Task API
POST /task/v1/chat/completions
GET /task/v1/models
Models
- ⚡ trinix-chat · free
- 💻 trinix-code · free
- 🧠 trinix-reason · free
- 📐 trinix-embed · free
- 🚀 trinix-pro · paid
- 💻 trinix-coder-pro · paid
- 🧠 trinix-reason-pro · paid
- 🚀 trinix-ultra · paid
Quick start
Native chat
curl https://api.trinix.gg/api/v1/chat \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"trinix-chat","messages":[{"role":"user","content":"hello"}]}'
OpenAI-compatible chat
curl https://api.trinix.gg/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"trinix-chat","messages":[{"role":"user","content":"hello"}]}'
Python SDK
from openai import OpenAI
client = OpenAI(base_url="https://api.trinix.gg/v1", api_key="YOUR_API_KEY")
resp = client.chat.completions.create(
    model="trinix-chat",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
Streaming
- Native: NDJSON chunks ({"delta":"..."}, then {"done":true}).
- OpenAI-compatible: SSE frames ending with data: [DONE].
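Consuming the native NDJSON stream can be sketched like this (frame shapes are taken from above; the sample frames are illustrative):

```python
import json

def parse_ndjson_stream(lines):
    """Yield text deltas from a native NDJSON chat stream, stopping at {"done": true}."""
    for line in lines:
        if not line.strip():
            continue  # skip keep-alive blank lines
        chunk = json.loads(line)
        if chunk.get("done"):
            break
        if "delta" in chunk:
            yield chunk["delta"]

# Example over captured frames:
frames = ['{"delta":"Hel"}', '{"delta":"lo"}', '{"done":true}']
print("".join(parse_ndjson_stream(frames)))  # → Hello
```

For the OpenAI-compatible endpoint, the OpenAI SDK handles SSE framing for you when you pass `stream=True` to `chat.completions.create`.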
Limits
- Prompt/input cap varies by model and plan (up to 200k characters for the enterprise paid flow).
- Request body maximum: 1 MB.
- Rate limits: 100 requests/minute per client IP and again per API key (or per browser session when no key is supplied). Both limits apply.
- Paid models enforce daily and weekly quotas.
- Web search (POST /api/v1/web-search) daily quota by plan:
| Plan | Web search |
|---|---|
| Free | ❌ Disabled |
| Pro | 10/day |
| Enterprise | 50/day default · typically 50–100/day on contract |
- A full queue or a queue timeout returns 429.
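Because both rate limiting and queue pressure surface as 429, clients typically retry with exponential backoff and jitter. A minimal sketch (the helper names and defaults are mine, not part of the API):

```python
import random
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def retry_on_429(send, max_attempts: int = 5, base: float = 1.0):
    """Call send() (any callable returning (status, body)) until it returns
    a non-429 status or attempts run out; sleep with jittered backoff between tries."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            break
        time.sleep(backoff_delay(attempt, base=base))
    return status, body
```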