Developer docs
The Smart Router for LLMs.
SKYLINE INTELLIGENCE gives customer applications a stable OpenAI-compatible gateway, smart model routing, quota-aware billing, and usage visibility across Claude, GPT, Gemini, DeepSeek, Qwen, and local providers.
POST /v1/chat/completions
Authorization: Bearer sk_aibrg_...
model: "auto"
Build with SKYLINE INTELLIGENCE
Recommended path for new developers
Follow this sequence when integrating a new app. It mirrors the customer console: create credentials, confirm model access, make one request, then monitor usage and quota.
First steps
Make your first API call
The fastest path is to use SKYLINE INTELLIGENCE as the base URL for an existing OpenAI-compatible client. The public site is https://skylineintelligence.top/; customer API calls should use https://api.skylineintelligence.top/v1.
1. Create an API key
Open API Keys, create a key, and keep the raw key from the one-time creation screen. Default customer scopes include model:invoke, chat:read, and chat:write.
2. Copy the Base URL
Use the console connection panel or call GET /api/connection-info with your console session token. The returned baseUrl should be https://api.skylineintelligence.top/v1.
3. Send one chat request
Use model: "auto" for router-owned selection, or choose a specific model from /v1/models or the Model Plaza.
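Before sending the first chat request, it can help to confirm which model codes the key can see. The following is a minimal sketch using only the Python standard library. The helper names build_models_request and list_model_ids are hypothetical, and the response shape (an OpenAI-style list with a "data" array of objects carrying "id") is an assumption based on the gateway being OpenAI-compatible.

```python
import json
import os
import urllib.request

# Base URL from the steps above; override with SKYLINE_BASE_URL if needed.
BASE_URL = os.environ.get("SKYLINE_BASE_URL", "https://api.skylineintelligence.top/v1")


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET /v1/models request with bearer auth."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_model_ids(api_key: str, base_url: str = BASE_URL) -> list[str]:
    """Send the request and return the visible model codes.

    Assumption: the body looks like {"data": [{"id": "..."}, ...]}.
    """
    with urllib.request.urlopen(build_models_request(api_key, base_url), timeout=30) as resp:
        body = json.load(resp)
    return [model["id"] for model in body.get("data", [])]
```

Calling list_model_ids(os.environ["SKYLINE_API_KEY"]) should return the codes you can pass as model; if the list is empty, "auto" is unlikely to have an eligible route either.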
Connect
Use existing OpenAI clients
Most apps only need two changes: point baseURL at the SKYLINE INTELLIGENCE API endpoint and replace the API key with a SKYLINE INTELLIGENCE key. The request body stays familiar.
export SKYLINE_BASE_URL="https://api.skylineintelligence.top/v1"
export SKYLINE_API_KEY="sk_aibrg_your_key"

curl "$SKYLINE_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $SKYLINE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      { "role": "user", "content": "Explain SKYLINE INTELLIGENCE in one sentence." }
    ]
  }'
import OpenAI from "openai";

// Reuse the official OpenAI client; only the key and base URL change.
const client = new OpenAI({
  apiKey: process.env.SKYLINE_API_KEY,
  baseURL: process.env.SKYLINE_BASE_URL
});

const response = await client.chat.completions.create({
  model: "auto",
  messages: [
    { role: "user", content: "Write a launch checklist." }
  ],
  stream: false
});

// content can be null (for example on tool calls), so fall back to an empty string.
process.stdout.write(response.choices[0].message.content ?? "");
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["SKYLINE_API_KEY"],
    base_url=os.environ["SKYLINE_BASE_URL"],
)

response = client.chat.completions.create(
    model="auto",
    messages=[
        {"role": "user", "content": "Summarize this product for a CTO."}
    ],
)

print(response.choices[0].message.content)
API reference
Runtime API surface
Customer traffic goes through the runtime service under /v1. Start with chat completions unless you specifically need Responses, Anthropic-style Messages, or file operations.
/v1/models
List models visible to the caller's organization.
/v1/models/{modelId}
Fetch one enabled model. Returns not found or forbidden if unavailable.
/v1/chat/completions
Primary OpenAI-compatible chat endpoint. Supports streaming SSE.
/v1/responses
Responses-style input mapped into SKYLINE INTELLIGENCE chat routing.
/v1/messages
Anthropic-style Messages surface for compatible clients.
/v1/messages/count_tokens
Estimate input tokens before sending a Messages request.
/v1/files
Upload files for file-aware model workflows.
Pass Authorization: Bearer sk_aibrg_.... Runtime also accepts x-api-key for Anthropic-style clients.
Responses include a request-id header. Keep it with application logs so support can trace routing and billing events.
OpenAI-compatible streams emit SSE chunks and terminate with data: [DONE]. If the client disconnects mid-stream, the runtime keeps draining the provider response so usage is still metered and billed.
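The stream termination rule above can be handled with a small parser. This is a sketch, not the gateway's own client code: iter_sse_chunks and collect_text are hypothetical helper names, and the chunk shape (choices[].delta.content) is assumed from the OpenAI streaming format that the endpoint is described as compatible with.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from SSE lines until the data: [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # the stream is complete; ignore anything after the sentinel
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate streamed text, assuming OpenAI-style delta chunks."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
print(collect_text(sample))  # prints "Hello"
```

In a real client, lines would come from the HTTP response body read line by line, and the request-id response header would be logged alongside the collected text.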
Models and routing
Choose a model, or let the router choose
SKYLINE INTELLIGENCE separates customer-facing model codes from upstream provider details. Your organization only sees models enabled by an admin. The router then picks an eligible channel based on model access, capability requirements, priority, weight, health, and fallback availability.
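To make the selection factors concrete, here is an illustrative sketch of one way such a router could combine them. It is not the SKYLINE INTELLIGENCE implementation: the Channel type, the pick_channel function, and the conventions that a lower priority number wins and that weights are positive are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Channel:
    name: str
    capabilities: frozenset  # e.g. {"stream", "tools", "vision"}
    priority: int            # assumption: lower number is tried first
    weight: int              # assumption: positive, relative share within a tier
    healthy: bool = True


def pick_channel(channels, required, rng=random):
    """Pick one channel that is healthy and covers the required capabilities.

    The best available priority tier wins. Within that tier the choice is
    weight-proportional. If the preferred tier has no healthy, capable
    channel, the next tier acts as the fallback.
    """
    eligible = [c for c in channels if c.healthy and required <= c.capabilities]
    if not eligible:
        return None
    best = min(c.priority for c in eligible)
    tier = [c for c in eligible if c.priority == best]
    return rng.choices(tier, weights=[c.weight for c in tier], k=1)[0]


channels = [
    Channel("a", frozenset({"stream", "tools"}), priority=1, weight=3),
    Channel("b", frozenset({"stream", "tools", "vision"}), priority=2, weight=1),
    Channel("c", frozenset({"stream"}), priority=1, weight=1, healthy=False),
]
print(pick_channel(channels, frozenset({"vision"})).name)  # prints "b"
print(pick_channel(channels, frozenset({"stream"})).name)  # prints "a"
```

The vision request lands on "b" because it is the only capable channel, while the streaming request stays on "a" because "c" shares its tier but is unhealthy.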
Use a specific model
Call GET /v1/models or open Model Plaza, then send a known model code. This is best when your app has strict quality, context, or cost expectations.
{
  "model": "gpt-4o-mini",
  "messages": [{ "role": "user", "content": "Hello" }]
}
Use automatic routing
Send model: "auto" when the organization wants SKYLINE INTELLIGENCE to choose an eligible route. The router considers request features like streaming, tool calls, and vision before selecting a target.
{
  "model": "auto",
  "stream": true,
  "messages": [{ "role": "user", "content": "Draft release notes" }]
}
Billing and usage
Every request is authorized before it runs
SKYLINE INTELLIGENCE reserves estimated quota and billing capacity before the provider call, then settles against actual usage after completion. That keeps quota enforcement predictable during concurrent traffic and gives finance teams a request-level ledger.
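The reserve-then-settle pattern described above can be sketched in a few lines. This is a simplified in-memory model for illustration only: QuotaAccount, QuotaExceeded, and integer credit units are hypothetical, and a real ledger would need persistence and atomic updates.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when an estimate does not fit in the remaining quota."""


@dataclass
class QuotaAccount:
    balance: int                                  # credits not yet spent
    reserved: dict = field(default_factory=dict)  # request_id -> held credits

    @property
    def available(self) -> int:
        # Holds count against the balance until they are settled.
        return self.balance - sum(self.reserved.values())

    def reserve(self, request_id: str, estimate: int) -> None:
        """Hold the estimated cost before the provider call."""
        if estimate > self.available:
            raise QuotaExceeded(request_id)
        self.reserved[request_id] = estimate

    def settle(self, request_id: str, actual: int) -> None:
        """Release the hold and charge the actual usage."""
        self.reserved.pop(request_id)
        self.balance -= actual


acct = QuotaAccount(balance=10)
acct.reserve("req_1", 4)
acct.reserve("req_2", 5)
print(acct.available)                # prints 1
acct.settle("req_1", 3)              # actual usage came in under the estimate
print(acct.balance, acct.available)  # prints 7 2
```

Because concurrent requests each hold their estimate, a third request that needs 2 credits is rejected while only 1 is available, even though nothing has been charged yet.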
Before the provider call: checks API key status, organization status, enabled model access, key rate limits, and estimated cost.
After completion: calculates billable input, output, cache, and tool dimensions from provider usage.
In the console: view requests, tokens, credits, latency, status, model usage, and usage by key.
/api/usage/summary
Aggregate by model with optional time range and comparison.
/api/usage/summary/by-key
Aggregate by API key and project.
/api/usage/records
Request-level rows with token, latency, credit, status, and time.
/api/billing/ledger
Ledger entries for customer credit movement.
Console workflow
Operate the gateway from the customer UI
The docs map directly to the current console navigation so operators and developers can speak the same language.
Create, rotate, disable, and delete keys. Copy the Base URL and first cURL from the quick-connect panel.
Search enabled models by provider, capability, model code, context window, and per-token pricing.
Review by model, by key, or at request-detail level with credits, tokens, latency, and status.
Enable or disable models distributed to the organization and manage member quotas.
Register provider base URLs, discover models, map channels, and set pricing dimensions.
Bind model abilities to channels, priority, weight, provider model codes, and adapter regions.
Errors and fallback
Handle failures explicitly
Runtime errors use a stable JSON shape with a request ID. Rate limits and temporary service pressure include Retry-After. Fallback responses expose degradation metadata so your app can decide whether to display, retry, or audit the result.
Error body
{
  "type": "error",
  "error": {
    "type": "rate_limit_error",
    "message": "API key rate limit exceeded"
  },
  "request_id": "req_..."
}
Fallback headers
X-SKYLINE-Degraded: true
X-SKYLINE-Fallback-Reason: channel_unhealthy
X-SKYLINE-Fallback-From: claude-sonnet
X-SKYLINE-Fallback-To: gpt-4o-mini
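A client might turn the error body and headers shown above into explicit decisions along these lines. This is a hedged sketch: the helper names are hypothetical, treating HTTP 429 and 503 as the retryable statuses is an assumption (the docs only say that rate limits and temporary service pressure include Retry-After), and headers are modeled as a plain dict with exact-case keys, whereas most HTTP libraries expose a case-insensitive mapping.

```python
import json


def retry_delay_seconds(status, headers, default=1.0):
    """Return how long to wait before retrying, or None if a retry is not advised."""
    if status not in (429, 503):  # assumption: these are the retryable statuses
        return None
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        return default  # header missing or not a number of seconds


def parse_error(body):
    """Flatten the documented error shape for logging."""
    doc = json.loads(body)
    err = doc.get("error", {})
    return {
        "type": err.get("type"),
        "message": err.get("message"),
        "request_id": doc.get("request_id"),
    }


def fallback_info(headers):
    """Return degradation metadata, or None when the response was not degraded."""
    if headers.get("X-SKYLINE-Degraded", "").lower() != "true":
        return None
    return {
        "reason": headers.get("X-SKYLINE-Fallback-Reason"),
        "from": headers.get("X-SKYLINE-Fallback-From"),
        "to": headers.get("X-SKYLINE-Fallback-To"),
    }


body = '{"type": "error", "error": {"type": "rate_limit_error", "message": "API key rate limit exceeded"}, "request_id": "req_123"}'
print(parse_error(body)["type"])                        # prints "rate_limit_error"
print(retry_delay_seconds(429, {"Retry-After": "2"}))   # prints 2.0
```

Logging the parsed request_id together with any fallback metadata gives support the same identifiers the gateway uses when tracing routing and billing events.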
Before live traffic
Go-live preflight
Use this as a final client-integration check before an app sends real traffic through SKYLINE INTELLIGENCE. It is not a platform production runbook.