API Reference
HPP Router's request and response schemas are OpenAI-compatible, with HPP-specific extensions for smart routing headers and prepaid quota. At a high level, you use the same patterns as the OpenAI Chat API — point your client at https://router.hpp.io and authenticate with your API key.
OpenAPI Specification
The complete Consumer API is documented using OpenAPI 3.1. The spec is the single source of truth for request/response shapes and auth schemes:
| Format | Location |
|---|---|
| YAML (bundled) | consumer-v1.yaml in this repo |
| YAML (source) | hpp-router/openapi/consumer-v1.yaml |
Import the spec into Swagger UI, Postman, or an OpenAPI code generator to explore endpoints or produce client stubs.
For live requests, use the Router Playground or follow the Quickstart — streaming and image responses are easier to test there than in a static reference page.
Base URL & auth
- Base URL:
https://router.hpp.io - Auth:
apikeyheader orAuthorization: Bearer <key>. See Authentication. - Version: Consumer API
0.1.0.
Endpoints
| Method | Path | Summary |
|---|---|---|
POST | /llm/v1/chat/completions | Create a chat completion |
GET | /llm/v1/models | List available models |
POST | /v1/images/generations | Generate images |
GET | /api/usage | Get current consumer usage |
GET | /api/quota-check | Check current consumer quota |
POST /llm/v1/chat/completions
OpenAI-compatible chat completion endpoint with HPP smart-routing headers.
Request body (ChatCompletionRequest):
| Field | Type | Required | Notes |
|---|---|---|---|
model | string | ✅ | e.g. hpprouter/auto, openai/gpt-5, anthropic/claude-sonnet-4, moonshotai/kimi-k2.6, ollama/gpt-oss:120b, ollama/solidity-master:2. |
messages | ChatMessage[] | ✅ | Each has role (system/user/assistant/tool) and content (string or content parts). |
stream | boolean | Stream as SSE. | |
max_tokens | integer (≥1) | ||
max_completion_tokens | integer (≥1) | ||
temperature | number | ||
stream_options | object |
Additional properties are allowed and passed through.
Responses:
200—ChatCompletionResponse(application/json) or an SSE stream (text/event-stream). Response headers includeX-HPP-Router-Resolved-Model,X-HPP-Router-Basket,X-HPP-Router-Rule-Id,X-HPP-Router-Rules-Version, andX-HPP-Router-Tier.400,401,429,500— error envelope.
See Chat Completions and Smart Routing.
GET /llm/v1/models
Lists available models (OpenAI-compatible). Each Model has id, object ("model"), owned_by, and an optional pricing object (input, output, cache_write, cache_read).
Responses: 200 — ModelListResponse; 401, 500 — error envelope.
See Models & Pricing.
POST /v1/images/generations
OpenAI-compatible image generation for gpt-image-1.
Request body (ImageGenerationRequest):
| Field | Type | Required | Default |
|---|---|---|---|
prompt | string | ✅ | — |
model | string | gpt-image-1 | |
n | integer (1–4) | 1 | |
size | 1024x1024 / 1024x1536 / 1536x1024 | 1024x1024 | |
quality | low / medium / high / auto | auto | |
background | string | — | |
output_format | string | — |
Responses: 200 — ImageGenerationResponse (data[] with b64_json/url, plus usage); 400, 401, 429, 500 — error envelope.
See Image Generation.
GET /api/usage
Usage and quota summary for the authenticated consumer.
Response 200 (UsageResponse): consumer_id, username, custom_id, quota, used, remaining, requests, total_tokens, total_cost.
Errors: 401, 404, 500.
GET /api/quota-check
Quota availability for the authenticated consumer.
Response 200 (QuotaCheckResponse): has_quota, quota, used, remaining.
Errors: 401, 503, 500.
See Quota & Usage.
Error envelope
Errors use one of two shapes (ErrorEnvelope):
{ "error": "string", "message": "string" }
{
"error": {
"message": "string",
"type": "string",
"code": "string",
"provider": "string",
"upstream_status": 0,
"retryable": true
}
}
See Errors for handling guidance.
Security schemes
| Scheme | Type | Where |
|---|---|---|
ApiKeyAuth | apiKey | header apikey |
BearerAuth | http bearer | header Authorization: Bearer <key> |