Errors - Huzz API

When something goes wrong, the Huzz API returns a standard OpenAI-style error body alongside an HTTP status code. Use the status code for control flow and retries; use the body for logs and user-facing messages.

Error shape

{
  "error": {
    "message": "The model `gpt-4o-typo` does not exist or you do not have access to it.",
    "type": "invalid_request_error",
    "param": null,
    "code": "model_not_found"
  }
}

Field	Meaning
`message`	Human-readable explanation of what failed.
`type`	Error class, e.g. `invalid_request_error`, `authentication_error`, `rate_limit_error`, `server_error`.
`param`	The request parameter at fault, when one can be identified.
`code`	Stable machine-readable code, e.g. `model_not_found`.

Because this is the same shape the OpenAI SDKs expect, their built-in exception types (`AuthenticationError`, `RateLimitError`, and friends) work unchanged against Huzz.
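If you call the API without an SDK, the body can be parsed directly. The following is a minimal sketch, assuming only the four fields documented above; `ApiError` and `parse_error_body` are hypothetical helper names, not part of Huzz or any SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Parsed error body (hypothetical helper type, not from any SDK)."""
    message: str
    type: str
    param: Optional[str]
    code: Optional[str]


def parse_error_body(raw: str) -> ApiError:
    """Extract the documented fields from a JSON error body."""
    err = json.loads(raw)["error"]
    return ApiError(
        message=err.get("message", ""),
        type=err.get("type", ""),
        param=err.get("param"),  # JSON null becomes None
        code=err.get("code"),
    )


# The sample body from the "Error shape" section, rebuilt as a JSON string.
sample = json.dumps({
    "error": {
        "message": "The model `gpt-4o-typo` does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": None,
        "code": "model_not_found",
    }
})
parsed = parse_error_body(sample)
print(parsed.type, parsed.code)
```

Branching on `code` rather than `message` keeps the logic stable if the human-readable wording changes.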

Status codes

Status	Type	What it means	What to do
`400`	`invalid_request_error`	The request body or parameters are malformed.	Fix the request. Do not retry as-is.
`401`	`authentication_error`	Missing, malformed, expired, or revoked API key.	Check `HUZZ_API_KEY` and the `Authorization` header. See Authentication.
`404`	`invalid_request_error`	Unknown route, model, or prediction id.	Verify the model id against the catalog and the URL path.
`429`	`rate_limit_error`	You hit your requests-per-minute limit or budget cap.	Back off and retry. Honor `Retry-After` when present.
`500`	`server_error`	Something failed inside Huzz or an upstream provider.	Retry with backoff; contact support if it persists.
`503`	`server_error`	The model or gateway is temporarily overloaded.	Retry with backoff.
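The table above can be condensed into a small dispatch helper. This is a sketch only: `action_for` and the action labels it returns are hypothetical names chosen for illustration, and unlisted status codes are grouped by range as an assumption.

```python
def action_for(status: int) -> str:
    """Map an HTTP status code to a coarse next step, following the table above."""
    if status == 429 or 500 <= status < 600:
        return "retry_with_backoff"   # rate limit, server error, or overload
    if status == 401:
        return "check_credentials"    # API key or Authorization header problem
    if 400 <= status < 500:
        return "fix_request"          # deterministic: retrying as-is will not help
    return "no_action"                # not an error status


for status in (400, 401, 404, 429, 500, 503):
    print(status, action_for(status))
```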

Retry guidance

Retry 429 and 5xx only. Other 4xx errors are deterministic — the same request will fail the same way until you change it.
Use exponential backoff with jitter. Start around 1 second, double each attempt, add random jitter, and cap total attempts (3–5 is typical).
Honor `Retry-After`. When a 429 response includes the header, wait at least that long before the next attempt.
Keep retries idempotent. For async predictions, retry the poll, not the submit, unless the submit itself failed — otherwise you may pay for duplicate generations.
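The idempotency point can be illustrated as follows. This is a sketch under stated assumptions: `submit` and `poll` are caller-supplied callables standing in for whatever requests create and fetch a prediction, and `TransientError` is a hypothetical stand-in for a 429 or 5xx failure. None of these names come from the Huzz API.

```python
import time
from typing import Any, Callable


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (429 or 5xx)."""


def with_retries(call: Callable[[], Any], attempts: int = 4,
                 sleep: Callable[[float], None] = time.sleep) -> Any:
    """Run `call`, retrying on TransientError with capped exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure
            sleep(min(2 ** attempt, 30))


def run_prediction(submit: Callable[[], str], poll: Callable[[str], Any],
                   attempts: int = 4,
                   sleep: Callable[[float], None] = time.sleep) -> Any:
    # The submit is retried only when the submit call itself raised.
    prediction_id = with_retries(submit, attempts, sleep)
    # Once an id exists, only the poll is retried; the job is never resubmitted.
    return with_retries(lambda: poll(prediction_id), attempts, sleep)
```

Keeping the prediction id outside the polling retry is what prevents a flaky poll from triggering a second, billable submit.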

Minimal backoff loop

import os, random, time
from openai import OpenAI, APIStatusError

MAX_ATTEMPTS = 5

huzz = OpenAI(
    api_key=os.environ["HUZZ_API_KEY"],
    base_url="https://api.huzz.ai/v1",
    max_retries=0,  # the SDK retries on its own by default; disable so this loop is the only retry layer
)

for attempt in range(MAX_ATTEMPTS):
    try:
        res = huzz.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello"}],
        )
        break
    except APIStatusError as err:  # RateLimitError is a subclass of APIStatusError
        retryable = err.status_code == 429 or err.status_code >= 500
        if not retryable or attempt == MAX_ATTEMPTS - 1:
            raise  # deterministic 4xx, or out of attempts: surface the error
        time.sleep(min(2 ** attempt, 30) + random.random())
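
The loop above sleeps on a fixed schedule and does not read `Retry-After`. A possible delay helper that does is sketched below; `retry_delay` is a hypothetical name, and the sketch handles only the delay-in-seconds form of the header, not the HTTP-date form. With the OpenAI Python SDK the header value should be reachable as `err.response.headers.get("retry-after")`, though treat that as an assumption to verify against your SDK version.

```python
import random
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before the next attempt (attempt is 0-based)."""
    # Exponential backoff, capped, plus up to one second of random jitter.
    backoff = min(base * (2 ** attempt), cap) + random.random()
    try:
        server_wait = float(retry_after) if retry_after is not None else 0.0
    except ValueError:
        server_wait = 0.0  # HTTP-date values are ignored in this sketch
    # Never wait less than the server asked for.
    return max(backoff, server_wait)
```

Taking the maximum of the two values means the server's hint acts as a floor while the client's own backoff still grows between attempts.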

Mid-stream failures during streaming responses do not carry an error body. If the SSE connection drops before `data: [DONE]`, treat the output as incomplete and retry the request.
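
One way to detect an incomplete stream when reading raw SSE lines is sketched below. It assumes the chunks follow the OpenAI chat-completion streaming format (`choices[0].delta.content`), which this page does not state explicitly; `collect_stream` is a hypothetical helper name.

```python
import json
from typing import Iterable, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, bool]:
    """Join streamed text and report whether the [DONE] sentinel was seen."""
    parts = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return "".join(parts), True
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if choices:
            delta = choices[0].get("delta", {})
            parts.append(delta.get("content") or "")
    # The iterator ended without the sentinel: the output is incomplete.
    return "".join(parts), False


complete = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text, done = collect_stream(complete)
partial_text, partial_done = collect_stream(complete[:2])
print(text, done, partial_text, partial_done)
```

If the underlying connection raises instead of ending quietly, the exception propagates out of the loop; in both cases the partial text should be discarded before retrying.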
