When something goes wrong, the Huzz API returns a standard OpenAI-style error body alongside an HTTP status code. Use the status code for control flow and retries; use the body for logs and user-facing messages.

Error shape

{
  "error": {
    "message": "The model `gpt-4o-typo` does not exist or you do not have access to it.",
    "type": "invalid_request_error",
    "param": null,
    "code": "model_not_found"
  }
}

| Field | Meaning |
| --- | --- |
| message | Human-readable explanation of what failed. |
| type | Error class, e.g. invalid_request_error, authentication_error, rate_limit_error, server_error. |
| param | The request parameter at fault, when one can be identified. |
| code | Stable machine-readable code, e.g. model_not_found. |

Because this is the same shape the OpenAI SDKs expect, their built-in exception types (AuthenticationError, RateLimitError, and friends) work unchanged against Huzz.
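
If you call the API without an SDK, the four fields can be read straight from the JSON body. The sketch below is one way to do that; HuzzError and parse_error are hypothetical names used only for illustration, and the code assumes the body matches the error shape shown above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class HuzzError:  # hypothetical container, not an SDK type
    message: str
    type: str
    param: Optional[str]
    code: Optional[str]


def parse_error(body: str) -> HuzzError:
    """Extract the `error` object from a JSON error body."""
    err = json.loads(body)["error"]
    return HuzzError(
        message=err.get("message", ""),
        type=err.get("type", ""),
        param=err.get("param"),  # JSON null becomes None
        code=err.get("code"),
    )


# Sample body copied from the error shape in this section.
sample = (
    '{"error": {"message": "The model `gpt-4o-typo` does not exist or you do '
    'not have access to it.", "type": "invalid_request_error", '
    '"param": null, "code": "model_not_found"}}'
)
parsed = parse_error(sample)
print(parsed.type, parsed.code)
```

Logging parsed.code rather than parsed.message tends to be safer for alerting, since the code is described as stable while the message is meant for humans.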

Status codes

| Status | Type | What it means | What to do |
| --- | --- | --- | --- |
| 400 | invalid_request_error | The request body or parameters are malformed. | Fix the request. Do not retry as-is. |
| 401 | authentication_error | Missing, malformed, expired, or revoked API key. | Check HUZZ_API_KEY and the Authorization header. See Authentication. |
| 404 | invalid_request_error | Unknown route, model, or prediction id. | Verify the model id against the catalog and the URL path. |
| 429 | rate_limit_error | You hit your requests-per-minute limit or budget cap. | Back off and retry. Honor Retry-After when present. |
| 500 | server_error | Something failed inside Huzz or an upstream provider. | Retry with backoff; contact support if it persists. |
| 503 | server_error | The model or gateway is temporarily overloaded. | Retry with backoff. |
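
The status table can be turned into control flow with a small dispatcher. This is a minimal sketch: next_step and its return labels are hypothetical, and treating every 5xx as retryable is an assumption that generalizes the 500 and 503 rows.

```python
def next_step(status: int) -> str:
    """Map an HTTP status code to a coarse action label (labels are illustrative)."""
    if status == 429 or 500 <= status < 600:
        return "retry"            # back off, then try again
    if status == 401:
        return "check_api_key"    # fix credentials before any further calls
    if 400 <= status < 500:
        return "fix_request"      # deterministic failure: change the request
    return "ok"                   # not an error status


for status in (400, 401, 404, 429, 500, 503):
    print(status, next_step(status))
```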

Retry guidance

  • Retry 429 and 5xx only. Other 4xx errors are deterministic — the same request will fail the same way until you change it.
  • Use exponential backoff with jitter. Start around 1 second, double each attempt, add random jitter, and cap total attempts (3–5 is typical).
  • Honor Retry-After. When a 429 response includes the header, wait at least that long before the next attempt.
  • Keep retries idempotent. For async predictions, retry the poll, not the submit, unless the submit itself failed — otherwise you may pay for duplicate generations.
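
The backoff and Retry-After rules above can be combined into one delay calculation. The helper below is a sketch under stated assumptions: backoff_delay is a hypothetical name, the 30-second cap is illustrative, and retry_after is assumed to be already parsed into seconds (the header may also carry an HTTP date, which this sketch does not handle).

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 1.0,
    cap: float = 30.0,
) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    # Exponential growth (1 s, 2 s, 4 s, ...) capped at `cap`, plus up to 1 s of jitter.
    delay = min(base * 2 ** attempt, cap) + random.random()
    if retry_after is not None:
        # Never wait less than the server asked for.
        delay = max(delay, retry_after)
    return delay


print(backoff_delay(0))                  # somewhere in [1, 2)
print(backoff_delay(0, retry_after=10))  # 10, because the header wins
```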

Minimal backoff loop

import os, random, time
from openai import OpenAI, APIStatusError

huzz = OpenAI(api_key=os.environ["HUZZ_API_KEY"], base_url="https://api.huzz.ai/v1")

MAX_ATTEMPTS = 5
for attempt in range(MAX_ATTEMPTS):
    try:
        res = huzz.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello"}],
        )
        break  # success
    except APIStatusError as err:  # RateLimitError is a subclass, so 429 lands here too
        if err.status_code < 500 and err.status_code != 429:
            raise  # deterministic 4xx: don't retry
        if attempt == MAX_ATTEMPTS - 1:
            raise  # out of attempts: surface the last error instead of falling through
        # Exponential backoff capped at 30 s, plus up to 1 s of jitter
        time.sleep(min(2 ** attempt, 30) + random.random())

Mid-stream failures during streaming responses do not carry an error body. If the SSE connection drops before data: [DONE], treat the output as incomplete and retry the request.
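
One way to apply the streaming rule is to track whether the data: [DONE] sentinel was seen. The sketch below works on raw SSE lines; collect_stream is a hypothetical helper, and it only covers the case where the line iterator ends quietly. A real client would also need to catch the connection error raised when the socket drops.

```python
from typing import Iterable, List, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[List[str], bool]:
    """Collect SSE data payloads and report whether `data: [DONE]` was seen."""
    payloads: List[str] = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return payloads, True  # stream finished cleanly
        payloads.append(data)
    return payloads, False  # iterator ended without the sentinel


# Simulated streams: the second one stops before the sentinel arrives.
complete = ['data: {"delta": "Hel"}', "", 'data: {"delta": "lo"}', "", "data: [DONE]"]
truncated = ['data: {"delta": "Hel"}', ""]

print(collect_stream(complete)[1])   # True
print(collect_stream(truncated)[1])  # False
```

When the second value is False, discard the partial payloads and resend the request rather than stitching the output together.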