Error Codes

All Casola API errors use a consistent envelope:

{
  "error": {
    "code": "not_found",
    "message": "Resource not found"
  }
}

The error object always contains `code` (machine-readable) and `message` (human-readable). Some errors include a `details` object with additional context, such as validation issues.
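As a rough illustration, a client might parse this envelope into a small typed object before deciding what to do with it. The `ApiError` class and `parse_error` helper below are hypothetical names chosen for this sketch, not part of any Casola SDK. Only the `error`, `code`, `message`, and `details` fields come from the envelope described above.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ApiError:
    """Hypothetical client-side view of the error envelope."""
    code: str      # machine-readable, e.g. "not_found"
    message: str   # human-readable
    details: Dict[str, Any] = field(default_factory=dict)  # optional extra context


def parse_error(body: str) -> Optional[ApiError]:
    """Return an ApiError if `body` holds an error envelope, otherwise None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict) or "code" not in error or "message" not in error:
        return None
    return ApiError(
        code=error["code"],
        message=error["message"],
        details=error.get("details") or {},  # `details` is only present on some errors
    )


err = parse_error('{"error": {"code": "not_found", "message": "Resource not found"}}')
print(err.code, "-", err.message)
```

Branching on `code` rather than on `message` keeps client logic stable if the human-readable text changes.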

Error codes

Code	HTTP Status	Description	Retryable
`bad_request`	400	Malformed request	No
`validation_error`	400	Request body failed schema validation	No — fix request body
`invalid_json`	400	JSON parse failure	No
`unauthorized`	401	Missing or invalid authentication token	No — check credentials
`forbidden`	403	Token lacks required scope or role	No
`platform_invite_required`	403	Platform invite needed to create an account	No
`not_found`	404	Resource does not exist	No
`conflict`	409	Resource already exists (duplicate)	No
`gone`	410	Resource has expired (e.g. invite link)	No
`rate_limit`	429	Too many requests	Yes — honor `Retry-After` header
`quota_exceeded`	429	Plan usage limit reached	No — upgrade plan or wait for reset
`backlog_full`	429	Job queue is at capacity	Yes — back off and retry
`internal_error`	500	Unexpected server error	Yes — retry with backoff
`bad_gateway`	502	Upstream provider returned an error	Yes — retry
`worker_error`	502	GPU worker reported a failure	Yes — retry
`no_capacity`	503	No GPU workers available for this model	Yes — back off and retry
`model_warming_up`	503	Model is loading onto a GPU	Yes — wait 30s-5min, then retry
`not_configured`	503	Model is not configured on the platform	No
`timeout`	504	Job exceeded its time limit	Yes — retry may help
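Because a single HTTP status can carry both retryable and non-retryable codes (429 covers `rate_limit` and `quota_exceeded`, 503 covers `no_capacity` and `not_configured`), a client probably wants to decide by `code` rather than by status. A minimal sketch, where `RETRYABLE_CODES` and `is_retryable` are hypothetical names and treating unknown codes as non-retryable is an assumption of this example:

```python
# Codes marked "Yes" in the Retryable column of the table above.
RETRYABLE_CODES = frozenset({
    "rate_limit",
    "backlog_full",
    "internal_error",
    "bad_gateway",
    "worker_error",
    "no_capacity",
    "model_warming_up",
    "timeout",
})


def is_retryable(code: str) -> bool:
    """Decide by error code, not HTTP status.

    Codes not listed in the table are treated as non-retryable
    (a conservative assumption made for this sketch).
    """
    return code in RETRYABLE_CODES


print(is_retryable("rate_limit"))      # True
print(is_retryable("quota_exceeded"))  # False, although both are HTTP 429
```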

Rate limit headers

When you receive a 429 response with code `rate_limit`, the response includes headers to help you pace requests:

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed in the window
`X-RateLimit-Remaining`	Requests remaining (always `0` on a 429)
`X-RateLimit-Reset`	Unix timestamp (seconds) when the window resets
`Retry-After`	Seconds to wait before retrying
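One way to turn these headers into a wait time is sketched below. The `seconds_to_wait` helper is a hypothetical name. The order of preference (`Retry-After` first, then `X-RateLimit-Reset`) and the 1-second fallback are assumptions of this example, not documented behavior.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to pause before retrying a 429 rate_limit response.

    `headers` is assumed to be keyed exactly as in the table above; many HTTP
    libraries expose a case-insensitive mapping that works the same way.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; try the reset timestamp

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            current = time.time() if now is None else now
            return max(0.0, float(reset) - current)  # Unix seconds until the window resets
        except ValueError:
            pass

    return 1.0  # assumed fallback when neither header is usable


print(seconds_to_wait({"Retry-After": "7"}))
```

The optional `now` argument only exists so the calculation can be checked against a fixed clock.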

Retry guidance

Do retry (with exponential backoff):

429 rate_limit — wait for the Retry-After duration
429 backlog_full — the queue is temporarily full; retry after 5-10 seconds
502 worker_error / 502 bad_gateway — transient upstream failure
503 no_capacity — no workers are free; retry after 10-30 seconds
503 model_warming_up — a GPU is loading the model, which can take from 30 seconds to 5 minutes; retry every 30-60 seconds until it is ready
504 timeout — the job took too long; retry or switch to async submission
500 internal_error — unexpected failure; retry with backoff

Do not retry:

400 errors — fix the request
401 / 403 — check your token and scopes
404 — resource does not exist
409 — duplicate resource
429 quota_exceeded — your plan’s usage limit is reached

Backoff strategy: Start with a 1-second delay, double it on each retry, and cap it at 60 seconds. Add jitter (a random 0-500 ms) to each delay so that many clients do not retry at the same moment (the thundering herd problem).
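The backoff strategy above could be sketched as follows. `backoff_delay` and `call_with_retries` are hypothetical helpers. The `send` callable, the `should_retry` predicate, and the limit of 5 attempts are assumptions made for illustration. Jitter is added after the cap, so a delay can reach 60.5 seconds.

```python
import random
import time
from typing import Callable, Optional, Tuple


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  jitter: float = 0.5) -> float:
    """Delay in seconds before retry number `attempt` (0-based).

    Without jitter the sequence is 1, 2, 4, 8, 16, 32, 60, 60, ...
    """
    exponent = min(attempt, 30)  # clamp so very large attempt counts cannot overflow
    return min(cap, base * (2 ** exponent)) + random.uniform(0.0, jitter)


def call_with_retries(
    send: Callable[[], Tuple[bool, Optional[str]]],
    should_retry: Callable[[Optional[str]], bool],
    max_attempts: int = 5,  # assumed limit, not specified by the API
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `send` until it succeeds, the error is not retryable, or attempts run out.

    `send` is a hypothetical callable returning (ok, error_code).
    """
    for attempt in range(max_attempts):
        ok, code = send()
        if ok:
            return True
        if not should_retry(code) or attempt == max_attempts - 1:
            return False
        sleep(backoff_delay(attempt))
    return False
```

For a `rate_limit` error, the `Retry-After` value would normally take precedence over the computed delay. Passing `sleep` as a parameter keeps the loop testable without real waiting.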