Error Codes
All Casola API errors use a consistent envelope:

    { "error": { "code": "not_found", "message": "Resource not found" } }

The error object always contains `code` (machine-readable) and `message` (human-readable). Some errors include a `details` object with additional context, such as validation issues.
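As an illustration of the envelope above, a client might parse a response body into a small typed object before acting on it. This is a minimal sketch, not an official client: the names `ApiError` and `parse_error` are hypothetical, and it assumes `details`, when present, is a JSON object.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ApiError:
    """Parsed form of the error envelope (hypothetical helper type)."""
    code: str                                               # machine-readable, e.g. "not_found"
    message: str                                            # human-readable
    details: Dict[str, Any] = field(default_factory=dict)   # optional extra context


def parse_error(body: str) -> Optional[ApiError]:
    """Return an ApiError if `body` is an error envelope, otherwise None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict) or "code" not in error or "message" not in error:
        return None
    details = error.get("details")
    # Assumption: `details` is an object when present; anything else is ignored.
    return ApiError(error["code"], error["message"],
                    details if isinstance(details, dict) else {})
```

Returning `None` for anything that is not a well-formed envelope lets the caller fall back to generic HTTP status handling.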
Error codes
| Code | HTTP Status | Description | Retryable |
|---|---|---|---|
| `bad_request` | 400 | Malformed request | No |
| `validation_error` | 400 | Request body failed schema validation | No; fix the request body |
| `invalid_json` | 400 | JSON parse failure | No |
| `unauthorized` | 401 | Missing or invalid authentication token | No; check credentials |
| `forbidden` | 403 | Token lacks required scope or role | No |
| `platform_invite_required` | 403 | Platform invite needed to create an account | No |
| `not_found` | 404 | Resource does not exist | No |
| `conflict` | 409 | Resource already exists (duplicate) | No |
| `gone` | 410 | Resource has expired (e.g. invite link) | No |
| `rate_limit` | 429 | Too many requests | Yes; honor the `Retry-After` header |
| `quota_exceeded` | 429 | Plan usage limit reached | No; upgrade plan or wait for reset |
| `backlog_full` | 429 | Job queue is at capacity | Yes; back off and retry |
| `internal_error` | 500 | Unexpected server error | Yes; retry with backoff |
| `bad_gateway` | 502 | Upstream provider returned an error | Yes; retry |
| `worker_error` | 502 | GPU worker reported a failure | Yes; retry |
| `no_capacity` | 503 | No GPU workers available for this model | Yes; back off and retry |
| `model_warming_up` | 503 | Model is loading onto a GPU | Yes; wait 30s-5min, then retry |
| `not_configured` | 503 | Model is not configured on the platform | No |
| `timeout` | 504 | Job exceeded its time limit | Yes; a retry may help |
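
Several codes in the table share an HTTP status but differ in whether a retry makes sense (for example, `rate_limit` and `quota_exceeded` are both 429, and `no_capacity` and `not_configured` are both 503), so a client probably needs to branch on `code` rather than on the status alone. A minimal sketch of that check follows; `RETRYABLE_CODES` and `is_retryable` are illustrative names, and treating unknown codes as non-retryable is an assumption, not documented behavior.

```python
# Codes marked "Yes" in the Retryable column of the table above.
RETRYABLE_CODES = frozenset({
    "rate_limit",
    "backlog_full",
    "internal_error",
    "bad_gateway",
    "worker_error",
    "no_capacity",
    "model_warming_up",
    "timeout",
})


def is_retryable(code: str) -> bool:
    """Return True if the error code is listed as retryable.

    Assumption: codes not present in the table are treated as non-retryable.
    """
    return code in RETRYABLE_CODES
```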
Rate limit headers
When you receive a 429 response with code `rate_limit`, the response includes headers to help you pace requests:

| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed in the window |
| `X-RateLimit-Remaining` | Requests remaining (always 0 on a 429) |
| `X-RateLimit-Reset` | Unix timestamp (seconds) when the window resets |
| `Retry-After` | Seconds to wait before retrying |
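
One way to turn these headers into a wait time is sketched below. The helper name `seconds_until_retry`, the order of preference (`Retry-After` first, then `X-RateLimit-Reset`), and the 1-second fallback are all assumptions made for illustration; only the header names and their meanings come from the table above.

```python
import time
from typing import Mapping, Optional


def seconds_until_retry(headers: Mapping[str, str],
                        now: Optional[float] = None,
                        default: float = 1.0) -> float:
    """Compute how long to wait after a 429 `rate_limit` response.

    Prefers Retry-After (seconds); falls back to X-RateLimit-Reset
    (Unix timestamp in seconds); otherwise returns `default`.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # unparseable value: fall through to the reset timestamp

    reset = lowered.get("x-ratelimit-reset")
    if reset is not None:
        try:
            current = time.time() if now is None else now
            return max(0.0, float(reset) - current)
        except ValueError:
            pass

    return default
```

The `now` parameter exists only so the calculation can be tested with a fixed clock; in normal use it can be omitted.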
Retry guidance
Do retry (with exponential backoff):

- `429 rate_limit`: wait for the `Retry-After` duration
- `429 backlog_full`: the queue is temporarily full; retry after 5-10 seconds
- `502 worker_error` / `502 bad_gateway`: transient upstream failure
- `503 no_capacity`: no workers are free; retry after 10-30 seconds
- `503 model_warming_up`: a GPU is loading the model; retry after 30-60 seconds
- `504 timeout`: the job took too long; retry or switch to async submission
- `500 internal_error`: unexpected failure; retry with backoff

Do not retry:

- `400` errors: fix the request
- `401` / `403`: check your token and scopes
- `404`: resource does not exist
- `409`: duplicate resource
- `429 quota_exceeded`: your plan's usage limit is reached

Backoff strategy: Start with a 1-second delay, double it on each retry, and cap it at 60 seconds. Add jitter (a random 0-500 ms) to avoid a thundering herd.
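
The backoff schedule described above can be sketched as a small function. The name `backoff_delay` and its parameters are hypothetical, and adding the jitter on top of the 60-second cap (rather than inside it) is one reading of the guidance, not a documented requirement.

```python
import random
from typing import Optional


def backoff_delay(attempt: int,
                  base: float = 1.0,
                  cap: float = 60.0,
                  max_jitter: float = 0.5,
                  rng: Optional[random.Random] = None) -> float:
    """Delay in seconds before retry number `attempt` (0-based).

    Exponential: base * 2**attempt, capped at `cap`, plus uniform
    jitter in [0, max_jitter] added after the cap is applied.
    """
    source = rng if rng is not None else random
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    exponent = min(max(attempt, 0), 30)
    delay = min(cap, base * (2 ** exponent))
    return delay + source.uniform(0.0, max_jitter)
```

With the defaults, successive retries wait roughly 1, 2, 4, 8, 16, 32, and then 60 seconds, each plus up to 500 ms of jitter. A caller would typically `time.sleep(backoff_delay(attempt))` between attempts and stop after a fixed number of retries.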