/run
: Submit an asynchronous job that processes in the background while you receive an immediate job ID.

/runsync
: Submit a synchronous job and wait for the complete results in a single response.

/status
: Check the current status, execution details, and results of a previously submitted job.

/stream
: Receive incremental results from a job as they become available.

/cancel
: Stop a job that is in progress or waiting in the queue.

/retry
: Requeue a failed or timed-out job using the same job ID and input parameters.

/purge-queue
: Clear all pending jobs from the queue without affecting jobs already in progress.

/health
: Monitor the operational status of your endpoint, including worker and job statistics.

Submit an asynchronous job (/run)

Submit a synchronous job (/runsync)

Check job status (/status)

The response reports the job's state (IN_QUEUE, IN_PROGRESS, COMPLETED, FAILED, etc.).

You can also use the /status operation to configure the time-to-live (TTL) for an individual job by appending a ttl parameter when checking its status. The value is in milliseconds: for example, https://api.runpod.ai/v2/{endpoint_id}/status/{job_id}?ttl=6000 sets the TTL for the job to 6 seconds. Use this when you want the system to remove a job result sooner than the default retention time.

Stream results (/stream)

Check endpoint health (/health)

Cancel a job (/cancel)

Retry a job (/retry)

Only jobs with a FAILED or TIMED_OUT status can be retried.
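
The /status URL pattern and its ttl query parameter can be sketched in Python as follows. This is a minimal illustration, not an official client: `operation_url` and `poll_until_done` are hypothetical helpers, the paths for operations other than /status are assumed to follow the same pattern as the example URL, the response is assumed to carry a `status` field, and treating COMPLETED, FAILED, and TIMED_OUT as final states is an assumption. The HTTP call itself is left to a caller-supplied function so the sketch stays independent of any HTTP library.

```python
import time
from typing import Callable, Optional

BASE_URL = "https://api.runpod.ai/v2"  # taken from the example URL in the text
# Assumption: these statuses mean the job will not change state again.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "TIMED_OUT"}


def operation_url(endpoint_id: str, operation: str,
                  job_id: Optional[str] = None,
                  ttl_ms: Optional[int] = None) -> str:
    """Build a URL such as .../v2/{endpoint_id}/status/{job_id}?ttl=6000."""
    url = f"{BASE_URL}/{endpoint_id}/{operation.strip('/')}"
    if job_id is not None:
        url += f"/{job_id}"
    if ttl_ms is not None:
        url += f"?ttl={ttl_ms}"  # milliseconds: 6000 means 6 seconds
    return url


def poll_until_done(fetch_status: Callable[[], dict],
                    interval_s: float = 1.0,
                    max_polls: int = 60,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() repeatedly until the job reaches a terminal status.

    fetch_status is a hypothetical callable that performs the /status
    request and returns the parsed JSON response as a dict.
    """
    for _ in range(max_polls):
        job = fetch_status()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_polls} polls")
```

Passing the request function in as an argument keeps authentication and transport choices out of the polling logic, and makes the loop easy to exercise with canned responses.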
Asynchronous jobs (/run): Results available for 30 minutes
Synchronous jobs (/runsync): Results available for 1 minute

Clear the queue (/purge-queue)

Operation | Method | Rate Limit | Concurrent Limit |
---|---|---|---|
/run | POST | 1000 requests per 10 seconds | 200 concurrent |
/runsync | POST | 2000 requests per 10 seconds | 400 concurrent |
/status, /status-sync, /stream | GET/POST | 2000 requests per 10 seconds | 400 concurrent |
/cancel | POST | 100 requests per 10 seconds | 20 concurrent |
/purge-queue | POST | 2 requests per 10 seconds | N/A |
/openai/* | POST | 2000 requests per 10 seconds | 400 concurrent |
/requests | GET | 10 requests per 10 seconds | 2 concurrent |
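
One way to stay under the per-operation limits in the table is to throttle on the client side. The sketch below is a sliding-window counter over a 10-second window; `SlidingWindowLimiter` is a hypothetical helper, and whether the service itself counts requests with a sliding or a fixed window is not stated here, so treat it as a conservative approximation rather than a mirror of the server's behavior.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_s-second span."""

    def __init__(self, max_requests: int, window_s: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_s = window_s
        self.clock = clock
        self._sent: Deque[float] = deque()  # timestamps of accepted requests

    def try_acquire(self) -> bool:
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True


# Limits copied from the table above: requests per 10 seconds.
RUN_LIMITER = SlidingWindowLimiter(max_requests=1000)
CANCEL_LIMITER = SlidingWindowLimiter(max_requests=100)
```

A caller would check `RUN_LIMITER.try_acquire()` before each /run request and wait briefly when it returns False. The concurrent limits in the table are a separate constraint that this counter does not enforce.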

Requests receive a 429 (Too Many Requests) status if a threshold based on endpoint.WorkersMax * 500 is exceeded.

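
A 429 (Too Many Requests) response is usually handled by retrying with exponential backoff. The following is a minimal sketch under stated assumptions: `call_with_backoff` is a hypothetical helper, `send` stands in for whatever function performs the HTTP request and returns a `(status_code, body)` pair, and the delay values are illustrative defaults rather than recommended settings.

```python
import random
import time
from typing import Any, Callable, Tuple


def call_with_backoff(send: Callable[[], Tuple[int, Any]],
                      max_attempts: int = 5,
                      base_delay_s: float = 0.5,
                      max_delay_s: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> Tuple[int, Any]:
    """Retry send() while it returns HTTP 429, waiting longer each time."""
    for attempt in range(max_attempts):
        status_code, body = send()
        if status_code != 429:
            return status_code, body  # success or a non-rate-limit error
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Exponential growth, capped, with random jitter so that many
        # clients do not retry at the same instant.
        cap = min(max_delay_s, base_delay_s * 2 ** attempt)
        sleep(cap * rng())
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Only 429 responses are retried here; other error codes are returned to the caller unchanged so that input or worker errors are not masked by repeated attempts.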
Issue | Possible Causes | Solutions |
---|---|---|
Job stuck in queue | No available workers, max workers limit reached | Increase max workers, check endpoint health |
Timeout errors | Job takes longer than execution timeout | Increase timeout in job policy, optimize job processing |
Failed jobs | Worker errors, input validation issues | Check logs, verify input format, retry with fixed input |
Rate limiting | Too many requests in short time | Implement backoff strategy, batch requests when possible |
Missing results | Results expired | Retrieve results within expiration window (30 min for async, 1 min for sync) |
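
To avoid the "Missing results" case, a client can track when each result is due to expire. The sketch below encodes the two retention windows from the table (30 minutes for async, 1 minute for sync). `result_expires_at` and `result_available` are hypothetical helpers, and measuring the window from the job's completion time is an assumption, since the text does not say when the window starts.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention windows from the text: /run results 30 min, /runsync results 1 min.
RETENTION = {
    "run": timedelta(minutes=30),
    "runsync": timedelta(minutes=1),
}


def result_expires_at(operation: str, completed_at: datetime) -> datetime:
    """Return the time after which the result may no longer be retrievable.

    Assumption: the retention window starts when the job completes.
    """
    return completed_at + RETENTION[operation.strip("/")]


def result_available(operation: str, completed_at: datetime,
                     now: Optional[datetime] = None) -> bool:
    """Return True while the result is still inside its retention window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now < result_expires_at(operation, completed_at)
```

A client that polls infrequently could compare its next planned poll time against `result_expires_at` and fetch the result earlier when the window would otherwise close first.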