Execution Modes & HTTP Queue
Explains how deAPI executes model requests over HTTP using a default queued job model (POST → request_id → GET /request-status), and outlines planned extensions like webhooks and synchronous modes.
All deAPI model endpoints follow the same execution pattern. Requests are sent as asynchronous jobs to a queue, and you fetch their status and results via a dedicated endpoint.
Default queued execution model
Every model endpoint (Text-to-Image, Text-to-Speech, Image-to-Text, Image-to-Image, Video-to-Text, Image-to-Video, Text-to-Video, Text-to-Embedding, etc.) works the same way:
You send a request to POST /api/v1/client/{task}.
The response returns a request_id.
You check progress and fetch the result via GET /api/v1/client/request-status/{job_request}.
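The submit-then-poll flow above can be sketched in Python. This is an illustrative sketch only: the placeholder host, the Bearer auth scheme, and the exact response field names ("request_id", "status") are assumptions — check the endpoint reference pages for the real schema.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

BASE_URL = "https://api.example.com"  # placeholder host, not the real endpoint

def _request_json(url: str, api_key: str, body: Optional[dict] = None) -> dict:
    """POST body as JSON if given, otherwise GET; return the parsed response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def poll_until_done(fetch_status: Callable[[], dict],
                    interval: float = 2.0,
                    max_polls: int = 150) -> dict:
    """Keep polling until the job reaches a terminal state."""
    for _ in range(max_polls):
        job = fetch_status()
        if job.get("status") in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError("job did not reach a terminal state")

def run_task(task: str, payload: dict, api_key: str) -> dict:
    # 1. Submit the job to the task endpoint.
    submitted = _request_json(f"{BASE_URL}/api/v1/client/{task}", api_key, payload)
    request_id = submitted["request_id"]
    # 2. Poll request-status until completion or failure.
    return poll_until_done(
        lambda: _request_json(
            f"{BASE_URL}/api/v1/client/request-status/{request_id}", api_key
        )
    )
```

Separating the polling loop from the HTTP calls makes it easy to adjust the interval or add backoff without touching the request code.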
The request-status response includes:
current job status (e.g. pending, running, completed, failed),
optional progress information,
result URLs and/or inline output data,
optional error details when a job fails.
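A client typically branches on these fields once a status response arrives. The sketch below assumes illustrative field names ("status", "progress", "result_urls", "error") — see the Get Results reference for the actual schema.

```python
def summarize_status(job: dict) -> str:
    """Turn a request-status payload into a short human-readable summary."""
    status = job.get("status")
    if status == "completed":
        urls = job.get("result_urls", [])
        return f"done: {len(urls)} result file(s)"
    if status == "failed":
        return f"failed: {job.get('error', 'unknown error')}"
    # Still in progress: show percentage if the response carried one.
    progress = job.get("progress")
    return f"{status} ({progress}%)" if progress is not None else str(status)
```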
This queued model is the default in deAPI because it:
keeps long-running jobs (image/video generation, transcription) off the critical HTTP path,
avoids common HTTP timeout issues,
lets deAPI schedule work efficiently across a distributed GPU network,
gives you one consistent pattern across all model endpoints.
For low-level details, see:
Get Results – GET /api/v1/client/request-status/{job_request}
the individual API pages under API (Text-to-Image, Text-to-Speech, etc.)
Future extensions: webhooks and synchronous HTTP
On top of the queue model described above, we plan to add:
Webhooks for queued jobs – instead of polling request-status, you will be able to provide a webhook URL that deAPI will call when the job finishes.
Additional HTTP modes (e.g. synchronous requests) – for small, low-latency jobs where returning the result in a single HTTP response makes sense.
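Since webhooks are a planned feature, any payload shape is purely hypothetical; the sketch below assumes a delivery payload modeled on the request-status response and shows the kind of handler you might wire into your own web framework.

```python
def handle_webhook(payload: dict) -> bool:
    """Hypothetical webhook handler: return True if the finished job succeeded.

    The "status" and "error" fields are assumptions borrowed from the
    request-status response; the real webhook payload may differ.
    """
    if payload.get("status") == "completed":
        # e.g. download result URLs, update your own job table, etc.
        return True
    # Failed jobs would arrive with optional error details.
    return False
```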
Until these features are available, the queued execution model described on this page is the recommended way to run all model requests over HTTP.