Models
This page is the single, canonical reference for which models are available right now, how to choose the right one, and how to address models in API calls.
deAPI gives you a unified API across multiple open-source models running on a decentralized GPU cloud. Models evolve frequently; always rely on the live list (endpoint at the bottom of this page) for current availability and IDs.
How model selection works
- Every task requires a `model` parameter. Example: `model: "Flux1schnell"` for Text-to-Image or `model: "WhisperLargeV3"` for speech/video transcription.
- **Display names vs. API IDs.** In tables and the UI we show human-friendly names (e.g., "FLUX.1-schnell"). The API accepts stable IDs (lowercase, hyphen/period separated). Use the Models endpoint to fetch the exact `id` strings.
- **Quality ↔ Speed trade-off.** Larger models often yield higher quality but cost more and take longer. Use our Price Calculator on the homepage to estimate cost before running large jobs.
- **Versioning & lifecycle.** Models may be updated, superseded, or deprecated. Your application should resolve the model ID at runtime (from the live list, as in the sketch below) or pin a specific version string if reproducibility is critical.
- **Safety & acceptable use.** Follow the Terms of Service. Some content types may be blocked or filtered. See the Safety section in each task's docs.
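A minimal sketch of runtime resolution, using only Python's standard library. The response shape (a JSON array of objects with an `id` field) is an assumption here; inspect a real response from the Models endpoint to confirm it.

```python
import json
import os
import urllib.request

API_BASE = "https://api.deapi.ai/api/v1"

def list_model_ids(api_key: str) -> set[str]:
    """Fetch the live model list and return the set of usable IDs."""
    req = urllib.request.Request(
        f"{API_BASE}/client/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        models = json.load(resp)  # assumed: a JSON array of {"id": ...} objects
    return {m["id"] for m in models}

live_ids = list_model_ids(os.environ["DEAPI_API_KEY"])

# Pin an ID for reproducibility, but verify it is still live before using it.
pinned = "Flux1schnell"
assert pinned in live_ids, f"{pinned} is no longer available; pick a successor"
```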
Supported tasks & models (curated)
The table below lists the core, production-ready models available today. For the authoritative list (including experimental or newly added ones), use the Models endpoint.
| Task | What it does | Typical uses | Model (display name) | Example API ID | Endpoint |
|---|---|---|---|---|---|
| Text-to-Image | Generate images from text | Concept art, prototyping, creative exploration | FLUX.1-schnell, Z-Image-Turbo INT8 | `Flux1schnell`, `ZImageTurbo_INT8` | `txt2img` |
| Text-to-Speech | Turn text into natural voice | Narration, accessibility, product voices in multiple languages and tones | Kokoro-82M | `Kokoro` | `txt2audio` |
| Video-to-Text | Transcribe video into text | Subtitles, captions, indexing, SEO, datasets (with optional timestamps) | Whisper large-v3 | `WhisperLargeV3` | `video2txt` |
| Image/Text-to-Video | Generate short AI videos | Cinematic motion, transitions, stylization | LTX-Video-0.9.8 13B | `Ltxv_13B_0_9_8_Distilled_FP8` | `img2video`, `txt2video` |
| Image-to-Text | Extract meaning from images | Descriptions, OCR, accessibility, moderation | Nanonets_Ocr_S_F16 | `Nanonets-Ocr-S-F16` | `img2txt` |
| Audio-to-Text | Convert audio into text | Subtitles, notes, search, accessibility (multi-language) | Whisper large-v3 | `WhisperLargeV3` | `video2txt` |
| Text-to-Embedding | Create vector embeddings | Search, RAG, semantic similarity, clustering | BGE M3 | `Bge_M3_FP16` | `txt2embedding` |
| Image-to-Image | Transform existing images | Style transfer, edits, in/outpainting | QwenImageEdit-Plus (NF4) | `QwenImageEdit_Plus_NF4` | `img2img` |
Note: Display names and example IDs above are provided for clarity; always fetch the live model list for the exact `id` strings currently enabled on the network.
Picking the right model
- **Image Generation (Text-to-Image):** Start with `Flux1schnell` or `ZImageTurbo_INT8` for fast iteration. Increase steps/resolution for quality; control style via prompt and (optionally) LoRAs.
- **Speech Generation (TTS):** `Kokoro` offers natural prosody across multiple languages/voices. Choose voices/language in the payload; test speed vs. quality for your use case.
- **Text Transcription (Video/Audio-to-Text):** `WhisperLargeV3` is a strong general-purpose baseline with robust multilingual support. For long videos, enable timestamps and chunking.
- **Text Recognition (Image-to-Text):** `Nanonets-Ocr-S-F16` targets clear captions and text extraction. For complex layouts, consider multiple passes or post-processing.
- **Video Generation (Image-to-Video / Text-to-Video):** `Ltxv_13B_0_9_8_Distilled_FP8` is suited for short clips and stylized motion. Start with a low duration/frame count to validate aesthetics, then scale up.
- **Embedding (Text-to-Embedding):** `Bge_M3_FP16` provides dense vector embeddings for semantic search, clustering, and retrieval-augmented generation (RAG). Use it for similarity queries or knowledge-base indexing (see the sketch after this list).
- **Image Transformation (Image-to-Image):** For edits and style transfer, start with `QwenImageEdit_Plus_NF4`. Use fewer steps for quick drafts and increase steps for higher fidelity; combine with masks or control prompts for targeted changes.
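As a concrete example of the embedding workflow, here is a minimal Python sketch that embeds two texts with `Bge_M3_FP16` and compares them by cosine similarity. The request/response field names (`text` in the payload, `embedding` in the response) are assumptions, and the sketch assumes a synchronous response; if the endpoint returns a `request_id` instead, combine it with the polling pattern shown under API usage examples.

```python
import json
import math
import os
import urllib.request

API_KEY = os.environ["DEAPI_API_KEY"]

def embed(text: str) -> list[float]:
    # "text" (request) and "embedding" (response) are assumed field names;
    # check the txt2embedding docs for the real schema.
    body = json.dumps({"model": "Bge_M3_FP16", "text": text}).encode()
    req = urllib.request.Request(
        "https://api.deapi.ai/api/v1/client/txt2embedding",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Higher cosine score means the texts are semantically closer.
score = cosine(embed("reset my password"), embed("how do I recover my account?"))
print(f"similarity: {score:.3f}")
```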
For a deeper discussion of strengths/limits and typical parameter sets per task, see Model Selection.
API usage examples
1) List models (fetch IDs at runtime)
```bash
curl -X GET "https://api.deapi.ai/api/v1/client/models" \
  -H "Authorization: Bearer $DEAPI_API_KEY"
```

2) Use a model in Text-to-Image

```bash
curl -X POST "https://api.deapi.ai/api/v1/client/txt2img" \
  -H "Authorization: Bearer $DEAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "isometric cozy cabin at dusk, soft rim light, artstation trending",
    "model": "Flux1schnell",
    "width": 768,
    "height": 768,
    "steps": 25,
    "guidance": 3.5,
    "seed": 12345
  }'
```

3) Use a model in Video-to-Text (YouTube)

```bash
curl -X POST "https://api.deapi.ai/api/v1/client/vid2txt" \
  -H "Authorization: Bearer $DEAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "youtube_url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "include_ts": true,
    "model": "WhisperLargeV3"
  }'
```

Jobs return a `request_id`. Poll results with `GET /api/v1/client/request-status/{request_id}`.
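A minimal polling sketch in Python, under the assumption that the status payload exposes a `status` field with in-progress values like `pending`/`processing`; confirm the actual field names and values against a real response.

```python
import json
import os
import time
import urllib.request

API_BASE = "https://api.deapi.ai/api/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['DEAPI_API_KEY']}"}

def wait_for_result(request_id: str, timeout_s: float = 300.0,
                    interval_s: float = 2.0) -> dict:
    """Poll request-status until the job leaves its in-progress state or we time out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        req = urllib.request.Request(
            f"{API_BASE}/client/request-status/{request_id}", headers=HEADERS
        )
        with urllib.request.urlopen(req) as resp:
            payload = json.load(resp)
        # "status", "pending", and "processing" are assumed names/values.
        if payload.get("status") not in ("pending", "processing"):
            return payload
        time.sleep(interval_s)
    raise TimeoutError(f"request {request_id} did not finish in {timeout_s}s")
```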
Best practices
- **Resolve the model list dynamically.** Don't hardcode IDs; fetch the list once at startup or periodically.
- **Pin versions for reproducibility.** If you need bit-for-bit repeats (e.g., in T2I), pin the model version and set a seed.
- **Budget before scaling.** Larger models and higher resolution/step counts cost more; use the calculator on the homepage.
- **Handle deprecation.** Implement a fallback path if a model becomes unavailable (e.g., switch to a recommended successor), as in the sketch below.
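One way to implement such a fallback, sketched in Python: keep an ordered preference list (current model first, recommended successor next) and pick the first ID that appears in the live list fetched from the Models endpoint.

```python
def resolve_model(preferences: list[str], live_ids: set[str]) -> str:
    """Return the first preferred model that is still live."""
    for model_id in preferences:
        if model_id in live_ids:
            return model_id
    raise RuntimeError(f"none of {preferences} are currently available")

# live_ids would come from GET /client/models (see the listing example above).
live_ids = {"ZImageTurbo_INT8", "Kokoro"}  # stand-in for the fetched list
print(resolve_model(["Flux1schnell", "ZImageTurbo_INT8"], live_ids))
```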
Related docs
Model Selection: how to choose the best model per task and budget.
Live models (API)
Use the endpoint below to retrieve the current, authoritative list of models (with id strings to use in requests):
```
GET https://api.deapi.ai/api/v1/client/models
```

(The response includes stable `id` values to pass as the `model` parameter in any task endpoint.)