Task Types
Task types define the input/output contract for jobs and workflow nodes. Each model supports one or more task types — see the Models reference for which models support which tasks.
Overview
| Task Type | Category | Sync | Description |
|---|---|---|---|
| openai/chat-completion | Text | Yes | Chat completion (LLM) |
| openai/chat-completion/vision | Text | Yes | Chat completion with image input |
| openai/chat-completion/ocr | OCR | Yes | OCR via vision model |
| openai/embeddings | Text | Yes | Text embeddings |
| openai/rerank | Text | Yes | Document reranking |
| openai/score | Text | Yes | Text pair similarity scoring |
| openai/audio-speech | Audio | Yes | Text-to-speech |
| openai/audio-transcription | Audio | Yes | Speech-to-text |
| openai/image-generation | Image | Yes | Image generation (OpenAI format) |
| fal/text-to-image | Image | Yes | Image generation (Fal format) |
| fal/image-edit | Image | Yes | Image editing / inpainting |
| fal/text-to-video | Video | No | Text-to-video generation |
| fal/image-to-video | Video | No | Image-to-video animation |
| fal/speech-to-video | Video | No | Speech-driven video (talking head) |
| fal/audio-transcription | Audio | Yes | Speech-to-text (Fal format) |
| fal/video-interpolate | Video | No | Frame interpolation (slow motion) |
| fal/video-upscale | Video | No | Video super-resolution |

Sync = the task can be dispatched synchronously (result returned inline). Async tasks return a request_id for polling.
Text Tasks
openai/chat-completion
Standard chat completion following the OpenAI API format.
| Field | Type | Required | Description |
|---|---|---|---|
| messages | array | Yes | Array of {role, content} message objects |
| model | string | No | Model ID (set automatically when using a specific endpoint) |
| temperature | number | No | Sampling temperature (0-2) |
| top_p | number | No | Nucleus sampling |
| max_tokens | number | No | Maximum tokens to generate |
| stream | boolean | No | Enable streaming response |
| stop | string/array | No | Stop sequences |
| tools | array | No | Tool/function definitions |
Output: choices[0].message.content (text)
Models: qwen3-0.6b, qwen3.5-4b, qwen3.5-9b, gpt-oss-20b, deepseek-ocr-v1, deepseek-ocr-v2
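A minimal request payload using the fields above (the prompt text and parameter values are illustrative):

```json
{
  "model": "qwen3.5-4b",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Summarize the plot of Hamlet in one sentence." }
  ],
  "temperature": 0.7,
  "max_tokens": 256
}
```

The generated text is read from choices[0].message.content in the response.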
openai/chat-completion/vision
Chat completion with image input. Same parameters as openai/chat-completion, but messages can include image content:

```json
{
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What's in this image?" },
        { "type": "image_url", "image_url": { "url": "https://..." } }
      ]
    }
  ]
}
```

Models: qwen3.5-4b, qwen3.5-9b
openai/chat-completion/ocr
OCR via a vision-capable model. Uses the same message format as vision, optimized for text extraction from images.
Models: deepseek-ocr-v1, deepseek-ocr-v2
openai/embeddings
Generate vector embeddings for text.
| Field | Type | Required | Description |
|---|---|---|---|
| input | string/array | Yes | Text or array of texts to embed |
| encoding_format | string | No | float or base64 |
| dimensions | number | No | Output dimensions |
Output: data[0].embedding (float array)
Models: qwen3-0.6b-embed, gpt-oss-20b-embed
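An illustrative payload embedding two texts in one request (the input strings are examples):

```json
{
  "input": [
    "The quick brown fox jumps over the lazy dog.",
    "A fast auburn fox leaps above a sleepy hound."
  ],
  "encoding_format": "float"
}
```

Each element of the response's data array carries the embedding vector for the corresponding input.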
openai/rerank
Rerank documents by relevance to a query.
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID |
| query | string | Yes | Search query |
| documents | array | Yes | Array of strings or {text} objects |
| top_n | number | No | Number of top results to return |
| return_documents | boolean | No | Include document text in results |
Output: Ranked documents with relevance scores
Models: qwen3-0.6b-score, gpt-oss-20b-score
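An example rerank request (query and document strings are illustrative):

```json
{
  "model": "qwen3-0.6b-score",
  "query": "how do I reset my password?",
  "documents": [
    "Resetting your password requires a verified email address.",
    "Our pricing plans start at $10 per month.",
    "Two-factor authentication can be enabled in settings."
  ],
  "top_n": 2,
  "return_documents": true
}
```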
openai/score
Compute similarity scores between text pairs.
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID |
| text_1 | string/array | Yes | First text(s) |
| text_2 | string/array | Yes | Second text(s) |
Output: Similarity scores
Models: qwen3-0.6b-score, gpt-oss-20b-score
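An example scoring request comparing one text against two candidates (values illustrative; text_1 and text_2 may each be a string or an array):

```json
{
  "model": "qwen3-0.6b-score",
  "text_1": "The cat sat on the mat.",
  "text_2": [
    "A cat is sitting on a rug.",
    "Stock prices fell sharply on Monday."
  ]
}
```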
Audio Tasks
openai/audio-speech
Convert text to spoken audio.
| Field | Type | Required | Description |
|---|---|---|---|
| input | string | Yes | Text to speak |
| voice | string | No | Voice ID (model-specific) |
| response_format | string | No | mp3, opus, aac, flac, wav, pcm |
| speed | number | No | Playback speed multiplier |
Output: audio_url (audio file URL)
Models: qwen3-tts (9 voices), fox-tts (150+ voices)
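An example text-to-speech payload. The voice ID shown is a placeholder; valid voice IDs depend on the model:

```json
{
  "input": "Welcome to the demo. This sentence will be spoken aloud.",
  "voice": "example-voice-id",
  "response_format": "mp3",
  "speed": 1.0
}
```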
openai/audio-transcription
Transcribe audio to text.
| Field | Type | Required | Description |
|---|---|---|---|
| audio_url | string | Yes | URL to the audio file |
| language | string | No | Language code (ISO 639-1) |
| task | string | No | transcribe or translate |
Also supports multipart file upload with a file field.
Output: text (transcribed text)
Models: whisper-large-v3
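An example transcription request (the URL is a placeholder):

```json
{
  "audio_url": "https://example.com/meeting.wav",
  "language": "en",
  "task": "transcribe"
}
```

The transcribed text is returned in the text field.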
fal/audio-transcription
Same as openai/audio-transcription but using the Fal request format.
| Field | Type | Required | Description |
|---|---|---|---|
| audio_url | string | Yes | URL to the audio file |
| language | string | No | Language code |
| task | string | No | transcribe or translate |
Output: text
Models: whisper-large-v3
Image Tasks
openai/image-generation
Generate images using the OpenAI-compatible format.
| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Image description |
| n | number | No | Number of images |
| size | string | No | Image dimensions |
| quality | string | No | standard or hd |
| response_format | string | No | url or b64_json |
Output: data[0].url (image URL)
Models: qwen-image-2512
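An example generation request. The size value shown is a typical OpenAI-style dimension string; check the model's supported sizes:

```json
{
  "prompt": "a watercolor painting of a lighthouse at dusk",
  "n": 1,
  "size": "1024x1024",
  "quality": "standard",
  "response_format": "url"
}
```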
fal/text-to-image
Generate images using the Fal format, which exposes more parameters than the OpenAI format.
| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Image description |
| negative_prompt | string | No | What to avoid |
| image_size | string | No | square_hd, square, portrait_4_3, portrait_16_9, landscape_4_3, landscape_16_9 |
| num_inference_steps | number | No | Denoising steps |
| guidance_scale | number | No | Prompt adherence strength |
| num_images | number | No | Number of images (max 10) |
| seed | number | No | Reproducibility seed |
| loras | array | No | LoRA configs [{path, scale}] (scale 0-4) |
| enable_safety_checker | boolean | No | Content safety filter |
| output_format | string | No | Output image format |
Output: images[0].url (image URL)
Models: nunchaku-flux1-schnell, sglang-diffusion-flux2-klein-4b, qwen-image-2512, qwen-image-2512-lightx2v-fp8, sglang-diffusion-qwen-image-2512-fp8
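An example Fal-format generation request. The step count and guidance scale shown are illustrative; useful values are model-dependent:

```json
{
  "prompt": "a cozy cabin in a snowy forest, golden hour lighting",
  "negative_prompt": "blurry, low quality",
  "image_size": "landscape_16_9",
  "num_inference_steps": 4,
  "guidance_scale": 3.5,
  "num_images": 1,
  "seed": 42
}
```

Fixing seed makes the result reproducible across runs with identical parameters.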
fal/image-edit
Edit or inpaint images.
| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Edit instruction |
| image_url | string | Yes | Source image URL |
| mask_url | string | No | Mask for inpainting |
| strength | number | No | Edit strength (0-1) |
| num_inference_steps | number | No | Denoising steps |
| guidance_scale | number | No | Prompt adherence |
| num_images | number | No | Number of results (max 10) |
| seed | number | No | Reproducibility seed |
| loras | array | No | LoRA configs |
Output: images[0].url (image URL)
Models: nunchaku-flux1-schnell, sglang-diffusion-flux2-klein-4b, qwen-image-edit-2511, qwen-image-edit-2511-lightx2v-fp8, sglang-diffusion-qwen-image-edit-2511-fp8
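An example edit request without a mask (URLs are placeholders; omitting mask_url applies the edit to the whole image rather than a masked region):

```json
{
  "prompt": "replace the sky with a starry night",
  "image_url": "https://example.com/photo.png",
  "strength": 0.8,
  "num_images": 1
}
```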
Video Tasks
All video tasks are async only; they return a request_id for polling.
fal/text-to-video
Generate video from a text prompt.
| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Video description |
| resolution | string | No | 480p, 720p |
| aspect_ratio | string | No | 16:9, 9:16, 1:1 |
| num_inference_steps | number | No | Denoising steps |
| guidance_scale | number | No | Prompt adherence |
| num_frames | number | No | Number of frames |
| fps | number | No | Frames per second |
| seed | number | No | Reproducibility seed |
| output_format | string | No | Video format |
Output: video.url (video URL)
Models: wan22-ti2v, sglang-diffusion-wan22-t2v-a14b-fp8, ltx2-distilled
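An example request. The frame count and fps shown are illustrative; supported ranges are model-specific. Because this task is async, the immediate response carries a request_id, and video.url appears in the polled result:

```json
{
  "prompt": "a drone shot over a rocky coastline at sunrise",
  "resolution": "720p",
  "aspect_ratio": "16:9",
  "num_frames": 81,
  "fps": 16,
  "seed": 7
}
```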
fal/image-to-video
Animate a still image into video.
| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Motion description |
| image_url | string | Yes | Source image URL |
| resolution | string | No | 480p, 720p |
| aspect_ratio | string | No | 16:9, 9:16, 1:1 |
| num_inference_steps | number | No | Denoising steps |
| guidance_scale | number | No | Prompt adherence |
| num_frames | number | No | Number of frames |
| fps | number | No | Frames per second |
| seed | number | No | Reproducibility seed |
Output: video.url (video URL)
Models: wan22-ti2v, ltx2-distilled
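An example request animating a still image (the URL and prompt are placeholders). Note that here prompt describes the desired motion, not the image content:

```json
{
  "prompt": "the leaves sway gently in the wind, soft camera push-in",
  "image_url": "https://example.com/still.jpg",
  "resolution": "480p",
  "aspect_ratio": "16:9"
}
```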
fal/speech-to-video
Generate talking-head video driven by audio.
| Field | Type | Required | Description |
|---|---|---|---|
| audio_url | string | Yes | Audio file URL |
| image_url | string | Yes | Face/character image URL |
| prompt | string | No | Additional scene description |
| num_inference_steps | number | No | Denoising steps |
| guidance_scale | number | No | Prompt adherence |
| seed | number | No | Reproducibility seed |
Output: video.url (video URL)
Models: wan22-s2v
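An example request pairing a voice clip with a character image (URLs are placeholders):

```json
{
  "audio_url": "https://example.com/narration.mp3",
  "image_url": "https://example.com/portrait.png",
  "prompt": "the speaker smiles warmly while talking"
}
```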
fal/video-interpolate
Increase video frame rate (slow motion effect).
| Field | Type | Required | Description |
|---|---|---|---|
| video_url | string | Yes | Source video URL |
| multiplier | number | No | Frame rate multiplier (2-8, default 2) |
| scene_detect | boolean | No | Detect scene changes (default true) |
Output: video.url (video URL)
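An example request (the URL is a placeholder). With multiplier set to 4, a 24 fps clip is interpolated to 96 fps; scene_detect avoids interpolating across hard cuts:

```json
{
  "video_url": "https://example.com/clip.mp4",
  "multiplier": 4,
  "scene_detect": true
}
```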
fal/video-upscale
Upscale video resolution.
| Field | Type | Required | Description |
|---|---|---|---|
| video_url | string | Yes | Source video URL |
| upscale_factor | number | No | Scale factor (1-10) |
| seed | number | No | Reproducibility seed |
Output: video.url (video URL)
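An example request (the URL is a placeholder). An upscale_factor of 2 doubles both dimensions, so a 480p source would come back at roughly 960p:

```json
{
  "video_url": "https://example.com/clip.mp4",
  "upscale_factor": 2
}
```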
Workflow Usage
Task types are used as the task field in workflow DAG nodes. Each node specifies a task type and a model, and can wire outputs from upstream nodes into its inputs.

```json
{
  "nodes": [
    {
      "id": "generate",
      "task": "fal/text-to-image",
      "model": "nunchaku-flux1-schnell",
      "payload": { "prompt": "{{input.prompt}}" }
    }
  ]
}
```

See the Workflows guide for details on building multi-step pipelines.