InfiniteTalk Fast Video-To-Video

InfiniteTalk Fast is an audio-driven model that turns one video plus an audio track into a realistic talking or singing video with lip-sync. It is available through a ready-to-use REST inference API with no cold starts and per-run pricing.

Use one of our client libraries to get started quickly.

Video output | ~10–30s per 1 second of video | from $0.17/run

1. Calling the API

Submit a request

Send a POST request to start generation. The API returns immediately with a prediction ID for polling.

curl -X POST "https://api.vibedream.ai/api/v1/models/video-to-video/generate" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $VIBEDREAM_API_KEY" \
  -d '{
    "audio": "https://example.com/your-audio.mp3",
    "video": "https://example.com/your-video.mp4",
    "mask_image": "https://example.com/your-mask_image.jpg",
    "prompt": "A beautiful sunset over mountains with golden light",
    "seed": 0
}'
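For callers working in Python rather than curl, the same request can be sketched with the standard library. This is a minimal sketch, not an official client: the helper names build_generate_request and submit are hypothetical, while the endpoint, headers, and field names are copied from the curl example above.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
GENERATE_URL = "https://api.vibedream.ai/api/v1/models/video-to-video/generate"


def build_generate_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example.

    Hypothetical helper name; not part of any official client library.
    """
    return urllib.request.Request(
        GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def submit(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without touching the network. A call would look like submit(build_generate_request(api_key, {"audio": "...", "video": "..."})).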

2. Authentication

The API authenticates requests with an API key. Send it as a Bearer token in the Authorization header of every request.

Get your API Key

Get your API key from vibedream.ai/models/api-keys.

Environment variable

export VIBEDREAM_API_KEY="your-api-key"
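In Python, the exported variable can be read and turned into the Authorization header used by the curl examples. A minimal sketch, assuming the key lives in VIBEDREAM_API_KEY as shown above; the function name auth_headers is hypothetical.

```python
import os
from typing import Mapping


def auth_headers(environ: Mapping[str, str] = os.environ) -> dict:
    """Return the Authorization header built from VIBEDREAM_API_KEY.

    Hypothetical helper. Raises early with a clear message if the
    variable is missing or blank, instead of sending an unauthenticated call.
    """
    api_key = environ.get("VIBEDREAM_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("VIBEDREAM_API_KEY is not set; export it before calling the API")
    return {"Authorization": f"Bearer {api_key}"}
```

Reading the key from the environment keeps it out of source files and shell history.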

3. Queue & Results

Generation requests are queued and processed asynchronously. Poll the prediction endpoint until status is completed or failed.

Submit request

curl -X POST "https://api.vibedream.ai/api/v1/models/video-to-video/generate" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $VIBEDREAM_API_KEY" \
  -d '{
    "audio": "https://example.com/your-audio.mp3",
    "video": "https://example.com/your-video.mp4",
    "mask_image": "https://example.com/your-mask_image.jpg",
    "prompt": "A beautiful sunset over mountains with golden light",
    "seed": 0
}'

Response

Returns immediately with a prediction ID. Use id to poll for results.

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "statusUrl": "https://api.vibedream.ai/api/v1/predictions/550e8400-e29b-41d4-a716-446655440000",
  "estimatedTime": "10–30s per 1 second of video",
  "costCents": 17,
  "createdAt": "2025-01-15T12:00:00.000Z"
}
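The submit response carries everything needed for the next step. The sketch below picks out the polling URL and converts costCents to dollars. It assumes the response has the shape shown above; the function name summarize_submission is hypothetical, and the fallback URL is built from the predictions endpoint pattern used elsewhere on this page.

```python
# Base of the predictions endpoint, as it appears in statusUrl above.
PREDICTIONS_URL = "https://api.vibedream.ai/api/v1/predictions"


def summarize_submission(response: dict) -> dict:
    """Extract the prediction id, polling URL, and cost from a submit response.

    Hypothetical helper. Prefers statusUrl when present and otherwise
    builds the polling URL from the id.
    """
    prediction_id = response["id"]
    poll_url = response.get("statusUrl") or f"{PREDICTIONS_URL}/{prediction_id}"
    cost_cents = response.get("costCents")
    return {
        "id": prediction_id,
        "poll_url": poll_url,
        # costCents is an integer number of cents, so 17 means $0.17.
        "cost_usd": None if cost_cents is None else f"${cost_cents / 100:.2f}",
    }
```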

Get the result

Polling

# Replace YOUR_PREDICTION_ID with the id from the submit response
curl "https://api.vibedream.ai/api/v1/predictions/YOUR_PREDICTION_ID" \
  -H "Authorization: Bearer $VIBEDREAM_API_KEY"

# Poll every 1-2s until status is "completed" or "failed"
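The polling loop described above can be sketched in Python as follows. This is a sketch under stated assumptions, not an official client: fetch_prediction and wait_for_prediction are hypothetical names, the 1.5-second interval sits inside the suggested 1-2s range, and the 600-second timeout is an arbitrary safety limit that this page does not specify.

```python
import json
import time
import urllib.request
from typing import Callable

PREDICTIONS_URL = "https://api.vibedream.ai/api/v1/predictions"
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_prediction(prediction_id: str, api_key: str) -> dict:
    """GET the prediction once and decode its JSON body."""
    request = urllib.request.Request(
        f"{PREDICTIONS_URL}/{prediction_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_prediction(
    fetch: Callable[[], dict],
    interval_s: float = 1.5,   # within the suggested 1-2s polling interval
    timeout_s: float = 600.0,  # assumed safety limit, not an API value
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch() until status is completed or failed, then return the body."""
    deadline = clock() + timeout_s
    while True:
        prediction = fetch()
        if prediction.get("status") in TERMINAL_STATUSES:
            return prediction
        if clock() >= deadline:
            raise TimeoutError("prediction did not finish before the timeout")
        sleep(interval_s)
```

A typical call would be wait_for_prediction(lambda: fetch_prediction(prediction_id, api_key)). Passing fetch, sleep, and clock as parameters lets the loop be exercised without network access or real delays.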

Completed response

outputs is a string[]: an array of direct download URLs hosted on assets.vibedream.ai.

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "modelId": "video-to-video",
  "modelName": "InfiniteTalk Fast Video-To-Video",
  "status": "completed",
  "outputs": [
    "https://assets.vibedream.ai/outputs/550e8400-e29b-41d4-a716-446655440000/1736942400000-0.mp4"
  ],
  "error": null,
  "createdAt": "2025-01-15T12:00:00.000Z",
  "completedAt": "2025-01-15T12:00:30.000Z"
}

Failed response

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "modelId": "video-to-video",
  "modelName": "InfiniteTalk Fast Video-To-Video",
  "status": "failed",
  "outputs": null,
  "error": "Your request was flagged by content moderation. Please modify your prompt.",
  "createdAt": "2025-01-15T12:00:00.000Z",
  "completedAt": "2025-01-15T12:00:05.000Z"
}

Field        Type              Description
id           string            Unique prediction ID (UUID).
modelId      string            ID of the model used for generation.
modelName    string            Human-readable model name.
status       string            Current status. One of: processing, completed, failed.
outputs      string[] | null   Array of output URLs. Each URL is a direct download link to the generated file on assets.vibedream.ai. null when still processing or failed.
error        string | null     Error message if the generation failed. null on success.
createdAt    string            ISO 8601 timestamp when the request was submitted.
completedAt  string | null     ISO 8601 timestamp when generation finished. null while processing.
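Because outputs and error are each null in some states, client code has to branch on status before reading them. A minimal Python sketch of that branching, assuming the response shapes shown above; PredictionFailed and extract_outputs are hypothetical names.

```python
from typing import List


class PredictionFailed(Exception):
    """Raised when a prediction finishes with status "failed"."""


def extract_outputs(prediction: dict) -> List[str]:
    """Return the output URLs of a completed prediction.

    Raises PredictionFailed with the API's error message on failure, and
    ValueError if the prediction has not reached a final status yet.
    """
    status = prediction.get("status")
    if status == "completed":
        # outputs is a list of download URLs once generation has completed.
        return list(prediction.get("outputs") or [])
    if status == "failed":
        raise PredictionFailed(prediction.get("error") or "unknown error")
    raise ValueError(f"prediction is not finished yet (status={status!r})")
```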

4. Schema

Input

Parameter   Type    Required  Default  Range               Description
audio       string  Yes       --       --                  Upload the audio file to sync with the video. Pass a public audio URL. Accepted: MP3, WAV, M4A.
video       string  Yes       --       --                  Upload the base video for the transformation. Pass a public video URL. Accepted: MP4, WebM, MOV.
mask_image  string  No        --       --                  Upload a mask image to control which regions can move (optional). Pass a public image URL. Accepted: JPEG, PNG, GIF, WebP.
prompt      string  No        --       --                  Specify the style, pose or expressions you want to guide your video (optional).
seed        number  No        --       -- – 1000 (step 1)  Set the seed for reproducibility of the results (optional).
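The input rules above can be checked on the client before a request is sent. The sketch below is illustrative only: validate_input is a hypothetical helper, the http/https check is an assumed reading of "public URL", and the seed check enforces only the stated maximum of 1000 and whole-number step, because the lower bound is not given in the table.

```python
from typing import List
from urllib.parse import urlparse

REQUIRED_URLS = ("audio", "video")
SEED_MAX = 1000  # upper bound from the Range column; lower bound is not stated


def _is_http_url(value: object) -> bool:
    """Assumption: a "public URL" means an http(s) URL with a host."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def validate_input(payload: dict) -> List[str]:
    """Return a list of problems found in the payload (empty when none)."""
    errors = []
    for name in REQUIRED_URLS:
        if not _is_http_url(payload.get(name)):
            errors.append(f"{name} is required and must be a public URL")
    if "mask_image" in payload and not _is_http_url(payload["mask_image"]):
        errors.append("mask_image must be a public URL")
    if "prompt" in payload and not isinstance(payload["prompt"], str):
        errors.append("prompt must be a string")
    if "seed" in payload:
        seed = payload["seed"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(seed, bool) or not isinstance(seed, int):
            errors.append("seed must be a whole number (step 1)")
        elif seed > SEED_MAX:
            errors.append(f"seed must not exceed {SEED_MAX}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.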

Example request

{
    "audio": "https://example.com/your-audio.mp3",
    "video": "https://example.com/your-video.mp4",
    "mask_image": "https://example.com/your-mask_image.jpg",
    "prompt": "A beautiful sunset over mountains with golden light",
    "seed": 0
}

Output

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "modelId": "video-to-video",
  "modelName": "InfiniteTalk Fast Video-To-Video",
  "status": "completed",
  "outputs": [
    "https://assets.vibedream.ai/outputs/550e8400-e29b-41d4-a716-446655440000/1736942400000-0.mp4"
  ],
  "error": null,
  "createdAt": "2025-01-15T12:00:00.000Z",
  "completedAt": "2025-01-15T12:00:30.000Z"
}