Bring your audio into another language.
Dubbing and Translation takes spoken audio, transcribes it with a Whisper-class model, translates the script, and re-voices it in a new language on a multilingual engine. Three real steps, chained into one flow and live today.

Record once. Ship everywhere.
Dubbing is three real steps. We transcribe your recording, translate the script into the language you pick (you can edit every word before it is voiced), and re-voice it on the multilingual engine.
01 Transcribe 02 Translate 03 Re-voice
- Spanish: Translated + re-voiced (shown on the homepage)
- French: Translated + re-voiced
- German: Translated + re-voiced
- Italian: Translated + re-voiced
- Portuguese: Translated + re-voiced
- Japanese: Translated + re-voiced
- Hindi: Translated + re-voiced
- Arabic: Translated + re-voiced
Dubbing carries both your words and their delivery through Gemini Flash, the multilingual engine. It does not lip-sync video, and you review and edit the translated script before anything is re-voiced.
From source audio to a dub you own.
Step 1: Transcribe the source
Your original audio is turned into text by a Whisper-class model. Edit the transcript before you translate it.
Step 2: Translate the script
The script is translated into one of eight languages, with bracketed [stage directions] kept in place. Review it before voicing.
Step 3: Re-voice it
Gemini Flash, the multilingual engine, speaks the translated script in the new language, in a voice you choose.
Step 4: Export and own
Download the dubbed audio as MP3 or WAV, yours to publish commercially.
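For readers who think in code, the four steps above can be sketched as one small function. This is an illustration only, not the product's API: `dub`, `DubResult`, and the stage signatures are hypothetical names, and the three stages are passed in as plain callables so the sketch runs with stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stage signatures, assumed for illustration only.
Transcriber = Callable[[bytes], str]        # source audio -> transcript
Translator = Callable[[str, str], str]      # (text, target language) -> translated script
Voicer = Callable[[str, str, str], bytes]   # (script, language, voice) -> dubbed audio
Reviewer = Callable[[str], str]             # a human edit pass between stages


@dataclass
class DubResult:
    transcript: str
    translation: str
    audio: bytes


def dub(
    audio: bytes,
    target_language: str,
    voice: str,
    transcribe: Transcriber,
    translate: Translator,
    revoice: Voicer,
    review: Optional[Reviewer] = None,
) -> DubResult:
    """Chain transcribe -> translate -> re-voice, with a review pass after each text stage."""
    edit = review or (lambda text: text)  # no reviewer means the text passes through unchanged
    transcript = edit(transcribe(audio))
    translation = edit(translate(transcript, target_language))
    dubbed = revoice(translation, target_language, voice)
    return DubResult(transcript, translation, dubbed)


# Usage with stand-in stages; a real setup would call actual models here.
result = dub(
    b"raw-audio-bytes",
    "Spanish",
    "voice-a",
    transcribe=lambda a: "Hello [warmly] world",
    translate=lambda text, lang: f"{lang}: {text}",
    revoice=lambda text, lang, voice: text.encode("utf-8"),
)
```

The review hook runs on the transcript and again on the translation, which mirrors the promise that nothing is voiced before you have had the chance to edit it.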
What Dubbing & Translation gives you.
One pipeline, three steps
Transcribe, translate, then re-voice, chained into a single flow so you are not stitching three tools together by hand.
Voiced by live engines
The final speech runs on the same real engines as Text to Speech, so the output is genuinely generated voice.
Review before you ship
You check the translated script before it is voiced, because a dub is only as good as its words.
Own the result
Every dub you generate is yours to export and publish, with commercial rights and no watermark.
The engines behind it.
Gemini Flash: the only engine here that acts on your bracketed [emotion] directions.
- Quality Elo: 1225
- Latency: 2770 ms (measured 2026-06-10)
- Languages: 24
- Rights: Commercial use; outputs are yours
Quality Elo from the Artificial Analysis Speech Arena, retrieved June 10, 2026. Latencies are our own real wall-clock numbers.
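Bracketed directions only help if they reach the engine unchanged, which is why the translate step keeps them in place. A minimal sketch of one way to do that, assuming a placeholder-masking approach: the function names and the `<<n>>` placeholder format are hypothetical, and the product's actual method is not documented here.

```python
import re
from typing import Callable, List, Tuple

# Matches a single bracketed direction such as [whispering] or [pause].
DIRECTION = re.compile(r"\[[^\[\]]+\]")
PLACEHOLDER = re.compile(r"<<(\d+)>>")


def protect(text: str) -> Tuple[str, List[str]]:
    """Swap each bracketed direction for a numbered placeholder."""
    directions: List[str] = []

    def stash(match: re.Match) -> str:
        directions.append(match.group(0))
        return "<<" + str(len(directions) - 1) + ">>"

    return DIRECTION.sub(stash, text), directions


def restore(text: str, directions: List[str]) -> str:
    """Put the original directions back where their placeholders are."""
    return PLACEHOLDER.sub(lambda m: directions[int(m.group(1))], text)


def translate_keeping_directions(text: str, translate: Callable[[str], str]) -> str:
    """Translate only the spoken words; directions ride through untouched."""
    masked, directions = protect(text)
    return restore(translate(masked), directions)
```

This only works if the translator leaves the placeholders intact, which is one more reason to review the translated script before it is voiced.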
Dubbing & Translation is generating in the studio today.
Live today, end to end. A Whisper-class model transcribes the source, a translation model turns the script into the target language, and the final voice runs on Gemini Flash, our multilingual engine (Kokoro, Grok Voice, MAI Voice 2, and Zonos are English-only, so the dub uses Gemini). English-source audio works best. You can edit the text at every step before moving on.
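The engine choice described above comes down to a simple rule, sketched here under stated assumptions: the engine and language names come from this page, while `engine_for_dub` and the constants are hypothetical names for illustration.

```python
# Engines this page describes as English-only.
ENGLISH_ONLY_ENGINES = {"Kokoro", "Grok Voice", "MAI Voice 2", "Zonos"}
MULTILINGUAL_ENGINE = "Gemini Flash"

# The eight dub targets listed on this page.
DUB_TARGETS = {
    "Spanish", "French", "German", "Italian",
    "Portuguese", "Japanese", "Hindi", "Arabic",
}


def engine_for_dub(target_language: str) -> str:
    """Return the engine that voices a dub into the given target language."""
    if target_language not in DUB_TARGETS:
        raise ValueError(f"Unsupported dub target: {target_language}")
    # Every dub target is non-English, so the English-only engines never qualify.
    return MULTILINGUAL_ENGINE
```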
The honest answers.
What Dubbing & Translation can and cannot do today, in plain language.
Can I dub audio today?
How does it work under the hood?
Which languages can I dub into?
Why does the dub only use Gemini Flash?
Eight target languages, live today. Each has its own honest guide to dubbing into it.
Start generating with Dubbing & Translation.
Free to start, no credit meter. Open the console and hear it for yourself.