
How to re-voice a recording as one of your cloned voices

Upload or record a take, pick a clone you built, and convert. What carries over, the real caps, and how usage meters by audio length.

Updated June 11, 2026

What the voice changer will be

The voice changer is coming soon. It depends on the same engine as voice cloning, and that engine is not connected yet. When it ships it will live on the Voice Changer page in the app. You will give it a recording of one person speaking and pick one of your own cloned voices, and it will re-voice the take: the timing, pacing, and emphasis of the original performance carry over, while the timbre becomes the clone's. This guide describes how it will work so you know what to expect.

It is speech to speech, not speech to text to speech. Your recording is never turned back into a script along the way, which is why the delivery survives. The conversion will run on Sonic, the same engine that holds your cloned voices.

Which voices you will deliver in

Your own clones, built on the voice cloning page with the required consent attestation. The changer will not re-voice into the stock engine roster, and we will not pretend it does: the honest target list is the voices you own.

Voice cloning ships first; the changer follows. A clone will take about a minute to create.

Recording or uploading a take

Upload a file (mp3, wav, m4a, webm, ogg, or flac) or record straight in the browser. The caps are 25 MB and 10 minutes per take. Browser recordings are automatically re-encoded to plain 24 kHz mono WAV on your machine before upload, the same safeguard the cloning studio uses. If that conversion fails, the original capture is uploaded instead.
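If you prepare takes in bulk, a pre-flight check against the caps above can save a failed upload. The sketch below is illustrative only and is not Cantari code: the `Take` shape and `checkTake` function are hypothetical names, the format check looks only at the file extension, and reading "25 MB" as 25 × 1024 × 1024 bytes is an assumption, since the guide does not say whether the cap is decimal or binary.

```typescript
// Formats and caps come from the guide; everything else here is an assumption.
const ALLOWED_EXTENSIONS: string[] = ["mp3", "wav", "m4a", "webm", "ogg", "flac"];
const MAX_BYTES = 25 * 1024 * 1024; // assumed binary megabytes
const MAX_SECONDS = 10 * 60;        // 10 minutes per take

// Hypothetical description of a take before upload.
interface Take {
  fileName: string;
  sizeBytes: number;
  durationSeconds: number;
}

// Returns a list of human-readable problems; an empty list means the take fits the caps.
function checkTake(take: Take): string[] {
  const problems: string[] = [];

  const dot = take.fileName.lastIndexOf(".");
  const ext = dot >= 0 ? take.fileName.slice(dot + 1).toLowerCase() : "";
  if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
    problems.push("Unsupported format: use mp3, wav, m4a, webm, ogg, or flac");
  }
  if (take.sizeBytes > MAX_BYTES) {
    problems.push("File is larger than 25 MB");
  }
  if (take.durationSeconds > MAX_SECONDS) {
    problems.push("Take is longer than 10 minutes");
  }
  return problems;
}
```

Collecting every problem, rather than stopping at the first, lets you fix a take in one pass.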

Perform the line the way you want it delivered. The whole point of the tool is that your read survives: rushed lines stay rushed, pauses stay pauses, emphasis lands where you put it.

One speaker only. Music beds, crosstalk, or a second voice in the take will degrade the conversion.

How usage is metered

There is no script to count characters from, so the voice changer meters by audio length at the house rate: about 1,000 characters per minute of recording, with a 100-character floor per conversion. In plain terms, a minute of audio costs about a minute of your monthly allowance, the same as generating that minute from text.
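The rate and floor above reduce to a one-line calculation. Here is a minimal sketch, assuming the rate is exactly 1,000 characters per minute and that fractional results round up; the guide says "about", so the real meter may round differently. The name `estimateCharacters` is hypothetical.

```typescript
const CHARS_PER_MINUTE = 1000; // house rate from the guide ("about" 1,000)
const MIN_CHARS = 100;         // per-conversion floor from the guide

// Estimate the characters a conversion would draw from the monthly allowance.
// Rounding up is an assumption, not documented behavior.
function estimateCharacters(durationSeconds: number): number {
  if (!(durationSeconds > 0)) {
    throw new RangeError("durationSeconds must be a positive number");
  }
  const raw = (durationSeconds * CHARS_PER_MINUTE) / 60;
  return Math.max(MIN_CHARS, Math.ceil(raw));
}

// A 90-second take: 90 * 1000 / 60 = 1500 characters.
// A 3-second take: 50 characters by rate, raised to the 100-character floor.
```

Under these assumptions the floor applies to any take shorter than 6 seconds, because 6 seconds at the house rate is exactly 100 characters.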

The clip length is read by your browser's audio decoder and shown on the Convert button's status line before you spend anything, so the cost is never a surprise. The full metering story lives in usage and limits.
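For readers curious how a browser can know the clip length before anything is uploaded, the sketch below shows one way it could be done with the Web Audio API, where `decodeAudioData` resolves to an `AudioBuffer` whose `duration` is in seconds. This illustrates the general technique and is not Cantari's implementation; `DurationDecoder` and `readClipSeconds` are hypothetical names.

```typescript
// Minimal shape of the decoder this sketch relies on.
// A browser AudioContext satisfies it structurally.
interface DurationDecoder {
  decodeAudioData(data: ArrayBuffer): Promise<{ duration: number }>;
}

// Decode the clip locally and return its length in seconds.
async function readClipSeconds(
  decoder: DurationDecoder,
  bytes: ArrayBuffer
): Promise<number> {
  const decoded = await decoder.decodeAudioData(bytes);
  if (!Number.isFinite(decoded.duration) || decoded.duration <= 0) {
    throw new Error("Could not read a usable duration from this clip");
  }
  return decoded.duration;
}
```

In a browser you would call this as `readClipSeconds(new AudioContext(), await file.arrayBuffer())`, where `file` is the selected upload. Passing the decoder in as a parameter keeps the function easy to exercise outside a browser with a stand-in object.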

Where results save

Every successful conversion returns an MP3 you can play and download on the spot, and saves server-side to your private library, tagged with the clone's name. Like everything in Cantari, the result is yours to export and use commercially.

The output speaks in a cloned voice, so the cloning rules follow it: voices you create are yours to use and yours to answer for. Do not use a re-voiced take to impersonate a real person.