cantari

How usage, limits, and file caps work

What a character buys, when the meter resets, what happens at the limit, and the real caps on uploads and scripts.

Updated June 11, 2026

How metering works

Your plan is measured in characters, because that is what a voice engine actually consumes. Each character of the text you submit counts toward one monthly total, whichever tool did the work, with one deliberate exception: Kokoro, the open-weight engine, is unlimited on every plan and never touches your meter. Premium engines (Gemini Flash, Grok Voice, MAI Voice 2, Zonos, and your clones on Sonic) are what the allowance counts. Transcription draws from the same allowance: when a Whisper-class model turns your audio into text, the transcript's character count is what gets recorded.
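As a rough illustration of the rule above, here is a minimal sketch of how a request's metered cost could be computed. The function name, the lowercase engine identifier, and the use of Python's `len` (which counts Unicode code points) are all assumptions for illustration; the studio's actual counting method is not specified here.

```python
# Hypothetical engine identifier; the studio may name its engines differently.
UNMETERED_ENGINES = {"kokoro"}


def metered_characters(text: str, engine: str) -> int:
    """Characters a request would add to the monthly meter.

    Assumption: a "character" is one Unicode code point, as counted by len().
    """
    if engine.lower() in UNMETERED_ENGINES:
        return 0  # Kokoro never touches the meter
    return len(text)


print(metered_characters("Hello, world.", "kokoro"))
print(metered_characters("Hello, world.", "zonos"))
```

The same count would apply to transcription, with the finished transcript standing in for the submitted text.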

The voice changer has no script to count, so it meters by audio length at the house rate: about 1,000 characters per minute of recording, with a small 100-character floor per conversion. A one-minute take costs about a minute of allowance, the same as generating that minute from text.
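The voice changer rule can be sketched as a small function. The constant and function names are hypothetical, and rounding up to a whole character is an assumption; only the 1,000-per-minute rate and the 100-character floor come from the text above.

```python
import math

CHARS_PER_MINUTE = 1_000          # house rate, approximate by design
MIN_CHARS_PER_CONVERSION = 100    # floor per conversion


def voice_changer_characters(duration_seconds: float) -> int:
    """Estimate the characters metered for one voice changer take."""
    # Assumption: fractional characters round up.
    estimated = math.ceil(duration_seconds * CHARS_PER_MINUTE / 60)
    return max(MIN_CHARS_PER_CONVERSION, estimated)


print(voice_changer_characters(60))   # a one-minute take
print(voice_changer_characters(3))    # a very short take hits the floor
```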

The meter resets on the 1st of each month (UTC). There is no rollover and no expiry drama: a fresh month is a fresh allowance.
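If you want to know how long is left in the current period, the next reset can be computed from any timezone-aware timestamp. This sketch assumes the reset happens at 00:00 UTC on the 1st; the text above fixes the date and timezone but not the exact time of day. The function name is hypothetical.

```python
from datetime import datetime, timezone


def next_reset(now: datetime) -> datetime:
    """Return the start of the next metering month, in UTC.

    `now` must be timezone-aware. Assumption: the reset is at midnight UTC.
    """
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


now = datetime.now(timezone.utc)
print("Meter resets in", next_reset(now) - now)
```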

The usage page in your workspace shows the live meter for the current month and a per-engine breakdown, so you can see exactly where the characters went.

The house math

About 1,000 characters is a minute of audio. We say "about" on purpose: read speed varies with the script, the voice, and the pauses, and we will not fake precision the engines do not have. It is still a reliable planning number: a 5,000-character script is around five minutes of audio, and the free plan's 10,000 characters are roughly 10 minutes a month.
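For planning, the conversion is a single division. The helper below is a hypothetical convenience, and its result is an estimate, not a guarantee of final audio length.

```python
CHARS_PER_MINUTE = 1_000  # approximate house rate


def estimated_minutes(characters: int) -> float:
    """Rough audio length, in minutes, for a given character count."""
    return characters / CHARS_PER_MINUTE


print(estimated_minutes(5_000))    # a 5,000-character script
print(estimated_minutes(10_000))   # the free plan's monthly allowance
```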

Thinking in minutes and chapters, not characters, is the point. The meter is an implementation detail; the work is audio.

What happens at the limit

When a premium generation would push past the month's allowance, it does not run, and the studio says so plainly: an upgrade panel replaces the error line, with three honest ways forward. Upgrade to lift the limit, keep drafting on unlimited Kokoro, or wait for the 1st, when the meter resets on its own.
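The check described above can be sketched as a simple pre-flight test. The function and parameter names are hypothetical, and treating a request that lands exactly on the allowance as allowed is an assumption drawn from the phrase "push past".

```python
def can_run(used: int, allowance: int, request_chars: int, engine: str) -> bool:
    """Whether a generation fits within the month's allowance."""
    if engine.lower() == "kokoro":
        return True  # unlimited on every plan, never blocked by the meter
    # Assumption: reaching the allowance exactly is fine; exceeding it is not.
    return used + request_chars <= allowance


# Free plan example: 10,000 characters a month, 9,500 already used.
print(can_run(9_500, 10_000, 600, "zonos"))   # would push past the limit
print(can_run(9_500, 10_000, 600, "kokoro"))  # Kokoro keeps working
```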

Nothing else changes at the limit. Your library, your history, and your downloads all keep working; finished audio is never held hostage to a meter.

Within your allowance, iteration is free in the way that matters: take five costs the same characters as take two, and a retake never feels like a purchase.

File and script caps

A few hard caps protect the pipeline. They are generous for real work and they are the actual numbers the routes enforce:

| Where | Cap |
| --- | --- |
| Text to Speech script (per generation) | 30,000 characters |
| Speech to Text upload | 25 MB |
| Voice cloning reference clip | 20 MB, up to about 2 minutes |
| Voice changer take | 25 MB, up to 10 minutes |
| Audiobook manuscript import | 150,000 characters |

* Speech to Text and the voice changer accept mp3, wav, m4a, webm, ogg, and flac files.

* A cloning clip does not need to be long: a clear, single-speaker recording up to two minutes is plenty.
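If you prepare files in bulk, it can help to check them against the caps listed above before uploading. This is a client-side sketch only: the key names and functions are hypothetical, a megabyte is assumed to be 1024 × 1024 bytes, and the cloning clip's "about 2 minutes" is treated as a hard 120 seconds for simplicity. The routes themselves remain the authority.

```python
from typing import List, Optional

MB = 1024 * 1024  # assumption: the routes may count 1,000,000 bytes instead

# kind -> (max bytes, max seconds or None)
UPLOAD_CAPS = {
    "speech_to_text": (25 * MB, None),
    "clone_reference": (20 * MB, 2 * 60),   # "about 2 minutes", simplified
    "voice_changer": (25 * MB, 10 * 60),
}

# kind -> max characters
TEXT_CAPS = {"tts_script": 30_000, "audiobook_import": 150_000}


def upload_problems(kind: str, size_bytes: int,
                    duration_seconds: Optional[float] = None) -> List[str]:
    """List the caps an upload would exceed; an empty list means it fits."""
    max_bytes, max_seconds = UPLOAD_CAPS[kind]
    problems = []
    if size_bytes > max_bytes:
        problems.append("file too large")
    if (max_seconds is not None and duration_seconds is not None
            and duration_seconds > max_seconds):
        problems.append("recording too long")
    return problems


def text_fits(kind: str, text: str) -> bool:
    """Whether a script or manuscript is within its character cap."""
    return len(text) <= TEXT_CAPS[kind]


print(upload_problems("voice_changer", 26 * MB, 700))
print(text_fits("tts_script", "a" * 30_000))
```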

Fair use, in plain words

A flat allowance only works if it is used for making things, and that is the whole framing: generate, regenerate, experiment, and ship, as much as your plan covers. The caps above exist to keep one runaway upload or script from degrading the studio for everyone, not to meter your creativity by the keystroke.

If you are regularly hitting the ceiling with real work, that is what the plans are for, and the getting started guide covers what the free plan includes.