Answers, in plain language.
The honest version of how Cantari works today. If a question is not here, the contact line at the bottom reaches a real person.
01 Getting started
What works today and how to make your first sound.
What can I actually do right now?
Generate speech from text. Open the console or the Text to Speech tool, paste a script, pick a voice, and generate live audio. Speech to Text is live too; the dubbing pipeline and the premium-engine tools are still in progress.
Do I need a credit card to start?
No. You can start free. See pricing for what the free tier includes and where paid tiers pick up.
Is this a finished product?
The core of it, yes: the console generates real audio today and you own every export. A few tools still wear a Beta label while they mature, and we say so plainly. The about page lays out where everything stands.
02 Engines and quality
Which engine to pick, and how we measure them.
Which engine should I use?
Use Gemini Flash when you want a voice to act out bracketed [emotion] cues, Kokoro for the cheapest clean read, Grok Voice for its English personas, MAI Voice 2 for real style and speed controls, and Zonos for its American and British voices. The engines page and the open benchmark show the trade-offs side by side.
Are the benchmark numbers real?
Yes. Latency is our own measured wall-clock time on the same script. Quality scores are a third-party Quality Elo from the Artificial Analysis Speech Arena, not numbers we invent. The benchmark shows the method, the source, and the date.
Can I hear an engine before I commit?
Yes. The sample gallery plays real generations from each engine, recorded unedited, no voice actors. The console generates live on your own words.
03 Usage and limits
How much you can generate and how pricing works.
How long can a single script be?
Up to 30,000 characters per generation. For a full book you chain sections together, which is what the Audiobook Studio workflow is built around.
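If you prepare a long manuscript yourself, a rough sketch of the chaining idea looks like this. It is not Cantari's own code: the function name `split_script` and the choice to break on blank lines are assumptions for illustration, and only the 30,000-character limit comes from the answer above.

```python
MAX_CHARS = 30_000  # per-generation limit stated in the answer above


def split_script(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into sections of at most `limit` characters.

    Hypothetical helper: breaks on blank lines so paragraphs stay whole.
    """
    sections: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        # Last resort: a single paragraph longer than the limit is cut mid-text.
        while len(paragraph) > limit:
            if current:
                sections.append(current)
                current = ""
            sections.append(paragraph[:limit])
            paragraph = paragraph[limit:]
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this paragraph would overflow, so close the section here.
            sections.append(current)
            current = paragraph
    if current:
        sections.append(current)
    return sections
```

Each returned section fits in one generation, in reading order. How the sections are submitted and stitched together is left to the Audiobook Studio workflow and is not shown here.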
How does pricing work?
Flat, not per-character. You get a monthly allowance rather than a credit meter, so a revision is not a re-charge. Details on the pricing page.
Can I use cloning and the voice changer?
Not yet. Voice cloning and the voice changer are coming soon. When cloning ships, it will build a reusable voice from a short clip (consent attested), and the voice changer will re-voice a recording as one of your clones, keeping your timing and delivery. Everything else is live today: text to speech, speech to text, dubbing, the audiobook studio, and Sound & Music (instrumental beds and atmospheres; one-shot sound effects are still coming). Each tool page says plainly what works today rather than faking a demo.
04 Your library and rights
Who owns the audio and what we do with your data.
Who owns what I generate?
You do. Commercial use, worldwide, no watermark, no attribution required. The ownership page spells it out in plain language.
What do you store, and do you train on my content?
We store your account email, your scripts and generated audio in your private library, and usage counters. We do not sell your data and we do not train on your content. The full picture is on the privacy page.
How do I delete my work or my account?
Delete items from your library individually, or email us to delete your whole account. Anything you already exported stays on your machine. See privacy for the controls.
Still stuck?
Email a real person.
We answer fast while the product is young. No chat bot, no ticket queue, just a reply.
support [at] cantari.io
Prefer a form? Use the contact page.