THE NAME

Written to be heard.

Six hundred years ago, in the piazzas of Italy, a cantare was a story written to be performed aloud. The written words were only half the work. The other half was a voice, standing in the square, giving them somewhere to live.

That is the idea we named ourselves after. Cantari is a studio where writing becomes performance: your scripts, your chapters, your lessons, read by the best voices machines can offer, measured in the open, and owned by you.

The voices are new. The idea is six hundred years old. Welcome to Cantari.

Read by Kore on Gemini Flash

About Cantari

A fair voice platform, benchmarked in the open.

Cantari routes every TTS job to the best engine, measures performance honestly, and makes sure what you generate is yours to keep. No credit meters, no vendor scorecards, no lock-in.

Measured, not marketed

Why an open benchmark?

Every TTS vendor has a demo that sounds great. We wanted something harder to fake: the same script, the same conditions, and published numbers anyone can reproduce. No engine grades its own homework. The benchmark is updated when engines change and the date is always shown.

Real wall-clock latency, not vendor SLAs
Same input, same conditions, every run
Dated and reproducible so you can verify
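
For readers who want to picture the method, here is a minimal sketch of a wall-clock latency measurement. It assumes each engine can be wrapped in a function that takes a script and blocks until the full audio is returned. The names `measure_latency` and `fake_engine` are hypothetical, and this is not Cantari's actual benchmark harness.

```python
import statistics
import time
from typing import Callable


def measure_latency(synthesize: Callable[[str], bytes], script: str, runs: int = 5) -> float:
    """Return the median wall-clock seconds until the full audio is returned."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        synthesize(script)  # assumed to block until the complete audio is available
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def fake_engine(text: str) -> bytes:
    # Hypothetical stand-in for a real TTS call; it only sleeps briefly.
    time.sleep(0.01)
    return text.encode("utf-8")


if __name__ == "__main__":
    seconds = measure_latency(fake_engine, "The same script for every engine.", runs=3)
    print(f"median latency: {seconds:.3f} s")
```

Feeding the same script to every engine and reporting the median over several runs keeps one slow network round trip from skewing the comparison.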

See the benchmark →

Latency to full audio (measured)

Kokoro: ~1.0 s · open-weight, free
MAI Voice 2: ~2.4 s · real style controls
Grok Voice: ~2.4 s · 5 English personas
Gemini Flash: ~2.8 s · acts [emotion] cues
Zonos: ~4.5 s · 4 accent voices

Wall-clock, same script, same conditions

No lock-in

Own your voice.

Flat pricing replaces the per-character credit meter. What you generate is yours: commercial rights worldwide, MP3 or WAV export, and no watermark. Each generation comes with a plain-language license instead of a usage clause buried in the terms.

Commercial rights, worldwide, no watermark
Export every generation as MP3 or WAV
Flat pricing, not a credit balance

See pricing →

Current engines

Gemini Flash: $1/M in + $20/M out
Kokoro: $0.62/M in · $0 out
Grok Voice: $15/M in · $0 out
MAI Voice 2: $22/M in · $0 out
Zonos: $7/M in · $0 out

Rates checked 2026-06-11 · more engines are planned

5: Real engines
Open: Benchmark
Flat: Pricing model
$0: To start

Early days

Early, and honest about it.

Cantari is live and in active development. The engine count is small on purpose and some planned features are still on the way. What is real today: the console generates live audio, the benchmark measures real engines, and the pricing model is the one we ship.

The roadmap includes more engines and deeper production tooling. We are building the benchmark infrastructure first so that when new engines are added, they land with honest numbers, not just marketing copy.

If you find something broken or want to discuss a use case, the console is the fastest way to test what works today.

Roadmap

Benchmark: Live
Speech to Text: Live
Billing: Live
Audiobook Studio: Live
Cloning engine: Coming soon
More engines: Planned
Team workspaces: Planned
AppSumo: Planned

What we stand for

Four principles, no asterisks.

Open benchmark

Same script, same conditions, published numbers anyone can reproduce.

You own the audio

Commercial rights worldwide, no watermark, export every generation.

Honest controls

Plain-language license, dated results, no engine grades its own homework.

Flat pricing

One price, not a per-character credit meter watching every syllable.

Hear it for yourself.

Open the console and generate audio from any of the five real engines, free.

Open console · See the benchmark