New · the open voice benchmark is live · Read it
cantari
THE NAME

Written to be heard.

Six hundred years ago, in the piazzas of Italy, a cantare was a story written to be performed aloud. Printed words were only half the work. The other half was a voice, standing in the square, giving them somewhere to live.

That is the idea we named ourselves after. Cantari is a studio where writing becomes performance: your scripts, your chapters, your lessons, read by the best voices machines can offer, measured in the open, and owned by you.

The voices are new. The idea is six hundred years old. Welcome to Cantari.

Read by Kore on Gemini Flash
About Cantari

A fair voice platform, benchmarked in the open.

Cantari routes every TTS job to the best engine, measures performance honestly, and makes sure what you generate is yours to keep. No credit meters, no vendor scorecards, no lock-in.

Measured, not marketed

Why an open benchmark?

Every TTS vendor has a demo that sounds great. We wanted something harder to fake: the same script, the same conditions, and published numbers anyone can reproduce. No engine grades its own homework. The benchmark is updated when engines change and the date is always shown.
  • Real wall-clock latency, not vendor SLAs
  • Same input, same conditions, every run
  • Dated and reproducible so you can verify
See the benchmark
Latency to full audio (measured)
  • Kokoro: ~1.0 s (open-weight, free)
  • MAI Voice 2: ~2.4 s (real style controls)
  • Grok Voice: ~2.4 s (5 English personas)
  • Gemini Flash: ~2.8 s (acts [emotion] cues)
  • Zonos: ~4.5 s (4 accent voices)

Wall-clock, same script, same conditions
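
To make "wall-clock, same script, same conditions" concrete, here is a minimal sketch of how such a measurement could be taken. It is not Cantari's actual harness: `measure_latency`, `benchmark`, and `fake_engine` are hypothetical names, the stand-in engines only sleep instead of calling a real TTS API, and reporting the median of several runs is an assumption rather than the published method.

```python
import statistics
import time
from typing import Callable, Dict, List

# Hypothetical engine interface: takes a script, blocks, returns the full audio bytes.
Synthesize = Callable[[str], bytes]


def measure_latency(synthesize: Synthesize, script: str, runs: int = 5) -> List[float]:
    """Time each run from request to full audio, in wall-clock seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        audio = synthesize(script)  # returns only once the whole clip is ready
        elapsed = time.perf_counter() - start
        if not audio:
            raise RuntimeError("engine returned no audio")
        timings.append(elapsed)
    return timings


def benchmark(engines: Dict[str, Synthesize], script: str, runs: int = 5) -> Dict[str, float]:
    """Run every engine on the same script and report the median latency."""
    return {
        name: statistics.median(measure_latency(fn, script, runs))
        for name, fn in engines.items()
    }


def fake_engine(delay_s: float) -> Synthesize:
    """Stand-in engine for illustration: sleeps, then returns dummy bytes."""
    def synthesize(script: str) -> bytes:
        time.sleep(delay_s)
        return b"\x00" * len(script)
    return synthesize


if __name__ == "__main__":
    engines = {"fast": fake_engine(0.01), "slow": fake_engine(0.05)}
    for name, seconds in benchmark(engines, "Written to be heard.").items():
        print(f"{name}: ~{seconds:.2f} s")
```

Timing around the call itself, rather than trusting a figure reported by the engine, is what separates measured latency from a vendor SLA.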

No lock-in

Own your voice.

Flat pricing replaces the per-character credit meter. What you generate is yours: commercial rights worldwide, MP3 or WAV export, and no watermark. A plain-language license per generation instead of a usage clause buried in terms.
  • Commercial rights, worldwide, no watermark
  • Export every generation as MP3 or WAV
  • Flat pricing, not a credit balance
See pricing
Current engines
  • Gemini Flash: $1/M in + $20/M out
  • Kokoro: $0.62/M in · $0 out
  • Grok Voice: $15/M in · $0 out
  • MAI Voice 2: $22/M in · $0 out
  • Zonos: $7/M in · $0 out

Rates checked 2026-06-11 · more engines are planned
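
The listed rates can be turned into a per-job engine cost with simple arithmetic. The sketch below assumes "/M" means "per million units" and that "in" and "out" are billed separately; the page does not say whether a unit is a character or a token, so the unit counts are left abstract. `Rate`, `RATES`, and `engine_cost` are hypothetical names, and the result is the underlying engine rate, not the flat price a customer pays.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    usd_per_m_in: float   # USD per million input units (unit type is an assumption)
    usd_per_m_out: float  # USD per million output units


# Rates as listed on the page, checked 2026-06-11.
RATES = {
    "Gemini Flash": Rate(1.00, 20.00),
    "Kokoro": Rate(0.62, 0.00),
    "Grok Voice": Rate(15.00, 0.00),
    "MAI Voice 2": Rate(22.00, 0.00),
    "Zonos": Rate(7.00, 0.00),
}


def engine_cost(engine: str, units_in: int, units_out: int = 0) -> float:
    """Engine cost in USD for one job at the listed per-million rates."""
    rate = RATES[engine]
    total = units_in * rate.usd_per_m_in + units_out * rate.usd_per_m_out
    return total / 1_000_000


if __name__ == "__main__":
    # Compare engines on the same hypothetical job of 50,000 input units.
    for name in RATES:
        print(f"{name}: ${engine_cost(name, 50_000):.4f}")
```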

  • 5 real engines
  • Open benchmark
  • Flat pricing model
  • $0 to start
Early days

Early, and honest about it.

Cantari is live and in active development. The engine count is small on purpose and some planned features are still on the way. What is real today: the console generates live audio, the benchmark measures real engines, and the pricing model is the one we ship.

The roadmap includes more engines and deeper production tooling. We are building the benchmark infrastructure first so that when new engines are added, they land with honest numbers, not just marketing copy.

If you find something broken or want to discuss a use case, the console is the fastest way to test what works today.

Roadmap
  • Benchmark: Live
  • Speech to Text: Live
  • Billing: Live
  • Audiobook Studio: Live
  • Cloning engine: Coming soon
  • More engines: Planned
  • Team workspaces: Planned
  • AppSumo: Planned
What we stand for

Four principles, no asterisks.

01

Open benchmark

Same script, same conditions, published numbers anyone can reproduce.

02

You own the audio

Commercial rights worldwide, no watermark, export every generation.

03

Honest controls

Plain-language license, dated results, no engine grades its own homework.

04

Flat pricing

One price, not a per-character credit meter watching every syllable.

Hear it for yourself.

Open the console and generate audio from any of the five real engines, free.