For authors and publishers

Narrate a whole book without the per-character meter running.

A full-length title is hundreds of thousands of characters. On a per-character credit plan, every revision costs you again. Cantari is built for long-form: one voice held across every chapter, acted delivery where you want it, and a flat allowance so a reread is not a re-charge.

Start free · Open the studio →

No credit card · Real engines · The audio is yours

Painted library corner with an open book

Worked example · Gemini Flash · Kore

[calm] The lamp had been dark for three winters, and still the ships came.

MARA

[softly] You kept it lit. All this time, you kept it lit.

ELI

[weary] Somebody had to watch the water.

[building] Above them, the great lens began, very slowly, to turn.

Real output · the engine acted the cues

The moment

The narration quote came back this morning: four thousand dollars and a six-week wait, for a novel that earned a few hundred last month. You know every line of this book; you have read it aloud a dozen times in revision. The audiobook edition should not be the thing that never happens.

Why this is hard

What audiobook publishing actually needs.

We would rather name the friction plainly than pretend it away. Here is the problem this page is about.

The honest problem

A novel is roughly 80,000 words, and an audiobook of it can run past 450,000 characters once you count narration and dialogue. On the usual per-character credit plans, that meter runs on every pass, so re-recording one chapter for a pronunciation fix means paying for that whole chapter a second time. The anxiety is not the first render; it is the fourth. Cantari replaces the meter with a flat allowance so you can revise a chapter without watching a balance drain.

How Cantari helps

Real features, mapped to the job.

Every item here works today, or says plainly where it is still in progress.

Cue-directed narration

Write bracketed [softly] or [urgent] directions inline and Gemini Flash performs them, so dialogue carries intent instead of reading flat. The plain-read engines tell you up front when they will ignore a cue.

Per-segment audiobook studio

The Audiobook Studio workflow holds one voice across every chapter and re-renders a single section in isolation, so fixing one line never means rebuilding the whole book. It is live in beta today, from manuscript import to stitched chapter export.

Flat allowance, not credits

Pricing is a flat monthly allowance rather than a per-character meter, so a fourth revision of a chapter does not cost you a fourth time.

Export and own the master

Download every chapter as MP3 or WAV with commercial rights, worldwide, no watermark. The finished audiobook is yours to distribute.
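
As a rough illustration of the inline cue syntax described under Cue-directed narration, the sketch below splits one script line into its bracketed directions and the text to be spoken. The function name `split_cues` and the regular expression are assumptions made for this example, not Cantari's parser, and the engine may accept cues that this pattern does not cover.

```python
import re

# Assumed cue shape: letters, spaces, or hyphens inside square brackets,
# e.g. [softly] or [urgent]. This is an illustration, not Cantari's grammar.
CUE_PATTERN = re.compile(r"\[([A-Za-z][A-Za-z \-]*)\]")


def split_cues(line: str) -> tuple[list[str], str]:
    """Return (cues, spoken_text) for one script line."""
    # Collect every bracketed direction, normalised to lower case.
    cues = [cue.strip().lower() for cue in CUE_PATTERN.findall(line)]
    # Remove the directions, then collapse any whitespace they leave behind.
    spoken = CUE_PATTERN.sub("", line)
    spoken = re.sub(r"\s+", " ", spoken).strip()
    return cues, spoken


cues, spoken = split_cues("[softly] You kept it lit. All this time, you kept it lit.")
print(cues)    # ['softly']
print(spoken)  # You kept it lit. All this time, you kept it lit.
```

A line that comes back with an empty cue list carries no direction, which is a quick way to spot dialogue that may read flat before you render it.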

The honest arithmetic · about 1,000 characters is a minute of speech

~450,000: characters in an 80,000-word novel
~7.5 hrs: of narration at about 1,000 characters a minute
1: voice held across every chapter
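
The figures above can be checked in a few lines of Python. This is only the page's own rule of thumb written out; the constant and function names are made up for the example, and real narration speed varies with voice and pacing.

```python
CHARS_PER_MINUTE = 1_000  # the rule of thumb quoted above
NOVEL_CHARS = 450_000     # approximate characters in an 80,000-word novel
NOVEL_WORDS = 80_000


def narration_hours(characters: int, chars_per_minute: int = CHARS_PER_MINUTE) -> float:
    """Estimated running time in hours for a given character count."""
    return characters / chars_per_minute / 60


def chars_per_word(characters: int, words: int) -> float:
    """Average characters per word implied by the two totals (spaces included)."""
    return characters / words


print(narration_hours(NOVEL_CHARS))              # 7.5
print(chars_per_word(NOVEL_CHARS, NOVEL_WORDS))  # 5.625
```

Swap in your own manuscript's character count to estimate its running time.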

The workflow

How it goes, step by step.

Step 1: Draft a chapter in Text to Speech

Paste up to 30,000 characters per generation, add bracketed cues for delivery, and hear it back live in the console.

Step 2: Pick and lock one voice

Audition voices on the engines page, then keep a single voice so chapter twelve sounds like chapter one.

Step 3: Revise without re-charge

Re-render any section to fix a name or a beat. The flat allowance means a reread is not a re-charge.

Step 4: Export the set

Export an ordered set of chapter files as MP3 or WAV, yours to ship to any distributor.
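
A chapter longer than the 30,000-character limit in Step 1 has to be pasted in pieces. One way to prepare those pieces, sketched here under assumptions, is to pack whole paragraphs into chunks so that no generation starts mid-paragraph. The function `chunk_chapter` is hypothetical, it treats a blank line as a paragraph break, and it assumes the limit counts every character including cues and whitespace.

```python
MAX_CHARS = 30_000  # per-generation limit quoted in Step 1


def chunk_chapter(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if len(para) > max_chars:
            # This sketch never cuts inside a paragraph; that is left to the author.
            raise ValueError("a single paragraph exceeds the limit; split it by hand")
        candidate = para if not current else current + "\n\n" + para
        if len(candidate) <= max_chars:
            current = candidate      # the paragraph still fits in this chunk
        else:
            chunks.append(current)   # close the full chunk, start a new one
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be pasted as one generation, in order, with the same locked voice.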

Recommended engine

Start with Gemini Flash.

Gemini Flash is the only engine here that acts bracketed [emotion] cues, which is what dialogue and dramatic narration need. Draft fast on Kokoro, then render the keepers on Gemini.

Gemini Flash · Expressive · follows [cues]

The only engine here that acts your bracketed [emotion] directions.

Quality Elo: 1225
Latency: 2770 ms (measured 2026-06-10)
Languages: 24
Rights: Commercial use; outputs are yours

Cue-following: Expressive

Hear a line for this use case

“[softly] She closed the book, and for a long moment the room held its breath with her.”

Real Gemini Flash output, recorded unedited.

Tools behind it: Audiobook Studio · Text to Speech · Speech to Text

The honest answers.

What Cantari can and cannot do for audiobooks & publishing today, in plain language.

Can I produce a finished audiobook today?

Yes, in beta. Audiobook Studio is live: import your manuscript, hold one voice across every chapter, re-render any segment in isolation, and export each chapter as a stitched WAV. Pronunciation memory is still on the way, and we say so rather than pretend.

How does this avoid per-character cost?

Pricing is a flat monthly allowance, not a credit meter. You are not charged per character, so revising a chapter does not bill you again for the words you already rendered.

Do I own the narration commercially?

Yes. Every generation is yours to export as MP3 or WAV and publish commercially, worldwide, with no watermark. See the ownership page for the plain-language summary.

Keep exploring

Try Cantari for audiobooks & publishing.

Free to start, no credit meter. Open the studio and hear it for yourself.

Start free · Open the studio