cantari
Live · Produce

Produce a full audiobook, chapter by chapter.

Our flagship workflow turns a manuscript into a finished, consistent audiobook. Keep one voice across every chapter, direct delivery with bracketed cues, and export production-ready audio you own outright.

From manuscript

Paste the book. Get the chapters.

Paste up to 150,000 characters and the chapters are detected from your own text, then split locally, so nothing gets rewritten. An optional speakable pass expands numbers, dates, and abbreviations so every engine reads them right. About 1,000 characters is a minute of audio, so a full paste is roughly two and a half hours of narration, structured before you render a word.
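
As a rough illustration of the import step described above, the sketch below splits a pasted manuscript on heading lines without rewriting the text, and estimates runtime from the 1,000-characters-per-minute rule of thumb. The heading pattern, the `Chapter` type, and the function names are assumptions made for this example; Cantari's actual detector is not documented here.

```python
import re
from dataclasses import dataclass

MAX_CHARS = 150_000       # paste limit quoted above
CHARS_PER_MINUTE = 1_000  # rule of thumb quoted above

# Assumed heading pattern: a line such as "Chapter 1" or "CHAPTER 12: Title".
HEADING = re.compile(r"^chapter[ \t]+\w+.*$", re.IGNORECASE | re.MULTILINE)


@dataclass
class Chapter:
    title: str
    body: str


def split_chapters(manuscript: str) -> list[Chapter]:
    """Split on heading lines. Body text is sliced out, not rewritten;
    only surrounding whitespace is trimmed."""
    if len(manuscript) > MAX_CHARS:
        raise ValueError("manuscript exceeds the 150,000 character limit")
    matches = list(HEADING.finditer(manuscript))
    if not matches:
        # No headings found: treat the whole paste as one untitled chapter.
        return [Chapter("Untitled", manuscript.strip())]
    chapters = []
    for i, m in enumerate(matches):
        # A chapter runs until the next heading, or to the end of the text.
        # Text before the first heading is ignored in this sketch.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(manuscript)
        chapters.append(Chapter(m.group(0).strip(), manuscript[m.end():end].strip()))
    return chapters


def estimated_minutes(char_count: int) -> float:
    """Approximate narration length for a given character count."""
    return char_count / CHARS_PER_MINUTE
```

Under these assumptions, `estimated_minutes(150_000)` gives 150 minutes, which is where the two-and-a-half-hour figure for a full paste comes from.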

Import manuscript
  • Found 14 chapters
  • Chapter titles kept
  • Speakable pass: on

Fix one line, not one chapter.

Every paragraph is its own take. Edit a word, re-render just that line, and the rest of the chapter keeps its finished takes. The voice you cast carries from chapter to chapter, with a per-line override for the moments that need one.
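
One way to picture per-line re-renders is a cache of takes keyed by the paragraph text and the voice, so that editing a single paragraph invalidates only its own take. This is a minimal sketch under that assumption; `render` is a hypothetical stand-in for whichever engine call produces the audio, and nothing here describes how the studio actually stores takes.

```python
import hashlib
from typing import Callable


def take_key(paragraph: str, voice: str) -> str:
    """Stable key for one take: the same text and voice give the same key."""
    return hashlib.sha256(f"{voice}\n{paragraph}".encode("utf-8")).hexdigest()


def render_chapter(
    paragraphs: list[str],
    voice: str,
    cache: dict[str, bytes],
    render: Callable[[str, str], bytes],  # hypothetical engine call
) -> list[bytes]:
    """Return one take per paragraph, rendering only those not already cached."""
    takes = []
    for paragraph in paragraphs:
        key = take_key(paragraph, voice)
        if key not in cache:
            # Only new or edited paragraphs reach the engine.
            cache[key] = render(paragraph, voice)
        takes.append(cache[key])
    return takes
```

Because the voice is part of the key, a per-line voice override would produce its own take without disturbing the takes around it.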

Chapters
14 chapters · one voice
Ch 01
Ch 02
Ch 03
Ch 04
Ch 05

The lamp room smelled of brass polish and old rain, the way it had for forty years.

Mara set her cup on the sill and watched the small boat drift toward the rocks.

Export

Chaptered audio, yours to keep.

  • WAV, stitched chapter
  • MP3 or WAV, single takes

Every render also saves to your Library. The files are yours: upload them wherever you publish.

One narrator voice, chapter one to the end. That is the whole point.

How it works

From a blank script to audio you own.

Step 1: Import your manuscript

Bring your book as chapters. Cantari keeps the structure so each chapter stays its own track.

Step 2: Cast one voice

Pick a single voice and lock it across the whole book, so chapter twelve sounds like chapter one.

Step 3: Direct the delivery

Add bracketed [emotion] cues for dramatic moments yourself, or let Auto Tag suggest them and undo anything you do not keep.
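
To make the cue syntax concrete, here is a small sketch that separates bracketed cues from the words to be spoken. The cue names in the usage note are illustrative only; the page does not list which cues the engine recognises, and `split_cues` is a hypothetical helper.

```python
import re

# A cue is any run of text inside one pair of square brackets.
CUE = re.compile(r"\[([^\[\]]+)\]")


def split_cues(line: str) -> tuple[list[str], str]:
    """Return (cues, plain_text) for one line of script."""
    cues = [c.strip().lower() for c in CUE.findall(line)]
    # Remove the cues, then collapse the whitespace they leave behind.
    plain = re.sub(r"\s+", " ", CUE.sub("", line)).strip()
    return cues, plain
```

For example, `split_cues("[whispering] Mara set her cup on the sill.")` returns `(["whispering"], "Mara set her cup on the sill.")`. A split like this could also be used to send cue-free text to an engine that gives a plain read.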

Step 4: Assemble and export

Render every segment, then export each chapter as a single stitched WAV, ready for distribution and yours to ship.
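
The stitching step can be sketched with Python's standard `wave` module. This assumes every segment was rendered with the same channel count, sample width, and sample rate; the function name and file paths are hypothetical, and the studio's own export may work differently.

```python
import wave


def stitch_wavs(segment_paths: list[str], out_path: str) -> None:
    """Concatenate WAV segments that share one audio format into a single file."""
    if not segment_paths:
        raise ValueError("no segments to stitch")
    with wave.open(segment_paths[0], "rb") as first:
        fmt = first.getparams()[:3]  # (channels, sample width, frame rate)
    with wave.open(out_path, "wb") as out:
        out.setnchannels(fmt[0])
        out.setsampwidth(fmt[1])
        out.setframerate(fmt[2])
        for path in segment_paths:
            with wave.open(path, "rb") as seg:
                # Refuse to mix formats instead of producing garbled audio.
                if seg.getparams()[:3] != fmt:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(seg.readframes(seg.getnframes()))
```

Keeping each segment as its own file until export is what lets a single take be replaced without touching the rest of the chapter.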

Capabilities

What Audiobook Studio gives you.

Consistent voice across the book

The hardest part of long-form is keeping one voice steady for hours. The studio is built to hold a single voice across every chapter.

Chapter-by-chapter workflow

Each chapter is its own track you can re-render in isolation, so fixing one line never means rebuilding the whole book.

Acted, directed delivery

Gemini Flash performs your bracketed [emotion] cues, so dialogue and narration carry the intent you wrote.

Manuscript import

Paste a whole book or upload a text file: chapters are detected without altering your words, and an optional speakable pass expands numbers and abbreviations so engines read them right.
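
For a sense of what a speakable pass involves, the sketch below expands a few abbreviations and spells out standalone integers below 100. It is deliberately tiny: the abbreviation table is illustrative, dates and larger numbers are left untouched, and none of the names reflect Cantari's implementation.

```python
import re

# Illustrative abbreviation table; a real pass would need a much larger one.
ABBREVIATIONS = {"Dr.": "Doctor", "Mr.": "Mister", "etc.": "et cetera"}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

# One- or two-digit integers that are not part of a longer number such as
# "10,000" or "3.14", and not glued to letters.
SMALL_INT = re.compile(r"(?<![\w.,])\d{1,2}(?!\w|[.,]\d)")


def number_to_words(n: int) -> str:
    """Spell out 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def speakable(text: str) -> str:
    """Expand known abbreviations and small standalone integers."""
    for short, spoken in ABBREVIATIONS.items():
        text = re.sub(r"\b" + re.escape(short), spoken, text)
    return SMALL_INT.sub(lambda m: number_to_words(int(m.group())), text)
```

With this sketch, `speakable("Dr. Hale waited 40 years.")` returns `"Doctor Hale waited forty years."`, while a figure like `10,000` passes through unchanged.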

Production-ready export

Download any single take, or export a whole chapter as one stitched WAV, with commercial rights and no watermark.

Own the whole master

The finished audiobook is yours outright, licensed in plain English per generation rather than rented by the credit.

Powered by

The engines behind it.

Gemini Flash
Expressive - follows [cues]

The only engine here that acts your bracketed [emotion] directions.

Quality Elo: 1225
Latency: 2770 ms (measured 2026-06-10)
Languages: 24
Rights: Commercial use; outputs are yours
Cue-following · Expressive

Kokoro
Lightweight - plain read

Cheapest. Clean, plain read. Ignores cues.

Quality Elo: 1060
Latency: 973 ms (measured 2026-06-10)
Languages: 8
Rights: Apache-2.0 model; commercial OK
Cheapest · Fast

Grok Voice
Persona voices - plain read

xAI voice with 5 personas. Plain read, ignores cues.

Quality Elo: 1197
Latency: 2444 ms (measured 2026-06-10)
Languages: 1
Rights: Commercial use; outputs are yours
5 personas · English

MAI Voice 2
Styled voice - real controls

Microsoft voice with real style and speed controls.

Quality Elo: 1007 *
Latency: 2426 ms (measured 2026-06-10)
Languages: 1
Rights: Commercial use; outputs are yours
Style controls · English

Zonos
Open-weight - plain read

Open-weight Zyphra engine with four accent voices.

Quality Elo: 1000 *
Latency: 4523 ms (measured 2026-06-10)
Languages: 1
Rights: Apache-2.0 model; commercial OK
4 accents · Open-source

Quality Elo from the Artificial Analysis Speech Arena, retrieved June 10, 2026. Latencies are our own wall-clock measurements.
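
To read the Elo gaps above, it can help to convert them into an expected head-to-head preference rate. The sketch below uses the conventional Elo formula on a 400-point scale; whether the arena's scores follow exactly this scale is an assumption, so treat the result as an approximation.

```python
def expected_preference(elo_a: float, elo_b: float) -> float:
    """Conventional Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400))
```

Under that assumption, a 165-point gap such as 1225 versus 1060 corresponds to the higher-rated engine being preferred in roughly 72% of blind comparisons, and equal ratings give 50%.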

Live now

Audiobook Studio is generating audio today.

Audiobook Studio is live in beta, with chapters and segments, a held voice per chapter with per-line overrides, acted delivery through bracketed cues, per-segment re-renders so a fix never costs a whole chapter, and a stitched WAV export. Pronunciation memory is still on the way.

Engine connected · Tool live
Open Audiobook Studio
Questions

The honest answers.

What Audiobook Studio can and cannot do today, in plain language.

Can I produce an audiobook today?
Yes, in beta. Audiobook Studio assembles chaptered projects with per-line re-renders, manuscript import, and stitched per-chapter WAV export, live today.
What is already working?
The whole chaptered workflow: manuscript import, a held voice per chapter with per-segment overrides, acted [emotion] cues, per-segment re-renders, and a stitched WAV export per chapter, all on the same five live engines as Text to Speech. Pronunciation memory is the piece still on the way.
Why is this your flagship?
Long-form is where most tools fall down: voices drift and pronunciations break over hours of audio. Audiobook Studio is designed specifically to keep a book consistent end to end.
Will I own the finished audiobook?
Yes. Every generation is yours to export and ship commercially, worldwide, with no watermark and no credit meter.
How do I get in?
Open it. Audiobook Studio is live in beta for every account: sign in, start from a manuscript or a blank chapter, and your first render is a real take. Beta means the studio is new, not that it is gated.

Start generating with Audiobook Studio.

Free to start, no credit meter. Open the console and hear it for yourself.