For course creators and L&D teams

Narrate a whole curriculum without re-recording.

Courses are long, and they change. Cantari gives you one clear, consistent narrator across every module, regenerated whenever you update a lesson, on a fast clean-read engine and one flat plan, so the next revision does not reopen the budget.

Start free · Open the studio →

No credit card · Real engines · The audio is yours

Image: painted study desk with headphones on a stack of books, a globe, and an open notebook

Generated voice

MP3 + WAV · yours to export

Worked example

Module 3, lesson 1: reading a balance sheet

Script fragment · Kokoro

Welcome back. In the last lesson you built a simple income statement.

Today we move to the balance sheet: what the business owns, what it owes, and what is left over for you.

Pause here and open the worksheet for module three. We will fill in the first column together.

Line 2, real Kokoro output, unedited.

Kokoro, voice Heart: a clean, neutral read that stays out of the content's way, drafted fast enough to keep a fifty-module course moving.

The honest arithmetic · about 1,000 characters is a minute of speech

~5,000: characters in a five-minute lesson
30,000: characters per pass, about half an hour of narration
1: section re-rendered when a stat changes
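The arithmetic above can be sketched in a few lines of Python. This is a rough estimator, not part of any Cantari API: the function name and the constant are illustrative, and the 1,000-characters-per-minute figure is this page's own rule of thumb, so real timings will vary with the voice and the text.

```python
# Rule of thumb from this page: about 1,000 characters is a minute of speech.
CHARS_PER_MINUTE = 1_000  # approximate; real pace varies with voice and text


def estimated_minutes(script: str) -> float:
    """Estimate narration length in minutes from the script's character count."""
    return len(script) / CHARS_PER_MINUTE


lesson = "x" * 5_000      # stand-in for a five-minute lesson script
full_pass = "x" * 30_000  # stand-in for a 30,000-character pass

print(estimated_minutes(lesson))     # 5.0
print(estimated_minutes(full_pass))  # 30.0
```

Running it on your own scripts before generating gives a quick sense of how long each lesson will play.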

Why this is hard

What e-learning actually needs.

We would rather name the friction plainly than pretend it away. Here is the problem this page is about.

The honest problem

Course content gets updated constantly: a stat changes, a policy shifts, a module gets rewritten. Re-recording narration each time is slow and expensive, and hiring a voice actor for every edit is not realistic. What you need is a clear, neutral read you can regenerate quickly whenever a lesson changes, consistent across every module so the course feels like one course.

Sound familiar?

Your course finally has paying students, and the first reviews agree on one thing: module three is out of date. You fixed the slides in an afternoon. The narration is the part you keep putting off, because last time it meant a weekend in a closet with a microphone.

How Cantari helps

Real features, mapped to the job.

Every item here works today, or says plainly where it is still in progress.

Fast, clean read

Kokoro is the fastest-drafting engine and gives a clean, neutral narration that suits instructional content, so a full curriculum comes together quickly.

Consistent across modules

Lock one voice across every lesson so a fifty-module course sounds like one narrator, not fifty different sessions.

Regenerate on every update

When a lesson changes, regenerate just that section. The flat allowance means frequent updates do not run up a per-character bill.

Own the course audio

Export MP3 or WAV with commercial rights and no watermark, yours to host on any LMS.

The workflow

How it goes, step by step.

Step 1: Paste the lesson script

Drop each module's narration into Text to Speech.

Step 2: Pick the fast clean-read engine

Choose Kokoro for the fastest clean drafts and lock the voice across modules.

Step 3: Generate and update

Generate each module, and regenerate just the section that changed when a lesson is revised.

Step 4: Export to your LMS

Export MP3 or WAV and upload to your learning platform. Commercial rights, no watermark.

Course notes

Writing e-learning narration that survives revisions.

Script for the ear, not the slide

Slide text is scannable; narration is linear. The e-learning scripts that generate best read like a teacher talking: one idea per sentence, numbers spelled the way you would say them, and the worksheet named out loud instead of pointed at. If a sentence is hard to read aloud yourself, the engine will not save it.
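A quick pre-flight check along these lines can be sketched in Python. It is only an illustration of the advice above, not a Cantari feature: the function name, the 25-word threshold, and the naive sentence splitting are all assumptions you would tune to your own scripts.

```python
import re

MAX_WORDS = 25  # assumed threshold; tune it to your own reading pace


def ear_check(script: str) -> list[str]:
    """Return warnings for sentences that may read badly aloud."""
    warnings = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", script) if s.strip()]
    for number, sentence in enumerate(sentences, start=1):
        if re.search(r"\d", sentence):
            warnings.append(
                f"sentence {number}: contains digits, consider spelling the number out"
            )
        if len(sentence.split()) > MAX_WORDS:
            warnings.append(
                f"sentence {number}: over {MAX_WORDS} words, consider splitting"
            )
    return warnings


draft = "Open the worksheet for module 3. We will fill in the first column together."
for warning in ear_check(draft):
    print(warning)
```

A check like this catches only the mechanical problems. Reading the sentence aloud yourself remains the real test.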

Chunk lessons so updates stay cheap

Generate each lesson as its own file rather than one module-length render. When a figure changes in lesson three, you re-render about ninety seconds, not thirty minutes, and your platform sees one swapped file. The chunking that good e-learning design already wants turns out to be the chunking that makes regeneration nearly free.
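One way to keep track of which lesson files need a new render is to store a fingerprint of each script at render time and compare it on the next pass. The sketch below assumes you keep your scripts as text keyed by a lesson id. The ids, function names, and sample sentences are hypothetical, and nothing here calls Cantari itself.

```python
import hashlib


def fingerprint(script: str) -> str:
    """Stable hash of a lesson script, ignoring leading and trailing whitespace."""
    return hashlib.sha256(script.strip().encode("utf-8")).hexdigest()


def lessons_to_rerender(scripts: dict[str, str], rendered: dict[str, str]) -> list[str]:
    """Return ids of lessons whose script changed since the last render.

    scripts:  lesson id -> current script text
    rendered: lesson id -> fingerprint stored at the last render
    """
    return sorted(
        lesson_id
        for lesson_id, text in scripts.items()
        if rendered.get(lesson_id) != fingerprint(text)
    )


# Hypothetical module with three lessons, all rendered once.
scripts = {
    "m3-l1": "Welcome back. Today we move to the balance sheet.",
    "m3-l2": "Assets are what the business owns.",
    "m3-l3": "Revenue grew twelve percent last year.",
}
rendered = {lesson_id: fingerprint(text) for lesson_id, text in scripts.items()}

# A figure changes in lesson three only.
scripts["m3-l3"] = "Revenue grew fourteen percent last year."

print(lessons_to_rerender(scripts, rendered))  # ['m3-l3']
```

Lessons that have never been rendered have no stored fingerprint, so they show up in the list as well.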

Feedback lines are content too

Quiz feedback, hints, and the encouragement between sections deserve the same narrator as the lessons. Each is a sentence or two, a tiny generation, and rendering them in the course voice is the difference between an e-learning module that feels produced and one that feels assembled.

Recommended engine

Start with Kokoro.

Kokoro gives the fastest clean drafts and a plain, neutral read, which is exactly what instructional narration wants. That speed keeps a long curriculum moving.

Kokoro · Lightweight, plain read

Cheapest. Clean, plain read. Ignores cues.

Quality Elo: 1060
Latency: 973 ms (measured 2026-06-10)
Languages: 8
Rights: Apache-2.0 model; commercial OK

Cheapest · Fast

Hear a line for this use case

“In this module, we will walk through each step in order, so take your time and follow along.”

Real Kokoro output, recorded unedited.

Tools behind it: Text to Speech · Audiobook Studio · Speech to Text

The honest answers.

What Cantari can and cannot do for e-learning today, in plain language.

Why Kokoro for courses?

Kokoro gives the fastest clean drafts and a neutral read. For instructional narration you usually want clarity over drama, and Kokoro delivers that quickly, which matters across a long course you revise often.

What happens when I update a lesson?

Regenerate just the section that changed. The flat allowance means frequent updates do not run up a per-character bill the way credit plans do.

Can I use the narration on my LMS commercially?

Yes. Every generation is yours to export and host commercially, with no watermark and no attribution required.

Keep exploring

Try Cantari for e-learning.

Free to start, no credit meter. Open the studio and hear it for yourself.

Start free · Open the studio