Cantari Scribe: push-to-talk dictation for Windows

Install, pair with a one-time code, hold a key, and clean text lands at your cursor in any app. This page covers how metering, privacy, and the beta caveats actually work.

Updated June 12, 2026

What Scribe is

Scribe is a small Windows app, downloaded from the Scribe page, that turns held-down speech into typed text. Hold the hotkey (Ctrl+Space by default), say the sentence, and release. The transcript is placed at your cursor in whatever app has focus: a document, an email, a chat box, a code comment.

It is dictation software for Windows in the push-to-talk style: nothing listens until you hold the key, and a live meter of your real microphone level shows while you speak. The window never takes focus away from where you are typing.
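The push-to-talk rule (nothing is captured unless the key is held) can be pictured as a two-state machine. This is a minimal sketch, not Scribe's code: the class name PushToTalk, its methods, and the on_take callback are hypothetical names chosen for illustration.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"


class PushToTalk:
    """Hypothetical model: audio is kept only between key_down and key_up."""

    def __init__(self, on_take):
        self.state = State.IDLE
        self.frames = []
        self.on_take = on_take  # called with the finished take on release

    def key_down(self):
        if self.state is State.IDLE:
            self.frames = []
            self.state = State.RECORDING

    def audio_frame(self, frame: bytes):
        # Frames that arrive while the key is up are dropped, not stored.
        if self.state is State.RECORDING:
            self.frames.append(frame)

    def key_up(self):
        if self.state is State.RECORDING:
            self.state = State.IDLE
            take = b"".join(self.frames)
            self.frames = []
            self.on_take(take)
```

In this model a frame delivered before the key goes down, or after it comes up, never reaches the take.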

New here? The setup guide walks every step from download to your first placed sentence, and the troubleshooting guide covers the known dead ends.

Connecting your account

Scribe never asks for your password. On first run it shows an 8-character code and opens cantari.io in your browser; you approve the code while signed in, and the app connects itself. The connection is a device token owned by your account.

Every connected computer is listed on your account page under Cantari Scribe, with when it was connected and last used. Disconnect revokes that computer immediately: its next dictation simply asks to reconnect.
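For readers who want to picture the pairing and disconnect steps, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the code alphabet, the token format, and the PairingService class are hypothetical and do not describe Cantari's actual server.

```python
import secrets

# Assumption: an alphabet without look-alike characters (no 0/O, 1/I).
# The real code format is not documented on this page.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


def new_pairing_code(length: int = 8) -> str:
    """Return a random one-time code, 8 characters by default."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


class PairingService:
    """Hypothetical stand-in for the server side of pairing."""

    def __init__(self):
        self.pending = {}  # code -> device name awaiting approval
        self.tokens = {}   # device token -> owning account id

    def start(self, device_name: str) -> str:
        code = new_pairing_code()
        self.pending[code] = device_name
        return code

    def approve(self, code: str, account_id: str) -> str:
        # The code is one-time: approving removes it, so reuse raises KeyError.
        self.pending.pop(code)
        token = secrets.token_urlsafe(32)
        self.tokens[token] = account_id
        return token

    def revoke(self, token: str) -> None:
        # Disconnect: the token stops working immediately.
        self.tokens.pop(token, None)

    def is_valid(self, token: str) -> bool:
        return token in self.tokens
```

The point of the shape is that the app only ever holds a device token. The password stays in the browser session where the code is approved, and revoking the token is enough to cut off one computer without touching the others.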

How dictation is metered

Dictation draws from the same monthly allowance as the studio, at the same house math: about 1,000 characters is about a minute. There is no separate Scribe subscription and no per-word meter; a Scribe minute and a studio minute are the same minute.
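As a rough worked example of the house math, the sketch below converts characters to minutes and subtracts both kinds of use from one allowance. The function names are hypothetical, and the exact rounding Cantari applies is not stated here, so the sketch does not round.

```python
CHARS_PER_MINUTE = 1000  # "about 1,000 characters is about a minute"


def minutes_used(characters: int) -> float:
    """Approximate minutes consumed by a number of characters."""
    return characters / CHARS_PER_MINUTE


def remaining_minutes(allowance_minutes: float,
                      dictated_chars: int,
                      studio_chars: int) -> float:
    """Scribe and studio use draw from the same allowance."""
    used = minutes_used(dictated_chars + studio_chars)
    return max(0.0, allowance_minutes - used)


print(minutes_used(2500))                    # 2.5
print(remaining_minutes(60, 12000, 18000))   # 30.0
```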

Each plan's dictation time is listed in the plans and features table on the pricing page. The settings window shows your live usage against the allowance, and the full metering story lives in usage and limits.

Where your audio goes

On release, the take travels to cantari.io and through the same disclosed transcription processor the studio's speech to text tool uses, then the text returns to your cursor. The audio is held in memory on your machine, never written to disk, and the processor list is named in the privacy policy.

Holds are capped at sixty seconds per take in this beta; the pill tells you when you reach the cap.
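The sixty-second cap and the in-memory rule can be sketched together as a bounded buffer. The audio format below (16 kHz, 16-bit mono) and the TakeBuffer class are assumptions for illustration; the page does not state what format Scribe records in.

```python
import io

MAX_TAKE_SECONDS = 60
SAMPLE_RATE = 16_000   # assumption: 16 kHz
BYTES_PER_SAMPLE = 2   # assumption: 16-bit mono PCM


class TakeBuffer:
    """Hypothetical take buffer: memory only, hard-capped at 60 seconds."""

    def __init__(self):
        self._buf = io.BytesIO()  # lives in RAM; no file is opened
        self.max_bytes = MAX_TAKE_SECONDS * SAMPLE_RATE * BYTES_PER_SAMPLE

    def append(self, chunk: bytes) -> None:
        # Keep only as much of the chunk as still fits under the cap.
        room = self.max_bytes - self._buf.tell()
        self._buf.write(chunk[:room])

    @property
    def capped(self) -> bool:
        return self._buf.tell() >= self.max_bytes

    @property
    def seconds(self) -> float:
        return self._buf.tell() / (SAMPLE_RATE * BYTES_PER_SAMPLE)
```

A caller would check capped after each append and show the cap notice once it turns true.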

Clipboard behavior: Scribe places text by briefly using the clipboard and restoring what was there. If the clipboard holds something other than text (an image, files), Scribe leaves it untouched rather than risk losing it.
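The save, paste, restore sequence looks roughly like this. The sketch uses a fake in-memory clipboard so it runs anywhere; FakeClipboard, place_text, and the paste callback are hypothetical names, and what Scribe does with the transcript when the clipboard holds non-text content is not described here, so the sketch only reports that it did not paste.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FakeClipboard:
    """Stand-in for the system clipboard (not a real Windows API)."""
    kind: str = "empty"          # "empty", "text", "image", "files", ...
    text: Optional[str] = None


def place_text(clip: FakeClipboard, transcript: str,
               paste: Callable[[], None]) -> bool:
    """Paste transcript via the clipboard, then restore the old contents.

    Returns False, without touching the clipboard, when it holds
    something other than text.
    """
    if clip.kind not in ("empty", "text"):
        return False
    saved_kind, saved_text = clip.kind, clip.text
    try:
        clip.kind, clip.text = "text", transcript
        paste()  # e.g. send Ctrl+V to the focused app
    finally:
        # Restore even if the paste step raises.
        clip.kind, clip.text = saved_kind, saved_text
    return True
```

Restoring in a finally block means the earlier clipboard text comes back even when the paste step fails partway.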

Beta honesty

The installer is not yet code-signed, so Windows SmartScreen asks once before running it; that warning is expected while signing is in progress. Installers exist for both regular PCs (x64) and Windows on ARM, and the wrong one simply refuses to run.
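If you are unsure which installer matches your PC and happen to have Python installed, the standard library can report the machine type. This is a convenience sketch only: installer_flavor is a hypothetical helper, and a Python build running under emulation may report the emulated architecture rather than the real one.

```python
import platform


def installer_flavor(machine: str) -> str:
    """Map a machine string to the installer label used on this page."""
    m = machine.lower()
    if m in ("amd64", "x86_64"):
        return "x64"
    if m in ("arm64", "aarch64"):
        return "arm64"
    raise ValueError(f"unrecognized machine type: {machine!r}")


if __name__ == "__main__":
    # On Windows this is typically "AMD64" or "ARM64".
    print(installer_flavor(platform.machine()))
```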

We do not publish accuracy percentages we have not measured. When we publish one, the number will carry its date and method, the same standard as every figure on the open benchmark.