cantari
Convert

WAV to Text

The master copy, every sample intact, read back as words.

Three steps

How does WAV to Text work?

Step 1: Upload or drop the file

Drag your .wav into Speech to Text. Uploads up to 25 MB per file.

Step 2: A Whisper-class model transcribes

The audio goes to a Whisper-class model and the transcript comes back in the same view, usually within seconds.

Step 3: Copy, download, or save

Copy the text, download it as .txt, or save it to your library next to the source audio.

The format

What is a WAV file?

WAV is audio with nothing thrown away: raw PCM samples behind a small header, the same numbers the analog-to-digital converter measured at the microphone. It is what DAWs print when you bounce a session, what field recorders write when the take matters, and what archives specify when a recording has to outlive its era.

The honest catch with WAV is bulk. Under this tool's 25 MB cap, uncompressed audio holds roughly two and a half minutes at 44.1 kHz 16-bit stereo, about five minutes of the same in mono, and close to nine minutes at 24 kHz 16-bit mono; all of those are approximations. For longer material, a FLAC or a high-bitrate MP3 of the same take carries the words across just as well.
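The figures above follow from simple arithmetic: uncompressed PCM costs sample rate × bytes per sample × channels for every second of audio. A minimal sketch of that calculation follows. It assumes "25 MB" means 25,000,000 bytes and that the header is the minimal 44 bytes; neither is confirmed by this page, and the function name is purely illustrative.

```python
# Rough duration budget for uncompressed PCM under a size cap.
# Assumptions (not stated by the tool): 25 MB = 25,000,000 bytes,
# and the file carries a minimal 44-byte WAV header.
CAP_BYTES = 25 * 1000 * 1000
HEADER_BYTES = 44


def minutes_under_cap(sample_rate_hz, bit_depth, channels, cap_bytes=CAP_BYTES):
    """Approximate minutes of PCM audio that fit under cap_bytes."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return (cap_bytes - HEADER_BYTES) / bytes_per_second / 60


for label, rate, channels in [
    ("44.1 kHz 16-bit stereo", 44_100, 2),
    ("44.1 kHz 16-bit mono", 44_100, 1),
    ("24 kHz 16-bit mono", 24_000, 1),
]:
    print(f"{label}: about {minutes_under_cap(rate, 16, channels):.1f} min")
```

If the cap is counted in binary megabytes instead, each figure comes out roughly 5% longer, which is why the numbers on this page are given as approximations.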

Real sources

Who ends up holding WAV files

  • DAW sessions: Pro Tools, Logic, and REAPER bounce interview and voiceover masters to WAV as a matter of habit.
  • Field recorders: handheld rigs from the Zoom and Tascam lines write WAV by default when capturing interviews on location.
  • Preservation projects: oral-history and library digitization standards name uncompressed WAV as the master format.
  • This studio's own output: Gemini Flash takes and stitched audiobook chapters leave here as WAV, so a round trip back to text works.
The honest specifics
  • Uploads up to 25 MB per file
  • Reads .wav
  • Output: plain text, as a copyable transcript or a .txt download
  • No watermark, yours to keep
Straight answers

WAV to Text questions, answered honestly.

Does WAV transcribe more accurately than MP3?
Marginally at best, and usually not measurably. A Whisper-class model is robust to sensible compression, so the lossless advantage shows up in editing and mastering, not in the transcript. Upload whichever copy you have closest to hand.
My WAV is bigger than 25 MB. What are my options?
Convert losslessly to FLAC, which typically halves the size with nothing lost, or export a 128 kbps or better MP3, which shrinks it dramatically with no real cost to the words. Both formats upload on the same page.
How many minutes of WAV fit in 25 MB?
Depends on the sample rate, bit depth, and channel count: at 16-bit, think two to three minutes of CD-quality stereo, around five minutes of 44.1 kHz mono, and close to nine minutes of 24 kHz mono spoken word. Rough numbers by design; the format stores a fixed amount per second, so your file's header tells the exact story.
Do you read timecode or markers embedded in my WAV?
No. Broadcast-WAV metadata, cue markers, and region labels are ignored; the audio samples are the only input, and plain text is the only output. Keep the original file if those markers matter to your edit.
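For the minutes-per-25-MB question above, the header of your own file gives the exact rate. A hedged sketch using Python's standard-library wave module, which reads plain PCM WAV files, is shown here. The function name and the reading of the cap as 25,000,000 bytes are assumptions for illustration, not part of this tool.

```python
import os
import wave

# Assumption: the 25 MB cap is counted as 25,000,000 bytes.
CAP_BYTES = 25 * 1000 * 1000


def wav_budget(path, cap_bytes=CAP_BYTES):
    """Read a PCM WAV header and report its size budget against a cap."""
    with wave.open(path, "rb") as wav:
        frame_rate = wav.getframerate()
        # Bytes stored per second = frames/s * bytes per sample * channels.
        bytes_per_second = frame_rate * wav.getsampwidth() * wav.getnchannels()
        duration_s = wav.getnframes() / frame_rate
    return {
        "bytes_per_second": bytes_per_second,
        "duration_s": duration_s,
        # Ignores the small header, so this is a slight overestimate.
        "max_seconds_under_cap": cap_bytes / bytes_per_second,
        "fits": os.path.getsize(path) <= cap_bytes,
    }


# Usage, with a hypothetical file name:
# print(wav_budget("interview.wav"))
```

If "fits" comes back False, the FLAC or MP3 routes described in the answer above are the way in.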
Want the longer read? Open the Speech to Text guide in the docs.

Bring the file. Leave with the words.

Drop the recording into Speech to Text and read it back in seconds. Free to start, no credit meter.