Engine

Gemini Flash

The only engine here that acts on your bracketed [emotion] directions.

Cue-following · Expressive · Multilingual


Generated by Gemini Flash

5 real voices · previews below

The numbers

Scores, speed, and the real rate.

The same trust rows as the open benchmark: a third-party quality score, our own measured latency, and the raw engine cost in the open.

Quality Elo*: 1225
Measured latency: 2770 ms (measured 2026-06-10)
Languages: 24
Follows [cues]: Yes, acts bracketed [cues]
Engine cost: $1/M in + $20/M out (rate checked 2026-06-11)
The provider’s published rate when we last checked. Rates move, and when they do we update this row. It’s here so you can weigh our flat pricing against the raw cost underneath, instead of taking our word for it.
Rights: Commercial use; outputs are yours
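
To make the engine cost row concrete, here is a minimal sketch of the arithmetic behind it. It assumes "M" means one million billable units (for example tokens), which the row itself does not spell out, and the function name and example volumes are hypothetical.

```python
# Rates copied from the "Engine cost" row (checked 2026-06-11).
# Assumption: "M" is one million billable units; the page does not name the unit.
INPUT_RATE_PER_M = 1.00    # USD per million input units
OUTPUT_RATE_PER_M = 20.00  # USD per million output units


def raw_engine_cost(input_units: int, output_units: int) -> float:
    """Return the raw provider cost in USD for one job (hypothetical helper)."""
    input_cost = (input_units / 1_000_000) * INPUT_RATE_PER_M
    output_cost = (output_units / 1_000_000) * OUTPUT_RATE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical job: 2,000 units in, 50,000 units out.
    cost = raw_engine_cost(input_units=2_000, output_units=50_000)
    print(f"${cost:.4f}")  # prints "$1.0020"
```

Because the output rate is twenty times the input rate, the output side dominates the raw cost for almost any job, so that is the figure to compare against a flat price.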

* Quality Elo from the Artificial Analysis Speech Arena, retrieved June 10, 2026. It is a user-vote arena rating; the top-rated model overall is Fun-Realtime-TTS at 1228.06.

Latency is our own wall-clock time to full audio, measured 2026-06-10 on the same path that serves the studio. A measurement, not a server SLA.
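
For readers who want to reproduce a comparable number, the sketch below shows one way to time "wall-clock to full audio": start a clock, wait for the complete audio, stop the clock. It is an illustration under assumptions, not the harness behind the figure above; `synthesize` stands in for whatever client call returns the finished audio, and `fake_synthesize` is a hypothetical placeholder so the example runs on its own.

```python
import time
from typing import Callable, Tuple


def time_to_full_audio_ms(
    synthesize: Callable[[str], bytes], text: str
) -> Tuple[bytes, float]:
    """Time one synthesis call from request to complete audio, in milliseconds."""
    start = time.perf_counter()   # monotonic clock, suited to measuring intervals
    audio = synthesize(text)      # blocks until the full audio has been returned
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return audio, elapsed_ms


def fake_synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a real engine call; sleeps, then returns dummy bytes."""
    time.sleep(0.05)
    return b"\x00" * len(text)


if __name__ == "__main__":
    audio, elapsed_ms = time_to_full_audio_ms(fake_synthesize, "Hello there")
    print(f"{len(audio)} bytes in {elapsed_ms:.0f} ms")
```

A single run like this includes network and queueing time on the caller's side, which is why the result is a measurement rather than a guarantee.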

Voices