Browser TTS workspace

Piper TTS

Free AI text-to-speech with Piper TTS: 25 curated voices from a 904-speaker dataset, a ~75MB model, and a WebAssembly (WASM) backend. Runs entirely in your browser, offline and private.

Private generation · WAV + MP3 export · 25 voices · Fastest CPU

~75MB model (WASM) · 25 curated + 904 full voices · 3-5x realtime speed · WASM, CPU-only backend

TTS works best on desktop

Audio generation uses WebGPU or WASM, depending on the engine; Piper uses WASM only. Desktop Chrome or Edge gives the most reliable result.

About Piper TTS

Piper TTS is one of the most established open-source TTS engines, widely used in Home Assistant, accessibility tools, and edge computing projects. The model used here is built on the VITS neural architecture and was trained on the LibriTTS dataset.

The full dataset contains 904 distinct English voices. We curate 25 of the most distinct and useful speakers for the browser interface, ranging from warm narrators to professional presenters.

Piper is optimized for CPU inference via WebAssembly, generating audio 3-5x faster than realtime on standard hardware. It has a fixed 22.05kHz sample rate and runs without WebGPU — making it compatible with every modern browser.

Compare engines: Kokoro TTS (54 voices · Best quality) · Kitten TTS (8 voices · Lightest) · Supertonic TTS (5 languages · Local)

Getting Started with Piper TTS

Download the Model

Piper's model is ~75MB. It is a one-time download that is then cached in your browser.

Browse 25 Curated Voices

Each voice has a distinct vocal character — warm narrators, professional presenters, conversational tones. Pick one that matches your content style.

Enter Your Text

Type or paste up to 50,000 characters of English text. Piper handles punctuation naturally — commas, periods, and question marks all create distinct speech patterns.
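If your script is longer than the 50,000-character limit, one way to prepare it is to split it at sentence boundaries so each piece ends on punctuation Piper can phrase naturally. The sketch below is only an illustration: `split_into_chunks` is a hypothetical helper, not part of Piper or this tool, and the simple regex it uses will also split after abbreviations such as "Dr.".

```python
import re

MAX_CHARS = 50_000  # input limit stated above


def split_into_chunks(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, breaking after sentences.

    Hypothetical helper for illustration; not part of Piper or this tool.
    """
    # Split on whitespace that follows ., ! or ? so punctuation stays attached.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be pasted in and generated separately, and the resulting audio files joined afterwards.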

Generate at CPU Speed

Piper runs purely on CPU via WebAssembly — no GPU needed. It generates speech 3-5x faster than realtime, even on modest hardware.
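To see what "3-5x faster than realtime" means in practice, the arithmetic can be sketched as below. The function names are hypothetical, and the 3x and 5x bounds are simply the figures quoted above; actual speed depends on your CPU and browser.

```python
def generation_time_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Time needed to generate audio_seconds of speech at a given realtime factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


def estimate_range(audio_seconds: float) -> tuple[float, float]:
    """Return (best_case, worst_case) generation time using the 5x and 3x figures."""
    best_case = generation_time_seconds(audio_seconds, 5.0)
    worst_case = generation_time_seconds(audio_seconds, 3.0)
    return best_case, worst_case


# Ten minutes of finished audio (600 s) would take roughly two to
# three and a half minutes to generate under these assumptions.
best, worst = estimate_range(600.0)
print(f"{best:.0f}-{worst:.0f} seconds")
```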

Tips for Piper TTS

No WebGPU required. Unlike Kokoro and Kitten, Piper runs entirely on WASM/CPU. This means it works in every modern browser, including those without WebGPU support like Safari.

Fastest CPU generation. Piper generates audio 3-5x faster than realtime on a standard CPU. For bulk generation (batch-processing chapters or scripts), it is the fastest of the engines offered here.

Explore the full 904-voice dataset. The browser interface curates 25 of the best voices, but the full LibriTTS dataset has 904 speakers. If you need a specific voice character, the broader dataset may have what you need.

Fixed 22.05kHz sample rate. Piper outputs at 22.05kHz. This is fine for most use cases including podcasts and YouTube. If you need higher sample rates, use Kokoro (24kHz) or Kitten (configurable up to 48kHz).
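To make the 22.05kHz figure concrete, the sketch below works out how large an uncompressed WAV export is at that rate, using Python's standard `wave` module. The 16-bit, mono format is an assumption made for illustration and is not stated above, and `silent_wav` is a hypothetical helper that writes silence rather than real speech.

```python
import io
import wave

SAMPLE_RATE = 22_050  # Hz, the fixed output rate stated above
SAMPLE_WIDTH = 2      # bytes per sample; 16-bit PCM is an assumption
CHANNELS = 1          # mono is an assumption


def pcm_bytes(duration_seconds: float) -> int:
    """Size in bytes of the raw audio data for the given duration."""
    return int(duration_seconds * SAMPLE_RATE) * SAMPLE_WIDTH * CHANNELS


def silent_wav(duration_seconds: float) -> bytes:
    """Build an in-memory WAV file of silence (hypothetical helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(b"\x00" * pcm_bytes(duration_seconds))
    return buffer.getvalue()


# One minute of audio is about 2.6 MB of sample data under these assumptions.
print(pcm_bytes(60.0))
```

Under the same assumptions, an hour of narration comes to roughly 159MB as WAV, which is one reason the MP3 export is useful for long recordings.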