How to Transcribe Audio to Text (Free, In Your Browser)

Updated 2026-06-21

To transcribe audio to text, open a tool that runs the Whisper speech-to-text model in your browser, drop in your audio file (or speak into your mic), and let it generate the transcript — then edit and export it. Because it runs on-device, the audio never leaves your computer.

Transcribe an existing audio file

If you already have a recording — an MP3 of a meeting, a voice memo, a podcast, a lecture — you don't need to retype it.

Open Voice Dictation and drop your audio file onto the page (or use the file picker).
Pick the spoken language, or let it auto-detect. Over 90 languages are supported.
Wait while Whisper transcribes on your device. Longer recordings take longer, and because the work happens locally rather than on a server, the speed depends on your hardware.
Read the result, fix any stray words, and export.
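To get a rough feel for why longer files take longer, here is a minimal sketch of the arithmetic. It relies on one known fact about the Whisper model, which is that it processes audio in 30-second windows. Everything else is an assumption for illustration: the sketch treats the windows as non-overlapping, and `seconds_per_window` is a hypothetical figure you would have to measure on your own device, not a number this tool publishes.

```python
import math

WINDOW_SECONDS = 30  # Whisper's fixed input window length


def count_windows(duration_seconds: float, window_seconds: int = WINDOW_SECONDS) -> int:
    """Number of fixed-length windows needed to cover a recording."""
    if duration_seconds <= 0:
        return 0
    return math.ceil(duration_seconds / window_seconds)


def estimate_seconds(duration_seconds: float, seconds_per_window: float) -> float:
    """Rough processing time; seconds_per_window is a hypothetical, device-specific value."""
    return count_windows(duration_seconds) * seconds_per_window


# A 20-minute recording needs 40 windows.
windows = count_windows(20 * 60)
```

The work grows in step with the length of the recording, so a file twice as long needs roughly twice as many windows. A real tool may overlap windows or skip silence, so treat the result as an order-of-magnitude guide only.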

A worked example: a 20-minute interview saved as an MP3 comes back as a full transcript in paragraph form that you can clean up and paste into your notes, without uploading a single byte to a third party.

Dictate with your microphone

You can also skip the keyboard entirely. Choose the microphone option, start speaking, and watch the text appear. This is voice typing for drafting emails, notes, journal entries, or first drafts — handy when your hands are busy or typing is slow.

A practical tip: speak in steady, complete phrases rather than one word at a time. Whisper uses surrounding context to pick the right spelling, so natural sentences transcribe more accurately than clipped, isolated words.

Fix recurring words and export

Every speech-to-text engine mishears the occasional proper noun: a name, a product, a piece of jargon. Instead of correcting the same word ten times, use the "fix recurring words" feature to replace every instance at once (for example, mapping a misheard brand name to its correct spelling across the whole transcript).
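If you ever need to apply the same kind of correction outside the tool, for example to an exported TXT file, the idea can be sketched as a whole-word, case-insensitive find-and-replace. This is an illustration of the technique, not the tool's actual implementation. The function name `fix_recurring_words` and the misheard brand "Ackme" are made up for the example.

```python
import re


def fix_recurring_words(transcript: str, corrections: dict[str, str]) -> str:
    """Replace every whole-word occurrence of each misheard term, ignoring case."""
    for wrong, right in corrections.items():
        # Lookarounds keep the match from firing inside a longer word.
        pattern = re.compile(
            r"(?<!\w)" + re.escape(wrong) + r"(?!\w)",
            flags=re.IGNORECASE,
        )
        # A function replacement inserts the text literally, with no escape handling.
        transcript = pattern.sub(lambda _match, fixed=right: fixed, transcript)
    return transcript


text = "Ackme shipped the update. I emailed ackme yesterday about Ackmeville."
fixed = fix_recurring_words(text, {"ackme": "Acme"})
```

Matching whole words only is the safer default here: "Ackmeville" is left alone, while both spellings of the misheard brand are corrected in one pass.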

When the text is clean, export in the format you need:

TXT — plain text for notes, documents, or pasting anywhere.
Markdown — formatted text for blogs, wikis, and docs.
SRT — timestamped subtitles, ready to attach to a video so captions line up with the audio.

The SRT export is what makes this a quick captioning workflow: transcribe the audio from a video, tidy the text, and you have a subtitle file without a captioning service.
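For readers curious about what an SRT file contains, the format is simple: each caption is a numbered block with a start and end time written as HH:MM:SS,mmm, followed by the caption text and a blank line. Below is a minimal sketch that builds such a file from timed segments. The `(start, end, text)` tuple layout and the function names are assumptions for illustration; the tool's internal data structures are not documented here.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, caption) tuples."""
    blocks = []
    for index, (start, end, caption) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{caption.strip()}\n")
    # Joining with a newline leaves one blank line between caption blocks.
    return "\n".join(blocks)


srt_text = to_srt([
    (0.0, 2.5, "Hello there."),
    (2.5, 5.0, "Welcome back."),
])
```

Note the comma before the milliseconds and the ` --> ` separator with spaces on both sides; video players tend to be strict about both.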

Why on-device matters

Most online transcription services upload your audio to their servers, which is a real concern for confidential meetings, medical notes, interviews, or anything personal. This tool runs the Whisper model entirely in your browser tab. There's no account, no upload, and no copy of your recording sitting on someone else's machine — the file and the transcript stay on your device.

Ready to turn speech into text? Open Voice Dictation, drop in your audio or tap the mic, and get an editable, exportable transcript in your browser.

Try Voice Dictation →