v1.0 · Open source · MIT · Linux · Windows · macOS (soon)

Local Whisper. Cloud models. One interface.

A local-first desktop app for dictation. Switch between free local models and paid cloud providers in one click. Transcribed text is typed straight at your cursor, in any app.

[Screenshot: Smart Voice Flow main window showing live transcription in progress]
Live streaming transcription with partial results. Red indicator = recording.

/ what you get

Four pillars. No compromise.

Every feature is built for daily use, not for a demo reel.

01

Free and private by default

Run faster-whisper on your own machine. No audio leaves your computer. Eight model sizes, from tiny (about 40 MB) to large-v3 (about 3 GB). Works offline.

No account. No cloud. No leak.
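
For a sense of what running locally involves, here is a minimal Python sketch that calls the faster-whisper library directly. It is an illustration, not Smart Voice Flow's own code: the helper names `join_segments` and `transcribe_locally`, the default model size, and the CPU/int8 settings are all assumptions.

```python
from typing import Iterable, Protocol


class Segment(Protocol):
    """Anything with a .text attribute, such as faster-whisper's segments."""

    text: str


def join_segments(segments: Iterable[Segment]) -> str:
    """Concatenate segment texts into one string, skipping empty pieces."""
    return " ".join(s.text.strip() for s in segments if s.text.strip())


def transcribe_locally(audio_path: str, model_size: str = "tiny") -> str:
    """Transcribe one audio file on this machine (hypothetical helper)."""
    # Imported lazily so join_segments works without the package installed.
    from faster_whisper import WhisperModel

    # int8 on CPU keeps memory use low; pick a larger size for accuracy.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus audio metadata.
    segments, info = model.transcribe(audio_path)
    print(f"Detected language: {info.language}")
    return join_segments(segments)
```

Calling `transcribe_locally("meeting.wav")` downloads the chosen model on first use and reads it from the local cache afterwards, which is what makes offline use possible.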

02

Cloud accuracy on demand

OpenAI gpt-4o-transcribe, Deepgram Nova-3, AssemblyAI Universal, OpenRouter, Speaches. Bring your own API key. Switch in one click.

Local for drafts. Cloud for final takes.

03

A profile for every context

German podcast, English code comments, mixed-language meeting notes — each profile has its own model, post-processing, and hotkeys. Switch on the fly.

One app. Every use case.
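
One way to picture a profile is as a small record of settings, plus a registry that tracks which one is active. The Python sketch below is purely illustrative: the `Profile` fields, the `ProfileRegistry` class, the hotkey strings, and the post-processing step names are hypothetical, not the app's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical per-context settings bundle."""

    name: str
    language: str                          # e.g. "de", "en", or "auto"
    model: str                             # e.g. "faster-whisper/large-v3"
    post_processing: tuple[str, ...] = ()  # assumed step names
    hotkey: str = ""


class ProfileRegistry:
    """Holds profiles by name and remembers the active one."""

    def __init__(self, profiles):
        self._profiles = {p.name: p for p in profiles}
        # The first profile passed in starts out active.
        self.active = next(iter(self._profiles.values()))

    def switch(self, name: str) -> Profile:
        # Raises KeyError for an unknown profile name.
        self.active = self._profiles[name]
        return self.active


registry = ProfileRegistry([
    Profile("podcast-de", "de", "faster-whisper/large-v3", ("punctuate",), "ctrl+alt+p"),
    Profile("code-comments", "en", "faster-whisper/small", (), "ctrl+alt+c"),
])
print(registry.switch("code-comments").model)
```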

04

Know what you spend

Every cloud call is logged with tokens, duration, and cost. Export to CSV. Set per-profile budgets. Never get surprised by a bill.

Receipts for every second.
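
If you export the log, a few lines of Python are enough to total spend per profile and flag anything over budget. This is a sketch under assumptions: the column names `profile` and `cost_eur`, the sample rows, and the budget figures are made up for illustration, so check them against a real export.

```python
import csv
import io
from collections import defaultdict


def spend_by_profile(csv_text: str) -> dict[str, float]:
    """Sum the cost column per profile from an exported usage log."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["profile"]] += float(row["cost_eur"])
    # Round to tame floating-point noise from repeated addition.
    return {name: round(total, 4) for name, total in totals.items()}


def over_budget(totals: dict[str, float], budgets: dict[str, float]) -> list[str]:
    """Return profiles whose spend exceeds their budget (no budget, no limit)."""
    return sorted(p for p, spent in totals.items() if p in budgets and spent > budgets[p])


# Made-up rows in the assumed export layout.
SAMPLE = """profile,provider,duration_ms,cost_eur
meeting-notes,openai,2480,0.012
podcast-de,deepgram,60000,0.250
meeting-notes,openai,5000,0.024
"""

totals = spend_by_profile(SAMPLE)
print(totals)
print(over_budget(totals, {"meeting-notes": 0.03, "podcast-de": 1.00}))
```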

/ workflow

Three steps. Zero friction.

  1. Pick a profile

    Choose your language, model, and post-processing in one click. Presets for podcasting, code comments, email, and notes ship by default.

    [Screenshot: Profile selection screen]
  2. Hold your hotkey and speak

    A floating indicator shows it's listening. Pause, think, keep going. Live partial results appear so you can course-correct mid-sentence.

    [Screenshot: Floating recording indicator overlay]
  3. Release. Text lands at your cursor.

    Transcription is typed directly into whatever has focus — VS Code, Notion, your CMS, a terminal, an email draft. No copy-paste.

    [Screenshot: Text appearing at cursor in external application]

/ for developers

Use Smart Voice Flow from anywhere.

A local HTTP API lets other apps tap into your configured models. One endpoint, any ASR engine, one unified response shape.

  • Localhost-only by default — no public surface
  • Bring-your-own-key respected per profile
  • Streaming + non-streaming response modes
  • CLI companion for scripts and pipelines
Full API reference in the docs
cURL
$ curl -X POST http://localhost:8123/transcribe \
    -F "audio=@meeting.wav" \
    -F "profile=meeting-notes"

> {
    "text": "Kickoff on Monday, agenda attached...",
    "profile": "meeting-notes",
    "model": "faster-whisper/large-v3",
    "duration_ms": 2480,
    "cost_eur": 0.00
  }
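
The same request can be made from Python using only the standard library. The sketch below mirrors the cURL call above: the URL and the `audio` and `profile` field names come from that example, while `build_multipart` and `transcribe` are hypothetical helper names and the response is assumed to be the JSON shape shown.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Optional

# Endpoint taken from the cURL example; adjust if your port differs.
API_URL = "http://localhost:8123/transcribe"


def build_multipart(fields: dict, file_field: str, filename: str,
                    file_bytes: bytes, boundary: Optional[str] = None):
    """Encode text fields plus one file as multipart/form-data.

    Returns a (body_bytes, content_type_header) pair.
    """
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
        chunks.append(part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    )
    chunks.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path: str, profile: str, url: str = API_URL) -> dict:
    """POST an audio file to the local server and return the parsed JSON."""
    path = Path(audio_path)
    body, content_type = build_multipart(
        {"profile": profile}, "audio", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

With the app running, `transcribe("meeting.wav", "meeting-notes")["text"]` should give you the transcript, assuming the server answers with the fields shown in the cURL response.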

/ what people are saying

Built for real workflows.

"I dictate podcast scripts in German with local Whisper, then switch to the cloud profile for final polish. Twenty minutes saved per episode, every episode."
M. K., podcaster · Berlin

"Replaced three separate SDKs with one local API. Cost tracking alone earned it a permanent spot in my dotfiles."
T. S., backend engineer

"Teaching online means switching languages mid-session. Profiles make that trivial. Also: actually private. No upload."
A. H., language tutor

"The floating indicator is what sold me. No guessing whether it's listening. And the open-source license means I can ship it to my team without procurement drama."
J. L., product designer

Used Smart Voice Flow? Send a short note — honest feedback is how this project grows.

Send a review

/ get it

Install in under a minute.

Two platforms at launch. macOS is on the way.

Linux

X11 and Wayland. Python 3.11+ recommended.

Windows

Signed binary. UAC-aware installer.

macOS
Coming soon · Apple Silicon + Intel

On macOS? Help us test so we can ship it sooner.

Or build from source. Open source · MIT · free forever

/ questions

Things people ask.

Is Smart Voice Flow free?
Yes. Local Whisper is completely free. Cloud models use your own API keys and you pay the provider directly.
What operating systems are supported?
Linux (X11 and Wayland) and Windows at launch. macOS support is in progress.
Does my audio leave my computer?
With local Whisper: no. Audio stays on your machine. With cloud models: yes — audio is sent to the provider you chose (OpenAI, Deepgram, etc.) under their terms.
Which cloud providers are supported?
OpenAI (gpt-4o-transcribe), Deepgram (Nova-3), AssemblyAI (Universal), OpenRouter, and self-hosted Speaches.
How accurate is local Whisper compared to cloud?
Local large-v3 comes very close to cloud accuracy, especially on clean audio. For noisy or niche-vocabulary audio, cloud models still win on average. You can A/B test both in the app.
Does it work offline?
Local models work fully offline. Cloud calls need a connection.
Can I use it for live captioning?
Yes — streaming transcription with partial results is supported. Pair it with OBS or any window capture.
Is there an API?
Yes. Smart Voice Flow runs a local HTTP server you can call from other apps. See the API reference.
Is it really open source?
Yes. MIT license. All code on GitHub. Contributions welcome.
Where do I report a bug?
GitHub Issues. Template provided.