On-device speech to text, your professional vocabulary, and a voice line straight into your coding agent. Nothing leaves your Mac unless you say so.
Dictate a paragraph, watch it clean up on-device, then fire a command into your open agent. Fifteen seconds, no cloud round trip.
WhisperKit speech to text and a local model run entirely on your Mac. Cloud fallback is off by default and redacted when you turn it on. No account, no telemetry, no upload.
Teach it once. It biases recognition before it guesses and corrects after the fact. ISO 42001, GDPR, RCW, the names in your matters. Built for how professionals actually speak.
Hold a key, speak, and the command lands in your open Claude Code or Codex session. Connectors are plain files, so you and anyone who forks it can add their own.
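To make the "corrects after the fact" step concrete, here is a minimal sketch of post-recognition vocabulary correction. It is not VoxFlow Local's actual code: the `VOCAB` mapping and the `build_corrector` helper are hypothetical names used only for illustration, and the real vocabulary pack format may look quite different.

```python
import re

# Hypothetical vocabulary pack: lowercase spoken or misrecognized form -> canonical term.
VOCAB = {
    "iso 42001": "ISO 42001",
    "gdpr": "GDPR",
    "g d p r": "GDPR",
    "rcw": "RCW",
}


def build_corrector(vocab):
    """Return a function that rewrites known terms to their canonical form."""
    # Longest keys first, so multi-word entries win over shorter ones.
    keys = sorted(vocab, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )

    def correct(text):
        # Look up each match by its lowercase form and substitute the canonical term.
        return pattern.sub(lambda m: vocab[m.group(1).lower()], text)

    return correct


correct = build_corrector(VOCAB)
print(correct("the gdpr and iso 42001 apply under rcw"))
```

A lookup table like this only covers the "after" half. Biasing recognition before decoding happens inside the speech model and is not shown here.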
The same reasons the best local-first tools exist, applied to dictation.
Audio and transcripts stay on the machine. There is no server to trust, breach, or subpoena.
You run it. No per-seat pricing that climbs every year, no usage caps, no lock-in.
Add vocabulary packs, write new agent connectors, wire in your own tools. The codebase is yours to shape.
Open source under MIT. No black box around how audio is handled or how text is cleaned.
| | VoxFlow Local | Cloud dictation (Wispr, Otter, cloud STT) |
|---|---|---|
| Where audio goes | Stays on your Mac | Uploaded to a server |
| Monthly cost | Free, you run it | Subscription, per seat |
| Your vocabulary | Learned and biased on-device | Generic, often capped |
| Drives your agent | Voice into Claude Code / Codex (experimental) | No |
| Source code | Open, MIT, fork it | Closed |
| Works offline | Yes | No |
macOS 14 or later, Apple Silicon. A free Apple ID developer certificate keeps the accessibility grant across rebuilds; ad-hoc signing works without an Apple account.

    # 1. clone
    git clone https://github.com/ZOLAtheCodeX/voxflow-local.git
    cd voxflow-local

    # 2. bootstrap the backend and models
    ./scripts/bootstrap_all.sh

    # 3. run the backend, then the app
    ./scripts/run_backend.sh
    swift run VoxFlowLocal

Full signing matrix and Ollama setup in the README. Text polish uses a local Ollama model; the regex pipeline is the fallback when Ollama is not running.
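The fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's pipeline: the model name, the prompt wording, the filler-word regex, and the function names `regex_cleanup`, `ollama_polish`, and `polish` are all hypothetical. The only external detail relied on is Ollama's default local endpoint, `http://localhost:11434/api/generate`.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "llama3.2"  # assumption: substitute whichever model you have pulled

# Assumed filler words; a real pipeline would likely have more rules.
FILLERS = re.compile(r"\b(?:um+|uh+|er+)\b,?\s*", re.IGNORECASE)


def regex_cleanup(text):
    """Cheap cleanup: drop fillers, collapse spaces, capitalize, add a period."""
    text = FILLERS.sub("", text)
    text = re.sub(r"\s+", " ", text).strip()
    if text:
        text = text[0].upper() + text[1:]
        if text[-1] not in ".!?":
            text += "."
    return text


def ollama_polish(text, url=OLLAMA_URL, timeout=5.0):
    """Ask a local Ollama model to tidy the transcript (non-streaming)."""
    payload = json.dumps({
        "model": MODEL,
        "prompt": "Clean up this dictated text without changing its meaning:\n" + text,
        "stream": False,
    }).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"].strip()


def polish(text, llm=ollama_polish):
    """Try the local model first; fall back to regex if it is unavailable."""
    try:
        return llm(text)
    except (OSError, KeyError, ValueError):
        # OSError covers connection refused and timeouts (URLError subclasses it);
        # KeyError and ValueError cover malformed responses.
        return regex_cleanup(text)
```

Passing the model call in as the `llm` parameter keeps the fallback testable without a running Ollama instance.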