Simple and Free Speech-to-Text on macOS

After seeing Wispr Flow mentioned a few times, I was curious about using speech-to-text to interact with Claude Code. The idea of paying $12/month and potentially sharing all my prompts wasn’t ideal, so I decided to see what Claude Code could help me build.

With a bit of experimentation, I settled on using sox for audio recording, since it seemed the best at silence detection, and parakeet-mlx for transcribing. I tried various improvements to silence detection, but found turning up my input volume helped the most.

With three commands in a script, I have a decent local speech-to-text solution:

#!/bin/bash

RECORDING_FILE="/tmp/record-recording.wav"
TRANSCRIPT_FILE="/tmp/record-recording.txt"

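# Remove leftovers from a previous run so a failed recording or
# transcription can't print stale text.
rm -f "${RECORDING_FILE}" "${TRANSCRIPT_FILE}"

# Record until one second of silence, or 60 seconds at most.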
rec -q "${RECORDING_FILE}" rate 16k pad 0.2 0 silence 1 0.05 1% 1 1.0 1% trim 0 60

# Transcribe and output.
if [ -f "${RECORDING_FILE}" ]; then
  parakeet-mlx "${RECORDING_FILE}" --output-format txt --output-dir /tmp --chunk-duration 30 >/dev/null 2>&1

  if [ -f "${TRANSCRIPT_FILE}" ]; then
    cat "${TRANSCRIPT_FILE}"
  fi
fi
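
The rec arguments are the least readable part of the script, so here is a rough sketch of the same command with each effect pulled out into a named variable. The variable names, the record_until_silence function, and the DRY_RUN switch are my own illustrative additions, not part of sox. The comments reflect my reading of the sox manual, so treat them as a starting point for tuning rather than a reference.

```shell
#!/bin/bash

# Hypothetical names; the values mirror the rec command in the script above.
SAMPLE_RATE="16k"      # rate: resample the recording to 16 kHz
PAD_START="0.2"        # pad: seconds of silence added at the start
PAD_END="0"            # pad: seconds of silence added at the end
START_DURATION="0.05"  # silence: sound must last this long to count as speech
STOP_DURATION="1.0"    # silence: this much continuous quiet ends the recording
THRESHOLD="1%"         # silence: level below which audio counts as quiet
MAX_SECONDS="60"       # trim: keep at most this many seconds

# Build the argument list once, then either print it or run rec with it.
# DRY_RUN=1 (an illustrative switch) prints the command instead of recording.
record_until_silence() {
  set -- -q "$1" \
    rate "${SAMPLE_RATE}" \
    pad "${PAD_START}" "${PAD_END}" \
    silence 1 "${START_DURATION}" "${THRESHOLD}" 1 "${STOP_DURATION}" "${THRESHOLD}" \
    trim 0 "${MAX_SECONDS}"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "rec $*"
  else
    rec "$@"
  fi
}

# Usage:
#   DRY_RUN=1 record_until_silence /tmp/record-recording.wav   # inspect
#   record_until_silence /tmp/record-recording.wav             # record
```

If recordings cut off mid-sentence, STOP_DURATION and THRESHOLD are the two values I would adjust first, although raising the input volume is still what made the biggest difference for me.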

The real script is a little more verbose, but this demonstrates the core of it. I use Hammerspoon to trigger the script and type the transcript for me. It also displays a status indicator in the menu bar while recording and transcribing.

My next step is to decide what to use for transcription on Linux so I can use it on my other machine. I might also create a second script that pipes the results through an LLM for use outside of prompts.

December 8th, 2025 at 10:00 PM — ai, speech-to-text