Voice Dictation

Speak Your Notes

You’re deep in a coding session. You need to jot down an architecture decision before you forget it. But switching to your notes app, finding the right file, and typing it out breaks your flow.

Voice dictation lets you speak directly into any text field on your Mac. Press a hotkey, talk, and your words appear wherever the cursor is — your editor, a note, a terminal, a chat window. No context switch. No copy-paste.

Dictation runs entirely on-device. Your voice never leaves your Mac — no cloud APIs, no server round-trips, no audio uploads. Everything stays local.

How It Works

You: *press Option+Space*

Floating pill appears near cursor: "Recording..."

You: "Add a caching layer between the API gateway
      and the database. Use Redis for session data
      and in-memory LRU for hot queries."

You: *release Option+Space*

Pill shows: "Processing..."

Text appears in your focused field, cursor moves to end.

Pill disappears.

The entire flow happens in-place. The floating pill follows your text cursor so you always know the state without looking away from your work.

Activation Modes

Two ways to control recording:

Push-to-Talk (Default): Hold the hotkey to record. Release to transcribe. Best for quick notes and short dictations: hold, speak, release.

Toggle: Press the hotkey once to start recording. Press again to stop. Better for longer dictations where holding a key is uncomfortable.

Toggle mode also supports a hybrid behavior: if you hold the key for more than 300 ms, it acts like push-to-talk (releasing the key stops the recording). A quick tap keeps recording until you tap again.
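The two modes and the hybrid hold behavior can be modeled as a small state machine. The sketch below is illustrative only: `HotkeyController`, its method names, and the choice to stop on key release in Toggle mode are assumptions, not Strayfiles internals. Only the 300 ms threshold comes from this guide.

```python
from dataclasses import dataclass
from typing import Optional

HOLD_THRESHOLD_S = 0.300  # from the guide: holds longer than 300 ms act like push-to-talk


@dataclass
class HotkeyController:
    """Hypothetical controller. mode is "push_to_talk" or "toggle"."""
    mode: str = "push_to_talk"
    recording: bool = False
    _pressed_at: Optional[float] = None
    _started_on_this_press: bool = False

    def key_down(self, now: float) -> None:
        """Hotkey pressed at time `now` (seconds)."""
        self._pressed_at = now
        # A press while idle starts recording in both modes.
        self._started_on_this_press = not self.recording
        if not self.recording:
            self.recording = True

    def key_up(self, now: float) -> None:
        """Hotkey released at time `now` (seconds)."""
        held = now - self._pressed_at if self._pressed_at is not None else 0.0
        self._pressed_at = None
        if self.mode == "push_to_talk":
            # Releasing always ends the recording.
            self.recording = False
        elif self._started_on_this_press:
            # Toggle mode, first press: a long hold behaves like push-to-talk,
            # a quick tap leaves the recording running.
            if held > HOLD_THRESHOLD_S:
                self.recording = False
        else:
            # Toggle mode, second press: stop.
            self.recording = False
```

In this model a quick tap in Toggle mode leaves `recording` set until the next tap, while a hold longer than the threshold ends the recording on release.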

Auto-Stop on Silence

When enabled, dictation automatically stops recording after 2 seconds of silence. This works in both modes but is most useful with Toggle — press the hotkey, speak, and walk away. Dictation finishes on its own.

If no speech is detected within 5 seconds of starting, the recording cancels automatically.
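The two timeouts above boil down to one decision made on each audio tick. This is a minimal sketch: the function name and arguments are hypothetical, and it assumes the 5-second no-speech cancel applies whether or not auto-stop is enabled, which this guide does not specify.

```python
from typing import Optional

SILENCE_STOP_S = 2.0      # guide: stop after 2 seconds of silence
NO_SPEECH_CANCEL_S = 5.0  # guide: cancel if no speech within 5 seconds of starting


def next_action(now: float, started_at: float, last_speech_at: Optional[float],
                auto_stop_enabled: bool = True) -> str:
    """Return "continue", "stop", or "cancel" for the current recording.

    last_speech_at is None until speech has been detected at least once.
    All times are in seconds on the same clock.
    """
    if last_speech_at is None:
        # Nothing heard yet: cancel once the start-up window has elapsed.
        if now - started_at >= NO_SPEECH_CANCEL_S:
            return "cancel"
        return "continue"
    # Speech was heard earlier: stop after a long enough pause, if enabled.
    if auto_stop_enabled and now - last_speech_at >= SILENCE_STOP_S:
        return "stop"
    return "continue"
```

For example, `next_action(6.5, 0.0, 4.5)` returns `"stop"` because 2 seconds have passed since the last detected speech.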

The Floating Pill

A small pill-shaped indicator appears near your text cursor during dictation:

| State | Appearance | Meaning |
| --- | --- | --- |
| Recording | Pulsing microphone icon | Capturing audio |
| Processing | Waveform icon | Transcribing your speech |
| Error | Warning icon | Something went wrong |

The pill positions itself below your text cursor using the Accessibility API. On multi-monitor setups, it appears on the screen where your cursor is. If the cursor position can’t be determined (some apps don’t expose it), the pill falls back to your mouse position.
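The positioning fallback can be sketched as a pure function. Everything here is an assumption made for illustration: the `Rect` type, the `PILL_GAP` spacing, and a top-left coordinate origin with y increasing downward. The real app gets the caret rectangle from the Accessibility API, which this sketch does not call.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


PILL_GAP = 6.0  # hypothetical spacing between the caret and the pill


def pill_anchor(caret: Optional[Rect], mouse: Tuple[float, float],
                screens: Sequence[Rect]) -> Tuple[float, float, int]:
    """Return (x, y, screen_index) for the pill.

    caret is None when the focused app does not expose the cursor position.
    """
    if caret is not None:
        # Preferred: just below the text cursor.
        x, y = caret.x, caret.y + caret.height + PILL_GAP
        probe = (caret.x, caret.y)
    else:
        # Fallback: the mouse position.
        x, y = mouse
        probe = mouse
    # Pick the screen containing the reference point; default to the first.
    index = next((i for i, s in enumerate(screens) if s.contains(*probe)), 0)
    return x, y, index
```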

During recording, a live preview of partial transcription text appears in the pill so you can see what’s being recognized in real-time.

Text Injection

After transcription, the text is injected into whatever application has focus. Strayfiles tries three methods in order:

  1. Accessibility API — Reads the current text field value, inserts at the cursor position, and moves the cursor forward. Handles emoji and multi-byte characters correctly. This is the preferred method.

  2. Keyboard Simulation — Types the text character by character using simulated key events. Works in apps that don’t expose their text fields to the Accessibility API.

  3. Clipboard Paste — Saves your clipboard, pastes the transcription, then restores your original clipboard contents. Universal fallback that works in secure text fields and protected areas.

The method is chosen automatically. You don’t need to configure anything.
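The ordering above is a plain fallback chain: try each method and stop at the first one that reports success. The sketch below shows only that control flow. The `inject` function and the three stand-in injectors are hypothetical and do not touch any macOS API.

```python
from typing import Callable, Optional, Sequence, Tuple

# An injector returns True if it delivered the text to the focused app.
Injector = Callable[[str], bool]


def inject(text: str, methods: Sequence[Tuple[str, Injector]]) -> Optional[str]:
    """Try each (name, injector) in order; return the name that succeeded."""
    for name, method in methods:
        try:
            if method(text):
                return name
        except Exception:
            # A failing method should not block the next fallback.
            continue
    return None


# Stand-ins that simulate an app where only the clipboard path works.
def accessibility_insert(text: str) -> bool:
    return False  # pretend the app does not expose its text field


def keyboard_simulation(text: str) -> bool:
    raise RuntimeError("simulated key events were rejected")


def clipboard_paste(text: str) -> bool:
    return True


METHODS = [
    ("accessibility", accessibility_insert),
    ("keyboard", keyboard_simulation),
    ("clipboard", clipboard_paste),
]
```

With these stand-ins, `inject("hello", METHODS)` falls through the first two methods and returns `"clipboard"`.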

Configuration

Open dictation settings from the Strayfiles macOS app: Settings > Dictation.

Hotkey

The default hotkey is Option+Space. You can change it to any key combined with one or more modifiers (Command, Option, Shift, Control).

Choose a combination that doesn’t conflict with your editor or system shortcuts. Common alternatives:

| Hotkey | Notes |
| --- | --- |
| Option+Space | Default. Doesn’t conflict with most editors. |
| Command+Shift+D | Easy to remember (D for dictation). |
| Control+Space | May conflict with Spotlight or input source switching. |
| F5 | Function key, unlikely to conflict. |

Model Size

Dictation downloads a speech recognition model on first use. Larger models are more accurate but use more RAM:

| Model | Download Size | RAM Usage | Best For |
| --- | --- | --- | --- |
| Tiny | 75 MB | ~1 GB | Quick notes, simple vocabulary |
| Base | 140 MB | ~1.5 GB | General use (default) |
| Small | 466 MB | ~2.5 GB | Technical vocabulary, accents |
| Medium | 1.5 GB | ~5 GB | Maximum accuracy |

Models are stored in ~/.strayfiles/models/whisper/ and can be deleted and re-downloaded from settings.
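If you are unsure which model fits your machine, the trade-off in the table can be expressed as a simple lookup. This is a rough sketch: `largest_model_within` is a hypothetical helper, not an app feature, and the RAM figures are the approximate values from the table.

```python
from typing import Optional

# (name, download size in MB, approximate RAM in GB), smallest first.
# Values are taken from the table above; 1.5 GB is written as 1500 MB.
MODELS = [
    ("tiny", 75, 1.0),
    ("base", 140, 1.5),
    ("small", 466, 2.5),
    ("medium", 1500, 5.0),
]


def largest_model_within(ram_budget_gb: float) -> Optional[str]:
    """Return the most accurate model whose RAM usage fits the budget."""
    best = None
    for name, _download_mb, ram_gb in MODELS:
        if ram_gb <= ram_budget_gb:
            best = name  # MODELS is ordered, so later matches are larger
    return best
```

A budget of 2 GB selects `"base"`, and a budget below 1 GB returns `None`.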

Formatting Options

| Setting | Default | Description |
| --- | --- | --- |
| Add punctuation | On | Inserts periods, commas, and question marks |
| Capitalize sentences | On | Capitalizes the first word after a period |
| Remove fillers | On | Strips “um”, “uh”, “like”, “you know” |
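To make the last two options concrete, here is a rough approximation of filler removal and sentence capitalization using regular expressions. It is not the app's actual pipeline: the function names are hypothetical, capitalizing after "?" and "!" is an added assumption, and a naive pattern like this also strips legitimate uses of "like". Punctuation insertion is left out because it depends on the speech model.

```python
import re

FILLERS = ("you know", "um", "uh", "like")  # list from the table above

# Match a filler as a whole word, plus an optional trailing comma and spaces.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    re.IGNORECASE,
)


def remove_fillers(text: str) -> str:
    """Strip filler words and collapse any doubled spaces left behind."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def capitalize_sentences(text: str) -> str:
    """Uppercase the first letter of the text and of each new sentence."""
    return re.sub(
        r"(^|[.?!]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


def format_transcript(text: str, strip_fillers: bool = True,
                      capitalize: bool = True) -> str:
    """Apply the enabled formatting steps in order."""
    if strip_fillers:
        text = remove_fillers(text)
    if capitalize:
        text = capitalize_sentences(text)
    return text
```

For example, `format_transcript("um use redis. uh keep it simple")` yields `"Use redis. Keep it simple"`.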

Per-App Rules

Override formatting settings for specific applications. For example, you might want punctuation in your notes app but not in your terminal:

| App | Punctuation | Capitalization | Remove Fillers |
| --- | --- | --- | --- |
| VS Code | On | On | On |
| Terminal | Off | Off | On |
| Slack | On | On | Off |

Rules are matched by bundle ID (e.g., com.microsoft.VSCode).
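Conceptually, a per-app rule is a lookup from bundle ID to a set of formatting flags, falling back to the global defaults. The sketch below mirrors the example table. The `Formatting` type and `formatting_for` helper are hypothetical, exact-match lookup is an assumption, and only the VS Code bundle ID comes from this guide (the Terminal and Slack IDs are the ones those apps are commonly known to use).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formatting:
    punctuation: bool = True
    capitalization: bool = True
    remove_fillers: bool = True


# Global defaults: all three options are on, as in Formatting Options.
DEFAULTS = Formatting()

# Overrides keyed by bundle ID, mirroring the example table.
RULES = {
    "com.microsoft.VSCode": Formatting(True, True, True),
    "com.apple.Terminal": Formatting(False, False, True),         # assumed ID
    "com.tinyspeck.slackmacgap": Formatting(True, True, False),   # assumed ID
}


def formatting_for(bundle_id: str) -> Formatting:
    """Return the rule for the focused app, or the defaults if none exists."""
    return RULES.get(bundle_id, DEFAULTS)
```

An app without a rule, such as a hypothetical `com.example.Notes`, gets `DEFAULTS`.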

Permissions

Dictation requires two macOS permissions:

Microphone Access: Required to capture audio. macOS will prompt you the first time you start recording. You can also grant it from System Settings > Privacy & Security > Microphone.

Accessibility Access: Required for the floating pill to position near your cursor and for text injection via the Accessibility API. Grant it from System Settings > Privacy & Security > Accessibility by adding the Strayfiles app to the allowed list.

The dictation settings panel shows the current status of both permissions, with buttons that open the relevant System Settings pane.

Completely Private

Most dictation tools send your audio to a cloud API. That means your voice — including anything you say about proprietary code, internal architecture, or sensitive project details — passes through someone else’s server.

Strayfiles dictation never does this. The speech recognition model runs on your Mac. Audio is processed in real-time and immediately discarded. Nothing is recorded. Nothing is uploaded. Nothing is logged.

  • On-device only — The model runs locally. No network requests during transcription.
  • No recording — Audio buffers are processed and discarded in real-time. Nothing is saved to disk.
  • No telemetry — Strayfiles doesn’t collect any data about your dictation usage, frequency, or content.
  • No account required — Dictation works without signing in. No cloud dependency at all.

This matters for developers working on proprietary codebases. You can dictate architecture decisions, API designs, security notes, and internal documentation without any of it leaving your machine.

Troubleshooting

“Microphone permission denied”

  • Open System Settings > Privacy & Security > Microphone
  • Make sure Strayfiles is toggled on
  • If it doesn’t appear in the list, try restarting the app

“Accessibility permission denied”

  • Open System Settings > Privacy & Security > Accessibility
  • Add Strayfiles to the list and toggle it on
  • You may need to unlock the settings pane with your password

“Hotkey not working”

  • Check for conflicts with other apps or system shortcuts
  • Try a different key combination
  • Make sure dictation is enabled in settings

“Text not appearing in the app”

  • Some apps restrict programmatic text input
  • The clipboard fallback should handle most cases
  • If text still doesn’t appear, try clicking in the text field first to ensure it has focus

“Transcription is inaccurate”

  • Try a larger model (Small or Medium) for better accuracy
  • Speak clearly and at a moderate pace
  • Reduce background noise
  • Check that the correct language is selected

“Pill appearing in wrong position”

  • Some apps don’t expose cursor position to the Accessibility API
  • The pill falls back to mouse position in these cases
  • This is expected behavior for apps like Terminal

Requirements

  • macOS only, version 14 (Sonoma) or later
  • Microphone permission
  • Accessibility permission
  • ~140 MB disk space for the default (Base) model
  • Not available on iOS — the iOS app shows a placeholder message