Voice Dictation

Speak Your Notes

You’re deep in a coding session. You need to jot down an architecture decision before you forget it. But switching to your notes app, finding the right file, and typing it out breaks your flow.

Voice dictation lets you speak directly into any text field on your Mac. Press a hotkey, talk, and your words appear wherever the cursor is — your editor, a note, a terminal, a chat window. No context switch. No copy-paste.

Dictation runs entirely on-device. Your voice never leaves your Mac — no cloud APIs, no server round-trips, no audio uploads. Everything stays local.

How It Works

You: *press Option+Space*

Floating pill appears near cursor: "Recording..."

You: "Add a caching layer between the API gateway
      and the database. Use Redis for session data
      and in-memory LRU for hot queries."

You: *release Option+Space*

Pill shows: "Processing..."

Text appears in your focused field, cursor moves to end.

Pill disappears.

The entire flow happens in-place. The floating pill follows your text cursor so you always know the state without looking away from your work.

Activation Modes

Two ways to control recording:

Push-to-Talk (Default): Hold the hotkey to record and release it to transcribe. Best for quick notes and short dictations: hold, speak, release.

Toggle: Press the hotkey once to start recording and press it again to stop. Better for longer dictations where holding a key is uncomfortable.

Toggle mode also supports a hybrid behavior: if you hold the key for more than 300ms, it acts like push-to-talk (releasing stops the recording). A quick tap keeps recording until you tap again.
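The hybrid behavior can be pictured as a small state machine. The following is a minimal sketch, not the app's actual implementation: the `ToggleHotkey` class and its method names are hypothetical, timestamps are plain seconds, and only the 300 ms threshold comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional

HOLD_THRESHOLD_S = 0.3  # the 300 ms threshold described above


@dataclass
class ToggleHotkey:
    """Hypothetical state tracker for Toggle mode with the hybrid hold behavior."""

    recording: bool = False
    pressed_at: Optional[float] = None  # set only while a "start" press is held

    def key_down(self, now: float) -> None:
        if self.recording:
            # Second press: stop immediately.
            self.recording = False
            self.pressed_at = None
        else:
            # First press: start recording and remember when the key went down.
            self.recording = True
            self.pressed_at = now

    def key_up(self, now: float) -> None:
        if self.pressed_at is None:
            return  # release of a "stop" press, nothing to do
        held = now - self.pressed_at
        self.pressed_at = None
        if held > HOLD_THRESHOLD_S:
            # Long hold: behave like push-to-talk and stop on release.
            self.recording = False
        # Quick tap: keep recording until the next press.
```

A quick tap (`key_down(0.0)` then `key_up(0.1)`) leaves `recording` set to `True`, while a longer hold (`key_up(0.5)`) stops the recording on release.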

Auto-Stop on Silence

When enabled, dictation automatically stops recording after 2 seconds of silence. This works in both modes but is most useful with Toggle — press the hotkey, speak, and walk away. Dictation finishes on its own.

If no speech is detected within 5 seconds of starting, the recording cancels automatically.
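The two timeouts can be sketched as follows. This is an illustrative model only: `SilenceMonitor` and `Outcome` are hypothetical names, and the `is_speech` flag is assumed to come from some voice-activity detector that the guide does not describe. The 2-second and 5-second values are the ones stated above.

```python
from enum import Enum
from typing import Optional

SILENCE_STOP_S = 2.0      # stop after this much silence following speech
NO_SPEECH_CANCEL_S = 5.0  # cancel if nothing was said within this window


class Outcome(Enum):
    RECORDING = "recording"
    STOPPED = "stopped"      # speech was captured; transcribe it
    CANCELLED = "cancelled"  # nothing was said; discard the recording


class SilenceMonitor:
    """Hypothetical helper that decides when a recording should end."""

    def __init__(self, start_time: float) -> None:
        self.start_time = start_time
        self.last_speech_time: Optional[float] = None

    def update(self, now: float, is_speech: bool) -> Outcome:
        if is_speech:
            self.last_speech_time = now
            return Outcome.RECORDING
        if self.last_speech_time is None:
            # No speech yet: apply the 5-second cancel window.
            if now - self.start_time >= NO_SPEECH_CANCEL_S:
                return Outcome.CANCELLED
            return Outcome.RECORDING
        # Speech was heard earlier: apply the 2-second silence window.
        if now - self.last_speech_time >= SILENCE_STOP_S:
            return Outcome.STOPPED
        return Outcome.RECORDING
```

Keeping the two windows separate means a long pause before the first word cancels the recording, while the same pause after speech finishes it normally.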

The Floating Pill

A small pill-shaped indicator appears near your text cursor during dictation:

State	Appearance	Meaning
Recording	Pulsing microphone icon	Capturing audio
Processing	Waveform icon	Transcribing your speech
Error	Warning icon	Something went wrong

The pill positions itself below your text cursor using the Accessibility API. On multi-monitor setups, it appears on the screen where your cursor is. If the cursor position can’t be determined (some apps don’t expose it), the pill falls back to your mouse position.
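The positioning logic above might look roughly like this. It is a sketch under several assumptions: the `Rect` type, the function names, and the 8-point gap are all invented for illustration, and coordinates are assumed to use a top-left origin with y growing downward.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def contains(self, point: Point) -> bool:
        px, py = point
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def pill_anchor(caret: Optional[Rect], mouse: Point, gap: float = 8.0) -> Point:
    """Return where the pill should go: below the caret, else at the mouse."""
    if caret is not None:
        # Place the pill just under the caret rectangle (gap is hypothetical).
        return (caret.x, caret.y + caret.height + gap)
    # The app did not expose a caret position: fall back to the mouse.
    return mouse


def screen_index_for(point: Point, screens: Sequence[Rect]) -> Optional[int]:
    """Return the index of the screen whose frame contains the point."""
    for index, frame in enumerate(screens):
        if frame.contains(point):
            return index
    return None
```

With two side-by-side 1920x1080 screens, a caret at x = 2000 resolves to the second screen, so the pill would be shown there.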

During recording, the pill shows a live preview of the partial transcription so you can see what’s being recognized in real time.

Text Injection

After transcription, the text is injected into whatever application has focus. Strayfiles tries three methods in order:

1. Accessibility API: Reads the current text field value, inserts the transcription at the cursor position, and moves the cursor forward. Handles emoji and multi-byte characters correctly. This is the preferred method.
2. Keyboard Simulation: Types the text character by character using simulated key events. Works in apps that don’t expose their text fields to the Accessibility API.
3. Clipboard Paste: Saves your clipboard, pastes the transcription, then restores your original clipboard contents. Universal fallback that works in secure text fields and protected areas.
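To see why emoji need care in the insert-at-cursor step, here is a small sketch. It assumes the cursor position is reported in UTF-16 code units, as is typical for string ranges on macOS; the guide does not confirm this, and `insert_at_utf16_offset` is a hypothetical helper, not a Strayfiles function.

```python
from typing import Tuple


def utf16_len(text: str) -> int:
    """Length of text in UTF-16 code units (an emoji usually counts as 2)."""
    return len(text.encode("utf-16-le")) // 2


def insert_at_utf16_offset(value: str, cursor: int, text: str) -> Tuple[str, int]:
    """Insert text at a UTF-16 cursor offset; return the new value and cursor.

    Raises ValueError if the offset is out of range, and UnicodeDecodeError
    if the offset would split a surrogate pair.
    """
    encoded = value.encode("utf-16-le")
    byte_offset = cursor * 2  # two bytes per UTF-16 code unit
    if not 0 <= byte_offset <= len(encoded):
        raise ValueError("cursor is outside the field value")
    before = encoded[:byte_offset].decode("utf-16-le")
    after = encoded[byte_offset:].decode("utf-16-le")
    # The cursor moves forward by the inserted length, also in UTF-16 units.
    return before + text + after, cursor + utf16_len(text)
```

For the value `"hi 👋 there"`, the position right after the emoji is offset 5 rather than 4, because the emoji occupies two code units. Indexing by characters instead would place the text one position too far right.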

The method is chosen automatically. You don’t need to configure anything.
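The automatic selection amounts to a fallback chain: try each method in order and stop at the first that succeeds. A minimal sketch, assuming each method is a callable that returns `True` on success; `inject_text` and the injector names are hypothetical, and the real injectors would call macOS APIs that are not shown here.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical signature: an injector takes the text and reports success.
Injector = Callable[[str], bool]


def inject_text(text: str, injectors: Sequence[Tuple[str, Injector]]) -> Optional[str]:
    """Try each injector in order; return the name of the one that worked."""
    for name, inject in injectors:
        try:
            if inject(text):
                return name
        except Exception:
            # Treat an error like a failed attempt and move on to the next method.
            continue
    return None  # every method failed


def accessibility_stub(text: str) -> bool:
    return False  # simulate an app that does not expose its text field


def keyboard_stub(text: str) -> bool:
    return True  # simulate a successful keyboard simulation


used = inject_text("hello", [
    ("accessibility", accessibility_stub),
    ("keyboard", keyboard_stub),
])
```

Here `used` ends up as `"keyboard"`, since the first stub reports failure and the second succeeds.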

Configuration

Open dictation settings from the Strayfiles macOS app: Settings > Dictation.

Hotkey

The default hotkey is Option+Space. You can change it to any key combined with one or more modifiers (Command, Option, Shift, Control).

Choose a combination that doesn’t conflict with your editor or system shortcuts. Common alternatives:

Hotkey	Notes
`Option+Space`	Default. Doesn’t conflict with most editors.
`Command+Shift+D`	Easy to remember (D for dictation).
`Control+Space`	May conflict with input source switching (the macOS default for this shortcut), or with Spotlight if you have remapped it.
`F5`	Function key, unlikely to conflict.
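Hotkey strings like the ones in the table can be normalized before comparing them against shortcuts you already use. This is a sketch only: `Hotkey` and `parse_hotkey` are hypothetical names, and the `Modifier+Key` text format is simply the notation this guide uses, not a documented configuration format.

```python
from dataclasses import dataclass
from typing import FrozenSet

MODIFIERS = {"command", "option", "shift", "control"}


@dataclass(frozen=True)
class Hotkey:
    modifiers: FrozenSet[str]
    key: str


def parse_hotkey(spec: str) -> Hotkey:
    """Parse a string such as 'Command+Shift+D' into a normalized Hotkey."""
    parts = [part.strip().lower() for part in spec.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"empty component in hotkey: {spec!r}")
    *mods, key = parts
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {unknown}")
    if key in MODIFIERS:
        raise ValueError("a hotkey needs a non-modifier key")
    # Modifiers are not enforced here because the table lists F5 on its own.
    return Hotkey(frozenset(mods), key)
```

Because modifiers are stored as a set, `parse_hotkey("Shift+Command+D")` and `parse_hotkey("Command+Shift+D")` compare equal, which makes conflict checks a plain equality test.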

Model Size

Dictation downloads a speech recognition model on first use. Larger models are more accurate but use more RAM:

Model	Download Size	RAM Usage	Best For
Tiny	75 MB	~1 GB	Quick notes, simple vocabulary
Base	140 MB	~1.5 GB	General use (default)
Small	466 MB	~2.5 GB	Technical vocabulary, accents
Medium	1.5 GB	~5 GB	Maximum accuracy

Models are stored in ~/.strayfiles/models/whisper/ and can be deleted and re-downloaded from settings.
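If you are unsure which model fits your machine, one rough approach is to pick the largest model whose RAM usage stays within a budget you choose. The sketch below encodes the table above; the `Model` type and `largest_model_within` function are hypothetical, the RAM figures are the approximate values from the table, and the Medium download is rounded to 1500 MB.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    download_mb: int
    ram_gb: float  # approximate, from the table above


MODELS = [
    Model("Tiny", 75, 1.0),
    Model("Base", 140, 1.5),
    Model("Small", 466, 2.5),
    Model("Medium", 1500, 5.0),
]


def largest_model_within(ram_budget_gb: float) -> Optional[Model]:
    """Return the most accurate model that fits the RAM budget, if any."""
    fitting = [m for m in MODELS if m.ram_gb <= ram_budget_gb]
    if not fitting:
        return None
    # Larger models are more accurate, so prefer the biggest one that fits.
    return max(fitting, key=lambda m: m.ram_gb)
```

A 2 GB budget selects Base, while anything below 1 GB returns `None`, meaning no model fits.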

Formatting Options

Setting	Default	Description
Add punctuation	On	Inserts periods, commas, and question marks
Capitalize sentences	On	Capitalizes the first word after a period
Remove fillers	On	Strips “um”, “uh”, “like”, “you know”
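Two of these options are simple text transformations, which the following sketch illustrates. It is a simplified approximation rather than the app's actual logic: the function names are hypothetical, and the naive filler pattern removes every standalone "like", including legitimate uses such as "I like Redis".

```python
import re

# Filler list taken from the table above; longer phrases come first.
FILLERS = ("you know", "um", "uh", "like")
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    re.IGNORECASE,
)


def remove_fillers(text: str) -> str:
    """Strip filler words plus any comma and whitespace that follows them."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def capitalize_sentences(text: str) -> str:
    """Uppercase the first letter of the text and of each new sentence."""
    return re.sub(
        r"(^|[.?!]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


def format_transcript(text: str, capitalize: bool = True,
                      strip_fillers: bool = True) -> str:
    """Apply the enabled options; fillers go first so capitalization sees the result."""
    if strip_fillers:
        text = remove_fillers(text)
    if capitalize:
        text = capitalize_sentences(text)
    return text
```

Removing fillers before capitalizing matters: otherwise a sentence that starts with "um" would have the filler capitalized and the real first word left in lowercase.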

Per-App Rules

Override formatting settings for specific applications. For example, you might want punctuation in your notes app but not in your terminal:

App	Punctuation	Capitalization	Remove Fillers
VS Code	On	On	On
Terminal	Off	Off	On
Slack	On	On	Off

Rules are matched by bundle ID (e.g., com.microsoft.VSCode).
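Conceptually, per-app rules are a lookup keyed by bundle ID that falls back to the global defaults. A minimal sketch of the example table follows; `FormatSettings` and `settings_for` are hypothetical names, and apart from `com.microsoft.VSCode`, the bundle IDs shown are the commonly used ones for Terminal and Slack rather than values taken from this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatSettings:
    punctuation: bool = True
    capitalization: bool = True
    remove_fillers: bool = True


# Global defaults from the Formatting Options table (all on).
DEFAULTS = FormatSettings()

# Overrides mirroring the example table above.
PER_APP_RULES = {
    "com.microsoft.VSCode": FormatSettings(True, True, True),
    "com.apple.Terminal": FormatSettings(False, False, True),          # assumed ID
    "com.tinyspeck.slackmacgap": FormatSettings(True, True, False),    # assumed ID
}


def settings_for(bundle_id: str) -> FormatSettings:
    """Return the override for an app, or the defaults if there is none."""
    return PER_APP_RULES.get(bundle_id, DEFAULTS)
```

An app without a rule, such as `settings_for("com.example.Unknown")`, simply gets the defaults.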

Permissions

Dictation requires two macOS permissions:

Microphone Access: Required to capture audio. macOS prompts you the first time you start recording. You can also grant it from System Settings > Privacy & Security > Microphone.

Accessibility Access: Required for the floating pill to position itself near your cursor and for text injection via the Accessibility API. Grant it from System Settings > Privacy & Security > Accessibility by adding the Strayfiles app to the allowed list.

Both permissions are checked in the dictation settings panel with clear status indicators and buttons to open the relevant System Settings pane.

Completely Private

Most dictation tools send your audio to a cloud API. That means your voice — including anything you say about proprietary code, internal architecture, or sensitive project details — passes through someone else’s server.

Strayfiles dictation never does this. The speech recognition model runs on your Mac. Audio is processed in real time and immediately discarded. Nothing is recorded. Nothing is uploaded. Nothing is logged.

On-device only: The model runs locally. No network requests during transcription.
No recording: Audio buffers are processed and discarded in real time. Nothing is saved to disk.
No telemetry: Strayfiles doesn’t collect any data about your dictation usage, frequency, or content.
No account required: Dictation works without signing in. No cloud dependency at all.

This matters for developers working on proprietary codebases. You can dictate architecture decisions, API designs, security notes, and internal documentation without any of it leaving your machine.

Troubleshooting

“Microphone permission denied”

Open System Settings > Privacy & Security > Microphone
Make sure Strayfiles is toggled on
If it doesn’t appear in the list, try restarting the app

“Accessibility permission denied”

Open System Settings > Privacy & Security > Accessibility
Add Strayfiles to the list and toggle it on
You may need to unlock the settings pane with your password

“Hotkey not working”

Check for conflicts with other apps or system shortcuts
Try a different key combination
Make sure dictation is enabled in settings

“Text not appearing in the app”

Some apps restrict programmatic text input
The clipboard fallback should handle most cases
If text still doesn’t appear, try clicking in the text field first to ensure it has focus

“Transcription is inaccurate”

Try a larger model (Small or Medium) for better accuracy
Speak clearly and at a moderate pace
Reduce background noise
Check that the correct language is selected

“Pill appearing in wrong position”

Some apps don’t expose cursor position to the Accessibility API
The pill falls back to mouse position in these cases
This is expected behavior for apps like Terminal

Requirements

macOS only (Sonoma 14+)
Microphone permission
Accessibility permission
~140 MB disk space for the default (Base) model
Not available on iOS