A terminal user interface for whisper.cpp dictation with one non-negotiable requirement: zero character drops. Every word you speak appears exactly as transcribed, at typing speed.

Problem

Voice dictation tools exist, but they’re either:

  • Cloud-dependent (privacy, latency)
  • GUI-only (doesn’t integrate with terminal workflows)
  • Unreliable (dropped characters, garbled output)

For hands-free coding and writing, you need local processing, terminal integration, and absolute reliability.

Architecture

Service-oriented async TUI with strict separation:

┌─────────────────────────────────────────────────────────┐
│                    Textual TUI Layer                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐     │
│  │ Output      │  │ Queue       │  │ Status Bar  │     │
│  │ History     │  │ Panel       │  │ + Levels    │     │
│  └─────────────┘  └─────────────┘  └─────────────┘     │
├─────────────────────────────────────────────────────────┤
│                    Service Layer                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐     │
│  │ Audio       │  │ Transcribe  │  │ Type        │     │
│  │ Capture     │  │ Queue       │  │ Simulator   │     │
│  │ (GStreamer) │  │             │  │ (xdotool)   │     │
│  └─────────────┘  └─────────────┘  └─────────────┘     │
├─────────────────────────────────────────────────────────┤
│                 Legacy Bridge Layer                      │
│        (Proven client_lite scripts preserved)           │
└─────────────────────────────────────────────────────────┘

Key Pattern: Integration via Preservation

The typing system uses battle-tested legacy scripts rather than reimplementing. New TUI wraps proven code.

Performance Requirements

MetricTargetRationale
Character drop rate0%Non-negotiable for trust
Typing speed15ms/charFeels natural, doesn’t trigger rate limits
Error recovery90%Failed transcriptions recoverable via re-record
Workflow interruption<5%Down from ~30% with basic client

Constitutional Development

The project enforces design principles programmatically:

constitution.md defines:

  • Non-blocking I/O everywhere
  • State-machine-first design
  • Explicit error boundaries
  • Performance thresholds

tests/constitutional/ validates every change against the constitution. Not guidelines—automated gates.

# Example constitutional test
def test_no_blocking_io_in_handlers():
    """All event handlers must be async"""
    for handler in get_all_handlers():
        assert inspect.iscoroutinefunction(handler)

Current Status

Why This Exists

Hands-free input isn’t a nice-to-have—it’s accessibility. But accessibility tools need to be rock-solid. This project proves you can have local, private, terminal-native dictation that actually works. The constitutional development pattern ensures it stays working.