Giving a machine fleet its own voices

Per-box voice notifications and a voice-fusion engine for a fleet of agents.

🌱 Seedling · Planted 2026-06-16 · Last tended 2026-06-16

Early notes on a system where each machine in an agent fleet speaks with its own voice — devbox sounds like an engineer, another box like a researcher — so notifications are distinguishable by ear without looking at a screen.

Two pieces interest me most so far:

  • Voice fusion: blending two voices by taking a weighted average of their x-vectors (speaker embeddings), so you can dial in a new voice anywhere between two existing ones.
  • A decoupled heads-up dashboard: the notification surface lives separately from the agent infrastructure, so the “who’s saying what” view doesn’t depend on any one box being up.
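
To make the voice-fusion idea concrete, here is a minimal sketch of blending two x-vectors by linear interpolation. It is not the engine itself: the function name `fuse_voices`, the `weight` parameter, and the toy 4-dimensional vectors are all assumptions made for illustration. Real x-vectors have far more dimensions and would come from a speaker-embedding model.

```python
def fuse_voices(xvec_a, xvec_b, weight=0.5):
    """Blend two speaker embeddings by linear interpolation.

    Hypothetical helper, not part of any library.
    weight=0.0 returns voice A, weight=1.0 returns voice B,
    and weight=0.5 is the plain average of the two.
    """
    if len(xvec_a) != len(xvec_b):
        raise ValueError("x-vectors must have the same dimension")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    # Interpolate each dimension independently.
    return [(1.0 - weight) * a + weight * b for a, b in zip(xvec_a, xvec_b)]


# Toy embeddings standing in for two existing voices (assumed values).
engineer = [1.0, 0.0, 2.0, -1.0]
researcher = [0.0, 1.0, 0.0, 1.0]

# Dial a new voice that sits a quarter of the way from engineer to researcher.
print(fuse_voices(engineer, researcher, weight=0.25))  # [0.75, 0.25, 1.5, -0.5]
```

One thing to watch: an interpolated vector is never longer than the weighted average of the two input lengths, and is usually shorter. If the downstream synthesizer is sensitive to embedding norm, the fused vector may need rescaling. Whether that matters here is still untested.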

This is rough and moving fast — capturing it here so the thinking is visible while it’s still forming.

Questions I haven’t answered

  • Does per-box voice actually reduce glance-at-screen rate, or is it novelty?
  • How do I keep audible notifications from becoming noise at fleet scale?