This document is a standalone, validated procedure for instrumenting Handy and producing a complete, cross-correlated trace of one or more transcription cycles — from process boot, through trigger, through the recording steady state, through VAD finalization and Whisper inference, through paste, back to idle.
It is written without assuming prior context. Every environment variable, flag, and structural decision in this document is one that has been validated empirically against Handy on Linux/WebKitGTK. Where a plausible alternative is wrong in a way that wastes time, the failure mode is spelled out inline so the reader can avoid it.
This document tells you how to capture and synthesize observations. It does not present observations about Handy's runtime behavior — those you will produce yourself by following the procedure.
- Definition of done
- Handy's runtime topology (what you are instrumenting)
- The four-tool methodology
- Pre-flight: environment and isolation
- Tool 1 — uftrace (Rust function tracing via mcount)
- Tool 2 — strace (syscall and IPC observation)
- Tool 3 — WebKit Remote Inspector (JS Timeline / ScriptProfiler / Heap)
- Tool 4 — source counters (compiled-in event counters)
- Building the instrumented binary
- The coverage audit (a priori static enumeration)
- The synchronized capture
- Post-hoc synthesis: producing the four deliverables
- Validation contracts (per tool and end-to-end)
- Common pitfalls and the failure modes they cause
- Gap audit: what this procedure does not capture
- Reference: flags, env vars, paths, commands
Before you start, agree on what a complete execution-flow map consists of. This procedure produces five things:
- A coverage audit (`coverage-map.csv`): a static, a-priori enumeration of every code path that could fire during the lifecycle, with each path mapped to the lifecycle phase in which it is expected and to the instrumentation tool that could observe it.
- A synchronized capture: one continuous, wall-clock-anchored run with all chosen tools recording simultaneously through a real lifecycle. The raw artifacts:
  - `uftrace.data/`: Rust function-call trace
  - `strace.log`: syscall + IPC trace
  - `webinspector-recording_overlay.json`: JS profile for the overlay webview
  - `webinspector-settings.json`: JS profile for the main webview
  - `handy.stdout.log`: application stdout, including any source-counter lines
  - `phase-timeline.log`: wall-clock timestamps for each lifecycle marker
- A synthesis of those artifacts into:
  - `execution-trace.csv`: every function / event / syscall observed, with phase assignment, tool source, and frequency
  - `call-graph.csv`: caller → callee edges from the function tracer
  - `coverage.csv`: for every row in the audit, whether it fired this cycle and (if not) why
  - `execution-flow.mmd`: a Mermaid sequence diagram across the architecture's lanes (user, Handy main, audio thread, overlay JS, external subprocess)
- A gap audit (`gap-audit.md`): explicit enumeration of what this composition does not capture (see section 15 for the template).
- A validation pass (`status-summary.md`): per-phase status with the evidence on which each phase's PASS rests.
If any of these is missing, the map is incomplete. Anything reported as "PASS" without an artifact whose size and content satisfy the validation contract in section 13 is a fraudulent PASS.
You are instrumenting a Tauri 2.x desktop application with a Rust backend
(src-tauri/) and a React/TypeScript frontend (src/). The runtime is
multi-process and multi-threaded.
PROCESS: handy (Rust binary)
┌─────────────────────────────────────────────────────────────────────────┐
│ main thread │
│ └─ tauri::Builder runs the event loop │
│ ├─ command handlers (specta_builder.invoke_handler) │
│ ├─ event emitters (app_handle.emit, window.emit_to) │
│ ├─ tray, autostart, single-instance, updater │
│ └─ custom URI scheme handlers │
│ │
│ audio thread (cpal's internal stream callback thread) │
│ └─ samples → mpsc::Sender<AudioChunk> │
│ │
│ audio consumer thread (run_consumer worker) │
│ ├─ FrameResampler │
│ ├─ AudioVisualiser (FFT → level buckets) │
│ │ └─ level callback fires here │
│ └─ Silero VAD │
│ │
│ input thread (rdev global listener) │
│ signal-handler thread (Unix: SIGUSR1/SIGUSR2/SIGTERM) │
│ │
│ ad-hoc threads: │
│ ├─ lazy-close watchdog │
│ ├─ model download │
│ ├─ inference (transcribe-rs, blocking) │
│ └─ idle watcher │
└─────────────────────────────────────────────────────────────────────────┘
│
Tauri IPC bridge (wry-managed)
• commands : invoke handler called from JS │
• events : Rust → JS via wry InnerWebView::eval │
• schemes : custom URI schemes registered in tauri::Builder │
│
▼
┌────────────────────────────┐ ┌────────────────────────────┐
│ PROCESS: WebKitWebProcess │ │ PROCESS: WebKitWebProcess │
│ window label "main" │ │ window label "recording_ │
│ │ │ overlay" │
│ src/main.tsx │ │ src/overlay/main.tsx │
└────────────────────────────┘ └────────────────────────────┘
│
PROCESS: WebKitNetworkProcess
(HTTP fetch isolation)
Key architectural facts you need to know to instrument correctly:
- Two windows = two webviews = (typically) two WebKitWebProcess subprocesses on Linux. The window labels are `"main"` and `"recording_overlay"`. Instrument both.
- The Tauri event path is wry `InnerWebView::eval`, which crosses to the WebKit subprocess via `webkit_web_view_run_javascript()`. On Linux this serializes a JS source string over an unnamed `SOCK_SEQPACKET` Unix domain socketpair. The WebKit child process is invoked with argv `[..., <eventfd_fd>, <ipc_socket_fd>]`; `argv[2]` is the IPC socket FD inside the child. You can cross-reference `/proc/<child>/fd/<argv[2]>` to a socket inode in `/proc/net/unix` (type `0005` = SEQPACKET) and find its peer inode (which is the UIProcess-side FD on the Handy main process).
- Trigger paths are not equivalent. A recording can be started by: global shortcut (rdev), CLI flag (`--toggle-transcription`, `--cancel`), or Unix signal (`SIGUSR2`). The CLI flag launches a second instance that initializes Whisper/Vulkan before tauri-plugin-single-instance hands off; under instrumentation overhead this can cascade into resource pressure that distorts the trace. Use `SIGUSR2` for synchronized captures. The signal handler in `signal_handle.rs` invokes the same in-process trigger the shortcut would have driven; no second instance is spawned.
- The single-instance plugin is bundle-ID-scoped, not display-scoped. A Handy you launch on Xvfb `:103` will be blocked silently by a Handy the user has running on `:0`. Confirm no user Handy is running before launching any instrumented binary.
- `cpal` on Linux routes through PipeWire/PulseAudio's ALSA emulation. Per-app input redirection via `PULSE_SOURCE` / `PIPEWIRE_NODE` env vars does NOT work for cpal; Handy will use the default source. Use `pactl move-source-output` on each stream after it appears.
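The `/proc` cross-reference described in the list above can be sketched in Python. This is a minimal sketch, not part of the validated procedure: the helper names `socket_inode_from_link` and `seqpacket_inodes` are hypothetical, and the column layout is the standard `/proc/net/unix` header (`Num RefCount Protocol Flags Type St Inode Path`). It only confirms that an FD's inode belongs to a SEQPACKET socket; the peer inode is easier to read from the `strace -yy` annotation.

```python
import re
from typing import Optional, Set

# The Type column in /proc/net/unix is a hex socket type; 0005 is SOCK_SEQPACKET.
SEQPACKET = "0005"

def socket_inode_from_link(link: str) -> Optional[int]:
    """Extract the inode from a /proc/<pid>/fd/<n> link target like 'socket:[12345]'."""
    m = re.fullmatch(r"socket:\[(\d+)\]", link)
    return int(m.group(1)) if m else None

def seqpacket_inodes(proc_net_unix: str) -> Set[int]:
    """Return the inodes of every SOCK_SEQPACKET socket in /proc/net/unix text."""
    inodes = set()
    for line in proc_net_unix.splitlines()[1:]:      # skip the header row
        fields = line.split()
        # Columns: Num RefCount Protocol Flags Type St Inode [Path]
        if len(fields) >= 7 and fields[4] == SEQPACKET:
            inodes.add(int(fields[6]))
    return inodes
```

On a live system, feed `os.readlink(f"/proc/{child_pid}/fd/{fd}")` to the first helper and the contents of `/proc/net/unix` to the second, then check membership.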
Handy's execution flow spans userspace function calls, syscalls (especially the IPC socketpair), the JS runtime inside the webview, and code paths the function tracer cannot see (closure bodies, FFI internals). No single tool covers the whole surface. Use all four:
| Tool | Sees | Cannot see |
|---|---|---|
| uftrace | Every Rust function entry/exit at the symbol level, with timestamps, thread IDs, and caller chain | Closure bodies without their own symbol; C/C++ dylib internals; subprocess argv |
| strace | Every syscall: execve (subprocess invocations), sendmsg/sendto/writev/write and recvmsg/recvfrom (IPC and I/O) | Pre-syscall in-process work; userspace-only function calls |
| WebKit Remote Inspector | JS Timeline events (EventDispatch, TimerFire, RenderingFrame), ScriptProfiler call samples, Heap.* events (GC, snapshots), console messages | Native code inside the WebKit subprocess; sub-sampler-resolution JS work |
| Source counters | Any path you statically add a counter to; useful where uftrace's mcount cannot reach because the symbol disappears under inlining or the path is a closure body | Anything you didn't add a counter for; carries source-edit risk (section 8) |
The execution-flow map is the union of what these four tools observe. Validate each independently, then synthesize.
These are non-negotiable. Skip any and your capture is contaminated.
The single-instance plugin will silently block your launch if another Handy is already running. Confirm that none is:
pgrep -x handy && echo "USER HANDY RUNNING -- quit it first"

In a controlled lab harness, also verify the exe path of any running handy to distinguish a user-installed instance (e.g., ~/.local/bin/handy, /usr/bin/handy, /opt/handy/handy, the AppImage) from a stale dev-tree process you can reap:

for pid in $(pgrep -x handy); do
    echo "$pid $(readlink /proc/$pid/exe 2>/dev/null)"
done

Run the instrumented binary on its own X display so window-manager events,
input focus, and unrelated GTK traffic do not bleed into the trace. Pick a
display number not in use by the user (:0 is the user's session;
echo $DISPLAY confirms theirs):
Xvfb :103 -screen 0 1920x1080x24 -ac +extension RANDR -nolisten tcp \
    >/tmp/xvfb.log 2>&1 &
XVFB_PID=$!
# Poll until the display answers.
until xdpyinfo -display :103 >/dev/null 2>&1; do sleep 0.2; done
export DISPLAY=:103

Tear down with `kill $XVFB_PID` at the end.
Driving the microphone path with a known audio file gives you reproducible timing and a way to assert that audio actually reached cpal. Do NOT modify the user's default PipeWire source (typically a real microphone). Instead, load a per-capture null sink and move Handy's source-output onto it after it appears.
# Load the null sink.
SINK=handycap_sink
MODID=$(pactl load-module module-null-sink "sink_name=$SINK" media.class=Audio/Sink)
pactl set-sink-volume "$SINK" 100%
pactl set-source-volume "${SINK}.monitor" 100%
# Confirm the user's default source is untouched:
pactl info | grep '^Default Source:'   # should be the user's actual mic

After triggering recording (section 11), find Handy's source-output and move it:
# Find Handy's PipeWire source-output id.
pactl list source-outputs | awk '
/^Source Output #/ { id=$3 }
/PipeWire ALSA \[handy\]/ { print id }
'
# Move it onto the null sink's monitor.
pactl move-source-output <id> "${SINK}.monitor"
# Play the test audio into the null sink. paplay reads PCM WAV directly.
paplay --device="$SINK" --volume=65536 /path/to/test-audio.wav &

The audio path is: paplay → null sink → `${SINK}.monitor` → Handy's source-output. This is the only Linux-validated method that reliably redirects cpal-ALSA-PipeWire input per app; env-var redirection does not work for this stack.

Tear the sink down in cleanup:

pactl unload-module "$MODID"

If multiple instrumentation runs may launch concurrently (e.g., in a CI or multi-agent harness), serialize on a flock so only one Handy launches at a time:
exec 9>/tmp/handy-instr-lock
flock -x 9
# ...launch Handy under instrumentation...
flock -u 9

The lock is also a hedge against orphaned WebKitWebProcess children from prior failed runs; reap them while you hold the lock.
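For a Python harness, the same serialization could look roughly like the following. It is a sketch under assumptions: `launch_lock` is a hypothetical helper name, and the lock path simply mirrors the shell example. It relies on `fcntl.flock`, which takes the same kind of lock as the `flock` CLI.

```python
import fcntl
import os
from contextlib import contextmanager

@contextmanager
def launch_lock(path="/tmp/handy-instr-lock"):
    """Hold an exclusive flock on `path` for the duration of the with-block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until any other holder releases
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Wrap the whole launch, capture, and reap sequence in `with launch_lock():` so that a second harness blocks until the first has finished cleaning up.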
Loaded null sinks survive process death; WirePlumber's stream-restore cache can latch the user's installed Handy onto a sink that no longer exists and silently break their voice-to-text the moment they next try to use it. Register an unconditional cleanup that unloads any sink your run created and reaps any wrapper processes:
trap '
pactl list short modules | grep "null-sink" | grep "$SINK" | cut -f1 |
xargs -r -n1 pactl unload-module
pgrep -f handycap-wrap | xargs -r kill -TERM
' EXIT

In a Python harness, register the same logic with `atexit` so it fires even on KeyboardInterrupt or AssertionError.
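A possible shape for that Python cleanup is sketched below. The function names are hypothetical, and the parsing assumes the tab-separated output of `pactl list short modules` (id, module name, arguments). Note that `atexit` handlers do not run if the interpreter is killed by a signal Python does not handle or exits via `os._exit()`, so a shell `trap` in the outer launcher is still worth keeping.

```python
import atexit
import subprocess

def module_ids_for_sink(modules_text: str, sink: str) -> list:
    """Pick the ids of null-sink modules named `sink` from `pactl list short modules`."""
    ids = []
    for line in modules_text.splitlines():
        cols = line.split("\t")
        if (len(cols) >= 3 and cols[1] == "module-null-sink"
                and f"sink_name={sink}" in cols[2].split()):
            ids.append(cols[0])
    return ids

def cleanup(sink: str, run=subprocess.run) -> None:
    """Unload every null sink this run created, then reap wrapper processes."""
    listing = run(["pactl", "list", "short", "modules"],
                  capture_output=True, text=True).stdout
    for mod_id in module_ids_for_sink(listing, sink):
        run(["pactl", "unload-module", mod_id])
    run(["pkill", "-TERM", "-f", "handycap-wrap"])

def register_cleanup(sink: str) -> None:
    """Arrange for cleanup(sink) to run at interpreter exit."""
    atexit.register(cleanup, sink)
```

The `run` parameter exists only so the logic can be exercised without touching the real audio server.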
strace uses ptrace to attach. On most distributions you can ptrace your own children with no extra permission. If /proc/sys/kernel/yama/ptrace_scope is 1, strace can only trace its own descendants, so attaching to a target that was not launched as a child of strace will fail. The procedure below launches strace as the outermost process and execs the target down the chain, so ptrace_scope=1 is sufficient and no sudo is required. (At ptrace_scope=2, tracing requires CAP_SYS_PTRACE even for children; at 3, ptrace attach is disabled entirely.)
for t in uftrace strace pactl paplay xdpyinfo c++filt Xvfb objdump nm flock python3; do
    command -v "$t" >/dev/null || echo "MISSING: $t"
done

uftrace must be >= 0.15 for the mcount handling and `dump --chrome` output used by the synthesis step.
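The version floor is easy to get wrong with string comparison ("0.9" sorts after "0.15"). A minimal sketch of a numeric check follows; it assumes the banner printed by `uftrace --version` contains a token of the form `vMAJOR.MINOR`, which is an assumption about the output format rather than something this procedure validated. Both function names are hypothetical.

```python
import re

def parse_uftrace_version(text: str):
    """Return (major, minor) from a version banner such as 'uftrace v0.15 ( ... )'."""
    m = re.search(r"v(\d+)\.(\d+)", text)
    if not m:
        raise ValueError(f"unrecognized version banner: {text!r}")
    return int(m.group(1)), int(m.group(2))

def uftrace_is_new_enough(text: str, minimum=(0, 15)) -> bool:
    """Compare as integer tuples so that 0.9 is correctly older than 0.15."""
    return parse_uftrace_version(text) >= minimum
```

In a harness, pass in the captured stdout of `uftrace --version` and abort the run if the check returns False.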
uftrace records every Rust function entry and exit at the symbol level. It is the only tool in the four that gives you per-function call counts and a real call graph. It is the primary tool for execution-flow mapping.
The target binary must be compiled with the GCC/Clang mcount
instrumentation hook — every function entry calls the mcount symbol,
which uftrace's libmcount.so (LD_PRELOAD'd into the process) intercepts
and records. For Rust this means a Cargo profile with
-Z instrument-mcount set in RUSTFLAGS.
Nightly toolchain required. As of rustc 1.95 (stable), instrument-mcount
is an unstable -Z flag and requires a nightly toolchain. The
historical -C instrument-mcount form does NOT exist on stable rustc —
the compiler rejects it. Use cargo +nightly for the instrumented build.
Add this to src-tauri/Cargo.toml (commit-quality, but tag the addition
as instrumentation-only):
[profile.release-debug]
inherits = "release"
debug = "full" # full DWARF for symbolication
strip = false # do not strip symbols
lto = "off" # uftrace cannot see inlined-then-stripped frames
codegen-units = 16
incremental = false

Build the binary with mcount enabled and into a non-default target dir so your normal cargo build is not contaminated:
cd src-tauri
CARGO_TARGET_DIR=target-uftrace \
RUSTFLAGS="-Z instrument-mcount" \
cargo +nightly build --profile release-debug --bin handy

The product is `src-tauri/target-uftrace/release-debug/handy`.
BIN=src-tauri/target-uftrace/release-debug/handy
# mcount symbol must be undefined (it's imported from libmcount.so at runtime).
# The actual symbol on glibc-linked Rust binaries is `U mcount@GLIBC_X.Y.Z`
# (e.g., `U mcount@GLIBC_2.2.5`), NOT a bare `U mcount` — the strict regex
# `^ *U mcount$` matches nothing on a real instrumented build. Anchor on
# either `@` or end-of-line:
nm -D --undefined-only "$BIN" | grep -E '^ *U mcount(@|$)'
# Belt-and-suspenders: require the binary to also contain tens of thousands
# of `call mcount@plt` instructions. A successful Handy build shows >50k.
# A near-zero count means the flag silently no-op'd (most commonly because
# the build did not use a nightly toolchain — see 5.1).
CALL_COUNT=$(objdump -d "$BIN" | grep -c 'call.*<mcount@plt>')
[ "$CALL_COUNT" -gt 1000 ] || { echo "suspicious mcount call count: $CALL_COUNT"; exit 1; }
# If you also want Web Inspector in this same binary (recommended), the
# `devtools` feature of the tauri crate must be enabled at build time.
# Verify with:
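# Counted form: `grep -q` in a pipeline can fail under pipefail via SIGPIPE.
DEVTOOLS_HITS=$(objdump -d "$BIN" | grep -c 'webkit_settings_set_enable_developer_extras@')
[ "$DEVTOOLS_HITS" -gt 0 ] && echo "devtools call site present"

To enable devtools, temporarily add "devtools" to the tauri feature list in src-tauri/Cargo.toml: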
tauri = { version = "2.10.2", features = [
"protocol-asset", "macos-private-api", "tray-icon", "image-png",
"devtools", # add for instrumentation builds only
] }

Restore the file when you are done so the change does not ride into a release build.
Tauri's resource-resolution code treats the binary as a "cargo
development build" when the exe path's component at index len-3 is
literally "target" — and only then does it resolve resources relative
to the binary location. The path
src-tauri/target-uftrace/release-debug/handy has target-uftrace at
len-3, which fails the check; tray init will crash because the icon
PNG cannot be resolved.
The fix is a wrapper directory that has target at len-3. Hardlink the
binary and symlink resources next to it:
WRAP=/tmp/handycap-wrap/target/release-debug
mkdir -p "$WRAP"
ln -f src-tauri/target-uftrace/release-debug/handy "$WRAP/handy"
ln -snf "$PWD/src-tauri/target-uftrace/release-debug/resources" "$WRAP/resources"
touch "$WRAP/.cargo-lock"
# Verify the path component count.
python3 -c "
from pathlib import Path
p = Path('$WRAP/handy')
parts = p.parts
assert parts[-3] == 'target', f'wrong shape: {parts!r}'
print('OK', p)
"

Launch the binary from this wrapper path, not the original.
Confirm the binary boots through resource resolution, database init, model preload, and global-shortcut registration before you wrap it in uftrace + strace, because composing all three layers can slow boot enough to mask a bad binary as "uftrace caused the timeout."
DISPLAY=:103 "$WRAP/handy" --start-hidden >/tmp/smoke.log 2>&1 &
SMOKE_PID=$!
# Wait up to 60 s for the boot-complete marker in stdout.
timeout 60 bash -c 'while ! grep -q "Shortcuts initialized" /tmp/smoke.log; do sleep 0.3; done'
echo $? # 0 = boot succeeded
kill -TERM $SMOKE_PID

"Shortcuts initialized" is the canonical boot-complete marker emitted by `shortcut/mod.rs` during startup. The instrumented binary must reach it for any synchronized capture to be meaningful.
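If the harness is written in Python, the same wait-for-marker step might be sketched as below. `wait_for_marker` is a hypothetical helper; the marker string and the 60 s / 0.3 s defaults are taken from the shell version above.

```python
import time

def wait_for_marker(log_path, marker="Shortcuts initialized", timeout=60.0, poll=0.3):
    """Poll `log_path` until `marker` appears; True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with open(log_path, "r", errors="replace") as f:
                if marker in f.read():
                    return True
        except FileNotFoundError:
            pass                      # the log may not have been created yet
        time.sleep(poll)
    return False
```

Re-reading the whole file on each poll is wasteful for large logs but keeps the sketch simple; boot logs are small enough that it should not matter here.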
uftrace is invoked as the inner process; strace is outermost. The launch line is built in section 11. The uftrace-specific args:
uftrace record \
-d /path/to/uftrace.data \
--no-libcall \
-- \
/tmp/handycap-wrap/target/release-debug/handy --start-hidden
- `-d`: output directory.
- `--no-libcall`: skip library function calls. Without this, uftrace records into libc, libgtk, libwebkit2gtk, libonnxruntime, libwhisper, etc., which inflates the trace by orders of magnitude and obscures Rust function behavior. The four-tool decomposition assigns dylib internals to a different (out-of-scope) instrumentation strategy (section 15).
- Do not use `-D` (depth limit) or `-F` (filter) for the initial map. Capture everything mcount sees; filter at synthesis.
After the lifecycle, generate two views of the data:
# Per-function aggregate (calls, total time, self time). Pipe through
# c++filt to demangle any C++/Rust mangled symbols.
uftrace report -d /path/to/uftrace.data --no-libcall | c++filt \
> uftrace-report.txt
# Chrome-trace-format dump for caller→callee edge extraction.
uftrace dump -d /path/to/uftrace.data --chrome | c++filt \
    > uftrace-dump-chrome.json.txt

The chrome dump is line-oriented JSON; each line is `{"ts":..., "ph":"B"|"E", "name":..., "pid":..., "tid":...}`. Build per-thread stacks from the B (begin) and E (end) events to reconstruct caller→callee edges. This is how `call-graph.csv` is produced (section 12).
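The stack-based edge reconstruction can be sketched as follows. This is an illustration, not the synthesis script itself: `call_edges` is a hypothetical name, the `<root>` placeholder for top-of-stack callers is a choice made here, and the code tolerates trailing commas, wrapper lines, and a missing `tid` field (falling back to `pid`) because the exact framing of the dump may vary between uftrace versions.

```python
import json
from collections import Counter, defaultdict

def call_edges(lines):
    """Count caller->callee edges from B/E events, keeping one stack per thread."""
    stacks = defaultdict(list)     # thread id -> stack of currently open functions
    edges = Counter()              # (caller, callee) -> number of calls
    for raw in lines:
        raw = raw.strip().rstrip(",")
        if not raw.startswith("{"):
            continue               # array brackets or other wrapper lines
        try:
            ev = json.loads(raw)
        except json.JSONDecodeError:
            continue               # e.g. an opening '{"traceEvents":[' line
        if "ph" not in ev or "name" not in ev:
            continue
        stack = stacks[ev.get("tid", ev.get("pid"))]
        if ev["ph"] == "B":
            caller = stack[-1] if stack else "<root>"
            edges[(caller, ev["name"])] += 1
            stack.append(ev["name"])
        elif ev["ph"] == "E" and stack:
            stack.pop()
    return edges
```

Writing `edges.items()` out as `caller,callee,count` rows gives one plausible layout for `call-graph.csv`.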
What uftrace does not see:

- The cpal stream callback closure (built inside `build_stream<T>` in `audio_toolkit/audio/recorder.rs`). The closure has no Rust-source symbol; mcount only sees the function it is constructed in.
- Level-callback closure bodies passed via `with_level_callback`. uftrace catches the parent function entry, not the closure body.
- Any function whose symbol disappears under inlining. Set `lto = "off"` (as above) to keep most boundaries visible, but `#[inline(always)]` and aggressive optimization can still hide short helpers. For those, fall back to source counters (section 8).
- Anything inside a dylib (libwebkit2gtk, libwhisper, libonnxruntime, libcuda, libasound, libgtk). Listed in the gap audit (section 15) with the tool that would reach each.
strace records syscalls. Its job in this composition is to expose three things uftrace cannot:
- The execve of every subprocess Handy spawns (the Linux paste path fans across multiple tools; clipboard helpers; system info queries).
- The IPC socket traffic between the Handy main process and each WebKitWebProcess subprocess (event names appear as literal JSON in the payload).
- The fork→exec chain of WebKit's UIProcess spawning new WebProcess / NetworkProcess children, so you can map a WebKit subprocess PID back to its IPC FD via the argv convention described in section 2.
The single biggest failure mode with strace is attaching to a Handy that
is already running. strace -f -p <handy-pid> only follows forks from
this moment forward; the long-lived WebKitWebProcess that was forked
during boot retains its pre-attach state and most of its IPC traffic is
invisible. Always launch under strace:
strace -f -yy -s 16384 \
-e trace=execve,writev,write,sendmsg,sendto,recvmsg,recvfrom \
-o /path/to/strace.log \
-- <binary-or-inner-tool> <args>

| Flag | Why |
|---|---|
| `-f` | Follow forks. Captures the WebKit subprocess(es) spawned during boot. Without this, you only see the main process. |
| `-yy` | Annotate FDs with file/peer info. `sendmsg(24<UNIX:[12345->67890]>` is how you confirm the IPC socketpair without manually cross-referencing `/proc/net/unix`. |
| `-s 16384` | Maximum string size per syscall. Tauri's IPC payloads (especially boot-time JS source shipped via `webkit_web_view_run_javascript` and event JSON like `{"event":"<name>","handler":<id>,...}`) can run several KB. The strace default of `-s 32` and even the commonly suggested `-s 200` truncate these silently. 16 KB is comfortably above any single Tauri payload observed. |
| `-e trace=execve,writev,write,sendmsg,sendto,recvmsg,recvfrom` | The narrow set of syscalls relevant for IPC + subprocess invocations. The full syscall set explodes the log size during the audio loop (cpal poll calls run at audio-host cadence). Filter at strace level for a manageable artifact. |
| `-o <file>` | Output destination. Do not rely on stderr; multi-process strace output interleaves and is hard to post-process if mixed with the target's own stderr. |

Do NOT filter by PID. -f gives you the entire process tree; PID-level
filtering at strace level loses cross-process events.
size=$(stat -c%s /path/to/strace.log)
echo "strace log: $size bytes"
[ "$size" -ge 50000 ] || echo "FAIL: too small"
# Count literal occurrences of known Tauri event names. The frontend's
# listen() registrations serialize the event name as a literal string
# in the IPC payload, so even a capture that did not include a recording
# cycle should surface several distinct event names from boot-time
# listener setup.
for ev in show-overlay hide-overlay model-state-changed loading_completed; do
c=$(grep -c -- "$ev" /path/to/strace.log)
echo " $ev: $c"
done

If the count of all event names sums to zero, the capture is broken, most commonly because strace attached too late or `-s` was set too low. Re-run.
After the capture, but while the process tree is still alive (the /proc lookups below need live PIDs), identify WebKit child PIDs and their IPC sockets:
# Filter strace.log to execve lines for WebKit children:
grep 'execve.*WebKitWebProcess' /path/to/strace.log | head -5
# For each WebKitWebProcess PID, argv[2] is the IPC socket FD inside the child.
# Find the socket inode and peer:
for pid in $(pgrep -f WebKitWebProcess); do
echo "=== $pid ==="
ls -l /proc/$pid/fd/ | grep socket
done

The pair (WebKit FD, peer FD on Handy main) is the SOCK_SEQPACKET pair that carries every evaluate_script call and every Tauri IPC event.
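Because `-yy` already annotates each FD with its socket inode and peer, the pair can also be recovered from `strace.log` alone. The sketch below assumes the line shape produced by `strace -f -yy -o` without timestamp flags (`<pid> <syscall>(<fd><UNIX:[local->peer]>, ...`); the optional `-STREAM`-style suffix in the pattern is a guess at variants and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. '4242 sendmsg(24<UNIX:[12345->67890]>, ...'.
LINE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<call>\w+)\((?P<fd>\d+)"
    r"<UNIX(?:-\w+)?:\[(?P<local>\d+)->(?P<peer>\d+)"
)

def socket_pair_traffic(lines):
    """Count syscalls per (pid, syscall, local inode, peer inode)."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if m:
            key = (int(m["pid"]), m["call"], int(m["local"]), int(m["peer"]))
            counts[key] += 1
    return counts
```

The busiest `(local, peer)` pair between the Handy main PID and a WebKitWebProcess PID is a strong candidate for the Tauri IPC channel; confirm it against the event-name grep above.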
The WebKit Remote Inspector gives you the JavaScript-side view: which events JS handled, which timers fired, which functions consumed sampler time, when GCs happened. It is the only tool in the four that observes the webview side.
There are TWO different inspector env vars, and they do different things:
- `WEBKIT_INSPECTOR_SERVER=host:port`: enables WebKit's internal `inspector://` scheme protocol. The listening socket performs a binary handshake; HTTP clients (including `curl /json`) receive "empty reply from server."
- `WEBKIT_INSPECTOR_HTTP_SERVER=host:port`: enables the HTTP/Chromium-style inspector. The listening socket serves an HTML page listing inspectable targets and accepts WebSocket connections on per-target paths.
Use WEBKIT_INSPECTOR_HTTP_SERVER. Anything else will silently fail
to connect.
export WEBKIT_INSPECTOR_HTTP_SERVER=127.0.0.1:9230

Reference: WebKit's own documentation, https://people.igalia.com/aperez/Documentation/wpe-webkit/remote-inspector.html.
Many tutorials assume the Chromium DevTools Protocol shape: GET /json
returns a JSON array of targets. WebKitGTK does not implement that
endpoint. GET / returns an HTML page listing targets in a <table>,
with WebSocket paths embedded in each row's onclick handler:
onclick="window.open('Main.html?ws=' + window.location.host +
'/socket/1/N/WebPage', ...)"
Parse the HTML for targets:
import re, urllib.request
html = urllib.request.urlopen("http://127.0.0.1:9230/").read().decode()
targets = re.findall(
r'<div class="targetname">([^<]+)</div>.*?(/socket/1/\d+/WebPage)',
html, re.DOTALL,
)
# targets is a list of (name, socket_path) tuples.
# Handy presents two targets during a normal run: the main settings
# webview and the recording_overlay webview.

Then open a WebSocket to each target at `ws://127.0.0.1:9230<socket_path>`.
<div class="targetname"> is the window TITLE, not the Tauri window LABEL.
This is a subtle gotcha that will silently break filter-substring matching.
Tauri's WebviewWindowBuilder::new(app, "recording_overlay", url).title("Recording")
produces a window whose internal label is recording_overlay and whose
inspector target name is the title-derived string (which may differ from
the label and may have framework-added suffixes — observed Recording Overlay
on WebKitGTK). Filter substrings on the title side, not the label side, and
normalize whitespace and case before comparing — e.g., lowercase both
sides and replace _ with spaces:
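def matches(target_name: str, want: str) -> bool:
    """Case-, underscore-, and whitespace-insensitive substring match on the title."""
    def norm(s: str) -> str:
        # Lowercase, treat "_" as a space, and collapse runs of whitespace.
        return " ".join(s.lower().replace("_", " ").split())
    return norm(want) in norm(target_name)

If you filter on the literal Tauri window label (`recording_overlay`) you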
will get zero matches and a quiet empty-records JSON output, which looks
exactly like a different failure mode (lazy target registration, WS
connection failure, etc.). Always normalize.
WebKitGTK's inspector WebSocket protocol wraps every command and response
in a Target.* envelope:
- To send `Inspector.enable` to a target, you must wrap it:

  {
    "id": 2,
    "method": "Target.sendMessageToTarget",
    "params": {
      "targetId": "<targetId discovered earlier>",
      "message": "{\"id\":1,\"method\":\"Inspector.enable\",\"params\":{}}"
    }
  }

- Responses and events from the target arrive wrapped as:

  {
    "method": "Target.dispatchMessageFromTarget",
    "params": {
      "targetId": "...",
      "message": "{\"method\":\"Timeline.eventRecorded\", ... }"
    }
  }

  Unwrap by parsing the inner `message` field as JSON.

To discover the targetId, listen for an initial Target.targetCreated
event after connecting and use the targetInfo.targetId from that event.
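The envelope handling reduces to two small helpers, sketched here with only the standard library. `wrap` and `unwrap` are hypothetical names, and the separate outer and inner id counters are a convenience chosen for this sketch; nothing in the text above requires them to be distinct.

```python
import json
from itertools import count

_outer_ids = count(1)   # ids for the Target.* envelope
_inner_ids = count(1)   # ids for the message delivered inside the target

def wrap(target_id: str, method: str, params=None) -> str:
    """Build a Target.sendMessageToTarget frame carrying one inner command."""
    inner = {"id": next(_inner_ids), "method": method, "params": params or {}}
    outer = {
        "id": next(_outer_ids),
        "method": "Target.sendMessageToTarget",
        "params": {"targetId": target_id, "message": json.dumps(inner)},
    }
    return json.dumps(outer)

def unwrap(frame: str):
    """Return (target_id, inner_message) for a dispatched frame, else None."""
    outer = json.loads(frame)
    if outer.get("method") != "Target.dispatchMessageFromTarget":
        return None
    params = outer["params"]
    return params["targetId"], json.loads(params["message"])
```

Frames for which `unwrap` returns None (such as `Target.targetCreated`) are outer-level events and should be handled separately rather than discarded.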
For each target's WebSocket, after the initial Target.targetCreated:
- `Target.setPauseOnStart` with `{"pauseOnStart": false}` (outer, not wrapped).
- Inside-target (wrapped via `Target.sendMessageToTarget`):
  - `Inspector.enable` `{}`
  - `Timeline.start` `{"maxCallStackDepth": 5}`
  - `ScriptProfiler.startTracking` `{"includeSamples": true}`
  - `Heap.enable` `{}`
- Drain all messages, accumulating into a list, until you decide to stop.
- Send the corresponding stop commands (wrapped):
  - `ScriptProfiler.stopTracking` `{}`
  - `Heap.disable` `{}`
  - `Timeline.stop` `{}`
- Drain a final batch of straggler messages, then close the WebSocket.

Persist the accumulated records to a JSON file per target with metadata including target name, socket path, and capture timestamp. A complete record file from a single recording cycle's overlay target typically runs to thousands of records (Timeline events dominate); the settings webview's record count is much lower (it is idle during a recording cycle if no UI interaction happens). Use this as a coarse sanity check on capture success.
The `devtools` feature of the tauri crate (`tauri = { features = ["devtools", ...] }`) must be enabled at build time for the Web Inspector to be accessible on a release-profile binary. In dev builds (debug_assertions true) it is automatic. The build in section 5.2 enables this; verify with the objdump check.
uftrace cannot see closure bodies or inlined helpers; strace cannot see userspace-only call boundaries; the Web Inspector sees only JS. For paths that fall through all three, the remaining option is to compile a counter into the source.
This is the most intrusive of the four tools. Use it sparingly. A
prior attempt to bulk-inject counter calls at 56 paths via anchor-based
regex edits produced 25 compile errors because anchors matched the middle
of multi-line function signatures and turned &self, parameters into
free arguments. Do not repeat that pattern.
Only when all of the following are true:
- The path is not visible to uftrace (closure body without a symbol, or inlined-away helper).
- It is not visible to strace (no syscall on the path).
- It is not visible to the Web Inspector (it is Rust-side, not JS).
- It fires at a rate that matters for the analysis you are doing.
Otherwise the path goes in the gap audit (section 15) without a counter.
Edit the source file by hand, with the surrounding function signature fully visible. Do not use regex sed/awk replacement across the codebase.
A minimal counter primitive (place in src-tauri/src/instr.rs, behind a
build feature so it is removable):
// Lightweight per-path counter. Increments are wait-free; periodic
// dumps go to stdout on a fixed cadence from a background thread.
#[cfg(feature = "instr-counters")]
pub mod counters {
    use std::sync::atomic::{AtomicU64, Ordering};

    macro_rules! counter {
        ($name:ident) => {
            pub static $name: AtomicU64 = AtomicU64::new(0);
        };
    }

    // Declare one counter per path of interest.
    counter!(LEVEL_CALLBACK_INVOCATIONS);
    counter!(SCHEME_HANDLER_HITS);
    // ...etc...

    pub fn bump(c: &'static AtomicU64) -> u64 {
        c.fetch_add(1, Ordering::Relaxed)
    }

    pub fn start_reporter() {
        std::thread::spawn(|| loop {
            std::thread::sleep(std::time::Duration::from_secs(1));
            let now = std::time::SystemTime::now()
                .duration_since(std::time::UNIX_EPOCH)
                .unwrap();
            // Emit one line per counter, prefixed with a fixed tag so
            // your harness can grep for it.
            println!(
                "[instr-counter] t={}.{:06} {} = {}",
                now.as_secs(),
                now.subsec_micros(),
                "LEVEL_CALLBACK_INVOCATIONS",
                LEVEL_CALLBACK_INVOCATIONS.load(Ordering::Relaxed)
            );
            // ...etc...
        });
    }
}

At the path you want to count (and only inside that closure / inlined helper):
#[cfg(feature = "instr-counters")]
crate::instr::counters::bump(&crate::instr::counters::LEVEL_CALLBACK_INVOCATIONS);

Build with `--features instr-counters` for instrumented runs; the feature gate keeps shipping builds clean.
A counter that emits a wall-clock-timestamped line per second can be
post-joined with the phase timeline (section 11.4) to produce per-phase
rates without any uftrace involvement. The harness greps for the
[instr-counter] tag in handy.stdout.log and emits a CSV row per
(phase, counter) pair.
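The post-join could be sketched as below. The counter-line pattern matches the `println!` format shown in the primitive above; the phase timeline representation (a time-sorted list of `(phase_name, start_timestamp)` pairs) is an assumption, since the layout of `phase-timeline.log` is defined in section 11.4 and not here. Function names are hypothetical.

```python
import re
from bisect import bisect_right

COUNTER_LINE = re.compile(r"\[instr-counter\] t=(\d+\.\d{6}) (\w+) = (\d+)")

def parse_counter_lines(stdout_text):
    """Yield (timestamp, name, cumulative_value) from handy.stdout.log text."""
    for m in COUNTER_LINE.finditer(stdout_text):
        yield float(m.group(1)), m.group(2), int(m.group(3))

def per_phase_deltas(samples, phases):
    """Attribute counter increments to phases.

    phases: time-sorted list of (name, start_ts). Each increment between two
    consecutive samples is credited to the phase containing the later sample.
    """
    starts = [ts for _, ts in phases]
    last_seen = {}                      # counter name -> previous cumulative value
    out = {}                            # (phase, counter) -> increments
    for ts, name, value in sorted(samples):
        idx = bisect_right(starts, ts) - 1
        delta = value - last_seen.get(name, 0)
        last_seen[name] = value
        if idx >= 0 and delta:
            key = (phases[idx][0], name)
            out[key] = out.get(key, 0) + delta
    return out
```

With a one-second reporter cadence, increments near a phase boundary can land in the neighbouring phase, so treat the per-phase numbers as accurate to roughly one reporting interval.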
A counter installed in the wrong file scope, missing the feature gate, or with a typo in the path name produces silently-zero readings. Before trusting a counter, verify it produces non-zero output during a smoke test that you know fires the path.
Assembling the build from sections 5.1, 5.2, and 8 into one procedure:
cd src-tauri
# 1) Record Cargo.toml's pre-edit sha256 first so you can detect drift later,
#    then add the [profile.release-debug] block (section 5.1) and temporarily
#    add "devtools" to the tauri feature list (section 5.2).
sha256sum Cargo.toml > /tmp/cargo-toml-baseline.sha
# 2) Build with mcount + (optional) instr-counters.
# NOTE: -Z instrument-mcount is unstable; requires `cargo +nightly`.
CARGO_TARGET_DIR=target-uftrace \
RUSTFLAGS="-Z instrument-mcount" \
cargo +nightly build \
--profile release-debug \
--bin handy \
--features instr-counters # omit if you are not using source counters
# 3) Verify the binary.
# NOTE on the mcount check: under `set -uo pipefail`, the bare
# `nm | grep -q ...` form is unsafe — once grep finds its match it
# closes stdin, nm receives SIGPIPE, and pipefail surfaces that as a
# pipeline failure (see section 14 pitfall 15). Use a counted form
# instead, which avoids the SIGPIPE entirely:
BIN=target-uftrace/release-debug/handy
test -f "$BIN" || { echo "build failed"; exit 1; }
test "$(stat -c%s "$BIN")" -gt $((100*1024*1024)) || { echo "binary suspiciously small"; exit 1; }
# The actual symbol is `U mcount@GLIBC_X.Y.Z`; match on @-or-end-of-line.
MCOUNT_HITS=$(nm -D --undefined-only "$BIN" | grep -cE '^ *U mcount(@|$)')
[ "$MCOUNT_HITS" -gt 0 ] || { echo "no mcount symbol — flag silently no-op'd? Check nightly toolchain"; exit 1; }
# Real instrumented Handy builds show tens of thousands of mcount call sites.
CALL_COUNT=$(objdump -d "$BIN" | grep -c 'call.*<mcount@plt>')
[ "$CALL_COUNT" -gt 1000 ] || { echo "suspicious mcount call count: $CALL_COUNT"; exit 1; }
DEVTOOLS_HITS=$(objdump -d "$BIN" | grep -c 'webkit_settings_set_enable_developer_extras@')
[ "$DEVTOOLS_HITS" -gt 0 ] || { echo "no devtools call site"; exit 1; }
# 4) Build the Tauri-compatible wrapper layout (section 5.3).
WRAP=/tmp/handycap-wrap/target/release-debug
mkdir -p "$WRAP"
ln -f "$PWD/$BIN" "$WRAP/handy"
ln -snf "$PWD/target-uftrace/release-debug/resources" "$WRAP/resources"
touch "$WRAP/.cargo-lock"
# 5) Restore Cargo.toml after the build (the devtools edit should not
# ride into a release build). Verify the restored sha256 matches the
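# baseline you saved in step 1 (run this from src-tauri, after restoring):
sha256sum -c /tmp/cargo-toml-baseline.sha || { echo "Cargo.toml differs from baseline"; exit 1; }
The wrapper at $WRAP/handy is the binary to launch under uftrace + strace.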
Before you instrument, walk the source and enumerate every code path you
expect to fire across the lifecycle. This produces coverage-map.csv,
the static baseline against which the synchronized capture's
coverage.csv is computed in synthesis.
A capture-only methodology suffers a fatal blind spot: paths that did not fire during a single cycle disappear from the trace, and you cannot tell the difference between "this path never fires in this configuration" (the alternative branch was taken; the feature is off) and "this path should have fired but didn't" (your trigger missed it; an upstream condition failed). The audit makes that distinction visible.
It also surfaces tool-coverage gaps before you capture: if a path is flagged "needs source counter" and you have not added one, you will know to either add the counter or accept the gap and document it in the gap audit.
Walk every Rust source file in src-tauri/src/. The set as of v0.8.3:
main.rs lib.rs overlay.rs
settings.rs utils.rs tray.rs
tray_i18n.rs input.rs clipboard.rs
actions.rs transcription_coordinator.rs audio_feedback.rs
signal_handle.rs cli.rs portable.rs
llm_client.rs apple_intelligence.rs
managers/audio.rs managers/model.rs managers/transcription.rs
managers/history.rs managers/mod.rs managers/transcription_mock.rs
audio_toolkit/mod.rs audio_toolkit/constants.rs
audio_toolkit/text.rs audio_toolkit/utils.rs
audio_toolkit/audio/device.rs audio_toolkit/audio/recorder.rs
audio_toolkit/audio/resampler.rs audio_toolkit/audio/utils.rs
audio_toolkit/audio/visualizer.rs audio_toolkit/audio/mod.rs
audio_toolkit/vad/silero.rs audio_toolkit/vad/smoothed.rs
audio_toolkit/vad/mod.rs
commands/audio.rs commands/transcription.rs
commands/history.rs commands/models.rs
commands/mod.rs
shortcut/mod.rs shortcut/handy_keys.rs
helpers/clamshell.rs
And every TS file in src/:
main.tsx App.tsx bindings.ts
overlay/main.tsx overlay/RecordingOverlay.tsx
stores/modelStore.ts stores/settingsStore.ts
hooks/useSettings.ts i18n/index.ts
components/... (every component used during a recording lifecycle)
Independent of file walking, find and enumerate every:
- `tokio::spawn`, `std::thread::spawn`, `tauri::async_runtime::spawn`, `spawn_blocking` (Rust thread/task creation)
- `.emit(`, `.emit_to(`, `.listen(` (Rust → JS / JS → Rust events)
- `register_uri_scheme_protocol` (custom schemes)
- `#[tauri::command]` (commands)
- `Command::new` (subprocess invocations)
- `invoke(`, `listen(`, `setInterval`, `setTimeout`, `requestAnimationFrame` (TypeScript)
Re-derive these with grep so the list reflects current code, not a stale snapshot:
grep -rn "#\[tauri::command\]" src-tauri/src --include="*.rs"
grep -rn "\.emit(\|\.emit_to(\|\.listen(" src-tauri/src --include="*.rs"
grep -rn "register_uri_scheme_protocol\|Command::new" src-tauri/src --include="*.rs"
grep -rn "tokio::spawn\|thread::spawn\|async_runtime::spawn\|spawn_blocking" \
src-tauri/src --include="*.rs"
grep -rn "invoke(\|listen(\|setInterval\|setTimeout\|requestAnimationFrame" \
src --include="*.ts" --include="*.tsx"
Every enumerated path is assigned to one or more lifecycle phases. The canonical phase set:
| Phase | Definition |
|---|---|
| boot | Called during run() setup, Tauri builder, plugin init, initialize_core_logic |
| idle_pre_record | After boot, before first trigger |
| trigger_start | Direct result of the start trigger (signal, shortcut, CLI flag) |
| recording_init | One-time per-cycle setup of audio (stream open, VAD load, visualizer construct) |
| recording_steady | Fires per audio frame / poll tick during active recording |
| trigger_stop | Direct result of the stop trigger |
| vad_finalize | Resampler drain, VAD finalization at recording end |
| whisper_init | Per-cycle model load (if not already loaded) |
| whisper_inference | The blocking transcription call |
| transcription_postprocess | Output filtering, custom-word substitution, history save, event emit |
| paste_invoke | Clipboard write + the platform-specific paste path |
| post_paste | Tray icon update, unload-watcher arming, overlay hide |
| idle_return | After the cycle completes; idle-watcher polls; settings UI commands fired by user during idle |
A function can fire in multiple phases (e.g., get_settings is called
from many places). Record all applicable phases per row.
For each row, indicate which tool(s) could observe it:
- uftrace if the function has a stable Rust symbol (most non-closure functions).
- strace if the function makes a syscall on the trace= list.
- webinspector if it is a JS path.
- source-counter if uftrace's symbol disappears (closure body, inlined helper). Mark the path explicitly; add or skip the counter per the policy in section 8.
coverage-map.csv columns:
path, # fully qualified function name or event/scheme/syscall id
file, # source file relative to repo root
line, # line number of the definition
phase, # one phase from the table; multiple rows if multi-phase
observation_tool, # uftrace | strace | webinspector | source-counter
notes # free-text; e.g. "closure body; uftrace sees parent only"
Save it at coverage-map.csv in your run directory. Hash it; the SHA256
will be part of the Phase 0 hard gate (section 11.1).
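The structural greps can also seed skeleton rows for coverage-map.csv. The following is a minimal sketch, not part of Handy: the names `scan_text`, `scan_tree`, and the category labels are hypothetical, and the `line` value is the line of the pattern match (for `#[tauri::command]` the function definition is on a following line). The `path`, `phase`, and `observation_tool` columns are deliberately left blank for hand review.

```python
import re
from pathlib import Path

# Hypothetical category labels for the structural patterns listed earlier.
RUST_PATTERNS = {
    "command": r"#\[tauri::command\]",
    "event": r"\.emit\(|\.emit_to\(|\.listen\(",
    "scheme-or-subprocess": r"register_uri_scheme_protocol|Command::new",
    "spawn": r"tokio::spawn|thread::spawn|async_runtime::spawn|spawn_blocking",
}
TS_PATTERNS = {
    "js-call": r"invoke\(|listen\(|setInterval|setTimeout|requestAnimationFrame",
}

def scan_text(text, file_label, patterns):
    """Return one skeleton coverage-map row per (line, pattern) match."""
    rows = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for category, pattern in patterns.items():
            if re.search(pattern, line):
                rows.append({
                    "path": "",              # fill in by hand
                    "file": file_label,
                    "line": lineno,
                    "phase": "",             # fill in by hand
                    "observation_tool": "",  # fill in by hand
                    "notes": f"{category}: {line.strip()}",
                })
    return rows

def scan_tree(root, suffixes, patterns):
    """Apply scan_text to every file under root whose suffix is in suffixes."""
    rows = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(errors="replace")
            rows.extend(scan_text(text, str(path), patterns))
    return rows
```

A harness might call `scan_tree("src-tauri/src", {".rs"}, RUST_PATTERNS)` and `scan_tree("src", {".ts", ".tsx"}, TS_PATTERNS)`, then write the combined rows with `csv.DictWriter`. The output is only a seed; the audit still requires reading each site.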
One Handy launch. All four tools recording simultaneously. Real audio. Real trigger. Real paste. Wall-clock anchors for every phase boundary.
Before launching anything, verify every input artifact exists at its expected path with its expected SHA256. This includes:
- The instrumented binary (sha256 of the build output).
- The wrapper-layout binary (same content, different path; the hardlink preserves the sha).
- The coverage map you just produced.
- The test audio file (a WAV of known content; PCM; a few seconds long with intelligible speech is enough to exercise VAD and Whisper).
Halt on any mismatch. A capture that proceeds past a failed hard gate is producing a measurement of a system you cannot identify.
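In a Python harness the same gate can be expressed as a function that returns a list of failures, so the caller can log every mismatch before halting. This is a sketch under assumptions: `sha256_of` and `hard_gate` are hypothetical helper names, and the mapping of artifact path to expected digest is whatever you recorded when producing each input.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

def hard_gate(expected):
    """expected maps artifact path -> expected hex sha256.

    Returns a list of human-readable failures; an empty list means
    every artifact exists and matches.
    """
    failures = []
    for path, want in expected.items():
        if not Path(path).is_file():
            failures.append(f"{path}: missing")
            continue
        got = sha256_of(path)
        if got != want:
            failures.append(f"{path}: sha256 mismatch (want {want[:12]}, got {got[:12]})")
    return failures
```

The harness should exit non-zero whenever the returned list is non-empty. The shell form of the same gate: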
expected_bin_sha=$(cat /path/to/expected-bin.sha)
actual_bin_sha=$(sha256sum /tmp/handycap-wrap/target/release-debug/handy | cut -d' ' -f1)
[ "$expected_bin_sha" = "$actual_bin_sha" ] || { echo "bin sha mismatch"; exit 1; }
expected_cov_sha=$(cat /path/to/expected-cov.sha)
actual_cov_sha=$(sha256sum coverage-map.csv | cut -d' ' -f1)
[ "$expected_cov_sha" = "$actual_cov_sha" ] || { echo "coverage map sha mismatch"; exit 1; }
# ...etc for every other input...
The composed launch line is:
strace -f -yy -s 16384 \
-e trace=execve,writev,write,sendmsg,sendto,recvmsg,recvfrom \
-o /path/to/strace.log \
uftrace record \
-d /path/to/uftrace.data \
--no-libcall \
-- \
/tmp/handycap-wrap/target/release-debug/handy --start-hidden
Why strace is outermost: if you reverse the nesting
(uftrace record -- strace -- handy), uftrace inspects its first
argv'd executable for mcount, sees strace (which has no mcount), and
fails the instrumentation check rather than following the exec chain
down to handy. With strace outermost, strace ptrace-attaches its
direct child (uftrace), follows that exec chain via -f, and ends up
tracing the syscalls of the final handy exec. uftrace, meanwhile,
inspects the handy it directly invokes, finds the mcount symbol, and
records via in-process LD_PRELOAD libmcount.so. The two layers do
not interact — uftrace does not use ptrace, and strace does not
intercept libmcount.
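If the launch is driven from a Python harness, building the argv as a list keeps the nesting order explicit and avoids shell quoting. A minimal sketch, assuming a hypothetical helper name `composed_argv`; every flag is copied from the composed launch line above.

```python
def composed_argv(run_dir, handy_bin):
    """Build the strace -> uftrace -> handy argv, strace outermost."""
    strace = [
        "strace", "-f", "-yy", "-s", "16384",
        "-e", "trace=execve,writev,write,sendmsg,sendto,recvmsg,recvfrom",
        "-o", f"{run_dir}/strace.log",
    ]
    uftrace = [
        "uftrace", "record",
        "-d", f"{run_dir}/uftrace.data",
        "--no-libcall",
        "--",
    ]
    # uftrace must directly invoke the mcount-instrumented binary.
    return strace + uftrace + [handy_bin, "--start-hidden"]
```

The resulting list can be handed to `subprocess.Popen` together with the environment described next.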
Environment for the launch:
export DISPLAY=:103
export WEBKIT_INSPECTOR_HTTP_SERVER=127.0.0.1:9230
# (Plus any others you would normally pass to Handy: RUST_LOG, etc.)
Launch:
nohup <composed-command> >/path/to/handy.stdout.log 2>&1 &
LAUNCH_PID=$!
Do not start the cycle before the binary has finished booting. The
canonical marker is "Shortcuts initialized" in stdout. Poll for it
with a generous timeout (under composed instrumentation, boot can take
30–60 s on a slow host):
timeout 180 bash -c '
while ! grep -q "Shortcuts initialized" /path/to/handy.stdout.log 2>/dev/null; do
sleep 0.5
done
'
phase_mark "shortcuts_initialized"
If this times out, the binary is broken (or your composition is
introducing more overhead than the boot can absorb). Inspect
handy.stdout.log for panics or unresolved permissions; do not
proceed.
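The same boot wait in a Python harness is sketched below. The function name `wait_for_marker` is hypothetical; the marker string and the 180-second ceiling come from the shell loop above. It returns False on timeout instead of raising, so the caller decides how to halt.

```python
import time

def wait_for_marker(log_path, marker, timeout_s=180.0, poll_s=0.5):
    """Poll log_path until a line containing marker appears.

    Returns True if the marker was seen before the deadline, else False.
    A missing log file is treated as "not yet", not as an error.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with open(log_path, "r", errors="replace") as f:
                if any(marker in line for line in f):
                    return True
        except FileNotFoundError:
            pass
        time.sleep(poll_s)
    return False
```

For example, `wait_for_marker("/path/to/handy.stdout.log", "Shortcuts initialized")` followed by a phase mark only when it returns True.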
Maintain a wall-clock-anchored log of phase boundaries. Every phase
boundary writes one line <unix_ts>\t<iso_local>\t<label> to
phase-timeline.log. The labels match the phases in section 10.4.
phase_mark() {
ts=$(date +%s.%6N)
iso=$(date -Iseconds)
printf "%s\t%s\t%s\n" "$ts" "$iso" "$1" >> /path/to/phase-timeline.log
}
phase_mark "phase2-start"
phase_mark "launch"
# ...after boot wait...
phase_mark "shortcuts_initialized"
In a Python harness, the equivalent:
import time

phase_log = open("/path/to/phase-timeline.log", "a")

def phase_mark(label):
    ts = time.time()
    iso = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(ts))
    phase_log.write(f"{ts:.6f}\t{iso}\t{label}\n")
    phase_log.flush()
After boot, the Handy process is several levels down the strace → uftrace
→ wrapper-binary exec chain. Identify it by matching /proc/<pid>/exe:
HANDY_PID=$(
for pid in $(pgrep -x handy); do
exe=$(readlink /proc/$pid/exe 2>/dev/null)
[ "$exe" = "/tmp/handycap-wrap/target/release-debug/handy" ] && echo $pid
done | head -1
)
echo "HANDY_PID=$HANDY_PID"
You will need this PID for the SIGUSR2 trigger and the PipeWire source-output move. The Web Inspector connection does not need it; that connects via the HTTP port, not the PID.
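A Python harness can do the same /proc walk without shelling out. This is a sketch: `find_pids_by_exe` is a hypothetical name, and the `proc_root` parameter exists only so the function can be exercised against a fake directory tree.

```python
import os

def find_pids_by_exe(exe_path, proc_root="/proc"):
    """Return sorted PIDs whose /proc/<pid>/exe link equals exe_path."""
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            target = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            continue  # process exited, or the link is not readable
        if target == exe_path:
            matches.append(int(entry))
    return matches if not matches else sorted(matches)
```

Expect exactly one match for the wrapper path; zero or several matches is a reason to stop and inspect the process tree rather than to pick one arbitrarily.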
In the harness (Python is convenient here for the WebSocket client):
- GET http://127.0.0.1:9230/, parse the HTML for targets (section 7.2).
- For each target, connect a WebSocket, wait for Target.targetCreated, extract targetId, and start Timeline + ScriptProfiler + Heap (section 7.4).
- Spin a background thread per session that accumulates records until the orchestrator signals stop.
Already covered in section 4.3. Two operational notes specific to the synchronized capture:
- Start paplay BEFORE the SIGUSR2 trigger so a buffer exists at the moment recording begins.
- The source-output move must happen AFTER recording starts (the source-output does not exist until Handy opens its input stream). Poll for it with a 10-second deadline.
phase_mark "paplay_start"
paplay --device="$SINK" --volume=65536 /path/to/test.wav \
>/path/to/paplay.log 2>&1 &
sleep 2
phase_mark "sigusr2_start"
kill -USR2 $HANDY_PID
# Move Handy's source-output to the null-sink monitor (poll for it).
for i in {1..50}; do
SO_ID=$(pactl list source-outputs |
awk '/^Source Output #/{id=$3} /PipeWire ALSA \[handy\]/{print id; exit}')
[ -n "$SO_ID" ] && break
sleep 0.2
done
[ -n "$SO_ID" ] && pactl move-source-output "$SO_ID" "${SINK}.monitor"
phase_mark "recording_steady_window_begin"
sleep 10 # the recording-steady window
phase_mark "recording_steady_window_end"
# Capture a baseline count of paste-method log lines BEFORE sigusr2_stop.
# A naive "count > 0" check breaks for multi-cycle runs: cycle 1's marker
# remains in stdout, so cycles 2 and 3 see count > 0 immediately and skip
# the actual wait. Always track per-cycle increments by comparing the
# post-stop count to the pre-stop baseline.
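# NOTE: `grep -c` already prints 0 (and exits 1) when nothing matches, so
# `|| echo 0` would produce "0" twice. Use `|| true` plus a default instead.
BASELINE_HITS=$(grep -aciE 'paste method' /path/to/handy.stdout.log 2>/dev/null || true)
BASELINE_HITS=${BASELINE_HITS:-0}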
phase_mark "sigusr2_stop"
kill -USR2 $HANDY_PID
# Wait for THIS cycle's transcription completion. The wait condition is
# that the paste-method line count has INCREASED past the baseline — not
# that it is simply nonzero.
#
# Avoid the `grep -q | pipefail` form (see section 14 pitfall 15); use a
# count-and-compare form which is SIGPIPE-safe.
timeout 60 bash -c '
  while :; do
    N=$(grep -aciE "paste method" /path/to/handy.stdout.log 2>/dev/null || true)
    [ "${N:-0}" -gt "'$BASELINE_HITS'" ] && break
    sleep 1
  done
'
phase_mark "post_transcribe_paste"
# Trailing slack: Whisper inference + paste pipeline can run ~17 s on
# desktop CPUs/GPUs after the LAST cycle's sigusr2_stop. The historical
# `sleep 5` here is too tight for the final cycle. Use ≥ 20 s to ensure
# the last paste line is flushed to stdout before phase2-end.
sleep 20 # capture any straggler events including final paste
phase_mark "idle_end"
SIGUSR2 is a deliberate choice. The Unix signal handler invokes the
in-process trigger function directly; no second Handy instance is
launched. The CLI --toggle-transcription would launch a second process
that initializes Whisper/Vulkan before handing off via single-instance —
under instrumentation overhead, this can cascade into resource pressure
that distorts the trace.
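The baseline-then-increase wait used in the trigger sequence can be written in a Python harness as follows. This is a sketch with hypothetical names (`count_markers`, `wait_for_new_marker`); the "paste method" pattern and the case-insensitive, per-line counting mirror the `grep -aciE` form above.

```python
import re
import time

# Case-insensitive, bytes-mode match so binary noise in the log is harmless.
PASTE_RE = re.compile(rb"paste method", re.IGNORECASE)

def count_markers(log_path):
    """Count lines containing the paste marker; a missing file counts as 0."""
    try:
        with open(log_path, "rb") as f:
            return sum(1 for line in f if PASTE_RE.search(line))
    except FileNotFoundError:
        return 0

def wait_for_new_marker(log_path, baseline, timeout_s=60.0, poll_s=1.0):
    """Return True once the marker count exceeds baseline, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if count_markers(log_path) > baseline:
            return True
        time.sleep(poll_s)
    return False
```

Take the baseline with `count_markers` immediately before sending the stop signal, then call `wait_for_new_marker` with that value. Each cycle gets its own baseline, which is what keeps cycles 2 and later from returning immediately.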
# Stop paplay.
pkill -f "paplay.*$SINK" 2>/dev/null
# Signal stop to the Web Inspector sessions; harness joins them and writes
# per-target JSON to disk.
# SIGTERM the launch process tree to let uftrace flush its buffers.
kill -TERM $LAUNCH_PID
wait $LAUNCH_PID 2>/dev/null
# Reap any remaining wrapper-binary children.
pgrep -f handycap-wrap | xargs -r kill -TERM
sleep 2
pgrep -f handycap-wrap | xargs -r kill -KILL
# Unload the null sink.
pactl list short modules | awk -v sink="$SINK" '$0 ~ sink {print $1}' \
| xargs -r -n1 pactl unload-module
# Stop Xvfb.
kill -TERM $XVFB_PID
phase_mark "phase2-end"
Now convert the raw artifacts into the four CSV / Mermaid deliverables.
import csv

phase_anchors = {}
with open("phase-timeline.log") as f:
    for line in f:
        ts, _iso, label = line.rstrip("\n").split("\t")
        if label not in phase_anchors:  # first occurrence wins
            phase_anchors[label] = float(ts)

# Prefer "launch" as t0; fall back to "phase2-start" only if it is absent.
t0 = phase_anchors["launch"] if "launch" in phase_anchors else phase_anchors["phase2-start"]
phase_offsets = {l: ts - t0 for l, ts in phase_anchors.items()}
phase_offsets is the map from label → seconds-since-launch you will
use to assign observations to phases.
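One way to use such a label-to-offset map is a sorted lookup: an observation belongs to the most recent marker at or before its own offset. The sketch below assumes that interpretation; `build_phase_index` and `phase_at` are hypothetical names, not part of any tool used here.

```python
import bisect

def build_phase_index(phase_offsets):
    """Turn {label: offset_s} into parallel lists sorted by offset."""
    ordered = sorted(phase_offsets.items(), key=lambda kv: kv[1])
    starts = [offset for _label, offset in ordered]
    labels = [label for label, _offset in ordered]
    return starts, labels

def phase_at(offset_s, starts, labels):
    """Return the label of the latest marker at or before offset_s.

    Returns None for observations that precede the first marker.
    """
    i = bisect.bisect_right(starts, offset_s) - 1
    return labels[i] if i >= 0 else None
```

Build the index once per run and call `phase_at` per observation. Observations must first be converted to seconds since the same t0 as the offsets.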
# Parse uftrace report (one line per function, columns: total_v total_u
# self_v self_u calls function).
import re

# Unit suffix → seconds multiplier.
UNIT_S = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1, "m": 60}

def to_s(v, u):
    # Convert (value, unit) → seconds; unknown units map to 0.
    return float(v) * UNIT_S.get(u, 0)

uftrace_rows = []
with open("uftrace-report.txt") as f:
    for ln in f:
        m = re.match(r"\s*([\d.]+)\s*(\w+)\s+([\d.]+)\s*(\w+)\s+(\d+)\s+(.*)$", ln)
        if not m:
            continue
        total_v, total_u, self_v, self_u, calls, fn = m.groups()
        uftrace_rows.append({
            "tool": "uftrace",
            "name": fn.strip(),
            "calls": int(calls),
            "total_s": to_s(total_v, total_u),
            "self_s": to_s(self_v, self_u),
        })
For each row, assign a phase using the chrome-dump timestamps (next
section) or — coarser — by joining on the function's expected phase
from coverage-map.csv. Add columns: phase, frequency_hz
(calls / phase_duration_s).
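The coarser join can be sketched as below. It is illustrative only: `add_frequency` is a hypothetical name, `phase_of` stands for a name-to-phase lookup derived from coverage-map.csv, and `phase_bounds` for per-phase (start, end) offsets in seconds. Because the report's call counts are whole-run totals, this join attributes all of a function's calls to its single expected phase.

```python
def add_frequency(rows, phase_of, phase_bounds):
    """Annotate rows with 'phase' and 'frequency_hz'.

    rows:         dicts with at least 'name' and 'calls'
    phase_of:     function name -> expected phase label
    phase_bounds: phase label -> (start_s, end_s)
    """
    annotated = []
    for row in rows:
        phase = phase_of.get(row["name"])
        bounds = phase_bounds.get(phase)
        duration = (bounds[1] - bounds[0]) if bounds else 0.0
        # Leave the frequency empty rather than dividing by zero.
        freq = row["calls"] / duration if duration > 0 else None
        annotated.append({**row, "phase": phase, "frequency_hz": freq})
    return annotated
```

Rows with an unknown phase or a zero-length phase keep `frequency_hz` as None, which makes them easy to find and resolve by hand.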
import csv
import json
from collections import Counter, defaultdict

edges = Counter()
per_thread_stack = defaultdict(list)
with open("uftrace-dump-chrome.json.txt") as f:
    for line in f:
        line = line.strip().rstrip(",")
        if not line.startswith("{"):
            continue
        try:
            ev = json.loads(line)
        except json.JSONDecodeError:
            continue
        ph, name, tid = ev.get("ph"), ev.get("name", ""), ev.get("tid")
        if ph == "B":
            # Function entry: the current top of this thread's stack is the caller.
            stack = per_thread_stack[tid]
            if stack:
                edges[(stack[-1], name)] += 1
            stack.append(name)
        elif ph == "E":
            # Function exit: pop the matching frame if one exists.
            stack = per_thread_stack.get(tid, [])
            if stack:
                stack.pop()

with open("call-graph.csv", "w", newline="") as out:
    w = csv.writer(out)
    w.writerow(["caller", "callee", "count"])
    for (caller, callee), c in edges.most_common():
        w.writerow([caller, callee, c])
# strace: count syscall occurrences per phase using PID-level filtering
# (the WebKit subprocesses appear as distinct PIDs).
# Web Inspector: from each target's records list, count by record["method"]
# (Timeline.eventRecorded, ScriptProfiler.events, Heap.garbageCollected, etc.)
# and aggregate by inner record["params"]["record"]["type"] for the
# Timeline events.
# Append rows to execution-trace.csv with tool="strace" or tool="webinspector"
# and an appropriate name (e.g., "syscall:sendmsg", "Timeline.eventRecorded:EventDispatch").
# Read coverage-map.csv. For each row, determine whether it fired:
# - uftrace rows: look up function name in execution-trace.csv
# - strace rows: look up syscall / event name occurrences in strace.log
# - webinspector rows: look up the event in the JSON record list
# - source-counter rows: look up the counter name in handy.stdout.log
#
# Output coverage.csv with: path, file, line, phase, observation_tool,
# fired (bool), fire_count, reason (if not fired):
# - "did not fire in this cycle (alternative branch / feature off)"
# - "needs source-counter; not instrumented in this run"
# - "expected only in <other phase>; phase not covered"
# - "tool cannot observe (FFI internal)"
The Mermaid sequence diagram is the most opinionated of the four deliverables; it is the one a human will read first. Use this template and fill it from the empirical observations:
sequenceDiagram
autonumber
participant U as User
participant H as Handy main<br/>(Rust)
participant A as Audio thread<br/>(cpal)
participant J as Overlay JS<br/>(WebKit subprocess)
participant X as External<br/>subprocess
Note over U,X: BOOT (<duration>s)
U->>H: launch
H->>H: <observed boot work>
H->>J: spawn WebKitWebProcess + load webviews
J-->>H: listen(...) registrations via Tauri IPC
Note over U,X: TRIGGER START (<t>s)
U->>H: SIGUSR2 (start)
H->>A: <observed audio-start work>
H->>J: emit("show-overlay") via wry InnerWebView::eval
Note over A,J: RECORDING STEADY STATE (<window>s)
rect rgb(245, 245, 235)
loop <observed Hz> audio callback
A->>A: <observed audio work>
A->>H: <observed level path>
end
loop <observed Hz> JS work
J->>J: <observed JS work>
J->>H: <observed JS→Rust call, if any>
H-->>J: <observed Rust→JS response, if any>
end
end
Note over U,X: TRIGGER STOP (<t>s)
U->>H: SIGUSR2 (stop)
H->>A: <observed audio-stop work>
H->>J: emit("hide-overlay") via wry InnerWebView::eval
Note over H,X: TRANSCRIBE + PASTE (<duration>s)
H->>H: <observed VAD finalize + inference work>
H->>H: clipboard write
H->>X: spawn <observed paste tool>
X-->>H: subprocess exit
Note over H,X: IDLE RETURN
H->>H: idle-watcher polls
Fill the angle-bracketed placeholders from your execution-trace.csv
and phase-timeline.log. Do not invent observations to fit the diagram.
Each artifact has a quantitative validation contract. A run that does not satisfy them is not a valid capture, regardless of how complete the synthesis CSVs look.
| Artifact | Validation |
|---|---|
| uftrace.data/ | Size >= 30 MB after a 10-second recording cycle; uftrace report produces a non-empty table; >= 100 distinct Rust symbols appear |
| strace.log | Size >= 50 KB; at least one literal occurrence of every known event name registered by the frontend's listen() calls (a per-listener registration line appears in the boot IPC traffic); sendmsg calls annotated with UNIX:[...] confirm the SOCK_SEQPACKET socketpair |
| webinspector-recording_overlay.json | >= 100 records; at least one Timeline.eventRecorded record; at least one Heap.garbageCollected or equivalent confirms the Heap domain accepted commands |
| webinspector-settings.json | >= 50 records; the smaller threshold reflects an idle main webview during a recording cycle (no UI interaction triggered) |
| handy.stdout.log | Contains "Shortcuts initialized" (boot complete); contains transcription-complete and paste-complete markers; if source counters were enabled, contains [instr-counter] lines for each declared counter |
| phase-timeline.log | One row per phase marker; monotonically increasing timestamps; recording_steady_window_end - recording_steady_window_begin matches the intended recording duration within ±0.5 s |
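The phase-timeline.log contract is simple enough to check mechanically. The sketch below is one possible checker, assuming the three-field tab-separated line format described for phase-timeline.log; the function name `validate_phase_timeline` is hypothetical. It returns a list of problems so the status summary can quote the measured failure.

```python
def validate_phase_timeline(lines, intended_s, tolerance_s=0.5):
    """Check ordering and the recording_steady window duration.

    lines: iterable of "<unix_ts>\\t<iso_local>\\t<label>" strings.
    Returns a list of problem descriptions; empty means the contract holds.
    """
    problems = []
    first_seen = {}
    prev_ts = None
    for n, line in enumerate(lines, 1):
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            problems.append(f"line {n}: expected 3 tab-separated fields")
            continue
        try:
            ts = float(parts[0])
        except ValueError:
            problems.append(f"line {n}: timestamp is not a number")
            continue
        if prev_ts is not None and ts < prev_ts:
            problems.append(f"line {n}: timestamp decreases")
        prev_ts = ts
        first_seen.setdefault(parts[2], ts)  # first occurrence wins

    begin = first_seen.get("recording_steady_window_begin")
    end = first_seen.get("recording_steady_window_end")
    if begin is None or end is None:
        problems.append("missing recording_steady window markers")
    elif abs((end - begin) - intended_s) > tolerance_s:
        problems.append(
            f"recording_steady window is {end - begin:.3f} s, intended {intended_s} s"
        )
    return problems
```

Pass the open file object as `lines` and the `sleep` duration of the recording-steady window as `intended_s`.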
# A capture is end-to-end-valid iff:
# 1. Every artifact above passes its individual validation.
# 2. The cycle includes a transcription. handy.stdout.log contains
# evidence of the transcription text being written.
# 3. Coverage of paths flagged "expected in recording_steady" is
# >= 80%. (If you tightened the audit, raise this threshold.
# If your audit is loose, lower it and document why.)
# 4. There is non-zero IPC traffic in the strace log during the
# recording_steady window (filter by timestamp).
# 5. There is non-zero JS Timeline activity in the overlay webview
# during the recording_steady window.
A phase is PASS only when:
- The phase reached its terminal action without an exception.
- The artifact it produces satisfies the validation contract above.
- The reason string explains WHY it passed in terms of measured properties of the artifact, not in terms of "no errors were encountered."
A run that records "PASS" without a measured property is not really PASS; mark it FAIL and reproduce.
These are time-saving warnings. Each is a real way the procedure can silently produce useless data.
1. `WEBKIT_INSPECTOR_SERVER` instead of `WEBKIT_INSPECTOR_HTTP_SERVER`. The first produces a binary-handshake socket; nothing HTTP-shaped can connect to it. You can spend an hour debugging "empty reply from server" before realizing.
2. `--toggle-transcription` instead of `SIGUSR2`. The CLI flag launches a second instance that performs heavy initialization before handing off via single-instance. Under instrumentation overhead this can cause an OOM cascade that kills your capture mid-cycle.
3. strace `-s` set too low. The default truncates Tauri event payloads. You will see syscalls but not the event names that make them recognizable.
4. strace attached after boot. `strace -p` follows forks only from the moment of attach; WebKit subprocesses that already exist retain their pre-attach state. Always launch the target under strace.
5. uftrace as outermost rather than innermost. Reverses the exec chain inspection; uftrace inspects strace (no mcount) instead of handy (has mcount) and fails.
6. `lto = true` or `strip = true` in the release-debug profile. Inlined symbols disappear from the trace; backtraces show `??`.
7. `devtools` feature not enabled in the tauri Cargo dep. The Web Inspector port opens but no targets are listed because `webkit_settings_set_enable_developer_extras` is never called.
8. Tauri's `target/parts[len-3]` resource-resolution quirk. A binary at `target-uftrace/release-debug/handy` will fail in tray init with an unhelpful error. The hardlink-into-wrapper fix (section 5.3) is the only path forward.
9. `bun run tauri dev` for a capture run. Loads JS over Vite HMR port 1420 instead of the bundled `dist/`. The JS that runs is not the JS that ships. Always build via `bun run build` first and launch the binary directly.
10. A still-running user-installed Handy. The single-instance plugin will block your launch silently. Always check first.
11. Env-var-based per-app PipeWire/PulseAudio redirection. Does not work for cpal-ALSA-PipeWire on Handy. The only validated method is `pactl move-source-output` per stream cycle.
12. Loaded null sinks survive process death. WirePlumber's stream-restore cache then binds the user's installed Handy to a sink that no longer exists, breaking their voice-to-text silently the next time they use it. Always register an unconditional cleanup that runs on every exit.
13. Anchor-regex bulk source-counter injection. Anchors match inside multi-line function signatures, landing counter calls before `&self,` and producing 20+ compile errors. Edit source by hand, with the full signature visible.
14. Trusting "no errors" as the success criterion. A capture that records zero observations is "error-free." Validate by measured properties of the artifact (size, content match), not by exception-absence.
15. `set -uo pipefail` + `grep -q` after a pipe → SIGPIPE → false-fatal. Once `grep -q` finds its first match it closes stdin; the upstream command (`nm`, `objdump`, `cat`, anything streaming through the pipe) then writes to a closed pipe and gets SIGPIPE. Under `set -uo pipefail` the script treats that as a pipeline failure and bails, even though the grep succeeded. Symptom: build-verify gates fail on builds you can manually confirm are correct. Fix: replace every `nm | grep -q ...` / `objdump | grep -q ...` with a counted form (`COUNT=$(nm | grep -c ...); [ "$COUNT" -gt 0 ]`). The count form drains the pipe fully so no SIGPIPE happens.
16. `grep -q` cumulative-count wait logic for multi-cycle runs. Writing a per-cycle "wait for transcription marker" loop as `while ! grep -q PATTERN handy.stdout.log` works for the first cycle but breaks for cycles 2+: cycle 1's marker is still in stdout, so the condition is already true at cycle 2's wait. The script proceeds immediately and the actual transcription may finish AFTER the script tears down, losing the last cycle's paste / transcription stdout. Fix: capture a baseline count BEFORE the trigger-stop signal, then wait for the count to increase past it. (Section 11.8 shows the corrected form.)
17. Filtering Web Inspector targets by Tauri window LABEL instead of TITLE. `<div class="targetname">` is the window title, not Tauri's internal label. A `recording_overlay`-labeled window may have an inspector target name of `Recording Overlay`. Filter on the title side and normalize case and `_` ↔ space; otherwise the harness silently records zero records and you debug the wrong failure mode (lazy load, WS connection, etc.). See section 7.2.
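For the label-versus-title pitfall, the normalization can be a small pure function. This is a sketch: `normalize_target_name` and `match_target` are hypothetical helper names, and the only transformations applied are the two named in that pitfall (case folding and treating underscores as spaces), plus whitespace collapsing.

```python
def normalize_target_name(name):
    """Lower-case, treat underscores as spaces, collapse whitespace."""
    return " ".join(name.replace("_", " ").lower().split())

def match_target(label, titles):
    """Return the inspector target titles that correspond to a Tauri label."""
    wanted = normalize_target_name(label)
    return [t for t in titles if normalize_target_name(t) == wanted]
```

A harness should treat an empty result as a hard failure at connection time, so that a naming mismatch is reported before the capture starts rather than discovered as an empty record file afterwards.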
A complete instrumentation report documents not just what it captured but what it could not. Template:
| # | Out-of-reach surface | What would reach it |
|---|---|---|
| 1 | C/C++ internals of dylibs (libwhisper, libwebkit2gtk, libgtk, libonnxruntime, libcuda, libasound) | A custom build of each dylib with -fno-omit-frame-pointer and either uftrace --libcall or a heaptrack run |
| 2 | WebKit subprocess internals (WebKitWebProcess, WebKitNetworkProcess) beyond the JS layer | A heaptrack attach to the WebKit subprocess PID, or the WebKit Inspector connected to the WebProcess directly (vs. through the UIProcess proxy) |
| 3 | Sub-millisecond JS work below ScriptProfiler's sampling resolution | Explicit performance.now() instrumentation in JS, or Heap.startSampling at a higher rate |
| 4 | React reconciler "why did this render" semantics | React DevTools Profiler attached at the React commit phase boundary (visible here only as react-dom stack samples) |
| 5 | Kernel activity below the syscall boundary | eBPF / bpftrace. Note that some systems disable unprivileged BPF (unprivileged_bpf_disabled=2); if so, this is unreachable without root |
| 6 | cpal-PortAudio internals and the ALSA hw-mmap path | uftrace --libcall against a debug-built libasound, plus strace -e read on the snd_pcm_mmap FD |
| 7 | ONNX Runtime kernels invoked by the Silero VAD model | The ONNX Runtime profiler API (SessionOptions.EnableProfiling) — a build-time switch |
| 8 | Source-counter closures not instrumented in this run | Per section 8: add hand-edited counters at the specific closure body, build with --features instr-counters, re-run |
This list is a template; expand or contract it based on what you decided to include in your composition. The point is to make the surface area explicit.
strace -f -yy -s 16384 \
-e trace=execve,writev,write,sendmsg,sendto,recvmsg,recvfrom \
-o $RUN_DIR/strace.log \
uftrace record \
-d $RUN_DIR/uftrace.data \
--no-libcall \
-- \
$WRAP/handy --start-hidden
With environment:
DISPLAY=:103
WEBKIT_INSPECTOR_HTTP_SERVER=127.0.0.1:9230
RUST_LOG=handy_app_lib=trace,handy=trace # optional
# Cargo.toml temporary edits: add [profile.release-debug] and the
# "devtools" feature on the tauri dep. Remember to revert.
CARGO_TARGET_DIR=target-uftrace \
RUSTFLAGS="-Z instrument-mcount" \
cargo +nightly build --profile release-debug --bin handy \
--features instr-counters
# Wrapper layout (Tauri target/parts[len-3] gotcha):
WRAP=/tmp/handycap-wrap/target/release-debug
mkdir -p $WRAP
ln -f src-tauri/target-uftrace/release-debug/handy $WRAP/handy
ln -snf $PWD/src-tauri/target-uftrace/release-debug/resources $WRAP/resources
touch $WRAP/.cargo-lock
phase2-start
launch
shortcuts_initialized
paplay_start
sigusr2_start
recording_steady_window_begin
recording_steady_window_end
sigusr2_stop
post_transcribe_paste
idle_end
phase2-end
execution-trace.csv:
phase, tool, name, file, line, calls, total_s, self_s, frequency_hz
call-graph.csv:
caller, callee, count
coverage.csv:
path, file, line, phase, observation_tool, fired, fire_count, reason
status-summary.md: per-phase status + the measured properties on which
each PASS rests.
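The coverage.csv columns listed above can be produced by joining the static map against observed fire counts. The sketch below is an assumption-heavy outline: `build_coverage` is a hypothetical name, `fire_counts` is assumed to be a dict keyed by (observation_tool, path) that you have already aggregated from the per-tool artifacts, and only two of the reason strings are assigned automatically. The phase-coverage and FFI reasons need human judgment.

```python
COVERAGE_FIELDS = [
    "path", "file", "line", "phase", "observation_tool",
    "fired", "fire_count", "reason",
]

def build_coverage(map_rows, fire_counts, instrumented_counters=frozenset()):
    """Join coverage-map rows with observed counts.

    map_rows:              dicts read from coverage-map.csv
    fire_counts:           {(observation_tool, path): count}
    instrumented_counters: paths that had a source counter in this build
    """
    out = []
    for row in map_rows:
        tool, path = row["observation_tool"], row["path"]
        count = fire_counts.get((tool, path), 0)
        if count > 0:
            reason = ""
        elif tool == "source-counter" and path not in instrumented_counters:
            reason = "needs source-counter; not instrumented in this run"
        else:
            # Default guess; revise by hand where another reason applies.
            reason = "did not fire in this cycle (alternative branch / feature off)"
        out.append({
            "path": path, "file": row["file"], "line": row["line"],
            "phase": row["phase"], "observation_tool": tool,
            "fired": count > 0, "fire_count": count, "reason": reason,
        })
    return out
```

The result can be written with `csv.DictWriter(f, fieldnames=COVERAGE_FIELDS)`. Review every row whose reason was defaulted before reporting a coverage percentage.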
These are starting points; tune to your hardware:
- uftrace.data size: >= 30 MB for a 10-second cycle on a desktop-class CPU. Under heavier instrumentation or longer cycles this scales linearly.
- strace.log size: >= 50 KB; expect 10–100 MB for a full cycle.
- webinspector overlay records: >= 500 for a cycle with an active level meter; >= 50 for an idle webview.
- handy.stdout.log: presence of
"Shortcuts initialized"is the boot marker; transcription / paste markers depend on app version.
uftrace >= 0.15
strace
pactl, paplay (pulseaudio/pipewire utils)
Xvfb, xdpyinfo (X virtual framebuffer)
nm, objdump, c++filt (binutils)
python3 (for the synthesis harness)
Optional:
heaptrack (for the gap-audit follow-ups in section 15)
gdb (for the rare interactive forensics)
- sudo. The entire procedure runs as the unprivileged user.
- A modified kernel. uftrace uses LD_PRELOAD; strace uses ptrace at ptrace_scope <= 1; the Web Inspector is a localhost HTTP server.
- A custom WebKit / wry / Tauri build. Stock libwebkit2gtk and the patched-but-released Tauri runtime in this repo are sufficient.
The procedure above produces a complete, validated execution-flow map of one transcription lifecycle. To extend to multiple cycles, wrap the trigger sequence (section 11.8) in a loop, phase-mark each cycle boundary, and re-run synthesis. The audit, the per-tool validation, and the gap audit remain unchanged.
When in doubt: instrument less, measure twice, validate every artifact before drawing any conclusion. A trace that you trust is worth a hundred traces you don't.