@primiano
Created March 21, 2026 08:35
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Trace Format Comparison for Perfetto</title>
<link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500;700&family=Source+Serif+4:opsz,wght@8..60,400;8..60,600;8..60,700&family=DM+Sans:wght@400;500;600&display=swap" rel="stylesheet">
<style>
:root {
--bg: #0c0c0e;
--surface: #16161a;
--surface2: #1e1e24;
--border: #2a2a32;
--text: #e2e0dc;
--text-dim: #8a8880;
--accent-cbor-s: #f0a040;
--accent-cbor-i: #60d0a0;
--accent-resp: #40c8f0;
--accent-ndjson: #a0e060;
--accent-ion: #c080f0;
--accent-arrow: #f06080;
--accent-chrome: #888;
--code-bg: #111114;
--radius: 8px;
}
* { margin: 0; padding: 0; box-sizing: border-box; }
body { background: var(--bg); color: var(--text); font-family: 'DM Sans', sans-serif; line-height: 1.7; }
.hero { padding: 80px 48px 60px; max-width: 1200px; margin: 0 auto; }
.hero h1 { font-family: 'Source Serif 4', serif; font-size: 2.4rem; font-weight: 700; letter-spacing: -0.03em; line-height: 1.15; margin-bottom: 16px; }
.hero .subtitle { color: var(--text-dim); font-size: 1.05rem; max-width: 750px; margin-bottom: 32px; }
.scenario-banner { background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 20px 28px; margin-bottom: 24px; }
.scenario-banner h3 { font-family: 'JetBrains Mono', monospace; font-size: 0.78rem; text-transform: uppercase; letter-spacing: 0.1em; color: var(--text-dim); margin-bottom: 8px; }
.scenario-banner p { font-size: 0.93rem; }
.content { max-width: 1200px; margin: 0 auto; padding: 0 48px 80px; }
.tabs { display: flex; gap: 3px; margin-bottom: 0; position: sticky; top: 0; z-index: 10; background: var(--bg); padding: 8px 0 0; flex-wrap: wrap; }
.tab { padding: 9px 14px; border: 1px solid var(--border); border-bottom: none; border-radius: var(--radius) var(--radius) 0 0; background: var(--surface); color: var(--text-dim); font-family: 'JetBrains Mono', monospace; font-size: 0.74rem; font-weight: 500; cursor: pointer; transition: all 0.2s; user-select: none; white-space: nowrap; }
.tab:hover { color: var(--text); background: var(--surface2); }
.tab.active { color: var(--text); background: var(--surface2); }
.tab[data-tab="cbor-s"].active { border-top: 2px solid var(--accent-cbor-s); }
.tab[data-tab="cbor-i"].active { border-top: 2px solid var(--accent-cbor-i); }
.tab[data-tab="resp"].active { border-top: 2px solid var(--accent-resp); }
.tab[data-tab="ndjson"].active { border-top: 2px solid var(--accent-ndjson); }
.tab[data-tab="ion"].active { border-top: 2px solid var(--accent-ion); }
.tab[data-tab="arrow"].active { border-top: 2px solid var(--accent-arrow); }
.tab[data-tab="chrome-json"].active { border-top: 2px solid var(--accent-chrome); }
.panel { display: none; background: var(--surface2); border: 1px solid var(--border); border-radius: 0 var(--radius) var(--radius) var(--radius); padding: 32px; }
.panel.active { display: block; }
.panel h2 { font-family: 'Source Serif 4', serif; font-size: 1.4rem; font-weight: 600; margin-bottom: 8px; }
.panel .desc { color: var(--text-dim); font-size: 0.88rem; margin-bottom: 24px; max-width: 750px; }
.code-section { margin-bottom: 24px; }
.code-section h3 { font-family: 'JetBrains Mono', monospace; font-size: 0.74rem; text-transform: uppercase; letter-spacing: 0.1em; color: var(--text-dim); margin-bottom: 8px; display: flex; align-items: center; gap: 8px; }
.code-section h3 .dot { width: 8px; height: 8px; border-radius: 50%; display: inline-block; }
.code-block { background: var(--code-bg); border: 1px solid var(--border); border-radius: var(--radius); padding: 16px 20px; overflow-x: auto; font-family: 'JetBrains Mono', monospace; font-size: 0.78rem; line-height: 1.7; white-space: pre; color: var(--text); tab-size: 2; }
.code-block .c { color: #666; font-style: italic; }
.code-block .k { color: #e0a060; }
.code-block .s { color: #a0d870; }
.code-block .n { color: #70b0e0; }
.code-block .t { color: #d080c0; }
.code-block .p { color: #888; }
.code-block .w { color: #e07070; }
.code-block .f { color: #60c0d0; }
.pros-cons { display: grid; grid-template-columns: 1fr 1fr; gap: 14px; margin-top: 18px; }
.pros, .cons { background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 16px; }
.pros h4, .cons h4 { font-family: 'JetBrains Mono', monospace; font-size: 0.72rem; text-transform: uppercase; letter-spacing: 0.08em; margin-bottom: 8px; }
.pros h4 { color: #80d060; }
.cons h4 { color: #e06060; }
.pros li, .cons li { font-size: 0.84rem; color: var(--text-dim); margin-bottom: 4px; list-style: none; padding-left: 16px; position: relative; }
.pros li::before { content: '+'; position: absolute; left: 0; color: #80d060; font-weight: 700; }
.cons li::before { content: '−'; position: absolute; left: 0; color: #e06060; font-weight: 700; }
.filter-section { margin-top: 20px; background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 18px; }
.filter-section h4 { font-family: 'JetBrains Mono', monospace; font-size: 0.72rem; text-transform: uppercase; letter-spacing: 0.08em; color: var(--text-dim); margin-bottom: 10px; }
.filter-section .code-block { margin-bottom: 0; font-size: 0.76rem; }
.wire-section { margin-top: 28px; background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 18px; }
.wire-section h4 { font-family: 'JetBrains Mono', monospace; font-size: 0.72rem; text-transform: uppercase; letter-spacing: 0.08em; color: var(--text-dim); margin-bottom: 10px; }
.size-bar { display: flex; align-items: center; gap: 10px; margin-bottom: 6px; }
.size-bar .label { font-family: 'JetBrains Mono', monospace; font-size: 0.74rem; color: var(--text-dim); width: 130px; text-align: right; flex-shrink: 0; }
.size-bar .bar { height: 18px; border-radius: 4px; min-width: 20px; display: flex; align-items: center; padding-left: 8px; font-family: 'JetBrains Mono', monospace; font-size: 0.68rem; color: #fff; font-weight: 600; }
.bottom-note { margin-top: 36px; padding: 22px 26px; background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); }
.bottom-note h3 { font-family: 'Source Serif 4', serif; font-size: 1.1rem; margin-bottom: 10px; }
.bottom-note p { color: var(--text-dim); font-size: 0.88rem; margin-bottom: 7px; }
@media (max-width: 768px) {
.hero { padding: 36px 16px 24px; }
.hero h1 { font-size: 1.4rem; }
.content { padding: 0 12px 40px; }
.panel { padding: 16px; }
.pros-cons { grid-template-columns: 1fr; }
.tab { font-size: 0.66rem; padding: 7px 10px; }
}
</style>
</head>
<body>
<div class="hero">
<h1>Alternative trace formats for Perfetto</h1>
<p class="subtitle">Concrete examples with emitter code in C and Python, plus filtering/query examples. Context: <code>google/perfetto#3491</code>.</p>
<div class="scenario-banner">
<h3>Scenario modelled in every tab</h3>
<p>A GPU workload trace: nested track groups (GPU → Queue → CmdBuf), slices, a counter (VRAM usage). This is exactly the kind of thing Chrome JSON handles poorly and protobuf handles verbosely.</p>
</div>
</div>
<div class="content">
<div class="tabs">
<div class="tab active" data-tab="cbor-s">CBOR (string keys)</div>
<div class="tab" data-tab="cbor-i">CBOR (int keys)</div>
<div class="tab" data-tab="ndjson">NDJSON</div>
<div class="tab" data-tab="ion">Amazon Ion</div>
<div class="tab" data-tab="resp">RESP-like</div>
<div class="tab" data-tab="arrow">Arrow IPC</div>
<div class="tab" data-tab="chrome-json">Chrome JSON</div>
</div>
<!-- ============================================================ -->
<!-- CBOR STRING KEYS -->
<!-- ============================================================ -->
<div class="panel active" id="panel-cbor-s">
<h2 style="color: var(--accent-cbor-s);">CBOR — String Keys (JSON-compatible)</h2>
<p class="desc">CBOR (RFC 8949) with UTF-8 string map keys. Looks like JSON's data model, encodes to binary. Round-trips losslessly to JSON, so the full <code>jq</code> ecosystem works via a simple pipe. ~30-40% smaller than JSON.</p>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-cbor-s);"></span> Python emitter</h3>
<div class="code-block"><span class="w">import</span> cbor2 <span class="c"># pip install cbor2</span>
<span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.cbor"</span>, <span class="s">"wb"</span>) <span class="w">as</span> f:
    <span class="c"># Track hierarchy: GPU → Queue → CmdBuf, plus a VRAM counter track</span>
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">1</span>, <span class="s">"name"</span>: <span class="s">"GPU 0"</span>}, f)
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">2</span>, <span class="s">"name"</span>: <span class="s">"Graphics Queue"</span>, <span class="s">"parent"</span>: <span class="n">1</span>}, f)
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">3</span>, <span class="s">"name"</span>: <span class="s">"CmdBuf #42"</span>, <span class="s">"parent"</span>: <span class="n">2</span>}, f)
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">4</span>, <span class="s">"name"</span>: <span class="s">"VRAM"</span>, <span class="s">"parent"</span>: <span class="n">1</span>, <span class="s">"counter"</span>: <span class="s">"bytes"</span>}, f)
    <span class="c"># Events: each dump() appends one self-delimiting item (a CBOR Sequence)</span>
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"B"</span>, <span class="s">"t"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>,
                <span class="s">"name"</span>: <span class="s">"vkCmdDraw"</span>, <span class="s">"args"</span>: {<span class="s">"vertices"</span>: <span class="n">36000</span>}}, f)
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"E"</span>, <span class="s">"t"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1500000</span>}, f)
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"C"</span>, <span class="s">"t"</span>: <span class="n">4</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>, <span class="s">"v"</span>: <span class="n">268435456</span>}, f)</div>
</div>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-cbor-s);"></span> C emitter — ~50 lines, zero deps</h3>
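<div class="code-block"><span class="w">#include</span> <span class="s">&lt;stdint.h&gt;</span>
<span class="w">#include</span> <span class="s">&lt;stdio.h&gt;</span>
<span class="w">#include</span> <span class="s">&lt;string.h&gt;</span>
<span class="c">// Head byte = major type | argument, using the shortest big-endian form.</span>
<span class="t">void</span> <span class="f">cbor_head</span>(FILE *f, <span class="t">int</span> major, <span class="t">uint64_t</span> v) {
  <span class="w">if</span> (v &lt;= <span class="n">23</span>) { fputc(major | (<span class="t">int</span>)v, f); <span class="w">return</span>; }
  <span class="t">int</span> n = v &lt;= <span class="n">0xFF</span> ? <span class="n">1</span> : v &lt;= <span class="n">0xFFFF</span> ? <span class="n">2</span> : v &lt;= <span class="n">0xFFFFFFFF</span> ? <span class="n">4</span> : <span class="n">8</span>;
  fputc(major | (n == <span class="n">1</span> ? <span class="n">0x18</span> : n == <span class="n">2</span> ? <span class="n">0x19</span> : n == <span class="n">4</span> ? <span class="n">0x1A</span> : <span class="n">0x1B</span>), f);
  <span class="w">while</span> (n--) fputc((<span class="t">int</span>)((v &gt;&gt; (<span class="n">8</span> * n)) &amp; <span class="n">0xFF</span>), f); <span class="c">// 1/2/4/8 payload bytes, big-endian</span>
}
<span class="t">void</span> <span class="f">cbor_map</span>(FILE *f, <span class="t">uint64_t</span> n) { <span class="f">cbor_head</span>(f, <span class="n">0xA0</span>, n); } <span class="c">// n = number of key/value pairs</span>
<span class="t">void</span> <span class="f">cbor_uint</span>(FILE *f, <span class="t">uint64_t</span> v) { <span class="f">cbor_head</span>(f, <span class="n">0x00</span>, v); }
<span class="t">void</span> <span class="f">cbor_str</span>(FILE *f, <span class="w">const</span> <span class="t">char</span> *s) {
  <span class="t">size_t</span> n = strlen(s);
  <span class="f">cbor_head</span>(f, <span class="n">0x60</span>, n); <span class="c">// text string: length, then raw UTF-8 bytes</span>
  fwrite(s, <span class="n">1</span>, n, f);
}
<span class="c">// Emit one slice-begin:</span>
<span class="f">cbor_map</span>(f, <span class="n">4</span>);
<span class="f">cbor_str</span>(f,<span class="s">"type"</span>); <span class="f">cbor_str</span>(f,<span class="s">"B"</span>);
<span class="f">cbor_str</span>(f,<span class="s">"t"</span>); <span class="f">cbor_uint</span>(f,<span class="n">3</span>);
<span class="f">cbor_str</span>(f,<span class="s">"ts"</span>); <span class="f">cbor_uint</span>(f,<span class="n">1000000</span>);
<span class="f">cbor_str</span>(f,<span class="s">"name"</span>); <span class="f">cbor_str</span>(f,<span class="s">"vkCmdDraw"</span>);</div>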
</div>
<div class="filter-section">
<h4>Filtering &amp; querying</h4>
<div class="code-block"><span class="c"># CBOR → JSON is lossless for string keys. Full jq works.</span>
<span class="c"># Pretty-print entire trace</span>
cat trace.cbor | cbor2 --sequence --pretty
<span class="c"># The cbor2 CLI decodes CBOR to JSON by default; its -d flag means "input is base64", so it is omitted here.</span>
<span class="c"># Find all slice-begin events</span>
cat trace.cbor | cbor2 --sequence | jq <span class="s">'select(.type == "B")'</span>
<span class="c"># Get all events on track 3 with timestamps &gt; 1.2ms</span>
cat trace.cbor | cbor2 --sequence | jq <span class="s">'select(.t == 3 and .ts &gt; 1200000)'</span>
<span class="c"># List unique event names</span>
cat trace.cbor | cbor2 --sequence | jq -r <span class="s">'select(.name) | .name'</span> | sort -u
<span class="c"># Count events per type</span>
cat trace.cbor | cbor2 --sequence | jq -s <span class="s">'group_by(.type) | map({type: .[0].type, count: length})'</span>
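<span class="c"># Filter and re-encode back to CBOR (the cbor2 CLI only decodes, so re-encode with a Python one-liner)</span>
cat trace.cbor | cbor2 --sequence | jq -c <span class="s">'select(.ts &gt;= 1000000 and .ts &lt;= 2000000)'</span> \
  | python3 -c <span class="s">'import sys, json, cbor2; [cbor2.dump(json.loads(l), sys.stdout.buffer) for l in sys.stdin]'</span> &gt; filtered.cbor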
<span class="c"># Inspect raw bytes</span>
cat trace.cbor | cbor-diag <span class="c"># Ruby: gem install cbor-diag</span>
<span class="c"># or</span>
cbor-diag trace.cbor <span class="c"># Rust: cargo install cbor-diag-cli</span></div>
</div>
<div class="pros-cons">
<div class="pros"><h4>Strengths</h4><ul>
<li>JSON data model — trivially learnable</li>
<li>Lossless JSON round-trip → full jq/grep ecosystem</li>
<li>CBOR Sequences = crash-safe streaming</li>
<li>~35% smaller than JSON, native uint64 timestamps</li>
<li>Tiny encoder, IETF standard (RFC 8949)</li>
</ul></div>
<div class="cons"><h4>Weaknesses</h4><ul>
<li>Not human-readable in raw form</li>
<li>Only ~35% smaller — string keys are most of the overhead</li>
<li>No built-in interning</li>
<li>Requires <code>cbor2</code> in the pipeline (pip/gem install)</li>
</ul></div>
</div>
</div>
<!-- ============================================================ -->
<!-- CBOR INTEGER KEYS -->
<!-- ============================================================ -->
<div class="panel" id="panel-cbor-i">
<h2 style="color: var(--accent-cbor-i);">CBOR — Integer Keys (CTAP2-style)</h2>
<p class="desc">Same CBOR spec (RFC 8949), but using integer map keys instead of strings — the same approach used by FIDO2/CTAP2 security keys and COSE. Gets within ~15% of protobuf compactness. <strong>Requires an external schema</strong> to map key numbers to field names.</p>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-cbor-i);"></span> Schema definition (must be maintained alongside the spec)</h3>
<div class="code-block"><span class="c">// Field number → name mapping (a .json or .cddl file you maintain)</span>
<span class="c">// This is analogous to a .proto file.</span>
{
  <span class="s">"event_fields"</span>: {
    <span class="s">"1"</span>: <span class="s">"type"</span>, <span class="c">// 0=begin, 1=end, 2=counter, 3=instant</span>
    <span class="s">"2"</span>: <span class="s">"track"</span>,
    <span class="s">"3"</span>: <span class="s">"ts"</span>, <span class="c">// nanoseconds</span>
    <span class="s">"4"</span>: <span class="s">"name"</span>,
    <span class="s">"5"</span>: <span class="s">"value"</span>, <span class="c">// counter value</span>
    <span class="s">"6"</span>: <span class="s">"name_iid"</span>, <span class="c">// interned string ref</span>
    <span class="s">"7"</span>: <span class="s">"args"</span> <span class="c">// nested map (string keys ok here)</span>
  },
  <span class="s">"track_fields"</span>: {
    <span class="s">"1"</span>: <span class="s">"uuid"</span>,
    <span class="s">"2"</span>: <span class="s">"name"</span>,
    <span class="s">"3"</span>: <span class="s">"parent"</span>,
    <span class="s">"4"</span>: <span class="s">"counter_unit"</span>
  },
  <span class="s">"type_enum"</span>: { <span class="s">"0"</span>: <span class="s">"begin"</span>, <span class="s">"1"</span>: <span class="s">"end"</span>, <span class="s">"2"</span>: <span class="s">"counter"</span>, <span class="s">"3"</span>: <span class="s">"instant"</span> }
}</div>
</div>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-cbor-i);"></span> Python emitter</h3>
<div class="code-block"><span class="w">import</span> cbor2
<span class="c"># Field constants (from schema)</span>
F_TYPE, F_TRACK, F_TS, F_NAME, F_VAL = <span class="n">1</span>, <span class="n">2</span>, <span class="n">3</span>, <span class="n">4</span>, <span class="n">5</span>
T_BEGIN, T_END, T_COUNTER = <span class="n">0</span>, <span class="n">1</span>, <span class="n">2</span>
<span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.cbor"</span>, <span class="s">"wb"</span>) <span class="w">as</span> f:
    <span class="c"># Track defs can still use string keys (human-readable, infrequent)</span>
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">1</span>, <span class="s">"name"</span>: <span class="s">"GPU 0"</span>}, f)
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">2</span>, <span class="s">"name"</span>: <span class="s">"Graphics Queue"</span>, <span class="s">"parent"</span>: <span class="n">1</span>}, f)
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">3</span>, <span class="s">"name"</span>: <span class="s">"CmdBuf #42"</span>, <span class="s">"parent"</span>: <span class="n">2</span>}, f)
    cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">4</span>, <span class="s">"name"</span>: <span class="s">"VRAM"</span>, <span class="s">"parent"</span>: <span class="n">1</span>, <span class="s">"counter"</span>: <span class="s">"bytes"</span>}, f)
    <span class="c"># Hot-path events use integer keys (compact, millions of these)</span>
    cbor2.dump({F_TYPE: T_BEGIN, F_TRACK: <span class="n">3</span>, F_TS: <span class="n">1000000</span>, F_NAME: <span class="s">"vkCmdDraw"</span>}, f)
    cbor2.dump({F_TYPE: T_END, F_TRACK: <span class="n">3</span>, F_TS: <span class="n">1500000</span>}, f)
    cbor2.dump({F_TYPE: T_COUNTER, F_TRACK: <span class="n">4</span>, F_TS: <span class="n">1000000</span>, F_VAL: <span class="n">268435456</span>}, f)</div>
</div>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-cbor-i);"></span> C emitter</h3>
<div class="code-block"><span class="c">// Same cbor_uint/cbor_str helpers as string-key variant.</span>
<span class="c">// Only difference: keys are integers, not strings.</span>
<span class="f">cbor_map</span>(f, <span class="n">4</span>);
<span class="f">cbor_uint</span>(f, <span class="n">1</span>); <span class="f">cbor_uint</span>(f, <span class="n">0</span>); <span class="c">// 1=type, 0=BEGIN</span>
<span class="f">cbor_uint</span>(f, <span class="n">2</span>); <span class="f">cbor_uint</span>(f, <span class="n">3</span>); <span class="c">// 2=track</span>
<span class="f">cbor_uint</span>(f, <span class="n">3</span>); <span class="f">cbor_uint</span>(f, <span class="n">1000000</span>); <span class="c">// 3=ts</span>
<span class="f">cbor_uint</span>(f, <span class="n">4</span>); <span class="f">cbor_str</span>(f, <span class="s">"vkCmdDraw"</span>); <span class="c">// 4=name</span>
<span class="c">// Bytes on wire: A4 01 00 02 03 03 1A000F4240 04 69"vkCmdDraw"</span>
<span class="c">// Total: 22 bytes (vs 34 string-key, vs 19 protobuf)</span></div>
</div>
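<p class="desc">The byte counts quoted above can be checked with a few lines of standard-library Python. This is a minimal sketch, not a full CBOR implementation: <code>cbor_head</code> and <code>encode</code> are illustrative names, and only the subset used by these events (unsigned integers, text strings, maps) is handled.</p>

```python
import struct

def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR head: 3-bit major type plus the shortest-form argument."""
    if value <= 23:
        return bytes([major << 5 | value])
    for info, fmt, limit in ((24, ">B", 0xFF), (25, ">H", 0xFFFF),
                             (26, ">I", 0xFFFFFFFF), (27, ">Q", 2**64 - 1)):
        if value <= limit:
            return bytes([major << 5 | info]) + struct.pack(fmt, value)
    raise ValueError("value does not fit in 64 bits")

def encode(obj) -> bytes:
    """Encode the subset used by the trace events: uint, text string, map."""
    if isinstance(obj, int) and obj >= 0:
        return cbor_head(0, obj)
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return cbor_head(3, len(raw)) + raw
    if isinstance(obj, dict):
        body = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return cbor_head(5, len(obj)) + body
    raise TypeError(f"unsupported type: {type(obj).__name__}")

# The same slice-begin event, once with integer keys and once with string keys.
int_keyed = encode({1: 0, 2: 3, 3: 1000000, 4: "vkCmdDraw"})
str_keyed = encode({"type": "B", "t": 3, "ts": 1000000, "name": "vkCmdDraw"})

print(int_keyed.hex(" "))
print(len(int_keyed), len(str_keyed))   # prints "22 34"
```

<p class="desc">The 12-byte difference is entirely key overhead: four string keys cost 15 bytes, four small-integer keys cost 4, and the one-letter <code>"B"</code> value costs one byte more than the integer <code>0</code>.</p>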
<div class="filter-section">
<h4>Filtering &amp; querying (the hard part)</h4>
<div class="code-block"><span class="c"># Raw cbor2json converts integer keys to string-ified integers.</span>
<span class="c"># {1: 0, 2: 3, 3: 1000000} becomes {"1": 0, "2": 3, "3": 1000000}</span>
<span class="c"># This is LOSSY — int key 1 and string key "1" become indistinguishable.</span>
<span class="c"># Raw (no schema): works but ugly</span>
cat trace.cbor | cbor2 --sequence | jq <span class="s">'select(.["1"] == 0)'</span> <span class="c"># "field 1 == BEGIN"</span>
<span class="c"># With a schema-expansion script (you have to write this yourself):</span>
cat trace.cbor | cbor2 --sequence | ./expand-schema.py | jq <span class="s">'select(.type == "begin")'</span>
<span class="c"># expand-schema.py is ~20 lines: read JSON, replace int keys with names</span>
<span class="c"># But you must maintain it whenever the schema changes.</span>
<span class="c"># For comparison, protobuf has this built in:</span>
protoc --decode=TraceEvent trace.proto &lt; trace.bin <span class="c"># just works</span>
<span class="c"># CBOR diagnostic notation (shows raw structure, no field names):</span>
cat trace.cbor | cbor-diag
<span class="c"># Output: {1: 0, 2: 3, 3: 1000000, 4: "vkCmdDraw"}</span>
<span class="c"># Readable if you memorize the schema. Not great for debugging.</span></div>
</div>
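<p class="desc">For a sense of scale, here is a sketch of what the schema-expansion step might look like. It assumes the decoder emits one JSON object per line with integer keys already stringified, and that the field tables are copied by hand from the schema definition above. <code>EVENT_FIELDS</code>, <code>TYPE_ENUM</code>, <code>expand</code> and <code>expand_stream</code> are hypothetical names, not part of any existing tool.</p>

```python
import json

# Hypothetical field tables, copied by hand from the schema definition.
EVENT_FIELDS = {"1": "type", "2": "track", "3": "ts", "4": "name",
                "5": "value", "6": "name_iid", "7": "args"}
TYPE_ENUM = {0: "begin", 1: "end", 2: "counter", 3: "instant"}

def expand(event: dict) -> dict:
    """Replace stringified integer keys with field names; pass others through."""
    out = {}
    for key, value in event.items():
        name = EVENT_FIELDS.get(key, key)      # unknown keys are kept as-is
        if name == "type" and isinstance(value, int):
            value = TYPE_ENUM.get(value, value)
        out[name] = value
    return out

def expand_stream(lines):
    """Yield one expanded JSON line per non-empty input line."""
    for line in lines:
        if line.strip():
            yield json.dumps(expand(json.loads(line)), separators=(",", ":"))

# Demo on two decoded records: an int-keyed event and a string-keyed track def.
sample = ['{"1":0,"2":3,"3":1000000,"4":"vkCmdDraw"}',
          '{"type":"track","uuid":1,"name":"GPU 0"}']
for out_line in expand_stream(sample):
    print(out_line)
```

<p class="desc">In a pipeline, the same function would be fed <code>sys.stdin</code> instead of <code>sample</code>. The sketch also shows the maintenance cost: the tables are a second copy of the schema that has to be kept in sync by hand.</p>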
<div class="pros-cons">
<div class="pros"><h4>Strengths</h4><ul>
<li>Within ~15% of protobuf compactness</li>
<li>Same tiny encoder as string-key CBOR</li>
<li>Can mix: string keys for rare defs, int keys for hot-path events</li>
<li>IETF standard — used by FIDO2/CTAP2, COSE in production</li>
<li>No code generation step (unlike protobuf)</li>
</ul></div>
<div class="cons"><h4>Weaknesses</h4><ul>
<li>Requires a schema — same as protobuf's fundamental tradeoff</li>
<li>No protoc-equivalent for CBOR schema expansion</li>
<li>Lossy JSON round-trip (int keys become string "1", "2"...)</li>
<li>jq queries use opaque <code>.["3"]</code> instead of <code>.ts</code></li>
<li>Schema versioning/compat is your problem (protobuf has mature tooling)</li>
<li>Awkward middle: schema tax without protobuf's ecosystem</li>
</ul></div>
</div>
</div>
<!-- ============================================================ -->
<!-- NDJSON -->
<!-- ============================================================ -->
<div class="panel" id="panel-ndjson">
<h2 style="color: var(--accent-ndjson);">NDJSON — Newline-Delimited JSON</h2>
<p class="desc">One JSON object per line, with a simplified data model. Requires zero new libraries. Streamable, crash-safe, greppable. The best jq experience of any option.</p>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-ndjson);"></span> Trace file</h3>
<div class="code-block"><span class="p">{"track":{"uuid":1,"name":"GPU 0"}}</span>
<span class="p">{"track":{"uuid":2,"name":"Graphics Queue","parent":1}}</span>
<span class="p">{"track":{"uuid":3,"name":"CmdBuf #42","parent":2}}</span>
<span class="p">{"track":{"uuid":4,"name":"VRAM","parent":1,"counter":"bytes"}}</span>
<span class="p">{"begin":{"t":3,"ts":1000000,"name":"vkCmdDraw","args":{"vertices":36000}}}</span>
<span class="p">{"end":{"t":3,"ts":1500000}}</span>
<span class="p">{"counter":{"t":4,"ts":1000000,"v":268435456}}</span></div>
</div>
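<p class="desc">The nesting (GPU → Queue → CmdBuf) is carried only by the <code>parent</code> links, so a consumer has to resolve it. A minimal sketch of that step, using the track records from the file above; <code>load_tracks</code> and <code>track_path</code> are illustrative names, and the <code>" / "</code> separator is an arbitrary choice.</p>

```python
import json

TRACE = """\
{"track":{"uuid":1,"name":"GPU 0"}}
{"track":{"uuid":2,"name":"Graphics Queue","parent":1}}
{"track":{"uuid":3,"name":"CmdBuf #42","parent":2}}
{"track":{"uuid":4,"name":"VRAM","parent":1,"counter":"bytes"}}
"""

def load_tracks(lines):
    """Collect track definitions keyed by uuid, ignoring other records."""
    tracks = {}
    for line in lines:
        record = json.loads(line)
        if "track" in record:
            tracks[record["track"]["uuid"]] = record["track"]
    return tracks

def track_path(tracks, uuid):
    """Walk parent links up to the root and return the names root-first."""
    names = []
    seen = set()                      # guards against accidental parent cycles
    while uuid is not None and uuid not in seen:
        seen.add(uuid)
        track = tracks[uuid]
        names.append(track["name"])
        uuid = track.get("parent")
    return " / ".join(reversed(names))

tracks = load_tracks(TRACE.splitlines())
for uuid in sorted(tracks):
    print(uuid, track_path(tracks, uuid))
```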
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-ndjson);"></span> Python emitter</h3>
<div class="code-block"><span class="w">import</span> json
<span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.ndjson"</span>, <span class="s">"w"</span>) <span class="w">as</span> f:
    <span class="w">def</span> <span class="f">emit</span>(obj): f.write(json.dumps(obj, separators=(<span class="s">','</span>,<span class="s">':'</span>)) + <span class="s">'\n'</span>)
    emit({<span class="s">"track"</span>: {<span class="s">"uuid"</span>: <span class="n">1</span>, <span class="s">"name"</span>: <span class="s">"GPU 0"</span>}})
    emit({<span class="s">"track"</span>: {<span class="s">"uuid"</span>: <span class="n">2</span>, <span class="s">"name"</span>: <span class="s">"Graphics Queue"</span>, <span class="s">"parent"</span>: <span class="n">1</span>}})
    emit({<span class="s">"begin"</span>: {<span class="s">"t"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>, <span class="s">"name"</span>: <span class="s">"vkCmdDraw"</span>,
                    <span class="s">"args"</span>: {<span class="s">"vertices"</span>: <span class="n">36000</span>}}})</div>
</div>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-ndjson);"></span> C emitter</h3>
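<div class="code-block"><span class="w">#include</span> <span class="s">&lt;inttypes.h&gt;</span> <span class="c">// PRIu64: portable printf format for uint64_t (%lu is wrong on 32-bit and Windows)</span>
<span class="w">#include</span> <span class="s">&lt;stdio.h&gt;</span>
<span class="c">// name is written verbatim: it must not contain quotes, backslashes or control characters.</span>
<span class="t">void</span> <span class="f">emit_begin</span>(FILE *f, <span class="t">int</span> trk, <span class="t">uint64_t</span> ts, <span class="w">const</span> <span class="t">char</span> *name) {
  fprintf(f, <span class="s">"{\"begin\":{\"t\":%d,\"ts\":%"</span> PRIu64 <span class="s">",\"name\":\"%s\"}}\n"</span>, trk, ts, name);
}
<span class="t">void</span> <span class="f">emit_end</span>(FILE *f, <span class="t">int</span> trk, <span class="t">uint64_t</span> ts) {
  fprintf(f, <span class="s">"{\"end\":{\"t\":%d,\"ts\":%"</span> PRIu64 <span class="s">"}}\n"</span>, trk, ts);
}</div>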
</div>
<div class="filter-section">
<h4>Filtering &amp; querying — the gold standard</h4>
<div class="code-block"><span class="c"># No conversion needed. jq operates directly on the file.</span>
<span class="c"># All begin events</span>
jq <span class="s">'select(.begin)'</span> trace.ndjson
<span class="c"># Begin events on track 3 between 1ms and 2ms</span>
jq <span class="s">'select(.begin and .begin.t == 3 and .begin.ts &gt;= 1000000 and .begin.ts &lt;= 2000000)'</span> trace.ndjson
<span class="c"># List all track names</span>
jq -r <span class="s">'select(.track) | .track.name'</span> trace.ndjson
<span class="c"># Count events by type</span>
jq -s <span class="s">'map(keys[0]) | group_by(.) | map({type: .[0], count: length})'</span> trace.ndjson
<span class="c"># Extract args from all begin events</span>
jq <span class="s">'select(.begin.args) | .begin | {name, args}'</span> trace.ndjson
<span class="c"># Also works with standard Unix tools:</span>
grep <span class="s">'"begin"'</span> trace.ndjson | head -20 <span class="c"># first 20 begin events</span>
wc -l trace.ndjson <span class="c"># total event count</span>
grep <span class="s">'vkCmdDraw'</span> trace.ndjson <span class="c"># find by name</span>
tail -f trace.ndjson | jq <span class="s">'select(.begin)'</span> <span class="c"># live stream filtering</span>
awk -F<span class="s">'"ts":'</span> <span class="s">'{print $2}'</span> trace.ndjson | head <span class="c"># quick timestamp extraction</span></div>
</div>
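<p class="desc">Queries that need slice durations have to pair each <code>end</code> with its <code>begin</code>, which is awkward in jq but short in a script. A sketch under one assumption that the format above does not state explicitly: slices on a track nest like a stack, so an <code>end</code> closes the most recent open <code>begin</code> on the same track. <code>pair_slices</code> is an illustrative name.</p>

```python
import json
from collections import defaultdict

TRACE = """\
{"begin":{"t":3,"ts":1000000,"name":"vkCmdDraw","args":{"vertices":36000}}}
{"counter":{"t":4,"ts":1000000,"v":268435456}}
{"end":{"t":3,"ts":1500000}}
"""

def pair_slices(lines):
    """Match each end record with the most recent open begin on the same track."""
    open_slices = defaultdict(list)   # track id -> stack of open begin records
    slices = []
    for line in lines:
        record = json.loads(line)
        if "begin" in record:
            begin = record["begin"]
            open_slices[begin["t"]].append(begin)
        elif "end" in record:
            end = record["end"]
            stack = open_slices[end["t"]]
            if not stack:
                continue              # unmatched end: skipped in this sketch
            begin = stack.pop()
            slices.append({"track": end["t"], "name": begin["name"],
                           "ts": begin["ts"], "dur": end["ts"] - begin["ts"]})
    return slices

for s in pair_slices(TRACE.splitlines()):
    print(s)
```

<p class="desc">For the sample trace this yields a single <code>vkCmdDraw</code> slice on track 3 with a duration of 500000 ns. Begins that are never closed (for example after a crash) simply stay on the stack.</p>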
<div class="pros-cons">
<div class="pros"><h4>Strengths</h4><ul>
<li>Zero new libraries — every language has JSON</li>
<li>Full jq + grep + awk + tail -f — best tooling story</li>
<li>Streamable, crash-safe, diff-friendly</li>
<li>Supports nested args natively</li>
<li>0% learning curve</li>
</ul></div>
<div class="cons"><h4>Weaknesses</h4><ul>
<li>Largest on disk — JSON string overhead everywhere</li>
<li>No native uint64 (JS number limits)</li>
<li>No interning — strings repeated every event</li>
<li>Escaping strings in C is annoying</li>
</ul></div>
</div>
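<p class="desc">The uint64 weakness is easy to reproduce. JSON itself puts no limit on integer size, but parsers that store numbers as IEEE 754 doubles (JavaScript, and jq in many versions) only represent integers exactly up to 2<sup>53</sup>, which is roughly 104 days when counted in nanoseconds. The sketch below simulates such a parser with Python's <code>parse_int=float</code> hook; the counter record is just a sample.</p>

```python
import json

ts = 2**53 + 1                      # first integer a double cannot represent
line = json.dumps({"counter": {"t": 4, "ts": ts, "v": 0}})

# Python's json module keeps arbitrary-precision integers by default.
exact = json.loads(line)["counter"]["ts"]

# Forcing integers through float mimics a double-based JSON parser.
as_double = json.loads(line, parse_int=float)["counter"]["ts"]

print(exact == ts)                  # True: value survives the round trip
print(int(as_double) == ts)         # False: rounded to the nearest double
```

<p class="desc">Common workarounds are to keep timestamps relative to the trace start or to emit them as strings; both are conventions layered on top of the format, not something NDJSON provides.</p>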
</div>
<!-- ============================================================ -->
<!-- Amazon Ion -->
<!-- ============================================================ -->
<div class="panel" id="panel-ion">
<h2 style="color: var(--accent-ion);">Amazon Ion</h2>
<p class="desc">A JSON superset with native timestamps, symbol tables (auto-interning), comments, and interchangeable text &amp; binary forms. Author in text, convert to binary for production. Both are semantically identical. Libraries in C, Python, Rust, Java, Go, JS.</p>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-ion);"></span> Trace file — Ion text</h3>
<div class="code-block"><span class="c">// Valid Ion — this is a superset of JSON</span>
<span class="c">// Symbols (unquoted keys) auto-intern in binary form</span>
{
<span class="k">type</span>: <span class="s">track</span>,
<span class="k">uuid</span>: <span class="n">1</span>,
<span class="k">name</span>: <span class="s">"GPU 0"</span>,
}
{
<span class="k">type</span>: <span class="s">slice_begin</span>,
<span class="k">track</span>: <span class="n">3</span>,
<span class="k">ts</span>: <span class="n">2025-11-02T10:30:00.001000Z</span>, <span class="c">// native timestamp!</span>
<span class="k">name</span>: <span class="s">"vkCmdDraw"</span>,
<span class="k">args</span>: { <span class="k">vertices</span>: <span class="n">36000</span>, <span class="k">shader</span>: <span class="s">"pbr_frag"</span> }
}
{
<span class="k">type</span>: <span class="s">counter</span>,
<span class="k">track</span>: <span class="n">4</span>,
<span class="k">ts</span>: <span class="n">2025-11-02T10:30:00.001000Z</span>,
<span class="k">value</span>: <span class="n">268435456</span>,
}
<span class="c">// Binary form: all repeated symbols (type, track, ts, name...)</span>
<span class="c">// stored as integer IDs via Ion's built-in symbol table.</span>
<span class="c">// No manual interning needed!</span></div>
</div>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-ion);"></span> Python emitter</h3>
<div class="code-block"><span class="w">import</span> amazon.ion.simpleion <span class="w">as</span> ion <span class="c"># pip install amazon.ion</span>
events = [
{<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">1</span>, <span class="s">"name"</span>: <span class="s">"GPU 0"</span>},
{<span class="s">"type"</span>: <span class="s">"slice_begin"</span>, <span class="s">"track"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>,
<span class="s">"name"</span>: <span class="s">"vkCmdDraw"</span>, <span class="s">"args"</span>: {<span class="s">"vertices"</span>: <span class="n">36000</span>}},
]
<span class="c"># Text — for debugging:</span>
<span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.ion"</span>, <span class="s">"w"</span>) <span class="w">as</span> f:
    <span class="w">for</span> ev <span class="w">in</span> events: f.write(ion.dumps(ev, binary=<span class="n">False</span>) + <span class="s">"\n"</span>)
<span class="c"># Binary — for production (auto-interns symbols!):</span>
<span class="c"># Writing all events as one stream lets them share a symbol table;</span>
<span class="c"># separate dumps() calls would each start a fresh one.</span>
<span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.10n"</span>, <span class="s">"wb"</span>) <span class="w">as</span> f:
    ion.dump(events, f, binary=<span class="n">True</span>, sequence_as_stream=<span class="n">True</span>)</div>
</div>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-ion);"></span> C emitter — using ion-c</h3>
<div class="code-block"><span class="c">// Using official ion-c (github.com/amazon-ion/ion-c)</span>
<span class="w">#include</span> <span class="s">"ion.h"</span>
hWRITER writer;
ION_WRITER_OPTIONS opts = {<span class="n">0</span>};
opts.output_as_binary = TRUE; <span class="c">// flip to FALSE for text</span>
ion_writer_open(&amp;writer, stream, &amp;opts);
ion_writer_start_container(writer, tid_STRUCT);
<span class="c">// symbols auto-intern in binary mode</span>
ion_writer_write_field_name_from_cstr(writer, <span class="s">"type"</span>);
ion_writer_write_symbol_from_cstr(writer, <span class="s">"slice_begin"</span>);
ion_writer_write_field_name_from_cstr(writer, <span class="s">"track"</span>);
ion_writer_write_int(writer, <span class="n">3</span>);
ion_writer_write_field_name_from_cstr(writer, <span class="s">"ts"</span>);
ion_writer_write_int(writer, <span class="n">1000000</span>);
ion_writer_write_field_name_from_cstr(writer, <span class="s">"name"</span>);
ion_writer_write_string_from_cstr(writer, <span class="s">"vkCmdDraw"</span>);
ion_writer_finish_container(writer);</div>
</div>
<div class="filter-section">
<h4>Filtering &amp; querying</h4>
<div class="code-block"><span class="c"># Ion CLI tool: convert binary ↔ text, inspect, filter</span>
<span class="c"># Install: cargo install ion-cli (or from amazon-ion/ion-cli)</span>
<span class="c"># Binary → human-readable text</span>
ion dump --format text trace.10n
<span class="c"># Binary → JSON (for piping to jq)</span>
ion dump --format json trace.10n | jq <span class="s">'select(.type == "slice_begin")'</span>
<span class="c"># Text form is directly readable — can grep it</span>
grep <span class="s">'slice_begin'</span> trace.ion
grep <span class="s">'vkCmdDraw'</span> trace.ion
<span class="c"># Ion text → Ion binary (for production)</span>
ion dump --format binary trace.ion -o trace.10n
<span class="c"># PartiQL: SQL-compatible query language from Amazon that builds on Ion's type system</span>
<span class="c"># CLI tooling for it is far less common than jq. Illustrative query:</span>
<span class="c"># SELECT * FROM trace WHERE type = 'slice_begin' AND ts > 1000000</span>
<span class="c"># Round-trip: text → binary → text is lossless (except whitespace/comments)</span>
ion dump --format text trace.10n &gt; roundtrip.ion
diff trace.ion roundtrip.ion <span class="c"># identical content, maybe different whitespace</span></div>
</div>
<div class="pros-cons">
<div class="pros"><h4>Strengths</h4><ul>
<li>Dual text/binary — prototype in text, ship binary</li>
<li>JSON superset — any JSON is valid Ion</li>
<li>Built-in symbol tables = automatic interning</li>
<li>Native timestamps with arbitrary precision</li>
<li>Comments in text form</li>
<li>Lossless text↔binary round-trip</li>
<li>Libraries in C, Python, Rust, Java, Go, JS, C#</li>
</ul></div>
<div class="cons"><h4>Weaknesses</h4><ul>
<li>C library (ion-c) is verbose — more boilerplate than CBOR</li>
<li>Less well-known than protobuf or CBOR</li>
<li>Spec more complex than CBOR — steeper curve for implementors</li>
<li>ion-cli exists but is less mature than jq</li>
<li>jq requires Ion→JSON conversion (lossy for Ion-specific types like timestamps)</li>
</ul></div>
</div>
</div>
<!-- ============================================================ -->
<!-- RESP-like -->
<!-- ============================================================ -->
<div class="panel" id="panel-resp">
<h2 style="color: var(--accent-resp);">RESP-like Trace Format</h2>
<p class="desc">Text-based, line-oriented, type-prefixed. The entire C emitter is <code>fprintf</code>. Human-readable, greppable, pipeable.</p>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-resp);"></span> Trace file</h3>
<div class="code-block"><span class="c"># T=track B=begin E=end C=counter I=intern</span>
<span class="t">T</span> <span class="k">uuid</span>=<span class="n">1</span> <span class="k">name</span>=<span class="s">GPU 0</span>
<span class="t">T</span> <span class="k">uuid</span>=<span class="n">2</span> <span class="k">name</span>=<span class="s">Graphics Queue</span> <span class="k">parent</span>=<span class="n">1</span>
<span class="t">T</span> <span class="k">uuid</span>=<span class="n">3</span> <span class="k">name</span>=<span class="s">CmdBuf #42</span> <span class="k">parent</span>=<span class="n">2</span>
<span class="t">B</span> <span class="k">t</span>=<span class="n">3</span> <span class="k">ts</span>=<span class="n">1000000</span> <span class="k">n</span>=<span class="s">vkCmdDraw</span> <span class="k">vertices</span>=<span class="n">36000</span>
<span class="t">E</span> <span class="k">t</span>=<span class="n">3</span> <span class="k">ts</span>=<span class="n">1500000</span>
<span class="t">C</span> <span class="k">t</span>=<span class="n">4</span> <span class="k">ts</span>=<span class="n">1000000</span> <span class="k">v</span>=<span class="n">268435456</span></div>
</div>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-resp);"></span> C emitter — 6 lines</h3>
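<div class="code-block"><span class="w">#include</span> <span class="s">&lt;inttypes.h&gt;</span> <span class="c">// PRIu64: portable uint64_t format (%lu breaks where long is 32-bit)</span>
<span class="w">#include</span> <span class="s">&lt;stdio.h&gt;</span>
<span class="t">void</span> <span class="f">emit_begin</span>(FILE *f, <span class="t">int</span> trk, <span class="t">uint64_t</span> ts, <span class="w">const</span> <span class="t">char</span> *name) {
    fprintf(f, <span class="s">"B\tt=%d\tts=%"</span> PRIu64 <span class="s">"\tn=%s\n"</span>, trk, ts, name);
}
<span class="t">void</span> <span class="f">emit_end</span>(FILE *f, <span class="t">int</span> trk, <span class="t">uint64_t</span> ts) {
    fprintf(f, <span class="s">"E\tt=%d\tts=%"</span> PRIu64 <span class="s">"\n"</span>, trk, ts);
}</div>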
</div>
<div class="filter-section">
<h4>Filtering &amp; querying — Unix native</h4>
<div class="code-block"><span class="c"># Pure Unix tools — no jq needed, no conversion step.</span>
<span class="c"># All begin events (-P, GNU grep, makes \t match a tab; plain grep does not treat \t as a tab)</span>
grep -P <span class="s">'^B\t'</span> trace.ptf
<span class="c"># Begin events on track 3 (anchored so t=30 or parent=3 do not match)</span>
grep -P <span class="s">'^B\t'</span> trace.ptf | grep -P <span class="s">'\tt=3(\t|$)'</span>
<span class="c"># Events with ts between 1ms and 2ms</span>
awk -F<span class="s">'\t'</span> <span class="s">'/^[BCE]/ { for(i=1;i&lt;=NF;i++) if($i ~ /^ts=/) {
split($i,a,"="); if(a[2]&gt;=1000000 &amp;&amp; a[2]&lt;=2000000) print } }'</span> trace.ptf
<span class="c"># Count events by type</span>
cut -f1 trace.ptf | sort | uniq -c | sort -rn
<span class="c"># All unique event names</span>
grep -P <span class="s">'^B\t'</span> trace.ptf | grep -oP <span class="s">'\tn=\K[^\t]*'</span> | sort -u
<span class="c"># Live stream monitoring</span>
tail -f trace.ptf | grep -P <span class="s">'^B\t'</span>
<span class="c"># Total event count</span>
wc -l trace.ptf
<span class="c"># Can also convert to NDJSON for jq, if needed (all values become strings; quotes in values are not escaped):</span>
awk -F<span class="s">'\t'</span> <span class="s">'!/^#/ { printf "{\"type\":\"%s\"", $1; for(i=2;i&lt;=NF;i++) { eq=index($i,"=");
printf ",\"%s\":\"%s\"", substr($i,1,eq-1), substr($i,eq+1) } print "}" }'</span> trace.ptf | jq .</div>
</div>
<div class="pros-cons">
<div class="pros"><h4>Strengths</h4><ul>
<li>Maximally simple: printf/echo is enough</li>
<li>Human-readable — cat, grep, awk, tail -f</li>
<li>Line-oriented = crash-safe, streamable</li>
<li>No parser needed — split on \t, split on =</li>
</ul></div>
<div class="cons"><h4>Weaknesses</h4><ul>
<li>Roughly 1.2-4× larger than the binary formats (see the size comparison)</li>
<li>Flat key=value — no nested args</li>
<li>Tab/newline in values needs escaping</li>
<li>awk queries are ugly compared to jq</li>
<li>Completely bespoke — no standard</li>
</ul></div>
</div>
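<div class="code-section">
<p class="desc">The "split on \t, split on =" claim and the escaping weakness can be sketched together. This is a minimal sketch, assuming a hypothetical backslash-escape convention that the format above does not define; <code>escape</code>, <code>unescape</code>, <code>emit_line</code> and <code>parse_line</code> are illustrative names, not part of any existing tool.</p>

```python
# Hypothetical escaping convention (an assumption, not defined by the format):
# backslash, tab and newline inside a value become the two-character
# sequences \\ , \t and \n so that one event always stays on one line.

def escape(value: str) -> str:
    # Escape the backslash first so the later replacements stay unambiguous.
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n"))


def unescape(value: str) -> str:
    out = []
    chars = iter(value)
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, "\\")  # a trailing lone backslash is kept as-is
        out.append({"t": "\t", "n": "\n"}.get(nxt, nxt))
    return "".join(out)


def emit_line(kind: str, **fields: object) -> str:
    parts = [kind] + [f"{key}={escape(str(val))}" for key, val in fields.items()]
    return "\t".join(parts) + "\n"


def parse_line(line: str) -> tuple[str, dict[str, str]]:
    kind, *pairs = line.rstrip("\n").split("\t")
    fields = {}
    for pair in pairs:
        key, _, raw = pair.partition("=")  # split on the first "=" only
        fields[key] = unescape(raw)
    return kind, fields


line = emit_line("B", t=3, ts=1000000, n="draw\tcall=1")
print(repr(line))
print(parse_line(line))
```

<p class="desc">Splitting on the first "=" only keeps values that themselves contain "=" intact, and every parsed value is a string, so the reader still has to decide which keys are numeric.</p>
</div>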
</div>
<!-- ============================================================ -->
<!-- Arrow IPC -->
<!-- ============================================================ -->
<div class="panel" id="panel-arrow">
<h2 style="color: var(--accent-arrow);">Apache Arrow IPC</h2>
<p class="desc">Columnar batches instead of row-oriented records. Optimized for the reader: Perfetto could <code>mmap()</code> the file and query with near-zero deserialization. Trade-off: must buffer events and flush in batches.</p>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-arrow);"></span> Python emitter</h3>
<div class="code-block"><span class="w">import</span> pyarrow <span class="w">as</span> pa <span class="c"># pip install pyarrow</span>
schema = pa.schema([
(<span class="s">"type"</span>, pa.uint8()), <span class="c"># 0=begin, 1=end, 2=counter</span>
(<span class="s">"track"</span>, pa.int64()),
(<span class="s">"ts"</span>, pa.int64()), <span class="c"># nanoseconds, native int64</span>
(<span class="s">"name"</span>, pa.dictionary(pa.int32(), pa.utf8())), <span class="c"># auto-interned!</span>
(<span class="s">"value"</span>, pa.float64()),
])
batch = pa.record_batch([
pa.array([<span class="n">0</span>, <span class="n">1</span>, <span class="n">2</span>], type=pa.uint8()),
pa.array([<span class="n">3</span>, <span class="n">3</span>, <span class="n">4</span>]),
pa.array([<span class="n">1000000</span>, <span class="n">1500000</span>, <span class="n">1000000</span>]),
pa.array([<span class="s">"vkCmdDraw"</span>, <span class="n">None</span>, <span class="n">None</span>]).dictionary_encode(),
pa.array([<span class="n">None</span>, <span class="n">None</span>, <span class="n">268435456.0</span>]),
], schema=schema)
<span class="w">with</span> pa.OSFile(<span class="s">"trace.arrow"</span>, <span class="s">"wb"</span>) <span class="w">as</span> sink:
    writer = pa.ipc.new_file(sink, schema)
    writer.write_batch(batch)
    writer.close()</div>
</div>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-arrow);"></span> C emitter — using nanoarrow</h3>
<div class="code-block"><span class="c">// nanoarrow: small C library from the Apache Arrow project (bundled as nanoarrow.h + nanoarrow.c)</span>
<span class="w">#include</span> <span class="s">"nanoarrow/nanoarrow.h"</span>
<span class="t">struct</span> ArrowSchema schema;
ArrowSchemaInit(&amp;schema);
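ArrowSchemaSetTypeStruct(&amp;schema, <span class="n">5</span>);
<span class="c">// each child also needs a name, e.g. ArrowSchemaSetName(schema.children[0], "type")</span>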
ArrowSchemaSetType(schema.children[<span class="n">0</span>], NANOARROW_TYPE_UINT8); <span class="c">// type</span>
ArrowSchemaSetType(schema.children[<span class="n">1</span>], NANOARROW_TYPE_INT64); <span class="c">// track</span>
ArrowSchemaSetType(schema.children[<span class="n">2</span>], NANOARROW_TYPE_INT64); <span class="c">// ts</span>
ArrowSchemaSetType(schema.children[<span class="n">3</span>], NANOARROW_TYPE_STRING); <span class="c">// name</span>
ArrowSchemaSetType(schema.children[<span class="n">4</span>], NANOARROW_TYPE_DOUBLE); <span class="c">// value</span>
<span class="t">struct</span> ArrowArray array;
ArrowArrayInitFromSchema(&amp;array, &amp;schema, NULL);
ArrowArrayStartAppending(&amp;array);
ArrowArrayAppendUInt(array.children[<span class="n">0</span>], <span class="n">0</span>); <span class="c">// begin</span>
ArrowArrayAppendInt(array.children[<span class="n">1</span>], <span class="n">3</span>); <span class="c">// track</span>
ArrowArrayAppendInt(array.children[<span class="n">2</span>], <span class="n">1000000</span>); <span class="c">// ts</span>
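<span class="c">// ... append name and value too (ArrowArrayAppendString, ArrowArrayAppendDouble or ArrowArrayAppendNull),</span>
<span class="c">// then ArrowArrayFinishElement(&amp;array) to close the struct row</span>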
ArrowArrayFinishBuildingDefault(&amp;array, NULL);</div>
</div>
<div class="filter-section">
<h4>Filtering &amp; querying — via DuckDB or Python</h4>
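<div class="code-block"><span class="c"># Arrow IPC files can be queried with SQL through DuckDB.</span>
<span class="c"># Stock DuckDB may not read .arrow files directly; the commands below assume an Arrow IPC</span>
<span class="c"># extension (e.g. the nanoarrow community extension) is installed and loaded.</span>
<span class="c"># Install: pip install duckdb / brew install duckdb</span>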
duckdb -c <span class="s">"SELECT * FROM 'trace.arrow' WHERE type = 0 AND ts > 1000000"</span>
duckdb -c <span class="s">"SELECT name, COUNT(*) as cnt FROM 'trace.arrow'
WHERE type = 0 GROUP BY name ORDER BY cnt DESC"</span>
<span class="c"># Or in Python:</span>
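<span class="w">import</span> pyarrow.compute <span class="w">as</span> pc
<span class="w">import</span> pyarrow.ipc <span class="w">as</span> ipc
reader = ipc.open_file(<span class="s">"trace.arrow"</span>)
table = reader.read_all()
begins = table.filter(pc.equal(table.column(<span class="s">"type"</span>), <span class="n">0</span>)) <span class="c"># "==" on a column does not build a boolean mask</span>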
<span class="c"># Convert to pandas for familiar filtering:</span>
df = table.to_pandas()
df[df[<span class="s">"type"</span>] == <span class="n">0</span>].sort_values(<span class="s">"ts"</span>)
<span class="c"># Convert to JSON for jq (slow on large traces, but works):</span>
python -c <span class="s">"import pyarrow.ipc as i, json, sys
for row in i.open_file('trace.arrow').read_all().to_pylist():
    json.dump(row, sys.stdout); print()"</span> | jq .
<span class="c"># No direct jq-style CLI tool for Arrow files.
# But DuckDB's SQL is arguably more powerful than jq for analytics.</span></div>
</div>
<div class="pros-cons">
<div class="pros"><h4>Strengths</h4><ul>
<li>Fastest read: mmap + zero-copy, zero deserialization</li>
<li>Columnar = perfect for Perfetto's SQL engine</li>
<li>DuckDB gives you SQL queries over Arrow data (.arrow files may need an extension)</li>
<li>Dictionary encoding = built-in interning</li>
<li>Massive ecosystem (Pandas, Spark, Polars, DuckDB)</li>
</ul></div>
<div class="cons"><h4>Weaknesses</h4><ul>
<li>Must buffer events and flush as batches</li>
<li>Schema declared upfront — no ad-hoc fields per event</li>
<li>Not human-readable at all</li>
<li>No jq — need DuckDB or Python for queries</li>
<li>Conceptually different (columnar, not row-oriented)</li>
</ul></div>
</div>
</div>
<!-- ============================================================ -->
<!-- Chrome JSON -->
<!-- ============================================================ -->
<div class="panel" id="panel-chrome-json">
<h2 style="color: var(--accent-chrome);">Chrome JSON (status quo)</h2>
<p class="desc">The current de facto standard. Dead simple, but limited to Chrome's pid/tid model.</p>
<div class="code-section">
<h3><span class="dot" style="background: var(--accent-chrome);"></span> Trace file</h3>
<div class="code-block"><span class="p">{</span> <span class="s">"traceEvents"</span><span class="p">: [</span>
<span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"process_name"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"M"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"args"</span><span class="p">:{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"GPU 0"</span><span class="p">}},</span>
<span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"thread_name"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"M"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"tid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span>
<span class="s">"args"</span><span class="p">:{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"Graphics Queue"</span><span class="p">}},</span>
<span class="c">// CmdBuf #42 can't nest under Queue — only 2 levels!</span>
<span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"vkCmdDraw"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"B"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"tid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"ts"</span><span class="p">:</span><span class="n">1000</span><span class="p">,</span>
<span class="s">"args"</span><span class="p">:{</span><span class="s">"vertices"</span><span class="p">:</span><span class="n">36000</span><span class="p">}},</span>
<span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"vkCmdDraw"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"E"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"tid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"ts"</span><span class="p">:</span><span class="n">1500</span><span class="p">},</span>
<span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"VRAM"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"C"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"ts"</span><span class="p">:</span><span class="n">1000</span><span class="p">,</span>
<span class="s">"args"</span><span class="p">:{</span><span class="s">"VRAM"</span><span class="p">:</span><span class="n">268435456</span><span class="p">}}</span>
<span class="p">]}</span></div>
</div>
<div class="filter-section">
<h4>Filtering &amp; querying</h4>
<div class="code-block"><span class="c"># jq must parse the entire file before filtering (one top-level object wrapping one big array)</span>
jq <span class="s">'.traceEvents[] | select(.ph == "B")'</span> trace.json
<span class="c"># Events by pid/tid</span>
jq <span class="s">'.traceEvents[] | select(.pid == 1 and .tid == 1)'</span> trace.json
<span class="c"># Problem: large traces may not fit in memory (jq --stream exists but is awkward to use)</span>
<span class="c"># Problem: can't stream — wrapping array requires full parse</span>
<span class="c"># Problem: can't grep — events span multiple lines in pretty-printed form</span></div>
</div>
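<div class="code-section">
<p class="desc">One workaround for the wrapping-array problem is a one-time flatten to one event per line, after which grep and line-at-a-time jq work. The following is a minimal sketch using only the Python standard library; <code>flatten_chrome_trace</code> is a hypothetical helper name, and the conversion itself still needs one full parse of the input.</p>

```python
import io
import json


def flatten_chrome_trace(src, dst) -> int:
    """Write each trace event as one compact JSON line; return the event count."""
    doc = json.load(src)  # still a full parse: the wrapper object forces it
    # The Chrome format also allows a bare top-level array of events.
    events = doc["traceEvents"] if isinstance(doc, dict) else doc
    for event in events:
        dst.write(json.dumps(event, separators=(",", ":")) + "\n")
    return len(events)


# Usage with in-memory files; real code would pass opened files instead.
src = io.StringIO(
    '{"traceEvents": ['
    '{"name": "vkCmdDraw", "ph": "B", "pid": 1, "tid": 1, "ts": 1000},'
    '{"name": "vkCmdDraw", "ph": "E", "pid": 1, "tid": 1, "ts": 1500}]}'
)
dst = io.StringIO()
count = flatten_chrome_trace(src, dst)
print(count)
print(dst.getvalue(), end="")
```

<p class="desc">The output keeps Chrome's pid/tid model unchanged; it only removes the streaming and grep obstacles, not the two-level nesting limit.</p>
</div>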
</div>
<!-- ============================================================ -->
<!-- Size comparison -->
<!-- ============================================================ -->
<div class="wire-section">
<h4>Approximate size per event (slice_begin, pre-compression)</h4>
<div class="size-bar">
<span class="label">Protobuf</span>
<div class="bar" style="width: 60px; background: #666;">~19 B</div>
</div>
<div class="size-bar">
<span class="label">Protobuf+intern</span>
<div class="bar" style="width: 35px; background: #555;">~10 B</div>
</div>
<div class="size-bar">
<span class="label">CBOR int-keys</span>
<div class="bar" style="width: 70px; background: var(--accent-cbor-i);">~22 B</div>
</div>
<div class="size-bar">
<span class="label">Arrow IPC</span>
<div class="bar" style="width: 75px; background: var(--accent-arrow);">~8 B/row*</div>
</div>
<div class="size-bar">
<span class="label">CBOR str-keys</span>
<div class="bar" style="width: 110px; background: var(--accent-cbor-s);">~34 B</div>
</div>
<div class="size-bar">
<span class="label">Ion binary</span>
<div class="bar" style="width: 100px; background: var(--accent-ion);">~30 B</div>
</div>
<div class="size-bar">
<span class="label">RESP-like</span>
<div class="bar" style="width: 130px; background: var(--accent-resp);">~40 B</div>
</div>
<div class="size-bar">
<span class="label">NDJSON</span>
<div class="bar" style="width: 160px; background: var(--accent-ndjson);">~50 B</div>
</div>
<div class="size-bar">
<span class="label">Chrome JSON</span>
<div class="bar" style="width: 180px; background: var(--accent-chrome);">~56 B</div>
</div>
<div class="size-bar">
<span class="label">Ion text</span>
<div class="bar" style="width: 170px; background: var(--accent-ion); opacity:0.5;">~52 B</div>
</div>
<p style="color: var(--text-dim); font-size: 0.74rem; margin-top: 8px;">
Per-event bytes for a slice_begin with track=3, ts=1000000, name="vkCmdDraw". *Arrow IPC amortizes schema overhead across thousands of rows — per-row cost drops dramatically at scale. All formats compress similarly with zstd.
</p>
</div>
<!-- ============================================================ -->
<!-- Bottom -->
<!-- ============================================================ -->
<div class="bottom-note">
<h3>Summary</h3>
<p><strong style="color: var(--accent-ndjson);">NDJSON</strong> — best tooling (native jq/grep/awk), zero deps, but among the largest on disk. The "boring but correct" choice.</p>
<p><strong style="color: var(--accent-ion);">Ion</strong> — best of both worlds: human-readable text for debugging, compact binary for production. Built-in interning. Tooling maturing.</p>
<p><strong style="color: var(--accent-cbor-s);">CBOR str-keys</strong> — binary JSON, lossless jq round-trip, ~35% smaller. Modest savings for losing readability.</p>
<p><strong style="color: var(--accent-cbor-i);">CBOR int-keys</strong> — near-protobuf compactness, but requires a schema and lacks protobuf's tooling. Awkward middle ground.</p>
<p><strong style="color: var(--accent-arrow);">Arrow IPC</strong> — fastest reads, DuckDB gives SQL queries, ideal if Perfetto's trace processor consumed it directly.</p>
<p><strong style="color: var(--accent-resp);">RESP-like</strong> — nuclear simplicity for systems/embedded devs who want <code>fprintf</code>-and-done.</p>
<p style="color: var(--text); font-weight: 500; margin-top: 10px;">Core insight: the thread keeps framing this as a serialization problem, but it's a <em>data model</em> problem. Design the middle-ground data model first (tracks with nesting, slices, counters, args-as-maps), then pick the encoding.</p>
</div>
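<div class="code-section">
<p>To make the "data model first" point concrete, here is a minimal sketch of such a model as Python dataclasses, with one possible encoding layered on top. The class names, field names and the <code>to_ndjson</code> and <code>track_path</code> helpers are assumptions made for illustration, not a proposed schema.</p>

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Track:
    uuid: int
    name: str
    parent: Optional[int] = None  # parent links allow arbitrary nesting depth


@dataclass
class SliceBegin:
    track: int
    ts: int  # nanoseconds
    name: str
    args: dict = field(default_factory=dict)  # args-as-map


@dataclass
class SliceEnd:
    track: int
    ts: int


@dataclass
class Counter:
    track: int
    ts: int
    value: float


def to_ndjson(event) -> str:
    # One possible encoding: a JSON object per line, tagged with the class name.
    record = {"type": type(event).__name__, **vars(event)}
    return json.dumps(record, separators=(",", ":"))


def track_path(tracks: dict, uuid: int) -> list:
    # Walk parent links to recover the full nesting, root first.
    path = []
    current = uuid
    while current is not None:
        track = tracks[current]
        path.append(track.name)
        current = track.parent
    return path[::-1]


tracks = {t.uuid: t for t in [
    Track(1, "GPU 0"),
    Track(2, "Graphics Queue", parent=1),
    Track(3, "CmdBuf #42", parent=2),
]}
print(track_path(tracks, 3))
print(to_ndjson(SliceBegin(track=3, ts=1000000, name="vkCmdDraw",
                           args={"vertices": 36000})))
```

<p>Swapping <code>to_ndjson</code> for a CBOR, Ion or tab-separated encoder would leave the dataclasses untouched, which is the sense in which the encoding is the secondary decision.</p>
</div>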
</div>
<script>
document.querySelectorAll('.tab').forEach(tab => {
tab.addEventListener('click', () => {
document.querySelectorAll('.tab').forEach(t => t.classList.remove('active'));
document.querySelectorAll('.panel').forEach(p => p.classList.remove('active'));
tab.classList.add('active');
document.getElementById('panel-' + tab.dataset.tab).classList.add('active');
});
});
</script>
</body>
</html>