Created
March 21, 2026 08:35
Trace Formats Evaluation https://github.com/google/perfetto/issues/3491
| <!DOCTYPE html> | |
| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>Trace Format Comparison for Perfetto</title> | |
| <link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500;700&family=Source+Serif+4:opsz,wght@8..60,400;8..60,600;8..60,700&family=DM+Sans:wght@400;500;600&display=swap" rel="stylesheet"> | |
| <style> | |
| :root { | |
| --bg: #0c0c0e; | |
| --surface: #16161a; | |
| --surface2: #1e1e24; | |
| --border: #2a2a32; | |
| --text: #e2e0dc; | |
| --text-dim: #8a8880; | |
| --accent-cbor-s: #f0a040; | |
| --accent-cbor-i: #60d0a0; | |
| --accent-resp: #40c8f0; | |
| --accent-ndjson: #a0e060; | |
| --accent-ion: #c080f0; | |
| --accent-arrow: #f06080; | |
| --accent-chrome: #888; | |
| --code-bg: #111114; | |
| --radius: 8px; | |
| } | |
| * { margin: 0; padding: 0; box-sizing: border-box; } | |
| body { background: var(--bg); color: var(--text); font-family: 'DM Sans', sans-serif; line-height: 1.7; } | |
| .hero { padding: 80px 48px 60px; max-width: 1200px; margin: 0 auto; } | |
| .hero h1 { font-family: 'Source Serif 4', serif; font-size: 2.4rem; font-weight: 700; letter-spacing: -0.03em; line-height: 1.15; margin-bottom: 16px; } | |
| .hero .subtitle { color: var(--text-dim); font-size: 1.05rem; max-width: 750px; margin-bottom: 32px; } | |
| .scenario-banner { background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 20px 28px; margin-bottom: 24px; } | |
| .scenario-banner h3 { font-family: 'JetBrains Mono', monospace; font-size: 0.78rem; text-transform: uppercase; letter-spacing: 0.1em; color: var(--text-dim); margin-bottom: 8px; } | |
| .scenario-banner p { font-size: 0.93rem; } | |
| .content { max-width: 1200px; margin: 0 auto; padding: 0 48px 80px; } | |
| .tabs { display: flex; gap: 3px; margin-bottom: 0; position: sticky; top: 0; z-index: 10; background: var(--bg); padding: 8px 0 0; flex-wrap: wrap; } | |
| .tab { padding: 9px 14px; border: 1px solid var(--border); border-bottom: none; border-radius: var(--radius) var(--radius) 0 0; background: var(--surface); color: var(--text-dim); font-family: 'JetBrains Mono', monospace; font-size: 0.74rem; font-weight: 500; cursor: pointer; transition: all 0.2s; user-select: none; white-space: nowrap; } | |
| .tab:hover { color: var(--text); background: var(--surface2); } | |
| .tab.active { color: var(--text); background: var(--surface2); } | |
| .tab[data-tab="cbor-s"].active { border-top: 2px solid var(--accent-cbor-s); } | |
| .tab[data-tab="cbor-i"].active { border-top: 2px solid var(--accent-cbor-i); } | |
| .tab[data-tab="resp"].active { border-top: 2px solid var(--accent-resp); } | |
| .tab[data-tab="ndjson"].active { border-top: 2px solid var(--accent-ndjson); } | |
| .tab[data-tab="ion"].active { border-top: 2px solid var(--accent-ion); } | |
| .tab[data-tab="arrow"].active { border-top: 2px solid var(--accent-arrow); } | |
| .tab[data-tab="chrome-json"].active { border-top: 2px solid var(--accent-chrome); } | |
| .panel { display: none; background: var(--surface2); border: 1px solid var(--border); border-radius: 0 var(--radius) var(--radius) var(--radius); padding: 32px; } | |
| .panel.active { display: block; } | |
| .panel h2 { font-family: 'Source Serif 4', serif; font-size: 1.4rem; font-weight: 600; margin-bottom: 8px; } | |
| .panel .desc { color: var(--text-dim); font-size: 0.88rem; margin-bottom: 24px; max-width: 750px; } | |
| .code-section { margin-bottom: 24px; } | |
| .code-section h3 { font-family: 'JetBrains Mono', monospace; font-size: 0.74rem; text-transform: uppercase; letter-spacing: 0.1em; color: var(--text-dim); margin-bottom: 8px; display: flex; align-items: center; gap: 8px; } | |
| .code-section h3 .dot { width: 8px; height: 8px; border-radius: 50%; display: inline-block; } | |
| .code-block { background: var(--code-bg); border: 1px solid var(--border); border-radius: var(--radius); padding: 16px 20px; overflow-x: auto; font-family: 'JetBrains Mono', monospace; font-size: 0.78rem; line-height: 1.7; white-space: pre; color: var(--text); tab-size: 2; } | |
| .code-block .c { color: #666; font-style: italic; } | |
| .code-block .k { color: #e0a060; } | |
| .code-block .s { color: #a0d870; } | |
| .code-block .n { color: #70b0e0; } | |
| .code-block .t { color: #d080c0; } | |
| .code-block .p { color: #888; } | |
| .code-block .w { color: #e07070; } | |
| .code-block .f { color: #60c0d0; } | |
| .pros-cons { display: grid; grid-template-columns: 1fr 1fr; gap: 14px; margin-top: 18px; } | |
| .pros, .cons { background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 16px; } | |
| .pros h4, .cons h4 { font-family: 'JetBrains Mono', monospace; font-size: 0.72rem; text-transform: uppercase; letter-spacing: 0.08em; margin-bottom: 8px; } | |
| .pros h4 { color: #80d060; } | |
| .cons h4 { color: #e06060; } | |
| .pros li, .cons li { font-size: 0.84rem; color: var(--text-dim); margin-bottom: 4px; list-style: none; padding-left: 16px; position: relative; } | |
| .pros li::before { content: '+'; position: absolute; left: 0; color: #80d060; font-weight: 700; } | |
| .cons li::before { content: '−'; position: absolute; left: 0; color: #e06060; font-weight: 700; } | |
| .filter-section { margin-top: 20px; background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 18px; } | |
| .filter-section h4 { font-family: 'JetBrains Mono', monospace; font-size: 0.72rem; text-transform: uppercase; letter-spacing: 0.08em; color: var(--text-dim); margin-bottom: 10px; } | |
| .filter-section .code-block { margin-bottom: 0; font-size: 0.76rem; } | |
| .wire-section { margin-top: 28px; background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); padding: 18px; } | |
| .wire-section h4 { font-family: 'JetBrains Mono', monospace; font-size: 0.72rem; text-transform: uppercase; letter-spacing: 0.08em; color: var(--text-dim); margin-bottom: 10px; } | |
| .size-bar { display: flex; align-items: center; gap: 10px; margin-bottom: 6px; } | |
| .size-bar .label { font-family: 'JetBrains Mono', monospace; font-size: 0.74rem; color: var(--text-dim); width: 130px; text-align: right; flex-shrink: 0; } | |
| .size-bar .bar { height: 18px; border-radius: 4px; min-width: 20px; display: flex; align-items: center; padding-left: 8px; font-family: 'JetBrains Mono', monospace; font-size: 0.68rem; color: #fff; font-weight: 600; } | |
| .bottom-note { margin-top: 36px; padding: 22px 26px; background: var(--surface); border: 1px solid var(--border); border-radius: var(--radius); } | |
| .bottom-note h3 { font-family: 'Source Serif 4', serif; font-size: 1.1rem; margin-bottom: 10px; } | |
| .bottom-note p { color: var(--text-dim); font-size: 0.88rem; margin-bottom: 7px; } | |
| @media (max-width: 768px) { | |
| .hero { padding: 36px 16px 24px; } | |
| .hero h1 { font-size: 1.4rem; } | |
| .content { padding: 0 12px 40px; } | |
| .panel { padding: 16px; } | |
| .pros-cons { grid-template-columns: 1fr; } | |
| .tab { font-size: 0.66rem; padding: 7px 10px; } | |
| } | |
| </style> | |
| </head> | |
| <body> | |
| <div class="hero"> | |
| <h1>Alternative trace formats for Perfetto</h1> | |
| <p class="subtitle">Concrete examples with emitter code in C and Python, plus filtering/query examples. Context: <code>google/perfetto#3491</code>.</p> | |
| <div class="scenario-banner"> | |
| <h3>Scenario modelled in every tab</h3> | |
| <p>A GPU workload trace: nested track groups (GPU → Queue → CmdBuf), slices, a counter (VRAM usage). This is exactly the kind of thing Chrome JSON handles poorly and protobuf handles verbosely.</p> | |
| </div> | |
| </div> | |
| <div class="content"> | |
| <div class="tabs"> | |
| <div class="tab active" data-tab="cbor-s">CBOR (string keys)</div> | |
| <div class="tab" data-tab="cbor-i">CBOR (int keys)</div> | |
| <div class="tab" data-tab="ndjson">NDJSON</div> | |
| <div class="tab" data-tab="ion">Amazon Ion</div> | |
| <div class="tab" data-tab="resp">RESP-like</div> | |
| <div class="tab" data-tab="arrow">Arrow IPC</div> | |
| <div class="tab" data-tab="chrome-json">Chrome JSON</div> | |
| </div> | |
| <!-- ============================================================ --> | |
| <!-- CBOR STRING KEYS --> | |
| <!-- ============================================================ --> | |
| <div class="panel active" id="panel-cbor-s"> | |
| <h2 style="color: var(--accent-cbor-s);">CBOR — String Keys (JSON-compatible)</h2> | |
| <p class="desc">CBOR (RFC 8949) with UTF-8 string map keys. Looks like JSON's data model, encodes to binary. Round-trips losslessly to JSON, so the full <code>jq</code> ecosystem works via a simple pipe. ~30-40% smaller than JSON.</p> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-cbor-s);"></span> Python emitter</h3> | |
| <div class="code-block"><span class="w">import</span> cbor2 <span class="c"># pip install cbor2</span> | |
| <span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.cbor"</span>, <span class="s">"wb"</span>) <span class="w">as</span> f: | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">1</span>, <span class="s">"name"</span>: <span class="s">"GPU 0"</span>}, f) | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">2</span>, <span class="s">"name"</span>: <span class="s">"Graphics Queue"</span>, <span class="s">"parent"</span>: <span class="n">1</span>}, f) | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">3</span>, <span class="s">"name"</span>: <span class="s">"CmdBuf #42"</span>, <span class="s">"parent"</span>: <span class="n">2</span>}, f) | |
|     <span class="c"># Counter track that the "C" event below refers to (same shape as the NDJSON tab)</span> | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">4</span>, <span class="s">"name"</span>: <span class="s">"VRAM"</span>, <span class="s">"parent"</span>: <span class="n">1</span>, <span class="s">"counter"</span>: <span class="s">"bytes"</span>}, f) | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"B"</span>, <span class="s">"t"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>, | |
|                 <span class="s">"name"</span>: <span class="s">"vkCmdDraw"</span>, <span class="s">"args"</span>: {<span class="s">"vertices"</span>: <span class="n">36000</span>}}, f) | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"E"</span>, <span class="s">"t"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1500000</span>}, f) | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"C"</span>, <span class="s">"t"</span>: <span class="n">4</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>, <span class="s">"v"</span>: <span class="n">268435456</span>}, f)</div> | |
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-cbor-s);"></span> C emitter — ~50 lines, zero deps</h3> | |
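| <div class="code-block"><span class="c">// Needs stdint.h, stdio.h and string.h.</span> | |
| <span class="c">// Map header; this short form only covers n <= 23 key/value pairs.</span> | |
| <span class="t">void</span> <span class="f">cbor_map</span>(FILE *f, <span class="t">int</span> n) { fputc(<span class="n">0xA0</span> | n, f); } | |
| <span class="c">// Unsigned integer in the shortest CBOR form; argument bytes are big-endian.</span> | |
| <span class="t">void</span> <span class="f">cbor_uint</span>(FILE *f, <span class="t">uint64_t</span> v) { | |
|   <span class="w">if</span> (v <= <span class="n">23</span>) fputc((<span class="t">int</span>)v, f); | |
|   <span class="w">else if</span> (v <= <span class="n">0xFF</span>) { fputc(<span class="n">0x18</span>, f); fputc((<span class="t">int</span>)v, f); } | |
|   <span class="w">else if</span> (v <= <span class="n">0xFFFF</span>) { fputc(<span class="n">0x19</span>, f); fputc((<span class="t">int</span>)(v >> <span class="n">8</span>), f); fputc((<span class="t">int</span>)(v & <span class="n">0xFF</span>), f); } | |
|   <span class="w">else</span> { | |
|     <span class="t">int</span> bytes = v <= <span class="n">0xFFFFFFFF</span> ? <span class="n">4</span> : <span class="n">8</span>; <span class="c">// 0x1A = 4-byte, 0x1B = 8-byte argument</span> | |
|     fputc(bytes == <span class="n">4</span> ? <span class="n">0x1A</span> : <span class="n">0x1B</span>, f); | |
|     <span class="w">for</span> (<span class="t">int</span> i = bytes - <span class="n">1</span>; i >= <span class="n">0</span>; i--) fputc((<span class="t">int</span>)((v >> (<span class="n">8</span> * i)) & <span class="n">0xFF</span>), f); | |
|   } | |
| } | |
| <span class="c">// Text string; this sketch only covers lengths up to 255 bytes.</span> | |
| <span class="t">void</span> <span class="f">cbor_str</span>(FILE *f, <span class="w">const</span> <span class="t">char</span> *s) { | |
|   <span class="t">size_t</span> n = strlen(s); | |
|   <span class="w">if</span> (n <= <span class="n">23</span>) fputc((<span class="t">int</span>)(<span class="n">0x60</span>|n), f); <span class="w">else</span> { fputc(<span class="n">0x78</span>, f); fputc((<span class="t">int</span>)n, f); } | |
|   fwrite(s, <span class="n">1</span>, n, f); | |
| } | |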
| <span class="c">// Emit one slice-begin:</span> | |
| <span class="f">cbor_map</span>(f, <span class="n">4</span>); | |
| <span class="f">cbor_str</span>(f,<span class="s">"type"</span>); <span class="f">cbor_str</span>(f,<span class="s">"B"</span>); | |
| <span class="f">cbor_str</span>(f,<span class="s">"t"</span>); <span class="f">cbor_uint</span>(f,<span class="n">3</span>); | |
| <span class="f">cbor_str</span>(f,<span class="s">"ts"</span>); <span class="f">cbor_uint</span>(f,<span class="n">1000000</span>); | |
| <span class="f">cbor_str</span>(f,<span class="s">"name"</span>); <span class="f">cbor_str</span>(f,<span class="s">"vkCmdDraw"</span>);</div> | |
| </div> | |
| <div class="filter-section"> | |
| <h4>Filtering & querying</h4> | |
| <div class="code-block"><span class="c"># CBOR → JSON is lossless for string keys. Full jq works.</span> | |
| <span class="c"># Pretty-print entire trace</span> | |
| cat trace.cbor | cbor2 --sequence --pretty | |
| <span class="c"># Find all slice-begin events</span> | |
| cat trace.cbor | cbor2 -d --sequence | jq <span class="s">'select(.type == "B")'</span> | |
| <span class="c"># Get all events on track 3 with timestamps > 1.2ms</span> | |
| cat trace.cbor | cbor2 -d --sequence | jq <span class="s">'select(.t == 3 and .ts > 1200000)'</span> | |
| <span class="c"># List unique event names</span> | |
| cat trace.cbor | cbor2 -d --sequence | jq -r <span class="s">'select(.name) | .name'</span> | sort -u | |
| <span class="c"># Count events per type</span> | |
| cat trace.cbor | cbor2 -d --sequence | jq -s <span class="s">'group_by(.type) | map({type: .[0].type, count: length})'</span> | |
| <span class="c"># Filter and re-encode back to CBOR</span> | |
| cat trace.cbor | cbor2 -d --seq | jq -c <span class="s">'select(.ts >= 1000000 and .ts <= 2000000)'</span> | cbor2 -e > filtered.cbor | |
| <span class="c"># Inspect raw bytes</span> | |
| cat trace.cbor | cbor-diag <span class="c"># Ruby: gem install cbor-diag</span> | |
| <span class="c"># or</span> | |
| cbor-diag trace.cbor <span class="c"># Rust: cargo install cbor-diag-cli</span></div> | |
| </div> | |
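| <p class="desc">The emitters above write one CBOR item after another with no outer wrapper, which is what makes the file a CBOR Sequence. As a rough sketch of what a reader does with that, the pure-Python decoder below handles only the three item kinds this tab emits (unsigned integers, text strings, maps) and stops quietly when the stream ends inside an item. The function names and the sample bytes are illustrative assumptions, not part of any library.</p> | |

```python
import io

def read_exact(stream, size):
    """Read exactly `size` bytes or raise EOFError if the stream ends first."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("stream ended inside an item")
    return data

def read_item(stream):
    """Decode one CBOR item (unsigned int, text string or map only)."""
    first = read_exact(stream, 1)[0]
    major, info = first >> 5, first & 0x1F
    if info <= 23:
        arg = info                      # argument stored in the head byte
    elif info <= 27:
        arg = int.from_bytes(read_exact(stream, 1 << (info - 24)), "big")
    else:
        raise ValueError("indefinite-length items are outside this sketch")
    if major == 0:
        return arg
    if major == 3:
        return read_exact(stream, arg).decode("utf-8")
    if major == 5:
        result = {}
        for _ in range(arg):
            key = read_item(stream)
            result[key] = read_item(stream)
        return result
    raise ValueError(f"major type {major} is outside this sketch")

def read_sequence(stream):
    """Yield top-level items until the stream ends, even mid-item."""
    while True:
        try:
            yield read_item(stream)
        except EOFError:
            return

# Hand-assembled bytes: the slice-begin from the C emitter, a slice-end,
# and the first 10 bytes of another event to simulate a crash mid-write.
event_b = bytes.fromhex(
    "a4 64 74 79 70 65 61 42 61 74 03 62 74 73 1a 00 0f 42 40 64 6e 61 6d 65 69"
) + b"vkCmdDraw"
event_e = bytes.fromhex("a3 64 74 79 70 65 61 45 61 74 03 62 74 73 1a 00 16 e3 60")
torn = event_b[:10]

for event in read_sequence(io.BytesIO(event_b + event_e + torn)):
    print(event)
```

| <p class="desc">Both complete events are recovered and the partial third one is dropped, so a trace cut short by a crash stays readable up to the last complete item.</p> | |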
| <div class="pros-cons"> | |
| <div class="pros"><h4>Strengths</h4><ul> | |
| <li>JSON data model — trivially learnable</li> | |
| <li>Lossless JSON round-trip → full jq/grep ecosystem</li> | |
| <li>CBOR Sequences = crash-safe streaming</li> | |
| <li>~35% smaller than JSON, native uint64 timestamps</li> | |
| <li>Tiny encoder, IETF standard (RFC 8949)</li> | |
| </ul></div> | |
| <div class="cons"><h4>Weaknesses</h4><ul> | |
| <li>Not human-readable in raw form</li> | |
| <li>Only ~35% smaller — string keys are most of the overhead</li> | |
| <li>No built-in interning</li> | |
| <li>Requires <code>cbor2</code> in the pipeline (pip/gem install)</li> | |
| </ul></div> | |
| </div> | |
| </div> | |
| <!-- ============================================================ --> | |
| <!-- CBOR INTEGER KEYS --> | |
| <!-- ============================================================ --> | |
| <div class="panel" id="panel-cbor-i"> | |
| <h2 style="color: var(--accent-cbor-i);">CBOR — Integer Keys (CTAP2-style)</h2> | |
| <p class="desc">Same CBOR spec (RFC 8949), but using integer map keys instead of strings — the same approach used by FIDO2/CTAP2 security keys and COSE. Gets within ~15% of protobuf compactness. <strong>Requires an external schema</strong> to map key numbers to field names.</p> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-cbor-i);"></span> Schema definition (must be maintained alongside the spec)</h3> | |
| <div class="code-block"><span class="c">// Field number → name mapping (a .json or .cddl file you maintain)</span> | |
| <span class="c">// This is analogous to a .proto file.</span> | |
| { | |
| <span class="s">"event_fields"</span>: { | |
| <span class="s">"1"</span>: <span class="s">"type"</span>, <span class="c">// 0=begin, 1=end, 2=counter, 3=instant</span> | |
| <span class="s">"2"</span>: <span class="s">"track"</span>, | |
| <span class="s">"3"</span>: <span class="s">"ts"</span>, <span class="c">// nanoseconds</span> | |
| <span class="s">"4"</span>: <span class="s">"name"</span>, | |
| <span class="s">"5"</span>: <span class="s">"value"</span>, <span class="c">// counter value</span> | |
| <span class="s">"6"</span>: <span class="s">"name_iid"</span>, <span class="c">// interned string ref</span> | |
| <span class="s">"7"</span>: <span class="s">"args"</span> <span class="c">// nested map (string keys ok here)</span> | |
| }, | |
| <span class="s">"track_fields"</span>: { | |
| <span class="s">"1"</span>: <span class="s">"uuid"</span>, | |
| <span class="s">"2"</span>: <span class="s">"name"</span>, | |
| <span class="s">"3"</span>: <span class="s">"parent"</span>, | |
| <span class="s">"4"</span>: <span class="s">"counter_unit"</span> | |
| }, | |
| <span class="s">"type_enum"</span>: { <span class="s">"0"</span>: <span class="s">"begin"</span>, <span class="s">"1"</span>: <span class="s">"end"</span>, <span class="s">"2"</span>: <span class="s">"counter"</span>, <span class="s">"3"</span>: <span class="s">"instant"</span> } | |
| }</div> | |
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-cbor-i);"></span> Python emitter</h3> | |
| <div class="code-block"><span class="w">import</span> cbor2 | |
| <span class="c"># Field constants (from schema)</span> | |
| F_TYPE, F_TRACK, F_TS, F_NAME, F_VAL = <span class="n">1</span>, <span class="n">2</span>, <span class="n">3</span>, <span class="n">4</span>, <span class="n">5</span> | |
| T_BEGIN, T_END, T_COUNTER = <span class="n">0</span>, <span class="n">1</span>, <span class="n">2</span> | |
| <span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.cbor"</span>, <span class="s">"wb"</span>) <span class="w">as</span> f: | |
|     <span class="c"># Track defs can still use string keys (human-readable, infrequent)</span> | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">1</span>, <span class="s">"name"</span>: <span class="s">"GPU 0"</span>}, f) | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">2</span>, <span class="s">"name"</span>: <span class="s">"Graphics Queue"</span>, <span class="s">"parent"</span>: <span class="n">1</span>}, f) | |
|     <span class="c"># Tracks 3 and 4 are referenced by the events below, so define them too</span> | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">3</span>, <span class="s">"name"</span>: <span class="s">"CmdBuf #42"</span>, <span class="s">"parent"</span>: <span class="n">2</span>}, f) | |
|     cbor2.dump({<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">4</span>, <span class="s">"name"</span>: <span class="s">"VRAM"</span>, <span class="s">"parent"</span>: <span class="n">1</span>, <span class="s">"counter_unit"</span>: <span class="s">"bytes"</span>}, f) | |
|     <span class="c"># Hot-path events use integer keys (compact, millions of these)</span> | |
|     cbor2.dump({F_TYPE: T_BEGIN, F_TRACK: <span class="n">3</span>, F_TS: <span class="n">1000000</span>, F_NAME: <span class="s">"vkCmdDraw"</span>}, f) | |
|     cbor2.dump({F_TYPE: T_END, F_TRACK: <span class="n">3</span>, F_TS: <span class="n">1500000</span>}, f) | |
|     cbor2.dump({F_TYPE: T_COUNTER, F_TRACK: <span class="n">4</span>, F_TS: <span class="n">1000000</span>, F_VAL: <span class="n">268435456</span>}, f)</div> | |
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-cbor-i);"></span> C emitter</h3> | |
| <div class="code-block"><span class="c">// Same cbor_uint/cbor_str helpers as string-key variant.</span> | |
| <span class="c">// Only difference: keys are integers, not strings.</span> | |
| <span class="f">cbor_map</span>(f, <span class="n">4</span>); | |
| <span class="f">cbor_uint</span>(f, <span class="n">1</span>); <span class="f">cbor_uint</span>(f, <span class="n">0</span>); <span class="c">// 1=type, 0=BEGIN</span> | |
| <span class="f">cbor_uint</span>(f, <span class="n">2</span>); <span class="f">cbor_uint</span>(f, <span class="n">3</span>); <span class="c">// 2=track</span> | |
| <span class="f">cbor_uint</span>(f, <span class="n">3</span>); <span class="f">cbor_uint</span>(f, <span class="n">1000000</span>); <span class="c">// 3=ts</span> | |
| <span class="f">cbor_uint</span>(f, <span class="n">4</span>); <span class="f">cbor_str</span>(f, <span class="s">"vkCmdDraw"</span>); <span class="c">// 4=name</span> | |
| <span class="c">// Bytes on wire: A4 01 00 02 03 03 1A000F4240 04 69"vkCmdDraw"</span> | |
| <span class="c">// Total: 22 bytes (vs 34 string-key, vs 19 protobuf)</span></div> | |
| </div> | |
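| <p class="desc">The CBOR byte counts quoted above (22 with integer keys, 34 with string keys) can be checked by hand-encoding the same event. This is a sketch that assumes only unsigned integers, text strings and maps are needed; <code>head</code> and <code>encode</code> are names made up for this example, and the protobuf figure is not recomputed here.</p> | |

```python
import struct

def head(major: int, value: int) -> bytes:
    """CBOR head: 3-bit major type plus the shortest-form argument (RFC 8949)."""
    m = major << 5
    if value <= 23:
        return bytes([m | value])
    if value <= 0xFF:
        return bytes([m | 24, value])
    if value <= 0xFFFF:
        return bytes([m | 25]) + struct.pack(">H", value)
    if value <= 0xFFFFFFFF:
        return bytes([m | 26]) + struct.pack(">I", value)
    return bytes([m | 27]) + struct.pack(">Q", value)

def encode(obj) -> bytes:
    """Encode the subset used on this page: unsigned ints, text strings, maps."""
    if isinstance(obj, int):
        return head(0, obj)
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return head(3, len(raw)) + raw
    if isinstance(obj, dict):
        body = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return head(5, len(obj)) + body
    raise TypeError(f"unsupported type: {type(obj).__name__}")

int_keys = encode({1: 0, 2: 3, 3: 1000000, 4: "vkCmdDraw"})
str_keys = encode({"type": "B", "t": 3, "ts": 1000000, "name": "vkCmdDraw"})

print(len(int_keys), len(str_keys))  # 22 34
print(int_keys.hex(" "))             # starts with a4 01 00 02 03 03 1a 00 0f 42 40
```

| <p class="desc">The 12-byte difference is the four key strings: each integer key costs one byte, while "type", "t", "ts" and "name" cost 5, 2, 3 and 5 bytes.</p> | |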
| <div class="filter-section"> | |
| <h4>Filtering & querying (the hard part)</h4> | |
| <div class="code-block"><span class="c"># Raw cbor2json converts integer keys to string-ified integers.</span> | |
| <span class="c"># {1: 0, 2: 3, 3: 1000000} becomes {"1": 0, "2": 3, "3": 1000000}</span> | |
| <span class="c"># This is LOSSY — int key 1 and string key "1" become indistinguishable.</span> | |
| <span class="c"># Raw (no schema): works but ugly</span> | |
| cat trace.cbor | cbor2 -d --seq | jq <span class="s">'select(.["1"] == 0)'</span> <span class="c"># "field 1 == BEGIN"</span> | |
| <span class="c"># With a schema-expansion script (you have to write this yourself):</span> | |
| cat trace.cbor | cbor2 -d --seq | ./expand-schema.py | jq <span class="s">'select(.type == "begin")'</span> | |
| <span class="c"># expand-schema.py is ~20 lines: read JSON, replace int keys with names</span> | |
| <span class="c"># But you must maintain it whenever the schema changes.</span> | |
| <span class="c"># For comparison, protobuf has this built in:</span> | |
| protoc --decode=TraceEvent trace.proto < trace.bin <span class="c"># just works</span> | |
| <span class="c"># CBOR diagnostic notation (shows raw structure, no field names):</span> | |
| cat trace.cbor | cbor-diag | |
| <span class="c"># Output: {1: 0, 2: 3, 3: 1000000, 4: "vkCmdDraw"}</span> | |
| <span class="c"># Readable if you memorize the schema. Not great for debugging.</span></div> | |
| </div> | |
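| <p class="desc">The schema-expansion step mentioned above could look roughly like this. It assumes the input is the JSON produced by the decode step, one object per line, with integer keys already turned into strings. The field table is copied from the schema on this page, and <code>expand</code> is a hypothetical helper name.</p> | |

```python
import json

# Copied from the schema above. JSON has no integer keys, so they arrive as strings.
EVENT_FIELDS = {"1": "type", "2": "track", "3": "ts", "4": "name",
                "5": "value", "6": "name_iid", "7": "args"}
TYPE_ENUM = {0: "begin", 1: "end", 2: "counter", 3: "instant"}

def expand(record: dict) -> dict:
    """Rename stringified integer keys to field names; leave other keys alone."""
    out = {EVENT_FIELDS.get(key, key): value for key, value in record.items()}
    # Only integer-keyed events carry an enum in "type"; string-keyed track
    # definitions already hold a readable string there.
    if isinstance(out.get("type"), int):
        out["type"] = TYPE_ENUM.get(out["type"], out["type"])
    return out

# Sample lines standing in for the decoded stream; a real script would
# iterate over sys.stdin instead.
sample = [
    '{"1": 0, "2": 3, "3": 1000000, "4": "vkCmdDraw"}',
    '{"type": "track", "uuid": 1, "name": "GPU 0"}',
]
for line in sample:
    print(json.dumps(expand(json.loads(line))))
```

| <p class="desc">After this step the first record reads <code>{"type": "begin", "track": 3, "ts": 1000000, "name": "vkCmdDraw"}</code>, so queries can use <code>.ts</code> again. The table still has to be kept in sync with the schema by hand.</p> | |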
| <div class="pros-cons"> | |
| <div class="pros"><h4>Strengths</h4><ul> | |
| <li>Within ~15% of protobuf compactness</li> | |
| <li>Same tiny encoder as string-key CBOR</li> | |
| <li>Can mix: string keys for rare defs, int keys for hot-path events</li> | |
| <li>IETF standard — used by FIDO2/CTAP2, COSE in production</li> | |
| <li>No code generation step (unlike protobuf)</li> | |
| </ul></div> | |
| <div class="cons"><h4>Weaknesses</h4><ul> | |
| <li>Requires a schema — same as protobuf's fundamental tradeoff</li> | |
| <li>No protoc-equivalent for CBOR schema expansion</li> | |
| <li>Lossy JSON round-trip (int keys become string "1", "2"...)</li> | |
| <li>jq queries use opaque <code>.["3"]</code> instead of <code>.ts</code></li> | |
| <li>Schema versioning/compat is your problem (protobuf has mature tooling)</li> | |
| <li>Awkward middle: schema tax without protobuf's ecosystem</li> | |
| </ul></div> | |
| </div> | |
| </div> | |
| <!-- ============================================================ --> | |
| <!-- NDJSON --> | |
| <!-- ============================================================ --> | |
| <div class="panel" id="panel-ndjson"> | |
| <h2 style="color: var(--accent-ndjson);">NDJSON — Newline-Delimited JSON</h2> | |
| <p class="desc">One JSON object per line, with a simplified data model. Requires zero new libraries. Streamable, crash-safe, greppable. The best jq experience of any option.</p> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-ndjson);"></span> Trace file</h3> | |
| <div class="code-block"><span class="p">{"track":{"uuid":1,"name":"GPU 0"}}</span> | |
| <span class="p">{"track":{"uuid":2,"name":"Graphics Queue","parent":1}}</span> | |
| <span class="p">{"track":{"uuid":3,"name":"CmdBuf #42","parent":2}}</span> | |
| <span class="p">{"track":{"uuid":4,"name":"VRAM","parent":1,"counter":"bytes"}}</span> | |
| <span class="p">{"begin":{"t":3,"ts":1000000,"name":"vkCmdDraw","args":{"vertices":36000}}}</span> | |
| <span class="p">{"end":{"t":3,"ts":1500000}}</span> | |
| <span class="p">{"counter":{"t":4,"ts":1000000,"v":268435456}}</span></div> | |
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-ndjson);"></span> Python emitter</h3> | |
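| <div class="code-block"><span class="w">import</span> json | |
| <span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.ndjson"</span>, <span class="s">"w"</span>) <span class="w">as</span> f: | |
|     <span class="w">def</span> <span class="f">emit</span>(obj): f.write(json.dumps(obj, separators=(<span class="s">','</span>,<span class="s">':'</span>)) + <span class="s">'\n'</span>) | |
|     emit({<span class="s">"track"</span>: {<span class="s">"uuid"</span>: <span class="n">1</span>, <span class="s">"name"</span>: <span class="s">"GPU 0"</span>}}) | |
|     emit({<span class="s">"track"</span>: {<span class="s">"uuid"</span>: <span class="n">2</span>, <span class="s">"name"</span>: <span class="s">"Graphics Queue"</span>, <span class="s">"parent"</span>: <span class="n">1</span>}}) | |
|     emit({<span class="s">"track"</span>: {<span class="s">"uuid"</span>: <span class="n">3</span>, <span class="s">"name"</span>: <span class="s">"CmdBuf #42"</span>, <span class="s">"parent"</span>: <span class="n">2</span>}}) | |
|     emit({<span class="s">"track"</span>: {<span class="s">"uuid"</span>: <span class="n">4</span>, <span class="s">"name"</span>: <span class="s">"VRAM"</span>, <span class="s">"parent"</span>: <span class="n">1</span>, <span class="s">"counter"</span>: <span class="s">"bytes"</span>}}) | |
|     emit({<span class="s">"begin"</span>: {<span class="s">"t"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>, <span class="s">"name"</span>: <span class="s">"vkCmdDraw"</span>, | |
|                     <span class="s">"args"</span>: {<span class="s">"vertices"</span>: <span class="n">36000</span>}}}) | |
|     emit({<span class="s">"end"</span>: {<span class="s">"t"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1500000</span>}}) | |
|     emit({<span class="s">"counter"</span>: {<span class="s">"t"</span>: <span class="n">4</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>, <span class="s">"v"</span>: <span class="n">268435456</span>}})</div> | |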
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-ndjson);"></span> C emitter</h3> | |
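| <div class="code-block"><span class="c">// Needs stdio.h, stdint.h and inttypes.h: PRIu64 is the portable printf format for uint64_t.</span> | |
| <span class="c">// name is written verbatim, so it must not contain quotes, backslashes or control characters.</span> | |
| <span class="t">void</span> <span class="f">emit_begin</span>(FILE *f, <span class="t">int</span> trk, <span class="t">uint64_t</span> ts, <span class="w">const</span> <span class="t">char</span> *name) { | |
|   fprintf(f, <span class="s">"{\"begin\":{\"t\":%d,\"ts\":%"</span> PRIu64 <span class="s">",\"name\":\"%s\"}}\n"</span>, trk, ts, name); | |
| } | |
| <span class="t">void</span> <span class="f">emit_end</span>(FILE *f, <span class="t">int</span> trk, <span class="t">uint64_t</span> ts) { | |
|   fprintf(f, <span class="s">"{\"end\":{\"t\":%d,\"ts\":%"</span> PRIu64 <span class="s">"}}\n"</span>, trk, ts); | |
| }</div> | |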
| </div> | |
| <div class="filter-section"> | |
| <h4>Filtering & querying — the gold standard</h4> | |
| <div class="code-block"><span class="c"># No conversion needed. jq operates directly on the file.</span> | |
| <span class="c"># All begin events</span> | |
| jq <span class="s">'select(.begin)'</span> trace.ndjson | |
| <span class="c"># Begin events on track 3 between 1 ms and 2 ms</span> | |
| jq <span class="s">'select(.begin and .begin.t == 3 and .begin.ts >= 1000000 and .begin.ts <= 2000000)'</span> trace.ndjson | |
| <span class="c"># List all track names</span> | |
| jq -r <span class="s">'select(.track) | .track.name'</span> trace.ndjson | |
| <span class="c"># Count events by type</span> | |
| jq -s <span class="s">'map(keys[0]) | group_by(.) | map({type: .[0], count: length})'</span> trace.ndjson | |
| <span class="c"># Extract args from all begin events</span> | |
| jq <span class="s">'select(.begin.args) | .begin | {name, args}'</span> trace.ndjson | |
| <span class="c"># Also works with standard Unix tools:</span> | |
| grep <span class="s">'"begin"'</span> trace.ndjson | head -20 <span class="c"># first 20 begin events</span> | |
| wc -l trace.ndjson <span class="c"># total record count (track definitions included)</span> | |
| grep <span class="s">'vkCmdDraw'</span> trace.ndjson <span class="c"># find by name</span> | |
| tail -f trace.ndjson | jq <span class="s">'select(.begin)'</span> <span class="c"># live stream filtering</span> | |
| awk -F<span class="s">'"ts":'</span> <span class="s">'{print $2}'</span> trace.ndjson | head <span class="c"># quick timestamp extraction</span></div> | |
| </div> | |
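| <p class="desc">For queries that jq does not express comfortably, such as pairing begin and end records into durations, a few lines of standard-library Python are enough. The sketch below assumes the record shapes from the trace file above and treats a line that fails to parse as a torn write left by a crashed producer. <code>read_events</code> and <code>slices</code> are illustrative names.</p> | |

```python
import json

def read_events(lines):
    """Yield (kind, payload) for each NDJSON record, skipping unparseable lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # A writer that crashed mid-line leaves at most one partial record.
            continue
        kind, payload = next(iter(record.items()))
        yield kind, payload

def slices(events):
    """Pair begin/end records per track (one stack per track) into slices."""
    open_slices = {}
    for kind, ev in events:
        if kind == "begin":
            open_slices.setdefault(ev["t"], []).append(ev)
        elif kind == "end" and open_slices.get(ev["t"]):
            begin = open_slices[ev["t"]].pop()
            yield {"track": ev["t"], "name": begin["name"],
                   "dur": ev["ts"] - begin["ts"]}

trace = [
    '{"begin":{"t":3,"ts":1000000,"name":"vkCmdDraw","args":{"vertices":36000}}}',
    '{"end":{"t":3,"ts":1500000}}',
    '{"counter":{"t":4,"ts":10000',  # torn line, as left by a crash
]
print(list(slices(read_events(trace))))
# [{'track': 3, 'name': 'vkCmdDraw', 'dur': 500000}]
```

| <p class="desc">With a real file, pass the open file object as <code>lines</code>; it is read one line at a time, so memory use stays flat for large traces.</p> | |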
| <div class="pros-cons"> | |
| <div class="pros"><h4>Strengths</h4><ul> | |
| <li>Zero new libraries — every language has JSON</li> | |
| <li>Full jq + grep + awk + tail -f — best tooling story</li> | |
| <li>Streamable, crash-safe, diff-friendly</li> | |
| <li>Supports nested args natively</li> | |
| <li>0% learning curve</li> | |
| </ul></div> | |
| <div class="cons"><h4>Weaknesses</h4><ul> | |
| <li>Largest on disk — JSON string overhead everywhere</li> | |
| <li>No native uint64: many parsers, including JavaScript's, store numbers as doubles, which are exact only up to 2^53</li> | |
| <li>No interning — strings repeated every event</li> | |
| <li>Escaping strings in C is annoying</li> | |
| </ul></div> | |
| </div> | |
| </div> | |
| <!-- ============================================================ --> | |
| <!-- Amazon Ion --> | |
| <!-- ============================================================ --> | |
| <div class="panel" id="panel-ion"> | |
| <h2 style="color: var(--accent-ion);">Amazon Ion</h2> | |
| <p class="desc">A JSON superset with native timestamps, symbol tables (auto-interning), comments, and interchangeable text & binary forms. Author in text, convert to binary for production. Both are semantically identical. Libraries in C, Python, Rust, Java, Go, JS.</p> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-ion);"></span> Trace file — Ion text</h3> | |
| <div class="code-block"><span class="c">// Valid Ion — this is a superset of JSON</span> | |
| <span class="c">// Symbols (unquoted keys) auto-intern in binary form</span> | |
| { | |
| <span class="k">type</span>: <span class="s">track</span>, | |
| <span class="k">uuid</span>: <span class="n">1</span>, | |
| <span class="k">name</span>: <span class="s">"GPU 0"</span>, | |
| } | |
| { | |
| <span class="k">type</span>: <span class="s">slice_begin</span>, | |
| <span class="k">track</span>: <span class="n">3</span>, | |
| <span class="k">ts</span>: <span class="n">2025-11-02T10:30:00.001000Z</span>, <span class="c">// native timestamp!</span> | |
| <span class="k">name</span>: <span class="s">"vkCmdDraw"</span>, | |
| <span class="k">args</span>: { <span class="k">vertices</span>: <span class="n">36000</span>, <span class="k">shader</span>: <span class="s">"pbr_frag"</span> } | |
| } | |
| { | |
| <span class="k">type</span>: <span class="s">counter</span>, | |
| <span class="k">track</span>: <span class="n">4</span>, | |
| <span class="k">ts</span>: <span class="n">2025-11-02T10:30:00.001000Z</span>, | |
| <span class="k">value</span>: <span class="n">268435456</span>, | |
| } | |
| <span class="c">// Binary form: all repeated symbols (type, track, ts, name...)</span> | |
| <span class="c">// stored as integer IDs via Ion's built-in symbol table.</span> | |
| <span class="c">// No manual interning needed!</span></div> | |
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-ion);"></span> Python emitter</h3> | |
| <div class="code-block"><span class="w">import</span> amazon.ion.simpleion <span class="w">as</span> ion <span class="c"># pip install amazon.ion</span> | |
| events = [ | |
|     {<span class="s">"type"</span>: <span class="s">"track"</span>, <span class="s">"uuid"</span>: <span class="n">1</span>, <span class="s">"name"</span>: <span class="s">"GPU 0"</span>}, | |
|     {<span class="s">"type"</span>: <span class="s">"slice_begin"</span>, <span class="s">"track"</span>: <span class="n">3</span>, <span class="s">"ts"</span>: <span class="n">1000000</span>, | |
|      <span class="s">"name"</span>: <span class="s">"vkCmdDraw"</span>, <span class="s">"args"</span>: {<span class="s">"vertices"</span>: <span class="n">36000</span>}}, | |
| ] | |
| <span class="c"># Text, for debugging:</span> | |
| <span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.ion"</span>, <span class="s">"w"</span>) <span class="w">as</span> f: | |
|     <span class="w">for</span> ev <span class="w">in</span> events: f.write(ion.dumps(ev, binary=<span class="n">False</span>) + <span class="s">"\n"</span>) | |
| <span class="c"># Binary, for production. A single dump() over the whole sequence shares one</span> | |
| <span class="c"># symbol table; a dumps() call per event would start a new table for every value.</span> | |
| <span class="w">with</span> <span class="f">open</span>(<span class="s">"trace.10n"</span>, <span class="s">"wb"</span>) <span class="w">as</span> f: | |
|     ion.dump(events, f, binary=<span class="n">True</span>, sequence_as_stream=<span class="n">True</span>)</div> | |
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-ion);"></span> C emitter — using ion-c</h3> | |
| <div class="code-block"><span class="c">// Using official ion-c (github.com/amazon-ion/ion-c)</span> | |
| <span class="w">#include</span> <span class="s">"ion.h"</span> | |
| hWRITER writer; | |
| ION_WRITER_OPTIONS opts = {<span class="n">0</span>}; | |
| opts.output_as_binary = TRUE; <span class="c">// flip to FALSE for text</span> | |
| ion_writer_open(&writer, stream, &opts); | |
| ion_writer_start_container(writer, tid_STRUCT); | |
| <span class="c">// symbols auto-intern in binary mode</span> | |
| ion_writer_write_field_name_from_cstr(writer, <span class="s">"type"</span>); | |
| ion_writer_write_symbol_from_cstr(writer, <span class="s">"slice_begin"</span>); | |
| ion_writer_write_field_name_from_cstr(writer, <span class="s">"track"</span>); | |
| ion_writer_write_int(writer, <span class="n">3</span>); | |
| ion_writer_write_field_name_from_cstr(writer, <span class="s">"ts"</span>); | |
| ion_writer_write_int(writer, <span class="n">1000000</span>); | |
| ion_writer_write_field_name_from_cstr(writer, <span class="s">"name"</span>); | |
| ion_writer_write_string_from_cstr(writer, <span class="s">"vkCmdDraw"</span>); | |
| ion_writer_finish_container(writer);</div> | |
| </div> | |
| <div class="filter-section"> | |
| <h4>Filtering & querying</h4> | |
| <div class="code-block"><span class="c"># Ion CLI tool: convert binary ↔ text, inspect, filter</span> | |
| <span class="c"># Install: cargo install ion-cli (or from amazon-ion/ion-cli)</span> | |
| <span class="c"># Binary → human-readable text</span> | |
| ion dump --format text trace.10n | |
| <span class="c"># Binary → JSON (for piping to jq)</span> | |
| ion dump --format json trace.10n | jq <span class="s">'select(.type == "slice_begin")'</span> | |
| <span class="c"># Text form is directly readable — can grep it</span> | |
| grep <span class="s">'slice_begin'</span> trace.ion | |
| grep <span class="s">'vkCmdDraw'</span> trace.ion | |
| <span class="c"># Ion text → Ion binary (for production)</span> | |
| ion dump --format binary trace.ion -o trace.10n | |
| <span class="c"># PartiQL — SQL-like querying (Ion's native query language, used by AWS)</span> | |
| <span class="c"># Not widely available as a CLI tool yet, but the spec exists.</span> | |
| <span class="c"># SELECT * FROM trace WHERE type = 'slice_begin' AND ts > 1000000</span> | |
| <span class="c"># Round-trip: text → binary → text is lossless (except whitespace/comments)</span> | |
| ion dump --format text trace.10n > roundtrip.ion | |
| diff trace.ion roundtrip.ion <span class="c"># identical content, maybe different whitespace</span></div> | |
| </div> | |
| <div class="pros-cons"> | |
| <div class="pros"><h4>Strengths</h4><ul> | |
| <li>Dual text/binary — prototype in text, ship binary</li> | |
| <li>JSON superset — any JSON is valid Ion</li> | |
| <li>Built-in symbol tables = automatic interning</li> | |
| <li>Native timestamps with arbitrary precision</li> | |
| <li>Comments in text form</li> | |
| <li>Lossless text↔binary round-trip</li> | |
| <li>Libraries in C, Python, Rust, Java, Go, JS, C#</li> | |
| </ul></div> | |
| <div class="cons"><h4>Weaknesses</h4><ul> | |
| <li>C library (ion-c) is verbose — more boilerplate than CBOR</li> | |
| <li>Less well-known than protobuf or CBOR</li> | |
| <li>Spec more complex than CBOR — steeper curve for implementors</li> | |
| <li>ion-cli exists but is less mature than jq</li> | |
| <li>jq requires Ion→JSON conversion (lossy for Ion-specific types like timestamps)</li> | |
| </ul></div> | |
| </div> | |
| </div> | |
| <!-- ============================================================ --> | |
| <!-- RESP-like --> | |
| <!-- ============================================================ --> | |
| <div class="panel" id="panel-resp"> | |
| <h2 style="color: var(--accent-resp);">RESP-like Trace Format</h2> | |
| <p class="desc">Text-based, line-oriented, type-prefixed. The entire C emitter is <code>fprintf</code>. Human-readable, greppable, pipeable.</p> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-resp);"></span> Trace file</h3> | |
| <div class="code-block"><span class="c"># T=track B=begin E=end C=counter I=intern</span> | |
| <span class="t">T</span> <span class="k">uuid</span>=<span class="n">1</span> <span class="k">name</span>=<span class="s">GPU 0</span> | |
| <span class="t">T</span> <span class="k">uuid</span>=<span class="n">2</span> <span class="k">name</span>=<span class="s">Graphics Queue</span> <span class="k">parent</span>=<span class="n">1</span> | |
| <span class="t">T</span> <span class="k">uuid</span>=<span class="n">3</span> <span class="k">name</span>=<span class="s">CmdBuf #42</span> <span class="k">parent</span>=<span class="n">2</span> | |
| <span class="t">B</span> <span class="k">t</span>=<span class="n">3</span> <span class="k">ts</span>=<span class="n">1000000</span> <span class="k">n</span>=<span class="s">vkCmdDraw</span> <span class="k">vertices</span>=<span class="n">36000</span> | |
| <span class="t">E</span> <span class="k">t</span>=<span class="n">3</span> <span class="k">ts</span>=<span class="n">1500000</span> | |
| <span class="t">C</span> <span class="k">t</span>=<span class="n">4</span> <span class="k">ts</span>=<span class="n">1000000</span> <span class="k">v</span>=<span class="n">268435456</span></div> | |
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-resp);"></span> C emitter — 6 lines</h3> | |
| <div class="code-block"><span class="t">void</span> <span class="f">emit_begin</span>(FILE *f, <span class="t">int</span> trk, <span class="t">uint64_t</span> ts, <span class="w">const</span> <span class="t">char</span> *name) { | |
| fprintf(f, <span class="s">"B\tt=%d\tts=%lu\tn=%s\n"</span>, trk, ts, name); | |
| } | |
| <span class="t">void</span> <span class="f">emit_end</span>(FILE *f, <span class="t">int</span> trk, <span class="t">uint64_t</span> ts) { | |
| fprintf(f, <span class="s">"E\tt=%d\tts=%lu\n"</span>, trk, ts); | |
| }</div> | |
| </div> | |
| <div class="filter-section"> | |
| <h4>Filtering & querying — Unix native</h4> | |
| <div class="code-block"><span class="c"># Pure Unix tools — no jq needed, no conversion step.</span> | |
| <span class="c"># All begin events (GNU grep: -P makes \t match a tab)</span> | |
| grep -P <span class="s">'^B\t'</span> trace.ptf | |
| <span class="c"># Begin events on track 3 (anchored, so t=30 or parent=3 do not match)</span> | |
| grep -P <span class="s">'^B\tt=3\t'</span> trace.ptf | |
| <span class="c"># Events with ts between 1ms and 2ms</span> | |
| awk -F<span class="s">'\t'</span> <span class="s">'/^[BCE]/ { for(i=1;i<=NF;i++) if($i ~ /^ts=/) { | |
| split($i,a,"="); if(a[2]>=1000000 && a[2]<=2000000) print } }'</span> trace.ptf | |
| <span class="c"># Count events by type</span> | |
| cut -f1 trace.ptf | sort | uniq -c | sort -rn | |
| <span class="c"># All unique event names</span> | |
| grep -P <span class="s">'^B\t'</span> trace.ptf | grep -oP <span class="s">'\tn=\K[^\t]*'</span> | sort -u | |
| <span class="c"># Live stream monitoring</span> | |
| tail -f trace.ptf | grep -P <span class="s">'^B\t'</span> | |
| <span class="c"># Total event count</span> | |
| wc -l trace.ptf | |
| <span class="c"># Can also convert to NDJSON for jq, if needed (values stay strings, no JSON escaping):</span> | |
| awk -F<span class="s">'\t'</span> <span class="s">'!/^#/ { printf "{\"type\":\"%s\"", $1; for(i=2;i<=NF;i++) { eq=index($i,"="); | |
| printf ",\"%s\":\"%s\"", substr($i,1,eq-1), substr($i,eq+1) } print "}" }'</span> trace.ptf | jq .</div> | |
| </div> | |
| <div class="pros-cons"> | |
| <div class="pros"><h4>Strengths</h4><ul> | |
| <li>Maximally simple: printf/echo is enough</li> | |
| <li>Human-readable — cat, grep, awk, tail -f</li> | |
| <li>Line-oriented = crash-safe, streamable</li> | |
| <li>No parser needed — split on \t, split on =</li> | |
| </ul></div> | |
| <div class="cons"><h4>Weaknesses</h4><ul> | |
| <li>2-4× larger than binary formats</li> | |
| <li>Flat key=value — no nested args</li> | |
| <li>Tab/newline in values needs escaping</li> | |
| <li>awk queries are ugly compared to jq</li> | |
| <li>Completely bespoke — no standard</li> | |
| </ul></div> | |
| </div> | |
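| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-resp);"></span> Python reader (sketch)</h3> | |
| <p class="desc">The "split on \t, split on =" claim can be sketched in a few lines of Python. This is a minimal illustration that assumes tab-separated fields as written by the C emitter above. <code>parse_line</code> is a hypothetical helper name, values are left as strings, and no unescaping is done because the format does not define an escaping scheme.</p> | |
| <div class="code-block"><span class="w">def</span> <span class="f">parse_line</span>(line): | |
|     <span class="c"># First field is the record type; the rest are key=value pairs.</span> | |
|     kind, *pairs = line.rstrip(<span class="s">"\n"</span>).split(<span class="s">"\t"</span>) | |
|     <span class="c"># Split on the first "=" only, so values may themselves contain "=".</span> | |
|     <span class="w">return</span> kind, <span class="f">dict</span>(p.split(<span class="s">"="</span>, <span class="n">1</span>) <span class="w">for</span> p <span class="w">in</span> pairs) | |
| kind, fields = parse_line(<span class="s">"B\tt=3\tts=1000000\tn=vkCmdDraw\tvertices=36000\n"</span>) | |
| <span class="f">print</span>(kind, fields[<span class="s">"n"</span>], <span class="f">int</span>(fields[<span class="s">"ts"</span>])) <span class="c"># B vkCmdDraw 1000000</span></div> | |
| </div> | |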
| </div> | |
| <!-- ============================================================ --> | |
| <!-- Arrow IPC --> | |
| <!-- ============================================================ --> | |
| <div class="panel" id="panel-arrow"> | |
| <h2 style="color: var(--accent-arrow);">Apache Arrow IPC</h2> | |
| <p class="desc">Columnar batches instead of row-oriented records. Optimized for the reader: Perfetto could <code>mmap()</code> the file and query with near-zero deserialization. Trade-off: must buffer events and flush in batches.</p> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-arrow);"></span> Python emitter</h3> | |
| <div class="code-block"><span class="w">import</span> pyarrow <span class="w">as</span> pa <span class="c"># pip install pyarrow</span> | |
| schema = pa.schema([ | |
| (<span class="s">"type"</span>, pa.uint8()), <span class="c"># 0=begin, 1=end, 2=counter</span> | |
| (<span class="s">"track"</span>, pa.int64()), | |
| (<span class="s">"ts"</span>, pa.int64()), <span class="c"># nanoseconds, native int64</span> | |
| (<span class="s">"name"</span>, pa.dictionary(pa.int32(), pa.utf8())), <span class="c"># auto-interned!</span> | |
| (<span class="s">"value"</span>, pa.float64()), | |
| ]) | |
| batch = pa.record_batch([ | |
| pa.array([<span class="n">0</span>, <span class="n">1</span>, <span class="n">2</span>], type=pa.uint8()), | |
| pa.array([<span class="n">3</span>, <span class="n">3</span>, <span class="n">4</span>]), | |
| pa.array([<span class="n">1000000</span>, <span class="n">1500000</span>, <span class="n">1000000</span>]), | |
| pa.array([<span class="s">"vkCmdDraw"</span>, <span class="n">None</span>, <span class="n">None</span>]).dictionary_encode(), | |
| pa.array([<span class="n">None</span>, <span class="n">None</span>, <span class="n">268435456.0</span>]), | |
| ], schema=schema) | |
| <span class="w">with</span> pa.OSFile(<span class="s">"trace.arrow"</span>, <span class="s">"wb"</span>) <span class="w">as</span> sink: | |
|     writer = pa.ipc.new_file(sink, schema) | |
|     writer.write_batch(batch) | |
|     writer.close()</div> | |
| </div> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-arrow);"></span> C emitter — using nanoarrow</h3> | |
| <div class="code-block"><span class="c">// nanoarrow: single-header C library from Apache Arrow project</span> | |
| <span class="w">#include</span> <span class="s">"nanoarrow/nanoarrow.h"</span> | |
| <span class="t">struct</span> ArrowSchema schema; | |
| ArrowSchemaInit(&schema); | |
| ArrowSchemaSetTypeStruct(&schema, <span class="n">5</span>); | |
| ArrowSchemaSetType(schema.children[<span class="n">0</span>], NANOARROW_TYPE_UINT8); <span class="c">// type</span> | |
| ArrowSchemaSetType(schema.children[<span class="n">1</span>], NANOARROW_TYPE_INT64); <span class="c">// track</span> | |
| ArrowSchemaSetType(schema.children[<span class="n">2</span>], NANOARROW_TYPE_INT64); <span class="c">// ts</span> | |
| ArrowSchemaSetType(schema.children[<span class="n">3</span>], NANOARROW_TYPE_STRING); <span class="c">// name</span> | |
| ArrowSchemaSetType(schema.children[<span class="n">4</span>], NANOARROW_TYPE_DOUBLE); <span class="c">// value</span> | |
| <span class="t">struct</span> ArrowArray array; | |
| ArrowArrayInitFromSchema(&array, &schema, NULL); | |
| ArrowArrayStartAppending(&array); | |
| ArrowArrayAppendUInt(array.children[<span class="n">0</span>], <span class="n">0</span>); <span class="c">// begin</span> | |
| ArrowArrayAppendInt(array.children[<span class="n">1</span>], <span class="n">3</span>); <span class="c">// track</span> | |
| ArrowArrayAppendInt(array.children[<span class="n">2</span>], <span class="n">1000000</span>); <span class="c">// ts</span> | |
| <span class="c">// ... etc for name and value, then ArrowArrayFinishElement(&array) to close the row</span> | |
| ArrowArrayFinishBuildingDefault(&array, NULL);</div> | |
| </div> | |
| <div class="filter-section"> | |
| <h4>Filtering & querying — via DuckDB or Python</h4> | |
| <div class="code-block"><span class="c"># Arrow IPC files can be queried with DuckDB (zero-copy, SQL!)</span> | |
| <span class="c"># Install: pip install duckdb / brew install duckdb</span> | |
| duckdb -c <span class="s">"SELECT * FROM 'trace.arrow' WHERE type = 0 AND ts > 1000000"</span> | |
| duckdb -c <span class="s">"SELECT name, COUNT(*) as cnt FROM 'trace.arrow' | |
| WHERE type = 0 GROUP BY name ORDER BY cnt DESC"</span> | |
| <span class="c"># Or in Python:</span> | |
| <span class="w">import</span> pyarrow.compute <span class="w">as</span> pc | |
| <span class="w">import</span> pyarrow.ipc <span class="w">as</span> ipc | |
| reader = ipc.open_file(<span class="s">"trace.arrow"</span>) | |
| table = reader.read_all() | |
| begins = table.filter(pc.equal(table.column(<span class="s">"type"</span>), <span class="n">0</span>)) <span class="c"># == on a column does not build a mask</span> | |
| <span class="c"># Convert to pandas for familiar filtering:</span> | |
| df = table.to_pandas() | |
| df[df[<span class="s">"type"</span>] == <span class="n">0</span>].sort_values(<span class="s">"ts"</span>) | |
| <span class="c"># Convert to NDJSON for jq (gives up the columnar speed, but works):</span> | |
| python -c <span class="s">"import pyarrow.ipc as i, json, sys | |
| for row in i.open_file('trace.arrow').read_all().to_pylist(): | |
|     json.dump(row, sys.stdout); print()"</span> | jq . | |
| <span class="c"># No direct jq-style CLI tool for Arrow files. | |
| # But DuckDB's SQL is arguably more powerful than jq for analytics.</span></div> | |
| </div> | |
| <div class="pros-cons"> | |
| <div class="pros"><h4>Strengths</h4><ul> | |
| <li>Fastest read: mmap + zero-copy, zero deserialization</li> | |
| <li>Columnar = perfect for Perfetto's SQL engine</li> | |
| <li>DuckDB gives you SQL queries directly on .arrow files</li> | |
| <li>Dictionary encoding = built-in interning</li> | |
| <li>Massive ecosystem (Pandas, Spark, Polars, DuckDB)</li> | |
| </ul></div> | |
| <div class="cons"><h4>Weaknesses</h4><ul> | |
| <li>Must buffer events and flush as batches</li> | |
| <li>Schema declared upfront — no ad-hoc fields per event</li> | |
| <li>Not human-readable at all</li> | |
| <li>No jq — need DuckDB or Python for queries</li> | |
| <li>Conceptually different (columnar, not row-oriented)</li> | |
| </ul></div> | |
| </div> | |
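| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-arrow);"></span> Python emitter with batching (sketch)</h3> | |
| <p class="desc">The "must buffer events and flush as batches" weakness looks roughly like this in practice. A minimal sketch, not a recommended design: <code>BatchingWriter</code> and <code>BATCH_ROWS</code> are hypothetical names, the schema is cut down to two columns, and events still in the buffer are lost if the process crashes before a flush.</p> | |
| <div class="code-block"><span class="w">import</span> pyarrow <span class="w">as</span> pa | |
| schema = pa.schema([(<span class="s">"track"</span>, pa.int64()), (<span class="s">"ts"</span>, pa.int64())]) | |
| BATCH_ROWS = <span class="n">4096</span> <span class="c"># hypothetical flush threshold</span> | |
| <span class="w">class</span> <span class="f">BatchingWriter</span>: | |
|     <span class="w">def</span> <span class="f">__init__</span>(self, path): | |
|         self.sink = pa.OSFile(path, <span class="s">"wb"</span>) | |
|         self.writer = pa.ipc.new_file(self.sink, schema) | |
|         self.rows = [] <span class="c"># events buffered since the last flush</span> | |
|     <span class="w">def</span> <span class="f">emit</span>(self, track, ts): | |
|         self.rows.append({<span class="s">"track"</span>: track, <span class="s">"ts"</span>: ts}) | |
|         <span class="w">if</span> <span class="f">len</span>(self.rows) >= BATCH_ROWS: | |
|             self.flush() | |
|     <span class="w">def</span> <span class="f">flush</span>(self): | |
|         <span class="w">if</span> self.rows: <span class="c"># one record batch per flush</span> | |
|             self.writer.write_batch(pa.RecordBatch.from_pylist(self.rows, schema=schema)) | |
|             self.rows = [] | |
|     <span class="w">def</span> <span class="f">close</span>(self): | |
|         self.flush() | |
|         self.writer.close() <span class="c"># writes the IPC file footer</span> | |
|         self.sink.close() | |
| w = BatchingWriter(<span class="s">"trace.arrow"</span>) | |
| w.emit(<span class="n">3</span>, <span class="n">1000000</span>) | |
| w.emit(<span class="n">3</span>, <span class="n">1500000</span>) | |
| w.close()</div> | |
| </div> | |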
| </div> | |
| <!-- ============================================================ --> | |
| <!-- Chrome JSON --> | |
| <!-- ============================================================ --> | |
| <div class="panel" id="panel-chrome-json"> | |
| <h2 style="color: var(--accent-chrome);">Chrome JSON (status quo)</h2> | |
| <p class="desc">The current de facto standard. Dead simple, but limited to Chrome's pid/tid model.</p> | |
| <div class="code-section"> | |
| <h3><span class="dot" style="background: var(--accent-chrome);"></span> Trace file</h3> | |
| <div class="code-block"><span class="p">{</span> <span class="s">"traceEvents"</span><span class="p">: [</span> | |
| <span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"process_name"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"M"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"args"</span><span class="p">:{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"GPU 0"</span><span class="p">}},</span> | |
| <span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"thread_name"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"M"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"tid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> | |
| <span class="s">"args"</span><span class="p">:{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"Graphics Queue"</span><span class="p">}},</span> | |
| <span class="c">// CmdBuf #42 can't nest under Queue — only 2 levels!</span> | |
| <span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"vkCmdDraw"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"B"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"tid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"ts"</span><span class="p">:</span><span class="n">1000</span><span class="p">,</span> | |
| <span class="s">"args"</span><span class="p">:{</span><span class="s">"vertices"</span><span class="p">:</span><span class="n">36000</span><span class="p">}},</span> | |
| <span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"vkCmdDraw"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"E"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"tid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"ts"</span><span class="p">:</span><span class="n">1500</span><span class="p">},</span> | |
| <span class="p">{</span><span class="s">"name"</span><span class="p">:</span><span class="s">"VRAM"</span><span class="p">,</span> <span class="s">"ph"</span><span class="p">:</span><span class="s">"C"</span><span class="p">,</span> <span class="s">"pid"</span><span class="p">:</span><span class="n">1</span><span class="p">,</span> <span class="s">"ts"</span><span class="p">:</span><span class="n">1000</span><span class="p">,</span> | |
| <span class="s">"args"</span><span class="p">:{</span><span class="s">"VRAM"</span><span class="p">:</span><span class="n">268435456</span><span class="p">}}</span> | |
| <span class="p">]}</span></div> | |
| </div> | |
| <div class="filter-section"> | |
| <h4>Filtering & querying</h4> | |
| <div class="code-block"><span class="c"># Must slurp the entire file (it's one big JSON array)</span> | |
| jq <span class="s">'.traceEvents[] | select(.ph == "B")'</span> trace.json | |
| <span class="c"># Events by pid/tid</span> | |
| jq <span class="s">'.traceEvents[] | select(.pid == 1 and .tid == 1)'</span> trace.json | |
| <span class="c"># Problem: jq parses the whole document into memory, so large traces may not fit</span> | |
| <span class="c"># Problem: can't stream — wrapping array requires full parse</span> | |
| <span class="c"># Problem: can't grep — events span multiple lines in pretty-printed form</span></div> | |
| </div> | |
| </div> | |
| <!-- ============================================================ --> | |
| <!-- Size comparison --> | |
| <!-- ============================================================ --> | |
| <div class="wire-section"> | |
| <h4>Approximate size per event (slice_begin, pre-compression)</h4> | |
| <div class="size-bar"> | |
| <span class="label">Protobuf</span> | |
| <div class="bar" style="width: 60px; background: #666;">~19 B</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">Protobuf+intern</span> | |
| <div class="bar" style="width: 35px; background: #555;">~10 B</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">CBOR int-keys</span> | |
| <div class="bar" style="width: 70px; background: var(--accent-cbor-i);">~22 B</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">Arrow IPC</span> | |
| <div class="bar" style="width: 75px; background: var(--accent-arrow);">~8 B/row*</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">CBOR str-keys</span> | |
| <div class="bar" style="width: 110px; background: var(--accent-cbor-s);">~34 B</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">Ion binary</span> | |
| <div class="bar" style="width: 100px; background: var(--accent-ion);">~30 B</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">RESP-like</span> | |
| <div class="bar" style="width: 130px; background: var(--accent-resp);">~40 B</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">NDJSON</span> | |
| <div class="bar" style="width: 160px; background: var(--accent-ndjson);">~50 B</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">Chrome JSON</span> | |
| <div class="bar" style="width: 180px; background: var(--accent-chrome);">~56 B</div> | |
| </div> | |
| <div class="size-bar"> | |
| <span class="label">Ion text</span> | |
| <div class="bar" style="width: 170px; background: var(--accent-ion); opacity:0.5;">~52 B</div> | |
| </div> | |
| <p style="color: var(--text-dim); font-size: 0.74rem; margin-top: 8px;"> | |
| Per-event bytes for a slice_begin with track=3, ts=1000000, name="vkCmdDraw". *Arrow IPC amortizes schema overhead across thousands of rows — per-row cost drops dramatically at scale. All formats compress similarly with zstd. | |
| </p> | |
| </div> | |
| <!-- ============================================================ --> | |
| <!-- Bottom --> | |
| <!-- ============================================================ --> | |
| <div class="bottom-note"> | |
| <h3>Summary</h3> | |
| <p><strong style="color: var(--accent-ndjson);">NDJSON</strong> — best tooling (native jq/grep/awk), zero deps, among the largest on disk. The "boring but correct" choice.</p> | |
| <p><strong style="color: var(--accent-ion);">Ion</strong> — best of both worlds: human-readable text for debugging, compact binary for production. Built-in interning. Tooling maturing.</p> | |
| <p><strong style="color: var(--accent-cbor-s);">CBOR str-keys</strong> — binary JSON, lossless jq round-trip, roughly a third smaller than NDJSON. Modest savings for losing readability.</p> | |
| <p><strong style="color: var(--accent-cbor-i);">CBOR int-keys</strong> — near-protobuf compactness, but requires a schema and lacks protobuf's tooling. Awkward middle ground.</p> | |
| <p><strong style="color: var(--accent-arrow);">Arrow IPC</strong> — fastest reads, DuckDB gives SQL queries, ideal if Perfetto's trace processor consumed it directly.</p> | |
| <p><strong style="color: var(--accent-resp);">RESP-like</strong> — nuclear simplicity for systems/embedded devs who want <code>fprintf</code>-and-done.</p> | |
| <p style="color: var(--text); font-weight: 500; margin-top: 10px;">Core insight: the thread keeps framing this as a serialization problem, but it's a <em>data model</em> problem. Design the middle-ground data model first (tracks with nesting, slices, counters, args-as-maps), then pick the encoding.</p> | |
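| <p>One way to picture that data model is as a handful of record types. The following is a minimal sketch in Python dataclasses; the class and field names (<code>Track</code>, <code>SliceBegin</code>, <code>SliceEnd</code>, <code>Counter</code>, <code>parent</code>, <code>args</code>) are illustrative assumptions rather than a proposed spec. Each encoding compared here would then be one serialization of the same records.</p> | |
| <div class="code-block"><span class="w">from</span> dataclasses <span class="w">import</span> dataclass, field | |
| <span class="w">from</span> typing <span class="w">import</span> Optional | |
| @dataclass | |
| <span class="w">class</span> <span class="f">Track</span>: | |
|     uuid: int | |
|     name: str | |
|     parent: Optional[int] = <span class="n">None</span> <span class="c"># uuid of the parent track, None for a root</span> | |
| @dataclass | |
| <span class="w">class</span> <span class="f">SliceBegin</span>: | |
|     track: int | |
|     ts: int <span class="c"># nanoseconds</span> | |
|     name: str | |
|     args: dict = field(default_factory=dict) <span class="c"># args-as-map, may nest</span> | |
| @dataclass | |
| <span class="w">class</span> <span class="f">SliceEnd</span>: | |
|     track: int | |
|     ts: int | |
| @dataclass | |
| <span class="w">class</span> <span class="f">Counter</span>: | |
|     track: int | |
|     ts: int | |
|     value: float | |
| <span class="c"># The GPU example from the panels, written once and independent of any encoding:</span> | |
| tracks = [Track(<span class="n">1</span>, <span class="s">"GPU 0"</span>), Track(<span class="n">2</span>, <span class="s">"Graphics Queue"</span>, parent=<span class="n">1</span>), | |
|           Track(<span class="n">3</span>, <span class="s">"CmdBuf #42"</span>, parent=<span class="n">2</span>)] | |
| events = [SliceBegin(<span class="n">3</span>, <span class="n">1000000</span>, <span class="s">"vkCmdDraw"</span>, {<span class="s">"vertices"</span>: <span class="n">36000</span>}), | |
|           SliceEnd(<span class="n">3</span>, <span class="n">1500000</span>), | |
|           Counter(<span class="n">4</span>, <span class="n">1000000</span>, <span class="n">268435456</span>)]</div> | |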
| </div> | |
| </div> | |
| <script> | |
| document.querySelectorAll('.tab').forEach(tab => { | |
| tab.addEventListener('click', () => { | |
| document.querySelectorAll('.tab').forEach(t => t.classList.remove('active')); | |
| document.querySelectorAll('.panel').forEach(p => p.classList.remove('active')); | |
| tab.classList.add('active'); | |
| document.getElementById('panel-' + tab.dataset.tab).classList.add('active'); | |
| }); | |
| }); | |
| </script> | |
| </body> | |
| </html> |