Here is a brain dump about an idea for a wasm tracing tool I have been thinking about. I hope this is useful :)
dtrace. Lightweight, make-your-own debugger/profiler. Not a complete profiler
or debugger for some specific use case, but instead a collection of legos, a
toolkit for building one-off debuggers and profilers.
- https://en.wikipedia.org/wiki/DTrace
- http://www.brendangregg.com/dtrace.html
- http://www.brendangregg.com/DTrace/dtrace_oneliners.txt
- Maintain a ring buffer of the N latest functions that were called or returned
- Same as above, but only for certain functions that match a regex
- "strace" functionality by tracing calls to imported functions (which are the moral equivalent of syscalls in native code)
- Arguments to and values returned from specific functions
- The grow_memory instruction
- Traps
- Maintain a shadow stack in memory (by inserting a prologue and epilogue into every function) and then capture the current stack on various events (the things listed above).
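A minimal sketch of the shadow stack idea, modelled in Python rather than as real wasm instrumentation. The names `ShadowStack`, `enter`, and `leave` are hypothetical; they stand in for whatever the inserted prologue and epilogue would do, and the regex decides which calls count as a capture event.

```python
import re


class ShadowStack:
    """Model of a shadow stack kept alongside the real call stack."""

    def __init__(self, capture_pattern):
        self.frames = []      # names of the currently active functions
        self.captured = []    # stack snapshots taken on matching calls
        self.capture_re = re.compile(capture_pattern)

    def enter(self, name):
        # What an inserted prologue would do: push the frame, then
        # snapshot the whole stack if this call is a traced event.
        self.frames.append(name)
        if self.capture_re.search(name):
            self.captured.append(tuple(self.frames))

    def leave(self):
        # What an inserted epilogue would do: pop the frame.
        self.frames.pop()


stack = ShadowStack(r"^free$")
stack.enter("do_tick")
stack.enter("physics")
stack.enter("free")      # captured: do_tick > physics > free
stack.leave()
stack.leave()
stack.enter("render")
stack.enter("free")      # captured: do_tick > render > free
stack.leave()
stack.leave()
stack.leave()

for snapshot in stack.captured:
    print(" > ".join(snapshot))
```

Snapshots are stored outermost frame first, which makes them easy to aggregate later.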
- Easiest: as a flat log. For example, listing the last N calls to imported functions:

      query_selector(0x12345678) -> 0xbad0bad1
      create_element(0xcafecafe) -> 0xdeaddead
      append_child(0xbad0bad1, 0xdeaddead)
      ...
- As a nested log where a call introduces new indenting and a return removes indenting. For example, show me the last N function calls that happened before this bug.

      call crate::mod::func(123, 456)
          call crate::mod::helper(0)
          return 42 from crate::mod::helper
          call crate::mod::another()
              call util::blah(986, 345)
              return from util::blah
          return 1 from crate::mod::another
      ...
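The nested log could be produced from a flat list of call/return events roughly like this. It is only a sketch: the event tuple shape `(kind, name, payload)` and the function name `render_nested_log` are assumptions made for illustration, not part of any existing tool.

```python
def render_nested_log(events, indent="    "):
    """Render call/return events as an indented log.

    Each event is ("call", name, args_tuple) or ("return", name, value),
    where value is None for functions that return nothing.
    """
    lines = []
    depth = 0
    for kind, name, payload in events:
        if kind == "call":
            args = ", ".join(str(arg) for arg in payload)
            lines.append(f"{indent * depth}call {name}({args})")
            depth += 1
        elif kind == "return":
            # A log cut from a ring buffer may start with returns whose
            # calls were already overwritten, so never go below zero.
            depth = max(depth - 1, 0)
            value = "" if payload is None else f"{payload} "
            lines.append(f"{indent * depth}return {value}from {name}")
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return lines


events = [
    ("call", "crate::mod::func", (123, 456)),
    ("call", "crate::mod::helper", (0,)),
    ("return", "crate::mod::helper", 42),
    ("call", "crate::mod::another", ()),
    ("call", "util::blah", (986, 345)),
    ("return", "util::blah", None),
    ("return", "crate::mod::another", 1),
]
log_lines = render_nested_log(events)
print("\n".join(log_lines))
```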
- For a series of captured stacks: a call tree with counts (can be inverted too). For example, trace the stack whenever we call the free function, and then aggregate this into a call tree:

      Total Count | Self Count | Stack Frame
      ------------+------------+----------------------------
              123 |          0 | do_tick
               67 |          0 | ├── physics
               67 |         67 | │   └── destroy_collision_node
               56 |          0 | └── render
               43 |         43 |     ├── finish_draw_rect
               13 |         13 |     └── finish_draw_circle
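Aggregating captured stacks into such a call tree is mostly bookkeeping. The sketch below assumes each stack is a tuple of function names, outermost frame first; `build_call_tree` and `flatten` are hypothetical helper names, and the output uses plain indentation instead of box-drawing characters.

```python
def new_node():
    return {"total": 0, "self": 0, "children": {}}


def build_call_tree(stacks):
    """Aggregate stacks (outermost frame first) into nested count nodes."""
    root = new_node()
    for stack in stacks:
        node = root
        node["total"] += 1
        for frame in stack:
            node = node["children"].setdefault(frame, new_node())
            node["total"] += 1   # this frame was on the stack
        node["self"] += 1        # innermost frame gets the self count
    return root


def flatten(node, depth=0):
    """Yield (total, self, depth, name) rows, biggest subtrees first."""
    rows = []
    children = sorted(node["children"].items(), key=lambda kv: -kv[1]["total"])
    for name, child in children:
        rows.append((child["total"], child["self"], depth, name))
        rows.extend(flatten(child, depth + 1))
    return rows


stacks = (
    [("do_tick", "physics", "destroy_collision_node")] * 67
    + [("do_tick", "render", "finish_draw_rect")] * 43
    + [("do_tick", "render", "finish_draw_circle")] * 13
)
tree = build_call_tree(stacks)
rows = flatten(tree)

print("Total Count | Self Count | Stack Frame")
for total, self_count, depth, name in rows:
    print(f"{total:>11} | {self_count:>10} | {'    ' * depth}{name}")
```

An inverted tree could reuse the same code by reversing each stack before aggregating.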
- For any scalar data, i.e. the arguments to and values returned from some functions, we could draw histograms. This would be neat combined with tracing the requested sizes of allocations, for example:

         value  ------------- Distribution ------------- count
            16 |                                         0
            32 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       169
            64 |@@@                                      16
           128 |@@                                       10
           256 |                                         0
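A rough sketch of how such a histogram could be computed, using power-of-two buckets in the style of dtrace's quantize output. The function names and the sample allocation sizes are made up for illustration, and unlike dtrace this version does not print the empty neighbouring buckets.

```python
from collections import Counter


def quantize(values):
    """Count non-negative integers into power-of-two buckets.

    Bucket b holds values v with b <= v < 2 * b; zero gets its own bucket.
    """
    buckets = Counter()
    for value in values:
        if value < 0:
            raise ValueError("only non-negative values are supported")
        bucket = 0 if value == 0 else 1 << (value.bit_length() - 1)
        buckets[bucket] += 1
    return buckets


def render_histogram(buckets, width=40):
    """Render one line per bucket, with bars scaled to the total count."""
    total = sum(buckets.values())
    lines = []
    for bucket in sorted(buckets):
        count = buckets[bucket]
        bar = "@" * (count * width // total)
        lines.append(f"{bucket:>8} |{bar:<{width}} {count}")
    return lines


# Hypothetical traced values: the first argument of each malloc call.
sizes = [40] * 169 + [64] * 16 + [200] * 10
buckets = quantize(sizes)
histogram_lines = render_histogram(buckets)
print("\n".join(histogram_lines))
```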
Using the tool without any ceremony is the hardest feature to do well, but also the most important, in my opinion!
I want to use this tool when debugging or profiling, and I want to be able to
just apply it to my code without any ceremony. As much as possible, I don't want
to mess with build configurations, I don't want to manually add new <script>
tags, I don't want to have to change my own source code, or the way I
instantiate the wasm module. This is hard because we are talking about
introducing new JS code and potentially linking another wasm object file into
the instrumented wasm.
Ideally, to trace the first argument to the malloc function and display it as
a histogram, I would just do something like:

    $ wasm-trace --trace "arg(0, malloc)" --display histogram path/to/module.wasm
And that is all. It would mutate the wasm binary in place, so my existing build system and all that wouldn't have to understand this temporary build step.
I foresee two main components: (1) the thing that does the instrumenting and the code it inserts into the instrumented binary, and (2) the JS that extracts the traced data and displays it.
The instrumenter inserts new instructions into a wasm binary to capture and maintain tracing information, and adds new data segments to store the traced info in.
Do we want a ring buffer, where old data is overwritten when we wrap around, or do we start summarizing data at that point (when applicable), or do we call a well-known imported function from the JS displayer that knows how to empty all the data? Maybe different approaches in different situations.
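Two of these options can be compared in a small model. This is a sketch only: `TraceBuffer` and its `on_full` callback are hypothetical names, with `on_full` standing in for the well-known imported function that the JS displayer would provide. Without a callback the buffer overwrites the oldest record on wrap-around; with one, it hands the full buffer over and starts again.

```python
class TraceBuffer:
    """Fixed-capacity trace buffer with two overflow strategies."""

    def __init__(self, capacity, on_full=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.slots = []
        self.start = 0            # index of the oldest record once wrapped
        self.on_full = on_full    # optional "drain the buffer" callback

    def push(self, record):
        if len(self.slots) < self.capacity:
            self.slots.append(record)
        elif self.on_full is not None:
            # Flush strategy: hand everything to the displayer, then reset.
            self.on_full(self.snapshot())
            self.slots = [record]
            self.start = 0
        else:
            # Ring strategy: overwrite the oldest record.
            self.slots[self.start] = record
            self.start = (self.start + 1) % self.capacity

    def snapshot(self):
        """Return the buffered records, oldest first."""
        return self.slots[self.start:] + self.slots[:self.start]


ring = TraceBuffer(capacity=3)
flushed = []
draining = TraceBuffer(capacity=3, on_full=flushed.extend)
for record in [1, 2, 3, 4, 5]:
    ring.push(record)
    draining.push(record)

print(ring.snapshot())                 # latest records only
print(flushed, draining.snapshot())    # nothing lost, but a callback ran
```

The ring strategy has a fixed cost and loses old data; the flush strategy keeps everything but calls out of the wasm module whenever the buffer fills up.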
Maybe the instrumenter could itself be compiled to wasm and the instrumentation of a debuggee wasm binary could be applied just before wasm compilation inside a webpage? yo_dawg.jpg
The displayer is some JavaScript module that collects the traced data from
inside the wasm memory and displays it with console.log, within some <pre>, or
as a cool canvas visualization or something.
Would be awesome if this worked with both node.js and the web (or if there were two versions).
Design decision: do all aggregation in the instrumented wasm (via linking a runtime into the instrumented code?) or post-process in this displayer JS? The former is likely more performant, but the latter might be easier?
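To make the post-processing option concrete, here is a sketch of a displayer-side decoder. It assumes a purely hypothetical record layout in linear memory (a little-endian u32 record count followed by pairs of u32 function index and u32 first argument), and it is written in Python only for illustration; a real displayer would read the same bytes through a typed array or DataView in JS.

```python
import struct
from collections import Counter

# Hypothetical layout: u32 count, then `count` records of (u32, u32).
# Wasm linear memory is little-endian, hence the "<" prefix.
COUNT = struct.Struct("<I")
RECORD = struct.Struct("<II")   # (function index, first argument)


def decode_records(memory, offset):
    """Read all trace records stored at `offset` in a memory snapshot."""
    (count,) = COUNT.unpack_from(memory, offset)
    first = offset + COUNT.size
    return [
        RECORD.unpack_from(memory, first + i * RECORD.size)
        for i in range(count)
    ]


# Fake a snapshot of wasm memory containing two records at offset 8.
memory = bytearray(64)
COUNT.pack_into(memory, 8, 2)
RECORD.pack_into(memory, 12, 7, 32)
RECORD.pack_into(memory, 20, 7, 64)

records = decode_records(memory, 8)
calls_per_function = Counter(func_index for func_index, _arg in records)
print(records)
print(calls_per_function)
```

With this split the instrumented code only appends raw records, and all counting, bucketing, and tree building happens outside the wasm module.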
https://github.com/WebAssembly/binaryen
Framework for compilation passes over wasm, analyses, instrumentation, etc. Written in C++, fairly mature.
https://crates.io/crates/parity-wasm
Wasm parser crate. Written in Rust and is pretty solid. However, it is really
only a parser / builder for wasm. Anyone doing analyses on top of this
probably has to build out more infrastructure compared to binaryen.
Hi @fitzgen, thanks for sharing your thoughts!
While working on Wasm3, I found that having such a tool would be very useful.
I'm aware of existing tools like Wasabi and sliminality/wasm-trace.
But I decided to give it a try and develop our own version: https://github.com/wasm3/wasm-trace
For example, I would really like to decouple the instrumentation, execution, and visualization phases, so that we can get similar (and directly comparable) results from different wasm engines and runtimes.
Getting good results already, see this thread on Twitter.
Would love to hear some feedback from you. Have a nice day 😃