Skip to content

Instantly share code, notes, and snippets.

@trevnorris
Last active April 9, 2016 16:24
  • Star 30 You must be signed in to star a gist
  • Fork 8 You must be signed in to fork a gist
Star You must be signed in to star a gist
Save trevnorris/7712539 to your computer and use it in GitHub Desktop.

build

Clone and build Node for analysis:

$ git clone https://github.com/joyent/node.git
$ cd node
$ export GYP_DEFINES="v8_enable_disassembler=1"
$ ./configure
$ make -j4

Note: To run with IRHydra run like the following:

$ node --trace-hydrogen --trace-phase=Z --trace-deopt \
  --code-comments --hydrogen-track-positions --redirect-code-traces \
  --redirect-code-traces-to=code.asm --print-opt-code script.js

Now build v8:

$ cd deps/v8
$ make dependencies
$ make -j4

Building modules with custom local build of Node:

$ /path/to/node/node $(which node-gyp) rebuild --release --nodedir=/path/to/node

(note: can also use --debug)

Install npm libraries using custom local build:

$ /path/to/node/node $(which npm) --nodedir=/path/to/node install

v8 flags

Print deoptimizations:

$ ./node --trace-deopt --code-comments ./script.js

Four types of deoptimizations:

  • eager: Occurs with unexpected element transitions.
  • lazy: Occurs when an assumption is made on a global which no longer holds true.
  • soft: As v8 collects type information on functions it will mark dead subgraphs of the AST (abstract syntax tree). If one of those marked graphs is called this will cause the function to deoptimize and tell v8 more type information needs to be collected.
  • debugger: Not really a bailout, but used by the debugger to deoptimize stack frames to allow inspection.

Print optimizations and inlining:

$ ./node --trace-opt --trace-inlining ./script.js

Check activity of the garbage collector:

$ ./node --trace_gc --print_cumulative_gc_stat ./script

Trace for allocations of external memory (e.g. Buffers/ArrayBuffers):

./node --trace_external_memory ./script

Grab v8 profile of script and use the tick processor:

$ ./node --prof ./script
$ D8_PATH=/path/to/deps/v8/out/native /path/to/deps/v8/tools/linux-tick-processor

Key things to look for from tick processor output:

  • Stub: Hand written assembly that follows the JS calling convention.
  • LazyCompile: Code was "lazily compiled", not that v8 spent that much time compiling code.
  • * prefix: The function has been run through the optimizing compiler (i.e. Crankshaft).
  • ~ prefix: The function was run through the full-codegen (or unoptimized) compiler, but v8 is letting us know the function could be optimized.
  • no prefix: Either that v8 doesn't have enough type info on the function to know whether it could be optimized, or the function has been marked as unoptimizable.

Find unnamed function in --trace-* output (using --trace-opt as example):

$ ./node --trace-opt --print-opt-code ./script > code.asm

The output will be a mix of the two flags. From --trace-opt the output will usually look like so:

[marking 0x1531090902b9 myFunctionName <JS Function (SharedFunctionInfo 0x3f91d3335139)> ...

But in the case where the function has no name it will look like this:

[marking 0x1531090902b9 <JS Function (SharedFunctionInfo 0x3f91d3335139)> ...

So grab the SharedFunctionInfo address and search for object: 0x3f91d3335139 <SharedFunctionInfo>. This will drop you into the execution body of the calling function. Scroll up and you'll see a body of text that contains the unnamed function.

perf

I suggest getting latest kernel and building perf (example from time of this writing):

$ wget https://www.kernel.org/pub/linux/kernel/v3.x/testing/linux-3.13-rc2.tar.xz
$ tar xvf linux-3.13-rc2.tar.xz
$ linux-3.13-rc2/tools/perf/
$ make -f Makefile.perf -j4

I installed it globally, so did the following:

$ sudo make -f Makefile.perf -j4 install prefix=/usr

Before getting perf output we'll need to prevent kernel pointers from being hidden:

$ sudo sysctl kernel/kptr_restrict=0

Now, let's start with some basic output:

$ perf stat -- ./node script.js

And some basic recording:

$ perf record -e cycles:u ./node script.js
$ perf report

Let's get an inverted call graph:

$ perf record --call-graph dwarf -e cycles -- ./node script.js
$ perf report

Get a list of all system tracepoints:

$ sudo perf list

Print how many times a specific syscall(s) is performed during execution:

$ sudo perf stat -e syscalls:sys_*_mmap -- ./node script.js

Get the stack traces for specific syscalls:

$ sudo perf record -e syscalls:sys_*_mmap --call-graph dwarf -- ./node script.js

List possible probes on Node

$ perf probe --no-demangle -F -x /path/to/node/node

Add probe for specific call in Node:

$ sudo perf probe --no-demangle -x /path/to/node/node -a uv_fs_open

Quick script to add probes for all Node (i.e. libuv) file system operations:

for i in `perf probe --no-demangle -F -x /path/to/node/node | egrep "^uv_fs_"`
do
  sudo perf probe --no-demangle -x /path/to/node/node -a $i
done

Or for everything:

for i in `perf probe --no-demangle -F -x /path/to/node/node | egrep "^uv_[^_]"`
do
  sudo perf probe --no-demangle -x /path/to/node/node -a $i
done

See how many times each are called:

$ sudo perf stat -e probe_node:uv_fs_* -- ./node script.js

And don't forget to remove them when you're all done :)

$ sudo perf probe -d uv_fs_*
@dshaw
Copy link

dshaw commented Dec 6, 2013

We have a new session of Performance Analysis Training scheduled for January. Join us!

Read the detailed syllabus: http://thenodefirm.com/training/performance-analysis
Grab a ticket: http://firm.io/perf-beta

@dshaw
Copy link

dshaw commented Jan 10, 2014

@trevnorris is on paternity leave for February. We'll be back on March 14, 2014.

Join us: http://firm.io/perf-beta

@dudleycarr
Copy link

When following the instructions above for node branched on 0.10.30, code.asm isn't generated using the command line parameters specified. Any idea why that might be?

@wildlyinaccurate
Copy link

Not sure when it changed, but in 0.10.30 all the V8 flags are underscore_separated, and even compiling with v8_enable_disassembler=1 doesn't enable the --print-opt-code (or --print_opt_code) flag. Any ideas?

Edit: Scratch that, the --print_opt_code flag was there. I just wasn't looking hard enough. @dudleycarr, try underscore_separating the flags and see if that works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment