Clone and build Node for analysis:
$ git clone https://github.com/joyent/node.git
$ cd node
$ export GYP_DEFINES="v8_enable_disassembler=1"
$ ./configure
$ make -j4
Note: To run with IRHydra run like the following:
$ node --trace-hydrogen --trace-phase=Z --trace-deopt \
--code-comments --hydrogen-track-positions --redirect-code-traces \
--redirect-code-traces-to=code.asm --print-opt-code script.js
Now build v8:
$ cd deps/v8
$ make dependencies
$ make -j4
Building modules with custom local build of Node:
$ /path/to/node/node $(which node-gyp) rebuild --release --nodedir=/path/to/node
(note: can also use --debug
)
Install npm libraries using custom local build:
$ /path/to/node/node $(which npm) --nodedir=/path/to/node install
Print deoptimizations:
$ ./node --trace-deopt --code-comments ./script.js
Four types of deoptimizations:
eager
: Occurs with unexpected element transitions.lazy
: Occurs when an assumption is made on a global which no longer holds true.soft
: As v8 collects type information on functions it will mark dead subgraphs of the AST (abstract syntax tree). If one of those marked graphs is called this will cause the function to deoptimize and tell v8 more type information needs to be collected.debugger
: Not really a bailout, but used by the debugger to deoptimize stack frames to allow inspection.
Print optimizations and inlining:
$ ./node --trace-opt --trace-inlining ./script.js
Check activity of the garbage collector:
$ ./node --trace_gc --print_cumulative_gc_stat ./script
Trace for allocations of external memory (e.g. Buffers/ArrayBuffers):
./node --trace_external_memory ./script
Grab v8 profile of script and use the tick processor:
$ ./node --prof ./script
$ D8_PATH=/path/to/deps/v8/out/native /path/to/deps/v8/tools/linux-tick-processor
Key things to look for from tick processor output:
Stub
: Hand written assembly that follows the JS calling convention.LazyCompile
: Code was "lazily compiled", not that v8 spent that much time compiling code.*
prefix: The function has been run through the optimizing compiler (i.e. Crankshaft).~
prefix: The function was run through the full-codegen (or unoptimized) compiler, but v8 is letting us know the function could be optimized.- no prefix: Either that v8 doesn't have enough type info on the function to know whether it could be optimized, or the function has been marked as unoptimizable.
Find unnamed function in --trace-*
output (using --trace-opt
as example):
$ ./node --trace-opt --print-opt-code ./script > code.asm
The output will be a mix of the two flags. From --trace-opt
the output will usually look like so:
[marking 0x1531090902b9 myFunctionName <JS Function (SharedFunctionInfo 0x3f91d3335139)> ...
But in the case where the function has no name it will look like this:
[marking 0x1531090902b9 <JS Function (SharedFunctionInfo 0x3f91d3335139)> ...
So grab the SharedFunctionInfo
address and search for object: 0x3f91d3335139 <SharedFunctionInfo>
. This will drop you into the execution body of the calling function. Scroll up and you'll see a body of text that contains the unnamed function.
I suggest getting latest kernel and building perf (example from time of this writing):
$ wget https://www.kernel.org/pub/linux/kernel/v3.x/testing/linux-3.13-rc2.tar.xz
$ tar xvf linux-3.13-rc2.tar.xz
$ linux-3.13-rc2/tools/perf/
$ make -f Makefile.perf -j4
I installed it globally, so did the following:
$ sudo make -f Makefile.perf -j4 install prefix=/usr
Before getting perf
output we'll need to prevent kernel pointers from being hidden:
$ sudo sysctl kernel/kptr_restrict=0
Now, let's start with some basic output:
$ perf stat -- ./node script.js
And some basic recording:
$ perf record -e cycles:u ./node script.js
$ perf report
Let's get an inverted call graph:
$ perf record --call-graph dwarf -e cycles -- ./node script.js
$ perf report
Get a list of all system tracepoints:
$ sudo perf list
Print how many times a specific syscall(s) is performed during execution:
$ sudo perf stat -e syscalls:sys_*_mmap -- ./node script.js
Get the stack traces for specific syscalls:
$ sudo perf record -e syscalls:sys_*_mmap --call-graph dwarf -- ./node script.js
List possible probes on Node
$ perf probe --no-demangle -F -x /path/to/node/node
Add probe for specific call in Node:
$ sudo perf probe --no-demangle -x /path/to/node/node -a uv_fs_open
Quick script to add probes for all Node (i.e. libuv) file system operations:
for i in `perf probe --no-demangle -F -x /path/to/node/node | egrep "^uv_fs_"`
do
sudo perf probe --no-demangle -x /path/to/node/node -a $i
done
Or for everything:
for i in `perf probe --no-demangle -F -x /path/to/node/node | egrep "^uv_[^_]"`
do
sudo perf probe --no-demangle -x /path/to/node/node -a $i
done
See how many times each are called:
$ sudo perf stat -e probe_node:uv_fs_* -- ./node script.js
And don't forget to remove them when you're all done :)
$ sudo perf probe -d uv_fs_*
We have a new session of Performance Analysis Training scheduled for January. Join us!
Read the detailed syllabus: http://thenodefirm.com/training/performance-analysis
Grab a ticket: http://firm.io/perf-beta