Erlang Tracing: more than you wanted to know

Rough Outline

  • What can be traced?
  • How can trace events be specified?
  • "match specifications": twisty passages, all alike
  • WTF, can I just use DTrace and drink my coffee/beer/whisky in peace?
  • Trace delivery mechanisms: pick one of two
  • Trace event receivers: what can they do?
  • Redbug vs. The World
  • Examples of event receivers

Reference docs

What can be traced?

  • Which processes?
    • A specific pid.
    • existing: All processes currently existing
    • new: All processes that will be created in the future
    • all: existing and new
  • What events?
    • function calls: call, return
    • sending of messages
    • receiving of messages
    • process-related events: spawn, exit, register, unregister, link, unlink, getting linked, getting unlinked
    • scheduling of processes: in, out
    • process exiting
    • garbage collection activities
  • Additional bells & whistles
    • microsecond timestamps on all messages
    • report arity # instead of arg list
    • set on spawn / first spawn / link / first link
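
Everything in this list maps onto the arguments and flags of erlang:trace/3 (plus erlang:trace_pattern/3 for call events). A minimal sketch of that raw BEAM API, which dbg and Redbug wrap for you; lists:seq/2 is just an arbitrary function to trace:

```erlang
%% Trace one pid; events go to the calling process (the shell) by default.
Pid = spawn(fun() -> receive go -> lists:seq(1, 10) end end),
erlang:trace(Pid, true,
             [call,                % function calls (also needs a trace pattern, below)
              send, 'receive',     % message sends and receives
              procs,               % spawn/exit/register/link/unlink/...
              running,             % scheduled in and out
              garbage_collection,
              timestamp,           % microsecond timestamps on every event
              arity,               % report arity instead of the argument list
              set_on_spawn]),      % spawned children inherit the trace flags
erlang:trace_pattern({lists, seq, 2}, true, [local]),
Pid ! go.
%% Messages such as {trace_ts, Pid, call, {lists,seq,2}, Timestamp} now
%% arrive in the shell's mailbox; flush() shows them.
```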

"match specifications": twisty passages, all alike

  • See the docs for ets:fun2ms/1 and ms_transform:parse_transform/2
  • Because regular expressions on Erlang data suck.
  • Match specs attempt to be Erlang'ish
    • And try to be quick to execute
    • But are obtuse to the 7th degree....
  • Example match specs:
    • [{'_',[],[{message,{return_trace}}]}]
    • [{{'$1','$2'},[{'>','$2',{const,3}},{is_atom,'$1'}],['$1']}]
      • ets:fun2ms(fun({A, B}) when is_atom(A), B > 3 -> A end).
  • Mostly useful for restricting tracing on function calls
    • After-the-fact tracing filters are good but...
    • ... generating exactly the traces that you want is better.
    • E.g., match foo:bar/3 but only if 1st arg is hellooooooo_world.
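
A sketch of that last example: the match spec can be generated with dbg:fun2ms/1 (the tracing sibling of ets:fun2ms/1) and installed with dbg; foo:bar/3 is the hypothetical function named in the bullet above.

```erlang
%% Inside a module you also need -include_lib("stdlib/include/ms_transform.hrl");
%% in the shell, dbg:fun2ms/1 works as-is.
MS = dbg:fun2ms(fun([hellooooooo_world, _, _]) -> return_trace() end),
%% MS =:= [{[hellooooooo_world,'_','_'], [], [{return_trace}]}]
dbg:tracer(),                 % default tracer: print events to the console
dbg:p(all, call),             % enable call tracing for all processes
dbg:tpl(foo, bar, 3, MS).     % trace foo:bar/3 (local + exported calls) with the match spec
```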

Trace delivery mechanisms: pick one of two

  • Erlang process receiver
    • Send the trace events to any pid in the cluster
    • Easy to use
  • TCP receiver
    • Send binary trace events to a TCP port listener
    • Faster & lower overhead: less likely to kill VM in trace event tsunami
  • IMPORTANT: All tracing events must go to the same single receiver.
    • DTrace is far more flexible, alas.
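
A sketch of the two delivery options as dbg exposes them (one tracer per node, so pick one; port 4711 is arbitrary):

```erlang
%% 1. Erlang process receiver: a handler fun (or any pid) consumes the events.
dbg:tracer(process, {fun(Event, Count) ->
                             io:format("~p~n", [Event]),
                             Count + 1
                     end, 0}).

%% 2. TCP receiver: binary events go to a trace port; attach a reader,
%%    possibly on another machine, with dbg:trace_client(ip, 4711).
dbg:tracer(port, dbg:trace_port(ip, 4711)).
```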

WTF, can I just use DTrace and drink my coffee/beer/whisky in peace?

  • I sooooo wish.......
  • ... unfortunate reality hits ....
  • DTrace isn't available everywhere that Erlang is
  • The VM's tracing infrastructure isn't a good match with DTrace, yet.
    • Pull requests are welcome!

DANGER: Erlang tracing can crash your VM!

It is quite possible to create a tsunami of trace events.

  • You can easily perturb performance with trace event receiver @ 100% CPU
  • You can easily run your machine out of RAM
  • You can deadlock the tracer and/or other processes.
  • Some trace patterns (notably return_trace) disable tail call optimization ... and can run your VM out of memory even when the # of trace events is very small.

To be safe, do all of these things:

  • Make your trace specifications as specific as possible
    • Do not use "'_':'_'"
    • I.e.: Any module + any function name
    • If you must use "your_module_name:'_'", run for a very short time, until you are confident that you will not create an event tsunami.
  • Simplify your trace receiver as much as possible
  • Limit the amount of time that tracing is enabled
  • Read the docs!
  • Redbug has all of these limits built in; that's one reason we use and recommend it.
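
As a sketch of how those limits look in practice, Redbug takes them as options; the trace spec below echoes the "your_module_name" example above, and the numbers are only illustrative.

```erlang
redbug:start("your_module_name:'_' -> return",
             [{time, 10000},    % stop tracing after 10 seconds, no matter what
              {msgs, 1000}]).   % ...or after 1000 trace messages, whichever comes first
```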

Trace event receivers: what can they do?

  • Anything, really
    • The events are just data. Fold/spindle/mutilate/....
  • In practice, simple is better
  • Most tracers just format the event, in the order received
    • Simple, stateless
    • For example, this is what Redbug does by default
      • But Redbug can use a custom event handler, if you wish....

Redbug vs. The World

  • Redbug's default trace event receiver is stateless and prints to the console
  • Redbug has an option to write binary events to a file for offline processing
  • Redbug can use multiple match specs at "full strength & power"
  • So ... when should you not use Redbug?
    • When an online & stateful event receiver is needed (see the sketch below)
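
A minimal sketch of that online & stateful case: a dbg handler fun that keeps a per-{M,F,Arity} call count in a map instead of printing each event. lists:seq/2 is an arbitrary target here, and a real tracer would also report the map periodically, as the examples below do.

```erlang
dbg:tracer(process,
           {fun({trace, _Pid, call, {M, F, Args}}, Counts) ->
                    %% bump the counter for this {Module, Function, Arity}
                    maps:update_with({M, F, length(Args)}, fun(N) -> N + 1 end, 1, Counts);
               (_OtherEvent, Counts) ->
                    Counts
            end, #{}}),
dbg:p(all, call),
dbg:tp(lists, seq, 2, [{'_', [], [true]}]).   % match-all pattern on an exported function
```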

Examples of event receivers

Redbug specification:

> redbug:start("erlang:now -> return").
ok
21:31:52 <timer_server> {erlang,now,[]}
21:31:52 <timer_server> {erlang,now,0} -> {1385,555512,322417}
21:31:53 <timer_server> {erlang,now,[]}
21:31:53 <timer_server> {erlang,now,0} -> {1385,555513,322384}
[...]
quitting: msg_count

To limit the formatting depth (the "..." in the output marks omitted data):

(dev1@127.0.0.1)26> redbug:start("timer:'_' -> return", {print_depth,7}).
ok
21:33:41 <{erlang,apply,2}> {timer,tc,[riak_pipe_stat,produce_stats,[]]}
21:33:41 <<0.3234.1>> {timer,now_diff,[{1385,555621,...},{1385,...}]}
21:33:41 <<0.3234.1>> {timer,now_diff,2} -> 130
21:33:41 <<0.3234.1>> {timer,tc,3} -> {130,
                                       {riak_pipe,[{{...},...},{...}|...]}}

For Redbug help, run redbug:help(). For an explanation of Redbug's print_depth option, see the ~P formatting description in the io man page.

WARNING: Cut-and-paste code style

  • These are not great examples of Erlang code style
  • Close your eyes if you are sensitive to ugly code. ^_^

fsm_init_tracer.erl

Source: Scott Fritchie: https://gist.github.com/slfritchie/007b399675fb76f2d4f0

Report every Interval seconds the count of init() function calls for the following Riak KV FSM modules: riak_kv_buckets_fsm, riak_kv_exchange_fsm, riak_kv_get_fsm, riak_kv_index_fsm, riak_kv_keys_fsm, riak_kv_put_fsm

timeit.erl

Source: Greg Burd: https://gist.github.com/2656289

%% @doc Dynamically add timing to MFA.  There are various types of
%% timing.
%%
%% {all, Max} - time latency of all calls to MFA
%%
%% {sample, N, Max} - sample every N calls and stop sampling after Max
%%
%% {threshold, Millis, Max} - count # of calls where latency is > Millis
%% and count # of calls total, thus percentage of calls over threshold

Sample output:

> timeit:timeit(timer, tc, 3, {sample, 10, 5}). 
{ok,[{matched,'dev1@127.0.0.1',1},{saved,1}]}
[sample 1/10] <0.10771.1>:timer:tc: 0.152 ms
[sample 2/20] <0.10796.1>:timer:tc: 0.484 ms
[sample 3/30] <0.10825.1>:timer:tc: 0.111 ms
[sample 4/40] <0.10849.1>:timer:tc: 0.101 ms
[sample 5/50] <0.10875.1>:timer:tc: 194.682 ms

fsm_latency_tracer.erl

Source: Scott Fritchie: https://gist.github.com/slfritchie/7a87269148280e4724ce

Report each riak_kv_put_fsm and riak_kv_get_fsm process that executes for more than LatencyMS milliseconds. Also, for various strategic API functions in the file, bitcask, and eleveldb modules, report latencies above the same limit.

backend_latency_tracer.erl

Source: Scott Fritchie: https://gist.github.com/slfritchie/e7c3ad866f67d3fc8935

A more sophisticated version of fsm_latency_tracer. Watches more API functions in the bitcask, eleveldb, file, and riak_kv_fs2_backend modules. Uses the folsom library to generate histograms of file:pread() latencies, showing outliers and multi-modal patterns that simple averages & maximums cannot show.
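
A sketch (not the gist's code) of the folsom calls that pattern boils down to; the pread_latency name and sample values are made up:

```erlang
folsom_metrics:new_histogram(pread_latency),
%% one notify per traced file:pread call, with its measured latency:
[folsom_metrics:notify({pread_latency, Micros}) || Micros <- [120, 95, 4300]],
folsom_metrics:get_histogram_statistics(pread_latency).
%% -> [{min,_}, {max,_}, {arithmetic_mean,_}, {percentile,_}, {histogram,_}, {n,_}, ...]
```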

accumulating_time_tracer.erl

Source: Scott Fritchie: https://gist.github.com/slfritchie/c01690d9c30e5e7b0100

Measures how much of each second the file_server_2 process is busy. It shows pretty dramatically how Bitcask's highest latencies are frequently caused by serialization through file_server_2.

trace_large4.erl

Source: Joe Blomstedt: https://gist.github.com/jtuple/244c578773962077e215

  • For calls to riak_kv_get_fsm:calculate_objsize/2, print the Bucket, Key, and object size.
  • For calls to riak_kv_eleveldb_backend:put/5, riak_kv_bitcask_backend:put/5, and riak_kv_memory_backend:put/5, report if the object size is larger than Size.
  • Stop after Count events have been recorded or Time milliseconds have elapsed.

read_bin_trace_file.erl

Source: Scott Fritchie: https://gist.github.com/slfritchie/23121803af63d76eeca5

A small example of reading a binary-formatted Erlang trace file and doing something simple with it.

Note that Redbug can write its trace events into a binary file. This technique can be used for off-line processing, e.g., to avoid high processing cost in a production environment.
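
A sketch of the same idea, assuming a file in the standard trace-port format that dbg:trace_client/3 understands (the path is hypothetical):

```erlang
Handler = fun(end_of_trace, N) -> io:format("~p events total~n", [N]), N;
             (Event,        N) -> io:format("~p~n", [Event]), N + 1
          end,
dbg:trace_client(file, "/tmp/trace.bin", {Handler, 0}).
```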

func_args_tracer.erl and latency_histogram_tracer.erl

Source: Scott Fritchie: https://gist.github.com/slfritchie/159a8ce1f49fc03c77c6

func_args_tracer.erl

%% For example: what ETS tables are being called the most by ets:lookup/2?
%% The 1st arg of ets:lookup/2 is the table name.
%% Watch for 10 seconds.
%%
%% > func_args_tracer:start(ets, lookup, 2, 10, fun(Args) -> hd(Args) end).
%% 
%% Tracer pid: <0.16102.15>, use func_args_tracer:stop() to stop
%% Otherwise, tracing stops in 10 seconds
%% Current date & time: {2013,9,19} {18,5,48}
%% {started,<0.16102.15>}
%% Total calls: 373476 
%% Call stats:         
%% [{folsom_histograms,114065},
%%  {ac_tab,69689},
%%  {ets_riak_core_ring_manager,67147},
%%  {folsom_spirals,57076},
%%  {riak_capability_ets,48862},
%%  {riak_core_node_watcher,8149},
%%  {riak_api_pb_registrations,8144},
%%  {folsom,243},
%%  {folsom_meters,43},
%%  {folsom_durations,20},
%%  {timer_tab,18},
%%  {folsom_gauges,8},
%%  {riak_core_stat_cache,5},
%%  {sys_dist,3},
%%  {inet_db,1},
%%  {21495958,1},
%%  {3145765,1},
%%  {3407910,1}]

latency_histogram_tracer.erl

%% For example: create a histogram of call latencies for bitcask:put/3.
%% Watch for 10 seconds.
%%
%% > latency_histogram_tracer:start(bitcask, put, 3, 10).
%% 
%% Tracer pid: <0.2108.18>, use latency_histogram_tracer:stop() to stop
%% Otherwise, tracing stops in 10 seconds
%% Current date & time: {2013,9,19} {18,14,13}
%% {started,<0.2108.18>}
%% Histogram stats:     
%% [{min,0},
%%  {max,48},
%%  {arithmetic_mean,2.765411819271055},
%%  [...]
%%  {percentile,[{50,3},{75,4},{90,5},{95,6},{99,8},{999,14}]},
%%  {histogram,[{1,13436},
%%              {2,12304},
%%              {3,10789},
%%              {4,7397},
%%              {5,4191},
%%              {6,1929},
%%              [...]
%%              {30,0},
%%              {31,0},
%%              {40,2},
%%              {50,1}]},
%%  {n,51746}]

ktrace.erl and ktrace_fold.erl

Source: Joe Blomstedt: https://gist.github.com/jtuple/161ad7c8c278b45729df
Source: Jordan West: https://gist.github.com/jrwest/2999f8f217f1cbceca83

  • Attempts to capture enough data from Erlang tracing to construct a flamegraph.
  • Aside/see also: https://github.com/proger/eflame
  • On VM #1: ktrace:c()
  • On VM #2, on same machine, running Riak: ktrace:s()
  • Then get Riak to do some get requests.
  • ktrace output should appear in VM #1's console. Save it to a file.
  • Run ktrace_fold:go("/path/to/input-file", "/path/to/output-file").
  • Run flamegraph.pl /path/to/output-file > /path/to/output.svg

For example: http://www.snookles.com/scotttmp/vnode-get.svg ... which suggests that only about 15% of the wall-clock time for riak_kv_vnode get processing is due to the backend (Bitcask in this case).

Questions?
