Brendan Gregg brendangregg

dtruss cat /etc/networks
##
# Networks Database
##
loopback 127 loopback-net
SYSCALL(args) = return
__sysctl(0x7FFF5FAB9074, 0x2, 0x7FFF5FAB9060) = 0 0
bsdthread_register(0x7FFF8F249174, 0x7FFF8F249164, 0x2000) = 0 0
thread_selfid(0x7FFF8F249174, 0x7FFF8F249164, 0x0) = 12380231 0
mmap(0x0, 0x2000, 0x3, 0x1002, 0x1000000, 0x0) = 0x14C000 0
@brendangregg
brendangregg / gist:5616407
Last active December 17, 2015 13:19
odsync binary
begin 755 odsync
M?T5,1@$! 08! ( P ! P L%"#0 <' #0 ( &
M "@ '0 < 8 T - %" # P 4 P /0
M #T 4( !$ 1 ! ! !0@ \0\
M /$/ % ! $ #T#P ] \&" #X 0 & ( < $
M @ $ 0 ! $ 8( #@! !P !0Y61D" $ @!!0@
M # P $ ! "]U<W(O;&EB+VQD+G-O+C$ 1L#.VP0
M 0 )0 ", ! @ ! 8 ( "0
M H + #@ 0
M$0 !0 6 %P 8 &0 !H =
brendangregg / gist:6877796
Created October 8, 2013 01:03
mac os x instruments: thread states
#!/usr/sbin/dtrace -C -Z -s
/*
To run this script, please type the following command while in the same
directory (or specify the full path rather than ./):
sudo ./threadstates -o dtrace_output.txt <pid>
After the output file has been generated (here, it's dtrace_output.txt but you
may call it whatever you like), call up the trace document where
this script was exported from and choose the "DTrace Data Import..." option
brendangregg / gist:10691399
Created April 15, 2014 00:04
stap use-server no avahi
$ stap -v --use-server=10.140.145.195:7001 -ve 'global ops; probe syscall.* { ops[probefunc()] <<< 1; }'
Systemtap translator/driver (version 1.8/0.152 non-git sources)
Copyright (C) 2005-2012 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
enabled features: AVAHI LIBRPM LIBSQLITE3 NSS BOOST_SHARED_PTR TR1_UNORDERED_MAP NLS
Created temporary directory "/tmp/stapuKSnAK"
Session arch: x86_64 release: 3.2.41-nflx
Using a compile server.
Running sh -c cd '/tmp/stapuKSnAK/client' && zip -qr '/tmp/stapuKSnAK/client.zip' *
Spawn waitpid result (0x0): 0
brendangregg / rwtime.stp
Created June 27, 2014 23:28
I CAN HAZ SYSTEMTAP
#!/usr/bin/stap
/*
* rwtime.stp read/write syscalls by latency.
*
* USAGE: ./rwtime.stp [execname]
*
* An optional argument of the program name, eg, "httpd", can be provided. Without
* this, all processes are traced.
*
* 24-Jun-2014 Brendan Gregg Created this.
brendangregg / rwtime2.stp
Created June 28, 2014 02:02
I CAN HAZ ARGV
#!/usr/bin/stap
/*
* rwtime.stp read/write syscalls by latency.
*
* USAGE: ./rwtime.stp [execname]
*
* An optional argument of the program name, eg, "httpd", can be provided. Without
* this, all processes are traced.
*
* 24-Jun-2014 Brendan Gregg Created this.
brendangregg / gist:f8ed5345cfc903599a60
Created August 5, 2014 01:08
dynamic tracing of ZFS on Linux, on Linux

So I just found ZFS on my test Ubuntu Linux system, and gave my perf-tools (https://github.com/brendangregg/perf-tools) a spin.

Per-second zfs* calls:

# ./funccount -Ti 1 'zfs*'
Tracing "zfs*"... Ctrl-C to end.

Tue Aug  5 00:51:41 UTC 2014
FUNC                              COUNT
brendangregg / fsmicrobench.md
Last active February 16, 2022 08:25
some FS micro-benchmarks

F1. FS 128k streaming writes

Benchmark: fio write

Command: fio --name=seqwrite --rw=write --bs=128k --size=4g --end_fsync=1 --loops=4 # aggrb tput

Rationale: Measure the performance of a single threaded streaming write of a reasonably large file. The aim is to measure how well the file system and platform can sustain a write workload, which will depend on how well it can group and dispatch writes. It's difficult to benchmark buffered file system writes in both a short duration and in a repeatable way, as performance greatly depends on if and when the pagecache begins to flush dirty data. As a workaround, an fsync() at the end of the benchmark is called to ensure that flushing will always occur, and the benchmark also repeats four times. While this provides a much more reliable measurement, it is somewhat worst-case (applications don't always fsync), providing closer to a minimum rate – rather than a maximum rate – that you should expect.
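
The idea of timing a buffered streaming write through to a final fsync() can be sketched in Python. This is a rough illustration only, not a substitute for fio: the function names (`streaming_write`, `run`), the 8 MB file size, and the use of a temporary file are all assumptions made for the example, and this sketch calls fsync() in every pass, which may differ from how fio applies `--end_fsync` across `--loops`.

```python
import os
import tempfile
import time


def streaming_write(path, block_size=128 * 1024, total_bytes=8 * 1024 * 1024):
    """Write total_bytes sequentially in block_size chunks, fsync, return MB/s.

    Hypothetical helper for illustration; sizes are much smaller than the
    4 GB used by the fio command so that it finishes quickly.
    """
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            f.write(block)
            written += block_size
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # force dirty pages out before stopping the clock
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6


def run(loops=4):
    """Repeat the write several times and return the per-pass rates in MB/s."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return [streaming_write(path) for _ in range(loops)]
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for i, rate in enumerate(run(), 1):
        print(f"pass {i}: {rate:.1f} MB/s")
```

Because the clock stops only after fsync() returns, the reported rate includes the flush cost, which is why it sits closer to a minimum than a maximum. If the temporary directory is on a memory-backed file system, the fsync() is nearly free and the numbers will not reflect storage performance.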

F2. FS cached 4k random reads

brendangregg / gist:d75f7f14da1126a7d31d
Created December 5, 2014 01:28
v8 perf-basic-prof weird symbol
13a80b1584f5 RegExp:\bwebOS(?:/[\d.]+|[ \w.]*) (/tmp/perf-7539.map)
brendangregg / gist:899851d03ed4bf543303
Last active August 29, 2015 14:17
sample perf problem answers

G'Day, I didn't have an email addr so I pasted this into a gist. Just wanted to explain things a bit better than I could on Twitter. Good luck!

1. CPU-bound

Depends what you mean by CPU bound: bound by availability or speed.

If the CPUs themselves are hot, then this is easy. "mpstat -P ALL 1" will show the hot CPUs. If single threads are causing it, then "pidstat -t 1" will identify them (although, it could be a large thread pool competing). This approach identifies if something is resource constrained by CPUs, but not bound by their speed.

Imagine a single-CPU system running at 10% utilization, with an application processing 1 request per second, each taking 100 ms, all of it CPU time. The application's performance is CPU (speed) bound, but the system looks mostly idle. This can be identified by comparing wall time vs CPU time for the request. Most languages have a way to get the CPU time (something getrusage() related). Eg, if you were able to measure that the request took 100 ms, and 100 ms of CPU time was consumed,