Skip to content

Instantly share code, notes, and snippets.

@EvanTheB
Created July 31, 2019 01:46
Show Gist options
  • Save EvanTheB/5b64eafb84eeaf51c289295ac06e1b0b to your computer and use it in GitHub Desktop.
Save EvanTheB/5b64eafb84eeaf51c289295ac06e1b0b to your computer and use it in GitHub Desktop.
Why does one awk go faster than the other?
File is ~100 million lines, with ~30 categories in the first column. So sort type 1 should execute ~30 subprocesses.
Takes ~ 180 seconds, so subprocess overhead should be very minimal.
Type 1, multiple sub-sorts:
$ time env time -v -o sort1.time awk -v cmd="sort -k2,2n" '$1 != prev {close(cmd); prev=$1} {print | cmd}' vcf.bed > vcf.sort1.bed
real 2m57.867s
user 0m46.381s
sys 4m14.309s
Type 2, decorate sort undecorate:
$ time env time -v -o sort2.time awk '{print ($1 in a? c+0: ++c)"\t" $0; a[$1]}' vcf.bed | time -v -o sort2.sort.time sort -k1,1n -k3,3n | time -v -o sort2.cut.time cut -f 2- > vcf.sort2.bed
real 2m31.199s
user 0m24.973s
sys 2m14.660s
Surprisingly, (to me) the type 1 takes more realtime (+20%), AND more cpu time (+100%).
More detailed /usr/bin/time -v reports below.
Things that stand out to me (type 1 vs 2):
LOTS of page faults - mmap-ed io? (1135087 vs 301240)
Less memory - (smaller sorts?) - 55052 vs 74472
LOTS of context switches - un or line buffered io? - 28717559 vs 392500
similar fs out calls , ~3 million
It would be nice to also get time on type 1 awk's child sort.
I could also ltrace awk to see how it does print | cmd...
Command being timed: "awk -v cmd=sort -k2,2n $1 != prev {close(cmd); prev=$1} {print | cmd} vcf.bed"
User time (seconds): 46.38
System time (seconds): 254.17
Percent of CPU this job got: 169%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:57.61
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 55052
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1135087
Voluntary context switches: 28717559
Involuntary context switches: 559
Swaps: 0
File system inputs: 0
File system outputs: 3356744
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Command being timed: "awk {print ($1 in a? c+0: ++c)"\t" $0; a[$1]} vcf.bed"
User time (seconds): 5.71
System time (seconds): 50.37
Percent of CPU this job got: 42%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:11.39
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 5060
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 7589
Voluntary context switches: 124
Involuntary context switches: 86
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Command being timed: "sort -k1,1n -k3,3n"
User time (seconds): 18.32
System time (seconds): 76.23
Percent of CPU this job got: 62%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:31.18
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 74472
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 301240
Voluntary context switches: 392500
Involuntary context switches: 342
Swaps: 0
File system inputs: 0
File system outputs: 3845120
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Command being timed: "cut -f 2-"
User time (seconds): 0.92
System time (seconds): 7.86
Percent of CPU this job got: 5%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:30.89
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 4664
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 4319
Voluntary context switches: 39822
Involuntary context switches: 58
Swaps: 0
File system inputs: 0
File system outputs: 1680728
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment