Created
July 31, 2019 01:46
-
-
Save EvanTheB/5b64eafb84eeaf51c289295ac06e1b0b to your computer and use it in GitHub Desktop.
Why does one awk go faster than the other?
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
File is ~100 million lines, with ~30 categories in the first column. So sort type 1 should execute ~30 subprocesses. | |
Takes ~ 180 seconds, so subprocess overhead should be very minimal. | |
Type 1, multiple sub-sorts: | |
$ time env time -v -o sort1.time awk -v cmd="sort -k2,2n" '$1 != prev {close(cmd); prev=$1} {print | cmd}' vcf.bed > vcf.sort1.bed | |
real 2m57.867s | |
user 0m46.381s | |
sys 4m14.309s | |
Type 2, decorate sort undecorate: | |
$ time env time -v -o sort2.time awk '{print ($1 in a? c+0: ++c)"\t" $0; a[$1]}' vcf.bed | time -v -o sort2.sort.time sort -k1,1n -k3,3n | time -v -o sort2.cut.time cut -f 2- > vcf.sort2.bed | |
real 2m31.199s | |
user 0m24.973s | |
sys 2m14.660s | |
Surprisingly, (to me) the type 1 takes more realtime (+20%), AND more cpu time (+100%). | |
More detailed /usr/bin/time -v reports below. | |
Things that stand out to me (type 1 vs 2): | |
LOTS of page faults - mmap-ed io? (1135087 vs 301240) | |
Less memory - (smaller sorts?) - 55052 vs 74472 | |
LOTS of context switches - un or line buffered io? - 28717559 vs 392500 | |
similar fs out calls , ~3 million | |
It would be nice to also get time on type 1 awk's child sort. | |
I could also ltrace awk to see how it does print | cmd... | |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Command being timed: "awk -v cmd=sort -k2,2n $1 != prev {close(cmd); prev=$1} {print | cmd} vcf.bed" | |
User time (seconds): 46.38 | |
System time (seconds): 254.17 | |
Percent of CPU this job got: 169% | |
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:57.61 | |
Average shared text size (kbytes): 0 | |
Average unshared data size (kbytes): 0 | |
Average stack size (kbytes): 0 | |
Average total size (kbytes): 0 | |
Maximum resident set size (kbytes): 55052 | |
Average resident set size (kbytes): 0 | |
Major (requiring I/O) page faults: 0 | |
Minor (reclaiming a frame) page faults: 1135087 | |
Voluntary context switches: 28717559 | |
Involuntary context switches: 559 | |
Swaps: 0 | |
File system inputs: 0 | |
File system outputs: 3356744 | |
Socket messages sent: 0 | |
Socket messages received: 0 | |
Signals delivered: 0 | |
Page size (bytes): 4096 | |
Exit status: 0 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Command being timed: "awk {print ($1 in a? c+0: ++c)"\t" $0; a[$1]} vcf.bed" | |
User time (seconds): 5.71 | |
System time (seconds): 50.37 | |
Percent of CPU this job got: 42% | |
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:11.39 | |
Average shared text size (kbytes): 0 | |
Average unshared data size (kbytes): 0 | |
Average stack size (kbytes): 0 | |
Average total size (kbytes): 0 | |
Maximum resident set size (kbytes): 5060 | |
Average resident set size (kbytes): 0 | |
Major (requiring I/O) page faults: 0 | |
Minor (reclaiming a frame) page faults: 7589 | |
Voluntary context switches: 124 | |
Involuntary context switches: 86 | |
Swaps: 0 | |
File system inputs: 0 | |
File system outputs: 0 | |
Socket messages sent: 0 | |
Socket messages received: 0 | |
Signals delivered: 0 | |
Page size (bytes): 4096 | |
Exit status: 0 | |
Command being timed: "sort -k1,1n -k3,3n" | |
User time (seconds): 18.32 | |
System time (seconds): 76.23 | |
Percent of CPU this job got: 62% | |
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:31.18 | |
Average shared text size (kbytes): 0 | |
Average unshared data size (kbytes): 0 | |
Average stack size (kbytes): 0 | |
Average total size (kbytes): 0 | |
Maximum resident set size (kbytes): 74472 | |
Average resident set size (kbytes): 0 | |
Major (requiring I/O) page faults: 0 | |
Minor (reclaiming a frame) page faults: 301240 | |
Voluntary context switches: 392500 | |
Involuntary context switches: 342 | |
Swaps: 0 | |
File system inputs: 0 | |
File system outputs: 3845120 | |
Socket messages sent: 0 | |
Socket messages received: 0 | |
Signals delivered: 0 | |
Page size (bytes): 4096 | |
Exit status: 0 | |
Command being timed: "cut -f 2-" | |
User time (seconds): 0.92 | |
System time (seconds): 7.86 | |
Percent of CPU this job got: 5% | |
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:30.89 | |
Average shared text size (kbytes): 0 | |
Average unshared data size (kbytes): 0 | |
Average stack size (kbytes): 0 | |
Average total size (kbytes): 0 | |
Maximum resident set size (kbytes): 4664 | |
Average resident set size (kbytes): 0 | |
Major (requiring I/O) page faults: 0 | |
Minor (reclaiming a frame) page faults: 4319 | |
Voluntary context switches: 39822 | |
Involuntary context switches: 58 | |
Swaps: 0 | |
File system inputs: 0 | |
File system outputs: 1680728 | |
Socket messages sent: 0 | |
Socket messages received: 0 | |
Signals delivered: 0 | |
Page size (bytes): 4096 | |
Exit status: 0 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment