@sergeylukin
Last active December 17, 2015 06:19
Profiling AWK vs GREP in sorting file contents in custom order (AWK is 2 times faster)
#!/usr/bin/env bash
###
### awk_vs_grep.sh - Profiles two methods of sorting file contents in a custom order
### Requires bash, bc, GNU date (for %N) and a file "awk_vs_grep.txt" with some content in the current directory
### for more details refer to this Question: http://unix.stackexchange.com/questions/75498/sort-using-custom-pattern/
###
### Usage: `./awk_vs_grep.sh` - will execute commands 10 times and only print profile results
### Usage: `./awk_vs_grep.sh 1000` - will execute commands 1000 times and only print profile results
### Usage: `./awk_vs_grep.sh 1000 /dev/tty` - will execute commands 1000 times and print commands output + results
# how many times to run profiled commands - default 10
TIMES=${1:-10}
# define OUT for profiled commands - default null device
OUT=${2:-/dev/null}
# Define Helpers
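# Print the current time in nanoseconds since the epoch.
# %N is a GNU date extension; BSD/macOS date does not support it.
currtime() {
  date +%s%N
}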
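# Usage: diff_in_seconds END_NS START_NS
# Prints the elapsed time in seconds with 3 decimal places,
# e.g. `diff_in_seconds 2500000000 1000000000` prints 1.500
diff_in_seconds() {
  local end_time=$1
  local start_time=$2
  echo "scale=3; ($end_time - $start_time) / 10^9" | bc
}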
# Profile AWK method
AWK_START_TIME=$(currtime)
for ((n = 0; n < TIMES; n++)); do
  # Sort first, print lines starting with b or d right away, buffer the rest and print it at the end
  sort awk_vs_grep.txt | awk '$0 ~ /^b/ || $0 ~ /^d/ {print} $0 !~ /^b/ && $0 !~ /^d/ { a[f++] = $0 } END { for (word = 0; word < f; word++) { print a[word] } }' > "$OUT"
done
AWK_TIME=$(diff_in_seconds "$(currtime)" "$AWK_START_TIME")
# Profile GREP method
GREP_START_TIME=$(currtime)
for ((n = 0; n < TIMES; n++)); do
  # Three passes over the file: b-lines, d-lines, then everything else sorted
  grep '^b' awk_vs_grep.txt > "$OUT"
  grep '^d' awk_vs_grep.txt > "$OUT"
  grep -v '^b' awk_vs_grep.txt | grep -v '^d' | sort > "$OUT"
done
GREP_TIME=$(diff_in_seconds "$(currtime)" "$GREP_START_TIME")
# Print Results
echo "Executed commands $TIMES times"
echo "AWK total time: $AWK_TIME seconds"
echo "GREP total time: $GREP_TIME seconds"
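
The script assumes "awk_vs_grep.txt" already exists and discards the command output by default, so it does not show what the two pipelines actually produce. The following is a minimal sketch for trying both on a tiny input. The five sample words and the output file names `awk_out.txt` and `grep_out.txt` are made up for illustration and do not come from the linked question; note that it overwrites any existing "awk_vs_grep.txt" in the current directory.

```shell
# Hypothetical sample data: one word per line, in arbitrary order.
printf '%s\n' cherry banana apple date elderberry > awk_vs_grep.txt

# AWK method: sort, print lines starting with b or d first, buffer the rest.
sort awk_vs_grep.txt | awk '$0 ~ /^b/ || $0 ~ /^d/ {print} $0 !~ /^b/ && $0 !~ /^d/ { a[f++] = $0 } END { for (word = 0; word < f; word++) { print a[word] } }' > awk_out.txt

# GREP method: three passes over the file, collected into one output file.
{
  grep '^b' awk_vs_grep.txt
  grep '^d' awk_vs_grep.txt
  grep -v '^b' awk_vs_grep.txt | grep -v '^d' | sort
} > grep_out.txt

# Compare the two results.
if cmp -s awk_out.txt grep_out.txt; then
  echo "outputs match"
else
  echo "outputs differ"
fi
```

With this input both files contain banana, date, apple, cherry, elderberry, one per line. The two methods are not guaranteed to agree on every input: the AWK pipeline emits the b- and d-lines in sorted order, while the GREP pipeline emits them in file order, so a file with several b-lines in unsorted order would give different results.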