@lengerfulluse
Last active July 18, 2018 05:58
Commonly used awk commands

1. Count the occurrences of each distinct line, like sort | uniq -c.

awk '{ tot[$0]++ } END { for (i in tot) printf("%s\t%s\n", i,tot[i]) }'
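
A quick way to try the counter, assuming a small made-up input (the fruit names and the `count_lines` wrapper are only illustrative). awk's `for (i in tot)` visits keys in an unspecified order, so this sketch pipes the result through sort to get stable output:

```shell
count_lines() {
  # Tally each distinct input line, then sort because
  # awk's for-in iteration order is unspecified.
  awk '{ tot[$0]++ } END { for (i in tot) printf("%s\t%s\n", i, tot[i]) }' | sort
}

# Prints three tab-separated rows: apple 3, banana 1, cherry 1.
printf 'apple\nbanana\napple\ncherry\napple\n' | count_lines
```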

2. Find multiple occurrences of a string in one line

sed 's/\[http-/\n&/g; s/userid":/\n&/g;s/recmid":"/\n&/g;s/[^\n]*\nuserid":\([[:digit:]]*\)[^\n]*/\1 /g;s/.$//'

3. Basic shell hashmap (associative array) usage:

http://www.artificialworlds.net/blog/2012/10/17/bash-associative-array-examples/
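
For quick reference without following the link, here is a minimal sketch of the basic operations. It assumes bash 4 or newer (`declare -A` is not available in POSIX sh or bash 3), and the `color` map with its keys is hypothetical:

```shell
# Requires bash 4+; the map name and contents are made up for illustration.
declare -A color
color[apple]=red
color[banana]=yellow

# Look up a value by key.
echo "${color[apple]}"

# Test whether a key exists: ${var+word} expands to word only if the key is set.
if [ -z "${color[cherry]+set}" ]; then
  echo "cherry is not in the map"
fi

# Iterate over keys (order is unspecified) and report the entry count.
for key in "${!color[@]}"; do
  echo "$key -> ${color[$key]}"
done
echo "entries: ${#color[@]}"
```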

4. Insert a newline before a matched pattern

echo foo | perl -pe 's/(.*)/\n$1/'

5. Load one file into a hashmap and look up its keys from another file:

awk '
BEGIN{
   FS=OFS=","
   while ( (getline line < "lookup_file.txt") > 0 ) {
      split(line,f)
      map[f[1]] = f[2]
   }
}
{ $3 = map[$3]; print }
' data.txt

# Another simple example:
# the first pass reads ledger-email-part1.done and stores column 1 -> column 2 in the hash 'h';
# the second pass reads prod-customerID.part1 and prints h[$2] for each of its lines.
awk 'NR==FNR {h[$1] = $2; next} {print h[$2]}' ledger-email-part1.done prod-customerID.part1
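
To see the two-pass idiom in action, here is a sketch on throwaway data. The file names, IDs, and addresses are hypothetical, and the real files are assumed to be whitespace-separated:

```shell
# Hypothetical sample data in a temporary directory.
tmp=$(mktemp -d)
printf 'c1 alice@example.com\nc2 bob@example.com\n' > "$tmp/emails.txt"
printf 'order1 c2\norder2 c1\n' > "$tmp/customers.txt"

# NR==FNR holds only while awk reads the first file: build h[id] = email.
# For the second file, print the email stored under the ID in column 2.
result=$(awk 'NR==FNR { h[$1] = $2; next } { print h[$2] }' \
  "$tmp/emails.txt" "$tmp/customers.txt")

# Prints bob@example.com, then alice@example.com.
printf '%s\n' "$result"
rm -rf "$tmp"
```

One caveat: if the first file is empty, NR==FNR is also true for the second file, so the lookup step never runs.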

6. Find files in a directory and execute a command on each

find . -exec cmd {} \;
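
Ending -exec with `\;` runs the command once per file, while ending it with `+` passes as many files as fit to a single invocation. A small sketch that makes the difference visible, using made-up file names in a temporary directory:

```shell
# Hypothetical files in a throwaway directory.
tmp=$(mktemp -d)
touch "$tmp/a.log" "$tmp/b.log" "$tmp/c.txt"

# With \; echo runs once per match, producing one output line per file.
once_per_file=$(find "$tmp" -type f -name '*.log' -exec echo {} \; | wc -l | tr -d ' ')

# With + echo runs once with both paths as arguments, producing a single line.
batched=$(find "$tmp" -type f -name '*.log' -exec echo {} + | wc -l | tr -d ' ')

echo "per-file invocations: $once_per_file, batched invocations: $batched"
rm -rf "$tmp"
```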

7. Merge each even line into the preceding odd line in Vim

:global/^/join

8. Sed capture groups with a regex match

sed -n -e 's/^.*customer:\(.*\) with amount:\(.*\) for gcId:\(.*\), via.*$/\1,\2,\3/p' ledger-gc-part1.done
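
A sketch of what the capture groups extract. The log line below is hypothetical; only its overall shape is inferred from the regex, not taken from the real file:

```shell
# Hypothetical log line shaped to fit the regex above.
line='2018-07-18 INFO refund customer:C123 with amount:25.00 for gcId:GC789, via batch job'

# -n suppresses default output; the trailing p prints only lines where the
# substitution succeeded, leaving the three captured fields joined by commas.
fields=$(printf '%s\n' "$line" |
  sed -n -e 's/^.*customer:\(.*\) with amount:\(.*\) for gcId:\(.*\), via.*$/\1,\2,\3/p')

# Prints C123,25.00,GC789
printf '%s\n' "$fields"
```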

### Non-greedy match with [^/]* instead of .*? (sed has no non-greedy quantifier), e.g.:
sed 's|\(http://[^/]*/\).*|\1|g'
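
To compare the two behaviours side by side, here is a sketch on a made-up URL. The `[^/]*` version stops at the first slash after the host, while a plain `.*` runs on to the last slash:

```shell
# Hypothetical URL for illustration.
url='http://example.com/path/to/page.html'

# [^/]* cannot cross a slash, so the match ends right after the host.
base=$(printf '%s\n' "$url" | sed 's|\(http://[^/]*/\).*|\1|g')

# Greedy .* swallows everything up to the last slash.
greedy=$(printf '%s\n' "$url" | sed 's|\(http://.*/\).*|\1|g')

printf '%s\n' "$base"    # http://example.com/
printf '%s\n' "$greedy"  # http://example.com/path/to/
```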