@subhankar94
Created November 24, 2016 22:29
McIlroy's 6-line shell script to print the k most frequently occurring words in a file (read from standard input)
tr -cs A-Za-z '\n' | # replace every non-alphabetic char (-c complements the set) with a newline ('\n'), squeeze runs of newlines to a single one (-s): one word per line
tr A-Z a-z | # transliterate upper case characters to lower case
sort | # sort alphabetically
uniq -c | # collapse identical adjacent lines and prefix each with its number of occurrences (-c)
sort -rn | # sort in reverse (-r), based on numeric value (-n)
sed "${1}q" # pass through stream editor, print the first $1 lines (k = first argument) and then quit (q)
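
A minimal usage sketch, assuming the pipeline is wrapped in a shell function. The name `top_words` and the sample sentence are hypothetical, chosen only for illustration. The text arrives on standard input and k is the first argument:

```shell
#!/bin/sh
# top_words K: hypothetical wrapper around the pipeline above.
# Reads text on stdin, prints the K most frequent words with their counts.
top_words() {
    tr -cs A-Za-z '\n' |   # one word per line
    tr A-Z a-z |           # fold to lower case
    sort |                 # bring identical words together
    uniq -c |              # count each run of identical words
    sort -rn |             # highest count first
    sed "${1}q"            # keep the first K lines
}

# Sample input (made up): "the" occurs 3 times, "and" twice, the rest once.
printf 'The cat and the dog and the bird\n' | top_words 2
```

This should list `the` with a count of 3, then `and` with a count of 2. The amount of leading whitespace before each count depends on the `uniq` implementation. For a real file, redirect it into the function, e.g. `top_words 10 < file.txt`.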