@joergeschmann
Last active December 25, 2015 02:08
Split a text into its words, count how often each word occurs, sort the words by that count, and save the result to a file.
Shell command:
cat text.txt | tr -sc 'A-Za-z' '\n' | sort | uniq -c | sort -nr >> counts.txt
The content of text.txt:
Lorem Ipsum Dolor Sit Amet. Lorem ipsum.
The result in counts.txt:
2 Lorem
1 Sit
1 Ipsum
1 ipsum
1 Dolor
1 Amet
Commands:
cat text.txt
=> displays the content of the text.txt file
tr -sc 'A-Za-z' '\n' < text.txt
-s squeezes repetitions
-c complements the set
'A-Za-z' the character set to find
'\n' the translation
=> replaces every input character that is not a letter (the complement of A-Za-z) with a newline and squeezes each run of newlines into one, so every word ends up on its own line.
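A minimal sketch of the tr step on its own, using the sample sentence from text.txt. The variable name words and the LC_ALL=C prefix are assumptions added here so that the range A-Za-z means plain ASCII letters regardless of the local locale:

```shell
# Split a sample sentence into one word per line.
# -c: operate on the complement of A-Za-z (everything that is not a letter)
# -s: squeeze each run of replacement newlines into a single newline
# LC_ALL=C is an assumption for illustration: it pins the range to ASCII letters.
words=$(printf 'Lorem Ipsum Dolor Sit Amet. Lorem ipsum.\n' | LC_ALL=C tr -sc 'A-Za-z' '\n')

# Print the words, one per line.
printf '%s\n' "$words"
```

The ". " after Amet is two non-letters in a row; without -s it would produce an empty line between Amet and Lorem.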
sort
=> sorts the lines alphabetically
uniq -c
-c count
=> collapses adjacent identical lines into one and puts the number of occurrences before each line (this is why the input has to be sorted first)
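The following sketch illustrates why sort has to run before uniq -c: uniq only merges lines that are next to each other. The variable names and the three-word input are made up for illustration, and awk is used here only to strip the padding that uniq -c puts in front of the count:

```shell
# Without sorting, the two "Lorem" lines are not adjacent,
# so uniq -c reports them as two separate groups of one.
unsorted=$(printf 'Lorem\nIpsum\nLorem\n' | uniq -c | awk '{print $1, $2}')

# After sorting, identical words are adjacent and are counted together.
# LC_ALL=C is an assumption that makes the sort order byte-wise and reproducible.
sorted=$(printf 'Lorem\nIpsum\nLorem\n' | LC_ALL=C sort | uniq -c | awk '{print $1, $2}')

printf 'unsorted:\n%s\nsorted:\n%s\n' "$unsorted" "$sorted"
```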
sort -n -r
-n numeric sort
-r in reverse order
=> sorts the lines by their counts in decreasing order, so the most frequent words come first
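As a small illustration of why -n matters, the sketch below compares a numeric and a plain reverse sort on made-up counts. The input lines and variable names are hypothetical, and LC_ALL=C is again an assumption to keep the plain sort byte-wise:

```shell
# Numeric reverse sort: compares the leading numbers as numbers.
numeric=$(printf '2 a\n10 b\n9 c\n' | LC_ALL=C sort -nr)

# Plain reverse sort: compares character by character, so "10" lands after "2".
lexical=$(printf '2 a\n10 b\n9 c\n' | LC_ALL=C sort -r)

printf 'sort -nr:\n%s\nsort -r:\n%s\n' "$numeric" "$lexical"
```

With counts of one digit only, as in the example result above, both variants would give the same order; the difference shows up once a count reaches 10.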
>> counts.txt
=> appends the output to the file counts.txt (the file is created if it does not exist; use > instead to overwrite it on every run)
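The difference between >> and > can be checked with a short sketch. It writes to a temporary directory created with mktemp so that no real counts.txt is touched; the directory and variable names are assumptions for illustration:

```shell
# Work in a throwaway directory so an existing counts.txt is not modified.
tmp=$(mktemp -d)

# >> appends: after two writes the file holds two lines.
printf 'first\n'  >> "$tmp/counts.txt"
printf 'second\n' >> "$tmp/counts.txt"
appended=$(wc -l < "$tmp/counts.txt" | tr -d ' ')

# > truncates first: the file now holds only the new line.
printf 'third\n' > "$tmp/counts.txt"
overwritten=$(wc -l < "$tmp/counts.txt" | tr -d ' ')

printf 'lines after >>: %s\nlines after >: %s\n' "$appended" "$overwritten"

# Clean up the temporary directory.
rm -rf "$tmp"
```

Running the original command twice therefore leaves two copies of the counts in counts.txt.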