@josuecau
Last active August 17, 2020 09:29
#!/usr/bin/env bash
# List the most frequently used words in a text.
# Read from the file given as the first argument, or from stdin otherwise.
if [ $# -ge 1 ] && [ -f "$1" ]; then
  input="$1"
else
  input="-"
fi
# shellcheck disable=SC2002
cat "$input" |
tr -cs '[:alpha:]' '\n' | # Split words and drop non-alphabetic characters.
tr '[:upper:]' '[:lower:]' | # Convert everything to lowercase.
sed '/../!d' | # Remove lines with fewer than two characters.
sort | uniq -c | sort -k1nr | # Count each distinct word and sort by count, highest first.
awk '$1 > 1' # Keep words with more than one occurrence.
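
As a rough illustration of what the pipeline above produces, here is a minimal sketch that wraps the same stages in a function and runs them on a made-up sentence. The function name `count_words` and the sample text are assumptions for this example only; they are not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the same pipeline stages, reading from stdin.
count_words() {
  tr -cs '[:alpha:]' '\n' |      # One word per line, non-letters dropped.
  tr '[:upper:]' '[:lower:]' |   # Case-insensitive counting.
  sed '/../!d' |                 # Drop empty lines and single letters.
  sort | uniq -c | sort -k1nr |  # Count each word, highest count first.
  awk '$1 > 1'                   # Keep only words seen more than once.
}

# Made-up sample: "the" occurs three times and "cat" twice, so only those
# two words should be listed; "and", "dog", "end" and "a" are filtered out.
printf 'The cat and the dog. The end, a cat!\n' | count_words
```

The width of the count column comes from `uniq -c` and can differ between implementations, so anything parsing the output should split on whitespace rather than rely on fixed columns.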