@peko
Created December 21, 2018 11:40
Word statistics for text file
#!/bin/bash
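# 1. replace junk (anything that is not a Cyrillic letter or a space) with spaces
# 2. split into words, one per line
# 3. convert to lower case
# 4. sort so that identical words are adjacent (uniq requires this)
# 5. count each unique word
# 6. sort by count, descending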
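# Ё/ё lie outside the А-я range, so they are listed explicitly.
# NF skips the empty lines produced by runs of spaces.
sed -E 's#[^А-Яа-яЁё ]|\s+# #g' textfile.txt \
| sed 's# #\n#g' \
| awk 'NF {print tolower($0)}' \
| sort \
| uniq -c \
| sort -k1 -n -r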
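
The same six steps can be wrapped in a reusable function. This is only a sketch, not part of the original script: the function name `word_stats` and its arguments (file, number of rows to show, set of letters to keep) are made up for illustration. It assumes GNU sed and, for Cyrillic input, a UTF-8 locale with an awk whose `tolower` is locale-aware (gawk, for example).

```shell
#!/bin/bash
# word_stats FILE [TOP] [LETTERS]  (hypothetical helper, names are assumptions)
#   FILE    - text file to analyse
#   TOP     - how many of the most frequent words to print (default 10)
#   LETTERS - bracket-expression body of characters that count as letters
#             (default: Cyrillic, with Ё/ё listed explicitly)
word_stats() {
  local file=$1
  local top=${2:-10}
  local letters=${3:-А-Яа-яЁё}

  # 1. replace every run of non-letters with a single space
  # 2. put each word on its own line
  # 3. lower-case the words, skipping empty lines
  # 4-5. sort, then count identical neighbours
  # 6. sort by count, descending, and keep the first TOP rows
  sed -E "s#[^${letters}]+# #g" "$file" \
  | tr ' ' '\n' \
  | awk 'NF {print tolower($0)}' \
  | sort \
  | uniq -c \
  | sort -k1 -n -r \
  | head -n "$top"
}
```

For example, `word_stats textfile.txt 20` would print the twenty most frequent Cyrillic words, and `word_stats textfile.txt 20 'A-Za-z'` would do the same for Latin-script text.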