@sepehr
Last active November 10, 2015 22:45
Shell: Analyze huge files for repeating text portions
# Count how many times each distinct line occurs, most frequent first
sort /path/to/filename | uniq -c | sort -nr > ./_aggregated.tmp
# Page through only the top 1000 entries, since the aggregated output can still be huge
head -n 1000 ./_aggregated.tmp | less
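
The same pipeline can be wrapped in a small function so it is easier to reuse. This is only a sketch: the name `top_repeats`, its arguments, and the sample data are hypothetical and not part of the original snippet. `LC_ALL=C` is an added assumption that byte-wise ordering is acceptable; it usually makes `sort` faster on large inputs.

```shell
# Hypothetical helper: print the N most repeated lines of a file.
# Usage: top_repeats FILE [N]   (N defaults to 10)
top_repeats() {
    file=$1
    n=${2:-10}
    # Group identical lines, count each group, order by count (descending),
    # and keep only the first N entries.
    LC_ALL=C sort -- "$file" | uniq -c | sort -nr | head -n "$n"
}

# Small demonstration on a throwaway sample file
sample=$(mktemp)
printf '%s\n' foo bar foo baz foo bar > "$sample"
top_repeats "$sample" 2
rm -f "$sample"
```

Because the result is capped with `head` before it is displayed, there is no need for an intermediate file unless you want to inspect the full aggregate later.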