@akagr
Last active August 29, 2015 14:03
File commands
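# Find all files smaller than 2 KiB (GNU find rounds sizes up to whole units, so -size -2k only matches files of 1 KiB or less; counting bytes avoids that)
find /path/to/directory -type f -size -2048c
# Find all text files smaller than 2 KiB
find /path/to/directory -type f -name '*.txt' -size -2048c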
# Count all files whose names match a given pattern (the explicit . keeps this portable beyond GNU find)
find . -type f -name '*.jpg' | wc -l
find . -type f -name 'e_*' | wc -l
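# Sketch, assuming GNU find: count matches without being thrown off by newlines in file names, by printing one character per file and counting bytes. count_matches is a made-up helper name, not part of the original list.
count_matches() { find "${1:-.}" -type f -name "$2" -printf 'x' | wc -c; }
# Usage: count_matches /path/to/directory '*.jpg'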
# Find all files smaller than 2 KiB (2048 bytes) and delete them, asking for confirmation on each one
find /path/to/directory -type f -size -2048c -exec rm -i {} \;
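# Sketch, assuming GNU find: preview what a size-based delete would remove by listing size in bytes and path, without deleting anything. The 2048c threshold means "fewer than 2048 bytes"; list_small is a made-up helper name.
list_small() { find "${1:-.}" -type f -size -2048c -printf '%s\t%p\n'; }
# Usage: list_small /path/to/directory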
# Find file names that occur more than once and list them with their counts (the old grep filter also dropped counts such as 10 or 12)
find /path/to/directory -type f -printf '%f\n' | sort | uniq -c | awk '$1 > 1'
# Find file names that occur more than once and copy one file per repeated name into a new directory. Remove the last pipe to list the chosen files instead of copying them. Names containing glob characters are treated as patterns by -name.
find . -type f -printf '%f\n' | sort | uniq -d | while IFS= read -r name; do find . -type f -name "$name" -print -quit; done | xargs -r -d '\n' cp -t /path/to/new/directory
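# Find all files with the same content (candidates are narrowed by size first, then compared by MD5 hash; groups are separated by blank lines)
find . -not -empty -type f -printf '%s\n' | sort -rn | uniq -d | xargs -I{} find . -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate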
# Find all files with the same content and delete every copy except the first in each group (xargs -p asks for confirmation before running rm)
find . -not -empty -type f -printf '%s\n' | sort -rn | uniq -d | xargs -I{} find . -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate | awk '/^$/ { seen = 0; next } !seen { seen = 1; next } { print substr($0, 35) }' | xargs -r -d '\n' -p rm -v
# Find all files whose middle five digits (the second and third underscore-separated fields, in the form xxxx_x) occur in more than one file name
find . -maxdepth 1 -name '*_E_*' | cut -d '_' -f2-3 | sort | uniq -d | xargs -I {} find . -name '*{}*'
# Same search, with the list of matching files written to repeated_list.txt
find . -maxdepth 1 -name '*_E_*' | cut -d '_' -f2-3 | sort | uniq -d | xargs -I {} find . -name '*{}*' > repeated_list.txt