@dmgig
Last active August 31, 2018 16:19
Reduce duplicated images into an md5-named image bank, with symlinks from the original locations

Image Banker

A method for de-duplicating a large number of images spread across a complicated folder structure that must be preserved.

Given a directory containing duplicate jpg images with any folder structure, img-banker.sh will:

  1. Recursively find all jpgs.
  2. Calculate each file's md5 hash.
  3. Copy the file to the img-bank directory, using the md5 hash as the new name (the "banked image").
  4. Replace the original image with a symlink to the banked image.
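
Before banking anything, it can help to see how much duplication there actually is. The following is a minimal dry-run sketch of steps 1 and 2 only and is not part of the original scripts: the function names `list_hashes` and `count_unique` are hypothetical, and it assumes either GNU `md5sum` or BSD/macOS `md5` is on the PATH. It only reads files and changes nothing.

```shell
# Dry run: hash every jpg under a directory without modifying anything.
# list_hashes and count_unique are illustrative names, not from img-banker.sh.

# Print "<md5> <path>" for each jpg under "$1", sorted so that
# files with identical content end up on adjacent lines.
list_hashes() {
  find "$1" -type f -name '*.jpg' -exec sh -c '
    for f in "$@"; do
      # Assumption: GNU md5sum (Linux) or BSD/macOS md5 is installed.
      if command -v md5sum >/dev/null 2>&1; then
        h=$(md5sum < "$f" | cut -d" " -f1)
      else
        h=$(md5 -q "$f")
      fi
      printf "%s %s\n" "$h" "$f"
    done' sh {} + | sort
}

# Print the number of distinct hashes, i.e. how many files the
# image bank would hold after de-duplication.
count_unique() {
  list_hashes "$1" | cut -d' ' -f1 | sort -u | wc -l | tr -d ' '
}

# Example usage (paths are placeholders):
#   list_hashes ./repo
#   count_unique ./repo
```

Comparing `count_unique ./repo` with the total number of lines from `list_hashes ./repo` gives a rough idea of how many copies banking would eliminate.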
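
#!/bin/bash
# Build an HTML page that shows every jpg under ./repo, so the images can be
# checked visually before and after banking.
rm -f img-bank-test.html   # -f: no error if the page does not exist yet
multiple_cmd() {
  echo "<img src='$1' />" >> img-bank-test.html
}
export -f multiple_cmd
# "_" fills $0 in the child shell, so the file path arrives as $1
find ./repo -name '*.jpg' -exec bash -c 'multiple_cmd "$1"' _ {} \;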
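
#!/bin/bash
# Copy each jpg into the image bank under its md5 name, then replace the
# original file with a symlink to the banked copy.
bank_images() {
  IMGBANKDIR=~/IMAGE-BANK-TEST/img-bank/
  ORIGNAME=$1
  # md5 -q (BSD/macOS) prints only the hash, so paths with spaces are safe
  MD5NAME=$(md5 -q "$1").jpg
  # Remove the original only if the copy into the bank succeeded
  if cp -v "$1" "$IMGBANKDIR$MD5NAME"; then
    rm "$1"
    ln -s "$IMGBANKDIR$MD5NAME" "$ORIGNAME"
  else
    echo "FAIL: $ORIGNAME" >&2
  fi
}
export -f bank_images
# "_" fills $0 in the child shell, so the file path arrives as $1
find ~/IMAGE-BANK-TEST/repo -type f -name '*.jpg' -exec bash -c 'bank_images "$1"' _ {} \;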
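
After a run, it may be worth confirming that every original path is now a working symlink. This is a rough verification sketch rather than part of the original gist: `broken_links` and `unbanked_count` are hypothetical helper names, and the count assumes file names contain no newline characters.

```shell
# Post-run checks for a banked directory tree.
# broken_links and unbanked_count are illustrative names, not from the gist.

# Print every jpg symlink under "$1" whose target does not exist.
broken_links() {
  find "$1" -type l -name '*.jpg' -exec sh -c '
    for l in "$@"; do
      # -e follows the link, so it is false when the target is missing
      [ -e "$l" ] || printf "%s\n" "$l"
    done' sh {} +
}

# Print how many regular jpg files remain under "$1", i.e. files that
# were skipped or failed to bank. Assumes no newlines in file names.
unbanked_count() {
  find "$1" -type f -name '*.jpg' | wc -l | tr -d ' '
}

# Example usage (path taken from the script above):
#   broken_links ~/IMAGE-BANK-TEST/repo
#   unbanked_count ~/IMAGE-BANK-TEST/repo
```

If banking went cleanly, `broken_links` should print nothing and `unbanked_count` should print 0.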