@nemobis
Created October 17, 2012 18:08
Silly script to find interlaced images on Commons (bug 17645)
#!/bin/bash
# commons-interlace-exiftool.sh: silly script to find interlaced images on Commons
cat jpgcommons.txt | # Take list of filenames, one per line
while IFS= read -r line # As long as there is another line to read ...
do
URL=$(curl -s -G "https://commons.wikimedia.org/w/api.php" -d action=query -d prop=imageinfo -d iiprop=url -d format=xml --data-urlencode "titles=File:$line" | grep -oE 'https?://upload\.wikimedia\.org[^"]+')
echo "URL is $URL"
IDEN=$(curl -s "$URL" | exiftool -fast2 - | grep -i "Encoding Process")
# "Baseline DCT" is the only JPEG SOF encoding known to be safe; many less common ones are uncertain
# http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/JPEG.html#SOF
# http://u88.n24.queensu.ca/exiftool/forum/index.php/topic,3911.msg18139.html#msg18139
if grep -qi "Baseline DCT" <<< "$IDEN"; then
echo ">>>>>>> $line is not interlaced <<<<<<<" # Only if grep found a match (exit status 0)
else
echo ">>>>>>> $line is probably interlaced/progressive <<<<<<<"
echo "$line" >> interlaced-exiftool.txt
fi
done
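
One gap in the check above: if the download or exiftool fails, IDEN is empty and the file is still logged as probably interlaced. A minimal sketch of a stricter classification follows, assuming a hypothetical helper named classify_encoding (it is not part of the original gist) that takes the "Encoding Process" line as its only argument and separates the empty case from the non-baseline case.

```shell
# Hypothetical helper, not part of the original script.
# $1: the "Encoding Process" line printed by exiftool (may be empty).
# Prints one of: unknown, baseline, progressive-or-other.
classify_encoding() {
  # Lowercase the input so the match is case-insensitive, like grep -i.
  iden=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$iden" in
    "")               echo "unknown" ;;               # nothing to inspect: download or exiftool failed
    *"baseline dct"*) echo "baseline" ;;              # the only encoding treated as safe
    *)                echo "progressive-or-other" ;;  # anything else is treated as suspect
  esac
}
```

Inside the loop this could be called as classify_encoding "$IDEN", appending to interlaced-exiftool.txt only when the result is progressive-or-other, so that failed downloads do not end up in the list.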