@erincerys
Last active February 19, 2016 04:47
Simple bash script that sums the run time of all titles on an imdb.com user list, using the OMDb API and a few open source tools
#!/bin/bash
# Dependencies (these must already be installed on your system)
# - jq, JSON processor written in C
# - bc, arbitrary precision number processor
# - curl, for transferring data given a URL
if [ -z "$1" ] ; then
  echo 'Pass the imdb.com list ID as an argument, e.g. ls123456789 (found in the list URL)'
  exit 1
fi
SourceOut=./imdb-source
IdsOut=./imdb-ids
ImdbListId=$1
StartOfList=1
Furl="http://www.imdb.com/list/${ImdbListId}/?start=${StartOfList}&view=compact&sort=listorian:asc&defaults=1"
if [ -e "$IdsOut" ] ; then
  echo "Capture file $IdsOut already exists; remove it before running again"
  exit 1
fi
echo 'Getting index to figure out number of pages on the IMDB list'
curl -L -s "$Furl" > $SourceOut
echo 'Collecting IDs of titles on IMDB from all list pages'
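# Match the full page count ([0-9]+), not just its first digit, and print only the number
Pages=$(grep -Eo 'Page 1 of [0-9]+' "$SourceOut" | tail -n 1 | sed -r 's/Page 1 of ([0-9]+)/\1/')
# If no "Page 1 of N" marker is found, fall back to a single page
Pages=${Pages:-1}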
CurPage=1
while [ "$CurPage" -le "$Pages" ] ; do
  if [ "$CurPage" -eq 1 ] ; then
    # Page 1 was already downloaded to $SourceOut
    grep -Eo 'href="\/title\/tt[0-9]{7}\/"' "$SourceOut" | sed -r 's/.*(tt[0-9]{7})\/"/\1/' | sort -u >> "$IdsOut"
  else
    # Advance to the next page; the script steps through the list 250 titles at a time
    StartOfList=$((StartOfList + 250))
    Furl="http://www.imdb.com/list/${ImdbListId}/?start=${StartOfList}&view=compact&sort=listorian:asc&defaults=1"
    curl -L -s "$Furl" | grep -Eo 'href="\/title\/tt[0-9]{7}\/"' | sed -r 's/.*(tt[0-9]{7})\/"/\1/' | sort -u >> "$IdsOut"
  fi
  CurPage=$((CurPage + 1))
done
echo 'Querying the OMDb API for title metadata (this will take a long time for long lists)'
TotalRunTime=0
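for ImdbId in $(cat "$IdsOut") ; do
  JsonResponse=$(curl -s "http://www.omdbapi.com/?i=${ImdbId}&plot=short&r=json")
  # Runtime is a string such as "142 min"; keep only the leading number
  RunTime=$(echo "$JsonResponse" | jq -r '.Runtime' | cut -d" " -f1)
  # Skip titles whose runtime is missing or not a plain number, which would break the arithmetic
  case "$RunTime" in
    ''|*[!0-9]*) continue ;;
  esac
  TotalRunTime=$((TotalRunTime + RunTime))
done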
TotalDays=$(echo "scale=2; ${TotalRunTime}/60/24" | bc)
echo -e "\nYou have spent $TotalDays days watching movies!"
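
The runtime-summing step of the script can be pulled out into a pair of small helpers, which makes it testable without any network access. This is only a minimal sketch and not part of the original script: `sum_runtimes` and `minutes_to_days` are hypothetical names, and the input is assumed to be one `Runtime` string per line in the `142 min` form the script parses, with anything non-numeric (assumed here to look like `N/A`) skipped. The day conversion uses integer arithmetic in place of `bc`, so it truncates to two decimals and always prints a leading zero.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): read Runtime strings such
# as "142 min" from stdin, one per line, and print the total number of minutes.
sum_runtimes() {
  local total=0 line minutes
  while read -r line ; do
    minutes=${line%% *}          # keep the text before the first space
    case "$minutes" in
      ''|*[!0-9]*) continue ;;   # empty or not a plain integer (e.g. "N/A"): skip
    esac
    total=$((total + minutes))
  done
  echo "$total"
}

# Hypothetical helper: convert a minute count to days with two decimal places.
# Works in hundredths of a day (1 day = 1440 minutes) and truncates the result.
minutes_to_days() {
  local hundredths=$(( $1 * 100 / 1440 ))
  printf '%d.%02d\n' $((hundredths / 100)) $((hundredths % 100))
}
```

For example, `printf '142 min\nN/A\n90 min\n' | sum_runtimes` prints `232`, and `minutes_to_days 4320` prints `3.00`.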