@josephby
Last active October 20, 2016 16:39
bash one-liner to dump all URLs from an email newsletter
#!/bin/bash
# urls-from-email
#
# bash script to dump all URLs from an email
#
# To use this, save the "Raw Message Source" of an email to a file (e.g. file.txt) and then run
#
# ./urls-from-email.sh file.txt
#
# It follows each link's redirects and prints the final URL, one per line, excluding links to twitter.com, getrevue.co, facebook.com, list-manage.com, list-manage2.com and campaign-archive1/2.com
#
# You can have Google Chrome open all of these links in one shot to do with as you will, e.g.
#
# ./urls-from-email.sh file.txt > links.txt
# while IFS= read -r line; do /usr/bin/open -a "/Applications/Google Chrome.app" "$line"; done < links.txt
#
perl -MMIME::QuotedPrint -pe '$_=MIME::QuotedPrint::decode($_);' "$1" | grep -Eo 'href="[^\"]+"' | grep -Eo 'https?://[^"]+' | while IFS= read -r line; do curl -Ls -o /dev/null -w '%{url_effective}\n' "$line"; done | sed -E '/twitter\.com|getrevue\.co|facebook\.com|list-manage\.com|list-manage2\.com|campaign-archive(1|2)\.com/d'
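
The extraction and filtering stages of the pipeline can be tried on their own, without network access. The following is a minimal sketch rather than part of the original script: the function names `extract_links` and `drop_unwanted` and the sample HTML are made up for illustration, and the quoted-printable decoding and `curl` redirect-following stages are left out.

```shell
#!/bin/bash
# Hypothetical helpers (not in the original gist) mirroring two pipeline stages.

# Print every http(s) URL found inside an href="..." attribute, one per line.
extract_links() {
  grep -Eo 'href="[^"]+"' | grep -Eo 'https?://[^"]+'
}

# Delete lines that mention an excluded domain. The dots are escaped so they
# match a literal "." instead of any character.
drop_unwanted() {
  sed -E '/twitter\.com|getrevue\.co|facebook\.com|list-manage2?\.com|campaign-archive(1|2)\.com/d'
}

# Made-up sample standing in for an already-decoded message body.
sample='<a href="https://example.com/article?id=1">Read</a> <a href="https://twitter.com/share">Tweet</a>
<a href="mailto:someone@example.com">Mail</a> <a href="http://example.org/">Home</a>'

printf '%s\n' "$sample" | extract_links | drop_unwanted
```

With this sample, the `mailto:` link is skipped because it is not an http(s) URL, and the twitter.com link is dropped by the filter, leaving the example.com and example.org URLs.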