@rjzak
Created March 6, 2017 20:59
Pull and reformat lists for blocking domains associated with advertising (since ad networks don't police themselves and allow malvertising) and malware/scams.
#!/bin/bash
# Modified Pi-hole script to generate a dnsmasq file
# Intended for EdgeOS/EdgeMax from Ubiquiti Networks https://www.ubnt.com/
# original : https://github.com/jacobsalmela/pi-hole/blob/master/gravity-adv.sh
# original : https://gist.github.com/OnlyInAmerica/75e200886e02e7562fa1
# inspiration: https://help.ubnt.com/hc/en-us/articles/205223340-EdgeRouter-Ad-blocking-content-filtering-using-EdgeRouter
# Be sure to put this file in /config/user-data/
# Symlink for cron updates:
# ln -s /config/user-data/edgeos_adblock.sh /etc/cron.weekly/edgeos_adblock
# Re-run the symlink command after EdgeOS updates.
# The Pi-hole now blocks over 120,000 ad domains
# Address that blocked domains resolve to: the Pi-hole's IP (the RPi), or 0.0.0.0 if no Pi-hole is used (Pi-hole is not required, but is a great utility: https://pi-hole.net/)
piholeIP="0.0.0.0"
outlist='/etc/dnsmasq.d/final_blocklist.conf'
tempoutlist="$outlist.tmp"
echo "Getting yoyo ad list..." # Approximately 2452 domains at the time of writing
curl -s -d mimetype=plaintext -d hostformat=unixhosts 'http://pgl.yoyo.org/adservers/serverlist.php?' | sort > $tempoutlist
echo "Getting winhelp2002 ad list..." # 12985 domains
curl -s http://winhelp2002.mvps.org/hosts.txt | grep -v "#" | grep -v "127.0.0.1" | sed '/^$/d' | sed 's/\ /\\ /g' | awk '{print $2}' | sort >> $tempoutlist
echo "Getting adaway ad list..." # 445 domains
curl -s https://adaway.org/hosts.txt | grep -v "#" | grep -v "::1" | sed '/^$/d' | sed 's/\ /\\ /g' | awk '{print $2}' | grep -v '^\\' | grep -v '\\$' | sort >> $tempoutlist
echo "Getting hosts-file ad list..." # 28050 domains
curl -s http://hosts-file.net/.%5Cad_servers.txt | grep -v "#" | grep -v "::1" | sed '/^$/d' | sed 's/\ /\\ /g' | awk '{print $2}' | grep -v '^\\' | grep -v '\\$' | sort >> $tempoutlist
echo "Getting malwaredomainlist malware list..." # 1352 domains
curl -s http://www.malwaredomainlist.com/hostslist/hosts.txt | grep -v "#" | sed '/^$/d' | sed 's/\ /\\ /g' | awk '{print $3}' | grep -v '^\\' | grep -v '\\$' | sort >> $tempoutlist
echo "Getting adblock.gjtech ad list..." # 696 domains
curl -s 'http://adblock.gjtech.net/?format=unix-hosts' | grep -v "#" | sed '/^$/d' | sed 's/\ /\\ /g' | awk '{print $2}' | grep -v '^\\' | grep -v '\\$' | sort >> $tempoutlist
echo "Getting someone who cares ad list..." # 10600 domains
curl -s http://someonewhocares.org/hosts/hosts | grep -v "#" | sed '/^$/d' | sed 's/\ /\\ /g' | grep -v '^\\' | grep -v '\\$' | awk '{print $2}' | grep -v '^\\' | grep -v '\\$' | sort >> $tempoutlist
echo "Getting Mother of All Ad Blocks list..." # 102168 domains!! Thanks Kacy
curl -A 'Mozilla/5.0 (X11; Linux x86_64; rv:30.0) Gecko/20100101 Firefox/30.0' -e http://forum.xda-developers.com/ http://adblock.mahakala.is/ | grep -v "#" | awk '{print $2}' | sort >> $tempoutlist
# Sort the aggregated results, remove duplicates and blank lines,
# and write each domain as a dnsmasq "address=/domain/IP" line.
# Note: no whitelist filtering is applied in this version of the script.
echo "Removing duplicates and formatting the list of domains..."
sed $'s/\r$//' "$tempoutlist" | sort -u | sed '/^$/d' | awk -v "IP=$piholeIP" '{print "address=/"$0"/"IP}' > "$outlist"
rm -f "$tempoutlist"
# Count how many domains were written so the total can be displayed to the user
numberOfAdsBlocked=$(wc -l < "$outlist" | sed 's/^[ \t]*//')
echo "$numberOfAdsBlocked ad domains blocked."
/etc/init.d/dnsmasq force-reload
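
The script's comments mention removing whitelisted entries, but no such step appears in the code. A minimal sketch of how that step might look follows. The function name `filter_whitelist` and the whitelist file format (one domain per line) are assumptions for illustration, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): drop whitelisted
# domains from a list of domains.
#   $1 = file with one domain per line
#   $2 = whitelist file, one domain per line (assumed format)
# Prints the filtered list to stdout. If the whitelist is missing or
# empty, the list is passed through unchanged.
filter_whitelist() {
    local domains="$1" whitelist="$2"
    if [ -s "$whitelist" ]; then
        # -v: keep lines that do NOT match
        # -x: match whole lines only, so "example.com" does not
        #     also remove "ads.example.com"
        # -F: treat patterns as fixed strings, so dots are literal
        # -f: read the patterns from the whitelist file
        grep -vxFf "$whitelist" "$domains"
    else
        cat "$domains"
    fi
}
```

In this script it could be called just before the formatting step, for example `filter_whitelist "$tempoutlist" "$HOME/whitelist.txt" > "$tempoutlist.filtered"`, with the formatting pipeline then reading the filtered file. The whitelist path is an assumption, and under cron `$HOME` is normally root's home directory. Whitelist entries with Windows line endings would not match and would need their trailing carriage returns stripped first.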