Make one large blocklist from the bluetack lists on iblocklist.com

getBlockLists.sh:
#!/usr/bin/env sh
 
# Download lists, unpack and filter, write to stdout
curl -s https://www.iblocklist.com/lists.php \
| sed -n "s/.*value='\(http:.*=bt_.*\)'.*/\1/p" \
| xargs wget -O - \
| gunzip \
| egrep -v '^#'

This can also easily be added to cron:
0 3 * * 0 curl -s http://www.iblocklist.com/lists.php | sed -n "s/.*value='\(http:.*=bt_.*\)'.*/\1/p" | xargs wget -O - | gunzip | egrep -v '^#' > ~/Library/Application\ Support/Transmission/blocklists/generated.txt
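If you save the pipeline as a script, the crontab entry reduces to a single path. A minimal sketch, assuming the script is installed at /usr/local/bin/getBlockLists.sh (a made-up location):

```shell
# crontab entry (add via `crontab -e`); the five fields are
# minute hour day-of-month month day-of-week,
# so "0 3 * * 0" = every Sunday at 03:00
0 3 * * 0 /usr/local/bin/getBlockLists.sh > "$HOME/Library/Application Support/Transmission/blocklists/generated.txt"
```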

Hi! I'm getting these errors trying to execute from the Terminal:
"xargs: wget: No such file or directory
gzip: stdin: unexpected end of file"

Am I doing anything wrong? Thanks!

Replace wget -O - with curl in the script.

Hi, and thanks for the script. I ran it from the terminal (I replaced wget -O - with curl) and all I get is:
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
And then error:
gzip: stdin: unexpected end of file
Thanks!

Replace wget -O - with curl -Ls

Hi @jamesstout,
Would you please post the entire terminal command with your correction?
When I run:

sudo 0 3 * * 0 curl -s http://www.iblocklist.com/lists.php | sed -n "s/.*value='\(http:.*=bt_.*\)'.*/\1/p" | xargs curl -Ls | gunzip | egrep -v    '^#' > ~/Library/Application\ Support/Transmission/blocklists/generated.txt

I get:

-bash: 0: command not found

The "0 3 * * 0 ..." is a crontab entry. Just type "curl -s ...."

@ArtemGordinsky you're getting -bash: 0: command not found because 0 isn't a command. It's a crontab entry, as @prehensilecode mentioned.

Updated for Mac:

curl -s https://www.iblocklist.com/lists.php | sed -n "s/.*value='\(http:.*=bt_.*\)'.*/\1/p" | sed "s/&amp;/\&/g" | sed "s/http/\"http/g" | sed "s/gz/gz\"/g" | xargs curl -L | gunzip | egrep -v '^#' > ~/Library/Application\ Support/Transmission/blocklists/generated.txt.bin

The blocklist URL has changed to HTTPS. The updated URL on line 1 is https://www.iblocklist.com/lists.php (as above, but it wasn't updated in the downloadable script). Thanks for this great script!

iblocklist.com is a bigoted hate website.

Owner

@kilolima updated.

Hi folks. I've analysed the code for @fortran01's Mac version (thanks very much, @fortran01)
and posted a plain-English translation of what each bit does below.
I hope it's accurate, but if I've made any errors please correct me! :-)
Hope this helps people tweak the code to do other similar stuff.

For more clarity about what's going on, I've put each piped segment of the command on separate lines,
but remember it's all one big line.

 curl -s https://www.iblocklist.com/lists.php
 | sed -n "s/.*value='\(http:.*=bt_.*\)'.*/\1/p"
 | sed "s/&amp;/\&/g"
 | sed "s/http/\"http/g"
 | sed "s/gz/gz\"/g"
 | xargs curl -L
 | gunzip
 | egrep -v '^#' > ~/Library/Application\ Support/Transmission/blocklists/generated.txt.bin

Plain-English explanation of what each bit of the command does:

grab the webpage "https://www.iblocklist.com/lists.php" in silent mode (no progress bar or error messages)

search each line of this webpage for text of the form
(anything)value='http:(anything)=bt_(anything)'
and print only the captured URL, chopping off the surrounding ( (anything)value=' ) text

in the resultant lines, change all occurrences of &amp; to &

in the resultant lines, change all occurrences of the string http to "http

in the resultant lines, change all occurrences of the string gz to gz"

feed the resultant lines to the curl command (-L means curl will follow a redirect if the server says the resource has moved)

feed each file downloaded by curl to the gunzip program (uncompress it)

write only the lines from each file that don't start with a # (i.e. that are not comments) into the file
"~/Library/Application Support/Transmission/blocklists/generated.txt.bin"

All this ultimately results in the following lines being fed by xargs to the curl command:

"http://list.iblocklist.com/?list=bt_level1&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_level2&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_level3&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_edu&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_rangetest&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_bogon&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_ads&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_spyware&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_proxy&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_templist&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_microsoft&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_spider&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_hijacked&fileformat=p2p&archiveformat=gz"
"http://list.iblocklist.com/?list=bt_dshield&fileformat=p2p&archiveformat=gz"

This should help clarify exactly what pattern sed is searching for and what the sed filtering actually does,
e.g.
"&amp;" becomes "&"
quotation marks are placed before each http and after each gz
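To see the sed steps in action, here's the chain run over a single made-up line resembling the page source (the input HTML is illustrative, not the site's actual markup; the &amp;-to-& step is written as s/&amp;/\&/g since the page source escapes & as &amp;):

```shell
# One fabricated line of lists.php source, with the URL HTML-escaped (&amp;):
line="<input type='text' value='http://list.iblocklist.com/?list=bt_level1&amp;fileformat=p2p&amp;archiveformat=gz'>"

# 1) keep only the captured URL   2) &amp; -> &
# 3) opening quote before http    4) closing quote after gz
echo "$line" \
| sed -n "s/.*value='\(http:.*=bt_.*\)'.*/\1/p" \
| sed "s/&amp;/\&/g" \
| sed "s/http/\"http/g" \
| sed "s/gz/gz\"/g"
```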

You could of course feed these lines manually to curl if you just want to grab individual zipped blocklists.

If you do, don't forget to filter out the comment lines with egrep and write the result to a .bin file for Transmission :-)

aww.. they changed it again :/

I've just written a simple Haskell program to sort and merge IP ranges (overlapping and adjacent IP ranges are fused; empty or commented lines are removed): https://gist.github.com/Piezoid/ee43be6e5eebd6aa9bac

It uses stdin and stdout with the gz format:
./fuseblkl < inList.p2p.gz > outList.p2p.gz

I wrote a script that downloads 50 different blocklists and performs duplication checks and other repeat-offender queries to compact the IP reputation data from a threat source of over 700,000 IPs down to approximately 200,000-300,000 IPs. It's primarily for pfSense firewall blocking but can be adapted for other needs. Any comments appreciated via email.

https://gist.github.com/BBcan17/67e8c456cb399fbe02ee

Here's another variant: it downloads only lists with a rating of >= 4. You could remove grep -A1 'star_[45]' to get all lists.

curl -sL 'https://www.iblocklist.com/lists.php' | egrep -A1 'star_[45]' | egrep -o '[a-z]{20}' | sort -u | while read -r blocklist; do curl -sL "http://list.iblocklist.com/?list=${blocklist}&fileformat=p2p&archiveformat=gz" | gunzip -q > ~/Library/Application\ Support/Transmission/blocklists/$blocklist; done
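For reference, the 20-character strings that egrep -o '[a-z]{20}' pulls out are iblocklist's per-list IDs. A quick sketch on a made-up anchor tag (the ID below is fabricated):

```shell
# The list ID is the only run of 20 consecutive lowercase letters
# in this sample line, so grep -o extracts exactly that token:
echo "<a href='/list.php?list=abcdefghijklmnopqrst'>Sample list</a>" \
| egrep -o '[a-z]{20}' | sort -u
```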

https://trac.transmissionbt.com/wiki/Blocklists:

When transmission starts, it scans this directory for files not ending in ".bin" and tries to parse them.

ip2k commented

[ Edited 08/17 to no longer use a temp file]

I re-wrote this to work with their current site and made the resulting daily-updated list available at http://ip2k.com/list.gz

(note that this requires hxwls which should be in the 'html-xml-utils' package or similar in your distro)


#!/bin/bash

for url in $(curl -s https://www.iblocklist.com/lists.php \
| hxwls \
| grep -v png \
| grep 'list=' \
| sed 's|/list.php?list=||g' \
| sed 's|^|http://list.iblocklist.com/?list=|g' \
| sed 's|$|\&fileformat=p2p\&archiveformat=gz|g'); do
    wget --no-verbose "${url}" -O - | gunzip | egrep -v '^#' >> list
done

gzip list
echo "DONE"
ls -lah list.gz
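The three sed substitutions in the loop turn each relative link from hxwls into a full download URL; here's the transformation on one made-up link (the list ID is fabricated):

```shell
# /list.php?list=<id>  ->  http://list.iblocklist.com/?list=<id>&fileformat=p2p&archiveformat=gz
# (\& in the last replacement is a literal &; a bare & would mean "the match")
echo "/list.php?list=abcdefghijklmnopqrst" \
| sed 's|/list.php?list=||g' \
| sed 's|^|http://list.iblocklist.com/?list=|g' \
| sed 's|$|\&fileformat=p2p\&archiveformat=gz|g'
```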

This one is for uTorrent on MacOSX, if anyone needs:

#!/bin/bash
# Note: ~ does not expand inside quotes, so use $HOME instead.
DIR="$HOME/Library/Application Support/uTorrent"
LIST_COUNT=$(wget -q -O - http://www.iblocklist.com/lists.php | sed -n "s/.*value='\(http:.*list=.*\)'.*/\1/p" | wc -l)
CURRENT_LIST_ID=0
# Keep a timestamped backup of the previous filter (paths were missing $DIR before).
mv "${DIR}/ipfilter.dat" "${DIR}/ipfilter.$(date +%Y%m%d_%s).dat" 2>/dev/null
>"${DIR}/ipfilter.dat"
STARS_PRINTED=0
echo -n -e "..."
wget -q -O - http://www.iblocklist.com/lists.php | sed -n "s/.*value='\(http:.*list=.*\)'.*/\1/p" | while read LIST_URL; do
    CURRENT_LIST_ID=$((CURRENT_LIST_ID + 1))
    STARS=$((20 * CURRENT_LIST_ID / LIST_COUNT))
    STARS_LEFT=$((STARS - STARS_PRINTED))
    STARS_PRINTED=$STARS
    for ((; STARS_LEFT > 0; STARS_LEFT--)); do echo -n -e "*"; done
    URL=$LIST_URL sh -c 'wget -q -O - "$URL" | gunzip -c | egrep -v "^#" | cut -d":" -f2 | egrep -v "^$"' >> "${DIR}/ipfilter.dat" 2>/dev/null
done
echo
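The cut -d":" -f2 step in that script strips the label from each p2p-format line, since uTorrent's ipfilter.dat wants bare ranges. A quick sketch (the label is made up):

```shell
# p2p blocklist lines look like "label:first_ip-last_ip";
# field 2 after splitting on ":" is the bare range uTorrent expects
echo "Some Org:1.2.3.0-1.2.3.255" | cut -d":" -f2
```

Note this takes only the second field, so a label that itself contains ":" would be mis-split.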

@ip2k nice script, you just have one typo:

for url in $(cat bl); do

should be

for url in $(cat bls); do

ip2k commented

@enricobacis thanks for pointing that out :) I fixed the script to not use a temp file at all now, and a daily-updated list is at http://ip2k.com/list.gz

@ip2k Just found out about this list. Once I add the list you generated to Transmission, it will get automatically updated as frequently as yours is, meaning I won't have to do anything except add the link to Transmission. Is that right?

ip2k, is your list still maintained?
