@refraction-ray
Last active January 12, 2018 15:40
CLI tool to check whether there are two separate blacklists or just one

Get Started

  • Put both scripts from this gist, same.sh and checkdomains.py, in the same folder, and make same.sh executable (chmod +x same.sh), since checkdomains.py invokes it as ./same.sh.

  • Create a GFWlist.txt in that folder as well, with one domain to check per line.

  • Alternatively, use the GFWlist maintained on GitHub by running the command below:

    $ curl -o encoded.txt https://raw.githubusercontent.com/gfwlist/gfwlist/master/gfwlist.txt && base64 -d encoded.txt > GFWlist.txt
  • Finally, run python3 checkdomains.py and wait for the results. With the default list from the step above, a full scan may take close to an hour.

  • If you are interested in a particular domain, run ./same.sh [domain.name] to see whether that domain is on each blacklist.
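
The decoding half of the curl/base64 step above can be sketched in Python for readers who lack a base64 command. This is only an illustration: decode_gfwlist and the sample rules are hypothetical, and a tiny in-memory stand-in replaces the downloaded encoded.txt.

```python
import base64


def decode_gfwlist(encoded: bytes) -> str:
    """Decode a base64 payload (as stored in encoded.txt) into plain-text rules."""
    # b64decode skips the newlines used to wrap the base64 text.
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical stand-in for the downloaded file, not the real list.
sample_rules = "[AutoProxy 0.2.9]\n||example.com\n@@||example.org\n"
encoded = base64.encodebytes(sample_rules.encode("utf-8"))

decoded = decode_gfwlist(encoded)
print(decoded)
```

In practice you would read encoded.txt in binary mode, pass its contents to the function, and write the result to GFWlist.txt.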

Results Analysis

$ ./same.sh youtube.com
in the rst list
in the poison list
results are consistent
$ ./same.sh baidu.com
not in the rst list
might not in the poison list
results are consistent
$ python3 checkdomains.py
# some false-positive domains may be listed here
3250 domains have been checked and 679 have rst issue 661 have poison issue
# the difference between the two counts comes from false-positive domains
  • The aim of this work is to check whether two blacklists differ: one that triggers a TCP RST for DNS over TCP, and one that triggers DNS poisoning for queries sent to overseas DNS servers.
  • By default the Python script prints the domains that appear in one list but not the other, that is, the discrepancy between the two lists, followed by a one-line summary of the domains it scanned.
  • All the domains printed to stdout are false positives caused by fluctuations of the Internet. To confirm this, run same.sh on each of them, or save the output to a separate file (for example python3 checkdomains.py > recheck.txt, where the file name is arbitrary), replace GFWlist.txt with that file, and scan a second time. Do not redirect straight into GFWlist.txt: the shell truncates it before the script reads it.
  • These false positives may result from the query-time threshold I chose for the DNS query: an answer that arrives in under 10 ms is treated as a wrong (poisoned) answer. You may want to adjust this threshold for your own network environment to minimize false positives.
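
The query-time rule described above can be sketched in Python as follows. This is an illustration rather than part of the gist: parse_query_time, looks_poisoned and THRESHOLD_MS are hypothetical names, and the sample string only imitates the ";; Query time: N msec" line that dig prints.

```python
import re

THRESHOLD_MS = 10  # assumed cutoff, taken from the 10 ms figure in the text


def parse_query_time(dig_output: str):
    """Return the query time in ms from dig's output, or None if absent."""
    match = re.search(r"Query time:\s*(\d+)\s*msec", dig_output)
    return int(match.group(1)) if match else None


def looks_poisoned(query_time_ms, threshold_ms=THRESHOLD_MS) -> bool:
    """Treat a suspiciously fast answer as a sign of DNS poisoning."""
    return query_time_ms is not None and query_time_ms < threshold_ms


sample = ";; Query time: 4 msec\n;; SERVER: 8.8.8.8#53(8.8.8.8)\n"
print(looks_poisoned(parse_query_time(sample)))
```

Keeping the threshold as a parameter makes it easy to try a larger value, such as looks_poisoned(t, threshold_ms=30), when a nearby resolver answers legitimately within a few milliseconds.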
# checkdomains.py
from re import sub, search
from subprocess import check_output

with open("GFWlist.txt") as ls:
    total = 0
    rst = 0
    poison = 0
    for line in ls:
        # Drop comments, whitelist and meta rules, URL schemes and whitespace.
        domain = sub(r".*--.*|^@@.*|.*##.*|^\[.*|!.*|http://|https://|[\s]*|^\^https.*", "", line)
        # Strip a leading "." or "|"/"||". "\1\2" keeps whichever group matched;
        # with "\1" alone, "||domain" rules become an empty string on Python 3.5+.
        domain = sub(r"^\.(.*)$|^[|]{1,2}(.*)$", r"\1\2", domain)
        # Only query strings that look like a bare domain name.
        if search(r"^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*\.[a-z]+$", domain):
            stdout = check_output(["./same.sh", domain]).decode("utf-8").split("\n")
            total = total + 1
            if stdout[0].startswith("in"):
                rst = rst + 1
            if stdout[1].startswith("in"):
                poison = poison + 1
            if stdout[2].startswith("warning"):
                print(domain)
            # if total % 100 == 0:
            #     print('%s domains have been checked and %s have rst issue %s have poison issue' % (total, rst, poison))
    print('%s domains have been checked and %s have rst issue %s have poison issue' % (total, rst, poison))
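
To see what the two substitutions in checkdomains.py do to individual rule lines without running any DNS query, the filtering can be pulled into a small function. This is a sketch: extract_domain is a hypothetical helper, the sample rules are made up, and the replacement string is written as \1\2 so that the "||domain" branch keeps its captured text (on Python 3.5+ an unmatched group expands to an empty string, so \1 alone would discard it).

```python
from re import sub, search

# Same patterns as in checkdomains.py.
STRIP = r".*--.*|^@@.*|.*##.*|^\[.*|!.*|http://|https://|[\s]*|^\^https.*"
PREFIX = r"^\.(.*)$|^[|]{1,2}(.*)$"
DOMAIN = r"^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*\.[a-z]+$"


def extract_domain(line: str):
    """Return the bare domain encoded by one rule line, or None."""
    candidate = sub(STRIP, "", line)
    candidate = sub(PREFIX, r"\1\2", candidate)
    return candidate if search(DOMAIN, candidate) else None


for rule in ["||youtube.com\n", ".google.com\n", "!a comment\n",
             "@@||example.org\n", "|http://example.com/path\n"]:
    print(repr(rule), "->", extract_domain(rule))
```

Rules that still contain a path after stripping, such as the last sample, are rejected by the final domain check, so only plain host names reach same.sh.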
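#!/bin/bash
# same.sh: report whether a domain triggers a TCP RST and/or DNS poisoning.

# DNS over TCP: a reset connection means the domain is on the RST list.
a=$(dig +tcp @8.8.8.8 "$1" | grep reset)
if [ -n "$a" ]
then
    rst=1
    echo "in the rst list"
else
    rst=0
    echo "not in the rst list"
fi

# DNS over UDP: extract the query time in milliseconds.
b=$(dig @8.8.8.8 "$1" | grep "Query time" | sed -e 's/^.*:\s\([0-9]*\).*$/\1/')
# An answer faster than 10 ms (the threshold stated in the README) is treated as poisoned.
if [ -n "$b" ] && [ "$b" -lt 10 ]
then
    poison=1
    echo "in the poison list"
else
    poison=0
    echo "might not in the poison list"
fi

if [ "$rst" = "$poison" ]
then
    echo "results are consistent"
else
    echo "warning: inconsistent results in this domain"
fi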
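
The three lines that same.sh prints can also be turned into flags for further processing, using the same prefix checks that checkdomains.py applies. The sketch below is an assumption-laden illustration: parse_same_output is a hypothetical helper, and the sample strings are copied from the example runs in the README rather than produced by live queries.

```python
def parse_same_output(text: str) -> dict:
    """Map the three output lines of same.sh to boolean flags."""
    lines = text.split("\n")
    return {
        "rst": lines[0].startswith("in"),             # "in the rst list"
        "poison": lines[1].startswith("in"),          # "in the poison list"
        "consistent": not lines[2].startswith("warning"),
    }


blocked = "in the rst list\nin the poison list\nresults are consistent\n"
clean = "not in the rst list\nmight not in the poison list\nresults are consistent\n"

print(parse_same_output(blocked))
print(parse_same_output(clean))
```

A domain ends up in the discrepancy output of checkdomains.py exactly when the "consistent" flag would be False, that is, when the third line starts with "warning".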