#!/usr/bin/env python
import urllib, sys, bs4
print bs4.BeautifulSoup(urllib.urlopen("http://data.alexa.com/data?cli=10&dat=s&url="+ sys.argv[1]).read(), "xml").find("REACH")['RANK']
Hi there.
First of all, many thanks for your one-liner.
Even though it breaks the "one-liner" concept, here is a humble improvement in case you want it: a version that catches the exception raised, probably, when requesting the Alexa rank of a domain that is not relevant enough:
#!/usr/bin/env python
import urllib, sys, bs4
try:
    print bs4.BeautifulSoup(urllib.urlopen("http://data.alexa.com/data?cli=10&dat=s&url="+ sys.argv[1]).read(), "xml").find("REACH")['RANK']
except TypeError:
    print 'Unable to get "%s" Alexa rank. Not enough relevance?' % (sys.argv[1])
PS: I was unable to format the code properly, sorry.
Cheers.
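For context on the TypeError above: when the response has no REACH element, find() returns None, and indexing None with ['RANK'] is what raises the exception. Below is a minimal Python 3 sketch of the same lookup with an explicit check instead of try/except. It uses the standard library's xml.etree.ElementTree rather than bs4, and the two sample responses are hypothetical strings modeled on the element names mentioned in this thread, not captured Alexa output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample responses (assumptions for illustration only).
RANKED = ('<ALEXA VER="0.9" URL="example.com/">'
          '<SD><POPULARITY URL="example.com/" TEXT="1234"/>'
          '<REACH RANK="1100"/></SD></ALEXA>')
UNRANKED = '<ALEXA VER="0.9" URL="unknown.example/"></ALEXA>'

def reach_rank(xml_text):
    """Return the REACH rank as an int, or None if the element is absent."""
    root = ET.fromstring(xml_text)
    reach = root.find(".//REACH")  # first REACH element at any depth
    if reach is None:
        return None
    rank = reach.get("RANK")
    return int(rank) if rank is not None else None

print(reach_rank(RANKED))
print(reach_rank(UNRANKED))
```

Returning None lets the caller decide how to report a missing rank, instead of relying on a TypeError to signal it.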
Thanks for the code!
I wrote something that doesn't involve any additional packages and uses only a regex:
#!/usr/bin/env python
import urllib, sys, re
xml = urllib.urlopen('http://data.alexa.com/data?cli=10&dat=s&url=%s'%sys.argv[1]).read()
try: rank = int(re.search(r'<POPULARITY[^>]*TEXT="(\d+)"', xml).groups()[0])
except: rank = -1
print 'Your rank for %s is %d!\n' % (sys.argv[1], rank)
I'm getting an "index out of bounds" error... help!
:) @risheek20
import urllib.request, sys, re
import xmltodict, json
xml = urllib.request.urlopen('http://data.alexa.com/data?cli=10&dat=s&url={}'.format("www.example.com")).read()
result= xmltodict.parse(xml)
data = json.dumps(result).replace("@","")
data_tojson = json.loads(data)
url = data_tojson["ALEXA"]["SD"][1]["POPULARITY"]["URL"]
rank= data_tojson["ALEXA"]["SD"][1]["POPULARITY"]["TEXT"]
print("site {site}, rank {rank}".format(site=url,rank=rank))
I made one here: alexa_check.py
It's not a one-liner, but it gets all available ranks, global and national. It also shows ranks in other countries (the website sometimes shows only one country besides the global rank).
For instance, facebook.com shows only the Global and US ranks on the Alexa page. However, my script finds that it is number 1 in several countries.
:) @risheek20
import urllib.request, sys, re
import xmltodict, json
xml = urllib.request.urlopen('http://data.alexa.com/data?cli=10&dat=s&url={}'.format("www.example.com")).read()
result= xmltodict.parse(xml)
data = json.dumps(result).replace("@","")
data_tojson = json.loads(data)
url = data_tojson["ALEXA"]["SD"][1]["POPULARITY"]["URL"]
rank= data_tojson["ALEXA"]["SD"][1]["POPULARITY"]["TEXT"]
print("site {site}, rank {rank}".format(site=url,rank=rank))
Hey buddy, there's a problem with parser.Parse(xml_input, True)... can you please edit it?
I would appreciate it very much.
Thanks in advance!
Hi there.
First of all, an amazing and simple script, many thanks.
Just wondering if what we want here is .find("POPULARITY")['TEXT'] instead of .find("REACH")['RANK'], since those figures are more consistent with the information displayed in:
a) URLs like
http://www.alexa.com/siteinfo/example.com
b) PageRank plugin for Mozilla Firefox
Cheers.
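To compare the two figures discussed above side by side, here is a small Python 3 sketch that reads both POPULARITY's TEXT attribute and REACH's RANK attribute from one response. The SAMPLE string and the helper name attribute_of are assumptions made for illustration; a real response may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response (assumption, not captured Alexa output).
SAMPLE = ('<ALEXA VER="0.9" URL="example.com/">'
          '<SD><POPULARITY URL="example.com/" TEXT="1234"/>'
          '<REACH RANK="1100"/></SD></ALEXA>')

def attribute_of(root, tag, attr):
    """Return attr of the first descendant named tag, or None if missing."""
    node = root.find(f".//{tag}")
    return None if node is None else node.get(attr)

root = ET.fromstring(SAMPLE)
ranks = {
    "popularity_text": attribute_of(root, "POPULARITY", "TEXT"),
    "reach_rank": attribute_of(root, "REACH", "RANK"),
}
print(ranks)
```

Printing both values for the same domain makes it easy to check which one matches the number shown on the siteinfo page.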