masnun/alexa.py

Created July 24, 2012 16:02

Python One-liner to get your site's Alexa Rank

	#!/usr/bin/env python
	import urllib, sys, bs4
	print bs4.BeautifulSoup(urllib.urlopen("http://data.alexa.com/data?cli=10&dat=s&url="+ sys.argv[1]).read(), "xml").find("REACH")['RANK']

jlegido commented Apr 12, 2016

Hi there.

First of all, this is an amazing and simple script, many thanks.

I'm just wondering whether what we want here is .find("POPULARITY")['TEXT'] instead of .find("REACH")['RANK'], since those figures are more consistent with the information displayed in:

a) URLs like
http://www.alexa.com/siteinfo/example.com

b) PageRank plugin for Mozilla Firefox

Cheers.
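To make the difference between the two attributes concrete, here is a minimal Python 3 sketch that reads both values with the standard library only. The `SAMPLE` document is hand-written: only the element and attribute names (`POPULARITY`/`TEXT`, `REACH`/`RANK`) come from this thread, the numbers are invented, and the real response may contain more fields.

```python
import xml.etree.ElementTree as ET

# Hand-written sample (an assumption): the element and attribute names are
# taken from this thread, the values are made up for illustration.
SAMPLE = """<ALEXA>
  <SD>
    <POPULARITY URL="example.com/" TEXT="12345"/>
    <REACH RANK="11000"/>
  </SD>
</ALEXA>"""

root = ET.fromstring(SAMPLE)

# ".//NAME" searches the whole tree, so the nesting depth does not matter.
popularity_text = root.find(".//POPULARITY").get("TEXT")
reach_rank = root.find(".//REACH").get("RANK")

print("POPULARITY TEXT:", popularity_text)
print("REACH RANK:", reach_rank)
```

Both attributes come back as strings, so they need `int(...)` before any numeric comparison.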

jlegido commented Oct 10, 2016

Hi there.

First of all, many thanks for your one-liner.

Even though it breaks the "one-liner" concept, in case you want a humble improvement, below is a version that catches the exception raised, probably, when asking for the Alexa rank of a domain that is not relevant enough:

#!/usr/bin/env python
import urllib, sys, bs4
try:
    print bs4.BeautifulSoup(urllib.urlopen("http://data.alexa.com/data?cli=10&dat=s&url="+ sys.argv[1]).read(), "xml").find("REACH")['RANK']
except TypeError:
    # find() returned None because the response has no REACH element
    print 'Unable to get "%s" Alexa rank. Not enough relevance?' % (sys.argv[1])

PS: I was unable to properly format the code, sorry.

Cheers.
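The `TypeError` comes from BeautifulSoup's `find` returning `None` when no tag matches, after which `None['RANK']` fails. As an alternative to catching the exception, a sketch can test for the missing element explicitly. The version below is Python 3 and uses the standard library instead of `bs4`; the function name `reach_rank` and both sample documents are hypothetical, written only for illustration.

```python
import xml.etree.ElementTree as ET


def reach_rank(xml_text):
    """Return the REACH RANK attribute as an int, or None when it is absent."""
    reach = ET.fromstring(xml_text).find(".//REACH")
    if reach is None or reach.get("RANK") is None:
        # This is the situation that surfaced as TypeError in the one-liner.
        return None
    return int(reach.get("RANK"))


# Hypothetical responses, written by hand for illustration.
RANKED = '<ALEXA><SD><REACH RANK="11000"/></SD></ALEXA>'
UNRANKED = "<ALEXA><SD/></ALEXA>"

for label, doc in (("ranked", RANKED), ("unranked", UNRANKED)):
    rank = reach_rank(doc)
    print(label, rank if rank is not None else "no rank available")
```

Returning `None` leaves it to the caller to decide whether a missing rank is an error or just an unranked domain.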

elhardoum commented Aug 12, 2017

Thanks for the code!

I wrote something that doesn't involve any additional packages and uses only a regex:

#!/usr/bin/env python
import urllib, sys, re
xml = urllib.urlopen('http://data.alexa.com/data?cli=10&dat=s&url=%s'%sys.argv[1]).read()
try: rank = int(re.search(r'<POPULARITY[^>]*TEXT="(\d+)"', xml).groups()[0])
except: rank = -1
print 'Your rank for %s is %d!\n' % (sys.argv[1], rank)
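The script above is Python 2. In Python 3, `urlopen(...).read()` returns `bytes`, and searching bytes with a `str` pattern raises `TypeError`, so the response has to be decoded first. Below is a minimal sketch of the same regex idea under that constraint. The name `rank_from_bytes` is hypothetical, and UTF-8 is an assumed encoding for the response.

```python
import re

# Same pattern as in the comment above, compiled once.
POPULARITY_RE = re.compile(r'<POPULARITY[^>]*TEXT="(\d+)"')


def rank_from_bytes(raw, encoding="utf-8"):
    """Return the POPULARITY TEXT value as an int, or -1 if it is not found."""
    # Decode first: a str pattern cannot be applied to a bytes object.
    text = raw.decode(encoding, errors="replace")
    match = POPULARITY_RE.search(text)
    return int(match.group(1)) if match else -1


# Hand-written byte strings standing in for a response body.
print(rank_from_bytes(b'<POPULARITY URL="example.com/" TEXT="12345"/>'))
print(rank_from_bytes(b"<ALEXA/>"))
```

Checking `match` directly also avoids the bare `except:`, which would hide unrelated errors such as a typo in a variable name.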

risheek20 commented Jun 24, 2018

I'm getting an error saying that the index is out of bounds... help!

isaachq commented Sep 25, 2018

import urllib.request, sys, re
import xmltodict, json

xml = urllib.request.urlopen('http://data.alexa.com/data?cli=10&dat=s&url={}'.format("www.example.com")).read()
 
result= xmltodict.parse(xml)
 
data = json.dumps(result).replace("@","")
data_tojson = json.loads(data)
url = data_tojson["ALEXA"]["SD"][1]["POPULARITY"]["URL"]
rank= data_tojson["ALEXA"]["SD"][1]["POPULARITY"]["TEXT"]
 
print("site {site}, rank {rank}".format(site=url,rank=rank))
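The hard-coded `["SD"][1]` only works when the response contains at least two `SD` blocks. `xmltodict` returns a list for repeated elements but a plain dict for a single one, so the same lookup can fail on a differently shaped response. One way around this, sketched here with the standard library rather than `xmltodict`, is to search for `POPULARITY` wherever it appears. The name `first_popularity` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def first_popularity(xml_text):
    """Return (url, rank) from the first POPULARITY element, or None."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so it does not matter which SD block
    # (if any) holds the element.
    for element in root.iter("POPULARITY"):
        return element.get("URL"), element.get("TEXT")
    return None


# Hand-written samples (assumptions): two SD blocks, one SD block, no data.
TWO_BLOCKS = """<ALEXA>
  <SD/>
  <SD><POPULARITY URL="example.com/" TEXT="12345"/></SD>
</ALEXA>"""
ONE_BLOCK = '<ALEXA><SD><POPULARITY URL="example.com/" TEXT="12345"/></SD></ALEXA>'
NO_DATA = "<ALEXA/>"

for doc in (TWO_BLOCKS, ONE_BLOCK, NO_DATA):
    print(first_popularity(doc))
```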

rootVIII commented Oct 14, 2018

I made one here: alexa_check.py

Not a one-liner, but it gets all the available ranks, global and national. It also shows the rank in other countries (sometimes the website shows only one country besides the global rank)...

For instance, facebook.com shows only its Global and US ranks on the Alexa page. However, my script finds that it is number 1 in several countries.

MR1387 commented Nov 11, 2019

import urllib.request, sys, re
import xmltodict, json

xml = urllib.request.urlopen('http://data.alexa.com/data?cli=10&dat=s&url={}'.format("www.example.com")).read()
 
result= xmltodict.parse(xml)
 
data = json.dumps(result).replace("@","")
data_tojson = json.loads(data)
url = data_tojson["ALEXA"]["SD"][1]["POPULARITY"]["URL"]
rank= data_tojson["ALEXA"]["SD"][1]["POPULARITY"]["TEXT"]
 
print("site {site}, rank {rank}".format(site=url,rank=rank))

Hey buddy, there's a problem with parser.Parse(xml_input, True)... can you please edit it?
I would appreciate it very much.
Thanks in advance!
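The comment includes no traceback, so the cause is a guess. `parser.Parse(xml_input, True)` appears to be the expat call made inside `xmltodict.parse`, and expat raises `ExpatError` when the input is not well-formed XML, for example an empty body or an HTML error page. The sketch below checks for that with the standard library before any further parsing. The name `is_well_formed` and the sample inputs are hypothetical.

```python
import xml.parsers.expat


def is_well_formed(data):
    """Return True if data parses as XML, False if expat rejects it."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Hand-written inputs standing in for possible response bodies.
print(is_well_formed(b"<ALEXA/>"))           # a minimal XML document
print(is_well_formed(b""))                   # an empty response body
print(is_well_formed(b"<html><br></html>"))  # HTML with an unclosed tag
```

Printing the raw response when this check fails would show what the server actually returned.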