@edsu
Last active December 19, 2015 01:19
This is an example script that uses the link data at linkypedia.info and the page-view stats at stats.grok.se to generate a list of Wikipedia articles that reference a particular website, along with the number of times each article was viewed on Wikipedia in the current month.
#!/usr/bin/env python

import sys
import json
import time
import urllib
import datetime

# build the stats URL template for the current month (YYYYMM)
t = datetime.date.today()
t = "%s%02i" % (t.year, t.month)
stats_url = "http://stats.grok.se/json/en/%s/%%s" % t

stats = {}
for line in urllib.urlopen("http://linkypedia.info/websites/16/data/"):
    # each line is a Wikipedia article URL and an external link, tab separated
    wikipedia_url, extlink_url = line.split("\t")
    title = wikipedia_url.split("/")[-1]

    # if we've checked the article stats already no need to do it again
    if title in stats:
        continue

    article_stats = json.loads(urllib.urlopen(stats_url % title).read())
    total_views = sum(article_stats['daily_views'].values())
    stats[title] = total_views

    # progress dot on stderr, then pause so we don't hammer the stats service
    sys.stderr.write(".")
    time.sleep(1)

# least viewed articles first
articles = sorted(stats, key=stats.get)

for article in articles:
    print "http://en.wikipedia.org/wiki/%s" % article, "\t", stats[article]
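The summing and sorting steps can be tried without touching the network. The snippet below is only a rough sketch of that part of the script: the helper names `total_views` and `rank_articles` are made up for illustration, and the sample responses are invented data that assume nothing beyond a `daily_views` mapping of day to count, which is the only field the script reads.

```python
def total_views(article_stats):
    """Sum the per-day counts found under the 'daily_views' key."""
    return sum(article_stats["daily_views"].values())


def rank_articles(stats):
    """Return (title, views) pairs sorted from least to most viewed."""
    return sorted(stats.items(), key=lambda pair: pair[1])


# Invented sample responses (hypothetical titles, dates and counts),
# standing in for what json.loads() would return for each article.
sample_responses = {
    "Example_A": {"daily_views": {"2013-06-01": 10, "2013-06-02": 5}},
    "Example_B": {"daily_views": {"2013-06-01": 2, "2013-06-02": 1}},
}

stats = {title: total_views(resp) for title, resp in sample_responses.items()}

for title, views in rank_articles(stats):
    print("http://en.wikipedia.org/wiki/%s\t%s" % (title, views))
```

With the sample data above, Example_B (3 views) is listed before Example_A (15 views), matching the least-to-most order the script prints.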