@kachok
Created July 29, 2011 17:11
Quick and dirty way to get 1.usa.gov click archive
import re
import time
import urllib.request

BASE_URL = 'http://bitly.measuredvoice.com/bitly_archive/'

# Fetch the directory listing.
# ?C=M;O=D: sort by last-modified, descending (Apache autoindex query syntax)
data = urllib.request.urlopen(BASE_URL + '?C=M;O=D').read().decode('utf-8', 'replace')
#print(data)

# Data file name pattern, e.g. usagov_bitly_data2011-07-29-1311919454
p = re.compile(r'usagov_bitly_data\d{4}-\d{2}-\d{2}-\d{10}')
#print(p.findall('<tr><td valign="top"><img src="/icons/unknown.gif" alt="[ ]"></td><td><a href="usagov_bitly_data2011-07-29-1311919454">usagov_bitly_data2011-07-29-1311919454</a></td><td align="right">29-Jul-2011 07:04 </td><td'))

# Each file name matches twice per listing row (once in the href, once in
# the link text), so keep every other match.
names = p.findall(data)[::2]
#print(names)

for name in names:
    print(name)
    # The trailing 10 digits look like a Unix timestamp:
    #time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime(epoch))

print(len(names))

for name in names:
    print("downloading", name)
    clicks = urllib.request.urlopen(BASE_URL + name).read()
    # urlopen().read() returns bytes, so write in binary mode
    with open(name, "wb") as f:
        f.write(clicks)

print("done")
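The commented-out `time.strftime` line hints that the 10-digit suffix of each file name is a timestamp. Below is a minimal offline sketch of the matching and decoding steps, assuming the suffix is Unix epoch seconds in UTC (the gist does not state this). `unique_names` and `name_to_utc` are hypothetical helper names, not part of the original script, and the sample row is the one quoted in the script's own comment.

```python
import re
from datetime import datetime, timezone

# Same pattern as the script, with the 10-digit suffix captured as group 1.
NAME_RE = re.compile(r'usagov_bitly_data\d{4}-\d{2}-\d{2}-(\d{10})')


def unique_names(listing_html):
    """Return archive file names in listing order, without duplicates."""
    seen = []
    for match in NAME_RE.finditer(listing_html):
        name = match.group(0)
        if name not in seen:
            seen.append(name)
    return seen


def name_to_utc(name):
    """Decode the trailing suffix as a UTC datetime (assumed epoch seconds)."""
    ts = int(NAME_RE.fullmatch(name).group(1))
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# Sample listing row, shortened from the comment in the script.
row = ('<td><a href="usagov_bitly_data2011-07-29-1311919454">'
       'usagov_bitly_data2011-07-29-1311919454</a></td>')

names = unique_names(row)
print(names)
print(name_to_utc(names[0]).isoformat())
```

Deduplicating by value rather than taking every other match keeps the result correct even if a listing row happens to mention a name only once.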
@abourget

This doesn't work anymore. Is there anywhere we can get hold of that data?

@anothernoise

The archive has moved to its own page. Just replace the first link in the code with http://1usagov.measuredvoice.com/2013/
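Note that the script uses the archive URL twice: once for the listing and once when building each download URL, so both need to point at the new location. Here is a minimal sketch of keeping the base URL in one place. `file_urls` is a hypothetical helper, and whether the new page uses the same listing layout and file name pattern is an assumption that has not been checked here.

```python
from urllib.parse import urljoin

# New location given in the comment above; the trailing slash matters for urljoin.
BASE_URL = 'http://1usagov.measuredvoice.com/2013/'


def file_urls(base_url, names):
    """Build one download URL per file name, relative to base_url."""
    return [urljoin(base_url, name) for name in names]


urls = file_urls(BASE_URL, ['usagov_bitly_data2011-07-29-1311919454'])
print(urls[0])
```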
