Archive Tweets
import requests, os, glob, json

you = 'tmcw'
data = 'tweets'

# Create the output directory if it does not exist yet.
try: os.mkdir(data)
except Exception: pass

def run(max_id = False):
    # Tweets already saved to disk, one JSON file per tweet ID.
    already = glob.glob("%s/*.json" % data)
    start = 'http://api.twitter.com/1/statuses/user_timeline.json?screen_name=%s&include_rts=true&count=200' % you
    if max_id:
        start = '%s&max_id=%s' % (start, max_id)
    r = requests.get(start)
    tweets = r.json()  # a method in requests >= 1.0 (older versions exposed a property)
    has_new = False
    for t in tweets:
        path = "%s/%s.json" % (data, t['id'])
        if path not in already:
            with open(path, 'w') as f:
                json.dump(t, f)
            has_new = True
    # Page backwards through the timeline until a page holds nothing new.
    if has_new:
        last = tweets.pop()
        run(last['id'])

print('starting twitter archive of @%s' % you)
run()

christopherfrance commented Sep 11, 2012

Thanks for referring me to this, Tom.

A minor note: this will not save all of your data, e.g. your favorites, the users you follow, the users who follow you, or avatars, bios, etc. Also (more interestingly), it won't do any spidering to save the data you would eventually need to meaningfully reconstruct conversations (others' tweets) or embedded media (twitpics in the discussion, or even just preserved links). Are you aware of any other scripts that go this more elaborate route? Any interest in extending this one?
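
As a starting point for the conversation-reconstruction idea above, here is a minimal sketch that scans the archive directory and lists the tweets that were replied to but are not archived yet. It is not part of the original script. The function name missing_parents is hypothetical, and it assumes each saved tweet carries the parent ID in an in_reply_to_status_id field, with null for non-replies.

```python
import glob
import json
import os


def missing_parents(archive_dir):
    """Return sorted IDs of replied-to tweets that are not in archive_dir."""
    archived = set()
    parents = set()
    for path in glob.glob(os.path.join(archive_dir, "*.json")):
        with open(path) as f:
            tweet = json.load(f)
        archived.add(tweet["id"])
        # Assumption: replies store the parent tweet ID in this field,
        # and non-replies store null (or omit the field).
        parent = tweet.get("in_reply_to_status_id")
        if parent is not None:
            parents.add(parent)
    # Parents that already have a file on disk need no further fetching.
    return sorted(parents - archived)
```

Calling missing_parents('tweets') after a run of the script would give the list of IDs a spider still has to fetch. Fetching them, and repeating the process for their own parents, is left out here.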
