@beaumartinez
Created December 28, 2014 15:34
Print the tweets in a Twitter archive that are less than a year old. Useful for heroku_ebooks.
#!/usr/bin/env python
from csv import DictReader
from io import StringIO
from sys import argv
from zipfile import ZipFile

from arrow import utcnow
import dateutil.parser


def twitter2ebooks(path):
    # A Twitter archive is a zip file; the tweets live in tweets.csv.
    archive = ZipFile(path)
    csv = archive.read('tweets.csv').decode('utf-8')
    rows = DictReader(StringIO(csv))

    now = utcnow()

    for row in rows:
        created_at = dateutil.parser.parse(row['timestamp'])

        # Age of the tweet: now minus creation time, so that tweets in the
        # past give a positive number of days.
        if (now - created_at).days < 365:
            print(repr(row['text']))


if __name__ == '__main__':
    twitter2ebooks(argv[1])
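The age filter can be tried without a real archive. The following is a minimal sketch using only the standard library, not the gist's own method: `recent_tweets` and `make_sample_archive` are hypothetical helpers, the sample tweets are made up, and the timestamp layout (`YYYY-MM-DD HH:MM:SS +0000`) is an assumption about how `tweets.csv` stores dates.

```python
import csv
import io
import zipfile
from datetime import datetime, timezone

# Assumed timestamp layout of tweets.csv, e.g. "2014-12-28 15:34:00 +0000".
TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S %z'


def recent_tweets(archive_file, now, max_age_days=365):
    """Yield the text of tweets younger than max_age_days (hypothetical helper)."""
    with zipfile.ZipFile(archive_file) as archive:
        data = archive.read('tweets.csv').decode('utf-8')

    for row in csv.DictReader(io.StringIO(data)):
        created_at = datetime.strptime(row['timestamp'], TIMESTAMP_FORMAT)
        # now - created_at is positive for tweets in the past.
        if (now - created_at).days < max_age_days:
            yield row['text']


def make_sample_archive():
    """Build a tiny in-memory archive with made-up tweets, for illustration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as archive:
        archive.writestr(
            'tweets.csv',
            'timestamp,text\n'
            '2014-12-01 10:00:00 +0000,recent tweet\n'
            '2012-06-15 09:30:00 +0000,old tweet\n',
        )
    buffer.seek(0)
    return buffer


# A fixed "now" keeps the example reproducible.
now = datetime(2014, 12, 28, 15, 34, tzinfo=timezone.utc)
print(list(recent_tweets(make_sample_archive(), now)))  # prints ['recent tweet']
```

Passing `now` in as a parameter, rather than reading the clock inside the function, makes the one-year cutoff easy to check against known dates.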