@fin
Created February 1, 2012 00:17
lazy twitter backup script
grep -m 1 * | grep -v ']}'
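require 'rubygems'
require 'twitter'
require 'json'
require 'fileutils'

username = "fin"
last_id = nil
page = 1
numberoftweets = 0
savedtweets = 0
backup_dir = 'mytweets'

# Create the backup directory up front; without it every write fails
# and the rescue below keeps retrying forever.
FileUtils.mkdir_p(backup_dir)

loop do
  begin
    Twitter.user_timeline(username, :page => page, :count => 200, :include_rts => true, :include_entities => true).each do |tweet|
      t = tweet.attrs
      savethis = t.to_json
      text = t['text']
      last_id = t['id_str']
      numberoftweets += 1
      filename = "#{backup_dir}/#{last_id}.json"
      # Skip tweets already saved by an earlier run.
      unless File.exist?(filename)
        # Tweet text first (easy to grep), then the full JSON.
        File.open(filename, 'w') do |f|
          f.puts text
          f.puts savethis
        end
        savedtweets += 1
      end
    end
    $stderr.puts "page #{page}, last id #{last_id}, #{numberoftweets} tweets, #{savedtweets} new"
    page += 1
    # Stop after 20 pages of up to 200 tweets each.
    break if page > 20
  rescue => ex
    # page is not advanced on failure, so the same page is fetched again.
    $stderr.puts "error, retrying"
    $stderr.puts ex.inspect
    sleep 4
  end
end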
fin commented Feb 1, 2012

Or, if you need to search the metadata as well, feed the output through

$ sed 's/.*"text":"\([^"]*\)".*/\1/'

to get only the text.
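The same metadata search can also be done in Ruby, which avoids pulling the text out of raw JSON with a regex. This is only a sketch, not part of the original script: `search_backups` is a hypothetical helper, and it assumes each file was written by the script above, so the JSON sits on the last line (the tweet text written before it may itself contain newlines).

```ruby
require 'json'

# Hypothetical helper (not part of the original gist): return the text of
# every backed-up tweet whose raw JSON line matches the given pattern.
# Assumes the layout written by the backup script: tweet text first,
# then the full JSON on the last line of each file.
def search_backups(dir, pattern)
  Dir.glob(File.join(dir, '*.json')).sort.each_with_object([]) do |path, hits|
    json_line = File.read(path).lines.last.to_s
    next if json_line.strip.empty?      # skip empty or truncated files
    next unless json_line =~ pattern    # search text and metadata alike
    hits << JSON.parse(json_line)['text']
  end
end
```

For example, `search_backups('mytweets', /"source":/).each { |text| puts text }` would print only the text of each matching tweet, much like the grep and sed pipeline.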
