Archive a FriendFeed feed in MongoDB
#!/usr/bin/ruby
require "rubygems"
require "mongo"
require "json/pure"
require "open-uri"

# db config: connect to the local MongoDB server, database 'friendfeed'
db  = Mongo::Connection.new.db('friendfeed')
col = db.collection('lifesci')

# fetch json: page through the feed 100 entries at a time (start = 0, 100, ..., 9900)
0.step(9900, 100) do |n|
  f = open("http://friendfeed-api.com/v2/feed/the-life-scientists?start=#{n}&num=100").read
  j = JSON.parse(f)
  # stop as soon as a page comes back empty
  break if j['entries'].count == 0
  j['entries'].each do |entry|
    # only save entries that are not already in the collection
    if col.find({:_id => entry['id']}).count == 0
      # use the FriendFeed entry ID as the MongoDB document key
      entry[:_id] = entry['id']
      entry.delete('id')
      col.save(entry)
    end
  end
  puts "Processed entries #{n} - #{n + 99}", "Database contains #{col.count} documents."
end
puts "No more entries to process. Database contains #{col.count} documents."
neilfws commented Aug 13, 2010
  1. Rewritten as a rake task; save as "Rakefile" and run as "rake db:seed feed=FEED_ID".
  2. The entry ID alone is not sufficient as a unique key (an entry may appear in several feeds), so the feed sup_id is now prepended to it.
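
The two changes described above could look roughly like the following. This is only a sketch, not the revised gist itself: the helper names (`feed_url`, `composite_id`, `keyed_entry`) and the "-" separator are hypothetical, and it assumes the feed's sup_id has already been read from the parsed response. In a rake task, the `feed=FEED_ID` argument would be available as `ENV['feed']`.

```ruby
# Hypothetical helpers; names and the "-" separator are assumptions,
# not taken from the revised gist.

# Build the paged feed URL for a given feed ID
# (in a rake task the ID would come from ENV['feed']).
def feed_url(feed_id, start, num = 100)
  "http://friendfeed-api.com/v2/feed/#{feed_id}?start=#{start}&num=#{num}"
end

# Prefix the entry ID with the feed's sup_id so the same entry
# archived from two different feeds gets two distinct _id values.
def composite_id(sup_id, entry_id)
  "#{sup_id}-#{entry_id}"
end

# Return a copy of the entry keyed by the composite _id,
# with the original 'id' field removed.
def keyed_entry(sup_id, entry)
  doc = entry.reject { |k, _| k == 'id' }
  doc['_id'] = composite_id(sup_id, entry['id'])
  doc
end
```

With keys built this way, the existence check in the script would look up the composite value instead of the bare entry ID.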
neilfws commented Dec 19, 2010

Changed the step limit back to 9900; I don't think any start value above this returns more results.

neilfws commented Feb 1, 2011

Added a sleep() call to this version of the code.
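
Where the sleep() goes and how long it pauses are not stated here, so the following is only a sketch under assumptions: a hypothetical `each_offset` helper that pauses for `delay` seconds after every page, with the one-second default chosen arbitrarily. The block does the fetching and returns false when a page is empty.

```ruby
# Hypothetical helper: yields each start offset (0, 100, ..., max_start)
# and pauses between requests. The delay value is an assumption.
def each_offset(max_start = 9900, page_size = 100, delay = 1)
  0.step(max_start, page_size) do |n|
    more = yield n        # the block returns false once a page is empty
    break unless more
    sleep(delay)          # be polite to the API before the next request
  end
end
```

Usage would be along the lines of `each_offset { |n| fetch_and_store(n) }`, where `fetch_and_store` is a placeholder for the fetch, parse and save steps of the script and returns true while entries keep arriving.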
