Archive a FriendFeed feed in MongoDB

friendfeed2mongodb.rb
#!/usr/bin/ruby

require "rubygems"
require "mongo"
require "json/pure"
require "open-uri"

# db config: database 'friendfeed', collection 'lifesci' on the default local server
db  = Mongo::Connection.new.db('friendfeed')
col = db.collection('lifesci')

# fetch json, 100 entries per request, for start offsets 0 to 9900
0.step(9900, 100) {|n|
  f = open("http://friendfeed-api.com/v2/feed/the-life-scientists?start=#{n}&num=100").read
  j = JSON.parse(f)
  # stop as soon as the API returns an empty page
  break if j['entries'].empty?
  j['entries'].each do |entry|
    # store only entries not already in the collection, using the entry ID as _id
    if col.find({:_id => entry['id']}).count == 0
      entry[:_id] = entry['id']
      entry.delete('id')
      col.save(entry)
    end
  end
  puts "Processed entries #{n} - #{n + 99}", "Database contains #{col.count} documents."
}

puts "No more entries to process. Database contains #{col.count} documents."
  1. Rewritten as a rake task: save as "Rakefile" and run as "rake db:seed feed=FEED_ID".
  2. The entry ID alone is not sufficient as a unique key (an entry may appear in several feeds), so the feed sup_id is now prepended to it.
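The key change in point 2 could look roughly like the sketch below. It is not the code from the rake-task revision: the helper names `composite_id` and `prepare_entry` and the ":" separator are assumptions made for illustration, and the sup_id is assumed to have been read from the feed response already.

```ruby
# Hypothetical helper: build a per-feed unique key by prepending the
# feed's sup_id to the entry ID. The ":" separator is an assumption.
def composite_id(sup_id, entry_id)
  "#{sup_id}:#{entry_id}"
end

# Hypothetical helper: return a copy of the entry ready for storage,
# with '_id' set to the composite key and the original 'id' removed.
# The entry passed in is left unmodified.
def prepare_entry(entry, sup_id)
  doc = entry.dup
  doc['_id'] = composite_id(sup_id, doc.delete('id'))
  doc
end
```

With a key of this shape, the same entry fetched through two different feeds is stored as two documents instead of the second being skipped as a duplicate.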

Changed the step limit back to 9900; I don't think any start value above this returns more results.

Added a sleep() call to this version of the code.
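One way to place the pause is between page requests, as in this minimal sketch. The method name `each_page`, its keyword parameters, and the one-second default delay are assumptions, not taken from the revised gist. The fetcher is passed in as a callable so the loop can be tried without network access.

```ruby
# Hypothetical helper: walk a feed page by page, pausing between requests.
# fetch_page is any callable that takes a start offset and returns an
# array of entries; an empty array ends the loop early.
def each_page(fetch_page, max_start: 9900, page_size: 100, delay: 1)
  0.step(max_start, page_size) do |start|
    entries = fetch_page.call(start)
    break if entries.empty?
    yield entries, start
    sleep(delay) # be polite to the API between requests
  end
end

# Possible usage with the script's URL (left commented out, needs network):
#   fetch = lambda do |n|
#     url = "http://friendfeed-api.com/v2/feed/the-life-scientists?start=#{n}&num=100"
#     JSON.parse(open(url).read)['entries']
#   end
#   each_page(fetch) { |entries, n| puts "Got #{entries.size} entries from #{n}" }
```

Keeping the delay as a parameter makes it easy to set it to 0 when testing against a stubbed fetcher.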
