Archive a FriendFeed feed in MongoDB

friendfeed2mongodb.rb
#!/usr/bin/ruby

require "rubygems"
require "mongo"
require "json/pure"
require "open-uri"

# db config (Mongo::Connection is the legacy 1.x driver API)
db = Mongo::Connection.new.db('friendfeed')
col = db.collection('lifesci')

# fetch json: 100 entries per request, start offsets 0 to 9900
0.step(9900, 100) do |n|
  # open-uri's Kernel#open; on Ruby 3.0+ this call must be URI.open
  f = open("http://friendfeed-api.com/v2/feed/the-life-scientists?start=#{n}&num=100").read
  j = JSON.parse(f)
  # stop as soon as a page comes back empty
  break if j['entries'].count == 0
  j['entries'].each do |entry|
    # save only entries not yet in the collection, using the entry ID as _id
    if col.find({:_id => entry['id']}).count == 0
      entry[:_id] = entry['id']
      entry.delete('id')
      col.save(entry)
    end
  end
  puts "Processed entries #{n} - #{n + 99}", "Database contains #{col.count} documents."
end

puts "No more entries to process. Database contains #{col.count} documents."
Owner
  1. Rewritten as a Rake task; save as "Rakefile" and run as "rake db:seed feed=FEED_ID".
  2. The entry ID alone is not sufficient as a unique key (an entry may appear in several feeds), so the feed's sup_id is now prepended to it.
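The composite key described in point 2 might look roughly like the sketch below. The helper name entry_document, the "-" separator, and the sample values are assumptions for illustration; the actual Rakefile may build the key differently.

```ruby
# Build a document whose _id combines the feed's sup_id with the entry ID,
# so the same entry archived from two feeds gets two distinct keys.
# The helper name and the "-" separator are hypothetical.
def entry_document(feed_sup_id, entry)
  doc = entry.dup                                   # leave the parsed JSON untouched
  doc['_id'] = "#{feed_sup_id}-#{doc.delete('id')}" # prepend sup_id, drop the old 'id'
  doc
end

# Sample data standing in for a parsed feed response (values are made up).
feed = { 'sup_id' => 'abc123', 'entries' => [{ 'id' => 'e/1', 'body' => 'hello' }] }
docs = feed['entries'].map { |entry| entry_document(feed['sup_id'], entry) }
puts docs.first['_id']
```

Each resulting hash could then be passed to col.save in place of the bare entry.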
Owner

Changed the step limit back to 9900; I don't think any start value above this returns more results.

Owner

Added a sleep() call to this version of the code.
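One way the pause could be wired into the paging loop is sketched below. The method name each_offset and the one-second default are assumptions; the comment above does not say where the sleep() goes or how long it lasts.

```ruby
# Yield each start offset in turn, pausing after every page so that
# requests to the API are spaced out. Name and default pause are hypothetical.
def each_offset(last_start: 9900, page_size: 100, pause: 1)
  0.step(last_start, page_size) do |start|
    yield start
    sleep(pause)
  end
end

# Demo with a short range and no pause; a real run would fetch and store
# the page for each offset inside the block.
each_offset(last_start: 200, pause: 0) do |start|
  puts "would fetch start=#{start}&num=100"
end
```

A break inside the block ends the whole loop, so the original "stop on an empty page" check still works unchanged.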
