Archive a FriendFeed feed in MongoDB
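require "rubygems"
require "mongo"
require "json/pure"
require "open-uri"

# db config (legacy 1.x Ruby driver API; the connection call was garbled in
# the source and is reconstructed here)
db  = Mongo::Connection.new.db('friendfeed')
col = db.collection('lifesci')

# fetch json, 100 entries per request
0.step(9900, 100) {|n|
  # NOTE: the feed URL prefix is missing from the source; only the paging
  # parameters survive, so "..." is left in place.
  f = open("...#{n}&num=100").read
  j = JSON.parse(f)
  break if j['entries'].count == 0
  j['entries'].each do |entry|
    # store only entries not already in the collection, keyed by entry id
    if col.find({:_id => entry['id']}).count == 0
      entry[:_id] = entry['id']
      col.save(entry)  # write call was lost from the source; save assumed
    end
  end
  puts "Processed entries #{n} - #{n + 99}", "Database contains #{col.count} documents."
}
puts "No more entries to process. Database contains #{col.count} documents."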

neilfws commented Aug 13, 2010

  1. Re-written as a rake task; save as "Rakefile" and run as "rake db:seed feed=FEED_ID".
  2. entry ID alone not sufficient as unique key (may appear in several feeds); so prepended feed sup_id.
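
As a rough illustration of point 2, the key could be built by prepending the feed's sup_id to the entry ID. This is only a sketch: the helper name `feed_entry_key`, the `_` separator and the sample values are assumptions, not taken from the revised gist.

```ruby
# Hypothetical helper: build a per-feed unique key by prepending the
# feed's sup_id to the entry id. The "_" separator is an assumption.
def feed_entry_key(sup_id, entry_id)
  "#{sup_id}_#{entry_id}"
end

# Illustrative values only, not real FriendFeed data.
entry = { 'id' => 'e/abc123' }
entry['_id'] = feed_entry_key('feed-1', entry['id'])
```

With a key like this, the same entry appearing in two feeds is stored as two documents rather than being skipped as a duplicate.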

neilfws commented Dec 19, 2010

Changed step back to 9900; don't think anything above this returns more results.


neilfws commented Feb 1, 2011

Added a sleep() to this version of code.
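
One way the pause could fit into the paging loop is sketched below. The helper name `each_page`, its parameters and the one-second default are assumptions; the comment does not say where the sleep() goes or how long it is.

```ruby
# Hypothetical paging helper: calls the block once per offset and pauses
# between requests. The block returns the number of entries fetched for
# that offset; paging stops at the first empty page.
def each_page(last_start = 9900, page_size = 100, pause = 1)
  pages = 0
  0.step(last_start, page_size) do |n|
    break if yield(n) == 0
    pages += 1
    sleep(pause) # be polite to the API between requests
  end
  pages
end

# Usage with made-up page sizes instead of real HTTP requests.
fake_counts = { 0 => 100, 100 => 40, 200 => 0 }
fetched = each_page(9900, 100, 0) { |n| fake_counts.fetch(n, 0) }
```

Passing the fetch step as a block keeps the pause logic separate from the HTTP and database calls, so it can be tried without network access.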
