Massimo Donati massdonati
massdonati / web_crowler.rb
Created May 9, 2013 10:19
Small web crawler script using Anemone and MongoDB
require 'anemone'
require 'mongo'
# MongoDB setup (Mongo::Connection is the legacy 1.x driver API)
db = Mongo::Connection.new.db("demo")
urls_collection = db["page_urls"]
# Anemone crawler setup and main operation
Anemone.crawl("http://www.fondazionecollegiopiox.org") do |anemone|
  # Persist crawled pages in MongoDB instead of the default in-memory store
  anemone.storage = Anemone::Storage.MongoDB