Nick Tulip ntulip

var Promise = require('bluebird');
var MongoDB = Promise.promisifyAll(require("mongodb"));
var MongoClient = Promise.promisifyAll(MongoDB.MongoClient);
var cheerio = require('cheerio');
var http = require('http');
var urls = [
'http://www.magentocommerce.com/certification/directory/index/?q=&country_id=AU&region_id=&region=vic&certificate_type=',
'http://www.magentocommerce.com/certification/directory/index/?q=&country_id=AU&region_id=&region=victoria&certificate_type='
];
azizmb / message_queue_pipeline.py
Created January 7, 2012 08:54
Scrapy pipeline to enqueue scraped items to a message queue using carrot
from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals
from scrapy.exceptions import DropItem
from scrapy.utils.serialize import ScrapyJSONEncoder
from carrot.connection import BrokerConnection
from carrot.messaging import Publisher
from twisted.internet.threads import deferToThread
guenter / move_to_rds.rb
Created November 11, 2010 02:14
A quick and dirty script to move a database into Amazon RDS (or any other database). Can transfer part of the data beforehand.
require 'fileutils'
start_time = Time.now
SOURCE_DB = {
:name => 'db_name',
:user => 'db_user',
:password => 'db_pass',
:host => 'localhost'