Dan Chrostowski dchrostowski

dchrostowski /
Last active Apr 23, 2019
Generic synchronous task queue with concurrency in Python3
from threading import Thread
import queue
import time
import random
import string
class Task(object):
    def __init__(self, *args, **kwargs):
        self.fn = kwargs.pop('fn')
        self.args = args
dchrostowski /
Created Jun 13, 2017
socks proxy middleware example for scrapy with DeleGate

I thought I'd share how I'm getting SOCKS support with Scrapy. There are two pretty good options: DeleGate and Privoxy. I'm going to give an example of a middleware that I implemented using DeleGate, which has worked great for me so far.

DeleGate is simple and straightforward: it serves as an HTTP-to-SOCKS bridge. In other words, you make a request to it with Scrapy as if it were an HTTP proxy, and it takes care of bridging that over to the SOCKS server. Privoxy can do this too, but DeleGate seems to have much better documentation and possibly more functionality. You can either build from source or download a pre-built binary (supports Windows, Mac OS X, Linux, BSD, and Solaris). Set it up however you like so that it's on your PATH. In my Ubuntu setup I simply created a symbolic link to the binary in my /usr/bin directory; copying it there works too. So after it'
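
To make the bridging idea concrete, here is a minimal sketch of what such a middleware could look like. It is not the middleware from this gist (the preview is cut off before it appears); it assumes DeleGate is already running locally as an HTTP proxy, and the address `http://127.0.0.1:8118`, the class name `DelegateProxyMiddleware`, and the `FakeRequest` stand-in are all hypothetical names chosen for illustration. The only Scrapy behaviour relied on is that a request's `meta["proxy"]` value is honoured by the built-in HTTP proxy handling.

```python
# Hypothetical address where a local DeleGate instance is assumed to be
# listening as an HTTP proxy (it forwards traffic on to the SOCKS server).
DELEGATE_HTTP_PROXY = "http://127.0.0.1:8118"


class DelegateProxyMiddleware:
    """Sketch of a Scrapy downloader middleware that routes every
    request through the local HTTP-to-SOCKS bridge."""

    def __init__(self, proxy_url=DELEGATE_HTTP_PROXY):
        self.proxy_url = proxy_url

    def process_request(self, request, spider):
        # Scrapy treats meta["proxy"] as the HTTP proxy for this request.
        request.meta["proxy"] = self.proxy_url
        # Returning None lets Scrapy continue processing the request.
        return None


class FakeRequest:
    """Stand-in for scrapy.Request so the sketch runs without Scrapy."""

    def __init__(self, url):
        self.url = url
        self.meta = {}


request = FakeRequest("http://example.com/")
DelegateProxyMiddleware().process_request(request, spider=None)
print(request.meta["proxy"])
```

In a real project the class would be registered under the `DOWNLOADER_MIDDLEWARES` setting, with a priority lower than the built-in `HttpProxyMiddleware` so that the proxy is set before that middleware runs.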

dchrostowski /
Created Mar 1, 2017
Morningstar transcript crawler component
#! /usr/bin/env perl
package Quan::Worker::ConferenceCallTranscripts::Morningstar;
use Moose;
extends 'Quan::Worker::ConferenceCallTranscripts';
use Method::Signatures;
use URI;
use Crawler;
use JSON qw(encode_json decode_json);
use Data::Dumper;
use FindBin qw($Bin);