Dan Chrostowski dchrostowski

@dchrostowski
dchrostowski / Morningstar.pm
Created March 1, 2017 17:13
Morningstar transcript crawler component
#! /usr/bin/env perl
package Quan::Worker::ConferenceCallTranscripts::Morningstar;
use Moose;
extends 'Quan::Worker::ConferenceCallTranscripts';
use Method::Signatures;
use URI;
use Crawler;
use JSON qw(encode_json decode_json);
use Data::Dumper;
use FindBin qw($Bin);
@dchrostowski
dchrostowski / synchronous_task_queue.py
Last active April 23, 2019 02:48
Generic synchronous task queue with concurrency in Python3
from threading import Thread
import queue
import time
import random
import string
class Task(object):
    def __init__(self, *args, **kwargs):
        self.fn = kwargs.pop('fn')
        self.args = args
@dchrostowski
dchrostowski / scrapy_socks.md
Created June 13, 2017 05:29
socks proxy middleware example for scrapy with DeleGate

I thought I'd just share how I'm getting SOCKS support with scrapy. Basically there are two pretty good options: DeleGate and Privoxy. I'm going to give an example of a middleware that I implemented using DeleGate, which has worked great for me thus far.
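
To make the idea concrete before getting into setup, here is a minimal sketch of what such a middleware could look like. It is not the original implementation: the class name `HttpToSocksBridgeMiddleware` and the bridge address `http://127.0.0.1:8080` are assumptions for illustration, and the sketch assumes an HTTP-to-SOCKS bridge (such as DeleGate) is already listening on that address. The only scrapy behaviour it relies on is that a downloader middleware's `process_request` can set `request.meta['proxy']` to route a request through an HTTP proxy.

```python
class HttpToSocksBridgeMiddleware:
    """Illustrative downloader middleware (hypothetical name).

    Points every outgoing request at a local HTTP proxy, which is
    assumed to forward the traffic to a SOCKS server.
    """

    def __init__(self, bridge_url="http://127.0.0.1:8080"):
        # Assumed address of the local HTTP-to-SOCKS bridge;
        # adjust to wherever your bridge actually listens.
        self.bridge_url = bridge_url

    def process_request(self, request, spider):
        # Leave requests alone if a proxy was already chosen for them.
        request.meta.setdefault("proxy", self.bridge_url)
        # Returning None tells scrapy to keep processing the request.
        return None
```

To try it, the class would be registered under scrapy's `DOWNLOADER_MIDDLEWARES` setting in the project's `settings.py`, using whatever module path it lives at in your project.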

DeleGate is amazingly simple and straightforward; it basically serves as an HTTP-to-SOCKS bridge. In other words, you make a request to it with scrapy as if it were an HTTP proxy, and it takes care of bridging that over to the SOCKS server. Privoxy can do this too, but DeleGate seems to have much better documentation and possibly more functionality. You can either build it from source or download a pre-built binary (Windows, Mac OS X, Linux, BSD, and Solaris are supported). Set it up however you like so that it's on your PATH. In my Ubuntu setup I simply created a symbolic link to the binary in my /usr/bin directory; copying it over there works too. So after it'