Walter Traspadini uolter
@uolter
uolter / hide_code.py
Created September 4, 2016 16:27
Hide code cells in Jupyter notebook
from IPython.display import HTML

def hide_code():
    return HTML('''<script>
    code_show=true;
    function code_toggle() {
        if (code_show){
            $("div.input").hide();
        } else {
            $("div.input").show();
@uolter
uolter / map_reduce.py
Created February 8, 2016 05:40
Map reduce engine emulator
__author__ = 'uolter'
"""
Defines a single function, map_reduce, which takes an input
dictionary i and applies the user-defined function mapper to each
(input_key,input_value) pair, producing a list of intermediate
keys and intermediate values. Repeated intermediate keys then
have their values grouped into a list, and the user-defined
function reducer is applied to the intermediate key and list of
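
The preview above is cut off, so the following is only a minimal sketch of the flow the docstring describes, not the gist's actual code. It assumes that mapper returns a list of (intermediate_key, intermediate_value) pairs and that the results of reducer are collected into a list; word_mapper and count_reducer are hypothetical helpers added for illustration.

```python
from collections import defaultdict


def map_reduce(i, mapper, reducer):
    """Sketch of a map/reduce pass over the input dictionary i."""
    intermediate = defaultdict(list)
    # Map phase: each (input_key, input_value) pair yields intermediate pairs.
    for input_key, input_value in i.items():
        for key, value in mapper(input_key, input_value):
            # Group phase: values sharing an intermediate key go in one list.
            intermediate[key].append(value)
    # Reduce phase: reducer sees each intermediate key with its value list.
    return [reducer(key, values) for key, values in intermediate.items()]


# Hypothetical mapper: emit (word, 1) for every word in a line of text.
def word_mapper(input_key, input_value):
    return [(word, 1) for word in input_value.split()]


# Hypothetical reducer: sum the counts collected for one word.
def count_reducer(intermediate_key, intermediate_values):
    return (intermediate_key, sum(intermediate_values))


if __name__ == "__main__":
    lines = {"line1": "x y x", "line2": "y"}
    print(map_reduce(lines, word_mapper, count_reducer))
```

Keeping the grouping step inside map_reduce means the mapper and reducer stay plain functions that know nothing about each other.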
@uolter
uolter / cluster_count.py
Created February 7, 2016 10:38
Map reduce example to classify cluster and values
__author__ = 'uolter'
import map_reduce
def mapper(input_key, input_value):
def cut_and_clean_value(cluster):
"""
:param cluster: string in the format <cluster>:<value>
:return: tuple of cluster and value; if the value is NaN, return 0
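
The body of this helper is not visible in the preview. One possible reading of the docstring is sketched below, assuming the value part is numeric and should come back as a float; treating an empty or unparsable value like NaN is an extra assumption, not something the gist states.

```python
import math


def cut_and_clean_value(cluster):
    """Split '<cluster>:<value>' into (cluster, value); NaN becomes 0."""
    name, _, raw_value = cluster.partition(":")
    try:
        value = float(raw_value)
    except ValueError:
        # Assumption: an empty or non-numeric value is treated like NaN.
        value = 0.0
    if math.isnan(value):
        value = 0.0
    return name, value


if __name__ == "__main__":
    print(cut_and_clean_value("a:3.5"))
    print(cut_and_clean_value("b:NaN"))
```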
@uolter
uolter / webgo.go
Created March 13, 2015 13:00
Dump http post with json
package main

import (
	"encoding/json"
	"io/ioutil"
	"log"
	"net/http"
)

type test_struct struct {
@uolter
uolter / quicksort.py
Created December 29, 2014 10:27
Simple Quick Sort example with Python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import unittest
""" Quicksort implementation """
def quicksort(arr):
    """ Quicksort a list
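
The function body is truncated in this preview. As a rough illustration of the technique only, here is a short, non-in-place quicksort that partitions around a middle pivot and returns a new list. The pivot choice and the three-way split are assumptions and may differ from the gist's implementation.

```python
def quicksort(arr):
    """Return a sorted copy of arr using a simple recursive quicksort."""
    if len(arr) <= 1:
        return list(arr)
    pivot = arr[len(arr) // 2]
    # Three-way partition so repeated values are handled in a single pass.
    smaller = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    larger = [x for x in arr if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


if __name__ == "__main__":
    print(quicksort([3, 1, 2, 3, 0]))
```

This version trades memory for readability; an in-place partition scheme would avoid the intermediate lists.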
@uolter
uolter / tree.py
Created December 22, 2014 10:24
Tree Data Structure for Manager Employee Hierarchy
# -*- coding: utf-8 -*-
import unittest
index = {}
class tree(object):
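
Only the class header is visible here, so the sketch below is a guess at what a manager/employee tree might look like rather than the gist's code. The class name Tree and the methods chain_of_command and all_reports are hypothetical, and the module-level index dictionary from the preview is left out.

```python
class Tree:
    """One person in the hierarchy, linked to a manager and direct reports."""

    def __init__(self, name, manager=None):
        self.name = name
        self.manager = manager
        self.reports = []
        if manager is not None:
            manager.reports.append(self)

    def chain_of_command(self):
        """Names of managers from the direct manager up to the root."""
        chain = []
        node = self.manager
        while node is not None:
            chain.append(node.name)
            node = node.manager
        return chain

    def all_reports(self):
        """Names of everyone below this node, in depth-first order."""
        names = []
        for report in self.reports:
            names.append(report.name)
            names.extend(report.all_reports())
        return names


if __name__ == "__main__":
    ceo = Tree("ceo")
    cto = Tree("cto", ceo)
    dev = Tree("dev", cto)
    print(dev.chain_of_command())
    print(ceo.all_reports())
```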
@uolter
uolter / pip_update_all
Created December 2, 2014 09:42
pip update all packages
pip freeze --local | grep -v '^\-e' | cut -d = -f 1 | xargs pip install -U
@uolter
uolter / venv_setup.sh
Created September 22, 2014 13:37
venv_and_pip_centos.sh
curl https://raw.githubusercontent.com/pypa/pip/master/contrib/get-pip.py > get-pip.py;
python get-pip.py;
rm -f get-pip.py;
# change directory here. Go in your project home dir.
# cd /opt/uuid_resolver/;
pip install virtualenv;
virtualenv venv;
# activate the virtualenv
source venv/bin/activate
# change here your requirements.txt location

Simple Website Crawler

This gist is an extract from the article Building a simple crawler. It crawls outward from a starting URL for a given number of bounces.

Basic Usage

from crawler import Crawler
crawler = Crawler()
crawler.crawl('http://techcrunch.com/')

This displays the URLs.
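
The Crawler class itself is not shown in this extract. To give an idea of how a crawl limited to a number of bounces could work, here is a standard-library sketch that reads "bounce" as one link hop away from the start URL. The names crawl, fetch_page, LinkParser and max_bounces are all hypothetical and are not the API of the crawler module used above.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_page(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_bounces=1, fetch=fetch_page):
    """Breadth-first crawl; returns URLs in the order they were reached."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_bounces:
            continue  # bounce limit reached: do not follow links from here
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # unreachable or malformed URL: skip it
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


if __name__ == "__main__":
    for found in crawl("http://techcrunch.com/", max_bounces=1):
        print(found)
```

Passing the fetch function as a parameter keeps the traversal logic testable without network access.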

#!/bin/bash
# Elasticsearch Start and Stop Script
ES_HOME="/opt/elsearch/elasticsearch"
ES_USER="esearch"
PID=$(ps ax | grep elasticsearch | grep "$ES_HOME" | grep -v grep | awk '{print $1}')
#echo $PID