Brian Belgodere bbelgodere

@jonathana
jonathana / grouplens_evaluator.py
Created June 22, 2011 20:15
"Mahout in Action" Grouplens evaluator sample from section 2.5 ported to jython
import sys, os, glob
from datetime import datetime
# Add the Mahout core location and every jar in MAHOUT_JAR_DIR to the Jython search path
sys.path.append(os.environ.get("MAHOUT_CORE"))
for jar in glob.glob(os.environ.get("MAHOUT_JAR_DIR") + "/*.jar"):
    sys.path.append(jar)
from org.apache.mahout.common import RandomUtils
from org.apache.mahout.cf.taste.common import TasteException
from org.apache.mahout.cf.taste.eval import *
@jboner
jboner / latency.txt
Last active June 27, 2024 14:47
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns      3 us
Send 1K bytes over 1 Gbps network       10,000   ns     10 us
Read 4K randomly from SSD*             150,000   ns    150 us      ~1GB/sec SSD
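The ratio annotations in the table can be rechecked from the listed latencies. A minimal Python sketch, where the dictionary keys and the helper name `ratio` are illustrative assumptions and not part of the original gist:

```python
# Latencies in nanoseconds, copied from the table above.
# The key names are hypothetical labels chosen for this sketch.
LATENCY_NS = {
    "l1_cache": 0.5,
    "l2_cache": 7,
    "main_memory": 100,
}


def ratio(slower, faster):
    """Return how many times slower one operation is than another."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


if __name__ == "__main__":
    for slower, faster in [
        ("l2_cache", "l1_cache"),
        ("main_memory", "l2_cache"),
        ("main_memory", "l1_cache"),
    ]:
        print("%s vs %s: %.1fx" % (slower, faster, ratio(slower, faster)))
```

With these figures, L2 versus L1 and main memory versus L1 come out at 14x and 200x, while main memory versus L2 is closer to 14x than the 20x listed, so the annotations are probably best read as rough orders of magnitude.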
# MAC manipulators
alias random_mac='sudo ifconfig en0 ether `openssl rand -hex 6 | sed "s/\(..\)/\1:/g; s/.$//"`'
alias restore_mac='sudo ifconfig en0 ether YOUR_ORIGINAL_MAC_ADDRESS_GOES_HERE'
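The `random_mac` alias asks `openssl rand -hex 6` for six random bytes as hex, and the `sed` expression inserts a colon after every pair and then drops the trailing one. A rough Python 3 sketch of the same formatting step follows; the function name simply mirrors the alias, and it only builds the string without calling `ifconfig`:

```python
import os


def random_mac():
    """Return six random bytes formatted as a colon-separated hex string."""
    # os.urandom(6) yields six random bytes; iterating over bytes gives ints.
    return ":".join("%02x" % byte for byte in os.urandom(6))


if __name__ == "__main__":
    print(random_mac())
```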
@magsol
magsol / parse_hashmap.py
Created March 15, 2013 15:34
This script takes the output of an Apache Mahout job (in HashMap format) and converts it to a histogram.
import numpy as np
import sys
import matplotlib.pyplot as plot
import csv
# read the arguments - need two files
if len(sys.argv) < 3:
    quit('python parse.py [raw data file] [mahout output]')
# read the files
@willurd
willurd / web-servers.md
Last active June 28, 2024 12:38
Big list of http static server one-liners

Each of these commands will run an ad hoc http static server in your current (or specified) directory, available at http://localhost:8000. Use this power wisely.

Discussion on reddit.

Python 2.x

$ python -m SimpleHTTPServer 8000
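On Python 3 the equivalent one-liner is `python3 -m http.server 8000`. For cases where the server needs to be started from code, the following is a minimal sketch using only the standard library; the function name `serve_directory` and its defaults are assumptions made for this example, and it requires Python 3.7 or later for the `directory` argument:

```python
import functools
import http.server
import threading


def serve_directory(directory=".", port=8000):
    """Start a static file server for `directory` in a background thread.

    Returns the server object so the caller can stop it with shutdown().
    """
    # Bind the handler to the chosen directory instead of the process cwd.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling `serve_directory(".", 8000)` serves the current directory at http://localhost:8000 for as long as the calling process stays alive; binding to `127.0.0.1` keeps the server local to the machine, which is a deliberate choice in this sketch.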
@debasishg
debasishg / gist:8172796
Last active May 10, 2024 13:37
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
@ceteri
ceteri / 01.repl.txt
Last active April 17, 2022 18:46
Intro to Apache Spark: general code examples
$ ./bin/spark-shell
14/04/18 15:23:49 INFO spark.HttpServer: Starting HTTP Server
14/04/18 15:23:49 INFO server.Server: jetty-7.x.y-SNAPSHOT
14/04/18 15:23:49 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:49861
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 0.9.1
      /_/
@tsiege
tsiege / The Technical Interview Cheat Sheet.md
Last active June 28, 2024 10:38
This is my technical interview cheat sheet. Feel free to fork it or do whatever you want with it. PLEASE let me know if there are any errors or if anything crucial is missing. I will add more links soon.

ANNOUNCEMENT

I have moved this over to the Tech Interview Cheat Sheet Repo, where it has been expanded and even has code challenges you can run and practice against!

@andrewxhill
andrewxhill / cartodb-utils.py
Last active June 11, 2021 15:27
command-line python interface for manipulating data on CartoDB
import os
import urllib
import urllib2
import base64
import json
import sys
import argparse
try:
    import requests
except ImportError:
@kracekumar
kracekumar / Writing better python code.md
Last active February 19, 2024 03:06
Talk I gave at June bangpypers meetup.

Writing better python code


Swapping variables

Bad code