Bairen Yi byronyi

@huyng
huyng / matplotlibrc
Created February 8, 2011 15:50
my default matplotlib settings
### MATPLOTLIBRC FORMAT
# This is a sample matplotlib configuration file - you can find a copy
# of it on your system in
# site-packages/matplotlib/mpl-data/matplotlibrc. If you edit it
# there, please note that it will be overridden in your next install.
# If you want to keep a permanent local copy that will not be
# over-written, place it in HOME/.matplotlib/matplotlibrc (unix/linux
# like systems) and C:\Documents and Settings\yourname\.matplotlib
# (win32 systems).
@andrix
andrix / iunzip.py
Created July 4, 2011 13:33
python iterable unzip
import itertools
from operator import itemgetter
def iunzip(iterable):
    """Same as zip(*iterable), but returns iterators instead of
    expanding the input. Mostly useful for large sequences. (Python 2.)"""
    # Duplicate the input so the first row can be peeked at without losing it.
    _tmp, iterable = itertools.tee(iterable, 2)
    # One independent copy of the input per column of the first row.
    iters = itertools.tee(iterable, len(_tmp.next()))
    return (itertools.imap(itemgetter(i), it) for i, it in enumerate(iters))
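The snippet above relies on Python 2 names (`.next()` and `itertools.imap`). A minimal sketch of the same idea for Python 3 might look like the following. The empty-input guard and the choice to return a tuple rather than a generator are assumptions added here, not part of the original gist.

```python
import itertools
from operator import itemgetter


def iunzip(iterable):
    """Lazily unzip an iterable of equal-length rows into one iterator per column."""
    probe, iterable = itertools.tee(iterable, 2)
    try:
        # Peek at the first row to learn how many columns there are.
        width = len(next(probe))
    except StopIteration:
        # Assumption: an empty input yields no columns.
        return ()
    # One independent copy of the input per column.
    iters = itertools.tee(iterable, width)
    return tuple(map(itemgetter(i), it) for i, it in enumerate(iters))


pairs = ((n, n * n) for n in range(4))
numbers, squares = iunzip(pairs)
print(list(numbers))  # [0, 1, 2, 3]
print(list(squares))  # [0, 1, 4, 9]
```

Note that `itertools.tee` buffers items until every copy has consumed them, so draining one column completely before touching the others keeps the whole input in memory.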
@HarryR
HarryR / zmqstub.c
Created September 23, 2011 13:34
zmq & libevent stub
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <libgen.h>
#include <signal.h>
#include <err.h>
#include <assert.h>
#include <zmq.h>
@justinkamerman
justinkamerman / hadoop-setup.sh
Created June 18, 2012 15:39
Cloud-init scripts for configuring a Ubuntu image for Hadoop
WRITE-MIME_MULTIPART=./bin/write-mime-multipart
.PHONY: clean
cloud-config.txt: ubuntu-config.txt hadoop-setup.sh
	$(WRITE-MIME_MULTIPART) --output=$@ $^
clean:
	$(RM) cloud-config.txt
@pprett
pprett / boston.json
Created October 1, 2012 18:28
Decision Tree Viewer (D3 and Sklearn)
{"error": 42716.2954, "samples": 506, "value": [22.532806324110698], "label": "RM <= 6.94", "type": "split", "children": [{"error": 17317.3210, "samples": 430, "value": [19.93372093023257], "label": "LSTAT <= 14.40", "type": "split", "children": [{"error": 6632.2175, "samples": 255, "value": [23.349803921568636], "label": "DIS <= 1.38", "type": "split", "children": [{"error": 390.7280, "samples": 5, "value": [45.58], "label": "CRIM <= 10.59", "type": "split", "children": [{"error": 0.0000, "samples": 4, "value": [50.0], "label": "Leaf - 4", "type": "leaf"}, {"error": 0.0000, "samples": 1, "value": [27.9], "label": "Leaf - 5", "type": "leaf"}]}, {"error": 3721.1632, "samples": 250, "value": [22.90520000000001], "label": "RM <= 6.54", "type": "split", "children": [{"error": 1636.0675, "samples": 195, "value": [21.629743589743576], "label": "LSTAT <= 7.57", "type": "split", "children": [{"error": 129.6307, "samples": 43, "value": [23.969767441860473], "label": "TAX <= 222.50", "type": "split", "children": [{"err
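The preview above is cut off, but the visible part shows the node shape: each node carries `error`, `samples`, `value`, `label`, and `type`, and split nodes also carry `children`. As a rough sketch of how such a tree could be walked, the following loads the one fully closed subtree visible in the preview and collects its leaves. The helper name `iter_leaves` is hypothetical and does not come from the gist.

```python
import json

# The only subtree that appears complete in the truncated preview.
node_json = """
{"error": 390.7280, "samples": 5, "value": [45.58], "label": "CRIM <= 10.59",
 "type": "split", "children": [
   {"error": 0.0000, "samples": 4, "value": [50.0], "label": "Leaf - 4", "type": "leaf"},
   {"error": 0.0000, "samples": 1, "value": [27.9], "label": "Leaf - 5", "type": "leaf"}]}
"""


def iter_leaves(node):
    """Yield every leaf node below (and including) the given node, depth first."""
    if node["type"] == "leaf":
        yield node
    else:
        for child in node.get("children", []):
            yield from iter_leaves(child)


tree = json.loads(node_json)
leaves = list(iter_leaves(tree))
print([leaf["label"] for leaf in leaves])  # ['Leaf - 4', 'Leaf - 5']
# The leaf sample counts add up to the sample count of the split node.
print(sum(leaf["samples"] for leaf in leaves) == tree["samples"])  # True
```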
@MLnick
MLnick / sklearn-lr-spark.py
Created February 4, 2013 14:29
SGD in Spark using Scikit-learn
import sys
from pyspark.context import SparkContext
from numpy import array, random as np_random
from sklearn import linear_model as lm
from sklearn.base import copy
N = 10000 # Number of data points
D = 10 # Number of dimensions
ITERATIONS = 5
class Solution{
public static boolean isPermutation(String s1, String s2){
if(s1 == null || s2 == null ) return false;
if(s1.length() != s2.length()) return false;
if(s1.length() == 0) return true; // If both strings are empty, s2 is trivially a permutation of s1.
char[] sc1 = s1.toCharArray();
char[] sc2 = s2.toCharArray();
HashMap<Character, Integer> temp = new HashMap<Character, Integer>(); // generic arguments must be reference types and match on both sides
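The Java preview stops right after the map is declared, so the rest of the method is not shown. Assuming the map was meant to count characters, one common way to finish the check is to compare per-character counts. The sketch below shows that approach in Python; it is an illustration of the technique, not the author's implementation.

```python
from collections import Counter


def is_permutation(s1, s2):
    """Return True if s2 uses exactly the same characters as s1, with the same counts."""
    if s1 is None or s2 is None:
        return False
    if len(s1) != len(s2):
        return False
    # Two strings are permutations of each other when their character counts match.
    return Counter(s1) == Counter(s2)


print(is_permutation("listen", "silent"))  # True
print(is_permutation("aab", "abb"))        # False
print(is_permutation("", ""))              # True
```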
# Author: Vlad Niculae <vlad@vene.ro>
# Licence: BSD
from __future__ import division, print_function
import numpy as np
from sklearn.utils import check_random_state
class SquaredLoss(object):
    def loss(self, y, pred):
@vasanthk
vasanthk / System Design.md
Last active June 26, 2024 17:33
System Design Cheatsheet

System Design Cheatsheet

Picking the right architecture = Picking the right battles + Managing trade-offs

Basic Steps

  1. Clarify and agree on the scope of the system
  • Use cases (descriptions of sequences of events that, taken together, lead to a system doing something useful)
    • Who is going to use it?
    • How are they going to use it?
@yaroslavvb
yaroslavvb / local_distributed_benchmark.py
Last active September 16, 2021 10:26
Benchmark distributed tensorflow locally by adding vector of ones on worker2 to variable on worker1 as fast as possible
"""Benchmark tensorflow distributed by adding vector of ones on worker2
to variable on worker1 as fast as possible.
On a 2014 MacBook with TensorFlow 0.10, this shows:
Local rate: 2175.28 MB per second
Distributed rate: 107.13 MB per second
"""