Dan Ofer ddofer

@chitchcock
chitchcock / 20111011_SteveYeggeGooglePlatformRant.md
Created October 12, 2011 15:53
Stevey's Google Platforms Rant


I was at Amazon for about six and a half years, and now I've been at Google for that long. One thing that struck me immediately about the two companies -- an impression that has been reinforced almost daily -- is that Amazon does everything wrong, and Google does everything right. Sure, it's a sweeping generalization, but a surprisingly accurate one. It's pretty crazy. There are probably a hundred or even two hundred different ways you can compare the two companies, and Google is superior in all but three of them, if I recall correctly. I actually did a spreadsheet at one point but Legal wouldn't let me show it to anyone, even though recruiting loved it.

I mean, just to give you a very brief taste: Amazon's recruiting process is fundamentally flawed by having teams hire for themselves, so their hiring bar is incredibly inconsistent across teams, despite various efforts they've made to level it out. And their operations are a mess; they don't real

#!/usr/local/bin/python
# Copyright (C) 2004 Rune Linding & Lars Juhl Jensen - EMBL
# The DisEMBL is licensed under the GPL license
# (http://www.opensource.org/licenses/gpl-license.php)
# DisEMBL pipeline
from string import *
from sys import argv
from Bio import SeqIO
import fpformat
@arq5x
arq5x / grantham-dict.py
Last active November 13, 2018 15:25
Convert Grantham Amino Acid matrix into Python dict.
#!/usr/bin/env python
import sys
import pprint
def make_grantham_dict(grantham_mat_file):
    """
    Citation: http://www.ncbi.nlm.nih.gov/pubmed/4843792
    Provenance: http://www.genome.jp/dbget-bin/www_bget?aaindex:GRAR740104
@larsmans
larsmans / kmtransformer.py
Created July 14, 2013 21:12
k-means feature mapper for scikit-learn
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.metrics.pairwise import rbf_kernel
class KMeansTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, centroids):
        self.centroids = centroids
    def fit(self, X, y=None):
        return self
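The preview stops before any `transform` method, so how the gist maps features is not shown here. Since it imports `rbf_kernel`, one plausible reading is that each sample is mapped to its RBF similarity to every centroid. The sketch below illustrates that idea in plain Python under that assumption; `rbf_features` and its `gamma` default are hypothetical names and values, not taken from the gist.

```python
import math


def rbf_features(X, centroids, gamma=1.0):
    """Map each row of X to its RBF similarity to every centroid.

    Hypothetical helper: computes exp(-gamma * ||x - c||^2), the same
    quantity an RBF kernel evaluates between a sample and a centroid.
    """
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, c)))
            for c in centroids
        ]
        for x in X
    ]


# One sample, two centroids: the sample sits exactly on the first centroid.
features = rbf_features([[0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]])
```

Each output row has one column per centroid, so the number of centroids sets the dimensionality of the new feature space.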
@kiyukuta
kiyukuta / autoencoder.py
Last active January 23, 2020 06:16
Minimum implementation of denoising autoencoder. Error function is cross-entropy of reconstruction. Optimizing by SGD with mini-batch. Dataset is available at http://deeplearning.net/data/mnist/mnist.pkl.gz
#coding: utf8
"""
1. Download this gist.
2. Get the MNIST data.
wget http://deeplearning.net/data/mnist/mnist.pkl.gz
3. Run this code.
python autoencoder.py 100 -e 1 -b 20 -v
"""
import numpy
import argparse
@sloria
sloria / bobp-python.md
Last active May 12, 2024 06:54
A "Best of the Best Practices" (BOBP) guide to developing in Python.

The Best of the Best Practices (BOBP) Guide for Python

A "Best of the Best Practices" (BOBP) guide to developing in Python.

In General

Values

  • "Build tools for others that you want to be built for you." - Kenneth Reitz
  • "Simplicity is always better than functionality." - Pieter Hintjens
@avrilcoghlan
avrilcoghlan / find_lca_of_go_terms.py
Created January 31, 2014 17:03
Python script to find last common ancestors of GO terms
import sys
import os
from collections import defaultdict
import calc_dists_to_top_of_GO
import calc_dists_to_top_of_GO_using_bfs
class Error (Exception): pass
#====================================================================#
@ameyavilankar
ameyavilankar / preprocess.py
Last active January 25, 2023 10:19
Removing Punctuation and Stop Words nltk
import string
import nltk
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
import re
def preprocess(sentence):
    sentence = sentence.lower()
    tokenizer = RegexpTokenizer(r'\w+')
    tokens = tokenizer.tokenize(sentence)
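The preview ends after tokenization, before any stop-word filtering is visible. As a rough illustration of the whole pipeline named in the gist title, here is a dependency-free sketch that swaps NLTK's `RegexpTokenizer` for `re.findall` and NLTK's English stop-word corpus for a tiny hard-coded set. `preprocess_sketch` and `STOP_WORDS` are hypothetical names, and the stop-word list is deliberately incomplete.

```python
import re

# Hypothetical, deliberately tiny stop-word list; NLTK's English list is much longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def preprocess_sketch(sentence):
    """Lowercase the text, drop punctuation, and remove stop words."""
    # \w+ keeps runs of letters, digits, and underscores, so punctuation is discarded.
    tokens = re.findall(r"\w+", sentence.lower())
    return " ".join(token for token in tokens if token not in STOP_WORDS)


cleaned = preprocess_sketch("The cat, and the hat!")
```

Using a set for the stop words keeps each membership check constant-time, which matters once the list grows to a realistic size.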
@vals
vals / Clusters from dendrograms.ipynb
Created June 28, 2014 23:37
IPython notebook illustrating how to extract cluster elements in Python Dendrograms
@syhw
syhw / dnn.py
Created July 13, 2014 20:55
Deep learning in one file.
"""
A deep neural network with or w/o dropout in one file.
"""
import numpy
import theano
import sys
from theano import tensor as T
from theano import shared
from theano.tensor.shared_randomstreams import RandomStreams