@Kornel
Kornel / gist:11e69ef9fd2e9380a21991029fbecaf9
Created November 26, 2017 20:19
scikit master branch build
This file has been truncated, but you can view the full file.
rm -f tags
python setup.py clean
running clean
removing 'build/temp.macosx-10.7-x86_64-3.6' (and everything under it)
removing 'build'
Partial import of sklearn during the build process.
Will remove generated .c files
rm -rf dist
python setup.py build_ext -i
blas_opt_info:
@Kornel
Kornel / SparkEnumUDT.scala
Created November 22, 2017 07:34
enum spark UDT
package org.apache.spark.sql.types
import xxx.Condition
import xxx.Department
class ConditionUDT extends UserDefinedType[Condition.Value] {
  override def sqlType: DataType = IntegerType
  override def serialize(obj: Condition.Value): Int = obj.id
@Kornel
Kornel / rank_metrics.py
Created October 20, 2017 08:19 — forked from bwhite/rank_metrics.py
Ranking Metrics
"""Information Retrieval metrics
Useful Resources:
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt
http://www.nii.ac.jp/TechReports/05-014E.pdf
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf
Learning to Rank for Information Retrieval (Tie-Yan Liu)
"""
import numpy as np
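The preview stops at the imports, so the metric functions themselves are not shown here. As a rough sketch of two common ranking metrics, assuming each ranking is a list of binary relevance flags ordered from best to worst result, precision at k and mean reciprocal rank might look like this in plain Python. The function names and signatures are illustrative assumptions and are not taken from the truncated file.

```python
def precision_at_k(relevance, k):
    """Fraction of the first k results that are relevant (nonzero)."""
    if k < 1:
        raise ValueError("k must be >= 1")
    if len(relevance) < k:
        raise ValueError("relevance list is shorter than k")
    return sum(1 for r in relevance[:k] if r) / k


def mean_reciprocal_rank(rankings):
    """Mean over queries of 1 / rank of the first relevant result.

    A query with no relevant result contributes 0.
    """
    total = 0.0
    for relevance in rankings:
        for rank, r in enumerate(relevance, start=1):
            if r:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Hypothetical example: three queries, first hit at ranks 3, 2 and 1.
example = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
mrr = mean_reciprocal_rank(example)
p_at_3 = precision_at_k([0, 1, 1], 3)
```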
@Kornel
Kornel / tsne_template.py
Created June 29, 2017 19:13
scikit TSNE template for multiple perplexities / iterations
fig, ax = plt.subplots(nrows=2, ncols=3)
perplexity = [[2, 10, 20],
              [30, 50, 100]]
iterations = [[1000, 1000, 1000],
              [1000, 1000, 1000]]
for ri, row in enumerate(ax):
    for ci, col in enumerate(row):
@Kornel
Kornel / inline_bar.py
Created April 10, 2017 17:26
Inline barplot for jupyter
from tempfile import NamedTemporaryFile
import base64
import matplotlib.pyplot as plt
def inline_hist(data, figsize=(4, 4), **kwargs):
    data = list(data)
    fig, ax = plt.subplots(1, 1, figsize=figsize, **kwargs)
    for k, v in ax.spines.items():
        v.set_visible(False)
@Kornel
Kornel / UlamSpiral.java
Last active December 3, 2017 09:21
Ulam's spiral (sort of, without primes yet)
import static java.lang.Math.sqrt;
public class S {
    public static void main(String[] args) {
        int[][] s = createSpiral(20);
        print(s);
    }
    private static int[][] createSpiral(int n) {
@Kornel
Kornel / alice.py
Created October 12, 2016 12:23
Communication Complexity Median in O(log(n)*log(n))
import scipy.stats
import numpy as np
n = 200
s = scipy.stats.randint.rvs(0, n, size = 2)
X = scipy.stats.randint.rvs(0, n, size = s[0] * 2 + 1)
Y = scipy.stats.randint.rvs(0, n, size = s[1] * 2)
i = 1
@Kornel
Kornel / service-checklist.md
Created September 29, 2016 07:13 — forked from acolyer/service-checklist.md
Internet Scale Services Checklist

Internet Scale Services Checklist

A checklist for designing and developing internet-scale services, inspired by James Hamilton's 2007 paper "On Designing and Deploying Internet-Scale Services."

Basic tenets

  • Does the design expect failures to happen regularly and handle them gracefully?
  • Have we kept things as simple as possible?
@Kornel
Kornel / AB.ipynb
Created September 26, 2016 19:17
A B tests - MCMC vs proportion test, WIP
@Kornel
Kornel / Normal approximation of the binomial distribution.ipynb
Created September 26, 2016 19:14
Normal approximation of the binomial distribution
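The notebook contents are not available in this listing. As a minimal sketch of the idea in the title, assuming the standard approximation of Binomial(n, p) by a normal distribution with mean n*p and variance n*p*(1-p), plus a continuity correction of 0.5, the exact and approximate CDFs can be compared using only the standard library. All function names below are hypothetical and not taken from the notebook.

```python
import math


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(x, mu, sigma):
    """CDF of Normal(mu, sigma^2), expressed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def binom_cdf_normal_approx(k, n, p):
    """Approximate P(X <= k) with a normal CDF and continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return normal_cdf(k + 0.5, mu, sigma)


# Hypothetical parameters chosen only for illustration.
n, p, k = 100, 0.5, 50
exact = binom_cdf(k, n, p)
approx = binom_cdf_normal_approx(k, n, p)
error = abs(exact - approx)
```

The approximation tends to be better when both n*p and n*(1-p) are reasonably large; for small n or p close to 0 or 1 the exact sum is usually the safer choice.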