# Original code from tinrtgu on Kaggle under WTFPL license
# Relicensed to BSD 3-clause (it does say do what you want...)
# Authors: Kyle Kastner
# License: BSD 3-clause
# Reference links:
# Adaptive learning: http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41159.pdf
# Criteo scalable response prediction: http://people.csail.mit.edu/romer/papers/TISTRespPredAds.pdf
# Vowpal Wabbit (hashing trick): https://github.com/JohnLangford/vowpal_wabbit/
# Hashing Trick: http://arxiv.org/pdf/0902.2206.pdf
@nkhuyu
nkhuyu / NormalizedGini.java
Last active August 29, 2015 14:26 — forked from amihalik/NormalizedGini.java
NormalizedGini
import java.util.Arrays;
import java.util.Comparator;

public class NormalizedGini {
    private static double gini(double[] a, double[] p, double[] w) {
        int len = a.length;
        if (p.length != len || w.length != len) {
            throw new IllegalArgumentException("array length not equal");
        }
@nkhuyu
nkhuyu / Basic timeseries exploration.ipynb
Created November 18, 2015 04:59 — forked from cast42/Basic timeseries exploration.ipynb
Exploration of a basic time-series approach to forecasting sales in the Rossmann Kaggle competition
@nkhuyu
nkhuyu / springer-free-maths-books.md
Created December 29, 2015 21:06 — forked from bishboria/springer-free-maths-books.md
Springer have made a bunch of books available for free; here are the direct links
@nkhuyu
nkhuyu / try_distance_corr.py
Created January 22, 2016 03:44 — forked from josef-pkt/try_distance_corr.py
distance covariance and correlation
# -*- coding: utf-8 -*-
"""
Created on Fri Jun 15 14:00:29 2012
Author: Josef Perktold
License: MIT, BSD-3 (for statsmodels)
http://en.wikipedia.org/wiki/Distance_correlation
Yaroslav and Satrajit on sklearn mailing list
Originally:
https://gist.github.com/7565976a89d5da1511ce
Hi Donald (and Martin),
Thanks for pinging me; it's nice to know Typesafe is keeping tabs on this, and I
appreciate the tone. This is a Yegge-long response, but given that you and
Martin are the two people best-situated to do anything about this, I'd rather
err on the side of giving you too much to think about. I realize I'm being very
critical of something in which you've invested a great deal (both financially
@nkhuyu
nkhuyu / pdio.py
Created May 2, 2016 22:45 — forked from luispedro/pdio.py
Save & load from a pandas DataFrame/Series
import numpy.lib
import numpy as np
import pandas as pd
try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3: cPickle was merged into pickle
def save_pandas(fname, data):
    '''Save DataFrame or Series

    Parameters
    ----------
@nkhuyu
nkhuyu / extract_emails_from_text.py
Created May 20, 2016 17:43 — forked from dideler/example.md
A Python script for extracting email addresses from text files. You can pass it multiple files. It prints the email addresses to stdout, one address per line. For ease of use, remove the .py extension and place it in a directory on your $PATH (e.g. /usr/local/bin/) to run it like a built-in command.
#!/usr/bin/env python
#
# Extracts email addresses from one or more plain text files.
#
# Notes:
# - Does not save to file (pipe the output to a file if you want it saved).
# - Does not check for duplicates (which can easily be done in the terminal).
#
# (c) 2013 Dennis Ideler <ideler.dennis@gmail.com>
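The preview stops at the header comment, so the script body is not shown here. The following is only a minimal sketch of the behaviour the description states (read one or more files, print each address on its own line, no de-duplication), not the author's code. The regular expression is a deliberately simple assumption, the names `EMAIL_RE`, `extract_emails`, and `main` are hypothetical, and the sketch targets Python 3.

```python
#!/usr/bin/env python3
import re
import sys

# Simplified pattern (an assumption for illustration): it is not RFC 5322
# compliant and will miss or over-match some unusual addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return every substring of `text` that matches EMAIL_RE, in order."""
    return EMAIL_RE.findall(text)


def main(paths):
    """Print the addresses found in each file, one per line, duplicates included."""
    for path in paths:
        # errors="replace" keeps the scan going on files with stray bytes.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                for address in extract_emails(line):
                    print(address)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a hypothetical name such as `extract_emails`, it could be run as `extract_emails a.txt b.txt | sort -u`, which leaves de-duplication to the shell as the notes above suggest.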
@nkhuyu
nkhuyu / rank_metrics.py
Created December 12, 2016 21:44 — forked from bwhite/rank_metrics.py
Ranking Metrics
"""Information Retrieval metrics
Useful Resources:
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt
http://www.nii.ac.jp/TechReports/05-014E.pdf
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf
Learning to Rank for Information Retrieval (Tie-Yan Liu)
"""
import numpy as np