Olivier Grisel ogrisel

@ogrisel
ogrisel / .gitignore
Created March 11, 2010 18:23
Image fetching and clustering / semantic coding
*.swp
*.pyc
*.png
data/*
build
ogrisel / incoming-links.txt
Created April 8, 2010 17:37
Counting incoming links in DBpedia with unix shell tools
ogrisel / .gitignore
Created April 12, 2010 14:52
t-SNE wrapper to output SVG maps
*.pyc
mnist2500*
build/
pip-log.txt
text-documents/
ogrisel / out.txt
Created June 27, 2010 12:13
Random security terms generator
asynchronous buffer forging
anonymous identity injection
asynchronous SQL skewing
synchronous buffer analysis
reverse jail fuzzing
tainted state inspection
multi-modal integrity recovery
deep state engineering
social state breaking
monotonic state forging
#!/bin/bash
sudo apt-get update
sudo apt-get install -y byobu couchdb python-pip python-lxml
sudo pip install -U tweepy couchdbkit restkit
ogrisel / enet_whitening.py
Created December 13, 2010 01:15
ElasticNet and whitening
"""Evaluating the impact of PCA + whitening on low rank data"""
import numpy as np
from pprint import pprint
from scikits.learn.datasets.samples_generator import make_regression_dataset
from scikits.learn.pca import PCA
from scikits.learn.linear_model import ElasticNetCV
data_opts = {
    'n_train_samples': 5000,
import numpy as np, scipy, scipy.sparse, numpy.linalg, scipy.optimize
from scipy import weave
def project_l1(lbda, sigma):
    "Project positive vector lbda to have l1 norm sigma"
    ll = -np.sort(-lbda)
    cs = 0.
    theta = 0
    prevtheta = 0
import time
import sys
import numpy as np
from scipy import linalg
from scikits.learn.linear_model import Lasso, lars_path
from joblib import Parallel, delayed
################################################################################
# Utilities to spread load on CPUs
ogrisel / .gitignore
Created April 17, 2011 11:33
Cython / Perftools bug report
build
*.so
*.prof
ogrisel / README.md
Created September 7, 2011 22:36
V-Measure and adjustment for chance

This experiment highlights how the V-Measure of two independent, uniformly random labelings depends on the number of clusters when the number of samples is finite.

Intuitively, it seems that for a finite number of samples the V-Measure falls victim to a kind of birthday paradox that naive users might not be aware of.
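
The effect can be reproduced with a small standard-library sketch. This is not the gist's own script: the helper names `entropy`, `v_measure` and `mean_random_v_measure` are hypothetical, and the V-Measure is computed with beta = 1, in which case it reduces to `2 * MI / (H(a) + H(b))`.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def v_measure(labels_a, labels_b):
    """V-Measure with beta = 1, i.e. 2 * MI / (H(a) + H(b))."""
    h_a, h_b = entropy(labels_a), entropy(labels_b)
    if h_a + h_b == 0.0:
        return 1.0  # both labelings consist of a single cluster
    # Mutual information from the entropy of the joint labeling.
    h_joint = entropy(list(zip(labels_a, labels_b)))
    mutual_info = h_a + h_b - h_joint
    return 2.0 * mutual_info / (h_a + h_b)


def mean_random_v_measure(n_samples, n_clusters, n_runs=5, seed=0):
    """Average V-Measure of pairs of independent uniform random labelings."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_runs):
        a = [rng.randrange(n_clusters) for _ in range(n_samples)]
        b = [rng.randrange(n_clusters) for _ in range(n_samples)]
        scores.append(v_measure(a, b))
    return sum(scores) / n_runs


if __name__ == "__main__":
    # Independent labelings share no real structure, yet the score
    # grows with the number of clusters for a fixed sample size.
    for k in (2, 10, 100, 1000):
        print(k, round(mean_random_v_measure(5000, k), 4))
```

Since the two labelings are drawn independently, any score above zero is purely a finite-sample artifact, which is what an adjustment for chance is meant to remove.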

Even if the maximum number of clusters considered (e.g. 10) is small with respect to the number of samples (e.g. 5000), the V-Measure of