Jake Vanderplas jakevdp

@jakevdp
jakevdp / AR_crash.py
Created Jun 13, 2011
ARPACK memory error
View AR_crash.py
import numpy as np
from scipy.sparse.linalg import eigs
N = 6
k = 2
# with this random seed, I get a memory error on the third iteration below
np.random.seed(2301)
A = np.random.random((N,N))
jakevdp / README
Created Sep 29, 2011
test code & dataset for scikit-learn issue #365
View README
Code demonstrating the problem seen in issue #365.
To run the example:
tar -zxvf data.tgz
python test.py
jakevdp / banded_tools.py
Created Dec 23, 2011
Benchmarks for eigenvalue decomposition
View banded_tools.py
from time import time
import numpy as np
from scipy.sparse import spdiags, issparse, dia_matrix
from scipy.sparse.linalg import factorized
from scipy import linalg as splinalg
class BandedMatrix(object):
    def __init__(self, data, lu=None):
        if issparse(data):
            if lu:
jakevdp / README.rst
Created Dec 29, 2011
GMM BIC/AIC test
View README.rst

This includes a test of the new GMM routines in https://github.com/bthirion/scikit-learn/tree/gmm-fixes

By changing the line

GMM = mixture.GMM

at the top of the file, we can plot the BIC and AIC for each variant of GMM. Standard GMM works beautifully: it settles on 3 components, which describe the data well. DPGMM and VBGMM produce some unexpected results.
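
For reference, the two criteria being plotted can be written down directly. The sketch below is not taken from the gist: it assumes a full-covariance Gaussian mixture, and the helper names (`gmm_n_params`, `aic`, `bic`) are hypothetical. For both criteria, a lower value indicates the preferred model.

```python
import math


def gmm_n_params(n_components, n_features):
    """Free parameters of a full-covariance Gaussian mixture (assumed model)."""
    weights = n_components - 1  # mixing weights sum to 1, so one is not free
    means = n_components * n_features
    # each covariance matrix is symmetric: d * (d + 1) / 2 free entries
    covs = n_components * n_features * (n_features + 1) // 2
    return weights + means + covs


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_samples) - 2 * log_likelihood
```

Because the BIC penalty grows with the number of samples while the AIC penalty does not, BIC tends to favor fewer components than AIC on large datasets.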

jakevdp / README.rst
Created Jan 5, 2012
General Distance Metrics for BallTree
View README.rst

This is the outline of a framework that will allow general distance metrics to be incorporated into the scikit-learn BallTree. The idea is that we need a fast way to compute the distance between two points under a given metric. In the basic framework here, this involves creating an object that exposes C pointers to a function and a parameter structure, so that the distance function can be called either from Python or directly from Cython with no Python overhead.
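
The interface idea can be illustrated in pure Python. This is only a sketch under stated assumptions: the framework described above exposes C pointers for use from Cython, whereas the hypothetical `DistanceMetric` class below simply bundles a Python function with its parameters. The names `DistanceMetric`, `minkowski`, and `p` are illustrative and are not the gist's API.

```python
def minkowski(x, y, params):
    """Minkowski distance; params["p"] is the order (p=2 is Euclidean)."""
    p = params["p"]
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


class DistanceMetric:
    """Hypothetical pairing of a distance function with its parameters."""

    def __init__(self, func, **params):
        self.func = func      # plays the role of the function pointer
        self.params = params  # plays the role of the parameter structure

    def __call__(self, x, y):
        return self.func(x, y, self.params)


# Two metrics that share one function and differ only in their parameters
euclidean = DistanceMetric(minkowski, p=2)
manhattan = DistanceMetric(minkowski, p=1)
```

Keeping the function and its parameters as separate pieces is what would let a tree implementation call any metric through a single uniform signature.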

jakevdp / Makefile
Created Jan 18, 2012
Example of sphinx image copy
View Makefile
SPHINXBUILD = sphinx-build
BUILDDIR = _build
SPHINXOPTS = -d $(BUILDDIR)/doctrees .
all: html
html:
	$(SPHINXBUILD) -b html $(SPHINXOPTS) $(BUILDDIR)/html
	@echo
	@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
jakevdp / kneighbors_test.py
Created Jan 23, 2012
Showing memory error in BallTree
View kneighbors_test.py
import warnings
from sklearn import datasets
from sklearn.neighbors import NearestNeighbors
import numpy as np
n_points = 1000
n_neighbors = 10
out_dim = 2
n_trials = 100
jakevdp / sklearn_doc.py
Created Sep 30, 2012
Scikit-learn Documentation Template
View sklearn_doc.py
"""
This file has an example function, with a documentation string which should
serve as a template for scikit-learn docstrings.
"""
def sklearn_template(X, y, a=1, flag=True, f=None, **kwargs):
    """This is where a short one-line description goes

    This is where a longer, multi-line description goes. It's not
    required, but might be helpful if more information is needed.
jakevdp / basic_animation.py
Created Oct 6, 2012
Demo for GIF animations
View basic_animation.py
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import animation
# First set up the figure, the axis, and the plot element we want to animate
fig = plt.figure()
ax = fig.add_subplot(111, xlim=(0, 2), ylim=(-2, 2))
line, = ax.plot([], [], lw=2)
# initialization function: plot the background of each frame
jakevdp / README.md
Last active Mar 15, 2021
Numba Ball Tree example
View README.md

Numba Ball Tree

This is a quick attempt at writing a ball tree for nearest neighbor searches using Numba. I've included a pure Python version, and a version with Numba jit decorators. Because class support in Numba is not yet complete, all the code is factored out to stand-alone functions in the Numba version. The resulting code produced by Numba is about 10 times slower than the Cython ball tree in scikit-learn. My guess is that part of this stems from lack of inlining in Numba, while the rest is due to some sort of overhead