Andreas Mueller amueller
View t-shirt analyzer.ipynb
@amueller
amueller / shuffle_once.ipynb
Created Dec 11, 2014
Shuffle once benchmarks
View shuffle_once.ipynb
View scipy_interpolation_weirdness.ipnb
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
@amueller
amueller / sklearn_tutorial_draft.rst
Last active Aug 29, 2015
scipy scikit-learn tutorial draft
View sklearn_tutorial_draft.rst

Tutorial Topic

This tutorial provides an introduction to machine learning and scikit-learn "from the ground up". We will start with basic concepts of machine learning and implement them using scikit-learn. Going through the characteristics of several methods in detail, we will discuss how to pick an algorithm for your application, how to set its parameters, and how to evaluate performance.

Please provide a more detailed abstract of your tutorial (again, see last year's tutorials).

Machine learning is the task of extracting knowledge from data, often with the goal of generalizing to new, unseen data. Applications of machine learning now touch nearly every aspect of everyday life, from the face detection in our

@amueller
amueller / elkan_bench.py
Last active Aug 29, 2015
benching elkan k-means implementation
View elkan_bench.py
from sklearn.cluster import KMeans
from time import time
from sklearn.datasets import load_digits, fetch_mldata, load_iris, fetch_20newsgroups_vectorized
def bench_kmeans(data, n_clusters=5, init='random', n_init=1):
    # Time Lloyd's algorithm on the given data.
    start = time()
    km1 = KMeans(algorithm='lloyd', n_clusters=n_clusters, random_state=0, init=init, n_init=n_init).fit(data)
    print("lloyd time: %f inertia: %f" % (time() - start, km1.inertia_))
    # Time Elkan's algorithm with identical settings.
    start = time()
    km2 = KMeans(algorithm='elkan', n_clusters=n_clusters, random_state=0, init=init, n_init=n_init).fit(data)
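The timing pattern in the benchmark above (record a start time, fit, print the elapsed time) could be factored into a small helper. What follows is only a sketch and is not part of the original gist: `timed` is a hypothetical name, and it uses the standard library's `time.perf_counter` instead of `time.time`, on the assumption that a monotonic, high-resolution clock is preferable for short measurements.

```python
from time import perf_counter


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds).

    Hypothetical helper for illustration; not part of the original gist.
    """
    start = perf_counter()
    result = fn(*args, **kwargs)
    elapsed = perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in workload; any callable can be timed the same way.
    total, seconds = timed(sum, range(1_000_000))
    print("sum=%d time: %f" % (total, seconds))
```

With scikit-learn installed, the same helper could presumably wrap each fit, for example `timed(KMeans(algorithm='elkan', n_clusters=5, n_init=1).fit, data)`, since `fit` returns the fitted estimator.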
@amueller
amueller / magic_constructor_estimator.ipynb
Created Apr 14, 2015
No more double underscores in sklearn.
View magic_constructor_estimator.ipynb
@amueller
amueller / knn_imputation_speed.ipynb
Created Aug 25, 2015
np.multiply test for knn imputation
View knn_imputation_speed.ipynb
@amueller
amueller / kneighbors_weired.py
Created Jan 23, 2012
Weird kneighbors behaviour
View kneighbors_weired.py
from sklearn import datasets, manifold
from sklearn.neighbors import NearestNeighbors
import numpy as np
n_points = 1000
n_neighbors = 10
out_dim = 2
n_trials = 100
@amueller
amueller / test_c.py
Created Apr 1, 2012
Testing influence of dataset size on C
View test_c.py
import numpy as np
from sklearn import datasets
from sklearn.cross_validation import ShuffleSplit
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC
from sklearn.preprocessing import Scaler
#data = datasets.load_digits()
data = datasets.fetch_mldata("usps")