@mrgloom
mrgloom / kmtransformer.py
Last active August 31, 2015 13:46 — forked from larsmans/kmtransformer.py
k-means feature mapper for scikit-learn
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.metrics.pairwise import rbf_kernel


class KMeansTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, centroids):
        self.centroids = centroids

    def fit(self, X, y=None):
        return self
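The preview above stops before any transform method is shown. The following is a minimal sketch of how such a transformer might be completed, assuming (the preview does not confirm this) that transform maps each sample to its RBF-kernel similarity to every centroid, which is what the rbf_kernel import suggests. The example arrays are made up for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.metrics.pairwise import rbf_kernel


class KMeansTransformer(BaseEstimator, TransformerMixin):
    """Map samples to RBF similarities against fixed centroids."""

    def __init__(self, centroids):
        self.centroids = centroids

    def fit(self, X, y=None):
        # Nothing to learn: the centroids are supplied up front.
        return self

    def transform(self, X, y=None):
        # Assumed behavior, not taken from the preview:
        # one output column per centroid, values in (0, 1].
        return rbf_kernel(X, self.centroids)


# Hypothetical data: two centroids, three samples in 2-D.
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
features = KMeansTransformer(centroids).fit_transform(X)
```

In practice the centroids would typically come from a fitted KMeans model (its cluster_centers_ attribute), and the resulting features could be fed to a linear classifier.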

What

Roll your own IPython Notebook server on Amazon Web Services (EC2) using the Free Tier.

What are we using? What do you need?

  • An active AWS account. First-time sign-ups are eligible for the Free Tier for a year.
  • One micro-tier EC2 instance.
  • The stock Ubuntu Server AMI, which we will customize.
  • Anaconda for Python.
  • Coffee/Beer/Time


I've been interested in computer vision for a long time, but I didn't have the free time to make any progress until this holiday season. Over Christmas and New Year's I experimented with various OpenCV techniques for detecting road signs and other objects of interest to OpenStreetMap. After some failed experiments with thresholding and feature detection, the excellent /r/computervision suggested the dlib C++ library, because its documentation is more consistently good and its pre-built tools are faster.

After a day or two figuring out how to compile the examples, I finally made some progress:

Compiling dlib C++ on a Mac with Homebrew

  1. Clone dlib from GitHub to your local machine:
mrgloom / CUR4FIC
Created July 24, 2014 13:30 — forked from goldingn/CUR4FIC
# clear the workspace
rm(list = ls())
# load the relevant libraries
# install.packages("rCUR")
library(rCUR) # for CUR decomposition
# install.packages("irlba")
library(irlba) # for fast svd
mrgloom / svm.py
Created June 4, 2014 09:41 — forked from mblondel/svm.py
# Mathieu Blondel, September 2010
# License: BSD 3 clause
import numpy as np
from numpy import linalg
import cvxopt
import cvxopt.solvers
def linear_kernel(x1, x2):
    return np.dot(x1, x2)
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_mldata
from sklearn.decomposition import FastICA, PCA
from sklearn.cluster import KMeans
# fetch natural image patches
image_patches = fetch_mldata("natural scenes data")
X = image_patches.data
#!/usr/bin/python
#
# K-means clustering using Lloyd's algorithm in pure Python.
# Written by Lars Buitinck. This code is in the public domain.
#
# The main program runs the clustering algorithm on a bunch of text documents
# specified as command-line arguments. These documents are first converted to
# sparse vectors, represented as lists of (index, value) pairs.
from collections import defaultdict
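
The header comment above describes Lloyd's algorithm over sparse vectors stored as lists of (index, value) pairs, but the preview ends at the first import. What follows is a rough, independent sketch of that idea, not the original author's code. All function names, the random initialization, and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def sparse_sqdist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    diff = defaultdict(float)
    for i, v in a:
        diff[i] += v
    for i, v in b:
        diff[i] -= v
    return sum(d * d for d in diff.values())


def sparse_mean(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for i, v in vec:
            total[i] += v
    n = len(vectors)
    return sorted((i, s / n) for i, s in total.items())


def kmeans(vectors, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid updates."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct input vectors chosen at random.
    centroids = rng.sample(vectors, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sparse_sqdist(vec, centroids[c]))
            for vec in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = sparse_mean(members)
    return labels, centroids


# Toy "documents": two groups that use different feature indices.
docs = [[(0, 1.0)], [(0, 1.2)], [(5, 3.0)], [(5, 3.1)]]
labels, centroids = kmeans(docs, k=2)
```

Here the first two vectors should end up sharing one label and the last two the other. Which cluster gets label 0 depends on the random initialization.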