Created July 14, 2013 21:12
k-means feature mapper for scikit-learn
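from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.metrics.pairwise import rbf_kernel


class KMeansTransformer(BaseEstimator, TransformerMixin):
    """Map samples to RBF-kernel similarities against fixed, precomputed centroids."""

    def __init__(self, centroids):
        self.centroids = centroids

    def fit(self, X, y=None):
        # Nothing to learn here: the centroids are supplied at construction time.
        return self

    def transform(self, X, y=None):
        # One output column per centroid: exp(-gamma * ||x - c||^2), default gamma.
        return rbf_kernel(X, self.centroids)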
It'd be nicer to learn the centroids in fit.
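
For illustration, here is a rough sketch of what learning the centroids in fit might look like. It is not the gist author's code: the class name `FittedKMeansTransformer`, the `n_clusters`, `gamma` and `random_state` parameters, and the choice of plain `KMeans` are all assumptions made for this example, and the data is synthetic.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel


class FittedKMeansTransformer(BaseEstimator, TransformerMixin):
    # Hypothetical variant: learns the centroids in fit instead of
    # receiving them as a constructor argument.
    def __init__(self, n_clusters=8, gamma=None, random_state=None):
        self.n_clusters = n_clusters
        self.gamma = gamma
        self.random_state = random_state

    def fit(self, X, y=None):
        km = KMeans(n_clusters=self.n_clusters, n_init=10,
                    random_state=self.random_state)
        km.fit(X)
        # Trailing underscore marks an attribute learned during fit.
        self.centroids_ = km.cluster_centers_
        return self

    def transform(self, X, y=None):
        # gamma=None lets rbf_kernel fall back to 1 / n_features.
        return rbf_kernel(X, self.centroids_, gamma=self.gamma)


# Synthetic data, only to show the shapes involved.
rng = np.random.RandomState(0)
X = rng.rand(100, 5)
features = FittedKMeansTransformer(n_clusters=4, random_state=0).fit_transform(X)
print(features.shape)  # (100, 4)
```

A transformer written this way can be dropped into a `Pipeline` and have `n_clusters` tuned by grid search, at the cost of re-running the clustering on every fit.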

@mblondel I learn the centroids in a separate pass over a large unlabeled dataset using MiniBatchKMeans.
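
A minimal sketch of that workflow, under stated assumptions: the arrays below are synthetic stand-ins for the large unlabeled dataset and the data to be mapped, and the cluster count and chunk size are arbitrary illustration values rather than the settings actually used.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics.pairwise import rbf_kernel


class KMeansTransformer(BaseEstimator, TransformerMixin):
    # Same transformer as in the gist, repeated so this sketch runs on its own.
    def __init__(self, centroids):
        self.centroids = centroids

    def fit(self, X, y=None):
        return self

    def transform(self, X, y=None):
        return rbf_kernel(X, self.centroids)


rng = np.random.RandomState(0)
X_unlabeled = rng.rand(1000, 5)  # stand-in for the large unlabeled dataset
X_labeled = rng.rand(50, 5)      # stand-in for the data to be mapped

# Separate pass: stream the unlabeled data through MiniBatchKMeans in chunks.
mbk = MiniBatchKMeans(n_clusters=10, n_init=3, random_state=0)
chunk_size = 100  # arbitrary; each chunk must hold at least n_clusters samples
for start in range(0, len(X_unlabeled), chunk_size):
    mbk.partial_fit(X_unlabeled[start:start + chunk_size])

# Hand the learned centroids to the transformer and map the other data.
mapper = KMeansTransformer(mbk.cluster_centers_)
features = mapper.transform(X_labeled)
print(features.shape)  # (50, 10)
```

Keeping the clustering outside the transformer means the centroids are computed once and reused, rather than being refit whenever a downstream estimator is refit.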

