@jnothman
Last active January 1, 2016 01:39
Stacking in scikit-learn, a quick attempt
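from sklearn.base import BaseEstimator, TransformerMixin


class Transformer(BaseEstimator, TransformerMixin):
    """Wrap an arbitrary function fn so it can be used as a pipeline step."""

    def __init__(self, fn):
        self.fn = fn

    def fit(self, X, y=None):
        # Nothing to learn: fn is applied as-is.
        return self

    def transform(self, X):
        return self.fn(X)


if __name__ == '__main__':
    # cross_val_score lives in sklearn.model_selection since scikit-learn 0.18;
    # the older sklearn.cross_validation module has been removed.
    from sklearn import datasets, svm, pipeline, model_selection

    iris = datasets.load_iris()
    p = pipeline.Pipeline([
        # First stage: a LinearSVC fitted up front on the full dataset.
        ('t', Transformer(svm.LinearSVC().fit(iris.data, iris.target).decision_function)),
        ('c', svm.LinearSVC()),
    ])
    print(model_selection.cross_val_score(p, iris.data, iris.target))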

npow commented Nov 22, 2014

Is this going to introduce bias since the transformer was fitted on the entire dataset?
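
The concern above is that the first-stage LinearSVC has already seen every sample, including those later held out by cross-validation. One possible way around this, sketched below, is to have the transformer fit its own copy of the estimator inside fit, so the pipeline only ever trains it on the current training fold. The class name EstimatorTransformer and its estimator parameter are hypothetical and not part of the gist or of scikit-learn.

```python
from sklearn import datasets, pipeline, svm
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.model_selection import cross_val_score


class EstimatorTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical wrapper: exposes an estimator's decision_function as features."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        # Fit a fresh copy on the data the pipeline passes in, which is
        # only the training fold when run under cross-validation.
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def transform(self, X):
        # Decision values of the fold-local first-stage model.
        return self.estimator_.decision_function(X)


iris = datasets.load_iris()
p = pipeline.Pipeline([
    ('t', EstimatorTransformer(svm.LinearSVC())),
    ('c', svm.LinearSVC()),
])
scores = cross_val_score(p, iris.data, iris.target)
print(scores)
```

This removes the leak into the held-out fold, but the second-stage classifier is still trained on decision values for samples the first stage was fitted on. Stacking setups usually go further and feed the second stage out-of-fold predictions; scikit-learn 0.22 and later provide sklearn.ensemble.StackingClassifier for that.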
