@zacstewart
Last active November 22, 2015 12:33
Example transformer for tapping into your Pipeline
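from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import FeatureUnion, Pipeline

# EssayExractor, LengthTransformer and MispellingCountTransformer are custom
# transformers that are not defined in this gist.

class TapTransformer(TransformerMixin, BaseEstimator):
    """Calls fn on the data flowing through the pipeline, then passes it on unchanged."""

    def __init__(self, fn):
        self.fn = fn

    def transform(self, X, **transform_params):
        self.fn(X)
        return X

    def fit(self, X, y=None, **fit_params):
        return self

features = []

def save_features(f):
    features.append(f)

pipeline = Pipeline([
    ('extract_essays', EssayExractor()),
    ('features', FeatureUnion([
        ('ngram_tf_idf', Pipeline([
            ('counts', CountVectorizer()),
            ('tf_idf', TfidfTransformer())
        ])),
        ('essay_length', LengthTransformer()),
        ('misspellings', MispellingCountTransformer())
    ])),
    ('print_features', TapTransformer(lambda X: print(X))),
    ('save_features', TapTransformer(save_features)),
    ('classifier', MultinomialNB())
])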
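
The pipeline above depends on custom transformers that the gist does not define, so it cannot be run as posted. Below is a minimal, self-contained sketch of the same tap idea. The toy documents, labels, and the `captured` list are assumptions made up for illustration, and the custom feature steps are left out.

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline


class TapTransformer(TransformerMixin, BaseEstimator):
    """Pass data through unchanged, calling fn on it as a side effect."""

    def __init__(self, fn):
        self.fn = fn

    def fit(self, X, y=None, **fit_params):
        return self

    def transform(self, X, **transform_params):
        self.fn(X)
        return X


# Hypothetical toy data, only here to make the sketch runnable.
docs = ["good essay", "bad essay", "good good writing", "bad bad writing"]
labels = [1, 0, 1, 0]

captured = []  # each call to transform() appends one feature matrix

pipeline = Pipeline([
    ("counts", CountVectorizer()),
    ("tf_idf", TfidfTransformer()),
    ("save_features", TapTransformer(captured.append)),
    ("classifier", MultinomialNB()),
])

pipeline.fit(docs, labels)            # the tap fires once while fitting
predictions = pipeline.predict(docs)  # and once more while predicting
```

Because the tap runs on every pass through the pipeline, `captured` grows with each `fit` or `predict` call. Clear the list between runs if only the latest matrix matters.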
@iamb4ne1
Can you provide examples of your LengthTransformer and MispellingCountTransformer?
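
The gist does not include either class, so the author's versions are unknown. As a purely hypothetical sketch, a length transformer that fits into a `FeatureUnion` could return one numeric column per document. Here it is assumed that the input is an iterable of strings and that "length" means character count.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class LengthTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical sketch: one column holding each document's character count."""

    def fit(self, X, y=None):
        # Nothing to learn; present so the class works as a pipeline step.
        return self

    def transform(self, X):
        # FeatureUnion stacks outputs side by side, so return a 2-D array
        # with one row per document and a single feature column.
        return np.array([[len(doc)] for doc in X], dtype=float)


lengths = LengthTransformer().fit_transform(["short", "a longer essay"])
```

A misspelling counter could follow the same shape, with `transform` returning one count per document. It would also need a word list or spell checker, which the gist does not specify.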
