@joshreini1
Created August 30, 2022 19:42
Set up Airbnb pipelines
from sklearn.pipeline import Pipeline
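
# Instantiate the custom transformer classes.
# These classes and the column lists (cat_list, url_list, ...) are defined elsewhere, not in this snippet.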
cat_onehot_transformer = cat_onehot_transformer_custom(cat_list)
url_onehot_transformer = url_onehot_transformer_custom(url_list)
convert_dates_transformer = convert_dates_transformer_custom(date_list)
to_float_transformer = to_float_transformer_custom(tofloat_list)
fillna_transformer = fillna_transformer_custom(fillna_list)
cap_reviews_per_month = cap_reviews_per_month_custom()
drop_features = drop_features_selector(drop_list)

# Set up the pipelines
cat_preprocessing_pipe = Pipeline(steps=[
    ('cat_onehot_transformer', cat_onehot_transformer),
    ('url_onehot_transformer', url_onehot_transformer),
])

num_preprocessing_pipe = Pipeline(steps=[
    ('convert_dates_transformer', convert_dates_transformer),
    ('to_float_transformer', to_float_transformer),
    ('fillna_transformer', fillna_transformer),
    ('cap_reviews_per_month', cap_reviews_per_month),
])
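
# Combine the numerical and categorical preprocessing steps.
# Nested pipelines run one after another on the full input:
# drop features first, then the numerical steps, then the categorical steps.
combined_preprocessing = Pipeline(steps=[
    ('drop_features', drop_features),
    ('numericals', num_preprocessing_pipe),
    ('categoricals', cat_preprocessing_pipe),
])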
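
The custom transformer classes used above are not shown in this snippet. As a rough sketch of the pattern, assuming they follow the usual scikit-learn convention of subclassing `BaseEstimator` and `TransformerMixin` and operating on a pandas DataFrame, a transformer and a small pipeline might look like the following. The class names `FillNaSketch` and `DropFeaturesSketch`, the fill value of 0, and the demo column names are hypothetical, chosen only for illustration; they are not the author's implementations.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class FillNaSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for a fillna transformer: fills NaNs with 0."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        # Stateless: nothing is learned from the data.
        return self

    def transform(self, X):
        X = X.copy()  # avoid mutating the caller's DataFrame
        X[self.columns] = X[self.columns].fillna(0)
        return X


class DropFeaturesSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for a feature selector that drops columns."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X.drop(columns=self.columns)


# Steps run in order; each step receives the previous step's output.
demo_pipe = Pipeline(steps=[
    ('drop_features', DropFeaturesSketch(['id'])),
    ('fillna_transformer', FillNaSketch(['reviews_per_month'])),
])

# Made-up demo data, not the Airbnb dataset.
demo_df = pd.DataFrame({'id': [1, 2], 'reviews_per_month': [1.5, None]})
demo_out = demo_pipe.fit_transform(demo_df)
print(demo_out)
```

Because each transformer returns a DataFrame, later steps can still select columns by name. This matters for a layout like `combined_preprocessing`, where the nested pipelines are applied one after another to the whole frame rather than to separate column subsets.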