@aniruddha27
Last active September 29, 2020 10:59
# data standardization with sklearn
from sklearn.preprocessing import StandardScaler
# copy of datasets
X_train_stand = X_train.copy()
X_test_stand = X_test.copy()
# numerical features
num_cols = ['Item_Weight','Item_Visibility','Item_MRP','Outlet_Establishment_Year']
# apply standardization on numerical features
for i in num_cols:
    # fit on training data column
    scale = StandardScaler().fit(X_train_stand[[i]])
    # transform the training data column
    X_train_stand[i] = scale.transform(X_train_stand[[i]])
    # transform the testing data column
    X_test_stand[i] = scale.transform(X_test_stand[[i]])
@aniruddha27
Author

Hi, for L14, where did you define X_train_stand before?
I only have X_train and X_test after splitting my df X.

Hey. I made a copy of X_train in X_train_stand before standardizing, so that the original X_train stays available for comparison later. That line was missing from the gist, so I have just added it.
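
To illustrate why the copy matters, here is a minimal sketch on made-up data. The random values and the two-column subset are assumptions for illustration only; the real X_train and X_test come from your own split. It also fits a single StandardScaler on all numerical columns at once rather than looping, which should give the same result as the gist because StandardScaler standardizes each column independently.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data; replace with your own X_train / X_test.
rng = np.random.default_rng(0)
num_cols = ['Item_Weight', 'Item_MRP']
X_train = pd.DataFrame(
    rng.normal(loc=[12.0, 140.0], scale=[4.0, 60.0], size=(80, 2)),
    columns=num_cols,
)
X_test = pd.DataFrame(
    rng.normal(loc=[12.0, 140.0], scale=[4.0, 60.0], size=(20, 2)),
    columns=num_cols,
)

# Work on copies so the unscaled frames stay available for comparison.
X_train_stand = X_train.copy()
X_test_stand = X_test.copy()

# Fit on the training columns only, then apply the same scaling to both sets.
scaler = StandardScaler().fit(X_train[num_cols])
X_train_stand[num_cols] = scaler.transform(X_train[num_cols])
X_test_stand[num_cols] = scaler.transform(X_test[num_cols])

# StandardScaler uses the population standard deviation, hence ddof=0.
train_means = X_train_stand[num_cols].mean()
train_stds = X_train_stand[num_cols].std(ddof=0)
print(train_means)
print(train_stds)

# The original frame is untouched and still on its raw scale.
print(X_train[num_cols].mean())
```

The standardized training columns should come out with mean close to 0 and standard deviation close to 1, while X_train keeps its original values. The test columns will generally not be exactly 0 and 1, since they are scaled with the training statistics.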
