@srishtis
Created October 9, 2018 14:12
One-hot encoding and label encoding of features in Kaggle Hpp
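import pandas as pd
from sklearn import preprocessing

# total_df is assumed to be a DataFrame built earlier (it is not defined in this snippet)
# create a list of ordinal variables (duplicate 'GarageCond' entry removed)
ordinal_variables = ['HeatingQC','KitchenQual','FireplaceQu','GarageQual','PoolQC','ExterQual','BsmtQual','Fence','BsmtCond','GarageCond','ExterCond','OverallCond','OverallQual','TotalHomeQual']
# label encoder; note that LabelEncoder assigns codes in sorted label order,
# which is not necessarily the quality order of the categories
le = preprocessing.LabelEncoder()
for c in ordinal_variables:
    # fit on the column and replace it with its integer codes
    total_df[c] = le.fit_transform(total_df[c])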
# create a list of categorical columns for one hot encoding
cat_variables= ['MSSubClass','MSZoning','Street','Alley','LotShape','LotConfig','LandContour','BsmtExposure','BldgType','CentralAir','Condition1','Condition2','Electrical','Exterior1st','Exterior2nd','Foundation','Functional','GarageFinish','GarageType','Heating','HouseStyle','LandSlope','SaleCondition','Utilities','RoofStyle','HasBsmt','RoofMatl','MasVnrType','HasBeenRemodelled','DecadeBuilt','DecadeSold','MoSold','Neighborhood','PavedDrive','MiscFeature','GrLivArea_Band','TotalBsmtSF_Band','SaleType']
# One-Hot encoding to convert categorical columns to numeric
print('start one-hot encoding')
total_df = pd.get_dummies(total_df, prefix=cat_variables,
                          columns=cat_variables)
print('one-hot encoding done')
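
The label-encoding loop above relies on LabelEncoder, which numbers labels in sorted order, so the resulting codes need not follow the ranking that an ordinal variable implies. A minimal sketch of one alternative is an explicit mapping. The quality scale ('Po', 'Fa', 'TA', 'Gd', 'Ex', plus 'None' for a missing feature), the helper name encode_ordinal, and the toy frame are assumptions for illustration and are not taken from the gist.

```python
import pandas as pd

# Assumed quality scale, worst to best; 'None' stands in for a missing feature.
QUALITY_ORDER = ['None', 'Po', 'Fa', 'TA', 'Gd', 'Ex']


def encode_ordinal(df, columns, order=QUALITY_ORDER):
    """Return a copy of df with the given columns mapped to their rank in order."""
    mapping = {label: rank for rank, label in enumerate(order)}
    out = df.copy()
    for col in columns:
        # Labels missing from the mapping become NaN, which makes gaps visible.
        out[col] = out[col].map(mapping)
    return out


# Toy frame standing in for total_df (hypothetical data).
toy_df = pd.DataFrame({
    'KitchenQual': ['Gd', 'TA', 'Ex', 'Fa'],
    'HeatingQC': ['Ex', 'Po', 'TA', 'Gd'],
})
encoded = encode_ordinal(toy_df, ['KitchenQual', 'HeatingQC'])
print(encoded)
```

Columns that use a different scale (for example Fence) would need their own order list passed through the order argument.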