print('Skewness = ',train['SalePrice'].skew())
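target = np.log(train['SalePrice'])
print('Skewness = ',target.skew())
# sns.distplot is deprecated in recent seaborn releases; histplot with a KDE overlay is the closest replacement
sns.histplot(target, kde=True);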
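To make the effect of the log transform on skewness concrete, here is a minimal sketch in plain Python. It assumes the bias-corrected (adjusted Fisher-Pearson) sample skewness, which to my understanding is what `Series.skew()` reports in pandas. The helper `sample_skewness` and the `prices` list are hypothetical and are not taken from the dataset.

```python
import math

def sample_skewness(values):
    """Bias-corrected (adjusted Fisher-Pearson) sample skewness."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0
    g1 = m3 / m2 ** 1.5
    # Small-sample correction factor
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Hypothetical right-skewed prices with one very expensive house
prices = [100_000, 120_000, 130_000, 150_000, 180_000, 250_000, 755_000]
log_prices = [math.log(p) for p in prices]

raw_skew = sample_skewness(prices)
log_skew = sample_skewness(log_prices)
print('Skewness (raw) = ', round(raw_skew, 3))
print('Skewness (log) = ', round(log_skew, 3))
```

The log compresses the long right tail, so the skewness of the transformed values is smaller than that of the raw values, although it need not drop all the way to zero.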
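# restrict to numeric columns; pandas >= 2.0 raises an error on string columns otherwise
corr = train.corr(numeric_only=True)
corr['SalePrice'].sort_values(ascending=False).head(10)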
table = pd.pivot_table(train, index='OverallQual', values='SalePrice', aggfunc='mean')
table
plt.scatter(x=train['GrLivArea'], y=train['SalePrice'])
plt.ylabel('Sale Price')
plt.xlabel('GrLivArea')
plt.show();
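# drop outliers: keep only rows with GrLivArea below 4500
# .copy() makes the result an independent frame, so later in-place edits do not trigger SettingWithCopyWarning
train = train[train['GrLivArea'] < 4500].copy()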
# number of train rows
ntrain = train.shape[0]
# save log transform of target feature
target = np.log(train['SalePrice'])
# drop Id and SalePrice from train dataframe
train.drop(['Id','SalePrice'],inplace=True,axis=1)
# store test Id
# Null values
train.isna().sum().sort_values(ascending=False).head(20)
# Ordinal features
# Assign the result back: an in-place replace on a selected column may not update train under pandas Copy-on-Write
# NA means no pool
train['PoolQC'] = train['PoolQC'].replace(['Ex','Gd','TA','Fa',np.nan],[4,3,2,1,0])
# NA means no fence
train['Fence'] = train['Fence'].replace(['GdPrv','MnPrv','GdWo','MnWw',np.nan],[4,3,2,1,0])
# NA means no fireplace
train['FireplaceQu'] = train['FireplaceQu'].replace(['Ex','Gd','TA','Fa','Po',np.nan],[5,4,3,2,1,0])
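The ordinal encoding above can also be written with an explicit mapping, which keeps the quality scale in one place. This is only a sketch on a toy frame: `quality_scale` and `toy` are hypothetical names, and `map` followed by `fillna(0)` is an alternative to the `replace` calls used here, not the original approach.

```python
import numpy as np
import pandas as pd

# Hypothetical shared scale for quality columns; a missing value means the feature is absent
quality_scale = {'Ex': 5, 'Gd': 4, 'TA': 3, 'Fa': 2, 'Po': 1}

# Toy data standing in for a column such as FireplaceQu
toy = pd.DataFrame({'FireplaceQu': ['Gd', np.nan, 'TA', 'Po', 'Ex']})

# map() turns known labels into scores and leaves NaN as NaN;
# fillna(0) then encodes "no fireplace" as 0
toy['FireplaceQu'] = toy['FireplaceQu'].map(quality_scale).fillna(0).astype(int)
print(toy['FireplaceQu'].tolist())
```

One difference worth knowing: `map` turns any label missing from the dictionary into NaN (and so into 0 here), whereas `replace` leaves unknown labels untouched, so a misspelled category fails differently in each version.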
# Ordinal features
for i in ['GarageCond','GarageQual']:
    train[i] = train[i].replace(['Ex','Gd','TA','Fa','Po',np.nan],[5,4,3,2,1,0])
# Nominal features
for i in ['GarageFinish','GarageType']:
    train[i] = train[i].fillna('None')
# Numerical features
for i in ['GarageYrBlt','GarageCars','GarageArea']: