@pr2tik1
Created May 22, 2020 17:31
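from scipy.special import boxcox1p
from scipy.stats import boxcox_normmax, skew


def fixing_skewness(self):
    """
    Apply a Box-Cox transform to the highly skewed numeric columns of
    self.data (a pandas DataFrame), in place, and return the DataFrame.
    """
    # Select every column whose dtype is not "object".
    numeric = self.data.dtypes[self.data.dtypes != "object"].index
    # Compute the skewness of each numeric feature, most skewed first.
    skewed_feats = self.data[numeric].apply(lambda x: skew(x)).sort_values(ascending=False)
    # Keep only the features whose absolute skewness exceeds 0.5.
    high_skew = skewed_feats[abs(skewed_feats) > 0.5]
    skewed_features = high_skew.index
    for feat in skewed_features:
        # boxcox_normmax needs strictly positive, NaN-free input, so every
        # value of self.data[feat] + 1 must be greater than zero.
        self.data[feat] = boxcox1p(self.data[feat], boxcox_normmax(self.data[feat] + 1))
    return self.data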
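The gist shows only the method, not the class it belongs to or how it is called. The following is a minimal, self-contained sketch of one way it might be used. The `Preprocessor` class, the `threshold` parameter, the guard that skips columns with non-positive shifted values, and the synthetic `frame` are all assumptions made for illustration and are not part of the original gist.

```python
import numpy as np
import pandas as pd
from scipy.special import boxcox1p
from scipy.stats import boxcox_normmax, skew


class Preprocessor:
    """Hypothetical wrapper; the gist does not show the real class."""

    def __init__(self, data: pd.DataFrame):
        self.data = data

    def fixing_skewness(self, threshold: float = 0.5) -> pd.DataFrame:
        # Numeric columns only; strings and other non-numeric dtypes are ignored.
        numeric = self.data.select_dtypes(include="number").columns
        skewness = self.data[numeric].apply(lambda col: skew(col.dropna()))
        for feat in skewness[skewness.abs() > threshold].index:
            shifted = (self.data[feat] + 1).to_numpy()
            # Assumption: skip columns that Box-Cox cannot handle
            # (NaN or non-positive values after the +1 shift).
            if np.isnan(shifted).any() or (shifted <= 0).any():
                continue
            lam = boxcox_normmax(shifted)  # lambda fitted on the shifted data
            self.data[feat] = boxcox1p(self.data[feat], lam)
        return self.data


# Synthetic data, invented for this example.
rng = np.random.default_rng(0)
frame = pd.DataFrame({
    "income": rng.lognormal(mean=0.0, sigma=1.0, size=500),  # right-skewed
    "height": rng.normal(loc=170.0, scale=10.0, size=500),   # roughly symmetric
    "city": ["a", "b"] * 250,                                 # non-numeric
})

before = skew(frame["income"])
after_frame = Preprocessor(frame.copy()).fixing_skewness()
after = skew(after_frame["income"])
print(f"income skew before: {before:.2f}, after: {after:.2f}")
```

In this sketch the right-skewed `income` column is transformed, the `city` column is left alone because it is not numeric, and `height` would normally fall below the 0.5 threshold and stay unchanged. Passing `frame.copy()` keeps the original DataFrame intact, since the method modifies `self.data` in place.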