@sh16ma
Last active January 19, 2022 10:14
#🐍 #Python #EDA #NaN #Ranking #Decluttering
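# Looking at NaN % within the data
import numpy as np
import pandas as pd

# all_df is assumed to be an existing DataFrame
nan = pd.DataFrame(all_df.isna().sum(), columns=['NaN_sum'])
nan['feat'] = nan.index
nan['ratio(%)'] = (nan['NaN_sum'] / all_df.shape[0]) * 100
nan = nan[nan['NaN_sum'] > 0]  # keep only features that have missing values
nan = nan.sort_values(by=['NaN_sum'])  # ascending by default; pass ascending=False for descending order
nan['Usability'] = np.where(nan['ratio(%)'] > 20, 'Discard', 'Keep')  # flag features with more than 20% missing
nan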
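A minimal, self-contained sketch of the same ranking on toy data, in case `all_df` is not at hand. The helper name `nan_report`, the `discard_above` parameter, and the toy values are assumptions made for illustration and are not part of the original gist.

```python
import numpy as np
import pandas as pd


def nan_report(df: pd.DataFrame, discard_above: float = 20.0) -> pd.DataFrame:
    """Rank columns by missing-value count (hypothetical helper, not from the gist)."""
    report = pd.DataFrame({"NaN_sum": df.isna().sum()})
    report["feat"] = report.index
    report["ratio(%)"] = report["NaN_sum"] / len(df) * 100
    # Keep only columns that actually have missing values, fewest first.
    report = report[report["NaN_sum"] > 0].sort_values(by="NaN_sum")
    # Flag columns whose missing ratio exceeds the threshold.
    report["Usability"] = np.where(report["ratio(%)"] > discard_above, "Discard", "Keep")
    return report


# Toy data standing in for all_df (values are made up).
toy_df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6, 7, 8],                               # no missing values
    "b": [1, np.nan, 3, 4, 5, 6, 7, 8],                          # 1 of 8 missing
    "c": [np.nan, np.nan, np.nan, np.nan, 5, 6, 7, 8],           # 4 of 8 missing
})
report = nan_report(toy_df)
print(report)
```

With these toy values, column `a` drops out of the report, `b` falls under the 20% threshold, and `c` is flagged for discarding. The 20% cutoff is the one used in the gist; it is a judgment call and can be changed via `discard_above`.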
sh16ma commented Jan 26, 2021

[Image: nan]

sh16ma commented Jan 26, 2021

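# Plotting NaN
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(15, 6))
# Use the column names defined in the gist: 'feat' and 'ratio(%)'
sns.barplot(x=nan["feat"], y=nan["ratio(%)"])
plt.xticks(rotation=45)
plt.title('Features containing NaN')
plt.xlabel('Features')
plt.ylabel('% of Missing Data')
plt.show()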

[Image: barplot]
