@ryanbehdad
Last active August 11, 2020 09:40
Print a summary of a pandas dataframe and its columns
# =======================================================================
# Print a summary of a pandas dataframe and its columns
# =======================================================================
import pandas as pd


def df_summary(df):
    print(f'Dataframe has {df.shape[0]:,} rows and {df.shape[1]:,} columns')
    if len(df) > 1:
        # One summary row per column of df: column name and dtype
        summary = pd.DataFrame(df.dtypes, columns=['dtype']).reset_index()
        summary.rename(columns={'index': 'feature'}, inplace=True)
        # Missing-value count and number of distinct non-null values
        summary['missing'] = df.isnull().sum().values
        summary['uniques'] = df.nunique().values
        # Sample values taken from the first, second and last rows
        summary['first_value'] = df.iloc[0].values
        summary['second_value'] = df.iloc[1].values
        summary['final_value'] = df.iloc[-1].values
    else:
        # Fewer than two rows: a message string is returned, not a DataFrame
        summary = "Not enough data to analyse"
    return summary
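
Because `df_summary` returns a string when the dataframe has fewer than two rows, callers have to check the return type before using the result. Below is a minimal sketch of one possible variant that always returns a DataFrame. The name `df_summary_safe` and the sample data are hypothetical and not part of the original gist. The sketch also drops the `second_value` column so that a one-row frame can still be summarised.

```python
import pandas as pd


def df_summary_safe(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical variant of df_summary that always returns a DataFrame."""
    print(f'Dataframe has {df.shape[0]:,} rows and {df.shape[1]:,} columns')
    # One summary row per column of df
    summary = pd.DataFrame({
        'feature': df.columns,
        'dtype': df.dtypes.values,
        'missing': df.isnull().sum().values,
        'uniques': df.nunique().values,
    })
    # Sample values are only added when there is at least one row to read
    if len(df) > 0:
        summary['first_value'] = df.iloc[0].values
        summary['final_value'] = df.iloc[-1].values
    return summary


# Made-up sample data, for illustration only
example = pd.DataFrame({
    'city': ['Perth', 'Sydney', None],
    'population': [2.1, 5.3, 0.4],
})
result = df_summary_safe(example)
print(result)
```

For an empty frame, the variant still returns the `feature`, `dtype`, `missing` and `uniques` columns and leaves out the sample-value columns.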