@gchavez2
Last active August 16, 2019 22:33
Dataframe exploration
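import pandas as pd
# FILENAME must hold the path to the CSV file; header=0 together with names= replaces the file's own header row
df = pd.read_csv(FILENAME, names=["id", "title", "text"],
                 escapechar='\\', encoding='utf-8', header=0)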
print(df.describe(include='all')) # Summary statistics for every column (count, unique, top, freq for text columns; mean, std, quartiles for numeric ones)
df.info() # Dtype and non-null count of each column, plus memory usage (info() prints directly and returns None)
print(df.head()) # Show first 5 rows of dataframe
print(df.head(3)) # Show first 3 rows of dataframe
# Random sample of 10 rows, restricted to a few columns (raises ValueError if the dataframe has fewer than 10 rows)
print(df[['id', 'title']].sample(10))
print(df.columns) # Name of the columns of the dataframe
print(df.shape) # Size of the dataframe as a tuple (nrows, ncols)
# Display all columns of the row at integer position 2 (the third row), using iloc
print(df.iloc[2, :])
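
A minimal sketch for trying the same kind of exploration without a CSV file on disk. The sample data in `csv_text` is hypothetical and is not part of the original gist; `io.StringIO` stands in for `FILENAME`. It also shows that `iloc` with a scalar row position returns a Series, while a one-element list returns a one-row DataFrame.

```python
import io

import pandas as pd

# Hypothetical sample data (an assumption, standing in for the real CSV file)
csv_text = "a,b,c\n1,First,alpha\n2,Second,beta\n3,Third,gamma\n"

# header=0 discards the file's own header row ("a,b,c");
# names= supplies the replacement column names
df = pd.read_csv(io.StringIO(csv_text), names=["id", "title", "text"], header=0)

print(df.shape)    # tuple of (number of rows, number of columns)
print(df.columns)  # the column names given via names=

# Scalar position: the row comes back as a Series indexed by column name
row_as_series = df.iloc[2, :]
print(row_as_series)

# One-element list: the row comes back as a one-row DataFrame
row_as_frame = df.iloc[[2], :]
print(row_as_frame)
```

The list form is handy when later code expects a DataFrame, for example when concatenating selected rows.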