@kevinschaul
Last active April 19, 2022 19:52

Pandas cheat sheet

Because I always forget how to do the same things, over and over again

See also: R stats cheat sheet

Helpful links:

Split-Apply-Combine type stuff

Count instances of variables within groups

df.value_counts(subset=['county', 'category']).to_frame(name='n').reset_index()
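A quick sketch of what that one-liner produces, using a made-up frame (the sample rows are hypothetical; only the `county` and `category` column names come from the snippet above). The `groupby(...).size()` spelling should give the same counts, ordered by group key instead of by count.

```python
import pandas as pd

# Hypothetical sample data for illustration only
df = pd.DataFrame({
    'county': ['A', 'A', 'A', 'B', 'B'],
    'category': ['x', 'x', 'y', 'x', 'y'],
})

# One row per (county, category) pair, with the count in column 'n'
counts = (
    df.value_counts(subset=['county', 'category'])
    .to_frame(name='n')
    .reset_index()
)

# Same counts via groupby; rows come back sorted by the group keys
counts_gb = df.groupby(['county', 'category']).size().reset_index(name='n')

print(counts)
print(counts_gb)
```

`value_counts` sorts by count (largest first), so reach for the `groupby` version when the row order should follow the keys.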

Mutate a grouped data frame

df['pct'] = df['n'] / df.groupby('county')['n'].transform('sum')
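A minimal worked example of the same idea, assuming a frame that already has an `n` column of counts per county and category (the numbers are invented). `transform('sum')` returns a series aligned to the original rows, which is why the division works row by row.

```python
import pandas as pd

# Hypothetical counts for illustration only
df = pd.DataFrame({
    'county': ['A', 'A', 'B', 'B'],
    'category': ['x', 'y', 'x', 'y'],
    'n': [3, 1, 2, 2],
})

# Each row's share of its county total
df['pct'] = df['n'] / df.groupby('county')['n'].transform('sum')

print(df)
```

This is roughly the pandas counterpart of `group_by() %>% mutate()` in dplyr: the frame keeps all of its rows, and the shares sum to 1 within each county.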

Dataframe tips

Sort by columns

df.sort_values(['col1', 'col2'])
df.sort_values('col1', ascending=False)
df.sort_values(['col1', 'col2'], ascending=[True, False])
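A small sketch of the mixed-direction sort on invented data. Two things that are easy to forget: `sort_values` returns a new frame rather than sorting in place, and the original index labels travel with the rows.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({'col1': [2, 1, 2, 1], 'col2': [10, 20, 30, 40]})

# col1 ascending, then col2 descending within each col1 value
sorted_df = df.sort_values(['col1', 'col2'], ascending=[True, False])

# Optional: renumber the rows 0..n-1 after sorting
renumbered = sorted_df.reset_index(drop=True)

print(sorted_df)
```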

Jupyter tips

Show all rows/columns of a dataframe

with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    display(df)
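`display` only exists inside Jupyter/IPython. Outside a notebook, the same context manager should work with `print` or `repr`; here is a sketch on a made-up 100-row frame, assuming pandas' default row limit is still in effect. The options revert to their previous values when the `with` block exits.

```python
import pandas as pd

# Hypothetical frame long enough to be truncated by default
df = pd.DataFrame({'x': range(100)})

before = pd.get_option('display.max_rows')
truncated = repr(df)  # default output, with a "..." row in the middle

with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    full = repr(df)   # every row; in a notebook, call display(df) here

after = pd.get_option('display.max_rows')  # same as `before`

print(full)
```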

Convert a series to a dataframe for better printing

series.to_frame(name='col_name')
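A short sketch with an invented series, showing that the index stays as the index after `to_frame`. Chaining `reset_index()` turns it into a regular column, which is often what you want before merging or exporting.

```python
import pandas as pd

# Hypothetical series for illustration only
s = pd.Series([3, 1], index=pd.Index(['A', 'B'], name='county'))

frame = s.to_frame(name='n')   # one column 'n', indexed by county
flat = frame.reset_index()     # county becomes an ordinary column

print(flat)
```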