@AlexDemian
Last active July 9, 2018 12:29
Short pandas guide
import pandas

#*** Create
pandas.DataFrame([], columns=['column1','column2'])
pandas.DataFrame(data={'column1':[1,2,3], 'column2': ['1','2','3']})
pandas.DataFrame([[1,1], [2,2]], columns=['column1','column2'])
#*** Get
df.values
df.columns  # df.keys() returns the same Index
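# A minimal sketch of the Create and Get calls above on a tiny frame.
# The sample rows are made up for illustration only.

```python
import pandas

# Build a frame from a list of rows plus explicit column names
df = pandas.DataFrame([[1, 'a'], [2, 'b']], columns=['column1', 'column2'])

# .values gives the underlying data as a NumPy array
rows = df.values.tolist()

# .columns (same as .keys()) gives the column labels
names = list(df.columns)

print(rows)
print(names)
print(df.shape)
```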
#*** Post editing
# Rename columns
df.columns = ['column1', 'column2']
# UPDATE values by condition
df.loc[df['condition_column'] == 'condition_value', 'column_to_update'] = 0
df.loc[df['condition_column'] == 'condition_value', ['column_to_update1', 'column_to_update2']] = [new_value1, new_value2]
# Select rows by several conditions: wrap each one in parentheses and join with & (and) or | (or)
df.loc[(df['condition_column'] > 'condition_value') & (df['condition_column2'] > 'condition_value2')]
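# A runnable sketch of conditional updates with .loc.
# The frame and the column names ('status', 'price', 'qty') are hypothetical.

```python
import pandas

# Hypothetical sample data
df = pandas.DataFrame({
    'status': ['new', 'old', 'new'],
    'price': [10, 20, 30],
    'qty': [1, 2, 3],
})

# Update a single column where the condition holds
df.loc[df['status'] == 'old', 'price'] = 0

# Combine two conditions with & and update two columns at once
mask = (df['price'] > 5) & (df['qty'] > 1)
df.loc[mask, ['price', 'qty']] = [99, 0]

print(df)
```

# Building the mask first keeps the .loc line short and lets the same condition be reused.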
#*** Filters
# Simple value filter
df = df[df['condition_column'].isin(['1000', '-1'])]
# Value filter with column selection; .values returns a NumPy array, so store it under a separate name
values = df.loc[df['condition_column'] != '-1', ['Out Peer', 'opeerid']].values
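# A small sketch of both filters on made-up data.
# The column names ('code', 'name', 'peer_id') are assumptions for the example.

```python
import pandas

# Hypothetical sample data
df = pandas.DataFrame({
    'code': ['1000', '-1', '7'],
    'name': ['a', 'b', 'c'],
    'peer_id': [1, 2, 3],
})

# Keep rows whose 'code' is one of the listed values
kept = df[df['code'].isin(['1000', '-1'])]

# Filter rows and pick columns in one .loc call, then take the raw array
pairs = df.loc[df['code'] != '-1', ['name', 'peer_id']].values

print(kept)
print(pairs)
```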
#*** Concat
# Columns missing from one of the DataFrames are filled with NaN
all_dfs = [pandas.DataFrame(data={'column1':[1,2,3], 'column2': ['1','2','3']}), pandas.DataFrame([[1,1], [2,2]], columns=['column1','column2'])]
new_df = pandas.concat(all_dfs)
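# A sketch of how concat handles frames whose columns differ.
# 'column3' and the sample values are assumptions added for illustration.

```python
import pandas

left = pandas.DataFrame({'column1': [1, 2], 'column2': ['1', '2']})
right = pandas.DataFrame({'column1': [3], 'column3': [0.5]})

# ignore_index=True renumbers the rows 0..n-1 instead of keeping each frame's own index
combined = pandas.concat([left, right], ignore_index=True)

print(combined)
```

# Without ignore_index=True the result would keep the index labels 0, 1, 0, which makes label lookups ambiguous.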
#*** Append
# DataFrame.append was deprecated in pandas 1.4 and removed in 2.0; use pandas.concat instead
# Columns missing from one of the DataFrames are filled with NaN
# Several rows can be appended at once. Reassignment is necessary
new_df = pandas.concat([df1, df2])
#*** Sorting
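# Returns a sorted copy. Reassignment is necessary
new_df = df.sort_values(by=['column1', 'column2'])
# Reorder and select columns. Reassignment is necessary
# (reindex_axis was removed in pandas 1.0; reindex(columns=...) replaces it)
new_df = df.reindex(columns=['column3', 'column1'])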
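# A sketch of sorting and column reordering on made-up data.
# It uses reindex(columns=...), which is an assumption about the pandas version in use (reindex_axis no longer exists in pandas 1.0+).

```python
import pandas

# Hypothetical sample data
df = pandas.DataFrame({
    'column1': [2, 1, 2],
    'column2': ['b', 'a', 'a'],
    'column3': [0.1, 0.2, 0.3],
})

# Sort by column1, then by column2 within equal column1 values
sorted_df = df.sort_values(by=['column1', 'column2'])

# Keep only column3 and column1, in that order
reordered = df.reindex(columns=['column3', 'column1'])

print(sorted_df)
print(reordered)
```

# Both calls return new frames; df itself keeps its original row and column order.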
#*** Replace
# Replace null (NaN) values. Reassignment is necessary
new_df = df.fillna(value='-')
new_df[['column1', 'column2']] = df[['column1', 'column2']].fillna(value='NA value')
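# A sketch of filling NaN per column by passing a dict to fillna.
# This is an alternative to the column-subset form above, not the author's method; the sample data is made up.

```python
import numpy
import pandas

# Hypothetical sample data with gaps in every column
df = pandas.DataFrame({
    'column1': [1.0, numpy.nan],
    'column2': ['x', None],
    'column3': [numpy.nan, 2.0],
})

# Keys are column names, values are the replacement for that column;
# columns not listed (column3 here) are left untouched
filled = df.fillna({'column1': 0, 'column2': '-'})

print(filled)
```

# A per-column dict lets each column get a replacement that matches its own dtype.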
#*** Convert
# With errors='ignore' the column is returned unchanged if the conversion fails
df['column'] = df['column'].astype(int, errors='ignore')
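# A sketch of two conversion options on made-up string data.
# pandas.to_numeric with errors='coerce' is shown as an alternative, not as the author's approach.

```python
import pandas

# Clean strings convert directly with astype
clean = pandas.DataFrame({'column1': ['1', '2']})
clean['column1'] = clean['column1'].astype(int)

# Dirty strings: errors='coerce' turns unparseable values into NaN
dirty = pandas.DataFrame({'column1': ['1', '2', 'oops']})
numbers = pandas.to_numeric(dirty['column1'], errors='coerce')

print(clean['column1'].tolist())
print(numbers)
```

# Coercing makes the failed rows visible as NaN, whereas errors='ignore' silently leaves the whole column as strings.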