Helper function to compare two DataFrames and find rows which are unique or shared.
def dataframe_difference(df1, df2, which=None):
    """Find rows which are different."""
    comparison_df = df1.merge(df2,
                              indicator=True,
                              how='outer')
    if which is None:
        diff_df = comparison_df[comparison_df['_merge'] != 'both']
    else:
        diff_df = comparison_df[comparison_df['_merge'] == which]
    diff_df.to_csv('data/diff.csv')
    return diff_df
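
To make the role of the `_merge` column concrete, here is a minimal sketch of the merge the helper relies on. The frames `df1` and `df2` and their `id` and `value` columns are made-up sample data, not part of the gist. The sketch runs the merge directly rather than calling the helper, so it does not need a `data/` directory to write into.

```python
import pandas as pd

# Hypothetical sample data, not from the original gist.
df1 = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value": ["b", "x", "d"]})

# With no `on` argument, merge joins on every column the two frames share,
# so a row is tagged 'both' only if all of those columns match.
comparison_df = df1.merge(df2, indicator=True, how="outer")

# indicator=True adds a `_merge` column holding one of
# 'left_only', 'right_only', or 'both'.
left_only = comparison_df[comparison_df["_merge"] == "left_only"]
right_only = comparison_df[comparison_df["_merge"] == "right_only"]
both = comparison_df[comparison_df["_merge"] == "both"]

# This is the filter the helper applies when `which` is None.
different = comparison_df[comparison_df["_merge"] != "both"]

print(comparison_df)
```

The `which` argument of the helper is compared against these same three labels, so it should be one of `'left_only'`, `'right_only'`, or `'both'`.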

@99rig commented Mar 5, 2020

Thank you!
Excellent work!

@Per48edjes commented Apr 23, 2020

Super useful -- thank you!

@velascog commented May 5, 2020

I am able to get a left or right but not both. What am I doing wrong?
(The whole script is below.)

import pandas as pd

def dataframe_difference(wk15, wk16, which=None):
    """Find rows which are different between two DataFrames."""
    comparison_df = wk15.merge(wk16, indicator=True, how='outer')
    if which is None:
        diff_df = comparison_df[comparison_df['_merge'] != 'both']
    else:
        diff_df = comparison_df[comparison_df['_merge'] == which]
    diff_df.to_csv('diff.csv')
    return diff_df

wk15 = pd.read_excel('GRNI-wk15.xlsx', sheet_name='GRNI-wk15')
wk16 = pd.read_excel('GRNI-wk16.xlsx', sheet_name='GRNI-wk16')

dataframe_difference(wk15, wk16)

@randallscott25 commented Dec 23, 2020

Yes, I am receiving the same results as @velascog.
Can you please explain?
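
A likely explanation, judging only from the script posted above: the function is called without a `which` argument, so it takes the `which is None` branch and keeps rows where `_merge != 'both'`. Shared rows are therefore filtered out by design, and passing `which='both'` should return them. The sketch below assumes small made-up frames (the `po` and `qty` columns are hypothetical stand-ins for the Excel sheets) and leaves out the CSV write so it runs anywhere.

```python
import pandas as pd

def dataframe_difference(df1, df2, which=None):
    """Same filtering as the gist's helper; the CSV write is omitted here."""
    comparison_df = df1.merge(df2, indicator=True, how="outer")
    if which is None:
        # Default: rows found in only one of the two frames.
        return comparison_df[comparison_df["_merge"] != "both"]
    return comparison_df[comparison_df["_merge"] == which]

# Hypothetical stand-ins for the two weekly sheets.
wk15 = pd.DataFrame({"po": [100, 101, 102], "qty": [5, 7, 9]})
wk16 = pd.DataFrame({"po": [101, 102, 103], "qty": [7, 8, 4]})

# Default call: never contains rows tagged 'both'.
unique_rows = dataframe_difference(wk15, wk16)

# Explicitly ask for the rows present in both frames.
shared_rows = dataframe_difference(wk15, wk16, which="both")

print(shared_rows)
```

If `which='both'` still comes back empty on real data, it may be worth checking whether any row matches on every shared column, since the merge compares all of them.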
