@webbedfeet
Created August 29, 2022 04:33
Set difference of rows of two data frames
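def df_diff(d1, d2):
    """
    Create a DataFrame containing the rows of d1 that are not in d2.

    Rows are compared on the columns that d1 and d2 have in common.

    Arguments:
        d1 -- A pandas DataFrame
        d2 -- Another DataFrame whose rows are a subset of d1's rows
    Returns:
        A pandas DataFrame containing the rows of d1 that are not in d2
    """
    # Left-join d1 against the unique rows of d2; the indicator column
    # "_merge" records whether each row of d1 found a match in d2.
    df_all = d1.merge(d2.drop_duplicates(), how="left", indicator=True)
    # Keep only the unmatched rows and drop the helper column.
    return df_all[df_all["_merge"] == "left_only"].drop(columns="_merge")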
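A minimal usage sketch follows, assuming two small frames with identical columns. The sample data and the names `all_rows` and `seen_rows` are made up for illustration and are not part of the original gist. The function is repeated so the snippet runs on its own.

```python
import pandas as pd


def df_diff(d1, d2):
    # Same logic as the gist: left-join, then keep rows found only in d1.
    df_all = d1.merge(d2.drop_duplicates(), how="left", indicator=True)
    return df_all[df_all["_merge"] == "left_only"].drop(columns="_merge")


# Hypothetical sample data, for illustration only.
all_rows = pd.DataFrame({"id": [1, 2, 3, 4], "name": ["a", "b", "c", "d"]})
seen_rows = pd.DataFrame({"id": [2, 4], "name": ["b", "d"]})

# Rows of all_rows that do not appear in seen_rows.
result = df_diff(all_rows, seen_rows).reset_index(drop=True)
print(result)
```

Because `merge` is called without `on`, rows are matched on every column name the two frames share, so the comparison is only a full-row comparison when both frames have the same columns. The result also carries a fresh index from the merge, not the original index of `d1`.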