@BoredHackerBlog
Created February 19, 2021 14:10
#source: https://kanoki.org/2019/07/04/pandas-difference-between-two-dataframes/
import pandas as pd
import sys
def compare(csv1, csv2):
    #might need to modify this to drop certain columns or read csv a certain way
    dfcsv1 = pd.read_csv(csv1)
    dfcsv2 = pd.read_csv(csv2)
    #outer merge on all shared columns; the _merge column records which file each row came from
    merged = dfcsv1.merge(dfcsv2, how='outer', indicator=True)
    dfrem = merged.loc[merged['_merge'] == 'left_only']
    dfadd = merged.loc[merged['_merge'] == 'right_only']
    return dfadd, dfrem
dfadd, dfrem = compare(sys.argv[1], sys.argv[2])
#might need to modify this to drop certain columns or write csv a certain way
dfadd.to_csv("added.csv")
dfrem.to_csv("removed.csv")
print("ADDED")
print(dfadd.to_string())
print("REMOVED")
print(dfrem.to_string())
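
The outer merge with `indicator=True` is what does the work here, so a small in-memory sketch may help show what lands in each output. The helper name `diff_frames` and the sample `host`/`port` data are made up for illustration and are not part of the script above. Unlike the script, this sketch drops the `_merge` column before returning.

```python
import pandas as pd

def diff_frames(old, new):
    # Hypothetical helper: same outer-merge idea as compare(), but on DataFrames.
    # With no "on" argument, merge joins on every column the two frames share.
    merged = old.merge(new, how="outer", indicator=True)
    removed = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
    added = merged[merged["_merge"] == "right_only"].drop(columns="_merge")
    return added.reset_index(drop=True), removed.reset_index(drop=True)

# Illustrative data only: row "b" disappears, row "d" appears, row "c" changes port.
old = pd.DataFrame({"host": ["a", "b", "c"], "port": [22, 80, 443]})
new = pd.DataFrame({"host": ["a", "c", "d"], "port": [22, 8443, 53]})

added, removed = diff_frames(old, new)
print("ADDED")
print(added.to_string())
print("REMOVED")
print(removed.to_string())
```

Because rows are matched on all shared columns, a row whose value changed (host `c` here) shows up in both outputs: once as removed with the old value and once as added with the new one.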