@paulochf
Created April 26, 2017 20:21
Prints the differences between two given DataFrames
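import numpy as np
import pandas as pd


def df_diff(df1, df2):
    """Print the cells that differ between two identically labelled DataFrames."""
    # Extracted from
    # http://stackoverflow.com/a/17095620
    # Boolean mask of differing cells. NaN != NaN, so a cell that is NaN in
    # both frames is reported as changed.
    diffs = df1 != df2
    # Stack into a Series indexed by (row label, column) and keep the True entries.
    ne_stacked = diffs.stack()
    changed = ne_stacked[ne_stacked]
    changed.index.names = ['id', 'col']
    # Positions of the differing cells, in the same row-major order as `changed`.
    difference_locations = np.where(diffs)
    changed_from = df1.values[difference_locations]
    changed_to = df2.values[difference_locations]
    print(pd.DataFrame({'from': changed_from, 'to': changed_to}, index=changed.index))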
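
A short usage sketch may help show what the function reports. The variant below is not part of the gist: the name `df_diff_frame`, the choice to return the result instead of printing it, and the rule that a cell holding NaN in both frames counts as unchanged are all assumptions made for illustration. It expects both frames to share the same index and columns.

```python
import numpy as np
import pandas as pd


def df_diff_frame(df1, df2):
    """Return the differing cells of two identically labelled DataFrames.

    Hypothetical variant of df_diff: the name, the return value and the
    NaN handling are assumptions, not part of the original snippet.
    """
    # Cells that differ, ignoring cells that are NaN in both frames.
    diffs = (df1 != df2) & ~(df1.isna() & df2.isna())
    # Row and column positions of the differing cells, in row-major order.
    rows, cols = np.where(diffs.to_numpy())
    index = pd.MultiIndex.from_arrays(
        [df1.index[rows], df1.columns[cols]], names=['id', 'col']
    )
    return pd.DataFrame(
        {'from': df1.values[rows, cols], 'to': df2.values[rows, cols]},
        index=index,
    )


# Made-up example data: one changed cell in each column.
before = pd.DataFrame({'a': [1, 2, 3], 'b': ['x', 'y', 'z']})
after = pd.DataFrame({'a': [1, 5, 3], 'b': ['x', 'y', 'w']})
result = df_diff_frame(before, after)
print(result)
```

Returning the frame keeps the result available for further filtering or assertions; printing it, as the original function does, is still one call away.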