@magnetikonline
Last active January 24, 2023 21:06
Python - comparing JSON data structures.

The function compare_json_data(source_data_a, source_data_b) accepts two structures populated with data loaded from json.load() and compares them for equality, checking both values and types.

Example

$ ./compare.py 
Compare JSON result is: True

The JSON files a.json and b.json are loaded via the load_json() function, and the resulting structures are passed into compare_json_data() for comparison.

{
  "apple": "value",
  "orange": 12,
  "banana": [1,2,3,"One","Two","Three","Fourth"],
  "grape": false,
  "strawberry": null,
  "carrot": [
    {
      "fourth": "value-two",
      "first": {
        "apple": "value1",
        "orange": "value2"
      },
      "second": "value",
      "third": "another"
    },
    {},
    ["One","Two","Three"]
  ],
  "pear": {
    "first": "value",
    "second": "another",
    "third": [1,2,3,"One","Two","Three"],
    "fourth": [
      {
        "first": "value",
        "second": "another"
      },
      {}
    ]
  }
}

{
  "apple": "value",
  "orange": 12,
  "banana": [1,2,3,"One","Two","Three","Fourth"],
  "pear": {
    "first": "value",
    "second": "another",
    "third": [1,2,3,"One","Two","Three"],
    "fourth": [
      {
        "second": "another",
        "first": "value"
      },
      {}
    ]
  },
  "grape": false,
  "strawberry": null,
  "carrot": [
    {
      "second": "value",
      "third": "another",
      "fourth": "value-two",
      "first": {
        "apple": "value1",
        "orange": "value2"
      }
    },
    {},
    ["One","Two","Three"]
  ]
}

#!/usr/bin/env python3

import json


def load_json(file_path):
    # open JSON file and parse contents
    with open(file_path, "r") as fh:
        return json.load(fh)


def compare_json_data(source_data_a, source_data_b):
    def compare(data_a, data_b):
        # type: list
        if type(data_a) is list:
            # is [data_b] a list and of same length as [data_a]?
            if (type(data_b) is not list) or (len(data_a) != len(data_b)):
                return False

            # iterate over list items
            for list_index, list_item in enumerate(data_a):
                # compare [data_a] list item against [data_b] at index
                if not compare(list_item, data_b[list_index]):
                    return False

            # list identical
            return True

        # type: dictionary
        if type(data_a) is dict:
            # is [data_b] a dictionary?
            if type(data_b) is not dict:
                return False

            # iterate over dictionary keys
            for dict_key, dict_value in data_a.items():
                # key exists in [data_b] dictionary, and same value?
                if (dict_key not in data_b) or (
                    not compare(dict_value, data_b[dict_key])
                ):
                    return False

            # dictionary identical
            return True

        # simple value - compare both value and type for equality
        return (data_a == data_b) and (type(data_a) is type(data_b))

    # compare a to b, then b to a (catches keys present only in [source_data_b])
    return compare(source_data_a, source_data_b) and compare(
        source_data_b, source_data_a
    )


def main():
    # import testing JSON files to Python structures
    a_json = load_json("a.json")
    b_json = load_json("b.json")

    # compare first struct against second
    print(f"Compare JSON result is: {compare_json_data(a_json, b_json)}")


if __name__ == "__main__":
    main()

@tiagogba

I'm sorry, but isn't this the same as using '==' between the two JSON structures? I think it would be better to return which key or value is different.
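
On the comparison with a plain `==`: Python's built-in equality already compares nested dicts and lists recursively and ignores dict key order. Where it differs from the gist's function is type strictness, since `1 == 1.0` and `True == 1` both hold in Python. A minimal sketch of that difference follows; `strict_equal` is a hypothetical, condensed stand-in written for this illustration, not the gist's own code.

```python
import json


def strict_equal(a, b):
    # condensed, illustrative type-strict deep comparison (hypothetical helper)
    if type(a) is not type(b):
        return False
    if isinstance(a, list):
        return len(a) == len(b) and all(strict_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict):
        return a.keys() == b.keys() and all(
            strict_equal(value, b[key]) for key, value in a.items()
        )
    return a == b


doc_a = json.loads('{"count": 1, "items": [1, 2]}')
doc_b = json.loads('{"items": [1, 2], "count": 1.0}')

print(doc_a == doc_b)              # True: plain equality treats 1 and 1.0 as equal
print(strict_equal(doc_a, doc_b))  # False: int versus float
```

So the two approaches agree on most inputs and diverge only where the types differ but the values compare equal.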

@avoidik

avoidik commented Aug 24, 2018

you might want to sort the lists before comparing them, and the lengths of the dictionaries could be compared as well

@johncornelius091

Yeah, sort the two JSON objects by keys and then just apply ==.
Wouldn't that be a simple solution?

@magnetikonline
Author

@avoidik if I sort the lists, they are no longer considered equal... e.g.

[1,2,3,"One","Two","Three"]
[1,2,3,"One","Three","Two"]

those are not equal - what I want... but if sorted first before index/value compare...

[1,2,3,"One","Two","Three"]
[1,2,3,"One","Two","Three"]

now they are equal... but not from the source JSON struct.
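
The ordering point can be checked directly. JSON arrays are ordered, so two lists holding the same members in a different order are different data, while object key order carries no meaning. A small sketch using only the standard library:

```python
import json

list_a = json.loads('[1, 2, 3, "One", "Two", "Three"]')
list_b = json.loads('[1, 2, 3, "One", "Three", "Two"]')

# index-by-index comparison: order matters, so these differ
print(list_a == list_b)  # False

# sorting both sides first hides the difference present in the source data
# (key=str is needed because int and str cannot be ordered against each other)
print(sorted(list_a, key=str) == sorted(list_b, key=str))  # True

# object keys are unordered, so key order alone never makes two dicts differ
print(json.loads('{"a": 1, "b": 2}') == json.loads('{"b": 2, "a": 1}'))  # True
```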

@ManikandanRajendran

Actually, my response is returned with a timestamp. In such cases the comparison fails because the timestamp differs. Could you please help me handle those scenarios?

@magnetikonline
Author

Sorry @ManikandanRajendran you'll have to explain this with a code sample.
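
Without the actual data this can only be guessed at, but one common way to handle volatile fields is to drop them from both structures before comparing. The sketch below assumes the changing value sits under a known key name; the key `timestamp`, the helper `strip_keys` and the sample responses are all hypothetical.

```python
def strip_keys(data, ignored_keys):
    # return a copy of the structure with the ignored keys removed at every depth
    if isinstance(data, dict):
        return {
            key: strip_keys(value, ignored_keys)
            for key, value in data.items()
            if key not in ignored_keys
        }
    if isinstance(data, list):
        return [strip_keys(item, ignored_keys) for item in data]
    return data


# made-up responses that differ only in their timestamp values
response_a = {"id": 7, "timestamp": "10:00:00", "items": [{"timestamp": "x", "name": "a"}]}
response_b = {"id": 7, "timestamp": "10:00:05", "items": [{"timestamp": "y", "name": "a"}]}

ignored = {"timestamp"}
print(response_a == response_b)                                            # False
print(strip_keys(response_a, ignored) == strip_keys(response_b, ignored))  # True
```

The stripped copies could equally be passed to compare_json_data() when type-strict comparison is wanted.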

@prixgoody

If the lists contain a different number of elements, the code throws this error:

IndexError: list index out of range.

@magnetikonline
Author

Hey @prixgoody - that's weird - this check should mean that's not possible:

(len(data_a) != len(data_b))

do you have an example of data structures that throw this error?

@prixgoody

> Hey @prixgoody - that's weird - this check should mean that's not possible:
>
> (len(data_a) != len(data_b))
>
> do you have an example of data structures that throw this error?

Hi, I have used your code, but instead of returning true or false I am fetching more details, such as which value differs. Still in progress.
Sorry for the miscommunication caused.

@justinjobo

> > Hey @prixgoody - that's weird - this check should mean that's not possible:
> >
> > (len(data_a) != len(data_b))
> >
> > do you have an example of data structures that throw this error?
>
> Hi, I have used your code, but instead of returning true or false I am fetching more details, such as which value differs. Still in progress.
> Sorry for the miscommunication caused.

Any luck fetching which value differs? I'm working on it too.
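
For reporting which value differs, one possible shape is a recursive walk that collects a path for every mismatch instead of returning a single boolean. This is a rough sketch under the same type-strict rules as the gist; the function name `find_differences`, the `$`-rooted path notation and the message wording are all invented for this example.

```python
def find_differences(data_a, data_b, path="$"):
    # collect a readable description of every mismatch, prefixed by its path
    if type(data_a) is not type(data_b):
        return [f"{path}: type {type(data_a).__name__} != {type(data_b).__name__}"]

    differences = []
    if isinstance(data_a, dict):
        # JSON object keys are always strings, so they can be sorted safely
        for key in sorted(data_a.keys() | data_b.keys()):
            child = f"{path}.{key}"
            if key not in data_a:
                differences.append(f"{child}: missing from first")
            elif key not in data_b:
                differences.append(f"{child}: missing from second")
            else:
                differences.extend(find_differences(data_a[key], data_b[key], child))
    elif isinstance(data_a, list):
        if len(data_a) != len(data_b):
            differences.append(f"{path}: length {len(data_a)} != {len(data_b)}")
        # zip stops at the shorter list, so no index can go out of range
        for index, (item_a, item_b) in enumerate(zip(data_a, data_b)):
            differences.extend(find_differences(item_a, item_b, f"{path}[{index}]"))
    elif data_a != data_b:
        differences.append(f"{path}: {data_a!r} != {data_b!r}")
    return differences


first = {"apple": "value", "banana": [1, 2, 3], "grape": False}
second = {"apple": "other", "banana": [1, 2], "pear": None}

for line in find_differences(first, second):
    print(line)
# $.apple: 'value' != 'other'
# $.banana: length 3 != 2
# $.grape: missing from second
# $.pear: missing from first
```

An empty list means the two structures are equal, so `not find_differences(a, b)` gives back the original true/false answer.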