@biancadanforth
Last active November 6, 2019 02:43
Check if two JSON objects are the same by first ordering them
import json
import os

# Put filenames here; this script assumes these files are in the same dir as the script
FILENAME_1 = "2.json"
FILENAME_2 = "3.json"


def ordered(obj):
    """Recursively sort dicts (as lists of key/value pairs) and lists so ordering is ignored."""
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    return obj


def main():
    files = [FILENAME_1, FILENAME_2]
    ordered_files = []
    script_dir = os.path.dirname(__file__)
    for filename in files:
        path = os.path.join(script_dir, filename)
        with open(path) as f:
            file_parsed = json.load(f)
        file_ordered = ordered(file_parsed)
        ordered_files.append(file_ordered)
        # Write the ordered form next to the input (e.g. 2_prettier.json) for manual diffing
        new_path = os.path.join(script_dir, f"{os.path.splitext(filename)[0]}_prettier.json")
        with open(new_path, "w") as new_file:
            json.dump(file_ordered, new_file, indent=4, sort_keys=True)
    # Prints True if both files hold the same data once ordering is ignored
    print(ordered_files[0] == ordered_files[1])


if __name__ == '__main__':
    main()

biancadanforth commented Nov 5, 2019

This is a helper script I made while reviewing @danielhertenstein's FathomFox PR to parallelize the Vectorizer. I wanted to know whether the vectors.json files produced by the serial Vectorizer and the parallelized Vectorizer were identical for the same samples and the same ruleset. Since the parallelized Vectorizer can finish pages in a different order, I needed to sort each JSON object before comparing them. Thankfully, the two outputs were the same.
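To illustrate why the sort is needed, here is a minimal sketch that applies the same ordered function to two small in-memory objects that differ only in page order. The "pages", "url", and "features" keys and all values are hypothetical stand-ins; the real layout of vectors.json is not shown here.

```python
def ordered(obj):
    # Same approach as the script: dicts become sorted lists of
    # (key, value) pairs and lists are sorted recursively.
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    return obj


# Hypothetical data: the same two pages, finished in a different order.
serial = {"pages": [
    {"url": "a.html", "features": [0.1, 0.9]},
    {"url": "b.html", "features": [0.7, 0.3]},
]}
parallel = {"pages": [
    {"url": "b.html", "features": [0.7, 0.3]},
    {"url": "a.html", "features": [0.1, 0.9]},
]}

# Plain equality is order-sensitive for lists, so the raw objects differ.
print(serial == parallel)
# After ordering, the page order no longer matters.
print(ordered(serial) == ordered(parallel))
```

Note that ordered sorts every list, including the inner "features" lists in this sketch, so two vectors holding the same values in a different order would also compare as equal. That is fine for a quick sanity check but worth keeping in mind if element order inside a vector is meaningful.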

Edit: Credit for the ordered function goes to this Stack Overflow post.
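One limitation of this kind of ordered function: in Python 3, sorted raises a TypeError when a list mixes values that cannot be compared with each other (for example a number next to a string, null, or a nested object). The sketch below is one possible variant, not part of the original script, that sorts list items by their canonical JSON text instead. The name normalize and the sample data are hypothetical.

```python
import json


def normalize(obj):
    # Dicts stay dicts (dict equality already ignores key order);
    # lists are sorted by each item's canonical JSON serialization,
    # which is always a string and therefore always comparable.
    if isinstance(obj, dict):
        return {key: normalize(value) for key, value in obj.items()}
    if isinstance(obj, list):
        items = [normalize(item) for item in obj]
        return sorted(items, key=lambda item: json.dumps(item, sort_keys=True))
    return obj


# Hypothetical data: mixed-type lists that differ only in ordering.
mixed_a = {"items": [1, "one", None, {"k": [2, 1]}]}
mixed_b = {"items": [{"k": [1, 2]}, None, "one", 1]}

print(normalize(mixed_a) == normalize(mixed_b))
```

Because normalize keeps objects as objects, dumping its result with json.dump also preserves the original JSON shape. The list order it produces is lexicographic on the serialized text rather than numeric, which is enough for an equality check but not meant as a human-friendly sort.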

@biancadanforth

Also thanks to @mythmon for giving this a look over! My Python skills are quite basic. The latest revision (6) incorporates his feedback.
