Last active December 17, 2015 23:09
This is a quick command-line script that merges two JSON files. It takes the two input files and the name of an output file as arguments, loads both inputs, combines their data, removes duplicates, and writes the result to the output file. I wrote this to help me manage my personal location data tracked at [OpenPaths](https://op…
#!/usr/bin/env python
'''
This utility takes three arguments from the command line: file1, file2
and outfile. Assuming file1 and file2 are JSON files that each hold a
list of objects, it parses them, merges the two lists (removing
duplicates), then writes the merged list to outfile.
'''
import json
import sys

fn1 = sys.argv[1]
fn2 = sys.argv[2]
fnout = sys.argv[3]

# Load the first list, then append the records from the second one.
with open(fn1, 'r') as infile:
    data = json.load(infile)
with open(fn2, 'r') as infile:
    data.extend(json.load(infile))
records_read = len(data)
print(str(records_read) + ' records read.')

# Deduplicate: turn each record into a sorted tuple of (key, value) pairs so
# it is hashable and independent of key order, collect the tuples in a set,
# then convert back to dicts. This requires every value to be hashable (no
# nested lists or objects) and does not preserve the input order.
merged_data = [dict(t) for t in set(tuple(sorted(d.items())) for d in data)]

with open(fnout, 'w') as outfile:
    json.dump(merged_data, outfile, indent=4)
records_written = len(merged_data)
print(str(records_written) + ' records written.')
# No explicit close() calls are needed: each with block closes its file.
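The set-of-tuples approach in the script assumes flat records and does not keep the input order. As a hedged alternative, here is a minimal sketch of an order-preserving merge that also tolerates nested values. The helper name `merge_records` and the `lat`/`lon`/`tags` fields in the demo are hypothetical and not part of the original script; the only library call used is `json.dumps` with `sort_keys=True`.

```python
import json


def merge_records(first, second):
    """Merge two lists of JSON objects, dropping exact duplicates.

    Hypothetical helper, not part of the original script. It keeps the
    first occurrence of each record and preserves the input order.
    """
    seen = set()
    merged = []
    for record in first + second:
        # sort_keys=True makes the serialized form independent of key order,
        # and serializing also works for nested lists and objects.
        key = json.dumps(record, sort_keys=True)
        if key not in seen:
            seen.add(key)
            merged.append(record)
    return merged


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    a = [{"lat": 1.0, "lon": 2.0}, {"lat": 3.0, "lon": 4.0}]
    b = [{"lon": 2.0, "lat": 1.0}, {"lat": 5.0, "lon": 6.0, "tags": ["x"]}]
    print(json.dumps(merge_records(a, b), indent=4))
```

Using the serialized string as the deduplication key trades a little speed for generality: it costs one `json.dumps` call per record, but it avoids the hashability requirement of the tuple-based version.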