@ipl31
Created February 5, 2021 06:49
#!/bin/bash
INPUTFILE="./sample.json"
SORTED_OUTPUT="./sorted-sample.json"
MISSING_FILE="./missing-records.txt"
TEMP_FILE=$(mktemp /tmp/clean.XXXXXX)
# Prepend the value of "id" to the beginning of each line, separated by a tab,
# so we can use standard Unix CLI tools like sort. Save the result to a temp file.
jq -cr '"\(.id)\t\(.)"' "$INPUTFILE" > "$TEMP_FILE"
# Sort the ID-prefixed lines numerically by ID (plain "sort" is lexical and would
# put 10 before 2), then pipe the sorted output to "tee".
# One tee branch strips the prepended ID and tab and writes the records to SORTED_OUTPUT.
# The other is piped to awk, which prints every integer missing from the sequence of IDs.
sort -n "$TEMP_FILE" | tee >(cut -f2- > "$SORTED_OUTPUT") | awk '{for (i = p + 1; i < $1; i++) print i} {p = $1}' > "$MISSING_FILE"
# Clean up the temp file.
rm "$TEMP_FILE"
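
# Worked example of the gap-finding step: a minimal sketch, not part of the
# script above. The ids and the helper name find_missing are hypothetical,
# chosen only to show how the awk loop behaves on numerically sorted input.

```shell
#!/bin/bash
# find_missing (hypothetical helper): reads one integer id per line on stdin,
# sorts the ids numerically, and prints every integer skipped between
# consecutive ids. awk's p is unset (treated as 0) on the first line, so any
# ids between 1 and the smallest id are also reported as missing.
find_missing() {
  sort -n | awk '{for (i = p + 1; i < $1; i++) print i} {p = $1}'
}

# Sample ids, deliberately out of order and with gaps.
missing=$(printf '%s\n' 10 2 1 5 | find_missing)
echo "$missing"
```

# With these sample ids the sorted sequence is 1, 2, 5, 10, so the helper
# reports 3 and 4, then 6 through 9, one per line.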