Skip to content

Instantly share code, notes, and snippets.

@jenschurchill
Created January 16, 2024 11:25
Show Gist options
  • Save jenschurchill/bb8cdcb05d2c09e978f37acdae029eef to your computer and use it in GitHub Desktop.
Save jenschurchill/bb8cdcb05d2c09e978f37acdae029eef to your computer and use it in GitHub Desktop.
JSON to CSV
#!/usr/bin/env bash
# Bash script to convert JSON data using jq.
###############################################################################
# MIT License
#
# Copyright (c) 2024 Jens Melgard Churchill <jens@churchill.dk>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
###############################################################################
# Ensure IBM i QOpensys open source packages are on PATH...
export PATH="/QOpensys/pkgs/bin:$PATH"
err () {
>&2 printf 'Error %4s: %s\n' "$SYNTAX_ERROR" "$*"
}
info () {
if [ $SILENT -eq 0 ]
then
>&2 printf '%s\n' "$*"
fi
}
OPTSTRING=":hi:o:rs"
HELP=
dsp_help () {
if [ -z "$HELP" ]
then
info ''
info 'Parse JSON to CSV, using jq'
info ''
info 'Use -h to display this help screen.'
info 'Use -i "filepath" to specify JSON input file, or direct data to std. in'
info 'Use -o "filepath" to specify CSV output file, or direct data to std. out'
info 'Use -r to replace output file, if it exists.'
info 'Use -s to silence informational output.'
info ''
info 'Script supports single object or array of objects'
info 'Script will produce a header from the first object, and traverse nested objects'
info ''
info 'Syntax example:'
info "$0 -i 'data.json' -o 'data.csv' -r"
info ''
fi
echo "1"
}
json_to_csv () {
# If input root is a array of objects, including handling nested objects.
JQ_STMT_1='(. | map(leaf_paths) | unique) as $cols | map (. as $row | ($cols | map(. as $col | $row | getpath($col)))) as $rows | ([($cols | map(. | map(tostring) | join(".")))] + $rows) | map(@csv) | .[]'
# If input root is a single object, wrap it as an array...
JQ_STMT_2="[.] | $JQ_STMT_1"
jq -r "if type==\"array\" then $JQ_STMT_1 else $JQ_STMT_2 end" <&0
}
INPUT_FILE=
OUTPUT_FILE=
REPLACE_OUTPUT_FILE=0
SILENT=0
SYNTAX_ERROR=0
if [ $# -eq 0 ]
then
SYNTAX_ERROR=-1
HELP=$(dsp_help)
fi
# Setup using input arguments...
while getopts ${OPTSTRING} OPT
do
case ${OPT} in
h)
SYNTAX_ERROR=-2
HELP=$(dsp_help)
;;
i)
INPUT_FILE="${OPTARG}"
;;
o)
OUTPUT_FILE="${OPTARG}"
;;
r)
REPLACE_OUTPUT_FILE=1
;;
s)
SILENT=1
;;
:)
SYNTAX_ERROR=255
HELP=$(dsp_help)
err "Option -${OPTARG} requires an argument"
;;
?)
SYNTAX_ERROR=1
HELP=$(dsp_help)
err "Invalid option: -${OPTARG}"
;;
esac
done
# Confirm jq is installed and on PATH...
if ! command jq --version &>/dev/null; then
SYNTAX_ERROR=10
err "Required open source command jq not found, please install using yum install jq"
fi
# If we have a regular std. in,
# attempt to use positional arguments for input-file, if unset,
# and validate afterwards...
if [ -t 0 ]
then
for ARG in "$@"
do
[ -n "$INPUT_FILE" ] && break
if [[ "$ARG" != -* ]] && [ -f "$ARG" ]
then
INPUT_FILE="$ARG"
fi
done
# Std. in is not being redirected,
# and input-file does not exist...
if ! [ -f "$INPUT_FILE" ]
then
if [ -n "$INPUT_FILE" ]
then
SYNTAX_ERROR=100
err "Input JSON file \"$INPUT_FILE\" not found, please review input argument -i"
else
SYNTAX_ERROR=110
err "Input JSON file not specified, please review input argument -i"
fi
fi
fi
# If we have a regular std. out, and input-file is set,
# attempt to use positional arguments for output-file, if unset,
# and validate afterwards...
if [ -t 1 ] && [ -f "$INPUT_FILE" ]
then
for ARG in "$@"
do
[ -n "$OUTPUT_FILE" ] && break
if [[ "$ARG" != -* ]] && [ "$ARG" != "$INPUT_FILE" ]
then
OUTPUT_FILE="$ARG"
fi
done
fi
# Std. out is not being redirected,
# use specified output-file and replace it if so instructed...
if [ -t 1 ]
then
if [ -f "$OUTPUT_FILE" ] && [ 0 -eq "$REPLACE_OUTPUT_FILE" ]
then
SYNTAX_ERROR=200
err "Output CSV file \"$OUTPUT_FILE\" exists, please review output argument -o or specify -r to replace"
elif [ -n "$OUTPUT_FILE" ]
then
exec >"$OUTPUT_FILE"
fi
# Fail with error that output-file is specified, but std. out is being redirected...
elif [ -n "$OUTPUT_FILE" ]
then
SYNTAX_ERROR=210
err "Output file declared, but std. out is being redirected."
fi
info "Command: $0 $*"
info "Input: ${INPUT_FILE:--stdin-}"
info "Output: ${OUTPUT_FILE:--stdout-}"
# Are there any detected errors?
# And should they trigger a none zero exit code?
# Below zero is warnings, above is errors...
if [ 0 -ne $SYNTAX_ERROR ]
then
info "Aborted."
if [ 0 -gt $SYNTAX_ERROR ]
then
exit 0
else
exit $SYNTAX_ERROR
fi
fi
info "Converting..."
# Run conversion, either from input-file or std. in...
if [ -t 0 ]
then
json_to_csv <"$INPUT_FILE"
else
json_to_csv <&0
fi
@jenschurchill
Copy link
Author

jenschurchill commented Jan 16, 2024

This script converts arbitrary JSON into CSV

  1. It does this, by using the jq command
  2. It can receive and output data both using stdin and stdout, or using input and output file arguments.
  3. It should be compatible with QShell on IBM itm

Type json2csv.bash -h to view help.

Parse JSON to CSV, using jq

Use -h to display this help screen.
Use -i "filepath" to specify JSON input file, or direct data to std. in
Use -o "filepath" to specify CSV output file, or direct data to std. out
Use -r to replace output file, if it exists.
Use -s to silence informational output.

Script supports single object or array of objects
Script will produce a header from the first object, and traverse nested objects

Syntax example:
./json2csv.bash -i 'data.json' -o 'data.csv' -r

Error  110: Input JSON file not specified, please review input argument -i
Command: ./json2csv.bash -h
Input:   -stdin-
Output:  -stdout-
Aborted.

Here are some examples, they all use the -s option, to suppress verbose output,
note that this verbose output is sent to stderr, so piping stdout should only ever produce result data if conversion succeeds...

Convert single object...

command:
./json2csv.bash -si data/object.json

data:

{
    "firstname": "Jens",
    "lastname": "Churchill"
}

output:

"firstname","lastname"
"Jens","Churchill"

Convert array of objects...

command:
./json2csv.bash -si data/array.json

data:

[
{
    "firstname": "Jens",
    "lastname": "Churchill"
},
{
    "firstname": "John",
    "lastname": "Doe"
}
]

output:

"firstname","lastname"
"Jens","Churchill"
"John","Doe"

Convert nested object...

command:
./json2csv.bash -si data/nested_object.json

data:

{
    "firstname": "Jens",
    "lastname": "Churchill",
    "info": {
        "title": "Author",
        "commits": 1
    }
}

output:

"firstname","info.commits","info.title","lastname"
"Jens",1,"Author","Churchill"

Convert array of nested objects...

command:
./json2csv.bash -si data/array_nested_objects.json

data:

[
{
    "firstname": "Jens",
    "lastname": "Churchill",
    "info": {
        "title": "Author",
        "commits": 1
    }
},
{
    "firstname": "John",
    "lastname": "Doe",
    "info": {
        "title": "Contributer",
        "commits": 0
    }
}
]

output:

"firstname","info.commits","info.title","lastname"
"Jens",1,"Author","Churchill"
"John",0,"Contributer","Doe"

Convert array of nested objects, with incomplete data...

command:
./json2csv.bash -si data/array_nested_objects_incomplete.json

data:

[
{
    "firstname": "Jens",
    "lastname": "Churchill",
    "info": {
        "title": "Author",
        "commits": 1
    }
},
{
    "firstname": "John",
    "lastname": "Doe"
}
]

output:

"firstname","info.commits","info.title","lastname"
"Jens",1,"Author","Churchill"
"John",,,"Doe"

IBM i is registered trademark of the IBM Corporation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment