@bzerangue
Last active January 31, 2024 20:57
JSON to NDJSON

NDJSON is a convenient format for storing or streaming structured data that may be processed one record at a time.

  • Each line is a valid JSON value
  • Line separator is ‘\n’

1. Convert JSON to NDJSON

jq -c '.[]' test.json > testNDJSON.json

This one-liner reads test.json, which must contain a top-level JSON array, and writes each array element as one compact line to testNDJSON.json.
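To see the conversion end to end, here is a minimal sketch, assuming jq is installed. The sample records and the temporary directory are made up for illustration; only the jq filter comes from the gist.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)

# Hypothetical sample input: one JSON array holding three records.
cat > "$workdir/test.json" <<'EOF'
[
  {"id": 1, "name": "alpha"},
  {"id": 2, "name": "beta"},
  {"id": 3, "name": "gamma"}
]
EOF

# .[] emits each array element; -c prints each one compactly on its own line.
jq -c '.[]' "$workdir/test.json" > "$workdir/testNDJSON.json"

cat "$workdir/testNDJSON.json"
# {"id":1,"name":"alpha"}
# {"id":2,"name":"beta"}
# {"id":3,"name":"gamma"}
```

If the top-level value is an object rather than an array, `.[]` iterates over its values instead, so the filter needs to select the array first.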

Note: jq is a lightweight and flexible command-line JSON processor.
https://stedolan.github.io/jq/

Source: https://medium.com/datadriveninvestor/json-parsing-error-how-to-load-json-into-bigquery-successfully-using-ndjson-2b7d94616bcb

@m9aertner

This is useful as a stepping stone for creating input data for the Elasticsearch bulk API.
A concrete example:

$ jq -c '.a | .[]' <<END
{
    "a": [
        {
            "a1": 1
        },
        {
            "a2": 2
        }
    ]
}
END

Output:

{"a1":1}
{"a2":2}

UweW commented Nov 4, 2021

I also have an issue with my JSON data for Elasticsearch.
My challenge is that I need something similar to @m9aertner's example, but my nesting goes one level deeper.

{
   "x": {
        "a": [
            {
                "a1": 1
            },
            {
                "a2": 2
            }
        ]
    }
}

should result in:

{"x":{"a1":1}}
{"x":{"a2":2}}

@m9aertner

@UweW try

$ jq -c 'to_entries[] | { (.key) : (.value | .[] | .[]) }' <<<'{ "x": { "a": [ { "a1": 1 }, { "a2": 2 } ] } }'
{"x":{"a1":1}}
{"x":{"a2":2}}

draxil commented Aug 9, 2022

jq can choke on very large files, and be slow. For these situations I made json2nd.
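For readers who want to stay with jq, its streaming parser is one option to try before reaching for another tool. The sketch below assumes a top-level array whose elements are objects, and uses a tiny made-up file in place of a genuinely large one. Streaming mainly lowers memory use; it is not claimed here to make jq faster.

```shell
# Tiny stand-in for a large file: a top-level array of objects.
workdir=$(mktemp -d)
printf '%s\n' '[{"a1":1},{"a2":2}]' > "$workdir/big.json"

# --stream makes jq emit path/value events instead of loading the whole
# document; truncate_stream strips the outer array index from each path,
# and fromstream rebuilds one value per array element.
# -n is required so that `inputs` sees every event.
jq -cn --stream 'fromstream(1 | truncate_stream(inputs))' \
  "$workdir/big.json" > "$workdir/big.ndjson"

cat "$workdir/big.ndjson"
# {"a1":1}
# {"a2":2}
```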

@bzerangue
Author

> jq can choke on very large files, and be slow. For these situations I made json2nd.

Thanks for sharing @draxil !
