Convert from json array to jsonlines using jq and python
# Run this from the bash command prompt. Make sure that jq is installed: https://github.com/stedolan/jq/wiki/Installation
# json_temp.json holds an array of the form [{...}, {...}, {...}]; this converts it to {...}\n{...}\n
jq -c '.[]' json_temp.json > json_temp.jsonl
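# From within Python you can do this. First install the library from the shell:
#   pip install jsonlines
import json
import jsonlines

# my_data is a string containing the JSON array; file_name is the output path
json_array = json.loads(my_data)

# Write to jsonlines file:
with open(file_name, 'wb') as f:
    writer = jsonlines.Writer(f)
    writer.write_all(json_array)
    writer.close()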
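
If installing jsonlines is not an option, the same conversion can be sketched with only the standard library. This is a minimal sketch, not part of the original gist: the function name `json_array_to_jsonl` and its path arguments are hypothetical, and like the snippet above it loads the whole array into memory.

```python
import json


def json_array_to_jsonl(src_path, dst_path):
    """Read a top-level JSON array and write one compact JSON value per line.

    Hypothetical helper for illustration; returns the number of records written.
    """
    with open(src_path, "r", encoding="utf-8") as src:
        records = json.load(src)  # the entire array is held in memory here
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # separators=(",", ":") gives compact output, similar to jq -c
            dst.write(json.dumps(record, separators=(",", ":"), ensure_ascii=False))
            dst.write("\n")
    return len(records)
```

Calling `json_array_to_jsonl("json_temp.json", "json_temp.jsonl")` would mirror the file names used in the jq command.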
@peregilk

Struggled with the same issue and came across this post. This works perfectly. However, it reads the entire JSON array into memory before it starts writing to the file, and my files were simply too large for that. This alternative command does the same thing without reading everything into memory:

jq -cn --stream 'fromstream(1|truncate_stream(inputs))' json_temp.json > json_temp.jsonl
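
The same streaming idea can be sketched in Python with only the standard library, for cases where jq is not available. This is an illustrative sketch under stated assumptions, not what jq does internally: `iter_json_array` and `convert_streaming` are hypothetical names, the input is assumed to be a top-level JSON array, and the parser is lenient rather than a strict validator. It reads the file in chunks and uses `json.JSONDecoder.raw_decode` to pull one element at a time, so memory holds roughly one element plus one chunk.

```python
import json


def iter_json_array(src, chunk_size=65536):
    """Yield the elements of a top-level JSON array from a text file object."""
    decoder = json.JSONDecoder()

    # Read until the first non-whitespace character, which must be "[".
    buf = ""
    while not buf:
        chunk = src.read(chunk_size)
        if not chunk:
            raise ValueError("empty input")
        buf = chunk.lstrip()
    if not buf.startswith("["):
        raise ValueError("expected a top-level JSON array")
    buf = buf[1:]

    while True:
        buf = buf.lstrip()
        if buf.startswith("]"):
            return  # end of the array
        if buf.startswith(","):
            buf = buf[1:].lstrip()  # skip the separator between elements
        try:
            value, end = decoder.raw_decode(buf)
            rest = buf[end:].lstrip()
            # Accept the value only once its terminator is visible; otherwise
            # a number such as 123 could be cut short at a chunk boundary.
            complete = rest[:1] in (",", "]")
        except json.JSONDecodeError:
            complete = False  # the element is not fully buffered yet
        if complete:
            yield value
            buf = rest
            continue
        chunk = src.read(chunk_size)
        if not chunk:
            raise ValueError("truncated or malformed JSON array")
        buf += chunk


def convert_streaming(src_path, dst_path):
    """Write each array element of src_path as one line of dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for record in iter_json_array(src):
            dst.write(json.dumps(record, separators=(",", ":")) + "\n")
```

Calling `convert_streaming("json_temp.json", "json_temp.jsonl")` would mirror the jq command. One trade-off of this sketch: an element larger than the chunk size is re-parsed each time a new chunk arrives, so very large individual elements will be slow.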
