@michael-simons
Created March 23, 2022 11:37
Filtering contents of fileB.json with information from fileA.json

The goal was to find the authors of a book in fileA.json and look up their details in fileB.json.

See:

https://twitter.com/JMHReif/status/1503851216740171784?s=20&t=4D2LUKLH9Kh5hCuTSxRlbw

https://twitter.com/JMHReif/status/1503851872096067585?s=20&t=4D2LUKLH9Kh5hCuTSxRlbw

Goal: create an array of author ids from fileA.json. There are different approaches, depending on what you filter fileA.json by:

Selecting by book id can be done like this:

jq -c '[ .[] | select(.book_id == 789 or .book_id == 7) | .authors[].author_id ] | unique' fileA.json

This statement means: "Create an array by iterating the array of books, selecting all books with either of the given ids, mapping them to their authors' ids, and applying the unique filter to the resulting array." Book ids are numbers in fileA.json, so they are compared without quotes.

Result is:

["abc","cde","x"]

Filtering by author is easier:

jq -c '[ .[].authors[] | select(.name == "A1") | .author_id ] | unique' fileA.json

Create an array of author ids by iterating the books, selecting each book's authors, filtering them by name, and then getting the id. unique removes the duplicates, so the output is only:

["abc"]
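
If the author name should not be hardcoded in the jq program, it can be passed in with --arg. The following is a minimal sketch under a few assumptions: the shell variables name, books and ids are hypothetical, and the data is a shortened inline copy of fileA.json so that the snippet runs on its own.

```shell
# Hypothetical variable: the author name to look up.
name="A1"

# Shortened inline copy of the fileA.json sample data (an assumption made
# so that this snippet does not depend on a file on disk).
books='[
  {"book_id": 123, "authors": [{"author_id": "abc", "name": "A1"}]},
  {"book_id": 456, "authors": [{"author_id": "cde", "name": "A2"}]},
  {"book_id": 789, "authors": [{"author_id": "abc", "name": "A1"}, {"author_id": "x", "name": "A3"}]}
]'

# --arg binds $name as a JSON string inside the jq program, so no shell
# quoting has to be spliced into the filter itself.
ids=$(printf '%s' "$books" | jq -c --arg name "$name" \
  '[ .[].authors[] | select(.name == $name) | .author_id ] | unique')

echo "$ids"   # ["abc"]
```

Keeping the name outside the program also avoids quoting problems when the name contains spaces or quotes.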

The compact output can be passed to another jq call. Note that I use --argjson so that jq interprets the value as a JSON array rather than as a string. The ids are matched with IN instead of inside, because inside compares strings by substring, so the id "c" in fileB.json would also match "abc":

jq --argjson ids "$(jq -c '[ .[] | select(.book_id == 789 or .book_id == 7) | .authors[].author_id ] | unique' fileA.json)" \
'.[] | select(.author_id | IN($ids[]))' fileB.json

or, when filtering by author name:

jq --argjson ids "$(jq -c '[ .[].authors[] | select(.name == "A1") | .author_id ] | unique' fileA.json)" \
'.[] | select(.author_id | IN($ids[]))' fileB.json
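
The two steps can probably also be folded into a single jq invocation by loading fileA.json with --slurpfile. This is a sketch rather than the original approach: the temporary directory, the variable names (workdir, matches, substring_hit) and the $books binding are assumptions, and the book id 007 is written as 7 because strict JSON does not allow leading zeros. It matches ids by equality with IN, and the last command shows why a substring-based test such as inside is risky for ids like "c".

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Sample data mirroring fileA.json (books) and fileB.json (authors).
cat > "$workdir/fileA.json" <<'EOF'
[
{"book_id": 123, "authors": [{"author_id": "abc", "name": "A1"}]},
{"book_id": 456, "authors": [{"author_id": "cde", "name": "A2"}]},
{"book_id": 789, "authors": [{"author_id": "abc", "name": "A1"}, {"author_id": "x", "name": "A3"}]},
{"book_id": 7, "authors": [{"author_id": "cde", "name": "A2"}, {"author_id": "x", "name": "A3"}]}
]
EOF
cat > "$workdir/fileB.json" <<'EOF'
[
{"author_id": "abc", "name": "A1"},
{"author_id": "x", "name": "A2"},
{"author_id": "c", "name": "A3"}
]
EOF

# --slurpfile binds $books to an array holding every JSON value found in
# the file, so the array of books itself is $books[0].
matches=$(jq -c --slurpfile books "$workdir/fileA.json" '
  ($books[0]
   | [ .[] | select(.book_id == 789 or .book_id == 7) | .authors[].author_id ]
   | unique) as $ids
  | .[]
  | select(.author_id | IN($ids[]))
' "$workdir/fileB.json")

printf '%s\n' "$matches"

# inside/contains compare strings by substring: "c" is found inside "abc".
substring_hit=$(jq -n -c '["c"] | inside(["abc"])')
echo "$substring_hit"   # true
```

With the sample data, only the authors with the ids "abc" and "x" are printed; the author with the id "c" is not, although "c" is a substring of both "abc" and "cde".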

fileA.json:

[
{"book_id": 123, "authors": [{"author_id": "abc", "name": "A1"}]},
{"book_id": 456, "authors": [{"author_id": "cde", "name": "A2"}]},
{"book_id": 789, "authors": [{"author_id": "abc", "name": "A1"}, {"author_id": "x", "name": "A3"}]},
{"book_id": 7, "authors": [{"author_id": "cde", "name": "A2"}, {"author_id": "x", "name": "A3"}]}
]

fileB.json:

[
{"author_id": "abc", "name": "A1"},
{"author_id": "x", "name": "A2"},
{"author_id": "c", "name": "A3"}
]