@llimllib
Last active May 4, 2021 18:53

llimllib's five minute guide to jq

Let's say somebody just asked us to make a frontend for the CFPB's complaint search API, but we've never used it. The first thing we do is just call it, and see a giant blob:

$ curl "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test"
{"_scroll_id":"cXVlcnlUaGVuRmV0Y2g7NTs3Mjg1ODU6ZzRIUGF4VEZScjY5Ym1fNG1HRWc3Zzs3Mjg1ODY6ZzRIUGF4VEZScjY5Ym1fNG1HRWc3Zzs2MDg4OTM6RU85cXdTM2tRa2FEbjhUVkdZamwwdzs2MDg4OTI6RU85cXdTM2tRa2FEbjhUVkdZamwwdzs2MDU2NTU6NlFSZXU3eTdTWkNhMkNsQm1mclNFUTswOw==","took":11,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1871,"max_score":0.0,"hits":[]},"aggregations":{"product":{"doc_count":1871,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":614,"buckets":[{"key":"Credit reporting, credit repair services, or other personal consumer reports","doc_count":373,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"Credit reporting","doc_count":354},{"key":"Other personal consumer report","doc_count":19}]}},{"key":"Debt collection","doc_count":371,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":10,"buckets":[{"key":"Medical debt","doc_count":183},{"key":"Medical","doc_count":63},{"key":"Other debt","doc_count":43},{"key":"Credit card debt","doc_count":25},{"key":"Auto debt","doc_count":11},{"key":"I do not know","doc_count":10},{"key":"Other (i.e. 
phone, health club, etc.)","doc_count":8},{"key":"Private student loan debt","doc_count":8},{"key":"Credit card","doc_count":5},{"key":"Mortgage debt","doc_count":5}]}},{"key":"Mortgage","doc_count":203,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"Conventional home mortgage","doc_count":67},{"key":"FHA mortgage","doc_count":37},{"key":"Conventional fixed mortgage","doc_count":32},{"key":"VA mortgage","doc_count":21},{"key":"Conventional adjustable mortgage (ARM)","doc_count":17},{"key":"Other mortgage","doc_count":12},{"key":"Other type of mortgage","doc_count":7},{"key":"Home equity loan or line of credit (HELOC)","doc_count":4},{"key":"Home equity loan or line of credit","doc_count":3},{"key":"Reverse mortgage","doc_count":3}]}},{"key":"Credit card or prepaid card","doc_count":155,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"General-purpose credit card or charge card","doc_count":122},{"key":"Store credit card","doc_count":16},{"key":"General-purpose prepaid card","doc_count":7},{"key":"Government benefit card","doc_count":7},{"key":"Payroll card","doc_count":2},{"key":"Gift card","doc_count":1}]}},{"key":"Vehicle loan or lease","doc_count":155,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"Loan","doc_count":118},{"key":"Lease","doc_count":37}]}}]}},"issue":{"doc_count":1871,"issue":{"doc_count_error_upper_bound":23,"sum_other_doc_count":1332,"buckets":[{"key":"Incorrect information on your report","doc_count":158,"issue":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"Information belongs to someone else","doc_count":67},{"key":"Account status incorrect","doc_count":47},{"key":"Account information incorrect","doc_count":30},{"key":"Personal information incorrect","doc_count":5},{"key":"Public record information inaccurate","doc_count":4},{"key":"Information is incorrect","doc_count":3},{"key":"Information is missing 
that should be on the report","doc_count":1},{"key":"Old information reappears or never goes away","doc_count":1}]}},{"key":"Attempts to collect debt not owed","doc_count":119,"issue":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"Debt is not yours","doc_count":64},{"key":"Debt was paid","doc_count":37},{"key":"Debt was result of identity theft","doc_count":14},{"key":"Debt was already discharged in bankruptcy and is no longer owed","doc_count":4}]}},{"key":"Problem with a credit reporting company's investigation into an existing problem","doc_count":92,"issue":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"Their investigation did not fix an error on your report","doc_count":54},{"key":"Problem with personal statement of dispute","doc_count":13},{"key":"Investigation took more than 30 days","doc_count":10},{"key":"Difficulty submitting a dispute or getting information about a dispute over the phone","doc_count":8},{"key":"Was not notified of investigation status or results","doc_count":4}]}},{"key":"Improper use of your report","doc_count":85,"issue":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"Credit inquiries on your report that you don't recognize","doc_count":50},{"key":"Reporting company used your report improperly","doc_count":29},{"key":"Received unsolicited financial product or insurance offers after opting out","doc_count":4},{"key":"Report provided to employer without your written authorization","doc_count":2}]}},{"key":"Problem with a purchase shown on your statement","doc_count":85,"issue":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"Credit card company isn't resolving a dispute about a purchase on your statement","doc_count":65},{"key":"Card was charged for something you did not purchase with the card","doc_count":17},{"key":"Overcharged for something you did purchase with the 
card","doc_count":3}]}}]}},"state":{"doc_count":1871,"state":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"CA","doc_count":279,"product":{"doc_count_error_upper_bound":7,"sum_other_doc_count":224,"buckets":[{"key":"Credit reporting, credit repair services, or other personal consumer reports","doc_count":55}]},"issue":{"doc_count_error_upper_bound":9,"sum_other_doc_count":256,"buckets":[{"key":"Problem with a credit reporting company's investigation into an existing problem","doc_count":23}]}},{"key":"FL","doc_count":198,"product":{"doc_count_error_upper_bound":5,"sum_other_doc_count":149,"buckets":[{"key":"Debt collection","doc_count":49}]},"issue":{"doc_count_error_upper_bound":6,"sum_other_doc_count":183,"buckets":[{"key":"Attempts to collect debt not owed","doc_count":15}]}},{"key":"TX","doc_count":170,"product":{"doc_count_error_upper_bound":5,"sum_other_doc_count":131,"buckets":[{"key":"Credit reporting, credit repair services, or other personal consumer reports","doc_count":39}]},"issue":{"doc_count_error_upper_bound":5,"sum_other_doc_count":147,"buckets":[{"key":"Incorrect information on your report","doc_count":23}]}},{"key":"GA","doc_count":114,"product":{"doc_count_error_upper_bound":3,"sum_other_doc_count":83,"buckets":[{"key":"Credit reporting, credit repair services, or other personal consumer reports","doc_count":31}]},"issue":{"doc_count_error_upper_bound":5,"sum_other_doc_count":89,"buckets":[{"key":"Incorrect information on your report","doc_count":25}]}},{"key":"NY","doc_count":111,"product":{"doc_count_error_upper_bound":1,"sum_other_doc_count":89,"buckets":[{"key":"Debt collection","doc_count":22}]},"issue":{"doc_count_error_upper_bound":5,"sum_other_doc_count":100,"buckets":[{"key":"Incorrect information on your report","doc_count":11}]}},{"key":"IL","doc_count":72,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":54,"buckets":[{"key":"Credit reporting, credit repair services, or other personal 
consumer reports","doc_count":18}]},"issue":{"doc_count_error_upper_bound":4,"sum_other_doc_count":64,"buckets":[{"key":"Managing an account","doc_count":8}]}},{"key":"NJ","doc_count":60,"product":{"doc_count_error_upper_bound":1,"sum_other_doc_count":47,"buckets":[{"key":"Debt collection","doc_count":13}]},"issue":{"doc_count_error_upper_bound":2,"sum_other_doc_count":54,"buckets":[{"key":"Managing an account","doc_count":6}]}},{"key":"PA","doc_count":54,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":41,"buckets":[{"key":"Debt collection","doc_count":13}]},"issue":{"doc_count_error_upper_bound":3,"sum_other_doc_count":50,"buckets":[{"key":"Getting a loan or lease","doc_count":4}]}},{"key":"NC","doc_count":53,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":37,"buckets":[{"key":"Debt collection","doc_count":16}]},"issue":{"doc_count_error_upper_bound":2,"sum_other_doc_count":46,"buckets":[{"key":"Attempts to collect debt not owed","doc_count":7}]}},{"key":"MD","doc_count":52,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":43,"buckets":[{"key":"Credit card or prepaid card","doc_count":9}]},"issue":{"doc_count_error_upper_bound":2,"sum_other_doc_count":46,"buckets":[{"key":"Incorrect information on credit report","doc_count":6}]}},{"key":"VA","doc_count":49,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":36,"buckets":[{"key":"Mortgage","doc_count":13}]},"issue":{"doc_count_error_upper_bound":1,"sum_other_doc_count":44,"buckets":[{"key":"Improper use of your report","doc_count":5}]}},{"key":"OH","doc_count":44,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":35,"buckets":[{"key":"Credit reporting, credit repair services, or other personal consumer reports","doc_count":9}]},"issue":{"doc_count_error_upper_bound":2,"sum_other_doc_count":38,"buckets":[{"key":"Incorrect information on credit 
report","doc_count":6}]}},{"key":"WA","doc_count":44,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":35,"buckets":[{"key":"Debt collection","doc_count":9}]},"issue":{"doc_count_error_upper_bound":0,"sum_other_doc_count":40,"buckets":[{"key":"Attempts to collect debt not owed","doc_count":4}]}},{"key":"MA","doc_count":39,"product":{"doc_count_error_upper_bound":0,"sum_other_doc_count":28,"buckets":[{"key":"Credit reporting, credit repair services, or other personal consumer reports","doc_count":11}]},"issue":{"doc_count_error_upper_bound":0,"sum_other_doc_count":35,"buckets":[{"key":"Incorrect information on your report","doc_count":4}]}},{"key":"AZ","doc_count":38,"product":{"doc_count_error_upper_bound":1,"sum_other_doc_count":32,"buckets":[{"key":"Mortgage","doc_count":6}]},"issue":{"doc_count_error_upper_bound":2,"sum_other_doc_count":33,"buckets":[{"key":"Managing an account","doc_count":5}]}},{"key":"TN","doc_count":34,"product":{"doc_count_error_upper_bound":0,"sum_..........

That's a lot of data! What's in there? Let's use jq to help us figure it out!

Just piping to jq will pretty-print the object:

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq | head
{
  "_scroll_id": "cXVlcnlUaGVuRmV0Y2g7NTs3Mjg1ODU6ZzRIUGF4VEZScjY5Ym1fNG1HRWc3Zzs3Mjg1ODY6ZzRIUGF4VEZScjY5Ym1fNG1HRWc3Zzs2MDg4OTM6RU85cXdTM2tRa2FEbjhUVkdZamwwdzs2MDg4OTI6RU85cXdTM2tRa2FEbjhUVkdZamwwdzs2MDU2NTU6NlFSZXU3eTdTWkNhMkNsQm1mclNFUTswOw==",
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {

Now we can start to see what the object looks like, but we don't have the full picture. Let's ask what the keys are in the top-level object. To do so, we'll call the jq keys function:

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq 'keys'
[
  "_scroll_id",
  "_shards",
  "aggregations",
  "hits",
  "timed_out",
  "took"
]

(The -s flag to curl just tells it to run silently, so it doesn't print its progress meter to the console.)
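If you'd like to try keys without hitting the network, here's a minimal sketch. The sample JSON is a made-up stand-in for the API response, not real CFPB data; only the shape matters.

```shell
# Hypothetical stand-in for the API response (not real CFPB output).
sample='{"took": 11, "hits": {"total": 3}, "aggregations": {}}'

# keys returns the names of the object's keys as a sorted array.
# -c asks jq for compact, single-line output.
top_keys=$(printf '%s' "$sample" | jq -c 'keys')
echo "$top_keys"
```

Note that keys sorts its output, so the order you see is not necessarily the order the keys appear in the response.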

Let's say that hits looks interesting to us, and we want to know what keys are in that object - let's use keys again:

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq '.hits | keys'
[
  "hits",
  "max_score",
  "total"
]

Huh, I bet hits might represent the responses to our search query. Let's call .hits to print out the whole object.

(details: . represents the current node - since we haven't done anything it's the root, and hits is the key we want to view. If we knew that there were an object foo with a key bar in the hits object, we could refer to bar as .hits.foo.bar: root -> hits -> foo -> bar)
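Here's a small sketch of that path syntax. The foo and bar keys are the hypothetical ones from the note above, and the sample JSON is invented for illustration.

```shell
# Invented sample with a hypothetical foo object inside hits.
sample='{"hits": {"total": 1871, "foo": {"bar": "baz"}}}'

# Walk root -> hits -> foo -> bar.
bar=$(printf '%s' "$sample" | jq '.hits.foo.bar')
echo "$bar"

# A path through a key that doesn't exist yields null rather than an error.
missing=$(printf '%s' "$sample" | jq '.hits.nope.bar')
echo "$missing"
```

The fact that a missing key gives you null instead of failing is handy when exploring, but it also means a typo in a path fails quietly.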

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq '.hits'
{
  "total": 1871,
  "max_score": 0,
  "hits": []
}

Dang, that wasn't what we were looking for. How about .aggregations?

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq '.aggregations'
{
  "product": {
    "doc_count": 1871,
    "product": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 614,
      "buckets": [
        {
          "key": "Credit reporting, credit repair services, or other personal consumer reports",
          "doc_count": 373,
          "product": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
...and many more lines

Oh no! We printed way too many lines. This is something that will happen to you often when you use jq; my recommendation is to pipe the result to head or less.
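As a rough sketch of that habit, using an invented nested object in place of the real response:

```shell
# Invented nested sample standing in for a large response.
sample='{"a": {"b": {"c": [1, 2, 3]}}}'

# Pretty-print, but keep only the first three lines.
preview=$(printf '%s' "$sample" | jq '.' | head -n 3)
echo "$preview"

# For interactive browsing, something like this keeps jq's colors in the pager:
#   ... | jq -C '.' | less -R
```

Piping to head is good for a quick peek; less is nicer when you want to scroll around or search.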

Anyway, we can see that a few levels down, there is a buckets list, containing objects that each have a key field. I wonder what key values the bucket objects have?

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq '.aggregations.product.product.buckets | map(.key)'
[
  "Credit reporting, credit repair services, or other personal consumer reports",
  "Debt collection",
  "Mortgage",
  "Credit card or prepaid card",
  "Vehicle loan or lease"
]

This time we've introduced a jq pipe: we grabbed the node at the path .aggregations.product.product.buckets and piped it to the map function.

We can read .aggregations.product.product.buckets | map(.key) as "get the list located at .aggregations.product.product.buckets, pipe it to the map function, and for each object get the key value".
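To make the pipe-plus-map idea concrete, here's a sketch on a tiny invented buckets list. It also shows the longhand form that the jq manual gives as equivalent to map.

```shell
# Invented two-element list shaped like the API's buckets.
buckets='[{"key": "Mortgage", "doc_count": 203}, {"key": "Debt collection", "doc_count": 371}]'

# map(.key) applies .key to every element and collects the results in an array.
with_map=$(printf '%s' "$buckets" | jq -c 'map(.key)')

# The same thing spelled out: iterate with .[], pipe each element to .key,
# and wrap the results in [ ] to build an array.
with_iter=$(printf '%s' "$buckets" | jq -c '[.[] | .key]')

echo "$with_map"
echo "$with_iter"
```

Both commands print the same array; map is just the shorter way to write it.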

To see if there's interesting data in there, we might also ask ourselves what the first bucket looks like; we can use normal JavaScript-style array indexing to get element [0] of the list:

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq '.aggregations.product.product.buckets[0]'
{
  "key": "Credit reporting, credit repair services, or other personal consumer reports",
  "doc_count": 373,
  "product": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "Credit reporting",
        "doc_count": 354
      },
      {
        "key": "Other personal consumer report",
        "doc_count": 19
      }
    ]
  }
}

OK, so the bucket contains another list of buckets and a count of documents within that bucket.

One thing we might want to build is a list of each bucket's sub-buckets. Within the map function, we could construct an object like so:

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq '.aggregations.product.product.buckets | map({
       "key": .key,
       "sub_buckets": .product.buckets | map(.key)
    })'
[
  {
    "key": "Credit reporting, credit repair services, or other personal consumer reports",
    "sub_buckets": [
      "Credit reporting",
      "Other personal consumer report"
    ]
  },
  {
    "key": "Debt collection",
    "sub_buckets": [
      "Medical debt",
      "Medical",
      "Other debt",
      "Credit card debt",
      "Auto debt",
      "I do not know",
      "Other (i.e. phone, health club, etc.)",
      "Private student loan debt",
      "Credit card",
      "Mortgage debt"
    ]
  },
  {...more stuff here

This time we see that . is relative: within the map function, . represents the current element being iterated on, not the root of the document.
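Here's a cut-down sketch of the same construction that you can run offline. The input is a single invented bucket, and I've put parentheses around the nested pipe just to make the grouping obvious.

```shell
# One invented bucket with two sub-buckets.
buckets='[{"key": "Vehicle loan or lease", "product": {"buckets": [{"key": "Loan"}, {"key": "Lease"}]}}]'

# In the outer map, . is one top-level bucket.
# In the inner map, . is one sub-bucket of that bucket.
result=$(printf '%s' "$buckets" | jq -c 'map({
  key: .key,
  sub_buckets: (.product.buckets | map(.key))
})')
echo "$result"
```

The same .key expression means two different things here, depending on which map it sits inside.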

I'm going to leave it here. I don't think we figured out how the API works, but I hope I've introduced jq as a useful tool for breaking down a JSON response into something you can start to understand and transform into useful output for some other tool.

This type of function can be very handy for transforming the output of one API into the input for another function: you use a path to find your way down the tree, then a map or reduce to create objects for some other function or API to consume.
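We didn't use reduce above, so here's a minimal sketch of it on invented data: summing doc_count across a list of buckets, first with reduce and then with the shorter map-and-add form.

```shell
# Invented list shaped like the API's buckets.
buckets='[{"key": "Loan", "doc_count": 118}, {"key": "Lease", "doc_count": 37}]'

# reduce starts from 0 and, for each element bound to $b,
# replaces the running total (.) with . + $b.doc_count.
total=$(printf '%s' "$buckets" | jq 'reduce .[] as $b (0; . + $b.doc_count)')

# The same sum, written as a map followed by the add builtin.
total_add=$(printf '%s' "$buckets" | jq 'map(.doc_count) | add')

echo "$total"
echo "$total_add"
```

For a plain sum, map plus add is easier to read; reduce earns its keep when the accumulator is something more involved, like an object you build up as you go.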

Check out the manual; there are lots more functions you can use to transform the input into the form you'd like to see.

One more thing

One utility option that I often find helpful, and want to mention before I go, is the --raw-output (or -r) flag.

Let's say we're writing a shell script and we want to use that _scroll_id identifier we saw earlier in the API output. If we call jq to get that value, it responds by printing it out as a JSON value, surrounded by quotes:

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq '._scroll_id'
"cXVlcnlUaGVuRmV0Y2g7NTs2MDYzNzY6NlFSZXU3eTdTWkNhMkNsQm1mclNFUTs2MDYzNzc6NlFSZXU3eTdTWkNhMkNsQm1mclNFUTs0ODYxNzc6VVg2cUsxQmxRQ09rY3U4SWd4WWhjZzs0ODYxNzg6VVg2cUsxQmxRQ09rY3U4SWd4WWhjZzs3MjkzMDc6ZzRIUGF4VEZScjY5Ym1fNG1HRWc3ZzswOw=="

Oftentimes, that's a bit annoying in the script - you want to get the value without quotes. That's where --raw-output comes into play:

$ curl -s "https://www.consumerfinance.gov/data-research/consumer-complaints/search/api/v1/geo/states?search_term=test" \
  | jq -r '._scroll_id'
cXVlcnlUaGVuRmV0Y2g7NTs2MDYzNzY6NlFSZXU3eTdTWkNhMkNsQm1mclNFUTs2MDYzNzc6NlFSZXU3eTdTWkNhMkNsQm1mclNFUTs0ODYxNzc6VVg2cUsxQmxRQ09rY3U4SWd4WWhjZzs0ODYxNzg6VVg2cUsxQmxRQ09rY3U4SWd4WWhjZzs3MjkzMDc6ZzRIUGF4VEZScjY5Ym1fNG1HRWc3ZzswOw==

And now we have a raw value, ready to be used as input to some other process.
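In a script, that usually means capturing the value in a variable. Here's a sketch with an invented response and a made-up scroll id, comparing the two forms side by side:

```shell
# Invented response; the scroll id is a made-up placeholder.
response='{"_scroll_id": "abc123==", "took": 11}'

# Without -r, the value comes back as a JSON string, quotes included.
quoted=$(printf '%s' "$response" | jq '._scroll_id')

# With -r, we get the bare string, which is what a script usually wants.
raw=$(printf '%s' "$response" | jq -r '._scroll_id')

echo "quoted: $quoted"
echo "raw:    $raw"
```

The quoted form would put literal quote characters into whatever you build next (a URL, a filename), which is almost never what you want.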

That's all

The basic idea of jq is that you can transform JSON from one format to another in a reasonably concise way. You can get quite far just with the tools I've mentioned in this tutorial - . paths to navigate the object, the keys function, and the map function. If you need more, there are many more tools in the manual.
