Skip to content

Instantly share code, notes, and snippets.

@quii
Created March 16, 2020 09:03
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save quii/496b1d65b0cf4827e9939be795899ef4 to your computer and use it in GitHub Desktop.
Save quii/496b1d65b0cf4827e9939be795899ef4 to your computer and use it in GitHub Desktop.
es

Aggregates

Let's say I have a funds index with 2 documents like this

{
  "fundId": 123,
  "fundManager": {
    "id": 123434,
    "name": "Chris Inc"
  },
  "sectors": ["a", "b"]
}

{
  "fundId": 456,
  "fundManager": {
    "id": 123434,
    "name": "Chris Inc"
  },
  "sectors": ["c"]
}

Notice they are two different "funds" with the same fund manager.

What we want to do is retrieve all fund managers with the sectors they manage (e.g bring all the sectors together)

So in this case I'd want for the lack of a better word a "GROUP BY" and get something returned like

{
  "fundManager": {
    "id": 123434,
    "name": "Chris Inc"
  },
  "sectors": ["a", "b", "c"]
}

Is this possible? How?

@Lexcenture
Copy link

Lexcenture commented Mar 16, 2020

Let's put these documents into Elasticsearch and allowing ES to infer the types i.e. not setting up our own mapping.

PUT deals/doc/1
{
  "fundId": 123,
  "fundManager": {
    "id": 123434,
    "name": "Chris Inc"
  },
  "sectors": ["a", "b"]
}
PUT deals/doc/1
{
  "fundId": 123,
  "fundManager": {
    "id": 123434,
    "name": "Chris Inc"
  },
  "sectors": ["a", "b"]
}

There are 2 ways you can get all the sectors for a give fund manger:

  1. The following will return all the documents (limit the fields to "sectors") and in your code you can marshal the data to your requirements.
GET deals/doc/_search
{
  "_source": ["sectors"], 
  "query": {
    "terms": {
      "fundManager.id": [123434 ]
    }
  }
}
  1. The following query will use aggregation to give you the unique sectors and count of documents for each sector, I personally prefer this method.
GET deals/doc/_search
{
  "size": 0, 
  "query": {
    "terms": {
      "fundManager.id": [123434 ]
    }
  },
  "aggs": {
    "sectors": {
      "terms": {
        "field": "sectors.keyword", //We can only aggregate on string keyword type, here default mapping creates a field called keyword automaticatlly
        "size": 1000
      }
    }
  }
}

@Lexcenture
Copy link

Lexcenture commented Mar 16, 2020

Here is the query to get all the fund managers and their related sectors:

GET deals/doc/_search
{
  "size": 0,
  "aggs": {
    "fund_managers": {
      "terms": {
        "field": "fundManager.name.keyword",
        "size": 10,
        "order": {
          "_key": "asc"
        }
      },
      "aggs": {
        "sectors": {
          "terms": {
            "field": "sectors.keyword",
            "size": 1000,
            "order": {
              "_count": "desc"
            }
          }
        }
      }
    }
  }
}

@Lexcenture
Copy link

Aggregation and returning some fieleds

GET deals/doc/_search
{
  "size": 0,
  "aggs": {
    "fund_managers": {
      "terms": {
        "field": "fundManager.name.keyword",
        "size": 100
      },
      "aggs": {
        "top_sectors": {
          "top_hits": {
            "_source": {
              "includes": [
                "fundId",
                "fundManager.id",
                "fundManager.name",
                "sectors"
              ]
            },
            "size": 10
          }
        }
      }
    }
  }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment