Elasticsearch scripted aggregation with joined fields

This script lets you run SQL GROUP BY-style aggregations over multiple fields in an Elasticsearch index by joining the field values into a single bucket key.

Performance will likely be poor on large data sets, because the script runs once for every matching document.

Save this Groovy script as <elasticsearch_dir>/config/scripts/join-param-list.groovy:

// Build one bucket key by joining each listed field's value with the delimiter.
return fields.collect { doc[it].value }.join(delimiter);
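
For readers who do not use Groovy, the script's behaviour can be sketched in Python. This is only an illustration: `join_fields` and `sample_doc` are hypothetical names, and the real script reads field values from Elasticsearch's `doc` map, not from a plain dict.

```python
def join_fields(doc, fields, delimiter):
    """Concatenate the values of `fields` from `doc` into one key string."""
    # Mirrors fields.collect { doc[it].value }.join(delimiter) in the Groovy script.
    return delimiter.join(str(doc[name]) for name in fields)


# Hypothetical document, used only to show the shape of the resulting key.
sample_doc = {"firstname": "abbott", "lastname": "smith", "employer": "acme"}
key = join_fields(sample_doc, ["firstname", "lastname", "employer"], "|")
```

The resulting key is what the terms aggregation buckets on, so two documents land in the same bucket only when all listed fields match.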

A representative query that does a "GROUP BY" to count identical first-name / last-name / employer combinations:

{
    "query": {
        "term":{"_type":"account"}
    },
    "size":1,
    "aggs": {
        "agg1": {
            "terms": {
                "script": {
                    "file": "join-param-list",
                    "lang": "groovy",
                    "params": {"fields":["firstname","lastname","employer"], "delimiter":"|" }
                }
            }
        }
    }
}
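
The query above is roughly the counterpart of `SELECT firstname, lastname, employer, COUNT(*) ... GROUP BY firstname, lastname, employer`. As a rough sketch of that grouping, assuming a small in-memory list of documents (the `group_by_count` helper and the sample data are hypothetical), the same counts can be computed in Python:

```python
from collections import Counter


def group_by_count(docs, fields, delimiter="|"):
    """Count documents per joined key, like a terms aggregation on the script."""
    keys = (delimiter.join(str(doc[name]) for name in fields) for doc in docs)
    return Counter(keys)


# Hypothetical documents standing in for the "account" type.
docs = [
    {"firstname": "abbott", "lastname": "smith", "employer": "acme"},
    {"firstname": "abbott", "lastname": "smith", "employer": "acme"},
    {"firstname": "jane", "lastname": "doe", "employer": "initech"},
]
counts = group_by_count(docs, ["firstname", "lastname", "employer"])
```

Unlike this sketch, a terms aggregation returns only the top buckets by default and reports the remainder in `sum_other_doc_count`.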

Sample aggregation output (truncated):

"aggregations": {
    "agg1": {
      "doc_count_error_upper_bound": 5,
      "sum_other_doc_count": 990,
      "buckets": [
        {
          "key": "abbott|smith|acme",
          "doc_count": 1
        },
etc
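
To get the individual field values back, each bucket key can be split on the delimiter. The following is a minimal sketch, assuming the response buckets have already been parsed into Python dicts; `split_buckets` is a hypothetical helper, and it assumes no field value contains the delimiter itself.

```python
def split_buckets(buckets, fields, delimiter="|"):
    """Turn joined bucket keys back into one row per bucket."""
    rows = []
    for bucket in buckets:
        # Assumes the delimiter never occurs inside a field value.
        values = bucket["key"].split(delimiter)
        row = dict(zip(fields, values))
        row["count"] = bucket["doc_count"]
        rows.append(row)
    return rows


# Bucket taken from the sample output above.
buckets = [{"key": "abbott|smith|acme", "doc_count": 1}]
rows = split_buckets(buckets, ["firstname", "lastname", "employer"])
```

If a field value can contain the delimiter, the split becomes ambiguous, so it helps to pick a delimiter that cannot appear in the data.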