@FrankHassanabad
Last active May 28, 2021 16:02
Elastic Queries with aliases const keyword and non-const keyword
# Index mapping with "data_stream.dataset" as a "constant_keyword" and
# "event.dataset" as an alias pointing to "data_stream.dataset"
DELETE const-keyword-frank-delme-1
PUT const-keyword-frank-delme-1
{
  "mappings": {
    "dynamic": "false",
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "message": {
        "type": "text"
      },
      "data_stream": {
        "properties": {
          "dataset": {
            "type": "constant_keyword",
            "value": "endpoint"
          }
        }
      },
      "event": {
        "properties": {
          "dataset": {
            "type": "alias",
            "path": "data_stream.dataset"
          }
        }
      }
    }
  }
}
# Mapping without data_stream, where "event.dataset" is a plain "keyword"
DELETE keyword-frank-delme-1
PUT keyword-frank-delme-1
{
  "mappings": {
    "dynamic": "false",
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "message": {
        "type": "text"
      },
      "event": {
        "properties": {
          "dataset": {
            "type": "keyword",
            "ignore_above": 1024
          }
        }
      }
    }
  }
}
# Post an example document to the keyword index
DELETE keyword-frank-delme-1/_doc/1
POST keyword-frank-delme-1/_doc/1
{
  "@timestamp": "2021-05-28T15:33:39.333Z",
  "event": {
    "dataset": "nginx"
  }
}
# Post an example document to the const keyword index. "data_stream.dataset" is
# intentionally left _off_ the source document to exercise the corner case.
# Otherwise you would add it like so:
# "data_stream": {
#   "dataset": "endpoint"
# }
DELETE const-keyword-frank-delme-1/_doc/1
POST const-keyword-frank-delme-1/_doc/1
{
  "@timestamp": "2021-05-28T15:33:39.333Z"
}
# The "fields" option resolves the alias correctly across the mixed const keyword and keyword indices
GET const-keyword-frank-delme-1,keyword-frank-delme-1/_search
{
  "fields": ["event.dataset"]
}
# Aggs return both values whether or not the const keyword value was added to _source.
# This works through the alias above as well.
GET const-keyword-frank-delme-1,keyword-frank-delme-1/_search
{
  "size": 0,
  "aggs": {
    "event_dataset": {
      "terms": {
        "field": "event.dataset"
      }
    }
  }
}
# Querying on "event.dataset" matches both documents: through the alias to the
# const keyword in one index and directly against the keyword in the other
GET const-keyword-frank-delme-1,keyword-frank-delme-1/_search
{
  "fields": ["event.dataset"],
  "query": {
    "terms": {
      "event.dataset": ["endpoint", "nginx"]
    }
  }
}
@FrankHassanabad

# The "fields" option resolves the alias correctly across the mixed const keyword and keyword indices
GET const-keyword-frank-delme-1,keyword-frank-delme-1/_search
{
  "fields" : ["event.dataset"]
}

Returns:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "const-keyword-frank-delme-1",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2021-05-28T15:33:39.333Z"
        },
        "fields" : {
          "event.dataset" : [
            "endpoint"
          ]
        }
      },
      {
        "_index" : "keyword-frank-delme-1",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2021-05-28T15:33:39.333Z",
          "event" : {
            "dataset" : "nginx"
          }
        },
        "fields" : {
          "event.dataset" : [
            "nginx"
          ]
        }
      }
    ]
  }
}

@FrankHassanabad

# Aggs return both values whether or not the const keyword value was added to _source.
# This works through the alias above as well.
GET const-keyword-frank-delme-1,keyword-frank-delme-1/_search
{
  "size": 0,
  "aggs": {
    "event_dataset": {
      "terms": {
        "field": "event.dataset"
      }
    }
  }
}

Returns:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "event_dataset" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "endpoint",
          "doc_count" : 1
        },
        {
          "key" : "nginx",
          "doc_count" : 1
        }
      ]
    }
  }
}
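
For readers consuming this response from code, here is a minimal sketch of flattening the terms buckets into a plain mapping. It assumes the response has already been parsed into a Python dict; the helper name `bucket_counts` is hypothetical and not part of any Elasticsearch client.

```python
def bucket_counts(response: dict, agg_name: str) -> dict:
    """Map each terms-aggregation bucket key to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Trimmed copy of the aggregation response shown above.
sample_response = {
    "aggregations": {
        "event_dataset": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "endpoint", "doc_count": 1},
                {"key": "nginx", "doc_count": 1},
            ],
        }
    }
}

counts = bucket_counts(sample_response, "event_dataset")
print(counts)
```

Both the constant value ("endpoint") and the indexed keyword value ("nginx") appear as keys, even though only "nginx" was ever present in a source document.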


FrankHassanabad commented May 28, 2021

# Querying on "event.dataset" matches both documents: through the alias to the
# const keyword in one index and directly against the keyword in the other
GET const-keyword-frank-delme-1,keyword-frank-delme-1/_search
{
  "fields" : ["event.dataset"],
  "query": {
    "terms": {
      "event.dataset": ["endpoint", "nginx"]
    }
  }
}

Returns:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "const-keyword-frank-delme-1",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2021-05-28T15:33:39.333Z"
        },
        "fields" : {
          "event.dataset" : [
            "endpoint"
          ]
        }
      },
      {
        "_index" : "keyword-frank-delme-1",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "@timestamp" : "2021-05-28T15:33:39.333Z",
          "event" : {
            "dataset" : "nginx"
          }
        },
        "fields" : {
          "event.dataset" : [
            "nginx"
          ]
        }
      }
    ]
  }
}

Note that the constant value is not in the _source of the const keyword document, so you need to request it through fields. You can add it to the source document during ingest, which is allowed as long as it matches the value configured in the mapping. If you don't, you have to use fields to get the value back in the search hits.
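
A minimal sketch of what this means for client code, assuming each hit has already been parsed into a Python dict: read the value from `fields` first, and only fall back to `_source` when `fields` was not requested. The helper name `dataset_of` is hypothetical, and the sample hits are trimmed copies of the responses above.

```python
from typing import Any, Optional


def dataset_of(hit: dict, field: str = "event.dataset") -> Optional[Any]:
    """Return the first value of `field` for one search hit.

    Prefers the "fields" section, which resolves aliases and constant
    keywords, and falls back to walking "_source" by dotted path.
    """
    values = hit.get("fields", {}).get(field)
    if values:
        return values[0]
    node: Any = hit.get("_source", {})
    for part in field.split("."):
        if not isinstance(node, dict) or part not in node:
            return None  # value is absent from _source
        node = node[part]
    return node


const_hit = {
    "_index": "const-keyword-frank-delme-1",
    "_source": {"@timestamp": "2021-05-28T15:33:39.333Z"},
    "fields": {"event.dataset": ["endpoint"]},
}
keyword_hit = {
    "_index": "keyword-frank-delme-1",
    "_source": {
        "@timestamp": "2021-05-28T15:33:39.333Z",
        "event": {"dataset": "nginx"},
    },
}

print(dataset_of(const_hit))
print(dataset_of(keyword_hit))
# Without "fields", the const keyword hit has nothing to fall back on.
print(dataset_of({"_source": const_hit["_source"]}))
```

The last call illustrates the corner case from the note: with the constant left out of the source document and no fields in the request, the hit carries no dataset value at all.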
