@thecambian
Created February 21, 2018 14:48
Part 2: Index aliases

Basics

Create indices

Create indices for 2017 and 2018:

curl -XPUT http://localhost:9200/visitor_logs_2017 -H "content-type: application/json" -d @- <<EOF
{
    "mappings" : {
        "_doc" : {
            "dynamic": "strict",
            "properties" : {
                "user-id":    { "type": "keyword" },
                "ip":         { "type": "text" },
                "session-id": { "type": "keyword" },
                "ts":         { "type": "date" },
                "url":        { "type": "text" },
                "method":     { "type": "keyword" }
            }
        }
    }
}
EOF

echo

curl -XPUT http://localhost:9200/visitor_logs_2018 -H "content-type: application/json" -d @- <<EOF
{
    "mappings" : {
        "_doc" : {
            "dynamic": "strict",
            "properties" : {
                "user-id":    { "type": "keyword" },
                "ip":         { "type": "text" },
                "session-id": { "type": "keyword" },
                "ts":         { "type": "date" },
                "url":        { "type": "text" },
                "method":     { "type": "keyword" }
            }
        }
    }
}
EOF
{"acknowledged":true,"shards_acknowledged":true,"index":"visitor_logs_2017"}
{"acknowledged":true,"shards_acknowledged":true,"index":"visitor_logs_2018"}
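Both requests above send the same mapping, so the body can be kept in one place and reused per year. The following is a minimal sketch of that idea; the function names visitor_logs_mapping and create_yearly_indices are hypothetical, and the host is assumed to be the same localhost:9200 used throughout.

```shell
# Print the mapping body shared by every yearly index (copied from above).
visitor_logs_mapping() {
    cat <<'EOF'
{
    "mappings" : {
        "_doc" : {
            "dynamic": "strict",
            "properties" : {
                "user-id":    { "type": "keyword" },
                "ip":         { "type": "text" },
                "session-id": { "type": "keyword" },
                "ts":         { "type": "date" },
                "url":        { "type": "text" },
                "method":     { "type": "keyword" }
            }
        }
    }
}
EOF
}

# Create one index per year given as an argument, all with the same mapping.
# Assumes Elasticsearch is listening on localhost:9200.
create_yearly_indices() {
    for year in "$@"; do
        visitor_logs_mapping |
            curl -XPUT "http://localhost:9200/visitor_logs_${year}" \
                 -H "content-type: application/json" -d @-
        echo
    done
}
```

Calling create_yearly_indices 2017 2018 would then issue the same two requests as above.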

Add data

Add data to each index:

curl -XPOST http://localhost:9200/visitor_logs_2017/_doc -H "content-type: application/json" -d @- <<EOF
{
  "user-id": "30c1b62a",
  "ip": "10.76.54.93",
  "session-id": "08298f4a",
  "ts": "2017-12-31T08:52:19Z",
  "url": "https://www.example.com/api/reports/228422",
  "method": "PUT"
}
EOF

echo

curl -XPOST http://localhost:9200/visitor_logs_2018/_doc -H "content-type: application/json" -d @- <<EOF
{
  "user-id": "30c1b62a",
  "ip": "10.76.54.93",
  "session-id": "d8e81b56",
  "ts": "2018-01-01T13:55:01Z",
  "url": "https://www.example.com/api/reports/228422",
  "method": "GET"
}
EOF
{"_index":"visitor_logs_2017","_type":"_doc","_id":"JtLMuGEBbyPuiTfcxErP","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
{"_index":"visitor_logs_2018","_type":"_doc","_id":"J9LMuGEBbyPuiTfcxErt","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

Show the document counts in these indices:

curl "http://localhost:9200/_cat/indices/visitor_logs_201*?v"
health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   visitor_logs_2017 5C_-HfWlS8yZWoPbC6il4Q   5   1          1            0      5.7kb          5.7kb
yellow open   visitor_logs_2018 zqTLt2keRN2NCJef0Iq02Q   5   1          1            0      5.7kb          5.7kb

Query the log indices

Query both indices at the same time. Use jq to show only the matching documents:

curl http://localhost:9200/visitor_logs_2017,visitor_logs_2018/_search?q=30c1b62a | jq .hits.hits
[
  {
    "_index": "visitor_logs_2017",
    "_type": "_doc",
    "_id": "JtLMuGEBbyPuiTfcxErP",
    "_score": 0.2876821,
    "_source": {
      "user-id": "30c1b62a",
      "ip": "10.76.54.93",
      "session-id": "08298f4a",
      "ts": "2017-12-31T08:52:19Z",
      "url": "https://www.example.com/api/reports/228422",
      "method": "PUT"
    }
  },
  {
    "_index": "visitor_logs_2018",
    "_type": "_doc",
    "_id": "J9LMuGEBbyPuiTfcxErt",
    "_score": 0.2876821,
    "_source": {
      "user-id": "30c1b62a",
      "ip": "10.76.54.93",
      "session-id": "d8e81b56",
      "ts": "2018-01-01T13:55:01Z",
      "url": "https://www.example.com/api/reports/228422",
      "method": "GET"
    }
  }
]

The equivalent query using a wildcard index pattern:

curl "http://localhost:9200/visitor_logs_*/_search?q=30c1b62a" | jq .hits.hits
[
  {
    "_index": "visitor_logs_2017",
    "_type": "_doc",
    "_id": "JtLMuGEBbyPuiTfcxErP",
    "_score": 0.2876821,
    "_source": {
      "user-id": "30c1b62a",
      "ip": "10.76.54.93",
      "session-id": "08298f4a",
      "ts": "2017-12-31T08:52:19Z",
      "url": "https://www.example.com/api/reports/228422",
      "method": "PUT"
    }
  },
  {
    "_index": "visitor_logs_2018",
    "_type": "_doc",
    "_id": "J9LMuGEBbyPuiTfcxErt",
    "_score": 0.2876821,
    "_source": {
      "user-id": "30c1b62a",
      "ip": "10.76.54.93",
      "session-id": "d8e81b56",
      "ts": "2018-01-01T13:55:01Z",
      "url": "https://www.example.com/api/reports/228422",
      "method": "GET"
    }
  }
]

Create and use an alias

Create an alias to cover those two indices:

curl -XPOST http://localhost:9200/_aliases?pretty -H "content-type: application/json" -d @- <<EOF
{
    "actions" : [
        { "add" : { "index" : "visitor_logs_2017", "alias" : "visitor_logs" } },
        { "add" : { "index" : "visitor_logs_2018", "alias" : "visitor_logs" } }
    ]
}
EOF
{
  "acknowledged" : true
}
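When the list of indices grows, the actions array can be generated instead of typed by hand. This is a rough sketch under that assumption; alias_actions is a hypothetical helper, not something the _aliases API provides, and it only prints the request body.

```shell
# Print an _aliases request body that applies one action ("add" or "remove")
# for a single alias to every index named on the command line.
alias_actions() {
    action=$1
    alias_name=$2
    shift 2
    sep=""
    printf '{ "actions" : ['
    for index in "$@"; do
        printf '%s\n    { "%s" : { "index" : "%s", "alias" : "%s" } }' \
            "$sep" "$action" "$index" "$alias_name"
        sep=","
    done
    printf '\n] }\n'
}

# Show the body that would be sent for the two yearly indices.
alias_actions add visitor_logs visitor_logs_2017 visitor_logs_2018
```

The printed body can be piped into the same curl -XPOST request shown above with -d @-.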

Run the same query as above against the alias:

curl http://localhost:9200/visitor_logs/_search?q=30c1b62a | jq .hits.hits
[
  {
    "_index": "visitor_logs_2017",
    "_type": "_doc",
    "_id": "JtLMuGEBbyPuiTfcxErP",
    "_score": 0.2876821,
    "_source": {
      "user-id": "30c1b62a",
      "ip": "10.76.54.93",
      "session-id": "08298f4a",
      "ts": "2017-12-31T08:52:19Z",
      "url": "https://www.example.com/api/reports/228422",
      "method": "PUT"
    }
  },
  {
    "_index": "visitor_logs_2018",
    "_type": "_doc",
    "_id": "J9LMuGEBbyPuiTfcxErt",
    "_score": 0.2876821,
    "_source": {
      "user-id": "30c1b62a",
      "ip": "10.76.54.93",
      "session-id": "d8e81b56",
      "ts": "2018-01-01T13:55:01Z",
      "url": "https://www.example.com/api/reports/228422",
      "method": "GET"
    }
  }
]

Two results, one from each index, just as before.

Inspect the aliases

curl -XGET http://localhost:9200/_aliases?pretty
{
  "visitor_logs_2017" : {
    "aliases" : {
      "visitor_logs" : { }
    }
  },
  "visitor_logs_2018" : {
    "aliases" : {
      "visitor_logs" : { }
    }
  }
}

We can see that both indices are aliased by “visitor_logs”.

Use case: data volume management

Add more data

For clarity, we can add a bit more data to the 2017 and 2018 visitor logs indices:

for i in `seq 1 59`;
do
    curl -XPOST http://localhost:9200/visitor_logs_2017/_doc -H "content-type: application/json" -d @- > /dev/null <<EOF
    {
      "user-id": "30c1b62a",
      "ip": "10.76.54.93",
      "session-id": "08298f4a",
      "ts": "2017-12-31T12:$i:00Z",
      "url": "https://www.example.com/api/reports/228422",
      "method": "PUT"
    }
EOF
done

for i in `seq 1 22`;
do
    curl -XPOST http://localhost:9200/visitor_logs_2018/_doc -H "content-type: application/json" -d @- > /dev/null <<EOF
    {
      "user-id": "30c1b62a",
      "ip": "10.76.54.$i",
      "session-id": "08298f4a",
      "ts": "2018-01-01T09:$i:19Z",
      "url": "https://www.example.com/api/reports/228422",
      "method": "PUT"
    }
EOF
done

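A quick query shows there are now many hits in the alias for user “30c1b62a”. The total is 65 rather than the 83 documents sent (1 + 59 + 1 + 22), most likely because $i is not zero-padded: iterations 1 to 9 produce timestamps such as 2017-12-31T12:1:00Z, which the date field rejects, and the errors are hidden by the redirect to /dev/null: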

curl http://localhost:9200/visitor_logs/_search?q=30c1b62a | jq .hits.total
65
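If every iteration of the loops is meant to be indexed, the minute value needs two digits, since a value like 12:1:00Z does not fit the default date format. A minimal sketch, assuming a hypothetical helper named pad2 that is not part of the original loops:

```shell
# Zero-pad a number to two digits, e.g. 1 -> 01.
pad2() {
    printf '%02d' "$1"
}

# Print the timestamps the first loop would send if each minute were padded.
for i in $(seq 1 3); do
    echo "2017-12-31T12:$(pad2 "$i"):00Z"
done
```

Inside the heredoc, "ts": "2017-12-31T12:$(pad2 "$i"):00Z" would take the place of the unpadded $i. Note that padding would change the document counts shown in the rest of this walkthrough.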

Inspect cluster size

We can use the _cat API to look at the document count and on-disk size of each index:

curl "http://localhost:9200/_cat/indices/visitor_logs_201*?v"
health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   visitor_logs_2017 5C_-HfWlS8yZWoPbC6il4Q   5   1         51            0     54.4kb         54.4kb
yellow open   visitor_logs_2018 zqTLt2keRN2NCJef0Iq02Q   5   1         14            0     29.4kb         29.4kb
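The per-index counts can be added up to cross-check the hit total returned through the alias (51 + 14 = 65). A small sketch, assuming the column layout printed by ?v above; sum_docs is a hypothetical helper that reads the _cat output on stdin.

```shell
# Sum the docs.count column of `_cat/indices?v` output read from stdin.
# The column is located by its header name rather than a fixed position.
sum_docs() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "docs.count") col = i; next }
         col     { total += $col }
         END     { print total + 0 }'
}

# Example with the two rows shown above (uuid values shortened).
sum_docs <<'EOF'
health status index             uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open   visitor_logs_2017 5C   5   1         51            0     54.4kb         54.4kb
yellow open   visitor_logs_2018 zq   5   1         14            0     29.4kb         29.4kb
EOF
```

Against a live cluster, the same function can be fed by piping the curl command above into it.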

Remove oldest index from alias

To reduce the amount of data held in the cluster, we can drop the oldest visitor logs index. First, remove visitor_logs_2017 from the alias so that queries stop reaching it:

curl -XPOST http://localhost:9200/_aliases?pretty -H "content-type: application/json" -d @- <<EOF
{
    "actions" : [
        { "remove" : { "index" : "visitor_logs_2017", "alias" : "visitor_logs" } }
    ]
}
EOF
{
  "acknowledged" : true
}

Now, when we re-query the index alias, we get fewer results:

curl http://localhost:9200/visitor_logs/_search?q=30c1b62a | jq .hits.total
14

Delete index and verify freed space

Close and delete the old index to reclaim disk space (closing first is optional; the delete alone is enough):

curl -XPOST http://localhost:9200/visitor_logs_2017/_close
echo
curl -XDELETE http://localhost:9200/visitor_logs_2017
{"acknowledged":true}
{"acknowledged":true}

To be sure, run the same _cat request as before and check which indices remain:

curl "http://localhost:9200/_cat/indices/visitor_logs_201*?v"
health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   visitor_logs_2018 zqTLt2keRN2NCJef0Iq02Q   5   1         14            0     29.4kb         29.4kb

Success! The 2017 logs are gone.

Use case: maintenance

Create destination index

Create a new index with a corrected mapping, in which the ip field uses the ip type instead of text:

curl -XPUT http://localhost:9200/visitor_logs_2018_01 -H "content-type: application/json" -d @- <<EOF
{
    "mappings" : {
        "_doc" : {
            "dynamic": "strict",
            "properties" : {
                "user-id":    { "type": "keyword" },
                "ip":         { "type": "ip" },
                "session-id": { "type": "keyword" },
                "ts":         { "type": "date" },
                "url":        { "type": "text" },
                "method":     { "type": "keyword" }
            }
        }
    }
}
EOF
{"acknowledged":true,"shards_acknowledged":true,"index":"visitor_logs_2018_01"}

Reindex

Reindex the existing 2018 data into the new index:

curl -XPOST http://localhost:9200/_reindex?pretty -H "content-type: application/json" -d @- <<EOF
{
  "source": {
    "index": "visitor_logs_2018"
  },
  "dest": {
    "index": "visitor_logs_2018_01"
  }
}
EOF
{
  "took" : 60,
  "timed_out" : false,
  "total" : 14,
  "updated" : 0,
  "created" : 14,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}
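Before switching the alias, it is worth confirming that the reindex finished cleanly. The following is a rough sketch that checks the two fields visible in the response above; reindex_ok is a hypothetical helper, and the pattern matching is a shortcut that assumes this response layout. A JSON parser such as jq would be more robust.

```shell
# Succeed only if a _reindex response on stdin reports an empty "failures"
# array and "timed_out" : false.
reindex_ok() {
    response=$(cat)
    printf '%s\n' "$response" |
        grep -Eq '"failures"[[:space:]]*:[[:space:]]*\[[[:space:]]*\]' &&
    printf '%s\n' "$response" |
        grep -Eq '"timed_out"[[:space:]]*:[[:space:]]*false'
}

# Example with a trimmed copy of the response shown above.
if printf '{\n  "timed_out" : false,\n  "created" : 14,\n  "failures" : [ ]\n}\n' | reindex_ok; then
    echo "reindex completed without failures"
fi
```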

Switch indices in alias

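Adjust the index alias to include only the new index. Both actions are sent in one request and applied atomically, so the alias never points at zero indices or at both at once: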

curl -XPOST http://localhost:9200/_aliases?pretty -H "content-type: application/json" -d @- <<EOF
{
    "actions" : [
        { "remove" : { "index" : "visitor_logs_2018", "alias" : "visitor_logs" } },
        { "add" : { "index" : "visitor_logs_2018_01", "alias" : "visitor_logs" } }
    ]
}
EOF
{
  "acknowledged" : true
}

Clean up

Delete the old index:

curl -XDELETE http://localhost:9200/visitor_logs_2018
{"acknowledged":true}

Use _cat to show what remains:

curl "http://localhost:9200/_cat/indices/visitor_logs_201*?v"
health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   visitor_logs_2018_01 69top4NPSiGlWeco7VWMhw   5   1         14            0     25.7kb         25.7kb

Query with new datatype

Now that the ip field uses the ip data type, we can run an IP network range query against it.

Which addresses in the 10.76.54.0/28 network are making API requests?

curl -XGET http://localhost:9200/visitor_logs/_search -H "content-type: application/json" -d @- <<EOF |
{
  "size": 0,
  "query": {
    "term": {
      "ip": "10.76.54.0/28"
    }
  },
  "aggs": {
    "ips_in_slash_28": {
      "terms": {
        "field": "ip"
      }
    }
  }
}
EOF
jq .aggregations.ips_in_slash_28.buckets
[
  {
    "key": "10.76.54.10",
    "doc_count": 1
  },
  {
    "key": "10.76.54.11",
    "doc_count": 1
  },
  {
    "key": "10.76.54.12",
    "doc_count": 1
  },
  {
    "key": "10.76.54.13",
    "doc_count": 1
  },
  {
    "key": "10.76.54.14",
    "doc_count": 1
  },
  {
    "key": "10.76.54.15",
    "doc_count": 1
  }
]
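A /28 leaves 32 - 28 = 4 host bits, so 10.76.54.0/28 covers the 16 addresses 10.76.54.0 through 10.76.54.15, which matches the buckets above stopping at .15. The membership test can be sketched in shell arithmetic as follows; ip_to_int and in_cidr are hypothetical helpers for illustration and say nothing about how Elasticsearch evaluates the query internally.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 falls inside CIDR block $2 (e.g. 10.76.54.0/28).
in_cidr() {
    local ip net bits mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Keep the top $bits bits, clear the host bits.
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Report which of a few sample addresses fall inside the /28.
for last in 9 10 15 16 22 93; do
    if in_cidr "10.76.54.$last" "10.76.54.0/28"; then
        echo "10.76.54.$last is inside 10.76.54.0/28"
    fi
done
```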