Create indices for 2017 and 2018:
curl -XPUT http://localhost:9200/visitor_logs_2017 -H "content-type: application/json" -d @- <<EOF
{
"mappings" : {
"_doc" : {
"dynamic": "strict",
"properties" : {
"user-id": { "type": "keyword" },
"ip": { "type": "text" },
"session-id": { "type": "keyword" },
"ts": { "type": "date" },
"url": { "type": "text" },
"method": { "type": "keyword" }
}
}
}
}
EOF
echo
curl -XPUT http://localhost:9200/visitor_logs_2018 -H "content-type: application/json" -d @- <<EOF
{
"mappings" : {
"_doc" : {
"dynamic": "strict",
"properties" : {
"user-id": { "type": "keyword" },
"ip": { "type": "text" },
"session-id": { "type": "keyword" },
"ts": { "type": "date" },
"url": { "type": "text" },
"method": { "type": "keyword" }
}
}
}
}
EOF
{"acknowledged":true,"shards_acknowledged":true,"index":"visitor_logs_2017"}
{"acknowledged":true,"shards_acknowledged":true,"index":"visitor_logs_2018"}
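The two requests above send the same mapping twice. As a minimal sketch, the mapping can be kept in one place and reused per year. The helper names visitor_logs_mapping and create_visitor_index are assumptions for illustration, not part of the original walkthrough, and the curl call is only defined here, not run.

```shell
# Assumed helper (not in the original): print the shared mapping body once.
visitor_logs_mapping() {
  cat <<'EOF'
{
  "mappings" : {
    "_doc" : {
      "dynamic": "strict",
      "properties" : {
        "user-id": { "type": "keyword" },
        "ip": { "type": "text" },
        "session-id": { "type": "keyword" },
        "ts": { "type": "date" },
        "url": { "type": "text" },
        "method": { "type": "keyword" }
      }
    }
  }
}
EOF
}

# Assumed helper: create visitor_logs_<year> from the shared mapping.
create_visitor_index() {
  visitor_logs_mapping |
    curl -XPUT "http://localhost:9200/visitor_logs_$1" \
      -H "content-type: application/json" -d @-
}

# Usage against a running cluster:
#   for year in 2017 2018; do create_visitor_index "$year"; echo; done
```

Keeping the body in a single function means a later mapping change only has to be made once.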
Add data to each index:
curl -XPOST http://localhost:9200/visitor_logs_2017/_doc -H "content-type: application/json" -d @- <<EOF
{
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "08298f4a",
"ts": "2017-12-31T08:52:19Z",
"url": "https://www.example.com/api/reports/228422",
"method": "PUT"
}
EOF
echo
curl -XPOST http://localhost:9200/visitor_logs_2018/_doc -H "content-type: application/json" -d @- <<EOF
{
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "d8e81b56",
"ts": "2018-01-01T13:55:01Z",
"url": "https://www.example.com/api/reports/228422",
"method": "GET"
}
EOF
{"_index":"visitor_logs_2017","_type":"_doc","_id":"JtLMuGEBbyPuiTfcxErP","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
{"_index":"visitor_logs_2018","_type":"_doc","_id":"J9LMuGEBbyPuiTfcxErt","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
Show the document counts in these indices:
curl http://localhost:9200/_cat/indices/visitor_logs_201*?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open visitor_logs_2017 5C_-HfWlS8yZWoPbC6il4Q 5 1 1 0 5.7kb 5.7kb
yellow open visitor_logs_2018 zqTLt2keRN2NCJef0Iq02Q 5 1 1 0 5.7kb 5.7kb
Query both indices at the same time. Use jq to show only the matching document data:
curl http://localhost:9200/visitor_logs_2017,visitor_logs_2018/_search?q=30c1b62a | jq .hits.hits
[
{
"_index": "visitor_logs_2017",
"_type": "_doc",
"_id": "JtLMuGEBbyPuiTfcxErP",
"_score": 0.2876821,
"_source": {
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "08298f4a",
"ts": "2017-12-31T08:52:19Z",
"url": "https://www.example.com/api/reports/228422",
"method": "PUT"
}
},
{
"_index": "visitor_logs_2018",
"_type": "_doc",
"_id": "J9LMuGEBbyPuiTfcxErt",
"_score": 0.2876821,
"_source": {
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "d8e81b56",
"ts": "2018-01-01T13:55:01Z",
"url": "https://www.example.com/api/reports/228422",
"method": "GET"
}
}
]
The equivalent query, using a wildcard pattern in place of the list of indices:
curl http://localhost:9200/visitor_logs_*/_search?q=30c1b62a | jq .hits.hits
[
{
"_index": "visitor_logs_2017",
"_type": "_doc",
"_id": "JtLMuGEBbyPuiTfcxErP",
"_score": 0.2876821,
"_source": {
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "08298f4a",
"ts": "2017-12-31T08:52:19Z",
"url": "https://www.example.com/api/reports/228422",
"method": "PUT"
}
},
{
"_index": "visitor_logs_2018",
"_type": "_doc",
"_id": "J9LMuGEBbyPuiTfcxErt",
"_score": 0.2876821,
"_source": {
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "d8e81b56",
"ts": "2018-01-01T13:55:01Z",
"url": "https://www.example.com/api/reports/228422",
"method": "GET"
}
}
]
Create an alias to cover those two indices:
curl -XPOST http://localhost:9200/_aliases?pretty -H "content-type: application/json" -d @- <<EOF
{
"actions" : [
{ "add" : { "index" : "visitor_logs_2017", "alias" : "visitor_logs" } },
{ "add" : { "index" : "visitor_logs_2018", "alias" : "visitor_logs" } }
]
}
EOF
{
"acknowledged" : true
}
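When more indices need to join the alias, writing one "add" action per index by hand gets tedious. The following is a sketch of generating the request body from a list of index names. The function name alias_add_body is hypothetical, and it only prints JSON; it does not contact the cluster.

```shell
# Assumed helper (not in the original): print an _aliases request body that
# adds every index given after the alias name to that alias.
alias_add_body() {
  alias_name=$1
  shift
  sep=''
  printf '{\n  "actions" : ['
  for index in "$@"; do
    printf '%s\n    { "add" : { "index" : "%s", "alias" : "%s" } }' \
      "$sep" "$index" "$alias_name"
    sep=','
  done
  printf '\n  ]\n}\n'
}

# Prints the same two "add" actions used in the request above.
alias_add_body visitor_logs visitor_logs_2017 visitor_logs_2018
```

The output can be piped into the same curl -XPOST .../_aliases -d @- call shown above.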
Run the same query as above against the alias:
curl http://localhost:9200/visitor_logs/_search?q=30c1b62a | jq .hits.hits
[
{
"_index": "visitor_logs_2017",
"_type": "_doc",
"_id": "JtLMuGEBbyPuiTfcxErP",
"_score": 0.2876821,
"_source": {
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "08298f4a",
"ts": "2017-12-31T08:52:19Z",
"url": "https://www.example.com/api/reports/228422",
"method": "PUT"
}
},
{
"_index": "visitor_logs_2018",
"_type": "_doc",
"_id": "J9LMuGEBbyPuiTfcxErt",
"_score": 0.2876821,
"_source": {
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "d8e81b56",
"ts": "2018-01-01T13:55:01Z",
"url": "https://www.example.com/api/reports/228422",
"method": "GET"
}
}
]
Two results, one from each index, just as before.
List the aliases defined on the cluster:
curl -XGET http://localhost:9200/_aliases?pretty
{
"visitor_logs_2017" : {
"aliases" : {
"visitor_logs" : { }
}
},
"visitor_logs_2018" : {
"aliases" : {
"visitor_logs" : { }
}
}
}
We can see that both indices are aliased by “visitor_logs”.
For clarity, we can add a bit more data to the 2017 and 2018 visitor logs indices:
for i in `seq 10 59`;
do
curl -XPOST http://localhost:9200/visitor_logs_2017/_doc -H "content-type: application/json" -d @- > /dev/null <<EOF
{
"user-id": "30c1b62a",
"ip": "10.76.54.93",
"session-id": "08298f4a",
"ts": "2017-12-31T12:$i:00Z",
"url": "https://www.example.com/api/reports/228422",
"method": "PUT"
}
EOF
done
for i in `seq 10 22`;
do
curl -XPOST http://localhost:9200/visitor_logs_2018/_doc -H "content-type: application/json" -d @- > /dev/null <<EOF
{
"user-id": "30c1b62a",
"ip": "10.76.54.$i",
"session-id": "08298f4a",
"ts": "2018-01-01T09:$i:19Z",
"url": "https://www.example.com/api/reports/228422",
"method": "PUT"
}
EOF
done
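The loops above paste $i straight into the minutes position of “ts”, which gives a well-formed timestamp only when $i has two digits; a value such as 12:5:00 would likely be rejected by the default date format. A small sketch of zero-padding the minute with printf, using a hypothetical helper named ts_for_minute:

```shell
# Assumed helper (not in the original): build the 2017 timestamp for a given
# minute, zero-padding it so single-digit minutes still form a valid date.
ts_for_minute() {
  printf '2017-12-31T12:%02d:00Z\n' "$1"
}

ts_for_minute 5    # prints 2017-12-31T12:05:00Z
ts_for_minute 42   # prints 2017-12-31T12:42:00Z
```

Inside the loop body, "ts": "$(ts_for_minute $i)" would then work for any minute from 0 to 59.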
A quick query shows there are now many hits in the alias for user “30c1b62a”:
curl http://localhost:9200/visitor_logs/_search?q=30c1b62a | jq .hits.total
65
We can use the _cat API to look at the size of each index:
curl http://localhost:9200/_cat/indices/visitor_logs_201*?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open visitor_logs_2017 5C_-HfWlS8yZWoPbC6il4Q 5 1 51 0 54.4kb 54.4kb
yellow open visitor_logs_2018 zqTLt2keRN2NCJef0Iq02Q 5 1 14 0 29.4kb 29.4kb
To reduce the cluster’s data size, we can remove the oldest visitor logs
index. First, remove visitor_logs_2017 from the alias:
curl -XPOST http://localhost:9200/_aliases?pretty -H "content-type: application/json" -d @- <<EOF
{
"actions" : [
{ "remove" : { "index" : "visitor_logs_2017", "alias" : "visitor_logs" } }
]
}
EOF
{
"acknowledged" : true
}
Now, when we re-query the index alias, we get fewer results:
curl http://localhost:9200/visitor_logs/_search?q=30c1b62a | jq .hits.total
14
Close and delete the old index to get some disk space back:
curl -XPOST http://localhost:9200/visitor_logs_2017/_close
echo
curl -XDELETE http://localhost:9200/visitor_logs_2017
{"acknowledged":true}
{"acknowledged":true}
And to be sure, look at the index sizes again using the same _cat request as before:
curl http://localhost:9200/_cat/indices/visitor_logs_201*?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open visitor_logs_2018 zqTLt2keRN2NCJef0Iq02Q 5 1 14 0 29.4kb 29.4kb
Success! The 2017 logs are gone.
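With yearly index names, picking which indices to retire can be scripted from the name alone. This is a sketch under the assumption that every index ends in a _YYYY suffix, as visitor_logs_2017 and visitor_logs_2018 do; the function expired_indices and the cutoff parameter are illustrative, not from the original.

```shell
# Assumed helper (not in the original): print the indices whose trailing
# _YYYY suffix is older than the cutoff year.
expired_indices() {
  cutoff=$1
  shift
  for index in "$@"; do
    year=${index##*_}          # text after the last underscore
    if [ "$year" -lt "$cutoff" ]; then
      printf '%s\n' "$index"
    fi
  done
}

# Prints visitor_logs_2017, the only index older than 2018.
expired_indices 2018 visitor_logs_2017 visitor_logs_2018
```

Each name printed would then go through the same steps as above: remove it from the alias, then delete it.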
Create a new index with the correct mapping, in which “ip” uses the ip data type instead of text:
curl -XPUT http://localhost:9200/visitor_logs_2018_01 -H "content-type: application/json" -d @- <<EOF
{
"mappings" : {
"_doc" : {
"dynamic": "strict",
"properties" : {
"user-id": { "type": "keyword" },
"ip": { "type": "ip" },
"session-id": { "type": "keyword" },
"ts": { "type": "date" },
"url": { "type": "text" },
"method": { "type": "keyword" }
}
}
}
}
EOF
{"acknowledged":true,"shards_acknowledged":true,"index":"visitor_logs_2018_01"}
Reindex the existing 2018 data into the new index:
curl -XPOST http://localhost:9200/_reindex?pretty -H "content-type: application/json" -d @- <<EOF
{
"source": {
"index": "visitor_logs_2018"
},
"dest": {
"index": "visitor_logs_2018_01"
}
}
EOF
{
"took" : 60,
"timed_out" : false,
"total" : 14,
"updated" : 0,
"created" : 14,
"deleted" : 0,
"batches" : 1,
"version_conflicts" : 0,
"noops" : 0,
"retries" : {
"bulk" : 0,
"search" : 0
},
"throttled_millis" : 0,
"requests_per_second" : -1.0,
"throttled_until_millis" : 0,
"failures" : [ ]
}
Adjust the index alias to include only the new index:
curl -XPOST http://localhost:9200/_aliases?pretty -H "content-type: application/json" -d @- <<EOF
{
"actions" : [
{ "remove" : { "index" : "visitor_logs_2018", "alias" : "visitor_logs" } },
{ "add" : { "index" : "visitor_logs_2018_01", "alias" : "visitor_logs" } }
]
}
EOF
{
"acknowledged" : true
}
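Because the "remove" and "add" actions travel in a single _aliases request, Elasticsearch applies them atomically, so searches against visitor_logs never hit a moment where the alias points at nothing. A sketch of a reusable body generator for this swap follows; the name alias_swap_body and its argument order are assumptions for illustration.

```shell
# Assumed helper (not in the original): print an _aliases body that moves
# an alias from one index to another in a single request.
# Arguments: old index, new index, alias name.
alias_swap_body() {
  cat <<EOF
{
  "actions" : [
    { "remove" : { "index" : "$1", "alias" : "$3" } },
    { "add" : { "index" : "$2", "alias" : "$3" } }
  ]
}
EOF
}

alias_swap_body visitor_logs_2018 visitor_logs_2018_01 visitor_logs
```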
Delete the old index:
curl -XDELETE http://localhost:9200/visitor_logs_2018
{"acknowledged":true}
Use _cat to show what remains:
curl http://localhost:9200/_cat/indices/visitor_logs_201*?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open visitor_logs_2018_01 69top4NPSiGlWeco7VWMhw 5 1 14 0 25.7kb 25.7kb
We can use the new ip data type to run an IP network range query.
Which addresses in the 10.76.54.0/28 network are making API requests?
curl -XGET http://localhost:9200/visitor_logs/_search -H "content-type: application/json" -d @- <<EOF |
{
"size": 0,
"query": {
"term": {
"ip": "10.76.54.0/28"
}
},
"aggs": {
"ips_in_slash_28": {
"terms": {
"field": "ip"
}
}
}
}
EOF
jq .aggregations.ips_in_slash_28.buckets
[
{
"key": "10.76.54.10",
"doc_count": 1
},
{
"key": "10.76.54.11",
"doc_count": 1
},
{
"key": "10.76.54.12",
"doc_count": 1
},
{
"key": "10.76.54.13",
"doc_count": 1
},
{
"key": "10.76.54.14",
"doc_count": 1
},
{
"key": "10.76.54.15",
"doc_count": 1
}
]
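The buckets stop at 10.76.54.15 because a /28 prefix leaves 4 host bits, so 10.76.54.0/28 covers only 10.76.54.0 through 10.76.54.15. The arithmetic can be sketched in shell; the helper last_octet_range is hypothetical and only handles prefixes of /24 or longer, where the range stays within the final octet.

```shell
# Assumed helper (not in the original): print the first and last value of the
# final octet covered by a prefix of /24 or longer.
# Arguments: a final-octet value inside the network, the prefix length.
last_octet_range() {
  base=$1
  prefix=$2
  size=$(( 1 << (32 - prefix) ))     # number of addresses in the block
  first=$(( base / size * size ))    # round down to the block boundary
  last=$(( first + size - 1 ))
  printf '%s %s\n' "$first" "$last"
}

last_octet_range 0 28    # prints 0 15
last_octet_range 22 28   # prints 16 31
```

An address ending in .22 therefore belongs to the neighbouring 10.76.54.16/28 block and does not match the query.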