Some docs:
- https://www.elastic.co/blog/red-elasticsearch-cluster-panic-no-longer
- https://www.elastic.co/guide/en/elasticsearch/reference/current/restart-cluster.html
HEALTH DETAILS
curl -XGET http://$log:9200/_cluster/health?pretty
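For use in scripts, the status word can be turned into an exit code. A minimal sketch, assuming the status comes from `_cat/health?h=status`; the helper name `es_status_rc` is made up for this note and is not part of Elasticsearch.

```shell
# Map a cluster status word (green/yellow/red) to an exit code.
# es_status_rc is a hypothetical helper name.
es_status_rc() {
  case "$(printf '%s' "$1" | tr -d '[:space:]')" in
    green)  return 0 ;;
    yellow) return 1 ;;
    red)    return 2 ;;
    *)      return 3 ;;  # unknown or empty status (e.g. cluster unreachable)
  esac
}

# Usage (assumes $log holds the Elasticsearch host, as elsewhere in these notes):
#   es_status_rc "$(curl -s "http://$log:9200/_cat/health?h=status")" || echo "cluster is not green"
```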
LIST INDEXES
k -n sre exec -it elasticsearch-0 -c elasticsearch -- curl -s http://localhost:9200/_cat/indices?v |grep kube
ROLLOVER
k -n sre exec -it elasticsearch-0 -c elasticsearch -- curl -X POST -H 'Content-Type: application/json' -d '{"conditions":{"max_docs":1}}' localhost:9200/fluentd.kube.falco/_rollover
k -n sre exec -it elasticsearch-0 -c elasticsearch -- curl -s http://localhost:9200/_cat/aliases?v | awk '{print $1}'| egrep -v ^.kibana\|^ilm\|^alias | xargs -n1 -I% curl -X POST -H 'Content-Type: application/json' -d '{"conditions":{"max_docs":1}}' http://localhost:9200/%/_rollover
NODE DETAILS
curl -s http://$log:9200/_cat/nodes?v
ip         heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.20.0.61           56          96   9    1.50    0.98     0.49 mdi       *      elasticsearch
10.20.0.62           61          93  10    0.40    0.87     0.63 mdi       -      elasticsearch
WHAT ARE THE CUSTOM SETTINGS?
curl -XGET http://$log:9200/_cluster/settings?pretty
{
"persistent" : { },
"transient" : { }
}
SETUP WATERMARK DISK SPACE TO 90%
curl -XPUT http://$log:9200/_cluster/settings?pretty -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.disk.watermark.low": "90%"}}'
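The low watermark is usually adjusted together with the high and flood-stage ones. A minimal sketch that builds the request body for all three, assuming an Elasticsearch version that has the `flood_stage` setting; `watermark_payload` is a hypothetical helper name and the percentages in the usage line are only illustrative.

```shell
# Build a transient cluster-settings body that sets all three disk watermarks.
# watermark_payload is a hypothetical helper; the keys are the
# cluster.routing.allocation.disk.watermark.* settings.
watermark_payload() {
  low=$1 high=$2 flood=$3
  printf '{"transient":{"cluster.routing.allocation.disk.watermark.low":"%s","cluster.routing.allocation.disk.watermark.high":"%s","cluster.routing.allocation.disk.watermark.flood_stage":"%s"}}\n' "$low" "$high" "$flood"
}

# Usage (illustrative values; $log is the Elasticsearch host):
#   curl -XPUT "http://$log:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d "$(watermark_payload 90% 93% 95%)"
```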
WHO IS MASTER
curl -X GET http://$log:9200/_cat/master
SETTINGS
curl -XGET http://$log:9200/_cluster/settings?pretty
INDEXES
curl -s http://$log:9200/_cat/indices?v | grep -v green
REALLOCATE/REROUTE ALL SHARDS WHICH ARE UNASSIGNED
# retry rerouting unassigned shards
curl -XPOST -ss localhost:9200/_cluster/reroute?retry_failed=true
# reallocate
curl -XPUT http://$log:9200/_cluster/settings?pretty -H 'Content-Type: application/json' -d'{"persistent": {"cluster.routing.allocation.enable": "all"}}'
If some shards won't reallocate, do the following:
curl -XGET http://$log:9200/_cluster/nodes?pretty
curl -s http://localhost:9200/_nodes?pretty
curl -XPOST http://$log:9200/_cluster/reroute -d '{ "commands" : [ { "allocate" : { "index" : "log-2018.01.07", "shard" : 0, "node": "l3iTL89aRSONcMwqoZ38Zw", "allow_primary": "true" }}]}'
Get node unique name
curl -XGET http://$log:9200/_nodes?pretty | grep transport_address -B 3
Update an index setting (here: clear the exclude-by-node-name allocation filter)
curl -X PUT http://$log:9200/test_idx/_settings -H 'Content-Type: application/json' -d '{ "index.routing.allocation.exclude._name": null }' |jq
Delete index manually
curl -XDELETE "$log:9200/fluentd.kube.app.ver.vega-*"
To print docs count by index (sum of fluentd daily indexes):
curl -sS localhost:9200/_cat/indices?v|awk '{print $3,$7}'|sed 's,^.*-\([0-9]*.[0-9]*.[0-9]*\),\1,g'|awk '{sum[$1]+=$2;}END{for (i in sum) print i,sum[i]}'|sort
To print index sizes in MB (sum of fluentd daily indexes):
curl -sS "localhost:9200/_cat/indices?v&bytes=b"|awk '{print $3,$9}'|sed 's,^.*-\([0-9]*.[0-9]*.[0-9]*\),\1,g'|awk '{sum[$1]+=$2;}END{for (i in sum) print i,sum[i]/1024/1024}'|sort
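The same per-day sum can be written without the sed step. A minimal sketch, assuming input lines of the form `<index> <number>` (for example from `_cat/indices?h=index,docs.count`) and index names that contain a `YYYY.MM.DD` date; `sum_by_date` is a hypothetical helper name.

```shell
# Sum the second column per date found in the first column.
# Lines whose index name has no YYYY.MM.DD date are skipped.
# sum_by_date is a hypothetical helper name.
sum_by_date() {
  awk '
    # Look for a YYYY.MM.DD date anywhere in the index name.
    match($1, /[0-9][0-9][0-9][0-9][.][0-9][0-9][.][0-9][0-9]/) {
      day = substr($1, RSTART, RLENGTH)
      sum[day] += $2
    }
    END { for (d in sum) print d, sum[d] }
  ' | sort
}

# Usage:
#   curl -sS "localhost:9200/_cat/indices?h=index,docs.count" | sum_by_date
```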
To view the progress of shard recovery:
curl -sS "localhost:9200/_cat/recovery?v&active_only=true"
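You can also view allocation and recovery information for a given index (allocation explain is a cluster-level API that takes the index and shard in the request body):
curl -sS -H 'Content-Type: application/json' -d '{"index":"fluentd.svcfw.apiaccess-2020.04.05-000001","shard":0,"primary":true}' "localhost:9200/_cluster/allocation/explain?pretty"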
curl -sS "localhost:9200/fluentd.svcfw.apiaccess-2020.04.05-000001/_recovery?pretty"
You can then list unassigned shards with:
curl -sS localhost:9200/_cat/shards | grep UNASSIGNED | awk '{print $1}'
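An index with several unassigned shards shows up once per shard in that output. A minimal sketch that matches on the state column only and prints each index once, assuming the default `_cat/shards` column order (index, shard, prirep, state, ...); `unassigned_indices` is a hypothetical helper name.

```shell
# Print each index that has at least one UNASSIGNED shard, once.
# Reads _cat/shards output on stdin and checks the 4th column (state).
# unassigned_indices is a hypothetical helper name.
unassigned_indices() {
  awk '$4 == "UNASSIGNED" { print $1 }' | sort -u
}

# Usage:
#   curl -sS localhost:9200/_cat/shards | unassigned_indices
```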
To describe why shards cannot be allocated:
curl -sS localhost:9200/_cluster/allocation/explain?pretty
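To delete every index that has unassigned shards (destructive, the data in those indexes is lost):
curl -sS http://localhost:9200/_cat/shards | grep UNASSIGNED | awk '{print $1}' | sort -u | xargs -I{} curl -XDELETE "http://localhost:9200/{}"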
You can check ILM for a single index or iterate over all indexes:
curl -sS "http://localhost:9200/_cat/indices?h=i" | while read -r i; do
curl -sS "http://localhost:9200/$i/_ilm/explain?pretty"
done
If you see an error similar to
index.lifecycle.rollover_alias [fluentd.kube] does not point to index [fluentd.kube-2019.11.08-000001]
it means that the alias does not point to the index you are trying to roll over. This state will not fix itself: you need to fix the alias first and then retry ILM.
First check that the alias name is not already in use as an index name. If it is, delete that index, otherwise you will not be able to create the alias.
# Check existing indexes
curl -sS localhost:9200/_cat/indices | grep fluentd.kube
# Delete index if it exists
curl -X DELETE localhost:9200/fluentd.kube
# Point fluentd.kube alias to last index seen (fluentd.kube-2019.11.08-000001)
curl -sS -XPOST -H 'Content-Type: application/json' -d '{"actions": [{"add":{"index":"fluentd.kube-2019.11.08-000001","alias":"fluentd.kube"}}]}' localhost:9200/_aliases
# Retry ILM for given index
curl -sS -XPOST localhost:9200/fluentd.kube-2019.11.08-000001/_ilm/retry?pretty
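When several indexes are affected, the alias name and request body can be derived from the index name. A minimal sketch, assuming index names follow the `<alias>-YYYY.MM.DD-NNNNNN` pattern seen above; `alias_for_index` and `alias_add_payload` are hypothetical helper names.

```shell
# Derive the rollover alias from an index name of the form
# <alias>-YYYY.MM.DD-NNNNNN (the naming seen in these notes; an assumption).
alias_for_index() {
  printf '%s\n' "$1" | sed 's/-[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]-[0-9]*$//'
}

# Build the _aliases request body that points the derived alias at the index.
alias_add_payload() {
  printf '{"actions":[{"add":{"index":"%s","alias":"%s"}}]}\n' "$1" "$(alias_for_index "$1")"
}

# Usage:
#   idx=fluentd.kube-2019.11.08-000001
#   curl -sS -XPOST -H 'Content-Type: application/json' -d "$(alias_add_payload "$idx")" localhost:9200/_aliases
```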
An index can become read-only in certain situations, e.g. when Elasticsearch runs out of disk space. It is then necessary to free up space and remove the block from the index settings (index.blocks.read_only_allow_delete).
This behavior is further explained in the Elasticsearch documentation:
cluster.routing.allocation.disk.watermark.flood_stage: Controls the flood stage watermark. It defaults to 95%, meaning that Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on the node that has at least one disk exceeding the flood stage. This is a last resort to prevent nodes from running out of disk space. The index block must be released manually once there is enough disk space available to allow indexing operations to continue.
To fix it, check storage space and unset index.blocks.read_only_allow_delete. This is possible in Kibana or with curl.
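A minimal sketch of the curl variant: setting the value to null resets it. `clear_read_only`, `ES_URL` and `DRY_RUN` are names made up for this note; with `DRY_RUN` set, the command is only printed so it can be reviewed first.

```shell
# Remove the read-only/allow-delete block from the given index pattern.
# clear_read_only is a hypothetical helper; ES_URL and DRY_RUN are assumed
# variable names. With DRY_RUN set, the curl command is printed, not run.
ES_URL=${ES_URL:-http://localhost:9200}

clear_read_only() {
  target=${1:-_all}
  body='{"index.blocks.read_only_allow_delete": null}'
  if [ -n "${DRY_RUN:-}" ]; then
    printf 'curl -sS -XPUT %s/%s/_settings -d %s\n' "$ES_URL" "$target" "$body"
  else
    curl -sS -XPUT -H 'Content-Type: application/json' \
      "$ES_URL/$target/_settings" -d "$body"
  fi
}

# Usage:
#   clear_read_only              # all indexes
#   clear_read_only 'fluentd.*'  # only matching indexes
```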
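```sh
# Force-allocate shard 0 of every index that has unassigned shards to node elasticsearch-0.
# allow_primary can lose data if no copy of the primary is left.
# Note: ES 5.0+ replaced "allocate" with allocate_replica / allocate_stale_primary / allocate_empty_primary.
for i in $(curl -sS -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED | awk '{print $1}' | sort -u); do
  # "'"$i"'" closes the single-quoted body so the shell expands $i
  curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
    "commands" : [ {
        "allocate" : {
          "index" : "'"$i"'",
          "shard" : 0,
          "node" : "elasticsearch-0",
          "allow_primary" : true
        }
      }
    ]
  }'
  sleep 5
done
```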
# Delete every index that has unassigned shards, one every 10 seconds (destructive)
set -x
for i in $(curl -sS http://localhost:9200/_cat/shards | grep UNASSIGNED | awk '{print $1}' | sort -u); do curl -XDELETE "http://localhost:9200/$i" & sleep 10; done