@dgiebert
Last active April 19, 2024 12:17
Debug etcd performance

Debugging etcd performance issues

fio

curl -LO https://github.com/rancherlabs/support-tools/raw/master/instant-fio-master/instant-fio-master.sh
zypper in zlib-devel make git gcc
bash instant-fio-master.sh
mkdir test-data
fio --rw=write --ioengine=sync --fdatasync=1 --directory=test-data --size=100m --bs=2300 --name=mytest

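Make sure test-data is on the same disk as the etcd data directory. In the fio output, look at the fsync/fdatasync/sync_file_range section: the 99.00th percentile should be below 10000 when fio reports the percentiles in usec, i.e. below 10 ms.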
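
The threshold check above can be sketched as a small shell helper. `check_p99_usec` is a hypothetical name, not part of fio, and it assumes you have already read the 99.00th percentile from the fio output and converted it to microseconds (fio may print the sync percentiles in nsec, usec or msec depending on the magnitude). The numbers in the example calls are made up.

```shell
#!/bin/sh
# Hypothetical helper (not part of fio): compare a 99th percentile sync
# latency, given in microseconds, against a limit (default 10000 usec = 10 ms).
check_p99_usec() {
  p99_usec=$1
  limit_usec=${2:-10000}
  if [ "$p99_usec" -lt "$limit_usec" ]; then
    echo "OK: p99 ${p99_usec} usec is below ${limit_usec} usec"
  else
    echo "SLOW: p99 ${p99_usec} usec is not below ${limit_usec} usec"
    return 1
  fi
}

# Made-up sample values, not real measurements
check_p99_usec 2343
check_p99_usec 15000 || echo "disk is likely too slow for etcd"
```

The non-zero return code on a slow result makes the helper usable in scripts that should stop when the disk does not meet the limit.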

benchmark

zypper in golang
git clone https://github.com/etcd-io/etcd.git
cd etcd
go install -v ./tools/benchmark
go run ./tools/benchmark --key /etc/kubernetes/ssl/kube-etcd-NODENAME-key.pem --cert /etc/kubernetes/ssl/kube-etcd-NODENAME.pem --cacert /etc/kubernetes/ssl/kube-ca.pem put

etcdctl

RKE2

etcdctl check perf

for etcdpod in $(kubectl -n kube-system get pod -l component=etcd --no-headers -o custom-columns=NAME:.metadata.name); do kubectl -n kube-system exec $etcdpod -- sh -c "ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl check perf"; done

etcdctl endpoint status -w json

for etcdpod in $(kubectl -n kube-system get pod -l component=etcd --no-headers -o custom-columns=NAME:.metadata.name); do kubectl -n kube-system exec $etcdpod -- sh -c "ETCDCTL_ENDPOINTS='https://127.0.0.1:2379' ETCDCTL_CACERT='/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt' ETCDCTL_CERT='/var/lib/rancher/rke2/server/tls/etcd/server-client.crt' ETCDCTL_KEY='/var/lib/rancher/rke2/server/tls/etcd/server-client.key' ETCDCTL_API=3 etcdctl endpoint status -w json"; done
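
The JSON printed by `etcdctl endpoint status -w json` contains the numeric fields `dbSize` and `dbSizeInUse`. As a rough sketch, the share of the database file that is allocated but unused can be computed from them with plain shell tools. The helper names `json_number` and `db_unused_pct` are hypothetical, the grep-based parsing assumes compact JSON without spaces after the colon, and the sample document below is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the first numeric value of a JSON field read from stdin.
# Assumes compact JSON such as "dbSize":123 (no whitespace after the colon).
json_number() {
  grep -o "\"$1\":[0-9]*" | head -n 1 | cut -d: -f2
}

# Hypothetical helper: percentage of the db file that is allocated but not in use.
db_unused_pct() {
  db_size=$1
  db_size_in_use=$2
  if [ "$db_size" -le 0 ]; then
    echo 0
    return
  fi
  echo $(( (db_size - db_size_in_use) * 100 / db_size ))
}

# Made-up sample output, not taken from a real cluster
sample='[{"Endpoint":"https://127.0.0.1:2379","Status":{"header":{"revision":12345},"dbSize":200000000,"dbSizeInUse":50000000}}]'

size=$(printf '%s' "$sample" | json_number dbSize)
in_use=$(printf '%s' "$sample" | json_number dbSizeInUse)
echo "unused: $(db_unused_pct "$size" "$in_use")%"
```

In practice the `sample` variable would be replaced by the output of the loop above for one pod.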

RKE1 (On the etcd nodes)

  • docker exec -ti etcd etcdctl endpoint status -w json

    => Compare dbSize and dbSizeInUse. If the difference is large, try a compaction followed by a defragmentation.

  • docker exec -ti etcd etcdctl check perf
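
The compact and defrag step mentioned above could look roughly like the following on an RKE1 etcd node. This is a sketch, not a tested runbook: the function names are hypothetical, it assumes the container is named `etcd` as in the commands above, and the revision is pulled out of the JSON with grep, which assumes compact JSON without spaces after the colon. Defragmentation blocks the member while it runs, so it is usually done one node at a time. The function is only defined here, not executed.

```shell
#!/bin/sh
# Hypothetical helper: extract the current revision from
# `etcdctl endpoint status -w json` output read on stdin.
extract_revision() {
  grep -o '"revision":[0-9]*' | head -n 1 | cut -d: -f2
}

# Hypothetical wrapper: compact up to the current revision, then defragment
# the local member. No -t flag, so the captured output has no carriage returns.
compact_and_defrag() {
  rev=$(docker exec etcd etcdctl endpoint status -w json | extract_revision)
  if [ -z "$rev" ]; then
    echo "could not determine revision" >&2
    return 1
  fi
  docker exec etcd etcdctl compact "$rev" &&
    docker exec etcd etcdctl defrag
}

# Made-up sample input to show what extract_revision returns
echo '[{"Status":{"header":{"revision":12345}}}]' | extract_revision
```

Afterwards, run `docker exec -ti etcd etcdctl endpoint status -w json` again and check that dbSize has moved closer to dbSizeInUse.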
