Kubernetes installation notes

All nodes run the Debian 11 cloud image, architecture x86_64, hypervisor KVM (libvirt), kernel 5.10.0-18, kubelet version 1.25.2.

Basic setup (on each node)

  1. Install packages

Run as root:

# switch to tuna mirror
apt update
apt install -y apt-transport-https ca-certificates curl
sed -i -e 's|http://deb.debian.org/debian|https://mirrors.tuna.tsinghua.edu.cn/debian|; s|http://security.debian.org/debian-security|https://mirrors.tuna.tsinghua.edu.cn/debian-security|' /etc/apt/sources.list

# install kubernetes using tuna mirror
curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo 'deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://mirrors.tuna.tsinghua.edu.cn/kubernetes/apt kubernetes-xenial main' | tee /etc/apt/sources.list.d/kubernetes.list > /dev/null
apt update
apt install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

# install necessary dependencies
apt install -y containerd apparmor apparmor-utils ipvsadm

# install system administration tools
apt install -y openssh-server tmux sysstat lsof ncat netcat-openbsd bind9-dnsutils ipset htop ncdu jq ripgrep unzip

# reconfigure ssh
rm -f /etc/ssh/ssh_host_*
ssh-keygen -A
systemctl restart ssh.service
# fresh install:
#   systemctl enable --now ssh.service

# setup unprivileged user
useradd '<USER>'
passwd '<USER>'
gpasswd -a '<USER>' sudo
mkdir -p /home/'<USER>'/.ssh
chmod 0700 /home/'<USER>'/.ssh
nano /home/'<USER>'/.ssh/authorized_keys
chmod 0600 /home/'<USER>'/.ssh/authorized_keys
chown -R '<USER>':'<USER>' /home/'<USER>'
  2. Configure hostnames

Edit /etc/hostname (via hostnamectl set-hostname) and /etc/hosts on each node, and make sure every node can reach the others by IP or hostname.

If the system is deployed on libvirt or QEMU, the hostname can be set before the VM is first started using libguestfs-tools: virt-customize -a <disk-image> --hostname <hostname>.
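
For example (node names and addresses below are illustrative placeholders, not values taken from this setup):

# set this node's hostname
sudo hostnamectl set-hostname node1

# make every node resolvable from every other node (repeat on each node)
cat <<'EOF' | sudo tee -a /etc/hosts > /dev/null
192.168.122.11 node1
192.168.122.12 node2
192.168.122.13 node3
EOF

# verify reachability
ping -c 1 node2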

  3. Configure a static IP for the NIC with a specific MAC

Edit /etc/network/cloud-ifupdown-helper:

--- a/cloud-ifupdown-helper        2022-09-26 11:12:01.472007473 +0800
+++ b/cloud-ifupdown-helper        2022-09-26 11:11:50.122138044 +0800
@@ -16,6 +16,7 @@
 # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
 
 template="/etc/network/cloud-interfaces-template"
+template_by_mac_prefix="/etc/network/cloud-interfaces-template-"
 cfgdir="/run/network/interfaces.d"
 mkdir -p "$cfgdir"
 
@@ -33,7 +34,13 @@
         return 0
     fi
 
-    sed "s/\\\$INTERFACE/$INTERFACE/g" "$template" > "$working"
+    local mac_address="$(cat /sys/class/net/$INTERFACE/address 2>&- | sed 's/://g' || true)"
+    local template_by_mac="$template_by_mac_prefix$mac_address"
+    if [ -f "$template_by_mac" ]; then
+        sed "s/\\\$INTERFACE/$INTERFACE/g" "$template_by_mac" > "$working"
+    else
+        sed "s/\\\$INTERFACE/$INTERFACE/g" "$template" > "$working"
+    fi
     mv "$working" "$final"
     log "Generated configuration for $INTERFACE"

Then create an ifup template for that specific MAC, for example /etc/network/cloud-interfaces-template-0102030a0b0c (file name format: cloud-interfaces-template-<lowercase MAC address without colons>):

auto $INTERFACE
allow-hotplug $INTERFACE
iface $INTERFACE inet static
    address <STATIC CIDR ADDRESS>
    gateway <GATEWAY IP>
    dns-nameservers <DNS SERVER 1> <DNS SERVER 2>

Reconfigure interface immediately:

INTERFACE='<INTERFACE>' sudo /etc/network/cloud-ifupdown-helper
sudo systemctl restart ifup@'<INTERFACE>'.service
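
Optionally, confirm the static address and default route were applied:

ip -4 addr show dev '<INTERFACE>'
ip route show default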
  4. Configure containerd as the kubernetes backend

Configure crictl to use the containerd backend by creating /etc/crictl.yaml:

runtime-endpoint: unix:///var/run/containerd/containerd.sock

Edit /etc/containerd/config.toml as below to set the CNI binary directory to /opt/cni/bin (shipped by the kubernetes-cni package) and switch the cgroup driver to systemd:

version = 2

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    [plugins."io.containerd.grpc.v1.cri".cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
      runtime_type = "io.containerd.runc.v2"
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
        SystemdCgroup = true
  [plugins."io.containerd.internal.v1.opt"]
    path = "/var/lib/containerd/opt"

Override containerd.service environment variables via systemctl edit containerd.service:

[Service]
EnvironmentFile=-/etc/containerd/containerd.env

Create /etc/containerd/containerd.env, adding HTTP/HTTPS proxy settings here if needed:

# toggle proxy for image downloading
#HTTP_PROXY=<PROXY>
#HTTPS_PROXY=<PROXY>
#NO_PROXY=<PROXY BLACKLIST>

Restart containerd:

sudo systemctl restart containerd.service
# fresh install:
#   sudo systemctl enable --now containerd.service
  5. Configure kernel parameters

Create /etc/sysctl.d/10-vm.conf:

vm.overcommit_memory = 1
vm.panic_on_oom = 0
vm.swappiness = 0

vm.max_map_count = 262144

Create /etc/sysctl.d/20-fs.conf:

fs.nr_open = 52706963
fs.inotify.max_user_watches = 89100

Create /etc/sysctl.d/30-net.conf:

# ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

# ipv4
net.ipv4.ip_forward = 1
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.tcp_synack_retries = 2

# arp
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2

# ip_vs
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 10

# nf_conntrack
net.netfilter.nf_conntrack_max = 2310720

Essential kernel parameters are:

  • vm.swappiness = 0

  • net.ipv4.ip_forward = 1

Apply the above kernel parameters immediately:

sudo sysctl --system
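
Optionally, spot-check a couple of the applied values:

sysctl net.ipv4.ip_forward vm.swappiness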
  6. Configure kernel modules

Create /etc/modules-load.d/kubernetes.conf to enable the required kernel modules:

ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh

br_netfilter
nf_conntrack

Load the above kernel modules immediately:

grep -Ev '^\s*(#|$)' /etc/modules-load.d/kubernetes.conf | while read -r mod; do sudo modprobe "$mod"; done
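
Optionally, verify that the modules are loaded:

lsmod | grep -E '^(ip_vs|br_netfilter|nf_conntrack)'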

Cluster installation

Bootstrap a brand-new kubernetes cluster on the first node.

  1. Generate token

Generate a random secure token to stdout, intended for kubeadm init and kubeadm join:

head -c 11 /dev/random | hexdump -v -e '/1 "%02x"' | sed -E 's/^(.{6})(.*)/\1.\2/' | sed -e '$a\'
# or use python:
#   python -c 'import secrets; print((lambda s: s[:6] + "." + s[6:])(secrets.token_hex(11)));'
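
Alternatively, kubeadm itself can generate a token in the expected format:

kubeadm token generate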
  2. Create initialization configuration file

Create a configuration file kubeadm-init.yaml for kubeadm init.

Below is a sample configuration: kubernetes components at v1.25.2, containerd as the container runtime, systemd as the cgroup driver, swap allowed, API server listening on port 6443, pod subnet 10.192.0.0/16, service subnet 10.64.0.0/16 (cluster DNS 10.64.0.10), and a bootstrap token valid for 24 hours so that other nodes can join.

All placeholders should be replaced before actually running kubeadm init.

kind: InitConfiguration
apiVersion: kubeadm.k8s.io/v1beta3

bootstrapTokens:
- token: '<SECRET TOKEN>'
  ttl: 24h
  usages:
  - authentication
  - signing
  groups:
  - system:bootstrappers:kubeadm:default-node-token

localAPIEndpoint:
  advertiseAddress: '<HOST IP>'
  bindPort: 6443

nodeRegistration:
  name: '<HOST NAME>'
  criSocket: 'unix:///var/run/containerd/containerd.sock'

---

kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3

kubernetesVersion: v1.25.2
clusterName: '<CLUSTER NAME>'

dns: {}

networking:
  dnsDomain: '<CLUSTER NAME>.local'
  serviceSubnet: 10.64.0.0/16
  podSubnet: 10.192.0.0/16

controlPlaneEndpoint: '<HOST IP>:6443'

apiServer:
  timeoutForControlPlane: 4m
  certSANs:
  - 10.64.0.1
  - 127.0.0.1
  - '<HOST NAME>'
  - '<OTHER NODE NAMES>'
  - '<EXTERNAL ADMIN MACHINE NAMES OR IPS>'
  - kubernetes
  - kubernetes.default
  - kubernetes.default.svc
  - kubernetes.default.svc.cluster.local
  extraVolumes:
  - name: timezone
    readOnly: true
    hostPath: /etc/localtime
    mountPath: /etc/localtime

controllerManager:
  extraVolumes:
  - name: timezone
    readOnly: true
    hostPath: /etc/localtime
    mountPath: /etc/localtime

scheduler:
  extraVolumes:
  - name: timezone
    readOnly: true
    hostPath: /etc/localtime
    mountPath: /etc/localtime

etcd:
  local:
    dataDir: /var/lib/etcd
    serverCertSANs:
    - '<HOST NAME>'
    - '<OTHER NODE NAMES>'
    peerCertSANs:
    - '<HOST NAME>'
    - '<OTHER NODE NAMES>'

---

kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1

bindAddress: 0.0.0.0

mode: 'ipvs'
ipvs:
  strictARP: true

clientConnection:
  kubeconfig: /var/lib/kube-proxy/kubeconfig.conf

---

kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1

clusterDomain: '<CLUSTER NAME>.local'
clusterDNS:
- 10.64.0.10

staticPodPath: /etc/kubernetes/manifests

cgroupDriver: systemd

Optionally, validate the config file:

sudo kubeadm init --config kubeadm-init.yaml --dry-run
# save the log:
#   sudo kubeadm init --config kubeadm-init.yaml --dry-run 2>&1 | tee dry-run.log

Optionally, pull kubernetes container images ahead of initialization:

sudo kubeadm config images pull --config kubeadm-init.yaml
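
Optionally, confirm the images are cached locally (crictl was pointed at the containerd socket earlier):

sudo crictl images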
  3. Initialize kubernetes cluster

Configure kubelet to start automatically on boot; kubeadm init will start it later:

sudo systemctl enable kubelet.service

Initialize kubernetes cluster:

sudo kubeadm init --config kubeadm-init.yaml --upload-certs
# save the log:
#   sudo kubeadm init --config kubeadm-init.yaml --upload-certs 2>&1 | tee kube-init.log

Set up the kubernetes administrator config for the current user (kubectl requires this step):

mkdir -p $HOME/.kube
sudo cat /etc/kubernetes/admin.conf | tee $HOME/.kube/config > /dev/null

Allow scheduling on the current node (remove the control-plane taints and labels):

kubectl taint nodes --all node-role.kubernetes.io/control-plane- node-role.kubernetes.io/master-
kubectl label nodes --all node-role.kubernetes.io/control-plane- node-role.kubernetes.io/master-

The kubeadm init command prints join instructions for additional control-plane and worker nodes; the join token is valid for 24 hours.

Join other nodes

Configure kubelet to start automatically on boot; kubeadm join will start it later:

sudo systemctl enable kubelet.service

Copy .kube/config from the master node:

scp '<USER>@<HOST IP>:.kube/config' .kube/config

To join as a control-plane node, run the corresponding command printed earlier by kubeadm init:

sudo kubeadm join '<HOST IP>:6443' --token '<SECRET TOKEN>' \
        --discovery-token-ca-cert-hash '<SHA256 HASH>' \
        --control-plane --certificate-key '<CERT KEY>'

To join as a worker node, run the other command printed earlier by kubeadm init:

sudo kubeadm join '<HOST IP>:6443' --token '<SECRET TOKEN>' \
        --discovery-token-ca-cert-hash '<SHA256 HASH>'
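
If the original token has expired, a fresh worker join command can be printed on an existing control-plane node; for a control-plane join, the certificates also have to be re-uploaded to obtain a new certificate key:

sudo kubeadm token create --print-join-command
# for control-plane joins, re-upload certificates and print a new certificate key:
#   sudo kubeadm init phase upload-certs --upload-certs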

Install networking addons

  1. Download the flannel manifest v0.19.2 and change the pod subnet to 10.192.0.0/16:
curl https://raw.githubusercontent.com/flannel-io/flannel/v0.19.2/Documentation/kube-flannel.yml -O
sed -i -e 's|10.244.0.0/16|10.192.0.0/16|' kube-flannel.yml
  2. Install flannel:
kubectl apply -f kube-flannel.yml
  3. Wait until the flannel pods are ready:
watch -n 1 kubectl get pods -n kube-flannel

General checks

Show node information:

kubectl get nodes -o wide
# show node taints and labels, requires jq installed:
#   kubectl get nodes -o json | jq '.items[] | { name: .metadata.name, status: (if (.status.conditions[] | select(.type == "Ready") | .status) == "True" then "Ready" else "NotReady" end), ctime: .metadata.creationTimestamp, labels: .metadata.labels, taints: .spec.taints }'

Show system pods status:

kubectl get pods -n kube-system

Show the proxier type used by each kube-proxy:

kubectl get pods -n kube-system | grep kube-proxy | cut -d' ' -f1 |
  while read pod; do
    echo "*** kube-proxy pod $pod logs, search 'Proxier' ***"
    kubectl logs "$pod" -n kube-system | grep -E 'Using \w+ Proxier'
    echo
  done

# show kube-proxy with corresponding node name, requires jq installed:
#   kubectl get pods -n kube-system | grep kube-proxy | cut -d' ' -f1 |
#     while read pod; do
#       node=$(kubectl get pod/$pod -n kube-system -o json | jq -r '.spec.nodeName')
#       echo "*** kube-proxy pod $pod (node: $node) logs, search 'Proxier' ***"
#       kubectl logs "$pod" -n kube-system | grep -E 'Using \w+ Proxier'
#       echo
#     done

Show default namespace pods:

kubectl get pods
# show pods with ip, requires jq installed:
#   for pod in $(kubectl get pods | sed 1d | cut -d' ' -f1); do
#     kubectl get pod/$pod -o json | jq '{name: .metadata.name, phase: .status.phase, pod_ip: .status.podIP, node: .spec.nodeName, host_ip: .status.hostIP}';
#   done |
#     jq -s . |
#     jq -r '(.[0] | keys_unsorted | @tsv), (.[] | map(.) | @tsv)' |
#     column -t

Show default namespace services:

kubectl get svc

Show ipvs status:

sudo ipvsadm -L -n

Show iptables status:

sudo iptables -S
# or:
#    sudo iptables -L -n

Show ip routes:

ip route
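
A quick cluster DNS check using a throwaway pod (assumes the busybox image can be pulled on the node):

kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup kubernetes.default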

Uninstall

  1. Uninstall flannel:
kubectl delete -f kube-flannel.yml
  2. Uninstall kubernetes, execute on each node:
# uninstall kubernetes and disable kubelet service
sudo kubeadm reset --force
sudo systemctl disable kubelet.service

# clear iptables and ipvs
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X
sudo ipvsadm --clear

# clear flannel leftovers
sudo rm /etc/cni/net.d/*flannel*
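
# optionally delete leftover flannel/CNI network interfaces
# (interface names assume the default flannel vxlan backend; skip if absent)
sudo ip link delete cni0 2>/dev/null || true
sudo ip link delete flannel.1 2>/dev/null || true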

# for each user, remove orphaned kubectl admin config
rm -f $HOME/.kube/config