How to install Kubernetes on bare metal

This was written for Kubernetes 1.6

Assume 4 physical nodes, or VMs, that will be used with ScaleIO storage

  • All nodes have 2 CPU cores, 2GB memory, 64GB of disk storage. Hardware or VM type needs to support CentOS 7.3.
    • This is a marginal memory allocation for the master, so go up to 4GB on the master if you can.
  • Assume a single NIC on each node, all on a common subnet - though other configurations may work
  • Each node must have a hostname in DNS, with forward and reverse lookup working; DHCP is OK.

This process is suitable for training and testing, but not for heavy workloads or enterprise-grade production deployments. It is specifically intended for "on premises" deployment to bare metal or a hypervisor. Easier deployment processes are available for running Kubernetes in many of the popular public clouds.

ScaleIO is a software defined storage solution that provides block based storage (what you want for high performance stateful containerized apps such as databases) from commodity x86 servers. It can be deployed with Kubernetes in a converged infrastructure, where ScaleIO is installed on the same nodes as the Kubernetes minions which run containers. However, the process described below assumes a non-converged ScaleIO deployment that is already in place. ScaleIO binaries are available for free download here. You will use only the client (SDC) package (EMC-ScaleIO-sdc-2.0-5014.0.el7.x86_64.rpm) in the process described here.

Install CentOS 7.3 on all nodes (Master, 2 Managed Minion Nodes, 1 UI Host Minion Node)

  1. Use the default CentOS disk format (xfs)
  2. Enable IPv4
  3. Set the timezone, with NTP (default)
  4. For the UI Host Minion Node only, install a desktop version of CentOS. The procedure here only enables access to a browser running directly on the UI node. Enabling user-based access control would require a significantly more complex process than what is described here.

On Master - used to initiate kubeadm install bootstrap

Login as root

Generate ssh key for root

ssh-keygen -t rsa

Copy the public key to the cluster nodes - substitute your actual IPs

cat ~/.ssh/id_rsa.pub | ssh root@192.168.1.32 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh root@192.168.1.33 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh root@192.168.1.34 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
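
If ssh-copy-id is available on the Master, it accomplishes the same thing, for example:

ssh-copy-id root@192.168.1.32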

On all nodes (Master, 2 Managed Minion Nodes, 1 UI Host Minion Node)

As an option, a tool that supports multiple concurrent console sessions, such as tmux, can be useful for efficiently performing the steps that are common to multiple nodes.

Login as root

visudo
  1. uncomment # %wheel ALL=(ALL) NOPASSWD: ALL
  2. comment out the other existing activated %wheel line

Add a non-root user

adduser centos
passwd centos
usermod -aG wheel centos
usermod -aG docker centos

(The docker group does not exist until Docker is installed later in this procedure; if this command fails, repeat it after the Docker install step.)

Login as this centos user on the Master in another session.

  1. Generate an ssh key set for convenience
  2. After other nodes reach the stage of centos user creation, copy the public key to targets

Substitute your actual ips in commands below:

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub | ssh centos@192.168.1.32 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh centos@192.168.1.33 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh centos@192.168.1.34 "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"

On all nodes continued..., as root

Address some Docker related items

vi /etc/default/grub

Add ipv6.disable=1 to the GRUB_CMDLINE_LINUX definition
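
For the GRUB change to take effect you generally need to regenerate the GRUB configuration before the reboot later in this section. On a BIOS-booted CentOS 7 system that is typically:

sudo grub2-mkconfig -o /boot/grub2/grub.cfg

(On an EFI system the output path differs, commonly /boot/efi/EFI/centos/grub.cfg.)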

Stop firewall

sudo systemctl stop firewalld && sudo systemctl disable firewalld

Enable OverlayFS

sudo tee /etc/modules-load.d/overlay.conf <<-'EOF'
overlay
EOF

Disable SELinux

sudo sed -i s/SELINUX=enforcing/SELINUX=permissive/g /etc/selinux/config &&
  sudo groupadd nogroup

Reboot to reload kernel modules and apply the GRUB and SELinux changes

reboot
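
After the reboot you can confirm the changes took effect; getenforce should report Permissive, and the overlay module should show as loaded:

getenforce
lsmod | grep overlay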

On all nodes continued..., as root

Create /etc/sysctl.d/k8s.conf with this content:

cat << EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
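
These settings are applied at boot; to load them immediately without waiting for a reboot you can usually run:

sysctl --system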

Install pre-req and useful packages

yum install -y nano ntp tar xz unzip curl ipset open-vm-tools nfs-utils yum-plugin-versionlock wget
chkconfig ntpd on
service ntpd restart
systemctl enable ntpd
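
Once ntpd has been running for a minute or two, you can verify that time synchronization is working:

ntpq -p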

Define Docker's repo

sudo tee /etc/yum.repos.d/docker.repo <<-'EOF'
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/$releasever/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF

Configure systemd to run the Docker Daemon with OverlayFS:

sudo mkdir -p /etc/systemd/system/docker.service.d && sudo tee /etc/systemd/system/docker.service.d/override.conf <<- EOF
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --storage-driver=overlay
EOF

Install Docker 1.12.6 from Docker's repo (latest version supported by Kubernetes 1.6)

sudo yum install -y docker-engine-1.12.6 docker-engine-selinux-1.12.6
yum versionlock docker-engine docker-engine-selinux
systemctl enable docker && systemctl start docker

Test docker

docker ps
docker run hello-world
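
To confirm that Docker picked up the OverlayFS setting from the drop-in above, docker info reports the active storage driver:

docker info | grep -i 'storage driver'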

Install Kubernetes binaries

Define the Kubernetes repository:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
  https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

WARNING: this is based on a kubeadm workaround specific to version 1.6.0. Some newer 1.6.x releases have known issues with this workaround and no alternative workaround; until this is resolved, you should use 1.6.0.

yum install -y kubelet-1.6.0-0 kubeadm-1.6.0-0 kubectl-1.6.0-0 kubernetes-cni

Start kubelet service

systemctl enable kubelet && systemctl start kubelet

The log is expected to show a startup error - kubelet runs a recovery loop until it connects to the Master

journalctl -xn -u kubelet.service

Install ScaleIO client binary

yum install -y numactl libaio
yum localinstall -y EMC-ScaleIO-sdc-2.0-5014.0.el7.x86_64.rpm
/opt/emc/scaleio/sdc/bin/drv_cfg --add_mdm --ip 192.168.1.11,192.168.1.12 --file /bin/emc/scaleio/drv_cfg.txt
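
If you want to confirm that the MDM IPs were registered with the SDC, the drv_cfg utility can usually report them (check drv_cfg --help if this option differs in your ScaleIO release):

/opt/emc/scaleio/sdc/bin/drv_cfg --query_mdms
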
In another console session, login to ScaleIO MDM as root

Substitute the ip of the node being added in the command below.

scli --add_sdc --sdc_ip 192.168.1.31

You can log off, or leave the session open for subsequent nodes. The command above is needed for each node, using that node's unique IP address.

Back on the node being installed

Initiate a ScaleIO rescan:

/opt/emc/scaleio/sdc/bin/drv_cfg --rescan

On Master node only

Apply install workaround

Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:

Leave the KUBELET_NETWORK_ARGS definition in place, but remove its use from the kubelet ExecStart line below the definition. This is a workaround for a kubeadm install issue in Kubernetes 1.6.0; you will restore the original in a later step. The definition to leave in place looks like this:

Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"

Trigger reload of altered settings
systemctl daemon-reload

In a second login window to Master

Start kubeadm to trigger cluster initialization

Note: the pod-network-cidr parameter is needed when using Flannel. It is not needed if you use Weave instead.

kubeadm init --pod-network-cidr 10.244.0.0/16

After a few minutes, you should get "Your Kubernetes master has initialized successfully!" with output like this. You will need the token when other nodes join the cluster, so save this cluster token (perhaps in a k8s-cluster-token.txt file in /root), for example as shown after the output below.

[apiclient] Waiting for at least one node to register and become ready
[apiclient] First node is ready after 3.001959 seconds
[apiclient] Test deployment succeeded
[token] Using token: 39fe22.de236ac4acd0114f
[apiconfig] Created RBAC rules
[addons] Created essential addon: kube-proxy
[addons] Created essential addon: kube-dns
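
For example, you could capture the token right away so it is available for the join step (the value below is just the sample token from the output above; substitute your own):

echo "39fe22.de236ac4acd0114f" > /root/k8s-cluster-token.txt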

Note: if an error occurs, you cannot simply run kubeadm init again; you must tear down the partially initialized cluster first.
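
kubeadm provides a teardown command for this; running it as root should return the node to a state where kubeadm init can be retried:

kubeadm reset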

Refer to kubeadm documentation in the event of issues.

Continue install workaround

Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:

Revert change - use KUBELET_NETWORK_ARGS on ExecStart for kubelet, like this.

ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_NETWORK_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_EXTRA_ARGS
Trigger reload of altered settings
systemctl daemon-reload

Check kubelet status (should be running):

systemctl status kubelet

Check log for issues:

journalctl -xn -u kubelet.service

Log on to master in a new console as user centos

Choose a CNI network add-on - see http://kubernetes.io/docs/admin/addons/ for more details

If you choose Weave

kubectl apply -f https://git.io/weave-kube-1.6

Else, if you choose Flannel

kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-rbac.yml
kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Verify that the CNI network pod deployed

kubectl get pods --all-namespaces

Restart the kubelet service

sudo systemctl restart kubelet.service

Examine log to verify kubelet restarted

At this point kubelet should be running normally on the Master node.

journalctl -xn -u kubelet.service

The pod network is now deployed to the cluster.

On all other nodes, log on as user centos

Join the node to the cluster

You will need the cluster token you saved from the cluster initialization step.

systemctl enable kubelet && systemctl start kubelet
kubeadm join --token 39fe22.de236ac4acd0114f 192.168.1.31:6443

check kubelet service startup success

journalctl -xn -u kubelet.service
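
Back on the Master, as the centos user, you can confirm that each node has registered with the cluster:

kubectl get nodes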

On the ui node, log on as user centos
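
The commands below assume the cluster admin kubeconfig is present at $HOME/admin.conf on the UI node. kubeadm init creates it at /etc/kubernetes/admin.conf on the Master, so one way to get it here (assuming the Master IP of 192.168.1.31 and root ssh access to the Master) is:

scp root@192.168.1.31:/etc/kubernetes/admin.conf $HOME/admin.conf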

start the UI pod

export KUBECONFIG=$HOME/admin.conf
kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml
kubectl proxy

This should output: Starting to serve on 127.0.0.1:8001

The Kubernetes Dashboard should be operational if you open http://localhost:8001/ui in a browser running on the UI node desktop (physical or hypervisor hosted console). It will not be accessible remotely.

Troubleshooting Kubernetes

The Kubernetes scheduler and kube-proxy run in a container. You can use docker ps and then docker logs <id> to show logs. If the UI is functional, you can use it to show these logs.
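
For example, to locate and inspect the kube-proxy container on a node:

docker ps | grep kube-proxy
docker logs <id>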

The kubelet and container runtime on each node, for example Docker, do not run in containers.

  1. You can use journalctl -xn -u kubelet.service to show the complete log for a service
  2. You can use journalctl --since "2017-05-07 08:00:00" to show the log since a specified time.