This guide has moved to a GitHub repository to enable collaboration and community input via pull-requests.
https://github.com/alexellis/k8s-on-raspbian
Alex
#!/bin/sh
# This installs the base instructions up to the point of joining / creating a cluster
curl -sSL get.docker.com | sh && \
sudo usermod pi -aG docker
sudo dphys-swapfile swapoff && \
sudo dphys-swapfile uninstall && \
sudo update-rc.d dphys-swapfile remove
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add - && \
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list && \
sudo apt-get update -q && \
sudo apt-get install -qy kubeadm
echo Adding " cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory" to /boot/cmdline.txt
sudo cp /boot/cmdline.txt /boot/cmdline_backup.txt
orig="$(head -n1 /boot/cmdline.txt) cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory"
echo $orig | sudo tee /boot/cmdline.txt
echo Please reboot
Thank you very much for the write-up, very straightforward and working like a charm. Just make sure to update the Raspbian images to the latest kernel version and to follow the instructions. On Stretch Lite you don't have to create the dhcpcd.conf file yourself for static IPs, just adjust the existing file. It also sometimes helps to wait a bit: getting all the kube-system containers into a ready and running state can take a while. Now getting ready for OpenFaaS and maybe running some in-house Ghost blogs.
@deurk How did you get it to work? I ran init with k8s 1.11.2 but I get this error: "failed to pull image [k8s.gcr.io/kube-proxy-arm:v1.11.2]: exit status 1". Can you help me fix it?
I used a Pi 3 Model B.
Just wanted to share here as I had been watching this thread off and on for a while as @aaronkjones directed me here. The latest deployment using Ansible for all of this now works again. I am using Weave BTW.
Has anyone been able to get their Rpi3 k8s cluster integrated with Gitlab-CE? I'm trying to integrate my cluster right now and it's failing to install Helm-Tiller through Gitlab CE - just failing to connect in general really.
figured i'd contribute some deviations from the instructions that helped me, i managed to get this running (28/08/2018) by specifying version 1.8.3 on kubelet, kubectl, kubeadm install, and using the flannel network.
Thank you for your example install commands for v1.9.7 @chito4! I was that tired soul you referenced who needed that info to get this working.
Got it running on Raspberry Pi 3B+, 16GB Micro SD card, Raspbian Stretch Lite OS, Kubernetes v1.9.7 (kubeadm, kubectl), Weave CNI, Docker 18.06.1-ce
I haven't attempted any nodes except the master node so far, but will update if I run into problems.
I was able to get a 7-node Raspberry Pi cluster running using:
Here is my exact system state:
$ cat /proc/device-tree/model
Raspberry Pi 3 Model B Plus Rev 1.3
$ uname -a
Linux pi-master 4.14.62-v7+ #1134 SMP Tue Aug 14 17:10:10 BST 2018 armv7l GNU/Linux
$ cat /boot/cmdline.txt
dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=830d7945-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait cgroup_enable=cpuset cgroup_enable=memory
Note: Here's a really easy way to append the cgroup stuff to the /boot/cmdline.txt file:
sudo sed -i 's/ rootwait$/ rootwait cgroup_enable=cpuset cgroup_enable=memory/g' /boot/cmdline.txt
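The sed one-liner only matches when rootwait is the last word on the line, and the install script at the top of this page appends the flags again each time it is run. A minimal sketch of an idempotent alternative is below. The function name add_cgroup_flags is made up for illustration, and it assumes the kernel command line is the first line of the file.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original gist): append any cgroup flag
# that is missing from the first line of a kernel command line file.
add_cgroup_flags() {
  file="$1"
  line="$(head -n1 "$file")"
  for flag in cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory; do
    case " $line " in
      *" $flag "*) ;;                 # flag already present, leave it alone
      *) line="$line $flag" ;;        # flag missing, append it
    esac
  done
  # Write the (possibly extended) single line back to the file.
  printf '%s\n' "$line" > "$file"
}
```

To use it on a Pi, take a backup first (sudo cp /boot/cmdline.txt /boot/cmdline_backup.txt), then call add_cgroup_flags /boot/cmdline.txt from a root shell and reboot. Running it a second time leaves the file unchanged.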
When I initialized the cluster, I used the following command:
$ sudo kubeadm init --token-ttl=0 --kubernetes-version v1.11.2 --apiserver-advertise-address=192.168.1.48
$ docker version
Client:
Version: 18.06.1-ce
API version: 1.38
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:30:52 2018
OS/Arch: linux/arm
Experimental: false
Server:
Engine:
Version: 18.06.1-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:26:37 2018
OS/Arch: linux/arm
Experimental: false
$ kubeadm version # (formatted for readability)
kubeadm version: &version.Info{
Major:"1",
Minor:"11",
GitVersion:"v1.11.2",
GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239",
GitTreeState:"clean",
BuildDate:"2018-08-07T23:14:39Z",
GoVersion:"go1.10.3",
Compiler:"gc",
Platform:"linux/arm",
}
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
pi-master Ready master 13m v1.11.2
pi-node-01 Ready <none> 5m v1.11.2
pi-node-02 Ready <none> 5m v1.11.2
pi-node-03 Ready <none> 5m v1.11.2
pi-node-04 Ready <none> 5m v1.11.2
pi-node-05 Ready <none> 5m v1.11.2
pi-node-06 Ready <none> 5m v1.11.2
$ kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-78fcdf6894-zl96s 1/1 Running 0 12m
coredns-78fcdf6894-zxc95 1/1 Running 0 12m
etcd-pi-master 1/1 Running 0 11m
kube-apiserver-pi-master 1/1 Running 1 11m
kube-controller-manager-pi-master 1/1 Running 0 11m
kube-proxy-2zblw 1/1 Running 0 6m
kube-proxy-5wb5l 1/1 Running 0 6m
kube-proxy-6cngc 1/1 Running 0 6m
kube-proxy-6sk8t 1/1 Running 0 6m
kube-proxy-dczbt 1/1 Running 0 6m
kube-proxy-rtvtm 1/1 Running 0 12m
kube-proxy-zvwph 1/1 Running 0 6m
kube-scheduler-pi-master 1/1 Running 0 11m
weave-net-49hfp 2/2 Running 1 6m
weave-net-cz7pm 2/2 Running 0 6m
weave-net-j75wt 2/2 Running 0 6m
weave-net-jb7vp 2/2 Running 0 6m
weave-net-kmzd8 2/2 Running 0 6m
weave-net-kzspl 2/2 Running 0 10m
weave-net-wr692 2/2 Running 1 6m
$ dpkg -l | egrep "kube|docker"
ii docker-ce 18.06.1~ce~3-0~raspbian armhf Docker: the open-source application container engine
ii kubeadm 1.11.2-00 armhf Kubernetes Cluster Bootstrapping Tool
ii kubectl 1.11.2-00 armhf Kubernetes Command Line Tool
ii kubelet 1.11.2-00 armhf Kubernetes Node Agent
ii kubernetes-cni 0.6.0-00 armhf Kubernetes CNI
Still no joy using the scripts.
cat /proc/device-tree/model
Raspberry Pi 2 Model B Rev 1.1
uname -a
Linux node-1 4.14.50-v7+ #1122 SMP Tue Jun 19 12:26:26 BST 2018 armv7l GNU/Linux
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add - && \
> echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list && \
> sudo apt-get update -q && \
> sudo apt-get install -qy kubeadm
OK
deb http://apt.kubernetes.io/ kubernetes-xenial main
Hit:1 http://archive.raspberrypi.org/debian stretch InRelease
Get:2 http://raspbian.raspberrypi.org/raspbian stretch InRelease [15.0 kB]
Hit:4 https://download.docker.com/linux/raspbian stretch InRelease
Get:3 https://packages.cloud.google.com/apt kubernetes-xenial InRelease [8,993 B]
Get:5 https://packages.cloud.google.com/apt kubernetes-xenial/main armhf Packages [18.3 kB]
Fetched 42.3 kB in 3s (13.3 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
kubeadm is already the newest version (1.12.0-rc.1-00).
sudo kubeadm init --token-ttl=0 --apiserver-advertise-address=192.168.1.100 --ignore-preflight-errors=ALL
[init] using Kubernetes version: v1.11.3
[preflight] running pre-flight checks
[WARNING KubeletVersion]: the kubelet version is higher than the control plane version. This is not a supported version skew and may lead to a malfunctional cluster. Kubelet version: "1.12.0-rc.1" Control plane version: "1.11.3"
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [node-1 localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [node-1 localhost] and IPs [192.168.1.100 127.0.0.1 ::1]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [node-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.100]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certificates] Generated sa key and public key.
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] this might take a minute or longer if the control plane images have to be pulled
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
couldn't initialize a Kubernetes cluster
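The KubeletVersion warning in the log above appears because the installed kubelet (1.12.0-rc.1) has a higher minor version than the control plane version kubeadm selected (1.11.3). As a rough sketch, the same comparison can be made in plain shell before running kubeadm init. The helper names minor_of and kubelet_too_new are hypothetical, and the sketch assumes both versions share the same major version.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# minor_of VERSION: print the minor component of "v1.11.3" or "1.12.0-rc.1".
minor_of() {
  v="${1#v}"        # strip an optional leading "v"
  v="${v#*.}"       # drop the major component and its dot
  printf '%s\n' "${v%%.*}"
}

# kubelet_too_new KUBELET_VERSION CONTROL_PLANE_VERSION
# Succeeds when the kubelet minor version is higher than the control plane's.
# Assumption: both versions have the same major version.
kubelet_too_new() {
  [ "$(minor_of "$1")" -gt "$(minor_of "$2")" ]
}
```

For example, kubelet_too_new 1.12.0-rc.1 v1.11.3 succeeds, which matches the warning. One way to avoid the skew is to install matching package versions explicitly, as a later comment in this thread does with kubeadm=1.13.1-00 kubectl=1.13.1-00 kubelet=1.13.1-00.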
Hi @alexellis, thanks for your great work.
Just noticed that the link inside quick.md hasn't been updated.
cgroup_enable=memory
. Thanks again.
For those who want to save their SD cards while running K8s like this (I know I go through a lot of them for fun and games), here is a quick and dirty fix that is really simple to add to this how-to:
I assume you have a Raspbian / rsyslog / whatever NAS running somewhere on your local network. DO NOT DO THIS OVER THE NET unless you have a hackwish ;)
On your NAS:
add before GLOBAL DIRECTIVES
sudo nano /etc/rsyslog.conf
# provides TCP syslog reception
module(load="imtcp")
input(type="imtcp" port="514")
$template DynaFile,"/<YOUR NAS PATH>/%HOSTNAME%/%syslogfacility-text%.log"
*.* -?DynaFile
$ sudo systemctl reload rsyslog
(make sure your NAS firewall accepts 514 incoming).
Then on all "masters & workers" comment everything out in /etc/rsyslog.conf after the "RULES" section and add a little something like this:
$ sudo nano /etc/rsyslog.conf
# To remote syslog server
*.* @@192.168.x.y:514
sudo systemctl reload rsyslog
I tried to keep it as simple as possible, again QUICK AND DIRTY adjust to your own situation. No need for complicated remote logging services and so on. And trust me your SD card will be a lot happier!!!
Thx again to that post. Unfortunately it does not work for me. So far the latest 1.9.x version is useable for me. More on that can be found here.
Running the get.docker.com script fails for me unless I disable swap first.
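kubeadm's preflight checks also complain when swap is on, so it can be worth confirming that the dphys-swapfile steps actually took effect. A small sketch, assuming the usual Linux /proc/swaps layout (one header line, then one line per active swap area). swap_active is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the swaps file lists at least one swap area.
# Assumes the /proc/swaps format: a header line followed by one line per area.
swap_active() {
  swaps_file="${1:-/proc/swaps}"
  awk 'NR > 1 { found = 1 } END { exit !found }' "$swaps_file"
}
```

On a node, something like swap_active && echo "swap is still on" gives a quick answer before running the installer or kubeadm init.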
thx @denhamparry, worked for me without kubectl proxy
The issue with kubeadm init crashing on v1.12.x is due to the kube-apiserver container running out of memory (code 137). I've put up a bug report in kubernetes/kubeadm: kubernetes/kubeadm#1279
Hopefully, we can get a solution put together so we can run the latest kubernetes...
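For anyone wondering about the number: exit code 137 is the conventional encoding of a process killed by signal 9 (SIGKILL), since codes above 128 are 128 plus the signal number, and SIGKILL is what the kernel's OOM killer sends. A tiny sketch of that arithmetic, with signal_from_exit_code as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: map a shell-style exit code to the signal that caused it.
# Prints 0 when the code does not look like a signal-style exit (<= 128).
signal_from_exit_code() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo 0
  fi
}
```

To check a specific container on the node, docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' CONTAINERID prints the exit code and whether Docker recorded an out-of-memory kill.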
To get the current flannel manifest (https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml) to work on 1.12.2 I had to apply the patch suggested here:
kubectl patch daemonset kube-flannel-ds-arm \
--namespace=kube-system \
--patch='{"spec":{"template":{"spec":{"tolerations":[{"key": "node-role.kubernetes.io/master", "operator": "Exists", "effect": "NoSchedule"},{"effect":"NoSchedule","operator":"Exists"}]}}}}'
I just went through the entire process, and as of today (2018-12-03), it is very very unstable and fragile.
Recapping for anybody that is spending sleepless nights on it:
followed https://gist.github.com/alexellis/a7b6c8499d9e598a285669596e9cdfa2 - my nodes are called ocramius-k8s-pi-1 (192.168.1.110) and ocramius-k8s-pi-2 (192.168.1.111)
followed steps above until before kubeadm init (note: as I'm writing, I have v1.12.3 installed)
had to downgrade docker-ce to 18.06.0 on all hosts, due to kubernetes/minikube#3323. To do that, I followed:
curl -sSL get.docker.com | sh && \
sudo usermod pi -aG docker
newgrp docker
apt purge -y docker-ce && apt autoremove -y
apt install docker-ce=18.06.0~ce~3-0~raspbian
Note that ignoring the preflight checks with --ignore-preflight-errors=SystemVerification won't work, since something changed in how dockerd handles temporary files. Make sure that docker version reports 18.06.0:
pi@ocramius-k8s-pi-1:/home/pi# docker version
Client:
Version: 18.06.0-ce
API version: 1.38
Go version: go1.10.3
Git commit: 0ffa825
Built: Wed Jul 18 19:19:46 2018
OS/Arch: linux/arm
Experimental: false
Server:
Engine:
Version: 18.06.0-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: 0ffa825
Built: Wed Jul 18 19:15:34 2018
OS/Arch: linux/arm
Experimental: false
went with the flannel setup (couldn't get weavenet to work):
sudo kubeadm init --token-ttl=0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.1.110
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml
sudo sysctl net.bridge.bridge-nf-call-iptables=1
kubectl get pods --namespace=kube-system will report something like the following:
pi@ocramius-k8s-pi-1:/home/pi# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-576cbf47c7-9bp9s 0/1 ContainerCreating 0 2m29s
coredns-576cbf47c7-jmgf5 0/1 ContainerCreating 0 2m29s
etcd-ocramius-k8s-pi-1 1/1 Running 0 110s
kube-apiserver-ocramius-k8s-pi-1 1/1 Running 1 95s
kube-controller-manager-ocramius-k8s-pi-1 1/1 Running 0 106s
kube-proxy-t4qc7 1/1 Running 0 2m29s
kube-scheduler-ocramius-k8s-pi-1 1/1 Running 0 2m44s
Inspecting the pods that are in ContainerCreating status, you will get something like:
kubectl describe pods coredns-576cbf47c7-9bp9s --namespace=kube-system
<snip>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m41s default-scheduler Successfully assigned kube-system/coredns-576cbf47c7-9bp9s to ocramius-k8s-pi-1
Warning NetworkNotReady 1s (x13 over 2m41s) kubelet, ocramius-k8s-pi-1 network is not ready: [runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized]
applied the patch suggested by @cglazner right above me:
kubectl patch daemonset kube-flannel-ds-arm \
--namespace=kube-system \
--patch='{"spec":{"template":{"spec":{"tolerations":[{"key": "node-role.kubernetes.io/master", "operator": "Exists", "effect": "NoSchedule"},{"effect":"NoSchedule","operator":"Exists"}]}}}}'
system will recover:
pi@ocramius-k8s-pi-1:/home/pi# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-576cbf47c7-9bp9s 1/1 Running 0 31m
coredns-576cbf47c7-jmgf5 1/1 Running 0 31m
etcd-ocramius-k8s-pi-1 1/1 Running 0 30m
kube-apiserver-ocramius-k8s-pi-1 1/1 Running 1 30m
kube-controller-manager-ocramius-k8s-pi-1 1/1 Running 0 30m
kube-flannel-ds-arm-pznfc 1/1 Running 0 19m
kube-proxy-t4qc7 1/1 Running 0 31m
kube-scheduler-ocramius-k8s-pi-1 1/1 Running 0 31m
pi@ocramius-k8s-pi-1:/home/pi# kubectl get nodes
NAME STATUS ROLES AGE VERSION
ocramius-k8s-pi-1 Ready master 32m v1.12.3
can now join other nodes (in my case ocramius-k8s-pi-2):
sudo sysctl net.bridge.bridge-nf-call-iptables=1
kubeadm join 192.168.1.110:6443 --token <snip> --discovery-token-ca-cert-hash sha256:<snip>
verify status:
pi@ocramius-k8s-pi-1:/home/pi# kubectl get nodes
NAME STATUS ROLES AGE VERSION
ocramius-k8s-pi-1 Ready master 37m v1.12.3
ocramius-k8s-pi-2 Ready <none> 70s v1.12.3
Very nice!
Indeed the patch mentioned by cglazner did fix the flannel issue.
I'm on k8s 1.13.0 with flannel 0.10-arm:
#kubectl get no
NAME STATUS ROLES AGE VERSION
pi-master Ready,SchedulingDisabled master 53m v1.13.0
pi3-slave-01 Ready worker 46m v1.13.0
pi3-slave-02 Ready worker 46m v1.13.0
pi3-slave-03 Ready worker 46m v1.13.0
pi3-slave-04 Ready worker 46m v1.13.0
I use the below code to create the cluster:
--On Master node--
root:
kubeadm init --token-ttl=0 --pod-network-cidr=10.244.0.0/16
myuser :
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown
curl -sSL https://rawgit.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml| sed "s/amd64/arm/g" | kubectl create -f -
kubectl -n kube-system patch daemonset kube-flannel-ds \
--patch='{"spec":{"template":{"spec":{"tolerations":[{"key": "node-role.kubernetes.io/master", "operator": "Exists", "effect": "NoSchedule"},{"effect":"NoSchedule","operator":"Exists"}]}}}}'
sysctl net.bridge.bridge-nf-call-iptables=1
--On Slaves node--
kubeadm join 192.168.x.x:6443 --token --discovery-token-ca-cert-hash sha256:
--On Master node--
kubectl cordon pi-master
kubectl label node pi3-slave-01 node-role.kubernetes.io/worker=
kubectl label node pi3-slave-02 node-role.kubernetes.io/worker=
kubectl label node pi3-slave-03 node-role.kubernetes.io/worker=
kubectl label node pi3-slave-04 node-role.kubernetes.io/worker=
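The four label commands differ only in the node name, so they can be collapsed into a loop. A sketch, assuming the pi3-slave-NN naming used above. The KUBECTL variable is an illustrative hook (not a kubectl feature) so the commands can be previewed with KUBECTL=echo before touching the cluster.

```shell
#!/bin/sh
# Hypothetical helper: label each pi3-slave-NN node as a worker.
# KUBECTL is an illustrative override; it defaults to the real kubectl binary.
label_workers() {
  kubectl_cmd="${KUBECTL:-kubectl}"
  for n in "$@"; do
    $kubectl_cmd label node "pi3-slave-$n" node-role.kubernetes.io/worker=
  done
}
```

Running KUBECTL=echo label_workers 01 02 03 04 prints the arguments that would be passed to kubectl, and dropping the KUBECTL=echo prefix applies the labels for real.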
Have been playing around with this over the weekend, really enjoying the project!
I hit a block with Kubernetes Dashboard, and realised that I couldn't connect to it via proxy because its service type was set to ClusterIP rather than NodePort.
* Edit `kubernetes-dashboard` service.
$ kubectl -n kube-system edit service kubernetes-dashboard
* You should then see the YAML representation of the service. Change type: **ClusterIP** to type: **NodePort** and save the file.
* Check the port on which Dashboard was exposed.
$ kubectl -n kube-system get service kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 10.108.252.18 <none> 80:30294/TCP 23m
* Create a proxy to view within your browser
$ ssh -L 8001:127.0.0.1:31707 pi@k8s-master-1.local
* Browse [localhost:8001](http://localhost:8001)
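Note that the port in the ssh tunnel has to be the NodePort from your own get service output (the number after the colon in the PORT(S) column); the sample output and the ssh command above show different values. A small sketch for pulling the NodePort out of a PORT(S) value, with node_port_of as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: extract the NodePort from a PORT(S) value
# such as "80:30294/TCP" (service port, colon, node port, slash, protocol).
node_port_of() {
  ports="$1"
  ports="${ports#*:}"              # drop the service port and the colon
  printf '%s\n' "${ports%%/*}"     # drop the slash and the protocol
}
```

With the sample output above, node_port_of 80:30294/TCP prints 30294, which is the value to place after 127.0.0.1: in the ssh -L argument. Asking kubectl directly with -o jsonpath='{.spec.ports[0].nodePort}' on the same get service command should give the same number.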
Thanks again Alex!
Many thanks Alex for a great post, and thanks to Denham. I incorporated your comments into my dashboard config, with help from this post, to get me up and running: https://kubecloud.io/kubernetes-dashboard-on-arm-with-rbac-61309310a640 I had experienced the same issues with RBAC permissions described in the post. (from...
$ kubectl create serviceaccount dashboard....
I hope this helps anyone who is getting to grips with K8S RPI cluster!! ::D
New problem seems to have cropped up on flannel: flannel-io/flannel#1060
vxlan.go:120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false E1218 03:12:12.719715 1 main.go:280] Error registering network: failed to configure interface flannel.1: failed to ensure address of interface flannel.1: link has incompatible addresses. Remove additional addresses and try again. &netlink.Vxlan{LinkAttrs:netlink.LinkAttrs{Index:5, MTU:1450, TxQLen:0, Name:"flannel.1", HardwareAddr:net.HardwareAddr{0x2, 0x2a, 0x1c, 0x2f, 0x61, 0x25}, Flags:0x13, RawFlags:0x11043, ParentIndex:0, MasterIndex:0, Namespace:interface {}(nil), Alias:"", Statistics:(*netlink.LinkStatistics)(0x13a320e4), Promisc:0, Xdp:(*netlink.LinkXdp)(0x13812200), EncapType:"ether", Protinfo:(*netlink.Protinfo)(nil), OperState:0x0}, VxlanId:1, VtepDevIndex:2, SrcAddr:net.IP{0xc0, 0xa8, 0x3, 0x32}, Group:net.IP(nil), TTL:0, TOS:0, Learning:false, Proxy:false, RSC:false, L2miss:false, L3miss:false, UDPCSum:true, NoAge:false, GBP:false, Age:300, Limit:0, Port:8472, PortLow:0, PortHigh:0}
To allow the pod to start successfully, SSH onto the worker and run sudo ip link delete flannel.1. Recreating the pod will then start successfully.
This comment combines the knowledge of this gist and the many comments above plus the workings of https://github.com/aaronkjones/rpi-k8s-node-prep
Download Raspbian Stretch Lite (2018-11-13 4.14 kernel), flash to sd cards for your cluster (in my case 5 cards). Once flashed and BEFORE you boot the pis, set up the networking by mounting the sd card (you can of course just boot the Pis and setup networking on each machine, my personal preference is to do this in advance):
Turn on ssh: sudo touch <boot partition mount point>/ssh
Enable C-Groups: sudo vi <boot partition mount point>/cmdline.txt
dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=7ee80803-02 rootfstype=ext4 elevator=deadline fsck.repair=yes cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory rootwait quiet init=/usr/lib/raspi-config/init_resize.sh
I'm using wired networking on a 192.168.2.xx subnet so setup the host entries: sudo vi <rootfs partition mount point>/etc/hosts
...add to bottom of file...
192.168.2.31 node01
192.168.2.32 node02
192.168.2.33 node03
192.168.2.34 node04
192.168.2.35 node05
sudo vi <rootfs partition mount point>/etc/dhcpcd.conf
...add to bottom of file...
interface eth0
static ip_address=192.168.2.XX/24 <--- change XX = 31,32,33,34,35
static routers=192.168.2.1
static domain_name_servers=192.168.2.1
Unmount the sd card, then in turn mount the other 4 sd cards repeating the steps above changing the ip address. Once you have setup all 5 sdcards put them into the pis and power on the cluster. SSH to each in turn and complete the configuration and install the software with the below steps.
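Since only the last octet changes between cards, the dhcpcd.conf stanza and the hosts entries can be generated rather than typed five times. A minimal sketch, assuming the 192.168.2.31-35 / node01-05 layout described above. static_ip_stanza and hosts_entries are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers for the 192.168.2.x layout used in this comment.

# static_ip_stanza LAST_OCTET: print the dhcpcd.conf block for one node.
static_ip_stanza() {
  last_octet="$1"
  cat <<EOF
interface eth0
static ip_address=192.168.2.${last_octet}/24
static routers=192.168.2.1
static domain_name_servers=192.168.2.1
EOF
}

# hosts_entries: print the /etc/hosts lines for node01..node05.
hosts_entries() {
  i=1
  for octet in 31 32 33 34 35; do
    printf '192.168.2.%s node%02d\n' "$octet" "$i"
    i=$((i + 1))
  done
}
```

For the second card, for example, static_ip_stanza 32 | sudo tee -a <rootfs partition mount point>/etc/dhcpcd.conf appends the stanza, and hosts_entries | sudo tee -a <rootfs partition mount point>/etc/hosts adds the same host list to every card.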
ssh pi@192.168.2.XX
sudo -i
hostnamectl set-hostname nodeXX <- change to node01/02/03/04/05 as appropriate
apt-get update
apt-get upgrade -y
curl -s https://download.docker.com/linux/raspbian/gpg | sudo apt-key add -
echo "deb [arch=armhf] https://download.docker.com/linux/raspbian stretch edge" | sudo tee /etc/apt/sources.list.d/docker.list
apt-get update -q
apt-get install -y docker-ce=18.06.0~ce~3-0~raspbian --allow-downgrades
echo "docker-ce hold" | sudo dpkg --set-selections
usermod pi -aG docker
dphys-swapfile swapoff
dphys-swapfile uninstall
update-rc.d dphys-swapfile remove
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list
apt-get update -q
apt-get install -y kubeadm=1.13.1-00 kubectl=1.13.1-00 kubelet=1.13.1-00
reboot
ssh pi@192.168.2.31
sudo kubeadm init --token-ttl=0 --apiserver-advertise-address=192.168.2.31 --kubernetes-version v1.13.1
Save the join token and token hash, it will be needed in "Setup slave nodes02-05"
Make local config for pi user, so login as pi on node01
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check it's working (except the dns pods won't be ready)
kubectl get pods --all-namespaces
Setup kubernetes overlay networking
kubectl apply -f https://git.io/weave-kube-1.6
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
Join the cluster using the join token and token hash from when you ran kubeadm init on node01
sudo kubeadm join 192.168.2.31:6443 --token xxxxxxxxxxxxxxxxxxxxxxxx --discovery-token-ca-cert-hash sha256:yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
Back on the master node01 check the nodes have joined the cluster and that pods are running:
kubectl get nodes
kubectl get pods --all-namespaces
Deploy the tls disabled version of the dashboard
echo -n 'apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kubernetes-dashboard
labels:
k8s-app: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kube-system' | kubectl apply -f -
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/alternative/kubernetes-dashboard-arm.yaml
To access the dashboard start the proxy on node01:
kubectl proxy --address 0.0.0.0 --accept-hosts '.*'
Then from your pc point your browser at:
http://192.168.2.31:8001/api/v1/namespaces/kube-system/services/http:kubernetes-dashboard:/proxy/
@andyburgin I followed your instructions, but I can't get the master node running...
The kubeadm init [...] did not finish:
[...]
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
I waited for some time until all pods were "Running":
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-msh-master 1/1 Running 0 105s
kube-system kube-apiserver-msh-master 1/1 Running 5 107s
kube-system kube-controller-manager-msh-master 1/1 Running 0 115s
kube-system kube-scheduler-msh-master 1/1 Running 0 76s
(is it possible that kube-dns and kube-proxy are missing?)
Then I applied the two weave-net files you mentioned:
kubectl apply -f https://git.io/weave-kube-1.6
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
But the weave-net pod will not become "Running"...
ERROR: logging before flag.Parse: E0120 16:25:32.259195 11085 reflector.go:205] github.com/weaveworks/weave/prog/weave-npc/main.go:319: Failed to list *v1.Pod: Get https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
ERROR: logging before flag.Parse: E0120 16:25:32.267598 11085 reflector.go:205] github.com/weaveworks/weave/prog/weave-npc/main.go:320: Failed to list *v1.NetworkPolicy: Get https://10.96.0.1:443/apis/networking.k8s.io/v1/networkpolicies?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
ERROR: logging before flag.Parse: E0120 16:25:32.274948 11085 reflector.go:205] github.com/weaveworks/weave/prog/weave-npc/main.go:318: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
10.96.0.1 seems to be the kubernetes service IP:
$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 17m
Oooookayy... I finally managed to get it working \o/
I wrote a small bash script that checks for /etc/kubernetes/manifests/kube-apiserver.yaml to update failureThreshold (new value: 100) and initialDelaySeconds (new value: 1080) as soon as the file exists. The new values are much bigger than they need to be, but they allowed me to get my master node up and running! Whenever I tried to change these values by hand, the kubeadm init ... command failed.
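The script itself was not posted, so the following is only a guess at its shape: a minimal sketch that polls for the manifest and rewrites the two probe settings with the values mentioned above. The function names, the one-second polling interval and the default timeout are all assumptions.

```shell
#!/bin/sh
# Hypothetical reconstruction, NOT the commenter's actual script.

# patch_manifest FILE: set failureThreshold to 100 and initialDelaySeconds to 1080.
patch_manifest() {
  manifest="$1"
  tmp="$(mktemp)"
  sed -e 's/failureThreshold: [0-9][0-9]*/failureThreshold: 100/' \
      -e 's/initialDelaySeconds: [0-9][0-9]*/initialDelaySeconds: 1080/' \
      "$manifest" > "$tmp" && cat "$tmp" > "$manifest"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# wait_and_patch FILE [TRIES]: poll once per second until FILE exists, then patch it.
# Returns non-zero if the file never appears within TRIES attempts (default 600).
wait_and_patch() {
  target="$1"
  tries="${2:-600}"
  while [ "$tries" -gt 0 ]; do
    if [ -f "$target" ]; then
      patch_manifest "$target"
      return $?
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

The idea would be to start wait_and_patch /etc/kubernetes/manifests/kube-apiserver.yaml in a root shell just before running kubeadm init in another terminal. This sketch patches as soon as the file appears, so it may race with kubeadm still writing it.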
I just set up a working cluster but couldn't get the master running on an RPI 2. Moved the SD card over to an RPI 3 and then kubeadm init ran just fine. The worker nodes seem to run just fine on the RPI 2.
Wondering if anyone has gotten helm/tiller working in this configuration?
@rnbwkat I got it working but I had to specify a different tiller image, one which was compatible with ARM. The command I used was:
helm init --service-account tiller --tiller-image=jessestuart/tiller:v2.9.0
@janpieper I've run into the "node not found" error. Looking through all the comments, I was going to follow the same steps you did. Which versions of k8s and docker did you install?
I tried to set up the cluster following the steps described but still didn't get a successful kubeadm init. I tried different versions of k8s and docker. Does anybody have the steps to get 1.13-3 working with 18.09.0?
@janpieper can you share the script?
@janpieper steps worked up until the point everyone mentioned, and rather than the script that polls and zaps the config, I found you can do the same (after the initial failure) by running these commands (lifted from this issue kubernetes/kubeadm#1380)
sudo kubeadm reset
sudo kubeadm init phase certs all
sudo kubeadm init phase kubeconfig all
sudo kubeadm init phase control-plane all --pod-network-cidr 10.244.0.0/16
sudo sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g' /etc/kubernetes/manifests/kube-apiserver.yaml
sudo kubeadm init --v=1 --skip-phases=certs,kubeconfig,control-plane --ignore-preflight-errors=all --pod-network-cidr 10.244.0.0/16
Then I installed flannel.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.11.0/Documentation/kube-flannel.yml
Something that threw me off was the shell demo that Kubernetes provides works fine (kubectl apply -f https://k8s.io/examples/application/shell-demo.yaml) docs here:
https://kubernetes.io/docs/tasks/debug-application-cluster/get-shell-running-container/
But it fails when doing a deployment of nginx from their example here:
https://kubernetes.io/docs/tasks/run-application/run-stateless-application-deployment/
Turns out the nginx image isn't compatible with ARM, once I changed the image to a pi supported image (tobi312/rpi-nginx) it worked fine! Thanks to everyone here, I finally got my pi cluster going.
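An image built for a different CPU architecture typically fails on the Pi with an exec format error, so it can help to compare the node's architecture with the one the image was built for. A rough sketch; docker_arch_for is a hypothetical helper and the mapping only covers the common cases.

```shell
#!/bin/sh
# Hypothetical helper: translate `uname -m` output into Docker's architecture name.
# Only a few common machine types are mapped; anything else prints "unknown".
docker_arch_for() {
  case "$1" in
    armv6l|armv7l) echo arm ;;
    aarch64)       echo arm64 ;;
    x86_64)        echo amd64 ;;
    *)             echo unknown ;;
  esac
}
```

On the armv7l systems shown earlier in this thread, docker_arch_for "$(uname -m)" prints arm. After pulling an image, docker image inspect --format '{{.Architecture}}' IMAGE shows what it was built for, and the two values should match.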
Took me a bit longer than expected to get to test it but I managed to make it work 👍