k8s on rpi3 notes

I've been reading Kubernetes Up and Running, and got excited when I saw that they include some instructions for configuring a cluster on Raspberry Pis, which I've been dying to play with for some time but never had a project to dig into.

I had been recording my progress in the form of a twitter thread, but I thought something easier to scan, and editable, might be a better way to record the outcomes and lessons learned.

As I started to dig in, I found the short guide to be different enough from what I was actually seeing that I decided to keep some notes about the tweaks I was required to make.

In this document I will try to use block quotes (like this one) to indicate subjective observations. These blocks could be labeled as hunches, or even suspicions. In other cases, I'll use these blocks to indicate contradictions, or things that don't work as advertised.

Current State of the network

This is just a quick list of what is working and what isn't. These topics are expanded upon in the sections that follow.

  • Cluster leader has 2 network interfaces, one public (wlan0), one private (eth0).
  • Cluster leader is providing other nodes on the private subnet with addresses via dhcp.
  • All cluster nodes are able to reach each other over their private subnet and switch.
  • Cluster leader is able to reach the internet; the other nodes initially could not, despite listing the leader as their gateway (since fixed with better iptables rules, described below).

Quick overview of the network

The general layout of the cluster outlined in the book has the leader configured with 2 network interfaces:

  • wlan0: the public interface that you'd normally connect to when using kubectl, for example. This interface would be connected to your regular wifi access point, so it'll be on your main network (with internet access).
  • eth0: the wired interface, which would be connected to a switch and would offer dhcp for a private cluster subnet.

The cluster leader would then enable something called IP Forwarding, which should allow traffic on eth0 to reach addresses outside of the private cluster subnet (i.e., for public internet access).

As I understand it, the other nodes in the cluster would connect their wired interfaces to the switch, and skip wifi entirely. With the cluster leader listed as their gateway (in the network config fed to them via dhcp from the leader) they should be able to route from their eth0, through to the cluster leader's eth0, which should then forward through to the leader's wlan0.
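Just to illustrate the intended routing, here's roughly what the routing table on a worker node should look like once it has a lease from the leader. The addresses match the 10.0.0.0/24 subnet used later in these notes, but the exact output is an approximation, not something copied from my nodes:

# On a worker node, after dhcp from the leader has done its job:
$ ip route
default via 10.0.0.1 dev eth0
10.0.0.0/24 dev eth0  proto kernel  scope link  src 10.0.0.3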

Unfortunately this wasn't quite working for me at first: the leader was able to reach public addresses, but the other nodes were not. Fixed by finding better iptables rules than the ones shown in the book (see Setting up IP Forwarding).

From device-init to cloud-init

This was the first snag I ran into. In the book they describe configuring your wifi connection and hostname for the cluster leader by editing a file called /boot/device-init.yaml which does not exist in the current (as of writing) version of the OS image linked to by the book.

Looks like this change was introduced in the last two releases, and while I'm still entertaining the idea of downgrading to bring the book's instructions back into working order, these notes assume we are using v1.7.1, which is the latest as of this writing. If enough time has passed, the latest might be some other version; see the releases list to determine this.

The old way of configuring these seemed to involve plugging in some basic keys and values into a yaml file. The new way, built on a very specific version (v0.7.9) of cloud-init, is not nearly as simple.

The cloud-init way is to edit a file called /boot/user-data which is also a yaml file, but includes blocks which are essentially shell scripts, so you have to watch how you format them. I can appreciate that this is more extensible in that you can effectively do whatever you want, but it's a little ugly and clumsy to someone who doesn't know exactly what they need (like me).

A point of confusion that remains for me is how cloud-init only runs parts of what is found in /boot/user-data on each boot, while much of it executes only once, on first boot. This means that if you image your Pi and boot it up so you can edit the file (to configure your wifi, for example), you've already missed your opportunity for cloud-init to create and wire up the conf files you'll need to edit - you'll have to do some of the automation by hand.

I'm not sure how to force cloud-init to re-run itself as if we're booting for the first time, which might smooth this over. Lacking the know-how on this means I'm left with 2 options, either make all my edits to /boot/user-data before first boot by mounting the sd card after imaging, or run the conf creation by hand by following the script blocks in the file.
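For the first option, something like the following works from a Linux machine right after flashing the image. The /dev/sdX1 device name and mount point are placeholders, not anything specific to the hypriot image; check lsblk first to find the small FAT boot partition on your card:

# Mount the card's boot partition and edit user-data before the Pi's first boot.
$ lsblk
$ sudo mkdir -p /mnt/pi-boot
$ sudo mount /dev/sdX1 /mnt/pi-boot
$ sudo nano /mnt/pi-boot/user-data   # or whatever editor you prefer
$ sudo umount /mnt/pi-boot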

Cluster Leader config

The leader effectively acts as a router in front of the rest of the cluster, so it has some extra configuration that is not required on the rest of the nodes.

Configuring WiFi

Here's an example of the (commented out by default) wifi config:

# # WiFi connect to HotSpot
# # - use `wpa_passphrase SSID PASSWORD` to encrypt the psk
write_files:
  - content: |
      allow-hotplug wlan0
      iface wlan0 inet dhcp
      wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf
      iface default inet dhcp
      wireless-power off
    path: /etc/network/interfaces.d/wlan0
  - content: |
      ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
      update_config=1
      network={
      ssid="YOUR_WIFI_SSID"
      psk="YOUR_WIFI_PASSPHRASE"
      }
    path: /etc/wpa_supplicant/wpa_supplicant.conf

# These commands will be ran once on first boot only
runcmd:
  # Pickup the hostname changes
  - 'systemctl restart avahi-daemon'

#  # Activate WiFi interface
  - 'ifup wlan0'

You'll notice the line just above the wifi config that reads:

use `wpa_passphrase SSID PASSWORD` to encrypt the psk

This didn't actually work for me, perhaps because of the way my home router is configured, and I had to use a plain text passphrase in this config instead of an encrypted one.

This advice to encrypt the passphrase was actually a little frustrating, considering you'd need to be on a system with this binary in order to run it. If you were running linux on whatever computer you're doing this work on, you might have access to it, but on a mac or windows machine you might need to ssh into your Pi to run it, which would mean you'd miss your first boot and then have to go edit the wlan0 conf after the fact to update it with the newly encrypted passphrase.
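For reference, if you do have access to a box with wpa_passphrase installed, the invocation looks like this. The SSID and passphrase are placeholders, and the hashed psk shown is purely illustrative:

$ wpa_passphrase "YOUR_WIFI_SSID" "YOUR_WIFI_PASSPHRASE"
network={
	ssid="YOUR_WIFI_SSID"
	#psk="YOUR_WIFI_PASSPHRASE"
	psk=31c4d85e4cb076bdbd84ea1bf11a5fbd19358bd4b8257a4cefb06e380e7f3c12
}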

Much of what you'll see in this example config is what you'd find commented out by default; however, I noticed that on the rpi3 I have, the wlan0 interface will turn itself off after some time. I added the line reading wireless-power off to prevent the power management system from shutting down the wlan0 interface, which is meant to act as the link to the external network for our cluster.

It's also important to note that the write_files section responsible for creating the wlan0 configs is where most of your customizations are made, but you'll also have to uncomment the very last line (under the runcmd section), which reads - 'ifup wlan0' to actually turn on the wifi.

Configuring eth0

The book offers the following for the content of /etc/network/interfaces.d/eth0

allow-hotplug eth0
iface eth0 inet static
    address 10.0.0.1
    netmask 255.255.255.0
    broadcast 10.0.0.255
    gateway 10.0.0.1

and here's what I currently have:

allow-hotplug eth0
iface eth0 inet static
    address 10.0.0.1
    netmask 255.255.255.0
    broadcast 10.0.0.255

The big deviation here is the removal of the gateway directive, which (when present) prevented IP forwarding from letting the leader contact external IPs, along with the addition of a pre-up line that reloads some saved iptables state (discussed later on).

The removal of the gateway directive is suspect to me. It fixes the cluster leader's internet access, but I wonder if the removal is what breaks internet access for the other nodes that route through the leader.

Configuring dhcp

The book suggests installing the dhcpd service with

$ sudo apt-get install isc-dhcp-server

The book mentions being able to restart dhcpd by running sudo systemctl dhcpd restart, but in this case the service is named isc-dhcp-server rather than dhcpd (and the verb and unit are in the wrong order besides).
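So the working equivalents on this image end up being:

$ sudo systemctl restart isc-dhcp-server
$ systemctl status isc-dhcp-server   # confirm it started cleanly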

The book doesn't mention this, but I wanted to be sure that my Pi didn't try to offer leases on wlan0 which is managed by my home router. In order to restrict dhcpd to specific interfaces, edit /etc/default/isc-dhcp-server. Update the INTERFACES line to list only the private network interface like so:

INTERFACES="eth0"

My /etc/dhcp/dhcpd.conf is very similar to the book's suggestion, with a couple small tweaks.

option domain-name cluster;
option domain-name-servers 8.8.8.8, 8.8.4.4;

subnet 10.0.0.0 netmask 255.255.255.0 {
  range 10.0.0.100 10.0.0.254;
  option subnet-mask 255.255.255.0;
  option broadcast-address 10.0.0.255;
  option routers 10.0.0.1;
  
  # host node-02 {
  #   hardware ethernet XX:XX:XX:XX:XX:XX;
  #   fixed-address 10.0.0.2;
  # }

  # host node-03 {
  #   hardware ethernet XX:XX:XX:XX:XX:XX;
  #   fixed-address 10.0.0.3;
  # }

  # host node-04 {
  #   hardware ethernet XX:XX:XX:XX:XX:XX;
  #   fixed-address 10.0.0.4;
  # }

}

default-lease-time 600;
max-lease-time 7200;
authoritative;

My tweaks were mainly just that I wanted to set the range to not include 10.0.0.1 which is statically assigned to the cluster leader.

I set the dynamic range to be well above the node IPs so each new node comes up with some address I can ssh to, which lets me learn the MAC address of its wired network interface. Once I know the MAC address, I can uncomment and update the static lease for the node in question.
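There are a couple of easy ways to learn the MAC address once a node is up: ask the node itself, or check the leases dhcpd has handed out on the leader. The lease file path below is the usual Debian default, so treat it as an assumption:

# On the node itself (ssh in via its dynamic address):
$ cat /sys/class/net/eth0/address

# Or, on the cluster leader, inspect the leases dhcpd has handed out:
$ sudo grep -E 'lease|hardware ethernet' /var/lib/dhcp/dhcpd.leases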

As an aside, the book's recommendation to use /etc/hosts to give names to the nodes seems undercut by the lack of advice on setting static leases. Without static leases, the nodes could change IPs after being powered down and back up.

The book also suggests option domain-name "cluster.home"; but I shortened this to simply option domain-name cluster; so that I might reach my nodes by node-3.cluster, though I was never actually able to do this. I could only reach the nodes by IP, and in fact, the book recommends adding names for the nodes in /etc/hosts so I'm not sure why it'd make sense to set this in the dhcp config at all.

While configuring the names in /etc/hosts is a little brute force, I've found it the most reliable. I don't plan on running Bind on the cluster leader any time soon, so until that day comes I'll be using static leases and /etc/hosts.

Setting up IP Forwarding

The first thing to do is to edit /etc/sysctl.conf and uncomment, or add the following line:

net.ipv4.ip_forward=1

You will need to either reboot, or run sudo sysctl -p to have this config change reflected by your network stack.
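You can verify the setting took effect with either of the following; both should report 1:

$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1
$ cat /proc/sys/net/ipv4/ip_forward
1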

The book then shows some iptables commands, and suggests running them as part of /etc/rc.local.

Here's mine:

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.


iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE
iptables -A FORWARD -i wlan0 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -i eth0 -o wlan0 -j ACCEPT

exit 0

Running sudo iptables -L -n will show the rules have loaded after a reboot.

The iptables rules shown in the book allowed the cluster leader to reach external IPs, but the other nodes could not access the internet even while routing through the leader.

After much searching, I found this article on the Ubuntu Help site, and with some slight modifications to the interface names, here's a set of rules that work for me:

# The following rule needs to be set up on ALL NODES
iptables -P FORWARD ACCEPT

# The rest should only be configured on the cluster leader
iptables -A FORWARD -o wlan0 -i eth0 -s 10.0.0.0/24 -m conntrack --ctstate NEW -j ACCEPT
iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -t nat -F POSTROUTING
iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE

I do not understand any of this iptables stuff. I don't know why the book's suggestion doesn't work, or why this set of rules from the ubuntu help site does. Your mileage may vary, apparently.
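One option for persisting whatever rule set you end up with (and this is the sort of thing the pre-up approach mentioned earlier is after) is to dump the running rules with iptables-save and reload them at boot. The /etc/iptables.rules path here is just a convention I'm assuming, not something created by the image:

# After getting the rules working in the live session, save them:
$ sudo sh -c 'iptables-save > /etc/iptables.rules'

# Then reload them at boot, either from /etc/rc.local (before "exit 0") or via a
# "pre-up iptables-restore < /etc/iptables.rules" line in the eth0 interface config:
$ sudo iptables-restore < /etc/iptables.rules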

Common Config (all nodes)

iptables

Take special note of the section immediately above this regarding iptables. The following rule should be added to /etc/rc.local on all nodes in the cluster, including the leader.

iptables -P FORWARD ACCEPT

This is noted in the flannel troubleshooting guide, which links to https://docs.docker.com/engine/userguide/networking/default_network/container-communication/#container-communication-between-hosts

Without this rule, your pods will not be able to communicate with each other.
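A quick way to confirm the policy is in place on a given node:

$ sudo iptables -S FORWARD | head -n1
-P FORWARD ACCEPT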

SSH

To start, I ran ssh-keygen on just the cluster leader (node-01 in my case), as the user that comes pre-configured with the hypriot image (pirate).

I then run the following so that a user with this public key can access this host without a password:

cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

After this, I also create a file at ~/.ssh/config which contains:

Host node-*
    User pirate

Once you've got all this in place, you can copy the .ssh directory around to all your nodes using scp.

Unfortunately the nice short node-* names won't work just yet, so for this we need to use IPs instead.

$ scp -r ~/.ssh pirate@10.0.0.2:~/
$ scp -r ~/.ssh pirate@10.0.0.3:~/
$ scp -r ~/.ssh pirate@10.0.0.4:~/

You might want to make similar changes to ~/.ssh/config on your personal computer to avoid having to specify the pirate user. I do this, and also add my own ~/.ssh/id_rsa.pub to the authorized_keys file on the cluster leader to make accessing the cluster more convenient.
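For what it's worth, the stanza on my own machine looks something like the sketch below. The HostName value is an assumption about my setup (avahi advertising node-01.local on the main network); use whatever address or name your leader answers to:

# ~/.ssh/config on your personal computer
Host node-01
    HostName node-01.local
    User pirate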

Configuring /etc/hosts

The hypriot image uses cloud-init to manage the hosts file on each node, by default.

Since this is the case, I ssh into each node by IP, and add the following to the bottom of /etc/cloud/templates/hosts.debian.tmpl:

10.0.0.1 node-01
10.0.0.2 node-02
10.0.0.3 node-03
10.0.0.4 node-04

If you haven't already, you may also want to set the correct hostname in /boot/user-data on each node as you make the rounds.

After rebooting each node, you should be able to see these entries now in /etc/hosts. Additionally, if you completed the ssh config steps mentioned above, you should now be able to ssh freely between nodes by name.
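A quick sanity check from the leader after the reboots (node names per the hosts entries above):

$ for n in node-02 node-03 node-04; do ssh "$n" hostname; done
node-02
node-03
node-04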

Installing Kubernetes

The book offers some commands which you may find convenient to wrap up in a small script (so it can be scp'd around all your nodes).

Here's my install-k8s.sh

#!/bin/sh -e

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" \
  >> /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get upgrade -y
apt-get install -y kubelet kubeadm kubectl kubernetes-cni

Copy this script to each of the nodes, and invoke with sudo.
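Something along these lines saves a bit of typing, assuming the ssh config and hosts entries from earlier are in place (the -t flag is there in case sudo wants to prompt for a password):

# From the cluster leader: run it locally, then push it to the other nodes.
$ sudo sh install-k8s.sh
$ for n in node-02 node-03 node-04; do
    scp install-k8s.sh "$n":
    ssh -t "$n" sudo sh install-k8s.sh
  done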

Setting up the cluster

On the cluster leader, run the following (which differs slightly from the commands in the book; I guess some flags have changed):

sudo kubeadm init --pod-network-cidr 10.244.0.0/16 --apiserver-advertise-address 10.0.0.1

The end of the command output should share a command to run on each of the other nodes to join them to your cluster.
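I won't reproduce my exact output here, but with kubeadm from this era the join command is roughly of the form below; the token and hash are placeholders that kubeadm prints for you, and the exact flags can vary between versions:

$ sudo kubeadm join --token <token> 10.0.0.1:6443 \
    --discovery-token-ca-cert-hash sha256:<hash>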

Finishing Touches

For downloading and modifying the flannel and dashboard yamls, I wrote some scripts (just so I'd have a record of what I ran). Again, there are some slight tweaks to update the URLs, given the drift of time.

get-flannel.sh

#!/bin/sh -e

# ./get-flannel.sh > flannel.yaml

curl https://rawgit.com/coreos/flannel/master/Documentation/kube-flannel.yml \
  | sed "s/amd64/arm/g" | sed "s/vxlan/host-gw/g"

get-dashboard.sh

#!/bin/sh -e

# ./get-dashboard.sh > kubernetes-dashboard.yaml

CONF_URL=https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

curl -sSL \
  $CONF_URL \
  | sed "s/amd64/arm/g"

Random links that touched on topics I ran into soon after the initial setup
