Microk8s is a Canonical project that provides a Kubernetes environment for local development, similar to minikube but without requiring a separate VM to manage. These instructions describe setting it up for common development use cases with Cilium and may be particularly helpful for testing BPF kernel extensions with Cilium.
Microk8s runs its own docker daemon for the Kubernetes runtime. If you have an existing docker installation, this can be confusing: when building images, for instance, the image may end up stored by one installation and not the other. This guide assumes you run both docker daemon instances, using your existing docker-ce for building Cilium and the microk8s.docker daemon instance as the runtime for your Kubernetes pods.
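If you run both daemons, a quick way to tell which one a given client talks to is to compare their storage roots, for example:
# docker info | grep 'Docker Root Dir'
# microk8s.docker info | grep 'Docker Root Dir'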
- Linux with kernel 4.9 or newer (Full Cilium Requirements)
- Snap (default installed in recent Ubuntu distros)
- docker-ce for a recent docker client binary, used for the local image build.
In this howto, the setup was run on a packet.net c1.small.x86 node with Ubuntu 17.10. When you use a remote machine from a cloud provider for testing / development, be aware of the Kubernetes default settings; see the iptables notes in the Troubleshooting section for locking down external access.
Quick howto:
# apt-get install snapd apt-transport-https ca-certificates curl software-properties-common build-essential flex bison clang llvm libelf-dev libssl-dev libcap-dev gcc-multilib libncurses5-dev pkg-config
# curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
# add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# apt-get update
# apt-get install docker-ce
# wget https://dl.google.com/go/go1.11.2.linux-amd64.tar.gz
# tar xvf go1.11.2.linux-amd64.tar.gz -C /usr/local/
# mkdir -p ~/go/src/github.com/cilium/
And add to bashrc:
export GOPATH=/<home>/go/
export GOROOT=/usr/local/go/
export PATH=/snap/bin/:/usr/local/go/bin/:/root/go/bin/:$PATH
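As a quick sanity check that the toolchain is picked up (assuming you re-source your bashrc first):
# source ~/.bashrc
# go version
# docker --version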
Then follow with building Cilium itself:
# cd ~/go/src/github.com/cilium/
# git clone https://github.com/cilium/cilium.git && cd cilium/
# go get -u github.com/gordonklaus/ineffassign
# go get -u github.com/jteeuwen/go-bindata/...
# SKIP_DOCS=true make
For stable version:
# snap install microk8s --classic --channel=1.12/stable
For the latest unstable version (e.g. with microk8s.stop / microk8s.start and other extensions):
# snap install microk8s --classic --edge
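A quick sanity check that the snap is installed and the node comes up (this may take a minute):
# snap list microk8s
# microk8s.kubectl get nodes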
Next, edit the configs:
# vim /var/snap/microk8s/current/args/kube-apiserver
- Add the --allow-privileged option (to support Cilium with root privileges)
# vim /var/snap/microk8s/current/args/kubelet
- Set the --network-plugin option to cni (to use Cilium for network plumbing)
- Set the --cni-bin-dir option to /opt/cni/bin (to find the Cilium CNI binary)
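After editing both files, restart the corresponding snap services so the new flags take effect (service names as used by journalctl in the Troubleshooting section):
# systemctl restart snap.microk8s.daemon-apiserver.service
# systemctl restart snap.microk8s.daemon-kubelet.service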
# alias kubectl=microk8s.kubectl
... or do ...
# snap alias microk8s.kubectl kubectl
- There's probably a way to point your local kubectl at the local microk8s instance rather than using the bundled microk8s.kubectl, but this matters less because microk8s tends to track the latest Kubernetes release closely. Aliasing kubectl to microk8s.kubectl worked well for me locally.
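One possible way to do that (untested in this setup) is to export the microk8s kubeconfig for your local kubectl:
# microk8s.kubectl config view --raw > $HOME/.kube/config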
- Deploy etcd + Cilium via instructions here: http://docs.cilium.io/en/stable/gettingstarted/minikube/
# kubectl create -n kube-system -f https://raw.githubusercontent.com/cilium/cilium/1.3.0/examples/kubernetes/addons/etcd/standalone-etcd.yaml
# kubectl create -f https://raw.githubusercontent.com/cilium/cilium/1.3.0/examples/kubernetes/1.12/cilium.yaml
# kubectl -n kube-system edit ds cilium
If you have trouble with the above steps, check the Troubleshooting section.
- Set the docker socket hostPath to point within the snap path (so that Cilium can associate container labels with endpoints)
Example configuration:
volumes:
- hostPath:
    path: /var/run/cilium
    type: DirectoryOrCreate
  name: cilium-run
- hostPath:
    path: /sys/fs/bpf
    type: DirectoryOrCreate
  name: bpf-maps
- hostPath:
    path: /var/snap/microk8s/current/docker.sock   <----
    type: Socket
  name: docker-socket
# microk8s.enable dns registry
- The registry is available on localhost:32000 (via NodePort).
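To quickly check that the registry answers (using 127.0.0.1 to avoid the localhost/IPv6 pitfall described under Troubleshooting), something like this should return an empty repository list right after enabling it:
# curl http://127.0.0.1:32000/v2/_catalog
{"repositories":[]}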
For these steps, it is recommended to have a recent version of docker-ce so that multi-stage builds work correctly. As of November 2018, the version of docker provided with microk8s (17.03; see the linked issue) is not new enough.
These steps work by:
- Using the docker-ce client to build the new Cilium image, which is stored by the docker-ce daemon
- Using the docker-ce client to push the image into the k8s-deployed docker registry
- Using the microk8s.docker client to pull the new image from the local registry into the docker daemon provided by microk8s
- Relying on the registry URI in the image reference so that k8s uses the image already present in the microk8s.docker daemon.
- Make your local changes to your Cilium repository.
# DOCKER_IMAGE_TAG="my-image" make docker-image
# docker tag cilium/cilium:my-image localhost:32000/cilium/cilium:my-image
# docker push localhost:32000/cilium/cilium:my-image
This uses your local docker-ce (and the docker daemon hosted at /var/run/docker.sock) to push into the registry that was configured above.
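To confirm the push landed, the standard registry API can be queried, for example:
# curl http://127.0.0.1:32000/v2/cilium/cilium/tags/list
{"name":"cilium/cilium","tags":["my-image"]}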
For pushing to a docker hub account:
# docker login
[...]
# DOCKER_IMAGE_TAG="my-image" make docker-image
# docker tag cilium/cilium:my-image user/cilium:my-image
# docker push user/cilium:my-image
If you have trouble with the above steps, check the Troubleshooting section.
The instructions below use the microk8s.docker client, whose daemon socket is hosted at /var/snap/microk8s/current/docker.sock.
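The prepull DaemonSet below automates the pull across nodes; on a single node the equivalent manual steps would look roughly like this (using the image tag from above):
# microk8s.docker pull localhost:32000/cilium/cilium:my-image
# microk8s.docker images | grep cilium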
To roll out the new Cilium image from the local registry reliably, I found it helpful to deploy the prepull YAML below; otherwise the connection for fetching the image tends to get reset during container startup, which puts the node into a bad state:
apiVersion: apps/v1beta2
kind: DaemonSet
metadata:
  name: prepull
  namespace: container-registry
spec:
  selector:
    matchLabels:
      name: prepull
  template:
    metadata:
      labels:
        name: prepull
    spec:
      initContainers:
      - name: prepull
        image: docker
        command: ["docker", "pull", "localhost:32000/cilium/cilium:my-image"]
        volumeMounts:
        - name: docker
          mountPath: /var/run
      volumes:
      - name: docker
        hostPath:
          path: /var/snap/microk8s/current/
      containers:
      - name: pause
        image: gcr.io/google_containers/pause
# kubectl create -f prepull.yaml
When you want to re-pull the image:
# kubectl delete po -n container-registry -l name=prepull
Then, edit your Cilium DS YAML to point to the new tag, replacing image: docker.io/cilium/cilium:v1.3.0 with image: localhost:32000/cilium/cilium:my-image:
# kubectl -n kube-system edit ds cilium
Set image and imagePullPolicy:
image: localhost:32000/cilium/cilium:my-image
imagePullPolicy: Never
lifecycle:
  postStart:
    exec:
      command:
      - /cni-install.sh
  preStop:
    exec:
      command:
      - /cni-uninstall.sh
If the local registry is not used at all, and images are pulled from docker hub instead, set image and imagePullPolicy:
image: docker.io/user/cilium:my-image
imagePullPolicy: Always
lifecycle:
  postStart:
    exec:
      command:
      - /cni-install.sh
  preStop:
    exec:
      command:
      - /cni-uninstall.sh
And rollout:
# kubectl -n kube-system rollout status ds cilium
If the tag is already pointing to your custom image, you should just need to delete the pods:
# kubectl -n kube-system delete po -l k8s-app=cilium
# kubectl -n kube-system rollout status ds cilium
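To double-check which image the DaemonSet currently references:
# kubectl -n kube-system get ds cilium -o jsonpath='{.spec.template.spec.containers[0].image}'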
In order to enable and test the ipvlan datapath, the daemon's command line arguments need to be:
[...]
spec:
  containers:
  - args:
    - --debug=$(CILIUM_DEBUG)
    - --kvstore=etcd
    - --kvstore-opt=etcd.config=/var/lib/etcd-config/etcd.config
    - --disable-ipv4=$(DISABLE_IPV4)
    - --datapath-mode=ipvlan                <--
    - --ipvlan-master-device=bond0          <--
    - --tunnel=disabled                     <--
    - --install-iptables-rules=false        <-- implicit L3 vs L3S depending on install switch
    command:
    - cilium-agent
[...]
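After the rollout, the agent's configuration and health can be inspected from within the pod (cilium-1234 being the placeholder pod name used throughout this page):
# kubectl -n kube-system exec cilium-1234 -- cilium status
# kubectl -n kube-system exec cilium-1234 -- cilium config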
If the rollout gets stuck, it can be debugged through ...
# kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
[...]
kube-system cilium-hbcs8 0/1 ErrImagePull 0 75s 147.75.80.23 test <none>
[...]
# kubectl describe pod -n kube-system cilium-hbcs8
[...]
Warning Failed 30s kubelet, test Failed to pull image "localhost:32000/cilium/cilium:my-image": rpc error: code = Unknown desc = Error while pulling image: Get http://localhost:32000/v1/repositories/cilium/cilium/images: read tcp localhost:53302->127.0.0.1:32000: read: connection reset by peer
Normal BackOff 4s (x4 over 103s) kubelet, test Back-off pulling image "localhost:32000/cilium/cilium:my-image"
Warning Failed 4s (x4 over 103s) kubelet, test Error: ImagePullBackOff
[...]
... e.g. in this case the imagePullPolicy was probably set to Always.
The daemon set updates are undone via:
# kubectl -n kube-system rollout undo ds cilium
Check if Cilium is up and running:
# kubectl get pods --all-namespaces -o wide
# kubectl -n kube-system logs --timestamps cilium-1234
Deploying a sample application for testing Cilium w/o policy first:
# kubectl create -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/minikube/http-sw-app.yaml
# kubectl exec -it -n kube-system cilium-1234 -- cilium endpoint list
# kubectl exec -it tiefighter -- netperf -t TCP_STREAM -H 10.23.177.124
[...]
Force endpoint regeneration:
# kubectl delete po tiefighter
Switch to the pod's netns:
# microk8s.docker inspect --format '{{ .State.Pid }}' `kubectl get pod tiefighter -o jsonpath='{.status.containerStatuses[0].containerID}' | cut -c 10-21`
8667
# nsenter -t 8667 -n bash
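Inside the pod's netns, the usual tools operate on the pod's interfaces, for example (the interface name may differ):
# ip addr
# tcpdump -ni eth0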
Dumping identity and ip cache:
# kubectl exec -it -n kube-system cilium-1234 -- cilium identity list
# kubectl exec -it -n kube-system cilium-1234 -- cilium map get cilium_ipcache
Deploy Cilium policy via k8s API:
# kubectl create -f https://raw.githubusercontent.com/cilium/cilium/1.3.0/examples/minikube/sw_l3_l4_policy.yaml
Check policy enforcement for endpoints and currently deployed rules:
# kubectl -n kube-system exec cilium-1234 -- cilium endpoint list
# kubectl -n kube-system exec cilium-1234 -- cilium policy get
Get currently installed policies:
# kubectl get ciliumnetworkpolicies.cilium.io
Delete a Cilium policy via k8s API (also clears it from active policy):
# kubectl delete ciliumnetworkpolicies.cilium.io rule1
Check L3 label-based policy verdicts:
# kubectl -n kube-system exec cilium-1234 -- cilium policy trace --src-k8s-pod default:tiefighter --dst-k8s-pod default:xwing
Minimal example:
# cat sw_l3.yaml
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "l3-rule"
specs:
- endpointSelector:
matchLabels:
org: empire
ingress:
- fromRequires:
- matchLabels:
org: empire
# kubectl create -f ./sw_l3.yaml
# kubectl exec -ti xwing -- ping -c3 <IP tiefighter>
PING 10.9.245.239 (10.9.245.239): 56 data bytes
--- 10.9.245.239 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
command terminated with exit code 1
Cilium monitor output for observing drops by policy:
# kubectl exec -n=kube-system cilium-1234 -ti -- cilium monitor --related-to 49565 --related-to 16476
Listening for events on 8 CPUs with 64x4096 of shared memory
Press Ctrl-C to quit
<- endpoint 16476 flow 0xb5da957f identity 14294->0 state new ifindex 0: 10.9.241.61 -> 10.9.71.76 EchoRequest
xx drop (Policy denied (L3)) flow 0xb5da957f to endpoint 49565, identity 14294->53755: 10.9.241.61 -> 10.9.71.76 EchoRequest
<- endpoint 16476 flow 0xb5da957f identity 14294->0 state new ifindex 0: 10.9.241.61 -> 10.9.71.76 EchoRequest
xx drop (Policy denied (L3)) flow 0xb5da957f to endpoint 49565, identity 14294->53755: 10.9.241.61 -> 10.9.71.76 EchoRequest
<- endpoint 16476 flow 0xb5da957f identity 14294->0 state new ifindex 0: 10.9.241.61 -> 10.9.71.76 EchoRequest
xx drop (Policy denied (L3)) flow 0xb5da957f to endpoint 49565, identity 14294->53755: 10.9.241.61 -> 10.9.71.76 EchoRequest
<- endpoint 16476 flow 0x16f8a5c8 identity 14294->0 state new ifindex 0: ca:2a:e8:1c:47:1c -> 02:7f:f3:65:1c:20 ARP
<- endpoint 49565 flow 0x59d8660b identity 53755->0 state new ifindex 0: 10.9.71.76 -> 10.9.245.239 EchoRequest
-> overlay flow 0x59d8660b identity 53755->0 state new ifindex cilium_vxlan: 10.9.71.76 -> 10.9.245.239 EchoRequest
<- endpoint 49565 flow 0x59d8660b identity 53755->0 state new ifindex 0: 10.9.71.76 -> 10.9.245.239 EchoRequest
-> overlay flow 0x59d8660b identity 53755->0 state new ifindex cilium_vxlan: 10.9.71.76 -> 10.9.245.239 EchoRequest
<- endpoint 49565 flow 0x59d8660b identity 53755->0 state new ifindex 0: 10.9.71.76 -> 10.9.245.239 EchoRequest
-> overlay flow 0x59d8660b identity 53755->0 state new ifindex cilium_vxlan: 10.9.71.76 -> 10.9.245.239 EchoRequest
^C
Received an interrupt, disconnecting from monitor...
- https://cilium.readthedocs.io/en/stable/gettingstarted/minikube/
- https://cilium.readthedocs.io/en/stable/policy/language/#policy-examples
In some cases when switching from an older to a newer kernel for testing, the docker daemon may not start up due to Error starting daemon: error initializing graphdriver: driver not supported, which is visible in the journalctl -fu snap.microk8s.daemon-docker.service log. Removing the stale aufs state resolves this:
# rm -rf /var/snap/microk8s/common/var/lib/docker/aufs/
# rm -rf /var/snap/microk8s/common/var/lib/docker/image/aufs/
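Afterwards, restart the docker daemon shipped with the snap (service name as in the journalctl commands below):
# systemctl restart snap.microk8s.daemon-docker.service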
To use printk(...) debugging in the code, enable it via:
# kubectl exec -ti -n kube-system cilium-1234 -- mount -t tracefs nodev /sys/kernel/tracing
# kubectl exec -ti -n kube-system cilium-1234 -- cilium config Debug=Enable
# kubectl exec -ti -n kube-system cilium-1234 -- tc exec bpf dbg
In case the host IP changes, you can restart the kubernetes API server the following way in order to propagate the new IP to all kubernetes cluster members:
# microk8s.stop
# microk8s.start
For older microk8s versions:
# snap disable microk8s
# snap enable microk8s
Depending on your local configuration, the registry may not be reachable via localhost (e.g. if localhost resolves to the IPv6 address ::1), so pushing to localhost:32000 may hit something like this:
# docker push localhost:32000/cilium/cilium:my-image
The push refers to repository [localhost:32000/cilium/cilium]
Get http://localhost:32000/v2/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
To overcome this, use 127.0.0.1 instead of localhost, or remove the IPv6 (::1) host alias from /etc/hosts.
When deploying the cilium daemon set for the first time, the following error may occur. Make sure the --allow-privileged option is set in kube-apiserver and the apiserver has been reloaded:
# kubectl create -f https://raw.githubusercontent.com/cilium/cilium/1.3.0/examples/kubernetes/1.12/cilium.yaml
[...]
The DaemonSet "cilium" is invalid:
* spec.template.spec.containers[0].securityContext.privileged: Forbidden: disallowed by cluster policy
* spec.template.spec.initContainers[0].securityContext.privileged: Forbidden: disallowed by cluster policy
See also https://microk8s.io/docs/
# journalctl -fu snap.microk8s.daemon-kubelet.service
# journalctl -fu snap.microk8s.daemon-apiserver
# journalctl -fu snap.microk8s.daemon-docker.service
For testing, we block all traffic other than ssh:
# iptables -A INPUT -i bond0 -m state --state ESTABLISHED,RELATED -j ACCEPT
# iptables -A INPUT -i bond0 -p tcp --dport 22 -j ACCEPT
# iptables -A INPUT -i bond0 -j DROP
This can be made persistent and stored in /etc/iptables/rules.v4:
# apt install iptables-persistent
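For example, to store the currently loaded rules into that file and verify them afterwards:
# iptables-save > /etc/iptables/rules.v4
# iptables -L INPUT -n -v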
Some quick-links for general troubleshooting:
Also note that, from my understanding, Cilium requires taking over kube-proxy's functionality, but microk8s, which uses kubelite, doesn't have an option to disable kube-proxy.