K3s HomeLab Setup for CiliumEgressGatewayPolicy testing

Problem Statement

Use a dedicated node with a WireGuard VPN connection for services that are only reachable via the WireGuard connection

Home Lab Setup

Home network

192.168.1.0/24 network behind a retail cable modem/router/DHCP server

Server: control-plane-node

CentOS Stream 9 wired connection to the 192.168.1.0/24 network

k3s control-plane node

podCIDR 10.42.0.0/24

Server: worker-node

CentOS Stream 9 wired connection to the 192.168.1.0/24 network

k3s worker node

podCIDR 10.42.1.0/24

Workstation: workstation

Fedora 38 wired connection to the 192.168.1.0/24 network

This workstation is not participating in the k3s cluster

This workstation must not have the Docker service running. The Docker service automatically enables Linux kernel bridge iptables filtering on all bridge devices, which will interfere with the local bridge device tests (until appropriate firewall rules are put in place)
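
A quick check for whether those bridge filtering sysctls are currently enabled (a hedged sketch; the setup script below resets them to 0, and if the br_netfilter module has never been loaded the keys may simply not exist, which is also fine):

```
# 1 means bridged traffic is being passed through iptables/arptables and may be filtered
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.bridge.bridge-nf-call-arptables
```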

Configured with kubectl access to the k3s cluster via the 192.168.1.0/24 network address provided by the k3s install
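
For reference, a minimal sketch of how that kubectl access can be set up (assumptions for illustration: the default k3s kubeconfig path, and 192.168.1.C standing in for the control-plane node's LAN address):

```
# On the workstation: copy the kubeconfig k3s generated on control-plane-node
# (default location /etc/rancher/k3s/k3s.yaml) and point it at the LAN address
mkdir -p ~/.kube
scp root@192.168.1.C:/etc/rancher/k3s/k3s.yaml ~/.kube/config
sed -i 's/127.0.0.1/192.168.1.C/' ~/.kube/config
kubectl get nodes
```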

A bridge device br0 will be added, connected to two network namespaces on a private 10.100.2.0/24 network

Basic Testing Procedure

  1. Create the WireGuard tunnel between worker-node and workstation.
  2. Create a Linux bridge device on workstation and attach network namespaces to the 10.100.2.0/24 private network.
  3. Test connectivity into and out of the workstation network namespaces via the bridge device using the diagnostic nc service.
  4. Label the k3s nodes.
  5. Deploy the test pods, one to each node, using node selector labels.
  6. Test to ensure the pod on worker-node has access to the diagnostic nc service but the pod on control-plane-node does not.
  7. Install the Cilium egress policy for the diagnostic service address/port.
  8. Test to ensure pods on both worker-node and control-plane-node have access to the diagnostic service.

egress-policy.yaml:

apiVersion: cilium.io/v2
kind: CiliumEgressGatewayPolicy
metadata:
  name: egress-test
spec:
  # Specify which pods should be subject to the current policy.
  # Multiple pod selectors can be specified.
  selectors:
    - podSelector:
        matchLabels:
          env: test
          # The following label selects the default namespace
          io.kubernetes.pod.namespace: default
  # Specify which destination CIDR(s) this policy applies to.
  # Multiple CIDRs can be specified.
  destinationCIDRs:
    - "10.100.2.0/24"
  # Configure the gateway node.
  egressGateway:
    # Specify which node should act as gateway for this policy.
    nodeSelector:
      matchLabels:
        #node.kubernetes.io/name: worker-node
        egress: worker
    # Specify the IP address used to SNAT traffic matched by the policy.
    # It must exist as an IP associated with a network interface on the instance.
    #egressIP: 10.10.9.1
    # Alternatively it's possible to specify the interface to be used for egress traffic.
    # In this case the first IPv4 assigned to that interface will be used as the egress IP.
    interface: wgB
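
With the interface: wgB setting, Cilium will SNAT matched traffic to the first IPv4 address assigned to wgB on the gateway node. A quick way to see what that address will be (a hedged sketch, run on worker-node; in this lab it is 10.10.9.1):

```
worker-node$ ip -4 addr show dev wgB
```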

setup-bridge.sh:

#!/bin/bash
## Small script to run on the workstation as root that will set up the bridge device and do basic network connectivity testing.
## You'll need to modify IP addresses based on your home lab specifics.
## If the Docker service was running in the past, it will have enabled several sysctl bridge filtering features.
## Disable them here if they were enabled. The Docker service will automatically re-enable them when it is restarted.
echo -e "Fixing sysctl bridge filtering settings"
sysctl net.bridge.bridge-nf-call-arptables=0 net.bridge.bridge-nf-call-ip6tables=0 net.bridge.bridge-nf-call-iptables=0
## Delete stale artifacts from a previous run of this script; this helps clean up any routing changes.
echo -e "Deleting artifacts from last run of this script"
ip netns delete private-net0 2>/dev/null
ip netns delete private-net1 2>/dev/null
ip link delete br0 2>/dev/null
sleep 3
# Create the private network namespaces that will be connected to the bridge device
echo -e "Creating namespaces private-net0 and private-net1"
# create namespaces
ip netns add private-net0
ip netns add private-net1
# create network devices
echo -e "creating bridge br0 and attaching to namespaces"
# create veth pairs and bridge
ip link add i0 type veth peer name i0-p
ip link add i1 type veth peer name i1-p
ip link add br0 type bridge
# add devices to network namespace
ip link set i0 netns private-net0
ip link set i1 netns private-net1
# add device peers into bridge in host network namespace
ip link set i0-p master br0
ip link set i1-p master br0
# add addresses to network devices
echo -e "Setting up addresses\n private-net0: 10.100.2.10\n private-net1: 10.100.2.20\n br0: 10.100.2.254"
ip addr add 10.100.2.254/24 dev br0
ip netns exec private-net0 ip a add 10.100.2.10/24 dev i0
ip netns exec private-net1 ip a add 10.100.2.20/24 dev i1
# bring up all devices
ip netns exec private-net0 ip link set i0 up
ip netns exec private-net1 ip link set i1 up
ip link set i0-p up
ip link set i1-p up
ip link set br0 up
# add def route for network namespaces
ip -n private-net0 route add default via 10.100.2.254
ip -n private-net1 route add default via 10.100.2.254
echo -e "Testing connectivity with ping"
sleep 3
echo "net0 route info"
ip netns exec private-net0 route -n
echo -e "\n"
echo "net1 route info"
ip netns exec private-net1 route -n
echo -e "\n"
echo "net0 to wgC"
ip netns exec private-net0 ping 10.10.9.0 -c 2 || exit 1
echo -e "\n"
echo "net0 to wgB"
ip netns exec private-net0 ping 10.10.9.1 -c 2 || exit 1
echo -e "\n"
echo "net1 to wgC"
ip netns exec private-net1 ping 10.10.9.0 -c 2 || exit 1
echo -e "\n"
echo "net1 to wgB"
ip netns exec private-net1 ping 10.10.9.1 -c 2 || exit 1
echo -e "\n"
echo "host to net0"
ping 10.100.2.10 -c 2 || exit 1
echo -e "\n"
echo "host to net1"
ping 10.100.2.20 -c 2 || exit 1
echo -e "\n"
echo "net0 to net1"
ip netns exec private-net0 ping 10.100.2.20 -c 2 || exit 1
echo -e "\n"
echo "net1 to net0"
ip netns exec private-net1 ping 10.100.2.10 -c 2 || exit 1
echo -e "\n"
# You'll need to edit the address for your worker-node
echo "net0 to worker-node cilium address"
ip netns exec private-net0 ping 10.42.X.Y -c 2 || exit 1
echo -e "\n"
# You'll need to edit the address for your worker-node
echo "net1 to worker-node cilium address"
ip netns exec private-net1 ping 10.42.X.Y -c 2 || exit 1
echo -e "\n"
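
One assumption worth calling out (not covered in the script above): for the namespaces to reach addresses on the far side of the WireGuard tunnel, the workstation has to route between br0 and wgC, which requires IPv4 forwarding:

```
# Check whether IPv4 forwarding is enabled on the workstation (1 = enabled)
sysctl net.ipv4.ip_forward
# Enable it for the current boot if it isn't
sysctl net.ipv4.ip_forward=1
```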

test-pods.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-worker-pod
  labels:
    app.kubernetes.io/name: test-pods
    env: test
spec:
  containers:
    - name: ubuntu
      image: ubuntu
      imagePullPolicy: IfNotPresent
      command:
        - "sleep"
        - "604800"
  restartPolicy: Always
  nodeSelector:
    egress: worker
---
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-control-plane-pod
  labels:
    app.kubernetes.io/name: test-pods
    env: test
spec:
  containers:
    - name: ubuntu
      image: ubuntu
      imagePullPolicy: IfNotPresent
      command:
        - "sleep"
        - "604800"
  restartPolicy: Always
  nodeSelector:
    egress: control-plane

Testing egress policy using dummy service

Assumes you have the private network dummy service set up on the workstation (see the dummy service section below), and have confirmed that the worker-node/workstation wg tunnel is working

Prepare the k3s nodes with custom labels for egress gateway use

kubectl get nodes 
NAME                 STATUS   ROLES                  AGE    VERSION
control-plane-node   Ready    control-plane,master   2d5h   v1.27.4+k3s1
worker-node          Ready    <none>                 2d4h   v1.27.4+k3s1

add the egress labels

kubectl label nodes worker-node egress=worker
kubectl label nodes control-plane-node egress=control-plane
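
To confirm the labels took, kubectl's label-column flag can show them per node (a hedged sketch):

```
kubectl get nodes -L egress
```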

add the test pods with node label selector

kubectl apply -f test-pods.yaml
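
A quick way to confirm each pod landed on the intended node before testing (a hedged sketch; -o wide adds the NODE column):

```
kubectl get pods -l app.kubernetes.io/name=test-pods -o wide
```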

install nc on both test pods

workstation$ kubectl exec ubuntu-control-plane-pod -- /bin/bash -c "apt update; apt install -y netcat"
workstation$ kubectl exec ubuntu-worker-pod -- /bin/bash -c "apt update; apt install -y netcat"
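
If the netcat package name has changed in newer Ubuntu images, netcat-openbsd is the likely alternative (an assumption, not something hit in this lab). Either way, a quick check that nc is actually on the PATH before testing:

```
workstation$ kubectl exec ubuntu-worker-pod -- /bin/bash -c "command -v nc"
```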

Confirm ubuntu-worker-pod has access to dummy service running in private-net1 namespace on workstation

workstation$ kubectl exec ubuntu-worker-pod -- /bin/bash -c "echo 'ubuntu-worker pod' | nc -w 10 -v -q 1 10.100.2.20 4444"
Connection to 10.100.2.20 4444 port [tcp/*] succeeded!

Confirm ubuntu-control-plane-pod has no access to dummy service running in private-net1 namespace on workstation

workstation$ kubectl exec ubuntu-control-plane-pod -- /bin/bash -c "echo 'ubuntu-control-plane pod' | nc -w 10 -v -q 1 10.100.2.20 4444"
nc: connect to 10.100.2.20 port 4444 (tcp) timed out: Operation now in progress
command terminated with exit code 1

Install egress policy

$ kubectl create -f egress-policy.yaml 
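
To confirm the policy object exists and has been programmed into the datapath, something like the following should work (a hedged sketch; the second command assumes the standard cilium DaemonSet in kube-system, and on newer Cilium releases the agent binary is cilium-dbg rather than cilium):

```
kubectl get ciliumegressgatewaypolicies
# Inspect the egress entries installed on an agent
kubectl -n kube-system exec ds/cilium -- cilium bpf egress list
```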

Re-test from pods

workstation$ kubectl exec ubuntu-control-plane-pod -- /bin/bash -c "echo 'ubuntu-control-plane pod with egress policy enabled' | nc -w 10 -v -q 1 10.100.2.20 4444"
Connection to 10.100.2.20 4444 port [tcp/*] succeeded!
workstation$ kubectl exec ubuntu-worker-pod -- /bin/bash -c "echo 'ubuntu-worker pod with egress policy enabled' | nc -w 10 -v -q 1 10.100.2.20 4444"
Connection to 10.100.2.20 4444 port [tcp/*] succeeded!

delete egress policy

kubectl delete -f egress-policy.yaml

retest to ensure the ubuntu-control-plane-pod lost access

workstation$ kubectl exec ubuntu-control-plane-pod -- /bin/bash -c "echo 'ubuntu-control-plane pod without egress policy' | nc -w 10 -v -q 1 10.100.2.20 4444"
nc: connect to 10.100.2.20 port 4444 (tcp) timed out: Operation now in progress
command terminated with exit code 1

Done

## Testing pod access from workstation
### Why is this test important?
If you are running into problems with egress connectivity, it might actually be a WireGuard configuration problem.
If you've configured WireGuard overly broadly, with something like `0.0.0.0/0` in AllowedIPs, your WireGuard service might have added
additional routes that impact pod traffic.
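
A hedged way to check on the workstation whether wg-quick has installed catch-all routes or policy-routing rules (it does this when AllowedIPs includes `0.0.0.0/0`):

```
# Show the allowed-ips WireGuard is actually using for the tunnel
wg show wgC allowed-ips
# Look for fwmark rules and extra routing table entries added by wg-quick
ip rule show
ip route show table all | grep wgC
```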
### Prereqs
You'll want to have pods running on each node that you can connect to,
and the wg tunnel between the workstation and worker-node must be active.
The egress policy doesn't impact this test.
### find the Pod IP for the pod running on the worker-node host
```
kubectl describe pod ubuntu-worker-pod |grep IP
IP: 10.42.X.Y
```
### Set up a dummy service on ubuntu-worker-pod
```
kubectl exec ubuntu-worker-pod -- nc -v -k -l 5555
Listening on 0.0.0.0 5555
```
### connect on workstation using ubuntu-worker-pod IP
```
workstation $ echo "workstation" | nc -v -q 1 10.42.X.Y 5555
Ncat: Version 7.93 ( https://nmap.org/ncat )
Ncat: Connected to 10.42.X.Y:5555.
Ncat: 12 bytes sent, 0 bytes received in 0.04 seconds.
```
### connect from private-net1 namespace on workstation using ubuntu-worker-pod IP
```
workstation $ echo "private-net1" | ip netns exec private-net1 nc -v 10.42.X.Y 5555
Ncat: Version 7.93 ( https://nmap.org/ncat )
Ncat: Connected to 10.42.X.Y:5555.
Ncat: 13 bytes sent, 0 bytes received in 0.04 seconds.
```
### disable the wg tunnel on worker-node and retest
```
worker-node$ sudo systemctl stop wg-quick@wgB
```
```
workstation root$ echo "workstation" | nc -v -w 10 --send-only 10.42.X.Y 5555
Ncat: Version 7.93 ( https://nmap.org/ncat )
Ncat: TIMEOUT.
```
```
workstation root$ echo "private-net1" | ip netns exec private-net1 nc -v 10.42.X.Y 5555
Ncat: Version 7.93 ( https://nmap.org/ncat )
Ncat: TIMEOUT.
```
### re-establish wg tunnel
```
worker-node$ sudo systemctl start wg-quick@wgB
```
```
workstation root$ echo "workstation" | nc -v -w 10 --send-only 10.42.X.Y 5555
Ncat: Version 7.93 ( https://nmap.org/ncat )
Ncat: Connected to 10.42.X.Y:5555.
Ncat: 12 bytes sent, 0 bytes received in 0.04 seconds.
```
```
workstation root$ echo "private-net1" | ip netns exec private-net1 nc -v 10.42.X.Y 5555
Ncat: Version 7.93 ( https://nmap.org/ncat )
Ncat: Connected to 10.42.X.Y:5555.
Ncat: 13 bytes sent, 0 bytes received in 0.04 seconds.
```

Home Lab WireGuard Setup

This provides a tunnel between the workstation and the k3s worker-node host

wg tunnel uses the 10.10.9.0/31 network

workstation side interface: wgC

The workstation side of the tunnel is configured to gain access to the k3s Pod and Service CIDRs. The wg interface is managed by the systemd wg-quick service.

wgC.conf (slightly redacted)

[Interface]
# PostUp processed by wg-quick
PostUp = wg set %i private-key /etc/wireguard/%i.key
Address = 10.10.9.0/31
ListenPort = 51000

[Peer]
# Public Key and Endpoint redacted
PublicKey = <public key for wgB on worker-node>
Endpoint = 192.168.1.B:51000
AllowedIPs = 10.10.9.0/31,10.42.0.0/16,10.43.0.0/16

worker-node side interface: wgB

The worker-node side of the tunnel is configured to gain access to the private network on the workstation running the dummy service. The wg interface is managed by the systemd wg-quick service.

wgB.conf (slightly redacted)

[Interface]
# PostUp processed by wg-quick
PostUp = wg set %i private-key /etc/wireguard/%i.key
Address = 10.10.9.1/31
ListenPort = 51000

[Peer]
# Public Key and Endpoint redacted
PublicKey = <public key for wgC on workstation>
Endpoint = 192.168.1.A:51000
AllowedIPs = 10.100.2.0/24,10.10.9.0/31
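
Both ends are managed by wg-quick; a minimal sketch for bringing the tunnel up and verifying the handshake (assuming the configs live in /etc/wireguard/ as wgC.conf and wgB.conf):

```
# workstation
sudo systemctl enable --now wg-quick@wgC
sudo wg show wgC

# worker-node
sudo systemctl enable --now wg-quick@wgB
sudo wg show wgB
```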

workstation private network dummy service setup

To simulate a simple private network, I've created a private 10.100.2.0/24 network on the workstation using a bridge attached to network namespaces; see setup-bridge.sh for the commands.

workstation root$ ip netns exec private-net0 ifconfig
i0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.100.2.10  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::244c:7dff:fe1c:2237  prefixlen 64  scopeid 0x20<link>
        ether 26:4c:7d:1c:22:37  txqueuelen 1000  (Ethernet)
        RX packets 132  bytes 13682 (13.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 31  bytes 2490 (2.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

workstation root$ ip netns exec private-net0 route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.100.2.254    0.0.0.0         UG    0      0        0 i0
10.100.2.0      0.0.0.0         255.255.255.0   U     0      0        0 i0

workstation root$ ip netns exec private-net1 ifconfig
i1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.100.2.20  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::f0ff:76ff:fec7:4515  prefixlen 64  scopeid 0x20<link>
        ether f2:ff:76:c7:45:15  txqueuelen 1000  (Ethernet)
        RX packets 138  bytes 13992 (13.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 39  bytes 3018 (2.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

workstation root$ ip netns exec private-net1 route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.100.2.254    0.0.0.0         UG    0      0        0 i1
10.100.2.0      0.0.0.0         255.255.255.0   U     0      0        0 i1

The only way to access the network namespaces is locally from the workstation or via the WireGuard tunnel.

Attach a netcat listening service to one of the namespaces as a diagnostic to use for connection testing.

workstation root$ ip netns exec private-net1 nc -vlk 10.100.2.20 4444

Confirm it's working by running the following on the same workstation

workstation root $ echo "workstation": | nc -v 10.100.2.20 4444
Ncat: Version 7.93 ( https://nmap.org/ncat )
Ncat: Connected to 10.100.2.20:4444.
Ncat: 13 bytes sent, 0 bytes received in 0.05 seconds.

Confirm no access to the dummy service from control-plane-node

control-plane-node $ echo "control-plane-node" | nc -v -w 10 10.100.2.20 4444
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: TIMEOUT.

Confirm WireGuard access on worker-node

worker-node $ echo "worker-node" | nc -v -w 10 10.100.2.20 4444
Ncat: Version 7.92 ( https://nmap.org/ncat )
Ncat: Connected to 10.100.2.20:4444.
Ncat: 7 bytes sent, 0 bytes received in 0.06 seconds.

