@timothystewart6
Created April 9, 2024 01:03
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  annotations:
    field.cattle.io/creatorId: user-mzmwp
  creationTimestamp: '2024-04-08T19:48:03Z'
  finalizers:
  - wrangler.cattle.io/cloud-config-secret-remover
  - wrangler.cattle.io/provisioning-cluster-remove
  - wrangler.cattle.io/rke-cluster-remove
  generation: 2
  managedFields:
  - apiVersion: provisioning.cattle.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          v:"wrangler.cattle.io/provisioning-cluster-remove": {}
          v:"wrangler.cattle.io/rke-cluster-remove": {}
      f:spec:
        .: {}
        f:kubernetesVersion: {}
        f:localClusterAuthEndpoint: {}
        f:rkeConfig:
          .: {}
          f:chartValues:
            .: {}
            f:rke2-cilium: {}
            f:rke2-multus: {}
          f:etcd:
            .: {}
            f:snapshotRetention: {}
            f:snapshotScheduleCron: {}
          f:machineGlobalConfig:
            .: {}
            f:cni: {}
            f:disable: {}
            f:disable-kube-proxy: {}
            f:etcd-expose-metrics: {}
          f:machineSelectorConfig: {}
          f:registries: {}
          f:upgradeStrategy:
            .: {}
            f:controlPlaneConcurrency: {}
            f:controlPlaneDrainOptions:
              .: {}
              f:deleteEmptyDirData: {}
              f:disableEviction: {}
              f:enabled: {}
              f:force: {}
              f:gracePeriod: {}
              f:ignoreDaemonSets: {}
              f:skipWaitForDeleteTimeoutSeconds: {}
              f:timeout: {}
            f:workerConcurrency: {}
            f:workerDrainOptions:
              .: {}
              f:deleteEmptyDirData: {}
              f:disableEviction: {}
              f:enabled: {}
              f:force: {}
              f:gracePeriod: {}
              f:ignoreDaemonSets: {}
              f:skipWaitForDeleteTimeoutSeconds: {}
              f:timeout: {}
    manager: rancher
    operation: Update
    time: '2024-04-08T19:48:03Z'
  - apiVersion: provisioning.cattle.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"wrangler.cattle.io/cloud-config-secret-remover": {}
      f:spec:
        f:rkeConfig:
          f:machinePoolDefaults: {}
          f:upgradeStrategy:
            f:controlPlaneDrainOptions:
              f:ignoreErrors: {}
              f:postDrainHooks: {}
              f:preDrainHooks: {}
            f:workerDrainOptions:
              f:ignoreErrors: {}
              f:postDrainHooks: {}
              f:preDrainHooks: {}
    manager: rancher-v2.8.2-secret-migrator
    operation: Update
    time: '2024-04-08T19:48:03Z'
  - apiVersion: provisioning.cattle.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:agentDeployed: {}
        f:clientSecretName: {}
        f:clusterName: {}
        f:conditions: {}
        f:fleetWorkspaceName: {}
        f:observedGeneration: {}
        f:ready: {}
    manager: rancher
    operation: Update
    subresource: status
    time: '2024-04-08T20:24:56Z'
  name: home-01
  namespace: fleet-default
  resourceVersion: '25305907'
  uid: 8f6e1496-d56a-44eb-a53b-dbd27e2d4999
spec:
  kubernetesVersion: v1.27.12+rke2r1
  localClusterAuthEndpoint: {}
  rkeConfig:
    chartValues:
      rke2-cilium: {}
      rke2-multus: {}
    etcd:
      snapshotRetention: 5
      snapshotScheduleCron: 0 */5 * * *
    machineGlobalConfig:
      cni: multus,cilium
      disable:
      - rke2-ingress-nginx
      disable-kube-proxy: false
      etcd-expose-metrics: false
    machinePoolDefaults: {}
    machineSelectorConfig:
    - config:
        protect-kernel-defaults: false
    registries: {}
    upgradeStrategy:
      controlPlaneConcurrency: '1'
      controlPlaneDrainOptions:
        deleteEmptyDirData: true
        disableEviction: false
        enabled: false
        force: false
        gracePeriod: -1
        ignoreDaemonSets: true
        ignoreErrors: false
        postDrainHooks: null
        preDrainHooks: null
        skipWaitForDeleteTimeoutSeconds: 0
        timeout: 120
      workerConcurrency: '1'
      workerDrainOptions:
        deleteEmptyDirData: true
        disableEviction: false
        enabled: false
        force: false
        gracePeriod: -1
        ignoreDaemonSets: true
        ignoreErrors: false
        postDrainHooks: null
        preDrainHooks: null
        skipWaitForDeleteTimeoutSeconds: 0
        timeout: 120
status:
  agentDeployed: true
  clientSecretName: home-01-kubeconfig
  clusterName: c-m-vjhmgv77
  conditions:
  - lastUpdateTime: '2024-04-08T19:53:29Z'
    status: 'False'
    type: Reconciling
  - lastUpdateTime: '2024-04-08T19:48:03Z'
    status: 'False'
    type: Stalled
  - lastUpdateTime: '2024-04-08T19:54:43Z'
    status: 'True'
    type: Created
  - lastUpdateTime: '2024-04-08T20:24:56Z'
    status: 'True'
    type: RKECluster
  - lastUpdateTime: '2024-04-08T19:48:03Z'
    status: 'True'
    type: BackingNamespaceCreated
  - lastUpdateTime: '2024-04-08T19:48:03Z'
    status: 'True'
    type: DefaultProjectCreated
  - lastUpdateTime: '2024-04-08T19:48:03Z'
    status: 'True'
    type: SystemProjectCreated
  - lastUpdateTime: '2024-04-08T19:48:03Z'
    status: 'True'
    type: InitialRolesPopulated
  - lastUpdateTime: '2024-04-08T20:24:56Z'
    status: 'True'
    type: Updated
  - lastUpdateTime: '2024-04-08T20:24:56Z'
    status: 'True'
    type: Provisioned
  - lastUpdateTime: '2024-04-08T19:55:44Z'
    status: 'True'
    type: Ready
  - lastUpdateTime: '2024-04-08T19:48:04Z'
    status: 'True'
    type: CreatorMadeOwner
  - lastUpdateTime: '2024-04-08T19:48:04Z'
    status: 'True'
    type: NoDiskPressure
  - lastUpdateTime: '2024-04-08T19:48:04Z'
    status: 'True'
    type: NoMemoryPressure
  - lastUpdateTime: '2024-04-08T19:48:04Z'
    status: 'True'
    type: SecretsMigrated
  - lastUpdateTime: '2024-04-08T19:48:04Z'
    status: 'True'
    type: ServiceAccountSecretsMigrated
  - lastUpdateTime: '2024-04-08T19:48:04Z'
    status: 'True'
    type: RKESecretsMigrated
  - lastUpdateTime: '2024-04-08T19:48:04Z'
    status: 'True'
    type: ACISecretsMigrated
  - lastUpdateTime: '2024-04-08T19:54:43Z'
    status: 'True'
    type: Connected
  - lastUpdateTime: '2024-04-08T19:53:14Z'
    status: 'True'
    type: GlobalAdminsSynced
  - lastUpdateTime: '2024-04-08T19:53:16Z'
    status: 'True'
    type: SystemAccountCreated
  - lastUpdateTime: '2024-04-08T19:53:18Z'
    status: 'True'
    type: AgentDeployed
  - lastUpdateTime: '2024-04-08T19:53:29Z'
    status: 'True'
    type: Waiting
  fleetWorkspaceName: fleet-default
  observedGeneration: 2
  ready: true
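
For reference, a manifest like the one above can be dumped straight from the Rancher management cluster; the namespace and cluster name here match this gist, so adjust them for your own setup:

kubectl -n fleet-default get clusters.provisioning.cattle.io home-01 -o yaml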
@clemenko

Good times, lol. This is from a brand-new Ubuntu 22.04 VM.

# update to the latest kernel

apt update && apt upgrade -y && reboot

# ssh back in

mkdir -p /etc/rancher/{rke2,k3s}/ /var/lib/rancher/rke2/server/manifests/

cat << EOF >> /etc/sysctl.conf
# SWAP settings
vm.swappiness=0
vm.panic_on_oom=0
vm.overcommit_memory=1
kernel.panic=10
kernel.panic_on_oops=1
vm.max_map_count = 262144

# Have a larger connection range available
net.ipv4.ip_local_port_range=1024 65000

# Increase max connection
net.core.somaxconn=10000

# Reuse closed sockets faster
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_fin_timeout=15

# The maximum number of "backlogged sockets".  Default is 128.
net.core.somaxconn=4096
net.core.netdev_max_backlog=4096

# 16MB per socket - which sounds like a lot,
net.core.rmem_max=16777216
net.core.wmem_max=16777216

# Various network tunables
net.ipv4.tcp_max_syn_backlog=20480
net.ipv4.tcp_max_tw_buckets=400000
net.ipv4.tcp_no_metrics_save=1
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_syn_retries=2
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_wmem=4096 65536 16777216

# ARP cache settings for a highly loaded docker swarm
net.ipv4.neigh.default.gc_thresh1=8096
net.ipv4.neigh.default.gc_thresh2=12288
net.ipv4.neigh.default.gc_thresh3=16384

# ip_forward and tcp keepalive for iptables
net.ipv4.tcp_keepalive_time=600
net.ipv4.ip_forward=1

# monitor file system events
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576

# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
EOF
sysctl -p
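
To spot-check that the reload picked up a few of the values above (a quick sanity check, not exhaustive):

# read back a handful of the settings applied by sysctl -p
sysctl net.ipv4.ip_forward net.core.somaxconn fs.inotify.max_user_watches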

useradd -r -c "etcd user" -s /sbin/nologin -M etcd -U;  

echo -e "cni:\n- multus\n- cilium" > /etc/rancher/rke2/config.yaml
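
The echo above writes the following into /etc/rancher/rke2/config.yaml (the RKE2 server config):

cni:
- multus
- cilium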

cat << EOF >> /var/lib/rancher/rke2/server/manifests/rke2-cilium-config.yaml
# /var/lib/rancher/rke2/server/manifests/rke2-cilium-config.yaml
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    cni:
      exclusive: false
EOF

curl -sfL https://get.rke2.io | INSTALL_RKE2_CHANNEL=v1.27 sh - ; systemctl enable --now rke2-server.service

echo "export KUBECONFIG=/etc/rancher/rke2/rke2.yaml CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml PATH=$PATH:/var/lib/rancher/rke2/bin" >> ~/.bashrc
ln -s /var/run/k3s/containerd/containerd.sock /var/run/containerd/containerd.sock
source ~/.bashrc
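
Once rke2-server is up, a quick sanity check (assuming the KUBECONFIG and PATH exported above):

kubectl get nodes -o wide
# both the Cilium and Multus pods should show up in kube-system
kubectl -n kube-system get pods | grep -E 'multus|cilium'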

Then add the two kube objects.

cat <<EOF | kubectl create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ipvlan-def
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "ipvlan",
      "master": "enp1s0",
      "mode": "l2",
      "ipam": { "type": "static" }
    }'
EOF
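
To confirm the attachment definition registered with the cluster:

kubectl get network-attachment-definitions.k8s.cni.cncf.io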

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  annotations:
    k8s.v1.cni.cncf.io/networks: '[{ "name": "ipvlan-def", "ips": [ "192.168.1.204/24" ] }]'
spec:
  containers:
  - name: nginx
    image: nginx
EOF
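
And to check that Multus actually attached the second interface to the pod:

# assumes iproute2 is available in the image (the "ip a" output later in this thread suggests it is here)
kubectl exec nginx -- ip addr show net1
# or read back the annotation Multus writes onto the pod
kubectl get pod nginx -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}'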

I did notice that if the kernel was not updated, it didn't play nice. Maybe see if you can use the exact same steps?

@timothystewart6 (Author)

Same thing. Also, the MAC didn't make a difference. My problem isn't that the pod isn't getting an IP or DNS; it's that it's not reachable from outside.

inside

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: net1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether bc:24:11:5f:fe:c4 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.20.204/24 brd 192.168.20.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::bc24:1100:15f:fec4/64 scope link
       valid_lft forever preferred_lft forever
83: eth0@if84: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether de:35:04:4b:a3:8d brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.42.0.10/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::dc35:4ff:fe4b:a38d/64 scope link
       valid_lft forever preferred_lft forever
# ping 192.168.20.52
PING 192.168.20.52 (192.168.20.52) 56(84) bytes of data.
64 bytes from 192.168.20.52: icmp_seq=1 ttl=255 time=7.66 ms
64 bytes from 192.168.20.52: icmp_seq=2 ttl=255 time=21.9 ms
64 bytes from 192.168.20.52: icmp_seq=3 ttl=255 time=3.42 ms
^C
# ping google.com
PING google.com (142.250.190.46) 56(84) bytes of data.
64 bytes from ord37s33-in-f14.1e100.net (142.250.190.46): icmp_seq=1 ttl=54 time=8.36 ms
64 bytes from ord37s33-in-f14.1e100.net (142.250.190.46): icmp_seq=2 ttl=54 time=8.35 ms
^C

outside

# ping interface from outside
➜  dev ping 192.168.20.204
PING 192.168.20.204 (192.168.20.204): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3


# ping 192.168.20.52, same IP we could ping inside
PING 192.168.20.52 (192.168.20.52) 56(84) bytes of data.
64 bytes from 192.168.20.52: icmp_seq=1 ttl=255 time=7.66 ms
64 bytes from 192.168.20.52: icmp_seq=2 ttl=255 time=21.9 ms
64 bytes from 192.168.20.52: icmp_seq=3 ttl=255 time=3.42 ms
^C
root@rke2-test:/home/serveradmin# kubectl describe pod nginx
Name:             nginx
Namespace:        default
Priority:         0
Service Account:  default
Node:             rke2-test/192.168.20.188
Start Time:       Thu, 11 Apr 2024 09:07:50 -0500
Labels:           <none>
Annotations:      k8s.v1.cni.cncf.io/network-status:
                    [{
                        "name": "portmap",
                        "interface": "eth0",
                        "ips": [
                            "10.42.0.244"
                        ],
                        "mac": "0e:fc:eb:07:be:45",
                        "default": true,
                        "dns": {},
                        "gateway": [
                            "10.42.0.161"
                        ]
                    },{
                        "name": "default/ipvlan-def",
                        "interface": "net1",
                        "ips": [
                            "192.168.20.204"
                        ],
                        "mac": "bc:24:11:5f:fe:c4",
                        "dns": {}
                    }]
                  k8s.v1.cni.cncf.io/networks: [{ "name": "ipvlan-def", "ips": [ "192.168.20.204/24" ] }]
Status:           Running
IP:               10.42.0.244
IPs:
  IP:  10.42.0.244
Containers:
  nginx:
    Container ID:   containerd://474f0acf634470a2bad7cb1b4c169caaa947511d9ada326c0803226374419132
    Image:          nginx
    Image ID:       docker.io/library/nginx@sha256:b72dad1d013c5e4c4fb817f884aa163287bf147482562f12c56368ca1c2a3705
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 11 Apr 2024 09:07:51 -0500
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7cq42 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-7cq42:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age   From               Message
  ----    ------          ----  ----               -------
  Normal  Scheduled       29s   default-scheduler  Successfully assigned default/nginx to rke2-test
  Normal  AddedInterface  29s   multus             Add eth0 [10.42.0.244/32] from portmap
  Normal  AddedInterface  29s   multus             Add net1 [192.168.20.204/24] from default/ipvlan-def
  Normal  Pulling         28s   kubelet            Pulling image "nginx"
  Normal  Pulled          28s   kubelet            Successfully pulled image "nginx" in 557.84797ms (557.861487ms including waiting)
  Normal  Created         28s   kubelet            Created container nginx
  Normal  Started         28s   kubelet            Started container nginx
root@rke2-test:/home/serveradmin#

@clemenko

And to confirm, this is behind Proxmox? Is your UFW on?

@timothystewart6 (Author)

That's right. UFW is not on.

@clemenko

Are you letting RKE2 install Multus? Did you want to get on a Zoom call to troubleshoot?

@timothystewart6 (Author)

Yeah, I did. I copied and pasted it line for line on a new machine. I think it's just how Multus may work; I can't have cluster networking and local networking at the same time.

This gets me external DNS inside and pingable from the outside, but I do not get cluster networking inside. It seems this does not route anything over eth0, which is the cluster network. (I have also tried hundreds of variations of this.)

apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/networks: |
      [{
        "name": "multus-iot",
        "namespace": "default",
        "default-route": ["192.168.20.1"],
        "mac": "c6:5e:a4:8e:7a:59",
        "ips": ["192.168.20.202/24"]
      }]
spec:
  dnsPolicy: ClusterFirst
  dnsConfig:
    nameservers:
      - 192.168.60.10
      - 192.168.60.22
  containers:
  - name: sample-pod
    command: ["/bin/ash", "-c", "trap : TERM INT; sleep infinity & wait"]
    image: alpine
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: multus-iot
  namespace: default
spec:
  config: |-
    {
      "cniVersion": "0.3.1",
      "type": "ipvlan",
      "master": "eth1",
      "ipam": {
        "type": "static",
        "routes": [
          { "dst": "192.168.0.0/16", "gw": "192.168.20.1" }
        ],
      "gateway": "192.168.20.1"
      }
    }
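
A quick way to see the symptom described above, i.e. which interface ended up owning the default route inside the pod:

# with default-route set in the annotation, the default moves to net1 and eth0 only keeps the cluster link route
kubectl exec sample-pod -- ip route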

ping from outside

64 bytes from 192.168.20.202: icmp_seq=3139 ttl=63 time=0.903 ms
64 bytes from 192.168.20.202: icmp_seq=3140 ttl=63 time=0.756 ms
64 bytes from 192.168.20.202: icmp_seq=3141 ttl=63 time=1.034 ms
64 bytes from 192.168.20.202: icmp_seq=3142 ttl=63 time=0.819 ms
64 bytes from 192.168.20.202: icmp_seq=3143 ttl=63 time=0.923 ms

inside

/ # nslookup google.com
Server:         192.168.60.10
Address:        192.168.60.10:53

Non-authoritative answer:
Name:   google.com
Address: 142.250.190.46

Non-authoritative answer:
Name:   google.com
Address: 2607:f8b0:4009:809::200e

/ # nslookup k8s-home-01.local.<redacted>.com
Server:         192.168.60.10
Address:        192.168.60.10:53

Name:   k8s-home-01.local.techtronic.us
Address: 192.168.60.50

Non-authoritative answer:

/ # nslookup homepage
Server:         192.168.60.10
Address:        192.168.60.10:53

** server can't find homepage.default.svc.cluster.local: NXDOMAIN

** server can't find homepage.svc.cluster.local: NXDOMAIN

** server can't find homepage.cluster.local: NXDOMAIN

** server can't find homepage.default.svc.cluster.local: NXDOMAIN

** server can't find homepage.svc.cluster.local: NXDOMAIN

** server can't find homepage.cluster.local: NXDOMAIN

/ # nslookup homepage.default.svc.cluster.local
Server:         192.168.60.10
Address:        192.168.60.10:53

** server can't find homepage.default.svc.cluster.local: NXDOMAIN

** server can't find homepage.default.svc.cluster.local: NXDOMAIN

/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: net1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether bc:24:11:ed:46:d2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.20.202/24 brd 192.168.20.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::bc24:1100:1ed:46d2/64 scope link 
       valid_lft forever preferred_lft forever
121: eth0@if122: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether ae:43:88:53:7d:90 brd ff:ff:ff:ff:ff:ff
    inet 10.42.5.102/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::ac43:88ff:fe53:7d90/64 scope link 
       valid_lft forever preferred_lft forever
/ # ip r
default via 192.168.20.1 dev net1 
10.42.5.55 dev eth0 scope link 
192.168.0.0/16 via 192.168.20.1 dev net1 
192.168.20.0/24 dev net1 scope link  src 192.168.20.202 
/ # 

@clemenko

I left the default route on the internal interface, eth0, and left DNS pointing to the internal side as well. I was able to ping around.

Nginx described

root@cilium:~# kubectl describe pod nginx
Name:             nginx
Namespace:        default
Priority:         0
Service Account:  default
Node:             cilium/192.168.1.179
Start Time:       Thu, 11 Apr 2024 04:30:39 +0000
Labels:           <none>
Annotations:      k8s.v1.cni.cncf.io/network-status:
                    [{
                        "name": "portmap",
                        "interface": "eth0",
                        "ips": [
                            "10.42.0.91"
                        ],
                        "mac": "9a:40:15:00:68:8f",
                        "default": true,
                        "dns": {},
                        "gateway": [
                            "10.42.0.100"
                        ]
                    },{
                        "name": "default/ipvlan-def",
                        "interface": "net1",
                        "ips": [
                            "192.168.1.204"
                        ],
                        "mac": "fe:7c:81:99:65:34",
                        "dns": {}
                    }]
                  k8s.v1.cni.cncf.io/networks: [{ "name": "ipvlan-def", "ips": [ "192.168.1.204/24" ] }]

netdef

cat <<EOF | kubectl create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ipvlan-def
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "ipvlan",
      "master": "enp1s0",
      "mode": "l2",
      "ipam": { "type": "static" }
    }'
EOF

ping from my laptop

clembookair:clemenko Desktop $ ping 192.168.1.204
PING 192.168.1.204 (192.168.1.204): 56 data bytes
64 bytes from 192.168.1.204: icmp_seq=0 ttl=64 time=1.436 ms
64 bytes from 192.168.1.204: icmp_seq=1 ttl=64 time=1.528 ms

dig from inside.

root@cilium:~# kubectl exec -it nginx -- bash
root@nginx:/# dig google.com

; <<>> DiG 9.18.24-1-Debian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58682
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 806ea2c59666c602 (echoed)
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		30	IN	A	142.251.40.142

;; Query time: 40 msec
;; SERVER: 10.43.0.10#53(10.43.0.10) (UDP)
;; WHEN: Thu Apr 11 18:16:10 UTC 2024
;; MSG SIZE  rcvd: 77

We need to look at how you want the pod to connect/route to which subnets internally (eth0) vs. externally (net1).

root@nginx:/# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         10.42.0.100     0.0.0.0         UG    0      0        0 eth0
10.42.0.100     0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 net1

Can it be assumed that 10.*.*.* is all internal to the cluster? Or are there VLANs on your network that use that subnet? You might be able to say the default is net1 and then make sure you have DNS and other routes just for the 10.x.x.x ranges.

@timothystewart6 (Author) commented Apr 11, 2024

I think I really have it this time

---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: multus-iot
  namespace: default
spec:
  config: |-
    {
      "cniVersion": "0.3.1",
      "type": "ipvlan",
      "master": "eth1",
      "ipam": {
        "type": "static",
        "routes": [
          { "dst": "192.168.0.0/16", "gw": "192.168.20.1" }
        ],
      "gateway": "192.168.20.1"
      }
    }
---
apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/networks: |
      [{
        "name": "multus-iot",
        "namespace": "default",
        "mac": "c6:5e:a4:8e:7a:59",
        "ips": ["192.168.20.202/24"]
      }]
spec:
  containers:
  - name: sample-pod
    command: ["/bin/ash", "-c", "trap : TERM INT; sleep infinity & wait"]
    image: alpine
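
Assuming the combined manifest above is saved as multus-iot.yaml (a hypothetical filename), applying and re-checking from inside the pod looks roughly like:

kubectl apply -f multus-iot.yaml
# default route should stay on eth0 now, with 192.168.0.0/16 reachable via net1
kubectl exec sample-pod -- ip route
kubectl exec sample-pod -- nslookup homepage   # cluster DNS should still resolve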

NIC on host

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether bc:24:11:ed:46:d2 brd ff:ff:ff:ff:ff:ff
    altname enp0s19
    inet 192.168.20.71/24 metric 100 brd 192.168.20.255 scope global dynamic eth1
       valid_lft 76260sec preferred_lft 76260sec
    inet6 fe80::be24:11ff:feed:46d2/64 scope link
       valid_lft forever preferred_lft forever
➜  ~

ping from outside

➜  dev ping 192.168.20.201
PING 192.168.20.201 (192.168.20.201): 56 data bytes
64 bytes from 192.168.20.201: icmp_seq=0 ttl=63 time=0.690 ms
64 bytes from 192.168.20.201: icmp_seq=1 ttl=63 time=0.677 ms
^C
--- 192.168.20.201 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.677/0.683/0.690/0.006 ms

exec into pod

/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: net1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether bc:24:11:ed:46:d2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.20.202/24 brd 192.168.20.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::bc24:1100:1ed:46d2/64 scope link 
       valid_lft forever preferred_lft forever
133: eth0@if134: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether c2:60:af:46:0b:d2 brd ff:ff:ff:ff:ff:ff
    inet 10.42.5.236/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::c060:afff:fe46:bd2/64 scope link 
       valid_lft forever preferred_lft forever
/ # nslookup google.com
Server:         10.43.0.10
Address:        10.43.0.10:53

Non-authoritative answer:
Name:   google.com
Address: 142.250.190.46

Non-authoritative answer:
Name:   google.com
Address: 2607:f8b0:4009:802::200e

/ # ns lookup k8s-01.local.<redacted>.com
/bin/sh: ns: not found
/ # nslookup k8s-01.local.<redacted>.com
Server:         10.43.0.10
Address:        10.43.0.10:53

/ # nslookup homepage # k8s service record
Server:         10.43.0.10
Address:        10.43.0.10:53

Name:   homepage.default.svc.cluster.local
Address: 10.43.143.7

/ # ping 192.168.20.52 #ip outside of cluster, same subnet
PING 192.168.20.52 (192.168.20.52): 56 data bytes
64 bytes from 192.168.20.52: seq=0 ttl=255 time=1071.444 ms
64 bytes from 192.168.20.52: seq=1 ttl=255 time=71.535 ms
^C
--- 192.168.20.52 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 71.535/571.489/1071.444 ms
/ # ping 192.168.60.10
PING 192.168.60.10 (192.168.60.10): 56 data bytes
^C

/ # ping 192.168.60.1 #ip outside of cluster, outside of subnet
PING 192.168.60.1 (192.168.60.1): 56 data bytes
64 bytes from 192.168.60.1: seq=0 ttl=64 time=0.243 ms
64 bytes from 192.168.60.1: seq=1 ttl=64 time=0.361 ms
64 bytes from 192.168.60.1: seq=2 ttl=64 time=0.383 ms
64 bytes from 192.168.60.1: seq=3 ttl=64 time=0.446 ms
^C
--- 192.168.60.1 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.243/0.358/0.446 ms
/ # curl google.com
/bin/sh: curl: not found
/ # 

`curl` endpoint on Pod IP exposed via `ipvlan`


➜  dev curl http://192.168.20.201:8123
<!DOCTYPE html><html><head><title>Home Assistant</title><meta charset="utf-8"><link rel="manifest" href="/manifest.json" crossorigin="use-credentials"><link rel="icon" href="/static/icons/favicon.ico"><link rel="modulepreload" href="/frontend_latest/core.3T6YYLPr4EA.js" crossorigin="use-credentials"><link rel="modulepreload" href="/frontend_latest/app.RvSvIUp05_E.js" crossorigin="use-credentials"><link rel="mask-icon" href="/static/icons/mask-icon.svg" color="#18bcf2"><link rel="apple-touch-icon" href="/static/icons/favicon-apple-180x180.png"><meta name="apple-itunes-app" content="app-id=1099568401"><meta name="apple-mobile-web-app-capable" content="yes"><meta name="apple-mobile-web-app-status-bar-style" content="default"><meta name="apple-mobile-web-app-title" content="Home Assistant"><meta name="msapplication-config" content="/static/icons/browserconfig.xml"><meta name="mobile-web-app-capable" content="yes"><meta name="application-name" content="Home Assistant"><meta name="referrer" content="same-origin"><meta name="theme-color" content="#03A9F4"><meta name="color-scheme" content="dark light"><meta name="viewport" content="width=device-width,user-scalable=no,viewport-fit=cover,initial-scale=1"><style>body{font-family:Roboto,Noto,Noto Sans,sans-serif;-moz-osx-font-smoothing:grayscale;-webkit-font-smoothing:antialiased;font-weight:400;margin:0;padding:0;height:100%}</style><style>html{background-color:var(--primary-background-color,#fafafa);color:var(--primary-text-color,#212121);height:100vh}@media (prefers-color-scheme:dark){html{background-color:var(--primary-background-color,#111);color:var(--primary-text-color,#e1e1e1)}}#ha-launch-screen{height:100%;display:flex;flex-direction:column;justify-content:center;align-items:center}#ha-launch-screen svg{width:112px;flex-shrink:0}#ha-launch-screen .ha-launch-screen-spacer{flex:1}</style></head><body><div id="ha-launch-screen"><div class="ha-launch-screen-spacer"></div><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 240 240"><path fill="#18BCF2" d="M240 224.762a15 15 0 0 1-15 15H15a15 15 0 0 1-15-15v-90c0-8.25 4.77-19.769 10.61-25.609l98.78-98.7805c5.83-5.83 15.38-5.83 21.21 0l98.79 98.7895c5.83 5.83 10.61 17.36 10.61 25.61v90-.01Z"/><path fill="#F2F4F9" d="m107.27 239.762-40.63-40.63c-2.09.72-4.32 1.13-6.64 1.13-11.3 0-20.5-9.2-20.5-20.5s9.2-20.5 20.5-20.5 20.5 9.2 20.5 20.5c0 2.33-.41 4.56-1.13 6.65l31.63 31.63v-115.88c-6.8-3.3395-11.5-10.3195-11.5-18.3895 0-11.3 9.2-20.5 20.5-20.5s20.5 9.2 20.5 20.5c0 8.07-4.7 15.05-11.5 18.3895v81.27l31.46-31.46c-.62-1.96-.96-4.04-.96-6.2 0-11.3 9.2-20.5 20.5-20.5s20.5 9.2 20.5 20.5-9.2 20.5-20.5 20.5c-2.5 0-4.88-.47-7.09-1.29L129 208.892v30.88z"/></svg><div id="ha-launch-screen-info-box" class="ha-launch-screen-spacer"></div></div><home-assistant></home-assistant><script>function _ls(e,n){var t=document.createElement("script");return n&&(t.crossOrigin="use-credentials"),t.src=e,document.head.appendChild(t)}window.polymerSkipLoadingFontRoboto=!0,"customElements"in window&&"content"in document.createElement("template")||_ls("/static/polyfills/webcomponents-bundle.js",!0);var isS11_12=/(?:.*(?:iPhone|iPad).*OS (?:11|12)_\d)|(?:.*Version\/(?:11|12)(?:\.\d+)*.*Safari\/)/.test(navigator.userAgent)</script><script>if(-1===navigator.userAgent.indexOf("Android")&&-1===navigator.userAgent.indexOf("CrOS")){function _pf(o,t){var 
n=document.createElement("link");n.rel="preload",n.as="font",n.type="font/woff2",n.href=o,n.crossOrigin="anonymous",document.head.appendChild(n)}_pf("/static/fonts/roboto/Roboto-Regular.woff2"),_pf("/static/fonts/roboto/Roboto-Medium.woff2")}</script><script crossorigin="use-credentials">isS11_12||(import("/frontend_latest/core.3T6YYLPr4EA.js"),import("/frontend_latest/app.RvSvIUp05_E.js"),window.customPanelJS="/frontend_latest/custom-panel.i8gdgENK1So.js",window.latestJS=!0)</script><script>import("/hacsfiles/iconset.js");</script><script>window.latestJS||(window.customPanelJS="/frontend_es5/custom-panel.z-AFomjCesQ.js",_ls("/frontend_es5/core.P4Zu7KndJf8.js",!0),_ls("/frontend_es5/app.XMyum4uAiWg.js",!0))</script><script>if (!window.latestJS) {}</script></body></html>%
➜  dev

@clemenko

Cool. I am fairly certain you don't need

        "routes": [
          { "dst": "192.168.0.0/16", "gw": "192.168.20.1" }
        ],
      "gateway": "192.168.20.1"

in the NetworkAttachmentDefinition

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: multus-iot
  namespace: default
spec:
  config: |-
    {
      "cniVersion": "0.3.1",
      "type": "ipvlan",
      "master": "eth1",
      "ipam": { "type": "static" }
    }

@clemenko

oh and

apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/networks: '[{ "name": "multus-iot", "ips": ["192.168.20.202/24"] }]'
spec:
  containers:
  - name: sample-pod
    command: ["/bin/ash", "-c", "trap : TERM INT; sleep infinity & wait"]
    image: alpine

@timothystewart6 (Author)

I will test without, but I think I need both in the NAD. I've added mac because I don't want a random MAC every time it starts up; DHCP gets messy. Yeah, I could remove the namespace, but I always declare it, even if it is the default.

@clemenko

that makes sense.

@timothystewart6 (Author)

I do need routes; otherwise I can't reach it from the outside. I can remove gateway.

    {
      "cniVersion": "0.3.1",
      "type": "ipvlan",
      "master": "eth1",
      "ipam": {
        "type": "static",
        "routes": [
          { "dst": "192.168.0.0/16", "gw": "192.168.20.1" }
        ]
      }
    }
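
One wrinkle worth noting: Multus only reads the NetworkAttachmentDefinition when a pod's sandbox is created, so after editing the ipam block the test pod has to be recycled to pick up the change; roughly (filenames are hypothetical):

kubectl apply -f multus-iot-nad.yaml
kubectl delete pod sample-pod
kubectl apply -f sample-pod.yaml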

@timothystewart6 (Author)

I am betting this will also work with macvlan, but I don't even want to breathe on it 😅

@clemenko

clemenko commented Apr 11, 2024 via email
