Kubernetes - setup IPVS

(OPTIONAL): This is already done by the Ansible scripts if you are using https://github.com/lukassup/libvirt-istiolab-tf

NOTE: the kube-proxy configuration can also be passed to kubeadm init.

# set the following fields in the kube-proxy ConfigMap
kubectl edit configmap -n kube-system kube-proxy
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  strictARP: true
# restart the kube-proxy pods to pick up the new mode
kubectl delete pod -n kube-system -l k8s-app=kube-proxy
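
To confirm kube-proxy actually switched to IPVS (a quick check; 10249 is the default kube-proxy metrics port and ipvsadm must be installed on the node):

curl -s http://localhost:10249/proxyMode   # expect: ipvs
ipvsadm -Ln | head                         # ClusterIPs now appear as IPVS virtual servers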

Kubernetes - kube-dns

(OPTIONAL): The .10 IP is the default and this step can be skipped if you are using https://github.com/lukassup/libvirt-istiolab-tf

Set a predefined clusterIP for kube-dns

kubectl patch svc/kube-dns -n kube-system --patch '{"spec":{"clusterIP":"10.1.1.10"}}'
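
A quick check that the service now has the expected ClusterIP:

kubectl get svc -n kube-system kube-dns -o jsonpath='{.spec.clusterIP}'   # expect: 10.1.1.10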

Calico - install
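
The install step itself isn't captured in this gist; a minimal sketch using the upstream Calico manifest (the exact version below is an assumption, pick one matching your cluster):

kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.4/manifests/calico.yaml
kubectl wait pods -n kube-system -l k8s-app=calico-node --for condition=Ready --timeout=120s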

Calico - fix for incorrect interface autodetection

kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=kubernetes-internal-ip

Calico - disable IPIP and VXLAN, enable outgoing NAT globally and for the default pool

kubectl set env daemonset/calico-node -n kube-system CALICO_IPV4POOL_IPIP=Never
kubectl set env daemonset/calico-node -n kube-system CALICO_IPV4POOL_VXLAN=Never
kubectl set env daemonset/calico-node -n kube-system CALICO_IPV4POOL_NAT_OUTGOING=true

kubectl wait pods -n kube-system -l k8s-app=calico-node --for condition=Ready --timeout=60s

calicoctl patch pool default-ipv4-ippool --patch='{"spec":{"ipipMode":"Never"}}'
calicoctl patch pool default-ipv4-ippool --patch='{"spec":{"vxlanMode":"Never"}}'
calicoctl patch pool default-ipv4-ippool --patch='{"spec":{"natOutgoing":true}}'
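
Confirm the pool settings took effect:

calicoctl get ippool default-ipv4-ippool -o yaml | grep -E 'ipipMode|vxlanMode|natOutgoing'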

Calico - configure BGP peers for ToR switch

NOTE: the Istio subzone label is used here; for on-prem deployments it typically maps to a rack

calicoctl apply -f - <<EOF
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: rack-10-0-1-tor
spec:
  asNumber: 64513
  nodeSelector: topology.istio.io/subzone == 'rack-10-0-1'
  peerIP: 10.0.1.254
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: rack-10-0-2-tor
spec:
  asNumber: 64514
  nodeSelector: topology.istio.io/subzone == 'rack-10-0-2'
  peerIP: 10.0.2.254
---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: rack-10-0-3-tor
spec:
  asNumber: 64515
  nodeSelector: topology.istio.io/subzone == 'rack-10-0-3'
  peerIP: 10.0.3.254
EOF
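
Check that the peers were created:

calicoctl get bgppeer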

Calico - add rack labels for nodes

kubectl label node kube-ctrl01 topology.istio.io/subzone=rack-10-0-1
kubectl label node kube-node01 topology.istio.io/subzone=rack-10-0-1

kubectl label node kube-ctrl02 topology.istio.io/subzone=rack-10-0-2
kubectl label node kube-node02 topology.istio.io/subzone=rack-10-0-2

kubectl label node kube-ctrl03 topology.istio.io/subzone=rack-10-0-3
kubectl label node kube-node03 topology.istio.io/subzone=rack-10-0-3
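
Double-check the labels:

kubectl get nodes -L topology.istio.io/subzone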

Calico - set node BGP ASN

calicoctl patch node kube-ctrl01 --patch='{"spec":{"bgp":{"asNumber":64513}}}'
calicoctl patch node kube-node01 --patch='{"spec":{"bgp":{"asNumber":64513}}}'

calicoctl patch node kube-ctrl02 --patch='{"spec":{"bgp":{"asNumber":64514}}}'
calicoctl patch node kube-node02 --patch='{"spec":{"bgp":{"asNumber":64514}}}'

calicoctl patch node kube-ctrl03 --patch='{"spec":{"bgp":{"asNumber":64515}}}'
calicoctl patch node kube-node03 --patch='{"spec":{"bgp":{"asNumber":64515}}}'
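
The per-node ASN and address can be verified with:

calicoctl get nodes -o wide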

Calico - cluster-wide BGP configuration

calicoctl apply -f - <<EOF
---
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
  logSeverityScreen: Info
  bindMode: NodeIP
  serviceClusterIPs:
    - cidr: 10.1.1.0/24
EOF
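
Verify the cluster-wide BGP settings:

calicoctl get bgpconfig default -o yaml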

Calico - verify successful BGP peering with ToR switch

# calicoctl node status
Calico process is running.

IPv4 BGP status
+--------------+---------------+-------+----------+-------------+
| PEER ADDRESS |   PEER TYPE   | STATE |  SINCE   |    INFO     |
+--------------+---------------+-------+----------+-------------+
| 10.0.1.254   | node specific | up    | 09:20:33 | Established |
+--------------+---------------+-------+----------+-------------+

FRR - BGP sessions should now be established

leaf01# show bgp vrf vrf-main ipv4 unicast summary 
BGP router identifier 10.0.0.1, local AS number 64513 vrf-id 7
BGP table version 18
RIB entries 29, using 5568 bytes of memory
Peers 4, using 2896 KiB of memory
Peer groups 2, using 128 bytes of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
*10.0.1.1       4      64513       211       218        0    0    0 00:10:22            2       12 N/A
*10.0.1.2       4      64513       209       216        0    0    0 00:10:17            2       12 N/A
spine01(swp1)   4      64512      1512      1513        0    0    0 01:14:51           10       15 N/A
spine02(swp2)   4      64512      1512      1514        0    0    0 01:14:51           10       15 N/A

Total number of neighbors 4
* - dynamic neighbor
2 dynamic neighbor(s), limit 100


leaf01# show bgp vrf vrf-main ipv4 unicast neighbors 10.0.1.1 routes      
BGP table version is 18, local router ID is 10.0.0.1, vrf id 7
Default local pref 100, local AS 64513
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*>i10.1.1.0/24      10.0.1.1                      100      0 i
*>i10.2.107.192/26  10.0.1.1                      100      0 i

Displayed  2 routes and 26 total paths

leaf01# show bgp vrf vrf-main ipv4 unicast neighbors 10.0.1.2 routes 
BGP table version is 18, local router ID is 10.0.0.1, vrf id 7
Default local pref 100, local AS 64513
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*=i10.1.1.0/24      10.0.1.2                      100      0 i
*>i10.2.0.128/26    10.0.1.2                      100      0 i

Displayed  2 routes and 26 total paths

Kubectl - test deployment and service

# kubectl create deployment httpbin --image=docker.io/kong/httpbin --replicas=3
# kubectl expose deploy/httpbin --port=80 --target-port=80

# kubectl wait pods -l app=httpbin --for condition=Ready --timeout=60s
# kubectl get pod -l=app=httpbin -o wide -w
NAME                      READY   STATUS    RESTARTS   AGE   IP           NODE          ...
httpbin-dd48785fc-5n8bj   1/1     Running   0          86s   10.2.161.4   kube-node03   ...
httpbin-dd48785fc-f25xd   1/1     Running   0          86s   10.2.238.1   kube-node02   ...
httpbin-dd48785fc-lvnsn   1/1     Running   0          86s   10.2.0.129   kube-node01   ...

# kubectl get svc -l=app=httpbin
NAME      TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
httpbin   ClusterIP   10.1.1.190   <none>        80/TCP    52s
## test pod icmp ping
# for ip in $(kubectl get pod -l=app=httpbin -o jsonpath='{..status.podIP}'); do ping -c1 $ip; done

## test pod tcp connectivity
# for ip in $(kubectl get pod -l=app=httpbin -o jsonpath='{..status.podIP}'); do nc -zv $ip 80; done

## test pod http
# POD_URLS=$(kubectl get pod -l=app=httpbin -o=jsonpath='{ range .items[*] }http://{ .status.podIP }:80{"\n"}{ end }')
# for url in ${POD_URLS[@]}; do curl -sSLD/dev/stdout -o/dev/null "$url"; done

## test svc icmp ping, this should fail with ICMP error: Destination Port Unreachable
# SVC_IP=$(kubectl get svc/httpbin -o=jsonpath='{.spec.clusterIP}')
# ping -c1 $SVC_IP

## test svc tcp connectivity
# SVC_PORT=$(kubectl get svc/httpbin -o=jsonpath='{.spec.ports[0].port}')
# nc -zv $SVC_IP $SVC_PORT

## test svc http
# SVC_URL=http://$SVC_IP:$SVC_PORT
# curl -sSLD/dev/stdout -o/dev/null "$SVC_URL"
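
## since kube-proxy runs in IPVS mode, the ClusterIP should also appear in the node IPVS table
## (assumes ipvsadm is installed; run on a cluster node)
# ipvsadm -Ln -t $SVC_IP:$SVC_PORT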

Istio - install

istioctl install
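
A quick sanity check after the install:

kubectl get pods -n istio-system
istioctl version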

Istio - enable sidecar autoinjection for default namespace

kubectl label namespace default istio-injection=enabled
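
Confirm the label:

kubectl get namespace default -L istio-injection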

Kubernetes - recreate httpbin app pods to inject Istio sidecars

kubectl delete pod -l=app=httpbin
kubectl wait pods -l app=httpbin --for condition=Ready --timeout=60s
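
The recreated pods should now show two containers (app + istio-proxy):

kubectl get pod -l=app=httpbin   # READY column should read 2/2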

Verify connectivity again

## test pod icmp ping
# for ip in $(kubectl get pod -l=app=httpbin -o jsonpath='{..status.podIP}'); do ping -c1 $ip; done

## test pod tcp connectivity
# for ip in $(kubectl get pod -l=app=httpbin -o jsonpath='{..status.podIP}'); do nc -zv $ip 80; done

## test pod http
# POD_URLS=$(kubectl get pod -l=app=httpbin -o=jsonpath='{ range .items[*] }http://{ .status.podIP }:80{"\n"}{ end }')
# for url in ${POD_URLS[@]}; do curl -sSLD/dev/stdout -o/dev/null "$url"; done

## test svc tcp connectivity
# SVC_IP=$(kubectl get svc/httpbin -o=jsonpath='{.spec.clusterIP}')
# SVC_PORT=$(kubectl get svc/httpbin -o=jsonpath='{.spec.ports[0].port}')
# nc -zv $SVC_IP $SVC_PORT

## test svc http
# SVC_URL=http://$SVC_IP:$SVC_PORT
# curl -sSLD/dev/stdout -o/dev/null "$SVC_URL"

Istio - setup Gateway & VirtualService

NOTE: you can also add an HTTPS server by following the Istio secure gateways example

kubectl apply -f - <<EOF
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - httpbin.example.com
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - httpbin.example.com
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        host: httpbin
EOF
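
Optionally sanity-check the new resources before testing:

kubectl get gateway,virtualservice
istioctl analyze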

Istio - verify service connectivity via Ingress Gateway

INGRESS_IP=$(kubectl get -n istio-system svc/istio-ingressgateway -o jsonpath='{.spec.clusterIP}')
INGRESS_URL="http://${INGRESS_IP}:80"
curl -sSLD/dev/stdout -o/dev/null -H 'Host: httpbin.example.com' "$INGRESS_URL"

Istio - Ingress gateway on bare metal

Prerequisites

  • The Ingress host must be able to resolve Kubernetes internal DNS entries
  • The Ingress host must be able to reach Pod IPs (this should already work if you followed the Calico guide above)

This part is based on the Istio Virtual Machine installation guide

  • On the Kubernetes control plane node, enable Istio Pilot workload autoregistration and generate the Istio workload entry config

    WORK_DIR=ingress
    VM_APP=istio-ingressgateway  # default install
    VM_NAMESPACE=istio-system # default install
    SERVICE_ACCOUNT=istio-ingressgateway-service-account  # default install
    CLUSTER_NETWORK=''
    VM_NETWORK=''
    CLUSTER=cluster1
    
    # NOTE: prereq for workload autoregistration
    cat <<EOF > ./vm-cluster.yaml
    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    metadata:
      name: istio
    spec:
      values:
        global:
          meshID: mesh1
          multiCluster:
            clusterName: "${CLUSTER}"
          network: "${CLUSTER_NETWORK}"
    EOF
    
    # NOTE: prereq for workload autoregistration
    istioctl install -f vm-cluster.yaml --set values.pilot.env.PILOT_ENABLE_WORKLOAD_ENTRY_AUTOREGISTRATION=true --set values.pilot.env.PILOT_ENABLE_WORKLOAD_ENTRY_HEALTHCHECKS=true
    
    mkdir -p $WORK_DIR
    
    cat << EOF > workloadgroup.yaml
    apiVersion: networking.istio.io/v1alpha3
    kind: WorkloadGroup
    metadata:
      name: "${VM_APP}"
      namespace: "${VM_NAMESPACE}"
    spec:
      metadata:
        labels:
          app: "${VM_APP}"
      template:
        serviceAccount: "${SERVICE_ACCOUNT}"
        network: "${VM_NETWORK}"
    EOF
    
    # prereq for workload autoregistration
    kubectl --namespace "${VM_NAMESPACE}" apply -f workloadgroup.yaml
    
    istioctl x workload entry configure -f workloadgroup.yaml -o "${WORK_DIR}" --clusterID "${CLUSTER}" --autoregister
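    # the command above writes the files used in the next steps into ${WORK_DIR}
    # (typically cluster.env, hosts, istio-token, mesh.yaml and root-cert.pem)
    ls "${WORK_DIR}"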
  • Transfer the /root/ingress directory contents to the Ingress hosts, e.g.

    kube-ctrl01 # tar -czvf ~debian/ingress.tar.gz ./ingress
    kube-ctrl01 $ scp ~debian/ingress.tar.gz debian@istio-ingress01:~
    kube-ctrl01 $ scp ~debian/ingress.tar.gz debian@istio-ingress02:~
    kube-ctrl01 $ scp ~debian/ingress.tar.gz debian@istio-ingress03:~
    istio-ingressN # tar -xzvf ~debian/ingress.tar.gz # extract to root homedir on each ingress host
  • On Ingress host

    # copy root cert
    mkdir -p /etc/certs
    cp ~/ingress/root-cert.pem /etc/certs/root-cert.pem
    
    # copy serviceaccount token
    mkdir -p /var/run/secrets/tokens
    cp ~/ingress/istio-token /var/run/secrets/tokens/istio-token
    
    # install package (rpm also available)
    curl -LO https://storage.googleapis.com/istio-release/releases/1.19.3/deb/istio-sidecar.deb
    dpkg -i istio-sidecar.deb
    
    cp ~/ingress/cluster.env /var/lib/istio/envoy/cluster.env
    cp ~/ingress/mesh.yaml /etc/istio/config/mesh
    # optional, add istiod svc ip to hostfile
    # echo "$ISTIO_ISTIOD_SVC_IP    istiod.istio-system.svc" >> /etc/hosts
    
    mkdir -p /etc/istio/proxy
    chown -R istio-proxy /var/lib/istio /etc/certs /etc/istio/proxy /etc/istio/config /var/run/secrets /etc/certs/root-cert.pem
  • On Ingress host, make the following changes to the configs

    # /etc/istio/config/mesh
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012

    # append the following router config to /var/lib/istio/envoy/cluster.env
    EXEC_USER='root'
    ISTIO_AGENT_FLAGS='router'
    ISTIO_CUSTOM_IP_TABLES='true'
    ISTIO_METAJSON_LABELS='{"istio":"ingressgateway","istio.io/rev":"default","app":"'$CANONICAL_SERVICE'","service.istio.io/canonical-name":"'$CANONICAL_SERVICE'","service.istio.io/canonical-revision":"'$CANONICAL_REVISION'"}'
  • On Ingress host, start istio

    systemctl start istio
  • On Kubernetes control plane node, verify the new router proxy is visible in istiod

    istioctl proxy-status
    NAME                                                    CLUSTER      CDS        LDS        EDS        RDS          ECDS         ISTIOD                      VERSION
    <...>
    istio-ingress01.istio-system                            cluster1     SYNCED     SYNCED     SYNCED     SYNCED       NOT SENT     istiod-854777b68c-mtjqs     1.19.3
  • On Ingress host, verify the Envoy config contains clusters (aka backends)

    pilot-agent request GET /clusters
  • On Ingress host, verify the Envoy config contains listeners (aka frontends), assuming some Gateway & VirtualService resources are configured

    pilot-agent request GET /listeners
  • On Ingress host, test HTTP connectivity to Kubernetes service

    curl -sSLD/dev/stdout -o/dev/null -H 'Host: httpbin.example.com' http://localhost
  • On Ingress host, test HTTPS connectivity (if you configured HTTPS for your Gateway earlier)

    curl -ksSLD/dev/stdout -o/dev/null --resolve httpbin.example.com:443:127.0.0.1 https://httpbin.example.com 

Anycast on Istio proxy

Precreated anycast IP addresses on istio-ingress hosts are:

10.10.0.1/32
10.10.0.2/32
10.10.0.3/32
10.10.0.4/32
10.10.0.5/32
10.10.0.6/32
10.10.0.7/32
10.10.0.8/32
10.10.0.9/32
10.10.0.10/32

These IPs are advertised to the leaf switches by FRR via BGP.
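
For reference, a hedged sketch of how these addresses could be bound on an ingress host (the lab scripts already handle this; the interface name is an assumption), after which FRR advertises the connected /32s:

for i in $(seq 1 10); do
  ip addr add 10.10.0.$i/32 dev lo
done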

Workaround for a bug(?) with the default Istio Ingress Gateway

It seems the Istio ingress gateway Kubernetes Service prevents the external Istio proxy from listening on custom IPs, so remove it:

root@kube-ctrl01:~# kubectl -n istio-system delete svc/istio-ingressgateway

Reconfigure Istio Gateway to listen on anycast IP

On Kubernetes control plane node:

kubectl apply -f - <<EOF
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - httpbin.example.com
    bind: 10.10.0.1
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - httpbin.example.com
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        host: httpbin
EOF
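
On an Ingress host the new bind address should now show up in the Envoy listener config:

pilot-agent request GET /listeners | grep 10.10.0.1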

Test connectivity:

root@kube-ctrl01:~# curl -sSLD/dev/stdout -o /dev/null  -H 'Host: httpbin.example.com' http://10.10.0.1/get
HTTP/1.1 200 OK
server: istio-envoy
date: Tue, 12 Dec 2023 20:31:20 GMT
content-type: application/json
content-length: 683
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 3