@maelvls
Last active November 16, 2021 13:50

Getting started using cert-manager with the sig-network Gateway API

This gist was published at https://www.jetstack.io/blog/cert-manager-gateway-api-traefik-guide/

Getting started

Prerequisites

In order to get ACME certificates using one of the implementations of the Gateway API, you will need a cluster with:

  1. A Service type=LoadBalancer controller installed (if you are running on GKE, then you are good to go).
  2. A DNS zone in some DNS provider that ExternalDNS supports (e.g., CloudDNS or Scaleway). You can also use the IP addresses directly, but this guide will focus on host names instead.

In the remainder of the guide, we will be using:

  • a pre-existing GKE cluster,
  • a pre-existing CloudDNS zone (domain mael-valais-gcp.jetstacker.net.).

Before installing the Gateway API implementation, we will need to install ExternalDNS.

Note that as of August 6, 2021, ExternalDNS does not support the Gateway API natively. Andy Bursavich is working on implementing it and you can follow the development on the issue external-dns#2045.

Let us set a couple of variables:

# The following project must contain the CloudDNS zone you will be using.
PROJECT=my-gcp-project

# This is the `DNS_NAME` that you can see when running
# `gcloud dns managed-zones list`:
DOMAIN=mael-valais-gcp.jetstacker.net
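
Note that `gcloud dns managed-zones list` typically prints the DNS name with a trailing dot (as in mael-valais-gcp.jetstacker.net. above), whereas the commands in this guide expect DOMAIN without it. As a small sketch, the following helpers check that both variables are set and strip a trailing dot. The names normalize_domain and check_vars are made up for this example and are not part of any tool:

```shell
# Hypothetical helper (not part of the original guide): strips one trailing
# dot so that "example.com." and "example.com" are treated alike.
normalize_domain() {
  printf '%s\n' "${1%.}"
}

# Hypothetical helper: returns 1 when either value is empty, otherwise prints
# the values that the rest of the guide would use.
# $1: GCP project, $2: DNS name of the zone.
check_vars() {
  project=$1
  domain=$2
  if [ -z "$project" ] || [ -z "$domain" ]; then
    echo "PROJECT and DOMAIN must both be set" >&2
    return 1
  fi
  echo "PROJECT=$project DOMAIN=$(normalize_domain "$domain")"
}

# Usage (after setting the variables above):
#   check_vars "$PROJECT" "$DOMAIN"
```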

ExternalDNS needs a service account key so that it can configure DNS records in the zone:

gcloud iam service-accounts create external-dns --display-name "For ExternalDNS" --project "$PROJECT"
gcloud projects add-iam-policy-binding "$PROJECT" --role=roles/dns.admin \
  --member="serviceAccount:external-dns@$PROJECT.iam.gserviceaccount.com"
kubectl -n external-dns apply -f- >/dev/null <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: external-dns
---
apiVersion: v1
kind: Secret
metadata:
  name: jsonkey
stringData:
  jsonkey: |
    $(gcloud iam service-accounts keys create /dev/stdout --iam-account "external-dns@$PROJECT.iam.gserviceaccount.com" --project "$PROJECT" | jq -c)
EOF

Finally, we can install ExternalDNS:

helm repo add bitnami https://charts.bitnami.com/bitnami >/dev/null
helm upgrade --install external-dns bitnami/external-dns --namespace external-dns --create-namespace \
    --set provider=google --set google.project="$PROJECT" --set google.serviceAccountSecret=jsonkey --set google.serviceAccountSecretKey=jsonkey \
    --set sources='{ingress,service}' >/dev/null
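
Before moving on, it may be worth confirming that ExternalDNS actually started. The sketch below wraps `kubectl rollout status` in a small function. The function name wait_for_rollout is made up for this example, and the Deployment name external-dns is an assumption based on the Helm release name used above:

```shell
# Hypothetical helper: waits for a Deployment to finish rolling out.
# $1: namespace, $2: Deployment name, $3: optional timeout (default 120s).
wait_for_rollout() {
  kubectl -n "$1" rollout status "deployment/$2" --timeout="${3:-120s}"
}

# Usage (assumes the Deployment is named after the Helm release):
#   wait_for_rollout external-dns external-dns
#   kubectl -n external-dns logs deployment/external-dns
```

If the rollout times out, the logs of the Deployment are usually the first place to look, for instance for a missing or misnamed service account Secret.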

Getting started with Traefik

Before starting, make sure to follow the instructions in Prerequisites.

Support for the Gateway API was introduced in Traefik 2.4.8. Let us install Traefik:

helm repo add traefik --force-update https://helm.traefik.io/traefik
helm upgrade --install traefik traefik/traefik --namespace traefik --create-namespace \
    --set additionalArguments='{--providers.kubernetesingress,--providers.kubernetesingress.ingressendpoint.publishedservice=traefik/traefik,--experimental.kubernetesgateway=true,--providers.kubernetesgateway=true}' \
    --set ssl.enforced=true --set dashboard.ingressRoute=true

Let us configure the Service type=LoadBalancer created by Traefik with a DNS name:

kubectl annotate svc -n traefik traefik --overwrite "external-dns.alpha.kubernetes.io/hostname=traefik.$DOMAIN"

After some time, you should see:

# Check 1: the EXTERNAL-IP should have been set.
$ kubectl get svc -n traefik traefik
NAME      TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)
traefik   LoadBalancer   10.43.192.130   34.78.153.43   80:30137/TCP,443:30804/TCP

# Check 2: after a few minutes, you should be able to query the domain.
$ nslookup traefik.$DOMAIN
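
Since the DNS record can take a few minutes to appear, the second check can be turned into a polling loop. The following is a minimal sketch, not part of the original guide: the function name wait_for_dns is made up, and it uses `dig +short` instead of nslookup because its output is easier to compare (this assumes dig is installed on your machine):

```shell
# Hypothetical helper: polls until the host resolves to the expected IP.
# $1: host name, $2: expected IP,
# $3: maximum number of attempts (default 30), $4: delay in seconds (default 10).
wait_for_dns() {
  host=$1
  expected=$2
  attempts=${3:-30}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -x matches the whole line, -F treats the IP as a fixed string.
    if dig +short "$host" | grep -qxF "$expected"; then
      echo "$host resolves to $expected"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$host did not resolve to $expected after $attempts attempts" >&2
  return 1
}

# Usage:
#   LB_IP=$(kubectl get svc -n traefik traefik -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   wait_for_dns "traefik.$DOMAIN" "$LB_IP"
```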

As detailed on the page Traefik & Kubernetes, Traefik needs some extra RBAC rules:

kubectl apply -f- <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gateway-role
rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - networking.x-k8s.io
    resources:
      - gatewayclasses
      - gateways
      - httproutes
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - networking.x-k8s.io
    resources:
      - gatewayclasses/status
      - gateways/status
      - httproutes/status
    verbs:
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: gateway-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gateway-role
subjects:
  - kind: ServiceAccount
    name: traefik
    namespace: traefik
EOF

Let us install the Gateway API CRDs:

kubectl kustomize "github.com/kubernetes-sigs/gateway-api/config/crd?ref=v0.3.0" | kubectl apply -f -

The next step is to install cert-manager. The Gateway API has been supported since 1.5.0-beta.0:

helm repo add jetstack https://charts.jetstack.io
helm upgrade --install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace \
  --set installCRDs=true --set "extraArgs={--controllers=*\,gateway-shim}" --version v1.5.0-beta.0
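
The escaping in the extraArgs value is easy to get wrong, so it can help to check that the gateway-shim controller really ended up in the controller arguments. The sketch below is an illustration under assumptions: the function names are made up, and it assumes the Deployment is named cert-manager in the cert-manager namespace with the controller as its first container:

```shell
# Hypothetical helper: succeeds when the argument list read from stdin
# contains a --controllers flag that includes gateway-shim.
has_gateway_shim() {
  grep -q -e '--controllers=.*gateway-shim'
}

# Hypothetical helper: prints the arguments of the cert-manager controller.
# Assumes Deployment "cert-manager" in namespace "cert-manager".
controller_args() {
  kubectl -n cert-manager get deployment cert-manager \
    -o jsonpath='{.spec.template.spec.containers[0].args}'
}

# Usage:
#   controller_args | has_gateway_shim && echo "gateway-shim is enabled"
```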

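Now, we can create an ACME Issuer and two Gateways: one for solving HTTP-01 challenges and one for listening on port 443. Two Gateways are needed because of the Traefik bug described in "One faulty listener breaks the entire Gateway" below.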

kubectl apply -f- <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
spec:
  acme:
    email: your-email@gmail.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt
    solvers:
      - http01:
          gatewayHTTPRoute:
            labels:
              gateway: http01-solver-traefik
---
apiVersion: networking.x-k8s.io/v1alpha1
kind: GatewayClass
metadata:
  name: traefik
spec:
  controller: traefik.io/gateway-controller
---
apiVersion: networking.x-k8s.io/v1alpha1
kind: Gateway
metadata:
  name: http01-solver
spec:
  gatewayClassName: traefik
  listeners:
  - protocol: HTTP
    port: 8000
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: http01-solver-traefik
---
apiVersion: networking.x-k8s.io/v1alpha1
kind: Gateway
metadata:
  name: traefik
  annotations:
    cert-manager.io/issuer: letsencrypt
spec:
  gatewayClassName: traefik
  listeners:
  - hostname: traefik.$DOMAIN
    protocol: HTTPS
    port: 8443
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: traefik
    tls:
      mode: Terminate
      certificateRef:
        name: traefik-tls
        kind: Secret
        group: core
EOF

And finally, let us create a Deployment to test that out:

kubectl create deployment echoserver --image k8s.gcr.io/echoserver:1.3 --dry-run=client -oyaml | kubectl apply -f-
kubectl expose deployment echoserver --port=8080 --dry-run=client -oyaml | kubectl apply -f-
kubectl apply -f- <<EOF
apiVersion: networking.x-k8s.io/v1alpha1
kind: HTTPRoute
metadata:
  labels:
    gateway: traefik
  name: echoserver
spec:
  hostnames:
  - traefik.$DOMAIN
  rules:
  - forwardTo:
    - serviceName: echoserver
      port: 8080
EOF
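
To check the result end to end, one option is to wait for the certificate to become ready and then query the echoserver over HTTPS. This is a sketch under assumptions: the function name wait_for_certificate is made up, and it assumes that cert-manager names the Certificate after the Secret referenced by the Gateway (traefik-tls) and creates it in the Gateway's namespace (default here):

```shell
# Hypothetical helper: blocks until a Certificate has the Ready condition.
# $1: Certificate name, $2: namespace (default "default"),
# $3: timeout (default 5m).
wait_for_certificate() {
  kubectl -n "${2:-default}" wait --for=condition=Ready \
    "certificate/$1" --timeout="${3:-5m}"
}

# Usage (assumes the Certificate is named after the Secret):
#   wait_for_certificate traefik-tls && curl -sS "https://traefik.$DOMAIN/"
```

If the wait times out, `kubectl describe certificate traefik-tls` and the Traefik logs are the places to look. The known bugs listed further down cover the failures encountered while writing this guide.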

Getting started with Istio

Before starting, make sure to follow the instructions in Prerequisites.

To install Istio, we will be using istioctl. Install istioctl by following the Download Istio instructions.

You can then install Istio on your cluster. No extra flag is required to enable the Gateway API support:

istioctl install

Let us give a DNS name to the Service type=LoadBalancer created by Istio:

kubectl annotate svc -n istio-system istio-ingressgateway --overwrite "external-dns.alpha.kubernetes.io/hostname=istio.$DOMAIN"

After some time, you should see:

# Check 1: the EXTERNAL-IP should have been set.
$ kubectl get svc -n istio-system istio-ingressgateway
NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP                   PORT(S)                                      AGE
istio-ingressgateway   LoadBalancer   10.43.128.122   35.242.128.33,35.242.128.33   15021:30907/TCP,80:31160/TCP,443:31361/TCP   16m

# Check 2: after a few minutes, you should be able to query the domain.
$ nslookup istio.$DOMAIN

Let us install the Gateway API CRDs:

kubectl kustomize "github.com/kubernetes-sigs/gateway-api/config/crd?ref=v0.3.0" | kubectl apply -f -

The next step is to install cert-manager. The Gateway API has been supported since 1.5.0-beta.0:

helm repo add jetstack https://charts.jetstack.io
helm upgrade --install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace \
  --set installCRDs=true --set "extraArgs={--controllers=*\,gateway-shim}" --version v1.5.0-beta.0

We can create a single Gateway along with the Let's Encrypt Issuer:

kubectl apply -f- <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-using-istio
spec:
  acme:
    email: your-email@gmail.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt
    solvers:
      - http01:
          gatewayHTTPRoute:
            labels:
              gateway: istio
---
apiVersion: networking.x-k8s.io/v1alpha1
kind: GatewayClass
metadata:
  name: istio
spec:
  controller: istio.io/gateway-controller
---
apiVersion: networking.x-k8s.io/v1alpha1
kind: Gateway
metadata:
  name: istio
  annotations:
    cert-manager.io/issuer: letsencrypt-using-istio
spec:
  gatewayClassName: istio
  listeners:
  - protocol: HTTP
    port: 8000
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: istio
  - hostname: istio.$DOMAIN
    protocol: HTTPS
    port: 8443
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: istio
    tls:
      mode: Terminate
      certificateRef:
        name: istio-tls
        kind: Secret
        group: core
EOF

And finally, let us create a Deployment to test that out:

kubectl create deployment echoserver --image k8s.gcr.io/echoserver:1.3 --dry-run=client -oyaml | kubectl apply -f-
kubectl expose deployment echoserver --port=8080 --dry-run=client -oyaml | kubectl apply -f-
kubectl apply -f- <<EOF
apiVersion: networking.x-k8s.io/v1alpha1
kind: HTTPRoute
metadata:
  labels:
    gateway: istio
  name: echoserver
spec:
  hostnames:
  - istio.$DOMAIN
  rules:
  - forwardTo:
    - serviceName: echoserver
      port: 8080
EOF

Known bugs in implementations

[HAProxy Ingress] certificateRef.group cannot be set to core

As of haproxy-ingress v0.13.0-beta.2, haproxy-ingress expects certificateRef.group to be empty, but the Gateway API v1alpha1 CRD requires a non-empty group (issue haproxy-ingress#830).

This issue has been fixed for v1alpha2 in PR 562, but will not be back-ported to v1alpha1.

Until PR 833 is merged, the only workaround I know of is to manually remove the non-empty requirement from the CRD:

kubectl get crd gateways.networking.x-k8s.io -oyaml | grep -v -- '- group' | kubectl apply -f-

[Traefik] HTTPRoute and Gateway must be in the same namespace

As of Traefik 2.4.9, Traefik only watches for HTTPRoutes that are in the same namespace as the Gateway and does not honor from: All (issue traefik#8246).

For example, the following won't work:

apiVersion: networking.x-k8s.io/v1alpha1
kind: Gateway
metadata:
  name: traefik
  namespace: traefik # 🔥
  annotations:
    cert-manager.io/issuer: letsencrypt
spec:
  gatewayClassName: traefik
  listeners:
  - protocol: HTTP
    port: 8000
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: http01-solver-traefik
      namespaces:
        from: All

At this point, you would expect to be able to create a Certificate in any namespace with an ACME Issuer. Let's imagine that you have an ACME Issuer and Certificate in the namespace default:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt
  namespace: default
spec:
  acme:
    solvers:
      - http01:
          gatewayHTTPRoute:
            labels:
              gateway: http01-solver-traefik
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls
  namespace: default
spec:
  issuerRef:
    name: letsencrypt
  dnsNames:
  - example.com

cert-manager will create, as expected, an HTTPRoute in the namespace default:

kind: HTTPRoute
metadata:
  name: cm-acme-http-solver-gdhvg
  namespace: default
  labels:
    gateway: http01-solver-traefik
spec:
  gateways:
    allow: All
  hostnames:
  - example.com
  rules:
  - forwardTo:
    - port: 8089
      serviceName: cm-acme-http-solver-gdhvg
      weight: 1
    matches:
    - path:
        type: Exact
        value: /.well-known/acme-challenge/YadC4gaAzqEPU1Yea0D2MrzvNRWiBCtUizCtpiRQZqI

But Traefik won't do anything with it:

time="2021-08-05T16:28:32Z" level=error msg="an error occurred while creating gateway status: 1 error occurred:
Cannot fetch HTTPRoutes for namespace "traefik" and matchLabels map[gateway:http01-solver-traefik]
gateway=http01-solver-traefik namespace=traefik
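
To spot HTTPRoutes affected by this bug, you can list the solver routes across all namespaces and keep those that are not in the Gateway's namespace. The following is a rough sketch: the function names are made up for this example, and the label and the namespace traefik are the ones used in the manifests above:

```shell
# Hypothetical helper: reads "NAMESPACE NAME" pairs on stdin and prints, as
# namespace/name, the routes that are outside the given namespace.
# $1: namespace of the Gateway.
routes_outside_namespace() {
  awk -v ns="$1" '$1 != ns { print $1 "/" $2 }'
}

# Hypothetical helper: lists the HTTP-01 solver HTTPRoutes in all namespaces.
list_solver_routes() {
  kubectl get httproutes.networking.x-k8s.io --all-namespaces \
    -l gateway=http01-solver-traefik --no-headers \
    -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name
}

# Usage:
#   list_solver_routes | routes_outside_namespace traefik
```

Any route printed by this pipeline is one that Traefik 2.4.9 would ignore for a Gateway living in the traefik namespace.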

[Traefik] One faulty listener breaks the entire Gateway

Traefik requires all listeners of a Gateway to be valid before configuring any of them. This means we can't use a single Gateway for both the HTTP-01 challenge listener (port 80) and the HTTPS listener (port 443) whose certificate is itself meant to be issued using the HTTP-01 challenge.

For example:

apiVersion: networking.x-k8s.io/v1alpha1
kind: Gateway
metadata:
  name: traefik
  annotations:
    cert-manager.io/issuer: letsencrypt
spec:
  gatewayClassName: traefik
  listeners:
  # ✅ This listener is valid as per Traefik.
  - protocol: HTTP
    port: 8000
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: http01-solver-traefik
  # ❌ This listener is invalid as per Traefik since the Secret can't be found.
  - hostname: traefik.mael-valais-gcp.jetstacker.net
    protocol: HTTPS
    port: 8443
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: traefik
    tls:
      mode: Terminate
      certificateRef:
        name: traefik-tls
        kind: Secret
        group: core

Even though the first listener is valid, none of them will be configured in Traefik; Traefik shows the following error:

An error occurred while creating gateway status: 1 error occurred: Error while retrieving certificate:
secret default/traefik-tls does not exist

Even if the Secret traefik-tls existed, cert-manager would remove the temporary HTTPRoute that it created for the challenge, which means the first listener (on port 80) would start erroring, preventing the second listener (on port 443) from being configured:

time="2021-08-05T16:28:32Z" level=error msg="an error occurred while creating gateway status: 1 error occurred:
Cannot fetch HTTPRoutes for namespace "default" and matchLabels map[gateway:http01-solver]

The only workaround is to create a separate Gateway to prevent one listener from crashing the other listeners:

apiVersion: networking.x-k8s.io/v1alpha1
kind: Gateway
metadata:
  name: http01-solver
  annotations:
    cert-manager.io/issuer: letsencrypt
spec:
  gatewayClassName: traefik
  listeners:
  - protocol: HTTP
    port: 8000
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: http01-solver-traefik
---
apiVersion: networking.x-k8s.io/v1alpha1
kind: Gateway
metadata:
  name: traefik
  annotations:
    cert-manager.io/issuer: letsencrypt
spec:
  gatewayClassName: traefik
  listeners:
  - hostname: traefik.mael-valais-gcp.jetstacker.net
    protocol: HTTPS
    port: 8443
    routes:
      kind: HTTPRoute
      selector:
        matchLabels:
          gateway: traefik
    tls:
      mode: Terminate
      certificateRef:
        name: traefik-tls
        kind: Secret
        group: core