
@nathancoleman
Last active February 3, 2023 22:38

Consul API Gateway + WAN Federation

Prerequisites

I have two Kubernetes clusters already created, both GKE clusters in the us-east1 region.

Create the Consul datacenters

We want to be certain that we're setting up Consul with federation enabled. For this, I'm following the WAN Federation Between Multiple Kubernetes Clusters Through Mesh Gateways guide.

Notably, we want to install Consul using Helm chart v0.48.0 in our primary datacenter and v0.49.0 in our secondary datacenter. For context, I'm using kubectx + kubens (invoked as kctx and kns below) to manage my K8s contexts and namespaces.

Primary datacenter (dc1)

$ kctx dc1
Switched to context "dc1".

$ kubectl apply --kustomize "github.com/hashicorp/consul-api-gateway/config/crd?ref=v0.4.0"
customresourcedefinition.apiextensions.k8s.io/gatewayclassconfigs.api-gateway.consul.hashicorp.com created
customresourcedefinition.apiextensions.k8s.io/gatewayclasses.gateway.networking.k8s.io created
...

$ helm upgrade --install --values ./values-dc1.yaml consul hashicorp/consul --version="0.48.0" --create-namespace --namespace consul
Release "consul" does not exist. Installing it now.
W1013 15:39:51.233871   98549 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W1013 15:39:52.640457   98549 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
NAME: consul
LAST DEPLOYED: Thu Oct 13 15:39:30 2022
NAMESPACE: consul
STATUS: deployed
REVISION: 1
NOTES:
Thank you for installing HashiCorp Consul!

Your release is named consul.
...

$ kns consul
Context "dc1" modified.
Active namespace is "consul".

$ kubectl apply -f ./proxydefaults.yaml
proxydefaults.consul.hashicorp.com/global created

$ kubectl get secret consul-federation --namespace consul --output yaml > consul-federation-secret.yaml
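
Before moving on, it can be worth checking that the exported file carries everything the secondary datacenter will ask for. The sketch below is not part of the guide: check_federation_secret is a hypothetical helper that looks for the five keys referenced in values-dc2.yaml, and it assumes the file is the plain `kubectl get secret --output yaml` dump, with each key indented two spaces under `data:`.

```shell
# Hypothetical helper (not from the guide): check that an exported
# consul-federation secret contains every key that values-dc2.yaml references.
# Assumes the YAML layout produced by `kubectl get secret --output yaml`,
# where each key sits two spaces under `data:`.
check_federation_secret() {
  secret_file="$1"
  missing=0
  for key in caCert caKey replicationToken gossipEncryptionKey serverConfigJSON; do
    if ! grep -q "^  ${key}:" "$secret_file"; then
      echo "missing key: ${key}"
      missing=1
    fi
  done
  # Exit status 0 only when all five keys were found.
  return "$missing"
}

# Example:
# check_federation_secret consul-federation-secret.yaml && echo "secret looks complete"
```

The check only looks at key names, not at whether the values are valid, so it catches a primary that was installed without createFederationSecret or createReplicationToken but not much more.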

Secondary datacenter (dc2)

First, get the Kubernetes API URL of the dc2 cluster and paste it into values-dc2.yaml as the value of global.federation.k8sAuthMethodHost.

$ kctx dc2
Switched to context "dc2".

# Copy the API URL for use in values-dc2.yaml
$ export CLUSTER=$(kubectl config view -o jsonpath="{.contexts[?(@.name == \"$(kubectl config current-context)\")].context.cluster}")
$ kubectl config view -o jsonpath="{.clusters[?(@.name == \"$CLUSTER\")].cluster.server}" | pbcopy
# <PASTE API URL INTO values-dc2.yaml>
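
pbcopy only exists on macOS, and pasting by hand is easy to get wrong. As a rough alternative, and assuming values-dc2.yaml still contains the literal <kubernetes-api-url-of-secondary> placeholder, something like the following could fill it in. fill_k8s_auth_host and the output file name values-dc2.filled.yaml are hypothetical names used only for this sketch.

```shell
# Hypothetical helper: print a values file with the k8sAuthMethodHost
# placeholder replaced by the given API URL. The original file is not modified.
# Assumes the URL contains no "|" or "&" characters, which sed would treat specially.
fill_k8s_auth_host() {
  api_url="$1"
  values_file="$2"
  sed "s|<kubernetes-api-url-of-secondary>|${api_url}|" "$values_file"
}

# Example (requires kubectl and the $CLUSTER variable exported above):
# fill_k8s_auth_host \
#   "$(kubectl config view -o jsonpath="{.clusters[?(@.name == \"$CLUSTER\")].cluster.server}")" \
#   values-dc2.yaml > values-dc2.filled.yaml
```

If you go this route, pass the filled file to helm via --values instead of values-dc2.yaml.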

$ kubectl create namespace consul
namespace/consul created

$ kns consul
Context "dc2" modified.
Active namespace is "consul".

$ kubectl apply -f ./consul-federation-secret.yaml
secret/consul-federation created

$ kubectl apply --kustomize "github.com/hashicorp/consul-api-gateway/config/crd?ref=v0.4.0"
customresourcedefinition.apiextensions.k8s.io/gatewayclassconfigs.api-gateway.consul.hashicorp.com created
customresourcedefinition.apiextensions.k8s.io/gatewayclasses.gateway.networking.k8s.io created
...

$ helm upgrade --install --values ./values-dc2.yaml consul hashicorp/consul --version="0.49.0" --create-namespace --namespace consul
Release "consul" does not exist. Installing it now.
W1013 15:52:53.029229   99674 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W1013 15:52:54.302580   99674 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
NAME: consul
LAST DEPLOYED: Thu Oct 13 15:52:30 2022
NAMESPACE: consul
STATUS: deployed
REVISION: 1
NOTES:
Thank you for installing HashiCorp Consul!

Your release is named consul.
...

$ kubectl apply -f proxydefaults.yaml
proxydefaults.consul.hashicorp.com/global created

# Verify federation was successful
$ kubectl exec statefulset/consul-server --namespace consul -- consul members -wan
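
The output of that command isn't shown here. As a sketch of what to look for: federation has worked when the WAN member list contains an alive server from each datacenter. The helper below is hypothetical and assumes the usual `consul members -wan` table layout, with the node name (suffixed with .dc1 or .dc2) in the first column and the status in the third.

```shell
# Hypothetical helper: read `consul members -wan` output on stdin and succeed
# only if an alive member whose node name ends in ".<datacenter>" is listed.
# Assumes column 1 is the node name and column 3 is the status.
wan_has_alive_server() {
  awk -v suffix=".$1" '
    $3 == "alive" && length($1) > length(suffix) &&
      substr($1, length($1) - length(suffix) + 1) == suffix { found = 1 }
    END { exit !found }
  '
}

# Example (requires kubectl access to dc2):
# members=$(kubectl exec statefulset/consul-server --namespace consul -- consul members -wan)
# echo "$members" | wan_has_alive_server dc1 && echo "$members" | wan_has_alive_server dc2 \
#   && echo "both datacenters are visible over the WAN"
```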

Create Service, Gateway, and Route in secondary datacenter (dc2)

$ kubectl apply -f gateway.yaml
gateway.gateway.networking.k8s.io/api-gateway created

$ kubectl apply -f service.yaml
servicedefaults.consul.hashicorp.com/echo-1 created
service/echo-1 created
serviceaccount/echo-1 created
deployment.apps/echo-1 created
referencegrant.gateway.networking.k8s.io/example-reference-grant created

$ kubectl apply -f route.yaml
httproute.gateway.networking.k8s.io/echo-1 created

Verify

$ export GATEWAY_IP=$(kubectl get gateway api-gateway -o json | jq -r '.status.addresses | first | .value')
$ curl --insecure https://$GATEWAY_IP:8443
{
 "path": "/",
 "host": "34.75.90.220:8443",
 "method": "GET",
 ...
 "namespace": "default",
 "ingress": "",
 "service": "echo-1",
 "pod": "echo-1-bcbf544f-6n724"
}
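
To turn that eyeball check into something scriptable, one option is to pull the service field out of the echo response and compare it with the expected backend. This is a dependency-free sketch using sed; response_service is a hypothetical helper, and it assumes the field appears on one line as "service": "..." like in the output above. For arbitrary JSON, jq would be the more robust choice.

```shell
# Hypothetical helper: print the value of the "service" field from the echo
# server's JSON response read on stdin. Assumes the field appears once, as
# "service": "<name>" on a single line, as in the output above.
response_service() {
  sed -n 's/.*"service": *"\([^"]*\)".*/\1/p'
}

# Example (requires the $GATEWAY_IP variable exported above):
# [ "$(curl --silent --insecure "https://$GATEWAY_IP:8443" | response_service)" = "echo-1" ] \
#   && echo "route reaches echo-1"
```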

Kubernetes config dump

$ kubectl get deployment consul-api-gateway-controller -o yaml | yq '.spec.template.spec.initContainers[1]'
command:
  - /bin/sh
  - -ec
  - |
    consul-k8s-control-plane acl-init \
      -component-name=api-gateway-controller \
      -acl-auth-method=consul-k8s-component-auth-method-dc2 \
      -primary-datacenter=dc1 \
      -consul-api-timeout=5s \
      -log-level=info \
      -log-json=false
env:
  - name: HOST_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.hostIP
  - name: CONSUL_CACERT
    value: /consul/tls/ca/tls.crt
  - name: CONSUL_HTTP_ADDR
    value: https://$(HOST_IP):8501
image: hashicorp/consul-k8s-control-plane:0.49.0
imagePullPolicy: IfNotPresent
name: api-gateway-controller-acl-init
resources:
  limits:
    cpu: 50m
    memory: 25Mi
  requests:
    cpu: 50m
    memory: 25Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
  - mountPath: /consul/login
    name: consul-data
  - mountPath: /consul/tls/ca
    name: consul-ca-cert
    readOnly: true
    


$ kubectl get deployment consul-api-gateway-controller -o yaml | yq '.spec.template.spec.containers[0]'
command:
  - /bin/sh
  - -ec
  - |
    consul-api-gateway server \
      -sds-server-host consul-api-gateway-controller.consul.svc \
      -k8s-namespace consul \
      -primary-datacenter=dc1 \
      -log-level info \
      -log-json=false
env:
  - name: CONSUL_CACERT
    value: /consul/tls/ca/tls.crt
  - name: HOST_IP
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: status.hostIP
  - name: CONSUL_HTTP_TOKEN_FILE
    value: /consul/login/acl-token
  - name: CONSUL_HTTP_ADDR
    value: https://$(HOST_IP):8501
image: hashicorppreview/consul-api-gateway:0.5-dev
imagePullPolicy: IfNotPresent
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -ec
        - /consul-bin/consul logout
name: api-gateway-controller
ports:
  - containerPort: 9090
    name: sds
    protocol: TCP
resources:
  limits:
    cpu: 100m
    memory: 100Mi
  requests:
    cpu: 100m
    memory: 100Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
  - mountPath: /consul-bin
    name: consul-bin
  - mountPath: /consul/tls/ca
    name: consul-ca-cert
    readOnly: true
  - mountPath: /consul/login
    name: consul-data
    readOnly: true

Consul config dump

Roles

[Two screenshots of the Consul roles]

Policies

[Two screenshots of the Consul policies]

gateway.yaml

# Sourced from https://github.com/hashicorp/learn-consul-kubernetes/blob/main/api-gateway/local/api-gw/consul-api-gateway.yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: api-gateway
  namespace: consul
spec:
  gatewayClassName: consul-api-gateway
  listeners:
    - protocol: HTTPS
      port: 8443
      name: https
      allowedRoutes:
        namespaces:
          from: Same
      tls:
        certificateRefs:
          - name: consul-server-cert

proxydefaults.yaml

# Sourced from https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/multi-cluster/kubernetes#proxydefaults
apiVersion: consul.hashicorp.com/v1alpha1
kind: ProxyDefaults
metadata:
  name: global
spec:
  meshGateway:
    mode: 'local'

route.yaml

# Derived from https://github.com/hashicorp/learn-consul-kubernetes/blob/main/api-gateway/local/api-gw/routes.yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: echo-1
  namespace: consul
spec:
  parentRefs:
    - name: api-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - kind: Service
          name: echo-1
          namespace: default
          port: 8080

service.yaml

# Sourced from https://github.com/hashicorp/learn-consul-kubernetes/blob/main/api-gateway/local/two-services/echo-1.yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceDefaults
metadata:
  name: echo-1
  namespace: default
spec:
  protocol: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: echo-1
  name: echo-1
  namespace: default
spec:
  ports:
    - port: 8080
      name: high
      protocol: TCP
      targetPort: 3000
  selector:
    app: echo-1
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: echo-1
  namespace: default
automountServiceAccountToken: true
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: echo-1
  name: echo-1
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo-1
  template:
    metadata:
      labels:
        app: echo-1
      annotations:
        'consul.hashicorp.com/connect-inject': 'true'
    spec:
      serviceAccountName: echo-1
      containers:
        - image: k8s.gcr.io/ingressconformance/echoserver:v0.0.1
          name: echo-1
          env:
            - name: SERVICE_NAME
              value: echo-1
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - containerPort: 3000
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: ReferenceGrant
metadata:
  name: example-reference-grant
  namespace: default
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: consul # Must match the namespace that api-gw/routes.yaml is deployed into
  to:
    - group: ""
      kind: Service

values-dc1.yaml

# Sourced from https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/multi-cluster/kubernetes#primary-datacenter
# Only modified to enable Consul API Gateway
global:
  name: consul
  datacenter: dc1
  image: hashicorp/consul:1.13.2
  # TLS configures whether Consul components use TLS.
  tls:
    # TLS must be enabled for federation in Kubernetes.
    enabled: true
  federation:
    enabled: true
    # This will cause a Kubernetes secret to be created that
    # can be imported by secondary datacenters to configure them
    # for federation.
    createFederationSecret: true
  acls:
    manageSystemACLs: true
    # If ACLs are enabled, we must create a token for secondary
    # datacenters to replicate ACLs.
    createReplicationToken: true
  # Gossip encryption secures the protocol Consul uses to quickly
  # discover new nodes and detect failure.
  gossipEncryption:
    autoGenerate: true
connectInject:
  # Consul Connect service mesh must be enabled for federation.
  enabled: true
controller:
  enabled: true
meshGateway:
  # Mesh gateways are gateways between datacenters. They must be enabled
  # for federation in Kubernetes since the communication between datacenters
  # goes through the mesh gateways.
  enabled: true
# Enable Consul API Gateway
apiGateway:
  enabled: true
  image: hashicorp/consul-api-gateway:0.4.0

values-dc2.yaml

# Sourced from https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/multi-cluster/kubernetes#secondary-cluster-s
# Only modified to enable Consul API Gateway
global:
  name: consul
  datacenter: dc2
  image: hashicorp/consul:1.13.2
  tls:
    enabled: true
    # Here we're using the shared certificate authority from the primary
    # datacenter that was exported via the federation secret.
    caCert:
      secretName: consul-federation
      secretKey: caCert
    caKey:
      secretName: consul-federation
      secretKey: caKey
  acls:
    manageSystemACLs: true
    # Here we're importing the replication token that was
    # exported from the primary via the federation secret.
    replicationToken:
      secretName: consul-federation
      secretKey: replicationToken
  federation:
    enabled: true
    k8sAuthMethodHost: <kubernetes-api-url-of-secondary>
    primaryDatacenter: dc1
  gossipEncryption:
    secretName: consul-federation
    secretKey: gossipEncryptionKey
connectInject:
  enabled: true
controller:
  enabled: true
meshGateway:
  enabled: true
server:
  # Here we're including the server config exported from the primary
  # via the federation secret. This config includes the addresses of
  # the primary datacenter's mesh gateways so Consul can begin federation.
  extraVolumes:
    - type: secret
      name: consul-federation
      items:
        - key: serverConfigJSON
          path: config.json
      load: true
# Enable Consul API Gateway
apiGateway:
  enabled: true
  image: hashicorppreview/consul-api-gateway:0.5-dev

@manobi

manobi commented Oct 14, 2022

I thought proxydefaults was a global resource; I only have it in the primary datacenter.

@nathancoleman
Author

It is global; you should be fine creating it only in your primary datacenter.

@manobi

manobi commented Oct 14, 2022

My api-gateway-controller has the exact same rules. I'm almost certain that the problem is caused by a race condition during bootstrapping, due to the size of my deployment.

@codex70

codex70 commented Oct 31, 2022

@nathancoleman, I've finally had time to test this. Everything appears to be working correctly and all status details look fine. The API gateway works in the first datacenter, but when I curl the API gateway in the second datacenter I get the following:

curl -k --header "Host: service.host" "https://${API}:8443/" -v
*   Trying X.X.X.X...
* TCP_NODELAY set
* Connected to X.X.X.X (X.X.X.X) port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to X.X.X.X:8443
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to X.X.X.X:8443

If I telnet to the gateway in the first datacenter, I can connect to the server. In the second datacenter, the connection opens and is then closed:

telnet X.X.X.X 8443
Trying X.X.X.X...
Connected to X.X.X.X.
Escape character is '^]'.
Connection closed by foreign host.

The network configurations are the same for both data centers, so I'm really not sure where the problem is. There's nothing useful in the logs for the API gateway.

@nathancoleman
Author

@codex70 do you have your values.yaml handy? Curious what kind of service you were using for the Gateway and whatnot.

@codex70

codex70 commented Nov 2, 2022

@nathancoleman, I noticed that you had used the same gateway.yaml file; previously I had a different name for the API gateway in the second datacenter. When I followed your example above and used the same configuration in both datacenters, it worked as expected.

Once again, thank you for all your help on this, I have now closed the related issue.
