@janeczku · Created June 10, 2020
How to register Rancher managed Kubernetes clusters in Argo CD

Registering Rancher managed clusters in Argo CD doesn't work out of the box unless the Authorized Cluster Endpoint is used. Many users will prefer to integrate Argo CD via the central Rancher authentication proxy (which shares the network endpoint of the Rancher API/GUI). So let's find out why registering clusters via the Rancher auth proxy fails and how to make it work.

Hint: If you are just looking for the solution, scroll to the bottom of this page.

Why do I get an error when running argocd cluster add?

Service Account tokens and the Rancher authentication proxy

Registering external clusters to an Argo CD instance is normally accomplished by invoking the command-line tool argocd like this:

$ argocd cluster add <context-name>

Here, context-name references a context in the command-line user's kubeconfig file (by default ~/.kube/config).
Running this command with a context that points to a Rancher authentication proxy endpoint (typically a URL of the form https://<rancher-server-endpoint>/k8s/clusters/<cluster-id>) will result in the following error:

FATA[0001] rpc error: code = Unknown desc = REST config invalid: the server has asked for the client to provide credentials 

Ref: GH ticket

Before getting to the solution, let's understand why this happens.

By default, the kubeconfig files provided by Rancher specify the Rancher server network endpoint as the cluster API server endpoint. By doing so Rancher acts as an authentication proxy that validates the user identity and then proxies the request to the downstream cluster.

This generally has a number of advantages over having clients communicate directly with the downstream cluster API endpoint. One of them is that it provides a highly available Kubernetes API endpoint for all clusters under Rancher's management, sparing the ops team from having to maintain a failover/load-balancing mechanism for each cluster's API servers.
An alternative authentication method available in Rancher is the Authorized Cluster Endpoint, which allows requests to be authenticated directly by the downstream cluster's Kubernetes API server. See the documentation for details on these methods.

It is important to understand that only the Authorized Cluster Endpoint allows authentication based on Kubernetes service account tokens. The authentication proxy endpoint requires using a Rancher API token instead.
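For illustration, a kubeconfig context targeting the Rancher auth proxy looks roughly like this (a minimal sketch; the server URL, cluster ID, and token value are placeholders):

apiVersion: v1
kind: Config
clusters:
- name: my-cluster
  cluster:
    # The Rancher server endpoint, not the downstream API server
    server: https://rancher.example.com/k8s/clusters/c-abcde
users:
- name: my-user
  user:
    # A Rancher API token ("name:key" format), not a K8s service account token
    token: token-xxxxx:<secret-key>
contexts:
- name: my-cluster
  context:
    cluster: my-cluster
    user: my-user
current-context: my-cluster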

How can we still use the central Rancher auth endpoint to integrate Argo CD?

To summarize: in order to integrate Argo CD via the Rancher server network endpoint, we will need to set up Argo CD with a Rancher API token in lieu of a Kubernetes Service Account token.

For now this cannot be accomplished using the argocd command-line tool, because it doesn't let the user specify a pre-existing API credential or a custom kubeconfig.

Luckily, argocd is at its core just a Kubernetes client with syntactic sugar on top, and it does most things by interacting with Kubernetes (CRD) resources under the hood. So whatever $ argocd cluster add does, we should be able to do the same using kubectl and Kubernetes manifests.

According to the documentation the "cluster add" command does the following:

The above command installs a ServiceAccount (argocd-manager), into the kube-system namespace of that kubectl context, and binds the service account to an admin-level ClusterRole. Argo CD uses this service account token to perform its management tasks (i.e. deploy/monitoring).

And in another place we find:

To manage external clusters, Argo CD stores the credentials of the external cluster as a Kubernetes Secret in the argocd namespace. This secret contains the K8s API bearer token associated with the argocd-manager ServiceAccount created during argocd cluster add, along with connection options to that API server

Furthermore, the format of that cluster secret is described in detail on the same page, providing the following example:

apiVersion: v1
kind: Secret
metadata:
  name: mycluster-secret
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: mycluster.com
  server: https://mycluster.com
  config: |
    {
      "bearerToken": "<authentication token>",
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64 encoded certificate>"
      }
    }

Finally, the argocd CLI creates an RBAC role called argocd-manager-role, which by default grants cluster-admin privileges but can be narrowed down to implement a least-privilege concept.
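For reference, the objects that argocd cluster add creates on the target cluster correspond roughly to the following manifests (a sketch based on the documented behavior above; exact names and rules may vary between Argo CD versions):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-manager
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: argocd-manager-role
rules:
# Admin-level access by default; narrow down for least privilege
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: argocd-manager-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: argocd-manager-role
subjects:
- kind: ServiceAccount
  name: argocd-manager
  namespace: kube-system

In the Rancher auth proxy approach described below, none of these objects are needed on the downstream cluster; the Rancher user's cluster role takes their place.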

tl;dr: Here are the steps to register Rancher managed clusters using the central Rancher API endpoint

So it appears that all we need to do is replace the argocd cluster add command with the following steps:

  1. Create a local Rancher user account (e.g. service-argo)
  2. Create a Rancher API token for that user account, either by logging in and using the GUI (API & Keys -> Add Key) or by requesting the token via a direct invocation of the /v3/tokens API resource (see the sketch after this list).
  3. Authorize that user account in the cluster (GUI: Cluster -> Members -> Add) and assign the cluster-member role (the role should be narrowed down for production use later).
  4. Create a secret resource file (e.g. cluster-secret.yaml) based on the example above, providing a configuration reflecting the Rancher setup:
    • name: A named reference for the cluster, e.g. "prod".
    • server: The Rancher auth proxy endpoint for the cluster, in the format https://<rancher-server-endpoint>/k8s/clusters/<cluster-id>
    • config.bearerToken: The Rancher API token created above.
    • config.tlsClientConfig.caData: The base64-encoded PEM CA certificate of Rancher's SSL endpoint. Only needed if the server certificate is not signed by a publicly trusted CA (see the sketch after this list).
  5. Then apply the secret to the Argo CD namespace in the cluster where Argo CD is installed (by default argocd): $ kubectl apply -n argocd -f cluster-secret.yaml
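For steps 2 and 4 above, the token can also be requested from the Rancher API instead of the GUI, and the caData value is simply the base64 encoding of the PEM certificate. A minimal sketch, assuming the local auth provider and placeholder endpoints, credentials, and file names (payload fields may vary between Rancher versions):

# Log in as the service user to obtain a session token
$ curl -s -X POST \
    -H 'Content-Type: application/json' \
    -d '{"username":"service-argo","password":"<password>"}' \
    'https://<rancher-server-endpoint>/v3-public/localProviders/local?action=login'
# -> the response JSON contains a "token" field, e.g. "token-abcde:<key>"

# Use that token (basic auth) to create a long-lived API token for Argo CD
$ curl -s -X POST \
    -u 'token-abcde:<key>' \
    -H 'Content-Type: application/json' \
    -d '{"type":"token","description":"argocd","ttl":0}' \
    'https://<rancher-server-endpoint>/v3/tokens'

# Base64-encode the PEM CA certificate for the caData field
$ base64 -w0 rancher-ca.pem    # on macOS: base64 -i rancher-ca.pem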

Finally, check that the cluster has been successfully registered in Argo CD:

$ argocd cluster list
SERVER                                                     NAME     VERSION  STATUS      MESSAGE
https://xxx.rancher.xxx/k8s/clusters/c-br1xm               vsphere  1.17     Successful
https://kubernetes.default.svc                                               Successful
@sherkon18

Would you have to create a secret for each cluster you're managing with Rancher or just add a service-argo user to each cluster?

@pwurbs commented Jan 11, 2023

@janeczku Great work, thx a lot for this guide. It worked for me.
But I had to set the user's role to Cluster Owner; Cluster Member was not sufficient (could not list some objects like cron, replicationController). Later on I'll check how to narrow down the permissions.

@leliyin commented Jan 11, 2023

We are automating managed cluster registration in ArgoCD with the Rancher proxy using this guide: https://gist.github.com/janeczku/b16154194f7f03f772645303af8e9f80. We're able to automate this programmatically by creating the k8s secret for cluster registration in ArgoCD. There was one incident during testing where the cluster secret was created and the argocd CLI showed the cluster as registered, but with no 'Successful' status:

$ argocd cluster list
WARN[0000] Failed to invoke grpc call. Use flag --grpc-web in grpc calls. To avoid this warning message, use flag --grpc-web.
SERVER                                                             NAME        VERSION  STATUS  MESSAGE  PROJECT
https://rancher.default.172.18.0.231.nip.io/k8s/clusters/c-qmfns   demo
https://kubernetes.default.svc/                                    in-cluster

When trying to deploy the app in the UI, the cluster was populated so we could create the app, but syncing the app failed with the error 'https://rancher.default.172.18.0.231.nip.io/k8s/clusters/c-qmfns not configured'.
What resolved the issue was restarting all argocd pods. Has anyone run into this issue as well? I'm particularly concerned that the cluster secret was created and the cluster was populated in the UI, yet the argocd CLI not showing a 'Successful' status seemed to indicate an error in cluster registration; this could mislead the user into creating the app in the UI, only for it to fail.

@leliyin commented Jan 24, 2023

@janeczku Great work, thx a lot for this guide. It worked for me. But I had to set the user's role to Cluster Owner; Cluster Member was not sufficient (could not list some objects like cron, replicationController). Later on I'll check how to narrow down the permissions.

+1. I also had to set the user's role to Cluster Owner, as the Cluster Member role was insufficient for the Rancher user to create deployments on the managed cluster.

@MaxAnderson95

For those here in 2023 not knowing what the "cluster id" is: it's the Mgmt Cluster name, which is more of a unique ID and less of a name. You can find it by going to Cluster Management > choosing your downstream cluster > Related Resources > finding the object under "Refers To" with the type Mgmt Cluster. The name should be something like c-m-abcdefghi. Use this in the URL to reference the cluster:

https://<rancher-server-endpoint>/k8s/clusters/c-m-abcdefghi
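If you have kubectl access to the Rancher management (local) cluster, a possible alternative is to read the ID from Rancher's internal management cluster objects (a sketch, assuming the clusters.management.cattle.io resource Rancher uses internally):

# The NAME column holds the cluster ID used in the proxy URL (e.g. c-m-abcdefghi);
# the human-readable name is stored in spec.displayName
$ kubectl get clusters.management.cattle.io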

@alvrebac commented Aug 10, 2023

Hey,
am I missing something?
I did all of the steps to add a new cluster, but every time I do a kubectl apply with the created secret, it only creates the secret and does nothing else.

@ggogel commented Jan 6, 2024

This guide worked for me using Rancher 2.8.0. Thanks!

However, I had to give the user service-argo the cluster owner role. Giving the cluster member role wasn't sufficient in my case:

Failed to load live state: failed to get cluster info for "...": error synchronizing cache state : failed to sync cluster ... failed to load initial state of resource ServiceAccount: serviceaccounts is forbidden: User "u-h8njl" cannot list resource "serviceaccounts" in API group "" at the cluster scope

@virtualb0x commented Mar 18, 2024

Can anyone please help me?

I've done everything the instructions say:

  1. Created a local user with the cluster owner role on both clusters (the one where ArgoCD is and the Rancher-managed cluster)
  2. Created an API token in the Rancher cluster for this user in the UI and pasted it into the secret
  3. Generated a secret:
apiVersion: v1
kind: Secret
metadata:
  name: test-secret
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: test
  server: https://kubernetes.tdp.corp/k8s/clusters/c-5tl4w
  config: |
    {
      "bearerToken": "token-vnkzp:glnw724zdg8s9r6h7mk6jtxpwbwq4cxkaefasefasefaeh54sb",
      "tlsClientConfig": {
        "insecure": false,
        "caData": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM0VENDQWNtZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFTTVJBd0RnWURWUVFERXdkcmR<data_ommited>gQ0VSVElGSUNBVEUtLS0tLQo="
      }
    }

But I get an error:
Unable to create application: error while validating and normalizing app: error validating the repo: error getting k8s server version: Get "https://kubernetes.tdp.corp/k8s/clusters/c-5tl4w/version?timeout=32s": x509: certificate signed by unknown authority

I found argoproj/argo-cd#3945 describing this case, but I really do not understand it. I just copy-pasted the base64-encoded cert from the ~/.kube/config which I downloaded from the cluster where Rancher is, for the local user I created.

When I decode it I see:

-----BEGIN CERTIFICATE-----
MIIC4TCCAcmgAwIBAgIBADANBgkqhkiG9w0BAQsFADASMRAwDgYDVQQDEwdrdWJl
<....>
js5Q9L4Lato1WfwcovXbv7o0IuoKpZXQRDevfWY3dHiZzhc+KNMNzxLg39oZ/Kjh
+6T7C54MoGzjvgLsJug0gQEvcE/D
-----END CERTIFICATE-----

If I set "insecure": true and do not paste any caData, everything is OK.
What am I doing wrong?
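A common cause is that the caData taken from a kubeconfig is the downstream cluster's CA rather than the CA actually serving the Rancher endpoint. One way to check, sketched with the hostname from the error above:

# Inspect the certificate chain actually presented by the Rancher endpoint
$ openssl s_client -connect kubernetes.tdp.corp:443 -showcerts </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -subject

# If the issuer differs from the certificate pasted into caData, extract the
# presented chain and base64-encode that instead
$ openssl s_client -connect kubernetes.tdp.corp:443 -showcerts </dev/null 2>/dev/null \
    | awk '/BEGIN CERTIFICATE/,/END CERTIFICATE/' > rancher-chain.pem
$ base64 -w0 rancher-chain.pem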
