Notes for importing managed clusters into Cluster API

WARNING: This hasn't been tested extensively outside of my environment. Your mileage may vary.

Assumptions:

  • Any security group creation or modification that CAPA makes that isn't specifically flagged below is acceptable, even if it causes a brief disruption when the change is applied
  • This is valid as of CAPA 2.0.2 and may not work with newer versions (the steps were different pre-2.x; for example, it was easier to import even the VPC itself pre-2.x)

Importing CAPA Cluster (using BYO VPC):

  • Make sure AWSManagedControlPlane.spec.eksClusterName matches the EKS cluster name
  • Optionally set AWSManagedControlPlane.spec.network.securityGroupOverrides.controlplane to match the security group you have on the EKS control plane. If you have extra security groups, I haven't been able to figure out how to import those into CAPA, but they stay attached to the EKS cluster and are simply ignored by CAPA (see the example manifest below for how these fields fit together)
  • Set the VPC information according to the BYO VPC specs https://cluster-api-aws.sigs.k8s.io/topics/bring-your-own-aws-infrastructure.html#configuring-the-awscluster-specification
  • Determine if you need to set AWSManagedControlPlane.spec.vpcCni.disabled based on what you have installed on your cluster
  • Make sure the AWS resources have the required tags according to https://cluster-api-aws.sigs.k8s.io/topics/bring-your-own-aws-infrastructure.html#tagging-aws-resources
    • Set tag kubernetes.io/cluster/<clusterName> = owned or shared (as appropriate) on the VPC, Subnet, & Route Table resources
    • Set tag kubernetes.io/cluster/<clusterName> = owned on the EKS cluster
    • Set tags kubernetes.io/role/internal-elb & kubernetes.io/role/elb on the appropriate Subnets
    • Set tag sigs.k8s.io/cluster-api-provider-aws/cluster/<clusterName> = owned on the EKS cluster
    • Set tag sigs.k8s.io/cluster-api-provider-aws/role = common on the EKS cluster
  • Make sure that the credentials/IAM Role that CAPA runs as will have access to the EKS cluster to manage things like CNI and/or iamAuthenticatorConfig (via the aws-auth ConfigMap)
  • If you have an OIDC provider attached, you'll need to detach it before applying the yaml manifest, or set AWSManagedControlPlane.spec.associateOIDCProvider: false (I haven't been able to figure out why CAPA doesn't detect that it's already attached)

Caution: If you are running kube-proxy via your legacy code/install and set AWSManagedControlPlane.spec.kubeProxy.disabled to true, CAPA will uninstall the kube-proxy DaemonSet
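
To make the field paths above concrete, here is a minimal sketch of an AWSManagedControlPlane for an import. It assumes the CAPA 2.x (v1beta2) API; every value is a placeholder, the field paths are copied from the notes above, and you should verify the exact names against the CRDs your CAPA version installs (and pair this with the usual Cluster object per the CAPA docs).

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta2   # CAPA 2.x API group/version
kind: AWSManagedControlPlane
metadata:
  name: my-cluster-control-plane            # placeholder
  namespace: default
spec:
  eksClusterName: my-existing-eks-cluster   # must match the EKS cluster name
  region: us-east-1                         # placeholder
  network:
    vpc:
      id: vpc-0123456789abcdef0             # existing (BYO) VPC
    subnets:
      - id: subnet-0123456789abcdef0        # existing subnets
      - id: subnet-0fedcba9876543210
    securityGroupOverrides:
      controlplane: sg-0123456789abcdef0    # optional: the SG already on the EKS control plane
  vpcCni:
    disabled: false        # set based on what you have installed on the cluster
  kubeProxy:
    disabled: false        # true uninstalls an existing kube-proxy DaemonSet (see caution above)
  associateOIDCProvider: false   # keep false if an OIDC provider is already attached
```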

At this point you are running/managing the EKS cluster via CAPA but the compute nodes are still running/connected using the non-CAPA system.

Migrating the workloads to CAPA managed compute tiers:

  • Create new compute tiers using MachineDeployment or AWSManagedMachinePool and size them appropriately (a minimal sketch follows this list)
  • Cordon the old compute tiers
  • If using AutoScalingGroups, add the tag k8s.io/cluster-autoscaler/node-template/taint/managed-by = legacy:NoSchedule to the ASGs (or whatever taint you want to use to tell the cluster-autoscaler that the old nodes will have a taint)
  • Taint the old compute tiers with the above taint (this will ensure the cluster-autoscaler knows that any nodes from these ASGs will have the taint when started so it won't try to scale them up)
  • Drain the old compute tier nodes
  • You may be able to rely on the cluster-autoscaler to automatically delete/remove the old nodes but if not, remove them and terminate the instances
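
A minimal sketch of a MachinePool + AWSManagedMachinePool pair for the new compute tier, in the CAPA 2.x EKS managed machine pool style; the names, replica counts, and scaling bounds are placeholders, and the exact fields should be checked against the CAPA docs/CRDs for your version:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: my-cluster-pool-0          # placeholder name
spec:
  clusterName: my-cluster          # must match your Cluster object's name
  replicas: 3
  template:
    spec:
      clusterName: my-cluster
      bootstrap:
        dataSecretName: ""         # EKS managed node groups handle node bootstrap
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSManagedMachinePool
        name: my-cluster-pool-0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSManagedMachinePool
metadata:
  name: my-cluster-pool-0
spec:
  scaling:                         # size appropriately for the workloads you're migrating
    minSize: 1
    maxSize: 10
```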

Now all compute nodes are managed via CAPA

Importing CAPZ cluster (using BYO VNET):

  • Make sure AzureManagedControlPlane.metadata.name matches the AKS cluster name
  • Set the AzureManagedControlPlane.spec.virtualNetwork fields to match your existing VNET
  • Make sure the AzureManagedControlPlane.spec.sshPublicKey matches what was set on the AKS cluster (including any potential newlines included in the base64 encoding; this was a big gotcha for me)
    • NOTE: This is a required field in CAPZ. If you don't know what public key was used, however, you can change or set it via the Azure CLI before attempting to import the cluster.
  • Make sure the Cluster.spec.clusterNetwork settings match what you are using in AKS (see the sketch after this list)
  • Make sure the AzureManagedControlPlane.spec.dnsServiceIP matches what is set in AKS
  • Set the tag sigs.k8s.io_cluster-api-provider-azure_cluster_<clusterName> = owned on the AKS cluster
  • Set the tag sigs.k8s.io_cluster-api-provider-azure_role = common on the AKS cluster
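
A minimal sketch of the CAPZ side, showing where the fields above live; names, CIDRs, and versions are placeholders, other required fields (e.g., subscriptionID, identityRef) are omitted, and the exact field names should be verified against the CAPZ version you run:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-aks-cluster                 # placeholder
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.0.0.0/16"]      # must match the AKS service CIDR
  controlPlaneRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AzureManagedControlPlane
    name: my-aks-cluster
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AzureManagedCluster
    name: my-aks-cluster
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedControlPlane
metadata:
  name: my-aks-cluster                 # must match the AKS cluster name
spec:
  location: eastus                     # placeholder
  resourceGroupName: my-resource-group # placeholder
  version: v1.27.3                     # placeholder; match the AKS Kubernetes version
  sshPublicKey: <base64-encoded-key>   # must match the AKS cluster (watch for stray newlines)
  dnsServiceIP: 10.0.0.10              # must match what is set in AKS
  virtualNetwork:
    name: my-existing-vnet             # existing (BYO) VNET
    cidrBlock: 10.1.0.0/16
    resourceGroup: my-vnet-rg          # if the VNET lives in a different resource group
    subnet:
      name: my-existing-subnet
      cidrBlock: 10.1.0.0/24
```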

NOTE: For several fields, like networkPlugin, if they were not set on the AKS cluster at creation time, CAPZ will not be able to set them during its reconcile loop because AKS doesn't allow them to be changed after creation. If a field was set at creation time, CAPZ will be able to change/manage it successfully.

At this point you can apply your yaml manifest and the AKS cluster will be imported as an AzureManagedControlPlane. The managed machine pools are still partially managed by your old system and partially managed by the global AKS settings that are now managed by AKS/CAPZ. I highly recommend setting up new AzureManagedMachinePool(s) as soon as possible, tainting & draining the old compute pools, and then removing them.
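
If you go that route, a minimal MachinePool + AzureManagedMachinePool pair might look roughly like this (a sketch under the same caveats as above; mode, SKU, and size values are placeholders):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: my-aks-cluster-pool-1          # placeholder
spec:
  clusterName: my-aks-cluster
  replicas: 3
  template:
    spec:
      clusterName: my-aks-cluster
      bootstrap:
        dataSecretName: ""             # AKS handles node bootstrap
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AzureManagedMachinePool
        name: my-aks-cluster-pool-1
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureManagedMachinePool
metadata:
  name: my-aks-cluster-pool-1
spec:
  mode: User                           # System pools host critical addons; keep at least one System pool
  sku: Standard_D4s_v3                 # placeholder VM size
  osDiskSizeGB: 128
```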