I am writing this article to document sizing classification data for RHACM 2.2 deployments.
We define three generic size classifications based on the number of clusters under management.
Then, we deploy RHACM 2.2 into each size classification and measure the system under the workload.
Today, we will keep the workload very generic.
The workload dimensions that we will use are listed below; a short sketch after the list shows one way to count them on the hub.
- number of managed clusters
- number of GRC policies applied to the managed clusters
- number of applications defined and applied to the managed clusters
- the Observability component enabled on the worker nodes
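To make these dimensions concrete, here is a minimal sketch that counts each of them on the hub cluster using the Python kubernetes client. The API groups and plurals below are the ones RHACM exposes for ManagedCluster, Policy, and Subscription objects, but treat them as assumptions and verify them against the CRDs installed on your hub; this is not part of the sizing tooling itself.

```python
# Minimal sketch: count the workload dimensions on the hub cluster.
# Assumes a kubeconfig pointing at the hub and the RHACM CRDs named below.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# (group, version, plural) for each workload dimension; verify these against
# the CRDs on your hub, since they can differ between RHACM releases.
RESOURCES = {
    "managed clusters": ("cluster.open-cluster-management.io", "v1", "managedclusters"),
    "grc policies": ("policy.open-cluster-management.io", "v1", "policies"),
    "app subscriptions": ("apps.open-cluster-management.io", "v1", "subscriptions"),
}

for label, (group, version, plural) in RESOURCES.items():
    # Listing at the cluster scope returns the objects from all namespaces.
    items = api.list_cluster_custom_object(group, version, plural)["items"]
    print(f"{label}: {len(items)}")
```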
This deployment consists of the core RHACM platform with Observability enabled. The hub cluster is running on AWS via an IPI (installer-provisioned infrastructure) deployment.
TSHIRT SIZE | cpu master | cpu worker | ram GB master | ram GB worker | pvc storage | # managed clusters | # grc policies | # app |
---|---|---|---|---|---|---|---|---|
POC/small | 12 core | 12 core | 48 GB | 48 GB | | <= 10 | 100s | 100s |
medium | 24 core | 72 core | 96 GB | 288 GB | 3212 Gi** | <= 50 | 100s | 100s |
large | 24 core | x | 96 GB | x | | <= 1000 | 100s | 100s |
Here, we show the results of testing at the small t-shirt size. The usage data is collected across the RHACM namespaces on the worker nodes (a sketch of how such numbers can be pulled from the hub's Prometheus follows the notes below the table).
label | peak cpu usage | mean cpu usage | peak mem usage | mean mem usage | pvc storage | # managed clusters | # grc policies | # applications |
---|---|---|---|---|---|---|---|---|
POC/small | 8.43 core | 3.81 core | 32.19 GB | 23.22 GB | 186.3 Gi | 10 | 500 | 500 |
medium*** | 26.40 core | 10.18 core | 283.34 GB | 154.54 GB | ** | 50 | 500 | 500 |
large | tbd | tbd | tbd | tbd | tbd | tbd | tbd | tbd |
- The Observability receiver PVC needed to be adjusted to handle data collection from the 10 managed clusters and the workload.
- 500 apps and policies were deployed. These are policies and apps that only create ConfigMap resources on the managed clusters, so the test is not bound by CPU on the target managed clusters (a sketch of such a policy generator follows these notes). Even at this load, CPU usage on the target managed clusters was between 50% and 70%.
- The PVC storage default for the Observability receiver is 10 GB. With this workload, it needed to be increased to 100 GB.
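The snippet below is a minimal sketch of how such a ConfigMap-creating policy set could be generated, not the exact harness used in this test: the names, namespaces, and data are illustrative, it assumes PyYAML is installed, and the PlacementRule/PlacementBinding objects needed to target the managed clusters are omitted for brevity.

```python
# Hypothetical generator for ConfigMap-creating policies used as load.
# Counts, names, and namespaces are illustrative, not the exact test harness.
import yaml

POLICY_NS = "default"   # hub namespace holding the policies (assumption)
TARGET_NS = "default"   # namespace on the managed clusters for the ConfigMaps (assumption)
COUNT = 500             # matches the 500-policy load in the table above


def make_policy(i: int) -> dict:
    """Build one Policy whose ConfigurationPolicy enforces a small ConfigMap."""
    return {
        "apiVersion": "policy.open-cluster-management.io/v1",
        "kind": "Policy",
        "metadata": {"name": f"cm-policy-{i}", "namespace": POLICY_NS},
        "spec": {
            "remediationAction": "enforce",
            "disabled": False,
            "policy-templates": [
                {
                    "objectDefinition": {
                        "apiVersion": "policy.open-cluster-management.io/v1",
                        "kind": "ConfigurationPolicy",
                        "metadata": {"name": f"cm-config-{i}"},
                        "spec": {
                            "remediationAction": "enforce",
                            "severity": "low",
                            "object-templates": [
                                {
                                    "complianceType": "musthave",
                                    "objectDefinition": {
                                        "apiVersion": "v1",
                                        "kind": "ConfigMap",
                                        "metadata": {
                                            "name": f"load-cm-{i}",
                                            "namespace": TARGET_NS,
                                        },
                                        "data": {"index": str(i)},
                                    },
                                }
                            ],
                        },
                    }
                }
            ],
        },
    }


if __name__ == "__main__":
    # Write all policies to a single multi-document YAML file.
    with open("configmap-policies.yaml", "w") as f:
        yaml.dump_all([make_policy(i) for i in range(COUNT)], f)
```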
** The actual PVC storage data was not collected for this run, so the claim size is listed as a reference. The conclusion is that RHACM with Observability enabled will be able to run on a small t-shirt size with this kind of workload.
*** This test run was conducted on a cluster with 6 worker nodes @ 8 CPU / 64 GB RAM each.
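For reference, usage numbers like the peak and mean CPU/memory in the table above can be pulled from the hub's monitoring stack. The following is a minimal sketch and not the tooling used for this run: it assumes a Prometheus (or Thanos) query endpoint reachable at PROM_URL with a bearer token, and it treats the RHACM namespaces as anything matching `open-cluster-management.*`.

```python
# Minimal sketch: compute peak and mean CPU/memory for the RHACM namespaces
# from a Prometheus-compatible endpoint. PROM_URL and PROM_TOKEN are assumptions.
import os
import time

import requests

PROM_URL = os.environ["PROM_URL"]   # hub Prometheus/Thanos query route (assumption)
TOKEN = os.environ["PROM_TOKEN"]    # bearer token allowed to query metrics (assumption)

# CPU (cores) and memory (bytes) summed over the RHACM namespaces.
QUERIES = {
    "cpu cores": 'sum(rate(container_cpu_usage_seconds_total{namespace=~"open-cluster-management.*"}[5m]))',
    "memory bytes": 'sum(container_memory_working_set_bytes{namespace=~"open-cluster-management.*"})',
}


def peak_and_mean(query: str, hours: int = 24, step: str = "5m"):
    """Run a range query over the last `hours` and reduce it to (peak, mean)."""
    end = time.time()
    start = end - hours * 3600
    resp = requests.get(
        f"{PROM_URL}/api/v1/query_range",
        params={"query": query, "start": start, "end": end, "step": step},
        headers={"Authorization": f"Bearer {TOKEN}"},
        verify=False,  # routes often use self-signed certs; tighten this for real use
    )
    resp.raise_for_status()
    samples = [float(v[1]) for series in resp.json()["data"]["result"] for v in series["values"]]
    return max(samples), sum(samples) / len(samples)


for name, query in QUERIES.items():
    peak, mean = peak_and_mean(query)
    print(f"{name}: peak={peak:.2f} mean={mean:.2f}")
```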
For now, I'll just include the screenshot.
I include here our data for network usage.
NOTE: The gap in the graph above is from updating the Observability component settings.