problem statement: scale a kubernetes delegate based on the number of tasks assigned to it, rather than on cpu or memory
what we need:
- a delegate deployed to a cluster
- a prometheus server running
- delegate metrics should be scraped by prometheus
- the prometheus adapter should be installed in the cluster
- this allows us to use prometheus metrics as scaling metrics
- validate that our prometheus metrics can be used by the cluster for scaling:
> kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/harness-delegate-ng/pods/*/io_harness_custom_metric_tasks_currently_executing" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "harness-delegate-ng",
        "name": "lab-5c9b7cf675-cvpw8",
        "apiVersion": "/v1"
      },
      "metricName": "io_harness_custom_metric_tasks_currently_executing",
      "timestamp": "2023-12-12T22:59:34Z",
      "value": "0",
      "selector": null
    }
  ]
}
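to read that output programmatically, here is a minimal sketch that turns a MetricValueList into a per-pod task count. the function names (`parse_quantity`, `tasks_per_pod`) are made up for illustration, and the quantity parser is an assumption that only plain numbers and the milli suffix (`500m`) show up for this metric; it is not a full kubernetes quantity parser.

```python
import json
from decimal import Decimal


def parse_quantity(q: str) -> Decimal:
    """Parse a small subset of Kubernetes quantities: plain numbers and the "m" (milli) suffix."""
    if q.endswith("m"):
        return Decimal(q[:-1]) / 1000
    # Any other suffix (k, M, Ki, ...) is not handled and raises decimal.InvalidOperation.
    return Decimal(q)


def tasks_per_pod(metric_list_json: str) -> dict:
    """Map each pod name in a MetricValueList to its current metric value."""
    doc = json.loads(metric_list_json)
    return {
        item["describedObject"]["name"]: parse_quantity(item["value"])
        for item in doc["items"]
    }


# Trimmed copy of the kubectl output shown above.
sample = """
{
  "kind": "MetricValueList",
  "items": [
    {
      "describedObject": {"kind": "Pod", "name": "lab-5c9b7cf675-cvpw8"},
      "metricName": "io_harness_custom_metric_tasks_currently_executing",
      "value": "0"
    }
  ]
}
"""

per_pod = tasks_per_pod(sample)
print(per_pod)
print("total tasks executing:", sum(per_pod.values()))
```

in practice the input would be the output of the `kubectl get --raw` command above, without the `jq` step.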
- limit your delegate to accepting a single task at a time
custom_envs:
  - name: DELEGATE_TASK_CAPACITY
    value: "1"
- create a HorizontalPodAutoscaler resource to scale the pods of your delegate based on the number of tasks currently executing
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lab-hpa
  namespace: harness-delegate-ng
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: lab
  minReplicas: 1
  maxReplicas: 15
  metrics:
    - type: Pods
      pods:
        metric:
          name: io_harness_custom_metric_tasks_currently_executing
        target:
          # Pods metrics are averaged across pods, so the target type is AverageValue
          type: AverageValue
          averageValue: 0.5
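to see what a 0.5 target means with a task capacity of 1, here is a rough sketch of the documented HPA calculation, desiredReplicas = ceil(currentReplicas * currentValue / targetValue), applied to the per-pod average. `desired_replicas` is a hypothetical helper, not part of any kubernetes client. it assumes the default 10% tolerance and ignores unready pods, missing metrics, and the scale-down stabilization window, so real behavior will lag behind these numbers.

```python
import math


def desired_replicas(task_counts, target_average, min_replicas=1, max_replicas=15, tolerance=0.1):
    """Approximate the HPA replica calculation for a Pods metric with an AverageValue target.

    task_counts holds one entry per running pod (tasks currently executing on that pod).
    """
    current = len(task_counts)
    ratio = (sum(task_counts) / current) / target_average
    if abs(ratio - 1.0) <= tolerance:
        desired = current  # close enough to the target: no scaling
    else:
        desired = math.ceil(current * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas([1], 0.5))       # 2: the only delegate is busy, add one
print(desired_replicas([1, 1], 0.5))    # 4: both busy, double again
print(desired_replicas([1, 0], 0.5))    # 2: average is exactly 0.5, stay put
print(desired_replicas([0, 0], 0.5))    # 1: idle, shrink to minReplicas
```

with one task per delegate, a 0.5 average means the autoscaler tries to keep roughly half the delegates idle, so the pod count doubles whenever every delegate is busy.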
experiments:
x5 tasks of 240s each:
while the 5/240 test was running, i realized that when a task starts it gets a list of delegates it can be picked up by, and that list never gains delegates that come online after the task started. so when all five tasks start at once, they all wait for the single delegate that was alive at that moment.
the test needs to be adjusted so that a second batch of tasks starts only after the new delegate is online.
once i see that a new delegate is connected, i trigger another batch of tasks. now new pods pick up tasks, the total number of concurrently running tasks increases, and we get some sort of "task based scaling"
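the "wait for a new delegate, then start the next batch" step could be scripted along these lines. this is only a sketch: `wait_for_more_replicas` and `ready_replicas_from_kubectl` are made-up helper names, and it assumes a Ready pod in the `lab` deployment is a good enough proxy for a delegate being connected, which may not hold exactly. how the second batch gets triggered is left out on purpose.

```python
import subprocess
import time


def ready_replicas_from_kubectl(deployment="lab", namespace="harness-delegate-ng"):
    """Return the deployment's readyReplicas via kubectl (the field is absent when it is 0)."""
    out = subprocess.run(
        ["kubectl", "get", "deployment", deployment, "-n", namespace,
         "-o", "jsonpath={.status.readyReplicas}"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return int(out or 0)


def wait_for_more_replicas(get_ready_replicas, baseline, timeout_s=600, interval_s=5,
                           sleep=time.sleep, clock=time.monotonic):
    """Poll until more than `baseline` replicas are ready, then return the new count."""
    deadline = clock() + timeout_s
    while True:
        ready = get_ready_replicas()
        if ready > baseline:
            return ready
        if clock() >= deadline:
            raise TimeoutError(f"still at {ready} ready replicas after {timeout_s}s")
        sleep(interval_s)
```

against a live cluster, calling `wait_for_more_replicas(ready_replicas_from_kubectl, baseline=1)` after starting the first batch would block until the autoscaler has brought up at least one more pod; the second batch can be started once it returns.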