Integrating Amazon Managed Service for Prometheus with Kubecost
Amazon Managed Service for Prometheus (AMP) is a Prometheus-compatible monitoring service that makes it easy to monitor containerized applications at scale. AMP is now generally available.
Integrating AMP with Kubecost follows a workflow that is similar to integrating Kubecost with any Custom Prometheus.
1. Set up Amazon Managed Service for Prometheus (AMP) and Kubecost
You should first have created an AMP workspace, started ingesting Prometheus metrics into it, and installed Kubecost.
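If you have not completed these prerequisites yet, the commands below sketch one possible path. They assume the AWS CLI v2 and Helm are installed; the workspace alias, region, and namespace names are illustrative placeholders:
# create an AMP workspace and note the workspaceId in the output
$ aws amp create-workspace --alias kubecost-amp --region us-east-1
# install a Prometheus server whose override_values.yaml is configured to
# remote write to the workspace's remote write URL
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm upgrade --install prometheus-for-amp prometheus-community/prometheus \
    --namespace prometheus --create-namespace -f ./override_values.yaml
# install Kubecost
$ helm repo add kubecost https://kubecost.github.io/cost-analyzer/
$ helm upgrade --install kubecost kubecost/cost-analyzer \
    --namespace kubecost --create-namespace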
2. Set up Kubecost to use your Amazon Managed Service for Prometheus (AMP)
First, download the values file and set prometheus.enabled to false, and prometheus.fqdn to the URL of your Prometheus service address, starting with http://. Note that this address is not your AMP remote write URL but your Prometheus cluster address, e.g. http://prometheus-server.prometheus.svc.cluster.local.
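For reference, the relevant section of the values file might look like the following sketch; in many chart versions these keys are nested under global, so match the structure of your downloaded file:
global:
  prometheus:
    enabled: false  # disable the bundled Prometheus install
    fqdn: http://prometheus-server.prometheus.svc.cluster.local  # your Prometheus service address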
Then, navigate to the directory of the file and apply the changes by running the following helm command:
# replace the namespace and filename if needed
$ helm upgrade --install kubecost kubecost/cost-analyzer --namespace kubecost -f ./values.yaml
Verify that Kubecost is using the Prometheus server configured to remote write metrics to AMP, rather than Kubecost's default Prometheus installation, by forwarding Kubecost to a local port and querying the /api endpoint:
# replace `kubecost` and `deployment/kubecost-cost-analyzer` if needed
$ kubectl port-forward --namespace kubecost deployment/kubecost-cost-analyzer 9090
# in another Terminal window
$ curl http://localhost:9090/api/
The first line returned by the curl command should contain your fqdn parameter URL, like so:
Using Prometheus at http://prometheus-server.prometheus.svc.cluster.local.
Seek help in our troubleshooting guide or reach out to us on Slack if you run into any issues.
3. Set up your Prometheus to scrape metrics from Kubecost
First, create a file called extra_scrape_configs.yaml with the following contents, replacing <your_kubecost_namespace> with your Kubecost namespace:
- job_name: kubecost
  honor_labels: true
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  dns_sd_configs:
    - names:
        - kubecost-cost-analyzer.<your_kubecost_namespace>
      type: "A"
      port: 9003
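Optionally, before wiring up the scrape job, you can spot-check that Kubecost's metrics endpoint is serving data on port 9003; this sketch assumes the default kubecost-cost-analyzer service name in the kubecost namespace:
# replace the namespace and service name if needed
$ kubectl port-forward --namespace kubecost service/kubecost-cost-analyzer 9003
# in another Terminal window
$ curl -s http://localhost:9003/metrics | head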
Then, add these recording rules as serverFiles to the Prometheus override_values.yaml override file. While they are optional, they offer improved performance.
serverFiles:
  rules:
    groups:
      - name: CPU
        rules:
          - expr: sum(rate(container_cpu_usage_seconds_total{container_name!=""}[5m]))
            record: cluster:cpu_usage:rate5m
          - expr: rate(container_cpu_usage_seconds_total{container_name!=""}[5m])
            record: cluster:cpu_usage_nosum:rate5m
          - expr: avg(irate(container_cpu_usage_seconds_total{container_name!="POD", container_name!=""}[5m])) by (container_name,pod_name,namespace)
            record: kubecost_container_cpu_usage_irate
          - expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""}) by (container_name,pod_name,namespace)
            record: kubecost_container_memory_working_set_bytes
          - expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""})
            record: kubecost_cluster_memory_working_set_bytes
      - name: Savings
        rules:
          - expr: sum(avg(kube_pod_owner{owner_kind!="DaemonSet"}) by (pod) * sum(container_cpu_allocation) by (pod))
            record: kubecost_savings_cpu_allocation
            labels:
              daemonset: "false"
          - expr: sum(avg(kube_pod_owner{owner_kind="DaemonSet"}) by (pod) * sum(container_cpu_allocation) by (pod)) / sum(kube_node_info)
            record: kubecost_savings_cpu_allocation
            labels:
              daemonset: "true"
          - expr: sum(avg(kube_pod_owner{owner_kind!="DaemonSet"}) by (pod) * sum(container_memory_allocation_bytes) by (pod))
            record: kubecost_savings_memory_allocation_bytes
            labels:
              daemonset: "false"
          - expr: sum(avg(kube_pod_owner{owner_kind="DaemonSet"}) by (pod) * sum(container_memory_allocation_bytes) by (pod)) / sum(kube_node_info)
            record: kubecost_savings_memory_allocation_bytes
            labels:
              daemonset: "true"
          - expr: label_replace(sum(kube_pod_status_phase{phase="Running",namespace!="kube-system"} > 0) by (pod, namespace), "pod_name", "$1", "pod", "(.+)")
            record: kubecost_savings_running_pods
          - expr: sum(rate(container_cpu_usage_seconds_total{container_name!="",container_name!="POD",instance!=""}[5m])) by (namespace, pod_name, container_name, instance)
            record: kubecost_savings_container_cpu_usage_seconds
          - expr: sum(container_memory_working_set_bytes{container_name!="",container_name!="POD",instance!=""}) by (namespace, pod_name, container_name, instance)
            record: kubecost_savings_container_memory_usage_bytes
          - expr: avg(sum(kube_pod_container_resource_requests{resource="cpu", unit="core", namespace!="kube-system"}) by (pod, namespace, instance)) by (pod, namespace)
            record: kubecost_savings_pod_requests_cpu_cores
          - expr: avg(sum(kube_pod_container_resource_requests{resource="memory", unit="byte", namespace!="kube-system"}) by (pod, namespace, instance)) by (pod, namespace)
            record: kubecost_savings_pod_requests_memory_bytes
Lastly, apply the changes:
$ helm upgrade --install prometheus-for-amp prometheus-community/prometheus -n prometheus -f ./override_values.yaml \
--set-file extraScrapeConfigs=extra_scrape_configs.yaml
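You can also confirm that the new scrape job was picked up by querying the Prometheus targets API; this sketch assumes the prometheus-for-amp release above, whose server deployment is conventionally named prometheus-for-amp-server:
# replace the namespace and deployment name if needed
$ kubectl port-forward --namespace prometheus deployment/prometheus-for-amp-server 9091:9090
# in another Terminal window: the kubecost job should appear among the active targets
$ curl -s http://localhost:9091/api/v1/targets | grep kubecost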
To check that the rules were applied successfully, run Kubecost locally and check that the Prometheus Status section on the Settings page indicates Kubecost recording rules available in Prometheus and Kubecost cost-model metrics available in Prometheus.
# replace `kubecost` and `deployment/kubecost-cost-analyzer` if needed
$ kubectl port-forward --namespace kubecost deployment/kubecost-cost-analyzer 9090
Et voilà!
To verify that the integration is set up, check that the Prometheus Status section on the Kubecost Settings page does not contain any errors.
Have a look at the Custom Prometheus integration troubleshooting guide if you run into any errors while setting up the integration. You're also welcome to reach out to us on Slack if you require further assistance.