@elenalape
Last active October 18, 2021 13:50

Integrating Amazon Managed Service for Prometheus with Kubecost

Amazon Managed Service for Prometheus (AMP) is a Prometheus-compatible monitoring service that makes it easy to monitor containerized applications at scale. AMP is currently available in Public Preview mode.

Integrating AMP with Kubecost follows a workflow that is similar to integrating Kubecost with any Custom Prometheus.

1. Set up Amazon Managed Service for Prometheus (AMP) and Kubecost

Before you begin, you should have created an AMP workspace, be ingesting Prometheus metrics into it, and have installed Kubecost.

2. Set up Kubecost to use your Amazon Managed Service for Prometheus (AMP)

First, download the values file, then set prometheus.enabled to false and prometheus.fqdn to your Prometheus service address, starting with http://. Note that this is the in-cluster address of your Prometheus server, not your AMP remote write URL, e.g. http://prometheus-server.prometheus.svc.cluster.local.
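As a rough illustration of the two settings, the sketch below writes them to a small override file. The file name kubecost-prometheus-override.yaml is hypothetical, and nesting the keys under `global` is an assumption about the chart's values layout; check your downloaded values file for where prometheus.enabled and prometheus.fqdn actually live.

```shell
# Hypothetical override file name; you can instead edit the same keys in values.yaml.
# Assumption: the chart reads these settings under the top-level `global` key.
cat > kubecost-prometheus-override.yaml <<'EOF'
global:
  prometheus:
    enabled: false   # do not deploy Kubecost's bundled Prometheus
    fqdn: http://prometheus-server.prometheus.svc.cluster.local   # in-cluster Prometheus address, not the AMP remote write URL
EOF
```

If you go this route, the file could be passed to helm with an additional -f flag alongside ./values.yaml.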

Then, navigate to the directory of the file and apply the changes by running the following helm command:

# replace the namespace and filename if needed
$ helm upgrade --install kubecost kubecost/cost-analyzer --namespace kubecost -f ./values.yaml

Verify that Kubecost is using the Prometheus server that remote writes metrics to AMP, rather than Kubecost's default Prometheus installation. To do so, forward Kubecost to a local port and query the /api endpoint:

# replace `kubecost` and `deployment/kubecost-cost-analyzer` if needed
$ kubectl port-forward --namespace kubecost deployment/kubecost-cost-analyzer 9090

# in another Terminal window
$ curl http://localhost:9090/api/

The first line returned by the curl command should contain your fqdn parameter URL, like so:

Using Prometheus at http://prometheus-server.prometheus.svc.cluster.local.
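The check described above can be scripted. The sketch below is one possible approach, not part of Kubecost: first_line_has_fqdn is a hypothetical helper that reads text on standard input and succeeds only if the first line contains the expected URL.

```shell
# Hypothetical helper: succeeds if the first line of stdin contains the expected Prometheus URL.
first_line_has_fqdn() {
  expected="$1"
  # Keep only the first line, then look for the URL as a fixed string (no regex).
  head -n 1 | grep -qF -- "$expected"
}

# Usage against a live port-forward (requires the kubectl port-forward shown earlier):
#   curl -s http://localhost:9090/api/ \
#     | first_line_has_fqdn "http://prometheus-server.prometheus.svc.cluster.local" \
#     && echo "Kubecost is using the expected Prometheus"
```

Matching only the first line mirrors the manual check; a fixed-string match avoids the dots in the URL being treated as regex wildcards.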

Seek help in our troubleshooting guide or reach out to us on Slack if you run into any issues.

3. Set up your Prometheus to scrape metrics from Kubecost

First, create a file called extra_scrape_configs.yaml with the following contents, replacing <your_kubecost_namespace> with your Kubecost namespace:

- job_name: kubecost
  honor_labels: true
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  dns_sd_configs:
    - names:
        - kubecost-cost-analyzer.<your_kubecost_namespace>
      type: "A"
      port: 9003
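If you would rather not edit the placeholder by hand, the file can be generated from a shell variable. This is a minimal sketch; KUBECOST_NAMESPACE is a hypothetical variable name and "kubecost" is only an assumed default, so substitute your own namespace.

```shell
# Assumption: Kubecost is installed in the "kubecost" namespace; change as needed.
KUBECOST_NAMESPACE="kubecost"

# Unquoted heredoc delimiter so that ${KUBECOST_NAMESPACE} is expanded into the file.
cat > extra_scrape_configs.yaml <<EOF
- job_name: kubecost
  honor_labels: true
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  dns_sd_configs:
    - names:
        - kubecost-cost-analyzer.${KUBECOST_NAMESPACE}
      type: "A"
      port: 9003
EOF
```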

Then, add these recording rules under serverFiles in your Prometheus override file, override_values.yaml. They are optional, but they improve performance.

serverFiles:
  rules:
    groups:
      - name: CPU
        rules:
          - expr: sum(rate(container_cpu_usage_seconds_total{container_name!=""}[5m]))
            record: cluster:cpu_usage:rate5m
          - expr: rate(container_cpu_usage_seconds_total{container_name!=""}[5m])
            record: cluster:cpu_usage_nosum:rate5m
          - expr: avg(irate(container_cpu_usage_seconds_total{container_name!="POD", container_name!=""}[5m])) by (container_name,pod_name,namespace)
            record: kubecost_container_cpu_usage_irate
          - expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""}) by (container_name,pod_name,namespace)
            record: kubecost_container_memory_working_set_bytes
          - expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""})
            record: kubecost_cluster_memory_working_set_bytes
      - name: Savings
        rules:
          - expr: sum(avg(kube_pod_owner{owner_kind!="DaemonSet"}) by (pod) * sum(container_cpu_allocation) by (pod))
            record: kubecost_savings_cpu_allocation
            labels:
              daemonset: "false"
          - expr: sum(avg(kube_pod_owner{owner_kind="DaemonSet"}) by (pod) * sum(container_cpu_allocation) by (pod)) / sum(kube_node_info)
            record: kubecost_savings_cpu_allocation
            labels:
              daemonset: "true"
          - expr: sum(avg(kube_pod_owner{owner_kind!="DaemonSet"}) by (pod) * sum(container_memory_allocation_bytes) by (pod))
            record: kubecost_savings_memory_allocation_bytes
            labels:
              daemonset: "false"
          - expr: sum(avg(kube_pod_owner{owner_kind="DaemonSet"}) by (pod) * sum(container_memory_allocation_bytes) by (pod)) / sum(kube_node_info)
            record: kubecost_savings_memory_allocation_bytes
            labels:
              daemonset: "true"
          - expr: label_replace(sum(kube_pod_status_phase{phase="Running",namespace!="kube-system"} > 0) by (pod, namespace), "pod_name", "$1", "pod", "(.+)")
            record: kubecost_savings_running_pods
          - expr: sum(rate(container_cpu_usage_seconds_total{container_name!="",container_name!="POD",instance!=""}[5m])) by (namespace, pod_name, container_name, instance)
            record: kubecost_savings_container_cpu_usage_seconds
          - expr: sum(container_memory_working_set_bytes{container_name!="",container_name!="POD",instance!=""}) by (namespace, pod_name, container_name, instance)
            record: kubecost_savings_container_memory_usage_bytes
          - expr: avg(sum(kube_pod_container_resource_requests{resource="cpu", unit="core", namespace!="kube-system"}) by (pod, namespace, instance)) by (pod, namespace)
            record: kubecost_savings_pod_requests_cpu_cores
          - expr: avg(sum(kube_pod_container_resource_requests{resource="memory", unit="byte", namespace!="kube-system"}) by (pod, namespace, instance)) by (pod, namespace)
            record: kubecost_savings_pod_requests_memory_bytes

Lastly, apply the changes:

$ helm upgrade --install prometheus-for-amp prometheus-community/prometheus -n prometheus -f ./override_values.yaml \
--set-file extraScrapeConfigs=extra_scrape_configs.yaml

To check that the rules were applied successfully, forward Kubecost to a local port, open it in your browser, and check that Prometheus Status on the Settings page reports both "Kubecost recording rules available in Prometheus" and "Kubecost cost-model metrics available in Prometheus".

# replace `kubecost` and `deployment/kubecost-cost-analyzer` if needed
$ kubectl port-forward --namespace kubecost deployment/kubecost-cost-analyzer 9090
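A command-line spot check is also possible by asking Prometheus for one of the recorded series through its HTTP query API. The sketch below only builds the query URL; prom_query_url is a hypothetical helper, and http://localhost:9091 assumes you have separately port-forwarded your Prometheus server to that local port (the service name and ports depend on your installation).

```shell
# Hypothetical helper: print a Prometheus instant-query URL for a single metric name.
prom_query_url() {
  base="$1"    # e.g. http://localhost:9091 (assumed local port-forward to Prometheus)
  metric="$2"  # a recorded series name, e.g. kubecost_container_cpu_usage_irate
  printf '%s/api/v1/query?query=%s\n' "$base" "$metric"
}

# Usage (requires a running port-forward to your Prometheus server):
#   curl -s "$(prom_query_url http://localhost:9091 kubecost_container_cpu_usage_irate)"
```

A response whose "result" array is non-empty suggests the rule is being evaluated. Recording rules only produce samples after their first evaluation, so an empty result right after the upgrade is not necessarily an error.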

Et voilà!


To verify that the integration is set up, check that the Prometheus Status section on the Kubecost Settings page does not contain any errors.

[Image: Kubecost status with no errors]

Have a look at the Custom Prometheus integration troubleshooting guide if you run into any errors while setting up the integration. You're also welcome to reach out to us on Slack if you require further assistance.

@elenalape

Hi @awsimaya, good catch, thank you! This gist isn't meant to be public-facing, but I've submitted a PR to remove that sentence from our official docs: kubecost/docs#137
