@david-martin
Last active August 14, 2023 16:02

Gateway API State Metrics

TLDR

Inspired by https://github.com/kubernetes/kube-state-metrics, this is a proposal for a Prometheus metrics exporter. The exporter will export metrics solely for Gateway API resources. It will watch the Kubernetes API server for Gateway API resources and convert the contents of those resources into metrics with labels and values.

A standard set of metrics for Gateway API resources will allow further standardization and sharing of things like:

  • Alert queries and recording rules
  • A General Gateway API Dashboard

Additionally, the metrics available from this exporter could provide the glue for doing more complex and useful queries when combined with metrics from underlying gateway providers.

Goals

Non-Goals

  • Implement metrics for all Gateway API resources. Additional metrics can be added later
  • Any example dashboards or alerts that make use of the new metrics

Introduction

This proposal focuses on an initial set of metrics for the core Gateway API resources:

  • Gateway
  • GatewayClass
  • HTTPRoute

These three resources were chosen:

  • to limit the scope while a pattern is established for naming, labels and resource relationships without getting bogged down in content
  • because these resources are the main resources used in the Kuadrant project day to day, and value can be added there quickly

Metrics

Gateway metrics

Gateway core metrics

Information about a Gateway, Gauge

gatewayapi_gateway_info{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>"} 1

Unix creation timestamp in seconds, Gauge

gatewayapi_gateway_created{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>"} 1690879977
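As a rough illustration of how a watched Gateway could be turned into the two samples above, here is a minimal sketch. It assumes the resource is available as a plain dict shaped like its manifest. The helper names `format_series` and `gateway_core_series` are hypothetical and not part of any existing exporter.

```python
from datetime import datetime, timezone


def format_series(name, labels, value):
    """Render one sample in the Prometheus text exposition format."""
    def escape(raw):
        # Label values must escape backslash, double quote and newline.
        return str(raw).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    body = ",".join(f'{key}="{escape(val)}"' for key, val in labels.items())
    return f"{name}{{{body}}} {value}"


def gateway_core_series(gateway):
    """Build the info and created samples for one Gateway manifest (a dict)."""
    meta = gateway["metadata"]
    labels = {
        "namespace": meta["namespace"],
        "gateway": meta["name"],
        "gatewayclass": gateway["spec"]["gatewayClassName"],
    }
    # Kubernetes serializes creationTimestamp as RFC 3339 in UTC.
    created = datetime.strptime(meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ")
    created_seconds = int(created.replace(tzinfo=timezone.utc).timestamp())
    return [
        format_series("gatewayapi_gateway_info", labels, 1),
        format_series("gatewayapi_gateway_created", labels, created_seconds),
    ]
```

A Gateway created at 2023-08-01T08:52:57Z would produce the value 1690879977 used in the sample above.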

Unix deletion timestamp in seconds, Gauge

gatewayapi_gateway_deletion_timestamp{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>"} 1690879977

Per Listener information, Gauge

gatewayapi_gateway_listener{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",listener="<LISTENER_NAME>",port="<PORT>",protocol="<PROTOCOL>"} 1

Potential additional labels for optional fields include hostname and tls-mode. AllowedRoutes and CertificateRefs would need some thought if they are to be represented as metrics, as they are lists. Perhaps they could be separate metrics, e.g. gatewayapi_gateway_listener_allowed_route{} and gatewayapi_gateway_listener_certificate_ref{}. While technically possible, there's potential for an excessive number of metrics being generated.
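The per-listener metric implies one series for every entry in spec.listeners. A minimal sketch of that expansion follows, assuming the Gateway is a dict shaped like its manifest. The function name `listener_series` is hypothetical, and label values are assumed not to need escaping.

```python
def listener_series(gateway):
    """Return one gatewayapi_gateway_listener sample line per spec.listeners entry."""
    meta = gateway["metadata"]
    base = {
        "namespace": meta["namespace"],
        "gateway": meta["name"],
        "gatewayclass": gateway["spec"]["gatewayClassName"],
    }
    lines = []
    for listener in gateway["spec"].get("listeners", []):
        # Each listener gets the shared Gateway labels plus its own fields.
        labels = dict(
            base,
            listener=listener["name"],
            port=str(listener["port"]),
            protocol=listener["protocol"],
        )
        body = ",".join(f'{key}="{value}"' for key, value in labels.items())
        lines.append(f"gatewayapi_gateway_listener{{{body}}} 1")
    return lines
```

Because the value is always 1, a query such as count(gatewayapi_gateway_listener) simply counts listeners.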

Status Conditions of Gateway, Gauge, 1 or 0 (1 means this condition currently applies to this gateway)

gatewayapi_gateway_status_condition{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",condition="<Accepted|Scheduled|Ready>",status="<true|false|unknown>"} 1
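One way to read the "1 or 0" semantics is that every condition is expanded into a series per possible status, with only the current status set to 1. The sketch below illustrates that interpretation. It is an assumption modelled on how kube-state-metrics exposes conditions, and the function name `condition_samples` is hypothetical.

```python
STATUSES = ("true", "false", "unknown")


def condition_samples(conditions):
    """Map (condition type, status) to 1 if that status currently applies, else 0."""
    samples = {}
    for condition in conditions:
        # Kubernetes reports "True", "False" or "Unknown"; the metric label is lowercase.
        current = condition["status"].lower()
        for status in STATUSES:
            samples[(condition["type"], status)] = 1 if status == current else 0
    return samples
```

Under this interpretation a query like gatewayapi_gateway_status_condition{condition="Ready",status!="true"} needs the trailing > 0, otherwise the zero-valued series for healthy Gateways would also match.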

Number of attached routes for an individual listener, Gauge

gatewayapi_gateway_status_listener_attached_routes{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",listener="<LISTENER_NAME>"} 5
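The value for this metric would come from the attachedRoutes field of each entry in status.listeners. A small sketch of reading it, assuming the Gateway is a dict shaped like its manifest (both function names are hypothetical):

```python
def attached_routes_by_listener(gateway):
    """Map listener name to the attachedRoutes count reported in the Gateway status."""
    status = gateway.get("status", {})
    return {
        entry["name"]: entry.get("attachedRoutes", 0)
        for entry in status.get("listeners", [])
    }


def listeners_without_routes(gateway):
    """Return the names of listeners that currently have no attached routes."""
    counts = attached_routes_by_listener(gateway)
    return [name for name, count in counts.items() if count == 0]
```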

Status conditions of individual listeners, Gauge 1 or 0 (1 means this condition currently applies to this listener)

gatewayapi_gateway_status_listener_condition{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",listener="<LISTENER_NAME>",condition="<Accepted|Conflicted|Detached|Programmed|Ready|ResolvedRefs>",status="<true|false|unknown>"} 1

Address types and values, Gauge

gatewayapi_gateway_status_address{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",type="<IPAddress|Hostname>",value="<ADDRESS>"} 1
gatewayapi_gateway_status_address{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",type="<DOMAIN_PREFIXED_STRING_IDENTIFIER>",value="<ADDRESS>"} 1

Gateway optional metrics, disabled by default (potentially high cardinality metrics)

Kubernetes annotations converted to Prometheus labels, Gauge

gatewayapi_gateway_annotations{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",annotation_GATEWAY_ANNOTATION="<GATEWAY_ANNOTATION>"} 1

Kubernetes labels converted to Prometheus labels, Gauge

gatewayapi_gateway_labels{namespace="<NAMESPACE>",gateway="<GATEWAY>",gatewayclass="<GATEWAYCLASS_NAME>",label_GATEWAY_LABEL="<GATEWAY_LABEL>"} 1
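Kubernetes label and annotation keys can contain characters such as dots, slashes and hyphens that are not valid in Prometheus label names, so some sanitizing step is needed. The following is a minimal sketch of one possible rule (replace every invalid character with an underscore and add a prefix). The exact rule the exporter would use is not specified in this proposal, and `to_prometheus_labels` is a hypothetical helper.

```python
import re

# Characters outside this set are not allowed in Prometheus label names.
INVALID_LABEL_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def to_prometheus_labels(kube_metadata, prefix="label_"):
    """Convert Kubernetes label or annotation keys into prefixed Prometheus label names."""
    converted = {}
    for key, value in kube_metadata.items():
        converted[prefix + INVALID_LABEL_CHARS.sub("_", key)] = value
    return converted
```

Note that a rule like this can map two different keys (for example a.b and a_b) to the same label name, so a real implementation would need a policy for collisions.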

Gateway example queries

Find all Gateways not in a Ready state

gatewayapi_gateway_status_condition{condition="Ready",status!="true"} > 0

Count the number of listeners across all gateways

count(gatewayapi_gateway_listener)

Find any listeners with 0 attached routes

gatewayapi_gateway_status_listener_attached_routes == 0

Find any listeners not in a Ready state

gatewayapi_gateway_status_listener_condition{condition="Ready",status!="true"} > 0

GatewayClass metrics

GatewayClass core metrics

Information about a GatewayClass, Gauge

gatewayapi_gatewayclass_info{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>"} 1

ParametersReference would need some thought if it is to be represented as a metric, as it is an object reference with several fields (group, kind, name and an optional namespace). Perhaps a separate metric, e.g. gatewayapi_gatewayclass_parameter_ref{}. The GatewayClass description is an optional field that has limited value in a metric, so it doesn't make sense to include it as a label.

Unix creation timestamp in seconds, Gauge

gatewayapi_gatewayclass_created{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>"} 1690879977

Unix deletion timestamp in seconds, Gauge

gatewayapi_gatewayclass_deletion_timestamp{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>"} 1690879977

Status Conditions of GatewayClass, Gauge, 1 or 0 (1 means this condition currently applies to this GatewayClass)

gatewayapi_gatewayclass_status_condition{gatewayclass="<GATEWAYCLASS_NAME>",condition="Accepted",status="<true|false|unknown>"} 1

GatewayClass optional metrics, disabled by default (potentially high cardinality metrics)

Kubernetes annotations converted to Prometheus labels, Gauge

gatewayapi_gatewayclass_annotations{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>",annotation_GATEWAYCLASS_ANNOTATION="<GATEWAYCLASS_ANNOTATION>"} 1

Kubernetes labels converted to Prometheus labels, Gauge

gatewayapi_gatewayclass_labels{gatewayclass="<GATEWAYCLASS_NAME>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>",label_GATEWAYCLASS_LABEL="<GATEWAYCLASS_LABEL>"} 1

GatewayClass example queries

Find any GatewayClasses that are not in an accepted state

gatewayapi_gatewayclass_status_condition{condition="Accepted",status!="true"} > 0

Get the GatewayClass info (e.g. controller name) of any Gateways that are not in a Ready state

(gatewayapi_gateway_status_condition{condition="Ready",status!="true"} > 0)
* on(gatewayclass) group_left(controller_name) gatewayapi_gatewayclass_info

HTTPRoute metrics

HTTPRoute core metrics

Information about an HTTPRoute, Gauge

gatewayapi_httproute_info{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>"} 1

Unix creation timestamp in seconds, Gauge

gatewayapi_httproute_created{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>"} 1690879977

Unix deletion timestamp in seconds, Gauge

gatewayapi_httproute_deletion_timestamp{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>"} 1690879977

Parent References that the route wants to be attached to, Gauge

gatewayapi_httproute_parent_ref{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",parent_ref_group="<GROUP>",parent_ref_kind="<KIND>",parent_ref_namespace="<PARENT_REF_NAMESPACE>",parent_ref_name="<PARENT_REF_NAME>",parent_ref_section_name="<PARENT_REF_SECTION_NAME>",parent_ref_port="<PARENT_REF_PORT>"} 1
  • To avoid confusion, the parent ref namespace will use the label parent_ref_namespace instead of namespace (which is the namespace of the HTTPRoute). Then, for the sake of keeping a pattern, all other parent ref fields will be prefixed with parent_ref_ e.g. parent_ref_group
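The prefixing convention can be sketched as a small mapping from one spec.parentRefs entry to metric labels. The fallbacks used below (group gateway.networking.k8s.io, kind Gateway, namespace of the route) follow the Gateway API defaults for unset fields. Whether the exporter should fill them in or emit empty strings is an assumption here, and `parent_ref_labels` is a hypothetical helper.

```python
def parent_ref_labels(route_namespace, parent_ref):
    """Build parent_ref_-prefixed labels for one entry of an HTTPRoute's spec.parentRefs."""
    return {
        "parent_ref_group": parent_ref.get("group", "gateway.networking.k8s.io"),
        "parent_ref_kind": parent_ref.get("kind", "Gateway"),
        # An unset namespace means the parent lives in the route's own namespace.
        "parent_ref_namespace": parent_ref.get("namespace", route_namespace),
        "parent_ref_name": parent_ref["name"],
        # Optional fields are rendered as empty strings when absent.
        "parent_ref_section_name": parent_ref.get("sectionName", ""),
        "parent_ref_port": str(parent_ref.get("port", "")),
    }
```

Keeping namespace and parent_ref_namespace as separate labels means a route in one namespace attached to a Gateway in another is represented without ambiguity.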

Hostnames to match against the HTTP Host header, Gauge

gatewayapi_httproute_hostname{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",hostname="<HOSTNAME>"} 1

Rules would need some thought if they are to be represented as metrics as there are nested lists for matches, filters and backendRefs. While technically possible, there's potential for excessive numbers of metrics being generated.
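To get a feel for the cardinality concern, the sketch below counts how many series a single HTTPRoute would produce if every match, filter and backendRef in its rules became its own series. That flattening scheme is purely an assumption for illustration, as is the function name `rule_series_count`.

```python
def rule_series_count(http_route):
    """Count series if each match, filter and backendRef were exported as its own series."""
    total = 0
    for rule in http_route["spec"].get("rules", []):
        total += len(rule.get("matches", []))
        total += len(rule.get("filters", []))
        total += len(rule.get("backendRefs", []))
    return total
```

Multiplying the result by the number of HTTPRoutes in a cluster gives a rough upper bound on the extra series such metrics would add.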

HTTPRouteStatus parent status conditions, Gauge

gatewayapi_httproute_status_parent_condition{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",condition="<Accepted|ResolvedRefs>",status="<true|false|unknown>",parent_ref_group="<GROUP>",parent_ref_kind="<KIND>",parent_ref_namespace="<PARENT_REF_NAMESPACE>",parent_ref_name="<PARENT_REF_NAME>",parent_ref_section_name="<PARENT_REF_SECTION_NAME>",parent_ref_port="<PARENT_REF_PORT>",controller_name="<GATEWAYCLASS_CONTROLLER_NAME>"} 1

HTTPRoute optional metrics, disabled by default (potentially high cardinality metrics)

Kubernetes annotations converted to Prometheus labels, Gauge

gatewayapi_httproute_annotations{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",annotation_HTTPROUTE_ANNOTATION="<HTTPROUTE_ANNOTATION>"} 1

Kubernetes labels converted to Prometheus labels, Gauge

gatewayapi_httproute_labels{namespace="<NAMESPACE>",httproute="<HTTPROUTE_NAME>",label_HTTPROUTE_LABEL="<HTTPROUTE_LABEL>"} 1

HTTPRoute example queries

Find any HTTPRoutes that haven't been Accepted by a parent.

gatewayapi_httproute_status_parent_condition{condition="Accepted",status!="true"} > 0

Get any non-true Gateway listener status conditions for HTTPRoutes in the default namespace that haven't been Accepted by a parent.

(gatewayapi_gateway_status_listener_condition{status!="true"} > 0)
* on(gateway)
label_replace(
  gatewayapi_httproute_status_parent_condition{namespace="default",condition="Accepted",status!="true"},
  "gateway","$1","parent_ref_name", "(.+)"
)