braydonk/otel_matching_spec.md

## otel_matching_spec.md

      
    Raw
  

              otel_matching_spec.md
            
          
    OpenTelemetry Wildcard Name Matching

Background

What and Why

The following is a specification for OpenTelemetry Wildcard Name Matching grammar that I propose to add to the generated metadata packages from mdatagen. This grammar will facilitate the use of wildcards in metric and resource attribute configurations to enabling/disabling metrics/resource attributes in a more concise way, allowing a configuration like:
metrics:
  "*":
    enabled: false
  system.*:
    enabled: true
Previously, this would require every metric be listed individually for enabled/disabled. With this feature, we can allow this to be specified in two simple entries.
Challenges

The biggest challenge with this feature is that we cannot rely on the order specified in the configuration. Since the underlying type is a Golang map, order is not deterministic. This the main reason that I propose coming up with a cust:w
om grammar rather than using an existing solution; this specification will include the prioritization system that determines the order of patterns being processed, regardless of the order received.
The grammar being custom means that we will need to manually implement a matching engine. I will include the specification for that as well.
Specification

Grammar

The grammar is an extension of the OTel name grammar. In plain english, the OTel name gramma is an OTel identifier (characters with underscores) optionally followed by a . and another OTel identifier. With the wildcard matching, the grammar is similar, except instead of a name you can optionally have one of two things:

A wildcard * character, which matches anything in a group. If it is the final group, it will match any number of subsequent groups.
A multimatch {x,y,z,...} which will match anything specified within it

The two of these will be referred to as "matchers" for the remainder of the document.
Formal grammar specification can be found at the bottom of this document.
Priority

The priority system is required due to the fact that the part of the collector that processes these patterns receives them in a non-deterministic order. It would be ideal if they could be applied in the order specified in the configuration file, but unfortunately that is not possible.
The priority system uses the following rules in this order:

Patterns are sorted in ascending order of number namespace groups (identifiers/wildcards separated by .)
Matchers in earlier groups are applied first
Matchers are applied before identifiers in the same group
Wildcard matchers are applied before multimatch matchers
If all other priority conditions don't trigger, they are applied in lexicographical order

When a priority rule is triggered, it takes precedence over any below it, i.e. if one pattern has less groups than another, it will be applied first even if the latter has an earlier matcher. This is to ensure that the order of patterns is always deterministic.
Given the following unsorted list of patterns:
process.cpu.{time,utilization}
process.cpu.*
processes.*
*
process.cpu.count

With priority applied, the patterns would be applied in the following order:
*
processes.*
process.cpu.*
process.cpu.{time,utilization}
process.cpu.count

Examples

The following examples demonstrate using the name matching patterns with the Semantic Conventions for HTTP Metrics (permalinked to revision as of writing of this document, metrics may have changed if reading in the future).

metrics:
  http.client.*:
    enabled: true
  "*":
    enabled: false
First, all metrics will be first be disabled since * is a higher priority pattern and will be applied first. Then, all http.client namespaced metrics will be enabled.

metrics:
  http.*.{request,response}.body.size:
    enabled: true
The following metrics will be enabled:
http.client.request.body.size
http.client.response.body.size
http.server.request.body.size
http.server.response.body.size

metrics:
  http.client.*:
    enabled: false
  http.*.active_requests:
    enabled: true
The first pattern applied would be http.*.active_requests, so first http.client.active_requests and http.server.active_requests would be enabled. Then, the http.client.* pattern would be applied, which means all http.client namespaced metrics would be disabled, including http.client.active_requests. Oops, that's not what we wanted! Unfortunately, to make this work we will need the following:
metrics:
  http.client.*:
    enabled: false
  http.client.active_requests:
    enabled: true
  http.server.active_requests:
    enabled: true
The wildcard pattern http.client.* will be applied first, then the active_requests enablers will run.
Grammar Spec

You can try the grammar out at this BNF Playground link.
EBNF

<pattern> ::= <group> ("." <group>)*
<group> ::= <otel_identifier> | <matcher>
<otel_identifier> ::= <otel_identifier_character> ("_"? <otel_identifier_character>)*
<otel_identifier_character> ::= <letter> | <digit>
<matcher> ::= "*" | <multimatch>
<multimatch> ::= "{" <otel_identifier> ("," " "? <otel_identifier>)* "}"

<letter> ::= [a-z]
<digit> ::= [0-9]

Nearley

Try it out by pasting it into the Nearley Playground.
pattern -> group ( "." group ):*
group -> otel_identifier | matcher
otel_identifier -> otel_identifier_character ( "_":? otel_identifier_character ):*
otel_identifier_character -> letter | digit
matcher -> "*" | multimatch
multimatch -> "{" otel_identifier ( "," " ":? otel_identifier ):* "}"

letter -> [a-z]
digit -> [0-9]