Skip to content

Instantly share code, notes, and snippets.

@noahfalk
Last active July 21, 2023 02:33
Show Gist options
  • Save noahfalk/32bb919907c972ac06ac128bc767d48d to your computer and use it in GitHub Desktop.
Save noahfalk/32bb919907c972ac06ac128bc767d48d to your computer and use it in GitHub Desktop.
Metrics configuration feature design

Metrics Configuration Feature Design

This is my strawman to get discussion and feedback started. I'm actively researching and iterating on it. My overall goals with the design:

  • App developers can easily configure a set of metrics they want to collect and forward that data to a sink. Sinks likely, but not necessarily, will transmit the metrics out-of-process to some remote storage.
  • App developers should find using the API similar to configuring ILogger
  • 3rd parties such as OpenTelemetry, Prometheus.NET, or AppMetrics can easily author a sink to capture the metrics data and transmit it.
  • Sink developers that understand how to use MeterListener should find this API similar and easy to adopt.
  • 3rd party authors of instrumentation that currently takes an explicit dependency on OpenTelemetry MeterProviderBuilder should be able to easily support this API in addition.

Usage examples

App dev sends the default metrics telemetry to an OLTP endpoint using OpenTelemetry

NOTE: This scenario may be sending metric data that doesn't conform to OpenTelemetry naming conventions
var builder = WebApplication.CreateBuilder(args);
builder.Metrics.AddOpenTelemetry(builder => builder.AddOtlpExporter());

The defaults are controlled from appsettings.json which would contain this in a new web app template:

{
  "Metrics": {
    "EnabledMetrics": {
      "System": true,
      "Microsoft.AspNetCore": true
    }
  }
}

The specific destination for the telemetry is picked up by OTel from the environment variable OTEL_EXPORTER_OTLP_ENDPOINT. Other variations allow the configuration to be passed as a parameter ot AddOltpExporter() or using the service container to configure the OtlpExporterOptions object.

The AddOpenTelemetry() call here is a hypothetical extension method implemented by the OTel.NET project similar to their current ILoggingBuilder.AddOpenTelemetry() API.

App dev wants to send default metrics via Prometheus.NET

var builder = WebApplication.CreateBuilder(args);
builder.Metrics.AddPrometheus();

[omitting standard webapp code here]

app.MapMetrics();

Prometheus requires Kestrel to host its scrape endpoint which is why it integrates again in app.UseEndpoints(). The same appsettings.json as above (not shown) is controlling the default metrics being sent to Prometheus.

App dev sends the default metrics telemetry, named with OTel conventions, to an OLTP endpoint using OpenTelemetry

Option #1: Use instrumentation libraries provided by OpenTelemetry

var builder = WebApplication.CreateBuilder(args);
builder.Metrics.AddRuntimeMetrics();
builder.Metrics.AddAspNetCoreMetrics();
builder.Metrics.AddOpenTelemetry(builder => builder.AddOtlpExporter());

The AddXXXMetrics() APIs are new hypothetical extension methods provided by OpenTelemetry instrumentation libraries. These calls produce metrics under a Meter name defined by the OTel instrumentation libraries such as "OpenTelemetry.Instrumentation.AspNetCore". If the metrics are left enabled in appsettings.json then both OpenTelmetry and built-in metrics would be transmitted. The app developer should exlucde built-in metrics from the list if they don't want both.

Option #2: Use APIs provided by runtime libraries (Optional future development, does not ship in .NET 8)

var builder = WebApplication.CreateBuilder(args);
builder.Metrics.AddRuntimeMetrics(RuntimeMetricNaming.OpenTelemetry1_0);
builder.Metrics.AddAspNetCoreMetrics(AspNetCoreMetricNaming.OpenTelemetry1_0);
builder.Metrics.AddOpenTelemetry(builder => builder.AddOtlpExporter());

Once OpenTelemetry has stable conventions it is possible for the libraries that produce metrics to also include extension methods that apply OpenTelemetry naming rules. The enumeration would likely start with two options, the default metric names shipped originally and OpenTelemetry's names. Over time more options might be added to the enumeration for updated versions of OpenTelemetry's conventions or other standards that have gained a broad interest. Unlike option #1, these metrics are produced with the default Meter name for the library such as "Microsoft.AspNetCore" and it replaces the default metric rather than adding new metrics in parallel.

App dev wants to enable metrics from certain Meters to get sent (config-based approach)

{
  "Metrics": {
    "EnabledMetrics": {
      "System": true,
      "Microsoft.AspNetCore": true
      "Microsoft.Extensions.Caching": true // hypothetical runtime provided Meter
      "StackExchange.Redis": true          // hypothetical Meter in a 3P library
    }
  }
}

App dev wants to enable metrics from certain Meters to get sent (programmatic approach)

var builder = WebApplication.CreateBuilder(args);
builder.Metrics
  .EnableMetrics("Microsoft.Extensions.Caching")
  .EnableMetrics("StackExchange.Redis")
  .AddOpenTelemetry(builder => builder.AddOtlpExporter());

App dev wants to enable metrics from certain Meters to get sent (custom-instrumentation approach)

Meters need to be created in order for turning them on to have any effect. In some cases it may be convenient to both create them and turn them on as a single step using a strongly typed extension method:

var builder = WebApplication.CreateBuilder(args);
builder.Metrics
  .AddSqlClientInstrumentation()
  .AddOpenTelemetry(builder => builder.AddOtlpExporter());

This AddSqlClientInstrumentation method is a hypothetical extension method that could be offered by an OpenTelemetry instrumentation package. Note that some instrumentation packages are likely to be duplicative with built-in metrics. If the app developer doesn't want both then they should ensure the built-in metrics aren't included in their configuration.

App dev wants to disable sending metrics for some instruments on a Meter (config-based approach)

{
  "Metrics": {
    "EnabledMetrics": {
      "System": true,
      "Microsoft.AspNet.Core": {
        "Default": true,               // enable default metrics for all the instruments
        "connection-duration": false   // but exclude the connection-duration metric
      }
    }
  }
}

App dev wants to only enable specific instruments on a Meter (config-based approach)

{
  "Metrics": {
    "EnabledMetrics": {
      "System": true,
      "Microsoft.AspNet.Core": {
        "connection-duration": true   // enable only the connection-duration metric
      }
    }
  }
}

App dev wants to only enable specific instruments on a Meter (API-based approach)

var builder = WebApplication.CreateBuilder(args);
builder.Metrics
  // enable only the duration-counter metric
  .EnableMetrics("Microsoft.AspNetCore", "duration-counter");
  .AddOpenTelemetry(builder => builder.AddOtlpExporter());

App dev wants to disable sending metrics for some instruments on a Meter (API-based approach)

var builder = WebApplication.CreateBuilder(args);
builder.Metrics
  // enable all default metrics for all instruments on this Meter except duration-counter
  .EnableMetrics("Microsoft.AspNetCore", instrument => instrument.Name != "duration-counter");
  .AddOpenTelemetry(builder => builder.AddOtlpExporter());

spec

Microsoft.Extensions.Diagnostics assembly

namespace Microsoft.Extensions.DependencyInjection;

public static class MetricsServiceCollectionExtensions
{
    public static IServiceCollection AddMetrics(this IServiceCollection);
    public static IServiceCollection AddMetrics(this IServiceCollection, Action<IMetricsBuilder> configure);
}

Microsoft.Extensions.Diagnostics.Abstractions assembly

namespace Microsoft.Extensions.Diagnostics.Metrics;

public interface IMetricsBuilder
{
    IServiceCollection Services { get; }
}

public static class MetricsBuilderExtensions
{
    public static IMetricsBuilder AddListener(this IMetricsBuilder builder, IMetricsListener listener);
    public static IMetricsBuilder ClearListeners(this IMetricsBuilder builder);
}

public interface IMetricsSource
{
    public void RecordObservableInstruments();
}

public interface IMetricsListener
{
    public void SetSource(IMetricsSource source);
    public object? InstrumentPublished(Instrument instrument);
    public void MeasurementsCompleted(Instrument instrument, object? userState);
    public MeasurementCallback<T> GetMeasurementHandler<T>();
}

public class MetricsEnableOptions
{
    public IList<InstrumentEnableRule> Rules { get; }
}

public class InstrumentEnableRule
{
    public InstrumentEnableRule(string? listenerName, string? meterName, MeterScope scopes, string? instrumentName, Action<string?, Instrument, bool> filter);
    public string? ListenerName { get; }
    public string? MeterName { get; }
    public MeterScope Scopes { get; }
    public string? InstrumentName { get; }
    public Func<string?, Instrument, bool>? Filter { get; }
}

public static MetricsBuilderEnableExtensions
{
    // common overloads
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName) where T : IMetricsListener;
    
    // less common overloads
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName, string? instrumentName);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName, string? instrumentName) where T : IMetricsListener;
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, Func<Instrument, bool> filter);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, Func<Instrument, bool> filter) where T : IMetricsListener;
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName, MeterScope scopes);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName, MeterScope scopes) where T : IMetricsListener;
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName, Action<Instrument, bool> filter);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName, Action<Instrument, bool> filter) where T : IMetricsListener;
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName, MeterScope scopes, Action<Instrument, bool> filter);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName, MeterScope scopes, Action<Instrument, bool> filter) where T : IMetricsListener;
    
    // all the same extension methods on MetricsOptions
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName, string? instrumentName) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName, string? instrumentName);
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, Func<Instrument, bool> filter);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, Func<Instrument, bool> filter) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName, MeterScope scopes);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName, MeterScope scopes) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName, Action<Instrument, bool> filter);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName, Action<Instrument, bool> filter) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName, MeterScope scopes, Action<Instrument, bool> filter);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName, MeterScope scopes, Action<Instrument, bool> filter) where T : IMetricsListener;
}

[Flags]
public enum MeterScope
{
    Global,
    Local
}

Integration with configuration

We want users to be able to configure which metrics to capture for different listeners using appsettings.json or other IConfiguration sources

Data format

Metrics is the top level section under which "EnabledMetrics" configures which metrics are enabled for all listeners and a section named with a listener name adds additional rules specific to that listener. The key under "EnabledMetrics" is a Meter name prefix and the value is true if the Meter should be enabled or false if disabled.

{
    "Metrics": {
        "EnabledMetrics": {
            "System.Runtime": true
        },
        "OpenTelemetry": {
            "EnabledMetrics": {
                "Microsoft.AspNetCore": true
            }
        }
    }
}

Similar to logging, "Default" acts as a zero-length meter prefix creating a default rule applying to all Meters if no longer prefix rule matches. Logically this allows creating opt-in or opt-out policies. If no rule matches a Meter it will not be enabled, so omitting "Default" is the same behavior as including Default=false.

{
    "Metrics": {
        "EnabledMetrics": {
            "Default": true
            "NoisyLibrary": false
        },
    }
}

To enable individual instruments within a Meter replace the boolean with object syntax where the keys are instrument names or prefixes and the value is true iff enabled. "Default" is also honored at this scope as the zero-length instrument name prefix. Not specifying a "Default" is the same as "Default": false.

{
    "Metrics": {
        "EnabledMetrics": {
            "System.Runtime": {
                "gc": true,
                "jit": true
            }
            "Microsoft.AspNet.Core": {
                "Default": true,
                "connection-duration": false
            }
        }
    }
}

To make rules apply only to Meters at a specific Global or Local scope instead of "EnabledMetrics" you can use "EnabledGlobalMetrics" or "EnabledLocalMetrics". Rules in a "EnabledMetrics" section apply to Meters in both scopes. The expectation is that using this local/global config is rare and maybe we should just remove it?

{
    "Metrics": {
        "EnabledGlobalMetrics": {
            "System.Runtime": true
        }
        "EnabledLocalMetrics": {
            "System.Net.Http": true
        }
        "EnabledMetrics": {
            "Meter1": true,
            "Meter2": true
        }
    }
}

API

Microsoft.Extensions.Diagnostics.Configuration.dll:

namespace Microsoft.Extensions.Diagnostics.Metrics
{
    public static class MetricsBuilderExtensions
    {
        public static IMetricsBuilder AddConfiguration(this IMetricsBuilder builder, IConfiguration configuration);
    }
}

namespace Microsoft.Extensions.Diagnostics.Metrics.Configuration
{
    public interface IMetricListenerConfigurationFactory
    {
        IConfiguration GetConfiguration(Type listenerType);
    }
    
    public interface IMetricListenerConfiguration<T>
    {
        IConfiguration Configuration { get; }
    }
}

Integration with Hosting APIs

We should include a Metrics property in parallel where existing hosting APIs expose Logging

public class HostApplicationBuilder //pre-existing
{
    public IMetricsBuilder Metrics { get; }
}

public class WebApplicationBuilder //pre-existing
{
    public IMetricsBuilder Metrics { get; }
}

public static class HostingHostBuilderExtensions //pre-existing
{
    public static IHostBuilder ConfigureMetrics(this IHostBuilder hostBuilder, Action<IMetricsBuilder> configureMetrics);
    public static IHostBuilder ConfigureMetrics(this IHostBuilder hostBuilder, Action<HostBuilderContext, IMetricsBuilder> configureMetrics);
}

Default sinks

We should include a simple console output of the metric measurements intended for debugging purposes. This doesn't have any of the customization that the ILogger console does because it isn't intended to be left enabled in production code. Recording raw measurements to text is very inefficient.

public static class ConsoleMetricsExtension
{
    public static IMetricsBuilder AddConsole(this IMetricsBuilder builder);
}
@noahfalk
Copy link
Author

Yeah, all of your guesses are correct 👍 , let me fix it inline.

@Tratcher
Copy link

Note HostApplicationBuilder now has a base interface IHostApplicationBuilder where this should be added too.

@Tratcher
Copy link

Any idea why System.Diagnostics.Metrics.Instrument isn't CLSCompliant? It's causing a viral spread of <CLSCompliant>false</CLSCompliant> up the stack.

@noahfalk
Copy link
Author

Offhand, no I didn't even realize Instrument was non-compliant.

@tarekgh - does this ring a bell for you? Separately does our API design care much about CLS-compliance these days? I'm not sure if we should be trying to do anything about this or just mark everything non-compliant and move on.

@tarekgh
Copy link

tarekgh commented Jul 19, 2023

The CLS compliance is turned off on the whole System.Diagnostics.DiagnosticSource library and not only the Instrument or metric classes. This is done long ago in the PR dotnet/corefx#33207 by Vance. I have no idea why he did that. @noahfalk you were one of the reviewers of this PR.

I tried to turn on the CLS compliance on the library then building and running the tests and I am not seeing any problem so far. I am not sure if there is any risk to the library consumers if we turn on the CLS compliance.
In general, in .NET Core there are many APIs exposed that are not CLS compliant, so I don't think anyone cares much about that anymore.

@Tratcher
Copy link

Tratcher commented Jul 19, 2023

I tried to turn on the CLS compliance on the library then building and running the tests and I am not seeing any problem so far. I am not sure if there is any risk to the library consumers if we turn on the CLS compliance.

Great, can we please turn that on so it doesn't go viral up the stack? Want me to submit a PR?

@tarekgh
Copy link

tarekgh commented Jul 19, 2023

I'll open the PR and look if CI gates will pass and wait for @noahfalk advise if he can see any risk.

@tarekgh
Copy link

tarekgh commented Jul 19, 2023

I have opened the PR dotnet/runtime#89214.

@Tratcher
Copy link

If the MeterScope is unspecified calling EnableMetrics("System.Net.Http") would bind to instruments on both of the above Meters if the app created them. Specifying MeterScope.Global would only bind to the global Meter, and specifying MeterScope.Local would only bind to the local Meter. If you created a 2nd DI container and called
AddMetrics(builder => builder.EnableMetrics("System.Net.Http")) in that one it would still have visibility on the
global Meter, but it would not have visibility on the local Meter created in DI container 1.

InstrumentEnableRule requires MeterScope to be specified as Global or Local. Should it be nullable?

@noahfalk
Copy link
Author

It is a flags enum so I was imagining setting it to Global | Local if you want it to apply to both. Does that work or you still forsee issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment