Skip to content

Instantly share code, notes, and snippets.

@noahfalk
Last active July 21, 2023 02:33
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save noahfalk/32bb919907c972ac06ac128bc767d48d to your computer and use it in GitHub Desktop.
Save noahfalk/32bb919907c972ac06ac128bc767d48d to your computer and use it in GitHub Desktop.
Metrics configuration feature design

Metrics Configuration Feature Design

This is my strawman to get discussion and feedback started. I'm actively researching and iterating on it. My overall goals with the design:

  • App developers can easily configure a set of metrics they want to collect and forward that data to a sink. Sinks likely, but not necessarily, will transmit the metrics out-of-process to some remote storage.
  • App developers should find using the API similar to configuring ILogger
  • 3rd parties such as OpenTelemetry, Prometheus.NET, or AppMetrics can easily author a sink to capture the metrics data and transmit it.
  • Sink developers that understand how to use MeterListener should find this API similar and easy to adopt.
  • 3rd party authors of instrumentation that currently takes an explicit dependency on OpenTelemetry MeterProviderBuilder should be able to easily support this API in addition.

Usage examples

App dev sends the default metrics telemetry to an OLTP endpoint using OpenTelemetry

NOTE: This scenario may be sending metric data that doesn't conform to OpenTelemetry naming conventions
var builder = WebApplication.CreateBuilder(args);
builder.Metrics.AddOpenTelemetry(builder => builder.AddOtlpExporter());

The defaults are controlled from appsettings.json which would contain this in a new web app template:

{
  "Metrics": {
    "EnabledMetrics": {
      "System": true,
      "Microsoft.AspNetCore": true
    }
  }
}

The specific destination for the telemetry is picked up by OTel from the environment variable OTEL_EXPORTER_OTLP_ENDPOINT. Other variations allow the configuration to be passed as a parameter ot AddOltpExporter() or using the service container to configure the OtlpExporterOptions object.

The AddOpenTelemetry() call here is a hypothetical extension method implemented by the OTel.NET project similar to their current ILoggingBuilder.AddOpenTelemetry() API.

App dev wants to send default metrics via Prometheus.NET

var builder = WebApplication.CreateBuilder(args);
builder.Metrics.AddPrometheus();

[omitting standard webapp code here]

app.MapMetrics();

Prometheus requires Kestrel to host its scrape endpoint which is why it integrates again in app.UseEndpoints(). The same appsettings.json as above (not shown) is controlling the default metrics being sent to Prometheus.

App dev sends the default metrics telemetry, named with OTel conventions, to an OLTP endpoint using OpenTelemetry

Option #1: Use instrumentation libraries provided by OpenTelemetry

var builder = WebApplication.CreateBuilder(args);
builder.Metrics.AddRuntimeMetrics();
builder.Metrics.AddAspNetCoreMetrics();
builder.Metrics.AddOpenTelemetry(builder => builder.AddOtlpExporter());

The AddXXXMetrics() APIs are new hypothetical extension methods provided by OpenTelemetry instrumentation libraries. These calls produce metrics under a Meter name defined by the OTel instrumentation libraries such as "OpenTelemetry.Instrumentation.AspNetCore". If the metrics are left enabled in appsettings.json then both OpenTelmetry and built-in metrics would be transmitted. The app developer should exlucde built-in metrics from the list if they don't want both.

Option #2: Use APIs provided by runtime libraries (Optional future development, does not ship in .NET 8)

var builder = WebApplication.CreateBuilder(args);
builder.Metrics.AddRuntimeMetrics(RuntimeMetricNaming.OpenTelemetry1_0);
builder.Metrics.AddAspNetCoreMetrics(AspNetCoreMetricNaming.OpenTelemetry1_0);
builder.Metrics.AddOpenTelemetry(builder => builder.AddOtlpExporter());

Once OpenTelemetry has stable conventions it is possible for the libraries that produce metrics to also include extension methods that apply OpenTelemetry naming rules. The enumeration would likely start with two options, the default metric names shipped originally and OpenTelemetry's names. Over time more options might be added to the enumeration for updated versions of OpenTelemetry's conventions or other standards that have gained a broad interest. Unlike option #1, these metrics are produced with the default Meter name for the library such as "Microsoft.AspNetCore" and it replaces the default metric rather than adding new metrics in parallel.

App dev wants to enable metrics from certain Meters to get sent (config-based approach)

{
  "Metrics": {
    "EnabledMetrics": {
      "System": true,
      "Microsoft.AspNetCore": true
      "Microsoft.Extensions.Caching": true // hypothetical runtime provided Meter
      "StackExchange.Redis": true          // hypothetical Meter in a 3P library
    }
  }
}

App dev wants to enable metrics from certain Meters to get sent (programmatic approach)

var builder = WebApplication.CreateBuilder(args);
builder.Metrics
  .EnableMetrics("Microsoft.Extensions.Caching")
  .EnableMetrics("StackExchange.Redis")
  .AddOpenTelemetry(builder => builder.AddOtlpExporter());

App dev wants to enable metrics from certain Meters to get sent (custom-instrumentation approach)

Meters need to be created in order for turning them on to have any effect. In some cases it may be convenient to both create them and turn them on as a single step using a strongly typed extension method:

var builder = WebApplication.CreateBuilder(args);
builder.Metrics
  .AddSqlClientInstrumentation()
  .AddOpenTelemetry(builder => builder.AddOtlpExporter());

This AddSqlClientInstrumentation method is a hypothetical extension method that could be offered by an OpenTelemetry instrumentation package. Note that some instrumentation packages are likely to be duplicative with built-in metrics. If the app developer doesn't want both then they should ensure the built-in metrics aren't included in their configuration.

App dev wants to disable sending metrics for some instruments on a Meter (config-based approach)

{
  "Metrics": {
    "EnabledMetrics": {
      "System": true,
      "Microsoft.AspNet.Core": {
        "Default": true,               // enable default metrics for all the instruments
        "connection-duration": false   // but exclude the connection-duration metric
      }
    }
  }
}

App dev wants to only enable specific instruments on a Meter (config-based approach)

{
  "Metrics": {
    "EnabledMetrics": {
      "System": true,
      "Microsoft.AspNet.Core": {
        "connection-duration": true   // enable only the connection-duration metric
      }
    }
  }
}

App dev wants to only enable specific instruments on a Meter (API-based approach)

var builder = WebApplication.CreateBuilder(args);
builder.Metrics
  // enable only the duration-counter metric
  .EnableMetrics("Microsoft.AspNetCore", "duration-counter");
  .AddOpenTelemetry(builder => builder.AddOtlpExporter());

App dev wants to disable sending metrics for some instruments on a Meter (API-based approach)

var builder = WebApplication.CreateBuilder(args);
builder.Metrics
  // enable all default metrics for all instruments on this Meter except duration-counter
  .EnableMetrics("Microsoft.AspNetCore", instrument => instrument.Name != "duration-counter");
  .AddOpenTelemetry(builder => builder.AddOtlpExporter());

spec

Microsoft.Extensions.Diagnostics assembly

namespace Microsoft.Extensions.DependencyInjection;

public static class MetricsServiceCollectionExtensions
{
    public static IServiceCollection AddMetrics(this IServiceCollection);
    public static IServiceCollection AddMetrics(this IServiceCollection, Action<IMetricsBuilder> configure);
}

Microsoft.Extensions.Diagnostics.Abstractions assembly

namespace Microsoft.Extensions.Diagnostics.Metrics;

public interface IMetricsBuilder
{
    IServiceCollection Services { get; }
}

public static class MetricsBuilderExtensions
{
    public static IMetricsBuilder AddListener(this IMetricsBuilder builder, IMetricsListener listener);
    public static IMetricsBuilder ClearListeners(this IMetricsBuilder builder);
}

public interface IMetricsSource
{
    public void RecordObservableInstruments();
}

public interface IMetricsListener
{
    public void SetSource(IMetricsSource source);
    public object? InstrumentPublished(Instrument instrument);
    public void MeasurementsCompleted(Instrument instrument, object? userState);
    public MeasurementCallback<T> GetMeasurementHandler<T>();
}

public class MetricsEnableOptions
{
    public IList<InstrumentEnableRule> Rules { get; }
}

public class InstrumentEnableRule
{
    public InstrumentEnableRule(string? listenerName, string? meterName, MeterScope scopes, string? instrumentName, Action<string?, Instrument, bool> filter);
    public string? ListenerName { get; }
    public string? MeterName { get; }
    public MeterScope Scopes { get; }
    public string? InstrumentName { get; }
    public Func<string?, Instrument, bool>? Filter { get; }
}

public static MetricsBuilderEnableExtensions
{
    // common overloads
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName) where T : IMetricsListener;
    
    // less common overloads
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName, string? instrumentName);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName, string? instrumentName) where T : IMetricsListener;
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, Func<Instrument, bool> filter);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, Func<Instrument, bool> filter) where T : IMetricsListener;
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName, MeterScope scopes);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName, MeterScope scopes) where T : IMetricsListener;
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName, Action<Instrument, bool> filter);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName, Action<Instrument, bool> filter) where T : IMetricsListener;
    public static IMetricsBuilder EnableMetrics(this IMetricsBuilder builder, string? meterName, MeterScope scopes, Action<Instrument, bool> filter);
    public static IMetricsBuilder EnableMetrics<T>(this IMetricsBuilder builder, string? meterName, MeterScope scopes, Action<Instrument, bool> filter) where T : IMetricsListener;
    
    // all the same extension methods on MetricsOptions
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName, string? instrumentName) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName, string? instrumentName);
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, Func<Instrument, bool> filter);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, Func<Instrument, bool> filter) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName, MeterScope scopes);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName, MeterScope scopes) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName, Action<Instrument, bool> filter);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName, Action<Instrument, bool> filter) where T : IMetricsListener;
    public static MetricsEnableOptions EnableMetrics(this MetricsEnableOptions options, string? meterName, MeterScope scopes, Action<Instrument, bool> filter);
    public static MetricsEnableOptions EnableMetrics<T>(this MetricsEnableOptions options, string? meterName, MeterScope scopes, Action<Instrument, bool> filter) where T : IMetricsListener;
}

[Flags]
public enum MeterScope
{
    Global,
    Local
}

Integration with configuration

We want users to be able to configure which metrics to capture for different listeners using appsettings.json or other IConfiguration sources

Data format

Metrics is the top level section under which "EnabledMetrics" configures which metrics are enabled for all listeners and a section named with a listener name adds additional rules specific to that listener. The key under "EnabledMetrics" is a Meter name prefix and the value is true if the Meter should be enabled or false if disabled.

{
    "Metrics": {
        "EnabledMetrics": {
            "System.Runtime": true
        },
        "OpenTelemetry": {
            "EnabledMetrics": {
                "Microsoft.AspNetCore": true
            }
        }
    }
}

Similar to logging, "Default" acts as a zero-length meter prefix creating a default rule applying to all Meters if no longer prefix rule matches. Logically this allows creating opt-in or opt-out policies. If no rule matches a Meter it will not be enabled, so omitting "Default" is the same behavior as including Default=false.

{
    "Metrics": {
        "EnabledMetrics": {
            "Default": true
            "NoisyLibrary": false
        },
    }
}

To enable individual instruments within a Meter replace the boolean with object syntax where the keys are instrument names or prefixes and the value is true iff enabled. "Default" is also honored at this scope as the zero-length instrument name prefix. Not specifying a "Default" is the same as "Default": false.

{
    "Metrics": {
        "EnabledMetrics": {
            "System.Runtime": {
                "gc": true,
                "jit": true
            }
            "Microsoft.AspNet.Core": {
                "Default": true,
                "connection-duration": false
            }
        }
    }
}

To make rules apply only to Meters at a specific Global or Local scope instead of "EnabledMetrics" you can use "EnabledGlobalMetrics" or "EnabledLocalMetrics". Rules in a "EnabledMetrics" section apply to Meters in both scopes. The expectation is that using this local/global config is rare and maybe we should just remove it?

{
    "Metrics": {
        "EnabledGlobalMetrics": {
            "System.Runtime": true
        }
        "EnabledLocalMetrics": {
            "System.Net.Http": true
        }
        "EnabledMetrics": {
            "Meter1": true,
            "Meter2": true
        }
    }
}

API

Microsoft.Extensions.Diagnostics.Configuration.dll:

namespace Microsoft.Extensions.Diagnostics.Metrics
{
    public static class MetricsBuilderExtensions
    {
        public static IMetricsBuilder AddConfiguration(this IMetricsBuilder builder, IConfiguration configuration);
    }
}

namespace Microsoft.Extensions.Diagnostics.Metrics.Configuration
{
    public interface IMetricListenerConfigurationFactory
    {
        IConfiguration GetConfiguration(Type listenerType);
    }
    
    public interface IMetricListenerConfiguration<T>
    {
        IConfiguration Configuration { get; }
    }
}

Integration with Hosting APIs

We should include a Metrics property in parallel where existing hosting APIs expose Logging

public class HostApplicationBuilder //pre-existing
{
    public IMetricsBuilder Metrics { get; }
}

public class WebApplicationBuilder //pre-existing
{
    public IMetricsBuilder Metrics { get; }
}

public static class HostingHostBuilderExtensions //pre-existing
{
    public static IHostBuilder ConfigureMetrics(this IHostBuilder hostBuilder, Action<IMetricsBuilder> configureMetrics);
    public static IHostBuilder ConfigureMetrics(this IHostBuilder hostBuilder, Action<HostBuilderContext, IMetricsBuilder> configureMetrics);
}

Default sinks

We should include a simple console output of the metric measurements intended for debugging purposes. This doesn't have any of the customization that the ILogger console does because it isn't intended to be left enabled in production code. Recording raw measurements to text is very inefficient.

public static class ConsoleMetricsExtension
{
    public static IMetricsBuilder AddConsole(this IMetricsBuilder builder);
}
@JamesNK
Copy link

JamesNK commented May 13, 2023

Something that R9 folks have mentioned several times is trimming unnecessary tags (from their perspective) from exported results. What about providing a way to customize tags while listening? Add, remove, rename, etc. Or do you feel this is something better left in the exporter layer for performance?

I think we should figure out between us and OTel how to handle this scenario.

@CodeBlanch
Copy link

@noahfalk In the spec section would it be possible to add the assembly destination for the new types? Safe to assume IMetricsBuilder, MetricsBuilderExtensions, IMetricSource, and IMetricListener will go into Microsoft.Extensions.Diagnostics.Abstractions? It will probably need a dependency on M.E.DI.Abstractions to make IServiceCollection available? Just trying to think through how this will ripple through OTel.NET / Auto-instrumentation.

PS: Seems like there is mixed usage of plural Metrics and singular Metric in the type names. Probably will sort out in review just thought I'd mention it.

@CodeBlanch
Copy link

Couple more questions after finishing a read through...

  • I don't totally follow the scope (MeterScope) concept. Would you mind flushing that out a bit?
  • What is MeterFactoryOptions for?

@samsp-msft
Copy link

samsp-msft commented Jun 27, 2023

We already have a mechanism for AddOpenTelemetry() that exposes metrics - why do we need an alternative from the other direction? I fear this is going to add confusion.

Could the mapping be part of a Listener, and have an API to remap as part of that listener - Are there scenarios other than OTel that really need the name mapping?

I would suggest that the main consumers are in the Instrumentation Providers for OTel - so AddOpenTelemetry().Metrics.AddAspNetInstrumentation() would include the mapping inside its implementation so that OTel sees names that follow their schema.

If I explicitly added the Meter for Microsoft.AspNet.Hosting (sp?) it would give the native names, and not remapped ones.

@noahfalk
Copy link
Author

@noahfalk In the spec section would it be possible to add the assembly destination for the new types?

Sure thing. You are correct that probably almost all of it goes in Abstractions and Abstractions gains a dependency on M.E.DI.Abstractions. I think the only part of the public API that doesn't go in Abstractions would be the AddMetrics() API, and then of course the implementation that is internal.

Seems like there is mixed usage of plural Metrics and singular Metric in the type names

Its possible I just messed some up :) Alternative naming suggestions are also welcome.

What is MeterFactoryOptions for?

I think that is a left-over from earlier versions of IMeterFactory that had tags. We got rid of those tags in the past but I forgot to remove it from the design. I'll confirm and then likely get rid of it.

I don't totally follow the scope (MeterScope) concept. Would you mind flushing that out a bit?

The feature that introduced IMeterFactory created a new scope parameter on Meters. Consider:

ServiceCollection collection1 = new ServiceCollection();
collection1.AddMetrics();
IServiceProvider provider = collection1.BuildServiceProvider();

// This Meter has scope=null which means it is global
Meter global = new Meter("System.Net.Http"); 
// This Meter has scope=meterFactory which means it is local to this DI container
Meter local = provider.GetRequiredService<IMeterFactory>().Create("System.Net.Http");

If the MeterScope is unspecified calling EnableMetrics("System.Net.Http") would bind to instruments on both of the above Meters if the app created them. Specifying MeterScope.Global would only bind to the global Meter, and specifying MeterScope.Local would only bind to the local Meter. If you created a 2nd DI container and called
AddMetrics(builder => builder.EnableMetrics("System.Net.Http")) in that one it would still have visibility on the
global Meter, but it would not have visibility on the local Meter created in DI container 1.

@noahfalk
Copy link
Author

Are there scenarios other than OTel that really need the name mapping?

Tools/projects/companies create their own naming schemes as well. For example inside Microsoft we have Microsoft Common Schema.

@noahfalk
Copy link
Author

Seems like there is mixed usage of plural Metrics and singular Metric in the type names

I made a few changes and attempted to standardize it, but there are different naming patterns that could advocate for both singular and plural forms in some places. I may still have missed some places and accidentally left it inconsistent. Let me know if there are places where the names differ from what felt natural to you.

@noahfalk
Copy link
Author

noahfalk commented Jul 6, 2023

Based on recent conversations I removed all the renaming APIs, leaving only ability to select which Meters/Instruments should be enabled and which listeners they should be sent to.

@Tratcher
Copy link

Tratcher commented Jul 17, 2023

MetricsBuilderEnableExtensions references a few non-existent types:

  • MetricsOptions -> MetricsEnableOptions?
  • IMeasurementListener -> IMetricsListener?

Others:

  • namespace Microsoft.Extensions.Metrics -> Microsoft.Extensions.Diagnostics.Metrics?

@noahfalk
Copy link
Author

Yeah, all of your guesses are correct 👍 , let me fix it inline.

@Tratcher
Copy link

Note HostApplicationBuilder now has a base interface IHostApplicationBuilder where this should be added too.

@Tratcher
Copy link

Any idea why System.Diagnostics.Metrics.Instrument isn't CLSCompliant? It's causing a viral spread of <CLSCompliant>false</CLSCompliant> up the stack.

@noahfalk
Copy link
Author

Offhand, no I didn't even realize Instrument was non-compliant.

@tarekgh - does this ring a bell for you? Separately does our API design care much about CLS-compliance these days? I'm not sure if we should be trying to do anything about this or just mark everything non-compliant and move on.

@tarekgh
Copy link

tarekgh commented Jul 19, 2023

The CLS compliance is turned off on the whole System.Diagnostics.DiagnosticSource library and not only the Instrument or metric classes. This is done long ago in the PR dotnet/corefx#33207 by Vance. I have no idea why he did that. @noahfalk you were one of the reviewers of this PR.

I tried to turn on the CLS compliance on the library then building and running the tests and I am not seeing any problem so far. I am not sure if there is any risk to the library consumers if we turn on the CLS compliance.
In general, in .NET Core there are many APIs exposed that are not CLS compliant, so I don't think anyone cares much about that anymore.

@Tratcher
Copy link

Tratcher commented Jul 19, 2023

I tried to turn on the CLS compliance on the library then building and running the tests and I am not seeing any problem so far. I am not sure if there is any risk to the library consumers if we turn on the CLS compliance.

Great, can we please turn that on so it doesn't go viral up the stack? Want me to submit a PR?

@tarekgh
Copy link

tarekgh commented Jul 19, 2023

I'll open the PR and look if CI gates will pass and wait for @noahfalk advise if he can see any risk.

@tarekgh
Copy link

tarekgh commented Jul 19, 2023

I have opened the PR dotnet/runtime#89214.

@Tratcher
Copy link

If the MeterScope is unspecified calling EnableMetrics("System.Net.Http") would bind to instruments on both of the above Meters if the app created them. Specifying MeterScope.Global would only bind to the global Meter, and specifying MeterScope.Local would only bind to the local Meter. If you created a 2nd DI container and called
AddMetrics(builder => builder.EnableMetrics("System.Net.Http")) in that one it would still have visibility on the
global Meter, but it would not have visibility on the local Meter created in DI container 1.

InstrumentEnableRule requires MeterScope to be specified as Global or Local. Should it be nullable?

@noahfalk
Copy link
Author

It is a flags enum so I was imagining setting it to Global | Local if you want it to apply to both. Does that work or you still forsee issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment