@rynowak
Last active October 4, 2019 03:13

On Linkability and Size

Optimizing a .NET application for size requires us to develop new muscles, new tools, new designs, and new criteria for decision making. We typically provide customer value through building rich and powerful APIs. Optimizing for size asks something new of us: granular APIs, uncoupled features, and statically-analyzable patterns.

Of the top ten non-Microsoft .NET library packages on nuget.org, six serve the primary purpose of doing dynamic code invocation (IoC):

  • Newtonsoft.Json (Serializer)
  • Castle.Core (Reflection/IL utilities)
  • Moq (Mocking/Proxies/IL Generation)
  • AutoMapper (Object-Mapping)
  • NUnit (Test Framework)
  • xUnit (Test Framework)

It's clear that whatever problems we face in delivering solutions that are linkable, our community and thus all end users will also face.

Knowing .NET customers, if we're going to be successful, we need to not only make the set of features we ship 'just work'; we also need to give library authors a chance to adopt the same solutions. In general it's not possible to assume that all libraries can just 'opt out', because IoC-type features are viral by nature.


To help contextualize the linker: when do we care about size? I would say we care about size when our customers do - what are the use cases?

  • Console tools
  • Cloud infrastructure
  • Small server apps (we assume)
  • Blazor WASM

Of these, Blazor WASM is unique in that it doesn't use CoreCLR - a Blazor WASM app is already very small (<5mb uncompressed, 2mb compressed). In a Blazor app, the total size of the managed code is bigger than that of the runtime.


There are two different ways to fail at being "linker-friendly" - either too much is trimmed (a false negative: needed code is removed), or too little is trimmed (a false positive: dead code is retained). I'll give less coverage to the false-positive cases because Jan's already written an excellent document on that.

Understanding Tree-Shaking

The term tree-shaking is used widely, and tree-shaking is understood to be brittle - but there are other kinds of dead-code-elimination that aren't quite so impactful. It's important to establish that tree-shaking specifically refers to dead-code elimination with a set of properties that make it especially aggressive:

  • Start from the application/module entry point(s)
  • Visit all possible types/fields/methods that are definitely reachable via some control-flow
  • Remove members and types that were not visited

The consequence of this is that any method the linker cannot prove is invoked will be removed.
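Those three steps amount to a reachability (mark-and-sweep) computation over a call graph. A minimal sketch, assuming a toy call graph keyed by method name - this is not the linker's actual data model:

```csharp
using System;
using System.Collections.Generic;

class TreeShaker
{
    // Returns the set of methods reachable from the entry points;
    // everything not in this set is eligible for removal.
    public static HashSet<string> ComputeReachable(
        Dictionary<string, string[]> callGraph, IEnumerable<string> entryPoints)
    {
        var visited = new HashSet<string>();
        var worklist = new Queue<string>(entryPoints);
        while (worklist.Count > 0)
        {
            var method = worklist.Dequeue();
            if (!visited.Add(method)) continue; // each node is visited at most once
            foreach (var callee in callGraph.GetValueOrDefault(method, Array.Empty<string>()))
                worklist.Enqueue(callee);
        }
        return visited;
    }
}
```

Anything only referenced by an unreachable method (a method never enqueued from an entry point) never enters the visited set, which is exactly why dynamically-invoked code disappears.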

We can derive from this a better understanding of our two sets of problems:

  • False Negative: Method is expected to be invoked, but without an analyzable call-site.
  • False Positive: Method is not expected to be invoked, but appears in some control-flow path.

False Negatives (Too Much Trimming)

This is an analysis of false negative cases through the lens of ASP.NET Core. These kinds of cases could easily appear in any library, but ASP.NET Core has a variety of IoC-type features, so it's useful as a set of motivating examples.

Pessimism and Instantiability

The linker does not treat types as instantiable when it does not see a call to the constructor. Here's an example:

services.AddTransient<IWidgetFactory, DatabaseWidgetFactory>();

...

[ApiController]
public class WidgetController : ControllerBase
{
    [HttpGet("/widgets")]
    public async Task<ActionResult<Widget>> GetWidget([FromServices] IWidgetFactory factory)
    {
        ...
    }
}

This is idiomatic ASP.NET Core code to make the DatabaseWidgetFactory available to the app through Dependency Injection. Users choose this kind of pattern because it's friendly to unit-testing, and reflects maintainable practices like coding to an abstraction rather than an implementation.

However, to the linker, the only use it sees of DatabaseWidgetFactory is in the generic signature of AddTransient<,> - there are no method calls or accesses to any of its fields, so it's fair game to remove. The linker has a different set of heuristics based on whether or not it sees the class as instantiated. An instantiable class will have all of its interface implementations and at least one constructor preserved. A non-instantiable class won't have any constructors (the linker will even remove the default constructor).

Naturally without doing something to address this, the application will crash when attempting to instantiate DatabaseWidgetFactory.
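One manual escape hatch today is a linker XML descriptor that explicitly roots the type. A sketch of that config, assuming the app assembly is named MyApp (the assembly and namespace names here are invented for the example):

```xml
<!-- Linker descriptor: explicitly roots DatabaseWidgetFactory so its
     constructors and interface implementations survive trimming. -->
<linker>
  <assembly fullname="MyApp">
    <type fullname="MyApp.DatabaseWidgetFactory" preserve="all" />
  </assembly>
</linker>
```

This works, but it pushes the burden onto each application author to enumerate every dynamically-constructed type by hand.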

ASP.NET Core features in this category:

  • Middleware
  • DI
  • Options
  • Activator.CreateInstance
  • ActivatorUtilities.CreateInstance

One of the interesting special cases here is that the linker can remove constructors, which leads to violation of the new() constraint. This is a quirk of how some of our APIs are defined in 3.0, where an implementation class has the new() constraint but the interface does not.
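A hedged illustration of the shape of that quirk - all names here are invented. The interface method carries no new() constraint, so call-sites coded against the interface give the linker no hint that a parameterless constructor must survive; only the implementation's generic overload declares the constraint:

```csharp
using System;

public interface IWidgetCache { }
public class DefaultWidgetCache : IWidgetCache { }

public interface IServiceActivator
{
    // No new() constraint here - callers coding against the interface
    // give the linker no reason to preserve a parameterless constructor.
    IWidgetCache Activate(Type type);
}

public class ServiceActivator : IServiceActivator
{
    public IWidgetCache Activate(Type type) =>
        (IWidgetCache)Activator.CreateInstance(type);

    // The implementation's overload does carry the constraint, but only
    // call-sites that use this overload make the requirement visible.
    public T Activate<T>() where T : IWidgetCache, new() => new T();
}
```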

Late-Binding Discovery

A secondary problem is when types light up some extensibility based on their mere existence. We can reuse an example:

[ApiController]
public class WidgetController : ControllerBase
{
    [HttpGet("/widgets")]
    public async Task<ActionResult<Widget>> GetWidget([FromServices] IWidgetFactory factory)
    {
        ...
    }
}

MVC has functionality that will go find types following a certain set of patterns and wire them up. This pattern is totally broken by the linker, because the linker sees these types as dead code.
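Roughly, the discovery is a reflection scan over types matching a convention. A simplified sketch - this is not MVC's actual implementation, just an illustration of why nothing in the scan gives the linker a static reference to root:

```csharp
using System;
using System.Linq;
using System.Reflection;

static class ControllerDiscovery
{
    // Simplified stand-in for MVC's convention: a public, non-abstract
    // class whose name ends in "Controller" is treated as a controller.
    public static Type[] FindControllers(Assembly assembly) =>
        assembly.GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract && t.IsPublic
                        && t.Name.EndsWith("Controller", StringComparison.Ordinal))
            .ToArray();
}

public class WidgetController { } // discovered by the scan above, but never referenced statically
```

From the linker's perspective, `assembly.GetTypes()` references every type and none of them - there is no analyzable edge from the scan to WidgetController.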

ASP.NET Core features in this category:

  • Controllers
  • Pages
  • View Components

Solution: Pattern Recognition

We've prototyped an approach using custom passes in the linker to recognize patterns and feed additional data into the linker. The linker has support for injecting custom passes, but it's very rudimentary right now. To use this in anger we would have to improve some aspects of how the linker is packaged and integrated with the SDK.


In the case of controllers - this is actually a great solution. The source of truth for the runtime is types that match a certain pattern, and this is easy to capture in the linker. For something like MVC's controllers this is easy to get right and would be easy for us to maintain.


In the case of dependency injection - we were able to prototype an approach, but it's relatively involved, and it has flaws.

The primary problem is that the source of truth for DI is imperative; we can see both of these cases appear:

// Declarative - easy to analyze
services.AddSingleton<IWidgetFactory, DatabaseWidgetFactory>();

// Imperative - can't reliably analyze
services.AddSingleton(typeof(IWidgetFactory), Type.GetType(Configuration["FactoryType"]));

Our DI system supports calls that accept Type objects from arbitrary expressions. Moreover, even simple code could just get the analysis wrong. The linker doesn't have data-flow analysis built in - the state of the art for linker heuristics is a backwards scan over the IL looking for a ldtoken.

// Trivially defeats analysis
Type x = typeof(DatabaseWidgetFactory); // ldtoken here
Console.WriteLine("My favorite type is " + typeof(ConditionalWeakTable<>)); // ldtoken everywhere
services.AddSingleton(typeof(IWidgetFactory), x);

The additional complication is that this problem is transitive. We have APIs like this in ASP.NET Core:

services.AddHostedService<BackgroundWidgetProcessor>();

...

public static class ServiceCollectionHostedServiceExtensions
{
    public static IServiceCollection AddHostedService<THostedService>(this IServiceCollection services)
        where THostedService : class, IHostedService
    {
        services.TryAddEnumerable(ServiceDescriptor.Singleton<IHostedService, THostedService>());
        return services;
    }
}

These viral cases are problematic because handling them would require a change to the linker's algorithm. The linker is iterative and does a breadth-first search - it does not analyze nodes repeatedly; each one is visited at most once.

When we analyze services.AddHostedService<BackgroundWidgetProcessor>(); in an application, we don't know that AddHostedService results in a service registration. When we analyze AddHostedService, we don't know that it was called with BackgroundWidgetProcessor. AddHostedService is not unique - there are many cases where an API exists as a convenient or discoverable alternative to some other DI/activation primitive.


There are some supplements we could add to transform pattern recognition into a usable feature set.

First, we add a [DynamicallyConstructed] attribute. This doesn't have to be defined in the BCL; it could be matched based on a name. This is a marker that can be applied to a type, a type parameter (see note below), or a parameter (of type Type) that instructs the linker about the intentions.

note: I know we can't apply an attribute to a type parameter in C#. We'd have to choose another encoding to mean this.
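A sketch of what the marker could look like - the attribute and the ActivatorApi wrapper are both hypothetical, and as noted, the type-parameter target would need a different encoding:

```csharp
using System;

// Hypothetical marker: tells the linker that types flowing into the
// annotated parameter (or the annotated type itself) are constructed
// dynamically, so their constructors must be preserved.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Parameter, AllowMultiple = false)]
public sealed class DynamicallyConstructedAttribute : Attribute { }

public static class ActivatorApi
{
    // The annotation would instruct the custom linker pass to keep the
    // constructors of any type observed flowing into this parameter.
    public static object CreateInstance([DynamicallyConstructed] Type type) =>
        Activator.CreateInstance(type);
}
```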

Second, we write a linker pass like the prototype, but keyed on this attribute rather than hardcoded to a set of APIs. This pass will preserve all constructors of a type that's used at a [DynamicallyConstructed] site.

Next, we need to deal with the cases that escape this analysis. The best idea I have is to write an analyzer to handle two cases:

  • APIs that pass a type through to a [DynamicallyConstructed] site
  • Usage of arbitrary expressions to pass a Type to a [DynamicallyConstructed] site

This ends up being a lot of work, but it would make the linker feature more usable, and it solves these problems in a way that can scale. The reason I arrived at [DynamicallyConstructed] as the solution is that for cases like DI, once the type is seen as instantiable by the linker, it will do correct analysis of everything else.

You could also imagine a [DynamicallyInvocable] attribute as a companion that says to preserve all members - this would be suitable for solving problems like MVC's controllers, where any public API might be called. However, this alone isn't enough to solve the problem - MVC's controllers aren't rooted by anything; they are discovered dynamically. We could change this and only support manual registration of controllers for these scenarios - or we could support a mode for these APIs that also declares the type as a root.

Addendum: Is it wicked not to care?

All of this kit could probably be simplified if we want a low-cost way of making ASP.NET Core survive the linker. One basic heuristic would solve all of these problems.

If you visit a typeof(T) or a generic argument <T>, then preserve all of T's constructors. Early versions of our prototype linker pass used this heuristic.

We haven't yet measured the difference in trimming between this heuristic and the more complex version of the logic that we ended up with.

Solution: Static Code Generation

We've also considered in the past using static code generation to solve these problems. The idea is that all of the dynamism we need could be code-generated inside the linker or compiler, and then the linker would have something easy to analyze.

The potential of static code generation is very similar to pattern recognition. In fact, you can think of it as one click-stop past pattern recognition by the linker.

  • Patterns like MVC controllers (or JSON Serialization) are trivial to analyze, and would be easy to generate with static code.
  • Patterns like DI are lossy when analyzed, and so generated code would need to support fallback cases, or lower fidelity.

The success of an approach like this is contingent on the kind of programming experience offered, and how analyzable it is.

Addendum: Adding Runtime Sources of Truth

Features like Controllers (or JSON Serialization) have an additional source of truth beyond what can be statically reasoned about. Users can write imperative code, or provide additional configuration that is applied on top of the data gleaned from reflection. This is valuable to users, and it's not something we need to sacrifice.

The approach that's used today for IoC-Type systems is usually:

Reflection -> Descriptor -> Runnable Code

Consider as a motivating example this pseudocode representing MVC's initialization:

var controllerDescriptors = new List<ControllerDescriptor>();
foreach (var (method, type) in GetControllerMethods())
{
    var controllerDescriptor = new ControllerDescriptor();
    controllerDescriptor.FactoryFunc = MakeFactory(type);
    controllerDescriptor.InvokeFunc = MakeInvokeFunc(method);
    
    controllerDescriptor.Parameters = new List<ParameterDescriptor>();
    foreach (var parameter in method.GetParameters())
    {
        var parameterDescriptor = new ParameterDescriptor();
        parameterDescriptor.Binder = MakeBinder(parameter);
        controllerDescriptor.Parameters.Add(parameterDescriptor);
    }

    RunUserCallbacks(controllerDescriptor); // Let user-code modify any detail of this
    controllerDescriptors.Add(controllerDescriptor);
}

...

// DRASTICALLY simplified example of reflection done by the framework
private static Func<HttpContext, Task<object>> MakeBinder(ParameterInfo parameter)
{
    if (parameter.GetCustomAttribute(typeof(FromBodyAttribute)) != null)
    {
        return context => JsonSerializer.DeserializeAsync(context.Request.Stream, parameter.ParameterType);
    }
    else
    {
        // do something else
    }
}

And with static code generation:

var controllerDescriptors = new List<ControllerDescriptor>();

var controllerDescriptor = new ControllerDescriptor();
controllerDescriptor.FactoryFunc = () => new MyController();
controllerDescriptor.InvokeFunc = (context, instance) => instance.MyMethod(context);

controllerDescriptor.Parameters = new List<ParameterDescriptor>();

var parameterDescriptor = new ParameterDescriptor();
parameterDescriptor.Binder = context => JsonSerializer.DeserializeAsync(context.Request.Stream, typeof(Widget)); // concrete type known at codegen time
controllerDescriptor.Parameters.Add(parameterDescriptor);

RunUserCallbacks(controllerDescriptor); // Let user-code modify any detail of this
controllerDescriptors.Add(controllerDescriptor);

If we retain the approach of Reflection -> Descriptor -> Runnable Code we don't have to lose any power, as long as we can generate the runnable code statically and store it on the descriptor.

Solution: Design Static APIs

The other solution would be to build APIs that are statically analyzable as alternatives, and help users through the transition. For consideration, we could build APIs for dependency injection that are declarative-only, or strongly steer users towards declarative solutions.

It's much maligned (by some people), but MEF somewhat solved this problem long ago. The way to declare a service (part in MEF terminology) was to decorate it with an attribute. Dynamism in MEF is represented with metadata and selectors.

However, MEF's strategy isn't a complete fit for what we're trying to accomplish because it's not linkable. If you imagine that we had a model like MEF, we'd have to assume that every declared service is used in ASP.NET Core - that's clearly a regression from the current state.


We could make DI linkable quite easily by removing the ability to pass a Type into it.

One idea that's appealing here is to have an analyzer that's activated by turning on the linker. We'd have to remove our usage of Type overloads and similar features in ASP.NET Core's code (or annotate them properly), and then users would be warned by the analyzer.

The static + analyzer approach could scale to libraries as well. A library author who cares about linkability doesn't need to remove un-analyzable constructs; they would just need to annotate them as such. This helps because the usability of the library is not sacrificed for all users, but we gain diagnostics for linkability.


This doesn't address how to make DI totally static.

The major problem with this is our DI system has imperative semantics. The order of registering services is significant, and that's not something the linker is prepared to reason about. We also allow the lifetime of a service to be runtime-configured, which is not something we should allow if we want to be totally static - it affects the codegen.

Fundamentally the challenge here is that even if we make each DI callsite analyzable, that doesn't make the entire service graph analyzable.

if (IsTuesday())
{
    services.AddSingleton<ILunchProvider, TacoTruck>();
}
else
{
    services.AddSingleton<ILunchProvider, Cafe25>();
}

I think the closest we could get to static DI based on our current system would be:

  • Remove (or ban) registration methods that accept Type
  • Perform analysis of each registration
    • Generate a 'factory' method for each tuple of (service, type, lifetime)
    • Generate code that 'pre-registers' each of these factory methods
  • Execute service registration code at startup
  • Choose from the set of registered factories based on the registrations
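The output of the generation steps might look like the following sketch - all names are hypothetical, and real codegen would be considerably more involved:

```csharp
using System;
using System.Collections.Generic;

public interface ILunchProvider { }
public class TacoTruck : ILunchProvider { }

// Hypothetically generated at build time: one factory per
// (service, implementation, lifetime) tuple found by analysis.
public static class GeneratedFactories
{
    public static readonly Dictionary<(Type Service, Type Impl, string Lifetime), Func<object>> Table
        = new Dictionary<(Type, Type, string), Func<object>>
        {
            [(typeof(ILunchProvider), typeof(TacoTruck), "Singleton")] = () => new TacoTruck(),
        };

    // At startup, the imperative registrations select from this
    // pre-generated set - no reflection or IL emit is needed.
    public static object Create(Type service, Type impl, string lifetime) =>
        Table[(service, impl, lifetime)]();
}
```

Because the startup code only chooses among pre-generated factories, the imperative ordering semantics survive, at the cost of carrying a factory for every registration the analysis saw - including ones never selected at runtime.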

The flaw with this approach is that the factory method will have lower fidelity than what we can do today with IL emit. There are optimizations to be gained from knowing the exact set of services while doing codegen - and we've done this inside our DI system. We would lose these optimizations when moving to a static-codegen based system because of the need for more indirection.

Addendum: PGO for DI

We could augment all of this kit with PGO for DI. We could run the application after building it, and then exfiltrate somehow the exact set of service registrations.

This addresses the problems of doing static codegen for DI - but there are drawbacks to this:

  • Users may not be able to run the application in the environment they use to build
  • DI service registration might vary based on runtime characteristics

My concern is that if these problems are high-profile enough to affect users, we won't turn this on by default. And if we don't turn it on by default, why did we build it?

False Positives (Understanding the Size of .NET Code)

Since Jan did a good writeup of these kinds of cases, I want to focus on how we can analyze and contextualize the size of features.

We're close to having a good story for computing the cost of calling a particular API. What exists so far:

So we have a graph, and we have a way to annotate that graph, but there are a few more things we'd need to do:

  • Walk the entire BCL/SF and produce a DAG
  • Add the ability to analyze the cost of metadata as well as IL

Now that we have all of the data, what we want to do is take all of the APIs used by the app and sort them by the cost of everything they dominate (descending). This gives us the ability to look at each API call in the app and get the non-amortized cost of that individual call. You could imagine this as a flame graph or tree-view if it helps.
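One simple way to approximate the "cost of everything an API dominates" is to recompute reachability with that node excluded and measure how much reachable size disappears. A sketch, assuming a toy graph with per-node sizes in bytes (not the real analysis tooling):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SizeAnalysis
{
    static HashSet<string> Reachable(
        Dictionary<string, string[]> graph, string root, string excluded = null)
    {
        var seen = new HashSet<string>();
        var stack = new Stack<string>();
        stack.Push(root);
        while (stack.Count > 0)
        {
            var node = stack.Pop();
            if (node == excluded || !seen.Add(node)) continue;
            foreach (var next in graph.GetValueOrDefault(node, Array.Empty<string>()))
                stack.Push(next);
        }
        return seen;
    }

    // Non-amortized cost of `api`: bytes reachable from the root only
    // through `api` - i.e., what trimming that one call would free.
    public static int DominatedCost(
        Dictionary<string, string[]> graph, Dictionary<string, int> size,
        string root, string api)
    {
        var withApi = Reachable(graph, root);
        var withoutApi = Reachable(graph, root, excluded: api);
        return withApi.Except(withoutApi).Sum(n => size[n]);
    }
}
```

Sorting every API call-site by this number (descending) yields the prioritized list described above.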

There are some problems with doing just this kind of analysis, but it seems like a good starting point.

We could remix this data in a few other ways:

  • Compute the non-amortized cost of a nuget package
  • Build IDE experiences that show you this data in codelens
  • Tag bubbles for expensive features like regex or xml