@steveharter
Last active January 11, 2022 17:59
Reflection features for V6

March 25th, 2021

Steve Harter

Update (January 11th, 2022)

This document is a point-in-time status update for reflection work that was prototyped for V6 but didn't make it in, mainly due to the lack of support for __makeref(ref struct) and the inability to have a collection of TypedReferences.

Please refer to the User Story for the current status of this feature.

Scenarios

Reflection improvements for V6:

  • Methods
    • Fast invoke for methods and constructors.
      • Standard reflection is ~20x slower than a direct Delegate call (for a simple 3-parameter method) plus allocates due to boxing\unboxing and usage of object[].
    • Support ref and out modifiers.
      • Although this is currently possible with standard reflection, it has usability issues since value types passed in the object[] must be post-processed by the user to obtain the new values.
  • Properties and Fields
    • Fast get\set with the same support as methods.
  • ref struct \ byref-like support
    • Support ref struct instances as method arguments and property\field values.
    • Support invoking members on byref-like instances.
      • On platforms that do not support ref emit including iOS this may not be entirely feasible in V6. It is more feasible if there is no support for passing other byref-like instances as method arguments.
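
The usability issue with ref and out in standard reflection can be seen in a small sketch. It uses only the existing MethodInfo.Invoke API; the RefInvokeDemo class and its Increment method are hypothetical names made up for illustration.

```csharp
using System;
using System.Reflection;

public static class RefInvokeDemo
{
    // Hypothetical target method with a ref parameter.
    public static void Increment(ref int value) => value++;

    public static int InvokeIncrement(int start)
    {
        MethodInfo mi = typeof(RefInvokeDemo).GetMethod(nameof(Increment));

        // The int is boxed into the object[]; the caller's variable is not updated.
        object[] args = { start };
        mi.Invoke(null, args);

        // Post-processing step: the new value must be read back from the array and unboxed.
        return (int)args[0];
    }
}
```

Both the boxing of the argument and the manual read-back from args[0] are what the fast invoke work aims to avoid.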

Reflection consumers

Different types of reflection consumers and scenarios which should be supported:

  • May not know number of arguments at compile time and thus may create the argument objects in a loop (without separate variables for each argument).
    • Thus, we need a "collection" of values (e.g. a new Arguments type) and not just Invoke(arg1), Invoke(arg1, arg2), etc.
  • May not know type of arguments at compile time.
    • We should have non-generic alternative methods.
    • Boxing of value types and accepting System.Object should be supported (ref struct arguments won't work with this however).
  • May not know \ care whether argument is byref or byvalue at compile time.
    • Thus, we should support passing a byval to a method that expects byref (by passing it byref automatically).
  • Unwilling to take a dependency on the large System.Linq.Expressions assembly (e.g. Blazor "hello world").
    • Thus, no internal dependency on expressions.
  • Unable to take a dependency on Reflection.Emit due to platform limitation (e.g. iOS).
    • Thus, a "slow path" fallback is necessary, but that may be limited to common scenarios only.

To help reduce scope and prioritize, the JSON serializer is assumed to be the canonical consumer and litmus test, meaning if the JSON serializer needs the functionality then it is required for 6.0. At the end of 6.0, the JSON serializer should not be using any Types from System.Reflection.Emit.

non-scenarios for 6.0

The existing MethodInfo.Invoke() and similar methods on PropertyInfo and FieldInfo will not change. It is possible, however, to change them later and make standard reflection much faster on platforms that support IL Emit.

To reduce scope, there is no support for using method handles and pointers directly. Using MethodBase, PropertyInfo and FieldInfo is required.

Default values for method arguments are not supported in the prototype; they could be, but that requires more overhead.

Status

Outstanding issues to address before completely functional:

  • No way to support non-POD ref structs such as Span<T> without language support. POD ref structs (those that contain only blittable primitives) are supported in the current prototype.
  • No way to have a nice variable-length collection without language support
  • Need a fast way to deal with AOT for non-emit cases. Should also support ref struct.

Implementation \ design

To achieve the highest performance possible, Reflection.Emit will likely be needed. Having a runtime approach without emit is also desired for platforms that don't support emit.

High-level options for fast path:

  1. Use Emit to create a stub function to invoke members.
  2. Add runtime support (using TypeHandle and other internals)
  3. Discussion: what other options exist for calling managed methods? The new function pointer work doesn't seem feasible since it lacks ability to create a function pointer from a MethodBase (or PropertyInfo \ FieldInfo) and even if so, lacks the ability to call it in a general-purpose manner.

Fallback options for when Emit can't be used (iOS, AOT platforms?):

  1. Use Delegate.CreateDelegate and Type.MakeGenericType. The number of method parameters may be limited (e.g. 4) to limit the number of pre-defined open generic delegates that need to be defined ahead of time. This option will be 5-10x as fast as existing reflection.
  2. Fall back to standard reflection. Due to having to wrap the standard reflection calls, this will be slower than standard reflection.
  3. Use runtime\unmanaged fallbacks; requires deeper runtime integration. Need to research what is JITted today for the emit case (including calling runtime intrinsics) vs. what the runtime would do by itself without emit.
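
As a rough illustration of fallback option 1, the sketch below builds an open-instance delegate for a property setter using only the existing Delegate.CreateDelegate and Type.MakeGenericType APIs. The OpenDelegateFallback and Person types are hypothetical, and the sketch assumes a reference-type declaring class; it is not the prototype's implementation.

```csharp
using System;
using System.Reflection;

public static class OpenDelegateFallback
{
    // Hypothetical POCO used only for this illustration.
    public sealed class Person
    {
        public string Name { get; set; }
    }

    // Closes the pre-defined open generic Action<,> over (declaring type, property type)
    // at run time and binds it to the setter. No IL is emitted.
    public static Delegate CreateSetter(PropertyInfo property)
    {
        Type delegateType = typeof(Action<,>).MakeGenericType(
            property.DeclaringType, property.PropertyType);
        return Delegate.CreateDelegate(delegateType, property.GetSetMethod());
    }

    public static string Demo()
    {
        PropertyInfo property = typeof(Person).GetProperty(nameof(Person.Name));

        // The cast is only possible here because the demo knows the types statically.
        var setter = (Action<Person, string>)CreateSetter(property);

        var person = new Person();
        setter(person, "steve");
        return person.Name;
    }
}
```

A late-bound consumer would not be able to cast to Action<Person, string> and would need an additional generic wrapper per delegate shape, which is one reason the number of supported parameters may have to be limited.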

Type casting

TypedReference cannot automatically be converted to a base or derived class.

For usability, a cast to the declared type will occur via the castclass opcode. This allows the type held by the TypedReference variable and the type in the parameter signature to vary. However, boxing and unboxing are not performed automatically; the TypedReference.FromObject() and ToObject() methods must be used. This is primarily for performance: it avoids checking whether the declared type is System.Object and the value is a value type that needs to be boxed.

Performance goals

Optimize for throughput, not first call. Thus Emit and JIT can be used. A delegate created through a dynamic method will be cached on the respective reflection instance (MethodBase, PropertyInfo or FieldInfo).

The performance should be on par with using emit \ DynamicMethod.
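
For reference, a minimal sketch of the emit \ DynamicMethod pattern with a cached delegate is shown below. It is not the prototype: it uses an object-based signature (so value-type results are boxed), assumes a reference-type declaring class, and caches in a ConcurrentDictionary rather than on the reflection instance. EmitGetterCache and Point are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;
using System.Reflection.Emit;

public static class EmitGetterCache
{
    // Hypothetical target type used only for this illustration.
    public sealed class Point
    {
        public int X { get; set; }
    }

    // Stand-in for caching the delegate on the PropertyInfo instance itself.
    private static readonly ConcurrentDictionary<PropertyInfo, Func<object, object>> s_cache =
        new ConcurrentDictionary<PropertyInfo, Func<object, object>>();

    public static Func<object, object> GetGetter(PropertyInfo property)
    {
        return s_cache.GetOrAdd(property, BuildGetter);
    }

    private static Func<object, object> BuildGetter(PropertyInfo property)
    {
        MethodInfo getter = property.GetGetMethod();
        var stub = new DynamicMethod(
            "GetStub", typeof(object), new[] { typeof(object) },
            property.DeclaringType.Module, true);

        ILGenerator il = stub.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Castclass, property.DeclaringType); // cast to the declared type
        il.Emit(OpCodes.Callvirt, getter);
        if (property.PropertyType.IsValueType)
        {
            il.Emit(OpCodes.Box, property.PropertyType);    // boxing that the TypedReference design avoids
        }
        il.Emit(OpCodes.Ret);

        return (Func<object, object>)stub.CreateDelegate(typeof(Func<object, object>));
    }
}
```

The first call pays for IL generation and JIT; later calls for the same PropertyInfo reuse the cached delegate, which matches the "throughput, not first call" goal.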

No allocations or boxing. However, manual boxing should be supported:

  • The JSON serializer boxes a target value type once and then sets all fields\properties with that same boxed value. The serializer does this to reduce usage of Type.MakeGenericType() with minimal performance impact since value types with properties\fields typically aren't used and if they are used they (should) have a small number of properties\fields.
  • Value type-based arguments that need to be created dynamically will typically be boxed in the loosely-coupled reflection scenarios and created through Activator.CreateInstance(); this is fine as we can obtain the inner value before converting to a TypedReference.
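
The "box once, set many" pattern from the first bullet can be sketched with today's FieldInfo.SetValue, which mutates the boxed copy in place. BoxOnceDemo and Point are hypothetical names; this illustrates the pattern only and is not the serializer's code.

```csharp
using System;
using System.Reflection;

public static class BoxOnceDemo
{
    // Hypothetical value type used only for this illustration.
    public struct Point
    {
        public int X;
        public int Y;
    }

    public static Point Build()
    {
        // Box the target value type a single time.
        object boxed = new Point();

        // Each SetValue call writes into the same boxed instance.
        typeof(Point).GetField("X").SetValue(boxed, 3);
        typeof(Point).GetField("Y").SetValue(boxed, 4);

        // Unbox once at the end to obtain the populated struct.
        return (Point)boxed;
    }
}
```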

In order to cache the delegate, the runtime classes for MethodInfo and FieldInfo will add 4 bytes. This is not quite pay-to-play since the 4 bytes are null and not used until the fast invoke methods are called. There are alternatives here (including user-held delegates and\or caching approaches), but currently the assumption is that the additional memory is acceptable.

Requirement discussions \ todos

  • CultureInfo? Is this parameter necessary? Used for Exception localization? It could be added later.
  • Optional arguments defaulted? Ideally, yes, including generics. Note that reflection currently has issues with defaulting generic parameters.

API Approaches

There are various API approaches:

  1. Low-level TypedReference (up to ~16 arguments). Fastest. Currently this is the option in the API proposal.
  2. Fixed-length collection supporting literals (up to 4-7 arguments). Slower, but safer and nicer.
  3. Unsafe variable-length collection.
  4. Strongly-typed variable-length collection. This is ideal but requires language features to enable a stack-based collection that supports ref struct.

Approach 1: Low-level TypedReference (up to ~16 arguments same as Action<> and Func<> delegates)

First, a TypedReference must be created:

// Literal
string s = "steve";
TypedReference tr = TypedReference.FromRef(ref s);
// or
TypedReference tr = __makeref(s);

// Value type
MyStruct myStruct = ...
TypedReference tr = TypedReference.FromRef(ref myStruct);
// or
TypedReference tr = __makeref(myStruct);

// Reference type
MyPoco myPoco = ...
TypedReference tr = TypedReference.FromRef(ref myPoco);
// or
TypedReference tr = __makeref(myPoco);

// ByRef-like POD (ref struct with no managed fields)
MyRefStruct refStruct = ...
IntPtr ptr = new IntPtr(&refStruct); // Unsafe hack that works but would require language\runtime feature to avoid IntPtr
TypedReference tr = TypedReference.FromIntPtr(ptr, typeof(MyRefStruct));

// ByRef-like non-POD such as Span<T>; this is a major issue blocking adoption of this feature
Span<int> span = ... // or Span<object> for reference type scenarios
// Getting a raw unsafe pointer doesn't work
IntPtr ptr = new IntPtr(&span); // compile error CS0208
// Getting a TypedReference doesn't work
TypedReference tr = __makeref(span); // compile error CS1601
// Once __makeref(span) works, we'd also want to wrap __makeref(span) if possible:
TypedReference tr = TypedReference.FromRef(ref span); // doesn't compile since byref-like not allowed with generics
// If the language team adds support for a generic constraint with byref-like then we may need a separate helper:
TypedReference tr = TypedReference.FromByRefLike(ref span);

// System.Object can also be used for boxed value types:
object i1 = 1;
TypedReference tr = TypedReference.FromObject(ref i1);

Then an InvokeFuncDirect() or InvokeActionDirect() overload is called on the corresponding type (e.g. MethodInfo):

  var myref = new MyRefStruct(42); // POD type; no managed references
  string s = "steve";
  int i1 = 1;

  MethodInfo mi = ...

  mi.InvokeFuncDirect(
    obj: default(TypedReference),
    TypedReference.FromIntPtr(new IntPtr(&myref), typeof(MyRefStruct)),
    TypedReference.FromRef(ref s),
    TypedReference.FromRef(ref i1),
    returnValue: default(TypedReference));

With __makeref (not recommended):

  mi.InvokeFuncDirect(
    obj: default(TypedReference),
    TypedReference.FromIntPtr(new IntPtr(&myref), typeof(MyRefStruct)),
    __makeref(s),
    __makeref(i1),
    returnValue: default(TypedReference));

Using __makeref is not recommended, mostly because it is somewhat non-discoverable and C#-specific, and because its __ prefix is not used elsewhere and implies a low-level call that should not be used directly. Also, there will be other helpers on TypedReference, so using factory methods on TypedReference will be more intuitive. These other helpers include obtaining the current type and value (the type must be passed as a generic parameter to obtain the value); they wrap __reftype and __refvalue, which have the same discoverability issues.
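
For readers unfamiliar with the undocumented keywords that the proposed helpers would wrap, here is a small sketch using only what the C# compiler accepts today (__makeref, __reftype, __refvalue). TypedReferenceDemo is a hypothetical name, and the proposed FromRef \ As<T> helpers are deliberately not used since they do not exist yet.

```csharp
using System;

public static class TypedReferenceDemo
{
    public static string Describe()
    {
        int i = 1;

        // Equivalent of the proposed TypedReference.FromRef(ref i).
        TypedReference tr = __makeref(i);

        // Obtain the type held by the reference.
        Type type = __reftype(tr);

        // __refvalue yields the referenced variable, so it can be assigned through.
        __refvalue(tr, int) = 42;

        // Reading requires naming the type explicitly, as the proposed As<T>() would.
        int read = __refvalue(tr, int);

        return type.Name + ":" + i + ":" + read;
    }
}
```

The write through __refvalue updates the original local, which is the behavior that makes ref and out arguments work without post-processing.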

Approach 2: fixed-length collection supporting literals

An experimental collection was prototyped supporting ByReference<> and literals:

  InvokeParameters
    .AddIntPtr(new IntPtr(&myref), typeof(MyRefStruct))
    .Add("steve") // kept on stack as T
    .AddRef(ref i1) // kept on stack as ByReference<T>
    .InvokeActionDirect(mi);

Basically, this builds a generic type up to a fixed number of parameters (say 4) and keeps value types, reference types and literals safe since references and value type copies are properly held without pointers. However, each .Add*() method copies the values from the prior generic type, so it doesn't scale to a large number of parameters.

When Invoke* is called, the various pieces of state (held by T, ref T and IntPtr) are converted to TypedReferences.

Approach 3: unsafe variable-length collection

TBD; needs additional prototyping. Possible in V6.

Approach 4: strongly-typed variable-length collection

This is a variable-length and scalable version of Approach 2 that will likely need language features to treat TypedReference as a normal byref-like type so it can be returned from methods and be a field on another byref-like type.

Unlike Approach 3, this prevents pointers from being used.

Essentially, a collection or linked list of TypedReferences needs to be maintained (or created when Invoke() is called). Since it should cooperate with the GC, strong references need to be maintained. It should also be stack allocated.

UPDATE: please see the User Story link at the top of this document for additional work here.

API

TypedReference additions

Additions to the existing TypedReference struct:

public ref struct TypedReference
{
    ...
    // Most common factory method:
    // Same as calling __makeref(value)
    public static TypedReference FromRef<T>(ref T value);

    // Used for late-bound scenarios.
    // For value types, gets the boxed value.
    // For reference types, same as FromRef<object>().
    // Complementary to the existing ToObject().
    public static TypedReference FromObject(ref object value);

    // Intended for "ref structs" but could support any pointer:
    public static unsafe TypedReference FromIntPtr<T>(IntPtr value);
    public static unsafe TypedReference FromIntPtr(IntPtr value, Type type);

    // Optional alternative to using "default(TypedReference)":
    // public static TypedReference GetNull();

    // Determine if value is "default(TypedReference)"
    public bool IsNull { get; }

    // Optional methods to obtain value (via __refvalue) not required for functionality:
    public T As<T>(); // Same as calling __refvalue(TypedReference, T)
    public ref T AsRef<T>(); // Same as calling "ref __refvalue(TypedReference, T)"
    // Due to compiler we may need to do this instead:
    // public void AsRef<T>(out T value);
    ...
}

MethodBase additions

The approach here mirrors Action and Func. Both take up to 16 parameters (including "this" for instance methods), and Func additionally has a return value:

Func<T,TResult>
Func<T1,T2,TResult>
...
Func<T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,TResult>

public class MethodBase // Includes both MethodInfo and ConstructorInfo; ConstructorInfo must use InvokeFuncDirect* to return the new value.
{
    ...
    // Func
    public void InvokeFuncDirect(TypedReference result);
    public void InvokeFuncDirect(TypedReference arg, TypedReference result);
    public void InvokeFuncDirect(TypedReference arg1, TypedReference arg2, TypedReference result);
    ...
    public void InvokeFuncDirect(TypedReference arg1, TypedReference arg2, ... , TypedReference arg16, TypedReference result);

    // Action
    public void InvokeActionDirect();
    public void InvokeActionDirect(TypedReference arg);
    public void InvokeActionDirect(TypedReference arg1, TypedReference arg2);
    ...
    public void InvokeActionDirect(TypedReference arg1, TypedReference arg2, ... , TypedReference arg16);

    protected virtual void InvokeFuncDirect(TypedReference arg1, TypedReference arg2, ... , TypedReference arg16, TypedReference result);
    protected virtual void InvokeActionDirect(TypedReference arg1, TypedReference arg2, ... , TypedReference arg16);

    // Possible collection support (TBD):
    // public InvokeArguments CreateArguments();
    // public virtual void Invoke(object obj, in InvokeArguments values);
    // public virtual void Invoke(TypedReference obj, in InvokeArguments values);
}

PropertyInfo and FieldInfo additions

Examples

PropertyInfo property = ...
object obj = ...
int i = 42;
property.SetValueDirect(
  obj: TypedReference.FromRef(ref obj),
  value: TypedReference.FromRef(ref i));

API

public abstract class PropertyInfo
{
  // Wraps GetMethod().InvokeFuncDirect(,)
  public void GetValueDirect(TypedReference obj, TypedReference result);

  // Wraps SetMethod().InvokeActionDirect(,)
  public void SetValueDirect(TypedReference obj, TypedReference value);

  // Optional for common scenarios:
  // public void SetValueDirect(object obj, object value); // or just replace existing SetValue
  // public void SetValueDirect<T>(object obj, in T value);
  // public void SetValueDirect<TObject, TValue>(in TObject obj, in TValue value);
  // public object GetValueDirect(object obj); // or just replace existing GetValue, although that would increase start-up time and private bytes due to IL Emit
  // public T GetValueDirect<T>(object obj);
  // public TValue GetValueDirect<TObject, TValue>(in TObject obj);
}

Field support is consistent with properties.

public abstract class FieldInfo
{
  public void GetValueDirect(TypedReference obj, TypedReference result);
  public void SetValueDirect(TypedReference obj, TypedReference value);

  // Also see PropertyInfo optional overloads for common scenarios that also apply to fields.
}

Comparison to standard reflection

Argument validation and special treatment

| Layer | Standard Reflection | Fast Invoke |
| --- | --- | --- |
| Parameter validation | Upfront | try\catch on top of Emit with post analysis** |
| Copying of values upfront for substitution | Yes, in RuntimeType.CheckValue() | No |
| Reference type polymorphism | Supported automatically | Supported automatically |
| Special type: Enums | Yes | Yes |
| Special type: System.Reflection.Pointer | Yes | tbd |
| Other special types | tbd | tbd |
| Boxing | Supported automatically | Supported manually via TypedReference.ToObject() |
| Unboxing | Supported automatically | Supported manually via TypedReference.FromObject() |

** fast invocation attempts to throw the same exceptions as normal reflection for parameter validation. However, instead of validating all arguments ahead-of-time (which is expensive):

  • A try\catch block will catch InvalidCastException thrown by fast invoke only (will let others go through)
  • Attempt to find the invalid parameter by inspecting each.
  • Throw ArgumentException with text such as: Object of type 'System.String' cannot be converted to type 'System.Int32'.
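
The deferred-validation steps above can be sketched as follows. This is an approximation under stated assumptions: the fast stub is modeled as a hypothetical Action<object[]> instead of a TypedReference-based delegate, only by-value parameters are inspected, and LazyValidation, Target and Demo are made-up names.

```csharp
using System;
using System.Reflection;

public static class LazyValidation
{
    // fastStub stands in for the emitted stub; it performs no upfront argument checks.
    public static void Invoke(Action<object[]> fastStub, ParameterInfo[] parameters, object[] args)
    {
        try
        {
            fastStub(args);
        }
        catch (InvalidCastException)
        {
            // Only pay for validation after a failure: find the offending argument.
            for (int i = 0; i < parameters.Length; i++)
            {
                Type expected = parameters[i].ParameterType;
                object arg = args[i];
                if (arg != null && !expected.IsInstanceOfType(arg))
                {
                    throw new ArgumentException(
                        $"Object of type '{arg.GetType()}' cannot be converted to type '{expected}'.");
                }
            }

            // No mismatch found: the exception came from the invoked member itself.
            throw;
        }
    }

    // Hypothetical target method.
    public static void Target(string name, int count) { }

    public static string Demo()
    {
        ParameterInfo[] parameters = typeof(LazyValidation).GetMethod(nameof(Target)).GetParameters();
        Action<object[]> stub = a => Target((string)a[0], (int)a[1]);
        try
        {
            Invoke(stub, parameters, new object[] { "steve", "oops" });
            return "no error";
        }
        catch (ArgumentException ex)
        {
            return ex.Message;
        }
    }
}
```

One limitation of this shape is that an InvalidCastException thrown from inside the target method is indistinguishable at the catch site, which is why the sketch rethrows when no mismatched argument is found.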

Other feature comparison

| Other features | Standard Reflection | Fast Invoke |
| --- | --- | --- |
| Ref and out | Yes. Need to manually unbox values | Yes |
| Ref returns | Yes. Need to manually unbox value | Yes |
| Can invoke "ref structs" | No | Yes via pointer or IntPtr*** |
| Can pass "ref structs" | No | Yes via pointer or IntPtr*** |

*** ref struct support may not work if IL Emit is not possible on the current platform.

External dependencies

C# TypedReference as normal ref struct

TypedReference is now a ref struct, but Roslyn still applies additional constraints, including that it cannot be returned from a method or be a field of another ref struct. These additional constraints should now be unnecessary and should be removed. See https://docs.microsoft.com/en-us/dotnet/csharp/misc/cs1601.

This would increase usability by allowing TypedReferences to flow in more cases, and perhaps be used by a future "parameter" class to enable an arbitrary number of parameters.

For recent work to make TypedReference a ref struct see dotnet/runtime#2216 and comments on this issue: dotnet/runtime#26186 (comment).

For background on TypedReference see https://github.com/dotnet/roslyn/blob/afd10305a37c0ffb2cfb2c2d8446154c68cfa87a/docs/compilers/CSharp/System.TypedReference.md.

C# generic constraint: where T : ref struct

This would allow "ref structs" to be used without IntPtr. dotnet/csharplang#1148

This would allow, for example:

Span<byte> span = ...
mi.InvokeActionDirect(TypedReference.FromByRefLike(ref span));

instead of

Span<byte> span = ...
unsafe
{
    mi.InvokeActionDirect(TypedReference.FromIntPtr(new IntPtr(&span), typeof(Span<byte>)));
}

Misc design notes

Run-time delegate creation

We need to be able to create a delegate class or function pointer at run-time. This must also support where some parameters may be a ref struct.

When using Emit, a DynamicMethod will be created like

    private delegate void Stub(TypedReference returnValue, TypedReference obj, TypedReference arg1, TypedReference arg2, ...);

which will support a fixed number of parameters (say 7 parameters max).

Supporting a variable (no max) number of parameters is possible through unsafe code and pointers; additional prototyping is necessary for this.

Performance numbers

Note the numbers may be a bit slower when the try\catch pattern is added (to throw the proper exception) and a castclass opcode is added (for covariance\contravariance support).

Methods

Passing two Int32s and one string:

Using TypedReference.FromRef: 1881

// Existing MethodInfo.Invoke
MethodInfo.Invoke with cached object[]: 22369
MethodInfo.Invoke creating object[] each time: 23750

~12x faster (22369 / 1881 ≈ 11.9) and no allocs.

Fields

SetField on Class: 1106
GetField on Class: 1095

Properties

These just wrap the method invoke above by obtaining the property's get and set MethodInfo.

SetProperty on Class: 1170
SetProperty on Struct: 1101

// Existing PropertyInfo.Set\GetValue:
SetProperty on Class using reflection 11265
SetProperty on Struct using reflection 11848

~9x-10x faster (and no allocs).
