@atruskie
Last active May 11, 2023 03:07
Inferring abstract/interface types for YamlDotNet Deserialization

DEPRECATED

This feature is now part of YamlDotNet.

See aaubry/YamlDotNet#774 and https://github.com/aaubry/YamlDotNet/wiki/Deserialization---Type-Discriminators.
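
For orientation, here is a minimal sketch of what the built-in replacement might look like. It assumes a YamlDotNet release that includes the type-discriminator feature, and the builder method names (WithTypeDiscriminatingNodeDeserializer, AddKeyValueTypeDiscriminator) are written as I recall them from the wiki page linked above, so check them against your installed version. The model classes and the BuiltInDiscriminatorExample wrapper are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

// Hypothetical, simplified models (not the gist author's real classes).
public interface IExpectation { }

public class EventCount : IExpectation
{
    public string SegmentWith { get; set; }
}

public class NoEvents : IExpectation
{
    public string SegmentWith { get; set; }
}

public static class BuiltInDiscriminatorExample
{
    public static List<IExpectation> Load(string yaml)
    {
        var deserializer = new DeserializerBuilder()
            .WithNamingConvention(UnderscoredNamingConvention.Instance)
            .WithTypeDiscriminatingNodeDeserializer(options =>
            {
                // Assumption: the key and values are matched against the raw YAML text.
                options.AddKeyValueTypeDiscriminator<IExpectation>(
                    "segment_with",
                    new Dictionary<string, Type>
                    {
                        { "event_count", typeof(EventCount) },
                        { "no_events", typeof(NoEvents) },
                    });
            })
            .Build();

        return deserializer.Deserialize<List<IExpectation>>(yaml);
    }
}
```

If this works in your version, none of the classes in this gist are needed.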

Addresses aaubry/YamlDotNet#343:

Determining the type from a field

This collection of classes allows one to deserialize abstract classes or interfaces by inspecting the parser's parsing events to choose an appropriate type for the YamlDotNet deserializer to use.

  • AbstractNodeNodeTypeResolver does most of the heavy lifting. When it encounters an abstract type or interface that a type discriminator has been registered for, it buffers the parsing events for that mapping and then checks whether any registered discriminator can work with that sub-graph of parsing events.
  • IParserExtensions.cs adds a handy extension method that allows for the easy inspection of keys and values in only the current mapping, ignoring any nested mappings.
  • ITypeDiscriminator is a provider interface. For each abstract/interface type you should create a type discriminator that inspects parsing events and returns an appropriate type when its conditions are met.
  • ParsingEventBuffer is a utility used by AbstractNodeNodeTypeResolver to buffer the streaming parsing events from YamlDotNet's parser. The buffer allows the events to be replayed, which:
    • allows multiple ITypeDiscriminators to be used, each receiving a replay of the events it can inspect
    • and, once a type has been selected, replays the parsing events to the standard object node deserializer, as if nothing ever happened
  • Finally, two example type discriminators (specific to my project) are included:
    • ExpectationTypeResolver looks for the presence of mapping keys that are unique to the descendant/concrete type. If a key is found, the appropriate Type is returned.
    • AggregateExpectationTypeResolver looks for a common key that has the desired type encoded in its value. This is similar to a kind key on a TypeScript discriminated union.
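
The buffer-and-replay idea described in the list above can be sketched without YamlDotNet at all. In this simplified model, ReplayBuffer is a hypothetical stand-in for ParsingEventBuffer, plain strings stand in for parsing events, and each "probe" plays the role of an ITypeDiscriminator; none of these names come from the gist.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ParsingEventBuffer: a list with a resettable cursor.
public sealed class ReplayBuffer<T> where T : class
{
    private readonly IReadOnlyList<T> items;
    private int position;

    public ReplayBuffer(IReadOnlyList<T> items) => this.items = items;

    // Null once the cursor has moved past the last item.
    public T Current => position < items.Count ? items[position] : null;

    public bool MoveNext()
    {
        if (position < items.Count) position++;
        return position < items.Count;
    }

    public void Reset() => position = 0;
}

public static class ReplayExample
{
    // Each probe gets a fresh replay of the buffer; the first non-null answer wins.
    public static string FirstMatch(
        ReplayBuffer<string> buffer,
        params Func<ReplayBuffer<string>, string>[] probes)
    {
        foreach (var probe in probes)
        {
            buffer.Reset();
            var result = probe(buffer);
            if (result is not null) return result;
        }
        return null;
    }

    // Builds a probe that scans forward and reports `label` if `wanted` is seen.
    public static Func<ReplayBuffer<string>, string> Contains(string wanted, string label) =>
        buffer =>
        {
            while (buffer.Current is not null)
            {
                if (buffer.Current == wanted) return label;
                buffer.MoveNext();
            }
            return null;
        };
}
```

The point of Reset is that the first probe may consume the whole buffer before giving up, and the second probe still sees every item from the start.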

This solution does slow down the deserializer (technically) and adds memory pressure (technically). For small files the difference should be trivial (I haven't noticed any). For large files, try to resolve types only on leaf nodes (the most deeply nested mappings). This keeps the number of events stored in the buffer small and allows the parser to keep operating in a mostly streaming fashion.
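
To make the leaf-node advice concrete, here is a rough sketch of how many events get buffered depending on where resolution starts. It models a document as a list of nesting increments (+1 for a mapping or sequence start, -1 for its end, 0 for a scalar) and uses the same depth-counting loop as ReadNestedMapping in the code below. BufferCostExample and its method are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;

public static class BufferCostExample
{
    // Counts the events that would be buffered for the mapping that starts at
    // mappingStartIndex, including its opening and closing events.
    public static int CountBufferedEvents(IReadOnlyList<int> nesting, int mappingStartIndex)
    {
        var count = 1; // the opening MappingStart itself
        var depth = 0;
        var index = mappingStartIndex + 1;
        do
        {
            depth += nesting[index++];
            count++;
        } while (depth >= 0);
        return count;
    }
}
```

For the document `{ a: { b: 1, c: 2 }, d: 3 }` the increments are `1, 0, 1, 0, 0, 0, 0, -1, 0, 0, -1`. Resolving at the root (index 0) buffers all 11 events, while resolving only the inner mapping (index 2) buffers 6.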

Usage example:

NamingConvention = UnderscoredNamingConvention.Instance;
// these resolvers allow us to deserialize to an abstract class or interface
var aggregateExpectationResolver = new AggregateExpectationTypeResolver(NamingConvention);
var expectationResolver = new ExpectationTypeResolver(NamingConvention);

YamlDeserializer = new DeserializerBuilder()
    .WithNamingConvention(NamingConvention)
    .WithNodeDeserializer(
        inner => new AbstractNodeNodeTypeResolver(inner, aggregateExpectationResolver, expectationResolver),
         s => s.InsteadOf<ObjectNodeDeserializer>())
    .Build();

// my domain models

public class Model {
  // must be public: YamlDotNet only populates public properties by default
  public IExpectation[] Expectation { get; set; }
}

public abstract class Expectation : IExpectation { /** blah blah blah **/ }
public class BoundedExpectation : Expectation { /** blah blah blah **/ }
public class CentroidExpectation : Expectation { /** blah blah blah **/ }
public class TimeExpectation : Expectation { /** blah blah blah **/ }

public abstract class AggregateExpectation : IExpectation {
  /// <summary>
  /// Essentially a `kind` property - determines which child type to instantiate.
  /// </summary>
  public string SegmentWith { get; init; }
}
public class EventCount : AggregateExpectation { /** blah blah blah **/ }
public class NoEvents : AggregateExpectation { /** blah blah blah **/ }

And then this YAML can be deserialized:

- segment_with: event_count   # deserializes to EventCount
- segment_with: no_events     # deserializes to NoEvents
- bounds: abc                 # deserializes to BoundedExpectation
- centroid: null              # deserializes to CentroidExpectation
- time:                       # deserializes to TimeExpectation
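
The two strategies behind those five results can be sketched in plain C#, independent of YamlDotNet. This is a simplified model under stated assumptions: each YAML mapping is represented as a dictionary, type names are returned as strings, and DiscriminationExample is a hypothetical helper that does not appear in the gist. Unlike AggregateExpectationTypeResolver below, it falls through to the unique-key check instead of throwing when the segment_with value is unknown.

```csharp
using System;
using System.Collections.Generic;

public static class DiscriminationExample
{
    // Strategy 1: a shared "kind" key whose value names the type.
    private static readonly Dictionary<string, string> ByValue = new()
    {
        { "event_count", "EventCount" },
        { "no_events", "NoEvents" },
    };

    // Strategy 2: a key that only one concrete type has.
    private static readonly Dictionary<string, string> ByUniqueKey = new()
    {
        { "bounds", "BoundedExpectation" },
        { "centroid", "CentroidExpectation" },
        { "time", "TimeExpectation" },
    };

    public static string Resolve(IReadOnlyDictionary<string, string> mapping)
    {
        // Try the kind key first, mirroring the discriminator order in the usage example.
        if (mapping.TryGetValue("segment_with", out var kind)
            && kind is not null
            && ByValue.TryGetValue(kind, out var byValue))
        {
            return byValue;
        }

        // Otherwise the shape of the mapping (which keys are present) decides.
        foreach (var key in mapping.Keys)
        {
            if (ByUniqueKey.TryGetValue(key, out var byKey)) return byKey;
        }

        throw new InvalidOperationException("No discriminator matched this mapping.");
    }
}
```

Note that the value of a unique key is irrelevant: `centroid: null` and an empty `time:` still resolve, because only the presence of the key is checked.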

LICENSE: MIT

using System;
using System.Collections.Generic;
using System.Linq;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NodeDeserializers;

namespace Egret.Cli.Models
{
    public class AbstractNodeNodeTypeResolver : INodeDeserializer
    {
        private readonly INodeDeserializer original;
        private readonly ITypeDiscriminator[] typeDiscriminators;

        public AbstractNodeNodeTypeResolver(INodeDeserializer original, params ITypeDiscriminator[] discriminators)
        {
            if (original is not ObjectNodeDeserializer)
            {
                throw new ArgumentException($"{nameof(AbstractNodeNodeTypeResolver)} requires the original resolver to be a {nameof(ObjectNodeDeserializer)}");
            }

            this.original = original;
            typeDiscriminators = discriminators;
        }

        public bool Deserialize(IParser reader, Type expectedType, Func<IParser, Type, object> nestedObjectDeserializer, out object value)
        {
            // we're essentially "in front of" the normal ObjectNodeDeserializer.
            // We could let it check if the current event is a mapping, but we also need to know.
            if (!reader.Accept<MappingStart>(out var mapping))
            {
                value = null;
                return false;
            }

            // can any of the registered discriminators deal with the abstract type?
            // note: this is an exact match against the declared (expected) type
            var supportedTypes = typeDiscriminators.Where(t => t.BaseType == expectedType);
            if (!supportedTypes.Any())
            {
                // no? then not a node/type we want to deal with
                return original.Deserialize(reader, expectedType, nestedObjectDeserializer, out value);
            }

            // now buffer all the nodes in this mapping.
            // it'd be better if we did not have to do this, but YamlDotNet does not support non-streaming access.
            // See: https://github.com/aaubry/YamlDotNet/issues/343
            // WARNING: This has the potential to be quite slow and add a lot of memory usage, especially for large documents.
            // It's better, if you use this at all, to use it on leaf mappings
            var start = reader.Current.Start;
            Type actualType;
            ParsingEventBuffer buffer;
            try
            {
                buffer = new ParsingEventBuffer(ReadNestedMapping(reader));

                // use the discriminators to tell us what type it is really expecting by letting them inspect the parsing events
                actualType = CheckWithDiscriminators(expectedType, supportedTypes, buffer);
            }
            catch (Exception exception)
            {
                throw new YamlException(start, reader.Current.End, "Failed when resolving abstract type", exception);
            }

            // now continue by re-emitting parsing events
            buffer.Reset();
            return original.Deserialize(buffer, actualType, nestedObjectDeserializer, out value);
        }

        private static Type CheckWithDiscriminators(Type expectedType, IEnumerable<ITypeDiscriminator> supportedTypes, ParsingEventBuffer buffer)
        {
            foreach (var discriminator in supportedTypes)
            {
                // each discriminator gets a fresh replay of the buffered events
                buffer.Reset();
                if (discriminator.TryResolve(buffer, out var actualType))
                {
                    CheckReturnedType(discriminator.BaseType, actualType);
                    return actualType;
                }
            }

            throw new Exception($"None of the registered type discriminators could supply a child class for {expectedType}");
        }

        private static LinkedList<ParsingEvent> ReadNestedMapping(IParser reader)
        {
            var result = new LinkedList<ParsingEvent>();
            result.AddLast(reader.Consume<MappingStart>());

            // depth drops below zero when the MappingEnd matching the opening MappingStart is read
            var depth = 0;
            do
            {
                var next = reader.Consume<ParsingEvent>();
                depth += next.NestingIncrease;
                result.AddLast(next);
            } while (depth >= 0);

            return result;
        }

        private static void CheckReturnedType(Type baseType, Type candidateType)
        {
            if (candidateType is null)
            {
                throw new NullReferenceException($"The type resolver for AbstractNodeNodeTypeResolver returned null. It must return a valid sub-type of {baseType}.");
            }
            // compare the types directly: candidateType.GetType() would be the runtime type of the Type object itself
            else if (candidateType == baseType)
            {
                throw new InvalidOperationException($"The type resolver for AbstractNodeNodeTypeResolver returned the abstract type. It must return a valid sub-type of {baseType}.");
            }
            else if (!baseType.IsAssignableFrom(candidateType))
            {
                throw new InvalidOperationException($"The type resolver for AbstractNodeNodeTypeResolver returned a type ({candidateType}) that is not a valid sub type of {baseType}");
            }
        }
    }
}

using Egret.Cli.Models;
using System;
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;

namespace Egret.Cli.Serialization
{
    public class AggregateExpectationTypeResolver : ITypeDiscriminator
    {
        public const string TargetKey = nameof(AggregateExpectation.SegmentWith);
        private readonly string targetKey;
        private readonly Dictionary<string, Type> typeLookup;

        public AggregateExpectationTypeResolver(INamingConvention namingConvention)
        {
            targetKey = namingConvention.Apply(TargetKey);
            typeLookup = new Dictionary<string, Type>() {
                { namingConvention.Apply(nameof(NoEvents)), typeof(NoEvents) },
                { namingConvention.Apply(nameof(EventCount)), typeof(EventCount) },
            };
        }

        // NOTE: AbstractNodeNodeTypeResolver compares this to the expected type with ==,
        // so it must be exactly the declared type of the property or collection element.
        public Type BaseType => typeof(IExpectationTest);

        public bool TryResolve(ParsingEventBuffer buffer, out Type suggestedType)
        {
            if (buffer.TryFindMappingEntry(
                scalar => targetKey == scalar.Value,
                out Scalar key,
                out ParsingEvent value))
            {
                // read the value of the kind key
                if (value is Scalar valueScalar)
                {
                    suggestedType = CheckName(valueScalar.Value);
                    return true;
                }
                else
                {
                    FailEmpty();
                }
            }

            // we could not find our key, thus we could not determine correct child type
            suggestedType = null;
            return false;
        }

        private void FailEmpty()
        {
            throw new Exception($"Could not determine expectation type, {targetKey} has an empty value");
        }

        private Type CheckName(string value)
        {
            if (typeLookup.TryGetValue(value, out var childType))
            {
                return childType;
            }

            var known = string.Join(", ", typeLookup.Keys);
            throw new Exception($"Could not match `{targetKey}: {value}` to a known expectation. Expecting one of: {known}");
        }
    }
}

using Egret.Cli.Models;
using System;
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;

namespace Egret.Cli.Serialization
{
    public class ExpectationTypeResolver : ITypeDiscriminator
    {
        private readonly Dictionary<string, Type> typeLookup;

        public ExpectationTypeResolver(INamingConvention namingConvention)
        {
            typeLookup = new Dictionary<string, Type>() {
                { namingConvention.Apply(nameof(BoundedExpectation.Bounds)), typeof(BoundedExpectation) },
                { namingConvention.Apply(nameof(CentroidExpectation.Centroid)), typeof(CentroidExpectation) },
                { namingConvention.Apply(nameof(TimeExpectation.Time)), typeof(TimeExpectation) },
            };
        }

        public Type BaseType => typeof(IExpectationTest);

        public bool TryResolve(ParsingEventBuffer buffer, out Type suggestedType)
        {
            if (buffer.TryFindMappingEntry(
                scalar => typeLookup.ContainsKey(scalar.Value),
                out Scalar key,
                out ParsingEvent _))
            {
                suggestedType = typeLookup[key.Value];
                return true;
            }

            suggestedType = null;
            return false;
        }
    }
}

using System;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;

namespace Egret.Cli.Models
{
    public static class IParserExtensions
    {
        public static bool TryFindMappingEntry(this ParsingEventBuffer parser, Func<Scalar, bool> selector, out Scalar key, out ParsingEvent value)
        {
            parser.Consume<MappingStart>();
            do
            {
                // we only want to check keys in this mapping; don't descend
                switch (parser.Current)
                {
                    case Scalar scalar:
                        // we've found a scalar key, check if it matches our predicate
                        var keyMatched = selector(scalar);

                        // move head so we can read or skip value
                        parser.MoveNext();

                        // read the value of the mapping key
                        if (keyMatched)
                        {
                            // success
                            value = parser.Current;
                            key = scalar;
                            return true;
                        }

                        // skip the value
                        parser.SkipThisAndNestedEvents();
                        break;
                    case MappingStart or SequenceStart:
                        parser.SkipThisAndNestedEvents();
                        break;
                    default:
                        // do nothing, skip to next node
                        parser.MoveNext();
                        break;
                }
            } while (parser.Current is not null);

            key = null;
            value = null;
            return false;
        }
    }
}

using System;

namespace Egret.Cli.Models
{
    public interface ITypeDiscriminator
    {
        Type BaseType { get; }

        bool TryResolve(ParsingEventBuffer buffer, out Type suggestedType);
    }
}

using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;

namespace Egret.Cli.Models
{
    public class ParsingEventBuffer : IParser
    {
        private readonly LinkedList<ParsingEvent> buffer;
        private LinkedListNode<ParsingEvent> current;

        public ParsingEventBuffer(LinkedList<ParsingEvent> events)
        {
            buffer = events;
            current = events.First;
        }

        public ParsingEvent Current => current?.Value;

        public bool MoveNext()
        {
            // null-safe: calling MoveNext again after the end of the buffer must not throw
            current = current?.Next;
            return current is not null;
        }

        public void Reset()
        {
            current = buffer.First;
        }
    }
}

@nitz

nitz commented Apr 15, 2021

I know I did just say it, but really thanks again -- your feedback, info, and involvement has been wonderful.

Based on your example in the other thread, I'm definitely on the right track, but wasn't using the right direction! I reworked the ExpectationTypeResolver and AggregateExpectation to have a more generic interface so I could pass in the type lookups I'm generating from loading some assemblies at runtime.

I think where I was going wrong was trying to use the AggregateExpectation because I wanted to discriminate by the name of one of the mappings in the node, which is why I was thinking to use it with an empty string. But your example clarified something I had been missing to this point: I'd been thinking of the "document" as a level higher than a node, rather than a node itself!

@pmikstacki

Maybe someone will find this helpful.
If you want to automatically fill up the type lookup in AggregateExpectationTypeResolver:

public class AggregateExpectationTypeResolver<T> : ITypeDiscriminator where T : class

and in the constructor:

foreach (var type in ReflectiveEnumerator.GetEnumerableOfType<T>())
{
    // use the type's own name as the key; nameof(type) would always be the literal string "type"
    typeLookup.Add(type.Name, type);
}

You can get the reflective enumerator class from here

@BuzzyLizzy

BuzzyLizzy commented Sep 14, 2021

This is extensive work, well done; it taught me a lot. I still seem to know too little, though... I could not get it to work. I was not sure how to kick off the deserialization, so I did this:

var expectationResolver = new ExpectationTypeResolver(CamelCaseNamingConvention.Instance);
var aggregateExpectationResolver = new AggregateExpectationTypeResolver(UnderscoredNamingConvention.Instance);

var deserializer = new DeserializerBuilder()
                    .WithNodeDeserializer(
                                        inner => new AbstractNodeNodeTypeResolver(inner, aggregateExpectationResolver, expectationResolver),
                                        s => s.InsteadOf<ObjectNodeDeserializer>())
                    .WithNamingConvention(CamelCaseNamingConvention.Instance)
                    .Build();
var model = deserializer.Deserialize<Model>(theString);

where "Model" is the class as you supplied above.

When I step through the code the issue I found was that in the class "AbstractNodeNodeTypeResolver" method "Deserialize" at these lines:

            // can any of the registered discriminators deal with the abstract type?
            var supportedTypes = typeDiscriminators.Where(t => t.BaseType == expectedType);
            if (!supportedTypes.Any())
            {
                // no? then not a node/type we want to deal with
                return original.Deserialize(reader, expectedType, nestedObjectDeserializer, out value);
            }

The deserialization "expectedType" is "Model", therefore it is not one of the "supportedTypes", so the code calls original.Deserialize, attempts to deserialize the example you gave above, and fails with an exception that the property "segment_with" is not part of "Model". I am probably doing something wrong, but I cannot see a way around this unless I add another discriminator for the root "Model" object. I would appreciate your expert advice on this, and thank you for all your work; I can see a lot of work in the above code.

I have the Yaml I supply as the following, since this relates to the problem I need to solve:

segment_with: event_count
segment_with: no_events  
bounds: abc              
centroid: null           
time:

My Model looks like this:

    public interface IExpectation {

    }
    
    public class TheModel {
        IExpectation[] Expectations { get; set; }
    }

    public abstract class Expectation : IExpectation {

    }

    public class BoundedExpectation : Expectation {
        public string Bounds => "bounds";
    }

    public class CentroidExpectation : Expectation {
        public string Centroid => "centroid";
    }

    public class TimeExpectation : Expectation {
        public string Time => "time";
    }

    public abstract class AggregateExpectation : IExpectation {
        /// <summary>
        /// Essentially a `kind` property - determines which child type to instantiate.
        /// </summary>
        public string SegmentWith { get; init; }
    }

    public class EventCount : AggregateExpectation {

    }

    public class NoEvents : AggregateExpectation {

    }

@atruskie
Author

@BuzzyLizzy, I would not expect my specific type resolvers to be useful in your project. That is, AggregateExpectationTypeResolver and ExpectationTypeResolver should not appear in your project.

I'd encourage you to think about the method of detection for resolving a type:

  • is there a common key with a value that tells us the type? (E.g. like the AggregateExpectationTypeResolver or a discriminated union)?
  • is the shape of the type (which keys are present) the determiner (like ExpectationTypeResolver)?

From what I can piece together of the error you mentioned, it seems like you're using the discriminator for a discriminated union on a model without the required key/value kind property (in this case segment_with).

@BuzzyLizzy

Thank you very much for your reply. As a first step to understanding the TypeResolver and how to use it, I attempted to replicate your example exactly. Once I understood it, I planned to take the next step and apply it to my own project, where I would then adapt it, as you rightly mention, to my own model. But this is a complicated topic, so I wanted to start with just this example.

@victorromeo

@atruskie I'm currently using your technique to deserialize Azure DevOps Pipeline yaml files. It is working well in most scenarios and wanted to thank you for your efforts. https://github.com/victorromeo/LocalAzureAgent

I note, that a very similar technique is actually used in the official Azure DevOps agent code base.

Currently I'm trying to extend your example to support scalar types, as they're used in the Azure DevOps Pipeline types to deserialize Variables and Parameters

In this case, deserializing the script as a Step List works out of the box, but deserializing the string doesn't. It's a primitive type, which cannot be registered as an IExpectation. I was wondering if you could comment on how you would extend your code to support primitive types?

@atruskie
Author

I'd suggest trying to implement your own ITypeDiscriminator - I can't see any reason that wouldn't work?

@JJ11teen

Hey @atruskie, would you be open to me opening a PR for YamlDotNet based on this?

@atruskie
Author

@JJ11teen Sure - last I checked there wasn't any interest.

Tag me on the PR too please. There was some confusion in this gist about which types were user implemented and which were patches.

@JJ11teen

JJ11teen commented Feb 2, 2023

Thanks @atruskie, I have opened this YamlDotNet PR :)

@Zinvoke

Zinvoke commented Feb 13, 2023

I'd love for this to be implemented.
