@atruskie
Last active May 11, 2023 03:07
Inferring abstract/interface types for YamlDotNet Deserialization

DEPRECATED

This feature is now part of YamlDotNet.

See aaubry/YamlDotNet#774 and https://github.com/aaubry/YamlDotNet/wiki/Deserialization---Type-Discriminators.
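
For readers arriving after the deprecation, here is a rough sketch of the built-in equivalent. It assumes a YamlDotNet release that includes aaubry/YamlDotNet#774 and uses the `WithTypeDiscriminatingNodeDeserializer` builder option described on the wiki page linked above. The model classes, their properties, and the `TypeDiscriminatorSketch` class are hypothetical stand-ins modelled on the usage example in this gist, not the real Egret types.

```csharp
using System;
using System.Collections.Generic;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

// Hypothetical models, loosely based on the usage example in this gist.
public interface IExpectation { }
public class EventCount : IExpectation { public string SegmentWith { get; set; } }
public class NoEvents : IExpectation { public string SegmentWith { get; set; } }
public class BoundsExpectation : IExpectation { public string Bounds { get; set; } }
public class CentroidExpectation : IExpectation { public string Centroid { get; set; } }

public static class TypeDiscriminatorSketch
{
    public static IDeserializer Build() =>
        new DeserializerBuilder()
            .WithNamingConvention(UnderscoredNamingConvention.Instance)
            .WithTypeDiscriminatingNodeDeserializer(options =>
            {
                // A "kind"-style key: the value of segment_with selects the type.
                options.AddKeyValueTypeDiscriminator<IExpectation>(
                    "segment_with",
                    new Dictionary<string, Type>
                    {
                        { "event_count", typeof(EventCount) },
                        { "no_events", typeof(NoEvents) },
                    });
                // Unique keys: the presence of a key selects the type.
                options.AddUniqueKeyTypeDiscriminator<IExpectation>(
                    new Dictionary<string, Type>
                    {
                        { "bounds", typeof(BoundsExpectation) },
                        { "centroid", typeof(CentroidExpectation) },
                    });
            })
            .Build();

    public static void Main()
    {
        var yaml = "- segment_with: event_count\n- bounds: abc\n";
        var items = Build().Deserialize<List<IExpectation>>(yaml);
        foreach (var item in items)
        {
            Console.WriteLine(item.GetType().Name);
        }
    }
}
```

The two discriminator kinds correspond to the two example resolvers in this gist: the key/value form plays the role of AggregateExpectationTypeResolver and the unique-key form plays the role of ExpectationTypeResolver.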

Addresses aaubry/YamlDotNet#343:

Determining the type from a field

This collection of classes allows one to deserialize abstract classes or interfaces by inspecting the parser's parsing events to choose an appropriate type for the YamlDotNet deserializer to use.

  • AbstractNodeNodeTypeResolver does most of the heavy work. When it encounters an abstract type or interface that a type discriminator is registered for, it buffers the parsing events for that mapping, then checks whether any registered type discriminator can resolve a concrete type from that sub-graph of parsing events.
  • IParserExtensions.cs adds TryFindMappingEntry, a handy extension method on ParsingEventBuffer (an IParser implementation) that allows for easy inspection of the keys and values in only the current mapping, ignoring any nested mappings.
  • ITypeDiscriminator is a provider interface. For each abstract/interface type you should create a type discriminator that inspects parsing events and returns an appropriate type when its conditions are met.
  • ParsingEventBuffer is a utility used by AbstractNodeNodeTypeResolver to buffer the streaming parsing events from YamlDotNet's parser. The buffer allows the events to be replayed, which:
    • allows multiple ITypeDiscriminators to be used, each receiving a replay of the events to inspect
    • and, once a type has been selected, replays the parsing events to the standard object node deserializer as if nothing ever happened
  • Finally, two example type discriminators (specific to my project) are included:
    • ExpectationTypeResolver looks for the presence of mapping keys that are unique to the descendant/concrete type. If a key is found, the appropriate Type is returned.
    • AggregateExpectationTypeResolver looks for a key common to all child types whose value encodes the desired type. This is similar to a kind key on a TypeScript discriminated union.

This solution does slow down the deserializer (technically) and adds memory pressure (technically). For small files the difference should be trivial (I haven't noticed any). For large files, try to resolve types only on leaf nodes (the most nested mappings). This keeps the number of events stored in the buffer small and allows the parser to continue operating in a mostly streaming fashion.
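
The buffer-and-replay idea can be shown in isolation. The sketch below is a simplified illustration only: `ReplayBuffer<T>` and `ReplayDemo` are hypothetical names, plain strings stand in for YamlDotNet parsing events, and nothing here touches the real parser. It shows why the buffer has to be reset before each discriminator runs and again before the events are handed on.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ParsingEventBuffer: stores events once, replays them on demand.
public sealed class ReplayBuffer<T>
{
    private readonly List<T> events;
    private int position;

    public ReplayBuffer(IEnumerable<T> source) => events = new List<T>(source);

    // The event under the cursor, or default(T) once the buffer is exhausted.
    public T Current => position < events.Count ? events[position] : default;

    public bool MoveNext()
    {
        if (position < events.Count) position++;
        return position < events.Count;
    }

    // Rewind so the next consumer sees the events from the beginning.
    public void Reset() => position = 0;
}

public static class ReplayDemo
{
    // Consumes events from the cursor onward; true if one of them equals `key`.
    public static bool ContainsKey(ReplayBuffer<string> buffer, string key)
    {
        while (buffer.Current is not null)
        {
            if (buffer.Current == key) return true;
            buffer.MoveNext();
        }
        return false;
    }

    public static void Main()
    {
        var buffer = new ReplayBuffer<string>(new[] { "bounds", "abc" });

        // Each "discriminator" gets its own replay of the same events.
        foreach (var key in new[] { "segment_with", "bounds" })
        {
            buffer.Reset();
            Console.WriteLine($"{key}: {ContainsKey(buffer, key)}");
        }

        // Rewind once more before handing the events to the real consumer.
        buffer.Reset();
        Console.WriteLine($"first event after replay: {buffer.Current}");
    }
}
```

The first check exhausts the buffer without finding its key, so without the `Reset()` call the second check would start at the end and miss `bounds`. The cost is the one named above: every buffered event stays in memory until the mapping has been deserialized.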

Usage example:

NamingConvention = UnderscoredNamingConvention.Instance;
// these resolvers allow us to deserialize to an abstract class or interface
var aggregateExpectationResolver = new AggregateExpectationTypeResolver(NamingConvention);
var expectationResolver = new ExpectationTypeResolver(NamingConvention);

YamlDeserializer = new DeserializerBuilder()
    .WithNamingConvention(NamingConvention)
    .WithNodeDeserializer(
        inner => new AbstractNodeNodeTypeResolver(inner, aggregateExpectationResolver, expectationResolver),
        s => s.InsteadOf<ObjectNodeDeserializer>())
    .Build();

// my domain models

public class Model {
  // must be public for the deserializer to populate it
  public IExpectation[] Expectation { get; set; }
}

public abstract class Expectation : IExpectation { /** blah blah blah **/ }
public class BoundsExpectation : Expectation { /** blah blah blah **/ }
public class CentroidExpectation : Expectation { /** blah blah blah **/ }
public class TimeExpectation : Expectation { /** blah blah blah **/ }

public abstract class AggregateExpectation : IExpectation {
  /// <summary>
  /// Essentially a `kind` property - determines which child type to instantiate.
  /// </summary>
  public string SegmentWith { get; init; }
}
public class EventCount : AggregateExpectation { /** blah blah blah **/ }
public class NoEvents : AggregateExpectation { /** blah blah blah **/ }

This YAML can then be deserialized:

- segment_with: event_count   # deserializes to EventCount
- segment_with: no_events     # deserializes to NoEvents
- bounds: abc                 # deserializes to BoundsExpectation
- centroid: null              # deserializes to CentroidExpectation
- time:                       # deserializes to TimeExpectation

LICENSE: MIT

using System;
using System.Collections.Generic;
using System.Linq;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NodeDeserializers;

namespace Egret.Cli.Models
{
    /// <summary>
    /// Sits in front of the ObjectNodeDeserializer and swaps an abstract or interface
    /// type for a concrete type chosen by the registered type discriminators.
    /// </summary>
    public class AbstractNodeNodeTypeResolver : INodeDeserializer
    {
        private readonly INodeDeserializer original;
        private readonly ITypeDiscriminator[] typeDiscriminators;

        public AbstractNodeNodeTypeResolver(INodeDeserializer original, params ITypeDiscriminator[] discriminators)
        {
            if (original is not ObjectNodeDeserializer)
            {
                throw new ArgumentException($"{nameof(AbstractNodeNodeTypeResolver)} requires the original resolver to be a {nameof(ObjectNodeDeserializer)}");
            }

            this.original = original;
            typeDiscriminators = discriminators;
        }

        public bool Deserialize(IParser reader, Type expectedType, Func<IParser, Type, object> nestedObjectDeserializer, out object value)
        {
            // we're essentially "in front of" the normal ObjectNodeDeserializer.
            // We could let it check if the current event is a mapping, but we also need to know.
            if (!reader.Accept<MappingStart>(out _))
            {
                value = null;
                return false;
            }

            // can any of the registered discriminators deal with the abstract type?
            // (materialized once, because the result is enumerated more than once)
            var supportedTypes = typeDiscriminators.Where(t => t.BaseType == expectedType).ToArray();
            if (!supportedTypes.Any())
            {
                // no? then not a node/type we want to deal with
                return original.Deserialize(reader, expectedType, nestedObjectDeserializer, out value);
            }

            // now buffer all the nodes in this mapping.
            // it'd be better if we did not have to do this, but YamlDotNet does not support non-streaming access.
            // See: https://github.com/aaubry/YamlDotNet/issues/343
            // WARNING: This has the potential to be quite slow and add a lot of memory usage, especially for large documents.
            // It's better, if you use this at all, to use it on leaf mappings
            var start = reader.Current.Start;
            Type actualType;
            ParsingEventBuffer buffer;
            try
            {
                buffer = new ParsingEventBuffer(ReadNestedMapping(reader));

                // use the discriminators to tell us what type it is really expecting by letting it inspect the parsing events
                actualType = CheckWithDiscriminators(expectedType, supportedTypes, buffer);
            }
            catch (Exception exception)
            {
                throw new YamlException(start, reader.Current.End, "Failed when resolving abstract type", exception);
            }

            // now continue by re-emitting parsing events
            buffer.Reset();
            return original.Deserialize(buffer, actualType, nestedObjectDeserializer, out value);
        }

        private static Type CheckWithDiscriminators(Type expectedType, IEnumerable<ITypeDiscriminator> supportedTypes, ParsingEventBuffer buffer)
        {
            foreach (var discriminator in supportedTypes)
            {
                // every discriminator inspects the events from the start of the mapping
                buffer.Reset();
                if (discriminator.TryResolve(buffer, out var actualType))
                {
                    CheckReturnedType(discriminator.BaseType, actualType);
                    return actualType;
                }
            }

            throw new Exception($"None of the registered type discriminators could supply a child class for {expectedType}");
        }

        private static LinkedList<ParsingEvent> ReadNestedMapping(IParser reader)
        {
            var result = new LinkedList<ParsingEvent>();
            result.AddLast(reader.Consume<MappingStart>());

            // depth drops below zero when the MappingEnd matching the opening MappingStart is consumed
            var depth = 0;
            do
            {
                var next = reader.Consume<ParsingEvent>();
                depth += next.NestingIncrease;
                result.AddLast(next);
            } while (depth >= 0);

            return result;
        }

        private static void CheckReturnedType(Type baseType, Type candidateType)
        {
            if (candidateType is null)
            {
                throw new InvalidOperationException($"The type resolver for AbstractNodeNodeTypeResolver returned null. It must return a valid sub-type of {baseType}.");
            }
            else if (candidateType == baseType)
            {
                // compare the types directly: candidateType.GetType() would be the runtime type of the Type object itself
                throw new InvalidOperationException($"The type resolver for AbstractNodeNodeTypeResolver returned the abstract type. It must return a valid sub-type of {baseType}.");
            }
            else if (!baseType.IsAssignableFrom(candidateType))
            {
                throw new InvalidOperationException($"The type resolver for AbstractNodeNodeTypeResolver returned a type ({candidateType}) that is not a valid sub type of {baseType}");
            }
        }
    }
}

using Egret.Cli.Models;
using System;
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;

namespace Egret.Cli.Serialization
{
    /// <summary>
    /// Example discriminator: reads the value of a common "kind"-style key
    /// (segment_with) and maps it to a concrete type.
    /// </summary>
    public class AggregateExpectationTypeResolver : ITypeDiscriminator
    {
        public const string TargetKey = nameof(AggregateExpectation.SegmentWith);
        private readonly string targetKey;
        private readonly Dictionary<string, Type> typeLookup;

        public AggregateExpectationTypeResolver(INamingConvention namingConvention)
        {
            targetKey = namingConvention.Apply(TargetKey);
            typeLookup = new Dictionary<string, Type>() {
                { namingConvention.Apply(nameof(NoEvents)), typeof(NoEvents) },
                { namingConvention.Apply(nameof(EventCount)), typeof(EventCount) },
            };
        }

        // must equal the declared (abstract/interface) type of the member being
        // deserialized; the usage example above declares IExpectation[]
        public Type BaseType => typeof(IExpectation);

        public bool TryResolve(ParsingEventBuffer buffer, out Type suggestedType)
        {
            if (buffer.TryFindMappingEntry(
                scalar => targetKey == scalar.Value,
                out Scalar _,
                out ParsingEvent value))
            {
                // read the value of the kind key
                if (value is Scalar valueScalar)
                {
                    suggestedType = CheckName(valueScalar.Value);
                    return true;
                }
                else
                {
                    FailEmpty();
                }
            }

            // we could not find our key, thus we could not determine correct child type
            suggestedType = null;
            return false;
        }

        private void FailEmpty()
        {
            throw new Exception($"Could not determine expectation type, {targetKey} has an empty value");
        }

        private Type CheckName(string value)
        {
            if (typeLookup.TryGetValue(value, out var childType))
            {
                return childType;
            }

            var known = string.Join(", ", typeLookup.Keys);
            throw new Exception($"Could not match `{targetKey}: {value}` to a known expectation. Expecting one of: {known}");
        }
    }
}

using Egret.Cli.Models;
using System;
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;

namespace Egret.Cli.Serialization
{
    /// <summary>
    /// Example discriminator: picks the concrete type from the presence of a key
    /// that is unique to that type.
    /// </summary>
    public class ExpectationTypeResolver : ITypeDiscriminator
    {
        private readonly Dictionary<string, Type> typeLookup;

        public ExpectationTypeResolver(INamingConvention namingConvention)
        {
            // type names follow the domain models listed in the usage example above
            typeLookup = new Dictionary<string, Type>() {
                { namingConvention.Apply(nameof(BoundsExpectation.Bounds)), typeof(BoundsExpectation) },
                { namingConvention.Apply(nameof(CentroidExpectation.Centroid)), typeof(CentroidExpectation) },
                { namingConvention.Apply(nameof(TimeExpectation.Time)), typeof(TimeExpectation) },
            };
        }

        // must equal the declared (abstract/interface) type of the member being
        // deserialized; the usage example above declares IExpectation[]
        public Type BaseType => typeof(IExpectation);

        public bool TryResolve(ParsingEventBuffer buffer, out Type suggestedType)
        {
            if (buffer.TryFindMappingEntry(
                scalar => typeLookup.ContainsKey(scalar.Value),
                out Scalar key,
                out ParsingEvent _))
            {
                suggestedType = typeLookup[key.Value];
                return true;
            }

            suggestedType = null;
            return false;
        }
    }
}

// IParserExtensions.cs
using System;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;

namespace Egret.Cli.Models
{
    public static class IParserExtensions
    {
        public static bool TryFindMappingEntry(this ParsingEventBuffer parser, Func<Scalar, bool> selector, out Scalar key, out ParsingEvent value)
        {
            parser.Consume<MappingStart>();
            do
            {
                // we only want to check keys in this mapping, don't descend
                switch (parser.Current)
                {
                    case Scalar scalar:
                        // we've found a scalar key, check whether it matches our predicate
                        var keyMatched = selector(scalar);

                        // move head so we can read or skip value
                        parser.MoveNext();

                        // read the value of the mapping key
                        if (keyMatched)
                        {
                            // success
                            value = parser.Current;
                            key = scalar;
                            return true;
                        }

                        // skip the value
                        parser.SkipThisAndNestedEvents();
                        break;
                    case MappingStart or SequenceStart:
                        parser.SkipThisAndNestedEvents();
                        break;
                    default:
                        // do nothing, skip to next node
                        parser.MoveNext();
                        break;
                }
            } while (parser.Current is not null);

            key = null;
            value = null;
            return false;
        }
    }
}
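
using System;

namespace Egret.Cli.Models
{
    public interface ITypeDiscriminator
    {
        // the abstract class or interface this discriminator can resolve
        Type BaseType { get; }

        // inspects the buffered parsing events; returns true and a concrete type on success
        bool TryResolve(ParsingEventBuffer buffer, out Type suggestedType);
    }
}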

using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;

namespace Egret.Cli.Models
{
    /// <summary>
    /// An IParser over a fixed list of already-parsed events that can be rewound and replayed.
    /// </summary>
    public class ParsingEventBuffer : IParser
    {
        private readonly LinkedList<ParsingEvent> buffer;
        private LinkedListNode<ParsingEvent> current;

        public ParsingEventBuffer(LinkedList<ParsingEvent> events)
        {
            buffer = events;
            current = events.First;
        }

        public ParsingEvent Current => current?.Value;

        public bool MoveNext()
        {
            // null-conditional so that moving past the end is a no-op rather than a NullReferenceException
            current = current?.Next;
            return current is not null;
        }

        public void Reset()
        {
            current = buffer.First;
        }
    }
}
@atruskie
Author

@JJ11teen Sure - last I checked there wasn't any interest.

Tag me on the PR too please. There was some confusion in this gist about which types were user implemented and which were patches.

JJ11teen commented Feb 2, 2023

Thanks @atruskie, I have opened this YamlDotNet PR :)

Zinvoke commented Feb 13, 2023

I'd love for this to be implemented.
