@steveharter
Last active July 10, 2020 21:24
Programming model options for code-gen

Overview

This covers basic questions and proposals for the high-level programming model for code-gen, that is, the API surface end-users will work with after code-gen has executed.

The code generated for each POCO will likely be the same no matter what the high-level programming model ends up being. The exception is any added "serialize" and "deserialize" methods, which are just thin wrappers on top of JsonSerializer.Serialize() and JsonSerializer.Deserialize().

The actual per-POCO generated code that remains the same will be covered elsewhere, but essentially is a wrapper over a modified JsonClassInfo class to expose metadata and callbacks.

Assumptions:

  • Having a single JsonSerializerOptions instance should work for all types in a project. If this is not sufficient, we should look at adding features to the serializer such as additional custom attributes to control behavior for certain Types.

Notes:

  • We currently do not have a way to specify the default JsonSerializerOptions (todo:find existing issue) but that would be desirable in most programming model examples here.

Registering Types

  • Automatic registration: this would imply that the options class looks up metadata of some form or is initialized automatically:
    • An attribute on a type such as [JsonSerializable] that specifies the JsonClassInfo type or a class that can return it.
      • This will not be performant on first-time startup since the serializer will need to call Type.GetCustomAttributes() for each Type.
    • An assembly-level attribute such as [JsonSerializableTypes] that contains all types that need to be registered and their corresponding JsonClassInfo.
      • This would be acceptable for first-time startup (a single attribute for all Types).
    • A module or assembly-level constructor. This is currently being considered for C# 9.0, although it is not certain whether it will ship.
    • A static constructor on a well-known type.
  • Registration through generated code: this would imply that the options class is modified by calling generated code:
    • Having (de)serialization API entry points for a particular POCO Type (or "context" class) that auto-register. This is the most efficient but does require calls to new serialization APIs.

If initialization isn't performed, the current design will still allow the existing JsonSerializer methods to function properly, but they will be slower.

Lazy initialization

A given project may have hundreds of code-generated POCOs, but for a given scenario (or a benchmark) only a handful may be necessary. To mitigate performance concerns there are two strategies:

  1. Zero overhead until used. This means that there is no overhead until the Type is used.
  2. Lazy initialization. This means there is minimal overhead until the Type is used: the actual metadata won't be created for the Type until the first (de)serialization.
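
As a rough illustration of the second strategy, the sketch below defers metadata creation with Lazy<T>. PocoMetadata and GeneratedMetadata are hypothetical stand-ins invented for this example; they are not the real JsonClassInfo or any generated code, and the actual design may defer creation differently.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for per-type metadata; not the real JsonClassInfo.
public sealed class PocoMetadata
{
    // Counts how many metadata instances have actually been built.
    public static int BuildCount;

    public string TypeName { get; }

    public PocoMetadata(string typeName)
    {
        Interlocked.Increment(ref BuildCount);
        TypeName = typeName;
    }
}

// Hypothetical generated holder: each POCO costs one small Lazy<T> wrapper
// until its metadata is first requested.
public static class GeneratedMetadata
{
    private static readonly Lazy<PocoMetadata> s_myPoco =
        new Lazy<PocoMetadata>(() => new PocoMetadata("MyPoco"));

    private static readonly Lazy<PocoMetadata> s_myPoco2 =
        new Lazy<PocoMetadata>(() => new PocoMetadata("MyPoco2"));

    // The factory delegate runs only on the first access of Value.
    public static PocoMetadata MyPoco => s_myPoco.Value;
    public static PocoMetadata MyPoco2 => s_myPoco2.Value;
}
```

With this shape, a project that only ever touches MyPoco pays for the MyPoco2 wrapper but never builds its metadata.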

Avoiding Dictionary lookup on public entry point API

Every top-level call to serialize or deserialize performs an internal lookup on a Dictionary<Type, JsonClassInfo> to obtain the root-level metadata for the Type. For small POCOs (or even value types like an Int32) this can be significant - about 10% of the total cost.

All of the "automatic registration" examples would require a dictionary lookup.

The "registration through generated code" examples would not require a dictionary lookup. This is achieved by passing the JsonClassInfo metadata class into the entry points. By having the metadata class also contain the reference to a JsonSerializerOptions instance, just a single "context" parameter is necessary when calling the public entry points.

By avoiding the dictionary lookup and using "registration through generated code" we also get lazy initialization.
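
The difference between the two kinds of entry point can be sketched as follows. All names here (SerializerSettings, TypeMetadata, EntryPoints, Resolve) are hypothetical stand-ins made up for illustration and do not correspond to real System.Text.Json APIs; the dictionary is also left unsynchronized for brevity, which a real implementation would not do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the options class.
public sealed class SerializerSettings { }

// Hypothetical stand-in for the metadata class. It carries a reference to
// the settings, so a single "context" argument is enough at the call site.
public sealed class TypeMetadata
{
    public Type Type { get; }
    public SerializerSettings Settings { get; }

    public TypeMetadata(Type type, SerializerSettings settings)
    {
        Type = type;
        Settings = settings;
    }
}

public static class EntryPoints
{
    private static readonly Dictionary<Type, TypeMetadata> s_cache =
        new Dictionary<Type, TypeMetadata>();

    // Counts dictionary lookups so the two paths can be compared.
    public static int LookupCount;

    // Options-based entry point: must look up the metadata by Type on every call.
    public static TypeMetadata Resolve(Type type, SerializerSettings settings)
    {
        LookupCount++;
        if (!s_cache.TryGetValue(type, out TypeMetadata metadata))
        {
            metadata = new TypeMetadata(type, settings);
            s_cache[type] = metadata;
        }
        return metadata;
    }

    // Metadata-based entry point: the caller supplies the metadata, so no lookup occurs.
    public static TypeMetadata Resolve(TypeMetadata metadata) => metadata;
}
```

The second overload is the shape that the "registration through generated code" options rely on: the generated code holds the metadata instance and hands it straight to the serializer.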

Do we generate facades per POCO type?

The methods can either be in each POCO (if owned):

// Existing POCO class elsewhere with the actual properties etc.
// This is just the code-gen public portion.
public partial class MyPoco
{
    public static JsonClassInfo JsonClassInfo {get;} 
    public static MyPoco Deserialize(string json);
    public string Serialize();
}

or a facade class if not owned, or for all cases if consistency across owned + unowned is desired:

public class MyPocoSerializer
{
    // Not a partial class since it creates a new Type with "Serializer" appended.
    public JsonClassInfo JsonClassInfo {get;} 
    public MyPoco Deserialize(string json);
    public string Serialize(MyPoco obj);
}

The implementation will likely call a new public overload on JsonSerializer that takes the metadata class JsonClassInfo. These new overloads can also be called directly:

MyPoco obj = ...
JsonSerializer.Serialize(obj, MyPocoSerializer.JsonClassInfo);

Note that many overloads may be desired (only string shown above), causing some code bloat:

  • string
  • byte[]\Span
  • Stream
  • Utf8JsonReader\Utf8JsonWriter

Options for adding a per-project wrapper

This makes it easier to obtain either the POCO facade (if we generate POCO facades) or the metadata class for each POCO (if no POCO facades are generated).

An issue with this pattern is that each project will have a different JsonSerializerOptions class which may or may not be desirable. If Types are used across two projects, for example, having two instances of the options class is not desirable. Also see the next section about using a "context" class which handles this case.

Option 1: generate derived JsonSerializerOptions

Currently the JsonSerializerOptions class is sealed, but could be unsealed and derived from.

Supporting POCO facades:

public sealed class MyJsonSerializerOptions : JsonSerializerOptions
{
    public Types Types {get;}

    public class Types
    {
        public MyPocoSerializer MyPoco {get;}
        public MyPoco2Serializer MyPoco2 {get;}
        public MyPoco3Serializer MyPoco3 {get;}
        ...
    }
}

Used like:

var options = new MyJsonSerializerOptions();
MyPoco obj = options.Types.MyPoco.Deserialize(json);

Without POCO facades:

public class MyJsonSerializerOptions : JsonSerializerOptions
{
    public Types Types {get;}

    public class Types
    {
        public JsonClassInfo MyPoco {get;}
        public JsonClassInfo MyPoco2 {get;}
        public JsonClassInfo MyPoco3 {get;}
        ...
    }

    // This is called if someone forgets to pass in the JsonClassInfo and instead passed in the options instance.
    protected override JsonClassInfo GetJsonClassInfo(Type type) {}
}

Used like:

var options = new MyJsonSerializerOptions();
MyPoco obj = JsonSerializer.Deserialize<MyPoco>(json, options.Types.MyPoco);

This option doesn't require per-Type code gen. The nice thing here is that the "known types" are baked into the derived options class, so we can delay metadata creation until first usage:

var options = new MyJsonSerializerOptions();
// Oops, forgot to pass in options.Types.MyPoco but the options class will
// figure that out for me and auto-register (although requires dictionary lookup)
MyPoco obj = JsonSerializer.Deserialize<MyPoco>(json, options);

Option 2: add a new JsonSerializerContext class

As an alternative to unsealing JsonSerializerOptions, create a new type. This allows each project to have their own context class but with a shared options instance.

The context would support IDisposable to allow removal of cached metadata from JsonSerializerOptions for types in the project. Being able to unload metadata was asked for by the community (todo:find issue) to support assembly unloading. Note this could also be done without adding JsonSerializerContext by adding new members to the existing options class such as RemoveMetadataFor(Type) and/or RemoveMetadataForAssembly(Assembly).

public class JsonSerializerContext : IDisposable
{
    public JsonSerializerContext();
    public JsonSerializerContext(JsonSerializerOptions options);
    public JsonSerializerOptions JsonSerializerOptions {get;}
}

Supporting POCO facades:

Now no nested Types property\class is needed:

public class MyJsonSerializerContext : JsonSerializerContext
{
    // no nested "Types" class is necessary

    public MyPocoSerializer MyPoco {get;}
    public MyPoco2Serializer MyPoco2 {get;}
    public MyPoco3Serializer MyPoco3 {get;}
}

Used like:

var context = new MyJsonSerializerContext(options);
MyPoco obj = context.MyPoco.Deserialize(json);

or with a using statement for IDisposable:

using (var context = new MyJsonSerializerContext(options))
{
    MyPoco obj = context.MyPoco.Deserialize(json);
}
// The MyPoco metadata is removed from `JsonSerializerOptions`

Without POCO facades:

public class MyJsonSerializerContext : JsonSerializerContext
{
    // no nested "Types" class is necessary
    public JsonClassInfo MyPoco {get;}
    public JsonClassInfo MyPoco2 {get;}
    public JsonClassInfo MyPoco3 {get;}
}

Used like:

var context = new MyJsonSerializerContext(options);
MyPoco obj = JsonSerializer.Deserialize<MyPoco>(json, context.MyPoco);

This option doesn't require per-Type code gen.

Without POCO facades + add generics for JsonClassInfo:

This avoids having to specify the generic type argument explicitly, as is required today during deserialization:

// Today you have to close the generic:
MyPoco obj = JsonSerializer.Deserialize<MyPoco>(json);

public class MyJsonSerializerContext : JsonSerializerContext
{
    // no nested "Types" class is necessary
    public JsonClassInfo<MyPoco> MyPoco {get;}
    public JsonClassInfo<MyPoco2> MyPoco2 {get;}
    public JsonClassInfo<MyPoco3> MyPoco3 {get;}
}

Used like:

var context = new MyJsonSerializerContext(options);
// Due to generic inference, we don't have to close the generic anymore.
MyPoco obj = JsonSerializer.Deserialize(json, context.MyPoco);
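
To show why the generic metadata class lets the compiler infer the type argument, here is a small self-contained sketch. TypeMetadata<T>, Serializer and MyContext are hypothetical stand-ins for JsonClassInfo<T>, JsonSerializer and the generated context; the "parsing" is a placeholder delegate, not real JSON handling.

```csharp
using System;

// Hypothetical stand-in for a generic metadata class such as JsonClassInfo<T>.
public sealed class TypeMetadata<T>
{
    private readonly Func<string, T> _parse;

    public TypeMetadata(Func<string, T> parse) => _parse = parse;

    public T Parse(string json) => _parse(json);
}

public sealed class MyPoco
{
    public string Raw { get; set; }
}

// Hypothetical stand-in for the serializer entry point.
public static class Serializer
{
    // T is inferred from the metadata argument, so callers never write <MyPoco>.
    public static T Deserialize<T>(string json, TypeMetadata<T> metadata) =>
        metadata.Parse(json);
}

// Hypothetical stand-in for the generated context class.
public sealed class MyContext
{
    // Placeholder "parser" that just stores the input text.
    public TypeMetadata<MyPoco> MyPoco { get; } =
        new TypeMetadata<MyPoco>(json => new MyPoco { Raw = json });
}
```

A call such as `MyPoco obj = Serializer.Deserialize("{}", new MyContext().MyPoco);` compiles without an explicit type argument because the second parameter fixes T.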

It appears this option is best suited to handle all scenarios.

Note we will likely need to rename JsonClassInfo<T> to JsonTypeInfo<T> for consistency with other similar classes like System.Reflection.TypeInfo.
