ewoutkramer/TypeSystemRedesign.md

# FHIR type information in .NET API 2.0

One of the major areas of work for the 2.0 version of the .NET API was better support for working with multiple versions of FHIR within a single application. Since 1.6, we have been splitting up the API into parts that are specific to a given FHIR version and parts that can be reused across FHIR versions. Additionally, we have seen the FHIR type infrastructure used to define non-FHIR models like CDA. For this to work smoothly and consistently, we had to refactor the API into parts that are reusable for non-FHIR models, parts that are specific to FHIR and parts that are specific to a certain version of FHIR.
Unfortunately, some of this redesign required a breaking change in the API surface, even for parts that are commonly used. This document details those changes.

## The source of truth: ModelInfo

In the 1.x version of the API, ModelInfo exposes a wide variety of utility methods:

- Mapping between .NET types and FHIR types
- Metadata-oriented queries like "What are all conformance resources?"
- FHIR inheritance hierarchy
- Version information

In 2.x, we are introducing UniModelInfo, a ModelInfo stripped down to contain information about the model itself, as well as listing the available datatypes. Instead of a single static instance, there will now (usually) be one instance for each model the application needs to deal with. For example, in a mapping application, this might be FHIR R3, FHIR R4 and CDA.

    public abstract class UniModelInfo : IAnnotated
    {
        /// <summary>
        /// A name given to this model, e.g. "FHIR". It is also used as a namespace
        /// prefix to make type names unique, e.g. "CDA" in the fully
        /// qualified type name "CDA.ST".
        /// </summary>
        public string Name { get; }

        /// <summary>
        /// The version of the model this class represents, e.g. "3.0.1".
        /// </summary>
        public string Version { get; }

        /// <summary>
        /// All the types defined in this model.
        /// </summary>
        public abstract IReadOnlyCollection<IStructureDefinitionSummary> Types { get; }

        /// <summary>
        /// Source for model-specific information.
        /// </summary>
        /// <param name="type"></param>
        /// <returns></returns>
        public abstract IEnumerable<object> Annotations(Type type);
    }

Information about a model may be retrieved from several sources. In the 1.x version, ModelInfo (and much of the metadata) was fed by the information read from .NET Attributes on the POCO model. However, model information could also be retrieved from a set of FHIR StructureDefinitions, a FHIR core package or a CQL ELM file.
Concrete implementations of this abstract class currently are PocoModelInfo, which retrieves information from the POCOs (much like the ModelInfo in version 1.x of the library), and StructureDefinitionModelInfo, which can read model information from a FHIR Bundle (like profiles-types.xml), a set of StructureDefinitions or a FHIR Package.

Implementer Note: There is also a SystemModelInfo class that contains the type information for the primitive "system" types on which all models are built. Any concrete implementation of UniModelInfo is supposed to expose these types as well.

To guarantee backwards compatibility, the 2.0 .NET API will still include a ModelInfo in the assembly containing the generated POCOs for each FHIR version.

Design note: this ModelInfo class is actually a subclass of PocoModelInfo and pre-loads the information from the POCOs in the same assembly. A set of extension methods provides the same methods as the current 1.x ModelInfo.

## Retrieving model-specific information

The UniModelInfo captures only information that is common and required for all types of information models (e.g. FHIR, CDA, ELM). However, it will frequently be necessary to get information and characteristics that are specific to a model. Examples are the list of "retrievable classes" for CQL, or the list of "conformance classes" for FHIR. Concrete subclasses will implement support for these specific properties, but there are a few cross-cutting aspects that we have formalized. For example, details about XML serialization are common to FHIR, CDA and ELM:

    interface IXmlSerializationModelInfo
    {
        XNamespace DefaultNamespace { get; }
        Uri SchemaLocation { get; }
    }
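
As an illustration of how a consumer might use such a cross-cutting interface, here is a minimal sketch that checks whether a model implements `IXmlSerializationModelInfo`. The `UniModelInfo` stand-in, the two demo model classes, the `QualifiedName` helper and the schema location URL are hypothetical and exist only to make the example self-contained.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical, heavily reduced stand-in for the real UniModelInfo.
public abstract class UniModelInfo
{
    public abstract string Name { get; }
}

public interface IXmlSerializationModelInfo
{
    XNamespace DefaultNamespace { get; }
    Uri SchemaLocation { get; }
}

// A demo model that knows about XML serialization.
public sealed class DemoFhirModelInfo : UniModelInfo, IXmlSerializationModelInfo
{
    public override string Name => "FHIR";
    public XNamespace DefaultNamespace => "http://hl7.org/fhir";

    // Placeholder URL, not an actual schema location.
    public Uri SchemaLocation => new Uri("http://example.org/demo-schema.xsd");
}

// A demo model without XML serialization details.
public sealed class DemoPlainModelInfo : UniModelInfo
{
    public override string Name => "Plain";
}

public static class XmlModelInfoExtensions
{
    // Qualifies a local element name with the model's default namespace,
    // if the model exposes one; otherwise returns an unqualified name.
    public static XName QualifiedName(this UniModelInfo model, string localName) =>
        model is IXmlSerializationModelInfo xml
            ? xml.DefaultNamespace + localName
            : XName.Get(localName);
}
```

With these definitions, `new DemoFhirModelInfo().QualifiedName("Patient")` yields an `XName` in the FHIR namespace, while the plain model yields an unqualified name.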

Design note: other cross-cutting aspects may be implemented using extension methods, e.g. there could be a set of non-POCO but FHIR specific extension methods to get to information currently available on UniModelInfo:

    public static bool IsReference(this UniModelInfo model, TypeInfo type)
    {
        ...
    }


## Getting to a ModelInfo

Originally, ModelInfo was a static class, so each application had exactly one model instance. To be more precise: each assembly compiled for a specific version of FHIR contained its own ModelInfo. To support having multiple models, one can now load and use multiple instances using one of the concrete implementations as discussed above.
Since most applications will still use a single model, the user can set this single application-wide model using the static UniModelInfo.Default property. Note that this property is null by default, so if you choose not to use the "1.x"-style ModelInfo (for example, to be able to use non-FHIR models) but still want the convenience of a single model, you have to initialize and set this global property at the startup of your application.
Better, however, is to explicitly instantiate and pass UniModelInfo instances to those components of the .NET API that need it. In 2.0, these components (like the parsers, serializers and validator) stick to the following design:

- They implement IModelInfoAware (which is an interface with a single read-only property Model of type UniModelInfo).
- They have an additional constructor with a parameter of type UniModelInfo (or specific subclass).
- They may provide methods taking a UniModelInfo parameter.

These components will determine the current UniModelInfo to use by first looking at the instance passed in the constructor. If no such instance was passed, they fall back to the global UniModelInfo.Default instance described above.
In fact, there is a third option, which is a hybrid between setting a global model (which is not recommended) and passing dependencies (possibly deeply into nested parts of your application): the ModelInfoContext. Using this context sets a "temporary" default model to use by IModelInfoAware components:
    var sourceData = FhirXmlParser.Parse(...);
    var fhirR3ModelInfo = StructureDefinitionModelInfo.FromPackage(....);
    var fhirR4ModelInfo = StructureDefinitionModelInfo.FromPackage(....);
    var myPocoModelInfo = PocoModelInfo.LoadFromAssembly(...);

    // We can first set the global default (if necessary).
    UniModelInfo.Default = myPocoModelInfo;

    // we can override this by creating a disposable `ModelInfoContext`
    using(new ModelInfoContext(fhirR3ModelInfo))
    {
        // here, the R3 model info applies, not the one loaded from the POCOs
        var x = sourceData.ToTypedElement();

        // of course, you can now override this by passing in an explicit
        // model info
        var y = sourceData.ToTypedElement(fhirR4ModelInfo);
    }

    // here, the default applies again.
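
The resolution order shown above (an explicitly passed model first, then the active `ModelInfoContext`, then `UniModelInfo.Default`) could be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the non-abstract `UniModelInfo` stand-in, the `ModelInfoContext.Current` property, the `ModelInfoResolver` helper and the use of `AsyncLocal<T>` to hold the temporary default are all hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in; the real UniModelInfo is abstract and has more members.
public class UniModelInfo
{
    public string Name { get; }
    public UniModelInfo(string name) => Name = name;

    // Application-wide fallback; null until the application sets it.
    public static UniModelInfo Default { get; set; }
}

public sealed class ModelInfoContext : IDisposable
{
    // Assumption: AsyncLocal keeps the override local to the current execution flow.
    private static readonly AsyncLocal<UniModelInfo> _current = new AsyncLocal<UniModelInfo>();
    private readonly UniModelInfo _previous;

    public ModelInfoContext(UniModelInfo model)
    {
        _previous = _current.Value;   // remember the outer context so nesting works
        _current.Value = model;
    }

    // The model set by the innermost active context, or null if there is none.
    public static UniModelInfo Current => _current.Value;

    // Restores whatever was active before this context was created.
    public void Dispose() => _current.Value = _previous;
}

public static class ModelInfoResolver
{
    // Explicit instance wins, then the active context, then the global default.
    public static UniModelInfo Resolve(UniModelInfo explicitModel = null) =>
        explicitModel
        ?? ModelInfoContext.Current
        ?? UniModelInfo.Default
        ?? throw new InvalidOperationException("No model info available.");
}
```

Storing the previous value in the constructor and restoring it in `Dispose` is what lets contexts nest: leaving an inner `using` block brings back the outer context rather than jumping straight to the global default.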

## Type information

One of the major functions of UniModelInfo is supplying information about the types that form the content of the model, using UniModelInfo.Types. This is a collection of TypeInfo:

    public abstract class TypeInfo
    {
        public TypeInfo Base { get; }
        public UniModelInfo DeclaringModel { get; }

        public bool IsAbstract { get; }
        public bool IsOrdered { get; }
        public bool IsBindable { get; }
        public bool IsPrimitive { get; }

        /// <summary>
        /// The unique name for the type (within this model).
        /// </summary>
        public string Name { get; }

        /// <summary>
        /// A globally unique identifier, in FHIR this would be the canonical.
        /// </summary>
        public string Identifier { get; }

        public IReadOnlyCollection<TypeElementInfo> Elements { get; }
    }

    public interface IFhirTypeInfo
    {
        bool IsResource { get; }
    }

    public interface ICDATypeInfo
    {
        // This is an extension in StructureDefinition
        string Namespace { get; }
    }

    public class PocoTypeInfo : TypeInfo, IFhirTypeInfo, IStructureDefinitionSummary
    {
    }

    public class StructureDefinitionTypeInfo : TypeInfo, IFhirTypeInfo, ICDATypeInfo, IStructureDefinitionSummary
    {
    }

Note that, just like with the UniModelInfo, TypeInfo is an abstract type for which different sources for the models will create concrete subclasses, potentially implementing interfaces containing information only applicable to that model (like IsResource here, for FHIR).
Each TypeInfo refers to the UniModelInfo that defines the type, since each type (and type name) is only unique within the context of that model, especially since the model contains a version label.
For performance reasons, applications should make sure they maintain a single instance of a model per application, so there will also be a single TypeInfo per type. This makes it easy (and more performant) to quickly reference type information from instance data encoded in ITypedElements and compare those types against the known types in the model.

Design note: we should implement Equals() in these classes to do a ReferenceEquals first and, failing that, compare them by content.
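
The equality strategy from this design note could look roughly like the sketch below. What "content" means is not specified above, so the choice of model name, model version and type name as the compared fields is an assumption, and `ComparableTypeInfo` is a hypothetical stand-in rather than the actual TypeInfo class.

```csharp
using System;

// Hypothetical stand-in for a TypeInfo that supports the two-step comparison.
public sealed class ComparableTypeInfo : IEquatable<ComparableTypeInfo>
{
    public string ModelName { get; }
    public string ModelVersion { get; }
    public string Name { get; }

    public ComparableTypeInfo(string modelName, string modelVersion, string name)
    {
        ModelName = modelName;
        ModelVersion = modelVersion;
        Name = name;
    }

    public bool Equals(ComparableTypeInfo other)
    {
        // Fast path: with a single model instance per application,
        // equal types are normally the very same object.
        if (ReferenceEquals(this, other)) return true;
        if (other is null) return false;

        // Slow path (assumption): compare the identifying content.
        return ModelName == other.ModelName
            && ModelVersion == other.ModelVersion
            && Name == other.Name;
    }

    public override bool Equals(object obj) => Equals(obj as ComparableTypeInfo);

    // Must agree with Equals: the same fields feed the hash code.
    public override int GetHashCode() => (ModelName, ModelVersion, Name).GetHashCode();
}
```

The reference check keeps the common case cheap, while the content comparison still gives correct results when the same model has been loaded more than once.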

Each TypeInfo lists its properties, here called TypeElementInfo to stick to the familiar FHIR terminology of 'element' for a property:

    public abstract class TypeElementInfo  // ElementDefinition
    {
        public string Name { get; }
        public bool IsCollection { get; }

        public TypeInfo[] Type { get; }
    }

    interface IFhirTypeElementInfo
    {
        bool InSummary { get; }      // only MaskingNode uses this
        bool IsChoiceElement { get; }
        bool IsRequired { get; }     // only MaskingNode uses this
    }

    interface ICDATypeElementInfo
    {
        string DefaultTypeName { get; }
    }

    interface IXmlSerializationTypeElementInfo
    {
        int Order { get; }
        string NonDefaultNamespace { get; }
        XmlRepresentation Representation { get; }
    }

    public class PocoTypeElementInfo : IElementDefinitionSummary, IFhirTypeElementInfo, ICDATypeElementInfo, IXmlSerializationTypeElementInfo
    {
    }

The biggest difference in structural design from the comparable IElementDefinitionSummary is that TypeElementInfo no longer directly supports the nested/backbone types common in FHIR. Each backbone element is now referred to by its actual name and type, and an element cannot define unnamed nested structures. To enable this, the UniModelInfo will contain types for each of the nested structures defined in FHIR. This design aligns much better with the type systems used in common languages like C++, C# or Java. It also ensures that the type structure is no longer (possibly infinitely deeply) nested, which makes it easier to implement type information providers.

Design note: FHIR already has an http://hl7.org/fhir/StructureDefinition/structuredefinition-explicit-type-name extension that names these "anonymous" substructures. The full name for the type will be of the form "parent type" + # + "explicit type name", e.g. Patient#Contact.

Less visible, but equally important, is the fact that the Elements property of TypeInfo will also include the primitive value element in FHIR primitive types. This aligns better with how other primitive elements (like Element.id or XHtml.div) are exposed. Since the introduction of the FhirPath System types, we have a clear way of identifying the types of such primitive elements in a StructureDefinition, and there is no longer a need to treat the value element differently. This will also make it possible for non-FHIR models (like CDA) to introduce primitive elements other than value.

## Use of type information in instances

In 1.x, we already supported working with data across versions and models using the ITypedElement interface. In 2.0, this interface will change so that it can use the new TypeInfo classes:

    public interface ITypedElement
    {
        IEnumerable<ITypedElement> Children(string name=null);

        string Name { get; }

        TypeInfo InstanceType { get; }

        object Value { get; }

        string Location { get; }
    }

There are two important changes from the 1.x version of this interface:

- InstanceType now refers directly to the corresponding type from the UniModelInfo. This makes it very straightforward to get to the corresponding type information, without the need for lookups via external providers (as is the case for the IStructureDefinitionSummaryProviders now). Since each type refers to its model, it is also now much easier to get a reference to the UniModelInfo and get metadata information about all the other types. Additionally, in the 1.x version, the named type in InstanceType was usually meant to be a FHIR type. In 2.0, this reference unambiguously refers to a specific type from a specific (possibly non-FHIR) model.
- Since type information is easier to get to, we no longer need an explicit ElementDefinition property. It can be recovered by getting the member information for the declaring type of that property. This is a different access path than in the 1.x version, where the property itself owned its metadata, but it is more aligned with existing reflection systems in Java and .NET and also makes implementing ITypedElement much easier.
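
To illustrate the second point, the sketch below recovers an element's metadata from the type that declares it instead of from the instance node. The `DemoTypeInfo` and `DemoElementInfo` classes and the `FindElement` helper are hypothetical stand-ins, and walking the `Base` chain is an assumption: if `Elements` already includes inherited elements, the loop would simply stop at the first type.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced stand-in for TypeElementInfo.
public sealed class DemoElementInfo
{
    public string Name { get; }
    public bool IsCollection { get; }

    public DemoElementInfo(string name, bool isCollection)
    {
        Name = name;
        IsCollection = isCollection;
    }
}

// Hypothetical, reduced stand-in for TypeInfo.
public sealed class DemoTypeInfo
{
    public string Name { get; }
    public DemoTypeInfo Base { get; }
    public IReadOnlyCollection<DemoElementInfo> Elements { get; }

    public DemoTypeInfo(string name, DemoTypeInfo baseType, params DemoElementInfo[] elements)
    {
        Name = name;
        Base = baseType;
        Elements = elements;
    }
}

public static class DemoTypeInfoExtensions
{
    // Looks up the metadata for a child element on the parent's type,
    // checking base types as well; returns null when nothing matches.
    public static DemoElementInfo FindElement(this DemoTypeInfo type, string elementName)
    {
        for (var current = type; current != null; current = current.Base)
        {
            var match = current.Elements.FirstOrDefault(e => e.Name == elementName);
            if (match != null) return match;
        }
        return null;
    }
}
```

A consumer holding a parent node would call something like `parent.InstanceType.FindElement(child.Name)`, much as one would look up a member on a type when using .NET reflection.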


We can now also redesign ElementNode to be much more performant, and to no longer require direct access to an IResourceResolver. The user could (if necessary) supply instance data to an ElementNode, and only in the case of choice types would this really be necessary, in which case a single reference to a type in the current UniModelInfo would suffice.

Discussion:


- (UniModelInfo + ModelInfo) or (ModelInfo + ModelInfoR3/R4/R5)?
- Have a single concrete ModelInfo and multiple ModelInfoProviders (which put model-specific stuff in annotations), or an abstract ModelInfo with multiple implementations (as documented here)?
- The same is true for TypeInfo. We could have the specific ModelInfo providers/implementations (see previous bullet) create either a specific subclass or a single TypeInfo class. Again, model-specific stuff would go on Annotations.
- Use Annotations (like the FhirParsers) for specific information, or just allow type sniffing (if(myModel is IPocoMappingInfo) or even if(myModel is PocoModelInfo))?
- Must these concrete implementations be public at all?
- Shall we rebrand IStructureDefinitionSummary as ITypeInfo? How much will that break? We could make the actual implementer of ITypeInfo also implement IStructureDefinitionSummary for backward compatibility, or have a TypeInfo base class that implements IStructureDefinitionSummary.
- CQL defines specific subclasses of TypeInfo for classes, lists, intervals, tuples and simple types (primitives). In particular, should we have the separation between PrimitiveTypeInfo and ClassTypeInfo? For now, I have decided to stay close to the .NET (and Java) reflection setup (no subclasses of System.Type). We could have an IsPrimitive boolean if needed.
- Should we, for backward-compatibility reasons, still include the old InstanceType property and add a new one to represent the TypeInfo?