While I was working on the new code generator for the .NET API 2.0, I had to review the class hierarchy of FHIR and how we represent this in the `StructureDefinition`s that are part of the specification. Certainly, if you are working with multiple versions of FHIR and do any kind of metadata work, you will have found yourself trying to remember the answer to questions like: "Is `DataRequirement.codeFilter` based on `Element` or `BackboneElement`?", "Was `SimpleQuantity` a datatype or a profile on `Quantity` in R3?", "How did we specify the datatype of `Narrative.div` in R4? Did that change across FHIR versions?"
Yet again, I found myself digging through tons of StructureDefinitions to find out the details I needed to get the code generation done. I told myself that this time around, I would actually document it, so you (and a future me) would have just a single page to go to.
Let's first take a look at the Resources. The Resource inheritance structure is pretty simple:
Even so, there are a few interesting things to notice:
- `Resource` and `DomainResource` are abstract classes: you will never find an instance of these in your data.
- `DomainResource` (from which most other resources derive) introduces the `text`, `contained`, `extension` and `modifierExtension` properties. This means that the other three(!) remaining resources (`Binary`, `Bundle` and `Parameters`) cannot be extended, nor can they contain a human-readable summary. They also cannot contain contained resources!
If you have been looking at the R5 preview, you might have encountered two abstract `DomainResource` subclasses: `CanonicalResource` and its subclass `MetadataResource`. Together, they specify a set of elements shared between the conformance resources (like `StructureDefinition`, `ValueSet`) and other "metadata" resources that have authoring information associated with them (e.g. `PlanDefinition`, `TestScript`). These resources are really more like interfaces, and are not really part of the inheritance hierarchy of the resources. Currently, if you look in the `StructureDefinition` for, say, `PlanDefinition`, you'll find:
"type" : "PlanDefinition",
"baseDefinition" :
"http://hl7.org/fhir/StructureDefinition/MetadataResource",
"derivation" : "specialization",
Which would tell you that `MetadataResource` is the superclass for `PlanDefinition`. Which it really is not. By the time we publish R5, we will have to find a way to express that this resource is really a specialization of `DomainResource`, but implements `MetadataResource`.
This will remain an R5 feature, so in R3 and R4, all the resources that moved under this new `CanonicalResource` are, in fact, subclassed directly from `DomainResource`.
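Until the specification can express "implements", a code generator working against the R5 preview could treat the two interface-like types as transparent when it resolves the superclass. A minimal sketch, assuming the StructureDefinition has been parsed into a plain dict; the `INTERFACE_LIKE` set and the `effective_base` helper are hypothetical names, not part of any FHIR library:

```python
from typing import Optional

# Assumption: these are the only interface-like resources, as described above.
INTERFACE_LIKE = {"CanonicalResource", "MetadataResource"}


def effective_base(structure_definition: dict) -> Optional[str]:
    """Return the type name to use as the superclass in generated code."""
    base_url = structure_definition.get("baseDefinition")
    if base_url is None:
        return None  # no declared base at all
    # The type name is the last segment of the canonical url.
    base_name = base_url.rsplit("/", 1)[-1]
    # Interface-like bases are skipped: the real superclass is DomainResource.
    return "DomainResource" if base_name in INTERFACE_LIKE else base_name
```

For the `PlanDefinition` fragment shown above, this yields `DomainResource`; the generator would still need to record separately that the class implements `MetadataResource`.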
So, if there is an inheritance hierarchy, does that mean you can actually use a concrete type where there is an element that allows one of the supertypes? The answer is yes, in at least three places (there might be more):
- In `StructureDefinition`, to define the type for the `contained` element in `DomainResource`:

      "path" : "DomainResource.contained",
      "short" : "Contained, inline Resources",
      "min" : 0,
      "max" : "*",
      "type" : [{ "code" : "Resource" }],

  This principle is also used in the `Parameters.parameter.resource` and the `Bundle.entry.resource` elements. I am not aware of any others.
- To specify that a resource reference can point to "Any" other resource. This happens often, and appears in the `targetProfile` element of a reference element in `StructureDefinition`, e.g.:

      "path" : "MessageHeader.focus",
      "short" : "The actual content of the message",
      "type" : [{ "code" : "Reference", "targetProfile" : ["http://hl7.org/fhir/StructureDefinition/Resource"] }],

- To define the `SearchParameter` common to all resources, as specified in the `base` element:

      "name" : "_id",
      "description" : "Logical id of this artifact",
      "code" : "_id",
      "base" : ["Resource"],
      "expression" : "Resource.id"

  In case you were wondering, there is a `SearchParameter` on `DomainResource` (and thus, any subclass) as well:

      "name" : "_text",
      "description" : "Search on the narrative of the resource",
      "code" : "_text",
      "base" : ["DomainResource"]
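To make the last case concrete, here is a rough sketch of how software could decide whether a search parameter defined on an abstract supertype applies to a concrete resource. The `PARENT` table is a tiny hand-written slice of the hierarchy for illustration only, and `ancestors` and `applies_to` are hypothetical helper names:

```python
# Illustrative slice of the resource hierarchy (child -> parent).
PARENT = {
    "DomainResource": "Resource",
    "Patient": "DomainResource",
    "Bundle": "Resource",
}


def ancestors(resource_type: str) -> list:
    """Return the type itself followed by all of its supertypes."""
    chain = []
    current = resource_type
    while current is not None:
        chain.append(current)
        current = PARENT.get(current)
    return chain


def applies_to(search_parameter: dict, resource_type: str) -> bool:
    """True if any entry in SearchParameter.base is the type or a supertype."""
    lineage = ancestors(resource_type)
    return any(base in lineage for base in search_parameter.get("base", []))
```

With this, `_id` (base `Resource`) applies to both `Patient` and `Bundle`, while `_text` (base `DomainResource`) applies to `Patient` only.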
The `id` element has always been a source of confusion, maybe because we have two similar, but completely different, `id` elements: one on `Resource` and one on `Element` (and thus any datatype). We'll talk about that second one later, but let me first stress that `Resource.id` is a normal element: it behaves like `Patient.active` or any other innocent property. Its datatype is `id`, a normal FHIR datatype (a FHIR primitive, not a bare string), which means you may even extend it.
We have managed to make this worse by getting the datatype wrong in the `StructureDefinition` for `Resource` in R4:
<path value="Resource.id"/>
<type>
<extension url="http://hl7.org/fhir/StructureDefinition/structuredefinition-fhir-type">
<valueUrl value="string"/>
</extension>
<code value="http://hl7.org/fhirpath/System.String"/>
</type>
The extension is saying that this property is actually not a complex FHIR type, but a primitive string like we find in the `value` attribute in the XML representation for a `code` or `string`. That is wrong. It is currently not much better in R5:
<path value="Resource.id"/>
<type>
<extension url="http://hl7.org/fhir/StructureDefinition/structuredefinition-fhir-type">
<valueUrl value="id"/>
</extension>
<code value="http://hl7.org/fhirpath/System.String"/>
</type>
STU3 is (for now) the last version to get this right:
<path value="Resource.id"/>
<type>
<code value="id"/>
</type>
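Given these three variants, tooling that spans versions will probably want to normalize the type of `Resource.id` itself. The sketch below assumes the `type` entry has been read from the JSON form of the StructureDefinition (so the extension value appears under the `valueUrl` key); `fhir_type_of` is a hypothetical helper, and the hard-coded correction is my own workaround, not something the specification prescribes:

```python
from typing import Optional

FHIR_TYPE_EXT = "http://hl7.org/fhir/StructureDefinition/structuredefinition-fhir-type"
SYSTEM_PREFIX = "http://hl7.org/fhirpath/System."


def fhir_type_of(path: str, type_entry: dict) -> Optional[str]:
    """Return the FHIR type name declared for an element's type entry."""
    # Hard-coded correction: Resource.id is of FHIR type `id` in every version,
    # whatever the R4/R5 StructureDefinition claims.
    if path == "Resource.id":
        return "id"
    code = type_entry.get("code", "")
    if not code.startswith(SYSTEM_PREFIX):
        return code  # a regular FHIR type code, as in STU3
    # System type url: fall back to the fhir-type extension, if present.
    for ext in type_entry.get("extension", []):
        if ext.get("url") == FHIR_TYPE_EXT:
            return ext.get("valueUrl")
    return None
```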
The situation for the datatypes is more complex, and has seen more changes from FHIR version to version. In my mental picture of the datatypes (and elements using the datatypes) there are three broad categories, each with their own set of quirks:
- The complex datatypes (`Identifier`, `HumanName`, etc.)
- The primitive datatypes (`string`, `code`, etc.)
- Backbones
All of these derive (indirectly or directly) from `Element`. The first two can be distinguished (in R3 and R4) only by looking at `StructureDefinition.kind`, which is `complex-type` or `primitive-type` respectively. In R5, complex datatypes and primitives do not directly derive from `Element` anymore; instead, R5 introduces two new abstract classes, `DataType` and `PrimitiveType`. As you can guess, all primitives now derive from the latter, whereas complex types are children of `DataType`.
The Backbones are the most idiosyncratic of the types, so let's start with those.
Backbones are the "anonymous" types defined in-place in the resource or datatype, and as such are not defined as independent, identifiable datatypes. Examples are `Patient.contact` (where contact is a set of elements that repeat, defined in place in Patient) and `DataRequirement.codeFilter` (the same, but inside a datatype). Let us take a look:
{
"extension" : [{
"url" : "http://hl7.org/fhir/StructureDefinition/structuredefinition-explicit-type-name",
"valueString" : "Contact"
}],
"path" : "Patient.contact",
"type" : [{ "code" : "BackboneElement" }],
}
{
"path" : "Patient.contact.relationship",
}
As you can see, this backbone (here `Patient.contact`) is declared to be of type `BackboneElement`. Subsequent child elements (I've just shown `Patient.contact.relationship` here) define the child members of the element (and thus the type). Note, again, that this is done inside the `Patient` resource, has no canonical url of its own and thus cannot be re-used by other resources. Also, since R3, there is an extension to specify a name for the backbone type. This name is not unique, and is mostly used for rendering purposes (e.g. in UML diagrams, this backbone would still be represented as a class with a name) and code generation (in most programming languages this nested class would need to be represented as a first-class named class type). Unfortunately, you cannot assume all backbones have this extension specified, and you need a fallback to derive your own name (e.g. using the last part of the path) if necessary.
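The fallback described above can be sketched in a few lines. This assumes the ElementDefinition is available as a dict parsed from JSON; `backbone_type_name` is a hypothetical helper, and capitalizing the last path segment is just one possible naming convention:

```python
EXPLICIT_TYPE_NAME = (
    "http://hl7.org/fhir/StructureDefinition/structuredefinition-explicit-type-name"
)


def backbone_type_name(element: dict) -> str:
    """Pick a class name for a backbone element."""
    # Prefer the explicit type name extension when it is present.
    for ext in element.get("extension", []):
        if ext.get("url") == EXPLICIT_TYPE_NAME and ext.get("valueString"):
            return ext["valueString"]
    # Fallback: capitalize the last segment of the element path.
    last = element["path"].rsplit(".", 1)[-1]
    return last[:1].upper() + last[1:]
```

Since the name is not unique, a generator would typically still nest or prefix it with the owning type (for example `Patient.Contact`) to avoid clashes.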
The type `BackboneElement` itself is a direct subclass of `Element`, and only adds the `modifierExtension` element. It is important to remember that all backbones in resources are using `BackboneElement`. Backbones also exist in datatypes, but they are relatively rare (examples are `Dosage.doseAndRate`, `Timing.repeat` and `ElementDefinition.slicing`). Backbones in datatypes are of type `Element`, rather than `BackboneElement`, to enforce the general FHIR rule that datatypes (including their elements) cannot have a modifier extension.
By the way, there is subtle room for error in your programming logic here. We noted before that the use of the abstract type `Resource` in `DomainResource.contained` meant you could substitute any resource at that point. The superficially identical use of the abstract `BackboneElement` as the type of a backbone element evidently does not imply the same kind of polymorphism.
Other elements (and currently, other elements appearing later in the StructureDefinition) may re-use this backbone element to specify their type, by using a `contentReference` element in the `ElementDefinition` (this is from `Questionnaire.item.item`):
<contentReference value="#Questionnaire.item"/>
Note the '#' here. It means "inside the base definition for this StructureDefinition". This means that a `contentReference` always refers to the element in the base definition, not to the element in the StructureDefinition you are currently processing!
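A resolver for such references could look roughly like this. It is only a sketch: `resolve_content_reference` is a hypothetical helper, the caller is assumed to pass in the element list of the base definition (per the rule above), and matching on the element `id` with a fallback to `path` is my assumption:

```python
from typing import Optional


def resolve_content_reference(element: dict, base_elements: list) -> Optional[dict]:
    """Find the element that a '#...' contentReference points to."""
    ref = element.get("contentReference")
    if not ref or not ref.startswith("#"):
        return None  # nothing to resolve
    target = ref[1:]  # strip the leading '#'
    for candidate in base_elements:
        # Assumption: compare against the element id, or the path if no id.
        if candidate.get("id", candidate.get("path")) == target:
            return candidate
    return None
```

The children of the referenced element (here, everything under `Questionnaire.item`) then define the members of the referring element.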
The difference between complex and primitive datatypes (and even resources) is pretty slim in FHIR, mostly because FHIR's primitives are not really primitive: they can still be extended and can be identified for reference. What does set them apart is the presence of a `value` element (represented as an attribute in XML) that is typed as a primitive value (e.g. a simple string or integer). This is confusing, and amongst my colleagues we have introduced terminology like "real primitive" and "FHIR primitive" to distinguish the two.
That being said, it also means that you can treat the StructureDefinitions for both almost identically, and you just have to be aware of the `value` attribute (more about that later).
There are a few things worth pointing out. One is the use of extensions: as discussed before, FHIR allows extensions anywhere, but modifier extensions only on resources, not on datatypes. By adding `modifierExtension` to `DomainResource` and `BackboneElement`, but not to `Element`, this was structurally enforced in R3. However, in the R3 timeframe (and before), we had promoted some commonly used backbones to datatypes, among them `ElementDefinition` and `Dosage`. As part of resources, these structures had allowed modifier extensions; however, once derived from `Element` (because of their "promotion" to a datatype), this was no longer allowed. In R4, we corrected this mistake by formally identifying a subset of datatypes that could have modifier extensions on them. The way we did that was by making these types derive from `BackboneElement`. This seemed to make sense: as part of resources they used to be backbone elements, now they were simply stand-alone datatypes, still deriving from `BackboneElement`. It did conflate the notion of a "true" backbone element and a datatype, however, and in R5, the datatype hierarchy got enriched by a true abstract `BackboneType` class, which is the parent class of all datatypes that allow modifier extensions. The full hierarchy (with its history) is summarized below:
The set of datatypes allowing modifiers (under `BackboneElement` in R4, and `BackboneType` in R5) is limited. Since R4 we have `Dosage`, `ElementDefinition` and `Timing`. R5 will (probably) add `MarketingStatus`, `OrderedDistribution`, `Population`, `ProdCharacteristic`, `ProductShelfLife` and `Statistic`.
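Rather than hard-coding this list, a tool can derive it from the `baseDefinition` of each datatype. A small sketch, assuming the StructureDefinition is a dict parsed from JSON; `datatype_allows_modifier_extensions` is a hypothetical helper name:

```python
# Bases that carry modifierExtension: BackboneElement (R4), BackboneType (R5).
MODIFIER_BASES = {"BackboneElement", "BackboneType"}


def datatype_allows_modifier_extensions(structure_definition: dict) -> bool:
    """True if a complex datatype derives from a modifier-carrying base."""
    if structure_definition.get("kind") != "complex-type":
        return False  # this check is only meant for datatypes
    base_url = structure_definition.get("baseDefinition", "")
    return base_url.rsplit("/", 1)[-1] in MODIFIER_BASES
```

Note that this only inspects the direct base, which matches the hierarchy as described here; a deeper hierarchy would require walking the chain of bases.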
Previously, I told you that the only difference in treating StructureDefinitions for complex types and primitive types was the `value` element. Well, that was mostly true. There are three other elements, part of complex datatypes, that behave just like the `value` attribute. These are:
- `Extension.url` - a simple string that contains a uri
- `Narrative.div` - a simple string that contains xhtml
- `Element.id` - a simple string containing an id of the element
Now, I purposely use the term "simple string" here, since these three elements are not of a FHIR type: they cannot be extended, they are simple values. Unfortunately, we made quite a mess of how we specified this in the FHIR core StructureDefinitions. Before STU3, we actually typed these elements as FHIR types (there wasn't anything else we could do). The `ElementDefinition.type.code`, where the type for an element lives, had a required binding, so we just pretended our world was turtles all the way down. A primitive's `value` attribute thus was declared to be of another FHIR type. The serialization for XML and JSON would not allow you to do non-primitive things with them anyway. This heritage is still visible: if you look at the definition of an `Extension` on the HL7 website, you will see this:
Note how the type for `url` is still indicated as `uri`; it will even forward you to the FHIR "primitives" page. But in fact, it is a simple string.
By the time we were working on R3, we had become convinced this was the wrong approach. Unfortunately, we chose the wrong solution for R3: we chose not to supply a type, and used extensions to tell you the kind of "real primitive" value you were dealing with:
<!-- R3 situation -->
<element id="boolean.value">
<path value="boolean.value"/>
<representation value="xmlAttr"/>
<!-- Note: primitive values do not have an assigned type. e.g. this is compiler magic. XML, JSON and RDF types provided by extension -->
<type>
<code>
<extension url="http://hl7.org/fhir/StructureDefinition/structuredefinition-json-type">
<valueString value="boolean"/>
</extension>
<extension url="http://hl7.org/fhir/StructureDefinition/structuredefinition-xml-type">
<valueString value="xsd:boolean"/>
</extension>
<extension url="http://hl7.org/fhir/StructureDefinition/structuredefinition-rdf-type">
<valueString value="xsd:boolean"/>
</extension>
</code>
</type>
</element>
We even generated a comment into the StructureDefinition: compiler magic here! As you can see, the value of a FHIR primitive (`boolean` in this case) does not have a type anymore: the `code` element is empty, except for a few extensions. These give the on-the-wire format of the primitive value for each of the known serializations.
We then failed to apply this solution consistently for `value`'s friends `Extension.url`, `Element.id` and `Narrative.div`, however:
<!-- R3 situation -->
<element id="Extension.url">
<path value="Extension.url"/>
<representation value="xmlAttr"/>
<type>
<code value="uri"/>
</type>
</element>
and
<!-- R3 situation -->
<element id="Element.id">
<path value="Element.id"/>
<representation value="xmlAttr"/>
<type>
<code value="string"/>
</type>
</element>
and
<!-- R3 situation -->
<element id="Narrative.div">
<path value="Narrative.div"/>
<type>
<code value="xhtml"/>
</type>
</element>
remained as they were in DSTU2. Note how `representation` is set to `xmlAttr` for all of these, except for `Narrative.div`. This makes sense, since we are dealing with XHTML for the latter, for which we have the `xhtml` representation. This is, however, not applied here at the element level, but instead at the `value` attribute for the `xhtml` type.
Then, for R4, the shared maintenance of the FHIRPath standard with the CQL people (like Bryn and Chris) forced us to unify our type system with that of CQL. It did have the benefit of providing us, finally, with the correct solution. We loosened the binding strength for `ElementDefinition.type.code` to `extensible` so we could use non-FHIR codes for the types, and hence introduce a (URL-based) scheme to name the "real primitives", which we by then called "system primitives":
<!-- R4/R5 situation -->
<element id="boolean.value">
<path value="boolean.value"/>
<representation value="xmlAttr"/>
<type>
<extension url="http://hl7.org/fhir/StructureDefinition/structuredefinition-fhir-type">
<valueUrl value="boolean"/>
</extension>
<code value="http://hl7.org/fhirpath/System.Boolean"/>
</type>
</element>
As you can see, the `code` element now has a value again, and actually specifies the "external" type for this primitive value element. We refer here to the set of types that had been defined and used already by CQL. You can take a look at Appendix B of the CQL specification to find their exact definition. These types are now used as the basis for FHIR, CQL, FHIRPath and the FHIR Mapping Language.
When using R3, you might need to map the `structuredefinition-xml-type` extension to these system types as follows:
R3 xml type | R4+ system type |
---|---|
xsd:boolean | System.Boolean |
xsd:int | System.Integer |
xsd:string | System.String |
xsd:decimal | System.Decimal |
xsd:anyURI | System.String |
xsd:base64Binary | System.String |
xsd:dateTime | System.DateTime |
xsd:gYear OR xsd:gYearMonth OR xsd:date | System.DateTime |
xsd:gYear OR xsd:gYearMonth OR xsd:date OR xsd:dateTime | System.DateTime |
xsd:time | System.Time |
xsd:token | System.String |
xsd:nonNegativeInteger | System.Integer |
xsd:positiveInteger | System.Integer |
xhtml:div | System.String |
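In code, the table above boils down to a lookup. The sketch below is one way to write it, with two assumptions of mine: the combined "OR" rows are decomposed into their individual XML types, and the result is returned as a full `http://hl7.org/fhirpath/System.*` url as used in R4. `system_type_for` is a hypothetical helper name:

```python
import re

SYSTEM_PREFIX = "http://hl7.org/fhirpath/System."

# One entry per XML type from the table; "OR" rows are split into their parts.
XML_TO_SYSTEM = {
    "xsd:boolean": "Boolean",
    "xsd:int": "Integer",
    "xsd:string": "String",
    "xsd:decimal": "Decimal",
    "xsd:anyURI": "String",
    "xsd:base64Binary": "String",
    "xsd:dateTime": "DateTime",
    "xsd:gYear": "DateTime",
    "xsd:gYearMonth": "DateTime",
    "xsd:date": "DateTime",
    "xsd:time": "Time",
    "xsd:token": "String",
    "xsd:nonNegativeInteger": "Integer",
    "xsd:positiveInteger": "Integer",
    "xhtml:div": "String",
}


def system_type_for(xml_type: str) -> str:
    """Map the value of the R3 xml-type extension to an R4+ system type url."""
    # Split "a OR b OR c" and drop stray spaces inside each alternative.
    alternatives = [p.replace(" ", "") for p in re.split(r"\s+OR\s+", xml_type.strip())]
    names = {XML_TO_SYSTEM[alt] for alt in alternatives}  # KeyError if unknown
    if len(names) != 1:
        raise ValueError(f"ambiguous system type for {xml_type!r}")
    return SYSTEM_PREFIX + names.pop()
```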
The same approach is used for `Element.id` and `Extension.url`. Note though that `Narrative.div` remains defined in terms of the primitive FHIR type `xhtml`, which in its turn does use this approach to define `xhtml.value`:
<element id="xhtml.value">
<path value="xhtml.value"/>
<representation value="xhtml"/>
<type>
<extension url="http://hl7.org/fhir/StructureDefinition/structuredefinition-fhir-type">
<valueUrl value="string"/>
</extension>
<code value="http://hl7.org/fhirpath/System.String"/>
</type>
</element>
This still feels inconsistent to me (is `Narrative.div` really a complex FHIR value of type `xhtml` or a true primitive, just like `Extension.url`?), and I personally think we should "pull up" the specification of `xhtml.value` into `Narrative.div` and get rid of `xhtml`. A consequence of treating `xhtml` as a datatype is that it inherits from `Element`, and thus has an `extension` element. In `xhtml`, the max occurrence for this field has been set to 0 in both the `snapshot` and `differential`. For those generating classes in OO languages from this spec this obviously presents problems, since there's no way to "remove" this `extension` element from the class hierarchy.
As stated before, primitives (in R5) are now all children of `PrimitiveType`. The hierarchy is a bit deeper though, as shown below (taken from the current spec):
This hierarchy has not changed much (yet) between R4 and R5, the major addition being the `integer64` type (which is in the process of being renamed to `long`).
This picture clearly shows that not all primitives are direct children of `PrimitiveType` (in R5) or `Element` (in R3 and R4): there is a set of "stringy" types derived from `string` (e.g. `markdown`), a few constrained URIs deriving from `uri` (like `uuid`), and additional kinds of `integer` called `positiveInt` and `unsignedInt`. Contrary to class inheritance in common programming languages, these subclasses are actually more specialized versions of their superclasses and introduce no additional members or functionality. Instead, they constrain the set of values that the type can represent.
Which leads me to a very common error in dealing with the data model: these specialized subclasses cannot be substituted for their (abstract) supertypes! For example, if an element is of type `string`, you cannot use values of type `markdown` for that element. Not that this is easy to do: in most places there is no type information in the XML/JSON serialization for FHIR. This may, however, happen with choice properties. For example, `Observation.value[x]` specifies that it may contain a `string`, but that does not mean you can supply a `code` or `markdown` here: `valueMarkdown` would be incorrect.
This is in contrast to what we saw earlier, where the inheritance hierarchy for resources does permit substitutability. Even so, derived resources and specialized datatypes both have a `StructureDefinition.derivation` of `specialization`, so you will just have to hard-code this knowledge into your software.
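What "hard-coding this knowledge" might look like is sketched below: substitution is allowed along the resource hierarchy, but datatypes must match exactly. The `RESOURCE_PARENT` table is a tiny illustrative slice, and `is_resource` and `can_substitute` are hypothetical helpers rather than part of any FHIR API:

```python
# Illustrative slice of the resource hierarchy (child -> parent).
RESOURCE_PARENT = {
    "DomainResource": "Resource",
    "Patient": "DomainResource",
    "Bundle": "Resource",
}


def is_resource(type_name: str) -> bool:
    return type_name == "Resource" or type_name in RESOURCE_PARENT


def can_substitute(declared: str, actual: str) -> bool:
    """May a value of type `actual` appear where `declared` is expected?"""
    if declared == actual:
        return True
    # Datatypes (e.g. string vs. markdown) are never substitutable.
    if not (is_resource(declared) and is_resource(actual)):
        return False
    # Resources: walk up from the actual type looking for the declared type.
    current = RESOURCE_PARENT.get(actual)
    while current is not None:
        if current == declared:
            return True
        current = RESOURCE_PARENT.get(current)
    return False
```

So a `Patient` is accepted where `Resource` is declared, while `markdown` is rejected where `string` is declared, even though both relationships are marked as `specialization`.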
Note that the current R5 publication shows `xhtml` as a direct child of `DataType`. This is a mistake: `xhtml` is a primitive (its `StructureDefinition.kind` is `primitive-type`) and its base is in fact `PrimitiveType`.
That is not all there is to say about the hierarchy. The "subclasses" under `Quantity` merit some attention too. Some of these are subclasses (like `Money` and `Distance`), others are actually constraints (e.g. `MoneyQuantity`). The difference between the two is quite substantial: the "constraint" types under `Quantity` are not types at all. They cannot appear as names in e.g. `Observation.valueMoneyQuantity`, and when they are referenced, they are referenced like a profile (which they really are). Take a look at how `SimpleQuantity` (a constraint in R4) is referenced by `CarePlan` in its StructureDefinition:
{
"path" : "CarePlan.activity.detail.dailyAmount",
"type" : [{
"code" : "Quantity",
"profile" : ["http://hl7.org/fhir/StructureDefinition/SimpleQuantity"]
}],
}
This clearly shows that the type of this element is `Quantity`, but it is further constrained to the core profile `SimpleQuantity`.
`SimpleQuantity` has been a constraint since R3; in R4 (and R5), `MoneyQuantity` was added as a constraint. To be more precise, the existing `Money` type (a subclass of `Quantity`) was turned into a constraint (`MoneyQuantity`) and a new `Money` type was introduced. The latter represents an amount of money and has a `currency` element, whereas the `MoneyQuantity` profile is based on `Quantity` and just restricts `Quantity.code` to be a currency code.
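The practical consequence for tooling is that element names are always built from the type `code`, never from the profile. A short sketch, assuming the R4 JSON shape of a type entry (where `profile` is a list); `choice_element_name` and `profiles_of` are hypothetical helpers:

```python
def choice_element_name(path: str, type_entry: dict) -> str:
    """Return the serialized element name for one allowed type."""
    name = path.rsplit(".", 1)[-1]
    if not name.endswith("[x]"):
        return name  # not a choice element: the name is used as-is
    code = type_entry["code"]
    # Choice elements: replace [x] with the capitalized type code.
    return name[:-3] + code[:1].upper() + code[1:]


def profiles_of(type_entry: dict) -> list:
    """Constraints such as SimpleQuantity show up here, not in the name."""
    return list(type_entry.get("profile", []))
```

For a `value[x]` element typed as `Quantity` with the `SimpleQuantity` profile, this gives `valueQuantity`; the profile only tells a validator which extra rules to apply.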
For years, the resources and datatypes lived in separate inheritance hierarchies. Some of the reference implementations already fixed that gap (.NET has a class called `Base` and Java has `IBase`), and R5 will formally introduce the abstract resource/datatype `Base`. It has no elements, and two subclasses, `Resource` and `Element`. This means `Resource` points to it as its `baseDefinition`:
"kind" : "resource",
"abstract" : true,
"type" : "Resource",
"baseDefinition" :
"http://hl7.org/fhir/StructureDefinition/Base",
"derivation" : "specialization",
and so does `Element`:
"kind" : "complex-type",
"abstract" : true,
"type" : "Element",
"baseDefinition" :
"http://hl7.org/fhir/StructureDefinition/Base",
"derivation" : "specialization",
Before, these two types had neither a `baseDefinition` nor a `derivation`.
Which makes you wonder: what `kind` is `Base`? In its StructureDefinition, we find:
"kind" : "complex-type",
"abstract" : true,
"type" : "Base",
So, formally, `Base` is a complex type, and `Resource`s are special kinds of complex types. Notice that there is no `derivation`, and no `baseDefinition`.
Phew. As you have seen, we have tried our best to keep you busy when you are working with `StructureDefinition` in all its glory. And we have introduced enough changes over the past years to make it even more enjoyable when working with multiple versions. I've tried to be complete, but it's unlikely I succeeded. So, if you find any omissions, let me know!