Introduction
In the real-world healthcare IT ecosystem, HL7 v2 is the most widely used standard for data interoperability. Newer standards such as HL7 v3 and FHIR are still too young to completely replace the old, time-proven HL7 v2.
As a result, systems that adopt FHIR still need to map data from HL7 v2 to FHIR.
So how can we describe a mapping from HL7 v2 to FHIR? There are two options: imperative and declarative. Well-known products like MIRTH or Iguana take the imperative approach, with Lua or JavaScript scripting engines. Such mappings are easy to write.
In this article, however, we want to present a language for describing mappings in a declarative way. Why do we think the declarative approach is better? Mainly because the structure of a declarative mapping mirrors the structure of the resulting data. FHIR resources are deeply nested hierarchical data structures, and if a mapping follows the structure of the resulting resource, it is much easier to read and write.
To demonstrate this, we will use an analogy with HTML templates. Mapping from HL7 to any hierarchical data structure (like a FHIR resource) has a lot in common with HTML templating: you have some incoming data (an HL7 message in our case, or a view model in the HTML world) and you want to produce a complex structured output (a FHIR resource or an HTML DOM) based on that data.
Consider the following HTML template, written in the popular HAML language:
%section.container
  %h1= post.title
  %h2= post.subtitle
  %div
    = post.content
Now compare how the same piece of HTML is constructed imperatively with JavaScript:
function MyTemplate(post) {
  // Create the elements first...
  var section = document.createElement("section");
  var h1 = document.createElement("h1");
  var h2 = document.createElement("h2");
  var content = document.createElement("div");

  // ...then fill them with data (the class matches .container in the HAML template)...
  section.className = "container";
  h1.innerHTML = post.title;
  h2.innerHTML = post.subtitle;
  content.innerHTML = post.content;

  // ...and finally assemble the tree.
  section.appendChild(h1);
  section.appendChild(h2);
  section.appendChild(content);
  return section;
}
The main difference here (besides overall code length) is that the HAML template is structured, while the imperative code is linear. When you read the HAML template, you clearly see that the result will be a root section element containing h1, h2 and div elements.
The following is a piece of a mapping for the Patient resource, written in a language called JUTE. JUTE is a small language our team developed to perform HL7 to FHIR mapping.
resourceType: Patient
multipleBirthInteger: $ PID.24
deceasedBoolean: $ PID.30
birthDate: $ PID.7 | dateTime
gender: $ PID.8 | translateCode("gender")
name:
  $foreach: PID.5 as name
  $value:
    period:
      start: $ name.12 | dateTime
      end: $ name.13 | dateTime
    given: $ name.2 | capitalize
    middle: $ name.3 | capitalize
    family: $ name.1 | capitalize
    suffix: $ name.4 | capitalize
    prefix: $ name.5 | capitalize
    text: '{{name.5}} {{name.2}} {{name.3}} {{name.1}} {{name.4}} {{name.6}}'
In the following sections we describe JUTE in more detail.
A few words about YAML
JUTE templates are JSON documents. JSON, however, is a data serialization format and is not convenient for humans to write, so in this article we use YAML to represent data structures. YAML is easy to read and write thanks to its clean syntax and indentation-based nesting. Technically there is no big difference between the two: JSON and YAML can represent the same data structures.
Structure of parsed HL7 message
Parsing HL7 messages is out of the scope of this article, but we need to describe the data structure of a parsed message, because it is the input for a JUTE template.
At the top level, a parsed message is an object (or map) whose keys are segment names and whose values are segments. If there is more than one segment of a specific kind, the associated value is an array of segments.
A segment is an object with numeric keys. Each segment field is either a simple type (string or number) or an array (a complex type like XPN or CX). If a field is repeatable, it is represented as an array of simple types or an array of arrays.
PID:
  '0': PID
  '1': 1
  '2': I
  '3':
    - MRN12345
    -
    -
    -
    -
    - Good Health Hospital
  <...>
  '13':
    - - (555)555-5555
      - PRN
    - - (777)777-7777
      - PRN
Segments can be nested into each other. For example, an OUL message can contain many OBR segments, each OBR can contain many NTE and OBX segments, and each OBX can contain many NTE segments. That is why we use objects, not arrays, to represent segments: alongside the numeric keys for fields there are keys for nested segments:
PID:
  '0': PID
  '1': 1
  <...>
  'PV1':
    '0': PV1
OBR:
  - '0': OBR
    '1': 1
    '2': '00045'
    <...>
    NTE:
      <...>
    OBX:
      - '0': OBX
        '1': 2
        <...>
      - '0': OBX
        '1': 2
Now that we understand what the structure of a parsed HL7 message looks like, we can proceed to describing JUTE templates.
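To make this structure concrete, here is a minimal Python sketch of the same parsed message as nested dictionaries and lists, together with a dotted-path lookup. It is an illustration only, not JUTE's parser or API: the helper name get_path is hypothetical, and the 1-based numbering of array positions (so that PID.3.1 is the first component) is our assumption.

```python
# Parsed message as nested dicts/lists, mirroring the YAML above.
# Field keys are strings ('0', '1', ...); nested segments use their names.
message = {
    "PID": {
        "0": "PID",
        "1": 1,
        "3": ["MRN12345", None, None, None, None, "Good Health Hospital"],
        "13": [["(555)555-5555", "PRN"], ["(777)777-7777", "PRN"]],
        "PV1": {"0": "PV1"},
    },
}


def get_path(node, path):
    """Resolve a dotted path such as 'PID.3.1' (hypothetical helper).

    Assumption: map keys are looked up as strings, and positions inside
    arrays are 1-based, as HL7 component numbers are. Missing data gives None.
    """
    for part in path.split("."):
        if isinstance(node, dict):
            node = node.get(part)
        elif isinstance(node, list) and part.isdigit():
            index = int(part) - 1
            node = node[index] if 0 <= index < len(node) else None
        else:
            return None
    return node


print(get_path(message, "PID.3.1"))    # MRN12345
print(get_path(message, "PID.PV1.0"))  # PV1
```

Returning None for missing data, rather than raising an error, keeps lookups on sparse messages simple; a real implementation may choose differently.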
Simple JUTE template
The simplest JUTE template can look like this:
"Hi from JUTE"
This template evaluates into itself, so the result is the same string:
"Hi from JUTE"
The same is true for any data type, including maps and arrays:
Key1: true
Key2: false
Key3: [1, 2.0, "hello, world"]
Such a template also evaluates into itself.
JUTE expressions
We inject data from the HL7 message into the resulting data structure with JUTE expressions. A JUTE expression is a string starting with the '$' sign:
RegularString: just a regular string
PatientFirstName: $ PID.5.2 | capitalize
PatientLastName: $ PID.5.1 | capitalize
This template evaluates to the following (depending on the data in the HL7 message):
RegularString: just a regular string
PatientFirstName: John
PatientLastName: Smith
JUTE expressions have a lot in common with JavaScript expressions, or with expressions in other high-level dynamic languages. The main difference is that instead of variable names you use paths, like "PID.5.2". There is also a special syntax for applying one or more filters to the result of an evaluation (you can think of filters as an exotic form of function call). Consider several examples of JUTE expressions:
- $ PID.5: evaluates to an array containing all subfields of the patient's name
- $ PID.5 | compact | join("|"): evaluates to a string containing all non-empty subfields of the patient's name, separated by the pipe character
- $ PID.13 || PID.14 | formatPhone: takes the patient's phone from either PID.13 or PID.14 and formats it with the formatPhone filter
- $ PID.5.1 + " " + PID.5.2: evaluates to a string containing the patient's last and first names, separated by a space
- $ PID.7 | dateTime: evaluates to a string containing the patient's birth date, formatted according to ISO 8601
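One way to think about a filter chain is as plain function composition: the value on the left is passed through each filter from left to right. The Python sketch below illustrates that idea only. The filter bodies are our guesses at the behaviour suggested by the names compact, join and capitalize, and apply_filters is a hypothetical helper, not part of JUTE.

```python
from functools import reduce


def compact(values):
    """Drop empty entries (None or empty string) from a list."""
    return [v for v in values if v not in (None, "")]


def join(separator):
    """Build a filter that joins a list of values into one string."""
    return lambda values: separator.join(str(v) for v in values)


def capitalize(value):
    """Upper-case the first letter and lower-case the rest."""
    return value.capitalize()


def apply_filters(value, filters):
    """Feed value through each filter in turn, left to right."""
    return reduce(lambda acc, f: f(acc), filters, value)


# Rough equivalent of: $ PID.5 | compact | join("|")
name = ["SMITH", "JOHN", None, "", "MR"]
print(apply_filters(name, [compact, join("|")]))  # SMITH|JOHN|MR
```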
String Interpolations
There is a special syntax for string interpolation:
Notification: Patient {{ PID.5.2 | capitalize }} {{ PID.5.1 | capitalize }}
  just arrived to {{ PID.PV1.3 | formatLocation }}.
This template evaluates to the following (depending on the data in the HL7 message):
Notification: Patient John Smith
  just arrived to ROOM 231, FLOOR 2, BED 4.
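Interpolation can be sketched as a regular-expression substitution in which every {{ ... }} placeholder is replaced by the value of the expression inside it. In the Python sketch below, the function interpolate and its evaluate callback are hypothetical, and rendering a missing value as an empty string is our assumption. A lookup table stands in for a real expression engine.

```python
import re

# Non-greedy match of everything between '{{' and '}}'.
INTERPOLATION = re.compile(r"\{\{(.*?)\}\}")


def interpolate(template, evaluate):
    """Replace every {{ expr }} with str(evaluate(expr)).

    `evaluate` is any callable mapping expression text to a value;
    None results are rendered as an empty string (an assumption).
    """
    def substitute(match):
        value = evaluate(match.group(1).strip())
        return "" if value is None else str(value)

    return INTERPOLATION.sub(substitute, template)


# Toy evaluator: a lookup table standing in for a real expression engine.
values = {"PID.5.2 | capitalize": "John", "PID.5.1 | capitalize": "Smith"}
text = "Patient {{ PID.5.2 | capitalize }} {{ PID.5.1 | capitalize }}"
print(interpolate(text, values.get))  # Patient John Smith
```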
Directives
To perform branching, iteration, function calls and other dynamic evaluation, templates contain special nodes called directives. A directive is a map in which one or more keys start with the '$' sign.
$if
Performs conditional evaluation:
PatientName:
  $if: PID.5.1 && PID.5.2
  $then:
    FirstName: $ PID.5.2
    LastName: $ PID.5.1
  $else:
    FirstName: Unnamed
    LastName: Patient
If the condition is true, the node evaluates to the value of the $then attribute; otherwise it evaluates to the value of $else. If the condition is false and the $else attribute is omitted, the node evaluates to null.
NB: there is a shorter form of the $if directive:
PatientName:
  $if: PID.5.1 && PID.5.2
  FirstName: $ PID.5.2
  LastName: $ PID.5.1
In the short form, the node evaluates into itself (without the '$if' attribute) when the condition is true, and to null otherwise.
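The rules for both forms of $if can be summarised in a few lines of Python. This is a sketch under stated assumptions, not the reference implementation: eval_if is a hypothetical name, and the evaluate and truthy callbacks stand in for template evaluation and condition evaluation, which are outside the scope of the example.

```python
def eval_if(node, evaluate, truthy):
    """Evaluate a map containing an '$if' key.

    evaluate(template) evaluates a nested template; truthy(expr) evaluates
    the condition. Both are supplied by the caller (hypothetical hooks).
    """
    condition = truthy(node["$if"])
    if "$then" in node or "$else" in node:
        # Long form: pick a branch; a missing branch yields None (null).
        branch = "$then" if condition else "$else"
        return evaluate(node[branch]) if branch in node else None
    # Short form: the node itself, minus the '$if' key, is the result.
    if not condition:
        return None
    return evaluate({k: v for k, v in node.items() if k != "$if"})


full = {"$if": "PID.5.1", "$then": {"Last": "Smith"}, "$else": {"Last": "Patient"}}
short = {"$if": "PID.5.1", "Last": "Smith"}
identity = lambda template: template

print(eval_if(full, identity, lambda expr: False))  # {'Last': 'Patient'}
print(eval_if(short, identity, lambda expr: True))  # {'Last': 'Smith'}
```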
$foreach
The $foreach node is used to iterate over one or more arrays. If the value to iterate over is not an array, exactly one iteration is performed. The node evaluates to an array containing the result of each iteration.
PatientIdentifiers:
  $foreach: PID.2, PID.3, PID.4, PID.18 as id
  $body:
    $if: id.1
    Value: $ id.1
There is a short form of the $foreach directive:
$foreach: DG1 as diag
CodeSystem: $ diag.2
Code: $ diag.3
Display: $ diag.4
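The iteration rules described above (several sources, a single non-array value treated as one item, and the long and short forms) can be sketched in Python as follows. The names eval_foreach and toy_evaluate are hypothetical, and the resolve and evaluate callbacks are stand-ins for path lookup and template evaluation. The toy evaluator understands only values of the form '$ variable.field'.

```python
def eval_foreach(node, resolve, evaluate):
    """Evaluate a map containing a '$foreach' key into a list.

    resolve(path) looks a path up in the message; evaluate(template, scope)
    evaluates the body with extra variables. Both are hypothetical hooks.
    """
    sources, variable = node["$foreach"].rsplit(" as ", 1)
    # Long form keeps the body under '$body'; the short form is the node
    # itself without the '$foreach' key.
    if "$body" in node:
        body = node["$body"]
    else:
        body = {k: v for k, v in node.items() if k != "$foreach"}

    results = []
    for path in sources.split(","):
        items = resolve(path.strip())
        if not isinstance(items, list):
            items = [items]  # a non-array value yields exactly one iteration
        for item in items:
            results.append(evaluate(body, {variable.strip(): item}))
    return results


def toy_evaluate(body, scope):
    """Toy body evaluator: handles only values like '$ variable.field'."""
    result = {}
    for key, expr in body.items():
        variable, field = expr[2:].split(".")
        result[key] = scope[variable][field]
    return result


message = {"DG1": [{"3": "I10"}, {"3": "E11"}]}
template = {"$foreach": "DG1 as diag", "Code": "$ diag.3"}
print(eval_foreach(template, message.get, toy_evaluate))
# [{'Code': 'I10'}, {'Code': 'E11'}]
```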
$filter
Applies one or more filtering functions to its '$body':
Patient:
  Location:
    $filter:
      - compact
      - join(".")
    $body:
      - $ PV1.3.7 | trim
      - $ PV1.3.8 | trim
      - $ PV1.3.2 | trim
      - $ PV1.3.3 | trim
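Because filters in a $filter directive are written as strings such as compact or join("."), an evaluator has to turn each string into a function name and arguments before applying it. The Python sketch below shows one possible way to do that. Everything in it is an assumption made for illustration: the FILTERS table, the parse_filter and eval_filter helpers, and the choice to read arguments as Python literals.

```python
import ast
import re

# Hypothetical filter table; the bodies are guesses based on the filter names.
FILTERS = {
    "compact": lambda values: [v for v in values if v not in (None, "")],
    "join": lambda values, sep: sep.join(str(v) for v in values),
    "trim": lambda value: value.strip(),
}

# A filter name, optionally followed by arguments in parentheses.
FILTER_SPEC = re.compile(r"^(\w+)(?:\((.*)\))?$")


def parse_filter(spec):
    """Turn 'join(".")' into ('join', ['.']) and 'compact' into ('compact', [])."""
    match = FILTER_SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"bad filter spec: {spec!r}")
    name, raw_args = match.groups()
    # Arguments are read as Python literals, which is a simplification.
    args = list(ast.literal_eval(f"({raw_args},)")) if raw_args else []
    return name, args


def eval_filter(node, evaluate):
    """Evaluate '$body', then pass the result through each '$filter' entry."""
    value = evaluate(node["$body"])
    for spec in node["$filter"]:
        name, args = parse_filter(spec)
        value = FILTERS[name](value, *args)
    return value


node = {"$filter": ["compact", 'join(".")'], "$body": ["ER", "", None, "231"]}
print(eval_filter(node, lambda body: body))  # ER.231
```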
$let
Declares local variables and makes them available in its $body. Evaluates to the value of $body.
$let:
  ID: PID.3 | md5
  NAME: PID.5 | formatName
  LOCATION:
    $filter:
      - compact
      - join(".")
    $body:
      - $ PV1.3.7 | trim
      - $ PV1.3.8 | trim
      - $ PV1.3.2 | trim
      - $ PV1.3.3 | trim
$body:
  Notification:
    type: patient_arrived
    text: Patient {{ NAME }} has arrived to {{ LOCATION }}.
    payload:
      pid: $ ID
      name: $ NAME
      location: $ LOCATION
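The scoping behaviour of $let can be sketched in Python as evaluating each variable, storing it in a copy of the current scope, and then evaluating $body with that extended scope. The names eval_let and toy_evaluate are hypothetical. Letting later variables see earlier ones is our assumption, since the text above does not say whether this is allowed.

```python
def eval_let(node, evaluate, scope=None):
    """Evaluate a map with '$let' and '$body' keys.

    evaluate(template, scope) is a hypothetical hook. Variables are
    evaluated in order, so later ones may refer to earlier ones
    (an assumption; the article does not specify this).
    """
    local = dict(scope or {})  # copy, so the outer scope is left untouched
    for name, template in node["$let"].items():
        local[name] = evaluate(template, local)
    return evaluate(node["$body"], local)


def toy_evaluate(template, scope):
    """Toy evaluator: '$ NAME' reads a variable, everything else is literal."""
    if isinstance(template, dict):
        return {k: toy_evaluate(v, scope) for k, v in template.items()}
    if isinstance(template, str) and template.startswith("$ "):
        return scope[template[2:]]
    return template


node = {"$let": {"ID": "12345"}, "$body": {"pid": "$ ID", "type": "patient_arrived"}}
print(eval_let(node, toy_evaluate))
# {'pid': '12345', 'type': 'patient_arrived'}
```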
Evaluating JUTE Templates
Evaluating JUTE templates is a rather simple task. In general, we need to perform a breadth-first traversal of the template and call an evaluation function on each node.
The evaluation function dispatches on the type of the current node. If the node is a string starting with '$', we evaluate it as an expression. If the node is a map (object) containing keys that start with the '$' sign, we determine which directive it is and call that directive's evaluation function. Otherwise, the node evaluates into itself.
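The dispatch described above fits in a short recursive function. The Python sketch below is an illustration under stated assumptions, not the reference implementation: the expression evaluator and the table of directive handlers are passed in by the caller, the $upper directive is a made-up toy, and string interpolation is left out. Note also that plain recursion visits nodes depth-first, which is a simplification chosen for this sketch.

```python
def evaluate(node, scope, eval_expression, directives):
    """Evaluate a template node against a scope.

    eval_expression(text, scope) and the directives table are supplied by
    the caller; both are hypothetical stand-ins for a real implementation.
    """
    # A string starting with '$' is an expression.
    if isinstance(node, str) and node.startswith("$"):
        return eval_expression(node[1:].strip(), scope)
    if isinstance(node, dict):
        # A map with a known '$' key is handed to that directive's handler.
        for key in node:
            if key in directives:
                return directives[key](node, scope)
        # An ordinary map: evaluate each value, keep the keys.
        return {k: evaluate(v, scope, eval_expression, directives)
                for k, v in node.items()}
    if isinstance(node, list):
        return [evaluate(v, scope, eval_expression, directives) for v in node]
    return node  # numbers, booleans, None and plain strings


# Toy setup: expressions are looked up in a table, '$upper' is a made-up directive.
scope = {"PID.5.2": "John"}
directives = {"$upper": lambda node, scope: node["$upper"].upper()}
template = {"First": "$ PID.5.2", "Tags": ["a", {"$upper": "b"}], "N": 1}
result = evaluate(template, scope, lambda text, s: s.get(text), directives)
print(result)  # {'First': 'John', 'Tags': ['a', 'B'], 'N': 1}
```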
Other Benefits of Declarativeness
It is worth mentioning that, because JUTE templates are data structures, they are language-independent, so a JUTE interpreter can be written in any programming language. It is also possible to implement a code generator that translates a JUTE template into imperative code in some language, which would allow better performance.
Future Development
JUTE was initially developed as a declarative DSL for mapping HL7 to FHIR resources, but it is now clear that its field of application is much wider. In general, JUTE can transform any data structure into any other data structure and can be applied, for example, to translate FHIR resources into HL7 messages. The Health Samurai team will announce new versions and new features of the JUTE language, as well as new versions of the reference implementation.
Conclusion
In this article we described a tiny language for describing transformations from HL7 to FHIR, and showed that declarative, data-based mappings are easier to read and write than mappings written in imperative languages.