mlapshin/jute-article.md

## jute-article.md

      
Introduction

In the real-world healthcare IT ecosystem, HL7 v2 is the most widely used
standard for data interoperability. Newer standards like HL7 v3 and FHIR
are still too young to completely replace the old and time-proven HL7 v2.
As a result, we often need to map HL7 v2 messages to FHIR resources.
So how can we describe a mapping from HL7 v2 to FHIR? There are two
options: imperative and declarative. Well-known products like MIRTH or
Iguana take the imperative approach with Lua or JavaScript scripting
engines. Such mappings are easy to write.
But in this article we want to present a language for describing
mappings in a declarative way. Why do we think the declarative approach is
better? Mainly because the structure of a declarative mapping mirrors the
structure of the resulting data. FHIR resources are deeply nested
hierarchical data structures, and if a mapping follows the structure of
the resulting resource, it is much easier to read and write.
To demonstrate this, we will use an analogy with HTML
templates. Mapping from HL7 to any hierarchical data
structure (like a FHIR resource) has a lot in common with HTML
templating. You have some incoming data (an HL7 message in our case, or a
View Model in the HTML world) and you want to get some complex structured
output (a FHIR resource or an HTML DOM) based on this incoming data.
Consider the following HTML template written in the popular HAML language:
%section.container
  %h1= post.title
  %h2= post.subtitle
  %div
    = post.content

Now compare how the same piece of HTML is constructed with JavaScript
in an imperative manner:
function MyTemplate(post) {
    var section = document.createElement("section");
    var h1 = document.createElement("h1");
    var h2 = document.createElement("h2");
    var content = document.createElement("div");

    // Match the ".container" class from the HAML template.
    section.className = "container";

    h1.innerHTML = post.title;
    h2.innerHTML = post.subtitle;
    content.innerHTML = post.content;

    section.appendChild(h1);
    section.appendChild(h2);
    section.appendChild(content);

    return section;
}

The main difference here (besides overall code length) is that the HAML
template is structured and the imperative code is linear. When you read
the HAML template, you clearly see that the result will be a root
`<section>` element with three child elements. It's much harder to
reach the same understanding while reading the imperative code. That's why
a lot of people prefer to write HTML templates in HAML, Slim or JADE,
and hardly anybody constructs the HTML DOM in an imperative manner. So
why can't we adopt the same approach for mapping HL7 to FHIR?
Below is a piece of a mapping for the Patient resource written in a language
called JUTE. This is a small language our team developed to
perform HL7 to FHIR mapping.
resourceType: Patient
multipleBirthInteger: $ PID.24
deceasedBoolean: $ PID.30
birthDate: $ PID.7 | dateTime
gender: $ PID.8 | translateCode("gender")

name:
  $foreach: PID.5 as name
  $body:
    period:
      start: $ name.12 | dateTime
      end: $ name.13 | dateTime

    given: $ name.2 | capitalize
    middle: $ name.3 | capitalize
    family: $ name.1 | capitalize
    suffix: $ name.4 | capitalize
    prefix: $ name.5 | capitalize

    text: '{{name.5}} {{name.2}} {{name.3}} {{name.1}} {{name.4}} {{name.6}}'

In the next sections we will describe JUTE in more detail.
A few words about YAML

JUTE templates are JSON documents. But JSON itself is not
well suited to being written by humans (it's more of a data serialization
format), so in this article we will use a language called YAML to
represent data structures. YAML is easy to read and write thanks to
its clean syntax and indentation-based nesting. Keep in mind that
technically there is no big difference between JSON and YAML: both
can represent the same data structures.
Structure of parsed HL7 message

Parsing HL7 messages is out of scope for this article, but we
at least need to describe the data structure of a parsed message, because
we're going to use it as the incoming data for a JUTE template.
At the top level, a parsed message is an object (or map) where keys are
segment names and values are segments. If there is more than one
segment of a specific kind, the associated value will be an array of
segments.
A segment itself is an object with numeric keys. Each segment field is
either a simple type (string/number) or an array (a complex type like
XPN or CX). If a field is repeatable, it will be represented as an array of
simple types or an array of arrays.
PID:
  '0': PID
  '1': 1
  '2': I
  '3':
    - MRN12345
    -
    -
    -
    -
    - Good Health Hospital
  <...>
  '13':
    - - (555)555-5555
      - PRN

    - - (777)777-7777
      - PRN
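
The top-level grouping rule described above can be sketched in Python. This is only an illustration, not the parser we use: the `group_segments` helper is hypothetical, it assumes segments have already been parsed into objects whose field '0' holds the segment name, and it ignores segment nesting.

```python
def group_segments(segments):
    """Group parsed segments by name.

    `segments` is a list of dicts keyed by field number (as strings), where
    field '0' holds the segment name, as in the PID example above.
    A single occurrence stays a plain object; repeats become a list.
    """
    message = {}
    for segment in segments:
        name = segment["0"]
        if name not in message:
            message[name] = segment
        elif isinstance(message[name], list):
            message[name].append(segment)
        else:
            # Second occurrence: promote the single object to a list.
            message[name] = [message[name], segment]
    return message


parsed = group_segments([
    {"0": "PID", "1": 1},
    {"0": "DG1", "1": 1},
    {"0": "DG1", "1": 2},
])
```

Here `parsed["PID"]` is a single object, while `parsed["DG1"]` is a list of two segments.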

Segments can be nested into each other. For example, OUL messages can
contain many OBR segments, each OBR contains many NTE and OBX segments, and
each OBX contains many NTE segments. That's why we use objects, not arrays,
to represent segments: alongside the numeric keys for fields there
will be keys for nested segments:
PID:
  '0': PID
  '1': 1
  <...>
  'PV1':
    '0': PV1

OBR:
  - '0': OBR
    '1': 1
    '2': '00045'
    <...>
    NTE:
      <...>
    OBX:
      - '0': OBX
        '1': 2
        <...>
      - '0': OBX
        '1': 2

Now that we understand what the structure of a parsed HL7 message looks
like, we can proceed to describing JUTE templates.
Simple JUTE template

The simplest JUTE template can look like this:
"Hi from JUTE"

This template evaluates into itself, so the result is the
same string:
"Hi from JUTE"

The same is true for any data type, including maps and arrays:
Key1: true
Key2: false
Key3: [1, 2.0, "hello, world"]

Such a template also evaluates into itself.
JUTE expressions

We "inject" data from an HL7 message into the resulting data structure with
JUTE expressions. JUTE expressions are strings starting with the '$' sign:
RegularString: just a regular string
PatientFirstName: $ PID.5.2 | capitalize
PatientLastName: $ PID.5.1 | capitalize

This template evaluates into the following (depending on the data in the HL7
message):
RegularString: just a regular string
PatientFirstName: John
PatientLastName: Smith

JUTE expressions have a lot in common with JavaScript expressions or
expressions in other high-level dynamic languages. The main difference
is that instead of variable names you use paths, like
"PID.5.2". There is also a special syntax for applying one or more
filters to the evaluation result (you can treat filters as an exotic form of
function calls). Consider several examples of JUTE expressions:


$ PID.5 - evaluates into an array containing all subfields of
the patient's name


$ PID.5 | compact | join("|") - evaluates into string containing
all non-empty subfields of patient's name separated by pipe
character


$ PID.13 || PID.14 | formatPhone - takes the patient's phone from either
PID.13 or PID.14 and formats it with the formatPhone filter


$ PID.5.1 + " " + PID.5.2 - evaluates into string containing
patient's last and first names separated by space


$ PID.7 | dateTime - evaluates into string containing patient's
birth date formatted according to ISO 8601
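
To make the path-and-filter idea more concrete, here is a minimal Python sketch. It is not the JUTE implementation: `resolve_path`, `evaluate_expression` and the `FILTERS` table are hypothetical names, only the `$ path | filter | filter("arg")` form is handled (no operators such as `||` or `+`), and array positions are assumed to be 1-based like HL7 component numbers.

```python
import ast
import re

# Hypothetical filter registry; the real JUTE filter set is not listed here.
FILTERS = {
    "compact": lambda value: [v for v in value if v not in (None, "")],
    "join": lambda value, sep: sep.join(str(v) for v in value),
    "capitalize": lambda value: value.capitalize(),
}

# Split on pipes that are not inside double quotes, so join("|") survives.
PIPE = re.compile(r'\|(?=(?:[^"]*"[^"]*")*[^"]*$)')


def resolve_path(message, path):
    """Walk a dotted path such as 'PID.5.2' through the parsed message.

    Assumption: positions inside arrays are 1-based.
    """
    current = message
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key)
        elif isinstance(current, list) and key.isdigit():
            index = int(key) - 1
            current = current[index] if 0 <= index < len(current) else None
        else:
            return None
    return current


def evaluate_expression(message, expression):
    """Evaluate '$ path | filter | filter("arg")' against a parsed message."""
    parts = [part.strip() for part in PIPE.split(expression.lstrip("$"))]
    path, filter_calls = parts[0], parts[1:]
    value = resolve_path(message, path)
    for call in filter_calls:
        match = re.fullmatch(r"(\w+)(?:\((.*)\))?", call)
        name, raw_args = match.group(1), match.group(2)
        # Filter arguments are parsed as Python literals in this sketch.
        args = ast.literal_eval(f"({raw_args},)") if raw_args else ()
        value = FILTERS[name](value, *args)
    return value


message = {"PID": {"0": "PID", "5": ["smith", "john", "", "", ""]}}
joined = evaluate_expression(message, '$ PID.5 | compact | join("|")')
first_name = evaluate_expression(message, "$ PID.5.2 | capitalize")
```

With this sample message, `joined` is the string `smith|john` and `first_name` is `John`.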


String Interpolations

There is a special syntax for string interpolation:
Notification: Patient {{ PID.5.2 | capitalize }} {{ PID.5.1 | capitalize }}
  just arrived to {{ PID.PV1.3 | formatLocation }}.

This template evaluates to the following (depending on the data in the HL7 message):
Notification: Patient John Smith
  just arrived to ROOM 231, FLOOR 2, BED 4.
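
One possible way to implement interpolation is a regular-expression substitution, sketched below in Python. The `interpolate` function is a hypothetical helper, the expression evaluator is passed in as a callable, and rendering a missing value as an empty string is our assumption for the sketch.

```python
import re

# Matches '{{ expression }}' and captures the expression without padding.
INTERPOLATION = re.compile(r"\{\{\s*(.*?)\s*\}\}")


def interpolate(text, evaluate):
    """Replace every '{{ expression }}' in `text` with its evaluated value.

    `evaluate` is any callable that turns an expression string into a value.
    None is rendered as an empty string (an assumption of this sketch).
    """
    def substitute(match):
        value = evaluate(match.group(1))
        return "" if value is None else str(value)

    return INTERPOLATION.sub(substitute, text)


# Stand-in evaluator for the demo: a lookup table instead of a real
# expression engine.
values = {"PID.5.2 | capitalize": "John", "PID.5.1 | capitalize": "Smith"}
result = interpolate(
    "Patient {{ PID.5.2 | capitalize }} {{ PID.5.1 | capitalize }} just arrived.",
    values.get,
)
```

In this demo `result` is `Patient John Smith just arrived.`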

Directives

To perform branching, iteration, function calls and other dynamic
evaluations, there are special nodes in a template called
directives. Directives are maps where one or more keys start
with the '$' sign.
$if

Performs conditional evaluation:
PatientName:
  $if: PID.5.1 && PID.5.2
  $then:
    FirstName: $ PID.5.2
    LastName: $ PID.5.1
  $else:
    FirstName: Unnamed
    LastName: Patient

If the condition is true, the node evaluates into the value of the $then
attribute, otherwise into the value of $else. If the condition is false and
the $else attribute is omitted, the node evaluates into null.
NB: there is a shorter form of the $if directive:
PatientName:
  $if: PID.5.1 && PID.5.2
  FirstName: $ PID.5.2
  LastName: $ PID.5.1

In the short form, the node evaluates into itself (without the '$if' attribute)
when the condition is true, and into null otherwise.
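
The two forms of $if can be sketched in Python as follows. This is a hedged illustration rather than the reference implementation: `eval_if` is a hypothetical name, and the condition test (`truthy`) and the child evaluator (`evaluate`) are supplied by the caller.

```python
def eval_if(node, truthy, evaluate):
    """Evaluate a '$if' directive node (a dict containing the '$if' key).

    `truthy(expression)` decides the condition and `evaluate(node)` evaluates
    a child node; both are placeholders supplied by the caller.
    """
    if "$then" in node or "$else" in node:
        # Full form: pick one branch; a missing branch yields None (null).
        branch = "$then" if truthy(node["$if"]) else "$else"
        return evaluate(node[branch]) if branch in node else None
    # Short form: the node itself, minus the '$if' key, is the 'then' branch.
    if truthy(node["$if"]):
        return evaluate({k: v for k, v in node.items() if k != "$if"})
    return None


def identity(node):
    # Demo-only child evaluator that returns nodes unchanged.
    return node


full = {"$if": "PID.5.1 && PID.5.2", "$then": "named", "$else": "unnamed"}
short = {"$if": "PID.5.1 && PID.5.2", "FirstName": "John"}

when_true = eval_if(short, lambda expression: True, identity)
when_false = eval_if(short, lambda expression: False, identity)
```

Here `when_true` is the map without the '$if' key and `when_false` is `None`.
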
$foreach

The $foreach node is used to iterate over one or more arrays. If the value to
iterate over is not an array, exactly one iteration is
performed. It evaluates into an array containing the results of each iteration.
PatientIdentifiers:
  $foreach: PID.2, PID.3, PID.4, PID.18 as id
  $body:
    $if: id.1
    Value: $ id.1

There is a short form of the $foreach directive:
$foreach: DG1 as diag
CodeSystem: $ diag.2
Code: $ diag.3
Display: $ diag.4
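
A rough Python sketch of $foreach is shown below. It rests on assumptions of ours: several sources are iterated one after another, the short form is detected by a missing '$body' key, and `eval_foreach`, `resolve` and `evaluate` are hypothetical demo helpers rather than parts of JUTE.

```python
def eval_foreach(node, scope, resolve, evaluate):
    """Evaluate a '$foreach' directive such as '$foreach: DG1 as diag'.

    Assumptions: several sources are iterated one after another, and a
    missing '$body' means the short form (the node minus '$foreach').
    """
    sources, variable = node["$foreach"].rsplit(" as ", 1)
    body = node.get("$body", {k: v for k, v in node.items() if k != "$foreach"})
    results = []
    for source in sources.split(","):
        value = resolve(source.strip(), scope)
        # A non-array value is treated as a single iteration.
        items = value if isinstance(value, list) else [value]
        for item in items:
            results.append(evaluate(body, {**scope, variable.strip(): item}))
    return results


def resolve(path, scope):
    # Demo-only resolver: plain dictionary lookups along a dotted path.
    value = scope
    for key in path.split("."):
        value = value.get(key) if isinstance(value, dict) else None
    return value


def evaluate(body, scope):
    # Demo-only evaluator: '$ path' strings are resolved, maps are walked,
    # everything else is kept as-is.
    if isinstance(body, dict):
        return {k: evaluate(v, scope) for k, v in body.items()}
    if isinstance(body, str) and body.startswith("$ "):
        return resolve(body[2:], scope)
    return body


scope = {"DG1": [{"3": "E11"}, {"3": "I10"}], "PID": {"2": "A1"}}
codes = eval_foreach({"$foreach": "DG1 as diag", "Code": "$ diag.3"},
                     scope, resolve, evaluate)
single = eval_foreach({"$foreach": "PID.2 as id", "$body": "$ id"},
                      scope, resolve, evaluate)
```

With this sample data, `codes` holds one map per DG1 segment and `single` holds exactly one element, because PID.2 is not an array.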

$filter

Applies one or more filtering functions to its '$body':
Patient:
  Location:
    $filter:
      - compact
      - join(".")

    $body:
      - $ PV1.3.7 | trim
      - $ PV1.3.8 | trim
      - $ PV1.3.2 | trim
      - $ PV1.3.3 | trim

$let

Declares local variables and makes them available in its
$body. Evaluates into $body.
$let:
  ID: $ PID.3 | md5
  NAME: $ PID.5 | formatName
  LOCATION:
    $filter:
      - compact
      - join(".")

    $body:
      - $ PV1.3.7 | trim
      - $ PV1.3.8 | trim
      - $ PV1.3.2 | trim
      - $ PV1.3.3 | trim

$body:
  Notification:
    type: patient_arrived
    text: Patient {{ NAME }} has arrived to {{ LOCATION }}.
    payload:
      pid: $ ID
      name: $ NAME
      location: $ LOCATION
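
The scoping behavior of $let can be sketched in Python like this. It is an illustration under assumptions: `eval_let` is a hypothetical name, bindings are assumed to be evaluated in order (so a later variable may refer to an earlier one), and the demo evaluator only understands '$ NAME' variable reads.

```python
def eval_let(node, scope, evaluate):
    """Evaluate a '$let' directive: bind variables, then evaluate '$body'.

    Assumption: bindings are evaluated in order, so a later variable may
    refer to an earlier one.
    """
    inner = dict(scope)  # copy, so the outer scope is left untouched
    for name, value_node in node["$let"].items():
        inner[name] = evaluate(value_node, inner)
    return evaluate(node["$body"], inner)


def evaluate(node, scope):
    # Demo-only evaluator: '$ NAME' reads a variable, maps are walked,
    # everything else is a literal.
    if isinstance(node, dict):
        return {k: evaluate(v, scope) for k, v in node.items()}
    if isinstance(node, str) and node.startswith("$ "):
        return scope.get(node[2:])
    return node


template = {
    "$let": {"ID": "$ raw_id", "NAME": "John Smith"},
    "$body": {"pid": "$ ID", "name": "$ NAME"},
}
outer = {"raw_id": "MRN12345"}
result = eval_let(template, outer, evaluate)
```

In this demo `result` contains the bound values, and `outer` still has only its original key after evaluation.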

Evaluating JUTE Templates

Evaluating JUTE templates is a rather simple task. In general, we need
to perform a breadth-first traversal of the template and call the evaluation
function on each node.
The evaluation function dispatches on the type of the current
node. If the current node is a string starting with '$', we need to
perform expression evaluation. If the current node is a map (object)
containing keys starting with the '$' sign, we need to determine which
directive it is and call the corresponding directive's evaluation
function. Otherwise, the node evaluates into itself.
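
The dispatch rules above can be sketched in Python as follows. This is a simplified illustration, not the reference implementation: `make_evaluator` and `lookup` are hypothetical names, only the $if directive is registered, and for brevity the sketch uses plain recursive descent instead of an explicit breadth-first queue. The demo message also stores PID.5 as a map to keep the path lookup trivial.

```python
def make_evaluator(eval_expression):
    """Build an evaluate(node, scope) function around an expression engine."""

    def eval_if(node, scope):
        chosen = "$then" if eval_expression(node["$if"], scope) else "$else"
        return evaluate(node.get(chosen), scope)

    # Other directives ($foreach, $filter, $let) would be registered here.
    directives = {"$if": eval_if}

    def evaluate(node, scope):
        # Rule 1: strings starting with '$' are expressions.
        if isinstance(node, str) and node.startswith("$"):
            return eval_expression(node[1:].strip(), scope)
        if isinstance(node, dict):
            # Rule 2: maps with a known '$' key are directives.
            for key, handler in directives.items():
                if key in node:
                    return handler(node, scope)
            # Plain maps keep their keys; their values are evaluated.
            return {k: evaluate(v, scope) for k, v in node.items()}
        if isinstance(node, list):
            return [evaluate(item, scope) for item in node]
        # Rule 3: anything else evaluates into itself.
        return node

    return evaluate


def lookup(expression, scope):
    # Demo-only expression engine: a dotted path into nested dictionaries.
    value = scope
    for key in expression.split("."):
        value = value.get(key) if isinstance(value, dict) else None
    return value


evaluate = make_evaluator(lookup)
message = {"PID": {"5": {"1": "Smith", "2": "John"}}}
template = {
    "resourceType": "Patient",
    "name": {"$if": "PID.5.1", "$then": {"family": "$ PID.5.1"}, "$else": "unknown"},
    "tags": ["$ PID.5.2", 42],
}
result = evaluate(template, message)
```

For this input, `result` keeps `resourceType` unchanged, replaces the $if node with its $then branch, and resolves the expression inside the list.
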
Other Benefits of Declarativeness

It is worth mentioning that because JUTE templates are data structures, they
are language-independent, so it's possible to write a JUTE interpreter
for any programming language. It's also possible to implement a code
generator that translates a JUTE template into imperative code in some
language, which would allow better performance.
Future Development

Initially JUTE was developed as a declarative DSL for mapping HL7 to
FHIR resources, but now it's clear that JUTE's field of application
is much wider. Generally, JUTE can transform any data structure into any
other data structure, and can be applied, for example, to translate FHIR to
HL7 messages. The Health Samurai team will announce new versions and new
features of the JUTE language, as well as new versions of the reference
implementation.
Conclusion

In this article we described a tiny language for describing
transformations from HL7 to FHIR and showed that declarative data-based
mappings are easier to read and write than mappings written in
imperative languages.