@niquola
Last active August 30, 2022 12:48

FHIR Terminology Repository (FTR)

Motivation

FHIR Apps (servers) require terminologies for validation, lookups and other scenarios. There is a list of challenges with terminologies:

  • Keep “standard” terminologies like ICD10, RxNorm, LOINC in sync
  • Publish and distribute valuesets and terminology modules across the organization, projects and jurisdictions.
  • Terminologies may have multiple clients, such as FHIR profiling, FHIR servers, FHIR terminology servers, ETL pipelines, etc.

One solution is to have a centralized FHIR terminology server. We propose another, “low-level” solution: the FHIR Terminology Repository (FTR).

FTR makes it easy to publish and sync terminologies between all services in a distributed way (like git).

Summary

FTR (FHIR Terminology Repository) is a spec for repository layout, formats and algorithms to distribute FHIR terminologies as ValueSets.

The basic unit of distribution is an “expanded” ValueSet (VS), that is, a collection of Concept resources. FTR works only with “expanded” ValueSets; the design of ValueSets is out of scope for this specification. The spec does not use CodeSystem: a code system is represented as a ValueSet containing all of its concepts. A Concept must have at least two elements, system and code, and may contain arbitrary additional properties.

FTR defines:

  • VS File format - to store content of specific version of valueset
  • VS Patch format - changes between two versions
  • VS Tag format - named reference to specific version and chain of patches to current version
  • Tag Index format - index of all valuesets with specific tag.
  • VS Publish algorithm - how VS should be published
  • VS Update algorithm - how to update VS by VS Pointer
  • Tag Sync - how to sync many VSs efficiently

Repository layout

FTR is a base directory containing module directories (for example fhir.r4 or loinc). Each module directory contains a ‘vs/’ directory with valuesets and a ‘tags/’ directory with tag indexes.

fhir/ # module
  - tags/ # tags
    - r5.ndjson.gz # tag indexes
    - r4.ndjson.gz
  - vs/ 
    - patient.gender/ # valueset
      - tag.r4.ndjson.gz # vs tag
      - tag.r5.ndjson.gz
      - vs.bc7623b7a94ed3d8feaffaf7580df3eca4f5f5ca.ndjson.gz # vs file
      - vs.e3b0c44298fc1c149afbf4c8996fb92427ae41ca.ndjson.gz
      - patch.bc7623b7a.7ae41ca.ndjson.gz # vs patch
loinc
  - tags/
    - main.ndjson.gz
  - vs/ 
    - loinc/
      - tag.main.ndjson.gz
      - vs.bc7623b7a94ed3d8feaffaf7580df3eca4f5f5ca.ndjson.gz

ValueSet File (VSF)

A ValueSet File (VSF) is a gzipped ndjson file. The first line is the VS resource and the remaining lines are Concept resources. Concept lines must be sorted lexicographically by “{{system}}-{{code}}”. JSON must be “Canonical JSON”, i.e. sorted keys and no insignificant whitespace, so that files with the same content always produce the same SHA1 hash. The VSF name must be “vs.{hash}.ndjson.gz”, where hash is the SHA1 hash of the gzipped file.

Example:

{"name":"fhir/patient-gender","url":"..."}
{"code":"female","display":"Female"}
{"code":"male","display":"Male"}
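The VSF rules above (canonical JSON, sorted concept lines, SHA1 of the gzipped bytes) can be sketched in Python roughly as follows. This is a minimal sketch, not a reference implementation: `canonical` and `build_vsf` are hypothetical names, and it assumes every concept carries both `system` and `code`. The spec does not say how gzip metadata is handled; the sketch assumes the header timestamp is fixed at zero, since a varying timestamp would otherwise change the hash of identical content. Byte-identical output across different compression libraries or levels is not guaranteed by this alone.

```python
import gzip
import hashlib
import json


def canonical(obj: dict) -> str:
    # Sorted keys and no whitespace, so equal content gives equal text.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def build_vsf(valueset: dict, concepts: list) -> tuple:
    """Return (file name, gzipped bytes) for one ValueSet version."""
    # Sort concept lines by the "{system}-{code}" string, as the spec requires.
    ordered = sorted(concepts, key=lambda c: f"{c['system']}-{c['code']}")
    lines = [canonical(valueset)] + [canonical(c) for c in ordered]
    raw = ("\n".join(lines) + "\n").encode("utf-8")
    # Assumption: mtime=0 keeps the gzip header timestamp constant.
    blob = gzip.compress(raw, mtime=0)
    digest = hashlib.sha1(blob).hexdigest()
    return f"vs.{digest}.ndjson.gz", blob
```

Calling `build_vsf` twice with the same concepts in a different input order yields the same file name, which is the property the naming scheme relies on.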

ValueSet Patch File and algorithm (VSP)

Given two versions of a VSF, you can generate a patch file (VSP) in a single pass with the following algorithm:

reserve_first_line
loop(c1 = next_concept(vs1), c2 = next_concept(vs2))
 case
  when c1 is null and c2 is null
    if changes > 0
      update_first_line(header(vs2))
    else
      empty_patch()
  when c1 and c2 are not null and c1.code&system = c2.code&system
    when c1 not equal c2
      update_concept(c2)
    recur(next_concept(vs1), next_concept(vs2))
  when c2 is null or (c1 is not null and c1.code&system < c2.code&system)
    # c1 exists only in the old version
    remove_concept(c1)
    recur(next_concept(vs1), c2)
  when c1 is null or c1.code&system > c2.code&system
    # c2 exists only in the new version
    new_concept(c2)
    recur(c1, next_concept(vs2))

The patch file will look like this:

{"name":"myvs"}
{"op": "add",    "code":"c1" ,"display":".."}
{"op": "remove", "code":"c2" ,"display":".."}
{"op": "update", "code":"c3" ,"display":".."}
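The one-pass diff can be illustrated with a short Python sketch that works on already-loaded, already-sorted concept lists. The function names `sort_key`, `diff_concepts` and `apply_patch` are hypothetical. The sketch assumes each concept has `system` and `code`, that no concept uses a property called `op`, and that the patch header line is handled elsewhere. Applying a patch is not described in detail by this spec, so `apply_patch` is only one plausible reading of the "add", "remove" and "update" operations.

```python
def sort_key(c: dict) -> str:
    # Same ordering key the VSF uses for its concept lines.
    return f"{c['system']}-{c['code']}"


def diff_concepts(old: list, new: list) -> list:
    """Merge two sorted concept lists into a list of patch operations."""
    ops, i, j = [], 0, 0
    while i < len(old) or j < len(new):
        if j == len(new) or (i < len(old) and sort_key(old[i]) < sort_key(new[j])):
            # Concept exists only in the old version.
            ops.append({"op": "remove", **old[i]})
            i += 1
        elif i == len(old) or sort_key(old[i]) > sort_key(new[j]):
            # Concept exists only in the new version.
            ops.append({"op": "add", **new[j]})
            j += 1
        else:
            # Same system and code: emit an update only if content differs.
            if old[i] != new[j]:
                ops.append({"op": "update", **new[j]})
            i += 1
            j += 1
    return ops


def apply_patch(old: list, ops: list) -> list:
    """Apply patch operations and return the new sorted concept list."""
    by_key = {sort_key(c): c for c in old}
    for op in ops:
        concept = {k: v for k, v in op.items() if k != "op"}
        if op["op"] == "remove":
            by_key.pop(sort_key(concept), None)
        else:
            # "add" and "update" both store the concept from the patch line.
            by_key[sort_key(concept)] = concept
    return [by_key[k] for k in sorted(by_key)]
```

A useful sanity check for any implementation is the round trip: applying `diff_concepts(old, new)` to `old` should give back `new`.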

ValueSet Tag File (VST)

A ValueSet Tag File (VST) is a gzipped ndjson file. The first line holds the hash of the latest version of the valueset; the remaining lines are pointers to patch files.

{"tag":"fhir.v4", "hash":"{hash-3}"}
{"from": "{hash-1}", "to":"{hash-2}"}
{"from": "{hash-2}", "to":"{hash-3}"}
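To show how a client might use the pointer lines, here is a small Python sketch that walks from the client's hash to the tagged hash. `parse_vst` and `patch_chain` are hypothetical names, and the sketch assumes each hash has at most one outgoing pointer. When no chain is found it returns `None`, in which case a client would fall back to loading the full VS file.

```python
import json
from typing import Optional


def parse_vst(text: str) -> tuple:
    """Split decompressed VST text into (header, pointer rows)."""
    rows = [json.loads(line) for line in text.splitlines() if line.strip()]
    return rows[0], rows[1:]


def patch_chain(vst_text: str, client_hash: str) -> Optional[list]:
    """Return the (from, to) steps from client_hash to the tagged hash.

    Returns [] when the client is already current and None when the
    tag file contains no chain starting at client_hash.
    """
    head, pointers = parse_vst(vst_text)
    next_hop = {p["from"]: p["to"] for p in pointers}
    steps, current, seen = [], client_hash, set()
    while current != head["hash"]:
        if current not in next_hop or current in seen:
            return None  # unknown version or a loop in the pointers
        seen.add(current)
        steps.append((current, next_hop[current]))
        current = next_hop[current]
    return steps
```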

Tag Index File (TI)

A Tag Index file (TI) is a table of all valuesets with a specific tag and their current hashes, sorted by valueset name. Alongside it there is a tags/[tag].hash file, which holds the hash of the TI.

{"name": "module.myvs-1", "hash":"...."}
{"name": "module.myvs-2", "hash":"...."}

It is intended for quickly checking whether anything has changed under a tag and for calculating a bulk patch plan.

A client can save the last tag hash and the hashes of all its valuesets. To check whether something has changed, it compares the saved hash with the current hash from the tags/[tag].hash file. When the hashes differ, the client loads the index file and uses its saved VS hashes to discover which valuesets should be updated. To update a VS, the client loads the VST and looks for patches to apply to its version of the VS, or it may simply load the full version referenced in the VST.
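The check described above can be sketched as a small planning function. This is an illustrative sketch only: `parse_index` and `sync_plan` are hypothetical names, `load_index` stands in for whatever fetches and decompresses the tag index, and the "add" and "remove" categories are an assumption, since the text only talks about valuesets that should be updated.

```python
import json
from typing import Callable


def parse_index(ndjson: str) -> dict:
    """Turn decompressed tag index text into {valueset name: hash}."""
    rows = (json.loads(line) for line in ndjson.splitlines() if line.strip())
    return {row["name"]: row["hash"] for row in rows}


def sync_plan(saved_tag_hash: str, current_tag_hash: str, saved: dict,
              load_index: Callable[[], str]) -> dict:
    """Compare saved state with the tag index and list what to sync."""
    plan = {"update": [], "add": [], "remove": []}
    if saved_tag_hash == current_tag_hash:
        return plan  # nothing changed, so the index is never loaded
    current = parse_index(load_index())
    for name, vs_hash in current.items():
        if name not in saved:
            plan["add"].append(name)
        elif saved[name] != vs_hash:
            plan["update"].append(name)
    plan["remove"] = [name for name in saved if name not in current]
    return plan
```

Passing the loader as a callable keeps the cheap path cheap: when the tag hash is unchanged, only the small hash file has been read.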

Publish VS

To publish a VS under a specific tag:

  • generate the VSF
  • if it is not already present in the repo, upload it
  • check whether the tag file of this VS points to this version
  • if it does not:
    • generate a patch file from the latest version
    • update the VST with the new hash and the migration (the from/to pointer)
  • update the TI file and its hash file
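The publish steps can be sketched against an in-memory stand-in for a module directory. Everything here is hypothetical: the `Repo` class, its field names and the `publish` signature are invented for illustration, the VSF hash and bytes are taken as inputs, and patch generation is left as a comment. Hashing the uncompressed ndjson text of the tag index is an assumption, since the spec does not say which bytes the TI hash covers.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Hypothetical in-memory stand-in for one module directory."""
    files: dict = field(default_factory=dict)       # (vs name, hash) -> VSF bytes
    tags: dict = field(default_factory=dict)        # (vs name, tag) -> VST content
    index: dict = field(default_factory=dict)       # tag -> {vs name: hash}
    index_hash: dict = field(default_factory=dict)  # tag -> hash of the TI


def publish(repo: Repo, name: str, tag: str, vs_hash: str, blob: bytes) -> bool:
    """Publish one VSF under a tag; return True if the tag moved."""
    # Upload the VSF only if this version is not in the repo yet.
    repo.files.setdefault((name, vs_hash), blob)
    vst = repo.tags.setdefault((name, tag), {"hash": None, "chain": []})
    if vst["hash"] == vs_hash:
        return False  # the tag already points to this version
    if vst["hash"] is not None:
        # A real publisher would also generate and store the patch file here.
        vst["chain"].append({"from": vst["hash"], "to": vs_hash})
    vst["hash"] = vs_hash
    # Refresh the tag index, sorted by name, and its hash (assumed scheme).
    entries = repo.index.setdefault(tag, {})
    entries[name] = vs_hash
    ndjson = "\n".join(
        json.dumps({"name": n, "hash": h}, sort_keys=True, separators=(",", ":"))
        for n, h in sorted(entries.items())
    )
    repo.index_hash[tag] = hashlib.sha1(ndjson.encode("utf-8")).hexdigest()
    return True
```

Publishing the same version twice is a no-op in this sketch, so the tag index hash changes only when some valueset under the tag actually moved.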