Student : Aditya Kurkure
Project : HAPI Struct <> Proto Conversion Tool
Organization : Google FHIR SDK
Mentors : Richard Kareko
NOTE : We are currently in the process of migrating the project here.
The purpose of this project is to implement the HAPI <> Proto converter, a library that converts between HAPI structs and FHIR protos. The structures defined in the FHIR specification can be broadly divided into two categories: Resources and Base types. Resources represent the medical data that is to be stored or shared. For example, a Patient resource contains data about a particular patient, such as their name, address, and contact information. Base types define how this data is stored, for example a string for a name and a date for a date of birth. Base types can be further classified into Data types and Primitive types. Primitive types cannot be broken down into simpler types, so every Data type and Resource is ultimately a collection of primitive types. The basic idea, therefore, is to break Resources and Data types down until we reach primitive types, convert those primitive types to their HAPI or FHIR proto form, and then reconstruct the enclosing types.
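The break-down, convert, and rebuild idea can be sketched on a toy tree. This is only an illustration: `Node`, `Primitive`, `Composite`, and `convertTree` are hypothetical names, not HAPI or FHIR proto classes, and the primitive conversion is passed in as a plain function.

```kotlin
// Toy model only: these types are NOT the HAPI or FHIR proto classes.
sealed class Node
data class Primitive(val value: String) : Node()
data class Composite(val fields: Map<String, Node>) : Node()

// Walk the tree: convert primitives at the leaves, then rebuild each
// composite around its converted children.
fun convertTree(node: Node, convertPrimitive: (String) -> String): Node = when (node) {
    is Primitive -> Primitive(convertPrimitive(node.value))
    is Composite -> Composite(
        node.fields.mapValues { (_, child) -> convertTree(child, convertPrimitive) }
    )
}
```

Only the leaf conversion knows anything about the target representation; the recursion itself is the same for every resource and data type.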
Issues : Add functionality to convert between composite types, Feat : Add functionality to convert Hapi resources to Fhir protos (JSON), Feat : Add functionality to convert between primitive datatypes
Pull requests : Convert composite Types, Primitive converter poet, adds functionality to converter between resources using Json representation, feat: initialize hapi<>proto converter module, add functionality to convert between primitive types (reflection), feat: convert between enums
Feature Branch: Hapi Proto Converter
- Create a library that provides functionality to convert between HAPI structs and FHIR protos using the JSON representation
- Create a library that provides functionality to convert between HAPI structs and FHIR protos using code generation
- Compare the two approaches
- Additional work on the Android FHIR SDK
- Proper performance testing
- DomainResource.Contained element -> Handled differently in Fhir protos and Hapi structs
- XHTML type elements -> Handled differently in Fhir protos and Hapi structs
- Extensions on Primitive types (Done manually currently) -> Will be fixed with gradle task automation
- Resource type elements -> Handled differently in Fhir protos and Hapi structs
1. Serializing and deserializing JSON
Both HAPI structures and FHIR protos can be converted to a JSON representation. The general approach is to serialize the HAPI struct to JSON and then parse that JSON into a proto object (and vice versa).
Pros | Cons |
---|---|
Natively supported by both libraries | Serializing and deserializing requires the intermediate JSON representation |
Estimated effort will be the least. | Can only convert a complete resource and not primitive / composite types. |
Implementation will be fairly simple. | |
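
The shape of the JSON approach can be sketched without either library. In the sketch below, `HapiLikePatient`, `ProtoLikePatient`, `encodePatient`, `decodePatient`, and `convertViaJson` are all hypothetical stand-ins; in the real converter the encode and decode steps would be the JSON printers and parsers that HAPI (`IParser`) and the FHIR protos (`JsonFormat`) already provide.

```kotlin
// Hypothetical stand-ins for a HAPI struct and a FHIR proto message.
data class HapiLikePatient(val id: String, val birthDate: String)
data class ProtoLikePatient(val id: String, val birthDate: String)

// Generic pipeline: source object -> JSON text -> target object.
fun <S, T> convertViaJson(source: S, encode: (S) -> String, decode: (String) -> T): T =
    decode(encode(source))

// Hand-rolled encoder for the toy type; real libraries ship their own printers.
fun encodePatient(p: HapiLikePatient): String =
    """{"resourceType":"Patient","id":"${p.id}","birthDate":"${p.birthDate}"}"""

// Minimal decoder: pulls flat string fields out of the JSON text with a regex.
// Good enough for this toy input only; it does not handle nesting or escapes.
fun decodePatient(json: String): ProtoLikePatient {
    val fields = Regex("\"(\\w+)\":\"([^\"]*)\"").findAll(json)
        .associate { it.groupValues[1] to it.groupValues[2] }
    return ProtoLikePatient(fields.getValue("id"), fields.getValue("birthDate"))
}
```

The sketch also shows the cost listed in the table: every conversion materializes the full JSON string, and the pipeline only makes sense for something that has a standalone JSON form, such as a complete resource.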
2. Java reflection API
Using Java reflection, we can obtain type information about the HAPI struct at runtime and use it to build the corresponding FHIR proto object. The process is similar to code generation; the only difference is that the mapping is resolved at runtime, whereas with code generation it is resolved at compile time.
Pros | Cons |
---|---|
Size of the final library would be smaller than with code generation (but still larger than the JSON converter). | Slower than code generation (Moshi is an example). |
Would not involve converting to an intermediate state. | Possibility of runtime errors (as opposed to compile-time errors in code generation). |
Can convert a resource as well as base types | |
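
A minimal sketch of the reflection idea, assuming (unrealistically) that source and target use identical field names. `HapiLikeName`, `ProtoLikeName`, and `copyByReflection` are hypothetical; the real HAPI and proto classes have different shapes and would need name and type mapping on top of this.

```kotlin
// Hypothetical stand-ins; the real HAPI and proto classes look different.
class HapiLikeName { var family: String = ""; var given: String = "" }
class ProtoLikeName { var family: String = ""; var given: String = "" }

// Copy every field of `source` into the same-named field of `target`.
// The field list is discovered at runtime through java.lang.reflect.
fun <T : Any> copyByReflection(source: Any, target: T): T {
    for (field in source.javaClass.declaredFields) {
        field.isAccessible = true
        val targetField = try {
            target.javaClass.getDeclaredField(field.name)
        } catch (e: NoSuchFieldException) {
            // A mapping mistake only surfaces here, while the program runs.
            throw IllegalStateException(
                "No field '${field.name}' on ${target.javaClass.simpleName}", e
            )
        }
        targetField.isAccessible = true
        targetField.set(target, field.get(source))
    }
    return target
}
```

The `catch` branch is the runtime-error risk from the table: a wrong mapping compiles fine and fails only when that field is first converted.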
3. Code Generation
Using code generation, we can create converters that would map each element in the HAPI struct to the corresponding element in the Fhir proto. This can be done using the structure definition file defined in the fhir spec, which would give us the type information at compile time.
Pros | Cons |
---|---|
Errors caused by incorrect mapping would surface at compile time rather than at runtime. | Size of the final library would be the largest of the three approaches. |
Would not involve converting to an intermediate state. | Would have to infer all properties from the structure definition file. |
Can convert a resource as well as base types | Harder to maintain, as converter can break if there are changes in the Proto or HAPI library. |
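
To make the code generation idea concrete, here is a small sketch that emits converter source text from a simplified element list. It swaps in plain string building for KotlinPoet so that it has no dependencies, and `ElementSpec`, `generateConverter`, and the `Hapi...` / `Proto...` names in the emitted code are all hypothetical.

```kotlin
// Simplified, hypothetical view of one element from a structure definition.
data class ElementSpec(val name: String, val repeated: Boolean)

// Emit the source of a converter function as text. KotlinPoet would build
// this through a typed builder API; strings keep the sketch dependency-free.
fun generateConverter(typeName: String, elements: List<ElementSpec>): String {
    val lines = elements.map { e ->
        val suffix = e.name.replaceFirstChar { it.uppercaseChar() }
        if (e.repeated) "    builder.addAll$suffix(${e.name}.map { it.toProto() })"
        else "    builder.set$suffix(${e.name}.toProto())"
    }
    val header = listOf(
        "fun Hapi$typeName.toProto(): Proto$typeName {",
        "    val builder = Proto$typeName.newBuilder()"
    )
    val footer = listOf("    return builder.build()", "}")
    return (header + lines + footer).joinToString("\n")
}
```

For example, `generateConverter("HumanName", listOf(ElementSpec("family", false), ElementSpec("given", true)))` returns the text of a `toProto()` function with one `set` call and one `addAll` call. Because the output is ordinary source code, a wrong accessor name is rejected by the compiler rather than at runtime.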
The elements were broadly categorized into 2 parts:
- Primitive types
- Composite types
- Backbone elements types
- Code types (Enums)
- Elements with a content reference rather than a type
- Choice type
- Other types
The type of each element, and whether or not it is repeated, can be easily inferred from the structure definition file. The challenging part was getting the naming conventions right. However, since both the HAPI struct codegen and the proto codegen are open source, this was somewhat easier.
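As a rough sketch of that inference, the function below classifies a trimmed-down element entry using fields that a FHIR structure definition carries (`path`, `max`, the type codes, and `contentReference`). The `Element` class, the `Kind` enum, and the rule that a lower-case type name means a primitive are assumptions made for illustration; the project's actual rules may differ.

```kotlin
// Hypothetical, trimmed-down view of an ElementDefinition entry.
data class Element(
    val path: String,
    val max: String,
    val typeCodes: List<String> = emptyList(),
    val contentReference: String? = null
)

enum class Kind { PRIMITIVE, COMPOSITE, BACKBONE, CODE, CONTENT_REFERENCE, CHOICE, OTHER }

// FHIR writes unbounded cardinality as "*"; any max above 1 also means a list.
fun isRepeated(e: Element): Boolean = e.max == "*" || (e.max.toIntOrNull() ?: 0) > 1

fun classify(e: Element): Kind {
    val type = e.typeCodes.singleOrNull()
    return when {
        e.contentReference != null -> Kind.CONTENT_REFERENCE
        e.path.endsWith("[x]") -> Kind.CHOICE
        type == "BackboneElement" -> Kind.BACKBONE
        type == "code" -> Kind.CODE
        // Assumption: lower-case type names are primitives (string, date, ...).
        type?.firstOrNull()?.isLowerCase() == true -> Kind.PRIMITIVE
        type != null -> Kind.COMPOSITE
        else -> Kind.OTHER
    }
}
```

With this, `Patient.name` (type `HumanName`, max `*`) comes out as a repeated composite, while `Patient.deceased[x]` is a choice type.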
To test the relative performance of the approaches, I converted 1000 Patient resources from HAPI to proto and vice versa.
All values are in milliseconds.
Measurement | JSON: Hapi to Proto | JSON: Proto to Hapi | Codegen: Hapi to Proto | Codegen: Proto to Hapi |
---|---|---|---|---|
First conversion | 391 | 1134 | 705 | 281 |
100 conversions | 887 | 1584 | 823 | 328 |
1000 conversions | 3285 | 3231 | 1205 | 500 |
Excluding first conversions | 2894 | 2097 | 500 | 219 |
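
The last row equals the 1000-conversion total minus the first conversion, which separates one-time start-up cost from steady-state cost. A measurement loop along those lines might look like the sketch below; `Timing` and `measure` are hypothetical names, and this is not the harness that produced the numbers above.

```kotlin
// Result of one benchmark run, in milliseconds.
data class Timing(val firstMs: Long, val totalMs: Long) {
    val excludingFirstMs: Long get() = totalMs - firstMs
}

// Run `convert` over all inputs, recording the first (cold) call separately.
fun <T> measure(inputs: List<T>, convert: (T) -> Any?): Timing {
    require(inputs.isNotEmpty()) { "need at least one input" }
    var firstNanos = 0L
    val start = System.nanoTime()
    inputs.forEachIndexed { index, input ->
        convert(input)
        if (index == 0) firstNanos = System.nanoTime() - start
    }
    val totalNanos = System.nanoTime() - start
    return Timing(firstNanos / 1_000_000, totalNanos / 1_000_000)
}
```

Applied to the JSON Hapi to Proto column, `Timing(391, 3285).excludingFirstMs` gives the 2894 shown in the table.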
No. of files | Hapi JSON | Proto bytes | Proto string |
---|---|---|---|
1 | 2.41 KB | 761 B | 2.90 KB |
100 | 243 KB | 75 KB | 293 KB |
1000 | 2.35 MB | 743 KB | 2.83 MB |
100,000 | 235 MB | 72.5 MB | 591 MB |
inline fun <reified T : Resource> convert(
    resource: GeneratedMessageV3,
    hapiParser: IParser,
    protoPrinter: JsonFormat.Printer
): T
protoPatient.toHapi()
hapiPatient.toProto()
- Protobufs have a relatively smaller size and hence are suitable for network and database operations.
- Hapi structures have a wide variety of use cases, for example the FHIR path engine and the CQL engine.
- The current HAPI fhir server could be modified to accept and store data in the FHIR proto format.
- The Cloud Healthcare API (GCP) could introduce new functionality such as evaluation of CQL (however, this project isn't open source).
- Why Kotlin / the KotlinPoet library?
The alternatives are Byte Buddy and JavaPoet. Byte Buddy generates classes at runtime and thus would not fit our use case. I chose KotlinPoet over JavaPoet because the Android FHIR library uses Kotlin. Post GSoC, I do plan on generating code in Java.
- Why object notation over top-level functions?
While the recommended way is to use top-level functions, the library currently uses object notation, because compiling the library with top-level functions fails with a heap space error. I attempted a few rudimentary fixes, but they did not solve the problem.
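For readers unfamiliar with the two styles, the difference looks roughly like this. The types and converter names are hypothetical stand-ins, not the generated code itself.

```kotlin
// Hypothetical stand-ins for a HAPI type and its proto counterpart.
data class HapiLikeString(val value: String)
data class ProtoLikeString(val value: String)

// Object notation: the extension function lives inside a named singleton.
object StringConverter {
    fun HapiLikeString.toProto(): ProtoLikeString = ProtoLikeString(value)
}

// Top-level alternative: the same extension declared directly in the file.
fun HapiLikeString.toProtoTopLevel(): ProtoLikeString = ProtoLikeString(value)
```

With object notation the caller has to bring the extension into scope, for example with `with(StringConverter) { hapiString.toProto() }` or by importing the member, whereas the top-level version can be called directly.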