@clarkevans
Created February 28, 2018 21:45
Exploring FHIR Synthetic Data /w IHM Model
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exploring FHIR Synthetic Data /w IHM Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook explores queries on synthetic medical records for 12 sample patients obtained from the Cypress project. The synthetic data is provided as an XML formatted FHIR encoded file, one file per patient. This data is loaded into memory in a manner inspired by the Integrated Health Model (\"IHM\"). This notebook can be used to learn about the synthetic data so that we can be sure it's translation is successful."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Julia language Jupyter notebook instance should be started in the project folder. There are several code modules we'll use in the source directory."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"cd(\"src\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example Data & Combinator Introduction"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To explore this data we use an early prototype of QueryCombinators, a very flexible high-level query language detailed at:\n",
"https://querycombinators.org implemented in the Julia language for scientific computing. The warning below reflects that we've overridden a function in the query combinator library to work-around a bug we encountered."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING: Method definition cse(RBT.Query) in module RBT at /home/cce/.julia/v0.5/RBT/src/cse.jl:3 overwritten in module Main at /home/cce/ihm-fhir-ri/src/querycombinators.jl:84.\n"
]
},
{
"data": {
"text/plain": [
"\"QueryCombinators are Loaded!\""
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"include(\"querycombinators.jl\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first stage of processing is a relatively straight-forward conversion of these XML formatted FHIR encoded synthetic records into a in-memory representation. The ``load-fhir.jl`` program provides a function, ``query_ihm`` that can be then used to query this load all synthetic patients."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"query_ihm (generic function with 2 methods)"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"include(\"load-fhir.jl\") "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__How many patients exist in this synthetic data set?__\n",
"\n",
"To retrieve the answer, we pass to ``query_ihm()`` a query combinator ``Count(Patient)``. "
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"12"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Count(Patient) \n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can apply Count by appending it to an already working query.\n",
"\n",
"``Count(Patient)`` is equivalent to ``Patient >> ThenCount``\n",
"\n",
"In this combinator query language, the composition ``>>`` operator indicates a pipeline. This notation permits the incremental construction of queries, so that you could simply append operations on an already working query. By convention, operations starting with *Then* indicate a pipeline version of a combinator; ``ThenCount`` is the pipeline equivalent of ``Count``. \n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"12"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient \n",
" >> ThenCount\n",
")"
]
},
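{
"cell_type": "markdown",
"metadata": {},
"source": [
"The equivalence between ``Count(Patient)`` and ``Patient >> ThenCount`` can be mimicked in plain Julia. This is only an analogy and not the QueryCombinators implementation: ``pipeline`` and ``then_count`` are hypothetical helpers, the patient list is a stand-in, and a current Julia 1.x release is assumed rather than the 0.5 release this notebook ran on.\n",
"\n",
"```julia\n",
"# Hypothetical helper: chain stages left to right, the way `>>` chains combinators.\n",
"pipeline(stages...) = input -> foldl((acc, stage) -> stage(acc), stages; init=input)\n",
"\n",
"# Stand-in data: 12 placeholder patient ids.\n",
"patients = collect(1:12)\n",
"\n",
"then_count = length   # pipeline-style counterpart of a direct count\n",
"\n",
"count_direct   = length(patients)\n",
"count_pipeline = pipeline(identity, then_count)(patients)\n",
"\n",
"println(count_direct == count_pipeline)\n",
"```\n",
"\n",
"Because each stage only sees the output of the previous one, a new stage can be appended without touching the stages before it."
]
},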
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__How many encounters, across all patients, are in this data set?__\n",
"\n",
"We create a 3-stage pipeline: (a) choose all ``Patient`` records, (b) for each patient, navigate to the corresponding ``Encounter`` records, (c) and ``ThenCount`` the size of resulting list. In this query language, nested lists are automatically flattened into a single stream. Hence, ``Patient >> Encounter`` construct lists encounters across all patients. "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"28"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> Encounter\n",
" >> ThenCount\n",
")"
]
},
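{
"cell_type": "markdown",
"metadata": {},
"source": [
"The automatic flattening described above can be pictured in plain Julia. This is a rough sketch, not how the combinator library works internally; the handles and encounter ids are made-up placeholders, and Julia 1.x is assumed.\n",
"\n",
"```julia\n",
"# Made-up stand-in: each patient handle maps to a list of encounter ids.\n",
"encounters_by_patient = Dict(\n",
"    \"p1\" => [101, 102],\n",
"    \"p2\" => Int[],\n",
"    \"p3\" => [103],\n",
")\n",
"\n",
"# Navigating from patients to encounters yields one flat stream,\n",
"# not a list of per-patient lists.\n",
"all_encounters = collect(Iterators.flatten(values(encounters_by_patient)))\n",
"\n",
"println(length(all_encounters))\n",
"```\n",
"\n",
"A patient without encounters simply contributes nothing to the flattened stream."
]
},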
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__How many observations, actions and requests does each patient have?__\n",
"\n",
"The query below produces, for each patient, the synthetic patient's handle and the count of corresponding medical record entries. Sometimes it's not helpful to flatten the output list; or it may be helpful to select more than one output field. In this case, the combinator ``ThenSelect`` builds a record in the output structure."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"12-element composite vector of {String, Int64, Int64, Int64}:\n",
" (\"2_N_GP_Adult\",10,3,1) \n",
" (\"A_Heart_Adult\",14,1,0) \n",
" (\"B_Heart_Adult\",11,3,2) \n",
" (\"C_N_GP_Adult\",10,0,2) \n",
" (\"Z14_N_GP_Adult\",6,0,0) \n",
" (\"Z1_Pregnancy_Adult\",10,1,0)\n",
" (\"Z2_N_Heart_Adult\",13,0,0) \n",
" (\"Z4_Heart_Adult\",12,3,2) \n",
" (\"Z5_Heart_Adult\",6,1,0) \n",
" (\"Z6_N_Heart_Adult\",8,1,1) \n",
" (\"Z7_Heart_Adult\",8,1,1) \n",
" (\"Z9_GP_Geriatric\",5,0,8) "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> ThenSelect(\n",
" Handle,\n",
" Count(Observation),\n",
" Count(Action),\n",
" Count(Request))\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The current selection here might be handy to use again. Let's give it a name."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RBT.Combinator(RBT.#265)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ThenSummarizePatient = (\n",
" ThenSelect(\n",
" Handle,\n",
" Count(Observation),\n",
" Count(Action),\n",
" Count(Request)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which patients have Requests?__"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"7-element composite vector of {String, Int64, Int64, Int64}:\n",
" (\"2_N_GP_Adult\",10,3,1) \n",
" (\"B_Heart_Adult\",11,3,2) \n",
" (\"C_N_GP_Adult\",10,0,2) \n",
" (\"Z4_Heart_Adult\",12,3,2) \n",
" (\"Z6_N_Heart_Adult\",8,1,1)\n",
" (\"Z7_Heart_Adult\",8,1,1) \n",
" (\"Z9_GP_Geriatric\",5,0,8) "
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> ThenFilter(Exists(Request))\n",
" >> ThenSummarizePatient\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which patients have Requests and more than 10 Observations?__"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4-element composite vector of {String, Int64, Int64, Int64}:\n",
" (\"2_N_GP_Adult\",10,3,1) \n",
" (\"B_Heart_Adult\",11,3,2) \n",
" (\"C_N_GP_Adult\",10,0,2) \n",
" (\"Z4_Heart_Adult\",12,3,2)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> ThenFilter(Exists(Request))\n",
" >> ThenFilter(Count(Observation) .> 8) \n",
" >> ThenSummarizePatient\n",
")"
]
},
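{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a cross-check, the same two filters can be applied in plain Julia to the per-patient summary rows printed earlier. This is a sketch on copied output, not a query against the loaded records; ``has_requests`` and ``many_observations`` are hypothetical names.\n",
"\n",
"```julia\n",
"# (handle, observations, actions, requests), copied from the summary output above.\n",
"rows = [\n",
"    (\"2_N_GP_Adult\", 10, 3, 1), (\"A_Heart_Adult\", 14, 1, 0),\n",
"    (\"B_Heart_Adult\", 11, 3, 2), (\"C_N_GP_Adult\", 10, 0, 2),\n",
"    (\"Z14_N_GP_Adult\", 6, 0, 0), (\"Z1_Pregnancy_Adult\", 10, 1, 0),\n",
"    (\"Z2_N_Heart_Adult\", 13, 0, 0), (\"Z4_Heart_Adult\", 12, 3, 2),\n",
"    (\"Z5_Heart_Adult\", 6, 1, 0), (\"Z6_N_Heart_Adult\", 8, 1, 1),\n",
"    (\"Z7_Heart_Adult\", 8, 1, 1), (\"Z9_GP_Geriatric\", 5, 0, 8),\n",
"]\n",
"\n",
"has_requests(row) = row[4] > 0        # analogue of Exists(Request)\n",
"many_observations(row) = row[2] > 8   # analogue of Count(Observation) .> 8\n",
"\n",
"selected = filter(row -> has_requests(row) && many_observations(row), rows)\n",
"println([row[1] for row in selected])\n",
"```\n",
"\n",
"Chaining two ``ThenFilter`` stages corresponds to combining both conditions with a logical and."
]
},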
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What is the average number of observations for each patient?__\n",
"\n",
"Let's solve this in two stages. Let's first compute, for each patient, the count of its observations. This returns an array of counts. Second, let's take the mean of those counts."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"12-element Array{Int64,1}:\n",
" 10\n",
" 14\n",
" 11\n",
" 10\n",
" 6\n",
" 10\n",
" 13\n",
" 12\n",
" 6\n",
" 8\n",
" 8\n",
" 5"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> Count(Observation)\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"9.416666666666666"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> Count(Observation)\n",
" >> ThenMean\n",
")"
]
},
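{
"cell_type": "markdown",
"metadata": {},
"source": [
"The mean can be verified by hand from the twelve counts printed above. A small plain-Julia check, independent of the query library:\n",
"\n",
"```julia\n",
"# Per-patient observation counts, copied from the output above.\n",
"counts = [10, 14, 11, 10, 6, 10, 13, 12, 6, 8, 8, 5]\n",
"\n",
"total = sum(counts)                # 113\n",
"average = total / length(counts)   # 113 / 12\n",
"\n",
"println(average)\n",
"```\n",
"\n",
"The division 113 / 12 agrees with the value returned by ``ThenMean``."
]
},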
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__How are the patient's encounters coded?__\n",
"\n",
"So that we could get a nice summary, let's start with every encounter across all patients, ``ThenGroup`` by the encounter's type, ``ThenSort`` in decending order (``Desc``) by the number of encounters in each group, and then finally, ``ThenSelect`` the type and the number of corresponding encounters. We use ``Category`` to mean the \"type\" of Encounter, using the word \"type\" in a programming environment is unwise."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5-element composite vector of {Concept, Int64}:\n",
" (Concept(390906007),19)\n",
" (Concept(185347001),5) \n",
" (Concept(32485007),2) \n",
" (Concept(108219001),1) \n",
" (Concept(182964004),1) "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> Encounter\n",
" >> ThenGroup(Category)\n",
" >> ThenSort(Count(Encounter) >> Desc)\n",
" >> ThenSelect(\n",
" Category,\n",
" Count(Encounter))\n",
")"
]
},
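{
"cell_type": "markdown",
"metadata": {},
"source": [
"The group, sort, and select steps can be sketched in plain Julia. Since the individual encounters are not printed in this notebook, the flat list below is rebuilt from the grouped counts shown above; everything else is an illustrative analogy rather than the library's implementation, assuming Julia 1.x.\n",
"\n",
"```julia\n",
"# Rebuild a flat list of encounter codes from the grouped counts shown above.\n",
"grouped = [(390906007, 19), (185347001, 5), (32485007, 2), (108219001, 1), (182964004, 1)]\n",
"codes = vcat([fill(code, n) for (code, n) in grouped]...)\n",
"\n",
"# Group: tally encounters per category code.\n",
"tally = Dict{Int,Int}()\n",
"for code in codes\n",
"    tally[code] = get(tally, code, 0) + 1\n",
"end\n",
"\n",
"# Sort by descending count (ties broken by code), then select (code, count).\n",
"ranked = sort(collect(tally); by = p -> (-p.second, p.first))\n",
"println([(p.first, p.second) for p in ranked])\n",
"```\n",
"\n",
"The tie-breaking rule is an assumption of this sketch; the notebook does not say how ``ThenSort`` orders groups of equal size."
]
},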
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__When were these synthetic patients born, from oldest to youngest?__\n",
"\n",
"The IHM model doesn't have Patient in its model which might have a convenient ``BirthDate`` attribute. Instead, it is represented as a ``184099003``|``Date of birth`` coded ``Observation``. We need to convert this convention into something more friendly. A computation ``Frame`` is necessary since we need this combinator to work for each patient individually, and not across a collection of patients."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RBT.Combinator(RBT.#185)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"BirthDateConcept = ToConcept(\"SNOMED-CT\",\"184099003\")\n",
"BirthDate = (\n",
" Frame(\n",
" Observation\n",
" >> ThenFilter(AnyOf(Observable .== BirthDateConcept))\n",
" >> Initiation\n",
" >> ThenUnique\n",
" ) >> ThenExpectOne)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"While the definition of ``BirthDate`` maybe involved, it can be easily used. This mechanism permits the dynamic construction of query languages customized for very narrow domains."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"12-element composite vector of {Date, String}:\n",
" (1941-01-02,\"C_N_GP_Adult\") \n",
" (1941-02-07,\"Z9_GP_Geriatric\") \n",
" (1946-02-01,\"A_Heart_Adult\") \n",
" (1946-02-01,\"Z2_N_Heart_Adult\") \n",
" (1946-02-01,\"B_Heart_Adult\") \n",
" (1946-02-01,\"Z4_Heart_Adult\") \n",
" (1959-01-26,\"Z7_Heart_Adult\") \n",
" (1967-01-24,\"Z6_N_Heart_Adult\") \n",
" (1968-02-14,\"Z14_N_GP_Adult\") \n",
" (1969-08-26,\"Z5_Heart_Adult\") \n",
" (1981-05-29,\"2_N_GP_Adult\") \n",
" (1992-10-01,\"Z1_Pregnancy_Adult\")"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> ThenSort(BirthDate)\n",
" >> ThenSelect(\n",
" BirthDate >> AsDate,\n",
" Handle)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Can we focus our attention on the youngest patient?__\n",
"\n",
"This dynamic customization ability permits interesting queries to be locally defined. To find the youngest patient, we need to sort in descending order, and then take the 1st one from the list."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RBT.Combinator(RBT.#185)"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"FocusPatient = (\n",
" Patient\n",
" >> ThenSort(BirthDate >> Desc)\n",
" >> ThenTake(1) \n",
" >> ThenExpectOne\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once defined, queries on this focus patient are simple."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(\"Z1_Pregnancy_Adult\",10,1,0)"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient \n",
" >> ThenSummarizePatient\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What observations exist for this patient?__"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"10-element composite vector of {Date, Concept, Measurement?}:\n",
" (1992-10-01,Concept(184099003),#NULL) \n",
" (1992-10-01,Concept(248152002),#NULL) \n",
" (1992-10-01,Concept(413581001),#NULL) \n",
" (2014-03-01,Concept(84114007),#NULL) \n",
" (2014-03-01,Concept(981000124106),#NULL) \n",
" (2015-04-03,Concept(1201005),#NULL) \n",
" (2015-04-03,Concept(66071002),#NULL) \n",
" (2015-05-23,Concept(47200007),#NULL) \n",
" (2016-02-01,Concept(289259007),#NULL) \n",
" (2016-02-11,Concept(271649006),Measurement(Concept(259018001),144))"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> Observation\n",
" >> ThenSelect(\n",
" Initiation >> AsDate, \n",
" Observable,\n",
" Value)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Using a SNOMED terminology service."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So that we can deepen our analysis, let's cross-reference these codings with a SNOMED terminology service. The following file defines a ``SuperType`` combinator that takes any Concept and returns the list of its parents. More generally it also defines ``ConceptDescription`` which will return a description for a concept, and ``ConceptRelationship`` which will build a combinator specific to a particular kind of SNOMED relationship."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ConceptRelationship (generic function with 2 methods)"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"include(\"load-snomed.jl\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's name a few concepts for our analysis. "
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RBT.Combinator(RBT.#201)"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ClinicalFinding = ToConcept(\"SNOMED-CT\", \"404684003\")\n",
"HeartFailure = ToConcept(\"SNOMED-CT\", \"84114007\")\n",
"EssentialHypertension = ToConcept(\"SNOMED-CT\", \"59621000\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What are the descriptions for our patient's entries?__\n",
"\n",
"These can be retrieved using the ``CategoryDescription`` combinator which takes a particular category and returns the corresponding description via terminology lookup."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"10-element composite vector of {Date, String}:\n",
" (1992-10-01,\"Date of birth\") \n",
" (1992-10-01,\"Female\") \n",
" (1992-10-01,\"Asian or Pacific islander\") \n",
" (2014-03-01,\"Heart failure\") \n",
" (2014-03-01,\"Moderate left ventricular systolic dysfunction\")\n",
" (2015-04-03,\"Benign essential hypertension\") \n",
" (2015-04-03,\"Type B viral hepatitis\") \n",
" (2015-05-23,\"High risk pregnancy\") \n",
" (2016-02-01,\"Vaginal delivery\") \n",
" (2016-02-11,\"Systolic blood pressure\") "
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> Observation\n",
" >> ThenSelect(\n",
" Initiation >> AsDate, \n",
" Observable >> ConceptDescription)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What are ancestors of ``289259007``|``Vaginal delivery``?__\n",
"\n",
"A terminology server permits us make inquiries about super-types for the patient's codings. The ``SuperType`` combinator that returns the direct super-types for any given concept. The built-in ``Connect`` combinator performs a transitive closure. Hence, below we count the number of super-type ancestors for each of the patient's coded values."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"7-element composite vector of {String, String}:\n",
" (\"289258004\",\"Finding of pattern of delivery\") \n",
" (\"118215003\",\"Delivery finding\") \n",
" (\"118185001\",\"Finding related to pregnancy\") \n",
" (\"248982007\",\"Pregnancy, childbirth and puerperium finding\")\n",
" (\"250171008\",\"Clinical history and observation findings\") \n",
" (\"404684003\",\"Clinical finding\") \n",
" (\"138875005\",\"SNOMED-CT(138875005)\") "
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"VaginalDeliveryConcept = ToConcept(\"SNOMED-CT\", \"289259007\")\n",
"\n",
"query_ihm(\n",
" VaginalDeliveryConcept\n",
" >> Connect(SuperType)\n",
" >> ThenSelect(\n",
" CodeValue,\n",
" ConceptDescription)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Can we test diagnosis coding transitively?__\n",
"\n",
"We could define another combinator ``IsA`` to test if a particular concept has another for an ancestor. If you notice from the result above, the ``Connect`` list doesn't include the concept itself. One would expect a concept to match itself, so we need ``IsA`` to ``Merge`` the concept in with its list of ancestors before testing set membership with ``AnyOf``. "
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"IsA (generic function with 1 method)"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"IsA(parent) = AnyOf(Merge(It, It >> Connect(SuperType)) .== parent)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(true,true,false)"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Select(\n",
" HeartFailure >> IsA( HeartFailure), \n",
" HeartFailure >> IsA(ClinicalFinding),\n",
" ClinicalFinding >> IsA(HeartFailure))\n",
")"
]
},
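{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the logic of ``Connect(SuperType)`` and ``IsA`` concrete, here is a minimal plain-Julia sketch over a toy hierarchy. It is not the terminology service or the library code: the ``parents`` table, the intermediate concept ``:heart_disease``, and the helper names ``ancestors`` and ``is_a`` are invented for illustration, and Julia 1.x is assumed.\n",
"\n",
"```julia\n",
"# Toy hierarchy: each concept maps to its direct super-types.\n",
"# The intermediate concept :heart_disease is invented for illustration.\n",
"parents = Dict(\n",
"    :heart_failure    => [:heart_disease],\n",
"    :heart_disease    => [:clinical_finding],\n",
"    :clinical_finding => Symbol[],\n",
")\n",
"\n",
"# Transitive closure of the parent relation (the concept itself is excluded).\n",
"function ancestors(parents, concept)\n",
"    seen = Symbol[]\n",
"    queue = copy(get(parents, concept, Symbol[]))\n",
"    while !isempty(queue)\n",
"        current = popfirst!(queue)\n",
"        current in seen && continue\n",
"        push!(seen, current)\n",
"        append!(queue, get(parents, current, Symbol[]))\n",
"    end\n",
"    return seen\n",
"end\n",
"\n",
"# A concept matches itself or any of its ancestors.\n",
"is_a(parents, child, parent) = child == parent || parent in ancestors(parents, child)\n",
"\n",
"println((is_a(parents, :heart_failure, :heart_failure),\n",
"         is_a(parents, :heart_failure, :clinical_finding),\n",
"         is_a(parents, :clinical_finding, :heart_failure)))\n",
"```\n",
"\n",
"The explicit ``child == parent`` test plays the role of the ``Merge(It, ...)`` step: without it, a concept would not match itself."
]
},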
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which of the patient's diagnosis are a form of ``59621000``|``EssentialHypertension``?__"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1-element composite vector of {Concept, String}:\n",
" (Concept(1201005),\"Benign essential hypertension\")"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> Observation\n",
" >> ThenFilter(Observable >> IsA(EssentialHypertension))\n",
" >> ThenSelect(\n",
" Observable,\n",
" Observable >> ConceptDescription)\n",
") "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Across all patients, which have had ``EssentialHypertension`` finding, what kind was it?__\n",
"\n",
"To make this a bit more readable, let's make another combinator, ``Having`` that tests if the patient has any entries that are categorized as a descendent of the concept provided. While each one of these combinators may be individually easy to debug and use, their combined operation can be quite sophisticated."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Having (generic function with 1 method)"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Having(aConcept) = AnyOf(Observation >> Observable >> IsA(aConcept))"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6-element composite vector of {String, String*}:\n",
" (\"Z14_N_GP_Adult\",String[\"Essential hypertension\"]) \n",
" (\"Z1_Pregnancy_Adult\",String[\"Benign essential hypertension\"])\n",
" (\"Z5_Heart_Adult\",String[\"Malignant essential hypertension\"]) \n",
" (\"Z6_N_Heart_Adult\",String[\"Essential hypertension\"]) \n",
" (\"Z7_Heart_Adult\",String[\"Essential hypertension\"]) \n",
" (\"Z9_GP_Geriatric\",String[\"Essential hypertension\"]) "
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> ThenFilter(Having(EssentialHypertension))\n",
" >> ThenSelect(\n",
" Handle,\n",
" Observation\n",
" >> ThenFilter(Observable >> IsA(EssentialHypertension))\n",
" >> Observable\n",
" >> ConceptDescription)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As common analysis patters are discovered, they could be documented and provided combinators that: (a) simplify the task at hand, (b) can be understood individually, and (c) don't limit other avenues of analysis. For example, if returning the patient handle and concepts matching a code are useful, it could be codified."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"FindingOf (generic function with 1 method)"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ThenHaving(aConcept) = \n",
" ThenFilter(Having(aConcept))\n",
"\n",
"FindingOf(aConcept) = (\n",
" Observable\n",
" >> ThenFilter(IsA(aConcept))\n",
" >> ConceptDescription)\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"8-element composite vector of {String, String*}:\n",
" (\"A_Heart_Adult\",String[\"Acute left-sided heart failure\"]) \n",
" (\"B_Heart_Adult\",String[\"Acute left-sided heart failure\"]) \n",
" (\"C_N_GP_Adult\",String[\"Chronic congestive heart failure\"]) \n",
" (\"Z1_Pregnancy_Adult\",String[\"Heart failure\"]) \n",
" (\"Z2_N_Heart_Adult\",String[\"Acute left-sided heart failure\"])\n",
" (\"Z4_Heart_Adult\",String[\"Acute left-sided heart failure\"]) \n",
" (\"Z6_N_Heart_Adult\",String[\"Congestive heart failure\"]) \n",
" (\"Z7_Heart_Adult\",String[\"Congestive heart failure\"]) "
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> ThenHaving(HeartFailure)\n",
" >> ThenSelect(\n",
" Handle,\n",
" Observation >> FindingOf(HeartFailure))\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this way, it becomes sort of questionable where the data model ends and the query model begins. The boundary is dynamic as new queries can be defined that create derivative data models more suitable towards use in a specific field of interest."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What kinds of cardiovascular observations are found in our sample data?__\n",
"\n",
"Since our data is coded in SNOMED-CT, this query can be answered in two parts. First, we need to import the conceptual relationship, ``363698007``|``FINDING SITE``, then, name a concept in the structural hierarchy we are interested in. "
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RBT.Combinator(RBT.#201)"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"FindingSite = ConceptRelationship(363698007)\n",
"CardiovascularStructure = ToConcept(\"SNOMED-CT\", 113257007)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"7-element composite vector of {Concept, String}:\n",
" (Concept(51840005),\"Systemic circulatory system\") \n",
" (Concept(87878005),\"Left ventricle\") \n",
" (Concept(80891009),\"Heart\") \n",
" (Concept(49848007),\"Myocardium of left ventricle\")\n",
" (Concept(41801008),\"Coronary artery\") \n",
" (Concept(21814001),\"Ventricle\") \n",
" (Concept(6975006),\"Anterior myocardium\") "
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> Observation\n",
" >> Observable \n",
" >> FindingSite\n",
" >> ThenUnique \n",
" >> ThenFilter(IsA(CardiovascularStructure))\n",
" >> ThenSelect(It, It >> ConceptDescription)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What cardiovascular problems does our focus patient have?__\n",
"\n",
"So we can test it independently and re-use it, let's define a ``CardioProblem`` as observations with a Cardiovascular finding site. Then, the query is straight-forward."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3-element composite vector of {Date, String}:\n",
" (2014-03-01,\"Heart failure\") \n",
" (2014-03-01,\"Moderate left ventricular systolic dysfunction\")\n",
" (2015-04-03,\"Benign essential hypertension\") "
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"CardioProblem = (\n",
" Observation\n",
" >> ThenFilter(\n",
" AnyOf(Observable \n",
" >> FindingSite \n",
" >> IsA(CardiovascularStructure))))\n",
"\n",
"query_ihm(\n",
" FocusPatient\n",
" >> CardioProblem\n",
" >> ThenSelect(\n",
" Initiation >> AsDate,\n",
" Observable >> ConceptDescription)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this query language, the boolean expression ``AnyOf`` takes a plural set of boolean values and returns ``True`` if any of them are true. It is defined in terms of existence, in particular, \n",
"``AnyOf(X) == Exists(X >> ThenFilter(It .== True))``"
]
},
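{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same identity holds for ordinary boolean vectors in Julia, which may help build intuition. This is a plain-Julia analogy with hypothetical variable names, not the combinator definition itself.\n",
"\n",
"```julia\n",
"flags = [false, true, false]\n",
"\n",
"any_direct    = any(flags)\n",
"# Keep only the true elements, then ask whether anything is left.\n",
"any_by_filter = !isempty(filter(identity, flags))\n",
"\n",
"println(any_direct == any_by_filter)\n",
"\n",
"# With no elements at all, both formulations are false.\n",
"println(any(Bool[]) == !isempty(filter(identity, Bool[])))\n",
"```\n",
"\n",
"Defining the test through existence makes the empty case unambiguous: with nothing to test, nothing exists, so the result is false."
]
},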
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Accessing Patient data from an Encounter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*We outline a work-around for the existing combinator implementation.*\n",
"\n",
"One of the things that makes this combinator language easy to work with is that it tracks the user's navigational context. In some cases we wish to do a cross product between one or more tables. The current combinator language doesn't have a generic way to do to this, but we have several place holders till it's turned into generic functionality."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What encounters do we have for the focus patient?__"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2-element composite vector of {String, Date, Date}:\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2016-02-11)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2016-03-08)"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> ThenMixEncounter\n",
" >> ThenSelect(\n",
" Patient >> Handle,\n",
" Patient >> BirthDate >> AsDate,\n",
" Encounter >> Initiation >> AsDate)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Similarly, we can list the Observations for our Focus Patient."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"10-element composite vector of {String, Date, Date}:\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2014-03-01)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2014-03-01)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2015-04-03)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2015-04-03)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2015-05-23)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2016-02-01)\n",
" (\"Z1_Pregnancy_Adult\",1992-10-01,2016-02-11)"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> ThenMixObservation\n",
" >> ThenSelect(\n",
" Patient >> Handle,\n",
" Patient >> BirthDate >> AsDate,\n",
" Observation >> Initiation >> AsDate)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the following section, we'll want a 3-way cross product of the current ``Patient`` with all ``Encounter`` and ``Observation`` for that patient. The ``ThenMixObservationByEncounter`` combinator will do the trick in this case. The results below show all combinations of encounter and observation for our focus patient. "
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"20-element composite vector of {String, Date, Date}:\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,2014-03-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,2014-03-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,2015-04-03)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,2015-04-03)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,2015-05-23)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,2016-02-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-02-11,2016-02-11)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,1992-10-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,2014-03-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,2014-03-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,2015-04-03)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,2015-04-03)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,2015-05-23)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,2016-02-01)\n",
" (\"Z1_Pregnancy_Adult\",2016-03-08,2016-02-11)"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> ThenMixObservationByEncounter\n",
" >> ThenSelect(\n",
" Patient >> Handle,\n",
" Encounter >> PeriodStart >> AsDate,\n",
" Observation >> Initiation >> AsDate)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Temporal Relationships between Medical Record Entries"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each of the IHM entities has a ``timing`` interval which marks an ``initiation`` and ``expiration`` time; if the ``expiration`` time is missing, it indicates an open interval. Medical record entries can be correlated using these and other intervals based upon a few operators:\n",
"\n",
"* ``A >> During(B)`` means that ``A`` occurred fully within the time period ``B``.\n",
"* ``A >> Overlaps(B)`` means some time point within ``A`` occurs within ``B``.\n",
"* ``A >> StartsBeforeEndOf(B)`` means ``A`` begins before ``B`` ends, so ``A`` was possibly known during ``B``.\n",
"* ``A >> EndsBeforeEndOf(B)`` means ``A`` expires before ``B`` does.\n",
"\n",
"The queries below also use a two-argument form, such as ``During(Encounter, Observation)``, which keeps the observations that fall within the encounter. It is possible to add many more of these operators, inspired by the Quality Data Model (\"QDM\")."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Interval operations are loaded!\""
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"include(\"interval.jl\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which patients were seen in the 4th quarter of 2016?__"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4-element Array{String,1}:\n",
" \"A_Heart_Adult\" \n",
" \"B_Heart_Adult\" \n",
" \"C_N_GP_Adult\" \n",
" \"Z4_Heart_Adult\""
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Q4_2016 = ToInterval(\"2016-10-01\", \"2016-12-31T23:59\")\n",
"\n",
"query_ihm(\n",
" Patient\n",
" >> ThenFilter(\n",
" AnyOf(\n",
" Encounter\n",
" >> Timing\n",
" >> During(Q4_2016)))\n",
" >> Handle)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this query we use ``AnyOf`` because the tested condition is plural: a patient may have many encounters, and the filter should pass if at least one of them satisfies the condition. If ``AnyOf`` is omitted, you'll get an error. Making this step explicit is important since there is no clear treatment of negations in a plural context."
]
},
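{
"cell_type": "markdown",
"metadata": {},
"source": [
"The role of ``AnyOf`` can be illustrated outside the query language. The following is a minimal sketch in plain Julia (1.x assumed, not the 0.5 kernel of this notebook); the patient handles, the dates, and the helper ``in_q4`` are hypothetical and not part of the IHM code. It shows why the existential test has to be explicit: the two possible negations of a plural condition select different patients.\n",
"\n",
"```julia\n",
"using Dates\n",
"\n",
"# Hypothetical data: each patient handle maps to its encounter dates.\n",
"encounters = Dict(\n",
"    \"P1\" => [Date(2016, 3, 8), Date(2016, 12, 20)],\n",
"    \"P2\" => [Date(2016, 2, 11)],\n",
"    \"P3\" => Date[],\n",
")\n",
"\n",
"# Hypothetical helper: is the date in the 4th quarter of 2016?\n",
"in_q4(d) = Date(2016, 10, 1) <= d <= Date(2016, 12, 31)\n",
"\n",
"# \"At least one encounter in Q4\" plays the role of AnyOf.\n",
"seen_in_q4 = sort([h for (h, ds) in encounters if any(in_q4, ds)])\n",
"\n",
"# Two different readings of \"not seen in Q4\" for a plural value:\n",
"none_in_q4 = sort([h for (h, ds) in encounters if !any(in_q4, ds)])\n",
"some_outside_q4 = sort([h for (h, ds) in encounters if any(d -> !in_q4(d), ds)])\n",
"```\n",
"\n",
"Here ``seen_in_q4`` is ``[\"P1\"]``, ``none_in_q4`` is ``[\"P2\", \"P3\"]``, and ``some_outside_q4`` is ``[\"P1\", \"P2\"]``, so the placement of the negation changes the answer."
]
},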
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which patients were seen in Q4 of 2016, and when were they seen?__"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4-element composite vector of {String, String, Date}:\n",
" (\"A_Heart_Adult\",\"Follow-up encounter\",2016-12-20) \n",
" (\"B_Heart_Adult\",\"Follow-up encounter\",2016-12-20) \n",
" (\"C_N_GP_Adult\",\"Follow-up encounter\",2016-10-01) \n",
" (\"Z4_Heart_Adult\",\"Follow-up encounter\",2016-12-20)"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient \n",
" >> ThenMixEncounter\n",
" >> ThenFilter(\n",
" Encounter >> During(Q4_2016))\n",
" >> ThenSelect(\n",
" Patient >> Handle,\n",
" Encounter >> Category >> ConceptDescription,\n",
" Encounter >> Initiation >> AsDate)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What happened during our focus patient's encounters?__\n",
"\n",
"This is a non-trivial question. There are two things that the system may know about: (a) which entries started, and were perhaps made, during this encounter, and (b) which previous entries were in the system, relevant, and perhaps even a topic of discussion between patient and physician. Hence, the primary relationship between an ``Encounter`` and another entry, such as an ``Observation``, must be temporally analyzed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__When were our focus patient's encounters, again?__"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2-element composite vector of {Interval}:\n",
" (Interval(2016-02-11T15:00:00,2016-02-11T16:00:00),)\n",
" (Interval(2016-03-08T15:00:00,2016-03-08T16:00:00),)"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> ThenMixEncounter\n",
" >> ThenSelect(Encounter >> Period)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__When were our focus patient's observations?__"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"10-element composite vector of {Interval}:\n",
" (Interval(1992-10-01T15:00:00,-),) \n",
" (Interval(1992-10-01T15:00:00,-),) \n",
" (Interval(1992-10-01T15:00:00,-),) \n",
" (Interval(2014-03-01T08:00:00,-),) \n",
" (Interval(2014-03-01T08:00:00,-),) \n",
" (Interval(2015-04-03T08:00:00,-),) \n",
" (Interval(2015-04-03T08:00:00,2015-07-01T08:15:00),)\n",
" (Interval(2015-05-23T08:00:00,2016-03-27T08:15:00),)\n",
" (Interval(2016-02-01T16:00:00,2016-02-01T16:00:00),)\n",
" (Interval(2016-02-11T15:00:00,2016-02-11T15:00:00),)"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> ThenMixObservation\n",
" >> ThenSelect(Observation >> Period)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What observations were made *During* her encounters?__"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1-element composite vector of {Interval, Interval}:\n",
" (Interval(2016-02-11T15:00:00,2016-02-11T16:00:00),Interval(2016-02-11T15:00:00,2016-02-11T15:00:00))"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient\n",
" >> ThenMixObservationByEncounter\n",
" >> ThenFilter(During(Encounter, Observation))\n",
" >> ThenSelect(\n",
" Encounter >> Period,\n",
" Observation >> Timing)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What observations are During the patient encounter on 11th of Feb?__\n",
"\n",
"So that we can compare and contrast temporal comparisons, let's make a combinator, ``ThenObservation``, that takes a temporal comparison operator and an encounter date and returns the relevant observations."
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1-element composite vector of {Date, String}:\n",
" (2016-02-11,\"Systolic blood pressure\")"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ThenObservation(TemporalOp, SomeDate) = (\n",
" ThenMixObservationByEncounter\n",
" >> ThenFilter(TemporalOp(Encounter, Observation))\n",
" >> ThenFilter(\n",
" (Encounter >> PeriodStart)\n",
" >> AsDate .== ToDate(SomeDate))\n",
" >> Observation)\n",
"\n",
"query_ihm(\n",
" FocusPatient \n",
" >> ThenObservation(During, \"2016-02-11\")\n",
" >> ThenSelect(\n",
" Initiation >> AsDate,\n",
" Observable >> ConceptDescription)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What entries *Overlap* the focus encounter?__\n",
"\n",
"Where ``During`` is strict, ``Overlaps`` is more permissive. It also includes those entries in the medical record which lack a ``Stop`` time or whose ``Stop`` time occurs after the ``Start`` of the given encounter."
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"8-element composite vector of {Date, String}:\n",
" (1992-10-01,\"Date of birth\") \n",
" (1992-10-01,\"Female\") \n",
" (1992-10-01,\"Asian or Pacific islander\") \n",
" (2014-03-01,\"Heart failure\") \n",
" (2014-03-01,\"Moderate left ventricular systolic dysfunction\")\n",
" (2015-04-03,\"Benign essential hypertension\") \n",
" (2015-05-23,\"High risk pregnancy\") \n",
" (2016-02-11,\"Systolic blood pressure\") "
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient \n",
" >> ThenObservation(Overlaps, \"2016-02-11\")\n",
" >> ThenSelect(\n",
" Initiation >> AsDate,\n",
" Observable >> ConceptDescription)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__What entries *Start before end of* our focus encounter?__\n",
"\n",
"This is a common temporal relationship. It perhaps includes everything that was known by the physician at the time of the encounter, including past diagnoses that were resolved."
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"10-element composite vector of {Date, String}:\n",
" (1992-10-01,\"Date of birth\") \n",
" (1992-10-01,\"Female\") \n",
" (1992-10-01,\"Asian or Pacific islander\") \n",
" (2014-03-01,\"Heart failure\") \n",
" (2014-03-01,\"Moderate left ventricular systolic dysfunction\")\n",
" (2015-04-03,\"Benign essential hypertension\") \n",
" (2015-04-03,\"Type B viral hepatitis\") \n",
" (2015-05-23,\"High risk pregnancy\") \n",
" (2016-02-01,\"Vaginal delivery\") \n",
" (2016-02-11,\"Systolic blood pressure\") "
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" FocusPatient \n",
" >> ThenObservation(StartsBeforeEndOf, \"2016-02-11\")\n",
" >> ThenSelect(\n",
" Initiation >> AsDate,\n",
" Observable >> ConceptDescription)\n",
")"
]
},
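{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three comparisons above can be reproduced in plain Julia on the intervals printed earlier for the focus patient. This is only a sketch, assuming Julia 1.x rather than this notebook's 0.5 kernel. The ``Span`` type, ``upper``, and the lowercase operator functions are hypothetical stand-ins, and the closed-endpoint semantics are an assumption; the actual definitions in ``interval.jl`` may differ.\n",
"\n",
"```julia\n",
"using Dates\n",
"\n",
"# Hypothetical stand-in for the notebook's Interval type;\n",
"# `stop === nothing` marks an open-ended interval.\n",
"struct Span\n",
"    start::DateTime\n",
"    stop::Union{DateTime,Nothing}\n",
"end\n",
"Span(a::AbstractString) = Span(DateTime(a), nothing)\n",
"Span(a::AbstractString, b::AbstractString) = Span(DateTime(a), DateTime(b))\n",
"\n",
"# Treat a missing end as later than any other date.\n",
"upper(s::Span) = s.stop === nothing ? typemax(DateTime) : s.stop\n",
"\n",
"# Assumed semantics with closed endpoints.\n",
"during(a, b) = b.start <= a.start && upper(a) <= upper(b)\n",
"overlaps(a, b) = a.start <= upper(b) && b.start <= upper(a)\n",
"starts_before_end_of(a, b) = a.start < upper(b)\n",
"\n",
"# Intervals copied from the outputs shown earlier in this notebook.\n",
"encounter = Span(\"2016-02-11T15:00:00\", \"2016-02-11T16:00:00\")\n",
"observations = [\n",
"    Span(\"1992-10-01T15:00:00\"), Span(\"1992-10-01T15:00:00\"),\n",
"    Span(\"1992-10-01T15:00:00\"), Span(\"2014-03-01T08:00:00\"),\n",
"    Span(\"2014-03-01T08:00:00\"), Span(\"2015-04-03T08:00:00\"),\n",
"    Span(\"2015-04-03T08:00:00\", \"2015-07-01T08:15:00\"),\n",
"    Span(\"2015-05-23T08:00:00\", \"2016-03-27T08:15:00\"),\n",
"    Span(\"2016-02-01T16:00:00\", \"2016-02-01T16:00:00\"),\n",
"    Span(\"2016-02-11T15:00:00\", \"2016-02-11T15:00:00\"),\n",
"]\n",
"\n",
"n_during = count(o -> during(o, encounter), observations)\n",
"n_overlaps = count(o -> overlaps(o, encounter), observations)\n",
"n_known = count(o -> starts_before_end_of(o, encounter), observations)\n",
"```\n",
"\n",
"Under these assumptions the counts are 1, 8, and 10, which agrees with the sizes of the ``During``, ``Overlaps``, and ``StartsBeforeEndOf`` results above."
]
},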
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which patient encounters were relevant to hypertension?__\n",
"\n",
"Let's put the previous two sections together. We're looking for patient encounters that occurred while there was an active diagnosis of hypertension. Rather than working this out as a single query, let's encapsulate the idea of returning encounters correlated temporally with other entries that descend from the named concept.\n"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MixHaving (generic function with 1 method)"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"MixHaving(TemporalOp, aConcept) = (\n",
" ThenMixObservationByEncounter\n",
" >> ThenFilter(TemporalOp(Encounter, Observation))\n",
" >> ThenFilter(\n",
" Exists(Observation >> FindingOf(aConcept)))\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which visits occurred in the 4th quarter of 2016 for patients having an essential hypertension diagnosis?__"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0-element composite vector of {String, Date}"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> MixHaving(StartsBeforeEndOf, EssentialHypertension)\n",
" >> ThenFilter(During(Q4_2016, Encounter))\n",
" >> ThenSelect(\n",
" Patient >> Handle,\n",
" Encounter >> PeriodStart >> AsDate)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__When did we have heart-patient visits in Q4 of 2016?__"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4-element composite vector of {String, Date}:\n",
" (\"A_Heart_Adult\",2016-12-20) \n",
" (\"B_Heart_Adult\",2016-12-20) \n",
" (\"C_N_GP_Adult\",2016-10-01) \n",
" (\"Z4_Heart_Adult\",2016-12-20)"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> ThenHaving(HeartFailure)\n",
" >> MixHaving(Overlaps, HeartFailure)\n",
" >> ThenFilter(During(Q4_2016, Encounter))\n",
" >> ThenSelect(\n",
" Patient >> Handle,\n",
" Encounter >> PeriodStart >> AsDate)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### ValueSets"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ValueSet (generic function with 1 method)"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"include(\"load-valueset.jl\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**For well-known hypertensive NLM value sets, what concepts are included?**"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(Concept[Concept(CPT:99201),Concept(CPT:99202),Concept(CPT:99203),Concept(CPT:99204),Concept(CPT:99205),Concept(CPT:99212),Concept(CPT:99213),Concept(CPT:99214),Concept(CPT:99215),Concept(185347001),Concept(108219001),Concept(390906007),Concept(390906007)],Concept[Concept(10725009),Concept(1201005),Concept(276789009),Concept(371125006),Concept(ICD-9-CM:401.0),Concept(ICD-9-CM:401.1),Concept(ICD-9-CM:401.9),Concept(429457004),Concept(46481004),Concept(48146000),Concept(56218007),Concept(59621000),Concept(59720008),Concept(65518004),Concept(78975002),Concept(ICD-10-CM:I10)],Concept[Concept(LOINC:8480-6),Concept(271649006)],Concept[Concept(LOINC:8462-4),Concept(271650006)])"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"SystolicBloodPressure = \n",
" ValueSet(\"2.16.840.1.113883.3.526.3.1032\")\n",
"DiastolicBloodPressure = \n",
" ValueSet(\"2.16.840.1.113883.3.526.3.1033\")\n",
"EssentialHypertension = \n",
" ValueSet(\"2.16.840.1.113883.3.464.1003.104.12.1011\")\n",
"OfficeVisit = \n",
" ValueSet(\"2.16.840.1.113883.3.464.1003.101.12.1001\")\n",
"query_ihm(\n",
" ThenSelect(\n",
" OfficeVisit,\n",
" EssentialHypertension,\n",
" SystolicBloodPressure,\n",
" DiastolicBloodPressure\n",
" )\n",
")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**What are the maximum diastolic and systolic blood pressure readings for each patient?**"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"12-element composite vector of {String, Int64?, Int64?}:\n",
" (\"2_N_GP_Adult\",#NULL,150) \n",
" (\"A_Heart_Adult\",90,150) \n",
" (\"B_Heart_Adult\",60,150) \n",
" (\"C_N_GP_Adult\",#NULL,155) \n",
" (\"Z14_N_GP_Adult\",102,194) \n",
" (\"Z1_Pregnancy_Adult\",#NULL,144)\n",
" (\"Z2_N_Heart_Adult\",90,#NULL) \n",
" (\"Z4_Heart_Adult\",60,150) \n",
" (\"Z5_Heart_Adult\",#NULL,146) \n",
" (\"Z6_N_Heart_Adult\",#NULL,158) \n",
" (\"Z7_Heart_Adult\",#NULL,150) \n",
" (\"Z9_GP_Geriatric\",#NULL,#NULL) "
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"DiastolicPressure = (\n",
" Observation \n",
" >> ThenFilter(\n",
" AnyOf(Category .== DiastolicBloodPressure))\n",
" >> Value >> Scalar \n",
")\n",
"SystolicPressure = (\n",
" Observation \n",
" >> ThenFilter(\n",
" AnyOf(Category .== SystolicBloodPressure))\n",
" >> Value >> Scalar \n",
")\n",
"\n",
"query_ihm(\n",
" Patient\n",
" >> ThenSelect(\n",
" Handle,\n",
" DiastolicPressure >> ThenMax,\n",
" SystolicPressure >> ThenMax \n",
" )\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which patients with an essential hypertension diagnosis had a systolic blood pressure reading under 140 during an office visit in the 4th quarter of 2016?__"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2-element Array{String,1}:\n",
" \"B_Heart_Adult\" \n",
" \"Z4_Heart_Adult\""
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Patient\n",
" >> ThenFilter(\n",
" AnyOf(Observation >> Category \n",
" .== EssentialHypertension))\n",
" >> ThenMixObservationByEncounter\n",
" >> ThenFilter(\n",
" During(Q4_2016, Encounter) &\n",
" During(Encounter, Observation) &\n",
" AnyOf(Encounter >> Category \n",
" .== OfficeVisit) &\n",
" AnyOf(SystolicPressure .< 140))\n",
" >> Patient >>Handle >> ThenUnique\n",
")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## IHM Event Hierarchies and Data Set Statistics"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's do some analysis of the various concepts used in this dataset. One curiosity is which hierarchies are used by the various kinds of assertions. First we define a combinator, ``Hierarchy``, that returns the hierarchy for any concept. Then we create another combinator, ``Hierarchies``, that takes an event kind (Encounter, Observation, Action, Request) and returns the distinct hierarchies for the codes it uses. This query is possible since each of these events has a primary concept categorization, which we've modeled with a base class having a category attribute."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__How many entries are there for each QDM DataType?__\n",
" \n",
"This query involves constructing an arbitrary grouping. ``ThenGroup(DataType)`` dynamically produces a new entity that associates each distinct ``DataType`` with its corresponding ``Entry`` records. These entry records could then be counted in this new context."
]
},
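{
"cell_type": "markdown",
"metadata": {},
"source": [
"The grouping described above can be mimicked in plain Julia. This is only a sketch, assuming Julia 1.x rather than this notebook's 0.5 kernel; the entries and data type names are made up for illustration, and ``ThenGroup`` itself is not used.\n",
"\n",
"```julia\n",
"# Hypothetical (data type, description) pairs standing in for Entry records.\n",
"entries = [\n",
"    (\"Diagnosis\", \"Heart failure\"),\n",
"    (\"Encounter\", \"Follow-up encounter\"),\n",
"    (\"Diagnosis\", \"Benign essential hypertension\"),\n",
"]\n",
"\n",
"# Associate each distinct data type with its entries.\n",
"groups = Dict{String,Vector{String}}()\n",
"for (dtype, description) in entries\n",
"    push!(get!(groups, dtype, String[]), description)\n",
"end\n",
"\n",
"# Count the entries within each group.\n",
"counts = Dict(dtype => length(members) for (dtype, members) in groups)\n",
"```\n",
"\n",
"Each key of ``groups`` acts like the new entity produced by grouping, and ``counts`` is the per-group aggregate computed in that context."
]
},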
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hierarchies (generic function with 1 method)"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"RootConcept = ToConcept(\"SNOMED-CT\",138875005)\n",
"Hierarchy = (\n",
" Connect(SuperType)\n",
" >> ThenFilter(AnyOf(SuperType .== RootConcept))\n",
")\n",
"Hierarchies(EventKind) = (\n",
" Patient\n",
" >> EventKind\n",
" >> Category\n",
" >> Hierarchy\n",
" >> ThenUnique\n",
" >> ConceptDescription\n",
")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which hierarchies are used by observations?__"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4-element Array{String,1}:\n",
" \"Observable entity\" \n",
" \"Clinical finding\" \n",
" \"Social context\" \n",
" \"Pharmaceutical / biologic product\""
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Hierarchies(Observation)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which hierarchies are used by actions?__"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2-element Array{String,1}:\n",
" \"Pharmaceutical / biologic product\"\n",
" \"Procedure\" "
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Hierarchies(Action)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which hierarchies are used by requests?__"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2-element Array{String,1}:\n",
" \"Procedure\" \n",
" \"Pharmaceutical / biologic product\""
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_ihm(\n",
" Hierarchies(Request)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Which hierarchies are used by encounters?__"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query_ihm(\n",
" Hierarchies(Encounter)\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**What is the ancestry for a particular vaccine?**"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6-element composite vector of {String, String}:\n",
" (\"46233009\",\"Influenza virus vaccine\") \n",
" (\"71181003\",\"Vaccine\") \n",
" (\"350326008\",\"Vaccine, immunoglobulins and antisera\")\n",
" (\"69509008\",\"Biological agent\") \n",
" (\"373873005\",\"Pharmaceutical / biologic product\") \n",
" (\"138875005\",\"SNOMED-CT(138875005)\") "
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"InfluenzaSplitVirionVaccine = ToConcept(\"SNOMED-CT\", 346524008)\n",
"\n",
"query_ihm(\n",
" InfluenzaSplitVirionVaccine\n",
" >> Connect(SuperType)\n",
" >> ThenSelect(\n",
" CodeValue,\n",
" ConceptDescription)\n",
")"
]
},
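{
"cell_type": "markdown",
"metadata": {},
"source": [
"What ``Connect(SuperType)`` and ``Hierarchy`` compute can be sketched in plain Julia (1.x assumed, not this notebook's 0.5 kernel). The parent table below is read off the ancestry output above under the assumption that each concept has a single parent, listed in order; real SNOMED CT concepts may have several. The names ``supertype``, ``ancestors``, and ``hierarchy`` are hypothetical helpers, not part of the IHM code.\n",
"\n",
"```julia\n",
"# Parent links taken from the output above (single-parent chain assumed).\n",
"supertype = Dict(\n",
"    346524008 => 46233009,   # Influenza virus vaccine\n",
"    46233009 => 71181003,    # Vaccine\n",
"    71181003 => 350326008,   # Vaccine, immunoglobulins and antisera\n",
"    350326008 => 69509008,   # Biological agent\n",
"    69509008 => 373873005,   # Pharmaceutical / biologic product\n",
"    373873005 => 138875005,  # SNOMED CT root concept\n",
")\n",
"const ROOT = 138875005\n",
"\n",
"# Follow parent links until no parent is recorded.\n",
"function ancestors(code, parents)\n",
"    chain = Int[]\n",
"    while haskey(parents, code)\n",
"        code = parents[code]\n",
"        push!(chain, code)\n",
"    end\n",
"    return chain\n",
"end\n",
"\n",
"# The hierarchy is the concept whose direct parent is the root.\n",
"function hierarchy(code, parents)\n",
"    for c in vcat(code, ancestors(code, parents))\n",
"        get(parents, c, nothing) == ROOT && return c\n",
"    end\n",
"    return nothing\n",
"end\n",
"\n",
"chain = ancestors(346524008, supertype)\n",
"top = hierarchy(346524008, supertype)\n",
"```\n",
"\n",
"``chain`` holds the six ancestor codes in the order shown above, and ``top`` is ``373873005``, the \"Pharmaceutical / biologic product\" hierarchy."
]
},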
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Combinators and Data Conversion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How does ``Count(Patient)`` work? The *Query Combinator* query\n",
"language sees computation as an algebra of queries; that is, \n",
"elements of the algebra are query functions, and combinators \n",
"take one or more query functions and produce a new query \n",
"function. \n",
"\n",
"* The ``Patient`` combinator takes the top level database\n",
" instance and returns a list of patient records. \n",
"\n",
"* The ``Count`` combinator works one level removed: it does \n",
" not count anything itself but builds, from its argument, \n",
" a query that does the counting. \n",
" In particular, the ``Count: (X->Y*) => (X->Int)`` \n",
" combinator takes any query ``(X->Y*)`` (from ``X`` \n",
" to a list of ``Y``) and constructs a query ``(X->Int)`` \n",
" that maps each ``X`` in its input to the *count* of \n",
" the corresponding ``Y`` entities. \n",
"\n",
"This Query Combinator computation model encapsulates complex \n",
"context management so that a domain expert, such as a medical \n",
"researcher or data analyst, can focus on their problem rather \n",
"than being caught up in logistics. The semantics of Query \n",
"Combinators are described in a paper (https://arxiv.org/abs/1702.08409)."
]
}
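,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The signature ``Count: (X->Y*) => (X->Int)`` can be made concrete with ordinary higher-order functions. The following is a minimal sketch in plain Julia (1.x assumed, not this notebook's 0.5 kernel); the toy database and the names ``patient``, ``encounter``, ``count_of``, and ``then_each`` are hypothetical and do not reflect how the prototype library is implemented.\n",
"\n",
"```julia\n",
"# Hypothetical toy database with two patients.\n",
"db = (patients = [\n",
"    (handle = \"P1\", encounters = [1, 2]),\n",
"    (handle = \"P2\", encounters = Int[]),\n",
"],)\n",
"\n",
"# Primitive queries are plain functions from an input to a list of outputs.\n",
"patient(d) = d.patients\n",
"encounter(p) = p.encounters\n",
"\n",
"# count_of: (X -> Y*) => (X -> Int). It returns a new query and counts nothing itself.\n",
"count_of(q) = x -> length(q(x))\n",
"\n",
"# then_each applies a query f to every output of a plural query q.\n",
"then_each(q, f) = x -> map(f, q(x))\n",
"\n",
"n_patients = count_of(patient)(db)\n",
"encounters_per_patient = then_each(patient, count_of(encounter))(db)\n",
"```\n",
"\n",
"``count_of(patient)`` is itself a query on the database, and the same ``count_of`` reused one level down yields a count for each patient, which is the context handling the paragraph above refers to."
]
}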
],
"metadata": {
"kernelspec": {
"display_name": "Julia 0.5.2",
"language": "julia",
"name": "julia-0.5"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "0.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}