@grahamegrieve
Last active September 22, 2017 04:44
Use cases
This document describes a way of granting an application access to data on a set of patients.
The application can request a copy of all pertinent (clinical) data on the patients in a
single download. Note: We expect that this data will be quite large.
Authorizing Access
Access to the data is granted by using the SMART backend services spec
(url: http://docs.smarthealthit.org/authorization/backend-services/).
We didn’t see a need for Group/* or Launch/* kind of scopes - System/*.read will do
fine (or User/*.* for interactive processes, though interactive processes are out of
scope for this work).
Accessing Data
The application can do either of the following queries:
GET [base]/Patient/$everything?start=[date-time]&_type=[type,type]
GET [base]/Group/[id]/$everything?start=[date-time]&_type=[type,type]
Notes:
* The first query returns all data on all patients that the user account has access to, since the starting date time provided.
* The second query provides access to all data on all patients in the nominated group. How the Group resource is
created/identified/defined/managed is out of scope for now
(the question of whether we need to sort this out has been referred to ONC).
* The start date/time means only records since the nominated time. In the absence of the parameter, it means all data ever
* The _type parameter is used to specify which resource types are part of the focal query (no impact on which related
resources are included). In the absence of this parameter, all types are included. This includes at least the CCDS
* The FHIR specification will be modified to allow Patient/$everything to cross patients, and to add $everything to Group
* Group will be added as a compartment type in the base Specification
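As a rough sketch of how a client might assemble the two queries above: the helper name everything_url and the example base URL are hypothetical, not part of this proposal. Only the start and _type parameters come from the text.

```python
from urllib.parse import urlencode


def everything_url(base, group_id=None, start=None, types=None):
    """Build a $everything URL for all accessible patients, or for one Group.

    Hypothetical helper: only `start` and `_type` come from the proposal.
    """
    if group_id:
        path = f"Group/{group_id}/$everything"
    else:
        path = "Patient/$everything"

    params = {}
    if start:
        params["start"] = start
    if types:
        # _type is a comma-separated list of resource types
        params["_type"] = ",".join(types)

    url = f"{base.rstrip('/')}/{path}"
    if params:
        # Leave ':' and ',' unescaped so the query stays readable
        url += "?" + urlencode(params, safe=":,")
    return url


if __name__ == "__main__":
    print(everything_url("https://example.org/fhir",
                         start="2017-09-01T00:00:00Z",
                         types=["Patient", "Observation"]))
```

Omitting both start and types yields the bare query, which per the notes above means all data of all types.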
Generally, this is expected to result in quite a lot of data. The client is expected to request this asynchronously, per RFC 7240.
To do this, the client uses the Prefer header:
Prefer: respond-async
When the server sees this request header, instead of generating the response and then returning it, the server returns a
202 Accepted status and a Content-Location header giving the URL at which the client can access the response.
The client then queries this content location using GET content-location (no prefixing). The response can be one of 3 outcomes:
* a 202 Accepted that indicates that processing is still happening. This response has no body.
It may also have an X-Progress header that provides some indication of progress to the user
* a 5xx Error that indicates that preparing the response has failed. The body is an OperationOutcome describing the error
* a 200 OK with the response for the original request. This response can carry an X-Available-Until header to indicate when
the response will no longer be available, and one or more Link: headers that list the files that are available for download
after preparation is complete
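The three outcomes above could be handled along these lines. This is a sketch only: the function names, the injected fetch callable (which is assumed to perform the GET and return a status, a header dict, and a body), and the polling interval are all assumptions made for illustration. Header lookup here is a plain dict lookup; a real client would treat header names case-insensitively.

```python
import time


def classify_poll(status, headers):
    """Map a poll response to ('pending' | 'failed' | 'done', details)."""
    if status == 202:
        # Still processing; X-Progress is optional
        return "pending", {"progress": headers.get("X-Progress")}
    if 500 <= status <= 599:
        # Preparation failed; the body is an OperationOutcome
        return "failed", {}
    if status == 200:
        return "done", {
            "available_until": headers.get("X-Available-Until"),
            "link": headers.get("Link"),
        }
    raise ValueError(f"unexpected status {status}")


def poll_until_done(fetch, content_location, interval=5.0,
                    max_polls=100, sleep=time.sleep):
    """Poll the content location until the server stops answering 202.

    `fetch(url)` is a caller-supplied function returning
    (status, headers, body); it is hypothetical, not a library call.
    """
    for _ in range(max_polls):
        status, headers, body = fetch(content_location)
        state, details = classify_poll(status, headers)
        if state != "pending":
            return state, details, body
        sleep(interval)
    raise TimeoutError("server still processing after max_polls attempts")
```

Passing fetch and sleep in as arguments keeps the loop independent of any particular HTTP library.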
Notes:
* This asynchronous protocol will be added as a general feature to the FHIR spec for all calls. Server discretion when to support it.
* Client can cancel a task or advise the server it's ok to delete the outcome using DELETE content-location.
* Other than the 5xx response, these responses have no body, except when the accept content type is 'text/html', in
which case the responses have an HTML representation of the content in the header (e.g. a redirect, an error, or
a list of files to download) (server discretion whether to support text/html)
* Link headers can have one or more links in them, per RFC 5988
* todo: decide whether to add 'supports asynchronous' flag to the CapabilityStatement resource
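Since a single Link header can carry several links, a client needs to split it before downloading. The following is a simplified sketch, not a full RFC 5988 parser: it assumes parameter values do not contain commas, and the rel value and file URLs in the usage example are invented for illustration (this write-up does not fix a link relation).

```python
import re

# A link is '<url>' followed by optional ';'-separated parameters.
# Simplification: parameters are assumed not to contain commas.
_LINK = re.compile(r'<([^>]*)>([^,]*)')


def parse_link_header(value):
    """Return a list of (url, params) tuples from one Link header value."""
    links = []
    for match in _LINK.finditer(value):
        url, raw_params = match.group(1), match.group(2)
        params = {}
        for part in raw_params.split(";"):
            if "=" in part:
                key, _, val = part.partition("=")
                params[key.strip().lower()] = val.strip().strip('"')
        links.append((url, params))
    return links


if __name__ == "__main__":
    header = ('<https://example.org/files/Patient.ndjson>; rel="item", '
              '<https://example.org/files/Observation.ndjson>; rel="item"')
    for url, params in parse_link_header(header):
        print(url, params)
```

If the server sends several Link headers rather than one combined header, the same function can be applied to each value in turn.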
Format of returned data
If the client uses the Accept type of application/fhir+json or application/fhir+xml, the response will be a bundle in the
specified format. Alternatively, the client can use the type application/fhir+ndjson. In this case the response is a
set of files in ndjson format (see http://ndjson.org/). Each file contains only resources of a single type.
There can be more than one file for each resource type. Bundles are broken up at Bundle.entry.resource - e.g. bundle entries
have a full URL, and the resource for the entry will be found in the relevant download. (todo: how does that work for history?)
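To illustrate the ndjson layout, here is a minimal sketch of reading one downloaded file, where each non-empty line is one JSON-encoded resource. The function names are hypothetical, and indexing resources by "Type/id" so they can be matched back to entries is an assumption for illustration, not something this write-up specifies.

```python
import json


def read_ndjson(text):
    """Parse ndjson text: one JSON resource per non-empty line."""
    resources = []
    # Split on '\n' only; str.splitlines() would also split on characters
    # such as U+2028 that may legally appear inside JSON strings.
    for line in text.split("\n"):
        if line.strip():
            resources.append(json.loads(line))
    return resources


def index_by_reference(resources):
    """Index resources by 'ResourceType/id' (an assumed lookup key)."""
    return {f"{r['resourceType']}/{r['id']}": r for r in resources}


if __name__ == "__main__":
    sample = (
        '{"resourceType": "Patient", "id": "p1"}\n'
        '{"resourceType": "Patient", "id": "p2"}\n'
    )
    patients = index_by_reference(read_ndjson(sample))
    print(sorted(patients))
```

Because each file holds a single resource type and files can be large, a real client would more likely stream the file line by line than load it into one string.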
Notes:
* the response - whether a Bundle or the ? manifest will include a server time that can be used as the start time on a following query.
* clients should be prepared to receive resources that change on the boundary more than once (still todo)
* application/fhir+ndjson will be documented in the base spec
* may need to do some registration work for +ndjson
* May need to describe further formats (avro/parquet etc) later - consultation to follow
Subscriptions
Subscriptions are not supported at this time - applications can perform this query as needed
@chrisgrenz

chrisgrenz commented Sep 14, 2017

The "task" resource (no body, not FHIR) isn't the response resource. GET on the task shouldn't return the response; it should point to the response via Location or a Link (of type "enclosure" maybe?). Using Link would allow for multiple response resources and would avoid DocumentManifest (if desired).

To address sequencing of requests, I would suggest that Date on the "task" resource could suffice (although not adequate for a min/max range) in normal sequenced requests.

  • a 202 Accepted that indicates that processing is still happening. This response has no body. Should include Date with the initial request datetime and Last-Modified with the status update time. May include X-Progress to indicate progress (format?) and/or Retry-After with an estimate of completion.
    It may also have an X-Progress header that provides some indication of progress to the user

@chrisgrenz

  • a 200 OK with the response for the original request. This response can carry a X-Available-Until header to indicate when
    the response will no longer be available.
    This response may carry an OperationOutcome to describe the results of the task. SHALL contain Link header with enclosure type links to resulting resources. Expires header indicates when the task and associated resources may no longer be available. Links may be pre-signed and therefore should be treated as secret.

DELETE on a task link will, at the server's discretion:

  • Terminate a running task
  • Expire the resulting resources and free up storage for these resources.

@grahamegrieve
Author

Link Header (rfc 5988) - we could use that (I like not using Document Manifest). I'm not sure I understand the point about sequencing of requests - it didn't come up when we discussed it. What is needed to be addressed?

Expires header is about the freshness of the response, not the underlying resource on the server. I don't think that's an appropriate use (or I would have used it). The Delete on the task wording is better

@brianpos

Does the proposed ndjson format serialize the entries (containing the resources), or the resources themselves?
(I prefer the entries, as that supports deletes too)

@chrisgrenz

Sequencing as in using the date of the last request as the "since" of the next request. Although reading the Date rfc again makes me think that might be a stretch of a use as well (my use of Expires is specifically forbidden!).

Hadn't considered Brian's idea of entry serialization before. MOST uses (I think?) would find this to be an expensive change in terms of complexity to put all the resource elements an additional level below the root. Many deserializers optimize for the first level and would be forced to parse the whole FHIR resource as a single struct. I'll contact the MITRE guys and see if we can do some tests.

@grahamegrieve
Author

Looks to me like you'd want to serialize ndjson at the entry level for history and search, and at the resource level for $everything. I've done async support in my server now. Though technically I can do async for any operation, I'm only going to allow it for search, history, and operations. I'll rewrite the notes above shortly, now that I've implemented it.

@grahamegrieve
Author

Updated the write up for the async pattern and nd-json after implementing both. Request pattern/since is still a todo

@jamesagnew

What does "The FHIR specification will be modified to allow Patient/$everything to cross patients" mean? Is the idea that at the type level, $everything just applies to all patients? Or all patients that the user has access to?

@jamesagnew

Also, instead of start, would there be value in just calling the parameter lastUpdated or possibly _lastUpdated and giving it the same semantics as that search parameter (i.e. the date applies to the last updated date of the resource as I assume start also does, and also allows/requires a modifier if you want gt semantics but also allows you to specify a range)

@jamesagnew

Also if this gets into spec, imo we should add a note recommending (or even just suggesting as a MAY) that the server support the HTTP Range header: https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests

@grahamegrieve
Author

James:

  1. yes: all patients that the user has access to
  2. _lastUpdated - yes, something like that. Still to be clarified
  3. range: sure, we can suggest that
