LongboatAline/datamodel.md

## datamodel.md

      
    (input for Adrian/Tebbe)
Bolus


 bolus type: prebolus, correction, meal bolus, basal. Not really convinced: often several causes for a dose are combined in one bolus. Does UAM dosed as SMB count as correction, meal bolus, or basal? There's some of each included...
 Insulin with bolus: everything relevant is already taken care of. I just needed some help reading it.

 Allow the presence of different types of insulin administered externally. As Adrian pointed out: DIA and peak are logged already, so this is already fixed, and doesn't even require a lookup table to decode brand names to generic parameters. Great!
 Store the insulin used in the pump at each reservoir change. Can be done with a comment; not needed since DIA/peak are already taken into account.
 Keep track of each bolus for its duration according to its profile. Done. How great is that?
 Later? Allow for prolonged insulin action with a large bolus. (That's not DB, and it is possible with the information available, so scratch it.)


 cache future IOB (IobCobAS), display IOB for timespan of predictions (see tisf branch)
 save acting insulin (IA) to db - While recoverable from dosing information, it takes time to calculate.
 nice to have: allow a toggle between IOB and IA
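
To illustrate why caching IA is worthwhile, here is a minimal sketch that derives IOB and acting insulin from dosing information plus DIA and peak. The exponential curve is an assumption (a commonly used shape parameterised by DIA and peak time, valid only while peak is less than half of DIA); it should be checked against whatever curve the loop actually uses. The function names and the `{minute: (iob, ia)}` cache layout are hypothetical.

```python
import math


def insulin_curve(units, minutes_since_dose, dia_min=300.0, peak_min=75.0):
    """Return (iob, activity per minute) for one bolus.

    Assumed exponential curve; requires peak_min < dia_min / 2.
    """
    t, td, tp = minutes_since_dose, dia_min, peak_min
    if t < 0 or t >= td:
        return 0.0, 0.0  # not yet dosed, or fully absorbed
    tau = tp * (1 - tp / td) / (1 - 2 * tp / td)   # time constant of decay
    a = 2 * tau / td                               # rise time factor
    s = 1 / (1 - a + (1 + a) * math.exp(-td / tau))  # normalisation
    activity = units * (s / tau ** 2) * t * (1 - t / td) * math.exp(-t / tau)
    iob = units * (1 - s * (1 - a) * (
        (t ** 2 / (tau * td * (1 - a)) - t / tau - 1) * math.exp(-t / tau) + 1))
    return iob, activity


def cache_acting_insulin(boluses, start_min, end_min, step_min=5):
    """Precompute {minute: (iob, ia)} for a list of (dose_minute, units)."""
    cache = {}
    for now in range(start_min, end_min + 1, step_min):
        iob = ia = 0.0
        for dose_minute, units in boluses:
            i, act = insulin_curve(units, now - dose_minute)
            iob += i
            ia += act
        cache[now] = (iob, ia)
    return cache


cache = cache_acting_insulin([(0, 2.0)], 0, 300)
```

Every time step loops over all boluses, which is the cost the note above wants to avoid paying repeatedly. With both values stored per step, a display could switch between IOB and IA without recalculating.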

TDD - glad to see it incorporated, we're good there.


 Keep track of TDD, up/down regulation (SW, not DB, hence ignore)
 Keep track of some TDD-related safety parameters, even if a user decides to override them manually (minIOB, maxIOB defined in relation to TDD)

Carbs

some metadata I'd like to see with carbs

 fast/medium/slow acting (or by GI), so that the notation of fast-acting rescue carbs won't mess up the calculation for hours on end. Since these may even be entered quite accurately, they could be used for automagically re-determining carb factors.
 estimated or precise
 qualifier for "replacement" carbs entered to mimic a less direct BG impact (FPE, cortisol treatment, ...)
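
The carb metadata above could be carried on each entry roughly like this. This is only a sketch: the class, enum and field names are hypothetical, and the rule in `usable_for_carb_factor` is one possible reading of "entered quite accurately", not an agreed criterion.

```python
from dataclasses import dataclass
from enum import Enum


class Absorption(Enum):
    FAST = "fast"
    MEDIUM = "medium"
    SLOW = "slow"


class CarbPurpose(Enum):
    MEAL = "meal"
    RESCUE = "rescue"            # fast-acting carbs taken against a low
    REPLACEMENT = "replacement"  # stand-in for FPE, cortisol treatment, ...


@dataclass(frozen=True)
class CarbEntry:
    grams: float
    absorption: Absorption = Absorption.MEDIUM
    precise: bool = False        # False means "estimated"
    purpose: CarbPurpose = CarbPurpose.MEAL


def usable_for_carb_factor(entry):
    """Assumed rule: only precise, real carbs feed carb factor estimation."""
    return entry.precise and entry.purpose is not CarbPurpose.REPLACEMENT


rescue = CarbEntry(15, Absorption.FAST, precise=True, purpose=CarbPurpose.RESCUE)
fpe = CarbEntry(20, Absorption.SLOW, purpose=CarbPurpose.REPLACEMENT)
```

Keeping the "replacement" qualifier separate from the absorption speed means such entries can still shape predictions while being excluded from any statistics on real carbs.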

Metadata to support deduplication and correct time-sequencing of device data


 Timestamp

UTC (continuous time, not subject to timezone or DST changes)
device time of the device read out (if available)
device time and TZ of the device doing the readout (may differ significantly)
for data bridged in or backfilled: timestamp of processing, and the uploader (not to be confused with the data timestamp)
if the device chain needed for the readout is longer, keep track of the timestamp on all devices involved: any of them may be the source of a later error, and data may be propagated on and reach the looping algorithm via multiple paths. We advise against it and want to avoid it, but bad things happen in real life with real users.
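
A rough sketch of how the timestamps listed above could travel with one record. All class and field names are hypothetical; `clock_error` only shows that keeping device-local time next to UTC makes an offset measurable after the fact.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class DeviceClock:
    device_id: str
    local_time: datetime                   # naive: what the device displays
    utc_offset_min: Optional[int] = None   # None if the device knows no TZ


@dataclass
class RecordTime:
    utc: Optional[datetime]                # continuous reference, if known
    source: DeviceClock                    # the device that was read out
    chain: List[DeviceClock] = field(default_factory=list)  # readers, bridges
    processed_at: Optional[datetime] = None  # processing time, not data time
    uploader: Optional[str] = None


def clock_error(clock, true_utc):
    """How far a device clock is off, or None without a known UTC offset."""
    if clock.utc_offset_min is None:
        return None
    expected = (true_utc + timedelta(minutes=clock.utc_offset_min)).replace(tzinfo=None)
    return clock.local_time - expected


now = datetime(2021, 3, 1, 12, 0, tzinfo=timezone.utc)
pump = DeviceClock("pump", datetime(2021, 3, 1, 13, 4), utc_offset_min=60)
phone = DeviceClock("phone", datetime(2021, 3, 1, 13, 0), utc_offset_min=60)
record = RecordTime(utc=now, source=pump, chain=[phone], uploader="phone-app")
```

Here `clock_error(pump, now)` comes out as four minutes, which is the kind of offset that stays invisible once only a single timestamp per record is stored.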


 each timestamp tuple should state whether it is synchronized or subject to drift. (Is it sufficient to not set UTC if there's no time reference? One could use GMT+0 to mark the difference; that would also apply to backfilled data.)
 Add a qualifier stating whether CGM data is raw, pre-processed/filtered (manufacturer/smoothing/bias correction/...), or bridged. This makes it possible to avoid applying filtering twice.
 In order to be able to revert to fallback strategies in case of faulty sensor data, I would suggest labelling sensor data with a qualifier for deducible errors, such as
signal missing, stuck signal, signal bias, long-time error (hard failure). See: Hybrid Online Multi-Sensor Error Detection and Functional Redundancy for Artificial Pancreas Control Systems
 idea: If data doesn't come with a unique ID, add a hash over the sequence of previous data from the same source, to protect against repeated readouts of the same data clogging the DB.
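
The hashing idea could look roughly like this. It is a sketch under assumptions: readings are `(device_time, value)` pairs, the ID covers the reading plus up to three predecessors, and the "DB" is a plain dict. `chained_ids` and `insert_new` are hypothetical names.

```python
import hashlib


def chained_ids(source_id, readings, depth=3):
    """One ID per reading, hashed over the reading and its predecessors."""
    ids = []
    for i in range(len(readings)):
        window = readings[max(0, i - depth): i + 1]
        payload = source_id + "|" + "|".join(f"{t}:{v}" for t, v in window)
        ids.append(hashlib.sha256(payload.encode("utf-8")).hexdigest())
    return ids


def insert_new(db, source_id, readings):
    """Store only readings whose chained ID is not in db yet; return count."""
    added = 0
    for reading, rid in zip(readings, chained_ids(source_id, readings)):
        if rid not in db:
            db[rid] = reading
            added += 1
    return added


db = {}
readout = [("08:00", 100), ("08:05", 104), ("08:10", 110)]
first = insert_new(db, "cgm-1", readout)
repeat = insert_new(db, "cgm-1", readout)
extended = insert_new(db, "cgm-1", readout + [("08:15", 115)])
```

One limitation of this particular sketch: a readout that starts at a different point in the history produces different IDs for its first `depth` readings, because their predecessors are missing. Overlapping readouts would need to be aligned before hashing.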

InterfaceIDs

see above: sometimes data gets bridged in, or backfilled after a readout by a different source. I'd want to mark this, to differentiate analytical data from data that was accessible to the looping algorithm, and to de-duplicate and sanitize data. Use cases: backfilling/bridging from manufacturer SW readouts, backfilling with data that got falsified by time adjustments (battery fault...) on one of the devices involved in the uploader chain, etc. Example: months of erroneous Tidepool data when, after a failure of the pump battery, the suspend got recorded with a timestamp at the beginning of time. After the time was adjusted, Tidepool was certain that the pump had been delivering basal until start/stop got toggled again.

 backfilled at
 backfilled by
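
With those two fields, separating "what the loop could see" from "what analysis knows now" becomes a simple filter. A minimal sketch, with hypothetical class and function names:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Reading:
    utc: datetime
    value: float
    backfilled_at: Optional[datetime] = None  # None: arrived live
    backfilled_by: Optional[str] = None       # e.g. a manufacturer SW readout


def visible_to_loop(readings, decision_time):
    """Readings that had already arrived when the loop made its decision."""
    return [
        r for r in readings
        if r.utc <= decision_time
        and (r.backfilled_at is None or r.backfilled_at <= decision_time)
    ]


live = Reading(datetime(2021, 3, 1, 8, 0), 110.0)
late = Reading(datetime(2021, 3, 1, 8, 5), 118.0,
               backfilled_at=datetime(2021, 3, 1, 9, 30),
               backfilled_by="vendor-sw")
seen = visible_to_loop([live, late], datetime(2021, 3, 1, 8, 10))
```

In this example `seen` contains only the live reading: the 08:05 value exists in the DB, but it arrived at 09:30 and so could not have influenced the 08:10 decision.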

TherapyEvent

yes please :)

 add rise as a variant of, or addition to, waking up. The difference becomes meaningful on sick or lazy days...
 medication in general (item, dose, possibly duration)
 cortisone (dose, duration)
 period might merit a qualifier
 Therapy failures (kinked set, accidental disconnect, "damaged" insulin) with duration

state

Having considered a different concept (a state-aware loop rather than the stateless variant), where profile switches are the closest thing to state transitions, I tend to sort some events with a duration under state rather than under events:

 toggle awake/asleep
 activity (aerobic, anaerobic, post-activity)
 cortisol (dose, duration)
 period (stage)
 pregnancy (stage)
 hypo (low, hypo, phIR)
 stages of the food chain (ES, pp)
 post-alcohol
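
One way to read "state rather than events" is as intervals that can be queried for any point in time. The following is a sketch only; `StateInterval`, its fields and the example values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class StateInterval:
    kind: str                        # e.g. "sleep", "activity", "alcohol"
    value: str                       # e.g. "asleep", "post-activity"
    start: datetime
    end: Optional[datetime] = None   # None: still ongoing


def active_states(intervals, at):
    """Map kind -> value for every interval that covers the moment `at`."""
    return {
        s.kind: s.value
        for s in intervals
        if s.start <= at and (s.end is None or at < s.end)
    }


history = [
    StateInterval("sleep", "asleep",
                  datetime(2021, 3, 1, 23, 0), datetime(2021, 3, 2, 7, 0)),
    StateInterval("activity", "post-activity",
                  datetime(2021, 3, 1, 20, 0), datetime(2021, 3, 2, 2, 0)),
    StateInterval("alcohol", "post-alcohol", datetime(2021, 3, 2, 1, 0)),
]
at_one_thirty = active_states(history, datetime(2021, 3, 2, 1, 30))
```

Unlike a list of events, this answers "which states applied at 01:30?" directly, which is what a state-aware loop would need to ask.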


notes on time

Something I wrote on Gitter about a year ago, without much feedback, regarding time shifts and drift:
I'd rather keep a reliable time reference with every record, since not all changes of time-base are necessarily logged (in pumps at least the intentional ones usually are, hard resets by death of battery are not), and there's a whole zoo of devices involved. I did some work on a data-model regarding that in 2005 (Old blurb, pre-cgm), and arrived at three different time references I wanted to keep with every record:

UTC Timestamp - to have a linear time reference. Anything that cannot be traced back to UTC is potentially worthless or worse, harmful.
device-local timestamp (whatever time the device thinks it is). Could be anything from just after the epoch to "wildly in the future", especially if it's the odd random meter used only occasionally. (I was modelling diabetic data logging in general, and didn't dare to dream that data could be used to control our pumps.)
device-local also translates to user-local when it comes to aligning notes, carbs, and other manual input with the sequential time. The latter might need some heuristic magic to determine the matching UTC timestamp if data is entered retrospectively.
(taking a different approach with this one nowadays, see circadian state loop) circadian timestamp: a reference to the circadian rhythm, basically a counter of how many hours:minutes have passed since the last known bedtime. (Needed to align IS and BR to the current day, a reference I have been sorely missing from most data models. There are two I know of that take shifts in the circadian rhythm into account: Lehmann/Deutsch with their AIDA simulator, and the doctor who did my pump ed.)
In the documentation of the Tidepool data model, they're describing the timestamp conundrum perfectly >)
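
The circadian reference described above, a counter of hours:minutes since the last known bedtime, can be derived from UTC timestamps. A minimal sketch with hypothetical function names, assuming bedtimes are logged as UTC datetimes:

```python
from datetime import datetime, timedelta


def circadian_time(utc_now, bedtimes_utc):
    """Time elapsed since the last known bedtime, or None if there is none."""
    past = [b for b in bedtimes_utc if b <= utc_now]
    if not past:
        return None
    return utc_now - max(past)


def as_hhmm(delta):
    """Format a timedelta as hours:minutes, hours not wrapped at 24."""
    minutes = int(delta.total_seconds() // 60)
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


bedtimes = [datetime(2021, 3, 1, 22, 30), datetime(2021, 3, 2, 23, 45)]
elapsed = circadian_time(datetime(2021, 3, 3, 7, 15), bedtimes)
label = as_hhmm(elapsed)
```

Because the counter restarts at each logged bedtime rather than at midnight, a late night shifts the reference for the following day along with it.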