Skip to content

Instantly share code, notes, and snippets.

@Savelenko
Last active April 4, 2021 15:46
Show Gist options
  • Save Savelenko/d13659018b820d51569a450258936433 to your computer and use it in GitHub Desktop.
Save Savelenko/d13659018b820d51569a450258936433 to your computer and use it in GitHub Desktop.
F# GADT Sample model
module Samples
open TypeEquality
(* Generic data needed for all samples. *)
type Identifier = Identifier of int64
type Site = Site of string
// Together
type GenericSampleData = {
SampleId : Identifier
Site : Site
// Add more data pieces here: type, dates, etc.
}
(* There are two categories of samples: normal and tumor. We can go two ways from here: (A) define "marker" types with
no data and put category-specific data in the actual sample DU below or (B) define types for distinguishing categories
and store specific data in same types. For now let's do (A). *)
// A trick to make this type usable only as a type. We don't need any values, but F# does not allow empty types
// natively.
type Tumor = Tumor of Tumor
type Normal = Normal of Normal
(* Let's say that tumor samples have a specific piece of data: tumor percentage. (My understanding of the original
example.) *)
type TumorPercentage = TumorPercentage of uint
(* Here comes a single type for a sample with either category. Important point is that each case additionally
contains a "note" on what the type variable of the type becomes for that case. *)
type Sample<'category> =
| NormalSample of GenericSampleData * TypeEquality<'category, Normal>
| TumorSample of GenericSampleData * TumorPercentage * TypeEquality<'category, Tumor>
(* Note the (inferred!) category type of the example values below. This is due to 'refl'. *)
// Sample<Normal>
let normalSample = NormalSample ({ SampleId = Identifier 1L; Site = Site "Lab 1"}, refl)
// Sample<Tumor>
let tumorSample = TumorSample ({ SampleId = Identifier 2L; Site = Site "Lab 2"}, TumorPercentage 1u, refl)
(* The identifier of any sample, regardless of category *)
let sampleId (sample : Sample<'a>) : Identifier =
let (NormalSample (s, _) | TumorSample (s, _, _)) = sample in s.SampleId
(* Type-safe, no-Option function which works only on tumor-category samples: *)
let tumorPercentage (sample : Sample<Tumor>) : TumorPercentage =
match sample with
| TumorSample (_, tp, _) -> tp
// Now comes the interesting part: pattern matching would be incomplete (in F#), but we know that another case
// should not be possible given the type of 'sample'. Using 'Refute' we can, well, refute an impossible sample value
// and convince the type checker that this match case is also "handled". In reality, this case will never be reached
// because it is impossible to construct 'isNormal' with type 'TypeEquality<Tumor,Normal>'.
| NormalSample (_, isNormal) -> TypeEquality.Refute<_> isNormal
(* Bonus. What if we want to determine tumor percentage of any sample? The price for that is Option and the function
could look like
tumorPercentageInAnySample : Sample<'a> -> Option<TumorPercentage>
Implementing this function is no problem, but we already have our 'tumorPercentage' function above and would like to
reuse it. So instead of a separate 'tumorPercentageInAnySample' we extend the model with another handy building block:
determine whether a sample is a tumor-category sample. Note how the "Option-penalty" moves to this function instead. *)
let asTumorSample (sample : Sample<'a>) : Option<Sample<Tumor>> =
match sample with
// Don't @ me about variable names here. We just want to copy data regardless of its meaning, this is technical code!
| TumorSample (a, b, _) -> Some (TumorSample (a, b, refl))
| _ -> None
// And now combining the building blocks:
let tumorPercentageInAnySample (sample : Sample<'a>) : Option<TumorPercentage> =
sample |> asTumorSample |> Option.map tumorPercentage
(* Runnable examples *)
let id1 = sampleId normalSample
let id2 = sampleId tumorSample
//let p1 = tumorPercentage normalSample // Does not compile, good
let p1 = tumorPercentage tumorSample
let p2 = tumorPercentageInAnySample tumorSample
let p3 = tumorPercentageInAnySample normalSample
(* This module defines a way of "remembering" that two types are the same. Don't focus on this technicality initially and
see how it is used in the model first. An important point though is that this module is reusable as a library and does
not have to be redefined for each model again. *)
module TypeEquality
/// A context which cannot be inspected or constructed by another module.
type F<'a> = private | Dummy
/// A value of type 'TypeEquality<a,b>' is a proof (witness) that types 'a' and 'b' are the same type.
[<NoEquality; NoComparison>]
type TypeEquality<'a,'b> = TypesEqual of (F<'a> -> F<'b>) with
[<CompilerMessage("Probably intended to refute equality of unequals, but the types are same.", 7001, IsError = true)>]
static member Refute(_ : TypeEquality<'a,'a>) =
failwith "This will never happen due to explicit compiler error above."
/// Use this in situations when you end up with 'TypeEquality<a,b>', with 'a' and 'b' _concrete_ distinct types (not
/// variables). This actually signals an impossible case and is equivalent to deriving logical False. And if we can
/// derive False we can derive anything. Curry-Howard isomorphism tells us that this is the same as being able to
/// compute any value. Anyways, 'Refute' impossible cases to get a polymorphic value fitting any expression you have
/// to make complete; the exception below will never actually be raised.
static member Refute(_ : TypeEquality<'a,'b>) =
failwith "Attempted to Refute polymorphic type equality. Leibniz is not amused."
/// Any type 'a' is the same as itself (reflexivity). Use this in a model to "force" type inference on type variables.
let refl = TypesEqual id
/// If type 'a' is the same as type 'b', then 'b' is the same as 'a' (symmetry).
let symm (_ : TypeEquality<'a,'b>) : TypeEquality<'b,'a> = TypesEqual unbox
/// Given that types 'a' and 'b' are actually same type and a value of type 'a', we also have a value of type 'b',
// namely the same value.
let coerce (_ : TypeEquality<'a,'b>) (a: 'a) : 'b = unbox a
@kurt-mueller-osumc
Copy link

Hey, thank you for taking the time to type this this up - this is great. I read through it all (twice!) and I'm still soaking it all in. I have some further reading to do about generalized algebraic data types, just who the heck Leibniz is, and about Equality in general.

You've also given me some code to play around with and grok. I really appreciate that!

@Savelenko
Copy link
Author

Savelenko commented Apr 3, 2021

You are welcome. GADTs are natively supported in Haskell where the compiler understands and supports everything which is done here manually through module TypeEquality and more. I learned about this technique of approximating/simulating GADTs from the PureScript leibniz library. It is more powerful than my tiny TypeEquality module here because PureScript has a more capable type system than F#. Nevertheless, as you can see even in F# this approach is practical and useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment