Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save typesanitizer/ac0f09efd862d7b113585eb7e5f1fd2e to your computer and use it in GitHub Desktop.
Save typesanitizer/ac0f09efd862d7b113585eb7e5f1fd2e to your computer and use it in GitHub Desktop.
What I want from a Type System
(Note to the reader: Sections marked with < VG: ... > are notes inserted by me
(Varun), not from the original author. I'm just taking notes here to understand
better what exactly the person wants, and trying to analyse how the different
"requirements" are at odds with each other.)
The question was asked why I (as a programmer who prefers dynamic languages) don't consider static types "worth it". Here
is a short list of what I would need from a type system for it to be truely useful to me:
< VG: This is dangerous to some extent (if you use it everywhere). History
has shown that while people (e.g. ML/Haskell users) were initially interested
in having full type inference and did not want to write signatures anywhere,
that can easily lead to hidden API changes, and more generally volatility
due to non-local effects of inference. Certainly, there are ways to do
inference in a more local manner apart from ordinary unification, but I'm not
sure how effective that is if you do not even have top-level signatures.
>
1) Full type inference. I would really prefer to be able to write:
(defn concat-names [person]
(assoc person :full-name (str (:first-name person)
(:second-name person))))
And have the compiler know that whatever type required and produced from this function was acceptible as long as the
input type had :first-name and :second-name and the output had :full-name.
< VG: This _cannot_ interact in a sane way with _full_ type inference.
If you write down just [1030203, "Chocolate"], the type system has no
way of understanding that you meant [ProductID, ProductName]
instead of [Int32, String], unless it receives the information from
_somewhere_. Some annotations are necessary to "pin" down things ... the
type-checker can insert coercions in other places.
>
2) Value types in structures would need to be nominal, not structural. I really don't care if I get a tuple of
[Int32, String], I very rarely care what the machine level types are, what I really want is [ProductID, ProductName],
what the underlying types are doesn't really matter to me.
< VG: This can be accomodated by using either width subtyping or row
polymorphism.
This also seems to imply that types are not allowed to be made abstract,
because that would allow you to hide the fields and you'd have to return
a type error for an `is-a` check.
>
3) Collections of named value types (call them structs, or classes, or whatever) would need to be inclusively typed,
not exclusive. That is to say I sould be able to say:
(defstruct Person [first-name last-name])
The "is-a" check for this class/struct would then need to work on any type that included the first-name and last-name
values. Including any classes/structs that included other parameters. In the `concat-name` function above, the input
and the output of that function would both return true for (is-a? Person x), since both the inputs and the outputs
include the correct values.
Perhaps then we should say that key/value structures are structurally typed (but allow for extra items), while other
types are nominally typed.
< VG: The transformation of Person.:first-name -> :person.name/first seems odd,
and I can't quite figure out what general rule the author is getting at here...
>
4) This could get messy very quickly, so it would be important to have all these structural members be namespaced as
well. :first-name isn't good enough, we would need to be able to specify the name as :person.name/first and
:person.name/last. Thus I could later add a :product/name and it would not conflict with any other definition of :name.
< VG: Basic set logic is a _very_ slippery term. Are set complements/universes
included? Do we "only" have unions and intersections? Presumably, when the author says
"these types" they are talking about nominal and structural types, not including
function types (which are not at all mentioned here).
I don't think unions can interact well with inference. If (x : Employee) type-checks,
then so does (x : Employee | τ) for every other τ. If you have (x : Person), and you
also have Teacher in scope, then (x : Employee), (x : Person), (x : Teacher), all
type-check.
If one limits inference to only named unions, that would probably lead to type errors
for intermediate values in case one is working with large unions.
>
5) Now I would still need basic set logic for these types so I could say (deftype Teacher (union Person Employee))
< VG: As an aside, the paragraph below sounds very much like what I read about
protocol buffers (or was it FlatBuffers?), where fields are always marked
optional, essentially meaning that anything can be nil. Also, the point of
needless conversions is handled by a proper record system.
>
So, why do I need all this? Well a lot of work I do has to do with very messy business logic. Situations were a
company wants to convert their receipt system, trading engine, or something of that nature into computer code. Often
this data is very complex and hard to define. Situations where a importer may bring in 100 fields but we only care
about 10. Thus it's often important to require that certain data match a model, but allow for extra data to flow
through a system without much effort. In addition, the problem with nominal k/v types is that I often find myself
converting between two types simply because function A requires a DBPerson, and function B requires a UIPerson. If
the keys and values are the same, they should flow through, while still keeping me from saying PersonName = ProductID.
All this would also need to require a minimal amount of typing from me, I can't spend time writing converters from
DBPerson to UIPerson, and the conversion of one of these types to the other shouldn't take a lot of compute power since
I may be performing millions of these conversions every second.
In the end, no static type system I've seen provides all this for me. Some come close, but none are perfect. So I
stick with dynamic languages (Clojure if you haven't guessed), and leverage Spec (http://clojure.org/about/spec) which
not only provides validation for me where I want it, but also offers QuickCheck like generative testing, and destructuring.
Perhaps someday someone will provide all that for me, with a REPL, since I really need that, but until then I'll stick
with data-driven dynamic Clojure.
Thanks for taking the time to read this, hopefully it was helpful.
Timothy Baldridge
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment