Skip to content

Instantly share code, notes, and snippets.

@hgiasac

hgiasac/blog.md Secret

Created January 4, 2019 17:55
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save hgiasac/5647baaccd1304068ce90dfeb9278fda to your computer and use it in GitHub Desktop.
Save hgiasac/5647baaccd1304068ce90dfeb9278fda to your computer and use it in GitHub Desktop.

Typeable - A long journey to type-safe dynamic type representation

Haskell has grown and changed amazingly in recent years, with great contribution from community and industrial adoption. There are many new ideas and changes in new GHC versions, as well as breaking changes. The most influent library is base package, which contains the Standard Haskell Prelude, and its support libraries, and a large collection of useful libraries ranging from data structures to parsing combinators and debugging utilities. (hackage)

Typeable is a module in base package. Maybe you already know, the Typeable class is used to create runtime type information for arbitrary types. Explicitly, Typeable is a unique Fingerprint (or hash) of type coupled with a coercion. This module has big influence in Data.Dynamic, Data.Data module as well as generic programming libraries. This post gives you a short story about the innovation of Typeable along with GHC extensions.

A long time ago...

Typeable was originally a part of Dynamic, which is the solution for Dynamic typing in Statically typed language. In real world, there are programming situations we need to deal with data whose type cannot be determined at compile time, for example, fetch data from API, parse JSON string, query from database...

Since mid-1960s, early programming languages (Algol-68, Pascal, Simula-67) used similar technique of Dynamic. Until 1989, Dynamic and typecase term was introduced in "Dynamic Typing in a Statically Typed Language" paper, with lambda calculus notation.

In Haskell, from what I saw in git.haskell.org, the first commit was in June 28, 2001 by Simon Marlow. However, I'm not sure this proof is correct. Maybe the module was implemented before that. Unfortunately I can't find another reliable source. In this version, Typeable and Dynamic were in same module. In July 24 2003, Typeable was moved to Data.Typeable package.

In the early stage of Haskell, TypeRep was constructed of TyCon and child TypeReps. TyCon hold Key as an unique HashTable, and a String to determine type name at user level.

https://gist.github.com/f65d643a9b32c6e524754019cb52a208

FFI C macros can help generating instances, with some exceptions, such as Tuple. undefined was used as an alternative for Proxy. unsafeCoerce was used anywhere for coercing types.

https://gist.github.com/c7f5093c3773c252083d5d78a570a0eb

From Key to Fingerprint

TyCon was stored in unsafe IORef cache, which internally kept a HashTable mapping Strings to Ints. So that each TyCon could be given a unique Int for fast comparison, the String has to be unique across all types in the program. However, derived instances of typeable used the qualified original name (e.g. GHC.Types.Int) which is not necessarily unique, is non-portable, and exposes implementation details.

https://gist.github.com/c55f0f2c96c9cbfcb00bccba2a93ac7b

In July 2011, Simon Marlow replaced Key with Fingerprint - , and moved TyCon as well as TypeRep internally. The fields of TyCon are not exposed via the public API. Together the three fields tyConPackage, tyConModule and tyConName uniquely identify a TyCon, and the Fingerprint is a hash of the concatenation of these three Strings (so no more internal cache to map strings to unique IDs). This implementation is also easier for GHC to generate derived instances

https://gist.github.com/a9443cade4a95b33a6a985b45c7fd13e

Merge all the things

With the rise of Type-level programming, GHC compiler is more and more better. Many boilerplate codes were refactored and optimized in safe and well-performed. Typeable also took advantage of new GHC features:

PolyKinds

Before Poly-kinds, there are many duplicated code to support variant for N-ary type constructors. To represent type constructors with kind * -> *, such as Maybe or [], we could create a separate type class, called Typeable1. However, this approach is ugly and inflexible. What about tuples? Do we need a Typeable2, Typeable3... for them?

https://gist.github.com/3748bd858271c41079f9a57e4e986a6f

With kind polymorphism we can write:

https://gist.github.com/27c9a324b7743caa69057c7901311ff7

We have generalized in two ways here.

  • First, Typeable gets a polymorphic kind: forall a. a -> Constraint, so that it can be used for types of any kind.
  • Second, we need some way to generalize the argument of typeRep , which we have done via a poly-kinded data type Proxy:

https://gist.github.com/13a972b31e78228254174c21375f1f4b

The idea is that, say typeRep (Proxy :: Proxy Int) will return the type representation for Int, while typeRep (Proxy :: Proxy Maybe) will do the same for Maybe. The proxy argument carries no information—the type has only one, nullary constructor and is only present so that the programmer can express the type at which to invoke typeRep. Because there are no constraints on the kind of a, it is safe to assign Proxy the polymorphic kind forall a. a -> *.

TypeLevelReasoning

Richard A. Eisenberg also proposed and implemented TypeLevelReasoning. New module Data.Type.Equality was born. The idea is defines the type of equality witnesses for two types of any kind k:

https://gist.github.com/169d3b5c9800754a1801daf514fd0173

Pattern matching on this generalized algebraic datatype (GADT) allows GHC to discover the equality between two types. If a :=: b is inhabited by some terminating value, then the type a is the same as the type b. To use this equality in practice, pattern-match on the a :=: b to get out the Refl constructor:

https://gist.github.com/8690284c7a63d3b76b8d9d865de4d821

We pattern-match on Refl. This exposes the fact that a and b must be the same. Then, GHC happily uses x of type a in a context expecting something of type b.

With this, we can remove unsafe hack when convert Proxy:

https://gist.github.com/508654591ad35378ec986af2d948ffb2

And more...

This release also added is the AutoDeriveTypeable language extension, which will automatically derive Typeable for all types and classes declared in that module.

The update could caused breaking changes, although backwards compatibility interfaces were kept. José Pedro Magalhães also backed up old codes into Data.OldTypeable module, which was removed later in GHC 7.10.1 - base-4.8.0.0

  • Since GHC 8.2, GHC has supported type-indexed type representations. Data.Typeable provides type representations which are qualified over this index. To keep Typeable backwards compatibility, the interface of this module is similar to old releases. Type.Reflection module is available for the type-indexed interface.

Safer and more expressive type representations

With the motivation of distribution programming, mainly focusing on Cloud Haskell, new challenges occurred: Typeable wasn't safe enough.

Open world's challenge

Cloud Haskell is an Erlang-style library for programming a distributed system in Haskell. It is based on the message-passing model of Erlang, but with additional advantages that stem from Haskell’s purity, types, and monads. If you are from Scala land, Cloud Haskell has similar features with Akka, which use Actor model for communicating across processes. However, Cloud Haskell also supports thread in same machine. You can conveniently do things the old way.

Because the information that is transmitted between machines by network must be in bit, the data must be Serializable. This ensures two properties: The data can be encoded and decoded to binary fore and back, and can produce a TypeRep that captures the item’s type.

https://gist.github.com/f268426ae00c1c24e28824dcc001b35b

To guarantee that the code is then applied to appropriately typed values, the receiving node must perform a dynamic type test. That way, even if the code pointer was corrupted in transit, by accident or malice, the receiving node will be type-sound. You can think of it like this: a decoder is simply a parser for the bits in the ByteString, so a decoder for Int can fail to parse a full Int (returning Nothing), but it can't return a non-Int. A simple way to do this is to serialize a code pointer as a key into a static pointer table containing Dynamic values (RemoteTable). When receiving code pointer with key k, the recipient can lookup up entry k in the table, find a Dynamic, and check that it has the expected type.

To do that, Typeable needed to evolve further.

TypeRep was not indexed, so there was no connection between the TypeRep stored in a Dynamic and the corresponding value. Indeed, accessing the typeRep required a proxy argument to specify the type that should be represented.

https://gist.github.com/63e732cb466bae77e86d372545548c28

Because there is no connection between types and their representations, this implementation of Dynamic requires unsafeCoerce. For example, here is the old fromDynamic:

https://gist.github.com/4ccc1515665ab455267461dc8410ae82

Client code is un-trusted. The current design (GHC 7.8) couldn't hide all such uses of unsafeCoerce from the user. If they could write a Typeable instance, they must use unsafeCoerce, and defeat type safety. So only GHC is allowed write Typeable instances.

Kind-indexed GADTs

The key to our approach is our type-indexed type representation TypeRep. But what is a type-indexed type representation? That is, the index in a type-indexed type representation is itself the represented type. For example:

  • The representation of Int is the value of type TypeRep Int. – The representation of Bool is the value of type TypeRep Bool.
  • And so on.

The idea of kind-indexed is originally from GADTs. For example, we consider designing a GADT for closed type representations:

https://gist.github.com/d3ec2ff3eea182b6def9de6d2b03b6ae

GADTs differ from ordinary algebraic data types in that they allow each data constructor to constrain the type parameters to the datatype. For example, the TyInt constructor requires that the single parameter to TyRep be Int.

We can use type representations for type-indexed programming a simple example liked computing a default element for each type.

https://gist.github.com/c516f051d487f1c8dcea975cd375a843

This code pattern matches the type representation to determine what value to return. Because of the nonuniform type index, pattern matching recovers the identity of the type variable a.

  • In the first case, because the data constructor is TyInt, this parameter must be Int, so 0 can be returned.
  • In the second case, the parameter a must be equal to Bool, so returning False is well-typed.

However, the GADT above can only be used to represent types of kind *. To represent type constructors with kind * -> *, such as Maybe or [], we could create a separate datatype, perhaps called TyRep1, TyRep2,... Kind polymorphism which allows data types to be parameterized by kind variables as well as type variables, could be the solution. However, it is not enough to unify the representations for TyRep. the type representation (shown below) should constrain its kind parameter.

https://gist.github.com/73ccc3747cd1fe12df64de19bdf80083

This TyRep type takes two parameters, a kind k and a type of that kind (not named in the kind annotation). The data constructors constrain k to a concrete kind. For the example to be well-formed, TyInt must constrain the kind parameter to *. Similarly, TyMaybe requires the kind parameter to be * -> *. We call this example a kind-indexed GADT because the datatype is indexed by both kind and type information.

Pattern matching with this datatype refines kinds as well as types—determining whether a type is of the form TyApp makes new kind and type equalities available. For example, consider the zero function extended with a default value of the Maybe type.

https://gist.github.com/c5e10c30eb699962c70f80be10fdbae6

In the last case, the TyApp pattern introduces the kind variable k, the type variables b :: k -> * and c :: k, and the type equality a ∼ b c. The TyMaybe pattern adds the kind equality k ~ * and type equality b ∼ Maybe. Combining the equality, we can show that Maybe c, the type of Nothing, is well-kinded and equal to a.

With this design, we also enable type decomposition feature. Finally, new TypeRep will be:

https://gist.github.com/c1c505865f9304bbf777dfc504e438d0

The current implementation of TypeRep allows constant-time comparison based on fingerprints. To support this in the new scheme we would want to add a fingerprint to every TypeRep node. But we would not want clients to see those fingerprints.

The TyCon type, which is a runtime representation of the “identity” of a type constructor, is now silently generates a binding for a suitable instance for every datatype declaration by GHC. For example, for Maybe GHC will generate:

https://gist.github.com/8b6b0a83949e7a9983cca327469811fa

The name $tcMaybe is not directly available to the programmer. Instead (this is the second piece of built-in support), GHC’s type-constraint solver has special behavior for Typeable constraints, as follows.

To solve Typeable(t1 t2), GHC simply solves Typeable t1 and Typeable t2, and combines the results with TrApp. To solve Typeable T where T is a type constructor, the solver uses TrTyCon. The first argument of TrTyCon is straight-forward: it is the (runtime representation of the) type constructor itself, e.g $tcMaybe.

But TrTyCon also stores the representation of the kind of this very constructor, of type TypeRep k. Recording the kind representations is important, otherwise we would not be able to distinguish, say, Proxy :: * -> * from Proxy :: (* -> *) -> *, where Proxy has a polymorphic kind (Proxy :: forall k. k -> *). We do not support direct representations of kind-polymorphic constructors like Proxy, rather TrTyCon encodes the instantiation of a kind-polymorphic constructor (such as Proxy).

Notice that TrTyCon is fundamentally insecure: you could use it to build a TypeRep t for any t whatsoever. That is why we do not expose the representation of TypeRep to the programmer. Instead the part of GHC’s Typeable solver that builds TrTyCon applications is part of GHC’s trusted code base.

TypeRep is abstract, and thus we don't use proxy to determine the TypeRep value anymore:

https://gist.github.com/ba20c5e8a2b2c342c1a396c244a01357

Kind equalities

We also need to recompute representation equality function eqT. It is easy, just need to compare equality between 2 TypeRep fingerprint:

https://gist.github.com/818ce07e96b24ae07e3d5a6daeefae8b

It is critical that this function returns (:~~:), not (:~:). GHC can't compile. This is because TyCons exist at many different kinds. For example, Int is at kind *, and Maybe is at kind * -> *. Thus, when comparing two TyCon representations for equality, we want to learn whether the types and the kinds are equal. If we used type equalities (:~:) here, the eqTypeRep function could be used only when we know, from some other source, that the kinds are equal.

Richard A. Eisenberg proposed and implemented kind heterogeneous equalities (2013 - 2015). It enable new, useful features such as kind-indexed GADTs, promoted GADTs and kind families. This extension was experiment in GHC 8.0.1, then was provided in Data.Type.Equality module.

The restriction above exists because GHC reasons about only type equality, never kind equality. The solution to all of these problems is simple to state: merge the concepts of type and kind. If types and kinds are the same, then we surely have kind equalities. In order to overcome those challenges, it has been necessary to augment GHC’s internal language, System FC. This is beyond the scope of this post. If you need to dig into detail, let's read this paper.

fromDynamic turns out like this:

https://gist.github.com/e77186ae9e0011a650dbaedd8819ca1b

We use eqT to compare the two TypeReps, and pattern-match on HRefl, so that in the second case alternative we know that a and d are equal, so we can return Just x where a value of type Maybe d is needed. More generally, eqT allows to implement type-safe cast, a useful operation in its own right.

Decomposing polykinds representations

So far, we have discussed type representations for only types of kind *. The only operation we have provided over TypeRep is eqT, which compares two type representations for equality. Does (,) which has kind (* -> *) too have a TypeRep? For example, how can we decompose the type representation, to check that it indeed represents a pair, and extract its first component?

https://gist.github.com/ce8cf417563c2b0273610ed8b5494cf6

Of course it must. Since types in Haskell are built via a sequence of type applications (much like how an expression applying a function to multiple arguments is built with several nested term applications), the natural dual is to provide a way to decompose type applications. Let's take a look at TrApp definition:

https://gist.github.com/95426bd7e750cf7ee6b7229ec2e19860

TrApp allows us to observe the structure of types and expose the type equalities it has discovered to the type checker. Now we can implement dynFst:

https://gist.github.com/9ccd6a6e481938a25ad32a74922460f2

We check that the TypeRep of x is of form (,) a b by decomposing it twice. Then we must check that rp, the TypeRep of the function part of this application, is indeed the pair type constructor (,); we can do that using eqT. These three GADT pattern matches combine to tell the type checker that the type of x, which began life in the (Dyn rpab x) pattern match as an existentially-quantified type variable, is indeed a pair type (a, b). So we can safely apply fst to x, to get a result whose type representation ra we have in hand.

The code is simple enough, but the type checker has to work remarkably hard behind the scenes to prove that it is sound. Let us take a closer look with kind signatures added:

https://gist.github.com/fe563b6965ba8a1e02c971927a0787f4

Note that k1, the kind of b is existentially bound in this data structure, meaning that it does not appear in the kind of the result type (a b). We know the result kind of the type application but there is no way to know the kinds of the sub-components.

With kind polymorphism in mind, let’s add some type annotations to see what existential variables are introduced in dynFst:

https://gist.github.com/04d042773d726353b0258c1a4a38bcc7

Focus on the arguments to the call to eqT in the third line. We know that:

  • rp :: TypeRep p and p :: k1 -> k2 -> *
  • typeRep :: TypeRep (,) and (,) :: * -> * -> *

So eqT must compare the TypeReps for two types of different kinds; if the runtime test succeeds, we know not only that p ~ (,), but also that (k1 ~ *) and (k2 ~ *). That is, the pattern match on Refl GADT constructor brings local kind equalities into scope, as well as type equalities.

Better Pattern matching with PatternSynonyms

The code above is ugly. Moreover, we don't want to expose TypeRep constructor to users. Earliest solution is returning another GADT:

https://gist.github.com/e891bb8eeb86c021a9958c9d865338fd

However, we can make it better with PatternSynonyms extension which was provided since GHC 7.8.

Pattern synonyms enable giving names to parametrized pattern schemes. They can also be thought of as abstract constructors that don’t have a bearing on data representation. GHC User's Guide

https://gist.github.com/d2dc0d7e926a5fd7c561b2fe1c171441

With this extension, you can not only hide the representation of the datatype, but also use it in pattern matching. Our code is much cleaner.

Decompose function type, TypeInType and Dependent Types

We also need to implement dynApply, which applies a function Dynamic to an argument Dynamic. It is necessary in real-world application, e.g, send an object with a function type, say Bool -> Int, over the network.

https://gist.github.com/39b738d31afec776ad85bbfe2c081928

In theory, We can use TrApp and App pattern to construct and decompose function type (->) too. The definition of this function is fairly straightforward:

https://gist.github.com/89e165a50bbbdb7f35421bcdbe0c127d

However, because functions are quite ubiquitous, we should define another constructor for the sake of efficiency:

https://gist.github.com/9230a589f69785ecc2cbe2709be32152

This definition wasn't compiled before GHC 8.0. First, TrApp and TrFun is ambiguous that can be solved with explicit quantification. Second, GHC can't know that kind TypeRep :: k -> * is identical with * -> *.

Kind equality extends the idea of kind polymorphism by declaring that types and kinds are indeed one and the same. Nothing within GHC distinguishes between types and kinds. Another way of thinking about this is that the type Bool and the “promoted kind” 'Bool are actually identical.

One simplification allowed by combining types and kinds is that the type of Type is just Type. It is true that the Type :: Type axiom can lead to non-termination, but this is not a problem in GHC, as we already have other means of non-terminating programs in both types and expressions. This decision (among many, many others) does mean that despite the expressiveness of GHC’s type system, a “proof” you write in Haskell is not an irrefutable mathematical proof. GHC promises only partial correctness, that if your programs compile and run to completion, their results indeed have the types assigned. It makes no claim about programs that do not finish in a finite amount of time. GHC User's Guide

To enable (* ::*) axiom, you have to enable TypeInType extension, which is a deprecated alias of PolyKinds, DataKinds and KindSignatures. Its functionality has been integrated into these other extensions. With this extension, GHC can know that k ~ *, and the code can compile.

https://gist.github.com/2e241b46913cd70f332726d4abe48d29

Levity Polymorphism

You may notice, there are RuntimeRep and TYPE kind signatures in TrFun constructor. They relates to Levity Polymorphism, which is implemented in GHC version 8.0.1, released early 2016.

In brief, most of types we use everyday (Int, Bool, AST, ...) are boxed value, which is represented by a pointer into the heap. The main advantage of boxed types are supporting polymorphism. The disadvantage is slow performance. Most polymorphic languages also support some form of unboxed primitive values that are represented not by a pointer but by the value itself. In Haskell, unboxed types are denoted with MagicHash suffix.

https://gist.github.com/5f5679ee680bb52dd1f9d7501333fc0c

Haskell also categorizes types into lifted, and unlifted. Lifted types is one that is lazy. It is considered lifted because it has one extra element beyond the usual ones, representing a non-terminating computation. For example, Haskell’s Bool type is lifted, meaning that three Bools are possible: True, False ,and . An unlifted type, on the other hand, is strict. The element does not exist in an unlifted type.

Because Haskell represents lazy computation as thunks at runtime, all lifted types must also be boxed. However, it is possible to have boxed, unlifted types.

  • Lifted - Boxed: Int, Bool,...
  • Unlifted - Boxed: ByteArray#
  • Unlifted - Unboxed: Int#, Bool#

Given these unboxed values, the boxed versions can be defined in Haskell itself; GHC does not treat them specially. For example:

https://gist.github.com/897d72d8ca6bb49b1329cfa06d0ffa77

Here Int is an ordinary algebraic data type, with one data constructor I#, that has one field of type Int#. The function plusInt simply pattern matches on its arguments, fetches their contents (i1 and i2, both of type Int#), adds them using (+#), and boxes the result with I#.

The issue is, like many other compilers for a polymorphic language, GHC assumes that a value of polymorphic type, such as x :: a s represented uniformly by a heap pointer, or lifted type. The compiler adopts The Instantiation Principle:

You cannot instantiate a polymorphic type variable with an unlifted type.

That is tiresome for programmers, but in return they get solid performance guarantees.

How can the compiler implement the instantiation principle? For example, how does it even know if a type is unlifted? By kinds, much the same way that terms are classified by types. Type, or * kind which we use every day is lifted. In contrast, # is a new kind that classifies unlifted types, e.g: Int#, Bool#.

In default, polymorphism functions assume kind of parameters is Type. If we attempt to instantiate it at type Float# :: #, we will get a kind error because Type and # are different kinds. The function arrow type (->) is the same. It is just a binary type constructor with kind:

https://gist.github.com/980e3ec4349e5236514e89912f12e861

But now we have a serious problem: a function over unlifted types, such as sumTo# :: Int# -> Int# -> Int#, becomes ill-kinded!.

For many years its “solution” was to support a sub-kinding relation. That is, GHC had a kind OpenKind, a super-kind of both Type and #. We could then say that:

https://gist.github.com/bcb470fe5a1c68b953b72389178bc5f7

However, there are drawbacks. The combination of type inference, polymorphism, and sub-typing, is problematic. And indeed GHC’s implementation of type inference was riddled with awkward and unprincipled special cases caused by sub-kinding. Moreover, The kind OpenKind would embarrassingly appear in error messages.

Levity polymorphism bring new idea: replace sub-kinding with kind polymorphism. New primitive type-level constant, TYPE :: RuntimeRep -> Type is introduced with the following supporting definitions:

https://gist.github.com/602a6fc5c2c7f3eecdd8810499336d47

RuntimeRep is a type that describes the runtime representation of values of a type. Type, the kind that classifies the types of values, was previously treated as primitive, but now becomes a synonym for TYPE Lifted, where Lifted :: RuntimeRep. It is easiest to see how these definitions work using examples:

https://gist.github.com/6bd2a86bd4f07b51ca325a10592ea2ee

Any type that classifies values, whether boxed or unboxed, lifted or unlifted, has kind TYPE r for some r :: RuntimeRep. The type RuntimeRep tells us the runtime representation of values of that type. This datatype encodes the choice of runtime value.

We can now give proper types to (->), same as TrFun. This enables polymorphism for both lifted and unlifted types.

https://gist.github.com/f4cc4f90a104c3dcc8f362b65dfd7148

Note: Unboxed and Levity Polymorphism types are imported in GHC.Exts module.

If you are more curious about Levity polymorphism, let's take a look at original paper.

Compare TypeReps

It is sometimes necessary to use type representations in the key of a map. For example, Shake uses a map keyed on type representations to look up class instances (dictionaries) at runtime; these instances define class operations for the types of data stored in a collection of Dynamics. Storing the class operations once per type, instead of with each Dynamic package, is much more efficient.

More specifically, we would like to implement the following interface:

https://gist.github.com/7acbddf36d784252b89957f41a66629d

But how should we implement these type-indexed maps? One option is to use HashMap. We can define the typed-map as a map between the type representation and a dynamic value.

https://gist.github.com/50b647c3d691aa472652e10abb485872

Notice that we must wrap the TypeRep key in an existential SomeTypeRep, otherwise all the keys would be for the same type, which would rather defeat the purpose! The insert and lookup functions can then use toDynamic and fromDynamic to ensure that the right type of value is stored with each key.

https://gist.github.com/aee1ed25cc8b16d7dc2f00e5b17614d3

Because Map family uses balanced binary trees to achieve efficient lookup, SomeTypeRep must be an instance of Ord. This is straightforward, since TypeRep use fingerprint for unique hash, it can be compared too.

https://gist.github.com/3ccd5b8e9d1267e75b1ae9d1f7f55a21

Notice that we cannot make an instance for Ord (TypeRep a): if we compare two values both of type TypeRep t, following the signature of compare, they should always be equal!

Other Changes

  • Type-indexed type representation interface is in Type.Reflection module. Data.Typeable provides type representations which are qualified over this index, providing an interface very similar to the Typeable notion seen in previous releases for backward compatibility.

  • Data.Typeable.TypeRep and Type.Reflection.TypeRep are different. Data.Typeable.TypeRep is alias of SomeTypeRep

  • TypeRep definition is replaced with record, to easier to extend parameters https://gist.github.com/a38efab80c66a1d6a83807ade3d17bf2

  • The kind of the TypeRep in each TrTyCon and TrApp constructor is cached. This is necessary to ensure that typeRepKind (which is used, at least, in deserialization and dynApply) is cheap, because calculating the kind of type constructor and nested type applications is pricy,

  • We need to be able to represent TypeRep Type. This is a bit tricky because typeRepKind (typeRep @Type) = typeRep @Type, so if we actually cache the TypeRep of the kind of Type, we will have a loop. One simple way to do this is to make the cached kind fields lazy and allow TypeRep Type to be cyclical.

Limitation and unexplored future

Our interface does not support representations of polymorphic types, such as TypeRep (∀ a. a -> a). Although plausible, supporting those in our setting brings in a whole new range of design decisions that are as of yet unexplored (e.g. higher-order representations vs. de-Bruijn?). Furthermore, it requires the language to support impredicative polymorphism (the ability to instantiate quantified variables with polymorphic types, for instance the a variable in TypeRep a or Typeable a), which GHC currently does not. Finally, representations of polymorphic types have implications on semantics and possibly parametricity.

Similarly, constructors with polymorphic kinds would require impredicative kind polymorphism. A representation of type TypeRep (Proxy :: ∀ kp. kp -> *) would require the kind parameter k of TypeRep (a :: k) to be instantiated to the polymorphic kind ∀ kp. kp -> *. Type inference for impredicative kind polymorphism is no easier than for impredicative type polymorphism and we have thus excluded this possibility.

Summary

Typeable take a long journey for definition, design, refactor and redesign, from runtime term level to type-safe. However, the journey doesn't stop here. GHC is still growing, better and safer, with innovation and contribution of brilliant researchers. It motivates us to study and do great software and bring toward the industry.

Acknowledgment

Thanks to Richard A. Eisenberg and his team for great research and contribution. This post take many reference in their papers.

Appendix

  • The complete code in "A reflection on types" paper: link
  • Homogeneous equality ~ Type equality: (a :~: b)
  • Heterogeneous equality ~ Kind equality: (a :~~: b)
  • Universal quantification: forall a. a -> a
  • Existentially quantified type: forall a. Show a => a -> String

References

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment