Skip to content

Instantly share code, notes, and snippets.

@andrewthad
Created June 14, 2019 14:47
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save andrewthad/90a4b23051870d14afb4ba064699bc5b to your computer and use it in GitHub Desktop.
Save andrewthad/90a4b23051870d14afb4ba064699bc5b to your computer and use it in GitHub Desktop.
Strictness in Kinds

Strictness in Kinds

Let's tweak GHC's RuntimeRep stuff to try adding information about strictness. This information could be helpful for avoiding evaluating thunks that are guaranteed to already be evaluated. Presently, RuntimeRep is defined as:

data RuntimeRep
  = LiftedRep
  | UnliftedRep
  | IntRep
  | ...

Let's adjust this:

data Strictness = Strict | Lazy
data RuntimeRep
  = LiftedRep Strictness
  | UnliftedRep
  | IntRep
  | ...
type Type = TYPE ('LiftedRep 'Lazy)

In Levity Polymorphism, Eisenberg and Peyton Jones argue that "kinds, not types, are calling conventions". So what's the calling convention for these two new kinds? We want LiftedRep Lazy to behave just like the old LiftedRep did, and we want LiftedRep Strict to behave like UnliftedRep. So why bother introducing something new rather than reusing UnliftedRep? The answer is that we want some cool magic primitives, forget and seqTyped. These cannot be used with things like Array#. We'll get to them later. Let's write a spine-strict element-lazy list data type as:

data List :: Type -> forall (s :: Strictness) -> TYPE ('LiftedRep s)
  Cons :: a -> List a 'Strict -> List a s
  Nil :: List a s

The strictness type variable needs to show up last, and it must have role phantom. That is, the data constructor cannot perform GADT-like analysis on it, and they cannot give it to their fields. With this in mind, we can throw this in the mix:

forget :: forall (a :: forall (s :: Strictness) -> TYPE ('LiftedRep s)). a 'Strict -> a 'Lazy

And forget just compiles to a no-op. Any evaluated closure can be treated as an unevaluated closure, a fortunate consequence of how GHC's runtime was designed. (Notice that users cannot call this on something like Array#). Next, let us consider a better seq:

seqTyped :: forall (a :: forall (s :: Strictness) -> TYPE ('LiftedRep s)). a 'Lazy -> a 'Strict

This function scrutinizes argument. This is not a no-op. It turns into case in GHC Core. Speaking of which, that case would work differently. In GHC Core, we would want case to provide make the new identifier have a strict type:

case (ds1 :: List Int Lazy) of (ds2 :: List Int Strict) {
  __DEFAULT -> ...
 }

You could even have wrappers to support standard non-strict data types. If a user wrote:

data Foo = Foo Bool Bool Bool

The compiler could generate a wrapper:

data FooInternal :: forall (s :: Strictness) -> TYPE ('LiftedRep s) where
  FooInternal :: Bool -> Bool -> Bool -> FooInternal s
newtype Foo = Foo (FooInternal 'Lazy)

And the user is none the wiser. Every data type would need to be internally rewritten in this way. Is all of this stuff actually sound? Who knows? It looks sound to me. But you really need someone who can write cool proofs like this:

        τ type
---------------------
something → goes here
@Ericson2314
Copy link

"The strictness type variable needs to show up last, and it must have role phantom."

I'm concerned about the precedent for putting erasable stuff in the runtime rep. Per https://gitlab.haskell.org/ghc/ghc/issues/14917 it would be nice to have a monomorphizing quantifier. But with your plan we are left with a requirement that the runtime rep need only be shallowly monomorphized: Strict | Lazy is behind the pointer and thus not relevent. This means the "const eval" system that would go with the memorizing compiler would have to track not just all (deep) or nothing constantness, but also shallow (head) constantness.

condensed into one example, I'm worried about how to allow:

id :: foreach (constShallow r :: RuntimeRep). forall (a  :: TYPE r). a -> a
id a = a

foo :: forall s (a  :: TYPE (Lifted s)). a -> a
foo = id

@andrewthad
Copy link
Author

andrewthad commented Jun 15, 2019 via email

@Ericson2314
Copy link

@andrewthad first of all I think you might want to check your markdown :).

If you are fine with:

id :: foreach (r :: RuntimeRep). forall (a :: TYPE r). a -> a
id a = a

foo :: foreach (s :: Strictness). forall a :: TYPE ('Lifted s). a -> a

foo = id @('Lifted s)

then there is no problem! Note that this means that foo (and forget and seqTyped like it) must be specialized to their strictness.


The strictness parameter could be irrelevant in principle, by which I mean not to TYPE or Lifted but the body of the \@(s : Strictness) -> body associated with a forall (s : Strictness). quantifier. (We know this is true because this is the case today, the ABI is the same!) I perhaps further discussion of this is useful to https://gitlab.haskell.org/ghc/ghc/issues/14917, but not to this issue.


I was just reminded of https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0019-constraint-vs-type.rst#old-proposal's third alternative. The goal there was (t :: Type) |> (Type ~R Constraint) :: Constraint. It sounds like this proposal would benefit from, if not need outright, (t :: TYPE (Lifted Strict)) |> (TYPE (Lifted Strict) ~R TYPE (Lifted Lazy)) :: TYPE (Lifted Lazy).

@simonpj
Copy link

simonpj commented Jun 18, 2019

@AndreasPK
Copy link

AndreasPK commented Jun 22, 2019

If the goal of this is optimizing the resulting code then the backend really wants three properties not two.

  • Known tagged (WHNF)
  • Unknown (Current LiftedRep behaviour)
  • Known untagged (Thunk)

The known untagged isn't as important as you don't gain as much codegen wise, but you can still remove the check for the tag.

That is not to say we can't limit Core to just two and infer the rest as optimization on a best-effort basis.

@sgraf812
Copy link

sgraf812 commented Nov 12, 2019

What's the purpose of UnliftedRep here? Have you looked at the wiki page on unlifted data types? It has a similar approach to levity polymorphism with (non-zero-cost) coercions. In particular, I'd propose

data Levity = Lifted | Unlifted
data PointeeKind = Managed Levity | Unmanaged
data RuntimeRep
  = PtrRep PointeeKind
  | IntRep
  | ...
pattern AddrRep = PtrRep Unmanaged
pattern LiftedRep = PtrRep (Managed Lifted)
pattern UnliftedRep = PtrRep (Managed Unlifted)

And then introduce polymorphism over Levity.

@andrewthad
Copy link
Author

In Proposal #112, the levity type data Levity = Lifted | Unlifted was accepted will be implemented. I've just never gotten around to doing it. So, a future GHC will certainly have:

data Levity = Lifted | Unlifted
data RuntimeRep
  = PtrRep Levity
  | IntRep
  | ...

The structure you suggest is isomorphic to what I have in mind but with entirely different semantics communicated by the names:

data Strictness = Strict | Lazy
data Levity = Lifted Strictness | Unlifted
data RuntimeRep
  = PtrRep Levity
  | IntRep
  | ...
data ByteArray# :: TYPE ('PtrRep 'Unlifted)
data Addr# :: TYPE 'AddrRep
data UserDefined :: forall (s :: Strictness). TYPE ('PtrRep ('Lifted s))

In your example, you've consolidated unmanaged pointers and managed pointers. That could be done with data Levity = Lifted Strictness | Unlifted | Unmanaged although I think it's orthogonal to the question at hand. We need three kinds, rather than two, for boxed types because of forget:

forget :: forall (a :: forall (s :: Strictness) -> TYPE ('PtrRep ('Lifted s))). a 'Strict -> a 'Lazy

Consider the above data types. We need forget to type check when called on a strict UserDefined, but it should never be able to accept ByteArray# as an argument. That is, there is no lazy variant of ByteArray#. By cutting it down to two kinds instead of three, you make it impossible to provide a meaningful type for forget.

Perhaps someone could make ByteArray# allowed to be lazy or strict with something like:

data CoreByteArray :: ... -- strictness polymorphic
newtype ByteArray# = ByteArray# (CoreByteArray# 'Strict) -- an unlifted newtype

Then you really could collapse unlifted types and strict lifted types into a single kind as you suggest. But this is more work, and there may be reasons not to do that to ByteArray#. I haven't thought about doing that much.

@sgraf812
Copy link

Ah, that makes sense. And indeed you say so right at the beginning, but I couldn't make much sense of it without reading the proposal first.

But then there's a cringeworthy separation between Lifted Strict and Unlifted. Apart from Lifted Strict not actually making much sense (but let's not engage in a bike-shedding discussion here), I can imagine that we want to write polymorphic functions that work on both. It sounds to me like Lifted Strict is really a sub-rep of Unlifted, namely those data types which also have a lifted variant.

But why not solve this with a type class? I called this LevityPolymorphic/LevityCoercible (seems I wasn't very consistent) on the wiki page. ByteArray# won't be an instance of this type-class, whereas user-defined data types might be. We could either have a semi-automatic mechanism via deriving or have it fully automated like Coercible. Then there's no need for a separate Unlifted anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment