@sstucki
Last active May 26, 2020 11:34
Strawman of a DOT calculus with higher-kinded types

DOT with higher-kinded types -- A sketch

Definitions

Syntax

Var         x, y, z
Term           s, t ::= x                 variable
                        λx:T.t            term abstraction
                        s t               term application
                        new(x: T)d        record creation
                        t.l               term member selection
Definition        d ::= { a = t }         term member
                        { A = T }         type member
Value*         v, w ::= λx:T.t
                        new(x: T)d
Path*          p, q ::= x
                        v
Type        S, T, U ::= ⊤                 top
                        ⊥                 bottom
                        S ∧ T             intersection
                        (x:S) → T         dependent function type
                        { a : T }         term member
                        { A :: K }        type member
                        μ(x: T)           recursive record type
                        p.A               type member selection
                        λx:S.T            type abstraction
                        S t               type application

Kind              K ::= (x:S) → K         dependent function kind
                        S..T              type interval kind

Context           Γ ::= ∅                 empty
                        Γ, x: T           extension

* = predicate on terms defining a subset of the terms

As is customary, we write S → T as a shorthand for non-dependent function types (and kinds), i.e. for (x: S) → T where x does not appear freely in T. We write * for the interval containing the entire type lattice, i.e. for ⊥..⊤.

Substitutions [x := p] of a variable x for a path p and their applications t[x := p] to terms and T[x := p] to types are defined in the usual way.

We make use of let-binders, which are desugared in the usual way,

⟦ let x: T = t₁ in t₂ ⟧ = (λx: T. ⟦ t₂ ⟧) ⟦ t₁ ⟧

omitting the explicit type annotation when it can be readily inferred.

As in DOT, parametric polymorphism is achieved through "boxing" and (path-)dependent types, e.g. universals are encoded as

∀X.T   =   (x: { A :: * }) → T[X := x.A]
ΛX.t   =   λx: { A :: * }. t[X := x.A]
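
For intuition, the same idiom can be sketched in Scala with an abstract type member playing the role of the "box" (TypeBox and id are illustrative names):

trait TypeBox { type A }
def id(box: TypeBox)(a: box.A): box.A = a   // encodes ΛX. λa: X. a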

Type operators are encoded similarly as

K₁ → K₂    =   { A :: K₁ } → K₂
λX: K. T   =   λx: { A :: K }. T[X := x.A]

Note that we can use dependent kinds to give very precise kinds to type operators:

λX: K. T  ::  (X: K) → T..T

As we'll see below, this allows us to admit β-reductions in types through a combination of kinding and sub-kinding.

Reduction rules

Contractions:

(AppAbs)        (λx:T.t) v —→ t[x := v]

                d = … ∧ { a = t } ∧ …
(RecSel)        ------------------------------------
                (new(x: T)d).a —→ t[x := new(x: T)d]

Congruence:

                   t₁ —→ t₁'
(App₁)          ---------------
                t₁ t₂ —→ t₁' t₂

                    t —→ t'
(App₂)            -----------
                  v t —→ v t'

                    t —→ t'
(Sel)             -----------
                  t.a —→ t'.a

Typing

Term typing

x : T ∈ Γ
--------- (T-Var)
Γ ⊢ x : T

Γ ⊢ S :: *    Γ, x : S ⊢ t : T
------------------------------ (T-→-I)
Γ ⊢ λx: S.t : (x: S) → T

Γ ⊢ t : (x: S) → T   Γ ⊢ p : S
------------------------------ (T-→-E₁)
Γ ⊢ t p : T[x := p]

Γ ⊢ t₁ : S → T   Γ ⊢ t₂ : S
--------------------------- (T-→-E₂)
Γ ⊢ t₁ t₂ : T

Γ, x: T ⊢ d : T
------------------------ (T-{}-I)
Γ ⊢ new(x: T)d : μ(x: T)

Γ ⊢ t : { a : T }
----------------- (T-{}-E)
Γ ⊢ t.a : T

Γ ⊢ p : T[x := p]
--------------------- (T-Fold)
Γ ⊢ p : μ(x: T)

Γ ⊢ p : μ(x: T)
--------------------- (T-Unfold)
Γ ⊢ p : T[x := p]

Γ ⊢ t : T    Γ ⊢ t : U
---------------------- (T-∧-I)
Γ ⊢ t : T ∧ U

Γ ⊢ t : S   Γ ⊢ S <: T
---------------------- (T-Sub)
Γ ⊢ t : T

Value/path typing is defined as the obvious restrictions of the above relation to values/paths.

The following rules for let-binders are admissible.

Γ ⊢ p : S   Γ, x : S ⊢ t : T
--------------------------------- (T-Let₁)
Γ ⊢ let x: S = p in t : T[x := p]

Γ ⊢ t₁ : S   Γ ⊢ t₂ : T
--------------------------- (T-Let₂)
Γ ⊢ let x: S = t₁ in t₂ : T

Definition typing

Γ ⊢ t : T
------------------------- (D-{:})
Γ ⊢ { a = t } : { a : T }

Γ ⊢ T :: K
-------------------------- (D-{::})
Γ ⊢ { A = T } : { A :: K }

Γ ⊢ d₁ : T₁     Γ ⊢ d₂ : T₂
--------------------------- (D-∧)
Γ ⊢ d₁ ∧ d₂ : T₁ ∧ T₂

Subtyping

---------- (T-Refl)
Γ ⊢ T <: T

Γ ⊢ S <: T    Γ ⊢ T <: U
------------------------ (T-Trans)
Γ ⊢ S <: U

Γ ⊢ T :: *
---------- (T-<:-⊤)
Γ ⊢ T <: ⊤

Γ ⊢ T :: *
---------- (T-⊥-<:)
Γ ⊢ ⊥ <: T

-------------- (T-∧-<:₁)
Γ ⊢ T ∧ U <: T

-------------- (T-∧-<:₂)
Γ ⊢ T ∧ U <: U

Γ ⊢ S <: T    Γ ⊢ S <: U
------------------------ (T-<:-∧)
Γ ⊢ S <: T ∧ U

Γ ⊢ S₂ <: S₁    Γ, x: S₂ ⊢ T₁ <: T₂
----------------------------------- (T-→-<:-→)
Γ ⊢ (x: S₁) → T₁ <: (x: S₂) → T₂

Γ ⊢ T₁ <: T₂
---------------------------- (T-{:}-<:-{:})
Γ ⊢ { a : T₁ } <: { a : T₂ }

Γ ⊢ K₁ <:: K₂
------------------------------ (T-{::}-<:-{::})
Γ ⊢ { A :: K₁ } <: { A :: K₂ }

Γ ⊢ S :: T₁..T₂
----------------- (T-..-<:)
Γ ⊢ T₁ <: S

Γ ⊢ S :: T₁..T₂
------------------ (T-<:-..)
Γ ⊢ S <: T₂

Kinding

---------- (K-⊤-F)
Γ ⊢ ⊤ :: *

---------- (K-⊥-F)
Γ ⊢ ⊥ :: *

Γ ⊢ T, U :: *
-------------- (K-∧-F)
Γ ⊢ T ∧ U :: *

Γ ⊢ S :: *    Γ, x: S ⊢ T :: *
------------------------------ (K-→-F)
Γ ⊢ (x: S) → T :: *

Γ ⊢ T :: *
------------------ (K-{:}-F)
Γ ⊢ { a : T } :: *

------------------- (K-{::}-F)
Γ ⊢ { A :: K } :: *

Γ, x: T ⊢ T :: *
---------------- (K-μ-F)
Γ ⊢ μ(x: T) :: *

Γ ⊢ T :: *
------------- (K-..-I)
Γ ⊢ T :: T..T

Γ ⊢ S :: *    Γ, x: S ⊢ T :: K
------------------------------ (K-→-I)
Γ ⊢ λx: S.T :: (x: S) → K

Γ ⊢ S :: (x: T) → K   Γ ⊢ p : T
------------------------------- (K-→-E₁)
Γ ⊢ S p :: K[x := p]

Γ ⊢ S :: T → K   Γ ⊢ t : T
-------------------------- (K-→-E₂)
Γ ⊢ S t :: K

Γ ⊢ p : { A :: K }
------------------ (K-Sel)
Γ ⊢ p.A :: K

Γ ⊢ T :: K₁    Γ ⊢ K₁ <:: K₂
---------------------------- (K-Sub)
Γ ⊢ T :: K₂

Sub-kinding

Γ ⊢ S₂ <: S₁    Γ ⊢ T₁ <: T₂
---------------------------- (K-..-<::-..)
Γ ⊢ S₁..T₁ <:: S₂..T₂

Γ ⊢ T₂ <: T₁    Γ, x: T₂ ⊢ K₁ <:: K₂
------------------------------------ (K-→-<::-→)
Γ ⊢ (x: T₁) → K₁ <:: (x: T₂) → K₂

A few points worth noting:

  • The selection subtyping rules from DOT are now admissible, e.g. we have

     Γ ⊢ p : { A :: S..T }
     --------------------- (K-Sel)
     Γ ⊢ p.A :: S..T
     --------------- (T-..-<:)
     Γ ⊢ S <: p.A
    
  • β-reduction in types is also admissible (as a subtyping rule), e.g. we have

     Γ ⊢ S :: *    Γ, x: S ⊢ T :: T..T
     --------------------------------- (K-→-I)
     Γ ⊢ λx: S. T :: (x: S) → T..T               Γ ⊢ p : S
     ----------------------------------------------------- (K-→-E₁)
     Γ ⊢ (λx: S. T) p :: (T..T)[x := p]
     ---------------------------------- (T-<:-..)
     Γ ⊢ (λx: S. T) p <: T[x := p]
    

    Note that this is not a new idea: Aspinall already noted in 1994 that β-reduction is admissible in systems with singleton types [1]. Type interval kinds can be understood as a generalization of singleton kinds [2]: clearly, every singleton {T}_* can be encoded as a singleton interval T..T. Most of the literature on the properties of systems with singleton types and kinds seems to use PER semantics [1, 2, 3], interpreting singleton types/kinds {T}_K as an equivalence class of types/kinds up to β(η)-equivalence.

  • Types in this system do not have normal forms in general. Consider e.g. the encoding of

     { type F[X] = F[F[X]] }
    

    which is expressible as

     new(z: { F :: (x: {A::*}) → z.F (z.F x) .. z.F (z.F x) })
         { F = λx:{A::*}. z.F (z.F x) }
    

    This is in contrast to the calculi with singleton kinds proposed by Stone and Harper [2], where type equivalence is shown to be decidable via comparison of WHNFs.

  • The introduction rule for interval kinds (over proper types)

     Γ ⊢ T :: *
     ------------- (K-..-I)
     Γ ⊢ T :: T..T
    

    corresponds roughly to Aspinall's {}-I for singleton types [1],

     Γ ⊢ M : A
     ------------- ({}-I)
     Γ ⊢ M : {M}_A
    

    and Stone and Harper's introduction rule for singleton kinds [2].

     Γ ⊢ A :: *
     -------------
     Γ ⊢ A :: S(A)
    
  • One might wonder why we don't give the "most precise" kinds to proper types in the first place, e.g. the kinding rule for the top type could be

     ------------- (K-⊤-F′)
     Γ ⊢ ⊤ :: ⊤..⊤
    

    At first glance, it seems like the original (K-⊤-F) rule could then be recovered via subsumption, since ⊤..⊤ <:: *. However, a derivation of this relationship already requires one for ⊤ :: * or ⊥ :: *, and so we end up with a circular dependency:

        ???
     ----------
     Γ ⊢ ⊤ :: *
     ---------- (T-⊥-<:)   ---------- (T-Refl)
     Γ ⊢ ⊥ <: ⊤            Γ ⊢ ⊤ <: ⊤
     -------------------------------- (K-..-<::-..)
     Γ ⊢ ⊤..⊤ <:: ⊥..⊤
    
  • It's questionable whether identifying * with ⊥..⊤ is a good idea. Alternatively, one could treat them separately and add sub-kinding rules to convert between them. At the very least one should have that ⊥..⊤ <:: *.

  • One could take the analogy to singleton types one step further and allow intervals A .._K B over kinds K other than *. E.g.

     (λx:A₁.B₁) .._(* → *) (λx:A₂.B₂)
    

    would be an interval over unary type operators. Note the kind annotation to make sure the bounds of the interval are of the same kind. This is closer in spirit to the way abstract, higher-kinded type members are handled in dotty, where one can write e.g.

     { type F <: List }
    

    (which the compiler η-expands to)

     { type F <: λX.List[X] }
    

    While it is unclear whether this would simplify or complicate the meta theory, it would likely make the existing theory from systems with singleton types more easily applicable in the DOT setting.

    Note that the more liberal, kind-annotated variant of intervals can be encoded in terms of the *-only variant. Again, this has already been noticed, for the case of singleton kinds, by Stone and Harper [2]:

     S .._*        T   =   S..T
     F .._(x: S)→K G   =   (x: S) → (F x) .._K (G x)
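
    For instance, instantiating the second equation at K = * and then applying the first yields

     F .._(x: S)→* G   =   (x: S) → (F x)..(G x)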
    
  • This proposal originally contained the following two extra subtyping rules:

     Γ, x: S ⊢ T₁ <: T₂
     -------------------------- (T-λ-<:-λ)
     Γ ⊢ λx: S. T₁ <: λx: S. T₂
    
     Γ ⊢ S₁ <: S₂
     ---------------- (T-App-<:-App)
     Γ ⊢ S₁ t <: S₂ t
    

    However, as Martin pointed out in a comment to his gist (https://gist.github.com/odersky/36aee4b7fe6716d1016ed37051caae95), these rules might not add any expressivity in practice. Consider the following example taken from his comment:

     type C(x) <: Seq(x)
    

    is encoded as

     { C :: (x: {A::*}) → ⊥..(Seq x) }
    

    Then

     p.C y :: ⊥..(Seq y)
    

    Hence

     p.C y <: Seq y
    

    Although this idea does not seem to extend to the case of partial applications, e.g.

     type Map[K, V] <: GenMap[K, V]
     Map Int  <:  GenMap Int   ?
    

    such partial applications can only occur nested inside other subtyping derivations but never directly in (T-Sub), (T-<:-..) or (T-..-<:) since partially applied types necessarily have arrow-kinds.

Unsoundness

The following example illustrates that type safety does not hold for the proposed calculus (neither for the above version, nor for Martin's original proposal).

let x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } } =
    new(x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } })
           { A =  { B :: x.A → ⊤..⊥ } } in
let y: { B :: x.A → ⊤..⊥ } =
    new(y: { B :: (z: x.A) → (z.B z)..(z.B z) })
           { B =  λz: x.A. z.B z } in
let z: y.B y = () in
(z: ⊥)

To see that this example typechecks, let's first verify that the declared types of x, y and z match the types of their definitions.

  • For x, we have that x.A → ⊤..⊥ is a kind (syntactically), so we immediately get { B :: x.A → ⊤..⊥ } :: * by (K-{::}-F). From there, we get the declared type by applying the usual sequence of (K-..-I), (D-{::}), (T-{}-I) and (T-Unfold).

  • For y, we need to show that (z: x.A) → (z.B z)..(z.B z) <:: (z: x.A) → ⊤..⊥ (note the absurd bounds on the codomain). Let Γ = x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } } and Γ′ = Γ, z: x.A, then by (T-Var) and (K-Sel)

     Γ′ ⊢ x.A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ }
    

    by (T-Var), (T-<:-..) and (T-Sub)

     Γ′ ⊢ z : { B :: x.A → ⊤..⊥ }
    

    by (K-Sel)

     Γ′ ⊢ z.B :: x.A → ⊤..⊥
    

    by (T-Var) and (K-→-E₂)

     Γ′ ⊢ z.B z :: ⊤..⊥
    

    by (T-..-<:) and (T-<:-..)

     Γ′ ⊢ ⊤ <: z.B z <: ⊥
    

    by (K-..-<::-..) and (K-→-<::-→)

     Γ ⊢ (z: x.A) → (z.B z)..(z.B z) <:: (z: x.A) → ⊤..⊥
    
  • For z, it's sufficient to show that ⊤ <: y.B y. Let Γ′′ = Γ, y: { B :: x.A → ⊤..⊥ }, then by (T-Var) and (K-Sel)

     Γ′′ ⊢ y.B :: x.A → ⊤..⊥
    

    by (T-Var), (K-Sel) and (T-..-<:)

     Γ′′ ⊢ { B :: x.A → ⊤..⊥ } <: x.A
    

    by (T-Var) and (T-Sub)

     Γ′′ ⊢ y : x.A
    

    by (T-Var) and (K-→-E₂)

     Γ′′ ⊢ y.B y :: ⊤..⊥
    

    by (T-..-<:) and (T-<:-..)

     Γ′′ ⊢ ⊤ <: y.B y <: ⊥
    

It follows that y.B y has absurd bounds in a context that corresponds to a store. Equivalently, we can substitute x and y by their definitions to obtain a closed type with absurd bounds. In either case, the type lattice becomes trivial.

The example is a variant of the well-known technique for typing arbitrary lambda terms in a system with (iso-)recursive types (see e.g. TAPL, ch. 20, p. 273). Here, we lift this technique to the kind level to allow kinding arbitrary lambda expressions on the type level. Then we construct a divergent type expression and kind it with absurd bounds.
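
For reference, the term-level version of that trick can be sketched in Scala (D and omega are illustrative names):

case class D(f: D => D)          // recursive type with D → D embedded in it
val omega: D => D = d => d.f(d)  // well-typed self-application
// omega(D(omega)) is well-typed but reduces to itself forever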

This unsoundness example makes crucial use of

  1. recursion in types (or rather in bounds of type members), and
  2. path-dependent type families.

The former is also present in Scala/dotty, whereas the latter is not. The example therefore does not constitute a soundness issue in those languages. Since this soundness issue is specific to HK DOT, it's worth elaborating a bit on this point. The different sorts of arrows in HK DOT (and Scala/dotty) can be categorized according to the sort of their domain and codomain:

  1. (possibly path-dependent) function type (domain: type, codomain: type), inhabited by term-level lambdas/functions/methods,

    example in DOT

    f : (x: S) → T         where S ≠ { A :: K }
    f = λx: S. t
    

    examples in Scala/dotty

    def f(x: S): T = t
    val g: (x: S) => T = (x: S) => t
    
  2. universal quantifier/polymorphic type (domain: kind, codomain: type), inhabited by polymorphic terms/functions

    examples in DOT

    f : (x: { A :: K }) → T
    f = λx: { A :: K }. t
    

    examples in Scala/dotty

    def f[X]: T = t
    def f[X[_]](y: ...): T = t
    
  3. kind of path-indexed type families (domain: type, codomain: kind), inhabited by type families or path-indexed singleton kinds

    examples in HK DOT

    F :: (x: S) → K         where S ≠ { A :: K′ }
    F =  λx: S. T
    

    alternatively, using Martin's variant

    F <: all(x: S)K         where S ≠ { A : K′ }
    F =  all(x: S)J               J singleton, J <: K
    

    there is no counterpart of this in Scala/dotty

  4. kind of type operators, i.e. higher kind (domain: kind, codomain: kind), inhabited by type operators/functions/lambdas

    examples in HK DOT

    F :: (x: { A :: K′ }) → K
    F =  λx: { A :: K′ }. T
    

    in Martin's variant

    F <: all(x: { A : K′ })K
    F =  all(x: { A : K′ })J      J singleton, J <: K
    

    example in Scala/dotty

    F[X]       >: S <: T
    F[X, Y, Z] >: S <: T
    F[X[_]]    >: S <: T
    F[X]       = T
    

In other words, the kind language of both variants of HK DOT is strictly more expressive than that of Scala/dotty, which suggests that the soundness issue could be avoided by restricting kinds as follows:

K ::= (x: { A :: K }) → K′         type operator kind
      S..T                         type interval kind

This restriction is purely syntactic and remains "compatible" with Scala/dotty in that any (higher-kinded) abstract type definition expressible in Scala can still be encoded in this subset.
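
For illustration, here is a sketch of Scala declarations that fall within the restricted grammar, with their kinds under the above encodings given in comments:

trait HK {
  type F[X]                // (x: {A :: *}) → ⊥..⊤
  type G[X] <: List[X]     // (x: {A :: *}) → ⊥..(List x)
  type H[X] = List[X]      // (x: {A :: *}) → (List x)..(List x)
}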

References

  1. D. Aspinall, Subtyping with Singleton Types, CSL '94, LNCS 933, 1995. http://dx.doi.org/10.1007/BFb0022243
  2. C. A. Stone, R. Harper, Deciding Type Equivalence in a Language with Singleton Kinds, POPL '00, 2000. http://doi.acm.org/10.1145/325694.325724
  3. C. A. Stone, R. Harper, Extensional Equivalence and Singleton Types, ACM TOCL 7(4), 2006. http://doi.acm.org/10.1145/1183278.1183281
  4. N. Amin, S. Grütter, M. Odersky, T. Rompf, S. Stucki, The Essence of Dependent Object Types, WadlerFest, LNCS 9600, 2016. http://dx.doi.org/10.1007/978-3-319-30936-1_14
  5. N. Amin et al., The DOT Calculus. Additional material including a detailed type safety proof with an accompanying Coq formalization. http://wadlerfest.namin.net/
@Blaisorblade commented Dec 4, 2016

I'm not sure I can ask many good questions. But I think the lack of ANF broke (RecSel), even though this is a side point.

While this system is not in ANF, when (RecSel) expands (new(x: T)d).a, it duplicates the newly created object. That should break if you add singleton types, but already here if T abstracts some type in d. It should be enough to take d = {A = Int; a = lambda y: x.A. (lambda z: y.A. z) y}, and T = {A : ⊥..⊤} to show this; after substituting the new object for x, the two occurrences of x.A should be unrelated, so we break preservation (though not soundness).

(new(x: T)d).a = (new(x: ({A : ⊥..⊤} ∧ {a : x.A → x.A})){A = Int; a = lambda y: x.A. (lambda z: y.A. z) y}).a —→ 
lambda y: ((new(x: T)d).a).A . (lambda z: ((new(x: T)d).a).A . z) y

I don't think there's a way out of ANF, even just for types. Take your example type F[X] = List[List[X]], which needs let-bindings to be encoded. If we replace List by a type constructor member that is abstract, like A above, we get the same problem, so we can conclude that even type-level let bindings can't be inlined.
In particular, let's replace List by Set: define trait SetFunctor { type Set[X] }, and have a factory method set(x: {X :: *}): SetFunctor = new (x: SetFunctor) {...}: set represents a generative ML functor. And you get let bindings that can't be reduced because their identity matters: type F[X] = let Y = Set[X] in Set[Y] becomes type F = lambda x: {A::*}. let y = set(x) in set(y).
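
A rough Scala sketch of this setup (the body of set is a placeholder):

trait TypeBox { type A }
trait SetFunctor { type Set[X] }
def set(x: TypeBox): SetFunctor =
  new SetFunctor { type Set[X] = List[X] }  // placeholder body
// Binding s1 = set(b) and s2 = set(b) for some b: TypeBox yields
// unrelated abstract members s1.Set and s2.Set: set is generative.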

BTW, typing such an F is not trivial, since y is not in scope in the conclusion. To emphasize the trouble, encode type F[X]=List[Set[X]]: assuming List is not abstract, I expect to end up with something like

type F[X] = {
  type SetX :: *
  type Z = List[SetX]
}

(I expect, in fact, that List is a nominal type encoded through abstraction, but that it still has smaller bounds than Set).

Below are a few more questions, though they might be less useful.

Types in this system do not have normal forms in general. [...]

  1. What happens if you normalize types using everything but expansion of recursion, and then compare the normal forms? That's the basic trick behind Amadio-Cardelli, which we scaled to F_ω^{μ*} (in the linked http://ps.informatik.uni-tuebingen.de/research/functors/equirecursion-fomega-popl16.pdf). That trick is rather subtle though: you lose as soon as you need to compare, in the "normal forms" you obtain, a fixpoint with a type variable.
  2. I'm probably missing something, but how do you actually expand this type multiple times to trigger this loop? It's not obvious I can via either subtyping or typing. Without subtyping for recursive types, it seems your example can't be expanded via subtyping, unlike for beta-reduction. We can use T-Unfold, but as usual it only works once.
    It seems I could form p.F (where p is your term, p = new(z: { F :: (x: {A::*}) → * }) { F = λx:{A::*}. z.F (z.F x) }), search for its kind (and use a bunch of rules to obtain a singleton kind using the definition) and then use subkinding.

Apart from normal forms:

  1. More boringly, I'm confused that (K-{::}-F) doesn't verify that K is well-scoped. Subtyping for (T-∧-<:₁) and (T-∧-<:₂) does that too, which surprises me, though that's as in the Wadlerfest paper.

  2. Thanks for finding better citations regarding singleton types/kinds and beta-reduction (I only knew this from Pure Subtype Systems).

Memo: Correct URLs for citation 1 and 2 are http://dx.doi.org/10.1007/BFb0022243 (or https://pdfs.semanticscholar.org/e4c2/0a4cd47fa27fff7bafb9f933c2f9a01126bb.pdf) and http://doi.acm.org/10.1145/325694.325724.

@Blaisorblade

Now let's get to the more fundamental problems.

  1. If you don't like models: in the first edition of PFPL, the discussion of singleton kinds points instead to hereditary substitution, which is purely syntactic (Crary, LFMTP 2009): http://www.cs.cmu.edu/afs/cs/user/crary/www/papers/2009/synsing.pdf

  2. TL;DR. What you pointed out, in fact, is just that lemmas on Γ ⊢# S :: ∀(y: T₂)U₁..U₂ should not try to run open terms. However, you should never need to — lemmas for such a judgement that need a store should also receive a "spine" of future arguments for S. When you have a store for Γ ⊢# S x :: K has U₁..U₂, you can take s(x) and use it to build the spine.

More in detail (sorry for any repetition):

[...]
even with an additional "is-arrow" judgment, we still eventually need to unzip the subkinding relationship at the top of the above derivation, where, crucially, the environment doesn't correspond to a store any longer (due to the additional y: T₂).

But the starting point was to have lemmas for Γ ⊢# S x :: K has U₁..U₂. If S was kinded as Γ ⊢# S :: ∀(y: T₂)U₁..U₂, then x :: T₂. Could we then substitute y by x, and take the value that x had in the store? That would complicate the lemmas on an is-arrow judgement, but maybe not that much: they'd have to quantify over pairs of stores and spines, where the spine must supply all arguments that will be needed by S. That might even work if the "arity" of S is not constant but depends on the arguments, though I hope that's impossible.

You should ask why I'm relating x and y. Motivation below.

This situation is reminiscent of the trouble caused by the presence of a subtyping rule for recursive types in DOT. However, unlike that rule, the (∀-<::-∀) subkinding rule is crucial in this calculus.

Both types seem to have a coinductive/step-indexed flavor in some sense. Sticking to S x: the typechecker can't run open terms, so it should not try to run their bodies "too much", just like a productive definition can't force too much of the input. Function values can be regarded as codata and can be observed only by supplying arguments. Since we use singleton types, (type-level) function types (and associated lemmas) can describe the entire behavior of the type-level function, so they also behave in a somewhat codata-ish way. That might be one reason why even Harper uses a model to describe singleton types, rather than pure syntax: they also need to have arguments at runtime. Crary (2009) uses hereditary substitution, which might have similar properties.

Equirecursive types are also codata, but it's not fully clear to me that interpretation would solve DOT's problems with equirecursion, so let's leave that aside for now. I do suspect coinduction might clarify what happens with equirecursive types.

@sstucki (Author) commented Dec 5, 2016

While this system is not in ANF, when (RecSel) expands (new(x: T)d).a, it duplicates the newly created object. That should break if you add singleton types, but already here if T abstracts some type in d.

I don't see how this would break things. Two types p.A and q.A are equal when p ≡ q syntactically (as witnessed by (Refl)). So in particular, for any value v and type x.A, since x.A <: x.A you also have v.A <: v.A after substituting occurrences of x by v in any (sub-)typing derivation.

Actually, there is an issue in your example because the body of a is ill-typed. We have

x: {A: ⊥..⊤}, y: x.A ⊢ λz: (y.A). z : (z: y.A) → y.A
x: {A: ⊥..⊤}, y: x.A ⊢ y            : x.A

but x.A ≠ y.A and the declared upper bound of x.A is ⊤, so we can't up-cast x.A to y.A. In fact, y.A seems ill-formed, so there's probably something I'm misunderstanding about this example.

I don't think there's a way out of ANF, even just for types. (...)

Again, I don't understand the problem with your example. Maybe it will become clearer once we've resolved the confusion about the first issue?

Types in this system do not have normal forms in general. [...]

  1. What happens if you normalize types using everything but expansion of recursion, and then compare the normal forms? That's the basic trick behind Amadio-Cardelli, which we scaled to F_ω^{μ*} (in the linked http://ps.informatik.uni-tuebingen.de/research/functors/equirecursion-fomega-popl16.pdf).

I haven't read your paper on F_ω^{μ*} yet but I assume that this requires reducing under fixpoint operators? In DOT, we cannot just reduce under μs because they might make absurd assumptions (remember that preservation does not hold for open terms and types in DOT).

Also, note that

  1. types in DOT are iso-recursive, not equi-recursive: although the fold/unfold operations don't appear in the syntax, they definitely appear in typing derivations;

  2. this problem is already present in Wadlerfest DOT, where we don't have higher kinds, and it doesn't cause trouble; consider

    new(z: { A :: z.A .. z.A }) { A = z.A }
    
  3. it may not be necessary to normalize types to show soundness -- in fact, I very much hope it isn't.

  1. I'm probably missing something, but how do you actually expand this type multiple times to trigger this loop? (...)

If I understand your question correctly, you're asking whether there are infinite sequences of subtyping steps for such types (corresponding to an infinite sequence of reductions)? Let p be the path from my example, i.e.

p = new(z: { F :: (x: {A::*}) → z.F (z.F x) .. z.F (z.F x) })
    { F = λx:{A::*}. z.F (z.F x) }

Note the very precise kind: my original example was actually ill-kinded! Now, let int = new(_:_){ A = Int } and consider p.F int (i.e. the encoding of F[Int]), then we have

⊢ p.F int :: p.F (p.F int) .. p.F (p.F int)

and so by (T-<:-..)

⊢ p.F int <: p.F (p.F int)

Similarly,

⊢ p.F (p.F int) :: p.F (p.F (p.F int)) .. p.F (p.F (p.F int))
-------------------------------------------------------------
⊢ p.F (p.F int) <: p.F (p.F (p.F int))

and so on.

I'm a bit worried that my observation about types not having normal forms is sending us on a wild-goose chase. I merely wanted to make the point that proof techniques for preservation that rely on normalizing types will probably not work in this case. Examples of such techniques seem common for Fω-like systems [4, 5, 6], so one might be tempted to apply them here.

More boringly, I'm confused that (K-{::}-F) doesn't verify that K is well-scoped. Subtyping for (T-∧-<:₁) and (T-∧-<:₂) does that too, which surprises me, though that's as in the Wadlerfest paper.

Yes, this is part design, part notational laziness. We usually do assume that terms, types and kinds in judgments Γ ⊢ J are all well-scoped in the sense that all variables present in J are also given a type in Γ. But this assumption is left implicit. On the other hand, we do not assume that types and kinds are well-formed, e.g. judgments like Γ ⊢ λx:Int. λy:(x.A). y : (x: Int) → x.A → x.A are derivable.

Memo: Correct URLs for citation 1 and 2 are (...)

Whooops, thanks for the corrections! :-)

If you don't like models: in the first edition of PFPL, the discussion of singleton kinds points instead to hereditary substitution, which is purely syntactic (...)

Yes, I found that paper as well. I'm quite happy that you mention hereditary substitution, actually, because I think it will be key to solving the issues with the meta theory that I described, though not in the obvious way. I'll come back to that below.

TL;DR. What you pointed out, in fact, is just that lemmas on Γ ⊢# S :: ∀(y: T₂)U₁..U₂ should not try to run open terms. However, you should never need to — lemmas for such a judgement that need a store should also receive a "spine" of future arguments for S. When you have a store for Γ ⊢# S x :: K has U₁..U₂, you can take s(x) and use it to build the spine.

That's not quite true. It is true that associating the arguments in the spine to the corresponding variables in open terms is part of the problem (or rather part of the solution) but that is only half the story. The crucial point is that the typings p₁ : T₁, p₂ : T₂, ..., pₙ : Tₙ of the arguments in the spine need not be "strict", i.e. they might make arbitrary use of subsumption. Consequently, the domain T of the kind (x: T) → U₁..U₂ of S need not have tight bounds. This is incompatible with tight subtyping Γ ⊢# S <: T because the latter only allows type member selection on (strictly-typed) variables with tight bounds:

Γ ⊢! x : {A: T..T}
------------------ (K!-Sel)
Γ ⊢! x.A :: T..T
---------------- (T#-<:-..)
Γ ⊢# x.A <: T

consequently, we cannot use tight subkinding in the second premise of the tight version of (K-→-<::-→):

Γ ⊢# T₂ <: T₁    Γ, x: T₂ ⊢ K₁ <:: K₂
------------------------------------- (K#-→-<::-→)
Γ ⊢# (x: T₁) → K₁ <:: (x: T₂) → K₂

because there is no guarantee that T₂ has tight bounds (it most likely doesn't). As a consequence, we need to show that the following substitution lemma (or a generalization thereof) is admissible:

Γ, x: T₂ ⊢ K₁ <:: K₂    Γ ⊢# p : T₂    s ~ Γ
--------------------------------------------
Γ ⊢# K₁[x := p] <:: K₂[x := p]

Note that the conclusion uses tight sub-kinding, whereas the first premise does not. So in order to prove this substitution lemma, we need to first show that the non-tight variants of (T-<:-..), (T-..-<:) are admissible in tight subkinding derivations, which is exactly what we set out to prove in the first place.

[...]
even with an additional "is-arrow" judgment, we still eventually need to unzip the subkinding relationship at the top of the above derivation, where, crucially, the environment doesn't correspond to a store any longer (due to the additional y: T₂).

But the starting point was to have lemmas for Γ ⊢# S x :: K has U₁..U₂. If S was kinded as Γ ⊢# S :: ∀(y: T₂)U₁..U₂, then x :: T₂. Could we then substitute y by x, and take the value that x had in the store?

Again, the problem is not so much the value of x in the store as its typing: a derivation of Γ ⊢# x : T₂ may make use of subsumption, which means Γ, y: T₂ does not correspond to a store s[y := x]. The definition of the "corresponds to a store" relation s ~ Γ requires all store locations to be typed strictly.

(...) Crary (2009) uses hereditary substitution, which might have similar properties.

I think what we need to solve this problem is something along the lines of hereditary substitution. Though, maybe surprisingly, not to normalize type expressions but to "normalize" subtyping derivations. Conceptually, hereditary substitution re-normalizes terms after substitutions in order to eliminate any new redexes or "cuts" that might have been created. But this principle can be applied more generally in situations where one wants to maintain "cut-free" representations of derivations in the face of substitution. An example is algorithmic (sub-)typing for full F-<: as described in TAPL (ch. 28, p. 424): in this setting, uses of subsumption and transitivity correspond to cuts/redexes and are "normalized" or "evaluated" away by the proof of their admissibility in algorithmic subtyping derivations. To do so, one needs to simultaneously show that both transitivity and context narrowing are admissible in order to contract uses of the (SA-All) rule that relates ∀-types. The proof of context narrowing essentially performs a substitution of a subtyping derivation for a bounded variable, followed by re-normalization of the result via transitivity. This pair of mutually recursive proofs thus has a structure reminiscent of that of hereditary substitution.

I believe that the same principle applies in this context although with more complicated "cuts" and rather non-standard "reductions". Essentially, the idea is to push uses of structural subtyping/-kinding rules as far down the derivation tree as possible (i.e. closer to the conclusion). But importantly, I'm not proposing that we try to normalize types themselves or eliminate all uses of transitivity. And even more importantly, we should normalize (sub-)typing derivations under non-store contexts.

For example, the following typing derivation

                         Γ ⊢ T₁ <: S₁    Γ ⊢ S₂ <: T₂
---------------------    ----------------------------------
Γ ⊢ x : { A: S₁..S₂ }    Γ ⊢ { A: S₁..S₂ } <: { A: T₁..T₂ }
-----------------------------------------------------------
Γ ⊢ x.A :: T₁..T₂
-----------------
Γ ⊢ x.A <: T₂

where Γ(x) = { A: S₁..S₂ } with S₁ ≠ S₂, should be normalized as

Γ ⊢ x : { A: S₁..S₂ }
---------------------
Γ ⊢ x.A :: S₁..S₂
-----------------
Γ ⊢ x.A <: S₂        Γ ⊢ S₂ <: T₂
---------------------------------
Γ ⊢ x.A <: T₂

Note that a) the context Γ does not correspond to a store, and b) that the "normalized" derivation contains a use of transitivity that cannot be eliminated.

The goal is to find a "normal form" for subtyping derivations that eliminates all uses of transitivity and non-structural subtyping rules in the empty context resp. in contexts that correspond to stores (i.e. for subtyping closed terms), so that preservation and canonical forms become essentially a matter of inverting structural (sub-)typing in such contexts.

Of course, all of this hinges on there actually being a suitable termination measure for this edifice of mutually recursive normalization/admissibility proofs, but that's where I believe thinking in terms of hereditary substitution may help.

I admit there are a lot of "if"s and "but"s and it's all still rather vague and abstract at this point. But at least it's a starting point...

Equirecursive types are also codata, but it's not fully clear to me that interpretation would solve DOT's problems with equirecursion, so let's leave that aside for now.

As pointed out above, DOT doesn't use equirecursion: folding and unfolding is achieved through typing rules, not through subtyping rules.

References

  4. B. C. Pierce, M. Steffen, Higher-Order Subtyping, TCS 176(1-2), 1997. http://dx.doi.org/10.1016/S0304-3975(96)00096-5
  5. M. Steffen, Polarized Higher-Order Subtyping, PhD thesis, Universität Erlangen-Nürnberg, 1998.
  6. A. Abel, Polarized Subtyping for Sized Types, MSCS 18(5), 2008. http://dx.doi.org/10.1017/S0960129508006853

@Blaisorblade commented Dec 5, 2016

Let me answer for now just on (RecSel). I have some other comments, but that's where I understand things best.

I don't see how this would break things. Two types p.A and q.A are equal when p ≡ q syntactically (as witnessed by (Refl)).

Ah, very interesting—(Refl) itself is "problematic" then. Do you model the fact that type selection, in Scala and other calculi, depends on object identity not object equality? It seems not. Object identity can be modeled in small-step semantics without ANF, so why does this calculus not do that? That might be an acceptable simplifying assumption, maybe even a good one, but at least ML module people care a lot about the difference (and dealing with it properly tends to be pretty subtle), and I suspect Martin would care as well. In ML, this shows up in the semantics of functors (generative vs applicative). And at least, you'd have to be pretty explicit on this change.

Just to be clear: In Scala, copying an expression that creates an object will create two distinct objects, whose abstract type members are treated as distinct. Otherwise you don't have path-dependent types but go into proper dependent types.

That's the essence of the question I was asking, though apparently this affects even the reflexivity rule. You're right about the example I used; I should have used x.A instead of y.A. However, as long as you have (Refl), that example will not break preservation.


Concretely, in Scala, and the νObj, Wadlerfest and OOPSLA'16 calculi, v1.T = v2.T is only true (if T is abstract) if v1 and v2 are not just equal objects, but the same object. That's enforced either through ANF or (in OOPSLA'16) by having rule (SelX) rather than reflexivity. (SelX) restricts reflexivity to x.T <: x.T, not v.T.

So in Scala something like this wouldn't typecheck:

(x: (new AnyRef { type T }).T) => (x: (new AnyRef { type T }).T)

In fact, that doesn't parse in Scala, but I have a semi-convincing encoding of something similar below. Alternatively, the broken example exhibits the same property more properly, since there x.T is defined to be Int but then the definition is hidden.

Alternatively, object creation is not treated as a pure expression but as an expression with side effects (where the side effect is "creating a new object"), so that syntactically equal expressions don't produce equal values. In fact, to model properly the fact that new(x: T)d has side effects, and to keep allowing unrestricted inlining of values, most would not treat new(x: T)d as a value (the generated memory addresses, on the other hand, are fine as values).

Scala encoding of example above:

scala> val v = new AnyRef { type T }
v: AnyRef{type T} = $anon$1@379619aa

scala> (x: v.T) => (x: v.T)
res0: v.T => v.T = <function1>

scala> val v2 = v
v2: AnyRef{type T} = $anon$1@379619aa

scala> (x: v.T) => (x: v2.T)
<console>:14: error: type mismatch;
found   : x.type (with underlying type v.T)
required: v2.T
       (x: v.T) => (x: v2.T)
                    ^

@Blaisorblade commented Dec 5, 2016

Regarding redundant rules, type C(x) <: Seq(x) and type Map[K, V] <: GenMap[K, V]; Map Int <: GenMap Int ?, could one extend (T-..-<:) to higher kinds? That's strictly a question about expressiveness, and it somehow fits with using singleton kinds to make beta-reduction admissible.

A potential concern, both there and in beta-reduction (or maybe only here), is that since arguments are not types but full objects, in general asking questions on partially applied types might be similar to reducing under binders. Beta-reduction is safer since it has an argument, but would it be sound to have (T-λ-<:-λ) or to make it admissible? Imagine S has bad bounds; that rule asks whether Γ, x: S ⊢ T₁ <: T₂, so T₁ <: T₂ might well be an absurd conclusion. Maybe that's sound because to exploit this absurdity we'd have to apply λx: S. T₁ to a non-existent argument for x? Would that break if we extend (T-..-<:) to higher kinds?

In fact, I've just seen, in Harper's PFPL, that usual systems can encode singletons for functions just by having singletons for kind *, which is pretty cool (though the relevant proofs are subtle). It's probably the same as what you do here too (and what Aspinall pointed out), but Harper's presentation is insightful and clear. But that presentation does have a generalized version of (T-..-<:) and (T-<:-..) as an admissible rule — I mean the theorem at the end of Harper's chapter on singleton kinds (chapter 24 or 43, depending on the edition).

Edit: the theorem is
Theorem 24.1. If Δ ⊢ c :: κ, then Δ ⊢ S(c::κ) :<: κ and Δ ⊢ c :: S(c::κ).

where S(c::κ) is a function that encodes higher singletons in terms of singletons at kind * (by "eta-expansion" so to say).

@sstucki (Author) commented Dec 6, 2016

Ah, very interesting—(Refl) itself is "problematic" then. Do you model the fact that type selection, in Scala and other calculi, depends on object identity not object equality? It seems not.

No, this calculus does not distinguish between object identity and object equality. BUT, the answer to your question really depends on what you mean by "object equality" and "object identity". In particular, types in this calculus are still path-dependent (as opposed to fully dependent) and the distinction between applicative and generative functors does manifest itself. I will go into some detail below as to why I think this is the case. But first, I'd like to clarify the essential difference between this calculus (or rather the first-order fragment) and that described in the Wadlerfest paper (WF-DOT).

Unlike WF-DOT, this calculus follows the tradition of typed lambda calculi (and various extensions thereof) in that it is referentially transparent in the sense that any definition can be replaced by its value (via substitution) without changing the overall result of the program. In STLC, for example, the two programs (λx:T. f x x) λz:U.z and f (λz:U.z) (λz:U.z) yield the same result, and nobody would argue that x is a store location initialized by λz:U.z or that, in copying the expression λz:U.z twice, we have somehow created distinct values. Expressions (and values in particular) are considered identical if they are syntactically equal. Depending on the calculus in question, this might be further relaxed to identify β/η-equivalent expressions. Whether or not this is an appropriate "design decision" for a calculus that deals with "objects" is a different question, which I'm not going to address here (though I will argue that it has some practical advantages below). Be that as it may, a direct consequence of this "design decision" is that the path language (represented by the syntactic category p) has to be extended to also contain values (denoted by v). This is because paths can appear in types in the form of type member selections p.A: if x.A is a valid type expression, and we are serious about referential transparency, then x.A[x := v] = v.A must be a valid type expression as well.

In other words, the problem (if any) is not with the (Refl) rule but with the choice to include values in the path language.

Note that despite the added flexibility, important restrictions remain.

  1. Paths are still a subset of terms, in particular, (u v).A is not a path, nor is v.a.A. As a consequence, types remain path-dependent (albeit for a more relaxed notion of "paths").

  2. A pair of type selections v.A and u.A are comparable only if the values v and u are syntactically equal; no β/η-reductions are allowed: ({a = λx.(λy.y)x} ∧ {A = ⊤}).A and ({a = λx.x} ∧ {A = ⊤}).A are different.

Object identity can be modeled in small-step semantics without ANF, so why does this calculus not do that? That might be an acceptable simplifying assumption, maybe even a good one, but at least ML module people care a lot about the difference (and dealing with it properly tends to be pretty subtle), and I suspect Martin would care as well.

Indeed, he does. I have had many discussions with Martin about whether this is the right approach to formalizing Scala, and at present, we mostly agree that it essentially depends on how the calculus is to be used.

If you're interested in modeling Scala closely -- what happens in the compiler or at runtime -- then a WF-DOT, ANF-based, small-step formalization or an environment-based, OOPSLA'16-style, big-step formalization is probably what you want. In these calculi, variable names correspond to store/environment locations reflecting the familiar distinction between object identity and equality.

If you're interested in the more logical/type-theoretic aspects of the calculus, and about relating it (and its meta theory) to other typed lambda calculi, then a small-step formalization permitting substitutions in terms as well as in types is probably better suited. For example, developing a proof that an ANF-based WF-DOT simulates a substitution-based F-<: is a pain (in fact we don't have any such proof), whereas proving the simulation for a substitution-based variant of DOT is almost trivial (Nada mechanized it in Coq a while back).

In the end, I don't think either of the two approaches yields fundamental insights that cannot be gained from the other. On the contrary, I believe that switching between the two perspectives is more likely to reveal which features of the type system (and its meta theory) are accidental and which reflect deeper principles.

In ML, this shows up in the semantics of functors (generative vs applicative). And at least, you'd have to be pretty explicit on this change.

Functors remain generative in DOT, irrespective of the formalization. Remember that types are still path-dependent, so expressions of the form (f x).A are not allowed: f x needs to be let-bound, at which point different instances can no longer be identified. Consider

let f: μ(z: {A :: ⊥..⊤; a: z.A}) → μ(z: {B :: ⊥..⊤; b: z.B}) =
    λx: μ(z: {A :: ⊥..⊤; a: z.A}). new(z:_){B = x.A; b = x.a} in
let m: μ(z: {A :: ⊥..⊤; a: z.A}) = new(z:_){A = Int; a = 1} in
let x = f m in
let y = f m in
let z: x.B = x.b in   // Works.
let w: x.B = y.b      // Doesn't typecheck: x.B ≠ y.B
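
For comparison, a rough Scala analogue of this example (illustrative names; Scala rejects the last line for the same reason):

trait M { type A; val a: A }
trait N { type B; val b: B }
def f(x: M): N = new N { type B = x.A; val b: B = x.a }
val m = new M { type A = Int; val a: A = 1 }
val x = f(m)
val y = f(m)
val z: x.B = x.b      // Works.
val w: x.B = y.b      // Doesn't compile: x.B ≠ y.B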

In fact, this simpler example already fails in both calculi:

let m: μ(z: {A :: ⊥..⊤; a: z.A}) = new(z:_){A = Int; a = 1} in
let x = m in
let y: x.A = x.a in   // Works.
let z: x.A = m.a      // Doesn't typecheck: x.A ≠ m.A

In fact, even if we changed the last line to

let y: x.A = (new(z:_){A = Int; a = 1}).a  // Doesn't typecheck either.

it would still not typecheck. Only if we consistently replace x by its value do we get a typable program:

let y: (new(z:_){A = Int; a = 1}).A = (new(z:_){A = Int; a = 1}).a

I don't want to belabor this point unnecessarily, so I'm going to skip the remainder of your comments about object identity in Scala and other DOT versions vs. this one. But please let me know if any of the above remains unclear or dubious to you.

Regarding redundant rules, type C(x) <: Seq(x) and type Map[K, V] <: GenMap[K, V]; Map Int <: GenMap Int ?, could one extend (T-..-<:) to higher kinds? That's strictly a question about expressiveness, and it somehow fits with using singleton kinds to make beta-reduction admissible.

The extension of (T-..-<:) and (T-<:-..) to arrow-kinds is possible in principle but it requires the use of intervals over kinds other than * -- though these are readily encodable (as explained in the gist).

A more interesting question, IMO, is how you would ever use the fact that Map Int <: GenMap Int, other than to show that Map Int T <: GenMap Int T, which is already covered by the current (T-..-<:) and (T-<:-..) rules.

A potential concern, both there and in beta-reduction (or maybe only here), is that since arguments are not types but full objects, in general asking questions on partially applied types might be similar to reducing under binders. Beta-reduction is safer since it has an argument, but would it be sound to have (T-λ-<:-λ) or to make it admissible? Imagine S has bad bounds; that rule asks whether Γ, x: S ⊢ T₁ <: T₂, so T₁ <: T₂ might well be an absurd conclusion. Maybe that's sound because to exploit this absurdity we'd have to apply λx: S. T₁ to a non-existent argument for x? Would that break if we extend (T-..-<:) to higher kinds?

Yes, indeed, the same holds for the (K-→-<::-→) and (T-→-<:-→) rules, which, in combination with singleton types, already allow "reductions" under a binder. You are right that the resulting subtyping/-kinding relations can be absurd, but they do not constitute a soundness issue since they cannot escape the binder.

In fact, I've just seen, in Harper's PFPL, that usual systems can encode singletons for functions just by having singletons for kind *, which is pretty cool (though the relevant proofs are subtle). It's probably the same as what you do here too (and what Aspinall pointed out), but Harper's presentation is insightful and clear.

Yes, as mentioned in the gist, Harper and Stone already describe this in their papers on singleton types and kinds [2, 3], though I have not read the relevant chapter in PFPL. I should probably have a look at that.

@Blaisorblade

No, this calculus does not distinguish between object identity and object equality.

Agreed then. I assumed you'd know this but was surprised you didn't mention it—but Martin knew what you meant by "not using ANF".

In other words, the problem (if any) is not with the (Refl) rule but with the choice to include values in the path language.

Oh. I thought the OOPSLA'16 calculus had values in the path language without full (Refl), but I was confused.

Let me try to focus my speculations on real problems then: Map Int <: GenMap Int and hereditary substitution.

A more interesting question, IMO, is how you would ever use the fact that Map Int <: GenMap Int, other than to show that Map Int T <: GenMap Int T, which is already covered by the current (T-..-<:) and (T-<:-..) rules.

Well, what if I try to encode def f[K <: GenMap Int] = ...; f[Map Int]? In fact, it seems I currently won't need Map Int <: GenMap Int, but Map Int <: λ x :: *. GenMap Int x, and inspecting your subtyping rules suggests that's unprovable. No rule is applicable except transitivity (which defers the problem) or (T-<:-..) and (T-..-<:) (which don't apply at kind * → *).
The reason is that encoding K <: GenMap Int requires an interval of non-* kind, which must currently be encoded using eta-expansion. However, type definitions (in f's argument) need not be eta-expanded. Maybe (D-{::}) should be modified to force eta-expanded definitions, like (apparently) Dotty seems to do internally? Or maybe the rules should just be extended to handle this case somehow—right now, it seems I can't even prove F <: λ x :: *. F x or vice versa.

BTW, an algorithmic check for Map Int <: λ x. GenMap Int x might need kind-directed eta-expansion, which is sort-of obvious but I've never heard of it. I have seen eta-expansion for equivalence checking (ATTAPL Ch. 6), but not for subtyping (though it can arise the same way). Of course google disagrees and points me to https://www.researchgate.net/publication/221557946_Syntactic_Metatheory_of_Higher-Order_Subtyping (which also uses HSubst, though on types, and mentions the barrier below).

==

Though, maybe surprisingly, not to normalize type expressions but to "normalize" subtyping derivations. [...]

Of course, all of this hinges on there actually being a suitable termination measure for this edifice of mutually recursive normalization/admissibility proofs, but that's where I believe thinking in terms of hereditary substitution may help.

I admit there are a lot of "if"s and "but"s and it's all still rather vague and abstract at this point. But at least it's a starting point...

Great, you have a research problem and an attack!

I wondered about step-indexing here: it would allow the mutual recursion to proceed, but it wouldn't ensure the whole process terminates, so it's clearly not enough by itself...

Hereditary substitution (HSubst) might be a cool inspiration. This suggests maybe the best idea of this comment: have you tried adding proof terms to the subtyping rules, and phrasing reduction rules for those? That's maybe just notation, but it might help all those that get beta-reduction better than cut-elimination. (In fact, we even get to look informally at the "untyped" reduction rules if we want, though we probably shouldn't formalize those). The result might be of interest beyond Scala!

To this end, though, you might need to internalize dependency into assumptions using a term-level lambda, so that you can exhibit (λ x: S. < proof term > : T <: U): (∀ x: S. T <: U). But that would probably complicate things...

Then, can you explain the "leftover" transitivity/subsumption in a way similar to variables (and spines) in neutral terms?

Back to hereditary substitution: It does prove (weak? strong?) normalization for STLC ("Hereditary Substitutions for Simple Types, Formalized" details how) without logical relations.

But HSubst has limited power, though it might be enough: since (IIUC) hereditary substitution can be formalized in Peano Arithmetic, and since it proves strong normalization (hence, with a few provisos, consistency) of the object language, hereditary substitution alone can't work for type systems that extend Peano Arithmetic, such as System T or System F (which proves second-order Peano Arithmetic consistent). I have no clue where the system you have sits, but maybe it's too powerful? But subtyping proofs don't have lambdas and subtyping lacks internalized implication, and maybe it can't be encoded suitably and this is all irrelevant.

There you need reducibility candidates or logical relations, which go beyond Peano Arithmetic. This point is made in "Proofs and Types", in the intros of Chapters 6 and 15, though I don't fully get all those intros. I've just seen it again mentioned in http://www.twelf.org/slr/.
Proofs and Types also doesn't actually mention hereditary substitution (it hints at "proof theoretic techniques" and I guess that's what they mean) or logical relations (but it mentions Tait '67).

If the barrier applies, it would suggest considering logical relations for transitivity elimination, but I've never heard of those. Parroting termination of STLC/System F, they'd be like "this subtyping proof is itself normalizing, and even the result of applying it to an assumption normalizes". What's confusing is that the property, it seems, would not be defined by induction on some "proof type", since you don't have internal abstraction on assumptions (unless you add lambdas, as sketched above), but by induction on the context. That's not novel though: logical relations are defined on closed terms usually, but they can be extended to open ones, and PFPL's chapter on "Equational Reasoning for T" (either edition) does exactly that, by defining open logical equivalence.

@sstucki (Author) commented Jan 24, 2017

TLDR:

  1. HK DOT is unsafe (though there might be a fix),
  2. even with the fix, HSubst is (probably) too weak to prove safety.

A slightly more comprehensive summary with links to the relevant sections:

  1. Subtyping partial type applications.
  2. Proof terms for subtyping derivations, why we should expect them to be normalizing, and how this all relates to inversion, canonical forms and type safety.
  3. The soundness/safety counter example revisited: an illustration of how non-terminating coercion terms break safety.
  4. Why impredicativity of HK DOT spells trouble for a HSubst-based proof.
  5. Some thoughts on a proof via logical relations.
  6. An outlook on future work.

But let's start from the beginning...

No, this calculus does not distinguish between object identity and object equality.

Agreed then. I assumed you'd know this but was surprised you didn't mention it—but Martin knew what you meant by "not using ANF".

Ah yes, that was a rather imprecise distinction indeed, good point. I'll make a mental note to be more clear about this point in future write-ups.

BTW: I recently re-read parts of the TAPL chapter on Featherweight Java (ch. 19, p. 247), while preparing to TA an exercise on FJ, and realized that it uses the same type of substitution-based semantics as I do above. In particular, there is also no distinction between object identity and object equality: the reduction rules for field selection and method invocation operate directly on objects.

1. Subtyping partial type applications

Let me try to focus my speculations on real problems then: Map Int <: GenMap Int and hereditary substitution.

A more interesting question, IMO, is how you would ever use the fact that Map Int <: GenMap Int, other than to show that Map Int T <: GenMap Int T, which is already covered by the current (T-..-<:) and (T-<:-..) rules.

Well, what if I try to encode def f[K <: GenMap Int] = ...; f[Map Int]?

Yes, initially I was worried as well that partially applied types might appear in bounds of polymorphic functions, and thus, via transitivity, anywhere in subtyping derivations. But once you de-sugar the above signature into DOT, the partial application disappears. In DOT, the signature of f would be

(k: {A :: (x: *) → ⊥..(p.GenMap int x)}) → T

where A is a dummy label, p is the module containing the abstract type GenMap, and int is a "boxed" variant of Int. Note the "η-expansion" of GenMap! This is necessary because

  1. type parameters have to be "boxed" and their upper/lower bounds have to be translated into upper/lower bounds of the corresponding abstract type (here A),
  2. the calculus only supports type intervals over proper types, so type intervals over type operators have to be de-sugared as described above (note that the definition of S .._K T performs an η-expansion so higher-order type bounds are effectively "η-long").

As a consequence, every occurrence of a type operator in a bound is necessarily fully applied in the kind corresponding to the "type box".
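
For intuition, here is a rough Scala-level rendering of the same point (a sketch only; GenMap, Map, f and the object name are illustrative, and I use the usual Scala 2 type-lambda encoding):

trait GenMap[A, B]
trait Map[A, B] extends GenMap[A, B]  // shadows scala.Predef.Map locally

object PartialApplicationSketch {
  // The bound on K is the eta-expanded operator X => GenMap[Int, X],
  // so K only ever occurs fully applied (as K[X]) inside the bound.
  def f[K[X] <: GenMap[Int, X]]: Unit = ()

  // Instantiating K with "Map Int" via a type lambda:
  def use: Unit = f[({ type L[X] = Map[Int, X] })#L]
}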

The reason is that encoding K <: GenMap Int requires an interval of non-* kind, which must be currently encoded using eta-expansion.

Exactly.

However, type definitions (in f's argument) need not be eta-expanded. Maybe (D-{::}) should be modified and force eta-expanded definitions, like (apparently) Dotty seems to do internally?

I think you're on to something there. We want all (type-)definitions to have singleton kinds, so the typing rules above should be adjusted (Martin's original proposal actually enforced this restriction). There are good meta-theoretic reasons for requiring definitions to have singleton kinds, and incidentally, this would force definitions to be η-expanded. Consider the following definition

App[F[_]] = F

which we can either interpret as the identity operation on unary type operators or as a "curried" variant of type application, i.e. App[F][X] = F[X]. This type definition is actually not valid in Scala (I don't know whether it is in dotty). Instead, one has to η-expand and un-curry manually, like so:

App[F[_], X] = F[X]

Now in HK DOT (as outlined above) one may encode the former as

App :: {F :: {A :: *} → *} → {A :: *} → *
App =  λx: {F :: {A :: *} → *}. x.F

Note that the kind of App is certainly not a singleton, so according to Martin's original proposal, this wouldn't be valid. On the other hand, the η-expanded version can be given a singleton kind:

App :: (x: {F :: {A :: *} → *}) → (y: {A :: *}) → (x.F y)..(x.F y)
App =  λx: {F :: {A :: *} → *}. λy: {A :: *}. x.F y

I believe this generalizes to arbitrary type operators.

To summarize, enforcing singleton kinds for all type definitions ensures that all such definitions are η-long, which is also in accordance with Scala.
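
As a quick Scala-level sanity check (a sketch; the commented-out line is the rejected, non-η-expanded alias):

object EtaLongSketch {
  // type App[F[_]] = F          // rejected: F is not a proper type here
  type App[F[_], X] = F[X]       // eta-expanded and uncurried: accepted

  // App List Int is definitionally List Int:
  implicitly[App[List, Int] =:= List[Int]]
}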

Or maybe the rules should just be extended to handle this case somehow—right now, it seems I can't even prove F <: λ x :: *. F x or viceversa.

No, η-conversion is not admissible. But η-conversion only makes sense for type operators, and following the same reasoning as above, you should never need to compare type operators that are not fully applied. Note that fully applied variants of your example operators are comparable.

Γ ⊢ λ x :: *. F x :: (x :: *) → (F x)..(F x)    ...
--------------------------------------------------- (K-→-E)
Γ ⊢ (λ x :: *. F x) p :: (F p)..(F p)
------------------------------------- (T-..-<:)
Γ ⊢ F p <: (λ x :: *. F x) p

BTW, an algorithm for Map Int <: λ x. GenMap Int x might need kind-directed eta-expansion, which is sort-of obvious but I've never heard of it. I have seen eta-expansion for equivalence checking (ATTAPL Ch. 6), but not for subtyping (though it can arise the same way). Of course Google disagrees and points me to https://www.researchgate.net/publication/221557946_Syntactic_Metatheory_of_Higher-Order_Subtyping (which also uses HSubst, though on types, and mentions the barrier below).

Indeed. That's a nice paper BTW (I guess most of Abel's papers are). His paper on polarized subtyping, which I mentioned earlier [6], also makes essential use of hereditary substitution (though he doesn't mention it explicitly, IIRC).

2. Proof terms for subtyping derivations and normalization

Hereditary substitution (HSubst) might be a cool inspiration. This suggests maybe the best idea of this comment: have you tried adding proof terms to the subtyping rules, and phrasing reduction rules for those?

I certainly have!

That's maybe just notation, but it might help all those that get beta-reduction better than cut-elimination. (In fact, we even get to look informally at the "untyped" reduction rules if we want, though we probably shouldn't formalize those). The result might be of interest beyond Scala!

I agree that this is of more general interest, and I think that we should formalize the untyped reduction rules! Firstly, they provide a sort of formal specification for the normalization proof we're looking for, and secondly, we can prove a variant of the type safety theorems for the new calculus. Namely

  1. that reductions of terms (and the coercions they contain) preserve types, and
  2. that closed, well-typed (coercion) terms can either be reduced or are values.

Intuitively, a coercion value is a coercion that is closed and normal/canonical, and that therefore corresponds to a subtyping derivation that can be inverted. Indeed, this latter point can be made formal by a handful of corresponding canonical forms lemmas (for coercions). I have sketched such a type safety proof for a variant of the system above (without recursion or records) and there don't seem to be any major difficulties. I believe that extending the proof to the full system would increase the technical complexity but wouldn't introduce any new conceptual difficulties.

These type-safety theorems do not, of course, guarantee that type safety holds for the original calculus, because coercion terms may be non-normalizing, in which case either a) the calculus is unsafe (see example below) or b) our reduction strategy is flawed. In either case, the formalization allows us to make the inherent "dynamics" of the normalization proof explicit, and — as you already mentioned — should give some insights about the proof-theoretic strength of the logic corresponding to the subtyping fragment.

Finally, it allows us to interpret the final type safety proof as a proof by (weak) simulation: the extended calculus (with explicit coercions) weakly simulates the original calculus by interleaving finite sequences of "silent" coercion reductions with the reductions of the "bare" terms in the original calculus. The simulation relation is simply the function that maps terms in the extended calculus to their bare counterparts by erasing all coercions.
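
Schematically (this is just my rendering of the previous paragraph): coercion reductions are silent and erase to nothing, while all other reductions erase to reductions of the bare terms,

t —→ t′ (silent coercion step)    implies    ⌞t⌟ = ⌞t′⌟
t —→ t′ (any other step)          implies    ⌞t⌟ —→ ⌞t′⌟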

To this end, though, you might need to internalize dependency into assumptions using a term-level lambda, so that you can exhibit (λ x: S. < proof term > : T <: U): (∀ x: S. T <: U). But that would probably complicate things...

Then, can you explain the "leftover" transitivity/subsumption in a way similar to variables (and spines) in neutral terms?

I don't understand your example. Could you elaborate a bit? How would you use a term with this particular signature?

It is definitely true though that the coercion language does need to internalize dependencies. E.g. the language I introduce below does have binders and reduction rules for eliminating them.

There is also plenty of literature where subtyping rules in some language are elaborated into explicit coercion terms in a subtyping-free fragment of the same language. But this isn't what we want here, because it would blur the clear separation between the coercion language, which we need to be normalizing, and the original term language, which might not be normalizing (it certainly isn't here). Again, I'm going to say a bit more about this below.

Back to hereditary substitution: It does prove (weak? strong?) normalization for STLC ("Hereditary Substitutions for Simple Types, Formalized" details how) without logical relations.

Another very nice paper (and a source of inspiration for my attempt to use HSubst here). BTW, are you aware of a similar paper that handles STLC with products?

But HSubst has limited power, (...), hereditary substitution alone can't work for type systems that extend Peano Arithmetic, such as System T, or System F (...). I have no clue where the system you have sits, but maybe it's too powerful?

Indeed, the expressiveness/proof-theoretic power of the HK DOT system above (and in particular the (sub-)typing fragment) is too great for HSubst, as exemplified by the above safety counter example — in fact, this is how I found the counter example in the first place. But there are even simpler examples of why HSubst doesn't work "out of the box" here, and these apply even to the "fixed" version of HK DOT where the safety counter example doesn't apply. I'll get back to this later — first, let's have a look at the counter example to get a feeling for what a non-terminating subtyping coercion looks like!

3. Non-termination of the soundness/safety counter example

Since path-indexed type families (plus recursion) are powerful enough to type arbitrary (untyped) lambda calculus expressions on the type level, types are clearly not normalizing. However, we're not trying to normalize types, but rather (sub-)typing derivations, and as you further note,

subtyping proofs don't have lambdas, and subtyping lacks internalized implication, and maybe it can't be encoded suitably and this is all irrelevant.

Unfortunately, they can be encoded suitably.

In what follows, I'll use a new syntax for paths and types extended with explicit subtyping and subkinding coercions:

   p, q ::= ...
         |  unfold(p)       (T-Unfold)
         |  p ↑ α           (T-Sub)

S, T, U ::= ...
         |  s(T)            (K-..)
         |  T ⇑ Φ           (K-Sub)

α, β, γ ::= id_T            (T-Refl)
         |  α ; β           (T-Trans)
         |  {A::Φ}          (T-{::}-<:-{::})
         |  ⟨T|             (T-..-<:)
         |  |T⟩             (T-<:-..)
         |  ...

   Φ, Ψ ::= α..β            (K-..-<::-..)
         |  (x:T.Φ)^α       (K-→-<::-→)

where I have indicated the corresponding (sub-)typing and kinding rules. For lack of a better name, let's call the original calculus the bare language (BL) and the extended calculus with explicit coercions the extended language (EL). Suggestions for better terminology are welcome...
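
To make the proof-term reading concrete, the EL syntax transcribes directly into an algebraic datatype. Here is a minimal Scala sketch of a fragment (the encoding and all names are mine, purely illustrative):

sealed trait Path                                      // paths p, q
case class PVar(x: String)               extends Path  // x
case class PUp(p: Path, a: Co)           extends Path  // p ↑ α      (T-Sub)

sealed trait Tp                                        // types S, T, U
case class TSel(p: Path, label: String)  extends Tp    // p.A
case class TApp(op: Tp, arg: Path)       extends Tp    // S p
case class TUp(t: Tp, phi: KCo)          extends Tp    // T ⇑ Φ      (K-Sub)

sealed trait Co                                        // type coercions α, β, γ
case class Refl(t: Tp)                   extends Co    // id_T       (T-Refl)
case class Trans(a: Co, b: Co)           extends Co    // α ; β      (T-Trans)
case class Member(label: String, k: KCo) extends Co    // { A :: Φ } (T-{::}-<:-{::})
case class FromLower(t: Tp)              extends Co    // ⟨T|        (T-..-<:)
case class ToUpper(t: Tp)                extends Co    // |T⟩        (T-<:-..)

sealed trait KCo                                       // kind coercions Φ, Ψ
case class Interval(lo: Co, hi: Co)      extends KCo   // α..β       (K-..-<::-..)
case class KArrow(x: String, dom: Tp, body: KCo, arg: Co)
                                         extends KCo   // (x:T.Φ)^α  (K-→-<::-→)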

When presenting (sub-)typing and kinding rules/derivations, I will use the EL for "proof terms", but the BL for the associated types and kinds. There is an obvious erasure function ⌞-⌟ that, given a path p, type T or kind K in the EL, yields the underlying bare path ⌞p⌟, type ⌞T⌟ or kind ⌞K⌟, respectively, in the BL. E.g. the updated subtyping rule for (T-..-<:) is

Γ ⊢ S :: T₁..T₂
------------------- (T-..-<:)
Γ ⊢ ⟨S| : T₁ <: ⌞S⌟

Note that the conclusion of this rule is still a subtyping judgment, albeit one with an explicit proof term ⟨S|. I.e. subtyping is now a four-place relation Γ ⊢ α : S <: T, relating a coercion α to its (bare) lower and upper bounds S and T in a typing context Γ. Importantly, <: remains part of the relation, i.e. it is not a type operator here. This is a design choice to simplify erasure of coercions from subtyping derivations (it suffices to just drop the proof term) and contrasts with systems that elaborate (or translate) subtyping judgments into a term language (e.g. through a relation Γ ⊢ S <: T ⇝ α) where coercions have proper types (e.g. Γ ⊢ α : S → T). Still, in designing the reduction rules, it is of course useful to think of coercions as maps α : S → T from terms of their lower bound (i.e. the domain) S to terms of their upper bound (i.e. the codomain) T.

Semantically, we treat elements of the BL as uninterpreted constants and reason only about the dynamics of the new extensions in the EL: coercions and the associated instances of subsumption. In this context, the sub-kinding rule (K-→-<::-→) takes the role of lambda abstraction,

Γ ⊢ α : T₂ <: T₁        Γ, x: T₂ ⊢ Φ :: K₁ <:: K₂
------------------------------------------------- (K-→-<::-→)
Γ ⊢ (x: T₁. Φ)^α :: (x: T₁) → K₁ <:: (x: T₂) → K₂

internalizing implication in its arrow kind bounds (x: T₁) → K₁ <:: (x: T₂) → K₂. The kinding rule (K-→-E₁) along with subsumption (K-Sub) on the first premise act as "application"/implication elimination, i.e.

Γ ⊢ Ψ :: (x: T₁) → K₁ <:: (x: T₂) → K₂
Γ ⊢ S :: (x: T₁) → K₁
-------------------------- (K-Sub)
Γ ⊢ S ⇑ Ψ :: (x: T₂) → K₂                Γ ⊢ p : T₂
--------------------------------------------------- (K-→-E₁)
Γ ⊢ (S ⇑ Ψ) p :: K₂[x := ⌞p⌟]

We can "contract" this (sub-)kinding derivation by pushing the subsumption down the derivation tree through the "application". The corresponding reduction rule is

(T ⇑ (x: S. Φ)^α) p   —→   (T (p ↑ α)) ⇑ Φ[x := p]

Note the similarity to β-reduction! It's easy to verify that this reduction preserves both

  1. the erased type expression (which remains T p), and
  2. its kind (which remains K₂[x := ⌞p⌟]),

assuming the following substitution lemma holds:

Γ ⊢ p : T             Γ, x: T ⊢ Φ :: K₁ <:: K₂
----------------------------------------------
Γ ⊢ Φ[x := p] :: K₁[x := ⌞p⌟] <:: K₂[x := ⌞p⌟]
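
For point 1, it is perhaps worth spelling out that erasure simply drops both forms of subsumption (⌞T ⇑ Φ⌟ = ⌞T⌟ and ⌞p ↑ α⌟ = ⌞p⌟), so both sides of the reduction rule erase to the same bare application:

⌞(T ⇑ (x: S. Φ)^α) p⌟   =   ⌞T⌟ ⌞p⌟   =   ⌞(T (p ↑ α)) ⇑ Φ[x := p]⌟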

Returning to the above safety counter example, if we try and reduce the (sub-)typing derivation for z : ⊥, we encounter exactly this situation. As before, let

Γ = x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } }
Γ′ = Γ, z: x.A
Γ′′ = Γ, y: { B :: x.A → ⊤..⊥ }

then

Γ′ ⊢ z ↑ |x.A⟩ : { B :: x.A → ⊤..⊥ }
Γ′ ⊢ (z ↑ |x.A⟩).B z :: ⊤..⊥

and letting p = (z ↑ |x.A⟩).B z

Γ′ ⊢ ⟨p| : ⊤ <: z.B z
Γ′ ⊢ ⟨p|..|p⟩ :: (z.B z)..(z.B z) <:: ⊤..⊥
Γ ⊢ (z: x.A. ⟨p|..|p⟩)^id :: (z: x.A) → (z.B z)..(z.B z) <:: (z: x.A) → ⊤..⊥

Now let

Φ = (z: x.A. ⟨p|..|p⟩)^id
v = unfold(new(x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } })
                  { A =  { B :: x.A → ⊤..⊥ } })
w = unfold(new(y: { B :: (z: x.A) → (z.B z)..(z.B z) })
                  { B =  λz: x.A. z.B z })
q = w ↑ { B :: Φ }

i.e. v and q are exactly the "definitions" of x and y, respectively, and ⌞q⌟ = w. Furthermore, we have

Γ′′ ⊢ y.B :: x.A → ⊤..⊥
Γ′′ ⊢ ⟨x.A| : { B :: x.A → ⊤..⊥ } <: x.A
Γ′′ ⊢ y ↑ ⟨x.A| : x.A
Γ′′ ⊢ y.B (y ↑ ⟨x.A|) :: ⊤..⊥
Γ′′ ⊢ ⟨y.B (y ↑ ⟨x.A|)|;|y.B (y ↑ ⟨x.A|)⟩ : ⊤ <: ⊥

This last coercion is the one we'd like to normalize. To do so, we start by substituting the definitions v and q for x and y (otherwise the coercion is already in normal form). We get

Γ ⊢ ⟨q.B (q ↑ ⟨v.A|)|;|q.B (q ↑ ⟨v.A|)⟩ : ⊤ <: ⊥                   (1)

Expanding a bit and focusing on the left instance of (T-..-<:), we have

Γ ⊢ ⟨(w ↑ { B :: Φ }).B (q ↑ ⟨v.A|)| : ⊤ <: w.B w

and we can apply the following reduction rule

(p ↑ { A :: Φ }).A   —→   p.A ⇑ Φ

to push subsumption through type member selection.

Γ ⊢ ⟨(w.B ⇑ Φ) (q ↑ ⟨v.A|)| : ⊤ <: w.B w

Note that the underlying bare type is again preserved by the reduction rule. Now we are ready to apply the β-like reduction rule discussed earlier.

⟨(w.B ⇑ Φ) (q ↑ ⟨v.A|)|
    =   ⟨(w.B ⇑ (z: v.A. ⟨p|..|p⟩)^id) (q ↑ ⟨v.A|)|
    —→  ⟨(w.B ((q ↑ ⟨v.A|) ↑ id)) ⇑ ⟨p|..|p⟩[z := q ↑ ⟨v.A|]|
    =   ⟨(w.B ((q ↑ ⟨v.A|) ↑ id)) ⇑ ⟨p′|..|p′⟩|

where p′ = p[z := q ↑ ⟨v.A|]. Expanding p′, we get

p′  =   p[z := q ↑ ⟨v.A|]
    =   ((z ↑ |v.A⟩).B z)[z := q ↑ ⟨v.A|]
    =   ((q ↑ ⟨v.A|) ↑ |v.A⟩).B (q ↑ ⟨v.A|)

which looks suspiciously similar to the left half of (1) above. Indeed, if we reduce p′ further we get

p′  =   ((q ↑ ⟨v.A|) ↑ |v.A⟩).B (q ↑ ⟨v.A|)
    —→  (q ↑ ⟨v.A|;|v.A⟩).B (q ↑ ⟨v.A|)
    —→  (q ↑ id).B (q ↑ ⟨v.A|)
    —→  q.B (q ↑ ⟨v.A|)

which is exactly the subexpression of (1) we started from. It's easy to convince oneself that these last three steps correspond to sensible reductions too (there are some subtleties involved in formulating the rule for the second step in general, but the instance used in this example is straightforward).

In summary, we are left with a coercion expression that is strictly bigger than the one we started from and which contains the core of the original coercion as a sub-expression.

⟨q.B (q ↑ ⟨v.A|)|
    —→*  ⟨(w.B (q ↑ ⟨v.A|)) ⇑ ⟨q.B (q ↑ ⟨v.A|)|..|q.B (q ↑ ⟨v.A|)⟩|
    —→*  ...

So we are forced to conclude that this sequence of reductions diverges, and hence no (total) HSubst function can exist for this coercion calculus in general.

But, as stated above, I believe this situation can be remedied by restricting the kind language to forbid path-indexed families of types.
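
For concreteness, the two "push" rules used in the walk-through, i.e. the β-like rule for coerced type applications and the rule for member selection on coerced paths, can also be transcribed mechanically. The following Scala sketch continues the datatype fragment from section 2 (again purely illustrative; substitution is naive about variable capture, which suffices for the closed instances above):

object PushRulesSketch {
  def substP(q: Path, x: String, p: Path): Path = q match {
    case PVar(y)   => if (y == x) p else q
    case PUp(r, a) => PUp(substP(r, x, p), substC(a, x, p))
  }
  def substT(t: Tp, x: String, p: Path): Tp = t match {
    case TSel(q, l)  => TSel(substP(q, x, p), l)
    case TApp(f, q)  => TApp(substT(f, x, p), substP(q, x, p))
    case TUp(s, phi) => TUp(substT(s, x, p), substK(phi, x, p))
  }
  def substC(c: Co, x: String, p: Path): Co = c match {
    case Refl(t)      => Refl(substT(t, x, p))
    case Trans(a, b)  => Trans(substC(a, x, p), substC(b, x, p))
    case Member(l, k) => Member(l, substK(k, x, p))
    case FromLower(t) => FromLower(substT(t, x, p))
    case ToUpper(t)   => ToUpper(substT(t, x, p))
  }
  def substK(k: KCo, x: String, p: Path): KCo = k match {
    case Interval(lo, hi)        => Interval(substC(lo, x, p), substC(hi, x, p))
    case KArrow(y, dom, body, a) =>  // naive: assumes y ≠ x and y not free in p
      KArrow(y, substT(dom, x, p), substK(body, x, p), substC(a, x, p))
  }

  // One-step reduction implementing the two push rules:
  def step(t: Tp): Option[Tp] = t match {
    // (T ⇑ (x: S. Φ)^α) p  —→  (T (p ↑ α)) ⇑ Φ[x := p]
    case TApp(TUp(s, KArrow(x, _, body, a)), p) =>
      Some(TUp(TApp(s, PUp(p, a)), substK(body, x, p)))
    // (p ↑ { A :: Φ }).A  —→  p.A ⇑ Φ
    case TSel(PUp(p, Member(l2, phi)), l) if l == l2 =>
      Some(TUp(TSel(p, l), phi))
    case _ => None
  }
}

Iterating step on the subexpressions of (1) replays the two push steps used above; the remaining steps (composing and cancelling identity coercions) would need a few more cases.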

4. Impredicativity troubles for HSubst

So let's turn to the question of whether HSubst is powerful enough for proof normalization of the "fixed" coercion calculus. In order to use HSubst, we first need to define a language of "canonical" (or normal) path and type expressions. Such a definition is implicit in the reduction rules for coercions, but we need a direct, syntactic definition, possibly typed [A5] or untyped [A1,A3,A4]. This is cumbersome, but there are no real conceptual difficulties.

Next, one has to define HSubst on canonical paths and types as well as on canonical path and type coercions, and then show that the type of the substitute path in any "nested" use of HSubst decreases, for some well-founded order on types. Since kinds can appear inside types and vice versa, we actually need a combined order that covers both types and kinds. The essential difficulty here is "impredicativity". Just as it is difficult to define a useful well-founded order on types in an impredicative calculus such as System F, it is difficult to define one on types and kinds in HK DOT. In particular, for HSubst to work, we need the kind of a type application to be smaller than the kind of the corresponding type operator, i.e. given F :: (x: S) → K and p : S, we would like

K[x := p]   ≺   (x: S) → K

to hold, where _≺_ is our candidate order. If variables were not allowed in kind expressions (as e.g. in Fω-<: [A1]), we could just take _≺_ to be the sub-expression order:

K[x := p]   =   K   ≺   (x: S) → K

but obviously this does not work for HK DOT. Since paths, types and kinds may all appear circularly inside each other, we can pick e.g. K = ⊥..x! and p = { { ⊥..(x: S) → K } }, and get

K[x := p]   =   ⊥..{ { ⊥..((x: S) → K) } }!   ≻   (x: S) → K

which is the opposite of what we want. Note, firstly, that the above example still applies to the "fixed" kind language, where only arrow kinds of the form (x: { A :: K₁ }) → K₂ are allowed and, secondly, that the inclusion of kinds in the path language is crucial here. The mere presence of variables in kinds (or types) is not necessarily a problem. For example, Harper and Licata's "Canonical LF" [A4] has dependent types, but since the system is stratified, i.e. kinds and types can never appear in terms, any terms appearing in types (or kinds) can simply be ignored by _≺_ (concretely, Harper and Licata first define a forgetful map from types to "simple types", forgetting any term occurrences, then define a well-founded order on simple types instead).

Let's work through an example to see what exactly goes wrong when we try to extend this approach to HK DOT. Consider the following definition of the identity function on types:

tId : { I :: (x: {A::*}) → x.A..x.A }   <:  { I :: {A::*} → * }
tId = { I =  λx: {A::*}. x.A }

Now, assume we are given a variable z of type { I :: (x: {A::*}) → x.A..x.A }, i.e. the precise type of the identity function. We can apply the operator z.I to a pair of nested boxes containing some kind K, i.e. z.I { A = { B :: K } }, which is OK because { B :: K } :: * and hence { A = { B :: K } } : {A::*} for any K. Let C = { B :: K }, then we have

z.I { A = C }  ::  { A = C }.A..{ A = C }.A  <::  C..C

and hence C <: z.I { A = C } <: C, i.e. the identity applied to some type C is equivalent to C, as one would expect. For example, if we let K = (x: {A::*}) → x.A..x.A and B = I, we have

z : C  <:  z.I { A = C }

i.e. we can apply the identity to its own type. Let's make all the coercions in this statement explicit, starting with C:

C            :: *
s(C)         :: C..C
{ A = s(C) } :: { A :: C..C }

Let k = { A = s(C) }. Noting that ⌞k⌟ = { A = C }, we have

z.I                      :: (x: {A::*}) → x.A..x.A
z.I k                    :: { A = C }.A..{ A = C }.A
k.A                      :: C..C
⟨k.A|                    :  C <: { A = C }.A
⟨k.A|..|k.A⟩             :: { A = C }.A..{ A = C }.A <:: C..C
(z.I k) ⇑ (⟨k.A|..|k.A⟩) :: C..C

Let E = (z.I k) ⇑ (⟨k.A|..|k.A⟩). Noting that ⌞E⌟ = z.I ⌞k⌟ we have

z       : C
⟨E|     : C <: z.I { A = C }
z ↑ ⟨E| : z.I { A = C }

Note that the variable z, which is a path, appears inside the type z.I, which, in turn, appears as the head of a type application and eventually in the path coercion ⟨E|. Hence, interesting things should happen if we substitute some path p ↑ α for z and normalize the result. We expect the path coercion to be pushed, successively, through the selection of I, then the application to k and finally through the left projection ⟨-|. On the way, it should turn from a path coercion into a type coercion and back, and its type (and/or kind) should change accordingly.

To keep things simple, we will work with (an η-expanded version of) the identity coercion on the type of z:

Φ :: (x: {A::*}) → x.A..x.A <:: (x: {A::*}) → x.A..x.A
Φ =  (x: {A::*}. id_x.A .. id_x.A)^id_{A::*}

α : { I :: (x: {A::*}) → x.A..x.A } <: { I :: (x: {A::*}) → x.A..x.A }
α = { I :: Φ }

I'll leave it to the reader to verify that these definitions are well-typed.

In what follows, I'll annotate paths/types and coercions with both their type/kind as well as simplified versions thereof. In particular, I'll write statements of the form p : T / |T| / ⌊T⌋ where |T| and ⌊T⌋ are, respectively, the types obtained by erasing all variables and all paths from T. E.g. we have

z ↑ ⟨E| : z.I { A = C } / ?.I { A = |C| } / ?.I ?

with |C| = ⌊C⌋ = { B :: |K| } and |K| = ⌊K⌋ = {A::*} → ?.A..?.A.

Now let's compute the normal form of (z ↑ ⟨E|)[z := p ↑ α]. We have

  (z ↑ ⟨E|)[z := p ↑ α] :  (z.I { A = C })[z := ⌞p ↑ α⌟]  /
                          |(z.I { A = C })[z := ⌞p ↑ α⌟]| /
                          ⌊(z.I { A = C })[z := ⌞p ↑ α⌟]⌋
= (p ↑ α) ↑ ⟨E′|        : ⌞p⌟.I { A = C } / ⌞p⌟.I { A = C } / ?.I ?

with

E′ = E[z := p ↑ α]                   :: C..C / |C|..|C| / ⌊C⌋..⌊C⌋
   = ((p ↑ α).I k) ⇑ (⟨k.A|..|k.A⟩)  :: C..C / |C|..|C| / ⌊C⌋..⌊C⌋

We can already see that the erasure function |-| does not commute with substitution, i.e. |T| ≠ |T[x := ⌞p⌟]|, as exemplified by the type of z ↑ ⟨E| before and after the substitution of ⌞p ↑ α⌟ for z. This is a problem because it means that the simple type |T[x := ⌞p⌟]| of an application f p may be bigger than the codomain |T| of the corresponding simple arrow type |S| → |T| of the function f. The same is true for the simple kinds of type applications. Concretely, in the above example, we have

z.I   :: (x: {A::*}) → x.A..x.A / {A::*} → ?.A..?.A    / {A::*} → ?.A..?.A
z.I k :: (⌞k⌟.A)..(⌞k⌟.A)       / (|⌞k⌟|.A)..(|⌞k⌟|.A) / ?.A..?.A

where (|⌞k⌟|.A)..(|⌞k⌟|.A) is clearly not a subexpression of {A::*} → ?.A..?.A. Unfortunately, HSubst relies crucially on the fact that the type of an application (or any redex in general) is smaller than the type of the corresponding function (or, in general, the term being eliminated).

Conversely, the erasure function ⌊-⌋ does commute with substitution, i.e. we have ⌊T⌋ = ⌊T[x := ⌞p⌟]⌋. E.g. in the above example, ?.A..?.A is a subexpression of {A::*} → ?.A..?.A.

Continuing the above example, the first redex we encounter is (p ↑ α).I, deeply nested inside E′. The associated reduction is

(p ↑ { I :: Φ }).I  —→  p.I ⇑ Φ

The kind of the redex and the type of the associated eliminatee α are

(p ↑ α).I :: K / |K| / ⌊K⌋
α         :  C <: C / |C| <: |C| / ⌊C⌋ <: ⌊C⌋

Expanding C = { I :: K } and |C| = ⌊C⌋ = { I :: |K| }, and noting that |K| = ⌊K⌋, we see that K, |K| and ⌊K⌋ are subexpressions, respectively, of C, |C| and ⌊C⌋ — so far, all is well. The next redex we encounter is (p.I ⇑ Φ) k, with reduction

(p.I ⇑ Φ) k  —→  (p.I (k ↑ id_{A::*})) ⇑ (id_x.A .. id_x.A)[x := k]

which is an instance of the β-like reduction rule we have seen earlier. The types of the redex and Φ are

(p.I ⇑ Φ) k :: (⌞k⌟.A)..(⌞k⌟.A) / (|⌞k⌟|.A)..(|⌞k⌟|.A) / ?.A..?.A
Φ           :: K <:: K / |K| <:: |K| / ⌊K⌋ <:: ⌊K⌋

Expanding the definitions of K, |K| and ⌊K⌋ we discover that (⌞k⌟.A)..(⌞k⌟.A) is not a subexpression of K, nor is (|⌞k⌟|.A)..(|⌞k⌟|.A) a subexpression of |K|. However, ?.A..?.A is a subexpression of ⌊K⌋, as expected since ⌊-⌋ commutes with substitution. We can therefore dismiss the subexpression order on types as well as subexpressions modulo |-| as candidates for our well-founded induction order for HSubst and focus instead on subexpressions modulo ⌊-⌋.

To see why this will not work either, we first need to perform a few boring reductions on E′ (mostly pushing identity coercions around and eliminating them):

⟨E′|  —→*  ⟨(p.I k) ⇑ ⟨k.A|..|k.A⟩|

The resulting coercion is a redex for the reduction

⟨(p.I k) ⇑ ⟨k.A|..|k.A⟩|  —→  ⟨k.A|;⟨(p.I k)|

which is an instance of the contraction rule

⟨T ⇑ α..β|  —→  α;⟨T|

The associated type/kind are

⟨(p.I k) ⇑ (⟨k.A|..|k.A⟩)|  :   C <: p.I { A = C } / ⌊C⌋ <: ?.I ?
⟨k.A|..|k.A⟩                ::  { A = C }.A..{ A = C }.A <:: C..C /
                                ?.A..?.A <:: ⌊C⌋..⌊C⌋

Focusing on the lower bounds, we note that C is a subexpression of { A = C }.A..{ A = C }.A but, alas, ⌊C⌋ = { B :: |K| } is not a subexpression of ?.A..?.A. The erasure function ⌊-⌋ forgot too much.

In summary: just as paths, types and kinds can circularly contain each other, so can coercions on paths and kinds (i.e. subtyping and subkinding derivations). As a consequence, (hereditary) substitutions in types may trigger (hereditary) substitutions in paths, and vice-versa. Whatever well-founded order/induction measure we use, it has to be defined both on (well-typed) paths and on (well-kinded) types. Ignoring all paths in types and kinds results in a well-founded order that is too coarse to make deeply nested calls to HSubst, while only ignoring variables leads to an order that is not stable under substitution, i.e. it cannot relate the kind K[x := p] of an application F p to the codomain K of the arrow kind (x: S) → K of the corresponding type operator F.

All hope is not lost though: there could still be a well-founded order on types that goes beyond simple subexpressions modulo erasure. That is, provided that the type language of HK DOT is weaker than System F. So is it? One thing we can say is that System F can at least not be directly encoded in the type language of HK DOT, simply because that would require us to encode kind-level variables through boxing in path-level variables, similar to the way we encode type variables in DOT. Recall that in DOT we have an injection { A = _ } from types to paths ("boxing") with inverse _.A ("unboxing") which allows us to treat path variables x of type x : { A :: * } as type variables X = x.A and encode universal quantifiers (over types) like ∀X.T as ⟦ ∀X.T ⟧ = (x: { A :: * }) → ⟦ T[X := x.A] ⟧. To encode universals ∀k.K over kinds, we first need an encoding for kind variables k, i.e. a similar invertible injection from kinds to paths. A candidate is { A = { B :: _ } }, but this map has no syntactic form as its inverse.

5. Some thoughts on a proof via logical relations

Proofs and Types also doesn't actually mention hereditary substitution (it hints at "proof theoretic techniques" and I guess that's what they mean) or logical relations (but it mentions Tait '67).

I think the term "hereditary substitution" had not been invented when Proofs and Types was written. It seems to have originated [A3,A4,A5] in a paper by Watkins et al. that was published in 2004 [A6]. However, the method used to prove the weak normalization theorem in chapter 4 of P&T also makes crucial use of the fact that, in STLC, the type of a redex is smaller than the type of the corresponding eliminatee, e.g. for applications f t: T, we have f: S → T, and for projections πᵢ t: Tᵢ, we have t: T₁ × T₂. Both techniques leverage this fact by using the types of eliminatees in redexes as part of their induction/termination measure.

The origins of the term "logical relations" seem to go back much further, at least as far as 1974 according to a thread on the TYPES mailing list, so I'm not sure why it is not mentioned in P&T.

If the barrier applies, it would suggest considering logical relations for transitivity elimination, but I've never heard of those.

Neither have I.

Parroting termination of STLC/System F, they'd be like "this subtyping proof is itself normalizing, and even the result of applying it to an assumption normalizes". What's confusing is that the property, it seems, would not be defined by induction on some "proof type", since you don't have internal abstraction on assumptions (unless you add lambdas, as sketched above), but by induction on the context. That's not novel though: logical relations are defined on closed terms usually, but they can be extended to open ones, and PFPL's chapter on "Equational Reasoning for T" (either edition) does exactly that, by defining open logical equivalence.

I don't understand this comment. What exactly do you mean by "applying it to an assumption"? And why would the relation not be defined by induction on types/kinds? The way I see it, there would be four mutually inductively defined relations/predicates (one each for terms, types and the corresponding coercions). Since coercions are typed/kinded by two types/kinds, the corresponding relations would likely be defined by induction on e.g. the upper bound and case-analysis on the lower, or vice-versa. E.g. definitions might look something like this:

  • p ∈ R⟦{ K }⟧ if p! ∈ R⟦K⟧.
  • α ∈ R⟦T <: { K }⟧ if, for all p, p ∈ R⟦T⟧ implies (p ↑ α)! ∈ R⟦K⟧.

Needless to say, this seems rather non-standard. For the arrow cases, predicativity/dependency becomes an issue, e.g. the following definition is not inductive in the upper bound:

  • Φ ∈ R⟦(x: S₁) → K₁ <:: (x: S₂) → K₂⟧ if, for all T and p, T ∈ R⟦(x: S₁) → K₁⟧ and p ∈ R⟦S₂⟧ implies (T ⇑ Φ)! p ∈ R⟦K₂[x:=⌞p⌟]⟧.

and we need to instead parametrize the (reducibility) relation on K₂ over relations (i.e. reducibility candidates) on S₂. Normalization proofs for e.g. the calculus of constructions [A2] have to deal with similar dependency-related issues and might serve as a starting point here.

For now, I have no idea how to deal with the order-related operations (reflexivity, transitivity) in this setting, nor how to represent recursive types/paths.

6. Conclusions and future work

I have more to say about

  1. pros and cons of encoding coercions as lambda terms,
  2. predicative variants of HK DOT,
  3. a simplified calculus where types and terms are stratified,

but this comment is already way too long, so that's for another time.

Additional references

  1. A. Abel and D. Rodriguez, Syntactic Metatheory of Higher-Order Subtyping, CSL '08. http://dx.doi.org/10.1007/978-3-540-87531-4_32
  2. T. Coquand, J. H. Gallier, A Proof of Strong Normalization for the Theory of Constructions Using a Kripke-Like Interpretation, CIS Tech. Report (MS-CIS-90-44), University of Pennsylvania, 1990. http://repository.upenn.edu/cis_reports/568/
  3. K. Crary, A Syntactic Account of Singleton Types via Hereditary Substitution, LFMTP '09. http://www.cs.cmu.edu/afs/cs/user/crary/www/papers/2009/synsing.pdf
  4. R. Harper, D. R. Licata, Mechanizing Metatheory in a Logical Framework, in JFP 17(4-5), 2007. http://dx.doi.org/10.1017/S0956796807006430
  5. C. Keller, T. Altenkirch, Hereditary Substitutions for Simple Types, MSFP '10. http://doi.acm.org/10.1145/1863597.1863601
  6. K. Watkins, I. Cervesato, F. Pfenning, D. Walker, A Concurrent Logical Framework: The Propositional Fragment, TYPES 2003. http://dx.doi.org/10.1007/978-3-540-24849-1_23


Blaisorblade commented Nov 7, 2018

What I wrote doesn’t work, but was the first step in understanding what is happening. And I finally took the time to understand this remark:

The example is a variant of the well-known technique for typing arbitrary lambda terms in a system with (iso-)recursive types (see e.g. TAPL, ch. 20, pp. 273). Here, we lift this technique to the kind level to allow kinding arbitrary lambda expressions on the type level. Then we construct a divergent type expression and kind it with absurd bounds.

In both cases, non-termination is caused by the negative recursion, which needs to be controlled, and the restriction that Sandro proposes also happens to forbid such negative recursion, or at least direct occurrences of it.
