Skip to content

Instantly share code, notes, and snippets.

@sstucki
Last active May 26, 2020 11:34
Show Gist options
  • Star 5 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save sstucki/3fa46d2c4ce6f54dc61c3d33fc898098 to your computer and use it in GitHub Desktop.
Save sstucki/3fa46d2c4ce6f54dc61c3d33fc898098 to your computer and use it in GitHub Desktop.
Strawman of a DOT calculus with higher-kinded types

DOT with higher-kinded types -- A sketch

Definitions

Syntax

Var         x, y, z
Term           s, t ::= x                 variable
                        λx:T.t            term abstraction
                        s t               term application
                        new(x: T)d        record creation
                        t.l               term member selection
Definition        d ::= { a = t }         term member
                        { A = T }         type member
Value*         v, w ::= λx:T.t
                        new(x: T)d
Path*          p, q ::= x
                        v
Type        S, T, U ::= ⊤                 top
                        ⊥                 bottom
                        S ∧ T             intersection
                        (x:S) → T         dependent function type
                        { a : T }         term member
                        { A :: K }        type member
                        μ(x: T)           recursive record type
                        p.A               type member selection
                        λx:S.T            type abstraction
                        S t               type application

Kind              K ::= (x:S) → K         dependent function kind
                        S..T              type interval kind

Context           Γ ::= ∅                 empty
                        Γ, x: T           extension

* = predicate on terms defining a subset of the terms

As is customary, we write S → T as a shorthand for non-dependent function types (and kinds), i.e. for (x: S) → T where x does not appear freely in T. We write * for the interval containing the entire type lattice, i.e. for ⊥..⊤.

Substitutions [x := p] of a variable x for a path p and their applications t[x := p] to terms and T[x := p] to types are defined in the usual way.

We make use of let-binders, which are desugared in the usual way,

⟦ let x: T = t₁ in t₂ ⟧ = (λx: T. ⟦ t₂ ⟧) ⟦ t₁ ⟧

omitting the explicit type annotation when it can be readily inferred.

As in DOT, parametric polymorphism is achieved through "boxing" and (path-)dependent types, e.g. universals are encoded as

∀X.T   =   (x: { A :: * }) → T[X := x.A]
ΛX.t   =   λx: { A :: * }. t[X := x.A]

Type operators are encoded similarly as

K₁ → K₂    =   { A :: K₁ } → K₂
λX: K. T   =   λx: { A :: K }. T[X := x.A]

Note that we can use dependent kinds to give very precise kinds to type operators:

λX: K. T  ::  (X: K) → T..T

As we'll see below, this allows us to admit β-reductions in types through a combination of kinding and sub-kinding.

Reduction rules

Contractions:

(AppAbs)        (λx:T.t) v —→ t[x := v]

                d = … ∧ { a = t } ∧ …
(RecSel)        ------------------------------------
                (new(x: T)d).a —→ t[x := new(x: T)d]

Congruence:

                   t₁ —→ t₂'
(App₁)          ---------------
                t₁ t₂ —→ t₁' t₂

                    t —→ t'
(App₂)            -----------
                  v t —→ v t'

                    t —→ t'
(Sel)             -----------
                  t.a —→ t'.a

Typing

Term typing

x : T ∈ Γ
--------- (T-Var)
Γ ⊢ x : T

Γ ⊢ S :: *    Γ, x : S ⊢ t : T
------------------------------ (T-→-I)
Γ ⊢ λx: S.t : (x: S) → T

Γ ⊢ t : (x: S) → T   Γ ⊢ p : S
------------------------------ (T-→-E₁)
Γ ⊢ t p : T[x := p]

Γ ⊢ t₁ : S → T   Γ ⊢ t₂ : S
--------------------------- (T-→-E₂)
Γ ⊢ t₁ t₂ : T

Γ, x: T ⊢ d : T
------------------------ (T-{}-I)
Γ ⊢ new(x: T)d : μ(x: T)

Γ ⊢ t : { a : T }
----------------- (T-{}-E)
Γ ⊢ t.a : T

Γ ⊢ p : T[x := p]
--------------------- (T-Fold)
Γ ⊢ p : μ(x: T)

Γ ⊢ p : μ(x: T)
--------------------- (T-Unfold)
Γ ⊢ p : T[x := p]

Γ ⊢ t: T    Γ ⊢ t: U
--------------------
Γ ⊢ t: T ∧ U

Γ ⊢ t : S   Γ ⊢ S <: T
---------------------- (T-Sub)
Γ ⊢ t : T

Value/path typing is defined as the obvious restrictions of the above relation to values/paths.

The following rules for let-binders are admissible.

Γ ⊢ p : S   Γ, x : S ⊢ t : T
--------------------------------- (T-Let₁)
Γ ⊢ let x: S = p in t : T[x := p]

Γ ⊢ t₁ : S   Γ ⊢ t₂ : T
--------------------------- (T-Let₂)
Γ ⊢ let x: S = t₁ in t₂ : T

Definition typing

Γ ⊢ t : T
------------------------- (D-{:})
Γ ⊢ { a = t } : { a : T }

Γ ⊢ T :: K
-------------------------- (D-{::})
Γ ⊢ { A = T } : { A :: K }

Γ ⊢ d₁ : T₁     Γ ⊢ d₂ : T₂
--------------------------- (D-∧)
Γ ⊢ d₁ ∧ d₂ : T₁ ∧ T₂

Subtyping

---------- (T-Refl)
Γ ⊢ T <: T

Γ ⊢ S <: T    Γ ⊢ T <: U
------------------------ (T-Trans)
Γ ⊢ S <: U

Γ ⊢ T :: *
---------- (T-<:-⊤)
Γ ⊢ T <: ⊤

Γ ⊢ T :: *
---------- (T-⊥-<:)
Γ ⊢ ⊥ <: T

-------------- (T-∧-<:₁)
Γ ⊢ T ∧ U <: T

-------------- (T-∧-<:₂)
Γ ⊢ T ∧ U <: U

Γ ⊢ S <: T    Γ ⊢ S <: U
------------------------ (T-<:-∧)
Γ ⊢ S <: T ∧ U

Γ ⊢ S₂ <: S₁    Γ, x: S₂ ⊢ T₁ <: T₂
----------------------------------- (T-→-<:-→)
Γ ⊢ (x: S₁) → T₁ <: (x: S₂) → T₂

Γ ⊢ T₁ <: T₂
---------------------------- (T-{:}-<:-{:})
Γ ⊢ { a : T₁ } <: { a : T₂ }

Γ ⊢ K₁ <:: K₂
------------------------------ (T-{::}-<:-{::})
Γ ⊢ { A :: K₁ } <: { A :: K₂ }

Γ ⊢ S :: T₁..T₂
----------------- (T-..-<:)
Γ ⊢ T₁ <: S

Γ ⊢ S :: T₁..T₂
------------------ (T-<:-..)
Γ ⊢ S <: T₂

Kinding

---------- (K-⊤-F)
Γ ⊢ ⊤ :: *

---------- (K-⊥-F)
Γ ⊢ ⊥ :: *

Γ ⊢ T, U :: *
-------------- (K-∧-F)
Γ ⊢ T ∧ U :: *

Γ ⊢ S :: *    Γ, x: S ⊢ T :: *
------------------------------ (K-→-F)
Γ ⊢ (x: S) → T :: *

Γ ⊢ T :: *
------------------ (K-{:}-F)
Γ ⊢ { a : T } :: *

------------------- (K-{::}-F)
Γ ⊢ { A :: K } :: *

Γ, x: T ⊢ T :: *
---------------- (K-μ-F)
Γ ⊢ μ(x: T) :: *

Γ ⊢ T :: *
------------- (K-..-I)
Γ ⊢ T :: T..T

Γ ⊢ S :: *    Γ, x: S ⊢ T :: K
------------------------------ (K-→-I)
Γ ⊢ λx: S.T :: (x: S) → K

Γ ⊢ S :: (x: T) → K   Γ ⊢ p : T
------------------------------- (K-→-E₁)
Γ ⊢ S p :: K[x := p]

Γ ⊢ S :: T → K   Γ ⊢ t : T
-------------------------- (K-→-E₂)
Γ ⊢ S t :: K

Γ ⊢ p : { A :: K }
------------------ (K-Sel)
Γ ⊢ p.A :: K

Γ ⊢ T :: K₁    Γ ⊢ K₁ <:: K₂
---------------------------- (K-Sub)
Γ ⊢ T :: K₂

Sub-kinding

Γ ⊢ S₂ <: S₁    Γ ⊢ T₁ <: T₂
---------------------------- (K-..-<::-..)
Γ ⊢ S₁..T₁ <:: S₂..T₂

Γ ⊢ T₂ <: T₁    Γ, x: T₂ ⊢ K₁ <:: K₂
------------------------------------ (K-→-<::-→)
Γ ⊢ (x: T₁) → K₁ <:: (x: T₂) → K₂

A few points worth noting:

  • The selection subtyping rules from DOT are now admissible, e.g. we have

     Γ ⊢ p : { A :: S..T }
     --------------------- (K-Sel)
     Γ ⊢ p.A :: S..T
     --------------- (T-..-<:)
     Γ ⊢ S <: p.A
    
  • β-reduction in types is also admissible (as a subtyping rule), e.g. we have

     Γ ⊢ S :: *    Γ, x: S ⊢ T :: T..T
     --------------------------------- (K-→-I)
     Γ ⊢ λx: S. T :: (x: S) → T..T               Γ ⊢ p : S
     ----------------------------------------------------- (K-→-E₁)
     Γ ⊢ (λx: S. T) p :: (T..T)[x := p]
     ---------------------------------- (T-<:-..)
     Γ ⊢ (λx: S. T) p <: T[x := p]
    

    Note that this is not a new idea, Aspinall already noted in 1994 that β-reduction is admissible in systems with singleton types1. Type interval kinds can be understood to be a generalization of singleton kinds2: clearly, every singleton {T}_* can be encoded as a singleton interval T..T. Most of the literature on the properties of systems with singleton types and kinds seems to use PER-semantics1, 2, 3, interpreting singleton types/kinds {T}_K as an equivalence class of types/kinds up to β(η)-equivalence.

  • Types in this system do not have normal forms in general. Consider e.g. the encoding of

     { type F[X] = F[F[X]] }
    

    which is expressible as

     new(z: { F :: (x: {A::*}) → z.F (z.F x) .. z.F (z.F x) })
         { F = λx:{A::*}. z.F (z.F x) }
    

    This is in contrast to the calculi with singleton kinds proposed by Stone and Harper2, where type equivalence is show to be decidable via comparison of WHNFs.

  • The introduction rule for interval kinds (over proper types)

     Γ ⊢ T :: *
     ------------- (K-..-I)
     Γ ⊢ T :: T..T
    

    corresponds roughly to Aspinall's {}-I for singleton types,1

     Γ ⊢ M : A
     ------------- ({}-I)
     Γ ⊢ M : {M}_A
    

    and Stone and Harper's introduction rule for singleton kinds.2

     Γ ⊢ A :: *
     -------------
     Γ ⊢ A :: S(A)
    
  • One might wonder why we don't give the "most precise" kinds to proper types in the first place, e.g. the kinding rule for the top type could be

     ------------- (K-⊤-F′)
     Γ ⊢ ⊤ :: ⊤..⊤
    

    On a first glance, it seems like the original (K-⊤-F) rule could then be recovered via subsumption, since ⊤..⊤ <:: *. However, a derivation of this relationship already requires one for ⊤ :: * or ⊥ :: *, and so we end up with a circular dependency:

        ???
     ----------
     Γ ⊢ ⊤ :: *
     ---------- (T-⊥-<:)   ---------- (T-Refl)
     Γ ⊢ ⊥ <: ⊤            Γ ⊢ ⊤ <: ⊤
     -------------------------------- (K-<:-..)
     Γ ⊢ ⊤..⊤ <:: ⊥..⊤
    
  • It's questionable whether identifying * with ⊥..⊤ is a good idea. Alternatively, one could treat them separately and add sub-kinding rules to convert between them. At the very least one should have that ⊥..⊤ <:: *.

  • One could take the analogy to singleton types one step further and allow intervals A .._K B over kinds K other than *. E.g.

     (λx:A₁.B₁) .._(* → *) (λx:A₂.B₂)
    

    would be an interval over unary type operators. Note the kind annotation to make sure the bounds of the interval are of the same kind. This is closer in spirit to the way abstract, higher-kinded type members are handled in dotty, where one can write e.g.

     { type F <: List }
    

    (which the compiler η-expands to)

     { type F <: λX.List[X] }
    

    While it is unclear whether this would simplify or complicate the meta theory, it would likely make the existing theory from systems with singleton types more easily applicable in the DOT setting.

    Note that the more liberal, kind-annotated variant of intervals can be encoded in terms of the *-only variant. Again, this has already been noticed, for the case of singleton kinds, by Stone and Harper:2

     S .._*        T   =   S..T
     F .._(x: S)→K G   =   (x: S) → (F x) .._K (G x)
    
  • This proposal originally contained the following two extra subtyping rules:

     Γ, x: S ⊢ T₁ <: T₂
     -------------------------- (T-λ-<:-λ)
     Γ ⊢ λx: S. T₁ <: λx: S. T₂
    
     Γ ⊢ S₁ <: S₂
     ---------------- (T-App-<:-App)
     Γ ⊢ S₁ t <: S₂ t
    

    However, as Martin pointed out in a comment to his gist (https://gist.github.com/odersky/36aee4b7fe6716d1016ed37051caae95), these rules might not add any expressivity in practice. Consider the following example taken from his comment:

     type C(x) <: Seq(x)
    

    is encoded as

     { C :: (x: {A::*}) → ⊥..(Seq x) }
    

    Then

     p.C y :: ⊥..(Seq y)
    

    Hence

     p.C y <: Seq y
    

    Although this idea does not seem to extend to the case of partial applications, e.g.

     type Map[K, V] <: GenMap[K, V]
     Map Int  <:  GenMap Int   ?
    

    such partial applications can only occur nested inside other subtyping derivations but never directly in (T-Sub), (T-<:-..) or (T-..-<:) since partially applied types necessarily have arrow-kinds.

Unsoundness

The following example illustrates that type safety does not hold for the proposed calculus (neither for the above version, nor for Martin's original proposal).

let x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } } =
    new(x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } })
           { A =  { B :: x.A → ⊤..⊥ } } in
let y: { B :: x.A → ⊤..⊥ } =
    new(y: { B :: (z: x.A) → (z.B z)..(z.B z) })
           { B =  λz: x.A. z.B z } in
let z: y.B y = () in
(z: ⊥)

To see that this example typechecks, let's first verify that the declared types of x, y and z match the types of their definitions.

  • For x, we have that x.A → ⊤..⊥ is a kind (syntactically), so we immediately get { B :: x.A → ⊤..⊥ } :: * by (K-{::}-F). From there, we get the declared type by applying the usual sequence of (K-..-I), (D-{::}), (T-{}-I) and (T-Unfold).

  • For y, we need to show that (z: x.A) → (z.B z)..(z.B z) <:: (z: x.A) → ⊤..⊥ (note the absurd bounds on the codomain). Let Γ = x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } } and Γ′ = Γ, z: x.A, then by (T-Var) and (K-Sel)

     Γ′ ⊢ x.A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ }
    

    by (T-Var), (T-<:-..) and (T-Sub)

     Γ′ ⊢ z : { B :: x.A → ⊤..⊥ }
    

    by (K-Sel)

     Γ′ ⊢ z.B :: x.A → ⊤..⊥
    

    by (T-Var) and (K-→-E)

     Γ′ ⊢ z.B z :: ⊤..⊥
    

    by (T-..-<:) and (T-<:-..)

     Γ′ ⊢ ⊤ <: z.B z <: ⊥
    

    by (K-..-<::-..) and (K-→-<::-→)

     Γ ⊢ (z: x.A) → (z.B z)..(z.B z) <:: (z: x.A) → ⊤..⊥
    
  • For z, it's sufficient to show that ⊤ <: y.B y. Let Γ′′ = Γ, y: { B :: x.A → ⊤..⊥ }, then by (T-Var) and (K-Sel)

     Γ′′ ⊢ y.B :: x.A → ⊤..⊥
    

    by (T-Var), (K-Sel) and (T-..-<:)

     Γ′′ ⊢ { B :: x.A → ⊤..⊥ } <: x.A
    

    by (T-Var) and (T-Sub)

     Γ′′ ⊢ y : x.A
    

    by (T-Var) and (K-→-E)

     Γ′′ ⊢ y.B y :: ⊤..⊥
    

    by (T-..-<:) and (T-<:-..)

     Γ′′ ⊢ ⊤ <: y.B y <: ⊥
    

It follows that y.B y has absurd bounds in a context that corresponds to a store. Equivalently, we can substitute x and y by their definitions to obtain a closed type with absurd bounds. In either case, the type lattice becomes trivial.

The example is a variant of the well-know technique for typing arbitrary lambda terms in a system with (iso-)recursive types (see e.g. TAPL, ch. 20, pp. 273). Here, we lift this technique to the kind level to allow kinding arbitrary lambda expressions on the type level. Then we construct a divergent type expression and kind it with absurd bounds.

This unsoundness example makes crucial use of

  1. recursion in types (or rather in bounds of type members), and
  2. path-dependent type families.

The former is also present in Scala/dotty where as the latter is not. The example does therefore not constitute a soundness issue in those languages. Since this soundness issue is specific to HK DOT, it's worth elaborating a bit on this point. The different sorts of arrows in HK DOT (and Scala/dotty) can be categorized according to the sort of their domain and codomain:

  1. (possibly path-dependent) function type (domain: type, codomain: type), inhabited by term-level lambdas/functions/methods,

    example in DOT

    f : (x: S) → T         where S ≠ { A :: K }
    f = λx: S. t
    

    examples in Scala/dotty

    def f(x: S): T = t
    val g: (x: S) => T = (x: S) => t
    
  2. universal quantifier/polymorphic type (domain: kind, codomain type), inhabited by polymorphic terms/functions

    examples in DOT

    f : (x: { A :: K }) → T
    f = λx: { A :: K }. t
    

    examples in Scala/dotty

    def f[X]: T = t
    def f[X[_]](y: ...): T = t
    
  3. kind of path-indexed type families (domain: type, codomain: type), inhabited by type families or path-indexed singleton kinds

    examples in HK DOT

    F :: (x: S) → K         where S ≠ { A :: K′ }
    F =  λx: S. T
    

    alternatively, using Martin's variant

    F <: all(x: S)K         where S ≠ { A : K′ }
    F =  all(x: S)J               J singleton, J <: K
    

    there is no counterpart of this in Scala/dotty

  4. kind of type operators, i.e. higher kind (domain: kind, codomain: kind), inhabited by type operators/functions/lambdas

    examples in HK DOT

    F :: (x: { A :: K′ }) → K
    F =  λx: { A :: K′ }. T
    

    in Martin's variant

    F <: all(x: { A : K′ })K
    F =  all(x: { A : K′ })J      J singleton, J <: K
    

    example in Scala/dotty

    F[X]       >: S <: T
    F[X, Y, Z] >: S <: T
    F[X[_]]    >: S <: T
    F[X]       = T
    

In other words, the kind language of both variants of HK DOT are strictly more expressive than that of Scala/dotty, which suggests that the soundness issue could be avoided by restricting kinds as follows:

K ::= (x: { A :: K }) → K′         type operator kind
      S..T                         type interval kind

This restriction is purely syntactic and remains "compatible" with Scala/dotty in that any (higher-kinded) abstract type definition expressible in Scala can still be encoded in this subset.

References

  1. D. Aspinall, Subtyping with singleton types, CSL'94, LNCS 933, 2005. http://dx.doi.org/10.1007/BFb0022243
  2. C. A. Stone, R. Harper, Deciding Type Equivalence in a Language with Singleton Kinds, POPL '00. http://doi.acm.org/10.1145/325694.325724
  3. C. A. Stone, R. Harper, Extensional Equivalence and Singleton Types, ACM TCL 7(4), 2006. http://doi.acm.org/10.1145/1183278.1183281
  4. N. Amin, S. Grütter, M. Odersky, T. Rompf, S. Stucki, The Essence of Dependent Object Types, WadlerFest, LNCS 9600, 2016. http://dx.doi.org/10.1007/978-3-319-30936-1_14
  5. N. Amin et al., The DOT Calculus. Additional material including a detailed type safety proof with an accompanying Coq formalization. http://wadlerfest.namin.net/
@sstucki
Copy link
Author

sstucki commented Jan 24, 2017

TLDR:

  1. HK DOT is unsafe (though there might be a fix),
  2. even with the fix, HSubst is (probably) too weak to prove safety.

A slightly more comprehensive summary with links to the relevant sections:

  1. Subtyping partial type applications.
  2. Proof terms for subtyping derivations, why we should expect them to be normalizing, and how this all relates to inversion, canonical forms and type safety.
  3. The soundness/safety counter example revisited: an illustration of how non-terminating coercion terms break safety.
  4. Why impredicativity of HK DOT spells trouble for a HSubst-based proof.
  5. Some thoughts on a proof via logical relations.
  6. An outlook on future work.

But let's start from the beginning...

No, this calculus does not distinguish between object identity and object equality.

Agreed then. I assumed you'd know this but was surprised you didn't mention it—but Martin knew what you meant by "not using ANF".

Ah yes, that was a rather imprecise distinction indeed, good point. I'll make a mental note to be more clear about this point in future write-ups.

BTW: I recently re-read parts of the TAPL chapter on Featherweight Java (ch. 19, pp. 247), while preparing for TA-ing an exercise on FJ, and realized that it uses the same type of substitution-based semantics as I do above. In particular, there is also no distinction between object identity and object equality: the reduction rules for field selection and method invocation operate directly on objects.

1. Subtyping partial type applications

Let me try to focus my speculations on real problems then: Map Int <: GenMap Int and hereditary substitution.

A more interesting question, IMO, is how you would ever use the fact that Map Int <: GenMap Int, other than to show that Map Int T <: GenMap Int T, which is already covered by the current (T-..-<:) and (T-<:-..) rules.

Well, what if I try to encode def f[K <: GenMap Int] = ...; f[Map Int]?

Yes, initially I was worried as well that partially applied types might appear in bounds of polymorphic functions, and thus, via transitivity, in anywhere in subtyping derivations. But once you de-sugar the above signature into DOT, the partial application disappears. In DOT, the signature of f would be

(k: {A :: (x: *) → ⊥..(p.GenMap int x)}) → T

where A is a dummy label, p is the module containing the abstract type GenMap, and int is a "boxed" variant of Int. Note the "η-expansion" of GenMap! This is necessary because

  1. type parameters have to be "boxed" and their upper/lower bounds have to be translated into upper/lower bounds of the corresponding abstract type (here A),
  2. the calculus only supports type intervals over proper types, so type intervals over type operators have to be de-sugared as described above (note that the definition of S .._K T performs an η-expansion so higher-order type bounds are effectively "η-long").

As a consequence, every occurrence of a type operator in a bound is necessarily fully applied in the kind corresponding to the "type box".

The reason is that encoding K <: GenMap Int requires an interval of non-* kind, which must be currently encoded using eta-expansion.

Exactly.

However, type definitions (in f's argument) need not be eta-expanded. Maybe (D-{::}) should be modified and force eta-expanded definitions, like (apparently) Dotty seems to do internally?

I think your on to something there. We want all (type-)definitions to have singleton kinds, so the typing rules above should be adjusted (Martin's original proposal actually enforced this restriction). There are good meta-theoretic reasons for requiring definitions to have singleton types, and incidentally, this would force definitions to be η-expanded. Consider the following definition

App[F[_]] = F

which we can either interpret as the identity operation on unary type operators or as a "curried" variant of type application, i.e. App[F][X] = F[X]. This type definition is actually not valid in Scala (I don't know whether it is in dotty). Instead, one has to η-expand and un-curry manually, like so:

App[F[_], X] = F[X]

Now in HK DOT (as outlined above) one may encode the former as

App :: {F :: {A :: *} → *} → {A :: *} → *
App =  λx: {F :: {A :: *} → *}. x.F

Note that the kind of App is certainly not a singleton, so according to Martin's original proposal, this wouldn't be valid. On the other hand, the η-expanded version can be given a singleton kind:

App :: (x: {F :: {A :: *} → *}) → (y: {A :: *}) → (x.F y)..(x.F y)
App =  λx: {F :: {A :: *} → *}. λy: {A :: *}. x.F y

I believe this generalizes to arbitrary type operators.

To summarize, enforcing singleton types for all type definitions ensures all such definitions are η-long, which is also in accordance with Scala.

Or maybe the rules should just be extended to handle this case somehow—right now, it seems I can't even prove F <: λ x :: *. F x or viceversa.

No, η-conversion is not admissible. But η-conversion only makes sense for type operators, and following the same reasoning as above, you should never need to compare type operators that are not fully applied. Note that fully applied variants of your example operators are comparable.

Γ ⊢ λ x :: *. F x :: (x :: *) → (F x)..(F x)    ...
--------------------------------------------------- (K-→-E)
Γ ⊢ (λ x :: *. F x) p :: (F p)..(F p)
------------------------------------- (T-..-<:)
Γ ⊢ F p <: (λ x :: *. F x) p

BTW, an algorithmic for Map Int <: λ x. GenMap Int x might need kind-directed eta-expansion, which is sort-of obvious but I've never heard. I have seen eta-expansion for equivalence checking (ATTAPL Ch. 6), but not for subtyping (though it can arise the same way). Of course google disagrees and points me to https://www.researchgate.net/publication/221557946_Syntactic_Metatheory_of_Higher-Order_Subtyping (which also uses HSubst, though on types, and mentions the barrier below).

Indeed. That's a nice paper BTW (I guess most of Abel's papers are). His paper on polarized subtyping, which I mentioned earlier,6 also makes essential use of hereditary substitution (though he doesn't mention it explicitly, IIRC).

2. Proof terms for subtyping derivations and normalization

Hereditary substitution (HSubst) might be a cool inspiration. This suggests maybe the best idea of this comment: have you tried adding proof terms to the subtyping rules, and phrasing reduction rules for those?

I certainly have!

That's maybe just notation, but it might help all those that get beta-reduction better than cut-elimination. (In fact, we even get to look informally at the "untyped" reduction rules if we want, though we probably shouldn't formalize those). The result might be of interest beyond Scala!

I agree that this is of more general interest, and I think that we should formalize the untyped reduction rules! Firstly, they provide a sort of formal specification for the normalization proof we're looking for, and secondly, we can prove a variant of the type safety theorems for the new calculus. Namely

  1. that reductions of terms (and the coercions they contain) preserve types, and
  2. that closed, well-typed (coercion) terms can either be reduced or are values.

Intuitively, a coercion value is a coercion that is closed and normal/canonical, and that therefore corresponds to a subtyping derivation that can be inverted. Indeed this latter point can be made formal by a handful of corresponding canonical forms lemmas (for coercions). I have sketched such a type safety proof for a variant of the system above (without recursion or records) and there don't seem to be any major difficulties. I believe that extending the proof to the full system would increasing the technical complexity but wouldn't introduce any new conceptual difficulties.

These type-safety theorems do of course not guarantee that type safety holds for the original calculus because coercion terms may be non-normalizing, in which case either a) the calculus is unsafe (see example below) or b) our reduction strategy is flawed. In either case, the formalization allows us to make the inherent "dynamics" of the normalization proof explicit, and — as you already mentioned — should give some insights about the proof-theoretic strength of the logic corresponding to the subtyping fragment.

Finally, it allows us to interpret the final type safety proof as a proof by (weak) simulation: the extended calculus (with explicit coercions) weakly simulates the original calculus by interleaving finite sequences of "silent" coercion reductions with the reductions of the "bare" terms in the original calculus. The simulation relation is simply the function that maps terms in the extended calculus to their bare counterparts by erasing all coercions.

To this end, though, you might need to internalize dependency into assumptions using a term-level lambda, so that you can exhibit (λ x: S. < proof term > : T <: U): (∀ x: S. T <: U). But that would probably complicate things...

Then, can you explain the "leftover" transitivity/subsumption in a way similar to variables (and spines) in neutral terms?

I don't understand your example. Could you elaborate a bit? How would you use a term with this particular signature?

It is definitely true though that the coercion language does need to internalize dependencies. E.g. the language I introduce below does have binders and reduction rules for eliminating them.

There is also plenty of literature where subtyping rules in some language are be elaborated into explicit coercion terms in a subtyping-free fragment of the same language. But this isn't what we want here, because it would blur the clear separation between the coercion language, which we need to be normalizing, and the original term language, which might not be normalizing (it certainly isn't here). Again, I'm going to say a bit more about this below.

Back to hereditary substitution: It does prove (weak? strong?) normalization for STLC ("Hereditary Substitutions for Simple Types, Formalized" details how) without logical relations.

Another very nice paper (and a source of inspiration for my attempt to use HSubst here). BTW, are you aware of a similar paper that handles STLC with products?

But HSubst has limited power, (...), hereditary substitution alone can't work for type systems that extend Peano Arithmetic, such as System T, or System F (...). I have no clue where the system you have sits, but maybe it's too powerful?

Indeed, the expressiveness/proof-theoretic power of the HK DOT system above (and in particular the (sub-)typing fragment) is too powerful for HSubst, as exemplified by the above safety counter example — in fact, this is how I found the counter example in the first place. But there are even simpler examples of why HSubst doesn't work "out of the book" here, and these apply even to the "fixed" version of HK DOT where the safety counter example doesn't apply. I'll get back to this later — first, let's have a look at the counter example to get a feeling for what a non-terminating subtyping coercion looks like!

3. Non-termination of the soundness/safety counter example

Since the presence of path-indexed type families (plus recursion) are powerful enough to type arbitrary (untyped) lambda calculus expressions on the type level, types are clearly not normalizing. However, we're not trying to normalize types, but rather (sub-)typing derivations, and as you further note,

subtyping proofs don't have lambdas and subtyping and subtyping lacks internalized implication, and maybe it can't be encoded suitably and this is all irrelevant.

Unfortunately, they can be encoded suitably.

In what follows, I'll use a new syntax for paths and types extended with explicit subtyping and subkinding coercions:

   p, q ::= ...
         |  unfold(p)       (T-Unfold)
         |  p ↑ α           (T-Sub)

S, T, U ::= ...
         |  s(T)            (K-..)
         |  T ⇑ Φ           (K-Sub)

α, β, γ ::= id_T            (T-Refl)
         |  α ; β           (T-Trans)
         |  {A::Φ}          (T-{::}-<:-{::})
         |  ⟨T|             (T-..-<:)
         |  |T⟩             (T-<:-..)
         |  ...

   Φ, Ψ ::= α..β            (K-..-<::-..)
         |  (x:T.Φ)^α       (K-→-<::-→)

where I have indicated the corresponding (sub-)typing and kinding rules. For lack of a better name, let's call the original calculus the bare language (BL) and the extended calculus with explicit coercions the extended language (EL). Suggestions for better terminology are welcome...

When presenting (sub-)typing and kinding rules/derivations, I will use the EL for "proof terms", but the BL for the associated types and kinds. There is an obvious erasure function ⌞-⌟ that, given a path p, type T or kind K in the EL, yields the underlying bare path ⌞p⌟, type ⌞T⌟ or kind ⌞K⌟, respectively, in the BL. E.g. the updated subtyping rule for (T-..-<:) is

Γ ⊢ S :: T₁..T₂
------------------- (T-..-<:)
Γ ⊢ ⟨S| : T₁ <: ⌞S⌟

Note that the conclusion of this rule is still a subtyping judgment, albeit one with an explicit proof term ⟨S|. I.e. subtyping is now a four-place relation Γ ⊢ α : S <: T, relating a coercion α to it's (bare) upper and lower bounds S and T in a typing context Γ. Importantly, <: remains part of the relation, i.e. it is not a type operator here. This is a design choice to simplify erasure of coercions from subtyping derivations (it suffices to just drop the proof term) and contrasts with systems that elaborate (or translate) subtyping judgments into a term-language (e.g. through a relation Γ ⊢ S <: T ⇝ α) where coercions have proper types (e.g. Γ ⊢ α : S → T). Still, in designing the reduction rules, it is of course useful to think of coercions as maps α : S → T from terms of their lower bound (i.e. the domain) S to terms of their upper bound (i.e. the codomain) T.

Semantically, we treat elements of the BL as uninterpreted constants and reason only about the dynamics of the new extensions in the EL: coercions and the associated instances of subsumption. In this context, the sub-kinding rule (K-→-<::-→) takes the role of lambda abstraction,

Γ ⊢ α : T₂ <: T₁        Γ, x: T₂ ⊢ Φ :: K₁ <:: K₂
------------------------------------------------- (K-→-<::-→)
Γ ⊢ (x: T₁. Φ)^α :: (x: T₁) → K₁ <:: (x: T₂) → K₂

internalizing implication in its arrow kind bounds (x: T₁) → K₁ <:: (x: T₂) → K₂. The kinding rule (K-→-E₁) along with subsumption (K-Sub) on the first premise act as "application"/implication elimination, i.e.

Γ ⊢ Ψ : (x: T₁) → K₁ <:: (x: T₂) → K₂
Γ ⊢ S :: (x: T₁) → K₁
-------------------------- (K-Sub)
Γ ⊢ S ⇑ Ψ :: (x: T₂) → K₂                Γ ⊢ p : T₂
--------------------------------------------------- (K-→-E₁)
Γ ⊢ (S ⇑ Ψ) p :: K₂[x := ⌞p⌟]

We can "contract" this (sub-)kinding derivation by pushing the subsumption down the derivation tree through the "application". The corresponding reduction rule is

(T ⇑ (x: S. Φ)^α) p   —→   (T (p ↑ α)) ⇑ Φ[x := p]

Note the similarity to β-reduction! It's easy to verify that this reduction preserves both

  1. the erased type expression (which remains T p), and
  2. its kind (which remains K₂[x := ⌞p⌟]),

assuming the following substitution lemma holds:

Γ ⊢ p : T             Γ, x: T ⊢ Φ :: K₁ <:: K₂
----------------------------------------------
Γ ⊢ Φ[x := p] :: K₁[x := ⌞p⌟] <:: K₂[x := ⌞p⌟]

Returning to the above safety counter example, if we try and reduce the (sub-)typing derivation for z : ⊥, we encounter exactly this situation. As before, let

Γ = x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } }
Γ′ = Γ, z: x.A
Γ′′ = Γ, y: { B :: x.A → ⊤..⊥ }

then

Γ′ ⊢ z ↑ |x.A⟩ : { B :: x.A → ⊤..⊥ }
Γ′ ⊢ (z ↑ |x.A⟩).B z :: ⊤..⊥

and letting p = (z ↑ |x.A⟩).B z

Γ′ ⊢ ⟨p| : ⊤ <: z.B z
Γ′ ⊢ ⟨p|..|p⟩ :: (z.B z)..(z.B z) <:: ⊤..⊥
Γ ⊢ (z: x.A. ⟨p|..|p⟩)^id :: (z: x.A) → (z.B z)..(z.B z) <:: (z: x.A) → ⊤..⊥

Now let

Φ = (z: x.A. ⟨p|..|p⟩)^id
v = unfold(new(x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } })
                  { A =  { B :: x.A → ⊤..⊥ } })
w = unfold(new(y: { B :: (z: x.A) → (z.B z)..(z.B z) })
                  { B =  λz: x.A. z.B z })
q = w ↑ { B :: Φ }

i.e. v and q are exactly the "definitions" of x and y, respectively, and ⌞q⌟ = w. Furthermore, we have

Γ′′ ⊢ y.B :: x.A → ⊤..⊥
Γ′′ ⊢ ⟨x.A| : { B :: x.A → ⊤..⊥ } <: x.A
Γ′′ ⊢ y ↑ ⟨x.A| : x.A
Γ′′ ⊢ y.B (y ↑ ⟨x.A|) :: ⊤..⊥
Γ′′ ⊢ ⟨y.B (y ↑ ⟨x.A|)|;|y.B (y ↑ ⟨x.A|)⟩ : ⊤ <: ⊥

This last coercion is the one we'd like to normalize. To do so, we start by substituting the definitions v and q for x and y (otherwise the coercion is already in normal form). We get

Γ ⊢ ⟨q.B (q ↑ ⟨v.A|)|;|q.B (q ↑ ⟨v.A|)⟩ : ⊤ <: ⊥                   (1)

Expanding a bit and focusing on the left instance of (T-..-<:), we have

Γ ⊢ ⟨(w ↑ { B :: Φ }).B (q ↑ ⟨v.A|)| : ⊤ <: w.B w

and we can apply the following reduction rule

(p ↑ { A :: Φ }).A   —→   p.A ⇑ Φ

to push subsumption through type member selection.

Γ ⊢ ⟨(w.B ⇑ Φ) (q ↑ ⟨v.A|)| : ⊤ <: w.B w

Note that the underlying bare type is again preserved by the reduction rule. Now we are ready to apply the β-like reduction rule discussed earlier.

⟨(w.B ⇑ Φ) (q ↑ ⟨v.A|)|
    =   ⟨(w.B ⇑ (z: v.A. ⟨p|..|p⟩)^id) (q ↑ ⟨v.A|)|
    —→  ⟨(w.B ((q ↑ ⟨v.A|) ↑ id) ⇑ ⟨p|..|p⟩[z := q ↑ ⟨v.A|]|
    =   ⟨(w.B ((q ↑ ⟨v.A|) ↑ id) ⇑ ⟨p′|..|p′⟩|

where p′ = p[z := q ↑ ⟨v.A|]. Expanding p′, we get

p′  =   p[z := q ↑ ⟨v.A|]
    =   ((z ↑ |v.A⟩).B z)[z := q ↑ ⟨v.A|]
    =   ((q ↑ ⟨v.A|) ↑ |v.A⟩).B (q ↑ ⟨v.A|)

which looks suspiciously similar to the left half of (1) above. Indeed, if we reduce p′ further we get

p′  =   ((q ↑ ⟨v.A|) ↑ |v.A⟩).B (q ↑ ⟨v.A|)
    —→  (q ↑ ⟨v.A|;|v.A⟩).B (q ↑ ⟨v.A|)
    —→  (q ↑ id).B (q ↑ ⟨v.A|)
    —→  q.B (q ↑ ⟨v.A|)

which is exactly the subexpression of (1) we started from. Its easy to convince oneself that these last three steps correspond to sensible reductions too (there are some subtleties involved in formulating the rule for the second step in general, but the instance used in this example is straight-forward).

In summary, we are left with a coercion expression that is strictly bigger than the one we started from and which contains the core of the original coercion as a sub-expression.

⟨q.B (q ↑ ⟨v.A|)|
    —→*  ⟨(w.B (q ↑ ⟨v.A|)) ⇑ ⟨q.B (q ↑ ⟨v.A|)|..|q.B (q ↑ ⟨v.A|)⟩|
    —→*  ...

So we are forced to conclude that this sequence of reductions diverges, and hence no (total) HSubst function can exist for this coercion calculus in general.

But, as stated above, I believe this situation can be remedied by restricting the kind language to forbid path-indexed families of types.

4. Impredicativity troubles for HSubst

So let's turn to the question of whether HSubst is powerful enough proof normalization of the "fixed" coercion calculus. In order to use HSubst, we first need to define a language of "canonical" (or normal) path and type expressions. Such a definition is implicit in the reduction rules for coercions, but we need an direct, syntactic definition, possibly typedA5 or untyped.A1,A3,A4 This is cumbersome, but there are no real conceptual difficulties.

Next, one has to define HSubst on canonical paths and types as well as on canonical path and type coercions, and then show that the type of the substitute path in any "nested" use of HSubst decreases, for some well-founded order on types. Since kinds can appear inside types and vice versa, we actually need a combined order that covers both types and kinds. The essential difficulty here is "impredicativity". Just as it is difficult to define a useful well-founded order on types in an impredicative calculus such as System F, it is difficult to define one on types and kinds in HK DOT. In particular, for HSubst to work, we need the kind of a type application to be smaller than the kind of the corresponding type operator, i.e. given F :: (x: S) → K and p : S, we would like

K[x := p]   ≺   (x: S) → K

to hold, where _≺_ is our candidate order. If variables were not allowed in kind expressions (as e.g. in Fω-<:A1), we could just take _≺_ to be the sub-expression order:

K[x := p]   =   K   ≺   (x: S) → K

but obviously this does not work for HK DOT. Since paths, types and kinds may all appear circularly inside each other, we can pick e.g. K = ⊥..x! and p = { { ⊥..(x: S) → K } }, and get

K[x := p]   =   ⊥..{ { ⊥..((x: S) → K) } }!   ≻   (x: S) → K

which is the opposite of what we want. Note, firstly, that the above example still applies to the "fixed" kind language, where only arrow kinds of the form (x: { A :: K₁ } → K₂ are allowed and, secondly, that the inclusion of kinds in the path language is crucial here. The mere presence of variables in kinds (or types) is not necessarily a problem. For example, Harper and Licata's "Canonical LF"A4 has dependent types, but since the system is stratified, i.e. kinds and types can never appear in terms, any terms appearing in types (or kinds) can simply be ignored by _≺_ (concretely, Harper and Licata first define a forgetful map from types to "simple types", forgetting any term occurrences, then define a well-founded order on simple types instead).

Let's work through an example to see what exactly goes wrong when we try to extend this approach to HK DOT. Consider the following definition of the identity function on types:

tId : { I :: (x: {A::*}) → x.A..x.A }   <:  { I :: {A::*} → * }
tId = { I =  λx: {A::*}. x.A }

Now, assume we are given a variable z of type { I :: (x: {A::*}) → x.A..x.A }, i.e. the precise type of the identity function. We can apply the operator z.I to a pair of nested boxes containing some kind K, i.e. z.I { A = { B :: K } }, which is OK because { B :: K } :: * and hence { A = { B :: K } } : {A::*} for any K. Let C = { B :: K }, then we have

z.I { A = C }  ::  { A = C }.A..{ A = C }.A  <::  C..C

and hence C <: z.I { A = C } <: C, i.e. the identity applied to some type C is equivalent to C, as one would expect. For example, if we let K = (x: {A::*}) → x.A..x.A and B = I, we have

z : C  <:  z.I { A = C }

i.e. we can apply the identity to it's own type. Let's make all the coercions in this statement explicit, starting with C:

C            :: *
s(C)         :: C..C
{ A = s(C) } :: { A :: C..C }

Let k = { A = s(C) }. Noting that ⌞k⌟ = { A = C }, we have

z.I                      :: (x: {A::*}) → x.A..x.A
z.I k                    :: { A = C }.A..{ A = C }.A
k.A                      :: C..C
⟨k.A|                    :: C <:: { A = C }.A
⟨k.A|..|k.A⟩             :: { A = C }.A..{ A = C }.A <:: C..C
(z.I k) ⇑ (⟨k.A|..|k.A⟩) :: C..C

Let E = (z.I k) ⇑ (⟨k.A|..|k.A⟩). Noting that ⌞E⌟ = z.I ⌞k⌟ we have

z       : C
⟨E|     : C <: z.I { A = C }
z ↑ ⟨E| : z.I { A = C }

Note that the variable z, which is a path, appears inside the type z.I, which, in turn, appears as the head of a type application and eventually in the path coercion ⟨E|. Hence, interesting things should happen if we substitute some path p ↑ α for z and normalize the result. We expect the path coercion to be pushed, successively, through the selection of I, then the application to k and finally through the left projection ⟨-|. On the way, it should turn from a path coercion into a type coercion and back, and it's type (and/or kind) should change accordingly.

To keep things simple, we will work with (an η-expanded version of) the identity coercion on the type of z:

Φ :: (x: {A::*}) → x.A..x.A <:: (x: {A::*}) → x.A..x.A
Φ =  (x: {A::*}. id_x.A .. id_x.A)^id_{A::*}

α : { I :: (x: {A::*}) → x.A..x.A } <: { I :: (x: {A::*}) → x.A..x.A }
α = { I :: Φ }

I'll leave it to the reader to verify that these definition are well-typed.

In what follows, I'll annotate paths/types and coercions with both their type/kind as well as simplified versions thereof. In particular, I'll write statements of the form p : T / |T| / ⌊T⌋′ where |T| and ⌊T⌋ are, respectively, the types obtained by erasing all variables and all paths from T. E.g. we have

z ↑ ⟨E| : z.I { A = C } / ?.I { A = |C| } / ?.I ?

with |C| = ⌊C⌋ = { B :: |K| } and |K| = ⌊K⌋ = {A::*} → ?.A..?.A.

Now let's compute the normal form of (z ↑ ⟨E|)[z := p ↑ α]. We have

  (z ↑ ⟨E|)[z := p ↑ α] :  (z.I { A = C })[z := ⌞p ↑ α⌟]  /
                          |(z.I { A = C })[z := ⌞p ↑ α⌟]| /
                          ⌊(z.I { A = C })[z := ⌞p ↑ α⌟]⌋
= (p ↑ α) ↑ ⟨E′|        : ⌞p⌟.I { A = C } / ⌞p⌟.I { A = C } / ?.I ?

with

E′ = E[z := p ↑ α]                   :: C..C / |C|..|C| / ⌊C⌋..⌊C⌋
   = ((p ↑ α).I k) ⇑ (⟨k.A|..|k.A⟩)  :: C..C / |C|..|C| / ⌊C⌋..⌊C⌋

We can already see that the erasure function |-| does not commute with substitution, i.e. |T| ≠ |T[x := ⌞p⌟]|, as exemplified by the type of z ↑ ⟨E| before and after the substitution of ⌞p ↑ α⌟ for z. This is a problem because it means that the simple type |T[x := ⌞p⌟]| of an application f p may be bigger than the codomain |T| of the corresponding simple arrow type |S| → |T| of the function f. The same is true for the simple kinds of type applications. Concretely, in the above example, we have

z.I   :: (x: {A::*}) → x.A..x.A / {A::*} → ?.A..?.A    / {A::*} → ?.A..?.A
z.I k :: (⌞k⌟.A)..(⌞k⌟.A)       / (|⌞k⌟|.A)..(|⌞k⌟|.A) / ?.A..?.A

where (|⌞k⌟|.A)..(|⌞k⌟|.A) is clearly not a subexpression of {A::*} → ?.A..?.A. Unfortunately, HSubst relies crucially on the fact that the type of a an application (or any redex in general) is smaller than the type of the corresponding function (or, in general, the term being eliminated).

Conversely, the erasure function ⌊-⌋ does commute with substitution, i.e. we have ⌊T⌋ = ⌊T[x := ⌞p⌟]⌋. E.g. in the above example, ?.A..?.A is a subexpression of {A::*} → ?.A..?.A.

Continuing the above example, the first redex we encounter is (p ↑ α).I, deeply nested inside E′. The associated reduction is

(p ⇑ { A :: Φ }).I  —→  p.I ⇑ Φ

The kind of the redex and the type of the associated eliminatee α are

(p ⇑ α).I :: K / |K| / ⌊K⌋
α         :  C <: C / |C| <: |C| / ⌊C⌋ <: ⌊C⌋

Expanding C = { I :: K } and |C| = ⌊C⌋ = { I :: |K| }, and noting that |K| = ⌊K⌋, we see that K, |K| and ⌊K⌋ are subexpressions, respectively, of C, |C| and ⌊C⌋ — so far, all is well. The next redex we encounter is (p.I ⇑ Φ) k, with reduction

(p.I ⇑ Φ) k  —→  (p.I (k ↑ id_{A::*}) ⇑ (id_x.A .. id_x.A)[x:=⌞p⌟]

which is an instance of the β-like reduction rule we have seen earlier. The types of the redex and Φ are

(p.I ⇑ Φ) k :: (⌞k⌟.A)..(⌞k⌟.A) / (|⌞k⌟|.A)..(|⌞k⌟|.A) / ?.A..?.A
Φ           :: K <: K / |K| <: |K| / ⌊K⌋ <: ⌊K⌋

Expanding the definitions of K, |K| and ⌊K⌋ we discover that (⌞k⌟.A)..(⌞k⌟.A) is not a subexpression of K, nor is (|⌞k⌟|.A)..(|⌞k⌟|.A) a subexpression of |K|. However, ?.A..?.A is a subexpression of ⌊K⌋, as expected since ⌊-⌋ commutes with substitution. We can therefore dismiss the subexpression order on types as well as subexpressions modulo |-| as candidates for our well-founded induction order for HSubst and focus instead on subexpressions modulo ⌊-⌋.

To see why this will not work either, we first need to perform a few boring reductions on E′ (mostly pushing identity coercions around and eliminating them):

⟨E′|  —→*  ⟨(p.I k) ⇑ ⟨k.A|..|k.A⟩|

The resulting coercion is a redex for the reduction

⟨(p.I k) ⇑ ⟨k.A|..|k.A⟩|  —→  ⟨k.A|;⟨(p.I k)|

which is an instance of the contraction rule

⟨T ⇑ α..β|  —→  α;⟨T|

The associated type/kind are

⟨(p.I k) ⇑ (⟨k.A|..|k.A⟩)|  :   C <: p.I { A = C } / ⌊C⌋ <: ?.I ?
⟨k.A|..|k.A⟩                ::  { A = C }.A..{ A = C }.A <:: C..C /
                                ?.A..?.A <:: ⌊C⌋..⌊C⌋

Focusing on the lower bounds, we note that C is a subexpression of { A = C }.A..{ A = C }.A but, alas, ⌊C⌋ = { B :: |K| } is not a subexpression of ?.A..?.A. The erasure function ⌊-⌋ forgot too much.

In summary: just as paths, types and kinds can circularly contain each other, so can coercions on paths and kinds (i.e. subtyping and subkinding derivations). As a consequence, (hereditary) substitutions in types may trigger (hereditary) substitutions in paths, and vice-versa. Whatever well-founded order/induction measure we use, it has to be defined both on (well-typed) paths and on (well-kinded) types. Ignoring all paths in types and kinds results in a well-founded order that is too coarse to make deeply nested calls to HSubst, while only ignoring variables leads to an order that is not stable under substitution, i.e. it cannot relate the kind K[x := p] of an application F p to the domain K of the arrow kind of the domain (x: S) → K of the corresponding type operator F.

All hope is not lost though: there could still be a well-founded order on types that goes beyond simple subexpression modulo erasure. That is, provided that the type language of HK DOT is weaker than System F. So is it? One thing we can say is that System F can at least not be directly encoded in the type language of HK DOT, simply because that would require us to encode kind-level variables through boxing in path-level variables, similar to the way we encode type variables in DOT. Recall that in DOT we have an injection { A = _ } from types to paths ("boxing") with inverse _.A ("unboxing") which allows us to treat path variables x of type x : { A :: * } as type variables X = x.A and encode universal quantifiers (over types) like ∀X.T as ⟦ ∀X.T ⟧ = (x.A) → ⟦ T[X := x.A] ⟧. To encode universals ∀k.K over kinds, we first need an encoding for kind variables k, i.e. a similar invertible injection from kinds to paths. A candidate is { A = { B :: _ } }, but this map has no syntactic form as its inverse.

5. Some thoughts on a proof via logical relations

Proofs and Types also doesn't actually mention hereditary substitution (it hints at "proof theoretic techniques" and I guess that's what they mean) or logical relations (but it mentions Tate '67).

I think the term "hereditary substitution" had not been invented when Proofs and Types was written. It seems to have originatedA3,A4,A5 in a paper by Watkins et al. that was published in 2004.A6 However, the method used to prove the weak normalization theorem in chapter 4 of P&T also makes crucial use of the fact that, in STLC, the type of a redex is smaller than the type of the corresponding eliminatee, e.g. for applications f t: T, we have f: S → T, and for projections πᵢ t: Tᵢ, we have t: T₁ × T₂. Both techniques leverage this fact by using the types of eliminatees in redexes as part of their induction/termination measure.

The origins of the term "logical relations" seem to go back much further, according to a thread on the TYPES mailing list at least as far as 1974, so I'm not sure why it is not mentioned in P&T.

If the barrier applies, it would suggest considering logical relations for transitivity elimination, but I've never heard of those.

Neither have I.

Parroting termination of STLC/System F, they'd be like "this subtyping proof is itself normalizing, and even the result of applying it to an assumption normalizes". What's confusing is that the property, it seems, would not be defined by induction on some "proof type", since you don't have internal abstraction on assumptions (unless you add lambdas, as sketched above), but by induction on the context. That's not novel though: logical relations are defined on closed terms usually, but they can be extended to open ones, and PFPL's chapter on "Equational Reasoning for T" (either edition) does exactly that, by defining open logical equivalence.

I don't understand this comment. What exactly do you mean by "applying it to an assumption"? And why would the relation not be defined by induction on types/kinds? The way I see it, there would be four mutually inductively defined relations/predicates (one each for terms, types and the corresponding coercions). Since coercions are typed/kinded by two types/kinds, the corresponding relations would likely be defined by induction on e.g. the upper bound and case-analysis on the lower, or vice-versa. E.g. definitions might look something like this:

  • p ∈ R⟦{ K₁ }⟧ if p! ∈ R⟦K₂⟧.
  • α ∈ R⟦T <: { K }⟧ if, for all p, p ∈ R⟦T⟧ implies (p ↑ α)! ∈ R⟦K₂⟧.

Needless to say, this seems rather non-standard. For the arrow cases, predicativity/dependency becomes an issue, e.g. the following definition is not inductive in the upper bound:

  • Φ ∈ R⟦(x: S₁) → K₁ <: (x: S₂) → K₂⟧ if, for all T and p, T ∈ R⟦(x: S₁) → K₁⟧ and p ∈ R⟦S₂⟧ implies (T ⇑ Φ)! p ∈ R⟦K₂[x:=⌞p⌟]⟧.

and we need to instead parametrize the (reducibility) relation on K₂ over relations (i.e. reducibility candidates) on S₂. Normalization proofs for e.g. the calculus of constructionsA2 have to deal with similar dependency-related issues and might serve as a starting point here.

For now, I have no idea how to deal with the order-related operations (reflexivity, transitivity) in this setting, nor how to represent recursive types/paths.

6. Conclusions and future work

I have more to say about

  1. pros and cons of encoding coercions as lambda terms,
  2. predicative variants of HK DOT,
  3. a simplified calculus where types and terms are stratified,

but this comment is already way too long, so that's for another time.

Additional references

  1. A. Abel and D. Rodriguez, Syntactic Metatheory of Higher-Order Subtyping, CSL '08. http://dx.doi.org/10.1007/978-3-540-87531-4_32
  2. T. Coquand, J. H. Gallier, A Proof of Strong Normalization for the Theory of Constructions Using a Kripke-Like Interpretation, CIS Tech. Report (MS-CIS-90-44), University of Pennsylvania, 1990. http://repository.upenn.edu/cis_reports/568/
  3. K. Crary, A Syntactic Account of Singleton Types via Hereditary Substitution, LFMTP '09. http://www.cs.cmu.edu/afs/cs/user/crary/www/papers/2009/synsing.pdf
  4. R. Harper, D. R. Licata, Mechanizing Metatheory in a Logical Framework, in JFP 17(4-5), 2007. http://dx.doi.org/10.1017/S0956796807006430
  5. C. Keller, T. Altenkirch, Hereditary Substitutions for Simple Types, MSFP '10. http://doi.acm.org/10.1145/1863597.1863601
  6. K. Watkins, I. Cervesato, F. Pfenning, D. Walker, A Concurrent Logical Framework: The Propositional Fragment, TYPES 2003. http://dx.doi.org/10.1007/978-3-540-24849-1_23

@Blaisorblade
Copy link

Blaisorblade commented Nov 7, 2018

What I wrote doesn’t work, but was the first step in understanding what is happening. And I finally took the time to understand this remark:

The example is a variant of the well-know technique for typing arbitrary lambda terms in a system with (iso-)recursive types (see e.g. TAPL, ch. 20, pp. 273). Here, we lift this technique to the kind level to allow kinding arbitrary lambda expressions on the type level. Then we construct a divergent type expression and kind it with absurd bounds.

In both cases, non-termination is caused by the negative recursion, which needs to be controlled, and the restriction that Sandro proposes also happens to forbid such negative recursion, or at least direct occurrences of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment