Var x, y, z
Term s, t ::= x variable
λx:T.t term abstraction
s t term application
new(x: T)d record creation
t.l term member selection
Definition d ::= { a = t } term member
{ A = T } type member
Value* v, w ::= λx:T.t
new(x: T)d
Path* p, q ::= x
v
Type S, T, U ::= ⊤ top
⊥ bottom
S ∧ T intersection
(x:S) → T dependent function type
{ a : T } term member
{ A :: K } type member
μ(x: T) recursive record type
p.A type member selection
λx:S.T type abstraction
S t type application
Kind K ::= (x:S) → K dependent function kind
S..T type interval kind
Context Γ ::= ∅ empty
Γ, x: T extension
* = predicate on terms defining a subset of the terms
As is customary, we write S → T
as a shorthand for non-dependent
function types (and kinds), i.e. for (x: S) → T
where x
does not
appear freely in T
. We write *
for the interval containing the
entire type lattice, i.e. for ⊥..⊤
.
Substitutions [x := p]
of a variable x
for a path p
and their
applications t[x := p]
to terms and T[x := p]
to types are defined
in the usual way.
We make use of let
-binders, which are desugared in the usual way,
⟦ let x: T = t₁ in t₂ ⟧ = (λx: T. ⟦ t₂ ⟧) ⟦ t₁ ⟧
omitting the explicit type annotation when it can be readily inferred.
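The desugaring can be written as a one-line AST transform. This is a sketch using a hypothetical tagged-tuple representation of terms, not part of the calculus itself:

```python
# Sketch: desugaring let-binders into an application of a term abstraction.
# Terms are hypothetical tagged tuples: ("var", x), ("abs", x, T, body), ("app", f, a).

def desugar_let(x, T, t1, t2):
    """⟦ let x: T = t1 in t2 ⟧ = (λx: T. ⟦t2⟧) ⟦t1⟧  (t1, t2 assumed already desugared)."""
    return ("app", ("abs", x, T, t2), t1)

# Example: let y: S = z in y  desugars to  (λy: S. y) z
assert desugar_let("y", "S", ("var", "z"), ("var", "y")) == \
    ("app", ("abs", "y", "S", ("var", "y")), ("var", "z"))
```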
As in DOT, parametric polymorphism is achieved through "boxing" and (path-)dependent types, e.g. universals are encoded as
∀X.T = (x: { A :: * }) → T[X := x.A]
ΛX.t = λx: { A :: * }. t[X := x.A]
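The encoding of universals can be made mechanical; the following sketch (hypothetical AST and helper names, with a simple type-variable substitution and x assumed fresh) illustrates the boxing step:

```python
# Sketch: encode ∀X.T as (x: { A :: * }) → T[X := x.A], replacing the type
# variable X by the type selection x.A. Types are tagged tuples; ("tvar", X) is
# a type variable, ("tsel", x, "A") the path-dependent selection x.A.

def subst_tvar(T, X, repl):
    """Replace every occurrence of the type variable X in T by repl."""
    if T == ("tvar", X):
        return repl
    if isinstance(T, tuple):
        return tuple(subst_tvar(c, X, repl) if isinstance(c, tuple) else c for c in T)
    return T

def encode_forall(X, T, x="x"):
    """∀X.T = (x: { A :: * }) → T[X := x.A]; ("box",) stands for { A :: * }."""
    return ("arrow", x, ("box",), subst_tvar(T, X, ("tsel", x, "A")))

# Example: ∀X. X → X becomes (x: { A :: * }) → x.A → x.A
assert encode_forall("X", ("fun", ("tvar", "X"), ("tvar", "X"))) == \
    ("arrow", "x", ("box",), ("fun", ("tsel", "x", "A"), ("tsel", "x", "A")))
```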
Type operators are encoded similarly as
K₁ → K₂ = { A :: K₁ } → K₂
λX: K. T = λx: { A :: K }. T[X := x.A]
Note that we can use dependent kinds to give very precise kinds to type operators:
λX: K. T :: (X: K) → T..T
As we'll see below, this allows us to admit β-reductions in types through a combination of kinding and sub-kinding.
Contractions:
(AppAbs) (λx:T.t) v —→ t[x := v]
d = … ∧ { a = t } ∧ …
(RecSel) ------------------------------------
(new(x: T)d).a —→ t[x := new(x: T)d]
Congruence:
t₁ —→ t₁'
(App₁) ---------------
t₁ t₂ —→ t₁' t₂
t —→ t'
(App₂) -----------
v t —→ v t'
t —→ t'
(Sel) -----------
t.a —→ t'.a
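The contraction and congruence rules above can be sketched as a small-step interpreter. This is my own illustration using a hypothetical tagged-tuple representation (types are carried but inert; record definitions d ∧ d are flattened into a dict):

```python
# Terms: ("var", x) | ("abs", x, T, body) | ("app", f, a) | ("new", x, T, defs) | ("sel", t, l)
# where defs maps member labels to terms.

def subst(t, x, p):
    """t[x := p]; adequate for the closed substitutes used in the reduction rules."""
    tag = t[0]
    if tag == "var":
        return p if t[1] == x else t
    if tag == "abs":
        _, y, T, body = t
        return t if y == x else ("abs", y, T, subst(body, x, p))
    if tag == "app":
        return ("app", subst(t[1], x, p), subst(t[2], x, p))
    if tag == "new":
        _, y, T, defs = t
        if y == x:
            return t
        return ("new", y, T, {l: subst(d, x, p) for l, d in defs.items()})
    if tag == "sel":
        return ("sel", subst(t[1], x, p), t[2])

def is_value(t):
    return t[0] in ("abs", "new")

def step(t):
    """One small-step reduction, or None if no rule applies."""
    tag = t[0]
    if tag == "app":
        f, a = t[1], t[2]
        if not is_value(f):                     # (App₁)
            s = step(f)
            return None if s is None else ("app", s, a)
        if not is_value(a):                     # (App₂)
            s = step(a)
            return None if s is None else ("app", f, s)
        if f[0] == "abs":                       # (AppAbs)
            return subst(f[3], f[1], a)
    elif tag == "sel":
        o, l = t[1], t[2]
        if not is_value(o):                     # (Sel)
            s = step(o)
            return None if s is None else ("sel", s, l)
        if o[0] == "new" and l in o[3]:         # (RecSel): t[x := new(x: T)d]
            return subst(o[3][l], o[1], o)
    return None
```

Note how (RecSel) substitutes the whole record for its self-variable, so recursive members unfold to the record itself.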
x : T ∈ Γ
--------- (T-Var)
Γ ⊢ x : T
Γ ⊢ S :: * Γ, x : S ⊢ t : T
------------------------------ (T-→-I)
Γ ⊢ λx: S.t : (x: S) → T
Γ ⊢ t : (x: S) → T Γ ⊢ p : S
------------------------------ (T-→-E₁)
Γ ⊢ t p : T[x := p]
Γ ⊢ t₁ : S → T Γ ⊢ t₂ : S
--------------------------- (T-→-E₂)
Γ ⊢ t₁ t₂ : T
Γ, x: T ⊢ d : T
------------------------ (T-{}-I)
Γ ⊢ new(x: T)d : μ(x: T)
Γ ⊢ t : { a : T }
----------------- (T-{}-E)
Γ ⊢ t.a : T
Γ ⊢ p : T[x := p]
--------------------- (T-Fold)
Γ ⊢ p : μ(x: T)
Γ ⊢ p : μ(x: T)
--------------------- (T-Unfold)
Γ ⊢ p : T[x := p]
Γ ⊢ t: T Γ ⊢ t: U
-------------------- (T-∧-I)
Γ ⊢ t: T ∧ U
Γ ⊢ t : S Γ ⊢ S <: T
---------------------- (T-Sub)
Γ ⊢ t : T
Value/path typing is defined as the obvious restrictions of the above relation to values/paths.
The following rules for let
-binders are admissible.
Γ ⊢ p : S Γ, x : S ⊢ t : T
--------------------------------- (T-Let₁)
Γ ⊢ let x: S = p in t : T[x := p]
Γ ⊢ t₁ : S Γ ⊢ t₂ : T
--------------------------- (T-Let₂)
Γ ⊢ let x: S = t₁ in t₂ : T
Γ ⊢ t : T
------------------------- (D-{:})
Γ ⊢ { a = t } : { a : T }
Γ ⊢ T :: K
-------------------------- (D-{::})
Γ ⊢ { A = T } : { A :: K }
Γ ⊢ d₁ : T₁ Γ ⊢ d₂ : T₂
--------------------------- (D-∧)
Γ ⊢ d₁ ∧ d₂ : T₁ ∧ T₂
---------- (T-Refl)
Γ ⊢ T <: T
Γ ⊢ S <: T Γ ⊢ T <: U
------------------------ (T-Trans)
Γ ⊢ S <: U
Γ ⊢ T :: *
---------- (T-<:-⊤)
Γ ⊢ T <: ⊤
Γ ⊢ T :: *
---------- (T-⊥-<:)
Γ ⊢ ⊥ <: T
-------------- (T-∧-<:₁)
Γ ⊢ T ∧ U <: T
-------------- (T-∧-<:₂)
Γ ⊢ T ∧ U <: U
Γ ⊢ S <: T Γ ⊢ S <: U
------------------------ (T-<:-∧)
Γ ⊢ S <: T ∧ U
Γ ⊢ S₂ <: S₁ Γ, x: S₂ ⊢ T₁ <: T₂
----------------------------------- (T-→-<:-→)
Γ ⊢ (x: S₁) → T₁ <: (x: S₂) → T₂
Γ ⊢ T₁ <: T₂
---------------------------- (T-{:}-<:-{:})
Γ ⊢ { a : T₁ } <: { a : T₂ }
Γ ⊢ K₁ <:: K₂
------------------------------ (T-{::}-<:-{::})
Γ ⊢ { A :: K₁ } <: { A :: K₂ }
Γ ⊢ S :: T₁..T₂
----------------- (T-..-<:)
Γ ⊢ T₁ <: S
Γ ⊢ S :: T₁..T₂
------------------ (T-<:-..)
Γ ⊢ S <: T₂
---------- (K-⊤-F)
Γ ⊢ ⊤ :: *
---------- (K-⊥-F)
Γ ⊢ ⊥ :: *
Γ ⊢ T, U :: *
-------------- (K-∧-F)
Γ ⊢ T ∧ U :: *
Γ ⊢ S :: * Γ, x: S ⊢ T :: *
------------------------------ (K-→-F)
Γ ⊢ (x: S) → T :: *
Γ ⊢ T :: *
------------------ (K-{:}-F)
Γ ⊢ { a : T } :: *
------------------- (K-{::}-F)
Γ ⊢ { A :: K } :: *
Γ, x: T ⊢ T :: *
---------------- (K-μ-F)
Γ ⊢ μ(x: T) :: *
Γ ⊢ T :: *
------------- (K-..-I)
Γ ⊢ T :: T..T
Γ ⊢ S :: * Γ, x: S ⊢ T :: K
------------------------------ (K-→-I)
Γ ⊢ λx: S.T :: (x: S) → K
Γ ⊢ S :: (x: T) → K Γ ⊢ p : T
------------------------------- (K-→-E₁)
Γ ⊢ S p :: K[x := p]
Γ ⊢ S :: T → K Γ ⊢ t : T
-------------------------- (K-→-E₂)
Γ ⊢ S t :: K
Γ ⊢ p : { A :: K }
------------------ (K-Sel)
Γ ⊢ p.A :: K
Γ ⊢ T :: K₁ Γ ⊢ K₁ <:: K₂
---------------------------- (K-Sub)
Γ ⊢ T :: K₂
Γ ⊢ S₂ <: S₁ Γ ⊢ T₁ <: T₂
---------------------------- (K-..-<::-..)
Γ ⊢ S₁..T₁ <:: S₂..T₂
Γ ⊢ T₂ <: T₁ Γ, x: T₂ ⊢ K₁ <:: K₂
------------------------------------ (K-→-<::-→)
Γ ⊢ (x: T₁) → K₁ <:: (x: T₂) → K₂
A few points worth noting:
-
The selection subtyping rules from DOT are now admissible, e.g. we have
Γ ⊢ p : { A :: S..T }
--------------------- (K-Sel)
Γ ⊢ p.A :: S..T
--------------------- (T-..-<:)
Γ ⊢ S <: p.A
-
β-reduction in types is also admissible (as a subtyping rule), e.g. we have
Γ ⊢ S :: *   Γ, x: S ⊢ T :: T..T
--------------------------------- (K-→-I)
Γ ⊢ λx: S. T :: (x: S) → T..T       Γ ⊢ p : S
--------------------------------------------- (K-→-E₁)
Γ ⊢ (λx: S. T) p :: (T..T)[x := p]
---------------------------------- (T-<:-..)
Γ ⊢ (λx: S. T) p <: T[x := p]
Note that this is not a new idea: Aspinall already noted in 1994 that β-reduction is admissible in systems with singleton types1. Type interval kinds can be understood as a generalization of singleton kinds2: clearly, every singleton {T}_* can be encoded as a singleton interval T..T. Most of the literature on the properties of systems with singleton types and kinds seems to use PER-semantics1, 2, 3, interpreting singleton types/kinds {T}_K as an equivalence class of types/kinds up to β(η)-equivalence.
-
Types in this system do not have normal forms in general. Consider e.g. the encoding of
{ type F[X] = F[F[X]] }
which is expressible as
new(z: { F :: (x: {A::*}) → z.F (z.F x) .. z.F (z.F x) }) { F = λx:{A::*}. z.F (z.F x) }
This is in contrast to the calculi with singleton kinds proposed by Stone and Harper2, where type equivalence is shown to be decidable via comparison of WHNFs.
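To see the divergence concretely, here is a toy Python sketch (my own illustration, not the calculus's type-level reduction) that iterates the unfolding z.F T ↦ z.F (z.F T) and checks that the type expression grows at every step, so no normal form is ever reached:

```python
# Toy illustration: unfolding F T ↦ F (F T). ("F", T) stands for the application
# z.F T; "X" is a type variable. Sizes grow strictly, so iteration never reaches
# a fixed point.

def size(ty):
    return 1 if ty == "X" else 1 + size(ty[1])

def unfold(ty):
    """One unfolding of the definition F[T] = F[F[T]] at the outermost application."""
    assert ty[0] == "F"
    return ("F", ("F", ty[1]))

ty = ("F", "X")
sizes = []
for _ in range(10):
    sizes.append(size(ty))
    ty = unfold(ty)
# Strictly increasing sizes: the type has no normal form under this rewrite.
assert all(a < b for a, b in zip(sizes, sizes[1:]))
```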
-
The introduction rule for interval kinds (over proper types)
Γ ⊢ T :: *
------------- (K-..-I)
Γ ⊢ T :: T..T
corresponds roughly to Aspinall's {}-I for singleton types,1
Γ ⊢ M : A
------------- ({}-I)
Γ ⊢ M : {M}_A
and Stone and Harper's introduction rule for singleton kinds.2
Γ ⊢ A :: *
-------------
Γ ⊢ A :: S(A)
-
One might wonder why we don't give the "most precise" kinds to proper types in the first place, e.g. the kinding rule for the top type could be
------------- (K-⊤-F′)
Γ ⊢ ⊤ :: ⊤..⊤
At first glance, it seems like the original (K-⊤-F) rule could then be recovered via subsumption, since ⊤..⊤ <:: *. However, a derivation of this relationship already requires one for ⊤ :: * or ⊥ :: *, and so we end up with a circular dependency:
   ???
----------
Γ ⊢ ⊤ :: *
---------- (T-⊥-<:)   ---------- (T-Refl)
Γ ⊢ ⊥ <: ⊤            Γ ⊢ ⊤ <: ⊤
--------------------------------- (K-..-<::-..)
Γ ⊢ ⊤..⊤ <:: ⊥..⊤
-
It's questionable whether identifying * with ⊥..⊤ is a good idea. Alternatively, one could treat them separately and add sub-kinding rules to convert between them. At the very least one should have that ⊥..⊤ <:: *.
-
One could take the analogy to singleton types one step further and allow intervals A .._K B over kinds K other than *. E.g. (λx:A₁.B₁) .._(* → *) (λx:A₂.B₂) would be an interval over unary type operators. Note the kind annotation to make sure the bounds of the interval are of the same kind. This is closer in spirit to the way abstract, higher-kinded type members are handled in dotty, where one can write e.g.
{ type F <: List }
(which the compiler η-expands to)
{ type F <: λX.List[X] }
While it is unclear whether this would simplify or complicate the meta theory, it would likely make the existing theory from systems with singleton types more easily applicable in the DOT setting.
Note that the more liberal, kind-annotated variant of intervals can be encoded in terms of the *-only variant. Again, this has already been noticed, for the case of singleton kinds, by Stone and Harper:2
S .._* T = S..T
F .._((x: S) → K) G = (x: S) → (F x) .._K (G x)
-
This proposal originally contained the following two extra subtyping rules:
Γ, x: S ⊢ T₁ <: T₂
-------------------------- (T-λ-<:-λ)
Γ ⊢ λx: S. T₁ <: λx: S. T₂

Γ ⊢ S₁ <: S₂
---------------- (T-App-<:-App)
Γ ⊢ S₁ t <: S₂ t
However, as Martin pointed out in a comment to his gist (https://gist.github.com/odersky/36aee4b7fe6716d1016ed37051caae95), these rules might not add any expressivity in practice. Consider the following example taken from his comment:
type C(x) <: Seq(x)
is encoded as
{ C :: (x: {A::*}) → ⊥..(Seq x) }
Then
p.C y :: ⊥..(Seq y)
Hence
p.C y <: Seq y
Although this idea does not seem to extend to the case of partial applications, e.g.
type Map[K, V] <: GenMap[K, V]
Map Int <: GenMap Int ?
such partial applications can only occur nested inside other subtyping derivations but never directly in (T-Sub), (T-<:-..) or (T-..-<:) since partially applied types necessarily have arrow-kinds.
The following example illustrates that type safety does not hold for the proposed calculus (neither for the above version, nor for Martin's original proposal).
let x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } } =
new(x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } })
{ A = { B :: x.A → ⊤..⊥ } } in
let y: { B :: x.A → ⊤..⊥ } =
new(y: { B :: (z: x.A) → (z.B z)..(z.B z) })
{ B = λz: x.A. z.B z } in
let z: y.B y = () in
(z: ⊥)
To see that this example typechecks, let's first verify that the declared types of x
, y
and z
match the types of their definitions.
-
For x, we have that x.A → ⊤..⊥ is a kind (syntactically), so we immediately get { B :: x.A → ⊤..⊥ } :: * by (K-{::}-F). From there, we get the declared type by applying the usual sequence of (K-..-I), (D-{::}), (T-{}-I) and (T-Unfold).
-
For y, we need to show that (z: x.A) → (z.B z)..(z.B z) <:: (z: x.A) → ⊤..⊥ (note the absurd bounds on the codomain). Let Γ = x: { A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ } } and Γ′ = Γ, z: x.A. Then by (T-Var) and (K-Sel)
Γ′ ⊢ x.A :: { B :: x.A → ⊤..⊥ }..{ B :: x.A → ⊤..⊥ }
by (T-Var), (T-<:-..) and (T-Sub)
Γ′ ⊢ z : { B :: x.A → ⊤..⊥ }
by (K-Sel)
Γ′ ⊢ z.B :: x.A → ⊤..⊥
by (T-Var) and (K-→-E)
Γ′ ⊢ z.B z :: ⊤..⊥
by (T-..-<:) and (T-<:-..)
Γ′ ⊢ ⊤ <: z.B z <: ⊥
by (K-..-<::-..) and (K-→-<::-→)
Γ ⊢ (z: x.A) → (z.B z)..(z.B z) <:: (z: x.A) → ⊤..⊥
-
For z, it's sufficient to show that ⊤ <: y.B y. Let Γ′′ = Γ, y: { B :: x.A → ⊤..⊥ }. Then by (T-Var) and (K-Sel)
Γ′′ ⊢ y.B :: x.A → ⊤..⊥
by (T-Var), (K-Sel) and (T-..-<:)
Γ′′ ⊢ { B :: x.A → ⊤..⊥ } <: x.A
by (T-Var) and (T-Sub)
Γ′′ ⊢ y : x.A
by (T-Var) and (K-→-E)
Γ′′ ⊢ y.B y :: ⊤..⊥
by (T-..-<:) and (T-<:-..)
Γ′′ ⊢ ⊤ <: y.B y <: ⊥
It follows that y.B y has absurd bounds in a context that corresponds to a store. Equivalently, we can substitute x and y by their definitions to obtain a closed type with absurd bounds. In either case, the type lattice becomes trivial.
The example is a variant of the well-known technique for typing arbitrary lambda terms in a system with (iso-)recursive types (see e.g. TAPL, ch. 20, pp. 273). Here, we lift this technique to the kind level to allow kinding arbitrary lambda expressions on the type level. Then we construct a divergent type expression and kind it with absurd bounds.
This unsoundness example makes crucial use of
- recursion in types (or rather in bounds of type members), and
- path-dependent type families.
The former is also present in Scala/dotty, whereas the latter is not. The example therefore does not constitute a soundness issue in those languages. Since this soundness issue is specific to HK DOT, it's worth elaborating a bit on this point. The different sorts of arrows in HK DOT (and Scala/dotty) can be categorized according to the sort of their domain and codomain:
-
(possibly path-dependent) function type (domain: type, codomain: type), inhabited by term-level lambdas/functions/methods,
example in DOT
f : (x: S) → T where S ≠ { A :: K } f = λx: S. t
examples in Scala/dotty
def f(x: S): T = t val g: (x: S) => T = (x: S) => t
-
universal quantifier/polymorphic type (domain: kind, codomain type), inhabited by polymorphic terms/functions
examples in DOT
f : (x: { A :: K }) → T f = λx: { A :: K }. t
examples in Scala/dotty
def f[X]: T = t def f[X[_]](y: ...): T = t
-
kind of path-indexed type families (domain: type, codomain: type), inhabited by type families or path-indexed singleton kinds
examples in HK DOT
F :: (x: S) → K where S ≠ { A :: K′ } F = λx: S. T
alternatively, using Martin's variant
F <: all(x: S)K where S ≠ { A : K′ } F = all(x: S)J J singleton, J <: K
there is no counterpart of this in Scala/dotty
-
kind of type operators, i.e. higher kind (domain: kind, codomain: kind), inhabited by type operators/functions/lambdas
examples in HK DOT
F :: (x: { A :: K′ }) → K F = λx: { A :: K′ }. T
in Martin's variant
F <: all(x: { A : K′ })K F = all(x: { A : K′ })J J singleton, J <: K
example in Scala/dotty
F[X] >: S <: T F[X, Y, Z] >: S <: T F[X[_]] >: S <: T F[X] = T
In other words, the kind language of both variants of HK DOT is strictly more expressive than that of Scala/dotty, which suggests that the soundness issue could be avoided by restricting kinds as follows:
K ::= (x: { A :: K }) → K′ type operator kind
S..T type interval kind
This restriction is purely syntactic and remains "compatible" with Scala/dotty in that any (higher-kinded) abstract type definition expressible in Scala can still be encoded in this subset.
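The restriction is easy to state as a predicate on kind ASTs; a sketch (tagged-tuple representation assumed, as in the calculus grammar above):

```python
# Sketch: check the restricted kind grammar
#   K ::= (x: { A :: K }) → K′  |  S..T
# Kinds: ("arrow", x, D, K2) with domain type D, or ("interval", S, T).
# The only admissible arrow domains are type boxes ("tmem", K1), i.e. { A :: K1 }:
# this rules out the path-indexed type families used in the counter example.

def well_restricted(K):
    if K[0] == "interval":           # S..T — bounds are types, not checked here
        return True
    if K[0] == "arrow":
        _, x, dom, cod = K
        return (dom[0] == "tmem"     # domain must be a type-member box
                and well_restricted(dom[1])
                and well_restricted(cod))
    return False
```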
- D. Aspinall, Subtyping with singleton types, CSL '94, LNCS 933, 1995. http://dx.doi.org/10.1007/BFb0022243
- C. A. Stone, R. Harper, Deciding Type Equivalence in a Language with Singleton Kinds, POPL '00. http://doi.acm.org/10.1145/325694.325724
- C. A. Stone, R. Harper, Extensional Equivalence and Singleton Types, ACM TCL 7(4), 2006. http://doi.acm.org/10.1145/1183278.1183281
- N. Amin, S. Grütter, M. Odersky, T. Rompf, S. Stucki, The Essence of Dependent Object Types, WadlerFest, LNCS 9600, 2016. http://dx.doi.org/10.1007/978-3-319-30936-1_14
- N. Amin et al., The DOT Calculus. Additional material including a detailed type safety proof with an accompanying Coq formalization. http://wadlerfest.namin.net/
TLDR:
A slightly more comprehensive summary with links to the relevant sections:
But let's start from the beginning...
Ah yes, that was a rather imprecise distinction indeed, good point. I'll make a mental note to be more clear about this point in future write-ups.
BTW: I recently re-read parts of the TAPL chapter on Featherweight Java (ch. 19, pp. 247), while preparing for TA-ing an exercise on FJ, and realized that it uses the same type of substitution-based semantics as I do above. In particular, there is also no distinction between object identity and object equality: the reduction rules for field selection and method invocation operate directly on objects.
1. Subtyping partial type applications
Yes, initially I was worried as well that partially applied types might appear in bounds of polymorphic functions, and thus, via transitivity, anywhere in subtyping derivations. But once you de-sugar the above signature into DOT, the partial application disappears. In DOT, the signature of
f
would be
where A is a dummy label, p is the module containing the abstract type GenMap, and int is a "boxed" variant of Int. Note the "η-expansion" of GenMap! This is necessary because the encoding of bounds performs an η-expansion (S .._K T performs an η-expansion, so higher-order type bounds are effectively "η-long"). As a consequence, every occurrence of a type operator in a bound is necessarily fully applied in the kind corresponding to the "type box".
Exactly.
I think you're onto something there. We want all (type-)definitions to have singleton kinds, so the typing rules above should be adjusted (Martin's original proposal actually enforced this restriction). There are good meta-theoretic reasons for requiring definitions to have singleton types, and incidentally, this would force definitions to be η-expanded. Consider the following definition
which we can either interpret as the identity operation on unary type operators or as a "curried" variant of type application, i.e.
App[F][X] = F[X]
. This type definition is actually not valid in Scala (I don't know whether it is in dotty). Instead, one has to η-expand and un-curry manually, like so:
Now in HK DOT (as outlined above) one may encode the former as
Note that the kind of
App
is certainly not a singleton, so according to Martin's original proposal, this wouldn't be valid. On the other hand, the η-expanded version can be given a singleton kind:
I believe this generalizes to arbitrary type operators.
To summarize, enforcing singleton types for all type definitions ensures all such definitions are η-long, which is also in accordance with Scala.
No, η-conversion is not admissible. But η-conversion only makes sense for type operators, and following the same reasoning as above, you should never need to compare type operators that are not fully applied. Note that fully applied variants of your example operators are comparable.
Indeed. That's a nice paper BTW (I guess most of Abel's papers are). His paper on polarized subtyping, which I mentioned earlier,6 also makes essential use of hereditary substitution (though he doesn't mention it explicitly, IIRC).
2. Proof terms for subtyping derivations and normalization
I certainly have!
I agree that this is of more general interest, and I think that we should formalize the untyped reduction rules! Firstly, they provide a sort of formal specification for the normalization proof we're looking for, and secondly, we can prove a variant of the type safety theorems for the new calculus. Namely
Intuitively, a coercion value is a coercion that is closed and normal/canonical, and that therefore corresponds to a subtyping derivation that can be inverted. Indeed, this latter point can be made formal by a handful of corresponding canonical forms lemmas (for coercions). I have sketched such a type safety proof for a variant of the system above (without recursion or records) and there don't seem to be any major difficulties. I believe that extending the proof to the full system would increase the technical complexity but wouldn't introduce any new conceptual difficulties.
These type-safety theorems of course do not guarantee that type safety holds for the original calculus, because coercion terms may be non-normalizing, in which case either a) the calculus is unsafe (see example below) or b) our reduction strategy is flawed. In either case, the formalization allows us to make the inherent "dynamics" of the normalization proof explicit, and — as you already mentioned — should give some insights about the proof-theoretic strength of the logic corresponding to the subtyping fragment.
Finally, it allows us to interpret the final type safety proof as a proof by (weak) simulation: the extended calculus (with explicit coercions) weakly simulates the original calculus by interleaving finite sequences of "silent" coercion reductions with the reductions of the "bare" terms in the original calculus. The simulation relation is simply the function that maps terms in the extended calculus to their bare counterparts by erasing all coercions.
I don't understand your example. Could you elaborate a bit? How would you use a term with this particular signature?
It is definitely true though that the coercion language does need to internalize dependencies. E.g. the language I introduce below does have binders and reduction rules for eliminating them.
There is also plenty of literature where subtyping rules in some language are elaborated into explicit coercion terms in a subtyping-free fragment of the same language. But this isn't what we want here, because it would blur the clear separation between the coercion language, which we need to be normalizing, and the original term language, which might not be normalizing (it certainly isn't here). Again, I'm going to say a bit more about this below.
Another very nice paper (and a source of inspiration for my attempt to use HSubst here). BTW, are you aware of a similar paper that handles STLC with products?
Indeed, the expressiveness/proof-theoretic power of the HK DOT system above (and in particular the (sub-)typing fragment) is too powerful for HSubst, as exemplified by the above safety counter example — in fact, this is how I found the counter example in the first place. But there are even simpler examples of why HSubst doesn't work "out of the book" here, and these apply even to the "fixed" version of HK DOT where the safety counter example doesn't apply. I'll get back to this later — first, let's have a look at the counter example to get a feeling for what a non-terminating subtyping coercion looks like!
3. Non-termination of the soundness/safety counter example
Since the presence of path-indexed type families (plus recursion) are powerful enough to type arbitrary (untyped) lambda calculus expressions on the type level, types are clearly not normalizing. However, we're not trying to normalize types, but rather (sub-)typing derivations, and as you further note,
Unfortunately, they can be encoded suitably.
In what follows, I'll use a new syntax for paths and types extended with explicit subtyping and subkinding coercions:
where I have indicated the corresponding (sub-)typing and kinding rules. For lack of a better name, let's call the original calculus the bare language (BL) and the extended calculus with explicit coercions the extended language (EL). Suggestions for better terminology are welcome...
When presenting (sub-)typing and kinding rules/derivations, I will use the EL for "proof terms", but the BL for the associated types and kinds. There is an obvious erasure function ⌞-⌟ that, given a path p, type T or kind K in the EL, yields the underlying bare path ⌞p⌟, type ⌞T⌟ or kind ⌞K⌟, respectively, in the BL. E.g. the updated subtyping rule for (T-..-<:) is
Note that the conclusion of this rule is still a subtyping judgment, albeit one with an explicit proof term
⟨S|. I.e. subtyping is now a four-place relation Γ ⊢ α : S <: T, relating a coercion α to its (bare) upper and lower bounds S and T in a typing context Γ. Importantly, <: remains part of the relation, i.e. it is not a type operator here. This is a design choice to simplify erasure of coercions from subtyping derivations (it suffices to just drop the proof term) and contrasts with systems that elaborate (or translate) subtyping judgments into a term language (e.g. through a relation Γ ⊢ S <: T ⇝ α) where coercions have proper types (e.g. Γ ⊢ α : S → T). Still, in designing the reduction rules, it is of course useful to think of coercions as maps α : S → T from terms of their lower bound (i.e. the domain) S to terms of their upper bound (i.e. the codomain) T.
Semantically, we treat elements of the BL as uninterpreted constants and reason only about the dynamics of the new extensions in the EL: coercions and the associated instances of subsumption. In this context, the sub-kinding rule (K-→-<::-→) takes the role of lambda abstraction,
internalizing implication in its arrow kind bounds
(x: T₁) → K₁ <:: (x: T₂) → K₂
. The kinding rule (K-→-E₁) along with subsumption (K-Sub) on the first premise acts as "application"/implication elimination, i.e.
We can "contract" this (sub-)kinding derivation by pushing the subsumption down the derivation tree through the "application". The corresponding reduction rule is
Note the similarity to β-reduction! It's easy to verify that this reduction preserves both the underlying bare type (of T p), and the kind (K₂[x := ⌞p⌟]), assuming the following substitution lemma holds:
Returning to the above safety counter example, if we try and reduce the (sub-)typing derivation for z : ⊥, we encounter exactly this situation. As before, let
then
and letting p = (z ↑ |x.A⟩).B z
Now let
i.e. v and q are exactly the "definitions" of x and y, respectively, and ⌞q⌟ = w. Furthermore, we have
This last coercion is the one we'd like to normalize. To do so, we start by substituting the definitions v and q for x and y (otherwise the coercion is already in normal form). We get
Expanding a bit and focusing on the left instance of (T-..-<:), we have
and we can apply the following reduction rule
to push subsumption through type member selection.
Note that the underlying bare type is again preserved by the reduction rule. Now we are ready to apply the β-like reduction rule discussed earlier.
where p′ = p[z := q ↑ ⟨v.A|]. Expanding p′, we get
which looks suspiciously similar to the left half of (1) above. Indeed, if we reduce p′ further we get
which is exactly the subexpression of (1) we started from. It's easy to convince oneself that these last three steps correspond to sensible reductions too (there are some subtleties involved in formulating the rule for the second step in general, but the instance used in this example is straightforward).
In summary, we are left with a coercion expression that is strictly bigger than the one we started from and which contains the core of the original coercion as a sub-expression.
So we are forced to conclude that this sequence of reductions diverges, and hence no (total) HSubst function can exist for this coercion calculus in general.
But, as stated above, I believe this situation can be remedied by restricting the kind language to forbid path-indexed families of types.
4. Impredicativity troubles for HSubst
So let's turn to the question of whether HSubst is powerful enough for proof normalization of the "fixed" coercion calculus. In order to use HSubst, we first need to define a language of "canonical" (or normal) path and type expressions. Such a definition is implicit in the reduction rules for coercions, but we need a direct, syntactic definition, possibly typedA5 or untyped.A1,A3,A4 This is cumbersome, but there are no real conceptual difficulties.
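For contrast, it may help to recall HSubst in a setting where it does terminate. The sketch below is standard machinery for the simply-typed λ-calculus (not HK DOT; the representation is my own): substitution re-normalizes any β-redex it creates, and the recursive call at a redex happens at the smaller domain type, which is exactly what the well-founded order is supposed to guarantee.

```python
# Hereditary substitution for STLC, on β-normal terms in spine form:
#   ("lam", x, body)  |  ("ne", head_var, [arg1, ...])   (neutral: variable + spine)
# Types: "b" (base) | ("->", A, B). Capture-avoidance is ignored for simplicity.

def hsubst(t, x, s, A):
    """Compute t[x := s], where s is canonical of type A; the result is canonical."""
    if t[0] == "lam":
        _, y, body = t
        return t if y == x else ("lam", y, hsubst(body, x, s, A))
    _, h, args = t
    args = [hsubst(a, x, s, A) for a in args]
    if h != x:
        return ("ne", h, args)
    # The head was hit: apply s to the spine, reducing each β-redex hereditarily.
    r, T = s, A
    for a in args:
        assert T[0] == "->"                     # well-typed spines only
        if r[0] == "lam":
            # β-redex: the recursive call happens at the *smaller* domain type T[1]
            r = hsubst(r[2], r[1], a, T[1])
        else:
            r = ("ne", r[1], r[2] + [a])
        T = T[2]
    return r
```

The impredicativity problem described below is precisely that in HK DOT no analogue of the "type gets smaller" argument survives substitution.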
Next, one has to define HSubst on canonical paths and types as well as on canonical path and type coercions, and then show that the type of the substitute path in any "nested" use of HSubst decreases, for some well-founded order on types. Since kinds can appear inside types and vice versa, we actually need a combined order that covers both types and kinds. The essential difficulty here is "impredicativity". Just as it is difficult to define a useful well-founded order on types in an impredicative calculus such as System F, it is difficult to define one on types and kinds in HK DOT. In particular, for HSubst to work, we need the kind of a type application to be smaller than the kind of the corresponding type operator, i.e. given
F :: (x: S) → K and p : S, we would like
to hold, where _≺_ is our candidate order. If variables were not allowed in kind expressions (as e.g. in Fω-<:A1), we could just take _≺_ to be the sub-expression order:
but obviously this does not work for HK DOT. Since paths, types and kinds may all appear circularly inside each other, we can pick e.g. K = ⊥..x! and p = { { ⊥..(x: S) → K } }, and get
which is the opposite of what we want. Note, firstly, that the above example still applies to the "fixed" kind language, where only arrow kinds of the form (x: { A :: K₁ }) → K₂ are allowed and, secondly, that the inclusion of kinds in the path language is crucial here. The mere presence of variables in kinds (or types) is not necessarily a problem. For example, Harper and Licata's "Canonical LF"A4 has dependent types, but since the system is stratified, i.e. kinds and types can never appear in terms, any terms appearing in types (or kinds) can simply be ignored by _≺_ (concretely, Harper and Licata first define a forgetful map from types to "simple types", forgetting any term occurrences, then define a well-founded order on simple types instead).
Let's work through an example to see what exactly goes wrong when we try to extend this approach to HK DOT. Consider the following definition of the identity function on types:
Now, assume we are given a variable z of type { I :: (x: {A::*}) → x.A..x.A }, i.e. the precise type of the identity function. We can apply the operator z.I to a pair of nested boxes containing some kind K, i.e. z.I { A = { B :: K } }, which is OK because { B :: K } :: * and hence { A = { B :: K } } : {A::*} for any K. Let C = { B :: K }, then we have
and hence C <: z.I { A = C } <: C, i.e. the identity applied to some type C is equivalent to C, as one would expect. For example, if we let K = (x: {A::*}) → x.A..x.A and B = I, we have
i.e. we can apply the identity to its own type. Let's make all the coercions in this statement explicit, starting with C:
Let k = { A = s(C) }. Noting that ⌞k⌟ = { A = C }, we have
Let E = (z.I k) ⇑ (⟨k.A|..|k.A⟩). Noting that ⌞E⌟ = z.I ⌞k⌟, we have
Note that the variable z, which is a path, appears inside the type z.I, which, in turn, appears as the head of a type application and eventually in the path coercion ⟨E|. Hence, interesting things should happen if we substitute some path p ↑ α for z and normalize the result. We expect the path coercion to be pushed, successively, through the selection of I, then the application to k and finally through the left projection ⟨-|. On the way, it should turn from a path coercion into a type coercion and back, and its type (and/or kind) should change accordingly.
To keep things simple, we will work with (an η-expanded version of) the identity coercion on the type of z:
I'll leave it to the reader to verify that these definitions are well-typed.
In what follows, I'll annotate paths/types and coercions with both their type/kind as well as simplified versions thereof. In particular, I'll write statements of the form p : T / |T| / ⌊T⌋, where |T| and ⌊T⌋ are, respectively, the types obtained by erasing all variables and all paths from T. E.g. we have
with |C| = ⌊C⌋ = { B :: |K| } and |K| = ⌊K⌋ = {A::*} → ?.A..?.A.
Now let's compute the normal form of (z ↑ ⟨E|)[z := p ↑ α]. We have
with
We can already see that the erasure function |-| does not commute with substitution, i.e. |T| ≠ |T[x := ⌞p⌟]|, as exemplified by the type of z ↑ ⟨E| before and after the substitution of ⌞p ↑ α⌟ for z. This is a problem because it means that the simple type |T[x := ⌞p⌟]| of an application f p may be bigger than the codomain |T| of the corresponding simple arrow type |S| → |T| of the function f. The same is true for the simple kinds of type applications. Concretely, in the above example, we have
where (|⌞k⌟|.A)..(|⌞k⌟|.A) is clearly not a subexpression of {A::*} → ?.A..?.A. Unfortunately, HSubst relies crucially on the fact that the type of an application (or any redex in general) is smaller than the type of the corresponding function (or, in general, the term being eliminated).
Conversely, the erasure function ⌊-⌋ does commute with substitution, i.e. we have ⌊T⌋ = ⌊T[x := ⌞p⌟]⌋. E.g. in the above example, ?.A..?.A is a subexpression of {A::*} → ?.A..?.A.
Continuing the above example, the first redex we encounter is
(p ↑ α).I
, deeply nested insideE′
. The associated reduction isThe kind of the redex and the type of the associated eliminatee
α
areExpanding
C = { I :: K }
and|C| = ⌊C⌋ = { I :: |K| }
, and noting that|K| = ⌊K⌋
, we see thatK
,|K|
and⌊K⌋
are subexpressions, respectively, ofC
,|C|
and⌊C⌋
— so far, all is well. The next redex we encounter is(p.I ⇑ Φ) k
, with reductionwhich is an instance of the β-like reduction rule we have seen earlier. The types of the redex and
Φ
areExpanding the definitions of
K
,|K|
and⌊K⌋
we discover that (⌞k⌟.A)..(⌞k⌟.A) is not a subexpression of K, nor is (|⌞k⌟|.A)..(|⌞k⌟|.A) a subexpression of |K|. However, ?.A..?.A is a subexpression of ⌊K⌋, as expected, since ⌊-⌋ commutes with substitution. We can therefore dismiss the subexpression order on types, as well as subexpressions modulo |-|, as candidates for our well-founded induction order for HSubst, and focus instead on subexpressions modulo ⌊-⌋.

To see why this will not work either, we first need to perform a few boring reductions on
E′
(mostly pushing identity coercions around and eliminating them):

The resulting coercion is a redex for the reduction:

which is an instance of the contraction rule:

The associated type/kind are:

Focusing on the lower bounds, we note that
C
is a subexpression of { A = C }.A..{ A = C }.A but, alas, ⌊C⌋ = { B :: |K| } is not a subexpression of ?.A..?.A. The erasure function ⌊-⌋ forgot too much.

In summary: just as paths, types and kinds can circularly contain each other, so can coercions on paths and kinds (i.e. subtyping and subkinding derivations). As a consequence, (hereditary) substitutions in types may trigger (hereditary) substitutions in paths, and vice versa. Whatever well-founded order or induction measure we use, it has to be defined both on (well-typed) paths and on (well-kinded) types. Ignoring all paths in types and kinds results in a well-founded order that is too coarse to permit deeply nested calls to HSubst, while only ignoring variables leads to an order that is not stable under substitution, i.e. it cannot relate the kind
K[x := p] of an application F p to the codomain K of the arrow kind (x: S) → K of the corresponding type operator F.

All hope is not lost, though: there could still be a well-founded order on types that goes beyond simple subexpressions modulo erasure. That is, provided that the type language of HK DOT is weaker than System F. So is it? One thing we can say is that System F can at least not be directly encoded in the type language of HK DOT, simply because that would require us to encode kind-level variables through boxing in path-level variables, similar to the way we encode type variables in DOT. Recall that in DOT we have an injection
{ A = _ }
from types to paths ("boxing") with inverse _.A ("unboxing"), which allows us to treat path variables x of type { A :: * } as type variables X = x.A, and to encode universal quantifiers (over types) like ∀X.T as ⟦ ∀X.T ⟧ = (x: { A :: * }) → ⟦ T[X := x.A] ⟧.
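For concreteness, the boxing translation of universals can be sketched as a purely syntactic transform on a toy AST. The representation, the deterministic fresh-name scheme, and the helper names are assumptions for illustration, not the paper's definitions:

```python
# Toy sketch of the boxing encoding:
#   ⟦ ∀X.T ⟧ = (x: { A :: * }) → ⟦ T[X := x.A] ⟧
# AST: ('tvar', X) | ('all', X, T) | ('arrow', x, S, T)
#    | ('tsel', x, 'A') | ('tmem', 'A', '*')

def subst_tvar(T, X, U):
    """Replace the type variable X by U in T."""
    if T[0] == 'tvar':
        return U if T[1] == X else T
    if T[0] == 'all':
        if T[1] == X:
            return T  # X is shadowed
        return ('all', T[1], subst_tvar(T[2], X, U))
    if T[0] == 'arrow':
        return ('arrow', T[1], subst_tvar(T[2], X, U), subst_tvar(T[3], X, U))
    return T  # 'tsel', 'tmem'

def encode(T):
    """⟦-⟧: compile universals away via boxing."""
    if T[0] == 'all':
        X, body = T[1], T[2]
        x = 'x_' + X                   # deterministic, assumed-fresh path var
        boxed = ('tsel', x, 'A')       # the type variable becomes x.A
        return ('arrow', x, ('tmem', 'A', '*'),
                encode(subst_tvar(body, X, boxed)))
    if T[0] == 'arrow':
        return ('arrow', T[1], encode(T[2]), encode(T[3]))
    return T

# ⟦ ∀X. X ⟧ = (x_X: { A :: * }) → x_X.A
assert encode(('all', 'X', ('tvar', 'X'))) == \
    ('arrow', 'x_X', ('tmem', 'A', '*'), ('tsel', 'x_X', 'A'))
```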
To encode universals ∀k.K over kinds, we would first need an encoding for kind variables k, i.e. a similar invertible injection from kinds to paths. A candidate is { A = { B :: _ } }, but there is no syntactic form that acts as its inverse.

5. Some thoughts on a proof via logical relations
I think the term "hereditary substitution" had not been invented when Proofs and Types was written. It seems to have originated [A3,A4,A5] in a paper by Watkins et al. that was published in 2004 [A6]. However, the method used to prove the weak normalization theorem in chapter 4 of P&T also makes crucial use of the fact that, in STLC, the type of a redex is smaller than the type of the corresponding eliminatee, e.g. for applications f t : T, we have f : S → T, and for projections πᵢ t : Tᵢ, we have t : T₁ × T₂. Both techniques leverage this fact by using the types of eliminatees in redexes as part of their induction/termination measure.

The origins of the term "logical relations" seem to go back much further; according to a thread on the TYPES mailing list, at least as far as 1974. So I'm not sure why it is not mentioned in P&T.
Neither have I.
I don't understand this comment. What exactly do you mean by "applying it to an assumption"? And why would the relation not be defined by induction on types/kinds? The way I see it, there would be four mutually inductively defined relations/predicates (one each for terms and types, and one each for the corresponding coercions). Since coercions are typed/kinded by two types/kinds, the corresponding relations would likely be defined by induction on e.g. the upper bound and case analysis on the lower, or vice versa. E.g. definitions might look something like this:

p ∈ R⟦{ K }⟧ if p! ∈ R⟦K⟧.

α ∈ R⟦T <: { K }⟧ if, for all p, p ∈ R⟦T⟧ implies (p ↑ α)! ∈ R⟦K⟧.

Needless to say, this seems rather non-standard. For the arrow cases, predicativity/dependency becomes an issue, e.g. the following definition is not inductive in the upper bound:
Φ ∈ R⟦(x: S₁) → K₁ <: (x: S₂) → K₂⟧ if, for all T and p, T ∈ R⟦(x: S₁) → K₁⟧ and p ∈ R⟦S₂⟧ implies (T ⇑ Φ)! p ∈ R⟦K₂[x := ⌞p⌟]⟧.

and we instead need to parametrize the (reducibility) relation on K₂ over relations (i.e. reducibility candidates) on S₂. Normalization proofs for e.g. the calculus of constructions [A2] have to deal with similar dependency-related issues and might serve as a starting point here.

For now, I have no idea how to deal with the order-related operations (reflexivity, transitivity) in this setting, nor how to represent recursive types/paths.
6. Conclusions and future work
I have more to say, but this comment is already way too long, so that's for another time.
Additional references