To have an ergonomic, flexible library that can abstractly represent
Rust types. The library should be expose a "minimal" set of types that
is nonetheless able to capture the full range of Rust types. "Minimal"
here means that some of the surface differences in Rust types -- e.g.,
the distinction between built-in types like u32
and user-defined
types like a struct -- ought to be minimized, so that code that works
with these types (e.g., trait solving) can focus on the most important
differences.
At various points in this document you see Question notes, which are places which note some specific uncertainties.
Generic over a "type family". Virtually all types in the library
are made generic over a trait, TypeFamily
. This allows embedders of
the library to customize how the types themselves are represented, and
to map the types.
Most everything in the IR is parameterized by the TypeFamily
trait:
trait TypeFamily: Copy + Clone + Debug + Eq + Ord {
..
}
We'll go over the details later, but for now it suffices to say that
the type family is defined by the embedded and can be used to control
(to a certain extent) the actual representation of types, goals, and
other things in memory. For example, the TypeFamily
trait could be
used to intern all the types, as rustc does, or it could be used to
Box
them instead, as the chalk testing harness currently does.
Question: What to name the TypeFamily
? Since instances of it
must be passed around, I've been considering TypeContext
, as well,
and to call those instances tcx: TypeContext
, like in the
compiler...?
Types are represented by a Ty<TF>
type and the TyData<TF>
enum.
There is no direct connection between them. The link is rather made
by the TypeFamily
trait, via the InternedTy
associated type:
struct Ty<TF: TypeFamily>(TF::InternedTy);
enum TyData<TF: TypeFamily> { .. }
The way this works is that the InternedTy
trait has an associated
type and two related methods:
trait InternedTy {
type InternedTy;
fn intern_ty(&self, data: &TyData<Self>) -> Self::InternedTy;
fn lookup_ty(data: &Self::InternedTy) -> &TyData<Self>;
}
However, as a user you are not meant to use these directly. Rather,
they are encapsulated in methods on the Ty
and TyData
trait:
impl<TF: TypeFamily> Ty<TF> {
fn data(&self) -> &TyData<TF> {
TF::lookup_ty(self)
}
}
and
impl<TF: TypeFamily> TyData<TF> {
fn intern(&self, tf: &TF) -> Ty<TF> {
Ty(tf.intern_ty(self))
}
}
Note that there is an assumption here that TF::lookup_ty
needs no
context. This effectively constrains the InternedTy
representation
to be a Box
or &
type. To be more general, at the cost of some
convenience, we could make that a method as well, so that one would
invoke ty.data(tf)
instead of just ty.data()
. This would permit us
to use (for example) integers to represent interned types, which might
be nice (e.g., to permit using generational indices).
Question: Should we make lookup_ty
a proper method?
This section covers the variants we use to categorize types. We have endeavored to create a breakdown that simplifies the Rust "surface syntax" of types to their "essence". In particular, the goal is to group together types that are largely treated identically by the system and to separate types when there are important semantic differences in how they are handled.
Chalk variant | Example Rust types |
---|---|
Apply |
Vec<u32> , f32 |
Placeholder |
how we represent T when type checking fn foo<T>() { .. } |
Dyn |
dyn Trait |
Fn |
fn(&u8) |
Alias |
<T as Iterator>::Item , or the Foo in type Foo = impl Trait and type Foo = u32 |
BoundVariable |
an uninstantiated generic parameter like the T in struct Foo<T> |
Each variant of TyData
generally wraps a single struct, which
represents a type known to be of that particular variant. This section
goes through the variants in a bit more detail, and in particular
describes why each variant exists.
Apply
-- anApplicationTy<TF>
is kind of a "normal Rust type", likeVec<u32>
orf32
. These consist of a "type name" (in our examples,Vec
andf32
respecively) and zero or more generic arguments (respectively,[u32]
and[]
).- They are equal to other types (modulo aliases, see below) iff they have the same "type name" and the generic arguments are recursively equal
Placeholder
-- aPlaceholderIndex
type represents a generic type that is being treated abstractly or -- more generally -- the result of a "type function" that cannot be evaluated. For example, when typing the body of a generic function likefn foo<T: Iterator>
, the typeT
would be represented with a placeholder. Similarly, in that same function, the associated typeT::Item
might be represented with a placeholder.- Like application types, placeholder types are only known to be equal
- When proving negative goals, e.g.,
not { Implemented(T: Trait) }
, placeholders are replaced with inference variables -- the idea is that this goal is only true if there is no typeT
that implementsTrait
. Therefore, if we can find no answeres forexists<T> { Implemented(T: Trait) }
, then we know that the negation is true.- Note that this means that e.g.
forall<X> { X = i32 }
is false but so isforall<X> { not { X = i32 } }
.
- Note that this means that e.g.
InferenceVar
-- anInferenceVar
type represents a type whose value is being inferred. Inference variables may be "bound" or "not bound", but that state is stored externally, in the inference context (see the section on inference below).- When equating, inference variables are treated specially in that they become bound (or, if they have already been bound, they are replaced with their value).
- Inference variables are also integral to canonicalization and other types.
Dyn
-- aDynTy<TF>
represents adyn Trait
type. In chalk, these are represented as an existential type where we store the predicates that are known to be true. So a type likedyn Write
would be represented as, effectively, anexists<T> { T: Write }
type.- Two
dyn P
anddyn Q
types are equal ifP => Q
andQ => P
, in short. - There are "automatic" rules for proving that
dyn P: P
and so forth.- (We have to reconcile this with some of the soundness issues associated with dyn trait)
- Question: in rustc,
dyn Foo
has a lifetime bound, I guess we need that in chalk?
- Two
Fn
-- aFnTy<TF>
represents afn()
type. These are almost just application types, except that they can also contain aforall
quantifier over lifetimes (e.g.,forall<'a> fn(&'a u32)
).- Two
Fn
typesA, B
are equalA = B
ifA <: B
andB <: A
- Two
Fn
typesA, B
are subtypesA <: B
if- After instantiating the lifetime parameters on
B
universally...- You can instantiate the lifetime parameters on
A
existentially...- And then you find that
P_B <: P_A
for every parameter typeP
onA
andB
andR_A <: R_B
for the return typeR
ofA
andB
.
- And then you find that
- You can instantiate the lifetime parameters on
- After instantiating the lifetime parameters on
- We currently handle type inference with a bit of a hack (same as
rustc); when relating a
Fn
typeF
to an unbounded type variableV
, we instantiateV
withF
. But in practice because of the above subtyping rules there are actually a range of values thatV
could have and still be equal withF
. This may or may not be someting to consider revisiting.
- Two
GeneratorWitness
-- aGeneratorWitness<TF>
represents the types that may be part of a generator state. Unlike other types, these types can include bound, existential lifetimes, which refer to lifetimes within the suspended stack frame. You can think of it as a type likeexists<'a> { (T...) }
.- This is very similar to an
Apply
type, but it has a binder for the erased lifetime(s), which must be handled specifically in equating and so forth. - It is also similar to a
Dyn
type, in that it represents an existential type, but in contrast toDyn
, what we know here is not a predicate but rather some upper bound on the set of types contained within.
- This is very similar to an
Alias
-- anAliasTy<TF>
represents some form of type alias. These correspond to Rust types like<T as Iterator>::Item
but alsoimpl Trait
types and type aliases likeFoo<X>
wheretype Foo<X> = Vec<X>
. Each alias has an alias id as well as parameters. Aliases effectively represent a type function.- In general, an alias type
A
can also be equal to any other typeT
if evaluating the aliasA
yieldsT
(this is currently handled in Chalk via aProjectionEq
goal, but it would be renamed toAliasEq
under this proposal).- However, some alias types can also be instantiated as "alias
placeholders". This occurs when the precise type of the alias is
not known, but we know that there is some type that it
evaluates to (for example,
<T as Iterator>::Item
might be treated opaquely asT::Item
; similarlyimpl Trait
types are treated opaquely until the latter phases of the compiler).
- However, some alias types can also be instantiated as "alias
placeholders". This occurs when the precise type of the alias is
not known, but we know that there is some type that it
evaluates to (for example,
- In general, an alias type
BoundVariable
-- a bound variable represents some variable that is bound in an outer term. For example, given a term likeforall<X> { Implemented(X: Trait) }
, theX
is bound. Bound variables in chalk (like rustc) use de bruijin indices (See below).- Bound variables are never directly equated, as any bound variables would have been instantiated with either inference variables or placeholders.
- They do appear in canonical forms and other terms that contain binders.
Error
-- theError
type represents a type that resulted from some erroneous expression. Error types generally propagate eagerly in an attempt to suppress nonsense errors that are derived by interactions with buggy code.Error
should be its own variant because most bits of code will want to handle it somewhat specially -- e.g., maybe it can "unify" with any other type without any effect, and so forth.
The rustc TyKind
enum has a lot more variants than chalk. This
section describes how the rustc types can be mapped to chalk
types. The intention is that, at least when transitioning, rustc would
implement the TypeFamily
trait and would map from the TyKind
enum to chalk's TyData
on the fly, when data()
is invoked.
This section describes how a "representative sample" of rustc types can be mapped to Chalk variants.
rustc type | chalk variant (and some notes) |
---|---|
Bool |
Apply |
Char |
Apply |
Int(_) |
Apply |
Uint(_) |
Apply |
Float(_) |
Apply |
Adt(_, _) |
Apply |
Foreign(_) |
Apply |
Str |
Apply |
Array(_, _) |
Apply |
Slice(_) |
Apply |
RawPtr(_) |
Apply |
Ref(_, _, _) |
Apply |
FnDef(_, _) |
Apply |
FnPtr(_, _) |
Fn |
Dynamic(_, _) |
Dyn |
Closure(_, _) |
Apply |
Generator(_, _) |
Apply |
GeneratorWitness(_) |
GeneratorWitness |
Never |
Apply |
Tuple(_) |
Apply |
Projection(_) |
Alias |
UnnormalizedProjection(_) |
(see below) |
Opaque(_, _) |
Alias |
Param(_) |
XXX Placeholder? |
Bound(_, _) |
BoundVariable |
Placeholder(_) |
Placeholder |
Infer(_) |
InferenceVar |
Error |
Error |
An ApplicationTy<TF>
is kind of a "normal Rust type", like
Vec<u32>
or f32
. Such types are only "equal" to themselves (modulo
aliases, see below), and they may take type arguments. Note that we
group together both user-defined structs/enums/unions (like Vec
)
as well as built-in types like f32
, which effectively behave the
same.
An ApplicationTy<TF>
contains two fields:
- a "type name" (of type
TypeName<TF>
); and, - a list of generic arguments (of type
GenericArguments<TF>
).
The TypeName<TF>
itself is largely opaque to chalk. We discuss it in
more detail elsewhere. The point is that it represents, semantically,
either the name of some user-defined type (like Vec
) or builin-types
like i32
. It may also represent types like "tuple of arity 2" ((_, _)
) or "fixed-length array" [_; _]
. Note that the precise set of
these built-in types is defined by the TypeFamily
and is unknown to
chalk-ir.
This section is tracking things that have to be changed in chalk to match what is described in this document.
Parameter<TF>
needs to be renamed toGenericArgument
Vec<Parameter<TF>>
needs to be replaced withGenericArguments
- Extract
TypeName
into something opaque to chalk-ir. - Dyn type equality should probably be driven by entailment.
- Projections need to be renamed to aliases.
- The variant we use for impl traits should be removed and folded into type aliases.
- Remove placeholders and projection placeholders from apply and create placeholder types.
- Move
Error
from aTypeName
to its own variant. - Introduce
GeneratorWitness
into chalk - Complete transition from
ForAll
toFn
in chalk