Swift allows you to initialize any type from an integer literal in source code by conforming it to the ExpressibleByIntegerLiteral protocol. Currently, Int8, Int16, Int32, Int64, Int, and their unsigned counterparts are supported as bootstrap types when the compiler emits code passing source code literals to an ExpressibleByIntegerLiteral.init(integerLiteral:) implementation. This effectively caps the maximum supported integer literal width to 64 bits.
This proposal aims to unlock arbitrary-precision integer literals in a way that:
- will decouple the compile-time overflow checking from the actual passed integer value storage,
- will eliminate the need for magic underscored protocols,
- does not resort to inlining or constexpr-like ideas,
- will obsolete components of ExpressibleByFloatLiteral that overlap with ExpressibleByIntegerLiteral,
- will make it easier to reason about the behavior of textual literal protocols such as ExpressibleByExtendedGraphemeClusterLiteral,
as well as laying foundations for features that we may want Swift to have in the future, including but not limited to:
- lossless decimal types,
- arbitrary-precision float types,
- significant figures, and
- hex color literals.
- literal: A pre-parsed representation of a source code string (e.g. (base: .hex, words: [0xabcd], places: 4) for "0xabcd").
- bootstrap type: A value supplied by the compiler for the purposes of initializing a literal-expressible type. Currently, it is the argument passed to an ExpressibleByIntegerLiteral.init(integerLiteral:) implementation. For integer literals right now (but not literals in general), they are always available to the caller at compile time.
- expressed type: A type conforming to ExpressibleByIntegerLiteral. It is initialized from a bootstrap value.
Right now, ExpressibleByIntegerLiteral generally works like this:
// module-A.swift
struct Foo:ExpressibleByIntegerLiteral
{
    typealias IntegerLiteralType = UInt8
    init(integerLiteral:UInt8)
    ...
}
// module-B.swift
import struct A.Foo
func foo() -> Foo
{
    255 as Foo
}
Key things to note:
Module B does not know anything about A.Foo’s ExpressibleByIntegerLiteral conformance, other than that it has one.
That is, the body of B.foo is equivalent to:
Foo.init(integerLiteral: 255 as UInt8) // as Foo
When you get an overflow error like:
error: integer literal '256' overflows when stored into 'Foo'
256 as Foo
^
the compiler did not actually know how to check whether 256 was a valid Foo; it only knew how to decompose 256 as Foo into Foo.init(integerLiteral: 256 as UInt8), and pattern-match the 256 as UInt8 bootstrap expression. The compiler only knows how to pattern-match a fixed set of builtin bootstrap types conforming to _ExpressibleByBuiltinIntegerLiteral, which is why you cannot use ExpressibleByIntegerLiteral to implement compile-time overflow checking for, say, 24-bit values.
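To make the limitation concrete, here is a minimal sketch of a 24-bit type written against today's protocol. The UInt24 type and its max constant are hypothetical names for illustration only; the point is that the range check has to live in the initializer and therefore runs at run time.

```swift
// Hypothetical 24-bit type; not part of the standard library.
struct UInt24: ExpressibleByIntegerLiteral {
    static let max: UInt32 = (1 << 24) - 1

    let value: UInt32

    init(integerLiteral: UInt32) {
        // The compiler only verified that the literal fits in UInt32,
        // so the 24-bit range must be enforced here, at run time.
        precondition(integerLiteral <= UInt24.max,
                     "integer literal overflows when stored into 'UInt24'")
        self.value = integerLiteral
    }
}

let ok: UInt24 = 0xff_ffff
print(ok.value == UInt24.max) // true

// let tooBig: UInt24 = 0x100_0000 // compiles today, but would trap at run time
```

The commented-out line is accepted by the compiler because 0x100_0000 fits in the UInt32 bootstrap type, which is the only thing it can check.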
The ExpressibleByIntegerLiteral conformance for A.Foo in module A does not know anything about the literal written in the B.foo method, other than its bootstrap value.
For example, B.foo could have written 255, 0255, 0xff, or even 0b1111_1111.
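A small sketch of this, assuming a Foo that simply stores its bootstrap value (the stored property is an addition for illustration): all four spellings reach the initializer as the same UInt8.

```swift
// Stand-in for A.Foo; the `value` property is assumed for illustration.
struct Foo: ExpressibleByIntegerLiteral {
    typealias IntegerLiteralType = UInt8

    let value: UInt8

    init(integerLiteral: UInt8) {
        self.value = integerLiteral
    }
}

// Four different spellings of the same number.
let spellings: [Foo] = [255, 0255, 0xff, 0b1111_1111]
let values = spellings.map { $0.value }

print(values) // [255, 255, 255, 255]
```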
This information could often be valuable to us. For example, an RGB hex color type might want to only accept hexadecimal literals exactly six digits (24 bits) long. The numeric values 0xff11ff (hot pink) and 0x00ff11ff are equivalent, but the second form is ambiguous, because it’s not clear if the final ff is the alpha component or the blue component. This forces things like CSS builders to resort to awkward APIs like (r: 0xff, g: 0x11, b: 0xff) instead of the more natural 0xff11ff.
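The ambiguity can be reproduced with the current protocol. In this sketch, RGB is a hypothetical color type; because only the numeric value reaches the initializer, the six-digit and eight-digit spellings cannot be told apart.

```swift
// Hypothetical color type using today's ExpressibleByIntegerLiteral.
struct RGB: ExpressibleByIntegerLiteral, Equatable {
    let r: UInt8, g: UInt8, b: UInt8

    init(integerLiteral: UInt32) {
        // Only the numeric value arrives here; the digit count is gone.
        self.r = UInt8(truncatingIfNeeded: integerLiteral >> 16)
        self.g = UInt8(truncatingIfNeeded: integerLiteral >> 8)
        self.b = UInt8(truncatingIfNeeded: integerLiteral)
    }
}

let sixDigits: RGB = 0xff11ff
let eightDigits: RGB = 0x00ff11ff

print(sixDigits == eightDigits) // true
```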
Moreover, in many decimal-related applications, the number of leading zeroes is significant. Having a way to preserve digit count information will allow us to reuse parts of our integer literal system when designing lossless decimal types in the future.
We should scrap the magic _ExpressibleByBuiltinIntegerLiteral protocol, and replace it with an officially-supported IntegerLiteral protocol. (Only standard library types can conform to _ExpressibleByBuiltinIntegerLiteral, so the ABI impact should be limited.)
protocol IntegerLiteral
{
    associatedtype Base where Base:IntegerLiteralBase
    init(_:Base, words:[UInt], places:Int)
}
The precise definition of words and places is not important to this proposal, and can be bikeshed at a later date. More important is the Base type and the IntegerLiteralBase protocol.
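As a rough sketch of what a conformance might look like, the block below re-declares both proposed protocols locally (neither exists in the standard library) and adds two hypothetical types, Radix and HexColorLiteral. It assumes that places is the digit count and that words holds the value least-significant word first; the proposal leaves both definitions open.

```swift
// Local re-declarations of the proposed protocols, for illustration only.
protocol IntegerLiteralBase {
    static var binary: Self { get }
    static var octal: Self { get }
    static var decimal: Self { get }
    static var hexadecimal: Self { get }
}

protocol IntegerLiteral {
    associatedtype Base: IntegerLiteralBase
    init(_ base: Base, words: [UInt], places: Int)
}

// Hypothetical Base type: enum cases witness the static requirements.
enum Radix: IntegerLiteralBase {
    case binary, octal, decimal, hexadecimal
}

// Hypothetical bootstrap type that remembers how the literal was written.
struct HexColorLiteral: IntegerLiteral {
    let base: Radix
    let value: UInt
    let places: Int

    init(_ base: Radix, words: [UInt], places: Int) {
        self.base = base
        // Assumption: least-significant word first; keep only that word.
        self.value = words.first ?? 0
        self.places = places
    }
}

// Simulates what the compiler might pass for the literal 0xff11ff.
let literal = HexColorLiteral(.hexadecimal, words: [0xff11ff], places: 6)
print(literal.base, literal.places)
```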
If all we cared about was preserving the numeric base information, we could simply have the standard library vend a concrete SwiftBase enumeration like:
enum SwiftBase
{
    case binary, octal, decimal, hexadecimal
}
Then we could omit the associatedtype Base. We don’t, because having Base as an associatedtype requirement has a lot of advantages:
- Associated types are always known at compile time. This means the compiler can use Base to implement compile-time overflow checking. This also means the compiler could statically forbid expressed types from being written in certain bases in accordance with the Base type.
- Base can conform to protocols (besides IntegerLiteralBase), and these protocol conformances can be orthogonal to each other. The compiler can vend the set of static checks it knows how to do as marker protocols, and use these conformances to perform compile-time validation. Types conforming to ExpressibleByIntegerLiteral can then opt in to different kinds of compile-time validation through their Base type.
- Base effectively decouples the compile-time validation from the physical bootstrap type used to transfer data from source code to an ExpressibleByIntegerLiteral initializer. This will enable us to, for example, implement 24-bit overflow checking without having to add an Int24 integer type to the standard library.
We need IntegerLiteralBase because we want to have some way of passing along the original literal’s base to the ExpressibleByIntegerLiteral implementation. Users cannot use the Base associated type to enable any numeric base, because that would change the syntax of the language. So we need to have some way of translating the four known Swift bases (binary, octal, decimal, hexadecimal) into an instance of an ExpressibleByIntegerLiteral type’s Base.
protocol IntegerLiteralBase
{
    static
    var binary:Self
    {
        get
    }
    static
    var octal:Self
    {
        get
    }
    static
    var decimal:Self
    {
        get
    }
    static
    var hexadecimal:Self
    {
        get
    }
}
Base types that don’t support certain numeric bases can implement those requirements as Never.
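One possible reading of this, sketched below, is that the unsupported getters call a function returning Never (here fatalError), so they never produce a value. HexOnly is a hypothetical type, and the protocol is re-declared locally because it does not exist in the standard library; how the compiler would turn this into a static diagnostic is not specified here.

```swift
// Local re-declaration of the proposed protocol, for illustration only.
protocol IntegerLiteralBase {
    static var binary: Self { get }
    static var octal: Self { get }
    static var decimal: Self { get }
    static var hexadecimal: Self { get }
}

// Hypothetical Base for a literal type that accepts hexadecimal only.
enum HexOnly: IntegerLiteralBase {
    case hexadecimal

    // fatalError() returns Never, so these getters cannot return normally.
    static var binary: HexOnly { fatalError("binary literals are not supported") }
    static var octal: HexOnly { fatalError("octal literals are not supported") }
    static var decimal: HexOnly { fatalError("decimal literals are not supported") }
}

let supported = HexOnly.hexadecimal
print(supported) // hexadecimal
```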
The data flow proposed here has three steps:
- Compiler lexes a (Builtin.Base, words:[UInt], places:Int) tuple from source code. All fields are known, including the Builtin.Base value and the Self.IntegerLiteral.Base type. (But not the Self.IntegerLiteral.Base value.)
- At run time, instantiate a Self.IntegerLiteral.Base value from the Builtin.Base value, and then instantiate the Self.IntegerLiteral bootstrap value.
- At run time, call Self.init(integerLiteral:) with the newly-constructed Self.IntegerLiteral bootstrap value.
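The three steps can be simulated by hand in ordinary Swift. Everything in this sketch is a stand-in: BuiltinBase plays the role of Builtin.Base, ExpressibleByProposedIntegerLiteral plays the role of the revised ExpressibleByIntegerLiteral, the express function does the work the compiler would emit, and Radix, ColorLiteral, and HexColor are hypothetical conforming types.

```swift
protocol IntegerLiteralBase {
    static var binary: Self { get }
    static var octal: Self { get }
    static var decimal: Self { get }
    static var hexadecimal: Self { get }
}

protocol IntegerLiteral {
    associatedtype Base: IntegerLiteralBase
    init(_ base: Base, words: [UInt], places: Int)
}

// Stand-in for the revised ExpressibleByIntegerLiteral.
protocol ExpressibleByProposedIntegerLiteral {
    associatedtype Literal: IntegerLiteral
    init(integerLiteral: Literal)
}

// Stand-in for the compiler's Builtin.Base.
enum BuiltinBase { case binary, octal, decimal, hexadecimal }

// Steps 2 and 3, written out as the code the compiler would generate.
func express<T: ExpressibleByProposedIntegerLiteral>(
    base: BuiltinBase, words: [UInt], places: Int
) -> T {
    // Step 2a: translate the builtin base into the type's own Base value.
    let mapped: T.Literal.Base
    switch base {
    case .binary:      mapped = T.Literal.Base.binary
    case .octal:       mapped = T.Literal.Base.octal
    case .decimal:     mapped = T.Literal.Base.decimal
    case .hexadecimal: mapped = T.Literal.Base.hexadecimal
    }
    // Step 2b: build the bootstrap value.
    let bootstrap = T.Literal(mapped, words: words, places: places)
    // Step 3: hand it to the expressed type.
    return T(integerLiteral: bootstrap)
}

enum Radix: IntegerLiteralBase { case binary, octal, decimal, hexadecimal }

struct ColorLiteral: IntegerLiteral {
    let base: Radix, value: UInt, places: Int
    init(_ base: Radix, words: [UInt], places: Int) {
        self.base = base
        self.value = words.first ?? 0 // assumes least-significant word first
        self.places = places
    }
}

struct HexColor: ExpressibleByProposedIntegerLiteral {
    let rgb: UInt
    let wellFormed: Bool
    init(integerLiteral literal: ColorLiteral) {
        self.rgb = literal.value
        // Accept only six-digit hexadecimal spellings.
        self.wellFormed = literal.base == .hexadecimal && literal.places == 6
    }
}

// Step 1 is simulated by writing the lexed tuple out by hand.
let good: HexColor = express(base: .hexadecimal, words: [0xff11ff], places: 6)
let bad: HexColor = express(base: .hexadecimal, words: [0xff11ff], places: 8)
print(good.wellFormed, bad.wellFormed) // true false
```

Here the validation happens at run time; under the proposal, the same information would also be available for static checks through the Base type.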
The intermediate step is useful because many conforming types do not actually need an arbitrary-precision words vector, and may not care about the places count. The standard library can provide IntegerLiteral conformances for all the concrete standard library integer types (much like it currently does for _ExpressibleByBuiltinIntegerLiteral, but with less compiler magic), and downstream users can simply piggyback off of those.
More importantly, it preserves source-compatibility with existing ExpressibleByIntegerLiteral implementations.
The intermediate step can also improve performance when crossing module boundaries. A third-party library might only need an IntegerLiteral type of Int, and it is much more efficient for the compiler to pass the library an Int value generated from a standard library Int:IntegerLiteral implementation that it knows how to constant-fold. This also has the upside of making third-party libraries less dependent on inlining.
This proposal is designed to be backwards-compatible with all existing ExpressibleByIntegerLiteral implementations.
Replacing the _ExpressibleByBuiltinIntegerLiteral associated type constraint with IntegerLiteral will break ABI. However, only standard library types can conform to _ExpressibleByBuiltinIntegerLiteral, so the ABI impact should be limited.
This proposal is unlikely to harm or improve standard library binary resilience. The IntegerLiteral abstraction layer minimizes overhead from bigint traffic, which will make third-party libraries less dependent on inlining. This will improve the binary resilience of the Swift ecosystem in the long run.
The proposed changes to ExpressibleByIntegerLiteral can be used to implement lossless decimal literals. The ExpressibleByFloatLiteral.FloatLiteralType associated type can then be replaced with conformances to DecimalLiteral (analogous to IntegerLiteral) on the concrete types Float, Float80, and Double, which models their relationship to FloatLiteralType much better than the _ExpressibleByBuiltinFloatLiteral protocol.
Decoupling literals from bootstrap values will make it straightforward for us to enable implementing ExpressibleByStringLiteral and ExpressibleByExtendedGraphemeClusterLiteral initializers that operate on raw UTF-8 data. This will increase the amount of constant-folding the compiler is able to perform, since the raw UTF-8 data is known to the caller at compile time.
This proposal is an alternative to StaticBigInt, pitched here.