nrc/ints.md

      
    Integer types

This is meant to be a summary of the current status and the questions around
integer fallback and type names. It's meant to be an objective overview so we
can discuss at the work week and have the same starting point. (Obviously I have
an opinion here, so it might not be totally objective, but I'll try.)

Current status

We have two 'current' states: the current implementation, and the current
'spec', i.e., the current implementation plus the accepted RFC.

The current implementation is that we have fixed-size integer types (u8, u16,
u32, u64, i8, i16, i32, i64) and pointer-sized integer types (int, uint).
Integer literals must have an inferable type or be annotated with a suffix
(u, u8, etc.). If a type cannot be inferred, there is a type error.
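
To make the three ways of pinning down a literal's type concrete, here is a minimal sketch. It assumes a present-day toolchain, where the pointer-sized unsigned type is spelled `usize` rather than `uint`; the function name `literal_sizes` is made up for illustration.

```rust
use std::mem::size_of_val;

/// Returns the size in bytes of three integer literals whose types are
/// pinned down in three different ways.
pub fn literal_sizes() -> (usize, usize, usize) {
    // A suffix on the literal fixes its type directly.
    let by_suffix = 5u8;
    // An annotation on the binding fixes it from the outside.
    let by_annotation: i16 = 5;
    // No suffix or annotation here: the type is inferred from the
    // later use as a u64.
    let by_use = 5;
    let _wide: u64 = by_use;
    (
        size_of_val(&by_suffix),
        size_of_val(&by_annotation),
        size_of_val(&by_use),
    )
}
```

Calling `literal_sizes()` gives `(1, 2, 8)`: in each case the literal `5` takes a different width, and none of the three relies on a fallback.
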
We have previously accepted RFC 212. The gist of that RFC is that if the
compiler cannot infer a precise type for an integer, it will assume int. We
previously had similar behaviour; it was removed because we thought it was not
very important in real-world code and because it introduces some scope for
unexpected overflow errors. However, not having the fallback is extremely
irritating in toy examples and tutorials. (The fallback in RFC 212 is not
exactly the same as the old fallback: we will allow it to interact with
inference, whereas previously it did not.)

We have not implemented RFC 212 because doing so later is backwards compatible
and it is not super-high priority for 1.0.
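
As a rough sketch of what 'fallback that interacts with inference' means in practice, consider the two functions below. This assumes a compiler that implements some fallback (without one, the first function is a type error, as described above); the function names are hypothetical.

```rust
use std::mem::size_of_val;

/// Size in bytes of a literal with no constraint at all. The result is
/// whatever the compiler's fallback type is: the pointer width if the
/// fallback is the pointer-sized integer, 4 if it is i32.
pub fn unconstrained_size() -> usize {
    let x = 1;
    size_of_val(&x)
}

/// Size in bytes of a literal whose only constraint is a later use as u64.
/// Inference wins here, so the fallback never applies.
pub fn constrained_size() -> usize {
    let y = 1;
    let _z: u64 = y;
    size_of_val(&y)
}
```

The point of the second function is that the fallback is a last resort: it only kicks in when inference has found no other constraint on the literal.
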
Proposals/open questions

There are three interrelated questions:

1. Should we fall back to i32 instead of int?
2. Should we rename int and uint?
3. Should we use i32 rather than int in our examples and wherever it doesn't
   matter what type we use (because there will never be overflow)?

We could consider these separately, but I think they are interrelated because
it would be strange (though not impossible) for our 'hard' default (i.e., the
compiler's fallback) and our 'soft' default (what we use in examples, etc.) to
be different. If we do decide to switch our defaults to i32, then it becomes
odd for a pointer-sized integer to remain called int (likewise, uint): it is
the most intuitive name for an integer, yet we would otherwise discourage its
use, and it does not indicate the size of the integer like the other type
names do.
Question 1 is proposed in RFC 452 (rust-lang/rfcs#452) and question 2 in
RFC 464 (rust-lang/rfcs#464). There seems to be broad (but not unanimous)
support from the community for both RFCs (although there are worries that this
is an issue with a 'silent majority' who support the status quo or don't care).
The consensus in the discussion for RFC 464 is for iptr and uptr, though there
are several alternatives (imem, isize, index, offset/addr, and so forth).
An open question is: if we do rename int, should we then keep int as an
alias for i32 to make toy examples more friendly to newcomers?
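
To get a feel for what such an alias would mean for toy code, it can be approximated today with an ordinary type alias. This is only a sketch under that assumption: a built-in alias would live in the compiler or prelude, and the function names here are invented.

```rust
// Hypothetical user-level alias, standing in for a built-in one.
#[allow(non_camel_case_types)]
pub type int = i32;

/// Toy function written against the friendly name.
pub fn double(x: int) -> int {
    x * 2
}

/// Values typed as `int` and as `i32` mix without a cast, because the
/// alias introduces a new name and not a new type.
pub fn add_mixed(a: int, b: i32) -> i32 {
    a + b
}
```

With the alias in scope, `double(21)` is `42` and `add_mixed(1, 2)` is `3`, so tutorial code could say `int` while still getting a fixed 32-bit width.
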
Motivation

There are a few reasons to prefer i32 as a default:

* Overflow safety: it's easier to reason about overflow when the width of the
  integer is known. Using a pointer-sized integer is only appropriate (from the
  point of view of overflow) when you are dealing with pointers, array indices,
  or similar situations (or if you know the integer will never hold a value
  larger than the smallest possible pointer size, but then using a fixed-size
  integer of that smallest size is more efficient).
* On 16-bit platforms, a pointer-sized integer is too small for many cases.
* i32 is generally faster than i64 (particularly an issue for benchmarks).
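
The overflow argument can be sketched as follows. The code assumes a present-day toolchain, where the pointer-sized signed type is spelled `isize` rather than `int`; the function names are made up for illustration.

```rust
/// Adding one to the largest i32 overflows on every platform, so the
/// overflow point of an i32 is known when the code is written.
pub fn i32_overflows_at_max() -> bool {
    i32::MAX.checked_add(1).is_none()
}

/// Whether a given value fits in a pointer-sized signed integer depends
/// on the target: 40_000 fits on 32-bit and 64-bit targets, but would
/// not fit on a target with 16-bit pointers.
pub fn fits_pointer_sized(value: i64) -> bool {
    // Widen both sides to i128 so the comparison itself cannot overflow.
    let v = value as i128;
    v >= isize::MIN as i128 && v <= isize::MAX as i128
}
```

For example, `fits_pointer_sized(1i64 << 40)` is true on a 64-bit target and false on a 32-bit one, which is exactly the kind of platform-dependent answer a fixed-size default avoids.
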

Does it matter?

I think the key question is not in fact what is right, but whether it matters
at all (our conclusion from the last work week was pretty much 'no'). Since use
of the fallback is rare in real software, the effect of changing the fallback
is very small. I see two counterarguments: pedagogy and benchmarks. We should
encourage new programmers to do the 'right thing' around integers and overflow,
and having a fixed-size default integer encourages this (note that I am not
making the direct argument that changing the integer fallback will lead to safer
Rust programs, only that it will, to some minor extent, encourage safer
behaviour). The second worry is that we could lose benchmark points: if someone
naively implements a benchmark in Rust without using type annotations and runs
it on a 64-bit platform, they will get the slower 64-bit integer instead of a
32-bit one. Given the difficulty in writing even a small benchmark program with
no integer type annotations, I'm not sure how realistic this worry is.
Data

This is the kind of question some data could really help with. The best idea I
can come up with is to implement the fallback to both int and i32 and
instrument the compiler to count the number of annotations we could elide either
way. I'm not sure it's easy to do the instrumentation, however. And the
implementation work itself is non-trivial. And this is only really useful data
if we assume that we are using the 'correct' annotations, which, arguably, we
are not.
Conclusion and recommendation

I believe that the right thing to do is to fall back to i32 (mainly for
overflow safety). I had previously supported this change because it seemed like
a very minor change to an accepted but unimplemented RFC. However, if we make
this change, it seems that we must also change the naming of integer types and
our 'soft' default, and then it becomes a lot of work. Given that the practical
effects of the change are small, that the changes are backwards incompatible
(i.e., must happen before 1.0) and the work required is significant, and that
collecting really informative data will be difficult, I think the best thing to
do is to stick with the status quo (which makes me a little sad, but such is
life).
If we agree, then we should write this up properly and explain it well to the
community, in particular on the RFCs.