antifuchs/nonzero_uint_literals.md

## nonzero_uint_literals.md

      
          
Feature Name: nonzero_uint_literals
Start Date: TODO
RFC PR:
Rust Issue:

Summary

Add an extension to the INTEGER_LITERAL syntax that allows users to
specify literals as non-zero unsigned integers. We introduce a new
INTEGER_SUFFIX that starts with n to indicate non-zero
literals. These literals get checked at compile-time to ensure they
are not zero.
Motivation

Currently, safe code that calls APIs with non-zero parameters has to
convert a given number to its non-zero equivalent (e.g. u8 to
NonZeroU8) by using the ::new() method, which returns an
Option. This is useful for user-provided numbers, but it is
verbose and unwieldy when the arguments are literals written in
program text, like so:
divide_by(dividend, NonZeroU32::new(20).expect("can't happen"))
Or, since the literal is certainly non-zero, a programmer might use
the following unsafe block, which would raise alarm bells with people
auditing, and would constrain anyone making changes to the code block
in the future (and is only slightly less verbose):
divide_by(dividend, unsafe { NonZeroU32::new_unchecked(20) })
Both options push a safety check that should happen at
compile time (i.e., that the programmer didn't accidentally provide a
literal that is zero) to run time instead: the former by inducing a
panic, the latter by unsafely proceeding with a zero value, which is
undefined behavior.
More than that, these options treat non-zero unsigned integer literals
as something "other" when their usage in programs could be quite
straightforward instead.
Guide-level explanation

When using APIs that specify non-zero unsigned integer types, code
passing integer literals (like 20) to these APIs needs to assert
that those literals are not zero.
This can be achieved in safe code by converting a u32 integer to a
NonZeroU32 integer. Given the definition for a division function for
unsigned integers that can never divide by zero:
fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}
We'd call the function like so:
divide_by(100, NonZeroU32::new(20).expect("inconceivable!"))

However, this long-form conversion performs the check that 20 is not
zero at run time, so if somebody should accidentally delete the 2
while editing, the program will still compile and panic at run time.
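
To make the run-time nature of this check concrete, here is a minimal sketch that uses only today's standard library. divide_by is the function defined above; divisor_from is a hypothetical helper name introduced purely for illustration.

```rust
use std::num::NonZeroU32;

/// Division that can never divide by zero (same definition as above).
fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}

/// Hypothetical helper: the long-form conversion without the `expect`.
/// The zero check runs when this function is called, not when the
/// program is compiled.
fn divisor_from(raw: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(raw)
}
```

With the literal intact, divisor_from(20) is Some and the division yields 5. With the 2 deleted, divisor_from(0) is None, so the expect in the long form would panic, but only once that line is actually executed.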
The easier (and shorter) way to make this assertion is to use the
n32 suffix on the integer literal:
divide_by(100, 20n32)
It is a compile-time error to use the literal zero with an n suffix,
so no edits or slips of the finger will result in an accidentally
compiling program that errors at run time.
Reference-level explanation

To implement this RFC, the following additions shall be made:

Amend the definition of INTEGER_SUFFIX to allow the following 6 suffixes: n8, n16, n32, n64, n128, nsize. These correspond to core::num::NonZeroU8 (and up) literals.
Make it a parse error to use 0 with any n suffix listed above.
Adjust the documentation and examples of NonZeroU* types to indicate the short form as the preferred way of specifying them in program text.

No change is made to the handling of type inference (see open questions), so non-suffixed integer literals are handled in the same way as before.
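
The suffix rule above can be sketched as a small stand-alone check. This is an illustration under stated assumptions, not compiler code: the names NonZeroSuffix, LiteralError and check_nonzero_literal are hypothetical, and the sketch handles only decimal literals (with `_` separators), leaving out hexadecimal, octal and binary forms as well as any range check against the target type.

```rust
/// The six suffixes proposed above (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NonZeroSuffix {
    N8,
    N16,
    N32,
    N64,
    N128,
    NSize,
}

/// Reasons this sketch rejects a literal (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum LiteralError {
    /// The text does not end in one of the six `n` suffixes.
    UnknownSuffix,
    /// The part before the suffix is not a decimal number.
    InvalidDigits,
    /// The value is zero, which the RFC makes an error.
    Zero,
}

/// Checks a decimal literal such as `20n32` against the proposed rule.
fn check_nonzero_literal(text: &str) -> Result<NonZeroSuffix, LiteralError> {
    // Decimal digits never contain `n`, so the last `n` starts the suffix.
    let split = text.rfind('n').ok_or(LiteralError::UnknownSuffix)?;
    let (digits, suffix_text) = text.split_at(split);

    let suffix = match suffix_text {
        "n8" => NonZeroSuffix::N8,
        "n16" => NonZeroSuffix::N16,
        "n32" => NonZeroSuffix::N32,
        "n64" => NonZeroSuffix::N64,
        "n128" => NonZeroSuffix::N128,
        "nsize" => NonZeroSuffix::NSize,
        _ => return Err(LiteralError::UnknownSuffix),
    };

    // The literal must start with a digit and may use `_` as a separator.
    let well_formed = digits.starts_with(|c: char| c.is_ascii_digit())
        && digits.chars().all(|c| c.is_ascii_digit() || c == '_');
    if !well_formed {
        return Err(LiteralError::InvalidDigits);
    }

    // A decimal literal is zero exactly when every digit is `0`.
    if digits.chars().filter(|c| *c != '_').all(|c| c == '0') {
        return Err(LiteralError::Zero);
    }

    Ok(suffix)
}
```

In this sketch, check_nonzero_literal("20n32") is accepted, while "0n32" and "0_0n8" are both rejected as zero. Comparing digits rather than parsing a number keeps the zero check independent of the width of the target type.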
Drawbacks

This adds more syntax to a language that is already quite rich in
syntax, for a relatively niche feature (few, if any, other languages
seem to have non-zero types built in).
Rationale and alternatives

Rationale

The current implementation of NonZeroU* types focuses mostly on the
machine optimization aspect, but doesn't help programmers write more
correct programs as much as it could: programmers who pass literals
rather than user-provided numbers might expect Rust's non-zero
types to protect them from typos at compile time, but they don't.
Making the form that is the easiest to type be checked at compile
time, and referring to it in examples throughout should lead users to
the "golden path" where their programs can use non-zero integers
fearlessly.
Another benefit is that it allows users of APIs with non-zero types
to think about those arguments in a more natural way, e.g. "the number 4
(which happens to be represented as a u32 that disallows zero)",
rather than "the non-zero u32 4, which will error at run time if it
should be zero (??!)"
Pros:

checked at compile time.
nice and short.

Alternatives

Several alternative ideas and paths exist. I don't know if I'm
enumerating all of them yet:
Only do the compile-time check for ::new and ::new_unchecked

The most annoying problem that arises from having these literals be
checked at run time is that by altering code in a bad way
(e.g. deleting a 2 from the expression 20), a programmer can make
safe Rust code panic, or make unsafe code exhibit undefined
behavior. Having a compile-time warning (or clippy lint) to
ensure that literals passed to ::new_unchecked and ::new are never
zero would make this problem very visible when the programmer is
around to fix it.
Pros:

checked at compile time.

Cons:

still very verbose

Do nothing, delegate to macros as in the nonzero_ext crate

The nonzero_ext crate provides a macro which checks that literals are
not zero, and converts them into the corresponding non-zero type from
the (explicitly specified) input type:
divide_by(dividend, nonzero!(20u32))
Pros:

nothing to do - it's user code.
can check against zero at compile/macro-expansion time.
can safely use ::new_unchecked - no runtime check, even in debug builds.

Cons:

error messages for compile-time assertions in macros are really ugly.
they're macros, so they extend the language in a way that is not familiar to all readers, and by means whose testing and validation are unclear.
still a bit verbose, and forces the reader to reason through what input type will yield what output type.
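
To show what the macro approach looks like in practice, here is a minimal sketch of a compile-time-checked macro. It is not the nonzero_ext implementation: the macro name nonzero_u32! is hypothetical, it supports only u32, and it assumes a compiler recent enough to allow panic! during constant evaluation.

```rust
use std::num::NonZeroU32;

/// Hypothetical macro for this sketch; not taken from `nonzero_ext`.
macro_rules! nonzero_u32 {
    ($value:expr) => {{
        // A `const` item is evaluated at compile time, so a zero
        // argument turns the `panic!` below into a compilation error.
        const CHECKED: ::std::num::NonZeroU32 =
            match ::std::num::NonZeroU32::new($value) {
                Some(n) => n,
                None => panic!("value passed to nonzero_u32! is zero"),
            };
        CHECKED
    }};
}

/// Division that can never divide by zero (same definition as above).
fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}
```

A call such as divide_by(100, nonzero_u32!(20)) compiles and needs no run-time check. Writing nonzero_u32!(0) instead is rejected at compile time, but the diagnostic is a constant-evaluation failure reported from inside the macro expansion, which illustrates the first drawback in the list of cons.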

Do nothing at all

Pros:

guaranteed to contain no unfamiliar code for readers who only know the stdlib/language spec

Cons:

super verbose
no compile-time checks.

Prior art

I am not aware of much prior art for non-zero positive integers in
other languages; so far, Rust seems to be the place where
they are best supported, which raises the risk that making them
un-ergonomic to use relegates this type to a niche existence.
Unresolved questions

Better to split into two RFCs?

This RFC co-mingles two things: The compile-time safety of non-zero
literals, and the ergonomics aspect. Rust should have a lint or
warning for passing literal zero to the nonzero type constructors, so
maybe this should be a separate RFC, maybe a precursor to this one?
Should this RFC also talk about type inference?

Since integer literals without a type suffix (search down for
unsuffixed integer literal) are specified to have
the type determined by type inference, it's conceivable that the
compiler should also perform this type inference for non-zero types:
e.g., passing the literal 4 to a function that takes
core::num::NonZeroUsize should have inference determine that the
literal value fulfills the requirements of the type and is of that
type.
Cons:

seems more complex to implement, though I don't know how much work would be required
elevates the core::num types to a level they didn't have before - this might be unwanted?

Future possibilities

My hope is that this proposal will make it more attractive to use
nonzero types in more places that don't yet use them, making it easier
to correctly and safely use interfaces with less boilerplate.
Some future steps:

Type inference that understands non-zero uint types could allow users to write even more compact, provably correct code.
Allowing Rust-internal APIs to use NonZeroU* types as arguments where they make sense, with no loss of ergonomics.