@antifuchs
Created December 7, 2018 16:50
  • Feature Name: nonzero_uint_literals
  • Start Date: TODO
  • RFC PR:
  • Rust Issue:

Summary

Add an extension to the INTEGER_LITERAL syntax that allows users to specify literals as non-zero unsigned integers. We introduce a new INTEGER_SUFFIX that starts with n to indicate non-zero literals. These literals get checked at compile-time to ensure they are not zero.

Motivation

Currently, safe code that calls an API with non-zero parameters has to convert each number to its non-zero equivalent (e.g. u8 to NonZeroU8) using the ::new() method, which returns an Option. This is useful for user-provided numbers, but it is verbose and unwieldy when the API is called with literals written in program text, like so:

divide_by(dividend, NonZeroU32::new(20).expect("can't happen"))

Or, since the literal is certainly non-zero, a programmer might use the following unsafe block, which would raise alarm bells with people auditing, and would constrain anyone making changes to the code block in the future (and is only slightly less verbose):

divide_by(dividend, unsafe { NonZeroU32::new_unchecked(20) })

These options push a safety check that should happen at compile time (i.e., that the programmer didn't accidentally provide a literal that is zero) to run time instead: the former by inducing a panic, the latter by unsafely proceeding with a zero value.

More than that, these options treat non-zero unsigned integer literals as something "other" when their usage in programs could be quite straightforward instead.

Guide-level explanation

When using APIs that specify non-zero unsigned integer types, code passing integer literals (like 20) to these APIs needs to assert that those literals are not zero.

This can be achieved in safe code by converting a u32 integer to a NonZeroU32 integer. Given the definition for a division function for unsigned integers that can never divide by zero:

use std::num::NonZeroU32;

fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}

We'd call the function like so:

divide_by(100, NonZeroU32::new(20).expect("inconceivable!"))

However, this long-form conversion performs the check that 20 is not zero at run time, so if somebody should accidentally delete the 2 while editing, the program will still compile and panic at run time.
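The run-time nature of the long-form conversion can be seen in a small sketch that uses only the standard library. The helper `checked_divide` is a hypothetical name introduced here for illustration; it is not part of the RFC.

```rust
use std::num::NonZeroU32;

fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}

/// Hypothetical helper: performs the same conversion as the long form,
/// but surfaces the failure as `None` instead of panicking.
fn checked_divide(dividend: u32, raw_divisor: u32) -> Option<u32> {
    // `NonZeroU32::new` returns `None` for zero. The check runs when the
    // program executes, so a zero literal here still compiles.
    let divisor = NonZeroU32::new(raw_divisor)?;
    Some(divide_by(dividend, divisor))
}
```

Calling `checked_divide(100, 0)` compiles without complaint and only reports the problem once the code runs, which is the gap the proposed suffix is meant to close.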

The easier (and shorter) way to make this assertion is to use the n32 suffix on the integer literal:

divide_by(100, 20n32)

It is a compile-time error to use the literal zero with an n suffix, so no edits or slips of the finger will result in an accidentally compiling program that errors at run time.
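The `n32` suffix is proposed syntax and does not compile today. As a point of comparison, the sketch below gets a similar compile-time guarantee from a `const` item, assuming a toolchain newer than this draft in which `NonZeroU32::new` is a `const fn` and `panic!` is permitted during constant evaluation. The name `DIVISOR` is illustrative.

```rust
use std::num::NonZeroU32;

// Evaluated during compilation: changing 20 to 0 makes the build fail
// with the panic message instead of producing a program that panics later.
const DIVISOR: NonZeroU32 = match NonZeroU32::new(20) {
    Some(n) => n,
    None => panic!("divisor literal must not be zero"),
};

fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}
```

This is noticeably longer than `20n32`, which is the ergonomic argument the RFC makes.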

Reference-level explanation

To implement this RFC, the following additions shall be made:

  • Amend the definition of INTEGER_SUFFIX to allow the following 6 suffixes: n8, n16, n32, n64, n128, nsize. These correspond to core::num::NonZeroU8 (and up) literals.
  • Make it a parse error to use 0 with any n suffix listed above.
  • Adjust the documentation and examples of NonZeroU* types to indicate the short form as the preferred way of specifying them in program text.
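The first two points can be sketched as a standalone validation function. This is a hypothetical illustration, not rustc code: the name `check_nonzero_literal` is made up, only decimal literals are handled, and the per-type range check (e.g. rejecting 300n8) is left out.

```rust
/// The six suffixes listed in the RFC text.
const NONZERO_SUFFIXES: [&str; 6] = ["n8", "n16", "n32", "n64", "n128", "nsize"];

/// Hypothetical check: splits a decimal literal such as "20n32" into its
/// value and suffix, and rejects a zero value.
fn check_nonzero_literal(literal: &str) -> Result<(u128, &str), String> {
    // In a decimal literal the first 'n' can only be the start of the suffix.
    let split = literal
        .find('n')
        .ok_or_else(|| format!("`{}` has no n suffix", literal))?;
    let (digits, suffix) = literal.split_at(split);
    if !NONZERO_SUFFIXES.contains(&suffix) {
        return Err(format!("unknown suffix `{}`", suffix));
    }
    // Rust integer literals may contain `_` separators; drop them before parsing.
    let value: u128 = digits
        .replace('_', "")
        .parse()
        .map_err(|_| format!("invalid digits `{}`", digits))?;
    if value == 0 {
        return Err(format!("literal `{}` must not be zero", literal));
    }
    Ok((value, suffix))
}
```

With this sketch, "20n32" is accepted and "0n32" is rejected, mirroring the error the RFC asks the compiler to produce.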

No change is made to the handling of type inference (see open questions), so non-suffixed integer literals are handled in the same way as before.

Drawbacks

This adds more syntax to a language that is already quite rich in syntax, for a relatively niche feature (few, if any, other languages seem to have non-zero types built in).

Rationale and alternatives

Rationale

The current implementation of NonZeroU* types focuses mostly on the machine optimization aspect, but doesn't help programmers write more correct programs as much as it hopes: Unless they deal with user-provided numbers, they might be led to expect that Rust's nonzero types protect them from typos at compile time, but they don't.

Making the form that is the easiest to type be checked at compile time, and referring to it in examples throughout should lead users to the "golden path" where their programs can use non-zero integers fearlessly.

Another benefit is that it allows users of APIs with non-zero types to think about those arguments in a more natural way, e.g. "the number 4 (which happens to be represented as a u32 that disallows zero)", rather than "the non-zero u32 4, which will error at run time if it should be zero (??!)"

Pros:

  • Checked at compile time.
  • nice and short

Alternatives

Several alternative ideas and paths exist. I don't know if I'm enumerating all of them yet:

Only do the compile-time check for ::new and ::new_unchecked

The most annoying problem that arises from having these literals be checked at run time is that by altering code in a bad way (e.g. deleting the 2 from the literal 20), a programmer can make safe Rust code panic, or make unsafe code exhibit undefined behavior. Having a compile-time warning (or clippy lint) to ensure that literals passed to ::new_unchecked and ::new are never zero would make this problem very visible while the programmer is around to fix it.

Pros:

  • checked at compile time.

Cons:

  • still very verbose

Do nothing, delegate to macros as in the nonzero_ext crate

The nonzero_ext crate provides a macro which checks that literals are not zero, and converts them into the corresponding non-zero type from the (explicitly specified) input type:

divide_by(dividend, nonzero!(20u32))

Pros:

  • nothing to do - it's user code.
  • can check against zero at compile/macro-expansion time.
  • can safely use ::new_unchecked - no runtime check, even in debug builds.

Cons:

  • error messages for compile-time assertions in macros are really ugly
  • they're macros, so they modify the language in a way not familiar to all readers, by means that are not clearly tested or validated.
  • still a bit verbose, and forces the reader to reason through what input type will yield what output type.
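
How a macro can move the zero check to compile time can be sketched as follows. This is a hypothetical `nonzero_u32!` macro written for illustration; it is not the implementation used by the nonzero_ext crate, and it assumes a toolchain in which `NonZeroU32::new` is a `const fn` and `panic!` is allowed during constant evaluation.

```rust
use std::num::NonZeroU32;

/// Hypothetical macro: wraps the value in a `const` item so that the zero
/// check happens during compilation rather than when the program runs.
macro_rules! nonzero_u32 {
    ($value:expr) => {{
        const CHECKED: ::std::num::NonZeroU32 =
            match ::std::num::NonZeroU32::new($value) {
                Some(n) => n,
                // Reached only for a zero argument, which aborts the build.
                None => panic!("literal must not be zero"),
            };
        CHECKED
    }};
}

fn divide_by(dividend: u32, divisor: NonZeroU32) -> u32 {
    dividend / divisor.get()
}

/// Illustrative call site.
fn five() -> u32 {
    divide_by(100, nonzero_u32!(20))
}
```

Writing `nonzero_u32!(0)` fails to compile, though the resulting error points into the macro expansion, which is the readability concern listed above.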

Do nothing at all

Pros:

  • guaranteed to contain no unfamiliar code for readers who only know the stdlib/language spec

Cons:

  • super verbose
  • no compile-time checks.

Prior art

I am not aware of much prior art in the space of non-zero positive integers across languages; so far, Rust seems to be the place where they are best supported, which raises the risk that making them un-ergonomic to use relegates this type to a niche existence.

Unresolved questions

Better to split into two RFCs?

This RFC commingles two things: the compile-time safety of non-zero literals, and the ergonomics aspect. Rust should have a lint or warning for passing a literal zero to the non-zero type constructors, so maybe that should be a separate RFC, perhaps a precursor to this one?

Should this RFC also talk about type inference?

Since integer literals without a type suffix (search down for unsuffixed integer literal) are specified to have their type determined by type inference, it's conceivable that the compiler should also perform this type inference for non-zero types: e.g., when the literal 4 is passed to a function that takes core::num::NonZeroUsize, inference would determine that the literal value fulfills the requirements of the type and is of that type.

Cons:

  • seems more complex to implement, though I don't know how much work would be required
  • elevates the core::num types to a level they didn't have before - this might be unwanted?

Future possibilities

My hope is that this proposal will make it more attractive to use nonzero types in more places that don't yet use them, making it easier to correctly and safely use interfaces with less boilerplate.

Some future steps:

  • Type inference that understands non-zero uint types could allow users to write even more compact, provably correct code.
  • Allowing Rust-internal APIs to use NonZeroU* types as arguments where they make sense, with no loss of ergonomics.
@iFreilicht

I like the basic goal of making this kind of thing more ergonomic, but I'd like to propose a very different solution:

Allow implementing types to be initialized by literals in general. The compiler can already do bounds-checking. For example, if you try to create an integer that is too big:

error: literal out of range for `u8`
  --> src/lib.rs:10:18
   |
1  | const asdf: u8 = 29999;
   |                  ^^^^^
   |
   = note: `#[deny(overflowing_literals)]` on by default
   = note: the literal `29999` does not fit into the type `u8` whose range is `0..=255`

The compiler knows the range of u8 and of all other integral types. If we had some trait that allows specifying this range for any type (similar to num_traits::Bounded), then the compiler could simply check whether the literal is within the range of the type, and allow the assignment if so.

The resulting syntax for your example function call would be:

divide_by(100, 20)

This vastly improves ergonomics not only for constructors of types with NonZero fields and for NonZero constants, but also for other crates that provide specialized numeric types, such as fixed.
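
The range-trait idea could look roughly like the sketch below. The trait `LiteralRange` and the function `literal_fits` are hypothetical names invented for this illustration; neither exists in the standard library or in num_traits, and the sketch covers unsigned values only.

```rust
use std::num::NonZeroU32;

/// Hypothetical trait: the inclusive range of literal values a type accepts.
trait LiteralRange {
    const MIN: u128;
    const MAX: u128;
}

impl LiteralRange for u8 {
    const MIN: u128 = u8::MIN as u128;
    const MAX: u128 = u8::MAX as u128;
}

impl LiteralRange for NonZeroU32 {
    // Zero is excluded, so the accepted range starts at 1.
    const MIN: u128 = 1;
    const MAX: u128 = u32::MAX as u128;
}

/// The check a compiler could run on an unsuffixed literal once inference
/// has settled on the target type.
fn literal_fits<T: LiteralRange>(literal: u128) -> bool {
    T::MIN <= literal && literal <= T::MAX
}
```

Here `literal_fits::<u8>(29999)` is false, matching the overflowing_literals error shown above, and `literal_fits::<NonZeroU32>(0)` is false as well, which is the case a plain `divide_by(100, 0)` call would need to reject.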

I just realized this isn't even a submitted RFC yet, but I put too much effort into this comment to delete it now 😄
