seanmiddleditch/cplusplus-newtype.md

## cplusplus-newtype.md

      
Meta-Proposal for C++ new-types ("opaque typedefs")

Motivation

Use cases for "opaque typedefs" or new-types come up frequently in the C++ community, such as:

Numeric types with units
Flags or unique IDs that eschew arithmetic operations
Type-system-enforced wrappers of types (e.g. to differentiate user-input strings from sanitized strings)
More

Status Quo

Wrapper boilerplate

Today, writing a wrapper around a type typically requires lots of boilerplate. Consider a wrapper around an integer which removes bitwise operations but still enables arithmetic operations:
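#include <compare>  // std::strong_ordering

class wrapper {
public:
  constexpr explicit wrapper(int val) noexcept : value(val) {}
  
  // The constructor is explicit, so each result names the type it constructs.
  constexpr wrapper operator+(wrapper rhs) const noexcept { return wrapper{value + rhs.value}; }
  constexpr wrapper operator-(wrapper rhs) const noexcept { return wrapper{value - rhs.value}; }
  constexpr wrapper operator*(wrapper rhs) const noexcept { return wrapper{value * rhs.value}; }
  
  // A defaulted member comparison takes its operand by const reference.
  constexpr std::strong_ordering operator<=>(const wrapper&) const noexcept = default;
  
  // Explicit, so int's bitwise operators are not reachable through an implicit conversion.
  constexpr explicit operator int() const noexcept { return value; }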
  
  // ... many more operators, friend functions for ostreams/hash, etc.

private:
  int value;
};
Opaque IDs

Sometimes enum classes are used as new-types of integers that shouldn't support any arithmetic operations, such as unique IDs or indices:
enum class my_id : unsigned long long { };
Note that this is often used in applications that care about minimizing code for build-times, as the enum class will result in decent codegen on all ABIs and build modes (compared to wrapper structs and their necessity for user-defined constructors and similar operations).
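A minimal sketch of how such an opaque ID behaves in practice, assuming C++17. The helpers `make_id` and `raw_value` and the detection trait `has_plus` are hypothetical names introduced only for illustration; they are not part of the proposal.

```cpp
#include <type_traits>
#include <utility>

enum class my_id : unsigned long long {};

// Hypothetical helper: wrap a raw value in the opaque ID type.
constexpr my_id make_id(unsigned long long raw) noexcept { return my_id{raw}; }

// Hypothetical helper: unwrap an ID back to its underlying value.
constexpr unsigned long long raw_value(my_id id) noexcept {
  return static_cast<std::underlying_type_t<my_id>>(id);
}

// Detection idiom: true only when `T + T` is a valid expression.
template <typename T, typename = void>
struct has_plus : std::false_type {};

template <typename T>
struct has_plus<T, std::void_t<decltype(std::declval<T>() + std::declval<T>())>>
    : std::true_type {};
```

Comparisons still work on the enum, while `has_plus<my_id>::value` is `false` because scoped enumerations have no built-in arithmetic.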
Flags

When an enum class is used as a flags type, all the bitwise operations must be reintroduced, and they must be declared in the enclosing scope (enums allow neither hidden friends nor member functions), which hurts diagnostics and build times:
enum class flags : unsigned char {
  none = 0,
  destroyed = 1,
  delayed = 2,
  locked = 4,
  active = 8
};

// Requires <utility> for std::to_underlying (C++23). The operands promote to int,
// so the result is converted back with static_cast rather than a narrowing braced init.
constexpr flags operator|(flags l, flags r) noexcept { return static_cast<flags>(std::to_underlying(l) | std::to_underlying(r)); }
constexpr flags operator&(flags l, flags r) noexcept { return static_cast<flags>(std::to_underlying(l) & std::to_underlying(r)); }
constexpr flags& operator|=(flags& l, flags r) noexcept { return l = l | r; }
constexpr flags& operator&=(flags& l, flags r) noexcept { return l = l & r; }
/* ... many more */
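As a usage sketch for the flags pattern above, assuming C++17: the `underlying` helper is a local stand-in for `std::to_underlying` (which needs C++23), and `has_any` is a hypothetical helper added only to show how the free operators get used.

```cpp
#include <type_traits>

enum class flags : unsigned char {
  none = 0,
  destroyed = 1,
  delayed = 2,
  locked = 4,
  active = 8
};

// Local stand-in for std::to_underlying (C++23).
template <typename E>
constexpr std::underlying_type_t<E> underlying(E e) noexcept {
  return static_cast<std::underlying_type_t<E>>(e);
}

// Free functions in the enclosing scope: the only place an enum's operators can live.
constexpr flags operator|(flags l, flags r) noexcept {
  return static_cast<flags>(underlying(l) | underlying(r));
}
constexpr flags operator&(flags l, flags r) noexcept {
  return static_cast<flags>(underlying(l) & underlying(r));
}

// Hypothetical helper: true if `value` has at least one bit of `mask` set.
constexpr bool has_any(flags value, flags mask) noexcept {
  return (value & mask) != flags::none;
}
```

Every flags enum in a codebase needs its own copy of these operators (or a macro or template that stamps them out), which is the boilerplate a new-type would avoid.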
New-Types

A new-type solves the issues of wrappers by allowing a new, independent type to be declared as a transparent wrapper around another type (either built-in or user-provided!), with explicit control over which of the wrapped type's operations are made available on the wrapper.
Proposal

Add language support for wrapping a type in a new-type while providing explicit control over the available operations with minimal boilerplate.
Syntax Bikeshed

To declare a new-type, simply declare a new class-type that inherits from the wrapped type. This is a language change only in that it would now be legal to "inherit" from built-in types.
A new-type is any class-type that (a) has exactly one base-type and (b) has no non-static data members of its own.
class wrapper : int {};
Operations and functions are exposed (forwarded) from the base type by declaring them = default, with the wrapper type used in place of the base-type:
class wrapper : int {
  friend auto operator+(wrapper, wrapper) = default;
  friend auto operator-(wrapper, wrapper) = default;
  friend auto operator*(wrapper, wrapper) = default;
};
The above is semantically equivalent to defining the operators as:
  constexpr auto operator+(wrapper l, wrapper r) noexcept { return wrapper{int{l} + int{r}}; }
Implementations would be strongly encouraged to avoid emitting or calling actual function definitions for built-in operators exposed this way and to instead emit equivalent machine-code/IR as if the built-in base-type had been used directly.
In other words, the difference in behavior should be mostly semantic and in the front-end, not the back-end, to the extent reasonable for a quality implementation.
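Since inheriting from `int` is not legal today, the intended behavior can only be approximated. The sketch below, assuming C++17, hand-writes what the three defaulted friends would generate; the `value_` member, the extra `operator==` (added only so the results can be checked), and the `has_bit_or` detection trait are all illustrative assumptions rather than part of the proposal.

```cpp
#include <type_traits>
#include <utility>

class wrapper {
public:
  constexpr explicit wrapper(int v) noexcept : value_(v) {}

  // Hand-written stand-ins for `friend auto operator+(wrapper, wrapper) = default;` etc.
  friend constexpr wrapper operator+(wrapper l, wrapper r) noexcept { return wrapper{l.value_ + r.value_}; }
  friend constexpr wrapper operator-(wrapper l, wrapper r) noexcept { return wrapper{l.value_ - r.value_}; }
  friend constexpr wrapper operator*(wrapper l, wrapper r) noexcept { return wrapper{l.value_ * r.value_}; }

  // Not in the proposal's example; present only to make results comparable.
  friend constexpr bool operator==(wrapper l, wrapper r) noexcept { return l.value_ == r.value_; }

private:
  int value_;  // stands in for the proposed `int` base
};

// Detection idiom: true only when `T | T` is a valid expression.
template <typename T, typename = void>
struct has_bit_or : std::false_type {};

template <typename T>
struct has_bit_or<T, std::void_t<decltype(std::declval<T>() | std::declval<T>())>>
    : std::true_type {};
```

Only the operators that were explicitly forwarded exist on `wrapper`: arithmetic works, while `has_bit_or<wrapper>::value` is `false` even though the underlying `int` supports `|`.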
Semantics

A class-type is considered a new-type if:

It has a single base-type
Its definition contains no non-static data members
It has no non-override virtual member functions

A new-type MAY include:

Static data-members
Member functions, both static and non-static, with defaulted or non-defaulted implementations
Virtual member functions that override one provided by its base-type
Nested type declarations

Only new-types may:

Use an integral, floating-point, boolean, pointer, or enumeration type as a base-type
Default operator member functions or friend functions that are provided by its base-type

Considerations

Type Traits

Consider the is_integral type trait and a new-type like struct mytype : int {};.
Should is_integral<mytype> be true or false?
As specified today, is_integral looks for a specific closed set of implementation-provided integral types. The trait specification would need to be extended to include new-types of integral types, if that behavior were desired.
However, I think we do not want that behavior. is_integral implies that the target type meets the full interface of integral types. However, arbitrary new-types of integral types will not meet that interface, as not all operators are guaranteed to be exposed, or various overloads on core integral types would not include the new-type.
A better option, in my opinion, is to offer only an is_newtype trait and a newtype_base_t trait, which could be combined with is_integral in cases where a new-type of an integral type is the desired behavior.
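A sketch of how such traits might compose, assuming C++17. Real `is_newtype` and `newtype_base_t` traits would need compiler support; here they are emulated with a hypothetical opt-in specialization of `newtype_base`, and `is_integral_newtype` is an illustrative name for the combined check. The example types hold a member instead of inheriting from a built-in type.

```cpp
#include <type_traits>

// Hypothetical opt-in trait: the primary template has no ::type.
template <typename T> struct newtype_base {};
template <typename T> using newtype_base_t = typename newtype_base<T>::type;

// Emulated is_newtype: true when newtype_base<T>::type exists.
template <typename T, typename = void>
struct is_newtype : std::false_type {};
template <typename T>
struct is_newtype<T, std::void_t<typename newtype_base<T>::type>> : std::true_type {};

// Combined check: T is a new-type whose base is an integral type.
template <typename T, bool = is_newtype<T>::value>
struct is_integral_newtype : std::false_type {};
template <typename T>
struct is_integral_newtype<T, true> : std::is_integral<newtype_base_t<T>> {};

// Stand-ins for `struct mytype : int {};` and `struct temperature : double {};`.
struct mytype { int value; };
struct temperature { double value; };
template <> struct newtype_base<mytype> { using type = int; };
template <> struct newtype_base<temperature> { using type = double; };
```

With this split, `std::is_integral<mytype>` stays `false`, and generic code that specifically wants new-types of integers asks for `is_integral_newtype<mytype>` instead.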
Accessing the Base Type

For non-defaulted functions, some extra support would be desired for accessing the base type for any new-type. Consider:
struct wrapper : int {
  wrapper abs() const noexcept {
    int val = *(int const*)this;
    return wrapper{std::abs(val)};
  }
};
In this particular case, the awkward pointer casts are required to get to the base int representation of the new-type. The casts work here, but become problematic with Deducing This (explicit object parameters, C++23):
struct wrapper : int {
  wrapper abs(this wrapper self) noexcept {
     int val = *(int const*)std::addressof(self);
     return wrapper{std::abs(val)};
  }
};
Note that the pointer tricks are required to avoid getting bitten by conversion operators, e.g. a wrapper::operator int(). This is very similar to how we have addressof to avoid operator& shenanigans.
We already have a similar problem for scoped enumerations, which we solved via underlying_type and the new to_underlying in particular.
A similar approach would be very useful for easily and correctly writing new-type functions. It may be possible to update the underlying machinery to support both enums and new-types, though a better option is likely to introduce some new traits and functions: newtype_base_t and to_newtype_base, with the latter accepting (possibly qualified) references to new-types and returning a (similarly-qualified) reference to the base type representation.
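A rough emulation of `to_newtype_base`, assuming C++17 and a class-type base (`raw_count` stands in for `int`, since built-in bases are not legal today). The opt-in `newtype_base` specialization is a hypothetical substitute for compiler support, and only lvalue and const references are handled in this sketch.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical opt-in trait: the primary template has no ::type.
template <typename T> struct newtype_base {};
template <typename T> using newtype_base_t = typename newtype_base<T>::type;

// Mutable reference in, mutable reference to the base out.
template <typename T>
constexpr newtype_base_t<T>& to_newtype_base(T& value) noexcept {
  return static_cast<newtype_base_t<T>&>(value);
}

// Const reference in, const reference to the base out.
template <typename T>
constexpr const newtype_base_t<T>& to_newtype_base(const T& value) noexcept {
  return static_cast<const newtype_base_t<T>&>(value);
}

struct raw_count { int value; };  // stands in for the built-in base type

struct item_count;                // declared early so the trait can be specialized first
template <> struct newtype_base<item_count> { using type = raw_count; };

struct item_count : raw_count {
  constexpr item_count abs() const noexcept {
    // No pointer casts: the helper yields the base representation directly.
    const raw_count& base = to_newtype_base(*this);
    return item_count{raw_count{base.value < 0 ? -base.value : base.value}};
  }
};
```

The upcast is an ordinary derived-to-base reference conversion, so it never goes through a user-defined conversion operator, which is the property the pointer casts were trying to guarantee.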