@PoignardAzur
Last active November 27, 2023 16:23
About variadics in Rust

This is an analysis of how variadic generics could be added to Rust. It's not a proposal so much as a summary of existing work, and a toolbox for creating an eventual proposal.

Introduction

Variadic generics (aka variadic templates, or variadic tuples) are an often-requested feature that would enable traits, functions and data structures to be generic over a variable number of types.

To give a quick example, a Rust function with variadic generics might look like this:

fn make_tuple_sing<...T: Sing>(t: (...T)) {
    for member in ...t {
        member.sing();
    }
}

let kpop_band = (KPopStar::new(), KPopStar::new());
let rock_band = (RockStar::new(), RockStar::new(), RockStar::new(), RockStar::new());
let mixed_band = (KPopStar::new(), RockStar::new(), KPopStar::new());

make_tuple_sing(kpop_band);
make_tuple_sing(rock_band);
make_tuple_sing(mixed_band);

Note that variadic generics are a broader feature than variadic functions. Many languages implement a feature that lets users call a function with an arbitrary number of parameters; this feature is usually called a variadic function. The extra parameters are either dynamically typed (C, JS, Python) or a shorthand for passing a slice (ex: Java, Go, C#).
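
To illustrate the difference with a small example (mine, not the original document's): the closest thing Rust has today to a Java/Go-style variadic function is an explicit slice parameter, which forces every "extra" argument to have the same type; lifting that restriction is exactly what variadic generics are about.

// Slice-based "variadic" function: all values must share a single type.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

let total = sum(&[1, 2, 3, 4]);
assert_eq!(total, 10);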

As far as I'm aware, there are only two widespread languages implementing variadic generics (C++ and D), which is what this document is about. (Zig has similar features, but it doesn't really have "templates" the way C++ or Rust understand them.)

About post-monomorphization errors

This document sometimes refers to "post-monomorphization errors". If you're not familiar with Rust generics, this term might confuse you.

Post-monomorphization errors are compiler errors in the body of a generic function that aren't triggered by compiling the generic function itself, but by instantiating it with concrete types.

A major strength of Rust compared to C++ or D is that post-monomorphization errors are virtually non-existent. If your generic function compiles when you write it, it will compile when someone else uses it.
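
For example (a minimal illustration of definition-time checking): in today's Rust, a generic function's body is checked against its declared trait bounds when the function is defined, so a missing bound is reported to the function's author rather than to whoever later instantiates it.

use std::fmt::Debug;

// Checked once, at definition time: the `Debug` bound guarantees that `{:?}`
// is valid for every `T` this function will ever be instantiated with.
fn debug_print<T: Debug>(value: T) {
    println!("{:?}", value);
}

// Without the bound, the *definition* itself is rejected; the error does not
// wait for a caller to pick a concrete type.
// fn debug_print_broken<T>(value: T) {
//     println!("{:?}", value); // error[E0277]: `T` doesn't implement `Debug`
// }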

Prior work

RFCs

Other languages

Motivations

Implementing traits for tuples

This has been the focus of much attention, as the use case is a natural extension of the one that motivated const-generics.

Currently, implementing a trait for a tuple is usually done by writing a macro implementing the trait over a given number of fields, then calling the macro multiple times.

This has the same problems that array implementations of traits without const generics had:

  • It requires lots of awkward boilerplate.
  • It's unpleasant to code and maintain; compiler errors are a lot less readable than with regular generics.
  • It only implements the trait up to a fixed number of fields, usually 12. That means that a user with a 13-uple is out of luck.

For instance, right now the implementation of Hash for tuples looks like:

    macro_rules! impl_hash_tuple {
        // ...

        ( $($name:ident)+ ) => (
            #[stable(feature = "rust1", since = "1.0.0")]
            impl<$($name: Hash),+> Hash for ($($name,)+) where last_type!($($name,)+): ?Sized {
                #[allow(non_snake_case)]
                fn hash<S: Hasher>(&self, state: &mut S) {
                    let ($(ref $name,)+) = *self;
                    $($name.hash(state);)+
                }
            }
        );
    }

    // ...

    impl_hash_tuple! { A }
    impl_hash_tuple! { A B }
    impl_hash_tuple! { A B C }
    impl_hash_tuple! { A B C D }
    impl_hash_tuple! { A B C D E }
    impl_hash_tuple! { A B C D E F }
    impl_hash_tuple! { A B C D E F G }
    impl_hash_tuple! { A B C D E F G H }
    impl_hash_tuple! { A B C D E F G H I }
    impl_hash_tuple! { A B C D E F G H I J }
    impl_hash_tuple! { A B C D E F G H I J K }
    impl_hash_tuple! { A B C D E F G H I J K L }

Variadic generics would provide an idiomatic way to implement a trait for arbitrary tuples.

More powerful helper functions

The ecosystem has a few utility methods that operate on pairs of objects, such as Iterator::zip or async-std's Future::join.

There are often use cases where one might want to call these methods to combine more than two objects; currently, the default way to do so is to call the method multiple times and chain the results, eg a.join(b).join(c) which essentially returns Joined(Joined(a, b), c) (kind of like "cons lists" in Lisp).

A common workaround is to instead implement the utility method as a macro, which can take an arbitrary number of parameters, but this isn't always convenient.
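
For a concrete comparison (assuming the futures crate's join! macro; the variadic alternative in the comment is hypothetical pseudocode):

// Today's macro workaround: an arbitrary number of futures, awaited together.
async fn demo() {
    let (a, b, c) = futures::join!(
        async { 1u8 },
        async { "hello" },
        async { 42.0f32 }
    );
    assert_eq!((a, b, c), (1u8, "hello", 42.0));

    // With variadic generics, the same thing could be an ordinary function:
    // let (a, b, c) = future::join(fut_a, fut_b, fut_c).await;
}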

Easier #[derive] macros

I have never seen this use case suggested before, but it seems like an obvious feature to me; it's also the main use case for variadics in D.

Rust has a lot of crates centered around providing #[attribute] and #[derive] macros to enhance your types. These macros often follow a pattern of "list your type's fields, and then for each of the fields, do something similar". For instance:

#[derive(serde::Serialize, serde::Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

The Serialize, Deserialize and Debug macros all follow the same principle of "do something with x, then do something with y", where the "something" in question can be easily defined with traits. For Debug, this is built-in. For Serialize and Deserialize, this is done by generating a string of tokens that compiles to a Serialize/Deserialize implementation for Point.

By contrast, a serialization function in D will look like

void serialize_struct(T)(Writer writer, string name, const T value)
{
    writer.startObject(name);

    static foreach (memberName; __traits(allMembers, T))
    {{
        auto memberValue = __traits(getMember, value, memberName);
        alias PlainMemberT = typeof(cast() memberValue);

        static if (isStruct!PlainMemberT)
        {
            serialize_struct(writer, memberName, memberValue);
        }
        else
        {
            // Serialize leaf types (eg integers, strings, etc)
            // ...
        }
    }}
}

There are a lot of subtle differences between D's semantics and Rust's, which means some concepts can't be trivially ported. I personally think these differences are under-studied, like much of D's generics, but the specific details are outside the scope of this document.

The gist of it is that D's generics are closer to Rust macros than Rust generics. They feel like a scripting language, where entire chunks of code can be disabled or reinterpreted based on template parameters; so naturally post-monomorphization errors are much more frequent.

That said, the enthusiastic adoption of static foreach in D shows that there is a strong demand for an easy-to-use, idiomatic feature to write code that applies to each field of a data structure.

Other use cases

Some variadic generics proposals mention other possible use cases. I think they are less compelling, so I'll cover them very briefly:

  • Variadic functions: Right now this use case is covered by macros. Eg where C has printf, and D has writefln, Rust has the println! macro. With variadic generics we could write a println function, which would be better-integrated with the language (better error messages, maybe faster compile times, etc); though it probably wouldn't replace the println! macro (for instance, it wouldn't be able to check the format string at compile-time).

  • Fn traits: Fn traits currently work with compiler magic, so that they can be called with arbitrary numbers and types of arguments. Proposals mention that implementing variadic tuples would offer more flexibility when working with higher-order functions, though I'm not sure that's still the case.

  • Tuple manipulation: Some proposals mention that tuples could be enhanced to be flattened, concatenated, etc. This would allow more flexible code in some situations, eg:

    let (package_name, ...fields) = get_some_config();
    use_config(package_name, (fields), ...get_other_config());

Design elements

This section is not so much a proposal for variadic templates as a list of questions that need to be answered to implement the feature.

It's not as structured as I'd like. I'll start with what the syntax might look like, what operations we'll want to allow, and branch out from there.

Basic syntax

First and foremost, we want:

  • To express "this template function/type/trait takes a parameter that can represent an arbitrary number of types".
  • To require that each of those types implements a given trait.
  • To declare tuples of variadic types, and pass them around (eg let my_tuple : /* VARIADIC_TYPES */; foobar(my_tuple);)
  • In some cases, to "flatten" tuples, and interpret them as a comma-separated list of values, like the spread operator in JavaScript.

Common suggestions to represent variadic types include Ts..., ...Ts, Ts.., ..Ts, and Ts @ ... The ... spelling is closer to existing C++/D syntax, while .. is closer to existing Rust syntax for "and then a bunch of things"; the Ts @ .. syntax in particular mimics existing subslice patterns.

Finally, we want to concisely express "Execute this code for every member of this variadic tuple". There are two common approaches for this:

  • Split the tuple into a "head" and a "tail" binding (eg let (head, ...tail) = my_tuple), process the head, and recursively call the function on the tail binding.
  • A special for loop which iterates over tuples. Syntax could be for member in ...tuple or for member ..in tuple or something similar.

This document will analyze both approaches later.

Note: I don't really care what the exact syntax is for any of these features. The examples in this document just use an arbitrary syntax, but any other could work. I ask that commenters focus on the features in this document, and avoid bikeshedding at first. Thank you!

Type inference

Let's consider the future::join use-case again:

let a = some_future(1u8);
let b = some_future("hello");
let c = some_future(42.0);
assert_eq!(future::join(a, b, c).await, (1u8, "hello", 42.0));

The join function takes (SomeFuture<u8>, SomeFuture<&str>, SomeFuture<f32>) and returns JoinedFuture<(u8, &str, f32)>. To enable this, we need a way for a function declaration to apply a mapping to the types of a variadic tuple. The declaration of join might look like:

fn join<...Fs: Future>(futures: ...Fs) -> JoinedFuture<...Fs::Output>;

(Here we're adding the ... token before a type-expression that includes a variadic argument, and instantiating the entire type-expression for each member of the variadic; this is similar to how macro_rules! repetitions work)

Conceptually, we are mapping our variadic parameter to a type constructor that looks like <type T where T: Future> => <T as Future>::Output (pseudocode).

The mapping should carry over all the usual rules of type inference. For instance:

fn unwrap_all<...Ts>(options: ...Option<Ts>) -> (...Ts);

let (a, b, c) = unwrap_all(Some(1), Some("hello"), Some(false));

In the above, type inference should understand that unwrap_all expects Options, and match the types of Ts to the option payloads (that is, Ts should be i32, &str, bool).

Multiple mappings

The above examples are fairly simple, with a single mapping. Some use-cases are likely to be more complex. For instance, let's imagine a function that splits a tuple of Eithers into tuples of Lefts and Rights.

fn split<...Lefts, ...Rights>(eithers: ...Either<Lefts, Rights>)
    -> (...Option<Lefts>, ...Option<Rights>);

The syntax used in this example is simplified, and makes some implicit assumptions: that Lefts and Rights variadics have the same tuple size, and that our ... syntax iterates through both of them "simultaneously" (like Iterator::zip).

We can imagine use-cases where these assumptions don't hold; for instance, functions that take multiple independent variadic parameters but don't zip them, or a syntax where ...(As, Bs) with As = i8, i16, Bs = str, bool evaluates to (i8, str), (i8, bool), (i16, str), (i16, bool). It's unclear how often there would be a real-life need for those use-cases, though.

Either way, the language spec should either document these assumptions, or provide a syntax to include them in a declaration, that both the developer and the compiler can reason about, eg:

fn split<...Lefts, ...Rights>(eithers: ...Either<Lefts, Rights>)
  -> (...Option<Lefts>, ...Option<Rights>)
  where Lefts: SAME_TUPLE_SIZE_AS<Rights>;

(though I'm personally not super hot on the above syntax; when working with arrays, we usually don't write fn foobar<A: Array<T>, B: Array<T2>>(...) where A: SameSizeAs<B>;, we use const-generics)

Concatenating variadics

Some users might want to use variadics to concatenate tuples, eg:

fn concat<...As, ...Bs>(a: (...As), b: (...Bs)) -> (...As, ...Bs);

Much of the previous section applies here. It's easy to come up with a syntax that leaves a lot of edge-cases unspecified.

In particular, if users are allowed to use multiple variadics in a single type list, this may make patterns harder to reason about, unless sensible limitations are specified:

fn cut_in_two<...As, ...Bs>(ab: (...As, ...Bs)) -> ((...As), (...Bs));

cut_in_two((0, 1, 2, 3, 4)); // Where does As stop and Bs start?

Other transformations

In theory, variadics can enable a statically-typed language to treat types almost like first-class values. Beyond map, a language could port most of the classic list transformations (filter, fold, any, all, sort, dedup, etc) to sequences of types (and this is indeed what D does).

It's unclear how much real-world use these transformations would have. One could, for instance, write a type constructor IntegerTypes<Ts...> that would return a subset of the input tuple with only integer types; but it's not obvious what practical applications this constructor would have that cannot be achieved now.

Rust generally frowns upon adding complex type operations for the sake of making them possible, the way some functional programming languages do (no Higher-Kinded Types for you). Anybody wanting to push towards D-style non-linear variadics (that is, type constructors with variadics that aren't N-to-N) will have an uphill climb ahead of them. Among other things, they'll be expected to research how these additions would impact type inference, undecidability issues, and post-monomorphization errors.

Implementing variadic functions

Declaring variadic types is only one half of the feature. The other is how to use them.

Tuple for

A for loop is the most idiomatic way to express "I want to do the same thing, over and over again, with each item of this collection".

Adapting it for variadics is straightforward:

fn hash_tuple<...Ts: Hash, S: Hasher>(tuple: (...Ts), state: &mut S) {
  for member ...in tuple {
    member.hash(state);
  }
}

We also need tuple-for to return the data at the end of its block, for its respective members:

fn unwrap_all<...Ts>(options: ...Option<Ts>) -> (...Ts) {
  for option ...in options {
    option.unwrap()
  }
}

(That last feature would probably be incompatible with break and continue statements; I won't go into details, but there are several ways to address that)

And in some cases, we need to iterate over types as well as values:

fn to_vecs<...Ts>(options: (...Option<Ts>)) -> (...Vec<Ts>) {
  // Again, this is just one possible syntax
  for option, type T ...in options {
    if let Some(value) = option {
      vec![value]
    }
    else {
      Vec::<T>::new()
    }
  }
}

(Though I don't expect these cases to be common; Rust type inference is strong, and the example above is kind of dumb)

Iterating over references

Given the following code:

let array = [1, 2, 3];
    
for x in &array {
    do_stuff(x);
}

we expect x to be of type &i32, not i32.

Similarly, with variadics:

// tuple == (1, 2, 3);
    
for x ...in &tuple {
    do_stuff(x);
}

x should also be of type &i32.

Since tuple-for probably won't be built on a trait the way existing for-loops are, the exact rules will probably be a little more magical than with iterators.

Iterating over multiple tuples

Some use-cases might require the implementation to iterate over multiple tuples simultaneously. For instance:

fn zip<...As, ...Bs>(tuple_a: (...As), tuple_b: (...Bs)) -> (...(As, Bs)) {
  for a, b ...in tuple_a, tuple_b {
    (a, b)
  }
}

Another possibility is to have the zip function above be purely hardcoded, and have other use-cases zip their tuples before using them.

Fold expressions

C++ provides a convenient way to fold variadics over binary operators:

template<typename ...Ts> auto sum_of(Ts&& ...ts) {
    return (0 + ... + ts);
}

Rust could implement something similar.
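
There is no concrete syntax for this yet; a minimal sketch, reusing this document's hypothetical tuple-for syntax with an accumulator instead of a dedicated fold expression, might look like:

// Hypothetical syntax: fold a variadic tuple of values into a single sum.
fn sum_of<...Ts: Into<i64>>(ts: (...Ts)) -> i64 {
    let mut sum = 0i64;
    for t ...in ts {
        sum += t.into();
    }
    sum
}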

Spread operator

Users may want to be able to create new tuples by "flattening" existing tuples, and treating them as if they were a comma-separated list of their members. In JavaScript, this is known as the spread operator, though Rust has a similar concept known as the struct update syntax.

This would make it possible to concatenate tuples or pass them inside arguments list, eg:

let my_tuple = (1, 2, 3, ...previous_tuple);
my_variadic_function(arg1, ...my_tuple, arg2, ...my_other_tuple, more_args);

This is a powerful feature, and as such, it might be hard to reason about. In particular, the type inference system would need to account for cases where the number of arguments a tuple "spreads" into is dependent on template parameters:

fn foo<T1, T2, T3>(t1: T1, t2: T2, t3: T3);

fn bar<...Ts>(tuple: (...Ts)) {
  // Should probably be an error, there's no guarantee that Ts is 3 members long
  foo(...tuple);
}

Destructuring patterns

Destructuring patterns are the opposite of the spread operator:

// prev_tuple == (1, 2, 3, 4, 5, 6);
let (a, ...my_tuple, b, c) = prev_tuple;
// a, b, c == 1, 5, 6
// my_tuple == 2, 3, 4

(Also, some destructuring syntax already exists in the language)

This is often suggested as a means to enable C++11-style recursive variadics, eg:

fn print_all<...Ts>(args: ...Ts) {
  match args {
    (arg0, ...other_args) => {
      print(arg0);
      print_all(other_args);
    }
    () => ()
  }
}

But, generally speaking, it's a powerful feature with a lot of potential uses (and thus, much like concatenating and spreading tuples, it makes type inference harder).

Note that tuples have an implementation-defined internal layout. This means that a sequence of fields in a tuple aren't guaranteed to be a subslice of that tuple, which is why binding to destructuring patterns isn't currently allowed in Rust.

In practice, that means let (head, ...tail) = args might work, but let (ref head, ref ...tail) = &args would not.

Also, interactions with macros are non-obvious. For instance, it's not immediately clear whether println!("somestufff {}", ...my_tuple) would compile.

Derives

I think this is the most under-analyzed potential benefit of variadic generics.

There is a large consensus that slow compile times are one of the major pain points of Rust (though, interestingly, there doesn't seem to be a recent analysis on the subject). The lifetime system can be tamed after a learning period, but the compiler stays slow, incremental improvements notwithstanding.

I'd wager that for the majority of projects, having to compile syn and proc-macros takes up a big chunk of compile time. There are a lot of Rust projects that use extremely elaborate proc macros to generate trait implementations that boil down to "for every member of your struct, do X".

This is a really inefficient way to produce generic code. We're parsing a token-stream, performing expensive operations on the resulting AST, then producing another token tree that needs to be parsed, type-checked and borrow-checked all over again. The process is fiddly, and library maintainers can easily introduce compile errors that don't show up in their tests because they can only be triggered by certain inputs (though solutions are being explored to address that).

In fact, I'd be interested to see a survey of the most often used proc macros in published crates, because I suspect the vast majority of use-cases could be easily implemented with variadic generics.

For instance, assume that we had variadic generics, as well as a GET_FIELDS builtin that transforms a struct into a tuple of its fields. The Debug derive could be implemented as:

use syn::*;

#[proc_macro_derive(Debug)]
pub fn derive_debug(input: TokenStream) -> TokenStream {
    let derive_input = parse_macro_input!(input as DeriveInput);

    let struct_ident = derive_input.ident.clone();
    let struct_name = struct_ident.to_string();
    let struct_data;
    if let Data::Struct(data) = derive_input.data {
      struct_data = data;
    }
    else {
      unimplemented!();
    }
    let field_names: VecDeque<String> = struct_data.fields.iter().map(|field|
      field.ident.as_ref().unwrap().to_string()
    ).collect();

    let generics = add_trait_bounds(derive_input.generics);
    let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();

    let expanded = quote! {

        impl #impl_generics fmt::Debug for #struct_ident #ty_generics #where_clause {
          fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            // GET_FIELDS! is the hypothetical builtin described above.
            debug_struct_fields(#struct_name, GET_FIELDS!(self), #field_names, f)
          }
        }

    };
    proc_macro::TokenStream::from(expanded)
}

fn debug_struct_fields<...Ts: Debug>(
  struct_name: String,
  fields: &(...Ts),
  mut field_names: VecDeque<String>,
  f: &mut fmt::Formatter<'_>
)
  -> Result<(), fmt::Error>
{
  let mut builder = f.debug_struct(&struct_name);
  for field ...in fields {
    builder.field(&field_names.pop_front().unwrap(), field);
  }
  builder.finish()
}
  • Note: This is an example of a simplified derive macro. Actual macros usually have more complicated rules regarding attributes; for instance, Derivative(Debug) will have rules like "use the debug implementation of all these fields, but skip this one". I believe this could be addressed with the rules already described, though I won't go into details here.

The example still uses syn, because derives can only be written with proc_macros and the entire system is a tower of duct tape. With further language improvements, syn could be removed entirely (with derives based on macros-by-example, or a slightly more ergonomic TokenStream, or a #[derive_with_variadic_generics] builtin).

Regardless, the example is still a lot leaner than the traditional derive macro. The macro only generates 5 lines of tokens, and the heavy lifting is done by debug_struct_fields, which only needs to be parsed, type-checked and borrow-checked once. Any errors in the implementation will show up at compile time, not when the macro is being instantiated, with easier-to-read error messages.

How hard would the GET_FIELDS builtin be to implement? The main problem is that a struct's layout is implementation-defined, and there is no guarantee that struct Point { x: i32, y: i32 } has the same layout as (i32, i32). In practice, we might want a builtin trait like HasVariadicFields<T...>, so that the prototype of debug_struct_fields would actually be:

fn debug_struct_fields<...Ts: Debug, Struct: HasVariadicFields<...Ts>>(
  struct_name: String,
  fields: &Struct,
  field_names: VecDeque<String>,
  f: &mut fmt::Formatter<'_>
)

First-class types and reflection

While they're not strictly speaking variadics, first-class types are an idea that comes up often enough.

The general idea would be to treat types as values (no different than integers or strings) that can be passed to and returned from const functions, eg:

const fn do_something_with(arg: type) -> type;

type my_type = i32;
type my_other_type = do_something_with(my_type);

This is extremely powerful, well beyond the scope of variadics: first-class types make it possible to create languages inside the language, and to generate entirely new type semantics inside const functions, while using the familiar syntax of Rust code.

In principle, we can even use them to create variadic tuples and variadic generics without any dedicated syntax, extending existing features:

// Hardcoded by the compiler
impl const Iterator<Type> for Tuple {
    const LEN: usize;
    // ...
}

let my_tuple = ...;
for member in my_tuple.iter() {
    member.do_stuff();
}

struct MyTypes<const Types: Vec<Type>> {
    things: (Types),
    optional_things: (Types.map(|Type| Option<Type>)),
}

In general, these proposals are suggested as a replacement for variadic generics. The idea being that we don't need variadics if we can use existing syntax instead, applied to types instead of values.

How to proceed

Everything I wrote so far has been a summary of different proposals and design ideas. I've tried to make a balanced account, without actually recommending anything.

This section is where I actually explain how I think the Rust language should implement variadics.

Minimum viable product

The Rust team recently decided to ship a minimum version of const generics, which is currently available on stable. This version only kept the features that were unanimously agreed upon, while postponing the features that required non-trivial design decisions for later stabilization.

I think something similar should be done with variadic generics. A first implementation should be released, focusing on the use case of implementing traits for tuples. This means:

  • No zipping, flattening, concatenating, or applying type constructors to tuples. (...Ts) is allowed, (...As, ...Bs), ...Option<As>, ...(As, Bs), etc, aren't.
  • No variadic functions. Eg fn foobar<...Ts>(tuple: (...Ts)) is allowed, fn foobar<...Ts>(args: ...Ts) isn't.
  • Implement tuple for.
  • No spread operator or destructuring patterns (eg let (a, ...my_tuple, b, c) = prev_tuple; isn't allowed).
  • No GET_FIELDS or HasVariadicFields builtin for transforming a struct into a tuple.

This is still more than enough to implement eg the Hash trait for any arbitrary tuple.
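
As a rough sketch (using the same hypothetical syntax as the rest of this document), the single impl replacing the impl_hash_tuple! macro shown earlier could look like:

// Hypothetical MVP syntax: one impl covering tuples of every arity.
impl<...Ts: Hash> Hash for (...Ts) {
    fn hash<S: Hasher>(&self, state: &mut S) {
        for member ...in self {
            member.hash(state);
        }
    }
}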

Implement tuple for

Even before an MVP is stabilized, the language should add a tuple-for syntax for any tuple, including non-variadic tuples.

It's a micro-feature that could pull its weight even without variadics. For instance, proc-macros could generate code with tuple-for, instead of copy-pasting the same bit multiple times. The macros would still have to be called for every tuple size, but the code actually being generated would be simpler.
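
As an illustration (hypothetical syntax; my sketch rather than part of any proposal), a macro-generated fixed-arity impl could use tuple-for instead of expanding the per-field statement once per field:

// The macro is still invoked once per arity, but the body it emits no longer
// repeats `$name.hash(state);` for every field.
impl<A: Hash, B: Hash, C: Hash> Hash for (A, B, C) {
    fn hash<S: Hasher>(&self, state: &mut S) {
        for member ...in self {
            member.hash(state);
        }
    }
}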

Restrict tuple arithmetic

While I've tried to be balanced, I think it's clear by now that I don't think non-linear tuple arithmetic is worth implementing, even past the MVP stage.

By "non-linear", I mean any operation that takes a N-tuple and doesn't return a N-tuple: the spread operator, destructuring pattern, tuple flattening/concatenation, tuple indexing, etc.

These features are required to implement C++11-style recursive variadics, but C++11-style variadics are absolutely awful. They only work in a language resigned to horribly long post-monomorphization errors and officially-sanctioned hacks to work around the language's own limitations.

There's an interesting conversation about what we want Rust generics to be like; some of that conversation has already started, with the question of whether to allow maybe-panicking code in const generics. Whether to bring tuples closer to reified types, with flattening and indexing and so on, should be part of that conversation, which is broader than this document.

Focus on the compile error story

This wouldn't be part of any formal proposal, but it's something to keep in mind.

This document has referred to post-monomorphization errors a lot. A major prerequisite of any variadics proposal should be that it doesn't add any.

More generally, variadics should be designed with Rust's compile-error story in mind. A major feature of Rust is that the compiler helps you locate where an error comes from quickly. A feature that works when you use it right isn't good enough; it must also be easily corrected when you use it wrong.

This should be the case on the declaration side ("your for-loop is invalid because the member type X doesn't match the member type Y of variadic types ...Xs and ...Ys") and on the use side ("you are using unwrap_all wrong because you're giving it Option<X>, Option<Y> but you're expecting X, (); the second parameter should be Y").

In practice, keeping that focus requires keeping the semantics simple; it's also one of the reasons I think non-linear tuple arithmetic should be avoided.

Avoid first-class types

First-class types are sometimes brought up as a counter-argument against implementing variadics. The reasoning goes that we can implement all of the desired use-cases without adding any new syntax; if we do add a special ...Ts syntax, that syntax will become a special-case, technical debt that will need to be maintained in the compiler and taught to newcomers, without pulling its weight.

Using first-class types, we can be much more flexible about how we handle our types. We can have generics that use imperative rust code to define the types they handle, which means they can be defined on a much broader set of types without additional syntax.

But, for the reasons I listed in previous sections, I'm extremely skeptical of any such proposal.

I think the above reasoning is flawed. While first-class types may appear simpler at a glance, they come with a slew of corner cases, and subtle semantic changes, that would be much more difficult to implement (and maintain in the compiler, and teach to newcomers) than variadics alone. And, not to belabor a point, but post-monomorphization errors are bad and first-class types would bring millions of post-monomorphization errors.

Bluntly speaking, I don't think they're on the table.

(Speaking generally, we shouldn't add reflection the way Zig or D do it, by saying "How can we expose as much information from the type system as we can?". Instead we should ask "What are the common reflection use-cases, and how can we provide tools to implement them, such that the path-of-least-resistance for the tools is one that degrades gracefully?")

Improve the derive use-case

Sections of this document have laid out the base of how derive macros could be improved incrementally, using variadic generics.

But in my dream future, the more generics improve, the less macros will be needed. For instance, after enough progress, we may replace the #[proc_macro_derive] builtin entirely with a #[variadic_derive] builtin:

// See https://gist.github.com/PoignardAzur/4795888034e8b40b6b312b1d9da2cf3c
// for a more complete example

#[variadic_derive(struct)]
impl<
    const Name: String,
    const ...FIELDS_INFO: FieldData,
    ...Ts: Debug,
    Struct: HasVariadicFields<...Ts, ...FIELDS_INFO>,
> Debug for Struct {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result {
      let mut f = f.debug_struct(Name);
      for field, field_info ...in self.fields, FIELDS_INFO {
        let (field_name, _field_attributes) = field_info;
        f.field(field_name, field);
      }
      f.finish()
    }
}

It will take a lot of intermediary features for this to be possible, beyond those described in this document. Among others:

  • Advanced variadic generics.
  • Advanced const generics.
  • The ultimate fusion, variadic const generics.
  • Some form of compile-time reflection.
  • A more thought-out post-monomorphization error story, so that the derive can raise custom errors when a field's attribute doesn't match its type, in a human-readable way.

This will take a lot of time to implement, but once it is implemented, the vast majority of crates will finally repent.

(as in, remove their dependency on syn)

And that is something we can all aspire to.


Discussion on reddit

@jplatte

jplatte commented Jan 21, 2021

The thing I'm personally not too enthusiastic about is derives as a big motivation for this. If you wanted to be able to inspect all parts of the type the derive was used on like proc-macro derives can, you would basically end up with const Input: syn::DeriveInput instead of const Name: String and const FieldNames: [String]. The only improvement over proc-macros then would be the ability to declare some assumptions the derive macro has of the input (types of a struct's fields implement a certain trait, maybe some other things), and I would argue that most macros have more complex requirements of their input than that. I think without post-monomorphization errors, only <5% of what people want derives to do could be done with something like #[variadic_derive], which stands in stark contrast to the effort that would be required to enable it.

The issue of proc-macro compile times is also being tackled from other sides (see watt and cargo-watt), which I find a lot more compelling.

@pthariensflame

There’s also this issue of mine, which seems to have been overlooked here.

@Skepfyr

Skepfyr commented Jan 21, 2021

This is great and covers most of what I've seen on the topic, some thoughts in a random order:

  • More detail on HList style variadics i.e. (Head, ..Tail) would be good. These don't work (well) in Rust because you can't do much with Tail, as it isn't guaranteed to be a subslice of the tuple. As Rust can move the fields about in the tuple layout as it sees fit, the head of the tuple isn't necessarily at the front in memory.
  • println would give worse error messages as a function as the macro parses the format string which would be hard to do in a const way at the moment.
  • Iterating over types could be incredibly useful if you leverage associated types/consts.
  • This can't be done at the moment with some clever macros and traits, every so often I go "wait can't you just use trait objects and iterators" before realising (for the third time) that not all traits can be turned into trait objects (Hash is a good example).

@PoignardAzur
Author

@pthariensflame

There’s also this issue of mine, which seems to have been overlooked here.

You're not going to like this.

Overall I don't think your proposal is the direction I'd want variadics to go into. It goes straight into what I was describing as "non-linear type arithmetic". I think Rust's type system shouldn't be as powerful as an actual programming language (the way Zig and D's type systems are). I think type transformations should be restricted to a few reversible operations, that are easy to reason about for the compiler, so that if the user uses a type wrong, the compiler can easily tell them where the error is and how to fix it.

I'll add a link to your issue in the "existing RFCs" part.

@PoignardAzur
Author

@jplatte

I think the description of what variadic generics are should be generalized to all generics / generic parameter kinds 🙂

Yeah, so, I'm gonna channel my inner Arya Stark and tell you:

Not today.

(But yeah, that's a good point; also you can have variadics that are all the same type and there are definitely use cases where that would be useful, but, honestly, this document is already pretty long as it is)

We couldn't, actually. With a bunch of const eval improvements, we could turn println!("{} {}", a, b); into println::<"{} {}">(a, b); but that's rather ugly. println! parses its first argument at compile time, that will likely never be possible with a regular function.

Ugh, way too many people are reacting to that one throwaway line. I think I'm going to replace it with "vec" instead or remove it.

If you wanted to be able to inspect all parts of the type the derive was used on like proc-macro derives can, you would basically end up with const Input: syn::DeriveInput instead of const Name: String and const FieldNames: [String].

I oversimplified the example. In practice, you would end up with const Name: String, const FieldNames: [String], const FieldAttributes: (...AttributeTypes) (also, you would need different macros for deriving structs, tuples and enums). The trickiest part is that, like you point out, you need to require different traits for each field depending on the field's attributes, which, yes, essentially requires post-monomorphization errors (and the lang team hasn't really decided if we want to let a library writer trigger those). That said, they would be deliberate post-monomorphization errors, ideally with some user-friendly context to tell the user how to fix it. Accidental post-monomorphization errors, with their stack traces of template function calls, would be much more rare.

@PoignardAzur
Author

@Skepfyr

More detail on HList style variadics i.e. (Head, ..Tail) would be good. These don't work (well) in rust because you can't do much with Tail as it isn't guaranteed to be a subset of the type. As rust can move the fields about in the tuple layout as it sees fit the head of the tuple isn't necessarily at the front in memory.

I mean... I did say that?

  • Tuples have an implementation-defined internal layout. This means that a sequence of fields in a tuple aren't guaranteed to be a subslice of that tuple, which is why binding to destructuring patterns isn't currently allowed in Rust.

Not sure if you think I should write something else?

This can't be done at the moment with some clever macros and traits,

What do you mean?

@Skepfyr

Skepfyr commented Jan 21, 2021

Ahh yeah, I didn't see that point, it didn't come where I expected it. (I was expecting somewhere between where you introduced the two approaches and "Implementing variadic functions" where you dive into tuple for)

This can't be done at the moment with some clever macros and traits,

What do you mean?

This is diving deep into a random thought I had that doesn't work, but I previously thought that something simple like https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=530c89119c15a42c92bc08f2688a0a61 would be a good stepping stone; however, as that example shows, it isn't. I don't think it's particularly interesting.

@chriskrycho

Other “prior art” worth including in analysis: TypeScript’s support (added in 4.0 about six months ago) for variadic tuples—a subset of the functionality proposed here and with substantially different semantics.

@jplatte

jplatte commented Jan 25, 2021

If you wanted to be able to inspect all parts of the type the derive was used on like proc-macro derives can, you would basically end up with const Input: syn::DeriveInput instead of const Name: String and const FieldNames: [String].

I oversimplified the example. In practice, you would end up with const Name: String, const FieldNames: [String], const FieldAttributes: (...AttributeTypes) (also, you would need different macros for deriving structs, tuples and enums). The trickiest part is that, like you point out, you need to require different traits for each field depending on the field's attributes, which, yes, essentially requires post-monomorphization errors (and the lang team hasn't really decided if we want to let a library writer trigger those). That said, they would be deliberate post-monomorphization errors, ideally with some user-friendly context to tell the user how to fix it. Accidental post-monomorphization errors, with their stack traces of template function calls, would be much more rare.

You make it sound like it's just "one more small addition and it's gonna work". I disagree. Attributes would basically require a significant part of syn once again so you can parse them conveniently (they have been allowed to contain arbitrary tokens for a while) and you've also skipped over types completely. A decent number of macros have special behaviour for fields of a specific type (example: thiserror::Error handles fields with a type named Backtrace specially).

@PoignardAzur
Author

PoignardAzur commented Jan 25, 2021

You make it sound like it's just "one more small addition and it's gonna work".

I feel like you're strongly misinterpreting what I said. The passage we're discussing (including edits I made after your first post) is:

But in my dream future, the more generics improve, the less macros will be needed. For instance, after enough progress, we may replace the #[proc_macro_derive] builtin entirely with a #[variadic_derive] builtin [...]

It will take a lot of intermediary features for this to be possible, beyond those described in this document. [...] This will take a lot of time to implement, but once it is implemented, the vast majority of crates will [be able to not depend on syn].

I understand that, generally speaking, it's very reasonable to be skeptical of proposals that demand a lot of work and semantic changes for a few niche use cases, but in this case it seems a little unfair: this document already spends most of its wordcount describing common use-cases that could be implemented incrementally.

The last section is the only one that describes a feature that would actually require a significant rework, as a "once we've done all the intermediary things, this is what we could shoot for" thing.

Attributes would basically require a significant part of syn once again so you can parse them conveniently (they have been allowed to contain arbitrary token for a while)

In the general case, yes, but in practice I think most of the stuff mainstream crates do now could be done with tyexpr and constexpr expressions and a few other hardcoded formats parsed by the compiler before it gives them to the template function.

To be clear, I'm not saying the attribute syntax mainstream crates use now could be parsed that way. I'm saying the lang team could release #[derive_from_variadic_function_or_whatever], then tell crate maintainers "here is the syntax you need to switch to if you want to use the new derive", and crates could do the switch without losing features.

@PragmaTwice

PragmaTwice commented Jan 26, 2021

Great job!
A little typo about the c++ code in the section "fold expressions": the required bracket is missing

template<typename ...Ts> int sum_of(Ts... ts) { // consider `decltype(auto)` for the return type, and universal forwarding `Ts&& ...` for parameter types
    return (0 + ... + ts);
}

@jplatte

jplatte commented Jan 26, 2021

Thinking about this more, I think I might have to split my argument in two to become clearer:

  1. I just don't think variadics for derives is a considerable improvement. I think if there was a significant need for making simple derive macros easier to write, it could just as well be done by adding more helper crates for proc-macros (there are some actually, for example darling for attribute parsing). Variadics would probably make proc-macro code harder to read on average, because very few people would read much variadic-generic code.

  2. #[variadic_derive] is just never going to pull its weight. If it can only replace some proc-macro derives, you now have to learn multiple ways of writing derives when wanting to contribute to various projects. Even if it can fully replace proc-macro derives, old proc-macro derives will still exist for a long time and proc-macros in general will also keep existing. I would much prefer if you only had to learn variadic generics for (mostly) much simpler applications, where it would also be obvious why they're using variadic generics.

    Following my argument from 1., I can only see two reasons for #[variadic_derive]:

    • Lower compile times. As I mentioned above, proc-macro compile times can also be tackled in different ways, so I don't think this is that compelling.
    • Integration with the type system: If I haven't missed anything, this wasn't even mentioned above, and I think I know why: proc-macro derives not integrating with the type system is not actually a problem in practice. I do like the idea a lot in theory, but I think #[variadic_derive]s implementation would have to be unreasonably complex, partly because of this.

@PoignardAzur
Author

If you think the proc-macro ecosystem is fine as it is (or almost-fine, with stuff like pre-compiled wasm proc-macros filling in the gaps), then yeah, I guess I'm not going to convince you.

I think a lot of people are frustrated with proc-macros. I mean, if you look at the reflect crate, it's essentially an extremely complex, compiler-within-a-compiler library aimed only at making derive macros simple / more reliable. The fact that this crate exists (and was originally written by the guy responsible for, like, half the proc macro ecosystem) proves there is a need. At the same time, this crate is kind of an ugly hack.

And fundamentally speaking, I really don't think the standard way to write code generic on complex types should be to have an in-language language parser, plus an in-language symbol resolver, which we use to parse the tokens of the type's declaration, then generate an implementation for that type as a string of tokens which has to be parsed all over again. This is the wrong level of abstraction, we're just using it a lot because it's a lot easier to manipulate.

Again, if you think macros as they exist are fine, I don't think anything I can say can convince you. I could come up with any number of examples of things that are inconvenient or inefficient and you could say "It's not a problem in practice" or "That's the way the ecosystem is and it's not worth changing it" or "There's a crate that kind of solves it" for each of them.

But pragmatically speaking, I do think the feature would be worth the cost.

@jplatte

jplatte commented Jan 26, 2021

Okay, I see where you are coming from. I agree it would be really nice to have derives not be purely syntactical, but I disagree that variadic generics would be able to provide good abstractions for deriving traits. In your current example, some of the generic parameters are part of the generated impl, some are not. Similarly, some of the code is really compile-time code, some is part of the generated impl. Currently, it's only the for ...in parts that work at compile time, but any non-trivial derive has more compile-time logic.

Without having put too much thought into it, the way I would expect typesystem-integrated derives to work is basically some sort of compiler plugin, so you'd provide a function fn derive_debug(input: DeriveInput) -> ImplItem.

@comex

comex commented Jan 27, 2021

And fundamentally speaking, I really don't think the standard way to write code generic on complex types should be to have an in-language language parser, plus an in-language symbol resolver, which we use to parse the tokens of the type's declaration, then generate an implementation for that type as a string of tokens which has to be parsed all over again. This is the wrong level of abstraction, we're just using it a lot because it's a lot easier to manipulate.

Personally, I agree with this.

But I'm confused. How do you reconcile

In practice, you would end up with const Name: String, const FieldNames: [String], const FieldAttributes: (...AttributeTypes) (also, you would need different macros for deriving structs, tuples and enums). The trickiest part is that, like you point out, you need to require different traits for each field depending on the field's attributes […]

with

I think Rust's type system shouldn't be as powerful as an actual programming language (the way Zig and D's type systems are). I think type transformations should be restricted to a few reversible operations, that are easy to reason about for the compiler

?

If you're going to do something like "require different traits for each field depending on the field's attributes" – or more generally, reach anywhere near the same ballpark of flexibility as custom derives – you pretty much have to be able to perform arbitrary computation, just as custom derives can. You can't be limited to "a few reversible operations".

It makes sense to avoid the Turing tarpit of C++ variadics, where you can express any computation you want, but only by using what amounts to a separate programming language embedded in C++'s type system, one that's 10x as verbose as normal code and largely unreadable.

It makes sense to me to say that Rust could add a limited form of variadics which, in order to work within a pre-monomorphization-checked type system, as well as to generate nice errors, would only support a few operations.

But that design decision inevitably means that variadics cannot be the feature that substitutes for custom derives. They would be useful for a lot of things, but derives wouldn't be one of them, at least outside of the very simplest cases. I think custom derives ought to be replaced, by something that's less hacky and can interact with the type system, but that would have to be a completely separate feature, something that would inevitably end up resembling D or Zig's compile-time reflection.

In short, when you say variadics would be limited but also powerful, I think you're trying to have your cake and eat it too. How do you respond?

@PoignardAzur
Author

PoignardAzur commented Jan 27, 2021

If you're going to do something like "require different traits for each field depending on the field's attributes" – or more generally, reach anywhere near the same ballpark of flexibility as custom derives – you pretty much have to be able to perform arbitrary computation, just as custom derives can. You can't be limited to "a few reversible operations".

[...] It makes sense to avoid the Turing tarpit of C++ variadics, where you can express any computation you want, but only by using what amounts to a separate programming language embedded in C++'s type system, one that's 10x as verbose as normal code and largely unreadable.

[...] In short, when you say variadics would be limited but also powerful, I think you're trying to have your cake and eat it too. How do you respond?

That's a fair criticism and a real contradiction.

It's a contradiction we're hitting with const functions, where we want to use them everywhere a const is expected, but we don't want to have to deal with the post-monomorphization panics and semver issues that come with const panics, and we also don't want to reduce const functions to an ultra-restricted subset of the language that cannot possibly panic (though there are a lot of proposals for a nopanic attribute that would probably involve an SMT solver or dependent types or something).

That being said, I think you are underestimating how powerful the Rust type system is right now.

(you and everyone who says that only the very simplest derives could be implemented with generics)

To make that point, I wrote another example of variadic syntax, this time focused on emulating Derivative's Debug (but keep this up and I can probably be nerd-sniped into doing serde next ^^).

In that example, the line that does the heavy lifting is where ...Ts: ...DebugStructField<Attributes>; which means, "for every field of type T and its attribute of type A passed with it, T must be bound to DebugStructField<A>", which is reasonably non-Turing-tarpit-ish (though error messages would still be suboptimal).

You'll note that Derivative implements some non-trivial type semantics, field by field:

  • Some fields require Debug,
  • Some fields don't require anything,
  • Some fields require a custom format function.

Now, I don't know if you consider this to be "outside of the very simplest cases". I don't want to argue in bad faith or put words in your mouth. But I'm willing to wager that, when you said that variadics wouldn't cover enough of what derives did, you either didn't have specific use cases in mind, or the use cases you did have in mind were similar to Derivative's.

(correct me if I'm wrong though)


Just to make this discussion less abstract, here is a list of crates with derive macros I've found on crates.io, roughly sorted by popularity:

Of these, I think only pest and maybe darling are absolutely impossible to port to variadics (pest because it uses an external format file, darling because it does some weird things).

Derivative, zeroize, strum, derive_more and serde look easy to convert. There may be some syntax changes, and I haven't looked at every single implementation to check what it does, but from the documentation it looks like they would be similar to the MyDebug example I linked.

StructOpt and thiserror are mostly straightforward, with one exception each:

  • structopt includes doc comments in the derivation; this could be done with variadics (you'd have to add one more generic argument to the impl, like const ...FIELD_DOCS: String and stuff).
  • thiserror has format strings that are presumably checked at compile time. This can't be done with variadics without "first class" post-monomorphization errors.

Note that for all of the above I'm mostly commenting about the "deriving structs" part. Most of these crates can also derive enums, which wouldn't be possible with the proposal as written.

There's a few ways to address that (say that enums have to be passed to proc-macro derives, or have a #[variadic_derive_enum] builtin that treats enums as structs of Option and does the conversion back-and-forth, or implement variadics for enums, etc). But I'm not going into them right now because this is already long enough.

EDIT: Also, I didn't go into custom bounds. Those are a lot trickier; I haven't thought about them in detail, but my gut says they can still be handled without adding complicated semantics.

(As a final note, I hope I'm not sounding too curt here; I'm a little irritated by some of the feedback I get, which focuses on the parts that I feel are the least central to the analysis, but that's the game when you publish a proposal for informal review; I do think your and @jplatte's feedback has been relevant and useful, even if I fundamentally disagree with most of it)

Honestly, this is making me want to implement some of this as a proc-macro, to see how far I could get with it. I think a lot of these discussions would be easier to have with actual, non-pseudocode examples.

@jplatte

jplatte commented Jan 27, 2021

I don't have the time to respond to your new example now, but there's two small things I wanted to note:

In that example, the line that does the heavy lifting is where ...Ts: ...DebugStructField<FIELD_TYPES>;

Not sure if an old version or what, but it seems this should be ...Ts: ...DebugStructField<Attributes>

  • darling - though I'm not sure I understood what this one did.

This is a helper crate similar to the reflect crate. It simplifies attribute parsing for proc-macros.

@PoignardAzur
Author

PoignardAzur commented Jan 27, 2021

Not sure if an old version or what, but it seems this should be ...Ts: ...DebugStructField<Attributes>

Thanks, corrected.

This is a helper crate similar to the reflect crate. It simplifies attribute parsing for proc-macros.

Yeah, but I meant, I don't really understand what the derive attributes do. But yeah, I get the impression that it's "serde except for parsing token streams".

@comex

comex commented Jan 27, 2021

To make that point, I wrote another example of variadic syntax, this time focused on emulating Derivative's Debug (but keep this up and I can probably be nerd-sniped into doing serde next ^^).

There's already a fair bit of magic embedded into there. I know you're using example syntax, but the question is what it implies about semantics.

First, this line:

  Struct: HasVariadicFields<...Ts, ...FIELD_NAMES, ...FIELD_ATTRIBUTES>,

To start with, arguments need to be associated types/consts, not generic parameters, because otherwise the impl wouldn't be allowed. (i.e. you can't have impl<A, B: Trait<A>> Foo for Bar because there could be multiple possible values of A that satisfy the requirements.) So maybe something like

  Struct: HasVariadicFields<FieldTypes=...Ts, FIELD_NAMES=...FIELD_NAMES, FIELD_ATTRIBUTES=...FIELD_ATTRIBUTES>,

But then what is FieldTypes? I suppose it could be a tuple type, and a bound like FieldTypes=…Ts could be treated as a requirement that FieldTypes is a tuple type; similarly with tuple values for FIELD_NAMES and FIELD_ATTRIBUTES.

But does the compiler know that Ts, FIELD_NAMES, and FIELD_ATTRIBUTES have the same length? And this 'flattened' encoding where types, names, and attributes are separate lists is somewhat awkward. Perhaps a different encoding with an intermediate type would avoid those problems.

But then here:

    for field_name, field_data ...in fields, FIELD_NAMES, FIELD_ATTRIBUTES {

fields is not actually defined anywhere but I guess we are using some mechanism to turn the struct into a tuple. But again, how does the compiler know the fields tuple is the same length as the names/attributes tuples?

Perhaps the trickiest part is actually the where clause:

  where
    ...Ts: ...DebugStructField<Attributes>

If we happen to be deriving traits for a generic struct, we don't want the derived impl to have any fancy DebugStructField bounds on its generic parameters, but it might need some bounds. For example, if the struct is something like

struct Foo<T> { a: i32, t: T }

then the derived impl should be something like

impl<T: Debug> Debug for Foo<T>

but not something like

impl<T> Debug for Foo<T> where (i32, T): DebugStructFields<StandardDebug, StandardDebug>

because in the latter case, (a) the DebugStructFields bound would have to be duplicated in any generic item that wants to use the impl, and (b) the complexity would make it hard to tell why the bound wasn't holding in any given case.

So the implementation of variadic_derive would have to prove that the bound ...Ts: ...DebugStructField<Attributes> holds for all possible values of the generic parameters, and then remove it from the impl. That's doable, but it's a fair bit of magic. And how do you distinguish that bound, which should be removed, from the T: Debug bound, which shouldn't? Especially since T: Debug isn't directly expressed anywhere in the variadic impl itself, but is just a requirement of one of the DebugStructField impls.

@PoignardAzur
Author

But then what is FieldTypes? I suppose it could be a tuple type, and a bound like FieldTypes=...Ts could be treated as a requirement that FieldTypes is a tuple type; similarly with tuple values for FIELD_NAMES and FIELD_ATTRIBUTES.

Yes, it would have to be a tuple. The idea is that HasVariadicFields is kind of a magic type that transforms a struct into a tuple type of its fields.

You're right that the types would have to be associated types (and you can probably bundle the field types and the field names/attributes together).

But does the compiler know that Ts, FIELD_NAMES, and FIELD_ATTRIBUTES have the same length?

That's an open design question.

For the sake of this example, we could assume that only one "tuple arity" is allowed per generic item, so the compiler can always assume that two variadic parameters have the same length.

fields is not actually defined anywhere

Oh, crap, that's my bad. Gonna fix this.

  where
   ...Ts: ...DebugStructField<Attributes>

If we happen to be deriving traits for a generic struct, we don't want the derived impl to have any fancy DebugStructField bounds on its generic parameters, but it might need some bounds. For example, if the struct is something like

I'm not sure if there's a misunderstanding, but in ...Ts: ...DebugStructField<Attributes>, Ts represents the field types, not the implementation's bounds.

Keep in mind that #[derive(MyDebug)] wouldn't "generate" bounds in the sense of generating a new syntactic construct; it's more that it takes a blanket impl, and it selectively binds it to the types it's applied to (and emits an error if it can't bind it).

I haven't actually figured out how bounds would fit into this. Like, if we take your struct:

#[derive(MyDebug)]
struct Foo<T> { a: i32, t: T }

You're right that we'd want to end up with something equivalent to impl<T: Debug> Debug for Foo<T>, whereas with the magic syntax in my example the code above would just get a type error.

impl<T> Debug for Foo<T> where (i32, T): DebugStructFields<StandardDebug, StandardDebug>

Nitpick, it would be more like

impl<T> Debug for Foo<T> 
  where 
    i32: DebugStructFields<StandardDebug>,
    T: DebugStructFields<StandardDebug>,

The difference being that the compiler can do diagnostics on each field independently.

(b) the complexity would make it hard to tell why the bound wasn't holding in any given case.

Actually, not that hard. I've compiled a non-variadic version and the error message isn't too bad.

(also, the compiler could add more specialized error messages for DebugStructFields, because it knows that it's the type passed to variadic_derive, which means it has some semantic info to work with)

Anyway, my main takeaway is that this syntax would need some way to address generic bounds on the derived trait to be usable.

@PoignardAzur
Author

@jplatte

Any additional thoughts about the derive example?

@jplatte

jplatte commented Jan 30, 2021

Thanks for the ping. I don't have anything to say specifically about that example. I can only reiterate

I disagree that variadic generics would be able to provide good abstractions for deriving traits

That is, I think this is just the wrong abstraction. I agree that procedural macros are not the best abstraction either, but I think they get many things right. IMHO any alternative to procedural macros / procedural macro derives should also be able to benefit from existing Rust code in the form of build dependencies, and should have a clear distinction between compile-time code and generated code.

Variadic derives would be kind of like having to learn an entirely new language that Rust code is embedded in. They could potentially result in shorter code that's more reliable in edge cases that proc macros currently can't handle well or at all, but at the expense of being another huge feature to learn when creating Rust libraries, and likely being much harder to read even for experienced developers (I've done a lot of template metaprogramming in C++ before and never really got good at reading that code; getting familiar with a proc-macro crate is very easy for me these days).

@PoignardAzur
Author

I've done a lot of template metaprogramming in C++ before and never really got good at reading that code

Is that a fair comparison? C++ templates are absolutely awful.

On the other hand, I've always had an easier time reading D variadic templates. I'm trying to write a custom proc macro right now and boy do I wish I could use static foreach instead.

@jplatte

jplatte commented Jan 30, 2021

Is that a fair comparison? C++ templates are absolutely awful.

Well yeah, there is some awkwardness to C++ template metaprogramming that could be avoided with variadic metaprogramming in Rust, but I think fundamentally both would be the same thing: a purely functional DSL intermixed with its "host" language. And I think that just makes for hard-to-read code. Functional programming is super useful, but there are just things that are easier to express imperatively, and being stripped of the option to write code imperatively is super annoying IME (I used Haskell for a few years before Rust, and I think being able to choose the right paradigm when it feels right is a major reason why I've completely abandoned Haskell since switching to Rust).

On the other hand, I've always had an easier time reading D variadic templates. I'm trying to write a custom proc macro right now and boy do I wish I could use static foreach instead.

I have no experience with D so can't comment on this.

@Kimundi

Kimundi commented Feb 26, 2021

Good overview of the problem domain and history! I also like the idea of providing a tuple for loop as a first stepping stone that would be usable on its own. One thought about that, though it might get into syntax bikeshedding:

Normal for loops iterate over a dynamic sequence of homogeneous values. A tuple for loop would iterate over a static sequence of heterogeneous values.

So maybe it would make sense to not frame the feature as a loop, but rather as variadically expanded let statements (or something similar).

Eg something like (syntax of course just an example):

...let x: T in tuple {
    // ...
}

I don't mean this just as keyword bikeshedding, but as a way to think about the feature and how it would be taught. After all, its semantics would be closer to a macro that just expands to N blocks of independent code than to a loop. This might also give us more syntactic and semantic wiggle room for the features discussed in the document.
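(As a rough illustration of the "expands to N independent blocks" framing: for a 2-tuple tuple: (A, B), the construct above could be thought of as expanding to something like the following; the expansion is of course hypothetical:)

{
    let x: A = tuple.0;
    // ...
}
{
    let x: B = tuple.1;
    // ...
}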


Another point I've thought about is the discussion about how to control the sizes of two variadic type packs - e.g. in the examples where they need to have the same size so they can be zipped. There is also the interaction with const generics, and the ability to combine both, to consider. E.g. in C++ you can write an API like get<2>() to get the third variadic tuple element.

So what if, either optionally or mandatorily, the number of types in a variadic were part of the type signature as a const generic? E.g.:

fn foo<const N: usize, ...<N>Ts>(arg: ...Ts) { ... }

That way you have an actual size to work with and to hang type bounds off of. Though I guess this could also just be provided with a TupleSize<...Ts>::LEN helper.
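(For what it's worth, a length helper can already be written in today's Rust for fixed arities; a hand-rolled sketch with hypothetical names, of the kind normally generated by a macro up to some maximum arity:)

trait TupleLen {
    const LEN: usize;
}

impl TupleLen for () { const LEN: usize = 0; }
impl<A> TupleLen for (A,) { const LEN: usize = 1; }
impl<A, B> TupleLen for (A, B) { const LEN: usize = 2; }
// ...and so on up to some fixed arity.

fn arity<T: TupleLen>(_tuple: &T) -> usize {
    T::LEN // e.g. arity(&(1, "x")) == 2
}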

@B-Reif

B-Reif commented Apr 16, 2023

@PoignardAzur Have you heard any developments or interest in tuple-for? I'm curious about exploring this micro-feature further.

@PoignardAzur
Author

No developments.

I think the semi-official stance from the lang team is "we'll look into variadics when [long list of currently in-progress features] is done".

@Massou31

Massou31 commented May 5, 2023

@PoignardAzur I've posted https://internals.rust-lang.org/t/thoughts-on-tuples/18778 before finding this gist.

I have no insight into whether now is the right time to revive a proposal on variadic generics and tuples, but I was thinking about some steps towards a full implementation.

You seem to have put a lot of thought and effort into this, so I'm genuinely curious about your opinion on the following:

C++ uses variadic generics to implement tuples (https://en.cppreference.com/w/cpp/utility/tuple).
I've explored the idea of using tuples for variadic generics since they already exist in the language.

step 1:

If at most one variable pattern is allowed in a single tuple, and it's possible to bind this pattern, then we get some form of variadic generics (the idea is to just have the same pattern matching that's already available for arrays).

A variable pattern could match 0 or more tuple items, where 0 items would bind to the unit tuple ().
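(For reference, the array pattern this is modeled on already works in today's Rust:)

let [a, rest @ .., c] = [1, 2, 3, 4];
assert_eq!(a, 1);
assert_eq!(rest, [2, 3]);
assert_eq!(c, 4);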

In let bindings

let (a, b @ .., c) = (1, 2);
assert_eq!(a, 1);
assert_eq!(b, ());
assert_eq!(c, 2);

let (a, b @ .., c) = (1, 2, 3);
assert_eq!(b, (2,));

// nested elements
let (a, (b, c @ ..), d, e @ ..) = (1, (2, 3, 4), 5, 6, 7);
assert_eq!(a, 1);
assert_eq!(b, 2);
assert_eq!(c, (3, 4));
assert_eq!(d, 5);
assert_eq!(e, (6, 7));

In function parameters:

// simple binding
fn foo(a: (u8, ..));

// with type or trait restrictions:
fn foo(a: (.. : u8));
fn foo(a: (.. : impl Trait));
fn foo(a: (.. : dyn Trait));

// binding with destructuration
fn foo((a, b, c): (u8, .., u32));

// with nested elements
fn foo((a, b, (c, d)): (u8, .., (.., u32)));
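(For comparison, destructuring a fixed-arity tuple in a function parameter already works in today's Rust; the proposal adds the .. rest pattern on top of that. A trivial example:)

fn sum_pair((a, b): (u32, u32)) -> u32 {
    a + b
}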

step 2:

Tuple erasure and integration with generics (with things like: https://doc.rust-lang.org/std/marker/trait.Tuple.html)

// simple restriction
fn foo(a: (..));

fn bar<T: Tuple>(a: T) {
   foo(a);
}

// auto traits for tuples if all of the members implement it
fn foo<T: TupleOf<Display + Debug>>(tupl: T);

// same thing but more verbose:
fn foo<T>(tupl: T)
    where
        T: TupleOf<Display> + TupleOf<Debug>;

// alternative: have auto associated items like _0, _1, ...
fn foo<T>(tupl: T)
    where
        T: Tuple,
        T::_0: TraitA,
        T::_1: TraitB

// with structs
struct MyStruct<T, U: Tuple, V> {
    t: T,
    u: U,
    v: V,
}

impl<T, U: Tuple, V> MyStruct<T, U, V> {
    pub fn new((t, u, v): (T, ..: U, V)) -> Self {
        Self {t, u, v}
    }
}

MyStruct::new((10, 12)); // U will be ()
MyStruct::new((10, 11, 12)); // U will be (11,)
MyStruct::new((10, 11, 12, 13)); // U will be (11, 12)

step 3:

Integrate with consts and have constructs like:

// const for loops
const for x in (1, 2, 3) {}

// const while loop
let mut tupl = (1, 2, 3);
const while let Some((x, tail)) = tupl.pop_front() {
    if x == () {
        break;
    } else {
        tupl = tail;
    }
}

A const LEN member could also exist, and some restrictions on tuple length as well.

// with exact size
fn foo<T: Tuple<LEN == 12>>(tupl: T);

// at least 4 elements
fn foo<T: Tuple<LEN > 3>>(tupl: T);
// 3 or less elements
fn foo<T: Tuple<LEN <= 3>>(tupl: T);

EDIT: added struct examples

@PoignardAzur
Author

Your syntax doesn't really work with existing constructs; specifically, in step 2, this:

where
        T: TupleOf<Display> + TupleOf<Debug>;

doesn't work because traits can't be passed as template parameters, and this is extremely unlikely to change.

Now, we might want to have TupleOf be an exception where it's the only trait that is generic over other traits; but at that point, we essentially end up with a slightly different syntax for this:

where
        ...T: Display + Debug;

(or whatever variadic syntax you want to use)

@Massou31

Massou31 commented May 5, 2023

Your syntax doesn't really work with existing constructs; specifically, in step 2, this:

where
        T: TupleOf<Display> + TupleOf<Debug>;

doesn't work because traits can't be passed as template parameters, and this is extremely unlikely to change.

Now, we might want to have TupleOf be an exception where it's the only trait that is generic over other traits; but at that point, we essentially end up with a slightly different syntax for this:

where
        ...T: Display + Debug;

(or whatever variadic syntax you want to use)

Do you think that a special binding syntax could also be introduced in the case of parameters (like in your suggestion)?

Regardless of the syntax, after more thinking, variadic generics don't work well with associated types.

I suppose that at that point unpacking the arguments might lead to inconsistencies or generate a lot of different code if there are multiple different associated types.

Example:

trait MyTrait {
    type Assoc;
}

fn foo<T>(tupl: (.. : T)) -> (.. : T::Assoc)
where
    ..T: MyTrait;

Basically this is a limitation on variadic generics.
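(For comparison, here is that signature written out for a fixed arity of two in today's Rust, using the same MyTrait as above; this is what the variadic version would have to generalize over every arity, and the function name is hypothetical:)

fn foo2<A: MyTrait, B: MyTrait>(tupl: (A, B)) -> (A::Assoc, B::Assoc) {
    // A real body would need some per-element mapping operation.
    unimplemented!()
}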
