@est31
Last active September 10, 2022 04:49

Postfix macros in Rust

The problem

Rust has many postfix combinators, for example the .unwrap_or(x) and .unwrap_or_else(|| x) functions. They are useful if you want to extract some value from an optionally present value, or if not, provide an alternative value. It's really nice and tidy to read:

let var = Some(0);
let var = var.unwrap_or(42);

But what if instead of providing a default value, you want alternative control flow?

let var = if let Some(var) = var {
    var
} else {
    continue;
};
let var = var.something();
let var = if let Some(var) = var {
    var
} else {
    continue;
};
// Do foo

This is quite ugly compared to the prior code! You can't use .unwrap_or(x) because x is always evaluated, nor can you use .unwrap_or_else(|| x) because x is inside a closure, and closures can't affect outside control flow through break, continue, or return.
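The eager versus lazy difference can be observed directly. The following is a minimal sketch; the helpers `fallback`, `eager`, and `lazy` are made up for illustration and count how often the fallback expression runs.

```rust
use std::cell::Cell;

/// Produces the fallback value and records that it was evaluated.
fn fallback(counter: &Cell<u32>) -> i32 {
    counter.set(counter.get() + 1);
    42
}

/// Returns (result, number of fallback evaluations) for `unwrap_or`.
fn eager(value: Option<i32>) -> (i32, u32) {
    let counter = Cell::new(0);
    // The argument is computed before `unwrap_or` is called,
    // even when `value` is `Some`.
    let result = value.unwrap_or(fallback(&counter));
    (result, counter.get())
}

/// Returns (result, number of fallback evaluations) for `unwrap_or_else`.
fn lazy(value: Option<i32>) -> (i32, u32) {
    let counter = Cell::new(0);
    // The closure body only runs when `value` is `None`.
    let result = value.unwrap_or_else(|| fallback(&counter));
    (result, counter.get())
}

fn main() {
    println!("{:?}", eager(Some(0)));
    println!("{:?}", lazy(Some(0)));
    println!("{:?}", lazy(None));
}
```

Writing `continue` in place of the `fallback` call would not help in either case: with `unwrap_or` it would run unconditionally, and inside the closure it is rejected by the compiler.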

A slightly nicer way would involve a match:

let var = match var {
    Some(var) => var,
    None => continue,
};
let var = var.something();
let var = match var {
    Some(var) => var,
    None => continue,
};
// Do foo

It's one line shorter each time, but that's still not as short as .unwrap_or(x) would have been.

Another alternative would involve nested if let:

if let Some(var) = var {
    let var = var.something();
    if let Some(var) = var {
        // Do foo
    }
}

But then you need to indent all the code that comes after the unwrapping. The beauty of a continue/break/return pattern is that you don't have this rightward drift. You can limit the drift to one level by using ? inside a closure:

let var = (|| {
    var?.something()
})();
if let Some(var) = var {
    // Do foo
}

Then you need only a single if let. But you still need one, and the code that calls something is separated from the code that does foo. It's still not as nice as .unwrap_or would have been.

The solution

Postfix combinators like .unwrap_or are functions. In Rust, functions can only take values. .unwrap_or takes a value unconditionally, for .unwrap_or_else the value is a closure, and its body is lazily evaluated. But functions can't take anything beyond that. Functions can't take anything that influences control flow inside the calling function. This doesn't compile:

fn foo() {
    continue; //~ ERROR cannot `continue` outside of a loop
}
fn bar() {
    for _ in 0..3 {
        foo();
    }
}

There is a language construct in Rust which does support influencing the outside context: macros. This compiles:

macro_rules! foo {
    () => {
        continue
    };
}
fn bar() {
    for _ in 0..3 {
        foo!();
    }
}

They also support taking arguments that influence control flow. Another example:

macro_rules! foo {
    ($v:expr) => {
        $v
    };
}
fn bar() {
    for _ in 0..3 {
        foo!(continue);
    }
}

Thus we can greatly improve the code from above using macros:

use unwrap_or_definition::unwrap_or;
let var = unwrap_or!(var, continue);
let var = var.something();
let var = unwrap_or!(var, continue);
// Do foo
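The unwrap_or! macro is not part of the standard library, and the unwrap_or_definition path is a placeholder. A minimal sketch of what such a macro could look like follows; the function names `sum_even_halves` and `half_if_even` are made up for illustration.

```rust
// Hypothetical definition of a prefix `unwrap_or!` macro.
// The fallback is an expression, so it may be `continue`, `break` or `return`.
macro_rules! unwrap_or {
    ($opt:expr, $fallback:expr) => {
        match $opt {
            Some(value) => value,
            None => $fallback,
        }
    };
}

/// Returns half of `n` if `n` is even.
fn half_if_even(n: i32) -> Option<i32> {
    if n % 2 == 0 { Some(n / 2) } else { None }
}

/// Sums the halves of all inputs that parse as even numbers,
/// skipping everything else via `continue`.
fn sum_even_halves(inputs: &[&str]) -> i32 {
    let mut sum = 0;
    for input in inputs {
        let number = unwrap_or!(input.parse::<i32>().ok(), continue);
        let half = unwrap_or!(half_if_even(number), continue);
        sum += half;
    }
    sum
}

fn main() {
    println!("{}", sum_even_halves(&["4", "x", "3", "10"]));
}
```

Because the macro expands to a match inside the loop body, the `continue` applies to the caller's loop, which a function argument or closure could not do.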

But it's still not as nice as the .unwrap_or() or .unwrap_or_else() functions. It would be nice if one could call the unwrap_or macro in a postfix fashion:

use unwrap_or_definition::unwrap_or;
let var = var.unwrap_or!(continue)
    .something()
    .unwrap_or!(continue);
// Do foo

Look how nice it is. Beautiful!

But sadly this isn't supported by Rust :/. If we try to compile it, we get an error, because currently postfix macros aren't part of the language.

There is an open RFC to add postfix macro support to the language. But sadly, despite being years old, the RFC has not been merged yet.

RFC 2442

RFC 2442 gives macro authors the option to allow their macros to be used in postfix fashion. So one can do:

macro_rules! foo {
    ($self:self, $p:expr) => {
        // Stuff
    };
}
bar.foo!(baz); // bar binds to $self, baz to $p

However, the RFC has met objections. The main concerns voiced about it regard name resolution.

If you call a function bar, on a type Foo, it's resolved based on that type. You can have a totally different implementation of bar for a type Bar.

struct Foo;
struct Bar;
impl Foo {
    fn bar(&self) {
        println!("hi");
    }
}
impl Bar {
    fn bar(&self) {
        println!("hello");
    }
}
fn main() {
    Foo.bar(); // prints hi
    Bar.bar(); // prints hello
}

The RFC proposes that postfix macros shall follow the same resolution behaviour that all macros follow. You import them based on their name only, and they work on all types.
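Today's prefix macros already behave this way. In the sketch below, a single hypothetical macro `describe!` is found by name alone and accepts values of unrelated types; nothing about `Foo` or `Bar` takes part in resolving it.

```rust
#[derive(Debug)]
struct Foo;

#[derive(Debug)]
struct Bar;

// Hypothetical macro: one definition, looked up purely by its name.
macro_rules! describe {
    ($value:expr) => {
        format!("<{:?}>", $value)
    };
}

fn main() {
    // The same macro is used for every argument type.
    println!("{}", describe!(Foo));
    println!("{}", describe!(Bar));
    println!("{}", describe!(1 + 2));
}
```

Under name based resolution, a postfix call would presumably pick the macro in the same way, with the receiver handed over as just another argument.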

The critics of the RFC say that this is a massive flaw, and this has held back the RFC from being merged.

Back when the RFC was proposed, I'd been on the fence on the question. Now I'm leaning more towards normal macro based resolution.

The big issue with type based resolution is that it creates a dependence on the type system. Currently, the Rust compiler first expands all macros in a piece of code before it runs type checking on them. With type based resolution, a part of expansion has to run during type checking.
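The ordering of the two phases can be seen with existing macros. The sketch below uses a hypothetical macro `source_text!` that only forwards its argument to the standard `stringify!` macro. The argument has to parse as an expression, but it is never type checked, because it is gone by the time type checking runs.

```rust
// Hypothetical macro: looks at the tokens of its argument only and
// never emits them as code.
macro_rules! source_text {
    ($e:expr) => {
        stringify!($e)
    };
}

fn main() {
    // `1 + "a"` is syntactically a valid expression, but it would not
    // type check. It is accepted here because expansion turns it into a
    // string literal before the type checker ever sees it.
    let text: &str = source_text!(1 + "a");
    println!("{}", text);

    // The same holds for names that do not exist at all.
    println!("{}", source_text!(no_such_variable.no_such_method()));
}
```

A macro that had to be picked based on the type of its receiver could not be expanded this early.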

One big argument the proponents of type based resolution brought forward was consistency. Yes, type based postfix macro resolution would be consistent with postfix function resolution, but it would be inconsistent with basically anything macro related. Macros currently have no features that rely on types. They only work on the syntactic level. So some kind of consistency has to be broken here, and personally I think that breaking the consistency with functions has lower cost, as macro expansion and type checking can still be in totally different phases of compilation.

I think the best argument that speaks in favour of type based resolution is expressive power: it's more powerful to have multiple macros with the same name in scope, that each do something different based on the type. Is this enough to pay the cost of mixing up the compiler's phases? I'm not sure.

And there is a way you can interpret macro based resolution from a type based point of view: postfix macros are just like extension traits implemented on all types, with name linked to the macro they contain. In other words, they are very similar to this construct from the function world:

struct Foo;
trait foo {
    fn foo(&self) {}
}
impl<T: ?Sized> foo for T {}
mod baz {
    use super::{foo, Foo};
    fn bar() {
        Foo.foo();
    }
}

The only difference in resolution behaviour is that renaming the "trait" also renames the macro.

There was one very interesting comment on the RFC, which suggested just adopting universal function call syntax, or in other words, allowing people to just postfix call any macro, without requiring the macro to opt in. Personally I think this is a great idea, as it allows all bang macros to be called in a postfix manner, and no existing macro has to be adapted to support it.

To summarize, there are still many open questions about the design of postfix macros, and it's likely that the RFC won't get merged any time soon, even though postfix macros would be a beautiful addition to the language.

Postfix macros on stable Rust

Thanks to proc macros, not every feature has to reside in the compiler for users to enjoy it. A good example of this is the fehler crate, which allows crates to use exception syntax known from languages like Java or C++ in Rust by lowering it to a syntax the compiler can understand. Regardless of your opinion on this syntax, it's a good example of how proc macros can be used to experiment with language sugars.

This brings us to the big announcement of this post: Over the past few weeks, I've worked on a proc macro that takes in Rust code with postfix macro syntax, and then lowers it to traditional prefix macros.

It's called postfix-macros and is published on crates.io under the same name. It also provides a bunch of postfix macro analogs to well known Rust constructs.

Example usage:

use postfix_macros::{postfix_macros, unwrap_or};

fn main() {
    postfix_macros! {
        let urls = ["https://rust-lang.org", "http://github.com"];
        for url in urls.iter() {
            let mut url_splitter = url.splitn(2, ':');
            let scheme = url_splitter.next().unwrap();
            let _remainder = url_splitter.next().unwrap_or! {
                println!("Ignoring URL: No scheme found");
                continue;
            };
            println!("scheme is {}", scheme);
        }
    }
}

Sadly it's currently not possible to provide an attribute macro that takes the same syntax, as currently the syntax is rejected during parsing and anything passed to an attr macro also needs to parse. I've made a PR to solve this for the future, so hopefully a subsequent version of the crate can offer an attr macro.

Parsing Rust in inverse order

I'll now give a summary on the design choices that went into the postfix-macros crate.

First was the choice of parsing library. Back when the first proc macro support launched in Rust 1.15.0, when one could only define custom derives, one could only access the code passed to the macro through a string based API. Conversions to and from strings were literally the only two operations available on the TokenStream type.

Building on this API, parsing libraries like syn were made, which relieved proc macro authors from the task of parsing the code themselves. However, there is one big issue with syn: its massive compile time. Often compilation of the syn crate is one of the longest tasks in a compilation graph.

Thankfully nowadays the proc_macro crate's APIs are way richer: they allow access to the actual tokens. This hasn't removed the entire task of parsing Rust, but it made the experience way easier.

In postfix-macros, I've done the experiment of not using syn, in fact not using any dependency at all, and relying only on the proc_macro crate. So far it has turned out well. Yes, there are some bugs that wouldn't exist had syn been used, but compile time is wonderfully short. For me personally, this is more important than being fully bug-free. But the final word hasn't been spoken yet: maybe in the future I'll add an optional syn cargo feature, to allow people to use it instead.

Thanks to rolling my own parser, I was able to cut some corners. The macro doesn't touch most of the content passed to it. It only turns on if a specific pattern is recognized, .<ident>!(<something>) (also allowing [] and {}). Once it finds that pattern, it searches backwards for the start of the expression, as the macro may be chained. Last, it removes those items and puts them into the macro's arguments, lowering <expr>.<ident>!(<something>) to something like <ident>!(<expr>, <something>). There is special logic to recognize if something is empty or if the expression is more complicated and needs to be put inside {}.
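To make the lowering rule concrete, the sketch below writes out by hand what the prefix output could look like for a few postfix inputs. It uses only standard library macros. The postfix forms in the comments are hypothetical inputs, and the exact output of the crate may differ.

```rust
/// Each statement is a hand-lowered prefix form; the comment above it
/// shows the hypothetical postfix input it stands for.
fn lowered_forms() -> (bool, usize) {
    let value = Some(3);

    // value.matches!(Some(_))
    // The receiver becomes the first macro argument.
    let is_some = matches!(value, Some(_));

    // "hello".len().dbg!()
    // Empty arguments: the receiver is the only argument, no comma is added.
    let len = dbg!("hello".len());

    // value.map(|v| v + 1).matches!(Some(4)).assert!()
    // A chain nests: the inner macro call becomes the outer call's argument.
    assert!(matches!(value.map(|v| v + 1), Some(4)));

    (is_some, len)
}

fn main() {
    println!("{:?}", lowered_forms());
}
```

The chained case shows why the search for the start of the expression matters: everything to the left of the dot, including earlier macro calls, ends up inside the parentheses.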

The hardest part is the expression parsing code, as it walks backwards and has to take precedence into account. The original design assumed that, for example, &().dbg!() should pass &() to the dbg macro instead of just (), while 0&0.dbg!() should pass just 0, and it performed a speculative prefix operator search for this; that search has since been removed [1]. A speculative search is performed to recognize match constructs and if/else if chains.

There are some known bugs and probably some unknown ones as well, but I've gotten confident that the crate can be useful to people. Generally, Rust's type system should catch any issue arising from a precedence bug in my parsing code.

1: Update 2020-11-13: It has been pointed out to me that prefix operators like & or ! have weak precedence. E.g. &"hello".to_string() evals to &("hello".to_string()). I adjusted the code to mirror that behaviour for postfix macros as well, and removed the prefix operator search.
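The precedence rule from the update can be checked with ordinary method calls. A minimal sketch, with function names made up for illustration:

```rust
/// Method calls bind tighter than unary minus.
fn negated_abs() -> i32 {
    // Parsed as -(2i32.abs()), not as (-2i32).abs().
    -2i32.abs()
}

/// Method calls bind tighter than the borrow operator as well.
fn borrowed_string_len() -> usize {
    // Parsed as &("hello".to_string()), so the binding is a &String.
    let s: &String = &"hello".to_string();
    s.len()
}

fn main() {
    println!("{}", negated_abs());
    println!("{}", (-2i32).abs());
    println!("{}", borrowed_string_len());
}
```

Mirroring this for postfix macros means that a prefix operator in front of the receiver stays outside the macro call, so no search for prefix operators is needed.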

TLDR

I built a proc macro library that brings postfix macros to Rust. Check it out on GitHub.

@piegamesde

You didn't mention using and_then for solving the problem. I think it does so quite nicely:

let var = var.and_then(|var| var.something()); // Or: var.and_then(MyStruct::something);
let var = match var {
    Some(var) => var,
    None => continue,
};
// Do foo
// Or (if applicable): var.and_then(|var| { /* do foo */ })

You can do this as many times as you want, and only need one match block at the end. Sometimes the entire code can be expressed like this, thus eliminating the need for matching completely.

@est31
Author

est31 commented Nov 16, 2020

@piegamesde indeed there are more ways. I mentioned though that this method still requires a match at the end, so it's only beneficial to avoid repetitions. So postfix macros are generally better.
