est31/postfix-macros.md

Postfix macros in Rust

The problem

Rust has many postfix combinators, for example the
.unwrap_or(x) and .unwrap_or_else(|| x) functions.
They are useful if you want to extract some value from
an optionally present value, or if not, provide an
alternative value. It's really nice and tidy to read:
let var = Some(0);
let var = var.unwrap_or(42);
But what if instead of providing a default value,
you want alternative control flow?
let var = if let Some(var) = var {
    var
} else {
    continue;
};
let var = var.something();
let var = if let Some(var) = var {
    var
} else {
    continue;
};
// Do foo
This is quite ugly compared to the prior code!
You can't use .unwrap_or(x) because x is always evaluated,
nor can you use .unwrap_or_else(|| x) because x is inside
a closure, and closures can't affect outside control flow through
break, continue, or return.
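To make the difference in evaluation concrete, here is a minimal sketch. The helper names `fallback` and `count_fallback_calls` are made up for this illustration. It counts how often the fallback expression runs when the value is present. Writing `var.unwrap_or_else(|| continue)` instead would be rejected by the compiler, because the `continue` sits inside the closure.

```rust
use std::cell::Cell;

// Hypothetical helper: records every evaluation of the fallback value.
fn fallback(calls: &Cell<u32>) -> i32 {
    calls.set(calls.get() + 1);
    42
}

// Returns (calls made via unwrap_or, calls made via unwrap_or_else)
// for a value that is present.
fn count_fallback_calls() -> (u32, u32) {
    let eager = Cell::new(0);
    let lazy = Cell::new(0);
    let var = Some(0);

    // The argument is computed before unwrap_or even runs.
    let _ = var.unwrap_or(fallback(&eager));
    // The closure body only runs if the value is None.
    let _ = var.unwrap_or_else(|| fallback(&lazy));

    (eager.get(), lazy.get())
}

fn main() {
    let (eager, lazy) = count_fallback_calls();
    println!("unwrap_or evaluated the fallback {} time(s)", eager);
    println!("unwrap_or_else evaluated the fallback {} time(s)", lazy);
}
```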
A slightly nicer way would involve a match:
let var = match var {
    Some(var) => var,
    None => continue,
};
let var = var.something();
let var = match var {
    Some(var) => var,
    None => continue,
};
// Do foo
It's one line shorter each time, but that's still not
as short as .unwrap_or(x) would have been.
Another alternative would involve nested if let:
if let Some(var) = var {
    let var = var.something();
    if let Some(var) = var {
        // Do foo
    }
}
But then you needlessly indent all the code that comes
after the unwrapping. The beauty of a continue/break/return
pattern is that you don't have this rightward drift.
You can limit the drift to one level by using ? inside a closure:
let var = (|| {
    var?.something()
})();
if let Some(var) = var {
    // Do foo
}
Then you only need a single if let. But you still
need one, and the code that calls something is separated from
the code that does foo. It's still not as nice as .unwrap_or
would have been.
The solution

Postfix combinators like .unwrap_or are functions. In Rust,
functions can only take values. .unwrap_or takes a value
unconditionally, for .unwrap_or_else the value is a closure,
and its body is lazily evaluated. But functions can't
take anything beyond that. Functions can't take anything
that influences control flow inside the calling function.
This doesn't compile:
fn foo() {
    continue; //~ ERROR cannot `continue` outside of a loop
}
fn bar() {
    for _ in 0..3 {
        foo();
    }
}
There is a language construct in Rust which does support
influencing the outside context: macros. This compiles:
macro_rules! foo {
    () => {
        continue
    };
}
fn bar() {
    for _ in 0..3 {
        foo!();
    }
}
They also support taking arguments that influence control flow.
Another example:
macro_rules! foo {
    ($v:expr) => {
        $v
    };
}
fn bar() {
    for _ in 0..3 {
        foo!(continue);
    }
}
Thus we can greatly improve the code from above using macros:
use unwrap_or_definition::unwrap_or;
let var = unwrap_or!(var, continue);
let var = var.something();
let var = unwrap_or!(var, continue);
// Do foo
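The snippet above imports `unwrap_or` without showing its definition. One possible definition, as a minimal sketch: the macro body and the `parse_all` helper are assumptions made for illustration, not the code of any particular crate.

```rust
// Hypothetical definition; the post only imports `unwrap_or` by name.
macro_rules! unwrap_or {
    ($opt:expr, $alt:expr) => {
        match $opt {
            Some(value) => value,
            // `$alt` may be a value or a diverging expression like `continue`.
            None => $alt,
        }
    };
}

// Parses every string that is a number and skips the rest.
fn parse_all(inputs: &[&str]) -> Vec<i32> {
    let mut out = Vec::new();
    for input in inputs {
        // `continue` is expanded into this loop body, so it is legal here.
        let n = unwrap_or!(input.parse::<i32>().ok(), continue);
        out.push(n);
    }
    out
}

fn main() {
    println!("{:?}", parse_all(&["1", "x", "3"]));
}
```

Because the macro expands in place, the `continue` ends up directly inside the `for` loop, which is exactly what a function argument can't achieve.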
But it's still not as nice as the .unwrap_or() or
.unwrap_or_else() functions. It would be nice if one
could call the unwrap_or macro in a postfix fashion:
use unwrap_or_definition::unwrap_or;
let var = var.unwrap_or!(continue)
    .something()
    .unwrap_or!(continue);
// Do foo
Look how nice it is. Beautiful!
But sadly this isn't supported by Rust :/.
If we try to compile it, we get an error,
because currently postfix macros aren't part
of the language.
There is an open RFC
to add postfix macro support to the language.
But sadly, despite being years old, the RFC has not
been merged yet.
RFC 2442

RFC 2442 gives macro authors the option to allow
their macros to be used in postfix fashion.
So one can do:
macro_rules! foo {
    ($self:self, $p:expr) => {
        // Stuff
    };
}
bar.foo!(42); // bar binds to $self, 42 binds to $p
However, the RFC has drawn criticism.
The main concerns voiced about it regard resolution.
If you call a function bar, on a type Foo, it's
resolved based on that type. You can have a totally
different implementation of bar for a type Bar.
struct Foo;
struct Bar;
impl Foo {
    fn bar(&self) {
        println!("hi");
    }
}
impl Bar {
    fn bar(&self) {
        println!("hello");
    }
}
fn main() {
    Foo.bar(); // prints hi
    Bar.bar(); // prints hello
}
The RFC proposes that postfix macros shall follow the same
resolution behaviour that all macros follow. You import
them based on their name only, and they work on all types.
The critics of the RFC say that this is a massive flaw,
and this has held back the RFC from being merged.
Back when the RFC was proposed, I was on the fence
about this question. Now I'm leaning more towards normal
macro based resolution.
The big issue with type based resolution is that it
creates a dependence on the type system. Currently,
the Rust compiler first expands all macros in a piece of
code before it runs type checking on them. With type
based resolution, a part of expansion has to run during
type checking.
One big argument the proponents of type based resolution
brought forward was consistency. Yes, type based postfix
macro resolution would be consistent with postfix function
resolution, but it would be inconsistent with basically
anything macro related. Macros currently have no features
that rely on types. They only work on the syntactic level.
So some kind of consistency has to be broken here,
and personally I think that breaking the consistency with
functions has lower cost, as macro expansion and type checking
can still be in totally different phases of compilation.
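A small sketch of what "syntactic level" means in practice. The `show!` macro and the `shown_values` helper are made up for this illustration. The macro is resolved by name alone and applied to expressions of three unrelated types, and it can even print the source text of its argument, because all it ever receives is tokens.

```rust
// Hypothetical macro: it only ever sees tokens, never types.
macro_rules! show {
    ($e:expr) => {
        // stringify! turns the tokens into text; the exact spacing of
        // that text is not guaranteed by the language.
        format!("{} = {:?}", stringify!($e), $e)
    };
}

fn shown_values() -> Vec<String> {
    vec![
        show!(1 + 2),          // an integer expression
        show!("hi".len()),     // a method call on a &str
        show!(vec![1.5, 2.5]), // a Vec<f64>
    ]
}

fn main() {
    for line in shown_values() {
        println!("{}", line);
    }
}
```

Type checking only happens after expansion, on the `format!` call that each invocation turned into.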
I think the best argument that speaks in favour of type based
resolution is expressive power: it's more powerful to have
multiple macros with the same name in scope, that each
do something different based on the type. Is this enough to
pay the cost of mixing up the compiler's phases? I'm not sure.
And there is a way you can interpret macro based resolution
from a type based point of view: postfix macros are just like
extension traits implemented on all types,
with a name linked to the macro they contain. In other
words, they are very similar to this construct from
the function world:
struct Foo;
trait foo {
    fn foo(&self) {}
}
impl<T: ?Sized> foo for T {}
mod baz {
    use super::{foo, Foo};
    fn bar() {
        Foo.foo();
    }
}
The only difference in resolution behaviour is
that renaming the "trait" also renames
the macro.
There was one very interesting comment
on the RFC, which suggested just adopting
universal function call syntax,
or in other words, allowing people to just
postfix call any macro, without requiring the macro
to opt in. Personally I think this is a great idea,
as it allows all bang macros to be called in a postfix
manner, and no existing macro has to be changed to
support it.
To summarize, there are still many open questions
about the design of postfix macros, and it's likely
that the RFC won't get merged any time soon, even
though postfix macros would be a beautiful addition
to the language.
Postfix macros on stable Rust

Thankfully, not every feature has to reside in the compiler
for users to enjoy it, thanks to proc macros. A good example
of this is the fehler
crate, which allows crates to use exception syntax known
from languages like Java or C++ in Rust by lowering it
to a syntax the compiler can understand. Regardless of
your opinion on this syntax, it's a good example of how
proc macros can be used to experiment with language sugars.
This brings us to the big announcement of this post:
Over the past few weeks, I've worked on a proc macro
that takes in Rust code with postfix macro syntax,
and then lowers it to traditional prefix macros.
It's called postfix-macros
and is published on crates.io under the same name.
It also provides a bunch of postfix macro analogs
to well known Rust constructs.
Example usage:
use postfix_macros::{postfix_macros, unwrap_or};

fn main() {
    postfix_macros! {
        let urls = ["https://rust-lang.org", "http://github.com"];
        for url in urls.iter() {
            let mut url_splitter = url.splitn(2, ':');
            let scheme = url_splitter.next().unwrap();
            let _remainder = url_splitter.next().unwrap_or! {
                println!("Ignoring URL: No scheme found");
                continue;
            };
            println!("scheme is {}", scheme);
        }
    }
}
Sadly it's currently not possible to provide an
attribute macro that takes the same syntax,
as currently the syntax is rejected during parsing
and anything passed to an attr macro also needs
to parse. I've made a PR
to solve this for the future, so hopefully a subsequent
version of the crate can offer an attr macro.
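For comparison, this is roughly what the example looks like when lowered by hand to prefix macros. It is a sketch under assumptions: the `unwrap_or!` defined here is a hypothetical stand-in, not the crate's own macro, and the `schemes` helper exists only to make the result testable.

```rust
// Hypothetical stand-in for the crate's unwrap_or macro.
macro_rules! unwrap_or {
    ($opt:expr, $($alt:tt)*) => {
        match $opt {
            Some(value) => value,
            // The alternative tokens are pasted into a block as statements.
            None => { $($alt)* }
        }
    };
}

// Returns the scheme of every URL that has one.
fn schemes<'a>(urls: &[&'a str]) -> Vec<&'a str> {
    let mut found = Vec::new();
    for &url in urls.iter() {
        let mut url_splitter = url.splitn(2, ':');
        let scheme = url_splitter.next().unwrap();
        // Hand-lowered form of `url_splitter.next().unwrap_or! { ... }`.
        let _remainder = unwrap_or!(url_splitter.next(),
            println!("Ignoring URL: No scheme found");
            continue;
        );
        found.push(scheme);
    }
    found
}

fn main() {
    let urls = ["https://rust-lang.org", "localhost", "http://github.com"];
    println!("{:?}", schemes(&urls));
}
```

The receiver has moved from the front of the call into the first macro argument, which is what makes the prefix form harder to read once calls are chained.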
Parsing Rust in inverse order

I'll now give a summary on the design choices that went
into the postfix-macros crate.
First was the choice of parsing library. Back when the
first proc macro support launched in Rust 1.15.0,
when one could only define custom derives, one could
only access the code passed to the macro through a
string based API. Conversions to and from strings
were literally the only two operations available
on the TokenStream type.
Building on this API, parsing libraries like syn
were made, which relieved proc macro authors from the
task of parsing the code themselves. However,
there is one big issue with syn: its massive compile
time. Often compilation of the syn crate is one of
the longest tasks in a compilation graph.
Thankfully nowadays the proc_macro crate's APIs
are way richer:
they allow access to the actual tokens.
This hasn't removed the entire task of parsing Rust,
but it made the experience way easier.
In postfix-macros, I've tried the experiment of not
using syn, in fact not using any dependency at all,
and relying only on the proc_macro crate. So far it has turned
out well. Yes, there are some bugs that wouldn't exist
had syn been used, but compile time is wonderfully short.
For me personally, this is more important than being
fully bug-free. But the final word hasn't been spoken yet:
maybe in the future I'll add an optional syn cargo
feature, to allow people to use it instead.
Thanks to rolling my own parser, I was able to cut some
corners. The macro doesn't touch most of the content passed to it.
It only turns on if a specific pattern is recognized,
.<ident>!(<something>) (also allowing [] and {}).
Once it finds that pattern, it searches backwards
for the start of the expression, as the macro may be chained.
Last, it removes those items and puts them into the macro's
arguments, lowering <expr>.<ident>!(<something>) to
something like <ident>!(<expr>, <something>).
There is special logic to recognize if something is empty
or if the expression is more complicated and needs to be
put inside {}.
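The lowering steps described above can be sketched on a toy token model. Everything here is an assumption made for illustration: the `Tok` enum, the helper names, and the rule that an expression starts after `=` or `;`. The real crate works on proc_macro token trees and has a more involved backwards search.

```rust
// Toy token model; groups are always rendered with parentheses.
#[derive(Clone, Debug)]
enum Tok {
    Ident(String),
    Punct(char),
    Group(Vec<Tok>),
}

fn id(s: &str) -> Tok {
    Tok::Ident(s.to_string())
}

// Is there a `.<ident>!(<something>)` pattern starting at index `i`?
fn is_postfix_macro(toks: &[Tok], i: usize) -> bool {
    matches!(
        toks.get(i..i + 4),
        Some([Tok::Punct('.'), Tok::Ident(_), Tok::Punct('!'), Tok::Group(_)])
    )
}

// Lowers top-level postfix macro calls, left to right, so that chains nest.
fn lower(mut toks: Vec<Tok>) -> Vec<Tok> {
    let mut i = 0;
    while i < toks.len() {
        if !is_postfix_macro(&toks, i) {
            i += 1;
            continue;
        }
        // Walk backwards to the start of the receiver expression.
        let mut start = i;
        while start > 0 && !matches!(toks[start - 1], Tok::Punct('=') | Tok::Punct(';')) {
            start -= 1;
        }
        let name = toks[i + 1].clone();
        let args = match &toks[i + 3] {
            Tok::Group(inner) => inner.clone(),
            _ => unreachable!("checked by is_postfix_macro"),
        };
        // Build `<ident>!(<expr>, <something>)`; no comma if <something> is empty.
        let mut inner = toks[start..i].to_vec();
        if !args.is_empty() {
            inner.push(Tok::Punct(','));
            inner.extend(args);
        }
        let lowered = vec![name, Tok::Punct('!'), Tok::Group(inner)];
        let _removed: Vec<Tok> = toks.splice(start..i + 4, lowered).collect();
        // Resume right after the lowered call; it may be a later receiver.
        i = start + 3;
    }
    toks
}

fn render(toks: &[Tok]) -> String {
    let mut out = String::new();
    for tok in toks {
        match tok {
            Tok::Ident(name) => out.push_str(name),
            Tok::Punct(',') => out.push_str(", "),
            Tok::Punct(c) => out.push(*c),
            Tok::Group(inner) => {
                out.push('(');
                out.push_str(&render(inner));
                out.push(')');
            }
        }
    }
    out
}

// Lowers `var.unwrap_or!(continue).something().unwrap_or!(continue)`.
fn example() -> String {
    use Tok::{Group, Punct};
    let toks = vec![
        id("var"), Punct('.'), id("unwrap_or"), Punct('!'), Group(vec![id("continue")]),
        Punct('.'), id("something"), Group(vec![]),
        Punct('.'), id("unwrap_or"), Punct('!'), Group(vec![id("continue")]),
    ];
    render(&lower(toks))
}

fn main() {
    println!("{}", example());
}
```

This toy version does not descend into groups and knows nothing about precedence, which is the part the following paragraphs are about.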
The hardest part is the expression parsing code, as it
walks backwards and has to take precedence into account.
For example &().dbg!() should pass &() to the dbg
macro instead of just (), but 0&0.dbg!() should
pass just 0. For this, it originally performed a speculative
prefix operator search.¹ A speculative search is
performed to recognize match constructs and
if/else if chains.
There are some known bugs and probably some unknown
ones as well, but I've gotten confident that the crate
can be useful to people. Generally, Rust's type system
should catch any issue arising from a precedence bug
in my parsing code.
¹: Update 2020-11-13:
It has been pointed out
to me that prefix operators like & or ! have weak
precedence. E.g. &"hello".to_string() is parsed as
&("hello".to_string()). I adjusted the code to
mirror that behaviour for postfix macros as well, and
removed the prefix operator search.
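The precedence rule from the update can be checked with ordinary method calls. A minimal sketch, with function names chosen for this illustration only:

```rust
// `&` applies to the result of the method call, so this borrows a String.
fn borrowed_len() -> usize {
    let s: &String = &"hello".to_string();
    s.len()
}

// Unary minus behaves the same way: this is -(1.abs()), not (-1).abs().
fn negated_abs() -> i32 {
    -1i32.abs()
}

fn main() {
    println!("{}", borrowed_len());
    println!("{}", negated_abs());
}
```

The type annotation on `s` only compiles because the method call binds tighter than the `&`.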
TLDR

I built a proc-macro library that brings postfix macros to
stable Rust. Check it out on GitHub.