masak/three-types-of-macros.md Secret

## three-types-of-macros.md

      
Routine macros

The normal type. They look like subs. Just as with subs, some special names
declare ops, in which case you can invoke them by using them as ops.
However, two things distinguish these a lot from subs. First, they accept and
return Qtrees. Second, they are called at parse time; it is the parser that
(by lookup) realizes that what it just parsed was a macro, passes it the
arguments as Qtrees, and then expects back a Qtree that it can replace the
call with.

sub compute() { say "COMPUTED!"; return 42 }

sub ex1($value) {
    say "before";
    say $value;
    say "after";
}

ex1 compute();  # COMPUTED!\nbefore\n42\nafter\n
                # compute() runs first; its Int result is passed to ex1()

macro ex2($qtree) {
    return quasi {
        say "before";
        say {{{ $qtree }}};     # X
        say "after";
    };
}

ex2 compute();  # before\nCOMPUTED!\n42\nafter\n
                # (arg is code, passed as Qtree)
                # compute() isn't actually run until line X runs
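
For readers who want to poke at this difference outside of Perl 6, here is a rough analogy using Python's standard `ast` module. It is only a sketch: `expand_ex2` and the `HOLE` placeholder are made-up names standing in for the macro and for `{{{ $qtree }}}`, and a Python AST is of course not a Qtree.

```python
import ast
import contextlib
import io

def compute():
    print("COMPUTED!")
    return 42

def ex1(value):
    # Ordinary sub: the argument was already evaluated by the caller.
    print("before")
    print(value)
    print("after")

def expand_ex2(arg_source):
    # Macro stand-in: takes the argument as code and returns a spliced AST.
    arg = ast.parse(arg_source, mode="eval").body   # parsed, not yet run
    template = ast.parse('print("before")\nprint(HOLE)\nprint("after")')

    class Splice(ast.NodeTransformer):
        def visit_Name(self, node):
            return arg if node.id == "HOLE" else node

    return ast.fix_missing_locations(Splice().visit(template))

def captured(thunk):
    # Run thunk and return what it printed, one entry per line.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        thunk()
    return buffer.getvalue().splitlines()

env = {"compute": compute}
sub_output = captured(lambda: ex1(compute()))
expansion = compile(expand_ex2("compute()"), "<macro>", "exec")
macro_output = captured(lambda: exec(expansion, env))
```

In the spliced version `before` comes first, because the argument arrives as unevaluated code and only runs at the spot where it was spliced in.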

Besides this "pure" role of transforming subcall Qtrees (or their
operator brethren) into Qtrees, macros can also carry out side effects
(prompt/say etc, or changing globals), or carry state from one macro
call to another. In general, a macro has two regions:


Directly inside the macro. Runs at macro invocation time, which is
a part of parsing the program. From here, we can directly influence
the compilation process itself.


Inside one or more quasi blocks in the macro. Runs as part of the
mainline code, since the quasi block gets physically spliced into the
normal code.
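
The two regions can be mimicked in Python as well. In this sketch (again an analogy, with `traced`, `state` and `expansion_log` as hypothetical names), everything in the function body runs at expansion time, while the compiled expression it returns plays the part of the quasi block and runs later.

```python
import ast

state = {"calls": 0}     # survives from one macro call to the next
expansion_log = []       # side effects performed while "parsing"

def traced(expr_source):
    # Region 1: runs at macro invocation time.
    state["calls"] += 1
    label = f"call #{state['calls']}"
    expansion_log.append(f"expanding {label}")

    # Region 2: the "quasi" part. Nothing in it runs until it is evaluated.
    expr = ast.parse(expr_source, mode="eval").body
    quasi = ast.Expression(
        body=ast.Tuple(elts=[ast.Constant(value=label), expr], ctx=ast.Load())
    )
    return compile(ast.fix_missing_locations(quasi), "<quasi>", "eval")

first = traced("1 + 1")
second = traced("6 * 7")
# Both expansions are done; the generated code has not run yet.
results = [eval(second), eval(first)]
```

The labels were fixed in expansion order, even though the generated code is evaluated in the opposite order.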


Activation

When is a macro called? A bit like `BEGIN` time, it is called as soon as
everything is in place. The macro symbol itself needs to be parsed, and also
all its sub-Qtrees that act as arguments or operands. This means different
things for listops, infixes, prefixes, and postfixes.

mac $x, $y, $z ;
              ^--- here is where we can call the macro

$l ¤ $r ;
       ^

¤$x ;
   ^

$x¤ ;
   ^

my-if $COND, { ... } ;
                    ^

is parsed

But a macro may also declare itself with the `is parsed` trait, which effectively
gives it control of the parse process earlier than it otherwise would have had. Instead of
"as soon as all the sub-Qtrees are in place", the macro now gets called as soon as
the macro symbol has been parsed.

mac $x, $y, $z ;
   ^--- with `is parsed`, this is where the macro gets invoked

$l ¤ $r ;
    ^

¤ $x;
 ^

$x¤ ;
   ^ 

my-if $COND, { ... } ;
     ^

In the case of `is parsed`, the macro assumes the responsibility of parsing
the rest of its arguments/operands, and of handing control back to the Perl 6 parser
in a way that's consistent with the macro's grammatical category. It also gets
the responsibility to generate Qtrees for things that would otherwise have come
in as macro parameters.

The `is parsed` trait expects a regex supplying a parsing strategy. Inside of
the regex, we have full access to things defined in `Perl6::Grammar`. As an
example, the `my-if` above almost looks like a normal built-in control
structure, save for that annoying comma. With `is parsed` we can tell the
parser that we don't care for the comma.

class Q::MyIf {
    has Q::Expr $.expr;
    has Q::Block $.block;
}

macro my-if(Q::Expr $expr, Q::Block $block)
        is parsed(/<EXPR> <.ws> <pblock>/) {

    return Q::MyIf.new(:$expr, :$block);
}

It's not clear that we can make the above work, or whether the result would be
robust and usable. But it seems to be a possibility. So let's hereby bring `is parsed`
back from the land of the deprecated into the shadow realm of the conjectural.

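
To make the shift in responsibility a bit more tangible, here is a toy model in Python. Nothing in it is Perl 6 or `Perl6::Grammar`: the tokenizer, the `Parser` class and `expr_then_block` are hypothetical stand-ins, and the "regex" is reduced to "take one term, then one block". It only shows who collects the operands and at which point the macro gets involved.

```python
import re

TOKEN = re.compile(r"\{[^{}]*\}|\$?[A-Za-z_][\w-]*|\S")

class Parser:
    def __init__(self, source, macros):
        self.tokens = TOKEN.findall(source)
        self.pos = 0
        self.macros = macros       # name -> (expand, custom_parse or None)
        self.invoked_at = None     # token index at which the macro took over

    def take(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def default_args(self):
        # Ordinary listop parsing: terms separated by commas.
        args = [self.take()]
        while self.pos < len(self.tokens) and self.tokens[self.pos] == ",":
            self.pos += 1
            args.append(self.take())
        return args

    def statement(self):
        expand, custom_parse = self.macros[self.take()]
        if custom_parse is None:
            args = self.default_args()   # the parser collects the arguments
            self.invoked_at = self.pos   # and only then calls the macro
        else:
            self.invoked_at = self.pos   # right after the macro symbol
            args = custom_parse(self)    # the macro parses its own operands
        return expand(*args)

def my_if(cond, block):
    return ("MyIf", cond, block)

def expr_then_block(parser):
    # Stand-in for /<EXPR> <.ws> <pblock>/: a term, then a block, no comma.
    return [parser.take(), parser.take()]

plain = Parser("my-if $COND, { ... } ;", {"my-if": (my_if, None)})
custom = Parser("my-if $COND { ... } ;", {"my-if": (my_if, expr_then_block)})
plain_node = plain.statement()
custom_node = custom.statement()
```

Both forms produce the same node, but the custom-parsed variant takes over at token 1 (right after `my-if`) instead of token 4, and it no longer needs the comma.
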
is reparsed

We're not bringing back `is reparsed`, though. Besides breaking the rule of
one-pass parsing, it's not clear that this form has a good use case driving it.

Unhygiene

These macros can exhibit unhygiene (the absence of hygiene) and
declare or modify symbols in the mainline program. Such a macro will screw up
simplistic tools such as syntax checkers, highlighters, linters, refactoring
tools, IDEs, and Java programmers. Take that, simplistic tools!

Unhygiene may also void the warranty of your computer, your pets, your family,
your friends, and nearby celestial bodies. Nevertheless, it can be very helpful
sometimes, and I expect people (in proud Perl tradition) to manage to use the
extra rope-lengths for shooting themselves in the appropriate amount of feet.

(Friendly reminder: the term "unhygienic macros" still refers to AST-based
macros and does not mean "textual macros". As an analogy, whereas unhygiene
is like a very lethal, armed nuclear warhead that probably oughtn't be in your
possession in the first place, textual macros are like two subcritical piles of
hot plutonium that when brought together will definitely go boom, and why are
you standing so close to the hot piles anyway and sorry, you're dead now.)

Multi macros

It's possible to define multi macros. The biggest difference from multi subs is
that the compiler will always pass Qtree arguments to the macro. Binding
happens on compile-time Qtrees, not on runtime values.

Absent an `is parsed`, a multi macro will parse according to the normal Perl 6
grammar, and then match the resulting Qtrees against the macro signature. If an
`is parsed` is present, a failed parse is tantamount to backtracking out of an
alternation, and a successful parse means that the multi macro is considered a
candidate. If all multi candidates fail then that's a dispatch error as usual.
If two or more tie, then normal tiebreaking rules apply. If at all possible, we
should let multi `is parsed` macros participate in LTM.

Conceptually, an `is parsed` on the multi macro counts as having an additional
constraint placed on the signature. If two multis have the same signature but
one of them has an `is parsed` trait, then the trait-adorned one will count as
narrower.

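
A heavily simplified Python model of that dispatch might look as follows. The `MultiMacro` class is hypothetical, candidates take a single argument, the `parsed` regex is matched against the raw source as a crude stand-in for `is parsed`, and "narrower" is approximated by class depth. Real Perl 6 tiebreaking and LTM are not modelled.

```python
import ast
import re

class MultiMacro:
    """Toy multi macro: one-argument candidates, chosen by AST node type."""

    def __init__(self):
        self.candidates = []   # (node_type, parsed regex or None, handler)

    def candidate(self, node_type, parsed=None):
        def register(handler):
            self.candidates.append((node_type, parsed, handler))
            return handler
        return register

    def dispatch(self, source):
        node = ast.parse(source, mode="eval").body   # bind on the AST, not a value
        matching = [
            (node_type, parsed, handler)
            for node_type, parsed, handler in self.candidates
            if isinstance(node, node_type)
            and (parsed is None or re.fullmatch(parsed, source))
        ]
        if not matching:
            raise TypeError(f"no candidate for {source!r}")
        # Narrower type wins; a custom parse counts as one more constraint.
        best = max(matching, key=lambda c: (len(c[0].__mro__), c[1] is not None))
        return best[2](node)

describe = MultiMacro()

@describe.candidate(ast.expr)
def any_expression(node):
    return "expression"

@describe.candidate(ast.Constant)
def any_literal(node):
    return f"literal {node.value!r}"

@describe.candidate(ast.Constant, parsed=r"0x[0-9a-f]+")
def hex_literal(node):
    return f"hex literal {node.value!r}"
```

Here `describe.dispatch("0x2a")` picks `hex_literal` over `any_literal`: both accept a constant, but the one carrying a parse constraint counts as narrower.
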
We may end up allowing `is parsed` on ordinary subroutines, too. The
`is parsed` axis is orthogonal to the sub/macro axis, so it's a possibility. But
maybe it's less confusing not to allow that.

Syntax macros

Syntax macros are defined at statement level and essentially introduce a new
type of statement. Let's say we wanted a new `pretending` keyword, a block form
of `temp`:

class Q::Statement::Pretending is Q {
    has Q::Expr $.expr;
    has Q::Block $.block;
}

macro statement_control:<pretending>(Q::Expr $expr, Q::Block $block)
        is parsed(rule { <sym> <EXPR> <pblock> }) {
    # ...code to check that $expr is of the form `{{{$var}}} = {{{$value}}}` elided...
    return quasi {
        temp {{{$var}}} = {{{$value}}};
        {{{$block}}};   # handling of >0 params elided
    }
}

Notice how we're using the same mechanism as with op macros, only within the
`statement_control` grammatical category. The desire for syntax macros comes from
Scheme and Racket's `define-syntax` and `syntax-rules` facilities. I set out to
add support for those things in our macro system, and
happily found that something like the above seems to be enough. We need more
concrete examples to verify this, though.

Especially considering the discussion above with multi macros, which would
allow us to dispatch the same keyword to various different syntax variants,
and would also nicely support third-party syntax extension in the same
way ordinary multis do, this seems to be a winner. We may still market
them as "syntax macros" if we want people to pay special attention to them.

It is an open question exactly which grammatical categories we will be able to
hook into like this. But `statement_control` seems like a straightforward one.

Analysis and traversal

The elided parts of the syntax macro above interact in various ways with the
incoming `$expr` and `$block` Qtrees. We should anticipate the needs of macro
writers, and provide an API that makes simple things easy and mind-bendingly
weird things achievable.

IntelliJ IDEA has various visitors to do this. They hook behaviors onto PSI
node type matching, enabling you to say things like "do this for all method
calls" or "do this for all field declarations". The idea is sound, even though
it'll likely come out looking a bit different in Perl 6 and with Qtrees. But at
the very least, we should have various default traversals that cover many
common use cases. This is something we've yet to investigate.
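
For comparison, this is roughly what "do this for all method calls" looks like with the visitor that ships in Python's `ast` module. It says nothing about how a Qtree API should look; `MethodCallCollector` is just an illustrative name.

```python
import ast

class MethodCallCollector(ast.NodeVisitor):
    """Records the name of every method call (obj.name(...)) it visits."""

    def __init__(self):
        self.calls = []

    def visit_Call(self, node):
        if isinstance(node.func, ast.Attribute):
            self.calls.append(node.func.attr)
        self.generic_visit(node)   # keep descending into nested calls

source = "log.info(fmt.format(name.upper()))\nprint(len(name))"
collector = MethodCallCollector()
collector.visit(ast.parse(source))
```

The hook is keyed on the node type (`visit_Call`), and the default traversal takes care of reaching nested calls; plain function calls such as `print(...)` are visited but not recorded.
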

One thing which makes this less straightforward in Perl 6 is that the language
is more "freely nesting" than Java.

say "{ .foo given class :: { method foo { "OH" } } } $(constant $ = "HAI")"

class C { say "OH { method hai { say "o.O" }.name.uc }" }; C.new.hai

(Java doesn't allow class or constant declarations inside of string literals.)

Again, this means that we need to be more guided by use cases. Which methods do
we expect to find when we traverse a class for methods? Probably only the ones
registered on the class itself. That excludes methods in nested classes but
includes methods nested in methods, inside string literals, or inside other
expressions. We need good defaults here, informed by actual use and
expectations.
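
As a sketch of such a default, here is the "stop at nested classes, but look everywhere else" policy written against Python's `ast`. The helper `methods_of` is hypothetical, and the analogy is only about the traversal policy: unlike a Perl 6 `method`, a Python `def` nested inside a method is not actually registered on the class.

```python
import ast

def methods_of(class_node):
    """Names of function definitions under class_node, skipping nested classes."""
    found = []

    def walk(node):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                continue               # a nested class owns its own methods
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                found.append(child.name)
            walk(child)                # still look inside methods and expressions

    walk(class_node)
    return found

source = '''
class Outer:
    def a(self):
        def helper():
            pass
    class Inner:
        def b(self):
            pass
    def c(self):
        pass
'''
outer = ast.parse(source).body[0]
outer_methods = methods_of(outer)
```
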

Also, the Qtrees themselves are supposed to be helpful in much the same way.
Traversal aside, often when you're sitting there with a reference to a variable
or a class, you want to ask "where is the declaration for this?". Such
questions are likely to form the basis of interesting Qtree analysis and
transformations.

Another thing that we will often want to do is evaluate a Qtree representing an
expression. The process is similar to `EVAL`, but starting from an already
parsed/contexted Qtree instead of a program string.
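
The Python analogue is short: wrap an expression node that has already been parsed and compile it directly, without ever going back to source text. `eval_ast` is a made-up helper name, and the environment dict stands in for whatever lexical context a Qtree would carry.

```python
import ast

def eval_ast(expr_node, env=None):
    """Evaluate an already-parsed expression node in the given environment."""
    code = compile(ast.Expression(body=expr_node), "<qtree>", "eval")
    return eval(code, dict(env or {}))

# Pick the right-hand side out of a parsed statement and evaluate just that.
assignment = ast.parse("total = price * quantity + 1").body[0]
value = eval_ast(assignment.value, {"price": 4, "quantity": 10})
```
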

Going in the opposite direction, we sometimes want to construct Qtree literals
or identifiers from various run-phase inputs. Sometimes we're less interested
in the name we give an identifier, and more interested that it doesn't clash
with anything else in the lexical environment (à la gensyms).

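
Both directions can be sketched in Python. The helpers `literal` and `gensym` are hypothetical, and the clash check here is naive: it only looks at names occurring in one tree, where a real implementation would consult the lexical environment.

```python
import ast
import itertools

_counter = itertools.count(1)

def literal(value):
    """Build a literal node from a run-time value."""
    return ast.Constant(value=value)

def gensym(tree, prefix="tmp"):
    """Return an identifier that does not occur anywhere in tree."""
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    used |= {n.arg for n in ast.walk(tree) if isinstance(n, ast.arg)}
    while True:
        candidate = f"_{prefix}_{next(_counter)}"
        if candidate not in used:
            return candidate

user_code = ast.parse("_tmp_1 = 10\nresult = _tmp_1 * 2")
fresh = gensym(user_code)      # must not collide with the user's _tmp_1

# Generated code: bind the fresh name to a literal, then add it to result.
user_code.body.insert(0, ast.Assign(
    targets=[ast.Name(id=fresh, ctx=ast.Store())],
    value=literal(32),
))
user_code.body.append(ast.AugAssign(
    target=ast.Name(id="result", ctx=ast.Store()),
    op=ast.Add(),
    value=ast.Name(id=fresh, ctx=ast.Load()),
))

env = {}
exec(compile(ast.fix_missing_locations(user_code), "<generated>", "exec"), env)
```

The user's own `_tmp_1` keeps its value, because the generated code picked a name that was not already in use.
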
Visitor macros

It's possible we shouldn't call these "macros" at all. But I don't have a
better name for them yet, so "visitor macros" it is, for now.

By way of example, let's say you want to write a macro that makes code such as
the following illegal:

if $some-expr == True {  # macro stops with "useless use of `== True`"
    ...
}

In this case, our macro is not so much a particular sub, op, or keyword, but a
constellation of Qtree nodes.

The visitor macro might look something like this:

MATCH (Q::If (
           Q::Infix::NumEq :$expr (
               Q::Enum :$rhs where *.value eq "Bool::True"))) {
    die "useless use of `== True`";
}

Every single bit of the above is conjectural syntax.

Conceptually, each visitor macro would traverse the Qtree nodes as they are
emitted by the parser. A visitor macro has a matcher part, basically a
signature, and a callback part, basically a routine body.

Unlike a routine macro, a visitor macro is not expected to return a Qtree object
in the end. (And if it does, then it will be discarded.) Instead, it is
expected to further typecheck or analyze the matched Qtree nodes, and then
maybe take some action. The action may be to modify the Qtree nodes somehow, or
to parsefail, or to update some global state.

Qtree structures can be matched with subsignatures digging into the child
nodes of the Qtree root. In the cases where this is too constraining, `where`
blocks can be put to use: in a `where` block we can match against
non-descendant nodes in the program.

One big difference from ordinary routines is that there are no dispatch failures.
Normally a visitor simply doesn't trigger, and that's that.
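
The matcher/callback split can be imitated with Python's `ast.NodeVisitor`. This sketch is an analogy only: it checks Python `if` statements rather than Perl 6 ones, and it collects messages instead of dying, which is a deliberate deviation so that several hits can be shown.

```python
import ast

class UselessTrueComparison(ast.NodeVisitor):
    """Matcher plus callback for `if <expr> == True:`."""

    def __init__(self):
        self.problems = []

    def visit_If(self, node):
        test = node.test
        # Matcher part: an If whose test is a single `== True` comparison.
        if (isinstance(test, ast.Compare)
                and len(test.ops) == 1
                and isinstance(test.ops[0], ast.Eq)
                and isinstance(test.comparators[0], ast.Constant)
                and test.comparators[0].value is True):
            # Callback part: report instead of returning a tree.
            self.problems.append(f"line {node.lineno}: useless use of `== True`")
        self.generic_visit(node)   # nested ifs are checked too

source = (
    "if ready == True:\n"
    "    pass\n"
    "if ready:\n"
    "    pass\n"
    "if count == 1:\n"
    "    pass\n"
)
checker = UselessTrueComparison()
checker.visit(ast.parse(source))
```

The second and third `if` simply don't match, so nothing happens for them; there is no notion of a failed dispatch.
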
Another difference is that since visitor macros are not called, care has to
be taken to not trigger them prematurely:


Firstly, it is probably prudent not to have visitor macros trigger inside of
their own macro body. We can have them register on the final `}`, which shouldn't
be an issue.

Secondly, let's say someone defines a visitor macro like the example above, and
wants to export it. Fine, we put an `is export` on the macro, and we might even
go with giving visitor macros an identifier so that people can have a say in whether
to import them. (And generally a way to refer to them.) In this case, it's easy to
avoid putting `== True` in `if` statements, but in more delicate cases it might not
be so easy to avoid triggering the visitor macro in the exporting module. Maybe an
`is export-only` trait might be useful here?


Alternatively, maybe we should think about separating the declaration and the
activation of a visitor macro? The above form which does both might still be a
convenient default, but thinking of `export-only` makes it seem like we sometimes
want control over the exact parser/compunit we activate the visitor macro in.

Also, it might sometimes be useful to be able to programmatically
de-activate a visitor macro. The mechanism could be similar to wrap handles à la
S06, which all have a `.restore` method. Maybe visitor activations similarly return
a handle with a `.disable` method.
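
A minimal sketch of that handle idea, in Python and with entirely hypothetical names (`VisitorRegistry`, `Activation`, `activate`, `run`): activating a visitor returns an object whose `disable` method removes exactly that activation.

```python
import ast

class Activation:
    """Handle returned by activate(); disable() undoes that one activation."""

    def __init__(self, registry, key):
        self._registry = registry
        self._key = key

    def disable(self):
        self._registry._active.pop(self._key, None)

class VisitorRegistry:
    def __init__(self):
        self._active = {}      # key -> (node_type, callback)
        self._next_key = 0

    def activate(self, node_type, callback):
        key = self._next_key
        self._next_key += 1
        self._active[key] = (node_type, callback)
        return Activation(self, key)

    def run(self, tree):
        # Offer every node to every visitor that is still active.
        for node in ast.walk(tree):
            for node_type, callback in list(self._active.values()):
                if isinstance(node, node_type):
                    callback(node)

seen = []
registry = VisitorRegistry()
handle = registry.activate(ast.Name, lambda node: seen.append(node.id))

registry.run(ast.parse("a + b"))   # visitor is active
handle.disable()
registry.run(ast.parse("c + d"))   # visitor no longer triggers
```

Keeping declaration (the callback) apart from activation (the `activate` call) is what makes the handle possible in the first place.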