@tabatkins
Last active March 16, 2021 00:08
Comparing Python's match statement with mine

(Note that PEP 642 was rejected, but its points are intriguing; I discuss it at the end of this document.)

Basic patterns are identical to my proposal, modulo obvious differences in Python spelling vs JS spelling:

  • identical "leaf" matchers (ident-matcher, nil-matcher, literal-matchers)
  • identical array-matcher, including doing a length check when it doesn't end in a ...foo
  • identical object-matcher, modulo the fact that Python doesn't have concise object notation (no {foo}). It does have the ability to capture unmatched keys, exactly as in my current proposal.
  • identical if-guards, including same rules on belonging to the overall when, not a matcher, so it has access to all the same bindings as the RHS
  • patterns can be arbitrarily nested
  • all nestable patterns can be followed by an as clause, identical to my proposal
  • nearly identical or-pattern; they require each branch to bind the same set of names, while my proposal just unions the names, and any names not used by the successful matcher are bound to undefined.

Python has built-in class/structure matching:

match event.get():
    case Click(position=(x, y)):
        handle_click_at(x, y)
    case KeyPress(key_name="Q") | Quit():
        game.quit()
    case KeyPress(key_name="up arrow"):
        game.go_north()
    ...
    case KeyPress():
        pass # Ignore other keystrokes
    case other_event:
        raise ValueError(f"Unrecognized event: {other_event}")

JS doesn't in my current proposal, but it's trivial to install one on your class, and then it's just as easy to use as Python:

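// Assumes the protocol invokes the matcher as a method of the class being matched, so `this` is Click, KeyPress, etc.
MyEvent[Symbol.caseMatch] = function (matchVal) { return matchVal instanceof this ? matchVal : null; };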
class Click extends MyEvent {...};
...

case(event.get()) {
    when ^Click with {position:[x,y]} -> ...;
    when ^KeyPress with {key_name:"Q"} -> ...;
    when ^KeyPress with {key_name:"up arrow"} -> ...;
    when ^KeyPress -> ...;
    when otherEvent -> throw new Error(`Unrecognized event: ${otherEvent}`);
}

Python interprets all bare idents as ident-matchers, which match all values and bind them to that name. It interprets all dotted idents as variables, evaluating the reference and treating its value as a primitive matcher.

So in Python, case foo: auto-succeeds and binds the value to the name "foo" for the RHS, while case foo.bar: ... evaluates foo.bar from the surrounding scope and matches the result against the match value, establishing no bindings.

I'm pretty sure this is a no-go in general for JS, and we need to solve the "ident matcher vs variable" problem a different way. In my current proposal, you can do variable matching with a computed-matcher, like when ^foo -> ...; which evaluates the expression after the ^ and then either does a primitive-match or invokes the matcher protocol, depending on whether the result is a primitive or an object.
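
To make that dispatch concrete, here is a loose model of it written in Python so it can actually be run. It is only a sketch: `computed_match`, the `case_match` hook name (standing in for Symbol.caseMatch), and the `(matched, result)` return shape are all assumptions of this example, not part of the proposal.

```python
# Hypothetical stand-in for Symbol.caseMatch.
CASE_MATCH = "case_match"

# Rough Python analogue of "primitive" values.
PRIMITIVES = (type(None), bool, int, float, str)


def computed_match(matcher, match_value):
    """Model of `when ^matcher`: returns a (matched, result) pair."""
    if isinstance(matcher, PRIMITIVES):
        # Primitive result: plain equality test against the match value.
        return (matcher == match_value, match_value)
    # Object result: invoke the matcher protocol; None signals failure.
    result = getattr(matcher, CASE_MATCH)(match_value)
    return (result is not None, result)


class Event:
    @classmethod
    def case_match(cls, match_value):
        # Installed once on the base class; cls is whichever class was named.
        return match_value if isinstance(match_value, cls) else None


class Click(Event):
    pass


class KeyPress(Event):
    pass
```

With this model, `computed_match(3, 3)` takes the primitive path, and `computed_match(Click, some_click)` goes through the hook inherited from `Event`.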


Summary

Python's proposal is virtually identical to mine, modulo spelling. Differences are:

  • No arbitrary matcher protocol. Instead, built-in instanceof matching when the match pattern looks like foo(), with a slightly custom syntax to run an object-matcher on the match value as well (or, since Python has sequence-y classes, an array-matcher).

    (Technically you can metaprogram your way into an arbitrary-matcher protocol here, by giving a class a metaclass with a tricksy __instancecheck__() method.)

  • Handles the "ident-matcher" vs "variable-value matcher" distinction by making all bare idents the former and all dotted idents the latter. (Mine does variable-value matching by way of computed-matchers.)

  • Their or-patterns impose slightly more restrictions than I currently do, but I'm not opposed to going their way too if necessary.

  • They don't have an and-pattern.

PEP 642

PEP 642 is based on the objection that PEP 634 (the accepted match statement specification) invites confusion between "ident matchers" (always match, introduce the named binding) and "variable-value matchers" (match conditionally against the value of the expression, introduce no binding). 634 avoids the confusion by separating them syntactically: bare idents are ident matchers unless they're one of the predefined primitive idents, and dotted idents are variable-value matchers. My proposal avoids the confusion by making idents ident-matchers unless they're predefined primitive idents; computed-value matchers handle variable values.

Both of these have the potential issue that they prevent us from adding new predefined idents to the language without potentially breaking patterns using those idents. (Normally we can introduce new ones if we want, by making them assignable, like undefined.)

642's solution is instead to make all idents variable-value matchers, and to do ident-matching purely via as ident clauses, which are allowed to be used on their own. If I adopted this into my proposal, it would simplify some aspects; the default would be computed-value matching, with primitives simply resolving to their value.

Here are some examples of applying this:

when [head, ...tail] -> ...; (tests that matchValue.length is >=1, binds [0] to "head" and rest to "tail")
~becomes~
when [as head, ...as tail] -> ...;

when {foo} -> ...; (tests that matchValue.foo exists, binds it to "foo")
~unchanged~
(I think it's fine to still accept this as working)

when {foo:bar} -> ...; (tests that matchValue.foo exists, binds it to "bar")
~becomes~
when {foo:as bar} -> ...;

when {foo:(bar)} -> ...; (tests that matchValue.foo == bar)
~becomes~
when {foo:bar} -> ...;

when or((LF) | (CR)) -> ...; (tests that matchValue equals LF or CR, binds nothing)
~becomes~
when LF | CR -> ...;
~or~
when (LF | CR) as ch -> ...; (if you want to bind)

The downside is that this mixes general expression syntax more heavily with pattern syntax. We explicitly define that array-literals, object-literals, and | and & are all patterns, not expressions, but that means we probably can't extend the syntax any further in the future. (Versus my existing proposal, which allows for extension via syntax like foo(stuff), since only naked parens allow arbitrary exprs inside; functiony-parens instead are always a pattern.)
