@tabatkins
Last active August 22, 2023 13:31
Pattern-matching across constructs

"Matchers Everywhere" Proposal

tldr:

  • Matchers are mostly identical to what's in the repo today, except:

    • I added "predicate matchers" (if(<boolean-expr>)) and removed the if(...) part from the match() syntax.
    • Added ident when <matcher> binding-matcher form, to match its usage in destructuring.
    • Made regex matchers use /foo/ when <matcher> for consistency (and removed the named bindings for simplicity).
    • Added a simpler custom-matcher syntax (foo(<matchers>))
    • Divided simple "compare the value" and "invoke custom matcher" uses of ${} syntactically - ${...} just compares the value, while ${...}() invokes custom matcher with same meaning as the simpler syntax.
  • Slightly simplified match() - it still uses a when prefix on each arm, but it's just a keyword rather than a wrapper, to be consistent with other places matchers can be used.

     let x = match(val) {
     	when <matcher>: <return-val>;
     	when <matcher>: <return-val>;
     	default: <return-val>;
     };
  • Added <val> is <matcher> binary operator. Evaluates to true/false based on value matching the matcher.

  • Extended destructuring syntax to allow matchers (which gives us matchers in let/var/const statements, function args, and for loops)

  • Defined a special syntax for matchers in if() and while() heads.

Intro

This proposal has several parts:

  1. A new syntax construct, the "matcher pattern", which is an elaboration on (similar to but distinct from) destructuring patterns.

    Matcher patterns allow testing the structure of an object in various ways, and recursing those tests into parts of the structure to an unlimited depth (similar to destructuring).

    Matcher syntax intentionally resembles destructuring syntax but goes well beyond the abilities and intention of simple destructuring.

  2. A new binary boolean operator, is, which lets you test values against matchers.

  3. A new syntax construct, the match() expression, which lets you test a value against multiple patterns and resolve to a value based on which one passed.

  4. Extensions to several existing syntax constructs that let you test and/or create bindings (var/let/const, if(), while(), for(), function args) to accept matchers as well.

Matcher Patterns

Destructuring matchers:

  • array matchers:

    • [<matcher>, <matcher>] exactly two items, matching the patterns
    • [<matcher>, <matcher>, ...] two items matching the patterns, more allowed
    • [<matcher>, <matcher>, ...<ident>] two items matching the patterns, with remainder collected into a list bound to <ident>
  • object matchers:

    • {<ident>, <ident>} has the ident keys (anywhere in its proto chain, not just own keys), and binds each key's value to the ident of the same name. Can have other keys. (That is, {a} is identical to {a: a}.)
    • {<ident>: <matcher>, <ident>: <matcher>} has the ident keys, with values matching the patterns. Can have other keys.
    • {<ident>: <matcher>, ...<ident2>} has the ident key, with its value matching the pattern. Remaining own keys are collected into an object bound to <ident2>.
  • binding matchers:

    • <ident> Binds the matchable to the ident. (That is, [a, b] doesn't test the items in the array, just exposes them as a and b bindings.)
    • <ident> when <matcher>: Binds the matchable to the ident, and tests it against the matcher.
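
As a rough illustration, the destructuring matchers above can be modeled in today's JavaScript. This is only a sketch under stated assumptions, not the proposal's actual semantics: it treats a matcher as a function that returns a bindings object on success or null on failure, it accepts only real arrays for array matchers, and it never lets a primitive match an object matcher. The helper names bind, arrayOf, and objectOf are hypothetical.

```javascript
// Hypothetical model (not part of the proposal): a matcher is a function
// (value) => bindings object on success, or null on failure.

// <ident>: binds the matchable without testing it.
const bind = (name) => (value) => ({ [name]: value });

// Array matcher. `open: true` models a trailing `...`;
// `rest: "name"` models a trailing `...<ident>`.
function arrayOf(itemMatchers, { open = false, rest = null } = {}) {
  return (value) => {
    if (!Array.isArray(value)) return null; // simplification: arrays only
    const exact = !open && rest === null;
    if (exact && value.length !== itemMatchers.length) return null;
    if (value.length < itemMatchers.length) return null;
    const bindings = {};
    for (let i = 0; i < itemMatchers.length; i++) {
      const result = itemMatchers[i](value[i]);
      if (result === null) return null;
      Object.assign(bindings, result);
    }
    if (rest !== null) bindings[rest] = value.slice(itemMatchers.length);
    return bindings;
  };
}

// Object matcher. `shape` maps keys to matchers; `rest: "name"` models `...<ident2>`.
function objectOf(shape, { rest = null } = {}) {
  const named = new Set(Object.keys(shape));
  return (value) => {
    if (value === null || (typeof value !== "object" && typeof value !== "function")) {
      return null; // assumption: primitives never match an object matcher
    }
    const bindings = {};
    for (const key of named) {
      if (!(key in value)) return null; // `in` also consults the prototype chain
      const result = shape[key](value[key]);
      if (result === null) return null;
      Object.assign(bindings, result);
    }
    if (rest !== null) {
      // Only the remaining own enumerable keys are collected.
      bindings[rest] = Object.fromEntries(
        Object.entries(value).filter(([key]) => !named.has(key))
      );
    }
    return bindings;
  };
}

// Roughly `{kind, size: [w, h], ...others}`:
const shapeMatcher = objectOf(
  { kind: bind("kind"), size: arrayOf([bind("w"), bind("h")]) },
  { rest: "others" }
);
console.log(shapeMatcher({ kind: "rect", size: [2, 3], color: "red" }));
// bindings: kind "rect", w 2, h 3, others { color: "red" }
console.log(shapeMatcher({ kind: "rect", size: [2, 3, 4] })); // null (size has 3 items)
```

Returning null for failure keeps "matched with no bindings" (an empty object) distinct from "did not match", which matters for matchers that only test and never bind.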

Value-testing matchers:

  • literal matchers:

    • 1,
    • "foo",
    • etc. All the primitives, plus (untagged only?) template literals.
    • also unary plus/minus
    • -0 and +0 test for the properly-signed zero, 0 just uses === equality.
    • NaN tests for NaN properly.
  • variable/expression matchers

    • ${<expression>} evaluates the expression, and then matches if the matchable equals the result. (Uses === semantics, except that NaN is matched properly.)

      For example, ${LF} will test against a LF variable from outside; {a, b: ${"foo-" + a}} will test .b's value against a dynamic string constructed from .a's value, etc.

  • regex matchers:

    • /foo/ matches if the regex matches
    • /foo/ when <pattern> matches the regex, and then matches the match result against the pattern (so you can extract groups, etc)
  • predicate matchers:

    • if(<expression>) Evaluates the expression, and matches if the expression is truthy. Doesn't use the matchable, doesn't produce any bindings.
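
The equality rules for literal and expression matchers, and the two-step behavior of regex matchers, can be approximated with existing JavaScript primitives. The sketch below is illustrative only. The function names are hypothetical, and the regex helper assumes that only strings can match and that the regex has no g flag, neither of which the text above specifies.

```javascript
// Equality for `${expr}` and for plain literals such as `0`:
// === semantics, except that NaN matches NaN.
function matchesValue(matchable, expected) {
  return matchable === expected ||
    (Number.isNaN(matchable) && Number.isNaN(expected));
}

// Equality for the signed literals `-0` and `+0`: the zero's sign is significant.
function matchesSignedZero(matchable, signedZero) {
  return Object.is(matchable, signedZero);
}

// `/re/ when <pattern>`: run the regex, then hand the match result to the inner pattern.
// Hypothetical model: matchers return a bindings object, or null on failure.
function regexMatcher(regex, resultMatcher = () => ({})) {
  return (matchable) => {
    if (typeof matchable !== "string") return null; // assumption: no coercion
    const result = matchable.match(regex); // assumes a regex without the g flag
    return result === null ? null : resultMatcher(result);
  };
}

console.log(matchesValue(NaN, NaN));   // true
console.log(matchesValue(-0, 0));      // true
console.log(matchesSignedZero(0, -0)); // false

// Roughly `/^(\d{4})-(\d{2})$/ when [_, year, month]`:
const yearMonth = regexMatcher(
  /^(\d{4})-(\d{2})$/,
  ([, year, month]) => ({ year, month })
);
console.log(yearMonth("2023-08")); // bindings: year "2023", month "08"
console.log(yearMonth("August"));  // null
```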

Custom matchers:

  • function matchers:
    • foo(<matchers>) Invokes foo[Symbol.matcher] on the matchable.

      If the "arglist" is empty, just matches based on whether the custom matcher succeeds or fails.

      If the "arglist" is non-empty, additionally matches the provided matchers against the custom matcher's result, as if they were in an array matcher.

      (In other words, foo(<matcher1>, <matcher2>) is a more compact way to write ${foo} with [<matcher1>, <matcher2>])

  • expression custom matchers:
    • ${...}(<matchers>). Identical to function matchers, but evaluates the expression in the braces and then grabs the custom-matcher from the result, rather than requiring an existing variable name like function matchers.
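
To make the custom-matcher protocol more concrete, here is a minimal sketch in today's JavaScript. Symbol.matcher does not exist yet, so a local symbol (matcherKey) stands in for it. The text above does not say what a custom matcher returns, so this sketch assumes an array of extracted values on success and null on failure. Point, bind, and custom are hypothetical names.

```javascript
// Stand-in for the proposed Symbol.matcher, which does not exist in JavaScript today.
const matcherKey = Symbol("matcher");

// <ident>: binds the matchable without testing it.
const bind = (name) => (value) => ({ [name]: value });

class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  // Assumed protocol: an array of extracted values on success, null on failure.
  static [matcherKey](matchable) {
    return matchable instanceof Point ? [matchable.x, matchable.y] : null;
  }
}

// Models `foo(<matchers>)`. Passing the result of any expression as `target`
// models `${...}(<matchers>)`.
function custom(target, argMatchers = []) {
  return (matchable) => {
    const extracted = target[matcherKey](matchable);
    if (extracted === null) return null;
    if (argMatchers.length === 0) return {}; // empty arglist: success or failure only
    // Non-empty arglist: treat the result as if matched by `[<matcher1>, <matcher2>]`,
    // which requires exactly as many items as there are matchers.
    if (extracted.length !== argMatchers.length) return null;
    const bindings = {};
    for (let i = 0; i < argMatchers.length; i++) {
      const result = argMatchers[i](extracted[i]);
      if (result === null) return null;
      Object.assign(bindings, result);
    }
    return bindings;
  };
}

// Roughly `Point(x, y)`:
const pointMatcher = custom(Point, [bind("x"), bind("y")]);
console.log(pointMatcher(new Point(3, 4))); // bindings: x 3, y 4
console.log(pointMatcher({ x: 3, y: 4 }));  // null (not a Point instance)
```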

Boolean matcher logic:

  • <matcher> and <matcher>: Tests the matchable against both matchers (in order), succeeds only if both succeed. Accumulates bindings from both. If first fails, short-circuits.
  • <matcher> or <matcher>: Tests the matchable against both matchers (in order), succeeds if either succeeds. Accumulates bindings from both, but values only from the first successful matcher (other bindings become undefined). If first succeeds, short-circuits.
  • not <matcher>: Tests the matchable against the matcher, succeeds only if the matcher fails. No bindings.
  • Matchers can be parenthesized, and must be if you're using multiple keywords; there is no precedence relationship between the keywords, so it's a syntax error to mix them at the same level.
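
The binding rules for and, or, and not are easier to see in a small model. The following is a sketch under assumptions, not the proposal's definition: each matcher carries a static list of the names it can bind (so that or can set the unused ones to undefined), and test returns a bindings object or null. All helper names are hypothetical.

```javascript
// Hypothetical model: a matcher knows the names it may bind (`names`) and has a
// `test(value)` that returns a bindings object on success, or null on failure.
const matcher = (names, test) => ({ names, test });

const bind = (name) => matcher([name], (value) => ({ [name]: value }));
const typeIs = (type) => matcher([], (value) => (typeof value === type ? {} : null));

// <matcher> and <matcher>: both must succeed; bindings accumulate.
function and(left, right) {
  return matcher([...left.names, ...right.names], (value) => {
    const first = left.test(value);
    if (first === null) return null; // short-circuit: `right` is never tested
    const second = right.test(value);
    return second === null ? null : { ...first, ...second };
  });
}

// <matcher> or <matcher>: the first success wins; names that only the other
// side could bind are still present, with the value undefined.
function or(left, right) {
  const names = [...new Set([...left.names, ...right.names])];
  return matcher(names, (value) => {
    const hit = left.test(value) ?? right.test(value); // short-circuits on success
    if (hit === null) return null;
    const bindings = {};
    for (const name of names) bindings[name] = hit[name];
    return bindings;
  });
}

// not <matcher>: succeeds only if the inner matcher fails; never binds anything.
const not = (inner) => matcher([], (value) => (inner.test(value) === null ? {} : null));

// Roughly `(if(typeof v == "string") and text) or (if(typeof v == "number") and count)`:
const textOrCount = or(
  and(typeIs("string"), bind("text")),
  and(typeIs("number"), bind("count"))
);
console.log(textOrCount.test("hi")); // bindings: text "hi", count undefined
console.log(textOrCount.test(7));    // bindings: text undefined, count 7
console.log(textOrCount.test(true)); // null
```

Tracking names statically is what lets code after an or refer to every binding without first checking which side matched.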

Using Matchers

  • New match(){} expression:

     match(<val-expr>) { 
     	when <matcher>: <result-expr>; 
     	default: <result-expr>;
     }

    Finds the first "arm" whose matcher passes for the value, and evaluates to that arm's result. Bindings the matcher produces are visible within the matcher itself and within the result expression. The default arm always matches. If no arm matches (and there is no default arm), the expression throws.
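
The arm-selection rule can be sketched as an ordinary function. This is an illustration under assumptions rather than the proposed semantics: matchers are modeled as functions returning a bindings object or null, each arm is a [matcher, result] pair, and a generic Error is thrown because the text does not name the error type. The names match, DEFAULT, and describe are hypothetical.

```javascript
// Hypothetical model: a matcher returns a bindings object on success, or null.
// A matcher that always succeeds models the `default` arm.
const DEFAULT = () => ({});

function match(value, arms) {
  for (const [matcher, result] of arms) {
    const bindings = matcher(value);
    // The first arm that passes decides the result; later arms are not tested.
    if (bindings !== null) return result(bindings);
  }
  // The error type is an assumption; the proposal text only says "throws".
  throw new Error("no match() arm matched the value");
}

const isPair = (v) =>
  Array.isArray(v) && v.length === 2 ? { a: v[0], b: v[1] } : null;
const isString = (v) => (typeof v === "string" ? { s: v } : null);

const describe = (val) =>
  match(val, [
    [isPair, ({ a, b }) => `pair ${a},${b}`],
    [isString, ({ s }) => `string ${s}`],
    [DEFAULT, () => "something else"],
  ]);

console.log(describe([1, 2])); // "pair 1,2"
console.log(describe("x"));    // "string x"
console.log(describe(5));      // "something else"
```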

  • New is operator

     <val-expr> is <matcher>

    Evaluates to true if the value passes the matcher, and false otherwise.

    Doing it manually with match() would be:

     let passes = match(<val-expr>) {
     	when <matcher>: true;
     	default: false;
     }
  • New bindings pattern, usable wherever you establish bindings:

    Anywhere you have a binding (top-level, or within a destructuring pattern), you can use one of these forms:

     // Starting with...
     let x = obj;
    
     // can replace with:
     let when <matcher> = obj;
     let x when <matcher> = obj;
    
     // If the context allows a default value, like in...
     let {foo: bar = 5} = obj;
    
     // a third form is allowed:
     let {foo: when <matcher> } = obj;
     let {foo: bar when <matcher> } = obj;
     let {foo: bar when <matcher> = default } = obj;
    • The first form, when <matcher>, requires the matcher to establish bindings; if it doesn't, it's an early SyntaxError.

      At runtime, it tests the appropriate value against the matcher. If it succeeds, it exposes the bindings the matcher defines, to whatever context would normally see bindings here. If it fails, exactly what happens is context-specific; often it'll throw a TypeError, but some places fail more gracefully.

    • The second form, x when <matcher>, doesn't require the matcher to establish bindings.

      It's identical to the first form, except the value being matched is also bound to the x name. (If the matcher also has an x binding, which one wins? I suspect the matcher's binding, just because it comes second in source order.)

    • The third form, x when <matcher> = default, doesn't require the matcher to match.

      If it does, it's identical to the second form, but if it doesn't, instead the default value is bound to the x name, and any bindings the matcher introduces are undefined.

    "Anywhere you have a binding" means, in particular:

    • var/let/const statements. A match failure throws a TypeError.
    • for() heads (for(when <matcher> of <iterable>) {...}). A match failure just skips that iteration (as if by continue).
    • Function args. A match failure throws a TypeError.

    Bindings use the specified binding semantics (var/let/const); if none is explicitly specified, they default to let semantics. (For example, for(when [a, b] of vals) will use let semantics rather than var.)
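
The three failure behaviors described above (throwing, skipping an iteration, and falling back to a default) can be written out by hand in current JavaScript. The sketch below is only an approximation: matchers are modeled as functions returning a bindings object or null, and bindOrThrow, matching, and bindWithDefault are hypothetical helpers.

```javascript
// Hypothetical model: a matcher returns a bindings object on success, or null.
// `pair` stands in for the matcher `[a, b]`.
const pair = (v) =>
  Array.isArray(v) && v.length === 2 ? { a: v[0], b: v[1] } : null;

// `let when [a, b] = obj;` (also function args): a match failure throws a TypeError.
function bindOrThrow(matcher, value) {
  const bindings = matcher(value);
  if (bindings === null) throw new TypeError("value does not match the pattern");
  return bindings;
}

// `for (when [a, b] of vals) {...}`: a match failure skips that iteration.
function* matching(matcher, iterable) {
  for (const item of iterable) {
    const bindings = matcher(item);
    if (bindings !== null) yield bindings;
  }
}

// `let {foo: bar when [a, b] = fallback} = obj;`: on failure, bar gets the
// default and the matcher's own bindings are all undefined.
function bindWithDefault(matcher, names, value, fallback) {
  const bindings = matcher(value);
  if (bindings !== null) return { value, bindings };
  return {
    value: fallback,
    bindings: Object.fromEntries(names.map((name) => [name, undefined])),
  };
}

const { a, b } = bindOrThrow(pair, [1, 2]);
console.log(a + b); // 3

const sums = [];
for (const found of matching(pair, [[1, 2], "skip me", [3, 4]])) {
  sums.push(found.a + found.b);
}
console.log(sums); // the sums 3 and 7; "skip me" was skipped

const fallbackCase = bindWithDefault(pair, ["a", "b"], "not a pair", [0, 0]);
console.log(fallbackCase.value);      // the default, [0, 0]
console.log(fallbackCase.bindings.a); // undefined
```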

  • New if/while forms:

    Bindings in if()-heads and while()-heads were discussed in the past (proposal), (notes), but never brought to an adequate conclusion; destructuring was avoided due to confusion about what was checked for truthiness, and there were questions about what the bindings' scope should be.

    This narrowly avoids stomping the proposed syntax from the earlier proposal, and should be compatible with it. If we'd like to revive the previous discussion instead, we can defer this part and match whatever the results are (but see notes, below, about compat with that proposal).

     if(when <matcher> = <val-expr>) {
     	// Evaluates only if the matcher matches.
     	// <matcher> isn't required to establish bindings,
     	// but if it does, they're made available to the
     	// if()'s body.
     	// Note that the value doesn't have to be truthy;
     	// it's the pattern matching that matters, instead.
     } else if(when <matcher2> = <val-expr2>) {
     	// this one only sees the bindings from matcher2
     }
    
     while(when <matcher> = <val-expr>) {
     	// Runs val-expr on each iteration,
     	// breaking when it fails to pass the matcher.
     	// Exposes the matcher's bindings to the body.
     	// Like `if()`, the matcher isn't required to establish bindings.
     }
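
For comparison, this is roughly how the if() and while() forms could be desugared by hand today. It is a sketch under assumptions: matchers are modeled as functions returning a bindings object or null, and pair, isNumber, pairSum, and drainPairs are hypothetical names.

```javascript
// Hypothetical model: a matcher returns a bindings object on success, or null.
const pair = (v) =>
  Array.isArray(v) && v.length === 2 ? { a: v[0], b: v[1] } : null;
const isNumber = (v) => (typeof v === "number" ? {} : null);

// if (when [a, b] = value) { return a + b; } else { return null; }
function pairSum(value) {
  const bindings = pair(value);
  if (bindings !== null) {
    const { a, b } = bindings; // bindings are visible only in this branch
    return a + b;
  }
  return null;
}

// while (when [a, b] = queue.shift()) { ... }
function drainPairs(queue) {
  const totals = [];
  while (true) {
    const bindings = pair(queue.shift()); // the value is re-evaluated every iteration
    if (bindings === null) break;         // stop at the first value that fails to match
    totals.push(bindings.a + bindings.b);
  }
  return totals;
}

console.log(pairSum([2, 5]));      // 7
console.log(pairSum("not a pair")); // null

// The tested value does not need to be truthy: 0 passes a number matcher.
console.log(isNumber(0) !== null); // true

const queue = [[1, 2], [3, 4], "stop", [5, 6]];
console.log(drainPairs(queue)); // the totals 3 and 7
console.log(queue.length);      // 1 ("stop" was consumed; [5, 6] remains)
```
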
Compat with if-bindings proposal

If we do if-bindings, then I propose the following interaction with matchers:

  • The above-defined behavior is the behavior of the plain when <matcher> form.
  • A truthiness test is performed only when using the x when <matcher> form, against the x value. This is in addition to requiring the matcher to succeed.
  • Bindings from the matcher are visible only to the consequent (by definition, they only exist if the matcher succeeded). The x binding in x when <matcher> can also be visible to the else branch, if that's what the committee decides is the correct general behavior for bindings.

If the if-bindings syntax ends up requiring a let/const/var to kick you into that behavior, we should waive it for the above-defined case, so you don't have to write if(let when ... = x). The when on its own still suffices to distinguish it from the general expression syntax.

  • New catch form:

    This supersedes the previous "conditional catches" proposal. Same as the general "matchers anywhere you have a binding", except you can chain multiple catches. A match failure causes it to go to the next catch.

     try {...}
     catch(when <matcher>) {
     	// executes this block if the exception matches <matcher>
     	// passes to next catch otherwise
     }
     catch(when <matcher>) {
     	// if the last catch doesn't match,
     	// rethrows the error
     }

    Doing it manually with match() would be:

     try {...}
     catch(e) {
     	match(e) {
     		when <matcher>: do {...};
     		when <matcher>: do {...};
     		default: do {throw e};
     	}
     }
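
The chained-catch behavior can also be emulated in current JavaScript with a single catch block. The sketch below is illustrative only: matchers are modeled as functions returning a bindings object or null, and instanceOf, tryWithMatchers, and the two error classes are hypothetical.

```javascript
// Hypothetical model: a matcher returns a bindings object on success, or null.
class NotFoundError extends Error {}
class TimeoutError extends Error {}

const instanceOf = (Ctor) => (value) =>
  value instanceof Ctor ? { error: value } : null;

// Emulates `try {...} catch(when <matcher>) {...} catch(when <matcher>) {...}`.
function tryWithMatchers(body, handlers) {
  try {
    return body();
  } catch (e) {
    for (const [matcher, handler] of handlers) {
      const bindings = matcher(e);
      if (bindings !== null) return handler(bindings); // the first matching catch runs
    }
    throw e; // no catch matched: rethrow the original error
  }
}

const handlers = [
  [instanceOf(NotFoundError), ({ error }) => `not found: ${error.message}`],
  [instanceOf(TimeoutError), () => "timed out"],
];

console.log(
  tryWithMatchers(() => {
    throw new NotFoundError("user 7");
  }, handlers)
); // "not found: user 7"

console.log(tryWithMatchers(() => "no error", handlers)); // "no error"
```
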
@theScottyJam

I really like this overall view. And, I love the idea of having pattern-matching happen on function arguments like that.

One thing that could be added to the list of different places we'd like to have pattern-matching, is with a try-catch.

try {
  ...
} catch match(when(...)) {
  ...
}

As for the syntax, not that it matters too much, but the "match" word seems redundant in many of these examples. Couldn't the "when" word, by itself, be used to figure out that we're dealing with pattern matching?

if (when({ x }) = obj) { ... }

for (when({ x }) of array) { ... }

while (when({ x }) = obj) { ... }

const when({ x }) = obj;

const passes = obj is when({ x });

function fn(when({ x })) { ... }

try { ... } catch (when({ x })) { ...}

@tabatkins
Author

Oh right, I forgot about catch. I'll drop that in too.

As for the syntax, not that it matters too much, but the "match" word seems redundant in many of these examples. Couldn't the "when" word, by itself, be used to figure out that we're dealing with pattern matching?

Yeah I was being conservative, but this seemed to provoke an allergic reaction in everyone. Updated the proposal to simplify this a lot. (Matches what you proposed, but slightly simpler even - when is just a keyword prefixing the matcher, rather than a wrapper.)

@littledan

I definitely like this version's inclusion of the when extractor(pattern) form, as well as the usage in various LHS contexts. I might prefer subsetting this proposal somewhat, but that's a longer and more complicated discussion.
