bterlson/patternmatching.md

Pattern Matching

This is a strawman proposal for adding pattern matching to ECMAScript. Pattern matching is useful for matching a value to some structure in a similar way to destructuring. The primary difference between destructuring and pattern matching is the use cases involved - destructuring is useful for binding pieces out of larger structures, whereas pattern matching is useful for mapping a value's structure to data or a set of behaviors. In practice this means that destructuring tends to accept many shapes of data and will do its best to bind something out of them, whereas pattern matching tends to be more conservative.
Additionally, the power of pattern matching is increased substantially when values are allowed to participate in the pattern matching semantics as a matcher as well as a matchee. This proposal includes the notion of a pattern matching protocol - a symbol method that can be implemented by objects that enables developers to use those values in pattern matching. A common scenario where this is important is matching a string against a regexp pattern.
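The protocol can be approximated in today's JavaScript. The following is a minimal sketch, not part of the proposal: `matchesSymbol` is a locally created symbol standing in for the proposed Symbol.matches (which no engine provides), `match` is a hypothetical helper, and the assumption that a regexp matcher returns its exec result is taken from the examples later in this document.

```javascript
// Stand-in for the proposed Symbol.matches; this symbol is local to the sketch.
const matchesSymbol = Symbol("matches");

// Assumption: a regexp matcher returns the exec() result (an array) or null.
Object.defineProperty(RegExp.prototype, matchesSymbol, {
  value(testValue) {
    return typeof testValue === "string" ? this.exec(testValue) : null;
  },
  configurable: true,
  writable: true,
});

// Hypothetical helper: ask `matcher` whether `testValue` matches it.
function match(matcher, testValue) {
  return matcher[matchesSymbol](testValue);
}

const result = match(/(\d+)-(\w+)/, "42-Brian");
console.log(result && result[1], result && result[2]); // 42 Brian
console.log(match(/(\d+)-(\w+)/, "no digits here")); // null
```

Because the method is keyed by a symbol, any object (not just built-ins) could opt in to acting as a matcher without colliding with existing string-named properties.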
At this point, any syntax in this proposal should be considered highly provisional - there is plenty of room for bikeshedding here.
Pattern matching can be useful in a number of situations including catch guards and case statements. This proposal includes affordances for both, in addition to an operator form that can be broadly useful.
Basic Syntax & Semantics

A MatchClause is composed of up to three parts - a MatchExpression (which can be any expression), a MatchAssignmentPattern (which is some sigil followed by an AssignmentPattern), and a MatchIfClause (which is an if token followed by any expression). Either a MatchExpression or a MatchAssignmentPattern must be provided. The following are examples of valid MatchClauses:

somePerson -> { id, props: { name } } if name === ""
-> [ first, second ]
x if x > 1

Note the usage of the token -> to separate the MatchExpression from the MatchAssignmentPattern. You could imagine a number of alternative syntaxes here, including wrapping the MatchExpression in parens. Feel free to comment with your ideas.
The high level semantics of the MatchClause as applied to some test value are as follows:

If a MatchExpression is provided, it is evaluated. The Symbol.matches method is invoked on the result, passing the test value as a parameter. The resulting value is passed on to the rest of the MatchClause. If the MatchExpression is omitted, it is as if some value were provided whose Symbol.matches method simply returns its first argument unchanged (i.e. the identity matcher function).
If the MatchAssignmentPattern is provided, it is used to both assert a certain structure about the test value and to bind pieces out of it. For example, a MatchAssignmentPattern like [x, y] will ensure the test value (or the value returned by the MatchExpression if provided) is an array of length 2 and bind x and y out of it.
The MatchIfClause is executed in the context of any bindings created by the MatchAssignmentPattern and can be any expression that must evaluate to true for the pattern to match.
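The three steps above can be sketched as an ordinary function. This is an illustration under stated assumptions rather than the proposal's semantics: `matchesSymbol` stands in for Symbol.matches, `evaluateMatchClause` and the `matcher`/`bind`/`guard` fields are hypothetical names, a falsy matcher result is assumed to mean "no match", and the guard is treated as passing when it is truthy.

```javascript
// Stand-in for the proposed Symbol.matches (local to this sketch).
const matchesSymbol = Symbol("matches");

// Hypothetical evaluator. `clause` has three optional parts:
//   matcher: object with a [matchesSymbol] method   (MatchExpression)
//   bind:    returns an object of bindings, or null  (MatchAssignmentPattern)
//   guard:   predicate over the bindings             (MatchIfClause)
// Returns the bindings on success, or null when the clause does not match.
function evaluateMatchClause(clause, testValue) {
  // Step 1: with no matcher, the test value passes through unchanged
  // (the identity matcher). Assumption: a falsy matcher result is a failure.
  let subject = testValue;
  if (clause.matcher) {
    subject = clause.matcher[matchesSymbol](testValue);
    if (!subject) return null;
  }
  // Step 2: assert structure and bind pieces out of the subject.
  const bindings = clause.bind ? clause.bind(subject) : {};
  if (bindings === null) return null;
  // Step 3: the guard runs with the bindings in scope.
  if (clause.guard && !clause.guard(bindings)) return null;
  return bindings;
}

// Emulates: /(\d+)-(\w+)/ -> [, id, name] if name === "Brian"
const idAndName = {
  matcher: { [matchesSymbol]: (value) => /(\d+)-(\w+)/.exec(String(value)) },
  bind: (m) => (m.length === 3 ? { id: m[1], name: m[2] } : null),
  guard: ({ name }) => name === "Brian",
};

// Emulates: -> [ first, second ]
const pair = {
  bind: (v) =>
    Array.isArray(v) && v.length === 2 ? { first: v[0], second: v[1] } : null,
};

console.log(evaluateMatchClause(idAndName, "7-Brian")); // { id: '7', name: 'Brian' }
console.log(evaluateMatchClause(idAndName, "7-Alice")); // null
console.log(evaluateMatchClause(pair, [1, 2, 3])); // null
```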

Switch statements may now contain either CaseClauses or SwitchMatchClauses exclusively (no mixing of the two). The SwitchMatchClause begins with match (acceptable as the region immediately inside the switch block is syntactically free) and is followed by a MatchClause. The other key difference from a CaseClause is that the SwitchMatchClause is braced (as it introduces lexical (let) bindings in a similar fashion to function parameters) and does not fall through by default (although you may use continue to do so).
The following are some motivating examples of the SwitchMatchClause in action:
switch (value) {
  match -> { x, y } {
    // matches objects with only x and y properties. x and y are bound in this block.
  }
  
  match -> [x, y] {
    // matches arrays or array-like objects with a .length property === 2.
  }
  
  match -> { x, y, ... } {
    // matches objects with at least an x and y enumerable property
  }
  
  match -> [ x, y, ... rest ] {
    // matches objects with a .length >= 2.
  }
  
  match SomeClass {
    // SomeClass[Symbol.matches](value) is invoked, and if truthy, this block will be executed.
    // It may be desirable to install Function.prototype[Symbol.matches](value) that returns `value instanceof this;`
    // in order to make this pattern available across all classes.
  }
    
  match /(\d+)-(\w+)/ -> [, id, name] if name === "Brian" {
    // matches if the regexp's [Symbol.matches](value) returns an array that matches the pattern and the name
    // part of the regexp is equal to "Brian". id and name are bound in this block.
  }
  
  match /(\d+)-(\w+)/ -> [, id, name] {
    // matches if the regexp's [Symbol.matches](value) returns an array that matches the pattern. id and name are bound
    // in this block.
  }
  
  match /(\d+)-(\d+)/ {
    // matches if the regexp's [Symbol.matches](value) is truthy (it would do the obvious thing and return the RegExp exec result).
  }
}
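The structural checks in the first four clauses are stricter than destructuring, which would happily bind undefined for a missing property. A rough sketch of those checks in plain JavaScript follows; `matchObject` and `matchArrayLike` are hypothetical helpers, and treating "properties" as own enumerable keys (via Object.keys) is an assumption the proposal does not spell out.

```javascript
// `-> { x, y }`      : the object must have exactly the listed keys.
// `-> { x, y, ... }` : with `open` set, extra keys are allowed.
// Assumption: only own enumerable string keys are considered.
function matchObject(value, keys, { open = false } = {}) {
  if (value === null || typeof value !== "object") return null;
  const own = Object.keys(value);
  if (!keys.every((key) => own.includes(key))) return null;
  if (!open && own.length !== keys.length) return null;
  return Object.fromEntries(keys.map((key) => [key, value[key]]));
}

// `-> [x, y]`          : .length must equal `count`.
// `-> [x, y, ...rest]` : with `rest` set, .length must be at least `count`.
function matchArrayLike(value, count, { rest = false } = {}) {
  if (value === null || typeof value !== "object") return null;
  if (typeof value.length !== "number") return null;
  if (rest ? value.length < count : value.length !== count) return null;
  return Array.from(value);
}

console.log(matchObject({ x: 1 }, ["x", "y"])); // null
console.log(matchObject({ x: 1, y: 2, z: 3 }, ["x", "y"])); // null
console.log(matchObject({ x: 1, y: 2, z: 3 }, ["x", "y"], { open: true })); // { x: 1, y: 2 }
console.log(matchArrayLike([1, 2, 3], 2)); // null
console.log(matchArrayLike([1, 2, 3], 2, { rest: true })); // [ 1, 2, 3 ]
```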

Additionally, the MatchClause can be used as a catch guard with nearly identical semantics:
try {
  ...
} catch(e match if isInteresting(e)) {
  ...
} catch(e match SyntaxError) {
  ...
}
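For comparison, the guards above can be emulated today with a single catch block. In this sketch `classify` and `isInteresting` are hypothetical, the SyntaxError guard is written with instanceof (the default behavior this proposal suggests for constructors), and rethrowing when no guard matches is an assumption about the intended semantics.

```javascript
// Hypothetical predicate used by the first guard.
function isInteresting(error) {
  return error instanceof RangeError;
}

// Runs `fn` and reports which guard, if any, handled its exception.
function classify(fn) {
  try {
    fn();
    return "ok";
  } catch (e) {
    // Emulates: catch (e match if isInteresting(e)) { ... }
    if (isInteresting(e)) return "interesting";
    // Emulates: catch (e match SyntaxError) { ... }
    if (e instanceof SyntaxError) return "syntax";
    // Assumption: when no guard matches, the exception keeps propagating.
    throw e;
  }
}

console.log(classify(() => JSON.parse("{"))); // syntax
console.log(classify(() => new Array(-1))); // interesting
console.log(classify(() => 1)); // ok
```

The guarded form removes the need for the manual rethrow, which is easy to forget in the hand-written version.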

Additionally, an operator would be very useful and would enable usage of these pattern matching semantics in a wide array of use-cases. Unfortunately, the ideal operator =~ is syntactically ambiguous due to the ~ operator already being a thing (x =~ y already parses as x = ~y), and anyway the current syntax is somewhat unpalatable when used with a MatchAssignmentPattern, so suggestions are welcome. In any case, proceeding with the strawman =~ for illustrative purposes:
if (str =~ /foo/) { ... } // evaluates /foo/[Symbol.matches](str)

if (instance =~ SomeClass) { ... } // evaluates SomeClass[Symbol.matches](instance)

if (obj =~ -> { x, y } ) { pointCache[x][y] = obj }

The =~ operator is also useful to enable pattern matching on deeply nested structures. For example:
switch (node) {
    match Leaf -> { parent } if parent =~ Container {
      ...
    }
} 
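The nested example can be approximated with a helper function in place of the operator. This is a sketch only: `matchesSymbol` stands in for Symbol.matches, `matches` is a hypothetical function playing the role of =~, and the Leaf and Container classes are invented for illustration. The class matchers return the value itself (or null) so that the result can be handed on to a pattern such as { parent }.

```javascript
// Stand-in for the proposed Symbol.matches (local to this sketch).
const matchesSymbol = Symbol("matches");

// Hypothetical function form of `value =~ matcher`; returns the matcher's
// result, which is truthy on success.
function matches(value, matcher) {
  return matcher[matchesSymbol](value);
}

class Container {
  static [matchesSymbol](value) {
    return value instanceof Container ? value : null;
  }
}

class Leaf {
  constructor(parent) {
    this.parent = parent;
  }
  static [matchesSymbol](value) {
    return value instanceof Leaf ? value : null;
  }
}

// Emulates: match Leaf -> { parent } if parent =~ Container { ... }
function isLeafInContainer(node) {
  const leaf = matches(node, Leaf);
  if (!leaf) return false;
  const { parent } = leaf; // the { parent } pattern
  return Boolean(matches(parent, Container)); // the guard
}

console.log(isLeafInContainer(new Leaf(new Container()))); // true
console.log(isLeafInContainer(new Leaf(new Leaf(null)))); // false
console.log(isLeafInContainer(new Container())); // false
```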

Other Considerations

instanceof could be defined in terms of the new matching protocol. Specifically, Function.prototype[Symbol.matches] could implement the current instanceof semantics and instanceof could simply delegate to this protocol. This would provide an easy way of overriding instanceof, which has been discussed in the past.
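A sketch of that delegation, written as a function since the operator itself cannot be redefined from user code. The names `matchesSymbol`, `instanceOf` and `Even` are hypothetical; the fallback uses ES2015's Function.prototype[Symbol.hasInstance] only to obtain the ordinary prototype-chain check.

```javascript
// Stand-in for the proposed Symbol.matches (local to this sketch).
const matchesSymbol = Symbol("matches");

// Hypothetical function form of `value instanceof ctor` that delegates to
// the matching protocol and falls back to the ordinary prototype-chain check.
function instanceOf(value, ctor) {
  const method = ctor[matchesSymbol];
  if (typeof method === "function") {
    return Boolean(method.call(ctor, value));
  }
  return Function.prototype[Symbol.hasInstance].call(ctor, value);
}

class Point {}

// A class that overrides the check: "instances" are even integers.
class Even {
  static [matchesSymbol](value) {
    return Number.isInteger(value) && value % 2 === 0;
  }
}

console.log(instanceOf(new Point(), Point)); // true
console.log(instanceOf({}, Point)); // false
console.log(instanceOf(4, Even)); // true
console.log(instanceOf(3, Even)); // false
```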