Skip to content

Instantly share code, notes, and snippets.

@rbuckton
Created September 7, 2022 23:23
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save rbuckton/fca8b4ecc4eb16422b01f2557203082b to your computer and use it in GitHub Desktop.
Save rbuckton/fca8b4ecc4eb16422b01f2557203082b to your computer and use it in GitHub Desktop.
Pattern Matching Tenets

Pattern Matching Tenets

  1. Patterns should be concise
  2. Patterns should be expressive
  3. Patterns should be explicit
  4. Patterns should be extensible
  5. Pattern matching should be exhaustive

Tenet 1 — Patterns should be concise

The primary goal of pattern matching is to reduce the overhead incurred by non-pattern based control flow. This means simplifying multi-statement or multi-expression comparisons to improve readability, as well as to cut down on the need to introduce temporary variables used only to avoid recomputing potentially side-effectful or expensive code.

Consider the following ECMAScript code sample:

if (obj.dimensions.height === 10 && obj.dimensions.width === 10) {  }

Example 1 — A multi-step conditional. (ECMAScript)

In this example, the dimensions property could potentially be an accessor that is side-effectful, or may perform an expensive computation. As such, we may want to avoid re-evaluating obj.dimensions and might instead use a variable to store the partial computation:

const dimensions = obj.dimensions;
if (dimensions.height === 10 && dimensions.width === 10) {  }

Example 2 — A multi-step conditional with a temporary variable. (ECMAScript)

However, this further increases the complexity of the condition by adding an additional statement, and requires the naming of an otherwise unused variable.

Now consider the same condition using the pattern matching syntax from C#:

if (obj is { dimensions: { height: 10, width: 10 } }) {}

Example 3 — Nested pattern matching. (C#)

The pattern matching example avoids re-execution of the property access from example 1, as well as the need to introduce a temporary variable like in example 2.

Patterns should be concise. However, for a pattern to be adequately concise, the pattern syntax must also be expressive.

Tenet 2 — Patterns should be expressive

In order for a pattern to be concise, the pattern syntax must be expressive enough to represent the most common cases of multi-step conditionals. There are a number of ways patterns can support such expressivity:

  • Literal constant patterns that express === or SameValue/SameValueZero equality.
  • Nested property patterns that can reduce the need for temporary variables and repeated use of &&.
  • List patterns that can reduce the need for complex iteration and caching logic, as well as avoid opaque index accesses like x[0][1].
  • Type-family patterns that test whether a value is an instance of a specific type.
  • Relational patterns that can express other relational operators (i.e., >, >=, <, <=), or even ranges.
  • Logical patterns that allow for conjection, disjunction, negation, and grouping.

Most languages with pattern matching support many of the above capabilities:

input switch {
    0 =>,                                 // literal constant
    { x: 0, y: 0 } =>,                    // nested properties
    (1, 2, 3) =>,                         // tuples
    [1, 2, 3, ..] =>,                     // lists
    Rect =>,                              // type family
    >1 =>,                                // relational
    not null =>,                          // logical negation
    1 or 2 or 3 =>,                       // logical disjunction
    >0 and <3 =>,                         // logical conjunction
    (>0 or >0F) and (<100 or <100F) =>// grouping
}

Example 4 — Expressivity in C# switch expressions. (C#)

match input with
| 0 ->// literal constant
| { x = 0; y = 0; } ->// nested properties
| (1, 2, 3) ->// tuples
| [1; 2; 3] ->// lists
| [|1; 2; 3 |] ->// arrays
| p : Point ->// type family
| :? Point ->// type family
| 1 | 2 | 3 ->// logical disjunction
| (a, b) & (_, "test") ->// logical conjunction
| (1 | 2) & (2 | 3) ->// grouping
|

Example 5 — Expressivity in F# match expressions. (F#)

match input {
    0 => …,                     // literal constant
    Point { x: 0, y: 0 } => …,  // nested properties
    (1, 2, 3) => …,             // lists/tuples
    Point { .. } => …,          // type family
    0..=5 => …,                 // range
    1 | 2 | 3 => …,             // logical disjunction}

Example 6 — Expressivity in Rust match expressions. (Rust)

input match
  case 0 =>// literal constant
  case Some(value) =>// nested properties (case classes and extractors)
  case (1, 2, 3) =>// tuple pattern
  case 1::2::3 =>// list pattern (infix operator and extractors)
  case p: Point =>// type family
  case 1 | 2 | 3 =>// logical disjunction

Example 7 — Expressivity in Scala match expressions. (Scala)

Patterns should be expressive. However, there are often either complex or niche cases where patterns cannot be sufficiently expressive. In such cases, patterns should also be extensible. And while patterns should be sufficiently expressive to handle the majority cases, their syntax should also be unambiguous. For this reason, patterns must also be explicit.

Tenet 3 — Patterns should be explicit

Because patterns should be concise and expressive, we must take care that we do not overload the meaning of various syntactic constructs. This would introduce confusion to readers and result in potentially unexpected behavior. To avoid this, patterns should be explicit — there should be no ambiguity in the meaning of a particular pattern.

For example, consider a list pattern like [A, B] (or (A, B), depending on the language). Should this pattern introduce new variables A and B corresponding to the first and second elements of the input? Or should it instead resolve the values stored in the existing A and B variables and match them against the corresponding elements of the input?

In C#, the pattern (A, B) is unambiguous and can only indicate references to existing types. The syntax related to introducing new variables requires a type prefix or var keyword. (A, B) matches a tuple whose elements are the types A and B, respectively. (A a, B b) matches a tuple with elements of type A and B and stores those values into the variables a and b, respectively. (var a, var b) matches a tuple regardless of the types of the two elements, and introduces the variables a and b.

Depending on the underlying syntax, such collisions can be avoided. For ECMAScript, there are a number of options to explore:

  • Leveraging the existing let, const, and var keywords (and potentially the proposed using keyword), similar to C#'s var and type binding patterns.
  • Introducing novel syntax for explicitly named bindings, such as Rust's @ binding operator.
  • Dislocating the bindings introduced from the pattern itself.

We must take care that whatever syntax we choose aligns as closely as possible with our preceding core tenets, namely:

  • Being explicit should not unduly prevent us from remaining concise.
  • Being explicit should not unduly prevent us from remaining expressive.

Tenet 4 — Patterns should be extensible

While patterns should ideally be sufficiently expressive, there are often more complex or niche cases that cannot be readily handled by pattern syntax. To address this, patterns should be extensible — providing escape hatches to address these cases.

There are several mechanisms for providing extensibility:

  • Pattern guards, which introduce additional conditions to evaluate to match a specific case.
  • Custom matchers, which augment pattern matching behavior either through custom operators or a dynamic API.
  • Interpolation, which escapes from the pattern syntax to access runtime values or custom matchers.

Pattern Guards

In cases where the pattern syntax is insufficiently expressive, many languages with pattern matching also introduce pattern guards which provide additional expressivity at the cost of dislocation of the guard from the pattern itself:

input switch {
    (var a, var b) when a > 0 && b > 0 =>,    // pattern guard using 'when'}

Example 8 — when pattern guard in C# switch expressions. (C#)

match input with
| (a, b) when a > 0 && b > 0 ->// pattern guard using 'when'
|

Example 9 — when pattern guard in F# match expressions. (F#)

input match {
    (a, b) if a > 0 && b > 0 => …               // pattern guard using 'if'
}

Example 10 — if pattern guard in Rust match expressions. (Rust)

input match
  case (a, b) if a > 0 && b > 0 =>// pattern guard using 'if'

Example 11 — if pattern guard in Scala match expressions. (Scala)

While pattern guards are useful for handling niche cases, a truly expressive pattern syntax primarily favors a more comprehensive set of patterns that promote locality of conditions to improve readability.

Custom Matchers

In addition to pattern guards, pattern matching languages often introduce extensibility mechanisms through a dispatch mechanism. In C#, this is handled via the Deconstruct method, which allows for the evaulation of user-defined behavior during destructuring and matching. In F#, this is addressed by Active Patterns, which allow you to reference a user-defined function to customize matching behavior. In Scala, this is supported via Extractor Objects and infix operator patterns.

class Person {
    public string GivenName;
    public string FamilyName;

    public Person(string givenName, string familyName) {
        GiveName = givenName;
        FamilyName = familyName;
    }

    public void Deconstruct(out string givenName, out string familyName) {
        givenName = GivenName;
        familyName = familyName;
    }
}

input match {
    Person(var givenName, var familyName) =>// checks type and calls Deconstruct
}

Example 12 — Pattern extensibility via Deconstruct. (C#)

let (|RGB|) (col : System.Drawing.Color) = ( col.R, col.G, col.B )

match col with
| RGB(r, g, b) ->// invokes RGB with input and destructures resulting tuple
|

Example 13 — Pattern extensibility via Active Patterns. (F#)

object CustomerID {
    def unapply(customerID: String): Option[String] = {
        val stringArray: Array[String] = customerID.split("--")
        if (stringArray.tail.nonEmpty) Some(stringArray.head) else None
    }
}

input match
  case CustomerID(name) =>// Invokes CustomerID.unapply

Example 14 — Pattern extensibility via Extractor Objects. (Scala)

// NOTE: exists in standard library
object :: {
    def unapply[A](ls: List[A]): Option[(A, A)] = {
        if (ls.empty) None
        else Some((ls.head, ls.tail))
    }
}

input match
  case h::t =>// Looks up `::` operator, invokes `unapply` and destructures result into `h` and `t`

Example 14 — Pattern extensibility via Extractor Objects and infix operator patterns. (Scala)

Interpolation

One last extensibility mechanism to consider is interpoloation. This is an escape hatch from the pattern syntax that allows you to substitute the result of evaluating an expression into a pattern. Such a mechanism is not widely employed in any language, however.

Tenet 5 — Pattern matching should be exhaustive

Statement-based control flow in ECMAScript often allows you to elide branches. You can have an if without an else and a switch without a default and this is acceptable within the language itself because these statements can simply produce an empty completion record for unhandled cases. However, unhandled branches are frequently a source of bugs in software and many static analysis tools often implement control flow analysis to find and surface such issues.

Expression-based control flow in ECMAScript differs in that it does not allow you to elide branches. A conditional expression cannot elide the "falsy" branch, and operators like &&, ||, and ?? do not allow you to elide the right-hand of the expression. Even operators like ?. are essentially exhaustive as they are often just a shorthand for a conditional operation whose "falsy" branch is just undefined.

Exhaustiveness is achieved in pattern matching through the use of refutable patterns and irrefutable patterns. A refutable pattern is one that conditonally matches a value. If such a pattern does not match, the next alternative or branch should be attempted. An irrefutable pattern is one that will always match. For a pattern matching construct to be considered exhaustive it must either cover the input domain with sufficient refutable patterns, or it must also have an irrefutable pattern.

For example, in Rust every variable declaration is essentially a pattern. The x in the statement let x = 5 is an irrefutable pattern, as there is no input value that would not match. However, the Some(x) in the statement let Some(x) = opt is a refutable pattern: if opt is not a Some, the pattern will not match and an error will be reported.

This is not unlike destructuring in ECMAScript today. The statement let { x } = input is an example of a refutable pattern: if input is not an object, the statement will throw. In the same vein, let [x] = input is also a refutable pattern: if input does not have a [Symbol.iterator] method, the statement will throw.

One way we could implement exhaustiveness in ECMAScript pattern matching would be to also throw if some branch of the pattern fails to match. For example, if we were to replace the simple conditional a ? b : c with a switch expression in C#, it might look something like this:

a switch {
    true => b,
    false => c
}

Example 15 — Exhaustiveness in a switch expression. (C#)

If we were to elide either branch, C#'s control flow analysis would fail, not unlike if we'd written a ? b : or a ? : c.

Event without static typing we can still support exhaustiveness in ECMAScript at runtime similar to how we support it for destructuring, by throwing a TypeError. In addition, third-party static analysis tools such as TypeScript and Flow would be able to recognize non-exhaustive pattern matching expressions and provide early warnings of potential errors.

The one instance where exhaustiveness should not be required is when the intent is to test whether the pattern matched rather than what was matched. In C#, the is expression can be used to conditionally test a pattern and returns either true if the pattern was exhaustive, or false if it was not:

if (obj is int) { ... }

Example 16 — Non-exhaustive pattern matching via the is expression. (C#)

Similarly, in Rust you can use if let and while let to conditionally match a non-exhaustive pattern:

if let Some(x) = opt { ... }

Example 17 — Non-exhaustive pattern matching via the if let statement. (Rust)

Pattern matching should be exhaustive, as exhaustiveness helps to avoid defects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment