Skip to content

Instantly share code, notes, and snippets.

@acutmore
Last active September 23, 2022 22:31
Show Gist options
  • Save acutmore/97e0b80850c04595b283d8faa052be9c to your computer and use it in GitHub Desktop.
Save acutmore/97e0b80850c04595b283d8faa052be9c to your computer and use it in GitHub Desktop.
When considering the design of the Record & Tuple proposal a core driver in much of the design stems primarily from equality rather than immutability.

Immutability is Equality's support act [hackmd]

When considering the design of the Record & Tuple proposal a core driver in much of the design stems primarily from equality rather than immutability.

Categorizing equality

There are many ways to think about equality but one way to divide up the design space is two dimensions.

pure impure
consistent A C
changeable B D

A) pure and consistent

=== is an example of a pure and consistent equality operator. It is pure because no other operation will be triggered by this operation; there can be no observable change in state. It is consistent because for any two values the result of comparing them will never change.

const a = getValue1();
const b = getValue2();
const result1 = (a === b);     // `===` will not trigger any other code to run. (pure)
doMoreWork();                  // We can now allow other code to run before
const result2 = (a === b);     // comparing the same two values again and
assertEqual(result1, result2); // the comparison result can not have changed. (consistent)

This category of equality operation is particularly useful for code that wants to provide strong guarantees about how code behaves and what properties can be inferred by the comparison.

B) pure and changeable

It is possible to write a pure equality operation that is not consistent - given the same two values the result can change over time; but being pure - it won't trigger any getters or proxy hooks or invoke symbol protocols. For example:

setup:
// Capture intrinsics ahead of time before they can be modified - this ensures the operation remains 'pure'
const _Uint8Array = Uint8Array;
const _byteLength = Object.getOwnPropertyDescriptor(ArrayBuffer.prototype, 'byteLength').get;
const _apply = Reflect.apply;

function equalArrayBuffers(a, b) {
    let l1;
    let l2;
    try {
        l1 = _apply(_byteLength, a, []);
        l2 = _apply(_byteLength, b, []);
    } catch { return false }; // something is not an ArrayBuffer
    if (l1 !== l2) return false;
    const vA = new _Uint8Array(a);
    const vB = new _Uint8Array(b);
    for (let i = 0; i < l1; i++) {
        if (vA[i] !== vB[i]) return false;
    }
    return true;
}
const a = new Uint8Array(1);
const b = new Uint8Array(1);
const bufferA = a.buffer;
const bufferB = b.buffer;
equalArrayBuffers(bufferA, bufferB); // true (pure: no user hookable code run)
b[0] = 1; // mutate b
equalArrayBuffers(bufferA, bufferB); // false (changeable)

This operation says that at the time when the operation is run the values are equal, but not that they were always equal in the past or will remain equal in the future. Asserting the current state of values in unit tests is one example of when this category of equality operation can be really useful without requiring consistency. So an API like Object.deepEquals might fall into this category.

C) impure and consistent

An impure but consistent equality operation sounds almost impossible; if the operation has points that can be hooked by userland code wouldn't that mean it can't guarantee consistency? The way to make this operation consistent is to keep a cache of previous results. As an input value is inspected, what is observed can be cached. Then, if the value is compared again the cached observations can be used instead of re-consulting potentially inconsistent operations. This does mean that while the equality may be consistent, the contents of the values may not always corelate with this.

setup:
const cache = new WeakMap();
cache.get = cache.get.bind(cache);
cache.set = cache.set.bind(cache);

function memoizedEqual(a, b) {
    const tA = typeof a;
    const tB = typeof b;
    if (tA !== tB) return false;
    if (tA !== "object" || a === null || b === null) {
     return a === b;
    }
    const sA = cache.get(a) ?? JSON.stringify(a);
    cache.set(a, sA);
    const sB = cache.get(b) ?? JSON.stringify(b);
    cache.set(b, sB);
    return sA === sB;
}
let counter = 0;
const a = {
    x: 1,
    get p() { return counter; } 
};
const b = {
    x: 1,
    get p() { return counter++; } 
};
memoizedEqual(a, b); // true (counter mutated -> impure)
b.x++;
memoizedEqual(a, b); // true (consistent result)
a.x === b.x;         // false (consistent with itself, but not with the values)

A variation on this is for the caching to instead enforce consistency by throwing an exception if the result changes. Technically the operation would not be consistent if it returned true on the first call and then threw an exception on the second call. But normal completions would be guaranteed consistent.

D) impure and changeable

== is an example of an impure and changeable equality operation in JavaScript.

let counter = 0;
const o = {[Symbol.toPrimitive]: () => counter++ };
const v = 1;
o == v; // false (and counter mutated (impure))
o == v; // true (changeable)
o == v; // false

While this may seem undesirable, it can give code authors the most amount of flexibility as equality can be completely user defined.

// Example in Kotlin:
// The '==' operation on objects delegates to the overridable 'equals' method.
// The Kotlin docs say `equals` is expected to be reflexive, symmetric, transitive and consistent but this is not enforced by the language.

fun main() {
    var result = true;
    class Value {
      override fun equals(other: Any?): Boolean {
          val r = result;
          result = !result;
          return r;
      }
    }

    val a = Value();
    val b = Value();

    println(a == b); // true
    println(a == b); // false
    println(a == b); // true
}

https://kotlinlang.org/docs/equality.html

Existing equality in JavaScript

JavaScript values can be divided into primitives ๐Ÿงฑ and objects ๐Ÿ›. For all four current built-in equality operations (StrictlyEqual, LooselyEqual, SameValueZero, and SameValue) primitives are compared by value and objects are compared by reference. Only LooselyEqual is not pure and consistent because when comparing objects to primitives the object is first dynamically coerced into a primitive. For all other operations a type mismatch is never considered equal.

Operations โžก
Types โฌ‡
StrictlyEqual SameValueZero SameValue
๐Ÿงฑ boolean 1 1 1
๐Ÿงฑ number 2 3 1
๐Ÿงฑ string 1 1 1
๐Ÿ› plain objects 1 1 1
๐Ÿ› Date 1 1 1
๐Ÿ› Array 1 1 1
...

Types like boolean and string are compared using the same approach in all four operations. All object types are always compared by reference. There are two additional approaches to comparing number.

  1. SameValue: Object.is - the two values are indistinguishable (ignoring NaN canonicalisation)
  2. IEEE 754: ===, [].indexOf
  3. SameValueZero: Map, Set, [].includes

There are other forms of equality on the horizon. In the Temporal proposal most of the types have an .equals method and due to temporal values being internally immutable the operation is consistent. Though due to supporting implicit conversion it is not symmetric; the type that the method is looked up on sets the domain of comparison.

const pd = Temporal.PlainDate.from('2006-08-24');
const pdt = Temporal.PlainDateTime.from('2006-08-24T03:24:30');
pd.equals(pdt); // true - compare by date only
pdt.equals(pd); // false - compared by date + time

To get symmetric comparison static methods are available so the domain can be selected explicitly:

Temporal.PlainDate.compare(pdt, pd); // 0
Temporal.PlainDate.compare(pd, pdt); // 0

Temporal.PlainDateTime.compare(pdt, pd); // +1
Temporal.PlainDateTime.compare(pd, pdt); // -1

Crucially these functions are specific to Temporal. For a generic equality operation to use these, it would first need to detect that it is working with Temporal types before it could utilize these.

New equality operations for JavaScript

Beyond referential equality there is no generic built-in equality operation for structured data. Structured data is when values further contain more values within them. For example an Array is an object and could contain numbers and strings and even more objects. Here 'generic' means that it is not tied to a domain, Temporal types can be composed together so are somewhat structured but not generic because they are tied to the domain of dates and times.

A core challenge in comparing generic structured data is knowing how to explore the structure, as there is more than one way for an object to contain another value. A simple walk of the structure based on the standard meta-object-protocol (ownKeys, get, etc) can return false positives as it misses private data.

function equals(a, b) {
  if (typeof a !== "object" || typeof b !== "object") {
    return Object.is(a, b);
  }
  if (a === b) return true;
  if (a === null || b === null) return false;
  const keysA = Reflect.ownKeys(a);
  const keysB = Reflect.ownKeys(b);
  if (keysA.length !== keysB.length) return false;
  for (const k of keysA) {
     if (!keysB.includes(k)) return false;
     if (!equals(a[k], b[k])) return false; // Skipping logic for handling cycles for brevity
  }
  return true;
}

equals({ a: { b: 1 } }, { a: { b: 1 } }); // true

class V { 
  #v;
  constructor(v) { this.#v = v; }
  getV() {
    return this.#v;
  } 
}

equals(new V(0), new V(1)); // true

Stepping back a little, to extend our table of equality there are two ways this can be achieved. We can either add new rows, add new columns, or both. Related reading: the Expression Problem

Operations โžก
Types โฌ‡
StrictlyEqual SameValueZero SameValue NEW OPERATION ?
๐Ÿงฑ boolean 1 1 1 ?
๐Ÿงฑ number 2 3 1 ?
๐Ÿงฑ string 1 1 1 ?
๐Ÿ› plain objects 1 1 1 ?
๐Ÿ› Date 1 1 1 ?
๐Ÿ› Array 1 1 1 ?
NEW TYPE ? ? ? ? ?

A new operation?

Perhaps a new column can be added. This could be in the form of a new API Object.deepEquals, or even a new operator ~=. Adding a new column means that new behaviors can be added to existing types. For example it could support comparing two Dates and also two Arrays of Dates. We can also further break this down into two sub-approaches. How does this operation know how to handle each type? Either knowledge about the types can be built directly into the operation or there can be a protocol the operation dictates that types can then fullfil.

// operation knows about types:
function equals1(a, b) {
    switch (type(a)) {
        case 'Date':
            ....;
        case 'Array':
            ...;
    }
}

// the types know about the operation:
Map.prototype[Symbol.equals] = ...;

function equals2(a, b) {
    return a[Symbol.equals](b);
}

// mixture:
function equals3(a, b) {
    return Symbol.equals in a ? equals2(a, b) : equals1(a, b); 
}

If the operation knows about types, this limits its use for user-defined types. If the operation has a symbol protocol this means it can't be pure.

There is also the question of if this operation should be consistent like the majority of the existing operations. If an API like Object.deepEquals was consistent then it would only be able to offer referential equality for mutable objects. So it seems preferential for an API like this to not provide consistency.

A new type?

It is also possible to add new types to the language, which can extend the behaviors of existing operations. For example a new Pair type could be added that holds two generic values new Pair(a, b), and SameValue returns a referential comparison of two Pairs, and LooselyEqual returns true for two Pairs if the two values are LooselyEqual to their corresponding values in the other Pair.

Both?

These are not mutually exclusive, a new operation could be added along with new types designed to be compatible with the new operation.

Records and Tuples

Maps and Sets

A key problem space that R&T is interested in is richer keys. Being able to enable code that could look a little like the following:

const connections = new Set();
connections.add(connection({from: 'a', to: 'b'}));
connections.has(connection({from: 'a', to: 'b'})); // true
for (const {from, to} of connections) {
    // ...
}

Right now R&T solves this by added new types to the language that work with the existing Set equality operation so we get this:

const connections = new Set();
connections.add(#{from: 'a', to: 'b'});
connections.has(#{from: 'a', to: 'b'}); // true
for (const {from, to} of connections) {
    // ...
}

Records are directly usable as rich/structural Set values due to their pure and consistent structural equality. They are like the inverse of the CompositeKeys approach. Once a CompositeKey is created its contents is opaque. To get the values back out of the key a look up table is required. Records on the other hand are transparent about their content and a look up table can be used to make their content opaque.

State change detection

In addition to usage in Maps, another problem-space that R&T tries to help in is modeling state and knowing if the state has changed. Change-detection in UI frameworks&libraries like React, Preact-signals and Svelte tend to rely on the built-in pure and consistent === or Object.is operations.

R&T being comparable with these existing operations means they work out of the box with these types of libraries and frameworks:

import {signal} from "@preact/signals";

const items = signal(Tuple()); // start with an empty list
observe(() => {
  console.log(items.value); // log whenever items changes
});

function reverseItems() {
   items.value = items.value.toReversed(); // because Tuples compare by content this only triggers a change if the order of the values has changed
}

To work out-of-the-box like this R&T need to be comparable with the existing approaches these libraries use: === and Object.is.

Anecdotally when the R&T champions have presented the R&T proposal at conferences the ability to use === to compare these structures is commonly the top feature that developers are excited about.

Deep equality

With core use-cases of R&T being usable for equality comparisons with other R&T, the R&T champions prefer they are deeply compared by value so users can build rich nested structures so they can keep the data organized, while remaining confident that they are also retaining structural-equality and not accidentally introducing referential equality. Exiting the guarantee of structural-equality must be done explicitly by only storing a reference-marker (string | number | symbol) to the data that lives outside of the R&T. The R&T champions believe this helps avoid subtle foot-guns that would be difficult to catch otherwise.

R&T being deeply inert means that walking down into their nested structure is pure and consistent. R&T are also immutable from the moment they are created, rather than starting mutable and being locked later. This means that cycles are an impossibility, a R&T can only directly reference a value that existed before it did. While a deep-equality operation can handle cycles, knowing when cycles can't exist cuts down on the book keeping, simplifying the operation for these types.

In this way R&T go beyond frozen objects. Frozen objects do not guarantee that accessing them is pure, or that getters are consistent, or that they do not contain cycles.

The advantage of new types

Adding new types that work with the existing operations means that they work out-of-the-box with existing data-structures and 3rd party code. Users can continue to work with what they are already familiar with and start to switch to using R&T where it helps improve their code.

The consequence

If records can be compared using existing operations this suggests two things:

  • Comparison of R&T should be pure and consistent
    • === and Map internals expect pure and consistent operations
  • R&T do not create their own unique identity
    • If two values are Object.is equal and typeof "object" then they share an identity

If R&T were object types but without identity then it does not seem safe to support them as WeakMap keys as they can be re-forged out of their contents. This would mean they are object types that are not usable in places where other objects are, which could be confusing.

In JavaScript there is already a category of types that are compared by value in a pure and consistent way and not observably garbage-collectable. They are the primitive non-object types. This is something many JS developers will already learn on their journey.

If R&T are non-object types with their own typeof this means their semantics align with the category of types they reside in.

Alternative new types:

Let's imagine a different approach where R&T do not work out-of-the-box with the existing data structures and 3rd party libraries/frameworks.

const R = Record;
const T = Tuple;
    
const r1 = R({p: T(1)});
const r2 = R({p: T(1)});
assert(typeof r1 === "object");
assert(Object.isFrozen(r1));
assert(r1 !== r2);
assert(R.isRecord(r1));
assert(T.isTuple(r1.p));
assert(R.equals(r1, r2));
assert(!R.equals(r1, {p: T(1)}));

Maps and Sets - alternative design

const connections = new Set([], { newMode: true });
connections.add(R({from: "A", to: "B"}));

connections.has(R({from: "A", to: "B"})); // true
connections.has({from: "A", to: "B"}); // false

A new equality operation (a new column) has been added here, just one that is embedded within the use of this new form of Map and Set. The operation could effectively be a copy of the SameValueZero column, except for a different approach to the R&T rows. One reason to only move away from referential equality for the R&T rows is because Map/Set also requires support for a hashing operation to operate in sub-linear time, and needs to trust the equality and hashing of the types to remain consistent and avoid reentrancy bugs.

State change detection - alternative design

If R&T do not work out-of-the-box with the existing equality operations then to work directly with 3rd party frameworks&libraries users will need to wait for these packages to decide to adopt this and then be updated.

Alternative new types review

This design also adds new types, but does not use this as an opportunity to change the semantics of the existing equality operations which instead continue to use the existing referential equality for structured data. A downside is that there are now two different types of Map and Set that users have to choose between, and in general 1st and 3rd party code has to explicitly add support for comparing R&T, https://immutable-js.com for example would need to be updated to get the benefit of storing R&T within these structures.

Additionally if this new mode for Map and Set only has new special support for R&T, and no further compatible types are added in the future. Then it may have been overkill to add this new mode, rather than have R&T work directly with the existing Map and Set operations. This would be similar to if a new operator Object.deepEquals was added, that initially only has new semantics for R&T and no new types followed. A new equality column was added without much benefit. This suggests that if a new operation is added it should be motivated by differing from an existing operation for more than 2 types.

The R&T proposal is already a relatively large proposal, growing the proposal to also define new equality operations for even more types could make the proposal too large to progress. There is the additional point that an Object.deepEquals API would more ideally not offer consistency, whereas R&T focus on consistency. So while the R&T champions are excited by the potential of an Object.deepEquals API in the future, the focus for R&T is to provide part of the pure-and-consistent deep-equality design space as this has specific use-cases not solved by an impure and changeable deep-equality operation.

Conclusion

Having a proposal that adds new types that work with the existing operations. Means that the equality table adds only rows and not columns. This makes R&T compatible with existing approaches to equality. It also keeps, for now, the existing separation of immutable-content-based-values being primitives, and dynamic-reference-based-values being objects. It adds expressivity to the primitives. R&T being deeply-immutable works well with these equality goals, while also meaning that R&T are useful to people who are only interested in using them for their deep immutability and not intending to make use of the equality operations.

The R&T champions are also confident this does not prevent the potential for exploring new columns that provides deep structural equality for objects, and instead leaves that design area completely open - rather than forcing its hand prematurely. We see leaving the door open, while providing immediate value, as a strength of the current R&T proposal, as JavaScript is a language with strong backwards compatibility guarantees.

Operations โžก
Types โฌ‡
StrictlyEqual SameValueZero SameValue Future ๐Ÿ”ฎ
๐Ÿงฑ boolean 1 1 1 ?
๐Ÿงฑ number 2 3 1 ?
๐Ÿงฑ string 1 1 1 ?
๐Ÿ› plain objects 1 1 1 ?
๐Ÿ› Date 1 1 1 ?
๐Ÿ› Array 1 1 1 ?
R&T 3 3 1 ?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment