Skip to content

Instantly share code, notes, and snippets.

@countvajhula
Last active March 10, 2022 17:20
Show Gist options
  • Save countvajhula/bf4041e4ae5e2feb7ad4b9631e2cf734 to your computer and use it in GitHub Desktop.
Save countvajhula/bf4041e4ae5e2feb7ad4b9631e2cf734 to your computer and use it in GitHub Desktop.
RRFI: Equality and Order Relations Interface

NOTE: THIS DOCUMENT HAS BEEN SUPERSEDED BY RRFI [Draft]: Two-level Universal Scheme for Equality

Proposal

A. Equality

Summary

Introduce an equality relation = that leverages a core notion of equality (egal? / equal-always?) and additionally supports a #:key optional argument.

Substance
New interfaces
  • =
  • hash-code
  • secondary-hash-code

= is egal? + an optional #:key argument. The key argument could be any unary, single-valued function. hash-code and secondary-hash-code are analogous to the similarly-named functions for the existing equality relations (e.g. equal-hash-code).

Extending Equality to Custom Types

Retain gen:equal+hash, possibly with modifications (see below). Even though, strictly speaking, only a unary key argument is needed in the defining procedure here, this proposal recommends retaining the existing binary comparison predicate since it could be more intuitive while defining equality for custom types, and especially for consistency with gen:orderable (see below).

Review the consideration mentioned in the docs re: "data cycles," and depending on that, possibly eliminate the need to provide a custom binary predicate to equal-proc here. Likewise, possibly eliminate the need for a custom hash procedure argument in the hash-proc method.

B. Order

Summary
  1. Make the order relations <, >, <=, and >= generic, so that they work on any datatype.
  2. Support a #:key optional argument, for symmetry with =.
Substance
New Interfaces
<
>
<=
>=
max
min

Generic versions of the usual numeric relations. These all support an optional #:key.

Extending Order to Custom Types

Add a new generic interface, gen:orderable with the method:

less-than?

This is a binary comparison predicate that returns true if a is strictly less than b, where a and b are the arguments and are expected to be of the same type. Both < and > can be inferred from the specification of this method. The other two relations <= and >= would additionally rely on the specification of gen:equal+hash or fall back to the core definition of equality (e.g. egal?).

C. API Design

Order and equality-related APIs should always accept a #:key optional argument. They should not accept a binary comparison predicate.

For order relations this is insufficient, as Sorawee pointed out, but we can always implement gen:orderable here if a simple key is inadequate to express the ordering. That way, APIs -- and the API convention -- can be kept lean, while still offering the same level of functionality and even encouraging good data discipline, i.e. writing a custom type and specifying a definition for the order relation for this new type vs providing a complicated comparison relation inline.

Rationale

Simplicity

This yields the simplest possible equality interface that a language could provide, making it easy for users to use. By enabling a uniform interface that is available everywhere, in equality comparisons as well as in equality-related APIs, it also reduces the size of the language used in practice allowing it to "fit in one's head," and minimizes the need to lookup documentation that the presence of ad hoc patterns encourages.

Genericity

The equality relation in Racket allows us to compare values of any type, but the order relations only allow numeric comparisons. Order relations are also widely used on other data types, and at the moment this results in widespread type-specific duplication of such interfaces, e.g. string<?. By making the standard order relations generic, and additionally supporting the #:key argument, this greatly improves their utility and convenience, and also once again results in reduction in the size of the language and the presence of ad hoc patterns.

Maintainability

The proposals reduce the size of the language, eliminating the need for many ad hoc interfaces and data structures in favor of a common and universal pattern. The presence of a standard pattern should mean a smaller surface area to optimize and maintain, and could mean that a wider swath of the language benefits from improvements and optimizations made in this broadly shared component.

Affected Standard Library APIs

This lists standard APIs that would be either added, removed (that is, deprecated rather than actually removed!), or modified.

Equality

Remove
eq?
eqv?
equal?
equal?/recur
equal-hash-code
equal-secondary-hash-code
eq-hash-code
eqv-hash-code
Keep/Modify
gen:equal+hash

Strings, Bytes, Characters, Symbols

Remove
string=?
string<?
string<=?
string>?
string>=?
string-ci=?
string-ci<?
string-ci<=?
string-ci>?
string-ci>=?
bytes=?
bytes<?
bytes>?
char=?
char<?
char<=?
char>?
char>=?
char-ci=?
char-ci<?
char-ci<=?
char-ci>?
char-ci>=?
symbol<?
keyword<?
Likely Remove
string-locale=?
string-locale<?
string-locale>?
string-locale-ci=?
string-locale-ci<?
string-locale-ci>?

Pairs and Lists

Add
find
Remove
remq
remv
remq*
remv*
memv
memq
memf
findf
assv
assq
assf
argmin
argmax
Keep / Modify
remove [b]
remove* [b]
member [b]
assoc [b]
sort [kb]
index-of [b]
indexes-of [b]
list-prefix? [b]
take-common-prefix [b]
drop-common-prefix [b]
split-common-prefix [b]
check-duplicates [kb]
remove-duplicates [kb]
group-by [kb]

Legend: [b] means the interface currently accepts a binary comparison predicate. [k] means it accepts a key argument, and [kb] means it accepts both, and [-] means it accepts neither. In all cases, the proposal is to accept only the key argument via #:key.

Note that other interfaces like index-where are outside scope here and unaffected by this proposal, so they will not be in either the "keep" or "remove" sections here.

Vectors

Remove
vector-argmin
vector-argmax
vector-memv
vector-memq
Keep/Modify
vector-member [-] [cf. generic collections]
vector-sort [kb] [cf. generic collections]
vector-sort! [kb] [cf. generic collections]

Hash Tables

Remove
hash-equal? [why does this exist vs just equal?]
hash-eqv?
hash-eq?
hasheq
hasheqv
make-hasheqv
make-hasheq
make-weak-hasheqv
make-weak-hasheq
make-ephemeron-hasheqv
make-ephemeron-hasheq
make-immutable-hasheqv
make-immutable-hasheq
Likely Remove
hash-ref-key [what is this?]
Keep / Modify
make-hash [-]
make-weak-hash [-]
make-ephemeron-hash [-]
make-immutable-hash [-]
hash-union [TBD]
hash-union! [TBD]
hash-intersect [TBD]

Currently, hash-union can union across different equality relations e.g. hash and hasheq. For comparison, set-union does not allow this.

Proposed handling, either:

  1. Union could be defined as "equal under any key."
  2. Intersection could be defined as "equal under all keys."
  3. Or we don't allow it.

Dictionaries

Keep / Modify
dict-remove [cf. generic collections]
dict-remove! [cf. generic collections]

Sets

Remove
set-equal?
set-eqv?
set-eq?
seteqv
seteq
mutable-seteqv
mutable-seteq
weak-seteqv
weak-seteq
set-mutable?
set-weak?
list->seteqv
list->seteq
list->mutable-seteqv
list->mutable-seteq
list->weak-seteqv
list->weak-seteq
for/seteq
for/seteqv
for*/seteq
for*/seteqv
for/mutable-seteq
for/mutable-seteqv
for*/mutable-seteq
for*/mutable-seteqv
for/weak-seteq
for/weak-seteqv
for*/weak-seteq
for*/weak-seteqv
set=? [why does this exist vs just equal?]
Likely Remove
define-custom-set-types
make-custom-set-types

It seems that these are mainly about a custom equality predicate, which #:key provides already.

Keep / Modify
set
mutable-set
weak-set
set?
list->set
list->mutable-set
list->weak-set
for/set
for*/set
for/mutable-set
for*/mutable-set
for/weak-set
for*/weak-set
set-member? [cf. generic collections]
set-union [TBD]
set-union! [TBD]
set-intersect [TBD]
set-intersect! [TBD]
set-subtract [TBD]
set-subtract! [TBD]
set-symmetric-difference [TBD]
set-symmetric-difference! [TBD]
subset? [TBD]
proper-subset? [TBD]

Union and intersection and other set operations in the presence of different definitions of equality remains to be defined. (see Hash Tables above)

Note: Interfaces like hash-remove and set-remove don't need to accept #:key since this is already baked into the data structure.

Procedures

Remove
arity=?

Structures

See Extending Equality to Custom Types.

Classes and Objects

The considerations for equal<%> are similar to gen:equal+hash, see Extending Equality to Custom Types. But also note that objects are mutable, and egal? may advocate an eq?-like comparison here.

Contracts

Keep / Modify
=/c
</c
>/c
<=/c
>=/c

Make these generic, and accept the #:key argument.

Pattern Matching

Keep / Modify
==

Support the #:key argument.

Macros

TBD
bound-identifier=?
free-identifier=?
free-transformer-identifier=?
free-template-identifier=?
free-label-identifier=?
check-duplicate-identifier

Operating System

Remove
path<?

The generic < may be used here. TBD: where should the type-specific behavior be documented for generic order relations? E.g. path<? does ->bytes followed by bytes<?.

Sample Implementation

The Racket collections relation/equivalence and relation/order exhibit some of the behavior in this proposal, but also differ in other respects (the present proposal takes precedence where they differ).

Acknowledgements

Many people were involved at various stages of the discussions on this topic, including:

  • Jack Firth, who brought the topic up for discussion.
  • Sorawee Porncharoenwase, who pointed out via a counterexample that the #:key argument on its own is insufficient for order relations.
  • Alex Knauth, who championed and wrote an implementation for egal? / equal-always?
  • Matthew Flatt, who brought up implementation considerations.
  • (among others)

See the discussions at racket/rhombus-prototype#16 and racket/rhombus-prototype#180 for more context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment