Deriving Equatable and Hashable conformance


Proposal: SE-0000
Author(s): Tony Allevato
Status: Awaiting review
Review manager: TBD

Introduction

Developers have to write large amounts of boilerplate code to support
equatability and hashability of complex types. This proposal offers a way for the
compiler to automatically derive conformance to Equatable and Hashable to
reduce this boilerplate, in a subset of scenarios where generating the correct
implementation is known to be possible.
Swift-evolution thread: Universal Equatability, Hashability, and Comparability

Motivation

Building robust types in Swift can involve writing significant boilerplate
code to support hashability and equatability. By eliminating the complexity for
the users, we make Equatable/Hashable types much more appealing to users and
allow them to use their own types in contexts that require equatability and
hashability with no added effort on their part (beyond declaring the
conformance).
Equality is pervasive across many types, and for each one users must implement
the == operator such that it performs a fairly rote memberwise equality test.
As an example, an equality test for a basic struct is fairly uninteresting:
struct Foo: Equatable {
  static func == (lhs: Foo, rhs: Foo) -> Bool {
    return lhs.property1 == rhs.property1 &&
           lhs.property2 == rhs.property2 &&
           lhs.property3 == rhs.property3 &&
           ...
  }
}
What's worse is that this operator must be updated if any properties are added,
removed, or changed, and since it must be manually written, it's possible to get
it wrong, either by omission or typographical error.
Likewise, hashability is necessary when one wishes to store a type in a
Set or use one as a multi-valued Dictionary key. Writing high-quality,
well-distributed hash functions is not trivial, so developers may not put a great
deal of thought into them – especially as the number of properties
increases – not realizing that their performance could potentially suffer
as a result. And as with equality, writing it manually means there is the
potential for it to not only be inefficient, but incorrect as well.
In particular, the code that must be written to implement equality for
enums is quite verbose:
enum Token: Equatable {
  case string(String)
  case number(Int)
  case lparen
  case rparen
  
  static func == (lhs: Token, rhs: Token) -> Bool {
    switch (lhs, rhs) {
    case (.string(let lhsString), .string(let rhsString)):
      return lhsString == rhsString
    case (.number(let lhsNumber), .number(let rhsNumber)):
      return lhsNumber == rhsNumber
    case (.lparen, .lparen), (.rparen, .rparen):
      return true
    default:
      return false
    }
  }
}
Crafting a high-quality hash function for this enum would be similarly
inconvenient to write.
Swift already derives Equatable and Hashable conformance for a small subset
of enums: those for which the cases have no associated values (which includes
enums with raw types). Two instances of such an enum are equal if they are the
same case, and an instance's hash value is its ordinal:
enum Foo {
  case zero, one, two
}

let x = (Foo.one == Foo.two)  // evaluates to false
let y = Foo.one.hashValue     // evaluates to 1
Likewise, conformance to RawRepresentable is automatically derived for enums
with a raw type, and the recently approved Encodable/Decodable protocols
also support synthesis of their operations when possible. Since there is
precedent for derived conformances in Swift, we propose extending it to these
fundamental protocols.
Proposed solution

In general, we propose that a type derive conformance to Equatable/Hashable
if all of its members are Equatable/Hashable. We describe the specific
conditions under which these conformances are derived below, followed by the details
of how the conformance requirements are implemented.
Requesting derivation is opt-in

Users must opt in to automatic derivation by declaring their type as
Equatable or Hashable without implementing any of their requirements.
Any type that declares such conformance and satisfies the conditions below
will cause the compiler to synthesize an implementation of ==/hashValue
for that type.
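As a minimal sketch of the opt-in described above, the type below declares the conformance and implements none of its requirements. The name `Point` and its properties are illustrative and do not come from the proposal.

```swift
// Hypothetical example type: both stored properties are Hashable,
// so declaring the conformance is all that is needed to request derivation.
struct Point: Hashable {
    var x: Int
    var y: Int
}

let a = Point(x: 1, y: 2)
let b = Point(x: 1, y: 2)
let c = Point(x: 3, y: 4)

print(a == b)                         // true: memberwise equality
print(a == c)                         // false
let visited: Set<Point> = [a, b, c]
print(visited.count)                  // 2: a and b collapse into one element
```

Removing `: Hashable` from the declaration would leave `Point` with neither conformance, which is the sense in which derivation is opt-in.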
Making the derivation opt-in provides a number of benefits vs. making it
opt-out:


The syntax for opting in is natural; there is no clear analogue in Swift
today for having a type opt out of a feature.


It requires users to make conscious decisions about the public API surfaced
by their types. Types cannot accidentally "fall into" conformances that the
user does not wish them to; a type that does not initially support Equatable
can be made to at a later date, but the reverse is a breaking change.


The conformances supported by a type can be clearly seen by examining
its source code; nothing is hidden from the user.


We reduce the work done by the compiler and the amount of code generated
by not synthesizing conformances that are not desired and not used.


As will be discussed later, explicit conformance significantly simplifies
the implementation for recursive types.


There is one exception to this rule: enum types with cases that have no
associated values (including those with raw values) will continue to be
Equatable/Hashable without the user explicitly declaring those
conformances. While this does add some inconsistency to enums under this
proposal, changing this existing behavior would be source-breaking. The
question of whether such enums should be required to opt in as well can
be revisited at a later date if so desired.
Overriding derived conformances

Any user-provided implementations of == or hashValue will override the
default implementations that would be provided by the compiler.
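The following sketch shows a user-provided == taking precedence over what would otherwise be a memberwise comparison. The `Username` type and its case-insensitive rule are assumptions chosen for illustration only.

```swift
// Hypothetical type whose notion of equality differs from memberwise equality.
struct Username: Equatable {
    var raw: String

    // User-provided ==: because this requirement is implemented explicitly,
    // there is nothing left for the compiler to synthesize.
    static func == (lhs: Username, rhs: Username) -> Bool {
        return lhs.raw.lowercased() == rhs.raw.lowercased()
    }
}

// A memberwise comparison of `raw` would report these as different;
// the custom operator treats them as equal.
let sameUser = Username(raw: "Tony") == Username(raw: "tony")
let otherUser = Username(raw: "Tony") == Username(raw: "Joe")
```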
Protocol derivability conditions

For brevity, let P represent either the protocol Equatable or Hashable in
the descriptions below.
Derived conformances for enums

For an enum, derivability of P is based on the conformances of its
cases' associated values. Computed properties are not considered.
The following rules determine whether conformance to P can be derived for
an enum:


An enum with no cases does not derive conformance to P, since it is not
possible to create instances of such types.


An enum with one or more cases derives conformance to P if all of the
associated values of all of its cases conform to P.
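Applied to the Token enum from the Motivation section, the enum rules above could look like the sketch below. The `Opaque` and `Wrapper` types are hypothetical and exist only to show a payload that does not conform.

```swift
// Every associated value (String, Int) is Hashable, so declaring the
// conformance is sufficient; no == or hashValue is written by hand.
enum Token: Hashable {
    case string(String)
    case number(Int)
    case lparen
    case rparen
}

// Hypothetical payload type with no Equatable/Hashable conformance.
struct Opaque {}

// Under the rules above, adding `: Equatable` to Wrapper could not be
// satisfied by derivation, because its payload type does not conform.
enum Wrapper {
    case value(Opaque)
}

let sameString = Token.string("a") == Token.string("a")
let differentNumbers = Token.number(1) == Token.number(2)
let differentCases = Token.lparen == Token.rparen
let uniqueTokens: Set<Token> = [.lparen, .lparen, .number(1), .string("a")]
```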


Derived conformances for structs and classes

For a struct or a class, derivability of P is based on the conformances
of its stored instance properties only. Neither static properties nor computed
instance properties (those with custom getters) are considered.
The following rules determine whether conformance to P can be derived for a
struct or class:


A struct or class with no stored properties does not derive conformance
to P. (Even though it is vacuously true that all instances of a type with no
stored properties could be considered equal and hash to the same value, the
reality is that such types are more often used for grouping/nesting of
other entities and not for their singular value, and we don't consider it
worthwhile to generate extra code in this case.)


A struct or a class with one or more stored properties derives conformance to
P if the types of all of its stored properties conform to P.


A class only derives P if it has no superclass, or if its superclass also
conforms to P. (Subclasses do not need to redeclare conformances since they
are inherited, so this point refers to the ability to synthesize the
implementation.)
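The sketch below illustrates the stored-property rule for a struct only; it does not assume anything about how the class rules are implemented. `Circle` and its members are hypothetical. The computed property deliberately has a type (a closure) that is not Equatable, to show that it plays no part in derivability.

```swift
struct Circle: Equatable {
    // Static property: not considered when deriving the conformance.
    static let unit = Circle(centerX: 0, centerY: 0, radius: 1)

    // Stored instance properties: these are what == compares.
    var centerX: Double
    var centerY: Double
    var radius: Double

    // Computed instance property of a non-Equatable type: not considered,
    // so it does not block derivation.
    var describe: () -> String {
        let r = radius
        return { "circle of radius \(r)" }
    }
}

let matchesUnit = Circle(centerX: 0, centerY: 0, radius: 1) == Circle.unit
let largerCircle = Circle(centerX: 0, centerY: 0, radius: 2) == Circle.unit
```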


Considerations for recursive types

By making the derived conformance opt-in, recursive types (such as indirect
enums, classes that directly or indirectly reference instances of
themselves, and structs involved in such cycles) have their synthesis fall
into place with no extra effort.
In any cycle belonging to a recursive type, every type in that cycle must
declare its conformance explicitly. If a type does so but cannot have its
conformance derived because it does not satisfy the conditions above, then
it is simply an error for that type and not something that must be detected
by the compiler in order to reason about all the other types involved.
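A small sketch of the recursive case: the indirect enum below refers to itself in its payloads, and the explicit declaration is what lets the self-reference be treated as already conforming. The `Expr` type is a hypothetical example, not taken from the proposal.

```swift
// Hypothetical recursive type. Because Expr declares Equatable explicitly,
// the Expr payloads in `add` and `negate` can be treated as conforming
// without a separate analysis of the cycle.
indirect enum Expr: Equatable {
    case number(Int)
    case add(Expr, Expr)
    case negate(Expr)
}

let lhs = Expr.add(.number(1), .negate(.number(2)))
let rhs = Expr.add(.number(1), .negate(.number(2)))
let other = Expr.add(.number(1), .number(2))

let equalTrees = (lhs == rhs)
let unequalTrees = (lhs == other)
```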
Other considerations

Conditional conformances
will allow generic types to conditionally derive Equatable and Hashable
for type argument substitutions where the rules above are satisfied.
For example, the standard library would be able to write the following:
extension Optional: Hashable where Wrapped: Hashable {}
Since Optional is an enum that satisfies the above requirements when
Wrapped is Hashable, the implementations of == and hashValue can
be derived in that extension by the compiler.
Conditional conformances will also significantly improve derivability
coverage over other payload/member types. For example, consider a struct
containing a stored property that is an array of Equatable types:
struct Foo: Equatable {
  var values: [String]
}
Today, Array<String> does offer a global == but does not conform to
Equatable, so its presence would prohibit Foo from deriving Equatable.
However, with conditional conformances in place, Foo would automatically
derive Equatable as well. This makes derived conformances significantly
more powerful.
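As a sketch of both points, assuming a toolchain where conditional conformances are available and where derivation is permitted in an extension in the same file as the type, a generic type and a type with an array-valued property might be written as follows. `Pair` and `Inventory` are hypothetical names.

```swift
// Hypothetical generic type with no unconditional conformance.
struct Pair<First, Second> {
    var first: First
    var second: Second
}

// Conditional conformance with an empty body: == is derived only for
// substitutions where both type arguments are themselves Equatable.
extension Pair: Equatable where First: Equatable, Second: Equatable {}

// With Array conditionally Equatable, an array-valued stored property
// no longer prevents derivation.
struct Inventory: Equatable {
    var values: [String]
}

let pairsMatch = Pair(first: 1, second: "a") == Pair(first: 1, second: "a")
let pairsDiffer = Pair(first: 1, second: "a") == Pair(first: 2, second: "a")
let inventoriesMatch = Inventory(values: ["x", "y"]) == Inventory(values: ["x", "y"])
```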
Implementation details

An enum T that derives Equatable will receive a compiler-generated
implementation of static func == (lhs: T, rhs: T) -> Bool that returns
true if and only if lhs and rhs are the same case and have payloads
that are memberwise-equal.
An enum T that derives Hashable will receive a compiler-generated
implementation of var hashValue: Int { get } that uses an unspecified hash
function^† to compute the hash value by incorporating the case's
ordinal (i.e., definition order) followed by the hash values of its associated
values as its terms, also in definition order.
A struct T that derives Equatable will receive a compiler-generated
implementation of static func == (lhs: T, rhs: T) -> Bool that returns
true if and only if lhs.x == rhs.x for all stored properties in T.
A struct T that derives Hashable will receive a compiler-generated
implementation of var hashValue: Int { get } that uses an unspecified hash
function^† to compute the hash value by incorporating the hash values
of the fields as its terms, in definition order.
A class T that derives Equatable will receive a compiler-generated
implementation of static func == (lhs: T, rhs: T) -> Bool that returns
true if and only if lhs.x == rhs.x for all stored properties in T,
and if T has a superclass S, then lhs as S == rhs as S as well.
A class T that derives Hashable will receive a compiler-generated
implementation of var hashValue: Int { get } that uses an unspecified hash
function^† to compute the hash value by incorporating the hash value
of the superclass followed by the hash values of its fields as its terms, in
definition order.
^† We intentionally leave the exact definition of the hash function
unspecified here. A multiplicative hash function with good distribution is the
likely candidate, but we do not rule out other possibilities. Users should not
depend on the nature of the generated implementation or rely on particular
outputs; we reserve the right to change it in the future.
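To make the enum hashing description concrete, here is a hand-written sketch of one way the terms could be combined. Since the hash function is left unspecified, everything about the combiner is an assumption: the seed 17, the multiplier 31, and the names `combine` and `sketchedHash` are all hypothetical. The property is deliberately not named hashValue so that it is not mistaken for the derived implementation.

```swift
enum Token {
    case string(String)
    case number(Int)
    case lparen
    case rparen
}

// Illustrative multiplicative combiner (an assumption, not the proposal's
// function). Overflow operators keep the arithmetic from trapping.
func combine(_ accumulated: Int, _ term: Int) -> Int {
    return accumulated &* 31 &+ term
}

extension Token {
    // Hypothetical stand-in for a derived hash: the case ordinal is
    // incorporated first, then the payload hashes in definition order.
    var sketchedHash: Int {
        let seed = 17
        switch self {
        case .string(let value):
            return combine(combine(seed, 0), value.hashValue)
        case .number(let value):
            return combine(combine(seed, 1), value.hashValue)
        case .lparen:
            return combine(seed, 2)
        case .rparen:
            return combine(seed, 3)
        }
    }
}
```

Including the ordinal means two payload-free cases such as lparen and rparen produce different values, and equal payloads under the same case always produce the same value within a run.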
Impact on existing code

By making the conformance opt-in, this is a purely additive change that does
not affect existing code. We also avoid source-breaking changes by not changing
the behavior for enums with no associated values, which will continue to
derive Equatable and Hashable even without explicitly declaring the
conformance.
Alternatives considered

Omitting fields from synthesized conformances

Some commenters have expressed a desire to tag certain properties of a struct
so that they are excluded from automatically generated equality tests or hash value
computations. This could be valuable, for example, if a property is merely used
as an internal cache and does not actually contribute to the "value" of the
instance. Under the rules above, if this cached value were equatable, a user would
have to override == and hashValue and provide their own implementations to
ignore it.
Such a feature, which could be implemented with an attribute such as @transient,
would likely also play a role in other protocols like Encodable/Decodable.
This could be done as a purely additive change on top of this proposal, so we
propose not doing this at this time.
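Without such an attribute, the workaround described above might look like the following sketch for the equality half. The `Document` type and its `cachedWordCount` property are hypothetical; a Hashable type would need a matching hand-written hash that also skips the cache.

```swift
// Hypothetical type with a cache that is not part of its "value".
struct Document: Equatable {
    var title: String
    var body: String
    var cachedWordCount: Int?   // internal cache; should not affect equality

    // Hand-written == that skips the cache. Every non-cache property has
    // to be listed here and kept in sync as the type evolves.
    static func == (lhs: Document, rhs: Document) -> Bool {
        return lhs.title == rhs.title && lhs.body == rhs.body
    }
}

let fresh = Document(title: "Notes", body: "one two", cachedWordCount: nil)
var warmed = fresh
warmed.cachedWordCount = 2

// The two values differ only in the cache, so they compare equal.
let equalDespiteCache = (fresh == warmed)
```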
Explicit or implicit derivation

An earlier draft of this proposal made derived conformances implicit (without
declaring Equatable/Hashable explicitly). This has been changed because—in
addition to the reasons mentioned earlier in the proposal—Encodable/Decodable
provide a precedent for having the conformance be explicit. More importantly,
however, determining derivability for recursive types is significantly more
difficult if conformance is implicit, because it requires examining the entire
dependency graph for a particular type and properly handling cycles in order to
decide if the conditions are satisfied.
Support for Comparable

The original discussion thread also included Comparable as a candidate for
automatic generation. Unlike equatability and hashability, however,
comparability requires an ordering among the members being compared.
Automatically using the definition order here might be too surprising for users,
but worse, it also means that reordering properties in the source code changes
the code's behavior at runtime. (This is true for hashability as well if a
multiplicative hash function is used, but hash values are not intended to be
persistent and reordering the terms does not produce a significant behavioral
change.)
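The concern about definition order can be illustrated with two hand-written comparisons that mimic what an order-based derivation might produce. The types `VersionA` and `VersionB` are hypothetical and differ only in the order in which their properties are declared.

```swift
// Properties declared major-first; < compares in that order.
struct VersionA: Comparable {
    var major: Int
    var minor: Int

    static func < (lhs: VersionA, rhs: VersionA) -> Bool {
        return (lhs.major, lhs.minor) < (rhs.major, rhs.minor)
    }
}

// Same properties declared minor-first; < follows the new order.
struct VersionB: Comparable {
    var minor: Int
    var major: Int

    static func < (lhs: VersionB, rhs: VersionB) -> Bool {
        return (lhs.minor, lhs.major) < (rhs.minor, rhs.major)
    }
}

// Version 1.9 versus version 2.0 under each ordering.
let majorFirst = VersionA(major: 1, minor: 9) < VersionA(major: 2, minor: 0)   // true
let minorFirst = VersionB(minor: 9, major: 1) < VersionB(minor: 0, major: 2)   // false
```

The same pair of versions orders differently depending only on where the properties sit in the source, which is the runtime behavior change the paragraph above warns about.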
Acknowledgments

Thanks to Joe Groff for spinning off the original discussion thread, Jose Cheyo
Jimenez for providing great real-world examples of boilerplate needed to support
equatability for some value types, and to Mark Sands for necromancing the
swift-evolution thread that convinced me to write this up.

Rationale

On [Date], the core team decided to (TBD) this proposal.
When the core team makes a decision regarding this proposal,
their rationale for the decision will be written here.