{{ message }}

Instantly share code, notes, and snippets.

# stephentyrone/minmax.md

Last active Sep 15, 2017
Notes on how min/max might be defined.

# Defining `min` and `max`

Most definitions of floating-point `min` and `max` are badly flawed; this note explains what axioms these functions should have, illustrates the problems with existing definitions, and explores the options for fixing the situation.

Note: `min` and `max` should be symmetric, so it suffices to restrict our attention to one of them. For most of the rest of this document, I'll explain everything using `min`. The appropriate modifications to get `max` should generally be pretty clear.

## Desired axioms

The first two axioms capture the meaning of minimum; they will likely seem completely obvious (possible the point of absurdity).

1. `min(a, b)` is either `a` or `b`.

2. If `x = min(a, b)`, then `x !> a` and `x !> b` (read `!>` as "not greater than").0

An immediate consequence of these two axioms is that if `a < b`, then `min(a, b) = a` and vice-versa. The behavior is not yet fully-specified, however; we have not defined the result if `a == b`1 or `a` and `b` are incomparable2.

With only these two axioms, a skeleton pseudo-code implementation of `min(a,b)` looks like the following:

``````T min(T a, T b) {
if a < b {
return a
}
else if b < a {
return b
}
else {
// behavior is undefined
}
}
``````

The next two axioms we would like to have make it easier to write bug-free programs and efficient reductions, and allow compilers to make useful simplifications on `min` and `max`:

1. `min(a, b) = min(b, a)` (`min` is commutative)
2. `min(a, min(b, c)) = min(min(a, b), c)` (`min` is associative)

For the mathematically-inclined, these two properties make the set of floating-point numbers together with `min` into a "commutative semigroup". For the not-mathematically-inclined, why do we care? Because we eventually want to be able to take the `min` of not just two numbers, but of entire sets or arrays of numbers.

These two properties let us build up the minimum over a set out of pairwise operations; no matter what order the elements appear in, or how they are bracketed, we will get the same result, which allows us to define the operation unambiguously, and allows a compiler to re-arrange the operations for maximum efficiency.

We noted above that axioms 1 and 2 fully specified the function if `a < b` or `a > b`. What do axioms 3 and 4 tell us about the remaining cases? Axiom 3 tells us that we can't define `min` to do something like return `a` when `a` and `b` are incomparable. Instead, we need to define the result without appealing to the operand order.

How do we pick a result, then? One natural choice is what IEEE 754 does in `minNum(a, b)`; if one of `a` or `b` is NaN and the other is a number, the number is returned (754 messed this up by having different behavior for sNaN, see below for details). Another natural choice is to have `min(a, b)` return NaN if either `a` or `b` is NaN. Still a third option would be to make `min` trap on NaN inputs, or statically require that programs prove both inputs are non-NaN before calling `min`.

## Notes

0: Why don't I just say "less than or equal to" and use `<=`? Because trichotomy doesn't hold for floating-point numbers (see 2).

1: In IEEE 754 binary floating-point, there are two ways for `a` and `b` to be equal but not actually be "the same": first, +0 and -0 have different representations but compare equal. Second, some formats admit non-canonical encodings--multiple bit representations of a single number. The only widely used case where this happens are x87 pseudo denormals. In IEEE 754 decimal floating-point, we will also need to deal with cases like `a = 1.0e1`, `b = 10.0e0`.

2: If `a` is NaN, then `a == b`, `a < b`, and `a > b` are all false for any `b`, even if `b` is the same as `a`.