@leto
Created August 28, 2009 10:25
NaN spec in ieee754-2008
9.2 Invalid operation 9.2.0
The invalid operation exception is signaled if and only if there is no usefully definable result. In these cases
the operands are invalid for the operation to be performed.
For operations producing results in floating-point format, the default result of an invalid exception operation
shall be a quiet NaN that should provide some diagnostic information (see 8.2). Such invalid exception
operations in this standard are:
a) any general-computational or signaling-computational operation on a signaling NaN (see 8.2);
b) multiplication: multiplication(0, ∞) or multiplication(∞, 0);
c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c) unless c is a quiet NaN;
if c is a quiet NaN then it is implementation defined whether the invalid operation exception is
signaled;
d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of infinities, such as: addition
(+∞, –∞);
e) division: division(0, 0) or division(∞, ∞);
f) remainder: remainder(x,y), where y is zero or x is infinite and neither is NaN;
g) squareRoot if the operand is less than zero;
h) quantize when the result does not fit in the destination format or when one operand is finite and the
other is infinite.
For operations producing no result in floating-point format, the invalid exception operations are:
i) conversion of an internal floating-point number to an integer format, when the source is NaN,
infinity, or a value which would convert to an integer outside the range of the result format under the
prevailing rounding mode.
j) comparison by way of unordered-signaling predicates listed in Table 9, when the operands are
unordered;
k) logB(NaN), logB(∞), and logB(0) when logBFormat is an integer format (see 7.3.3).
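A few of the cases above can be observed from Python. This is only a sketch, assuming CPython's float is IEEE binary64 (true on common platforms). Python does not expose the invalid operation flag; it returns a quiet NaN in some cases and raises a Python exception in others, which is language behavior and not something the standard requires. The helper name conversion_error is made up for this illustration.

```python
import math

inf = float("inf")

# Cases b), d), e): the default result is a quiet NaN.
invalid_results = {
    "multiplication(0, inf)": 0.0 * inf,
    "addition(+inf, -inf)": inf + (-inf),
    "division(inf, inf)": inf / inf,
}
all_nan = all(math.isnan(v) for v in invalid_results.values())


def conversion_error(value):
    """Return the name of the exception raised by int(value), or None."""
    try:
        int(value)
    except (ValueError, OverflowError) as exc:
        return type(exc).__name__
    return None


# Case i): conversion to an integer format from NaN or infinity.
nan_to_int = conversion_error(float("nan"))
inf_to_int = conversion_error(inf)
finite_to_int = conversion_error(3.0)
```

In CPython, division(0, 0) and squareRoot of a negative operand raise ZeroDivisionError and ValueError respectively instead of returning NaN, so they are left out of the dictionary.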
8.3 The sign bit 8.3.0
When either an input or result is NaN, this standard does not interpret the sign of a NaN. Note however that
operations on bitstrings – copy, negate, abs, copySign – specify the sign bit of a NaN result, sometimes based
upon the sign bit of a NaN operand. The logical predicate totalOrder is also affected by the sign bit of a NaN
operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there
is only one input NaN, or when the NaN is produced from an invalid operation.
When neither the inputs nor result are NaN, the sign of a product or quotient is the exclusive OR of the
operands' signs; the sign of a sum, or of a difference x–y regarded as a sum x+(–y), differs from at most one of
the addends' signs; and the sign of the result of the roundToIntegral operations and roundToIntegralExact (see
7.3.1) is the sign of the operand. These rules shall apply even when operands or results are zero or infinite.
When the sum of two operands with opposite signs (or the difference of two operands with like signs) is
exactly zero, the sign of that sum (or difference) shall be +0 in all rounding direction modes except
roundTowardNegative; in that mode, the sign of an exact zero sum (or difference) shall be –0. However, x+x
= x–(–x) retains the same sign as x even when x is zero.
When (a×b)+c is exactly zero, the sign of fusedMultiplyAdd(a, b, c) shall be determined by the rules above
for a sum of operands. When the exact result of (a×b)+c is nonzero yet the result of fusedMultiplyAdd is
zero because of rounding, the zero result takes the sign of the exact result.
Except that squareRoot(–0) shall be –0, every valid squareRoot shall have a positive sign. !!!
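The sign rules above can be checked with a short Python sketch, assuming CPython floats are binary64 and the default round-to-nearest mode is in effect (Python offers no portable way to switch to roundTowardNegative, so that case is not shown). The helper sign_bit is a name introduced here for illustration.

```python
import math


def sign_bit(x):
    """Return True if the sign bit of x is set (works for zeros and NaNs)."""
    return math.copysign(1.0, x) < 0


# The sign of a product is the exclusive OR of the operand signs,
# even when an operand or the result is zero.
product_negative = sign_bit(-0.0 * 5.0)
product_positive = sign_bit(-0.0 * -5.0)

# An exactly zero sum of opposite-signed operands is +0 in this rounding mode.
exact_zero_sum = sign_bit(1.0 + -1.0)

# x + x keeps the sign of x even when x is zero.
neg_zero_doubled = sign_bit(-0.0 + -0.0)

# squareRoot(-0) is -0.
sqrt_neg_zero = sign_bit(math.sqrt(-0.0))

# copySign specifies the sign bit of a NaN result.
negative_nan = math.copysign(float("nan"), -1.0)
nan_sign_set = sign_bit(negative_nan)
```

Comparing against 0.0 would not work here because -0.0 == 0.0 is true; copysign is used to read the sign bit directly.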
The representation r of the floating-point datum, and value v of the floating-point datum represented, are
inferred from the constituent fields, thus:
a) If G0 through G4 are 11111, then v is NaN regardless of S. Furthermore, if G5 is 1, then r is sNaN;
otherwise r is qNaN. The remaining bits of G are ignored, and T constitutes the NaN's payload,
which can be used to distinguish various NaNs.
The NaN payload is encoded similarly to finite numbers described below, with G treated as though
all bits were zero. The payload corresponds to the significand of finite numbers, interpreted as an
integer with a maximum value of 10^(3×J) − 1, and the exponent field is ignored (it is treated as if it
were zero). A NaN is in its preferred (canonical) representation if the bits G6 through G_(w+4) are zero
and the encoding of the payload is canonical.
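Rule a) can be sketched for a single concrete format. The example below assumes decimal64, with a 1-bit sign S, a 13-bit combination field G (G0 is its most significant bit) and a 50-bit trailing significand T; those widths and the function name are assumptions of this illustration, not part of the excerpt. It only inspects G0 through G5 and does not decode the payload.

```python
COMBINATION_BITS = 13   # width of G for decimal64 (assumed)
TRAILING_BITS = 50      # width of T for decimal64 (assumed)


def classify_decimal64_nan(bits):
    """Classify a 64-bit integer holding a decimal64 encoding.

    Returns "sNaN", "qNaN", or "not NaN". The sign bit S is ignored.
    """
    g = (bits >> TRAILING_BITS) & ((1 << COMBINATION_BITS) - 1)
    g0_to_g4 = g >> (COMBINATION_BITS - 5)
    if g0_to_g4 != 0b11111:
        return "not NaN"
    g5 = (g >> (COMBINATION_BITS - 6)) & 1
    return "sNaN" if g5 else "qNaN"


quiet = classify_decimal64_nan(0x7C00_0000_0000_0000)
signaling = classify_decimal64_nan(0x7E00_0000_0000_0000)
negative_quiet = classify_decimal64_nan(0xFC00_0000_0000_0000)
infinity = classify_decimal64_nan(0x7800_0000_0000_0000)
```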
class(x) tells which of the following ten classes x falls into:
signalingNaN
quietNaN
...
The bit pattern in a NaN significand can affect how the NaN is propagated.
When a NaN operand cannot be represented in the destination format and this cannot otherwise be indicated,
the invalid exception shall be signaled. When a numeric operand would convert to an integer outside the
range of the destination format, the invalid exception shall be signaled if this situation cannot otherwise be
indicated.
7.10 Details of totalOrder predicate
7.10.0
For each supported non-storage floating-point format, an implementation shall provide certain predicates that
define orderings among all operands in a particular format.
totalOrder(x,y) imposes a total ordering on canonical members of the format of x and y;
a) if x < y, totalOrder(x, y) is true
b) if x > y, totalOrder(x, y) is false
c) if x = y:
1) totalOrder(−0, +0) is true
2) totalOrder(+0, −0) is false
3) if x and y represent the same floating-point datum:
i) if x and y have negative sign,
totalOrder(x, y) is true if and only if the exponent of x ≥ the exponent of y
ii) otherwise
totalOrder(x, y) is true if and only if the exponent of x ≤ the exponent of y
Note that totalOrder does not impose a total ordering on all encodings in a format. In particular
it does not distinguish among different encodings of the same floating-point representation, as
when one or both encodings are non-canonical.
d) if x and y are unordered numerically because x or y is NaN:
1) totalOrder(−NaN, y) is true where −NaN represents a NaN with negative sign bit and y is a
floating-point number.
2) totalOrder(x, +NaN) is true where +NaN represents a NaN with positive sign bit and x is a
floating-point number.
3) if x and y are both NaNs, then totalOrder reflects a total ordering based on
i) negative sign bit < positive sign bit
ii) signaling < quiet for +NaN, reverse for −NaN
iii) lesser payload < greater payload for +NaN, reverse for −NaN
Neither signaling nor quiet NaNs signal an exception.
For canonical x and y, totalOrder(x,y) and totalOrder(y,x) are both true only if x and y are bitwise identical.
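For a binary format the ordering above can be sketched by comparing encodings as sign-magnitude integers. The example assumes binary64, where every datum has a single encoding (so rule c.3 never applies), and assumes the common convention that a quiet NaN has the most significant trailing-significand bit set, which is what makes "signaling < quiet" hold for +NaN. The function names are hypothetical.

```python
import math
import struct


def binary64_bits(x):
    """Return the 64-bit encoding of a Python float as an unsigned int."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def total_order_key(x):
    """Map a float to an int whose ordering matches totalOrder."""
    bits = binary64_bits(x)
    magnitude = bits & 0x7FFF_FFFF_FFFF_FFFF
    negative = bits >> 63
    # Negative values sort in reverse magnitude order, and -0 sorts below +0.
    return -magnitude - 1 if negative else magnitude


def total_order(x, y):
    """Sketch of totalOrder(x, y) for binary64 operands."""
    return total_order_key(x) <= total_order_key(y)


pos_nan = math.copysign(float("nan"), 1.0)
neg_nan = math.copysign(float("nan"), -1.0)
```

Because the key is a plain integer, it can also be passed to sorted(values, key=total_order_key) to order a list that contains NaNs and signed zeros.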
7.11 Details of comparison predicates
7.11.0
For every supported non-storage floating-point format, it shall be possible to compare one floating-point
datum to another in that format. Additionally, floating-point data represented in different formats shall be
comparable as long as the operands' formats have the same radix.
Comparisons are exact and never overflow or underflow. Four mutually exclusive relations are possible: less
than, equal, greater than, and unordered. The last case arises when at least one operand is NaN. Every NaN
shall compare unordered with everything, including itself. Comparisons shall ignore the sign of zero
(so +0 = −0). Infinite operands of the same sign shall compare equal.
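The four mutually exclusive relations can be illustrated in Python, whose float comparison operators behave as quiet comparisons on common platforms (an assumption of this sketch). The function relation is a name introduced here.

```python
nan = float("nan")
inf = float("inf")


def relation(x, y):
    """Return which of the four relations holds between x and y."""
    if x < y:
        return "less"
    if x > y:
        return "greater"
    if x == y:
        return "equal"
    # None of <, >, == held, so at least one operand is NaN.
    return "unordered"


nan_vs_itself = relation(nan, nan)
zeros = relation(0.0, -0.0)
infinities = relation(inf, inf)
number_vs_nan = relation(1.0, nan)
```

A practical consequence is that x != x is true exactly when x is NaN.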
Languages define how the result of a comparison shall be delivered, in one of two ways: either as a condition
7.12.1 External character sequences representing zeros, infinities, and NaNs 7.12.1.0
The conversions (described in 7.12) from internal formats to external character sequences and back that
recover the original floating-point representation, recover zeros, infinities, and quiet NaNs, as well as nonzero
finite numbers. In particular, signs of zeros and infinities are preserved.
Conversion of an infinity in internal format to an external character sequence shall produce a language-
defined one of “inf” or “infinity” or a sequence that is equivalent except for case (e.g., “Infinity” or “INF”),
with a preceding minus sign if the input is negative. Whether the conversion produces a preceding plus sign if
the input is positive is language defined.
Conversion of external character sequences “inf” and “infinity”, regardless of case, with an optional
preceding sign, to an internal floating-point format shall produce an infinity (with the same sign as the input).
Conversion of a quiet NaN in internal format to an external character sequence shall produce a language-
defined one of “nan” or a sequence that is equivalent except for case (e.g., “NaN”), with an optional
preceding sign.
Conversion of a signaling NaN in internal format to an external character sequence should produce a
language-defined one of "snan" or "nan" or a sequence that is equivalent except for case, with an optional
preceding sign. If the conversion of a signaling NaN produces "nan" or a sequence that is equivalent except
for case, with an optional preceding sign, then the invalid exception should be signaled.
Conversion of external character sequences “nan”, regardless of case, with an optional preceding sign, to an
internal floating-point format shall produce a quiet NaN.
Conversion of an external character sequence "snan", regardless of case, with an optional preceding sign, to
an internal format should either produce a signaling NaN or else produce a quiet NaN and signal the invalid
exception.
Languages should provide an optional conversion of NaNs in internal format to external character sequences
that appends to the basic NaN character sequences a suffix that can represent the NaN payload (see 8.2). The
form and interpretation of the payload suffix is language defined. The language should require that any such
optional output sequences be recognized as input in conversion of external character sequences to internal
formats.
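As one example of language-defined choices, the sketch below shows how CPython handles these character sequences. The behavior shown is Python's, not something the standard mandates: float does not accept "snan", while decimal.Decimal accepts signaling NaNs and a numeric payload suffix. The helper float_accepts is a name made up for this illustration.

```python
from decimal import Decimal

# Binary floats: "inf" and "infinity" are accepted regardless of case,
# with an optional preceding sign.
parsed = [float(s) for s in ("inf", "-Infinity", "+INF")]
printed = [repr(x) for x in parsed]
nan_text = repr(float("NaN"))


def float_accepts(text):
    """Return True if float() can parse the given character sequence."""
    try:
        float(text)
    except ValueError:
        return False
    return True


snan_accepted_by_float = float_accepts("snan")

# decimal.Decimal: signaling NaNs and payload suffixes survive a round trip.
snan_text = str(Decimal("snan"))
payload_text = str(Decimal("NaN123"))
round_trip = str(Decimal(payload_text))
neg_inf_text = str(Decimal("-inf"))
```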