@GabrielMajeri
Last active October 2, 2022 18:49
Enable flushing denormals to zero in Rust

Flushing denormals to zero

Floating-point operations can sometimes underflow and produce denormalized (subnormal) numbers: non-zero values too small to be represented with full precision. Hardware conforming to IEEE 754 has to support these values (albeit with lower performance) to maintain some numerical accuracy.

In certain high-performance applications, it might be beneficial to ignore these events and simply round the result to 0.
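To make the terms concrete, here is a minimal sketch that detects a denormal `f32` and emulates in software roughly what flush-to-zero does in hardware. The helper names `is_denormal` and `flush_denormal` are hypothetical and only serve the illustration; they are not part of any library.

```rust
use std::num::FpCategory;

/// Returns true when `x` is a denormal (subnormal) value: non-zero, but
/// smaller in magnitude than the smallest normal `f32`.
/// Hypothetical helper, named for this example only.
pub fn is_denormal(x: f32) -> bool {
    x.classify() == FpCategory::Subnormal
}

/// Software emulation of flush-to-zero: denormals become a zero with the
/// same sign, every other value passes through unchanged.
/// Hypothetical helper, named for this example only.
pub fn flush_denormal(x: f32) -> f32 {
    if is_denormal(x) {
        0.0_f32.copysign(x)
    } else {
        x
    }
}
```

The smallest normal `f32` is `f32::MIN_POSITIVE`; halving it gives a denormal, which `flush_denormal` replaces with a signed zero.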

Note: the performance difference in well-written programs should be minimal. If you notice major performance improvements, your code was already generating a large number of denormals / underflows, and you should double-check it to determine why that is the case.

Even when the CPU handles denormals correctly (at lower performance), these numbers carry fewer significant bits than normal numbers, and their presence usually indicates badly written math code.
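One way to check whether your code produces denormals is to count them in an intermediate buffer. The sketch below assumes Rust 1.53 or newer for `f32::is_subnormal`; `count_denormals` and `decaying_signal` are hypothetical names chosen for this example.

```rust
/// Counts the denormal (subnormal) values in `samples`.
/// Hypothetical diagnostic helper.
pub fn count_denormals(samples: &[f32]) -> usize {
    samples.iter().filter(|x| x.is_subnormal()).count()
}

/// Illustrative input: a value that starts at 1.0 and is halved on every
/// step, like the tail of a decaying filter. It eventually underflows into
/// the denormal range and then reaches zero.
pub fn decaying_signal(len: usize) -> Vec<f32> {
    let mut x = 1.0_f32;
    (0..len)
        .map(|_| {
            let current = x;
            x *= 0.5;
            current
        })
        .collect()
}
```

With flush-to-zero disabled, `count_denormals(&decaying_signal(200))` is non-zero, because the signal passes through the denormal range on its way to zero. A count like that in real data is the hint to look for the underlying cause.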

x86_64 CPUs with SSE/AVX math

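Requires Rust 1.27 or newer. Note that newer Rust releases (1.75 and later) mark _mm_getcsr and _mm_setcsr as deprecated and recommend inline assembly instead.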

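// Potentially improves the performance of SSE/AVX floating-point math
// by flushing denormals/underflow to zero.
// MXCSR is per-thread state, so run this on every thread that needs it.
unsafe {
    use std::arch::x86_64::*;

    let mut mxcsr = _mm_getcsr();

    // Bit 15 (FTZ): denormal results are flushed to zero.
    // Bit 6 (DAZ): denormal inputs are treated as zero.
    mxcsr |= (1 << 15) | (1 << 6);

    // Bits 7-12: all six floating-point exceptions are masked
    mxcsr |= ((1 << 6) - 1) << 7;

    _mm_setcsr(mxcsr);
}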
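The bit arithmetic above can also be written with named constants, which makes it easier to test without touching the real register. This is only a sketch: the constant and function names are hypothetical, while the bit positions are the ones used in the snippet above.

```rust
/// MXCSR bit 15: flush-to-zero (FTZ), applies to denormal results.
pub const MXCSR_FTZ: u32 = 1 << 15;
/// MXCSR bit 6: denormals-are-zero (DAZ), applies to denormal inputs.
pub const MXCSR_DAZ: u32 = 1 << 6;
/// MXCSR bits 7-12: the six floating-point exception mask bits.
pub const MXCSR_EXCEPTION_MASKS: u32 = ((1 << 6) - 1) << 7;

/// Returns `mxcsr` with FTZ and DAZ enabled and all exceptions masked.
/// Pure function: it does not read or write the actual register.
pub fn with_flush_to_zero(mxcsr: u32) -> u32 {
    mxcsr | MXCSR_FTZ | MXCSR_DAZ | MXCSR_EXCEPTION_MASKS
}

/// Returns true when both FTZ and DAZ are set in `mxcsr`.
pub fn flushes_denormals(mxcsr: u32) -> bool {
    let both = MXCSR_FTZ | MXCSR_DAZ;
    mxcsr & both == both
}
```

Starting from the power-on default MXCSR value `0x1F80`, `with_flush_to_zero` yields `0x9FC0`. On real hardware you would pass in the value returned by `_mm_getcsr()` and hand the result to `_mm_setcsr()`.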

ARM / AArch64 CPUs with NEON

With 32-bit ARM, you can set the FZ bit (bit 24) of the Floating-Point Status and Control Register (FPSCR) to enable flush-to-zero mode. Note that for 32-bit Advanced SIMD (NEON), flush-to-zero is always enabled.

With 64-bit ARM, you can set the FZ bit (bit 24) of the equivalent Floating-Point Control Register (FPCR) to enable flush-to-zero mode. SIMD in 64-bit mode supports switching between normal and flush-to-zero mode.
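For AArch64, a minimal sketch of this could look as follows. It assumes Rust 1.59 or newer for stable inline assembly; the names `FPCR_FZ`, `with_fz` and `enable_flush_to_zero` are hypothetical. The register access is only compiled on `aarch64` targets, and the 32-bit ARM case is not covered here.

```rust
/// FZ (flush-to-zero) is bit 24 of FPCR on AArch64.
pub const FPCR_FZ: u64 = 1 << 24;

/// Returns `fpcr` with the FZ bit set. Pure function: it does not touch
/// the actual register.
pub fn with_fz(fpcr: u64) -> u64 {
    fpcr | FPCR_FZ
}

/// Enables flush-to-zero for the current thread by rewriting FPCR.
///
/// # Safety
/// Changes the floating-point environment, which surrounding code and the
/// compiler may assume is left at its default.
#[cfg(target_arch = "aarch64")]
pub unsafe fn enable_flush_to_zero() {
    use std::arch::asm;

    let fpcr: u64;
    unsafe {
        // Read the current control register value.
        asm!("mrs {}, fpcr", out(reg) fpcr, options(nomem, nostack));
        // Write it back with the FZ bit set.
        asm!("msr fpcr, {}", in(reg) with_fz(fpcr), options(nostack));
    }
}
```

As on x86_64, the setting is per thread, so `enable_flush_to_zero()` would need to be called on every thread that does the math.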
