Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Issue while implementing wrapping_div_unchecked in packed_simd

I have one example from packed_simd, where I want to implement vertical (lane-wise) division for packed SIMD vectors (repr(simd)). The Intel SVML (Short-Vector Math Library) exposes these with the _mm_div_ep{i,u}{lane_bitwidth} APIs.

I've implemented two functions for integers:

impl Simd<[T; N]> where T: Signed | Unsigned {
    pub fn wrapping_div(self, other: Self) -> Self {
        assert!(!other.eq(Self::splat(0)).any(),
                     "division by zero: `{:?} / {:?}`", self, other);
        unsafe { intrinsic::wrapping_div(self, other) }
    }
}

This first function, wrapping_div, panics if any of the lanes of the denominator is zero, and is therefore safe. For those users that know that none of the lanes of the denominator is zero, I also implemented:

impl Simd<[T; N]> where T: Signed | Unsigned {
    /// ... Safety: if any of the lanes of other is zero, the behavior is undefined.
    pub unsafe fn wrapping_div_unchecked(self, other: Self) -> Self {
        // debug_assert!(!other.eq(Self::splat(0)).any(),
        //             "division by zero: `{:?} / {:?}`", self, other);
        unsafe { intrinsic::wrapping_div(self, other) }
    }
}

When operating on SIMD vectors it is, however, very common to only care about some lanes of the result:

let a: Simd<[i32; 4]>;
let c = unsafe { a.wrapping_div_unchecked(b) };  
let d = shuffle!(c, [0, 1]); // extract lanes 0 and 1
// drop(c); // the result of lanes 2 and 3 is never used

So it would be nice if instead of saying that calling wrapping_div_unchecked with a denominator containing a zero lane is undefined behavior, I could say something else to support this idiom.

Would supporting uninitialized bits in integers help here? That depends on whether dividing by zero returns uninitialized or something else (see below for a hint on what this should return).

A different way of improving this API would be to change the signature of the method to:

pub fn wrapping_div_unchecked(a: Simd<[T; N]>, b: Simd<[T; N]>)
     -> Simd<[MaybeUninit<T>; N]>;

The user could then write:

let a: Simd<[i32; 4]>;
let c = a.wrapping_div_unchecked(b);  
let d: Simd<[i32; 2]> = unsafe { shuffle!(c, [0, 1]).into_inner() }; // extract lanes 0 and 1
let e: MaybeUninit<i32> = c.extract(2);
let f: i32 = unsafe { e.into_inner() }; // UB?

As shown in steps e and f, the answer to what should dividing by zero returns has to be uninitialized, because otherwise step f wouldn't be UB.

With my packed_simd hat on, adding support for packed SIMD vectors of MaybeUninit<T> is going to be a true PITA to implement. Yet the API using MaybeUninit looks way clearer to me because it puts the unsafe block in the operation that is actually undefined behavior (reading a value that results from a division by zero) instead of in the actual division.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.