@deplinenoise
Last active August 29, 2015 14:05
#include <emmintrin.h> // SSE2 intrinsics

// Convert four unsigned 32-bit integers to floats, rounding to nearest.
__m128 Uint32_Float_Round(__m128i in)
{
    // All-ones mask for elements with the top bit set, i.e. > 2^31 - 1
    __m128i mask = _mm_srai_epi32(in, 31);
    // Version of the input shifted down one bit logically
    __m128i t = _mm_srli_epi32(in, 1);
    // OR the dropped low bit back in as a sticky bit so rounding stays correct
    __m128i in1 = _mm_or_si128(t, _mm_and_si128(in, _mm_set1_epi32(1)));
    // Select either shifted or unshifted version based on mask
    __m128i a = _mm_or_si128(_mm_and_si128(mask, in1),
                             _mm_andnot_si128(mask, in));
    // Convert to float (signed conversion; every element is now < 2^31)
    __m128 b = _mm_cvtepi32_ps(a);
    // Correct magnitude of large elements by adding b again where the mask is set
    __m128 c = _mm_and_ps(_mm_castsi128_ps(mask), b);
    __m128 result = _mm_add_ps(c, b);
    return result;
}
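
The per-lane logic above can be cross-checked against the compiler's own uint32 to float conversion. What follows is a minimal sketch, not part of the original gist: `uint32_to_float_scalar`, `uint32_to_float_sse2` and `count_mismatches` are hypothetical names, the vector routine is repeated under a different name only so the block compiles on its own, and the default round-to-nearest-even mode on an SSE2-capable x86 target is assumed.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical scalar model of one lane of the vector routine. */
static float uint32_to_float_scalar(uint32_t x)
{
    if (x < 0x80000000u)
        return (float)(int32_t)x;          /* fits in a signed 32-bit int */
    /* Halve, keeping the dropped low bit as a sticky bit. */
    uint32_t halved = (x >> 1) | (x & 1u);
    float f = (float)(int32_t)halved;      /* halved < 2^31, so this is safe */
    return f + f;                          /* doubling is exact */
}

/* Copy of the vector routine under an illustrative name. */
static __m128 uint32_to_float_sse2(__m128i in)
{
    __m128i mask = _mm_srai_epi32(in, 31);
    __m128i t    = _mm_srli_epi32(in, 1);
    __m128i in1  = _mm_or_si128(t, _mm_and_si128(in, _mm_set1_epi32(1)));
    __m128i a    = _mm_or_si128(_mm_and_si128(mask, in1),
                                _mm_andnot_si128(mask, in));
    __m128 b = _mm_cvtepi32_ps(a);
    __m128 c = _mm_and_ps(_mm_castsi128_ps(mask), b);
    return _mm_add_ps(c, b);
}

/* Counts lanes where the vector or scalar result differs from (float)v[i]. */
int count_mismatches(const uint32_t v[4])
{
    float out[4];
    int bad = 0;
    assert(v != NULL);
    _mm_storeu_ps(out, uint32_to_float_sse2(_mm_loadu_si128((const __m128i *)v)));
    for (int i = 0; i < 4; ++i) {
        float expected = (float)v[i];
        if (out[i] != expected || uint32_to_float_scalar(v[i]) != expected)
            ++bad;
    }
    return bad;
}
```

Inputs worth feeding to `count_mismatches` include the boundaries 0x7FFFFFFF and 0x80000000, plus exact ties such as 0x80000080, where the sticky bit decides whether the halved value rounds the same way as the original.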