Martin Källman martin-kallman

  • London
martin-kallman / float16.c
Last active Nov 6, 2018
Fast half-precision to single-precision floating point conversion
// float32
// Martin Kallman
//
// Fast half-precision to single-precision floating point conversion
// - Supports signed zero and denormals-as-zero (DAZ)
// - Does not support infinities or NaN
// - Few, partially pipelinable, non-branching instructions
// - Core operations ~6 clock cycles on modern x86-64
void float32(float* __restrict out, const uint16_t in) {
    uint32_t t1;
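
The preview above cuts off after the first declaration, so the body of float32() is not shown. As a rough illustration of the properties listed in the header comment (signed zero, denormals-as-zero, no infinity or NaN handling, no explicit branches), a minimal sketch might look like the following. The name half_to_float_daz, the return-by-value signature, and the masking approach are assumptions made for illustration and are not taken from the gist.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, NOT the gist's float32(): converts IEEE 754
 * binary16 bits to a float, flushing denormals to signed zero.
 * Inputs with exponent 31 (infinity/NaN) are not treated specially
 * and decode as ordinary finite values. */
float half_to_float_daz(uint16_t h) {
    /* Move the sign bit from bit 15 to bit 31. */
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    /* Isolate the 5-bit exponent to detect zero and denormals. */
    uint32_t exp = h & 0x7C00u;
    /* Align exponent and mantissa with the float32 layout (10 -> 23 mantissa bits). */
    uint32_t mag = (uint32_t)(h & 0x7FFFu) << 13;
    /* Rebias the exponent: 127 - 15 = 112. */
    mag += (uint32_t)112 << 23;
    /* All ones when the exponent is non-zero, all zeros otherwise. */
    uint32_t mask = (uint32_t)0 - (uint32_t)(exp != 0);
    uint32_t bits = sign | (mag & mask);

    /* Copy the bit pattern into a float without violating aliasing rules. */
    float out;
    memcpy(&out, &bits, sizeof out);
    return out;
}
```

With this sketch, 0x3C00 decodes to 1.0f and 0xC000 to -2.0f, while 0x0001 (a denormal) is flushed to 0.0f and 0x8000 keeps its sign as -0.0f.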
martin-kallman / inthash32.c
Created May 31, 2012
32-bit Integer Hash
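uint32_t inthash32( uint32_t k ) {
    k *= 1193897147;   // multiply by an odd constant (wraps modulo 2^32)
    k ^= k >> 16;      // fold the high half into the low half
    k ^= k >> 14;      // second xor-shift mixing step
    k += 1193897147;   // add the same constant
    return k;
}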
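
One possible use of inthash32() is picking a slot in a power-of-two hash table. The sketch below repeats the function so that it compiles on its own. The helper bucket_for and the power-of-two table size are assumptions for illustration; the gist does not say how the hash is meant to be used or how well its low bits are mixed.

```c
#include <stddef.h>
#include <stdint.h>

/* Same steps as the gist's inthash32(), repeated so this block is self-contained. */
uint32_t inthash32(uint32_t k) {
    k *= 1193897147u;
    k ^= k >> 16;
    k ^= k >> 14;
    k += 1193897147u;
    return k;
}

/* Hypothetical helper: maps a key to a slot index.
 * Assumes table_size_pow2 is a non-zero power of two. */
size_t bucket_for(uint32_t key, size_t table_size_pow2) {
    return (size_t)inthash32(key) & (table_size_pow2 - 1);
}
```

For key 0 the multiply and both xor-shifts leave zero, so inthash32(0) returns the additive constant 1193897147, and bucket_for(0, 16) is 11.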
martin-kallman / gist:2790080
Created May 25, 2012
SSE4 Deduplication through Max() tournament
A = RIN //{3, 9, 2, 9}
For i = 0 .. 3:
B = Rotate(A, 1) //{9, 2, 9, 3}
C = Rotate(A, 2) //{2, 9, 3, 9}
D = Rotate(A, 3) //{9, 3, 9, 2}
RMAX = Max(A,B) //{9, 9, 9, 9}
RMAX = Max(RMAX, C) //{9, 9, 9, 9}
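
The pseudocode preview ends here, so the rest of the deduplication is not shown. The steps that are visible can be modelled in portable scalar C as below. The function names and the four-lane array representation are assumptions for illustration; in an SSE4.1 version, the lane rotation would presumably map to a shuffle such as _mm_shuffle_epi32 and Max to _mm_max_epi32 or _mm_max_epu32.

```c
#include <stdint.h>

#define LANES 4

/* Rotate lanes left by n: out[i] = in[(i + n) % LANES].
 * Rotating {3, 9, 2, 9} by 1 gives {9, 2, 9, 3}, as in the listing. */
void rotate_lanes(const uint32_t in[LANES], unsigned n, uint32_t out[LANES]) {
    for (unsigned i = 0; i < LANES; ++i)
        out[i] = in[(i + n) % LANES];
}

/* Lane-wise maximum; out may alias a or b. */
void max_lanes(const uint32_t a[LANES], const uint32_t b[LANES], uint32_t out[LANES]) {
    for (unsigned i = 0; i < LANES; ++i)
        out[i] = a[i] > b[i] ? a[i] : b[i];
}

/* Only the steps visible in the preview: build B, C, D, then
 * RMAX = Max(A, B) followed by RMAX = Max(RMAX, C). */
void visible_steps(const uint32_t a[LANES], uint32_t b[LANES], uint32_t c[LANES],
                   uint32_t d[LANES], uint32_t rmax[LANES]) {
    rotate_lanes(a, 1, b);
    rotate_lanes(a, 2, c);
    rotate_lanes(a, 3, d);
    max_lanes(a, b, rmax);
    max_lanes(rmax, c, rmax);
}
```

Calling visible_steps with A = {3, 9, 2, 9} reproduces the values in the listing's comments: B = {9, 2, 9, 3}, C = {2, 9, 3, 9}, D = {9, 3, 9, 2}, and RMAX = {9, 9, 9, 9}.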