@dhermes
Created September 30, 2015 03:05
http://stackoverflow.com/questions/427477/fastest-way-to-clamp-a-real-fixed-floating-point-value
https://devtalk.nvidia.com/default/topic/514408/min-max-and-sign-functions-in-cuda-do-they-exist-if-so-where-/
https://en.wikipedia.org/wiki/Algorithm_%28C%2B%2B%29
https://en.wikipedia.org/wiki/C_mathematical_functions#Overview_of_functions
http://en.cppreference.com/w/c/numeric/math/fmax
$ find /usr/ | grep 'algorithm\.h$'
/usr/include/CGAL/algorithm.h
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#mathematical-functions-appendix
Amazing! Using fmin and fmax cut my computation time from 4.1-4.2 ms
to 3.1-3.2 ms, and their use isn't the major part of the computations!
http://stackoverflow.com/questions/16584558/the-difference-between-max-and-fmax-cross-platform-compiling
The actual difference is that fmin and fmax are mathematical functions
working on floating point numbers and originating from C99 (and might be
implemented intrinsically by actual specialized CPU instructions where possible),
while min and max are general algorithms usable on any type
supporting < (and are probably just a simple (b<a) ? b : a instead of a
floating point instruction, though an implementation could even do that
with a specialization of min and max, but I doubt this).
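To make the distinction quoted above concrete, here is a minimal C++ sketch. The names generic_min, generic_max, math_min and math_max are hypothetical, chosen only for illustration. The first pair is written the way the quote guesses min and max are implemented, and the second pair wraps the standard <cmath> functions std::fmin and std::fmax. For ordinary (non-NaN) inputs the two approaches should return the same values.

```cpp
#include <cmath>

// Hypothetical stand-ins for the generic algorithms: they need only operator<,
// so they work on any ordered type, not just floating point.
template <typename T>
const T& generic_min(const T& a, const T& b) {
    return (b < a) ? b : a;
}

template <typename T>
const T& generic_max(const T& a, const T& b) {
    return (a < b) ? b : a;
}

// Floating-point-only counterparts built on the C99 math functions.
inline double math_min(double a, double b) { return std::fmin(a, b); }
inline double math_max(double a, double b) { return std::fmax(a, b); }
```

Whether either form ends up as a single hardware min/max instruction depends on the compiler and target, so this only illustrates the source-level difference.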
http://gpuray.blogspot.com/2009/07/cuda-warps-and-branching.html
http://www.informit.com/articles/article.aspx?p=2103809&seqNum=4
Some conditional operations are so common that they are supported natively
by the hardware. Minimum and maximum operations are supported for both
integer and floating-point operands and are translated to a single
instruction. Additionally, floating-point instructions include modifiers
that can negate or take the absolute value of a source operand.
The compiler does a good job of detecting when min/max operations
are being expressed, but if you want to take no chances, call the
min()/max() intrinsics for integers or fmin()/fmax()
for floating-point values.
======================================================
https://devtalk.nvidia.com/default/topic/496548/are-max-a-b-and-min-a-b-divergent-/
The standard CPU implementation seems to be:
(b<a) ? a : b;
which is clearly divergent, but I'd like to know if CUDA does anything
clever to get around it.
======================================================
http://stackoverflow.com/a/16659263/1068170
maxsd %xmm0, %xmm1 # d, min
movapd %xmm2, %xmm0 # max, max
minsd %xmm1, %xmm0 # min, max
ret

maxsd %xmm0, %xmm1
minsd %xmm1, %xmm2
movaps %xmm2, %xmm0
ret
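Both listings above apply a max and then a min to three double operands (the comments name them d, min, and max), which is the shape of a clamp. Below is a sketch of C++ source that could plausibly compile to code like this. The function names clamp_fm and clamp_ternary and the parameter names lo and hi are assumptions for illustration, not taken from the linked answer, and the instructions actually emitted depend on the compiler and its flags.

```cpp
#include <cmath>

// Clamp d into [lo, hi] using the C99 math functions.
inline double clamp_fm(double d, double lo, double hi) {
    return std::fmin(std::fmax(d, lo), hi);
}

// The same clamp spelled with comparisons instead of library calls.
inline double clamp_ternary(double d, double lo, double hi) {
    const double t = (d < lo) ? lo : d;  // max(d, lo)
    return (hi < t) ? hi : t;            // min(t, hi)
}
```

For non-NaN inputs with lo <= hi the two functions should agree, so comparing their generated assembly is one way to check whether the compiler recognizes the ternary form as a min/max.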
GENERATED ASSEMBLY (sm_1x, sm_2x)
======================================================
https://gist.github.com/dhermes/c79846c6074b938b2e10