@jj1bdx
Last active December 17, 2023 04:40
VOLK v3.1.0 volk_32fc_s32f_atan2_32f() anomaly on input element 0+0j

In VOLK v3.1.0, the avx2 and avx2_fma kernels of volk_32fc_s32f_atan2_32f() return -nan for an input element equal to 0+0j.

Note: j denotes the imaginary unit, where j^2 = -1.

Summary

For the input element 0+0j, VOLK v3.1.0 volk_32fc_s32f_atan2_32f() returns -nan with the following kernels:

  • avx2 (a_avx2, u_avx2)
  • avx2_fma (a_avx2_fma, u_avx2_fma)

For the same input element, it returns zero, matching atan2f(), with the following kernels:

  • generic
  • polynomial

Testing environment

Ubuntu 22.04.3 x86_64

Excerpt from dmesg output

DMI: Intel(R) Client Systems NUC10i7FNH/NUC10i7FNB, BIOS FNCML357.0046.2020.0928.1457 09/28/2020
smpboot: CPU0: Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz (family: 0x6, model: 0xa6, stepping: 0x0)
AVX2 version of gcm_enc/dec engaged.

[End of report]

// Compile as
// cc -O -o atan2test atan2test.c \
// -I/usr/local/include -L/usr/local/lib -lm -lvolk
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <volk/volk.h>

int main(void) {
  const unsigned int n = 9;
  size_t alignment = volk_get_alignment();
  // Aligned buffers: input, libm reference output, VOLK output
  lv_32fc_t *in = (lv_32fc_t *)volk_malloc(sizeof(lv_32fc_t) * n, alignment);
  float *out = (float *)volk_malloc(sizeof(float) * n, alignment);
  float *out2 = (float *)volk_malloc(sizeof(float) * n, alignment);
  if (in == NULL || out == NULL || out2 == NULL) {
    fprintf(stderr, "volk_malloc failed\n");
    return EXIT_FAILURE;
  }
  float scale = 1.0f;
  in[0] = lv_cmake(1.0f, 0.0f);
  in[1] = lv_cmake(-1.0f, 0.0f);
  // This 0+0j element yields -nan from volk_32fc_s32f_atan2_32f
  // with the avx2 and avx2_fma kernels
  in[2] = lv_cmake(0.0f, 0.0f);
  in[3] = lv_cmake(0.0f, 1.0f);
  in[4] = lv_cmake(0.0f, -1.0f);
  in[5] = lv_cmake(1.0f, 1.0f);
  in[6] = lv_cmake(1.0f, -1.0f);
  in[7] = lv_cmake(-1.0f, 1.0f);
  in[8] = lv_cmake(-1.0f, -1.0f);
  // Reference values from libm
  for (unsigned int i = 0; i < n; i++) {
    out[i] = atan2f(lv_cimag(in[i]), lv_creal(in[i]));
  }
  // Values from the VOLK kernel selected in ~/.volk/volk_config
  volk_32fc_s32f_atan2_32f(out2, in, scale, n);
  for (unsigned int i = 0; i < n; i++) {
    printf("in[%u]=%+3.1f+j%+3.1f, atan2f(in[%u])=%+10.8f, "
           "volk_atan2([%u])=%+10.8f\n",
           i, lv_creal(in[i]), lv_cimag(in[i]), i, out[i], i, out2[i]);
  }
  volk_free(in);
  volk_free(out);
  volk_free(out2);
  return 0;
}
For the following line in ~/.volk/volk_config:
volk_32fc_s32f_atan2_32f a_avx2 u_avx2
in[0]=+1.0+j+0.0, atan2f(in[0])=+0.00000000, volk_atan2([0])=+0.00000000
in[1]=-1.0+j+0.0, atan2f(in[1])=+3.14159274, volk_atan2([1])=+3.14159274
in[2]=+0.0+j+0.0, atan2f(in[2])=+0.00000000, volk_atan2([2])= -nan
in[3]=+0.0+j+1.0, atan2f(in[3])=+1.57079637, volk_atan2([3])=+1.57079637
in[4]=+0.0+j-1.0, atan2f(in[4])=-1.57079637, volk_atan2([4])=-1.57079637
in[5]=+1.0+j+1.0, atan2f(in[5])=+0.78539819, volk_atan2([5])=+0.78539866
in[6]=+1.0+j-1.0, atan2f(in[6])=-0.78539819, volk_atan2([6])=-0.78539866
in[7]=-1.0+j+1.0, atan2f(in[7])=+2.35619450, volk_atan2([7])=+2.35619402
in[8]=-1.0+j-1.0, atan2f(in[8])=-2.35619450, volk_atan2([8])=-2.35619402
For the following line in ~/.volk/volk_config:
volk_32fc_s32f_atan2_32f a_avx2_fma u_avx2_fma
in[0]=+1.0+j+0.0, atan2f(in[0])=+0.00000000, volk_atan2([0])=+0.00000000
in[1]=-1.0+j+0.0, atan2f(in[1])=+3.14159274, volk_atan2([1])=+3.14159274
in[2]=+0.0+j+0.0, atan2f(in[2])=+0.00000000, volk_atan2([2])= -nan
in[3]=+0.0+j+1.0, atan2f(in[3])=+1.57079637, volk_atan2([3])=+1.57079637
in[4]=+0.0+j-1.0, atan2f(in[4])=-1.57079637, volk_atan2([4])=-1.57079637
in[5]=+1.0+j+1.0, atan2f(in[5])=+0.78539819, volk_atan2([5])=+0.78539866
in[6]=+1.0+j-1.0, atan2f(in[6])=-0.78539819, volk_atan2([6])=-0.78539866
in[7]=-1.0+j+1.0, atan2f(in[7])=+2.35619450, volk_atan2([7])=+2.35619402
in[8]=-1.0+j-1.0, atan2f(in[8])=-2.35619450, volk_atan2([8])=-2.35619402
For the following line in ~/.volk/volk_config:
volk_32fc_s32f_atan2_32f generic generic
in[0]=+1.0+j+0.0, atan2f(in[0])=+0.00000000, volk_atan2([0])=+0.00000000
in[1]=-1.0+j+0.0, atan2f(in[1])=+3.14159274, volk_atan2([1])=+3.14159274
in[2]=+0.0+j+0.0, atan2f(in[2])=+0.00000000, volk_atan2([2])=+0.00000000
in[3]=+0.0+j+1.0, atan2f(in[3])=+1.57079637, volk_atan2([3])=+1.57079637
in[4]=+0.0+j-1.0, atan2f(in[4])=-1.57079637, volk_atan2([4])=-1.57079637
in[5]=+1.0+j+1.0, atan2f(in[5])=+0.78539819, volk_atan2([5])=+0.78539819
in[6]=+1.0+j-1.0, atan2f(in[6])=-0.78539819, volk_atan2([6])=-0.78539819
in[7]=-1.0+j+1.0, atan2f(in[7])=+2.35619450, volk_atan2([7])=+2.35619450
in[8]=-1.0+j-1.0, atan2f(in[8])=-2.35619450, volk_atan2([8])=-2.35619450
For the following line in ~/.volk/volk_config:
volk_32fc_s32f_atan2_32f polynomial polynomial
in[0]=+1.0+j+0.0, atan2f(in[0])=+0.00000000, volk_atan2([0])=+0.00000000
in[1]=-1.0+j+0.0, atan2f(in[1])=+3.14159274, volk_atan2([1])=+3.14159274
in[2]=+0.0+j+0.0, atan2f(in[2])=+0.00000000, volk_atan2([2])=+0.00000000
in[3]=+0.0+j+1.0, atan2f(in[3])=+1.57079637, volk_atan2([3])=+1.57079637
in[4]=+0.0+j-1.0, atan2f(in[4])=-1.57079637, volk_atan2([4])=-1.57079637
in[5]=+1.0+j+1.0, atan2f(in[5])=+0.78539819, volk_atan2([5])=+0.78539866
in[6]=+1.0+j-1.0, atan2f(in[6])=-0.78539819, volk_atan2([6])=-0.78539866
in[7]=-1.0+j+1.0, atan2f(in[7])=+2.35619450, volk_atan2([7])=+2.35619402
in[8]=-1.0+j-1.0, atan2f(in[8])=-2.35619450, volk_atan2([8])=-2.35619402

jj1bdx commented Dec 17, 2023

This issue has been reported as: gnuradio/volk#730


jj1bdx commented Dec 17, 2023

A proposed root-cause fix, which adds a NaN check to volk_32fc_s32f_atan2_32f, has been submitted as: gnuradio/volk#731
