@FrankNiemeyer
Created August 28, 2015 17:52
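// Computes dp[i] = xs[i]^2 + ys[i]^2 + zs[i]^2 on structure-of-arrays input, one __m256 block at a time.
// Needs <immintrin.h>; reps, vector_len and lane_width are defined elsewhere in the benchmark.
// The __m256* casts require 32-byte-aligned buffers, which the default std::vector allocator does not guarantee.
void dot3_soa_vectorized(const vector<float>& xs, const vector<float>& ys, const vector<float>& zs, vector<float>& dp) {
    for (auto j = 0; j < reps; ++j) {  // repeat the kernel for timing
        const auto px = (__m256*)xs.data();
        const auto py = (__m256*)ys.data();
        const auto pz = (__m256*)zs.data();
        auto pd = (__m256*)dp.data();
        // Number of __m256 blocks; assumes vector_len is a multiple of lane_width.
        auto i = vector_len / lane_width;
        while (i--) {  // walk the blocks from last to first
            pd[i] = _mm256_add_ps(_mm256_add_ps(_mm256_mul_ps(px[i], px[i]), _mm256_mul_ps(py[i], py[i])), _mm256_mul_ps(pz[i], pz[i]));
        }
    }
}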
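
To check the vectorized kernel against something simple, a scalar reference can be handy. The following is a minimal sketch, not part of the original gist: the name `dot3_soa_scalar` is hypothetical, and it assumes all four vectors have the same length. It evaluates the same expression in the same order, (x*x + y*y) + z*z, element by element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference (not from the original gist):
// dp[i] = xs[i]^2 + ys[i]^2 + zs[i]^2 for every element.
void dot3_soa_scalar(const std::vector<float>& xs,
                     const std::vector<float>& ys,
                     const std::vector<float>& zs,
                     std::vector<float>& dp) {
    // Assumption: all inputs and the output have equal length.
    assert(xs.size() == ys.size() && ys.size() == zs.size() && zs.size() == dp.size());
    for (std::size_t i = 0; i < xs.size(); ++i) {
        // Same evaluation order as the AVX version: (x*x + y*y) + z*z.
        dp[i] = (xs[i] * xs[i] + ys[i] * ys[i]) + zs[i] * zs[i];
    }
}
```

When comparing the two outputs, a small tolerance is safer than exact equality, since a compiler may contract the scalar multiplies and adds into fused multiply-adds.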