CagouilleRapide • 2 years ago

Hello, I'm late to the game, but I just wanted to say thank you for this article. I learnt a lot!
I have one question though: I participated in "Coders Strike Back" as well and also thought about vectorizing the computation, but one thing always stopped me: collision handling.
How did you solve it? Did you run the collision computation for all 8 parallel evaluations as soon as one collision was detected, or did you revert to scalar code? Thanks in advance.

sprkrd • 2 years ago

Hello, great tutorial :)

I have a late suggestion for a tutorial that is 3+ years old: I'd suggest spelling out the acronyms at the very beginning. There's a point in the life of every programmer when there are just way too many acronyms, and you start losing sight of what is what. AVX = Advanced Vector eXtensions, SSE = Streaming SIMD Extensions, and SIMD = Single Instruction, Multiple Data.

Arkan • 2 years ago

Could not agree more. I was just wondering the same thing about the xmm and ymm registers. Maybe eXtended memory and wi(Y)de memory? ...

Muhammad Zeeshan • 4 years ago

Scanning dependencies of target sqrt
[100%] Building CXX object CMakeFiles/sqrt.dir/sqrt.cpp.o
/project/target/sqrt/sqrt.cpp: In function 'void avx_sqrt()':
/project/target/sqrt/sqrt.cpp:28:9: error: '__mm256' was not declared in this scope
__mm256 v = _mm256_load_si256((__m256i *)&vectorized[0][0]);
^~~~~~~
/project/target/sqrt/sqrt.cpp:30:24: error: expected primary-expression before 'const'
_mm256_sqrt_ps(const __m256& v)
^~~~~
CMakeFiles/sqrt.dir/build.make:54: recipe for target 'CMakeFiles/sqrt.dir/sqrt.cpp.o' failed
make[2]: *** [CMakeFiles/sqrt.dir/sqrt.cpp.o] Error 1
CMakeFiles/Makefile2:60: recipe for target 'CMakeFiles/sqrt.dir/all' failed
make[1]: *** [CMakeFiles/sqrt.dir/all] Error 2
Makefile:76: recipe for target 'all' failed
make: *** [all] Error 2
I am getting these errors. I am trying to load the first 8 floats of the vectorized array into an AVX register and pass it to the _mm256_sqrt_ps() function. Any suggestions for removing these errors?
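
Going only by the log above, both errors look like spelling and usage slips rather than a build problem. The AVX float vector type is `__m256` (one `m`), not `__mm256`. The second error suggests that line 30 contains the intrinsic's declaration text, `_mm256_sqrt_ps(const __m256& v)`, instead of a call with a value. Separately, `_mm256_load_si256` is the integer load and returns `__m256i`, so a float load such as `_mm256_loadu_ps` is probably what is wanted. Below is a minimal sketch under those assumptions. The helper name `avx_sqrt8` and its pointer parameters are hypothetical and not taken from the article or from the original `sqrt.cpp`. The `target("avx")` attribute is a GCC/Clang extension used here so the snippet compiles without extra flags. Building the whole file with `-mavx` is the usual alternative.

```cpp
#include <immintrin.h>  // AVX intrinsics: __m256, _mm256_loadu_ps, _mm256_sqrt_ps

// Hypothetical helper (not from the article): writes sqrt(in[i]) to out[i]
// for i in 0..7. Both pointers must refer to at least 8 floats.
// The target attribute (GCC/Clang extension) enables AVX for this function only.
__attribute__((target("avx")))
void avx_sqrt8(const float* in, float* out) {
    // The type is __m256 (one 'm'). _mm256_loadu_ps loads 8 floats and has
    // no alignment requirement; _mm256_load_si256 would return __m256i.
    __m256 v = _mm256_loadu_ps(in);

    // Call the intrinsic with the loaded value rather than its declaration.
    __m256 r = _mm256_sqrt_ps(v);

    // Write the 8 results back to ordinary memory.
    _mm256_storeu_ps(out, r);
}
```

With the array from the question, the call would presumably be something like `avx_sqrt8(&vectorized[0][0], results)`, where `results` is a hypothetical array of 8 floats. If the data is known to be 32-byte aligned, `_mm256_load_ps` and `_mm256_store_ps` can replace the unaligned variants.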