Vectorization improvements
After we cleaned up vectorization support as part of our ARM NEON experiments a few weeks ago (!188 (merged), !220 (merged), !222 (merged)), I did the same thing with AltiVec/VSX intrinsics for POWER processors. Adding a new SIMD instruction set to pystencils is now mostly a matter of some quick find-and-replace. I had test access to a POWER8 machine today, ran the test suite in both little-endian and big-endian mode, and all tests passed. So pystencils now supports all SIMD instruction sets that are still relevant (ignoring MIPS and SPARC processors, which are essentially dead).
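To illustrate why a new instruction set can amount to little more than find-and-replace, here is a minimal sketch of a per-architecture table of intrinsic templates. The table layout, the `INSTRUCTION_SETS` name, and the `emit` helper are hypothetical and are not taken from pystencils; only the intrinsic names themselves (`_mm256_add_pd`, `vaddq_f64`, `vec_add`, and so on) are real.

```python
# Hypothetical sketch: abstract operations mapped to C intrinsic templates.
# This is NOT the pystencils data structure, only an illustration of the idea.
INSTRUCTION_SETS = {
    "avx": {  # x86, 4 doubles per 256-bit register
        "+": "_mm256_add_pd({0}, {1})",
        "*": "_mm256_mul_pd({0}, {1})",
        "width": 4,
    },
    "neon": {  # ARM, 2 doubles per 128-bit register
        "+": "vaddq_f64({0}, {1})",
        "*": "vmulq_f64({0}, {1})",
        "width": 2,
    },
    "vsx": {  # POWER, 2 doubles per 128-bit register
        "+": "vec_add({0}, {1})",
        "*": "vec_mul({0}, {1})",
        "width": 2,
    },
}


def emit(instruction_set: str, op: str, *args: str) -> str:
    """Render the C expression for `op` on the chosen instruction set."""
    return INSTRUCTION_SETS[instruction_set][op].format(*args)


print(emit("vsx", "+", "a", "b"))  # prints: vec_add(a, b)
```

With a layout like this, supporting another architecture means adding one more dictionary entry with the same keys, which is roughly the kind of mechanical work described above.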
This pull request also contains some minor unrelated changes:
- switches the AES RNG to aligned stores
- adds a missing `pytest.importorskip`
- fixes the `vec_any`/`vec_all` operations (which used to only work on 256-bit doubles)
- removes the `q_registers` argument from `get_vector_instruction_set` because there is no point in using half-width vectors
- fixes the AES-NI RNG on Ice Lake/Tiger Lake processors
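As a rough illustration of what the `vec_any`/`vec_all` fix is about, the following plain-Python model reduces a per-lane comparison mask to a single boolean for vectors of any width. It assumes the semantics the names suggest (true if any lane, or all lanes, satisfy the condition); the function bodies and the `compare_lt` helper are hypothetical stand-ins, not the code pystencils generates.

```python
from typing import List, Sequence


def compare_lt(a: Sequence[float], b: Sequence[float]) -> List[bool]:
    """Lane-wise a < b, standing in for a SIMD comparison that yields a mask."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of lanes")
    return [x < y for x, y in zip(a, b)]


def vec_any(mask: Sequence[bool]) -> bool:
    """True if the condition holds in at least one lane (assumed semantics)."""
    return any(mask)


def vec_all(mask: Sequence[bool]) -> bool:
    """True only if the condition holds in every lane (assumed semantics)."""
    return all(mask)


# The reduction must not depend on the lane count:
# 2 lanes (e.g. 128-bit doubles) and 4 lanes (e.g. 256-bit doubles).
two_lanes = compare_lt([1.0, 5.0], [2.0, 3.0])
four_lanes = compare_lt([1.0, 2.0, 3.0, 4.0], [5.0, 5.0, 5.0, 5.0])

print(vec_any(two_lanes), vec_all(two_lanes))
print(vec_any(four_lanes), vec_all(four_lanes))
```

The point of the model is that nothing in the reduction should be tied to one register width or data type, which is the restriction the old implementation had.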
Activity
mentioned in merge request !225 (closed)
mentioned in merge request walberla/walberla!432 (merged)
added 1 commit
- 7276e31c - Make vec_any/vec_all vectorization actually work
added 1 commit
- 29a1dc54 - Make vec_any/vec_all vectorization actually work
added 1 commit
- 105fd0d6 - Make vec_any/vec_all vectorization actually work
added 1 commit
- 9d7acb38 - Make vec_any/vec_all vectorization actually work
added 1 commit
- 9f9d301c - Make vec_any/vec_all vectorization actually work
added 1 commit
- 2dd3cb2a - include pveclib header if needed and use LRU store for stream
added 1 commit
- 0b99c231 - include pveclib header if needed and use LRU store for stream
- Resolved by Michael Kuron
mentioned in commit b075b723