Field alignment on ARM
- conditionally remove `-march=native`, which Apple's clang doesn't support or need (see !415 (merged))
- don't use `__BIGGEST_ALIGNMENT__` on ARM because the architecture does not require alignment, yet has a performance penalty for unaligned access
- check alignment in generated code; it should be aligned to the vector width or, in the nontemporal case, to the cacheline size (see pycodegen/pystencils!230 (merged))
- SVE vectorization currently requires specifying the width (see pycodegen/pystencils!232 (merged)), so autodetect it. Even with the width-agnostic version (pycodegen/pystencils!234 (merged)), explicitly specifying a width can be considered a `WALBERLA_OPTIMIZE_FOR_LOCALHOST`-style optimization
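
The alignment rule for generated code can be illustrated with a minimal sketch. This is not the code from this MR or from pystencils; the helper names `required_alignment`, `padding_to_align` and `is_aligned` are hypothetical, and the byte sizes in the usage note (16-byte vectors, 64-byte cachelines) are assumed example values that differ between ARM cores.

```python
def required_alignment(vector_width_bytes: int, cacheline_bytes: int,
                       nontemporal: bool) -> int:
    """Alignment in bytes that a field's data pointer should satisfy.

    Regular vector loads/stores want vector-width alignment; in the
    nontemporal case the cacheline size is used instead.
    """
    return cacheline_bytes if nontemporal else vector_width_bytes


def padding_to_align(address: int, alignment: int) -> int:
    """Bytes to skip from `address` to reach the next multiple of `alignment`."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (-address) % alignment


def is_aligned(address: int, alignment: int) -> bool:
    """True if `address` is already a multiple of `alignment`."""
    return padding_to_align(address, alignment) == 0
```

For example, with the assumed sizes above, an address of `0x1010` satisfies the 16-byte vector alignment but would need 48 bytes of padding to satisfy the 64-byte cacheline alignment used for nontemporal stores.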