Properly detect and enable vectorization on ARM

Michael Kuron requested to merge arm64 into master

!320 (merged) has the side effect of breaking both the detection of SVE vectorization support and the enabling of SVE in the compiler. This patch should fix the underlying problem properly.

py-cpuinfo supports ARM64 and can be used to detect Neon and SVE. There was indeed a bug here, though: Neon is reported as asimd in /proc/cpuinfo, so we should check for asimd instead of neon.
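To illustrate the check described above, here is a minimal sketch and not the code from this merge request. The function names `detect_arm_vector_isa` and `read_cpu_flags` are hypothetical; the only library call assumed is py-cpuinfo's `get_cpu_info()`, which returns a dict whose `flags` entry lists the CPU feature strings.

```python
def detect_arm_vector_isa(flags):
    """Return 'sve', 'neon', or None for a list of CPU feature flags.

    Hypothetical helper for illustration. On ARM64 Linux, Neon shows up
    as 'asimd' in /proc/cpuinfo, so that is the string to look for.
    """
    flags = set(flags)
    if "sve" in flags:
        return "sve"
    if "asimd" in flags:
        return "neon"
    return None


def read_cpu_flags():
    """Read feature flags via py-cpuinfo; empty list if it is not installed."""
    try:
        from cpuinfo import get_cpu_info  # provided by the py-cpuinfo package
    except ImportError:
        return []
    return get_cpu_info().get("flags", [])


if __name__ == "__main__":
    print(detect_arm_vector_isa(read_cpu_flags()))
```

Checking for `sve` first means that a CPU reporting both features is treated as SVE-capable; whether that preference is appropriate depends on the code generator.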

While -march=native was not supported by Clang before version 15, -mcpu=native is supported by GCC 6+ and Clang 7+. Let's use that instead of adding no flag at all; otherwise SVE support is not enabled in the compiler even if the hardware supports it.
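The flag choice above could be sketched as follows. This is an illustrative assumption about how the selection might look, not the actual patch: `native_cpu_flag` is a hypothetical name, and falling back to `-march=native` on non-ARM machines is an assumption not stated in this description.

```python
import platform


def native_cpu_flag(machine=None):
    """Pick the compiler flag that targets the host CPU.

    Hypothetical helper for illustration. On ARM64, -mcpu=native is used
    because it is accepted by GCC 6+ and Clang 7+; elsewhere this sketch
    assumes -march=native.
    """
    if machine is None:
        machine = platform.machine()
    if machine.lower() in ("aarch64", "arm64"):
        return "-mcpu=native"
    return "-march=native"


if __name__ == "__main__":
    print(native_cpu_flag())
```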
