Properly detect and enable vectorization on ARM
!320 (merged) has the side effect of breaking detection of SVE vectorization support and enablement of SVE in the compiler. My patch should properly fix the underlying problem.
py-cpuinfo is supported on ARM64 and can be used to detect Neon and SVE. However, there was indeed a bug here -- Neon is identified as asimd in /proc/cpuinfo, so we should check for asimd instead of neon.
While -march=native was not supported by Clang before 15, -mcpu=native is supported by GCC 6+ and Clang 7+. Let's use that instead of not adding a flag at all -- otherwise SVE support is not enabled in the compiler even if the hardware supports it.
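To illustrate the asimd-versus-neon point, here is a minimal sketch of reading the Features line from /proc/cpuinfo-style text. It is not the code in this merge request; the function name `arm_simd_features` and the sample text are hypothetical.

```python
def arm_simd_features(cpuinfo_text):
    """Return which of 'asimd' (Neon) and 'sve' appear in the Features line(s)."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On ARM64 Linux, the per-CPU flags are listed under the "Features" key.
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    # Neon is reported as "asimd"; the string "neon" does not appear on ARM64.
    return {name for name in ("asimd", "sve") if name in features}


# Hypothetical sample resembling an SVE-capable ARM64 machine.
sample = "processor\t: 0\nFeatures\t: fp asimd evtstrm crc32 sve\n"
detected = arm_simd_features(sample)
```

On a real machine the text would come from reading /proc/cpuinfo; checking for "neon" in the same set would always fail on ARM64, which is the bug described above.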
Activity
requested review from @schottenhamml
assigned to @schottenhamml
@schottenhamml, can you please take a quick look at these changes?
- Resolved by Helen Schottenhamml
@schottenhamml, have you had a chance to try out whether this fixes the problem you originally tried to solve with !320 (merged)?
added 12 commits
- d01fc61c...e356d3d7 - 10 commits from branch master
- 70afe477 - Merge remote-tracking branch 'origin/master' into arm64
- 8d99d156 - Remove cpuinfo dependency for SIMD detection on non-x86
added 1 commit
- 6cb620bb - Remove cpuinfo dependency for SIMD detection on non-x86
- Resolved by Helen Schottenhamml
@schottenhamml, as per our discussion, I removed the py-cpuinfo dependency on non-x86 architectures. Detection now takes place natively via ELF_HWCAP on Linux on ppc64, ppc64le, armv8, and armv9. The same mechanism was already used on riscv. This codepath is now actually tested in CI, except on riscv, where QEMU does not support it yet.
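For readers unfamiliar with hardware-capability bits, the following is a rough sketch of how such detection can look on AArch64 Linux. It assumes the bits are read from the auxiliary vector entry AT_HWCAP via glibc's `getauxval`, which may differ from what this merge request actually does. The function names are hypothetical; the bit positions are the AArch64 values from the kernel's hwcap header.

```python
import ctypes

AT_HWCAP = 16          # auxiliary vector key for the hardware capability mask
HWCAP_ASIMD = 1 << 1   # AArch64: Advanced SIMD (Neon)
HWCAP_SVE = 1 << 22    # AArch64: Scalable Vector Extension


def decode_aarch64_hwcap(hwcap):
    """Translate an AArch64 HWCAP bitmask into instruction set names."""
    names = []
    if hwcap & HWCAP_ASIMD:
        names.append("neon")
    if hwcap & HWCAP_SVE:
        names.append("sve")
    return names


def read_hwcap():
    """Read AT_HWCAP for the current process (Linux with glibc assumed)."""
    libc = ctypes.CDLL(None)
    libc.getauxval.restype = ctypes.c_ulong
    libc.getauxval.argtypes = [ctypes.c_ulong]
    return libc.getauxval(AT_HWCAP)
```

On an AArch64 Linux host, `decode_aarch64_hwcap(read_hwcap())` would list the available vector extensions without any third-party dependency. The bit constants are architecture-specific, so the decode function is only meaningful for masks read on AArch64.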
Edited by Michael Kuron
mentioned in merge request !328 (merged)
added 1 commit
- f0e9cd00 - Remove cpuinfo dependency for SIMD detection on non-x86
added 4 commits
- f0e9cd00...fd1c1259 - 3 commits from branch master
- cede566a - Merge remote-tracking branch 'origin/master' into arm64
added 4 commits
- cede566a...30b55d00 - 3 commits from branch master
- 267ce6a4 - Merge remote-tracking branch 'origin/master' into arm64
requested review from @holzer and removed review request for @schottenhamml
requested review from @schottenhamml and removed review request for @holzer
mentioned in commit 178b4df7