x86 half precision (float16) support
This issue describes the process of adding experimental support for half precision (float16) on x86.
A nice description of what can be expected: https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point
Some compilers provide support for the IEEE 754-2008 half precision (binary16) floating-point format on x86. Most current systems do not implement the necessary arithmetic instructions to work directly on float16; instead, values are promoted to float32 (single precision) before computations and quantized back to float16 afterwards. However, storage and interchange are performed in the 16-bit format, so memory-bound code can benefit if half precision is numerically sufficient.
Commit 238a069c adds experimental support for clang >= 15. New type aliases `half` and `float16` are defined in `DataTypes.h`.
Some caveats / remaining issues:

- `float16` variables have to be cast to `float` or `double` before formatting (i.e. when writing to `stdout`)
- almost no walberla feature has been tested with `float16` yet
Other compilers might also support such features, but have not been evaluated yet.
A small app that checks `float16` support has been implemented in `CheckFP16.cpp`.
Also, a new CMake option `WALBERLA_BUILD_WITH_HALF_PRECISION_SUPPORT` has been added. It does not set `real_t` to `float16`; it simply enables `float16` and the corresponding type aliases. Note that some intrinsics usually have to be enabled on x86, so `WALBERLA_OPTIMIZE_FOR_LOCALHOST` generally has to be enabled as well; otherwise you will likely encounter linker errors.
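A configure invocation combining the two options mentioned above might look as follows (the option names are taken from this issue; paths and generators are placeholders):

```shell
# Hypothetical configure line for a build tree; adjust source path as needed.
cmake -DWALBERLA_BUILD_WITH_HALF_PRECISION_SUPPORT=ON \
      -DWALBERLA_OPTIMIZE_FOR_LOCALHOST=ON \
      /path/to/walberla
```

`WALBERLA_OPTIMIZE_FOR_LOCALHOST` enables `-march=native`-style flags, which is what brings in the F16C/AVX intrinsics needed to avoid the linker errors mentioned above.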
First benchmarks with LIKWID show that autovectorization using single-precision AVX intrinsics seems to work nicely. Run `likwid-perfctr` on the `CheckFP16.cpp` app to see if that also works on your machine.
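For reference, a wrapper-mode invocation could look like the line below. The binary name, core pinning, and event group are assumptions; the set of available groups depends on your CPU (`likwid-perfctr -a` lists them).

```shell
# Pin the app to core 0 and measure the single-precision FLOP group.
likwid-perfctr -C 0 -g FLOPS_SP ./CheckFP16
```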