HIP Target and Platform
Introduce HIP as a dedicated code generation platform.
`pystencils.codegen`
- Add `Target.HIP`
- Change `Target.GPU` to alias `Target.CurrentGPU`
- Auto-detect `Target.CUDA` or `Target.HIP` on GPU systems when using `Target.CurrentGPU`
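The new target logic can be sketched roughly as follows. The enum below is a simplified stand-in for pystencils' actual `Target` type, and `resolve_current_gpu` is a hypothetical name for the auto-detection step, not the real API:

```python
from enum import Enum

class Target(Enum):
    """Simplified stand-in for pystencils' Target enum (illustrative only)."""
    CPU = "cpu"
    CUDA = "cuda"
    HIP = "hip"
    CurrentGPU = "current_gpu"
    # Target.GPU now aliases Target.CurrentGPU
    GPU = "current_gpu"

    def is_gpu(self) -> bool:
        return self in (Target.CUDA, Target.HIP, Target.CurrentGPU)

def resolve_current_gpu(target: Target, hip_available: bool) -> Target:
    """Map CurrentGPU onto the GPU toolchain found on the current system."""
    if target is Target.CurrentGPU:
        return Target.HIP if hip_available else Target.CUDA
    return target
```

Because `GPU` and `CurrentGPU` share a value, `Target.GPU is Target.CurrentGPU` holds, so existing code using `Target.GPU` keeps working.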
`pystencils.backend`
- Move all functionality common to CUDA and HIP into the `GenericGpu` platform class
- Introduce `HipPlatform`, inheriting from `GenericGpu`
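The resulting class layout might look like the sketch below; the method names (`thread_index`, `required_headers`) are illustrative, not the actual backend API:

```python
from abc import ABC, abstractmethod

class GenericGpu(ABC):
    """Sketch of a base platform holding logic shared by CUDA and HIP."""

    def thread_index(self, coord: int) -> str:
        # The global thread index is computed identically on both platforms,
        # since HIP mirrors CUDA's blockIdx/blockDim/threadIdx builtins.
        axis = "xyz"[coord]
        return f"blockIdx.{axis} * blockDim.{axis} + threadIdx.{axis}"

    @abstractmethod
    def required_headers(self) -> set[str]:
        """Headers the generated kernel source must include."""

class CudaPlatform(GenericGpu):
    def required_headers(self) -> set[str]:
        return set()

class HipPlatform(GenericGpu):
    def required_headers(self) -> set[str]:
        # HIP kernel sources must include the HIP runtime header.
        return {'"hip/hip_runtime.h"'}
```

Only the genuinely platform-specific pieces (such as required headers) remain in the subclasses.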
`pystencils.jit`
- Adapt the `CupyJit` to compile only CUDA kernels on NVidia platforms, and only HIP kernels on ROCm platforms
  - While HIP can technically act as a shallow wrapper around CUDA on NVidia systems, this does not make sense here, since a) we gain portability through code generation anyway, b) `hipcc` will just call `nvcc` on NVidia systems anyway, and c) cupy needs to be built against the entire ROCm software stack to use HIP.
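The resulting guard in the JIT might look like this sketch; `check_cupy_target` is a hypothetical helper, and in real code the cupy build flavor would be queried from cupy itself (e.g. via `cupy.cuda.runtime.is_hip`):

```python
class JitError(Exception):
    """Raised when a kernel cannot be compiled on the current platform."""

def check_cupy_target(target_name: str, cupy_is_hip: bool) -> None:
    """Reject target/runtime mismatches before attempting compilation.

    cupy_is_hip reflects whether cupy was built against ROCm
    (hypothetical parameter; queryable in practice from cupy).
    """
    supported = "HIP" if cupy_is_hip else "CUDA"
    if target_name != supported:
        raise JitError(
            f"This cupy installation can only compile {supported} kernels, "
            f"but a {target_name} kernel was requested."
        )
```

Failing fast here gives a clear error instead of an obscure compiler failure deep inside cupy.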
Code Adaptations
- Use `Target.CurrentGPU` throughout the test suite where it is required
- Use `Target.is_gpu()` to detect GPU targets in all places where targets were previously checked only against `Target.CUDA`
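The change in checking style can be illustrated as follows, with a simplified stand-in for the `Target` enum:

```python
from enum import Enum

class Target(Enum):
    """Simplified stand-in for pystencils' Target enum (illustrative only)."""
    CPU = "cpu"
    CUDA = "cuda"
    HIP = "hip"

    def is_gpu(self) -> bool:
        return self in (Target.CUDA, Target.HIP)

# Old style: checking against Target.CUDA alone silently misses HIP.
def uses_gpu_old(target: Target) -> bool:
    return target is Target.CUDA

# New style: the capability query covers all GPU targets.
def uses_gpu_new(target: Target) -> bool:
    return target.is_gpu()
```

The capability-based check keeps working as further GPU targets are added, without touching every call site again.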
Documentation
- Adapt installation guide and GPU codegen guide to reflect the new target
- Adapt backend GPU codegen docs
- Extend contrib guide with info on GPU development
On CI Testing
Since cupy only supports the combinations NVidia+CUDA and ROCm+HIP, we currently cannot test HIP code generation in the CI: we have no GitLab runners with AMD GPUs.
Rationale
Separate modelling of CUDA and HIP in the code generator will be necessary to capture architectural differences in !438 (merged). Also, it turns out to be very important for the new waLBerla code generator (see https://i10git.cs.fau.de/da15siwa/sfg-walberla); using the target is the easiest way to distinguish between CUDA and HIP for GPU codegen.
Edited by Frederik Hennig