Iteration Axes for GPU

Follow-up to !503 (merged) to extend the iteration axes system to GPU kernels.

This MR moves the materialization of GPU iteration spaces out of the GPU platforms and into the codegen module, using the new iteration axes system. We introduce new axis types for GPU indexing, as well as IR functions for the GPU block/thread indexing intrinsics. This allows the code generator, and potentially users, to describe GPU iteration strategies independently of the platform (CUDA, HIP, SYCL).
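
To illustrate the idea, here is a minimal, self-contained sketch of what materializing a GPU axis amounts to: a counter computed from block and thread indices, plus a kernel guard that rejects out-of-range threads. It is plain Python that simulates a 1D launch. `LinearAxis`, `materialize`, and `launch` are hypothetical names chosen for this example and are not the classes or functions introduced by this MR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearAxis:
    """Hypothetical stand-in for an iteration axis: range(start, stop, step)."""
    start: int
    stop: int
    step: int = 1


def materialize(axis, block_idx, block_dim, thread_idx):
    """Return the counter value for one thread, or None if the guard rejects it."""
    # Indexing logic: linearize block and thread index, then map onto the axis.
    linear_id = block_idx * block_dim + thread_idx
    ctr = axis.start + linear_id * axis.step
    # Kernel guard: surplus threads in the last block must not do any work.
    return ctr if ctr < axis.stop else None


def launch(axis, block_dim):
    """Simulate a 1D launch and collect the counter values that pass the guard."""
    n_items = max(0, -(-(axis.stop - axis.start) // axis.step))  # ceil division
    grid_dim = -(-n_items // block_dim)  # enough blocks to cover all items
    visited = []
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            ctr = materialize(axis, block_idx, block_dim, thread_idx)
            if ctr is not None:
                visited.append(ctr)
    return visited
```

With a block size of 4, `launch(LinearAxis(0, 10), 4)` simulates 12 threads; the guard discards the last two, so the visited counters match `range(10)`.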

We update the GenericGpu platforms to use the new system. The SYCL platform will be updated later (#126).

  • Introduce axis types and expansion strategies for GPU block and thread indexing
  • Update MaterializeAxes to turn GPU axes into indexing logic and kernel guards
  • Move GPU thread mapping logic from GenericGpu platform to codegen.gpu_indexing
  • Introduce GPU block/thread indexing intrinsics as IR functions, which the GPU platforms must lower during SelectFunctions
  • Purge old materialize_iteration_space code from CPU and GPU platforms
  • Update the documentation and document GPU codegen in the backend guide
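
As a rough sketch of the lowering step mentioned in the list above: a platform-independent intrinsic can be represented as a symbolic value and only later turned into a platform-specific expression. The example below is an assumption-laden illustration, not this project's API; `GpuIntrinsic` and `lower_cuda_like` are hypothetical names. The only external fact it relies on is that CUDA and HIP both expose the built-ins `blockIdx`, `threadIdx`, `blockDim`, and `gridDim` with `.x`, `.y`, `.z` members.

```python
from enum import Enum


class GpuIntrinsic(Enum):
    """Hypothetical platform-independent indexing intrinsics."""
    BLOCK_IDX = "blockIdx"
    THREAD_IDX = "threadIdx"
    BLOCK_DIM = "blockDim"
    GRID_DIM = "gridDim"


def lower_cuda_like(intrinsic, dim):
    """Lower an intrinsic to the built-in variable shared by CUDA and HIP."""
    if dim not in (0, 1, 2):
        raise ValueError("GPU grids have at most three dimensions")
    # Coordinate 0, 1, 2 corresponds to the .x, .y, .z member of the built-in.
    return f"{intrinsic.value}.{'xyz'[dim]}"
```

For example, `lower_cuda_like(GpuIntrinsic.THREAD_IDX, 0)` yields the string `threadIdx.x`. A SYCL platform would need a different lowering, which is why keeping the intrinsic symbolic until function selection is useful.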
Edited by Frederik Hennig
