Fundamental GPU Support
This MR introduces the fundamentals of GPU support to the new backend.
General
- Introduce GenericGpu platform and threads range export: GPU platforms communicate the kernel's required thread grid size to the outside via a GpuThreadsRange object separate from the AST
- Add configuration options relating to GPUs
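To illustrate the idea of exporting the required thread counts separately from the AST, here is a minimal Python sketch. The class `ThreadsRangeSketch`, its field `num_work_items`, and the helper `from_bounds` are hypothetical names chosen for this example; they are not the interface of the `GpuThreadsRange` object introduced by this MR. The sketch assumes one thread per iteration point and iteration bounds given as `(start, stop, step)` triples.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ThreadsRangeSketch:
    """Hypothetical stand-in for a threads range kept next to the kernel, not in its AST."""

    num_work_items: Tuple[int, ...]  # threads required per dimension

    @staticmethod
    def from_bounds(bounds: Iterable[Tuple[int, int, int]]) -> "ThreadsRangeSketch":
        # One thread per iteration point: len(range(...)) equals ceil((stop - start) / step).
        return ThreadsRangeSketch(
            tuple(len(range(start, stop, step)) for start, stop, step in bounds)
        )

    @property
    def total(self) -> int:
        # Total number of threads needed across all dimensions.
        n = 1
        for w in self.num_work_items:
            n *= w
        return n


# Example: a 2D iteration space with a one-cell border in x and stride 2 in y.
threads_range = ThreadsRangeSketch.from_bounds([(1, 99, 1), (0, 64, 2)])
print(threads_range.num_work_items, threads_range.total)
```

Keeping this information outside the kernel body means a caller can decide how to split the threads into blocks without inspecting or modifying the generated code.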
CUDA Platform
- Introduce CUDA platform
- Add materialization + guards for full and sparse iteration spaces
- Add materialization of math functions
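The role of the guards can be shown with a small Python emulation of a one-dimensional launch. This is only an illustration under stated assumptions: the function names `emulate_full_space` and `emulate_sparse_space` are hypothetical, the loops stand in for GPU blocks and threads, and the sparse case is assumed to map each thread to one entry of an index list. The code generated by the backend is not reproduced here.

```python
def emulate_full_space(n, block_size):
    """Emulate a 1D launch over n iteration points with a rounded-up grid."""
    grid_size = -(-n // block_size)  # ceiling division
    visited = []
    skipped = 0
    for block_idx in range(grid_size):
        for thread_idx in range(block_size):
            i = block_idx * block_size + thread_idx  # global linear thread index
            if i < n:
                # Guard passed: this thread owns iteration point i.
                visited.append(i)
            else:
                # Guard failed: surplus thread from the rounded-up grid does nothing.
                skipped += 1
    return visited, skipped


def emulate_sparse_space(index_list, block_size):
    """Sparse case: thread i processes the i-th entry of an index list."""
    visited, _ = emulate_full_space(len(index_list), block_size)
    return [index_list[i] for i in visited]


visited, skipped = emulate_full_space(10, 4)
print(visited, skipped)
print(emulate_sparse_space([(0, 1), (3, 2), (5, 5)], 2))
```

Because the grid is a whole number of blocks, it usually contains more threads than iteration points; the guard is what keeps the surplus threads from touching memory outside the iteration space.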
SYCL Platform
- Introduce SYCL platform
- Add materialization + guards for full and sparse iteration spaces
- Add materialization of math functions
CUDA Just-In-Time Compiler
- Migrate implementation of cupy-based JIT to new backend as an object-oriented structure
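As a rough picture of what an object-oriented wrapper around a cupy-compiled kernel could look like, consider the sketch below. The class `KernelWrapperSketch`, its constructor parameters, and the default block size are assumptions made for this example and do not describe the classes added by this MR. The only cupy API used is `cupy.RawKernel`, which is imported lazily so that the launch configuration can be inspected without a GPU. The block size is picked at call time, and the grid is derived from the required thread counts.

```python
from math import ceil


class KernelWrapperSketch:
    """Hypothetical wrapper holding CUDA source and the required threads per dimension."""

    def __init__(self, source, name, num_work_items, default_block=(128, 1, 1)):
        self.source = source
        self.name = name
        self.num_work_items = tuple(num_work_items)
        self.default_block = default_block  # assumed default, not taken from the MR
        self._compiled = None  # compiled lazily on first launch

    def launch_config(self, block_size=None):
        # Pad to three dimensions, then round the grid up so every work item gets a thread.
        block = tuple(block_size or self.default_block)
        items = self.num_work_items + (1,) * (3 - len(self.num_work_items))
        grid = tuple(ceil(n / b) for n, b in zip(items, block))
        return grid, block

    def __call__(self, *args, block_size=None):
        import cupy  # imported here so the class is usable for inspection without a GPU

        if self._compiled is None:
            self._compiled = cupy.RawKernel(self.source, self.name)
        grid, block = self.launch_config(block_size)
        self._compiled(grid, block, args)


wrapper = KernelWrapperSketch("/* CUDA source here */", "kernel", (100, 64))
print(wrapper.launch_config((32, 8, 1)))
```

In this arrangement the kernel source never needs to know the block size; changing it only changes the launch configuration computed by the wrapper.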
Deviations and Missing Features
In the new implementation, block size selection is entirely up to the JIT / the runtime system and no longer affects the backend. Adaptive block sizes, register restrictions, etc. are not yet implemented by this MR.
Edited by Frederik Hennig