Optimization for GPU block size determination
This MR optimizes the determination of GPU block sizes so that they are always multiples of the hardware's warp (CUDA) or wavefront (HIP) size.
In summary, this MR
- removes BasicOption GpuOptions.omit_range_check
- removes BasicOption GpuOptions.block_size
- introduces BasicOption GpuOptions.warp_size and implements a function for determining default values
- introduces BasicOption assume_warp_aligned_block_size, which assures the compiler that block sizes are aligned with the warp size
- adds the new GpuOptions to the data flow of GpuIndexing
- adds an algorithm for fitting the block size according to the iteration space and the warp size
- adds fit_block_size and trim_block_size member functions to DynamicBlockSizeLaunchConfiguration for computing block sizes based on a user-defined initial block size and the iteration space
- for assumed alignment: rounds block sizes to multiples of the warp size when the iteration space is unknown at code generation time
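
To illustrate the idea behind trimming and fitting, here is a minimal sketch in plain Python. It is not the algorithm implemented in this MR: the free functions trim_block and fit_block are hypothetical stand-ins, and the fitting strategy (rounding only the first dimension up to a multiple of the warp size) is an assumption chosen for brevity. It also ignores hardware limits such as the maximum number of threads per block.

```python
from math import prod


def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


def trim_block(block_size, iteration_space):
    """Hypothetical helper: clamp each block dimension to the iteration space."""
    return tuple(min(b, n) for b, n in zip(block_size, iteration_space))


def fit_block(block_size, iteration_space, warp_size):
    """Hypothetical helper: trim, then make the thread count a warp multiple.

    Assumption: if the trimmed block is not warp-aligned, only the first
    dimension is rounded up to a multiple of the warp size.
    """
    trimmed = trim_block(block_size, iteration_space)
    if prod(trimmed) % warp_size == 0:
        return trimmed
    return (round_up(trimmed[0], warp_size),) + trimmed[1:]


# User-defined initial block size, a small iteration space, warp size 32.
print(trim_block((128, 2, 1), (100, 1, 1)))     # (100, 1, 1)
print(fit_block((128, 2, 1), (100, 1, 1), 32))  # (128, 1, 1)
```

In this sketch, trimming alone yields 100 threads per block, which leaves the last warp partially filled. Fitting rounds the block up to 128 threads, so the kernel needs a range check to mask out the surplus threads.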
Edited by Richard Angersbach