Iteration Slices: Extended GPU support + bugfixes (!429) · Merge requests · pycodegen / pystencils

fhennig/gpu-iteration-spaces into v2.0-dev

Merged: Frederik Hennig requested to merge fhennig/gpu-iteration-spaces into v2.0-dev 5 months ago

This MR slightly extends the support for more general iteration slices on the CUDA platform, greatly extends the test suite for iteration slices on all targets, and fixes some bugs found along the way.

Code Generator Configuration

Add special value AUTO to pystencils.config to model automatic behavior; None is no longer used to mean "automatic" when specifying ghost layers
Add configuration option manual_launch_grid to disable automatic inference of the GPU launch grid size
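
The motivation for a dedicated AUTO value can be illustrated with a generic sentinel pattern. The following is a minimal sketch, not the pystencils implementation: the names `_Auto` and `resolve_ghost_layers` are hypothetical, and the meaning given to None here ("option left unset") is an assumption made for the example only.

```python
from enum import Enum
from typing import Optional, Union


class _Auto(Enum):
    """Single-member enum used as a sentinel for 'decide automatically'."""
    AUTO = "AUTO"


# Hypothetical stand-in for a special AUTO value; not the pystencils object.
AUTO = _Auto.AUTO


def resolve_ghost_layers(ghost_layers: Union[_Auto, None, int],
                         inferred: int) -> Optional[int]:
    """Pick the ghost layer count for a kernel (illustrative only).

    AUTO -> use the count inferred by the code generator
    None -> option left unset (assumption of this sketch)
    int  -> explicit user choice, passed through unchanged
    """
    if ghost_layers is AUTO:
        return inferred
    return ghost_layers
```

With a distinct sentinel, `resolve_ghost_layers(AUTO, 2)` and `resolve_ghost_layers(None, 2)` can no longer be confused, which is the ambiguity the change above removes.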

Iteration Slices on GPU

Have the CUDA platform raise a warning if it cannot infer a launch grid because of dependencies between dimensions
Extend the JIT-compiled kernel object to allow manual specification of the launch grid, and enforce this if no grid size was inferred from the kernel
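
To make the launch grid issue concrete, here is a rough sketch of the general idea in plain Python. It is not the pystencils code: `infer_launch_grid` and `launch` are hypothetical helpers, a dependent dimension is modelled simply as an extent of None, and letting a manual grid take precedence over an inferred one is an assumption of this sketch.

```python
from typing import Optional, Sequence, Tuple


def infer_launch_grid(extents: Sequence[Optional[int]],
                      block_size: Sequence[int]) -> Optional[Tuple[int, ...]]:
    """Return blocks per dimension, or None if the grid cannot be inferred.

    An extent of None stands for a dimension whose iteration count depends
    on the counter of another dimension, so no fixed size is known.
    """
    if any(e is None for e in extents):
        return None
    # Ceiling division: enough blocks to cover every iteration.
    return tuple((e + b - 1) // b for e, b in zip(extents, block_size))


def launch(extents, block_size, grid=None):
    """Choose the grid for a launch; require a manual grid if none was inferred."""
    inferred = infer_launch_grid(extents, block_size)
    if grid is None and inferred is None:
        raise ValueError("launch grid could not be inferred; specify it manually")
    return grid if grid is not None else inferred
```

For a rectangular iteration space such as `launch((100, 64), (32, 8))` the grid follows directly from the extents, while `launch((100, None), (32, 8))` fails until a grid is passed explicitly.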

Together, these changes allow the iteration limits of faster coordinates to depend on the current counter value of slower coordinates, e.g. for triangular iteration patterns or red-black checkerboard iteration (see test cases).
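
The two patterns named above can be written out as plain Python loops to show what "limits depending on a slower counter" means. This is only an illustration of the index sets; it does not use the pystencils slice syntax, and the function names are hypothetical.

```python
def triangular(n):
    """Lower triangle: the inner (faster) limit depends on the outer counter i."""
    return [(i, j) for i in range(n) for j in range(i + 1)]


def red_black(n, color):
    """Checkerboard: the inner start depends on the parity of the outer counter.

    color=0 visits cells where (i + j) is even, color=1 where it is odd.
    """
    return [(i, j) for i in range(n) for j in range((i + color) % 2, n, 2)]
```

In both cases the number of inner iterations is not a fixed rectangle extent, which is why a launch grid cannot always be derived from the kernel alone.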

Documentation for these features will be added in a follow-up MR.

Bugfixes

Fix parsing of iteration slices that are negative integers
Fix a bug in the loop vectorizer where the trailing loop was only executed if the SIMD loop had run for at least one iteration
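
The symptom of the vectorizer bug can be reproduced with a small model of a strided main loop followed by a trailing (remainder) loop. This is a sketch of the described behavior only, not the generated code; `visited_indices` and its `guard_trailing` flag are hypothetical and exist just to contrast the two variants.

```python
def visited_indices(n, width, guard_trailing=False):
    """Return the indices touched by a 'SIMD' loop of the given width plus its trailing loop.

    guard_trailing=True emulates the bug: the trailing loop is skipped
    whenever the SIMD loop did not run at least once.
    """
    visited = []
    simd_end = (n // width) * width  # last index covered by full SIMD steps

    # Main loop: processes `width` elements per iteration.
    for i in range(0, simd_end, width):
        visited.extend(range(i, i + width))

    ran_simd = simd_end > 0
    if ran_simd or not guard_trailing:
        # Trailing loop: handles the remaining n % width elements one by one.
        for i in range(simd_end, n):
            visited.append(i)
    return visited
```

With `n=3` and `width=4`, the guarded variant visits nothing, whereas the unguarded variant visits all three elements; for `n >= width` both agree.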

Test Suite

Add pytest fixtures for available codegen targets, to simplify writing tests that should succeed on all hardware
Add extensive tests for common and more uncommon iteration slices on all targets
