Martin Bauer authored
Features:
- uses generated pack infos for packing & unpacking directly on the GPU
- can directly send GPU buffers if CUDA-enabled MPI is available; otherwise the packed buffers are transferred to the CPU first
- communication hiding with CUDA streams: communication can be run asynchronously
  - especially useful when the compute kernel is also split up into an inner and an outer part
- added RAII classes for CUDA streams and events
- equivalence test that checks whether the generated CPU and GPU (overlapped) versions compute the same result as the normal waLBerla LBM kernel
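The inner/outer split and the equivalence check described above can be illustrated with a small CPU-only sketch. It is not the waLBerla implementation: `std::async` stands in for a CUDA stream, a periodic ghost-cell copy stands in for MPI communication, and a 3-point averaging stencil stands in for the LBM kernel. All names (`Field`, `exchangeGhostLayers`, `sweep`, `stepPlain`, `stepOverlapped`) are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical 1D field: cells 0 and n+1 are ghost cells, 1..n are interior.
using Field = std::vector<double>;

// Stand-in for communication: fill the ghost cells periodically.
void exchangeGhostLayers(Field& f) {
    const std::size_t n = f.size() - 2;
    f[0] = f[n];
    f[n + 1] = f[1];
}

// Stand-in compute kernel: 3-point average on cells [begin, end).
void sweep(const Field& src, Field& dst, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0;
}

// Reference version: communicate first, then update every interior cell.
Field stepPlain(Field src) {
    Field dst(src.size(), 0.0);
    exchangeGhostLayers(src);
    sweep(src, dst, 1, src.size() - 1);
    return dst;
}

// Overlapped version: communication runs asynchronously while the inner
// part is computed; the outer part waits for the ghost cells.
Field stepOverlapped(Field src) {
    Field dst(src.size(), 0.0);
    const std::size_t n = src.size() - 2;

    // Start the ghost-layer exchange on another thread (the "stream").
    auto comm = std::async(std::launch::async,
                           [&src] { exchangeGhostLayers(src); });

    // Inner cells 2..n-1 read only cells 1..n, never the ghost cells,
    // so this sweep does not conflict with the exchange.
    sweep(src, dst, 2, n);

    // Wait for communication, then update the two outer cells.
    comm.get();
    sweep(src, dst, 1, 2);
    sweep(src, dst, n, n + 1);
    return dst;
}
```

Calling `stepPlain` and `stepOverlapped` on the same input and comparing the returned fields mirrors the idea of the equivalence test: the overlapped schedule changes only the order of work, not the result.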
319909f0
Forked from
waLBerla / waLBerla