- Jan 22, 2019
Martin Bauer authored
-
Martin Bauer authored
Features:
  - uses generated pack infos for packing & unpacking directly on the GPU
  - can send GPU buffers directly if CUDA-enabled MPI is available; otherwise the packed buffers are transferred to the CPU first
  - communication hiding with CUDA streams: communication can run asynchronously, which is especially useful when the compute kernel is also split into an inner and an outer part
  - added RAII classes for CUDA streams and events
  - equivalence test that checks that the generated CPU and GPU (overlapped) versions compute the same result as the normal waLBerla LBM kernel
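The RAII classes mentioned in this commit are not shown here, so the following is only a minimal sketch of the general pattern, not the actual waLBerla code. `FakeStream`, `fakeStreamCreate`, `fakeStreamDestroy`, `g_liveStreams` and `StreamRAII` are hypothetical names chosen so that the example compiles without a GPU toolchain; in a CUDA build the stand-ins would be replaced by `cudaStream_t`, `cudaStreamCreate` and `cudaStreamDestroy`.

```cpp
#include <cassert>
#include <utility>

// Stand-ins for a C-style handle API (hypothetical; with CUDA these would be
// cudaStream_t, cudaStreamCreate and cudaStreamDestroy).
using FakeStream = int;
static int g_liveStreams = 0;  // number of handles that are currently open

inline int fakeStreamCreate(FakeStream* s) { *s = ++g_liveStreams; return 0; }
inline int fakeStreamDestroy(FakeStream) { --g_liveStreams; return 0; }

// RAII owner: acquires the handle in the constructor and releases it in the
// destructor. Copying is disabled so the handle is destroyed exactly once.
class StreamRAII {
public:
    StreamRAII() { fakeStreamCreate(&stream_); owns_ = true; }
    ~StreamRAII() { if (owns_) fakeStreamDestroy(stream_); }

    StreamRAII(const StreamRAII&) = delete;
    StreamRAII& operator=(const StreamRAII&) = delete;

    // Moving transfers ownership and leaves the source without a handle.
    StreamRAII(StreamRAII&& other) noexcept
        : stream_(other.stream_), owns_(other.owns_) { other.owns_ = false; }

    // Implicit conversion so the wrapper can be passed to C-style calls.
    operator FakeStream() const { return stream_; }

private:
    FakeStream stream_ = 0;
    bool owns_ = false;
};
```

With a wrapper of this kind, a stream created inside a scope is released when the scope ends, including on early returns and exceptions, so no explicit destroy call is needed at each exit path. The same shape would apply to events.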
-
- Sep 26, 2017
Martin Bauer authored
-
- Aug 02, 2017
Martin Bauer authored
-
-
Martin Bauer authored
-