Commits · 020743a1a0387db068fb87b2660129133f4d960b · itischler / waLBerla

Jan 22, 2019

Precompute x and f allocation size of GPUField · 020743a1
Martin Bauer authored 6 years ago

020743a1

New GPU communication scheme with GPU kernels for packing · 319909f0

Martin Bauer authored 6 years ago

Features:
   - uses generated pack infos for packing & unpacking directly on GPU
   - can send GPU buffers directly if CUDA-aware MPI is available,
     otherwise the packed buffers are transferred to the CPU first
   - communication hiding with CUDA streams: communication can run
     asynchronously, which is especially useful when the compute kernel
     is also split into an inner and an outer part

- added RAII classes for CUDA streams and events
- equivalence test that checks whether the generated CPU and GPU (overlapped)
  versions compute the same result as the normal waLBerla LBM kernel
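
The RAII classes for CUDA streams and events are not shown in this log. As a rough sketch of the pattern only: the wrapper below acquires a handle in its constructor and releases it in its destructor. `fakeStreamCreate` and `fakeStreamDestroy` are stand-ins for `cudaStreamCreate` and `cudaStreamDestroy` so that the example builds without the CUDA toolkit; `ScopedStream` and all other names are hypothetical and are not taken from waLBerla.

```cpp
#include <utility>

// Stand-ins for the CUDA stream API (assumption: a real wrapper would call
// cudaStreamCreate / cudaStreamDestroy here). They only track live handles.
using FakeStream = int;  // 0 means "no stream"

inline int& liveStreamCount() { static int count = 0; return count; }

inline FakeStream fakeStreamCreate() {
    static int nextId = 0;
    ++liveStreamCount();
    return ++nextId;
}

inline void fakeStreamDestroy(FakeStream) { --liveStreamCount(); }

// Hypothetical RAII owner of one stream handle.
class ScopedStream {
public:
    ScopedStream() : stream_(fakeStreamCreate()) {}
    ~ScopedStream() { reset(); }

    // A handle has exactly one owner, so copying is disabled.
    ScopedStream(const ScopedStream&) = delete;
    ScopedStream& operator=(const ScopedStream&) = delete;

    // Moving transfers ownership and leaves the source empty.
    ScopedStream(ScopedStream&& other) noexcept
        : stream_(std::exchange(other.stream_, 0)) {}

    ScopedStream& operator=(ScopedStream&& other) noexcept {
        if (this != &other) {
            reset();
            stream_ = std::exchange(other.stream_, 0);
        }
        return *this;
    }

    FakeStream get() const { return stream_; }

private:
    // Release the handle if this object still owns one.
    void reset() {
        if (stream_ != 0) {
            fakeStreamDestroy(stream_);
            stream_ = 0;
        }
    }

    FakeStream stream_;
};
```

With this shape, a stream created inside a scope is released on every exit path, including early returns and exceptions.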

319909f0
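
To illustrate what a pack info does in commit 319909f0, here is a CPU-only sketch of packing and unpacking a ghost layer. It is not the generated GPU code: the commit performs these copies in GPU kernels, while this example uses plain loops on a small 2D field. `Field2D`, `packEast` and `unpackWest` are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D field with one ghost layer on each side, row-major storage.
struct Field2D {
    std::size_t nx, ny;        // interior size
    std::vector<double> data;  // (nx + 2) * (ny + 2) values

    Field2D(std::size_t nx_, std::size_t ny_)
        : nx(nx_), ny(ny_), data((nx_ + 2) * (ny_ + 2), 0.0) {}

    // x in [0, nx + 1], y in [0, ny + 1]; indices 0 and n + 1 are ghost cells.
    double& at(std::size_t x, std::size_t y) { return data[y * (nx + 2) + x]; }
};

// Pack: copy the last interior column (x = nx) into a contiguous send buffer.
inline std::vector<double> packEast(Field2D& field) {
    std::vector<double> buffer;
    buffer.reserve(field.ny);
    for (std::size_t y = 1; y <= field.ny; ++y)
        buffer.push_back(field.at(field.nx, y));
    return buffer;
}

// Unpack: write a received buffer into the west ghost column (x = 0).
inline void unpackWest(Field2D& field, const std::vector<double>& buffer) {
    for (std::size_t y = 1; y <= field.ny; ++y)
        field.at(0, y) = buffer[y - 1];
}
```

In the scheme the commit describes, the buffer would stay on the GPU when CUDA-aware MPI is available and would otherwise be copied to the CPU before sending.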

May 30, 2018
- Replace boost::bind with std::bind · 26ac8bb5
  Christian Godenschwager authored 6 years ago
  Part of issue #48
  26ac8bb5
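
The diff for commit 26ac8bb5 is not part of this log. As a generic illustration of such a migration, the sketch below shows the `std::bind` form of two typical bindings. The functions `scaledSum` and `Offset::apply` are hypothetical and exist only for this example. The main visible difference is that the placeholders come from `std::placeholders`, whereas code using `boost::bind` typically wrote `_1`, `_2` unqualified.

```cpp
#include <functional>

// Hypothetical free function, used only to illustrate the migration.
inline int scaledSum(int a, int b, int scale) { return (a + b) * scale; }

// Hypothetical class with a member function to bind.
struct Offset {
    int value;
    int apply(int x) const { return x + value; }
};

// Before: boost::bind(scaledSum, _1, _2, scale)
// After:  std::bind with placeholders taken from std::placeholders.
inline std::function<int(int, int)> makeScaledSum(int scale) {
    using namespace std::placeholders;
    return std::bind(scaledSum, _1, _2, scale);
}

// Binding a member function works the same way; the object is copied
// into the resulting function object.
inline std::function<int(int)> makeOffset(const Offset& offset) {
    using namespace std::placeholders;
    return std::bind(&Offset::apply, offset, _1);
}
```

For example, `makeScaledSum(10)(1, 2)` evaluates `(1 + 2) * 10`. A lambda such as `[scale](int a, int b) { return scaledSum(a, b, scale); }` is a common alternative to either form of `bind`.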
Jan 27, 2018
- Remove boost regex, chrono, thread · fb43f673
  Michael Kuron authored 7 years ago
  
  fb43f673
- Remove boost::function · eaf49972
  Michael Kuron authored 7 years ago
  
  eaf49972
Jan 05, 2018
- Exported all integer types of GPUFields to Python · 9095d9ef
  Martin Bauer authored 7 years ago
  
  9095d9ef
Dec 11, 2017
- Bugfix in cuda python export · 719f209b
  Martin Bauer authored 7 years ago
  
  719f209b
Dec 08, 2017

GPUPackInfo: add asynchronous (un)packing capabilities · 6bfe8c59

João Victor Tozatti Risso authored 7 years ago


Changes introduced in this commit are the following:

- CUDA streams: Add support for asynchronous (un)packing operations using CUDA
  streams in cuda::communication::GPUPackInfo. Through asynchronous operations
  it is possible to overlap GPU computation and MPI communication in simulations
  (e.g. LBM simulations). Asynchronous copies in CUDA require pinned memory on
  the host, and for that purpose a staging buffer is introduced (i.e.
  cuda::communication::PinnedMemoryBuffer) in the cuda module, which is used to
  stage data between the GPU and the MPI buffers.

- zyxf layout: Add zyxf field layout support in GPUPackInfo through extensions
  of the functions in cuda::GPUCopy.

- Extended GPUPackInfo test: Add stream and zyxf layout tests to the
  GPUPackInfoTest to test the proposed implementation.

- Extended Kernel: add CUDA stream and shared memory configuration support in
  cuda::Kernel class.

Signed-off-by: João Victor Tozatti Risso <joaovictortr@protonmail.com>

6bfe8c59
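
The two layout names in commit 6bfe8c59 list the coordinates from slowest to fastest varying. A minimal sketch of the resulting linear indices follows, assuming a densely packed field without ghost layers or padding (the actual GPUField allocation may be padded). `FieldSize`, `indexFzyx` and `indexZyxf` are hypothetical helper names.

```cpp
#include <cstddef>

// Hypothetical description of the field extents; f is the number of values per cell.
struct FieldSize {
    std::size_t x, y, z, f;
};

// fzyx: f varies slowest, x fastest (structure of arrays).
inline std::size_t indexFzyx(const FieldSize& s, std::size_t x, std::size_t y,
                             std::size_t z, std::size_t f) {
    return ((f * s.z + z) * s.y + y) * s.x + x;
}

// zyxf: z varies slowest, f fastest (array of structures).
inline std::size_t indexZyxf(const FieldSize& s, std::size_t x, std::size_t y,
                             std::size_t z, std::size_t f) {
    return ((z * s.y + y) * s.x + x) * s.f + f;
}
```

In fzyx, neighbouring cells of the same f component are adjacent in memory. In zyxf, all f values of one cell are adjacent, so copy routines need different strides for each layout.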

Nov 17, 2017
- CUDA: exported field copy functions to python · 2e30cf27
  Martin Bauer authored 7 years ago
  
  2e30cf27
Sep 26, 2017
- Field data getters used by code generation · 733e2ad1
  Martin Bauer authored 7 years ago
  
  733e2ad1
Aug 02, 2017
- Fixes in CUDA module · a5f840b0
  Martin Bauer authored 7 years ago
  
  a5f840b0
- Added test for usage of GPU comm in Python · d2f851dd
  Martin Bauer authored 7 years ago
  
  d2f851dd
- Python export for GPUFields and interface to pycuda · ba5733cc
  Martin Bauer authored 7 years ago
  
  ba5733cc
- CUDA communication that does not rely on cuda aware MPI · dd28a536
  Paulo Carvalho authored 7 years ago and Martin Bauer committed 7 years ago
  
  dd28a536
- CUDA support · 6fc7b559
  Martin Bauer authored 7 years ago
  
  6fc7b559
