1. 18 May, 2020 1 commit
    • Replaced `waLBerla_python_file_generates` with `waLBerla_generate_target_from_python` · f30f0fa4
      Dominik Thoennes authored
      This new function does the actual generation using pystencils and creates a target library that can be used as a dependency as usual. The Python script itself should no longer be listed in `add_executable`.
      Convention: Use the AppName as a prefix for the generated lib!
      Example Usage:
      waLBerla_generate_target_from_python( NAME AppNameGeneratedLib FILE GenerateKernel.py OUT_FILES Kernel.h Kernel.cpp )
      waLBerla_add_executable( NAME AppNameGeneratedLibTest DEPENDS AppNameGeneratedLib )
  2. 26 Jul, 2019 1 commit
  3. 24 Jan, 2019 1 commit
  4. 22 Jan, 2019 2 commits
    • GPU Fields: alignment function such that first inner cell is aligned · bdfa6981
      Martin Bauer authored
      Previously, the ghost layer (rather than the first inner cell) was aligned.
    • New GPU communication scheme with GPU kernels for packing · 319909f0
      Martin Bauer authored
      - uses generated pack infos for packing & unpacking directly on the GPU
      - can send GPU buffers directly if CUDA-enabled MPI is available;
        otherwise the packed buffers are transferred to the CPU first
      - communication hiding with CUDA streams: communication can run
        asynchronously, which is especially useful when the compute kernel is
        also split into an inner and an outer part
      - added RAII classes for CUDA streams and events
      - equivalence test that checks whether the generated CPU and GPU
        (overlapped) versions compute the same result as the normal waLBerla
        LBM kernel
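      The RAII classes mentioned above can be sketched as follows. This is a hypothetical illustration of the pattern, not waLBerla's actual stream wrapper; `cudaStreamCreate`/`cudaStreamDestroy` are replaced by stand-in functions so the sketch compiles and runs without the CUDA toolkit.

      ```cpp
      #include <cstdio>
      #include <utility>

      // Stand-ins for cudaStreamCreate / cudaStreamDestroy (hypothetical,
      // so this sketch is self-contained without CUDA headers).
      using streamHandle = int;
      static int g_liveStreams = 0;
      void createStream(streamHandle* s)     { *s = ++g_liveStreams; }
      void destroyStream(streamHandle /*s*/) { --g_liveStreams; }

      // RAII wrapper: the stream is created in the constructor and destroyed
      // in the destructor, so it cannot leak on early return or exception.
      class StreamGuard {
       public:
         StreamGuard()  { createStream(&stream_); }
         ~StreamGuard() { if (owns_) destroyStream(stream_); }

         // movable but not copyable, like a typical handle wrapper
         StreamGuard(StreamGuard&& other) noexcept
            : stream_(other.stream_), owns_(std::exchange(other.owns_, false)) {}
         StreamGuard(const StreamGuard&) = delete;
         StreamGuard& operator=(const StreamGuard&) = delete;

         streamHandle get() const { return stream_; }
       private:
         streamHandle stream_{};
         bool owns_ = true;
      };

      int main() {
         {
            StreamGuard a;   // stream created
            StreamGuard b;   // second stream for overlapped communication
            std::printf("live streams: %d\n", g_liveStreams);  // prints 2
         }                   // both destroyed here, in reverse order
         std::printf("live streams: %d\n", g_liveStreams);     // prints 0
         return 0;
      }
      ```

      Because the destructor always runs, a stream passed to an asynchronous communication step is released even when an error path skips the explicit cleanup.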
  5. 11 Dec, 2017 1 commit
  6. 08 Dec, 2017 1 commit
    • GPUPackInfo: add asynchronous (un)packing capabilities · 6bfe8c59
      João Victor Tozatti Risso authored
      Changes introduced in this commit are the following:
      - CUDA streams: Add support for asynchronous (un)packing operations using CUDA
        streams in cuda::communication::GPUPackInfo. Through asynchronous operations
        it is possible to overlap GPU computation and MPI communication in simulations
        (e.g. LBM simulations). Asynchronous copies in CUDA require pinned host
        memory; for that purpose a staging buffer
        (cuda::communication::PinnedMemoryBuffer) is introduced in the cuda
        module and used to stage data between the GPU and the MPI buffers.
      - zyxf layout: Add zyxf field layout support in GPUPackInfo through extensions
        of the functions in cuda::GPUCopy.
      - Extended GPUPackInfo test: Add stream and zyxf layout tests to the
        GPUPackInfoTest to test the proposed implementation.
      - Extended Kernel: add CUDA stream and shared memory configuration support
        to the cuda::Kernel class.
      Signed-off-by: João Victor Tozatti Risso <joaovictortr@protonmail.com>
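      The staging-buffer idea above can be sketched as follows. This is a hypothetical illustration, not the actual cuda::communication::PinnedMemoryBuffer; the real class would allocate page-locked memory via `cudaMallocHost`/`cudaFreeHost` (required for asynchronous device-host copies), which are replaced here by stand-ins so the sketch builds without CUDA.

      ```cpp
      #include <cstdio>
      #include <cstdlib>
      #include <cstring>

      // Stand-ins for cudaMallocHost / cudaFreeHost (hypothetical;
      // real pinned memory lets async copies overlap with MPI traffic).
      void pinnedAlloc(void** ptr, std::size_t bytes) { *ptr = std::malloc(bytes); }
      void pinnedFree(void* ptr)                      { std::free(ptr); }

      // Grow-only staging buffer: device-to-host copies land in this pinned
      // area first, and the contents are then handed to the MPI buffers.
      class PinnedStagingBuffer {
       public:
         ~PinnedStagingBuffer() { pinnedFree(data_); }

         // Returns a pointer to at least `bytes` of staging memory,
         // reallocating only when the buffer has to grow.
         unsigned char* resize(std::size_t bytes) {
            if (bytes > capacity_) {
               pinnedFree(data_);
               void* p = nullptr;
               pinnedAlloc(&p, bytes);
               data_ = static_cast<unsigned char*>(p);
               capacity_ = bytes;
            }
            return data_;
         }
         std::size_t capacity() const { return capacity_; }
       private:
         unsigned char* data_ = nullptr;
         std::size_t capacity_ = 0;
      };

      int main() {
         PinnedStagingBuffer staging;
         unsigned char* p = staging.resize(64);  // first allocation
         std::memset(p, 0xAB, 64);               // pretend a GPU pack kernel wrote here
         staging.resize(32);                     // no reallocation: capacity stays 64
         std::printf("capacity: %zu\n", staging.capacity());  // prints "capacity: 64"
         return 0;
      }
      ```

      Growing instead of reallocating on every exchange matters because pinned allocations are comparatively expensive; reusing one buffer per communication direction keeps the per-step cost low.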
  7. 20 Sep, 2017 1 commit
  8. 02 Aug, 2017 3 commits