    Improvements for GPU code generation · 0cdd23d8
    Martin Bauer authored
    - turned on restrict keyword by default (makes large difference on GPUs)
    - smarter block indexing: changing block size depending on domain size
      Example: previously there were (1,1,1) blocks when requested
      block size was (64, 1, 1) and domain size (1, 512, 512), now the
      block size is changed automatically to (1, 64, 1) in this case
    - added __launch_bounds__ to kernels to allow better optimizations from
      the CUDA compiler
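
    The block size adaptation described above can be sketched roughly as
    follows. This is a minimal illustration, not the code from this commit:
    the function name adapt_block_size and the redistribution rule (clamp
    each dimension to the domain, then hand the threads that no longer fit
    to the next dimension with enough room) are assumptions chosen so that
    the example from the message is reproduced.

```python
def adapt_block_size(requested, domain):
    """Shrink a requested block size to fit the domain.

    Hypothetical helper (not the pystencils API): dimensions where the
    block exceeds the domain are clamped, and the thread factor that no
    longer fits is moved to another dimension that can absorb it.
    """
    block = list(requested)
    n = len(block)
    for i in range(n):
        if block[i] > domain[i]:
            # Integer factor of threads that do not fit in dimension i
            surplus = block[i] // domain[i]
            block[i] = domain[i]
            # Try the other dimensions in cyclic order, starting after i
            for offset in range(1, n):
                j = (i + offset) % n
                if block[j] * surplus <= domain[j]:
                    block[j] *= surplus
                    break
            # If no dimension has room, the surplus threads are dropped
    return tuple(block)


# Example from the commit message
print(adapt_block_size((64, 1, 1), (1, 512, 512)))
```

    With this rule the total number of threads per block never exceeds the
    requested total, so a launch bound derived from the requested block
    size stays valid for the adapted one.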
Forked from pycodegen / pystencils_walberla
Source project has a limited visibility.