  1. Jan 03, 2020
  2. Oct 23, 2019
  3. Oct 21, 2019
  4. Oct 20, 2019
  5. Sep 10, 2019
  6. Jun 18, 2019
    • CUDA indexing: clip to maximum cuda block size · 1754ef27
      Martin Bauer authored
      - the previous method did not work with kernels generated for waLBerla,
        where block size changes are made at runtime
      - a device query does not always work, since the system that compiles
        the code may have no GPU or a different GPU
      -> the max block size is passed as a parameter and only optionally
         determined by a device query
      release/0.2.3
      1754ef27
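      The clipping described in this commit could look roughly like the
      sketch below. It is not the project's actual code: the function name
      `clip_block_size`, the `device_query` callback, and the default limit
      of (1024, 1024, 64) threads per block dimension (typical for current
      NVIDIA GPUs) are assumptions made for illustration.

```python
def clip_block_size(block_size, max_block_size=(1024, 1024, 64), device_query=None):
    """Clip each entry of block_size to the per-dimension maximum.

    max_block_size is passed in explicitly, so no GPU is needed on the
    machine that generates the code. device_query is a hypothetical,
    optional callable that returns the limits of an actual device.
    """
    if device_query is not None:
        # Only ask the device when the caller explicitly requests it.
        max_block_size = device_query()
    return tuple(min(b, m) for b, m in zip(block_size, max_block_size))


if __name__ == "__main__":
    print(clip_block_size((2048, 1, 128)))
    print(clip_block_size((256, 4, 1), device_query=lambda: (128, 128, 32)))
```

      Passing the limit as a plain parameter keeps code generation
      independent of the hardware present at generation time.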
  7. Jun 14, 2019
  8. Jun 12, 2019
  9. May 29, 2019
  10. May 06, 2019
  11. May 05, 2019
  12. May 03, 2019
  13. Apr 29, 2019
  14. Apr 28, 2019
  15. Apr 26, 2019
  16. Apr 24, 2019
    • Improvements for GPU code generation · f504b40f
      Martin Bauer authored
      - turned on the restrict keyword by default (makes a large difference
        on GPUs)
      - smarter block indexing: the block size now changes depending on the
        domain size
        Example: previously there were (1, 1, 1) blocks when the requested
        block size was (64, 1, 1) and the domain size (1, 512, 512); now the
        block size is changed automatically to (1, 64, 1) in this case
      - added __launch_bounds__ to kernels to allow better optimizations from
        the CUDA compiler
      f504b40f
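      One way to get the block size adaptation from the example above is
      sketched below. This is an illustrative assumption, not the algorithm
      used in the commit: `adapt_block_size` is a hypothetical name, and the
      doubling strategy is just one simple way to move threads into other
      dimensions.

```python
from math import prod


def adapt_block_size(block_size, domain_size):
    """Fit a requested block size to the domain while keeping the thread count.

    Step 1 clips every dimension to the domain size (the old behaviour
    stopped here). Step 2 hands the threads lost by clipping to the other
    dimensions by doubling them, as long as the domain has room and the
    requested total number of threads is not exceeded.
    """
    block = [min(b, d) for b, d in zip(block_size, domain_size)]
    target = prod(block_size)
    for i in range(len(block)):
        while prod(block) * 2 <= target and block[i] * 2 <= domain_size[i]:
            block[i] *= 2
    return tuple(block)


if __name__ == "__main__":
    # The case from the commit message: thin domain in x.
    print(adapt_block_size((64, 1, 1), (1, 512, 512)))  # (1, 64, 1)
```

      When the domain is large enough in every dimension, the requested
      block size is returned unchanged.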
  17. Apr 16, 2019
  18. Apr 15, 2019
    • Bugfix: For certain MRT methods the CSE failed · 27a131fb
      Martin Bauer authored
      - replace_density_and_velocity simplification produced terms like
        0 * omega, because sympy's auto-eval is turned off
      - sympy's CSE routine can apparently only handle evaluated terms
      - solution: evaluate multiplications with zero (i.e. replace them by 0)
      27a131fb
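      The fix described above can be illustrated with a small sympy sketch.
      The helper name `evaluate_zero_products` is hypothetical and the code
      is not taken from the commit; it only shows the idea of replacing
      unevaluated products that contain a zero factor by 0 before handing
      the expression to `sympy.cse`.

```python
import sympy as sp


def evaluate_zero_products(expr):
    """Recursively replace products with a zero factor by 0.

    All other Add/Mul/Pow nodes are rebuilt with evaluate=False so that
    the rest of the expression is not simplified automatically.
    """
    if not expr.args:
        return expr  # symbols and numbers have no children
    args = [evaluate_zero_products(a) for a in expr.args]
    if isinstance(expr, sp.Mul) and any(a == 0 for a in args):
        return sp.Integer(0)
    if isinstance(expr, (sp.Add, sp.Mul, sp.Pow)):
        return expr.func(*args, evaluate=False)
    return expr.func(*args)


if __name__ == "__main__":
    omega, x = sp.symbols("omega x")
    # Build 0*omega + x without automatic evaluation.
    term = sp.Add(sp.Mul(0, omega, evaluate=False), x, evaluate=False)
    cleaned = evaluate_zero_products(term)
    print(cleaned)
    print(sp.cse(cleaned))
```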
  19. Apr 14, 2019
    • Fixes · 9bfd862f
      Martin Bauer authored
      - style fixes for issues flagged by flake8
      - using the newest kerncraft version
      9bfd862f