1. 11 Jul, 2019 1 commit
  2. 10 Jul, 2019 7 commits
  3. 08 Jul, 2019 1 commit
      Add global_declarations to cbackend · 3463ff54
      Stephan Seitz authored
      This enables astnodes.Nodes to have a member required_global_declarations
      by which they can specify a global declaration required for their usage.
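The mechanism described above can be illustrated with a minimal, self-contained sketch. It does not use pystencils: the classes `Node` and `UsesLookupTable` and the function `collect_global_declarations` are hypothetical stand-ins. Only the attribute name `required_global_declarations` comes from the commit message, and how the real backend gathers and emits the declarations is an assumption here.

```python
class Node:
    """Toy AST node (hypothetical, not the pystencils class)."""
    # Declarations that must appear at file scope for this node to work.
    required_global_declarations = ()

    def __init__(self, *children):
        self.children = children


class UsesLookupTable(Node):
    """Toy node whose generated code relies on a file-scope table."""
    required_global_declarations = (
        "static const double table[4] = {0.0, 0.5, 1.0, 1.5};",
    )


def collect_global_declarations(root):
    """Walk the tree and gather each required declaration once, in visit order."""
    declarations = []

    def visit(node):
        for decl in node.required_global_declarations:
            if decl not in declarations:
                declarations.append(decl)
        for child in node.children:
            visit(child)

    visit(root)
    return declarations


tree = Node(UsesLookupTable(), Node(UsesLookupTable()))
global_decls = collect_global_declarations(tree)
```

In this sketch a backend would print `global_decls` ahead of the kernel function, so a declaration used by several nodes is emitted only once.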
  4. 03 Jul, 2019 1 commit
      Make subexpressions optional for constructing an AssignmentCollection · 05119269
      Stephan Seitz authored
      When introducing new people to pystencils it's often simpler not to
      differentiate between `main_assignments` and `subexpressions` in the
      beginning.
      Also, for simple kernels subexpressions are often not needed, since
      intermediate symbols can also be set in main_assignments.
      
      Subexpressions should be kept for expert users.
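A toy illustration of the idea of an optional `subexpressions` argument, assuming a simplified stand-in class. `SimpleAssignmentCollection` is hypothetical and is not the pystencils `AssignmentCollection`; assignments are modelled as plain `(lhs, rhs)` string pairs so that the sketch has no dependencies.

```python
class SimpleAssignmentCollection:
    """Hypothetical stand-in that treats subexpressions as optional."""

    def __init__(self, main_assignments, subexpressions=None):
        self.main_assignments = list(main_assignments)
        # Beginners can omit subexpressions; an empty list is used instead.
        self.subexpressions = list(subexpressions) if subexpressions is not None else []

    @property
    def all_assignments(self):
        # Subexpressions are evaluated before the main assignments.
        return self.subexpressions + self.main_assignments


# Simple usage: the intermediate symbol "tmp" is just another main assignment.
simple = SimpleAssignmentCollection([("tmp", "a + b"), ("dst", "2 * tmp")])

# Expert usage: the intermediate symbol is kept separate as a subexpression.
expert = SimpleAssignmentCollection([("dst", "2 * tmp")],
                                    subexpressions=[("tmp", "a + b")])
```

Both forms yield the same ordered list of assignments; the separation only matters to users who want to treat intermediate symbols differently.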
  5. 27 Jun, 2019 2 commits
  6. 18 Jun, 2019 1 commit
      CUDA indexing: clip to maximum cuda block size · 1754ef27
      Martin Bauer authored
      - previous method did not work with kernels generated for walberla where
        block size changes are made at runtime
      - device query does not always work, since the compile system may have
        no GPU or not the same GPU
      -> max block size is passed as parameter and only optionally determined
         by a device query
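The approach in this message can be sketched as follows, assuming clipping is done per dimension. The function `clip_block_size` and its `device_query` parameter are hypothetical names, not the pystencils API. The default limit `(1024, 1024, 64)` is the per-dimension block limit of current CUDA devices and is used here only as an illustrative default.

```python
def clip_block_size(block_size, max_block_size=(1024, 1024, 64), device_query=None):
    """Clip each block dimension to a maximum.

    The maximum is passed in as a parameter, so no GPU is needed on the
    machine that generates or compiles the code. If a `device_query`
    callable is given (hypothetical hook), its result overrides the default.
    """
    if device_query is not None:
        max_block_size = device_query()
    return tuple(min(b, m) for b, m in zip(block_size, max_block_size))


# No GPU required: limits come from the parameter.
clipped_default = clip_block_size((2048, 1, 128))

# Optional device query, here faked with a lambda returning smaller limits.
clipped_queried = clip_block_size((2048, 1, 128),
                                  device_query=lambda: (512, 512, 64))
```

Passing the limit explicitly keeps code generation deterministic across machines, while the optional query still allows tuning on a system that has the target GPU.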
  7. 14 Jun, 2019 2 commits
  8. 12 Jun, 2019 1 commit
  9. 07 Jun, 2019 1 commit
  10. 29 May, 2019 1 commit
  11. 06 May, 2019 2 commits
  12. 05 May, 2019 1 commit
  13. 03 May, 2019 3 commits
  14. 29 Apr, 2019 3 commits
  15. 28 Apr, 2019 5 commits
  16. 26 Apr, 2019 6 commits
  17. 24 Apr, 2019 1 commit
      Improvements for GPU code generation · f504b40f
      Martin Bauer authored
      - turned on restrict keyword by default (makes large difference on GPUs)
      - smarter block indexing: changing block size depending on domain size
        Example: previously there were (1,1,1) blocks when requested
        block size was (64, 1, 1) and domain size (1, 512, 512), now the
        block size is changed automatically to (1, 64, 1) in this case
      - added __launch_bounds__ to kernels to allow better optimizations from
        the CUDA compiler
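The block-size adaptation in the example above can be sketched as follows. This is one possible redistribution strategy, written to reproduce the example from the message; the function name `adapt_block_size` is hypothetical and the actual pystencils implementation may differ.

```python
def adapt_block_size(block_size, domain_size):
    """Shrink block dimensions that exceed the domain and move the freed
    threads to other dimensions that still have room (illustrative sketch)."""
    # Step 1: no block dimension may be larger than the domain in that dimension.
    clipped = [min(b, d) for b, d in zip(block_size, domain_size)]
    block = list(clipped)

    # Step 2: redistribute the factor lost by clipping to the other dimensions.
    for i in range(len(block)):
        factor = block_size[i] // clipped[i]
        for j in range(len(block)):
            if factor <= 1:
                break
            if j == i:
                continue
            room = domain_size[j] // block[j]  # how much dimension j can still grow
            grow = min(factor, room)
            block[j] *= grow
            factor //= grow
    return tuple(block)


adapted = adapt_block_size((64, 1, 1), (1, 512, 512))
unchanged = adapt_block_size((64, 1, 1), (512, 512, 512))
```

With the values from the message, the requested `(64, 1, 1)` block on a `(1, 512, 512)` domain becomes `(1, 64, 1)` instead of collapsing to `(1, 1, 1)`, while a block that already fits the domain is left as requested.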
  18. 16 Apr, 2019 1 commit