1. 18 Jul, 2019 1 commit
  2. 15 Jul, 2019 1 commit
  3. 12 Jul, 2019 1 commit
  4. 11 Jul, 2019 1 commit
  5. 10 Jul, 2019 2 commits
    • 2313eda2
    • Add DestructuringBindingsForFieldClass to use pystencils kernels in a more C++-ish way · 8e63c9ff
      Stephan Seitz authored
      DestructuringBindingsForFieldClass defines all field-related variables
      in its subordinate block, but leaves one TypedSymbol of type 'Field'
      undefined for each field.
      With that trick we can generate kernels that accept structs as
      kernel parameters.
      This makes it possible either to pass a pystencils-specific Field
      struct with the following definition:
      
      ```cpp
      #include <array>
      #include <cstddef>
      #include <cstdint>

      template<typename DTYPE_T, std::size_t DIMENSION>
      struct Field
      {
          DTYPE_T* data;                               // pointer to the first element
          std::array<std::int64_t, DIMENSION> shape;   // extents are integers, not DTYPE_T
          std::array<std::int64_t, DIMENSION> stride;  // strides are integers, not DTYPE_T
      };
      ```
      
      or to destructure user-defined types like `pybind11::array`,
      `at::Tensor`, or `tensorflow::Tensor`.
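      The destructuring idea can be sketched in hand-written C++. This is
      only an illustration, not the code pystencils generates: the kernel
      name `add_one`, the local variable names, the integer shape/stride
      types, and the convention that strides count elements (not bytes) are
      all assumptions.

      ```cpp
      #include <array>
      #include <cstddef>
      #include <cstdint>

      // Assumed Field layout with integer shape and stride (in elements).
      template <typename DTYPE_T, std::size_t DIMENSION>
      struct Field {
          DTYPE_T* data;
          std::array<std::int64_t, DIMENSION> shape;
          std::array<std::int64_t, DIMENSION> stride;
      };

      // Hypothetical kernel: the only parameter is the struct itself.
      inline void add_one(const Field<double, 2>& field) {
          // "Destructuring bindings": unpack the struct into the plain
          // variables that the loop body works with.
          double* const field_data = field.data;
          const std::int64_t shape_0 = field.shape[0];
          const std::int64_t shape_1 = field.shape[1];
          const std::int64_t stride_0 = field.stride[0];
          const std::int64_t stride_1 = field.stride[1];

          for (std::int64_t i = 0; i < shape_0; ++i) {
              for (std::int64_t j = 0; j < shape_1; ++j) {
                  field_data[i * stride_0 + j * stride_1] += 1.0;
              }
          }
      }
      ```

      The loop body never touches the struct, so the same body could sit
      behind bindings that unpack a different container type instead.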
  6. 08 Jul, 2019 1 commit
    • Add global_declarations to cbackend · 3463ff54
      Stephan Seitz authored
      This enables astnodes.Nodes to have a member required_global_declarations
      by which they can specify a global declaration required for their usage.
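      How a backend might gather such declarations can be sketched as
      follows. The `Node` struct and `collect_global_declarations` function
      are hypothetical stand-ins written in C++ for illustration; they are
      not the pystencils classes, and the deduplication policy is an
      assumption.

      ```cpp
      #include <algorithm>
      #include <string>
      #include <vector>

      // Hypothetical AST node: carries the declarations it needs at file scope.
      struct Node {
          std::vector<std::string> required_global_declarations;
          std::vector<Node> children;
      };

      // Depth-first walk that records each declaration once, in first-seen order.
      inline void collect_global_declarations(const Node& node,
                                              std::vector<std::string>& out) {
          for (const auto& decl : node.required_global_declarations) {
              if (std::find(out.begin(), out.end(), decl) == out.end()) {
                  out.push_back(decl);
              }
          }
          for (const auto& child : node.children) {
              collect_global_declarations(child, out);
          }
      }
      ```

      A backend could then print the collected strings ahead of the kernel
      function, so each node's requirements are met exactly once.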
  7. 27 Jun, 2019 1 commit
  8. 26 Apr, 2019 2 commits
  9. 24 Apr, 2019 1 commit
    • Improvements for GPU code generation · f504b40f
      Martin Bauer authored
      - turned on restrict keyword by default (makes a large difference on GPUs)
      - smarter block indexing: the block size now adapts to the domain size.
        Example: previously, a requested block size of (64, 1, 1) on a
        domain of size (1, 512, 512) resulted in (1, 1, 1) blocks; now the
        block size is automatically changed to (1, 64, 1) in this case
      - added __launch_bounds__ to kernels to allow better optimizations by
        the CUDA compiler
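      One way to get the block-size behaviour from the example is sketched
      below. The function `adapt_block_size` and its redistribution rule are
      assumptions made for illustration; the commit does not spell out the
      actual algorithm.

      ```cpp
      #include <array>
      #include <cstddef>

      // Shrink each block dimension that exceeds the domain and hand the
      // surplus factor to the first other dimension that still has room.
      inline std::array<int, 3> adapt_block_size(std::array<int, 3> block,
                                                 const std::array<int, 3>& domain) {
          for (std::size_t i = 0; i < 3; ++i) {
              if (block[i] <= domain[i]) {
                  continue;  // this dimension already fits
              }
              const int surplus = block[i] / domain[i];
              block[i] = domain[i];
              for (std::size_t j = 0; j < 3; ++j) {
                  if (j != i && block[j] * surplus <= domain[j]) {
                      block[j] *= surplus;
                      break;
                  }
              }
          }
          return block;
      }
      ```

      With a requested block of (64, 1, 1) and a domain of (1, 512, 512)
      this returns (1, 64, 1). If no other dimension can take the surplus,
      this sketch simply ends up with fewer threads per block.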
  10. 14 Apr, 2019 1 commit
    • Fixes · 9bfd862f
      Martin Bauer authored
      - style changes marked by flake
      - using newest kerncraft version
  11. 05 Apr, 2019 1 commit
  12. 03 Apr, 2019 1 commit
  13. 01 Apr, 2019 2 commits
  14. 21 Mar, 2019 1 commit
    • Separated modules into subfolders with own setup.py · 1e02cdc7
      Martin Bauer authored
      This restructuring allows for easier separation of modules into
      separate repositories later. Also, pip install with a repo URL can now
      be used.
      
      The setup.py files have also been updated to correctly reference each
      other. Module versions are now extracted from the git state.
  15. 18 Mar, 2019 1 commit
  16. 15 Mar, 2019 2 commits
  17. 07 Mar, 2019 2 commits
  18. 26 Feb, 2019 1 commit
    • Random number generation support for pystencils · 6a01f3e2
      Martin Bauer authored
      - counter-based philox RNG: counter/key is filled with cell coordinate
        and optional external parameters like block position and time step
      - works on CPU and GPU (on CPU only for non-vectorized versions)
      
      - introduced more flexible "CustomCodeNode" that can inject
        backend-specific hand-written code
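      The counter-based idea can be sketched with a compact Philox4x32-10
      in portable C++. The round function and constants follow the published
      Philox scheme as best recalled here, so check them against a reference
      implementation before relying on the numbers. The helper `cell_random`,
      its argument layout (cell coordinates and time step in the counter,
      seed in the key), and the float conversion are assumptions, not the
      pystencils code.

      ```cpp
      #include <array>
      #include <cstdint>

      using Counter = std::array<std::uint32_t, 4>;
      using Key = std::array<std::uint32_t, 2>;

      // One Philox round: two 32x32 -> 64 bit multiplications mixed with the key.
      inline Counter philox_round(const Counter& c, const Key& k) {
          const std::uint64_t p0 = std::uint64_t{0xD2511F53u} * c[0];
          const std::uint64_t p1 = std::uint64_t{0xCD9E8D57u} * c[2];
          const std::uint32_t hi0 = static_cast<std::uint32_t>(p0 >> 32);
          const std::uint32_t lo0 = static_cast<std::uint32_t>(p0);
          const std::uint32_t hi1 = static_cast<std::uint32_t>(p1 >> 32);
          const std::uint32_t lo1 = static_cast<std::uint32_t>(p1);
          return Counter{{hi1 ^ c[1] ^ k[0], lo1, hi0 ^ c[3] ^ k[1], lo0}};
      }

      // Ten rounds; the key is bumped by the Weyl constants between rounds.
      inline Counter philox4x32_10(Counter c, Key k) {
          for (int round = 0; round < 10; ++round) {
              if (round > 0) {
                  k[0] += 0x9E3779B9u;
                  k[1] += 0xBB67AE85u;
              }
              c = philox_round(c, k);
          }
          return c;
      }

      // Hypothetical helper: the cell coordinates and time step form the
      // counter, the seed forms the key. Returns a float in [0, 1).
      inline float cell_random(std::uint32_t x, std::uint32_t y, std::uint32_t z,
                               std::uint32_t time_step, std::uint32_t seed) {
          const Counter out =
              philox4x32_10(Counter{{x, y, z, time_step}}, Key{{seed, 0u}});
          return static_cast<float>(out[0] >> 8) * (1.0f / 16777216.0f);
      }
      ```

      Because the result depends only on counter and key, no generator
      state has to be stored per cell, and every cell can compute its value
      independently of the others.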
  19. 18 Feb, 2019 1 commit
  20. 16 Nov, 2018 1 commit
  21. 14 Nov, 2018 1 commit
    • Pass field information (shape,stride) as single elements instead of arr · 7a94740d
      Martin Bauer authored
      - small (length < 5) arrays with shape and stride information had to be
        memcpy'd to the GPU before every kernel call
      - instead of passing the information as arrays, the single elements are
        passed
      - leads to more function arguments, but simplifies GPU kernel calls
      
      -> changes in all backends required
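      The difference between the two calling conventions can be sketched in
      plain C++. Both function names and signatures are hypothetical and do
      not reproduce the generated kernels; strides are assumed to count
      elements.

      ```cpp
      #include <cstdint>

      // Old style: shape and stride arrive as small arrays. For a GPU kernel
      // these arrays would first have to be copied to device memory.
      inline double sum_arrays(const double* data, const std::int64_t* shape,
                               const std::int64_t* stride) {
          double total = 0.0;
          for (std::int64_t i = 0; i < shape[0]; ++i)
              for (std::int64_t j = 0; j < shape[1]; ++j)
                  total += data[i * stride[0] + j * stride[1]];
          return total;
      }

      // New style: every element is a scalar argument of its own, so it
      // travels by value with the call and needs no separate copy.
      inline double sum_scalars(const double* data,
                                std::int64_t shape_0, std::int64_t shape_1,
                                std::int64_t stride_0, std::int64_t stride_1) {
          double total = 0.0;
          for (std::int64_t i = 0; i < shape_0; ++i)
              for (std::int64_t j = 0; j < shape_1; ++j)
                  total += data[i * stride_0 + j * stride_1];
          return total;
      }
      ```

      The scalar variant has more parameters, which matches the trade-off
      named in the commit message.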
  22. 13 Nov, 2018 1 commit
  23. 26 Oct, 2018 1 commit
  24. 25 Oct, 2018 1 commit
  25. 19 Oct, 2018 1 commit
  26. 05 Sep, 2018 2 commits
  27. 06 Jul, 2018 1 commit
  28. 25 Jun, 2018 1 commit
  29. 18 May, 2018 1 commit
  30. 14 May, 2018 3 commits
  31. 13 May, 2018 1 commit
    • Improved Vectorization · 501b2d7e
      Martin Bauer authored
      - support for aligned loads/stores
      - support for nontemporal stores
      - aligned memory allocation for arrays and temporary buffers
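      Aligned loads and stores require the buffer itself to start at an
      aligned address. A portable way to obtain one is sketched below; the
      class `AlignedDoubles` and the helper `is_aligned` are hypothetical,
      and over-allocating plus `std::align` is just one possible technique,
      not necessarily what pystencils does.

      ```cpp
      #include <cstddef>
      #include <cstdint>
      #include <memory>
      #include <vector>

      // Owns raw storage and exposes a pointer aligned to `alignment` bytes.
      // `alignment` must be a power of two.
      class AlignedDoubles {
      public:
          AlignedDoubles(std::size_t count, std::size_t alignment)
              : storage_(count * sizeof(double) + alignment) {
              void* p = storage_.data();
              std::size_t space = storage_.size();
              // std::align moves p forward to the next aligned address.
              data_ = static_cast<double*>(
                  std::align(alignment, count * sizeof(double), p, space));
          }
          AlignedDoubles(const AlignedDoubles&) = delete;
          AlignedDoubles& operator=(const AlignedDoubles&) = delete;

          double* data() { return data_; }

      private:
          std::vector<unsigned char> storage_;
          double* data_ = nullptr;
      };

      // True if the pointer is a multiple of `alignment` bytes.
      inline bool is_aligned(const void* p, std::size_t alignment) {
          return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
      }
      ```

      A 32-byte alignment would suit 256-bit vector registers and 64 bytes
      would suit 512-bit ones.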
  32. 11 May, 2018 1 commit
    • Generalized vectorization · 57a3c27e
      Martin Bauer authored
      - vectorization for loops with ranges that are not a multiple of the
        vector width
      - vectorization for variable-sized loops if the special transformation
        replace_inner_stride_with_one is run
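      A common way to handle a range that is not a multiple of the vector
      width is a vector-width main loop followed by a scalar remainder loop.
      The sketch below shows that pattern in plain C++; the function `scale`
      and the constant `kVectorWidth` are hypothetical, the inner lane loop
      stands in for a single SIMD instruction, and whether pystencils uses
      exactly this scheme is an assumption.

      ```cpp
      #include <cstddef>

      constexpr std::size_t kVectorWidth = 4;  // e.g. four doubles per register

      inline void scale(double* dst, const double* src, std::size_t n,
                        double factor) {
          // Largest multiple of the vector width that fits into n.
          const std::size_t vec_end = n - n % kVectorWidth;

          // Main loop: full vectors only.
          for (std::size_t i = 0; i < vec_end; i += kVectorWidth) {
              for (std::size_t lane = 0; lane < kVectorWidth; ++lane) {
                  dst[i + lane] = factor * src[i + lane];
              }
          }

          // Remainder loop: the last n % kVectorWidth elements, one by one.
          for (std::size_t i = vec_end; i < n; ++i) {
              dst[i] = factor * src[i];
          }
      }
      ```

      For n = 10 the main loop covers elements 0 to 7 and the remainder
      loop covers elements 8 and 9.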