  1. Nov 16, 2018
  2. Nov 14, 2018
    • Pass field information (shape,stride) as single elements instead of arr · 7a94740d
      Martin Bauer authored
      - small (length < 5) arrays with shape and stride information had to be
        memcpy'd to the GPU before every kernel call
      - instead of passing the information as arrays, the single elements are
        passed
      - leads to more function arguments, but simplifies GPU kernel calls
      
      -> changes in all backends required
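
The idea of passing single elements instead of small arrays can be sketched as follows. This is only an illustration, not the code from this commit: the helper `flatten_field_info` and the argument naming scheme are hypothetical.

```python
def flatten_field_info(field_name, shape, strides):
    """Expand shape/stride tuples into individually named scalar arguments.

    The naming scheme (<field>_shape_<i>, <field>_stride_<i>) is hypothetical.
    """
    scalar_args = {}
    for axis, extent in enumerate(shape):
        scalar_args[f"{field_name}_shape_{axis}"] = int(extent)
    for axis, stride in enumerate(strides):
        scalar_args[f"{field_name}_stride_{axis}"] = int(stride)
    return scalar_args


# Scalars like these can be passed by value with the kernel call,
# so no separate host-to-device copy of a small array is needed.
args = flatten_field_info("src", shape=(64, 32), strides=(32, 1))
```

The trade-off matches the commit text: the kernel signature grows by one argument per dimension, but the call no longer depends on a prior memcpy.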
  3. Nov 13, 2018
  4. Oct 26, 2018
  5. Oct 25, 2018
  6. Oct 19, 2018
  7. Sep 05, 2018
  8. Jul 06, 2018
  9. Jun 25, 2018
  10. May 18, 2018
  11. May 14, 2018
  12. May 13, 2018
    • Improved Vectorization · 501b2d7e
      Martin Bauer authored
      - support aligned load/stores
      - nontemporal stores
      - aligned memory allocation for arrays and temporary buffers
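
Aligned allocation is commonly done by over-allocating and skipping to the next aligned address. The sketch below shows that general technique with the standard library only. It is an assumption about the approach, not the allocation routine from this commit, and `allocate_aligned` is a hypothetical name.

```python
import ctypes


def allocate_aligned(num_bytes, alignment=32):
    """Return (view, address) where address is a multiple of `alignment`."""
    # Over-allocate so an aligned start always fits inside the raw buffer.
    raw = bytearray(num_bytes + alignment)
    base_address = ctypes.addressof(ctypes.c_char.from_buffer(raw))
    # Number of bytes to skip to reach the next aligned address.
    offset = (-base_address) % alignment
    aligned_view = memoryview(raw)[offset:offset + num_bytes]
    return aligned_view, base_address + offset


view, address = allocate_aligned(100, alignment=32)
```

The returned view keeps the underlying buffer alive, so the aligned address stays valid as long as the view is referenced.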
  13. May 11, 2018
    • Generalized vectorization · 57a3c27e
      Martin Bauer authored
      - vectorization for loops with ranges that are not a multiple of vector width
      - vectorization for variable sized loops if special transformation
        replace_inner_stride_with_one is run
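
A loop whose range is not a multiple of the vector width is usually handled by splitting it into a vectorized main part and a scalar remainder. The sketch below illustrates that general pattern in plain Python. It is not the generated code from this commit, and `split_loop` and `scaled_copy` are hypothetical names.

```python
def split_loop(start, stop, vector_width):
    """Split [start, stop) into a main range stepping by vector_width and a scalar tail."""
    trip_count = max(stop - start, 0)
    main_stop = start + (trip_count // vector_width) * vector_width
    return range(start, main_stop, vector_width), range(main_stop, start + trip_count)


def scaled_copy(src, factor, vector_width=4):
    """Multiply every element by factor, processing vector_width elements per main step."""
    dst = [0.0] * len(src)
    main, tail = split_loop(0, len(src), vector_width)
    for i in main:
        # Stands in for one vector load, multiply, and store.
        dst[i:i + vector_width] = [factor * v for v in src[i:i + vector_width]]
    for i in tail:
        # Scalar remainder loop for the leftover elements.
        dst[i] = factor * src[i]
    return dst
```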
  14. Apr 27, 2018
  15. Apr 13, 2018
  16. Apr 10, 2018
  17. Mar 05, 2018
    • Boundary conditions · fd68e34d
      Martin Bauer authored
      - in-kernel Neumann boundaries
      - flag interface for boundary handling lets multiple boundary handlings
        share one flag field
      - generator: support for bitwise logical operators
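
Sharing one flag field between several boundary handlings typically means giving each boundary its own bit and testing cells with bitwise operators. The sketch below shows that idea only. The class `FlagInterface` and its `reserve` method are hypothetical and are not taken from this commit.

```python
class FlagInterface:
    """Hands out distinct bits of one integer flag field (hypothetical sketch)."""

    def __init__(self, max_bits=32):
        self._flags = {}
        self._max_bits = max_bits

    def reserve(self, name):
        """Return the bit mask for `name`, reserving a new bit on first use."""
        if name in self._flags:
            return self._flags[name]
        bit = len(self._flags)
        if bit >= self._max_bits:
            raise ValueError("no free bits left in the flag field")
        self._flags[name] = 1 << bit
        return self._flags[name]


flags = FlagInterface()
no_slip = flags.reserve("no_slip")
inflow = flags.reserve("inflow")

# One cell value can carry several flags; each handling tests only its own bit.
cell = no_slip | inflow
is_no_slip = bool(cell & no_slip)
```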
  18. Feb 08, 2018
    • lbmpy phasefield · 9cf1ac28
      Martin Bauer authored
      - step class for LB phasefield generic enough to work with 3-phase and
        N-phase models
      - the Cahn-Hilliard equation can be solved either by LBM or by finite differences
      - 3 phase model can be solved with rho phase or without
  19. Dec 03, 2017
  20. Oct 17, 2017
  21. Oct 10, 2017
  22. Oct 09, 2017
    • Vectorization & Type system overhaul · ea847bc5
      Martin Bauer authored
      - first vectorization tests are running
      - type system: use memoized getTypeOfExpression
      - casts are done using sp.Function('cast')
      - C backend adapted for vectorization support
      - AST nodes can require optional headers
  23. Sep 20, 2017
  24. Jul 26, 2017
  25. Jul 21, 2017
  26. Jul 07, 2017
  27. Mar 24, 2017
    • Conditional AST Node & advanced CUDA indexing · ff641ec9
      Martin Bauer authored
      - abstraction layer for selecting CUDA block and grid sizes
        - line based (was implemented before)
        - block based (new, more flexible)
      - new conditional (if/else) AST node, which is necessary for indexing schemes (guarding if)
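
Why block-based indexing needs a guarding if can be shown with a small emulation. When the domain size is not a multiple of the block size, the grid is rounded up and some threads fall outside the domain. The sketch below is plain Python, not CUDA and not the code from this commit. `grid_shape` and `emulate_launch` are hypothetical names.

```python
from itertools import product


def grid_shape(domain_shape, block_shape):
    """Number of blocks per dimension, rounded up (ceiling division)."""
    return tuple(-(-n // b) for n, b in zip(domain_shape, block_shape))


def emulate_launch(domain_shape, block_shape):
    """Count threads that pass and fail the guard `cell < domain_shape`."""
    grid = grid_shape(domain_shape, block_shape)
    visited = 0
    skipped = 0
    for block_idx in product(*(range(g) for g in grid)):
        for thread_idx in product(*(range(b) for b in block_shape)):
            cell = tuple(bi * b + ti
                         for bi, b, ti in zip(block_idx, block_shape, thread_idx))
            # Guarding if: only threads inside the domain do any work.
            if all(c < n for c, n in zip(cell, domain_shape)):
                visited += 1
            else:
                skipped += 1
    return visited, skipped
```

For a 10x3 domain with 4x2 blocks, the grid is 3x2 blocks, so 48 threads are launched for 30 cells and the guard filters out the remaining 18.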
  28. Mar 16, 2017
  29. Mar 14, 2017
    • pystencils: fields can now contain structs · ec3faf51
      Martin Bauer authored
      - this extension is necessary for more generic boundary treatment
      - cells can now be structs, i.e. contain different data types
      - instead of having numeric index dimensions, one can use the index per cell to address struct elements
  30. Mar 13, 2017
    • pystencils: Cleaned up type system · c8b455fe
      Martin Bauer authored
      - use data type class consistently instead of strings (in TypedSymbol, Field and jit module)
      - new datatype class is based on numpy types with additional specifier information (const and restrict)
      - translation between data type class and other modules (numpy, ctypes)
  31. Mar 01, 2017
    • pystencils: cpujit · dd17cd30
      Martin Bauer authored
      - Windows support
      - automatic caching and creation of shared library with all generated kernels
      - restrict keyword and function prefixes are preprocessor macros now -> easier to generate one code for Linux, CUDA, Windows