1. 18 Jun, 2019 1 commit
    • CUDA indexing: clip to maximum cuda block size · 1754ef27
      Martin Bauer authored
      - previous method did not work with kernels generated for walberla where
        block size changes are made at runtime
      - device query does not always work, since the compile system may have
        no GPU or not the same GPU
      -> max block size is passed as parameter and only optionally determined
         by a device query
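
The clipping described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `clip_block_size` and its parameters are hypothetical, and the default limits (1024, 1024, 64) with at most 1024 threads per block are assumed values typical of current CUDA devices. The maximum is passed in as a parameter, so no GPU is needed where the code is generated.

```python
from math import prod


def clip_block_size(block_size, max_block_size=(1024, 1024, 64), max_threads=1024):
    """Clip a requested block size to assumed device limits (hypothetical helper)."""
    # Clip each dimension to the per-dimension maximum.
    clipped = [min(b, m) for b, m in zip(block_size, max_block_size)]
    # Halve the largest dimension until the total thread count fits.
    while prod(clipped) > max_threads:
        i = clipped.index(max(clipped))
        clipped[i] //= 2
    return tuple(clipped)


print(clip_block_size((2048, 1, 1)))  # (1024, 1, 1)
print(clip_block_size((64, 64, 1)))   # (32, 32, 1)
```

A device query could supply `max_block_size` when a GPU is present; otherwise the caller passes the limits of the target device explicitly.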
  2. 14 Jun, 2019 1 commit
  3. 24 Apr, 2019 1 commit
    • Improvements for GPU code generation · f504b40f
      Martin Bauer authored
      - turned on restrict keyword by default (makes large difference on GPUs)
      - smarter block indexing: changing block size depending on domain size
        Example: previously there were (1,1,1) blocks when the requested
        block size was (64, 1, 1) and the domain size (1, 512, 512); now the
        block size is changed automatically to (1, 64, 1) in this case
      - added __launch_bounds__ to kernels to allow better optimizations from
        the CUDA compiler
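
One way to obtain the behaviour in the example above is to move threads that do not fit into a short dimension over to the next dimension. The sketch below is an assumption about how such an adaptation could work, not necessarily the logic used in the commit; `adapt_block_size` is a hypothetical name.

```python
from math import prod


def adapt_block_size(block_size, domain_size):
    """Shrink block dimensions that exceed the domain and pass the
    unused threads on to later dimensions (hypothetical helper)."""
    adapted = []
    for i, (requested, width) in enumerate(zip(block_size, domain_size)):
        # Threads requested so far divided by threads actually assigned so far.
        factor = prod(block_size[:i]) // prod(adapted)
        adapted.append(min(requested * factor, width))
    return tuple(adapted)


print(adapt_block_size((64, 1, 1), (1, 512, 512)))  # (1, 64, 1)
```

When the domain is large enough in every dimension, the requested block size is returned unchanged.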
  4. 21 Mar, 2019 1 commit
    • Separated modules into subfolders with own setup.py · 1e02cdc7
      Martin Bauer authored
      This restructuring allows for easier separation of modules into
      separate repositories later. Also, now pip install with repo url can be
      used.
      
      The setup.py files have also been updated to correctly reference each
      other. Module versions are now extracted from git state
  5. 24 Jan, 2019 1 commit
  6. 16 Nov, 2018 1 commit
  7. 25 Oct, 2018 1 commit
  8. 19 Oct, 2018 1 commit
  9. 30 Apr, 2018 1 commit
  10. 28 Apr, 2018 1 commit
  11. 27 Apr, 2018 1 commit
  12. 13 Apr, 2018 2 commits
  13. 10 Apr, 2018 3 commits
  14. 19 Jan, 2018 1 commit
    • Code generation for field serialization into buffers · 979ee93b
      João Victor Tozatti Risso authored and Martin Bauer committed
      Concept: Generate code involving the (un)packing of fields (from)to linear
      (1D) arrays, i.e. (de)serialization of the field values for buffered
      communication.
      
      A linear index is generated for the buffer, by inferring the strides and
      variables of the loops over fields in the AST. On the CPU, this information is
      obtained through the makeLoopOverDomain function, in
      pystencils/transformations/transformations.py. On CUDA, the strides of
      the fields (excluding buffers) are combined with the indexing variables to infer
      the indexing of the buffer.
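
The idea of a linear buffer index can be illustrated in plain Python. This is a simplified sketch and does not use pystencils or its AST; the helper names and the dictionary standing in for a field are assumptions made for illustration. Each loop counter is multiplied by a stride derived from the extents of the inner loops, and the sum gives the position in the 1D buffer.

```python
from itertools import product
from math import prod


def buffer_strides(extents):
    """Stride of each loop counter in the linear buffer (innermost loop has stride 1)."""
    strides = []
    stride = 1
    for extent in reversed(extents):
        strides.insert(0, stride)
        stride *= extent
    return tuple(strides)


def linear_index(counters, strides):
    """Combine loop counters and strides into a single buffer index."""
    return sum(c * s for c, s in zip(counters, strides))


def pack(field, extents):
    """Copy the values of `field` (a dict keyed by index tuples) into a 1D list."""
    strides = buffer_strides(extents)
    buffer = [None] * prod(extents)
    for counters in product(*(range(e) for e in extents)):
        buffer[linear_index(counters, strides)] = field[counters]
    return buffer


field = {(i, j): 10 * i + j for i in range(2) for j in range(3)}
print(pack(field, (2, 3)))  # [0, 1, 2, 10, 11, 12]
```

Unpacking uses the same index computation with the assignment reversed.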
      
      What is supported:
          - code generation for both CPU and GPU
          - (un)packing of fields with all the memory layouts supported by
          pystencils
          - (un)packing slices of fields (from)into the buffer
          - (un)packing subsets of cell values from the fields (from)into the buffer
      
      Limitations:
      
      - assumes that only one buffer and one field are operated on within
      each kernel; however, multiple equations involving the buffer and the
      field are supported.
      
      - (un)packing multiple cell values (from)into the buffer is supported,
      but only for fields with indexDimensions=1. The same
      applies to (un)packing a subset of the values of each cell.
      
      Changes in this commit:
      
      - add the FieldType enumeration to pystencils/field.py, to mark fields
      of various types. This replaces and generalizes the
      isIndexedField boolean flag of the Field class. For now, the types
      supported are: generic, indexed and buffer fields.
      
      - add the fieldType property to the Field class, which indicates the
      type of the field. Modifications were also performed to the member
      functions of the Field class to add this property.
      
      - add resolveBufferAccesses function, which replaces the fields marked
      as buffers with the actual field access in the AST traversal.
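
How an enumeration can generalize a boolean flag is sketched below. This is an illustrative stand-in, not the code in pystencils/field.py: the member names, their values, and the reduced `Field` class are assumptions; only the names FieldType, fieldType and isIndexedField come from the description above.

```python
from enum import Enum


class FieldType(Enum):
    # Member names and values are assumed for this sketch.
    GENERIC = 0
    INDEXED = 1
    BUFFER = 2


class Field:
    """Reduced stand-in for the Field class, keeping only the type information."""

    def __init__(self, name, fieldType=FieldType.GENERIC):
        self.name = name
        self._fieldType = fieldType

    @property
    def fieldType(self):
        return self._fieldType

    @property
    def isIndexedField(self):
        # The old boolean flag can be derived from the enumeration.
        return self._fieldType is FieldType.INDEXED


buffer_field = Field("buffer", FieldType.BUFFER)
print(buffer_field.fieldType, buffer_field.isIndexedField)
```

A pass such as resolveBufferAccesses could then select the fields to rewrite by checking `fieldType` against the buffer member.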
      
      Miscellaneous changes:
      
      - add blockDim and gridDim variables as CUDA indexing variables.
  15. 11 Jan, 2018 1 commit
    • pystencils cleanup · c598dc78
      Martin Bauer authored
      - single function to create kernel for specified target
      - data type creation from string - reuse numpy functionality
      - bugfixes in dot display
  16. 11 Dec, 2017 1 commit
  17. 10 Oct, 2017 1 commit
  18. 09 Oct, 2017 1 commit
    • Vectorization & Type system overhaul · ea847bc5
      Martin Bauer authored
      - first vectorization tests are running
      - type system: use memoized getTypeOfExpression
      - casts are done using sp.Function('cast')
      - C backend adapted for vectorization support
      - AST nodes can require optional headers
  19. 21 Jul, 2017 1 commit
  20. 11 Apr, 2017 1 commit
  21. 30 Mar, 2017 1 commit
  22. 29 Mar, 2017 1 commit
  23. 24 Mar, 2017 3 commits