- 23 Jan, 2020 3 commits
-
-
Stephan Seitz authored
-
Stephan Seitz authored
-
Stephan Seitz authored
-
- 26 Sep, 2019 1 commit
-
-
Stephan Seitz authored
-
- 24 Sep, 2019 1 commit
-
-
Stephan Seitz authored
-
- 15 Jul, 2019 1 commit
-
-
Stephan Seitz authored
-
- 11 Jul, 2019 1 commit
-
-
Martin Bauer authored
-
- 24 Apr, 2019 1 commit
-
-
Martin Bauer authored
  - turned on restrict keyword by default (makes a large difference on GPUs)
  - smarter block indexing: the block size now changes depending on the domain size. Example: previously there were (1, 1, 1) blocks when the requested block size was (64, 1, 1) and the domain size was (1, 512, 512); now the block size is changed automatically to (1, 64, 1) in this case
  - added __launch_bounds__ to kernels to allow better optimizations from the CUDA compiler
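The block-size adjustment from this commit message can be illustrated with a minimal sketch. The function name `adapt_block_size` and the redistribution strategy are assumptions made for illustration, not the pystencils implementation: each block dimension is clamped to the domain extent, and the threads freed by clamping are handed to dimensions that still have room.

```python
def adapt_block_size(requested_block, domain_size):
    """Hypothetical helper: shrink block dimensions that exceed the domain
    and hand the freed threads to dimensions that still have room.
    Assumes all sizes are positive integers."""
    # Clamp every block dimension to the domain extent.
    block = [min(b, d) for b, d in zip(requested_block, domain_size)]

    # Factor by which clamping reduced the thread count.
    spare = 1
    for requested, clamped in zip(requested_block, block):
        spare *= requested // clamped

    # Redistribute the spare factor, first dimension first.
    for i, extent in enumerate(domain_size):
        factor = min(spare, extent // block[i])
        block[i] *= factor
        spare //= factor
    return tuple(block)


# The case from the commit message:
print(adapt_block_size((64, 1, 1), (1, 512, 512)))  # (1, 64, 1)
```

A block that already fits the domain is returned unchanged, so the adjustment only takes effect for thin or degenerate domains.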
-
- 21 Mar, 2019 1 commit
-
-
Martin Bauer authored
This restructuring allows for easier separation of modules into separate repositories later. Also, now pip install with repo url can be used. The setup.py files have also been updated to correctly reference each other. Module versions are not extracted from git state
-
- 15 Mar, 2019 1 commit
-
-
Martin Bauer authored
-
- 12 Mar, 2019 1 commit
-
-
Martin Bauer authored
-
- 26 Feb, 2019 1 commit
-
-
Martin Bauer authored
  - counter-based Philox RNG: counter/key is filled with cell coordinate and optional external parameters like block position and time step
  - works on CPU and GPU
  - on CPU only for non-vectorized versions
  - introduced more flexible "CustomCodeNode" that can inject backend-specific hand-written code
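The idea of a counter-based generator can be sketched as follows: the random number is a pure function of a counter (cell coordinate, time step) and a key (block position, seed), so no generator state has to be stored or advanced. This sketch deliberately does not implement Philox; it uses BLAKE2b from Python's `hashlib` as a stand-in mixing function, and the name `counter_based_uniform` and its parameters are assumptions for illustration.

```python
import hashlib
import struct


def counter_based_uniform(cell, time_step=0, block_position=(0, 0, 0), seed=0):
    """Stateless random number in [0, 1) for a 3D cell.
    BLAKE2b stands in for Philox here; only the counter/key idea is shown."""
    # Counter: what changes from cell to cell and from step to step.
    counter = struct.pack("<3q", *cell) + struct.pack("<q", time_step)
    # Key: external parameters that select an independent stream.
    key = struct.pack("<3q", *block_position) + struct.pack("<q", seed)
    digest = hashlib.blake2b(counter, key=key, digest_size=8).digest()
    value = int.from_bytes(digest, "little")
    # Keep the top 53 bits so the result is an exact double below 1.0.
    return (value >> 11) / 2**53


r1 = counter_based_uniform((3, 4, 5), time_step=10)
r2 = counter_based_uniform((3, 4, 5), time_step=10)  # same inputs, same number
r3 = counter_based_uniform((3, 4, 5), time_step=11)  # new time step, new number
```

Because the result depends only on its inputs, every cell can compute its number independently, which is what makes this style of generator suitable for both CPU loops and GPU threads.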
-
- 14 Nov, 2018 3 commits
-
-
Martin Bauer authored
  - was not used consistently before
  - symbol names are expected to be valid C identifiers
  - for complicated field names, the latex_name of field should be used
-
Martin Bauer authored
-
Martin Bauer authored
  - small (length < 5) arrays with shape and stride information had to be memcpy'd to the GPU before every kernel call
  - instead of passing the information as arrays, the single elements are passed
  - leads to more function arguments, but simplifies GPU kernel calls -> changes in all backends required
-
- 07 Jun, 2018 1 commit
-
-
Martin Bauer authored
  - better latex display for indirect accesses
  - new field type 'custom': only custom fields can be accessed indirectly; no static bounds check is possible for custom fields
-
- 30 Apr, 2018 1 commit
-
-
Martin Bauer authored
-
- 13 Apr, 2018 2 commits
-
-
Martin Bauer authored
  - removed warnings
  - added flake8 as CI target
-
Martin Bauer authored
-
- 10 Apr, 2018 3 commits
-
-
Martin Bauer authored
-
Martin Bauer authored
  - tests run again
  - notebooks not yet
-
Martin Bauer authored
-
- 06 Feb, 2018 2 commits
-
-
Martin Bauer authored
  - previously all objects were cached by id()
  - for waLBerla simulations, a new np.array view is created from the waLBerla field in each time step. Each of these views has a different id -> caching did not work for waLBerla setups
  - changed hash for numpy arrays: instead of id, a tuple of (dataPtr, strides, shapes) is used as hash input
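The difference between the two cache keys can be sketched with plain NumPy. The helper name `array_cache_key` is hypothetical, and reading the data pointer from `__array_interface__` is one way to obtain it, assumed here for illustration: two views over the same memory are distinct Python objects, but they share data pointer, strides and shape.

```python
import numpy as np


def array_cache_key(arr):
    """Hypothetical helper: identify an array by the memory it views,
    not by the identity of the Python wrapper object."""
    data_ptr = arr.__array_interface__["data"][0]
    return (data_ptr, arr.strides, arr.shape)


field = np.zeros((4, 4))
view_a = field[:]  # two distinct view objects ...
view_b = field[:]  # ... over the same memory

same_object = view_a is view_b  # False: an id()-based cache would miss
same_key = array_cache_key(view_a) == array_cache_key(view_b)  # True
```

A view over a different part of the field, such as `field[1:]`, starts at a different address and has a different shape, so it gets a different key.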
-
Martin Bauer authored
- scaling interface width eta instead of surface tensions tau to correct interface profile & surface tensions
-
- 31 Jan, 2018 1 commit
-
-
Martin Bauer authored
-
- 19 Jan, 2018 1 commit
-
-
Concept: Generate code for the (un)packing of fields to/from linear (1D) arrays, i.e. (de)serialization of the field values for buffered communication. A linear index is generated for the buffer by inferring the strides and variables of the loops over fields in the AST. On the CPU, this information is obtained through the makeLoopOverDomain function in pystencils/transformations/transformations.py. On CUDA, the strides of the fields (excluding buffers) are combined with the indexing variables to infer the indexing of the buffer.
What is supported:
  - code generation for both CPU and GPU
  - (un)packing of fields with all the memory layouts supported by pystencils
  - (un)packing slices of fields (from)into the buffer
  - (un)packing subsets of cell values from the fields (from)into the buffer
Limitations:
  - assumes that only one buffer and one field are operated on within each kernel; however, multiple equations involving the buffer and the field are supported
  - (un)packing multiple cell values (from)into the buffer is supported; however, it is limited to fields with indexDimensions=1. The same applies to (un)packing a subset of the cell values of each cell
Changes in this commit:
  - add the FieldType enumeration to pystencils/field.py to mark fields of various types. This replaces and generalizes the isIndexedField boolean flag of the Field class. For now, the supported types are: generic, indexed and buffer fields
  - add the fieldType property to the Field class, which indicates the type of the field. The member functions of the Field class were also modified to add this property
  - add resolveBufferAccesses function, which replaces the fields marked as buffers with the actual field access in the AST traversal
Miscellaneous changes:
  - add blockDim and gridDim variables as CUDA indexing variables
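The linear buffer index described in the concept can be sketched in plain Python. This is a simplified illustration, not the generated code: the helper names `buffer_index`, `pack` and `unpack` are hypothetical, the field is a nested list indexed as `field[x][y][v]`, and the index is built from the loop counters and loop extents of a 2D slice with several values per cell.

```python
def buffer_index(counters, extents, values_per_cell=1, value=0):
    """Hypothetical helper: linear buffer index for the given loop counters."""
    index = 0
    for counter, extent in zip(counters, extents):
        index = index * extent + counter
    return index * values_per_cell + value


def pack(field, x_range, y_range, values_per_cell):
    """Serialize a slice of the field into a linear buffer."""
    extents = (len(x_range), len(y_range))
    buffer = [None] * (extents[0] * extents[1] * values_per_cell)
    for i, x in enumerate(x_range):
        for j, y in enumerate(y_range):
            for v in range(values_per_cell):
                idx = buffer_index((i, j), extents, values_per_cell, v)
                buffer[idx] = field[x][y][v]
    return buffer


def unpack(buffer, field, x_range, y_range, values_per_cell):
    """Write the buffer back into the same slice of another field."""
    extents = (len(x_range), len(y_range))
    for i, x in enumerate(x_range):
        for j, y in enumerate(y_range):
            for v in range(values_per_cell):
                idx = buffer_index((i, j), extents, values_per_cell, v)
                field[x][y][v] = buffer[idx]


# 4x4 field with two values per cell; pack rows 1..2 and unpack them elsewhere.
source = [[[100 * x + 10 * y + v for v in range(2)] for y in range(4)] for x in range(4)]
target = [[[0, 0] for _ in range(4)] for _ in range(4)]
buffer = pack(source, range(1, 3), range(4), values_per_cell=2)
unpack(buffer, target, range(1, 3), range(4), values_per_cell=2)
```

Pack and unpack use the same index function, so the buffer layout never has to be known outside of it; only the slice and the number of values per cell must agree on both sides.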
-
- 11 Dec, 2017 1 commit
-
-
Martin Bauer authored
-
- 02 Dec, 2017 1 commit
-
-
Martin Bauer authored
-
- 10 Oct, 2017 1 commit
-
-
Martin Bauer authored
- renaming because of clashes with types.py from other packages
-
- 21 Jul, 2017 1 commit
-
-
Martin Bauer authored
-
- 11 Apr, 2017 2 commits
-
-
Martin Bauer authored
  - cache relied on uniqueness of python id()
  - id may be reused if object is freed -> object must be held alive -> kernel keeps all its arguments it was ever called with alive (problematic in terms of memory consumption)
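The memory problem described here can be reproduced with a small sketch. The class `IdKeyedCache` is hypothetical and only mimics the pattern: to keep an `id()` key valid, the cache has to store the object itself, which prevents it from ever being freed.

```python
import weakref


class Argument:
    """Stand-in for a kernel argument such as a large array."""


class IdKeyedCache:
    """Hypothetical cache keyed by id(); it must keep every object alive,
    otherwise a freed object's id could be reused by a different object."""

    def __init__(self):
        self._entries = {}

    def remember(self, obj, value):
        # Storing obj alongside the value pins it in memory.
        self._entries[id(obj)] = (obj, value)

    def lookup(self, obj):
        entry = self._entries.get(id(obj))
        return entry[1] if entry is not None else None


cache = IdKeyedCache()
arg = Argument()
probe = weakref.ref(arg)  # lets us observe whether the object still exists
cache.remember(arg, "compiled kernel")

del arg                             # the caller drops its reference ...
still_alive = probe() is not None   # ... but the cache still holds the object
```

Every argument ever passed stays referenced by the cache in the same way, which is the memory consumption problem the commit message points out.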
-
Martin Bauer authored
-> smaller block
-
- 30 Mar, 2017 1 commit
-
-
Martin Bauer authored
-
- 24 Mar, 2017 4 commits
-
-
Martin Bauer authored
-
Martin Bauer authored
  - abstraction layer for selecting CUDA block and grid sizes
    - line based (was implemented before)
    - block based (new, more flexible)
  - new conditional (if/else) ast node, which is necessary for indexing schemes (guarding if)
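Why a block-based scheme needs a guarding if can be sketched by emulating the thread coordinates in Python. The helper names `block_based_grid` and `is_inside` are hypothetical and the sizes are arbitrary: the grid is rounded up to cover the domain, so some threads fall outside it and must be masked out.

```python
def block_based_grid(domain_size, block_size):
    """Hypothetical helper: number of blocks per dimension (ceiling division)."""
    return tuple((d + b - 1) // b for d, b in zip(domain_size, block_size))


def is_inside(cell, domain_size):
    """Guarding condition: threads past the domain boundary must do nothing."""
    return all(c < d for c, d in zip(cell, domain_size))


domain, block = (100, 30), (32, 8)
grid = block_based_grid(domain, block)  # (4, 4)

# Emulate every thread: global coordinate = blockIdx * blockDim + threadIdx.
launched = grid[0] * grid[1] * block[0] * block[1]
active = sum(
    is_inside((bx * block[0] + tx, by * block[1] + ty), domain)
    for bx in range(grid[0])
    for by in range(grid[1])
    for tx in range(block[0])
    for ty in range(block[1])
)
```

Here 4096 threads are launched but only 3000 of them map to a cell of the 100 x 30 domain; the remaining threads are the ones the guard has to skip.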
-
Martin Bauer authored
  - bugfix for CUDA kernels with variable field sizes
  - extended tests for pystencils gpu kernels
-
Martin Bauer authored
-
- 01 Mar, 2017 1 commit
-
-
Martin Bauer authored
  - windows support
  - automatic caching and creation of shared library with all generated kernels
  - restrict keyword and function prefixes are preprocessor macros now -> easier to generate one code for linux, cuda, windows
-
- 21 Feb, 2017 1 commit
-
-
Martin Bauer authored
-
- 08 Dec, 2016 1 commit
-
-
Martin Bauer authored
-