- 30 Sep, 2019 1 commit
-
-
Stephan Seitz authored
-
- 23 Sep, 2019 1 commit
-
-
Stephan Seitz authored
-
- 16 Aug, 2019 1 commit
-
-
Martin Bauer authored
- numpy constants directly get their numpy type
- integer functions check for integer types at construction
-
- 11 Jul, 2019 1 commit
-
-
Martin Bauer authored
-
- 26 Apr, 2019 1 commit
-
-
- 24 Apr, 2019 1 commit
-
-
Martin Bauer authored
- turned on the restrict keyword by default (makes a large difference on GPUs)
- smarter block indexing: the block size now changes depending on the domain size
  Example: previously (1, 1, 1) blocks were used when the requested block size was (64, 1, 1) and the domain size was (1, 512, 512); now the block size is changed automatically to (1, 64, 1) in this case
- added __launch_bounds__ to kernels to allow better optimizations from the CUDA compiler
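
The block-size adjustment in this commit message can be illustrated with a minimal sketch. It is not the code from the commit: `fit_block_to_domain` is a hypothetical helper, and it assumes that threads which do not fit into one dimension are simply carried over to the next one.

```python
def fit_block_to_domain(block_size, domain_size):
    """Shrink a requested block so it fits the domain (hypothetical helper).

    Threads that do not fit into one dimension are carried over to the
    next dimension, so the total thread count stays roughly the same.
    Assumes every domain extent is at least 1.
    """
    fitted = []
    carry = 1  # thread factor that did not fit into earlier dimensions
    for requested, extent in zip(block_size, domain_size):
        wanted = requested * carry
        actual = min(wanted, extent)   # never exceed the domain extent
        carry = wanted // actual       # leftover factor for the next dimension
        fitted.append(actual)
    return tuple(fitted)


# The case from the commit message:
print(fit_block_to_domain((64, 1, 1), (1, 512, 512)))  # (1, 64, 1)
```

When an extent does not divide the requested thread count evenly, the integer division drops a few threads, which seems acceptable for a sketch.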
-
- 21 Mar, 2019 1 commit
-
-
Martin Bauer authored
This restructuring makes it easier to split the modules into separate repositories later. It also allows pip install with a repository URL. The setup.py files have been updated to reference each other correctly. Module versions are not extracted from git state.
-
- 19 Oct, 2018 1 commit
-
-
Martin Bauer authored
-
- 07 Jun, 2018 1 commit
-
-
Martin Bauer authored
-
- 05 Jun, 2018 1 commit
-
-
Martin Bauer authored
- option to allocate extra memory at the end of each line instead of generating a tail loop when the loop range is not divisible by the SIMD width
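
The padding idea in this commit message can be sketched with NumPy. The helper name `allocate_padded` and its parameters are assumptions for illustration only, not the project's API: the innermost dimension is rounded up to a multiple of the SIMD width, so a vectorized loop can run over full vectors without a scalar tail loop.

```python
import numpy as np


def allocate_padded(shape, simd_width, dtype=np.float64):
    """Allocate an array whose lines are padded to a multiple of simd_width.

    Returns (view, storage): `view` has the requested shape, `storage` is the
    padded backing array. Hypothetical helper, for illustration only.
    """
    shape = tuple(shape)
    # Round the innermost extent up to the next multiple of simd_width.
    padded_last = -(-shape[-1] // simd_width) * simd_width
    storage = np.zeros(shape[:-1] + (padded_last,), dtype=dtype)
    # The view hides the padding but shares memory with the storage.
    view = storage[..., :shape[-1]]
    return view, storage


view, storage = allocate_padded((3, 10), simd_width=4)
print(view.shape, storage.shape)  # (3, 10) (3, 12)
```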
-
- 13 May, 2018 1 commit
-
-
Martin Bauer authored
- support for aligned loads/stores
- nontemporal stores
- aligned memory allocation for arrays and temporary buffers
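
Aligned allocation can be illustrated with a common NumPy technique: over-allocate a byte buffer and start the array at the first suitably aligned address. This is a sketch under stated assumptions, not the implementation from the commit. `allocate_aligned` is a hypothetical name, and the alignment is assumed to be a multiple of the item size.

```python
import numpy as np


def allocate_aligned(n, alignment=32, dtype=np.float64):
    """Return a 1D array of n items whose data pointer is `alignment`-byte aligned.

    Hypothetical helper: over-allocates `alignment` extra bytes and slices
    the buffer so that it starts at an aligned address.
    """
    nbytes = n * np.dtype(dtype).itemsize
    raw = np.empty(nbytes + alignment, dtype=np.uint8)
    # Number of bytes to skip to reach the next aligned address.
    offset = (-raw.ctypes.data) % alignment
    return raw[offset:offset + nbytes].view(dtype)


arr = allocate_aligned(100, alignment=64)
print(arr.ctypes.data % 64)  # 0
```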
-
- 11 May, 2018 1 commit
-
-
Martin Bauer authored
- vectorization for loops with ranges that are not a multiple of the vector width
- vectorization for variable-sized loops if the special transformation replace_inner_stride_with_one is run
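
For loop ranges that are not a multiple of the vector width, the usual approach is to split the loop into a vectorized main part and a scalar tail. The following Python sketch only mimics that structure with NumPy slices. The function name `scale_in_place` and its parameters are assumptions, and it does not reproduce the generated kernel code.

```python
import numpy as np


def scale_in_place(values, factor, vector_width=4):
    """Multiply `values` by `factor` using a main loop plus a tail loop.

    Returns the index where the tail loop starts. Illustration only.
    """
    n = len(values)
    main_end = n - n % vector_width  # largest multiple of vector_width <= n
    # Main loop: processes vector_width elements per iteration.
    for i in range(0, main_end, vector_width):
        values[i:i + vector_width] *= factor
    # Tail loop: processes the remaining elements one by one.
    for i in range(main_end, n):
        values[i] *= factor
    return main_end


data = np.arange(10.0)
print(scale_in_place(data, 2.0))  # 8
```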
-