Commits · 5600b6b6d67795916af2b30e8703d9e58f05568f · pycodegen / pystencils

Jul 05, 2024
- [BUGFIX] GPU slicing · 54b01e22
  Markus Holzer authored 8 months ago and Frederik Hennig committed 8 months ago
  
Jan 16, 2024
- Refactor source tree layout · bcc8d818
  Frederik Hennig authored 1 year ago and Markus Holzer committed 1 year ago
  
Aug 28, 2023
- Refactor gpu indexing · 33dc8ae9
  Markus Holzer authored 1 year ago and Michael Kuron committed 1 year ago
  
Jul 13, 2023
- Remove pystencils.GPU_DEVICE · 376ee8d3
  Michael Kuron authored 1 year ago and Markus Holzer committed 1 year ago
  
Jul 05, 2023
- Fix indexing for AMD GPUs · 145c5264
  Markus Holzer authored 1 year ago and Michael Kuron committed 1 year ago
  
Jun 22, 2023
- Replace PyCuda with CuPy · 32926be2
  Markus Holzer authored 1 year ago and Michael Kuron committed 1 year ago
  
Feb 10, 2022
- Fix CUDA support · ffb9b53c
  Markus Holzer authored 3 years ago and Markus Holzer committed 3 years ago
  
Jan 11, 2021
- Extended test suite · 1c2653a3
  Markus Holzer authored 4 years ago and Michael Kuron committed 4 years ago
  
Jan 10, 2020
- Add pycuda.autoinit to GPU test · 62803d18
  Stephan Seitz authored 5 years ago
  
Jul 11, 2019
- Import sorting using isort · 6373c03a
  Martin Bauer authored 5 years ago
  
Jun 18, 2019

CUDA indexing: clip to maximum cuda block size · 1754ef27

Martin Bauer authored 5 years ago

- previous method did not work with kernels generated for walberla where
  block size changes are made at runtime
- device query does not always work, since the compile system may have
  no GPU or not the same GPU
-> max block size is passed as parameter and only optionally determined
   by a device query
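
The clipping described in this message can be sketched roughly as follows. This is not the pystencils implementation: the function name `clip_block_size`, its parameters, and the halving strategy are assumptions made for illustration. The default limits are the usual CUDA per-block limits (1024 x 1024 x 64, at most 1024 threads in total), and the optional `device_query` callable stands in for whatever query the target machine offers.

```python
from math import prod


def clip_block_size(block_size, max_block_size=(1024, 1024, 64),
                    max_threads=1024, device_query=None):
    """Hypothetical helper: clip a requested block size to device limits.

    The limits are passed in as parameters so that code generation also
    works on a machine without a GPU. If `device_query` is given, it is
    assumed to return (max_block_size, max_threads) for the actual device.
    """
    if device_query is not None:
        max_block_size, max_threads = device_query()

    # Clip every dimension to its per-dimension maximum.
    block = [min(b, m) for b, m in zip(block_size, max_block_size)]

    # Halve the largest dimension until the total thread count fits.
    while prod(block) > max_threads:
        largest = block.index(max(block))
        block[largest] = max(1, block[largest] // 2)

    return tuple(block)


print(clip_block_size((2048, 1, 1)))
print(clip_block_size((64, 64, 1)))
```

Passing the limits explicitly keeps the generated kernel independent of the build machine; a device query only overrides them when one is available.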

release/0.2.3


Apr 24, 2019

Improvements for GPU code generation · f504b40f

Martin Bauer authored 5 years ago

- turned on restrict keyword by default (makes large difference on GPUs)
- smarter block indexing: changing block size depending on domain size
  Example: previously there were (1,1,1) blocks when requested
  block size was (64, 1, 1) and domain size (1, 512, 512), now the
  block size is changed automatically to (1, 64, 1) in this case
- added __launch_bounds__ to kernels to allow better optimizations from
  the CUDA compiler
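
The block-size adaptation from the example above could look roughly like this. It is only a sketch under assumptions: the name `fit_block_to_domain` and the rule for moving threads to another dimension are hypothetical and not taken from the pystencils source.

```python
def fit_block_to_domain(block_size, domain_size):
    """Hypothetical helper: adapt a requested block size to the domain.

    A dimension whose block extent exceeds the domain extent is shrunk to
    the domain extent, and the freed factor is moved to the first other
    dimension that can hold it.
    """
    block = list(block_size)
    for i, extent in enumerate(domain_size):
        if block[i] <= extent:
            continue
        spare = block[i] // extent  # factor that no longer fits in dimension i
        block[i] = extent
        for j in range(len(block)):
            if j != i and block[j] * spare <= domain_size[j]:
                block[j] *= spare
                break
    return tuple(block)


# The case from the commit message: requested (64, 1, 1) on a (1, 512, 512) domain.
print(fit_block_to_domain((64, 1, 1), (1, 512, 512)))  # (1, 64, 1)
```

If no other dimension can absorb the freed factor, this sketch simply drops it, which gives a smaller block rather than an invalid one.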


Mar 22, 2019
- Additional tests for packinfo generation & fast approximation for div and sqrt · 0df63c2d
  Martin Bauer authored 6 years ago
  
Mar 21, 2019

Separated modules into subfolders with own setup.py · 1e02cdc7

Martin Bauer authored 6 years ago

This restructuring allows for easier separation of modules into
separate repositories later. Also, now pip install with repo url can be
used.

The setup.py files have also been updated to correctly reference each
other. Module versions are now extracted from git state.

