requested to merge seitz/pystencils:astnodes-for-interpolation-rebased into master
This PR probably still needs some clean-up. However, it would be good to receive some feedback already.
- Using CUDA textures
- Using HW accelerated interpolation for float32 textures
- Implementing linear interpolation either in software (CPU, GPU) or via texture accesses without HW interpolation but with HW boundary handling
- Adding transformed coordinate systems to fields
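For reviewers, the software path for linear interpolation can be sketched roughly as follows. This is a minimal 1D illustration in plain Python, not the code generated by this PR: `clamp_index` and `linear_interpolate_1d` are hypothetical names, samples are assumed to sit at integer coordinates, and out-of-range neighbours are clamped.

```python
import math


def clamp_index(i, n):
    """Clamp an integer index into the valid range [0, n - 1]."""
    return max(0, min(i, n - 1))


def linear_interpolate_1d(values, x):
    """Linearly interpolate `values` at the fractional position `x`.

    Assumption: sample i sits at coordinate i; neighbours outside
    the array are clamped to the nearest valid sample.
    """
    i0 = math.floor(x)   # index of the left neighbour
    alpha = x - i0       # fractional distance to the left neighbour
    n = len(values)
    left = values[clamp_index(i0, n)]
    right = values[clamp_index(i0 + 1, n)]
    return (1.0 - alpha) * left + alpha * right


samples = [0.0, 10.0, 20.0]
print(linear_interpolate_1d(samples, 0.25))  # between sample 0 and sample 1
print(linear_interpolate_1d(samples, 5.0))   # far outside, clamped to the last sample
```

The same weights apply per axis in 2D and 3D; only the number of fetched neighbours grows.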
What does not work:
- HW boundary handling for CUDA textures for boundary handling modes such as wrap (apparently they have been removed from CUDA's API but are still present in pycuda). Now there are only:
  - cudaBoundaryModeZero = 0: Zero boundary mode
  - cudaBoundaryModeClamp = 1: Clamp boundary mode
  - cudaBoundaryModeTrap = 2: Trap boundary mode
It is unclear what trap boundary mode does. I could not find any documentation, so we can only experiment.
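As long as HW handling of these modes does not work, they could be emulated in software by remapping the index before the fetch. A minimal 1D sketch, assuming plain Python lists; `fetch` and the mode strings are hypothetical names and not part of the pystencils, pycuda or CUDA API. Trap mode is left out on purpose because its semantics are unclear here.

```python
def fetch(values, i, mode):
    """Read values[i], resolving out-of-range indices according to `mode`."""
    n = len(values)
    if mode == "wrap":
        # Python's % returns a non-negative result for positive n,
        # so negative indices wrap around correctly.
        return values[i % n]
    if mode == "clamp":
        return values[max(0, min(i, n - 1))]
    if mode == "zero":
        return values[i] if 0 <= i < n else 0.0
    raise ValueError("unknown boundary mode: " + mode)


data = [1.0, 2.0, 3.0]
for mode in ("wrap", "clamp", "zero"):
    # Sample one position left and one position right of the array.
    print(mode, fetch(data, -1, mode), fetch(data, 3, mode))
```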
What kind of works:
- B-spline interpolation on GPU using this repo as a submodule (http://www.dannyruijters.nl/cubicinterpolation/). I was too lazy to write tests, and I don't know how to prove correctness.
- Textures for dtypes with itemsize > 4. PyCUDA has a helper header (https://github.com/inducer/pycuda/blob/master/pycuda/cuda/pycuda-helpers.hpp) that loads doubles via two int fetches. However, this hack only seems to work if we add a 0.5 offset and make all functions in this header accept float.
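To illustrate the idea behind loading a double via two int fetches: a float64 is stored as two 32-bit words and reassembled after the fetch. The sketch below shows only that bit-level round trip on the host in plain Python. `split_double` and `join_double` are hypothetical helpers, not functions from the PyCUDA header, and little-endian word order is an assumption.

```python
import struct


def split_double(value):
    """Split a float64 into its (hi, lo) signed 32-bit words."""
    # Little-endian layout: the low-order word comes first in memory.
    lo, hi = struct.unpack("<ii", struct.pack("<d", value))
    return hi, lo


def join_double(hi, lo):
    """Reassemble a float64 from its (hi, lo) signed 32-bit words."""
    return struct.unpack("<d", struct.pack("<ii", lo, hi))[0]


hi, lo = split_double(3.141592653589793)
print(hi, lo)
print(join_double(hi, lo))  # same bit pattern as the input
```

The round trip is exact because only the bit pattern is moved, no arithmetic is done on the halves.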