pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
2019-10-21T14:06:35+02:00

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/77
Run opencl without pycuda
2019-10-21T14:06:35+02:00
Stephan Seitz
Fix #15
This includes !76.
If anyone wants to use textures on OpenCL, we need to decouple `TextureInterpolatedField` from CUDA.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/78
correctly print RNG nodes
2019-10-21T14:05:18+02:00
Michael Kuron mkuron@icp.uni-stuttgart.de
Regular assignments also use `\\leftarrow`. `<-` looks odd in Jupyter because it renders like `< -PhiloxRNG`, where it looks like less than minus.
Martin Bauer

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/47
WIP: Complex number support
2019-10-11T17:57:49+02:00
Stephan Seitz
Depends on !43
Pystencils should eventually support complex numbers, even if complex fields can be considered harmful for CPU vectorization. The concept is nice since SymPy and Python support complex numbers and there should be no performance disadvantage for normal CPU and GPU code. Many applications in physics and signal processing rely on complex numbers.
Complex output fields can be passed directly to libraries like `cufft`.
Problem: In C++, one cannot mix calculations on `std::complex<float>` with `std::complex<double>`. So the user has to specify `data_type='float32'` when single-precision complex floats are desired.
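The precision distinction is also visible from the Python side. The following is a minimal NumPy-only sketch (it does not use pystencils, and the array names are made up for illustration): NumPy silently promotes a mixed-precision complex product, which is exactly the mixing that, per the problem statement above, the generated C++ code cannot do.

```python
import numpy as np

# Hypothetical single- and double-precision complex fields.
single = np.ones(4, dtype=np.complex64)    # counterpart of std::complex<float>
double = np.ones(4, dtype=np.complex128)   # counterpart of std::complex<double>

# NumPy promotes the mixed product to double precision without complaint.
mixed = single * double

# Staying within one precision keeps the result in that precision.
pure_single = single * single
```

This is why a kernel has to be told up front which of the two precisions it should use.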
TODO:
* GPU support with the header pycuda provides
* only use `complex_helper.h` when needed
* remove commits from !34 (probably the code will be changed)
* rebase -i

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/71
Fix printing of sp.Infinity/sp.NegativeInfinity
2019-10-10T21:36:11+02:00
Stephan Seitz
For SymPy, `oo` is a number, so pystencils prints a double `INFINITY` as `INFINITY.0`.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/70
Fix error in README.md: pystencils[pyopencl] -> pystencils[opencl]
2019-10-09T11:52:53+02:00
Stephan Seitz
Align `README.md` with `setup.py`

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/67
Add ConditionalFieldAccess (Field.Access after out-of-bounds check)
2019-10-01T15:36:36+02:00
Stephan Seitz
Adds a wrapper around a `Field.Access` so that the access is only performed if a certain condition is met.
If I use this, I can safely perform calculations and adjoint calculations with `ghost_layers=0` and obtain the correct gradients without separate boundary handling.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/69
Small fixes
2019-10-01T15:12:52+02:00
Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/68
Declare LoopCounterSymbols nonnegative
2019-10-01T15:12:29+02:00
Stephan Seitz
This removed some checks like `ctr1 <= 0` from my kernels

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/66
Set assumptions for TypedSymbol/cast_func/IntegerFunctionTwoArgsMixIn the SymPy way
2019-09-30T14:11:05+02:00
Stephan Seitz
After a nearly week-long discussion on assumptions in my SymPy PR, I got some idea of how assumptions work in SymPy.
It's interesting that you can use `Function.__new__(cls, integer=True)` for `UndefinedFunction`s like `Function('f', integer=True)` but not for subclasses of `Function`.
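A small SymPy-only sketch of the behaviour described here: keyword assumptions are accepted when creating an undefined function. For a `Function` subclass, one common SymPy idiom is a class-level `is_integer` attribute. Whether this MR uses that idiom is an assumption, and `int_div_sketch` is a hypothetical name, not a pystencils class.

```python
import sympy as sp

x = sp.Symbol('x')

# Undefined functions: assumptions can be passed as keyword arguments.
f = sp.Function('f', integer=True)
g = sp.Function('g')  # no assumptions given


# Function subclass: declare the assumption on the class instead
# (hypothetical class, for illustration only).
class int_div_sketch(sp.Function):
    is_integer = True


f_is_int = f(x).is_integer                          # known to be an integer
product_is_int = (2 * f(x)).is_integer              # integer times integer
g_is_int = g(x).is_integer                          # nothing is known
subclass_is_int = int_div_sketch(x, 2).is_integer   # from the class attribute
```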
Now things like `(2*f.shape[0]).is_integer` are working.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/63
Bugfix: this bracket should not be here (collate_types returns single type)
2019-09-30T14:10:58+02:00
Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/65
Bugfix: Align calculation of number of ghost layers on GPU with CPU version
2019-09-30T14:10:47+02:00
Stephan Seitz
For the calculation of the number of ghost layers, only relative accesses should be considered, as in the CPU version.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/62
Bugfix fields accessed for interpolator access
2019-09-30T14:10:31+02:00
Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/64
Bugfix avoid east and west const
2019-09-30T14:10:05+02:00
Stephan Seitz
Here's the printing logic for SympyAssignment:
```python
if node.is_declaration:
    if node.is_const:  # <<< and 'const' not in self._print(node.lhs.dtype)
        prefix = 'const '
    else:
        prefix = ''
    data_type = prefix + self._print(node.lhs.dtype) + " "
    return "%s%s = %s;" % (data_type, self.sympy_printer.doprint(node.lhs),
                           self.sympy_printer.doprint(node.rhs))
else:
    lhs_type = get_type_of_expression(node.lhs)
    if type(lhs_type) is VectorType and isinstance(node.lhs, cast_func):
        ...  # excerpt ends here
```
It will always prefix const on a declaration. This will not work if dtype is also const since:
```python
def __str__(self):
    result = BasicType.numpy_name_to_c(str(self._dtype))
    if self.const:
        result += " const"
    return result
```
So we get something like `const int64_t const`.
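The interaction can be reproduced in isolation. The following is a minimal sketch with hypothetical stand-ins (`TypeSketch` and `print_declaration` are not pystencils names); it only mimics the two quoted snippets, plus the extra check marked with `<<<`.

```python
class TypeSketch:
    """Hypothetical stand-in for a dtype that carries a const flag."""

    def __init__(self, c_name, const=False):
        self.c_name = c_name
        self.const = const

    def __str__(self):
        # Mirrors the postfix behaviour of the quoted __str__.
        result = self.c_name
        if self.const:
            result += " const"
        return result


def print_declaration(dtype, is_const, guard=False):
    """Build the type part of a declaration; `guard` enables the extra check."""
    printed = str(dtype)
    if is_const and not (guard and 'const' in printed):
        prefix = 'const '
    else:
        prefix = ''
    return prefix + printed


const_type = TypeSketch('int64_t', const=True)
doubled = print_declaration(const_type, is_const=True)              # both prefix and postfix
guarded = print_declaration(const_type, is_const=True, guard=True)  # postfix only
plain = print_declaration(TypeSketch('int64_t'), is_const=True)     # prefix only
```

The guard is only one way to avoid the duplicate qualifier; it is shown here because it is the variant annotated in the excerpt.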
I deleted the postfix const to have everything nicely aligned.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/61
Kernel wrapper
2019-09-26T17:14:29+02:00
Stephan Seitz
`KernelWrapper` is cool. Let's also use it for the `gpucuda` backend.
Also:
- make `show_code(kernel_wrapper)` possible
- fix `DeprecationWarning` for import of `Hashable`

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/60
Eliminate usages of old name 'equation collection' for `AssignmentCollection`
2019-09-26T17:12:44+02:00
Stephan Seitz
We should avoid the old name equation collection.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/59
Document backends.json
2019-09-26T12:49:19+02:00
Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/57
Add AssignmentCollection.{free_fields,bound_fields}
2019-09-25T15:41:44+02:00
Stephan Seitz
Wasn't this merged already?

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/56
Interpolation 24.0.9
2019-09-25T15:41:24+02:00
Stephan Seitz
This is another rebased PR for integrating interpolated accesses.
Interpolation accesses work like `absolute_access` except they can be safely applied on all fields (i.e. with boundary checks).
More info here: !20
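To make "with boundary checks" concrete, here is a minimal NumPy sketch of a guarded 1D linear interpolation. It is not the code generated by pystencils; the function name and the choice of returning a fixed value outside the field are assumptions for illustration.

```python
import numpy as np


def interpolate_linear_1d(field, x, out_of_bounds_value=0.0):
    """Linearly interpolate `field` at fractional index `x`, guarding every read."""

    def guarded(i):
        # Only touch memory if the index is inside the field.
        return field[i] if 0 <= i < len(field) else out_of_bounds_value

    lower = int(np.floor(x))
    weight = x - lower
    return (1.0 - weight) * guarded(lower) + weight * guarded(lower + 1)


field = np.array([0.0, 10.0, 20.0])
inside = interpolate_linear_1d(field, 1.25)    # between field[1] and field[2]
at_edge = interpolate_linear_1d(field, 2.5)    # upper neighbour is out of bounds
outside = interpolate_linear_1d(field, -3.0)   # both neighbours are out of bounds
```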
This PR contains some dead code that uses https://github.com/theHamsta/CubicInterpolationCUDA . I have not included it as a submodule in pystencils in this PR.
This PR breaks the hash of these two tests:
```
[gw11] [ 14%] FAILED lbmpy_tests/test_code_hashequivalence.py::test_hash_equivalence_llvm
lbmpy_tests/test_conserved_quantity_relaxation_invariance.py::test_srt
[gw8] [ 15%] FAILED lbmpy_tests/test_code_hashequivalence.py::test_hash_equivalence
```

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/58
Extra asserts sympy issue
2019-09-25T15:38:17+02:00
Stephan Seitz
Add extra assertions to be super sure.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/20
WIP: Astnodes for interpolation
2019-09-24T18:59:50+02:00
Stephan Seitz
This PR may still need some clean-up.
However, it would be good to receive some feedback already.
What works:
- Using CUDA textures
- Using HW accelerated interpolation for float32 textures
- Implementing linear interpolation either in software (CPU, GPU) or via texture accesses without HW interpolation but with HW boundary handling
- Adding transformed coordinate systems to fields
What does not work:
- HW boundary handling for CUDA textures for the boundary handling modes `mirror` and `wrap`. Apparently they have been removed from CUDA's API but are still present in pycuda. Now there's only:
```
cudaBoundaryModeZero = 0
Zero boundary mode
cudaBoundaryModeClamp = 1
Clamp boundary mode
cudaBoundaryModeTrap = 2
Trap boundary mode
```
Wtf is trap boundary mode? Nothing is documented so we can only experiment.
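For reference, the addressing modes can be emulated in software. The sketch below is plain NumPy with a hypothetical `fetch` helper; the semantics of `zero` and `clamp` are inferred from the mode names in the listing, `wrap` is taken to mean periodic indexing, and the trap mode is not modelled since its behaviour is unclear here.

```python
import numpy as np


def fetch(field, index, mode):
    """Read field[index] with a software boundary mode (hypothetical helper)."""
    n = len(field)
    if mode == "zero":
        # Out-of-range reads return zero.
        return field[index] if 0 <= index < n else 0.0
    if mode == "clamp":
        # Out-of-range reads return the nearest edge value.
        return field[min(max(index, 0), n - 1)]
    if mode == "wrap":
        # Periodic indexing.
        return field[index % n]
    raise ValueError("unknown boundary mode: %s" % mode)


field = np.array([1.0, 2.0, 3.0])
zero_left = fetch(field, -1, "zero")
clamp_left = fetch(field, -1, "clamp")
clamp_right = fetch(field, 5, "clamp")
wrap_left = fetch(field, -1, "wrap")
wrap_right = fetch(field, 4, "wrap")
```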
What kind of works:
- B-Spline interpolation on GPU using this repo as a submodule (http://www.dannyruijters.nl/cubicinterpolation/), too lazy for tests. Don't know how to prove correctness
- Textures for dtypes with itemsize > 4. PyCUDA has a helper header (https://github.com/inducer/pycuda/blob/master/pycuda/cuda/pycuda-helpers.hpp) that loads doubles by two int fetches. However, this hack seems to work only if we add a 0.5 offset and make all functions in this header accept float.
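
The "two int fetches" idea can be illustrated on the host with NumPy. This is only a sketch of the bit-level reinterpretation, under the assumption that the helper reassembles each double from two 32-bit halves; it does not involve textures or the 0.5 offset mentioned above, and the variable names are made up.

```python
import numpy as np

values = np.array([1.5, -2.25, 3.0e10], dtype=np.float64)

# Reinterpret each 8-byte double as two 4-byte integers (no copy, no conversion).
int_pairs = values.view(np.int32).reshape(-1, 2)

# A fetch would return one such pair per element; gluing the two halves
# back together recovers the original bit pattern.
restored = int_pairs.reshape(-1).view(np.float64)
```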