pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
2019-07-10T17:04:10+02:00

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/6
Fix deprecation warning: `collections.abc` instead of `abc`
2019-07-10T17:04:10+02:00
Stephan Seitz
DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/2
Add autodiff
2019-07-11T12:12:00+02:00
Stephan Seitz
Draft for minimal integration of automatic differentiation. The TensorFlow and Torch backends were removed (apart from AutoDiffOp.create_tensorflow()).
Only tests without LBM or tf/torch dependencies have been added. This means numeric gradient checking is also missing (it depends on either tf or torch). Should we really move the backends into separate modules?
Apart from adding auto-differentiation functionality, I added two more changes. Would be happy to AutoDiffOp.create_tensorflow() after feedback.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/11
Fixup for DestructuringBindingsForFieldClass
2019-07-18T10:28:09+02:00
Stephan Seitz
- rename header: Field.h is not a unique name in waLBerla context
- add PyStencilsField.h
- bindings were lacking data type

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/8
Auto-format pystencils/rng.py (trailing whitespace)
2019-07-18T10:28:27+02:00
Stephan Seitz
My editor feels better if that whitespace is not there.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/7
Add `pystencils.make_python_function` used for KernelFunction.compile
2019-07-18T11:56:29+02:00
Stephan Seitz
`KernelFunction.compile = None` is currently set by the
`create_kernel` function of each respective backend as partial function
of `<backend>.make_python_function`.
The code would be clearer with a unified `make_python_function`.
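
A minimal sketch of what such a unified entry point might look like. The backend names, the `_BACKENDS` registry, and the stand-in factory functions are hypothetical and only illustrate the dispatch; they are not taken from the pystencils code base.

```python
# Hypothetical stand-ins for the per-backend `make_python_function` helpers.
def _make_cpu_function(kernel_name):
    return lambda: f"{kernel_name} compiled for cpu"


def _make_gpu_function(kernel_name):
    return lambda: f"{kernel_name} compiled for gpu"


# Assumed registry that maps a backend name to its factory.
_BACKENDS = {"cpu": _make_cpu_function, "gpu": _make_gpu_function}


def make_python_function(kernel_name, backend):
    """Single entry point that dispatches to the backend-specific factory."""
    try:
        factory = _BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return factory(kernel_name)


class KernelFunction:
    def __init__(self, name, backend):
        self.name = name
        self.backend = backend

    def compile(self):
        # The backend is looked up here, so nothing has to be patched in
        # from outside as a partial function.
        return make_python_function(self.name, self.backend)


kernel = KernelFunction("jacobi", "cpu").compile()
print(kernel())
```
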
`KernelFunction.compile` can then be implemented as a call to this
function with the respective backend.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/13
implemented derivation of gradient weights via rotation
2019-08-03T14:01:23+02:00
Markus Holzer
derive gradient weights of other direction with
already calculated weights of one direction
via rotation and apply them to a field.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/10
Add CustomSympyPrinter._print_Sum
2019-08-05T16:44:54+02:00
Stephan Seitz
This makes sympy.Sum printable as an instantaneously invoked lambda (Attention: C++-only, works in CUDA)

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/19
Run flynt (https://pypi.org/project/flynt/) on pystencils
2019-08-06T08:03:25+02:00
Stephan Seitz
This replaces usages of "" % s by Python's f-strings.
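
For illustration, a small sketch of the kind of rewrite meant here. The variable names and the format string are made up for this example, and the f-string is simply an equivalent hand-written form, not verified flynt output.

```python
name, count = "kernel", 3

# Old style: %-formatting with a tuple of arguments.
old_style = "%s has %d loops" % (name, count)

# Equivalent f-string: the expressions move into the string itself.
new_style = f"{name} has {count} loops"

# Both spellings produce the same text.
assert old_style == new_style
print(new_style)
```
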
Is this a good thing? I don't know. :shrug:

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/17
Fix #10: Add jinja2 to pystencils's dependencies
2019-08-06T08:05:38+02:00
Stephan Seitz
The alternative would be to remove jinja2 (see other PR).
However, I think the dependency on jinja2 is not too heavy.
This could make some implementations more elegant.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/22
Add AssignmentCollection.has_exclusive_writes
2019-08-07T11:43:33+02:00
Stephan Seitz
An assumption of pystencils is that output stencil writes never overlap.
This allows massive parallelization without race conditions or atomics.
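
A rough sketch of how such a check could work, under simplifying assumptions: each write is modelled as a hypothetical `(field_name, relative_offset)` pair rather than a real pystencils field access, and writes count as exclusive when every field is written at only one relative offset. Because the kernel runs for every cell, two different offsets into the same field would hit the same memory location from neighbouring cells.

```python
from collections import defaultdict


def has_exclusive_writes(writes):
    """Return True if no field is written at more than one relative offset.

    `writes` is an iterable of (field_name, offset) pairs; this
    representation is an assumption made for the sketch.
    """
    offsets_per_field = defaultdict(set)
    for field_name, offset in writes:
        offsets_per_field[field_name].add(tuple(offset))
    return all(len(offsets) == 1 for offsets in offsets_per_field.values())


# Forward kernel: every cell writes only to its own location.
forward_writes = [("dst", (0, 0))]

# Backward kernel: every cell scatters into two neighbours, so writes
# issued from adjacent cells can collide.
backward_writes = [("src_grad", (1, 0)), ("src_grad", (-1, 0))]

print(has_exclusive_writes(forward_writes))
print(has_exclusive_writes(backward_writes))
```
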
When I use my autodiff transformations I use this condition to check
whether the assumption still holds for the backward assignments.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/26
Add pystencils-autodiff
2019-08-09T08:54:59+02:00
Stephan Seitz
This adds pystencils_autodiff (https://pypi.org/project/pystencils-autodiff/0.1.3/) to pystencils.
After installing the extension, you can access all its classes in the submodule `pystencils.autodiff`.
If it's not installed but you try to import it you get an error with installation instructions.
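
The optional-import behaviour could be sketched roughly as follows. The helper name `import_optional` and the wording of the message are assumptions for illustration, not the code in this merge request.

```python
import importlib


def import_optional(module_name, pip_name):
    """Import `module_name`, or fail with installation instructions."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is not installed. "
            f"Install it with: pip install {pip_name}"
        ) from exc


# A module that exists is returned unchanged.
json_module = import_optional("json", "json")

# A missing module produces an error that tells the user what to install.
try:
    import_optional("module_that_is_not_installed_xyz", "some-package")
except ImportError as error:
    message = str(error)
    print(message)
```
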
The internal code of pystencils_autodiff is still very ugly.
I hope I can clean it up in the next days/weeks.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/23
Seeding of RNG
2019-08-09T08:55:31+02:00
Michael Kuron
mkuron@icp.uni-stuttgart.de
For https://i10git.cs.fau.de/pycodegen/lbmpy/merge_requests/2
Martin Bauer

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/42
WIP: make kerncraft/matplotlib tests pass with new image
2019-09-02T08:28:51+02:00
Stephan Seitz
-

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/54
Always use codegen.rewriting.optimize
2019-09-23T13:38:39+02:00
Stephan Seitz
Pretty much !34 but with the changes to `create_kernel`. Can be closed if not wanted. Leaving it here for archiving purposes.
!34 has the workflow:
```python
assignments = optimize(assignments, optimizations)
ast = create_kernel(assignments)
```

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/20
WIP: Astnodes for interpolation
2019-09-24T18:59:50+02:00
Stephan Seitz
This PR maybe still needs some clean-up.
However, it would be good to receive some feedback already.
What works:
- Using CUDA textures
- Using HW accelerated interpolation for float32 textures
- Implement linear interpolations either via software (CPU, GPU), texture accesses without HW-interpolation but HW boundary handling
- Adding transformed coordinate systems to fields
What does not work:
- HW boundary handling for CUDA textures for the boundary handling modes `mirror` and `wrap` (apparently they have been removed from CUDA's API but are still present in pycuda). Now there's only
```
cudaBoundaryModeZero = 0
Zero boundary mode
cudaBoundaryModeClamp = 1
Clamp boundary mode
cudaBoundaryModeTrap = 2
Trap boundary mode
```
Wtf is trap boundary mode? Nothing is documented so we can only experiment.
What kind of works:
- B-Spline interpolation on GPU using this repo as a submodule (http://www.dannyruijters.nl/cubicinterpolation/), too lazy for tests. Don't know how to prove correctness
- Textures for dtypes with itemsize > 4. PyCUDA has a helper header (https://github.com/inducer/pycuda/blob/master/pycuda/cuda/pycuda-helpers.hpp) that loads doubles by two int fetches. However, this hack seems to work only if we add a 0.5 offset and make all functions in this header accept float.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/67
Add ConditionalFieldAccess (Field.Access after out-of-bounds check)
2019-10-01T15:36:36+02:00
Stephan Seitz
Adds a wrapper around a `Field.Access` so that the access is only performed if a certain condition is met.
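
The idea can be sketched in plain Python on a 1D list. The helper `conditional_access` and the default value of `0.0` for out-of-bounds reads are assumptions for this illustration; the real wrapper operates on symbolic field accesses.

```python
def conditional_access(field, index, default=0.0):
    """Read field[index] only if the index is in bounds, else use `default`."""
    in_bounds = 0 <= index < len(field)
    return field[index] if in_bounds else default


def central_difference(field):
    # Without ghost layers, the neighbours of the first and last cell lie
    # outside the array; the guarded access replaces them with the default.
    return [
        0.5 * (conditional_access(field, i + 1) - conditional_access(field, i - 1))
        for i in range(len(field))
    ]


result = central_difference([1.0, 2.0, 4.0])
print(result)
```
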
If I use this, I can safely perform calculations and adjoint calculations with `ghost_layers=0` and obtain the correct gradients without separate boundary handling.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/47
WIP: Complex number support
2019-10-11T17:57:49+02:00
Stephan Seitz
Depends on !43
Pystencils should eventually support complex numbers, even if complex fields can be considered harmful for CPU vectorization. The concept is nice since SymPy and Python support complex numbers and there should be no performance disadvantage for normal CPU and GPU code. Many applications in physics and signal processing rely on complex numbers.
Complex output fields can be passed directly to libraries like `cufft`.
Problem: In C++, one cannot mix calculations on `std::complex<float>` with `std::complex<double>`. So the user has to specify `data_type='float32'` when single-precision complex floats are desired.
TODO:
* GPU support with the header pycuda provides
* only use `complex_helper.h` when needed
* remove commits from !34 (probably the code will be changed)
* rebase -i

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/100
Fix Opencl and LLVM GPU tests
2019-12-05T11:02:49+01:00
Stephan Seitz
Fix tests for LLVM GPU and OpenCL
- !96 made it impossible to print functions without names (only important for LLVM GPU test)
- !87 made it impossible to run OpenCL kernels on CUDA OpenCL. `int(...)` is not a valid cast for it
- SymPy moved `sympy.boolalg` to `sympy.logic.boolalg`

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/85
Opencl datahandling
2019-12-05T12:14:33+01:00
Stephan Seitz
Closes #15
OpenCL kernels are now integrated in the normal `create_kernel` workflow. There is also an `opencljit.init_globally` function that just creates some CL queue/context if you do not want to give it as a parameter to every kernel.
SerialDatahandling is extended to work with GPU array libraries other than PyCuda.
There is now some overlapping code with the `_custom_transfer_functions`, but I suppose they are for certain quantities that have a separate transfer function as opposed to using a whole different backend.
@kuron can you have a look at it? I think the solution is not as elegant as I thought it would be.
pycuda.gpuarray.GPUArrays are not wrapped. So if you use `dh.gpuarrays['foo']` you get either a pycuda array or an opencl array. I thought this step would be too drastic for one PR. Using OpenCL should still be a lot easier now.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/112
Add CI minimal CI test for old sympy
2019-12-17T18:51:26+01:00
Stephan Seitz
The minimal test cannot catch everything but it's something.