pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
2019-08-06T22:04:11+02:00

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/21
Add RELEASE-VERSION to .gitignore
2019-08-06T22:04:11+02:00
Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/22
Add AssignmentCollection.has_exclusive_writes
2019-08-07T11:43:33+02:00
Stephan Seitz

An assumption of pystencils is that output stencil writes never overlap.
This allows massive parallelization without race conditions or atomics.
When I use my autodiff transformations I use this condition to check
whether the assumption still holds for the backward assignments.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/23
Seeding of RNG
2019-08-09T08:55:31+02:00
Michael Kuron, mkuron@icp.uni-stuttgart.de

For https://i10git.cs.fau.de/pycodegen/lbmpy/merge_requests/2

Martin Bauer

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/24
Remove deprecation warning ('cachedir' parameter has been deprecated)
2019-08-08T08:57:45+02:00
Stephan Seitz

Warning was:
```
/localhome/seitz_local/projects/pystencils/pystencils/cache.py:15: DeprecationWarning: The 'cachedir' parameter has been deprecated in
version 0.12 and will be removed in version 0.14.
You provided "cachedir='/localhome/seitz_local/.cache/pystencils'", use "location='/localhome/seitz_local/.cache/pystencils'" instead.
disk_cache = Memory(cachedir=cache_dir, verbose=False).cache
```

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/25
Make generate_c also work if astnode does not have member `instruction_set`
2019-08-06T22:07:34+02:00
Stephan Seitz

generate_c currently only works for KernelFunctions, since member `instruction_set` is required.
generate_c can generate code for any astnode if this requirement is dropped.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/26
Add pystencils-autodiff
2019-08-09T08:54:59+02:00
Stephan Seitz

This adds pystencils_autodiff (https://pypi.org/project/pystencils-autodiff/0.1.3/) to pystencils.
After installing the extension, you can access all its classes in the submodule `pystencils.autodiff`.
If it's not installed but you try to import it you get an error with installation instructions.
The internal code of pystencils_autodiff is still very ugly.
I hope I can clean it up in the coming days or weeks.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/27
Fix error message of CBackend for unsupported nodes
2019-08-15T09:15:02+02:00
Stephan Seitz

Concatenating `__class__` and `str` is not supported. It should be `str(type(self))` (full type path) or `self.__class__.__name__` (just the class name).

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/28
Philox tests and clean up
2019-08-13T14:17:37+02:00
Michael Kuron, mkuron@icp.uni-stuttgart.de

Test the Philox against reference data and clean up duplicated code in the code generation. The latter will make it easier to later add a vectorized Philox.

Martin Bauer

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/29
Basic support for OpenCL (experimental)
2019-08-22T08:37:37+02:00
Stephan Seitz

Basic support for OpenCL
Problem: OpenCL cannot import `stdint.h`. Temporary fix: define custom `opencl_stdint.h` (~~currently defines only `int64_t`~~)
TODO:
- ~~implement `opencl_stdint.h`~~
- implement shared memory, textures, built-in functions
- ~~avoid CUDA intrinsics (`fast_div`)~~

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/30
AES-NI Random Number Generator
2019-09-02T10:21:21+02:00
Michael Kuron, mkuron@icp.uni-stuttgart.de

I was looking at how to vectorize the Philox RNG yesterday. Before I knew it, I had implemented a working RNG using AES-NI instructions :nerd: ... Not entirely what I had intended to do, but it might still be useful to someone and should be about as fast as a vectorized Philox.
There is one place that could be optimized because I fall back to scalar instructions: I failed to reimplement `_mm_cvtepu64_pd` (the solution from https://stackoverflow.com/a/41148578 produces incorrect results in the least-significant half of the mantissa). Perhaps someone else can try to fix that.
I did not integrate this with the `vector_instruction_set` parameter of the code generation. Perhaps you can do that, @bauer. It needs support for SSE2 and AES instructions (which look like SSE2 instructions, but their availability is determined by a separate CPUID flag). It will also make use of `_mm_cvtepu32_ps` and `_mm_cvtepu64_pd` from AVX512 if available (these are 128-bit instructions that actually look like SSE2 instructions).

Martin Bauer

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/31
Bugfix: TypedSymbol.is_negative should not be implemented in terms of super().is_positive
2019-08-14T17:03:02+02:00
Stephan Seitz

This can lead to surprising simplifications.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/32
Bugfix: Readd __launch_bounds__ for dialect 'cuda'
2019-08-15T09:14:26+02:00
Stephan Seitz

`__launch_bounds__` was deactivated when introducing `CudaBackend`.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/33
Add KernelFunction.fields_written
2019-08-16T08:59:16+02:00
Stephan Seitz

I found myself needing this convenience wrapper in various places.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/34
Address #13: Use sympy.codegen.rewriting.optimize
2019-09-23T10:55:13+02:00
Stephan Seitz

It's really comfortable to write optimizations in terms of `sympy.codegen.rewriting.ReplaceOptim`:
```python
from sympy.codegen.rewriting import ReplaceOptim

# Evaluates all constant terms
evaluate_constant_terms = ReplaceOptim(
    # `is_constant` is a method and has to be called; the bare attribute is always truthy
    lambda e: hasattr(e, 'is_constant') and e.is_constant(),
    lambda p: p.evalf()
)
```
This PR adds a parameter `sympy_optimizations` to the `create_*_kernel` functions that applies the list of optimizations to the assignments before creating the AST.
`sympy.codegen.rewriting` already ships some optimizations, some of them similar to the optimizations in pystencils.
For example `create_expand_pow_optimization(limit)` is really similar to the logic in `CustomSympyPrinter._print_Pow`.
See #13
Problem: old versions of sympy (e.g. from the Ubuntu CI) don't have `sympy.codegen.rewriting`. The optimizations are skipped in that case. `test_and_coverage` applies all optimizations.
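
The fallback for old sympy versions could look roughly like the sketch below. It is an illustration only, not the code from this merge request: `apply_sympy_optimizations` and `HAS_REWRITING` are hypothetical names, and the sketch assumes that "skipped" means the expressions are returned unchanged.

```python
try:
    # Only present in newer sympy releases
    from sympy.codegen.rewriting import optimize
    HAS_REWRITING = True
except ImportError:
    optimize = None
    HAS_REWRITING = False


def apply_sympy_optimizations(expressions, optimizations):
    """Apply each optimization to each expression (hypothetical helper).

    Returns the expressions unchanged if the rewriting module is missing
    or if no optimizations were requested.
    """
    if not HAS_REWRITING or not optimizations:
        return list(expressions)
    return [optimize(expr, optimizations) for expr in expressions]
```

With this shape the calling code does not need to know which sympy version is installed; it always gets a list of expressions back.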
We could also try to implement an FMA optimization (fused multiply-add) with that and `sympy.Wild`.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/35
Fix get_type_of_expression for constants like sympy.pi
2019-08-22T08:31:17+02:00
Stephan Seitz

Problem: some constant expressions are neither `Float`, `Integer`, nor `Rational`, and
don't have arguments.
```python
>>> from sympy import *
>>> isinstance(pi, Integer)
False
>>> isinstance(pi, Float)
False
>>> isinstance(pi, Rational)
False
>>> pi.args
()
```

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/36
Pre-push hook
2019-08-20T16:28:40+02:00
Stephan Seitz

This prevents me from pushing stuff that fails either quicktest or flake8.
Has to be copied manually to `.git/hooks` and `python3` has to be adapted to your Python executable.
~~Is there an update in flake8 that `.flake8` is not recognized automatically anymore and that we need to append C901?~~
I probably just installed a different linter on my PC at home. flake8 can use different linters.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/37
Remove main methods from tests (sorry for adding them)
2019-08-20T16:27:21+02:00
Stephan Seitz

... or code will be executed when pytest is collecting the tests.
I found out that I can use "-s" to convince vim-test to show me test
output.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/38
Implement sp.Sum, sp.Product
2019-08-21T18:45:35+02:00
Stephan Seitz

Sum and Product have an indexing variable which is an Atom but not a free
symbol. So the logic that determines the undefined symbols in a `SympyAssignment` should not be
`atoms(sp.Symbol)` but `free_symbols`. `sp.Indexed` from the `ResolvedFieldAccess`es forms an edge case.
So we could also use `atoms(sp.Symbol).intersection(...free_symbols)`.
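
The difference between the two queries can be checked with plain sympy. The snippet below is only an illustration with made-up symbol names; it does not touch `SympyAssignment` or any other pystencils class.

```python
import sympy as sp

x, n = sp.symbols("x n")
i = sp.Symbol("i", integer=True)

# `i` is the bound summation index, `x` and `n` come from outside
expr = sp.Sum(x**i, (i, 0, n))

all_symbols = expr.atoms(sp.Symbol)   # every Symbol atom, bound or not
free = expr.free_symbols              # only symbols that must be defined outside
bound = all_symbols - free            # what atoms() would wrongly report as undefined

print(sorted(s.name for s in all_symbols))  # ['i', 'n', 'x']
print(sorted(s.name for s in free))         # ['n', 'x']
print(sorted(s.name for s in bound))        # ['i']
```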
I hope I extracted all the code necessary to implement this feature from my fork.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/39
Conftest: waLBerla, kerncraft
2019-08-20T16:26:56+02:00
Stephan Seitz

- Add `waLBerla` to conftest
- Add missing file to conftest for `kerncraft`

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/40
Conftest: ignore two more files if waLBerla is not available
2019-08-21T18:43:43+02:00
Stephan Seitz

- Conftest: ignore two more files if waLBerla is not available (needed when executing
- Skip collection of `pystencils.autodiff` always (not only if `'CI' in os.environ`)
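
As a rough sketch of how such rules can be expressed in a `conftest.py`: pytest reads a module-level list named `collect_ignore` and skips the paths listed in it during collection. The file names below are placeholders (the merge request does not name the files), and `module_available` is a hypothetical helper, so this shows the pattern rather than the actual pystencils conftest.

```python
import importlib.util
import os


def module_available(name):
    """Return True if the top-level module `name` could be imported (hypothetical helper)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


# pytest skips every path listed here during test collection
collect_ignore = []

if not module_available("waLBerla"):
    # Placeholder paths: the real file names are not given in the merge request
    collect_ignore += [
        os.path.join("pystencils_tests", "placeholder_walberla_test_1.py"),
        os.path.join("pystencils_tests", "placeholder_walberla_test_2.py"),
    ]

# Skipped unconditionally, not only when 'CI' is in os.environ (path is an assumption)
collect_ignore.append(os.path.join("pystencils", "autodiff.py"))
```

Checking availability with `find_spec` avoids importing the optional package just to find out whether it is installed.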