pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
2021-05-03T14:19:21+02:00

atomically write cache status file
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/243
2021-05-03T14:19:21+02:00, Michael Kuron (mkuron@icp.uni-stuttgart.de)
Try to fix https://i10git.cs.fau.de/pycodegen/pycodegen/-/jobs/568263, introduced in !240
Assignee: Markus Holzer

AssignmentCollection.__bool__
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/108
2019-12-17T13:30:15+01:00, Stephan Seitz
`AssignmentCollection` can be used in many cases where you could also use
a `List[Assignment]`. With `AssignmentCollection.__bool__`, an empty
AssignmentCollection is falsy and a non-empty one truthy.
So you can `assert assignments, 'must not be empty'`.

ARM Neon: Fix makeVec and add Philox
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/222
2021-03-16T20:29:52+01:00, Michael Kuron (mkuron@icp.uni-stuttgart.de)
I did a quick find-and-replace translation of the Philox from SSE to Neon and noticed that `makeVec` was broken. Tests pass now.
Assignee: Markus Holzer

ARM for linux
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/320
2023-03-27T20:07:49+02:00, Helen Schottenhamml
Until now, ARM architectures have only been allowed on Darwin systems. This MR extends their usage to Linux systems.
Assignee: Helen Schottenhamml

Append assignments to KernelFunction (for later analysis etc.)
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/140
2020-02-22T11:23:51+01:00, Stephan Seitz
I wonder if it would be a good idea to append the assignments from which a `KernelFunction` was created to `KernelFunction`.

Always use codegen.rewriting.optimize
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/54
2019-09-23T13:38:39+02:00, Stephan Seitz
Pretty much !34 but with the changes to `create_kernel`. Can be closed if not wanted. Leaving it here for archiving purposes.
!34 has the workflow:
```python
assignments = optimize(assignments, optimizations)
ast = create_kernel(assignments)
```

Allow vector assignments
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/133
2020-01-21T22:04:08+01:00, Stephan Seitz
```python
>>> import pystencils as ps
>>> import sympy as sp
>>> a, b, c = sp.symbols("a b c")
>>> ps.Assignment(sp.Matrix([a, b, c]), sp.Matrix([1, 2, 3]))
(Assignment(a, 1), Assignment(b, 2), Assignment(c, 3))
```
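
The element-wise expansion shown above can be sketched with plain SymPy. This is only an illustration under stated assumptions, not the pystencils implementation: `expand_vector_assignment` is a hypothetical helper, and `sp.Eq` stands in for `ps.Assignment`.

```python
import sympy as sp


def expand_vector_assignment(lhs, rhs):
    """Pair up the entries of two equally shaped matrices (hypothetical helper)."""
    if lhs.shape != rhs.shape:
        raise ValueError("lhs and rhs must have the same shape")
    # Iterating over a SymPy matrix yields its entries one by one.
    # sp.Eq is used here only as a stand-in for ps.Assignment.
    return tuple(sp.Eq(l, r, evaluate=False) for l, r in zip(lhs, rhs))


a, b, c = sp.symbols("a b c")
pairs = expand_vector_assignment(sp.Matrix([a, b, c]), sp.Matrix([1, 2, 3]))
```

Each entry of `pairs` holds one left-hand symbol and its matching right-hand value, mirroring the tuple of scalar assignments in the doctest.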
Fixes #17

Allow functions for Field.coordinate_transform
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/118
2020-01-09T20:20:23+01:00, Stephan Seitz
This lets you quickly switch to polar coordinate systems:
```python
field.coordinate_transform = lambda x: sympy.Matrix((x.norm(), sympy.atan2(*x) / (2 * sympy.pi) * field.shape[1]))
```

Allow failure
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/195
2020-12-08T13:15:47+01:00, Markus Holzer
The integration pipeline should not be necessary for an MR.
The integration will consequently be checked with the pycodegen repo before releasing. It should not be necessary for every MR, because waLBerla's codegen depends on pystencils and not pystencils on waLBerla.
Assignee: Stephan Seitz

Allow `CUSTOM_FIELD`s to have a different size in transformations.py
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/107
2019-12-17T13:16:06+01:00, Stephan Seitz

Allow **kernel_creation_args in create_boundary_kernel
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/181
2020-10-30T10:47:29+01:00, Stephan Seitz

AES-NI vectorization improvements
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/46
2019-09-17T09:08:05+02:00, Michael Kuron (mkuron@icp.uni-stuttgart.de)
!30 didn't implement an SSE-vectorized `_mm_cvtepu64_pd` equivalent because the [stackoverflow](https://stackoverflow.com/a/41148578) solution didn't work. That turned out to be due to a bad optimization in GCC 5+ in fast-math mode. None of the other compilers (Clang, Intel, MSVC) have that issue, so we just disable fast-math for that function.
Also, we now use fused multiply-add if available.
Assignee: Martin Bauer

AES-NI Random Number Generator
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/30
2019-09-02T10:21:21+02:00, Michael Kuron (mkuron@icp.uni-stuttgart.de)
I was looking at how to vectorize the Philox RNG yesterday. Before I knew it, I had implemented a working RNG using AES-NI instructions :nerd: ... Not entirely what I had intended to do, but it might still be useful to someone and should be about as fast as a vectorized Philox.
There is one place that could be optimized because I fall back to scalar instructions: I failed to reimplement `_mm_cvtepu64_pd` (the solution from https://stackoverflow.com/a/41148578 produces incorrect results in the least-significant half of the mantissa). Perhaps someone else can try to fix that.
I did not integrate this with the `vector_instruction_set` parameter of the code generation. Perhaps you can do that, @bauer. It needs support for SSE2 and AES instructions (which look like SSE2 instructions, but their availability is determined by a separate CPUID flag). It will also make use of `_mm_cvtepu32_ps` and `_mm_cvtepu64_pd` from AVX512 if available (these are 128-bit instructions that actually look like SSE2 instructions).
Assignee: Martin Bauer

Advanced Subexpression Insertion
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/258
2021-07-28T22:10:01+02:00, Frederik Hennig
Moved a few methods for elimination of selected subexpressions from lbmpy to pystencils. Helpful to control the granularity of common subexpression elimination, expression tree cleanup, and potentially to simplify equations by substituting constant or zero subexpressions.

Address of SymPy-Function `address_of`
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/1
2019-07-10T16:14:26+02:00, Stephan Seitz
Some CUDA functions (like `atomic_add`) require pointers to data. This PR adds a SymPy function representing the C address-of operator (`&`).
I tried to trigger CSE to show a problem related to this function (dummy variables were not typed correctly as pointers). I'll include the fix in a follow-up PR.

Address #13: Use sympy.codegen.rewriting.optimize
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/34
2019-09-23T10:55:13+02:00, Stephan Seitz
It's really convenient to write optimizations in terms of `sympy.codegen.rewriting.ReplaceOptim`:
```python
from sympy.codegen.rewriting import ReplaceOptim

# Evaluates all constant terms
evaluate_constant_terms = ReplaceOptim(
    lambda e: hasattr(e, 'is_constant') and e.is_constant,
    lambda p: p.evalf()
)
```
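
As a rough illustration of how such an optimization is applied, the sketch below uses only SymPy's own `optimize` and `create_expand_pow_optimization` from `sympy.codegen.rewriting`. It does not go through the `sympy_optimizations` parameter described in this PR, and the expression and the limit of 3 are arbitrary choices for the example.

```python
import sympy as sp
from sympy.codegen.rewriting import create_expand_pow_optimization, optimize

x = sp.Symbol("x")

# Rewrite small integer powers (here: exponents up to 3) as explicit products.
expand_pow = create_expand_pow_optimization(3)

expr = x**3 + sp.sin(x)

# optimize() applies each optimization in the list to the expression.
optimized = optimize(expr, [expand_pow])
```

The result is mathematically the same expression, but the power is held as an unevaluated product, so a code printer can emit multiplications for it.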
This PR adds a parameter `sympy_optimizations` to the `create_*_kernel` functions that applies the list of optimizations to the assignments before creating the AST.
`sympy.codegen.rewriting` already has some optimizations, some of them similar to the optimizations of pystencils.
For example, `create_expand_pow_optimization(limit)` is really similar to the logic in `CustomSympyPrinter._print_Pow`.
See #13
Problem: old versions of sympy (e.g. from the Ubuntu CI) don't have `sympy.codegen.rewriting`. The optimizations are skipped in that case. `test_and_coverage` applies all optimizations.
We could also try to implement an FMA optimization (fused multiply-add) with that and `sympy.Wild`.

Added version number to pystencils
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/197
2021-01-26T08:44:01+01:00, Markus Holzer
Pystencils should have a `__version__` attribute
Assignee: Michael Kuron (mkuron@icp.uni-stuttgart.de)

Added simplify_by_equality
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/286
2022-03-16T10:20:12+01:00, Frederik Hennig
Added a simplification function to simplify expressions using a given equality $`a = b + c`$, by replacing occurrences of e.g. $`b + c`$ by $`a`$, or $`a - b`$ by $`c`$.

Added guard around import to avoid failing when walberla is there but no python module is built
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/147
2020-02-21T15:15:39+01:00, Christoph Rettinger
Assignee: Stephan Seitz

add_types: only re-enable double-write check if it was previously enabled
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/156
2020-06-15T20:57:20+02:00, Michael Kuron (mkuron@icp.uni-stuttgart.de)
Otherwise you can't generate certain boundary kernels that contain conditionals.
Assignee: Markus Holzer