pystencils issues
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues
2021-04-12T10:40:22+02:00

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/25
Non-temporal stores do not use fences
2021-04-12T10:40:22+02:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
When vectorization is enabled, instructions like `_mm(|256|512)_stream_p[sd]` are generated. However, the corresponding fence `_mm_mfence` is never generated. This is not a problem in practice as enough time will have passed by the time the data is next read. However, an explicit fence should be added to guarantee safety.
Markus Holzer

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/24
CBackend uses aligned_alloc, which requires C++17
2021-08-19T11:13:32+02:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
backends/cbackend.py generates code that contains `aligned_alloc`. This is incompatible with our default compiler flags, which include `-std=c++11`. It is also incompatible with Walberla, which defaults to C++14. I guess GCC doesn't care, but I've seen the issue come up with the latest Apple Clang, which interprets the standard more strictly than it used to.
We need to fall back to `posix_memalign` on POSIX and `_aligned_malloc` on Windows.
Markus Holzer

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/23
Cannot simplify piecewise function with field access in condition
2021-01-05T17:44:27+01:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
This fails:
```python
from pystencils.session import *
dh = ps.create_data_handling((20,20))
ρ = dh.add_array('rho')
pw = sp.Piecewise((0, 1 < sp.Max(-0.5, ρ.center+0.5)), (1, True))
sp.simplify(pw)
```
with the following error:
```
./pystencils/pystencils/field.py in __iter__(self)
760 """This is necessary to work with parts of sympy that test if an object is iterable (e.g. simplify).
761 The __getitem__ would make it iterable"""
--> 762 raise TypeError("Field access is not iterable")
763
764 @property
TypeError: Field access is not iterable
```
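The docstring quoted in the traceback hints at the mechanism: an object that defines `__getitem__` is iterable through Python's legacy sequence protocol, so `__iter__` is overridden to raise. Below is a minimal standalone sketch of that guard. The class names `Indexable` and `GuardedIndexable` and the helper `is_iterable` are made up for illustration and are not pystencils code. It does not show where `simplify` ends up iterating, only why the resulting error is a `TypeError`.

```python
class Indexable:
    """Defines only __getitem__, so the legacy sequence protocol makes it iterable."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index


class GuardedIndexable(Indexable):
    """Same indexing, but __iter__ raises so that iterability checks fail."""

    def __iter__(self):
        raise TypeError("Field access is not iterable")


def is_iterable(obj):
    """Iterability test in the style of 'try iter(), catch TypeError'."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


print(is_iterable(Indexable()), is_iterable(GuardedIndexable()))
```

Code that checks iterability first sees the guarded object as a scalar. Code that iterates without checking runs into the `TypeError` directly, which matches the failure above.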
Here are two similar examples that do not produce such an error:
```python
s = sp.Symbol("s")
pw = sp.Piecewise((0, 1 < sp.Max(-0.5, s+0.5)), (1, True))
sp.simplify(pw)
pw = sp.Piecewise((0, 1 < ρ.center+0.5), (1, True))
sp.simplify(pw)
```
Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/22
2nd order finite volume discretizer
2021-04-18T00:05:56+02:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
We currently have a 1st order FVM and 1st and 2nd order FDM discretizer. 2nd order FVM would be nice to have too. Some problems decompose into two subgrids with 1st order, causing stability problems unless extreme resolutions are used.

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/21
Increase Python minimum version to 3.8
2021-09-10T11:50:36+02:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
Once Ubuntu 20.04 has been out for a year and Anaconda supports it, let's update to Python 3.8.
Jan Hönig

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/20
Revamp the type system
2022-05-11T14:42:49+02:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
@bauer had planned to do that before he left, but ran out of time. He has some pretty clear ideas of how the type system should be. It should probably be done relatively quickly as it would make quite a few things a lot easier, e.g. the SIMD stuff. It is also necessary before things like #12 can be done properly.
The following test (for test_vectorization.py) does not work currently:
```python
def test_vectorized_loop_counter():
    arr = np.zeros((4, 4))

    @ps.kernel
    def kernel_equal(s):
        f = ps.fields(f=arr)
        f[0, 0] @= ps.astnodes.LoopOverCoordinate.get_loop_counter_symbol(1)

    ast = ps.create_kernel(kernel_equal)
    # Run the scalar kernel first to obtain the reference result.
    ast.compile()(f=arr)
    arr_ref = arr.copy()

    arr = np.zeros((4, 4))
    vectorize(ast, instruction_set=instruction_set)
    ps.show_code(ast)
    ast.compile()(f=arr)
    assert np.allclose(arr_ref, arr)
```
Another problem can be found in the [attached Jupyter notebook](/uploads/04f89bc361e00c144f14e5d657d78c4e/Untitled1.ipynb), where, in the absence of `type_all_numbers`, floating-point comparisons are used for integers in a conditional.
Finally, automatic conversion of arguments to the type expected by the signature of the SIMD intrinsics is not supported, which made https://i10git.cs.fau.de/walberla/walberla/-/merge_requests/414/diffs necessary.
Release 1.0
Markus Holzer

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/19
Need test job against older SymPy version
2020-01-24T12:15:59+01:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
My desktop computer runs Ubuntu 18.04, which includes SymPy 1.1.1. pystencils generally works fine on that version, but it is sufficiently different that we have occasionally broken support for it in the past, e.g. https://i10git.cs.fau.de/pycodegen/pystencils/merge_requests/105. Furthermore, a slightly different simplification engine in newer SymPy versions has previously masked actual bugs (https://i10git.cs.fau.de/pycodegen/lbmpy/merge_requests/14 / https://i10git.cs.fau.de/pycodegen/pystencils/commit/721fdf454c024bc1ed8db65b31a91cf802b2dae7). To solve this properly, we should define a minimum SymPy version and perform CI tests using that specific version. Currently we only test the latest release and the latest master.
Michael Kuron <mkuron@icp.uni-stuttgart.de>

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/18
Checking for double/int instead of np.floating np.integer in LLVM printer
2021-11-19T15:43:26+01:00
Stephan Seitz
Those checks could become problematic when compiling a kernel with float32/int32 instead of the default double/int types.
https://i10git.cs.fau.de/seitz/pystencils/blob/79a6e728e789a890ca36a7231baf4274487ddfdb/pystencils/llvm/llvm.py#L130
Jan Hönig

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/16
Staggered grids: allow more than just face neighbors
2019-12-01T09:44:31+01:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
When defining a kernel on a staggered grid, one can currently only store values on the faces/edges (3D/2D) of a cell. This is sufficient for common finite volume schemes. However, when one has a specific discretization in mind (in my case, Capuani's electrokinetics solver), one may also need to calculate fluxes for the edges/corners (3D/2D). For a volume-of-fluid-like scheme, one also needs the corners in 3D.
So currently we only support D3Q6/D2Q4 staggered grids and I need D3Q26/D2Q8. This was already discussed with @bauer and is probably not a whole lot of work to implement.
Michael Kuron <mkuron@icp.uni-stuttgart.de>

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/15
OpenCL backend still requires pycuda to be installed
2020-01-18T15:51:37+01:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
I wanted to try out the OpenCL backend on a machine that does not have the CUDA SDK or pycuda installed. Unfortunately, pystencils imports stuff from pycuda in multiple places throughout the code, so I cannot use the OpenCL backend on this machine.
Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/14
Minor installation issue if only cloning last commit
2021-03-15T10:42:15+01:00
Nils Kohl
If I clone pystencils with `--depth 1` the function `version_number_from_git()` in `pystencils/doc/`
will crash during `python setup.py` since `git tag` will return nothing:
```
$ git clone --depth 1 --branch hyteg git@i10git.cs.fau.de:pycodegen/pystencils.git
$ git tag
$ python3 setup.py develop
Traceback (most recent call last):
File "setup.py", line 52, in <module>
version=version_number_from_git(),
File "/builds/terraneo/pystencils/doc/version_from_git.py", line 18, in version_number_from_git
latest_release = get_released_versions()[-1]
IndexError: list index out of range
```

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/13
show how to create own opts with sympy.codegen.rewriting.optimize in doc
2021-12-10T12:19:59+01:00
Stephan Seitz
In `sympy.codegen.rewriting` there is a function `optimize`:
```python
def optimize(expr, optimizations):
    """ Apply optimizations to an expression.

    Parameters
    ==========

    expr : expression
    optimizations : iterable of ``Optimization`` instances
        The optimizations will be sorted with respect to ``priority`` (highest first).

    Examples
    ========

    >>> from sympy import log, Symbol
    >>> from sympy.codegen.rewriting import optims_c99, optimize
    >>> x = Symbol('x')
    >>> optimize(log(x+3)/log(2) + log(x**2 + 1), optims_c99)
    log1p(x**2) + log2(x + 3)

    """
    for optim in sorted(optimizations, key=lambda opt: opt.priority, reverse=True):
        new_expr = optim(expr)
        if optim.cost_function is None:
            expr = new_expr
        else:
            before, after = map(lambda x: optim.cost_function(x), (expr, new_expr))
            if before > after:
                expr = new_expr
    return expr
```
We should use it in `create_kernel` to profit from SymPy's optimizations (currently very few, and some of them duplicate optimizations that pystencils already has) and to make it easy for users to incorporate their own optimizations into the expressions. `create_kernel` could accept an `Iterable` of `Optimization`s, defaulting to the collection of optimizations that pystencils normally uses.
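To make the proposal concrete, here is a minimal sketch of a user-defined optimization passed to `optimize`. `ReplaceOptim` and `optimize` are the SymPy functions quoted above. The name `expand_squares` and its rule are invented for this example, and nothing here implies that `create_kernel` accepts such a list today.

```python
import sympy as sp
from sympy.codegen.rewriting import ReplaceOptim, optimize

# Hypothetical user-defined optimization: rewrite any square as an explicit
# product, regardless of whether the base is a plain symbol.
expand_squares = ReplaceOptim(
    lambda e: e.is_Pow and e.exp == 2,
    lambda p: sp.UnevaluatedExpr(sp.Mul(p.base, p.base, evaluate=False)),
)

x = sp.Symbol("x")
expr = sp.sin(x) ** 2 + x ** 2

# optimize() applies every optimization in the iterable, highest priority first.
optimized = optimize(expr, [expand_squares])
print(optimized)
```

The `UnevaluatedExpr` wrapper keeps SymPy from folding the product back into a power. Calling `.doit()` on the result recovers the original expression.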
Implementing your own optimization is straightforward, as SymPy's `create_expand_pow_optimization` shows:
```python
def create_expand_pow_optimization(limit):
    """ Creates an instance of :class:`ReplaceOptim` for expanding ``Pow``.

    The requirements for expansions are that the base needs to be a symbol
    and the exponent needs to be an Integer (and be less than or equal to
    ``limit``).

    Parameters
    ==========

    limit : int
        The highest power which is expanded into multiplication.

    Examples
    ========

    >>> from sympy import Symbol, sin
    >>> from sympy.codegen.rewriting import create_expand_pow_optimization
    >>> x = Symbol('x')
    >>> expand_opt = create_expand_pow_optimization(3)
    >>> expand_opt(x**5 + x**3)
    x**5 + x*x*x
    >>> expand_opt(x**5 + x**3 + sin(x)**3)
    x**5 + sin(x)**3 + x*x*x

    """
    return ReplaceOptim(
        lambda e: e.is_Pow and e.base.is_symbol and e.exp.is_Integer and abs(e.exp) <= limit,
        lambda p: (
            UnevaluatedExpr(Mul(*([p.base]*+p.exp), evaluate=False)) if p.exp > 0 else
            1/UnevaluatedExpr(Mul(*([p.base]*-p.exp), evaluate=False))
        ))
```
Release 1.0
Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/12
Vectorized Philox RNG
2021-02-11T14:49:03+01:00
Michael Kuron <mkuron@icp.uni-stuttgart.de>
Martin Bauer

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/11
FieldPointerSymbol is rendered as a sympy.Symbol instead of a TypedSymbol
2020-01-20T11:58:48+01:00
Stephan Seitz
`python3 setup.py quicktest` fails on current sympy master.
I don't know whether this is a temporary issue or a change of SymPy's behavior.

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/10
replace jinja2 with default python features
2019-08-08T12:21:54+02:00
Dominik Thoennes <dominik.thoennes@fau.de>
jinja2 should not be used in base pystencils since it introduces an additional dependency.
It is used, for example, in:
- https://i10git.cs.fau.de/pycodegen/pystencils/blob/master/pystencils/astnodes.py#L676

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/9
Explore usage of `fma` for optimization on CUDA
2019-08-21T18:39:09+02:00
Stephan Seitz
I don't know whether nvcc automatically uses `fma` (fused multiply-add) instructions when compiling with the `-fast-math` flag.
If not, it could be easy to use `fma` whenever possible to accelerate compute-bound kernels.
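At the expression level, using `fma` explicitly could look roughly like the sketch below. It relies on the `fma` function from `sympy.codegen.cfunctions`. The helper `contract_to_fma` is hypothetical, handles only a top-level sum, and says nothing about what nvcc does on its own.

```python
import sympy as sp
from sympy.codegen.cfunctions import fma


def contract_to_fma(expr):
    """Rewrite a top-level sum ``a*b + c`` as ``fma(a, b, c)``; return anything else unchanged."""
    if not expr.is_Add:
        return expr
    for i, term in enumerate(expr.args):
        if term.is_Mul:
            a, b = term.as_two_terms()  # split the product into two factors
            c = sp.Add(*(expr.args[:i] + expr.args[i + 1:]))  # remaining summands
            return fma(a, b, c)
    return expr


x, y, z = sp.symbols("x y z")
contracted = contract_to_fma(x * y + z)
print(contracted)
```

A real transformation would have to recurse into subexpressions and decide which product to fuse when a sum contains several.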
https://devblogs.nvidia.com/lerp-faster-cuda/

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/8
Simplification of derivation of gradient
2020-11-25T13:23:51+01:00
Markus Holzer
Once the weights of one direction have been computed, the weights of the other directions could be obtained by rotating the previously calculated ones. This functionality could be added to [derivation.py](https://i10git.cs.fau.de/pycodegen/pystencils/blob/master/pystencils/fd/derivation.py).

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/7
In compile_and_load C-code is generated twice
2021-03-03T16:41:23+01:00
Stephan Seitz
When I was reviewing my old code I saw that I placed a TODO here.
`generate_c` is just called to generate the hash and then again for the real code (if not loaded from shared object on disk).
Maybe the ast can be hashed directly instead (danger point: generate_c may have changed since last generation).
I remember that I directly hashed the assignments for generation of torch/tensorflow code.
This had the disadvantage that I had to deactivate caching when developing/changing code affecting `generate_c`.
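One way to keep hashing the generated source, so the cache still notices changes to `generate_c`, while generating it only once is sketched below. The `generate_c` here is a hypothetical stand-in that counts its calls, not the pystencils printer, and `hashed_module_source` is an invented helper name.

```python
import hashlib


def generate_c(ast):
    """Hypothetical stand-in for the real code printer; counts how often it runs."""
    generate_c.calls += 1
    return "void kernel() { /* %s */ }" % ast


generate_c.calls = 0


def hashed_module_source(ast):
    """Generate the source once and derive the module name from that same string."""
    source = generate_c(ast)
    module_name = "mod_" + hashlib.sha256(source.encode()).hexdigest()
    return module_name, source


module_name, source = hashed_module_source("dummy_ast")
print(module_name[:12], generate_c.calls)
```

Reusing the string in the actual function would require the module-code object to accept pre-generated source, which this sketch does not address.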
```python
def compile_and_load(ast):
    cache_config = get_cache_config()
    # TODO: inefficient to generate_c just for hash? could reuse it
    code_hash_str = "mod_" + hashlib.sha256(generate_c(ast).encode()).hexdigest()
    code = ExtensionModuleCode(module_name=code_hash_str)
    code.add_function(ast, ast.function_name)

    if cache_config['object_cache'] is False:
        with TemporaryDirectory() as base_dir:
            lib_file = compile_module(code, code_hash_str, base_dir)
            result = load_kernel_from_file(code_hash_str, ast.function_name, lib_file)
    else:
        lib_file = compile_module(code, code_hash_str, base_dir=cache_config['object_cache'])
        result = load_kernel_from_file(code_hash_str, ast.function_name, lib_file)

    rtn = KernelWrapper(result, ast.get_parameters(), ast)
    rtn.code = code.code
    return rtn
```

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/6
show_code does not show code
2020-01-23T15:22:33+01:00
Stephan Seitz
`show_code` does not show code but returns an object. Maybe rename it to `get_code`?
`show_code` would still be useful to implement `print(show_code(ast))`
When I first saw the function I dropped the return value.

https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/5
Compare pystencils and loopy
2020-06-08T13:30:29+02:00
Jan Hönig
Exclusive Features loopy
- more general indexing
- loop transformations
- unrolling
- tiling
- blocking
- OpenCL
Exclusive Features pystencils
- automatic compiling
- struct support??
- LLVM
- CUDA mapping strategies
Jan Hönig