pystencils issues
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues
Updated: 2024-03-27T17:25:53+01:00

Issue #88: Clarify semantics of fancy integer division functions
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/88
Updated: 2024-03-27T17:25:53+01:00
Author: Frederik Hennig

The functions `modulo_floor`, `modulo_ceil`, `div_floor` and `div_ceil` of the `integer_functions` module exhibit unclear rounding behaviour. Their names and docstrings indicate mathematical rounding behaviour ("down" is toward negative infinity), while their implementation rounds toward zero ("down" is toward zero), as this is the default behaviour of C `/` and `%`.
We should clarify the semantics of these functions and adapt docstring, implementation, or both.
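The two conventions only disagree when the operands have different signs. The following sketch is not pystencils code; `trunc_div`, `trunc_mod`, `floor_div` and `floor_mod` are hypothetical helper names that contrast C-style truncation with mathematical floor rounding on plain Python integers:

```python
def trunc_div(a: int, b: int) -> int:
    """Quotient rounded toward zero, as C's `/` does for integers."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def trunc_mod(a: int, b: int) -> int:
    """Remainder matching trunc_div, as C's `%` does (sign follows `a`)."""
    return a - b * trunc_div(a, b)


def floor_div(a: int, b: int) -> int:
    """Quotient rounded toward negative infinity (Python's `//`)."""
    return a // b


def floor_mod(a: int, b: int) -> int:
    """Remainder matching floor_div (Python's `%`, sign follows `b`)."""
    return a % b


if __name__ == "__main__":
    # Print both conventions side by side for all sign combinations.
    for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
        print(a, b, trunc_div(a, b), trunc_mod(a, b), floor_div(a, b), floor_mod(a, b))
```

For `-7` and `2`, truncation yields quotient `-3` and remainder `-1`, whereas floor rounding yields `-4` and `1`.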
See also the discussion here: pycodegen/pystencils!368

Milestone: Release 2.0
Assignee: Daniel Bauer

Issue #74: Constness in generated code
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/74
Updated: 2024-03-27T18:23:27+01:00
Author: Frederik Hennig

The way pystencils determines constness and placement of `const` keywords is very opaque and at times misleading.
Things that should be const are often printed non-const. While we may expect compilers to detect constness by themselves, this
makes integration into other code frameworks more tedious, and still may impact performance.
Luckily, so far, I have not observed the opposite (non-const values printed const) which would just not compile.
## Constness
Constness is introduced in the following places (maybe I've missed something):
- In the type system:
- `BasicType` and `StructType`: An object type may be `const`-qualified. That is how it should be in a C code generator.
- `PointerType`: A pointer may be `const`-qualified, and its pointed-to type may also be; so theoretically, all four possible combinations of constness on pointers can be realized. This does not hold for the recently introduced double pointers (pycodegen/pystencils!356), though; see below.
- In the AST: `SympyAssignment` has a member `is_const` for constant declarations.
While it is perfectly sensible to const-qualify types, annotating assignments with constness has the potential for considerable confusion: what if the type of the LHS is non-const? How should it be printed if it is const?
## Problems
### Field Access Resolution
In `transformations.py` ([here](https://i10git.cs.fau.de/pycodegen/pystencils/-/blob/master/pystencils/transformations.py#L557)), the data pointer for a field access is created with `const=True` if the field is read-only; however this creates a const pointer, not a pointer to const. The pointed-to data type is still non-const.
### Code Printer
The code printer handles const `SympyAssignment`s ([here](https://i10git.cs.fau.de/pycodegen/pystencils/-/blob/master/pystencils/backends/cbackend.py#L273)) in a fashion that is clearly problematic: the LHS type is printed, and then all occurrences of `const` are removed from it by string operations. Then a leading `const` is added.
This breaks down in the presence of pointers (especially now, with the new double pointers): `const` may occur more than once in a pointer type. Also, if the declaration LHS is of pointer type, the `const` keyword must not be added to the left, but to the right of the type string.
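To make the failure mode concrete, here is a minimal sketch of the strip-and-prepend strategy described above. It is not the actual code in `cbackend.py`; `naive_const_decl` is a hypothetical name, and type strings are assumed to be whitespace-separated tokens:

```python
def naive_const_decl(type_str: str, name: str) -> str:
    """Remove every `const` token from the type string, then prepend one."""
    stripped = " ".join(tok for tok in type_str.split() if tok != "const")
    return f"const {stripped} {name}"


if __name__ == "__main__":
    # Intended: a const pointer to mutable data.
    print(naive_const_decl("double * const", "p"))
    # Intended: every level of a double pointer is const.
    print(naive_const_decl("const double * const * const", "pp"))
```

For an LHS of type `double * const` (a const pointer), the sketch emits `const double * p`, which declares a mutable pointer to const data. For the double pointer, all but the leftmost `const` are lost.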
## Solutions
Constness should exclusively be a part of the type, and the type system should respect it.
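One way this could look is sketched below, under the assumption that each type object carries its own `const` flag and the printer derives keyword placement from the type alone. The classes are simplified, hypothetical stand-ins that merely reuse the names `BasicType` and `PointerType`; they are not the pystencils classes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class BasicType:
    """Hypothetical scalar type with an optional const qualifier."""
    name: str
    const: bool = False


@dataclass(frozen=True)
class PointerType:
    """Hypothetical pointer type; `const` qualifies the pointer itself."""
    base: "CType"
    const: bool = False


CType = Union[BasicType, PointerType]


def c_type_string(t: CType) -> str:
    """Print a type, placing each `const` next to the level it qualifies."""
    if isinstance(t, BasicType):
        return f"const {t.name}" if t.const else t.name
    inner = c_type_string(t.base)
    return f"{inner} * const" if t.const else f"{inner} *"


def declare(t: CType, name: str) -> str:
    """Print a declaration; no separate `is_const` flag is needed."""
    return f"{c_type_string(t)} {name};"


if __name__ == "__main__":
    ptr_to_const = PointerType(BasicType("double", const=True))
    const_ptr = PointerType(BasicType("double"), const=True)
    print(declare(ptr_to_const, "a"))  # pointer to const data
    print(declare(const_ptr, "b"))     # const pointer to mutable data
```

Because the qualifier lives on each level of the type, all four pointer combinations and nested pointers print without any string post-processing.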
I've tried several avenues of fixing this, without success; even though constness only occurs at very few places in the code, the way they are spread across the code makes changing them extremely challenging.
The constness issue should be considered in the design of the new backend (pycodegen/pystencils#73), where it can then be solved as a 'by-the-way', simply by clean integration of constness in the type system.

Milestone: Release 2.0
Assignee: Frederik Hennig

Issue #73: pystencils Backend Rework
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/73
Updated: 2024-03-20T10:53:56+01:00
Author: Frederik Hennig

The code generation backend of pystencils is due for a major overhaul. This issue is meant to track the efforts in this direction.
Development branch: `pystencils/backend-rework`
The various sub-tasks of this project are documented using the Tasks below.
Documentation for this branch is currently served here: https://da15siwa.pages.i10git.cs.fau.de/dev-docs/pystencils-nbackend/

Milestone: Release 2.0
Assignee: Frederik Hennig

Issue #55: Support Reductions
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/55
Updated: 2024-03-15T14:29:57+01:00
Author: Markus Holzer

pystencils supports SymPy's `Sum` like this:
`sum = sp.Sum(sp.abc.k, (sp.abc.k, 1, 100))` will be printed as:
```c++
{
   for (int64_t ctr_0 = 0; ctr_0 < _size_x_0; ctr_0 += 1)
   {
      _data_x[_stride_x_0*ctr_0] = [&]() {
         double sum = (double) 0;
         for ( int k = 1.0; k <= 100.0; k += 1 ) {
            sum += k;
         }
         return sum;
      }();
   }
}
```
This is C++ code and, at the moment, it additionally bypasses the type system to some extent. Thus this should be reimplemented.

Milestone: Release 1.1

Issue #54: Type defaults
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/54
Updated: 2022-05-11T14:38:17+02:00
Author: Markus Holzer

With !292, two new type specifications will be introduced. These concern the typing of numbers. For users, it might be complicated to set these. Thus, good defaults should be provided so that users have to set the types as rarely as possible.

Milestone: Release 1.1
Assignee: Markus Holzer

Issue #53: Show Assembly Instructions
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/53
Updated: 2022-06-29T09:47:44+02:00
Author: Markus Holzer

Like `ps.show_code`, but to get the assembly instructions. Like #28, but without KernCraft.

Milestone: Release 1.1
Assignee: Markus Holzer

Issue #52: Add possiblity to create ".pvd"-files in vtk-writer
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/52
Updated: 2022-06-29T09:47:56+02:00
Author: Christoph Schwarzmeier

Currently, the files written by pystencils' vtk-writer are stored as _vtkImageData_ (`.vti`) only. It would be great if the vtk-writer was able to create a file in _ParaView Data format_ (`.pvd`) which contains the path to each `.vti`-file and the (LBM) time step at which the file was written.
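As an illustration of the requested file, the following sketch builds a `.pvd` collection with the Python standard library. It is not part of the pystencils vtk-writer; `build_pvd` and the file names are hypothetical, and the layout follows the usual ParaView collection format with one `DataSet` element per written file.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_pvd(entries: Iterable[Tuple[float, str]]) -> str:
    """Return the XML text of a ParaView `.pvd` collection.

    `entries` holds (time step, path to the `.vti` file) pairs.
    """
    root = ET.Element("VTKFile", type="Collection", version="0.1",
                      byte_order="LittleEndian")
    collection = ET.SubElement(root, "Collection")
    for timestep, vti_path in entries:
        # ParaView reads `timestep` as the time value of the referenced file.
        ET.SubElement(collection, "DataSet", timestep=str(timestep),
                      group="", part="0", file=vti_path)
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical output files written every 500 LBM time steps.
    print(build_pvd([(0, "out_0.vti"), (500, "out_1.vti"), (1000, "out_2.vti")]))
```

Writing the returned string to, say, `out.pvd` next to the `.vti` files would let ParaView load the whole series through a single file.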
In ParaView, "time" would then actually represent the (LBM) time step instead of being equal to the index, i.e., the number of the loaded `.vti` file.

Milestone: Release 1.1
Assignee: Markus Holzer

Issue #47: Clean extended SymPy functions
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/47
Updated: 2024-03-27T18:21:34+01:00
Author: Markus Holzer

There are some functions which extend a SymPy function. These should all be implemented in a single module/file.

Milestone: Release 2.0
Assignee: Markus Holzer

Issue #46: Vectorization revamp
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/46
Updated: 2024-03-27T18:21:42+01:00
Author: Jan Hönig

Milestone: Release 2.0

Issue #44: Kerncraft
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/44
Updated: 2021-11-25T12:34:57+01:00
Author: Markus Holzer

Check the status of kerncraft and check if we should still use it in pystencils.

Milestone: Release 1.0
Assignee: Markus Holzer

Issue #43: Remove LLVM and OpenCL
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/43
Updated: 2021-11-22T21:09:32+01:00
Author: Markus Holzer

OpenCL and LLVM are not used by anyone in pystencils. We should deprecate them and tag the last pystencils version that supports OpenCL or LLVM.

Milestone: Release 1.0
Assignee: Markus Holzer

Issue #20: Revamp the type system
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/20
Updated: 2022-05-11T14:42:49+02:00
Author: Michael Kuron (mkuron@icp.uni-stuttgart.de)

@bauer had planned to do that before he left, but ran out of time.
He has some pretty clear ideas of how the type system should be. It should probably be done relatively quickly as it would make quite a few things a lot easier, e.g. the SIMD stuff. It is also necessary before things like #12 can be done properly.
The following test (for test_vectorization.py) does not work currently:
```python
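import numpy as np
import pystencils as ps
from pystencils.cpu.vectorization import vectorize

# `instruction_set` is assumed to be defined at module level in test_vectorization.py.


def test_vectorized_loop_counter():
    arr = np.zeros((4, 4))

    @ps.kernel
    def kernel_equal(s):
        f = ps.fields(f=arr)
        f[0, 0] @= ps.astnodes.LoopOverCoordinate.get_loop_counter_symbol(1)

    # Reference result from the scalar kernel (keep the AST for vectorization)
    ast = ps.create_kernel(kernel_equal)
    ast.compile()(f=arr)
    arr_ref = arr.copy()

    # Vectorize the same AST and run it on a fresh array
    arr = np.zeros((4, 4))
    vectorize(ast, instruction_set=instruction_set)
    ps.show_code(ast)
    ast.compile()(f=arr)
    assert np.allclose(arr_ref, arr)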
```
Another problem can be found in the [attached jupyter notebook](/uploads/04f89bc361e00c144f14e5d657d78c4e/Untitled1.ipynb), where in the absence of `type_all_numbers` floating-point comparisons are used for integers in a conditional.
Finally, automatic conversion of arguments to the type expected by the signature of the SIMD intrinsics is not supported, which made https://i10git.cs.fau.de/walberla/walberla/-/merge_requests/414/diffs necessary.

Milestone: Release 1.0
Assignee: Markus Holzer

Issue #13: show how to create own opts with sympy.codegen.rewriting.optimize in doc
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/13
Updated: 2021-12-10T12:19:59+01:00
Author: Stephan Seitz

In `sympy.codegen.rewriting` there is a function `optimize`:
```python
def optimize(expr, optimizations):
    """ Apply optimizations to an expression.

    Parameters
    ==========

    expr : expression
    optimizations : iterable of ``Optimization`` instances
        The optimizations will be sorted with respect to ``priority`` (highest first).

    Examples
    ========

    >>> from sympy import log, Symbol
    >>> from sympy.codegen.rewriting import optims_c99, optimize
    >>> x = Symbol('x')
    >>> optimize(log(x+3)/log(2) + log(x**2 + 1), optims_c99)
    log1p(x**2) + log2(x + 3)

    """
    for optim in sorted(optimizations, key=lambda opt: opt.priority, reverse=True):
        new_expr = optim(expr)
        if optim.cost_function is None:
            expr = new_expr
        else:
            before, after = map(lambda x: optim.cost_function(x), (expr, new_expr))
            if before > after:
                expr = new_expr
    return expr
```
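The acceptance logic of the loop above (highest priority first, and a rewrite is kept only if it lowers the cost) can be exercised without SymPy. The sketch below is a toy under stated assumptions: expressions are plain strings, and `ToyOptimization` and `toy_optimize` are hypothetical names that only mirror the `priority` and `cost_function` attributes used by `optimize`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class ToyOptimization:
    """A rewrite rule with a priority and an optional cost function."""
    rewrite: Callable[[str], str]
    priority: int = 0
    cost_function: Optional[Callable[[str], int]] = None

    def __call__(self, expr: str) -> str:
        return self.rewrite(expr)


def toy_optimize(expr: str, optimizations: Iterable[ToyOptimization]) -> str:
    """Apply rules by descending priority; keep a result only if it is cheaper."""
    for optim in sorted(optimizations, key=lambda o: o.priority, reverse=True):
        new_expr = optim(expr)
        if optim.cost_function is None:
            expr = new_expr
        elif optim.cost_function(expr) > optim.cost_function(new_expr):
            expr = new_expr
    return expr


def count_pow(expr: str) -> int:
    """Toy cost: number of `pow(` calls in the expression string."""
    return expr.count("pow(")


# Expanding squares lowers the cost, so it is accepted.
expand = ToyOptimization(lambda s: s.replace("pow(x, 2)", "x*x"), 10, count_pow)
# Re-collapsing squares raises the cost, so it is rejected.
collapse = ToyOptimization(lambda s: s.replace("x*x", "pow(x, 2)"), 5, count_pow)

if __name__ == "__main__":
    print(toy_optimize("pow(x, 2) + pow(x, 5)", [collapse, expand]))
```

Regardless of the order in which the rules are passed, `expand` runs first because of its higher priority, and `collapse` is discarded by the cost check.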
We should use it in `create_kernel` to profit from SymPy's optimizations (currently very few, and some duplicate optimizations that already exist in pystencils) and to make it easy for users to apply their own optimizations to the expressions. `create_kernel` could accept an `Iterable` of `Optimization`s, defaulting to the collection of optimizations that pystencils normally uses.
As you can see, it's really easy to implement your own optimizations:
```python
def create_expand_pow_optimization(limit):
    """ Creates an instance of :class:`ReplaceOptim` for expanding ``Pow``.

    The requirements for expansions are that the base needs to be a symbol
    and the exponent needs to be an Integer (and be less than or equal to
    ``limit``).

    Parameters
    ==========

    limit : int
         The highest power which is expanded into multiplication.

    Examples
    ========

    >>> from sympy import Symbol, sin
    >>> from sympy.codegen.rewriting import create_expand_pow_optimization
    >>> x = Symbol('x')
    >>> expand_opt = create_expand_pow_optimization(3)
    >>> expand_opt(x**5 + x**3)
    x**5 + x*x*x
    >>> expand_opt(x**5 + x**3 + sin(x)**3)
    x**5 + sin(x)**3 + x*x*x

    """
    return ReplaceOptim(
        lambda e: e.is_Pow and e.base.is_symbol and e.exp.is_Integer and abs(e.exp) <= limit,
        lambda p: (
            UnevaluatedExpr(Mul(*([p.base]*+p.exp), evaluate=False)) if p.exp > 0 else
            1/UnevaluatedExpr(Mul(*([p.base]*-p.exp), evaluate=False))
        ))
```

Milestone: Release 1.0
Assignee: Stephan Seitz

Issue #87: Symbolic Language
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/87
Updated: 2024-03-27T17:25:53+01:00
Author: Frederik Hennig

The new backend must support the full symbolic language that pystencils offered previously, plus some sensible additions. For now, however, the focus should be on implementing code generation support for all previously supported SymPy features, as well as everything in `pystencils.sympyextensions` and a few other submodules that add features to kernels.
## Symbolic Language Consolidation
All symbolic language features should be collected into the `pystencils.sympyextensions` module (as also discussed in pycodegen/pystencils#47).
Furthermore, the documentation should include a section that fully documents all SymPy features and our extensions which the code generator supports. This section should also point out any restrictions and caveats.
- [ ] Set up symbolic language documentation
## FreezeExpressions
The language of pystencils will basically encompass everything that `pystencils.backend.FreezeExpressions` can translate.
Various pystencils features are still unimplemented:
- [ ] Augmented Assignments
- [ ] AddressOf
- [ ] Relations (sp.Relational)
- [ ] pystencils.sympyextensions.integer_functions
- [x] Bitwise ops
- [x] Integer Division
- [ ] Fancy division operations (see pycodegen/pystencils#88)
- [ ] pystencils.sympyextensions.bit_masks
- [ ] GPU fast approximations (pystencils.fast_approximation)
- [ ] ConditionalFieldAccess
- [ ] sp.Piecewise
- [ ] sp.floor, sp.ceiling
- [ ] sp.log, sp.atan2, sp.sinh, sp.cosh, sp.atan
- [ ] sp.Min, sp.Max: multi-argument versions
- [ ] Modulus (sp.Mod)
- [ ] Folding functions (`sp.Sum`, `sp.Product`, see pycodegen/pystencils#55)
## Random Number Generation
The random number generators currently implemented in `pystencils.rng` must be integrated with the new backend, in particular with `FreezeExpressions`.
- [ ] Integrate RNGs
## Control Flow Structures
With the removal of the old `astnodes` module, there is currently no way to express control flow in the front-end / the symbolic language.
We should discuss whether, and how, we should introduce loops and conditionals into that language again.

Issue #86: Migrate test cases & notebooks
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/86
Updated: 2024-02-28T11:48:46+01:00
Author: Frederik Hennig

Issue #85: Staggered Kernels
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/85
Updated: 2024-03-15T09:55:35+01:00
Author: Frederik Hennig

The code generator for staggered kernels must be re-implemented using the new backend.

Issue #84: CPU Platforms and Transformations
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/84
Updated: 2024-03-15T10:39:04+01:00
Author: Frederik Hennig

# CPU Recognition and JIT
- [ ] Implement CPU auto-recognition for the `CurrentCPU` target
- [ ] Re-introduce field size equality checks to the CPU JIT compiler
- [ ] Replace the legacy CPU just-in-time compiler with an object-oriented, modular structure
# CPU Optimizations
The sequence of optimizations applied to a CPU code shall be controlled and carried out by the CPU optimization driver `pystencils.backend.kernelcreation.cpu_optimization.optimize_cpu`. Each individual optimization pass shall be implemented in a dedicated class within the `pystencils.backend.kernelcreation.transformations` module.
## Loop Optimizations
- [ ] OpenMP
- [ ] Loop-Invariant Code Motion: Fixpoint analysis and transformer to extract loop-invariant code
- [ ] Loop Cutting and condition elimination
- [ ] Loop tiling / blocking
## Vectorization
Vectorization shall be implemented as a two-step procedure.
### Generic Vectorization
In the first phase, the generic vectorizer shall analyze the target loop's body. If it is vectorizable, it shall be vectorized by adapting all data types to their respective vector types; adapting the loop range and, if necessary, creating a remainder loop; and replacing array accesses with vectorized array accesses.
- [ ] Implement vectorization legality analysis
- [ ] Implement vector transformation
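The loop range adaptation mentioned above can be sketched on plain index ranges. This is only an illustration, not the planned transformer: `split_loop_range` is a hypothetical helper, and `lanes` stands for the number of elements per vector register.

```python
from typing import Tuple


def split_loop_range(start: int, stop: int, lanes: int) -> Tuple[range, range]:
    """Split [start, stop) into a vectorized main loop and a scalar remainder.

    The main loop advances by `lanes` and only covers full vectors; the
    remainder loop handles the leftover iterations one by one.
    """
    trip_count = max(stop - start, 0)
    main_stop = start + (trip_count // lanes) * lanes
    return range(start, main_stop, lanes), range(main_stop, stop)


if __name__ == "__main__":
    main, remainder = split_loop_range(0, 10, 4)
    print(list(main))       # start indices of full 4-wide vectors
    print(list(remainder))  # indices left for the scalar remainder loop
```

With 10 iterations and 4 lanes, the main loop starts vectors at indices 0 and 4, and the remainder loop handles indices 8 and 9.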
### Intrinsics Selection
The second phase is intrinsics selection. The vectorizer calls upon the platform, which must be a subclass of `GenericVectorCpu`, to map all vectorized operations onto intrinsic functions. The intrinsic selection pass shall be implemented in `pystencils.backend.transformations.select_intrinsics.MaterializeVectorIntrinsics`. This class interacts with the protocols defined in `pystencils.backend.platforms.GenericVectorCpu` to retrieve the actual intrinsics depending on the vector architecture.
- [x] Implement protocols to select vector type, constant, load, store, and arithmetic intrinsics
- [ ] Implement protocols to select variable broadcast intrinsics
- [ ] Implement protocols to select mathematical function intrinsics (e.g. trigonometry, transcendentals, ...)
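The division of labour between the selection pass and the platform might be pictured as follows. This is a sketch under assumptions, not the pystencils interface: `ToyVectorCpu`, `ToyAvx` and `materialize` are hypothetical names, and only double-precision arithmetic on 256-bit AVX registers is covered.

```python
from abc import ABC, abstractmethod


class ToyVectorCpu(ABC):
    """Hypothetical stand-in for a vector CPU platform interface."""

    @abstractmethod
    def vector_type(self, scalar_type: str) -> str:
        """Name of the vector type holding elements of `scalar_type`."""

    @abstractmethod
    def arithmetic(self, op: str, scalar_type: str) -> str:
        """Name of the intrinsic implementing the binary operator `op`."""


class ToyAvx(ToyVectorCpu):
    """Maps double-precision operations to 256-bit AVX intrinsics."""

    _OPS = {"+": "add", "-": "sub", "*": "mul", "/": "div"}

    def vector_type(self, scalar_type: str) -> str:
        if scalar_type != "double":
            raise NotImplementedError(scalar_type)
        return "__m256d"

    def arithmetic(self, op: str, scalar_type: str) -> str:
        if scalar_type != "double":
            raise NotImplementedError(scalar_type)
        return f"_mm256_{self._OPS[op]}_pd"


def materialize(platform: ToyVectorCpu, op: str, lhs: str, rhs: str,
                scalar_type: str = "double") -> str:
    """Replace a vectorized binary operation by a call to the platform's intrinsic."""
    return f"{platform.arithmetic(op, scalar_type)}({lhs}, {rhs})"


if __name__ == "__main__":
    print(ToyAvx().vector_type("double"), materialize(ToyAvx(), "*", "a", "b"))
```

Keeping the intrinsic names behind the platform object means the selection pass itself stays independent of the vector architecture.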
### X86
Intrinsic selection for X86 is implemented in `pystencils.backend.platforms.x86`:
- [x] vector types
- [x] constants
- [x] arithmetic intrinsics
- [x] aligned and unaligned load/store
- [ ] gather/scatter
- [ ] variable broadcast
### ARM
Intrinsic selection for ARM shall be implemented in `pystencils.backend.platforms.arm`:
- [ ] vector types
- [ ] constants
- [ ] arithmetic intrinsics
- [ ] aligned and unaligned load/store
- [ ] gather/scatter
- [ ] variable broadcast

Issue #83: GPU Platform
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/83
Updated: 2024-03-15T09:54:26+01:00
Author: Frederik Hennig

## Kernel creation for GPUs
- [ ] Implement CUDA GPU platform into `pystencils.backend.platforms`
- [ ] Implement GPU indexing and iteration space materialization (might want to re-use the existing `GpuIndexing` classes)
## Parameterization
- [ ] Implement automatic GPU detection for the `CurrentGPU` target
- [ ] Add GPU options to `CreateKernelOptions`
## JIT
- [ ] Implement CUDA just-in-time compilation into `pystencils.backend.jit`

Issue #82: C++ Code Printer
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/82
Updated: 2024-03-12T11:14:20+01:00
Author: Frederik Hennig

Issue #81: Improved Expression Manipulation
https://i10git.cs.fau.de/pycodegen/pystencils/-/issues/81
Updated: 2024-03-10T17:59:43+01:00
Author: Frederik Hennig

## Freeze
`FreezeExpressions` is currently still quite naive; it should be more intelligent and take SymPy's peculiarities into account:
- Expand `Pow` to multiplications where sensible
- Detect subtractions (e.g. a + (-1 * b)) and map them to `PsSub`
- [x] Extend `FreezeExpressions`
## Constant Folding
The new AST is completely static, but constant folding will be necessary as a general optimization and during many transformations. Integers can always be folded; we should be more conservative about floating-point constants.
- [x] Implement a constant folding pass
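
The folding policy described above can be illustrated on a toy expression tree. This is a minimal sketch, not the pystencils AST or its folding pass: `Const`, `Sym`, `BinOp` and `fold` are hypothetical names, and "conservative" is interpreted here as not folding floating-point constants at all, which is only one possible policy.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: Union[int, float]


@dataclass(frozen=True)
class Sym:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # one of the keys of _OPS
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Sym, BinOp]

_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def _is_int_const(e: Expr) -> bool:
    return isinstance(e, Const) and type(e.value) is int


def fold(expr: Expr) -> Expr:
    """Fold integer constants bottom-up; leave floating-point constants alone."""
    if not isinstance(expr, BinOp):
        return expr
    lhs, rhs = fold(expr.lhs), fold(expr.rhs)
    if expr.op in _OPS and _is_int_const(lhs) and _is_int_const(rhs):
        return Const(_OPS[expr.op](lhs.value, rhs.value))
    return BinOp(expr.op, lhs, rhs)


if __name__ == "__main__":
    # x + (2 * 3): the integer product is folded, the symbol is kept.
    print(fold(BinOp("+", Sym("x"), BinOp("*", Const(2), Const(3)))))
    # 0.1 + 0.2: left untouched to avoid changing floating-point results.
    print(fold(BinOp("+", Const(0.1), Const(0.2))))
```

Since the tree nodes are immutable, the pass returns new nodes instead of modifying the input, which fits a static AST.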