pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
2023-09-07T11:10:33+02:00

[BugFix] Fix indexing with ghostlayers
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/349
2023-09-07T11:10:33+02:00
Markus Holzer

The Block indexing has a bug when it is created with an iteration slice and ghost layers. With !341, the Block indexing supports slices more naturally by limiting the iteration space to the sliced size. Thus, the counter index is multiplied by the step size. This was also done for the offset of the ghost layers, which is wrong.
This MR fixes the problem.

[Fix] Update for Docker Images
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/327
2023-06-04T16:14:23+02:00
Markus Holzer

Due to an update of the Docker images, minor changes are required for the CI.

Add `pystencils.make_python_function` used for KernelFunction.compile
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/7
2019-07-18T11:56:29+02:00
Stephan Seitz

`KernelFunction.compile = None` is currently set by the
`create_kernel` function of each respective backend as partial function
of `<backend>.make_python_function`.
The code would be clearer with a unified `make_python_function`.
`KernelFunction.compile` can then be implemented as a call to this
function with the respective backend.

Add AssignmentCollection.has_exclusive_writes
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/22
2019-08-07T11:43:33+02:00
Stephan Seitz

An assumption of pystencils is that output stencil writes never overlap.
This allows massive parallelization without race conditions or atomics.
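
One way to picture this assumption is a check that every field is written at only one relative offset, so that writes from different iteration points cannot land on the same cell. The sketch below is a simplified stand-in, not the implementation from this MR: it models each write as a hypothetical `(field_name, offset)` pair rather than a pystencils `Field.Access`, and the interpretation of "exclusive" is an assumption.

```python
def has_exclusive_writes(writes):
    """Return True if every field is written at exactly one relative offset.

    `writes` is an iterable of (field_name, offset) pairs; this plain-tuple
    representation is an assumption made for illustration only.
    """
    offsets_per_field = {}
    for field_name, offset in writes:
        offsets_per_field.setdefault(field_name, set()).add(tuple(offset))
    return all(len(offsets) == 1 for offsets in offsets_per_field.values())


# Forward stencil y[0] = x[-1] + x[1]: a single write to the center of y.
forward_writes = [("y", (0,))]
# A naive adjoint scatters into x[-1] and x[1], so neighbouring iterations collide.
backward_writes = [("x", (-1,)), ("x", (1,))]

print(has_exclusive_writes(forward_writes))   # True
print(has_exclusive_writes(backward_writes))  # False
```
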
When I use my autodiff transformations I use this condition to check
whether the assumption still holds for the backward assignments.

Add autodiff
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/2
2019-07-11T12:12:00+02:00
Stephan Seitz

Draft for minimal integration of automatic differentiation. The TensorFlow and Torch backends were removed (apart from AutoDiffOp.create_tensorflow()).
Only tests without LBM or tf/torch dependencies have been added. This implies that numeric gradient checking is also missing (it depends on either tf or torch). Should we really move the backends into separate modules?
Apart from adding auto-differentiation functionality I added two more changes. Would be happy to AutoDiffOp.create_tensorflow() after feedback.

Add CI minimal CI test for old sympy
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/112
2019-12-17T18:51:26+01:00
Stephan Seitz

The minimal test cannot catch everything, but it's something.

Add ConditionalFieldAccess (Field.Access after out-of-bounds check)
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/67
2019-10-01T15:36:36+02:00
Stephan Seitz

Adds a wrapper around a `Field.Access` so that the access is only performed if a certain condition is met.
If I use this, I can safely perform calculations and adjoint calculations with `ghost_layers=0` and obtain the correct gradients without separate boundary handling.

Add CustomSympyPrinter._print_Sum
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/10
2019-08-05T16:44:54+02:00
Stephan Seitz

This makes sympy.Sum printable as an instantaneously invoked lambda (Attention: C++-only, works in CUDA).

Add pystencils-autodiff
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/26
2019-08-09T08:54:59+02:00
Stephan Seitz

This adds pystencils_autodiff (https://pypi.org/project/pystencils-autodiff/0.1.3/) to pystencils.
After installing the extension, you can access all its classes in the submodule `pystencils.autodiff`.
If it's not installed but you try to import it you get an error with installation instructions.
The internal code of pystencils_autodiff is still very ugly.
I hope I can clean it up in the next days/weeks.

Add type conversion for SP types
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/229
2021-04-03T06:01:47+02:00
Markus Holzer

If assignments are already typed for double precision but the kernel is created for single precision, the assignments should be adapted.

Add TypedMatrixSymbol (for usage of `MatrixSymbol` in kernels)
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/144
2020-02-21T15:16:29+01:00
Stephan Seitz

I don't know whether this is a good idea, but SymPy supports assigning MatrixSymbols, like
```python
>>> import pystencils
>>> from sympy import MatrixSymbol
>>> A = MatrixSymbol('A', 3, 3)
>>> B = MatrixSymbol('B', 3, 3)
>>> pystencils.Assignment(A, B)
A := B
```
With this hack I can generate code like this:
```cpp
// Variant 1: `my_fun` is passed as a std::function
#define FUNC_PREFIX static
FUNC_PREFIX void kernel(float * RESTRICT _data_y, int64_t const _size_y_0, int64_t const _size_y_1, int64_t const _size_y_2, int64_t const _stride_y_0, int64_t const _stride_y_1, int64_t const _stride_y_2, std::function< Vector3 < double >(int, int, int) > my_fun)
{
   for (int ctr_0 = 0; ctr_0 < _size_y_0; ctr_0 += 1)
   {
      float * RESTRICT _data_y_00 = _data_y + _stride_y_0*ctr_0;
      for (int ctr_1 = 0; ctr_1 < _size_y_1; ctr_1 += 1)
      {
         float * RESTRICT _data_y_00_10 = _stride_y_1*ctr_1 + _data_y_00;
         for (int ctr_2 = 0; ctr_2 < _size_y_2; ctr_2 += 1)
         {
            const Vector3<double> A = my_fun(ctr_0, ctr_1, ctr_2);
            _data_y_00_10[_stride_y_2*ctr_2] = A[0] + A[1] + A[2];
         }
      }
   }
}

// Variant 2: `my_fun` is passed as a templated functor
#define FUNC_PREFIX static
template <class Functor_T>
FUNC_PREFIX void kernel(float * RESTRICT _data_y, int64_t const _size_y_0, int64_t const _size_y_1, int64_t const _size_y_2, int64_t const _stride_y_0, int64_t const _stride_y_1, int64_t const _stride_y_2, Functor_T my_fun)
{
   for (int ctr_0 = 0; ctr_0 < _size_y_0; ctr_0 += 1)
   {
      float * RESTRICT _data_y_00 = _data_y + _stride_y_0*ctr_0;
      for (int ctr_1 = 0; ctr_1 < _size_y_1; ctr_1 += 1)
      {
         float * RESTRICT _data_y_00_10 = _stride_y_1*ctr_1 + _data_y_00;
         for (int ctr_2 = 0; ctr_2 < _size_y_2; ctr_2 += 1)
         {
            const Vector3<double> A = my_fun(ctr_0, ctr_1, ctr_2);
            _data_y_00_10[_stride_y_2*ctr_2] = A[0] + A[1] + A[2];
         }
      }
   }
}
```
from
```python
import pystencils
from pystencils.data_types import TypedMatrixSymbol, TypedSymbol, create_type
# DynamicFunction and FrameworkIntegrationPrinter are not imported in the
# original snippet; they are assumed to be in scope here.

x, y = pystencils.fields('x, y: float32[3d]')
A = TypedMatrixSymbol('A', 3, 1, create_type('double'), 'Vector3<double>')
my_fun_call = DynamicFunction(TypedSymbol('my_fun',
                                          'std::function< Vector3 < double >(int, int, int) >'),
                              A.dtype,
                              *pystencils.x_vector(3))
assignments = pystencils.AssignmentCollection({
    A: my_fun_call,
    y.center: A[0] + A[1] + A[2]
})
ast = pystencils.create_kernel(assignments)
pystencils.show_code(ast, custom_backend=FrameworkIntegrationPrinter())
```

Always use codegen.rewriting.optimize
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/54
2019-09-23T13:38:39+02:00
Stephan Seitz

Pretty much !34 but with the changes to `create_kernel`. Can be closed if not wanted. Leaving it here for archiving purposes.
!34 has the workflow:
```python
assignments = optimize(assignments, optimizations)
ast = create_kernel(assignments)
```

Auto-format pystencils/rng.py (trailing whitespace)
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/8
2019-07-18T10:28:27+02:00
Stephan Seitz

My editor feels better if that whitespace is not there.

Draft: [FIX] Index fields exclusively containing coordinates are dropped by code generator
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/358
2024-01-31T11:48:23+01:00
Frederik Hennig

Index fields that exclusively contain coordinate data (members `x`, `y` and `z`) and that are not explicitly accessed in the kernel assignments are dropped by `pystencils.cpu.create_indexed_kernel` in `cpu/kernelcreation.py`, prev. line 119.
Then in line 128 the list of index fields is empty, and the code generator finds no field containing the coordinate information.
Code generation then aborts.
Is there a reason why index fields are first filtered this way?

Draft: Develop
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/224
2023-09-14T11:03:48+02:00
Markus Holzer

This MR adds two features to pystencils. First, the base pointer specification is exposed to the user, which allows producing kernels with less register usage. Second, the summands inside the summation printer are now printed recursively, which allows for more parallelism inside a single core.

Draft: Do not reorder accesses in `move_constants_before_loop`
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/342
2023-08-18T10:39:05+02:00
Daniel Bauer

Prior to this MR, `move_constants_before_loop` tries to move constants as far to the top as possible.
This might reorder read/write accesses to fields.
For example:
```python
import pystencils as ps
from pystencils import CreateKernelConfig
from pystencils.astnodes import Block, KernelFunction, LoopOverCoordinate, SympyAssignment
from pystencils.field import Field, FieldType
from sympy.abc import x, y
field = Field.create_generic("field", 1, field_type=FieldType.CUSTOM)
counter = LoopOverCoordinate.get_loop_counter_symbol(0)
load = SympyAssignment(x, field.absolute_access((counter,), (0,)))
store = SympyAssignment(field.absolute_access((counter+1,), (0,)), 2*x)
body = ps.typing.transformations.add_types(Block([load, store]), CreateKernelConfig())
loop = LoopOverCoordinate(body, 0, 0, 42)
block = Block([loop])
ps.transformations.resolve_field_accesses(block)
new_loops = ps.transformations.cut_loop(loop, [41])
ps.transformations.move_constants_before_loop(new_loops.args[1])
kernel = KernelFunction(
    block,
    ps.Target.CPU,
    ps.Backend.C,
    ps.cpu.cpujit.make_python_function,
    None,
)
code = ps.get_code_str(kernel)
print(code)
```
prints
```c
FUNC_PREFIX void kernel(double * RESTRICT _data_field, int64_t const _stride_field_0)
{
   const double x = _data_field[41*_stride_field_0];
   _data_field[42*_stride_field_0] = x*2.0;
   {
      for (int64_t ctr_0 = 0; ctr_0 < 41; ctr_0 += 1)
      {
         const double x = _data_field[_stride_field_0*ctr_0];
         _data_field[_stride_field_0*(ctr_0 + 1)] = x*2.0;
      }
      {
      }
   }
}
```
Note that the last (cut) loop iteration is moved before the primary loop, leading to a wrong load from index 41.
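
The effect of this reordering can be reproduced without pystencils. The following sketch replays both orders on a plain Python list, assuming a unit stride, a field of 43 entries, and integers in place of doubles; the function names are made up for illustration.

```python
def run_original(field):
    """42 iterations in program order: load field[i], store 2*x to field[i + 1]."""
    for ctr_0 in range(42):
        x = field[ctr_0]
        field[ctr_0 + 1] = 2 * x
    return field


def run_reordered(field):
    """Mirror of the printed C code: the cut-off iteration 41 runs first."""
    x = field[41]
    field[42] = 2 * x
    for ctr_0 in range(41):
        x = field[ctr_0]
        field[ctr_0 + 1] = 2 * x
    return field


initial = [1] + [0] * 42  # 43 entries, indices 0..42
expected = run_original(list(initial))
actual = run_reordered(list(initial))

print(expected[42])                  # 4398046511104 (2**42)
print(actual[42])                    # 0, computed from the stale field[41]
print(expected[:42] == actual[:42])  # True, only the last entry differs
```
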
This MR changes `move_constants_before_loop` such that assignments cannot be moved before the last modification of the symbols they depend on.
Essentially, it replaces `symbols_defined` by `symbols_modified` [here](https://i10git.cs.fau.de/terraneo/pystencils/-/commit/be78ab165339d593869b5c77ef00a590a63ba130#99785d4b53b75ce54c83c3e499248de2a07fb2cd_598_597).
This new property is implemented for all AST nodes.
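
As a rough illustration of the rule, the toy model below walks upward through the nodes that precede an assignment and stops at the first one that modifies a symbol the assignment reads. The `Node` class, its `reads`/`modifies` sets, and `hoist_position` are hypothetical and much simpler than the pystencils AST; in particular, only read-after-write dependencies are considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy AST node: a name plus the symbols it reads and modifies."""
    name: str
    reads: frozenset
    modifies: frozenset


def hoist_position(preceding, candidate):
    """Return the earliest index in `preceding` where `candidate` may be placed.

    The candidate is never moved above a node that modifies one of the
    symbols it reads.
    """
    position = len(preceding)
    for index in range(len(preceding) - 1, -1, -1):
        if preceding[index].modifies & candidate.reads:
            break
        position = index
    return position


declaration = Node("a = 3", frozenset(), frozenset({"a"}))
loop = Node("loop", frozenset({"field", "ctr_0"}), frozenset({"field", "x"}))

# Reads `field`, which the loop modifies: has to stay behind the loop.
load = Node("x = field[41]", frozenset({"field"}), frozenset({"x"}))
# Reads only `a`: may move above the loop, but not above the declaration of `a`.
constant = Node("b = 2*a", frozenset({"a"}), frozenset({"b"}))

print(hoist_position([declaration, loop], load))      # 2
print(hoist_position([declaration, loop], constant))  # 1
```
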
Note the implementation of `CustomCCodeNode`. I did not want to introduce breaking changes to the API.
Additionally, declarations are now inserted where the caller requests, instead of pushing them all the way to the top (https://i10git.cs.fau.de/terraneo/pystencils/-/commit/5c65d06216d050c22e28ba0b9487544342fc0926).
Lastly, a test for the new behavior is included.

Draft: feat: implement `__cuda_array_interface__`
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/316
2023-09-14T10:43:31+02:00
Stephan Seitz

https://numba.readthedocs.io/en/stable/cuda/cuda_array_interface.html
This is supported by:
- pycuda
- numba
- cupy
- torch
- nvcv https://github.com/CvCuda/CV-CUDA
- maybe by tensorflow in future: https://github.com/tensorflow/tensorflow/issues/29039
Also allow to execute with cupy (https://docs.cupy.dev/en/stable/index.html)
instead of pycuda
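
For readers unfamiliar with the protocol, an object takes part in it by exposing a `__cuda_array_interface__` attribute that returns a dictionary describing a device buffer. The sketch below shows the dictionary layout for version 3 of the interface as documented by Numba. The class name, constructor, and pointer value are made up for illustration, nothing here touches a GPU, and this is not the code added by this MR.

```python
class DeviceBufferView:
    """Minimal object exposing `__cuda_array_interface__` (version 3).

    The class name and constructor are hypothetical; only the keys of the
    returned dictionary follow the published interface.
    """

    def __init__(self, device_pointer, shape, typestr, strides=None, read_only=False):
        self.device_pointer = int(device_pointer)
        self.shape = tuple(shape)
        self.typestr = typestr  # NumPy-style type string, e.g. "<f8" for float64
        self.strides = None if strides is None else tuple(strides)
        self.read_only = read_only

    @property
    def __cuda_array_interface__(self):
        return {
            "shape": self.shape,
            "typestr": self.typestr,
            "data": (self.device_pointer, self.read_only),
            "version": 3,
            "strides": self.strides,  # None means C-contiguous
        }


# Placeholder pointer value; a real one would come from a CUDA allocation.
view = DeviceBufferView(0x7F0000000000, (16, 16), "<f8")
print(view.__cuda_array_interface__["shape"])
```

A consumer such as cupy or numba would read this dictionary instead of requiring a specific array type, which is what makes the kernel call independent of pycuda.
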
TODO:
- [ ] check that pointers are in the correct CUDA context and, if not, import them into
the current one
- [x] make execution with pycuda aware of `__cuda_array_interface__`
- [ ] what/how to test

Draft: Generalise usage of Structs for nested array access
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/353
2023-09-28T09:47:11+02:00
Markus Holzer

In this MR, Structs are introduced in a more general form than they are used in the index kernel. The structs here can hold data and pointers to fields. This makes it possible to iterate over a struct and extract field pointers in each loop iteration. The extracted fields are then updated in the normal loop nest.
The idea can be illustrated in a small example:
```python
import numpy as np
import pystencils as ps
from pystencils.typing import BasicType, FieldPointerSymbol, PointerType
from pystencils.struct import Struct
dtype = BasicType(np.float64)
f = ps.fields(f'f(1): double[3d]')
g = ps.fields(f'g(1): double[3d]')
struct_src = Struct("src")
struct_src.add_member(PointerType(dtype, const=False, restrict=False, double_pointer=True))
struct_dst = Struct("dst")
struct_dst.add_member(PointerType(dtype, const=False, restrict=False, double_pointer=True))
update_rule = [ps.Assignment(FieldPointerSymbol("f", dtype, const=True), struct_src[0]),
               ps.Assignment(FieldPointerSymbol("g", dtype, const=False), struct_dst[0]),
               ps.Assignment(g.center, f.center)]
ast = ps.create_kernel(update_rule)
```
This produces the following C-Code:
```c++
FUNC_PREFIX void kernel(double ** _data_dst, double ** _data_src, int64_t const _size_dst, int64_t const _size_f_0, int64_t const _size_f_1, int64_t const _size_f_2, int64_t const _stride_f_0, int64_t const _stride_f_1, int64_t const _stride_f_2, int64_t const _stride_g_0, int64_t const _stride_g_1, int64_t const _stride_g_2)
{
   for (int64_t ctr_0 = 0; ctr_0 < _size_dst; ctr_0 += 1)
   {
      double * RESTRICT _data_f = _data_src[ctr_0];
      double * RESTRICT _data_g = _data_dst[ctr_0];
      for (int64_t ctr_1 = 0; ctr_1 < _size_f_0; ctr_1 += 1)
      {
         for (int64_t ctr_2 = 0; ctr_2 < _size_f_1; ctr_2 += 1)
         {
            for (int64_t ctr_3 = 0; ctr_3 < _size_f_2; ctr_3 += 1)
            {
               _data_g[_stride_g_0*ctr_1 + _stride_g_1*ctr_2 + _stride_g_2*ctr_3] = _data_f[_stride_f_0*ctr_1 + _stride_f_1*ctr_2 + _stride_f_2*ctr_3];
            }
         }
      }
   }
}
```
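
The generated loop nest can be mirrored in plain Python to see what it does. In this sketch each "pointer" is a flat Python list and the strides are given in elements; `copy_subarrays` and its parameters are illustrative names, not part of pystencils.

```python
def copy_subarrays(data_src, data_dst, size_f, stride_f, stride_g):
    """Copy every source sub-array into the matching destination sub-array.

    The outer loop walks the list of "pointers"; the three inner loops are
    the usual 3D loop nest over one sub-array.
    """
    for ctr_0 in range(len(data_dst)):
        data_f = data_src[ctr_0]
        data_g = data_dst[ctr_0]
        for ctr_1 in range(size_f[0]):
            for ctr_2 in range(size_f[1]):
                for ctr_3 in range(size_f[2]):
                    src_index = stride_f[0] * ctr_1 + stride_f[1] * ctr_2 + stride_f[2] * ctr_3
                    dst_index = stride_g[0] * ctr_1 + stride_g[1] * ctr_2 + stride_g[2] * ctr_3
                    data_g[dst_index] = data_f[src_index]


size = (2, 2, 2)
stride = (4, 2, 1)  # row-major layout of a 2x2x2 block
sources = [list(range(8)), list(range(100, 108))]
destinations = [[0] * 8, [0] * 8]

copy_subarrays(sources, destinations, size, stride, stride)
print(destinations == sources)  # True
```
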
Thus the struct is used as a container for an arbitrary number of subarrays that are all updated at once. Since the struct only holds a single pointer per element in the above example, we can represent it as a double pointer (`double **`).

Draft: Remove too many zeros
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/283
2023-03-27T10:40:59+02:00
Markus Holzer

Remove unnecessary trailing zeros from numbers: 1.80000000 --> 1.8

Fix #10: Add jinja2 to pystencils's dependencies
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/17
2019-08-06T08:05:38+02:00
Stephan Seitz

The alternative would be to remove jinja2 (see other PR).
However, I think the dependency on jinja2 is not too heavy.
This could make some implementations more elegant.