pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
Updated: 2020-02-11T19:50:44+01:00

Switch back to intersphinx sympy.org/latest ('coz it works)
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/145
Updated: 2020-02-11T19:50:44+01:00
Author: Stephan Seitz

Sympy 1.9 support
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/239
Updated: 2021-04-26T18:24:04+02:00
Author: Michael Kuron <mkuron@icp.uni-stuttgart.de>

- deepcopy support was broken due to https://github.com/sympy/sympy/pull/21260
- clean up some constructors
- fix detection of sympy development versions
fixes #35, fixes !237
Assignee: Markus Holzer

SymPy1.10
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/289
Updated: 2022-03-21T19:42:55+01:00
Author: Markus Holzer

SymPy 1.10 introduced two small problems in pystencils. This MR fixes them.
Fixes #59
Assignee: Markus Holzer

Testing
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/272
Updated: 2021-11-11T08:09:08+01:00
Author: Markus Holzer

Fixes a race condition in the CPU JIT. Cleans up the parallel datahandling test cases and the tutorial notebook.
Assignee: Markus Holzer

Throw an error when performing GPU operations with SerialDataHandling when pycuda is not available
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/164
Updated: 2020-07-09T15:41:24+02:00
Author: Stephan Seitz
Assignee: Markus Holzer

Undo some changes from !248 that are no longer needed
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/249
Updated: 2021-05-27T19:41:41+02:00
Author: Michael Kuron <mkuron@icp.uni-stuttgart.de>

It turns out these were only needed before I moved the vectorization of the `RNGBase` objects to the right place. The vectorized C printer does actually print scalar code when it is passed scalar variables and field accesses.
Assignee: Markus Holzer

Update AssignmentCollection.__repr__
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/110
Updated: 2019-12-18T21:14:57+01:00
Author: Stephan Seitz

`Assignment Collection for y[0,0,0]` is usually not very helpful.
New representation:
```
In [6]: forward_assignments = pystencils.AssignmentCollection({
...: z[0, 0]: x[0, 0] * sympy.log(x[0, 0] * y[0, 0]),
...: y[0,0] : x[1,1] +1
...: })
In [7]: forward_assignments
Out[7]: AssignmentCollection: z_C, y_C <- f(x_C, x_NE)
```

Update conftest and readme
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/167
Updated: 2020-08-10T11:21:50+02:00
Author: Markus Holzer

Due to the update of the Python environment, the conftest is updated to the new pytest version. Further, the pages are now hosted at a new URL, which is updated in the readme.
Assignee: Jan Hönig

Update parallel datahandling
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/199
Updated: 2020-12-21T11:37:06+01:00
Author: Markus Holzer

This MR provides some very minor changes for when the new Python Coupling implementation is merged into waLBerla.
Assignee: Michael Kuron <mkuron@icp.uni-stuttgart.de>

Update pystencils integration pipeline
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/173
Updated: 2020-10-05T13:29:08+02:00
Author: Markus Holzer

Some of waLBerla's test cases have changed.
This MR adapts to the changes.

Update setup.py
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/218
Updated: 2021-02-20T12:44:25+01:00
Author: Markus Holzer

updated kc coupling to support layercondition analysis
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/186
Updated: 2020-11-13T09:07:40+01:00
Author: Julian Hammer

Upgrade maximum supported SymPy version to 1.11.1
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/304
Updated: 2022-10-10T22:32:05+02:00
Author: Markus Holzer
Assignee: Markus Holzer

Usage of custom boundary functor if given
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/198
Updated: 2020-12-20T16:11:13+01:00
Author: Sebastian Bindgen

This is needed to implement the Lees Edwards boundary conditions submitted in https://i10git.cs.fau.de/pycodegen/lbmpy/-/merge_requests/49.
Custom boundary functors can now be created by users.
Assignee: Markus Holzer

Use `rich` for syntax highlighting of `show_code` also in terminal
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/139
Updated: 2020-01-30T10:18:28+01:00
Author: Stephan Seitz

Still works as usual in Jupyter notebooks and in terminal if `rich` is not installed.
![Screenshot_20200130_100924](/uploads/5876a0aab841c4a2d736777cac4535f6/Screenshot_20200130_100924.png)

Use C11CodePrinter for sympy 1.7
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/176
Updated: 2020-10-07T16:42:49+02:00
Author: Stephan Seitz

The C++ printer may cause problems for CUDA/OpenCL (e.g. it prints `std::log`).
Assignee: Markus Holzer

Use closest normal for boundary index list with single_link
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/253
Updated: 2021-06-18T12:07:29+02:00
Author: Markus Holzer

Previously, when `single_link=True`, the index list was built from the first stencil entry that is a neighbour of the investigated cell. With this MR, the discrete normal is calculated and the neighbouring cell in the normal direction is used to build the index array.
Furthermore, the computational cost of the Python versions of `create_boundary_index_list` is reduced drastically because the iteration space is now restricted to the boundary cells instead of the entire domain.
Assignee: Markus Holzer

Use common shape to resolve buffer access
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/312
Updated: 2022-12-22T09:41:41+01:00
Author: Markus Holzer

pystencils assumes that all fields have the same spatial shape. Thus, field accesses should also be resolved using one common field shape. This was violated in the GPU kernel creation and should be fixed with this MR.
Assignee: Markus Holzer

Use get_type_of_expression in typing_form_sympy_inspection to infer types
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/43
Updated: 2019-09-23T16:16:50+02:00
Author: Stephan Seitz

DANGER ZONE: this changes something in the core behavior of pystencils. Be careful before merging!
In summary, when `typing_form_sympy_inspection` reaches the point where it would just use `default_type`, we try to use `get_type_of_expression` to infer the actual type.
We use information from variables previously defined in the current scope.
Another approach would be to just type all the intermediate variables with `auto`.
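The fallback described above can be sketched in plain Python. This is a toy model under stated assumptions, not the pystencils implementation: the `(kind, bits)` type tuples, the helpers `collate`, `type_of` and `infer_types`, and the promotion rule (floats win, then the wider type) are all hypothetical and only chosen so that the result mirrors the types seen in the "After" kernel of this MR.

```python
# Toy model only: a type is a (kind, bits) pair, not a pystencils type object.
FLOAT32, FLOAT64 = ("float", 32), ("float", 64)
UINT16, INT64 = ("uint", 16), ("int", 64)


def collate(types):
    """Simplified promotion rule (an assumption): floats win, then the wider type."""
    floats = [t for t in types if t[0] == "float"]
    return max(floats or types, key=lambda t: t[1])


def type_of(token, known):
    """Type of one right-hand-side operand: a known name or a Python literal."""
    if isinstance(token, str):
        return known.get(token)  # None if the name has not been defined yet
    return FLOAT64 if isinstance(token, float) else INT64


def infer_types(assignments, known, default=FLOAT64):
    """Walk assignments in order; use `default` only when nothing can be inferred."""
    known = dict(known)
    for lhs, explicit_type, operands in assignments:
        operand_types = [t for t in (type_of(op, known) for op in operands) if t]
        known[lhs] = explicit_type or (collate(operand_types) if operand_types else default)
    return known


# (left-hand side, explicit cast type or None, operands of the right-hand side)
assignments = [
    ("a", FLOAT64, [10]),
    ("b", UINT16, [10]),
    ("e", None, [11]),
    ("c", None, ["b"]),
    ("f", None, ["c", "b"]),
    ("d", None, ["c", "b", "x.center", "e"]),
]
inferred = infer_types(assignments, known={"x.center": FLOAT32})
for name in "abecfd":
    print(name, inferred[name])
```

In this sketch `c` and `f` pick up `uint16` from `b`, `e` gets the integer literal type, and `d` becomes `float32` because the field access is the only floating-point operand. Typing every intermediate with `auto` would avoid this bookkeeping but leaves the decision to the C++ compiler.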
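```python
import sympy
import pystencils
# Import path assumed for the 2019-era code base; later releases may differ.
from pystencils.data_types import cast_func, create_type

# Intermediate symbols; their types are what the MR infers.
a, b, c, d, e, f = sympy.symbols('a b c d e f')
x = pystencils.fields('x: float32[3d]')
assignments = pystencils.AssignmentCollection({
    a: cast_func(10, create_type('float64')),
    b: cast_func(10, create_type('uint16')),
    e: 11,
    c: b,
    f: c + b,
    d: c + b + x.center + e,
    x.center: c + b + x.center
})
```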
Before:
```cpp
FUNC_PREFIX void kernel(float * RESTRICT _data_x, int64_t const _size_x_0, int64_t const _size_x_1,
                        int64_t const _size_x_2, int64_t const _stride_x_0, int64_t const _stride_x_1,
                        int64_t const _stride_x_2)
{
    const double a = 10.0;
    const double b = 10;
    const double e = 11.0;
    const double c = b;
    const double f = b + c;
    for (int ctr_0 = 0; ctr_0 < _size_x_0; ctr_0 += 1)
    {
        float * RESTRICT _data_x_00 = _data_x + _stride_x_0*ctr_0;
        for (int ctr_1 = 0; ctr_1 < _size_x_1; ctr_1 += 1)
        {
            float * RESTRICT _data_x_00_10 = _stride_x_1*ctr_1 + _data_x_00;
            for (int ctr_2 = 0; ctr_2 < _size_x_2; ctr_2 += 1)
            {
                const double d = b + c + e + _data_x_00_10[_stride_x_2*ctr_2];
                _data_x_00_10[_stride_x_2*ctr_2] = b + c + _data_x_00_10[_stride_x_2*ctr_2];
            }
        }
    }
}
```
After:
```cpp
FUNC_PREFIX void kernel(float * RESTRICT _data_x, int64_t const _size_x_0, int64_t const _size_x_1,
                        int64_t const _size_x_2, int64_t const _stride_x_0, int64_t const _stride_x_1,
                        int64_t const _stride_x_2)
{
    const double a = 10.0;
    const uint16_t b = 10;
    const int64_t e = 11.0;
    const uint16_t c = b;
    const uint16_t f = b + c;
    for (int ctr_0 = 0; ctr_0 < _size_x_0; ctr_0 += 1)
    {
        float * RESTRICT _data_x_00 = _data_x + _stride_x_0*ctr_0;
        for (int ctr_1 = 0; ctr_1 < _size_x_1; ctr_1 += 1)
        {
            float * RESTRICT _data_x_00_10 = _stride_x_1*ctr_1 + _data_x_00;
            for (int ctr_2 = 0; ctr_2 < _size_x_2; ctr_2 += 1)
            {
                const float d = b + c + e + _data_x_00_10[_stride_x_2*ctr_2];
                _data_x_00_10[_stride_x_2*ctr_2] = b + c + _data_x_00_10[_stride_x_2*ctr_2];
            }
        }
    }
}
```

Use int64 for indexing
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/251
Updated: 2021-06-08T08:33:34+02:00
Author: Markus Holzer

For indexed kernels, int32 is too small for large domain sizes. Thus the coordinates are cast to int64 in this MR to allow huge domain sizes.
As an example of the change, the generated code for a Neumann boundary is shown. Before:
```cpp
FUNC_PREFIX void kernel(double * RESTRICT _data_C, uint8_t * RESTRICT const _data_indexField,
                        int64_t const _size_indexField_0, int64_t const _stride_indexField_0)
{
    #pragma omp parallel
    {
        #pragma omp for schedule(static)
        for (int64_t ctr_0 = 0; ctr_0 < _size_indexField_0; ctr_0 += 1)
        {
            const int32_t x = *((int32_t *)(& _data_indexField[12*_stride_indexField_0*ctr_0]));
            const int32_t y = *((int32_t *)(& _data_indexField[12*_stride_indexField_0*ctr_0 + 4]));
            const int64_t cx [] = { 0, 0, 0, -1, 1, -1, 1, -1, 1 };
            const int64_t cy [] = { 0, 1, -1, 0, 0, 1, 1, -1, -1 };
            const int invdir [] = { 0, 2, 1, 4, 3, 8, 7, 6, 5 };
            const int32_t dir = *((int32_t *)(& _data_indexField[12*_stride_indexField_0*ctr_0 + 8]));
            _data_C[x + 11*y] = _data_C[x + 11*y + cx[dir] + 11*cy[dir]];
        }
    }
}
```
After:
```cpp
FUNC_PREFIX void kernel(double * RESTRICT _data_C, uint8_t * RESTRICT const _data_indexField,
                        int64_t const _size_indexField_0, int64_t const _stride_indexField_0)
{
    #pragma omp parallel
    {
        #pragma omp for schedule(static)
        for (int64_t ctr_0 = 0; ctr_0 < _size_indexField_0; ctr_0 += 1)
        {
            const int64_t x = *((int32_t *)(& _data_indexField[12*_stride_indexField_0*ctr_0]));
            const int64_t y = *((int32_t *)(& _data_indexField[12*_stride_indexField_0*ctr_0 + 4]));
            const int64_t cx [] = { 0, 0, 0, -1, 1, -1, 1, -1, 1 };
            const int64_t cy [] = { 0, 1, -1, 0, 0, 1, 1, -1, -1 };
            const int64_t invdir [] = { 0, 2, 1, 4, 3, 8, 7, 6, 5 };
            const int64_t dir = *((int32_t *)(& _data_indexField[12*_stride_indexField_0*ctr_0 + 8]));
            _data_C[x + 11*y] = _data_C[x + 11*y + cx[dir] + 11*cy[dir]];
        }
    }
}
```
Assignee: Markus Holzer
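
To illustrate why 32-bit coordinates become a problem in the "Use int64 for indexing" MR, the sketch below computes a linear index of the form `x + row_stride*y` (the same shape as `x + 11*y` in the kernel) for a hypothetical large domain. The domain size and the helper names `linear_index` and `as_int32` are assumptions made for this illustration. `ctypes.c_int32` is used only to show what two's-complement wrap-around would produce; in C, signed overflow is undefined behaviour, so wrap-around is just one possible outcome.

```python
import ctypes

INT32_MAX = 2**31 - 1


def linear_index(x, y, row_stride):
    """Exact linear cell index; Python integers do not overflow."""
    return x + row_stride * y


def as_int32(value):
    """Value after truncation to a signed 32-bit integer (wrap-around)."""
    return ctypes.c_int32(value).value


# Hypothetical 2D domain with 50,000 x 50,000 cells; last cell in the last row.
row_stride = 50_000
x, y = 49_999, 49_999

exact = linear_index(x, y, row_stride)
wrapped = as_int32(exact)

print("exact index:   ", exact)
print("fits in int32: ", exact <= INT32_MAX)
print("32-bit result: ", wrapped)
```

With 64-bit coordinates the index expression is evaluated in 64-bit arithmetic, so the exact value is used for the memory access.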