pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
2020-11-06T15:45:24+01:00

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/183
Updated Kerncraft Coupling
2020-11-06T15:45:24+01:00, Julian Hammer

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/175
WIP: Opencl to SPIR-V ahead-of-time compilation
2023-03-16T12:42:10+01:00, Stephan Seitz
This does not yet use pystencils' cache folder or disk caching of the compilation.
This can be used to embed compiled bytecode into waLBerla executables as I do with my Vulkan wrapper. Not sure if this is a good way to go but at least we can experiment with it.
A good way to proceed with this MR would also be a comparison between HIP/SYCL/OpenCL/Vulkan in order to identify a suitable backend for pystencils.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/151
Use dark mode for code preview if user prefers `prefers-color-scheme: dark`
2020-04-23T07:59:41+02:00, Stephan Seitz
pystencils currently does not look good in dark mode :/

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/150
Fix import: sympy.numbers -> sympy.core.numbers
2020-03-24T00:57:30+01:00, Stephan Seitz
Apparently `sympy` no longer exports `sympy.numbers` directly.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/144
Add TypedMatrixSymbol (for usage of `MatrixSymbol` in kernels)
2020-02-21T15:16:29+01:00, Stephan Seitz
I don't know whether this is a good idea but SymPy supports assigning MatrixSymbols. Like
```python
>>> import pystencils
>>> from sympy import MatrixSymbol
>>> A = MatrixSymbol('A', 3, 3)
>>> B = MatrixSymbol('B', 3, 3)
>>> pystencils.Assignment(A, B)
A := B
```
With this hack I can generate code like this:
```cpp
// Variant 1: the callback is passed as a std::function
#define FUNC_PREFIX static
FUNC_PREFIX void kernel(float * RESTRICT _data_y, int64_t const _size_y_0, int64_t const _size_y_1, int64_t const _size_y_2, int64_t const _stride_y_0, int64_t const _stride_y_1, int64_t const _stride_y_2, std::function< Vector3 < double >(int, int, int) > my_fun)
{
   for (int ctr_0 = 0; ctr_0 < _size_y_0; ctr_0 += 1)
   {
      float * RESTRICT _data_y_00 = _data_y + _stride_y_0*ctr_0;
      for (int ctr_1 = 0; ctr_1 < _size_y_1; ctr_1 += 1)
      {
         float * RESTRICT _data_y_00_10 = _stride_y_1*ctr_1 + _data_y_00;
         for (int ctr_2 = 0; ctr_2 < _size_y_2; ctr_2 += 1)
         {
            const Vector3<double> A = my_fun(ctr_0, ctr_1, ctr_2);
            _data_y_00_10[_stride_y_2*ctr_2] = A[0] + A[1] + A[2];
         }
      }
   }
}

// Variant 2: the callback type is a template parameter
#define FUNC_PREFIX static
template <class Functor_T>
FUNC_PREFIX void kernel(float * RESTRICT _data_y, int64_t const _size_y_0, int64_t const _size_y_1, int64_t const _size_y_2, int64_t const _stride_y_0, int64_t const _stride_y_1, int64_t const _stride_y_2, Functor_T my_fun)
{
   for (int ctr_0 = 0; ctr_0 < _size_y_0; ctr_0 += 1)
   {
      float * RESTRICT _data_y_00 = _data_y + _stride_y_0*ctr_0;
      for (int ctr_1 = 0; ctr_1 < _size_y_1; ctr_1 += 1)
      {
         float * RESTRICT _data_y_00_10 = _stride_y_1*ctr_1 + _data_y_00;
         for (int ctr_2 = 0; ctr_2 < _size_y_2; ctr_2 += 1)
         {
            const Vector3<double> A = my_fun(ctr_0, ctr_1, ctr_2);
            _data_y_00_10[_stride_y_2*ctr_2] = A[0] + A[1] + A[2];
         }
      }
   }
}
```
from
```python
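import pystencils
from pystencils.data_types import TypedMatrixSymbol, TypedSymbol, create_type
# DynamicFunction and FrameworkIntegrationPrinter are assumed to be imported already;
# the MR description does not say which module provides them.

x, y = pystencils.fields('x, y: float32[3d]')
A = TypedMatrixSymbol('A', 3, 1, create_type('double'), 'Vector3<double>')
# Call to a user-supplied function that returns a Vector3<double> per cell
my_fun_call = DynamicFunction(
    TypedSymbol('my_fun', 'std::function< Vector3 < double >(int, int, int) >'),
    A.dtype,
    *pystencils.x_vector(3))
assignments = pystencils.AssignmentCollection({
    A: my_fun_call,
    y.center: A[0] + A[1] + A[2]
})
ast = pystencils.create_kernel(assignments)
pystencils.show_code(ast, custom_backend=FrameworkIntegrationPrinter())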
```

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/117
WIP: Add InterpolatorAccess.__getnewargs__
2020-01-28T14:23:15+01:00, Stephan Seitz
It was missing and instead TypedSymbol.__getnewargs__ was used.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/116
Throw error when trying to sympify `pystencils.Field` (e.g. using it in an...
2020-01-03T13:24:14+01:00, Stephan Seitz
Throw error when trying to sympify `pystencils.Field` (e.g. using it in an Assignment without indexing)
This is a typical error when using pystencils: you forget the index and use a field directly in an Assignment.
Edit: apparently, this error is only triggered on recent versions of Sympy that can sympify using `__sympy__` (not on CI).

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/113
Test pystencils_autodiff in integration test
2020-01-08T13:49:44+01:00, Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/112
Add CI minimal CI test for old sympy
2019-12-17T18:51:26+01:00, Stephan Seitz
The minimal test cannot catch everything but it's something.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/111
Test pystencils_autodiff in integration test
2019-12-17T18:56:19+01:00, Stephan Seitz

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/106
WIP: Cuda autotune
2020-10-07T13:04:35+02:00, Stephan Seitz
This PR introduces ~~two~~ one change~~s~~:
- ~~rotate (32,1,1) depending on field strides to fastest dimension. So (1,1,32) for c-layout and (32,1,1) for fortran layout. So pystencils will be fast also for c-layout (this will always be performed)~~
- auto-tune the block dimensions to whatever is fastest for a specific kernel on localhost. On the first kernel call, different layouts are tried and the kernel is called henceforth with the fastest configuration (disk_cached). This could be interesting for OpenCL where we don't know which launch config is the fastest (on OpenCL the runtime can alternatively give a hint on that).
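
The auto-tuning idea in the last bullet can be sketched roughly as follows. This is only an illustration under assumptions, not the code of this MR: `pick_fastest_block_size` and `AutotunedKernel` are hypothetical names, the kernel is any callable taking a block size, and the result is cached in memory rather than on disk.

```python
import time
from typing import Callable, Iterable, Optional, Tuple

BlockSize = Tuple[int, int, int]


def pick_fastest_block_size(
    run_kernel: Callable[[BlockSize], None],
    candidates: Iterable[BlockSize],
    repeats: int = 3,
    clock: Callable[[], float] = time.perf_counter,
) -> BlockSize:
    """Time run_kernel for every candidate and return the fastest block size."""
    best: Optional[BlockSize] = None
    best_time = float("inf")
    for block in candidates:
        start = clock()
        for _ in range(repeats):
            run_kernel(block)  # trial call: this really executes the kernel
        elapsed = clock() - start
        if elapsed < best_time:
            best, best_time = block, elapsed
    if best is None:
        raise ValueError("no candidate block sizes given")
    return best


class AutotunedKernel:
    """Hypothetical wrapper: tunes on the first call, then reuses the result."""

    def __init__(
        self,
        run_kernel: Callable[[BlockSize], None],
        candidates: Iterable[BlockSize],
        clock: Callable[[], float] = time.perf_counter,
    ) -> None:
        self._run_kernel = run_kernel
        self._candidates = list(candidates)
        self._clock = clock
        self.best_block: Optional[BlockSize] = None  # in-memory cache only

    def __call__(self) -> None:
        if self.best_block is None:
            self.best_block = pick_fastest_block_size(
                self._run_kernel, self._candidates, clock=self._clock
            )
        self._run_kernel(self.best_block)
```

The `clock` parameter is there so the selection logic can be checked with a simulated clock instead of real timings. Note that every trial call runs the kernel for real, so the first invocation of `AutotunedKernel` executes it several times before the actual call.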
One drawback: the test calls are only correct if input and output fields do not overlap (so no in-place kernels).

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/101
WIP: Add csqrt, cpow to cuda_complex.hpp
2021-11-22T15:41:05+01:00, Stephan Seitz
Apparently, I'm using a feature of a more recent C++ version here.
Specializing `cpow(T)` to `cpow(complex<T>)`

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/100
Fix Opencl and LLVM GPU tests
2019-12-05T11:02:49+01:00, Stephan Seitz
Fix tests for LLVM GPU and OpenCL
- !96 made it impossible to print functions without names (only important for LLVM GPU test)
- !87 made it impossible to run OpenCL kernels on CUDA OpenCL: `int(...)` is not a valid cast for it
- SymPy moved `sympy.boolalg` to `sympy.logic.boolalg`

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/98
WIP: Graph datahandling
2020-01-28T14:24:04+01:00, Stephan Seitz
This is the draft for a data handling that (optionally) forwards all calls to SerialDatahandling.
All calls and data transfers get recorded for the creation of an execution graph.
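
A forwarding wrapper of this kind could look roughly like the sketch below. It is an assumption-laden illustration, not the draft's implementation: `RecordingDataHandling` and `FakeDataHandling` are hypothetical names, and the fake class merely stands in for the wrapped data handling.

```python
from typing import Any, Dict, List, Tuple

Call = Tuple[str, Tuple[Any, ...], Dict[str, Any]]


class RecordingDataHandling:
    """Forwards every method call to `target` and records it in call order."""

    def __init__(self, target: Any) -> None:
        self._target = target
        self.calls: List[Call] = []

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes that are not defined on the proxy itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def recorded(*args: Any, **kwargs: Any) -> Any:
            self.calls.append((name, args, kwargs))  # one node of the later graph
            return attr(*args, **kwargs)

        return recorded


class FakeDataHandling:
    """Hypothetical stand-in for the wrapped data handling."""

    def __init__(self) -> None:
        self.arrays: Dict[str, List[float]] = {}

    def add_array(self, name: str, size: int = 1) -> List[float]:
        self.arrays[name] = [0.0] * size
        return self.arrays[name]

    def to_gpu(self, name: str) -> str:
        return f"transferred {name}"
```

After `dh = RecordingDataHandling(FakeDataHandling())`, calling `dh.add_array("src", size=4)` and `dh.to_gpu("src")` leaves both calls in `dh.calls` in order, which is the raw material an execution graph could be built from.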
Needs to be changed after the breaking changes in datahandling.
Needs a tiny change in lbmpy:
Instead of using `TimeLoop(...)` for time loop creation, a custom function is used.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/97
WIP: Add assumptions based on cast_func.args[0]
2021-11-22T15:32:23+01:00, Stephan Seitz
This enables `cast_func(1.f, create_type('double')).positive == True`

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/85
Opencl datahandling
2019-12-05T12:14:33+01:00, Stephan Seitz
Closes #15
OpenCL kernels are now integrated in the normal `create_kernel` workflow. There is also an `opencljit.init_globally` function that just creates a CL queue/context if you do not want to pass one as a parameter to every kernel.
SerialDatahandling is extended to work with alternative GPU array libraries to PyCuda.
There is now some overlapping code with the `_custom_transfer_functions`, but I suppose they are for certain quantities that have a separate transfer function, as opposed to using a whole different backend.
@kuron can you have a look at it? I think the solution is not as elegant as I thought it would be.
pycuda.gpuarray.GPUArrays are not wrapped. So if you use `dh.gpuarrays['foo']` you get either a pycuda array or an opencl array. I thought this step would be too drastic for one PR. Using OpenCL should still be a lot easier now.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/67
Add ConditionalFieldAccess (Field.Access after out-of-bounds check)
2019-10-01T15:36:36+02:00, Stephan Seitz
Adds a wrapper around a `Field.Access` so that the access is only performed if a certain condition is met.
If I use this, I can safely perform calculations and adjoint calculations with `ghost_layers=0` and obtain the correct gradients without separate boundary handling.

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/54
Always use codegen.rewriting.optimize
2019-09-23T13:38:39+02:00, Stephan Seitz
Pretty much !34 but with the changes to `create_kernel`. Can be closed if not wanted. Leaving it here for archiving purposes.
!34 has the workflow:
```python
assignments = optimize(assignments, optimizations)
ast = create_kernel(assignments)
```

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/47
WIP: Complex number support
2019-10-11T17:57:49+02:00, Stephan Seitz
Depends on !43
Pystencils should eventually support complex numbers, even if complex fields can be considered harmful for CPU vectorization. The concept is nice since SymPy and Python support complex numbers and there should be no performance disadvantage for normal CPU and GPU code. Many applications in physics and signal processing rely on complex numbers.
Complex output fields can be passed directly to libraries like `cufft`.
Problem: In C++, one cannot mix calculations on `std::complex<float>` with calculations on `std::complex<double>`. So the user has to specify `data_type='float32'` when single-precision complex floats are desired.
TODO:
* GPU support with the header pycuda provides
* only use `complex_helper.h` when needed
* remove commits from !34 (probably the code will be changed)
* rebase -i

https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/42
WIP: make kerncraft/matplotlib tests pass with new image
2019-09-02T08:28:51+02:00, Stephan Seitz