pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
Updated: 2020-11-25T13:23:50+01:00

implemented derivation of gradient weights via rotation
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/15
Updated: 2020-11-25T13:23:50+01:00
Author: Markus Holzer
Derive the gradient weights of the other directions from the
already calculated weights of one direction
via rotation, and apply them to a field.

Implement sp.Sum, sp.Product
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/38
Updated: 2019-08-21T18:45:35+02:00
Author: Stephan Seitz
Sum and Product have an indexing variable which is an Atom but not a free
symbol. So the logic that determines the undefined symbols in a `SympyAssignment` should not use
`atoms(sp.Symbol)` but `free_symbols`. `sp.Indexed` from the `ResolvedFieldAccess`es forms an edge case.
So we could also use `atoms(sp.Symbol).intersection(...free_symbols)`.
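
The difference between the two queries can be seen on a small `sp.Sum`. This is a minimal SymPy sketch, independent of pystencils; the symbol names are chosen only for illustration.

```python
import sympy as sp

x, n = sp.symbols("x n")
i = sp.Symbol("i", integer=True)

# Sum binds its index variable i: i is an Atom of the expression,
# but it is not one of the expression's free symbols.
expr = sp.Sum(x**i, (i, 0, n))

all_symbols = expr.atoms(sp.Symbol)  # contains x, n and the bound index i
free = expr.free_symbols             # contains only x and n

# The bound index variable is exactly what the two sets disagree on.
print(all_symbols - free)
```

Treating every Atom as an undefined symbol would therefore ask the caller to supply `i`, although the sum defines it itself.
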
I hope I extracted from my fork all the necessary code to implement this feature.

Implement Pinned GPU memory
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/331
Updated: 2023-06-24T08:23:36+02:00
Author: Markus Holzer
CPU arrays with an equivalent GPU array should be pinned. Further, this MR fixes non-aligned strides between CPU and GPU arrays.
Assignee: Markus Holzer

Implement loop peeling from back
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/388
Updated: 2024-05-28T10:04:12+02:00
Author: Daniel Bauer
As discussed today IRL.
A copy of `peel_loop_front` mutatis mutandis.
Assignee: Daniel Bauer

Implement __hash__ for SympyAssignment
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/142
Updated: 2020-02-22T11:22:51+01:00
Author: Stephan Seitz

Gpu bufferfield fix
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/314
Updated: 2023-03-11T19:06:31+01:00
Author: Philipp Suffa
Some small changes in the calculation of the field sizes to allow only buffered fields as well as only absolute access fields.
This is needed to allow the AA-pattern and communication hiding for sparse kernels (ListLBM).
Assignee: Philipp Suffa

Gpu block size
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/296
Updated: 2022-07-01T12:12:41+02:00
Author: Markus Holzer
If the Assignments act on 2D fields but 3 GPU indexing parameters are provided, an error occurs. With this MR the third parameter is fixed to 1.
Assignee: Markus Holzer

FVM: Choose better stencil for derivative in flux for D3Q27
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/254
Updated: 2021-06-15T07:26:29+02:00
Author: Michael Kuron <mkuron@icp.uni-stuttgart.de>
As reported by @Tischler, the FVM discretization does not use the correct stencils for fluxes with derivatives in D3Q27. The result is not wrong, but uses more neighbors than necessary.
This merge request adds a test case and improves the stencil-choosing heuristic. It now produces the same two-point finite differences that a human would choose by optimizing the free weights later in the process.
Assignee: Markus Holzer

Fvm testcase with fluctuations and reactions
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/284
Updated: 2022-01-10T15:10:04+01:00
Author: itischler
Added a fluctuation testcase and a reaction testcase to the FVM.
Fixes #36
Assignee: Michael Kuron <mkuron@icp.uni-stuttgart.de>

FVM derivation: use a smaller stencil before trying brute-force to find the sparsest stencil
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/146
Updated: 2020-02-12T13:24:22+01:00
Author: Michael Kuron <mkuron@icp.uni-stuttgart.de>
D3Q7/D2Q5 should suffice for first derivatives, so try that stencil first before using brute force to find the sparsest D3Q27/D2Q9 stencil. In 2D, it does not really matter because the brute-force search is so fast, but in 3D it can take days to complete (due to that `itertools.product`). This pull request does not change the resulting stencil weights; it only massively speeds up the process of determining them.
Assignee: Martin Bauer

Fundamental GPU Support
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/384
Updated: 2024-07-15T09:00:38+02:00
Author: Frederik Hennig
This MR introduces the fundamentals of GPU support to the new backend.
## General
- Introduce `GenericGpu` platform and threads range export: GPU platforms communicate the kernel's required thread grid size to the outside via a `GpuThreadsRange` object separate from the AST
- Add configuration options relating to GPUs
## CUDA Platform
- Introduce CUDA platform
- Add materialization + guards for full and sparse iteration spaces
- Add materialization of math functions
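
As a rough illustration of why a full iteration space needs guards: the launch grid is rounded up to whole blocks, so some threads fall outside the iteration space and must be masked out. The plain-Python emulation below is only a sketch under that assumption. `launch_grid` and `guarded_kernel` are hypothetical names, not part of pystencils or CUDA, and the tuple `threads_range` merely stands in for the required thread grid size mentioned above.

```python
import math


def launch_grid(threads_range, block_size):
    """Number of blocks per dimension needed to cover threads_range."""
    return tuple(math.ceil(n / b) for n, b in zip(threads_range, block_size))


def guarded_kernel(threads_range, block_size):
    """Emulate a 2D launch and return the indices that pass the guard."""
    grid = launch_grid(threads_range, block_size)
    visited = []
    for by in range(grid[1]):
        for bx in range(grid[0]):
            for ty in range(block_size[1]):
                for tx in range(block_size[0]):
                    x = bx * block_size[0] + tx
                    y = by * block_size[1] + ty
                    # Guard: threads outside the iteration space do nothing.
                    if x < threads_range[0] and y < threads_range[1]:
                        visited.append((x, y))
    return visited


# 3 x 2 blocks of 4 x 2 threads launch 48 threads for a 10 x 3 space.
cells = guarded_kernel(threads_range=(10, 3), block_size=(4, 2))
print(len(cells))  # 30: every cell of the 10 x 3 space exactly once
```
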
## SYCL Platform
- Introduce SYCL platform
- Add materialization + guards for full and sparse iteration spaces
- Add materialization of math functions
## CUDA Just-In-Time Compiler
- Migrate implementation of `cupy`-based JIT to new backend as an object-oriented structure
## Deviations and Missing Features
In the new implementation, block size selection is entirely up to the JIT / the runtime system and no longer affects the backend.
Adaptive block sizes, register restrictions, etc. are not yet implemented by this MR.
Assignee: Frederik Hennig

Freeze casts of bare constants to typed PsConstantExprs
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/411
Updated: 2024-07-27T12:57:29+02:00
Author: Daniel Bauer
The type of the argument to a cast is hard to infer because there is no context from the enclosing scope.
Unless a typed symbol appears in the argument, the typifier has no way to determine the type of the argument.
For more discussion see #97.
This MR fixes the issue for casts of bare constants, e.g. `CastFunc(0, float64)` by freezing them to constants of the target type directly.
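
The idea of freezing can be sketched with a toy model. The classes `Cast` and `TypedConstant`, the function `freeze`, and the string type names below are hypothetical stand-ins invented for this illustration; they are not the pystencils classes and do not reproduce the MR's implementation.

```python
from dataclasses import dataclass
from numbers import Number


@dataclass(frozen=True)
class Cast:
    """Hypothetical stand-in for a cast node: cast `arg` to `target`."""
    arg: object
    target: str  # e.g. "float64"


@dataclass(frozen=True)
class TypedConstant:
    """Hypothetical stand-in for a constant that already carries its type."""
    value: object
    dtype: str


# Assumed mapping from type names to Python converters, for illustration only.
_CONVERTERS = {"float64": float, "int64": int}


def freeze(node):
    """Turn a cast of a bare number into a constant of the target type."""
    if isinstance(node, Cast) and isinstance(node.arg, Number):
        return TypedConstant(_CONVERTERS[node.target](node.arg), node.target)
    # Anything else is left for later type inference.
    return node


print(freeze(Cast(0, "float64")))  # TypedConstant(value=0.0, dtype='float64')
```

Because the constant leaves this step already typed, no later pass has to guess the type of the bare `0`.
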
Additionally, the typifier now raises an error if the type of an argument to a cast cannot be determined.
Milestone: Release 2.0
Assignee: Daniel Bauer

Fixes for Vector Testcase to Work
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/200
Updated: 2020-12-19T09:01:17+01:00
Author: Julian Hammer
* passing on non-default Kerncraft parameters
* gracefully failing on VectorType usage in AST

Fixes for buffers in loops with step size > 1
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/252
Updated: 2021-06-08T08:54:17+02:00
Author: Frederik Hennig
This MR introduces some additions and fixes for generating CPU loops with step sizes > 1:
- The CPU `create_kernel` function now exposes a flag to disable the double field write check
- Rewrote `get_base_buffer_index` to use pure integer arithmetic, and corrected the computation of the buffer base index to correctly incorporate loop step sizes. Added test case to check correctness.
- Added rudimentary `evalf` functionality to integer division sympy function `int_div` (its absence led to an infinite recursion during code generation).
- Added correct printing of integer-typed expressions in `CustomSympyPrinter._typed_number`.

Fixed wrong type hints. Updated setup.py
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/263
Updated: 2021-09-16T22:47:01+02:00
Author: Jan Hönig
Added authors and changed the package's email to a more general solution.
Assignee: Jan Hönig

Fixed Volume of Fluid discretization and added Advection-Diffusion testcase
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/141
Updated: 2020-02-03T18:51:16+01:00
Author: Alexander Reinauer
Fixed VoF discretization and added advection-diffusion testcase for finite volumes and VoF discretization on behalf of @kuron
Assignee: Martin Bauer

Fixed test for sliced iteration with buffer to use dynamic field sizes
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/280
Updated: 2021-11-28T16:19:50+01:00
Author: Frederik Hennig
Otherwise the test did not fail as it should when float-division is used in indexing.

Fixed integer square root
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/274
Updated: 2021-11-17T15:50:41+01:00
Author: Markus Holzer
Integer square roots must be detected as float32 if the simulation is set up for SP.
Assignee: Markus Holzer

Fixed duplicated kwargs in boundaryhandling
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/193
Updated: 2020-12-07T16:30:17+01:00
Author: Markus Holzer
Assignee: Michael Kuron <mkuron@icp.uni-stuttgart.de>

fixed create_kernel parameter data_type="float" to procucde single precision
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/262
Updated: 2021-09-14T18:35:29+02:00
Author: Christoph Alt
Currently, if `create_kernel(assignments, data_type="float")` is used, the untyped symbols are typed as float64, because `np.dtype("float")` yields float64 during the construction of a new `TypedSymbol`.
Since `data_type` (called `type_info` in `cpu.create_kernel`) can be a string naming a C type, at least according to the [documentation of cpu.create_kernel](https://i10git.cs.fau.de/pycodegen/pystencils/-/blob/master/pystencils/cpu/kernelcreation.py#L31), this behavior is a bit confusing: the C type specifier "float" typically means single precision.
So I added a small function that replaces "float" with "single" in the `symbol_to_type` dict, so that the untyped symbols get the single-precision type.
Assignee: Christoph Alt
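
The NumPy behavior behind this fix is easy to check. In the sketch below, `normalize_type_name` is a hypothetical helper written for this illustration (it is not the function added in the MR), and the contents of `symbol_to_type` are made up.

```python
import numpy as np

# NumPy reads "float" as Python's float, i.e. double precision,
# whereas "single" names the single-precision type.
print(np.dtype("float"))   # float64
print(np.dtype("single"))  # float32


def normalize_type_name(name):
    """Map the C-style name "float" to NumPy's single-precision name."""
    return "single" if name == "float" else name


# Made-up example mapping from symbol names to type names.
symbol_to_type = {"a": "float", "b": "double", "i": "int64"}
symbol_to_type = {s: normalize_type_name(t) for s, t in symbol_to_type.items()}

print(np.dtype(symbol_to_type["a"]))  # float32
```
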