pystencils merge requests
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests
Updated: 2020-11-19T06:18:38+01:00

Automatically align to what is required for vectorization
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/189
Updated: 2020-11-19T06:18:38+01:00
Author: Michael Kuron (mkuron@icp.uni-stuttgart.de)

If this cannot be detected because cpuinfo is missing, use 512 bit.

Assignee: Markus Holzer

Neon intrinsics
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/188
Updated: 2021-03-16T20:29:52+01:00
Author: Markus Holzer

This MR implements NEON intrinsics to enable vectorization for the ARM architecture.
This may also become useful once ARM HPC clusters actually get deployed, though these might end up using SVE instead of NEON. For that case, additional work is needed because SVE's vector width is determined at runtime.

Assignee: Michael Kuron (mkuron@icp.uni-stuttgart.de)

WIP: ARM NEON vectorization
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/187
Updated: 2020-11-18T14:59:55+01:00
Author: Michael Kuron (mkuron@icp.uni-stuttgart.de)

With Apple's new laptops having ARM processors, I thought it might be time to add ARM NEON vectorization to pystencils. I don't currently have hardware to test on, but a bunch of test cases from both pystencils and lbmpy at least compile successfully. A Raspberry Pi 4 might actually be a useful and cheap device to add to CI for this purpose.
This may also become useful once ARM HPC clusters actually get deployed, though these might end up using SVE instead of NEON. I have added a few `if`s for that case, but additional work is needed because SVE's vector width is determined at runtime.

Assignee: Markus Holzer

updated kc coupling to support layercondition analysis
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/186
Updated: 2020-11-13T09:07:40+01:00
Author: Julian Hammer

Blocking for partial directions
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/185
Updated: 2020-11-18T09:57:53+01:00
Author: Markus Holzer

Previously, it was only possible to block in all coordinate directions. However, for some problems it might make sense to block in only one specific direction.
This can now be achieved by setting unwanted coordinates to zero.

Assignee: Jan Hönig

improved kc coupling
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/184
Updated: 2020-11-11T14:32:42+01:00
Author: Julian Hammer

Updated Kerncraft Coupling
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/183
Updated: 2020-11-06T15:45:24+01:00
Author: Julian Hammer

Delete generated file createindexlistcython.c
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/182
Updated: 2020-11-10T10:56:44+01:00
Author: Stephan Seitz

Allow **kernel_creation_args in create_boundary_kernel
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/181
Updated: 2020-10-30T10:47:29+01:00
Author: Stephan Seitz

Fix Dirichlet boundary condition for scalar case
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/180
Updated: 2020-10-29T17:49:55+01:00
Author: Stephan Seitz

Assignee: Markus Holzer

make integration job required for MR
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/179
Updated: 2020-10-30T10:48:23+01:00
Author: Dominik Thoennes (dominik.thoennes@fau.de)

Fix: shift_slice to accept single slices as argument, and return tuples
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/177
Updated: 2020-10-07T11:30:33+02:00
Author: Frederik Hennig

Two fixes to `pystencils.slicing.shift_slice`:
- Previously, `shift_slice` assumed its argument `slices` to be iterable. Thus, it did not accept single slices as arguments. There are use cases, though, where it is necessary to shift a plain `slice` object, or even `int` or `float` objects which can also be seen as slices. An additional `isinstance` check takes care of this.
- Previously, `shift_slice` returned `list`s of slices. By default, Python wraps multidimensional slices as `tuple`s. Code for manipulating multidimensional slices thus expects them to be given as tuples. Also, although it is currently possible to access numpy arrays with lists of slices instead of tuples, this action produces a deprecation warning. Thus, `shift_slice` is changed to return tuples.
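
The two fixes listed above can be illustrated with a minimal sketch. `shift_slice_sketch` is a hypothetical stand-in written for this example, not the code in `pystencils.slicing`. It assumes components are either `slice` objects or plain integers, and that the offset is either one integer or one integer per component. Float components are left out because their semantics are not spelled out above.

```python
def shift_slice_sketch(slices, offset):
    """Hypothetical illustration; not the pystencils implementation."""

    def shift_component(component, shift):
        if isinstance(component, slice):
            start = None if component.start is None else component.start + shift
            stop = None if component.stop is None else component.stop + shift
            return slice(start, stop, component.step)
        return component + shift  # plain integer index

    # Fix 1: a single slice or index is not iterable, so handle it directly.
    if isinstance(slices, (slice, int)):
        return shift_component(slices, offset)

    # Broadcast a scalar offset to every component.
    if isinstance(offset, int):
        offset = [offset] * len(slices)

    # Fix 2: return a tuple, matching how Python packs multidimensional slices.
    return tuple(shift_component(c, o) for c, o in zip(slices, offset))


single = shift_slice_sketch(slice(1, 4), 2)
multi = shift_slice_sketch((slice(0, 3), slice(None), 5), (1, 2, 3))
```

Here `single` is a shifted `slice`, and `multi` is a `tuple` that can be used directly as a multidimensional index without converting from a list first.
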
An additional test case evaluating array accesses with shifted slices is also added.

Use C11CodePrinter for sympy 1.7
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/176
Updated: 2020-10-07T16:42:49+02:00
Author: Stephan Seitz

C++ code printing may cause problems for CUDA/OpenCL (e.g. printing `std::log`).

Assignee: Markus Holzer

WIP: Opencl to SPIR-V ahead-of-time compilation
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/175
Updated: 2023-03-16T12:42:10+01:00
Author: Stephan Seitz

This does not yet use pystencils' cache folder or disk caching of the compilation.
This can be used to embed compiled bytecode into waLBerla executables as I do with my Vulkan wrapper. Not sure if this is a good way to go but at least we can experiment with it.
A good way to proceed with this MR would also be a comparison between HIP/SYCL/OpenCL/Vulkan in order to identify a suitable backend for pystencils.

Extend testsuit
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/174
Updated: 2020-10-13T13:16:38+02:00
Author: Markus Holzer

This MR extends the test cases of pystencils.
Other changes made in this MR:
1. Usage of correct backends for the codegen instead of C Backend for all
2. Deletion of unusable function.
3. Correction of CUDA and OpenCL Array handler

Assignee: Stephan Seitz

Update pystencils integration pipeline
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/173
Updated: 2020-10-05T13:29:08+02:00
Author: Markus Holzer

Some of waLBerla's test cases have changed. This MR adapts to the changes.

Warning fixes in setup.py
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/172
Updated: 2020-10-03T16:26:28+02:00
Author: Markus Holzer

This MR fixes some small warnings in setup.py. Additionally, the accuracy for the timeloop test case is lowered because it fails often.

count_operations: fix to not count integer expressions for addresses/constants as real operations
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/171
Updated: 2020-10-05T09:52:05+02:00
Author: Dominik Ernst

Tries to fix the number of operations counted by `count_operations`. Previously, counter computations with integer constants like +1 would be counted as real, because an `evalf` would make it a +1.0.
Additionally, address computations with pointers of type `real*` would be counted as real too.
Also, x^(-1/2) is counted as both a square root and a division.

Make work on SymPy 1.7: sympy.printing.ccode -> sympy.printing.cxx
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/170
Updated: 2020-09-29T12:03:30+02:00
Author: Stephan Seitz

There's also sympy.printing.c but we are always compiling as C++.
sympy-master failed before this fix.

Assignee: Markus Holzer

Fix: Replaced accidental `continue` by `break` in boundaries/createindexlist.py
https://i10git.cs.fau.de/pycodegen/pystencils/-/merge_requests/169
Updated: 2020-10-07T10:54:06+02:00
Author: Frederik Hennig

There was a `continue` instead of a `break` statement in the Python code for index list creation, causing the `single_link` flag to be ignored. The test cases for this are updated in pycodegen/lbmpy!41.
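
Why a `continue` in that position makes `single_link` a no-op can be seen in a small sketch. The function below is a hypothetical 1D illustration, not the code in boundaries/createindexlist.py: the flag encoding (`"F"` for fluid, `"B"` for boundary), the `buggy` switch, and the function name are assumptions made for this example.

```python
def create_index_list_sketch(flags, directions, single_link=False, buggy=False):
    """Hypothetical 1D illustration of boundary index list creation.

    flags: list of "F" (fluid) or "B" (boundary) cells.
    directions: neighbor offsets, e.g. (-1, 1).
    Returns (cell, direction_index) pairs for fluid cells next to a boundary.
    """
    result = []
    for cell, flag in enumerate(flags):
        if flag != "F":
            continue
        for dir_idx, step in enumerate(directions):
            neighbor = cell + step
            if not 0 <= neighbor < len(flags) or flags[neighbor] != "B":
                continue
            result.append((cell, dir_idx))
            if single_link:
                if buggy:
                    continue  # the mistake: simply moves on to the next direction
                break  # the fix: stop after the first link found for this cell
    return result


flags = ["B", "F", "B"]
fixed = create_index_list_sketch(flags, (-1, 1), single_link=True)
broken = create_index_list_sketch(flags, (-1, 1), single_link=True, buggy=True)
```

With `break`, the fluid cell contributes one link, `[(1, 0)]`. With `continue` as the last statement of the loop body, the loop proceeds exactly as it would without the flag and returns both links, `[(1, 0), (1, 1)]`.
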