waLBerla issueshttps://i10git.cs.fau.de/walberla/walberla/-/issues2024-02-12T15:19:48+01:00https://i10git.cs.fau.de/walberla/walberla/-/issues/238import waLBerla hangs after installation2024-02-12T15:19:48+01:00Pedro Santos Nevesimport waLBerla hangs after installationHi waLBerla developers and contributors!
With other colleagues in the EESSI and MultiXscale projects, we are trying to build and deploy optimized waLBerla v6.1 installations and ran into an issue when building it with two specific toolchains that we'd like to report and hopefully get your input on.
A summary of the issue:
We are building waLBerla through EasyBuild using the [`foss2022b`](https://github.com/easybuilders/easybuild-easyconfigs/pull/19324) and [`foss2023a`](https://github.com/easybuilders/easybuild-easyconfigs/pull/19252) toolchains with two identical easyconfig files. With either toolchain, the installation proceeds until the sanity check step, which simply runs `python -c "import waLBerla"`, at which point it hangs. We see this happen [on the EasyBuild test clusters](https://github.com/easybuilders/easybuild-easyconfigs/pull/19252#issuecomment-1820653972) but not on our personal laptops or on the HPC cluster at the University of Groningen.
We tried changing the sanity check to `mpirun -np 1 python -c "import waLBerla"` in case the issue was with the test cluster's environment, but the same hang occurs.
One successful workaround is to set `UCX_LOG_LEVEL=info` in the sanity check so that it reads `UCX_LOG_LEVEL=info python -c "import waLBerla"`. We don't know why changing the log level of `UCX` resolves this problem, and my colleague who discovered this has also opened a ticket about it in the `UCX` repo [here](https://github.com/openucx/ucx/issues/9532).
Another workaround seems to be importing `mpi4py` before waLBerla. This is surprising, because `mpi4py` is not a dependency of waLBerla. We would rather not add `mpi4py` as a dependency for this issue, especially without knowing the consequences of this.
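When comparing the workarounds, it can help to run the import in a child process with a timeout, so that a hanging import is reported instead of blocking the build. The following is a minimal sketch using only the Python standard library; the helper name `import_completes` and the 60-second default are illustrative choices, not part of EasyBuild or waLBerla.

```python
import os
import subprocess
import sys


def import_completes(module, timeout=60.0, extra_env=None):
    """Return True if `python -c "import <module>"` exits cleanly within `timeout` seconds."""
    env = dict(os.environ)
    env.update(extra_env or {})
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module}"],
            env=env,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child once the timeout expires.
        return False
    return result.returncode == 0
```

Calling `import_completes("waLBerla")` and `import_completes("waLBerla", extra_env={"UCX_LOG_LEVEL": "info"})` on an affected system would then show whether the log-level workaround changes the outcome there.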
Given that we were only seeing this problem on the EasyBuild test clusters and not on other systems, and given that the `UCX` workaround seems to work for the [EESSI test clusters](https://github.com/EESSI/software-layer/pull/421), we assumed `import waLBerla` was likely hanging due to some quirk of the EasyBuild test clusters. However, we received a [report](https://github.com/easybuilders/easybuild-easyconfigs/pull/19324/#issuecomment-1857832565) from another EasyBuild maintainer who noticed this problem on another system. Because of this, we are no longer convinced that the cause is related to the EasyBuild clusters and their environment.
We have a [summary](https://gitlab.com/eessi/support/-/issues/20) of our attempts in our support portal, where you can find more details.
Would you have any idea of what could be causing this, or have you perhaps encountered something similar in the past? We'd love your input as we're quite confused about this problem. Thanks in advance!https://i10git.cs.fau.de/walberla/walberla/-/issues/239Dynamic load balancing: Refresh function seems to not communicate the flag fi...2024-01-29T12:17:02+01:00Philipp SuffaDynamic load balancing: Refresh function seems to not communicate the flag field correctlyWhen using the blockforest refresh function for dynamic load balancing, it seems not to communicate the "uidToFlag" map of the flag Field. So the flags in the flag field are still set correctly but the connection to the FlagUIDs seems to be lost.Philipp SuffaPhilipp Suffahttps://i10git.cs.fau.de/walberla/walberla/-/issues/237clang-tidy used to ignore .h files2023-11-09T14:59:33+01:00Dominik Thoennesdominik.thoennes@fau.declang-tidy used to ignore .h filesI think that the clang-tidy script ignored `.h` files in the past.
This means that these files were not checked and there are lots of warnings after the update.
I disabled the job in the pipeline for now.
https://i10git.cs.fau.de/walberla/walberla/-/jobs/1135823https://i10git.cs.fau.de/walberla/walberla/-/issues/204Is there a LBM grid refinement example on the GPU?2023-09-27T21:54:03+02:00ahmedIs there a LBM grid refinement example on the GPU?Thanks a lot for making this great library open-source with such high-quality code!
I was wondering if there's a 3D LBM grid refinement example that runs on the GPU. I saw an LBM grid refinement example under `apps\benchmarks\AdaptiveMeshRefinementFluidParticleCoupling` but I think it only runs in parallel on the CPU (correct me if I'm wrong). What I'm trying to find is an example that runs LBM on a non-uniform static mesh that doesn't change over time, i.e., not AMR.https://i10git.cs.fau.de/walberla/walberla/-/issues/212Define FlagUIDs in the BoundaryCollection for reuse in App2023-06-21T11:38:45+02:00Philipp SuffaDefine FlagUIDs in the BoundaryCollection for reuse in AppIt could be useful to define the FlagUIDs, which are first set in the generation file, in the BoundaryCollection, so that they can be further used in the application file.
So if one defined a UBB, NoSlip and FixedDensity boundary in the generation file, the BoundaryCollection could look like:
```
namespace walberla{
namespace lbm {
const FlagUID noSlipFlagUID("NoSlip");
const FlagUID UBBFlagUID("UBB");
const FlagUID FixedDensityFlagUID("FixedDensity");
class PSMBoundaryCollection
{
....
```
Markus HolzerMarkus Holzerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/192Communication fails when using multiple blocks per process (GPU)2023-05-22T15:23:13+02:00Samuel KemmlerCommunication fails when using multiple blocks per process (GPU)The communication fails (i.e., communicates to the wrong position) when using multiple blocks per process (GPU). This only occurs between the process-local block 0 and process-local block 1.https://i10git.cs.fau.de/walberla/walberla/-/issues/130Improve pe Union2023-05-02T12:18:33+02:00Michael Kuronmkuron@icp.uni-stuttgart.deImprove pe UnionCurrent restrictions that I noticed:
- [ ] A union of two overlapping bodies has an incorrect volume, mass and inertia tensor. Overlap should either be checked for and warned about, or the overlap volume should be estimated numerically and mass and inertia be corrected accordingly.
- [ ] Cryptic errors happen when a union is created from bodies spread across a block boundary.
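
To illustrate the numerical estimate mentioned in the first item, here is a minimal Monte Carlo sketch for the volume of a union of overlapping spheres. It is written in plain Python and does not use the pe API; the function names and the sample count are illustrative assumptions. Mass and inertia could in principle be accumulated from the same samples, but that is not shown here.

```python
import math
import random


def sphere_volume(radius):
    return 4.0 / 3.0 * math.pi * radius**3


def estimate_union_volume(spheres, samples=200_000, seed=0):
    """Monte Carlo estimate of the volume covered by a list of (center, radius) spheres."""
    rng = random.Random(seed)
    # Axis-aligned bounding box enclosing all spheres.
    lo = [min(c[i] - r for c, r in spheres) for i in range(3)]
    hi = [max(c[i] + r for c, r in spheres) for i in range(3)]
    box_volume = math.prod(h - l for l, h in zip(lo, hi))
    hits = 0
    for _ in range(samples):
        p = [rng.uniform(lo[i], hi[i]) for i in range(3)]
        # A point counts once, no matter how many bodies contain it.
        if any(sum((p[i] - c[i]) ** 2 for i in range(3)) <= r * r for c, r in spheres):
            hits += 1
    return box_volume * hits / samples


# Two unit spheres whose centers are one radius apart.
spheres = [((0.0, 0.0, 0.0), 1.0), ((1.0, 0.0, 0.0), 1.0)]
naive = sum(sphere_volume(r) for _, r in spheres)  # double-counts the overlap
estimate = estimate_union_volume(spheres)
```

For this configuration the exact union volume is 9π/4, whereas the naive sum of the two sphere volumes is 8π/3, so the difference between `naive` and `estimate` approximates the overlap volume that is currently counted twice.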
Further restrictions that @eibl mentioned to me:
- [ ] Dynamic creation and splitting of unions in parallel is problematic.
- [ ] Unclear collision handling because concave objects can collide in multiple points simultaneously.
Regarding the last point: I guess you can even have simultaneous collisions without unions. E.g. cube-cube and cube-plane (face to face), cube-cylinder, plane-cylinder and cylinder-cylinder (face to side or face to face), or sphere-torus (moving the sphere through the center of the torus).https://i10git.cs.fau.de/walberla/walberla/-/issues/209Code Quality Days 2.5. + 3.5.2023-05-02T09:36:09+02:00Dominik Thoennesdominik.thoennes@fau.deCode Quality Days 2.5. + 3.5.Hi,
this issue is intended to provide an overview of the open issues we could work on during the code quality days.
The current plan is to have this for two days.
Please feel free to add more information.
Poll for the date:
https://terminplaner6.dfn.de/en/p/6683ec971fb22b9928e1ff6d3ae6b412-196072
| topic | comment/explanation | related issues |
| --- | --- | --- |
| Cleanup | Check very old issues (> 2 years) to see if these are still relevant | |
| Remove Boost | | #190 |
| fix metis/parmetis integration | | #195 |
| improve logging | | #178 |
| Unify Communication | CPU and GPU communication schemes are vastly similar | #196 |
| Better GPU integration | GPU usage should be integrated like MPI for example | !565 |
| Use SoA by default | Although SoA is mostly better it is not the default in waLBerla | #182 |
| Boundaries | Different topics on boundaries | #203 #173 #170 #3 |
https://docs.google.com/spreadsheets/d/1chiE5PCNcuokjp7Q3MyClaKQlDO2GqmaljPfJlIhH20/edit#gid=0https://i10git.cs.fau.de/walberla/walberla/-/issues/173Poor user experience with generated DynamicUBB (UBB + additional_data_handler)2023-05-02T09:32:33+02:00Nigel OvermarsPoor user experience with generated DynamicUBB (UBB + additional_data_handler)For my purposes, I need to be able to set a certain inflow profile, for example a variant of a Poiseuille flow where the velocity changes at every time step.
Currently, this is possible when using generated sweeps, but only with a few nasty hacks, for both making it temporally varying and spatially varying. To be able to use a DynamicUBB, which was generated via
```
ubb_dynamic = UBB(lambda *args: None, dim=stencil.D)
ubb_data_handler = UBBAdditionalDataHandler(stencil, ubb_dynamic)
# UBB with user-defined velocity profile
generate_boundary(ctx, "DynamicUBB", ubb_dynamic, lbm_method,
additional_data_handler=ubb_data_handler, target=target)
```
one needs to pass a `std::function< Vector3< real_t >(const Cell&, const shared_ptr< StructuredBlockForest >&, IBlock&) >` to the constructor of the DynamicUBB type. The easiest way to make one is via a functor, for which a template looks like this:
```
class InflowProfile
{
public:
Vector3< real_t > operator()( const Cell& pos, const shared_ptr< StructuredBlockForest >& SbF, IBlock& block ) {
// return velocity vector depending on the cell location in the SbF
return getVelocityVector(pos, SbF, block);
}
};
```
**Temporally varying inflow profile**
The problem is that, in the current version of waLBerla/lbmpy, the additional data handler is only being called once before the running of the simulation when the generated member function
```
template<typename FlagField_T>
void DynamicUBB::fillFromFlagField( const shared_ptr<StructuredBlockForest> & blocks, ConstBlockDataID flagFieldID,
FlagUID boundaryFlagUID, FlagUID domainFlagUID)
```
is being called. After these values have been set, there is currently no easy way to update them at every time step. As a workaround, one can register the aforementioned member function with the SweepTimeLoop via `addFuncAfterTimeStep` and add a `TimeTracker` object to the functor, which allows updating the values depending on the current time step. This approach does appear to incur a performance penalty, as the member function `DynamicUBB::fillFromFlagField` appears to do some work which should be unnecessary after an initial run of it. I think, but haven't tested it, that this also makes running on the GPU impossible or significantly slower.
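
The idea of a functor that consults a time tracker can be sketched independently of waLBerla. The following Python sketch is only an illustration of the pattern: the `TimeTracker` class here is a hypothetical stand-in for a step counter shared with the time loop, and the parabolic profile with a linear ramp is an assumed example, not the profile used in my simulation. The actual functor would be the C++ class shown above.

```python
class TimeTracker:
    """Hypothetical stand-in for a time step counter shared with the time loop."""

    def __init__(self):
        self.step = 0

    def advance(self):
        self.step += 1


class InflowProfile:
    """Parabolic (Poiseuille-like) inflow in x whose amplitude ramps up over time."""

    def __init__(self, tracker, channel_height, u_max, ramp_steps):
        self.tracker = tracker
        self.channel_height = channel_height
        self.u_max = u_max
        self.ramp_steps = ramp_steps

    def __call__(self, cell):
        x, y, z = cell
        # Cell-centered coordinate across the channel, in lattice units.
        yc = y + 0.5
        shape = 4.0 * yc * (self.channel_height - yc) / self.channel_height**2
        ramp = min(1.0, self.tracker.step / self.ramp_steps)
        return (self.u_max * ramp * shape, 0.0, 0.0)


tracker = TimeTracker()
profile = InflowProfile(tracker, channel_height=10, u_max=0.05, ramp_steps=100)
```

Because the functor only reads `tracker.step`, re-evaluating it after each `tracker.advance()` yields the velocity for the current time step without changing the functor's signature.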
**Spatially varying inflow profile**
Currently, the cells being passed to the functor's `operator()` are different from the ones I expected to get. The problem is in the following generated code:
```
// for every cell in the block
if ( isFlagSet( it.neighbor(1, 0, 0 , 0 ), boundaryFlag ) )
{
auto element = IndexInfo(it.x(), it.y(), it.z(), 0 );
Vector3<real_t> InitialisatonAdditionalData = elementInitaliser(Cell(it.x(), it.y(), it.z()), blocks, *block);
element.vel_0 = InitialisatonAdditionalData[0];
element.vel_1 = InitialisatonAdditionalData[1];
element.vel_2 = InitialisatonAdditionalData[2];
// snip
}
// Similar code for all directions, e.g. 27 in total for a D3Q27 stencil
```
where the function `elementInitaliser` is our functor's `operator()`.
In essence, what is being checked is whether the current cell has a neighbor for which, in the case of an inflow boundary, the inflowFlag is set. If that is the case, the functor is applied to this cell, meaning the cell of which a neighboring cell has the inflowFlag set, not the cell with the inflowFlag itself, and the results (i.e. the computed velocity vector) are stored in the appropriate location.
Currently, I am working around this problem by manually adding the direction to the cell on which the functor is applied. Unfortunately, due to the generated nature of the code, this is not a real solution...
In my use case, certain noise values are generated by a Python program for every cell in the inflow boundary and for every time step, and are then read from a .json file. These values are put into a `std::unordered_map<GlobalCellCoordinatesTuple, Vector3>` and read during the boundary treatment. Without the aforementioned fix there is a mismatch between the cell coordinates I put in the `std::unordered_map<..,..>` (the ones for which the inflowFlag is set) and the cell coordinates that get passed to the functor, resulting in wrong behavior.
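
The coordinate shift behind this workaround can be sketched as follows, with a Python dictionary standing in for the `std::unordered_map`. The helper names are hypothetical, and the sketch assumes that the map keys and the cell passed to the functor are expressed in the same coordinate system.

```python
def boundary_cell(fluid_cell, direction):
    """Return the flagged boundary cell adjacent to `fluid_cell` along `direction`."""
    return tuple(c + d for c, d in zip(fluid_cell, direction))


def lookup_velocity(noise_by_cell, fluid_cell, direction, default=(0.0, 0.0, 0.0)):
    """Look up the value stored for the boundary cell, given the fluid cell the functor receives."""
    return noise_by_cell.get(boundary_cell(fluid_cell, direction), default)


# Values keyed by the coordinates of the cells that carry the inflow flag.
noise_by_cell = {(0, 3, 5): (0.05, 0.001, -0.002)}
# The functor is called with the fluid cell (1, 3, 5), whose neighbor in direction (-1, 0, 0) is flagged.
velocity = lookup_velocity(noise_by_cell, (1, 3, 5), (-1, 0, 0))
```

Looking up the unshifted cell `(1, 3, 5)` directly would miss the entry, which is the mismatch described above.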
**Suggestions**
- Allow the user to specify if they want the boundary values to be updated at every time step. Then the user would just have to add a `TimeTracker` to the functor and then can update the values in an easy manner.
- Allow the user to select whether the cells passed to the functor's `operator()` are 1) the neighboring cells, as is done right now, or 2) the actual cells for which the checked flag is set.Markus HolzerMarkus Holzerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/3BoundaryHandling Documentation2023-05-02T09:28:20+02:00Martin BauerBoundaryHandling DocumentationDraft:
Section 1: Boundary Handling
- Basic Idea
- "copydoc" from FlagField: why FlagUID?
- FlagField, simple case: flag corresponds to boundary / domain
- generalize: boundary/domain can be masks
-> explain need for BoundaryUID vs. FlagUID
- sidenote: Optimization with near boundary flag
or ListBased Approach ( description howto switch )
- what is a consistent state?
- boundary concept: how to implement your own boundary ( copydoc / nice format )
- Practical Usage:
- why templates?
- setup using tuples
- set/force/remove
- boundary handling factories
- explain difference between Flag/Domain/Boundary: difference is only 'optimization level'
Section 2: Boundary Handling Collection
- reason "multiphysics simulations"
- explain using example: picture ( fluid, temperature )
- explain why setBoundary(FlagUID) does not exist
- do some crazy scenario setups....
Section 3: Initializer i.e. what they can/cannot do.
- @bauer7.1Markus HolzerMarkus Holzerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/205Juwels-Booster: selectDeviceBasedOnMpiRank() presumably assigns the wrong hos...2023-04-05T13:49:07+02:00Philipp SuffaJuwels-Booster: selectDeviceBasedOnMpiRank() presumably assigns the wrong host memory to the deviceThe function selectDeviceBasedOnMpiRank() seems to assign all devices (GPUs) on a node to the same host memory.
This could cause performance issues for CPU to GPU communication (`cudaMemcpy`), because GPUs are not communicating with their closest host memory.
So if you allocate 4 GPUs on a node and call the program with 4 MPI processes, all 4 GPUs are assigned to the same MPI process memory (of process 1) by cudaSetDevice().
This is the case, because the function gpuGetDeviceCount() returns 1 instead of 4 devices (GPUs).
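
The effect of the device count on the selection can be sketched in a few lines. This assumes a round-robin mapping of the node-local rank onto the visible devices, which is a common scheme; I have not verified that `selectDeviceBasedOnMpiRank()` does exactly this, and the function name below is hypothetical.

```python
def select_device(local_rank, device_count):
    """Map a node-local MPI rank to a device index in a round-robin fashion."""
    if device_count < 1:
        raise ValueError("no device visible to this process")
    return local_rank % device_count


# Four processes on a node, four visible devices: one device index per process.
expected = [select_device(rank, 4) for rank in range(4)]
# Four processes, but the device count is reported as 1: every rank selects index 0.
observed = [select_device(rank, 1) for rank in range(4)]
```

With a reported device count of 1, every process passes index 0 to `cudaSetDevice()`. Which physical GPU that index refers to depends on how device visibility is configured per process, which this sketch does not model.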
This behavior is only tested for juwels-booster so far, further investigation is needed...https://i10git.cs.fau.de/walberla/walberla/-/issues/170Usability CodeGen boundaries2023-04-05T13:47:53+02:00Markus HolzerUsability CodeGen boundariesEspecially when a boundary with additional data is generated it is rather hard to understand how this should be done in the python script.
Something like a wrapper around the boundaries would maybe improve the situation. For example, the UBB can be generated in this way:
```python
ubb_dynamic = UBB(lambda *args: None, dim=stencil.D)
ubb_data_handler = UBBAdditionalDataHandler(stencil, ubb_dynamic)
```
However, the empty callback with the lambda and the `UBBAdditionalDataHandler` are cryptic and there is absolutely no other choice than shown above.
Thus a thin wrapper like `DynamicUBB(stencil=stencil)` could improve this situation quite a lot.
The reason this has to be written in such a cryptic way in the first place is that lbmpy has to function as a standalone framework alongside waLBerla. Thus certain design decisions are motivated by the Python world and conflict with the codegen for waLBerla.Markus HolzerMarkus Holzerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/196Unification GPU/Device Usage2023-04-04T09:44:48+02:00Markus HolzerUnification GPU/Device UsageThe GPU/CUDA backend of waLBerla is very thin at the moment. To integrate device usage, the following list of suggestions/observations should serve as a guideline.
1. Like `MPI`, device functions should get a wrapper with an "environment", as done with `WALBERLA_MPI_SECTION`. Users would then no longer need to work with `if defined`, as is done for example here: https://i10git.cs.fau.de/walberla/walberla/-/blob/master/apps/benchmarks/FlowAroundSphereCodeGen/FlowAroundSphereCodeGen.cpp
2. The GPUField: https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/cuda/GPUField.h works differently in the sense that the interface is different (f-Size is not a template parameter), but it is also very cumbersome to have explicit GPU data with its explicit `addToStorage` functions. It would be simpler if the `Field` itself could synchronise its data and give back a device or host pointer depending on the situation. This goes along with #109, which suggests using `mdspan` for the Field data structure. The `mdspan` has this functionality right away.
3. Some device-specific implementations do not need to be specific. A good example is the communication scheme: https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/cuda/communication/UniformGPUScheme.h In the end the same `MPI` function is called just with a device pointer and not a host pointer (or even also with a device pointer if GPU direct is not available). Thus there is no need for this parallel world to exist here.
Another good example is the https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/cuda/communication/CustomMemoryBuffer.h which works basically completely similar to the normal device buffers with a slightly different API.https://i10git.cs.fau.de/walberla/walberla/-/issues/10Multigrid PDE solver2023-03-29T20:05:37+02:00Michael Kuronmkuron@icp.uni-stuttgart.deMultigrid PDE solverWalberla should eventually have a multigrid PDE solver. This should also enable solving PDEs on refined grids. As requested in software/walberla#15, it should optionally be possible to specify the full sparse matrix for the left-hand side instead of just giving stencil weights.
Strategy:
1. [x] Basic multigrid with direct coarsening and no boundary conditions
* [x] Add stencil field support to existing PDE solvers (software/walberla#15), as this is required for the following steps.
* [x] Implement Galerkin Coarsening
* [x] Locally modify the stencil to take into account boundary conditions
* [ ] Grid refinement
* [ ] Make sure RBGS work on refined grids. We can assume that all blocks have the same coarsest resolution.
* [ ] Adapt V-cycle so that for its finest levels, it can use the block's resolution if no grid with the needed resolution is available.
* [ ] For dynamic refinement, need to recreate the coarser levels of the V-cycle (solution, rhs, residual, stencil field) whenever the refinement changes.Richard AngersbachRichard Angersbachhttps://i10git.cs.fau.de/walberla/walberla/-/issues/194Sum of cell local overlap fractions can exceed 1 (which it should not) in the...2023-03-23T11:19:54+01:00Samuel KemmlerSum of cell local overlap fractions can exceed 1 (which it should not) in the pe couplingSince pe is deprecated this issue is more for documentation and can be removed as soon as there is a documentation page for known bugs.https://i10git.cs.fau.de/walberla/walberla/-/issues/176Check and validate implementation of Guo force model with TRT/MRT2023-01-27T07:59:16+01:00Christoph SchwarzmeierCheck and validate implementation of Guo force model with TRT/MRTI tested waLBerla's TRT collision model with the compressible D3Q19 velocity set in waLBerla's free surface LBM module (not yet open source).
When using the `GuoConstant` force model, I noticed two things that are not present when using the `SimpleConstant` force model:
1. the physical results significantly differ from SRT (same even order relaxation rate, arbitrary "Magic" parameter)
2. the physical results change significantly with small changes in TRT's (even order) relaxation rate
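
For reference, the relation between the two TRT relaxation rates and the "Magic" parameter can be written down in a few lines. The sketch uses the standard TRT definition, magic = (1/omega_even - 1/2) * (1/omega_odd - 1/2); whether the tests above derived the odd rate exactly this way is an assumption, and the function name is illustrative.

```python
def odd_relaxation_rate(omega_even, magic):
    """Solve magic = (1/omega_even - 1/2) * (1/omega_odd - 1/2) for omega_odd."""
    tau_even_shifted = 1.0 / omega_even - 0.5
    if tau_even_shifted <= 0.0:
        raise ValueError("omega_even must lie in (0, 2)")
    return 1.0 / (magic / tau_even_shifted + 0.5)


# Small changes of the even rate at a fixed magic parameter of 3/16.
for omega_even in (1.60, 1.61, 1.62):
    print(omega_even, odd_relaxation_rate(omega_even, 3.0 / 16.0))
```

SRT corresponds to the special case in which both rates coincide, i.e. magic = (1/omega - 1/2)^2, which gives a reference point when comparing TRT results against SRT.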
These observations should not be related to the free surface LBM implementation and we should therefore
- [x] create a simple test case to validate the `GuoConstant` force model when using the TRT/MRT collision operators (without free surface LBM)
- [x] try to reproduce this issue (without free surface LBM)
- [ ] check if the `GuoConstant` and `GuoField` force models are implemented correctly for TRT/MRT collision operatorsJonas PlewinskiJonas Plewinskihttps://i10git.cs.fau.de/walberla/walberla/-/issues/200GCC12: EquationSystem warning when using `DebugOptimized` and `OPTIMIZIE_FOR_...2022-12-24T14:00:29+01:00Dominik Thoennesdominik.thoennes@fau.deGCC12: EquationSystem warning when using `DebugOptimized` and `OPTIMIZIE_FOR_LOCALHOST`The EquationSystem.cpp emits a `maybe-uninitialized` warning when built with `DebugOptimized` and `WALBERLA_OPTIMIZE_FOR_LOCALHOST` on gcc12
```
Building CXX object src/core/CMakeFiles/core.dir/math/equation_system/EquationSystem.cpp.o
cd /build/src/core && /usr/local/bin/ccache g++ -DBOOST_ALL_NO_LIB -I/build/src -I/walberla/src -isystem /opt/boost/include -isystem /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi -isystem /usr/lib/x86_64-linux-gnu/openmpi/include -isystem /opt/openmesh/include -Wall -Wconversion -Wshadow -march=native -Wfloat-equal -Wextra -pedantic -D_GLIBCXX_USE_CXX11_ABI=1 -pthread -g -O3 -std=c++17 -o CMakeFiles/core.dir/math/equation_system/EquationSystem.cpp.o -c /walberla/src/core/math/equation_system/EquationSystem.cpp
In file included from /opt/boost/include/boost/graph/depth_first_search.hpp:21,
from /opt/boost/include/boost/graph/max_cardinality_matching.hpp:21,
from /walberla/src/core/math/equation_system/EquationSystem.cpp:39:
In constructor 'boost::bgl_named_params<T, Tag, Base>::bgl_named_params(T, const Base&) [with T = boost::vec_adj_list_vertex_id_map<boost::no_property, long unsigned int>; Tag = boost::vertex_index_t; Base = boost::bgl_named_params<boost::detail::odd_components_counter<long unsigned int>, boost::graph_visitor_t, boost::no_property>]',
inlined from 'boost::bgl_named_params<PType, boost::vertex_index_t, boost::bgl_named_params<T, Tag, Base> > boost::bgl_named_params<T, Tag, Base>::vertex_index_map(const PType&) const [with PType = boost::vec_adj_list_vertex_id_map<boost::no_property, long unsigned int>; T = boost::detail::odd_components_counter<long unsigned int>; Tag = boost::graph_visitor_t; Base = boost::no_property]' at /opt/boost/include/boost/graph/named_function_params.hpp:217:5,
inlined from 'static bool boost::maximum_cardinality_matching_verifier<Graph, MateMap, VertexIndexMap>::verify_matching(const Graph&, MateMap, VertexIndexMap) [with Graph = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS>; MateMap = long unsigned int*; VertexIndexMap = boost::vec_adj_list_vertex_id_map<boost::no_property, long unsigned int>]' at /opt/boost/include/boost/graph/max_cardinality_matching.hpp:779:61,
inlined from 'bool boost::matching(const Graph&, MateMap, VertexIndexMap) [with Graph = adjacency_list<vecS, vecS, undirectedS>; MateMap = long unsigned int*; VertexIndexMap = vec_adj_list_vertex_id_map<no_property, long unsigned int>; AugmentingPathFinder = edmonds_augmenting_path_finder; InitialMatchingFinder = extra_greedy_matching; MatchingVerifier = maximum_cardinality_matching_verifier]' at /opt/boost/include/boost/graph/max_cardinality_matching.hpp:807:79,
inlined from 'bool boost::checked_edmonds_maximum_cardinality_matching(const Graph&, MateMap, VertexIndexMap) [with Graph = adjacency_list<vecS, vecS, undirectedS>; MateMap = long unsigned int*; VertexIndexMap = vec_adj_list_vertex_id_map<no_property, long unsigned int>]' at /opt/boost/include/boost/graph/max_cardinality_matching.hpp:817:48,
inlined from 'bool boost::checked_edmonds_maximum_cardinality_matching(const Graph&, MateMap) [with Graph = adjacency_list<vecS, vecS, undirectedS>; MateMap = long unsigned int*]' at /opt/boost/include/boost/graph/max_cardinality_matching.hpp:824:56,
inlined from 'void walberla::math::EquationSystem::match()' at /walberla/src/core/math/equation_system/EquationSystem.cpp:97:4:
/opt/boost/include/boost/graph/named_function_params.hpp:192:56: warning: '*(unsigned char*)((char*)&occ + offsetof(boost::detail::odd_components_counter<long unsigned int>,boost::detail::odd_components_counter<long unsigned int>::m_parity))' may be used uninitialized [-Wmaybe-uninitialized]
192 | bgl_named_params(T v, const Base& b) : m_value(v), m_base(b) {}
| ^~~~~~~~~
/opt/boost/include/boost/graph/max_cardinality_matching.hpp: In member function 'void walberla::math::EquationSystem::match()':
/opt/boost/include/boost/graph/max_cardinality_matching.hpp:778:52: note: '*(unsigned char*)((char*)&occ + offsetof(boost::detail::odd_components_counter<long unsigned int>,boost::detail::odd_components_counter<long unsigned int>::m_parity))' was declared here
778 | detail::odd_components_counter< v_size_t > occ(num_odd_components);
|
```
https://i10git.cs.fau.de/walberla/walberla/-/issues/199Make CMake and CodeGen simpler2022-11-29T13:03:44+01:00Markus HolzerMake CMake and CodeGen simplerAt the moment all generated files, that come out of the CodeGen script need to be manually stated under `OUT_FILES` in the `waLBerla_generate_target_from_python`. Example:
https://i10git.cs.fau.de/walberla/walberla/-/blob/master/apps/benchmarks/FlowAroundSphereCodeGen/CMakeLists.txt
This has some disadvantages:
1. It is always a bit clunky because as a user you first provide the names of the `OUT_FILES` as strings to the generation function and then one needs to copy all these names in the CMake file again.
2. It is error-prone in several ways. The biggest issue is the correct file ending for GPU support. This was partially solved in !518; however, one still needs to think about whether a file is only generated for CPU or can vary depending on the CMake configuration. Second, some generation functions like `generate_info_header` only produce a header file. The user must know this to write a correct CMake file on the first try ...
3. A third big problem is, that every generation function can only produce a single file. Otherwise, it would be impossible for a user to know which files will come out in the back. Thus this system lacks flexibility as wellMarkus HolzerMarkus Holzerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/94Zero-centering of stored PDF values is not well documented and unclear2022-11-17T15:09:55+01:00Christoph RettingerZero-centering of stored PDF values is not well documented and unclearIn waLBerla, the LBM PDF values are either stored regularly, i.e. text-book like, or "centered around 0", i.e. just storing the deviation from the corresponding lattice weight.
The decision is made based on the lattice model's compressible flag: true leads to regular PDF values, whereas false (= incompressible) leads to centered PDF values.
The seemingly only place where this is noted is in a comment in src/lbm/field/PdfField.h but it is a very important aspect to realize when implementing own algorithms etc.
This should be documented better in some central place or a tutorial (and this issue is supposed to be a temporary documentation of this behavior).
Furthermore, it should be evaluated whether or not this centering might also make sense for the compressible case, as the only reason for centering seems to be floating point accuracy. Or is there another reason not to do it?
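
The floating point argument can be made concrete with a small standard-library sketch. It uses single precision and the D2Q9 rest weight 4/9 purely to make the effect visible; these choices and the helper name `to_float32` are illustrative assumptions, and the same effect occurs at a smaller scale in double precision.

```python
import struct


def to_float32(value):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


weight = 4.0 / 9.0       # D2Q9 rest-population weight
deviation = 1.0e-9       # small deviation of the PDF from its lattice weight

# Regular storage: the deviation is below single-precision resolution near the weight.
regular = to_float32(weight + deviation)
deviation_lost = regular == to_float32(weight)

# Zero-centered storage: only the deviation is stored, so it keeps its relative precision.
centered = to_float32(deviation)
centered_error = abs(centered - deviation)
```

In regular storage the stored value is indistinguishable from the bare weight, so the deviation is lost entirely, while the zero-centered value retains it up to single-precision rounding.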
For a clearer self-documentation, one should add a flag to the lattice model like 'zero-centered' and this flag should be used instead of the compressible flag to check for the centering and its implications.7.1Christoph RettingerChristoph Rettingerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/109Use mdspan instead of Field for portability of codegen beyond Walberla2022-09-14T12:12:39+02:00Michael Kuronmkuron@icp.uni-stuttgart.deUse mdspan instead of Field for portability of codegen beyond Walberla@bauer pointed out that C++23 will have `std::mdspan`, a non-owning multidimensional array view. A reference implementation that works with C++11 and higher is already available: https://github.com/kokkos/mdspan.
This could be used at the interface between Walberla and pystencils‘ generated code and would allow pystencils_walberla to be used with other (non-Walberla) software. Right now, that interface uses `walberla::field::Field`.