waLBerla issues: https://i10git.cs.fau.de/walberla/walberla/-/issues

Issue 3: BoundaryHandling Documentation (Martin Bauer, updated 2023-05-02)
https://i10git.cs.fau.de/walberla/walberla/-/issues/3

Draft:
Section 1: Boundary Handling
- Basic Idea
- "copydoc" from FlagField: why FlagUID?
- FlagField, simple case: flag corresponds to boundary / domain
- generalize: boundary/domain can be masks
-> explain need for BoundaryUID vs. FlagUID
- side note: optimization with near-boundary flag,
  or list-based approach (describe how to switch)
- what is a consistent state?
- boundary concept: how to implement one's own boundary (copydoc / nice format)
- Practical Usage:
- why templates?
- setup using tuples
- set/force/remove
- boundary handling factories
- explain difference between Flag/Domain/Boundary: difference is only 'optimization level'
Section 2: Boundary Handling Collection
- reason "multiphysics simulations"
- explain using example: picture ( fluid, temperature )
- explain why setBoundary(FlagUID) does not exist
- do some crazy scenario setups....
Section 3: Initializers, i.e. what they can/cannot do.
- @bauer

(Milestone 7.1, assigned to Markus Holzer)

Issue 4: MPIManager createCartesianComm / useWorldComm / resetMPI (Martin Bauer, updated 2022-07-25)
https://i10git.cs.fau.de/walberla/walberla/-/issues/4

@schornbaum @godenschwager
Right now, in MPIManager one cannot first call useWorldComm and then createCartesianComm without calling resetMPI in between.
Is there any reason why one could not remove this restriction and do an implicit resetMPI in this case?
Then one could reconfigure the communicator more easily and could also initialize it by default, e.g. with useWorldComm.

(Milestone 7.1, assigned to Nils Kohl)

Issue 8: Problem with PythonCallbacks (Martin Bauer, updated 2022-07-25)
https://i10git.cs.fau.de/walberla/walberla/-/issues/8

- callbacks are set as attributes in the walberla_cpp.callbacks module
- if now two different modules with callbacks are imported using
PythonCallback cb1( "file1.py", "function1"); and
PythonCallback cb2( "file2.py", "function1");
- cb1() then actually runs the wrong callback, cb2, since cb2 overrides the walberla_cpp.callbacks.function1 entry

(Milestone 7.1, assigned to Markus Holzer)

Issue 10: Multigrid PDE solver (Michael Kuron, updated 2023-03-29)
https://i10git.cs.fau.de/walberla/walberla/-/issues/10

waLBerla should eventually have a multigrid PDE solver. This should also enable solving PDEs on refined grids. As requested in software/walberla#15, it should optionally be possible to specify the full sparse matrix for the left-hand side instead of just giving stencil weights.
Strategy:
1. [x] Basic multigrid with direct coarsening and no boundary conditions
* [x] Add stencil field support to existing PDE solvers (software/walberla#15), as this is required for the following steps.
* [x] Implement Galerkin Coarsening
* [x] Locally modify the stencil to take into account boundary conditions
* [ ] Grid refinement
* [ ] Make sure RBGS works on refined grids. We can assume that all blocks have the same coarsest resolution.
* [ ] Adapt V-cycle so that for its finest levels, it can use the block's resolution if no grid with the needed resolution is available.
* [ ] For dynamic refinement, the coarser levels of the V-cycle (solution, rhs, residual, stencil field) need to be recreated whenever the refinement changes.

(Assigned to Richard Angersbach)

Issue 12: Unify refinement selection for static and dynamic grid refinement (Michael Kuron, updated 2019-11-07)
https://i10git.cs.fau.de/walberla/walberla/-/issues/12

Currently, the static grid refinement's `SetupBlockForest` takes a *refinement selection function*, where markers are set to specify which parts of the domain need further refinement. The dynamic grid refinement's `BlockForest`, on the other hand, takes a *minimum target level determination function*, which directly specifies the desired level of refinement.
This means that switching from static to dynamic refinement requires implementing another function. It also means that simulations with dynamic refinement need to either implement both or only set up the root blocks as part of the static refinement and leave the actual work to the dynamic refinement. Static refinement should however not be dropped from Walberla as it may be sufficient for many simulations and has the advantage that the block forest setup can be performed independently of the simulation and saved to a file.
I therefore propose that an adapter be created that takes a *minimum target level determination function* and sets markers appropriately so that it can be used with static refinement. Alternatively, `SetupBlockForest` could be modified to accept a *minimum target level determination function* directly instead of requiring a *refinement selection function*.

Issue 13: Make the timeloop more powerful; lbm::refinement::TimeStep duplicates much of the functionality of timeloop::SweepTimeloop (Michael Kuron, updated 2019-11-07)
https://i10git.cs.fau.de/walberla/walberla/-/issues/13

For refined simulations, one needs to use `lbm::refinement::TimeStep`, which duplicates much of the functionality of the regular `timeloop::SweepTimeloop`. The `lbm::refinement::TimeStep` is then attached to a `timeloop::SweepTimeloop`. This is not very elegant, and the two should be merged so that one does not end up with multiple `timing::TimingPool` objects, gets the same amount of debug logging, etc. Further problems include that `lbm::refinement::TimeStep` does not provide callbacks in all desirable places (I would, for example, need one after each of `(startCommunication|wait)(CoarseToFine|FineToCoarse|EqualLevel)`).
Perhaps `lbm::refinement::TimeStep` can be replaced with a set of helper functions that add the appropriate functors to an (enhanced) `timeloop::SweepTimeloop`.
Additionally, `timeloop::SweepTimeloop` should itself conform to the concept of `timeloop::Timeloop::SelectableFunc` so that timeloops can be nested. This would be useful for things that have internal iterations, like the PDE solvers.

(Assigned to Christoph Rettinger)

Issue 14: field::refinement::PackInfo should permit specifying number of ghost layers to communicate (Michael Kuron, updated 2019-11-07)
https://i10git.cs.fau.de/walberla/walberla/-/issues/14

For uniform grids, `field::communication::PackInfo`'s constructor has an optional argument `numberOfGhostLayers` that specifies how many ghost layers should be exchanged. If it is not given, then all ghost layers are exchanged.
For refined grids, `field::refinement::PackInfo`, which otherwise has identical functionality to its non-refined counterpart, does not permit specifying the number of ghost layers to exchange. In fact, it does not even exchange all ghost layers, but only communicates one layer of the coarse field to two layers of the fine field. Functionality to specify the number of ghost layers should be added. For consistency with non-refined simulations, the number of ghost layers should be specified from the perspective of the coarse field: if `2` is given, 2 ghost layers of the coarse grid should exchange data with 4 ghost layers of the fine grid.

Issue 25: walberla::geometry::getClosestLineBoxPoints produces wrong results (Tobias Leemann, updated 2021-03-29)
https://i10git.cs.fau.de/walberla/walberla/-/issues/25

For some combinations of Capsules and Boxes, the analytical collision detection of pe seems to yield wrong results.
This is probably caused by a bug in the function `walberla::geometry::getClosestLineBoxPoints`.
Code Example to reproduce:
The Capsule has radius 3 and length 6 and is neither rotated nor translated, so its center line runs from (-3,0,0) to (3,0,0).
The Box has size (3, 1.5, 1.5) and is rotated and translated as specified below.
```
Vec3 box_pos(real_t(-2.46404), real_t(2.90053), real_t(-2.177));
Quat box_rot(real_t(0.712691),real_t(-0.509519),real_t(0.0466145), real_t(0.479884));
//Corresponding Bodies
Box box(0, 0, box_pos, Vec3(0,0,0), box_rot, Vec3(3, real_t(1.5), real_t(1.5)), iron, false, true, false);
Capsule cap(1, 1, Vec3(0,0,0), Vec3(0,0,0), Quat(), real_t(3.0), real_t(6.0), iron, false, true, false);
//This is how getClosestLineBoxPoints is called in the collide-function:
Vec3 line_point, box_point;
walberla::geometry::getClosestLineBoxPoints( Vec3(3,0,0), Vec3(-3,0,0), box_pos, box_rot.toRotationMatrix(),
Vec3(real_t(3), real_t(1.5), real_t(1.5)), line_point, box_point);
std::cerr << "Result: Line-Point " << line_point << ", Box-Point " << box_point << std::endl;
```
This results in the following output:
`Result: Line-Point <3,0,0>, Box-Point <-1.66899,2.22324,-1.94998>`
This result is wrong because, if you rotate and translate the corner (-1.5,-0.75,-0.75) (in box-local coordinates), e.g. by
`std::cerr << "Corner Position of Box: " << box_pos+box_rot.rotate(Vec3(real_t(-1.5), real_t(-0.75), real_t(-0.75))) << std::endl;`
you obtain the point (-2.40108,1.35235,-1.18999) which is the nearest point of the box to the line. The nearest point on the line would then be (-2.40108, 0, 0).
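The corner claim can be checked independently with a few lines of Python (a sketch outside waLBerla, assuming pe's Quat stores (w, x, y, z) and rotates vectors actively):

```python
# Independent check of the corner position claimed above, using plain
# Python instead of the pe. Assumes walberla's Quat stores (w, x, y, z)
# and performs an active rotation.

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z):
    v' = v + 2*w*(qv x v) + 2*(qv x (qv x v))."""
    w, qx, qy, qz = q
    def cross(a, b):
        return (a[1]*b[2] - a[2]*b[1],
                a[2]*b[0] - a[0]*b[2],
                a[0]*b[1] - a[1]*b[0])
    qv = (qx, qy, qz)
    t = tuple(2.0 * c for c in cross(qv, v))
    u = cross(qv, t)
    return tuple(v[i] + w * t[i] + u[i] for i in range(3))

box_pos = (-2.46404, 2.90053, -2.177)
box_rot = (0.712691, -0.509519, 0.0466145, 0.479884)
corner_local = (-1.5, -0.75, -0.75)

corner_world = tuple(p + r for p, r in
                     zip(box_pos, quat_rotate(box_rot, corner_local)))
print(corner_world)  # approximately (-2.40108, 1.35235, -1.18999)
```

This reproduces the corner position (-2.40108, 1.35235, -1.18999) stated above, independently of the pe code.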
The return of the incorrect points in this case results in a failure to detect the collision between the Box and the Capsule (which has a large penetration depth of about 1.1986).

Issue 37: Change return type of FieldIterator conversion (iterator to const_iterator) (Tobias Schruff, updated 2022-07-25)
https://i10git.cs.fau.de/walberla/walberla/-/issues/37

In my code, I could not do the following
```c
template< typename Field_T >
void foo( Field_T & field )
{
auto * buffer = field.clone();
for( typename Field_T::const_base_iterator it = field.begin(); ... )
{
    buffer->get( it.cell() ) = ... calculate something based on *it (plus some neighbors) ...
}
field.swapDataPointers( buffer );
}
```
because the compiler will not create a const_iterator (or const_base_iterator) from a non-const field. I'm not sure if this is a bug (or a feature ;) ), but it would be handy to be able to create a const_iterator from a non-const field in some situations, like the one above. Or can `foo` be implemented in a better/more efficient way?
**Proposal**
If the custom FieldIterator conversion is changed from
```c
operator const FieldIterator<const T, fieldFSize> & () const {
const FieldIterator<const T, fieldFSize> * ptr;
ptr = reinterpret_cast< const FieldIterator<const T, fieldFSize>* > ( this );
return *ptr;
}
```
to
```c
operator FieldIterator<const T, fieldFSize> () {
FieldIterator<const T, fieldFSize> * ptr;
ptr = reinterpret_cast< FieldIterator<const T, fieldFSize>* > ( this );
return *ptr;
}
```
the problem disappears (at the cost of creating a copy of the iterator).

(Milestone 7.1, assigned to Markus Holzer)

Issue 40: Write a test to check bounding boxes (Sebastian Eibl, updated 2022-07-25, milestone 7.1)
https://i10git.cs.fau.de/walberla/walberla/-/issues/40

Issue 52: Rework config file and command line argument substitution (Christoph Rettinger, updated 2022-07-25)
https://i10git.cs.fau.de/walberla/walberla/-/issues/52

It is possible to specify a config file and further command line arguments that replace variables set in the config file, allowing for parameter studies with the same config file. See core/Environment.h and core/config/Create.h.
However, the documentation is misleading (which argument order is expected: "configFile var1 var2" or "var1 var2 configFile"?).
Also, there seem to be two replacement mechanisms:
Directly in the create function (Create.cpp) via
`config->addValueReplacement( &argv[i][1], argv[i+1] )`
and then in the subsequently called function
`substituteCommandLineArgs( *config, argc, argv )`
Both seem to require differently formatted command line arguments.
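For illustration, the kind of substitution both mechanisms aim at could look like this (a plain-Python sketch of the idea; this is not waLBerla's actual parser, and the `-key value` argument format is made up):

```python
# Sketch of config-value substitution from command line arguments.
# NOT waLBerla's implementation; it only illustrates overriding
# config-file values for parameter studies.

def apply_overrides(config, args):
    """Apply overrides given as '-key value' pairs, e.g. ['-omega', '1.8']."""
    it = iter(args)
    for key in it:
        if not key.startswith("-"):
            raise ValueError(f"expected '-key value' pairs, got {key!r}")
        config[key[1:]] = next(it)  # override the config-file value
    return config

config = {"omega": "1.6", "timesteps": "10"}   # as read from the config file
apply_overrides(config, ["-omega", "1.8"])     # command line wins
print(config["omega"])  # 1.8
```

A single, clearly documented mechanism of this kind would avoid the ambiguity between the two existing code paths.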
Also, it seems ambiguous to have two mechanisms achieving the same thing.

(Milestone 7.1, assigned to Christoph Rettinger)

Issue 77: Replace iterator macros with lambdas (Michael Kuron, updated 2022-07-25)
https://i10git.cs.fau.de/walberla/walberla/-/issues/77

Original inspiration: OpenMP pragmas inside WALBERLA_FOR_ALL_CELLS_ macros are broken (#5). Also, using commas outside of parentheses (e.g. when separating multiple template parameters) in the body of a WALBERLA_FOR_ALL_CELLS_ macro leads to confusing error messages.
Macros are not very C++-like. Since C++11 brought constructs like `std::for_each` and lambdas, there is a good language-level alternative.
There might be performance implications when a compiler doesn't properly inline the lambda. Recent compilers should be okay, because many STL algorithms depend on this too, but it needs to be checked that performance doesn't degrade.

(Milestone 7.1)

Issue 78: CMake cleanup/rewrite (Michael Kuron, updated 2022-07-25)
https://i10git.cs.fau.de/walberla/walberla/-/issues/78

- Decide on a minimum CMake version (e.g. 3.10, as included in Ubuntu 18.04) and then make use of all the new features available to simplify our build system.
- Make it easier to use waLBerla as a library in external projects.
- Rewrite the whole build system according to current CMake best practices.
- Eliminate all the hacks for library detection on old systems.
- Make each submodule either header-only or completely compiled into a library (e.g. using explicit template instantiation if the set of possible types is small). The current state, where most submodules are roughly 10% compiled, just combines the downsides of both (long compile times plus a linking requirement).

(Milestone 7.1)

Issue 81: pe_coupling tutorial (Michael Kuron, updated 2022-07-25, milestone 7.1, assigned to Christoph Rettinger)
https://i10git.cs.fau.de/walberla/walberla/-/issues/81

There should be a pe_coupling tutorial that shows how to set up a simulation with the momentum exchange method. Right now there are separate tutorials for pe and lbm, but coupling the two is highly nontrivial for a novice user.

Issue 89: make UniqueID stateless (Sebastian Eibl, updated 2022-08-11)
https://i10git.cs.fau.de/walberla/walberla/-/issues/89

Currently, UniqueID has state which is not stored in checkpoint & restore scenarios. This may cause id collisions in a restarted simulation.
I think this is only relevant for the pe, when you create new particles after a restart.
(Milestone 7.1)

Issue 94: Zero-centering of stored PDF values is not well documented and unclear (Christoph Rettinger, updated 2022-11-17)
https://i10git.cs.fau.de/walberla/walberla/-/issues/94

In waLBerla, the LBM PDF values are either stored regularly, i.e. textbook-like, or "centered around 0", i.e. storing only the deviation from the corresponding lattice weight.
The decision is made based on the lattice model's compressible flag: true leads to regular PDF values, whereas false (= incompressible) leads to centered PDF values.
Seemingly, the only place where this is noted is a comment in src/lbm/field/PdfField.h, but it is a very important aspect to be aware of when implementing one's own algorithms etc.
This should be documented better in some central place or a tutorial (and this issue is supposed to be a temporary documentation of this behavior).
Furthermore, it should be evaluated whether or not this centering might also make sense for the compressible case, as the only reason for centering seems to be floating point accuracy. Or is there another reason not to do it?
For clearer self-documentation, one should add a flag like 'zero-centered' to the lattice model, and this flag, rather than the compressible flag, should be used to check for the centering and its implications.

(Milestone 7.1, assigned to Christoph Rettinger)

Issue 108: Make Codegen Safer (Christoph Rettinger, updated 2020-03-18)
https://i10git.cs.fau.de/walberla/walberla/-/issues/108

By introducing more and more codegen (lbmpy/pystencils, but also mesa_pd) in waLBerla, the problem arises that changes made in lbmpy or pystencils might ultimately also affect the outcome of the codegen procedure, i.e. the generated kernel/lattice model.
The problem is not only that these changes are made in other repositories, but also that changes in some files there cannot easily be linked to the functionality used in the codegen scripts, i.e. it is not intuitively clear from looking at commit changes that they might affect my generated kernel.
One solution could be: if I want to make sure that a certain kernel does exactly what I have in mind, I define a certain input (e.g. 27 PDFs in a single cell) and compute the output with a kernel that I know is correct.
This pair of input and output is then put inside the codegen script file and every time I generate a kernel, it is checked that the output of the newly generated kernel matches the formerly given one.
In some cases, this could also be done symbolically, but then the issue arises that different optimizations might lead to a different symbolic representation, even though the outcome of the actual computation is still the same.
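The input/output idea from the previous paragraphs could be sketched like this (plain Python; `check_kernel` and the toy kernel are illustrations, not existing lbmpy_walberla API):

```python
# Sketch of the proposed kernel regression check: a fixed input and its
# known-correct output are stored next to the codegen script, and every
# generation run re-checks the freshly generated kernel against them.
# The kernel here is a toy stand-in, NOT a real LB kernel.

def toy_kernel(pdfs, omega=1.9):
    """Toy 'collision': relax every value towards the mean."""
    mean = sum(pdfs) / len(pdfs)
    return [f + omega * (mean - f) for f in pdfs]

def check_kernel(kernel, reference_input, reference_output, tol=1e-12):
    """Return True if the kernel still reproduces the recorded output."""
    result = kernel(reference_input)
    return all(abs(a - b) <= tol for a, b in zip(result, reference_output))

# Reference pair, recorded once when the kernel was known to be correct:
ref_in  = [0.1, 0.2, 0.3, 0.4]
ref_out = toy_kernel(ref_in)

# Every later generation run asserts unchanged behavior:
ok = check_kernel(toy_kernel, ref_in, ref_out)
print(ok)  # True
```

An upstream change that silently altered the kernel's arithmetic would make `check_kernel` return False and fail the codegen script.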
So it would be great to introduce a checkKernel(kernel, input, output) function, e.g. in lbmpy_walberla. Then, initial (and correct) input-output pairs would have to be defined for the currently existing codegen targets in waLBerla, and the check function would be used in all codegen scripts to assert that the behavior is still the same. The input data set should be complex enough to cover most cases (e.g. not all zeros), maybe even random.

Issue 109: Use mdspan instead of Field for portability of codegen beyond waLBerla (Michael Kuron, updated 2022-09-14)
https://i10git.cs.fau.de/walberla/walberla/-/issues/109

@bauer pointed out that C++23 will have `std::mdspan`, a non-owning multidimensional array view. A reference implementation that works with C++11 and higher is already available: https://github.com/kokkos/mdspan.
This could be used at the interface between waLBerla and pystencils' generated code and would allow pystencils_walberla to be used with other (non-waLBerla) software. Right now, that interface uses `walberla::field::Field`.

Issue 113: Provide same behavior of generated lattice models as built-in ones (Christoph Rettinger, updated 2022-07-25)
https://i10git.cs.fau.de/walberla/walberla/-/issues/113

When generating a lattice model with lbmpy_walberla, the behavior is in some cases not the same as for the waLBerla built-in lattice models.
This is a (possibly incomplete) list of differences:
* E.g. for checkpointing, the PDF field is saved to a file, and at the same time the lattice model is also packed. In waLBerla, the lattice model has a valid `pack` method (which calls the `pack` functions of the collision and force models) that really packs all information (relaxation rates, forces, ...) of the lattice model into the buffer. When reading the checkpoint file, these parameters will be unpacked and initialized correctly. This is NOT the case for the generated one: since no parameters are packed/unpacked, the user has to take care to properly initialize the lattice model's parameters.
* The same problem is probably also relevant for adaptive mesh refinement when PDF field block data is sent from one block to another, together with the lattice model.
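What the built-in `pack`/`unpack` achieves can be sketched with Python's `struct` module (an illustration of the concept, not waLBerla's buffer API; the parameter layout here is made up):

```python
import struct

# Sketch of why packing lattice-model parameters matters for checkpointing:
# the built-in models serialize their parameters (relaxation rates, forces,
# ...) into the buffer, so unpacking restores a fully initialized model.

def pack_lattice_model(omega, force):
    # 1 relaxation rate + 3 force components, little-endian doubles
    return struct.pack("<4d", omega, *force)

def unpack_lattice_model(buf):
    omega, fx, fy, fz = struct.unpack("<4d", buf)
    return omega, (fx, fy, fz)

buf = pack_lattice_model(1.9, (0.0, 0.0, -9.81e-5))
omega, force = unpack_lattice_model(buf)
print(omega, force)  # 1.9 (0.0, 0.0, -9.81e-05)
```

A generated lattice model that skips this step deserializes with default-initialized parameters, which is exactly the silent-failure mode described above.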
As these things can easily go completely unnoticed, this is considered highly dangerous and should be changed.

(Milestone 7.1, assigned to Christoph Rettinger)

Issue 126: Investigate usage of CUDA graphs to reduce kernel call overhead (Stephan Seitz, updated 2020-07-26)
https://i10git.cs.fau.de/walberla/walberla/-/issues/126

Marco had the idea to use this in the CUDA backend for Petalisp, but maybe it's also useful in waLBerla timeloops.
https://developer.nvidia.com/blog/cuda-graphs/
It effectively reduces the call overhead of kernels when you repeatedly launch the same group of kernels. Maybe not that relevant here, since the bottleneck should be communication rather than managing the CUDA kernels, and our kernels are not exactly "short-running".

Issue 130: Improve pe Union (Michael Kuron, updated 2023-05-02)
https://i10git.cs.fau.de/walberla/walberla/-/issues/130

Current restrictions that I noticed:
- [ ] A union of two overlapping bodies has an incorrect volume, mass and inertia tensor. Overlap should either be checked for and warned about, or the overlap volume should be estimated numerically and mass and inertia be corrected accordingly.
- [ ] Cryptic errors happen when a union is created from bodies spread across a block boundary.
Further restrictions that @eibl mentioned to me:
- [ ] Dynamic creation and splitting of unions in parallel is problematic.
- [ ] Unclear collision handling because concave objects can collide in multiple points simultaneously.
Regarding the last point: I guess you can even have simultaneous collisions without unions, e.g. cube-cube and cube-plane (face to face), cube-cylinder, plane-cylinder and cylinder-cylinder (face to side or face to face), or sphere-torus (moving the sphere through the center of the torus).

Issue 139: Enable Python and codegen by default (Michael Kuron, updated 2022-07-25)
https://i10git.cs.fau.de/walberla/walberla/-/issues/139

Enable `WALBERLA_BUILD_WITH_PYTHON`/`WALBERLA_BUILD_WITH_CODEGEN` by default if Python/lbmpy are found, respectively.
Follow-up to !352

(Milestone 7.1, assigned to Dominik Thoennes)

Issue 145: reduceOverParticles (Sebastian Eibl, updated 2021-03-30)
https://i10git.cs.fau.de/walberla/walberla/-/issues/145

```
auto kinEnergy = ps.reduceOverParticles(.., SUM, [&](auto p_idx)
{return 0.5_r * ac.getMass(p_idx) * ac.getLinearVelocity(p_idx) * ac.getLinearVelocity(p_idx);});
```

Issue 147: Don't call resetForceAndTorque inside DEM (Michael Kuron, updated 2021-05-20)
https://i10git.cs.fau.de/walberla/walberla/-/issues/147

The DEM solver contains
```
// Resetting the acting forces
bodyIt->resetForceAndTorque();
[...]
// Resetting the acting forces
bodyIt->resetForceAndTorque();
```
which I don't think should be there, and it definitely shouldn't reset the force twice. HCSITS does not reset the force itself; that's what ForceTorqueOnBodiesResetter is for.

(Assigned to Christoph Rettinger)

Issue 161: Documentation of Python setup workflow (Helen Schottenhamml, updated 2021-09-30)
https://i10git.cs.fau.de/walberla/walberla/-/issues/161

To help the standard user set up their Python environment properly for usage with pystencils/lbmpy, documentation in the README would be helpful.

Issue 163: Usage of NCCL (Markus Holzer, updated 2021-10-08)
https://i10git.cs.fau.de/walberla/walberla/-/issues/163

NVIDIA NCCL provides a collective communication library. This could give a performance boost for multi-GPU computations and be better than CUDA-aware MPI.
https://developer.nvidia.com/nccl
https://docs.nvidia.com/deeplearning/nccl/install-guide/index.html

(Assigned to Markus Holzer)

Issue 170: Usability CodeGen boundaries (Markus Holzer, updated 2023-04-05)
https://i10git.cs.fau.de/walberla/walberla/-/issues/170

Especially when a boundary with additional data is generated, it is rather hard to understand how this should be done in the Python script.
Something like a wrapper around the boundaries would maybe improve the situation. For example, the UBB can be generated in this way:
```python
ubb_dynamic = UBB(lambda *args: None, dim=stencil.D)
ubb_data_handler = UBBAdditionalDataHandler(stencil, ubb_dynamic)
```
However, the empty callback with the lambda and the `UBBAdditionalDataHandler` are cryptic, and there is absolutely no alternative to the form shown above.
Thus a thin wrapper like `DynamicUBB(stencil=stencil)` could improve this situation quite a lot.
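A sketch of what such a wrapper could look like (`DynamicUBB` is the hypothetical helper; `UBB`, `UBBAdditionalDataHandler` and the stencil are stubbed with minimal stand-ins so the sketch is self-contained, whereas a real implementation would use the lbmpy classes):

```python
# Hypothetical convenience wrapper hiding the empty-lambda boilerplate.
# UBB, UBBAdditionalDataHandler and Stencil are minimal stand-ins for the
# lbmpy classes, only so that this sketch runs on its own.

class UBB:
    def __init__(self, velocity_callback, dim):
        self.velocity_callback = velocity_callback
        self.dim = dim

class UBBAdditionalDataHandler:
    def __init__(self, stencil, boundary):
        self.stencil = stencil
        self.boundary = boundary

class Stencil:
    D = 3

def DynamicUBB(stencil):
    """Return the (boundary, data handler) pair needed for generate_boundary."""
    ubb = UBB(lambda *args: None, dim=stencil.D)  # velocity is set later, at runtime
    return ubb, UBBAdditionalDataHandler(stencil, ubb)

ubb_dynamic, ubb_data_handler = DynamicUBB(Stencil())
print(ubb_dynamic.dim)  # 3
```

The user-facing call collapses to one line, and the empty-lambda idiom stays hidden inside the wrapper.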
The reason this has to be written in such a cryptic way in the first place is that lbmpy must also function as a standalone framework besides waLBerla. Thus certain design decisions are motivated by the Python world and conflict with the codegen for waLBerla.

(Assigned to Markus Holzer)

Issue 173: Poor user experience with generated DynamicUBB (UBB + additional_data_handler) (Nigel Overmars, updated 2023-05-02)
https://i10git.cs.fau.de/walberla/walberla/-/issues/173

For my purposes, I need to be able to set a certain inflow profile, for example a variant of a Poiseuille flow where the velocity changes at every time step.
Currently, this is possible when using generated sweeps, but only with a few nasty hacks, for both making it temporally varying and spatially varying. To be able to use a DynamicUBB, which was generated via
```
ubb_dynamic = UBB(lambda *args: None, dim=stencil.D)
ubb_data_handler = UBBAdditionalDataHandler(stencil, ubb_dynamic)
# UBB with user-defined velocity profile
generate_boundary(ctx, "DynamicUBB", ubb_dynamic, lbm_method,
additional_data_handler=ubb_data_handler, target=target)
```
one needs to pass a `std::function< Vector3< real_t >(const Cell&, const shared_ptr< StructuredBlockForest >&, IBlock&) >` to the constructor of the DynamicUBB type. The easiest way to create one is via a functor, for which a template looks like this:
```
class InflowProfile
{
public:
Vector3< real_t > operator()( const Cell& pos, const shared_ptr< StructuredBlockForest >& SbF, IBlock& block ) {
// return velocity vector depending on the cell location in the SbF
return getVelocityVector(pos, SbF, block);
}
};
```
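As an illustration of what such a functor typically computes, here is the same idea reduced to plain Python (the parabolic profile and the cell/channel parameters are hypothetical stand-ins, not waLBerla API):

```python
def poiseuille_velocity(y_cell, channel_height, u_max):
    """Parabolic (Poiseuille-like) inflow profile: zero velocity at the
    walls, u_max in the channel center, flow in x-direction only."""
    y = (y_cell + 0.5) / channel_height  # cell-center position normalized to (0, 1)
    ux = 4.0 * u_max * y * (1.0 - y)     # parabola with maximum u_max at y = 0.5
    return (ux, 0.0, 0.0)
```

The C++ functor above would evaluate the same formula after converting the `Cell` index to a global coordinate via the block forest.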
**Temporally varying inflow profile**
The problem is that, in the current version of waLBerla/lbmpy, the additional data handler is only called once, before the simulation runs, namely when the generated member function
```
template<typename FlagField_T>
void DynamicUBB::fillFromFlagField( const shared_ptr<StructuredBlockForest> & blocks, ConstBlockDataID flagFieldID,
FlagUID boundaryFlagUID, FlagUID domainFlagUID)
```
is called. After these values have been set, there is currently no easy way to update them at every time step. As a workaround, one can add the aforementioned member function via `addFuncAfterTimeStep` to the SweepTimeLoop and add a `TimeTracker` object to the functor, which allows updating the values depending on the current time step. This approach does appear to incur a performance penalty, as the member function `DynamicUBB::fillFromFlagField` appears to do some work which should be unnecessary after an initial run. I think, but have not tested it, that this also makes running on the GPU impossible or significantly slower.
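The `TimeTracker` workaround boils down to giving the callback shared mutable state that the time loop advances each step. Reduced to plain Python (all names hypothetical, only the pattern matters):

```python
import math

class TimeTracker:
    """Mutable counter shared between the time loop and the callback."""
    def __init__(self):
        self.step = 0

    def advance(self):
        self.step += 1

class PulsatileInflow:
    """Inflow whose magnitude oscillates with the tracked time step."""
    def __init__(self, tracker, u_max, period):
        self.tracker = tracker
        self.u_max = u_max
        self.period = period

    def __call__(self, cell):
        phase = 2.0 * math.pi * self.tracker.step / self.period
        return (self.u_max * math.sin(phase), 0.0, 0.0)
```

In waLBerla, `advance()` would be driven by the time loop, and the functor would be re-evaluated by re-running `fillFromFlagField` after each step, which is exactly the expensive part criticized above.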
**Spatially varying inflow profile**
Currently, the cells being passed to the functor's `operator()` are different from the ones I expected to get. The problem is in the following generated code:
```
// for every cell in the block
if ( isFlagSet( it.neighbor(1, 0, 0 , 0 ), boundaryFlag ) )
{
auto element = IndexInfo(it.x(), it.y(), it.z(), 0 );
Vector3<real_t> InitialisatonAdditionalData = elementInitaliser(Cell(it.x(), it.y(), it.z()), blocks, *block);
element.vel_0 = InitialisatonAdditionalData[0];
element.vel_1 = InitialisatonAdditionalData[1];
element.vel_2 = InitialisatonAdditionalData[2];
// snip
}
// Similar code for all directions, e.g. 27 in total for a D3Q27 stencil
```
Here the function `elementInitialiser` is our functor's `operator()`.
In essence, the code checks whether the current cell has a neighbor (in the case of an inflow boundary) for which the inflowFlag is set. If so, the functor is applied to this cell, i.e. the cell whose neighbor has the inflowFlag set, not the cell carrying the inflowFlag itself, and the results (i.e. the computed velocity vector) are stored in the appropriate location.
Currently, I am working around this problem by manually adding the direction to the cell on which the functor is applied. Unfortunately, due to the generated nature of the code, this is not a real solution...
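The manual workaround described above amounts to shifting the fluid-side cell by the stencil direction of the `isFlagSet` check to recover the boundary cell itself; in plain Python terms (hypothetical tuples, not the waLBerla `Cell` type):

```python
def boundary_cell(fluid_cell, direction):
    """Shift the fluid cell next to the boundary by the stencil direction
    of the flag check to obtain the boundary cell itself."""
    return tuple(c + d for c, d in zip(fluid_cell, direction))
```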
In my use case, certain noise values are generated for every cell for every time step in the inflow boundary via a Python program and are then read from a .json file. These values are then put into a `std::unordered_map<GlobalCellCoordinatesTuple, Vector3>` and read during the boundary treatment. Without the aforementioned fix there is a mismatch between the cell coordinates I put into the `std::unordered_map<..,..>` (the ones for which the inflowFlag is set) and the cell coordinates that get passed to the functor, resulting in wrong behavior.
**Suggestions**
- Allow the user to specify whether they want the boundary values to be updated at every time step. The user would then just have to add a `TimeTracker` to the functor and could update the values in an easy manner.
- Allow the user to select whether the cells passed to the functor's `operator()` are 1) the neighboring cells, as is done right now, or 2) the actual cells for which the checked flag is set.Markus HolzerMarkus Holzerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/176Check and validate implementation of Guo force model with TRT/MRT2023-01-27T07:59:16+01:00Christoph SchwarzmeierCheck and validate implementation of Guo force model with TRT/MRTI tested waLBerla's TRT collision model with the compressible D3Q19 velocity set in waLBerla's free surface LBM module (not yet open source).
When using the `GuoConstant` force model, I noticed two things that are not present when using the `SimpleConstant` force model:
1. the physical results significantly differ from SRT (same even order relaxation rate, arbitrary "Magic" parameter)
2. the physical results change significantly with small changes in TRT's (even order) relaxation rate
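For reference, the forcing term under test is usually written (in the SRT form, following Guo et al.; symbols as in the standard LBM literature, not taken from the waLBerla source):

```latex
F_i \;=\; w_i \left(1 - \frac{1}{2\tau}\right)
\left[ \frac{\mathbf{e}_i - \mathbf{u}}{c_s^2}
     + \frac{(\mathbf{e}_i \cdot \mathbf{u})}{c_s^4}\,\mathbf{e}_i \right]
\cdot \mathbf{F},
\qquad
\mathbf{u} \;=\; \frac{1}{\rho}\left( \sum_i f_i \mathbf{e}_i + \frac{\Delta t}{2}\,\mathbf{F} \right)
```

For TRT/MRT, the scalar prefactor $(1 - 1/(2\tau))$ is replaced by $(1 - s_k/2)$ applied per moment, which is precisely the part worth validating here.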
These observations should not be related to the free surface LBM implementation and we should therefore
- [x] create a simple test case to validate the `GuoConstant` force model when using the TRT/MRT collision operators (without free surface LBM)
- [x] try to reproduce this issue (without free surface LBM)
- [ ] check if the `GuoConstant` and `GuoField` force models are implemented correctly for TRT/MRT collision operatorsJonas PlewinskiJonas Plewinskihttps://i10git.cs.fau.de/walberla/walberla/-/issues/183MPI_ERR_TRUNCATE in WcTimingPool2022-03-24T14:06:22+01:00Daniel BauerMPI_ERR_TRUNCATE in WcTimingPoolI use mesh refinement with a refinement time step.
Additionally, I create two [WcTimingPools](https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/core/timing/TimingPool.h) and pass them to the refinement time step.
After running the simulation, I use [`logResultOnRoot()`](https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/core/timing/TimingPool.cpp#L411) to print the time measurements.
```
auto timingRef = std::make_shared<WcTimingPool>();
auto timingRefLvl = std::make_shared<WcTimingPool>();
⋮
// setup timeloop
refinementTimeStep->enableTiming(timingRef, timingRefLvl);
timeloop->addFuncBeforeTimeStep(makeSharedFunctor(refinementTimeStep), str::refinementTimeStep);
⋮
// run timeloop
auto timing = std::make_shared<WcTimingPool>();
timeloop->run(*timing);
// print timings
timing ->logResultOnRoot();
timingRef ->logResultOnRoot();
timingRefLvl->logResultOnRoot();
```
I run my simulation with 288 processes on 8 nodes and get the following behavior:
The first two timings (`timing` and `timingRef`) print the results as expected.
The level wise timing fails to print with the error:
```
[node098:207482] * An error occurred in MPI_Reduce
[node098:207482] * reported by process [3786932225,192]
[node098:207482] * on communicator MPI_COMM_WORLD
[node098:207482] * MPI_ERR_TRUNCATE: message truncated
[node098:207482] * MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node098:207482] * and potentially your MPI job)
[node089.cluster:17591] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 1741
[node089.cluster:17591] PMIX ERROR: UNREACHABLE in file server/pmix_server.c at line 1741
[node089.cluster:17591] 12 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[node089.cluster:17591] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
```
The error must come from [TimingPool.cpp:172-185](https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/core/timing/TimingPool.cpp#L172-L185).
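For what it is worth, `MPI_ERR_TRUNCATE` in a reduction usually indicates that the participating ranks disagree on the message size. For a timing pool reduced level-wise, this would happen if not every process holds the same set of timers. The invariant can be sketched without MPI (plain Python, hypothetical timer-pool dicts mapping timer names to measured seconds):

```python
def can_reduce(pools_per_rank):
    """An MPI reduction over timing pools is only well defined if every
    rank contributes the same timers (same names, hence same counts)."""
    keys = [tuple(sorted(pool)) for pool in pools_per_rank]
    return all(k == keys[0] for k in keys)
```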
I use gcc 8.3.0 and openmpi 3.1.5.https://i10git.cs.fau.de/walberla/walberla/-/issues/192Communication fails when using multiple blocks per process (GPU)2023-05-22T15:23:13+02:00Samuel KemmlerCommunication fails when using multiple blocks per process (GPU)The communication fails (i.e., communicates to the wrong position) when using multiple blocks per process (GPU). This only occurs between the process-local block 0 and process-local block 1.https://i10git.cs.fau.de/walberla/walberla/-/issues/194Sum of cell local overlap fractions can exceed 1 (which it should not) in the...2023-03-23T11:19:54+01:00Samuel KemmlerSum of cell local overlap fractions can exceed 1 (which it should not) in the pe couplingSince pe is deprecated this issue is more for documentation and can be removed as soon as there is a documentation page for known bugs.https://i10git.cs.fau.de/walberla/walberla/-/issues/196Unification GPU/Device Usage2023-04-04T09:44:48+02:00Markus HolzerUnification GPU/Device UsageThe GPU/CUDA backend of waLBerla is very thin at the moment. In order to integrate the device usage better, the following list of suggestions/observations should function as a guideline.
1. Like `MPI`, device functions should get a wrapper "environment", as done with `WALBERLA_MPI_SECTION`. Doing so, users would no longer need to work with `if defined` guards, as for example done here: https://i10git.cs.fau.de/walberla/walberla/-/blob/master/apps/benchmarks/FlowAroundSphereCodeGen/FlowAroundSphereCodeGen.cpp
2. The GPUField: https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/cuda/GPUField.h works differently in the sense that its interface is different (the f-size is not a template parameter), but it is also very cumbersome to have explicit GPU data with its explicit `addToStorage` functions. It would be simpler if the `Field` itself could synchronise its data and give back a device or host pointer depending on the situation. This goes along with #109, which suggests using `mdspan` for the Field data structure; `mdspan` provides this functionality right away.
3. Some device-specific implementations do not need to be specific. A good example is the communication scheme: https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/cuda/communication/UniformGPUScheme.h. In the end, the same `MPI` function is called, just with a device pointer instead of a host pointer (or even with a host pointer if GPUDirect is not available). Thus there is no need for this parallel world to exist here.
Another good example is the https://i10git.cs.fau.de/walberla/walberla/-/blob/master/src/cuda/communication/CustomMemoryBuffer.h which works almost exactly like the normal buffers, just with a slightly different API.https://i10git.cs.fau.de/walberla/walberla/-/issues/199Make CMake and CodeGen simpler2022-11-29T13:03:44+01:00Markus HolzerMake CMake and CodeGen simplerAt the moment, all generated files that come out of the CodeGen script need to be manually stated under `OUT_FILES` in `waLBerla_generate_target_from_python`. Example:
https://i10git.cs.fau.de/walberla/walberla/-/blob/master/apps/benchmarks/FlowAroundSphereCodeGen/CMakeLists.txt
This has some disadvantages:
1. It is always a bit clunky: as a user you first provide the names of the `OUT_FILES` as strings to the generation function, and then you need to copy all these names into the CMake file again.
2. It is error-prone in a few ways. The biggest issue is the correct file ending for GPU support. This was partially solved in !518; however, one still needs to think about whether a file is only generated for the CPU or can vary depending on the CMake configuration. Second, some generation functions like `generate_info_header` only produce a header file. The user must know this to write a correct CMake file on the first try ...
3. A third big problem is that every generation function can only produce a single file. Otherwise, it would be impossible for a user to know which files will come out in the end. Thus this system lacks flexibility as wellMarkus HolzerMarkus Holzerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/200GCC12: EquationSystem warning when using `DebugOptimized` and `OPTIMIZIE_FOR_...`2022-12-24T14:00:29+01:00Dominik Thoennesdominik.thoennes@fau.deGCC12: EquationSystem warning when using `DebugOptimized` and `OPTIMIZIE_FOR_LOCALHOST`The EquationSystem.cpp emits a `maybe-uninitialized` warning when built with `DebugOptimized` and `WALBERLA_OPTIMIZIE_FOR_LOCALHOST` on gcc12
```
Building CXX object src/core/CMakeFiles/core.dir/math/equation_system/EquationSystem.cpp.o
cd /build/src/core && /usr/local/bin/ccache g++ -DBOOST_ALL_NO_LIB -I/build/src -I/walberla/src -isystem /opt/boost/include -isystem /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi -isystem /usr/lib/x86_64-linux-gnu/openmpi/include -isystem /opt/openmesh/include -Wall -Wconversion -Wshadow -march=native -Wfloat-equal -Wextra -pedantic -D_GLIBCXX_USE_CXX11_ABI=1 -pthread -g -O3 -std=c++17 -o CMakeFiles/core.dir/math/equation_system/EquationSystem.cpp.o -c /walberla/src/core/math/equation_system/EquationSystem.cpp
In file included from /opt/boost/include/boost/graph/depth_first_search.hpp:21,
from /opt/boost/include/boost/graph/max_cardinality_matching.hpp:21,
from /walberla/src/core/math/equation_system/EquationSystem.cpp:39:
In constructor 'boost::bgl_named_params<T, Tag, Base>::bgl_named_params(T, const Base&) [with T = boost::vec_adj_list_vertex_id_map<boost::no_property, long unsigned int>; Tag = boost::vertex_index_t; Base = boost::bgl_named_params<boost::detail::odd_components_counter<long unsigned int>, boost::graph_visitor_t, boost::no_property>]',
inlined from 'boost::bgl_named_params<PType, boost::vertex_index_t, boost::bgl_named_params<T, Tag, Base> > boost::bgl_named_params<T, Tag, Base>::vertex_index_map(const PType&) const [with PType = boost::vec_adj_list_vertex_id_map<boost::no_property, long unsigned int>; T = boost::detail::odd_components_counter<long unsigned int>; Tag = boost::graph_visitor_t; Base = boost::no_property]' at /opt/boost/include/boost/graph/named_function_params.hpp:217:5,
inlined from 'static bool boost::maximum_cardinality_matching_verifier<Graph, MateMap, VertexIndexMap>::verify_matching(const Graph&, MateMap, VertexIndexMap) [with Graph = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS>; MateMap = long unsigned int*; VertexIndexMap = boost::vec_adj_list_vertex_id_map<boost::no_property, long unsigned int>]' at /opt/boost/include/boost/graph/max_cardinality_matching.hpp:779:61,
inlined from 'bool boost::matching(const Graph&, MateMap, VertexIndexMap) [with Graph = adjacency_list<vecS, vecS, undirectedS>; MateMap = long unsigned int*; VertexIndexMap = vec_adj_list_vertex_id_map<no_property, long unsigned int>; AugmentingPathFinder = edmonds_augmenting_path_finder; InitialMatchingFinder = extra_greedy_matching; MatchingVerifier = maximum_cardinality_matching_verifier]' at /opt/boost/include/boost/graph/max_cardinality_matching.hpp:807:79,
inlined from 'bool boost::checked_edmonds_maximum_cardinality_matching(const Graph&, MateMap, VertexIndexMap) [with Graph = adjacency_list<vecS, vecS, undirectedS>; MateMap = long unsigned int*; VertexIndexMap = vec_adj_list_vertex_id_map<no_property, long unsigned int>]' at /opt/boost/include/boost/graph/max_cardinality_matching.hpp:817:48,
inlined from 'bool boost::checked_edmonds_maximum_cardinality_matching(const Graph&, MateMap) [with Graph = adjacency_list<vecS, vecS, undirectedS>; MateMap = long unsigned int*]' at /opt/boost/include/boost/graph/max_cardinality_matching.hpp:824:56,
inlined from 'void walberla::math::EquationSystem::match()' at /walberla/src/core/math/equation_system/EquationSystem.cpp:97:4:
/opt/boost/include/boost/graph/named_function_params.hpp:192:56: warning: '*(unsigned char*)((char*)&occ + offsetof(boost::detail::odd_components_counter<long unsigned int>,boost::detail::odd_components_counter<long unsigned int>::m_parity))' may be used uninitialized [-Wmaybe-uninitialized]
192 | bgl_named_params(T v, const Base& b) : m_value(v), m_base(b) {}
| ^~~~~~~~~
/opt/boost/include/boost/graph/max_cardinality_matching.hpp: In member function 'void walberla::math::EquationSystem::match()':
/opt/boost/include/boost/graph/max_cardinality_matching.hpp:778:52: note: '*(unsigned char*)((char*)&occ + offsetof(boost::detail::odd_components_counter<long unsigned int>,boost::detail::odd_components_counter<long unsigned int>::m_parity))' was declared here
778 | detail::odd_components_counter< v_size_t > occ(num_odd_components);
|
```https://i10git.cs.fau.de/walberla/walberla/-/issues/204Is there a LBM grid refinement example on the GPU?2023-09-27T21:54:03+02:00ahmedIs there a LBM grid refinement example on the GPU?Thanks a lot for making this great library open-source with such high-quality code!
I was wondering if there's a 3D LBM grid refinement example that runs on the GPU. I saw an LBM grid refinement example under `apps\benchmarks\AdaptiveMeshRefinementFluidParticleCoupling`, but I think it only runs in parallel on the CPU (correct me if I'm wrong). What I'm trying to find is running LBM on a non-uniform static mesh that doesn't change over time, i.e., not AMR.https://i10git.cs.fau.de/walberla/walberla/-/issues/205Juwels-Booster: selectDeviceBasedOnMpiRank() presumably assigns the wrong hos...2023-04-05T13:49:07+02:00Philipp SuffaJuwels-Booster: selectDeviceBasedOnMpiRank() presumably assigns the wrong host memory to the deviceThe function selectDeviceBasedOnMpiRank() seems to assign all devices (GPUs) on a node to the same host memory.
This could cause performance issues for CPU to GPU communication (cudaMemcopy), because GPUs are not communicating to their closest host memory.
So if you allocate 4 GPUs on a node and call the program with 4 MPI processes, all 4 GPUs are assigned to the same MPI process memory (of process 1) by cudaSetDevice().
This is the case because the function gpuGetDeviceCount() returns 1 instead of 4 devices (GPUs).
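The expected mapping can be sketched in a few lines (plain Python, hypothetical helper name; waLBerla's actual logic lives in `selectDeviceBasedOnMpiRank()`): with N visible devices per node, each node-local rank should get its own device, whereas a reported device count of 1 maps every rank to device 0, matching the behavior described above.

```python
def select_device(node_local_rank, device_count):
    """Round-robin mapping of node-local MPI ranks to GPU devices."""
    return node_local_rank % device_count
```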
This behavior is only tested for juwels-booster so far, further investigation is needed...https://i10git.cs.fau.de/walberla/walberla/-/issues/209Code Quality Days 2.5. + 3.5.2023-05-02T09:36:09+02:00Dominik Thoennesdominik.thoennes@fau.deCode Quality Days 2.5. + 3.5.Hi,
this issue is intended to provide an overview of the open issues we could work on during the code quality days.
The current plan is to have this for two days.
Please feel free to add more information.
Poll for the date:
https://terminplaner6.dfn.de/en/p/6683ec971fb22b9928e1ff6d3ae6b412-196072
| topic | comment/explanation | related issues |
| --- | --- | --- |
|Cleanup | Check very old issues (> 2 years) to see if these are still relevant | |
| Remove Boost | | #190 |
| fix metis/parmetis integration | | #195 |
| improve logging | | #178 |
| Unify Communication | CPU and GPU communication schemes are largely similar | #196 |
| Better GPU integration | GPU usage should be integrated like MPI for example | !565 |
| Use SoA by default | Although SoA is mostly better, it is not the default in waLBerla | #182 |
| Boundaries | Different topics on boundaries | #203 #173 #170 #3 |
https://docs.google.com/spreadsheets/d/1chiE5PCNcuokjp7Q3MyClaKQlDO2GqmaljPfJlIhH20/edit#gid=0https://i10git.cs.fau.de/walberla/walberla/-/issues/212Define FlagUIDs in the BoundaryCollection for reuse in App2023-06-21T11:38:45+02:00Philipp SuffaDefine FlagUIDs in the BoundaryCollection for reuse in AppIt could be useful to define the FlagUIDs, which are first set in the generation file, in the BoundaryCollection, so that they can be further used in the application file.
So if one defined a UBB, NoSlip and FixedDensity boundary in the generation file, the BoundaryCollection could look like:
```
namespace walberla{
namespace lbm {
const FlagUID noSlipFlagUID("NoSlip");
const FlagUID UBBFlagUID("UBB");
const FlagUID FixedDensityFlagUID("FixedDensity");
class PSMBoundaryCollection
{
....
```Markus HolzerMarkus Holzerhttps://i10git.cs.fau.de/walberla/walberla/-/issues/237clang-tidy used to ignore .h files2023-11-09T14:59:33+01:00Dominik Thoennesdominik.thoennes@fau.declang-tidy used to ignore .h filesI think that the clang-tidy script ignored `.h` files in the past.
This means that these files were not checked and there are lots of warnings after the update.
I disabled the job in the pipeline for now.
https://i10git.cs.fau.de/walberla/walberla/-/jobs/1135823https://i10git.cs.fau.de/walberla/walberla/-/issues/238import waLBerla hangs after installation2024-02-12T15:19:48+01:00Pedro Santos Nevesimport waLBerla hangs after installationHi waLBerla developers and contributors!
With other colleagues in the EESSI and MultiXscale projects, we are trying to build and deploy optimized waLBerla v6.1 installations and ran into an issue when building it with two specific toolchains that we'd like to report and hopefully get your input on.
A summary of the issue:
We are building waLBerla through EasyBuild using the [`foss2022b`](https://github.com/easybuilders/easybuild-easyconfigs/pull/19324) and [`foss2023a`](https://github.com/easybuilders/easybuild-easyconfigs/pull/19252) toolchains with two identical easyconfig files. With either toolchain the installation proceeds until the sanity check step, which simply runs `python -c "import waLBerla"`, upon which the system hangs. We see this happen [on the EasyBuild test clusters](https://github.com/easybuilders/easybuild-easyconfigs/pull/19252#issuecomment-1820653972) but not on our personal laptops or in the HPC at the University of Groningen.
We tried to change the sanity check to `mpirun -np 1 python -c "import waLBerla"` in the chance that the issue was with the test cluster's environment, but the same hang occurs.
One successful workaround is to set `UCX_LOG_LEVEL=info` in the sanity check so that it reads `UCX_LOG_LEVEL=info python -c "import waLBerla"`. We don't know why changing the log level of `UCX` resolves this problem, and my colleague who discovered this has also opened a ticket about it in the `UCX` repo [here](https://github.com/openucx/ucx/issues/9532).
Another workaround seems to be importing `mpi4py` before waLBerla. This is surprising, because `mpi4py` is not a dependency of waLBerla. We would rather not add `mpi4py` as a dependency for this issue, especially without knowing the consequences of this.
Given that we were only seeing this problem in the EasyBuild test clusters and not in other systems, and also the fact that the `UCX` workaround seems to work for the [EESSI test clusters](https://github.com/EESSI/software-layer/pull/421), we assumed `import waLBerla` was likely hanging due to some quirk of the EasyBuild test clusters. However, we received a [report](https://github.com/easybuilders/easybuild-easyconfigs/pull/19324/#issuecomment-1857832565) from another EasyBuild maintainer with a notice of this problem in another system. Because of this, we are now not convinced that whatever is causing this has to do with the EasyBuild clusters and their environment.
We have a [summary](https://gitlab.com/eessi/support/-/issues/20) of our attempts in our support portal, where you can find more details.
Would you have any idea of what could be causing this, or have you perhaps encountered something similar in the past? We'd love your input as we're quite confused about this problem. Thanks in advance!https://i10git.cs.fau.de/walberla/walberla/-/issues/239Dynamic load balancing: Refresh function seems to not communicate the flag fi...2024-01-29T12:17:02+01:00Philipp SuffaDynamic load balancing: Refresh function seems to not communicate the flag field correctlyWhen using the blockforest refresh function for dynamic load balancing, it seems not to communicate the "uidToFlag" map of the flag Field. So the flags in the flag field are still set correctly but the connection to the FlagUIDs seems to be lost.Philipp SuffaPhilipp Suffa