Welcome to the documentation of the *lbmpy* lattice Boltzmann module! In this document we'll give you an overview over the architecture of the package and will show you different ways to use it. You can find more

hands-on examples on how to set up specific simulations in the tutorial notebooks.

You probably already have noticed that this is not a normal website. You are looking at an interactive IPython/Jupyter notebook. That means that the following grey code cells contain valid Python code that can be run interactively. You can run the code cells by hitting Shift+Enter. Make sure to run all of them in order, since variables and state defined in an executed code cell is used by the next cells.

## Code generation and its advantages

Now, back to *lbmpy*: Whats special about it? *lbmpy* is not just an implementation of different LB schemes in Python. Instead *lbmpy* is a package to generate fast C implementations. This is done by writing down the LB equations in a generic, symbolic way in a computer algebra system. Then the equations can be adapted to the specific problem, rearranged, and simplified. Basically you can think of this step as rewriting the "paper representation" of a method, such that a C programmer with no LB experience can implement it.

At this step one can already insert scenario specific information into the equations, like fixing relaxation rates, stencil, force model or domain sizes, such that the equations can be written in a simpler form that has less floating point operations (FLOPs).

*lbmpy* was written after realizing that it is almost impossible to write a generic **and** fast LB code in a language like C++. In principle it is possible to realize abstractions with static polymorphism (in C++: with template meta programming) but in practice this leads to code that is hard to understand and maintain. And for LB schemes there are lots of model and parameters choices that one would like to write down in an abstract way. The following picture shows some of these options one faces when implementing a new LB code:

![](../img/feature_optimization_overview.png)

%% Cell type:markdown id: tags:

Lets illustrate this with a simple example: Lets say you want MRT and cumulant collision operators for a two dimensional (D2Q9) and two different 3D stencils (D3Q19 and D3Q27). It turns out that abstracting the stencil specific code away by introducting a stencil class or even stencil-template-constructs already lead to performance degradations. When the stencil is known at compile time, one can do stencil specific common subexpression elimination that drastically reduces the number of FLOPs. Similar effects are also encountered related to the other implementation choices. Thus in *lbmpy* one completely specifies the LB scheme one wants to have, then the LB equations are simplified and optimized and a shared memory parallel C implementation is generated. In that way all the information one has about the scenario can be used at 'compile time' and the best possible compute kernel is generated.

Now, enough of the theory, let's set up a simple simulation. The next cell simulates a lid driven cavity, all domain borders are wall, the upper wall is moving to the right. To run the cell click into and and hit Shift+Enter.

Executing this cell probably took quite a while to execute, because in the background the method was derived, simplified and compiled. Running a few more timesteps now is faster, since the compiled kernel is already available.

%% Cell type:code id: tags:

``` python

ldc.run(500)

```

%% Cell type:markdown id: tags:

Above simulation runs C code, that was created by Python, compiled to a shared library, then loaded again into Python, such that we can call it like a normal Python function. We can actually have a look a the intermediate C code. Remove the comment from the next cell and run it:

%% Cell type:code id: tags:

``` python

#show_code(ldc.ast)

```

%% Cell type:markdown id: tags:

Note that this code has the relaxation rate and array sizes inserted as numeric constants. This additional information helps the C compiler to generate faster code. Also, having the code in symbolic form makes it easy to generate code for different platforms as well: C(++) for CPUs, optionally with platform specific SIMD instrinsics or CUDA for Nvidia GPUs. To run the lid driven cavity on GPUs all it takes are the following changes:

Performance is reported in million lattice cells updates per second (MLUPS), i.e. the larger the faster.

%% Cell type:markdown id: tags:

In this section we hopefully have convinced you, that code generation is a powerful technique that can give you code flexibility without drawbacks in performance. We have already demonstrated the flexibility with respect to target architecture (CPU, GPU), in the next section we show the flexibility in terms of supported LB methods.

%% Cell type:markdown id: tags:

## LB schemes

In this section we demonstrate how to use advanced LB schemes like multi-relaxation-time, cumulant and entropic schemes.

### MRT methods and its special cases

For the lid driven cavity above we have used the simplest lattice Boltzmann scheme with a single relaxation time (SRT), sometimes also called BGK scheme. This method relaxes each moment of the particle distribution function with the same relaxation rate. This relaxation rate is used to control the viscosity of the fluid. SRT can be viewed as a special case of the more general multi relaxation time methods, where different moments can be relaxed to equilibrium with a different rate. We get back the SRT operator by choosing all MRT relaxation rates as the same constant. Actually *lbmpy* always uses the MRT representation, and by setting a single relaxation rate for all moments and after a lot of equation simplification steps, an efficient SRT kernel drops out.

In a C++ code one would have to implement the SRT special case separately, since the C++ compiler could not do all the simplifications of the MRT code, even if the relaxation rates would be template parameters. This means that one would have to implement a separate SRT, TRT, and MRT kernel.

This is not required here, because of the automatic equation simplification.

Let's look at the details of the methods we used above:

%% Cell type:code id: tags:

``` python

ldc.method

```

%% Output

<lbmpy.methods.momentbased.MomentBasedLbMethod at 0x7f2a17e5eba8>

%% Cell type:markdown id: tags:

All moments are relaxed to their equilibrium values with the same rate. We can easily built a more complex method were we choose different rates to separately control viscosity, bulk viscosity and relaxation of higher order moments:

<lbmpy.methods.momentbased.MomentBasedLbMethod at 0x7f2a169b64e0>

%% Cell type:markdown id: tags:

In principle we can freely choose all entries of above table (as long as the moments are linear independent) and automatically generate a custom MRT implementation.

%% Cell type:markdown id: tags:

### Modern methods: cumulant and entropic schemes

Recently more complex lattice Boltzmann methods have been published that improve the stability of standard methods. We demonstrate this by a setting up a shear flow scenario where the upper half is initialized with velocity to the right and the lower half with a velocity to the left. The y-velocity component is initialized with a very small random value.

We choose a relaxation rate near the upper limit of 2. Three scenarios are created:

- BGK/SRT collision operator

- cumulant collision operator

- entropic stabilization

For the latter two advanced methods the relaxation rates of the higher order moments are chosen as equilibrium (for cumulant) or subject to an entropy condition (for the entropic) scheme.

*lbmpy* is a module to do Lattice Boltzmann simulations in Python.

In this tutorial you will get a broad overview of *lbmpy*'s features. We will run some of the included scenarios that come with *lbmpy*, like a channel flow and a lid driven cavity. This tutorial uses the simple, high-level API of *lbmpy*, while the following tutorials go into the low-level details.

The only prerequisite for this tutorial is basic Python and [numpy](http://www.numpy.org/) knowledge.

> #### What's special about *lbmpy* ?

> The LBM kernels (i.e. the functions that do all the computations) are not written in Python. Instead *lbmpy* generates optimized C or CUDA code for these kernels and compiles it using the *pystencils* module. In that way we get very fast LBM kernels, a lot faster than pure Python implementations and probably also faster than handwritten C kernels. This sounds complicated, but we don't have to care about all this background work, since all compiled kernels are available as Python functions again. Thus *lbmpy* can be used just like any other Python package.

## Lid Driven Cavity

We start by simulating a fluid in a rectangular box, where one wall (the lid) is moving. This is called a 'lid driven cavity'. At the stationary walls *no-slip* boundary conditions are set, which enforce zero velocity at the wall. At the lid there is a *velocity bounce back (UBB)* boundary condition, which sets zero normal velocity and a prescribed tangential velocity.

We don't have to set up all these boundary conditions manually since there is a function ``create_lid_driven_cavity`` that does all the work for us. This function takes the tangential velocity of the lid, which drives the flow. It is given in lattice units and to get a stable simulation it should be smaller than 0.1. The `relaxation_rate` determines the viscosity of the fluid: Small relaxation rates correspond to high viscosity. The `relaxation_rate` has to be between 0 and 2.

<lbmpy.methods.momentbased.MomentBasedLbMethod at 0x7fde7f192390>

%% Cell type:markdown id: tags:

The *run* method of the scenario runs the specified amount of time steps. When you run the next cell, 2000 time steps are executed and the velocity field is plotted. You can run the cell multiple times to see a time evolution.