Commit 725ce26d authored by Frederik Hennig

backend doc refactoring

parent dc8b5aba
No related merge requests found
Pipeline #64129 failed with stages
in 2 minutes and 2 seconds
******
Arrays
******
.. automodule:: pystencils.backend.arrays
:members:
...@@ -2,6 +2,9 @@ ...@@ -2,6 +2,9 @@
Abstract Syntax Tree Abstract Syntax Tree
******************** ********************
.. automodule:: pystencils.backend.ast.astnode
:members:
.. automodule:: pystencils.backend.ast.structural .. automodule:: pystencils.backend.ast.structural
:members: :members:
......
...@@ -2,16 +2,13 @@ ...@@ -2,16 +2,13 @@
Developer's Reference: Code Generation Backend Developer's Reference: Code Generation Backend
############################################## ##############################################
These pages provide a detailed overview of the next-gen code generation backend ``nbackend`` currently being
developed for *pystencils*. This new backend is intended to consolidate and finally replace
all code generation functionality currently implemented in *pystencils* version 1.x.
These pages provide a detailed overview of the pystencils code generation backend
as a reference for current and future developers of pystencils.
.. toctree:: .. toctree::
:maxdepth: 1 :maxdepth: 1
rationale symbols
arrays
ast ast
kernelcreation kernelcreation
jit jit
...@@ -3,4 +3,5 @@ Kernel Creation ...@@ -3,4 +3,5 @@ Kernel Creation
*************** ***************
.. automodule:: pystencils.kernelcreation .. automodule:: pystencils.kernelcreation
:members:
***********************
Rationale and Key Ideas
***********************
Expression Manipulation
^^^^^^^^^^^^^^^^^^^^^^^
The pystencils code generator was originally built based entirely on the computer algebra system SymPy.
SymPy itself is ideal for the front-end representation of kernels using its mathematical language.
In pystencils, however, SymPy was long used to model all mathematical expressions, from the continuous equations
down to the bare C assignments, loop counters, and even pointer arithmetic.
SymPy's unique properties, especially regarding automatic rewriting and simplification of expressions,
while perfect for doing symbolic mathematics, have proven to be very problematic when used as the basis of
an intermediate code representation.
The primary problems caused by using SymPy for expression manipulation are these:
- Assigning and checking types on SymPy expressions is not possible in a stable way. While a type checking
pass over the expression trees may validate types early in the code generation process, often SymPy's auto-
rewriting system will be triggered by changes to the AST at a later stage, silently invalidating type
information.
- SymPy will aggressively simplify constant expressions in a strictly mathematical way, which leads to
semantically invalid transformations in contexts with fixed types. This problem especially concerns
integer types, and division in integer contexts.
- SymPy aggressively flattens expressions according to associativity, and freely reorders operands in commutative
operations. While perfectly fine in symbolic mathematics, this behaviour makes it impossible to group
and parenthesize operations for numerical or performance benefits. Another often-observed effect is that
SymPy distributes constant factors across sums, strongly increasing the number of FLOPs.
To avoid these problems, ``nbackend`` no longer uses SymPy for expression manipulation, but contains a native
AST data structure for modelling expressions as in C code.
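Two of the problems listed above can be reproduced without SymPy at all. The following is a minimal sketch using only the Python standard library: ``Fraction`` stands in for exact, strictly mathematical arithmetic, and the helper ``c_int_div`` is a hypothetical name introduced here to mimic C integer division. Neither is part of pystencils or SymPy.

```python
from fractions import Fraction


def c_int_div(a: int, b: int) -> int:
    """Integer division as in C: the quotient is truncated toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


# Problem 1: a rewrite that is exact over the rationals changes the result
# once the operands have a fixed integer type.
a, b = 7, 2
exact = Fraction(a, b) * b          # mathematically, (a / b) * b simplifies to a
fixed_type = c_int_div(a, b) * b    # C evaluates (7 / 2) * 2 as 3 * 2

# Problem 2: floating-point addition is not associative, so flattening or
# reordering a sum can change the rounded result.
x, y, z = 0.1, 0.2, 0.3
left_grouped = (x + y) + z
right_grouped = x + (y + z)

print(exact, fixed_type)               # the two values differ
print(left_grouped == right_grouped)   # False on IEEE 754 doubles
```

An intermediate representation that keeps the grouping and the operand types exactly as written avoids both effects, which is the motivation for the native expression AST described above.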
Structure and Architecture of the Code Generator
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The code generation flow of *pystencils* has grown significantly over the years, to accommodate various different
kinds of kernels, output dialects, and target platforms. Very often, extensions were retroactively integrated with
a system that was not originally designed to support them. As a result, the code generator is now
a very convoluted set of functions and modules, containing large volumes of hard-to-read code, much of it
duplicated for several platforms.
The design of the ``nbackend`` takes the benefit of hindsight to provide the same (and, in some cases, a broader) set of
functionality through a much better structured software system. While the old code generator was implemented in an almost
entirely imperative manner, the ``nbackend`` makes extensive use of object-oriented programming for knowledge representation,
construction and internal representation of code, as well as analysis, transformation, and code generation tasks.
As a result, the ``nbackend`` is much more modular, concise, easier to extend, and implemented in a much smaller volume of
code.
*****************************
Symbols, Constants and Arrays
*****************************
.. autoclass:: pystencils.backend.symbols.PsSymbol
:members:
.. autoclass:: pystencils.backend.constants.PsConstant
:members:
.. automodule:: pystencils.backend.arrays
:members:
"""
The pystencils backend models contiguous n-dimensional arrays using a number of classes.
Arrays themselves are represented through the `PsLinearizedArray` class.
An array has a fixed name, dimensionality, and element type, as well as a number of associated
variables.
The associated variables are the *shape* and *strides* of the array, modelled by the
`PsArrayShapeSymbol` and `PsArrayStrideSymbol` classes. They have integer type and are used to
reason about the array's memory layout.
Memory Layout Constraints
-------------------------
Initially, all memory layout information about an array is symbolic and unconstrained.
Several scenarios exist where memory layout must be constrained, e.g. certain pointers
need to be aligned, certain strides must be fixed or fulfill certain alignment properties,
or even the field shape must be fixed.
The code generation backend models such requirements and assumptions as *constraints*.
Constraints are external to the arrays themselves. They are created by the AST passes which
require them and exposed through the `KernelFunction` class to the compiled kernel's runtime
environment. It is the responsibility of the runtime environment to fulfill all constraints.
For example, if an array ``arr`` should have both a fixed shape and fixed strides,
an optimization pass will have to add equality constraints like the following before replacing
all occurrences of the shape and stride variables with their constant value::
constraints = (
[PsKernelConstraint(s.eq(f)) for s, f in zip(arr.shape, fixed_size)]
+ [PsKernelConstraint(s.eq(f)) for s, f in zip(arr.strides, fixed_strides)]
)
kernel_function.add_constraints(*constraints)
"""
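The division of labour described in the docstring above (passes record constraints, the runtime environment checks them) can be sketched in a few lines. This is an illustrative model only: ``LayoutConstraint``, ``fixed_value``, ``violated``, and the variable names such as ``arr_stride0`` are hypothetical and do not reproduce the pystencils API.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence


@dataclass(frozen=True)
class LayoutConstraint:
    """Hypothetical stand-in for a kernel constraint; not the pystencils API."""

    description: str
    predicate: Callable[[Mapping[str, int]], bool]


def fixed_value(symbol: str, value: int) -> LayoutConstraint:
    """Equality constraint: the named layout variable must equal a constant."""
    return LayoutConstraint(f"{symbol} == {value}", lambda env: env[symbol] == value)


def violated(constraints: Sequence[LayoutConstraint], env: Mapping[str, int]) -> list[str]:
    """Return the descriptions of all constraints the runtime values do not fulfill."""
    return [c.description for c in constraints if not c.predicate(env)]


# A pass that hard-codes a 4 x 8 row-major layout for ``arr`` records these
# constraints before substituting the constants into the kernel.
constraints = [
    fixed_value("arr_shape0", 4),
    fixed_value("arr_shape1", 8),
    fixed_value("arr_stride0", 8),
    fixed_value("arr_stride1", 1),
]

# The runtime environment compares the actual array layout against them.
matching = {"arr_shape0": 4, "arr_shape1": 8, "arr_stride0": 8, "arr_stride1": 1}
transposed = {"arr_shape0": 4, "arr_shape1": 8, "arr_stride0": 1, "arr_stride1": 4}

print(violated(constraints, matching))
print(violated(constraints, transposed))
```

Keeping the constraints outside the array object, as in this sketch, means the same array description can be shared by kernels that make different layout assumptions.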
from __future__ import annotations from __future__ import annotations
from typing import Sequence from typing import Sequence
...@@ -57,8 +20,7 @@ from ..defaults import DEFAULTS ...@@ -57,8 +20,7 @@ from ..defaults import DEFAULTS
class PsLinearizedArray: class PsLinearizedArray:
"""Class to model N-dimensional contiguous arrays. """Class to model N-dimensional contiguous arrays.
Memory Layout, Shape and Strides **Memory Layout, Shape and Strides**
--------------------------------
The memory layout of an array is defined by its shape and strides. The memory layout of an array is defined by its shape and strides.
Both shape and stride entries may either be constants or special variables associated with Both shape and stride entries may either be constants or special variables associated with
...@@ -67,7 +29,7 @@ class PsLinearizedArray: ...@@ -67,7 +29,7 @@ class PsLinearizedArray:
Shape and strides may be specified at construction in the following way. Shape and strides may be specified at construction in the following way.
For constant entries, their value must be given as an integer. For constant entries, their value must be given as an integer.
For variable shape entries and strides, the Ellipsis `...` must be passed instead. For variable shape entries and strides, the Ellipsis `...` must be passed instead.
Internally, the passed ``index_dtype`` will be used to create typed constants (`PsTypedConstant`) Internally, the passed ``index_dtype`` will be used to create typed constants (`PsConstant`)
and variables (`PsArrayShapeSymbol` and `PsArrayStrideSymbol`) from the passed values. and variables (`PsArrayShapeSymbol` and `PsArrayStrideSymbol`) from the passed values.
""" """
...@@ -118,7 +80,7 @@ class PsLinearizedArray: ...@@ -118,7 +80,7 @@ class PsLinearizedArray:
@property @property
def shape(self) -> tuple[PsArrayShapeSymbol | PsConstant, ...]: def shape(self) -> tuple[PsArrayShapeSymbol | PsConstant, ...]:
"""The array's shape, expressed using `PsTypedConstant` and `PsArrayShapeSymbol`""" """The array's shape, expressed using `PsConstant` and `PsArrayShapeSymbol`"""
return self._shape return self._shape
@property @property
...@@ -130,7 +92,7 @@ class PsLinearizedArray: ...@@ -130,7 +92,7 @@ class PsLinearizedArray:
@property @property
def strides(self) -> tuple[PsArrayStrideSymbol | PsConstant, ...]: def strides(self) -> tuple[PsArrayStrideSymbol | PsConstant, ...]:
"""The array's strides, expressed using `PsTypedConstant` and `PsArrayStrideSymbol`""" """The array's strides, expressed using `PsConstant` and `PsArrayStrideSymbol`"""
return self._strides return self._strides
@property @property
......