Commit 24b61fcf authored by Frederik Hennig

extended backend documentation

parent 725ce26d
Pipeline #64160 failed with stages
in 2 minutes and 4 seconds
......@@ -3,12 +3,110 @@ Developer's Reference: Code Generation Backend
##############################################
These pages provide a detailed overview of the pystencils code generation backend
as a reference for current and future developers of pystencils.
as a reference for current and future developers of pystencils, as well as users
who wish to customize or extend the behaviour of the code generator in their applications.
.. toctree::
:maxdepth: 1
symbols
ast
kernelcreation
iteration_space
translation
platforms
jit
Internal Representation
-----------------------
The code generator translates the kernel from the SymPy frontend's symbolic language to an internal
representation (IR), which is then emitted as code in the required dialect of C.
All names of classes associated with the internal kernel representation are prefixed with `Ps...`
to distinguish them from identically named front-end and SymPy classes.
The IR comprises *symbols*, *constants*, *arrays*, the *iteration space* and the *abstract syntax tree*:
* `PsSymbol` represents a single symbol in the kernel, annotated with a type. Unlike in the frontend,
uniqueness of symbols is enforced by the backend: at most one instance of each symbol may exist.
* `PsConstant` provides a type-safe representation of constants.
* `PsLinearizedArray` is the backend counterpart to the ubiquitous `Field`, representing a contiguous
n-dimensional array.
These arrays do not occur directly in the IR, but are represented through their *associated symbols*,
which are base pointers, shapes, and strides.
* The iteration space (`IterationSpace`) represents the kernel's iteration domain.
Currently, dense iteration spaces (`FullIterationSpace`) and index list-based
sparse iteration spaces (`SparseIterationSpace`) are available.
* The *Abstract Syntax Tree* (AST) is implemented in the `pystencils.backend.ast` module.
It represents a subset of standard C syntax, as required for pystencils kernels.
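The uniqueness rule for symbols can be illustrated with a small, self-contained sketch. It does not use the pystencils API: `ToySymbol` and `ToySymbolTable` are hypothetical stand-ins, and the real `PsSymbol` bookkeeping may differ. The sketch only shows the general idea of handing out one instance per symbol name.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(eq=False)  # identity-based equality: one object per symbol
class ToySymbol:
    """Hypothetical stand-in for a typed backend symbol."""
    name: str
    dtype: Optional[str] = None  # None means "not yet typed"


class ToySymbolTable:
    """Hands out exactly one ToySymbol instance per name."""

    def __init__(self) -> None:
        self._symbols: Dict[str, ToySymbol] = {}

    def get_symbol(self, name: str, dtype: Optional[str] = None) -> ToySymbol:
        symbol = self._symbols.get(name)
        if symbol is None:
            # First request for this name: create and register the instance.
            symbol = ToySymbol(name, dtype)
            self._symbols[name] = symbol
        elif dtype is not None:
            if symbol.dtype is None:
                symbol.dtype = dtype  # annotate the existing instance
            elif symbol.dtype != dtype:
                raise TypeError(
                    f"symbol {name!r} already has type {symbol.dtype}, not {dtype}"
                )
        return symbol


table = ToySymbolTable()
x1 = table.get_symbol("x", "double")
x2 = table.get_symbol("x")
same_instance = x1 is x2  # both requests yield the same object
```

Because every request for a name goes through the table, two passes that refer to `x` are guaranteed to hold the same object, so a type annotation applied by one pass is visible to all others.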
Kernel Creation
---------------
Translating a kernel's symbolic representation to compilable code takes various analysis, transformation, and
optimization passes. These are implemented modularly, each represented by its own class.
They are tied together in the kernel translation *driver* and communicate with each other through the
`KernelCreationContext`, which assembles all relevant information.
The primary translation driver implemented in pystencils is the ubiquitous `create_kernel`.
However, the backend is designed to make it easy for users and developers to implement custom translation
drivers if necessary.
The various functional components of the kernel translator are best explained in the order they are invoked
by `create_kernel`.
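The pattern of independent passes tied together by a driver and a shared context can be sketched as follows. This is a toy model under stated assumptions, not the pystencils implementation: `ToyContext`, `CollectFields`, `ChooseIterationSpace`, and `toy_driver` are hypothetical names, and assignments are reduced to pairs of strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Assignment = Tuple[str, str]  # (left-hand side, right-hand side) as plain strings


@dataclass
class ToyContext:
    """Shared state that all passes read from and write to."""
    fields: List[str] = field(default_factory=list)
    iteration_space: Optional[str] = None
    log: List[str] = field(default_factory=list)


class CollectFields:
    """Toy analysis pass: records every field that is written."""

    def __init__(self, ctx: ToyContext) -> None:
        self._ctx = ctx

    def __call__(self, assignments: List[Assignment]) -> List[Assignment]:
        for lhs, _ in assignments:
            if "[" in lhs:  # toy convention: 'name[offset]' is a field access
                name = lhs.split("[")[0]
                if name not in self._ctx.fields:
                    self._ctx.fields.append(name)
        self._ctx.log.append("analysis")
        return assignments


class ChooseIterationSpace:
    """Toy pass: decides on an iteration space from the collected fields."""

    def __init__(self, ctx: ToyContext) -> None:
        self._ctx = ctx

    def __call__(self, assignments: List[Assignment]) -> List[Assignment]:
        self._ctx.iteration_space = "dense" if self._ctx.fields else "none"
        self._ctx.log.append("iteration_space")
        return assignments


def toy_driver(assignments: List[Assignment]) -> ToyContext:
    """Runs the passes in order; they communicate only through the context."""
    ctx = ToyContext()
    body = assignments
    for pass_type in (CollectFields, ChooseIterationSpace):
        body = pass_type(ctx)(body)
    return ctx


ctx = toy_driver([("dst[0]", "2 * src[0]"), ("tmp", "1")])
```

A custom driver in this style would simply run a different sequence of passes over the same kind of context object.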
Analysis and Constraint Checks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The `KernelAnalysis` pass parses the SymPy assignment list and checks it for the consistency requirements
of the code generator, including the absence of loop-carried dependencies and the static single-assignment form.
Furthermore, it populates the `KernelCreationContext` with information about all fields encountered in the kernel.
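As an illustration of the single-assignment requirement, the following sketch checks a toy assignment list. It is a simplified model, not the `KernelAnalysis` implementation: assignments are reduced to a written symbol name plus the set of names read, and `ToyConstraintsError` and `check_ssa` are hypothetical.

```python
from typing import List, Set, Tuple

# (written symbol, symbols read on the right-hand side)
Assignment = Tuple[str, Set[str]]


class ToyConstraintsError(Exception):
    """Raised when the toy assignment list violates a requirement."""


def check_ssa(assignments: List[Assignment]) -> Set[str]:
    """Verify that each symbol is written at most once.

    Returns the set of written symbols on success.
    """
    written: Set[str] = set()
    for lhs, _rhs_symbols in assignments:
        if lhs in written:
            raise ToyConstraintsError(f"{lhs} is assigned more than once")
        written.add(lhs)
    return written


valid = [("a", {"x"}), ("b", {"a", "y"})]
written_symbols = check_ssa(valid)

invalid = [("a", {"x"}), ("a", {"y"})]  # 'a' is written twice
try:
    check_ssa(invalid)
    ssa_violation_detected = False
except ToyConstraintsError:
    ssa_violation_detected = True
```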
Creation of the Iteration Space
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Before the actual translation can begin, the kernel's iteration space must be defined.
The `pystencils.backend.kernelcreation.iteration_space` module provides various means of creating iteration spaces,
which are used by `create_kernel` according to its input configuration.
To communicate the presence of an iteration space to other components, it must be set in the context using
`KernelCreationContext.set_iteration_space`.
It will be used during the *freeze* pass, and later be materialized to a loop nest or GPU index translation.
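The difference between a dense and a sparse iteration space can be made concrete by enumerating the index tuples each one covers. The sketch below is only a conceptual model: `dense_points` and `sparse_points` are hypothetical helpers, and the backend represents iteration spaces symbolically rather than enumerating points. The optional slicing of the dense space is modelled with Python `slice` objects, which is an assumption made for illustration.

```python
from itertools import product
from typing import Iterable, Iterator, Optional, Sequence, Tuple

Index = Tuple[int, ...]


def dense_points(
    shape: Sequence[int], slices: Optional[Sequence[slice]] = None
) -> Iterator[Index]:
    """Enumerate all points of a cuboid index space, optionally sliced."""
    if slices is None:
        slices = [slice(None)] * len(shape)
    # slice.indices(n) resolves negative and missing bounds against extent n
    ranges = [range(*s.indices(n)) for s, n in zip(slices, shape)]
    return product(*ranges)


def sparse_points(index_list: Iterable[Index]) -> Iterator[Index]:
    """Enumerate exactly the points named in an index list."""
    return iter(index_list)


full = list(dense_points((2, 3)))
interior = list(dense_points((4, 4), [slice(1, -1), slice(1, -1)]))
boundary_cells = list(sparse_points([(0, 0), (3, 2)]))
```

In both cases the kernel body is written against abstract spatial indices; only the source of those indices changes.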
Freeze and Typification
^^^^^^^^^^^^^^^^^^^^^^^
The transformation of the SymPy expressions to the backend's AST is handled by `FreezeExpressions`.
This class instantiates field accesses according to the iteration space, maps SymPy operators and functions to their
backend instances, and raises an exception if asked to translate something the backend can't handle.
Constants and untyped symbols in the frozen expressions now need to be assigned a data type, and expression types
need to be checked against the C typing rules. This is the task of the `Typifier`. It assigns a default type to
every untyped symbol, attempts to infer the type of constants from their context in the expression,
and checks expression types using a stricter subset of the C typing rules,
allowing for no implicit type casts even between closely related types.
After the typification pass, the code generator either has a fully and correctly typed kernel body in hand,
or it has raised an exception.
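The three tasks of the typifier (default types for untyped symbols, context-based typing of constants, and strict checking without implicit casts) can be sketched on a toy expression tree. Everything here is hypothetical: expressions are nested tuples, `typify` and `ToyTypeError` are not pystencils names, and the real `Typifier` handles many more node kinds and defers constant typing more carefully than this sketch does.

```python
from typing import Optional, Tuple


class ToyTypeError(Exception):
    """Raised when operand types do not match exactly."""


def _apply_context(node: tuple, dtype: Optional[str], context: str):
    """Give an untyped constant the type of its surrounding expression."""
    if dtype is None:
        _, value, _ = node  # only constants come back untyped
        return ("const", value, context), context
    return node, dtype


def typify(expr: tuple, default_dtype: str = "float64") -> Tuple[tuple, Optional[str]]:
    """Return (typed expression, dtype) or raise ToyTypeError.

    Node shapes: ("sym", name, dtype_or_None), ("const", value, None),
    ("add", lhs, rhs).
    """
    kind = expr[0]
    if kind == "sym":
        _, name, dtype = expr
        dtype = dtype or default_dtype  # untyped symbols get the default type
        return ("sym", name, dtype), dtype
    if kind == "const":
        return expr, None  # typed later, from the enclosing expression
    if kind == "add":
        _, lhs, rhs = expr
        lhs, lhs_t = typify(lhs, default_dtype)
        rhs, rhs_t = typify(rhs, default_dtype)
        context = lhs_t or rhs_t or default_dtype
        lhs, lhs_t = _apply_context(lhs, lhs_t, context)
        rhs, rhs_t = _apply_context(rhs, rhs_t, context)
        if lhs_t != rhs_t:  # strict: no implicit casts at all
            raise ToyTypeError(f"cannot add {lhs_t} and {rhs_t}")
        return ("add", lhs, rhs), context
    raise ToyTypeError(f"unsupported node {kind!r}")


typed, dtype = typify(("add", ("sym", "x", "float32"), ("const", 1, None)))

try:
    typify(("add", ("sym", "x", None), ("sym", "y", "float32")))
    mismatch_detected = False
except ToyTypeError:
    mismatch_detected = True
```

In the second call, `x` receives the default type `float64`, which does not match `float32`; where C would silently promote, the strict rules reject the expression.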
Platform Selection and Materialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The various hardware platforms supported by pystencils are implemented in the `pystencils.backend.platforms` module.
Each implements a target-specific materialization of generic backend components, including:
- The iteration space, which is materialized to a specific index source. This might be a loop nest for CPU kernels, or
a thread index translation for GPU kernels
- Mathematical functions, which might have to be mapped to concrete library functions
- Vector data types and operations, which are mapped to intrinsics on vector CPU architectures
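To illustrate how one generic iteration space might be materialized differently per target, the sketch below emits C-like source text for a CPU loop nest and for a CUDA-style thread index translation. The classes `ToyPlatform`, `ToyCpu`, and `ToyGpu` are hypothetical and produce plain strings; the actual platform classes operate on the backend AST, and their interfaces are not shown in this document.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class ToyPlatform(ABC):
    """Hypothetical base class for target-specific materialization."""

    @abstractmethod
    def materialize_iteration_space(
        self, counters: Sequence[str], shape: Sequence[str], body: str
    ) -> str:
        """Wrap `body` in code that supplies values for `counters`."""

    def math_function(self, name: str, dtype: str) -> str:
        # C math library convention: float variants carry an 'f' suffix
        return name + "f" if dtype == "float32" else name


class ToyCpu(ToyPlatform):
    def materialize_iteration_space(self, counters, shape, body):
        lines = []
        for depth, (ctr, n) in enumerate(zip(counters, shape)):
            indent = "  " * depth
            lines.append(f"{indent}for (int64_t {ctr} = 0; {ctr} < {n}; ++{ctr}) {{")
        lines.append("  " * len(counters) + body)
        for depth in reversed(range(len(counters))):
            lines.append("  " * depth + "}")
        return "\n".join(lines)


class ToyGpu(ToyPlatform):
    def materialize_iteration_space(self, counters, shape, body):
        # One thread per point; supports up to three dimensions (x, y, z).
        lines = [
            f"int64_t {ctr} = blockIdx.{ax} * blockDim.{ax} + threadIdx.{ax};"
            for ctr, ax in zip(counters, "xyz")
        ]
        guard = " && ".join(f"{ctr} < {n}" for ctr, n in zip(counters, shape))
        lines.append(f"if ({guard}) {{")
        lines.append("  " + body)
        lines.append("}")
        return "\n".join(lines)


cpu_code = ToyCpu().materialize_iteration_space(["i"], ["N"], "a[i] = 0;")
gpu_code = ToyGpu().materialize_iteration_space(["i"], ["N"], "a[i] = 0;")
sin_name = ToyCpu().math_function("sin", "float32")
```

The kernel body string is identical for both targets; only the index source around it changes, which mirrors the separation described above.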
Transformations
^^^^^^^^^^^^^^^
TODO
Target-Specific Optimization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TODO
Finalization
^^^^^^^^^^^^
TODO
****************
Iteration Spaces
****************
.. automodule:: pystencils.backend.kernelcreation.iteration_space
:members:
\ No newline at end of file
***************
Kernel Creation
***************
.. automodule:: pystencils.kernelcreation
:members:
*********
Platforms
*********
.. automodule:: pystencils.backend.platforms
:members:
\ No newline at end of file
******************
Kernel Translation
******************
.. autoclass:: pystencils.backend.kernelcreation.KernelCreationContext
:members:
.. autoclass:: pystencils.backend.kernelcreation.KernelAnalysis
:members:
.. autoclass:: pystencils.backend.kernelcreation.FreezeExpressions
:members:
.. autoclass:: pystencils.backend.kernelcreation.Typifier
:members:
"""
The `kernelcreation` module contains the actual translation logic of the pystencils code generator.
It comprises a number of classes and submodules that implement the various parts and passes of the code
generation process:
- Parameterization of the translation process
- Knowledge collection and management
- Kernel analysis and constraints checks
- Expression parsing and AST construction
- Platform-specific code materialization
- General and platform-specific optimizations
These components are designed to be combined and composed in a variety of ways, depending
on the actual code generation flow required.
The ``nbackend`` currently provides one native code generation driver:
`create_kernel` takes an `AssignmentCollection` and translates it to a simple loop kernel.
The code generator's components are perhaps most easily introduced in the context of that particular driver.
Exemplary Code Generation Driver: `create_kernel`
-------------------------------------------------
Generator Arguments
^^^^^^^^^^^^^^^^^^^
The driver accepts two parameters: an `AssignmentCollection` whose assignments represent the code of a single
kernel iteration without recurrences or other loop-carried dependencies; and a `CreateKernelConfig` which configures
the translation process.
Context Creation
^^^^^^^^^^^^^^^^
The primary object that stores all information and knowledge required during the translation process is the
`KernelCreationContext`. It is created in the beginning from the configuration parameter.
It will be responsible for managing all fields and arrays encountered during translation,
the kernel's iteration space,
and any parameter constraints introduced by later transformation passes.
Analysis Passes
^^^^^^^^^^^^^^^
Before the actual translation of the SymPy-based assignment collection to the backend's AST begins,
the kernel's assignments are checked for consistency with the translator's prerequisites.
In this case, the `KernelAnalysis` pass
checks the static single-assignment (SSA) form requirement and the absence of loop-carried dependencies.
At the same time, it collects the set of all fields used in the assignments.
Iteration Space Creation
^^^^^^^^^^^^^^^^^^^^^^^^
The kernel's `IterationSpace` is inferred from a combination of configuration parameters and the set of field accesses
encountered in the kernel. Two kinds of iteration spaces are available: A sparse iteration space
(`SparseIterationSpace`) encompasses individual points in the Cartesian coordinate space, provided by an index list.
A full iteration space (`FullIterationSpace`), on the other hand, represents a full cuboid Cartesian coordinate space,
which may optionally be sliced.
The iteration space is used during the following translation passes to translate field accesses with respect to
the current iteration. It will only be instantiated in the form of a loop nest or GPU index calculation much later.
Freeze and Typification
^^^^^^^^^^^^^^^^^^^^^^^
The transformation of the SymPy expressions to the backend's expression trees is handled by `FreezeExpressions`.
This class instantiates field accesses according to the iteration space, maps SymPy operators and functions to their
backend instances if supported, and raises an exception if asked to translate something the backend can't handle.
Constants and untyped symbols in the frozen expressions now need to be assigned a data type, and expression types
need to be checked against the C typing rules. This is the task of the `Typifier`. It assigns a default type to
every untyped symbol, attempts to infer the type of constants from their context in the expression,
and checks expression types using a much stricter
subset of the C typing rules, allowing for no implicit type casts even between closely related types.
After the typification pass, the code generator either has a fully and correctly typed kernel body in hand,
or it has raised an exception.
Platform-Specific Iteration Space Materialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
At this point, most remaining transformations are specific to the target platform. Hardware platforms are modelled
using subclasses of the `Platform` class, which implement all platform-specific transformations.
The platform for the current code generation flow is instantiated from the target specification passed
by the user in `CreateKernelConfig`.
Then, the platform is asked to materialize the iteration space (e.g. by turning it into a loop nest
for CPU code) and to materialize any functions for which it provides specific implementations.
Platform-Specific Optimizations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Technically, the kernel is now complete, but it may still be optimized.
This is also the task of the platform instance. Potential optimizations include OpenMP parallelization,
loop splitting, slicing and blocking on CPUs,
and vectorization on CPU platforms with vector capabilities.
Finalization
^^^^^^^^^^^^
Finally, the kernel is packed up as a `KernelFunction`.
It is furthermore annotated with constraints collected during the translation, and returned to the user.
"""
from .context import KernelCreationContext
from .analysis import KernelAnalysis
from .freeze import FreezeExpressions
......
......@@ -22,21 +22,19 @@ class KernelAnalysis:
A `KernelAnalysis` object may be called at most once.
Consistency and Constraints
---------------------------
**Consistency and Constraints**
The following checks are performed:
- **SSA Form:** The given assignments must be in single-assignment form; each symbol must be written at most once.
- **Independence of Accesses:** To avoid loop-carried dependencies, each field may be written at most once at
each index, and if a field is written at some location with index `i`, it may only be read with index `i` in
the same location.
- **Independence of Writes:** A weaker requirement than access independence; each field may only be written once
at each index.
- **SSA Form:** The given assignments must be in single-assignment form; each symbol must be written at most once.
- **Independence of Accesses:** To avoid loop-carried dependencies, each field may be written at most once at
each index, and if a field is written at some location with index `i`, it may only be read with index `i` in
the same location.
- **Independence of Writes:** A weaker requirement than access independence; each field may only be written once
at each index.
- **Dimension of index fields:** Index fields occurring in the kernel must have exactly one spatial dimension.
Knowledge Collection
--------------------
**Knowledge Collection**
The following knowledge is collected into the context:
- The set of fields accessed in the kernel
......
......@@ -43,17 +43,17 @@ class KernelCreationContext:
"""Manages the translation process from the SymPy frontend to the backend AST, and collects
all necessary information for the translation:
- *Data Types*: The kernel creation context manages the default data types for loop limits
and counters, index calculations, and the typifier.
- *Symbols*: The context maintains a symbol table, keeping track of all symbols encountered
during kernel translation together with their types.
- *Fields and Arrays*: The context collects all fields encountered during code generation,
applies a few consistency checks to them, and manages their associated arrays.
- *Iteration Space*: The context manages the iteration space of the kernel currently being
translated.
- *Constraints*: The context collects all kernel parameter constraints introduced during the
translation process.
- *Required Headers*: The context collects all header files required for the kernel to run.
- *Data Types*: The kernel creation context manages the default data types for loop limits
and counters, index calculations, and the typifier.
- *Symbols*: The context maintains a symbol table, keeping track of all symbols encountered
during kernel translation together with their types.
- *Fields and Arrays*: The context collects all fields encountered during code generation,
applies a few consistency checks to them, and manages their associated arrays.
- *Iteration Space*: The context manages the iteration space of the kernel currently being
translated.
- *Constraints*: The context collects all kernel parameter constraints introduced during the
translation process.
- *Required Headers*: The context collects all header files required for the kernel to run.
"""
......
......@@ -233,6 +233,8 @@ class FullIterationSpace(IterationSpace):
class SparseIterationSpace(IterationSpace):
"""Represents a sparse iteration space defined by an index list."""
def __init__(
self,
spatial_indices: Sequence[PsSymbol],
......@@ -264,6 +266,12 @@ def get_archetype_field(
check_same_layouts: bool = True,
check_same_dimensions: bool = True,
):
"""Retrieve an archetype field from a collection of fields, which represents their common properties.
Raises:
KernelConstraintsError: If any two fields with conflicting properties are encountered.
"""
shapes = set(f.spatial_shape for f in fields)
fixed_shapes = set(f.spatial_shape for f in fields if f.has_fixed_shape)
layouts = set(f.layout for f in fields)
......