Commit 46ff7269 authored by Jan Hönig's avatar Jan Hönig
Browse files

Merge branch 'TypeSystemRebase' into 'master'

Rebase of pystencils Type System

Closes #20

See merge request pycodegen/pystencils!292
parents 37518a47 65c7e576
Pipeline #40003 passed with stages
in 19 minutes and 23 seconds
......@@ -107,8 +107,9 @@ ubuntu:
- $ENABLE_NIGHTLY_BUILDS
image: i10git.cs.fau.de:5005/pycodegen/pycodegen/ubuntu
before_script:
# - apt-get -y remove python3-sympy
- apt-get -y remove python3-sympy
- ln -s /usr/include/locale.h /usr/include/xlocale.h
- pip3 install `grep -Eo 'sympy[>=]+[0-9\.]+' setup.py | sed 's/>/=/g'`
# - pip3 install `grep -Eo 'sympy[>=]+[0-9\.]+' setup.py | sed 's/>/=/g'`
script:
- export NUM_CORES=$(nproc --all)
......
......@@ -82,10 +82,6 @@ try:
except ImportError:
collect_ignore += [os.path.join(SCRIPT_FOLDER, "pystencils/datahandling/vtk.py")]
# TODO: Remove if Ubuntu 18.04 is no longer supported
if pytest_version < 50403:
collect_ignore += [os.path.join(SCRIPT_FOLDER, "pystencils_tests/test_jupyter_extensions.ipynb")]
collect_ignore += [os.path.join(SCRIPT_FOLDER, 'setup.py')]
for root, sub_dirs, files in os.walk('.'):
......
This source diff could not be displayed because it is too large. You can view the blob instead.
%% Cell type:code id: tags:
``` python
from pystencils.session import *
```
%% Cell type:markdown id: tags:
# Tutorial 02: Basic Kernel generation with *pystencils*
Now that you have an [overview of pystencils](01_tutorial_getting_started.ipynb),
this tutorial shows in more detail how to formulate, optimize and run stencil kernels.
## 1) Kernel Definition
### a) Defining kernels with assignment lists and the `kernel` decorator
*pystencils* gets a symbolic formulation of the kernel. This can be either an `Assignment` or a sequence of `Assignment`s that follow a set of restrictions.
Lets first create a kernel that consists of multiple assignments:
%% Cell type:code id: tags:
``` python
src_arr = np.zeros([20, 30])
dst_arr = np.zeros_like(src_arr)
dst, src = ps.fields(dst=dst_arr, src=src_arr)
```
%% Cell type:code id: tags:
``` python
grad_x, grad_y = sp.symbols("grad_x, grad_y")
symbolic_description = [
ps.Assignment(grad_x, (src[1, 0] - src[-1, 0]) / 2),
ps.Assignment(grad_y, (src[0, 1] - src[0, -1]) / 2),
ps.Assignment(dst[0, 0], grad_x + grad_y),
]
kernel = ps.create_kernel(symbolic_description)
symbolic_description
```
%%%% Output: execute_result
![]()
$\displaystyle \left[ grad_{x} \leftarrow \frac{{src}_{(1,0)}}{2} - \frac{{src}_{(-1,0)}}{2}, \ grad_{y} \leftarrow \frac{{src}_{(0,1)}}{2} - \frac{{src}_{(0,-1)}}{2}, \ {dst}_{(0,0)} \leftarrow grad_{x} + grad_{y}\right]$
⎡ src_E src_W src_N src_S ⎤
⎢gradₓ := ───── - ─────, grad_y := ───── - ─────, dst_C := gradₓ + grad_y⎥
⎣ 2 2 2 2 ⎦
%% Cell type:markdown id: tags:
We created subexpressions, using standard sympy symbols on the left hand side, to split the kernel into multiple assignments. Defining a kernel using a list of `Assignment`s is quite tedious and hard to read.
To simplify the formulation of a kernel, *pystencils* offers the `kernel` decorator, that transforms a normal Python function with `@=` assignments into an assignment list that can be passed to `create_kernel`.
%% Cell type:code id: tags:
``` python
@ps.kernel
def symbolic_description_using_function():
grad_x @= (src[1, 0] - src[-1, 0]) / 2
grad_y @= (src[0, 1] - src[0, -1]) / 2
dst[0, 0] @= grad_x + grad_y
symbolic_description_using_function
```
%%%% Output: execute_result
![]()
$\displaystyle \left[ grad_{x} \leftarrow \frac{{src}_{(1,0)}}{2} - \frac{{src}_{(-1,0)}}{2}, \ grad_{y} \leftarrow \frac{{src}_{(0,1)}}{2} - \frac{{src}_{(0,-1)}}{2}, \ {dst}_{(0,0)} \leftarrow grad_{x} + grad_{y}\right]$
⎡ src_E src_W src_N src_S ⎤
⎢gradₓ := ───── - ─────, grad_y := ───── - ─────, dst_C := gradₓ + grad_y⎥
⎣ 2 2 2 2 ⎦
%% Cell type:markdown id: tags:
The decorated function can contain any Python code, only the `@=` operator, and the ternary inline `if-else` operator have different meaning.
### b) Ternary 'if' with `Piecewise`
The ternary operator maps to `sympy.Piecewise` functions, that can be used to introduce branching into the kernel. Piecewise defined functions must give a value for every input, i.e. there must be a 'otherwise' clause in the end that is indicated by the condition `True`. Piecewise objects are standard sympy terms that can be integrated into bigger expressions:
%% Cell type:code id: tags:
``` python
sp.Piecewise((1.0, src[0,1] > 0), (0.0, True)) + src[1, 0]
```
%%%% Output: execute_result
![]()
$\displaystyle {src}_{(1,0)} + \begin{cases} 1.0 & \text{for}\: {src}_{(0,1)} > 0 \\0.0 & \text{otherwise} \end{cases}$
⎛⎧1.0 for src_N > 0⎞
src_E + ⎜⎨ ⎟
⎝⎩0.0 otherwise ⎠
%% Cell type:markdown id: tags:
Piecewise objects are created by the `kernel` decorator for ternary if-else statements.
%% Cell type:code id: tags:
``` python
@ps.kernel
def kernel_with_piecewise():
grad_x @= (src[1, 0] - src[-1, 0]) / 2 if src[-1, 0] > 0 else 0.0
kernel_with_piecewise
```
%%%% Output: execute_result
![]()
$\displaystyle \left[ grad_{x} \leftarrow \begin{cases} \frac{{src}_{(1,0)}}{2} - \frac{{src}_{(-1,0)}}{2} & \text{for}\: {src}_{(-1,0)} > 0 \\0.0 & \text{otherwise} \end{cases}\right]$
⎡ ⎧src_E src_W ⎤
⎢ ⎪───── - ───── for src_W > 0⎥
⎢gradₓ := ⎨ 2 2 ⎥
⎢ ⎪ ⎥
⎣ ⎩ 0.0 otherwise ⎦
%% Cell type:markdown id: tags:
### c) Assignment level optimizations using `AssignmentCollection`
When the kernels get larger and more complex, it is helpful to organize the list of assignment into a more structured way. The `AssignmentCollection` offers optimizating transformation on a list of assignments. It holds two assignment lists, one for subexpressions and one for the main assignments. Main assignments are typically those that write to an array.
%% Cell type:code id: tags:
``` python
@ps.kernel
def somewhat_longer_dummy_kernel(s):
s.a @= src[0, 1] + src[-1, 0]
s.b @= 2 * src[1, 0] + src[0, -1]
s.c @= src[0, 1] + 2 * src[1, 0] + src[-1, 0] + src[0, -1] - src[0,0]
dst[0, 0] @= s.a + s.b + s.c
ac = ps.AssignmentCollection(main_assignments=somewhat_longer_dummy_kernel[-1:],
subexpressions=somewhat_longer_dummy_kernel[:-1])
ac
```
%%%% Output: execute_result
AssignmentCollection: dst_C, <- f(src_W, src_S, src_N, src_C, src_E)
AssignmentCollection: dst_C, <- f(src_N, src_E, src_W, src_C, src_S)
%% Cell type:code id: tags:
``` python
ac.operation_count
```
%%%% Output: execute_result
{'adds': 8,
'muls': 2,
'divs': 0,
'sqrts': 0,
'fast_sqrts': 0,
'fast_inv_sqrts': 0,
'fast_div': 0}
%% Cell type:markdown id: tags:
The `pystencils.simp` submodule offers several functions to optimize a collection of assignments.
It also offers functionality to group optimization into strategies and evaluate them.
In this example we reduce the number of operations by reusing existing subexpressions to get rid of two unnecessary floating point additions. For more information about assignment collections and simplifications see the [demo notebook](demo_assignment_collection.ipynb).
%% Cell type:code id: tags:
``` python
opt_ac = ps.simp.subexpression_substitution_in_existing_subexpressions(ac)
opt_ac
```
%%%% Output: execute_result
AssignmentCollection: dst_C, <- f(src_W, src_S, src_N, src_C, src_E)
AssignmentCollection: dst_C, <- f(src_N, src_E, src_W, src_C, src_S)
%% Cell type:code id: tags:
``` python
opt_ac.operation_count
```
%%%% Output: execute_result
{'adds': 6,
'muls': 1,
'divs': 0,
'sqrts': 0,
'fast_sqrts': 0,
'fast_inv_sqrts': 0,
'fast_div': 0}
%% Cell type:markdown id: tags:
### d) Ghost layers and iteration region
When creating a kernel with neighbor accesses, *pystencils* automatically restricts the iteration region, such that all accesses are safe.
%% Cell type:code id: tags:
``` python
kernel = ps.create_kernel(ps.Assignment(dst[0,0], src[2, 0] + src[-1, 0]))
ps.show_code(kernel)
```
%%%% Output: display_data
%%%% Output: display_data
%% Cell type:markdown id: tags:
When no additional ghost layer information is given, *pystencils* looks at all neighboring field accesses and introduces the required number of ghost layers **for all directions**. In the example above the largest neighbor accesses was ``src[2, 0]``, so theoretically we would need 2 ghost layers only the the end of the x coordinate.
By default *pystencils* introduces 2 ghost layers at all borders of the domain. The next cell shows how to change this behavior. Be careful with manual ghost layer specification, wrong values may lead to SEGFAULTs.
%% Cell type:code id: tags:
``` python
gl_spec = [(0, 2), # 0 ghost layers at the left, 2 at the right border
(1, 0)] # 1 ghost layer at the lower y, one at the upper y coordinate
kernel = ps.create_kernel(ps.Assignment(dst[0,0], src[2, 0] + src[-1, 0]), ghost_layers=gl_spec)
ps.show_code(kernel)
```
%%%% Output: display_data
%%%% Output: display_data
%% Cell type:markdown id: tags:
## 2 ) Restrictions
### a) Independence Restriction
*pystencils* only works for kernels where each array element can be updated independently from all other elements. This restriction ensures that the kernels can be easily parallelized and also be run on the GPU. Trying to define kernels where the results depends on the iteration order, leads to a ValueError.
%% Cell type:code id: tags:
``` python
invalid_description = [
ps.Assignment(dst[1, 0], src[1, 0] + src[-1, 0]),
ps.Assignment(dst[0, 0], src[1, 0] - src[-1, 0]),
]
try:
invalid_kernel = ps.create_kernel(invalid_description)
assert False, "Should never be executed"
except ValueError as e:
print(e)
```
%%%% Output: stream
Field dst is written at two different locations
%% Cell type:markdown id: tags:
The independence restriction makes sure that the kernel can be safely parallelized by checking the following conditions: If a field is modified inside the kernel, it may only be modified at a single spatial position. In that case the field may also only be read at this position. Fields that are not modified may be read at multiple neighboring positions.
Specifically, this rule allows for in-place updates that don't access neighbors.
%% Cell type:code id: tags:
``` python
valid_kernel = ps.create_kernel(ps.Assignment(src[0,0], 2*src[0,0] + 42))
```
%% Cell type:markdown id: tags:
If a field stores multiple values per cell, as in the next example, this restriction only applies for accesses with the same index.
%% Cell type:code id: tags:
``` python
v = ps.fields("v(2): double[2D]")
valid_kernel = ps.create_kernel([ps.Assignment(v[0,0](1), 2*v[0,0](1) + 42),
ps.Assignment(v[0,1](0), 2*v[1,0](0) + 42)])
```
%% Cell type:markdown id: tags:
### b) Static Single Assignment Form
All assignments that don't write to a field must be in SSA form
1. Each sympy symbol may only occur once as a left-hand-side (fields can be written multiple times)
2. A symbol has to be defined before it is used. If it is never defined it is introduced as function parameter
The next cell demonstrates the first SSA restriction:
%% Cell type:code id: tags:
``` python
@ps.kernel
def not_allowed():
a, b = sp.symbols("a b")
a @= src[0, 0]
b @= a + 3
a @= src[-1, 0]
dst[0, 0] @= a + b
try:
ps.create_kernel(not_allowed)
assert False
except ValueError as e:
print(e)
```
%%%% Output: stream
Assignments not in SSA form, multiple assignments to a
%% Cell type:markdown id: tags:
However, for right hand sides that are Field.Accesses this is allowed:
Also it is not allowed to write a field at the same location
%% Cell type:code id: tags:
``` python
@ps.kernel
def allowed():
def not_allowed():
dst[0, 0] @= src[0, 1] + src[1, 0]
dst[0, 0] @= 2 * dst[0, 0]
ps.create_kernel(allowed)
try:
ps.create_kernel(not_allowed)
assert False
except ValueError as e:
print(e)
```
%%%% Output: execute_result
%%%% Output: stream
Field dst is written twice at the same location
%% Cell type:markdown id: tags:
This situation should be resolved by introducing temporary variables
%% Cell type:code id: tags:
``` python
tmp_var = sp.Symbol("a")
@ps.kernel
def allowed():
tmp_var @= src[0, 1] + src[1, 0]
dst[0, 0] @= 2 * tmp_var
ast = ps.create_kernel(allowed)
ps.show_code(ast)
```
%%%% Output: display_data
%%%% Output: display_data
KernelFunction kernel([_data_dst, _data_src])
......
This source diff could not be displayed because it is too large. You can view the blob instead.
......@@ -11,11 +11,11 @@ Creating kernels
.. autoclass:: pystencils.CreateKernelConfig
:members:
.. autofunction:: pystencils.create_domain_kernel
.. autofunction:: pystencils.kernelcreation.create_domain_kernel
.. autofunction:: pystencils.create_indexed_kernel
.. autofunction:: pystencils.kernelcreation.create_indexed_kernel
.. autofunction:: pystencils.create_staggered_kernel
.. autofunction:: pystencils.kernelcreation.create_staggered_kernel
Code printing
......
......@@ -3,13 +3,13 @@ from .enums import Backend, Target
from . import fd
from . import stencil as stencil
from .assignment import Assignment, assignment_from_stencil
from .data_types import TypedSymbol
from pystencils.typing.typed_sympy import TypedSymbol
from .datahandling import create_data_handling
from .display_utils import get_code_obj, get_code_str, show_code, to_dot
from .field import Field, FieldType, fields
from .config import CreateKernelConfig
from .kernel_decorator import kernel, kernel_config
from .kernelcreation import (
CreateKernelConfig, create_domain_kernel, create_indexed_kernel, create_kernel, create_staggered_kernel)
from .kernelcreation import create_kernel, create_staggered_kernel
from .simp import AssignmentCollection
from .slicing import make_slice
from .spatial_coordinates import x_, x_staggered, x_staggered_vector, x_vector, y_, y_staggered, z_, z_staggered
......@@ -18,8 +18,8 @@ from .sympyextensions import SymbolCreator
__all__ = ['Field', 'FieldType', 'fields',
'TypedSymbol',
'make_slice',
'create_kernel', 'create_domain_kernel', 'create_indexed_kernel', 'create_staggered_kernel',
'CreateKernelConfig',
'create_kernel', 'create_staggered_kernel',
'Target', 'Backend',
'show_code', 'to_dot', 'get_code_obj', 'get_code_str',
'AssignmentCollection',
......
import numpy as np
from pystencils.data_types import BasicType
from pystencils.typing import numpy_name_to_c
def aligned_empty(shape, byte_alignment=True, dtype=np.float64, byte_offset=0, order='C', align_inner_coordinate=True):
......@@ -21,7 +21,7 @@ def aligned_empty(shape, byte_alignment=True, dtype=np.float64, byte_offset=0, o
from pystencils.backends.simd_instruction_sets import (get_supported_instruction_sets, get_cacheline_size,
get_vector_instruction_set)
type_name = BasicType.numpy_name_to_c(np.dtype(dtype).name)
type_name = numpy_name_to_c(np.dtype(dtype).name)
instruction_sets = get_supported_instruction_sets()
if instruction_sets is None:
byte_alignment = 64
......
......@@ -10,16 +10,17 @@ def print_assignment_latex(printer, expr):
"""sympy cannot print Assignments as Latex. Thus, this function is added to the sympy Latex printer"""
printed_lhs = printer.doprint(expr.lhs)
printed_rhs = printer.doprint(expr.rhs)
return r"{printed_lhs} \leftarrow {printed_rhs}".format(printed_lhs=printed_lhs, printed_rhs=printed_rhs)
return fr"{printed_lhs} \leftarrow {printed_rhs}"
def assignment_str(assignment):
return r"{lhs} ← {rhs}".format(lhs=assignment.lhs, rhs=assignment.rhs)
return fr"{assignment.lhs}{assignment.rhs}"
_old_new = sp.codegen.ast.Assignment.__new__
# TODO Typing Part2 add default type, defult_float_type, default_int_type and use sane defaults
def _Assignment__new__(cls, lhs, rhs, *args, **kwargs):
if isinstance(lhs, (list, tuple, sp.Matrix)) and isinstance(rhs, (list, tuple, sp.Matrix)):
assert len(lhs) == len(rhs), f'{lhs} and {rhs} must have same length when performing vector assignment!'
......@@ -34,19 +35,6 @@ LatexPrinter._print_Assignment = print_assignment_latex
sp.MutableDenseMatrix.__hash__ = lambda self: hash(tuple(self))
# Apparently, in SymPy 1.4 Assignment.__hash__ is not implemented. This has been fixed in current master
try:
sympy_version = sp.__version__.split('.')
if int(sympy_version[0]) <= 1 and int(sympy_version[1]) <= 4:
def hash_fun(self):
return hash((self.lhs, self.rhs))
Assignment.__hash__ = hash_fun
except Exception:
pass
def assignment_from_stencil(stencil_array, input_field, output_field,
normalization_factor=None, order='visual') -> Assignment:
"""Creates an assignment
......
......@@ -6,10 +6,10 @@ from typing import Any, List, Optional, Sequence, Set, Union
import sympy as sp
import pystencils
from pystencils.data_types import TypedImaginaryUnit, TypedSymbol, cast_func, create_type
from pystencils.typing.utilities import create_type, get_next_parent_of_type
from pystencils.enums import Target, Backend
from pystencils.field import Field
from pystencils.kernelparameters import FieldPointerSymbol, FieldShapeSymbol, FieldStrideSymbol
from pystencils.typing.typed_sympy import FieldPointerSymbol, FieldShapeSymbol, FieldStrideSymbol, TypedSymbol
from pystencils.sympyextensions import fast_subs
NodeOrExpr = Union['Node', sp.Expr]
......@@ -294,6 +294,8 @@ class SkipIteration(Node):
class Block(Node):
def __init__(self, nodes: List[Node]):
super(Block, self).__init__()
if not isinstance(nodes, list):
nodes = [nodes]
self._nodes = nodes
self.parent = None
for n in self._nodes:
......@@ -542,7 +544,6 @@ class LoopOverCoordinate(Node):
@property
def is_outermost_loop(self):
from pystencils.transformations import get_next_parent_of_type
return get_next_parent_of_type(self, LoopOverCoordinate) is None
@property
......@@ -571,7 +572,8 @@ class SympyAssignment(Node):
self.use_auto = use_auto
def __is_declaration(self):
if isinstance(self._lhs_symbol, cast_func):
from pystencils.typing import CastFunc
if isinstance(self._lhs_symbol, CastFunc):
return False
if any(isinstance(self._lhs_symbol, c) for c in (Field.Access, sp.Indexed, TemporaryMemoryAllocation)):
return False
......@@ -616,7 +618,6 @@ class SympyAssignment(Node):
if isinstance(symbol, Field.Access):
for i in range(len(symbol.offsets)):
loop_counters.add(LoopOverCoordinate.get_loop_counter_symbol(i))
result = {r for r in result if not isinstance(r, TypedImaginaryUnit)}
result.update(loop_counters)
result.update(self._lhs_symbol.atoms(sp.Symbol))
return result
......
......@@ -8,14 +8,17 @@ import sympy as sp
from sympy.core import S
from sympy.core.cache import cacheit
from sympy.logic.boolalg import BooleanFalse, BooleanTrue
from sympy.functions.elementary.trigonometric import TrigonometricFunction, InverseTrigonometricFunction
from sympy.functions.elementary.hyperbolic import HyperbolicFunction
from pystencils.astnodes import KernelFunction, LoopOverCoordinate, Node
from pystencils.cpu.vectorization import vec_all, vec_any, CachelineSize
from pystencils.data_types import (
PointerType, VectorType, address_of, cast_func, create_type, get_type_of_expression,
reinterpret_cast_func, vector_memory_access, BasicType, TypedSymbol)
from pystencils.typing import (
PointerType, VectorType, CastFunc, create_type, get_type_of_expression,
ReinterpretCastFunc, VectorMemoryAccess, BasicType, TypedSymbol)
from pystencils.enums import Backend
from pystencils.fast_approximation import fast_division, fast_inv_sqrt, fast_sqrt
from pystencils.functions import DivFunc, AddressOf
from pystencils.integer_functions import (
bit_shift_left, bit_shift_right, bitwise_and, bitwise_or, bitwise_xor,
int_div, int_power_of_2, modulo_ceil)
......@@ -30,8 +33,6 @@ __all__ = ['generate_c', 'CustomCodeNode', 'PrintNode', 'get_headers', 'CustomSy
HEADER_REGEX = re.compile(r'^[<"].*[">]$')
KERNCRAFT_NO_TERNARY_MODE = False
def generate_c(ast_node: Node,
signature_only: bool = False,
......@@ -63,6 +64,7 @@ def generate_c(ast_node: Node,
printer = custom_backend
elif dialect == Backend.C:
try:
# TODO Vectorization Revamp: instruction_set should not be just slapped on ast
instruction_set = ast_node.instruction_set
except Exception:
instruction_set = None
......@@ -125,7 +127,7 @@ def get_headers(ast_node: Node) -> Set[str]:
# --------------------------------------- Backend Specific Nodes -------------------------------------------------------
# TODO future CustomCodeNode should not be backend specific move it elsewhere
class CustomCodeNode(Node):
def __init__(self, code, symbols_read, symbols_defined, parent=None):
super(CustomCodeNode, self).__init__(parent=parent)
......@@ -219,7 +221,7 @@ class CBackend:
return getattr(self, method_name)(node)