Pystencils fails to vectorize very simple kernels:
```python
import numpy as np
import pystencils as ps
from pystencils.astnodes import SympyAssignment, TypedSymbol

f = ps.fields("f: [1D]")
x = TypedSymbol("x", np.float64)
kernel = ps.create_kernel(
    [SympyAssignment(x, 2.0), SympyAssignment(f[0], x)],
    cpu_vectorize_info={"assume_inner_stride_one": True},
)
ps.show_code(kernel)
```
This example throws an exception in `show_code`, complaining that the printer cannot vectorize type casts.
The problem is that `x = 2.0` is moved out of the loop (since it is constant).
What remains in the loop is `f[i] = x`.
While the left-hand side of this expression is vectorized, the right-hand side is left scalar, leading to the exception.
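To illustrate the hoisting step, here is a minimal toy model in plain Python. It does not use pystencils; the `Assign` class and `hoist_loop_invariants` function are hypothetical names made up for this sketch, and the real transformation is certainly more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assign:
    """Toy assignment (hypothetical, not a pystencils class)."""
    lhs: str                        # e.g. "x" or "f[i]"
    rhs: str                        # e.g. "2.0" or "x"
    reads: frozenset = frozenset()  # symbols read by the rhs


def hoist_loop_invariants(body, loop_index="i"):
    """Split a loop body into (hoisted, remaining) assignments.

    An assignment stays in the loop if it mentions the loop index or
    reads a symbol that is assigned by a statement staying in the loop.
    """
    hoisted, remaining = [], []
    loop_local = set()
    for a in body:
        uses_index = f"[{loop_index}]" in a.lhs or f"[{loop_index}]" in a.rhs
        if uses_index or (a.reads & loop_local):
            remaining.append(a)
            loop_local.add(a.lhs)
        else:
            hoisted.append(a)
    return hoisted, remaining


body = [Assign("x", "2.0"), Assign("f[i]", "x", frozenset({"x"}))]
hoisted, remaining = hoist_loop_invariants(body)

print([f"{a.lhs} = {a.rhs}" for a in hoisted])    # ['x = 2.0']
print([f"{a.lhs} = {a.rhs}" for a in remaining])  # ['f[i] = x']
```

After vectorization, the store to `f[i]` covers several lanes at once, while `x` is still a single scalar defined outside the loop, so the two sides of the remaining assignment no longer have matching types.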
The issue comes from the `insert_vector_casts` function.
It traverses each expression from the leaves to the root, leaving scalars scalar[^1] and converting mixed expressions to vectors.
However, it handles the rhs of assignments separately from the lhs, leading to the above issue.
Moreover, expressions like `a (vec) + (b (scalar) * c (scalar))` are converted to `a (vec) + CastToVec(b (scalar) * c (scalar))`, which leads to the same exception.
The correct way is to cast `b` and `c` to vectors directly, not their product.
Therefore, `insert_vector_casts` must know beforehand whether an expression appears inside a vectorized expression.
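The difference between the two strategies can be sketched on a toy expression tree. This is plain Python, not the pystencils implementation: `Sym`, `BinOp`, `cast_bottom_up`, and `cast_leaves` are hypothetical names for illustration, and `CastToVec` here is only a stand-in node mirroring the notation above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    name: str
    is_vector: bool = False


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class CastToVec:
    arg: object


def contains_vector(expr):
    if isinstance(expr, Sym):
        return expr.is_vector
    if isinstance(expr, CastToVec):
        return True
    return contains_vector(expr.left) or contains_vector(expr.right)


def cast_bottom_up(expr):
    """Leaves to root: wrap a scalar subtree where it meets a vector operand."""
    if isinstance(expr, (Sym, CastToVec)):
        return expr
    left, right = cast_bottom_up(expr.left), cast_bottom_up(expr.right)
    left_vec, right_vec = contains_vector(left), contains_vector(right)
    if left_vec and not right_vec:
        right = CastToVec(right)
    elif right_vec and not left_vec:
        left = CastToVec(left)
    return BinOp(expr.op, left, right)


def cast_leaves(expr, in_vector_context):
    """Root to leaves: in a vector context, cast each scalar symbol itself."""
    if isinstance(expr, CastToVec):
        return expr
    if isinstance(expr, Sym):
        needs_cast = in_vector_context and not expr.is_vector
        return CastToVec(expr) if needs_cast else expr
    return BinOp(expr.op,
                 cast_leaves(expr.left, in_vector_context),
                 cast_leaves(expr.right, in_vector_context))


def show(expr):
    if isinstance(expr, Sym):
        return expr.name
    if isinstance(expr, CastToVec):
        return f"CastToVec({show(expr.arg)})"
    return f"({show(expr.left)} {expr.op} {show(expr.right)})"


a = Sym("a", is_vector=True)
b, c = Sym("b"), Sym("c")
expr = BinOp("+", a, BinOp("*", b, c))

print(show(cast_bottom_up(expr)))                      # (a + CastToVec((b * c)))
print(show(cast_leaves(expr, contains_vector(expr))))  # (a + (CastToVec(b) * CastToVec(c)))
```

The second variant needs the `in_vector_context` flag before it descends into the tree, which is the "must know beforehand" requirement described above.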
This MR fixes that for SympyAssignments. To that end, it first checks whether either side contains a vectorized expression, and if so, casts all symbols to vectors.
Since I am not really sure how to handle the cases for `VectorMemoryAccess` (line 370/386) and `ast.Conditional` (line 374/390), I left those untouched.
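As a rough sketch of the assignment-level check (again a toy in plain Python, not the actual change in this MR): expressions are nested tuples of the form `(op, arg, ...)`, symbols are strings, and `VECTOR_SYMBOLS`, `symbols`, `cast_symbols`, and `vectorize_assignment` are all hypothetical names. For brevity the sketch only casts symbols on the rhs and leaves the lhs as it is; numeric literals are also left alone.

```python
# Hypothetical set of symbols that are already vectorized (e.g. field accesses).
VECTOR_SYMBOLS = {"f[i]"}


def symbols(expr):
    """Collect all symbol names in a nested-tuple expression."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        _op, *args = expr
        return set().union(*(symbols(arg) for arg in args))
    return set()  # numeric literal


def cast_symbols(expr):
    """Wrap every scalar symbol in a CastToVec node."""
    if isinstance(expr, str):
        return expr if expr in VECTOR_SYMBOLS else ("CastToVec", expr)
    if isinstance(expr, tuple):
        op, *args = expr
        return (op, *[cast_symbols(arg) for arg in args])
    return expr  # numeric literal, untouched in this sketch


def vectorize_assignment(lhs, rhs):
    """Look at both sides first, then decide whether to cast the rhs symbols."""
    is_vectorized = bool((symbols(lhs) | symbols(rhs)) & VECTOR_SYMBOLS)
    if not is_vectorized:
        return lhs, rhs
    return lhs, cast_symbols(rhs)


print(vectorize_assignment("f[i]", "x"))
# ('f[i]', ('CastToVec', 'x'))
print(vectorize_assignment("y", ("*", "b", "c")))
# ('y', ('*', 'b', 'c'))
```

Because the decision is made once per assignment, a scalar rhs such as `x` is cast as soon as the lhs is a vectorized access, which covers the `f[i] = x` case from the example.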

[^1]: The exception is that CastFunctions are always replaced by vector casts. I do not know whether this is intentional.