Check for correct alignment offset in cpujit
The following segfaults without a reasonable error message due to incorrect alignment:
from pystencils.session import *
domain_size = (128, 128)
dh = ps.create_data_handling(domain_size, periodicity=(True, True), default_target='cpu')
src = dh.add_array("src", values_per_cell=1, dtype=np.float64, ghost_layers=1, alignment=32)
dh.fill(src.name, 1.0, ghost_layers=True)
dst = dh.add_array("dst", values_per_cell=1, dtype=np.float64, ghost_layers=1, alignment=32)
dh.fill(dst.name, 1.0, ghost_layers=True)
update_rule = ps.Assignment(dst[0, 0], src[0, 0])
opt = {'instruction_set': 'avx', 'assume_aligned': True, 'nontemporal': True, 'assume_inner_stride_one': True}
ast = ps.create_kernel(update_rule, target=dh.default_target, cpu_vectorize_info=opt)
kernel = ast.compile()
ps.show_code(ast)
dh.run_kernel(kernel)
This is because the kernel has zero ghost layers but the fields have one, so alignment is inconsistent. It would have been avoided by adding ghost_layers=1 to the ps.create_kernel call, by removing it from the ps.create_data_handling call, or by changing the update rule to an assignment that includes neighbors (e.g. ps.Assignment(dst[0, 0], src[0, 1])).
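The mismatch can be illustrated with plain address arithmetic. The sketch below assumes that the data handling places the first inner cell (not the first ghost cell) on a 32-byte boundary. The helper names are hypothetical and not part of pystencils.

```python
ALIGNMENT = 32     # bytes, as requested with alignment=32
ITEM_SIZE = 8      # bytes per np.float64 value
GHOST_LAYERS = 1   # ghost layers of the src and dst fields


def cell_address(index, first_inner_address=0):
    """Byte address of cell `index` along the innermost axis.

    Assumption: the first inner cell (index == GHOST_LAYERS) is located at
    `first_inner_address`, which is a multiple of ALIGNMENT.
    """
    return first_inner_address + (index - GHOST_LAYERS) * ITEM_SIZE


def is_aligned(address):
    """True if `address` is a multiple of ALIGNMENT."""
    return address % ALIGNMENT == 0


# A kernel created with ghost_layers=1 starts at the first inner cell.
start_with_ghost_layers = cell_address(GHOST_LAYERS)
# A kernel created with zero ghost layers starts at index 0, the ghost cell.
start_without_ghost_layers = cell_address(0)

print(is_aligned(start_with_ghost_layers))     # True
print(is_aligned(start_without_ghost_layers))  # False: 8 bytes before the boundary
```

Under this assumption an aligned vector load or store issued at index 0 hits an address that is 8 bytes short of the 32-byte boundary, which is consistent with the segfault described above.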
We should show a good error message, probably by adding something like the following to cpujit.py:
if ast_node.instruction_set:
    offset = (ast_node.instruction_set['width'] - ast_node.ghost_layers) * item_size
    offset_cond = "((uintptr_t) buffer_{name}.buf) % buffer_{name}.strides[0] == {offset}".format(name=field.name, offset=str(offset))
    pre_call_code += template_check_array.format(cond=offset_cond, what="offset", name=field.name, expected=str(offset))
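For experimenting with the idea outside the generated C wrapper, the same kind of check can be sketched in NumPy. This is not the cpujit implementation: aligned_view and check_alignment are hypothetical helpers, and the check tests the address of the first cell the kernel would touch rather than the exact condition proposed above.

```python
import numpy as np


def aligned_view(n_items, dtype, alignment, byte_offset=0):
    """Return a 1D array whose byte position `byte_offset` lies on an
    `alignment`-byte boundary (hypothetical helper for this sketch)."""
    item_size = np.dtype(dtype).itemsize
    raw = np.empty(n_items * item_size + alignment, dtype=np.uint8)
    # Shift the start so that (start address + byte_offset) % alignment == 0.
    shift = (-(raw.ctypes.data + byte_offset)) % alignment
    return raw[shift:shift + n_items * item_size].view(dtype)


def check_alignment(arr, first_index, alignment, name="field"):
    """Raise ValueError if the cell at `first_index` is not aligned."""
    address = arr.ctypes.data + first_index * arr.itemsize
    remainder = address % alignment
    if remainder != 0:
        raise ValueError(
            "Wrong alignment offset for array '{}': first accessed cell is "
            "{} bytes past a {}-byte boundary".format(name, remainder, alignment)
        )


# Field with one ghost layer: the first inner cell (index 1) is aligned.
src = aligned_view(130, np.float64, alignment=32, byte_offset=8)

check_alignment(src, first_index=1, alignment=32, name="src")  # passes
try:
    check_alignment(src, first_index=0, alignment=32, name="src")
except ValueError as err:
    print(err)  # readable message instead of a segfault
```

Raising a Python-level error before the kernel call is the behavior the proposed check is meant to provide.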