
Draft: Do not reorder accesses in `move_constants_before_loop`

Daniel Bauer requested to merge terraneo/pystencils:bauerd/move-constants into master

Prior to this MR, `move_constants_before_loop` moved constants as far to the top as possible. This could reorder read/write accesses to fields.

For example:

```python
import pystencils as ps
from pystencils import CreateKernelConfig
from pystencils.astnodes import Block, KernelFunction, LoopOverCoordinate, SympyAssignment
from pystencils.field import Field, FieldType
from sympy.abc import x, y

field = Field.create_generic("field", 1, field_type=FieldType.CUSTOM)

counter = LoopOverCoordinate.get_loop_counter_symbol(0)
load = SympyAssignment(x, field.absolute_access((counter,), (0,)))
store = SympyAssignment(field.absolute_access((counter+1,), (0,)), 2*x)

body = ps.typing.transformations.add_types(Block([load, store]), CreateKernelConfig())
loop = LoopOverCoordinate(body, 0, 0, 42)
block = Block([loop])

ps.transformations.resolve_field_accesses(block)
new_loops = ps.transformations.cut_loop(loop, [41])
ps.transformations.move_constants_before_loop(new_loops.args[1])

kernel = KernelFunction(
  block,
  ps.Target.CPU,
  ps.Backend.C,
  ps.cpu.cpujit.make_python_function,
  None,
)
code = ps.get_code_str(kernel)
print(code)
```

prints

```c
FUNC_PREFIX void kernel(double * RESTRICT  _data_field, int64_t const _stride_field_0)
{
   const double x = _data_field[41*_stride_field_0];
   _data_field[42*_stride_field_0] = x*2.0;
   {
      for (int64_t ctr_0 = 0; ctr_0 < 41; ctr_0 += 1)
      {
         const double x = _data_field[_stride_field_0*ctr_0];
         _data_field[_stride_field_0*(ctr_0 + 1)] = x*2.0;
      }
      {

      }
   }
}
```

Note that the last (cut) loop iteration is moved before the primary loop. It therefore loads index 41 before the primary loop has written to it, which yields a wrong result.
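To make the effect concrete, the two execution orders can be mimicked in plain Python. This is only an illustrative sketch: the function names and the initial values are made up for this example, and a Python list stands in for the field with unit stride.

```python
def run_in_order(field):
    """Reference semantics: iterations 0..41 run in ascending order."""
    data = list(field)
    for i in range(42):
        data[i + 1] = 2.0 * data[i]
    return data


def run_reordered(field):
    """Mimics the generated kernel: the cut iteration (i = 41) runs first."""
    data = list(field)
    x = data[41]            # stale load: index 41 has not been written yet
    data[42] = 2.0 * x
    for i in range(41):     # primary loop, iterations 0..40
        data[i + 1] = 2.0 * data[i]
    return data


# Hypothetical initial data: 43 entries, only the first one is non-zero.
initial = [1.0] + [0.0] * 42
expected = run_in_order(initial)
actual = run_reordered(initial)
```

With this initial data, `expected[42]` is `2.0**42`, whereas `actual[42]` stays `0.0` because the reordered version reads index 41 before the loop fills it. All other entries agree.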

This MR changes `move_constants_before_loop` such that assignments cannot be moved before the last modification of the symbols they depend on. Essentially, it replaces `symbols_defined` with `symbols_modified` when deciding how far an assignment may be moved. The new `symbols_modified` property is implemented for all AST nodes. Note the implementation for `CustomCCodeNode`, where I did not want to introduce breaking changes to the API.
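The difference between the two criteria can be sketched with a toy model. The `Node` class and the `insertion_index` helper below are hypothetical and are not part of the pystencils API. They only assume that a loop writing into a field reports that field as modified, while it does not report it as defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy stand-in for an AST node (not a pystencils class)."""
    name: str
    reads: frozenset = frozenset()     # symbols the node depends on
    defines: frozenset = frozenset()   # symbols the node declares
    modifies: frozenset = frozenset()  # symbols or memory the node writes


def insertion_index(preceding, assignment, blocking_attr):
    """Earliest position among `preceding` where `assignment` may be placed.

    Walks backwards and stops at the last node whose `blocking_attr` set
    overlaps with the symbols the assignment reads.
    """
    for idx in range(len(preceding) - 1, -1, -1):
        if getattr(preceding[idx], blocking_attr) & assignment.reads:
            return idx + 1
    return 0


# The primary loop writes into the field but declares nothing visible outside.
primary_loop = Node("loop", reads=frozenset({"field"}), modifies=frozenset({"field"}))
# The load of the cut iteration reads the field and declares x.
load = Node("load", reads=frozenset({"field"}),
            defines=frozenset({"x"}), modifies=frozenset({"x"}))

old_position = insertion_index([primary_loop], load, "defines")
new_position = insertion_index([primary_loop], load, "modifies")
```

In this model, the old criterion gives `old_position == 0`, so the load is hoisted above the loop. The new criterion gives `new_position == 1`, which keeps the load after the loop that writes the field.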

Additionally, declarations are now inserted where the caller requests, instead of pushing them all the way to the top (https://i10git.cs.fau.de/terraneo/pystencils/-/commit/5c65d06216d050c22e28ba0b9487544342fc0926).

Lastly, a test for the new behavior is included.
