using memcpy instead of packing individual elements is faster

See merge request walberla/walberla!208
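
A minimal sketch of the idea, not the code from the merge request: `ByteBuffer`, `packElementwise`, `packMemcpy`, and `unpackMemcpy` are hypothetical names chosen for illustration and are not waLBerla's buffer API. It assumes the elements are trivially copyable and stored contiguously, which is what makes a single bulk copy valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical byte buffer used only for this illustration.
using ByteBuffer = std::vector<unsigned char>;

// Per-element packing: appends each value separately, so the buffer
// is grown and written once per element.
template <typename T>
void packElementwise(ByteBuffer& buf, const std::vector<T>& values)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies require a trivially copyable type");
    for (const T& v : values)
    {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(&v);
        buf.insert(buf.end(), p, p + sizeof(T));
    }
}

// Bulk packing: grows the buffer once and copies the whole contiguous
// range with a single memcpy.
template <typename T>
void packMemcpy(ByteBuffer& buf, const std::vector<T>& values)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies require a trivially copyable type");
    if (values.empty())
        return; // avoid calling memcpy with a possibly null source pointer
    const std::size_t bytes  = values.size() * sizeof(T);
    const std::size_t offset = buf.size();
    buf.resize(offset + bytes);
    std::memcpy(buf.data() + offset, values.data(), bytes);
}

// Reads `count` elements back from the start of the buffer.
template <typename T>
std::vector<T> unpackMemcpy(const ByteBuffer& buf, std::size_t count)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies require a trivially copyable type");
    assert(buf.size() >= count * sizeof(T));
    std::vector<T> out(count);
    if (count > 0)
        std::memcpy(out.data(), buf.data(), count * sizeof(T));
    return out;
}
```

Both packing functions produce the same bytes; the bulk variant simply replaces many small appends with one resize and one copy. How large the gain is depends on the element type, the buffer implementation, and the compiler, so it is worth measuring for the case at hand.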