Use streams for GPU communication
This MR introduces streams for local communication in the Uniform GPU communication scheme.
It also fixes bugs in the NonUniform scheme.