Optimize communication within the same rank
Created by: masterleinad
Messages sent within the same rank are typically large. Handling them without MPI therefore also fixes problems with message sizes that MPI
can't handle within a single MPI communication call. See also #131.
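
The idea can be sketched as follows. This is a minimal illustration and not the code in this PR: the helper `copy_self_block`, its parameters, and the offset-based buffer layout are all hypothetical. The block addressed to the calling rank is copied directly from the send buffer into the receive buffer, and only the remaining ranks are returned for actual MPI calls. One likely reason this helps with oversized messages is that the classic MPI interfaces take the element count as an `int`, which a direct copy never has to go through.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of this PR or of any library).
// Assumed layout: the data destined for rank r occupies
// [send_offsets[r], send_offsets[r + 1]) in send_buffer.
// The block for my_rank is copied into recv_buffer starting at
// recv_offset_self; all other non-empty destinations are returned so the
// caller can post real MPI messages for them.
std::vector<int> copy_self_block(std::vector<double> const &send_buffer,
                                 std::vector<std::size_t> const &send_offsets,
                                 std::vector<double> &recv_buffer,
                                 std::size_t recv_offset_self, int my_rank)
{
  std::vector<int> remote_ranks;
  int const n_ranks = static_cast<int>(send_offsets.size()) - 1;
  for (int r = 0; r < n_ranks; ++r)
  {
    std::size_t const begin = send_offsets[r];
    std::size_t const end = send_offsets[r + 1];
    if (begin == end)
      continue; // nothing to send to this rank

    if (r == my_rank)
    {
      // Same rank: plain memory copy, no MPI call and no int-sized count.
      assert(end <= send_buffer.size());
      assert(recv_offset_self + (end - begin) <= recv_buffer.size());
      std::copy(send_buffer.begin() + static_cast<std::ptrdiff_t>(begin),
                send_buffer.begin() + static_cast<std::ptrdiff_t>(end),
                recv_buffer.begin() +
                    static_cast<std::ptrdiff_t>(recv_offset_self));
    }
    else
    {
      remote_ranks.push_back(r);
    }
  }
  return remote_ranks;
}
```

In this sketch the caller would loop over the returned ranks and issue the usual sends and receives for them. The self block never reaches MPI.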