Improve benchmark for distributed tree
Created by: masterleinad
Changes:
- Properly align output.
- Only print information on MPI rank 0.
- Print relevant run parameters.
- Replace `overlap` by `shift` so that values other than zero and one have a more intuitive meaning.
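
For illustration only, the first two items could look roughly like the sketch below. The helper names `format_row` and `print_on_root` are hypothetical and not taken from this PR, and the rank is passed in as a plain integer; in the actual benchmark it would presumably come from `MPI_Comm_rank`.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the PR): format one "label: value" row
// with a left-aligned, fixed-width label column so rows line up.
std::string format_row(std::string const &label, std::string const &value,
                       int width = 24)
{
  std::ostringstream os;
  os << std::left << std::setw(width) << label << ": " << value;
  return os.str();
}

// Hypothetical helper (not from the PR): write a line only on rank 0.
// Returns true if the line was written. The rank is assumed to have been
// obtained elsewhere, e.g. via MPI_Comm_rank.
bool print_on_root(int rank, std::ostream &os, std::string const &text)
{
  if (rank != 0)
    return false;
  os << text << '\n';
  return true;
}
```

With helpers like these, every rank can call `print_on_root(rank, std::cout, format_row("label", "value"))` and only rank 0 produces output, which avoids duplicated lines when the benchmark runs on several processes.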