[WIP] Run BVH constructor on the device
Created by: masterleinad
This is to separate the (fundamental) CUDA-aware MPI changes from the changes to the constructor (commits 10104ff and eb2fa93). These last two commits expose some strange behavior to look at later: we observe floating-point issues (partly addressed by 10104ff), and the success of one of the tests (one_leaf_per_rank) depends on not running hello_world.