Store indices of children nodes rather than pointer in the bounding volume hierarchy
Created by: dalg24
Motivation: this will let us copy trees between host and device and enable checkpoint/restart if that's something we want to get into.
Note that indices are store as ptrdiff_t
that has the same size as a pointer so that we do not change the alignment.
Merge request reports
Activity
Created by: dalg24
Have you considered switching from from getBoundingVolume(Node*) to getBoundingVolume(index)
I have. You need to now about the root so it remains a member function of BVH. I see no advantage in changing at the moment.
from distance(Node*) to distance(index)?
We can explore later but same I see no immediate benefit and it is not necessary to achieve the goal of the PR.
In general, is there a reason to keep Node* usage in many places in the code?
We can revisit later, this calls for more careful thinking.
Created by: dalg24
@dalg24 The reason I'm asking is that I want to play around with separating the bounding boxes from tree again, and those things would help a lot.
That's orthogonal we do not access the bounding boxes directly through the node.
$ git grep getBoundingVolume src/details/ArborX_DetailsTreeTraversal.hpp src/details/ArborX_DetailsTreeTraversal.hpp: if (predicate(bvh.getBoundingVolume(bvh.getRoot()))) src/details/ArborX_DetailsTreeTraversal.hpp: if (predicate(bvh.getBoundingVolume(child))) src/details/ArborX_DetailsTreeTraversal.hpp: bvh.getBoundingVolume(node)); src/details/ArborX_DetailsTreeTraversal.hpp: return distance(geometry, bvh.getBoundingVolume(node));
285 285 std::cout << "ref=" << ref.str() << "\n"; 286 286 287 287 // hierarchy generation 288 using ArborX::Box; 289 using ArborX::Details::makeLeafNode; 288 290 using ArborX::Details::Node; 289 291 Kokkos::View<Node *, DeviceType> leaf_nodes("leaf_nodes", n); 290 292 Kokkos::View<Node *, DeviceType> internal_nodes("internal_nodes", n - 1); 285 285 std::cout << "ref=" << ref.str() << "\n"; 286 286 287 287 // hierarchy generation 288 using ArborX::Box; 289 using ArborX::Details::makeLeafNode; 288 290 using ArborX::Details::Node; 289 291 Kokkos::View<Node *, DeviceType> leaf_nodes("leaf_nodes", n); 290 292 Kokkos::View<Node *, DeviceType> internal_nodes("internal_nodes", n - 1); Created by: aprokop
Cuda (Summit)
No difference in timings except for a single case: radius search for filled box/filled ball, where the new version is slightly (~10%) faster (better for larger). NOTE: both labels in plot are CUDA.
Serial (Summit)
For serial, most things are pretty close, though radius search for both filled and hollow variants is consistently slower for all scales by about 3-5%.
OpenMP (Summit) 42/smt1
For the most part, little difference. The sticking out thing is radius search for filled box, where the new version is consistently 8% slower.