Performance regression
Created by: aprokop
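We seem to have introduced a significant performance regression in the execution space refactoring for Cuda (maybe for others too, I have not checked). Comparing 0fbcb17b with 8e272698 (numeric columns: relative change in time, relative change in CPU, old time, new time, old CPU, new CPU):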
$ compare_bench.py benchmarks 202004141712_0fbcb17.json 202004141709_8e27269.json | grep median
BM_construction<ArborX::BVH<Cuda>>/10000/0/manual_time_median -0.0204 -0.0150 1608 1575 1597 1573
BM_construction<ArborX::BVH<Cuda>>/100000/0/manual_time_median -0.2722 -0.2394 3184 2317 3563 2710
BM_construction<ArborX::BVH<Cuda>>/1000000/0/manual_time_median +0.2564 +0.2191 7952 9991 8432 10279
BM_construction<ArborX::BVH<Cuda>>/10000/1/manual_time_median -0.0452 -0.0316 1662 1587 1652 1600
BM_construction<ArborX::BVH<Cuda>>/100000/1/manual_time_median -0.2217 -0.1972 3289 2560 3676 2951
BM_construction<ArborX::BVH<Cuda>>/1000000/1/manual_time_median +0.3687 +0.3340 8321 11390 8810 11753
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/2/manual_time_median -0.0006 -0.0006 1444 1444 1534 1533
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/2/manual_time_median -0.0005 +0.0039 10598 10593 11011 11054
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/2/manual_time_median +0.0004 +0.0002 96161 96202 97028 97043
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/1/3/manual_time_median -0.0015 -0.0015 1605 1603 1694 1692
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/1/3/manual_time_median -0.0003 +0.0001 13720 13716 14166 14168
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/1/3/manual_time_median -0.0001 -0.0002 197086 197066 197942 197894
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/0/2/manual_time_median -0.0019 -0.0020 1105 1103 1195 1192
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/0/2/manual_time_median +0.0299 +0.0259 7098 7310 7506 7700
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/0/2/manual_time_median +0.0152 +0.0144 60885 61812 61748 62636
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/1/0/1/3/manual_time_median -0.0009 -0.0008 1067 1066 1157 1156
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/1/0/1/3/manual_time_median +0.0067 -0.0007 3769 3794 4228 4225
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/0/1/3/manual_time_median +0.0034 +0.0030 10351 10386 11189 11223
It's all in construction: the 1000000 cases are 25-37% slower, while knn and radius search are essentially unchanged. I observed an even worse slowdown for the HACC data, which is larger (36M).
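For anyone double-checking the numbers, here is a minimal sketch that recomputes the relative time change for the construction rows from the old and new medians. It assumes the third and fourth numeric columns are the old and new times (inferred from the arithmetic, since `grep median` dropped the header). The shortened benchmark names and the helper `relative_change` are just for illustration and are not part of `compare_bench.py`.

```python
def relative_change(old: float, new: float) -> float:
    """Return (new - old) / old; a positive value means the new commit is slower."""
    return (new - old) / old


# (shortened benchmark name, old median time, new median time),
# copied from the BM_construction rows above.
construction = [
    ("BM_construction/10000/0", 1608, 1575),
    ("BM_construction/100000/0", 3184, 2317),
    ("BM_construction/1000000/0", 7952, 9991),
    ("BM_construction/10000/1", 1662, 1587),
    ("BM_construction/100000/1", 3289, 2560),
    ("BM_construction/1000000/1", 8321, 11390),
]

changes = {name: relative_change(old, new) for name, old, new in construction}

for name, change in changes.items():
    print(f"{name}: {change:+.4f}")
```

The recomputed values match the reported first column to within rounding. Only the 1000000 cases come out positive (slower), while the 100000 cases come out negative (faster), so the regression seems to show up only at the larger sizes.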
P.S. This also reminds me of a similar problem from #242.