Use unsigned int for permutation indices instead of size_t
Created by: aprokop
While looking into the causes of the CUDA construction slowdown in #235, I noticed that we use size_t for permutations. That PR will change everything to int anyway, so I was curious whether switching the permutation indices to unsigned int would change anything.
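Apparently, it does. The results on Summit are below; the numeric columns are the relative change in time and CPU, followed by old and new time, then old and new CPU. A brief summary: it makes essentially no difference on CPU (although on my workstation it sped up by 10%), but it makes a significant difference on GPU, especially for small constructions (which is exactly the problem in #235). CUDA construction is about 13% faster at size 10000 and about 5-6% faster at sizes 100000 and 1000000.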
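To illustrate the kind of change being measured, here is a minimal host-side sketch of a sorting permutation whose index type is a template parameter. It is not the ArborX implementation: `sortPermutation` is a hypothetical helper, and the use of `std::vector` and `std::stable_sort` is an assumption made to keep the example self-contained. On typical 64-bit platforms `unsigned int` takes 4 bytes per index versus 8 for `size_t`; whether that fully explains the timings is not established here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical helper (not ArborX code): returns the permutation that sorts
// `keys` in ascending order, stored using the requested index type.
template <typename Index>
std::vector<Index> sortPermutation(std::vector<unsigned int> const &keys)
{
  // A narrower index type limits how many entries can be addressed.
  assert(keys.size() <= std::numeric_limits<Index>::max());

  std::vector<Index> permute(keys.size());
  std::iota(permute.begin(), permute.end(), Index{0});

  // Sort the indices by the key they refer to; ties keep their input order.
  std::stable_sort(permute.begin(), permute.end(),
                   [&keys](Index a, Index b) { return keys[a] < keys[b]; });
  return permute;
}
```

Calling `sortPermutation<std::size_t>(keys)` and `sortPermutation<unsigned int>(keys)` on the same input yields the same ordering; only the storage per index differs.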
$ compare_bench.py benchmarks master.json unsigned_permute.json | grep median
BM_construction<ArborX::BVH<Serial>>/10000/0/manual_time_median +0.0085 +0.0085 3082 3108 3083 3109
BM_construction<ArborX::BVH<Serial>>/100000/0/manual_time_median +0.0055 +0.0055 31403 31576 31405 31578
BM_construction<ArborX::BVH<Serial>>/1000000/0/manual_time_median +0.0243 +0.0243 387338 396745 387898 397311
BM_construction<ArborX::BVH<Serial>>/10000/1/manual_time_median +0.0065 +0.0065 3100 3120 3101 3121
BM_construction<ArborX::BVH<Serial>>/100000/1/manual_time_median +0.0040 +0.0040 32319 32450 32321 32451
BM_construction<ArborX::BVH<Serial>>/1000000/1/manual_time_median +0.0159 +0.0158 402069 408467 402660 409034
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/2/manual_time_median +0.0042 +0.0042 45285 45476 45289 45479
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/2/manual_time_median +0.0034 +0.0034 482152 483806 482173 483825
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/2/manual_time_median +0.0054 +0.0054 5154061 5182131 5154427 5182490
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/3/manual_time_median +0.0043 +0.0043 46610 46810 46614 46813
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/3/manual_time_median +0.0053 +0.0053 656396 659846 656421 659876
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/3/manual_time_median +0.0088 +0.0088 9299051 9380604 9299560 9381093
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/2/manual_time_median +0.0060 +0.0060 66538 66938 66542 66942
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/2/manual_time_median +0.0036 +0.0036 699675 702221 699706 702250
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/2/manual_time_median +0.0011 +0.0011 7330743 7339006 7331382 7339579
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/1/3/manual_time_median +0.0042 +0.0042 15861 15927 15863 15929
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/1/3/manual_time_median -0.0022 -0.0022 110701 110456 110708 110462
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/1/3/manual_time_median -0.0082 -0.0082 988659 980509 988687 980542
BM_construction<ArborX::BVH<OpenMP>>/10000/0/manual_time_median -0.0049 -0.0048 3082 3067 3084 3069
BM_construction<ArborX::BVH<OpenMP>>/100000/0/manual_time_median +0.0101 +0.0101 31370 31687 31373 31691
BM_construction<ArborX::BVH<OpenMP>>/1000000/0/manual_time_median -0.0742 -0.0742 392968 363812 393596 364401
BM_construction<ArborX::BVH<OpenMP>>/10000/1/manual_time_median -0.0099 -0.0098 3137 3106 3139 3108
BM_construction<ArborX::BVH<OpenMP>>/100000/1/manual_time_median -0.0231 -0.0231 32689 31935 32692 31938
BM_construction<ArborX::BVH<OpenMP>>/1000000/1/manual_time_median -0.0394 -0.0395 413017 396749 413682 397323
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/2/manual_time_median -0.0015 -0.0015 45341 45274 45345 45277
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/2/manual_time_median -0.0032 -0.0032 483070 481517 483092 481539
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/2/manual_time_median -0.0043 -0.0044 5173222 5150740 5173652 5151132
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/3/manual_time_median -0.0024 -0.0024 46646 46536 46649 46539
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/3/manual_time_median -0.0033 -0.0033 656584 654420 656608 654443
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/3/manual_time_median -0.0060 -0.0060 9340168 9284277 9340744 9284806
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/2/manual_time_median +0.0040 +0.0040 65913 66177 65917 66181
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/2/manual_time_median +0.0037 +0.0037 691815 694380 691841 694411
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/2/manual_time_median +0.0021 +0.0021 7241328 7256509 7241826 7256968
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/1/3/manual_time_median -0.0001 -0.0001 15883 15882 15885 15884
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/1/3/manual_time_median -0.0009 -0.0011 110559 110459 110565 110448
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/1/3/manual_time_median -0.0015 -0.0015 985677 984187 985711 984209
BM_construction<ArborX::BVH<Cuda>>/10000/0/manual_time_median -0.1373 -0.1099 1138 982 1425 1268
BM_construction<ArborX::BVH<Cuda>>/100000/0/manual_time_median -0.0630 -0.0570 2857 2677 3159 2979
BM_construction<ArborX::BVH<Cuda>>/1000000/0/manual_time_median -0.0513 -0.0482 7785 7385 8285 7886
BM_construction<ArborX::BVH<Cuda>>/10000/1/manual_time_median -0.1257 -0.1001 1125 984 1411 1270
BM_construction<ArborX::BVH<Cuda>>/100000/1/manual_time_median -0.0612 -0.0553 2858 2683 3160 2986
BM_construction<ArborX::BVH<Cuda>>/1000000/1/manual_time_median -0.0496 -0.0466 7776 7390 8276 7890
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/2/manual_time_median -0.0724 -0.0692 2029 1882 2118 1972
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/2/manual_time_median -0.0180 -0.0173 12422 12198 12781 12559
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/2/manual_time_median -0.0016 -0.0016 113910 113729 114614 114429
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/3/manual_time_median -0.0686 -0.0662 2223 2071 2312 2159
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/3/manual_time_median -0.0063 -0.0063 17582 17470 17945 17831
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/3/manual_time_median +0.0012 +0.0011 234339 234611 235051 235321
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/2/manual_time_median -0.1255 -0.1164 1083 947 1171 1034
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/2/manual_time_median -0.1141 -0.1092 7676 6801 8047 7168
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/2/manual_time_median -0.0163 -0.0161 56190 55272 56899 55980
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/1/3/manual_time_median -0.1353 -0.1248 1001 866 1088 952
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/1/3/manual_time_median -0.0433 -0.0391 3398 3251 3761 3614
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/1/3/manual_time_median -0.0411 -0.0390 10616 10180 11266 10827
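As a sanity check on the first two numeric columns, the relative change can be recomputed from the old and new values. The sketch below assumes the definition (new - old) / |old|, which is consistent with the rows above; `relativeChange` is a hypothetical helper for illustration, not part of compare_bench.py.

```cpp
#include <cmath>

// Relative change between two measurements: (new - old) / |old|.
// Negative values mean the new run took less time than the old one.
// Assumes oldValue is nonzero.
double relativeChange(double oldValue, double newValue)
{
  return (newValue - oldValue) / std::abs(oldValue);
}
```

For the first CUDA construction row, `relativeChange(1138, 982)` is roughly -0.137, in line with the reported -0.1373; the small gap comes from the printed times being rounded.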