Use unsigned int for permutation indices instead of size_t

Created by: aprokop

While looking into the causes of the CUDA construction slowdown in #235, I noticed that we use size_t for permutation indices. That PR will change everything to int anyway, so I was curious whether switching the permutation indices to unsigned int would make a difference.

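Apparently, it does. The results on Summit are below. In brief: it makes essentially no difference on CPU there (on my workstation it gave a 10% speedup), but it makes a significant difference on GPU, especially for small constructions (which is exactly the problem in #235). CUDA construction with 10000 points is about 13% faster, versus about 5% with 1000000 points. In each row, the first two numbers are the relative change in time and in CPU time (negative means faster), followed by the old and new time and the old and new CPU time.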

$ compare_bench.py benchmarks master.json unsigned_permute.json | grep median                                                            
BM_construction<ArborX::BVH<Serial>>/10000/0/manual_time_median                                  +0.0085         +0.0085          3082          3108          3083          3109
BM_construction<ArborX::BVH<Serial>>/100000/0/manual_time_median                                 +0.0055         +0.0055         31403         31576         31405         31578
BM_construction<ArborX::BVH<Serial>>/1000000/0/manual_time_median                                +0.0243         +0.0243        387338        396745        387898        397311
BM_construction<ArborX::BVH<Serial>>/10000/1/manual_time_median                                  +0.0065         +0.0065          3100          3120          3101          3121
BM_construction<ArborX::BVH<Serial>>/100000/1/manual_time_median                                 +0.0040         +0.0040         32319         32450         32321         32451
BM_construction<ArborX::BVH<Serial>>/1000000/1/manual_time_median                                +0.0159         +0.0158        402069        408467        402660        409034
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/0/2/manual_time_median                         +0.0042         +0.0042         45285         45476         45289         45479
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/0/2/manual_time_median                       +0.0034         +0.0034        482152        483806        482173        483825
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/2/manual_time_median                     +0.0054         +0.0054       5154061       5182131       5154427       5182490
BM_knn_search<ArborX::BVH<Serial>>/10000/10000/10/1/3/manual_time_median                         +0.0043         +0.0043         46610         46810         46614         46813
BM_knn_search<ArborX::BVH<Serial>>/100000/100000/10/1/3/manual_time_median                       +0.0053         +0.0053        656396        659846        656421        659876
BM_knn_search<ArborX::BVH<Serial>>/1000000/1000000/10/1/3/manual_time_median                     +0.0088         +0.0088       9299051       9380604       9299560       9381093
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/0/2/manual_time_median                    +0.0060         +0.0060         66538         66938         66542         66942
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/0/2/manual_time_median                  +0.0036         +0.0036        699675        702221        699706        702250
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/0/2/manual_time_median                +0.0011         +0.0011       7330743       7339006       7331382       7339579
BM_radius_search<ArborX::BVH<Serial>>/10000/10000/10/0/1/3/manual_time_median                    +0.0042         +0.0042         15861         15927         15863         15929
BM_radius_search<ArborX::BVH<Serial>>/100000/100000/10/0/1/3/manual_time_median                  -0.0022         -0.0022        110701        110456        110708        110462
BM_radius_search<ArborX::BVH<Serial>>/1000000/1000000/10/0/1/3/manual_time_median                -0.0082         -0.0082        988659        980509        988687        980542
BM_construction<ArborX::BVH<OpenMP>>/10000/0/manual_time_median                                  -0.0049         -0.0048          3082          3067          3084          3069
BM_construction<ArborX::BVH<OpenMP>>/100000/0/manual_time_median                                 +0.0101         +0.0101         31370         31687         31373         31691
BM_construction<ArborX::BVH<OpenMP>>/1000000/0/manual_time_median                                -0.0742         -0.0742        392968        363812        393596        364401
BM_construction<ArborX::BVH<OpenMP>>/10000/1/manual_time_median                                  -0.0099         -0.0098          3137          3106          3139          3108
BM_construction<ArborX::BVH<OpenMP>>/100000/1/manual_time_median                                 -0.0231         -0.0231         32689         31935         32692         31938
BM_construction<ArborX::BVH<OpenMP>>/1000000/1/manual_time_median                                -0.0394         -0.0395        413017        396749        413682        397323
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/2/manual_time_median                         -0.0015         -0.0015         45341         45274         45345         45277
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/2/manual_time_median                       -0.0032         -0.0032        483070        481517        483092        481539
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/2/manual_time_median                     -0.0043         -0.0044       5173222       5150740       5173652       5151132
BM_knn_search<ArborX::BVH<OpenMP>>/10000/10000/10/1/3/manual_time_median                         -0.0024         -0.0024         46646         46536         46649         46539
BM_knn_search<ArborX::BVH<OpenMP>>/100000/100000/10/1/3/manual_time_median                       -0.0033         -0.0033        656584        654420        656608        654443
BM_knn_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/1/3/manual_time_median                     -0.0060         -0.0060       9340168       9284277       9340744       9284806
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/0/2/manual_time_median                    +0.0040         +0.0040         65913         66177         65917         66181
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/0/2/manual_time_median                  +0.0037         +0.0037        691815        694380        691841        694411
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/0/2/manual_time_median                +0.0021         +0.0021       7241328       7256509       7241826       7256968
BM_radius_search<ArborX::BVH<OpenMP>>/10000/10000/10/0/1/3/manual_time_median                    -0.0001         -0.0001         15883         15882         15885         15884
BM_radius_search<ArborX::BVH<OpenMP>>/100000/100000/10/0/1/3/manual_time_median                  -0.0009         -0.0011        110559        110459        110565        110448
BM_radius_search<ArborX::BVH<OpenMP>>/1000000/1000000/10/0/1/3/manual_time_median                -0.0015         -0.0015        985677        984187        985711        984209
BM_construction<ArborX::BVH<Cuda>>/10000/0/manual_time_median                                    -0.1373         -0.1099          1138           982          1425          1268
BM_construction<ArborX::BVH<Cuda>>/100000/0/manual_time_median                                   -0.0630         -0.0570          2857          2677          3159          2979
BM_construction<ArborX::BVH<Cuda>>/1000000/0/manual_time_median                                  -0.0513         -0.0482          7785          7385          8285          7886
BM_construction<ArborX::BVH<Cuda>>/10000/1/manual_time_median                                    -0.1257         -0.1001          1125           984          1411          1270
BM_construction<ArborX::BVH<Cuda>>/100000/1/manual_time_median                                   -0.0612         -0.0553          2858          2683          3160          2986
BM_construction<ArborX::BVH<Cuda>>/1000000/1/manual_time_median                                  -0.0496         -0.0466          7776          7390          8276          7890
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/0/2/manual_time_median                           -0.0724         -0.0692          2029          1882          2118          1972
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/0/2/manual_time_median                         -0.0180         -0.0173         12422         12198         12781         12559
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/2/manual_time_median                       -0.0016         -0.0016        113910        113729        114614        114429
BM_knn_search<ArborX::BVH<Cuda>>/10000/10000/10/1/3/manual_time_median                           -0.0686         -0.0662          2223          2071          2312          2159
BM_knn_search<ArborX::BVH<Cuda>>/100000/100000/10/1/3/manual_time_median                         -0.0063         -0.0063         17582         17470         17945         17831
BM_knn_search<ArborX::BVH<Cuda>>/1000000/1000000/10/1/3/manual_time_median                       +0.0012         +0.0011        234339        234611        235051        235321
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/0/2/manual_time_median                      -0.1255         -0.1164          1083           947          1171          1034
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/0/2/manual_time_median                    -0.1141         -0.1092          7676          6801          8047          7168
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/0/2/manual_time_median                  -0.0163         -0.0161         56190         55272         56899         55980
BM_radius_search<ArborX::BVH<Cuda>>/10000/10000/10/0/1/3/manual_time_median                      -0.1353         -0.1248          1001           866          1088           952
BM_radius_search<ArborX::BVH<Cuda>>/100000/100000/10/0/1/3/manual_time_median                    -0.0433         -0.0391          3398          3251          3761          3614
BM_radius_search<ArborX::BVH<Cuda>>/1000000/1000000/10/0/1/3/manual_time_median                  -0.0411         -0.0390         10616         10180         11266         10827
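
To illustrate the kind of change being measured, here is a minimal sketch of a permutation gather templated on the index type. It is not ArborX code: `applyPermutation` and `permutationBytes` are hypothetical helpers written for this note. The description above does not say why the narrower type helps. One assumption worth checking is index storage, since on typical 64-bit platforms unsigned int is 4 bytes and size_t is 8, so the permutation array shrinks by half.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ArborX): gathers `values` through
// `permutation`, so that out[i] = values[permutation[i]].
template <typename Index, typename T>
std::vector<T> applyPermutation(std::vector<Index> const &permutation,
                                std::vector<T> const &values)
{
  assert(permutation.size() == values.size());
  std::vector<T> out(values.size());
  for (std::size_t i = 0; i < permutation.size(); ++i)
    out[i] = values[permutation[i]];
  return out;
}

// Hypothetical helper: bytes of index data stored for a permutation of
// n elements with the given index type.
template <typename Index>
constexpr std::size_t permutationBytes(std::size_t n)
{
  return n * sizeof(Index);
}
```

Calling `applyPermutation<unsigned int>(...)` and `applyPermutation<std::size_t>(...)` on the same data returns the same result. Only the width of the stored indices differs, which `permutationBytes` makes explicit.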
