Simple code to test the OpenMP detach clause. It uses cudaMemcpy2DAsync for asynchronous copies and cudaStreamAddCallback to register a stream callback that fulfills the event of the detached OpenMP task.
setUpModules.sh : loads appropriate modules
build.sh : sources setUpModules.sh and builds the executable
batch_cudaStreamAddCallback.sh : submission script for IBM systems at OLCF