[MLIR][OpenMP][OMPIRBuilder] Improve shared memory checks (#161864)
This patch refines the checks that decide whether to use device shared memory or regular stack allocations. In particular, it adds support for parallel regions residing in standalone target device functions.

The changes are:
- Shared memory is introduced for `omp.target` implicit allocations, such as those related to privatization and mapping, as long as they are shared across threads in a nested parallel region.
- Standalone target device functions are treated as part of a Generic kernel, because their presence in the module after filtering means they must be reachable from a target region.
- Allocations whose only shared uses inside an `omp.parallel` region are as part of a `private` clause are no longer moved to device shared memory.
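
As a rough illustration of the rule described above, here is a simplified, self-contained C++ model. It is not the code from this patch and does not use any MLIR or OMPIRBuilder API: `Context`, `UseKind`, `Allocation`, `isTreatedAsGenericDevice`, and `needsDeviceSharedMemory` are all hypothetical names, and the assumption that host code never uses device shared memory is made only for this sketch.

```cpp
#include <vector>

// Hypothetical model only; none of these names exist in LLVM or MLIR.

// Where the allocation lives.
enum class Context { HostFunction, TargetRegion, StandaloneDeviceFunction };

// How an allocation is used.
enum class UseKind {
  OutsideParallel,         // used only by the allocating thread
  SharedInParallel,        // accessed by threads of a nested parallel region
  PrivateClauseInParallel  // appears only as a `private` clause operand
};

struct Allocation {
  std::vector<UseKind> uses;
};

// Standalone device functions are handled like code inside a target region,
// mirroring the "treated as part of a Generic kernel" rule.
bool isTreatedAsGenericDevice(Context ctx) {
  return ctx == Context::TargetRegion ||
         ctx == Context::StandaloneDeviceFunction;
}

// Returns true when the allocation should go to device shared memory,
// false when a regular stack allocation is enough.
bool needsDeviceSharedMemory(Context ctx, const Allocation &alloc) {
  if (!isTreatedAsGenericDevice(ctx))
    return false; // assumption of this sketch: host code stays on the stack
  for (UseKind use : alloc.uses) {
    // A `private` clause use does not count as sharing across threads.
    if (use == UseKind::SharedInParallel)
      return true;
  }
  return false;
}
```

In this model, an allocation in a standalone device function that is read by threads of a nested parallel region is placed in shared memory, while one whose only parallel-region use is a `private` clause stays on the stack.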