Commit c79c0c7b authored by Joel E. Denny

[Clacc][OpenACC] num_workers -> thread_limit

That is, instead of translating `num_workers` on a `parallel`
construct to `num_threads` at every lexically enclosed worker loop,
translate it to `thread_limit` on the `target teams` construct to
which the `parallel` construct is translated.  This change offers
multiple improvements:

* `num_workers` now affects orphaned loops, as expected.  Thus, it
  addresses some fixmes in
  `clang/test/OpenACC/directives/Tests/loop-tile.c`.
* It simplifies the generated OpenMP source.  In particular, when the
  `num_workers` argument is a non-constant expression, a local
  variable no longer has to be inserted to capture its current value.
* It eliminates bugs from the old translation's implementation:
    * The aforementioned local variable was inserted unnecessarily
      when the only enclosed apparent worker parallelism was from a
      worker function call or from a loop's worker clause that was
      discarded in the translation due to a tile clause.
    * The aforementioned local variable was mistakenly not inserted if
      the only enclosed worker parallelism was from an implicit worker
      clause.
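
As a rough illustration of the difference, the sketch below hand-writes
OpenMP that approximates the old and new translations of one `parallel`
construct containing a single worker loop.  It is not actual Clacc
output: the function names, the `nw_captured` variable and its
placement, the `map` clauses, and the choice of `parallel for` for the
worker loop are all assumptions made for the example.

```c
#include <assert.h>

/* OpenACC source being illustrated (shown as a comment only):
 *
 *   #pragma acc parallel num_gangs(1) num_workers(nw) copy(a[0:n])
 *   #pragma acc loop worker
 *   for (int i = 0; i < n; ++i)
 *     a[i] = 2 * i;
 */

/* Sketch of the old approach: num_workers becomes num_threads on each
 * lexically enclosed worker loop.  Because nw is not a constant, a
 * local variable (hypothetical name) captures its value once. */
static void fill_old(int *a, int n, int nw) {
  const int nw_captured = nw;
  #pragma omp target teams num_teams(1) map(tofrom: a[0:n])
  {
    #pragma omp parallel for num_threads(nw_captured)
    for (int i = 0; i < n; ++i)
      a[i] = 2 * i;
  }
}

/* Sketch of the new approach: num_workers becomes thread_limit on the
 * target teams construct.  No capture variable is needed, and the
 * limit is not tied to lexically enclosed loops. */
static void fill_new(int *a, int n, int nw) {
  #pragma omp target teams num_teams(1) thread_limit(nw) map(tofrom: a[0:n])
  {
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
      a[i] = 2 * i;
  }
}

/* Helper for the example: checks that every element holds 2 * i. */
static void check_fill(const int *a, int n) {
  for (int i = 0; i < n; ++i)
    assert(a[i] == 2 * i);
}
```

Both functions compute the same result; they differ only in where the
worker count is expressed.  If the compiler is run without OpenMP
support, the pragmas are ignored and the loops run serially, so the
sketch shows only the shape of the generated code, not the number of
threads used at run time.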

This patch adds `openmp/libacc2omp/test/directives/num-workers.c` to
test whether `num_workers` actually produces the expected number of
workers.  As noted in a fixme comment there, it does not in some cases
when compiling at `-O0`.  Based on our experiments, the old
translation to `num_threads` was no better for any use case but, as
described above, was worse for some use cases.
parent 457ae889