Commit 119d8a51 authored by Hans Wennborg

Merging r351580:

------------------------------------------------------------------------
r351580 | kli | 2019-01-18 20:57:37 +0100 (Fri, 18 Jan 2019) | 4 lines

[OPENMP][DOCS] Release notes/OpenMP support updates, NFC.

Differential Revision: https://reviews.llvm.org/D56733

------------------------------------------------------------------------

llvm-svn: 351839
parent e264daec
+33 −43
@@ -17,61 +17,51 @@
OpenMP Support
==================

Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
PPC64[LE] and has `basic support for Cuda devices`_.

Standalone directives
=====================

* #pragma omp [for] simd: :good:`Complete`.

* #pragma omp declare simd: :partial:`Partial`.  We support parsing/semantic
  analysis + generation of special attributes for X86 target, but still
  missing the LLVM pass for vectorization.

* #pragma omp taskloop [simd]: :good:`Complete`.

* #pragma omp target [enter|exit] data: :good:`Complete`.

* #pragma omp target update: :good:`Complete`.

* #pragma omp target: :good:`Complete`.
Clang supports the following OpenMP 5.0 features

* #pragma omp declare target: :good:`Complete`.
* The `reduction`-based clauses in the `task` and `target`-based directives.

* #pragma omp teams: :good:`Complete`.
* Support relational-op != (not-equal) as one of the canonical forms of random
  access iterator.

* #pragma omp distribute [simd]: :good:`Complete`.
* Support for mapping of the lambdas in target regions.

* #pragma omp distribute parallel for [simd]: :good:`Complete`.
* Parsing/sema analysis for the requires directive.

Combined directives
===================
* Nested declare target directives.

* #pragma omp parallel for simd: :good:`Complete`.
* Make the `this` pointer implicitly mapped as `map(this[:1])`.

* #pragma omp target parallel: :good:`Complete`.
* The `close` *map-type-modifier*.

* #pragma omp target parallel for [simd]: :good:`Complete`.

* #pragma omp target simd: :good:`Complete`.

* #pragma omp target teams: :good:`Complete`.

* #pragma omp teams distribute [simd]: :good:`Complete`.

* #pragma omp target teams distribute [simd]: :good:`Complete`.

* #pragma omp teams distribute parallel for [simd]: :good:`Complete`.

* #pragma omp target teams distribute parallel for [simd]: :good:`Complete`.
Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
PPC64[LE] and has `basic support for Cuda devices`_.

Clang does not support any constructs/updates from OpenMP 5.0 except
for `reduction`-based clauses in the `task` and `target`-based directives.
* #pragma omp declare simd: :partial:`Partial`.  We support parsing/semantic
  analysis + generation of special attributes for X86 target, but still
  missing the LLVM pass for vectorization.

In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and macOS.

General improvements
--------------------
- New collapse clause scheme to avoid expensive remainder operations.
  Compute loop index variables after collapsing a loop nest via the
  collapse clause by replacing the expensive remainder operation with
  multiplications and additions.

- The default schedules for the `distribute` and `for` constructs in a
  parallel region and in SPMD mode have changed to ensure coalesced
  accesses. For the `distribute` construct, a static schedule is used
  with a chunk size equal to the number of threads per team (default
  value of threads or as specified by the `thread_limit` clause if
  present). For the `for` construct, the schedule is static with chunk
  size of one.
  
- Simplified SPMD code generation for `distribute parallel for` when
  the new default schedules are applicable.
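The schedule description above can be made concrete with a small model of which team and thread own a given iteration. This is only a sketch: `Owner`, `static_owner`, and `owner_of` are hypothetical names, the round-robin rule is the usual meaning of a static schedule with a chunk size, and the model assumes every team really runs with `threads_per_team` threads. It is not the code Clang generates.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; this is not Clang's generated code.
struct Owner {
    std::int64_t team;    // team that executes the iteration
    std::int64_t thread;  // thread within that team
};

// Round-robin owner of iteration `iter` under a static schedule with the
// given chunk size, spread over `workers` teams or threads.
std::int64_t static_owner(std::int64_t iter, std::int64_t chunk,
                          std::int64_t workers) {
    return (iter / chunk) % workers;
}

// distribute: static schedule, chunk size = threads per team.
// for:        static schedule, chunk size = 1, applied to the offset of the
//             iteration inside the chunk that the team received.
Owner owner_of(std::int64_t iter, std::int64_t num_teams,
               std::int64_t threads_per_team) {
    std::int64_t team = static_owner(iter, threads_per_team, num_teams);
    std::int64_t offset_in_chunk = iter % threads_per_team;
    std::int64_t thread = static_owner(offset_in_chunk, 1, threads_per_team);
    return {team, thread};
}
```

With 2 teams of 4 threads, iterations 4 to 7 all land in team 1 on threads 0 to 3, so neighbouring threads touch neighbouring iterations, which is the access pattern the text calls coalesced.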

.. _basic support for Cuda devices:

Cuda devices support
+17 −5
Original line number Diff line number Diff line
@@ -233,12 +233,15 @@ ABI Changes in Clang
OpenMP Support in Clang
----------------------------------

- OpenMP 5.0 features

  - Support relational-op != (not-equal) as one of the canonical forms of random
    access iterator.

  - Added support for mapping of the lambdas in target regions.

- Added parsing/sema analysis for OpenMP 5.0 requires directive.
  - Added parsing/sema analysis for the requires directive.
  - Support nested declare target directives.
  - Make the `this` pointer implicitly mapped as `map(this[:1])`.
  - Added the `close` *map-type-modifier*.

- Various bugfixes and improvements.
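As an illustration of the `!=` canonical loop form listed among the OpenMP 5.0 features, the sketch below runs a worksharing loop over a random access iterator with a not-equal test. The function name `sum_with_not_equal` is hypothetical, and the example assumes a compiler that accepts this form when OpenMP is enabled; without OpenMP the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: sums a vector using a loop whose test is `it != end`.
// OpenMP 5.0 accepts `!=` in the canonical loop form when the increment
// steps by exactly one, as `++it` does here.
long sum_with_not_equal(const std::vector<int>& values) {
    long total = 0;
    #pragma omp parallel for reduction(+ : total)
    for (auto it = values.begin(); it != values.end(); ++it) {
        total += *it;
    }
    return total;
}
```

Calling `sum_with_not_equal({1, 2, 3, 4})` adds the four elements regardless of how many threads execute the loop, because `total` is a reduction variable.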

@@ -250,6 +253,15 @@ New features supported for Cuda devices:

- Fixed support for lastprivate/reduction variables in SPMD constructs.

- New collapse clause scheme to avoid expensive remainder operations.

- New default schedule for distribute and parallel constructs.

- Simplified code generation for distribute and parallel in SPMD mode.

- Flag (``-fopenmp-optimistic-collapse``) that lets the user limit the collapsed
  loop counter width when it is safe to do so.

- General performance improvement.
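The collapse clause item in the list above is about recovering the original loop indices from the single collapsed counter without a remainder operation. A minimal sketch for a two-level nest follows; `Index2D` and both function names are hypothetical, and the multiply-and-subtract form shown is one way to avoid the remainder, not necessarily the exact sequence Clang emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types and names for illustration only.
struct Index2D {
    std::int64_t i;  // outer loop index
    std::int64_t j;  // inner loop index
};

// Baseline: recover (i, j) from the collapsed counter `iv` using a division
// and a remainder, where `inner` is the inner loop's trip count.
Index2D recover_with_remainder(std::int64_t iv, std::int64_t inner) {
    return {iv / inner, iv % inner};
}

// Alternative: keep the division, then derive j with a multiplication and a
// subtraction instead of a second remainder operation.
Index2D recover_without_remainder(std::int64_t iv, std::int64_t inner) {
    std::int64_t i = iv / inner;
    std::int64_t j = iv - i * inner;
    return {i, j};
}
```

Both functions return the same pair for any non-negative `iv` and positive `inner`; the second trades the remainder for cheaper arithmetic.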

CUDA Support in Clang