Commit b7b5907b authored by Chuanqi Xu, committed by GitHub

[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014)

Close https://github.com/llvm/llvm-project/issues/56980.

This patch introduces a lightweight optimization attribute for
coroutines that are guaranteed to be destroyed only after they have
reached the final suspend point.

The rationale behind the patch is simple. See the example:

```C++
A foo() {
  dtor d;
  co_await something();
  dtor d1;
  co_await something();
  dtor d2;
  co_return 43;
}
```

In general, the generated .destroy function may look like this:

```C++
void foo.destroy(foo.Frame *frame) {
  switch(frame->suspend_index()) {
    case 1:
      frame->d.~dtor();
      break;
    case 2:
      frame->d.~dtor();
      frame->d1.~dtor();
      break;
    case 3:
      frame->d.~dtor();
      frame->d1.~dtor();
      frame->d2.~dtor();
      break;
    default: // coroutine completed or hasn't started
      break;
  }

  frame->promise.~promise_type();
  delete frame;
}
```

The switch is needed because the compiler must be prepared for every
valid state in which the coroutine may be destroyed.

However, from the user's perspective, we may know that coroutines of
certain types are only ever destroyed after they have reached the final
suspend point, and we need a way to tell the compiler about this. That
is what this patch provides. Once the compiler knows that the coroutine
can only be destroyed after it completes, it can optimize the above
example to:

```C++
void foo.destroy(foo.Frame *frame) {
  frame->promise.~promise_type();
  delete frame;
}
```
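
The difference between the two versions can be sketched with a small
hand-written model in plain C++. This is only an illustration under
stated assumptions, not what the compiler emits: `FrameModel`,
`kCompleted`, `destroy_general`, `destroy_when_complete` and
`variants_agree` are hypothetical names, the real coroutine frame is laid
out by the compiler, and the model counts simulated `~dtor()` calls
instead of running real destructors.

```cpp
#include <cassert>

// Hypothetical stand-in for foo's coroutine frame (assumption: the real
// frame layout is chosen by the compiler and is not visible to user code).
struct FrameModel {
  int suspend_index;      // 1..3: that many locals are alive; otherwise none
  int dtor_calls;         // number of simulated ~dtor() calls
  bool promise_destroyed; // set once the promise has been torn down
};

// Hypothetical index meaning "completed or not started": no locals alive.
const int kCompleted = 0;

// Models the general destroy function: it handles every suspend point.
void destroy_general(FrameModel &frame) {
  assert(!frame.promise_destroyed && "frame destroyed twice");
  switch (frame.suspend_index) {
  case 1: frame.dtor_calls += 1; break; // d
  case 2: frame.dtor_calls += 2; break; // d, d1
  case 3: frame.dtor_calls += 3; break; // d, d1, d2
  default: break;                       // nothing alive
  }
  frame.promise_destroyed = true;
}

// Models the reduced destroy function: it assumes no locals are alive.
void destroy_when_complete(FrameModel &frame) {
  assert(!frame.promise_destroyed && "frame destroyed twice");
  frame.promise_destroyed = true;
}

// Returns true when both variants leave the frame in the same state.
bool variants_agree(int suspend_index) {
  FrameModel general{suspend_index, 0, false};
  FrameModel reduced{suspend_index, 0, false};
  destroy_general(general);
  destroy_when_complete(reduced);
  return general.dtor_calls == reduced.dtor_calls &&
         general.promise_destroyed == reduced.promise_destroyed;
}
```

In this model the two variants agree only for `kCompleted`. For a frame
suspended at point 1, 2 or 3, the reduced variant skips the destructors
of the live locals, which is why the attribute is a promise the user
makes: every coroutine returning the marked type must reach its final
suspend point before it is destroyed.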

I spent a lot of time experimenting with this downstream, and the
numbers are really good. In a real-world coroutine-heavy workload, the
size of the build directory (including .o files) drops by 14%, and the
size of the final libraries (excluding the .o files) drops by 8% in
Debug mode and 1% in Release mode.
parent e3c120a5
+3 −0
@@ -296,6 +296,9 @@ Attribute Changes in Clang
  is ignored, changed from the former incorrect suggestion to move it past
  declaration specifiers. (`#58637 <https://github.com/llvm/llvm-project/issues/58637>`_)

- Clang now introduces the ``[[clang::coro_only_destroy_when_complete]]`` attribute
  to reduce the size of the destroy functions for coroutines which are known to
  be destroyed only after having reached the final suspend point.

Improvements to Clang's diagnostics
-----------------------------------
+12 −0
@@ -1082,6 +1082,18 @@ def CFConsumed : InheritableParamAttr {
  let Documentation = [RetainBehaviorDocs];
}


// coro_only_destroy_when_complete indicates the coroutines whose return type
// is marked by coro_only_destroy_when_complete can only be destroyed when the
// coroutine completes. Then the space for the destroy functions can be saved.
def CoroOnlyDestroyWhenComplete : InheritableAttr {
  let Spellings = [Clang<"coro_only_destroy_when_complete">];
  let Subjects = SubjectList<[CXXRecord]>;
  let LangOpts = [CPlusPlus];
  let Documentation = [CoroOnlyDestroyWhenCompleteDocs];
  let SimpleHandler = 1;
}

// OSObject-based attributes.
def OSConsumed : InheritableParamAttr {
  let Spellings = [Clang<"os_consumed">];
+66 −0
@@ -7416,3 +7416,69 @@ that ``p->array`` must have at least ``p->count`` number of elements available:

  }];
}

def CoroOnlyDestroyWhenCompleteDocs : Documentation {
  let Category = DocCatDecl;
  let Content = [{
The `coro_only_destroy_when_complete` attribute should be marked on a C++ class. The coroutines
whose return type is marked with the attribute are assumed to be destroyed only after the coroutine has
reached the final suspend point.

This is helpful for the optimizers to reduce the size of the destroy function for the coroutines.

For example,

.. code-block:: c++

  A foo() {
    dtor d;
    co_await something();
    dtor d1;
    co_await something();
    dtor d2;
    co_return 43;
  }

The compiler may generate the following pseudocode:

.. code-block:: c++

  void foo.destroy(foo.Frame *frame) {
    switch(frame->suspend_index()) {
      case 1:
        frame->d.~dtor();
        break;
      case 2:
        frame->d.~dtor();
        frame->d1.~dtor();
        break;
      case 3:
        frame->d.~dtor();
        frame->d1.~dtor();
        frame->d2.~dtor();
        break;
      default: // coroutine completed or hasn't started
        break;
    }

    frame->promise.~promise_type();
    delete frame;
  }

The `foo.destroy()` function's purpose is to release all of the resources
initialized for the coroutine when it is destroyed in a suspended state.
However, if the coroutine is only ever destroyed at the final suspend state,
the rest of the conditions are superfluous.

The user can use the `coro_only_destroy_when_complete` attribute to suppress
generation of the other destruction cases, optimizing the above `foo.destroy` to:

.. code-block:: c++

  void foo.destroy(foo.Frame *frame) {
    frame->promise.~promise_type();
    delete frame;
  }

  }];
}
+4 −0
@@ -777,6 +777,10 @@ void CodeGenFunction::EmitCoroutineBody(const CoroutineBodyStmt &S) {

  // LLVM require the frontend to mark the coroutine.
  CurFn->setPresplitCoroutine();

  if (CXXRecordDecl *RD = FnRetTy->getAsCXXRecordDecl();
      RD && RD->hasAttr<CoroOnlyDestroyWhenCompleteAttr>())
    CurFn->setCoroDestroyOnlyWhenComplete();
}

// Emit coroutine intrinsic and patch up arguments of the token type.
+59 −0
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c++20 \
// RUN:     -disable-llvm-passes -emit-llvm %s -o - | FileCheck %s

// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c++20 \
// RUN:     -O3 -emit-llvm %s -o - | FileCheck %s --check-prefix=CHECK-O

#include "Inputs/coroutine.h"

using namespace std;

struct A;
struct A_promise_type {
  A get_return_object();
  suspend_always initial_suspend();
  suspend_always final_suspend() noexcept;
  void return_value(int);
  void unhandled_exception();

  std::coroutine_handle<> handle;
};

struct Awaitable{
  bool await_ready();
  int await_resume();
  template <typename F>
  void await_suspend(F);
};
Awaitable something();

struct dtor {
    dtor();
    ~dtor();
};

struct [[clang::coro_only_destroy_when_complete]] A {
  using promise_type = A_promise_type;
  A();
  A(std::coroutine_handle<>);
  ~A();

  std::coroutine_handle<promise_type> handle;
};

A foo() {
    dtor d;
    co_await something();
    dtor d1;
    co_await something();
    dtor d2;
    co_return 43;
}

// CHECK: define{{.*}}@_Z3foov({{.*}}) #[[ATTR_NUM:[0-9]+]]
// CHECK: attributes #[[ATTR_NUM]] = {{.*}}coro_only_destroy_when_complete

// CHECK-O: define{{.*}}@_Z3foov.destroy
// CHECK-O: {{^.*}}:
// CHECK-O-NOT: br
// CHECK-O: ret void