Commit 748bb5a0 authored by Teresa Johnson's avatar Teresa Johnson
Browse files

[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP

Summary:
Currently type test assume sequences inserted for devirtualization are
removed during WPD. This patch delays their removal until later in the
optimization pipeline. This is an enabler for upcoming enhancements to
indirect call promotion, for example streamlined promotion guard
sequences that compare against vtable address instead of the target
function, when there are small number of possible vtables (either
determined via WPD or by in-progress type profiling). We need the type
tests to correlate the callsites with the address point offset needed in
the compare sequence, and optionally to associated type summary info
computed during WPD.

This depends on work in D71913 to enable invocation of LowerTypeTests to
drop type test assume sequences, which will now be invoked following ICP
in the ThinLTO post-LTO link pipelines, and also after the existing
export phase LowerTypeTests invocation in regular LTO (which is already
after ICP). We cannot simply move the existing import phase
LowerTypeTests pass later in the ThinLTO post link pipelines, as the
comment in PassBuilder.cpp notes (it must run early because when
performing CFI other passes may disturb the sequences it looks for).

This necessitated adding a new type test resolution "Unknown" that we
can use on the type test assume sequences previously removed by WPD,
that we now want LTT to ignore.

Depends on D71913.

Reviewers: pcc, evgeny777

Subscribers: mehdi_amini, Prazek, hiraditya, steven_wu, dexonsmith, arphaman, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73242
parent b198f16e
Loading
Loading
Loading
Loading
+3 −2
Original line number Diff line number Diff line
@@ -833,7 +833,8 @@ struct TypeTestResolution {
    Single,    ///< Single element (last example in "Short Inline Bit Vectors")
    AllOnes,   ///< All-ones bit vector ("Eliminating Bit Vector Checks for
               ///  All-Ones Bit Vectors")
  } TheKind = Unsat;
    Unknown,   ///< Unknown (analysis not performed, don't lower)
  } TheKind = Unknown;

  /// Range of size-1 expressed as a bit width. For example, if the size is in
  /// range [1,256], this number will be 8. This helps generate the most compact
@@ -1026,7 +1027,7 @@ public:
  // in the way some record are interpreted, like flags for instance.
  // Note that incrementing this may require changes in both BitcodeReader.cpp
  // and BitcodeWriter.cpp.
  static constexpr uint64_t BitcodeSummaryVersion = 8;
  static constexpr uint64_t BitcodeSummaryVersion = 9;

  // Regular LTO module name for ASM writer
  static constexpr const char *getRegularLTOModuleName() {
+1 −0
Original line number Diff line number Diff line
@@ -17,6 +17,7 @@ namespace yaml {

template <> struct ScalarEnumerationTraits<TypeTestResolution::Kind> {
  static void enumeration(IO &io, TypeTestResolution::Kind &value) {
    io.enumCase(value, "Unknown", TypeTestResolution::Unknown);
    io.enumCase(value, "Unsat", TypeTestResolution::Unsat);
    io.enumCase(value, "ByteArray", TypeTestResolution::ByteArray);
    io.enumCase(value, "Inline", TypeTestResolution::Inline);
+3 −0
Original line number Diff line number Diff line
@@ -7660,6 +7660,9 @@ bool LLParser::ParseTypeTestResolution(TypeTestResolution &TTRes) {
    return true;

  switch (Lex.getKind()) {
  case lltok::kw_unknown:
    TTRes.TheKind = TypeTestResolution::Unknown;
    break;
  case lltok::kw_unsat:
    TTRes.TheKind = TypeTestResolution::Unsat;
    break;
+2 −0
Original line number Diff line number Diff line
@@ -2772,6 +2772,8 @@ static const char *getWholeProgDevirtResByArgKindName(

static const char *getTTResKindName(TypeTestResolution::Kind K) {
  switch (K) {
  case TypeTestResolution::Unknown:
    return "unknown";
  case TypeTestResolution::Unsat:
    return "unsat";
  case TypeTestResolution::ByteArray:
+16 −0
Original line number Diff line number Diff line
@@ -748,6 +748,12 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
                                           true /* SamplePGO */));
  }

  // Lower type metadata and the type.test intrinsic in the ThinLTO
  // post link pipeline after ICP. This is to enable usage of the type
  // tests in ICP sequences.
  if (Phase == ThinLTOPhase::PostLink)
    MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true));

  // Interprocedural constant propagation now that basic cleanup has occurred
  // and prior to optimizing globals.
  // FIXME: This position in the pipeline hasn't been carefully considered in
@@ -1170,6 +1176,9 @@ PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level, bool DebugLogging,
    // metadata and intrinsics.
    MPM.addPass(WholeProgramDevirtPass(ExportSummary, nullptr));
    MPM.addPass(LowerTypeTestsPass(ExportSummary, nullptr));
    // Run a second time to clean up any type tests left behind by WPD for use
    // in ICP.
    MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true));
    return MPM;
  }

@@ -1236,6 +1245,10 @@ PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level, bool DebugLogging,
    // The LowerTypeTestsPass needs to run to lower type metadata and the
    // type.test intrinsics. The pass does nothing if CFI is disabled.
    MPM.addPass(LowerTypeTestsPass(ExportSummary, nullptr));
    // Run a second time to clean up any type tests left behind by WPD for use
    // in ICP (which is performed earlier than this in the regular LTO
    // pipeline).
    MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true));
    return MPM;
  }

@@ -1363,6 +1376,9 @@ PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level, bool DebugLogging,
  // to be run at link time if CFI is enabled. This pass does nothing if
  // CFI is disabled.
  MPM.addPass(LowerTypeTestsPass(ExportSummary, nullptr));
  // Run a second time to clean up any type tests left behind by WPD for use
  // in ICP (which is performed earlier than this in the regular LTO pipeline).
  MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true));

  // Enable splitting late in the FullLTO post-link pipeline. This is done in
  // the same stage in the old pass manager (\ref addLateLTOOptimizationPasses).
Loading