Commit 27ebc844 authored by Alexandre Ganea, committed by GitHub

[clang][CodeView] Prevent the input name from appearing in LF_BUILDINFO (#194140)

The implicit contract of an `LF_BUILDINFO` record (represented in LLVM by
[`BuildInfoRecord`](https://github.com/llvm/llvm-project/blob/6f0b55ec55f3e5e1ccc0d6b0d04a307479218768/llvm/include/llvm/DebugInfo/CodeView/TypeRecord.h#L667))
is that its `CommandLine` field should not contain the input source file;
a separate `SourceFile` field is reserved for that.

When the command-line flattening was moved from `llvm/` to `clang/` in
#106369, the comparison value used to identify and strip the source
positional was switched from `MainSourceFile->getFilename()` (the full
input path resolved by clang) to `CodeGenOpts.MainFileName` (just the
basename, set via `-main-file-name`). As a result, when the driver is
invoked with an absolute source path, the cc1 positional is that absolute
path and no longer matches `MainFileName`, so the source filename leaks
into `CommandLine` as a trailing positional cc1 argument.
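
The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the actual clang code: it uses standard-library types instead of `ArrayRef`/`StringRef`, and the function names and paths (`stripByBasenameOnly`, `/src/hello.cpp`) are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the regressed filter: each cc1 argument is
// compared only against the -main-file-name basename.
std::vector<std::string>
stripByBasenameOnly(const std::vector<std::string> &Args,
                    const std::string &MainFileName) {
  std::vector<std::string> Kept;
  for (const std::string &Arg : Args)
    if (Arg != MainFileName)
      Kept.push_back(Arg);
  return Kept;
}

// Illustrative check with made-up arguments and paths.
void checkBasenameOnlyFilter() {
  // Relative input: the positional equals the basename and is stripped.
  assert(stripByBasenameOnly({"-cc1", "-O2", "hello.cpp"}, "hello.cpp")
             .size() == 2);
  // Absolute input: the positional no longer matches, so it is kept.
  assert(stripByBasenameOnly({"-cc1", "-O2", "/src/hello.cpp"}, "hello.cpp")
             .size() == 3);
}
```

In the second case the absolute path survives the filter, which corresponds to the source file leaking into `CommandLine`.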

This is a regression in Clang 20. It breaks downstream tooling such as
Live++, whose unity-splitting feature relies on the embedded command
line being the cc1 invocation minus the source. Reported in #193900.

This PR restores the previous behavior by passing the resolved frontend
input path(s) to `flattenClangCommandLine` and including them in the
equality check that strips the source positional. The basename match
against `MainFileName` is kept for the relative-input case. A regression
test (and a symmetric relative-path test) is added to
`clang/test/DebugInfo/Generic/codeview-buildinfo.c`.
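
As a rough illustration of the restored check, the sketch below strips an argument only when it equals the basename or one of the resolved input paths exactly. It is a simplified model under assumptions: it uses `std::vector<std::string>` and `std::find` in place of the LLVM types and `llvm::is_contained`, and the helper names and paths are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// True if Arg is the source positional: either the -main-file-name basename
// or an exact match for one of the resolved frontend input paths.
bool isSourcePositional(const std::string &Arg,
                        const std::string &MainFileName,
                        const std::vector<std::string> &InputFiles) {
  return Arg == MainFileName ||
         std::find(InputFiles.begin(), InputFiles.end(), Arg) !=
             InputFiles.end();
}

// Hypothetical simplified filter that drops only the source positional.
std::vector<std::string>
stripSourcePositional(const std::vector<std::string> &Args,
                      const std::string &MainFileName,
                      const std::vector<std::string> &InputFiles) {
  std::vector<std::string> Kept;
  for (const std::string &Arg : Args)
    if (!isSourcePositional(Arg, MainFileName, InputFiles))
      Kept.push_back(Arg);
  return Kept;
}

// Illustrative check with made-up arguments and paths.
void checkExactMatchFilter() {
  const std::vector<std::string> Kept = stripSourcePositional(
      {"-cc1", "-include", "/inc/hello.cpp", "/src/hello.cpp"}, "hello.cpp",
      {"/src/hello.cpp"});
  // The absolute source positional is stripped, but the -include value that
  // merely shares the basename is kept.
  assert(Kept.size() == 3);
  assert(Kept.back() == "/inc/hello.cpp");
}
```

Matching on exact equality rather than on the trailing path component is what keeps the `-include` value intact in this example.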

Should fix #193900.
parent bd7cd403
+19 −3
@@ -323,7 +323,8 @@ static bool actionRequiresCodeGen(BackendAction Action) {
 }

 static std::string flattenClangCommandLine(ArrayRef<std::string> Args,
-                                           StringRef MainFilename) {
+                                           StringRef MainFilename,
+                                           ArrayRef<StringRef> InputFiles) {
   if (Args.empty())
     return std::string{};

@@ -342,7 +343,15 @@ static std::string flattenClangCommandLine(ArrayRef<std::string> Args,
       i++; // Skip this argument and next one.
       continue;
     }
-    if (Arg.starts_with("-object-file-name") || Arg == MainFilename)
+    if (Arg.starts_with("-object-file-name"))
       continue;
+    // Strip the source positional, matching either MainFilename (the
+    // -main-file-name basename) or one of the resolved frontend input paths
+    // (which is what the cc1 positional looks like for an absolute driver
+    // input). Avoid a generic basename match: it would also strip values of
+    // args like `-include <path>` whose trailing component happens to equal
+    // the source basename.
+    if (Arg == MainFilename || llvm::is_contained(InputFiles, Arg))
+      continue;
     // Skip fmessage-length for reproducibility.
     if (Arg.starts_with("-fmessage-length"))
@@ -519,8 +528,15 @@ static bool initTargetOptions(const CompilerInstance &CI,
       Options.MCOptions.IASSearchPaths.push_back(
           Entry.IgnoreSysRoot ? Entry.Path : HSOpts.Sysroot + Entry.Path);
   Options.MCOptions.Argv0 = CodeGenOpts.Argv0 ? CodeGenOpts.Argv0 : "";
+  // Pass the resolved frontend inputs so flattenClangCommandLine can strip
+  // the cc1 source positional even when the driver received an absolute path
+  // (which won't match CodeGenOpts.MainFileName, that's just the basename).
+  SmallVector<StringRef, 1> InputFiles;
+  for (const auto &Input : CI.getFrontendOpts().Inputs)
+    if (Input.isFile())
+      InputFiles.push_back(Input.getFile());
   Options.MCOptions.CommandlineArgs = flattenClangCommandLine(
-      CodeGenOpts.CommandLineArgs, CodeGenOpts.MainFileName);
+      CodeGenOpts.CommandLineArgs, CodeGenOpts.MainFileName, InputFiles);
   Options.MCOptions.AsSecureLogFile = CodeGenOpts.AsSecureLogFile;
   Options.MCOptions.PPCUseFullRegisterNames =
       CodeGenOpts.PPCUseFullRegisterNames;
+33 −0
@@ -12,6 +12,18 @@
 // RUN: %clang_cl -gcodeview-command-line --target=i686-windows-msvc -Xclang -fmessage-length=100 /c /Z7 /Fo%t.obj -- %s
 // RUN: llvm-pdbutil dump --types %t.obj | FileCheck %s --check-prefix MESSAGELEN

+// The source filename must be stripped from the embedded cc1 command line
+// whether it's passed to the driver as an absolute or a relative path.
+// See https://github.com/llvm/llvm-project/issues/193900.
+
+// RUN: %clang_cl --target=i686-windows-msvc /c /Z7 /Fo%t.obj -- %s
+// RUN: llvm-pdbutil dump --types %t.obj | FileCheck %s --check-prefix ABSPATH
+
+// RUN: rm -rf %t.relpath && mkdir %t.relpath
+// RUN: cp %s %t.relpath/hello.cpp
+// RUN: cd %t.relpath && %clang_cl --target=i686-windows-msvc /c /Z7 /Fo:hello.obj -- hello.cpp
+// RUN: llvm-pdbutil dump --types %t.relpath/hello.obj | FileCheck %s --check-prefix RELPATH
+
 int main(void) { return 42; }

 // CHECK:                       Types (.debug$T)
@@ -45,3 +57,24 @@ int main(void) { return 42; }
 // MESSAGELEN: ============================================================
 // MESSAGELEN: 0x{{.+}} | LF_BUILDINFO [size = {{.+}}]
 // MESSAGELEN-NOT: -fmessage-length
+
+// The cmdline is the 5th argument of LF_BUILDINFO. The source filename must
+// not appear inside its value (the SourceFile field, the 3rd argument, is
+// reserved for that).
+// ABSPATH:       0x{{.+}} | LF_BUILDINFO [size = {{.+}}]
+// ABSPATH-NEXT:           0x{{.*}}: `{{.*}}`
+// ABSPATH-NEXT:           0x{{.*}}: `{{.*}}`
+// ABSPATH-NEXT:           0x{{.*}}: `{{.+[\\/]codeview-buildinfo\.c}}`
+// ABSPATH-NEXT:           0x{{.*}}: ``
+// ABSPATH-NEXT:           0x{{.*}}: `
+// ABSPATH-NOT:   {{[^"]*[\\/]codeview-buildinfo\.c}}
+// ABSPATH-SAME:  `
+
+// RELPATH:       0x{{.+}} | LF_BUILDINFO [size = {{.+}}]
+// RELPATH-NEXT:           0x{{.*}}: `{{.*}}`
+// RELPATH-NEXT:           0x{{.*}}: `{{.*}}`
+// RELPATH-NEXT:           0x{{.*}}: `{{.*hello\.cpp}}`
+// RELPATH-NEXT:           0x{{.*}}: ``
+// RELPATH-NEXT:           0x{{.*}}: `
+// RELPATH-NOT:   {{hello\.cpp}}
+// RELPATH-SAME:  `