Unverified Commit 6795bfce authored by Job Noorman's avatar Job Noorman Committed by GitHub
Browse files

[BOLT][RISCV] Handle CIE's produced by GNU as (#69578)

On RISC-V, GNU as produces the following initial instruction in CIE's:

```
DW_CFA_def_cfa_register: r2
```

While I believe it is technically illegal to use this instruction
without first using a `DW_CFA_def_cfa` (since the offset is undefined),
both `readelf` and `llvm-dwarfdump` accept this and implicitly set the
offset to 0.

In BOLT, however, this triggers an assert (in `CFISnapshot::advanceTo`)
as it (correctly) believes the offset is not set. This patch fixes this
by setting the offset to 0 whenever executing `DW_CFA_def_cfa_register`
while the offset is undefined.

Note that this is probably the simplest workaround but it has a
downside: while emitting CFI start, we check if the initial instructions
are contained within `MCAsmInfo::getInitialFrameState` and omit them if
they are. This will not be true for GNU CIE's (since they differ from
LLVM's) which causes an unnecessary `DW_CFA_def_cfa_register` to be
emitted.

While technically correct, it would probably be better to replace the
GNU CIE with the one used by LLVM to avoid this situation. This would
solve the problem this patch solves while also preventing unnecessary
CFI instructions. However, this is a bit trickier to implement correctly
so I propose to keep this for a later time.

Note on testing: the test creates a simple function with three basic
blocks and forces the CFI state of the last one to be different from the
others using an arbitrary CFI instruction. Then,
`--reorder-blocks=reverse` is used to force `CFISnapshot::advanceTo` to
be called. This causes an assert on the current main branch.
parent 3c07a216
Loading
Loading
Loading
Loading
+7 −0
Original line number Diff line number Diff line
@@ -2462,6 +2462,13 @@ private:
    case MCCFIInstruction::OpDefCfaRegister:
      CFAReg = Instr.getRegister();
      CFARule = UNKNOWN;

      // This shouldn't happen according to the spec but GNU binutils on RISC-V
      // emits a DW_CFA_def_cfa_register in CIE's which leaves the offset
      // unspecified. Both readelf and llvm-dwarfdump interpret the offset as 0
      // in this case so let's do the same.
      if (CFAOffset == UNKNOWN)
        CFAOffset = 0;
      break;
    case MCCFIInstruction::OpDefCfaOffset:
      CFAOffset = Instr.getOffset();
+109 −0
Original line number Diff line number Diff line
## Compiled and stripped-down version of:
## (riscv64-linux-gnu-gcc -nostdlib -static -Wl,-q cie-gnu.s)
#     .text
#     .globl _start
#     .type _start, @function
# _start:
#     .cfi_startproc
#     beq a0, a1, 1f
#     ret
# 1:
#     .cfi_undefined t0 # Arbitrary cfi command to force a new state
#     ret
#     .cfi_endproc
#     .size _start, .-_start

--- !ELF
FileHeader:
  Class:           ELFCLASS64
  Data:            ELFDATA2LSB
  Type:            ET_EXEC
  Machine:         EM_RISCV
  Flags:           [ EF_RISCV_RVC, EF_RISCV_FLOAT_ABI_DOUBLE ]
  Entry:           0x10144
ProgramHeaders:
  - Type:            PT_LOAD
    Flags:           [ PF_X, PF_R ]
    FirstSec:        .text
    LastSec:         .eh_frame
    VAddr:           0x10000
    Align:           0x1000
    Offset:          0x0
Sections:
  - Name:            .text
    Type:            SHT_PROGBITS
    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
    Address:         0x10144
    AddressAlign:    0x2
    Offset:          0x144
    Content:         6303B50082808280
  - Name:            .eh_frame
    Type:            SHT_PROGBITS
    Flags:           [ SHF_ALLOC ]
    Address:         0x10150
    AddressAlign:    0x8
    Content:         1000000000000000037A5200017C01011B0D02001000000018000000D8FFFFFF0800000000460705
  - Name:            .rela.text
    Type:            SHT_RELA
    Flags:           [ SHF_INFO_LINK ]
    Link:            .symtab
    AddressAlign:    0x8
    Info:            .text
    Relocations:
      - Offset:          0x10144
        Symbol:          ".L1\x021"
        Type:            R_RISCV_BRANCH
  - Name:            .rela.eh_frame
    Type:            SHT_RELA
    Flags:           [ SHF_INFO_LINK ]
    Link:            .symtab
    AddressAlign:    0x8
    Info:            .eh_frame
    Relocations:
      - Offset:          0x1016C
        Symbol:          '.L0 '
        Type:            R_RISCV_32_PCREL
      - Offset:          0x10170
        Symbol:          '.L0  (1)'
        Type:            R_RISCV_ADD32
      - Offset:          0x10170
        Symbol:          '.L0 '
        Type:            R_RISCV_SUB32
      - Offset:          0x10175
        Symbol:          '.L0  (2)'
        Type:            R_RISCV_SET6
      - Offset:          0x10175
        Symbol:          '.L0 '
        Type:            R_RISCV_SUB6
  - Type:            SectionHeaderTable
    Sections:
      - Name:            .text
      - Name:            .rela.text
      - Name:            .eh_frame
      - Name:            .rela.eh_frame
      - Name:            .symtab
      - Name:            .strtab
      - Name:            .shstrtab
Symbols:
  - Name:            '$x'
    Section:         .text
    Value:           0x10144
  - Name:            ".L1\x021"
    Section:         .text
    Value:           0x1014A
  - Name:            '.L0 '
    Section:         .text
    Value:           0x10144
  - Name:            '.L0  (1)'
    Section:         .text
    Value:           0x1014C
  - Name:            '.L0  (2)'
    Section:         .text
    Value:           0x1014A
  - Name:            _start
    Type:            STT_FUNC
    Section:         .text
    Binding:         STB_GLOBAL
    Value:           0x10144
    Size:            0x8
...
+17 −0
Original line number Diff line number Diff line
# Test that BOLT can handle CIE's produced by GNU as. On RISC-V, GNU as produces
# the following initial instruction:
# DW_CFA_def_cfa_register: r2
# While I believe it is technically incorrect to use this instruction without
# first using a DW_CFA_def_cfa (since the offset is unspecified), both readelf
# and llvm-dwarfdump accept this and implicitly set the offset to 0.
# In BOLT, this used to trigger an assert, however, since it (correctly)
# believed the offset was not set. This test checks we can handle this
# situation.

RUN: yaml2obj -o %t %p/Inputs/cie-gnu.yaml
RUN: llvm-bolt -o %t.bolt %t --reorder-blocks=reverse
RUN: llvm-dwarfdump --debug-frame %t.bolt | FileCheck %s

CHECK: 0x400000: CFA=X2
CHECK: 0x400004: CFA=X2: X5=undefined
CHECK: 0x400006: CFA=X2