Commit eb2e0b41 authored by Hans Wennborg's avatar Hans Wennborg
Browse files

Merging r324002:

------------------------------------------------------------------------
r324002 | ctopper | 2018-02-01 21:48:50 +0100 (Thu, 01 Feb 2018) | 7 lines

[DAGCombiner] When folding (insert_subvector undef, (bitcast (extract_subvector N1, Idx)), Idx) -> (bitcast N1) make sure that N1 has the same total size as the original output

We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits.

Fixes PR36199

Differential Revision: https://reviews.llvm.org/D42809
------------------------------------------------------------------------

llvm-svn: 324216
parent 8be5478f
Loading
Loading
Loading
Loading
+3 −1
Original line number Diff line number Diff line
@@ -16409,7 +16409,9 @@ SDValue DAGCombiner::visitINSERT_SUBVECTOR(SDNode *N) {
      N1.getOperand(0).getOpcode() == ISD::EXTRACT_SUBVECTOR &&
      N1.getOperand(0).getOperand(1) == N2 &&
      N1.getOperand(0).getOperand(0).getValueType().getVectorNumElements() ==
          VT.getVectorNumElements()) {
          VT.getVectorNumElements() &&
      N1.getOperand(0).getOperand(0).getValueType().getSizeInBits() ==
          VT.getSizeInBits()) {
    return DAG.getBitcast(VT, N1.getOperand(0).getOperand(0));
  }
+22 −0
Original line number Diff line number Diff line
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=skylake-avx512 | FileCheck %s

define void @foo() unnamed_addr #0 {
; CHECK-LABEL: foo:
; CHECK:       # %bb.0:
; CHECK-NEXT:    vaddps %zmm0, %zmm0, %zmm0
; CHECK-NEXT:    vinsertf128 $1, %xmm0, %ymm0, %ymm0
; CHECK-NEXT:    vinsertf64x4 $1, %ymm0, %zmm0, %zmm0
; CHECK-NEXT:    vmovups %zmm0, (%rax)
; CHECK-NEXT:    vzeroupper
; CHECK-NEXT:    retq
  %1 = fadd <16 x float> undef, undef
  %bc256 = bitcast <16 x float> %1 to <4 x i128>
  %2 = extractelement <4 x i128> %bc256, i32 0
  %3 = bitcast i128 %2 to <4 x float>
  %4 = shufflevector <4 x float> %3, <4 x float> undef, <16 x i32> <i32 0, i32
1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 0,
i32 1, i32 2, i32 3>
  store <16 x float> %4, <16 x float>* undef, align 4
  ret void
}