  1. 05 Jun, 2020 1 commit
    • [CostModel] Unify getMemoryOpCost · 9303546b
      Sam Parker authored
      Use getMemoryOpCost from the generic implementation of getUserCost
      and have getInstructionThroughput return the result of that for loads
      and stores.
      This also means that the X86 implementation of getUserCost can be
      removed with the functionality folded into its getMemoryOpCost.
      Differential Revision:
  2. 26 May, 2020 1 commit
    • [CostModel] Unify getCastInstrCost · 8aaabade
      Sam Parker authored
      Add the remaining cast instruction opcodes to the base implementation
      of getUserCost and directly return the result. This allows
      getInstructionThroughput to return getUserCost for the casts. This
      has required changes to PPC and SystemZ because they implement
      getUserCost and/or getCastInstrCost with adjustments for vector
      operations. Adjustments have also been made in the remaining backends
      that implement the method so that they still produce a cost of zero
      or one for cost kinds other than throughput.
      Differential Revision:
  3. 20 May, 2020 1 commit
    • [NFCI][CostModel] Refactor getIntrinsicInstrCost · 8cc911fa
      Sam Parker authored
      Combine the two API calls into one by introducing a structure to hold
      the relevant data. This has the added benefit of moving the
      boilerplate code for arguments and flags into the constructors. This is
      intended to be a non-functional change, but the complicated web of
      logic involved here makes it very hard to guarantee.
      Differential Revision:
  4. 14 May, 2020 1 commit
  5. 05 May, 2020 2 commits
  6. 29 Apr, 2020 1 commit
    • [TTI] Add DemandedElts to getScalarizationOverhead · 090cae84
      Simon Pilgrim authored
      The improvements to the x86 vector insert/extract element costs in D74976 caused the estimated costs for vector initialization and scalarization to rise more than expected. This is particularly noticeable on pre-SSE4 targets, where the availability of legal INSERT_VECTOR_ELT ops is more limited.
      This patch does 2 things:
      1 - it implements X86TTIImpl::getScalarizationOverhead to more accurately represent the typical costs of an ISD::BUILD_VECTOR pattern.
      2 - it adds a DemandedElts mask to getScalarizationOverhead to permit the SLP's BoUpSLP::getGatherCost to be rewritten to use it directly instead of accumulating raw vector insertion costs.
      This fixes PR45418 where a v4i8 (zext'd to v4i32) was no longer vectorizing.
      A future patch should extend X86TTIImpl::getScalarizationOverhead to tweak the EXTRACT_VECTOR_ELT scalarization costs as well.
      Reviewed By: @craig.topper
      Differential Revision:
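
The idea of a DemandedElts mask can be illustrated with a small, self-contained sketch. This is not the LLVM implementation: `ElementCosts`, `scalarizationOverhead`, the 64-bit mask, and the per-lane costs are all hypothetical stand-ins chosen for illustration. It only shows that lanes outside the mask contribute nothing to the total.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-lane costs; a real target would derive these itself.
struct ElementCosts {
  unsigned Insert;  // cost of inserting one scalar into a vector lane
  unsigned Extract; // cost of extracting one scalar from a vector lane
};

// Sum insert/extract costs over the demanded lanes only.
// Bit I of DemandedElts set means lane I is actually needed.
unsigned scalarizationOverhead(unsigned NumElts, std::uint64_t DemandedElts,
                               bool Insert, bool Extract,
                               ElementCosts Costs) {
  assert(NumElts <= 64 && "toy mask holds at most 64 lanes");
  unsigned Cost = 0;
  for (unsigned I = 0; I < NumElts; ++I) {
    if (!((DemandedElts >> I) & 1))
      continue; // lane not demanded: contributes nothing
    if (Insert)
      Cost += Costs.Insert;
    if (Extract)
      Cost += Costs.Extract;
  }
  return Cost;
}
```

With a mask, a caller that gathers only some lanes can ask for one combined cost instead of adding up individual insertion costs itself.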
  7. 28 Apr, 2020 1 commit
    • [TTI] Add TargetCostKind argument to getUserCost · e9c9329a
      Sam Parker authored
      There are several different types of cost that TTI tries to provide
      explicit information for: throughput, latency, code size along with
      a vague 'intersection of code-size cost and execution cost'.
      The vectorizer is a keen user of RecipThroughput and there's at least
      'getInstructionThroughput' and 'getArithmeticInstrCost' designed to
      help with this cost. The latency cost has a single use and a single
      implementation. The intersection cost appears to cover most of the
      rest of the API.
      getUserCost is explicitly called from within TTI when the user has
      been explicit in wanting the code size (also only one use) as well
      as a few passes which are concerned with a mixture of size and/or
      a relative cost. In many cases these costs are closely related, such
      as when multiple instructions are required, but one evident diverging
      cost in this function is for div/rem.
      This patch adds an argument so that the cost required is explicit,
      so that we can make the important distinction when necessary.
      Differential Revision:
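
A minimal sketch of why an explicit cost-kind argument matters, using div as the diverging case named above. The `CostKind` and `Opcode` enums, the `userCost` function, and every number below are hypothetical and purely illustrative; they are not the TTI API or any real target's costs.

```cpp
#include <cassert>

// Hypothetical stand-ins for the kinds of cost described in the message.
enum class CostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };
enum class Opcode { Add, SDiv };

// Illustrative numbers only, not taken from any real target.
unsigned userCost(Opcode Op, CostKind Kind) {
  switch (Op) {
  case Opcode::Add:
    return 1; // cheap under every kind of cost
  case Opcode::SDiv:
    // A divide is a single instruction in size terms, but it is
    // expensive to execute, so the answer depends on what is asked.
    return Kind == CostKind::CodeSize ? 1 : 4;
  }
  assert(false && "unhandled opcode");
  return 0;
}
```

Without the `Kind` parameter the function would have to pick one of these answers for every caller, which is the ambiguity the patch sets out to remove.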
  8. 23 Apr, 2020 1 commit
  9. 14 Apr, 2020 1 commit
  10. 03 Apr, 2020 1 commit
  11. 11 Mar, 2020 1 commit
    • [TTI][ARM][MVE] Refine gather/scatter cost model · a6d3bec8
      Anna Welker authored
      Refines the gather/scatter cost model, but also changes the TTI
      function getIntrinsicInstrCost to accept an additional parameter
      which is needed for the gather/scatter cost evaluation.
      This did require trivial changes in some non-ARM backends to
      adopt the new parameter.
      Extending gathers and truncating scatters are now priced cheaper.
      Differential Revision:
  12. 24 Jan, 2020 1 commit
    • [Alignment][NFC] Deprecate Align::None() · 805c157e
      Guillaume Chatelet authored
      This is a follow up on
      There's a caveat here: `Align(1)` relies on the compiler's understanding of the `Log2_64` implementation to produce good code. One could use `Align()` as a replacement, but I believe it is less clear that the alignment is one in that case.
      Reviewers: xbolva00, courbet, bollu
      Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
      Tags: #clang, #llvm
      Differential Revision:
  13. 09 Dec, 2019 1 commit
    • [ARM] Teach the Arm cost model that a Shift can be folded into other instructions · be7a1070
      David Green authored
      This attempts to teach the cost model in Arm that code such as:
        %s = shl i32 %a, 3
        %a = and i32 %s, %b
      Can under Arm or Thumb2 become:
        and r0, r1, r2, lsl #3
      So the cost of the shift can essentially be free. To do this without
      trying to artificially adjust the cost of the "and" instruction, it
      needs to get the users of the shl and check if they are a type of
      instruction that the shift can be folded into. And so it needs to have
      access to the actual instruction in getArithmeticInstrCost, which if
      available is added as an extra parameter much like getCastInstrCost.
      We otherwise limit it to shifts with a single user, which should
      hopefully handle most of the cases. The list of instructions that the
      shift can be folded into includes ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR,
      ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and
      Differential Revision:
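
The single-user check described above might look roughly like the following toy model. The `Inst` struct, the `Opcode` enum, and both functions are hypothetical and do not use LLVM's classes. The foldable set is limited to the five IR opcodes that survive in the message (Add, Sub, And, Or, Xor); the original list is cut off, so it may be incomplete.

```cpp
#include <cassert>
#include <vector>

enum class Opcode { Shl, Add, Sub, And, Or, Xor, Mul };

// Toy instruction: an opcode plus the instructions that use its result.
struct Inst {
  Opcode Op;
  std::vector<const Inst *> Users;
};

// True if a shifted operand can be folded into an instruction of this kind.
bool canFoldShiftInto(Opcode UserOp) {
  switch (UserOp) {
  case Opcode::Add:
  case Opcode::Sub:
  case Opcode::And:
  case Opcode::Or:
  case Opcode::Xor:
    return true;
  default:
    return false;
  }
}

// A shift is treated as free when its only user can absorb it;
// otherwise it is charged a nominal cost of one.
unsigned shiftCost(const Inst &Shift) {
  assert(Shift.Op == Opcode::Shl && "expected a shift");
  if (Shift.Users.size() == 1 && canFoldShiftInto(Shift.Users.front()->Op))
    return 0;
  return 1;
}
```

Keeping the adjustment on the shift, as the message explains, avoids having to artificially change the cost of the instruction that absorbs it.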
  14. 25 Oct, 2019 1 commit
  15. 17 Sep, 2019 1 commit
    • [SVE][MVT] Fixed-length vector MVT ranges · 1a9195d8
      Graham Hunter authored
        * Reordered MVT simple types to group scalable vector types
        * New range functions in MachineValueType.h to only iterate over
          the fixed-length int/fp vector types.
        * Stopped backends which don't support scalable vector types from
          iterating over scalable types.
      Reviewers: sdesmalen, greened
      Reviewed By: greened
      Differential Revision:
      llvm-svn: 372099
  16. 22 May, 2019 1 commit
  17. 19 Jan, 2019 1 commit
    • Update the file headers across all of the LLVM projects in the monorepo · 2946cd70
      Chandler Carruth authored
      to reflect the new license.
      We understand that people may be surprised that we're moving the header
      entirely to discuss the new license. We checked this carefully with the
      Foundation's lawyer and we believe this is the correct approach.
      Essentially, all code in the project is now made available by the LLVM
      project under our new license, so you will see that the license headers
      include that license only. Some of our contributors have contributed
      code under our old license, and accordingly, we have retained a copy of
      our old license notice in the top-level files in each project and
      llvm-svn: 351636
  18. 05 Nov, 2018 1 commit
  19. 31 Oct, 2018 1 commit
    • [LV] Support vectorization of interleave-groups that require an epilog under · 34da6dd6
      Dorit Nuzman authored
      optsize using masked wide loads 
      Under Opt for Size, the vectorizer does not vectorize interleave-groups that
      have gaps at the end of the group (such as a loop that reads only the even
      elements: a[2*i]) because that implies that we'll require a scalar epilogue
      (which is not allowed under Opt for Size). This patch extends the support for
      masked-interleave-groups (introduced by D53011 for conditional accesses) to
      also cover the case of gaps in a group of loads; Targets that enable the
      masked-interleave-group feature don't have to invalidate interleave-groups of
      loads with gaps; they could now use masked wide-loads and shuffles (if that's
      what the cost model selects).
      Reviewers: Ayal, hsaito, dcaballe, fhahn
      Reviewed By: Ayal
      Differential Revision:
      llvm-svn: 345705
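
To make the "masked wide load" idea concrete, here is a small sketch of how a lane mask for a group with gaps could be built. The function name `gapMask` and its interface are hypothetical, not the vectorizer's API. For the a[2*i] example, the group has factor 2 with only member 0 present, so every second lane of the wide load is disabled.

```cpp
#include <cassert>
#include <vector>

// Build the lane mask for one wide load that covers VF iterations of an
// interleave group. MemberPresent[J] says whether member J of the group
// is actually accessed; the group's factor is MemberPresent.size().
std::vector<bool> gapMask(unsigned VF, const std::vector<bool> &MemberPresent) {
  const unsigned Factor = static_cast<unsigned>(MemberPresent.size());
  assert(Factor > 0 && "a group needs at least one member slot");
  std::vector<bool> Mask;
  Mask.reserve(static_cast<std::size_t>(VF) * Factor);
  for (unsigned I = 0; I < VF; ++I)       // one group per vector iteration
    for (unsigned J = 0; J < Factor; ++J) // lanes within the group
      Mask.push_back(MemberPresent[J]);   // gap lanes are masked off
  return Mask;
}
```

Because the final lane of the wide load is masked off when the last member is a gap, the load does not touch memory past the last element that is really read, which is the reason a scalar epilogue is otherwise required.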
  20. 27 Oct, 2018 1 commit
  21. 24 Oct, 2018 1 commit
    • [Hexagon] Flip hexagon-autohvx to be true by default · 57b5ac14
      Krzysztof Parzyszek authored
      This will allow other generators of LLVM IR to use the auto-vectorizer
      without having to change that flag.
      Note: on its own, this patch will enable auto-vectorization on Hexagon
      in all cases, regardless of the -fvectorize flag. There is a companion
      clang patch that together with this one forms an NFC for clang users.
      llvm-svn: 345169
  22. 14 Oct, 2018 3 commits
    • recommit 344472 after fixing build failure on ARM and PPC. · 38bbf81a
      Dorit Nuzman authored
      llvm-svn: 344475
    • revert 344472 due to failures. · 5118c68c
      Dorit Nuzman authored
      llvm-svn: 344473
    • [IAI,LV] Add support for vectorizing predicated strided accesses using masked · 81743689
      Dorit Nuzman authored
      The vectorizer currently does not attempt to create interleave-groups that
      contain predicated loads/stores; predicated strided accesses can currently be
      vectorized only using masked gather/scatter or scalarization. This patch makes
      predicated loads/stores candidates for forming interleave-groups during the
      Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-
      groups to the Loop-Vectorizer's planning and transformation stages. The patch
      also extends the TTI API to allow querying the cost of masked interleave groups
      (which each target can control); Targets that support masked vector loads/
      stores may choose to enable this feature and allow vectorizing predicated
      strided loads/stores using masked wide loads/stores and shuffles.
      Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar
      Reviewed By: Ayal
      Differential Revision:
      llvm-svn: 344472
  23. 22 Aug, 2018 1 commit
  24. 12 Jun, 2018 1 commit
  25. 20 Apr, 2018 2 commits
  26. 16 Apr, 2018 1 commit
  27. 13 Apr, 2018 2 commits
  28. 03 Apr, 2018 2 commits
  29. 30 Mar, 2018 2 commits
  30. 27 Mar, 2018 1 commit
  31. 26 Mar, 2018 1 commit
  32. 01 Aug, 2017 1 commit
  33. 30 Jun, 2017 1 commit