This project is mirrored from https://github.com/llvm-doe-org/llvm-project.git.
Pull mirroring updated .
- 17 Jun, 2020 1 commit
-
-
Sjoerd Meijer authored
Refactor TTI hook emitGetActiveLaneMask and remove the unused arguments as suggested in D79100.
-
- 12 Jun, 2020 1 commit
-
-
David Green authored
Similar to a recent change to the X86 backend, this changes things so that we always produce a reduction intrinsics for all reduction types, not just the legal ones. This gives a better chance in the backend to custom lower them to something more suitable for MVE. Especially for something like fadd the in-order reduction produced during DAG lowering is already better than the shuffles produced in the midend, and we can do even better with a bit of custom lowering. Differential Revision: https://reviews.llvm.org/D81398
-
- 10 Jun, 2020 1 commit
-
-
Sam Parker authored
Add the remaining arithmetic opcodes into the generic implementation of getUserCost and then call this from getInstructionThroughput. Most of the backends have been modified to return the base implementation for cost kinds other RecipThroughput. The outlier here is AMDGPU which already uses getArithmeticInstrCost for all the cost kinds. This change means that most of the opcodes can be removed from that backends implementation of getUserCost. Differential Revision: https://reviews.llvm.org/D80992
-
- 09 Jun, 2020 1 commit
-
-
Sam Parker authored
Add cases for icmp, fcmp and select into the switch statement of the generic getUserCost implementation with getInstructionThroughput then calling into it. The BasicTTI and backend implementations have be set to return a default value (1) when a cost other than throughput is being queried. Differential Revision: https://reviews.llvm.org/D80550
-
- 08 Jun, 2020 1 commit
-
-
Sam Parker authored
Adding type checks into the other backends that call getTypeLegalizationCost. Differential Revision: https://reviews.llvm.org/D80984
-
- 05 Jun, 2020 1 commit
-
-
Sam Parker authored
Use getMemoryOpCost from the generic implementation of getUserCost and have getInstructionThroughput return the result of that for loads and stores. This also means that the X86 implementation of getUserCost can be removed with the functionality folded into its getMemoryOpCost. Differential Revision: https://reviews.llvm.org/D80984
-
- 29 May, 2020 1 commit
-
-
Sjoerd Meijer authored
This is split off from D79100 and adds a new target hook emitGetActiveLaneMask that can be queried to check if the intrinsic @llvm.get.active.lane.mask() is supported by the backend and if it should be emitted for a given loop. See also commit rG7fb8a40e and its commit message for more details/context on this new intrinsic. Differential Revision: https://reviews.llvm.org/D80597
-
- 26 May, 2020 1 commit
-
-
Sam Parker authored
Add the remaining cast instruction opcodes to the base implementation of getUserCost and directly return the result. This allows getInstructionThroughput to return getUserCost for the casts. This has required changes to PPC and SystemZ because they implement getUserCost and/or getCastInstrCost with adjustments for vector operations. Adjusts have also been made in the remaining backends that implement the method so that they still produce a cost of zero or one for cost kinds other than throughput. Differential Revision: https://reviews.llvm.org/D79848
-
- 23 May, 2020 1 commit
-
-
Craig Topper authored
If the caller needs to reponsible for making sure the MaybeAlign has a value, then we should just make the caller convert it to an Align with operator*. I explicitly deleted the relational comparison operators that were being inherited from Optional. It's unclear what the meaning of two MaybeAligns were one is defined and the other isn't should be. So make the caller reponsible for defining the behavior. I left the ==/!= operators from Optional. But now that exposed a weird quirk that ==/!= between Align and MaybeAlign required the MaybeAlign to be defined. But now we use the operator== from Optional that takes an Optional and the Value. Differential Revision: https://reviews.llvm.org/D80455
-
- 15 May, 2020 1 commit
-
-
Christopher Tetreault authored
Reviewers: efriedma, fpetrogalli, kmclaughlin, grosbach, dmgreen Reviewed By: dmgreen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, dmgreen, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79816
-
- 13 May, 2020 1 commit
-
-
Pierre-vh authored
This patch adds a new TTI hook to allow targets to tell LSR that a chain including some instruction is already profitable and should not be optimized. This patch also adds an implementation of this TTI hook for ARM so LSR doesn't optimize chains that include the VCTP intrinsic. Differential Revision: https://reviews.llvm.org/D79418
-
- 12 May, 2020 1 commit
-
-
Sam Parker authored
- Specifically check for sext/zext users which have 'long' form NEON instructions. - Add more entries to the table for sext/zexts so that we can report more accurately the number of vmovls required for NEON. - Pass the instruction to the pass implementation. Differential Revision: https://reviews.llvm.org/D79561
-
- 05 May, 2020 2 commits
-
-
Simon Pilgrim authored
getScalarizationOverhead is only ever called with vectors (and we already had a load of cast<VectorType> calls immediately inside the functions). Followup to D78357 Reviewed By: @samparker Differential Revision: https://reviews.llvm.org/D79341
-
Sam Parker authored
Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate better costs. It will also allow some backends to approximate and correlate the different costs if they wish. Another benefit is that it will also help simplify the cost model around immediate and intrinsic costs, where we currently have multiple APIs. RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html Differential Revision: https://reviews.llvm.org/D79002
-
- 04 May, 2020 1 commit
-
-
David Green authored
-
- 28 Apr, 2020 1 commit
-
-
Sam Parker authored
There are several different types of cost that TTI tries to provide explicit information for: throughput, latency, code size along with a vague 'intersection of code-size cost and execution cost'. The vectorizer is a keen user of RecipThroughput and there's at least 'getInstructionThroughput' and 'getArithmeticInstrCost' designed to help with this cost. The latency cost has a single use and a single implementation. The intersection cost appears to cover most of the rest of the API. getUserCost is explicitly called from within TTI when the user has been explicit in wanting the code size (also only one use) as well as a few passes which are concerned with a mixture of size and/or a relative cost. In many cases these costs are closely related, such as when multiple instructions are required, but one evident diverging cost in this function is for div/rem. This patch adds an argument so that the cost required is explicit, so that we can make the important distinction when necessary. Differential Revision: https://reviews.llvm.org/D78635
-
- 22 Apr, 2020 1 commit
-
-
Craig Topper authored
In some cases just delete an unneeded include.
-
- 20 Apr, 2020 1 commit
-
-
Sam Parker authored
The API for shuffles and reductions uses generic Type parameters, instead of VectorType, and so assertions and casts are used a lot. This patch makes those types explicit, which means that the clients can't be lazy, but results in less ambiguity, and that can only be a good thing. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45562 Differential Revision: https://reviews.llvm.org/D78357
-
- 09 Apr, 2020 1 commit
-
-
Christopher Tetreault authored
Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: grosbach, efriedma, sdesmalen Reviewed By: efriedma Subscribers: hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77271
-
- 11 Mar, 2020 1 commit
-
-
Anna Welker authored
Refines the gather/scatter cost model, but also changes the TTI function getIntrinsicInstrCost to accept an additional parameter which is needed for the gather/scatter cost evaluation. This did require trivial changes in some non-ARM backends to adopt the new parameter. Extending gathers and truncating scatters are now priced cheaper. Differential Revision: https://reviews.llvm.org/D75525
-
- 03 Feb, 2020 1 commit
-
-
Guillaume Chatelet authored
Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73874
-
- 31 Jan, 2020 1 commit
-
-
Guillaume Chatelet authored
Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code. Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73785
-
- 24 Jan, 2020 1 commit
-
-
David Green authored
The codegen for splitting a llvm.vector.reduction intrinsic into parts will be better than the codegen for the generic reductions. This will only directly effect when vectorization factors are specified by the user. Also added tests to make sure the codegen for larger reductions is OK. Differential Revision: https://reviews.llvm.org/D72257
-
- 22 Jan, 2020 1 commit
-
-
David Green authored
This is a very basic MVE gather/scatter cost model, based roughly on the code that we will currently produce. It does not handle truncating scatters or extending gathers correctly yet, as it is difficult to tell that they are going to be correctly extended/truncated from the limited information in the cost function. This can be improved as we extend support for these in the future. Based on code originally written by David Sherwood. Differential Revision: https://reviews.llvm.org/D73021
-
- 20 Jan, 2020 1 commit
-
-
David Green authored
We were previously not necessarily favouring postinc for the MVE loads and stores, leading to extra code prior to the loop to set up the preinc. MVE in general can benefit from postinc (as we don't have unrolled loops), and certain instructions like the VLD2's only post-inc versions are available. Differential Revision: https://reviews.llvm.org/D70790
-
- 09 Jan, 2020 1 commit
-
-
Sam Parker authored
We don't unroll vector loops for MVE targets, but we miss the case when loops only contain intrinsic calls. So just move the logic a bit to catch this case. Differential Revision: https://reviews.llvm.org/D72440
-
- 08 Jan, 2020 1 commit
-
-
Anna Welker authored
Adds a pass to the ARM backend that takes a v4i32 gather and transforms it into a call to MVE's masked gather intrinsics. Differential Revision: https://reviews.llvm.org/D71743
-
- 12 Dec, 2019 1 commit
-
-
Reid Kleckner authored
Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381
-
- 09 Dec, 2019 3 commits
-
-
David Green authored
With the extra optimisations we have done, these should now be fine to enable by default. Which is what this patch does. Differential Revision: https://reviews.llvm.org/D70968
-
David Green authored
This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966
-
David Green authored
This adds some extra cost model tests for shifts, and does some minor adjustments to some Neon code to make it clear as to what it applies to. Both NFC.
-
- 19 Nov, 2019 1 commit
-
-
David Green authored
Now that we have the intrinsics, we can add VLD2/4 and VST2/4 lowering for MVE. This works the same way as Neon, recognising the load/shuffles combination and converting them into intrinsics in a pre-isel pass, which just calls getMaxSupportedInterleaveFactor, lowerInterleavedLoad and lowerInterleavedStore. The main difference to Neon is that we do not have a VLD3 instruction. Otherwise most of the code works very similarly, with just some minor differences in the form of the intrinsics to work around. VLD3 is disabled by making isLegalInterleavedAccessType return false for those cases. We may need some other future adjustments, such as VLD4 take up half the available registers so should maybe cost more. This patch should get the basics in though. Differential Revision: https://reviews.llvm.org/D69392
-
- 15 Nov, 2019 1 commit
-
-
Sjoerd Meijer authored
This is a follow up of d90804d2, to also flag fmcp instructions as instructions that we do not support in tail-predicated vector loops. Differential Revision: https://reviews.llvm.org/D70295
-
- 13 Nov, 2019 1 commit
-
-
Sjoerd Meijer authored
This implements TTI hook 'preferPredicateOverEpilogue' for MVE. This is a first version and it operates on single block loops only. With this change, the vectoriser will now determine if tail-folding scalar remainder loops is possible/desired, which is the first step to generate MVE tail-predicated vector loops. This is disabled by default for now. I.e,, this is depends on option -disable-mve-tail-predication, which is off by default. I will follow up on this soon with a patch for the vectoriser to respect loop hint 'vectorize.predicate.enable'. I.e., with this loop hint set to Disabled, we don't want to tail-fold and we shouldn't query this TTI hook, which is done in D70125. Differential Revision: https://reviews.llvm.org/D69845
-
- 06 Nov, 2019 1 commit
-
-
Sjoerd Meijer authored
We have two ways to steer creating a predicated vector body over creating a scalar epilogue. To force this, we have 1) a command line option and 2) a pragma available. This adds a third: a target hook to TargetTransformInfo that can be queried whether predication is preferred or not, which allows the vectoriser to make the decision without forcing it. While this change behaves as a non-functional change for now, it shows the required TTI plumbing, usage of this new hook in the vectoriser, and the beginning of an ARM MVE implementation. I will follow up on this with: - a complete MVE implementation, see D69845. - a patch to disable this, i.e. we should respect "vector_predicate(disable)" and its corresponding loophint. Differential Revision: https://reviews.llvm.org/D69040
-
- 25 Oct, 2019 1 commit
-
-
Guillaume Chatelet authored
Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69307
-
- 17 Oct, 2019 1 commit
-
-
Sam Parker authored
Add generic DAG combine for extending masked loads. Allow us to generate sext/zext masked loads which can access v4i8, v8i8 and v4i16 memory to produce v4i32, v8i16 and v4i32 respectively. Differential Revision: https://reviews.llvm.org/D68337 llvm-svn: 375085
-
- 14 Oct, 2019 1 commit
-
-
Sam Parker authored
Add an extra parameter so the backend can take the alignment into consideration. Differential Revision: https://reviews.llvm.org/D68400 llvm-svn: 374763
-
- 15 Sep, 2019 1 commit
-
-
David Green authored
Masked loads and store fit naturally with MVE, the instructions being easily predicated. This adds lowering for the simple cases of masked loads and stores. It does not yet deal with widening/narrowing or pre/post inc, and so is currently behind an option. The llvm masked load intrinsic will accept a "passthru" value, dictating the values used for the zero masked lanes. In MVE the instructions write 0 to the zero predicated lanes, so we need to match a passthru that isn't 0 (or undef) with a select instruction to pull in the correct data after the load. Differential Revision: https://reviews.llvm.org/D67186 llvm-svn: 371932
-
- 13 Sep, 2019 1 commit
-
-
Sam Tebbs authored
This patch adds vecreduce_smax, vecredude_umax, vecreduce_smin, vecreduce_umin and selection for vmaxv and minv. Differential Revision: https://reviews.llvm.org/D66413 llvm-svn: 371827
-