Sanjoy Das [Wed, 4 Nov 2015 20:33:45 +0000 (20:33 +0000)]
[IR] Add bounds checking to paramHasAttr
Summary:
This is intended to make a later change simpler.
Note: adding this bounds checking required fixing `X86FastISel`. As
far as I can tell I've preserved original behavior but a careful review
will be appreciated.
Reviewers: reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14304
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252073
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Wed, 4 Nov 2015 19:43:24 +0000 (19:43 +0000)]
Orc: Streamline some lambda usage in a unit test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252070
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Wed, 4 Nov 2015 19:18:11 +0000 (19:18 +0000)]
Relax the check for ninja.
On Fedora the ninja executable is called ninja-build :-(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252062
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Kaylor [Wed, 4 Nov 2015 18:10:41 +0000 (18:10 +0000)]
Created new X86 FMA3 opcodes (FMA*_Int) that are used now for lowering of scalar FMA intrinsics.
Patch by Slava Klochkov
The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result.
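To illustrate why the first operand must not be commuted, here is a minimal C++ sketch that models a scalar FMA intrinsic on a two-lane vector. The `Vec2` type and the `scalarFmaModel` and `commuteIsSafe` functions are hypothetical names invented for this sketch; they are not LLVM or intrinsic APIs. The model assumes only what the message states: the upper bits of the result come from the first operand.

```cpp
#include <array>
#include <cassert>

// Hypothetical two-lane vector: lane 0 is the scalar lane, lane 1 stands in
// for the "upper bits" of the register.
using Vec2 = std::array<double, 2>;

// Model of a scalar FMA intrinsic: lane 0 receives a*b+c, and the upper lane
// is copied from the first operand, as described in the commit message.
Vec2 scalarFmaModel(const Vec2 &a, const Vec2 &b, const Vec2 &c) {
  return {a[0] * b[0] + c[0], a[1]};
}

// Returns true when swapping the first two operands leaves the whole result
// unchanged, i.e. when a commute would happen to be harmless.
bool commuteIsSafe(const Vec2 &a, const Vec2 &b, const Vec2 &c) {
  return scalarFmaModel(a, b, c) == scalarFmaModel(b, a, c);
}
```

In this model the scalar lane is unaffected by the swap, but the upper lane changes whenever the upper lanes of the first two operands differ, which is the case the FMA*_Int opcodes guard against.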
Reviewers: Quentin Colombet, Elena Demikhovsky
Differential Revision: http://reviews.llvm.org/D13710
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252060
91177308-0d34-0410-b5e6-
96231b3b80d8
James Molloy [Wed, 4 Nov 2015 16:55:07 +0000 (16:55 +0000)]
[ARM] Combine CMOV into BFI where possible
If we have a CMOV, OR and AND combination such as:
  if (x & CN)
    y |= CM;
And:
* CN is a single bit;
* All bits covered by CM are known zero in y;
Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).
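The equivalence behind this combine can be sketched in portable C++. The `bfi` function below is a software model of a bitfield insert (copy the low `width` bits of a source into a destination at bit `lsb`); `branchy` and `viaBfi` are hypothetical names for this sketch. The one-insert-per-set-bit loop is an assumption about how a sequence of BFIs could cover a multi-bit CM, not a description of the exact instructions the patch emits.

```cpp
#include <cassert>
#include <cstdint>

// Software model of a bitfield insert: copy the low `width` bits of `src`
// into `dst` starting at bit `lsb`, leaving the other bits of `dst` alone.
uint32_t bfi(uint32_t dst, uint32_t src, unsigned lsb, unsigned width) {
  assert(width >= 1 && lsb + width <= 32);
  uint32_t field = (width == 32) ? ~0u : ((1u << width) - 1u);
  uint32_t mask = field << lsb;
  return (dst & ~mask) | ((src << lsb) & mask);
}

// The original conditional form, with CN being the single bit `cnBit`.
uint32_t branchy(uint32_t x, uint32_t y, unsigned cnBit, uint32_t cm) {
  assert(cnBit < 32);
  if (x & (1u << cnBit))
    y |= cm;
  return y;
}

// Branch-free form: insert the tested bit of x at every bit position set in
// cm. This is only valid when all bits covered by cm are zero in y.
uint32_t viaBfi(uint32_t x, uint32_t y, unsigned cnBit, uint32_t cm) {
  assert(cnBit < 32);
  assert((y & cm) == 0);
  uint32_t bit = x >> cnBit; // bit 0 now holds the tested bit
  for (unsigned m = 0; m < 32; ++m)
    if (cm & (1u << m))
      y = bfi(y, bit, m, 1);
  return y;
}
```

The precondition that the CM bits are known zero in y is what makes the insert safe: writing a 0 into a position that is already 0 leaves y unchanged, matching the not-taken branch.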
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252057
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Wed, 4 Nov 2015 16:01:16 +0000 (16:01 +0000)]
[ThinLTO] Always set linkage type to external when converting alias
When converting an alias to a non-alias when the aliasee is not
imported, ensure that the linkage type is set to external so that it is
a valid linkage type. Added a test case that exposed this issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252054
91177308-0d34-0410-b5e6-
96231b3b80d8
James Molloy [Wed, 4 Nov 2015 15:28:04 +0000 (15:28 +0000)]
[SimplifyCFG] Merge conditional stores
We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code:
  if (c & flag1)
    *a = x;
  if (c & flag2)
    *a = y;
  ...
There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to.
It is, however, legal to merge the stores together and do the store once:
  tmp = undef;
  if (c & flag1)
    tmp = x;
  if (c & flag2)
    tmp = y;
  if (c & flag1 || c & flag2)
    *a = tmp;
The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too.
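The two forms above can be written as compilable C++ to check that they store the same value. This is only a source-level sketch of the idea; the function names are hypothetical, and `tmp` is initialized to 0 as a stand-in for 'undef', since C++ has no equivalent. The merged guard uses the single binary-AND form mentioned above.

```cpp
#include <cassert>

// Before: two conditional stores to *a that cannot be speculated.
void storesBefore(unsigned c, unsigned flag1, unsigned flag2, int x, int y,
                  int *a) {
  if (c & flag1)
    *a = x;
  if (c & flag2)
    *a = y;
}

// After: the value is threaded through tmp and stored at most once.
void storesMerged(unsigned c, unsigned flag1, unsigned flag2, int x, int y,
                  int *a) {
  int tmp = 0; // stand-in for 'undef'; only stored if one branch overwrote it
  if (c & flag1)
    tmp = x;
  if (c & flag2)
    tmp = y;
  if (c & (flag1 | flag2)) // same as (c & flag1 || c & flag2)
    *a = tmp;
}

// Runs both versions from the same initial memory value and compares results.
bool sameEffect(unsigned c, unsigned flag1, unsigned flag2, int x, int y,
                int initial) {
  int a1 = initial;
  int a2 = initial;
  storesBefore(c, flag1, flag2, x, y, &a1);
  storesMerged(c, flag1, flag2, x, y, &a2);
  return a1 == a2;
}
```

When neither flag is set, neither version writes to `*a`, which is why the merged store has to stay guarded.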
This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure.
The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertible, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate.
This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252051
91177308-0d34-0410-b5e6-
96231b3b80d8
Filipe Cabecinhas [Wed, 4 Nov 2015 14:53:36 +0000 (14:53 +0000)]
Error out when faced with value names containing '\0'
Bug found with afl-fuzz.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252048
91177308-0d34-0410-b5e6-
96231b3b80d8
Aaron Ballman [Wed, 4 Nov 2015 14:40:54 +0000 (14:40 +0000)]
Silence an extra semicolon warning; NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252046
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Kuperstein [Wed, 4 Nov 2015 11:21:50 +0000 (11:21 +0000)]
[ELF] elfiamcu triple should imply e_machine == EM_IAMCU
Differential Revision: http://reviews.llvm.org/D14109
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252043
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Kuperstein [Wed, 4 Nov 2015 11:17:53 +0000 (11:17 +0000)]
[X86] DAGCombine should not introduce FILD in soft-float mode
The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp
directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252042
91177308-0d34-0410-b5e6-
96231b3b80d8
James Molloy [Wed, 4 Nov 2015 08:36:53 +0000 (08:36 +0000)]
Revert "[PatternMatch] Switch to use ValueTracking::matchSelectPattern"
This was breaking the modules build and is being reverted while we reach consensus on the right way to solve this layering problem. This reverts commit r251785.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252040
91177308-0d34-0410-b5e6-
96231b3b80d8
Pawel Bylica [Wed, 4 Nov 2015 08:25:20 +0000 (08:25 +0000)]
Fix unit tests on Windows: handle env vars with non-ASCII chars.
Summary: On Windows we have to take UTF16 encoded env vars and convert them to UTF8. This patch fixes CopyEnvironment helper function used by process unit tests.
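For readers unfamiliar with the conversion involved, the following is a self-contained C++ sketch of turning UTF-16 code units into UTF-8 bytes, including surrogate pairs. It is not the code from this patch, and `appendUtf8` and `utf16ToUtf8` are hypothetical names; replacing an unpaired surrogate with U+FFFD is an assumption made here for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append one Unicode code point to `out` as UTF-8 (one to four bytes).
void appendUtf8(std::string &out, char32_t cp) {
  if (cp < 0x80) {
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
}

// Convert a UTF-16 string (such as a Windows environment variable) to UTF-8.
std::string utf16ToUtf8(const std::u16string &in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    char32_t cp = in[i];
    bool isHigh = cp >= 0xD800 && cp <= 0xDBFF;
    bool isLow = cp >= 0xDC00 && cp <= 0xDFFF;
    if (isHigh && i + 1 < in.size() && in[i + 1] >= 0xDC00 &&
        in[i + 1] <= 0xDFFF) {
      // Combine a high/low surrogate pair into one code point.
      cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
      ++i;
    } else if (isHigh || isLow) {
      cp = 0xFFFD; // assumption: replace unpaired surrogates
    }
    appendUtf8(out, cp);
  }
  return out;
}
```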
Reviewers: yaron.keren
Subscribers: yaron.keren, llvm-commits
Differential Revision: http://reviews.llvm.org/D14278
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252039
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Wed, 4 Nov 2015 04:31:21 +0000 (04:31 +0000)]
[OperandBundles] Refactor; NFCI.
Extract out a helper function `operandBundleFromBundleOpInfo`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252038
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Wed, 4 Nov 2015 04:31:06 +0000 (04:31 +0000)]
[OperandBundles] Refactor; NFCI
Intended to make later changes simpler. Exposes
`getBundleOperandsStartIndex` and `getBundleOperandsEndIndex`, and uses
them for the computation in `getNumTotalBundleOperands`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252037
91177308-0d34-0410-b5e6-
96231b3b80d8
Philip Reames [Wed, 4 Nov 2015 01:47:04 +0000 (01:47 +0000)]
[LVI] Update a comment to clarify what's actually happening and why
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252033
91177308-0d34-0410-b5e6-
96231b3b80d8
Philip Reames [Wed, 4 Nov 2015 01:43:54 +0000 (01:43 +0000)]
[CVP] Fold return values if possible
In my previous change to CVP (251606), I made CVP much more aggressive about trying to constant fold comparisons. This patch is a reversal in direction. Rather than being aggressive about every compare, we restore the non-block local restriction for most, and then try hard for compares feeding returns.
The motivation for this is twofold:
* The more I thought about it, the less comfortable I got with the possible compile time impact of the other approach. There have been no reported issues, but after talking to a couple of folks, I've come to the conclusion the time probably isn't justified.
* It turns out we need to know the context to leverage the full power of LVI. In particular, asking about something at the end of its block (the use of a compare in a return) will frequently get more precise results than something in the middle of a block. This is an implementation detail, but it's also hard to get around since mid-block queries have to reason about possible throwing instructions and don't get to use most of LVI's block focused infrastructure. This will become particularly important when combined with http://reviews.llvm.org/D14263.
Differential Revision: http://reviews.llvm.org/D14271
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252032
91177308-0d34-0410-b5e6-
96231b3b80d8
Igor Laevsky [Wed, 4 Nov 2015 01:16:10 +0000 (01:16 +0000)]
[StatepointLowering] Remove distinction between call and invoke safepoints
There is no point in having invoke safepoints handled differently than the
call safepoints. All relevant decisions could be made by looking at whether
or not gc.result and gc.relocate lie in the same basic block. This change will
allow us to lower call safepoints with relocates and results in different
basic blocks. See test case for example.
Differential Revision: http://reviews.llvm.org/D14158
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252028
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Wed, 4 Nov 2015 01:09:37 +0000 (01:09 +0000)]
Fix the test case for Windows.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252027
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Wed, 4 Nov 2015 00:30:26 +0000 (00:30 +0000)]
[LLVMSymbolize] Reduce indentation by using helper function. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252022
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Wed, 4 Nov 2015 00:30:24 +0000 (00:30 +0000)]
[LLVMSymbolize] Properly propagate object parsing errors from the library.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252021
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Wed, 4 Nov 2015 00:30:19 +0000 (00:30 +0000)]
[llvm-symbolizer] Improve the test for missing input file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252020
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Wed, 4 Nov 2015 00:10:33 +0000 (00:10 +0000)]
Fix unused variable warning from r252017
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252019
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Tue, 3 Nov 2015 23:50:08 +0000 (23:50 +0000)]
LLE 6/6: Add LoopLoadElimination pass
Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop. E.g.:
  for (i)
    A[i + 1] = A[i] + B[i]
  =>
  T = A[0]
  for (i)
    T = T + B[i]
    A[i + 1] = T
The pass relies on loop dependence analysis via LoopAccessAnalysis to
find opportunities of loop-carried dependences with a distance of one
between a store and a load. Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.
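The transformation from the example can be sketched at the source level in C++. The function names are hypothetical, and the two arrays are separate `std::vector` objects, so the sketch assumes the no-alias case and does not model the run-time versioning checks described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original loop: each iteration reloads the A[i] that the previous
// iteration stored as A[i + 1].
void original(std::vector<int> &A, const std::vector<int> &B) {
  assert(A.empty() || B.size() + 1 >= A.size());
  for (std::size_t i = 0; i + 1 < A.size(); ++i)
    A[i + 1] = A[i] + B[i];
}

// After store-to-load forwarding across the backedge: the value is carried
// in the scalar T, so the loop no longer loads from A.
void forwarded(std::vector<int> &A, const std::vector<int> &B) {
  assert(A.empty() || B.size() + 1 >= A.size());
  if (A.size() < 2)
    return;
  int T = A[0];
  for (std::size_t i = 0; i + 1 < A.size(); ++i) {
    T = T + B[i];
    A[i + 1] = T;
  }
}
```

Both functions leave A with the same contents; the second keeps the loop-carried value in a register instead of going through memory.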
This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning. As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here. Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.
In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-independent redundant
loads and stores that *require* versioning due to may-aliasing
intervening stores/loads. I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.
The main motivation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop. Being
able to promote the memory dependence into a register dependence (even
though the HW does perform store-to-load forwarding as well) results in a
major gain (~20%). This gain also transfers over to x86: it's
around 8-10%.
Right now the pass is off by default and can be enabled
with -enable-loop-load-elim. On the LNT testsuite, there are two
performance changes (negative number -> improvement):
1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
critical paths is reduced
2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
outside of LNT
The pass is scheduled after the loop vectorizer (which is after loop
distribution). The rationale is to try to reuse LAA state, rather than
recomputing it. The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would prevent the CPU's st->ld forwarding from kicking in.
LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences). LAA is known to omit loop-independent
dependences in certain situations. The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.
Reviewers: dberlin, hfinkel
Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13259
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252017
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Tue, 3 Nov 2015 23:50:03 +0000 (23:50 +0000)]
[LAA] LLE 5/6: Add predicate functions Dependence::isForward/isBackward, NFC
Summary: Will be used by the LoopLoadElimination pass.
Reviewers: hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13258
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252016
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Tue, 3 Nov 2015 23:49:58 +0000 (23:49 +0000)]
[LAA] LLE 4/6: APIs to access the dependent instructions for a dependence, NFC
Summary:
The functions use LAI and MemoryDepChecker classes so they need to be
defined after those definitions outside of the Dependence class.
Will be used by the LoopLoadElimination pass.
Reviewers: hfinkel
Subscribers: rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13257
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252015
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Tue, 3 Nov 2015 23:40:03 +0000 (23:40 +0000)]
CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.
A profile of an LTO link of Chrome revealed that we were spending some
~30-50% of execution time in the function Constant::getRelocationInfo(),
which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn
from TargetMachine::getNameWithPrefix().
It turns out that we only need the result of getKindForGlobal() when
targeting Mach-O, so this change moves the relevant part of the logic to
TargetLoweringObjectFileMachO.
NFCI.
Differential Revision: http://reviews.llvm.org/D14168
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252014
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 3 Nov 2015 22:50:34 +0000 (22:50 +0000)]
AMDGPU: Make flat_scratch name consistent
The printed name and the parsed assembler names weren't the same.
I'm not sure which name SC prints these as, but I think it's this one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252010
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 3 Nov 2015 22:50:32 +0000 (22:50 +0000)]
AMDGPU: Fix asserts on invalid register ranges
If the requested SGPR was not actually aligned, it was
accepted and rounded down instead of rejected.
Also fix an assert if the range is an invalid size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252009
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 3 Nov 2015 22:50:27 +0000 (22:50 +0000)]
AMDGPU: Fix off by one error in register parsing
If trying to use one past the end, this would assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252008
91177308-0d34-0410-b5e6-
96231b3b80d8
Derek Schuff [Tue, 3 Nov 2015 22:40:45 +0000 (22:40 +0000)]
Address nit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252004
91177308-0d34-0410-b5e6-
96231b3b80d8
Derek Schuff [Tue, 3 Nov 2015 22:40:43 +0000 (22:40 +0000)]
Align whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252003
91177308-0d34-0410-b5e6-
96231b3b80d8
Derek Schuff [Tue, 3 Nov 2015 22:40:40 +0000 (22:40 +0000)]
[WebAssembly] Support wasm select operator
Summary:
Add support for wasm's select operator, and lower LLVM's select DAG node
to it.
Reviewers: sunfish
Subscribers: dschuff, llvm-commits, jfb
Differential Revision: http://reviews.llvm.org/D14295
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252002
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 3 Nov 2015 22:39:52 +0000 (22:39 +0000)]
AMDGPU: s[102:103] is unavailable on VI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252000
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 3 Nov 2015 22:39:50 +0000 (22:39 +0000)]
AMDGPU: Define correct number of SGPRs
There are actually 104 so 2 were missing.
More assembler tests with high register number tuples
will be included in later patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251999
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 3 Nov 2015 22:30:15 +0000 (22:30 +0000)]
AMDGPU: Make findUsedSGPR more readable
Add more comments etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251996
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 3 Nov 2015 22:30:13 +0000 (22:30 +0000)]
AMDGPU: Initialize SIFixSGPRCopies so -print-after works
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251995
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 3 Nov 2015 22:30:08 +0000 (22:30 +0000)]
AMDGPU: Alphabetize includes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251994
91177308-0d34-0410-b5e6-
96231b3b80d8
Fiona Glaser [Tue, 3 Nov 2015 22:23:39 +0000 (22:23 +0000)]
InstCombine: fix sinking of convergent calls
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251991
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Tue, 3 Nov 2015 22:21:38 +0000 (22:21 +0000)]
[SelectionDAG] Use existing constant nodes instead of recreating them. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251990
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Tue, 3 Nov 2015 22:20:52 +0000 (22:20 +0000)]
[LLVMSymbolize] Factor out the logic for printing structs from DIContext. NFC.
Introduce DIPrinter which takes care of rendering DILineInfo and
friends. This allows the LLVMSymbolizer class to return structured data
instead of plain std::strings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251989
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Tue, 3 Nov 2015 21:58:35 +0000 (21:58 +0000)]
[X86][AVX] Tweaked shuffle stack folding tests
To avoid alternative lowerings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251986
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Tue, 3 Nov 2015 21:39:52 +0000 (21:39 +0000)]
[LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC
Summary:
We now collect all types of dependences including lexically forward
deps not just "interesting" ones.
Reviewers: hfinkel
Subscribers: rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13256
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251985
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Tue, 3 Nov 2015 21:39:30 +0000 (21:39 +0000)]
[X86][AVX512] Fixed shuffle test name to match shuffle
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251984
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Tue, 3 Nov 2015 21:36:13 +0000 (21:36 +0000)]
[LLVMSymbolize] Move demangling away from printing routines. NFC.
Make printDILineInfo and friends responsible for just rendering the
contents of the structures, demangling should actually be performed
earlier, when we have the information about the originating
SymbolizableModule at hand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251981
91177308-0d34-0410-b5e6-
96231b3b80d8
Davide Italiano [Tue, 3 Nov 2015 20:32:23 +0000 (20:32 +0000)]
[SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)
This one is enabled only under -ffast-math (due to rounding/overflows)
but allows us to emit shorter code.
Before (on FreeBSD x86-64):
  4007f0: 50 push %rax
  4007f1: f2 0f 11 0c 24 movsd %xmm1,(%rsp)
  4007f6: e8 75 fd ff ff callq 400570 <exp2@plt>
  4007fb: f2 0f 10 0c 24 movsd (%rsp),%xmm1
  400800: 58 pop %rax
  400801: e9 7a fd ff ff jmpq 400580 <pow@plt>
  400806: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
  40080d: 00 00 00
After:
  4007b0: f2 0f 59 c1 mulsd %xmm1,%xmm0
  4007b4: e9 87 fd ff ff jmpq 400540 <exp2@plt>
  4007b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
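The rounding and overflow caveat can be seen with a small C++ sketch built on the standard `<cmath>` functions. The names `powOfExp`, `expOfProduct` and `closeEnough` are hypothetical helpers for this sketch; it uses `exp` as in the title, whereas the listing above happens to show the `exp2` variant.

```cpp
#include <cassert>
#include <cmath>

// The original expression: pow(exp(x), y).
double powOfExp(double x, double y) { return std::pow(std::exp(x), y); }

// The rewritten expression: exp(x * y).
double expOfProduct(double x, double y) { return std::exp(x * y); }

// Relative comparison, since the two forms round differently.
bool closeEnough(double a, double b, double relTol) {
  return std::fabs(a - b) <= relTol * std::fmax(std::fabs(a), std::fabs(b));
}
```

For moderate inputs the two forms agree to within rounding, but they are not interchangeable in general: with x = 800 and y = 0.5 the intermediate `exp(800)` overflows to infinity in the original form, while `exp(400)` in the rewritten form is finite. That difference in observable results is why the rewrite is limited to fast-math mode.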
Differential Revision: http://reviews.llvm.org/D14045
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251976
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Tue, 3 Nov 2015 20:27:01 +0000 (20:27 +0000)]
[X86][XOP] Add support for the matching of the VPCMOV bit select instruction
XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) )
This patch adds tablegen pattern matching for this instruction.
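As a scalar illustration of the bit select operation named above, here is a minimal C++ sketch on 64-bit integers. `bitSelect` follows the OR/AND formula from the message directly; `bitSelectXor` is a well-known equivalent form included for comparison. Both names are hypothetical, and the sketch says nothing about the tablegen patterns themselves.

```cpp
#include <cassert>
#include <cstdint>

// Bit select: for each bit, take src1 where the selector src3 is 1 and
// src2 where it is 0. This is OR(AND(src1, src3), AND(src2, ~src3)).
uint64_t bitSelect(uint64_t src1, uint64_t src2, uint64_t src3) {
  return (src1 & src3) | (src2 & ~src3);
}

// Equivalent formulation using XOR, often seen in source code.
uint64_t bitSelectXor(uint64_t src1, uint64_t src2, uint64_t src3) {
  return src2 ^ ((src1 ^ src2) & src3);
}
```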
Differential Revision: http://reviews.llvm.org/D8841
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251975
91177308-0d34-0410-b5e6-
96231b3b80d8
Rui Ueyama [Tue, 3 Nov 2015 20:16:18 +0000 (20:16 +0000)]
llvm-pdbdump: Make BuiltinDumper shorter. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251974
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Tue, 3 Nov 2015 20:13:43 +0000 (20:13 +0000)]
[LAA] LLE 2/6: Fix a NoDep case that should be a Forward dependence
Summary:
When the dependence distance is zero then we have a loop-independent
dependence from the earlier to the later access.
No current client of LAA uses forward dependences so other than
potentially hitting the MaxDependences threshold earlier, this change
shouldn't affect anything right now.
This and the previous patch were tested together for compile-time
regression. None found in LNT/SPEC.
Reviewers: hfinkel
Subscribers: rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13255
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251973
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Tue, 3 Nov 2015 20:13:23 +0000 (20:13 +0000)]
[LAA] LLE 1/6: Expose Forward dependences
Summary:
Before this change, we did not collect forward dependences since
none of the current clients (LV, LDist) required them.
The motivation to also collect forward dependences is a new pass
LoopLoadElimination (LLE) which discovers store-to-load forwarding
opportunities across the loop's backedge. The pass uses both lexically
forward and backward loop-carried dependences to detect these
opportunities.
The new pass also analyzes loop-independent (forward) dependences since
they can conflict with the loop-carried dependences in terms of how the
data flows through memory.
The newly added test only covers loop-carried forward dependences
because loop-independent ones are currently categorized as NoDep. The
next patch will fix this.
The two patches were tested together for compile-time regression. None
found in LNT/SPEC.
Note that with this change LAA provides all dependences rather than just
"interesting" ones. A subsequent NFC patch will remove the now trivial
isInterestingDependence and rename the APIs.
Reviewers: hfinkel
Subscribers: jmolloy, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D13254
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251972
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 20:02:22 +0000 (20:02 +0000)]
Don't create empty sections just to look like gas.
We are long past the time when this much bug-for-bug compatibility was
useful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251970
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 19:38:19 +0000 (19:38 +0000)]
Relax a few more overspecified tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251967
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Tue, 3 Nov 2015 19:36:04 +0000 (19:36 +0000)]
Revert "Move metadata linking after lazy global materialization/linking."
This reverts commit r251926. I believe this is causing an LTO
bootstrapping bot failure
(http://lab.llvm.org:8080/green/job/llvm-stage2-cmake-RgLTO_build/3669/).
Haven't been able to repro it yet, but after looking at the metadata I
am pretty sure I know what is going on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251965
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 19:24:17 +0000 (19:24 +0000)]
Remove unnecessary dependency on section and string positions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251964
91177308-0d34-0410-b5e6-
96231b3b80d8
Kostya Serebryany [Tue, 3 Nov 2015 18:57:25 +0000 (18:57 +0000)]
[libFuzzer] make -test_single_input more reliable: make sure the input's size is equal to its capacity
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251961
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 18:55:58 +0000 (18:55 +0000)]
Delete dead code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251960
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 18:50:51 +0000 (18:50 +0000)]
Simplify local common output.
We now create them as they are found and use higher level APIs.
This is a step in avoiding creating unnecessary sections.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251958
91177308-0d34-0410-b5e6-
96231b3b80d8
Igor Laevsky [Tue, 3 Nov 2015 18:37:40 +0000 (18:37 +0000)]
[CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks
Differential Revision: http://reviews.llvm.org/D14258
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251957
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 18:04:07 +0000 (18:04 +0000)]
Move code out of a loop and use a range loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251952
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 16:40:37 +0000 (16:40 +0000)]
Revert "Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines.""
This reverts commit r251937.
The test was updated to the new API, so bring the API back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251944
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Tue, 3 Nov 2015 16:35:10 +0000 (16:35 +0000)]
[Kaleidoscope][Orc] Fix the fully_lazy Orc Kaleidoscope example.
r251933 changed the Orc compile callbacks API, which broke this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251942
91177308-0d34-0410-b5e6-
96231b3b80d8
Silviu Baranga [Tue, 3 Nov 2015 16:27:04 +0000 (16:27 +0000)]
Fix PR25372 - teach replaceCongruentPHIs to handle cases where SE evaluates a PHI to a SCEVConstant
Summary:
Since now Scalar Evolution can create non-add rec expressions for PHI
nodes, it can also create SCEVConstant expressions. This will confuse
replaceCongruentPHIs, which previously relied on the fact that SCEV
could not produce constants in this case.
We will now replace the node with a constant in these cases - or avoid
processing the Phi in case of a type mismatch.
Reviewers: sanjoy
Subscribers: llvm-commits, majnemer
Differential Revision: http://reviews.llvm.org/D14230
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251938
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 16:25:20 +0000 (16:25 +0000)]
Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines."
This reverts commit r251933.
It broke the build of examples/Kaleidoscope/Orc/fully_lazy/toy.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251937
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Tue, 3 Nov 2015 16:23:21 +0000 (16:23 +0000)]
Kaleidoscope-ch2: Remove the dependence on LLVM by cloning make_unique into this project
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251936
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Tue, 3 Nov 2015 16:10:18 +0000 (16:10 +0000)]
[Orc] Directly emit machine code for the x86 resolver block and trampolines.
Bypassing LLVM for this has a number of benefits:
1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't
work on Windows as the resolver block was in Darwin asm).
2) For cross-process JITs, it allows resolver blocks and trampolines to be
emitted directly in the target process, reducing cross process traffic.
3) It should be marginally faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251933
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Tue, 3 Nov 2015 15:11:27 +0000 (15:11 +0000)]
Move metadata linking after lazy global materialization/linking.
Summary:
Currently, named metadata is linked before the LazilyLinkGlobalValues
list is walked and materialized/linked. As a result, references
from DISubprogram and DIGlobalVariable metadata to yet unmaterialized
functions and variables cause them to be added to the lazy linking
list and their definitions are materialized and linked.
This makes the llvm-link -only-needed option not have the intended
effect when debug information is present, as the otherwise unneeded
functions/variables are still linked in.
Additionally, for ThinLTO I have implemented a mechanism to only link
in debug metadata needed by imported functions. Moving named metadata
linking after lazy GV linking will facilitate applying this mechanism
to the LTO and "llvm-link -only-needed" cases as well.
Reviewers: dexonsmith, tra, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14195
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251926
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Tue, 3 Nov 2015 15:10:50 +0000 (15:10 +0000)]
Pass enum instead of bool to new linkInModule call in llvm-link
A new call I added to linkInModule from llvm-link in r251866
was still passing in a boolean for an argument that was changed to an
enum in r246561. I didn't catch this in my merge since the bool false
matched the flag value it mapped to.
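The kind of silent mismatch described here can be reproduced in a few lines of C++. The enum `FlagsModel` and the function `linkModel` are hypothetical stand-ins invented for this sketch, not the real Linker interface; the sketch assumes a parameter typed as a plain unsigned bitmask, which is what lets a bool argument convert without any diagnostic.

```cpp
#include <cassert>

// Hypothetical flag enum for this sketch only.
enum FlagsModel : unsigned { NoneModel = 0, OverrideModel = 1u << 0 };

// Hypothetical entry point that takes its flags as an unsigned bitmask and
// simply reports which value it received.
unsigned linkModel(unsigned flags) { return flags; }

// A stale call site still passing a bool: it compiles without warning
// because bool converts implicitly to unsigned (false -> 0, true -> 1).
unsigned staleCall(bool oldBoolArgument) { return linkModel(oldBoolArgument); }
```

Passing `false` happens to equal the zero-valued flag, so behavior is unchanged and the stale call goes unnoticed; passing `true` would silently select whichever flag has the value 1.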
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251925
91177308-0d34-0410-b5e6-
96231b3b80d8
Filipe Cabecinhas [Tue, 3 Nov 2015 13:48:26 +0000 (13:48 +0000)]
Don't assert if materializing before seeing any function bodies
This assert was reachable from user input. A minimized test case (no
FUNCTION_BLOCK_ID record) is attached.
Bug found with afl-fuzz
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251910
91177308-0d34-0410-b5e6-
96231b3b80d8
Filipe Cabecinhas [Tue, 3 Nov 2015 13:48:21 +0000 (13:48 +0000)]
Don't use Twine objects after their lifetimes end.
No test, since it would depend on what the compiler can optimize/reuse.
My next commit made this bug visible on Linux Release compiles with some
versions of gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251909
91177308-0d34-0410-b5e6-
96231b3b80d8
Elena Demikhovsky [Tue, 3 Nov 2015 10:29:34 +0000 (10:29 +0000)]
LoopVectorizer - skip 'bitcast' between GEP and load.
Skipping 'bitcast' in this case allows the load to be vectorized:
  %arrayidx = getelementptr inbounds double*, double** %in, i64 %indvars.iv
  %tmp53 = bitcast double** %arrayidx to i64*
  %tmp54 = load i64, i64* %tmp53, align 8
Differential Revision: http://reviews.llvm.org/D14112
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251907
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael Kuperstein [Tue, 3 Nov 2015 08:17:25 +0000 (08:17 +0000)]
[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments
When push instructions are being used to pass function arguments on
the stack, and either EH or debugging is enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.
Darwin does not support this well, so don't use pushes whenever it
would be required.
Differential Revision: http://reviews.llvm.org/D13767
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251904
91177308-0d34-0410-b5e6-
96231b3b80d8
Igor Breger [Tue, 3 Nov 2015 07:30:17 +0000 (07:30 +0000)]
AVX512: add encoding tests for vmovq/d instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251903
91177308-0d34-0410-b5e6-
96231b3b80d8
Tobias Grosser [Tue, 3 Nov 2015 07:14:39 +0000 (07:14 +0000)]
Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader"
Commit 251839 triggers miscompiles on some bots:
http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723
(The commit is listed in 13722, but due to an existing failure introduced in
13721 and reverted in 13723 the failure is only visible in 13723)
To verify r251839 is indeed the only change that triggered the buildbot failures
and to ensure the buildbots remain green while investigating I temporarily
revert this commit. At the current state it is unclear if this commit introduced
some miscompile or if it only exposed code to Polly that is subsequently
miscompiled by Polly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251901
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Tue, 3 Nov 2015 02:19:07 +0000 (02:19 +0000)]
Fix build problem introduced in r251883
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251888
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Tue, 3 Nov 2015 01:53:36 +0000 (01:53 +0000)]
RegisterPressure: Improve assert message
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251885
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Tue, 3 Nov 2015 01:53:33 +0000 (01:53 +0000)]
RegisterPressure: Slightly nicer pressure diff dumping
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251884
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Tue, 3 Nov 2015 01:53:29 +0000 (01:53 +0000)]
ScheduleDAGInstrs: Remove IsPostRA flag; NFC
ScheduleDAGInstrs doesn't behave differently before or after register
allocation. It was only used in a method of MachineSchedulerBase which
behaved differently in MachineScheduler/PostMachineScheduler. Change
this to let MachineScheduler/PostMachineScheduler just pass in a
parameter to that function.
The order of the LiveIntervals* and bool RemoveKillFlags parameters has
been switched to make out-of-tree code fail instead of unintentionally
passing a value intended for the IsPostRA flag to the (previously
following and default initialized) RemoveKillFlags.
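The effect of the parameter swap can be sketched as follows. The three structs are hypothetical stand-ins, not the real ScheduleDAGInstrs constructor, and LiveIntervals is an empty placeholder type:

```cpp
#include <type_traits>

struct LiveIntervals {}; // placeholder for the real analysis class

// Hypothetical "before" shape: a leading bool flag, then defaulted parameters.
struct BeforeSketch {
  explicit BeforeSketch(bool /*IsPostRA*/, bool /*RemoveKillFlags*/ = false,
                        LiveIntervals * /*LIS*/ = nullptr) {}
};

// Naive removal of the flag: an old call that passes one bool still compiles,
// but the value now lands in RemoveKillFlags.
struct NaiveAfterSketch {
  explicit NaiveAfterSketch(bool /*RemoveKillFlags*/ = false,
                            LiveIntervals * /*LIS*/ = nullptr) {}
};

// Swapped order: a bool value cannot convert to LiveIntervals*, so the old
// call is rejected at compile time.
struct SwappedAfterSketch {
  explicit SwappedAfterSketch(LiveIntervals * /*LIS*/ = nullptr,
                              bool /*RemoveKillFlags*/ = false) {}
};

static_assert(std::is_constructible<BeforeSketch, bool>::value,
              "old callers pass a single bool");
static_assert(std::is_constructible<NaiveAfterSketch, bool>::value,
              "naive signature silently accepts the stale bool");
static_assert(!std::is_constructible<SwappedAfterSketch, bool>::value,
              "swapped signature rejects the stale bool");
```

The type-trait checks stand in for an out-of-tree call site: the stale argument is accepted by the naive signature and refused by the swapped one.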
Differential Revision: http://reviews.llvm.org/D14245
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251883
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 01:32:40 +0000 (01:32 +0000)]
Don't implicitly construct a Archive::child_iterator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251878
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 3 Nov 2015 01:20:44 +0000 (01:20 +0000)]
This never returns end(), simplify to use Child instead of iterator. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251876
91177308-0d34-0410-b5e6-
96231b3b80d8
Rui Ueyama [Tue, 3 Nov 2015 01:04:44 +0000 (01:04 +0000)]
llvm-pdbdump: Simplify. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251873
91177308-0d34-0410-b5e6-
96231b3b80d8
Colin LeMahieu [Tue, 3 Nov 2015 00:21:19 +0000 (00:21 +0000)]
[Hexagon] Fixing mistaken case fallthrough.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251867
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Tue, 3 Nov 2015 00:14:15 +0000 (00:14 +0000)]
Restore "Support for ThinLTO function importing and symbol linking."
This restores commit r251837, with the new library dependence added to
llvm-link/Makefile to address bot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251866
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Enderby [Mon, 2 Nov 2015 23:42:05 +0000 (23:42 +0000)]
Allow llvm-nm’s single letter command line flags to be grouped.
This is needed if we want to replace darwin’s nm(1) with llvm-nm
as there are many uses of grouped flags. The added test case is
one specific case that is in real use.
rdar://23337419
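Flag grouping can be sketched as an expansion step that runs before normal option parsing. This is not the llvm-nm implementation; expandGroupedFlags and the KnownLetters set are hypothetical, and the sketch assumes an argument counts as a group only when every character after the dash is a known single-letter flag:

```cpp
#include <string>
#include <vector>

// Expand "-gu" into {"-g", "-u"} when every letter is a known single-letter
// flag; otherwise return the argument unchanged so that long options and
// positional arguments pass through untouched.
std::vector<std::string> expandGroupedFlags(const std::string &Arg,
                                            const std::string &KnownLetters) {
  std::vector<std::string> Out;
  bool Grouped = Arg.size() > 2 && Arg[0] == '-' && Arg[1] != '-';
  for (std::string::size_type I = 1; Grouped && I < Arg.size(); ++I)
    if (KnownLetters.find(Arg[I]) == std::string::npos)
      Grouped = false; // at least one letter is not a single-letter flag
  if (!Grouped) {
    Out.push_back(Arg);
    return Out;
  }
  for (std::string::size_type I = 1; I < Arg.size(); ++I)
    Out.push_back(std::string("-") + Arg[I]);
  return Out;
}
```

Checking against a known-letter set keeps a multi-letter option from being split by accident, which matters for tools that also accept long options with a single dash.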
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251864
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 2 Nov 2015 23:30:48 +0000 (23:30 +0000)]
AMDGPU: Stop assuming vreg for build_vector
This was causing a variety of test failures when v2i64
is added as a legal type.
SIFixSGPRCopies should correctly handle the case of vector inputs
to a scalar reg_sequence, so this isn't necessary anymore. This
was hiding some deficiencies in how reg_sequence is handled later,
but this shouldn't be a problem anymore since the register class
copy of a reg_sequence is now done before the reg_sequence.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251860
91177308-0d34-0410-b5e6-
96231b3b80d8
Derek Schuff [Mon, 2 Nov 2015 23:23:16 +0000 (23:23 +0000)]
[WebAssembly] Make WebAssemblyCodeGen depend on WebAssemblyAsmPrinter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251859
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 2 Nov 2015 23:23:02 +0000 (23:23 +0000)]
AMDGPU: Error on graphics shaders with HSA
I've found myself pointlessly debugging problems from running
graphics tests with an HSA triple a few times, so stop this from
happening again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251858
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 2 Nov 2015 23:22:49 +0000 (23:22 +0000)]
[CGP] widen switch condition and case constants to target's register width (2nd try)
This is a redo of r251849 except the tests have been split into arch-specific folders
to hopefully make the bots happy.
This is a follow-up from the discussion in D12965. The block-at-a-time limitation of
SelectionDAG also came up in D13297.
Without the InstCombine change from D12965, I don't expect this patch to make any
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.
I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473
Before:
BB#0:
mr 4, 3
extsh. 3, 4
ble 0, .LBB0_5
BB#1:
cmpwi 3, 99
bgt 0, .LBB0_9
BB#2:
rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend
li 3, 0
cmplwi 4, 1
beqlr 0
BB#3:
cmplwi 4, 10
bne 0, .LBB0_12
BB#4:
li 3, 1
blr
.LBB0_5:
rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend
cmplwi 3, 65436
beq 0, .LBB0_13
BB#6:
cmplwi 3, 65526
beq 0, .LBB0_15
BB#7:
cmplwi 3, 65535
bne 0, .LBB0_12
BB#8:
li 3, 4
blr
.LBB0_9:
rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend
cmplwi 3, 100
beq 0, .LBB0_14
...
After:
BB#0:
rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons
cmpwi 4, 999
ble 0, .LBB0_5
BB#1:
lis 3, 0
ori 3, 3, 65525
cmpw 4, 3
bgt 0, .LBB0_9
BB#2:
cmplwi 4, 1000
beq 0, .LBB0_14
BB#3:
cmplwi 4, 65436
bne 0, .LBB0_13
BB#4:
li 3, 6
blr
.LBB0_5:
li 3, 0
cmplwi 4, 1
beqlr 0
BB#6:
cmplwi 4, 10
beq 0, .LBB0_12
BB#7:
cmplwi 4, 100
bne 0, .LBB0_13
BB#8:
li 3, 2
blr
.LBB0_9:
cmplwi 4, 65526
beq 0, .LBB0_15
BB#10:
cmplwi 4, 65535
bne 0, .LBB0_13
...
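The large unsigned constants in the listings are consistent with negative i16 case values zero-extended to 32 bits. The original switch is not shown here, so the case values -100, -10 and -1 are an inference, and zext16to32 is a hypothetical helper used only to check the arithmetic:

```cpp
#include <cstdint>

// Zero-extend the 16-bit pattern of a signed value to 32 bits, as a widened
// switch condition and its case constants would be.
constexpr std::uint32_t zext16to32(std::int16_t V) {
  return static_cast<std::uint32_t>(static_cast<std::uint16_t>(V));
}

static_assert(zext16to32(-100) == 65436u, "matches cmplwi ..., 65436");
static_assert(zext16to32(-10) == 65526u, "matches cmplwi ..., 65526");
static_assert(zext16to32(-1) == 65535u, "matches cmplwi ..., 65535");
static_assert(zext16to32(100) == 100u, "non-negative values are unchanged");
```

Because the condition and every case constant are extended the same way once, the comparisons in each block can reuse the single widened value.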
Differential Revision: http://reviews.llvm.org/D13532
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251857
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 2 Nov 2015 23:15:46 +0000 (23:15 +0000)]
AMDGPU: Un-XFAIL a test
This should probably be merged with one of the other private memory
tests, but it fails on r600.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251856
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 2 Nov 2015 23:15:42 +0000 (23:15 +0000)]
AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE
Make the REG_SEQUENCE be a VGPR, and do the register class
copy first.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251855
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Mon, 2 Nov 2015 23:10:52 +0000 (23:10 +0000)]
Fix the build I just broke
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251854
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Mon, 2 Nov 2015 23:09:38 +0000 (23:09 +0000)]
Orc: Drop some else-after-return, reflow a few spots, and avoid use of pointee types
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251853
91177308-0d34-0410-b5e6-
96231b3b80d8
Davide Italiano [Mon, 2 Nov 2015 23:07:14 +0000 (23:07 +0000)]
[SimplifyLibCalls] Remove variables that are not used. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251852
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 2 Nov 2015 23:05:20 +0000 (23:05 +0000)]
revert r251849; need to move tests to arch-specific folders
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251851
91177308-0d34-0410-b5e6-
96231b3b80d8
Cong Hou [Mon, 2 Nov 2015 22:53:48 +0000 (22:53 +0000)]
Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using larger vectorization factor.
To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers.
This is the second attempt to submit this patch. The first attempt got a test failure on ARM. This patch is updated to try to fix the failure (more specifically, by handling the case when VF=1).
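A rough sketch of the selection described above, not the vectorizer's actual cost model. The structs, the linear register estimate, and all numbers are assumptions made for illustration:

```cpp
// Hypothetical loop and target descriptions.
struct LoopSketch { unsigned SmallestBits; unsigned WidestBits; unsigned LiveValues; };
struct TargetSketch { unsigned RegisterBits; unsigned NumRegisters; };

// Registers needed to hold one value of the widest type at a given VF.
constexpr unsigned regsPerValue(unsigned VF, unsigned WidestBits, unsigned RegisterBits) {
  return (VF * WidestBits + RegisterBits - 1) / RegisterBits;
}

// Assumed estimate: every live value costs as much as the widest type.
constexpr unsigned estimatedRegs(unsigned VF, LoopSketch L, TargetSketch T) {
  return L.LiveValues * regsPerValue(VF, L.WidestBits, T.RegisterBits);
}

// Halve the VF until the estimate fits, never going below the VF implied by
// the widest type.
constexpr unsigned shrinkToFit(unsigned VF, unsigned MinVF, LoopSketch L, TargetSketch T) {
  return (VF <= MinVF || estimatedRegs(VF, L, T) <= T.NumRegisters)
             ? VF
             : shrinkToFit(VF / 2, MinVF, L, T);
}

// Start from the VF implied by the smallest type (maximum bandwidth).
constexpr unsigned chooseVF(LoopSketch L, TargetSketch T) {
  return shrinkToFit(T.RegisterBits / L.SmallestBits,
                     T.RegisterBits / L.WidestBits, L, T);
}

// i8 and i32 in the loop, 128-bit registers, 16 registers available.
static_assert(chooseVF(LoopSketch{8, 32, 3}, TargetSketch{128, 16}) == 16, "fits at VF=16");
static_assert(chooseVF(LoopSketch{8, 32, 5}, TargetSketch{128, 16}) == 8, "shrinks to VF=8");
```

With the widest type alone the same loop would get VF = 128 / 32 = 4; starting from the smallest type allows up to 16 when the estimated register usage permits.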
Differential revision: http://reviews.llvm.org/D8943
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251850
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 2 Nov 2015 22:46:24 +0000 (22:46 +0000)]
[CGP] widen switch condition and case constants to target's register width
This is a follow-up from the discussion in D12965. The block-at-a-time limitation of
SelectionDAG also came up in D13297.
Without the InstCombine change from D12965, I don't expect this patch to make any
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.
I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473
Before:
BB#0:
mr 4, 3
extsh. 3, 4
ble 0, .LBB0_5
BB#1:
cmpwi 3, 99
bgt 0, .LBB0_9
BB#2:
rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend
li 3, 0
cmplwi 4, 1
beqlr 0
BB#3:
cmplwi 4, 10
bne 0, .LBB0_12
BB#4:
li 3, 1
blr
.LBB0_5:
rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend
cmplwi 3, 65436
beq 0, .LBB0_13
BB#6:
cmplwi 3, 65526
beq 0, .LBB0_15
BB#7:
cmplwi 3, 65535
bne 0, .LBB0_12
BB#8:
li 3, 4
blr
.LBB0_9:
rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend
cmplwi 3, 100
beq 0, .LBB0_14
...
After:
BB#0:
rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons
cmpwi 4, 999
ble 0, .LBB0_5
BB#1:
lis 3, 0
ori 3, 3, 65525
cmpw 4, 3
bgt 0, .LBB0_9
BB#2:
cmplwi 4, 1000
beq 0, .LBB0_14
BB#3:
cmplwi 4, 65436
bne 0, .LBB0_13
BB#4:
li 3, 6
blr
.LBB0_5:
li 3, 0
cmplwi 4, 1
beqlr 0
BB#6:
cmplwi 4, 10
beq 0, .LBB0_12
BB#7:
cmplwi 4, 100
bne 0, .LBB0_13
BB#8:
li 3, 2
blr
.LBB0_9:
cmplwi 4, 65526
beq 0, .LBB0_15
BB#10:
cmplwi 4, 65535
bne 0, .LBB0_13
...
Differential Revision: http://reviews.llvm.org/D13532
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251849
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Schmidt [Mon, 2 Nov 2015 22:43:57 +0000 (22:43 +0000)]
[PPC64LE] Properly initialize instr-info in PPCVSXSwapRemoval pass
Replace some hacky code with the proper way to get at this data.
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251848
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 2 Nov 2015 22:34:55 +0000 (22:34 +0000)]
don't repeat function names in comments; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251846
91177308-0d34-0410-b5e6-
96231b3b80d8
Davide Italiano [Mon, 2 Nov 2015 22:33:26 +0000 (22:33 +0000)]
[SimplifyLibCalls] Merge two if statements. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251845
91177308-0d34-0410-b5e6-
96231b3b80d8
Teresa Johnson [Mon, 2 Nov 2015 22:17:32 +0000 (22:17 +0000)]
Revert "Support for ThinLTO function importing and symbol linking."
This reverts commit r251837, due to a number of bot failures of the form:
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to
'llvm::object::FunctionIndexObjectFile::create(llvm::MemoryBufferRef,
llvm::LLVMContext&, llvm::Module const*, bool)'
/home/grosser/buildslave/perf-x86_64-penryn-O3-polly-fast/llvm.obj/tools/llvm-link/Release+Asserts/llvm-link.o:llvm-link.cpp:function
loadIndex(llvm::LLVMContext&, llvm::Module const*): error: undefined
reference to 'llvm::object::FunctionIndexObjectFile::takeIndex()'
I'm not sure why these are happening - I added Object to the required
libraries in tools/llvm-link/LLVMBuild.txt and the LLVM_LINK_COMPONENTS
in tools/llvm-link/CMakeLists.txt. Confirmed for my build that these
symbols come out of libLLVMObject.a. What am I missing?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251841
91177308-0d34-0410-b5e6-
96231b3b80d8
Chen Li [Mon, 2 Nov 2015 22:00:15 +0000 (22:00 +0000)]
[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader
Summary:
This patch adds support to check if a loop has loop invariant conditions which lead to loop exits. If so, we know that if the exit path is taken, it is at the first loop iteration. If there is an induction variable used in that exit path whose value has not been updated, it will keep the initial value passed in from the loop preheader. We can therefore rewrite the exit value with its initial value. This will help remove phis created by LCSSA and enable other optimizations like loop unswitch.
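A source-level illustration of the rewrite, written as two hypothetical C++ functions rather than the IR the pass operates on. The -1 return is an arbitrary marker for "the loop ran to completion":

```cpp
// Original shape: the early exit uses the induction variable I.
int exitValueOriginal(int Start, int N, bool Invariant) {
  for (int I = Start; I < N; ++I) {
    if (Invariant) // loop-invariant: the same answer on every iteration
      return I;    // if this exit is taken at all, it is on the first iteration
    // ... loop body that does not change Invariant ...
  }
  return -1;
}

// Rewritten shape: I has not been updated when the exit is taken, so its
// value there is the initial value coming from the preheader.
int exitValueRewritten(int Start, int N, bool Invariant) {
  for (int I = Start; I < N; ++I) {
    if (Invariant)
      return Start; // exit value replaced by the initial value
    // ... loop body that does not change Invariant ...
  }
  return -1;
}
```

Once the exit no longer reads I, the exit block does not need a phi for it, which is the LCSSA cleanup the summary refers to.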
Reviewers: sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13974
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251839
91177308-0d34-0410-b5e6-
96231b3b80d8