Benjamin Kramer [Fri, 15 Aug 2014 11:05:45 +0000 (11:05 +0000)]
PPC: Clean up pointer casting, no functionality change.
Silences GCC's -Wcast-qual.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215703 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 15 Aug 2014 11:01:40 +0000 (11:01 +0000)]
[x86] Add the initial skeleton of type-based dispatch for AVX vectors in
the new shuffle lowering and an implementation for v4 shuffles.
This allows us to handle non-half-crossing shuffles directly for v4
shuffles, both integer and floating point. This currently misses places
where we could perform the blend via UNPCK instructions, but otherwise
generates code for the included test cases that is as good as or better than
that of the existing vector shuffle lowering. There are a few cases that are
entertainingly better. ;]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215702 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 15 Aug 2014 11:01:37 +0000 (11:01 +0000)]
[x86] Teach the instruction printer to decode immediate operands to
BLENDPS, BLENDPD, and PBLENDW instructions into pretty shuffle comments.
These will be used in my next commit as part of test cases for AVX
shuffles which can directly use blend in more places.
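As a rough illustration of the decoding described above, the sketch below turns a blend immediate into shuffle indices and a comment string. It relies on the documented blend semantics (bit i of the immediate set means lane i comes from the second source). The function names and the comment format are assumptions for illustration, not the instruction printer's actual code.

```python
def decode_blend_mask(num_elts, imm):
    """Return one shuffle index per lane.

    Index i selects lane i of the first source; index i + num_elts
    selects lane i of the second source (bit i of imm is set).
    """
    return [i + num_elts if (imm >> i) & 1 else i for i in range(num_elts)]


def shuffle_comment(dst, src1, src2, num_elts, imm):
    """Render the decoded mask as a human-readable comment (hypothetical format)."""
    parts = []
    for idx in decode_blend_mask(num_elts, imm):
        src = src1 if idx < num_elts else src2
        parts.append(f"{src}[{idx % num_elts}]")
    return f"{dst} = " + ",".join(parts)
```

For a four-lane blend, `shuffle_comment("xmm0", "xmm0", "xmm1", 4, 0b0101)` describes lanes 0 and 2 as coming from `xmm1` and lanes 1 and 3 from `xmm0`.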
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215701 91177308-0d34-0410-b5e6-96231b3b80d8
Tim Northover [Fri, 15 Aug 2014 10:47:12 +0000 (10:47 +0000)]
ARM: implement MRS/MSR (banked reg) system instructions.
These are system-only instructions for CPUs with virtualization
extensions, allowing a hypervisor easy access to all of the various
different AArch32 registers.
rdar://problem/17861345
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215700 91177308-0d34-0410-b5e6-96231b3b80d8
Erik Verbruggen [Fri, 15 Aug 2014 10:33:03 +0000 (10:33 +0000)]
Remove testcase from README which we didn't get. We do get it now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215699 91177308-0d34-0410-b5e6-96231b3b80d8
Vladimir Medic [Fri, 15 Aug 2014 09:29:30 +0000 (09:29 +0000)]
The current implementation of the c.cond.fmt instructions only accepts the default fcc0 register. This patch enables the instructions to accept other fcc registers. Aliases that use the default fcc0 register are also defined.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215698 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 15 Aug 2014 07:41:57 +0000 (07:41 +0000)]
[x86] Remove the duplicated code for testing whether we can widen the
elements of a shuffle mask and simplify how it works. No functionality
changed now that the bug that was here has been fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215696 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Fiselier [Fri, 15 Aug 2014 05:54:19 +0000 (05:54 +0000)]
[LIT] Correct name of global lit configuration object to be lit_config (not lit).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215695 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 15 Aug 2014 03:54:49 +0000 (03:54 +0000)]
[x86] Fix the very broken formation of vpunpck instructions in the
target-specific shuffle DAG combines.
We were recognizing the paired shuffles backwards. This code needs to be
replaced anyway, as we have the same functionality elsewhere, but I'll
do the refactoring in a follow-up; this is the minimal fix to the
behavior.
In addition to fixing miscompiles with the new vector shuffle lowering,
it also causes the canonicalization to kick in much better, selecting
the smaller encoding variants in lots of places in the new AVX path.
This still isn't quite ideal as we don't need both the shufpd and the
punpck instructions, but that'll get fixed in a follow-up patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215690 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Fri, 15 Aug 2014 03:07:13 +0000 (03:07 +0000)]
Don't print comments to an object streamer :-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215689 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Fri, 15 Aug 2014 02:51:31 +0000 (02:51 +0000)]
EmitAbsValue is the same as EmitValue on non-darwin. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215688 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 15 Aug 2014 02:43:18 +0000 (02:43 +0000)]
[x86] Fix PR20540 where the x86 shuffle DAG combiner had completely
broken logic for merging shuffle masks in the face of SM_SentinelZero
mask operands.
While these are '-1' they don't mean 'undef' the way '-1' means in the
pre-legalized shuffle masks. Instead, they mean that the shuffle
operation is forcibly zeroing that lane. Reflect this and explicitly
handle it in a bunch of places. In one place the effect is equivalent
but much more clear. In the rest it was really weirdly broken.
Also, rewrite the entire merging thing to be a more direct operation
with a single loop and just doing math to map the indices through the
various masks.
Also add a bunch of asserts to try to make it extremely clear what the
different masks can possibly look like.
Finally, add some comments to clarify that we're merging shuffle masks
*up* here rather than *down* as we do everywhere else, and thus the
logic is quite confusing.
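To illustrate the distinction drawn above between "undef" and "forcibly zero", here is a minimal sketch of composing two shuffle masks in a single loop while carrying a zeroing sentinel through. It assumes single-input shuffles of equal width, and the names `SM_SENTINEL_ZERO`, `merge_masks`, and `apply_mask` are hypothetical stand-ins, not the combiner's actual code.

```python
SM_SENTINEL_ZERO = -1  # hypothetical stand-in: "this lane is forced to zero"


def apply_mask(mask, vec):
    """Shuffle vec by mask; a sentinel lane produces a literal zero."""
    return [0 if idx == SM_SENTINEL_ZERO else vec[idx] for idx in mask]


def merge_masks(outer, inner):
    """Compose two masks so that one shuffle replaces outer(inner(v)).

    A zeroed lane in either mask stays zeroed: it is not treated as undef.
    """
    merged = []
    for idx in outer:
        if idx == SM_SENTINEL_ZERO:
            merged.append(SM_SENTINEL_ZERO)
        else:
            assert 0 <= idx < len(inner), "outer index must name an inner lane"
            merged.append(inner[idx])  # may itself be the zero sentinel
    return merged
```

The point of the sketch is that applying `merge_masks(outer, inner)` once gives the same lanes as applying `inner` and then `outer`, including the lanes that either mask zeroes.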
Thanks to several different people for sending test cases, and to
Robert Khasanov for an initial attempt at fixing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215687 91177308-0d34-0410-b5e6-96231b3b80d8
Bill Schmidt [Fri, 15 Aug 2014 01:25:26 +0000 (01:25 +0000)]
[PPC64] Add missing dependency on X2 to LDinto_toc.
The LDinto_toc pattern has been part of 64-bit PowerPC for a long
time, and represents loading from a memory location into the TOC
register (X2). However, this pattern doesn't explicitly record that
it modifies that register. This patch adds the missing dependency.
It was very surprising to me that this has never shown up as a problem
in the past, and that we only saw this problem recently in a single
scenario when building a self-hosted clang. It turns out that in most
cases we have another dependency present that keeps the LDinto_toc
instruction tied in place. LDinto_toc is used for TOC restore
following a call site, so this is a typical sequence:
BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
LDinto_toc 24, %X1
ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
Because the LDinto_toc is inserted prior to the ADJCALLSTACKUP, there
is a natural anti-dependency between the two that keeps it in place.
Therefore we don't usually see a problem. However, in one particular
case, one call is followed immediately by another call, and the second
call requires a parameter that is a TOC-relative address. This is the
code sequence:
BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
LDinto_toc 24, %X1
ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
ADJCALLSTACKDOWN 96, %R1<imp-def>, %R1<imp-use>
%vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
%vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
Note that the back-to-back stack adjustments are the same size! The
back end is smart enough to recognize this and optimize them away:
BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
LDinto_toc 24, %X1
%vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
%vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
Now there is nothing to prevent the ADDIStocHA instruction from moving
ahead of the LDinto_toc instruction, and because of the longest-path
heuristic, this is what happens.
With the accompanying patch, %X2 is represented as an implicit def:
BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
LDinto_toc 24, %X1, %X2<imp-def,dead>
ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use>
%vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
%vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39
So now when the two stack adjustments are removed, ADDIStocHA is
prevented from being moved above LDinto_toc.
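The ordering argument above can be sketched with a toy register-dependency check. This is only an illustration under simplifying assumptions: memory dependencies and register aliasing (R1 versus X1) are ignored, and the `Instr` record and `must_stay_ordered` helper are hypothetical, not part of the scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    defs: frozenset = frozenset()  # registers written
    uses: frozenset = frozenset()  # registers read


def must_stay_ordered(first, second):
    """True if a register dependency pins `second` after `first`."""
    raw = first.defs & second.uses  # read after write
    war = first.uses & second.defs  # write after read (anti-dependency)
    waw = first.defs & second.defs  # write after write
    return bool(raw or war or waw)


# Before the patch: LDinto_toc does not record that it writes X2.
ld_without_def = Instr("LDinto_toc", uses=frozenset({"X1"}))
# After the patch: X2 is an implicit def.
ld_with_def = Instr("LDinto_toc", defs=frozenset({"X2"}), uses=frozenset({"X1"}))
addis = Instr("ADDIStocHA", defs=frozenset({"vreg39"}), uses=frozenset({"X2"}))
```

With these records, `must_stay_ordered(ld_without_def, addis)` is false, so nothing stops the reordering, while `must_stay_ordered(ld_with_def, addis)` is true because of the read-after-write on X2.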
I have not yet created a test case for this, because the original
failure occurs on a relatively large function that needs reduction.
However, this is a fairly serious bug, despite its infrequency, and I
wanted to get this patch onto the list as soon as possible so that it
can be considered for a 3.5 backport. I'll work on whittling down a
test case.
Have we missed the boat for 3.5 at this point?
Thanks,
Bill
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215685 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 14 Aug 2014 23:29:49 +0000 (23:29 +0000)]
[FastISel][ARM] Fall back to constant pool loads when materializing an i32 constant.
FastEmit_i won't always succeed in materializing an i32 constant and may simply fail.
This would trigger a fall-back to SelectionDAG, which is really not necessary.
This fix first falls back to a constant pool load to materialize the constant
before giving up for good.
This fixes <rdar://problem/18022633>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215682 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Thu, 14 Aug 2014 21:09:37 +0000 (21:09 +0000)]
Copy noalias metadata from call sites to inlined instructions
When a call site with noalias metadata is inlined, that metadata can be
propagated directly to the inlined instructions (only those that might access
memory because it is not useful on the others). Prior to inlining, the noalias
metadata could express that a call would not alias with some other memory
access, which implies that no instruction within that called function would
alias. By propagating the metadata to the inlined instructions, we preserve
that knowledge.
This should complete the enhancements requested in PR20500.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215676 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 14 Aug 2014 19:56:28 +0000 (19:56 +0000)]
Revert several FastISel commits to track down a buildbot error.
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215673 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Thu, 14 Aug 2014 17:18:26 +0000 (17:18 +0000)]
Fix whitespace error from r215279, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215667 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Thu, 14 Aug 2014 17:13:33 +0000 (17:13 +0000)]
[AVX512] Add test for FMA masking intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215665 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Thu, 14 Aug 2014 17:13:30 +0000 (17:13 +0000)]
[AVX512] Switch FMA intrinsics to the masking version
This does the renaming and updates the lowering logic.
Part of <rdar://problem/17688758>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215664 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Thu, 14 Aug 2014 17:13:27 +0000 (17:13 +0000)]
[X86] Break out logic to map FMA Intrinsic number to Opcode
No functional change. Will be used to lower AVX512 masking FMA intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215663 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Thu, 14 Aug 2014 17:13:26 +0000 (17:13 +0000)]
[AVX512] Add enum for the static rounding types
No functional change. This will be used by the new FMA intrinsic lowering
code.
We can probably add NO_EXC here as well, I am just not too familiar with this
part of AVX512 yet. We can add that later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215662 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Thu, 14 Aug 2014 17:13:24 +0000 (17:13 +0000)]
[AVX512] Break out the logic to lower masking intrinsics
No functional change. This will be used by the FMA intrinsic lowering as well
and hopefully many more.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215661 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Thu, 14 Aug 2014 17:13:19 +0000 (17:13 +0000)]
[AVX512] Add masking variant for the FMA instructions
This change further evolves the base class AVX512_masking in order to make it
suitable for the masking variants of the FMA instructions.
Besides AVX512_masking there is now a new base class that instructions
including FMAs can use: AVX512_masking_3src. With three-source (destructive)
instructions one of the sources is already tied to the destination. This
difference from AVX512_masking is captured by this new class. The common bits
between _masking and _masking_3src are broken out into a new super class
called AVX512_masking_common.
As with valign, there is some corresponding restructuring of the underlying
format classes. The idea is the same: we essentially want to derive from two
classes, one providing the format bits and another, format-independent
multiclass supplying the various masking and non-masking instruction variants.
Existing fma tests in avx512-fma*.ll provide coverage here for the non-masking
variants. For masking, the next patches in the series will add intrinsics and
intrinsic tests.
For AVX512_masking_3src to work, the (ins ...) dag has to be passed *without*
the leading source operand that is tied to dst ($src1). This is necessary to
properly construct the (ins ...) for the different variants. For the record,
I did check that if $src1 is mistakenly included, you do get a fairly intuitive
error message from the tablegen backend.
Part of <rdar://problem/17688758>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215660 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 14 Aug 2014 17:10:54 +0000 (17:10 +0000)]
Revert "[FastISel][AArch64] Add support for more addressing modes."
This reverts commit r215597, because it might have broken the build bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215659 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Thu, 14 Aug 2014 16:44:03 +0000 (16:44 +0000)]
Add noalias metadata for general calls (not just memory intrinsics) during inlining
When preserving noalias function parameter attributes by adding noalias
metadata in the inliner, we should do this for general function calls (not just
memory intrinsics). The logic is very similar to what already existed (except
that we want to add this metadata even for functions taking no relevant
parameters). This metadata can be used by ModRef queries in the caller after
inlining.
This addresses the first part of PR20500. Adding noalias metadata during
inlining is still turned off by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215657 91177308-0d34-0410-b5e6-96231b3b80d8
Moritz Roth [Thu, 14 Aug 2014 16:20:50 +0000 (16:20 +0000)]
Testing commit access.
Remove a trailing whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215653 91177308-0d34-0410-b5e6-96231b3b80d8
Chad Rosier [Thu, 14 Aug 2014 15:23:01 +0000 (15:23 +0000)]
[Reassociation] Add support for reassociation with unsafe algebra.
Vector instructions are (still) not supported for either integer or floating
point. Hopefully, that work will be landed shortly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215647 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 14 Aug 2014 15:15:28 +0000 (15:15 +0000)]
optimize vector fneg of bitcasted integer value
This patch allows a vector fneg of a bitcasted integer value to be optimized in the same way that we already optimize a scalar fneg. If the integer variable is a constant, we can precompute the result and not require any logic ops.
This patch is very similar to a fabs patch committed at r214892.
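The underlying identity is that negating an IEEE-754 float is the same as flipping its sign bit, so an fneg of a bitcasted integer can be expressed as an integer XOR, lane by lane. The sketch below demonstrates this for 32-bit lanes; the helper names are illustrative and this is not the DAG combine itself.

```python
import struct

SIGN_BIT = 0x80000000  # sign bit of a 32-bit IEEE-754 float


def float_to_bits(value):
    """Reinterpret a float as its 32-bit integer pattern (a 'bitcast')."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits):
    """Reinterpret a 32-bit integer pattern as a float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fneg_lanes(int_lanes):
    """Negate each float lane by XOR-ing the sign bit of its integer pattern."""
    return [lane ^ SIGN_BIT for lane in int_lanes]
```

When the integer lanes are constants, the XOR can be folded ahead of time, which is the "precompute the result" case mentioned in the message.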
Differential Revision: http://reviews.llvm.org/D4852
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215646 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 14 Aug 2014 15:15:09 +0000 (15:15 +0000)]
Delete support for AuroraUX.
auroraux.org is not resolving.
I will add this to the release notes as soon as I figure out where to put the
3.6 release notes :-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215645 91177308-0d34-0410-b5e6-96231b3b80d8
Aaron Ballman [Thu, 14 Aug 2014 13:53:19 +0000 (13:53 +0000)]
Silencing some -Wcast-qual warnings and removing some C-style casts at the same time. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215643 91177308-0d34-0410-b5e6-96231b3b80d8
Aaron Ballman [Thu, 14 Aug 2014 13:43:57 +0000 (13:43 +0000)]
Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215642 91177308-0d34-0410-b5e6-96231b3b80d8
Toma Tabacu [Thu, 14 Aug 2014 13:10:48 +0000 (13:10 +0000)]
[mips] Improve robustness of some tests.
Summary:
This is done by removing some hardcoded registers like $at or expecting a single digit register to be selected.
Contains work done by Matheus Almeida.
Reviewers: matheusalmeida, dsanders
Reviewed By: dsanders
Subscribers: tomatabacu
Differential Revision: http://reviews.llvm.org/D4227
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215640 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Thu, 14 Aug 2014 12:13:59 +0000 (12:13 +0000)]
[x86] Begin stubbing out the AVX support in the new vector shuffle
lowering scheme.
Currently, this just directly bails to the fallback path of splitting
the 256-bit vector into two 128-bit vectors, operating there, and then
joining the results back together. While the results are far from
perfect, they are *shockingly* good for what we're doing here. I'll be
layering the rest of the functionality on top of this piece by piece and
updating tests as I go.
Note that 256-bit vectors in this mode are still somewhat WIP. While
I think the code paths that I'm adding here are clean and good-to-go,
there are still a lot of 128-bit assumptions that I'll need to stomp out
as I march through the functional spread here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215637 91177308-0d34-0410-b5e6-96231b3b80d8
Zoran Jovanovic [Thu, 14 Aug 2014 12:09:10 +0000 (12:09 +0000)]
[mips][microMIPS] MicroMIPS Compact Branch Instructions BEQZC and BNEZC
Differential Revision: http://reviews.llvm.org/D3545
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215636 91177308-0d34-0410-b5e6-96231b3b80d8
Dan Liew [Thu, 14 Aug 2014 11:57:16 +0000 (11:57 +0000)]
Make message about building sphinx documentation with CMake more
informative by stating where the output is going.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215635 91177308-0d34-0410-b5e6-96231b3b80d8
Dan Liew [Thu, 14 Aug 2014 11:57:13 +0000 (11:57 +0000)]
Add SPHINX_WARNINGS_AS_ERRORS CMake option to allow warnings to not be
treated as errors (treating them as errors is still the default). This is
useful when working on documentation that has existing errors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215634 91177308-0d34-0410-b5e6-96231b3b80d8
Toma Tabacu [Thu, 14 Aug 2014 10:29:17 +0000 (10:29 +0000)]
[mips] Add assembler support for the "la $reg,symbol" pseudo-instruction.
Summary:
This pseudo-instruction allows the programmer to load an address from a symbolic expression into a register.
Patch by David Chisnall.
His work was sponsored by: DARPA, AFRL
I've made some minor changes to the original, such as improving the formatting and adding some comments, and I've also added a test case.
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D4808
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215630 91177308-0d34-0410-b5e6-96231b3b80d8
Daniel Sanders [Thu, 14 Aug 2014 09:18:14 +0000 (09:18 +0000)]
[mips] Rename [gs]etCanHaveModuleDir to more natural names
Summary:
getCanHaveModuleDir() is renamed to isModuleDirectiveAllowed(), and
setCanHaveModuleDir() is renamed to forbidModuleDirective() since it is only
ever given a false argument.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4885
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215628 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Thu, 14 Aug 2014 08:18:34 +0000 (08:18 +0000)]
[SDAG] Fix a bug in the DAG combiner where we would fail to return the
input node after manually adding it to the worklist and using CombineTo.
Once we use CombineTo the input node may have been deleted. Despite this
being *completely confusing* and somewhat broken, the only way to
"correctly" return from a DAG combine after potentially deleting the
input node is to return *that exact node*....
But really, this code should just never have used CombineTo. It won't do
what it wants (returning the node as mentioned above just causes the
combine to infloop). The correct way to combine away a casted load to
a load of the correct type is to RAUW the chain directly and then return
the loaded value to replace the actual value node.
I managed to find this with the vector shuffle fuzzer even though it
clearly has nothing at all to do with vector shuffles and rather those
happen to trigger a load of a constant pool that hits this combine *just
right*. I've included the test as it is small and a nice stress test
that the infrastructure isn't asserting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215622 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Thu, 14 Aug 2014 06:46:25 +0000 (06:46 +0000)]
InstCombine: ((A | ~B) ^ (~A | B)) to A ^ B
Proof using CVC3 follows:
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR((A | ~B),(~A |B)) = BVXOR(A,B);
$ cvc3 t.cvc
Valid.
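For readers without CVC3 at hand, the same identity can be checked by brute force over a small bit width. Because the operators are bitwise, every bit position behaves independently, so a narrow exhaustive check is a reasonable sanity test. It complements the solver proof above and does not replace it; the function names are illustrative.

```python
def xor_of_ors(a, b, mask):
    """Left-hand side: (A | ~B) ^ (~A | B), truncated to the given width."""
    return ((a | (~b & mask)) ^ ((~a & mask) | b)) & mask


def identity_holds(width):
    """Exhaustively compare both sides for all pairs of `width`-bit values."""
    mask = (1 << width) - 1
    values = range(mask + 1)
    return all(xor_of_ors(a, b, mask) == (a ^ b) for a in values for b in values)
```

Calling `identity_holds(4)` compares all 256 pairs of 4-bit values.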
Patch by Mayur Pandey!
Differential Revision: http://reviews.llvm.org/D4883
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215621 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Thu, 14 Aug 2014 06:44:51 +0000 (06:44 +0000)]
AArch64: Silence warning in AArch64FastISel
GCC was emitting a signed vs unsigned comparison warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215620 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Thu, 14 Aug 2014 06:41:38 +0000 (06:41 +0000)]
Added InstCombine Transform for ((B | C) & A) | B -> B | (A & C)
Transform ((B | C) & A) | B --> B | (A & C)
Link: http://rise4fun.com/Z3/hP6p
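As a quick sanity check alongside the linked Z3 query, the transform can also be verified exhaustively over small values. The sketch below enumerates every triple of 3-bit values; `before` and `after` are illustrative names for the two sides of the rewrite.

```python
from itertools import product


def before(a, b, c):
    """Original pattern: ((B | C) & A) | B."""
    return ((b | c) & a) | b


def after(a, b, c):
    """Rewritten pattern: B | (A & C)."""
    return b | (a & c)


def transform_is_sound(width=3):
    """Compare both patterns for every triple of `width`-bit values."""
    values = range(1 << width)
    return all(before(a, b, c) == after(a, b, c)
               for a, b, c in product(values, repeat=3))
```

Intuitively, wherever a bit of B is set both sides are 1, and wherever it is clear both sides reduce to A & C.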
Patch by Sonam Kumari!
Differential Revision: http://reviews.llvm.org/D4865
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215619 91177308-0d34-0410-b5e6-96231b3b80d8
Saleem Abdulrasool [Thu, 14 Aug 2014 02:51:43 +0000 (02:51 +0000)]
MC: AsmLexer: handle multi-character CommentStrings correctly
X86MCAsmInfoDarwin uses '##' as CommentString even though a single '#' starts a
comment, so a workaround for this special case is added.
Fixes divisions in constant expressions for the AArch64 assembler and other
targets which use '//' as CommentString.
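The division problem can be illustrated with a toy comment stripper. The sketch contrasts matching only the first character of the comment string with matching the whole string. It is a simplified model (no string literals, and no Darwin '##' special case), and both function names are hypothetical rather than AsmLexer's API.

```python
def strip_comment_first_char(line, comment_string):
    """Buggy model: treats the first character of CommentString as the marker."""
    pos = line.find(comment_string[0])
    return line if pos < 0 else line[:pos].rstrip()


def strip_comment_full(line, comment_string):
    """Fixed model: a comment starts only where the whole CommentString matches."""
    pos = line.find(comment_string)
    return line if pos < 0 else line[:pos].rstrip()
```

On the line `mov x0, #(8 / 2) // half` with `//` as the comment string, the first-character model cuts the line at the division, while the full-string model keeps the expression and drops only the trailing comment.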
Patch by Janne Grunau!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215615 91177308-0d34-0410-b5e6-96231b3b80d8
Lang Hames [Thu, 14 Aug 2014 02:38:20 +0000 (02:38 +0000)]
[MCJIT] Support DisableSymbolSearching and InstallLazyFunctionCreator in MCJIT.
Patch by Anthony Pesch. Thanks Anthony!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215613 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Thu, 14 Aug 2014 01:07:37 +0000 (01:07 +0000)]
[SDAG] Fix a case where we would iteratively legalize a node during
combining by replacing it with something else but not re-process the
node afterward to remove it.
In a truly remarkable stroke of bad luck, this would (in the test case
attached) end up getting some other node combined into it without ever
getting re-processed. By adding it back on to the worklist, in addition
to deleting the dead nodes more quickly we also ensure that if it
*stops* being dead for any reason it makes it back through the
legalizer. Without this, the test case will end up failing during
instruction selection due to an and node with a type we don't have an
instruction pattern for.
It took many million runs of the shuffle fuzz tester to find this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215611 91177308-0d34-0410-b5e6-96231b3b80d8
Michael J. Spencer [Thu, 14 Aug 2014 00:51:47 +0000 (00:51 +0000)]
Remove llvm_headers_do_not_build for the benefit of Xcode and Visual Studio users.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215610 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Wed, 13 Aug 2014 23:49:24 +0000 (23:49 +0000)]
[X86] Fix the value of the low mask for the lowering of MUL_LOHI for v4i32.
Found by code inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215604 91177308-0d34-0410-b5e6-96231b3b80d8
Akira Hatanaka [Wed, 13 Aug 2014 23:23:58 +0000 (23:23 +0000)]
[AArch64, fast-isel] Fall back to SelectionDAG to select tail calls.
Certain functions such as objc_autoreleaseReturnValue have to be called as
tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls,
we have to fall back to SelectionDAG to select calls that are marked as tail.
<rdar://problem/17991614>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215600 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 22:53:29 +0000 (22:53 +0000)]
[FastISel][AArch64] Add support for more addressing modes.
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.
For example:
  lsl x1, x1, #3    --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3    --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215597 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 22:25:35 +0000 (22:25 +0000)]
[FastISel][X86] Add large code model support for materializing floating-point constants.
In the large code model for X86 floating-point constants are placed in the
constant pool and materialized by loading from it. Since the constant pool
could be far away, a PC relative load might not work. Therefore we first
materialize the address of the constant pool with a movabsq and then load
from there the floating-point value.
Fixes <rdar://problem/17674628>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215595 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 22:22:17 +0000 (22:22 +0000)]
[FastISel][X86] Use XOR to materialize the "0" value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215594 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 22:18:11 +0000 (22:18 +0000)]
[FastISel][X86] Emit more efficient instructions for integer constant materialization.
This mostly affects the i64 value type, which always resulted in a 10-byte
movabsq instruction to materialize any constant. The custom code checks the
value of the immediate and tries to use a different and smaller mov
instruction when possible.
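The commit message does not spell out the exact checks, so the following is only one plausible sketch of how the immediate's value could select a shorter move for a 64-bit constant. The range tests and the returned mnemonics are assumptions for illustration, not the code from the patch.

```python
def fits_unsigned(value, bits):
    """True if value is representable as an unsigned `bits`-bit integer."""
    return 0 <= value < (1 << bits)


def fits_signed(value, bits):
    """True if value is representable as a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def pick_mov_for_i64(imm):
    """Choose a move form for a signed 64-bit constant (illustrative only)."""
    if fits_unsigned(imm, 32):
        return "movl"     # 32-bit move; the upper 32 bits are implicitly zeroed
    if fits_signed(imm, 32):
        return "movq"     # 64-bit move with a sign-extended 32-bit immediate
    return "movabsq"      # full 64-bit immediate
```

Under these assumptions, small positive constants use the 32-bit form, small negative constants use the sign-extended form, and only values that need all 64 bits fall through to `movabsq`.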
This fixes <rdar://problem/17420988>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215593 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 22:13:14 +0000 (22:13 +0000)]
[FastISel][AArch64] Make use of the zero register when possible.
This change now materializes the value "0" from the zero register.
The zero register can be folded by several instructions, so no
materialization is needed at all.
Fixes <rdar://problem/17924413>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215591 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 22:08:02 +0000 (22:08 +0000)]
[FastISel] Let the target decide first if it wants to materialize a constant.
This changes the order in which FastISel tries to materialize a constant.
Originally it would try to use a simple target-independent approach, which
can lead to the generation of inefficient code.
On X86 this would result in the use of movabsq to materialize any 64bit
integer constant - even for simple and small values such as 0 and 1. Also
some very funny floating-point materialization could be observed.
On AArch64 it would materialize the constant 0 in a register even though the
architecture has an actual "zero" register.
On ARM it would generate unnecessary mov instructions or not use mvn.
This change simply changes the order and always asks the target first whether it
wants to materialize the constant. This doesn't fix all the issues
mentioned above, but it enables the targets to implement such
optimizations.
Related to <rdar://problem/17420988>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215588 91177308-0d34-0410-b5e6-96231b3b80d8
Gerolf Hoflehner [Wed, 13 Aug 2014 22:07:36 +0000 (22:07 +0000)]
[MachineCombiner] Removal of dangling DBG_VALUES after combining [20598]
This is a cleaner solution to the problem described in r215431.
When instructions are combined a dangling DBG_VALUE is removed.
This resolves bug 20598.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215587 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 22:01:55 +0000 (22:01 +0000)]
[FastISel][X86] Refactor constant materialization. NFCI.
Split the constant materialization code into three separate helper functions for
Integer-, Floating-Point-, and GlobalValue-Constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215586 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 21:42:19 +0000 (21:42 +0000)]
[FastISel][ARM] Use MOVT/MOVW if the subtarget requests it.
This change is also in preparation for a future change to make sure that
the constant materialization uses MOVT/MOVW when available and not a load
from the constant pool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215584 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 21:39:18 +0000 (21:39 +0000)]
[FastISel][ARM] Fix a bug in the integer materialization code.
getRegClassFor returns the incorrect register class when in Thumb2 mode.
This fix simply manually selects the register class as in the code just a few
lines above.
There is no test case for this code, because the code is currently
unreachable. This will be changed in a future commit and existing test
cases will exercise this code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215583 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Wed, 13 Aug 2014 21:34:04 +0000 (21:34 +0000)]
[FastISel][AArch64] Cleanup constant materialization code. NFCI.
Cleanup and prepare constant materialization code for future commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215582 91177308-0d34-0410-b5e6-96231b3b80d8
Gerolf Hoflehner [Wed, 13 Aug 2014 21:15:23 +0000 (21:15 +0000)]
[Cleanup] Utility function to erase instruction and mark DBG_VALUEs
New function to erase a machine instruction and mark DBG_VALUEs
for removal. A DBG_VALUE is marked for removal when it references
an operand defined in the instruction.
Use the new function to clean up code in the dead machine instruction
removal pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215580 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Wed, 13 Aug 2014 21:00:07 +0000 (21:00 +0000)]
[MachineDominatorTree] Provide a method to inform a MachineDominatorTree that a
critical edge has been split. The MachineDominatorTree will then lazily update the
underlying dominance properties when required.
** Context **
This is a follow-up of r215410.
Each time a critical edge is split this invalidates the dominator tree
information. Thus, subsequent queries of that interface will be slow until the
underlying information is actually recomputed (costly).
** Problem **
Prior to this patch, splitting a critical edge needed to query the dominator
tree to update the dominator information.
Therefore, splitting a bunch of critical edges will likely produce poor
performance as each query to the dominator tree will use the slow query path.
This happens a lot in passes like MachineSink and PHIElimination.
** Proposed Solution **
Splitting a critical edge is a local modification of the CFG. Moreover, as soon
as a critical edge is split, it is not critical anymore and thus cannot be a
candidate for critical edge splitting anymore. In other words, the predecessor
and successor of a basic block inserted on a critical edge cannot be inserted by
critical edge splitting.
Using these observations, we can pile up the splitting of critical edges and
apply them at once before updating the DT information.
The core of this patch moves the update of the MachineDominatorTree information
from MachineBasicBlock::SplitCriticalEdge to a lazy MachineDominatorTree.
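
The batching idea described above can be illustrated with a small Python sketch. It is a toy model, not the patch itself: the class and method names (LazyDomTree, record_split, idom_of) are hypothetical, the tree is a plain dict of immediate dominators, and the only rule it applies is the one the message relies on, namely that a block inserted on a split edge becomes the immediate dominator of its successor only when every other predecessor of that successor is already dominated by the successor.

```python
class LazyDomTree:
    """Toy dominator tree that defers updates for split critical edges."""

    def __init__(self, idom, preds):
        self.idom = dict(idom)  # block -> immediate dominator (entry -> None)
        self.preds = {b: set(p) for b, p in preds.items()}
        self.pending = []       # (from_bb, to_bb, new_bb) splits not yet applied
        self.new_bbs = set()    # blocks the tree does not know about yet

    def record_split(self, from_bb, to_bb, new_bb):
        # Cheap path: update the CFG and remember the split; the tree is untouched.
        self.preds[to_bb].discard(from_bb)
        self.preds[to_bb].add(new_bb)
        self.preds[new_bb] = {from_bb}
        self.pending.append((from_bb, to_bb, new_bb))
        self.new_bbs.add(new_bb)

    def _dominates(self, a, b):
        # Walk up the immediate-dominator chain from b looking for a.
        while b is not None:
            if b == a:
                return True
            b = self.idom[b]
        return False

    def _apply_pending(self):
        if not self.pending:
            return
        # Phase 1: using the old tree, decide whether each new block becomes
        # the immediate dominator of its successor.
        is_new_idom = []
        for _from_bb, to_bb, new_bb in self.pending:
            ok = True
            for pred in self.preds[to_bb]:
                if pred == new_bb:
                    continue
                if pred in self.new_bbs:
                    # Another pending split block: query its single predecessor.
                    (pred,) = self.preds[pred]
                if not self._dominates(to_bb, pred):
                    ok = False
                    break
            is_new_idom.append(ok)
        # Phase 2: apply all recorded splits to the tree in one go.
        for (from_bb, to_bb, new_bb), ok in zip(self.pending, is_new_idom):
            self.idom[new_bb] = from_bb
            if ok:
                self.idom[to_bb] = new_bb
        self.pending.clear()
        self.new_bbs.clear()

    def dominates(self, a, b):
        self._apply_pending()   # queries pay for the deferred update once
        return self._dominates(a, b)

    def idom_of(self, b):
        self._apply_pending()
        return self.idom[b]


# Loop example: entry -> {H, X}, H -> B, B -> {H, exit}; entry->H is critical.
dt = LazyDomTree(
    {"entry": None, "H": "entry", "X": "entry", "B": "H", "exit": "B"},
    {"entry": [], "H": ["entry", "B"], "X": ["entry"], "B": ["H"], "exit": ["B"]},
)
dt.record_split("entry", "H", "N")
```

In this example the only other predecessor of H is the back edge from B, which H dominates, so the first query after the split makes N the immediate dominator of H.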
** Performance **
Thanks to this patch, the motivating example compiles in 4- minutes instead of
6+ minutes. No test case added, as the motivating example has nothing special but
being huge!
The binaries are strictly identical for all the llvm test-suite + SPECs with and
without this patch for both Os and O3.
Regarding compile time, I observed only noise, although on average I saw a
small improvement.
<rdar://problem/17894619>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215576
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Wed, 13 Aug 2014 20:41:26 +0000 (20:41 +0000)]
Fix (re-)creation of unittest lit.site.cfg for clang-tools-extra.
This has been hiding really well. Hopefully brings the builders suffering from
outdated lit.site.cfg files back to life.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215575
91177308-0d34-0410-b5e6-
96231b3b80d8
Jan Vesely [Wed, 13 Aug 2014 20:31:53 +0000 (20:31 +0000)]
utils: Fix segfault in flattencfg
v2: continue iterating through the rest of the bb
    use for loop
v3: initialize FlattenCFG pass in ScalarOps
    add test
v4: split off initializing flattencfg to a separate patch
    add comment
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215574
91177308-0d34-0410-b5e6-
96231b3b80d8
Jan Vesely [Wed, 13 Aug 2014 20:31:52 +0000 (20:31 +0000)]
Initialize FlattenCFG pass
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215573
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Wed, 13 Aug 2014 18:59:01 +0000 (18:59 +0000)]
Simplify memory ownership with std::unique_ptr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215567
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Wed, 13 Aug 2014 18:49:01 +0000 (18:49 +0000)]
Simplify ownership with std::unique_ptr. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215566
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 13 Aug 2014 18:14:11 +0000 (18:14 +0000)]
R600: Correctly set the src value offset for scalarized kernel args
This for some reason fixes v1i64 kernel arguments on pre-SI. This
currently breaks some other cases in the kernel-args.ll test for R600,
but I'm not particularly confident in the new output. VTX_READ_* are not
used for some of the scalarized cases, and the code reading from the
constant buffer doesn't make much sense to me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215564
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Wed, 13 Aug 2014 16:26:38 +0000 (16:26 +0000)]
Canonicalize header guards into a common format.
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215558
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Foster [Wed, 13 Aug 2014 16:11:50 +0000 (16:11 +0000)]
Test commit, remove trailing whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215556
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrea Di Biagio [Wed, 13 Aug 2014 16:09:40 +0000 (16:09 +0000)]
[DAGCombiner] Improved target independent vector shuffle combine rule.
This patch improves the existing algorithm in DAGCombiner that
attempts to fold shuffles according to rule:
shuffle(shuffle(x, y, M1), undef, M2) -> shuffle(y, undef, M3)
Before this change, there were cases where the DAGCombiner conservatively
avoided folding shuffles even if the resulting mask would have been legal.
That is because the algorithm wrongly assumed that commuting
an illegal shuffle mask would always produce an illegal mask.
With this change, we now correctly compute the commuted shuffle mask before
calling method 'isShuffleMaskLegal' on it.
On X86, this improves for example the codegen for the following function:
  define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) {
    %1 = shufflevector <4 x i32> %B, <4 x i32> %A, <4 x i32> <i32 1, i32 2, i32 6, i32 7>
    %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 2, i32 3>
    ret <4 x i32> %2
  }
Before this change the X86 backend (-mcpu=corei7) generated
the following assembly code for function @test:
  shufps $-23, %xmm0, %xmm1 # xmm1 = xmm1[1,2],xmm0[2,3]
  movhlps %xmm1, %xmm1 # xmm1 = xmm1[1,1]
  movaps %xmm1, %xmm0
Now we produce:
  movhlps %xmm0, %xmm0 # xmm0 = xmm0[1,1]
Added extra test cases in combine-vec-shuffle-2.ll to verify that we correctly
fold according to the above-mentioned rule.
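
As a rough illustration of the rule above, the following Python sketch composes the two masks, commutes the result when every lane comes from the second operand, and only then asks whether the mask is legal. The helper names (compose_masks, commute_mask, fold) and the is_legal callback are hypothetical stand-ins; the callback plays the role of 'isShuffleMaskLegal', and -1 is used here to denote an undef lane.

```python
def compose_masks(m1, m2):
    """Mask M with shuffle(shuffle(x, y, m1), undef, m2) == shuffle(x, y, M)."""
    n = len(m1)
    # Negative indices and lanes read from the undef operand stay undef (-1).
    return [m1[i] if 0 <= i < n else -1 for i in m2]


def commute_mask(mask):
    """Rewrite a mask for shuffle(x, y, ...) into one for shuffle(y, x, ...)."""
    n = len(mask)
    return [-1 if i < 0 else (i + n if i < n else i - n) for i in mask]


def fold(m1, m2, is_legal):
    """Return (operand, mask) for a single-input shuffle, or None if no fold."""
    n = len(m1)
    combined = compose_masks(m1, m2)
    uses_x = any(0 <= i < n for i in combined)
    uses_y = any(i >= n for i in combined)
    if uses_x and uses_y:
        return None  # still needs both inputs; this rule does not apply
    if uses_y:
        # Commute first, then check legality on the commuted mask.
        m3 = commute_mask(combined)
        return ("y", m3) if is_legal(m3) else None
    return ("x", combined) if is_legal(combined) else None


# Masks from the @test function above: x is %B, y is %A.
result = fold([1, 2, 6, 7], [2, 3, 2, 3], is_legal=lambda mask: True)
```

For the masks in @test the composed mask is <6, 7, 6, 7>, which reads only from %A, so the fold yields a single shuffle of %A with mask <2, 3, 2, 3>.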
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215555
91177308-0d34-0410-b5e6-
96231b3b80d8
Toma Tabacu [Wed, 13 Aug 2014 12:48:12 +0000 (12:48 +0000)]
[mips] Refactor calls to setCanHaveModuleDir.
Summary:
Moved some calls to setCanHaveModuleDir to the MipsTargetStreamer base class and removed the resulting empty functions from the MipsTargetELFStreamer class.
Also fixed a missing call to setCanHaveModuleDir in MipsTargetELFStreamer::emitDirectiveSetMicroMips.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: tomatabacu
Differential Revision: http://reviews.llvm.org/D4781
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215542
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 13 Aug 2014 12:27:18 +0000 (12:27 +0000)]
[shuffle] Stand back! I'm about to (try to) do math!
Especially with blends and large tree heights there was a problem with
the fuzzer where it would end up with enough undef shuffle elements in
enough parts of the tree that in a birthday-attack kind of way we ended
up regularly having large numbers of undef elements in the result. I was
seeing reasonably frequent cases of *all* results being undef which
prevents us from doing any correctness checking at all. While having
undef lanes is important, this was too much.
So I've tried to apply some math to the probabilities of having an undef
lane and balance them against the tree height. Please be gentle, I'm
really terrible at math. I probably made a bunch of amateur mistakes
here. Fixes, etc. are quite welcome. =D At least in running it some, it
seems to be producing more interesting (for correctness testing)
results.
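
One simple way to picture the balancing act is the model below. It is only an assumed model, not the formula used in the fuzzer: each shuffle on the path from an input lane to a result lane is taken to produce undef independently with probability p, so a result lane under a tree of height h is undef with probability 1 - (1 - p)^h. The function names are hypothetical.

```python
def result_undef_prob(p, height):
    """Chance that a result lane is undef after `height` levels of shuffles."""
    return 1.0 - (1.0 - p) ** height


def per_shuffle_undef_prob(target, height):
    """Per-shuffle undef probability that yields `target` at the result."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    if height < 1:
        raise ValueError("height must be at least 1")
    return 1.0 - (1.0 - target) ** (1.0 / height)
```

Under this model a fixed per-shuffle probability of 0.1 already leaves about 65% of the lanes undef at height 10, which is why the per-shuffle value has to shrink as the tree grows.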
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215540
91177308-0d34-0410-b5e6-
96231b3b80d8
Aaron Ballman [Wed, 13 Aug 2014 11:17:41 +0000 (11:17 +0000)]
Asserting that the call to chdir succeeds in this test. Fixes some -Wunused-result warnings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215539
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 13 Aug 2014 10:49:33 +0000 (10:49 +0000)]
[optnone] Make the optnone attribute effective at suppressing function
attribute and function argument attribute synthesizing and propagating.
As with the other uses of this attribute, the goal remains a best-effort
(no guarantees) attempt to not optimize the function or assume things
about the function when optimizing. This is particularly useful for
compiler testing, bisecting miscompiles, triaging things, etc. I was
hitting specific issues using optnone to isolate test code from a test
driver for my fuzz testing, and this is one step of fixing that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215538
91177308-0d34-0410-b5e6-
96231b3b80d8
Aaron Ballman [Wed, 13 Aug 2014 10:49:07 +0000 (10:49 +0000)]
Silence a -Wparenthesis warning with these asserts. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215537
91177308-0d34-0410-b5e6-
96231b3b80d8
Robert Khasanov [Wed, 13 Aug 2014 10:46:00 +0000 (10:46 +0000)]
[SKX] Extended non-temporal load/store instructions for AVX512VL subsets.
Added avx512_movnt_vl multiclass for handling 256/128-bit forms of instruction.
Added encoding and lowering tests.
Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215536
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Wed, 13 Aug 2014 10:07:34 +0000 (10:07 +0000)]
Re-commit: [mips] Implement .ent, .end, .frame, .mask and .fmask.
Patch by Matheus Almeida and Toma Tabacu
The lld test failure on the previous attempt to commit was caused by the
addition of the .pdr section causing the offsets it was checking to change.
This has been fixed by removing the .ent/.end directives from that test since
they weren't really needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215535
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 13 Aug 2014 10:00:46 +0000 (10:00 +0000)]
[shuffle] Make the seed an optional component and add support for
letting the python very directly compute a UUID.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215533
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 13 Aug 2014 09:19:39 +0000 (09:19 +0000)]
Revert r215415 which causes MSan to crash on a great deal of C++ code.
I've followed up on the original commit as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215532
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 13 Aug 2014 09:05:59 +0000 (09:05 +0000)]
[shuffle] Teach the shuffle fuzzer to fuzz blends, including forming
a tree of inputs to blend iteratively together.
This required a pretty substantial rewrite of the innards. The number of
shuffle instructions is now bounded in terms of tree-height. There is
a flag to disable blends so that it's still possible to test single input
shuffles. I've also improved various aspects of how the test program is
generated, primarily to simplify the test harness and allow some
optimizations to clean up how we actually check the results and build up
the inputs.
Again, apologies for my likely horrible use of Python... But hey, it
works! (Ish?)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215530
91177308-0d34-0410-b5e6-
96231b3b80d8
Elena Demikhovsky [Wed, 13 Aug 2014 07:58:43 +0000 (07:58 +0000)]
AVX-512: Fixed a bug in shufflevector lowering.
PALIGNR instruction does not exist in AVX-512F set.
Added a test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215526
91177308-0d34-0410-b5e6-
96231b3b80d8
Karthik Bhat [Wed, 13 Aug 2014 05:13:14 +0000 (05:13 +0000)]
InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (and %a, %b)
Correctness proof of the transform using CVC3-
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR(A | B, BVXOR(A,B) ) = A & B;
$ cvc3 t.cvc
Valid.
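
The identity checked by the CVC3 query, (A | B) ^ (A ^ B) = A & B, can also be sanity-checked by brute force. The sketch below does this for 8-bit values only; since every operation involved is bitwise, each bit position behaves independently, so a narrow width is a reasonable spot check (the 32-bit proof remains the CVC3 result above). The function name is hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1


def xor_of_or_and_xor(a, b):
    """Left-hand side of the query: (a | b) ^ (a ^ b)."""
    return (a | b) ^ (a ^ b)


# Exhaustive check over all pairs of 8-bit values.
identity_holds = all(
    xor_of_or_and_xor(a, b) == (a & b)
    for a in range(MASK + 1)
    for b in range(MASK + 1)
)

# The result is a bitwise AND, not an addition: a = b = 1 tells them apart.
differs_from_add = xor_of_or_and_xor(1, 1) != ((1 + 1) & MASK)
```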
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215524
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Wed, 13 Aug 2014 04:59:51 +0000 (04:59 +0000)]
[NVPTX] Remove MemIntrinsicSDNode/MemSDNode duplicate checking
As of r214452, isa<MemSDNode> will return true for nodes for which
isa<MemIntrinsicSDNode> will return true (classof now respects the actual class
hierarchy). So we no longer need to check for both MemIntrinsicSDNode and
MemSDNode separately.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215523
91177308-0d34-0410-b5e6-
96231b3b80d8
Nick Lewycky [Wed, 13 Aug 2014 04:54:05 +0000 (04:54 +0000)]
Fix examples of "named metadata" (some of which isn't named).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215522
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 13 Aug 2014 03:21:11 +0000 (03:21 +0000)]
[shuffle] Tweak the shuffle fuzzer to support bigger seeds. I'm
currently using UUIDs to seed this in order to scan a bigger range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215521
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 13 Aug 2014 01:25:45 +0000 (01:25 +0000)]
[x86] Rewrite a core part of the new vector shuffle lowering to handle
one pesky test case correctly.
This test case caused the old code to infloop oscillating between solving
the low-half and the high-half. The 'side balancing' part of
single-input v8 shuffle lowering didn't handle the one pattern which can
cause it to oscillate. Fortunately the fuzz testing found this case.
Unfortunately it was *terrible* to handle. I'm really sorry for the
amount and density of the code here, I'd love suggestions on how to
simplify it. I feel like there *must* be a simpler form here, but after
a lot of days I've not found it. This is the only one I've found that
even works. I've added the one pesky test case along with some nice
comments explaining the core problem that we have to solve here.
So far this has survived approximately 32k test cases. More strenuous
fuzzing commencing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215519
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Wed, 13 Aug 2014 01:15:40 +0000 (01:15 +0000)]
[PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsic
This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store
intrinsics. As with the construction of the MachineMemOperands for the
intrinsic calls used for unaligned load/store lowering, the only slight
complication is that we need to represent a larger memory range than the
loaded/stored value-type size (because the address is rounded down to an
aligned address, and we need to conservatively represent the entire possible
range of the actual access). This required adding an extra size field to
TargetLowering::IntrinsicInfo, and this was done in a way that required no
modifications to other targets (the size defaults to the store size of the
provided memory data type).
This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed).
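
The "larger memory range" mentioned above can be made concrete with a little arithmetic. The Python sketch below assumes a 16-byte vector access whose address is rounded down to a 16-byte boundary; the helper names and the particular offset and size it computes (15 bytes before the pointer, 31 bytes in total) are derived from that description for illustration and are not quoted from the patch.

```python
VEC_BYTES = 16  # assumed vector size in bytes


def aligned_window(addr):
    """Half-open byte range actually touched once addr is rounded down."""
    base = addr & ~(VEC_BYTES - 1)
    return base, base + VEC_BYTES


def conservative_range(ptr):
    """Half-open range, relative to ptr, that covers every possible window."""
    return ptr - (VEC_BYTES - 1), ptr + VEC_BYTES


# Whatever the low bits of the pointer are, the touched window fits the range.
covered = all(
    conservative_range(p)[0] <= aligned_window(p)[0]
    and aligned_window(p)[1] <= conservative_range(p)[1]
    for p in range(64, 128)
)
range_size = conservative_range(100)[1] - conservative_range(100)[0]
```

The range is wider than the 16 bytes actually loaded or stored because the low bits of the pointer are not known when the memory operand is built.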
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215512
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Wed, 13 Aug 2014 01:15:37 +0000 (01:15 +0000)]
Fix classof for ISD::INTRINSIC_W_CHAIN and INTRINSIC_VOID
Unfortunately, our use of the SDNode class hierarchy for INTRINSIC_W_CHAIN and
INTRINSIC_VOID nodes is somewhat broken right now. These nodes sometimes are
used for memory intrinsics (those with MachineMemOperands), and sometimes not.
When not, the nodes are not created as instances of MemIntrinsicSDNode, but
rather created as some other subclass of SDNode using DAG::getNode. When they
are memory intrinsics, they are created using DAG::getMemIntrinsicNode as
instances of MemIntrinsicSDNode. MemIntrinsicSDNode is a subclass of
MemSDNode, but prior to r214452, we had a non-self-consistent setup whereby
MemIntrinsicSDNode::classof on INTRINSIC_W_CHAIN and INTRINSIC_VOID would
return true but MemSDNode::classof on INTRINSIC_W_CHAIN and INTRINSIC_VOID
would return false. In r214452, MemSDNode::classof was changed to return true
for INTRINSIC_W_CHAIN and INTRINSIC_VOID, which is now self-consistent. The
problem is that neither the pre-r214452 logic nor the post-r214452 logic is
really right. The truth is that not all INTRINSIC_W_CHAIN and INTRINSIC_VOID
nodes are instances of MemIntrinsicSDNode (or MemSDNode for that matter), and
the return value from classof needs to reflect that. This was broken before
r214452 (because MemIntrinsicSDNode::classof always returned true), and was
broken afterward (because MemSDNode::classof also always returned true), and
will now be correct.
The minimal solution is to grab one of the SubclassData bits (there is one left
for MemIntrinsicSDNode nodes) and use it to store whether or not a particular
INTRINSIC_W_CHAIN or INTRINSIC_VOID is really an instance of
MemIntrinsicSDNode or not. Doing this allows both MemIntrinsicSDNode::classof
and MemSDNode::classof to return the correct answer for the underlying object
for both the memory-intrinsic and non-memory-intrinsic cases.
This fixes the problem that r214452 created in the SelectionDAGDumper (thanks
to Matt Arsenault for pointing it out).
Because PowerPC does not implement getTgtMemIntrinsic, this change breaks
test/CodeGen/PowerPC/unal-altivec-wint.ll. I've XFAILed it for now, and will
fix it in a follow-up commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215511
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Wed, 13 Aug 2014 00:30:05 +0000 (00:30 +0000)]
[AVX512] Verify the code generated for the intrinsic _mm512_broadcastsd_pd
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215487
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Tue, 12 Aug 2014 23:23:05 +0000 (23:23 +0000)]
Fix -Wsign-compare warnings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215483
91177308-0d34-0410-b5e6-
96231b3b80d8
Reid Kleckner [Tue, 12 Aug 2014 22:01:39 +0000 (22:01 +0000)]
APInt: Make self-move-assignment a no-op to fix stage3 clang-cl
It's not clear what the semantics of a self-move should be. The
consensus appears to be that a self-move should leave the object in a
moved-from state, which is what our existing move assignment operator
does.
However, the MSVC 2013 STL will perform self-moves in some cases. In
particular, when doing a std::stable_sort of an already sorted APSInt
vector of an appropriate size, one of the merge steps will self-move
half of the elements.
We don't notice this when building with MSVC, because MSVC will not
synthesize the move assignment operator for APSInt. Presumably MSVC
does this because APInt, the base class, has user-declared special
members that implicitly delete move special members. Instead, MSVC
selects the copy-assign operator, which defends against self-assignment.
Clang, on the other hand, selects the move-assign operator, and we get
garbage APInts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215478
91177308-0d34-0410-b5e6-
96231b3b80d8
Adrian Prantl [Tue, 12 Aug 2014 21:55:58 +0000 (21:55 +0000)]
Remove a condition that can never be true, as witnessed by the assert
above.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215477
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Tue, 12 Aug 2014 21:13:12 +0000 (21:13 +0000)]
[AVX512] Handle valign masking intrinsic via C++ lowering
I think that this will scale better in most cases than adding a Pat<> for each
mapping from the intrinsic DAG to the instruction (i.e. rri, rrik, rrikz). We
can just lower to the SDNode and have the resulting DAG be matched by the DAG
patterns.
Alternatively (long term), we could keep the Pat<>s but generate them via the
new AVX512_masking multiclass. The difficulty is that in order to formulate
that we would have to concatenate DAGs. Currently this is only supported if
the operators of the input DAGs are identical.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215473
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 12 Aug 2014 19:46:13 +0000 (19:46 +0000)]
Allow bitcast + struct GEP transform to work with addrspacecast
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215467
91177308-0d34-0410-b5e6-
96231b3b80d8
Jan Vesely [Tue, 12 Aug 2014 17:31:20 +0000 (17:31 +0000)]
R600: Use optimized 24bit path in udivrem
v2: drop enum keyword
    use correct extension mode
    don't bother computing the sign in unsigned case
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215462
91177308-0d34-0410-b5e6-
96231b3b80d8
Jan Vesely [Tue, 12 Aug 2014 17:31:19 +0000 (17:31 +0000)]
R600: Remove unused code.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215461
91177308-0d34-0410-b5e6-
96231b3b80d8
Jan Vesely [Tue, 12 Aug 2014 17:31:17 +0000 (17:31 +0000)]
R600: Use i24 optimized path for SREM
v2: add tests
rename LowerSDIV24 to LowerSDIVREM24
handle the rem part in this function
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215460
91177308-0d34-0410-b5e6-
96231b3b80d8
Quentin Colombet [Tue, 12 Aug 2014 17:11:26 +0000 (17:11 +0000)]
Fix a parentheses warning introduced in r215394.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215459
91177308-0d34-0410-b5e6-
96231b3b80d8
Duncan P. N. Exon Smith [Tue, 12 Aug 2014 16:46:37 +0000 (16:46 +0000)]
Don't upgrade global constructors when reading bitcode
An optional third field was added to `llvm.global_ctors` (and
`llvm.global_dtors`) in r209015. Most of the code has been changed to
deal with both versions of the variables. Users of the C API might
create either version, the helper functions in LLVM create the two-field
version, and clang now creates the three-field version.
However, the BitcodeReader was changed to always upgrade to the
three-field version. This created an unnecessary inconsistency in the
IR before/after serializing to bitcode.
This commit resolves the inconsistency by making the third field truly
optional (and not upgrading in the bitcode reader). Since `llvm-link`
was relying on this upgrade code, rather than deleting it I've moved it
into `ModuleLinker`, where it upgrades these arrays as necessary to
resolve inconsistencies between modules.
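
The "upgrade only when the modules disagree" behaviour described above might look roughly like the following Python sketch. It models each `llvm.global_ctors` entry as a tuple and uses None as a placeholder for a null third field; the function names are hypothetical and the real logic lives in C++ in `ModuleLinker`.

```python
def upgrade_entry(entry):
    """Turn a two-field (priority, func) entry into a three-field one."""
    if len(entry) == 2:
        return entry + (None,)  # None stands in for a null third field
    return entry


def link_ctors(dst, src):
    """Concatenate two ctor lists, upgrading only if their layouts differ."""
    widths = {len(entry) for entry in dst + src}
    if widths == {2, 3}:
        dst = [upgrade_entry(e) for e in dst]
        src = [upgrade_entry(e) for e in src]
    return dst + src


same_layout = link_ctors([(65535, "f")], [(65535, "g")])
mixed_layout = link_ctors([(65535, "f")], [(65535, "g", "data")])
```

When both inputs already use the same layout nothing is rewritten, which mirrors the goal of keeping the IR unchanged unless an upgrade is needed to make the modules consistent.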
The ideal resolution would be to remove the 2-field version and make the
third field required. I filed PR20506 to track that.
I changed `test/Bitcode/upgrade-global-ctors.ll` to a negative test and
duplicated the `llvm-link` check in `test/Linker/global_ctors.ll` to
check both upgrade directions.
Since I came across this as part of PR5680 (serializing use-list order),
I've also added the missing `verify-uselistorder` RUN line to
`test/Bitcode/metadata-2.ll`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215457
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Tue, 12 Aug 2014 16:00:06 +0000 (16:00 +0000)]
fixed typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215451
91177308-0d34-0410-b5e6-
96231b3b80d8