Nadav Rotem [Wed, 11 Apr 2012 08:26:11 +0000 (08:26 +0000)]
Reapply 154397. Original message:
Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is true for SSE/AVX/NEON but not
for all targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154490
91177308-0d34-0410-b5e6-
96231b3b80d8
Duncan Sands [Wed, 11 Apr 2012 08:13:47 +0000 (08:13 +0000)]
Comment typo fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154488
91177308-0d34-0410-b5e6-
96231b3b80d8
Evan Cheng [Wed, 11 Apr 2012 06:59:47 +0000 (06:59 +0000)]
Add more fused mul+add/sub patterns. rdar://10139676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154484
91177308-0d34-0410-b5e6-
96231b3b80d8
Nadav Rotem [Wed, 11 Apr 2012 06:40:27 +0000 (06:40 +0000)]
Reapply 154396 after fixing a test.
Original message:
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendV uses a register for the selection while Vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154483
91177308-0d34-0410-b5e6-
96231b3b80d8
Evan Cheng [Wed, 11 Apr 2012 05:33:07 +0000 (05:33 +0000)]
Clean up ARM fused multiply + add/sub support some more: rename some isel
predicates.
Also remove NEON2 since it's not really useful and it is confusing. If
NEON + VFP4 implies NEON2 but NEON2 doesn't imply NEON + VFP4, what does it
really mean?
rdar://10139676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154480
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Wed, 11 Apr 2012 04:55:51 +0000 (04:55 +0000)]
Fix an overly indented line. Remove an 'else' after an 'if' that returns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154479
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Wed, 11 Apr 2012 04:34:11 +0000 (04:34 +0000)]
Inline implVisitAluOverflow by introducing a nested switch to convert the intrinsic to a node type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154478
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Wed, 11 Apr 2012 04:31:33 +0000 (04:31 +0000)]
Tablegen'd regpressure: emit the weighted pressure limit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154477
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Wed, 11 Apr 2012 03:19:15 +0000 (03:19 +0000)]
Table-generated register pressure fixes.
Handle mixing allocatable and unallocatable registers gracefully.
Simplify the pruning of register unit sets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154474
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Wed, 11 Apr 2012 03:06:35 +0000 (03:06 +0000)]
Optimize code a bit by calling push_back only once in some loops. Reduces compiled code size a bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154473
91177308-0d34-0410-b5e6-
96231b3b80d8
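The commit above does not show the loops it touched. As a hedged illustration only, one common way to end up with a single push_back call in a loop is to select the value first and append it afterwards. The function names and the clamping logic below are hypothetical and are not taken from the LLVM sources.

```cpp
#include <vector>

// Before: two push_back call sites inside the loop body.
inline std::vector<int> clampNegativesTwoCalls(const std::vector<int>& in) {
  std::vector<int> out;
  for (int v : in) {
    if (v < 0)
      out.push_back(0);
    else
      out.push_back(v);
  }
  return out;
}

// After: choose the value first, then call push_back exactly once.
inline std::vector<int> clampNegativesOneCall(const std::vector<int>& in) {
  std::vector<int> out;
  for (int v : in) {
    int value = v < 0 ? 0 : v;
    out.push_back(value);
  }
  return out;
}
```

Both functions return the same vector; the second form has one call site where push_back may be inlined, which is presumably where a code size saving of this kind would come from.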
Evan Cheng [Wed, 11 Apr 2012 01:21:25 +0000 (01:21 +0000)]
Match (fneg (fma)) to vfnma. rdar://10139676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154469
91177308-0d34-0410-b5e6-
96231b3b80d8
Charles Davis [Wed, 11 Apr 2012 01:10:53 +0000 (01:10 +0000)]
Add retw and lretw instructions. Also, fix Intel syntax parsing for all
ret instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154468
91177308-0d34-0410-b5e6-
96231b3b80d8
Evan Cheng [Wed, 11 Apr 2012 01:03:11 +0000 (01:03 +0000)]
Merge fma.ll into fusedMAC.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154466
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Enderby [Wed, 11 Apr 2012 00:25:40 +0000 (00:25 +0000)]
Fix ARM disassembly of VLD instructions with writebacks. Also add a test case
for all opcodes handled by DecodeVLDInstruction() in ARMDisassembler.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154459
91177308-0d34-0410-b5e6-
96231b3b80d8
Jim Grosbach [Wed, 11 Apr 2012 00:15:16 +0000 (00:15 +0000)]
ARM add missing Thumb1 two-operand aliases for shift-by-immediate.
rdar://11222742
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154457
91177308-0d34-0410-b5e6-
96231b3b80d8
Evan Cheng [Wed, 11 Apr 2012 00:13:00 +0000 (00:13 +0000)]
Fix a number of problems with ARM fused multiply add/subtract instructions.
1. The new instruction itinerary entries are not properly described.
2. The asm parser can't handle vfms and vfnms.
3. There were no assembler, disassembler test cases.
4. HasNEON2 has the wrong assembler predicate.
rdar://10139676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154456
91177308-0d34-0410-b5e6-
96231b3b80d8
Jakob Stoklund Olesen [Wed, 11 Apr 2012 00:00:28 +0000 (00:00 +0000)]
Tweak MachineLICM heuristics for cheap instructions.
Allow cheap instructions to be hoisted if they are register pressure
neutral or better. This happens if the instruction is the last loop use
of another virtual register.
Only expensive instructions are allowed to increase loop register
pressure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154455
91177308-0d34-0410-b5e6-
96231b3b80d8
Jakob Stoklund Olesen [Wed, 11 Apr 2012 00:00:26 +0000 (00:00 +0000)]
Only check for PHI uses inside the current loop.
Hoisting a value that is used by a PHI in the loop will introduce a
copy because the live range is extended to cross the PHI.
The same applies to PHIs in exit blocks.
Also use this opportunity to make HasLoopPHIUse() non-recursive.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154454
91177308-0d34-0410-b5e6-
96231b3b80d8
Jakob Stoklund Olesen [Wed, 11 Apr 2012 00:00:24 +0000 (00:00 +0000)]
Fix test to be register assignment invariant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154453
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Tue, 10 Apr 2012 23:53:32 +0000 (23:53 +0000)]
TableGen/reginfo potential bug: typo from previous checkin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154452
91177308-0d34-0410-b5e6-
96231b3b80d8
Owen Anderson [Tue, 10 Apr 2012 22:46:53 +0000 (22:46 +0000)]
Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point.
Zap a testcase that this allows us to completely fold away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154447
91177308-0d34-0410-b5e6-
96231b3b80d8
Dylan Noblesmith [Tue, 10 Apr 2012 22:44:51 +0000 (22:44 +0000)]
llvm-stress: stop abusing ConstantFP::get()
ConstantFP::get(Type*, double) is unreliably host-specific:
it can't handle a type like PPC128 on an x86 host. It even
has a comment to that effect: "This should only be used for
simple constant values like 2.0/1.0 etc, that are
known-valid both as host double and as the target format."
Instead, use APFloat. While we're at it, randomize the floating
point value more thoroughly; it was previously limited
to the range 0 to 2**19 - 1.
PR12451.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154446
91177308-0d34-0410-b5e6-
96231b3b80d8
Dylan Noblesmith [Tue, 10 Apr 2012 22:44:49 +0000 (22:44 +0000)]
llvm-stress: don't make vectors of x86_mmx type
LangRef.html says:
"There are no arrays, vectors or constants of this type."
This was hitting assertions when passing the -generate-x86-mmx
option.
PR12452.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154445
91177308-0d34-0410-b5e6-
96231b3b80d8
Kostya Serebryany [Tue, 10 Apr 2012 22:29:17 +0000 (22:29 +0000)]
[tsan] two more compile-time optimizations:
- don't instrument reads from constant globals.
Saves ~1.5% of instrumented instructions on CPU2006
(counting static instructions, not their execution).
- don't instrument reads from vtable (which is a global constant too).
Saves ~5%.
I did not measure the run-time impact of this,
but it is certainly non-negative.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154444
91177308-0d34-0410-b5e6-
96231b3b80d8
Evan Cheng [Tue, 10 Apr 2012 21:40:28 +0000 (21:40 +0000)]
Handle llvm.fma.* intrinsics. rdar://10914096
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154439
91177308-0d34-0410-b5e6-
96231b3b80d8
Duncan Sands [Tue, 10 Apr 2012 20:35:27 +0000 (20:35 +0000)]
Add a comment noting that the fdiv -> fmul conversion won't generate
multiplication by a denormal, and some tests checking that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154431
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Tue, 10 Apr 2012 20:12:16 +0000 (20:12 +0000)]
The MDString class stored a StringRef to the string which was already in a
StringMap. This was redundant and unnecessarily bloated the MDString class.
Because the MDString class is a "Value" and will never have a "name", and
because the Name field in the Value class is a pointer to a StringMap entry, we
repurpose the Name field for an MDString. It stores the StringMap entry in the
Name field, and uses the normal methods to get the string (name) back.
PR12474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154429
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Tue, 10 Apr 2012 19:42:07 +0000 (19:42 +0000)]
Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154427
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Tue, 10 Apr 2012 19:39:18 +0000 (19:39 +0000)]
Revert r154396, which looks to be the real culprit behind the bot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154426
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Christopher [Tue, 10 Apr 2012 19:33:16 +0000 (19:33 +0000)]
Temporarily revert this patch to see if it brings the buildbots back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154425
91177308-0d34-0410-b5e6-
96231b3b80d8
Kostya Serebryany [Tue, 10 Apr 2012 18:18:56 +0000 (18:18 +0000)]
[tsan] compile-time instrumentation: do not instrument a read if
a write to the same temp follows in the same BB.
Also add stats printing.
On Spec CPU2006 this optimization saves roughly 4% of instrumented reads
(which is 3% of all instrumented accesses):
Writes : 161216
Reads : 446458
Reads-before-write: 18295
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154418
91177308-0d34-0410-b5e6-
96231b3b80d8
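The read-before-write rule in the tsan commit above can be sketched on a toy model of a basic block. This is only an illustration: the Access struct and selectAccesses function are hypothetical names, addresses are plain integers, and the sketch ignores calls and anything else the real pass has to account for.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy model of one memory access in a basic block.
struct Access {
  bool isWrite;
  int addr;
};

// Returns, for each access in the block, whether it should be instrumented.
// The block is walked backward so that, when a read is visited, the set
// already holds every address written later in the same block.
inline std::vector<bool> selectAccesses(const std::vector<Access>& block) {
  std::vector<bool> instrument(block.size(), true);
  std::set<int> writtenLater;
  for (std::size_t i = block.size(); i-- > 0;) {
    const Access& a = block[i];
    if (a.isWrite)
      writtenLater.insert(a.addr);
    else if (writtenLater.count(a.addr))
      instrument[i] = false;  // a write to the same address follows
  }
  return instrument;
}
```

With the numbers quoted in the commit, 18295 of 446458 reads is about 4.1%, and 18295 of the 607674 total accesses (161216 writes plus 446458 reads) is about 3.0%.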
Eric Christopher [Tue, 10 Apr 2012 18:18:10 +0000 (18:18 +0000)]
To ensure that we have more accurate line information for a block,
don't elide the branch instruction if it's the only one in the block;
otherwise it's OK.
PR9796 and rdar://11215207
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154417
91177308-0d34-0410-b5e6-
96231b3b80d8
Owen Anderson [Tue, 10 Apr 2012 18:02:12 +0000 (18:02 +0000)]
Revert r154397, which was causing make check failures on the buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154414
91177308-0d34-0410-b5e6-
96231b3b80d8
Jim Grosbach [Tue, 10 Apr 2012 17:31:55 +0000 (17:31 +0000)]
ARM fix cc_out operand handling for t2SUBrr instructions.
We were incorrectly conflating some add variants which don't have a
cc_out operand with the mirroring sub encodings, which do. Part of the
awesome non-orthogonality legacy of thumb1. Similarly, handling of
add/sub of an immediate was sometimes incorrectly removing the cc_out
operand for add/sub register variants.
rdar://11216577
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154411
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Tue, 10 Apr 2012 15:23:13 +0000 (15:23 +0000)]
Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154398
91177308-0d34-0410-b5e6-
96231b3b80d8
Nadav Rotem [Tue, 10 Apr 2012 14:58:31 +0000 (14:58 +0000)]
Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is true for SSE/AVX/NEON but not
for all targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154397
91177308-0d34-0410-b5e6-
96231b3b80d8
Nadav Rotem [Tue, 10 Apr 2012 14:33:13 +0000 (14:33 +0000)]
Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
blendv uses a register for the selection while vblend uses an immediate.
On sandybridge they still have the same latency and execute on the same execution ports.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154396
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Tue, 10 Apr 2012 13:35:57 +0000 (13:35 +0000)]
Make a somewhat subtle change in the logic of block placement. Sometimes
the loop header has a non-loop predecessor which has been pre-fused into
its chain due to unanalyzable branches. In this case, rotating the
header into the body of the loop in order to place a loop exit at the
bottom of the loop is a Very Bad Idea as it makes the loop
non-contiguous.
I'm working on a good test case for this, but it's a bit annoying to
craft. I should get one shortly, but I'm submitting this now so I can
begin the (lengthy) performance analysis process. An initial run of LNT
looks really, really good, but there is too much noise there for me to
trust it much.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154395
91177308-0d34-0410-b5e6-
96231b3b80d8
Anton Korobeynikov [Tue, 10 Apr 2012 13:22:49 +0000 (13:22 +0000)]
Transform div to mul with reciprocal only when fp imm is legal.
This fixes PR12516 and uncovers one weird problem in legalize (worked around).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154394
91177308-0d34-0410-b5e6-
96231b3b80d8
David Chisnall [Tue, 10 Apr 2012 11:44:33 +0000 (11:44 +0000)]
Use the correct section types on Solaris for unwind data on both x86 and x86-64.
Patch by Dmitri Shubin!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154391
91177308-0d34-0410-b5e6-
96231b3b80d8
Duncan Sands [Tue, 10 Apr 2012 08:22:43 +0000 (08:22 +0000)]
Express the number of ULPs in fpaccuracy metadata as a real rather than a
rational number, e.g. as 2.5 rather than 5, 2. OK'd by Peter Collingbourne.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154387
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Tue, 10 Apr 2012 05:14:42 +0000 (05:14 +0000)]
Fix 12513: Loop unrolling breaks with indirect branches.
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154386
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Tue, 10 Apr 2012 05:14:37 +0000 (05:14 +0000)]
whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154385
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Tue, 10 Apr 2012 03:36:49 +0000 (03:36 +0000)]
Fix for register pressure tables.
Recent refactoring introduced a bug. Fix: added buildRegUnitSets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154382
91177308-0d34-0410-b5e6-
96231b3b80d8
Evan Cheng [Tue, 10 Apr 2012 03:15:42 +0000 (03:15 +0000)]
Add proper checks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154379
91177308-0d34-0410-b5e6-
96231b3b80d8
Evan Cheng [Tue, 10 Apr 2012 03:15:18 +0000 (03:15 +0000)]
Make the code slightly more palatable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154378
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Tue, 10 Apr 2012 03:12:29 +0000 (03:12 +0000)]
Use std::includes instead of my own implementation.
Jakob's review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154377
91177308-0d34-0410-b5e6-
96231b3b80d8
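For readers unfamiliar with std::includes, which the commit above switches to, here is a minimal sketch of a subset test built on it. The isSubset wrapper and the use of unsigned values are assumptions made for illustration; the commit does not show the actual call site.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true if every element of `sub` also appears in `super`.
// Both vectors must be sorted ascending, as std::includes requires.
inline bool isSubset(const std::vector<unsigned>& sub,
                     const std::vector<unsigned>& super) {
  assert(std::is_sorted(sub.begin(), sub.end()));
  assert(std::is_sorted(super.begin(), super.end()));
  return std::includes(super.begin(), super.end(), sub.begin(), sub.end());
}
```

Note the argument order: the containing range comes first and the candidate subset second.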
Andrew Trick [Tue, 10 Apr 2012 02:25:26 +0000 (02:25 +0000)]
Added a TargetRegisterInfo interface for accessing register pressure sets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154375
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Tue, 10 Apr 2012 02:25:24 +0000 (02:25 +0000)]
Added register unit sets to the target description.
This is a new algorithm that finds sets of register units that can be
used to model register pressure. This handles arbitrary, overlapping
register classes. Each register class is associated with a (small)
list of pressure sets. These are the dimensions of pressure affected
by the register class's liveness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154374
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Tue, 10 Apr 2012 02:25:21 +0000 (02:25 +0000)]
Added register unit weights to the target description.
This is a new algorithm that associates registers with weighted
register units to accurately model their effect on register
pressure. This handles registers with multiple overlapping
subregisters. It is possible, but almost inconceivable that the
algorithm fails to find an exact solution for a target description. If
an exact solution cannot be found, an inexact, but reasonable solution
will be chosen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154373
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Trick [Tue, 10 Apr 2012 02:25:18 +0000 (02:25 +0000)]
Fix header comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154372
91177308-0d34-0410-b5e6-
96231b3b80d8
Danil Malyshev [Tue, 10 Apr 2012 01:54:44 +0000 (01:54 +0000)]
Add a constructor for DataRefImpl and remove excess initialization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154371
91177308-0d34-0410-b5e6-
96231b3b80d8
Evan Cheng [Tue, 10 Apr 2012 01:51:00 +0000 (01:51 +0000)]
Fix a long-standing tail call optimization bug. When a libcall is emitted,
the legalizer always uses the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.
PR12419
rdar://9770785
rdar://11195178
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154370
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Tue, 10 Apr 2012 00:16:22 +0000 (00:16 +0000)]
Don't try to zExt just to check if an integer constant is zero, it might
not fit in an i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154364
91177308-0d34-0410-b5e6-
96231b3b80d8
Jim Grosbach [Tue, 10 Apr 2012 00:13:07 +0000 (00:13 +0000)]
ARM LDR/LDRT has the same encoding collision as STR/STRT.
Generalized logic of r154141.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154362
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Mon, 9 Apr 2012 23:58:59 +0000 (23:58 +0000)]
Test case for PR12495.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154359
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Mon, 9 Apr 2012 23:16:51 +0000 (23:16 +0000)]
Revert the 'EnableInitializing' flag. There is debate on whether we should run that pass by default in LTO.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154356
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Mon, 9 Apr 2012 22:18:01 +0000 (22:18 +0000)]
Apply the scope restrictions after parsing the command line options. There may be some which are used in that function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154348
91177308-0d34-0410-b5e6-
96231b3b80d8
Akira Hatanaka [Mon, 9 Apr 2012 20:32:12 +0000 (20:32 +0000)]
Have TargetLowering::getPICJumpTableRelocBase return a node that points to the
GOT if jump table uses 64-bit gp-relative relocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154341
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Mon, 9 Apr 2012 20:32:02 +0000 (20:32 +0000)]
When performing a truncating store, it's possible to rearrange the data
in-register, such that we can use a single vector store rather than a
series of scalar stores.
For func_4_8 the generated code
vldr d16, LCPI0_0
vmov d17, r0, r1
vadd.i16 d16, d17, d16
vmov.u16 r0, d16[3]
strb r0, [r2, #3]
vmov.u16 r0, d16[2]
strb r0, [r2, #2]
vmov.u16 r0, d16[1]
strb r0, [r2, #1]
vmov.u16 r0, d16[0]
strb r0, [r2]
bx lr
becomes
vldr d16, LCPI0_0
vmov d17, r0, r1
vadd.i16 d16, d17, d16
vuzp.8 d16, d17
vst1.32 {d16[0]}, [r2, :32]
bx lr
I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.
This
ldrh r0, [r0, #4]
strh r0, [r1]
becomes
vldr d16, [r0]
vmov.u16 r0, d16[2]
vmov.32 d16[0], r0
vuzp.16 d16, d17
vst1.32 {d16[0]}, [r1, :32]
PR11158
rdar://10703339
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154340
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Mon, 9 Apr 2012 20:17:30 +0000 (20:17 +0000)]
Patch r153892 for PR11861 apparently broke an external project (see PR12493).
This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when
rescheduling instructions in TryInstructionTransform. Hopefully this will fix
PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after
the copy that unties the operands is emitted (this seems to be a more
appropriate fix for that issue anyway).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154338
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Mon, 9 Apr 2012 19:38:15 +0000 (19:38 +0000)]
Update comments and remove unnecessary isVolatile() check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154336
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Christopher [Mon, 9 Apr 2012 17:54:34 +0000 (17:54 +0000)]
Typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154329
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Mon, 9 Apr 2012 16:29:35 +0000 (16:29 +0000)]
Fix accidentally constant conditions found by uncommitted improvements to -Wconstant-conversion.
A couple of cases where we were accidentally creating constant conditions by
something like "x == a || b" instead of "x == a || x == b". In one case a
conditional & then unreachable was used - I transformed this into a direct
assert instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154324
91177308-0d34-0410-b5e6-
96231b3b80d8
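The "x == a || b" mistake described in the commit above is easy to reproduce in isolation. The sketch below uses a hypothetical Opcode enum and function names; it is not code from the LLVM tree.

```cpp
// Hypothetical opcode enum, for illustration only.
enum Opcode { Add = 1, Sub = 2, Mul = 3 };

// Buggy form: parsed as (op == Add) || Sub. Sub is a nonzero constant,
// so the whole condition is always true regardless of op.
inline bool isAddOrSubBuggy(Opcode op) {
  return op == Add || Sub;
}

// Intended form: compare op against each alternative explicitly.
inline bool isAddOrSubFixed(Opcode op) {
  return op == Add || op == Sub;
}
```

Calling both with Mul shows the difference: the buggy version returns true, the fixed version returns false.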
Rafael Espindola [Mon, 9 Apr 2012 16:06:03 +0000 (16:06 +0000)]
Pattern match a setcc of boolean value with 0 as a truncate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154322
91177308-0d34-0410-b5e6-
96231b3b80d8
Preston Gurd [Mon, 9 Apr 2012 15:32:22 +0000 (15:32 +0000)]
This patch adds X86 instruction itineraries, which were missed by the
original patch to add itineraries, to X86InstrArithmetic.td.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154320
91177308-0d34-0410-b5e6-
96231b3b80d8
Duncan Sands [Mon, 9 Apr 2012 14:08:00 +0000 (14:08 +0000)]
Clarify that fpaccuracy metadata is giving the compiler permission to use a
less accurate method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154319
91177308-0d34-0410-b5e6-
96231b3b80d8
Nadav Rotem [Mon, 9 Apr 2012 08:33:21 +0000 (08:33 +0000)]
Lower some x86 shuffle sequences to the vblend family of instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154313
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Mon, 9 Apr 2012 08:32:21 +0000 (08:32 +0000)]
s/lto_codegen_whole_program_optimization/lto_codegen_set_whole_program_optimization/
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154312
91177308-0d34-0410-b5e6-
96231b3b80d8
Nadav Rotem [Mon, 9 Apr 2012 07:45:58 +0000 (07:45 +0000)]
Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154310
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Mon, 9 Apr 2012 07:19:09 +0000 (07:19 +0000)]
Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154309
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Mon, 9 Apr 2012 05:59:53 +0000 (05:59 +0000)]
Remove unnecessary 'else' on an 'if' that always returns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154308
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Mon, 9 Apr 2012 05:55:33 +0000 (05:55 +0000)]
Optimize code slightly. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154307
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Mon, 9 Apr 2012 05:26:48 +0000 (05:26 +0000)]
Add a hook to turn on the internalize pass through the LTO interface.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154306
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Mon, 9 Apr 2012 05:16:56 +0000 (05:16 +0000)]
Replace some explicit checks with asserts for conditions that should never happen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154305
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 9 Apr 2012 02:13:06 +0000 (02:13 +0000)]
Clean up and relax a restriction on the matching of global offsets into
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.
To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
the match to try to fit RIP into the base register. If it fails, it
now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
as it did before. The reason these immediates are safe is because the
ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
This is the only case changed by the patch, and the primary place you
see it is in TLS, either the win64 section offset TLS or Linux
local-exec TLS model in a PIC compilation. Here the ABI again ensures
that the immediates fit because we are in small mode, and any other
operations required due to the PIC relocation model have been handled
externally to the Wrapper node (extra loads etc are made around the
wrapper node in ISelLowering).
I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154304
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 9 Apr 2012 01:43:17 +0000 (01:43 +0000)]
Fold 15 tiny test cases into a single file that implements the
comprehensive testing of TLS codegen for x86. Convert all of the ones
that were still using grep to use FileCheck. Remove some redundancies
between them.
Perhaps most interestingly expand the test cases so that they actually
fully list the instruction snippet being tested. TLS operations are
*very* narrowly defined, and so these seem reasonably stable. More
importantly, the existing test cases already were crazy fine grained,
expecting specific registers to be allocated. This just clarifies that
no *other* instructions are expected, and fills in some crucial gaps
that weren't being tested at all.
This will make any subsequent changes to TLS much more clear during
review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154303
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sun, 8 Apr 2012 23:15:04 +0000 (23:15 +0000)]
Optimize code a bit. No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154299
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Sun, 8 Apr 2012 19:04:45 +0000 (19:04 +0000)]
Silence sign-compare warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154297
91177308-0d34-0410-b5e6-
96231b3b80d8
Duncan Sands [Sun, 8 Apr 2012 18:08:12 +0000 (18:08 +0000)]
Only have codegen turn fdiv by a constant into fmul by the reciprocal
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly. There is already an IR level transform that does
that, and it does it more carefully.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154296
91177308-0d34-0410-b5e6-
96231b3b80d8
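As a rough sketch of what "the reciprocal can be formed exactly" means in the commit above, the check below accepts a divisor only when it is a power of two whose reciprocal is a normal (non-denormal) double. This is an assumption-laden illustration using the C++ standard library, not the APFloat-based code LLVM uses; the function name is hypothetical.

```cpp
#include <cmath>

// Returns true if 1.0 / d is exactly representable as a normal double,
// in which case x / d and x * (1.0 / d) round to the same result.
inline bool hasExactNormalReciprocal(double d) {
  if (d == 0.0 || !std::isfinite(d))
    return false;
  int exp = 0;
  double mant = std::frexp(d, &exp);  // d == mant * 2^exp, 0.5 <= |mant| < 1
  if (std::fabs(mant) != 0.5)         // d is not a power of two
    return false;
  double recip = 1.0 / d;
  return std::isnormal(recip);        // reject denormal or infinite reciprocals
}
```

For example, 2.0 and -0.25 pass, 3.0 fails because 1/3 is not exactly representable, and 2^1023 fails because its reciprocal is a denormal.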
Craig Topper [Sun, 8 Apr 2012 17:53:33 +0000 (17:53 +0000)]
Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154295
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 8 Apr 2012 17:51:45 +0000 (17:51 +0000)]
Teach LLVM about a PIE option which, when enabled on top of PIC, makes
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.
I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation, it's just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
to here is to expose a single bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]
I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154294
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 8 Apr 2012 17:20:55 +0000 (17:20 +0000)]
Move the TLSModel information into the TargetMachine rather than hiding
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154292
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Sun, 8 Apr 2012 14:53:14 +0000 (14:53 +0000)]
EngineBuilder::create is expected to take ownership of the TargetMachine passed to it. Delete it on error or when we create an interpreter that doesn't need it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154288
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 8 Apr 2012 14:37:02 +0000 (14:37 +0000)]
Remove an overzealous assert. The assert was trying to catch places
where a chain outside of the loop block-set ended up in the worklist for
scheduling as part of the contiguous loop. However, asserting the first
block in the chain is in the loop-set isn't a valid check -- we may be
forced to drag a chain into the worklist due to one block in the chain
being part of the loop even though the first block is *not* in the loop.
This occurs when we have been forced to form a chain early due to
un-analyzable branches.
No test case here as I have no idea how to even begin reducing one, and
it will be hopelessly fragile. We have to somehow end up with a loop
header of an inner loop which is a successor of a basic block with an
unanalyzable pair of branch instructions. Ow. Self-host triggers it so
it is unlikely it will regress.
This at least gets block placement back to passing selfhost and the test
suite. There are still a lot of slowdowns that I don't like coming out of
block placement, although there are now also a lot of speedups. =[ I'm
seeing swings in both directions up to 10%. I'm going to try to find
time to dig into this and see if we can turn this on for 3.1 as it does
a really good job of cleaning up after some loops that degraded with the
inliner changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154287
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 8 Apr 2012 14:37:01 +0000 (14:37 +0000)]
Add a debug-only 'dump' method to the BlockChain structure to ease
debugging.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154286
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 8 Apr 2012 14:36:56 +0000 (14:36 +0000)]
Teach InstCombine to nuke a common alloca pattern -- an alloca which has
GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154285
91177308-0d34-0410-b5e6-
96231b3b80d8
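The alloca pattern in the InstCombine change above (GEPs, bit casts, and stores, but nothing that reads the memory) can be modelled on a toy IR. This is a rough sketch only: Op, Inst and onlyWrittenNeverRead are hypothetical names, not LLVM classes, and the rule that a store of the pointer itself counts as an escape is an assumption of this sketch rather than something stated in the commit message.

```cpp
#include <vector>

// Toy instruction kinds; not LLVM's opcode set.
enum class Op { Alloca, GEP, BitCast, Store, Load, Call };

struct Inst {
  Op Kind;
  // For a Store user: true if the pointer is the value being stored
  // (it escapes), false if it is the address being written to.
  bool PointerIsStoredValue;
  std::vector<const Inst *> Users;
};

// Returns true if every transitive use of Ptr is a GEP, a bit cast, or a
// store *to* the memory. Such an allocation is never read and can go away
// together with all of its users.
bool onlyWrittenNeverRead(const Inst &Ptr) {
  for (const Inst *U : Ptr.Users) {
    switch (U->Kind) {
    case Op::GEP:
    case Op::BitCast:
      if (!onlyWrittenNeverRead(*U)) // derived pointers must also qualify
        return false;
      break;
    case Op::Store:
      if (U->PointerIsStoredValue) // assumption: storing the pointer escapes it
        return false;
      break;
    default: // loads, calls, anything else may observe the memory
      return false;
    }
  }
  return true;
}
```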
Nadav Rotem [Sun, 8 Apr 2012 12:54:54 +0000 (12:54 +0000)]
AVX2: Build splat vectors by broadcasting a scalar from the constant pool.
Previously we used three instructions to broadcast an immediate value into a
vector register.
On Sandybridge we continue to load the broadcasted value from the constant pool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154284
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Sun, 8 Apr 2012 11:53:54 +0000 (11:53 +0000)]
Remove old 'grep' lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154283
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Sun, 8 Apr 2012 11:52:52 +0000 (11:52 +0000)]
Formatting changes. Don't put spaces in front of some code, which only makes it look 'off'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154282
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Sun, 8 Apr 2012 11:00:38 +0000 (11:00 +0000)]
FileCheckize these testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154281
91177308-0d34-0410-b5e6-
96231b3b80d8
Bill Wendling [Sun, 8 Apr 2012 10:20:49 +0000 (10:20 +0000)]
Remove the 'Parent' pointer from the MDNodeOperand class.
An MDNode has a list of MDNodeOperands allocated directly after it as part of
its allocation. Therefore, the Parent of the MDNodeOperands can be found by
walking back through the operands to the beginning of that list. Mark the first
operand's value pointer as being the 'first' operand so that we know where the
beginning of said list is.
This saves a *lot* of space during LTO with -O0 -g flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154280
91177308-0d34-0410-b5e6-
96231b3b80d8
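The "walk back to the first operand" idea from the MDNodeOperand change above can be sketched like this. Node, Operand, createNode and getParent are hypothetical names for illustration; the real change stores the 'first' marker in spare bits of the value pointer, whereas this sketch uses a plain bool to keep the layout obvious. It also assumes at least one operand per node.

```cpp
#include <new>

struct Operand {
  void *Value;
  bool IsFirst; // stands in for the flag bit kept in the value pointer
};

// Header that is immediately followed in memory by its operands.
// alignas keeps the trailing Operand array correctly aligned.
struct alignas(Operand) Node {
  unsigned NumOperands;
  Operand *operands() { return reinterpret_cast<Operand *>(this + 1); }
};

// Allocates the node and its operands as one block (requires N >= 1).
Node *createNode(unsigned N) {
  void *Mem = ::operator new(sizeof(Node) + N * sizeof(Operand));
  Node *Nd = new (Mem) Node{N};
  Operand *Ops = Nd->operands();
  for (unsigned I = 0; I < N; ++I)
    new (&Ops[I]) Operand{nullptr, I == 0}; // only operand 0 is marked first
  return Nd;
}

// Recovers the owning node without a per-operand parent pointer: step back
// to the marked first operand, then back once more over the header.
Node *getParent(Operand *Op) {
  while (!Op->IsFirst)
    --Op;
  return reinterpret_cast<Node *>(Op) - 1;
}

void destroyNode(Node *Nd) { ::operator delete(Nd); }
```

The trade-off is a short linear walk on lookup in exchange for one pointer saved per operand.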
Bill Wendling [Sun, 8 Apr 2012 10:16:43 +0000 (10:16 +0000)]
Allow subclasses of the ValueHandleBase to store information as part of the
value pointer by making the value pointer into a pointer-int pair with 2 bits
available for flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154279
91177308-0d34-0410-b5e6-
96231b3b80d8
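Packing two flag bits into a pointer, as the ValueHandleBase change above describes, relies on the pointee being at least 4-byte aligned so the low two bits of the address are always zero. A minimal sketch of the technique follows; TaggedPtr is a hypothetical class written for this illustration and is not LLVM's own PointerIntPair.

```cpp
#include <cassert>
#include <cstdint>

template <typename T> class TaggedPtr {
  static_assert(alignof(T) >= 4, "need two free low bits in the address");
  static constexpr std::uintptr_t FlagMask = 0x3;
  std::uintptr_t Bits = 0;

public:
  explicit TaggedPtr(T *P = nullptr, unsigned Flags = 0) { set(P, Flags); }

  void set(T *P, unsigned Flags) {
    auto Raw = reinterpret_cast<std::uintptr_t>(P);
    assert((Raw & FlagMask) == 0 && "pointer not sufficiently aligned");
    assert(Flags <= FlagMask && "only two flag bits available");
    Bits = Raw | Flags; // flags live in the otherwise-unused low bits
  }

  // Mask the flags off before handing the pointer back.
  T *getPointer() const { return reinterpret_cast<T *>(Bits & ~FlagMask); }
  unsigned getFlags() const { return static_cast<unsigned>(Bits & FlagMask); }
};
```

The object stays the size of a single pointer, which is the point of the exercise when many handles are alive at once.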
Craig Topper [Sat, 7 Apr 2012 22:32:29 +0000 (22:32 +0000)]
Turn AVX2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. The same was already done for AVX1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154272
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 7 Apr 2012 21:57:43 +0000 (21:57 +0000)]
Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154268
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 7 Apr 2012 21:23:41 +0000 (21:23 +0000)]
Remove 'else' after 'if' that ends in return.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154267
91177308-0d34-0410-b5e6-
96231b3b80d8
Nadav Rotem [Sat, 7 Apr 2012 21:19:08 +0000 (21:19 +0000)]
1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
shuffle node because it could introduce new shuffle nodes that were not
supported efficiently by the target.
2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
second shuffle reverses the transformation of the first shuffle.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154266
91177308-0d34-0410-b5e6-
96231b3b80d8
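The restricted case in item 2 above, where the second shuffle reverses the first, amounts to checking that the two masks compose to the identity. A small sketch under stated assumptions: both shuffles take a single input vector, a mask entry M[i] means "lane i of the result comes from lane M[i] of the input", and a negative entry means an undefined lane. The function name is hypothetical and this is not the DAG combiner's code.

```cpp
#include <vector>

// Inner = shuffle(X, First); Outer = shuffle(Inner, Second).
// Returns true if every defined lane of Outer equals the same lane of X,
// so Outer can be replaced by X.
bool secondShuffleUndoesFirst(const std::vector<int> &First,
                              const std::vector<int> &Second) {
  if (First.size() != Second.size())
    return false;
  const int N = static_cast<int>(First.size());
  for (int I = 0; I < N; ++I) {
    int Mid = Second[I];
    if (Mid < 0)
      continue; // undefined lane in the outer shuffle: anything is fine
    if (Mid >= N)
      return false; // refers to a second operand, outside this sketch
    if (First[Mid] != I)
      return false; // lane I of Outer would not be lane I of X
  }
  return true;
}
```

For example, a rotate-by-one mask {1, 2, 3, 0} followed by {3, 0, 1, 2} passes the check, while applying {1, 2, 3, 0} twice does not.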
Duncan Sands [Sat, 7 Apr 2012 20:04:00 +0000 (20:04 +0000)]
Convert floating point division by a constant into multiplication by the
reciprocal if converting to the reciprocal is exact. Do it even if inexact
if -ffast-math. This substantially speeds up ac.f90 from the polyhedron
benchmarks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154265
91177308-0d34-0410-b5e6-
96231b3b80d8
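For the division-to-multiplication change above, the reciprocal of a binary floating-point constant is exact when the constant is a power of two and the reciprocal does not fall into the denormal range. The following is a minimal sketch of such a check for double; hasExactReciprocal and divideByConstant are hypothetical helpers, and rejecting denormal reciprocals is a conservative choice made for this sketch.

```cpp
#include <cmath>

// Returns true when 1.0 / C is exactly representable as a normal double,
// i.e. C is a normal power of two (of either sign) with a normal reciprocal.
bool hasExactReciprocal(double C) {
  if (!std::isnormal(C))
    return false; // rejects zero, denormals, infinities and NaN
  int Exp = 0;
  double Mant = std::frexp(C, &Exp); // C == Mant * 2^Exp, |Mant| in [0.5, 1)
  if (std::fabs(Mant) != 0.5)
    return false; // more than one significant bit: not a power of two
  return std::isnormal(1.0 / C);
}

// Uses the multiplication only when it gives the same result as the
// division; otherwise falls back to dividing.
double divideByConstant(double X, double C) {
  return hasExactReciprocal(C) ? X * (1.0 / C) : X / C;
}
```

Constants such as 4.0 or 0.25 qualify, while 10.0 does not, which is why the inexact case is only enabled under -ffast-math in the commit above.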
Chandler Carruth [Sat, 7 Apr 2012 20:01:31 +0000 (20:01 +0000)]
Perform partial SROA on the helper hashing structure. I really wish the
optimizers could do this for us, but expecting partial SROA of classes
with template methods through cloning is probably expecting too many
heroics. With this change, the begin/end pointer pairs which indicate
the status of each loop iteration are actually passed directly into each
layer of the combine_data calls, and the inliner has a chance to see
when most of the combine_data function could be deleted by inlining.
Similarly for 'length'.
We have to be careful to limit the places where in/out reference
parameters are used as those will also prevent the inliner / optimizers
from properly propagating constants.
With this change, LLVM is able to fully inline and unroll the hash
computation of small sets of values, such as two or three pointers.
These now decompose into essentially straight-line code with no loops or
function calls.
There is still one code quality problem to be solved with the hashing --
LLVM is failing to nuke the alloca. It removes all loads from the
alloca, leaving only lifetime intrinsics and dead(!!) stores to the
alloca. =/ Very unfortunate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154264
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 7 Apr 2012 19:22:18 +0000 (19:22 +0000)]
Fix ValueTracking to conclude that debug intrinsics are safe to
speculate. Without this, loop rotate (among many other places) would
suddenly stop working in the presence of debug info. I found this
looking at loop rotate, and have augmented its tests with a reduction
out of a very hot loop in yacr2 where failing to do this rotation costs
sometimes more than 10% in runtime performance, perturbing numerous
downstream optimizations.
This should have no impact on performance without debug info, but the
change in performance when debug info is enabled can be extreme. As
a consequence (and this is how I got to this yak) any profiling of
performance problems should be treated with deep suspicion -- they may
have been wildly inaccurate if debug info was enabled for profiling. =/
Just a heads up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@154263
91177308-0d34-0410-b5e6-
96231b3b80d8