Kevin Enderby [Fri, 22 Aug 2014 20:35:18 +0000 (20:35 +0000)]
Add the start of the support for llvm-objdump’s -private-headers for Mach-O files.
This adds the printing of the mach header. Load command printing will be next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216285 91177308-0d34-0410-b5e6-96231b3b80d8
Kevin Enderby [Fri, 22 Aug 2014 20:34:31 +0000 (20:34 +0000)]
Add a few missing mach header flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216284 91177308-0d34-0410-b5e6-96231b3b80d8
Reid Kleckner [Fri, 22 Aug 2014 19:29:17 +0000 (19:29 +0000)]
Fix PR17239 by changing the semantics of the RemainingArgsClass Option kind
This patch contains the LLVM side of the fix of PR17239.
This bug happens because the /link (clang-cl.exe argument) is
marked as "consume all remaining arguments". However, when inside a
response file, /link should only consume all remaining arguments inside
the response file where it is located, not the entire command line after
expansion.
My patch will change the semantics of the RemainingArgsClass kind to
always consume only until the end of the response file when the option
originally came from a response file. There are only two options in this
class: dash dash (--) and /link.
Reviewed By: rnk
Differential Revision: http://reviews.llvm.org/D4899
Patch by Rafael Auler!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216280 91177308-0d34-0410-b5e6-96231b3b80d8
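The response-file rule from the RemainingArgsClass commit above can be illustrated with a small sketch. It is not the LLVM option parser: `Arg`, `ResponseFileId`, and `consumeRemaining` are hypothetical names, and the sketch assumes every expanded argument remembers which response file (if any) it came from.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one argument after response-file expansion.
struct Arg {
  std::string Text;
  int ResponseFileId; // -1 when the argument came from the command line itself.
};

// Collect the arguments swallowed by a "consume all remaining arguments"
// option located at index OptIdx. If the option came from a response file,
// stop at the first argument that does not belong to that same file.
std::vector<std::string> consumeRemaining(const std::vector<Arg> &Args,
                                          std::size_t OptIdx) {
  std::vector<std::string> Consumed;
  const int Origin = Args[OptIdx].ResponseFileId;
  for (std::size_t I = OptIdx + 1; I < Args.size(); ++I) {
    if (Origin != -1 && Args[I].ResponseFileId != Origin)
      break; // End of the response file that contained the option.
    Consumed.push_back(Args[I].Text);
  }
  return Consumed;
}
```

With this model, a /link that sits inside a response file stops at the end of that file, while a /link given directly on the command line still consumes everything after it.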
Tom Stellard [Fri, 22 Aug 2014 18:49:35 +0000 (18:49 +0000)]
R600/SI: Use READ2/WRITE2 instructions for 64-bit mem ops with 32-bit alignment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216279 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Fri, 22 Aug 2014 18:49:33 +0000 (18:49 +0000)]
R600/SI: Use a ComplexPattern for DS loads and stores
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216278 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Fri, 22 Aug 2014 18:49:31 +0000 (18:49 +0000)]
R600/SI: Wrap local memory pointer in AssertZExt on SI
These pointers are really just offsets and they will always be
less than 16-bits. Using AssertZExt allows us to use computeKnownBits
to prove that these values are positive. We will use this information
in a later commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216277 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Fri, 22 Aug 2014 18:49:28 +0000 (18:49 +0000)]
R600/SI: Use correct helper class for DS_WRITE2 instructions
DS_1A uses a single offset encoding, so offset1 wasn't being
encoded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216276 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Fri, 22 Aug 2014 18:05:22 +0000 (18:05 +0000)]
[ARM] Move the implementation of the target hooks related to copy-related
instruction from ARMInstrInfo to ARMBaseInstrInfo.
That way, thumb mode can also benefit from the advanced copy optimization.
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216274 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Fri, 22 Aug 2014 17:11:04 +0000 (17:11 +0000)]
InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants
Consider:
%add = add nuw i32 %a, -16777216
%and = and i32 %add, 255
Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nuw' from the
instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216273 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Fri, 22 Aug 2014 16:41:23 +0000 (16:41 +0000)]
InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN
We can preserve nsw during this transform if -C won't overflow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216269 91177308-0d34-0410-b5e6-96231b3b80d8
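The INT_MIN restriction in the sub-to-add transform above can be seen with a little arithmetic. The following is only a sketch on 32-bit values; `subFits` and `addFits` are hypothetical helper names, not LLVM APIs. They report whether a signed subtraction or addition stays in range, which is the condition 'nsw' asserts.

```cpp
#include <cstdint>

// True when X - C does not overflow as a signed 32-bit subtraction.
bool subFits(std::int32_t X, std::int32_t C) {
  const std::int64_t Wide = static_cast<std::int64_t>(X) - C;
  return Wide >= INT32_MIN && Wide <= INT32_MAX;
}

// True when X + C does not overflow as a signed 32-bit addition.
bool addFits(std::int32_t X, std::int32_t C) {
  const std::int64_t Wide = static_cast<std::int64_t>(X) + C;
  return Wide >= INT32_MIN && Wide <= INT32_MAX;
}
```

For any C other than INT32_MIN, `subFits(X, C)` and `addFits(X, -C)` agree, so the flag carries over. INT32_MIN is the one constant whose negation is not representable: in wrapping arithmetic it negates to itself, and `subFits(-1, INT32_MIN)` holds while `addFits(-1, INT32_MIN)` does not.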
Alex Lorenz [Fri, 22 Aug 2014 16:29:45 +0000 (16:29 +0000)]
[Support] Fix the overflow bug in ULEB128 decoding.
Differential Revision: http://reviews.llvm.org/D5029
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216268 91177308-0d34-0410-b5e6-96231b3b80d8
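The ULEB128 commit above does not describe the bug itself, so the following is only a generic sketch of an overflow-aware decoder, not the code from llvm/Support. The function name `decodeULEB128` and its signature are chosen for illustration. It shows one common way such overflow is avoided: widening each 7-bit slice to 64 bits before shifting and rejecting slices that would be shifted out of the result.

```cpp
#include <cstddef>
#include <cstdint>

// Decode an unsigned LEB128 value from Bytes[0..Size). Returns false if the
// input is truncated or the value does not fit in 64 bits.
bool decodeULEB128(const std::uint8_t *Bytes, std::size_t Size,
                   std::uint64_t &Value, std::size_t &Length) {
  Value = 0;
  unsigned Shift = 0;
  for (std::size_t I = 0; I < Size; ++I) {
    // Widen to 64 bits before shifting; shifting a plain int here would
    // overflow once Shift reaches the width of int.
    const std::uint64_t Slice = Bytes[I] & 0x7f;
    if (Shift >= 64 || (Slice << Shift) >> Shift != Slice)
      return false; // Payload bits would be shifted out of the result.
    Value |= Slice << Shift;
    Shift += 7;
    if ((Bytes[I] & 0x80) == 0) {
      Length = I + 1;
      return true;
    }
  }
  return false; // Ran out of input before the terminating byte.
}
```

A five-byte input such as 0x80 0x80 0x80 0x80 0x10 encodes 2^32, which is exactly the kind of value a 32-bit shift would lose.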
Sasa Stankovic [Fri, 22 Aug 2014 09:23:22 +0000 (09:23 +0000)]
[mips] Don't use odd-numbered float registers for double arguments for fastcc
calling convention if FP is 64-bit and +nooddspreg is used.
Differential Revision: http://reviews.llvm.org/D4981.diff
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216262 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Fri, 22 Aug 2014 07:56:32 +0000 (07:56 +0000)]
InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants
Consider:
%add = add nsw i32 %a, -16777216
%and = and i32 %add, 255
Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nsw' from the
instruction.
This fixes PR20377.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216261 91177308-0d34-0410-b5e6-96231b3b80d8
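The 'nsw' case above can be checked with concrete numbers. The sketch below is illustrative only; `addFitsSigned` and `lowByteOfSum` are hypothetical helper names, not LLVM APIs. It models the two facts that matter: which bits the 'and' demands, and whether the signed addition overflows.

```cpp
#include <cstdint>

// True when A + C does not overflow as a signed 32-bit addition, which is the
// condition under which 'add nsw' yields a well-defined value.
bool addFitsSigned(std::int32_t A, std::int32_t C) {
  const std::int64_t Wide = static_cast<std::int64_t>(A) + C;
  return Wide >= INT32_MIN && Wide <= INT32_MAX;
}

// Low 8 bits of A + C computed with wrapping arithmetic: the only bits that
// 'and i32 %add, 255' demands.
std::uint32_t lowByteOfSum(std::int32_t A, std::int32_t C) {
  return (static_cast<std::uint32_t>(A) + static_cast<std::uint32_t>(C)) & 255u;
}
```

Taking %a = 16777216, both constants produce the same low byte, so shrinking the constant is fine for the demanded bits. However, 16777216 + (-16777216) stays in range while 16777216 + 2130706432 equals 2^31 and overflows, so the flag has to be dropped.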
Erik Eckstein [Fri, 22 Aug 2014 01:18:39 +0000 (01:18 +0000)]
fix: SLPVectorizer crashes for unreachable blocks containing not schedulable instructions.
In unreachable blocks it's legal to have instructions like "%x = op %x".
Such instructions are not schedulable. Therefore the SLPVectorizer has to check for
unreachable blocks and ignore them.
Fixes bug 20646.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216256 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Fri, 22 Aug 2014 01:18:18 +0000 (01:18 +0000)]
[dfsan] Fix non-determinism bug in non-zero label check annotator.
We now use a std::vector instead of a DenseSet to store the list of
label checks so that we can iterate over it deterministically.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216255 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Fri, 22 Aug 2014 00:40:43 +0000 (00:40 +0000)]
ValueTracking: Figure out more bits when looking at add/sub
Given something like X01XX + X01XX, we know that the result must look
like X1XXX.
Adapted from a patch by Richard Smith, test-case written by me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216250 91177308-0d34-0410-b5e6-96231b3b80d8
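The X01XX + X01XX claim in the ValueTracking commit above is small enough to verify exhaustively. The sketch below is not the ValueTracking code; `sumAlwaysHasBit3Set` is a hypothetical name, and the check is restricted to 5-bit values to match the example in the message.

```cpp
#include <cstdint>

// Exhaustively check the claim on 5-bit values: if both operands match the
// pattern X01XX (bit 3 clear, bit 2 set), bit 3 of their sum is always set.
bool sumAlwaysHasBit3Set() {
  for (std::uint32_t A = 0; A < 32; ++A) {
    for (std::uint32_t B = 0; B < 32; ++B) {
      const bool AMatches = (A & 0xCu) == 0x4u; // bits 3..2 == 01
      const bool BMatches = (B & 0xCu) == 0x4u; // bits 3..2 == 01
      if (AMatches && BMatches && ((A + B) & 0x8u) == 0)
        return false;
    }
  }
  return true;
}
```

The reasoning behind the result: the two set bits at position 2 always generate a carry into position 3, and since both operands have bit 3 clear, that carry is the only contribution there.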
Reid Kleckner [Fri, 22 Aug 2014 00:09:56 +0000 (00:09 +0000)]
SROA: Handle a case of store size being smaller than allocation size
In this case, we are creating an x86_fp80 slice for a union from C where
the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes,
and that's just fine. We can't, however, use regular loads and stores to
access the slice, because the store size is only 10 bytes / 80 bits.
Instead, use memcpy and memset.
Fixes PR18726.
Reviewed By: chandlerc
Differential Revision: http://reviews.llvm.org/D5012
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216248 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Thu, 21 Aug 2014 23:36:08 +0000 (23:36 +0000)]
Revert "X86: Align the stack on word boundaries in LowerFormalArguments()"
This (mostly) reverts commit r216119.
Somewhere during the review Reid committed r214980 which fixed this
another way, and I neglected to check that the testcase still failed
before committing.
I've left test/CodeGen/X86/aligned-variadic.ll around in case it adds
extra coverage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216246 91177308-0d34-0410-b5e6-96231b3b80d8
Reid Kleckner [Thu, 21 Aug 2014 23:24:08 +0000 (23:24 +0000)]
Add an explicit move constructor to SrcBuffer
MSVC can't synthesize the implicit one. Instead it tries to emit a copy
ctor which would call the deleted copy ctor of unique_ptr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216244 91177308-0d34-0410-b5e6-96231b3b80d8
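The pattern in the SrcBuffer commit above, a hand-written move constructor on a type that owns a unique_ptr, looks roughly like the following. This is a sketch, not the SrcBuffer definition: `OwnedBuffer` and `makeBuffers` are hypothetical names.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type that owns its buffer through unique_ptr.
struct OwnedBuffer {
  std::unique_ptr<std::string> Data;

  explicit OwnedBuffer(std::string Text)
      : Data(new std::string(std::move(Text))) {}

  // Spelled out by hand instead of relying on the compiler to generate it.
  OwnedBuffer(OwnedBuffer &&Other) : Data(std::move(Other.Data)) {}
};

// Store the move-only type in a vector; growth moves the existing elements.
std::vector<OwnedBuffer> makeBuffers() {
  std::vector<OwnedBuffer> Buffers;
  Buffers.push_back(OwnedBuffer("first"));
  Buffers.push_back(OwnedBuffer("second"));
  return Buffers;
}
```

Because the move constructor is declared explicitly, the copy constructor is implicitly deleted and containers have to move the elements rather than copy them.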
Juergen Ributzka [Thu, 21 Aug 2014 23:06:07 +0000 (23:06 +0000)]
[FastISel][AArch64] Add support for variable shift.
This adds the missing variable shift support for value type i8, i16, and i32.
This fixes <rdar://problem/18095685>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216242 91177308-0d34-0410-b5e6-96231b3b80d8
Philip Reames [Thu, 21 Aug 2014 22:53:49 +0000 (22:53 +0000)]
Minor refactor to make applying patches from 'Add a "probe-stack" attribute' review thread out of order easier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216241 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Thu, 21 Aug 2014 22:45:21 +0000 (22:45 +0000)]
Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks.
Somewhat unnoticed in the original implementation of discriminators, but
it could cause instructions to end up in new, small,
DW_TAG_lexical_blocks due to the use of DILexicalBlock to track
discriminator changes.
Instead, use DILexicalBlockFile which we already use to track file
changes without introducing new scopes, so it works well to track
discriminator changes in the same way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216239 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Thu, 21 Aug 2014 22:31:48 +0000 (22:31 +0000)]
name change: isPow2DivCheap -> isPow2SDivCheap
isPow2DivCheap
That name doesn't specify signed or unsigned.
Lazy as I am, I eventually read the function and variable comments. It turns out that this is strictly about signed div. But I discovered that the comments are wrong:
srl/add/sra
is not the general sequence for signed integer division by power-of-2. We need one more 'sra':
sra/srl/add/sra
That's the sequence produced in DAGCombiner. The first 'sra' may be removed when dividing by exactly '2', but that's a special case.
This patch corrects the comments, changes the name of the flag bit, and changes the name of the accessor methods.
No functional change intended.
Differential Revision: http://reviews.llvm.org/D5010
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216237 91177308-0d34-0410-b5e6-96231b3b80d8
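The sra/srl/add/sra sequence named in the isPow2SDivCheap commit above can be written out in C++ to see why each step is needed. This is a sketch rather than the DAGCombiner code; `sdivPow2` is a hypothetical name, and it assumes that '>>' on a negative int32_t is an arithmetic shift (guaranteed from C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cstdint>

// Signed division of X by 2^K (1 <= K <= 30), rounding toward zero, written
// as the four-step sequence sra/srl/add/sra.
std::int32_t sdivPow2(std::int32_t X, unsigned K) {
  // sra: all ones if X is negative, zero otherwise.
  const std::int32_t Sign = X >> 31;
  // srl: keep the low K bits of the sign mask, giving a bias of 2^K - 1 or 0.
  const std::uint32_t Bias = static_cast<std::uint32_t>(Sign) >> (32 - K);
  // add: bias negative inputs so the final shift rounds toward zero.
  const std::int32_t Adjusted = X + static_cast<std::int32_t>(Bias);
  // sra: the actual division.
  return Adjusted >> K;
}
```

For K = 1 the bias is just the sign bit, so a single logical shift of X by 31 would produce it directly; that is the special case in which the first 'sra' can be dropped.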
Quentin Colombet [Thu, 21 Aug 2014 22:23:52 +0000 (22:23 +0000)]
[PeepholeOptimizer] Enable the advanced copy optimization by default.
The advanced copy optimization does not yield any difference on the whole llvm
test-suite + SPECs, either in compile time or runtime (binaries are identical),
but has a big potential when data go back and forth between register files as
demonstrated with test/CodeGen/ARM/adv-copy-opt.ll.
Note: This was measured for both Os and O3 for armv7s, arm64, and x86_64.
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216236 91177308-0d34-0410-b5e6-96231b3b80d8
Philip Reames [Thu, 21 Aug 2014 22:19:16 +0000 (22:19 +0000)]
Whitespace change to reduce diff in future patch.
Patch 2 of 11 in 'Add a "probe-stack" attribute' review thread
Patch by: john.kare.alsaker@gmail.com
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216235 91177308-0d34-0410-b5e6-96231b3b80d8
Philip Reames [Thu, 21 Aug 2014 22:15:20 +0000 (22:15 +0000)]
[X86] Split out the logic to select the stack probe function (NFC)
Patch 1 of 11 in 'Add a "probe-stack" attribute' review thread.
Patch by: <john.kare.alsaker@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216233 91177308-0d34-0410-b5e6-96231b3b80d8
Robin Morisset [Thu, 21 Aug 2014 22:09:25 +0000 (22:09 +0000)]
Add hooks for emitLeading/TrailingFence
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216232 91177308-0d34-0410-b5e6-96231b3b80d8
Robin Morisset [Thu, 21 Aug 2014 21:50:01 +0000 (21:50 +0000)]
Rename AtomicExpandLoadLinked into AtomicExpand
AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of
a group that aims to make it more target-independent. See
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html
for details.
The command line option is "atomic-expand".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216231 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Thu, 21 Aug 2014 21:34:06 +0000 (21:34 +0000)]
[PeepholeOptimizer] Update the kill flags when extending the live-range of the
source of a copy.
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216229 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Bogner [Thu, 21 Aug 2014 21:09:24 +0000 (21:09 +0000)]
Fix a URL (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216228 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 21 Aug 2014 20:57:57 +0000 (20:57 +0000)]
[FastISel][AArch64] Use the correct register class to make the MI verifier happy.
This is mostly achieved by providing the correct register class manually,
because getRegClassFor always returns the GPR*AllRegClass for MVT::i32 and
MVT::i64.
Also clean up the code to use the FastEmitInst_* method whenever possible. This
makes sure that the operands' register class is properly constrained. For all
the remaining cases this adds the missing constrainOperandRegClass calls for
each operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216225 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Thu, 21 Aug 2014 20:44:56 +0000 (20:44 +0000)]
Explicitly pass ownership of the MemoryBuffer to AddNewSourceBuffer using std::unique_ptr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216223 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Thu, 21 Aug 2014 20:41:00 +0000 (20:41 +0000)]
R600/SI: Teach moveToVALU how to handle more S_LOAD_* instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216220 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Thu, 21 Aug 2014 20:40:58 +0000 (20:40 +0000)]
R600/SI: Make sure SCRATCH_WAVE_OFFSET is added as Live-In to the function
This fixes a crash in an ocl conformance test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216219 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Thu, 21 Aug 2014 20:40:56 +0000 (20:40 +0000)]
R600/SI: Remove unused SGPR spilling code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216218 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Thu, 21 Aug 2014 20:40:54 +0000 (20:40 +0000)]
R600/SI: Use eliminateFrameIndex() to expand SGPR spill pseudos
This will simplify the SGPR spilling and also allow us to use
MachineFrameInfo for calculating offsets, which should be more
reliable than our custom code.
This fixes a crash in some cases where a register would be spilled
in a branch such that the VGPR defined for spilling did not dominate
all the uses when restoring.
This fixes a crash in an ocl conformance test. The test requires
register spilling and is too big to include.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216217 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Thu, 21 Aug 2014 20:40:50 +0000 (20:40 +0000)]
R600/SI: Handle VCC in SIRegisterInfo::getPhysRegSubReg()
This fixes a crash in an ocl conformance test. The test requires
register spilling and is too big to include.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216216 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 21 Aug 2014 20:28:55 +0000 (20:28 +0000)]
Rewrite the gold plugin to fix pr19901.
There is a fundamental difference between how the gold API and lib/LTO view
the LTO process.
The gold API talks about a particular symbol in a particular file. The lib/LTO
API talks about a symbol in the merged module.
The merged module is then defined in terms of the IR semantics. In particular,
a linkonce_odr GV is only copied if it is used, since it is valid to drop
unused linkonce_odr GVs.
In the testcase in pr19901 both properties collide. What happens is that gold
asks us to keep a particular linkonce_odr symbol, but the IR linker doesn't
copy it to the merged module and we never have a chance to ask lib/LTO to keep
it.
This patch fixes it by having a more direct implementation of the gold API. If
it asks us to keep a symbol, we change the linkage so it is not linkonce. If it
says we can drop a symbol, we do so. All of this before we even send the module
to lib/Linker.
Since now we don't have to produce LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN,
during symbol resolution we can use a temporary LLVMContext and do lazy
module loading. This allows us to keep the minimum possible amount of
allocated memory around. This should also allow as much parallelism as
we want, since there is no shared context.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216215 91177308-0d34-0410-b5e6-96231b3b80d8
Jonathan Roelofs [Thu, 21 Aug 2014 20:09:15 +0000 (20:09 +0000)]
Satiate the sanitizer build bot
This fixes a missing initializer from r216182
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216212 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 21 Aug 2014 20:03:44 +0000 (20:03 +0000)]
Move some logic to populateLTOPassManager.
This will avoid code duplication in the next commit which calls it directly
from the gold plugin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216211 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Thu, 21 Aug 2014 19:50:07 +0000 (19:50 +0000)]
[AVX512] Add class to group common template arguments related to vector type
We discussed the issue of generality vs. readability of the AVX512 classes
recently. I proposed this approach to try to hide and centralize the mappings
we commonly perform based on the vector type. A new class X86VectorVTInfo
captures these.
The idea is to pass an instance of this class to classes/multiclasses instead
of the corresponding ValueType. Then the class/multiclass can use its field
for things that derive from the type rather than passing all those as separate
arguments.
I modified avx512_valign to demonstrate this new approach. As you can see
instead of 7 related template parameters we now have one. The downside is
that we have to refer to fields for the derived values. I named the argument
'_' in order to make this as invisible as possible. Please let me know if you
absolutely hate this. (Also once we allow local initializations in
multiclasses we can recover the original version by assigning the fields to
local variables.)
Another possible use-case for this class is to directly map things, e.g.:
RegisterClass KRC = X86VectorVTInfo<32, i16>.KRC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216209 91177308-0d34-0410-b5e6-96231b3b80d8
Alex Lorenz [Thu, 21 Aug 2014 19:23:25 +0000 (19:23 +0000)]
Coverage Mapping: add function's hash to coverage function records.
The profile data format was recently updated and the new indexing api
requires the code coverage tool to know the function's hash as well
as the function's name to get the execution counts for a function.
Differential Revision: http://reviews.llvm.org/D4994
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216207 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 21 Aug 2014 19:22:24 +0000 (19:22 +0000)]
llvm-gcc is dead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216206 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Fiselier [Thu, 21 Aug 2014 18:52:58 +0000 (18:52 +0000)]
[LIT] Remove documentation for method since it does not exist
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216204 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 21 Aug 2014 18:49:52 +0000 (18:49 +0000)]
Respect LibraryInfo in populateLTOPassManager and use it. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216203 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 21 Aug 2014 18:11:21 +0000 (18:11 +0000)]
Remove dead code. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216201 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Thu, 21 Aug 2014 18:10:07 +0000 (18:10 +0000)]
[AArch64] Run a peephole pass right after AdvSIMD pass.
The AdvSIMD pass may produce copies that are not coalescer-friendly. The
peephole optimizer knows how to fix that as demonstrated in the test case.
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216200 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 21 Aug 2014 18:02:25 +0000 (18:02 +0000)]
[FastISel][AArch64] Factor out ANDWri instruction generation into a helper function. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216199 91177308-0d34-0410-b5e6-96231b3b80d8
Moritz Roth [Thu, 21 Aug 2014 17:11:03 +0000 (17:11 +0000)]
Thumb1 load/store optimizer: Improve code to materialize new base register.
There are two add-immediate instructions in Thumb1: tADDi8 and tADDi3. Only
the latter supports using different source and destination registers, so
whenever we materialize a new base register (at a certain offset) we'd do
so by moving the base register value to the new register and then adding in
place. This patch changes the code to use a single tADDi3 if the offset is
small enough to fit in 3 bits.
Differential Revision: http://reviews.llvm.org/D5006
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216193 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Thu, 21 Aug 2014 17:10:00 +0000 (17:10 +0000)]
Use returns_nonnull in BumpPtrAllocator and MallocAllocator to avoid null-check in placement new
In both Clang and LLVM, this is a common pattern:
Size = sizeof(DeclRefExpr) + SomeExtraStuff;
void *Mem = Context.Allocate(Size, llvm::alignOf<DeclRefExpr>());
return new (Mem) DeclRefExpr(...);
The annoying thing is that because the default placement-new operator has a
nothrow specification, the compiler will insert a null check of Mem before
calling the DeclRefExpr constructor. This null check is redundant for us,
because we expect the allocation functions to never return null.
By annotating the allocator functions with returns_nonnull, we can optimize
away these checks. Compiling clang with a recent version of Clang and measuring
with:
$ perf stat -r20 bin/clang.patch -fsyntax-only -w gcc.c && perf stat -r20 bin/clang.orig -fsyntax-only -w gcc.c
Shows a 2.4% speed-up (+- 0.8%).
The pattern occurs in LLVM too. Measuring with -O3 (and now using bzip2.c
instead, because it's smaller):
$ perf stat -r20 bin/clang.patch -O3 -w bzip2.c && perf stat -r20 bin/clang.orig -O3 -w bzip2.c
Shows a 4.4% speed-up (+- 1%).
If anyone knows of a similar attribute we can use for MSVC, or some other
technique to get rid of the null check there, please let me know.
Differential Revision: http://reviews.llvm.org/D4989
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216192 91177308-0d34-0410-b5e6-96231b3b80d8
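The returns_nonnull technique from the allocator commit above can be sketched outside LLVM as follows. This is not the BumpPtrAllocator code: `allocateOrDie`, `Point`, `makePoint`, and the `SKETCH_RETURNS_NONNULL` macro are hypothetical names, and the attribute is only applied when the compiler defines `__GNUC__` (GCC and Clang), expanding to nothing elsewhere.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

#if defined(__GNUC__)
#define SKETCH_RETURNS_NONNULL __attribute__((returns_nonnull))
#else
#define SKETCH_RETURNS_NONNULL
#endif

// Allocation function that never returns null: it aborts on failure, so the
// annotation is truthful and callers need no null check.
SKETCH_RETURNS_NONNULL void *allocateOrDie(std::size_t Size) {
  void *Mem = std::malloc(Size);
  if (!Mem)
    std::abort();
  return Mem;
}

struct Point {
  int X, Y;
  Point(int X, int Y) : X(X), Y(Y) {}
};

// Same shape as the pattern in the message: allocate, then placement-new.
Point *makePoint(int X, int Y) {
  void *Mem = allocateOrDie(sizeof(Point));
  return new (Mem) Point(X, Y);
}
```

Whether the compiler actually removes the null check before the constructor call depends on the compiler version and optimization level; the annotation only gives it permission to do so.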
Juergen Ributzka [Thu, 21 Aug 2014 16:40:05 +0000 (16:40 +0000)]
[FastISel][AArch64] Remove redundant test.
These tests and many more are already covered by fast-isel-addressing-modes.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216186 91177308-0d34-0410-b5e6-96231b3b80d8
Jonathan Roelofs [Thu, 21 Aug 2014 14:35:47 +0000 (14:35 +0000)]
Add a thread-model knob for lowering atomics on baremetal & single threaded systems
http://reviews.llvm.org/D4984
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216182 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 21 Aug 2014 13:35:30 +0000 (13:35 +0000)]
Handle inlining in populateLTOPassManager like in populateModulePassManager.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216178 91177308-0d34-0410-b5e6-96231b3b80d8
Zinovy Nis [Thu, 21 Aug 2014 13:30:05 +0000 (13:30 +0000)]
[CLNUP] Remove return after llvm_unreachable. Thanks to Hal Finkel for pointing this out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216176 91177308-0d34-0410-b5e6-96231b3b80d8
Benjamin Kramer [Thu, 21 Aug 2014 13:28:02 +0000 (13:28 +0000)]
DAGCombiner: Make concat_vector combine safe for EVTs and concat_vectors with many arguments.
PR20677
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216175 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 21 Aug 2014 13:13:17 +0000 (13:13 +0000)]
Move DisableGVNLoadPRE from populateLTOPassManager to PassManagerBuilder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216174 91177308-0d34-0410-b5e6-96231b3b80d8
Josh Klontz [Thu, 21 Aug 2014 12:55:27 +0000 (12:55 +0000)]
X86AsmPrinter MCJIT MSVC bug fix.
Summary:
This bug was introduced in r213006 which makes an assumption that MCSection is COFF for Windows MSVC. This assumption is broken for MCJIT users where ELF is used instead [1]. The fix is to change the MCSection cast to a dyn_cast.
[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-December/068407.html.
Reviewers: majnemer
Reviewed By: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4872
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216173 91177308-0d34-0410-b5e6-96231b3b80d8
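The difference between an asserting cast and a checked cast, which is the heart of the X86AsmPrinter fix above, can be illustrated by analogy with standard C++. The sketch below uses `dynamic_cast` rather than LLVM's `cast`/`dyn_cast` templates, and `Section`, `SectionCOFF`, `SectionELF`, and `coffNameOrEmpty` are hypothetical stand-ins, not the MC classes.

```cpp
#include <string>

// Hypothetical section hierarchy with two object-file formats.
struct Section {
  virtual ~Section() = default;
};
struct SectionCOFF : Section {
  std::string Name = ".text";
};
struct SectionELF : Section {};

// Returns the COFF section name, or an empty string when the section is not
// COFF. The checked cast yields nullptr for other formats instead of assuming
// the dynamic type.
std::string coffNameOrEmpty(const Section &S) {
  if (const auto *COFF = dynamic_cast<const SectionCOFF *>(&S))
    return COFF->Name;
  return std::string();
}
```

Code that assumed every section was COFF would misbehave on the ELF object; the checked form simply takes the fallback path.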
Oliver Stannard [Thu, 21 Aug 2014 12:50:31 +0000 (12:50 +0000)]
[ARM] Enable DP copy, load and store instructions for FPv4-SP
The FPv4-SP floating-point unit is generally referred to as
single-precision only, but it does have double-precision registers and
load, store and GPR<->DPR move instructions which operate on them.
This patch enables the use of these registers, the main advantage of
which is that we now comply with the AAPCS-VFP calling convention.
This partially reverts r209650, which added some AAPCS-VFP support,
but did not handle return values or alignment of double arguments in
registers.
This patch also adds tests for Thumb2 code generation for
floating-point instructions and intrinsics, which previously only
existed for ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216172 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Thu, 21 Aug 2014 12:39:07 +0000 (12:39 +0000)]
Sort declarations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216171 91177308-0d34-0410-b5e6-96231b3b80d8
Benjamin Kramer [Thu, 21 Aug 2014 11:22:05 +0000 (11:22 +0000)]
Make format_object_base's destructor protected and non-virtual.
It's not meant to be used with operator delete and this avoids emitting virtual
dtors for every derived format object.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216170 91177308-0d34-0410-b5e6-96231b3b80d8
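A protected, non-virtual destructor on a polymorphic base, as in the format_object_base commit above, looks like this in isolation. The sketch is not the llvm/Support/Format.h code; `FormatBase`, `FormatInt`, and `renderThroughBase` are hypothetical names.

```cpp
#include <type_traits>

// Hypothetical base in the style described: polymorphic, but never deleted
// through a base pointer, so its destructor is protected and non-virtual.
class FormatBase {
protected:
  ~FormatBase() = default;

public:
  virtual int render() const = 0;
};

class FormatInt final : public FormatBase {
  int Value;

public:
  explicit FormatInt(int V) : Value(V) {}
  int render() const override { return Value; }
};

static_assert(!std::has_virtual_destructor<FormatBase>::value,
              "the base destructor stays non-virtual");

// Virtual dispatch still works through a base reference.
int renderThroughBase(const FormatBase &F) { return F.render(); }
```

Making the destructor protected turns an accidental `delete` through a base pointer into a compile error, which is what makes dropping `virtual` safe.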
Erik Verbruggen [Thu, 21 Aug 2014 10:45:30 +0000 (10:45 +0000)]
Reassociate x + -0.1234 * y into x - 0.1234 * y
This does not require -ffast-math, and it gives CSE/GVN more options to
eliminate duplicate expressions in, e.g.:
return ((x + 0.1234 * y) * (x - 0.1234 * y));
Differential Revision: http://reviews.llvm.org/D4904
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216169 91177308-0d34-0410-b5e6-96231b3b80d8
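The reassociation above is safe without -ffast-math because, assuming IEEE 754 arithmetic, negation is exact: x + (-c * y) and x - (c * y) round to the same value. The sketch below spells out both forms and the shared subexpression that CSE/GVN can then reuse; the function names are hypothetical.

```cpp
// The original form and the reassociated form of the same expression.
double addNegProduct(double X, double Y) { return X + -0.1234 * Y; }
double subProduct(double X, double Y) { return X - 0.1234 * Y; }

// The example from the commit message, written so that both factors share the
// subexpression 0.1234 * Y.
double productOfSumAndDifference(double X, double Y) {
  const double T = 0.1234 * Y; // Computed once, reused by both factors.
  return (X + T) * (X - T);
}
```

Once the negative constant is rewritten, both factors contain the identical product 0.1234 * y, so it only needs to be computed once.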
Benjamin Kramer [Thu, 21 Aug 2014 10:31:37 +0000 (10:31 +0000)]
X86: Turn redundant if into an assertion.
While there, remove no-op casts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216168 91177308-0d34-0410-b5e6-96231b3b80d8
Robert Khasanov [Thu, 21 Aug 2014 09:43:43 +0000 (09:43 +0000)]
[x86] Added _addcarry_ and _subborrow_ intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216164 91177308-0d34-0410-b5e6-96231b3b80d8
Robert Khasanov [Thu, 21 Aug 2014 09:34:12 +0000 (09:34 +0000)]
[x86] SMAP: added HasSMAP attribute for CLAC/STAC, corrected attributes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216163 91177308-0d34-0410-b5e6-96231b3b80d8
Robert Khasanov [Thu, 21 Aug 2014 09:27:00 +0000 (09:27 +0000)]
[x86] Broadwell: ADOX/ADCX. Added _addcarryx_u{32|64} intrinsics to LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216162 91177308-0d34-0410-b5e6-96231b3b80d8
Robert Khasanov [Thu, 21 Aug 2014 09:16:12 +0000 (09:16 +0000)]
[x86] Enable Broadwell target.
Added FeatureSMAP.
Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216161 91177308-0d34-0410-b5e6-96231b3b80d8
Zinovy Nis [Thu, 21 Aug 2014 08:25:45 +0000 (08:25 +0000)]
[INDVARS] Extend using of widening of induction variables for the cases of "sub nsw" and "mul nsw" instructions.
Currently only "add nsw" is widened. This patch eliminates tons of "sext" instructions for 64-bit code (and the corresponding target code) in cases like:
int N = 100;
float **A;
void foo(int x0, int x1)
{
  float * A_cur = &A[0][0];
  float * A_next = &A[1][0];
  for(int x = x0; x < x1; ++x)
  {
    // Currently only the [x+N] case is widened. The other 2 cases lead to sext.
    // This patch fixes that, so none of the 3 cases needs sext.
    const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N];
    A_next[x] = div;
  }
}
...
> clang++ test.cpp -march=core-avx2 -Ofast -fno-unroll-loops -fno-tree-vectorize -S -o -
Differential Revision: http://reviews.llvm.org/D4695
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216160 91177308-0d34-0410-b5e6-96231b3b80d8
Elena Demikhovsky [Thu, 21 Aug 2014 07:01:55 +0000 (07:01 +0000)]
IntelJITEventListener updates to fix breaks by recent changes to EngineBuilder and DIContext.
By Arch Robison.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216159 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Thu, 21 Aug 2014 05:55:13 +0000 (05:55 +0000)]
Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216158 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Thu, 21 Aug 2014 05:14:48 +0000 (05:14 +0000)]
InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1
Adapted from a patch by Richard Smith, test-case written by me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216157 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Thu, 21 Aug 2014 04:31:10 +0000 (04:31 +0000)]
Remove custom implementations of max/min in StringRef that were originally added to work around an old gcc bug. I believe it's been fixed by now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216156 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Fiselier [Thu, 21 Aug 2014 04:27:11 +0000 (04:27 +0000)]
add self to credits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216155 91177308-0d34-0410-b5e6-96231b3b80d8
Jiangning Liu [Thu, 21 Aug 2014 02:12:35 +0000 (02:12 +0000)]
Fix a bug around truncating vector in const prop.
In constant folding stage, "TRUNC" can't handle vector data type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216149 91177308-0d34-0410-b5e6-96231b3b80d8
Jiangning Liu [Thu, 21 Aug 2014 01:59:30 +0000 (01:59 +0000)]
Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216147 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Thu, 21 Aug 2014 00:19:16 +0000 (00:19 +0000)]
[PeepholeOptimizer] Take advantage of the isInsertSubreg property in the
advanced copy optimization.
This is the final step patch toward transforming:
udiv r0, r0, r2
udiv r1, r1, r3
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov r0, r1, d16
bx lr
into:
udiv r0, r0, r2
udiv r1, r1, r3
bx lr
Indeed, thanks to this patch, this optimization is able to look through
vmov.32 d16[0], r0
vmov.32 d16[1], r1
and is able to rewrite the following sequence:
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov r0, r1, d16
into simple generic GPR copies that the coalescer managed to remove.
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216144 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Thu, 21 Aug 2014 00:10:52 +0000 (00:10 +0000)]
[ARM] Mark VSETLNi32 with the InsertSubreg property and implement the related
target hook.
This patch teaches the compiler that:
dX = VSETLNi32 dY, rZ, imm
is the same as:
dX = INSERT_SUBREG dY, rZ, translateImmToSubIdx(imm)
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216143 91177308-0d34-0410-b5e6-96231b3b80d8
James Molloy [Thu, 21 Aug 2014 00:02:51 +0000 (00:02 +0000)]
[LoopVectorize] Up the maximum unroll factor to 4 for AArch64
Only for Cortex-A57 and Cyclone for now, where it has shown wins.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216141 91177308-0d34-0410-b5e6-96231b3b80d8
James Molloy [Wed, 20 Aug 2014 23:53:52 +0000 (23:53 +0000)]
[LoopVectorizer] Limit unroll factor in the presence of nested reductions.
If we have a scalar reduction, unrolling can increase the critical path length when the loop we're unrolling is inside another loop. Limit the unroll factor to 2 by default, so the critical path only gets increased by one reduction operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216140 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Wed, 20 Aug 2014 23:49:36 +0000 (23:49 +0000)]
Add isInsertSubreg property.
This patch adds a new property, isInsertSubreg, and the related target hooks
TargetInstrInfo::getInsertSubregInputs and
TargetInstrInfo::getInsertSubregLikeInputs, to specify that a target-specific
instruction is a (kind of) INSERT_SUBREG.
The approach is similar to r215394.
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216139 91177308-0d34-0410-b5e6-96231b3b80d8
Jonathan Roelofs [Wed, 20 Aug 2014 23:38:50 +0000 (23:38 +0000)]
Lower thumbv4t & thumbv5 lo->lo copies through a push-pop sequence
On pre-v6 hardware, 'MOV lo, lo' gives undefined results, so such copies need to
be avoided. This patch favors a simple, quick-to-implement approach at the expense
of performance... As they say: correctness first, then performance.
See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075998.html for a few
ideas on how to make this better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216138 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Wed, 20 Aug 2014 23:25:28 +0000 (23:25 +0000)]
Mention the right target hook in the comment on isExtractSubreg property.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216137 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Wed, 20 Aug 2014 23:13:02 +0000 (23:13 +0000)]
[PeepholeOptimizer] Take advantage of the isExtractSubreg property in the
advanced copy optimization.
This patch is a step toward transforming:
udiv r0, r0, r2
udiv r1, r1, r3
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov r0, r1, d16
bx lr
into:
udiv r0, r0, r2
udiv r1, r1, r3
bx lr
Indeed, thanks to this patch, this optimization is able to look through
vmov r0, r1, d16
but it does not understand yet
vmov.32 d16[0], r0
vmov.32 d16[1], r1
Upcoming patches will fix that and update the related test case.
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216136 91177308-0d34-0410-b5e6-96231b3b80d8
Yi Jiang [Wed, 20 Aug 2014 22:55:40 +0000 (22:55 +0000)]
New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain conditions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216135 91177308-0d34-0410-b5e6-96231b3b80d8
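The fold named in the commit title above can be checked by brute force on a narrow type. The following is a minimal sketch, not the InstCombine code: it uses 8-bit arithmetic, covers only the ult variant, and the helper names (originalForm, foldedForm, formsAgree) are hypothetical. The log does not spell out the required conditions; the sketch only shows one constant triple where the two forms agree and one where they do not.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Original form: (A + C1) u< C3 || (A + C2) u< C3, in wrapping 8-bit arithmetic.
inline bool originalForm(uint8_t A, uint8_t C1, uint8_t C2, uint8_t C3) {
  return static_cast<uint8_t>(A + C1) < C3 ||
         static_cast<uint8_t>(A + C2) < C3;
}

// Folded form: ((A & ~(C1 ^ C2)) + max(C1, C2)) u< C3.
inline bool foldedForm(uint8_t A, uint8_t C1, uint8_t C2, uint8_t C3) {
  uint8_t Masked = static_cast<uint8_t>(A & ~(C1 ^ C2));
  return static_cast<uint8_t>(Masked + std::max(C1, C2)) < C3;
}

// Compare both forms for every 8-bit value of A.
inline bool formsAgree(uint8_t C1, uint8_t C2, uint8_t C3) {
  for (unsigned A = 0; A < 256; ++A) {
    uint8_t V = static_cast<uint8_t>(A);
    if (originalForm(V, C1, C2, C3) != foldedForm(V, C1, C2, C3))
      return false;
  }
  return true;
}
```

With C1 = 0xF0, C2 = 0xD0, C3 = 4 the original test accepts A in [16, 20) or [48, 52); those two ranges differ only in bit 0x20, which is exactly C1 ^ C2, so the forms agree. With C2 = 0xE0 the ranges are [16, 20) and [32, 36), masking out bit 0x10 loses the first range, and the forms disagree, which is why the fold is conditional.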
Alexey Samsonov [Wed, 20 Aug 2014 22:46:38 +0000 (22:46 +0000)]
Don't allow MCStreamer::EmitIntValue to output 0-byte integers.
It makes no sense and can hide bugs. In particular, it led
to a left shift by 64 bits, which is undefined behavior,
properly reported by UBSan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216134 91177308-0d34-0410-b5e6-96231b3b80d8
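As an illustration of why a 0-byte size is dangerous, here is a minimal sketch of a width check. It is not the MCStreamer code, and fitsInBytes is a hypothetical helper; the exact expression that shifted by 64 bits is not shown in the log. The point is that a mask derived from the byte count ends up shifting a 64-bit value by its full width when the count is zero, so the size has to be rejected first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if Value fits in Size bytes as an unsigned
// integer. Size must be in [1, 8]; Size == 0 is rejected before the mask
// is built, because 64 - 8 * 0 would shift a 64-bit value by 64 bits,
// which is undefined behavior.
inline bool fitsInBytes(uint64_t Value, unsigned Size) {
  if (Size < 1 || Size > 8)
    return false;
  uint64_t Mask = ~uint64_t(0) >> (64 - 8 * Size);
  return (Value & Mask) == Value;
}
```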
Quentin Colombet [Wed, 20 Aug 2014 22:16:19 +0000 (22:16 +0000)]
[ARM] Mark VMOVRRD with the ExtractSubreg property and implement the related
target hook.
This patch teaches the compiler that:
rX, rY = VMOVRRD dZ
is the same as:
rX = EXTRACT_SUBREG dZ, ssub_0
rY = EXTRACT_SUBREG dZ, ssub_1
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216132 91177308-0d34-0410-b5e6-96231b3b80d8
Alexey Samsonov [Wed, 20 Aug 2014 21:56:43 +0000 (21:56 +0000)]
Fix undefined behavior (left shift of negative value) in SystemZ backend.
This bug is reported by UBSan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216131 91177308-0d34-0410-b5e6-96231b3b80d8
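For this class of bug (left shift of a negative value, which is undefined behavior before C++20), a common remedy is to shift the unsigned representation instead. The sketch below shows that general technique only; it is not the SystemZ change itself, and shiftLeftSigned is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Shift a signed value left without undefined behavior: perform the shift
// on the unsigned representation, then convert back. The conversion back
// to int64_t is implementation-defined before C++20 (two's complement on
// mainstream compilers) and well defined from C++20 on.
inline int64_t shiftLeftSigned(int64_t Value, unsigned Amount) {
  assert(Amount < 64 && "shift amount out of range");
  return static_cast<int64_t>(static_cast<uint64_t>(Value) << Amount);
}
```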
Quentin Colombet [Wed, 20 Aug 2014 21:51:26 +0000 (21:51 +0000)]
Add isExtractSubreg property.
This patch adds a new property, isExtractSubreg, and the related target hooks
TargetInstrInfo::getExtractSubregInputs and
TargetInstrInfo::getExtractSubregLikeInputs, to specify that a target-specific
instruction is a (kind of) EXTRACT_SUBREG.
The approach is similar to r215394.
<rdar://problem/12702965>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216130 91177308-0d34-0410-b5e6-96231b3b80d8
Alexey Samsonov [Wed, 20 Aug 2014 21:40:15 +0000 (21:40 +0000)]
Fix null reference creation in SelectionDAG constructor.
Store TargetSelectionDAGInfo as a pointer instead of a reference:
getSelectionDAGInfo() may not be implemented for certain backends
(e.g. it's not currently implemented for R600).
This bug is reported by UBSan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216129 91177308-0d34-0410-b5e6-96231b3b80d8
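The pointer-instead-of-reference change described above can be sketched with stand-in types. Everything here (SelectionInfo, Backend, Dag and their members) is hypothetical and only mirrors the shape of the problem: binding a reference to the result of dereferencing a possibly-null pointer is undefined behavior, whereas a pointer member can represent "not implemented" and be checked at the point of use.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not the LLVM classes.
struct SelectionInfo {
  int Id;
};

struct Backend {
  const SelectionInfo *Info = nullptr;
  // May legitimately return null when the backend implements nothing.
  const SelectionInfo *getSelectionInfo() const { return Info; }
};

class Dag {
  const SelectionInfo *TSI; // pointer member: null is representable
public:
  explicit Dag(const Backend &B) : TSI(B.getSelectionInfo()) {}
  bool hasSelectionInfo() const { return TSI != nullptr; }
  // Use the info if present, otherwise fall back to a default.
  int infoIdOr(int Default) const { return TSI ? TSI->Id : Default; }
};
```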
Alexey Samsonov [Wed, 20 Aug 2014 21:22:03 +0000 (21:22 +0000)]
Fix undefined behavior (left shift of negative value) in Hexagon backend.
This bug is reported by UBSan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216125 91177308-0d34-0410-b5e6-96231b3b80d8
Alexey Samsonov [Wed, 20 Aug 2014 20:57:26 +0000 (20:57 +0000)]
Cleanup: Delete seemingly unused reference to MachineDominatorTree from ScheduleDAGInstrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216124 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Wed, 20 Aug 2014 20:34:56 +0000 (20:34 +0000)]
Don't prevent a vselect of constants from becoming a single load (PR20648).
Fix for PR20648 - http://llvm.org/bugs/show_bug.cgi?id=20648
This patch checks the operands of a vselect to see if all values are constants.
If yes, bail out of any further attempts to create a blend or shuffle because
SelectionDAGLegalize knows how to turn this kind of vselect into a single load.
This already happens for machines without SSE4.1, so the added checks just send
more targets down that path.
Differential Revision: http://reviews.llvm.org/D4934
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216121 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Wed, 20 Aug 2014 19:58:59 +0000 (19:58 +0000)]
X86: Add missing triples from r216119
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216120 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Wed, 20 Aug 2014 19:40:59 +0000 (19:40 +0000)]
X86: Align the stack on word boundaries in LowerFormalArguments()
The goal of the patch is to implement section 3.2.3 of the AMD64 ABI
correctly. The controlling sentence is, "The size of each argument gets
rounded up to eightbytes. Therefore the stack will always be eightbyte
aligned." The equivalent sentence in the i386 ABI page 37 says, "At all
times, the stack pointer should point to a word-aligned area." For both
architectures, the stack pointer is not being rounded up to the nearest
eightbyte or word between the last normal argument and the first
variadic argument.
Patch by Thomas Jablin!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216119 91177308-0d34-0410-b5e6-96231b3b80d8
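The rounding that the ABI text above calls for can be sketched as a plain round-up-to-alignment helper. This is not the LowerFormalArguments() code; roundUpToAlignment is a hypothetical name, and the sketch assumes an eightbyte is 8 bytes on AMD64 and a word is 4 bytes on i386.

```cpp
#include <cassert>
#include <cstdint>

// Round Offset up to the next multiple of Align.
// Align must be a nonzero power of two (8 for an AMD64 eightbyte,
// 4 for an i386 word under the assumptions stated above).
inline uint64_t roundUpToAlignment(uint64_t Offset, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Offset + Align - 1) & ~(Align - 1);
}
```

For example, if the last fixed argument ends at stack offset 12 on AMD64, the first variadic argument would start at offset 16 rather than 12.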
Alexey Samsonov [Wed, 20 Aug 2014 19:36:05 +0000 (19:36 +0000)]
Fix null reference creation in ScheduleDAGInstrs constructor call.
Both MachineLoopInfo and MachineDominatorTree may be null in ScheduleDAGMI
constructor call. It is undefined behavior to take references to these values.
This bug is reported by UBSan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216118 91177308-0d34-0410-b5e6-96231b3b80d8
Keno Fischer [Wed, 20 Aug 2014 19:00:37 +0000 (19:00 +0000)]
Do not insert a tail call when returning multiple values on X86
Summary: This fixes http://llvm.org/bugs/show_bug.cgi?id=19530.
The problem is that X86ISelLowering erroneously thought the third call
was eligible for tail call elimination.
It would have been if its return value were actually the one returned
by the calling function, but here that is not the case and
additional values are being returned.
Test Plan: Test case from the original bug report is included.
Reviewers: rafael
Reviewed By: rafael
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D4968
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216117 91177308-0d34-0410-b5e6-96231b3b80d8
Alexey Samsonov [Wed, 20 Aug 2014 18:30:07 +0000 (18:30 +0000)]
Fix undefined behavior (left shift by 64 bits) in ScaledNumber::toString().
This bug is reported by UBSan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216116 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Wed, 20 Aug 2014 18:03:00 +0000 (18:03 +0000)]
critical-anti-dependency breaker: don't use reg def info from kill insts (PR20308)
In PR20308 ( http://llvm.org/bugs/show_bug.cgi?id=20308 ), the critical-anti-dependency breaker
caused a miscompile because it broke a WAR hazard using a register that it thinks is available
based on info from a kill inst. Until PR18663 is solved, we shouldn't use any def/use info from
a kill because they are really just nops.
This patch adds guard checks for kills around calls to ScanInstruction() where the DefIndices
array is set. For good measure, add an assert in ScanInstruction() so we don't hit this bug again.
The test case is a reduced version of the code from the bug report.
Differential Revision: http://reviews.llvm.org/D4977
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216114 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Wed, 20 Aug 2014 17:41:48 +0000 (17:41 +0000)]
[PeepholeOptimizer] Refactor the advanced copy optimization to take advantage of
the isRegSequence property.
This is a follow-up of r215394 and r215404, which respectively introduce the
isRegSequence property and use it for ARM.
Thanks to the property introduced by the previous commits, this patch is able
to optimize the following sequence:
vmov d0, r2, r3
vmov d1, r0, r1
vmov r0, s0
vmov r1, s2
udiv r0, r1, r0
vmov r1, s1
vmov r2, s3
udiv r1, r2, r1
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov r0, r1, d16
bx lr
into:
udiv r0, r0, r2
udiv r1, r1, r3
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov r0, r1, d16
bx lr
This patch refactors how the copy optimizations are done in the peephole
optimizer. Prior to this patch, we had one copy-related optimization that
replaced a copy or bitcast by a generic, more suitable (in terms of register
file), copy.
With this patch, the peephole optimizer features two copy-related optimizations:
1. One for rewriting generic copies to generic copies:
PeepholeOptimizer::optimizeCoalescableCopy.
2. One for replacing non-generic copies with generic copies:
PeepholeOptimizer::optimizeUncoalescableCopy.
The goals of these two optimizations are slightly different: one rewrites the
operand of the instruction (#1), the other kills off the non-generic instruction
and replaces it with a (sequence of) generic instruction(s).
Both optimizations rely on the ValueTracker introduced in r212100.
The ValueTracker has been refactored to use the information from the
TargetInstrInfo for non-generic instructions. As part of the refactoring, we
switched the tracking from the index of the definition to the actual register
(virtual or physical). This change provides better consistency with
register-related APIs and eases the use of the TargetInstrInfo.
Moreover, this patch introduces a new helper class CopyRewriter used to ease the
rewriting of generic copies (i.e., #1).
Finally, this patch adds a dead code elimination pass right after the peephole
optimizer to get rid of dead code that may appear after rewriting.
This is related to <rdar://problem/12702965>.
Review: http://reviews.llvm.org/D4874
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216088 91177308-0d34-0410-b5e6-96231b3b80d8
Andrew Trick [Wed, 20 Aug 2014 17:38:12 +0000 (17:38 +0000)]
Tweak CFGPrinter to wrap very long names.
I added wrapping to the CFGPrinter a while back so the -view-cfg
output is actually viewable. I've since encountered very long mangled
names with the same problem, so I'm slightly tweaking this code to
work in that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216087 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Wed, 20 Aug 2014 17:33:44 +0000 (17:33 +0000)]
Remove unused field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216086 91177308-0d34-0410-b5e6-96231b3b80d8