Chandler Carruth [Thu, 2 Oct 2014 06:52:19 +0000 (06:52 +0000)]
[x86] Switch some of the new consolidated vector tests to use
a bare-metal triple and have nice BB labels, etc.
No significant change here, just tidying up to have a consistent set of
OS-agnostic vector functionality here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218854 91177308-0d34-0410-b5e6-96231b3b80d8
Lang Hames [Thu, 2 Oct 2014 04:21:27 +0000 (04:21 +0000)]
[PBQP] Update doxygen comment style to match the rest of the file. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218849 91177308-0d34-0410-b5e6-96231b3b80d8
Lang Hames [Thu, 2 Oct 2014 04:17:36 +0000 (04:17 +0000)]
[PBQP] Add support for graph-level metadata to the PBQP graph. This will be used
in the future to attach useful information about the PBQP graph (e.g. the
associated MachineFunction, pointers to regalloc passes) to the graph itself,
making that information accessible to the solver. This should also allow the
PBQPBuilder interface to be simplified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218848 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Christopher [Thu, 2 Oct 2014 00:42:30 +0000 (00:42 +0000)]
Remove test directories with no tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218843 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Bogner [Thu, 2 Oct 2014 00:31:00 +0000 (00:31 +0000)]
InstrProf: Simplify counting a file's regions when writing coverage (NFC)
When writing a coverage mapping we iterate through the mapping regions
in order of FileID, but we were then repeatedly searching from the
beginning of the list to count the number of regions with a given
FileID.
It is simpler and more efficient to search forward from the current
iterator to find the number of regions.
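The forward search described above might look roughly like the sketch below. This is an illustration only, not the actual coverage writer code: `Region` and `countRegionsPerFile` are hypothetical names, and the regions are assumed to be sorted by FileID already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a coverage mapping region; only FileID matters.
struct Region {
  unsigned FileID;
};

// Returns one (FileID, count) pair per distinct FileID. Each run is counted
// by scanning forward from the current iterator, so the list is walked once
// instead of being searched from the beginning for every FileID.
std::vector<std::pair<unsigned, std::size_t>>
countRegionsPerFile(const std::vector<Region> &Regions) {
  // Precondition of this sketch: regions are ordered by FileID.
  assert(std::is_sorted(Regions.begin(), Regions.end(),
                        [](const Region &L, const Region &R) {
                          return L.FileID < R.FileID;
                        }));
  std::vector<std::pair<unsigned, std::size_t>> Counts;
  for (auto I = Regions.begin(), E = Regions.end(); I != E;) {
    unsigned ID = I->FileID;
    auto RunEnd = I;
    while (RunEnd != E && RunEnd->FileID == ID)
      ++RunEnd;
    Counts.emplace_back(ID, static_cast<std::size_t>(RunEnd - I));
    I = RunEnd; // Resume after the run that was just counted.
  }
  return Counts;
}
```

Because every region is visited exactly once, the cost is linear in the number of regions rather than proportional to regions times files.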
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218842 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 23:14:28 +0000 (23:14 +0000)]
[x86] Improve and correct how the new vector shuffle lowering was
matching and lowering 64-bit insertions.
The first problem was that we weren't looking through bitcasts to
discover that we *could* lower as insertions. Once fixed, we in turn
weren't looking through bitcasts to discover that we could fold a load
into the lowering. Once fixed, we weren't forming a SCALAR_TO_VECTOR
node around the inserted element and instead were passing a scalar to
a DAG node that expected a vector. It turns out there are some patterns
that will "lower" this into the correct asm, but the rest of the X86
backend is very unhappy with such antics.
This should fix a few more edge case regressions I've spotted going
through the regression test suite to enable the new vector shuffle
lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218839 91177308-0d34-0410-b5e6-96231b3b80d8
Bob Wilson [Wed, 1 Oct 2014 22:44:01 +0000 (22:44 +0000)]
PR21101: tablegen's FastISel emitter should filter out unused functions.
FastISel has a fixed set of virtual functions that are overridden by the
tablegen-generated code for each target. These functions are distinguished by
the kinds of operands, e.g., register + immediate = "ri". The FastISel emitter
has been blindly emitting functions with different combinations of operand
kinds, even for combinations that are completely unused by FastISel, e.g.,
"fastEmit_rrr". Change to filter out functions that will be irrelevant for
FastISel and do not bother generating the code for them. Also add explicit
"override" keywords for the virtual functions that are overridden.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218838 91177308-0d34-0410-b5e6-96231b3b80d8
Lang Hames [Wed, 1 Oct 2014 21:57:47 +0000 (21:57 +0000)]
[MCJIT] Don't crash in debugging output for sections that aren't emitted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218836 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Christopher [Wed, 1 Oct 2014 21:36:28 +0000 (21:36 +0000)]
constify the TargetMachine argument used in the subtarget and
lowering constructors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218832 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Wed, 1 Oct 2014 21:32:15 +0000 (21:32 +0000)]
DIBuilder: Remove duplicated comments, NFC
These comments already appear in the header, and some of them are
out-of-date anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218829 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Wed, 1 Oct 2014 21:32:12 +0000 (21:32 +0000)]
Revert "DIBuilder: Remove dead code"
This reverts commit r218820. It turns out that Adrian has an
outstanding SROA patch that uses this.
I've updated it to forward to `createExpression()`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218828 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Wed, 1 Oct 2014 21:20:06 +0000 (21:20 +0000)]
Lower FNEG ( FABS (x) ) -> FNABS (x) [X86 codegen] PR20578
Negative FABS of either a scalar or vector should be handled the same way
on x86 with SSE/AVX: a single OR instruction of the FP operand with a
constant to light up the sign bit(s).
http://llvm.org/bugs/show_bug.cgi?id=20578
Differential Revision: http://reviews.llvm.org/D5201
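As a scalar illustration of the sign-bit trick this commit relies on, the sketch below computes -fabs(x) with a single bitwise OR. It is not the backend code: the real lowering operates on SSE/AVX registers, `fnabs` is a hypothetical helper name, and the code assumes `float` is a 32-bit IEEE-754 value.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Computes -fabs(X) by OR-ing the sign bit into the bit pattern. Setting the
// sign bit forces a negative value regardless of the original sign, which is
// why no separate FABS step is needed.
float fnabs(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof Bits); // Well-defined way to read the bits.
  Bits |= 0x80000000u;                 // Sign bit of an IEEE-754 single.
  std::memcpy(&X, &Bits, sizeof X);
  return X;
}
```

The vector case is the same idea applied lane by lane, with the constant holding one sign-bit mask per element.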
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218822 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Wed, 1 Oct 2014 21:19:39 +0000 (21:19 +0000)]
Update test name to match changes made in r218783
Addressing post commit review feedback from Justin Bogner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218821 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Wed, 1 Oct 2014 21:14:20 +0000 (21:14 +0000)]
DIBuilder: Remove dead code
I neglected to update `DIBuilder::createPieceExpression()` in r218797,
which I noticed while rebasing a patch for PR17891. On closer
inspection, it looks like dead code.
If there are any downstream users of this, you should transition to the
more general `createExpression()`. Or, we can add this back, but then
it should just forward to `createExpression()`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218820 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 21:07:07 +0000 (21:07 +0000)]
[x86] Merge the remaining test cases into vector-blend.ll and remove all
the ISA-specific test files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218818 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Christopher [Wed, 1 Oct 2014 21:05:35 +0000 (21:05 +0000)]
Now that the optimization level is adjusting the feature string
before we hit the subtarget, remove the constructor parameter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218817 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 21:03:21 +0000 (21:03 +0000)]
[x86] Expand the ISA coverage of our blend test in preparation for
merging ISA-specific testing into this file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218816 91177308-0d34-0410-b5e6-96231b3b80d8
Argyrios Kyrtzidis [Wed, 1 Oct 2014 21:00:44 +0000 (21:00 +0000)]
Adds 'override' to overriding methods. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218815 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:56:57 +0000 (20:56 +0000)]
[x86] Merge the interesting test cases from blend-msb.ll into
vector-blend.ll and remove the former.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218814 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:52:55 +0000 (20:52 +0000)]
[x86] Move the AVX blend test to a generic name. I'm going to fold other
blend tests into this one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218813 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:50:58 +0000 (20:50 +0000)]
[x86] Remove a test that wasn't doing anything really. We have plenty of
better tests for zext of vectors at this point.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218811 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:49:54 +0000 (20:49 +0000)]
[x86] Add a 32-bit run to the sext test, and remove a sad vec_sext.ll
test file.
This old test had a bunch of functions that were never even checked. =/
The only thing it really did was to make sure that we did something
reasonable in 32-bit mode with SSE4.1. Adding another run line to the
main vector-sext.ll test seems a better way to do that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218810 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:41:36 +0000 (20:41 +0000)]
[x86] Teach both sext and zext vector tests to cover a nice wide range
of architectures: SSE2, SSSE3, SSE4.1, AVX, and AVX2.
Unfortunately, this exposes the absolute horror of the code we generate
for many of these patterns. Anyone wanting to familiarize themselves
with the x86 backend and improve performance could do a lot of good
sitting down and making these test cases not look so terrible. While the
new vector shuffle code I'm working on will help some, it won't fix all
of the crimes here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218807 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Christopher [Wed, 1 Oct 2014 20:38:26 +0000 (20:38 +0000)]
Rework the PPC TargetMachine so that the non-function specific
overrides happen at TargetMachine creation and not on every
subtarget creation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218805 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Christopher [Wed, 1 Oct 2014 20:38:22 +0000 (20:38 +0000)]
constify TargetMachine parameter for X86TargetLowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218804 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Wed, 1 Oct 2014 20:36:33 +0000 (20:36 +0000)]
Make the sqrt intrinsic return undef for a negative input.
As discussed here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140609/220598.html
And again here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/077168.html
The sqrt of a negative number when using the llvm intrinsic is undefined.
We should return undef rather than 0.0 to match the definition in the LLVM IR lang ref.
This change should not affect any code that isn't using "no-nans-fp-math";
i.e., no-nans is a requirement for generating the llvm intrinsic in place of a sqrt function call.
Unfortunately, the behavior introduced by this patch will not match current gcc, xlc, icc, and
possibly other compilers. The current clang/llvm behavior of returning 0.0 doesn't either.
We knowingly approve of this difference with the other compilers in an attempt to flag code
that is invoking undefined behavior.
A front-end warning should also try to convince the user that the program will fail:
http://llvm.org/bugs/show_bug.cgi?id=21093
Differential Revision: http://reviews.llvm.org/D5527
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218803 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:32:44 +0000 (20:32 +0000)]
[x86] Sort the ISA-specific RUN lines for vector-sext.ll to go from
oldest to newest. This makes more sense to me and is more consistent
with other tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218802 91177308-0d34-0410-b5e6-96231b3b80d8
Tim Northover [Wed, 1 Oct 2014 20:31:58 +0000 (20:31 +0000)]
ARM: yes it can (as of r218789)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218801 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:30:30 +0000 (20:30 +0000)]
[x86] Rename avx-{s,z}ext.ll to vector-{s,z}ext.ll.
These tests are far and away the best sext and zext tests we have for
vectors. I'm going to merge the other similar tests into them and expand
the ISA coverage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218800 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:27:16 +0000 (20:27 +0000)]
[x86] Clean up and regenerate the checks for avx-zext.ll using the new
script.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218799 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Wed, 1 Oct 2014 20:26:08 +0000 (20:26 +0000)]
DIBuilder: Encapsulate DIExpression's element type
`DIExpression`'s elements are 64-bit integers that are stored as
`ConstantInt`. The accessors already encapsulate the storage. This
commit updates the `DIBuilder` API to also encapsulate that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218797 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:19:45 +0000 (20:19 +0000)]
[x86] Generate the FileCheck assertions for avx-blend.ll with my new
script to make them nice and predictable. This will ease updating them
for the new vector shuffle lowering and seeing the delta if any.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218795 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 20:19:32 +0000 (20:19 +0000)]
[x86] Clean up and generate detailed FileCheck assertions for
avx-sext.ll using my new script.
Also add an AVX2 mode to this test.
Part of cleaning up the test suite before enabling the new vector
shuffle lowering. This also highlights some of the abysmal failures of
the old shuffle lowering. Check out those 'pinsrw' and 'pextrw'
sequences!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218794 91177308-0d34-0410-b5e6-96231b3b80d8
Bruno Cardoso Lopes [Wed, 1 Oct 2014 20:07:13 +0000 (20:07 +0000)]
[MemoryDepAnalysis] Fix compile time slowdown
- Problem
One program takes ~3min to compile under -O2. This happens after a certain
function A is inlined ~700 times in a function B, inserting thousands of new
BBs. This leads to 80% of the compilation time spent in
GVN::processNonLocalLoad and
MemoryDependenceAnalysis::getNonLocalPointerDependency, while searching for
nonlocal information for basic blocks.
Usually, to avoid spending a long time processing nonlocal loads, GVN bails out
if it gets more than 100 deps as a result from
MD->getNonLocalPointerDependency. However, this only happens *after* all
nonlocal information for BBs has been computed, which is the bottleneck in
this scenario. For instance, there are 8280 times where
getNonLocalPointerDependency returns deps with more than 100 BBs and, of
those, 600 times where it returns more than 1000 blocks.
- Solution
Bail out early during the nonlocal info computation whenever we reach a
specified threshold. This patch proposes a threshold of 100 BBs, which
reduces the compile time from 3min to 23s.
- Testing
The test-suite showed no compile-time or execution-time regressions.
Some numbers from my machine (x86_64 darwin):
- 17s under -Oz (which avoids inlining).
- 1.3s under -O1.
- 2m51s under -O2 ToT
*** 23s under -O2 w/ Result.size() > 100
- 1m54s under -O2 w/ Result.size() > 500
With NumResultsLimit = 100, GVN yields the same outcome as in the
unlimited 3min version.
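A rough sketch of the early bail-out idea follows. It is not the MemoryDependenceAnalysis code: `CFG`, `collectNonLocalBlocks`, and the plain predecessor walk are hypothetical simplifications, kept only to show the limit being checked while results accumulate rather than after the whole walk has finished.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG: Preds[B] lists the predecessor blocks of block B.
using CFG = std::vector<std::vector<unsigned>>;

// Walks backwards from Start, appending every visited block to Result.
// Returns false as soon as Result holds more than Limit blocks, leaving
// Result partially filled, instead of completing an expensive walk whose
// oversized result the caller would throw away anyway.
bool collectNonLocalBlocks(const CFG &Preds, unsigned Start, std::size_t Limit,
                           std::vector<unsigned> &Result) {
  assert(Start < Preds.size() && "start block must exist in the CFG");
  std::vector<bool> Visited(Preds.size(), false);
  std::vector<unsigned> Worklist{Start};
  Visited[Start] = true;
  while (!Worklist.empty()) {
    unsigned BB = Worklist.back();
    Worklist.pop_back();
    Result.push_back(BB);
    if (Result.size() > Limit)
      return false; // Early bail-out: threshold exceeded mid-walk.
    for (unsigned P : Preds[BB])
      if (!Visited[P]) {
        Visited[P] = true;
        Worklist.push_back(P);
      }
  }
  return true;
}
```

With a limit of 100, a query that would touch thousands of blocks stops after about a hundred steps, which is where the compile-time saving comes from.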
http://reviews.llvm.org/D5532
rdar://problem/18188041
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218792 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Wed, 1 Oct 2014 19:39:32 +0000 (19:39 +0000)]
Don't repeat function/variable name in comment. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218791 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Wed, 1 Oct 2014 19:28:11 +0000 (19:28 +0000)]
[X86 disasm tablegen backend] Clean up numPhysicalOperands asserts
No functionality change intended.
This implements Elena's idea to put the new additionalOperand outside the
switch to cover all cases
(http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140929/237763.html).
Note that the only nontrivial change is in MRMSrcMemFrm. This requires an
inclusive interval of [2, 4] because we have a prefix-dependent *optional*
immediate operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218790 91177308-0d34-0410-b5e6-96231b3b80d8
Tim Northover [Wed, 1 Oct 2014 19:21:03 +0000 (19:21 +0000)]
ARM: allow copying of CPSR when all else fails.
As with x86 and AArch64, certain situations can arise where we need to spill
CPSR in the middle of a calculation. These should be avoided where possible
(MRS/MSR is rather expensive), which ARM is actually better at than the other
two since it tries to Glue defs to uses, but as a last ditch effort, copying is
better than crashing.
rdar://problem/18011155
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218789 91177308-0d34-0410-b5e6-96231b3b80d8
Adrian Prantl [Wed, 1 Oct 2014 18:55:02 +0000 (18:55 +0000)]
Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.
Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.
By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.
The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)
This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.
What this patch doesn't do:
This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.
http://reviews.llvm.org/D4919
rdar://problem/17994491
Thanks to dblaikie and dexonsmith for reviewing this patch!
Note: I accidentally committed a bogus older version of this patch previously.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218787 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Wed, 1 Oct 2014 18:49:58 +0000 (18:49 +0000)]
LTO: Add missing target triple from r218784
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218786 91177308-0d34-0410-b5e6-96231b3b80d8
Reed Kotler [Wed, 1 Oct 2014 18:47:02 +0000 (18:47 +0000)]
Add fptrunc to mips fast-isel
Summary: Implement conversion of 64 to 32 bit floating point numbers (fptrunc) in mips fast-isel
Test Plan:
fptrunc.ll
checked also with 4 internal mips build bot flavors mips32r1/mips32r2 and at -O0 and -O2
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: rfuhler
Differential Revision: http://reviews.llvm.org/D5553
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218785 91177308-0d34-0410-b5e6-96231b3b80d8
Duncan P. N. Exon Smith [Wed, 1 Oct 2014 18:36:03 +0000 (18:36 +0000)]
LTO: Ignore disabled diagnostic remarks
r206400 and r209442 added remarks that are disabled by default.
However, if a diagnostic handler is registered, the remarks are sent
unfiltered to the handler. This is the right behaviour for clang, since
it has its own filters.
However, the diagnostic handler exposed in the LTO API receives only the
severity and message. It doesn't have the information to filter by pass
name. For LTO, disabled remarks should be filtered by the producer.
I've changed `LLVMContext::setDiagnosticHandler()` to take a `bool`
argument indicating whether to respect the built-in filters. This
defaults to `false`, so other consumers don't have a behaviour change,
but `LTOCodeGenerator::setDiagnosticHandler()` sets it to `true`.
To make this behaviour testable, I added a `-use-diagnostic-handler`
command-line option to `llvm-lto`.
This fixes PR21108.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218784 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Wed, 1 Oct 2014 18:29:44 +0000 (18:29 +0000)]
Add an immovable type to test Optional<T>::emplace more rigorously after r218732.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218783 91177308-0d34-0410-b5e6-96231b3b80d8
Adrian Prantl [Wed, 1 Oct 2014 18:10:54 +0000 (18:10 +0000)]
Revert r218778 while investigating buildbot breakage.
"Move the complex address expression out of DIVariable and into an extra"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218782 91177308-0d34-0410-b5e6-96231b3b80d8
Adrian Prantl [Wed, 1 Oct 2014 17:55:39 +0000 (17:55 +0000)]
Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.
Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.
By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.
The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)
This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.
What this patch doesn't do:
This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.
http://reviews.llvm.org/D4919
rdar://problem/17994491
Thanks to dblaikie and dexonsmith for reviewing this patch!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218778 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Wed, 1 Oct 2014 17:15:17 +0000 (17:15 +0000)]
R600: Call EmitFunctionHeader() in the AsmPrinter to populate the ELF symbol table
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218776 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Wed, 1 Oct 2014 17:14:57 +0000 (17:14 +0000)]
C API: Add LLVMCloneModule()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218775 91177308-0d34-0410-b5e6-96231b3b80d8
Jingyue Wu [Wed, 1 Oct 2014 15:22:13 +0000 (15:22 +0000)]
Revert r216862 due to a performance regression
Reported by Alexey Volkov in PR21115
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218771 91177308-0d34-0410-b5e6-96231b3b80d8
Toma Tabacu [Wed, 1 Oct 2014 14:53:19 +0000 (14:53 +0000)]
[mips] Rename emit and parse functions for the .cpload assembler directive. NFC.
Summary: It's better if we have a consistent name for .cpload-related functions.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5437
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218768 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Wed, 1 Oct 2014 14:44:45 +0000 (14:44 +0000)]
R600/SI: Add a generic pseudo EXP instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218767 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Wed, 1 Oct 2014 14:44:43 +0000 (14:44 +0000)]
R600/SI: Add generic pseudo MTBUF instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218766 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Stellard [Wed, 1 Oct 2014 14:44:42 +0000 (14:44 +0000)]
R600/SI: Add generic pseudo SMRD instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218765 91177308-0d34-0410-b5e6-96231b3b80d8
Oliver Stannard [Wed, 1 Oct 2014 13:13:18 +0000 (13:13 +0000)]
[ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5
Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions
when targeting ARMv8, but they are actually present on any target with
FP-ARMv8. Note that FP-ARMv8 is called FPv5 when it is part of an
M-profile core, but they have the same instructions so we model them
both as FPARMv8 in the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218763 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 11:14:02 +0000 (11:14 +0000)]
[x86] Fix a few more tiny patterns with the new vector shuffle lowering
that keep cropping up in the regression test suite.
This also addresses one of the issues raised on the mailing list with
failing to form 'movsd' in as many cases as we realistically should.
There will be corresponding patches forthcoming for v4f32 at least. This
was a lot of fuss for a relatively small gain, but all the fuss was on
my end trying different ways of holding the pieces of the x86 fragment
patterns *just right*. Now that it works, the code is reasonably simple.
In the new test cases I'm adding here, v2i64 sticks out as just plain
horrible. I've not come up with any great ideas here other than that it
would be nice to recognize when we're *going* to take a domain crossing
hit and cross earlier to get the decent instructions. At least with AVX
it is slightly less silly....
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218756 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 11:13:57 +0000 (11:13 +0000)]
[x86] Delete some extraneous logic from the new vector shuffle lowering.
Nothing was relying on this and there are potentially some edge cases
in which it would not be correct. Removing it seems better than trying
to "fix" it as nothing was relying on it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218755 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Coxon [Wed, 1 Oct 2014 10:13:59 +0000 (10:13 +0000)]
[AArch64] Allow access to all system registers with MRS/MSR instructions.
The A64 instruction set includes a generic register syntax for accessing
implementation-defined system registers. The syntax for these registers is:
S<op0>_<op1>_<CRn>_<CRm>_<op2>
The encoding space permitted for implementation-defined system registers
is:
op0  op1  CRn   CRm   op2
11   xxx  1x11  xxxx  xxx
The full encoding space can now be accessed:
op0  op1  CRn   CRm   op2
xx   xxx  xxxx  xxxx  xxx
This is useful to anyone needing to write assembly code supporting new
system registers before the assembler has learned the official names for
them.
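To make the generic syntax concrete, here is a small sketch of how such a name could be parsed into its five fields. It is not the assembler's implementation: `parseGenericSysReg` is a hypothetical helper, the `C` prefix on CRn and CRm (as in `S3_0_C15_C2_1`) and the 16-bit packing order are assumptions of this sketch, and only the field widths (2, 3, 4, 4, 3 bits) come from the table above.

```cpp
#include <cstddef>
#include <cstdint>
#include <regex>
#include <string>

// Parses a name of the assumed form S<op0>_<op1>_C<n>_C<m>_<op2> and, on
// success, stores the packed fields in Encoding and returns true.
bool parseGenericSysReg(const std::string &Name, std::uint16_t &Encoding) {
  // op0: 0-3, op1: 0-7, CRn/CRm: 0-15, op2: 0-7.
  static const std::regex Pattern(
      "S([0-3])_([0-7])_C([0-9]|1[0-5])_C([0-9]|1[0-5])_([0-7])");
  std::smatch M;
  if (!std::regex_match(Name, M, Pattern))
    return false;
  auto Field = [&M](std::size_t I) {
    return static_cast<unsigned>(std::stoul(M[I].str()));
  };
  // Illustrative packing: op0 in the top two bits, op2 in the low three.
  Encoding = static_cast<std::uint16_t>((Field(1) << 14) | (Field(2) << 11) |
                                        (Field(3) << 7) | (Field(4) << 3) |
                                        Field(5));
  return true;
}
```

Under these assumptions, `S3_0_C15_C2_1` yields op0=3, op1=0, CRn=15, CRm=2, op2=1, while out-of-range fields such as `S4_...` or `C16` are rejected.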
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218753 91177308-0d34-0410-b5e6-96231b3b80d8
Evgeniy Stepanov [Wed, 1 Oct 2014 10:07:28 +0000 (10:07 +0000)]
Revert r218721, r218735.
Failing bootstrap on Linux (arm, x86).
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13139/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost/builds/470
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/8518
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218752 91177308-0d34-0410-b5e6-96231b3b80d8
Asiri Rathnayake [Wed, 1 Oct 2014 09:59:45 +0000 (09:59 +0000)]
Add missing natural vector cast.
Summary: The natural vector cast node (similar to bitcast) AArch64ISD::NVCAST
was introduced in r217159 and r217138. This patch adds a missing cast from
v2f32 to v1i64 which is causing some compilation failures. Also added test
cases to cover various modimm types and BUILD_VECTORs with i64 elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218751 91177308-0d34-0410-b5e6-96231b3b80d8
NAKAMURA Takumi [Wed, 1 Oct 2014 09:14:43 +0000 (09:14 +0000)]
ADTTests/OptionalTest.cpp: Use LLVM_DELETED_FUNCTION.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218750 91177308-0d34-0410-b5e6-96231b3b80d8
Oliver Stannard [Wed, 1 Oct 2014 09:02:17 +0000 (09:02 +0000)]
[ARM] Add support for Cortex-M7, FPv5-SP and FPv5-DP (LLVM)
The Cortex-M7 has 3 options for its FPU: none, FPv5-SP-D16 and
FPv5-DP-D16. FPv5 has the same instructions as FP-ARMv8, so it can be
modelled using the same target feature, and all double-precision
operations are already disabled by the fp-only-sp target features.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218747 91177308-0d34-0410-b5e6-96231b3b80d8
Daniel Sanders [Wed, 1 Oct 2014 08:26:55 +0000 (08:26 +0000)]
[mips] Fix disassembly of [ls][wd]c[23], cache, and pref
Fixes PR21015, and PR20993.
Patch by Jun Koi
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218745 91177308-0d34-0410-b5e6-96231b3b80d8
Sasa Stankovic [Wed, 1 Oct 2014 08:22:21 +0000 (08:22 +0000)]
[mips] For indirect calls we don't need $gp to point to .got. The MIPS linker
doesn't generate a lazy binding stub for a function whose address is taken in
the program.
Differential Revision: http://reviews.llvm.org/D5067
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218744 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Bogner [Wed, 1 Oct 2014 05:45:45 +0000 (05:45 +0000)]
test: XFAIL the non-darwin gmlt test on darwin
r218702 disabled a -gmlt optimization for darwin, but this means the
non-darwin test isn't working there anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218742 91177308-0d34-0410-b5e6-96231b3b80d8
Lang Hames [Wed, 1 Oct 2014 04:11:13 +0000 (04:11 +0000)]
[MCJIT] Turn the getSymbolAddress free function created in r218626 into a static
member of RTDyldMemoryManager (and rename to getSymbolAddressInProcess).
The functionality this provides is very specific to RTDyldMemoryManager, so it
makes sense to keep it in that class to avoid accidental re-use.
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218741 91177308-0d34-0410-b5e6-96231b3b80d8
Nick Lewycky [Wed, 1 Oct 2014 03:37:34 +0000 (03:37 +0000)]
Fix typo in comment from r218733
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218739 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Bogner [Wed, 1 Oct 2014 03:31:58 +0000 (03:31 +0000)]
InstrProf: Make coverage::Counter comparable
I'll be using this in a clang change very soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218736 91177308-0d34-0410-b5e6-96231b3b80d8
Gerolf Hoflehner [Wed, 1 Oct 2014 03:24:39 +0000 (03:24 +0000)]
[InstCombine] Fix for assert build failures caused by r218721
The icmp-select-icmp optimization made the implicit assumption
that the select-icmp instructions are in the same block and asserted on it.
The fix explicitly checks for that condition and conservatively suppresses
the optimization when it is violated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218735 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 03:19:43 +0000 (03:19 +0000)]
[x86] Teach the new vector shuffle lowering to be even more aggressive
in exposing the scalar value to the broadcast DAG fragment so that we
can catch even reloads and fold them into the broadcast.
This is somewhat magical I'm afraid but seems to work. It is also what
the old lowering did, and I've switched an old test to run both
lowerings demonstrating that we get the same result.
Unlike the old code, I'm not lowering f32 or f64 scalars through this
path when we only have AVX1. The target patterns include pretty heinous
code to re-cast those as shuffles when the scalar happens to not be
spilled because AVX1 provides no broadcast mechanism from registers
whatsoever. This is terribly brittle. I'd much rather go through our
generic lowering code to get this. If needed, we can add a peephole to
get even more opportunities to broadcast-from-spill-slots that are
exposed post-RA, but my suspicion is this just doesn't matter that much.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218734 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 02:25:54 +0000 (02:25 +0000)]
[x86] Hoist the zext-lowering up in the v4i32 lowering routine -- it is
the same speed as pshufd but we can fold loads into the pmovzx
instructions.
This fixes some regressions that came up in the regression test suite
for the new vector shuffle lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218733
91177308-0d34-0410-b5e6-
96231b3b80d8
Jordan Rose [Wed, 1 Oct 2014 02:12:35 +0000 (02:12 +0000)]
Add an emplace(...) method to llvm::Optional<T>.
This can be used for in-place initialization of non-moveable types.
For compilers that don't support variadic templates, only up to four
arguments are supported. We can always add more, of course, but this
should be good enough until we move to a later MSVC that has full
support for variadic templates.
Inspired by std::experimental::optional from the "Library Fundamentals" C++ TS.
Reviewed by David Blaikie.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218732
91177308-0d34-0410-b5e6-
96231b3b80d8
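The emplace(...) entry above (r218732) can be illustrated with a minimal sketch of in-place construction for a type that cannot be moved or copied. This is not llvm::Optional itself: `SimpleOptional` and `Pinned` are hypothetical names invented for the example, and the sketch assumes a compiler with variadic template support.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
  int A, B;
  Pinned(int A, int B) : A(A), B(B) {}
  Pinned(const Pinned &) = delete;
  Pinned &operator=(const Pinned &) = delete;
};

// Hypothetical, stripped-down optional; only what emplace needs.
template <typename T> class SimpleOptional {
  alignas(T) unsigned char Storage[sizeof(T)];
  T *Ptr = nullptr; // Non-null once a value has been constructed.

public:
  SimpleOptional() = default;
  SimpleOptional(const SimpleOptional &) = delete;
  SimpleOptional &operator=(const SimpleOptional &) = delete;
  ~SimpleOptional() { reset(); }

  // Destroys the held value, if any.
  void reset() {
    if (Ptr) {
      Ptr->~T();
      Ptr = nullptr;
    }
  }

  // Constructs T directly in Storage, so T never has to be moved.
  template <typename... ArgTypes> void emplace(ArgTypes &&...Args) {
    reset();
    Ptr = ::new (static_cast<void *>(Storage))
        T(std::forward<ArgTypes>(Args)...);
  }

  bool hasValue() const { return Ptr != nullptr; }
  T &getValue() { return *Ptr; }
};
```

Assigning a temporary `Pinned` into the optional would not compile because the copy and move operations are deleted; forwarding the constructor arguments through `emplace` sidesteps that.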
David Blaikie [Wed, 1 Oct 2014 00:56:55 +0000 (00:56 +0000)]
Implement DW_TAG_subrange_type with DW_AT_count rather than DW_AT_upper_bound
This allows proper disambiguation of unbounded arrays and arrays of zero
bound ("struct foo { int x[]; };" and "struct foo { int x[0]; }"). GCC
instead produces an upper bound of -1 in the latter situation, but count
seems tidier. This way lower_bound is provided if it's not the language
default and count is provided if the count is known, otherwise it's
omitted. Simple.
If someone wants to look at rdar://problem/12566646 and see if this
change is acceptable to that bug/fix, that might be helpful (see the
empty-and-one-elem-array.ll test case which cites that radar).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218726
91177308-0d34-0410-b5e6-
96231b3b80d8
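The attribute selection described in the DW_AT_count entry above (r218726) could be modeled roughly as follows. This is a simplified sketch, not the DWARF emitter: the `Subrange` struct, the use of -1 as an "unknown count" sentinel, and the string output are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified model of one array dimension.
struct Subrange {
  int64_t LowerBound; // Lower bound of the dimension.
  int64_t Count;      // Element count; -1 is assumed to mean "unknown".
};

// Lists the attributes that would be attached to DW_TAG_subrange_type:
// lower_bound only if it differs from the language default, count only
// if it is known.
inline std::string describeSubrange(const Subrange &SR,
                                    int64_t DefaultLowerBound) {
  std::string Attrs;
  if (SR.LowerBound != DefaultLowerBound)
    Attrs += "DW_AT_lower_bound=" + std::to_string(SR.LowerBound) + " ";
  if (SR.Count >= 0)
    Attrs += "DW_AT_count=" + std::to_string(SR.Count);
  return Attrs;
}
```

Under these assumptions `int x[]` yields no attributes at all, while `int x[0]` yields `DW_AT_count=0`, which is the disambiguation the commit message describes.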
Adam Nemet [Wed, 1 Oct 2014 00:41:32 +0000 (00:41 +0000)]
[AVX512] Remove space before \t in AsmStrings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218725
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 00:41:21 +0000 (00:41 +0000)]
[x86] Teach the new vector shuffle lowering about VBROADCAST and
VPBROADCAST.
This has the somewhat expected pervasive impact. I don't know why
I forgot about this. Everything seems good with lots of significant
improvements in the tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218724
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Wed, 1 Oct 2014 00:29:26 +0000 (00:29 +0000)]
llvm-cov/CoverageReport.cpp: Quick fix for msvcrt, since width specifier "z" is unavailable.
Note, mingw uses its own printf instead of msvcrt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218723
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Wed, 1 Oct 2014 00:29:16 +0000 (00:29 +0000)]
llvm/test/DebugInfo/X86/gmlt.test: Get rid of %llc_dwarf. It should not be used with -mtriple.
Also, remove object-emission. test/DebugInfo/X86 doesn't require it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218722
91177308-0d34-0410-b5e6-
96231b3b80d8
Gerolf Hoflehner [Wed, 1 Oct 2014 00:13:22 +0000 (00:13 +0000)]
[InstCombine] Optimize icmp-select-icmp
In special cases select instructions can be eliminated by
replacing them with a cheaper bitwise operation even when the
select result is used outside its home block. The instances implemented
are patterns like
    %x=icmp.eq
    %y=select %x,%r, null
    %z=icmp.eq|neq %y, null
    br %z,true, false
 ==> %x=icmp.ne
     %y=icmp.eq %r,null
     %z=or %x,%y
     br %z,true,false
The optimization is integrated into the instruction
combiner and performed only when all uses of the select result can
be replaced by the select operand proper. For this dominator information
is used and dominance is now a required analysis pass in the combiner.
The optimization itself is iterative. The critical step is to replace the
select result with the non-constant select operand. So the select becomes
local and the combiner iteratively works out simpler code patterns and
eventually eliminates the select.
rdar://17853760
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218721
91177308-0d34-0410-b5e6-
96231b3b80d8
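The icmp-select-icmp rewrite in the entry above (r218721) relies on a boolean equivalence that can be checked directly. The sketch below covers only the `icmp.eq` variant of `%z`, uses plain C++ instead of IR, and the function names are hypothetical.

```cpp
#include <cassert>

// Original pattern: the select result feeds a null check.
inline bool viaSelect(int A, int B, const int *R) {
  bool X = (A == B);              // %x = icmp eq
  const int *Y = X ? R : nullptr; // %y = select %x, %r, null
  return Y == nullptr;            // %z = icmp eq %y, null
}

// Rewritten pattern: the select is gone, replaced by a bitwise or.
inline bool viaOr(int A, int B, const int *R) {
  bool X = (A != B);       // %x = icmp ne
  bool Y = (R == nullptr); // %y = icmp eq %r, null
  return X | Y;            // %z = or %x, %y
}
```

The select yields null either when the first comparison fails or when `%r` is itself null, which is exactly the disjunction the rewritten form computes.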
David Blaikie [Tue, 30 Sep 2014 23:29:16 +0000 (23:29 +0000)]
Omit DW_AT_inline under -gmlt to save a little more space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218719
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Tue, 30 Sep 2014 22:43:40 +0000 (22:43 +0000)]
[BasicAA] Make better use of zext and sign information
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218714
91177308-0d34-0410-b5e6-
96231b3b80d8
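Item 1 of the BasicAA entry above (r218714) comes down to the difference between zero- and sign-extension of a negative offset. A minimal sketch with 32-bit offsets widened to 64 bits follows; the widths and function names are chosen for illustration only and are not taken from GetLinearExpression.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extension: the 32-bit pattern is treated as unsigned, so the upper
// 32 bits of the result are always zero.
inline int64_t zeroExtend(int32_t Offset) {
  return static_cast<int64_t>(static_cast<uint32_t>(Offset));
}

// Sign-extension: the sign bit is replicated into the upper 32 bits.
inline int64_t signExtend(int32_t Offset) {
  return static_cast<int64_t>(Offset);
}
```

Here `zeroExtend(-4)` gives 4294967292 while `signExtend(-4)` gives -4; the two agree only for non-negative inputs.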
David Blaikie [Tue, 30 Sep 2014 22:32:49 +0000 (22:32 +0000)]
DebugInfo: Sink the code emitting DW_AT_APPLE_omit_frame_ptr down to a more common spot.
No functional change. Pre-emptive refactoring before I start pushing
some of this subprogram creation down into DWARFCompileUnit so I can
build different subprograms in the skeleton unit from the dwo unit for
adding -gmlt-like data to the skeleton.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218713
91177308-0d34-0410-b5e6-
96231b3b80d8
Hans Wennborg [Tue, 30 Sep 2014 22:30:06 +0000 (22:30 +0000)]
MSBuild integration: fix the loop in install.bat
It would previously not continue the platforms loop
unless it could find the latest toolset directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218712
91177308-0d34-0410-b5e6-
96231b3b80d8
Jingyue Wu [Tue, 30 Sep 2014 22:23:38 +0000 (22:23 +0000)]
[SimplifyCFG] threshold for folding branches with common destination
Summary:
This patch adds a threshold that controls the number of bonus instructions
allowed for folding branches with common destination. The original code allows
at most one bonus instruction. With this patch, users can customize the
threshold to allow multiple bonus instructions. The default threshold is still
1, so that the code behaves the same as before when users do not specify this
threshold.
The motivation of this change is that tuning this threshold significantly (up
to 25%) improves the performance of some CUDA programs in our internal code
base. In general, branch instructions are very expensive for GPU programs.
Therefore, it is sometimes worth trading more arithmetic computation for a more
straightened control flow. Here's a reduced example:
  __global__ void foo(int a, int b, int c, int d, int e, int n,
                      const int *input, int *output) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      sum += (((i ^ a) > b) && (((i | c ) ^ d) > e)) ? 0 : input[i];
    *output = sum;
  }
The select statement in the loop body translates to two branch instructions "if
((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination.
With the default threshold, SimplifyCFG is unable to fold them, because
computing the condition of the second branch "(i | c) ^ d > e" requires two
bonus instructions. With the threshold increased, SimplifyCFG can fold the two
branches so that the loop body contains only one branch, making the code
conceptually look like:
      sum += (((i ^ a) > b) & (((i | c ) ^ d) > e)) ? 0 : input[i];
Increasing the threshold significantly improves the performance of this
particular example. In the configuration where both conditions are guaranteed
to be true, increasing the threshold from 1 to 2 improves the performance by
18.24%. Even in the configuration where the first condition is false and the
second condition is true, which favors shortcuts, increasing the threshold from
1 to 2 still improves the performance by 4.35%.
We are still looking for a good threshold and maybe a better cost model than
just counting the number of bonus instructions. However, according to the above
numbers, we think it is at least worth adding a threshold to enable more
experiments and tuning. Let me know what you think. Thanks!
Test Plan: Added one test case to check the threshold is in effect
Reviewers: nadav, eliben, meheff, resistor, hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D5529
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218711
91177308-0d34-0410-b5e6-
96231b3b80d8
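The two forms of the loop body discussed in the SimplifyCFG entry above (r218711) can be compared on the host. This is a plain C++ sketch of the reduced example, not CUDA and not the pass itself; the function names are hypothetical. It only shows that the short-circuit and folded conditions produce the same sum, which is what makes the fold legal when the second condition has no side effects.

```cpp
#include <cassert>
#include <vector>

// Short-circuit form: conceptually two branches with a common destination.
inline int sumShortCircuit(int A, int B, int C, int D, int E,
                           const std::vector<int> &Input) {
  int Sum = 0;
  for (int I = 0; I < static_cast<int>(Input.size()); ++I)
    Sum += (((I ^ A) > B) && (((I | C) ^ D) > E)) ? 0 : Input[I];
  return Sum;
}

// Folded form: both conditions are always evaluated and combined with a
// bitwise and, leaving a single branch.
inline int sumFolded(int A, int B, int C, int D, int E,
                     const std::vector<int> &Input) {
  int Sum = 0;
  for (int I = 0; I < static_cast<int>(Input.size()); ++I)
    Sum += (((I ^ A) > B) & (((I | C) ^ D) > E)) ? 0 : Input[I];
  return Sum;
}
```

The folded form always pays for the `|` and `^` needed by the second condition (the two bonus instructions mentioned in the message) in exchange for one fewer branch.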
Chandler Carruth [Tue, 30 Sep 2014 22:16:23 +0000 (22:16 +0000)]
[x86] Add AVX1 and AVX2 testing to all of the 128-bit shuffle test
cases.
While clearly we don't need the AVX vector width, these ISA extensions
often cause us to select different instructions and we should cover them
even with the narrow vector width.
Also, while here, nuke the stress_test2 contents. There is no reason to
try to FileCheck this entire body when it is mostly a test for
successfully surviving the code generator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218710
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Tue, 30 Sep 2014 22:04:45 +0000 (22:04 +0000)]
[x86] Update the exact FileCheck syntax of the 256-bit and 512-bit
shuffle tests to match that used in the script I posted and now used
consistently in 128-bit tests.
Nothing interesting changing here, just using the label name as the
FileCheck label and a slightly more general comment marker consumption
strategy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218709
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Tue, 30 Sep 2014 22:02:27 +0000 (22:02 +0000)]
Adjust test case addition in r218702 so as not to fail when the X86 target isn't built.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218708
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Tue, 30 Sep 2014 21:44:34 +0000 (21:44 +0000)]
[x86] Rework all of the 128-bit vector shuffle tests with my handy test
updating script so that they are more thorough and consistent.
Specific fixes here include:
- Actually test VEX-encoded AVX mnemonics.
- Actually use an SSE 4.1 run to test SSE 4.1 features!
- Correctly check instruction sequences from the start of the function.
- Elide the shuffle operands and comment designator in a consistent way.
- Test all of the architectures instead of just the ones I was motivated
to manually author.
I've gone back through and fixed up any egregious issues I spotted. Let
me know if I missed something you really dislike.
One downside to this is that we're now not as diligently using FileCheck
variables for registers. I would be much more concerned with this if we
had larger register usage, but there just aren't that interesting of
register choices here and most of the registers are constrained by the
ABI. Ultimately, I don't think this is likely to be the maintenance
burden for these tests and updating them again should be
straightforward.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218707
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Tue, 30 Sep 2014 21:28:32 +0000 (21:28 +0000)]
Disable the -gmlt optimization implemented in r218129 under Darwin due to issues with dsymutil.
r218129 omits DW_TAG_subprograms which have no inlined subroutines when
emitting -gmlt data. This makes -gmlt very low cost for -O0 builds.
Darwin's dsymutil reasonably considers a CU empty if it has no
subprograms (which occurs with the above optimization in -O0 programs
without any force_inline function calls) and drops the line table, CU,
and everything in this situation, making backtraces impossible.
Until dsymutil is modified to account for this, disable this
optimization on Darwin to preserve the desired functionality.
(see r218545, which should be reverted after this patch, for other
discussion/details)
Footnote:
In the long term, it doesn't look like this scheme (of simplified debug
info to describe inlining to enable backtracing) is tenable, it is far
too size inefficient for optimized code (the DW_TAG_inlined_subprograms,
even once compressed, are nearly twice as large as the line table
itself (also compressed)) and we'll be considering things like Cary's
two level line table proposal to encode all this information directly in
the line table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218702
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Tue, 30 Sep 2014 20:44:23 +0000 (20:44 +0000)]
Use the target-specified iteration count to opt out of any further refinement of an estimate. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218700
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Tue, 30 Sep 2014 20:28:48 +0000 (20:28 +0000)]
Split the estimate() interface into separate functions for each type. NFC.
It was hacky to use an opcode as a switch because it won't always match
(rsqrte != sqrte), and it looks like we'll need to add more special casing
per arch than I had hoped for. E.g., x86 will prefer a different NR estimate
implementation. ARM will want to use its 'step' instructions. There also
don't appear to be any new estimate instructions in any arch in a long,
long time. Altivec vloge and vexpte may have been the first and last in
that field...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218698
91177308-0d34-0410-b5e6-
96231b3b80d8
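The NR (Newton-Raphson) refinement referred to in the two estimate entries above can be sketched for the reciprocal square root case. This is generic numerical code, not the DAG combiner or any target hook; `refineRsqrt` is a hypothetical name, and the sketch assumes that an iteration count of zero means the hardware estimate is used as-is.

```cpp
#include <cassert>
#include <cmath>

// Refines an estimate Y of 1/sqrt(X) with the Newton-Raphson step
//   Y' = Y * (1.5 - 0.5 * X * Y * Y).
// With Iterations == 0 the estimate is returned unchanged.
inline double refineRsqrt(double X, double Estimate, unsigned Iterations) {
  double Y = Estimate;
  for (unsigned I = 0; I < Iterations; ++I)
    Y = Y * (1.5 - 0.5 * X * Y * Y);
  return Y;
}
```

Starting from an estimate of 0.4 for 1/sqrt(4), one step gives 0.472 and a second step gives roughly 0.4977, so each iteration moves closer to the exact value 0.5.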
Juergen Ributzka [Tue, 30 Sep 2014 19:59:35 +0000 (19:59 +0000)]
Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Note: This version fixed an issue with the TBZ/TBNZ instructions that were
generated in FastISel. The issue was that the 64bit version of TBZ (TBZX)
automagically sets the upper bit of the immediate field that is used to specify
the bit we want to test. To test for any of the lower 32bits we have to first
extract the subregister and use the 32bit version of the TBZ instruction (TBZW).
Original commit message:
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218693
91177308-0d34-0410-b5e6-
96231b3b80d8
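The reasoning in the TBZ/TBNZ note above, that a test of one of the lower 32 bits can be done on the 32-bit subregister, can be checked with a small model. This sketch only models the bit-test semantics in C++; it does not model instruction encodings, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// "Test bit and branch if zero" on a 64-bit value: true means branch taken.
inline bool branchIfBitZero64(uint64_t Value, unsigned Bit) {
  return ((Value >> Bit) & 1u) == 0;
}

// For Bit < 32 the same answer comes from the low 32-bit half alone, which
// mirrors extracting the 32-bit subregister before the test.
inline bool branchIfBitZero32(uint64_t Value, unsigned Bit) {
  uint32_t Low = static_cast<uint32_t>(Value);
  return ((Low >> Bit) & 1u) == 0;
}
```

For any bit index below 32 the two functions agree, regardless of what the upper half of the value contains.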
Matt Arsenault [Tue, 30 Sep 2014 19:49:48 +0000 (19:49 +0000)]
R600/SI: Fix printing of clamp and omod
No tests for omod since nothing uses it yet, but
this should get rid of the remaining annoying trailing
zeros after some instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218692
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 30 Sep 2014 19:49:43 +0000 (19:49 +0000)]
R600/SI: Update VOP3b to not include obsolete operands
abs / neg are now part of the srcN_modifiers operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218691
91177308-0d34-0410-b5e6-
96231b3b80d8
Bradley Smith [Tue, 30 Sep 2014 16:31:40 +0000 (16:31 +0000)]
Extend C disassembler API to allow specifying target features
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218682
91177308-0d34-0410-b5e6-
96231b3b80d8
Reed Kotler [Tue, 30 Sep 2014 16:30:13 +0000 (16:30 +0000)]
Add numeric extend, truncate to mips fast-isel
Summary:
Add numeric extend, truncate to mips fast-isel
Reactivates D4827
Test Plan:
fpext.ll
loadstoreconv.ll
Reviewers: dsanders
Subscribers: mcrosier
Differential Revision: http://reviews.llvm.org/D5251
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218681
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Coxon [Tue, 30 Sep 2014 16:23:16 +0000 (16:23 +0000)]
[AArch64] Remove unnecessary whitespace. (Test commit)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218680
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrea Di Biagio [Tue, 30 Sep 2014 15:30:22 +0000 (15:30 +0000)]
[DAG] Check in advance if a build_vector has a legal type before attempting to convert it into a shuffle.
Currently, the DAG Combiner only tries to convert type-legal build_vector nodes
into shuffles. This patch simply moves the logic that checks if a
build_vector has a legal value type up before we even start analyzing the
operands. This allows an immediate early exit from method
'visitBUILD_VECTOR' if the node type is known to be illegal.
No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218677
91177308-0d34-0410-b5e6-
96231b3b80d8
Alex Lorenz [Tue, 30 Sep 2014 14:48:12 +0000 (14:48 +0000)]
Revert r218673 'llvm-cov: add test for report's function & file association.'
Test causes buildbot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218676
91177308-0d34-0410-b5e6-
96231b3b80d8
Alex Lorenz [Tue, 30 Sep 2014 12:52:31 +0000 (12:52 +0000)]
llvm-cov: add test for report's function & file association.
This commit adds a test which checks that the functions defined in header files will get associated with the header files rather than the source files in the reports.
Differential Revision: http://reviews.llvm.org/D5489
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218673
91177308-0d34-0410-b5e6-
96231b3b80d8
Alex Lorenz [Tue, 30 Sep 2014 12:45:13 +0000 (12:45 +0000)]
llvm-cov: Use the number of executed functions for the function coverage metric.
This commit fixes llvm-cov's function coverage metric by using the number of executed functions instead of the number of fully covered functions.
Differential Revision: http://reviews.llvm.org/D5196
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218672
91177308-0d34-0410-b5e6-
96231b3b80d8
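The difference between the two function coverage metrics named in the entry above (r218672) can be shown with a small model. The `FunctionSummary` struct and its fields are hypothetical and are not llvm-cov's data structures; the sketch only contrasts "executed at least once" with "every region covered".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-function summary; field names are illustrative only.
struct FunctionSummary {
  uint64_t ExecutionCount; // How many times the function was entered.
  unsigned CoveredRegions; // Regions executed at least once.
  unsigned TotalRegions;   // All regions in the function.
};

// Metric after the change: functions that were executed at all.
inline unsigned countExecuted(const std::vector<FunctionSummary> &Fns) {
  unsigned N = 0;
  for (const FunctionSummary &F : Fns)
    if (F.ExecutionCount > 0)
      ++N;
  return N;
}

// Metric before the change: functions whose every region was covered.
inline unsigned countFullyCovered(const std::vector<FunctionSummary> &Fns) {
  unsigned N = 0;
  for (const FunctionSummary &F : Fns)
    if (F.CoveredRegions == F.TotalRegions)
      ++N;
  return N;
}
```

A function that ran but skipped an error path counts toward the first metric and not the second, so the older definition understated function coverage.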
Lorenzo Martignoni [Tue, 30 Sep 2014 12:33:16 +0000 (12:33 +0000)]
Introduce support for custom wrappers for vararg functions.
Differential Revision: http://reviews.llvm.org/D5412
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218671
91177308-0d34-0410-b5e6-
96231b3b80d8
Robert Khasanov [Tue, 30 Sep 2014 12:15:52 +0000 (12:15 +0000)]
[AVX512] Added intrinsics for 128-, 256- and 512-bit versions of VCMPGT{BWDQ}.
Patch by Sergey Lisitsyn <sergey.lisitsyn@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218670
91177308-0d34-0410-b5e6-
96231b3b80d8
Robert Khasanov [Tue, 30 Sep 2014 11:41:54 +0000 (11:41 +0000)]
[AVX512] Added intrinsics for 128- and 256-bit versions of VCMPEQ{BWDQ}
Fixed lowering of these intrinsics in the case when the mask is v2i1 or v4i1.
Now cmp intrinsics lower in the following way:
  (i8 (int_x86_avx512_mask_pcmpeq_q_128
        (v2i64 %a), (v2i64 %b), (i8 %mask))) ->
  (i8 (bitcast
        (v8i1 (insert_subvector undef,
                (v2i1 (and (PCMPEQM %a, %b),
                           (extract_subvector
                             (v8i1 (bitcast %mask)), 0))), 0))))
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218669
91177308-0d34-0410-b5e6-
96231b3b80d8
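The lowering shown in the VCMPEQ entry above (r218669) can be read as a scalar computation on mask bits. The sketch below is a C++ model of the 128-bit quadword case only, with a hypothetical function name; the upper six result bits are undef in the DAG pattern and are modeled as zero here, which is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of a masked 2 x i64 equality compare returning an i8 mask.
inline uint8_t maskedCmpEqQ128(const int64_t A[2], const int64_t B[2],
                               uint8_t Mask) {
  uint8_t Cmp = 0;
  for (int Lane = 0; Lane < 2; ++Lane)
    if (A[Lane] == B[Lane])
      Cmp |= static_cast<uint8_t>(1u << Lane); // One bit per equal lane.
  // extract_subvector keeps the low two mask bits, and they are combined
  // with the compare result by `and`. Upper bits are modeled as zero.
  return Cmp & (Mask & 0x3);
}
```

Only the low two bits of the incoming mask can influence the result, matching the v2i1 subvector extracted from the v8i1 bitcast of `%mask`.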