Sasa Stankovic [Wed, 1 Oct 2014 08:22:21 +0000 (08:22 +0000)]
[mips] For indirect calls we don't need $gp to point to .got. The Mips linker
doesn't generate a lazy binding stub for a function whose address is taken in
the program.
Differential Revision: http://reviews.llvm.org/D5067
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218744 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Bogner [Wed, 1 Oct 2014 05:45:45 +0000 (05:45 +0000)]
test: XFAIL the non-darwin gmlt test on darwin
r218702 disabled a -gmlt optimization for darwin, but this means the
non-darwin test isn't working there anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218742 91177308-0d34-0410-b5e6-96231b3b80d8
Lang Hames [Wed, 1 Oct 2014 04:11:13 +0000 (04:11 +0000)]
[MCJIT] Turn the getSymbolAddress free function created in r218626 into a static
member of RTDyldMemoryManager (and rename to getSymbolAddressInProcess).
The functionality this provides is very specific to RTDyldMemoryManager, so it
makes sense to keep it in that class to avoid accidental re-use.
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218741 91177308-0d34-0410-b5e6-96231b3b80d8
Nick Lewycky [Wed, 1 Oct 2014 03:37:34 +0000 (03:37 +0000)]
Fix typo in comment from r218733
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218739 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Bogner [Wed, 1 Oct 2014 03:31:58 +0000 (03:31 +0000)]
InstrProf: Make coverage::Counter comparable
I'll be using this in a clang change very soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218736 91177308-0d34-0410-b5e6-96231b3b80d8
Gerolf Hoflehner [Wed, 1 Oct 2014 03:24:39 +0000 (03:24 +0000)]
[InstCombine] Fix for assert build failures caused by r218721
The icmp-select-icmp optimization made the implicit assumption
that the select-icmp instructions are in the same block and asserted on it.
The fix explicitly checks for that condition and conservatively suppresses
the optimization when it is violated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218735 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 03:19:43 +0000 (03:19 +0000)]
[x86] Teach the new vector shuffle lowering to be even more aggressive
in exposing the scalar value to the broadcast DAG fragment so that we
can catch even reloads and fold them into the broadcast.
This is somewhat magical I'm afraid but seems to work. It is also what
the old lowering did, and I've switched an old test to run both
lowerings demonstrating that we get the same result.
Unlike the old code, I'm not lowering f32 or f64 scalars through this
path when we only have AVX1. The target patterns include pretty heinous
code to re-cast those as shuffles when the scalar happens to not be
spilled because AVX1 provides no broadcast mechanism from registers
whatsoever. This is terribly brittle. I'd much rather go through our
generic lowering code to get this. If needed, we can add a peephole to
get even more opportunities to broadcast-from-spill-slots that are
exposed post-RA, but my suspicion is this just doesn't matter that much.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218734 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 02:25:54 +0000 (02:25 +0000)]
[x86] Hoist the zext-lowering up in the v4i32 lowering routine -- it is
the same speed as pshufd but we can fold loads into the pmovzx
instructions.
This fixes some regressions that came up in the regression test suite
for the new vector shuffle lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218733 91177308-0d34-0410-b5e6-96231b3b80d8
Jordan Rose [Wed, 1 Oct 2014 02:12:35 +0000 (02:12 +0000)]
Add an emplace(...) method to llvm::Optional<T>.
This can be used for in-place initialization of non-moveable types.
For compilers that don't support variadic templates, only up to four
arguments are supported. We can always add more, of course, but this
should be good enough until we move to a later MSVC that has full
support for variadic templates.
Inspired by std::experimental::optional from the "Library Fundamentals" C++ TS.
Reviewed by David Blaikie.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218732 91177308-0d34-0410-b5e6-96231b3b80d8
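The emplace(...) addition in r218732 can be illustrated with a minimal sketch. It assumes a compiler with variadic templates; `ToyOptional` and `Pinned` are hypothetical names used only for illustration, and this is not the actual llvm::Optional implementation.

```cpp
#include <new>
#include <utility>

// Hypothetical, simplified stand-in for llvm::Optional<T>; illustration only.
template <typename T> class ToyOptional {
  alignas(T) unsigned char Storage[sizeof(T)];
  bool HasVal = false;

public:
  ToyOptional() = default;
  ToyOptional(const ToyOptional &) = delete;
  ToyOptional &operator=(const ToyOptional &) = delete;
  ~ToyOptional() { reset(); }

  // Destroy the contained value, if any.
  void reset() {
    if (HasVal) {
      get().~T();
      HasVal = false;
    }
  }

  // Construct a T directly in the internal storage; T is never copied or moved.
  template <typename... ArgTs> void emplace(ArgTs &&...Args) {
    reset();
    ::new (static_cast<void *>(Storage)) T(std::forward<ArgTs>(Args)...);
    HasVal = true;
  }

  bool hasValue() const { return HasVal; }
  T &get() { return *reinterpret_cast<T *>(Storage); }
};

// A type that can be neither copied nor moved.
struct Pinned {
  int A, B;
  Pinned(int A, int B) : A(A), B(B) {}
  Pinned(const Pinned &) = delete;
  Pinned(Pinned &&) = delete;
};
```

With these definitions, `ToyOptional<Pinned> O; O.emplace(1, 2);` initializes the value in place, which assignment from a temporary could not do for a non-moveable type.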
David Blaikie [Wed, 1 Oct 2014 00:56:55 +0000 (00:56 +0000)]
Implement DW_TAG_subrange_type with DW_AT_count rather than DW_AT_upper_bound
This allows proper disambiguation of unbounded arrays and arrays of zero
bound ("struct foo { int x[]; };" and "struct foo { int x[0]; }"). GCC
instead produces an upper bound of -1 in the latter situation, but count
seems tidier. This way lower_bound is provided if it's not the language
default and count is provided if the count is known, otherwise it's
omitted. Simple.
If someone wants to look at rdar://problem/12566646 and see if this
change is acceptable to that bug/fix, that might be helpful (see the
empty-and-one-elem-array.ll test case which cites that radar).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218726 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Wed, 1 Oct 2014 00:41:32 +0000 (00:41 +0000)]
[AVX512] Remove space before \t in AsmStrings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218725 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Wed, 1 Oct 2014 00:41:21 +0000 (00:41 +0000)]
[x86] Teach the new vector shuffle lowering about VBROADCAST and
VPBROADCAST.
This has the somewhat expected pervasive impact. I don't know why
I forgot about this. Everything seems good with lots of significant
improvements in the tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218724 91177308-0d34-0410-b5e6-96231b3b80d8
NAKAMURA Takumi [Wed, 1 Oct 2014 00:29:26 +0000 (00:29 +0000)]
llvm-cov/CoverageReport.cpp: Quick fix for msvcrt, since width specifier "z" is unavailable.
Note, mingw uses its own printf instead of msvcrt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218723 91177308-0d34-0410-b5e6-96231b3b80d8
NAKAMURA Takumi [Wed, 1 Oct 2014 00:29:16 +0000 (00:29 +0000)]
llvm/test/DebugInfo/X86/gmlt.test: Get rid of %llc_dwarf. It should not be used with -mtriple.
Also, remove object-emission. test/DebugInfo/X86 doesn't require it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218722 91177308-0d34-0410-b5e6-96231b3b80d8
Gerolf Hoflehner [Wed, 1 Oct 2014 00:13:22 +0000 (00:13 +0000)]
[InstCombine] Optimize icmp-select-icmp
In special cases select instructions can be eliminated by
replacing them with a cheaper bitwise operation even when the
select result is used outside its home block. The instances implemented
are patterns like
    %x=icmp.eq
    %y=select %x,%r, null
    %z=icmp.eq|neq %y, null
    br %z,true, false
==> %x=icmp.ne
    %y=icmp.eq %r,null
    %z=or %x,%y
    br %z,true,false
The optimization is integrated into the instruction
combiner and performed only when all uses of the select result can
be replaced by the select operand proper. For this dominator information
is used and dominance is now a required analysis pass in the combiner.
The optimization itself is iterative. The critical step is to replace the
select result with the non-constant select operand, so that the select becomes
local and the combiner iteratively works out simpler code patterns and
eventually eliminates the select.
rdar://17853760
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218721 91177308-0d34-0410-b5e6-96231b3b80d8
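The rewrite described in r218721 rests on a simple boolean identity: comparing `select(x, r, null)` against null gives the same answer as `!x | (r == null)`. A minimal C++ sketch of that identity follows; the function names `viaSelect` and `viaOr` are hypothetical, and the code models only the eq-null case, not the InstCombine implementation.

```cpp
// Original form: select a pointer, then compare the selected value to null.
bool viaSelect(bool X, const int *R) {
  const int *Y = X ? R : nullptr;
  return Y == nullptr;
}

// Rewritten form: the select is gone; the inverted condition is OR-ed with
// a direct null check of the non-constant select operand.
bool viaOr(bool X, const int *R) {
  bool NotX = !X;
  bool RIsNull = (R == nullptr);
  return NotX | RIsNull;
}
```

Both functions agree for every combination of `X` and a null or non-null `R`, which is why the select can be dropped once all of its uses are rewritten.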
David Blaikie [Tue, 30 Sep 2014 23:29:16 +0000 (23:29 +0000)]
Omit DW_AT_inline under -gmlt to save a little more space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218719 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Tue, 30 Sep 2014 22:43:40 +0000 (22:43 +0000)]
[BasicAA] Make better use of zext and sign information
Two related things:
1. Fixes a bug when calculating the offset in GetLinearExpression. The code
previously used zext to extend the offset, so negative offsets were converted
to large positive ones.
2. Enhance aliasGEP to deduce that, if the difference between two GEP
allocations is positive and all the variables that govern the offset are also
positive (i.e. the offset is strictly after the higher base pointer), then
locations that fit in the gap between the two base pointers are NoAlias.
Patch by Nick White!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218714 91177308-0d34-0410-b5e6-96231b3b80d8
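The first item in r218714 (a negative offset turning into a large positive one) is the general difference between zero and sign extension. A small sketch, assuming a 32-bit offset widened to 64 bits; `widenZext` and `widenSext` are hypothetical helpers, not BasicAA code.

```cpp
#include <cstdint>

// Zero extension: the bit pattern is kept and the upper 32 bits become 0,
// so a negative offset turns into a large positive one.
int64_t widenZext(int32_t Offset) {
  return static_cast<int64_t>(static_cast<uint32_t>(Offset));
}

// Sign extension: the numeric value of the offset is preserved.
int64_t widenSext(int32_t Offset) { return static_cast<int64_t>(Offset); }
```

For example, an offset of -4 widens to 4294967292 under zero extension but stays -4 under sign extension; non-negative offsets are unaffected.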
David Blaikie [Tue, 30 Sep 2014 22:32:49 +0000 (22:32 +0000)]
DebugInfo: Sink the code emitting DW_AT_APPLE_omit_frame_ptr down to a more common spot.
No functional change. Pre-emptive refactoring before I start pushing
some of this subprogram creation down into DWARFCompileUnit so I can
build different subprograms in the skeleton unit from the dwo unit for
adding -gmlt-like data to the skeleton.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218713 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Tue, 30 Sep 2014 22:30:06 +0000 (22:30 +0000)]
MSBuild integration: fix the loop in install.bat
It would previously not continue the platforms loop
unless it could find the latest toolset directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218712 91177308-0d34-0410-b5e6-96231b3b80d8
Jingyue Wu [Tue, 30 Sep 2014 22:23:38 +0000 (22:23 +0000)]
[SimplifyCFG] threshold for folding branches with common destination
Summary:
This patch adds a threshold that controls the number of bonus instructions
allowed for folding branches with common destination. The original code allows
at most one bonus instruction. With this patch, users can customize the
threshold to allow multiple bonus instructions. The default threshold is still
1, so that the code behaves the same as before when users do not specify this
threshold.
The motivation of this change is that tuning this threshold significantly (up
to 25%) improves the performance of some CUDA programs in our internal code
base. In general, branch instructions are very expensive for GPU programs.
Therefore, it is sometimes worth trading more arithmetic computation for a more
straightened control flow. Here's a reduced example:
  __global__ void foo(int a, int b, int c, int d, int e, int n,
                      const int *input, int *output) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
      sum += (((i ^ a) > b) && (((i | c) ^ d) > e)) ? 0 : input[i];
    *output = sum;
  }
The select statement in the loop body translates to two branch instructions "if
((i ^ a) > b)" and "if (((i | c) ^ d) > e)" which share a common destination.
With the default threshold, SimplifyCFG is unable to fold them, because
computing the condition of the second branch "(i | c) ^ d > e" requires two
bonus instructions. With the threshold increased, SimplifyCFG can fold the two
branches so that the loop body contains only one branch, making the code
conceptually look like:
  sum += (((i ^ a) > b) & (((i | c) ^ d) > e)) ? 0 : input[i];
Increasing the threshold significantly improves the performance of this
particular example. In the configuration where both conditions are guaranteed
to be true, increasing the threshold from 1 to 2 improves the performance by
18.24%. Even in the configuration where the first condition is false and the
second condition is true, which favors shortcuts, increasing the threshold from
1 to 2 still improves the performance by 4.35%.
We are still looking for a good threshold and maybe a better cost model than
just counting the number of bonus instructions. However, according to the above
numbers, we think it is at least worth adding a threshold to enable more
experiments and tuning. Let me know what you think. Thanks!
Test Plan: Added one test case to check the threshold is in effect
Reviewers: nadav, eliben, meheff, resistor, hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D5529
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218711 91177308-0d34-0410-b5e6-96231b3b80d8
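The two source-level forms discussed in r218711 compute the same sum; only the control flow differs. A host-side C++ sketch of the reduced example, with the hypothetical names `sumShortCircuit` and `sumFolded`, makes the equivalence checkable. It says nothing about GPU performance and is not the SimplifyCFG transform itself.

```cpp
#include <vector>

// Short-circuit form: '&&' implies a second conditional branch per iteration.
int sumShortCircuit(int A, int B, int C, int D, int E,
                    const std::vector<int> &Input) {
  int Sum = 0;
  for (int I = 0, N = static_cast<int>(Input.size()); I < N; ++I)
    Sum += (((I ^ A) > B) && (((I | C) ^ D) > E)) ? 0 : Input[I];
  return Sum;
}

// Folded form: both conditions are always evaluated and combined with '&',
// which corresponds to a single branch after folding.
int sumFolded(int A, int B, int C, int D, int E,
              const std::vector<int> &Input) {
  int Sum = 0;
  for (int I = 0, N = static_cast<int>(Input.size()); I < N; ++I)
    Sum += (((I ^ A) > B) & (((I | C) ^ D) > E)) ? 0 : Input[I];
  return Sum;
}
```

Because neither condition has side effects, evaluating the second one unconditionally is safe; the cost is the extra "bonus" instructions the threshold is meant to bound.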
Chandler Carruth [Tue, 30 Sep 2014 22:16:23 +0000 (22:16 +0000)]
[x86] Add AVX1 and AVX2 testing to all of the 128-bit shuffle test
cases.
While clearly we don't need the AVX vector width, these ISA extensions
often cause us to select different instructions and we should cover them
even with the narrow vector width.
Also, while here, nuke the stress_test2 contents. There is no reason to
try to FileCheck this entire body when it is mostly a test for
successfully surviving the code generator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218710 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Tue, 30 Sep 2014 22:04:45 +0000 (22:04 +0000)]
[x86] Update the exact FileCheck syntax of the 256-bit and 512-bit
shuffle tests to match that used in the script I posted and now used
consistently in 128-bit tests.
Nothing interesting changing here, just using the label name as the
FileCheck label and a slightly more general comment marker consumption
strategy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218709 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Tue, 30 Sep 2014 22:02:27 +0000 (22:02 +0000)]
Adjust test case addition in r218702 so as not to fail when the X86 target isn't built.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218708 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Tue, 30 Sep 2014 21:44:34 +0000 (21:44 +0000)]
[x86] Rework all of the 128-bit vector shuffle tests with my handy test
updating script so that they are more thorough and consistent.
Specific fixes here include:
- Actually test VEX-encoded AVX mnemonics.
- Actually use an SSE 4.1 run to test SSE 4.1 features!
- Correctly check instruction sequences from the start of the function.
- Elide the shuffle operands and comment designator in a consistent way.
- Test all of the architectures instead of just the ones I was motivated
to manually author.
I've gone back through and fixed up any egregious issues I spotted. Let
me know if I missed something you really dislike.
One downside to this is that we're now not as diligently using FileCheck
variables for registers. I would be much more concerned with this if we
had larger register usage, but there just aren't many interesting
register choices here and most of the registers are constrained by the
ABI. Ultimately, I don't think this is likely to be the maintenance
burden for these tests and updating them again should be
straightforward.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218707 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Tue, 30 Sep 2014 21:28:32 +0000 (21:28 +0000)]
Disable the -gmlt optimization implemented in r218129 under Darwin due to issues with dsymutil.
r218129 omits DW_TAG_subprograms which have no inlined subroutines when
emitting -gmlt data. This makes -gmlt very low cost for -O0 builds.
Darwin's dsymutil reasonably considers a CU empty if it has no
subprograms (which occurs with the above optimization in -O0 programs
without any force_inline function calls) and drops the line table, CU,
and everything in this situation, making backtraces impossible.
Until dsymutil is modified to account for this, disable this
optimization on Darwin to preserve the desired functionality.
(see r218545, which should be reverted after this patch, for other
discussion/details)
Footnote:
In the long term, it doesn't look like this scheme (of simplified debug
info to describe inlining to enable backtracing) is tenable, it is far
too size inefficient for optimized code (the DW_TAG_inlined_subprograms,
even once compressed, are nearly twice as large as the line table
itself (also compressed)) and we'll be considering things like Cary's
two level line table proposal to encode all this information directly in
the line table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218702 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Tue, 30 Sep 2014 20:44:23 +0000 (20:44 +0000)]
Use the target-specified iteration count to opt out of any further refinement of an estimate. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218700 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Tue, 30 Sep 2014 20:28:48 +0000 (20:28 +0000)]
Split the estimate() interface into separate functions for each type. NFC.
It was hacky to use an opcode as a switch because it won't always match
(rsqrte != sqrte), and it looks like we'll need to add more special casing
per arch than I had hoped for. E.g., x86 will prefer a different NR estimate
implementation. ARM will want to use its 'step' instructions. There also
don't appear to be any new estimate instructions in any arch in a long,
long time. Altivec vloge and vexpte may have been the first and last in
that field...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218698 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Tue, 30 Sep 2014 19:59:35 +0000 (19:59 +0000)]
Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ.
Note: This version fixed an issue with the TBZ/TBNZ instructions that were
generated in FastISel. The issue was that the 64bit version of TBZ (TBZX)
automagically sets the upper bit of the immediate field that is used to specify
the bit we want to test. To test for any of the lower 32bits we have to first
extract the subregister and use the 32bit version of the TBZ instruction (TBZW).
Original commit message:
Teach selectBranch to fold bit test and branch into a single instruction (TBZ or
TBNZ).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218693 91177308-0d34-0410-b5e6-96231b3b80d8
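The TBZW/TBZX issue described in r218693 can be modeled in a few lines. This is a simplified sketch based only on the description above: the 64-bit form implicitly sets the top bit of the 6-bit bit number, so it can only address bits 32 to 63, and the 32-bit form covers bits 0 to 31. `TbzChoice`, `chooseBitTest` and `testedBit` are hypothetical names, not FastISel functions.

```cpp
// Which instruction form to use and what goes into the 5-bit immediate.
struct TbzChoice {
  bool Use64BitForm; // true: TBZX-like form, false: TBZW-like form on the
                     // 32-bit subregister
  unsigned Imm5;     // low five bits of the bit number
};

// Pick the form for testing bit BitPos (0..63) of a 64-bit value.
TbzChoice chooseBitTest(unsigned BitPos) {
  return {BitPos >= 32, BitPos & 31u};
}

// The bit actually tested under this model: the 64-bit form adds 32.
unsigned testedBit(const TbzChoice &C) {
  return (C.Use64BitForm ? 32u : 0u) + C.Imm5;
}
```

Under this model, using the 64-bit form for a bit below 32 would test the wrong bit, which matches the need to extract the subregister first.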
Matt Arsenault [Tue, 30 Sep 2014 19:49:48 +0000 (19:49 +0000)]
R600/SI: Fix printing of clamp and omod
No tests for omod since nothing uses it yet, but
this should get rid of the remaining annoying trailing
zeros after some instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218692 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Tue, 30 Sep 2014 19:49:43 +0000 (19:49 +0000)]
R600/SI: Update VOP3b to not include obsolete operands
abs / neg are now part of the srcN_modifiers operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218691 91177308-0d34-0410-b5e6-96231b3b80d8
Bradley Smith [Tue, 30 Sep 2014 16:31:40 +0000 (16:31 +0000)]
Extend C disassembler API to allow specifying target features
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218682 91177308-0d34-0410-b5e6-96231b3b80d8
Reed Kotler [Tue, 30 Sep 2014 16:30:13 +0000 (16:30 +0000)]
Add numeric extend, truncate to mips fast-isel
Summary:
Add numeric extend, truncate to mips fast-isel
Reactivates D4827
Test Plan:
fpext.ll
loadstoreconv.ll
Reviewers: dsanders
Subscribers: mcrosier
Differential Revision: http://reviews.llvm.org/D5251
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218681 91177308-0d34-0410-b5e6-96231b3b80d8
Tom Coxon [Tue, 30 Sep 2014 16:23:16 +0000 (16:23 +0000)]
[AArch64] Remove unnecessary whitespace. (Test commit)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218680 91177308-0d34-0410-b5e6-96231b3b80d8
Andrea Di Biagio [Tue, 30 Sep 2014 15:30:22 +0000 (15:30 +0000)]
[DAG] Check in advance if a build_vector has a legal type before attempting to convert it into a shuffle.
Currently, the DAG Combiner only tries to convert type-legal build_vector nodes
into shuffles. This patch simply moves the logic that checks if a
build_vector has a legal value type up before we even start analyzing the
operands. This allows an early exit from method
'visitBUILD_VECTOR' if the node type is known to be illegal.
No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218677 91177308-0d34-0410-b5e6-96231b3b80d8
Alex Lorenz [Tue, 30 Sep 2014 14:48:12 +0000 (14:48 +0000)]
Revert r218673 'llvm-cov: add test for report's function & file association.'
Test causes buildbot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218676 91177308-0d34-0410-b5e6-96231b3b80d8
Alex Lorenz [Tue, 30 Sep 2014 12:52:31 +0000 (12:52 +0000)]
llvm-cov: add test for report's function & file association.
This commit adds a test which checks that the functions defined in header files will get associated with the header files rather than the source files in the reports.
Differential Revision: http://reviews.llvm.org/D5489
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218673 91177308-0d34-0410-b5e6-96231b3b80d8
Alex Lorenz [Tue, 30 Sep 2014 12:45:13 +0000 (12:45 +0000)]
llvm-cov: Use the number of executed functions for the function coverage metric.
This commit fixes llvm-cov's function coverage metric by using the number of executed functions instead of the number of fully covered functions.
Differential Revision: http://reviews.llvm.org/D5196
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218672 91177308-0d34-0410-b5e6-96231b3b80d8
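The difference between the two function coverage metrics in r218672 can be sketched as follows. The sketch assumes that "executed" means at least one region of the function was covered; `FunctionSummary` and both helper functions are hypothetical and are not llvm-cov's actual data structures.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-function summary; illustration only.
struct FunctionSummary {
  unsigned CoveredRegions;
  unsigned TotalRegions;
};

// New metric: percentage of functions that were executed at all
// (assumed here to mean at least one covered region).
double executedFunctionPercent(const std::vector<FunctionSummary> &Fns) {
  if (Fns.empty())
    return 0.0;
  std::size_t Executed = 0;
  for (const FunctionSummary &F : Fns)
    if (F.CoveredRegions > 0)
      ++Executed;
  return 100.0 * Executed / Fns.size();
}

// Old metric: percentage of functions whose every region was covered.
double fullyCoveredFunctionPercent(const std::vector<FunctionSummary> &Fns) {
  if (Fns.empty())
    return 0.0;
  std::size_t Full = 0;
  for (const FunctionSummary &F : Fns)
    if (F.CoveredRegions == F.TotalRegions)
      ++Full;
  return 100.0 * Full / Fns.size();
}
```

A function with one uncovered branch counts toward the first metric but not the second, so the old metric understated how many functions actually ran.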
Lorenzo Martignoni [Tue, 30 Sep 2014 12:33:16 +0000 (12:33 +0000)]
Introduce support for custom wrappers for vararg functions.
Differential Revision: http://reviews.llvm.org/D5412
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218671 91177308-0d34-0410-b5e6-96231b3b80d8
Robert Khasanov [Tue, 30 Sep 2014 12:15:52 +0000 (12:15 +0000)]
[AVX512] Added intrinsics for 128-, 256- and 512-bit versions of VCMPGT{BWDQ}.
Patch by Sergey Lisitsyn <sergey.lisitsyn@intel.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218670 91177308-0d34-0410-b5e6-96231b3b80d8
Robert Khasanov [Tue, 30 Sep 2014 11:41:54 +0000 (11:41 +0000)]
[AVX512] Added intrinsics for 128- and 256-bit versions of VCMPEQ{BWDQ}
Fixed lowering of these intrinsics when the mask is v2i1 or v4i1.
Now cmp intrinsics lower in the following way:
(i8 (int_x86_avx512_mask_pcmpeq_q_128
        (v2i64 %a), (v2i64 %b), (i8 %mask))) ->
(i8 (bitcast
      (v8i1 (insert_subvector undef,
              (v2i1 (and (PCMPEQM %a, %b),
                         (extract_subvector
                           (v8i1 (bitcast %mask)), 0))), 0))))
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218669 91177308-0d34-0410-b5e6-96231b3b80d8
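A scalar model may help when reading the lowering in r218669. The sketch below assumes two 64-bit lanes and an i8 mask whose low two bits correspond to the lanes; `maskedCmpEqQ128` is a hypothetical name. The real lowering inserts into an undef v8i1, so the upper six result bits are unspecified there; this model simply zeroes them.

```cpp
#include <cstdint>

// Scalar model of a masked 2 x i64 equality compare producing an i8 mask.
uint8_t maskedCmpEqQ128(const int64_t A[2], const int64_t B[2], uint8_t Mask) {
  // Per-lane equality bits (the PCMPEQM part).
  uint8_t Eq = 0;
  for (int I = 0; I < 2; ++I)
    if (A[I] == B[I])
      Eq |= static_cast<uint8_t>(1u << I);
  // Only the low two mask bits correspond to lanes (extract_subvector ..., 0).
  uint8_t LaneMask = Mask & 0x3;
  // AND the compare result with the lane mask; upper bits are zero in this model.
  return static_cast<uint8_t>(Eq & LaneMask);
}
```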
Robert Khasanov [Tue, 30 Sep 2014 11:32:22 +0000 (11:32 +0000)]
[AVX512] Added intrinsics for VPCMPEQB and VPCMPEQW.
Added new operand type for intrinsics (IIT_V64)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218668 91177308-0d34-0410-b5e6-96231b3b80d8
Robert Khasanov [Tue, 30 Sep 2014 11:19:50 +0000 (11:19 +0000)]
[AVX512] Enabled intrinsics for VPCMPEQD and VPCMPEQQ.
Added CMP_MASK intrinsic type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218667 91177308-0d34-0410-b5e6-96231b3b80d8
Job Noorman [Tue, 30 Sep 2014 11:15:44 +0000 (11:15 +0000)]
Make sure aggregates are properly aligned on MSP430.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218665 91177308-0d34-0410-b5e6-96231b3b80d8
Chad Rosier [Tue, 30 Sep 2014 03:17:42 +0000 (03:17 +0000)]
[IndVarSimplify] Widen loop unsigned compares.
This patch extends r217953 to handle unsigned comparison.
Phabricator revision: http://reviews.llvm.org/D5526
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218659 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Tue, 30 Sep 2014 02:52:28 +0000 (02:52 +0000)]
[x86] Revert r218588, r218589, and r218600. These patches were pursuing
a flawed direction and causing miscompiles. Read on for details.
Fundamentally, the premise of this patch series was to map
VECTOR_SHUFFLE DAG nodes into VSELECT DAG nodes for all blends because
we are going to *have* to lower to VSELECT nodes for some blends to
trigger the instruction selection patterns of variable blend
instructions. This doesn't actually work out so well.
In order to match performance with the existing VECTOR_SHUFFLE
lowering code, we would need to re-slice the blend in order to fit it
into either the integer or floating point blends available on the ISA.
When coming from VECTOR_SHUFFLE (or other vNi1 style VSELECT sources)
this works well because the X86 backend ensures that these types of
operands to VSELECT get sign extended into '-1' and '0' for true and
false, allowing us to re-slice the bits in whatever granularity without
changing semantics.
However, if the VSELECT condition comes from some other source, for
example code lowering vector comparisons, it will likely only have the
required bit set -- the high bit. We can't blindly slice up this style
of VSELECT. Reid found some code using Halide that triggers this and I'm
hopeful to eventually get a test case, but I don't need it to understand
why this is A Bad Idea.
There is another aspect that makes this approach flawed. When in
VECTOR_SHUFFLE form, we have very distilled information that represents
the *constant* blend mask. Converting back to a VSELECT form actually
can lose this information, and so I think now that it is better to treat
this as VECTOR_SHUFFLE until the very last moment and only use VSELECT
nodes for instruction selection purposes.
My plan is to:
1) Clean up and formalize the target pre-legalization DAG combine that
converts a VSELECT with a constant condition operand into
a VECTOR_SHUFFLE.
2) Remove any fancy lowering from VSELECT during *legalization* relying
entirely on the DAG combine to catch cases where we can match to an
immediate-controlled blend instruction.
One additional step that I'm not planning on but would be interested in
others' opinions on: we could add an X86ISD::VSELECT or X86ISD::BLENDV
which encodes a fully legalized VSELECT node. Then it would be easy to
write isel patterns only in terms of this to ensure VECTOR_SHUFFLE
legalization only ever forms the fully legalized construct and we can't
cycle between it and VSELECT combining.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218658 91177308-0d34-0410-b5e6-96231b3b80d8
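The re-slicing hazard described in r218658 can be shown with a scalar model of a single 32-bit lane. The sketch assumes a blend that selects on the high bit of each mask element, once at 32-bit granularity and once at byte granularity; `blendDword` and `blendBytes` are hypothetical helpers, not the backend's lowering code.

```cpp
#include <cstdint>

// Select a whole 32-bit lane based on the lane's high mask bit.
uint32_t blendDword(uint32_t Mask, uint32_t OnTrue, uint32_t OnFalse) {
  return (Mask & 0x80000000u) ? OnTrue : OnFalse;
}

// The same lane re-sliced into bytes: each byte consults its own high bit.
uint32_t blendBytes(uint32_t Mask, uint32_t OnTrue, uint32_t OnFalse) {
  uint32_t Result = 0;
  for (int I = 0; I < 4; ++I) {
    uint32_t ByteBits = 0xFFu << (8 * I);
    bool Take = (Mask >> (8 * I + 7)) & 1u;
    Result |= (Take ? OnTrue : OnFalse) & ByteBits;
  }
  return Result;
}
```

With a sign-extended mask (all ones or all zeros) the two functions agree. With a mask that has only the high bit set, the byte-wise version takes just the top byte from `OnTrue`, which is the semantic change the message warns about.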
Chandler Carruth [Tue, 30 Sep 2014 02:32:36 +0000 (02:32 +0000)]
[x86] Add some vector-register broadcast operations to the 256-bit v4
tests which were missing them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218657 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Tue, 30 Sep 2014 01:05:29 +0000 (01:05 +0000)]
R600: Fix broken check lines, missing scalar case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218655 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Tue, 30 Sep 2014 01:05:27 +0000 (01:05 +0000)]
Fix missing C++ mode comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218654 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Tue, 30 Sep 2014 00:49:58 +0000 (00:49 +0000)]
[FastISel][AArch64] Fold sign-/zero-extends into the load instruction.
The sign-/zero-extension of the loaded value can be performed by the memory
instruction for free. If the result of the load has only one use and the use is
a sign-/zero-extend, then we emit the proper load instruction. The extend is
only a register copy and will be optimized away later on.
Other instructions that consume the sign-/zero-extended value are also made
aware of this fact, so they don't fold the extend too.
This fixes rdar://problem/18495928.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218653 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Tue, 30 Sep 2014 00:49:54 +0000 (00:49 +0000)]
[FastISel][AArch64] Factor out scale factor calculation. NFC.
Factor out the code that determines the implicit scale factor of memory
operations for a given value type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218652 91177308-0d34-0410-b5e6-96231b3b80d8
Nick Kledzik [Tue, 30 Sep 2014 00:19:58 +0000 (00:19 +0000)]
[llvm-objdump] switch some uses of format() to format_hex() and left_justify()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218649 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Christopher [Mon, 29 Sep 2014 23:31:13 +0000 (23:31 +0000)]
Simplify conditional.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218643 91177308-0d34-0410-b5e6-96231b3b80d8
Adam Nemet [Mon, 29 Sep 2014 22:54:41 +0000 (22:54 +0000)]
[AVX512] Use X86VectorVTInfo in the masking helper classes and the FMAs
No functionality change.
Makes the code more compact (see the FMA part).
This needs a new type attribute MemOpFrag in X86VectorVTInfo. For now I only
defined this in the simple cases. See the comment before the attribute.
Diff of X86.td.expanded before and after is empty except for the appearance of
the new attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218637 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Mon, 29 Sep 2014 22:43:20 +0000 (22:43 +0000)]
WinCOFFObjectWriter: optimize the string table for common suffixes
This is a follow-up from r207670 which did the same for ELF.
Differential Revision: http://reviews.llvm.org/D5530
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218636 91177308-0d34-0410-b5e6-96231b3b80d8
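The idea behind the string table optimization in r218636 is that a NUL-terminated name which is a suffix of another name can share that name's tail. A minimal sketch of this tail merging follows; `buildSuffixMergedTable` is a hypothetical helper using a simple search, not the algorithm or API LLVM actually uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Lay out NUL-terminated strings in Table so that a string which is a suffix
// of an already placed string reuses that string's tail.
// Returns a map from each name to its offset in Table.
std::map<std::string, std::size_t>
buildSuffixMergedTable(std::vector<std::string> Names, std::string &Table) {
  // Longest first, so any string that could host a suffix is placed before it.
  std::stable_sort(Names.begin(), Names.end(),
                   [](const std::string &L, const std::string &R) {
                     return L.size() > R.size();
                   });
  std::map<std::string, std::size_t> Offsets;
  for (const std::string &Name : Names) {
    // Searching for the name plus its terminator only matches string tails.
    std::string Needle = Name;
    Needle.push_back('\0');
    std::size_t Pos = Table.find(Needle);
    if (Pos == std::string::npos) {
      Pos = Table.size();
      Table += Needle;
    }
    Offsets[Name] = Pos;
  }
  return Offsets;
}
```

For the names "bar", "foobar" and "baz", "bar" ends up pointing three bytes into "foobar" instead of getting its own entry.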
Eric Christopher [Mon, 29 Sep 2014 21:57:54 +0000 (21:57 +0000)]
Add soft-float to the key for the subtarget lookup in the TargetMachine
map, this makes sure that we can compile the same code for two different
ABIs (hard and soft float) in the same module.
Update one testcase accordingly (and fix some confusing naming) and
add a new testcase as well with the ordering swapped which would
highlight the problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218632 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Christopher [Mon, 29 Sep 2014 21:57:52 +0000 (21:57 +0000)]
Fix spelling and reflow comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218631 91177308-0d34-0410-b5e6-96231b3b80d8
Dave Estes [Mon, 29 Sep 2014 21:27:36 +0000 (21:27 +0000)]
[AArch64] Refines the Cortex-A57 Machine Model
Primarily refines all of the instructions with accurate latency
and micro-op information. Refinements largely focus on the NEON
instructions.
Additionally, a few advanced features are modeled, including
forwarding for MAC instructions and hazards for floating point SQRT
and DIV.
Lastly, the issue-width is reduced to three so that the scheduler
will better accommodate the narrower decode and dispatch width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218627 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Mon, 29 Sep 2014 21:25:13 +0000 (21:25 +0000)]
Unit test r218187, changing RTDyldMemoryManager::getSymbolAddress's behavior to favor mangled lookup over unmangled lookup.
The contract of this function seems problematic (fallback in either
direction seems like it could produce bugs in one client or another),
but here's some tests for its current behavior, at least. See the
commit/review thread of r218187 for more discussion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218626 91177308-0d34-0410-b5e6-96231b3b80d8
Aaron Ballman [Mon, 29 Sep 2014 20:27:01 +0000 (20:27 +0000)]
Fixing the build for compilers which do not yet have support for constexpr functions, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218622 91177308-0d34-0410-b5e6-96231b3b80d8
Jordan Rose [Mon, 29 Sep 2014 18:56:08 +0000 (18:56 +0000)]
Add getValueOr to llvm::Optional<T>.
This takes a single argument convertible to T, and
- if the Optional has a value, returns the existing value,
- otherwise, constructs a T from the argument and returns that.
Inspired by std::experimental::optional from the "Library Fundamentals" C++ TS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218618 91177308-0d34-0410-b5e6-96231b3b80d8
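The getValueOr behavior described in r218618 can be sketched with a toy class. `MiniOptional` is a hypothetical stand-in that requires a default-constructible `T`; it is not the actual llvm::Optional implementation.

```cpp
#include <utility>

// Hypothetical, simplified stand-in for llvm::Optional<T>; illustration only.
// For brevity this toy requires T to be default-constructible and copyable.
template <typename T> class MiniOptional {
  T Val{};
  bool HasVal = false;

public:
  typedef T value_type;

  MiniOptional() = default;
  MiniOptional(const T &V) : Val(V), HasVal(true) {}

  // Return the stored value if present; otherwise construct a T from Alt.
  template <typename U> T getValueOr(U &&Alt) const {
    return HasVal ? Val : static_cast<T>(std::forward<U>(Alt));
  }
};
```

An empty `MiniOptional<int>` returns the fallback, and the fallback only needs to be convertible to `T`, so `getValueOr(2.9)` on an empty `MiniOptional<int>` yields an `int`.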
Jordan Rose [Mon, 29 Sep 2014 18:56:05 +0000 (18:56 +0000)]
Add "typedef T value_type;" to llvm::Optional<T>.
Inspired by std::experimental::optional from the "Library Fundamentals" C++ TS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218617 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 29 Sep 2014 15:55:18 +0000 (15:55 +0000)]
Fixing missing C++ mode comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218612 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 29 Sep 2014 15:53:15 +0000 (15:53 +0000)]
Fix include order
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218611 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 29 Sep 2014 15:50:26 +0000 (15:50 +0000)]
R600/SI: Fix hardcoded values for modifiers.
Move enums to SIDefines.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218610
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 29 Sep 2014 14:59:38 +0000 (14:59 +0000)]
R600/SI: Also fix fsub + fadd a, a to mad combines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218609
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 29 Sep 2014 14:59:34 +0000 (14:59 +0000)]
R600/SI: Fix using mad with multiplies by 2
These turn into fadds, so combine them into the target
mad node.
fadd (fadd (a, a), b) -> mad 2.0, a, b
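The rewrite above is value-preserving because doubling a binary floating-point number is exact. A small Python check of that identity, assuming `mad(x, y, z)` means `x * y + z` (the helper names are illustrative only):

```python
def fadd_chain(a, b):
    # Original form: fadd (fadd (a, a), b)
    return (a + a) + b


def mad(x, y, z):
    # Multiply-add as assumed here: x * y + z
    return x * y + z


for a, b in [(1.5, 0.25), (-3.0, 7.0), (0.1, 0.2)]:
    # a + a and 2.0 * a are the same exact value, so both forms agree.
    assert fadd_chain(a, b) == mad(2.0, a, b)
```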
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218608
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Mon, 29 Sep 2014 13:59:31 +0000 (13:59 +0000)]
[AArch64] Improve cost model to handle sdiv by a pow-of-two.
This patch improves the target-specific cost model to better handle signed
division by a power of two. The immediate result is that this enables the SLP
vectorizer to do a better job.
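For context, signed division by a power of two is cheap because it can be done with a few shifts and an add. The commit message does not spell out the lowering; the sketch below shows the well-known shift-and-add sequence in Python, with `sdiv_pow2` and `trunc_div` as illustrative names.

```python
def sdiv_pow2(x, k, bits=32):
    """Truncating signed division of x by 2**k using only shifts and adds."""
    assert 0 < k < bits and -(1 << (bits - 1)) <= x < (1 << (bits - 1))
    # All-ones for negative x, zero otherwise (arithmetic shift of the sign).
    sign = x >> (bits - 1)
    # Bias of 2**k - 1 for negative x so the final shift rounds toward zero.
    bias = sign & ((1 << k) - 1)
    return (x + bias) >> k


def trunc_div(x, d):
    # Reference: C-style division that rounds toward zero.
    q = abs(x) // abs(d)
    return q if (x < 0) == (d < 0) else -q


for x in (-9, -8, -7, -1, 0, 1, 7, 8, 9):
    for k in (1, 2, 3):
        assert sdiv_pow2(x, k) == trunc_div(x, 1 << k)
```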
http://reviews.llvm.org/D5469
PR20714
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218607
91177308-0d34-0410-b5e6-
96231b3b80d8
Frederic Riss [Mon, 29 Sep 2014 13:56:39 +0000 (13:56 +0000)]
Store TypeUnits in a SmallVector<DWARFUnitSection> instead of a single DWARFUnitSection.
There will be multiple TypeUnits in an unlinked object that will be extracted
from different sections. Now that we have DWARFUnitSection that is supposed
to represent an input section, we need a DWARFUnitSection<TypeUnit> per
input .debug_types section.
Once this is done, the interface is homogeneous and we can move the Section
parsing code into DWARFUnitSection.
This is a respin of r218513 that got reverted because it broke some builders.
This new version features an explicit move constructor for the DWARFUnitSection
class to work around compilers unable to generate correct C++11 default
constructors.
Reviewers: samsonov, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5482
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218606
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Qin [Mon, 29 Sep 2014 11:15:00 +0000 (11:15 +0000)]
Use a loop to simplify the runtime unrolling prologue.
Runtime unrolling creates a prologue to execute the extra
iterations that are left over when the trip count is not divisible
by the unroll factor. It generates an if-then-else sequence to jump
into a (factor - 1) times unrolled loop body, like
    extraiters = tripcount % loopfactor
    if (extraiters == 0) jump Loop:
    if (extraiters == loopfactor) jump L1
    if (extraiters == loopfactor-1) jump L2
    ...
    L1:  LoopBody;
    L2:  LoopBody;
    ...
    if tripcount < loopfactor jump End
    Loop:
    ...
    End:
This means that if the unroll factor is 4, the loop body will be unrolled
7 times: 3 copies in the loop prologue and 4 in the loop.
This commit uses a loop to execute the extra iterations in the
prologue instead, like
    extraiters = tripcount % loopfactor
    if (extraiters == 0) jump Loop:
    else jump Prol
    Prol:  LoopBody;
           extraiters -= 1                 // Omitted if unroll factor is 2.
           if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2.
    if (tripcount < loopfactor) jump End
    Loop:
    ...
    End:
Then, when the unroll factor is 4, the loop body is copied only
5 times: 1 in the prologue loop and 4 in the original loop.
If the unroll factor is 2, no new loop is created, just
as in the original solution.
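The control flow of the new scheme can be modelled with a short Python sketch. It is only an illustration of the description above, not the LoopUnroll implementation; `run_unrolled`, `copies_old` and `copies_new` are hypothetical names.

```python
def run_unrolled(tripcount, factor, body):
    """Call body(i) for i in range(tripcount), mimicking the new layout."""
    i = 0
    extraiters = tripcount % factor
    # Prol: a small loop runs the leftover iterations one at a time.
    while extraiters != 0:
        body(i)
        i += 1
        extraiters -= 1
    # Loop: the main loop handles `factor` iterations per trip.
    while i < tripcount:
        for _ in range(factor):  # stands in for `factor` copies of LoopBody
            body(i)
            i += 1


def copies_old(factor):
    # if-then-else chain: factor - 1 prologue copies plus the unrolled loop
    return (factor - 1) + factor


def copies_new(factor):
    # prologue loop: a single copy plus the unrolled loop
    return 1 + factor


seen = []
run_unrolled(10, 4, seen.append)
assert seen == list(range(10))
```

With a factor of 4 the two helpers give 7 and 5 copies of the loop body, and with a factor of 2 both give 3, which matches the counts in the message.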
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218604
91177308-0d34-0410-b5e6-
96231b3b80d8
Oliver Stannard [Mon, 29 Sep 2014 10:57:29 +0000 (10:57 +0000)]
[Thumb2] ldrexd and strexd are not defined on v7M
The Thumb2 ldrexd and strexd instructions are not defined for
M-class architectures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218603
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 29 Sep 2014 09:57:07 +0000 (09:57 +0000)]
[x86] Make the new vector shuffle lowering lower blends as VSELECT
nodes, and rely exclusively on its logic. This removes a ton of
duplication from the blend lowering and centralizes it in one place.
One downside is that it requires a bunch of hacks to make this work with
the current legalization framework. We have to manually speculate one
aspect of legalizing VSELECT nodes to get everything to work nicely
because the existing legalization framework isn't *actually* bottom-up.
The other grossness is that we somewhat duplicate the analysis of
constant blends. I'm on the fence here. If reviewers think this would
look better with VSELECT, when it has constant operands, dumping over to
VECTOR_SHUFFLE, we could go that way. But it would be a substantial
change because currently all of the actual blend instructions are
matched via patterns in the TD files based around VSELECT nodes (despite
them not being perfect fits for that). Suggestions welcome, but at least
this removes the rampant duplication in the backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218600
91177308-0d34-0410-b5e6-
96231b3b80d8
Jyoti Allur [Mon, 29 Sep 2014 06:32:54 +0000 (06:32 +0000)]
Remove dead code from DIBuilder
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218593
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 29 Sep 2014 02:01:20 +0000 (02:01 +0000)]
[x86] Delete a bunch of really bad and totally unnecessary code in the
X86 target-specific DAG combining that tried to convert VSELECT nodes
into VECTOR_SHUFFLE nodes that it "knew" would lower into
immediate-controlled blend nodes.
Turns out, we have perfectly good lowering of all these VSELECT nodes,
and indeed that lowering already knows how to handle lowering through
BLENDI to immediate-controlled blend nodes. The code just wasn't getting
used much because this thing forced the world to go through the vector
shuffle lowering. Yuck.
This also exposes that I was too aggressive in avoiding domain crossing
in r218588 with that lowering -- when the other option is to expand into
two 128-bit vectors, it is worth domain crossing. Restore that behavior
now that we have nice tests covering it.
The test updates here fall into two camps. One is where previously we
ended up with an unsigned encoding of the blend operand and now we get
a signed encoding. In most of those places there were elaborate comments
explaining exactly what these operands really mean. Rather than that,
just switch these tests to use the nicely decoded comments that make it
obvious that the final shuffle matches.
The other updates are just removing pointless domain crossing by
blending integers with PBLENDW rather than BLENDPS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218589
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 29 Sep 2014 01:32:54 +0000 (01:32 +0000)]
[x86] Refactor all of the VSELECT-as-blend lowering code to avoid domain
crossing and generally work more like the blend emission code in the new
vector shuffle lowering.
My goal is to have the new vector shuffle lowering just produce VSELECT
nodes that are either matched here to BLENDI or are legal and matched in
the .td files to specific blend instructions. That seems much cleaner as
there are other ways to produce a VSELECT anyways. =]
No *observable* functionality changed yet, mostly because this code
appears to be near-dead. The behavior of this lowering routine did
change though. This code being mostly dead and untestable will change
with my next commit which will also point some new tests at it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218588
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 29 Sep 2014 00:51:58 +0000 (00:51 +0000)]
[x86] Improve naming and comments for VSELECT lowering.
No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218586
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 29 Sep 2014 00:37:27 +0000 (00:37 +0000)]
[x86] Add the dispatch skeleton to the new vector shuffle lowering for
AVX-512.
There is no interesting logic yet. Everything ends up eventually
delegating to the generic code to split the vector and shuffle the
halves. Interestingly, that logic does a significantly better job of
lowering all of these types than the generic vector expansion code does.
Mostly, it lets most of the cases fall back to nice AVX2 code rather
than all the way back to SSE code paths.
Step 2 of basic AVX-512 support in the new vector shuffle lowering. Next
up will be to incrementally add direct support for the basic instruction
set to each type (adding tests first).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218585
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 29 Sep 2014 00:21:49 +0000 (00:21 +0000)]
[x86] Make the split-and-lower routine fully generic by relaxing the
assertion, making the name generic, and improving the documentation.
Step 1 in adding very primitive support for AVX-512. No functionality
changed yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218584
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 28 Sep 2014 23:53:10 +0000 (23:53 +0000)]
[x86] Teach the new vector shuffle lowering to fall back on AVX-512
vectors.
Someone will need to build the AVX512 lowering, which should follow
AVX1 and AVX2 *very* closely for AVX512F and AVX512BW resp. I've added
a dummy test which is a port of the v8f32 and v8i32 tests from AVX and
AVX2 to v8f64 and v8i64 tests for AVX512F and AVX512BW. Hopefully this
is enough information for someone to implement proper lowering here. If
not, I'll be happy to help, but right now the AVX-512 support isn't
a priority for me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218583
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 28 Sep 2014 23:23:55 +0000 (23:23 +0000)]
[x86] Fix the new vector shuffle lowering's use of VSELECT for AVX2
lowerings.
This was hopelessly broken. First, the x86 backend wants '-1' to be the
element value representing true in a boolean vector, and second the
operand order for VSELECT is backwards from the actual x86 instructions.
To make matters worse, the backend is just using '-1' as the true value
to get the high bit to be set. It doesn't actually symbolically map the
'-1' to anything. But on x86 this isn't quite how it works: there *only*
the high bit is relevant. As a consequence weird non-'-1' values like
0x80 actually "work" once you flip the operands to be backwards.
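A toy Python model of the mismatch described above. It assumes 8-bit mask lanes and treats the generic select as testing for all-ones; both function names and the exact operand naming are illustrative assumptions, not the backend's code.

```python
def vselect(cond, if_true, if_false):
    # Generic select as modelled here: a lane is "true" when it is -1.
    return [t if c == -1 else f for c, t, f in zip(cond, if_true, if_false)]


def x86_blendv(src1, src2, mask):
    # Variable-blend style: only the high bit of each 8-bit mask lane
    # matters; a set high bit picks the lane from src2, otherwise src1.
    return [b if (m & 0x80) else a for a, b, m in zip(src1, src2, mask)]


t, f = [1, 2], [3, 4]
# The value operands are swapped relative to the select.
assert vselect([-1, 0], t, f) == x86_blendv(f, t, [-1, 0])
# A non-'-1' value such as 0x80 still selects in the blend model.
assert x86_blendv(f, t, [0x80, 0]) == [1, 4]
```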
Anyways, thanks to Hal for helping me sort out what these *should* be.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218582
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sun, 28 Sep 2014 19:24:59 +0000 (19:24 +0000)]
Add MachineOperand::ChangeToFPImmediate and setFPImm
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218579
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 28 Sep 2014 06:11:04 +0000 (06:11 +0000)]
[x86] Fix a really silly bug that I introduced fixing another bug in the
new vector shuffle target DAG combines -- it helps to actually test for
the value you want rather than just using an integer in a boolean
context.
Have I mentioned that I loathe implicit conversions recently? :: sigh ::
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218576
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 28 Sep 2014 03:30:25 +0000 (03:30 +0000)]
[x86] Fix yet another bug in the new vector shuffle lowering's handling
of widening masks.
We can't widen a zeroing mask unless both elements that would be merged
are either zeroed or undef. This is the only way to widen a mask if it
has a zeroed element.
Also clean up the code here by ordering the checks in a more logical way
and by using the symbolic values for undef and zero. I'm actually torn
on using the symbolic values because the existing code is littered with
the assumption that -1 is undef, and moreover that entries '< 0' are the
special entries. While that works with the values given to these
constants, using the symbolic constants actually makes it a bit more
opaque why this is the case.
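The widening rule can be sketched in Python as follows. Only the zeroing rule comes from the message above; the handling of the other cases, the value `-2` for the zero sentinel, and the name `widen_mask` are assumptions made for illustration. The mask is assumed to have an even number of entries.

```python
UNDEF = -1  # matches the "-1 is undef" convention mentioned above
ZERO = -2   # assumed value for the zeroing sentinel (any other negative)


def widen_mask(mask):
    """Merge adjacent mask entries pairwise; return None if not widenable."""
    widened = []
    for lo, hi in zip(mask[0::2], mask[1::2]):
        if lo == ZERO or hi == ZERO:
            # A zeroed lane can only merge with a zeroed or undef lane.
            if lo in (ZERO, UNDEF) and hi in (ZERO, UNDEF):
                widened.append(ZERO)
            else:
                return None
        elif lo == UNDEF and hi == UNDEF:
            widened.append(UNDEF)
        elif lo == UNDEF:
            # The defined high half must be the odd element of a pair.
            if hi % 2 != 1:
                return None
            widened.append(hi // 2)
        elif hi == UNDEF:
            # The defined low half must be the even element of a pair.
            if lo % 2 != 0:
                return None
            widened.append(lo // 2)
        elif lo % 2 == 0 and hi == lo + 1:
            widened.append(lo // 2)
        else:
            return None
    return widened


print(widen_mask([0, 1, ZERO, UNDEF]))  # zero merges with undef
print(widen_mask([ZERO, 3, 0, 1]))      # zero next to a real index fails
```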
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218575
91177308-0d34-0410-b5e6-
96231b3b80d8
Hans Wennborg [Sun, 28 Sep 2014 00:22:27 +0000 (00:22 +0000)]
WinCOFFObjectWriter.cpp: make write_uint32_le more efficient
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218574
91177308-0d34-0410-b5e6-
96231b3b80d8
James Molloy [Sat, 27 Sep 2014 17:02:54 +0000 (17:02 +0000)]
[AArch64] Redundant store instructions should be removed as dead code
If a store is followed by another store of the same value to the same location, then the second store is a dead no-op and can be removed.
This problem is found in spec2006-197.parser.
For example,
stur w10, [x11, #-4]
stur w10, [x11, #-4]
Then one of the two stur instructions can be removed.
Patch by David Xu!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218569
91177308-0d34-0410-b5e6-
96231b3b80d8
Yaron Keren [Sat, 27 Sep 2014 14:41:29 +0000 (14:41 +0000)]
Fix llvm::huge_valf multiple initializations with Visual C++.
llvm::huge_valf is defined in a header file, so it is initialized
multiple times, once in every compilation unit, upon program startup.
With non-VC compilers huge_valf is set to HUGE_VALF, which the
compiler can probably optimize out.
With VC, numeric_limits<float>::infinity() does not return a number
but a runtime structure member which theoretically may change
between calls, so the compiler does not optimize out the
initialization and it happens many times. It can be easily seen by
placing a breakpoint on the initialization line.
This patch moves llvm::huge_valf initialization to a source file
instead of the header.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218567
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 27 Sep 2014 08:40:33 +0000 (08:40 +0000)]
[x86] Fix yet another issue with widening vector shuffle elements.
I spotted this by inspection when debugging something else, so I have no
test case whatsoever, and am not even sure it is possible to
realistically trigger the bug. But this is what was intended here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218565
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Sep 2014 05:36:53 +0000 (05:36 +0000)]
Update test case to match minor formatting change introduced in r218563.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218564
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Sep 2014 05:26:42 +0000 (05:26 +0000)]
Reduce code duplication a bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218563
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 27 Sep 2014 04:42:44 +0000 (04:42 +0000)]
[x86] Fix terrible bugs everywhere in the new vector shuffle lowering
and in the target shuffle combining when trying to widen vector
elements.
Previously only one of these was correct, and we didn't correctly
propagate zeroing target shuffle masks (which have a different sentinel
value from undef in non-target shuffle masks now). This isn't just
a missed optimization, this caused us to drop zeroing shuffles on the
floor and miscompile code. The added test case is one example of that.
There are other fixes to the test suite as a consequence of this as well
as restoring the undef elements in some of the masks that were lost when
I brought sanity to the actual *value* of the undef and zero sentinels.
I've also just cleaned up some of the PSHUFD and PSHUFLW and PSHUFHW
combining code, but that code really needs to go. It was a nice initial
attempt, but it isn't very principled and the recursive shuffle combiner
is much more powerful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218562
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 27 Sep 2014 04:42:39 +0000 (04:42 +0000)]
[x86] Flip the sentinel values used in the target shuffle mask decoding
to significantly more sane sentinels. Notably, everywhere else in the
backend's representation of shuffles uses '-1' to represent undef. The
target shuffle masks really shouldn't diverge from that, especially as
in a few places they are manipulated by shared code.
This causes us to lose some undef lanes in various test masks. I want to
get these back, but technically it isn't invalid and there are a *lot*
of bugs here so I want to try to establish a saner baseline for fixing
some of the bugs by aligning the specific sentinel values used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218561
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Sat, 27 Sep 2014 04:38:02 +0000 (04:38 +0000)]
Fix TableGen -gen-disassembler output for bit fields with an offset.
This fixes bit assignments like this
Inst{7-0} = Foo{9-2}
Patch by Steve King.
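To make the bit assignment concrete, here is a small Python sketch of what `Inst{7-0} = Foo{9-2}` means for an encoder and for a decoder that has to undo the offset. The helper names are hypothetical, and this is not the generated disassembler code.

```python
def bits(value, hi, lo):
    """Return value{hi-lo}: bits lo..hi inclusive, shifted down to bit 0."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def encode(foo):
    # Inst{7-0} = Foo{9-2}: operand bits 9..2 land in instruction bits 7..0.
    return bits(foo, 9, 2)


def decode(inst):
    # Undo the offset: put Inst{7-0} back at Foo{9-2}.
    # Foo{1-0} are not encoded, so they come back as zero.
    return bits(inst, 7, 0) << 2


assert decode(encode(0x154)) == 0x154
```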
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218560
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Fri, 26 Sep 2014 23:01:47 +0000 (23:01 +0000)]
Refactor reciprocal and reciprocal square root estimate into target-independent functions (part 2).
This is purely refactoring. No functional changes intended. PowerPC is the only target
that is currently using this interface.
The ultimate goal is to allow targets other than PowerPC (certainly X86 and AArch64) to turn this:
z = y / sqrt(x)
into:
z = y * rsqrte(x)
And:
z = y / x
into:
z = y * rcpe(x)
using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 .
There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction
along with the number of refinement steps needed to make the estimate usable.
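The "refinement steps" are typically Newton-Raphson iterations applied to the hardware estimate. A minimal Python sketch of that idea, assuming the standard iteration formulas; the starting estimates below are made-up stand-ins for what `rcpe` and `rsqrte` might return, and the function names are illustrative.

```python
def refine_recip(a, est, steps):
    # Newton-Raphson for 1/a: x' = x * (2 - a * x)
    for _ in range(steps):
        est = est * (2.0 - a * est)
    return est


def refine_rsqrt(a, est, steps):
    # Newton-Raphson for 1/sqrt(a): x' = x * (1.5 - 0.5 * a * x * x)
    for _ in range(steps):
        est = est * (1.5 - 0.5 * a * est * est)
    return est


x, y = 3.0, 10.0
rough_rcp = 0.3   # hypothetical low-precision rcpe(3.0)
rough_rsq = 0.55  # hypothetical low-precision rsqrte(3.0)
print(y * refine_recip(x, rough_rcp, 3), y / x)
print(y * refine_rsqrt(x, rough_rsq, 3), y / x ** 0.5)
```

Each step roughly doubles the number of correct bits, which is why a target only needs to report a small step count alongside the estimate opcode.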
Differential Revision: http://reviews.llvm.org/D5484
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218553
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Smith [Fri, 26 Sep 2014 22:40:15 +0000 (22:40 +0000)]
Add LLVM_ENABLE_MODULES flag to CMake to enable building with C++ modules.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218551
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Fri, 26 Sep 2014 22:32:19 +0000 (22:32 +0000)]
llvm-vtabledump: Further simplification
Hoist out calls to getSection and getContents. No functional change
intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218550
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Fri, 26 Sep 2014 22:32:16 +0000 (22:32 +0000)]
Object: BSS/virtual sections don't have contents
Users of getSectionContents shouldn't try to pass in BSS or virtual
sections. In all instances, this is a bug in the code calling this
routine.
N.B. Some COFF implementations (like CL) will mark their BSS sections as
taking space on disk. This would confuse COFFObjectFile into thinking
the section is larger than the file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218549
91177308-0d34-0410-b5e6-
96231b3b80d8
Yaron Keren [Fri, 26 Sep 2014 22:27:11 +0000 (22:27 +0000)]
clang-format of ChangeStdinToBinary & ChangeStdoutToBinary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218547
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Enderby [Fri, 26 Sep 2014 22:20:44 +0000 (22:20 +0000)]
Update llvm-objdump’s Mach-O symbolizer code to print the name of symbol stubs.
So in fully linked images when a call is made through a stub it now gets a
comment like the following in the disassembly:
callq 0x100000f6c ## symbol stub for: _printf
indicating the call is to a symbol stub and which symbol it is for. This is
done for branch reference types by checking whether the branch target is in
a stub section and, if so, using the indirect symbol table entry for that stub
and that symbol table entry's symbol name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218546
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Smith [Fri, 26 Sep 2014 21:53:12 +0000 (21:53 +0000)]
Remove definition of LLVM_VERSION_INFO; this macro is not used by any of the
files in this directory. If it should be defined anywhere, it should be defined
when building lib/LTO/LTOCodeGenerator.cpp, but we've not had it defined there
for quite some time, so that doesn't really seem to be very important. (It also
would slow down the modules build by creating extra module variants.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218544
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Smith [Fri, 26 Sep 2014 21:35:48 +0000 (21:35 +0000)]
Fix CMake warning CMP0054: don't quote a variable name that is intended to be
expanded; future versions of cmake may not expand the variable in this case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218543
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Smith [Fri, 26 Sep 2014 21:33:05 +0000 (21:33 +0000)]
Fix misinterpretation of CMake rule found by a CMake warning (related to CMP0054).
lldb sets the variable SHARED_LIBRARY to 1, which breaks this conditional,
because older versions of CMake interpret
if ("${t}" STREQUAL "SHARED_LIBRARY")
as meaning
if ("${t}" STREQUAL "1")
in this case. Change the conditional so it does the right thing with both old
and new CMakes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218542
91177308-0d34-0410-b5e6-
96231b3b80d8