Chandler Carruth [Sun, 21 Sep 2014 12:01:19 +0000 (12:01 +0000)]
[x86] Begin teaching the new vector shuffle lowering one of the most
important bits of cleverness: detecting and lowering repeated shuffle
patterns between the two 128-bit lanes with a single instruction.
This patch just teaches it how to lower single-input shuffles that fit
this model using VPERMILPS. =] There is more that needs to happen here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218211
91177308-0d34-0410-b5e6-
96231b3b80d8
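The "repeated pattern" test described in the commit above can be sketched roughly as follows. This is a minimal standalone illustration, not the code from the patch: the name isRepeatedLaneMask, the use of std::vector for the mask, and the -1 "undef" convention are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not LLVM's): returns true if a single-input shuffle
// mask stays within each 128-bit lane and applies the same in-lane pattern
// to every lane. On success, Repeated holds that per-lane pattern.
// A mask entry of -1 is treated as "undef" and matches anything.
bool isRepeatedLaneMask(const std::vector<int> &Mask, std::size_t LaneSize,
                        std::vector<int> &Repeated) {
  assert(LaneSize != 0 && Mask.size() % LaneSize == 0 &&
         "mask must cover whole lanes");
  Repeated.assign(LaneSize, -1);
  for (std::size_t i = 0; i < Mask.size(); ++i) {
    if (Mask[i] < 0)
      continue; // undef element, no constraint
    std::size_t Src = static_cast<std::size_t>(Mask[i]);
    if (Src / LaneSize != i / LaneSize)
      return false; // element comes from a different lane (or another input)
    int Local = static_cast<int>(Src % LaneSize);
    int &Slot = Repeated[i % LaneSize];
    if (Slot >= 0 && Slot != Local)
      return false; // lanes disagree on the pattern
    Slot = Local;
  }
  return true;
}
```

For a v8f32 shuffle, LaneSize would be 4; a mask such as 1,0,3,2,5,4,7,6 repeats the in-lane pattern 1,0,3,2, which is the kind of shuffle a single VPERMILPS immediate can express.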
Chandler Carruth [Sun, 21 Sep 2014 11:51:33 +0000 (11:51 +0000)]
[x86] Regenerate this test case now that I've improved my script for
generating the test cases to format things more consistently and
actually catch all the operand sequences that should be elided in favor
of the asm comments. No actual changes here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218210
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 21 Sep 2014 11:40:39 +0000 (11:40 +0000)]
[x86] Explicitly lower to a blend early if it is trivial to do so for
v8f32 shuffles in the new vector shuffle lowering code.
This is very cheap to do and makes it much more clear that anything more
expensive but overlapping with this lowering should be selected
afterward (for example using AVX2's VPERMPS). However, no functionality
changed here as without this code we would fall through to create no-op
shuffles of each input and a blend. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218209
91177308-0d34-0410-b5e6-
96231b3b80d8
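A "trivial" blend in the sense of the commit above is a two-input shuffle where every output element stays in its own position and only the source vector varies. A rough sketch of that check, assuming the usual convention that mask values 0..Size-1 name the first input and Size..2*Size-1 the second (matchBlendMask and the immediate layout are illustrative, not taken from the patch):

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns true if Mask only ever picks element i of
// either input for output position i. Imm gets one bit per element, set
// when the element comes from the second input. -1 means "undef".
bool matchBlendMask(const std::vector<int> &Mask, unsigned &Imm) {
  const int Size = static_cast<int>(Mask.size());
  assert(Size <= 32 && "this sketch packs one bit per element into Imm");
  Imm = 0;
  for (int i = 0; i < Size; ++i) {
    if (Mask[i] < 0 || Mask[i] == i)
      continue; // undef, or taken in place from the first input
    if (Mask[i] != i + Size)
      return false; // element moves position: not a pure blend
    Imm |= 1u << i; // taken in place from the second input
  }
  return true;
}
```

Because the check is a single pass over the mask, running it before any more expensive lowering strategy costs very little.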
Chandler Carruth [Sun, 21 Sep 2014 11:17:55 +0000 (11:17 +0000)]
[x86] Teach the new vector shuffle lowering of v4f64 to prefer a direct
VBLENDPD over using VSHUFPD. While the 256-bit variant of VBLENDPD slows
down to the same speed as VSHUFPD on Sandy Bridge CPUs, it has twice the
reciprocal throughput on Ivy Bridge CPUs much like it does everywhere
for 128-bits. There isn't a downside, so just eagerly use this
instruction when it suffices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218208
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 21 Sep 2014 11:12:19 +0000 (11:12 +0000)]
[x86] Add some more comprehensive tests for v4f64 blending.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218207
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 21 Sep 2014 11:07:41 +0000 (11:07 +0000)]
[x86] Re-generate a bunch of the v4f64 test cases with my new script.
This expands the integer cases to cover the fact that AVX2 moves their
lane-crossing shuffles into the integer domain. It also adds proper
support for AVX2 run lines and the "ALL" group when it doesn't matter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218206
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 21 Sep 2014 10:36:12 +0000 (10:36 +0000)]
[x86] Switch the blend implementation to use a MVT switch rather than
awkward conditions. The readability improvement of this will be even
more important as I generalize it to handle more types.
No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218205
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 21 Sep 2014 10:27:14 +0000 (10:27 +0000)]
[x86] Remove some essentially lying comments from the v4f64 path of the
new vector shuffle lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218204
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 21 Sep 2014 09:35:25 +0000 (09:35 +0000)]
[x86] Fix a helper to reflect that what we actually care about is
128-bit lane crossings, not 'half' crossings. This came up in code
review ages ago, but I hadn't really addressed it. Also added some
documentation for the helper.
No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218203
91177308-0d34-0410-b5e6-
96231b3b80d8
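The distinction between "half" crossings and 128-bit lane crossings can be illustrated with a small standalone check. This is a sketch under assumptions (the name crosses128BitLane, the std::vector mask, and passing the element width in bits are all choices made for the example), not the helper from the patch:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns true if any defined mask element is read
// from a different 128-bit lane than the one it is written to. Indices
// >= Mask.size() refer to the second input and are folded back with %.
bool crosses128BitLane(const std::vector<int> &Mask, unsigned ElementBits) {
  assert(ElementBits != 0 && 128 % ElementBits == 0 &&
         "element width must divide 128");
  const int LaneSize = static_cast<int>(128 / ElementBits);
  const int Size = static_cast<int>(Mask.size());
  for (int i = 0; i < Size; ++i)
    if (Mask[i] >= 0 && (Mask[i] % Size) / LaneSize != i / LaneSize)
      return true;
  return false;
}
```

Keying the test on a fixed 128-bit lane width rather than on half of the vector keeps it meaningful for vectors that have more than two lanes.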
Chandler Carruth [Sun, 21 Sep 2014 09:35:22 +0000 (09:35 +0000)]
[x86] Teach the new vector shuffle lowering the first step toward more
actual support for complex AVX shuffling tricks. We can do independent
blends of the low and high 128-bit lanes of an avx vector, so shuffle
the inputs into place and then do the blend at 256 bits. This will in
many cases remove one blend instruction.
The next step is to permute the low and high halves in-place rather than
extracting them and re-inserting them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218202
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Sun, 21 Sep 2014 09:18:07 +0000 (09:18 +0000)]
MC: Support aligned COMMON symbols for COFF
link.exe:
Fuzz testing has shown that COMMON symbols with size > 32 will always
have an alignment of at least 32 and all symbols with size < 32 will
have an alignment of at least the largest power of 2 less than the size
of the symbol.
binutils:
The BFD linker essentially works like the link.exe behavior but with
alignment 4 instead of 32. The BFD linker also supports an extension to
COFF which adds an -aligncomm argument to the .drectve section which
permits specifying a precise alignment for a variable but MC currently
doesn't support editing .drectve in this way.
With all of this in mind, we decide to play a little trick: we can
ensure that the alignment will be respected by bumping the size of the
global to its alignment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218201
91177308-0d34-0410-b5e6-
96231b3b80d8
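The size-bumping trick from the commit above amounts to something like the following. This is only a sketch of the idea as stated in the message; commonSizeForAlignment is a made-up name, and whether MC computes it exactly this way is an assumption. It does not model the linker-side alignment rules themselves.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: the size to emit for a COMMON symbol so that a
// linker which derives alignment from size ends up honouring Align.
std::uint64_t commonSizeForAlignment(std::uint64_t Size, std::uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  // Never shrink the symbol; only grow it up to its alignment.
  return std::max(Size, Align);
}
```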
Chandler Carruth [Sun, 21 Sep 2014 09:01:26 +0000 (09:01 +0000)]
[x86] Add some more test cases covering specific blend patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218200
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sun, 21 Sep 2014 08:49:27 +0000 (08:49 +0000)]
[x86] Add the beginnings of some tests for our v8f32 shuffle lowering
under AVX.
This really just documents the current state of the world. I'm going to
try to flesh it out to cover any test cases I plan to improve prior to
improving them so that the delta made by changes is actually visible to
code reviewers.
This is made easier by the fact that I now have a script to automate the
process of producing test cases including the check lines. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218199
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Sat, 20 Sep 2014 23:58:13 +0000 (23:58 +0000)]
RTDyldMemoryManager::getSymbolAddress(): Make sure to return 0 if symbol name is not met. [-Wreturn-type]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218195
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Sat, 20 Sep 2014 22:39:16 +0000 (22:39 +0000)]
mop up: "Don’t duplicate function or class name at the beginning of the comment."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218194
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 20 Sep 2014 22:09:27 +0000 (22:09 +0000)]
[x86] Teach the new vector shuffle lowering to use VPERMILPD for
single-input shuffles with doubles. This allows them to fold memory
operands into the shuffle, etc. This is just the analog to the v4f32
case in my prior commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218193
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 20 Sep 2014 21:26:41 +0000 (21:26 +0000)]
[x86] Add an AVX run to the 128-bit v2 tests, teach them to have
a generic SSE and AVX mode in addition to a specific AVX1 test path, and
flesh out the AVX tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218192
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Sat, 20 Sep 2014 21:18:43 +0000 (21:18 +0000)]
Update tests which broke from r218189
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218191
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 20 Sep 2014 20:52:07 +0000 (20:52 +0000)]
[x86] Teach the new vector shuffle lowering to use the AVX VPERMILPS
instruction for single-vector floating point shuffles. This in turn
allows the shuffles to fold a load into the instruction which is one of
the common regressions hit with the new shuffle lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218190
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Sat, 20 Sep 2014 20:40:50 +0000 (20:40 +0000)]
MC: Fix MCSectionCOFF::PrintSwitchToSection
We had a few bugs:
- We were considering the GVKind instead of just looking at the section
characteristics
- We would never print out 'y' when a section was meant to be unreadable
- We would never print out 's' when a section was meant to be shared
- We translated IMAGE_SCN_MEM_DISCARDABLE to 'n' when it should've meant
IMAGE_SCN_LNK_REMOVE
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218189
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 20 Sep 2014 18:36:39 +0000 (18:36 +0000)]
[x86] Start moving to a fancier check syntax to reduce the need for
duplication of check lines. The idea is to have broad sets of
compilation modes that will frequently diverge without having to always
and immediately explode to the precise ISA feature set.
While this already helps due to VEX encoded differences, it will help
much more as I teach the new shuffle lowering about more of the new VEX
encoded instructions which can still be used to implement 128-bit
shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218188
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Sat, 20 Sep 2014 17:44:56 +0000 (17:44 +0000)]
[MCJIT] Make RTDyldMemoryManager::getSymbolAddress's behaviour more consistent.
This patch modifies RTDyldMemoryManager::getSymbolAddress(Name)'s behavior to
make it consistent with how clients are using it: Name should be mangled, and
getSymbolAddress should demangle it on the caller's behalf before looking the
name up in the process. This patch also fixes the one client
(MCJIT::getPointerToFunction) that had been passing unmangled names (by having
it pass mangled names instead).
Background:
RTDyldMemoryManager::getSymbolAddress(Name) has always used a re-try mechanism
when looking up symbol names in the current process. Prior to this patch
getSymbolAddress first tried to look up 'Name' exactly as the user passed it in
and then, if that failed, tried to demangle 'Name' and re-try the look up. The
implication of this behavior is that getSymbolAddress expected to be called with
unmangled names, and that handling mangled names was a fallback for convenience.
This is inconsistent with how clients (particularly the RuntimeDyldImpl
subclasses, but also MCJIT) usually use this API. Most clients pass in mangled
names, and succeed only because of the fallback case. For clients passing in
mangled names, getSymbolAddress's old behavior was actually dangerous, as it
could cause unmangled names in the process to shadow mangled names being looked
up.
For example, consider:
foo.c:
int _x = 7;
int x() { return _x; }
foo.o:
000000000000000c D __x
0000000000000000 T _x
If foo.c becomes part of the process (e.g. via dlopen("libfoo.dylib")) it will
add symbols 'x' (the function) and '_x' (the variable) to the process. However
jit clients looking for the function 'x' will be using the mangled function name
'_x' (note how function 'x' appears in foo.o). When getSymbolAddress goes
looking for '_x' it will find the variable instead, and return its address
in place of the function, leading to JIT'd code calling the variable and
crashing (if we're lucky).
By requiring that getSymbolAddress be called with mangled names, and demangling
only when we're about to do a lookup in the process, the new behavior
implemented in this patch should eliminate any chance of names being shadowed
during lookup.
There's no good way to test this at the moment: This issue only arises when
looking up process symbols (not JIT'd symbols). Any test case would have to
generate a platform-appropriate dylib to pass to llvm-rtdyld, and I'm not
aware of any in-tree tool for doing this in a portable way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218187
91177308-0d34-0410-b5e6-
96231b3b80d8
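The shadowing problem described in the commit above can be reproduced with a toy model. Everything here is illustrative: the process is modelled as a std::map keyed by unmangled names, mangling is assumed to be a single leading underscore (as in the foo.o listing), and lookupOld / lookupNew are hypothetical stand-ins rather than the real getSymbolAddress.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for the symbols visible in the current process.
using SymbolTable = std::map<std::string, std::uint64_t>;

// Old behaviour as described: try the name as given, then retry demangled.
std::uint64_t lookupOld(const SymbolTable &Process, const std::string &Name) {
  auto It = Process.find(Name);
  if (It != Process.end())
    return It->second;
  if (!Name.empty() && Name[0] == '_') {
    It = Process.find(Name.substr(1));
    if (It != Process.end())
      return It->second;
  }
  return 0;
}

// New behaviour as described: the caller passes a mangled name, and it is
// demangled exactly once, right before the process lookup.
std::uint64_t lookupNew(const SymbolTable &Process, const std::string &Mangled) {
  const std::string Name =
      (!Mangled.empty() && Mangled[0] == '_') ? Mangled.substr(1) : Mangled;
  auto It = Process.find(Name);
  return It != Process.end() ? It->second : 0;
}
```

With a table containing both "x" (the function) and "_x" (the variable), lookupOld("_x") returns the variable's address, while lookupNew("_x") returns the function's.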
Justin Bogner [Sat, 20 Sep 2014 17:19:52 +0000 (17:19 +0000)]
llvm-cov: Allow creating CoverageMappings from filenames
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218185
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Sat, 20 Sep 2014 15:31:56 +0000 (15:31 +0000)]
llvm-cov: Disentangle the coverage data logic from the display (NFC)
This splits the logic for actually looking up coverage information
from the logic that displays it. These were tangled rather thoroughly
so this change is a bit large, but it mostly consists of moving things
around. The coverage lookup logic itself now lives in the library,
rather than being spread between the library and the tool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218184
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Sat, 20 Sep 2014 15:31:51 +0000 (15:31 +0000)]
llvm-cov: Move some reader debug output out of the tool.
This debug output is really for testing CoverageMappingReader, not the
llvm-cov tool. Move it to where it can be more useful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218183
91177308-0d34-0410-b5e6-
96231b3b80d8
Lenny Maiorani [Sat, 20 Sep 2014 13:29:20 +0000 (13:29 +0000)]
Using a deque to manage the stack of nodes is faster here.
Vector is slow due to many reallocations as the size regularly changes in
unpredictable ways. See the investigation provided on the mailing list for
more information:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120116/135228.html
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218182
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Sat, 20 Sep 2014 07:31:46 +0000 (07:31 +0000)]
MC: Treat ReadOnlyWithRel and ReadOnlyWithRelLocal as ReadOnly for COFF
A problem with our old behavior becomes observable under x86-64 COFF
when we need a read-only GV which has an initializer which is referenced
using a relocation: we would mark the section as writable. Marking the
section as writable interferes with section merging.
This fixes PR21009.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218179
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 20 Sep 2014 04:15:22 +0000 (04:15 +0000)]
[x86] Teach the v4f32 path of the new shuffle lowering to handle the
tricky case of single-element insertion into the zero lane of a zero
vector.
We can't just use the same pattern here as we do in every other vector
type because the general insertion logic can handle insertion into the
non-zero lane of the vector. However, in SSE4.1 with v4f32 vectors we
have INSERTPS that is a much better choice than the generic one for such
lowerings. But INSERTPS can do lots of other lowerings as well so
factoring its logic into the general insertion logic doesn't work very
well. We also can't just extract the core common part of the general
insertion logic that is faster (forming VZEXT_MOVL synthetic nodes that
lower to MOVSS when they can) because VZEXT_MOVL is often *faster* than
a blend while INSERTPS is slower! So instead we do a restrictive
condition on attempting to use the generic insertion logic to narrow it
to those cases where VZEXT_MOVL won't need a shuffle afterward and thus
will do better than INSERTPS. Then we try blending. Then we go back to
INSERTPS.
This still doesn't generate perfect code for some silly reasons that can
be fixed by tweaking the td files for lowering VZEXT_MOVL to use
XORPS+BLENDPS when available rather than XORPS+MOVSS when the input ends
up in a register rather than a load from memory -- BLENDPSrr has twice
the reciprocal throughput of MOVSSrr. Don't you love this ISA?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218177
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 20 Sep 2014 03:57:01 +0000 (03:57 +0000)]
[x86] Refactor the code for emitting INSERTPS to reuse the zeroable mask
analysis used elsewhere. This removes the last duplicate of this logic.
Also simplify the code here quite a bit. No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218176
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 20 Sep 2014 03:32:25 +0000 (03:32 +0000)]
[x86] Generalize the single-element insertion lowering to work with
floating point types and use it for both v2f64 and v2i64 single-element
insertion lowering.
This fixes the last non-AVX performance regression test case I've gotten
for the new vector shuffle lowering. There is an obvious analogous
lowering for v4f32 that I'll add in a follow-up patch (because with
INSERTPS, v4f32 requires special treatment). After that, it's AVX stuff.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218175
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Sat, 20 Sep 2014 02:44:21 +0000 (02:44 +0000)]
[x86] Replace some duplicated logic reasoning about whether particular
vector lanes can be modeled as zero with a call to the new function that
computes a bit-vector representing that information.
No functionality changed here, but will allow doing more clever things
with the zero-test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218174
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Sat, 20 Sep 2014 00:25:06 +0000 (00:25 +0000)]
llvm-readobj: pretty-print special COFF section names
Print IMAGE_SYM_DEBUG and the like instead of (-2).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218172
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Sat, 20 Sep 2014 00:10:47 +0000 (00:10 +0000)]
Fix crash with an insertvalue that produces an empty object.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218171
91177308-0d34-0410-b5e6-
96231b3b80d8
Robin Morisset [Fri, 19 Sep 2014 23:56:46 +0000 (23:56 +0000)]
[X86] Erase some obsolete comments from README.txt
I just tried reproducing some of the optimization failures in README.txt in the
X86 backend, and many of them could not be reproduced. In general the entire
file appears quite bit-rotted, whatever interesting parts remain should be
moved to bugzilla, and the rest deleted. I did not spend the time to do that,
so I just deleted the few I tried reproducing which are obsolete, to save some
time to whoever will find the courage to do it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218170
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Christopher [Fri, 19 Sep 2014 23:30:42 +0000 (23:30 +0000)]
constify the TargetMachine being passed through the Mips subtarget
creation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218169
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Fri, 19 Sep 2014 23:19:24 +0000 (23:19 +0000)]
Converting InstrProf's error_category to a ManagedStatic to avoid static constructors and destructors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218168
91177308-0d34-0410-b5e6-
96231b3b80d8
Duncan P. N. Exon Smith [Fri, 19 Sep 2014 23:17:58 +0000 (23:17 +0000)]
DIBuilder: Delete dead code, NFC
There are two versions of `DIBuilder::createObjCIVar()`. Delete the one
that's apparently dead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218167
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Sep 2014 23:02:20 +0000 (23:02 +0000)]
R600: Un-xfail a test which passes with pass disabled
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218165
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Sep 2014 23:02:18 +0000 (23:02 +0000)]
R600/SI: Un-xfail tests which work now
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218164
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Fri, 19 Sep 2014 22:46:28 +0000 (22:46 +0000)]
Converting SpillPlacement's BlockFrequency threshold to a ManagedStatic to avoid static constructors and destructors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218163
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Sep 2014 22:42:40 +0000 (22:42 +0000)]
R600/SI: Un-xfail a test that works now
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218162
91177308-0d34-0410-b5e6-
96231b3b80d8
Juergen Ributzka [Fri, 19 Sep 2014 22:23:46 +0000 (22:23 +0000)]
[FastIsel][AArch64] Fix a think-o in address computation.
When looking through sign/zero-extensions the code would always assume there is
such an extension instruction and use the wrong operand for the address.
There was also a minor issue in the handling of 'AND' instructions. I
accidentally used a 'cast' instead of a 'dyn_cast'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218161
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Fri, 19 Sep 2014 22:09:18 +0000 (22:09 +0000)]
Converting object's error_category to a ManagedStatic to avoid static constructors and destructors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218160
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Fri, 19 Sep 2014 21:52:10 +0000 (21:52 +0000)]
[x86] Hoist a function up to the rest of the non-type-specific lowering
helpers, and re-flow the logic to use early exit and be a bit more
readable.
No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218155
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Fri, 19 Sep 2014 21:38:20 +0000 (21:38 +0000)]
Converting the JITDebugLock mutex to a ManagedStatic to avoid the static constructor and destructor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218154
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Fri, 19 Sep 2014 21:20:08 +0000 (21:20 +0000)]
[x86] Hoist the actual lowering logic into a helper function to separate
it from the shuffle pattern matching logic.
Also cleaned up variable names, comments, etc. No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218152
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Fri, 19 Sep 2014 21:07:01 +0000 (21:07 +0000)]
Converting FuncNames to a ManagedStatic to avoid static constructors and destructors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218151
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Fri, 19 Sep 2014 20:42:37 +0000 (20:42 +0000)]
R600/SI: Fix config value for number of gprs
In r217636, the value stored in KernelInfo.Num[VS]GPRSs was changed from
the highest GPR index used to the number of gprs in order to be
consistent with the name of the variable.
The code writing the config values still assumed that the value in this
variable was the highest GPR index used, which caused the compiler to
over report the number of GPRs being used.
https://bugs.freedesktop.org/show_bug.cgi?id=84089
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218150
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Fri, 19 Sep 2014 20:29:02 +0000 (20:29 +0000)]
Eliminating static destructor for the BitCodeErrorCategory by converting to a ManagedStatic.
Summary: This is part of the overall goal of removing static initializers from LLVM.
Reviewers: chandlerc
Reviewed By: chandlerc
Subscribers: chandlerc, llvm-commits
Differential Revision: http://reviews.llvm.org/D5416
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218149
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Fri, 19 Sep 2014 20:00:32 +0000 (20:00 +0000)]
[x86] Fully generalize the zext lowering in the new vector shuffle
lowering to support both anyext and zext and to custom lower for many
different microarchitectures.
Using this allows us to get *exactly* the right code for zext and anyext
shuffles in all the vector sizes. For v16i8, the improvement is *huge*.
I had refused to add the new SSE2 test case before this because it was
sooooo many instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218143
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Sep 2014 19:52:11 +0000 (19:52 +0000)]
Add hsail and amdil64 to Triple
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218142
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Fri, 19 Sep 2014 19:07:17 +0000 (19:07 +0000)]
llvm-cov: Return unique_ptrs instead of filling objects (NFC)
Having create* functions return the object they create is more
readable than using an in-out parameter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218139
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Fri, 19 Sep 2014 19:04:08 +0000 (19:04 +0000)]
llvm-cov: Prevent a test from matching its own check lines
Since llvm-cov shows the source file in its output, be careful about
potentially matching the check lines themselves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218138
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Christopher [Fri, 19 Sep 2014 18:44:27 +0000 (18:44 +0000)]
Revert my earlier change to add "all" as a dependency to check. In
retrospect it really wasn't a good idea.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218136
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Fri, 19 Sep 2014 18:31:25 +0000 (18:31 +0000)]
Fix test case to be portable to different architectures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218134
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Sep 2014 18:11:16 +0000 (18:11 +0000)]
R600/SI: Fix test to prepare for scheduler
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218131
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Fri, 19 Sep 2014 17:03:16 +0000 (17:03 +0000)]
Omit DW_TAG_subprograms for subprograms without inlined subroutines when producing -gmlt data
To reduce the size of -gmlt data, skip the subprograms without any
inlined subroutines. Since we've now got the ability to make these
determinations in the backend (funnily enough - we added the flag so we
wouldn't produce ranges under -gmlt, but with this change we use the
flag, but go back to producing ranges under -gmlt).
Instead, just produce CU ranges to inform the consumer which parts of
the code are described by this CU's line table. Tools could inspect the
line table directly to compute the range, but the CU ranges only seem to
be about 0.5% of object/executable size, so I'm not too worried about
teaching llvm-symbolizer that trick just yet - it's certainly a possible
piece of future work.
Update an llvm-symbolizer test just to demonstrate that this schema is
acceptable there (if it wasn't, the compiler-rt tests would catch this,
but good to have an in-llvm-tree test for llvm-symbolizer's behavior
here)
Building the clang binary with -gmlt with this patch reduces the total
size of object files by 5.1% (5.56% without ranges) without compression
and the executable by 4.37% (4.75% without ranges).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218129
91177308-0d34-0410-b5e6-
96231b3b80d8
Frederic Riss [Fri, 19 Sep 2014 15:12:03 +0000 (15:12 +0000)]
Change DwarfCompileUnit::createGlobalVariable to getOrCreateGlobalVariable.
Summary:
This will allow requesting the creation of a forward-declared variable
at its point of use (for imported declarations, this will be
DwarfDebug::constructImportedEntityDIE) rather than having to put the
forward decl in a retention list.
Note that getOrCreateGlobalVariable returns the actual definition DIE when the
routine creates a declaration and a definition DIE. If you agree this is the
right behavior, then I'll have a followup patch that registers the definition
in the DIE map instead of the declaration as it is today (this 'breaks' only
one test, where we test that the imported entity is the declaration). I'm
not sure what's best here, but it's easy enough for a consumer to follow the
DW_AT_specification link to get to the declaration, whereas it takes more
work to find the actual definition from a declaration DIE.
Reviewers: echristo, dblaikie, aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5381
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218126
91177308-0d34-0410-b5e6-
96231b3b80d8
Frederic Riss [Fri, 19 Sep 2014 15:11:51 +0000 (15:11 +0000)]
Turn local DWARFContext helpers getFileNameForUnit() and getFileLineInfoForCompileUnit() into full-blown DWARFDebugLine::LineTable methods.
Summary:
getFileNameForUnit() is basically a wrapper around LineTable::getFileNameByIndex().
Fold its additional functionality (adding the DWARFUnit compilation dir) into
LineTable::getFileNameByIndex().
getFileLineInfoForCompileUnit() is a wrapper around getFileNameForUnit(). As
a function to search the line information by address, it seems natural to put
it in the LineTable also.
Before this commit only the Context with its private helpers could do LineTable
lookups. This newly exposed feature will be used by the DIE dumping code to
get access to file information referenced in DIE attributes.
This commit has already been partly reviewed in D5192 and contained an
additional and a bit controversial 'realpath' call that is left out of this
patch. We can reinstate that realpath code later if it is desirable.
Test Plan:
The patch contains no tests as it should be functionally equivalent to the
previous code. As requested in the last review, I checked if the relative
path handling copied from the Context to LineTable::getFileNameByIndex()
was covered, and indeed the symbolizer tests fail if it is removed.
Reviewers: dblaikie, echristo, aprantl, samsonov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5354
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218125
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Fri, 19 Sep 2014 12:26:38 +0000 (12:26 +0000)]
Elide unnecessary DenseMap copy.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218122
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Fri, 19 Sep 2014 11:42:56 +0000 (11:42 +0000)]
Optionally enable more-aggressive FMA formation in DAGCombine
The heuristic used by DAGCombine to form FMAs checks that the FMUL has only one
use, but this is overly-conservative on some systems. Specifically, if the FMA
and the FADD have the same latency (and the FMA does not compete for resources
with the FMUL any more than the FADD does), there is no need for the
restriction, and furthermore, forming the FMA while leaving the FMUL can still allow
for higher overall throughput and decreased critical-path length.
Here we add a new TLI callback, enableAggressiveFMAFusion, false by default, to
elide the hasOneUse check. This is enabled for PowerPC by default, as most
PowerPC systems will benefit.
Patch by Olivier Sallenave, thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218120
91177308-0d34-0410-b5e6-
96231b3b80d8
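The change to the heuristic described above boils down to making the one-use check optional. A minimal sketch, using a hypothetical MulNode struct and a plain bool in place of the enableAggressiveFMAFusion callback (none of this is the actual DAGCombine code):

```cpp
// Hypothetical stand-in for the FMUL node feeding an FADD.
struct MulNode {
  unsigned NumUses; // how many nodes consume the product
};

// Returns true if (fadd (fmul a, b), c) should be combined into an FMA.
// AggressiveFusion models a target opting in via the new callback; when
// it is false, the original one-use restriction applies.
bool shouldFormFMA(const MulNode &Mul, bool AggressiveFusion) {
  return AggressiveFusion || Mul.NumUses == 1;
}
```

When the product has other users and the target opts in, the FMUL stays around for those users and the FMA is formed as well.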
Chandler Carruth [Fri, 19 Sep 2014 09:45:21 +0000 (09:45 +0000)]
[x86] Recognize that we can use duplication to widen v16i8 shuffles due
to undef lanes as well as defined widenable lanes. This dramatically
improves the lowering we use for undef-shuffles in a zext-ish pattern
for SSE2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218115
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Fri, 19 Sep 2014 08:51:06 +0000 (08:51 +0000)]
[x86] Actually test the SSE2 lowering for most of the zext-ish shuffles.
Not sure why I only did SSSE3 here. Also, I've left out some of the SSE2
ones because the shuffles are so absurd it's not worth transcribing
them. Will try to fix them to be sane and then check them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218114
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Fri, 19 Sep 2014 08:37:44 +0000 (08:37 +0000)]
[x86] Teach the new vector shuffle lowering to also use pmovzx for v4i32
shuffles that are zext-ing.
Not a lot to see here; the undef lane variant is better handled with
pshufd, but this improves the actual zext pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218112
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Fri, 19 Sep 2014 08:13:16 +0000 (08:13 +0000)]
llvm-cov: Fix dropped lines when filters were applied
Uncovered lines in the middle of a covered region weren't being shown
when filtering to a particular function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218109 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Bogner [Fri, 19 Sep 2014 08:13:12 +0000 (08:13 +0000)]
llvm-cov: Generalize -filename-equivalence
The filename-equivalence flag allows you to show coverage when your
source files don't have the same full paths as those that generated
the data. This is mostly useful for writing tests in a cross-platform
way.
This wasn't triggering in cases where the filename was derived
directly from the coverage data, which meant certain types of test
case were impossible to write. This patch fixes that, and following
patches involve tests that need this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218108 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 19 Sep 2014 06:07:49 +0000 (06:07 +0000)]
[x86] Add a dedicated lowering path for zext-compatible vector shuffles
to the new vector shuffle lowering code.
This allows us to emit PMOVZX variants consistently for patterns where
it is a viable lowering. This instruction is both fast and allows us to
fold loads into it. This only hooks the new lowering up for i16 and i8
element widths, mostly so I could manage the change to the tests. I'll
add the i32 one next, although it is significantly less interesting.
One thing to note is that we already had some tests for these patterns
but those tests had far less horrible instructions. The problem is that
those tests weren't checking the strict start and end of the instruction
sequence. =[ As a consequence something changed in the lowering making
us generate *TERRIBLE* code for these patterns in SSE2 through SSSE3.
I've consolidated all of the tests and spelled out the madness that we
currently emit for these shuffles. I'm going to try to figure out what
has gone wrong here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218102 91177308-0d34-0410-b5e6-96231b3b80d8
Jiangning Liu [Fri, 19 Sep 2014 05:30:35 +0000 (05:30 +0000)]
Optimize sext/zext insertion algorithm in back-end.
With this optimization, we no longer always insert zext for values crossing
basic blocks; instead, we insert sext if the users of such a value prefer a
signed predicate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218101 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Fri, 19 Sep 2014 04:55:05 +0000 (04:55 +0000)]
Omit DW_AT_frame_base under -gmlt for size
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218100 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Fri, 19 Sep 2014 04:47:46 +0000 (04:47 +0000)]
Describe the -gmlt optimization committed in the previous revision.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218099 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Fri, 19 Sep 2014 04:30:36 +0000 (04:30 +0000)]
Omit all the extra static attributes on subprograms in -gmlt
This omission will be done in a fancier manner once we're dealing with
"put gmlt in the skeleton CUs under fission" - it'll have to be
conditional on the kind of CU we're emitting into (skeleton or gmlt).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218098 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Fri, 19 Sep 2014 01:14:56 +0000 (01:14 +0000)]
Fix an it's vs. its typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218093 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 19 Sep 2014 00:42:06 +0000 (00:42 +0000)]
R600: Better fix for bug 20982
Just do the left shift as unsigned to avoid the UB.
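As a rough illustration of the idea (not the actual R600 code), the shift can be done on the unsigned bit pattern and converted back. The helper name shiftLeftNoUB is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Shift a possibly negative 32-bit value left without undefined behavior.
// Before C++20, left-shifting a negative signed value is undefined, so the
// shift is done on the unsigned bit pattern instead.
std::int32_t shiftLeftNoUB(std::int32_t value, unsigned amount) {
  assert(amount < 32 && "shift amount must be smaller than the bit width");
  const std::uint32_t bits = static_cast<std::uint32_t>(value);
  // The conversion back is modular on two's complement targets
  // (guaranteed since C++20, implementation-defined before).
  return static_cast<std::int32_t>(bits << amount);
}
```

For example, shifting -1 left by 4 this way yields -16 on a two's complement target.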
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218092 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Fri, 19 Sep 2014 00:30:24 +0000 (00:30 +0000)]
[x86] Extend this test to cover SSE4.1. Nothing interesting here, but
paves the way for subsequent changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218091 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Thu, 18 Sep 2014 22:56:00 +0000 (22:56 +0000)]
Try to fix i686-cygming bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218086 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Thu, 18 Sep 2014 22:28:56 +0000 (22:28 +0000)]
Use cast<> instead of unchecked dyn_cast<>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218085 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Thu, 18 Sep 2014 21:54:02 +0000 (21:54 +0000)]
Fix sphinx warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218081 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Thu, 18 Sep 2014 21:28:49 +0000 (21:28 +0000)]
LTO: introduce object file-based on-disk module format.
This format is simply a regular object file with the bitcode stored in a
section named ".llvmbc", plus any number of other (non-allocated) sections.
One immediate use case for this is to accommodate compilation processes
which expect the object file to contain metadata in non-allocated sections,
such as the ".go_export" section used by some Go compilers [1], although I
imagine that in the future we could consider compiling parts of the module
(such as large non-inlinable functions) directly into the object file to
improve LTO efficiency.
[1] http://golang.org/doc/install/gccgo#Imports
Differential Revision: http://reviews.llvm.org/D4371
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218078 91177308-0d34-0410-b5e6-96231b3b80d8
Quentin Colombet [Thu, 18 Sep 2014 21:17:50 +0000 (21:17 +0000)]
[ARM] Do not perform a tail call when the caller returns several values.
The fix is slightly different than x86 (see r216117) because the number of values
attached to a return can vary even for a single returned value (e.g., f64 yields
two returned values).
<rdar://problem/18352998>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218076 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Bogner [Thu, 18 Sep 2014 20:31:26 +0000 (20:31 +0000)]
llvm-cov: Simplify FunctionInstantiationSetCollector (NFC)
- Replace std::unordered_map with DenseMap
- Use std::pair instead of manually combining two unsigneds
- Assert if insert is called with invalid arguments
- Avoid an unnecessary copy of a std::vector
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218074 91177308-0d34-0410-b5e6-96231b3b80d8
Robin Morisset [Thu, 18 Sep 2014 18:56:04 +0000 (18:56 +0000)]
Restore "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors"
Summary:
This patch was originally in D5304 (I could not find a way to reopen that revision).
It was accepted, committed and broke the build bots because the overloading of
the constructor of ArrayRef for braced initializer lists is not supported by all
toolchains. I then reverted it, and propose this fixed version that uses a plain
C array instead in makeDMB (that array is then converted implicitly to an
ArrayRef, but that is not behind an ifdef). Could someone confirm whether
initialization lists for plain C arrays are supported by every toolchain used
to build LLVM? Otherwise I can just initialize the array in the old way:
args[0] = ...; .. ; args[5] = ...;
Below is the description of the original patch:
```
I had only tested this code for ARMv7 and ARMv8. This patch adds several
fallback paths if the processor does not support dmb ish:
- dmb sy if a cortex-M with support for dmb
- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
These fallback paths were chosen based on the code for fence seq_cst.
Thanks to luqmana for having noticed this bug.
```
Test Plan: Added more cases to atomic-load-store.ll + make check-all
Reviewers: jfb, t.p.northover, luqmana
Subscribers: llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D5386
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218066 91177308-0d34-0410-b5e6-96231b3b80d8
Aaron Ballman [Thu, 18 Sep 2014 17:34:23 +0000 (17:34 +0000)]
Reverting NFC changes from r218050. Instead, the warning was disabled for GCC in r218059, so these changes are no longer required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218062 91177308-0d34-0410-b5e6-96231b3b80d8
Lang Hames [Thu, 18 Sep 2014 16:43:24 +0000 (16:43 +0000)]
[MCJIT] Fix a debugging-output formatting bug in RuntimeDyld.
The mismatched mask (7 vs (ColsPerRow-1)) could lead to partial lines being
printed out of place.
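To illustrate the class of bug (this is a hypothetical hex dump, not the RuntimeDyld code), the row-break test has to use the same mask that defines the row width. The function dumpBytes and the value 16 for ColsPerRow are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical hex dump that starts a new row every ColsPerRow bytes.
std::string dumpBytes(const std::uint8_t *data, std::size_t size) {
  const std::size_t ColsPerRow = 16;  // assumed width; must be a power of two
  std::string out;
  char buf[8];
  for (std::size_t i = 0; i < size; ++i) {
    // Derive the mask from ColsPerRow. Hard-coding 7 here would break rows
    // every 8 bytes while the rest of the code assumes 16 per row.
    if (i != 0 && (i & (ColsPerRow - 1)) == 0)
      out += '\n';
    std::snprintf(buf, sizeof buf, " %02x", static_cast<unsigned>(data[i]));
    out += buf;
  }
  return out;
}
```

Dumping 20 bytes this way gives one full row of 16 entries followed by a row of 4.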
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218061 91177308-0d34-0410-b5e6-96231b3b80d8
Frederic Riss [Thu, 18 Sep 2014 16:41:04 +0000 (16:41 +0000)]
Revert part of r218041.
The patch moved some logic around in an attempt to generate potentially more
DW_AT_declaration attributes. The patch was flawed though and it stopped
generating the attribute in some cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218060 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Thu, 18 Sep 2014 16:34:25 +0000 (16:34 +0000)]
Disable GCC's -Woverloaded-virtual in the configure+make build. Clang's is better.
Turns out Clang's -Woverloaded-virtual is enabled by -Wall in both CMake
and Configure builds. We were only explicitly specifying it (thus
enabling GCC's version of the warning) in the Configure build.
The specific case of interest is:
  struct base {
    virtual void func();
    virtual void func(int);
  };
  struct derived: base {
    virtual void func(); // GCC warns here, because this causes
                         // func(int) to be hidden
  };
I don't think that's worth getting fussed about (& Clang (indirectly
me... since I improved this warning in Clang) agrees, or we would've made
the warning catch these cases).
Technically this could still lead to bugs/confusion if base had
func(int) and func(bool), derived overrode func(bool) and then a caller
with a derived object tried to call func(42) - it would silently call
func(bool). We should probably improve clang's warnings to catch this at
the call site at some point.
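The func(int)/func(bool) scenario can be sketched as follows. The return strings and the two helper functions are only there to make the selected overload observable; they are not part of the original example.

```cpp
#include <cassert>
#include <string>

struct base {
  virtual ~base() = default;
  virtual std::string func(int) { return "base::func(int)"; }
  virtual std::string func(bool) { return "base::func(bool)"; }
};

struct derived : base {
  // Overriding one overload hides base::func(int) for lookups on derived.
  std::string func(bool) override { return "derived::func(bool)"; }
};

// Name lookup stops at derived, so only func(bool) is a candidate and
// 42 is silently converted to true.
std::string callThroughDerived() {
  derived d;
  return d.func(42);
}

// Through a base reference both overloads are visible, so func(int) wins.
std::string callThroughBase() {
  derived d;
  base &b = d;
  return b.func(42);
}
```

Adding `using base::func;` to derived would bring func(int) back into scope and make both calls pick func(int).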
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218059 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Thu, 18 Sep 2014 15:52:26 +0000 (15:52 +0000)]
R600: Bug 20982 - Avoid undefined left shift of negative value
I'm not sure what the hardware actually does, so don't
bother trying to fold it for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218057 91177308-0d34-0410-b5e6-96231b3b80d8
Robert Khasanov [Thu, 18 Sep 2014 14:06:55 +0000 (14:06 +0000)]
[SKX] Deriving rmb multiclasses from general one (avx512_icmp_packed_rmb and avx512_icmp_cc_rmb).
Thanks to Adam Nemet for pointing this out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218051 91177308-0d34-0410-b5e6-96231b3b80d8
Aaron Ballman [Thu, 18 Sep 2014 13:27:14 +0000 (13:27 +0000)]
Fixing a bunch of -Woverloaded-virtual warnings due to hiding getSubtargetImpl from the base class. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218050 91177308-0d34-0410-b5e6-96231b3b80d8
Patrik Hagglund [Thu, 18 Sep 2014 11:52:57 +0000 (11:52 +0000)]
Alternative (to r216344) fix of gcc -Wpedantic.
As suggested by David Blaikie, this may be easier to read.
The original warning was:
../tools/llvm-cov/llvm-cov.cpp:53:49: error: ISO C++ forbids zero-size array 'argv' [-Werror=pedantic]
std::string Invocation(std::string(argv[0]) + " " + argv[1]);
It seems to be the case that GCC's warning gets confused and thinks
'argv' is a declaration here. GCC bugzilla issue #61259.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218048 91177308-0d34-0410-b5e6-96231b3b80d8
Frederic Riss [Thu, 18 Sep 2014 09:38:23 +0000 (09:38 +0000)]
Always emit DW_AT_declaration attribute when the variable isn't a definition.
Summary:
This doesn't show up today as we don't emit declaration-only variables. This
will be tested when the followup patches implementing import of forward
declared entities land in clang.
Reviewers: echristo, dblaikie, aprantl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5382
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218041 91177308-0d34-0410-b5e6-96231b3b80d8
Frederic Riss [Thu, 18 Sep 2014 09:38:15 +0000 (09:38 +0000)]
Fix DWARFUnitSection::getUnitForOffset().
The current code is only able to return the right unit if the passed offset
is the exact offset of a section. Generalize the search function by comparing
against the offset of the next unit instead and by switching the search
algorithm to upper_bound.
This way, the unit returned is the first unit with a getNextUnitOffset()
strictly greater than the searched offset, which is exactly what we want.
Note that there is no need for testing the range of the resulting unit as
the offsets of a DWARFUnitSection are in a single contiguous range from
0 inclusive to lastUnit->getNextUnitOffset() exclusive.
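A minimal sketch of the described search, assuming a simplified Unit struct in place of the real DWARFUnit class. The struct, its field names and the free function are illustrative, not the actual DWARFUnitSection code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal unit: only the two offsets matter for the search.
struct Unit {
  std::uint32_t Offset;      // where this unit starts
  std::uint32_t NextOffset;  // where the following unit starts
};

// Returns the unit containing Offset, or nullptr if Offset is past the end.
// Units must be sorted and contiguous, starting at offset 0.
const Unit *getUnitForOffset(const std::vector<Unit> &Units,
                             std::uint32_t Offset) {
  // upper_bound finds the first unit whose NextOffset is strictly greater
  // than the searched offset.
  auto It = std::upper_bound(
      Units.begin(), Units.end(), Offset,
      [](std::uint32_t LHS, const Unit &RHS) { return LHS < RHS.NextOffset; });
  return It != Units.end() ? &*It : nullptr;
}
```

Because the units cover one contiguous range, no extra range check on the found unit is needed; only the past-the-end case returns nullptr.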
Reviewers: dblaikie, samsonov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5262
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218040 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Thu, 18 Sep 2014 09:00:25 +0000 (09:00 +0000)]
[x86] Use PALIGNR for v4i32 and v2i64 blends when appropriate.
There is no purpose in using it for single-input shuffles as
pshufd is just as fast and doesn't tie the two operands. This removes
a substantial amount of wrong-domain blend operations in SSSE3 mode. It
also completes the usage of PALIGNR for integer shuffles and addresses
one of the test cases Quentin hit with the new vector shuffle lowering.
There is still the question of whether and when to use this for floating
point shuffles. It is faster than shufps or shufpd but in the integer
domain. I don't yet really have a good heuristic here for when to use
this instruction for floating point vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218038 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Thu, 18 Sep 2014 08:33:04 +0000 (08:33 +0000)]
[x86] Add an SSSE3 run and check mode to the 128-bit v2 tests of the new
vector shuffle lowering. This will be needed for up-coming palignr
tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218037 91177308-0d34-0410-b5e6-96231b3b80d8
Daniel Sanders [Thu, 18 Sep 2014 08:28:39 +0000 (08:28 +0000)]
[mips] Remove custom versions of CCState::AnalyzeReturn() and CCState::AnalyzeCallReturn().
Summary:
The N32/N64 ABIs return f128 values in $f0 and $f2 for hard-float and $v0 and
$a0 for soft-float. The registers used in the soft-float case differ from the
usual $v0 and $v1 specified for return values.
Both cases were previously handled by duplicating the CCState::AnalyzeReturn()
and CCState::AnalyzeCallReturn() functions and modifying them to delegate to
a different assignment function for f128 and further replace the register type
for the hard-float case. There is a simpler way to do both of these.
We now use the common functions and select an initial assignment function based
on whether the original type is f128 or not. We then handle the hard-float case
using CCBitConvertToType<>.
No functional change.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5269
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218036 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 18 Sep 2014 08:07:40 +0000 (08:07 +0000)]
Revert "[FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ."
Reverting it until I have time to investigate a regression.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218035 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 18 Sep 2014 07:26:26 +0000 (07:26 +0000)]
Fix previous commit: [FastISel][AArch64] Simplify XALU multiplies.
When folding the intrinsic flag into the branch or select we also have to
consider whether the intrinsic got simplified, because that changes the
flag we have to check for.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218034 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 18 Sep 2014 07:04:54 +0000 (07:04 +0000)]
[FastISel][AArch64] Simplify XALU multiplies.
Simplify {s|u}mul.with.overflow to {s|u}add.with.overflow when possible.
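The message does not say when the simplification applies. The sketch below assumes the multiply-by-2 case, where x * 2 overflows exactly when x + x does. The helper names are hypothetical and the overflow checks are computed in 64 bits purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Signed 32-bit multiply overflow, computed exactly in 64 bits.
bool smulOverflows(std::int32_t a, std::int32_t b) {
  const std::int64_t wide = static_cast<std::int64_t>(a) * b;
  return wide < std::numeric_limits<std::int32_t>::min() ||
         wide > std::numeric_limits<std::int32_t>::max();
}

// Signed 32-bit add overflow, computed exactly in 64 bits.
bool saddOverflows(std::int32_t a, std::int32_t b) {
  const std::int64_t wide = static_cast<std::int64_t>(a) + b;
  return wide < std::numeric_limits<std::int32_t>::min() ||
         wide > std::numeric_limits<std::int32_t>::max();
}

// Assumed simplification: multiplying by 2 and adding a value to itself
// produce the same result and the same overflow flag.
bool mulByTwoMatchesAdd(std::int32_t x) {
  return smulOverflows(x, 2) == saddOverflows(x, x);
}
```

The same reasoning carries over to the unsigned variants, with the range check replaced by a comparison against the unsigned maximum.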
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218033 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 18 Sep 2014 07:04:49 +0000 (07:04 +0000)]
[FastISel][AArch64] Followup commit for 218031 to handle negative offsets too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218032 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 18 Sep 2014 05:40:47 +0000 (05:40 +0000)]
[FastISel][AArch64] Try to fold the offset into the add instruction when simplifying a memory address.
Small optimization in 'simplifyAddress'. When the offset cannot be encoded in
the load/store instruction, then we need to materialize the address manually.
The add instruction can encode a wider range of immediates than the load/store
instructions. This change tries to fold the offset into the add instruction
first before materializing the offset in a register.
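The decision order described above can be sketched as follows, for non-negative offsets only. The enum and function names are hypothetical, and the encoding checks assume the usual AArch64 forms: a scaled unsigned 12-bit offset for load/store, and a 12-bit immediate optionally shifted left by 12 for ADD. The unscaled load/store form is ignored here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical outcome of simplifying an address with a constant offset.
enum class OffsetPlan { InMemoryOp, FoldIntoAdd, MaterializeInRegister };

// True if imm fits an ADD immediate: 12 bits, optionally shifted left by 12.
bool isAddImmediate(std::uint64_t imm) {
  return (imm >> 12) == 0 || ((imm & 0xfff) == 0 && (imm >> 24) == 0);
}

// True if offset fits the scaled unsigned 12-bit load/store form.
bool isScaledMemOffset(std::uint64_t offset, unsigned accessSize) {
  return offset % accessSize == 0 && offset / accessSize < 4096;
}

// Pick the cheapest way to apply a non-negative offset to a base address.
OffsetPlan planOffset(std::uint64_t offset, unsigned accessSize) {
  if (isScaledMemOffset(offset, accessSize))
    return OffsetPlan::InMemoryOp;   // encode directly in the load/store
  if (isAddImmediate(offset))
    return OffsetPlan::FoldIntoAdd;  // one add, no extra register
  return OffsetPlan::MaterializeInRegister;
}
```

For a byte access at offset 0x10000, the load/store form is out of range but the shifted ADD immediate fits, so the plan is FoldIntoAdd.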
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218031 91177308-0d34-0410-b5e6-96231b3b80d8
Juergen Ributzka [Thu, 18 Sep 2014 05:40:41 +0000 (05:40 +0000)]
[FastISel][AArch64] Fold 'AND' instruction during the address computation.
The 'AND' instruction could be used to mask out the lower 32 bits of a register.
If this is done inside an address computation we might be able to fold the
instruction into the memory instruction itself.
and x1, x1, #0xffffffff ---> ldrb x0, [x0, w1, uxtw]
ldrb x0, [x0, x1]
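The equivalence behind this fold can be checked with plain integer arithmetic: masking the index with 0xffffffff before the add gives the same address as zero-extending its low 32 bits, which is what the uxtw extend does. The function names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Address as computed by the explicit AND followed by a register-offset load.
std::uint64_t addressWithAnd(std::uint64_t base, std::uint64_t index) {
  return base + (index & 0xffffffffULL);
}

// Address as computed by a load with a zero-extended 32-bit index (uxtw).
std::uint64_t addressWithUxtw(std::uint64_t base, std::uint64_t index) {
  const std::uint32_t low = static_cast<std::uint32_t>(index);  // keep w1
  return base + static_cast<std::uint64_t>(low);                // zero-extend
}
```

Both functions agree for every base and index, including indices with bits set above bit 31.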
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218030 91177308-0d34-0410-b5e6-96231b3b80d8