Daniel Sanders [Thu, 10 Jul 2014 13:38:23 +0000 (13:38 +0000)]
[mips] Add support for -modd-spreg/-mno-odd-spreg
Summary:
When -mno-odd-spreg is in effect, 32-bit floating point values are not
permitted in odd FPU registers. The option also prohibits 32-bit and 64-bit
floating point comparison results from being written to odd registers.
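The rule above can be sketched as a simple predicate. This is only an illustration, not code from the patch: the enum, the function name and the parameters are hypothetical, and the sketch models only what the option itself prohibits (whether doubles may live in odd registers at all depends on the FPU mode, which is not modelled).

```cpp
// Hypothetical model of the -mno-odd-spreg restriction; all names are illustrative.
enum class FPValueKind { Single, Double, SingleCompareResult, DoubleCompareResult };

// Returns true if the option alone does not forbid placing a value of the
// given kind in FPU register number RegNo.
bool isFPRegPermitted(unsigned RegNo, FPValueKind Kind, bool NoOddSPReg) {
  const bool IsOdd = (RegNo % 2) != 0;
  if (!NoOddSPReg || !IsOdd)
    return true; // Even registers, or -modd-spreg: nothing is prohibited here.
  // Odd register with -mno-odd-spreg: 32-bit values and comparison results
  // (32-bit and 64-bit) are prohibited; plain 64-bit values are not affected.
  return Kind == FPValueKind::Double;
}
```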
This option has three purposes:
* It allows support for certain MIPS implementations such as loongson-3a that
do not allow the use of odd registers for single precision arithmetic.
* When using -mfpxx, -mno-odd-spreg is the default and this allows us to
statically check that code is compliant with the O32 FPXX ABI since mtc1/mfc1
instructions to/from odd registers are guaranteed not to appear for any
reason. Once this has been established, the user can then re-enable
-modd-spreg to regain the use of all 32 single-precision registers.
* When using -mfp64 and -mno-odd-spreg together, an O32 extension named
O32 FP64A is used as the ABI. This is intended to provide almost all
functionality of an FR=1 processor but can also be executed on an FR=0 core
with the assistance of a hardware compatibility mode which emulates FR=0
behaviour on an FR=1 processor.
Implementation changes:
* Added '.module oddspreg' and '.module nooddspreg', each of which updates
the .MIPS.abiflags section appropriately.
* Moved setFpABI() call inside emitDirectiveModuleFP() so that the caller
doesn't have to remember to do it.
* MipsABIFlags now calculates the flags1 and flags2 members on demand rather
than trying to maintain them in the same format they will be emitted in.
There is one portion of the -mfp64 and -mno-odd-spreg combination that is not
implemented yet. Moves to/from odd-numbered double-precision registers must not
use mtc1. I will fix this in a follow-up.
Differential Revision: http://reviews.llvm.org/D4383
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212717
91177308-0d34-0410-b5e6-
96231b3b80d8
Zinovy Nis [Thu, 10 Jul 2014 13:03:26 +0000 (13:03 +0000)]
[x32] Add AsmBackend for X32 which uses ELF32 with x86_64 (the author is Pavel Chupin).
This is the minimal backend change required to get "hello world" compiled and working on the x32 target (x86_64-linux-gnux32). More patches for x32 will follow.
Differential Revision: http://reviews.llvm.org/D4181
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212716
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 10 Jul 2014 12:32:32 +0000 (12:32 +0000)]
[x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous
to the zero-extend-vector-inreg node introduced previously for the same
purpose: manage the type legalization of widened extend operations,
especially to support the experimental widening mode for x86.
I'm adding both because sign-extend is expanded in terms of any-extend
with shifts to propagate the sign bit. This removes the last
fundamental scalarization from vec_cast2.ll (a test case that hit many
really bad edge cases for widening legalization), although the trunc
tests in that file still appear scalarized because the shuffle
legalization is scalarizing. Funny thing, I've been working on that.
Some initial experiments with this and SSE2 scenarios are showing
moderately good behavior already for sign extension. Still some work to
do on the shuffle combining on X86 before we're generating optimal
sequences, but avoiding scalarization is a huge step forward.
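To illustrate the "sign-extend is expanded in terms of any-extend with shifts" remark, here is a scalar-loop sketch for a v16i8 to v8i16 case. It is not the legalizer code: the function names are hypothetical, the Junk parameter merely models the unspecified high bits an any-extend may leave behind, and the arithmetic right shift of a negative value is assumed (guaranteed from C++20, and the behaviour of common compilers before that).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Any-extend in register: the low 8 input bytes land in the low byte of each
// 16-bit lane; the high byte is unspecified, modelled here by Junk.
std::array<uint16_t, 8> anyExtendInReg(const std::array<uint8_t, 16> &In,
                                       uint8_t Junk) {
  std::array<uint16_t, 8> Out{};
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = static_cast<uint16_t>((Junk << 8) | In[I]);
  return Out;
}

// Sign-extend built from any-extend: shift left to discard the unspecified
// bits, then shift right arithmetically to propagate the sign bit.
std::array<int16_t, 8> signExtendInReg(const std::array<uint8_t, 16> &In,
                                       uint8_t Junk) {
  const std::array<uint16_t, 8> Any = anyExtendInReg(In, Junk);
  std::array<int16_t, 8> Out{};
  for (std::size_t I = 0; I < Out.size(); ++I) {
    const uint16_t Shifted = static_cast<uint16_t>(Any[I] << 8);
    Out[I] = static_cast<int16_t>(static_cast<int16_t>(Shifted) >> 8);
  }
  return Out;
}
```

The result is independent of Junk, which is the reason the any-extend form is sufficient as a building block.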
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212714
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Sandiford [Thu, 10 Jul 2014 11:44:37 +0000 (11:44 +0000)]
[SystemZ] Use SystemZCallingConv.td to define callee-saved registers
Just a clean-up. No behavioral change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212711
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Thu, 10 Jul 2014 11:39:59 +0000 (11:39 +0000)]
SpecialCaseList.h: Fix -Wdocumentation with \code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212710
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Thu, 10 Jul 2014 11:37:39 +0000 (11:37 +0000)]
llvm/test/CodeGen/X86/shift-parts.ll: FileCheck-ize. (from r212640)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212709
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Thu, 10 Jul 2014 11:37:28 +0000 (11:37 +0000)]
Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine."
This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212708
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Sandiford [Thu, 10 Jul 2014 11:29:23 +0000 (11:29 +0000)]
[SystemZ] Tweak instruction format classifications
There's no real need to have Shift as a separate format type from Binary.
The comments for other format types were too specific and in some cases
no longer accurate.
Just a clean-up, no behavioral change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212707
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 10 Jul 2014 11:09:29 +0000 (11:09 +0000)]
[x86] Add another combine that is particularly useful for the new vector
shuffle lowering: match shuffle patterns equivalent to an unpcklwd or
unpckhwd instruction.
This allows us to use generic lowering code for v8i16 shuffles and match
the unpack pattern late.
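As a rough picture of what "equivalent to an unpcklwd or unpckhwd" means, the sketch below classifies a two-operand v8i16 shuffle mask by comparing it with the interleave pattern of each instruction. It is an assumption-laden illustration: the names are hypothetical, indices 0-7 are taken to select from the first operand and 8-15 from the second, and the combine in the patch may match a different form (for example a single-input mask).

```cpp
#include <array>

enum class UnpackKind { None, Low, High };

// Classifies an 8-element word shuffle mask against the interleave patterns
// of unpcklwd (low halves) and unpckhwd (high halves).
UnpackKind classifyV8I16Unpack(const std::array<int, 8> &Mask) {
  const std::array<int, 8> LowMask = {{0, 8, 1, 9, 2, 10, 3, 11}};
  const std::array<int, 8> HighMask = {{4, 12, 5, 13, 6, 14, 7, 15}};
  if (Mask == LowMask)
    return UnpackKind::Low;
  if (Mask == HighMask)
    return UnpackKind::High;
  return UnpackKind::None;
}
```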
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212705
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Sandiford [Thu, 10 Jul 2014 11:00:55 +0000 (11:00 +0000)]
[SystemZ] Add MC support for LEDBRA, LEXBRA and LDXBRA
These instructions aren't used for codegen since the original L*DB instructions
are suitable for fround.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212703
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Sandiford [Thu, 10 Jul 2014 10:52:51 +0000 (10:52 +0000)]
[SystemZ] Avoid using i8 constants for immediate fields
Immediate fields that have no natural MVT type tended to use i8 if the
field was small enough. This was a bit confusing since i8 isn't a legal
type for the target. Fields for short immediates in a 32-bit or 64-bit
operation use i32 or i64 instead, so it would be better to do the same
for all fields.
No behavioral change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212702
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Sandiford [Thu, 10 Jul 2014 10:45:11 +0000 (10:45 +0000)]
[SystemZ] Fix FPR dwarf numbering
The dwarf FPR numbers are supposed to have the order F0, F2, F4, F6,
F1, F3, F5, F7, F8, etc., which matches the pairing of registers for
long doubles. E.g. a long double stored in F0 is paired with F2.
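The ordering can be written down as a small lookup table. This sketch covers only F0-F7, the part the message spells out; the table and function names are hypothetical, and the returned value is a position within the FPR block rather than an absolute DWARF register number.

```cpp
// DWARF order of the first eight FPRs as listed above: F0, F2, F4, F6, F1, F3, F5, F7.
static const int FPRDwarfOrder[8] = {0, 2, 4, 6, 1, 3, 5, 7};

// Returns the position of F<Reg> within the DWARF FPR block, or -1 if Reg is
// outside the range this sketch models.
int dwarfIndexOfFPR(int Reg) {
  for (int I = 0; I < 8; ++I)
    if (FPRDwarfOrder[I] == Reg)
      return I;
  return -1;
}
```

With this order F0 and F2 occupy adjacent positions, matching the long double pairing mentioned in the message.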
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212701
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Thu, 10 Jul 2014 10:18:12 +0000 (10:18 +0000)]
Make it possible for ints/floats to return different values from getBooleanContents()
Summary:
On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer
comparisons return 0 or 1.
Updated the various uses of getBooleanContents. Two simplifications had to be
disabled when float and int boolean contents differ:
- ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially
discoverable (i.e. when the condition of the VSELECT is a SETCC node).
- visitVSELECT (select C, 0, 1) -> (xor C, 1).
Come to think of it, this one could test for the common case of 'C'
being a SETCC too.
Preserved existing behaviour for all other targets and updated the affected
MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low'
variable was counting in the wrong direction because it thought it could simply
add the result of the comparison.
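A small scalar model shows why both the xor simplification and the "just add the comparison result" trick depend on the boolean encoding. The enum below is a simplified stand-in loosely modelled on the boolean-content kinds a target reports, and the helper names are hypothetical.

```cpp
// Simplified stand-in for the two boolean encodings discussed above.
enum class BooleanContent { ZeroOrOne, ZeroOrNegativeOne };

// Encodes a comparison result the way the target would materialize it.
int encodeBool(bool Cond, BooleanContent BC) {
  if (!Cond)
    return 0;
  return BC == BooleanContent::ZeroOrOne ? 1 : -1;
}

// Adds the raw comparison result to a counter. Only correct for ZeroOrOne;
// with ZeroOrNegativeOne the counter moves in the wrong direction.
int addRawComparison(int Counter, int Encoded) { return Counter + Encoded; }

// (select C, 0, 1) -> (xor C, 1). Only inverts the boolean for ZeroOrOne.
int xorWithOne(int Encoded) { return Encoded ^ 1; }
```

For a true condition encoded as -1, addRawComparison decrements instead of incrementing and xorWithOne yields -2 rather than 0, which is why the simplifications are disabled when the encodings differ.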
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D4389
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212697
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 10 Jul 2014 09:57:36 +0000 (09:57 +0000)]
[x86] Expand the target DAG combining for PSHUFD nodes to be able to
combine into half-shuffles through unpack instructions that expand the
half to a whole vector without messing with the dword lanes.
This fixes some redundant instructions in splat-like lowerings for
v16i8, which are now getting to be *really* nice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212695
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 10 Jul 2014 09:16:40 +0000 (09:16 +0000)]
[x86] Tweak the v16i8 single input special case lowering for shuffles
that splat i8s into i16s.
Previously, we would try much too hard to arrange a sequence of i8s in
one half of the input such that we could unpack them into i16s and
shuffle those into place. This isn't always going to be a cheaper i8
shuffle than our other strategies. The case where it is always going to
be cheaper is when we can arrange all the necessary inputs into one half
using just i16 shuffles. It happens that viewing the problem this way
also makes it much easier to produce an efficient set of shuffles to
move the inputs into one half and then unpack them.
With this, our splat code gets one step closer to being not terrible
with the new experimental lowering strategy. It also exposes two
missing combines, which I will add next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212692
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Thu, 10 Jul 2014 07:04:37 +0000 (07:04 +0000)]
A test case for not asserting in isDereferenceablePointer upon unsized types
This is the test case for r212687.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212688
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Thu, 10 Jul 2014 06:06:11 +0000 (06:06 +0000)]
Fix isDereferenceablePointer not to try to take the size of an unsized type.
I'll add a test-case shortly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212687
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Thu, 10 Jul 2014 05:27:53 +0000 (05:27 +0000)]
Allow isDereferenceablePointer to look through some bitcasts
isDereferenceablePointer should not give up upon encountering any bitcast. If
we're casting from a pointer to a larger type to a pointer to a small type, we
can continue by examining the bitcast's operand. This missing capability
was noted in a comment in the function.
In order for this to work, isDereferenceablePointer now takes an optional
DataLayout pointer (essentially all callers already had such a pointer
available). Most code uses isDereferenceablePointer through
isSafeToSpeculativelyExecute (which already took an optional DataLayout
pointer), and to enable the LICM test case, LICM needs to actually provide its DL
pointer to isSafeToSpeculativelyExecute (which it was not doing previously).
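The bitcast condition can be sketched as a size comparison. This is a hypothetical helper, not the code in the patch: PointeeInfo stands in for whatever the DataLayout reports about a pointee type, and treating equal sizes as acceptable is an assumption.

```cpp
#include <cstdint>

// Hypothetical summary of a pointee type: whether it has a size and, if so,
// how many bytes it occupies.
struct PointeeInfo {
  bool IsSized;
  uint64_t SizeInBytes;
};

// Returns true if dereferenceability of the bitcast result can be decided by
// examining the bitcast's operand, i.e. the destination pointee is no larger
// than the source pointee. Unsized types are never looked through.
bool canLookThroughBitcast(const PointeeInfo &Src, const PointeeInfo &Dst) {
  if (!Src.IsSized || !Dst.IsSized)
    return false;
  return Dst.SizeInBytes <= Src.SizeInBytes;
}
```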
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212686
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Thu, 10 Jul 2014 04:50:09 +0000 (04:50 +0000)]
MC: modernise for loop
Convert a for loop to range-based form. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212684
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Thu, 10 Jul 2014 04:50:06 +0000 (04:50 +0000)]
MC: add and use an accessor for WinCFI
This adds a utility method to access the WinCFI information in bulk and uses
it to iterate, rather than requesting the count and iterating over the entries
individually. This is in preparation for restructuring WinCFI handling to
enable clearer sharing across architectures and unwind information emission
for Windows on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212683
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Thu, 10 Jul 2014 04:39:40 +0000 (04:39 +0000)]
Remove move assignment operator to appease older GCCs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212682
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 10 Jul 2014 04:34:06 +0000 (04:34 +0000)]
[x86] Initial improvements to the new shuffle lowering for v16i8
shuffles specifically for cases where a small subset of the elements in
the input vector are actually used.
This is specifically targeted at improving the shuffles generated for
trunc operations, but also helps out splat-like operations.
There is still some really low-hanging fruit here that I want to address
but this is a huge step in the right direction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212680
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Thu, 10 Jul 2014 04:29:06 +0000 (04:29 +0000)]
Explicitly define move constructor and move assignment operator to appease MSVC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212679
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Thu, 10 Jul 2014 03:55:02 +0000 (03:55 +0000)]
SpecialCaseList: use std::unique_ptr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212678
91177308-0d34-0410-b5e6-
96231b3b80d8
Hao Liu [Thu, 10 Jul 2014 03:41:50 +0000 (03:41 +0000)]
[AArch64] Fix an assertion failure in DAG Combiner when concatenating two build_vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212677
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 10 Jul 2014 03:22:20 +0000 (03:22 +0000)]
R600/SI: Add support for llvm.convert.{to|from}.fp16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212676
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 10 Jul 2014 03:22:16 +0000 (03:22 +0000)]
Fix types in documentation.
The examples were using f32, but the IR type is called float
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212675
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Thu, 10 Jul 2014 02:24:26 +0000 (02:24 +0000)]
[x86] Refactor some of the new code for lowering v16i8 shuffles to
remove duplication and make it easier to select different strategies.
No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212674
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Thu, 10 Jul 2014 01:30:39 +0000 (01:30 +0000)]
[dfsan] Handle bitcast aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212668
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 9 Jul 2014 22:53:04 +0000 (22:53 +0000)]
[SDAG] Make the new zext-vector-inreg node default to expand so targets
don't need to set it manually.
This is based on feedback from Tom who pointed out that if every target
needs to handle this we need to reach out to those maintainers. In fact,
it doesn't make sense to duplicate everything when anything other than
expand seems unlikely at this stage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212661
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Wed, 9 Jul 2014 21:02:41 +0000 (21:02 +0000)]
Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information.
Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson
reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped
provide a reduced reproduction, though the failure wasn't too hard to
guess, and even easier with the example to confirm.
The assertion that the subprogram metadata associated with an
llvm::Function matches the scope data referenced by the DbgLocs on the
instructions in that function is not valid under LTO. In LTO, a C++
inline function might exist in multiple CUs and the subprogram metadata
nodes will refer to the same llvm::Function. In this case, depending on
the order of the CUs, the first instance of the subprogram metadata may
not be the one referenced by the instructions in that function and the
assertion will fail.
A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the
assertion removed and a comment added to explain this situation.
Original commit message:
If a function isn't actually in a CU's subprogram list in the debug info
metadata, ignore all the DebugLocs and don't try to build scopes, track
variables, etc.
While this is possibly a minor optimization, it's also a correctness fix
for an incoming patch that will add assertions to LexicalScopes and the
debug info verifier to ensure that all scope chains lead to debug info
for the current function.
Fix up a few test cases that had broken/incomplete debug info that could
violate this constraint.
Add a test case where this occurs by design (inlining a
debug-info-having function in an attribute nodebug function - we want
this to work because /if/ the nodebug function is then inlined into a
debug-info-having function, it should be fine (and will work fine - we
just stitch the scopes up as usual), but should the inlining not happen
we need to not assert fail either).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212649
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Wed, 9 Jul 2014 19:40:08 +0000 (19:40 +0000)]
Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics.
Turn llvm::SpecialCaseList into a simple class that parses text files in
a specified format and knows nothing about LLVM IR. Move this class into
LLVMSupport library. Implement two users of this class:
* DFSanABIList in DFSan instrumentation pass.
* SanitizerBlacklist in Clang CodeGen library.
The latter will be modified to use actual source-level information from frontend
(source file names) instead of unstable LLVM IR things (LLVM Module identifier).
Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212643
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Wed, 9 Jul 2014 19:14:34 +0000 (19:14 +0000)]
Use simpler constructor for range adapter.
It is a good idea, it's slightly clearer and simpler. Unfortunately
the headline news is: we save one line!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212641
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Wed, 9 Jul 2014 19:12:07 +0000 (19:12 +0000)]
Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine.
Do this if the truncate is free and the select is legal.
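On plain scalars the two forms of the combine compute the same value, as the sketch below shows. It is only a scalar model with hypothetical function names; it says nothing about the DAG-level interactions that matter for whether the transform is safe in practice.

```cpp
#include <cstdint>

// trunc (select c, a, b): select at the wide type, then truncate.
uint8_t truncOfSelect(bool C, uint32_t A, uint32_t B) {
  return static_cast<uint8_t>(C ? A : B);
}

// select c, (trunc a), (trunc b): truncate both inputs, then select.
uint8_t selectOfTrunc(bool C, uint32_t A, uint32_t B) {
  const uint8_t TruncA = static_cast<uint8_t>(A);
  const uint8_t TruncB = static_cast<uint8_t>(B);
  return C ? TruncA : TruncB;
}
```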
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212640
91177308-0d34-0410-b5e6-
96231b3b80d8
Jim Grosbach [Wed, 9 Jul 2014 18:55:52 +0000 (18:55 +0000)]
AArch64: Better codegen for storing to __fp16.
Storing will generally be immediately preceded by rounding from an f32
or f64, so make sure to match those patterns directly to convert into the
FPR16 register class directly rather than going through the integer GPRs.
This also eliminates an extra step in the convert-from-f64 path
which was first converting to f32 and then to f16 from there.
rdar://17594379
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212638
91177308-0d34-0410-b5e6-
96231b3b80d8
Jim Grosbach [Wed, 9 Jul 2014 18:55:49 +0000 (18:55 +0000)]
Change an assert() to a diagnostic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212637
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Wed, 9 Jul 2014 18:53:57 +0000 (18:53 +0000)]
TargetRegisterInfo: Remove function that fell out of use years ago.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212636
91177308-0d34-0410-b5e6-
96231b3b80d8
Cameron McInally [Wed, 9 Jul 2014 18:29:55 +0000 (18:29 +0000)]
Update ReleaseNotes to mention Atomic NAND semantic changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212635
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Wed, 9 Jul 2014 18:22:33 +0000 (18:22 +0000)]
[X86] AVX512: Enable it in the Loop Vectorizer
This lets us experiment with 512-bit vectorization without passing
force-vector-width manually.
The code generated for a simple integer memset loop is properly vectorized.
Disassembly is still broken for it though :(.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212634
91177308-0d34-0410-b5e6-
96231b3b80d8
Louis Gerbarg [Wed, 9 Jul 2014 17:54:32 +0000 (17:54 +0000)]
Make AArch64FastISel::EmitIntExt explicitly check its source and destination types
This is a follow up to r212492. There should be no functional difference, but
this patch makes it clear that SrcVT must be an i1/i8/i16/i32 and DestVT must be
an i8/i16/i32/i64.
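The stated constraint amounts to a pair of width checks. The helper below is a hypothetical sketch working on bit widths rather than on LLVM value types, and it encodes only the two lists given in the message.

```cpp
// Returns true if the source and destination integer widths (in bits) are in
// the sets named above: i1/i8/i16/i32 for the source, i8/i16/i32/i64 for the
// destination.
bool isSupportedIntExt(unsigned SrcBits, unsigned DestBits) {
  const bool SrcOK =
      SrcBits == 1 || SrcBits == 8 || SrcBits == 16 || SrcBits == 32;
  const bool DestOK =
      DestBits == 8 || DestBits == 16 || DestBits == 32 || DestBits == 64;
  return SrcOK && DestOK;
}
```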
rdar://17516686
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212633
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 9 Jul 2014 17:49:58 +0000 (17:49 +0000)]
removed duplicate testcase
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212632
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 9 Jul 2014 16:34:54 +0000 (16:34 +0000)]
Fix for PR20059 (instcombine reorders shufflevector after instruction that may trap)
In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem).
This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed.
Differential Revision: http://reviews.llvm.org/D4424
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212629
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Wed, 9 Jul 2014 16:03:10 +0000 (16:03 +0000)]
Add Imagination Technologies to the vendors in llvm::Triple
Summary: This is a pre-requisite for supporting the mips-img-linux-gnu triple in clang.
Differential Revision: http://reviews.llvm.org/D4435
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212626
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Wed, 9 Jul 2014 13:03:37 +0000 (13:03 +0000)]
Generic: add range-adapter for option parsing.
I want to use it in lld, but while I'm here I'll update LLVM uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212615
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 9 Jul 2014 12:36:54 +0000 (12:36 +0000)]
[x86] Fix a bug in my new zext-vector-inreg DAG trickery where we were
not widening the input type to the node sufficiently to let the ext take
place in a register.
This would in turn result in a mysterious bitcast assertion failure
downstream. First change here is to add back the helpful assert I had in
an earlier version of the code to catch this immediately.
Next change is to add support to the type legalization to detect when we
have widened the operand either too little or too much (for whatever
reason) and find a size-matched legal vector type to convert it to
first. This can also fail so we get a new fallback path, but that seems
OK.
With this, we no longer crash on vec_cast2.ll when using widening. I've
also added the CHECK lines for the zero-extend cases here. We still need
to support sign-extend and trunc (or something) to get plausible code
for the other two thirds of this test which is one of the regression
tests that showed the most scalarization when widening was
force-enabled. Slowly closing in on widening being a viable legalization
strategy without it resorting to scalarization at every turn. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212614
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 9 Jul 2014 11:13:16 +0000 (11:13 +0000)]
Sink two variables only used in an assert into the assert itself. Should
fix the release builds with Werror.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212612
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Wed, 9 Jul 2014 11:12:39 +0000 (11:12 +0000)]
X86: When lowering v8i32 himuls use the correct shuffle masks for AVX2.
Turns out my trick of using the same masks for SSE4.1 and AVX2 didn't work out
as we have to blend two vectors. While there, remove unnecessary cross-lane moves
from the shuffles so the backend can lower it to palignr instead of vperm.
Fixes PR20118, a miscompilation of vector sdiv by constant on AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212611
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 9 Jul 2014 10:58:18 +0000 (10:58 +0000)]
[x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening
vector types to be legal and a ZERO_EXTEND node is encountered.
When we use widening to legalize vector types, extend nodes are a real
challenge. Either the input or output is likely to be legal, but in many
cases not both. As a consequence, we don't really have any way to
represent this situation and the prior code in the widening legalization
framework would just scalarize the extend operation completely.
This patch introduces a new DAG node to represent doing a zero extend of
a vector "in register". The core of the idea is to allow legal but
different vector types in the input and output. The output vector must
have fewer lanes but wider elements. The operation is defined to zero
extend the low elements of the input to the size of the output elements,
and drop all of the high elements which don't have a corresponding lane
in the output vector.
It also includes generic expansion of this node in terms of blending
a zero vector into the high elements of the vector and bitcasting
across. This in turn yields extremely nice code for x86 SSE2 when we use
the new widening legalization logic in conjunction with the new shuffle
lowering logic.
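A scalar-loop sketch of the node's semantics and of the generic expansion, for a v16i8 input and a v8i16 output. The function names are hypothetical, and the expansion is modelled with a little-endian lane layout assembled explicitly, so the result does not depend on the host byte order.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Semantics: zero extend the low 8 input elements to 16 bits each and drop
// the high 8 elements, which have no corresponding output lane.
std::array<uint16_t, 8> zeroExtendVectorInReg(const std::array<uint8_t, 16> &In) {
  std::array<uint16_t, 8> Out{};
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = In[I];
  return Out;
}

// Expansion: blend zero bytes into the high byte of every 16-bit slot, then
// reinterpret the 16 bytes as 8 little-endian 16-bit lanes (the "bitcast").
std::array<uint16_t, 8> zextViaBlendAndBitcast(const std::array<uint8_t, 16> &In) {
  std::array<uint8_t, 16> Blended{};
  for (std::size_t I = 0; I < 8; ++I) {
    Blended[2 * I] = In[I];  // low byte comes from the input
    Blended[2 * I + 1] = 0;  // high byte comes from the zero vector
  }
  std::array<uint16_t, 8> Out{};
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = static_cast<uint16_t>(Blended[2 * I] | (Blended[2 * I + 1] << 8));
  return Out;
}
```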
There is still more to do here. We need to support sign extension, any
extension, and potentially int-to-float conversions. My current plan is
to continue using similar synthetic nodes to model each of these
transitions with generic lowering code for each one.
However, with this patch LLVM already reaches performance parity with
GCC for the core C loops of the x264 code (assuming you disable the
hand-written assembly versions) when compiling for SSE2 and SSE3
architectures and enabling the new widening and lowering logic for
vectors.
Differential Revision: http://reviews.llvm.org/D4405
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212610
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Wed, 9 Jul 2014 10:47:26 +0000 (10:47 +0000)]
[mips][mips64r6] Correct select patterns that have the condition or true/false values backwards
Summary: This bug caused SingleSource/Regression/C/uint64_to_float and SingleSource/UnitTests/2002-05-02-CastTest3 to fail (among others).
Differential Revision: http://reviews.llvm.org/D4388
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212608
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Wed, 9 Jul 2014 10:40:20 +0000 (10:40 +0000)]
[mips][mips64r6] Correct cond names in the cmp.cond.[ds] instructions
Summary:
It seems we accidentally read the wrong column of the table in the MIPS64r6 spec
and used the names for c.cond.fmt instead of cmp.cond.fmt.
Differential Revision: http://reviews.llvm.org/D4387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212607
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 9 Jul 2014 10:36:42 +0000 (10:36 +0000)]
[x86] Initialize a pointer to null to fix a bug in r212602.
This should restore GCC hosts (which happen to put the bad stuff into
the pointer) and MSan, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212606
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Wed, 9 Jul 2014 10:21:59 +0000 (10:21 +0000)]
[mips][mips64r6] Use JALR for indirect branches instead of JR (which is not available on MIPS32r6/MIPS64r6)
Summary:
This completes the change to use JALR instead of JR on MIPS32r6/MIPS64r6.
Reviewers: jkolek, vmedic, zoran.jovanovic, dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4269
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212605
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Wed, 9 Jul 2014 10:16:07 +0000 (10:16 +0000)]
[mips][mips64r6] Use JALR for returns instead of JR (which is not available on MIPS32r6/MIPS64r6)
Summary:
RET, and RET_MM have been replaced by a pseudo named PseudoReturn.
In addition a version with a 64-bit GPR named PseudoReturn64 has been
added.
Instruction selection for a return matches RetRA, which is expanded post
register allocation to PseudoReturn/PseudoReturn64. In MipsAsmPrinter,
PseudoReturn/PseudoReturn64 are emitted as:
- (JALR64 $zero, $rs) on MIPS64r6
- (JALR $zero, $rs) on MIPS32r6
- (JR_MM $rs) on microMIPS
- (JR $rs) otherwise
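The selection listed above can be sketched as a chain of checks. The struct and function are hypothetical stand-ins for the subtarget queries, the returned strings simply repeat the forms from the list, and the order in which the conditions are tested is an assumption taken from the order of the list.

```cpp
#include <string>

// Hypothetical summary of the subtarget properties that drive the choice.
struct ReturnTargetInfo {
  bool IsMips64r6;
  bool IsMips32r6;
  bool InMicroMips;
};

// Picks the instruction a PseudoReturn/PseudoReturn64 is emitted as.
std::string lowerPseudoReturn(const ReturnTargetInfo &TI) {
  if (TI.IsMips64r6)
    return "JALR64 $zero, $rs";
  if (TI.IsMips32r6)
    return "JALR $zero, $rs";
  if (TI.InMicroMips)
    return "JR_MM $rs";
  return "JR $rs";
}
```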
On MIPS32r6/MIPS64r6, 'jr $rs' is an alias for 'jalr $zero, $rs'. To aid
development and review (specifically, to ensure all cases of jr are
updated), these aliases are temporarily named 'r6.jr' instead of 'jr'.
A follow up patch will change them back to the correct mnemonic.
Added (JALR $zero, $rs) to MipsNaClELFStreamer's definition of an indirect
jump, and removed it from its definition of a call.
Note: I haven't accounted for MIPS64 in MipsNaClELFStreamer since it
doesn't appear to account for any MIPS64-specifics.
The return instruction created as part of eh_return expansion is now expanded
using expandRetRA() so we use the right return instruction on MIPS32r6/MIPS64r6
('jalr $zero, $rs').
Also, fixed a misuse of isABI_N64() to detect 64-bit wide registers in
expandEhReturn().
Reviewers: jkolek, vmedic, mseaborn, zoran.jovanovic, dsanders
Reviewed By: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4268
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212604
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Wed, 9 Jul 2014 10:07:36 +0000 (10:07 +0000)]
Add ability to emit internal instruction representation to CodeGen assembly output.
Summary:
This patch re-uses the implementation of 'llvm-mc -show-inst' and makes it
available to llc as 'llc -asm-show-inst'.
This is necessary to test parts of MIPS32r6/MIPS64r6 without resorting to
'llc -filetype=obj' tests. For example, on MIPS32r2 and earlier we use the
'jr $rs' instruction for indirect branches and returns. On MIPS32r6, we no
longer have 'jr $rs' and use 'jalr $zero, $rs' instead. The catch is that,
on MIPS32r6, 'jr $rs' is an alias for 'jalr $zero, $rs' and is the preferred
way of writing this instruction. As a result, all MIPS ISAs emit 'jr $rs' in
their assembly output and the assembler encodes this to different opcodes
according to the ISA.
Using this option, we can check that the MCInst really is a JR or a JALR by
matching the emitted comment. This removes the need for a 'llc -filetype=obj'
test.
Reviewers: rafael, dsanders
Reviewed By: dsanders
Subscribers: zoran.jovanovic, llvm-commits
Differential Revision: http://reviews.llvm.org/D4267
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212603
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 9 Jul 2014 10:06:58 +0000 (10:06 +0000)]
[x86] Re-apply a variant of the x86 side of r212324 now that the rest
has settled without incident, removing the x86-specific and overly
strict 'isVectorSplat' routine in favor of generic and more powerful
splat detection.
The primary motivation and result of this is that the x86 backend can
now see through splats which contain undef elements. This is essential
if we are using a widening form of legalization and I've updated a test
case to also run in that mode as before this change the generated code
for the test case was completely scalarized.
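"Seeing through splats which contain undef elements" can be pictured with the sketch below. It is not the generic splat detection itself: the function name is hypothetical, undef lanes are modelled by a sentinel value (so real lane values are assumed to be non-negative), and reporting an all-undef vector as "no splat" is a choice made for this illustration.

```cpp
#include <vector>

// Sentinel marking an undef lane in this model.
const int UndefLane = -1;

// Returns true and stores the common value in Value if every defined lane
// holds the same value. Undef lanes are skipped rather than treated as a
// mismatch.
bool getSplatIgnoringUndef(const std::vector<int> &Lanes, int &Value) {
  bool Found = false;
  for (int Lane : Lanes) {
    if (Lane == UndefLane)
      continue;
    if (!Found) {
      Value = Lane;
      Found = true;
    } else if (Lane != Value) {
      return false;
    }
  }
  return Found;
}
```

A strict check that requires every lane to be identical would reject {7, undef, 7, 7}; skipping undef lanes accepts it as a splat of 7.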
This version of the patch much more carefully handles the undef lanes.
- We aren't overly conservative about them in the shift lowering
(where we will never use the splat itself).
- One place where the splat would have been re-used by the existing code
now explicitly constructs a new constant splat that will be safe.
- The broadcast lowering is much more reasonable with undefs by doing
a correct check of whether the splat is the only user of a loaded
value, checking that the splat actually crosses multiple lanes before
using a broadcast, and handling broadcasts of non-constant splats.
As a consequence of the last bullet, the weird usage of vpshufd instead
of vbroadcast is gone, and we actually can lower an AVX splat with
vbroadcastss where before we emitted a really strange pattern of
a vector load and a manual splat across the vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212602
91177308-0d34-0410-b5e6-
96231b3b80d8
Timur Iskhodzhanov [Wed, 9 Jul 2014 08:35:33 +0000 (08:35 +0000)]
[ASan/Win] Don't instrument COMDAT globals. Properly fixes PR20244.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212596
91177308-0d34-0410-b5e6-
96231b3b80d8
Dmitri Gribenko [Wed, 9 Jul 2014 08:30:15 +0000 (08:30 +0000)]
SourceMgr: consistently use 'unsigned' for the memory buffer ID type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212595
91177308-0d34-0410-b5e6-
96231b3b80d8
Alp Toker [Wed, 9 Jul 2014 06:27:05 +0000 (06:27 +0000)]
Prospective -fsanitize=memory build fix following r212586
This -f group flag appears to influence linker flags, breaking the usual rules
and causing CMake's link invocation to fail during feature detection due to
missing link dependencies (msan_*).
Let's forcibly add it for now to get things the way they were before feature
detection started working.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212590
91177308-0d34-0410-b5e6-
96231b3b80d8
Nikola Smiljanic [Wed, 9 Jul 2014 05:34:24 +0000 (05:34 +0000)]
Use correct member when displaying StringMap's size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212588
91177308-0d34-0410-b5e6-
96231b3b80d8
Alp Toker [Wed, 9 Jul 2014 03:39:32 +0000 (03:39 +0000)]
CMake: make __DATE__, __TIME__ etc. macro usage an error
When LLVM_ENABLE_TIMESTAMPS has been disabled we can prevent the preprocessor
from embedding dates, times and file timestamps.
There are a few motivations for this:
1) Validate the recent CMake feature detection bugfix from LLVM r212586 with
a flag that's not actually available everywhere.
2) Dogfood clang's new -Wdate-time warning from r210511 when bootstrapping.
3) Encourage reproducible builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212587
91177308-0d34-0410-b5e6-
96231b3b80d8
Alp Toker [Wed, 9 Jul 2014 03:38:19 +0000 (03:38 +0000)]
CMake: fix compiler feature detection
add_flag_if_supported() and add_flag_or_print_warning() were effectively
no-ops, just returning the value of the first result (usually
'-fno-omit-frame-pointer') for all subsequent checks for different flags.
Due to the way CMake caches feature detection results, we need to provide
symbolic variable names which will persist the cached results. This commit
fixes feature detection using these two macros.
The feature checks now run and get stored correctly, and the correct output can
be observed in configure logs:
-- Performing Test C_SUPPORTS_FPIC
-- Performing Test C_SUPPORTS_FPIC - Success
-- Performing Test CXX_SUPPORTS_FPIC
-- Performing Test CXX_SUPPORTS_FPIC - Success
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212586
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Wed, 9 Jul 2014 00:41:34 +0000 (00:41 +0000)]
[SDAG] At the suggestion of Hal, switch to an output parameter that
tracks which elements of the build vector are in fact undef.
This should make actually inspecting them (likely in my next patch)
reasonably pretty. Also makes the output parameter optional as it is
clear now that *most* users are happy with undefs in their splats.
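For illustration only (not code from this commit): a minimal sketch of the
optional output parameter idea, modelling a build vector as a list of lanes
that are either a constant or undef. The names getSplatValue, Lane and
UndefElements are hypothetical stand-ins and are not claimed to match the real
SDAG API.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// A lane is either a known constant or undef (std::nullopt).
using Lane = std::optional<int>;

// Returns the splatted value if all defined lanes agree, std::nullopt
// otherwise (including the all-undef case). If UndefElements is non-null,
// it is resized to the lane count and entry I is set when lane I is undef,
// so only callers that care about undef lanes pay attention to them.
std::optional<int> getSplatValue(const std::vector<Lane> &Lanes,
                                 std::vector<bool> *UndefElements = nullptr) {
  if (UndefElements)
    UndefElements->assign(Lanes.size(), false);

  std::optional<int> Splat;
  bool IsSplat = true;
  for (std::size_t I = 0; I < Lanes.size(); ++I) {
    if (!Lanes[I]) {
      if (UndefElements)
        (*UndefElements)[I] = true;
      continue;
    }
    if (!Splat)
      Splat = Lanes[I];
    else if (*Splat != *Lanes[I])
      IsSplat = false;
  }
  if (!IsSplat)
    return std::nullopt;
  return Splat;
}
```

A caller that is happy with undefs in its splat simply omits the second
argument; a caller that needs to reason about them passes a vector and checks
the recorded lanes afterwards.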
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212581
91177308-0d34-0410-b5e6-
96231b3b80d8
Ehsan Akhgari [Wed, 9 Jul 2014 00:40:50 +0000 (00:40 +0000)]
[ms-coff] Add a test for proper handling of full Windows path names in the .drectve section
Summary: This test ensures that we can correctly specify a full Windows path to the clang ASAN runtime libraries. This is in preparation to fix PR20246.
Reviewers: rafael
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4427
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212580
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Tue, 8 Jul 2014 23:48:22 +0000 (23:48 +0000)]
MipsTargetStreamer.h: Avoid "using" to appease msc17.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212577
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Enderby [Tue, 8 Jul 2014 23:47:31 +0000 (23:47 +0000)]
Changed the llvm-nm alias "-s" for -print-armap to "-M".
This will allow the "-s" flag to be implemented in the future as it
is in darwin’s nm(1) to list symbols only in the specified section.
Given a LGTM by Shankar Easwaran who originally implemented
the support for llvm-nm’s -print-armap and archive map symbols.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212576
91177308-0d34-0410-b5e6-
96231b3b80d8
Jim Grosbach [Tue, 8 Jul 2014 23:28:48 +0000 (23:28 +0000)]
AArch64: Better codegen for loading from __fp16.
Loading will generally extend to an f32 or an f64, so make sure
to match those patterns directly to load into the FPR16 register
class directly rather than going through the integer GPRs.
This also eliminates an extra step in the convert-to-f64 path
which was first converting to f32 and then to f64 from there.
rdar://17594379
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212573
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Tue, 8 Jul 2014 23:16:49 +0000 (23:16 +0000)]
Improve BasicAA CS-CS queries
BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.
Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.
This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.
Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212572
91177308-0d34-0410-b5e6-
96231b3b80d8
Tobias Grosser [Tue, 8 Jul 2014 22:51:03 +0000 (22:51 +0000)]
DominanceInfo is strongly preferred over RegionInfo
This is and always was strong community consensus. Make this clear in the header
in case newcomers may not be aware.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212570
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Enderby [Tue, 8 Jul 2014 22:10:02 +0000 (22:10 +0000)]
Add support for BSD format Archive map symbols (aka the table of contents
from a __.SYMDEF or "__.SYMDEF SORTED" archive member).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212568
91177308-0d34-0410-b5e6-
96231b3b80d8
Pete Cooper [Tue, 8 Jul 2014 17:06:03 +0000 (17:06 +0000)]
Revert "GlobalDCE: Delete available_externally initializers if it allows removing the value the initializer is referring to."
This reverts commit 5b55a47e94e28fbb56d0cd5d72c3db9105c15b4c.
A test case was found to crash after this was applied. I'll file a bug to track fixing this with the test case needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212550
91177308-0d34-0410-b5e6-
96231b3b80d8
Ulrich Weigand [Tue, 8 Jul 2014 16:16:02 +0000 (16:16 +0000)]
[PowerPC] Implement atomic NAND operations as actual NAND
This changes the implementation of atomic NAND operations
from "a & ~b" (compatible with GCC < 4.4) to actual "~(a & b)"
(compatible with GCC >= 4.4).
This is in line with the common-code and ARM back-end change
implemented in r212433.
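For illustration only (not code from this commit): a small sketch contrasting
the two meanings of "nand" named above, plus a compare-exchange loop that
performs the true NAND atomically. The function names are hypothetical; the
loop shows the semantics and says nothing about the PowerPC instruction
sequence the backend emits.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Old meaning (GCC < 4.4): a & ~b.
constexpr std::uint32_t oldNand(std::uint32_t A, std::uint32_t B) {
  return A & ~B;
}

// Actual NAND (GCC >= 4.4): ~(a & b).
constexpr std::uint32_t trueNand(std::uint32_t A, std::uint32_t B) {
  return ~(A & B);
}

// The two definitions disagree on the same inputs.
static_assert(oldNand(0xF0F0u, 0xFF00u) == 0x000000F0u, "a & ~b");
static_assert(trueNand(0xF0F0u, 0xFF00u) == 0xFFFF0FFFu, "~(a & b)");

// Atomically replaces Obj with ~(Obj & B) and returns the previous value.
std::uint32_t fetchNand(std::atomic<std::uint32_t> &Obj, std::uint32_t B) {
  std::uint32_t Old = Obj.load();
  // On failure compare_exchange_weak reloads Old, so the new value is
  // recomputed from the latest contents on every retry.
  while (!Obj.compare_exchange_weak(Old, ~(Old & B))) {
  }
  return Old;
}
```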
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212547
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrea Di Biagio [Tue, 8 Jul 2014 15:22:29 +0000 (15:22 +0000)]
[DAG] Teach how to combine a pair of shuffles into a single shuffle if the resulting mask is legal.
This patch teaches how to fold a shuffle according to rule:
shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2)
We do this only if the resulting mask M2 is legal; this is to avoid introducing
illegal shuffles that are potentially expanded into a sub-optimal sequence
of target specific dag nodes.
This patch has the advantage of being target independent, since it works on ISD
nodes. Therefore, all targets (not only x86) can take advantage of this rule.
The idea behind this patch is that most shuffle pairs can be safely combined
before we run the legalizer on vector operations. This allows us to
combine/simplify dag nodes earlier in the process and not only immediately
before instruction selection stage.
That said, this patch is not meant to replace any existing target specific
combine rules; backends might still introduce new shuffles during the
legalization stage. Also, this rule is very simple and avoids aggressively
optimizing shuffles.
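For illustration only (not code from this commit): a minimal sketch of how the
combined mask M2 in the rule above can be derived from M0 and M1. It assumes a
mask convention where entry i names the source lane for result lane i and a
negative entry means undef; composeMasks is a hypothetical name, and the
target-specific "is M2 legal" check is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed convention: Mask[I] is the source lane for result lane I;
// a negative entry means the result lane is undef.
using Mask = std::vector<int>;

// Computes M2 such that
//   shuffle(shuffle(x, undef, M0), undef, M1) == shuffle(x, undef, M2).
Mask composeMasks(const Mask &M0, const Mask &M1) {
  const int NumElts = static_cast<int>(M0.size());
  Mask M2(M1.size(), -1);
  for (std::size_t I = 0; I < M1.size(); ++I) {
    int Outer = M1[I];
    // Undef in the outer mask, or a read from the undef second operand.
    if (Outer < 0 || Outer >= NumElts)
      continue;
    int Inner = M0[Outer];
    // The inner shuffle may itself produce an undef lane here.
    if (Inner >= 0 && Inner < NumElts)
      M2[I] = Inner;
  }
  return M2;
}
```

For example, applying the mask <3, 0, 2, 1> twice folds into the single mask
<1, 3, 2, 0>.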
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212539
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Tue, 8 Jul 2014 14:55:06 +0000 (14:55 +0000)]
Fix some Twine locals.
Two of those are use after frees. Found by clang-tidy, fixed by me.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212537
91177308-0d34-0410-b5e6-
96231b3b80d8
Timur Iskhodzhanov [Tue, 8 Jul 2014 13:18:58 +0000 (13:18 +0000)]
[ASan/Win] Don't instrument private COMDAT globals until PR20244 is properly fixed
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212530
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Tue, 8 Jul 2014 13:13:42 +0000 (13:13 +0000)]
[mips] Fixed struct/class mismatch introduced in r212522.
Clang emits a warning about this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212528
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Tue, 8 Jul 2014 10:35:52 +0000 (10:35 +0000)]
Fix r212522 - [mips] Improve encapsulation of the .MIPS.abiflags implementation and limit scope of related enums
Added two lines that should have been in r212522.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212523
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Tue, 8 Jul 2014 10:11:38 +0000 (10:11 +0000)]
[mips] Improve encapsulation of the .MIPS.abiflags implementation and limit scope of related enums
Summary:
Follow on to r212519 to improve the encapsulation and limit the scope of the enums.
Also merged two very similar parser functions, fixed a bug where ASE's
were not being reported, and marked CPR1's as being 128-bit when MSA is
enabled.
Differential Revision: http://reviews.llvm.org/D4384
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212522
91177308-0d34-0410-b5e6-
96231b3b80d8
Renato Golin [Tue, 8 Jul 2014 10:06:16 +0000 (10:06 +0000)]
Revert "Refactor ARM subarchitecture parsing"
This reverts commit 7b4a6882467e7fef4516a0cbc418cbfce0fc6f6d.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212521
91177308-0d34-0410-b5e6-
96231b3b80d8
Arnaud A. de Grandmaison [Tue, 8 Jul 2014 09:53:04 +0000 (09:53 +0000)]
Truncate the immediate in logical operation to the register width
And continue to produce an error if the 32 most significant bits are not all ones or zeros.
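For illustration only (not code from this commit): a minimal sketch of the
check described above for a 32-bit register, assuming the immediate is parsed
as a 64-bit value. truncateLogicalImm32 is a hypothetical name; an empty
result stands in for the error the assembler would report.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Accepts a 64-bit immediate for a 32-bit register only if its upper 32 bits
// are all zeros or all ones, and returns the low 32 bits in that case.
std::optional<std::uint32_t> truncateLogicalImm32(std::int64_t Imm) {
  const std::uint64_t Bits = static_cast<std::uint64_t>(Imm);
  const std::uint64_t Upper = Bits >> 32;
  if (Upper != 0 && Upper != 0xFFFFFFFFu)
    return std::nullopt; // would be diagnosed as an error
  return static_cast<std::uint32_t>(Bits);
}
```

With this rule a negative immediate such as -2 is accepted and becomes
0xFFFFFFFE, while 0x100000000 is rejected.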
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212520
91177308-0d34-0410-b5e6-
96231b3b80d8
Vladimir Medic [Tue, 8 Jul 2014 08:59:22 +0000 (08:59 +0000)]
.MIPS.abiflags is a new, implicitly generated section that will be present in all new modules. The section contains a versioned data structure that gives a program loader the information it needs to determine the requirements of the application. This patch implements the .MIPS.abiflags section and provides test cases for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212519
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Tue, 8 Jul 2014 08:45:38 +0000 (08:45 +0000)]
[x86,SDAG] Sink the logic for folding shuffles of splats more
aggressively from the x86 shuffle lowering to the generic SDAG vector
shuffle formation code.
This code already tried to fold away shuffles of splats! It just had
lots of bugs and couldn't handle the case my new x86 shuffle lowering
needed.
First, it failed to correctly compute whether N2 was undef because it
pre-computed this, then did transformations which could *make* N2 undef,
then failed to ever re-consider the precomputed state.
Second, it didn't look through bitcasts at all, even in the safe cases
where they are just element-type bitcasts with no change to the number
of elements.
Third, it didn't handle all-zero bit casts nicely the way my code in the
x86 side of things did, which is essential to getting good zext-shuffle
lowerings.
But all of these are generic. I just ported the code down to this layer
and fixed the surrounding bugs. Tests exercising this in the x86 backend
still pass and some silly code in widen_cast-6.ll gets better. I updated
that test to be a bit more precise but it's still pretty unclear what
the value of the test is in this day and age.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212517
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Tue, 8 Jul 2014 07:44:15 +0000 (07:44 +0000)]
[SDAG] Actually check for a non-constant splat and clarify comments
around the handling of UNDEF lanes in boolean vector content analysis.
The code before my changes here also failed to check for non-constant
splats in a buildvector. I have no idea how to trigger this, I just
spotted it by inspection when trying to understand the code. It seems
extremely unlikely to be worth the trouble to teach the only caller of
this code (DAG combining setcc patterns) how to cleverly handle undef
lanes, so I've just commented more thoroughly that we're giving up
there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212515
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Tue, 8 Jul 2014 07:19:55 +0000 (07:19 +0000)]
[SDAG] Build up a more rich set of APIs for querying build-vector SDAG
nodes about whether they are splats. This is factored out and improved
from r212324 which got reverted as it was far too aggressive. The new
API should help more conservatively handle buildvectors that are
a mixture of splatted and undef values.
No functionality change at this point. The hope is to slowly
re-introduce the undef-tolerant optimization of splats, but each time
being forced to make a conscious decision about how to handle the undefs
in a way that doesn't lead to contradicting assumptions about the
collapsed value.
Hal has pointed out in discussions that this may not end up being the
desired API and instead it may be more convenient to get a mask of the
undef elements or something similar. I'm starting simple and will expand
the API as I adapt actual callers and see exactly what they need.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212514
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Tue, 8 Jul 2014 00:50:49 +0000 (00:50 +0000)]
[ASan] Completely remove sanitizer blacklist file from instrumentation pass.
All blacklisting logic is now moved to the frontend (Clang).
If a function (or source file it is in) is blacklisted, it doesn't
get sanitize_address attribute and is therefore not instrumented.
If a global variable (or source file it is in) is blacklisted, it is
reported to be blacklisted by the entry in llvm.asan.globals metadata,
and is not modified by the instrumentation.
The latter may lead to certain false positives - not all the globals
created by Clang are described in llvm.asan.globals metadata (e.g,
RTTI descriptors are not), so we may start reporting errors on them
even if "module" they appear in is blacklisted. We assume it's fine
to take such risk:
1) errors on these globals are rare and usually indicate wild memory access
2) we can lazily add descriptors for these globals into llvm.asan.globals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212505
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Tue, 8 Jul 2014 00:22:32 +0000 (00:22 +0000)]
[X86] AVX512: Only allow k1-k7 as predicates to vpcmp*
k0 is allowed as a destination but not as a predicate/writemask.
I also modified the test to allow checking of error messages by the assembler.
I applied a similar approach to the test ret.s in the same directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212504
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Tue, 8 Jul 2014 00:03:11 +0000 (00:03 +0000)]
Kill unnecessary include
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212503
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrea Di Biagio [Mon, 7 Jul 2014 23:25:23 +0000 (23:25 +0000)]
[x86] Fix assertion failure caused by a wrong combine of PSHUFD nodes with different types.
When combining a sequence of two PSHUFD dag nodes into a single PSHUFD,
make sure that we assign the correct type to the resulting PSHUFD.
X86ISD::PSHUFD dag nodes can be either MVT::v4i32 or MVT::v4f32.
Before this change, an assertion failure was triggered in method
'DAGCombinerInfo::CombineTo' when trying to combine the shuffles from the test
below into a single PSHUFD.
define <4 x float> @test1(<4 x float> %V) {
%1 = shufflevector <4 x float> %V, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1>
%2 = shufflevector <4 x float> %1, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1>
ret <4 x float> %2
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212498
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 7 Jul 2014 22:13:58 +0000 (22:13 +0000)]
fixed some typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212495
91177308-0d34-0410-b5e6-
96231b3b80d8
Juergen Ributzka [Mon, 7 Jul 2014 21:52:21 +0000 (21:52 +0000)]
[FastISel][X86] Fix smul.with.overflow.i8 lowering.
Add custom lowering code for signed multiply instruction selection, because the
default FastISel instruction selection for ISD::MUL will use unsigned multiply
for the i8 type and signed multiply for all other types. This would set the
incorrect flags for the overflow check.
This fixes <rdar://problem/17549300>
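For illustration only (not code from this commit): a small sketch of why the
signedness of the i8 multiply matters for the overflow check. The same bit
patterns can overflow as unsigned values but not as signed values, and vice
versa. The helper names are hypothetical and model the arithmetic, not the
X86 flags.

```cpp
#include <cassert>
#include <cstdint>

// Signed 8-bit multiply overflows when the exact product does not fit
// in the range of int8_t.
bool smulOverflowI8(std::int8_t A, std::int8_t B) {
  const int Wide = static_cast<int>(A) * static_cast<int>(B);
  return Wide < INT8_MIN || Wide > INT8_MAX;
}

// Unsigned 8-bit multiply overflows when the exact product does not fit
// in the range of uint8_t.
bool umulOverflowI8(std::uint8_t A, std::uint8_t B) {
  const unsigned Wide = static_cast<unsigned>(A) * static_cast<unsigned>(B);
  return Wide > UINT8_MAX;
}
```

For the bit pattern 0xFF in both operands, the signed product is (-1) * (-1)
= 1 with no overflow, while the unsigned product is 255 * 255 = 65025, which
overflows. For 16 * 8 = 128 the situation is reversed.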
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212493
91177308-0d34-0410-b5e6-
96231b3b80d8
Louis Gerbarg [Mon, 7 Jul 2014 21:37:51 +0000 (21:37 +0000)]
Allow AArch64FastISel to degrade gracefully in the presence of an MVT::i128
Currently AArch64FastISel crashes if it tries to extend an integer into an
MVT::i128. This can happen by creating 128 bit integers like so:
typedef unsigned int uint128_t __attribute__((mode(TI)));
typedef int sint128_t __attribute__((mode(TI)));
This patch makes EmitIntExt check for their presence and then falls back to
SelectionDAG.
Tests included.
rdar://17516686
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212492
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 7 Jul 2014 21:19:00 +0000 (21:19 +0000)]
Fix for PR17073 ( llvm.org/pr17073 ), simplifycfg illegally hoists an operation in a phi node that can trap.
This patch adds to an existing loop over phi nodes in SimplifyCondBranchToCondBranch() to check for trapping ops and bails out of the optimization if we find one of those.
The test cases verify that trapping ops are not hoisted and non-trapping ops are still optimized as expected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212490
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Mon, 7 Jul 2014 20:34:51 +0000 (20:34 +0000)]
Use raw_fd_ostream instead of std::ofstream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212483
91177308-0d34-0410-b5e6-
96231b3b80d8
Renato Golin [Mon, 7 Jul 2014 20:01:11 +0000 (20:01 +0000)]
Refactor ARM subarchitecture parsing
According to a FIXME in ARMMCTargetDesc.cpp the ARM version parsing should be
in the Triple helper class.
Patch by: Gabor Ballabas
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212479
91177308-0d34-0410-b5e6-
96231b3b80d8
Ulrich Weigand [Mon, 7 Jul 2014 19:41:54 +0000 (19:41 +0000)]
[PowerPC] Fix testcase regression
Use -mcpu to avoid different codegen depending on host platform.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212478
91177308-0d34-0410-b5e6-
96231b3b80d8
Ulrich Weigand [Mon, 7 Jul 2014 19:39:44 +0000 (19:39 +0000)]
[PowerPC] Fix no-assert build
r212476 caused a compile failure (unused variable) in a non-assertion
build ...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212477
91177308-0d34-0410-b5e6-
96231b3b80d8
Ulrich Weigand [Mon, 7 Jul 2014 19:26:41 +0000 (19:26 +0000)]
[PowerPC] Fix "byval align" arguments
Arguments passed as "byval align" should get the specified alignment
in the parameter save area. There was some code in PPCISelLowering.cpp
that attempted to implement this, but this didn't work correctly:
while the code did update the ArgOffset value, it neglected to update
the PtrOff value (which was already computed from the old ArgOffset),
and it also neglected to update GPR_idx -- fields skipped due to
alignment in the save area must likewise be skipped in GPRs.
This patch fixes and simplifies this logic by:
- handling argument offset alignment right at the beginning
of argument processing, using a new helper routine
CalculateStackSlotAlignment (this avoids having to update
PtrOff and other derived values later on)
- not tracking GPR_idx separately, but always computing the
correct GPR_idx for each argument *from* its ArgOffset
- removing some redundant computation in LowerFormalArguments:
MinReservedArea must equal ArgOffset after argument processing,
so there's no use in computing it twice.
[This doesn't change the behavior of the current clang front-end,
since that never creates "byval align" arguments at the moment.
This will change with a follow-on patch, however.]
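For illustration only (not code from this commit): a minimal sketch of the
idea of aligning the argument offset first and then deriving the GPR index
from that offset, so that slots skipped for alignment are skipped in GPRs as
well. The constants (8-byte slots, a 48-byte linkage area, 8 argument GPRs)
and all function names are assumptions chosen for the example, not values
taken from PPCISelLowering.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for the example only.
constexpr std::uint64_t PtrByteSize = 8;  // size of one save-area slot
constexpr std::uint64_t LinkageSize = 48; // offset of the first slot
constexpr std::uint64_t NumArgGPRs = 8;   // number of argument GPRs

// Rounds Offset up to the next multiple of Align (a power of two).
constexpr std::uint64_t alignOffset(std::uint64_t Offset, std::uint64_t Align) {
  return (Offset + Align - 1) & ~(Align - 1);
}

// Derives the GPR index directly from the argument's save-area offset
// instead of tracking it in a separate counter.
constexpr std::uint64_t gprIndexFor(std::uint64_t ArgOffset) {
  return (ArgOffset - LinkageSize) / PtrByteSize;
}

// True if the slot at ArgOffset is still covered by an argument GPR.
constexpr bool isInGPR(std::uint64_t ArgOffset) {
  return gprIndexFor(ArgOffset) < NumArgGPRs;
}
```

Under these assumptions, a "byval align 16" argument that follows one 8-byte
argument starts at offset 64 rather than 56, and its GPR index is 2 rather
than 1: the slot skipped in the save area is skipped in the GPRs too.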
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212476
91177308-0d34-0410-b5e6-
96231b3b80d8
Chandler Carruth [Mon, 7 Jul 2014 19:03:32 +0000 (19:03 +0000)]
[x86] Revert r212324 which was too aggressive w.r.t. allowing undef
lanes in vector splats.
The core problem here is that undef lanes can't *unilaterally* be
considered to contribute to splats. Their handling needs to be more
cautious. There is also a reported failure of the nightly testers
(thanks Tobias!) that may well stem from the same core issue. I'm going
to fix this theoretical issue, factor the APIs a bit better, and then
verify that I don't see anything bad with Tobias's reduction from the
test suite before recommitting.
Original commit message for r212324:
[x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for
any constant, constant FP, or undef splat and to tolerate any undef
lanes in a splat, then replace all uses of isSplatVector in X86's
lowering with it.
This fixes issues where undef lanes in an otherwise splat vector would
prevent the splat logic from firing. It is a touch more awkward to use
this interface, but it is much more accurate. Suggestions for better
interface structuring welcome.
With this fix, the code generated with the widening legalization
strategy for widen_cast-4.ll is *dramatically* improved as the special
lowering strategies for a v16i8 SRA kick in even though the high lanes
are undef.
We also get a slightly different choice for broadcasting an aligned
memory location, and use vpshufd instead of vbroadcastss. This looks
like a minor win for pipelining and domain crossing, but a minor loss
for the number of micro-ops. I suspect it's a wash, but folks can
easily tweak the lowering if they want.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212475
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 7 Jul 2014 18:34:45 +0000 (18:34 +0000)]
R600: Fix mishandling of load / store chains.
Fixes various bugs with reordering loads and stores.
Scalarized vector loads weren't collecting the chains
at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212473
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 7 Jul 2014 18:34:42 +0000 (18:34 +0000)]
Fix typo, weird indentation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212472
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Mon, 7 Jul 2014 15:26:53 +0000 (15:26 +0000)]
[testing]: lld generally lives in tools/, so fix llvm-lit.
Otherwise we can't run individual tests directly ("llvm-lit /path/to/test")
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212461
91177308-0d34-0410-b5e6-
96231b3b80d8