Igor Laevsky [Tue, 29 Sep 2015 14:57:52 +0000 (14:57 +0000)]
[ValueTracking] Lower dom-conditions-dom-blocks and dom-conditions-max-uses thresholds
On some of our benchmarks this change shows about 50% compile time improvement without any noticeable performance difference.
Differential Revision: http://reviews.llvm.org/D13248
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248801
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Tue, 29 Sep 2015 14:57:10 +0000 (14:57 +0000)]
[AArch64] Remove some redundant cases. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248800
91177308-0d34-0410-b5e6-
96231b3b80d8
John Brawn [Tue, 29 Sep 2015 14:33:58 +0000 (14:33 +0000)]
[CMake] Move the setting of LLVM_COMPILER_IS_GCC_COMPATIBLE to a separate file
Currently LLVM_COMPILER_IS_GCC_COMPATIBLE is set as a side-effect of determining
the stdlib to use in HandleLLVMStdlib, which causes problems when attempting to
use AddLLVM from an installed LLVM toolchain, as HandleLLVMStdlib is not used.
Move the setting of this variable into DetermineGCCCompatible and include that
from both AddLLVM and HandleLLVMStdlib.
Differential Revision: http://reviews.llvm.org/D13216
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248798
91177308-0d34-0410-b5e6-
96231b3b80d8
James Molloy [Tue, 29 Sep 2015 14:08:45 +0000 (14:08 +0000)]
[ValueTracking] Teach isKnownNonZero about monotonically increasing PHIs
If a PHI starts at a non-negative constant, monotonically increases
(only adds of a constant are supported at the moment) and that add
does not wrap, then the PHI is known never to be zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248796
91177308-0d34-0410-b5e6-
96231b3b80d8
Jeroen Ketema [Tue, 29 Sep 2015 10:12:57 +0000 (10:12 +0000)]
Arguments spilled on the stack before a function call may have
alignment requirements, for example in the case of vectors.
These requirements are exploited by the code generator by using
move instructions that have similar alignment requirements, e.g.,
movaps on x86.
Although the code generator properly aligns the arguments with
respect to the displacement of the stack pointer it computes,
the displacement itself may cause misalignment. For example if
we have
%3 = load <16 x float>, <16 x float>* %1, align 64
call void @bar(<16 x float> %3, i32 0)
the x86 back-end emits:
movaps 32(%ecx), %xmm2
movaps (%ecx), %xmm0
movaps 16(%ecx), %xmm1
movaps 48(%ecx), %xmm3
subl $20, %esp <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards
movaps %xmm3, (%esp) <-- movaps requires 16-byte alignment, while %esp is not aligned as such.
movl $0, 16(%esp)
calll __bar
To solve this, we need to make sure that the computed value with which
the stack pointer is changed is a multiple of the maximal alignment seen
during its computation. With this change we get proper alignment:
subl $32, %esp
movaps %xmm3, (%esp)
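The rounding step described above can be sketched in a few lines of standalone C++. This is only an illustration, not the code from the patch: roundUpDisplacement is a hypothetical helper name, and it assumes the maximal alignment is a non-zero power of two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): round the stack displacement up
// to the next multiple of the largest alignment seen while laying out the
// outgoing arguments. Assumes MaxAlign is a non-zero power of two.
inline uint64_t roundUpDisplacement(uint64_t Displacement, uint64_t MaxAlign) {
  assert(MaxAlign != 0 && (MaxAlign & (MaxAlign - 1)) == 0 &&
         "alignment must be a power of two");
  // Adding MaxAlign - 1 and clearing the low bits rounds up to a multiple.
  return (Displacement + MaxAlign - 1) & ~(MaxAlign - 1);
}
```

With the 20-byte displacement and 16-byte movaps alignment from the example, roundUpDisplacement(20, 16) gives 32, which matches the "subl $32, %esp" shown above.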
Differential Revision: http://reviews.llvm.org/D12337
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248786
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Tue, 29 Sep 2015 08:19:11 +0000 (08:19 +0000)]
[InstCombine] Improve Vector Demanded Bits Through Bitcasts
Currently SimplifyDemandedVectorElts can only peek through bitcasts if the vectors have the same number of elements.
This patch fixes and enables some existing (disabled) code to support bitcasting to vectors with more/fewer elements. It currently only accepts cases where the vectors alias cleanly (i.e. the element count of one vector is an exact multiple of the other's).
This was added to improve the demanded vector elements support for SSE vector shifts which require the __m128i (<2 x i64>) argument type to be bitcast to the vector type for the builtin shift. I've added extra tests for various additional bitcasts.
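To make the "alias cleanly" condition concrete, here is a rough standalone sketch of how a demanded-elements mask could be mapped back through a bitcast. It is not the InstCombine code: demandedSourceElts is a hypothetical name, a plain 64-bit mask stands in for the real bit-vector type, and element counts are assumed to be between 1 and 64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: given which elements of the bitcast result are
// demanded (bit I set means element I is demanded), compute which elements
// of the bitcast source are demanded. One element count must be an exact
// multiple of the other.
inline uint64_t demandedSourceElts(uint64_t DemandedResult, unsigned ResultElts,
                                   unsigned SourceElts) {
  assert(ResultElts >= 1 && ResultElts <= 64 && "unsupported element count");
  assert(SourceElts >= 1 && SourceElts <= 64 && "unsupported element count");
  uint64_t DemandedSource = 0;
  if (ResultElts >= SourceElts) {
    // Several result elements overlap one source element: the source element
    // is demanded if any of the result elements covering it is demanded.
    assert(ResultElts % SourceElts == 0 && "vectors must alias cleanly");
    unsigned Ratio = ResultElts / SourceElts;
    for (unsigned I = 0; I != ResultElts; ++I)
      if (DemandedResult & (uint64_t{1} << I))
        DemandedSource |= uint64_t{1} << (I / Ratio);
  } else {
    // One result element covers several source elements: all of them are
    // demanded if that result element is demanded.
    assert(SourceElts % ResultElts == 0 && "vectors must alias cleanly");
    unsigned Ratio = SourceElts / ResultElts;
    for (unsigned I = 0; I != SourceElts; ++I)
      if (DemandedResult & (uint64_t{1} << (I / Ratio)))
        DemandedSource |= uint64_t{1} << I;
  }
  return DemandedSource;
}
```

For example, when a <2 x i64> value is bitcast to <4 x i32> and only result element 2 is demanded, only source element 1 is demanded.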
Differential Revision: http://reviews.llvm.org/D12935
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248784
91177308-0d34-0410-b5e6-
96231b3b80d8
Dan Gohman [Tue, 29 Sep 2015 08:13:58 +0000 (08:13 +0000)]
[WebAssembly] Rename test files to match platform naming conventions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248783
91177308-0d34-0410-b5e6-
96231b3b80d8
Chen Li [Tue, 29 Sep 2015 05:03:32 +0000 (05:03 +0000)]
[LoopUnswitch] Add block frequency analysis to recognize hot/cold regions
Summary: This patch adds block frequency analysis to LoopUnswitch pass to recognize hot/cold regions. For cold regions the pass only performs trivial unswitches since they do not increase code size, and for hot regions everything works as before. This helps to minimize code growth in cold regions and be more aggressive in hot regions. Currently the default cold regions are blocks with frequencies below 20% of function entry frequency, and it can be adjusted via -loop-unswitch-cold-block-frequency flag. The entire feature is controlled via -loop-unswitch-with-block-frequency flag and it is off by default.
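The cold-region rule in the summary can be illustrated with a small standalone predicate. This is a simplified sketch, not the pass itself: isColdBlock and allowNonTrivialUnswitch are hypothetical names, frequencies are modelled as plain integers, and the 20% default mirrors the value quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: a block is "cold" when its frequency is below
// ColdPercent percent of the function entry frequency. Frequencies are
// assumed small enough that multiplying by 100 cannot overflow.
inline bool isColdBlock(uint64_t BlockFreq, uint64_t EntryFreq,
                        unsigned ColdPercent = 20) {
  assert(ColdPercent <= 100 && "percentage out of range");
  assert(BlockFreq <= UINT64_MAX / 100 && EntryFreq <= UINT64_MAX / 100 &&
         "frequency too large for this sketch");
  // BlockFreq / EntryFreq < ColdPercent / 100, written without division.
  return BlockFreq * 100 < EntryFreq * ColdPercent;
}

// In cold regions only trivial unswitches are performed; hot regions are
// left to the usual heuristics (not modelled here).
inline bool allowNonTrivialUnswitch(uint64_t BlockFreq, uint64_t EntryFreq) {
  return !isColdBlock(BlockFreq, EntryFreq);
}
```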
Reviewers: broune, silvas, dnovillo, reames
Subscribers: davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D11605
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248777
91177308-0d34-0410-b5e6-
96231b3b80d8
NAKAMURA Takumi [Tue, 29 Sep 2015 01:25:01 +0000 (01:25 +0000)]
[CMake] X86AsmParser: Prune redundant LINK_LIBS.
It is described in LLVMBuild.txt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248771
91177308-0d34-0410-b5e6-
96231b3b80d8
Evgeniy Stepanov [Tue, 29 Sep 2015 00:30:19 +0000 (00:30 +0000)]
Move dbg.declare intrinsics when merging and replacing allocas.
Place new and update dbg.declare calls immediately after the
corresponding alloca.
Current code in replaceDbgDeclareForAlloca puts the new dbg.declare
at the end of the basic block. LLVM codegen has problems emitting
debug info in a situation when dbg.declare appears after all uses of
the variable. This usually kinda works for inlining and ASan (two
users of this function) but not for SafeStack (see the pending change
in http://reviews.llvm.org/D13178).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248769
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Tue, 29 Sep 2015 00:20:32 +0000 (00:20 +0000)]
RegisterPressure: LiveRegSet tracks register units not physregs
There are always more physical registers than register units so the
previous behaviour was correct but we can do with less memory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248767
91177308-0d34-0410-b5e6-
96231b3b80d8
Reid Kleckner [Mon, 28 Sep 2015 23:56:30 +0000 (23:56 +0000)]
[WinEH] Fix ip2state table emission with funclets
Previously we were hijacking the old LandingPadInfo data structures to
communicate our state numbers. Now we don't need that anymore.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248763
91177308-0d34-0410-b5e6-
96231b3b80d8
Richard Trieu [Mon, 28 Sep 2015 22:54:43 +0000 (22:54 +0000)]
Fix unused variable warning in non-debug builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248754
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 28 Sep 2015 22:14:51 +0000 (22:14 +0000)]
tidy up comments; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248750
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 28 Sep 2015 22:00:24 +0000 (22:00 +0000)]
add a FIXME for a CPU model check that should have an attribute instead
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248746
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Mon, 28 Sep 2015 21:44:46 +0000 (21:44 +0000)]
move one-use check under the comment that describes it; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248745
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Mon, 28 Sep 2015 21:14:32 +0000 (21:14 +0000)]
[SCEV] Don't crash on pointer comparisons
`ScalarEvolution::isImpliedCondOperandsViaNoOverflow` tries to cast the
operand type of the comparison it is given to an `IntegerType`. This is
incorrect because it could actually be simplifying a comparison between
two pointers. Switch it to using `getTypeSizeInBits` instead, which
does the right thing for both pointers and integers.
Fixed PR24956.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248743
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 28 Sep 2015 20:54:57 +0000 (20:54 +0000)]
AMDGPU: Factor switch into separate function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248742
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 28 Sep 2015 20:54:52 +0000 (20:54 +0000)]
AMDGPU: Fix splitting x16 SMRD loads
When used recursively, this would set the kill flag
on the intermediate step from first splitting
x16 to x8.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248741
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 28 Sep 2015 20:54:46 +0000 (20:54 +0000)]
AMDGPU: Fix moving SMRD loads with literal offsets on CI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248740
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 28 Sep 2015 20:54:42 +0000 (20:54 +0000)]
AMDGPU: Fix splitting SMRD with large offset
The splitting of > 4 dword SMRD instructions
if using an offset in an SGPR instead of an immediate
was not setting the destination register,
resulting in an instruction missing an operand
which would assert later.
Test will be included in a following commit
which fixes a related issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248739
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 28 Sep 2015 20:54:38 +0000 (20:54 +0000)]
AMDGPU: Add testcases
Make sure we are testing moving users
of the moved and split SMRD loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248738
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Mon, 28 Sep 2015 20:54:32 +0000 (20:54 +0000)]
AMDGPU: Cleanup test
Run instnamer on it, and rename check prefix.
This is in preparation for adding new testcases to cover
bugs on other subtargets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248737
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrew Kaylor [Mon, 28 Sep 2015 20:33:22 +0000 (20:33 +0000)]
Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing.
Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com)
Differential Revision: http://reviews.llvm.org/D11370
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248735
91177308-0d34-0410-b5e6-
96231b3b80d8
Sean Silva [Mon, 28 Sep 2015 19:02:11 +0000 (19:02 +0000)]
[GlobalOpt] Sort members of llvm.used deterministically
Patch by Jake VanAdrighem!
Summary:
Fix the way we sort the llvm.used and llvm.compiler.used members.
This bug seems to have been introduced in rL183756 through a set of improper casts to GlobalValue*. In subsequent patches this problem was missed and transformed into a getName call on a ConstantExpr.
Reviewers: silvas
Subscribers: silvas, llvm-commits
Differential Revision: http://reviews.llvm.org/D12851
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248728
91177308-0d34-0410-b5e6-
96231b3b80d8
Fiona Glaser [Mon, 28 Sep 2015 18:56:07 +0000 (18:56 +0000)]
Improve performance of SimplifyInstructionsInBlock
1. Use a worklist, not a recursive approach, to avoid needless
revisitation and being repeatedly forced to jump back to the
start of the BB if a handle is invalidated.
2. Only insert operands to the worklist if they become unused
after a dead instruction is removed, so we don’t have to
visit them again in most cases.
3. Use a SmallSetVector to track the worklist.
4. Instead of pre-initting the SmallSetVector like in
DeadCodeEliminationPass, only put things into the worklist
if they have to be revisited after the first run-through.
This minimizes how much the actual SmallSetVector gets used,
which saves a lot of time.
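The four points above can be sketched on a toy instruction model. This is an illustration under stated assumptions, not the LLVM implementation: ToyInst and removeDeadInstructions are hypothetical names, operands are indices into the same block, and a std::vector paired with a std::unordered_set stands in for SmallSetVector.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Toy instruction: operands are indices of other instructions in the block.
struct ToyInst {
  std::vector<size_t> Operands;
  bool HasSideEffects = false;
  unsigned NumUses = 0; // recomputed by removeDeadInstructions
  bool Removed = false;
};

// Remove trivially dead instructions; returns how many were removed.
inline size_t removeDeadInstructions(std::vector<ToyInst> &Block) {
  for (ToyInst &I : Block) {
    I.NumUses = 0;
    I.Removed = false;
  }
  for (const ToyInst &I : Block)
    for (size_t Op : I.Operands) {
      assert(Op < Block.size() && "operand index out of range");
      ++Block[Op].NumUses;
    }

  // Set-vector stand-in: ordered storage plus a membership set.
  std::vector<size_t> Worklist;
  std::unordered_set<size_t> InWorklist;
  size_t NumRemoved = 0;

  auto TryRemove = [&](size_t Idx) {
    ToyInst &I = Block[Idx];
    if (I.Removed || I.HasSideEffects || I.NumUses != 0)
      return;
    I.Removed = true;
    ++NumRemoved;
    // Only operands that just became unused need another look.
    for (size_t Op : I.Operands)
      if (--Block[Op].NumUses == 0 && InWorklist.insert(Op).second)
        Worklist.push_back(Op);
  };

  // Single pass over the block; the worklist starts out empty.
  for (size_t Idx = 0; Idx != Block.size(); ++Idx)
    TryRemove(Idx);

  // Revisit only what the first pass (or later removals) made dead.
  while (!Worklist.empty()) {
    size_t Idx = Worklist.back();
    Worklist.pop_back();
    InWorklist.erase(Idx);
    TryRemove(Idx);
  }
  return NumRemoved;
}
```

In the common case where nothing becomes dead, the worklist is never touched, which is the saving that point 4 describes.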
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248727
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Mon, 28 Sep 2015 18:24:08 +0000 (18:24 +0000)]
[mips][p5600] Added P5600 processor and initial scheduler.
Summary:
The P5600 is an out-of-order, superscalar implementation of the MIPS32R5
architecture.
The scheduler has a few missing details (see the 'Tricky Instructions'
section) and some quirks of the P5600 are deliberately omitted due to
implementation difficulty and low chance of significant benefit (e.g. the
predicate on P5600WriteEitherALU). However, testing on SingleSource is
showing significant performance benefits on some apps (seven in the 10-30%
range) and only one significant regression (12%) when
-pre-RA-sched=linearize is given. Without -pre-RA-sched=linearize the
results are more variable. Some do even better (up to 55% improvement) but
increased numbers of copies are slowing others down (up to 12%).
Overall, the scheduler as it currently stands is a 2.4% win with
-pre-RA-sched=linearize and a 2.7% win without -pre-RA-sched=linearize.
I'm sure we can improve on this further.
For completeness, the FPGA this was tested on shows some failures with and
without the P5600 scheduler. These appear to be scheduling related since
the two test runs have fairly different sets of failing tests even after
accounting for other factors (e.g. spurious connection failures). However,
it's not P5600 specific since we also get some for the generic scheduler.
Reviewers: vkalintiris
Subscribers: mpf, llvm-commits, atrick, vkalintiris
Differential Revision: http://reviews.llvm.org/D12193
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248725
91177308-0d34-0410-b5e6-
96231b3b80d8
Artur Pilipenko [Mon, 28 Sep 2015 17:41:08 +0000 (17:41 +0000)]
Introduce !align metadata for load instruction
Reviewed By: hfinkel
Differential Revision: http://reviews.llvm.org/D12853
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248721
91177308-0d34-0410-b5e6-
96231b3b80d8
Philip Reames [Mon, 28 Sep 2015 17:14:24 +0000 (17:14 +0000)]
[InstSimplify] Fold simple known implications to true
This was split off of http://reviews.llvm.org/D13040 to make it easier to test the correctness of the implication logic. For the moment, this only handles a single easy case which shows up when eliminating and combining range checks. In the (near) future, I plan to extend this for other cases which show up in range checks, but I wanted to make those changes incrementally once the framework was in place.
At the moment, the implication logic will be used by three places. One in InstSimplify (this review) and two in SimplifyCFG (http://reviews.llvm.org/D13040 & http://reviews.llvm.org/D13070). Can anyone think of other locations this style of reasoning would make sense?
Differential Revision: http://reviews.llvm.org/D13074
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248719
91177308-0d34-0410-b5e6-
96231b3b80d8
Weiming Zhao [Mon, 28 Sep 2015 17:03:23 +0000 (17:03 +0000)]
[LoopReroll] Ignore debug intrinsics
Previously, debug intrinsics and annotation intrinsics could prevent
the loop from being rerolled; now they are ignored.
Differential Revision: http://reviews.llvm.org/D13150
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248718
91177308-0d34-0410-b5e6-
96231b3b80d8
Dan Gohman [Mon, 28 Sep 2015 16:22:39 +0000 (16:22 +0000)]
[WebAssembly] Support for direct call and call_indirect.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248716
91177308-0d34-0410-b5e6-
96231b3b80d8
Zoran Jovanovic [Mon, 28 Sep 2015 11:11:34 +0000 (11:11 +0000)]
[mips] Handling of immediates bigger than 16 bits
Differential Revision: http://reviews.llvm.org/D10539
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248706
91177308-0d34-0410-b5e6-
96231b3b80d8
Artyom Skrobov [Mon, 28 Sep 2015 09:44:11 +0000 (09:44 +0000)]
[ARM] Avoid redundant checks for isThumb1Only() after supportsTailCall()
supportsTailCall() has two callers. Both of them double-check isThumb1Only(),
and refuse to proceed with tail-calling in that case.
Therefore, it makes sense to move this check to
ARMSubtarget::initSubtargetFeatures, where SupportsTailCall is initialized;
and to eliminate the extra checks at the call sites.
Following a review comment, added an "assert(supportsTailCall())"
in IsEligibleForTailCall.
NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248703
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Mon, 28 Sep 2015 08:02:14 +0000 (08:02 +0000)]
[DAGCombine] Fix getStoreMergeAndAliasCandidates's AA-enabled chain walking
When AA is being used, non-aliasing stores are canonicalized to use the same
chain, and DAGCombiner::getStoreMergeAndAliasCandidates can take advantage of
this by looking only at users of a store's chain operand. However, user
iteration is not result-number specific, so we need to check that the use is as a
chain operand, and not via some other operand. It is certainly possible to have
another potentially-aliasing store, which shares the first's base pointer, and
uses the first's chain's node via some other operand.
Failure to catch this situation caused, at least in the included test case, an
assert later because the relative sequence-number ordering caused later
replacement to create a cycle in the DAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248698
91177308-0d34-0410-b5e6-
96231b3b80d8
Craig Topper [Mon, 28 Sep 2015 00:15:34 +0000 (00:15 +0000)]
Remove 'const' from some ArrayRefs. ArrayRefs are already immutable. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248693
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Sun, 27 Sep 2015 22:38:50 +0000 (22:38 +0000)]
AsmWriter: Print the argument names in declarations while debugging
When llvm declarations have argument names, it's helpful to actually
print those names when debugging. Arguably, it'd be nice to print them
all the time, but that would mean the IR we output wouldn't round trip
through bitcode, which doesn't store the names.
Make the various print() methods in AsmWriter optionally print "for
debug" and set that flag in the dump() methods. The only thing this
does differently for now is print the argument names in declarations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248692
91177308-0d34-0410-b5e6-
96231b3b80d8
Yaron Keren [Sun, 27 Sep 2015 21:31:33 +0000 (21:31 +0000)]
Silence clang warning: variable ‘Status’ set but not used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248691
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Sun, 27 Sep 2015 21:09:48 +0000 (21:09 +0000)]
[SCEV] identical instructions don't compute equal values
Before this change `HasSameValue` would return true for distinct
`alloca` instructions if they happened to be allocating the same
type (`alloca` instructions are not specified as reading memory). This
change adds an explicit whitelist of instruction types for which
"identical" instructions compute the same value.
Fixes PR24952.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248690
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Sun, 27 Sep 2015 20:34:31 +0000 (20:34 +0000)]
[InstCombine] fold zexts and constants into a phi (PR24766)
This is one step towards solving PR24766:
https://llvm.org/bugs/show_bug.cgi?id=24766
We were not producing the same IR for these two C functions because the store
to the temp bool causes extra zexts:
#include <stdbool.h>
bool switchy(char x1, char x2, char condition) {
  bool conditionMet = false;
  switch (condition) {
  case 0: conditionMet = (x1 == x2); break;
  case 1: conditionMet = (x1 <= x2); break;
  }
  return conditionMet;
}
bool switchy2(char x1, char x2, char condition) {
  switch (condition) {
  case 0: return (x1 == x2);
  case 1: return (x1 <= x2);
  }
  return false;
}
As noted in the code comments, this test case manages to avoid the more general existing
phi optimizations where there are only 2 phi inputs or where there are no constant phi
args mixed in with the cast ops. It seems like a corner case, but if we don't catch it,
then I don't think we can get SimplifyCFG to further optimize towards the canonical form
for this function shown in the bug report.
Differential Revision: http://reviews.llvm.org/D12866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248689
91177308-0d34-0410-b5e6-
96231b3b80d8
Joseph Tremoulet [Sun, 27 Sep 2015 01:47:46 +0000 (01:47 +0000)]
[EH] Create removeUnwindEdge utility
Summary:
Factor the code that rewrites invokes to calls and rewrites WinEH
terminators to their "unwind to caller" equivalents into a helper in
Utils/Local, and use it in the three places I'm aware of that need to do
this.
Reviewers: andrew.w.kaylor, majnemer, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13152
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248677
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Sat, 26 Sep 2015 17:49:04 +0000 (17:49 +0000)]
[InstCombine] Removed unnecessary meta attributes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248672
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Sat, 26 Sep 2015 17:09:01 +0000 (17:09 +0000)]
[llvm-mc-fuzzer] Fix -jobs option.
The fuzzer argument parser will ignore all options starting with '--' so
operation mode options should begin with '--' and fuzzer options should begin
with '-'. Fuzzer arguments must still follow --fuzzer-args so that they escape
the parsing performed by the CommandLine library.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248671
91177308-0d34-0410-b5e6-
96231b3b80d8
Benjamin Kramer [Sat, 26 Sep 2015 10:09:36 +0000 (10:09 +0000)]
[BranchProbability] Manually round the floating point output.
llvm::format compiles down to snprintf which has no defined rounding for
floating point arguments, and MSVC has implemented it differently from
what the BSD libcs and glibc do. Try to emulate the glibc rounding
behavior to avoid changing tests.
While there, simplify the code a bit and move trivial methods inline.
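One way to take the rounding decision away from snprintf is to round to the printed precision first and only then format. The sketch below illustrates that idea in standalone C++; it is not the code from this commit, formatPercent is a hypothetical name, and std::rint follows the current floating-point rounding mode (round-to-nearest-even by default).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical sketch: format Numerator / Denominator as a percentage with
// two decimals, rounding to hundredths of a percent before formatting so
// the printf-family call has no tie left to break.
inline std::string formatPercent(uint32_t Numerator, uint32_t Denominator) {
  assert(Denominator != 0 && "denominator must be non-zero");
  double Ratio = static_cast<double>(Numerator) / Denominator;
  // Scale to hundredths of a percent, round, then scale back.
  double Percent = std::rint(Ratio * 100.0 * 100.0) / 100.0;
  char Buffer[32];
  std::snprintf(Buffer, sizeof(Buffer), "%.2f%%", Percent);
  return std::string(Buffer);
}
```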
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248665
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 26 Sep 2015 05:06:48 +0000 (05:06 +0000)]
AMDGPU: Remove hasPostISelHook from most instructions
Since this is only needed for VOP3 and a few other special
case instructions, stop setting it on everything.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248657
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 26 Sep 2015 04:59:04 +0000 (04:59 +0000)]
AMDGPU: Switch over reg class size instead of checking all super classes
This gets isSGPRClass out of my profile of SIFixSGPRCopies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248656
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 26 Sep 2015 04:53:30 +0000 (04:53 +0000)]
AMDGPU: Don't handle invalid reg classes in helper functions
No tests hit these and it would be better to have checks like
this explicit where they are used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248655
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Sat, 26 Sep 2015 04:34:52 +0000 (04:34 +0000)]
AMDGPU: address -Winconsistent-missing-override
Add missing override. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248652
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 26 Sep 2015 04:09:34 +0000 (04:09 +0000)]
AMDGPU: Set CopyCost of register classes
These require multiple mov instructions to copy,
but the default value is that 1 instruction is needed.
I'm not sure if this actually changes anything.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248651
91177308-0d34-0410-b5e6-
96231b3b80d8
Chen Li [Sat, 26 Sep 2015 03:26:47 +0000 (03:26 +0000)]
[Bug 24848] Use range metadata to constant fold comparisons between two values
Summary:
This is the second part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848.
If both operands of a comparison have range metadata, they should be used to constant fold the comparison.
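As a rough illustration of the idea, the sketch below folds an unsigned less-than comparison from two known ranges. It is a simplified standalone model, not the patch: URange, Fold and foldULT are hypothetical names, and ranges are assumed to be half-open, non-empty and non-wrapping.

```cpp
#include <cassert>
#include <cstdint>

// Half-open unsigned range [Lo, Hi); assumed non-empty and non-wrapping.
struct URange {
  uint64_t Lo;
  uint64_t Hi;
};

enum class Fold { True, False, Unknown };

// Try to decide "A u< B" knowing only the ranges of A and B.
inline Fold foldULT(URange A, URange B) {
  assert(A.Lo < A.Hi && B.Lo < B.Hi && "ranges must be non-empty");
  if (A.Hi - 1 < B.Lo) // largest A is below smallest B
    return Fold::True;
  if (A.Lo >= B.Hi - 1) // smallest A is at least the largest B
    return Fold::False;
  return Fold::Unknown; // ranges overlap, nothing can be concluded
}
```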
Reviewers: sanjoy, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13177
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248650
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 26 Sep 2015 02:25:48 +0000 (02:25 +0000)]
AMDGPU: VOP3b definition cleanups
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248647
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Sat, 26 Sep 2015 02:25:45 +0000 (02:25 +0000)]
AMDGPU: Fix sched model for VOP2b instructions
Trying to use the version with the explicit output operand
would complain because of the missing WriteSALU. I'm not sure
why it doesn't complain about this with the implicit VCC def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248646
91177308-0d34-0410-b5e6-
96231b3b80d8
Dan Gohman [Sat, 26 Sep 2015 01:09:44 +0000 (01:09 +0000)]
[WebAssembly] Rename several functions and types according to the new spec.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248644
91177308-0d34-0410-b5e6-
96231b3b80d8
Ahmed Bougacha [Sat, 26 Sep 2015 00:14:02 +0000 (00:14 +0000)]
[ARM] Don't generate clrex for pre-v7 targets.
Since r248294, we emit clrex, but it doesn't exist on v6.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248640
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Fri, 25 Sep 2015 23:53:50 +0000 (23:53 +0000)]
[SCEV] Reapply 'Teach isLoopBackedgeGuardedByCond to exploit trip counts'
Summary:
If the trip count of a specific backedge is `N`, then we know that
backedge is effectively guarded by the condition `{0,+,1} u< N`. This
change teaches SCEV to use this condition to prove things in
`isLoopBackedgeGuardedByCond`.
Depends on D12948
Depends on D12949
The original checkin, r248608 had to be backed out due to an issue with
an ObjCXX unit test. That issue is now fixed, so re-landing.
Reviewers: atrick, reames, majnemer, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12950
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248638
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Fri, 25 Sep 2015 23:53:45 +0000 (23:53 +0000)]
[SCEV] Reapply 'Exploit A < B => (A+K) < (B+K) when possible'
Summary:
This change teaches SCEV's `isImpliedCond` two new identities:
A u< B u< -C => (A + C) u< (B + C)
A s< B s< INT_MIN - C => (A + C) s< (B + C)
While these are useful on their own, they're really intended to support
D12950.
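Both identities can be sanity-checked by brute force at a small bit width. The sketch below is only an illustration and has nothing to do with how SCEV proves them: identitiesHoldFor8Bit is a hypothetical helper that tries every 8-bit A, B and C with wrapping arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively check, for 8-bit values with wrapping arithmetic:
//   A u< B u< -C          =>  (A + C) u< (B + C)
//   A s< B s< INT_MIN - C =>  (A + C) s< (B + C)
// The uint8_t -> int8_t casts rely on two's complement conversion
// (guaranteed since C++20, and what mainstream compilers do before that).
inline bool identitiesHoldFor8Bit() {
  for (unsigned a = 0; a != 256; ++a)
    for (unsigned b = 0; b != 256; ++b)
      for (unsigned c = 0; c != 256; ++c) {
        uint8_t A = static_cast<uint8_t>(a);
        uint8_t B = static_cast<uint8_t>(b);
        uint8_t C = static_cast<uint8_t>(c);
        uint8_t NegC = static_cast<uint8_t>(0u - C);
        uint8_t APlusC = static_cast<uint8_t>(A + C);
        uint8_t BPlusC = static_cast<uint8_t>(B + C);

        // Unsigned identity.
        if (A < B && B < NegC && !(APlusC < BPlusC))
          return false;

        // Signed identity; 0x80 is the bit pattern of INT_MIN at 8 bits.
        int8_t SA = static_cast<int8_t>(A);
        int8_t SB = static_cast<int8_t>(B);
        int8_t Limit = static_cast<int8_t>(static_cast<uint8_t>(0x80u - C));
        if (SA < SB && SB < Limit &&
            !(static_cast<int8_t>(APlusC) < static_cast<int8_t>(BPlusC)))
          return false;
      }
  return true;
}
```

The bound on B is what rules out the case where only one of the two additions wraps.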
The original checkin, r248606 had to be backed out due to an issue with
an ObjCXX unit test. That issue is now fixed, so re-landing.
Reviewers: atrick, reames, majnemer, nlewycky, hfinkel
Subscribers: aadg, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12948
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248637
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Fri, 25 Sep 2015 23:50:53 +0000 (23:50 +0000)]
LivePhysRegs: Fix live-outs of return blocks
I realized that the live-out set computed for the return block is
missing the callee saved registers (the non-pristine ones to be exact).
This only affects the liveness computed for instructions inside the
function epilogue which currently none of the LivePhysRegs users in llvm
cares about, so this is just a drive-by fix without a testcase.
Differential Revision: http://reviews.llvm.org/D13180
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248636
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Fri, 25 Sep 2015 23:21:38 +0000 (23:21 +0000)]
[InstCombine] match De Morgan's Law hidden by zext ops (PR22723)
This is a fix for PR22723:
https://llvm.org/bugs/show_bug.cgi?id=22723
My first attempt at this was to change what I thought was the root problem:
xor (zext i1 X to i32), 1 --> zext (xor i1 X, true) to i32
...but we create the opposite pattern in InstCombiner::visitZExt(), so infinite loop!
My next idea was to fix the matchIfNot() implementation in PatternMatch, but that would
mean potentially returning a different size for the match than what was input. I think
this would require all users of m_Not to check the size of the returned match, so I
abandoned that idea.
I settled on just fixing the exact case presented in the PR. This patch does allow the
2 functions in PR22723 to compile identically (x86):
bool test(bool x, bool y) { return !x | !y; }
bool test(bool x, bool y) { return !x || !y; }
...
andb %sil, %dil
xorb $1, %dil
movb %dil, %al
retq
Differential Revision: http://reviews.llvm.org/D12705
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248634
91177308-0d34-0410-b5e6-
96231b3b80d8
Cong Hou [Fri, 25 Sep 2015 23:09:59 +0000 (23:09 +0000)]
Use fixed-point representation for BranchProbability.
BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator 1<<31. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change:
Pros:
1. It uses only half the space of the current one.
2. Some operations are much faster, such as addition, subtraction, comparison, and scaling by an integer.
Cons:
1. Constructing a probability using arbitrary numerator and denominator needs additional calculations.
2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/3 may not be exactly identical to 2/3 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio.
One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock. We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability takes more space, which may be a concern.
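The fixed-point representation can be sketched as a small standalone class. This is a toy model under stated assumptions, not the BranchProbability class from the patch: FixedProb, fromFraction, complement and kDenominator are hypothetical names, and construction rounds to the nearest representable value.

```cpp
#include <cassert>
#include <cstdint>

// Fixed denominator 2^31; a probability is stored as its numerator only.
constexpr uint32_t kDenominator = 1u << 31;

class FixedProb {
  uint32_t N; // value is N / 2^31, with N in [0, 2^31]
  explicit FixedProb(uint32_t Raw) : N(Raw) {}

public:
  // Building from an arbitrary fraction needs a 64-bit multiply and divide
  // (the extra construction cost listed under "Cons").
  static FixedProb fromFraction(uint32_t Num, uint32_t Den) {
    assert(Den != 0 && Num <= Den && "not a probability");
    uint64_t Scaled = (static_cast<uint64_t>(Num) * kDenominator + Den / 2) / Den;
    return FixedProb(static_cast<uint32_t>(Scaled));
  }

  uint32_t numerator() const { return N; }

  // 1 - p is a single integer subtraction.
  FixedProb complement() const { return FixedProb(kDenominator - N); }

  // Comparison is a single integer comparison.
  friend bool operator<(FixedProb L, FixedProb R) { return L.N < R.N; }
  friend bool operator==(FixedProb L, FixedProb R) { return L.N == R.N; }
};
```

In this sketch 1/2 is stored exactly as 2^30, while 1/3 is stored as 715827883 / 2^31, which is slightly more than one third; that is the kind of precision loss listed under "Cons".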
Differential revision: http://reviews.llvm.org/D12603
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248633
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Fri, 25 Sep 2015 22:27:02 +0000 (22:27 +0000)]
SelectionDAGDumper: Print simple operands inline.
Print simple operands inline instead of their pointer/value number.
Simple operands are SDNodes without predecessors like Constant(FP), Register,
UNDEF. This unifies the behaviour with dumpr() which was already doing this.
Previously:
t0: ch = EntryToken
t1: i64 = Register %vreg0
t2: i64,ch = CopyFromReg t0, t1
t3: i64 = Constant<1>
t4: i64 = add t2, t3
t5: i64 = Constant<2>
t6: i64 = add t2, t5
t10: i64 = undef
t11: i8,ch = load t0, t2, t10<LD1[%tmp81]>
t12: i8,ch = load t0, t4, t10<LD1[%tmp10]>
t13: i8,ch = load t0, t6, t10<LD1[%tmp12]>
Now:
t0: ch = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
t4: i64 = add t2, Constant:i64<1>
t6: i64 = add t2, Constant:i64<2>
t11: i8,ch = load<LD1[%tmp81]> t0, t2, undef:i64
t12: i8,ch = load<LD1[%tmp10]> t0, t4, undef:i64
t13: i8,ch = load<LD1[%tmp12]> t0, t6, undef:i64
Differential Revision: http://reviews.llvm.org/D12567
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248628
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 22:21:19 +0000 (22:21 +0000)]
AMDGPU: Construct new buffer instruction when moving SMRD
It's easier to understand creating a full instruction
than the current situation where sometimes a new
instruction is created and sometimes it is awkwardly
mutated in place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248627
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 22:06:19 +0000 (22:06 +0000)]
DAGCombiner: Check if store is volatile first
This is the simpler check. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248625
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Fri, 25 Sep 2015 21:51:24 +0000 (21:51 +0000)]
TargetRegisterInfo: Introduce PrintLaneMask.
This makes it more convenient to print lane masks and leads to more
uniform printing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248624
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Fri, 25 Sep 2015 21:51:14 +0000 (21:51 +0000)]
TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where appropriate; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248623
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Fri, 25 Sep 2015 21:49:48 +0000 (21:49 +0000)]
merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711)
This is a redo of D7208 ( r227242 - http://llvm.org/viewvc/llvm-project?view=revision&revision=227242 ).
The patch was reverted because an AArch64 target could infinite loop after the change in DAGCombiner
to merge vector stores. That happened because AArch64's allowsMisalignedMemoryAccesses() wasn't telling
the truth. It reported all unaligned memory accesses as fast, but then split some 128-bit unaligned
accesses up in performSTORECombine() because they are slow.
This patch attempts to fix the problem in AArch64's allowsMisalignedMemoryAccesses() while preserving
existing (perhaps questionable) lowering behavior.
The x86 test shows that store merging is working as intended for a target with fast 32-byte unaligned
stores.
Differential Revision: http://reviews.llvm.org/D12635
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248622
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Fri, 25 Sep 2015 21:41:40 +0000 (21:41 +0000)]
PrologueEpilogInserter: Fix missing live-ins when savepoint equals restorepoint
The algorithm would not modify the live-in list of blocks below the save
block point which is correct unless it happens to be a restore point at
the same time.
Also fixes the benign issue of live-in registers being added twice in
some cases.
The testcase is based on a test submitted by Kit Barton.
Differential Revision: http://reviews.llvm.org/D13176
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248620
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Fri, 25 Sep 2015 21:41:28 +0000 (21:41 +0000)]
AMDGPU/SI: Use .hsatext section instead of .text for HSA
Reviewers: arsenm, grosbach, rafael
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D12424
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248619
91177308-0d34-0410-b5e6-
96231b3b80d8
Tom Stellard [Fri, 25 Sep 2015 21:41:14 +0000 (21:41 +0000)]
MCAsmInfo: Allow targets to specify when the .section directive should be omitted
Summary:
The default behavior is to omit the .section directive for .text, .data,
and sometimes .bss, but some targets may want to omit this directive for
other sections too.
The AMDGPU backend will use this to emit a simplified syntax for section
switches. For example if the section directive is not omitted (current
behavior), section switches to .hsatext will be printed like this:
.section .hsatext,#alloc,#execinstr,#write
This is actually wrong, because .hsatext has some custom STT_* flags,
which MC doesn't know how to print or parse.
If the section directive is omitted (made possible by this commit),
section switches will be printed like this:
.hsatext
The motivation for this patch is to make it possible to emit sections
with custom STT_* flags without having to teach MC about all the target
specific STT_* flags.
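The two printing modes described above can be sketched as a small decision function. This is a toy model in Python, not the MC API: `section_switch`, `DEFAULT_OMIT`, and the `extra_omit` parameter are hypothetical, and the "sometimes .bss" case is left out for simplicity.

```python
# Sections whose .section directive is omitted by default in this toy model.
DEFAULT_OMIT = frozenset({".text", ".data"})


def section_switch(name, flags, extra_omit=frozenset()):
    """Return the text printed when switching to section `name`.

    `extra_omit` stands in for a target-specific choice to omit the
    directive for additional sections.
    """
    if name in DEFAULT_OMIT or name in extra_omit:
        return name
    return f".section {name},{flags}"


flags = "#alloc,#execinstr,#write"
print(section_switch(".hsatext", flags))
print(section_switch(".hsatext", flags, extra_omit={".hsatext"}))
```

With the directive omitted, the printer never has to render the section flags at all, which is the point of the change.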
Reviewers: rafael, grosbach
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12423
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248618
91177308-0d34-0410-b5e6-
96231b3b80d8
Matthias Braun [Fri, 25 Sep 2015 21:25:19 +0000 (21:25 +0000)]
MachineBasicBlock: Factor out common code into isReturnBlock()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248617
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Fri, 25 Sep 2015 21:16:50 +0000 (21:16 +0000)]
Revert two SCEV changes that caused test failures in clang.
r248606: "[SCEV] Exploit A < B => (A+K) < (B+K) when possible"
r248608: "[SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248614
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Fri, 25 Sep 2015 21:03:46 +0000 (21:03 +0000)]
ADCE: Fix typo in file comment. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248613
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 20:22:12 +0000 (20:22 +0000)]
PeepholeOptimizer: Remove redundant copies
If a virtual register is copied and another copy was already
seen, replace with the previous copy. This only handles the
simplest cases for now.
This pattern shows up from various operand restrictions
AMDGPU has which require inserting copies depending
on the register class of the operands.
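A minimal sketch of the idea, assuming each virtual register is defined once and ignoring register classes, subregisters, and physical registers, all of which the real pass has to consider. The tuple encoding `(opcode, dest, sources)` and the function name are hypothetical.

```python
def remove_redundant_copies(instrs):
    """Drop a COPY whose source was already copied, reusing the first copy.

    `instrs` is a list of (opcode, dest, sources) tuples over virtual
    registers that are each defined once.
    """
    first_copy = {}  # source register -> dest of the first COPY seen
    rename = {}      # dest of a removed COPY -> register that replaces it
    out = []
    for opcode, dest, sources in instrs:
        sources = tuple(rename.get(s, s) for s in sources)
        if opcode == "COPY":
            src = sources[0]
            if src in first_copy:
                rename[dest] = first_copy[src]  # later uses see the first copy
                continue
            first_copy[src] = dest
        out.append((opcode, dest, sources))
    return out


program = [
    ("COPY", "%1", ("%0",)),
    ("COPY", "%2", ("%0",)),
    ("ADD", "%3", ("%1", "%2")),
]
print(remove_redundant_copies(program))
```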
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248611
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Fri, 25 Sep 2015 20:20:22 +0000 (20:20 +0000)]
Simplify code. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248610
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Fri, 25 Sep 2015 20:12:43 +0000 (20:12 +0000)]
more space; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248609
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Fri, 25 Sep 2015 19:59:57 +0000 (19:59 +0000)]
[SCEV] Teach isLoopBackedgeGuardedByCond to exploit trip counts.
Summary:
If the trip count of a specific backedge is `N`, then we know that
backedge is effectively guarded by the condition `{0,+,1} u< N`. This
change teaches SCEV to use this condition to prove things in
`isLoopBackedgeGuardedByCond`.
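As an illustration of the guarding condition, the following Python sketch simulates a loop whose backedge is taken exactly `n` times and records the value of the `{0,+,1}` recurrence each time the backedge is taken. The function name and loop shape are assumptions made for the example; it is not SCEV code.

```python
def iv_values_at_backedge(n):
    """Simulate a loop whose backedge is taken exactly n times.

    Returns the value of the {0,+,1} recurrence at each point where the
    backedge is taken.
    """
    seen = []
    i = 0
    while True:
        if i == n:
            break          # the loop exits instead of taking the backedge
        seen.append(i)     # backedge taken while the recurrence equals i
        i += 1
    return seen


n = 5
values = iv_values_at_backedge(n)
# Every time the backedge is taken, {0,+,1} u< n holds.
print(all(v < n for v in values))
```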
Depends on D12948
Depends on D12949
Reviewers: atrick, reames, majnemer, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12950
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248608
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Fri, 25 Sep 2015 19:59:52 +0000 (19:59 +0000)]
[SCEV] Extract helper function from isImpliedCond; NFC
Summary:
This new helper routine will be used in a subsequent change.
Reviewers: hfinkel
Subscribers: hfinkel, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12949
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248607
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Fri, 25 Sep 2015 19:59:49 +0000 (19:59 +0000)]
[SCEV] Exploit A < B => (A+K) < (B+K) when possible
Summary:
This change teaches SCEV's `isImpliedCond` two new identities:
A u< B u< -C => (A + C) u< (B + C)
A s< B s< INT_MIN - C => (A + C) s< (B + C)
While these are useful on their own, they're really intended to support
D12950.
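The two identities can be checked by brute force on a small bit width. The Python sketch below assumes that `-C`, `INT_MIN - C`, and the additions are all computed in wrapping arithmetic at the given width; the helper names are hypothetical and the code is unrelated to the actual `isImpliedCond` implementation.

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >= (1 << (bits - 1)) else x


def counterexamples(bits):
    """Return every (kind, A, B, C) bit-pattern triple violating an identity."""
    mask = (1 << bits) - 1
    int_min = 1 << (bits - 1)  # bit pattern of INT_MIN at this width
    bad = []
    for a in range(mask + 1):
        for b in range(mask + 1):
            for c in range(mask + 1):
                # A u< B u< -C  =>  (A + C) u< (B + C)
                if a < b < ((-c) & mask):
                    if not (((a + c) & mask) < ((b + c) & mask)):
                        bad.append(("unsigned", a, b, c))
                # A s< B s< INT_MIN - C  =>  (A + C) s< (B + C)
                bound = to_signed(int_min - c, bits)
                if to_signed(a, bits) < to_signed(b, bits) < bound:
                    if not (to_signed(a + c, bits) < to_signed(b + c, bits)):
                        bad.append(("signed", a, b, c))
    return bad


print(len(counterexamples(4)))
```

The search is cubic in the number of bit patterns, so it is only practical for narrow widths such as 4 or 5 bits.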
Reviewers: atrick, reames, majnemer, nlewycky, hfinkel
Subscribers: aadg, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D12948
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248606
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 18:21:47 +0000 (18:21 +0000)]
AMDGPU: Add some more tests for literal operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248600
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 18:09:15 +0000 (18:09 +0000)]
AMDGPU: Make getNamedOperandIdx declaration readonly
This matches how it is defined in the generated implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248598
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Fri, 25 Sep 2015 17:48:17 +0000 (17:48 +0000)]
[AArch64] Add support for generating pre- and post-index load/store pairs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248593
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 17:41:20 +0000 (17:41 +0000)]
AMDGPU: Disable some passes that are not meaningful
Don't run passes related to stack maps, garbage collection,
exceptions since these aren't useful for GPUs.
There might be a few more to turn off that I'm less sure about
(e.g. ShrinkWrapping) or I'm not sure how to disable
(SafeStack and StackProtector)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248591
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 17:27:08 +0000 (17:27 +0000)]
AMDGPU: Handle i64->v2i32 loads/stores in PreprocessISelDAG
This fixes a select error when the i64 source was also
bitcasted to v2i32 in the original source.
Instead of awkwardly trying to select the modified source value and
the store, replace before isel begins.
Uses a worklist to avoid possible problems from mutating the DAG,
although it seems to work OK without it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248589
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 17:21:28 +0000 (17:21 +0000)]
AMDGPU: Fix recomputing dominator tree unnecessarily
SIFixSGPRCopies does not modify the CFG, but this was
being recomputed before running SIFoldOperands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248587
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 17:08:42 +0000 (17:08 +0000)]
AMDGPU: Re-justify workaround and fix worked around problem
When buffer resource descriptors were built, the upper two components
of the descriptor were first composed into a 64-bit register because
legalizeOperands assumed all operands had the same register class.
Fix that problem, but keep the workaround. I'm not sure anything
is actually emitting such a REG_SEQUENCE now.
If multiple resource descriptors are set up with different base
pointers, this is copied with a single s_mov_b64. We probably
should fix this better by recognizing a pair of s_mov_b32 later,
but for now delete the dead code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248585
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 17:08:40 +0000 (17:08 +0000)]
AMDGPU: Don't create REG_SEQUENCE with SGPR dest and VGPR sources
This avoids needing to re-legalize the new REG_SEQUENCE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248584
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 16:58:27 +0000 (16:58 +0000)]
AMDGPU: Fix not adding exec to defs of cmpx instruction pseudos
This was only set on the final _si/_vi version, but not
on the pseudos most of codegen sees.
No test since these instructions aren't used yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248583
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 16:58:25 +0000 (16:58 +0000)]
AMDGPU: Improve accuracy of instruction rates for VOPC
These were all using the default 32-bit VALU write class,
but the i64/f64 compares are half rate.
I'm not sure this is really correct, because they are still using
a VALU write class, even though they really write
to the SALU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248582
91177308-0d34-0410-b5e6-
96231b3b80d8
James Molloy [Fri, 25 Sep 2015 15:39:29 +0000 (15:39 +0000)]
[GlobalsAA] Teach GlobalsAA about nocapture
Arguments to function calls marked "nocapture" can be marked as
non-escaping. However, nocapture is defined in terms of the lifetime
of the callee, and if the callee can directly or indirectly recurse to
the caller, the semantics of nocapture are invalid.
Therefore, we eagerly discover which SCC each function belongs to,
and later can check if callee and caller of a callsite belong to
the same SCC, in which case there could be recursion.
This means that we can't be so optimistic in
getModRefInfo(ImmutableCallsite) - previously we assumed all call
arguments never aliased with an escaping global. Now we need to check,
because a global could now be passed as an argument but still not
escape.
This also solves a related conformance problem: MemCpyOptimizer can
turn non-escaping stores of globals into calls to intrinsics like
llvm.memcpy/llvm.memset. This confuses GlobalsAA, which knows the
global can't escape and so returns NoModRef when queried, when
obviously a memcpy/memset call does indeed reference and modify its
arguments.
This fixes PR24800, PR24801, and PR24802.
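The SCC check described above can be sketched on a toy call graph. This Python example uses Tarjan's algorithm to assign each function an SCC representative and then tests whether caller and callee share one. The dictionary-based call graph and the function names are hypothetical; GlobalsAA itself works on LLVM's own call graph structures.

```python
def scc_representatives(call_graph):
    """Map each function to a representative of its SCC (Tarjan's algorithm).

    `call_graph` maps a function name to the names of the functions it calls.
    """
    index, low, rep = {}, {}, {}
    stack, on_stack = [], set()
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in call_graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC
            while True:
                w = stack.pop()
                on_stack.discard(w)
                rep[w] = v
                if w == v:
                    break

    for v in call_graph:
        if v not in index:
            visit(v)
    return rep


def may_recurse(rep, caller, callee):
    """True if callee could reach caller again, i.e. both are in one SCC."""
    return rep[caller] == rep[callee]


graph = {"main": ["a", "leaf"], "a": ["b"], "b": ["a"], "leaf": []}
rep = scc_representatives(graph)
print(may_recurse(rep, "a", "b"), may_recurse(rep, "main", "leaf"))
```

In this model, a nocapture argument would only be treated as non-escaping at call sites where `may_recurse` is false.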
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248576
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Fri, 25 Sep 2015 05:41:02 +0000 (05:41 +0000)]
ARM: make -Asserts,-Werror=unused-variable build happy
The value was only used in an assertion. Sink the variable usage into the
assertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248562
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Fri, 25 Sep 2015 05:15:46 +0000 (05:15 +0000)]
ARM: address WoA division limitation
We now emit the compiler generated divide by zero check that was needed for the
MSVC routines. We construct a pseudo-instruction for the DBZ check as the
operation requires splitting up the BB. For the 64-bit operations, we need to
custom expand the node as we need to insert the DBZ check and then emit the
libcall to the appropriate name. Because this is target specific, it seemed
better to reproduce the expansion operation from the target-agnostic type
legalization rather than sink this there to avoid the duplication. The division
library calls now match MSVC semantically.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248561
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 25 Sep 2015 00:28:43 +0000 (00:28 +0000)]
AMDGPU: Remove unused includes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248553
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Fri, 25 Sep 2015 00:05:40 +0000 (00:05 +0000)]
[LangRef] Unbreak the docs Sphinx build.
r248551 introduced some breakage due to incorrectly terminated
``literals`` s.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248552
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjoy Das [Thu, 24 Sep 2015 23:34:52 +0000 (23:34 +0000)]
[Bitcode][Asm] Teach LLVM to read and write operand bundles.
Summary:
This also adds the first set of tests for operand bundles.
The optimizer has not been audited to ensure that it does the right
thing with operand bundles.
Depends on D12456.
Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner
Subscribers: maksfb, llvm-commits
Differential Revision: http://reviews.llvm.org/D12457
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248551
91177308-0d34-0410-b5e6-
96231b3b80d8
Ed Maste [Thu, 24 Sep 2015 23:01:16 +0000 (23:01 +0000)]
Restore test coverage for other than ELFOSABI_NONE
Add a FreeBSD test to restore testing of ELF OSABI other than
ELFOSABI_NONE after r248534.
Differential Revision: http://reviews.llvm.org/D13146
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248550
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 24 Sep 2015 22:36:49 +0000 (22:36 +0000)]
Fix typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248549
91177308-0d34-0410-b5e6-
96231b3b80d8
Chad Rosier [Thu, 24 Sep 2015 21:27:49 +0000 (21:27 +0000)]
[AArch64] Improve the readability of the ld/st optimization pass. NFC.
In this context, MI is an add/sub instruction, not a load/store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248540
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Thu, 24 Sep 2015 21:02:17 +0000 (21:02 +0000)]
[X86][SSE2] Fix zero/any extension shuffles that don't start from the first element
Fix for D12561 - we weren't correctly ensuring that the base element for extension was moved to start on a boundary suitable for UNPCKL/H
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248536
91177308-0d34-0410-b5e6-
96231b3b80d8
Rafael Espindola [Thu, 24 Sep 2015 20:57:24 +0000 (20:57 +0000)]
Use ELFOSABI_NONE instead of ELFOSABI_LINUX.
There doesn't seem to be a difference and ELFOSABI_NONE seems to be far more
common:
* Linux doesn't care when loading and puts ELFOSABI_NONE on core dumps.
* Gold and bfd ld produce files with ELFOSABI_NONE.
* Gold and bfd ld seem to ignore EI_OSABI other than for freebsd.
* Gas puts ELFOSABI_NONE in most .o files.
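For reference, the field in question is a single byte of the ELF identification array. The Python sketch below reads it from the first 16 bytes of a file; the function name is hypothetical, and only three of the defined OSABI values are listed.

```python
EI_NIDENT = 16  # size of e_ident
EI_OSABI = 7    # index of the OS/ABI byte within e_ident

OSABI_NAMES = {
    0: "ELFOSABI_NONE",
    3: "ELFOSABI_LINUX",
    9: "ELFOSABI_FREEBSD",
}


def osabi_name(e_ident):
    """Return the symbolic name of the EI_OSABI byte of an ELF header."""
    if len(e_ident) < EI_NIDENT or e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF identification array")
    value = e_ident[EI_OSABI]
    return OSABI_NAMES.get(value, f"unknown({value})")


# A hand-built 64-bit little-endian e_ident with EI_OSABI left at zero.
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
print(osabi_name(ident))
```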
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248534
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 24 Sep 2015 19:52:27 +0000 (19:52 +0000)]
AMDGPU: Add s_dcache_* instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248533
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 24 Sep 2015 19:52:21 +0000 (19:52 +0000)]
AMDGPU: Add cache invalidation instructions.
These are necessary for implementing mem_fence for
OpenCL 2.0.
The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248532
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 24 Sep 2015 19:52:15 +0000 (19:52 +0000)]
AMDGPU: Run mubuf assembler test for CI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248531
91177308-0d34-0410-b5e6-
96231b3b80d8