Programming Languages Research Group: Git

Fix a bug in the lowering of BUILD_VECTOR for AVX. SCALAR_TO_VECTOR does not zero untouched elements. Use INSERT_VECTOR_ELT instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147948 91177308-0d34-0410-b5e6-96231b3b80d8

Don't try to create a GEP when the pointee type is unsized (such GEPs
are invalid). Fixes a crash on array1.C from the GCC testsuite when
compiled with dragonegg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147946 91177308-0d34-0410-b5e6-96231b3b80d8

Disable the transformation I added in r147936 to see if it fixes some
strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147945 91177308-0d34-0410-b5e6-96231b3b80d8

Hoist a really redundant code pattern into a helper function, and delete
lots of lines of code. No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147942 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify the AND-rooted mask+shift checking code to match that of the
SRL-rooted code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147941 91177308-0d34-0410-b5e6-96231b3b80d8

Unify the interface of the three mask+shift transform helpers, and
factor the differences that were hiding in one of them into its other
caller, the SRL handling code. No change in behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147940 91177308-0d34-0410-b5e6-96231b3b80d8

Clarify and make explicit some of the requirements for transforming
mask+shift pairs at the beginning of the ISD::AND case block, and then
hoist the final pattern into a helper function, simplifying and
reflowing it appropriately. This should have no observable behavior
change, but several simplifications fell out of this such as directly
computing the new mask constant, etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147939 91177308-0d34-0410-b5e6-96231b3b80d8

Fix undefined code and reenable test case.

I don't think the compact encoding code is right, but at least is has
defined behavior now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147938 91177308-0d34-0410-b5e6-96231b3b80d8

Hoist the logic to transform shift+mask combinations into sub-register
extracts and scaled addressing modes into its own helper function. No
functionality changed here, just hoisting and layout fixes falling out
of that hoisting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147937 91177308-0d34-0410-b5e6-96231b3b80d8

Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

*(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147936 91177308-0d34-0410-b5e6-96231b3b80d8

Improved compile time:
1. Size heuristics changed. Now we calculate number of unswitching
branches only once per loop.
2. Some checks was moved from UnswitchIfProfitable to
processCurrentLoop, since it is not changed during processCurrentLoop
iteration. It allows decide to skip some loops at an early stage.
Extended statistics:
- Added total number of instructions analyzed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147935 91177308-0d34-0410-b5e6-96231b3b80d8

llvm/test/CodeGen/X86/zext-fold.ll: Relax an expression in stack offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147928 91177308-0d34-0410-b5e6-96231b3b80d8

llvm/test/CodeGen/X86/sub-with-overflow.ll: Add explicit -mtriple=i686-linux.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147927 91177308-0d34-0410-b5e6-96231b3b80d8

Clarified the SCEV getSmallConstantTripCount interface with in-your-face comments.

This interface is misleading and dangerous, but it is actually what we need for unrolling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147926 91177308-0d34-0410-b5e6-96231b3b80d8

Add big endian mips support. Based on a patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147924 91177308-0d34-0410-b5e6-96231b3b80d8

Add the skeleton of an asm parser for mips.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147923 91177308-0d34-0410-b5e6-96231b3b80d8

ARM Ld/St Optimizer fix.

Allow LDRD to be formed from pairs with different LDR encodings. This was the original intention of the pass. Somewhere along the way, the LDR opcodes were refined which broke the optimization. We really don't care what the original opcodes are as long as they both map to the same LDRD and the immediate still fits.

Fixes rdar://10435045 ARMLoadStoreOptimization cannot handle mixed LDRi8/LDRi12

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147922 91177308-0d34-0410-b5e6-96231b3b80d8

Disable test that seems to expose an unrelated Linux issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147921 91177308-0d34-0410-b5e6-96231b3b80d8

Detect when a value is undefined on an edge to a landing pad.

Consider this code:

int h() {
  int x;
  try {
    x = f();
    g();
  } catch (...) {
    return x+1;
  }
  return x;
}

The variable x is undefined on the first edge to the landing pad, but it
has the f() return value on the second edge to the landing pad.

SplitAnalysis::getLastSplitPoint() would assume that the return value
from f() was live into the landing pad when f() throws, which is of
course impossible.

Detect these cases, and treat them as if the landing pad wasn't there.
This allows spill code to be inserted after the function call to f().

<rdar://problem/10664933>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147912 91177308-0d34-0410-b5e6-96231b3b80d8

Exclusively use SplitAnalysis::getLastSplitPoint().

Delete the alternative implementation in LiveIntervalAnalysis.

These functions computed the same thing, but SplitAnalysis caches the
result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147911 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid CSE of instructions which define physical registers across MBBs unless
the physical registers are not allocatable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147902 91177308-0d34-0410-b5e6-96231b3b80d8

If the global variable is removed by the linker, then don't constant merge it
with other symbols.

An object in the __cfstring section is suppoed to be filled with CFString
objects, which have a pointer to ___CFConstantStringClassReference followed by a
pointer to a __cstring. If we allow the object in the __cstring section to be
merged with another global, then it could end up in any section. Because the
linker is going to remove these symbols in the final executable, we shouldn't
bother to merge them.
<rdar://problem/10564621>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147899 91177308-0d34-0410-b5e6-96231b3b80d8

Don't avoid recursing for pointer types, just reference types. Expand on
the comment.

Fixes constvars.exp on the gdb test builder.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147897 91177308-0d34-0410-b5e6-96231b3b80d8

Add test case for r147881.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147891 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed order of operands in comment to match code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147890 91177308-0d34-0410-b5e6-96231b3b80d8

Default stack alignment for 32bit x86 should be 4 Bytes, not 8 Bytes.
Add a test that checks the stack alignment of a simple function for
Darwin, Linux and NetBSD for 32bit and 64bit mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147888 91177308-0d34-0410-b5e6-96231b3b80d8

Consider unknown alignment caused by OptimizeThumb2Instructions().

This function runs after all constant islands have been placed, and may
shrink some instructions to their 2-byte forms. This can actually cause
some constant pool entries to move out of range because of growing
alignment padding.

Treat instructions that may be shrunk the same as inline asm - they
erode the known alignment bits.

Also reinstate an old assertion in verify(). It is correct now that
basic block offsets include alignments.

Add a single large test case that will hopefully exercise many parts of
the constant island pass.

<rdar://problem/10670199>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147885 91177308-0d34-0410-b5e6-96231b3b80d8

80 col violation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147884 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing VEX predicates to VMOVSDto64rr/VMOVSDto64mr. This fixes a few
failing test cases on our internal AVX nightly tester.
rdar://10663637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147881 91177308-0d34-0410-b5e6-96231b3b80d8

Let asm parser query asm syntax dialect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147880 91177308-0d34-0410-b5e6-96231b3b80d8

This is the matching change for the data structure name changes for the
functional change in r147860 to use DW_TAG_label's instead TAG_subprogram's.
This only changes names and updates comments. No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147877 91177308-0d34-0410-b5e6-96231b3b80d8

ARM updating VST2 pseudo-lowering fixed vs. register update.

rdar://10663487

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147876 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some leftover control reaches end of non-void function warnings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147874 91177308-0d34-0410-b5e6-96231b3b80d8

Teach the triple library about the androideabi environment.

Patch by Evgeniy Stepanov.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147871 91177308-0d34-0410-b5e6-96231b3b80d8

Move default case for covered enum outside of switch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147870 91177308-0d34-0410-b5e6-96231b3b80d8

For i386, don't use the generic code.

As the comment around 7746 says, it's better to use the x87 extended precision
here than SSE. And the generic code doesn't know how to do that. It also regains
the speed lost for the uint64_to_float.c testcase.
<rdar://problem/10669858>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147869 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a -Wreturn-type warning in g++.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147867 91177308-0d34-0410-b5e6-96231b3b80d8

Cleanup these asserts to follow common LLVM style and coding
conventions. Also, clarify the grouping of one of the asserts to silence
-Wparentheses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147863 91177308-0d34-0410-b5e6-96231b3b80d8

Add 'llvm_unreachable' to passify GCC's understanding of the constraints
of several newly un-defaulted switches. This also helps optimizers
(including LLVM's) recognize that every case is covered, and we should
assume as much.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147861 91177308-0d34-0410-b5e6-96231b3b80d8

Various crash reporting tools have a problem with the dwarf generated for
assembly source when it generates the TAG_subprogram dwarf debug info for
the labels that have nothing between them as in this bit of assembly source:

% cat ZeroLength.s
_func1:
_func2:
nop

One solution would be to not emit the subsequent labels with the same address
and use the next label with a different address or the end of the section for
the AT_high_pc value of the TAG_subprogram.

Turns out in llvm-mc it is not possible in all cases to determine of two
symbols have the same value at the point we put out the TAG_subprogram dwarf
debug info.

So we will have llvm-mc instead of putting out TAG_subprogram's put out
DW_TAG_label's. And the DW_TAG_label does not have a AT_high_pc value which
avoids the problem.

This commit is only the functional change to make the diffs clear as to what is
really being changed. The next commit will be to clean up the names of such
things like MCGenDwarfSubprogramEntry to something like MCGenDwarfLabelEntry.

rdar://10666925

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147860 91177308-0d34-0410-b5e6-96231b3b80d8

Add definition for intel asm variant.

Right now, this just adds additional entries in match table. The parser does not use them yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147859 91177308-0d34-0410-b5e6-96231b3b80d8

Record asm variant id in MatchEntry and check it while matching instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147858 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unnecessary default cases in switches that cover all enum values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147855 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a bug in the legalization of shuffle vectors. When we emulate shuffles using BUILD_VECTORS we may be using a BV of different type. Make sure to cast it back.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147851 91177308-0d34-0410-b5e6-96231b3b80d8

Add definitions for AMD's bobcat (aka btver1)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147846 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a crash in AVX2 when trying to broadcast a double into a 128-bit vector. There is no vbroadcastsd xmm, but we do need to support 64-bit integers broadcasted into xmm. Also factor the AVX check into the isVectorBroadcast function. This makes more sense since the AVX2 check was already inside.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147844 91177308-0d34-0410-b5e6-96231b3b80d8

Remove hasXMM/hasXMMInt functions. Move callers to hasSSE1/hasSSE2. This is the final piece to remove the AVX hack that disabled SSE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147843 91177308-0d34-0410-b5e6-96231b3b80d8

Remove hasSSE*orAVX functions and change all callers to use just hasSSE*. AVX is now an SSE level and no longer disables SSE checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147842 91177308-0d34-0410-b5e6-96231b3b80d8

Instruction selection priority fixes to remove the XMM/XMMInt/orAVX predicates. Another commit will remove orAVX functions from X86SubTarget.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147841 91177308-0d34-0410-b5e6-96231b3b80d8

Allow machine-cse to look across MBB boundary when cse'ing instructions that
define physical registers. It's currently very restrictive, only catching
cases where the CE is in an immediate (and only) predecessor. But it catches
a surprising large number of cases.

rdar://10660865

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147827 91177308-0d34-0410-b5e6-96231b3b80d8

Enable LSR IV Chains with sufficient heuristics.

These heuristics are sufficient for enabling IV chains by
default. Performance analysis has been done for i386, x86_64, and
thumbv7. The optimization is rarely important, but can significantly
speed up certain cases by eliminating spill code within the
loop. Unrolled loops are prime candidates for IV chains. In many
cases, the final code could still be improved with more target
specific optimization following LSR. The goal of this feature is for
LSR to make the best choice of induction variables.

Instruction selection may not completely take advantage of this
feature yet. As a result, there could be cases of slight code size
increase.

Code size can be worse on x86 because it doesn't support postincrement
addressing. In fact, when chains are formed, you may see redundant
address plus stride addition in the addressing mode. GenerateIVChains
tries to compensate for the common cases.

On ARM, code size increase can be mitigated by using postincrement
addressing, but downstream codegen currently misses some opportunities.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147826 91177308-0d34-0410-b5e6-96231b3b80d8

Accurately model hardware alignment rounding.

On Thumb, the displacement computation hardware uses the address of the
current instruction rouned down to a multiple of 4. Include this
rounding in the UserOffset we compute for each instruction.

When inline asm is present, the instruction alignment may not be known.
Constrain the maximum displacement instead in that case.

This makes it possible for CreateNewWater() and OffsetIsInRange() to
agree about the valid displacements. When they disagree, infinite
looping happens.

As always, test cases for this stuff are insane.

<rdar://problem/10660175>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147825 91177308-0d34-0410-b5e6-96231b3b80d8

Remove the logging streamer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147820 91177308-0d34-0410-b5e6-96231b3b80d8

Catch runaway ARMConstantIslandPass even in -Asserts builds.

The pass is prone to looping, and it is better to crash than loop
forever, even in a -Asserts build.

<rdar://problem/10660175>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147806 91177308-0d34-0410-b5e6-96231b3b80d8

Fix asm string wrt variants.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147805 91177308-0d34-0410-b5e6-96231b3b80d8

Use descriptive variable name and remove incorrect operand number check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147802 91177308-0d34-0410-b5e6-96231b3b80d8

Adding IV chain generation to LSR.

After collecting chains, check if any should be materialized. If so,
hide the chained IV users from the LSR solver. LSR will only solve for
the head of the chain. GenerateIVChains will then materialize the
chained IV users by computing the IV relative to its previous value in
the chain.

In theory, chained IV users could be exposed to LSR's solver. This
would be considerably complicated to implement and I'm not aware of a
case where we need it. In practice it's more important to
intelligently prune the search space of nontrivial loops before
running the solver, otherwise the solver is often forced to prune the
most optimal solutions. Hiding the chained users does this well, so
that LSR is more likely to find the best IV for the chain as a whole.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147801 91177308-0d34-0410-b5e6-96231b3b80d8

Adding collection of IV chains to LSR.

This collects a set of IV uses within the loop whose values can be
computed relative to each other in a sequence. Following checkins will
make use of this information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147797 91177308-0d34-0410-b5e6-96231b3b80d8

Split AsmParser into two components - AsmParser and AsmParserVariant

AsmParser holds info specific to target parser.
AsmParserVariant holds info specific to asm variants supported by the target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147787 91177308-0d34-0410-b5e6-96231b3b80d8

"Minor LSR debugging stuff"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147785 91177308-0d34-0410-b5e6-96231b3b80d8

Update language check. Do not ignore DW_LANG_Python.
Patch by Joe Groff!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147781 91177308-0d34-0410-b5e6-96231b3b80d8

Move assert to the right place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147779 91177308-0d34-0410-b5e6-96231b3b80d8

InstCombine: Teach foldLogOpOfMaskedICmpsHelper that sign bit tests are bit tests.

This subsumes several other transforms while enabling us to catch more cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147777 91177308-0d34-0410-b5e6-96231b3b80d8

Don't rely on the fact that shift values are never very large, and thus
this substraction will result in small negative numbers at worst which
become very large positive numbers on assignment and are thus caught by
the <=4 check on the next line. The >0 check clearly intended to catch
these as negative numbers.

Spotted by inspection, and impossible to trigger given the shift widths
that can be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147773 91177308-0d34-0410-b5e6-96231b3b80d8

Cleanup and FileCheck-ize a test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147772 91177308-0d34-0410-b5e6-96231b3b80d8

Remove AVX hack in X86Subtarget. AVX/AVX2 are now treated as an SSE level. Predicate functions have been altered to maintain previous names and behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147770 91177308-0d34-0410-b5e6-96231b3b80d8

Add HasAVX predicate to some of the AVX patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147769 91177308-0d34-0410-b5e6-96231b3b80d8

Reorder a bunch of patterns to put the AVX version first thus giving it priority over the SSE version. Another step towards trying to remove the AVX hack that disables SSE from X86Subtarget.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147768 91177308-0d34-0410-b5e6-96231b3b80d8

Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147767 91177308-0d34-0410-b5e6-96231b3b80d8

Mark MOVNTI as being supported in SSE2 OR AVX mode. This instruction has no AVX equivalent so we should use the SSE version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147766 91177308-0d34-0410-b5e6-96231b3b80d8

Move SSE2 logical operations PAND/POR/PXOR/PANDN above SSE1 logical operations ANDPS/ORPS/XORPS/ANDNPS. This fixes a pattern ordering issue that meant that the SSE2 instructions could never be directly selected since the SSE1 patterns would always match first. This is largely moot with the ExeDepsFix pass, but I'm trying to audit for all such ordering issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147765 91177308-0d34-0410-b5e6-96231b3b80d8

Change some places that were checking for AVX OR SSE1/2 to use hasXMM/hasXMMInt instead. Also fix one place that checked SSE3, but accidentally excluded AVX to use hasSSE3orAVX. This is a step towards removing the AVX hack from the X86Subtarget.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147764 91177308-0d34-0410-b5e6-96231b3b80d8

Don't print an unused label before .cfi_endproc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147763 91177308-0d34-0410-b5e6-96231b3b80d8

Don't disable MMX support when AVX is enabled. Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147762 91177308-0d34-0410-b5e6-96231b3b80d8

Enable FISTTP* instructions when AVX is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147758 91177308-0d34-0410-b5e6-96231b3b80d8

Tweak my last commit to be less conservative about uses.

We still save an instruction when just the "and" part is replaced.
Also change the code to match comments more closely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147753 91177308-0d34-0410-b5e6-96231b3b80d8

Don't forget to transfer implicit uses of return instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147752 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid eraseing copies from a reserved register unless the definition can be
safely proven not to have been clobbered. No small test case possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147751 91177308-0d34-0410-b5e6-96231b3b80d8

InstCombine: If we have a bit test and a sign test anded/ored together, merge the sign bit into the bit test.

This is common in bit field code, e.g. checking if the first or the last bit of a bit field is set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147749 91177308-0d34-0410-b5e6-96231b3b80d8

Reverted commit #147601 upon Evan's request.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147748 91177308-0d34-0410-b5e6-96231b3b80d8

Remove MCELFStreamer.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147745 91177308-0d34-0410-b5e6-96231b3b80d8

Don't print a label before .cfi_startproc when we don't need to. This makes
the produce assembly when using CFI just a bit more readable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147743 91177308-0d34-0410-b5e6-96231b3b80d8

Make clever use of alignment and padding to shrink GlobalValue.

-8 bytes on x86_64, no change on x86.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147742 91177308-0d34-0410-b5e6-96231b3b80d8

Match SelectionDAG logic for enabling movt.

Darwin doesn't do static, and ELF targets only support static.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147740 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo in the X86 backend readme. Patch from Jaeden Amero.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147739 91177308-0d34-0410-b5e6-96231b3b80d8

Remove VectorExtras. This unused helper was written for a type of API that is discouraged now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147738 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unnecessary check of hasAVX(). It's already included in hasXMM().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147734 91177308-0d34-0410-b5e6-96231b3b80d8

Replace some uses of hasNUsesOfValue(0, X) with !hasAnyUseOfValue(X)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147733 91177308-0d34-0410-b5e6-96231b3b80d8

Port the trick to skip the check for empty buckets from StringMap to DenseMap.

This should fix the odd behavior that find() is slower than lookup().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147731 91177308-0d34-0410-b5e6-96231b3b80d8

Add some DAG combines for SUBC/SUBE. If nothing uses the carry/borrow out of subc, turn it into a sub. Turn (subc x, x) into 0 with no borrow. Turn (subc x, 0) into x with no borrow. Turn (subc -1, x) into (xor x, -1) with no borrow. Turn sube with no borrow in into subc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147728 91177308-0d34-0410-b5e6-96231b3b80d8

Fix TableGen so that it will emit the correct signature for FastEmit_f:

  /// FastEmit_f - This method is called by target-independent code
  /// to request that an instruction with the given type, opcode, and
  /// floating-point immediate operand be emitted.
  virtual unsigned FastEmit_f(MVT VT,
                              MVT RetVT,
                              unsigned Opcode,
                              const ConstantFP *FPImm);

Currently, it emits an accidentally overloaded version without the const on the
ConstantFP*. This doesn't affect anything in the tree, since nothing causes that
method to be autogenerated, but I have been playing with some ARM TableGen
refactorings that hit this problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147727 91177308-0d34-0410-b5e6-96231b3b80d8

Optimize reserved register coalescing.

Reserved registers don't have proper live ranges, their LiveInterval
simply has a snippet of liveness for each def. Virtual registers with a
single value that is a copy of a reserved register (typically %esp) can
be coalesced with the reserved register if the live range doesn't
overlap any reserved register defs.

When coalescing with a reserved register, don't modify the reserved
register live range. Just leave it as a bunch of dead defs. This
eliminates quadratic coalescer behavior in i386 functions with many
function calls.

PR11699

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147726 91177308-0d34-0410-b5e6-96231b3b80d8

Use the 'regalloc' debug tag for most register allocator tracing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147725 91177308-0d34-0410-b5e6-96231b3b80d8

Enable redundant phi elimination after LSR.

This will be more important as we extend the LSR pass in ways that don't rely on the formula solver. In particular, we need it for constructing IV chains.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147724 91177308-0d34-0410-b5e6-96231b3b80d8

Fix dead link

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147721 91177308-0d34-0410-b5e6-96231b3b80d8

Use getRegForValue() to materialize the address of ARM globals.

This enables basic local CSE, giving us 20% smaller code for
consumer-typeset in -O0 builds.

<rdar://problem/10658692>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147720 91177308-0d34-0410-b5e6-96231b3b80d8

Revert part of r147716. Looks like x87 instructions kill markers are all messed
up so branch folding pass can't use the scavenger. :-( This doesn't breaks
anything currently. It just means targets which do not carefully update kill
markers cannot run post-ra scheduler (not new, it has always been the case).

We should fix this at some point since it's really hacky.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147719 91177308-0d34-0410-b5e6-96231b3b80d8

LSR: Don't optimize loops if an outer loop has no preheader.

LoopSimplify may not run on some outer loops, e.g. because of indirect
branches. SCEVExpander simply cannot handle outer loops with no preheaders.
Fixes rdar://10655343 SCEVExpander segfault.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147718 91177308-0d34-0410-b5e6-96231b3b80d8

Split Finish into Finish and FinishImpl to have a common place to do end of
file error checking. Use that to error on an unfinished cfi_startproc.

The error is not nice, but is already better than a segmentation fault.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147717 91177308-0d34-0410-b5e6-96231b3b80d8

Added a late machine instruction copy propagation pass. This catches
opportunities that only present themselves after late optimizations
such as tail duplication .e.g.
## BB#1:
        movl    %eax, %ecx
        movl    %ecx, %eax
        ret

The register allocator also leaves some of them around (due to false
dep between copies from phi-elimination, etc.)

This required some changes in codegen passes. Post-ra scheduler and the
pseudo-instruction expansion passes have been moved after branch folding
and tail merging. They were before branch folding before because it did
not always update block livein's. That's fixed now. The pass change makes
independently since we want to properly schedule instructions after
branch folding / tail duplication.

rdar://10428165
rdar://10640363

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@147716 91177308-0d34-0410-b5e6-96231b3b80d8