Programming Languages Research Group: Git

CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext

This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc,
and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation
path is now uniform, regardless of the input IR:

fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall
fpext -> FP16_TO_FP (if f16 illegal) -> libcall

Each target should be able to select the version that best matches its
operations and not be required to duplicate patterns for both fptrunc
and FP_TO_FP16 (for example).

As a result we can remove some redundant AArch64 patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213507 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG,cleanup] Switch the DAG combiner over to use the spelling
'Worklist' consistently rather than a deeply confusing mixture of
'WorkList' and 'Worklist'.

Notably, the very 'WorkList' of the DAG combiner was exposed to target
specific DAG combines under an interface 'AddToWorklist' which was
implemented by in turn calling 'AddToWorkList' in the combiner. This has
sent me circling with the wrong case in grep one too many times.

I chose to normalize on 'Worklist' because that one won the grep-vote
for llvm/lib/... by a hundered hits or so, and it is used in places
relatively "canonical" such as InstCombine's Worklist. Let's all jsut
pick this casing, whether "correct", "good", or "bad" and be
consistent...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213506 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG] Rather than using a narrow test against the one dummy node on the
stack, filter all handle nodes from the DAG combiner worklist.

This will also handle cases where other handle nodes might be
(erroneously) added to the worklist and then cause bugs and explosions
when deleted. For example, when running the legalizer within the DAG
combiner, there are times when other handle nodes are used and can end
up here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213505 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Improve the shuffle-vector folding logic.

Canonicalize shuffles according to rules:
*  shuffle(A, shuffle(A, B)) -> shuffle(shuffle(A,B), A)
*  shuffle(B, shuffle(A, B)) -> shuffle(shuffle(A,B), B)
*  shuffle(B, shuffle(A, Undef)) -> shuffle(shuffle(A, Undef), B)

This patch helps identifying more shuffle pairs that could be combined reusing
the already existing rules in the DAGCombiner.

Added new test 'combine-vec-shuffle-5.ll' to verify that the canonicalized
shuffles are now folded into a single shuffle node by the DAGCombiner.
Added more test cases to 'combine-vec-shuffle-4.ll'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213504 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Refactor some logic. No functional change.

This patch removes function 'CommuteVectorShuffle' from X86ISelLowering.cpp
and moves its logic into SelectionDAG.cpp as method 'getCommutedVectorShuffles'.
This refactoring is in preperation of an upcoming change to the DAGCombiner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213503 91177308-0d34-0410-b5e6-96231b3b80d8

Fix for regression: [Bug 20369] wrong code at -O3 on x86_64-linux-gnu in 64-bit mode

Prevents hoisting of loads above stores and sinking of stores below loads
in MergedLoadStoreMotion.cpp (rdar://15991737)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213497 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] ELFv2 aggregate passing support

This patch adds infrastructure support for passing array types
directly.  These can be used by the front-end to pass aggregate
types (coerced to an appropriate array type).  The details of the
array type being used inform the back-end about ABI-relevant
properties.  Specifically, the array element type encodes:
- whether the parameter should be passed in FPRs, VRs, or just
  GPRs/stack slots  (for float / vector / integer element types,
  respectively)
- what the alignment requirements of the parameter are when passed in
  GPRs/stack slots  (8 for float / 16 for vector / the element type
  size for integer element types) -- this corresponds to the
  "byval align" field

Using the infrastructure provided by this patch, a companion patch
to clang will enable two features:
- In the ELFv2 ABI, pass (and return) "homogeneous" floating-point
  or vector aggregates in FPRs and VRs (this is similar to the ARM
  homogeneous aggregate ABI)
- As an optimization for both ELFv1 and ELFv2 ABIs, pass aggregates
  that fit fully in registers without using the "byval" mechanism

The patch uses the functionArgumentNeedsConsecutiveRegisters callback
to encode that special treatment is required for all directly-passed
array types.  The isInConsecutiveRegs / isInConsecutiveRegsLast bits set
as a results are then used to implement the required size and alignment
rules in CalculateStackSlotSize / CalculateStackSlotAlignment etc.

As a related change, the ABI routines have to be modified to support
passing floating-point types in GPRs.  This is necessary because with
homogeneous aggregates of 4-byte float type we can now run out of FPRs
*before* we run out of the 64-byte argument save area that is shadowed
by GPRs.  Any extra floating-point arguments that no longer fit in FPRs
must now be passed in GPRs until we run out of those too.

Note that there was already code to pass floating-point arguments in
GPRs used with vararg parameters, which was done by writing the argument
out to the argument save area first and then reloading into GPRs.  The
patch re-implements this, however, in favor of code packing float arguments
directly via extension/truncation, BITCAST, and BUILD_PAIR operations.

This is required to support the ELFv2 ABI, since we cannot unconditionally
write to the argument save area (which the caller might not have allocated).
The change does, however, affect ELFv1 varags routines too; but even here
the overall effect should be advantageous: Instead of loading the argument
into the FPR, then storing the argument to the stack slot, and finally
reloading the argument from the stack slot into a GPR, the new code now
just loads the argument into the FPR, and subsequently loads the argument
into the GPR (via BITCAST).  That BITCAST might imply a save/reload from
a stack temporary (in which case we're no worse than before); but it
might be implemented more efficiently in some cases.

The final part of the patch enables up to 8 FPRs and VRs for argument
return in PPCCallingConv.td; this is required to support returning
ELFv2 homogeneous aggregates.  (Note that this doesn't affect other ABIs
since LLVM wil only look for which register to use if the parameter is
marked as "direct" return anyway.)

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213493 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] ELFv2 explicit CFI for CR fields

This is a minor improvement in the ELFv2 ABI.   In ELFv1, DWARF CFI
would represent a saved CR word (holding CR fields CR2, CR3, and CR4)
using just a single CFI record refering to CR2.   In ELFv2 instead,
each of the CR fields is represented by its own CFI record.  The
advantage is that the compiler can now chose to save just a single
(or two) CR fields instead of all of them, if those are the only ones
that actually need saving.  That can lead to more efficient code using
mf(o)crf instead of the (slow) mfcr instruction.

Note that this patch does not (yet) implement this more efficient
code generation, but it does implement the part that is required to
be ABI compliant: creating multiple CFI records if multiple CR fields
are saved.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213492 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] ELFv2 dynamic loader support

This patch enables the new ELFv2 ABI in the runtime dynamic loader.
The loader has to implement the following features:
- In the ELFv2 ABI, do not look up a function descriptor in .opd, but
  instead use the local entry point when resolving a direct call.
- Update the TOC restore code to use the new TOC slot linkage area
  offset.
- Create PLT stubs appropriate for the ELFv2 ABI.

Note that this patch also adds common-code changes. These are necessary
because the loader must check the newly added ELF flags: the e_flags
header bits encoding the ABI version, and the st_other symbol table
entry bits encoding the local entry point offset.  There is currently
no way to access these, so I've added ObjectFile::getPlatformFlags and
SymbolRef::getOther accessors.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213491 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] ELFv2 stack space reduction

The ELFv2 ABI reduces the amount of stack required to implement an
ABI-compliant function call in two ways:
* the "linkage area" is reduced from 48 bytes to 32 bytes by
  eliminating two unused doublewords
* the 64-byte "parameter save area" is now optional and need not be
  present in certain cases (it remains mandatory in functions with
  variable arguments, and functions that have any parameter that is
  passed on the stack)

The following patch implements this required changes:
- reducing the linkage area, and associated relocation of the TOC save
  slot, in getLinkageSize / getTOCSaveOffset (this requires updating all
  callers of these routines to pass in the isELFv2ABI flag).
- (partially) handling the case where the parameter save are is optional

This latter part requires some extra explanation:  Currently, we still
always allocate the parameter save area when *calling* a function.
That is certainly always compliant with the ABI, but may cause code to
allocate stack unnecessarily.  This can be addressed by a follow-on
optimization patch.

On the *callee* side, in LowerFormalArguments, we *must* track
correctly whether the ABI guarantees that the caller has allocated
the parameter save area for our use, and the patch does so. However,
there is one complication: the code that handles incoming "byval"
arguments will currently *always* write to the parameter save area,
because it has to force incoming register arguments to the stack since
it must return an *address* to implement the byval semantics.

To fix this, the patch changes the LowerFormalArguments code to write
arguments to a freshly allocated stack slot on the function's own stack
frame instead of the argument save area in those cases where that area
is not present.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213490 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] ELFv2 function call changes

This patch builds upon the two preceding MC changes to implement the
basic ELFv2 function call convention.  In the ELFv1 ABI, a "function
descriptor" was associated with every function, pointing to both the
entry address and the related TOC base (and a static chain pointer
for nested functions).  Function pointers would actually refer to that
descriptor, and the indirect call sequence needed to load up both entry
address and TOC base.

In the ELFv2 ABI, there are no more function descriptors, and function
pointers simply refer to the (global) entry point of the function code.
Indirect function calls simply branch to that address, after loading it
up into r12 (as required by the ABI rules for a global entry point).
Direct function calls continue to just do a "bl" to the target symbol;
this will be resolved by the linker to the local entry point of the
target function if it is local, and to a PLT stub if it is global.
That PLT stub would then load the (global) entry point address of the
final target into r12 and branch to it.  Note that when performing a
local function call, r2 must be set up to point to the current TOC
base: if the target ends up local, the ABI requires that its local
entry point is called with r2 set up; if the target ends up global,
the PLT stub requires that r2 is set up.

This patch implements all LLVM changes to implement that scheme:
- No longer create a function descriptor when emitting a function
  definition (in EmitFunctionEntryLabel)
- Emit two entry points *if* the function needs the TOC base (r2)
  anywhere (this is done EmitFunctionBodyStart; note that this cannot
  be done in EmitFunctionBodyStart because the global entry point
  prologue code must be *part* of the function as covered by debug info).
- In order to make use tracking of r2 (as needed above) work correctly,
  mark direct function calls as implicitly using r2.
- Implement the ELFv2 indirect function call sequence (no function
  descriptors; load target address into r12).
- When creating an ELFv2 object file, emit the .abiversion 2 directive
  to tell the linker to create the appropriate version of PLT stubs.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213489 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] Remove an unused private AA pointer

Thanks to the lld-x86_64-darwin13 builder for catching this first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213488 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] Pass MCSymbolData to needsRelocateWithSymbol

As discussed in a previous checking to support the .localentry
directive on PowerPC, we need to inspect the actual target symbol
in needsRelocateWithSymbol to make the appropriate decision based
on that symbol's st_other bits.

Currently, needsRelocateWithSymbol does not get the target symbol.
However, it is directly available to its sole caller. This patch
therefore simply extends the needsRelocateWithSymbol by a new
parameter "const MCSymbolData &SD", passes in the target symbol,
and updates all derived implementations.

In particular, in the PowerPC implementation, this patch removes
the FIXME added by the previous checkin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213487 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] Use AA to partition potential dependency checks

Prior to this change, the loop vectorizer did not make use of the alias
analysis infrastructure. Instead, it performed memory dependence analysis using
ScalarEvolution-based linear dependence checks within equivalence classes
derived from the results of ValueTracking's GetUnderlyingObjects.

Unfortunately, this meant that:
  1. The loop vectorizer had logic that essentially duplicated that in BasicAA
     for aliasing based on identified objects.
  2. The loop vectorizer could not partition the space of dependency checks
     based on information only easily available from within AA (TBAA metadata is
     currently the prime example).

This means, for example, regardless of whether -fno-strict-aliasing was
provided, the vectorizer would only vectorize this loop with a runtime
memory-overlap check:

void foo(int *a, float *b) {
  for (int i = 0; i < 1600; ++i)
    a[i] = b[i];
}

This is suboptimal because the TBAA metadata already provides the information
necessary to show that this check unnecessary. Of course, the vectorizer has a
limit on the number of such checks it will insert, so in practice, ignoring
TBAA means not vectorizing more-complicated loops that we should.

This change causes the vectorizer to use an AliasSetTracker to keep track of
the pointers in the loop. The resulting alias sets are then used to partition
the space of dependency checks, and potential runtime checks; this results in
more-efficient vectorizations.

When pointer locations are added to the AliasSetTracker, two things are done:
  1. The location size is set to UnknownSize (otherwise you'd not catch
     inter-iteration dependencies)
  2. For instructions in blocks that would need to be predicated, TBAA is
     removed (because the metadata might have a control dependency on the condition
     being speculated).

For non-predicated blocks, you can leave the TBAA metadata. This is safe
because you can't have an iteration dependency on the TBAA metadata (if you
did, and you unrolled sufficiently, you'd end up with the same pointer value
used by two accesses that TBAA says should not alias, and that would yield
undefined behavior).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213486 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] ELFv2 MC support for .localentry directive

A second binutils feature needed to support ELFv2 is the .localentry
directive.  In the ELFv2 ABI, functions may have two entry points:
one for calling the routine locally via "bl", and one for calling the
function via function pointer (either at the source level, or implicitly
via a PLT stub for global calls).  The two entry points share a single
ELF symbol, where the ELF symbol address identifies the global entry
point address, while the local entry point is found by adding a delta
offset to the symbol address.  That offset is encoded into three
platform-specific bits of the ELF symbol st_other field.

The .localentry directive instructs the assembler to set those fields
to encode a particular offset.  This is typically used by a function
prologue sequence like this:

func:
        addis r2, r12, (.TOC.-func)@ha
        addi r2, r2, (.TOC.-func)@l
        .localentry func, .-func

Note that according to the ABI, when calling the global entry point,
r12 must be set to point the global entry point address itself; while
when calling the local entry point, r2 must be set to point to the TOC
base.  The two instructions between the global and local entry point in
the above example translate the first requirement into the second.

This patch implements support in the PowerPC MC streamers to emit the
.localentry directive (both into assembler and ELF object output), as
well as support in the assembler parser to parse that directive.

In addition, there is another change required in MC fixup/relocation
handling to properly deal with relocations targeting function symbols
with two entry points: When the target function is known local, the MC
layer would immediately handle the fixup by inserting the target
address -- this is wrong, since the call may need to go to the local
entry point instead.  The GNU assembler handles this case by *not*
directly resolving fixups targeting functions with two entry points,
but always emits the relocation and relies on the linker to handle
this case correctly.  This patch changes LLVM MC to do the same (this
is done via the processFixupValue routine).

Similarly, there are cases where the assembler would normally emit a
relocation, but "simplify" it to a relocation targeting a *section*
instead of the actual symbol.  For the same reason as above, this
may be wrong when the target symbol has two entry points.  The GNU
assembler again handles this case by not performing this simplification
in that case, but leaving the relocation targeting the full symbol,
which is then resolved by the linker.  This patch changes LLVM MC
to do the same (via the needsRelocateWithSymbol routine).
NOTE: The method used in this patch is overly pessimistic, since the
needsRelocateWithSymbol routine currently does not have access to the
actual target symbol, and thus must always assume that it might have
two entry points.  This will be improved upon by a follow-on patch
that modifies common code to pass the target symbol when calling
needsRelocateWithSymbol.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213485 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] ELFv2 MC support for .abiversion directive

ELFv2 binaries are marked by a bit in the ELF header e_flags field.
A new assembler directive .abiversion can be used to set that flag.
This patch implements support in the PowerPC MC streamers to emit the
.abiversion directive (both into assembler and ELF binary output),
as well as support in the assembler parser to parse the .abiversion
directive.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213484 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Refactor byval handling in LowerFormalArguments_64SVR4

When handling an incoming byval argument, we need to possibly write
incoming registers to the stack in order to create an on-stack image
of the parameter, so we can return its address to common code.

This currently uses CreateFixedObject to access the parts of the
parameter save area where the argument is (or needs to be) stored.
However, sometimes we need to access multiple parts of that area,
e.g. to write multiple registers.  The code currently uses a new
CreateFixedObject call for each of these accesses, resulting in
a patchwork of overlapping (fixed) stack objects.

This doesn't really matter in the case of fixed objects, since
any access to those turns into a fixed stackpointer + offset
address anyway.  However, with the upcoming ELFv2 patches, we
may actually need to place an incoming argument into our *own*
stack frame instead of the caller's.  This means we need to use
CreateStackObject instead, and we cannot have multiple overlapping
instances of those.

To make the rest of the argument handling code work equally in
both situations, this patch refactors it to always use just a
single call to CreateFixedObject, and access parts of that object
as required using address arithmetic.  This way, we can in a future
patch substitute CreateStackObject without further changes.

No change to generated code intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213483 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix FrameIndex handling in SelectAddressRegImm

The PPCTargetLowering::SelectAddressRegImm routine needs to handle
FrameIndex nodes in a special manner, by tranlating them into a
TargetFrameIndex node. This was done in most cases, but seems to
have been neglected in one path: when the input tree has an OR of
the FrameIndex with an immediate. This can happen if the FrameIndex
can be proven to be sufficiently aligned that an OR of that immediate
is equivalent to an ADD.

The missing handling of FrameIndex in that case caused the SelectionDAG
instruction selection to miss opportunities to merge the OR back into
the FrameIndex node, leading to superfluous addi/ori instructions in
the final assembler output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213482 91177308-0d34-0410-b5e6-96231b3b80d8

Namespace cleanup (no functional change)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213478 91177308-0d34-0410-b5e6-96231b3b80d8

SIISelLowering.cpp: Define _USE_MATH_DEFINES to let M_PI provided on MS <cmath>.

FIXME: Would it be better to move it into configure?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213477 91177308-0d34-0410-b5e6-96231b3b80d8

MachineRegionInfo.cpp: Another fix on MachineRegionInfo::MachineRegionInfo::recalculate() to appease msc17.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213476 91177308-0d34-0410-b5e6-96231b3b80d8

Remove braces around single-statement block and rangify outer loop.

This is a follow-up to r213474.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213475 91177308-0d34-0410-b5e6-96231b3b80d8

[C++11] Add predecessors(BasicBlock *) / successors(BasicBlock *) iterator ranges.

Summary: This patch introduces two new iterator ranges and updates existing code to use it. No functional change intended.

Test Plan: All tests (make check-all) still pass.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4481

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213474 91177308-0d34-0410-b5e6-96231b3b80d8

R600: Add missing test for concat_vectors

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213473 91177308-0d34-0410-b5e6-96231b3b80d8

R600: Remove unused function

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213472 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Remove dead code and add missing tests.

This probably was killed by some generic DAGCombiner
improvements in checking the TargetBooleanContents instead
of just 1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213471 91177308-0d34-0410-b5e6-96231b3b80d8

Fix msc17 build. RegionInfo::RegionInfo::recalculate() doesn't make sense.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213466 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Asserts build introduced since r213456.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213465 91177308-0d34-0410-b5e6-96231b3b80d8

Sure up ownership passing of the PBQPBuilder by passing unique_ptrs by value rather than lvalue reference.

Also removes an unnecessary '.release()' that should've been a std::move
anyway. (I'm on a hunt for '.release()' calls)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213464 91177308-0d34-0410-b5e6-96231b3b80d8

MC: permit emitting a symbol value as section relative

This adds an optional parameter to the EmitSymbolValue method in MCStreamer to
permit emitting a symbol value as a section relative value. This is to cover
the use in MCDwarf which should not really know about how to emit a section
relative value for a given target.

This addresses post-review comments from Eric Christopher in SVN r213275.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213463 91177308-0d34-0410-b5e6-96231b3b80d8

Revert accidentally committed r213459

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213461 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build with GCC.

Seems like a bug in either GCC or clang, but I'm
not sure which is right.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213460 91177308-0d34-0410-b5e6-96231b3b80d8

XXX - Increase unroll threshold

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213459 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: implement range reduction for sin/cos

These instructions can only take a limited input range, and return
the constant value 1 out of range. We should do range reduction to
be able to process arbitrary values. Use a FRACT instruction after
normalization to achieve this. Also add a test for constant folding
with the lowered code with unsafe-fp-math enabled.

v2: use DAG lowering instead of intrinsic, adapt test
v3: calculate constant, fold pattern into instruction definition
v4: misc style fixes, add sin-fold testcase, cosmetics

Patch by Grigori Goronzy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213458 91177308-0d34-0410-b5e6-96231b3b80d8

Templatify RegionInfo so it works on MachineBasicBlocks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213456 91177308-0d34-0410-b5e6-96231b3b80d8

R600: Implement a few simple TTI queries.

I'm not sure if these have any effect right now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213455 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] Use CreateAligned(Load|Store)

IRBuilder has CreateAligned(Load|Store) functions; use them and we don't need
to make a second call to setAlignment.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213453 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] Propagate known metadata to vectorized instructions

There are some kinds of metadata that are safe to propagate from the scalar
instructions to the vector instructions (fpmath and tbaa currently).

Regarding TBAA, one might worry about propagating it on if-converted loads and
stores, because the metadata might have had a control dependency on the
condition, and thus actually aliased with some other non-speculated memory
access when the condition was false. However, this would be caught by the
runtime overlap checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213452 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Fix wrong shuffle mask in test 'combine-vec-shuffle-3.ll'. No functional change.

Function @test3c should check that the DAGCombiner is able to fold a pair of
shuffles into a new shuffle with a permute mask of <6,7,2,3>. However, one of
the shuffles in @test3c had a wrong permute mask; this prevented the DAGCombiner
from folding the shuffles into the expected result.
Now that the shuffle mask is fixed, the backend correctly folds the two shuffles
in function @test3c into a single movhlps instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213451 91177308-0d34-0410-b5e6-96231b3b80d8

Handle AddrSpaceCast in stripAndAccumulateInBoundsConstantOffsets

All of the other similar functions in that part of the file look through
addrspacecast in addition to bitcast, and I see no reason why
stripAndAccumulateInBoundsConstantOffsets shouldn't do so also.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213449 91177308-0d34-0410-b5e6-96231b3b80d8

MergedLoadStoreMotion.cpp: Fix msc17 build. Member initializer is unavailable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213448 91177308-0d34-0410-b5e6-96231b3b80d8

Make Value::isDereferenceablePointer handle offsets to pointer types with dereferenceable attributes

When we have a parameter (or call site return) with a dereferenceable
attribute, it can specify the size of an array pointed to by that parameter. If
we have a value for which we can accumulate a constant offset to such a
parameter, then we can use that offset in a direct comparison with the size
specified by the dereferenceable attribute.

This enables us to handle cases like this:

  int foo(int a[static 3]) {
    return a[2]; /* this is always dereferenceable */
  }

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213447 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: correct WoA __builtin_alloca handling on O0

When performing a dynamic stack adjustment without optimisations, we would mark
SP as def and R4 as kill. This occurred as part of the expansion of a
WIN__CHKSTK SDNode which indicated the proper handling of SP and R4. The result
would be that we would double define SP as part of an operation, which is
obviously incorrect.

Furthermore, the VTList for the chain had an incorrect parameter type of i32
instead of Other.

Correct these to permit proper lowering of __builtin_alloca at -O0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213442 91177308-0d34-0410-b5e6-96231b3b80d8

Remove uses of the redundant ".reset(nullptr)" of unique_ptr, in favor of ".reset()"

It's also possible to just write "= nullptr", but there's some question
of whether that's as readable, so I leave it up to authors to pick which
they prefer for now. If we want to discuss standardizing on one or the
other, we can do that at some point in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213438 91177308-0d34-0410-b5e6-96231b3b80d8

[MCJIT] Add a 'decodeAddend' method to RuntimeDyldMachO and teach
getBasicRelocationEntry to use this rather than 'memcpy' to get the
relocation addend. Targets with non-trivial addend encodings (E.g. AArch64) can
override decodeAddend to handle immediates with interesting encodings.

No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213435 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.""""

After a successful build it seems to have come back on a later build.

This reverts commit r213391.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213432 91177308-0d34-0410-b5e6-96231b3b80d8

Fundamentally change the MipsSubtarget replacement machinery:

a) Move the replacement level decision to the target machine.
b) Create additional subtargets at the TargetMachine level to
   cache and make replacement easy.
c) Make the mips16 features obvious.
d) Remove the override logic as it no longer does anything.
e) Have MipsModuleDAGToDAGISel take only the target machine.
f) Have the constant islands pass grab the current subtarget
   from the MachineFunction (via the TargetMachine) instead
   of caching it.
g) Unconditionally initialize TLOF.
h) Remove the old complicated subtarget based resetting and
   replace it with simple conditionals.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213430 91177308-0d34-0410-b5e6-96231b3b80d8

FrameLowering depends only upon the Subtarget, so only take a subtarget
during initialization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213429 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] 32-bit ELF PIC support

This adds initial support for PPC32 ELF PIC (Position Independent Code; the
-fPIC variety), thus rectifying a long-standing deficiency in the PowerPC
backend.

Patch by Justin Hibbits!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213427 91177308-0d34-0410-b5e6-96231b3b80d8

In preparation for replacing the whole subtarget on the target machine,
have target lowering take the subtarget explicitly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213426 91177308-0d34-0410-b5e6-96231b3b80d8

Make InstrInfo depend only upon the Subtarget getting passed in
rather than the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213425 91177308-0d34-0410-b5e6-96231b3b80d8

The subtarget in MipsTargetLowering isn't going to change and
so doesn't need to be a pointer, but a reference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213422 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid caching the relocation model on the subtarget, this is for
two reasons:

a) we're already caching the target machine which contains it,
b) which relocation model you get is dependent upon whether or
not you ask before MCCodeGenInfo is constructed on the target
machine, so avoid any latent issues there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213420 91177308-0d34-0410-b5e6-96231b3b80d8

Remove commented out code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213419 91177308-0d34-0410-b5e6-96231b3b80d8

Clean up some style and formatting issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213418 91177308-0d34-0410-b5e6-96231b3b80d8

DebugInfo: Assert that all abstract scopes are subprograms, rather than conditionalizing.

There's nothing else these should ever be...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213417 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build breakage introduced with r213412.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213414 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unroll pragma metadata after it is used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213412 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a couple of formatting and style issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213409 91177308-0d34-0410-b5e6-96231b3b80d8

[MCJIT] [AArch64] Make sure to propegate ARM64_RELOC_ADDEND values into the
RelocationEntry.

No test case yet, as this primarily hits GOT entries, which RuntimeDyldChecker
can't examine yet. I'm actively working on features that will enable us to
test this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213408 91177308-0d34-0410-b5e6-96231b3b80d8

Make non-module passes unconditionally added in the pass
manager for mips, and early exit if we don't want to do
anything because of the current subtarget.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213407 91177308-0d34-0410-b5e6-96231b3b80d8

Add tests for atomic adds on floats.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213406 91177308-0d34-0410-b5e6-96231b3b80d8

Rename DiagnosticInfoOptimizationWarning to DiagnosticInfoOptimizationFailure
so the severity of the message is not part of the type name.

Reviewed by Alp Toker

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213399 91177308-0d34-0410-b5e6-96231b3b80d8

Use CHECK-LABEL where appropriate in this test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213398 91177308-0d34-0410-b5e6-96231b3b80d8

Add loop unrolling metadata descriptions to docs/LangRef.rst.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213397 91177308-0d34-0410-b5e6-96231b3b80d8

MergedLoadStoreMotion pass

Merges equivalent loads on both sides of a hammock/diamond
and hoists into into the header.
Merges equivalent stores on both sides of a hammock/diamond
and sinks it to the footer.
Can enable if conversion and tolerate better load misses
and store operand latencies.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213396 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."""

Recommits 212776 which was reverted in r212793. This has been committed
and recommitted a few times as I try to test it harder and find/fix more
issues. The most recent revert was due to an asan bot failure which I
can't seem to reproduce locally, though I believe I'm following all the
steps the buildbot does.

So I'm going to recommit this in the hopes of investigating the failure
on the buildbot itself... apologies in advance for the bot noise. If
anyone sees failures with this /please/ provide me with any
reproductions, etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213391 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build failure on windows

Add explicit constructor to struct instead of using brace initialization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213389 91177308-0d34-0410-b5e6-96231b3b80d8

MC: support different sized constants in constant pools

On AArch64 the pseudo instruction ldr <reg>, =... supports both
32-bit and 64-bit constants. Add support for 64 bit constants for
the pools to support the pseudo instruction fully.

Changes the AArch64 ldr-pseudo tests to use 32-bit registers and
adds tests with 64-bit registers.

Patch by Janne Grunau!

Differential Revision: http://reviews.llvm.org/D4279

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213387 91177308-0d34-0410-b5e6-96231b3b80d8

Add a dereferenceable attribute

This attribute indicates that the parameter or return pointer is
dereferenceable. Practically speaking, loads from such a pointer within the
associated byte range are safe to speculatively execute. Such pointer
parameters are common in source languages (C++ references, for example).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213385 91177308-0d34-0410-b5e6-96231b3b80d8

Add MIPS Technologies to the vendors in llvm::Triple.

This is a prerequisite for checking for 'mti' and 'img' in a consistent way in
clang. Previously 'img' could use Triple::getVendor() but 'mti' could only use
Triple::getVendorName().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213381 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64: implement efficient f16 bitcasts

Because i16 is illegal, there's no native DAG method to
represent a bitcast to or from an f16 type. This meant LLVM was
inserting a stack store/load pair which is really not ideal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213378 91177308-0d34-0410-b5e6-96231b3b80d8

NVPTX: support fpext/fptrunc to and from f16.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213377 91177308-0d34-0410-b5e6-96231b3b80d8

R600: support fpext/fptrunc operations to and from f16.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213376 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64: support f16 extend/trunc operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213375 91177308-0d34-0410-b5e6-96231b3b80d8

X86: support fpext/fptrunc operations to and from 16-bit floats.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213374 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: support legalisation of "fptrunc ... to half" operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213373 91177308-0d34-0410-b5e6-96231b3b80d8

CodeGen: soften f16 type by default instead of marking legal.

Actual support for softening f16 operations is still limited, and can be added
when it's needed. But Soften is much closer to being a useful thing to try
than keeping it Legal when no registers can actually hold such values.

Longer term, we probably want something between Soften and Promote semantics
for most targets, it'll be more efficient to promote the 4 basic operations to
f32 than libcall them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213372 91177308-0d34-0410-b5e6-96231b3b80d8

Suppress 'not handled in switch' warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213371 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add earlyclobber constraint to pre/post-indexed ARM STR instructions.

The post-indexed instructions were missing the constraint, causing unpredictable STR instructions to be emitted.

The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, as the instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which have the constraint defined, however it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well, since at some point someone might create instances of them programmatically and then the constraint is definitely needed.

This fixes PR20323.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213369 91177308-0d34-0410-b5e6-96231b3b80d8

Refactor ARM subarchitecture parsing

Re-commit of a patch to rework the triple parsing on ARM to a more sane
model.

Patch by Gabor Ballabas.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213367 91177308-0d34-0410-b5e6-96231b3b80d8

extracting swapStruct into include/llvm/Support/MachO.h (no functional change)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213361 91177308-0d34-0410-b5e6-96231b3b80d8

R600: rename misleading fp16 test.

This test is actually going in the opposite direction to what the
filename and function name suggested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213358 91177308-0d34-0410-b5e6-96231b3b80d8

R600: support f16 -> f64 conversion intrinsic.

Unfortunately, we don't seem to have a direct truncation, but the
extension can be legally split into two operations so we should
support that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213357 91177308-0d34-0410-b5e6-96231b3b80d8

NVPTX: support direct f16 <-> f64 conversions via intrinsics.

Clang may well start emitting these soon, and while it may not be
directly relevant for OpenCL or GLSL, the instructions were just
sitting there waiting to be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213356 91177308-0d34-0410-b5e6-96231b3b80d8

Rename AlignAttribute to IntAttribute

Currently the only kind of integer IR attributes that we have are alignment
attributes, and so the attribute kind that takes an integer parameter is called
AlignAttr, but that will change (we'll soon be adding a dereferenceable
attribute that also takes an integer value). Accordingly, rename AlignAttribute
to IntAttribute (class names, enums, etc.).

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213352 91177308-0d34-0410-b5e6-96231b3b80d8

R600: Implement TTI:getPopcntSupport

The test is just copied from X86, and I don't know of a better
way to test it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213351 91177308-0d34-0410-b5e6-96231b3b80d8

X86: Constant fold converting vector setcc results to float.

Since the result of a SETCC for X86 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the SSE code is generated as:
LCPI0_0:
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  cvtdq2ps  %xmm0, %xmm0
  retq

After, the code is improved to:
LCPI0_0:
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  retq

The cvtdq2ps has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via the ModRM operand of andps.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213342 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64: Constant fold converting vector setcc results to float.

Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the code is generated as:
  fcmeq.4s  v0, v0, v1
  movi.4s v1, #0x1        // Integer splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  scvtf.4s  v0, v0        // Convert each lane to f32.
  ret

After, the code is improved to:
  fcmeq.4s  v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  ret

The svvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.

Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes
appropriate.

rdar://17693791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213341 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[x86] Fold extract_vector_elt of a load into the Load's address computation."

There's a bug where this can create cycles in the DAG. It will take a bit
to fix, so I'm backing it out for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213339 91177308-0d34-0410-b5e6-96231b3b80d8

Reset the Subtarget in the AsmPrinter for each machine function
and add explanatory comment about dual initialization. Fix
use of the Subtarget to grab the information off of the target machine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213336 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid resetting the UseSoftFloat and FloatABIType on the TargetMachine
Options struct and move the comment to inMips16HardFloat. Use the
fact that we now know whether or not we cared about soft float to
set the libcalls.
Accordingly rename mipsSEUsesSoftFloat to abiUsesSoftFloat and
propagate since it's no longer CPU specific.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213335 91177308-0d34-0410-b5e6-96231b3b80d8

[MCJIT] Fix the alignment requirements for ARM and AArch64 which were mistakenly
relaxed in the big RuntimeDyldMachO cleanup of r213293.

No test case yet - this was found via inspection and there's no easy way to test
GOT alignment in RuntimeDyldChecker at the moment. I'm working on adding support
for this now, and hope to have a test case for this soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213331 91177308-0d34-0410-b5e6-96231b3b80d8

Tweak formating to match what clang-format would be for llvm-nm.cpp .
No functional change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213330 91177308-0d34-0410-b5e6-96231b3b80d8

Add printing of Mach-O stabs in llvm-nm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213327 91177308-0d34-0410-b5e6-96231b3b80d8

Remove rules against std::function from the programmer's manual

Clarify that llvm::function_ref is like StringRef for callables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213326 91177308-0d34-0410-b5e6-96231b3b80d8

ms inline asm: Don't add x86 segment registers to the clobber list.

Clang tries to check the clobber list but doesn't list segment registers in its
x86 register list. This fixes PR20343.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213303 91177308-0d34-0410-b5e6-96231b3b80d8

Make myself code owner of MCJIT.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213302 91177308-0d34-0410-b5e6-96231b3b80d8

Drop the udis86 wrapper from llvm::sys

This optional dependency on the udis86 library was added some time back to aid
JIT development, but doesn't make much sense to link into LLVM binaries these
days.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213300 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Add 'static' to a large array to avoid a huge stack allocation

Speculative fix for a -Wframe-larger-than warning from gcc. Clang will
implicitly promote such constant arrays to globals, so in theory it
won't hit this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213298 91177308-0d34-0410-b5e6-96231b3b80d8