Tim Northover [Fri, 18 Jul 2014 13:01:31 +0000 (13:01 +0000)]
AArch64: support f16 extend/trunc operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213375
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Fri, 18 Jul 2014 13:01:25 +0000 (13:01 +0000)]
X86: support fpext/fptrunc operations to and from 16-bit floats.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213374
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Fri, 18 Jul 2014 13:01:19 +0000 (13:01 +0000)]
ARM: support legalisation of "fptrunc ... to half" operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213373
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Fri, 18 Jul 2014 12:41:46 +0000 (12:41 +0000)]
CodeGen: soften f16 type by default instead of marking legal.
Actual support for softening f16 operations is still limited, and can be added
when it's needed. But Soften is much closer to being a useful thing to try
than keeping it Legal when no registers can actually hold such values.
Longer term, we probably want something between Soften and Promote semantics
for most targets: it'll be more efficient to promote the 4 basic operations to
f32 than to libcall them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213372
91177308-0d34-0410-b5e6-
96231b3b80d8
Renato Golin [Fri, 18 Jul 2014 12:13:04 +0000 (12:13 +0000)]
Suppress 'not handled in switch' warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213371
91177308-0d34-0410-b5e6-
96231b3b80d8
Tilmann Scheller [Fri, 18 Jul 2014 12:05:49 +0000 (12:05 +0000)]
[ARM] Add earlyclobber constraint to pre/post-indexed ARM STR instructions.
The post-indexed instructions were missing the constraint, causing unpredictable STR instructions to be emitted.
The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, because instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which already have the constraint defined. However, it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well: at some point someone might create instances of them programmatically, and then the constraint is definitely needed.
This fixes PR20323.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213369
91177308-0d34-0410-b5e6-
96231b3b80d8
Renato Golin [Fri, 18 Jul 2014 12:00:48 +0000 (12:00 +0000)]
Refactor ARM subarchitecture parsing
Re-commit of a patch to rework the triple parsing on ARM to a more sane
model.
Patch by Gabor Ballabas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213367
91177308-0d34-0410-b5e6-
96231b3b80d8
Artyom Skrobov [Fri, 18 Jul 2014 09:26:16 +0000 (09:26 +0000)]
extracting swapStruct into include/llvm/Support/MachO.h (no functional change)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213361
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Fri, 18 Jul 2014 08:43:30 +0000 (08:43 +0000)]
R600: rename misleading fp16 test.
This test is actually going in the opposite direction to what the
filename and function name suggested.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213358
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Fri, 18 Jul 2014 08:43:24 +0000 (08:43 +0000)]
R600: support f16 -> f64 conversion intrinsic.
Unfortunately, we don't seem to have a direct truncation, but the
extension can be legally split into two operations so we should
support that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213357
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Fri, 18 Jul 2014 08:30:10 +0000 (08:30 +0000)]
NVPTX: support direct f16 <-> f64 conversions via intrinsics.
Clang may well start emitting these soon, and while it may not be
directly relevant for OpenCL or GLSL, the instructions were just
sitting there waiting to be used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213356
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Fri, 18 Jul 2014 06:51:55 +0000 (06:51 +0000)]
Rename AlignAttribute to IntAttribute
Currently the only integer IR attributes that we have are alignment
attributes, and so the attribute kind that takes an integer parameter is called
AlignAttr, but that will change (we'll soon be adding a dereferenceable
attribute that also takes an integer value). Accordingly, rename AlignAttribute
to IntAttribute (class names, enums, etc.).
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213352
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 18 Jul 2014 06:07:13 +0000 (06:07 +0000)]
R600: Implement TTI:getPopcntSupport
The test is just copied from X86, and I don't know of a better
way to test it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213351
91177308-0d34-0410-b5e6-
96231b3b80d8
Jim Grosbach [Fri, 18 Jul 2014 00:40:56 +0000 (00:40 +0000)]
X86: Constant fold converting vector setcc results to float.
Since the result of a SETCC for X86 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
UNARYOP(AND(VECTOR_CMP(x,y), constant))
--> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).
This implements the transform where UNARYOP is [su]int_to_fp.
For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}
Before this change, the SSE code is generated as:
LCPI0_0:
  .long 1 ## 0x1
  .long 1 ## 0x1
  .long 1 ## 0x1
  .long 1 ## 0x1
  .section __TEXT,__text,regular,pure_instructions
  .globl _foo
  .align 4, 0x90
_foo: ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  cvtdq2ps %xmm0, %xmm0
  retq
After, the code is improved to:
LCPI0_0:
  .long 1065353216 ## float 1.000000e+00
  .long 1065353216 ## float 1.000000e+00
  .long 1065353216 ## float 1.000000e+00
  .long 1065353216 ## float 1.000000e+00
  .section __TEXT,__text,regular,pure_instructions
  .globl _foo
  .align 4, 0x90
_foo: ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  retq
The cvtdq2ps has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via the ModRM operand of andps.
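As a rough illustration of why the fold is valid, here is a Python model of the two forms. It is not the LLVM implementation, and the helper names (f32_bits, bits_f32, before, after) are made up for this sketch. Each compare lane is either 0 or all-ones, so masking with the bit pattern of 1.0f gives the same lanes as masking with integer 1 and converting afterwards:

```python
import struct

def f32_bits(x: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

ALL_ONES = 0xFFFFFFFF  # a "true" lane of a vector compare (-1 as i32)

def before(mask_lanes, constant=1):
    # UNARYOP(AND(VECTOR_CMP(x,y), constant)) with UNARYOP = sitofp
    return [float(m & constant) for m in mask_lanes]

def after(mask_lanes, constant=1):
    # AND(VECTOR_CMP(x,y), constant2) where constant2 = bits of sitofp(constant)
    constant2 = f32_bits(float(constant))
    return [bits_f32(m & constant2) for m in mask_lanes]

lanes = [ALL_ONES, 0, 0, ALL_ONES]  # hypothetical compare result for four lanes
print(f32_bits(1.0))                # the .long value in the constant pool
print(before(lanes) == after(lanes))
```

Note that the model only works because sitofp(0) has the all-zero bit pattern, so a false lane still yields 0.0 after the AND.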
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213342
91177308-0d34-0410-b5e6-
96231b3b80d8
Jim Grosbach [Fri, 18 Jul 2014 00:40:52 +0000 (00:40 +0000)]
AArch64: Constant fold converting vector setcc results to float.
Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
UNARYOP(AND(VECTOR_CMP(x,y), constant))
--> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).
This implements the transform where UNARYOP is [su]int_to_fp.
For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}
Before this change, the code is generated as:
  fcmeq.4s v0, v0, v1
  movi.4s v1, #0x1 // Integer splat value.
  and.16b v0, v0, v1 // Mask lanes based on the comparison.
  scvtf.4s v0, v0 // Convert each lane to f32.
  ret
After, the code is improved to:
  fcmeq.4s v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1 // Mask lanes based on the comparison.
  ret
The scvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.
Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes
appropriate.
rdar://17693791
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213341
91177308-0d34-0410-b5e6-
96231b3b80d8
Michael J. Spencer [Fri, 18 Jul 2014 00:15:50 +0000 (00:15 +0000)]
Revert "[x86] Fold extract_vector_elt of a load into the Load's address computation."
There's a bug where this can create cycles in the DAG. It will take a bit
to fix, so I'm backing it out for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213339
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Christopher [Fri, 18 Jul 2014 00:08:53 +0000 (00:08 +0000)]
Reset the Subtarget in the AsmPrinter for each machine function
and add explanatory comment about dual initialization. Fix
use of the Subtarget to grab the information off of the target machine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213336
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Christopher [Fri, 18 Jul 2014 00:08:50 +0000 (00:08 +0000)]
Avoid resetting the UseSoftFloat and FloatABIType on the TargetMachine
Options struct and move the comment to inMips16HardFloat. Use the
fact that we now know whether or not we cared about soft float to
set the libcalls.
Accordingly rename mipsSEUsesSoftFloat to abiUsesSoftFloat and
propagate since it's no longer CPU specific.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213335
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Thu, 17 Jul 2014 23:11:30 +0000 (23:11 +0000)]
[MCJIT] Fix the alignment requirements for ARM and AArch64 which were mistakenly
relaxed in the big RuntimeDyldMachO cleanup of r213293.
No test case yet - this was found via inspection and there's no easy way to test
GOT alignment in RuntimeDyldChecker at the moment. I'm working on adding support
for this now, and hope to have a test case for this soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213331
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Enderby [Thu, 17 Jul 2014 22:56:27 +0000 (22:56 +0000)]
Tweak formatting to match what clang-format would produce for llvm-nm.cpp.
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213330
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Enderby [Thu, 17 Jul 2014 22:47:16 +0000 (22:47 +0000)]
Add printing of Mach-O stabs in llvm-nm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213327
91177308-0d34-0410-b5e6-
96231b3b80d8
Reid Kleckner [Thu, 17 Jul 2014 22:43:00 +0000 (22:43 +0000)]
Remove rules against std::function from the programmer's manual
Clarify that llvm::function_ref is like StringRef for callables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213326
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Thu, 17 Jul 2014 20:24:55 +0000 (20:24 +0000)]
ms inline asm: Don't add x86 segment registers to the clobber list.
Clang tries to check the clobber list but doesn't list segment registers in its
x86 register list. This fixes PR20343.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213303
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Thu, 17 Jul 2014 20:23:31 +0000 (20:23 +0000)]
Make myself code owner of MCJIT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213302
91177308-0d34-0410-b5e6-
96231b3b80d8
Alp Toker [Thu, 17 Jul 2014 20:05:29 +0000 (20:05 +0000)]
Drop the udis86 wrapper from llvm::sys
This optional dependency on the udis86 library was added some time back to aid
JIT development, but doesn't make much sense to link into LLVM binaries these
days.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213300
91177308-0d34-0410-b5e6-
96231b3b80d8
Reid Kleckner [Thu, 17 Jul 2014 19:43:40 +0000 (19:43 +0000)]
TableGen: Add 'static' to a large array to avoid a huge stack allocation
Speculative fix for a -Wframe-larger-than warning from gcc. Clang will
implicitly promote such constant arrays to globals, so in theory it
won't hit this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213298
91177308-0d34-0410-b5e6-
96231b3b80d8
Arnaud A. de Grandmaison [Thu, 17 Jul 2014 19:08:14 +0000 (19:08 +0000)]
[AArch64] Cleanup AsmParser: no need to use dyn_cast + assert. cast does it for us.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213296
91177308-0d34-0410-b5e6-
96231b3b80d8
Suyog Sarda [Thu, 17 Jul 2014 19:07:00 +0000 (19:07 +0000)]
Rectify r213231. Use proper version of 'ComputeNumSignBits'.
Earlier, when the code was in InstCombine, we were calling the version of ComputeNumSignBits in InstCombine.h
that automatically added the DataLayout* before calling into ValueTracking.
Now that the code has moved to InstSimplify, we call into ValueTracking directly without passing in the DataLayout*.
This patch fixes that by passing the DataLayout to ComputeNumSignBits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213295
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Thu, 17 Jul 2014 18:54:50 +0000 (18:54 +0000)]
[MCJIT] Significantly refactor the RuntimeDyldMachO class.
The previous implementation of RuntimeDyldMachO mixed logic for all targets
within a single class, creating problems for readability, maintainability, and
performance. To address these issues, this patch strips the RuntimeDyldMachO
class down to just target-independent functionality, and moves all
target-specific functionality into target-specific subclasses of RuntimeDyldMachO.
The new class hierarchy is as follows:
class RuntimeDyldMachO
  Implemented in RuntimeDyldMachO.{h,cpp}
  Contains logic that is completely independent of the target. This consists
  mostly of MachO helper utilities which the derived classes use to get their
  work done.
template <typename Impl>
class RuntimeDyldMachOCRTPBase<Impl> : public RuntimeDyldMachO
  Implemented in RuntimeDyldMachO.h
  Contains generic MachO algorithms/data structures that defer to the Impl class
  for target-specific behaviors.
RuntimeDyldMachOARM : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOARM>
RuntimeDyldMachOARM64 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOARM64>
RuntimeDyldMachOI386 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOI386>
RuntimeDyldMachOX86_64 : public RuntimeDyldMachOCRTPBase<RuntimeDyldMachOX86_64>
  Implemented in their respective *.h files in lib/ExecutionEngine/RuntimeDyld/MachOTargets
  Each of these contains the relocation logic specific to its target architecture.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213293
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Thu, 17 Jul 2014 18:48:12 +0000 (18:48 +0000)]
[ASan] Don't instrument load/stores with !nosanitize metadata.
This is used to avoid instrumentation of instructions added by UBSan
in Clang frontend (see r213291). This fixes PR20085.
Reviewed in http://reviews.llvm.org/D4544.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213292
91177308-0d34-0410-b5e6-
96231b3b80d8
Hans Wennborg [Thu, 17 Jul 2014 18:33:44 +0000 (18:33 +0000)]
Typo: exists -> exits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213290
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Holewinski [Thu, 17 Jul 2014 18:10:09 +0000 (18:10 +0000)]
[NVPTX] Improve handling of FP fusion
We now consider the FPOpFusion flag when determining whether
to fuse ops. We also explicitly emit add.rn when fusion is
disabled to prevent ptxas from fusing the operations on its
own.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213287
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 17 Jul 2014 17:50:22 +0000 (17:50 +0000)]
Fix typos
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213285
91177308-0d34-0410-b5e6-
96231b3b80d8
Zinovy Nis [Thu, 17 Jul 2014 17:14:35 +0000 (17:14 +0000)]
[BUG] Due to a typo introduced in r199933 and r200027, two tests for FMA were never even started.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213283
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Thu, 17 Jul 2014 17:04:56 +0000 (17:04 +0000)]
[X86] AVX512: Add disassembler support for compressed displacement
There are two parts here. First is to modify tablegen to adjust the encoding
type ENCODING_RM with the scaling factor.
The second is to use the new encoding types to compute the correct
displacement in the decoder.
Fixes <rdar://problem/17608489>
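A minimal sketch of the decoder-side arithmetic, assuming the usual disp8*N scheme in which the stored byte is sign-extended and then multiplied by the scaling factor N. The function names and the choice of N = 64 in the usage lines are illustrative and not taken from the patch:

```python
def decode_disp8(disp8_byte: int, scale: int) -> int:
    """Sign-extend the encoded byte and multiply by the scaling factor N."""
    signed = disp8_byte - 256 if disp8_byte >= 128 else disp8_byte
    return signed * scale

def encode_disp8(displacement: int, scale: int):
    """Return the disp8 byte if the displacement fits the compressed form, else None."""
    if displacement % scale != 0:
        return None  # not a multiple of N, so the compressed form cannot express it
    quotient = displacement // scale
    if not -128 <= quotient <= 127:
        return None  # scaled value does not fit in a signed byte
    return quotient & 0xFF

# Hypothetical usage with a scale of 64 bytes.
print(decode_disp8(0x01, 64))
print(encode_disp8(128, 64))
```

A displacement that fails either check would have to fall back to a full-width displacement, which this sketch does not model.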
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213281
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Thu, 17 Jul 2014 17:04:52 +0000 (17:04 +0000)]
[X86] AVX512: Rename EVEX_CD8V to CD8_Form
This is to match the naming of CD8_EltSize, CD8_Scale, etc.
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213280
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Thu, 17 Jul 2014 17:04:50 +0000 (17:04 +0000)]
[X86] AVX512: Use the TD version of CD8_Scale in the assembler
Passes the computed scaling factor in TSFlags rather than the old attributes.
Also removes the C++ version of computing the scaling factor (MemObjSize)
along with the asserts added by the previous patch.
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213279
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Thu, 17 Jul 2014 17:04:34 +0000 (17:04 +0000)]
[X86] AVX512: Move compressed displacement logic to TD
This does not actually move the logic yet but reimplements it in the Tablegen
language. Then asserts that the new implementation results in the same value.
The next patch will remove the assert and the temporary use of the TSFlags and
remove the C++ implementation.
The formula requires a limited form of the logical left and right shift operators.
I implemented these with the bit-extract/insert operator (i.e. blah{bits}).
No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213278
91177308-0d34-0410-b5e6-
96231b3b80d8
Adam Nemet [Thu, 17 Jul 2014 17:04:27 +0000 (17:04 +0000)]
[TableGen] Allow shift operators to take bits<n>
Convert the operand to int if possible, i.e. if the value is properly
initialized. (I suppose there is further room for improvement here to also
perform the shift if the uninitialized bits are shifted out.)
With this little change we can now compute the scaling factor for compressed
displacement with pure tablegen code in the X86 backend. This is useful
because both the X86-disassembler-specific part of tablegen and the assembler
need this and TD is the natural sharing place.
The patch also adds the missing documentation for the shift and add operators.
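A small model of the conversion rule described above, assuming a bits<n> value is represented as a list with index 0 as the least significant bit and None marking an uninitialized bit. This representation and the names bits_to_int and shl are hypothetical and do not reflect TableGen's internals:

```python
from typing import List, Optional

# Index 0 is the least significant bit; None marks an uninitialized bit.
Bits = List[Optional[int]]

def bits_to_int(bits: Bits) -> Optional[int]:
    """Convert to int only when every bit is initialized."""
    if any(b is None for b in bits):
        return None
    return sum(b << i for i, b in enumerate(bits))

def shl(bits: Bits, amount: int) -> Optional[int]:
    """Shift left if the operand converts to int; otherwise leave it unresolved."""
    value = bits_to_int(bits)
    return None if value is None else value << amount

print(shl([1, 0, 1], 2))     # fully initialized operand
print(shl([1, None, 0], 1))  # operand with an uninitialized bit
```

The sketch gives up on any uninitialized bit, which mirrors the conservative behaviour described in the message rather than the possible improvement mentioned in the parenthetical.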
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213277
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Holewinski [Thu, 17 Jul 2014 16:58:56 +0000 (16:58 +0000)]
[NVPTX] Add missing .v4 qualifier on vector store instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213276
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Thu, 17 Jul 2014 16:27:44 +0000 (16:27 +0000)]
MC: correct DWARF header for PE/COFF assembly input
The header contains an offset to the DWARF abbreviations for the CU. The offset
must be section relative for COFF and absolute for others. The non-assembly
code path for the DWARF header generation already had the correct emission for
the headers. This corrects just the assembly path. Previously, the invalid
relocation halted processing of the debug information on the first assembly
input: the associated abbreviations appeared out of range because their
location was increased by the image base and the section offset.
This addresses PR20332.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213275
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Thu, 17 Jul 2014 16:27:40 +0000 (16:27 +0000)]
MC: fix MCAsmInfo usage for windows-itanium
Windows itanium uses the GNUCOFF assembly format, not ELF.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213274
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Thu, 17 Jul 2014 16:27:35 +0000 (16:27 +0000)]
MC: collapse emission of producer
Rather than use three EmitBytes, concatenate the string at compile time,
constructing a single StringRef and emitting the data in one shot. This also
creates nicer assembly output. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213273
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Holewinski [Thu, 17 Jul 2014 14:51:33 +0000 (14:51 +0000)]
[NVPTX] Flag surface/texture query instructions with IsTexSurfQuery
Also, add some tests to make sure we can handle surface/texture
queries on both Fermi and Kepler+.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213268
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Holewinski [Thu, 17 Jul 2014 11:59:04 +0000 (11:59 +0000)]
[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch
This also uses TSFlags to mark machine instructions that are surface/texture
accesses, as well as the vector width for surface operations. This is used
to simplify some of the switch statements that need to detect surface/texture
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213256
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Thu, 17 Jul 2014 11:27:04 +0000 (11:27 +0000)]
ARM: support direct f16 <-> f64 conversions
ARMv8 has instructions to handle it, otherwise a libcall is needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213254
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Holewinski [Thu, 17 Jul 2014 11:23:29 +0000 (11:23 +0000)]
[TABLEGEN] Do not crash on intrinsics with names longer than 40 characters
Differential Revision: http://reviews.llvm.org/D4537
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213253
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Thu, 17 Jul 2014 11:12:12 +0000 (11:12 +0000)]
CodeGen: generate single libcall for fptrunc -> f16 operations.
Previously we asserted on this code. Currently compiler-rt doesn't
actually implement any of these new libcalls, but external help is
pretty much the only viable option for LLVM.
I've followed the much more generic "__truncST2" naming, as opposed to
the odd name for f32 -> f16 truncation. This can obviously be changed
later, or overridden by any targets that need to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213252
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Thu, 17 Jul 2014 11:04:04 +0000 (11:04 +0000)]
X86: support double extension of f16 type.
x86 has no native ability to extend an f16 to f64, but the same result
is obtained if we expand it into two separate extensions: f16 -> f32
-> f64.
Unfortunately the same is not true for truncate, so that still results
in a compilation failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213251
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Thu, 17 Jul 2014 10:51:23 +0000 (10:51 +0000)]
CodeGen: extend f16 conversions to permit types > float.
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.
During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.
Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.
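To see why the truncate cannot be split the way the extend can, here is one input where rounding f64 -> f32 -> f16 differs from a single f64 -> f16 rounding. This is a sketch using Python's struct module, assumed here to round to nearest-even; the value d is picked by hand for this example and the helper names are made up:

```python
import struct

def to_f32(x: float) -> float:
    """Round a double to single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x: float) -> float:
    """Round a double to half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# d sits just above the midpoint between the adjacent halves 1.0 and 1 + 2**-10.
d = 1.0 + 2.0**-11 + 2.0**-30

direct = to_f16(d)            # one rounding: the small excess pushes it upwards
two_step = to_f16(to_f32(d))  # f32 drops the excess, leaving an exact tie

print(direct, two_step)
```

In the two-step path the f32 rounding discards the 2**-30 term, so the second rounding sees an exact tie and resolves it to the even neighbour 1.0, while the direct rounding correctly goes up to 1 + 2**-10.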
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213248
91177308-0d34-0410-b5e6-
96231b3b80d8
Yi Kong [Thu, 17 Jul 2014 10:50:20 +0000 (10:50 +0000)]
Port memory barriers intrinsics to AArch64
Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to
implement their corresponding ACLE and MSVC intrinsics.
This patch ports ARM dmb, dsb, isb intrinsic to AArch64.
Differential Revision: http://reviews.llvm.org/D4520
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213247
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Thu, 17 Jul 2014 10:10:04 +0000 (10:10 +0000)]
[mips] .reginfo is 8 byte aligned on N32.
Differential Revision: http://reviews.llvm.org/D4540
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213246
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Thu, 17 Jul 2014 10:02:08 +0000 (10:02 +0000)]
[mips] Correct ELF e_flags for the N32 ABI when using a mips-* triple rather than a mips64-* triple
Summary:
Generally speaking, mips-* vs mips64-* should not be used to make decisions
about the content or format of the ELF. This should be based on the ABI
and CPU in use. For example, `mips-linux-gnu-clang -mips64r2 -mabi=64`
should produce an ELF64 as should `mips64-linux-gnu-clang -mabi=64`.
Conversely, `mips64-linux-gnu-clang -mabi=n32` should produce an ELF32 as
should `mips-linux-gnu-clang -mips64r2 -mabi=n32`.
This patch fixes the e_flags but leaves the ELF32 vs ELF64 issue for now
since there is no apparent way to base this decision on the ABI and CPU.
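The rule argued for above can be written down as a tiny lookup. This is only a sketch of the intended decision, not what the patch implements (the patch leaves ELF32 vs ELF64 untouched). The "64" and "n32" entries come from the examples in this message, while the "o32" entry and the function name elf_class are assumptions added for illustration:

```python
# ABI -> ELF class. "64" and "n32" follow the examples in the commit message;
# "o32" is an assumption added to make the table complete.
ELF_CLASS_BY_ABI = {"64": "ELF64", "n32": "ELF32", "o32": "ELF32"}

def elf_class(triple: str, abi: str) -> str:
    """Pick the ELF class from the ABI alone; the triple is deliberately ignored."""
    return ELF_CLASS_BY_ABI[abi]

# Both spellings of the triple give the same answer for a given ABI.
print(elf_class("mips-linux-gnu", "64"), elf_class("mips64-linux-gnu", "64"))
print(elf_class("mips-linux-gnu", "n32"), elf_class("mips64-linux-gnu", "n32"))
```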
Differential Revision: http://reviews.llvm.org/D4539
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213244
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Thu, 17 Jul 2014 09:57:23 +0000 (09:57 +0000)]
[mips] Correct .MIPS.abiflags for -mfpxx on MIPS32r6
Summary:
The cpr1_size field describes the minimum register width to run the program
rather than the size of the registers on the target. MIPS32r6 was acting
as if -mfp64 has been given because it starts off with 64-bit FPU registers.
Differential Revision: http://reviews.llvm.org/D4538
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213243
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Thu, 17 Jul 2014 09:52:56 +0000 (09:52 +0000)]
[mips] Fix ELF e_flags related to -mabicalls and -mplt.
Summary:
These options are not implemented yet but we act as if they are always
given.
The integrated assembler is driven by the clang driver so the e_flag test
cases should match the e_flags emitted by GCC+GAS rather than GAS
by itself.
Differential Revision: http://reviews.llvm.org/D4536
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213242
91177308-0d34-0410-b5e6-
96231b3b80d8
Yi Kong [Thu, 17 Jul 2014 09:43:27 +0000 (09:43 +0000)]
Fix the prefix for arm64 triple
Triple.cpp still returns "arm64" as the prefix for the arm64 triple, which prevents
Clang from selecting the correct GCCBuiltin IR.
This patch changes the value to the correct prefix, "aarch64". A regression test will
be added in the coming patch.
Differential Revision: http://reviews.llvm.org/D4516
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213240
91177308-0d34-0410-b5e6-
96231b3b80d8
Evgeniy Stepanov [Thu, 17 Jul 2014 09:10:37 +0000 (09:10 +0000)]
[msan] Avoid redundant origin stores.
Origin is meaningless for fully initialized values. Avoid
storing origin for function arguments that are known to
be always initialized (i.e. shadow is a compile-time null
constant).
This is not about correctness, but purely an optimization.
Seems to affect compilation time of blacklisted functions
significantly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213239
91177308-0d34-0410-b5e6-
96231b3b80d8
Suyog Sarda [Thu, 17 Jul 2014 06:28:15 +0000 (06:28 +0000)]
Move ashr optimization from InstCombineShift to InstSimplify.
Refactor code, no functionality change, test case moved from instcombine to instsimplify.
Differential Revision: http://reviews.llvm.org/D4102
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213231
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 17 Jul 2014 06:19:06 +0000 (06:19 +0000)]
Use range for
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213230
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Thu, 17 Jul 2014 06:13:41 +0000 (06:13 +0000)]
R600: Short circuit alloca check if address space isn't private.
Skip calling GetUnderlyingObject in cases where it obviously
isn't from an alloca. This should only be a compile time improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213229
91177308-0d34-0410-b5e6-
96231b3b80d8
Suyog Sarda [Thu, 17 Jul 2014 06:09:34 +0000 (06:09 +0000)]
Fix Typo (first commit to test commit access)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213228
91177308-0d34-0410-b5e6-
96231b3b80d8
Eric Fiselier [Thu, 17 Jul 2014 05:53:00 +0000 (05:53 +0000)]
[lit] Add --show-unsupported flag to LIT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213227
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Thu, 17 Jul 2014 03:08:50 +0000 (03:08 +0000)]
MC: make WinEH opcode an opaque value
This makes the opcode an opaque value (unsigned int) rather than the
enumeration. This permits the use of target specific operands.
Split out the generic type into a MCWinEH header and add a supporting
MCWin64EH::Instruction to abstract out the selection of the opcode and
construction of the actual instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213221
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Thu, 17 Jul 2014 01:28:25 +0000 (01:28 +0000)]
Improve BasicAA CS-CS queries (redux)
This reverts, "r213024 - Revert r212572 "improve BasicAA CS-CS queries", it
causes PR20303." with a fix for the bug in pr20303. As it turned out, the
relevant code was both wrong and over-conservative (because, as with the code
it replaced, it would return the overall ModRef mask even if just Ref had been
implied by the argument aliasing results). Hopefully, this correctly fixes both
problems.
Thanks to Nick Lewycky for reducing the test case for pr20303 (which I've
cleaned up a little and added in DSE's test directory). The BasicAA test has
also been updated to check for this error.
Original commit message:
BasicAA contains knowledge of certain intrinsics, such as memcpy and memset,
and uses that information to form more-accurate answers to CallSite vs. Loc
ModRef queries. Unfortunately, it did not use this information when answering
CallSite vs. CallSite queries.
Generically, when an intrinsic takes one or more pointers and the intrinsic is
marked only to read/write from its arguments, the offset/size is unknown. As a
result, the generic code that answers CallSite vs. CallSite (and CallSite vs.
Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's
arguments. While BasicAA's CallSite vs. Loc override could use more-accurate
size information for some intrinsics, it did not do the same for CallSite vs.
CallSite queries.
This change refactors the intrinsic-specific logic in BasicAA into a generic AA
query function: getArgLocation, which is overridden by BasicAA to supply the
intrinsic-specific knowledge, and used by AA's generic implementation. This
allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and
CallSite vs. CallSite queries, and simplifies the BasicAA implementation.
Currently, only one function, Mac's memset_pattern16, is handled by BasicAA
(all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's
getModRefBehavior override now also returns OnlyAccessesArgumentPointees for
this function (which is an improvement).
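The effect of UnknownSize on a CallSite vs. CallSite query can be illustrated with a toy overlap check. This is a deliberately simplified model, not BasicAA's algorithm: the Loc tuple, the rule that different base names never alias, and the function may_overlap are all assumptions of this sketch:

```python
from typing import NamedTuple, Optional

class Loc(NamedTuple):
    base: str            # name of the underlying object (toy model)
    offset: int          # byte offset from the base
    size: Optional[int]  # None stands in for UnknownSize

def may_overlap(a: Loc, b: Loc) -> bool:
    """Conservative overlap test; an unknown size extends to infinity."""
    if a.base != b.base:
        return False  # simplifying assumption: distinct bases never alias
    a_end = float("inf") if a.size is None else a.offset + a.size
    b_end = float("inf") if b.size is None else b.offset + b.size
    return a.offset < b_end and b.offset < a_end

# Two calls touching adjacent 16-byte halves of the same buffer.
print(may_overlap(Loc("buf", 0, 16), Loc("buf", 16, 16)))      # sizes known
print(may_overlap(Loc("buf", 0, None), Loc("buf", 16, None)))  # UnknownSize
```

With known sizes the two accesses are provably disjoint, whereas with UnknownSize the query has to assume they may overlap, which is the loss of precision the message describes.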
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213219
91177308-0d34-0410-b5e6-
96231b3b80d8
Jingyue Wu [Wed, 16 Jul 2014 23:25:00 +0000 (23:25 +0000)]
Partially revert r210444 due to performance regression
Summary:
Converting outermost zext(a) to sext(a) causes worse code when the
computation of zext(a) could be reused. For example, after converting
... = array[zext(a)]
... = array[zext(a) + 1]
to
... = array[sext(a)]
... = array[zext(a) + 1],
the program computes sext(a), which is actually unnecessary. I added one
test in split-gep-and-gvn.ll to illustrate this scenario.
Also, with r211281 and r211084, we annotate more "nuw" tags to
computation involving CUDA intrinsics such as threadIdx.x. These
annotations help with splitting GEP a lot, rendering the benefit we get
from this reverted optimization only marginal.
Test Plan: make check-all
Reviewers: eliben, meheff
Reviewed By: meheff
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D4542
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213209
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 16 Jul 2014 22:40:28 +0000 (22:40 +0000)]
Fixed formatting, removed bug reference, renamed testcase
Thanks to Duncan Exon Smith for reviewing and cleanup suggestions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213205
91177308-0d34-0410-b5e6-
96231b3b80d8
Juergen Ributzka [Wed, 16 Jul 2014 22:20:51 +0000 (22:20 +0000)]
[FastISel] Local values shouldn't be alive across an inline asm call with side effects.
This fixes an issue where a local value is defined before and used after an
inline asm call with side effects.
This fix simply flushes the local value map, which updates the insertion point
for the inline asm call to be above any previously defined local values.
This fixes <rdar://problem/17694203>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213203
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Wed, 16 Jul 2014 22:02:20 +0000 (22:02 +0000)]
[MCJIT] Improve a RuntimeDyldChecker diagnostic.
When a RuntimeDyldChecker test requests an invalid operand for an instruction,
print the decoded instruction to aid diagnosis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213202
91177308-0d34-0410-b5e6-
96231b3b80d8
Hal Finkel [Wed, 16 Jul 2014 21:22:46 +0000 (21:22 +0000)]
Fix a typo in the inalloca description
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213200
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 16 Jul 2014 21:08:10 +0000 (21:08 +0000)]
trivial fix for PR20314
Make sure that the AddrInst is an Instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213197
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Wed, 16 Jul 2014 20:18:49 +0000 (20:18 +0000)]
Remove Atom references in description.
Any CPU can run this pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213190
91177308-0d34-0410-b5e6-
96231b3b80d8
Manuel Jacob [Wed, 16 Jul 2014 20:13:45 +0000 (20:13 +0000)]
Utilize CastInst::CreatePointerBitCastOrAddrSpaceCast here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213189
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Wed, 16 Jul 2014 20:13:31 +0000 (20:13 +0000)]
[RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213188
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Holewinski [Wed, 16 Jul 2014 19:45:35 +0000 (19:45 +0000)]
[NVPTX] Honor alignment on vector loads/stores
We were not considering the stated alignment on vector loads/stores,
leading us to generate vector instructions even when we do not have
sufficient alignment.
Now, for IR like:
%1 = load <4 x float>, <4 x float>* %ptr, align 4
we will generate correct, conservative PTX like:
ld.f32 ... [%ptr]
ld.f32 ... [%ptr+4]
ld.f32 ... [%ptr+8]
ld.f32 ... [%ptr+12]
Or if we have an alignment of 8 (for example), we can
generate code like:
ld.v2.f32 ... [%ptr]
ld.v2.f32 ... [%ptr+8]
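The splitting rule described above can be sketched in standalone C++. This is
only an illustration, not the NVPTX backend's code: the helper name
splitVectorAccess, the 16-byte cap on a single access, and the restriction to
widths 4, 2 and 1 are assumptions made for the sketch.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical helper (not an LLVM API): split a vector access of NumElts
// elements, EltBytes bytes each, into pieces that the stated alignment
// permits. Returns the element count of each emitted piece.
std::vector<unsigned> splitVectorAccess(unsigned NumElts, unsigned EltBytes,
                                        unsigned AlignBytes) {
  assert(NumElts > 0 && EltBytes > 0 && AlignBytes > 0);
  unsigned Width = 1; // fall back to scalar accesses
  for (unsigned Candidate : {4u, 2u}) {
    unsigned PieceBytes = Candidate * EltBytes;
    // Assumption for this sketch: one access is at most 16 bytes wide and
    // needs alignment that is a multiple of its own size.
    if (Candidate <= NumElts && NumElts % Candidate == 0 &&
        PieceBytes <= 16 && AlignBytes % PieceBytes == 0) {
      Width = Candidate;
      break;
    }
  }
  return std::vector<unsigned>(NumElts / Width, Width);
}
```

With these assumptions, a <4 x float> access with align 4 splits into four
scalar pieces, align 8 gives two 2-element pieces, and align 16 gives a single
4-element piece.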
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213186
91177308-0d34-0410-b5e6-
96231b3b80d8
Alexey Samsonov [Wed, 16 Jul 2014 18:11:31 +0000 (18:11 +0000)]
CHECK-LABEL-ize one test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213177
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin Enderby [Wed, 16 Jul 2014 17:38:26 +0000 (17:38 +0000)]
Add the "-x" flag to llvm-nm for Mach-O files that prints the fields of a symbol in hex.
(generally used for debugging the tools). This is the same functionality as Darwin’s
nm(1) "-x" flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213176
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Wed, 16 Jul 2014 17:09:21 +0000 (17:09 +0000)]
Remove unnecessary/redundant std::move
(run returns unique_ptr by value already)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213174
91177308-0d34-0410-b5e6-
96231b3b80d8
Alp Toker [Wed, 16 Jul 2014 16:50:34 +0000 (16:50 +0000)]
Track clang r213171
The clang rewriter is now a core facility.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213173
91177308-0d34-0410-b5e6-
96231b3b80d8
Chris Bieneman [Wed, 16 Jul 2014 16:27:31 +0000 (16:27 +0000)]
Added documentation for SizeMultiplier in the ARM subtarget hook for register coalescing. Also fixed some 80 col violations.
No functional code changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213169
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Holewinski [Wed, 16 Jul 2014 16:26:58 +0000 (16:26 +0000)]
[NVPTX] Rename registers %fl -> %fd and %rl -> %rd
This matches the internal behavior of NVIDIA tools like libnvvm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213168
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Wed, 16 Jul 2014 15:37:24 +0000 (15:37 +0000)]
CodeGen: don't form illegal EXTLOAD operations.
It turns out that in most cases (the main exception being i1-related
types) once these operations are formed we cannot separate them and
the targets end up having to deal with them whether they want to or
not.
This is not a good situation, and a more reasonable default can be
formed by acknowledging this and having targets leave them as Legal.
Only x86 seems to be affected (other targets don't even try marking
the operation Expand).
Mostly there's no visible change here yet, but it will be useful to
have truly expanded EXTLOADS for MVT::f16 softening support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213162
91177308-0d34-0410-b5e6-
96231b3b80d8
Tim Northover [Wed, 16 Jul 2014 15:37:08 +0000 (15:37 +0000)]
Convert test to CHECK-LABEL
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213161
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Wed, 16 Jul 2014 15:34:07 +0000 (15:34 +0000)]
[mips][fp64a] Temporarily disable odd-numbered double-precision registers when using the FP64A ABI.
Summary:
A few instructions (mostly cvt.d.w and similar) are causing problems with
-mfp64 and -mno-odd-spreg and it looks like fixing it properly may
take several weeks. In the meantime, let's disable the odd-numbered
double-precision registers so that the generated code is at least valid.
The problem is that instructions like cvt.d.w read from the 32-bit low
subregister of a double-precision FPU register. This often leads the compiler
to insert moves to transfer a GPR32 to a FGR32 using mtc1. Such moves
violate the rules against 32-bit writes to odd-numbered FPU registers imposed
by -mno-odd-spreg. By disabling the odd-numbered double-precision registers, it
becomes impossible for the 32-bit low subregister to be odd-numbered.
This fixes numerous test-suite failures when compiling for the FP64A ABI
('-mfp64 -mno-odd-spreg'). There is no LLVM test case because it's difficult to
test that odd-numbered FPU registers are not allocatable. Instead, we depend on
the assembler (GAS and -fintegrated-as) raising errors when the rules are
violated.
Differential Revision: http://reviews.llvm.org/D4532
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213160
91177308-0d34-0410-b5e6-
96231b3b80d8
Andrea Di Biagio [Wed, 16 Jul 2014 11:29:39 +0000 (11:29 +0000)]
[X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'.
Before this change, method 'isShuffleMaskLegal' didn't know that shuffles
implementing a 'movhlps' operation were perfectly legal for SSE targets.
This patch adds the missing check for 'isMOVHLPSMask' inside method
'isShuffleMaskLegal' to fix the problem.
The reason why it is important to do this is because the DAGCombiner
conservatively avoids combining a pair of shuffles if the resulting shuffle
node has an illegal mask. Before this patch, shuffles with a MOVHLPS mask were
wrongly considered not to be legal. This was the root cause of some poor-code
generation bugs.
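For reference, a 'movhlps' shuffle of two 4-element vectors V1 and V2 produces
<V2[2], V2[3], V1[2], V1[3]>, i.e. the mask <6, 7, 2, 3> over the concatenation
of V1 and V2. The standalone sketch below illustrates such a mask check; the
names looksLikeMOVHLPS and applyShuffle are hypothetical, and using -1 to mark
an undefined lane is an assumed convention, not a quote of the X86 backend.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// An undefined lane (assumed here to be encoded as a negative index) is
// allowed to match any expected index.
static bool isUndefOrEqual(int Lane, int Expected) {
  return Lane < 0 || Lane == Expected;
}

// Sketch: true if the 4-lane mask is compatible with <6, 7, 2, 3>.
bool looksLikeMOVHLPS(const std::array<int, 4> &Mask) {
  return isUndefOrEqual(Mask[0], 6) && isUndefOrEqual(Mask[1], 7) &&
         isUndefOrEqual(Mask[2], 2) && isUndefOrEqual(Mask[3], 3);
}

// Reference shuffle: indices 0-3 select from V1, indices 4-7 from V2.
// Undefined lanes are left as zero.
std::array<float, 4> applyShuffle(const std::array<float, 4> &V1,
                                  const std::array<float, 4> &V2,
                                  const std::array<int, 4> &Mask) {
  std::array<float, 4> Result{};
  for (std::size_t I = 0; I < 4; ++I) {
    int M = Mask[I];
    assert(M < 8 && "mask index out of range");
    if (M < 0)
      continue;
    Result[I] = M < 4 ? V1[M] : V2[M - 4];
  }
  return Result;
}
```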
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213137
91177308-0d34-0410-b5e6-
96231b3b80d8
Justin Bogner [Wed, 16 Jul 2014 08:18:58 +0000 (08:18 +0000)]
unittests: Actually test reverse iterators in Path tests
This re-enables some #if 0'd code (since 2010) in the Path unittests
and makes at least a weak effort at testing sys::path's rbegin/rend.
This change was inspired by some test failures near uses of rbegin and
rend here:
http://lab.llvm.org:8011/builders/clang-x86_64-linux-vg/builds/3209
The "valgrind was whining" comment looked promising in terms of a
simpler-to-debug case of the same errors. However, it appears that the
valgrind complaints the comment was referring to are distinct from the
ones in the frontend, since this updated test isn't complaining for me
under valgrind.
In any case, the disabled tests weren't helping anybody.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213125
91177308-0d34-0410-b5e6-
96231b3b80d8
Reid Kleckner [Wed, 16 Jul 2014 01:34:27 +0000 (01:34 +0000)]
Roundtrip the inalloca bit on allocas through bitcode
This was an oversight in the original support. As it is, I stuffed this
bit into the alignment. The alignment is stored in log2 form, so it
doesn't need more than 5 bits, given that Value::MaximumAlignment is 1
<< 29.
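A minimal sketch of the packing idea, in standalone C++. The exact layout is an
assumption for illustration: the low 5 bits hold log2(alignment) + 1 (with 0
meaning "no alignment given") and bit 5 holds the inalloca flag. The names
packAllocaRecord and unpackAllocaRecord are hypothetical, not bitcode
reader/writer functions.

```cpp
#include <cassert>
#include <cstdint>

// Integer log2 for a non-zero power of two.
static unsigned log2u(uint32_t V) {
  unsigned L = 0;
  while (V >>= 1)
    ++L;
  return L;
}

// Pack alignment and the inalloca flag into one field (assumed layout).
uint64_t packAllocaRecord(uint32_t Align, bool InAlloca) {
  assert((Align & (Align - 1)) == 0 && "alignment must be 0 or a power of 2");
  assert(Align <= (1u << 29) && "above the maximum alignment");
  // log2 is at most 29, so log2 + 1 is at most 30 and fits in 5 bits.
  uint64_t AlignField = Align ? log2u(Align) + 1 : 0;
  return AlignField | (static_cast<uint64_t>(InAlloca) << 5);
}

// Inverse of packAllocaRecord.
void unpackAllocaRecord(uint64_t Record, uint32_t &Align, bool &InAlloca) {
  InAlloca = ((Record >> 5) & 1) != 0;
  uint64_t AlignField = Record & 31;
  Align = AlignField ? (1u << (AlignField - 1)) : 0;
}
```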
Reviewers: nicholas
Differential Revision: http://reviews.llvm.org/D3943
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213118
91177308-0d34-0410-b5e6-
96231b3b80d8
Manuel Jacob [Wed, 16 Jul 2014 01:34:21 +0000 (01:34 +0000)]
Fix comment in InstCombiner::visitAddrSpaceCast.
In the original version of the patch the behaviour was as described in
the comment. This behaviour was changed before committing it without
updating the comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213117
91177308-0d34-0410-b5e6-
96231b3b80d8
Hans Wennborg [Wed, 16 Jul 2014 00:52:11 +0000 (00:52 +0000)]
Perform wildcard expansion in Process::GetArgumentVector on Windows (PR17098)
On Windows, wildcard expansion isn't performed by the shell, but left to the
program itself. The common way to do this is to link with setargv.obj, which
performs the expansion on argc/argv before main is entered. However, we don't
use argv in Clang on Windows, but instead call GetCommandLineW so we can handle
unicode arguments. This means we have to do wildcard expansion ourselves.
A test case will be added on the Clang side.
Differential Revision: http://reviews.llvm.org/D4529
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213114
91177308-0d34-0410-b5e6-
96231b3b80d8
Tyler Nowicki [Wed, 16 Jul 2014 00:36:00 +0000 (00:36 +0000)]
Emit warnings if vectorization is forced and fails.
This patch modifies the existing DiagnosticInfo system to create a generic base
class that is inherited to produce diagnostic-based warnings. This is used by
the loop vectorizer to trigger a warning when vectorization is forced and
fails. Several tests have been added to verify this behavior.
Reviewed by: Arnold Schwaighofer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213110
91177308-0d34-0410-b5e6-
96231b3b80d8
Juergen Ributzka [Wed, 16 Jul 2014 00:01:22 +0000 (00:01 +0000)]
Remove TLI from isInTailCallPosition's arguments. NFC.
There is no need to pass on TLI separately to the function. As Eric pointed out
the Target Machine already provides everything we need.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213108
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 15 Jul 2014 23:50:10 +0000 (23:50 +0000)]
R600/SI: Allow using f32 rcp / rsq when denormals not handled.
These are precise enough to use for OpenCL unless denormals
are handled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213107
91177308-0d34-0410-b5e6-
96231b3b80d8
David Majnemer [Tue, 15 Jul 2014 23:01:10 +0000 (23:01 +0000)]
X86: Simplify X86WindowsTargetObjectFile::getSectionForConstant
There exists a helper function to abstract away the various differences
between ConstantVector, ConstantDataVector, ConstantAggregateZero, etc.
Use it to simplify X86WindowsTargetObjectFile::getSectionForConstant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213104
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Tue, 15 Jul 2014 22:39:58 +0000 (22:39 +0000)]
Move Post RA Scheduling flag bit into SchedMachineModel
Refactoring; no functional changes intended
Removed PostRAScheduler bits from subtargets (X86, ARM).
Added PostRAScheduler bit to MCSchedModel class.
This bit is set by a CPU's scheduling model (if it exists).
Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses.
Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!).
Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling.
Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values.
Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86:
a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget.
b. MIPS overrides the CPU's postRA settings by enabling postRA for everything.
c. PPC overrides the CPU's postRA settings by enabling postRA for everything.
d. X86 is the only target that actually has postRA specified via sched model info.
Differential Revision: http://reviews.llvm.org/D4217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213101
91177308-0d34-0410-b5e6-
96231b3b80d8
Peter Collingbourne [Tue, 15 Jul 2014 22:13:19 +0000 (22:13 +0000)]
[dfsan] Introduce further optimization to reduce the number of union queries.
Specifically, do not compute a union if it is statically known that one
shadow set subsumes the other.
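The idea can be sketched in standalone C++ as follows. This is a simplified
model, not the dfsan instrumentation code: it assumes each shadow value carries
the set of base labels it is known to contain, and the names Shadow and
ShadowCombiner are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

// Hypothetical model: a shadow is the set of base labels it contains.
struct Shadow {
  std::set<int> Elems;
};

struct ShadowCombiner {
  unsigned UnionCalls = 0; // how many "real" unions had to be computed

  static bool subsumes(const Shadow &A, const Shadow &B) {
    // std::set iterates in sorted order, as std::includes requires.
    return std::includes(A.Elems.begin(), A.Elems.end(), B.Elems.begin(),
                         B.Elems.end());
  }

  Shadow combine(const Shadow &A, const Shadow &B) {
    // If one side is known to contain the other, reuse it and skip the union.
    if (subsumes(A, B))
      return A;
    if (subsumes(B, A))
      return B;
    ++UnionCalls;
    Shadow R = A;
    R.Elems.insert(B.Elems.begin(), B.Elems.end());
    // B is not a subset of A here, so the result is strictly larger than A.
    assert(R.Elems.size() > A.Elems.size());
    return R;
  }
};
```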
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213100
91177308-0d34-0410-b5e6-
96231b3b80d8
Alp Toker [Tue, 15 Jul 2014 22:11:54 +0000 (22:11 +0000)]
CMake: avoid a reconfigure loop from r213091
Removing the native CMakeCache.txt causes the target to get re-run needlessly
on some systems. We'll want another solution for that part of the fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213099
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 15 Jul 2014 21:44:37 +0000 (21:44 +0000)]
R600/SI: Fix select on i1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213096
91177308-0d34-0410-b5e6-
96231b3b80d8
David Blaikie [Tue, 15 Jul 2014 21:06:37 +0000 (21:06 +0000)]
Try out FileCheck's new (in r212810) -implicit-check-not in a DebugInfo test.
Just tried this on a few tests and this was the only one that was
easily ported to use the new feature, so we'll go with that for now.
Hopefully can act as inspiration/reminder for other tests.
Not all debug info tests need to check for every DW_TAG or NULL child
terminator, but perhaps they should (just to ensure they don't accidentally
end up with tags nested inside other tags without the test failing, for example)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213092
91177308-0d34-0410-b5e6-
96231b3b80d8
Alp Toker [Tue, 15 Jul 2014 21:04:12 +0000 (21:04 +0000)]
CMake: fix cross-compilation with external source directories
This adds support for building native artifacts when cross-compiling using the
popular side-by-side source directory layout (no symlinks, no nested
repositories).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213091
91177308-0d34-0410-b5e6-
96231b3b80d8
Duncan P. N. Exon Smith [Tue, 15 Jul 2014 20:24:56 +0000 (20:24 +0000)]
ADT: Add MapVector::remove_if
Add a `MapVector::remove_if()` that erases items in bulk in linear time,
as opposed to quadratic time for repeated calls to `MapVector::erase()`.
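To illustrate why a bulk operation helps: erasing one element from a
vector-backed map shifts the tail and renumbers the index, so repeated erases
are quadratic, while a single compaction pass does the same work once. The
sketch below uses a toy container, TinyMapVector, which is a hypothetical
stand-in and not the llvm::MapVector implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy insertion-ordered map: a vector of pairs plus a key -> position index.
template <typename K, typename V> class TinyMapVector {
  std::unordered_map<K, std::size_t> Index;
  std::vector<std::pair<K, V>> Items;

public:
  // Returns false if the key was already present.
  bool insert(const K &Key, const V &Val) {
    if (Index.count(Key))
      return false;
    Index[Key] = Items.size();
    Items.emplace_back(Key, Val);
    return true;
  }

  // One pass: drop matching items, slide survivors down, fix their indices.
  template <typename Pred> void remove_if(Pred P) {
    std::size_t Out = 0;
    for (std::size_t In = 0; In != Items.size(); ++In) {
      if (P(Items[In])) {
        Index.erase(Items[In].first);
        continue;
      }
      if (In != Out) {
        Items[Out] = std::move(Items[In]);
        Index[Items[Out].first] = Out;
      }
      ++Out;
    }
    Items.erase(Items.begin() + Out, Items.end());
  }

  std::size_t size() const { return Items.size(); }

  const std::pair<K, V> &at(std::size_t I) const {
    assert(I < Items.size());
    return Items[I];
  }

  // Returns nullptr when the key is absent.
  const V *lookup(const K &Key) const {
    auto It = Index.find(Key);
    return It == Index.end() ? nullptr : &Items[It->second].second;
  }
};
```

Insertion order of the surviving elements is preserved, and lookups by key
remain valid after the pass because every moved element has its index updated.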
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213090
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Tue, 15 Jul 2014 20:18:31 +0000 (20:18 +0000)]
R600/SI: Implement less wrong f32 fdiv
Assuming single precision denormals and accurate sqrt/div are not
reported, this passes the OpenCL conformance test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213089
91177308-0d34-0410-b5e6-
96231b3b80d8