Chih-Hung Hsieh [Mon, 14 Dec 2015 22:08:36 +0000 (22:08 +0000)]
[X86] Part 2 to fix x86-64 fp128 calling convention.
Part 1 was submitted in http://reviews.llvm.org/D15134.
Changes in this part:
* X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class.
* X86CallingConv.td: Pass f128 values in XMM registers or on stack.
* X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td:
Add instruction selection patterns for f128.
* X86ISelLowering.cpp:
When target has MMX registers, configure MVT::f128 in FR128RegClass,
with TypeSoftenFloat action, and custom actions for some opcodes.
Add missed cases of MVT::f128 in places that handle f32, f64, or vector types.
Add TODO comment to support f128 type in inline assembly code.
* SelectionDAGBuilder.cpp:
Fix infinite loop when f128 type can have
VT == TLI.getTypeToTransformTo(Ctx, VT).
* Add unit tests for x86-64 fp128 type.
Differential Revision: http://reviews.llvm.org/D11438
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255558 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Mon, 14 Dec 2015 21:59:03 +0000 (21:59 +0000)]
add fast-math-flags to 'call' instructions (PR21290)
This patch adds optional fast-math-flags (the same that apply to fmul/fadd/fsub/fdiv/frem/fcmp)
to call instructions in IR. Follow-up patches would use these flags in LibCallSimplifier, add
support to clang, and extend FMF to the DAG for calls.
Motivating example:
%y = fmul fast float %x, %x
%z = tail call float @sqrtf(float %y)
We'd like to be able to optimize sqrt(x*x) into fabs(x). We do this today using a function-wide
attribute for unsafe-math, but we really want to trigger on the instructions themselves:
%z = tail call fast float @sqrtf(float %y)
because in an LTO build it's possible that calls with fast semantics have been inlined into a
function with non-fast semantics.
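To make the motivating fold concrete, here is a minimal C++ sketch (not code from this patch) comparing the two forms. The helper names are hypothetical. The fold is only valid under fast-math assumptions: in strict IEEE arithmetic the square of a very large input can overflow to infinity while the absolute value stays finite.

```cpp
#include <cassert>
#include <cmath>

// The unoptimized form: sqrt(x * x).
float sqrtOfSquare(float X) { return std::sqrt(X * X); }

// The form the optimizer would like to produce: fabs(x).
float absValue(float X) { return std::fabs(X); }

// Hypothetical helper: reports whether both forms agree for one input.
// They agree for moderate values; they can differ when X * X overflows
// or loses precision, which is why the fold needs the 'fast' flag.
bool foldIsExactFor(float X) {
  assert(!std::isnan(X) && "NaN compares unequal to itself");
  return sqrtOfSquare(X) == absValue(X);
}
```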
The code changes and tests are based on the recent commits that added "notail":
http://reviews.llvm.org/rL252368
and added FMF to fcmp:
http://reviews.llvm.org/rL241901
Differential Revision: http://reviews.llvm.org/D14707
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255555 91177308-0d34-0410-b5e6-96231b3b80d8
Ben Craig [Mon, 14 Dec 2015 21:57:05 +0000 (21:57 +0000)]
Reordering fields to reduce padding in LLVM. NFC
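As a generic illustration of the technique (the structs below are hypothetical and are not the LLVM classes touched by this commit), placing the widest members first usually lets the compiler pack the narrow ones together instead of padding each of them up to the wider alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Narrow members on both sides of a wide one: each char is typically
// followed by padding so that the struct honors uint64_t alignment.
struct Padded {
  char Flag;
  uint64_t Count;
  char Kind;
};

// Same members, widest first: the two chars can share one padded tail.
struct Reordered {
  uint64_t Count;
  char Flag;
  char Kind;
};

// Reordering never makes the struct larger on the ABIs we know of.
static_assert(sizeof(Reordered) <= sizeof(Padded),
              "reordering should not increase the size");

// Bytes saved per object; the exact value depends on the target ABI.
std::size_t bytesSaved() {
  assert(sizeof(Padded) >= sizeof(Reordered));
  return sizeof(Padded) - sizeof(Reordered);
}
```

On common 64-bit ABIs this is typically 24 bytes versus 16 bytes, but the exact sizes are implementation-defined.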
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255554 91177308-0d34-0410-b5e6-96231b3b80d8
Dan Gohman [Mon, 14 Dec 2015 21:53:54 +0000 (21:53 +0000)]
[WebAssembly] Add an assert to sanity-check dead flags.
The WebAssemblyStoreResults pass runs before LiveVariables, so it doesn't
expect to have to keep dead flags up to date; check this with an assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255551 91177308-0d34-0410-b5e6-96231b3b80d8
Pete Cooper [Mon, 14 Dec 2015 21:49:49 +0000 (21:49 +0000)]
Start implementing FDE dumping when printing the eh_frame.
This code adds some simple decoding of the FDEs in an eh_frame.
There's still more to be done in terms of error handling and verification.
Also, we need to be able to decode the CFIs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255550 91177308-0d34-0410-b5e6-96231b3b80d8
Pete Cooper [Mon, 14 Dec 2015 21:39:27 +0000 (21:39 +0000)]
Print the eh_frame section in MachoDump.
This is the start of work to dump the contents of the eh_frame section.
It currently emits CIE entries. FDE entries will come later.
It also needs improved error checking which will follow soon.
http://reviews.llvm.org/D15502
Reviewed by Kevin Enderby and Lang Hames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255546 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Mon, 14 Dec 2015 21:32:25 +0000 (21:32 +0000)]
[Hexagon] Add "const" to function parameters in HexagonInstrInfo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255544 91177308-0d34-0410-b5e6-96231b3b80d8
Diego Novillo [Mon, 14 Dec 2015 20:37:15 +0000 (20:37 +0000)]
Fix formatting. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255541 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Mon, 14 Dec 2015 20:35:13 +0000 (20:35 +0000)]
[Packetizer] Add AliasAnalysis as a parameter to the packetizer
This will make the dependence graph more accurate if an alias analysis
is provided. If nullptr is specified in its place, the behavior will
remain as it is currently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255540 91177308-0d34-0410-b5e6-96231b3b80d8
Pete Cooper [Mon, 14 Dec 2015 20:29:16 +0000 (20:29 +0000)]
Add missing vtable anchors.
The following description is from http://reviews.llvm.org/D15481:
ICmpInst, GetElementPtrInst and PHINode have no anchor functions. This causes the vtable and the type info (if RTTI is enabled in user code) to be emitted in multiple translation units.
Before 3.7, the destructors were the key functions for these nodes, but they have been removed.
There have been discussions about this here: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089010.html and here: http://lists.llvm.org/pipermail/llvm-dev/2015-December/092921.html.
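The anchor idiom described above can be sketched as follows. This is a simplified stand-in, not the actual ICmpInst, GetElementPtrInst, or PHINode code; the class name and helper are hypothetical. Under the Itanium C++ ABI, the first non-inline, non-pure virtual function is the key function, and the vtable is emitted only in the translation unit that defines it.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR node class.
class Node {
public:
  explicit Node(unsigned NumOps) : NumOps(NumOps) {}
  // Inline (defaulted in-class), so it cannot act as the key function.
  virtual ~Node() = default;
  // Also inline because it is defined in the class body.
  virtual unsigned getNumOperands() const { return NumOps; }

private:
  // Declared here, defined out of line exactly once: this pins the
  // vtable (and type info) to a single translation unit.
  virtual void anchor();
  unsigned NumOps;
};

// In a real project this definition lives in exactly one .cpp file.
void Node::anchor() {}

// Small helper so the sketch has observable behavior.
unsigned operandCount(const Node *N) {
  assert(N && "expected a non-null node");
  return N->getNumOperands();
}
```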
Patch by Visoiu Mistrih Francis
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255538 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Mon, 14 Dec 2015 20:12:24 +0000 (20:12 +0000)]
[Packetizer] Make endPacket virtual
This will allow custom handling of packet finalization. The current
definition of endPacket will still perform the default finalization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255537 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Mon, 14 Dec 2015 19:30:32 +0000 (19:30 +0000)]
[ConstantFold] Fix bitcast to gep constant folding transform.
Make sure to check that the destination type is sized.
A check was present but was incorrectly checking the source type
instead.
Patch by Amaury SECHET!
Differential Revision: http://reviews.llvm.org/D15264
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255536 91177308-0d34-0410-b5e6-96231b3b80d8
Yaron Keren [Mon, 14 Dec 2015 19:28:40 +0000 (19:28 +0000)]
Save several std::string constructions using llvm::Twine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255535 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Mon, 14 Dec 2015 19:22:37 +0000 (19:22 +0000)]
docs: Correct wording in LangRef relating to available_externally linkage.
Differential Revision: http://reviews.llvm.org/D15343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255534 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Mon, 14 Dec 2015 19:11:54 +0000 (19:11 +0000)]
Remove the successor probabilities normalization in tail duplication pass.
The normalization may cause assertion failures on SystemZ and some out-of-tree
tests. The root cause is that unknown probabilities are materialized into known
ones by calling getSuccProbability(), which is then used to add another
successor to the same MBB which results in mixed known and unknown
probabilities. But currently those mixed probabilities cannot be normalized.
I will compose another patch to fix the root issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255530 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjoy Das [Mon, 14 Dec 2015 19:11:45 +0000 (19:11 +0000)]
[MergeFunctions] Use II instead of CI for InvokeInst; NFC
Using `CI` is slightly misleading.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255529 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjoy Das [Mon, 14 Dec 2015 19:11:40 +0000 (19:11 +0000)]
Teach MergeFunctions about operand bundles
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255528 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjoy Das [Mon, 14 Dec 2015 19:11:35 +0000 (19:11 +0000)]
Teach haveSameSpecialState about operand bundles
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255527 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Mon, 14 Dec 2015 18:54:44 +0000 (18:54 +0000)]
Add "const" to function arguments in DFAPacketizer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255526 91177308-0d34-0410-b5e6-96231b3b80d8
Xinliang David Li [Mon, 14 Dec 2015 18:44:01 +0000 (18:44 +0000)]
[PGO] Value profiling text format reader/writer support
This patch adds the missing functionality in parsable
text format support for value profiling.
Differential Revision: http://reviews.llvm.org/D15212
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255523 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Mon, 14 Dec 2015 18:34:23 +0000 (18:34 +0000)]
[IR] Remove terminatepad
It turns out that terminatepad gives little benefit over a cleanuppad
which calls the termination function. This is not sufficient to
implement fully generic filters but MSVC doesn't support them which
makes terminatepad a little over-designed.
Depends on D15478.
Differential Revision: http://reviews.llvm.org/D15479
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255522 91177308-0d34-0410-b5e6-96231b3b80d8
Paul Robinson [Mon, 14 Dec 2015 18:33:18 +0000 (18:33 +0000)]
FastISel needs to remove dead code when it bails out.
When FastISel fails to translate an instruction it hands off code
generation to SelectionDAG. Before it does so, it may have generated
local value instructions to feed phi nodes in successor blocks. These
instructions will then be generated again by SelectionDAG, causing
duplication and less efficient code, including extra spill
instructions.
Patch by Wolfgang Pieb!
Differential Revision: http://reviews.llvm.org/D11768
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255520 91177308-0d34-0410-b5e6-96231b3b80d8
Petar Jovanovic [Mon, 14 Dec 2015 17:57:33 +0000 (17:57 +0000)]
[Power PC] llvm soft float support for ppc32
This is the second in a set of patches for soft float support for ppc32,
it enables soft float operations.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D13700
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255516 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 14 Dec 2015 17:25:38 +0000 (17:25 +0000)]
AMDGPU: Use generic bitreverse intrinsic
Also fix bug in vector legalization for bitreverse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255512 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Mon, 14 Dec 2015 17:24:23 +0000 (17:24 +0000)]
getParent() ^ 3 == getModule() ; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255511 91177308-0d34-0410-b5e6-96231b3b80d8
Geoff Berry [Mon, 14 Dec 2015 17:01:10 +0000 (17:01 +0000)]
Remove dead function AArch64TargetLowering::getFunctionAlignment. NFC.
Reviewers: t.p.northover, jmolloy, mcrosier
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D15458
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255509 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 14 Dec 2015 16:59:40 +0000 (16:59 +0000)]
AMDGPU: Fix splitting vector loads with existing offsets
If the original MMO had an offset, it was dropped.
Also use the correct alignment after adding the new offset.
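The alignment point can be illustrated with a small standalone sketch. It assumes the usual rule that an access at base plus offset is only guaranteed the largest power of two dividing both the base alignment and the offset. The function name is hypothetical, and this is not the AMDGPU code changed by the commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the alignment that can still be guaranteed after
// adding Offset to an address known to be aligned to BaseAlign.
uint64_t alignAfterOffset(uint64_t BaseAlign, uint64_t Offset) {
  assert(BaseAlign != 0 && (BaseAlign & (BaseAlign - 1)) == 0 &&
         "base alignment must be a power of two");
  // The lowest set bit of (BaseAlign | Offset) is the largest power of
  // two that divides both values.
  uint64_t Bits = BaseAlign | Offset;
  return Bits & (~Bits + 1);
}
```

For example, splitting a 16-byte-aligned vector load in half leaves the second half, at offset 8, with only 8-byte alignment.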
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255508 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Mon, 14 Dec 2015 16:16:54 +0000 (16:16 +0000)]
[InstCombine] fold trunc ([lshr] (bitcast vector) ) --> extractelement (PR25543)
This is a fix for PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543
The idea is to take the existing fold of:
bitcast ( trunc ( lshr ( bitcast X))) --> extractelement (bitcast X)
( http://reviews.llvm.org/rL112232 )
And break it into less specific transforms so we'll catch more cases such as
the example in the bug report:
bitcast ( trunc ( lshr ( bitcast X))) -->
bitcast ( extractelement (bitcast X)) -->
extractelement (bitcast X)
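The equivalence behind this fold can be checked outside the compiler with a small C++ sketch. It models a 64-bit integer viewed as a <2 x i32> vector; the function names are hypothetical, and which lane holds the high half depends on the host byte order, so the sketch detects it at run time.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Detects the host byte order at run time.
bool hostIsLittleEndian() {
  const uint16_t Probe = 1;
  unsigned char FirstByte;
  std::memcpy(&FirstByte, &Probe, 1);
  return FirstByte == 1;
}

// Models: trunc (lshr X, 32) to i32.
uint32_t viaShiftAndTrunc(uint64_t X) { return static_cast<uint32_t>(X >> 32); }

// Models: extractelement (bitcast X to <2 x i32>), Index.
uint32_t viaExtractElement(uint64_t X, unsigned Index) {
  assert(Index < 2 && "a <2 x i32> has two lanes");
  uint32_t Lanes[2];
  std::memcpy(Lanes, &X, sizeof(Lanes));
  return Lanes[Index];
}

// Lane that holds the upper 32 bits on this host.
unsigned highHalfLane() { return hostIsLittleEndian() ? 1u : 0u; }
```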
Enabling patches for this change:
http://reviews.llvm.org/rL255399 (combine bitcasts)
http://reviews.llvm.org/rL255433 (canonicalize extractelement(bitcast X))
Differential Revision: http://reviews.llvm.org/D15392
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255504 91177308-0d34-0410-b5e6-96231b3b80d8
Krzysztof Parzyszek [Mon, 14 Dec 2015 15:03:54 +0000 (15:03 +0000)]
[Hexagon] Subtarget features/default CPU corrections
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255501 91177308-0d34-0410-b5e6-96231b3b80d8
Chad Rosier [Mon, 14 Dec 2015 14:44:06 +0000 (14:44 +0000)]
[PPC] Early exit loop. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255497 91177308-0d34-0410-b5e6-96231b3b80d8
Adhemerval Zanella [Mon, 14 Dec 2015 14:14:15 +0000 (14:14 +0000)]
[sanitizer] [msan] VarArgHelper for AArch64
This patch adds support for variadic arguments on AArch64. All the MSAN
unit tests are not passing as well the signal_stress_test (currently
set as XFAIL for aarch64).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255495 91177308-0d34-0410-b5e6-96231b3b80d8
James Molloy [Mon, 14 Dec 2015 10:57:01 +0000 (10:57 +0000)]
Don't create unnecessary PHIs
In conditional store merging, we were creating PHIs when we didn't
need to. If the value to be predicated isn't defined in the block
we're predicating, then it doesn't need a PHI at all (because we only
deal with triangles and diamonds, any value not in the predicated BB
must dominate the predicated BB).
This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255489 91177308-0d34-0410-b5e6-96231b3b80d8
NAKAMURA Takumi [Mon, 14 Dec 2015 07:58:25 +0000 (07:58 +0000)]
Reformat to untabify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255483 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Mon, 14 Dec 2015 07:42:00 +0000 (07:42 +0000)]
[llvm-dwp] Deduplicate type units
It's O(N^2) because it does a simple walk through the existing types to
find duplicates, but that will be fixed in a follow-up commit to use a
mapping data structure of some kind.
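A mapping-based approach of the kind the message anticipates could look roughly like the sketch below. The TypeUnit struct and function name are hypothetical and much simpler than anything in llvm-dwp; the point is only that a hash set keyed on the type signature replaces the linear walk and brings the expected cost down to O(N).

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, minimal model of a type unit: only its 64-bit signature.
struct TypeUnit {
  uint64_t Signature;
};

// Keeps the first unit seen for each signature, preserving input order.
std::vector<TypeUnit> dedupBySignature(const std::vector<TypeUnit> &Units) {
  std::unordered_set<uint64_t> Seen;
  std::vector<TypeUnit> Unique;
  for (const TypeUnit &U : Units) {
    // insert() reports whether the signature was new; expected O(1).
    if (Seen.insert(U.Signature).second)
      Unique.push_back(U);
  }
  assert(Unique.size() == Seen.size());
  return Unique;
}
```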
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255482 91177308-0d34-0410-b5e6-96231b3b80d8
David Blaikie [Mon, 14 Dec 2015 07:41:56 +0000 (07:41 +0000)]
[llvm-dwp] Remove some unused test code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255481 91177308-0d34-0410-b5e6-96231b3b80d8
Akira Hatanaka [Mon, 14 Dec 2015 05:15:40 +0000 (05:15 +0000)]
[Docs] Fix underlines that were too short or too long.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255480 91177308-0d34-0410-b5e6-96231b3b80d8
Michael Zuckerman [Sun, 13 Dec 2015 21:12:33 +0000 (21:12 +0000)]
Added a triple flag for the x86-evenDirective test.
Continuation of rL255461
Differential Revision: http://reviews.llvm.org/D15413
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255469 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Sun, 13 Dec 2015 17:15:38 +0000 (17:15 +0000)]
Revert r255460, which still causes test failures on some platforms.
Further investigation on the failures is ongoing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255463 91177308-0d34-0410-b5e6-96231b3b80d8
Michael Zuckerman [Sun, 13 Dec 2015 17:07:23 +0000 (17:07 +0000)]
[X86][inline asm] support even directive
The .even directive aligns content to an even-numbered address.
In AT&T syntax: .even
In Microsoft syntax: even (without the dot).
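The effect of the directive on the current location can be modeled with a one-line rounding rule. This is a hedged sketch with a hypothetical function name, not the assembler's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Rounds an address up to the next even address (2-byte alignment).
// Already-even addresses are returned unchanged.
uint64_t alignToEven(uint64_t Address) {
  uint64_t Aligned = (Address + 1) & ~uint64_t(1);
  assert(Aligned % 2 == 0);
  return Aligned;
}
```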
Differential Revision: http://reviews.llvm.org/D15413
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255462 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Sun, 13 Dec 2015 17:00:25 +0000 (17:00 +0000)]
Fix a type issue in r255455. Should not use unsigned type as std::abs()'s template type.
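For context, std::abs is not meant to be used with unsigned operands, and subtracting two unsigned values wraps around instead of going negative. One common way to get an absolute difference of unsigned values is sketched below; the helper name is hypothetical and this is not necessarily how r255461 resolved the issue.

```cpp
#include <cassert>
#include <cstdint>

// Absolute difference of two unsigned values without going through a
// signed type: always subtract the smaller value from the larger one.
uint64_t absDiff(uint64_t A, uint64_t B) {
  uint64_t Diff = A > B ? A - B : B - A;
  // Sanity check: adding the difference to the smaller value gives the larger.
  assert((A > B ? B + Diff : A + Diff) == (A > B ? A : B));
  return Diff;
}
```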
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255461 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Sun, 13 Dec 2015 16:55:46 +0000 (16:55 +0000)]
[LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions.
(This is the second attempt to check in this patch: REQUIRES: asserts is added
to reg-usage.ll now.)
LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the
register usage for specific VFs. However, it takes into account many
instructions that won't be vectorized, such as induction variables,
GetElementPtr instruction, etc. This makes the loop vectorizer too conservative
when choosing VF. In this patch, the induction variables that won't be
vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set
so that their register usage won't be considered any more.
Differential revision: http://reviews.llvm.org/D15177
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255460 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sun, 13 Dec 2015 12:49:48 +0000 (12:49 +0000)]
Fix line endings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255459 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Sun, 13 Dec 2015 09:52:14 +0000 (09:52 +0000)]
Replace <cstdint> by llvm/Support/DataTypes.h for the typedef of uint64_t. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255458 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Sun, 13 Dec 2015 09:32:21 +0000 (09:32 +0000)]
Add the missing header file <cstdint> needed by uint64_t
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255457 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Sun, 13 Dec 2015 09:28:57 +0000 (09:28 +0000)]
Revert r255454 as it leads to several test failures on buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255456 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Sun, 13 Dec 2015 09:26:17 +0000 (09:26 +0000)]
Normalize MBB's successors' probabilities in several locations.
This patch adds some missing calls to MBB::normalizeSuccProbs() in several
locations where it should be called. Those places are found by checking if the
sum of successors' probabilities is approximately one in MachineBlockPlacement
pass with some instrumented code (not in this patch).
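The kind of check described above can be sketched in isolation. The code below uses plain doubles and hypothetical function names; LLVM's own BranchProbability is a fixed-point type and MBB::normalizeSuccProbs() works on that, so treat this only as a model of the idea.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Scales the probabilities so that they sum to one.
void normalizeProbs(std::vector<double> &Probs) {
  double Sum = std::accumulate(Probs.begin(), Probs.end(), 0.0);
  assert(Sum > 0.0 && "cannot normalize an all-zero distribution");
  for (double &P : Probs)
    P /= Sum;
}

// The instrumentation-style check: is the sum close enough to one?
bool sumsToApproximatelyOne(const std::vector<double> &Probs,
                            double Epsilon = 1e-9) {
  double Sum = std::accumulate(Probs.begin(), Probs.end(), 0.0);
  return std::fabs(Sum - 1.0) <= Epsilon;
}
```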
Differential revision: http://reviews.llvm.org/D15259
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255455 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Sun, 13 Dec 2015 08:44:08 +0000 (08:44 +0000)]
[LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions.
LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the
register usage for specific VFs. However, it takes into account many
instructions that won't be vectorized, such as induction variables,
GetElementPtr instruction, etc. This makes the loop vectorizer too conservative
when choosing VF. In this patch, the induction variables that won't be
vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set
so that their register usage won't be considered any more.
Differential revision: http://reviews.llvm.org/D15177
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255454 91177308-0d34-0410-b5e6-96231b3b80d8
Saleem Abdulrasool [Sun, 13 Dec 2015 05:27:45 +0000 (05:27 +0000)]
ARM: only emit EABI attributes on EABI targets
EABI attributes should only be emitted on EABI targets. This prevents the
emission of the optimization goals EABI attribute on Windows ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255448 91177308-0d34-0410-b5e6-96231b3b80d8
Nico Weber [Sun, 13 Dec 2015 04:14:39 +0000 (04:14 +0000)]
Revert r255444.
It doesn't build on Windows and broke the Windows LLD and LLDB bots:
http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/27693/steps/build_Lld/logs/stdio
http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc/builds/13468/steps/build/logs/stdio
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255446 91177308-0d34-0410-b5e6-96231b3b80d8
Mehdi Amini [Sat, 12 Dec 2015 22:55:25 +0000 (22:55 +0000)]
Add a C++11 ThreadPool implementation in LLVM
This is a very simple implementation of a thread pool using C++11
threads. It accepts any std::function<void()> for asynchronous
execution. Individual tasks can be synchronized using the returned
future, or the client can block on the full queue completion.
In case LLVM is configured with threading disabled, it falls back
to sequential execution using std::async with std::launch::deferred.
This is intended to support parallelism for ThinLTO processing in
linker plugin, but is generic enough for any other uses.
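The sequential fallback described above can be approximated with the standard library alone. The class below is a hypothetical sketch, not the ThreadPool added by this commit: tasks are wrapped with std::async and std::launch::deferred, so each one runs lazily on the thread that first waits on its future.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pool built with threading disabled.
class SequentialPool {
public:
  // Queues a task; nothing runs until someone waits on the future.
  std::shared_future<void> async(std::function<void()> Task) {
    assert(Task && "expected a callable task");
    std::shared_future<void> Result =
        std::async(std::launch::deferred, std::move(Task)).share();
    Pending.push_back(Result);
    return Result;
  }

  // Blocks until every queued task has run (here: runs them in order).
  void wait() {
    for (std::shared_future<void> &F : Pending)
      F.get();
    Pending.clear();
  }

private:
  std::vector<std::shared_future<void>> Pending;
};
```

A caller can either wait on an individual future or call wait() to drain everything that was queued, mirroring the two synchronization styles the message mentions.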
Differential Revision: http://reviews.llvm.org/D15464
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255444 91177308-0d34-0410-b5e6-96231b3b80d8
Davide Italiano [Sat, 12 Dec 2015 21:50:11 +0000 (21:50 +0000)]
[llvm-objdump/MachoDump] Simplify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255443 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sat, 12 Dec 2015 21:46:23 +0000 (21:46 +0000)]
[X86][AVX512] Added support for VMOVQ shuffle comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255442 91177308-0d34-0410-b5e6-96231b3b80d8
Manuel Jacob [Sat, 12 Dec 2015 21:33:31 +0000 (21:33 +0000)]
Partially fix memcpy / memset / memmove lowering in SelectionDAG construction if address space != 0.
Summary:
Previously SelectionDAGBuilder asserted that the pointer operands of
memcpy / memset / memmove intrinsics are in address space < 256. This assert
implicitly assumed the X86 backend, where all address spaces < 256 are
equivalent to address space 0 from the code generator's point of view. On some
targets (R600 and NVPTX) several address spaces < 256 have a target-defined
meaning, so this assert made little sense for these targets.
This patch removes this wrong assertion and adds extra checks before lowering
these intrinsics to library calls. If a pointer operand can't be cast to
address space 0 without changing semantics, a fatal error is reported to the
user.
The new behavior should be valid for all targets that give address spaces != 0
a target-specified meaning (NVPTX, R600, X86). NVPTX lowers big or
variable-sized memory intrinsics before SelectionDAG construction. All other
memory intrinsics are inlined (the threshold is set very high for this target).
R600 doesn't support memcpy / memset / memmove library calls (previously the
illegal emission of a call to such library function triggered an error
somewhere in the code generator). X86 now emits inline loads and stores for
address spaces 256 and 257 up to the same threshold that is used for address
space 0 and reports a fatal error otherwise.
I call this a "partial fix" because there are still cases that can't be
lowered. A fatal error is reported in these cases.
Reviewers: arsenm, theraven, compnerd, hfinkel
Subscribers: hfinkel, llvm-commits, alex
Differential Revision: http://reviews.llvm.org/D7241
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255441 91177308-0d34-0410-b5e6-96231b3b80d8
Xinliang David Li [Sat, 12 Dec 2015 17:28:03 +0000 (17:28 +0000)]
[PGO] Stop using invalid char in instr variable names.
Before the patch, -fprofile-instr-generate compile will fail
if no integrated-as is specified when the file contains
any static functions (the -S output is also invalid).
This is the second try. The fix in this patch is very localized.
Only the names of profile symbols with internal linkage are
fixed up, while the initializers of name symbols are not
changed. This means there is no format change or version bump.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255434 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Sat, 12 Dec 2015 16:44:48 +0000 (16:44 +0000)]
[InstCombine] canonicalize (bitcast (extractelement X)) --> (extractelement(bitcast X))
This change was discussed in D15392. It allows us to remove the fold that was added
in:
http://reviews.llvm.org/r255261
...and it will allow us to generalize this fold:
http://reviews.llvm.org/rL112232
while preserving the order of bitcast + extract that it produces and testing shows
is better handled by the backend.
Note that the existing check for "isVectorTy()" wasn't strong enough in general
and specifically because: x86_mmx. It's not a vector, but it's not vectorizable
either. So here we check VectorType::isValidElementType() directly before
proceeding with the transform.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255433 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sat, 12 Dec 2015 12:52:52 +0000 (12:52 +0000)]
[X86][AVX] Tests tidyup
Cleanup/regenerate some tests for some upcoming patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255432 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Sat, 12 Dec 2015 06:56:02 +0000 (06:56 +0000)]
Try to appease sphinx
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255429 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Sat, 12 Dec 2015 06:21:08 +0000 (06:21 +0000)]
Move catchpad-phi-cast.ll to the X86 specific subdirectory
It is X86 specific and will not be properly exercised unless LLVM is
built with the X86 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255426 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Sat, 12 Dec 2015 05:53:20 +0000 (05:53 +0000)]
Try to appease a buildbot
The builder complains thusly:
error C2027: use of undefined type 'llvm::raw_ostream'
Try to make it happy by including raw_ostream.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255425 91177308-0d34-0410-b5e6-96231b3b80d8
David Majnemer [Sat, 12 Dec 2015 05:38:55 +0000 (05:38 +0000)]
[IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
but they are difficult to explain to others, even to seasoned LLVM
experts.
- catchendpad and cleanupendpad are optimization barriers. They cannot
be split and force all potentially throwing call-sites to be invokes.
This has a noticeable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
It is unsplittable, starts a funclet, and has control flow to other
funclets.
- The nesting relationship between funclets is currently a property of
control flow edges. Because of this, we are forced to carefully
analyze the flow graph to see if there might potentially exist illegal
nesting among funclets. While we have logic to clone funclets when
they are illegally nested, it would be nicer if we had a
representation which forbade them upfront.
Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
flow, just a bunch of simple operands; catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad. Their presence can be inferred
implicitly using coloring information.
N.B. The state numbering code for the CLR has been updated but the
veracity of its output cannot be spoken for. An expert should take a
look to make sure the results are reasonable.
Reviewers: rnk, JosephTremoulet, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D15139
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255422 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Sat, 12 Dec 2015 01:47:08 +0000 (01:47 +0000)]
[PowerPC] OutStreamer cleanup in PPCAsmPrinter
We don't need to pass OutStreamer as a parameter to LowerSTACKMAP and
LowerPATCHPOINT. It is a member variable of PPCAsmPrinter, and thus, is already
available. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255418 91177308-0d34-0410-b5e6-96231b3b80d8
Chen Li [Sat, 12 Dec 2015 01:04:15 +0000 (01:04 +0000)]
[X86ISelLowering] Add additional support for multiplication-to-shift conversion.
Summary: This patch adds support for the conversions (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication cannot be converted to LEA + SHL or LEA + LEA. LLVM already supports this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this.
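The two rewrites rest on simple arithmetic identities, which the following C++ sketch checks on 32-bit unsigned values (where wraparound makes both sides agree modulo 2^32). The function names are hypothetical; this is not the X86ISelLowering code.

```cpp
#include <cassert>
#include <cstdint>

// x * (2^N + 1) computed as (x << N) + x.
uint32_t mulByPow2Plus1(uint32_t X, unsigned N) {
  assert(N < 32 && "shift amount must be smaller than the bit width");
  return (X << N) + X;
}

// x * (2^N - 1) computed as (x << N) - x.
uint32_t mulByPow2Minus1(uint32_t X, unsigned N) {
  assert(N < 32 && "shift amount must be smaller than the bit width");
  return (X << N) - X;
}
```

For instance, multiplying by 9 becomes a shift by 3 plus an add, and multiplying by 7 becomes a shift by 3 minus the original value.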
Reviewers: craig.topper, RKSimon
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D14603
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255415 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Sat, 12 Dec 2015 00:42:05 +0000 (00:42 +0000)]
Fix test/CodeGen/PowerPC/ppc-shrink-wrapping.ll after r255398
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255414 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Sat, 12 Dec 2015 00:33:36 +0000 (00:33 +0000)]
[InstCombine] allow any pair of bitcasts to be combined
This change is discussed in D15392 and should allow us to effectively
revert:
http://llvm.org/viewvc/llvm-project?view=revision&revision=255261
if we canonicalize bitcasts ahead of extracts.
It should be safe to convert any pair of bitcasts into a single bitcast,
however, it was mentioned here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110829/127089.html
that we're not allowed to bitcast from an x86_mmx to some other types, but I'm
not seeing any failures from that, and we have regression tests in CodeGen/X86
that appear to cover all of those cases.
Some day we'll get to remove that MMX wart from LLVM IR completely?
Differential Revision: http://reviews.llvm.org/D15468
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255399 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Sat, 12 Dec 2015 00:32:00 +0000 (00:32 +0000)]
[PowerPC] Add Branch Hints for Highly-Biased Branches
This patch adds hints for highly biased branches on the PPC architecture. Even
in absence of profiling information, LLVM will mark code reaching unreachable
terminators and other exceptional control flow constructs as highly unlikely to
be reached.
Patch by Tom Jablin!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255398 91177308-0d34-0410-b5e6-96231b3b80d8
Derek Schuff [Sat, 12 Dec 2015 00:18:40 +0000 (00:18 +0000)]
[WebAssembly] Update test expectations
Many tests are now passing due to eliminateFrameIndex implementation and
the list needs to be re-triaged because it unblocks other failures, and
some previous failures are different. However I'm about to churn it more
by implementing more lowering, so will wait on that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255396 91177308-0d34-0410-b5e6-96231b3b80d8
Chen Li [Sat, 12 Dec 2015 00:08:37 +0000 (00:08 +0000)]
Revert rL255391: [X86ISelLowering] Add additional support for multiplication-to-shift conversion.
because it broke buildbot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255395
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Sat, 12 Dec 2015 00:01:10 +0000 (00:01 +0000)]
use FileCheck for better checking
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255394 91177308-0d34-0410-b5e6-96231b3b80d8
Derek Schuff [Fri, 11 Dec 2015 23:49:46 +0000 (23:49 +0000)]
[WebAssembly] Implement prolog/epilog insertion and FrameIndex elimination
Summary:
Use the SP32 physical register as the base for FrameIndex
lowering. Update it and the __stack_pointer global var in the prolog and
epilog. Extend the mapping of virtual registers to wasm locals to
include the physical registers.
Rather than modify the target-independent PrologEpilogInserter (which
asserts that there are no virtual registers left) include a
slightly-modified copy for Wasm that does not have this assertion and
only clears the virtual registers if scavenging was needed (which of
course it isn't for wasm).
Differential Revision: http://reviews.llvm.org/D15344
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255392 91177308-0d34-0410-b5e6-96231b3b80d8
Chen Li [Fri, 11 Dec 2015 23:39:32 +0000 (23:39 +0000)]
[X86ISelLowering] Add additional support for multiplication-to-shift conversion.
Summary: This patch adds support for the conversions (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication cannot be converted to LEA + SHL or LEA + LEA. LLVM already supports this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this.
Reviewers: craig.topper, RKSimon
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D14603
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255391 91177308-0d34-0410-b5e6-96231b3b80d8
Diego Novillo [Fri, 11 Dec 2015 23:21:38 +0000 (23:21 +0000)]
SamplePGO - Reduce memory utilization by 10x.
DenseMap is the wrong data structure to use for sample records and call
sites. The keys are too large, causing massive core memory growth when
reading profiles.
Before this patch, a 21 MB input profile was causing the compiler to grow
to 3 GB in memory. By switching to std::map, the compiler now grows to
300 MB in memory.
There still are some opportunities for memory footprint reduction. I'll
be looking at those next.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255389 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 11 Dec 2015 23:16:47 +0000 (23:16 +0000)]
SelectionDAG: Match min/max if the scalar operation is legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255388 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Fri, 11 Dec 2015 23:11:52 +0000 (23:11 +0000)]
Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics
After much discussion, ending here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html
it has been decided that, instead of having the vectorizer directly generate
special absdiff and horizontal-add intrinsics, we'll recognize the relevant
reduction patterns during CodeGen. Accordingly, these intrinsics are not needed
(the operations they represent can be pattern matched, as is already done in
some backends). Thus, we're backing these out in favor of the current
development work.
r248483 - Codegen: Fix llvm.*absdiff semantic.
r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255387 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola [Fri, 11 Dec 2015 22:52:32 +0000 (22:52 +0000)]
Avoid buffered reads of /dev/urandom
I am seeing disappointing clang performance on a large PowerPC64
Linux box. GetRandomNumberSeed() does a buffered read from
/dev/urandom to seed its PRNG. As a result we read an entire page
even though we only need 4 bytes.
With every clang task reading a page's worth of /dev/urandom we
end up spending a large amount of time stuck on a kernel spinlock.
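The difference between a buffered and an unbuffered read can be sketched with raw POSIX calls. This is an illustration only, not the code from the patch: the function name readUrandomSeed is hypothetical, and it assumes a POSIX system that provides /dev/urandom.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical helper (not from the patch): read exactly sizeof(seed)
// bytes from /dev/urandom using read(2) directly. Because no stdio
// buffer sits in between, the kernel is asked for 4 bytes rather than
// a full buffer's worth of random data.
// Returns false if the device cannot be opened or the read is short.
bool readUrandomSeed(uint32_t &seed) {
  int fd = open("/dev/urandom", O_RDONLY);
  if (fd < 0)
    return false;
  ssize_t n = read(fd, &seed, sizeof(seed));
  close(fd);
  return n == static_cast<ssize_t>(sizeof(seed));
}
```

An fopen/fread version would instead fill the FILE buffer on the first read, which is the behaviour the commit message describes as reading an entire page.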
Patch by Anton Blanchard!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255386 91177308-0d34-0410-b5e6-96231b3b80d8
Davide Italiano [Fri, 11 Dec 2015 22:27:59 +0000 (22:27 +0000)]
[llvm-objdump/MachODump] Reduce code duplication.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255380 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Fri, 11 Dec 2015 20:26:30 +0000 (20:26 +0000)]
Add tests for bitcast-bitcast sequences for all scalar/vector permutations
As noted in http://reviews.llvm.org/D15392 , we should be able to improve this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255370 91177308-0d34-0410-b5e6-96231b3b80d8
Xinliang David Li [Fri, 11 Dec 2015 20:23:22 +0000 (20:23 +0000)]
[PGO] Revert r255365: solution incomplete, not handling lambda yet
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255369 91177308-0d34-0410-b5e6-96231b3b80d8
Xinliang David Li [Fri, 11 Dec 2015 19:53:19 +0000 (19:53 +0000)]
[PGO] Stop using invalid char in instr variable names.
Before the patch, -fprofile-instr-generate compile will fail
if no integrated-as is specified when the file contains
any static functions (the -S output is also invalid).
This patch fixed the issue. With the change, the index format
version will be bumped up by 1. Backward compatibility is
preserved with this change.
Differential Revision: http://reviews.llvm.org/D15243
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255365 91177308-0d34-0410-b5e6-96231b3b80d8
Matthias Braun [Fri, 11 Dec 2015 19:42:09 +0000 (19:42 +0000)]
CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()
computeRegisterLiveness() was broken in that it reported a register as
dead even if a subregister was alive. I assume this was because the
results of analyzePhysRegs() are hard to understand with respect to
subregisters.
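The kind of mistake described here can be shown with a toy model. Everything in this sketch is hypothetical and unrelated to the real PhysRegInfo fields: registers are modeled as bitmasks of the units they cover, and the "naive" query is only one plausible way to get the wrong answer, not a copy of the old LLVM code.

```cpp
#include <cstdint>

// Toy register model (hypothetical): each register is a bitmask of the
// register units it covers, so AX overlaps both AL and AH.
enum : uint8_t { AL = 0x1, AH = 0x2, AX = AL | AH };

// Naive query: treats a register as live only if every unit it covers
// is live. With only AL live, this reports AX as dead.
bool isLiveNaive(uint8_t liveUnits, uint8_t reg) {
  return (liveUnits & reg) == reg;
}

// Subregister-aware query: a register is live as soon as any unit it
// covers is live, so a live AL keeps AX alive.
bool isLive(uint8_t liveUnits, uint8_t reg) {
  return (liveUnits & reg) != 0;
}
```

With only AL in the live set, isLiveNaive reports AX as dead while isLive reports it as live, which is the distinction a caller that wants to clobber AX needs.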
This commit:
* Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be
clearly understandable, and renames the fields to avoid silent breakage
of third-party code (and improve the grammar).
* Fixes all (two) users of computeRegisterLiveness() in llvm by reenabling
it and removing workarounds for the bug.
This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033
Differential Revision: http://reviews.llvm.org/D15320
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255362 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 11 Dec 2015 19:20:16 +0000 (19:20 +0000)]
Start replacing vector_extract/vector_insert with extractelt/insertelt
These are redundant pairs of nodes defined for
INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT.
insertelement/extractelement are slightly closer to the corresponding
C++ node name, and have stricter type checking, so prefer them.
Update targets to only use these nodes where it is trivial to do so.
AArch64, ARM, and Mips all have various type errors on simple replacement,
so they will need work to fix.
Example from AArch64:
def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8),
(i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>;
Which is trying to do sext_inreg i8, i8.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255359 91177308-0d34-0410-b5e6-96231b3b80d8
Derek Schuff [Fri, 11 Dec 2015 18:55:34 +0000 (18:55 +0000)]
[WebAssembly] Fix ADJCALLSTACKDOWN/UP use/defs
Summary:
ADJCALLSTACK{DOWN,UP} (aka CALLSEQ_{START,END}) MIs are supposed to use
and def the stack pointer. Since they do not, all the nodes are being
eliminated by DeadMachineInstructionElim, so they aren't in the IR when
PrologEpilogInserter/eliminateCallFramePseudo needs them.
This change fixes that, but since RegStackify will not stackify across
them (and it runs early, before PEI), change LowerCall to only emit them
when the call frame size is > 0. That makes the current code work the
same way and makes code handled by D15344 also work the same way. We can
expand the condition beyond NumBytes > 0 in the future if needed.
Reviewers: sunfish, jfb
Subscribers: jfb, dschuff, llvm-commits
Differential Revision: http://reviews.llvm.org/D15459
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255356 91177308-0d34-0410-b5e6-96231b3b80d8
Chad Rosier [Fri, 11 Dec 2015 18:39:41 +0000 (18:39 +0000)]
Revert r255247, r255265, and r255286 due to serious compile-time regressions.
Revert "[DSE] Disable non-local DSE to see if the bots go green."
Revert "[DeadStoreElimination] Use range-based loops. NFC."
Revert "[DeadStoreElimination] Add support for non-local DSE."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255354 91177308-0d34-0410-b5e6-96231b3b80d8
Manman Ren [Fri, 11 Dec 2015 18:24:30 +0000 (18:24 +0000)]
CXX_FAST_TLS calling convention: target independent portion.
The access function has a short entry and a short exit; the initialization
block is only run the first time. To improve performance, we want to
have a short frame at the entry and exit.
We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.
Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs will
be handled by the register allocator, which will try to spill and
reload them in the initialization block.
We add CSRsViaCopy, which will be explicitly handled during lowering.
1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
supports it for the given calling convention and the function has only return
exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
virtual registers at beginning of the entry block and copies from virtual
registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.
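The shape of the access function this commit targets can be sketched in portable C++. This is a hand-written stand-in rather than compiler output: the names guard, value, initCount, initOnce and access are hypothetical, and a real compiler-generated thread_local wrapper differs in detail.

```cpp
// Hypothetical per-thread state standing in for a thread_local variable
// with a dynamic initializer.
thread_local bool guard = false;  // has this thread run the initializer?
thread_local int value = 0;       // the variable being accessed
thread_local int initCount = 0;   // counts initializer runs, for testing

// Cold path: the initialization block. It runs at most once per thread
// and is where most register pressure (and hence spilling) would occur.
void initOnce() {
  value = 42;
  ++initCount;
  guard = true;
}

// Access function: on every call after the first, only the guard check
// and the return execute, so entry and exit should stay as short as
// possible.
int &access() {
  if (!guard)
    initOnce();
  return value;
}
```

Calling access() repeatedly on one thread runs initOnce() only on the first call, which is why saving callee-saved registers at function entry would penalize the common path.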
rdar://problem/23557469
Differential Revision: http://reviews.llvm.org/D15340
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255353 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Fri, 11 Dec 2015 18:12:01 +0000 (18:12 +0000)]
fix typos; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255352 91177308-0d34-0410-b5e6-96231b3b80d8
Frederic Riss [Fri, 11 Dec 2015 17:50:37 +0000 (17:50 +0000)]
[dsymutil] Ignore absolute symbols in the debug map
Quoting from the comment added to the code:
// Objective-C on i386 uses artificial absolute symbols to
// perform some link time checks. Those symbols have a fixed 0
// address that might conflict with real symbols in the object
// file. As I cannot see a way for absolute symbols to find
// their way into the debug information, let's just ignore those.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255350 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Fri, 11 Dec 2015 17:46:01 +0000 (17:46 +0000)]
AlignmentFromAssumptions and SLPVectorizer preserve AA and GlobalsAA
GlobalsAA's assumption that passes do not escape globals not previously
escaped is not violated by AlignmentFromAssumptions and SLPVectorizer. Marking
them as such allows GlobalsAA to be preserved until GVN in the LTO pipeline.
http://lists.llvm.org/pipermail/llvm-dev/2015-December/092972.html
Patch by Vaivaswatha Nagaraj!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255348 91177308-0d34-0410-b5e6-96231b3b80d8
Hal Finkel [Fri, 11 Dec 2015 17:31:27 +0000 (17:31 +0000)]
[TableGen] Correct Namespace lookup with AltNames in AsmWriterEmitter
AsmWriterEmitter will generate a getRegisterName function with an alternate
register name index as its second argument if the target makes use of them. The
enum of these values is generated in RegisterInfoEmitter. The getRegisterName
generator would assume the namespace could always be found by reading index 1
of the list of AltNameIndices, but this will fail if this list is sorted such
that the NoRegAltName is at index 1. Because this list is sorted by record name
(in CodeGenTarget::ReadRegAltNameIndices), you only run into problems if your
MyTargetRegisterInfo.td defines a single RegAltNameIndex that sorts lexically
before NoRegAltName.
For example, if a target has something like
def AnAltNameIndex : RegAltNameIndex
and defines RegAltNameIndices for some registers then, prior to this change,
AsmWriterEmitter would generate references to
::AnAltNameIndex and ::NoRegAltName
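The ordering problem can be reproduced outside TableGen with a sorted list of names. This sketch is an assumption-laden illustration: it uses plain lexical sorting of strings in place of TableGen records, the helper names are hypothetical, and pickFirstReal is just one way to avoid the fixed-index assumption, not necessarily what the patch does.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sort the alt-name index records by name (a stand-in for the
// record-name ordering described above).
std::vector<std::string> sortedAltNames(std::vector<std::string> names) {
  std::sort(names.begin(), names.end());
  return names;
}

// Old assumption: the target-defined index always sits at position 1.
std::string pickIndexOne(const std::vector<std::string> &sorted) {
  return sorted.at(1);
}

// Order-independent alternative: take the first entry that is not the
// built-in NoRegAltName. Returns an empty string if there is none.
std::string pickFirstReal(const std::vector<std::string> &sorted) {
  auto it = std::find_if(
      sorted.begin(), sorted.end(),
      [](const std::string &n) { return n != "NoRegAltName"; });
  return it == sorted.end() ? std::string() : *it;
}
```

With the names NoRegAltName and AnAltNameIndex, sorting puts AnAltNameIndex at position 0, so pickIndexOne returns NoRegAltName while pickFirstReal still returns AnAltNameIndex.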
Patch by Alex Bradbury!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255344 91177308-0d34-0410-b5e6-96231b3b80d8
Artur Pilipenko [Fri, 11 Dec 2015 16:30:26 +0000 (16:30 +0000)]
PruneEH pass incorrectly reports that a change was made
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D14097
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255343 91177308-0d34-0410-b5e6-96231b3b80d8
James Molloy [Fri, 11 Dec 2015 13:36:59 +0000 (13:36 +0000)]
[Mem2Reg] Respect optnone
Mem2Reg shouldn't be optimizing a function that is marked
optnone. There is a test checking this that fails when mem2reg is
explicitly added to the standard pass pipeline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255336 91177308-0d34-0410-b5e6-96231b3b80d8
James Molloy [Fri, 11 Dec 2015 10:04:51 +0000 (10:04 +0000)]
[InstCombine] Make MatchBSwap also match bit reversals
MatchBSwap has most of the functionality to match bit reversals already. If we switch it from looking at bytes to individual bits and remove a few early exits, we can extend the main recursive function to match any sequence of ORs, ANDs and shifts that assemble a value from different parts of another, base value. Once we have this bit->bit mapping, we can very simply detect if it is appropriate for a bswap or bitreverse.
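The final classification step, once a bit->bit mapping is known, might look like the following. This is a standalone sketch rather than the InstCombine code: the mapping is assumed to be a vector in which entry i holds the index of the source bit that ends up in result bit i, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// provenance[i] is the index of the source bit that lands in result
// bit i (bit 0 is the least significant bit).

// Bit reversal: result bit i comes from source bit (n - 1 - i).
bool isBitReverse(const std::vector<int> &provenance) {
  std::size_t n = provenance.size();
  for (std::size_t i = 0; i < n; ++i)
    if (provenance[i] != static_cast<int>(n - 1 - i))
      return false;
  return n > 0;
}

// Byte swap: result byte b comes from source byte (bytes - 1 - b) with
// the bit order inside each byte unchanged. The width must be a
// multiple of 16 bits, i.e. an even number of bytes.
bool isByteSwap(const std::vector<int> &provenance) {
  std::size_t n = provenance.size();
  if (n == 0 || n % 16 != 0)
    return false;
  std::size_t bytes = n / 8;
  for (std::size_t i = 0; i < n; ++i) {
    std::size_t srcByte = bytes - 1 - i / 8;
    if (provenance[i] != static_cast<int>(srcByte * 8 + i % 8))
      return false;
  }
  return true;
}
```

For a 16-bit value, the mapping 8..15 followed by 0..7 is a byte swap, while the mapping 15 down to 0 is a bit reversal; any other mapping is neither.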
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255334 91177308-0d34-0410-b5e6-96231b3b80d8
Maxim Ostapenko [Fri, 11 Dec 2015 07:40:25 +0000 (07:40 +0000)]
Revert previous test commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255331 91177308-0d34-0410-b5e6-96231b3b80d8
Maxim Ostapenko [Fri, 11 Dec 2015 07:31:29 +0000 (07:31 +0000)]
This is a test commit to check my commit access works.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255330 91177308-0d34-0410-b5e6-96231b3b80d8
Xinliang David Li [Fri, 11 Dec 2015 06:53:53 +0000 (06:53 +0000)]
[PGO] Read VP raw data without depending on the Value field
Before this patch, each function's on-disk VP data is 'pointed'
to by the Value field of per-function ProfileData structure, and
read relies on this field (relocated with ValueDataDelta field)
to read the value data. However this means the Value field needs
to be updated during runtime before dumping, which creates undesirable
data races.
With this patch, the reading of VP data no longer depends on Value
field. There is no format change. ValueDataDelta header field becomes
obsolete but will be kept for compatibility reasons (will be removed
next time the raw format change is needed).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255329 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Fri, 11 Dec 2015 00:58:32 +0000 (00:58 +0000)]
Fix build after r255319.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255322 91177308-0d34-0410-b5e6-96231b3b80d8
Eric Christopher [Fri, 11 Dec 2015 00:51:59 +0000 (00:51 +0000)]
Fix a spurious if.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255321 91177308-0d34-0410-b5e6-96231b3b80d8
Akira Hatanaka [Fri, 11 Dec 2015 00:49:47 +0000 (00:49 +0000)]
[LazyValueInfo] Stop inserting overdefined values into ValueCache to
reduce memory usage.
Previously, LazyValueInfoCache inserted overdefined lattice values into
both ValueCache and OverDefinedCache. This wasn't necessary and was
causing LazyValueInfo to use an excessive amount of memory in some cases.
This patch changes LazyValueInfoCache to insert overdefined values only
into OverDefinedCache. The memory usage decreases by 70 to 75% when one
of the files in llvm is compiled.
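The caching change can be pictured with a toy pair of containers. This is not the LazyValueInfoCache implementation: the struct ToyCache, its integer value and block ids, and its method names are all hypothetical, and the real caches key on LLVM values and basic blocks.

```cpp
#include <map>
#include <set>
#include <utility>

// Toy cache (hypothetical): keys are (value id, block id) pairs.
struct ToyCache {
  using Key = std::pair<int, int>;

  std::map<Key, int> valueCache;    // known constant results only
  std::set<Key> overDefinedCache;   // "nothing known" results

  void insertConstant(int value, int block, int constant) {
    valueCache[{value, block}] = constant;
  }

  // After the change described above, an overdefined result is recorded
  // in the overdefined set only. No entry is added to valueCache.
  void insertOverdefined(int value, int block) {
    overDefinedCache.insert({value, block});
  }

  bool isOverdefined(int value, int block) const {
    return overDefinedCache.count({value, block}) != 0;
  }
};
```

In this model every overdefined result costs one set entry instead of a set entry plus a map entry, which is the duplication the commit removes.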
rdar://problem/11388615
Differential revision: http://reviews.llvm.org/D15391
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255320 91177308-0d34-0410-b5e6-96231b3b80d8
Kyle Butt [Fri, 11 Dec 2015 00:47:36 +0000 (00:47 +0000)]
[PPC]: Peephole optimize small accesses to aligned globals.
Access to aligned globals gives us a chance to peephole optimize nonzero
offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't
overflow the available displacement. For example:
addis 3, 2, b4v@toc@ha
addi 4, 3, b4v@toc@l
lbz 5, b4v@toc@l(3) ; This is the result of the current peephole
lbz 6, 1(4) ; optimizer
lbz 7, 2(4)
lbz 8, 3(4)
If b4v is 4-byte aligned, we can skip using register 4 because we know
that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate:
addis 3, 2, b4v@toc@ha
lbz 4, b4v@toc@l(3)
lbz 5, b4v@toc@l+1(3)
lbz 6, b4v@toc@l+2(3)
lbz 7, b4v@toc@l+3(3)
Saving a register and an addition.
Larger alignments allow larger structures/arrays to be optimized.
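The no-overflow claim can be checked exhaustively over the low 16 bits of an address. This is a sketch under stated assumptions, not code from the patch: the function names are hypothetical, the displacement field is modeled as a signed 16-bit value, and alignments are assumed to be powers of two no larger than 32768.

```cpp
#include <cstdint>

// Low 16 bits of an address, interpreted as the signed displacement
// that sym@l contributes.
int32_t lo16(uint32_t addr) {
  int32_t v = static_cast<int32_t>(addr & 0xFFFFu);
  return v >= 0x8000 ? v - 0x10000 : v;
}

// True if sym@l + offset still fits a signed 16-bit displacement.
bool offsetFits(uint32_t addr, uint32_t offset) {
  int32_t d = lo16(addr) + static_cast<int32_t>(offset);
  return d >= -32768 && d <= 32767;
}

// Exhaustive check for one alignment: for every low half that is a
// multiple of `align`, every offset smaller than `align` must fit.
// `align` is assumed to be a power of two in [1, 32768].
bool alignedOffsetsAlwaysFit(uint32_t align) {
  for (uint32_t lo = 0; lo < 0x10000u; lo += align)
    for (uint32_t off = 0; off < align; ++off)
      if (!offsetFits(lo, off))
        return false;
  return true;
}
```

For 4-byte alignment the largest aligned low half is 32764, so adding 3 gives exactly 32767 and still fits. A low half of 0x7FFE, which is only 2-byte aligned, overflows with an offset of 3.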
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255319 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Fri, 11 Dec 2015 00:43:42 +0000 (00:43 +0000)]
Check in the script for building Win snapshots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255318 91177308-0d34-0410-b5e6-96231b3b80d8
Vedant Kumar [Fri, 11 Dec 2015 00:40:05 +0000 (00:40 +0000)]
[ProfileData] clang-format TextInstrProfReader::hasFormat. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255317 91177308-0d34-0410-b5e6-96231b3b80d8
Cong Hou [Fri, 11 Dec 2015 00:31:39 +0000 (00:31 +0000)]
[X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1.
Previously in the conversion cost table there were no entries for integer-integer
conversions on SSE2. This resulted in imprecise costs for certain vectorized
operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers
are counted from the result of running llc on the new test case in this patch.
Differential revision: http://reviews.llvm.org/D15132
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255315 91177308-0d34-0410-b5e6-96231b3b80d8