From: Chandler Carruth Date: Wed, 14 Mar 2012 20:16:41 +0000 (+0000) Subject: Change where we enable the heuristic that delays inlining into functions X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=b16117c368ad4e6d004ac912549b2c6ed06731a5;p=oota-llvm.git Change where we enable the heuristic that delays inlining into functions which are small enough to themselves be inlined. Delaying in this manner can be harmful if the function is inelligible for inlining in some (or many) contexts as it pessimizes the code of the function itself in the event that inlining does not eventually happen. Previously the check was written to only do this delaying of inlining for static functions in the hope that they could be entirely deleted and in the knowledge that all callers of static functions will have the opportunity to inline if it is in fact profitable. However, with C++ we get two other important sources of functions where the definition is always available for inlining: inline functions and templated functions. This patch generalizes the inliner to allow linkonce-ODR (the linkage such C++ routines receive) to also qualify for this delay-based inlining. Benchmarking across a range of large real-world applications shows roughly 2% size increase across the board, but an average speedup of about 0.5%. Some benhcmarks improved over 2%, and the 'clang' binary itself (when bootstrapped with this feature) shows a 1% -O0 performance improvement when run over all Sema, Lex, and Parse source code smashed into a single file. A clean re-build of Clang+LLVM with a bootstrapped Clang shows approximately 2% improvement, but that measurement is often noisy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@152737 91177308-0d34-0410-b5e6-96231b3b80d8 --- diff --git a/lib/Transforms/IPO/Inliner.cpp b/lib/Transforms/IPO/Inliner.cpp index 9590df966e6..49042f2493f 100644 --- a/lib/Transforms/IPO/Inliner.cpp +++ b/lib/Transforms/IPO/Inliner.cpp @@ -244,13 +244,20 @@ bool Inliner::shouldInline(CallSite CS) { return false; } - // Try to detect the case where the current inlining candidate caller - // (call it B) is a static function and is an inlining candidate elsewhere, - // and the current candidate callee (call it C) is large enough that - // inlining it into B would make B too big to inline later. In these - // circumstances it may be best not to inline C into B, but to inline B - // into its callers. - if (Caller->hasLocalLinkage()) { + // Try to detect the case where the current inlining candidate caller (call + // it B) is a static or linkonce-ODR function and is an inlining candidate + // elsewhere, and the current candidate callee (call it C) is large enough + // that inlining it into B would make B too big to inline later. In these + // circumstances it may be best not to inline C into B, but to inline B into + // its callers. + // + // This only applies to static and linkonce-ODR functions because those are + // expected to be available for inlining in the translation units where they + // are used. Thus we will always have the opportunity to make local inlining + // decisions. Importantly the linkonce-ODR linkage covers inline functions + // and templates in C++. + if (Caller->hasLocalLinkage() || + Caller->getLinkage() == GlobalValue::LinkOnceODRLinkage) { int TotalSecondaryCost = 0; bool outerCallsFound = false; // This bool tracks what happens if we do NOT inline C into B.