From: Chris Lattner Date: Sun, 29 Nov 2009 02:19:52 +0000 (+0000) Subject: update and consolidate the load pre notes. X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=d4137f40da9dff91c693721a1415d3f6b6332246;hp=ff0a883fe93df3c7fff056f9fb48dc422f9369ac;p=oota-llvm.git update and consolidate the load pre notes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90050 91177308-0d34-0410-b5e6-96231b3b80d8 --- diff --git a/lib/Target/README.txt b/lib/Target/README.txt index 4029134f139..2d8a687ebeb 100644 --- a/lib/Target/README.txt +++ b/lib/Target/README.txt @@ -1116,6 +1116,8 @@ later. //===---------------------------------------------------------------------===// +[STORE SINKING] + Store sinking: This code: void f (int n, int *cond, int *res) { @@ -1171,6 +1173,8 @@ This is GCC PR38204. //===---------------------------------------------------------------------===// +[STORE SINKING] + GCC PR37810 is an interesting case where we should sink load/store reload into the if block and outside the loop, so we don't reload/store it on the non-call path. @@ -1198,7 +1202,7 @@ we don't sink the store. We need partially dead store sinking. //===---------------------------------------------------------------------===// -[LOAD PRE with NON-AVAILABLE ADDRESS] +[LOAD PRE CRIT EDGE SPLITTING] GCC PR37166: Sinking of loads prevents SROA'ing the "g" struct on the stack leading to excess stack traffic. This could be handled by GVN with some crazy @@ -1217,62 +1221,57 @@ bb3: ; preds = %bb1, %bb2, %bb %11 is partially redundant, an in BB2 it should have the value %8. -GCC PR33344 is a similar case. +GCC PR33344 and PR35287 are similar cases. //===---------------------------------------------------------------------===// +[LOAD PRE] + There are many load PRE testcases in testsuite/gcc.dg/tree-ssa/loadpre* in the -GCC testsuite. There are many pre testcases as ssa-pre-*.c +GCC testsuite, ones we don't get yet are (checked through loadpre25): -//===---------------------------------------------------------------------===// +[CRIT EDGE BREAKING] +loadpre3.c predcom-4.c -There are some interesting cases in testsuite/gcc.dg/tree-ssa/pred-comm* in the -GCC testsuite. For example, predcom-1.c is: - - for (i = 2; i < 1000; i++) - fib[i] = (fib[i-1] + fib[i - 2]) & 0xffff; - -which compiles into: - -bb1: ; preds = %bb1, %bb1.thread - %indvar = phi i32 [ 0, %bb1.thread ], [ %0, %bb1 ] - %i.0.reg2mem.0 = add i32 %indvar, 2 - %0 = add i32 %indvar, 1 ; [#uses=3] - %1 = getelementptr [1000 x i32]* @fib, i32 0, i32 %0 - %2 = load i32* %1, align 4 ; [#uses=1] - %3 = getelementptr [1000 x i32]* @fib, i32 0, i32 %indvar - %4 = load i32* %3, align 4 ; [#uses=1] - %5 = add i32 %4, %2 ; [#uses=1] - %6 = and i32 %5, 65535 ; [#uses=1] - %7 = getelementptr [1000 x i32]* @fib, i32 0, i32 %i.0.reg2mem.0 - store i32 %6, i32* %7, align 4 - %exitcond = icmp eq i32 %0, 998 ; [#uses=1] - br i1 %exitcond, label %return, label %bb1 +[PRE OF READONLY CALL] +loadpre5.c -This is basically: - LOAD fib[i+1] - LOAD fib[i] - STORE fib[i+2] +[TURN SELECT INTO BRANCH] +loadpre14.c loadpre15.c -instead of handling this as a loop or other xform, all we'd need to do is teach -load PRE to phi translate the %0 add (i+1) into the predecessor as (i'+1+1) = -(i'+2) (where i' is the previous iteration of i). This would find the store -which feeds it. +actually a conditional increment: loadpre18.c loadpre19.c -predcom-2.c is apparently the same as predcom-1.c -predcom-3.c is very similar but needs loads feeding each other instead of -store->load. -predcom-4.c seems the same as the rest. +//===---------------------------------------------------------------------===// + +[SCALAR PRE] +There are many PRE testcases in testsuite/gcc.dg/tree-ssa/ssa-pre-*.c in the +GCC testsuite. //===---------------------------------------------------------------------===// -Other simple load PRE cases: -http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35287 [LPRE crit edge splitting] +There are some interesting cases in testsuite/gcc.dg/tree-ssa/pred-comm* in the +GCC testsuite. For example, we get the first example in predcom-1.c, but +miss the second one: + +unsigned fib[1000]; +unsigned avg[1000]; + +__attribute__ ((noinline)) +void count_averages(int n) { + int i; + for (i = 1; i < n; i++) + avg[i] = (((unsigned long) fib[i - 1] + fib[i] + fib[i + 1]) / 3) & 0xffff; +} + +which compiles into two loads instead of one in the loop. + +predcom-2.c is the same as predcom-1.c + +predcom-3.c is very similar but needs loads feeding each other instead of +store->load. -http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34677 (licm does this, LPRE crit edge) - llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as | opt -mem2reg -simplifycfg -gvn | llvm-dis //===---------------------------------------------------------------------===// @@ -1305,7 +1304,7 @@ Interesting missed case because of control flow flattening (should be 2 loads): http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26629 With: llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as | opt -mem2reg -gvn -instcombine | llvm-dis -we miss it because we need 1) GEP PHI TRAN, 2) CRIT EDGE 3) MULTIPLE DIFFERENT +we miss it because we need 1) CRIT EDGE 2) MULTIPLE DIFFERENT VALS PRODUCED BY ONE BLOCK OVER DIFFERENT PATHS //===---------------------------------------------------------------------===//