add doesn't need to overflow between the two 16-bit chunks.

* Implement pre/post increment support. (e.g. PR935)
* Implement smarter constant generation for binops with large immediates
  (see the sketch below).

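One possible shape for the large-immediate case, as a sketch only (the
constant 0x10087 and the register choices are illustrative): split an
immediate that has no SoImm encoding into two pieces that do, and fold both
into the operation instead of materializing the constant first:

        @ r0 = r0 + 0x10087; with v6T2 movw/movt this is roughly:
        movw    r1, #0x0087
        movt    r1, #0x0001
        add     r0, r0, r1
        @ but two adds with encodable immediates would do:
        add     r0, r0, #0x10000
        add     r0, r0, #0x87
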
A few ARMv6T2 ops should be pattern matched: BFI, SBFX, and UBFX.

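For instance (a sketch; the exact bit positions are arbitrary), these C
idioms each correspond to a single v6T2 instruction:

unsigned ubfx(unsigned x) { return (x >> 7) & 0x1f; }  // ubfx r0, r0, #7, #5
int sbfx(int x)           { return (x << 20) >> 27; }  // sbfx r0, r0, #7, #5
unsigned bfi(unsigned x, unsigned y) {
  // insert y[7:0] into x[11:4]
  return (x & ~0xff0u) | ((y << 4) & 0xff0u);          // bfi  r0, r1, #4, #8
}
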
Interesting optimization for PIC codegen on arm-linux:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43129

//===---------------------------------------------------------------------===//

More load / store optimizations:
1) Better representation for block transfer? This is from Olden/power:
4) Once we add support for multiple result patterns, write indexed load
   patterns instead of C++ instruction selection code.
5) Use VLDM / VSTM to emulate indexed FP load / store (sketched below).

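For item 5, a sketch (registers illustrative): VLDR / VSTR have no writeback
forms, but VLDM / VSTM with base update on a one-register list gives the same
effect as a post-indexed FP load / store:

        vldr    d0, [r0]
        add     r0, r0, #8

becomes

        vldmia  r0!, {d0}
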
//===---------------------------------------------------------------------===//

More LSR enhancements possible:
1. Teach LSR about pre- and post- indexed ops to allow the iv increment to be
   merged into the load / store.

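A minimal illustration (registers arbitrary): for a copy loop like

void copy(int *p, int *q, int n) {
  for (int i = 0; i < n; ++i)
    *q++ = *p++;
}

the iv updates should fold into post-indexed forms in the loop body:

        ldr     r3, [r0], #4
        str     r3, [r1], #4
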
//===---------------------------------------------------------------------===//

Constant island pass should make use of full range SoImm values for LEApcrel.
Be careful though as the last attempt caused infinite looping on lencod.

//===---------------------------------------------------------------------===//

It might be profitable to cse MOVi16 if there are lots of 32-bit immediates
with the same bottom half.
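
One possible shape (constants illustrative): with several constants sharing
the low 16 bits, the common MOVW could be emitted once and copied, e.g.

        movw    r0, #0x5678
        movt    r0, #0x1234         @ r0 = 0x12345678
        movw    r1, #0x5678
        movt    r1, #0xabcd         @ r1 = 0xabcd5678

could become

        movw    r0, #0x5678
        mov     r1, r0              @ 16-bit encoding in Thumb2
        movt    r0, #0x1234
        movt    r1, #0xabcd
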
//===---------------------------------------------------------------------===//
Make use of the "rbit" instruction.
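
For example (a sketch, register choice arbitrary), a 32-bit bit reversal
becomes a single instruction, and count-trailing-zeros can be lowered as
rbit + clz:

        rbit    r0, r0              @ r0 = bit-reversed r0

        @ cttz:
        rbit    r0, r0
        clz     r0, r0
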
//===---------------------------------------------------------------------===//

Take a look at test/CodeGen/Thumb2/machine-licm.ll. ARM should be taught how
to licm and cse the unnecessary load from cp#1.

//===---------------------------------------------------------------------===//

The CMN instruction sets the flags like an ADD instruction, while CMP sets
them like a subtract. Therefore, to be able to use CMN for comparisons other
than the Z bit, we'll need additional logic to reverse the conditionals
associated with the comparison. Perhaps a pseudo-instruction for the
comparison, with a post-codegen pass to clean up and handle the condition
codes? See PR5694 for a testcase.

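A concrete illustration (registers and the chosen condition are only for
exposition): for

int f(int a, int b) { return a < -b; }

the negation would otherwise have to be materialized:

        rsb     r2, r1, #0
        cmp     r0, r2

whereas CMN folds it, as long as the condition is picked for the sum:

        cmn     r0, r1              @ flags reflect a + b
        @ "lt" now tests (a + b) < 0, i.e. a < -b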