//===---------------------------------------------------------------------===//
// Random ideas for the X86 backend.
//===---------------------------------------------------------------------===//

Add MUL2U and MUL2S nodes to represent a multiply that returns both the
Hi and Lo parts (combination of MUL and MULH[SU] into one node). Add this to
X86, and make the dag combiner produce it when needed. This will eliminate one
imul from the code generated for:

long long test(long long X, long long Y) { return X*Y; }

by using the EAX result from the mul. We should add a similar node for

long long test(int X, int Y) { return (long long)X*Y; }

... which should only be one imul instruction.
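
As a rough illustration of what a combined Hi/Lo multiply node would compute,
here is a minimal C sketch. The names mul2_result, mul2u, mul2s and mul64_lo
are hypothetical and exist only for this example; they are not LLVM APIs. The
last function shows how the low 64 bits of a 64x64 multiply can be assembled
from one widening multiply plus two truncating ones.

```c
#include <stdint.h>

/* Hypothetical result pair: on x86, `mul`/`imul` leave lo in EAX, hi in EDX. */
typedef struct { uint32_t lo, hi; } mul2_result;

/* What a MUL2U-style node would yield: full 64-bit unsigned product. */
static mul2_result mul2u(uint32_t x, uint32_t y) {
    uint64_t p = (uint64_t)x * y;
    mul2_result r = { (uint32_t)p, (uint32_t)(p >> 32) };
    return r;
}

/* What a MUL2S-style node would yield: full 64-bit signed product. */
static mul2_result mul2s(int32_t x, int32_t y) {
    uint64_t p = (uint64_t)((int64_t)x * y);
    mul2_result r = { (uint32_t)p, (uint32_t)(p >> 32) };
    return r;
}

/* Low 64 bits of X*Y from 32-bit pieces: one widening multiply for the low
   halves, then two 32-bit multiplies whose results only feed the high word. */
static uint64_t mul64_lo(uint64_t x, uint64_t y) {
    uint32_t xl = (uint32_t)x, xh = (uint32_t)(x >> 32);
    uint32_t yl = (uint32_t)y, yh = (uint32_t)(y >> 32);
    mul2_result p = mul2u(xl, yl);
    uint32_t hi = p.hi + xl * yh + xh * yl;   /* wraps modulo 2^32 by design */
    return ((uint64_t)hi << 32) | p.lo;
}
```

For the (long long)X*Y case, mul2s(X, Y) alone already gives both halves of
the result, which is why a single imul should suffice there.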

//===---------------------------------------------------------------------===//

This should be one DIV/IDIV instruction, not a libcall:

unsigned test(unsigned long long X, unsigned Y) {

This can be done trivially with a custom legalizer. What about overflow
though? http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14224
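
To make the overflow concern concrete: the 32-bit DIV instruction divides
EDX:EAX by its operand and faults if the quotient does not fit in 32 bits.
The sketch below assumes the function above divides X by Y (its body is not
shown here); quotient_fits_u32 and udiv64_32 are hypothetical names used only
for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* The quotient of a 64/32 unsigned divide fits in 32 bits exactly when the
   high word of the dividend is smaller than the divisor. */
static bool quotient_fits_u32(uint64_t x, uint32_t y) {
    return y != 0 && (uint32_t)(x >> 32) < y;
}

/* Sketch of a guarded lowering. Both paths compute the same C value; the
   point is that only the first one could safely become a single DIV.
   The caller must ensure y != 0, as with any C division. */
static uint32_t udiv64_32(uint64_t x, uint32_t y) {
    if (quotient_fits_u32(x, y))
        return (uint32_t)(x / y);   /* safe candidate for one DIV */
    return (uint32_t)(x / y);       /* general case: full 64-bit divide, truncated */
}
```

Without such a guard (or a proof that the high word is small), emitting a
bare DIV would trap on inputs where the C semantics merely truncate.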

//===---------------------------------------------------------------------===//

Some targets (e.g. Athlons) prefer freep to fstp ST(0):
http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html

//===---------------------------------------------------------------------===//

This should use fiadd on chips where it is profitable:
double foo(double P, int *I) { return P+*I; }

//===---------------------------------------------------------------------===//

The FP stackifier needs to be global. Also, it should handle simple
permutations to reduce the number of shuffle instructions, e.g. turning:

//===---------------------------------------------------------------------===//

Improvements to the multiply -> shift/add algorithm:
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg01590.html
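
For readers unfamiliar with the transformation, the sketch below shows the
naive form of multiply -> shift/add: one shift and add per set bit of the
constant. This is only a baseline for illustration, not the algorithm from
the linked patch; mul_shift_add is a hypothetical name.

```c
#include <stdint.h>

/* Multiply x by the constant c using only shifts and adds.
   Each set bit k of c contributes x << k to the sum. */
static uint32_t mul_shift_add(uint32_t x, uint32_t c) {
    uint32_t acc = 0;
    for (unsigned bit = 0; c != 0; ++bit, c >>= 1) {
        if (c & 1u)
            acc += x << bit;   /* bit stays below 32 because c is 32 bits wide */
    }
    return acc;
}
```

For example, multiplying by 10 becomes (x << 1) + (x << 3). Better
algorithms also use subtraction and factoring, so a constant such as 15 can
become (x << 4) - x instead of four adds.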

//===---------------------------------------------------------------------===//

Improve code like this (occurs fairly frequently, e.g. in LLVM):
long long foo(int x) { return 1LL << x; }

http://gcc.gnu.org/ml/gcc-patches/2004-09/msg01109.html
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg01128.html
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg01136.html

Another useful one would be ~0ULL >> X and ~0ULL << X.
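
One way to see why these special cases can be cheaper than a general 64-bit
shift on a 32-bit target: the shifted value is a known constant, so each
32-bit half is either a fixed value or a single 32-bit shift, selected by
bit 5 of the shift amount. The sketch below assumes 0 <= x < 64; the
function names are hypothetical.

```c
#include <stdint.h>

/* 1LL << x: exactly one half holds a single set bit, the other is zero. */
static uint64_t one_shl(unsigned x) {
    uint32_t bit = 1u << (x & 31);
    uint32_t lo = (x & 32) ? 0 : bit;
    uint32_t hi = (x & 32) ? bit : 0;
    return ((uint64_t)hi << 32) | lo;
}

/* ~0ULL >> x: one half is a shifted mask, the other is all ones or zero. */
static uint64_t ones_shr(unsigned x) {
    uint32_t m = 0xFFFFFFFFu >> (x & 31);
    uint32_t hi = (x & 32) ? 0 : m;
    uint32_t lo = (x & 32) ? m : 0xFFFFFFFFu;
    return ((uint64_t)hi << 32) | lo;
}

/* ~0ULL << x: mirror image of ones_shr. */
static uint64_t ones_shl(unsigned x) {
    uint32_t m = 0xFFFFFFFFu << (x & 31);
    uint32_t lo = (x & 32) ? 0 : m;
    uint32_t hi = (x & 32) ? m : 0xFFFFFFFFu;
    return ((uint64_t)hi << 32) | lo;
}
```

Each function needs one 32-bit shift and a select on x & 32, with no
cross-half shld/shrd style data movement.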

//===---------------------------------------------------------------------===//

Should support emission of the bswap instruction, probably by adding a new
DAG node for byte swapping. Also useful on PPC which has byte-swapping loads.
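
For reference, this is the kind of source-level idiom a byte-swap node would
stand for. It is a minimal sketch; bswap32 is a hypothetical helper name, and
whether a given frontend or combiner recognizes this exact shape is an
assumption.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value using shifts and masks.
   On x86 the whole expression corresponds to a single `bswap`. */
static uint32_t bswap32(uint32_t v) {
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}
```

Applying it twice returns the original value, which is a handy sanity check
for any lowering of such a node.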

//===---------------------------------------------------------------------===//

_Bool f(_Bool a) { return a!=1; }
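
Since a _Bool only ever holds 0 or 1, the comparison in f is equivalent to
flipping the low bit. The sketch below states that equivalence in C; f_xor is
a hypothetical name introduced here, and f is repeated so the block is
self-contained.

```c
#include <stdbool.h>

/* Same function as above, written with <stdbool.h>. */
static bool f(bool a) { return a != 1; }

/* Equivalent form: a single XOR with 1, no compare or setcc needed. */
static bool f_xor(bool a) { return a ^ 1; }
```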

//===---------------------------------------------------------------------===//

1. Dynamic programming based approach when compile time is not an issue.
2. Code duplication (addressing mode) during isel.
3. Other ideas from "Register-Sensitive Selection, Duplication, and
   Sequencing of Instructions".

//===---------------------------------------------------------------------===//

Should we promote i16 to i32 to avoid partial register update stalls?

//===---------------------------------------------------------------------===//

Leave any_extend as a pseudo instruction and give a hint to the register
allocator. Delay codegen until after register allocation.

//===---------------------------------------------------------------------===//

Add a target specific hook to DAG combiner to handle SINT_TO_FP and
FP_TO_SINT when the source operand is already in memory.

//===---------------------------------------------------------------------===//

Check if load folding would add a cycle in the dag.

//===---------------------------------------------------------------------===//

Model X86 EFLAGS as a real register to avoid redundant cmp / test, e.g.

        testb %al, %al  # unnecessary