//===---------------------------------------------------------------------===//
-Implement __int128 and long double support.
-
-//===---------------------------------------------------------------------===//
-
For this:
extern void xx(void);
We need to do the tailcall optimization as well.
-//===---------------------------------------------------------------------===//
-
-For this:
-
-int test(int a)
-{
- return a * 3;
-}
-
-We generates
- leal (%edi,%edi,2), %eax
-
-We should be generating
- leal (%rdi,%rdi,2), %eax
-
-instead. The later form does not require an address-size prefix 67H.
-
-It's probably ok to simply emit the corresponding 64-bit super class registers
-in this case?
-
-
//===---------------------------------------------------------------------===//
AMD64 Optimization Manual 8.2 has some nice information about optimizing integer
//===---------------------------------------------------------------------===//
-For this:
-
-extern int dst[];
-extern int* ptr;
-
-void test(void) {
- ptr = dst;
-}
-
-We generate this code for static relocation model:
-
-_test:
- leaq _dst(%rip), %rax
- movq %rax, _ptr(%rip)
- ret
-
-If we are in small code model, they we can treat _dst as a 32-bit constant.
- movq $_dst, _ptr(%rip)
-
-Note, however, we should continue to use RIP relative addressing mode as much as
-possible. The above is actually one byte shorter than
- movq $_dst, _ptr
-
-A better example is the code from PR1018. We are generating:
- leaq xcalloc2(%rip), %rax
- movq %rax, 8(%rsp)
-when we should be generating:
- movq $xcalloc2, 8(%rsp)
-
-The reason the better codegen isn't done now is support for static small
-code model in JIT mode. The JIT cannot ensure that all GV's are placed in the
-lower 4G so we are not treating GV labels as 32-bit values.
-
-//===---------------------------------------------------------------------===//
-
Right now the asm printer assumes GlobalAddress are accessed via RIP relative
addressing. Therefore, it is not possible to generate this:
movabsq $__ZTV10polynomialIdE+16, %rax
instruction. We should probably introduce something like AbsoluteAddress to
distinguish it from GlobalAddress so the asm printer and JIT code emitter can
do the right thing.
+
+//===---------------------------------------------------------------------===//
+
+It's not possible to reference AH, BH, CH, and DH registers in an instruction
+requiring REX prefix. However, divb and mulb both produce results in AH. If isel
+emits a CopyFromReg which gets turned into a movb and that can be allocated a
+r8b - r15b.
+
+To get around this, isel emits a CopyFromReg from AX and then right shift it
+down by 8 and truncate it. It's not pretty but it works. We need some register
+allocation magic to make the hack go away (e.g. putting additional constraints
+on the result of the movb).
+
+//===---------------------------------------------------------------------===//
+
+The x86-64 ABI for hidden-argument struct returns requires that the
+incoming value of %rdi be copied into %rax by the callee upon return.
+
+The idea is that it saves callers from having to remember this value,
+which would often require a callee-saved register. Callees usually
+need to keep this value live for most of their body anyway, so it
+doesn't add a significant burden on them.
+
+We currently implement this in codegen, however this is suboptimal
+because it means that it would be quite awkward to implement the
+optimization for callers.
+
+A better implementation would be to relax the LLVM IR rules for sret
+arguments to allow a function with an sret argument to have a non-void
+return type, and to have the front-end to set up the sret argument value
+as the return value of the function. The front-end could more easily
+emit uses of the returned struct value to be in terms of the function's
+lowered return value, and it would free non-C frontends from a
+complication only required by a C-based ABI.
+
+//===---------------------------------------------------------------------===//