--
-We don't support prefetching yet.
-
---
-
There is no scheduling support.
--
--
-We don't optimize block memory operations, except using single MVCs
-for memcpy and single CLCs for memcmp.
+We only use MVC, XC and CLC for constant-length block operations.
+We could extend them to variable-length operations too,
+using EXECUTE RELATIVE LONG.
-It's definitely worth using things like NC, XC and OC with
-constant lengths. MVCIN may be worthwhile too.
-
-We should probably implement general memcpy using MVC with EXECUTE.
-Likewise memcmp and CLC. MVCLE and CLCLE could be useful too.
+MVCIN, MVCLE and CLCLE may be worthwhile too.
--
-We don't optimize string operations.
-
-MVST, CLST, SRST and CUSE could be useful here. Some of the TRANSLATE
-family might be too, although they are probably more difficult to exploit.
+We don't use CUSE or the TRANSLATE family of instructions for string
+operations. The TRANSLATE ones are probably more difficult to exploit.
--
--
-We could take advantage of the various ... UNDER MASK instructions,
-such as ICM and STCM.
-
---
-
-DAGCombiner can detect integer absolute, but there's not yet an associated
-ISD opcode. We could add one and implement it using LOAD POSITIVE.
-Negated absolutes could use LOAD NEGATIVE.
+We don't use ICM or STCM.
--
--
-Atomic loads and stores use the default compare-and-swap based implementation.
-This is much too conservative in practice, since the architecture guarantees
-that 1-, 2-, 4- and 8-byte loads and stores to aligned addresses are
-inherently atomic.
-
---
-
If needed, we can support 16-byte atomics using LPQ, STPQ and CSDG.
--