x86: Emit LAHF/SAHF instead of PUSHF/POPF
NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (privileged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS; this commit now generates LAHF/SAHF instead, for all of x86 (not just NaCl), because it leads to an overall performance gain over PUSHF/POPF.
As with the previous patch, this code generation is pretty bad because it occurs very late, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire.
I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]]; the results on an Intel Haswell E5-2690 CPU at 2.9GHz are:
| Time per call (ms) | Runtime (ms) | Benchmark |
| 0.000012514 | 6257 | sete.i386 |
| 0.000012810 | 6405 | sete.i386-fast |
| 0.000010456 | 5228 | sete.x86-64 |
| 0.000010496 | 5248 | sete.x86-64-fast |
| 0.000012906 | 6453 | lahf-sahf.i386 |
| 0.000013236 | 6618 | lahf-sahf.i386-fast |
| 0.000010580 | 5290 | lahf-sahf.x86-64 |
| 0.000010304 | 5152 | lahf-sahf.x86-64-fast |
| 0.000028056 | 14028 | pushf-popf.i386 |
| 0.000027160 | 13580 | pushf-popf.i386-fast |
| 0.000023810 | 11905 | pushf-popf.x86-64 |
| 0.000026468 | 13234 | pushf-popf.x86-64-fast |
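Clearly `PUSHF`/`POPF` are suboptimal: they take more than twice as long as either SETE or LAHF/SAHF in every configuration measured. LAHF/SAHF, meanwhile, stays within a few percent of SETE, so it doesn't really seem to be worth teaching LLVM about individual flags, at least not for this purpose.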
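For readers who want to check the comparison, here is a minimal sketch that normalizes each runtime against the SETE run for the same target. The runtimes are copied from the table; the names `RUNTIME_MS` and `slowdown_vs_sete` are made up for illustration and are not part of the benchmark repository.

```python
# Runtime in ms per benchmark, copied from the results table.
RUNTIME_MS = {
    "sete.i386": 6257,
    "sete.i386-fast": 6405,
    "sete.x86-64": 5228,
    "sete.x86-64-fast": 5248,
    "lahf-sahf.i386": 6453,
    "lahf-sahf.i386-fast": 6618,
    "lahf-sahf.x86-64": 5290,
    "lahf-sahf.x86-64-fast": 5152,
    "pushf-popf.i386": 14028,
    "pushf-popf.i386-fast": 13580,
    "pushf-popf.x86-64": 11905,
    "pushf-popf.x86-64-fast": 13234,
}


def slowdown_vs_sete(runtimes):
    """Map each benchmark to its runtime divided by the matching sete run.

    Hypothetical helper: benchmark names are "<method>.<target>", and the
    baseline for a given target is "sete.<target>".
    """
    ratios = {}
    for name, ms in runtimes.items():
        _method, target = name.split(".", 1)
        ratios[name] = ms / runtimes["sete." + target]
    return ratios


if __name__ == "__main__":
    for name, ratio in slowdown_vs_sete(RUNTIME_MS).items():
        print(f"{name:24s} {ratio:.2f}x")
```

On these numbers, every pushf-popf run lands above 2x the SETE baseline, while every lahf-sahf run stays within about 5% of it.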
Reviewers: rnk, jvoung, t.p.northover
Subscribers: llvm-commits
Differential revision: http://reviews.llvm.org/D6629
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244503 91177308-0d34-0410-b5e6-96231b3b80d8