Programming Languages Research Group: Git

author	Nicolai Haehnle <nhaehnle@gmail.com>
	Thu, 7 Jan 2016 17:10:29 +0000 (17:10 +0000)
committer	Nicolai Haehnle <nhaehnle@gmail.com>
	Thu, 7 Jan 2016 17:10:29 +0000 (17:10 +0000)
commit	702b589510d04e13e27d4de31cf8b3f1c5455fd2
tree	cd8c95d5fa91673719806e54436adc3533b6a7d6
parent	64f913f14fc5c4e6945b257a67d851a192747375

AMDGPU/SI: Fold operands with sub-registers

Summary:
Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs,
increasing the code size and VGPR pressure. These moves are now folded away.

Note that this lack of operand folding was not a problem for VMEM loads,
because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register
coalescer.
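
As an illustration of the kind of fold described in the summary, here is a minimal toy model. It is not the code in SIFoldOperands.cpp and does not use LLVM's MachineInstr API: the types Operand and Instr, the functions foldSubRegCopies and removeDeadCopies, and the numeric sub-register encoding are all hypothetical names introduced for this sketch. It also assumes every fold is legal and ignores the operand constraints a real backend has to check.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified operand: a virtual register number plus a
// sub-register index (0 means "the whole register").
struct Operand {
  unsigned Reg = 0;
  unsigned SubReg = 0;
};

// Hypothetical, simplified instruction: one definition and a list of uses.
struct Instr {
  std::string Opcode;        // e.g. "COPY" or "V_ADD_F32"
  Operand Def;               // register written by the instruction
  std::vector<Operand> Uses; // registers read by the instruction
};

// Rewrite every full-register use of a COPY result so that it reads the
// COPY's source operand directly, keeping the source's sub-register index.
// Returns the number of use operands that were rewritten.
// Assumption: every such rewrite is legal for the using instruction.
unsigned foldSubRegCopies(std::vector<Instr> &Block) {
  unsigned Folded = 0;
  for (const Instr &Copy : Block) {
    if (Copy.Opcode != "COPY" || Copy.Uses.size() != 1)
      continue;
    assert(Copy.Def.Reg != 0 && "a COPY must define a register");
    const Operand Src = Copy.Uses[0];
    for (Instr &User : Block) {
      if (&User == &Copy)
        continue;
      for (Operand &Use : User.Uses) {
        if (Use.Reg == Copy.Def.Reg && Use.SubReg == 0) {
          Use = Src; // read the original register and sub-register instead
          ++Folded;
        }
      }
    }
  }
  return Folded;
}

// Drop COPY instructions whose result is no longer read by any instruction.
void removeDeadCopies(std::vector<Instr> &Block) {
  std::vector<Instr> Kept;
  for (const Instr &I : Block) {
    bool IsRead = false;
    for (const Instr &Other : Block)
      for (const Operand &Use : Other.Uses)
        IsRead |= (Use.Reg == I.Def.Reg);
    if (I.Opcode != "COPY" || IsRead)
      Kept.push_back(I);
  }
  Block = Kept;
}
```

In this model, a block holding a two-dword load into register 1, a COPY of its first sub-register into register 2, and an add that reads register 2 ends up with the add reading the sub-register of register 1 directly. The COPY then has no readers and is removed.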

Some tests are updated. Note that the fsub.ll test explicitly checks that
the move is elided.

With the IR generated by current Mesa, the changes are relatively minor:

7063 shaders in 3531 tests
Totals:
SGPRS: 351872 -> 352560 (0.20 %)
VGPRS: 199984 -> 200732 (0.37 %)
Code Size: 9876968 -> 9881112 (0.04 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave
Wait states: 295164 -> 295337 (0.06 %)

Totals from affected shaders:
SGPRS: 65784 -> 66472 (1.05 %)
VGPRS: 38064 -> 38812 (1.97 %)
Code Size: 1993828 -> 1997972 (0.21 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Scratch: 795648 -> 783360 (-1.54 %) bytes per wave
Wait states: 54026 -> 54199 (0.32 %)
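
The percentages in both tables are consistent with a plain relative change, (after - before) / before, shown to two decimal places. A minimal sketch for rechecking them follows. The helper names percentChange and roundToHundredths are hypothetical, and round-to-nearest is an assumption about how the report was formatted.

```cpp
#include <cassert>
#include <cmath>

// Relative change from `before` to `after`, in percent.
double percentChange(long long before, long long after) {
  assert(before != 0 && "relative change is undefined for a zero baseline");
  return 100.0 * static_cast<double>(after - before) /
         static_cast<double>(before);
}

// Round to two decimal places, the precision used in the tables above.
// Assumption: the report rounds to nearest rather than truncating.
double roundToHundredths(double value) {
  return std::round(value * 100.0) / 100.0;
}
```

For example, roundToHundredths(percentChange(199984, 200732)) gives 0.37 for the overall VGPR count, and roundToHundredths(percentChange(795648, 783360)) gives -1.54 for scratch in the affected shaders.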

Reviewers: tstellarAMD, arsenm, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15875

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257074 91177308-0d34-0410-b5e6-96231b3b80d8

lib/Target/AMDGPU/SIFixSGPRCopies.cpp
lib/Target/AMDGPU/SIFoldOperands.cpp
lib/Target/AMDGPU/SIInstrInfo.cpp
lib/Target/AMDGPU/SIRegisterInfo.cpp
test/CodeGen/AMDGPU/fmin_legacy.ll
test/CodeGen/AMDGPU/fsub.ll
test/CodeGen/AMDGPU/llvm.round.f64.ll