AMDGPU: Make SIInsertWaits about a factor of 4 faster
author Matt Arsenault <Matthew.Arsenault@amd.com>
Thu, 1 Oct 2015 21:43:15 +0000 (21:43 +0000)
committer Matt Arsenault <Matthew.Arsenault@amd.com>
Thu, 1 Oct 2015 21:43:15 +0000 (21:43 +0000)
commit ae6db4bdd734d141c614220d3d560d0754b4ff7f
tree 6ee484dd5790c7ea949a5151be68a943263cb796
parent 646073b30f09ba58353698432dccf8a3af0ff895

This was the slowest target custom pass, spending 80% of its
time in getMinimalPhysRegClass, which was called for every
register operand.

When possible, use the statically known register class from the
instruction's MCOperandInfo. A few pseudo instructions are not well
behaved and have unknown register classes, so they still require the
expensive physical register class search.
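
The fast-path/slow-path split described above can be sketched roughly as
follows. This is a self-contained toy model, not the code from this commit:
RegClass, OperandInfo, InstrDesc, RegClassTable and getOperandRegClass are
hypothetical stand-ins for the LLVM types, and the convention that a class
ID of -1 means "unknown" is an assumption made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a register class: a contiguous range of
// physical register numbers. Its ID is its index in RegClassTable::Classes.
struct RegClass {
  unsigned FirstReg;
  unsigned LastReg;
};

// Hypothetical stand-in for per-operand static info.
// RegClassID == -1 is assumed to mean "no statically known class".
struct OperandInfo {
  int RegClassID;
};

// Hypothetical stand-in for an instruction description. Operands beyond
// this list (for example implicit ones) have no static info.
struct InstrDesc {
  std::vector<OperandInfo> Operands;
};

struct RegClassTable {
  std::vector<RegClass> Classes;
  mutable unsigned SlowLookups = 0; // counts how often the search runs

  // Slow path: scan every class for the smallest one containing Reg.
  const RegClass *findMinimalClass(unsigned Reg) const {
    ++SlowLookups;
    const RegClass *Best = nullptr;
    for (const RegClass &RC : Classes) {
      if (Reg < RC.FirstReg || Reg > RC.LastReg)
        continue;
      if (!Best ||
          RC.LastReg - RC.FirstReg < Best->LastReg - Best->FirstReg)
        Best = &RC;
    }
    return Best;
  }
};

// Prefer the class recorded in the static operand info. Fall back to the
// search only when the operand has no usable static class.
const RegClass *getOperandRegClass(const RegClassTable &Table,
                                   const InstrDesc &Desc, std::size_t OpNo,
                                   unsigned Reg) {
  if (OpNo < Desc.Operands.size()) {
    int ID = Desc.Operands[OpNo].RegClassID;
    if (ID >= 0 && static_cast<std::size_t>(ID) < Table.Classes.size())
      return &Table.Classes[ID]; // fast path: constant-time lookup
  }
  return Table.findMinimalClass(Reg); // slow path: linear search
}
```

In this model the search runs only for operands whose static class is
unknown or that lie outside the described operand list, which is the
property the change relies on for its speedup.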

There are a few other possibilities for making this even faster,
such as not inspecting implicit operands. For now those are still
checked, because it is technically possible to have a scalar load
into exec or vcc, which can then be used implicitly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249079 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/AMDGPU/SIInsertWaits.cpp
lib/Target/AMDGPU/SIRegisterInfo.cpp