optimize single MBB loops better. In particular, produce:
LBB1_57: #bb207.i
movl 72(%esp), %ecx
movb (%ecx,%eax), %cl
movl 80(%esp), %edx
movb %cl, 1(%edx,%eax)
incl %eax
cmpl $143, %eax
jne LBB1_57 #bb207.i
jmp LBB1_64 #cond_next255.i
intead of:
LBB1_57: #bb207.i
movl 72(%esp), %ecx
movb (%ecx,%eax), %cl
movl 80(%esp), %edx
movb %cl, 1(%edx,%eax)
incl %eax
cmpl $143, %eax
je LBB1_64 #cond_next255.i
jmp LBB1_57 #bb207.i
This eliminates a branch per iteration of the loop. This hurted PPC
particularly, because the extra branch meant another dispatch group for each
iteration of the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@31530
91177308-0d34-0410-b5e6-
96231b3b80d8