[x86] Restore the bitcasts I removed when refactoring this to avoid
shifting vectors of bytes as x86 doesn't have direct support for that.
This removes a bunch of redundant masking in the generated code for SSE2
and SSE3.
In order to avoid the really significant code size growth this would
have triggered, I also factored the completely repeatative logic for
shifting and masking into two lambdas which in turn makes all of this
much easier to read IMO.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238637
91177308-0d34-0410-b5e6-
96231b3b80d8