[x86] Teach the new vector shuffle lowering to re-use the SHUFPS
lowering when it can use a symmetric SHUFPS across both 128-bit lanes.
This required making the SHUFPS lowering tolerant of other vector types,
and adjusting our canonicalization to canonicalize harder.
This is the last of the clever uses of symmetry I've thought of for
v8f32. The rest of the tricks I'm aware of here are to work around
assymetry in the mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218216
91177308-0d34-0410-b5e6-
96231b3b80d8