[X86][SSE] Vectorized v4i32 non-uniform shifts.
author Simon Pilgrim <llvm-dev@redking.me.uk>
Sun, 12 Jul 2015 11:15:19 +0000 (11:15 +0000)
committer Simon Pilgrim <llvm-dev@redking.me.uk>
Sun, 12 Jul 2015 11:15:19 +0000 (11:15 +0000)
commit f9df477221e294eaf1ee3bf0a88a88c9b9628046
tree 834b2dcc73dabbd81e1bab366a5b21c104ffef7d
parent a1b821fac967dedd4b39d2dc8ce58e54a6cecacd

While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr operations are still scalarized.
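
For readers unfamiliar with the existing shl pattern, here is a minimal portable C++ model of the idea, not the actual lowering code. The function name shlViaMultiply and the V4 alias are hypothetical, and the sketch assumes every shift amount is in the range 0..31. Each amount is turned into the float 2^amt by writing it into the exponent field, converted to an integer (the cvttps2dq step), and then multiplied into the value (the pmulld step).

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical 4 x i32 vector type used only for this illustration.
using V4 = std::array<uint32_t, 4>;

// Models "shl via multiply": v[i] << amt[i] computed as v[i] * 2^amt[i].
// Assumes amt[i] is in 0..31.
V4 shlViaMultiply(const V4 &v, const V4 &amt) {
  V4 out{};
  for (int i = 0; i < 4; ++i) {
    // Place amt in the float exponent field: bit pattern of 2^amt.
    uint32_t bits = (amt[i] << 23) + 0x3f800000u;
    float scale;
    std::memcpy(&scale, &bits, sizeof(scale));
    // Float-to-integer conversion (the cvttps2dq step). Going through
    // int64_t keeps the amt == 31 case well-defined in portable C++.
    uint32_t pow2 = static_cast<uint32_t>(static_cast<int64_t>(scale));
    // 32-bit multiply keeping the low half (the pmulld step).
    out[i] = v[i] * pow2;
  }
  return out;
}
```

There is no comparable multiply trick for right shifts, which is presumably why lshr/ashr needed a different approach.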

This patch adds vectorization support for non-uniform v4i32 shift operations. Constant shift amounts are splatted so that each one can use the immediate SSE shift instructions; non-constant shift amounts are extracted and zero-extended so that each one can drive a whole-vector shift. The individual results are then blended together.
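
The shift-then-blend strategy can be sketched in portable C++ as follows. This is an illustrative model under stated assumptions, not the code added to X86ISelLowering.cpp: the names uniformLshr and nonUniformLshr and the V4 alias are hypothetical, and only the logical (lshr) case is shown. The helper mimics a psrld-style shift, where one scalar count applies to every lane and counts above 31 produce zero.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4 x i32 vector type used only for this illustration.
using V4 = std::array<uint32_t, 4>;

// Models a uniform logical right shift: all lanes use the same count,
// and a count above 31 clears every lane.
V4 uniformLshr(const V4 &v, uint64_t count) {
  V4 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = count > 31 ? 0u : (v[i] >> count);
  return r;
}

// Non-uniform shift built from four uniform shifts plus a blend.
V4 nonUniformLshr(const V4 &v, const V4 &amt) {
  V4 out{};
  for (int lane = 0; lane < 4; ++lane) {
    // Extract this lane's amount and zero-extend it to 64 bits.
    uint64_t count = amt[lane];
    // Shift the whole vector by that single amount.
    V4 shifted = uniformLshr(v, count);
    // Blend: keep only the lane that this amount belongs to.
    out[lane] = shifted[lane];
  }
  return out;
}
```

The ashr case would follow the same shape with an arithmetic uniform shift in place of uniformLshr. The trade-off is four whole-vector shifts plus blends instead of four scalar extract/shift/insert sequences.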

Differential Revision: http://reviews.llvm.org/D11063

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241989 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/X86/X86ISelLowering.cpp
lib/Target/X86/X86TargetTransformInfo.cpp
test/Analysis/CostModel/X86/testshiftashr.ll
test/Analysis/CostModel/X86/testshiftlshr.ll
test/CodeGen/X86/vector-shift-ashr-128.ll
test/CodeGen/X86/vector-shift-ashr-256.ll
test/CodeGen/X86/vector-shift-lshr-128.ll
test/CodeGen/X86/vector-shift-lshr-256.ll
test/CodeGen/X86/widen_load-2.ll