[X86, AVX] improve insertion into zero element of 256-bit vector
authorSanjay Patel <spatel@rotateright.com>
Wed, 25 Mar 2015 17:36:01 +0000 (17:36 +0000)
committerSanjay Patel <spatel@rotateright.com>
Wed, 25 Mar 2015 17:36:01 +0000 (17:36 +0000)
commite53dbeb2addf8e19cad61227d29e117392e00994
tree38a7cbb8a352d7724dc746683feae2e9fd18f5e1
parent62dd074fe9b4399c3ca18e20eef0f367c0af9348
[X86, AVX] improve insertion into zero element of 256-bit vector

This patch allows AVX blend instructions to handle insertion into the low
element of a 256-bit vector for the appropriate data types.

For f32, instead of:

   vblendps $1, %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1,2,3]
   vblendps $15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]

we get:

   vblendps $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]

For f64, instead of:

   vmovsd %xmm1, %xmm0, %xmm1     ## xmm1 = xmm1[0],xmm0[1]
   vblendpd $3, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1],ymm0[2,3]

we get:

   vblendpd $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3]

For the hardware-neglected integer data types, I left a TODO comment in the
code and added regression tests for a follow-on patch.

Differential Revision: http://reviews.llvm.org/D8609

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233199 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/avx-insertelt.ll [new file with mode: 0644]