Programming Languages Research Group: Git

author	Sanjay Patel <spatel@rotateright.com>
	Wed, 25 Mar 2015 17:36:01 +0000 (17:36 +0000)
committer	Sanjay Patel <spatel@rotateright.com>
	Wed, 25 Mar 2015 17:36:01 +0000 (17:36 +0000)
commit	e53dbeb2addf8e19cad61227d29e117392e00994
tree	38a7cbb8a352d7724dc746683feae2e9fd18f5e1
parent	62dd074fe9b4399c3ca18e20eef0f367c0af9348

[X86, AVX] improve insertion into zero element of 256-bit vector

This patch allows AVX blend instructions to handle insertion into the low
element of a 256-bit vector for the appropriate data types.

For f32, instead of:

   vblendps $1, %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1,2,3]
   vblendps $15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]

we get:

   vblendps $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]

For f64, instead of:

   vmovsd %xmm1, %xmm0, %xmm1     ## xmm1 = xmm1[0],xmm0[1]
   vblendpd $3, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1],ymm0[2,3]

we get:

   vblendpd $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3]
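
To see why the single blend is equivalent to the old two-instruction sequences, here is a minimal scalar sketch of the immediate-blend semantics. It is a model for illustration only, not code from the patch: the names `blend`, `insert_low_old`, and `insert_low_new` are hypothetical, lanes are held in `std::array`, and the upper half of the temporary is assumed to be zero after the 128-bit step.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of an immediate blend (vblendps/vblendpd style):
// bit i of `mask` set selects lane i from `b`, otherwise lane i from `a`.
template <typename T, std::size_t N>
std::array<T, N> blend(const std::array<T, N>& a,
                       const std::array<T, N>& b,
                       unsigned mask) {
    assert(mask < (1u << N));  // only N mask bits are meaningful
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = ((mask >> i) & 1u) ? b[i] : a[i];
    return r;
}

// New lowering: one full-width blend with mask 1 takes lane 0 from
// `scalar` and every other lane from `vec`.
template <typename T, std::size_t N>
std::array<T, N> insert_low_new(const std::array<T, N>& vec,
                                const std::array<T, N>& scalar) {
    return blend(vec, scalar, 1u);
}

// Old lowering, modeled in two steps:
//   1. a 128-bit operation builds {scalar[0], vec[1], ...} in the low half
//      (upper half assumed zero here);
//   2. a full-width blend takes the whole low half from that temporary.
template <typename T, std::size_t N>
std::array<T, N> insert_low_old(const std::array<T, N>& vec,
                                const std::array<T, N>& scalar) {
    constexpr std::size_t Half = N / 2;
    std::array<T, N> tmp{};
    for (std::size_t i = 0; i < Half; ++i)
        tmp[i] = (i == 0) ? scalar[i] : vec[i];
    // Mask is 15 for 8 x f32 and 3 for 4 x f64.
    return blend(vec, tmp, (1u << Half) - 1u);
}
```

Both functions produce the same lanes for 8 x float and 4 x double inputs, which is the property the patch relies on when it replaces the two-instruction sequence with a single blend.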

For the hardware-neglected integer data types, I left a TODO comment in the
code and added regression tests for a follow-on patch.

Differential Revision: http://reviews.llvm.org/D8609

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233199 91177308-0d34-0410-b5e6-96231b3b80d8

lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/avx-insertelt.ll	[new file with mode: 0644]