Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.
The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types whas checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.
Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.
I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonesense.
I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228899
91177308-0d34-0410-b5e6-
96231b3b80d8
VectorType *VectorType::get(Type *elementType, unsigned NumElements) {
Type *ElementType = const_cast<Type*>(elementType);
assert(NumElements > 0 && "#Elements of a VectorType must be greater than 0");
- assert(isValidElementType(ElementType) &&
- "Elements of a VectorType must be a primitive type");
-
+ assert(isValidElementType(ElementType) && "Element type of a VectorType must "
+ "be an integer, floating point, or "
+ "pointer type.");
+
LLVMContextImpl *pImpl = ElementType->getContext().pImpl;
VectorType *&Entry = ElementType->getContext().pImpl
->VectorTypes[std::make_pair(ElementType, NumElements)];
// This limit is useful for very large basic blocks.
static const unsigned MaxMemDepDistance = 160;
+/// \brief Predicate for the element types that the SLP vectorizer supports.
+///
+/// The most important thing to filter here are types which are invalid in LLVM
+/// vectors. We also filter target specific types which have absolutely no
+/// meaningful vectorization path such as x86_fp80 and ppc_f128. This just
+/// avoids spending time checking the cost model and realizing that they will
+/// be inevitably scalarized.
+static bool isValidElementType(Type *Ty) {
+ return VectorType::isValidElementType(Ty) && !Ty->isX86_FP80Ty() &&
+ !Ty->isPPC_FP128Ty();
+}
+
/// \returns the parent basic block if all of the instructions in \p VL
/// are in the same block or null otherwise.
static BasicBlock *getSameBlock(ArrayRef<Value *> VL) {
Type *SrcTy = VL0->getOperand(0)->getType();
for (unsigned i = 0; i < VL.size(); ++i) {
Type *Ty = cast<Instruction>(VL[i])->getOperand(0)->getType();
- if (Ty != SrcTy || Ty->isAggregateType() || Ty->isVectorTy()) {
+ if (Ty != SrcTy || !isValidElementType(Ty)) {
BS.cancelScheduling(VL);
newTreeEntry(VL, false);
DEBUG(dbgs() << "SLP: Gathering casts with different src types.\n");
// Check that the pointer points to scalars.
Type *Ty = SI->getValueOperand()->getType();
- if (Ty->isAggregateType() || Ty->isVectorTy())
+ if (!isValidElementType(Ty))
continue;
// Find the base pointer.
for (int i = 0, e = VL.size(); i < e; ++i) {
Type *Ty = VL[i]->getType();
- if (Ty->isAggregateType() || Ty->isVectorTy())
+ if (!isValidElementType(Ty))
return false;
Instruction *Inst = dyn_cast<Instruction>(VL[i]);
if (!Inst || Inst->getOpcode() != Opcode0)
return false;
Type *Ty = B->getType();
- if (Ty->isVectorTy())
+ if (!isValidElementType(Ty))
return false;
ReductionOpcode = B->getOpcode();
--- /dev/null
+; RUN: opt < %s -basicaa -slp-vectorizer -S -mcpu=corei7-avx | FileCheck %s
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define void @test1(x86_mmx %a, x86_mmx %b, i64* %ptr) {
+; Ensure we can handle x86_mmx values which are primitive and can be bitcast
+; with integer types but can't be put into a vector.
+;
+; CHECK-LABEL: @test1
+; CHECK: store i64
+; CHECK: store i64
+; CHECK: ret void
+entry:
+ %a.cast = bitcast x86_mmx %a to i64
+ %b.cast = bitcast x86_mmx %b to i64
+ %a.and = and i64 %a.cast, 42
+ %b.and = and i64 %b.cast, 42
+ %gep = getelementptr i64* %ptr, i32 1
+ store i64 %a.and, i64* %ptr
+ store i64 %b.and, i64* %gep
+ ret void
+}
+
+define void @test2(x86_mmx %a, x86_mmx %b) {
+; Same as @test1 but using phi-input vectorization instead of store
+; vectorization.
+;
+; CHECK-LABEL: @test2
+; CHECK: and i64
+; CHECK: and i64
+; CHECK: ret void
+entry:
+ br i1 undef, label %if.then, label %exit
+
+if.then:
+ %a.cast = bitcast x86_mmx %a to i64
+ %b.cast = bitcast x86_mmx %b to i64
+ %a.and = and i64 %a.cast, 42
+ %b.and = and i64 %b.cast, 42
+ br label %exit
+
+exit:
+ %a.phi = phi i64 [ 0, %entry ], [ %a.and, %if.then ]
+ %b.phi = phi i64 [ 0, %entry ], [ %b.and, %if.then ]
+ tail call void @f(i64 %a.phi, i64 %b.phi)
+ ret void
+}
+
+declare void @f(i64, i64)