From: Hans Wennborg
Date: Fri, 19 Feb 2016 21:35:00 +0000 (+0000)
Subject: Merging r261360:
X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=78e9cd40a2ea27cc9300d900a7dccc75940f9eb0;p=oota-llvm.git

Merging r261360:
------------------------------------------------------------------------
r261360 | dim | 2016-02-19 12:14:11 -0800 (Fri, 19 Feb 2016) | 19 lines

Fix incorrect selection of AVX512 sqrt when OptForSize is on

Summary:
When optimizing for size, sqrt calls can be incorrectly selected as
AVX512 VSQRT instructions.  This is because X86InstrAVX512.td has a
`Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass
definition.  Even if the target does not support AVX512, the class can
apparently still be chosen, leading to an incorrect selection of
`vsqrtss`.

In PR26625, this led to an assertion: Reg >= X86::FP0 && Reg <=
X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction
requires an XMM register, which is not available on i686 CPUs.

Reviewers: grosbach, resistor, joker.eph

Subscribers: spatel, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D17414

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@261367 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/lib/Target/X86/X86InstrAVX512.td b/lib/Target/X86/X86InstrAVX512.td
index 49be6488393..6f0199b015c 100644
--- a/lib/Target/X86/X86InstrAVX512.td
+++ b/lib/Target/X86/X86InstrAVX512.td
@@ -5896,7 +5896,7 @@ multiclass avx512_sqrt_scalar<bits<8> opc, string OpcodeStr,X86VectorVTInfo _,
 
   def : Pat<(_.EltVT (OpNode (load addr:$src))),
             (!cast<Instruction>(NAME#SUFF#Zm)
-                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[OptForSize]>;
+                (_.EltVT (IMPLICIT_DEF)), addr:$src)>, Requires<[HasAVX512, OptForSize]>;
 }
 
 multiclass avx512_sqrt_scalar_all<bits<8> opc, string OpcodeStr> {
diff --git a/test/CodeGen/X86/pr26625.ll b/test/CodeGen/X86/pr26625.ll
new file mode 100644
index 00000000000..1b2e227bb59
--- /dev/null
+++ b/test/CodeGen/X86/pr26625.ll
@@ -0,0 +1,20 @@
+; RUN: llc < %s -mcpu=i686 2>&1 | FileCheck %s
+; PR26625
+
+target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
+target triple = "i386"
+
+define float @x0(float %f) #0 {
+entry:
+  %call = tail call float @sqrtf(float %f) #1
+  ret float %call
+; CHECK-LABEL: x0:
+; CHECK: flds
+; CHECK-NEXT: fsqrt
+; CHECK-NOT: vsqrtss
+}
+
+declare float @sqrtf(float) #0
+
+attributes #0 = { nounwind optsize readnone }
+attributes #1 = { nounwind optsize readnone }
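
Note on the predicate change: the one-line edit to `Requires<[...]>` in this patch can be pictured with a small toy model. The Python sketch below is not LLVM's instruction selector; `Pattern`, `select`, and the predicate sets are hypothetical names used only for illustration, and the first-match ordering is an assumption of the model. It shows that a pattern guarded by `OptForSize` alone can match on a target without AVX512, whereas a pattern guarded by both `HasAVX512` and `OptForSize` cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Toy stand-in for a TableGen Pat with a Requires<[...]> list (hypothetical)."""
    instruction: str
    requires: frozenset


def select(patterns, active):
    """Return the first instruction whose predicates all hold for the target."""
    for pat in patterns:
        # A pattern is eligible only if every required predicate is active.
        if pat.requires <= active:
            return pat.instruction
    return None


# Assumed predicate set for an optsize function on i686: no AVX512 available.
i686_optsize = frozenset({"OptForSize"})

# Before the fix: the AVX512 sqrt pattern is guarded by OptForSize only.
before = [
    Pattern("vsqrtss", frozenset({"OptForSize"})),
    Pattern("fsqrt", frozenset()),
]

# After the fix: the same pattern also requires HasAVX512.
after = [
    Pattern("vsqrtss", frozenset({"HasAVX512", "OptForSize"})),
    Pattern("fsqrt", frozenset()),
]

selected_before = select(before, i686_optsize)
selected_after = select(after, i686_optsize)
print(selected_before, selected_after)
```

In this model the i686 target picks `vsqrtss` with the old guard and falls through to `fsqrt` with the new one, which mirrors what the `CHECK` and `CHECK-NOT` lines in pr26625.ll assert about the real output.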