When INIT/SIPI sequence is sent to VCPU which before that
was in use by OS, VMRUN might fail with:
KVM: entry failed, hardware error 0xffffffff
EAX=
00000000 EBX=
00000000 ECX=
00000000 EDX=
000006d3
ESI=
00000000 EDI=
00000000 EBP=
00000000 ESP=
00000000
EIP=
00000000 EFL=
00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000
00000000 0000ffff 00009300
CS =9a00
0009a000 0000ffff 00009a00
[...]
CR0=
60000010 CR2=
b6f3e000 CR3=
01942000 CR4=
000007e0
[...]
EFER=
0000000000000000
with corresponding SVM error:
KVM: FAILED VMRUN WITH VMCB:
[...]
cpl: 0 efer:
0000000000001000
cr0:
0000000080010010 cr2:
00007fd7fe85bf90
cr3:
0000000187d0c000 cr4:
0000000000000020
[...]
What happens is that VCPU state right after offlinig:
CR0: 0x80050033 EFER: 0xd01 CR4: 0x7e0
-> long mode with CR3 pointing to longmode page tables
and when VCPU gets INIT/SIPI following transition happens
CR0: 0 -> 0x60000010 EFER: 0x0 CR4: 0x7e0
-> paging disabled with stale CR3
However SVM under the hood puts VCPU in Paged Real Mode*
which effectively translates CR0 0x60000010 ->
80010010 after
svm_vcpu_reset()
-> init_vmcb()
-> kvm_set_cr0()
-> svm_set_cr0()
but from kvm_set_cr0() perspective CR0: 0 -> 0x60000010
only caching bits are changed and
commit
d81135a57aa6
("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")'
regressed svm_vcpu_reset() which relied on MMU being reset.
As result VMRUN after svm_vcpu_reset() tries to run
VCPU in Paged Real Mode with stale MMU context (longmode page tables),
which causes some AMD CPUs** to bail out with VMEXIT_INVALID.
Fix issue by unconditionally resetting MMU context
at init_vmcb() time.
* AMD64 Architecture Programmer’s Manual,
Volume 2: System Programming, rev: 3.25
15.19 Paged Real Mode
** Opteron 1216
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Fixes: d81135a57aa6
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>