firefly-linux-kernel-4.4.55.git
10 years agoKVM: Add KVM_EXIT_SYSTEM_EVENT to user space API header
Anup Patel [Tue, 29 Apr 2014 05:54:19 +0000 (11:24 +0530)]
KVM: Add KVM_EXIT_SYSTEM_EVENT to user space API header

Currently, we don't have an exit reason to notify user space about
a system-level event (for e.g. system reset or shutdown) triggered
by the VCPU. This patch adds exit reason KVM_EXIT_SYSTEM_EVENT for
this purpose. We can also inform user space about the 'type' and
architecture specific 'flags' of a system-level event using the
kvm_run structure.

This newly added KVM_EXIT_SYSTEM_EVENT will be used by KVM ARM/ARM64
in-kernel PSCI v0.2 support to reset/shutdown VMs.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 8ad6b634928a25971dc42dce101808b1491f87ec)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM/ARM64: KVM: Make kvm_psci_call() return convention more flexible
Anup Patel [Tue, 29 Apr 2014 05:54:18 +0000 (11:24 +0530)]
ARM/ARM64: KVM: Make kvm_psci_call() return convention more flexible

Currently, the kvm_psci_call() returns 'true' or 'false' based on whether
the PSCI function call was handled successfully or not. This does not help
us emulate system-level PSCI functions where the actual emulation work will
be done by user space (QEMU or KVMTOOL). Examples of such system-level PSCI
functions are: PSCI v0.2 SYSTEM_OFF and SYSTEM_RESET.

This patch updates kvm_psci_call() to return three types of values:
1) > 0 (success)
2) = 0 (success but exit to user space)
3) < 0 (errors)

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit e8e7fcc5e2710b31ef842ee799db99c07986c364)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM/ARM64: KVM: Add base for PSCI v0.2 emulation
Anup Patel [Tue, 29 Apr 2014 05:54:16 +0000 (11:24 +0530)]
ARM/ARM64: KVM: Add base for PSCI v0.2 emulation

Currently, the in-kernel PSCI emulation provides PSCI v0.1 interface to
VCPUs. This patch extends current in-kernel PSCI emulation to provide
PSCI v0.2 interface to VCPUs.

By default, ARM/ARM64 KVM will always provide PSCI v0.1 interface for
keeping the ABI backward-compatible.

To select PSCI v0.2 interface for VCPUs, the user space (i.e. QEMU or
KVMTOOL) will have to set KVM_ARM_VCPU_PSCI_0_2 feature when doing VCPU
init using KVM_ARM_VCPU_INIT ioctl.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 7d0f84aae9e231930985eaff63ac91b61aaa15d6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM/ARM64: KVM: Add common header for PSCI related defines
Anup Patel [Tue, 29 Apr 2014 05:54:15 +0000 (11:24 +0530)]
ARM/ARM64: KVM: Add common header for PSCI related defines

We need a common place to share PSCI related defines among ARM kernel,
ARM64 kernel, KVM ARM/ARM64 PSCI emulation, and user space.

We introduce uapi/linux/psci.h for this purpose. This newly added
header will be first used by KVM ARM/ARM64 in-kernel PSCI emulation
and user space (i.e. QEMU or KVMTOOL).

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit e546eea74ec66698e29c583639cf6e2a11e46490)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: Add capability to advertise PSCI v0.2 support
Anup Patel [Tue, 29 Apr 2014 05:54:14 +0000 (11:24 +0530)]
KVM: Add capability to advertise PSCI v0.2 support

User space (i.e. QEMU or KVMTOOL) should be able to check whether KVM
ARM/ARM64 supports in-kernel PSCI v0.2 emulation. For this purpose, we
define KVM_CAP_ARM_PSCI_0_2 in KVM user space interface header.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 717abd208dff75b343243aa5ed688f62190dda5e)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: vgic: Fix the overlap check action about setting the GICD & GICC base address.
Haibin Wang [Tue, 29 Apr 2014 06:49:17 +0000 (14:49 +0800)]
KVM: ARM: vgic: Fix the overlap check action about setting the GICD & GICC base address.

Currently below check in vgic_ioaddr_overlap will always succeed,
because the vgic dist base and vgic cpu base are still kept UNDEF
after initialization. The code as follows will be return forever.

if (IS_VGIC_ADDR_UNDEF(dist) || IS_VGIC_ADDR_UNDEF(cpu))
                return 0;

So, before invoking the vgic_ioaddr_overlap, it needs to set the
corresponding base address firstly.

Signed-off-by: Haibin Wang <wanghaibin.wang@huawei.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 30c2117085bc4e05d091cee6eba79f069b41a9cd)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: async_pf: change async_pf_execute() to use get_user_pages(tsk => NULL)
Oleg Nesterov [Mon, 28 Apr 2014 15:03:00 +0000 (17:03 +0200)]
KVM: async_pf: change async_pf_execute() to use get_user_pages(tsk => NULL)

async_pf_execute() passes tsk == current to gup(), this is doesn't
hurt but unnecessary and misleading. "tsk" is only used to account
the number of faults and current is the random workqueue thread.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e9545b9f8aeb63e05818e4b3250057260bc072aa)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: async_pf: kill the unnecessary use_mm/unuse_mm async_pf_execute()
Oleg Nesterov [Mon, 21 Apr 2014 13:25:58 +0000 (15:25 +0200)]
KVM: async_pf: kill the unnecessary use_mm/unuse_mm async_pf_execute()

async_pf_execute() has no reasons to adopt apf->mm, gup(current, mm)
should work just fine even if current has another or NULL ->mm.

Recently kvm_async_page_present_sync() was added insedie the "use_mm"
section, but it seems that it doesn't need current->mm too.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d72d946d0b649b79709b99b9d5cb7269fff8afaa)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm/arm64: vgic: fix GICD_ICFGR register accesses
Andre Przywara [Thu, 10 Apr 2014 22:07:18 +0000 (00:07 +0200)]
KVM: arm/arm64: vgic: fix GICD_ICFGR register accesses

Since KVM internally represents the ICFGR registers by stuffing two
of them into one word, the offset for accessing the internal
representation and the one for the MMIO based access are different.
So keep the original offset around, but adjust the internal array
offset by one bit.

Reported-by: Haibin Wang <wanghaibin.wang@huawei.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit f2ae85b2ab3776b9e4e42e5b6fa090f40d396794)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: async_pf: mm->mm_users can not pin apf->mm
Oleg Nesterov [Mon, 21 Apr 2014 13:26:01 +0000 (15:26 +0200)]
KVM: async_pf: mm->mm_users can not pin apf->mm

get_user_pages(mm) is simply wrong if mm->mm_users == 0 and exit_mmap/etc
was already called (or is in progress), mm->mm_count can only pin mm->pgd
and mm_struct itself.

Change kvm_setup_async_pf/async_pf_execute to inc/dec mm->mm_users.

kvm_create_vm/kvm_destroy_vm play with ->mm_count too but this case looks
fine at first glance, it seems that this ->mm is only used to verify that
current->mm == kvm->mm.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 41c22f626254b9dc0376928cae009e73d1b6a49a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: vgic: Fix sgi dispatch problem
Haibin Wang [Thu, 10 Apr 2014 12:14:32 +0000 (13:14 +0100)]
KVM: ARM: vgic: Fix sgi dispatch problem

When dispatch SGI(mode == 0), that is the vcpu of VM should send
sgi to the cpu which the target_cpus list.
So, there must add the "break" to branch of case 0.

Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Haibin Wang <wanghaibin.wang@huawei.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 91021a6c8ffdc55804dab5acdfc7de4f278b9ac3)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm: KVM: fix possible misalignment of PGDs and bounce page
Mark Salter [Fri, 28 Mar 2014 14:25:19 +0000 (14:25 +0000)]
arm: KVM: fix possible misalignment of PGDs and bounce page

The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate
a bounce page (if hypervisor init code crosses page boundary) and
hypervisor PGDs. The problem is that kalloc() does not guarantee
the proper alignment. In the case of the bounce page, the page sized
buffer allocated may also cross a page boundary negating the purpose
and leading to a hang during kvm initialization. Likewise the PGDs
allocated may not meet the minimum alignment requirements of the
underlying MMU. This patch uses __get_free_page() to guarantee the
worst case alignment needs of the bounce page and PGDs on both arm
and arm64.

Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 5d4e08c45a6cf8f1ab3c7fa375007635ac569165)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: disable KVM in Kconfig on big-endian systems
Will Deacon [Fri, 25 Apr 2014 10:46:04 +0000 (11:46 +0100)]
ARM: KVM: disable KVM in Kconfig on big-endian systems

KVM currently crashes and burns on big-endian hosts, so don't allow it
to be selected until we've got that fixed.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 4e4468fac4381b92eb333d94256e7fb8350f3de3)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: add kvm_is_error_gpa() helper
Heiko Carstens [Wed, 1 Jan 2014 15:09:21 +0000 (16:09 +0100)]
KVM: add kvm_is_error_gpa() helper

It's quite common (in the s390 guest access code) to test if a guest
physical address points to a valid guest memory area or not.
So add a simple helper function in common code, since this might be
of interest for other architectures as well.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit dfeec843fb237d73947e818f961e8d6f0df22b01)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm, kvm: fix double lock on cpu_add_remove_lock
Ming Lei [Sun, 6 Apr 2014 17:36:08 +0000 (01:36 +0800)]
arm, kvm: fix double lock on cpu_add_remove_lock

Commit 8146875de7d4 (arm, kvm: Fix CPU hotplug callback registration)
holds the lock before calling the two functions:

kvm_vgic_hyp_init()
kvm_timer_hyp_init()

and both the two functions are calling register_cpu_notifier()
to register cpu notifier, so cause double lock on cpu_add_remove_lock.

Considered that both two functions are only called inside
kvm_arch_init() with holding cpu_add_remove_lock, so simply use
__register_cpu_notifier() to fix the problem.

Fixes: 8146875de7d4 (arm, kvm: Fix CPU hotplug callback registration)
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 553f809e23f00976caea7a1ebdabaa58a6383e7d)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: Add boot time configuration of Intermediate Physical Address size
Radha Mohan Chintakuntla [Fri, 7 Mar 2014 08:49:25 +0000 (08:49 +0000)]
arm64: Add boot time configuration of Intermediate Physical Address size

ARMv8 supports a range of physical address bit sizes. The PARange bits
from ID_AA64MMFR0_EL1 register are read during boot-time and the
intermediate physical address size bits are written in the translation
control registers (TCR_EL1 and VTCR_EL2).

There is no change in the VA bits and levels of translation.

Signed-off-by: Radha Mohan Chintakuntla <rchintakuntla@cavium.com>
Reviewed-by: Will Deacon <Will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 87366d8cf7b3f6dc34633938aa8766e5a390ce33)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: fix non-VGIC compilation
Marc Zyngier [Thu, 6 Mar 2014 03:30:46 +0000 (03:30 +0000)]
ARM: KVM: fix non-VGIC compilation

Add a stub for kvm_vgic_addr when compiling without
CONFIG_KVM_ARM_VGIC. The usefulness of this configurarion is extremely
doubtful, but let's fix it anyway (until we decide that we'll always
support a VGIC).

Reported-by: Michele Paolino <m.paolino@virtualopensystems.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6cbde8253a8143ada18ec0d1711230747a7c1934)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: fix warning in mmu.c
Marc Zyngier [Thu, 30 Jan 2014 17:38:33 +0000 (17:38 +0000)]
ARM: KVM: fix warning in mmu.c

Compiling with THP enabled leads to the following warning:

arch/arm/kvm/mmu.c: In function ‘unmap_range’:
arch/arm/kvm/mmu.c:177:39: warning: ‘pte’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   if (kvm_pmd_huge(*pmd) || page_empty(pte)) {
                                        ^
Code inspection reveals that these two cases are mutually exclusive,
so GCC is a bit overzealous here. Silence it anyway by initializing
pte to NULL and testing it later on.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 56041bf920d2937b7cadcb30cb206f0372eee814)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: trap VM system registers until MMU and caches are ON
Marc Zyngier [Tue, 14 Jan 2014 18:00:55 +0000 (18:00 +0000)]
ARM: KVM: trap VM system registers until MMU and caches are ON

In order to be able to detect the point where the guest enables
its MMU and caches, trap all the VM related system registers.

Once we see the guest enabling both the MMU and the caches, we
can go back to a saner mode of operation, which is to leave these
registers in complete control of the guest.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 8034699a42d68043b495c7e0cfafccd920707ec8)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: add world-switch for AMAIR{0,1}
Marc Zyngier [Wed, 22 Jan 2014 10:20:09 +0000 (10:20 +0000)]
ARM: KVM: add world-switch for AMAIR{0,1}

HCR.TVM traps (among other things) accesses to AMAIR0 and AMAIR1.
In order to minimise the amount of surprise a guest could generate by
trying to access these registers with caches off, add them to the
list of registers we switch/handle.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit af20814ee927ed888288d98917a766b4179c4fe0)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: introduce per-vcpu HYP Configuration Register
Marc Zyngier [Wed, 22 Jan 2014 09:43:38 +0000 (09:43 +0000)]
ARM: KVM: introduce per-vcpu HYP Configuration Register

So far, KVM/ARM used a fixed HCR configuration per guest, except for
the VI/VF/VA bits to control the interrupt in absence of VGIC.

With the upcoming need to dynamically reconfigure trapping, it becomes
necessary to allow the HCR to be changed on a per-vcpu basis.

The fix here is to mimic what KVM/arm64 already does: a per vcpu HCR
field, initialized at setup time.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit ac30a11e8e92a03dbe236b285c5cbae0bf563141)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: fix ordering of 64bit coprocessor accesses
Marc Zyngier [Tue, 21 Jan 2014 18:56:26 +0000 (18:56 +0000)]
ARM: KVM: fix ordering of 64bit coprocessor accesses

Commit 240e99cbd00a (ARM: KVM: Fix 64-bit coprocessor handling)
added an ordering dependency for the 64bit registers.

The order described is: CRn, CRm, Op1, Op2, 64bit-first.

Unfortunately, the implementation is: CRn, 64bit-first, CRm...

Move the 64bit test to be last in order to match the documentation.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 547f781378a22b65c2ab468f235c23001b5924da)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: fix handling of trapped 64bit coprocessor accesses
Marc Zyngier [Tue, 21 Jan 2014 18:56:26 +0000 (18:56 +0000)]
ARM: KVM: fix handling of trapped 64bit coprocessor accesses

Commit 240e99cbd00a (ARM: KVM: Fix 64-bit coprocessor handling)
changed the way we match the 64bit coprocessor access from
user space, but didn't update the trap handler for the same
set of registers.

The effect is that a trapped 64bit access is never matched, leading
to a fault being injected into the guest. This went unnoticed as we
didn't really trap any 64bit register so far.

Placing the CRm field of the access into the CRn field of the matching
structure fixes the problem. Also update the debug feature to emit the
expected string in case of failing match.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 46c214dd595381c880794413facadfa07fba5c95)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: force cache clean on page fault when caches are off
Marc Zyngier [Tue, 14 Jan 2014 19:13:10 +0000 (19:13 +0000)]
ARM: KVM: force cache clean on page fault when caches are off

In order for a guest with caches disabled to observe data written
contained in a given page, we need to make sure that page is
committed to memory, and not just hanging in the cache (as guest
accesses are completely bypassing the cache until it decides to
enable it).

For this purpose, hook into the coherent_cache_guest_page
function and flush the region if the guest SCTLR
register doesn't show the MMU and caches as being enabled.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 159793001d7d85af17855630c94f0a176848e16b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: flush VM pages before letting the guest enable caches
Marc Zyngier [Wed, 15 Jan 2014 12:50:23 +0000 (12:50 +0000)]
arm64: KVM: flush VM pages before letting the guest enable caches

When the guest runs with caches disabled (like in an early boot
sequence, for example), all the writes are diectly going to RAM,
bypassing the caches altogether.

Once the MMU and caches are enabled, whatever sits in the cache
becomes suddenly visible, which isn't what the guest expects.

A way to avoid this potential disaster is to invalidate the cache
when the MMU is being turned on. For this, we hook into the SCTLR_EL1
trapping code, and scan the stage-2 page tables, invalidating the
pages/sections that have already been mapped in.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 9d218a1fcf4c6b759d442ef702842fae92e1ea61)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: introduce kvm_p*d_addr_end
Marc Zyngier [Tue, 18 Feb 2014 14:29:03 +0000 (14:29 +0000)]
ARM: KVM: introduce kvm_p*d_addr_end

The use of p*d_addr_end with stage-2 translation is slightly dodgy,
as the IPA is 40bits, while all the p*d_addr_end helpers are
taking an unsigned long (arm64 is fine with that as unligned long
is 64bit).

The fix is to introduce 64bit clean versions of the same helpers,
and use them in the stage-2 page table code.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit a3c8bd31af260a17d626514f636849ee1cd1f63e)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: trap VM system registers until MMU and caches are ON
Marc Zyngier [Tue, 14 Jan 2014 18:00:55 +0000 (18:00 +0000)]
arm64: KVM: trap VM system registers until MMU and caches are ON

In order to be able to detect the point where the guest enables
its MMU and caches, trap all the VM related system registers.

Once we see the guest enabling both the MMU and the caches, we
can go back to a saner mode of operation, which is to leave these
registers in complete control of the guest.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 4d44923b17bff283c002ed961373848284aaff1b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: allows discrimination of AArch32 sysreg access
Marc Zyngier [Tue, 21 Jan 2014 10:55:17 +0000 (10:55 +0000)]
arm64: KVM: allows discrimination of AArch32 sysreg access

The current handling of AArch32 trapping is slightly less than
perfect, as it is not possible (from a handler point of view)
to distinguish it from an AArch64 access, nor to tell a 32bit
from a 64bit access either.

Fix this by introducing two additional flags:
- is_aarch32: true if the access was made in AArch32 mode
- is_32bit: true if is_aarch32 == true and a MCR/MRC instruction
  was used to perform the access (as opposed to MCRR/MRRC).

This allows a handler to cover all the possible conditions in which
a system register gets trapped.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 2072d29c46b73e39b3c6c56c6027af77086f45fd)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: force cache clean on page fault when caches are off
Marc Zyngier [Tue, 14 Jan 2014 19:13:10 +0000 (19:13 +0000)]
arm64: KVM: force cache clean on page fault when caches are off

In order for the guest with caches off to observe data written
contained in a given page, we need to make sure that page is
committed to memory, and not just hanging in the cache (as
guest accesses are completely bypassing the cache until it
decides to enable it).

For this purpose, hook into the coherent_icache_guest_page
function and flush the region if the guest SCTLR_EL1
register doesn't show the MMU  and caches as being enabled.
The function also get renamed to coherent_cache_guest_page.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 2d58b733c87689d3d5144e4ac94ea861cc729145)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: KVM: detect CPU reset on CPU_PM_EXIT
Marc Zyngier [Wed, 26 Feb 2014 18:47:36 +0000 (18:47 +0000)]
arm/arm64: KVM: detect CPU reset on CPU_PM_EXIT

Commit 1fcf7ce0c602 (arm: kvm: implement CPU PM notifier) added
support for CPU power-management, using a cpu_notifier to re-init
KVM on a CPU that entered CPU idle.

The code assumed that a CPU entering idle would actually be powered
off, loosing its state entierely, and would then need to be
reinitialized. It turns out that this is not always the case, and
some HW performs CPU PM without actually killing the core. In this
case, we try to reinitialize KVM while it is still live. It ends up
badly, as reported by Andre Przywara (using a Calxeda Midway):

[    3.663897] Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x685760
[    3.663897] unexpected data abort in Hyp mode at: 0xc067d150
[    3.663897] unexpected HVC/SVC trap in Hyp mode at: 0xc0901dd0

The trick here is to detect if we've been through a full re-init or
not by looking at HVBAR (VBAR_EL2 on arm64). This involves
implementing the backend for __hyp_get_vectors in the main KVM HYP
code (rather small), and checking the return value against the
default one when the CPU notifier is called on CPU_PM_EXIT.

Reported-by: Andre Przywara <osp@andrep.de>
Tested-by: Andre Przywara <osp@andrep.de>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Rob Herring <rob.herring@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b20c9f29c5c25921c6ad18b50d4b61e6d181c3cc)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: add kvm_arch_vcpu_runnable() test to kvm_vcpu_on_spin() loop
Michael Mueller [Wed, 26 Feb 2014 15:14:18 +0000 (16:14 +0100)]
KVM: add kvm_arch_vcpu_runnable() test to kvm_vcpu_on_spin() loop

Use the arch specific function kvm_arch_vcpu_runnable() to add a further
criterium to identify a suitable vcpu to yield to during undirected yield
processing.

Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 98f4a14676127397c54cab7d6119537ed4d113a2)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: Add VGIC device control for arm64
Christoffer Dall [Sun, 2 Feb 2014 21:41:02 +0000 (13:41 -0800)]
arm64: KVM: Add VGIC device control for arm64

This fixes the build breakage introduced by
c07a0191ef2de1f9510f12d1f88e3b0b5cd8d66f and adds support for the device
control API and save/restore of the VGIC state for ARMv8.

The defines were simply missing from the arm64 header files and
uaccess.h must be implicitly imported from somewhere else on arm.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2a2f3e269c75edf916de5967079069aeb6a601cb)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoasmlinkage, kvm: Make kvm_rebooting visible
Andi Kleen [Sat, 8 Feb 2014 07:51:57 +0000 (08:51 +0100)]
asmlinkage, kvm: Make kvm_rebooting visible

kvm_rebooting is referenced from assembler code, thus
needs to be visible.

Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1391845930-28580-1-git-send-email-ak@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
(cherry picked from commit 52480137d82062bb8d0fb778cb9934667958e367)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: fix typo: s/SERRROR/SERROR/
Mark Rutland [Wed, 5 Feb 2014 10:24:12 +0000 (10:24 +0000)]
arm64: fix typo: s/SERRROR/SERROR/

Somehow SERROR has acquired an additional 'R' in a couple of headers.
This patch removes them before they spread further. As neither instance
is in use yet, no other sites need to be fixed up.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit bfb67a5606376bb32cb6f93dc05cda2e8c2038a5)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: async_pf: Add missing call for async page present
Dominik Dingel [Fri, 31 Jan 2014 13:32:46 +0000 (14:32 +0100)]
KVM: async_pf: Add missing call for async page present

Commit KVM: async_pf: Provide additional direct page notification
missed the call from kvm_check_async_pf_completion to the new introduced function.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1179ba539541347d5427cde8bcfdaa5ead14f3aa)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: async_pf: Allow to wait for outstanding work
Dominik Dingel [Tue, 3 Sep 2013 10:31:16 +0000 (12:31 +0200)]
KVM: async_pf: Allow to wait for outstanding work

On s390 we are not able to cancel work. Instead we will flush the work and wait for
completion.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 9f2ceda49c6b8827c795731c204f6c2587886e2c)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: async_pf: Provide additional direct page notification
Dominik Dingel [Thu, 6 Jun 2013 13:32:37 +0000 (15:32 +0200)]
KVM: async_pf: Provide additional direct page notification

By setting a Kconfig option, the architecture can control when
guest notifications will be presented by the apf backend.
There is the default batch mechanism, working as before, where the vcpu
thread should pull in this information.
Opposite to this, there is now the direct mechanism, that will push the
information to the guest.
This way s390 can use an already existing architecture interface.

Still the vcpu thread should call check_completion to cleanup leftovers.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit e0ead41a6dac09f86675ce07a66e4b253a9b7bd5)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: return an error code in kvm_vm_ioctl_register_coalesced_mmio()
Dan Carpenter [Wed, 29 Jan 2014 13:16:39 +0000 (16:16 +0300)]
KVM: return an error code in kvm_vm_ioctl_register_coalesced_mmio()

If kvm_io_bus_register_dev() fails then it returns success but it should
return an error code.

I also did a little cleanup like removing an impossible NULL test.

Cc: stable@vger.kernel.org
Fixes: 2b3c246a682c ('KVM: Make coalesced mmio use a device per zone')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit aac5c4226e7136c331ed384c25d5560204da10a0)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agokvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
Scott Wood [Fri, 10 Jan 2014 00:43:16 +0000 (18:43 -0600)]
kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub

Commit 7940876e1330671708186ac3386aa521ffb5c182 ("kvm: make local
functions static") broke KVM PPC builds due to removing (rather than
moving) the stub version of kvm_vcpu_eligible_for_directed_yield().

This patch reintroduces it.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Alexander Graf <agraf@suse.de>
[Move the #ifdef inside the function. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4a55dd7273c95b4a19fbcf0ae1bbd1cfd09dfc36)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: Remove duplicate include
Sachin Kamat [Tue, 7 Jan 2014 08:15:15 +0000 (13:45 +0530)]
KVM: ARM: Remove duplicate include

trace.h was included twice. Remove duplicate inclusion.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 61466710de078c697106fa5b70ec7afc9feab520)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: KVM: relax the requirements of VMA alignment for THP
Marc Zyngier [Fri, 13 Dec 2013 16:56:06 +0000 (16:56 +0000)]
arm/arm64: KVM: relax the requirements of VMA alignment for THP

The THP code in KVM/ARM is a bit restrictive in not allowing a THP
to be used if the VMA is not 2MB aligned. Actually, it is not so much
the VMA that matters, but the associated memslot:

A process can perfectly mmap a region with no particular alignment
restriction, and then pass a 2MB aligned address to KVM. In this
case, KVM will only use this 2MB aligned region, and will ignore
the range between vma->vm_start and memslot->userspace_addr.

It can also choose to place this memslot at whatever alignment it
wants in the IPA space. In the end, what matters is the relative
alignment of the user space and IPA mappings with respect to a
2M page. They absolutely must be the same if you want to use THP.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 136d737fd20102f1be9b02356590fd55e3a40d0e)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agokvm: make local functions static
Stephen Hemminger [Sun, 29 Dec 2013 20:12:29 +0000 (12:12 -0800)]
kvm: make local functions static

Running 'make namespacecheck' found lots of functions that
should be declared static, since only used in one file.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
(cherry picked from commit 7940876e1330671708186ac3386aa521ffb5c182)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: Force undefined exception for Guest SMC intructions
Anup Patel [Thu, 12 Dec 2013 16:12:23 +0000 (16:12 +0000)]
arm64: KVM: Force undefined exception for Guest SMC intructions

The SMC-based PSCI emulation for Guest is going to be very different
from the in-kernel HVC-based PSCI emulation hence for now just inject
undefined exception when Guest executes SMC instruction.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit e5cf9dcdbfd26cd4e1991db08755da900454efeb)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: Support X-Gene guest VCPU on APM X-Gene host
Anup Patel [Thu, 14 Nov 2013 15:20:08 +0000 (15:20 +0000)]
arm64: KVM: Support X-Gene guest VCPU on APM X-Gene host

This patch allows us to have X-Gene guest VCPU when using KVM arm64
on APM X-Gene host.

We add KVM_ARM_TARGET_XGENE_POTENZA for X-Gene Potenza compatible
guest VCPU and we return KVM_ARM_TARGET_XGENE_POTENZA in kvm_target_cpu()
when running on X-Gene host with Potenza core.

[maz: sanitized the commit log]

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit e28100bd8ed9e37b7cd4578140a1e7f95bd40835)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: Add Kconfig option for max VCPUs per-Guest
Anup Patel [Thu, 12 Dec 2013 16:12:22 +0000 (16:12 +0000)]
arm64: KVM: Add Kconfig option for max VCPUs per-Guest

Current max VCPUs per-Guest is set to 4 which is preventing
us from creating a Guest (or VM) with 8 VCPUs on Host (e.g.
X-Gene Storm SOC) with 8 Host CPUs.

The correct value of max VCPUs per-Guest should be same as
the max CPUs supported by GICv2 which is 8 but, increasing
value of max VCPUs per-Guest can make things slower hence
we add Kconfig option to let KVM users select appropriate
max VCPUs per-Guest.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit da7814700a0c408bead58ce4714b7625ffbaade1)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm-vgic: Support CPU interface reg access
Christoffer Dall [Mon, 23 Sep 2013 21:55:57 +0000 (14:55 -0700)]
KVM: arm-vgic: Support CPU interface reg access

Implement support for the CPU interface register access driven by MMIO
address offsets from the CPU interface base address.  Useful for user
space to support save/restore of the VGIC state.

This commit adds support only for the same logic as the current VGIC
support, and no more.  For example, the active priority registers are
handled as RAZ/WI, just like setting priorities on the emulated
distributor.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit fa20f5aea56f271f83e91b9cde00f043a5a14990)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm-vgic: Add GICD_SPENDSGIR and GICD_CPENDSGIR handlers
Christoffer Dall [Fri, 25 Oct 2013 20:22:31 +0000 (21:22 +0100)]
KVM: arm-vgic: Add GICD_SPENDSGIR and GICD_CPENDSGIR handlers

Handle MMIO accesses to the two registers which should support both the
case where the VMs want to read/write either of these registers and the
case where user space reads/writes these registers to do save/restore of
the VGIC state.

Note that the added complexity compared to simple set/clear enable
registers stems from the bookkeping of source cpu ids.  It may be
possible to change the underlying data structure to simplify the
complexity, but since this is not in the critical path at all, this will
do.

Also note that reading this register from a live guest will not be
accurate compared to on hardware, because some state may be living on
the CPU LRs and the only way to give a consistent read would be to force
stop all the VCPUs and request them to unqueu the LR state onto the
distributor.  Until we have an actual user of live reading this
register, we can live with the difference.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 90a5355ee7639e92c0492ec592cba5c31bd80687)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm-vgic: Support unqueueing of LRs to the dist
Christoffer Dall [Sat, 16 Nov 2013 04:51:31 +0000 (20:51 -0800)]
KVM: arm-vgic: Support unqueueing of LRs to the dist

To properly access the VGIC state from user space it is very unpractical
to have to loop through all the LRs in all register access functions.
Instead, support moving all pending state from LRs to the distributor,
but leave active state LRs alone.

Note that to accurately present the active and pending state to VCPUs
reading these distributor registers from a live VM, we would have to
stop all other VPUs than the calling VCPU and ask each CPU to unqueue
their LR state onto the distributor and add fields to track active state
on the distributor side as well.  We don't have any users of such
functionality yet and there are other inaccuracies of the GIC emulation,
so don't provide accurate synchronized access to this state just yet.
However, when the time comes, having this function should help.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit cbd333a4bfd0d93bba36d46a0e4e7979228873a6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm-vgic: Add vgic reg access from dev attr
Christoffer Dall [Fri, 25 Oct 2013 20:17:31 +0000 (21:17 +0100)]
KVM: arm-vgic: Add vgic reg access from dev attr

Add infrastructure to handle distributor and cpu interface register
accesses through the KVM_{GET/SET}_DEVICE_ATTR interface by adding the
KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS groups
and defining the semantics of the attr field to be the MMIO offset as
specified in the GICv2 specs.

Missing register accesses or other changes in individual register access
functions to support save/restore of the VGIC state is added in
subsequent patches.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit c07a0191ef2de1f9510f12d1f88e3b0b5cd8d66f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: kvm: Set vcpu->cpu to -1 on vcpu_put
Christoffer Dall [Thu, 12 Dec 2013 04:29:11 +0000 (20:29 -0800)]
arm/arm64: kvm: Set vcpu->cpu to -1 on vcpu_put

The arch-generic KVM code expects the cpu field of a vcpu to be -1 if
the vcpu is no longer assigned to a cpu.  This is used for the optimized
make_all_cpus_request path and will be used by the vgic code to check
that no vcpus are running.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit e9b152cb957cb194437f37e79f0f3c9d34fe53d6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm-vgic: Make vgic mmio functions more generic
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
KVM: arm-vgic: Make vgic mmio functions more generic

Rename the vgic_ranges array to vgic_dist_ranges to be more specific and
to prepare for handling CPU interface register access as well (for
save/restore of VGIC state).

Pass offset from distributor or interface MMIO base to
find_matching_range function instead of the physical address of the
access in the VM memory map.  This allows other callers unaware of the
VM specifics, but with generic VGIC knowledge to reuse the function.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 1006e8cb22e861260688917ca4cfe6cde8ad69eb)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoirqchip: arm-gic: Define additional MMIO offsets and masks
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
irqchip: arm-gic: Define additional MMIO offsets and masks

Define CPU interface offsets for the GICC_ABPR, GICC_APR, and GICC_IIDR
registers.  Define distributor registers for the GICD_SPENDSGIR and the
GICD_CPENDSGIR.  KVM/ARM needs to know about these definitions to fully
support save/restore of the VGIC.

Also define some masks and shifts for the various GICH_VMCR fields.

Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 0307e1770fdeff2732cf7a35d0f7f49db67c6621)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm-vgic: Set base addr through device API
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
KVM: arm-vgic: Set base addr through device API

Support setting the distributor and cpu interface base addresses in the
VM physical address space through the KVM_{SET,GET}_DEVICE_ATTR API
in addition to the ARM specific API.

This has the added benefit of being able to share more code in user
space and do things in a uniform manner.

Also deprecate the older API at the same time, but backwards
compatibility will be maintained.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit ce01e4e8874d410738f4b4733b26642d6611a331)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC
Christoffer Dall [Fri, 25 Oct 2013 16:29:18 +0000 (17:29 +0100)]
KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC

Support creating the ARM VGIC device through the KVM_CREATE_DEVICE
ioctl, which can then later be leveraged to use the
KVM_{GET/SET}_DEVICE_ATTR, which is useful both for setting addresses in
a more generic API than the ARM-specific one and is useful for
save/restore of VGIC state.

Adds KVM_CAP_DEVICE_CTRL to ARM capabilities.

Note that we change the check for creating a VGIC from bailing out if
any VCPUs were created, to bailing out if any VCPUs were ever run.  This
is an important distinction that shouldn't break anything, but allows
creating the VGIC after the VCPUs have been created.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 7330672befe6269e575f79b924a7068b26c144b4)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: Allow creating the VGIC after VCPUs
Christoffer Dall [Mon, 23 Sep 2013 21:55:55 +0000 (14:55 -0700)]
ARM: KVM: Allow creating the VGIC after VCPUs

Rework the VGIC initialization slightly to allow initialization of the
vgic cpu-specific state even if the irqchip (the VGIC) hasn't been
created by user space yet.  This is safe, because the vgic data
structures are already allocated when the CPU is allocated if VGIC
support is compiled into the kernel.  Further, the init process does not
depend on any other information and the sacrifice is a slight
performance degradation for creating VMs in the no-VGIC case.

The reason is that the new device control API doesn't mandate creating
the VGIC before creating the VCPU and it is unreasonable to require user
space to create the VGIC before creating the VCPUs.

At the same time move the irqchip_in_kernel check out of
kvm_vcpu_first_run_init and into the init function to make the per-vcpu
and global init functions symmetric and add comments on the exported
functions making it a bit easier to understand the init flow by only
looking at vgic.c.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit e1ba0207a1b3714bb3f000e506285ae5123cdfa7)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM/KVM: save and restore generic timer registers
Andre Przywara [Fri, 13 Dec 2013 13:23:26 +0000 (14:23 +0100)]
ARM/KVM: save and restore generic timer registers

For migration to work we need to save (and later restore) the state of
each core's virtual generic timer.
Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export
the three needed registers (control, counter, compare value).
Though they live in cp15 space, we don't use the existing list, since
they need special accessor functions and the arch timer is optional.

Acked-by: Marc Zynger <marc.zyngier@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 39735a3a390431bcf60f9174b7d64f787fd6afa9)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: KVM: arch_timer: Initialize cntvoff at kvm_init
Christoffer Dall [Sat, 16 Nov 2013 18:51:25 +0000 (10:51 -0800)]
arm/arm64: KVM: arch_timer: Initialize cntvoff at kvm_init

Initialize the cntvoff at kvm_init_vm time, not before running the VCPUs
at the first time because that will overwrite any potentially restored
values from user space.

Cc: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Marc Zynger <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit a1a64387adeeba7a34ce06f2774e81f496ee803b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm: KVM: Don't return PSCI_INVAL if waitqueue is inactive
Christoffer Dall [Wed, 20 Nov 2013 01:43:19 +0000 (17:43 -0800)]
arm: KVM: Don't return PSCI_INVAL if waitqueue is inactive

The current KVM implementation of PSCI returns INVALID_PARAMETERS if the
waitqueue for the corresponding CPU is not active.  This does not seem
correct, since KVM should not care what the specific thread is doing,
for example, user space may not have called KVM_RUN on this VCPU yet or
the thread may be busy looping to user space because it received a
signal; this is really up to the user space implementation.  Instead we
should check specifically that the CPU is marked as being turned off,
regardless of the VCPU thread state, and if it is, we shall
simply clear the pause flag on the CPU and wake up the thread if it
happens to be blocked for us.

Further, the implementation seems to be racy when executing multiple
VCPU threads.  There really isn't a reasonable user space programming
scheme to ensure all secondary CPUs have reached kvm_vcpu_first_run_init
before turning on the boot CPU.

Therefore, set the pause flag on the vcpu at VCPU init time (which can
reasonably be expected to be completed for all CPUs by user space before
running any VCPUs) and clear both this flag and the feature (in case the
feature can somehow get set again in the future) and ping the waitqueue
on turning on a VCPU using PSCI.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 478a8237f656d86d25b3e4e4bf3c48f590156294)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agotreewide: Fix typos in printk
Masanari Iida [Sun, 8 Dec 2013 15:22:53 +0000 (00:22 +0900)]
treewide: Fix typos in printk

Correct spelling typo in various part of kernel

[ cdall: Pickes KVM/arm64 specific part not already merged into LSK ]

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit 77d84ff87e9d38072abcca665ca22cb1da41cb86)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm: kvm: implement CPU PM notifier
Lorenzo Pieralisi [Mon, 5 Aug 2013 14:04:46 +0000 (15:04 +0100)]
arm: kvm: implement CPU PM notifier

Upon CPU shutdown and consequent warm-reboot, the hypervisor CPU state
must be re-initialized. This patch implements a CPU PM notifier that
upon warm-boot calls a KVM hook to reinitialize properly the hypervisor
state so that the CPU can be safely resumed.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
(cherry picked from commit 1fcf7ce0c60213994269fb59569ec161eb6e08d6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: Use cond_resched() directly and remove useless kvm_resched()
Takuya Yoshikawa [Fri, 13 Dec 2013 06:07:21 +0000 (15:07 +0900)]
KVM: Use cond_resched() directly and remove useless kvm_resched()

Since the commit 15ad7146 ("KVM: Use the scheduler preemption notifiers
to make kvm preemptible"), the remaining stuff in this function is a
simple cond_resched() call with an extra need_resched() check which was
there to avoid dropping VCPUs unnecessarily.  Now it is meaningless.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c08ac06ab3f3cdb8d34376c3a8a5e46a31a62c8f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: Improve create VCPU parameter (CVE-2013-4587)
Andy Honig [Tue, 19 Nov 2013 00:09:22 +0000 (16:09 -0800)]
KVM: Improve create VCPU parameter (CVE-2013-4587)

In multiple functions the vcpu_id is used as an offset into a bitfield.  Ag
malicious user could specify a vcpu_id greater than 255 in order to set or
clear bits in kernel memory.  This could be used to elevate priveges in the
kernel.  This patch verifies that the vcpu_id provided is less than 255.
The api documentation already specifies that the vcpu_id must be less than
max_vcpus, but this is currently not checked.

Reported-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 338c7dbadd2671189cec7faf64c84d01071b3f96)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappings
Santosh Shilimkar [Tue, 19 Nov 2013 19:59:12 +0000 (14:59 -0500)]
arm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappings

KVM initialisation fails on architectures implementing virt_to_idmap()
because virt_to_phys() on such architectures won't fetch you the correct
idmap page.

So update the KVM ARM code to use the virt_to_idmap() to fix the issue.
Since the KVM code is shared between arm and arm64, we create
kvm_virt_to_phys() and handle the redirection in respective headers.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 4fda342cc7f577599c53fd27b99c953c7b1da18a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: kvm_clear_guest_page(): fix empty_zero_page usage
Heiko Carstens [Mon, 18 Nov 2013 09:35:55 +0000 (10:35 +0100)]
KVM: kvm_clear_guest_page(): fix empty_zero_page usage

Using the address of 'empty_zero_page' as source address in order to
clear a page is wrong. On some architectures empty_zero_page is only the
pointer to the struct page of the empty_zero_page.  Therefore the clear
page operation would copy the contents of a couple of struct pages instead
of clearing a page.  For kvm only arm/arm64 are affected by this bug.

To fix this use the ZERO_PAGE macro instead which will return the struct
page address of the empty_zero_page on all architectures.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 8a3caa6d74597c2a083f7c87f866891a0b12540b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: KVM: Fix hyp mappings of vmalloc regions
Christoffer Dall [Fri, 15 Nov 2013 21:14:12 +0000 (13:14 -0800)]
arm/arm64: KVM: Fix hyp mappings of vmalloc regions

Using virt_to_phys on percpu mappings is horribly wrong as it may be
backed by vmalloc.  Introduce kvm_kaddr_to_phys which translates both
types of valid kernel addresses to the corresponding physical address.

At the same time resolves a typing issue where we were storing the
physical address as a 32 bit unsigned long (on arm), truncating the
physical address for addresses above the 4GB limit.  This caused
breakage on Keystone.

Cc: <stable@vger.kernel.org> [3.10+]
Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 40c2729bab48e2832b17c1fa8af9db60e776131b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
Marc Zyngier [Tue, 5 Nov 2013 14:12:15 +0000 (14:12 +0000)]
arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu

When booting a vcpu using PSCI, make sure we start it with the
endianness of the caller. Otherwise, secondaries can be pretty
unhappy to execute a BE kernel in LE mode...

This conforms to PSCI spec Rev B, 5.13.3.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit ce94fe93d566bf381c6ecbd45010d36c5f04d692)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: KVM: MMIO support for BE guest
Marc Zyngier [Tue, 12 Feb 2013 12:40:22 +0000 (12:40 +0000)]
arm/arm64: KVM: MMIO support for BE guest

Do the necessary byteswap when host and guest have different
views of the universe. Actually, the only case we need to take
care of is when the guest is BE. All the other cases are naturally
handled.

Also be careful about endianness when the data is being memcopy-ed
from/to the run buffer.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit 6d89d2d9b5bac9dbe40ee106ceda9307b6265234)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: vgic: byteswap GICv2 access on world switch if BE
Marc Zyngier [Tue, 5 Nov 2013 18:29:46 +0000 (18:29 +0000)]
arm64: KVM: vgic: byteswap GICv2 access on world switch if BE

Ensure that accesses to the GICH_* registers are byteswapped
when the kernel is compiled as big-endian.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit c5b2c0f5203b3bc678a8967daedf7114029975ae)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: initialize HYP mode following the kernel endianness
Marc Zyngier [Tue, 5 Nov 2013 18:29:45 +0000 (18:29 +0000)]
arm64: KVM: initialize HYP mode following the kernel endianness

Force SCTLR_EL2.EE to 1 if the kernel is compiled as BE.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit 18ea3dbc9e5c8a53a361b17c4a5676ea6f4bcb72)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: remove vm mmap method
Gleb Natapov [Tue, 5 Nov 2013 14:04:18 +0000 (16:04 +0200)]
KVM: remove vm mmap method

It was used in conjunction with KVM_SET_MEMORY_REGION ioctl which was
removed by b74a07beed0 in 2010, QEMU stopped using it in 2008, so
it is time to remove the code finally.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 80f5b5e700fa9c58480eafce0d47367bafb70006)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agokvm_host: typo fix
Michael S. Tsirkin [Wed, 30 Oct 2013 19:43:01 +0000 (21:43 +0200)]
kvm_host: typo fix

fix up typo in comment.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 81e87e26796782e014fd1f2bb9cd8fb6ce4021a8)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agokvm: Add VFIO device
Alex Williamson [Wed, 30 Oct 2013 17:02:17 +0000 (11:02 -0600)]
kvm: Add VFIO device

So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made.  This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction.  The user creates
the device and can add and remove VFIO groups to it via file
descriptors.  When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ec53500fae421e07c5d035918ca454a429732ef4)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agokvm: Add KVM_GET_EMULATED_CPUID
Borislav Petkov [Sun, 22 Sep 2013 14:44:50 +0000 (16:44 +0200)]
kvm: Add KVM_GET_EMULATED_CPUID

Add a kvm ioctl which states which system functionality kvm emulates.
The format used is that of CPUID and we return the corresponding CPUID
bits set for which we do emulate functionality.

Make sure ->padding is being passed on clean from userspace so that we
can use it for something in the future, after the ioctl gets cast in
stone.

s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at
it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9c15bb1d0a8411f9bb3395d21d5309bde7da0c1c)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: use a more sensible error number when debugfs directory creation fails
Paolo Bonzini [Wed, 30 Oct 2013 11:12:13 +0000 (12:12 +0100)]
KVM: use a more sensible error number when debugfs directory creation fails

I don't know if this was due to cut and paste, or somebody was really
using a D20 to pick the error code for kvm_init_debugfs as suggested by
Linus (EFAULT is 14, so the possibility cannot be entirely ruled out).

In any case, this patch fixes it.

Reported-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c8eb04a6241da28deb108181213b791c378123b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm64: KVM: Yield CPU when vcpu executes a WFE
Marc Zyngier [Fri, 2 Aug 2013 10:41:13 +0000 (11:41 +0100)]
arm64: KVM: Yield CPU when vcpu executes a WFE

On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.

The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit d241aac798eb042e605f78c31a4122e583b2cd13)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: Mapping IOMMU pages after updating memslot
Yang Zhang [Thu, 24 Oct 2013 01:56:39 +0000 (09:56 +0800)]
KVM: Mapping IOMMU pages after updating memslot

In kvm_iommu_map_pages(), we need to know the page size via call
kvm_host_page_size(). And it will check whether the target slot
is valid before return the right page size.
Currently, we will map the iommu pages when creating a new slot.
But we call kvm_iommu_map_pages() during preparing the new slot.
At that time, the new slot is not visible by domain(still in preparing).
So we cannot get the right page size from kvm_host_page_size() and
this will break the IOMMU super page logic.
The solution is to map the iommu pages after we insert the new slot
into domain.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Tested-by: Patrick Lu <patrick.lu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e0230e1327fb862c9b6cde24ae62d55f9db62c9b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoarm/arm64: KVM: PSCI: use MPIDR to identify a target CPU
Marc Zyngier [Fri, 18 Oct 2013 17:19:03 +0000 (18:19 +0100)]
arm/arm64: KVM: PSCI: use MPIDR to identify a target CPU

The KVM PSCI code blindly assumes that vcpu_id and MPIDR are
the same thing. This is true when vcpus are organized as a flat
topology, but is wrong when trying to emulate any other topology
(such as A15 clusters).

Change the KVM PSCI CPU_ON code to look at the MPIDR instead
of the vcpu_id to pick a target CPU.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 79c648806f9034abf54332b78043bb242189d953)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: drop limitation to 4 CPU VMs
Marc Zyngier [Fri, 18 Oct 2013 17:19:06 +0000 (18:19 +0100)]
ARM: KVM: drop limitation to 4 CPU VMs

Now that the KVM/arm code knows about affinity, remove the hard
limit of 4 vcpus per VM.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 7999b4d18211bcfb40e3574cf75e94518e9fa2c6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: fix L2CTLR to be per-cluster
Marc Zyngier [Fri, 18 Oct 2013 17:19:05 +0000 (18:19 +0100)]
ARM: KVM: fix L2CTLR to be per-cluster

The L2CTLR register contains the number of CPUs in this cluster.

Make sure the register content is actually relevant to the vcpu
that is being configured by computing the number of cores that are
part of its cluster.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 9cbb6d969cb6561de45d917b8bb9281cb374bb35)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: Fix MPIDR computing to support virtual clusters
Marc Zyngier [Fri, 18 Oct 2013 17:19:04 +0000 (18:19 +0100)]
ARM: KVM: Fix MPIDR computing to support virtual clusters

In order to be able to support more than 4 A7 or A15 CPUs,
we need to fix the MPIDR computing to reflect the fact that
both A15 and A7 can only exist in clusters of at most 4 CPUs.

Fix the MPIDR computing to allow virtual clusters to be exposed
to the guest.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 2d1d841bd44e24b58a3d3cc4fa793670aaa38fbf)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: Transparent huge page (THP) support
Christoffer Dall [Wed, 2 Oct 2013 22:32:01 +0000 (15:32 -0700)]
KVM: ARM: Transparent huge page (THP) support

Support transparent huge pages in KVM/ARM and KVM/ARM64.  The
transparent_hugepage_adjust is not very pretty, but this is also how
it's solved on x86 and seems to be simply an artifact on how THPs
behave.  This should eventually be shared across architectures if
possible, but that can always be changed down the road.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 9b5fdb9781f74fb15827e465bfb5aa63211953c8)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: Support hugetlbfs backed huge pages
Christoffer Dall [Thu, 1 Nov 2012 16:14:45 +0000 (17:14 +0100)]
KVM: ARM: Support hugetlbfs backed huge pages

Support huge pages in KVM/ARM and KVM/ARM64.  The pud_huge checking on
the unmap path may feel a bit silly as the pud_huge check is always
defined to false, but the compiler should be smart about this.

Note: This deals only with VMAs marked as huge which are allocated by
users through hugetlbfs only.  Transparent huge pages can only be
detected by looking at the underlying pages (or the page tables
themselves) and this patch so far simply maps these on a page-by-page
level in the Stage-2 page tables.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit ad361f093c1e31d0b43946210a32ab4ff5c49850)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: Update comments for kvm_handle_wfi
Christoffer Dall [Wed, 16 Oct 2013 01:10:42 +0000 (18:10 -0700)]
KVM: ARM: Update comments for kvm_handle_wfi

Update comments to reflect what is really going on and add the TWE bit
to the comments in kvm_arm.h.

Also renames the function to kvm_handle_wfx like is done on arm64 for
consistency and uber-correctness.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 86ed81aa2e1ce05a4e7f0819f0dfc34e8d8fb910)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: Yield CPU when vcpu executes a WFE
Marc Zyngier [Tue, 8 Oct 2013 17:38:13 +0000 (18:38 +0100)]
ARM: KVM: Yield CPU when vcpu executes a WFE

On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.

This creates contention, and the observed slowdown is 40x for
hackbench. No, this isn't a typo.

The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.

Quick test to estimate the performance: hackbench 1 process 1000

2xA15 host (baseline): 1.843s

2xA15 guest w/o patch: 2.083s
4xA15 guest w/o patch: 80.212s
8xA15 guest w/o patch: Could not be bothered to find out

2xA15 guest w/ patch: 2.102s
4xA15 guest w/ patch: 3.205s
8xA15 guest w/ patch: 6.887s

So we go from a 40x degradation to 1.5x in the 2x overcommit case,
which is vaguely more acceptable.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 58d5ec8f8ee318b26b29207874fbaee626973952)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agokvm: Add struct kvm arg to memslot APIs
Aneesh Kumar K.V [Mon, 7 Oct 2013 16:48:00 +0000 (22:18 +0530)]
kvm: Add struct kvm arg to memslot APIs

We will use that in the later patch to find the kvm ops handler

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 5587027ce9d59a57aecaa190be1c8e560aaff45d)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: Drop FOLL_GET in GUP when doing async page fault
chai wen [Mon, 14 Oct 2013 14:22:33 +0000 (22:22 +0800)]
KVM: Drop FOLL_GET in GUP when doing async page fault

Page pinning is not mandatory in kvm async page fault processing since
after async page fault event is delivered to a guest it accesses page once
again and does its own GUP.  Drop the FOLL_GET flag in GUP in async_pf
code, and do some simplifying in check/clear processing.

Suggested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gu zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit f2e106692d5189303997ad7b96de8d8123aa5613)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: arm64: Get rid of KVM_HPAGE defines
Christoffer Dall [Wed, 2 Oct 2013 21:22:30 +0000 (14:22 -0700)]
KVM: arm64: Get rid of KVM_HPAGE defines

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit ef0cfe71c2b1710cd4ae747537e36c56f9a26ccf)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: Get rid of KVM_HPAGE defines
Christoffer Dall [Wed, 2 Oct 2013 21:22:29 +0000 (14:22 -0700)]
KVM: ARM: Get rid of KVM_HPAGE defines

The KVM_HPAGE_DEFINES are a little artificial on ARM, since the huge
page size is statically defined at compile time and there is only a
single huge page size.

Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit dc6f6763dfeaf2dfec906bb78875dcea162accd9)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: Move gfn_to_index to x86 specific code
Christoffer Dall [Wed, 2 Oct 2013 21:22:28 +0000 (14:22 -0700)]
KVM: Move gfn_to_index to x86 specific code

The gfn_to_index function relies on huge page defines which either may
not make sense on systems that don't support huge pages or are defined
in an unconvenient way for other architectures.  Since this is
x86-specific, move the function to arch/x86/include/asm/kvm_host.h.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 6d9d41e57440e32a3400f37aa05ef7a1a09ced64)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: Add support for Cortex-A7
Jonathan Austin [Thu, 26 Sep 2013 15:49:28 +0000 (16:49 +0100)]
KVM: ARM: Add support for Cortex-A7

This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts.

As Cortex-A7 is architecturally compatible with A15, this patch is largely just
generalising existing code. Areas where 'implementation defined' behaviour
is identical for A7 and A15 is moved to allow it to be used by both cores.

The check to ensure that coprocessor register tables are sorted correctly is
also moved in to 'common' code to avoid each new cpu doing its own check
(and possibly forgetting to do so!)

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit e8c2d99f8277d68d28a9f99d16289712bc2aee7f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: fix the size of TTBCR_{T0SZ,T1SZ} masks
Jonathan Austin [Thu, 26 Sep 2013 15:49:26 +0000 (16:49 +0100)]
KVM: ARM: fix the size of TTBCR_{T0SZ,T1SZ} masks

The T{0,1}SZ fields of TTBCR are 3 bits wide when using the long descriptor
format. Likewise, the T0SZ field of the HTCR is 3-bits. KVM currently
defines TTBCR_T{0,1}SZ as 3, not 7.

The T0SZ mask is used to calculate the value for the HTCR, both to pick out
TTBCR.T0SZ and mask off the equivalent field in the HTCR during
read-modify-write. The incorrect mask size causes the (UNKNOWN) reset value
of HTCR.T0SZ to leak in to the calculated HTCR value. Linux will hang when
initializing KVM if HTCR's reset value has bit 2 set (sometimes the case on
A7/TC2)

Fixing T0SZ allows A7 cores to boot and T1SZ is also fixed for completeness.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 5e497046f005528464f9600a4ee04f49df713596)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: Fix calculation of virtual CPU ID
Jonathan Austin [Thu, 26 Sep 2013 15:49:27 +0000 (16:49 +0100)]
KVM: ARM: Fix calculation of virtual CPU ID

KVM does not have a notion of multiple clusters for CPUs, just a linear
array of CPUs. When using a system with cores in more than one cluster, the
current method for calculating the virtual MPIDR will leak the (physical)
cluster information into the virtual MPIDR. One effect of this is that
Linux under KVM fails to boot multiple CPUs that aren't in the 0th cluster.

This patch does away with exposing the real MPIDR fields in favour of simply
using the virtual CPU number (but preserving the U bit, as before).

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 1158fca401e09665c440a9fe4fd4f131ee85c13b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agovirt/kvm/iommu.c: Add leading zeros to device's BDF notation in debug messages
Andre Richter [Wed, 2 Oct 2013 10:23:26 +0000 (12:23 +0200)]
virt/kvm/iommu.c: Add leading zeros to device's BDF notation in debug messages

When KVM (de)assigns PCI(e) devices to VMs, a debug message is printed
including the BDF notation of the respective device. Currently, the BDF
notation does not have the commonly used leading zeros. This produces
messages like "assign device 0:1:8.0", which look strange at first sight.

The patch fixes this by exchanging the printk(KERN_DEBUG ...) with dev_info()
and also inserts "kvm" into the debug message, so that it is obvious where
the message comes from. Also reduces LoC.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Andre Richter <andre.o.richter@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit 29242cb5c63b1f8e12e8055ba1a6c3e0004fa86d)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoFix NULL dereference in gfn_to_hva_prot()
Gleb Natapov [Tue, 1 Oct 2013 16:58:36 +0000 (19:58 +0300)]
Fix NULL dereference in gfn_to_hva_prot()

gfn_to_memslot() can return NULL or invalid slot. We need to check slot
validity before accessing it.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit a2ac07fe292ea41296049dfdbfeed203e2467ee7)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM/ARM64: KVM: Implement KVM_ARM_PREFERRED_TARGET ioctl
Anup Patel [Mon, 30 Sep 2013 08:50:07 +0000 (14:20 +0530)]
ARM/ARM64: KVM: Implement KVM_ARM_PREFERRED_TARGET ioctl

For implementing CPU=host, we need a mechanism for querying
preferred VCPU target type on underlying Host.

This patch implements KVM_ARM_PREFERRED_TARGET vm ioctl which
returns struct kvm_vcpu_init instance containing information
about preferred VCPU target type and target specific features
available for it.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 42c4e0c77ac91505ab94284b14025e3a0865c0a5)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM64: KVM: Implement kvm_vcpu_preferred_target() function
Anup Patel [Mon, 30 Sep 2013 08:50:06 +0000 (14:20 +0530)]
ARM64: KVM: Implement kvm_vcpu_preferred_target() function

This patch implements kvm_vcpu_preferred_target() function for
KVM ARM64 which will help us implement KVM_ARM_PREFERRED_TARGET
ioctl for user space.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 473bdc0e6565ebb22455657a40daa21b6b4ee16b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoARM: KVM: Implement kvm_vcpu_preferred_target() function
Anup Patel [Mon, 30 Sep 2013 08:50:05 +0000 (14:20 +0530)]
ARM: KVM: Implement kvm_vcpu_preferred_target() function

This patch implements kvm_vcpu_preferred_target() function for
KVM ARM which will help us implement KVM_ARM_PREFERRED_TARGET ioctl
for user space.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit 4a6fee805d5e278e4733bf933cb5b184b7a8be1f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: ARM: Fix typo in comments of inject_abt()
Anup Patel [Wed, 11 Sep 2013 13:04:22 +0000 (18:34 +0530)]
KVM: ARM: Fix typo in comments of inject_abt()

Very minor typo in comments of inject_abt() when we update fault status
register for injecting prefetch abort.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit b373e492f3a3469c615c2ae218d2f723900bf981)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: Convert kvm_lock back to non-raw spinlock
Paolo Bonzini [Wed, 25 Sep 2013 11:53:07 +0000 (13:53 +0200)]
KVM: Convert kvm_lock back to non-raw spinlock

In commit e935b8372cf8 ("KVM: Convert kvm_lock to raw_spinlock"),
the kvm_lock was made a raw lock.  However, the kvm mmu_shrink()
function tries to grab the (non-raw) mmu_lock within the scope of
the raw locked kvm_lock being held.  This leads to the following:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 55, name: kswapd0
Preemption disabled at:[<ffffffffa0376eac>] mmu_shrink+0x5c/0x1b0 [kvm]

Pid: 55, comm: kswapd0 Not tainted 3.4.34_preempt-rt
Call Trace:
 [<ffffffff8106f2ad>] __might_sleep+0xfd/0x160
 [<ffffffff817d8d64>] rt_spin_lock+0x24/0x50
 [<ffffffffa0376f3c>] mmu_shrink+0xec/0x1b0 [kvm]
 [<ffffffff8111455d>] shrink_slab+0x17d/0x3a0
 [<ffffffff81151f00>] ? mem_cgroup_iter+0x130/0x260
 [<ffffffff8111824a>] balance_pgdat+0x54a/0x730
 [<ffffffff8111fe47>] ? set_pgdat_percpu_threshold+0xa7/0xd0
 [<ffffffff811185bf>] kswapd+0x18f/0x490
 [<ffffffff81070961>] ? get_parent_ip+0x11/0x50
 [<ffffffff81061970>] ? __init_waitqueue_head+0x50/0x50
 [<ffffffff81118430>] ? balance_pgdat+0x730/0x730
 [<ffffffff81060d2b>] kthread+0xdb/0xe0
 [<ffffffff8106e122>] ? finish_task_switch+0x52/0x100
 [<ffffffff817e1e94>] kernel_thread_helper+0x4/0x10
 [<ffffffff81060c50>] ? __init_kthread_worker+0x

After the previous patch, kvm_lock need not be a raw spinlock anymore,
so change it back.

Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2f303b74a62fb74983c0a66e2df353be963c527c)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
10 years agoKVM: protect kvm_usage_count with its own spinlock
Paolo Bonzini [Tue, 10 Sep 2013 10:58:35 +0000 (12:58 +0200)]
KVM: protect kvm_usage_count with its own spinlock

The VM list need not be protected by a raw spinlock.  Separate the
two so that kvm_lock can be made non-raw.

Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4a937f96f3a29c58b7edd349d2e4dfac371efdf2)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>