Christoffer Dall [Sun, 2 Feb 2014 21:41:02 +0000 (13:41 -0800)]
arm64: KVM: Add VGIC device control for arm64
This fixes the build breakage introduced by
c07a0191ef2de1f9510f12d1f88e3b0b5cd8d66f and adds support for the device
control API and save/restore of the VGIC state for ARMv8.
The defines were simply missing from the arm64 header files and
uaccess.h must be implicitly imported from somewhere else on arm.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
2a2f3e269c75edf916de5967079069aeb6a601cb)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Andi Kleen [Sat, 8 Feb 2014 07:51:57 +0000 (08:51 +0100)]
asmlinkage, kvm: Make kvm_rebooting visible
kvm_rebooting is referenced from assembler code, thus
needs to be visible.
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1391845930-28580-1-git-send-email-ak@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
(cherry picked from commit
52480137d82062bb8d0fb778cb9934667958e367)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Mark Rutland [Wed, 5 Feb 2014 10:24:12 +0000 (10:24 +0000)]
arm64: fix typo: s/SERRROR/SERROR/
Somehow SERROR has acquired an additional 'R' in a couple of headers.
This patch removes them before they spread further. As neither instance
is in use yet, no other sites need to be fixed up.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit
bfb67a5606376bb32cb6f93dc05cda2e8c2038a5)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Dominik Dingel [Fri, 31 Jan 2014 13:32:46 +0000 (14:32 +0100)]
KVM: async_pf: Add missing call for async page present
Commit KVM: async_pf: Provide additional direct page notification
missed the call from kvm_check_async_pf_completion to the new introduced function.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
1179ba539541347d5427cde8bcfdaa5ead14f3aa)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Dominik Dingel [Tue, 3 Sep 2013 10:31:16 +0000 (12:31 +0200)]
KVM: async_pf: Allow to wait for outstanding work
On s390 we are not able to cancel work. Instead we will flush the work and wait for
completion.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit
9f2ceda49c6b8827c795731c204f6c2587886e2c)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Dominik Dingel [Thu, 6 Jun 2013 13:32:37 +0000 (15:32 +0200)]
KVM: async_pf: Provide additional direct page notification
By setting a Kconfig option, the architecture can control when
guest notifications will be presented by the apf backend.
There is the default batch mechanism, working as before, where the vcpu
thread should pull in this information.
Opposite to this, there is now the direct mechanism, that will push the
information to the guest.
This way s390 can use an already existing architecture interface.
Still the vcpu thread should call check_completion to cleanup leftovers.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit
e0ead41a6dac09f86675ce07a66e4b253a9b7bd5)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Dan Carpenter [Wed, 29 Jan 2014 13:16:39 +0000 (16:16 +0300)]
KVM: return an error code in kvm_vm_ioctl_register_coalesced_mmio()
If kvm_io_bus_register_dev() fails then it returns success but it should
return an error code.
I also did a little cleanup like removing an impossible NULL test.
Cc: stable@vger.kernel.org
Fixes: 2b3c246a682c ('KVM: Make coalesced mmio use a device per zone')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
aac5c4226e7136c331ed384c25d5560204da10a0)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Scott Wood [Fri, 10 Jan 2014 00:43:16 +0000 (18:43 -0600)]
kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
Commit
7940876e1330671708186ac3386aa521ffb5c182 ("kvm: make local
functions static") broke KVM PPC builds due to removing (rather than
moving) the stub version of kvm_vcpu_eligible_for_directed_yield().
This patch reintroduces it.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Alexander Graf <agraf@suse.de>
[Move the #ifdef inside the function. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
4a55dd7273c95b4a19fbcf0ae1bbd1cfd09dfc36)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Sachin Kamat [Tue, 7 Jan 2014 08:15:15 +0000 (13:45 +0530)]
KVM: ARM: Remove duplicate include
trace.h was included twice. Remove duplicate inclusion.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
61466710de078c697106fa5b70ec7afc9feab520)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 13 Dec 2013 16:56:06 +0000 (16:56 +0000)]
arm/arm64: KVM: relax the requirements of VMA alignment for THP
The THP code in KVM/ARM is a bit restrictive in not allowing a THP
to be used if the VMA is not 2MB aligned. Actually, it is not so much
the VMA that matters, but the associated memslot:
A process can perfectly mmap a region with no particular alignment
restriction, and then pass a 2MB aligned address to KVM. In this
case, KVM will only use this 2MB aligned region, and will ignore
the range between vma->vm_start and memslot->userspace_addr.
It can also choose to place this memslot at whatever alignment it
wants in the IPA space. In the end, what matters is the relative
alignment of the user space and IPA mappings with respect to a
2M page. They absolutely must be the same if you want to use THP.
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
136d737fd20102f1be9b02356590fd55e3a40d0e)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Stephen Hemminger [Sun, 29 Dec 2013 20:12:29 +0000 (12:12 -0800)]
kvm: make local functions static
Running 'make namespacecheck' found lots of functions that
should be declared static, since only used in one file.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
(cherry picked from commit
7940876e1330671708186ac3386aa521ffb5c182)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Anup Patel [Thu, 12 Dec 2013 16:12:23 +0000 (16:12 +0000)]
arm64: KVM: Force undefined exception for Guest SMC intructions
The SMC-based PSCI emulation for Guest is going to be very different
from the in-kernel HVC-based PSCI emulation hence for now just inject
undefined exception when Guest executes SMC instruction.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
e5cf9dcdbfd26cd4e1991db08755da900454efeb)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Anup Patel [Thu, 14 Nov 2013 15:20:08 +0000 (15:20 +0000)]
arm64: KVM: Support X-Gene guest VCPU on APM X-Gene host
This patch allows us to have X-Gene guest VCPU when using KVM arm64
on APM X-Gene host.
We add KVM_ARM_TARGET_XGENE_POTENZA for X-Gene Potenza compatible
guest VCPU and we return KVM_ARM_TARGET_XGENE_POTENZA in kvm_target_cpu()
when running on X-Gene host with Potenza core.
[maz: sanitized the commit log]
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
e28100bd8ed9e37b7cd4578140a1e7f95bd40835)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Anup Patel [Thu, 12 Dec 2013 16:12:22 +0000 (16:12 +0000)]
arm64: KVM: Add Kconfig option for max VCPUs per-Guest
Current max VCPUs per-Guest is set to 4 which is preventing
us from creating a Guest (or VM) with 8 VCPUs on Host (e.g.
X-Gene Storm SOC) with 8 Host CPUs.
The correct value of max VCPUs per-Guest should be same as
the max CPUs supported by GICv2 which is 8 but, increasing
value of max VCPUs per-Guest can make things slower hence
we add Kconfig option to let KVM users select appropriate
max VCPUs per-Guest.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
da7814700a0c408bead58ce4714b7625ffbaade1)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Mon, 23 Sep 2013 21:55:57 +0000 (14:55 -0700)]
KVM: arm-vgic: Support CPU interface reg access
Implement support for the CPU interface register access driven by MMIO
address offsets from the CPU interface base address. Useful for user
space to support save/restore of the VGIC state.
This commit adds support only for the same logic as the current VGIC
support, and no more. For example, the active priority registers are
handled as RAZ/WI, just like setting priorities on the emulated
distributor.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
fa20f5aea56f271f83e91b9cde00f043a5a14990)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Fri, 25 Oct 2013 20:22:31 +0000 (21:22 +0100)]
KVM: arm-vgic: Add GICD_SPENDSGIR and GICD_CPENDSGIR handlers
Handle MMIO accesses to the two registers which should support both the
case where the VMs want to read/write either of these registers and the
case where user space reads/writes these registers to do save/restore of
the VGIC state.
Note that the added complexity compared to simple set/clear enable
registers stems from the bookkeping of source cpu ids. It may be
possible to change the underlying data structure to simplify the
complexity, but since this is not in the critical path at all, this will
do.
Also note that reading this register from a live guest will not be
accurate compared to on hardware, because some state may be living on
the CPU LRs and the only way to give a consistent read would be to force
stop all the VCPUs and request them to unqueu the LR state onto the
distributor. Until we have an actual user of live reading this
register, we can live with the difference.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
90a5355ee7639e92c0492ec592cba5c31bd80687)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Sat, 16 Nov 2013 04:51:31 +0000 (20:51 -0800)]
KVM: arm-vgic: Support unqueueing of LRs to the dist
To properly access the VGIC state from user space it is very unpractical
to have to loop through all the LRs in all register access functions.
Instead, support moving all pending state from LRs to the distributor,
but leave active state LRs alone.
Note that to accurately present the active and pending state to VCPUs
reading these distributor registers from a live VM, we would have to
stop all other VPUs than the calling VCPU and ask each CPU to unqueue
their LR state onto the distributor and add fields to track active state
on the distributor side as well. We don't have any users of such
functionality yet and there are other inaccuracies of the GIC emulation,
so don't provide accurate synchronized access to this state just yet.
However, when the time comes, having this function should help.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
cbd333a4bfd0d93bba36d46a0e4e7979228873a6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Fri, 25 Oct 2013 20:17:31 +0000 (21:17 +0100)]
KVM: arm-vgic: Add vgic reg access from dev attr
Add infrastructure to handle distributor and cpu interface register
accesses through the KVM_{GET/SET}_DEVICE_ATTR interface by adding the
KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS groups
and defining the semantics of the attr field to be the MMIO offset as
specified in the GICv2 specs.
Missing register accesses or other changes in individual register access
functions to support save/restore of the VGIC state is added in
subsequent patches.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
c07a0191ef2de1f9510f12d1f88e3b0b5cd8d66f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Thu, 12 Dec 2013 04:29:11 +0000 (20:29 -0800)]
arm/arm64: kvm: Set vcpu->cpu to -1 on vcpu_put
The arch-generic KVM code expects the cpu field of a vcpu to be -1 if
the vcpu is no longer assigned to a cpu. This is used for the optimized
make_all_cpus_request path and will be used by the vgic code to check
that no vcpus are running.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
e9b152cb957cb194437f37e79f0f3c9d34fe53d6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
KVM: arm-vgic: Make vgic mmio functions more generic
Rename the vgic_ranges array to vgic_dist_ranges to be more specific and
to prepare for handling CPU interface register access as well (for
save/restore of VGIC state).
Pass offset from distributor or interface MMIO base to
find_matching_range function instead of the physical address of the
access in the VM memory map. This allows other callers unaware of the
VM specifics, but with generic VGIC knowledge to reuse the function.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
1006e8cb22e861260688917ca4cfe6cde8ad69eb)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
irqchip: arm-gic: Define additional MMIO offsets and masks
Define CPU interface offsets for the GICC_ABPR, GICC_APR, and GICC_IIDR
registers. Define distributor registers for the GICD_SPENDSGIR and the
GICD_CPENDSGIR. KVM/ARM needs to know about these definitions to fully
support save/restore of the VGIC.
Also define some masks and shifts for the various GICH_VMCR fields.
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
0307e1770fdeff2732cf7a35d0f7f49db67c6621)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
KVM: arm-vgic: Set base addr through device API
Support setting the distributor and cpu interface base addresses in the
VM physical address space through the KVM_{SET,GET}_DEVICE_ATTR API
in addition to the ARM specific API.
This has the added benefit of being able to share more code in user
space and do things in a uniform manner.
Also deprecate the older API at the same time, but backwards
compatibility will be maintained.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
ce01e4e8874d410738f4b4733b26642d6611a331)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Fri, 25 Oct 2013 16:29:18 +0000 (17:29 +0100)]
KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC
Support creating the ARM VGIC device through the KVM_CREATE_DEVICE
ioctl, which can then later be leveraged to use the
KVM_{GET/SET}_DEVICE_ATTR, which is useful both for setting addresses in
a more generic API than the ARM-specific one and is useful for
save/restore of VGIC state.
Adds KVM_CAP_DEVICE_CTRL to ARM capabilities.
Note that we change the check for creating a VGIC from bailing out if
any VCPUs were created, to bailing out if any VCPUs were ever run. This
is an important distinction that shouldn't break anything, but allows
creating the VGIC after the VCPUs have been created.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
7330672befe6269e575f79b924a7068b26c144b4)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Mon, 23 Sep 2013 21:55:55 +0000 (14:55 -0700)]
ARM: KVM: Allow creating the VGIC after VCPUs
Rework the VGIC initialization slightly to allow initialization of the
vgic cpu-specific state even if the irqchip (the VGIC) hasn't been
created by user space yet. This is safe, because the vgic data
structures are already allocated when the CPU is allocated if VGIC
support is compiled into the kernel. Further, the init process does not
depend on any other information and the sacrifice is a slight
performance degradation for creating VMs in the no-VGIC case.
The reason is that the new device control API doesn't mandate creating
the VGIC before creating the VCPU and it is unreasonable to require user
space to create the VGIC before creating the VCPUs.
At the same time move the irqchip_in_kernel check out of
kvm_vcpu_first_run_init and into the init function to make the per-vcpu
and global init functions symmetric and add comments on the exported
functions making it a bit easier to understand the init flow by only
looking at vgic.c.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
e1ba0207a1b3714bb3f000e506285ae5123cdfa7)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Andre Przywara [Fri, 13 Dec 2013 13:23:26 +0000 (14:23 +0100)]
ARM/KVM: save and restore generic timer registers
For migration to work we need to save (and later restore) the state of
each core's virtual generic timer.
Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export
the three needed registers (control, counter, compare value).
Though they live in cp15 space, we don't use the existing list, since
they need special accessor functions and the arch timer is optional.
Acked-by: Marc Zynger <marc.zyngier@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
39735a3a390431bcf60f9174b7d64f787fd6afa9)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Sat, 16 Nov 2013 18:51:25 +0000 (10:51 -0800)]
arm/arm64: KVM: arch_timer: Initialize cntvoff at kvm_init
Initialize the cntvoff at kvm_init_vm time, not before running the VCPUs
at the first time because that will overwrite any potentially restored
values from user space.
Cc: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Marc Zynger <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
a1a64387adeeba7a34ce06f2774e81f496ee803b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Wed, 20 Nov 2013 01:43:19 +0000 (17:43 -0800)]
arm: KVM: Don't return PSCI_INVAL if waitqueue is inactive
The current KVM implementation of PSCI returns INVALID_PARAMETERS if the
waitqueue for the corresponding CPU is not active. This does not seem
correct, since KVM should not care what the specific thread is doing,
for example, user space may not have called KVM_RUN on this VCPU yet or
the thread may be busy looping to user space because it received a
signal; this is really up to the user space implementation. Instead we
should check specifically that the CPU is marked as being turned off,
regardless of the VCPU thread state, and if it is, we shall
simply clear the pause flag on the CPU and wake up the thread if it
happens to be blocked for us.
Further, the implementation seems to be racy when executing multiple
VCPU threads. There really isn't a reasonable user space programming
scheme to ensure all secondary CPUs have reached kvm_vcpu_first_run_init
before turning on the boot CPU.
Therefore, set the pause flag on the vcpu at VCPU init time (which can
reasonably be expected to be completed for all CPUs by user space before
running any VCPUs) and clear both this flag and the feature (in case the
feature can somehow get set again in the future) and ping the waitqueue
on turning on a VCPU using PSCI.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
478a8237f656d86d25b3e4e4bf3c48f590156294)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Masanari Iida [Sun, 8 Dec 2013 15:22:53 +0000 (00:22 +0900)]
treewide: Fix typos in printk
Correct spelling typo in various part of kernel
[ cdall: Pickes KVM/arm64 specific part not already merged into LSK ]
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit
77d84ff87e9d38072abcca665ca22cb1da41cb86)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Lorenzo Pieralisi [Mon, 5 Aug 2013 14:04:46 +0000 (15:04 +0100)]
arm: kvm: implement CPU PM notifier
Upon CPU shutdown and consequent warm-reboot, the hypervisor CPU state
must be re-initialized. This patch implements a CPU PM notifier that
upon warm-boot calls a KVM hook to reinitialize properly the hypervisor
state so that the CPU can be safely resumed.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
(cherry picked from commit
1fcf7ce0c60213994269fb59569ec161eb6e08d6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Takuya Yoshikawa [Fri, 13 Dec 2013 06:07:21 +0000 (15:07 +0900)]
KVM: Use cond_resched() directly and remove useless kvm_resched()
Since the commit
15ad7146 ("KVM: Use the scheduler preemption notifiers
to make kvm preemptible"), the remaining stuff in this function is a
simple cond_resched() call with an extra need_resched() check which was
there to avoid dropping VCPUs unnecessarily. Now it is meaningless.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
c08ac06ab3f3cdb8d34376c3a8a5e46a31a62c8f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Andy Honig [Tue, 19 Nov 2013 00:09:22 +0000 (16:09 -0800)]
KVM: Improve create VCPU parameter (CVE-2013-4587)
In multiple functions the vcpu_id is used as an offset into a bitfield. Ag
malicious user could specify a vcpu_id greater than 255 in order to set or
clear bits in kernel memory. This could be used to elevate priveges in the
kernel. This patch verifies that the vcpu_id provided is less than 255.
The api documentation already specifies that the vcpu_id must be less than
max_vcpus, but this is currently not checked.
Reported-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
338c7dbadd2671189cec7faf64c84d01071b3f96)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Santosh Shilimkar [Tue, 19 Nov 2013 19:59:12 +0000 (14:59 -0500)]
arm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappings
KVM initialisation fails on architectures implementing virt_to_idmap()
because virt_to_phys() on such architectures won't fetch you the correct
idmap page.
So update the KVM ARM code to use the virt_to_idmap() to fix the issue.
Since the KVM code is shared between arm and arm64, we create
kvm_virt_to_phys() and handle the redirection in respective headers.
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
4fda342cc7f577599c53fd27b99c953c7b1da18a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Heiko Carstens [Mon, 18 Nov 2013 09:35:55 +0000 (10:35 +0100)]
KVM: kvm_clear_guest_page(): fix empty_zero_page usage
Using the address of 'empty_zero_page' as source address in order to
clear a page is wrong. On some architectures empty_zero_page is only the
pointer to the struct page of the empty_zero_page. Therefore the clear
page operation would copy the contents of a couple of struct pages instead
of clearing a page. For kvm only arm/arm64 are affected by this bug.
To fix this use the ZERO_PAGE macro instead which will return the struct
page address of the empty_zero_page on all architectures.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
8a3caa6d74597c2a083f7c87f866891a0b12540b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Fri, 15 Nov 2013 21:14:12 +0000 (13:14 -0800)]
arm/arm64: KVM: Fix hyp mappings of vmalloc regions
Using virt_to_phys on percpu mappings is horribly wrong as it may be
backed by vmalloc. Introduce kvm_kaddr_to_phys which translates both
types of valid kernel addresses to the corresponding physical address.
At the same time resolves a typing issue where we were storing the
physical address as a 32 bit unsigned long (on arm), truncating the
physical address for addresses above the 4GB limit. This caused
breakage on Keystone.
Cc: <stable@vger.kernel.org> [3.10+]
Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
40c2729bab48e2832b17c1fa8af9db60e776131b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 5 Nov 2013 14:12:15 +0000 (14:12 +0000)]
arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
When booting a vcpu using PSCI, make sure we start it with the
endianness of the caller. Otherwise, secondaries can be pretty
unhappy to execute a BE kernel in LE mode...
This conforms to PSCI spec Rev B, 5.13.3.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
ce94fe93d566bf381c6ecbd45010d36c5f04d692)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 12 Feb 2013 12:40:22 +0000 (12:40 +0000)]
arm/arm64: KVM: MMIO support for BE guest
Do the necessary byteswap when host and guest have different
views of the universe. Actually, the only case we need to take
care of is when the guest is BE. All the other cases are naturally
handled.
Also be careful about endianness when the data is being memcopy-ed
from/to the run buffer.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
6d89d2d9b5bac9dbe40ee106ceda9307b6265234)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 5 Nov 2013 18:29:46 +0000 (18:29 +0000)]
arm64: KVM: vgic: byteswap GICv2 access on world switch if BE
Ensure that accesses to the GICH_* registers are byteswapped
when the kernel is compiled as big-endian.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit
c5b2c0f5203b3bc678a8967daedf7114029975ae)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 5 Nov 2013 18:29:45 +0000 (18:29 +0000)]
arm64: KVM: initialize HYP mode following the kernel endianness
Force SCTLR_EL2.EE to 1 if the kernel is compiled as BE.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit
18ea3dbc9e5c8a53a361b17c4a5676ea6f4bcb72)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Gleb Natapov [Tue, 5 Nov 2013 14:04:18 +0000 (16:04 +0200)]
KVM: remove vm mmap method
It was used in conjunction with KVM_SET_MEMORY_REGION ioctl which was
removed by
b74a07beed0 in 2010, QEMU stopped using it in 2008, so
it is time to remove the code finally.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
80f5b5e700fa9c58480eafce0d47367bafb70006)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Michael S. Tsirkin [Wed, 30 Oct 2013 19:43:01 +0000 (21:43 +0200)]
kvm_host: typo fix
fix up typo in comment.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
81e87e26796782e014fd1f2bb9cd8fb6ce4021a8)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Alex Williamson [Wed, 30 Oct 2013 17:02:17 +0000 (11:02 -0600)]
kvm: Add VFIO device
So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made. This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction. The user creates
the device and can add and remove VFIO groups to it via file
descriptors. When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
ec53500fae421e07c5d035918ca454a429732ef4)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Borislav Petkov [Sun, 22 Sep 2013 14:44:50 +0000 (16:44 +0200)]
kvm: Add KVM_GET_EMULATED_CPUID
Add a kvm ioctl which states which system functionality kvm emulates.
The format used is that of CPUID and we return the corresponding CPUID
bits set for which we do emulate functionality.
Make sure ->padding is being passed on clean from userspace so that we
can use it for something in the future, after the ioctl gets cast in
stone.
s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at
it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
9c15bb1d0a8411f9bb3395d21d5309bde7da0c1c)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Paolo Bonzini [Wed, 30 Oct 2013 11:12:13 +0000 (12:12 +0100)]
KVM: use a more sensible error number when debugfs directory creation fails
I don't know if this was due to cut and paste, or somebody was really
using a D20 to pick the error code for kvm_init_debugfs as suggested by
Linus (EFAULT is 14, so the possibility cannot be entirely ruled out).
In any case, this patch fixes it.
Reported-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
0c8eb04a6241da28deb108181213b791c378123b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 2 Aug 2013 10:41:13 +0000 (11:41 +0100)]
arm64: KVM: Yield CPU when vcpu executes a WFE
On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.
The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
d241aac798eb042e605f78c31a4122e583b2cd13)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Yang Zhang [Thu, 24 Oct 2013 01:56:39 +0000 (09:56 +0800)]
KVM: Mapping IOMMU pages after updating memslot
In kvm_iommu_map_pages(), we need to know the page size via call
kvm_host_page_size(). And it will check whether the target slot
is valid before return the right page size.
Currently, we will map the iommu pages when creating a new slot.
But we call kvm_iommu_map_pages() during preparing the new slot.
At that time, the new slot is not visible by domain(still in preparing).
So we cannot get the right page size from kvm_host_page_size() and
this will break the IOMMU super page logic.
The solution is to map the iommu pages after we insert the new slot
into domain.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Tested-by: Patrick Lu <patrick.lu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
e0230e1327fb862c9b6cde24ae62d55f9db62c9b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 18 Oct 2013 17:19:03 +0000 (18:19 +0100)]
arm/arm64: KVM: PSCI: use MPIDR to identify a target CPU
The KVM PSCI code blindly assumes that vcpu_id and MPIDR are
the same thing. This is true when vcpus are organized as a flat
topology, but is wrong when trying to emulate any other topology
(such as A15 clusters).
Change the KVM PSCI CPU_ON code to look at the MPIDR instead
of the vcpu_id to pick a target CPU.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
79c648806f9034abf54332b78043bb242189d953)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 18 Oct 2013 17:19:06 +0000 (18:19 +0100)]
ARM: KVM: drop limitation to 4 CPU VMs
Now that the KVM/arm code knows about affinity, remove the hard
limit of 4 vcpus per VM.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
7999b4d18211bcfb40e3574cf75e94518e9fa2c6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 18 Oct 2013 17:19:05 +0000 (18:19 +0100)]
ARM: KVM: fix L2CTLR to be per-cluster
The L2CTLR register contains the number of CPUs in this cluster.
Make sure the register content is actually relevant to the vcpu
that is being configured by computing the number of cores that are
part of its cluster.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
9cbb6d969cb6561de45d917b8bb9281cb374bb35)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 18 Oct 2013 17:19:04 +0000 (18:19 +0100)]
ARM: KVM: Fix MPIDR computing to support virtual clusters
In order to be able to support more than 4 A7 or A15 CPUs,
we need to fix the MPIDR computing to reflect the fact that
both A15 and A7 can only exist in clusters of at most 4 CPUs.
Fix the MPIDR computing to allow virtual clusters to be exposed
to the guest.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
2d1d841bd44e24b58a3d3cc4fa793670aaa38fbf)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Wed, 2 Oct 2013 22:32:01 +0000 (15:32 -0700)]
KVM: ARM: Transparent huge page (THP) support
Support transparent huge pages in KVM/ARM and KVM/ARM64. The
transparent_hugepage_adjust is not very pretty, but this is also how
it's solved on x86 and seems to be simply an artifact on how THPs
behave. This should eventually be shared across architectures if
possible, but that can always be changed down the road.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
9b5fdb9781f74fb15827e465bfb5aa63211953c8)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Thu, 1 Nov 2012 16:14:45 +0000 (17:14 +0100)]
KVM: ARM: Support hugetlbfs backed huge pages
Support huge pages in KVM/ARM and KVM/ARM64. The pud_huge checking on
the unmap path may feel a bit silly as the pud_huge check is always
defined to false, but the compiler should be smart about this.
Note: This deals only with VMAs marked as huge which are allocated by
users through hugetlbfs only. Transparent huge pages can only be
detected by looking at the underlying pages (or the page tables
themselves) and this patch so far simply maps these on a page-by-page
level in the Stage-2 page tables.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
ad361f093c1e31d0b43946210a32ab4ff5c49850)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Wed, 16 Oct 2013 01:10:42 +0000 (18:10 -0700)]
KVM: ARM: Update comments for kvm_handle_wfi
Update comments to reflect what is really going on and add the TWE bit
to the comments in kvm_arm.h.
Also renames the function to kvm_handle_wfx like is done on arm64 for
consistency and uber-correctness.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
86ed81aa2e1ce05a4e7f0819f0dfc34e8d8fb910)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 8 Oct 2013 17:38:13 +0000 (18:38 +0100)]
ARM: KVM: Yield CPU when vcpu executes a WFE
On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.
This creates contention, and the observed slowdown is 40x for
hackbench. No, this isn't a typo.
The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommited.
Quick test to estimate the performance: hackbench 1 process 1000
2xA15 host (baseline): 1.843s
2xA15 guest w/o patch: 2.083s
4xA15 guest w/o patch: 80.212s
8xA15 guest w/o patch: Could not be bothered to find out
2xA15 guest w/ patch: 2.102s
4xA15 guest w/ patch: 3.205s
8xA15 guest w/ patch: 6.887s
So we go from a 40x degradation to 1.5x in the 2x overcommit case,
which is vaguely more acceptable.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
58d5ec8f8ee318b26b29207874fbaee626973952)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Aneesh Kumar K.V [Mon, 7 Oct 2013 16:48:00 +0000 (22:18 +0530)]
kvm: Add struct kvm arg to memslot APIs
We will use that in the later patch to find the kvm ops handler
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit
5587027ce9d59a57aecaa190be1c8e560aaff45d)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
chai wen [Mon, 14 Oct 2013 14:22:33 +0000 (22:22 +0800)]
KVM: Drop FOLL_GET in GUP when doing async page fault
Page pinning is not mandatory in kvm async page fault processing since
after async page fault event is delivered to a guest it accesses page once
again and does its own GUP. Drop the FOLL_GET flag in GUP in async_pf
code, and do some simplifying in check/clear processing.
Suggested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gu zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
f2e106692d5189303997ad7b96de8d8123aa5613)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Wed, 2 Oct 2013 21:22:30 +0000 (14:22 -0700)]
KVM: arm64: Get rid of KVM_HPAGE defines
Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
ef0cfe71c2b1710cd4ae747537e36c56f9a26ccf)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Wed, 2 Oct 2013 21:22:29 +0000 (14:22 -0700)]
KVM: ARM: Get rid of KVM_HPAGE defines
The KVM_HPAGE_DEFINES are a little artificial on ARM, since the huge
page size is statically defined at compile time and there is only a
single huge page size.
Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
dc6f6763dfeaf2dfec906bb78875dcea162accd9)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Wed, 2 Oct 2013 21:22:28 +0000 (14:22 -0700)]
KVM: Move gfn_to_index to x86 specific code
The gfn_to_index function relies on huge page defines which either may
not make sense on systems that don't support huge pages or are defined
in an unconvenient way for other architectures. Since this is
x86-specific, move the function to arch/x86/include/asm/kvm_host.h.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
6d9d41e57440e32a3400f37aa05ef7a1a09ced64)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Jonathan Austin [Thu, 26 Sep 2013 15:49:28 +0000 (16:49 +0100)]
KVM: ARM: Add support for Cortex-A7
This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts.
As Cortex-A7 is architecturally compatible with A15, this patch is largely just
generalising existing code. Areas where 'implementation defined' behaviour
is identical for A7 and A15 is moved to allow it to be used by both cores.
The check to ensure that coprocessor register tables are sorted correctly is
also moved in to 'common' code to avoid each new cpu doing its own check
(and possibly forgetting to do so!)
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
e8c2d99f8277d68d28a9f99d16289712bc2aee7f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Jonathan Austin [Thu, 26 Sep 2013 15:49:26 +0000 (16:49 +0100)]
KVM: ARM: fix the size of TTBCR_{T0SZ,T1SZ} masks
The T{0,1}SZ fields of TTBCR are 3 bits wide when using the long descriptor
format. Likewise, the T0SZ field of the HTCR is 3-bits. KVM currently
defines TTBCR_T{0,1}SZ as 3, not 7.
The T0SZ mask is used to calculate the value for the HTCR, both to pick out
TTBCR.T0SZ and mask off the equivalent field in the HTCR during
read-modify-write. The incorrect mask size causes the (UNKNOWN) reset value
of HTCR.T0SZ to leak in to the calculated HTCR value. Linux will hang when
initializing KVM if HTCR's reset value has bit 2 set (sometimes the case on
A7/TC2)
Fixing T0SZ allows A7 cores to boot and T1SZ is also fixed for completeness.
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
5e497046f005528464f9600a4ee04f49df713596)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Jonathan Austin [Thu, 26 Sep 2013 15:49:27 +0000 (16:49 +0100)]
KVM: ARM: Fix calculation of virtual CPU ID
KVM does not have a notion of multiple clusters for CPUs, just a linear
array of CPUs. When using a system with cores in more than one cluster, the
current method for calculating the virtual MPIDR will leak the (physical)
cluster information into the virtual MPIDR. One effect of this is that
Linux under KVM fails to boot multiple CPUs that aren't in the 0th cluster.
This patch does away with exposing the real MPIDR fields in favour of simply
using the virtual CPU number (but preserving the U bit, as before).
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
1158fca401e09665c440a9fe4fd4f131ee85c13b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Andre Richter [Wed, 2 Oct 2013 10:23:26 +0000 (12:23 +0200)]
virt/kvm/iommu.c: Add leading zeros to device's BDF notation in debug messages
When KVM (de)assigns PCI(e) devices to VMs, a debug message is printed
including the BDF notation of the respective device. Currently, the BDF
notation does not have the commonly used leading zeros. This produces
messages like "assign device 0:1:8.0", which look strange at first sight.
The patch fixes this by exchanging the printk(KERN_DEBUG ...) with dev_info()
and also inserts "kvm" into the debug message, so that it is obvious where
the message comes from. Also reduces LoC.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Andre Richter <andre.o.richter@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
29242cb5c63b1f8e12e8055ba1a6c3e0004fa86d)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Gleb Natapov [Tue, 1 Oct 2013 16:58:36 +0000 (19:58 +0300)]
Fix NULL dereference in gfn_to_hva_prot()
gfn_to_memslot() can return NULL or invalid slot. We need to check slot
validity before accessing it.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
a2ac07fe292ea41296049dfdbfeed203e2467ee7)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Anup Patel [Mon, 30 Sep 2013 08:50:07 +0000 (14:20 +0530)]
ARM/ARM64: KVM: Implement KVM_ARM_PREFERRED_TARGET ioctl
For implementing CPU=host, we need a mechanism for querying
preferred VCPU target type on underlying Host.
This patch implements KVM_ARM_PREFERRED_TARGET vm ioctl which
returns struct kvm_vcpu_init instance containing information
about preferred VCPU target type and target specific features
available for it.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
42c4e0c77ac91505ab94284b14025e3a0865c0a5)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Anup Patel [Mon, 30 Sep 2013 08:50:06 +0000 (14:20 +0530)]
ARM64: KVM: Implement kvm_vcpu_preferred_target() function
This patch implements kvm_vcpu_preferred_target() function for
KVM ARM64 which will help us implement KVM_ARM_PREFERRED_TARGET
ioctl for user space.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
473bdc0e6565ebb22455657a40daa21b6b4ee16b)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Anup Patel [Mon, 30 Sep 2013 08:50:05 +0000 (14:20 +0530)]
ARM: KVM: Implement kvm_vcpu_preferred_target() function
This patch implements kvm_vcpu_preferred_target() function for
KVM ARM which will help us implement KVM_ARM_PREFERRED_TARGET ioctl
for user space.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
4a6fee805d5e278e4733bf933cb5b184b7a8be1f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Anup Patel [Wed, 11 Sep 2013 13:04:22 +0000 (18:34 +0530)]
KVM: ARM: Fix typo in comments of inject_abt()
Very minor typo in comments of inject_abt() when we update fault status
register for injecting prefetch abort.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
b373e492f3a3469c615c2ae218d2f723900bf981)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Paolo Bonzini [Wed, 25 Sep 2013 11:53:07 +0000 (13:53 +0200)]
KVM: Convert kvm_lock back to non-raw spinlock
In commit
e935b8372cf8 ("KVM: Convert kvm_lock to raw_spinlock"),
the kvm_lock was made a raw lock. However, the kvm mmu_shrink()
function tries to grab the (non-raw) mmu_lock within the scope of
the raw locked kvm_lock being held. This leads to the following:
BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 55, name: kswapd0
Preemption disabled at:[<
ffffffffa0376eac>] mmu_shrink+0x5c/0x1b0 [kvm]
Pid: 55, comm: kswapd0 Not tainted 3.4.34_preempt-rt
Call Trace:
[<
ffffffff8106f2ad>] __might_sleep+0xfd/0x160
[<
ffffffff817d8d64>] rt_spin_lock+0x24/0x50
[<
ffffffffa0376f3c>] mmu_shrink+0xec/0x1b0 [kvm]
[<
ffffffff8111455d>] shrink_slab+0x17d/0x3a0
[<
ffffffff81151f00>] ? mem_cgroup_iter+0x130/0x260
[<
ffffffff8111824a>] balance_pgdat+0x54a/0x730
[<
ffffffff8111fe47>] ? set_pgdat_percpu_threshold+0xa7/0xd0
[<
ffffffff811185bf>] kswapd+0x18f/0x490
[<
ffffffff81070961>] ? get_parent_ip+0x11/0x50
[<
ffffffff81061970>] ? __init_waitqueue_head+0x50/0x50
[<
ffffffff81118430>] ? balance_pgdat+0x730/0x730
[<
ffffffff81060d2b>] kthread+0xdb/0xe0
[<
ffffffff8106e122>] ? finish_task_switch+0x52/0x100
[<
ffffffff817e1e94>] kernel_thread_helper+0x4/0x10
[<
ffffffff81060c50>] ? __init_kthread_worker+0x
After the previous patch, kvm_lock need not be a raw spinlock anymore,
so change it back.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
2f303b74a62fb74983c0a66e2df353be963c527c)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Paolo Bonzini [Tue, 10 Sep 2013 10:58:35 +0000 (12:58 +0200)]
KVM: protect kvm_usage_count with its own spinlock
The VM list need not be protected by a raw spinlock. Separate the
two so that kvm_lock can be made non-raw.
Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
4a937f96f3a29c58b7edd349d2e4dfac371efdf2)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Paolo Bonzini [Tue, 10 Sep 2013 10:57:17 +0000 (12:57 +0200)]
KVM: cleanup (physical) CPU hotplug
Remove the useless argument, and do not do anything if there are no
VMs running at the time of the hotplug.
Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
4fa92fb25ae5a2d79d872ab54df511c831b1f363)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Olof Johansson [Wed, 11 Sep 2013 22:27:41 +0000 (15:27 -0700)]
ARM: kvm: rename cpu_reset to avoid name clash
cpu_reset is already #defined in <asm/proc-fns.h> as processor.reset,
so it expands here and causes problems.
Cc: <stable@vger.kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
ac570e0493815e0b41681c89cb50d66421429d27)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Radim Krčmář [Wed, 4 Sep 2013 20:32:24 +0000 (22:32 +0200)]
kvm: remove .done from struct kvm_async_pf
'.done' is used to mark the completion of 'async_pf_execute()', but
'cancel_work_sync()' returns true when the work was canceled, so we
use it instead.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
98fda169290b3b28c0f2db2b8f02290c13da50ef)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Radim Krčmář [Wed, 4 Sep 2013 20:32:23 +0000 (22:32 +0200)]
kvm: free resources after canceling async_pf
When we cancel 'async_pf_execute()', we should behave as if the work was
never scheduled in 'kvm_setup_async_pf()'.
Fixes a bug when we can't unload module because the vm wasn't destroyed.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
28b441e24088081c1e213139d1303b451a34a4f4)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Paolo Bonzini [Mon, 9 Sep 2013 11:52:33 +0000 (13:52 +0200)]
KVM: mmu: allow page tables to be in read-only slots
Page tables in a read-only memory slot will currently cause a triple
fault because the page walker uses gfn_to_hva and it fails on such a slot.
OVMF uses such a page table; however, real hardware seems to be fine with
that as long as the accessed/dirty bits are set. Save whether the slot
is readonly, and later check it when updating the accessed and dirty bits.
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
ba6a3541545542721ce821d1e7e5ce35752e6fdf)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Wed, 14 Aug 2013 19:33:48 +0000 (12:33 -0700)]
ARM: KVM: Add newlines to panic strings
The panic strings are hard to read and on narrow terminals some
characters are simply truncated off the panic message.
Make is slightly prettier with a newline in the Hyp panic strings.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
1fe40f6d39d23f39e643607a3e1883bfc74f1244)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Mon, 19 Aug 2013 21:16:57 +0000 (14:16 -0700)]
ARM: KVM: Work around older compiler bug
Compilers before 4.6 do not behave well with unnamed fields in structure
initializers and therefore produces build errors:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676
By refering to the unnamed union using braces, both older and newer
compilers produce the same result.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Russell King <linux@arm.linux.org.uk>
Tested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
6833d83891140aedab7841589b7c7dbd7b600235)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Fri, 9 Aug 2013 03:34:22 +0000 (20:34 -0700)]
ARM: KVM: Simplify tracepoint text
The tracepoint for kvm_guest_fault was extremely long, make it a
slightly bit shorter.
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
6e72cc5700fe6b8776d537b736dab64b21ae0f1f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Fri, 9 Aug 2013 03:35:07 +0000 (20:35 -0700)]
ARM: KVM: Fix kvm_set_pte assignment
THe kvm_set_pte function was actually assigning the entire struct to the
structure member, which should work because the structure only has that
one member, but it is still not very nice.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
0963e5d0f22f9d197dbf206d8b5b2a150722cf5e)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Thu, 29 Aug 2013 10:08:25 +0000 (11:08 +0100)]
ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256
The Versatile Express TC2 board, which we use as our main emulated
platform in QEMU, defines 160+32 == 192 interrupts, so limiting the
number of interrupts to 128 is not quite going to cut it for real board
emulation.
Note that this didn't use to be a problem because QEMU was buggy and
only defined 128 interrupts until recently.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
9b2d2e0df8a49414b1e5bc89148c9984dd87782a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Thu, 29 Aug 2013 10:08:24 +0000 (11:08 +0100)]
ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs
For bytemaps each IRQ field is 1 byte wide, so we pack 4 irq fields in
one word and since there are 32 private (per cpu) irqs, we have 8
private u32 fields on the vgic_bytemap struct. We shift the offset from
the base of the register group right by 2, giving us the word index
instead of the field index. But then there are 8 private words, not 4,
which is also why we subtract 8 words from the offset of the shared
words.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
8d98915b6bda499e47d19166101d0bbcfd409c80)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Thu, 29 Aug 2013 10:08:23 +0000 (11:08 +0100)]
ARM: KVM: vgic: fix GICD_ICFGRn access
All the code in handle_mmio_cfg_reg() assumes the offset has
been shifted right to accomodate for the 2:1 bit compression,
but this is only done when getting the register address.
Shift the offset early so the code works mostly unchanged.
Reported-by: Zhaobo (Bob, ERC) <zhaobo@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
6545eae3d7a1b6dc2edb8ede9107998aee1207ef)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Thu, 29 Aug 2013 10:08:22 +0000 (11:08 +0100)]
ARM: KVM: vgic: simplify vgic_get_target_reg
vgic_get_target_reg is quite complicated, for no good reason.
Actually, it is fairly easy to write it in a much more efficient
way by using the target CPU array instead of the bitmap.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
986af8e0789a41ac4844e6eefed4a33e86524918)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Andrea Arcangeli [Thu, 25 Jul 2013 01:04:38 +0000 (03:04 +0200)]
kvm: optimize away THP checks in kvm_is_mmio_pfn()
The checks on PG_reserved in the page structure on head and tail pages
aren't necessary because split_huge_page wouldn't transfer the
PG_reserved bit from head to tail anyway.
This was a forward-thinking check done in the case PageReserved was
set by a driver-owned page mapped in userland with something like
remap_pfn_range in a VM_PFNMAP region, but using hugepmds (not
possible right now). It was meant to be very safe, but it's overkill
as it's unlikely split_huge_page could ever run without the driver
noticing and tearing down the hugepage itself.
And if a driver in the future will really want to map a reserved
hugepage in userland using an huge pmd it should simply take care of
marking all subpages reserved too to keep KVM safe. This of course
would require such a hypothetical driver to tear down the huge pmd
itself and splitting the hugepage itself, instead of relaying on
split_huge_page, but that sounds very reasonable, especially
considering split_huge_page wouldn't currently transfer the reserved
bit anyway.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
11feeb498086a3a5907b8148bdf1786a9b18fc55)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Yann Droneaud [Sat, 24 Aug 2013 20:14:07 +0000 (22:14 +0200)]
kvm: use anon_inode_getfd() with O_CLOEXEC flag
KVM uses anon_inode_get() to allocate file descriptors as part
of some of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file descriptor.
In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().
This patch set O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() to not leak file descriptors across exec().
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://lkml.kernel.org/r/cover.1377372576.git.ydroneaud@opteya.com
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
(cherry picked from commit
24009b0549de563006705b9af8694fc8fc9a5aa1)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Tue, 6 Aug 2013 04:34:16 +0000 (05:34 +0100)]
ARM: 7808/1: KVM: mm: Get rid of L_PTE_USER ref from PAGE_S2_DEVICE
THe L_PTE_USER actually has nothing to do with stage 2 mappings and the
L_PTE_S2_RDWR value sets the readable bit, which was what L_PTE_USER
was used for before proper handling of stage 2 memory defines.
Changelog:
[v3]: Drop call to kvm_set_s2pte_writable in mmu.c
[v2]: Change default mappings to be r/w instead of r/o, as per Marc
Zyngier's suggestion.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
(cherry picked from commit
8947c09d05da9f0436f423518f449beaa5ea1bdc)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Will Deacon [Mon, 13 May 2013 11:08:06 +0000 (12:08 +0100)]
ARM: kvm: use inner-shareable barriers after TLB flushing
When flushing the TLB at PL2 in response to remapping at stage-2 or VMID
rollover, we have a dsb instruction to ensure completion of the command
before continuing.
Since we only care about other processors for TLB invalidation, use the
inner-shareable variant of the dsb instruction instead.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit
e3ab547f57bd626201d4b715b696c80ad1ef4ba2)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Tue, 30 Jul 2013 03:46:04 +0000 (20:46 -0700)]
KVM: ARM: Squash len warning
The 'len' variable was declared an unsigned and then checked for less
than 0, which results in warnings on some compilers. Since len is
assigned an int, make it an int.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
2184a60de26b94bc5a88de3e5a960ef9ff54ba5a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Chen Gang [Mon, 22 Jul 2013 03:40:38 +0000 (04:40 +0100)]
arm64: KVM: use 'int' instead of 'u32' for variable 'target' in kvm_host.h.
'target' will be set to '-1' in kvm_arch_vcpu_init(), and it need check
'target' whether less than zero or not in kvm_vcpu_initialized().
So need define target as 'int' instead of 'u32', just like ARM has done.
The related warning:
arch/arm64/kvm/../../../arch/arm/kvm/arm.c:497:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
Signed-off-by: Chen Gang <gang.chen@asianux.com>
[Marc: reformated the Subject line to fit the series]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
6c8c0c4dc0e98ee2191211d66e9f876e95787073)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 11 Jun 2013 17:05:25 +0000 (18:05 +0100)]
arm64: KVM: add missing dsb before invalidating Stage-2 TLBs
When performing a Stage-2 TLB invalidation, it is necessary to
make sure the write to the page tables is observable by all CPUs.
For this purpose, add dsb instructions to __kvm_tlb_flush_vmid_ipa
and __kvm_flush_vm_context before doing the TLB invalidation itself.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
f142e5eeb724cfbedd203b32b3b542d78dbe2545)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Fri, 7 Jun 2013 10:02:34 +0000 (11:02 +0100)]
arm64: KVM: perform save/restore of PAR_EL1
Not saving PAR_EL1 is an unfortunate oversight. If the guest
performs an AT* operation and gets scheduled out before reading
the result of the translation from PAREL1, it could become
corrupted by another guest or the host.
Saving this register is made slightly more complicated as KVM also
uses it on the permission fault handling path, leading to an ugly
"stash and restore" sequence. Fortunately, this is already a slow
path so we don't really care. Also, Linux doesn't do any AT*
operation, so Linux guests are not impacted by this bug.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit
1bbd80549810637b7381ab0649ba7c7d62f1342a)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 6 Aug 2013 12:05:48 +0000 (13:05 +0100)]
arm64: KVM: fix 2-level page tables unmapping
When using 64kB pages, we only have two levels of page tables,
meaning that PGD, PUD and PMD are fused. In this case, trying
to refcount PUDs and PMDs independently is a a complete disaster,
as they are the same.
We manage to get it right for the allocation (stage2_set_pte uses
{pmd,pud}_none), but the unmapping path clears both pud and pmd
refcounts, which fails spectacularly with 2-level page tables.
The fix is to avoid calling clear_pud_entry when both the pmd and
pud pages are empty. For this, and instead of introducing another
pud_empty function, consolidate both pte_empty and pmd_empty into
page_empty (the code is actually identical) and use that to also
test the validity of the pud.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
979acd5e18c3e5cb7e3308c699d79553af5af8c6)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Christoffer Dall [Tue, 6 Aug 2013 20:50:54 +0000 (13:50 -0700)]
ARM: KVM: Fix unaligned unmap_range leak
The unmap_range function did not properly cover the case when the start
address was not aligned to PMD_SIZE or PUD_SIZE and an entire pte table
or pmd table was cleared, causing us to leak memory when incrementing
the addr.
The fix is to always move onto the next page table entry boundary
instead of adding the full size of the VA range covered by the
corresponding table level entry.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
d3840b26614d8ce3db53c98061d9fcb1b9ccb0dd)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Takuya Yoshikawa [Thu, 4 Jul 2013 04:40:29 +0000 (13:40 +0900)]
KVM: Introduce kvm_arch_memslots_updated()
This is called right after the memslots is updated, i.e. when the result
of update_memslots() gets installed in install_new_memslots(). Since
the memslots needs to be updated twice when we delete or move a memslot,
kvm_arch_commit_memory_region() does not correspond to this exactly.
In the following patch, x86 will use this new API to check if the mmio
generation has reached its maximum value, in which case mmio sptes need
to be flushed out.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Acked-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
e59dbe09f8e6fb8f6ee19dc79d1a2f14299e4cd2)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Thu, 4 Jul 2013 12:34:32 +0000 (13:34 +0100)]
arm64: KVM: Kconfig integration
Finally plug KVM/arm64 into the config system, making it possible
to enable KVM support on AArch64 CPUs.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit
c3eb5b14449a0949e9764d39374a2ea63faae14f)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Arnd Bergmann [Fri, 21 Jun 2013 20:33:22 +0000 (22:33 +0200)]
ARM: kvm: don't include drivers/virtio/Kconfig
The virtio configuration has recently moved and is now visible everywhere.
Including the file again from KVM as we used to need earlier now causes
dependency problems:
warning: (CAIF_VIRTIO && VIRTIO_PCI && VIRTIO_MMIO && REMOTEPROC && RPMSG)
selects VIRTIO which has unmet direct dependencies (VIRTUALIZATION)
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
8bd4ffd6b3a98f00267051dc095076ea2ff06ea8)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Geoff Levand [Fri, 7 Jun 2013 01:02:54 +0000 (18:02 -0700)]
arm/kvm: Cleanup KVM_ARM_MAX_VCPUS logic
Commit
d21a1c83c7595e387545632e44cd7797b76e19cc (ARM: KVM: define KVM_ARM_MAX_VCPUS
unconditionally) changed the Kconfig logic for KVM_ARM_MAX_VCPUS to work around a
build error arising from the use of KVM_ARM_MAX_VCPUS when CONFIG_KVM=n. The
resulting Kconfig logic is a bit awkward and leaves a KVM_ARM_MAX_VCPUS always
defined in the kernel config file.
This change reverts the Kconfig logic back and adds a simple preprocessor
conditional in kvm_host.h to handle when CONFIG_KVM_ARM_MAX_VCPUS is undefined.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
(cherry picked from commit
f2dda9d829818b055510187059cdfa4ece10c82d)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 14 May 2013 11:11:39 +0000 (12:11 +0100)]
ARM: KVM: get rid of S2_PGD_SIZE
S2_PGD_SIZE defines the number of pages used by a stage-2 PGD
and is unused, except for a VM_BUG_ON check that missuses the
define.
As the check is very unlikely to ever triggered except in
circumstances where KVM is the least of our worries, just kill
both the define and the VM_BUG_ON check.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
(cherry picked from commit
4db845c3d8e2f8a219e8ac48834dd4fe085e5d63)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 14 May 2013 11:11:38 +0000 (12:11 +0100)]
ARM: KVM: don't special case PC when doing an MMIO
Admitedly, reading a MMIO register to load PC is very weird.
Writing PC to a MMIO register is probably even worse. But
the architecture doesn't forbid any of these, and injecting
a Prefetch Abort is the wrong thing to do anyway.
Remove this check altogether, and let the adventurous guest
wander into LaLaLand if they feel compelled to do so.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
(cherry picked from commit
8734f16fb2aa4ff0bb57ad6532661a38bc8ff957)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 14 May 2013 11:11:37 +0000 (12:11 +0100)]
ARM: KVM: use phys_addr_t instead of unsigned long long for HYP PGDs
HYP PGDs are passed around as phys_addr_t, except just before calling
into the hypervisor init code, where they are cast to a rather weird
unsigned long long.
Just keep them around as phys_addr_t, which is what makes the most
sense.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
(cherry picked from commit
dac288f7b38a7439502b77dabcdf8a9a5c4ae721)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Marc Zyngier [Tue, 14 May 2013 11:11:35 +0000 (12:11 +0100)]
ARM: KVM: remove dead prototype for __kvm_tlb_flush_vmid
__kvm_tlb_flush_vmid has been renamed to __kvm_tlb_flush_vmid_ipa,
and the old prototype should have been removed when the code was
modified.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
(cherry picked from commit
368074d908b785588778f00b4384376cd636f4a1)
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>