firefly-linux-kernel-4.4.55.git
11 years agoMerge branch 'kvm-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390...
Paolo Bonzini [Mon, 20 Jan 2014 12:00:02 +0000 (13:00 +0100)]
Merge branch 'kvm-urgent' of git://git./linux/kernel/git/kvms390/linux into kvm-queue

A fix for a regression that is in current kvm/next, which is
targetted for 3.14.

11 years agokvm: make KVM_MMU_AUDIT help text more readable
Randy Dunlap [Sat, 18 Jan 2014 02:02:38 +0000 (18:02 -0800)]
kvm: make KVM_MMU_AUDIT help text more readable

Make KVM_MMU_AUDIT kconfig help text readable and collapse
two spaces between words down to one space.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: s390: Fix memory access error detection
Christian Borntraeger [Mon, 20 Jan 2014 11:34:13 +0000 (12:34 +0100)]
KVM: s390: Fix memory access error detection

Seems that commit 210b1607012cc9034841a393e0591b2c86d9e26c
(KVM: s390: Removed SIE_INTERCEPT_UCONTROL) lost a hunk when we
reworked our patch queue to rework the async_fp code. We now
ignore faults on the sie instruction (guest accesses non-existing
memory) instead of sending a fault into the guest. This leads to
hang situations with the old virtio transport that checks for
descriptor memory after guest memory. Instead of bailing out this
code now goes wild...
Lets re-add the check.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
11 years agoKVM: nVMX: Update guest activity state field on L2 exits
Jan Kiszka [Sat, 4 Jan 2014 17:47:24 +0000 (18:47 +0100)]
KVM: nVMX: Update guest activity state field on L2 exits

Set guest activity state in L1's VMCS according to the VCPUs mp_state.
This ensures we report the correct state in case we L2 executed HLT or
if we put L2 into HLT state and it was now woken up by an event.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: nVMX: Fix nested_run_pending on activity state HLT
Jan Kiszka [Sat, 4 Jan 2014 17:47:23 +0000 (18:47 +0100)]
KVM: nVMX: Fix nested_run_pending on activity state HLT

When we suspend the guest in HLT state, the nested run is no longer
pending - we emulated it completely. So only set nested_run_pending
after checking the activity state.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: nVMX: Clean up handling of VMX-related MSRs
Jan Kiszka [Sat, 4 Jan 2014 17:47:22 +0000 (18:47 +0100)]
KVM: nVMX: Clean up handling of VMX-related MSRs

This simplifies the code and also stops issuing warning about writing to
unhandled MSRs when VMX is disabled or the Feature Control MSR is
locked - we do handle them all according to the spec.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
Jan Kiszka [Sat, 4 Jan 2014 17:47:21 +0000 (18:47 +0100)]
KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject

Already used by nested SVM for tracing nested vmexit: kvm_nested_vmexit
marks exits from L2 to L0 while kvm_nested_vmexit_inject marks vmexits
that are reflected to L1.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
Jan Kiszka [Sat, 4 Jan 2014 17:47:20 +0000 (18:47 +0100)]
KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit

Instead of fixing up the vmcs12 after the nested vmexit, pass key
parameters already when calling nested_vmx_vmexit. This will help
tracing those vmexits.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: nVMX: Leave VMX mode on clearing of feature control MSR
Jan Kiszka [Sat, 4 Jan 2014 17:47:19 +0000 (18:47 +0100)]
KVM: nVMX: Leave VMX mode on clearing of feature control MSR

When userspace sets MSR_IA32_FEATURE_CONTROL to 0, make sure we leave
root and non-root mode, fully disabling VMX. The register state of the
VCPU is undefined after this step, so userspace has to set it to a
proper state afterward.

This enables to reboot a VM while it is running some hypervisor code.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: VMX: Fix DR6 update on #DB exception
Jan Kiszka [Sat, 4 Jan 2014 17:47:17 +0000 (18:47 +0100)]
KVM: VMX: Fix DR6 update on #DB exception

According to the SDM, only bits 0-3 of DR6 "may" be cleared by "certain"
debug exception. So do update them on #DB exception in KVM, but leave
the rest alone, only setting BD and BS in addition to already set bits
in DR6. This also aligns us with kvm_vcpu_check_singlestep.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: SVM: Fix reading of DR6
Jan Kiszka [Sat, 4 Jan 2014 17:47:16 +0000 (18:47 +0100)]
KVM: SVM: Fix reading of DR6

In contrast to VMX, SVM dose not automatically transfer DR6 into the
VCPU's arch.dr6. So if we face a DR6 read, we must consult a new vendor
hook to obtain the current value. And as SVM now picks the DR6 state
from its VMCB, we also need a set callback in order to write updates of
DR6 back.

Fixes a regression of 020df0794f.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
Jan Kiszka [Sat, 4 Jan 2014 17:47:15 +0000 (18:47 +0100)]
KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS

Whenever we change arch.dr7, we also have to call kvm_update_dr7. In
case guest debugging is off, this will synchronize the new state into
hardware.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoadd support for Hyper-V reference time counter
Vadim Rozenfeld [Thu, 16 Jan 2014 09:18:37 +0000 (20:18 +1100)]
add support for Hyper-V reference time counter

Signed-off: Peter Lieven <pl@kamp.de>
Signed-off: Gleb Natapov
Signed-off: Vadim Rozenfeld <vrozenfe@redhat.com>

After some consideration I decided to submit only Hyper-V reference
counters support this time. I will submit iTSC support as a separate
patch as soon as it is ready.

v1 -> v2
1. mark TSC page dirty as suggested by
    Eric Northup <digitaleric@google.com> and Gleb
2. disable local irq when calling get_kernel_ns,
    as it was done by Peter Lieven <pl@amp.de>
3. move check for TSC page enable from second patch
    to this one.

v3 -> v4
    Get rid of ref counter offset.

v4 -> v5
    replace __copy_to_user with kvm_write_guest
    when updateing iTSC page.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: remove useless write to vcpu->hv_clock.tsc_timestamp
Paolo Bonzini [Wed, 15 Jan 2014 17:07:31 +0000 (18:07 +0100)]
KVM: remove useless write to vcpu->hv_clock.tsc_timestamp

After the previous patch from Marcelo, the comment before this write
became obsolete.  In fact, the write is unnecessary.  The calls to
kvm_write_tsc ultimately result in a master clock update as soon as
all TSCs agree and the master clock is re-enabled.  This master
clock update will rewrite tsc_timestamp.

So, together with the comment, delete the dead write too.

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: x86: fix tsc catchup issue with tsc scaling
Marcelo Tosatti [Mon, 6 Jan 2014 14:18:59 +0000 (12:18 -0200)]
KVM: x86: fix tsc catchup issue with tsc scaling

To fix a problem related to different resolution of TSC and system clock,
the offset in TSC units is approximated by

delta = vcpu->hv_clock.tsc_timestamp  -  vcpu->last_guest_tsc

(Guest TSC value at  (Guest TSC value at last VM-exit)
the last kvm_guest_time_update
call)

Delta is then later scaled using mult,shift pair found in hv_clock
structure (which is correct against tsc_timestamp in that
structure).

However, if a frequency change is performed between these two points,
this delta is measured using different TSC frequencies, but scaled using
mult,shift pair for one frequency only.

The end result is an incorrect delta.

The bug which this code works around is not the only cause for
clock backwards events. The global accumulator is still
necessary, so remove the max_kernel_ns fix and rely on the
global accumulator for no clock backwards events.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: x86: limit PIT timer frequency
Marcelo Tosatti [Mon, 6 Jan 2014 14:00:02 +0000 (12:00 -0200)]
KVM: x86: limit PIT timer frequency

Limit PIT timer frequency similarly to the limit applied by
LAPIC timer.

Cc: stable@kernel.org
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: x86: handle invalid root_hpa everywhere
Marcelo Tosatti [Fri, 3 Jan 2014 19:09:32 +0000 (17:09 -0200)]
KVM: x86: handle invalid root_hpa everywhere

Rom Freiman <rom@stratoscale.com> notes other code paths vulnerable to
bug fixed by 989c6b34f6a9480e397b.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
11 years agoMerge tag 'kvm-arm-for-3.14' of git://git.linaro.org/people/christoffer.dall/linux...
Paolo Bonzini [Wed, 15 Jan 2014 11:14:29 +0000 (12:14 +0100)]
Merge tag 'kvm-arm-for-3.14' of git://git.linaro.org/people/christoffer.dall/linux-kvm-arm into kvm-queue

11 years agokvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
Scott Wood [Fri, 10 Jan 2014 00:43:16 +0000 (18:43 -0600)]
kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub

Commit 7940876e1330671708186ac3386aa521ffb5c182 ("kvm: make local
functions static") broke KVM PPC builds due to removing (rather than
moving) the stub version of kvm_vcpu_eligible_for_directed_yield().

This patch reintroduces it.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Alexander Graf <agraf@suse.de>
[Move the #ifdef inside the function. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agokvm: vfio: silence GCC warning
Paul Bolle [Fri, 10 Jan 2014 00:28:46 +0000 (01:28 +0100)]
kvm: vfio: silence GCC warning

Building vfio.o triggers a GCC warning (when building for 32 bits x86):
    arch/x86/kvm/../../../virt/kvm/vfio.c: In function 'kvm_vfio_set_group':
    arch/x86/kvm/../../../virt/kvm/vfio.c:104:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
      void __user *argp = (void __user *)arg;
                          ^

Silence this warning by casting arg to unsigned long.

argp's current type, "void __user *", is always casted to "int32_t
__user *". So its type might as well be changed to "int32_t __user *".

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: ARM: Remove duplicate include
Sachin Kamat [Tue, 7 Jan 2014 08:15:15 +0000 (13:45 +0530)]
KVM: ARM: Remove duplicate include

trace.h was included twice. Remove duplicate inclusion.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoarm/arm64: KVM: relax the requirements of VMA alignment for THP
Marc Zyngier [Fri, 13 Dec 2013 16:56:06 +0000 (16:56 +0000)]
arm/arm64: KVM: relax the requirements of VMA alignment for THP

The THP code in KVM/ARM is a bit restrictive in not allowing a THP
to be used if the VMA is not 2MB aligned. Actually, it is not so much
the VMA that matters, but the associated memslot:

A process can perfectly mmap a region with no particular alignment
restriction, and then pass a 2MB aligned address to KVM. In this
case, KVM will only use this 2MB aligned region, and will ignore
the range between vma->vm_start and memslot->userspace_addr.

It can also choose to place this memslot at whatever alignment it
wants in the IPA space. In the end, what matters is the relative
alignment of the user space and IPA mappings with respect to a
2M page. They absolutely must be the same if you want to use THP.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: VMX: fix use after free of vmx->loaded_vmcs
Marcelo Tosatti [Fri, 3 Jan 2014 19:00:51 +0000 (17:00 -0200)]
KVM: VMX: fix use after free of vmx->loaded_vmcs

After free_loaded_vmcs executes, the "loaded_vmcs" structure
is kfreed, and now vmx->loaded_vmcs points to a kfreed area.
Subsequent free_loaded_vmcs then attempts to manipulate
vmx->loaded_vmcs.

Switch the order to avoid the problem.

https://bugzilla.redhat.com/show_bug.cgi?id=1047892

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
11 years agoKVM: x86: Fix debug typo error in lapic
Chen Fan [Thu, 2 Jan 2014 09:14:11 +0000 (17:14 +0800)]
KVM: x86: Fix debug typo error in lapic

fix the 'vcpi' typos when apic_debug is enabled.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
11 years agokvm: remove dead code
Stephen Hemminger [Sun, 29 Dec 2013 20:13:08 +0000 (12:13 -0800)]
kvm: remove dead code

The function kvm_io_bus_read_cookie is defined but never used
in current in-tree code.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
11 years agokvm: make local functions static
Stephen Hemminger [Sun, 29 Dec 2013 20:12:29 +0000 (12:12 -0800)]
kvm: make local functions static

Running 'make namespacecheck' found lots of functions that
should be declared static, since only used in one file.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
11 years agoKVM: VMX: check use I/O bitmap first before unconditional I/O exit
Zhihui Zhang [Mon, 30 Dec 2013 20:56:29 +0000 (15:56 -0500)]
KVM: VMX: check use I/O bitmap first before unconditional I/O exit

According to Table C-1 of Intel SDM 3C, a VM exit happens on an I/O instruction when
"use I/O bitmaps" VM-execution control was 0 _and_ the "unconditional I/O exiting"
VM-execution control was 1. So we can't just check "unconditional I/O exiting" alone.
This patch was improved by suggestion from Jan Kiszka.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Zhihui Zhang <zzhsuny@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
11 years agoKVM: doc: Fix typo in doc/virtual/kvm
Masanari Iida [Sat, 21 Dec 2013 16:21:23 +0000 (01:21 +0900)]
KVM: doc: Fix typo in doc/virtual/kvm

Correct spelling typo in Documentations/virtual/kvm

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
11 years agoMerge branch 'kvm-arm64/for-3.14' into kvm-arm64/next
Marc Zyngier [Sat, 28 Dec 2013 10:29:37 +0000 (10:29 +0000)]
Merge branch 'kvm-arm64/for-3.14' into kvm-arm64/next

11 years agoarm64: KVM: Force undefined exception for Guest SMC intructions
Anup Patel [Thu, 12 Dec 2013 16:12:23 +0000 (16:12 +0000)]
arm64: KVM: Force undefined exception for Guest SMC intructions

The SMC-based PSCI emulation for Guest is going to be very different
from the in-kernel HVC-based PSCI emulation hence for now just inject
undefined exception when Guest executes SMC instruction.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: marc Zyngier <marc.zyngier@arm.com>
11 years agoarm64: KVM: Support X-Gene guest VCPU on APM X-Gene host
Anup Patel [Thu, 14 Nov 2013 15:20:08 +0000 (15:20 +0000)]
arm64: KVM: Support X-Gene guest VCPU on APM X-Gene host

This patch allows us to have X-Gene guest VCPU when using KVM arm64
on APM X-Gene host.

We add KVM_ARM_TARGET_XGENE_POTENZA for X-Gene Potenza compatible
guest VCPU and we return KVM_ARM_TARGET_XGENE_POTENZA in kvm_target_cpu()
when running on X-Gene host with Potenza core.

[maz: sanitized the commit log]

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
11 years agoarm64: KVM: Add Kconfig option for max VCPUs per-Guest
Anup Patel [Thu, 12 Dec 2013 16:12:22 +0000 (16:12 +0000)]
arm64: KVM: Add Kconfig option for max VCPUs per-Guest

Current max VCPUs per-Guest is set to 4 which is preventing
us from creating a Guest (or VM) with 8 VCPUs on Host (e.g.
X-Gene Storm SOC) with 8 Host CPUs.

The correct value of max VCPUs per-Guest should be same as
the max CPUs supported by GICv2 which is 8 but, increasing
value of max VCPUs per-Guest can make things slower hence
we add Kconfig option to let KVM users select appropriate
max VCPUs per-Guest.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
11 years agoMerge tag 'vgic-migrate-for-marc' of git://git.linaro.org/people/christoffer.dall...
Marc Zyngier [Sat, 28 Dec 2013 10:26:48 +0000 (10:26 +0000)]
Merge tag 'vgic-migrate-for-marc' of git://git.linaro.org/people/christoffer.dall/linux-kvm-arm into kvm-arm64/next

VGIC and timer migration pull

11 years agoKVM: arm-vgic: Support CPU interface reg access
Christoffer Dall [Mon, 23 Sep 2013 21:55:57 +0000 (14:55 -0700)]
KVM: arm-vgic: Support CPU interface reg access

Implement support for the CPU interface register access driven by MMIO
address offsets from the CPU interface base address.  Useful for user
space to support save/restore of the VGIC state.

This commit adds support only for the same logic as the current VGIC
support, and no more.  For example, the active priority registers are
handled as RAZ/WI, just like setting priorities on the emulated
distributor.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: arm-vgic: Add GICD_SPENDSGIR and GICD_CPENDSGIR handlers
Christoffer Dall [Fri, 25 Oct 2013 20:22:31 +0000 (21:22 +0100)]
KVM: arm-vgic: Add GICD_SPENDSGIR and GICD_CPENDSGIR handlers

Handle MMIO accesses to the two registers which should support both the
case where the VMs want to read/write either of these registers and the
case where user space reads/writes these registers to do save/restore of
the VGIC state.

Note that the added complexity compared to simple set/clear enable
registers stems from the bookkeping of source cpu ids.  It may be
possible to change the underlying data structure to simplify the
complexity, but since this is not in the critical path at all, this will
do.

Also note that reading this register from a live guest will not be
accurate compared to on hardware, because some state may be living on
the CPU LRs and the only way to give a consistent read would be to force
stop all the VCPUs and request them to unqueu the LR state onto the
distributor.  Until we have an actual user of live reading this
register, we can live with the difference.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: arm-vgic: Support unqueueing of LRs to the dist
Christoffer Dall [Sat, 16 Nov 2013 04:51:31 +0000 (20:51 -0800)]
KVM: arm-vgic: Support unqueueing of LRs to the dist

To properly access the VGIC state from user space it is very unpractical
to have to loop through all the LRs in all register access functions.
Instead, support moving all pending state from LRs to the distributor,
but leave active state LRs alone.

Note that to accurately present the active and pending state to VCPUs
reading these distributor registers from a live VM, we would have to
stop all other VPUs than the calling VCPU and ask each CPU to unqueue
their LR state onto the distributor and add fields to track active state
on the distributor side as well.  We don't have any users of such
functionality yet and there are other inaccuracies of the GIC emulation,
so don't provide accurate synchronized access to this state just yet.
However, when the time comes, having this function should help.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: arm-vgic: Add vgic reg access from dev attr
Christoffer Dall [Fri, 25 Oct 2013 20:17:31 +0000 (21:17 +0100)]
KVM: arm-vgic: Add vgic reg access from dev attr

Add infrastructure to handle distributor and cpu interface register
accesses through the KVM_{GET/SET}_DEVICE_ATTR interface by adding the
KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS groups
and defining the semantics of the attr field to be the MMIO offset as
specified in the GICv2 specs.

Missing register accesses or other changes in individual register access
functions to support save/restore of the VGIC state is added in
subsequent patches.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoarm/arm64: kvm: Set vcpu->cpu to -1 on vcpu_put
Christoffer Dall [Thu, 12 Dec 2013 04:29:11 +0000 (20:29 -0800)]
arm/arm64: kvm: Set vcpu->cpu to -1 on vcpu_put

The arch-generic KVM code expects the cpu field of a vcpu to be -1 if
the vcpu is no longer assigned to a cpu.  This is used for the optimized
make_all_cpus_request path and will be used by the vgic code to check
that no vcpus are running.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: arm-vgic: Make vgic mmio functions more generic
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
KVM: arm-vgic: Make vgic mmio functions more generic

Rename the vgic_ranges array to vgic_dist_ranges to be more specific and
to prepare for handling CPU interface register access as well (for
save/restore of VGIC state).

Pass offset from distributor or interface MMIO base to
find_matching_range function instead of the physical address of the
access in the VM memory map.  This allows other callers unaware of the
VM specifics, but with generic VGIC knowledge to reuse the function.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoirqchip: arm-gic: Define additional MMIO offsets and masks
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
irqchip: arm-gic: Define additional MMIO offsets and masks

Define CPU interface offsets for the GICC_ABPR, GICC_APR, and GICC_IIDR
registers.  Define distributor registers for the GICD_SPENDSGIR and the
GICD_CPENDSGIR.  KVM/ARM needs to know about these definitions to fully
support save/restore of the VGIC.

Also define some masks and shifts for the various GICH_VMCR fields.

Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: arm-vgic: Set base addr through device API
Christoffer Dall [Mon, 23 Sep 2013 21:55:56 +0000 (14:55 -0700)]
KVM: arm-vgic: Set base addr through device API

Support setting the distributor and cpu interface base addresses in the
VM physical address space through the KVM_{SET,GET}_DEVICE_ATTR API
in addition to the ARM specific API.

This has the added benefit of being able to share more code in user
space and do things in a uniform manner.

Also deprecate the older API at the same time, but backwards
compatibility will be maintained.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC
Christoffer Dall [Fri, 25 Oct 2013 16:29:18 +0000 (17:29 +0100)]
KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC

Support creating the ARM VGIC device through the KVM_CREATE_DEVICE
ioctl, which can then later be leveraged to use the
KVM_{GET/SET}_DEVICE_ATTR, which is useful both for setting addresses in
a more generic API than the ARM-specific one and is useful for
save/restore of VGIC state.

Adds KVM_CAP_DEVICE_CTRL to ARM capabilities.

Note that we change the check for creating a VGIC from bailing out if
any VCPUs were created, to bailing out if any VCPUs were ever run.  This
is an important distinction that shouldn't break anything, but allows
creating the VGIC after the VCPUs have been created.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoARM: KVM: Allow creating the VGIC after VCPUs
Christoffer Dall [Mon, 23 Sep 2013 21:55:55 +0000 (14:55 -0700)]
ARM: KVM: Allow creating the VGIC after VCPUs

Rework the VGIC initialization slightly to allow initialization of the
vgic cpu-specific state even if the irqchip (the VGIC) hasn't been
created by user space yet.  This is safe, because the vgic data
structures are already allocated when the CPU is allocated if VGIC
support is compiled into the kernel.  Further, the init process does not
depend on any other information and the sacrifice is a slight
performance degradation for creating VMs in the no-VGIC case.

The reason is that the new device control API doesn't mandate creating
the VGIC before creating the VCPU and it is unreasonable to require user
space to create the VGIC before creating the VCPUs.

At the same time move the irqchip_in_kernel check out of
kvm_vcpu_first_run_init and into the init function to make the per-vcpu
and global init functions symmetric and add comments on the exported
functions making it a bit easier to understand the init flow by only
looking at vgic.c.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoARM/KVM: save and restore generic timer registers
Andre Przywara [Fri, 13 Dec 2013 13:23:26 +0000 (14:23 +0100)]
ARM/KVM: save and restore generic timer registers

For migration to work we need to save (and later restore) the state of
each core's virtual generic timer.
Since this is per VCPU, we can use the [gs]et_one_reg ioctl and export
the three needed registers (control, counter, compare value).
Though they live in cp15 space, we don't use the existing list, since
they need special accessor functions and the arch timer is optional.

Acked-by: Marc Zynger <marc.zyngier@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoarm/arm64: KVM: arch_timer: Initialize cntvoff at kvm_init
Christoffer Dall [Sat, 16 Nov 2013 18:51:25 +0000 (10:51 -0800)]
arm/arm64: KVM: arch_timer: Initialize cntvoff at kvm_init

Initialize the cntvoff at kvm_init_vm time, not before running the VCPUs
at the first time because that will overwrite any potentially restored
values from user space.

Cc: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Marc Zynger <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoarm: KVM: Don't return PSCI_INVAL if waitqueue is inactive
Christoffer Dall [Wed, 20 Nov 2013 01:43:19 +0000 (17:43 -0800)]
arm: KVM: Don't return PSCI_INVAL if waitqueue is inactive

The current KVM implementation of PSCI returns INVALID_PARAMETERS if the
waitqueue for the corresponding CPU is not active.  This does not seem
correct, since KVM should not care what the specific thread is doing,
for example, user space may not have called KVM_RUN on this VCPU yet or
the thread may be busy looping to user space because it received a
signal; this is really up to the user space implementation.  Instead we
should check specifically that the CPU is marked as being turned off,
regardless of the VCPU thread state, and if it is, we shall
simply clear the pause flag on the CPU and wake up the thread if it
happens to be blocked for us.

Further, the implementation seems to be racy when executing multiple
VCPU threads.  There really isn't a reasonable user space programming
scheme to ensure all secondary CPUs have reached kvm_vcpu_first_run_init
before turning on the boot CPU.

Therefore, set the pause flag on the vcpu at VCPU init time (which can
reasonably be expected to be completed for all CPUs by user space before
running any VCPUs) and clear both this flag and the feature (in case the
feature can somehow get set again in the future) and ping the waitqueue
on turning on a VCPU using PSCI.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: MMU: handle invalid root_hpa at __direct_map
Marcelo Tosatti [Thu, 19 Dec 2013 17:28:51 +0000 (15:28 -0200)]
KVM: MMU: handle invalid root_hpa at __direct_map

It is possible for __direct_map to be called on invalid root_hpa
(-1), two examples:

1) try_async_pf -> can_do_async_pf
    -> vmx_interrupt_allowed -> nested_vmx_vmexit
2) vmx_handle_exit -> vmx_interrupt_allowed -> nested_vmx_vmexit

Then to load_vmcs12_host_state and kvm_mmu_reset_context.

Check for this possibility, let fault exception be regenerated.

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=924916

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: VMX: Do not skip the instruction if handle_dr injects a fault
Jan Kiszka [Wed, 18 Dec 2013 18:16:24 +0000 (19:16 +0100)]
KVM: VMX: Do not skip the instruction if handle_dr injects a fault

If kvm_get_dr or kvm_set_dr reports that it raised a fault, we must not
advance the instruction pointer. Otherwise the exception will hit the
wrong instruction.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: nVMX: Support direct APIC access from L2
Jan Kiszka [Mon, 16 Dec 2013 11:55:46 +0000 (12:55 +0100)]
KVM: nVMX: Support direct APIC access from L2

It's a pathological case, but still a valid one: If L1 disables APIC
virtualization and also allows L2 to directly write to the APIC page, we
have to forcibly enable APIC virtualization while in L2 if the in-kernel
APIC is in use.

This allows to run the direct interrupt test case in the vmx unit test
without x2APIC.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: Documentation: Fix typo for KVM_ARM_VCPU_INIT ioctl
Anup Patel [Thu, 12 Dec 2013 16:12:24 +0000 (21:42 +0530)]
KVM: Documentation: Fix typo for KVM_ARM_VCPU_INIT ioctl

Fix minor typo in "Parameters:" of KVM_ARM_VCPU_INIT documentation.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agoKVM: x86: Add comment on vcpu_enter_guest()'s return value
Takuya Yoshikawa [Fri, 13 Dec 2013 06:08:38 +0000 (15:08 +0900)]
KVM: x86: Add comment on vcpu_enter_guest()'s return value

Giving proper names to the 0 and 1 was once suggested.  But since 0 is
returned to the userspace, giving it another name can introduce extra
confusion.  This patch just explains the meanings instead.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: Use cond_resched() directly and remove useless kvm_resched()
Takuya Yoshikawa [Fri, 13 Dec 2013 06:07:21 +0000 (15:07 +0900)]
KVM: Use cond_resched() directly and remove useless kvm_resched()

Since the commit 15ad7146 ("KVM: Use the scheduler preemption notifiers
to make kvm preemptible"), the remaining stuff in this function is a
simple cond_resched() call with an extra need_resched() check which was
there to avoid dropping VCPUs unnecessarily.  Now it is meaningless.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoMerge tag 'kvm-s390-20131211' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms39...
Paolo Bonzini [Thu, 12 Dec 2013 10:41:33 +0000 (11:41 +0100)]
Merge tag 'kvm-s390-20131211' of git://git./linux/kernel/git/kvms390/linux into kvm-next

Some further s390 patches for kvm-next.

Various improvements and bugfixes in the signal processor handling.
Document kvm support for diagnose (s390 hypercalls). And last but
not least, fix a bug in the s390 ioeventfd backend that was causing
us grief in scenarios with 4G+ memory.

11 years agoKVM: nVMX: Add support for activity state HLT
Jan Kiszka [Wed, 4 Dec 2013 07:58:54 +0000 (08:58 +0100)]
KVM: nVMX: Add support for activity state HLT

We can easily emulate the HLT activity state for L1: If it decides that
L2 shall be halted on entry, just invoke the normal emulation of halt
after switching to L2. We do not depend on specific host features to
provide this, so we can expose the capability unconditionally.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoKVM: VMX: shadow VM_(ENTRY|EXIT)_CONTROLS vmcs field
Gleb Natapov [Mon, 25 Nov 2013 13:37:13 +0000 (15:37 +0200)]
KVM: VMX: shadow VM_(ENTRY|EXIT)_CONTROLS vmcs field

VM_(ENTRY|EXIT)_CONTROLS vmcs fields are read/written on each guest
entry but most times it can be avoided since values do not changes.
Keep fields copy in memory to avoid unnecessary reads from vmcs.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
11 years agoMerge remote-tracking branch 'tip/x86/cpufeature' into kvm-next
Paolo Bonzini [Thu, 12 Dec 2013 09:46:15 +0000 (10:46 +0100)]
Merge remote-tracking branch 'tip/x86/cpufeature' into kvm-next

11 years agoKVM: s390: ioeventfd: ignore leftmost bits
Dominik Dingel [Mon, 9 Dec 2013 17:30:01 +0000 (18:30 +0100)]
KVM: s390: ioeventfd: ignore leftmost bits

The diagnose 500 subcode 3 contains the 32 bit subchannel id in bits 32-63
(counting from the left). As for other I/O instructions, bits 0-31 should be
ignored and thus not be passed to kvm_io_bus_write_cookie().

This fixes a bug where the guest passed non-zero bits 0-31 which the
host tried to interpret, leading to ioeventfd notification failures.

Cc: stable@vger.kernel.org
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: SIGP START has to report BUSY while stopping a CPU
Thomas Huth [Tue, 3 Dec 2013 11:54:55 +0000 (12:54 +0100)]
KVM: s390: SIGP START has to report BUSY while stopping a CPU

Just like the RESTART order, the START order also has to report BUSY
while a STOP request is pending, to avoid that the START might be
ignored due to a race condition between the STOP and the START order.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Reworked SIGP RESTART order
Thomas Huth [Wed, 27 Nov 2013 10:47:10 +0000 (11:47 +0100)]
KVM: s390: Reworked SIGP RESTART order

When SIGP RESTART detected an illegal CPU address, there is no need to
drop to userspace, we can return CC3 to the guest directly instead.
Also renamed __sigp_restart() to sigp_check_callable() (since this
is a better description of what the function is really doing) and
moved a string specific to RESTART to the calling place instead, so
that this function gets usable by other SIGP orders, too.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Add the SIGP order CONDITIONAL EMERGENCY SIGNAL
Thomas Huth [Thu, 21 Nov 2013 15:01:48 +0000 (16:01 +0100)]
KVM: s390: Add the SIGP order CONDITIONAL EMERGENCY SIGNAL

This patch adds the missing SIGP order "conditional emergency
signal" by calling the "emergency signal" SIGP handler if the
required conditions are met.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Use helper function to set CC in SIGP handler
Thomas Huth [Tue, 26 Nov 2013 11:27:16 +0000 (12:27 +0100)]
KVM: s390: Use helper function to set CC in SIGP handler

We've got a helper function for setting the condition code now,
so let's use it in the SIGP handler, too.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: diagnose call documentation
Cornelia Huck [Wed, 13 Nov 2013 10:15:02 +0000 (11:15 +0100)]
KVM: s390: diagnose call documentation

Add some further documentation on the DIAGNOSE calls we support.

Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoarm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappings
Santosh Shilimkar [Tue, 19 Nov 2013 19:59:12 +0000 (14:59 -0500)]
arm/arm64: kvm: Use virt_to_idmap instead of virt_to_phys for idmap mappings

KVM initialisation fails on architectures implementing virt_to_idmap()
because virt_to_phys() on such architectures won't fetch you the correct
idmap page.

So update the KVM ARM code to use the virt_to_idmap() to fix the issue.
Since the KVM code is shared between arm and arm64, we create
kvm_virt_to_phys() and handle the redirection in respective headers.

Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
11 years agox86, xsave: Support eager-only xsave features, add MPX support
Qiaowei Ren [Thu, 5 Dec 2013 09:15:34 +0000 (17:15 +0800)]
x86, xsave: Support eager-only xsave features, add MPX support

Some features, like Intel MPX, work only if the kernel uses eagerfpu
model.  So we should force eagerfpu on unless the user has explicitly
disabled it.

Add definitions for Intel MPX and add it to the supported list.

[ hpa: renamed XSTATE_FLEXIBLE to XSTATE_LAZY and added comments ]

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Link: http://lkml.kernel.org/r/9E0BE1322F2F2246BD820DA9FC397ADE014A6115@SHSMSX102.ccr.corp.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
11 years agox86, cpufeature: Define the Intel MPX feature flag
Qiaowei Ren [Sat, 7 Dec 2013 00:20:57 +0000 (08:20 +0800)]
x86, cpufeature: Define the Intel MPX feature flag

Define the Intel MPX (Memory Protection Extensions) CPU feature flag
in the cpufeature list.

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Link: http://lkml.kernel.org/r/1386375658-2191-2-git-send-email-qiaowei.ren@intel.com
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
11 years agoLinux 3.13-rc2
Linus Torvalds [Fri, 29 Nov 2013 20:57:14 +0000 (12:57 -0800)]
Linux 3.13-rc2

11 years agoMerge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas...
Linus Torvalds [Fri, 29 Nov 2013 17:57:13 +0000 (09:57 -0800)]
Merge tag 'arm64-stable' of git://git./linux/kernel/git/cmarinas/linux-aarch64

Pull ARM64 fixes from Catalin Marinas:
 - Remove preempt_count modifications in the arm64 IRQ handling code
   since that's already dealt with in generic irq_enter/irq_exit
 - PTE_PROT_NONE bit moved higher up to avoid overlapping with the
   hardware bits (for PROT_NONE mappings which are pte_present)
 - Big-endian fixes for ptrace support
 - Asynchronous aborts unmasking while in the kernel
 - pgprot_writecombine() change to create Normal NonCacheable memory
   rather than Device GRE

* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
  arm64: Move PTE_PROT_NONE higher up
  arm64: Use Normal NonCacheable memory for writecombine
  arm64: debug: make aarch32 bkpt checking endian clean
  arm64: ptrace: fix compat registes get/set to be endian clean
  arm64: Unmask asynchronous aborts when in kernel mode
  arm64: dts: Reserve the memory used for secondary CPU release address
  arm64: let the core code deal with preempt_count

11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 29 Nov 2013 17:56:15 +0000 (09:56 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux

Pull s390 updates from Martin Schwidefsky:
 "One performance improvement and a few bug fixes.  Two of the fixes
  deal with the clock related problems we have seen on recent kernels"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: handle asce-type exceptions as normal page fault
  s390,time: revert direct ktime path for s390 clockevent device
  s390/time,vdso: convert to the new update_vsyscall interface
  s390/uaccess: add missing page table walk range check
  s390/mm: optimize copy_page
  s390/dasd: validate request size before building CCW/TCW request
  s390/signal: always restore saved runtime instrumentation psw bit

11 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 29 Nov 2013 17:55:13 +0000 (09:55 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Some easy but needed fixes for i2c drivers since rc1"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: bcm2835: Linking platform nodes to adapter nodes
  i2c: omap: raw read and write endian fix
  i2c: i2c-bcm-kona: Fix module build
  i2c: i2c-diolan-u2c: different usb endpoints for DLN-2-U2C
  i2c: bcm-kona: remove duplicated include
  i2c: davinci: raw read and write endian fix

11 years agoMerge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Fri, 29 Nov 2013 17:49:08 +0000 (09:49 -0800)]
Merge branch 'for-3.13-fixes' of git://git./linux/kernel/git/tj/wq

Pull workqueue fixes from Tejun Heo:
 "This contains one important fix.  The NUMA support added a while back
  broke ordering guarantees on ordered workqueues.  It was enforced by
  having single frontend interface with @max_active == 1 but the NUMA
  support puts multiple interfaces on unbound workqueues on NUMA
  machines thus breaking the ordered guarantee.  This is fixed by
  disabling NUMA support on ordered workqueues.

  The above and a couple other patches were sitting in for-3.12-fixes
  but I forgot to push that out, so they ended up waiting a bit too
  long.  My aplogies.

  Other fixes are minor"

* 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues
  workqueue: fix comment typo for __queue_work()
  workqueue: fix ordered workqueues in NUMA setups
  workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY

11 years agoMerge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj...
Linus Torvalds [Fri, 29 Nov 2013 17:48:25 +0000 (09:48 -0800)]
Merge branch 'for-3.13-fixes' of git://git./linux/kernel/git/tj/libata

Pull libata fixes from Tejun Heo:
 "libata device removal path was removing parent device node before its
  child, which is mostly harmless but triggers warning after recent
  sysfs changes.  Rafael's patch fixes the order.

  Other than that, minor controller-specific fixes and device ID
  additions"

* 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ATA: Fix port removal ordering
  ahci: add Marvell 9230 to the AHCI PCI device list
  ata: fix acpi_bus_get_device() return value check
  pata_arasan_cf: add missing clk_disable_unprepare() on error path
  ahci: add support for IBM Akebono platform device

11 years agoMerge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj...
Linus Torvalds [Fri, 29 Nov 2013 17:47:06 +0000 (09:47 -0800)]
Merge branch 'for-3.13-fixes' of git://git./linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
 "Fixes for three issues.

   - cgroup destruction path could swamp system_wq possibly leading to
     deadlock.  This actually seems to happen in the wild with memcg
     because memcg destruction path adds nested dependency on system_wq.

     Resolved by isolating cgroup destruction work items on its
     dedicated workqueue.

   - Possible locking context deadlock through seqcount reported by
     lockdep

   - Memory leak under certain conditions"

* 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix cgroup_subsys_state leak for seq_files
  cpuset: Fix memory allocator deadlock
  cgroup: use a dedicated workqueue for cgroup destruction

11 years agoMerge tag 'sound-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 29 Nov 2013 17:36:42 +0000 (09:36 -0800)]
Merge tag 'sound-3.13-rc2' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Quite a few HD-Audio fixes, a WUSB audio fix and a fix for FireWire
  audio.  The HD-audio part contains a couple of fixes for the generic
  parser, and these are the only intrusive fixes.  The rest are mostly
  device-specific fixes"

* tag 'sound-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Add LFE chmap to ASUS ET2700
  ALSA: hda - Initialize missing bass speaker pin for ASUS AIO ET2700
  ALSA: hda - limit mic boost on Asus UX31[A,E]
  ALSA: hda - Check leaf nodes to find aamix amps
  ALSA: hda - Fix hp-mic mode without VREF bits
  ALSA: hda - Create Headhpone Mic Jack Mode when really needed
  ALSA: usb: use multiple packets per urb for Wireless USB inbound audio
  ALSA: hda - Enable mute/mic-mute LEDs for more Thinkpads with Conexant codec
  ALSA: hda - Drop bus->avoid_link_reset flag
  ALSA: hda/realtek - Set pcbeep amp for ALC668
  ALSA: hda/realtek - Add support of ALC231 codec
  ALSA: firewire-lib: fix wrong value for FDF field as an empty packet

11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Fri, 29 Nov 2013 17:27:19 +0000 (09:27 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull vfs dentry reference count fix from Al Viro.

This fixes a possible inode_permission NULL pointer dereference (and
other problems) that were due to the root dentry count being decremented
too much.  In commit 48a066e72d97 ("RCU'd vfsmounts") the placement of
clearing the LOOKUP_RCU bit changed, and we then returned failure of
incrementing the lockref on the parent dentry with LOOKUP_RCU cleared.

But that meant we needed to go through the same cleanup routines that
the later failures did wrt LOOKUP_ROOT and nd->root.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix bogus path_put() of nd->root after some unlazy_walk() failures

11 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 29 Nov 2013 17:26:42 +0000 (09:26 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm qxl leak fix from Dave Airlie:
 "As usual 5 mins after I send a trivial pull fix I find a real bug!

  This fixes a memory leak and I'd like to get it into stable queue
  asap"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/qxl: fix memory leak in release list handling

11 years agoarm64: Move PTE_PROT_NONE higher up
Catalin Marinas [Wed, 27 Nov 2013 16:59:27 +0000 (16:59 +0000)]
arm64: Move PTE_PROT_NONE higher up

PTE_PROT_NONE means that a pte is present but does not have any
read/write attributes. However, setting the memory type like
pgprot_writecombine() is allowed and such bits overlap with
PTE_PROT_NONE. This causes mmap/munmap issues in drivers that change the
vma->vm_pg_prot on PROT_NONE mappings.

This patch reverts the PTE_FILE/PTE_PROT_NONE shift in commit
59911ca4325d (ARM64: mm: Move PTE_PROT_NONE bit) and moves PTE_PROT_NONE
together with the other software bits.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org> # 3.11+
11 years agoarm64: Use Normal NonCacheable memory for writecombine
Catalin Marinas [Fri, 29 Nov 2013 10:56:14 +0000 (10:56 +0000)]
arm64: Use Normal NonCacheable memory for writecombine

This provides better performance compared to Device GRE and also allows
unaligned accesses. Such memory is intended to be used with standard RAM
(e.g. framebuffers) and not I/O.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
11 years agofix bogus path_put() of nd->root after some unlazy_walk() failures
Al Viro [Fri, 29 Nov 2013 06:48:32 +0000 (01:48 -0500)]
fix bogus path_put() of nd->root after some unlazy_walk() failures

Failure to grab reference to parent dentry should go through the
same cleanup as nd->seq mismatch.  As it is, we might end up with
caller thinking it needs to path_put() nd->root, with obvious
nasty results once we'd hit that bug enough times to drive the
refcount of root dentry all the way to zero...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agodrm/qxl: fix memory leak in release list handling
Dave Airlie [Thu, 28 Nov 2013 05:39:03 +0000 (05:39 +0000)]
drm/qxl: fix memory leak in release list handling

wow no idea how I got this far without seeing this,
leaking the entries in the list makes kmalloc-64 slab grow.

References: https://bugzilla.kernel.org/show_bug.cgi?id=65121
Cc: stable@vger.kernel.org
Reported-by: Matthew Stapleton <matthew4196@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agoarm64: debug: make aarch32 bkpt checking endian clean
Matthew Leach [Thu, 28 Nov 2013 12:07:23 +0000 (12:07 +0000)]
arm64: debug: make aarch32 bkpt checking endian clean

The current breakpoint instruction checking code for A32 is not endian
clean. Fix this with appropriate byte-swapping when retrieving
instructions.

Signed-off-by: Matthew Leach <matthew.leach@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
11 years agoarm64: ptrace: fix compat registes get/set to be endian clean
Matthew Leach [Thu, 28 Nov 2013 12:07:22 +0000 (12:07 +0000)]
arm64: ptrace: fix compat registes get/set to be endian clean

On a BE system the wrong half of the X registers is retrieved/written
when attempting to get/set the value of aarch32 registers through
ptrace.

Ensure that types are the correct width so that the relevant
casting occurs.

Signed-off-by: Matthew Leach <matthew.leach@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
11 years agoMerge tag 'gpio-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Thu, 28 Nov 2013 17:57:46 +0000 (09:57 -0800)]
Merge tag 'gpio-v3.13-2' of git://git./linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Here us a bunch of patches for the v3.13 series.  Most important stuff
  is related to fixes and documentation for the new GPIO descriptor API.
  If the diffstat is scary you'll notice most of it is to
  Documentation/*:

   - A big slew of documentation for the gpiod transition that happened
     in the merge window, no semantic effect, but we should provide
     proper documentation with the new API.

   - Fix flags related to the new API.

   - Fix to the find_chip_by_name() lookup function related to the new
     API.

   - Fix of_find_gpio() when not using device tree.

   - Bug fix for the TB10x direction setting.

   - Error path fixes from Dan Carpenter.

   - Nasty IRQdomain bug relating to taking an unitialized spinlock.

   - Minor fixes here and there"

* tag 'gpio-v3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: bcm281xx: Fix return value of bcm_kona_gpio_get()
  gpio: pl061: move irqdomain initialization
  gpio: ucb1400: Add MODULE_ALIAS
  gpiolib: fix of_find_gpio() when OF not defined
  gpio: fix memory leak in error path
  gpio: rcar: NULL dereference on error in probe()
  gpio: msm: make msm_gpio.summary_irq signed for error handling
  gpio: mvebu: make mvchip->irqbase signed for error handling
  gpiolib: use dedicated flags for GPIO properties
  gpiolib: fix find_chip_by_name()
  Documentation: gpiolib: document new interface
  gpio: tb10x: Set output value before setting direction to output

11 years agoMerge tag 'md/3.13-fixes' of git://neil.brown.name/md
Linus Torvalds [Thu, 28 Nov 2013 17:51:39 +0000 (09:51 -0800)]
Merge tag 'md/3.13-fixes' of git://neil.brown.name/md

Pull md fixes from Neil Brown:
 "Three bug fixes for md in 3.13-rc

  All recent regressions, one in 3.12 so marked for -stable"

* tag 'md/3.13-fixes' of git://neil.brown.name/md:
  md/raid5: fix newly-broken locking in get_active_stripe.
  md: test mddev->flags more safely in md_check_recovery.
  md/raid5: fix new memory-reference bug in alloc_thread_groups.

11 years agoMerge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Thu, 28 Nov 2013 17:50:25 +0000 (09:50 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "SMB3 "validate negotiate" is needed to prevent certain types of
  downgrade attacks.

  Also changes SMB2/SMB3 copy offload from using the BTRFS copy ioctl
  (BTRFS_IOC_CLONE) to a cifs specific ioctl (CIFS_IOC_COPYCHUNK_FILE)
  to address Christoph's comment that there are semantic differences
  between requesting copy offload in which copy-on-write is mandatory
  (as in the BTRFS ioctl) and optional in the SMB2/SMB3 case.  Also
  fixes SMB2/SMB3 copychunk for large files"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  [CIFS] Do not use btrfs refcopy ioctl for SMB2 copy offload
  Check SMB3 dialects against downgrade attacks
  Removed duplicated (and unneeded) goto
  CIFS: Fix SMB2/SMB3 Copy offload support (refcopy) for large files

11 years agokernel/extable: fix address-checks for core_kernel and init areas
Helge Deller [Thu, 28 Nov 2013 08:16:33 +0000 (09:16 +0100)]
kernel/extable: fix address-checks for core_kernel and init areas

The init_kernel_text() and core_kernel_text() functions should not
include the labels _einittext and _etext when checking if an address is
inside the .text or .init sections.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge tag 'kvm-s390-20131128' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms39...
Paolo Bonzini [Thu, 28 Nov 2013 14:30:16 +0000 (15:30 +0100)]
Merge tag 'kvm-s390-20131128' of git://git./linux/kernel/git/kvms390/linux into kvm-next

Several fixes for kvm-next.

The diagnose decoding and the missing store status are more major (can
cause the host to execute different things than the guest wanted resp.
produce unusable dumps when something breaks). The other patches contain
more minor fixes and cleanups.

11 years agoALSA: hda - Add LFE chmap to ASUS ET2700
Takashi Iwai [Thu, 28 Nov 2013 14:24:34 +0000 (15:24 +0100)]
ALSA: hda - Add LFE chmap to ASUS ET2700

As the previous commit 1f0bbf03cb82 added the pin config for the bass
speaker, this patch adds the corresponding LFE-only channel map on
ASUS ET2700.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65961
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agoALSA: hda - Initialize missing bass speaker pin for ASUS AIO ET2700
Takashi Iwai [Thu, 28 Nov 2013 14:21:21 +0000 (15:21 +0100)]
ALSA: hda - Initialize missing bass speaker pin for ASUS AIO ET2700

Add a fixup entry for the missing bass speaker pin 0x16 on ASUS ET2700
AiO desktop.  The channel map will be added in the next patch, so that
this can be backported easily to stable kernels.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65961
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agoALSA: hda - limit mic boost on Asus UX31[A,E]
Oleksij Rempel [Wed, 27 Nov 2013 16:12:03 +0000 (17:12 +0100)]
ALSA: hda - limit mic boost on Asus UX31[A,E]

This both devices need limit for internal dmic.

[cosmetic change; renamed fixup name by tiwai]

Signed-off-by: Oleksij Rempel <linux@rempel-privat.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agoALSA: hda - Check leaf nodes to find aamix amps
Takashi Iwai [Thu, 28 Nov 2013 10:05:28 +0000 (11:05 +0100)]
ALSA: hda - Check leaf nodes to find aamix amps

The current generic parser assumes blindly that the volume and mute
amps are found in the aamix node itself.  But on some codecs,
typically Analog Devices ones, the aamix amps are separately
implemented in each leaf node of the aamix node, and the current
driver can't establish the correct amp controls.  This is a regression
compared with the previous static quirks.

This patch extends the search for the amps to the leaf nodes for
allowing the aamix controls again on such codecs.
In this implementation, I didn't code to loop through the whole paths,
since usually one depth should suffice, and we can't search too
deeply, as it may result in the conflicting control assignments.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65641
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 years agoKVM: s390: Removed kvm_s390_inject_sigp_stop()
Thomas Huth [Thu, 14 Nov 2013 10:08:20 +0000 (11:08 +0100)]
KVM: s390: Removed kvm_s390_inject_sigp_stop()

The function kvm_s390_inject_sigp_stop() as been unused since the
removal of the old mmu reload code and thus can be removed safely.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Add SIGP store-status-at-address order
Thomas Huth [Wed, 13 Nov 2013 19:48:51 +0000 (20:48 +0100)]
KVM: s390: Add SIGP store-status-at-address order

The STORE STATUS AT ADDRESS order of SIGP was still missing.
Now it is supported, using the common kvm_s390_store_status()
function.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: fix diagnose code extraction
Heiko Carstens [Mon, 11 Nov 2013 12:56:47 +0000 (13:56 +0100)]
KVM: s390: fix diagnose code extraction

The diagnose code to be used is the contents of the base register (if not
zero), plus the displacement. The current code ignores the base register
contents. So let's fix that...

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Fix clock comparator field for STORE STATUS
Thomas Huth [Wed, 13 Nov 2013 19:28:18 +0000 (20:28 +0100)]
KVM: s390: Fix clock comparator field for STORE STATUS

Only the most 7 significant bytes of the clock comparator must be
saved to the status area, and the byte at offset 304 has to be zero.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Always store status during SIGP STOP_AND_STORE_STATUS
Thomas Huth [Wed, 6 Nov 2013 14:46:33 +0000 (15:46 +0100)]
KVM: s390: Always store status during SIGP STOP_AND_STORE_STATUS

The SIGP order STOP_AND_STORE_STATUS is defined to stop a CPU and store
its status. However, we only stored the status if the CPU was still
running, so make sure that the status is now also stored if the CPU was
already stopped. This fixes the problem that the CPU information was
not stored correctly in kdump files, rendering them unreadable.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Do not set CC3 for EQBS and SQBS
Thomas Huth [Wed, 9 Oct 2013 14:49:03 +0000 (16:49 +0200)]
KVM: s390: Do not set CC3 for EQBS and SQBS

The EQBS and SQBS instructions do not set CC3 for invalid channels, but
should throw an operation exception instead when not available. Thus they
should not be handled by the handle_io_inst() wrapper but drop to userspace
instead (which will then inject the operation exception).

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Fix access to CR6 in TPI handler
Thomas Huth [Wed, 9 Oct 2013 12:15:54 +0000 (14:15 +0200)]
KVM: s390: Fix access to CR6 in TPI handler

The TPI handler currently uses vcpu->run->s.regs.crs[6] to get the current
value of CR6. I think this is wrong, because vcpu->run->s.regs.crs is
only updated when kvm_arch_vcpu_ioctl_run() drops back to userspace.
So let's change the TPI handler to use vcpu->arch.sie_block->gcr[6] instead.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Removed VIRTIODESCSPACE
Thomas Huth [Mon, 7 Oct 2013 15:50:22 +0000 (17:50 +0200)]
KVM: s390: Removed VIRTIODESCSPACE

VIRTIODESCSPACE is completely unused nowadays and thus can be removed
without any problems.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoKVM: s390: Removed SIE_INTERCEPT_UCONTROL
Thomas Huth [Thu, 19 Sep 2013 14:26:18 +0000 (16:26 +0200)]
KVM: s390: Removed SIE_INTERCEPT_UCONTROL

The SIE_INTERCEPT_UCONTROL can be removed by moving the related code
from kvm_arch_vcpu_ioctl_run() to vcpu_post_run().

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
11 years agoi2c: bcm2835: Linking platform nodes to adapter nodes
Florian Meier [Mon, 25 Nov 2013 08:01:50 +0000 (09:01 +0100)]
i2c: bcm2835: Linking platform nodes to adapter nodes

In order to find I2C devices in the device tree, the platform nodes
have to be known by the I2C core. This requires setting the
dev.of_node parameter of the adapter.

Signed-off-by: Florian Meier <florian.meier@koalo.de>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>