Rusty Russell [Tue, 17 Feb 2015 05:42:44 +0000 (16:12 +1030)]
virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice.
I noticed this with the console device. It's not *wrong*, just a bit
weird.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Tue, 17 Feb 2015 05:42:43 +0000 (16:12 +1030)]
virtio_net: unconditionally define struct virtio_net_hdr_v1.
This was introduced in commit
ed9ecb0415b97b5f9f91f146e1977bb372c74c6d,
but only defined if !VIRTIO_NET_NO_LEGACY. We should always define
it: easier for users to have conditional legacy code.
Suggested-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:44 +0000 (17:13 +1030)]
tools/lguest: don't use legacy definitions for net device in example launcher.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:44 +0000 (17:13 +1030)]
virtio: Don't expose legacy net features when VIRTIO_NET_NO_LEGACY defined.
In particular, the virtio header always has the u16 num_buffers field.
We define a new 'struct virtio_net_hdr_v1' for this (rather than
simply calling it 'struct virtio_net_hdr', to avoid nasty type errors
if some parts of a project define VIRTIO_NET_NO_LEGACY and some don't).
Transitional devices (which can't define VIRTIO_NET_NO_LEGACY) will
have to keep using struct virtio_net_hdr_mrg_rxbuf, which has the same
byte layout as struct virtio_net_hdr_v1.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
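The layout claim in the commit above (virtio_net_hdr_v1 and virtio_net_hdr_mrg_rxbuf share a byte layout) can be checked with a small sketch. The demo_ structs below are hypothetical stand-ins written for illustration, with plain uint16_t in place of the kernel's __virtio16; they are not the uapi definitions themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the legacy base header (hypothetical demo_ name). */
struct demo_net_hdr {
	uint8_t flags;
	uint8_t gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Transitional form: base header followed by num_buffers. */
struct demo_net_hdr_mrg_rxbuf {
	struct demo_net_hdr hdr;
	uint16_t num_buffers;
};

/* Virtio 1.0 form: num_buffers is always part of the header. */
struct demo_net_hdr_v1 {
	uint8_t flags;
	uint8_t gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
	uint16_t num_buffers;
};

/* Aborts if the two forms differ in size or in where num_buffers sits. */
void demo_check_layouts(void)
{
	assert(sizeof(struct demo_net_hdr_v1) ==
	       sizeof(struct demo_net_hdr_mrg_rxbuf));
	assert(offsetof(struct demo_net_hdr_v1, num_buffers) ==
	       offsetof(struct demo_net_hdr_mrg_rxbuf, num_buffers));
}
```

Because the two forms agree byte for byte, a transitional device can keep the old struct name while a 1.0-only project uses the new one.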
Rusty Russell [Fri, 13 Feb 2015 06:43:44 +0000 (17:13 +1030)]
tools/lguest: use common error macros in the example launcher.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:43 +0000 (17:13 +1030)]
tools/lguest: give virtqueues names for better error messages
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:43 +0000 (17:13 +1030)]
tools/lguest: more documentation and checking of virtio 1.0 compliance.
This is from all the non-PCI parts of the spec.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:43 +0000 (17:13 +1030)]
lguest: don't look in console features to find emerg_wr.
The 1.0 spec clearly states that you must set the ACKNOWLEDGE and
DRIVER status bits before accessing the feature bits. This is a
problem for the early console code, which doesn't really want to
acknowledge the device (the spec specifically excepts writing to the
console's emerg_wr from the usual ordering constraints).
Instead, we check that the *size* of the device configuration is
sufficient to hold emerg_wr: at worst (if the device doesn't support
the VIRTIO_CONSOLE_F_EMERG_WRITE feature), it will ignore the
writes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:42 +0000 (17:13 +1030)]
tools/lguest: don't start devices until DRIVER_OK status set.
We were activating them with the virtqueues, and that's not allowed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:42 +0000 (17:13 +1030)]
tools/lguest: handle indirect partway through chain.
Linux doesn't generate these, but it's perfectly valid according to
a close reading of the spec. I opened virtio spec bug VIRTIO-134 to
make this clearer there, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:42 +0000 (17:13 +1030)]
tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI)
As a demonstration, the lguest launcher is pretty strict, trying to
catch badly behaved drivers. Document this precisely.
A good implementation would *NOT* crash the guest when these happened!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:41 +0000 (17:13 +1030)]
tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI)
There are some (optional) parts we don't implement, but this quotes all
the device requirements from the spec (csd 03, but it should be the same
across all released versions).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:41 +0000 (17:13 +1030)]
tools/lguest: rename virtio_pci_cfg_cap field to match spec.
The next patch will insert many quotes from the virtio 1.0 spec; they
make most sense if we copy the spec.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Fri, 13 Feb 2015 06:43:41 +0000 (17:13 +1030)]
tools/lguest: fix features_accepted logic in example launcher.
We were clearing the lower bits when setting the upper bits.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
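As a rough illustration of the kind of bug described above, here is one way to store a 32-bit half of a 64-bit feature word without disturbing the other half. The helper name and its select convention (0 for the low half, 1 for the high half) are assumptions made for this sketch, not the launcher's actual code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: replace one 32-bit half of the accepted-features
 * word. select 0 writes bits 0..31, select 1 writes bits 32..63.
 * The other half is preserved by masking before OR-ing in the new value.
 */
uint64_t demo_set_feature_half(uint64_t accepted, unsigned select,
			       uint32_t val)
{
	assert(select <= 1);
	if (select == 0)
		return (accepted & 0xFFFFFFFF00000000ULL) | val;
	return (accepted & 0x00000000FFFFFFFFULL) | ((uint64_t)val << 32);
}
```

A plain assignment such as accepted = (uint64_t)val << 32 would be the failure mode: the low bits are lost as soon as the upper bits are written.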
Rusty Russell [Fri, 13 Feb 2015 06:43:40 +0000 (17:13 +1030)]
tools/lguest: handle device reset correctly in example launcher.
The example launcher doesn't reset the queue_enable like the spec says
we have to. Plus, we should reset the size in case they negotiated
a different (smaller) one.
This is easy to test by unloading and reloading a virtio module.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Luis R. Rodriguez [Fri, 13 Feb 2015 06:43:40 +0000 (17:13 +1030)]
virtual: Documentation: simplify and generalize paravirt_ops.txt
The general documentation we have for pv_ops is currently present
in the IA64 docs, but since this documentation covers IA64 Xen
enablement and IA64 Xen support got ripped out a while ago
through commit
d52eefb47 (present since v3.14-rc1), let's just
simplify, generalize and move the pv_ops documentation to a
shared place.
Cc: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: virtualization@lists.linux-foundation.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:58:01 +0000 (15:28 +1030)]
lguest: remove NOTIFY call and eventfd facility.
Disappointing, as this was kind of neat (especially getting to use RCU
to manage the address -> eventfd mapping). But now that the devices are PCI
and handled in userspace, we get rid of both the NOTIFY hypercall and
the interface to connect an eventfd.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:57:01 +0000 (15:27 +1030)]
lguest: remove NOTIFY facility from demonstration launcher.
This was only used for early console, now we can get rid of it altogether.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:56:01 +0000 (15:26 +1030)]
lguest: use the PCI console device's emerg_wr for early boot messages.
This involves manually checking the console device (which is always in
slot 1 of bus 0) and using the window in VIRTIO_PCI_CAP_PCI_CFG to
program it (as we can't map the BAR yet).
We could in fact do this much earlier, but we wait for the first
write from the virtio_cons_early_init() facility.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:55:01 +0000 (15:25 +1030)]
lguest: always put console in PCI slot #1.
This simplifies the early probe.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:54:01 +0000 (15:24 +1030)]
lguest: support backdoor window.
The VIRTIO_PCI_CAP_PCI_CFG in the PCI virtio 1.0 spec allows access to
the BAR registers without mapping them. This is a compulsory feature,
and we implement it here.
There are some subtleties involving access widths which we should
note:
4.1.4.7.1 Device Requirements: PCI configuration access capability
...
Upon detecting driver write access to pci_cfg_data, the device MUST
execute a write access at offset cap.offset at BAR selected by
cap.bar using the first cap.length bytes from pci_cfg_data.
Upon detecting driver read access to pci_cfg_data, the device MUST
execute a read access of length cap.length at offset cap.offset at
BAR selected by cap.bar and store the first cap.length bytes in
pci_cfg_data.
So, for a write, we copy into the pci_cfg_data window, then write from
there out to the BAR. This works correctly if cap.length != width of
write. Similarly, for a read, we read into window from the BAR then
read the value from there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
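The two-step copy described above can be sketched as follows. Everything here is a simplified assumption for illustration: the demo_cfg_window struct, the flat byte array standing in for the BAR, and the choice to always place the driver's access at the start of the window.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the capability fields that matter here. */
struct demo_cfg_window {
	uint32_t offset;	/* cap.offset: byte offset into the BAR */
	uint32_t length;	/* cap.length: 1, 2 or 4 */
	uint8_t data[4];	/* pci_cfg_data */
};

/* Nonzero if both the driver access and the BAR access stay in bounds. */
int demo_window_ok(const struct demo_cfg_window *w, size_t bar_len,
		   size_t width)
{
	return width <= sizeof(w->data) && w->length <= sizeof(w->data) &&
	       w->offset <= bar_len && w->length <= bar_len - w->offset;
}

/* Write: driver's store lands in the window, then cap.length bytes go out. */
int demo_window_write(struct demo_cfg_window *w, uint8_t *bar,
		      size_t bar_len, const void *val, size_t width)
{
	assert(w && bar && val);
	if (!demo_window_ok(w, bar_len, width))
		return -1;
	memcpy(w->data, val, width);
	memcpy(bar + w->offset, w->data, w->length);
	return 0;
}

/* Read: cap.length bytes come in from the BAR, then the driver reads them. */
int demo_window_read(struct demo_cfg_window *w, const uint8_t *bar,
		     size_t bar_len, void *out, size_t width)
{
	assert(w && bar && out);
	if (!demo_window_ok(w, bar_len, width))
		return -1;
	memcpy(w->data, bar + w->offset, w->length);
	memcpy(out, w->data, width);
	return 0;
}
```

Going through the window in both directions is what makes a mismatch between the access width and cap.length harmless: only the first cap.length bytes ever reach or leave the BAR.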
Rusty Russell [Wed, 11 Feb 2015 04:53:01 +0000 (15:23 +1030)]
lguest: support emerg_wr in console device in example launcher.
This is a magic register which causes a character to be output: it can
be used even before the device is configured.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:52:01 +0000 (15:22 +1030)]
lguest: remove lguest bus definitions from header.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:51:01 +0000 (15:21 +1030)]
lguest: remove support for lguest bus in demonstration launcher.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:50:01 +0000 (15:20 +1030)]
lguest: remove support for lguest bus.
The demonstration launcher now uses PCI entirely.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:49:01 +0000 (15:19 +1030)]
lguest: define VIRTIO_CONFIG_NO_LEGACY in example launcher.
We only support virtio 1.0 now.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:48:01 +0000 (15:18 +1030)]
lguest: Convert console device to virtio 1.0 PCI.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:47:01 +0000 (15:17 +1030)]
lguest: Convert entropy device to virtio 1.0 PCI.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:46:01 +0000 (15:16 +1030)]
lguest: Convert net device to virtio 1.0 PCI.
The only real change here (other than using the PCI bus) is that we
didn't negotiate VIRTIO_NET_F_MRG_RXBUF before, so the format of the
packet header changed with virtio 1.0; we need TUNSETVNETHDRSZ on the
tun fd to tell it about the extra two bytes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:12 +0000 (15:15 +1030)]
lguest: Convert block device to virtio 1.0 PCI.
We remove SCSI support (which was removed for 1.0) and VIRTIO_BLK_F_FLUSH
feature flag (removed too, since it's compulsory for 1.0).
The rest is mainly mechanical.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:12 +0000 (15:15 +1030)]
lguest: add a dummy PCI host bridge.
Otherwise Linux fails to find the bus.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:11 +0000 (15:15 +1030)]
lguest: fix failure to find linux/virtio_types.h
We want to use the local kernel headers, but -I../../include/uapi leads us into
a world of hurt. Instead we create a dummy include/ dir with symlinks.
If we just use #include "../../include/uapi/linux/virtio_blk.h" we get:
../../include/uapi/linux/virtio_blk.h:31:32: fatal error: linux/virtio_types.h: No such file or directory
#include <linux/virtio_types.h>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:11 +0000 (15:15 +1030)]
lguest: implement virtio-PCI MMIO accesses.
For each device, we need to include the vendor capabilities to demark
where virtio common, notification and ISR regions are (we put them
all in BAR0).
We need to handle the switching of the virtqueues using the accessors.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:11 +0000 (15:15 +1030)]
lguest: add PCI config space emulation to example launcher.
This handles ioport 0xCF8 and 0xCFC accesses, which are used to
read/write PCI device config space.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
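For readers unfamiliar with the 0xCF8/0xCFC mechanism mentioned above: the guest writes an address word to port 0xCF8, then reads or writes the selected config register through port 0xCFC. A minimal sketch of decoding that address word follows; the struct and function names are hypothetical and this is not the launcher's code.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the address word written to port 0xCF8. */
struct demo_pci_addr {
	unsigned enabled;	/* bit 31: config access enable */
	unsigned bus;		/* bits 23..16 */
	unsigned device;	/* bits 15..11 (the "slot") */
	unsigned function;	/* bits 10..8 */
	unsigned reg;		/* bits 7..2, kept as a dword-aligned offset */
};

/* Split the address word into its fields. */
struct demo_pci_addr demo_decode_cf8(uint32_t v)
{
	struct demo_pci_addr a;

	a.enabled = (v >> 31) & 0x1;
	a.bus = (v >> 16) & 0xff;
	a.device = (v >> 11) & 0x1f;
	a.function = (v >> 8) & 0x7;
	a.reg = v & 0xfc;
	return a;
}

/* Inverse of the decode, with the enable bit set. */
uint32_t demo_encode_cf8(unsigned bus, unsigned device, unsigned function,
			 unsigned reg)
{
	assert(bus < 256 && device < 32 && function < 8);
	assert(reg < 256 && (reg & 3) == 0);
	return 0x80000000u | (bus << 16) | (device << 11) |
	       (function << 8) | reg;
}
```

With this encoding, bus 0, slot 1, function 0, register 0 is the word 0x80000800.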
Rusty Russell [Wed, 11 Feb 2015 04:45:11 +0000 (15:15 +1030)]
lguest: decode mmio accesses for PCI in example launcher.
We don't do anything with them yet (emulate_mmio_write and
emulate_mmio_read are stubs), but we decode the instructions and
search for the device they're hitting.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:11 +0000 (15:15 +1030)]
lguest: add MMIO region allocator in example launcher.
This is where we point our PCI BARs, so that we can intercept MMIO
accesses. We tell the kernel about it so any faults in this area are
directed to us.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:10 +0000 (15:15 +1030)]
lguest: Override pcibios_enable_irq/pcibios_disable_irq to our stupid PIC
This lets us deliver interrupts for our emulated PCI devices using our
dumb PIC, and not emulate an 8259 and PCI irq mapping tables or whatever.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:10 +0000 (15:15 +1030)]
lguest: disable ACPI explicitly.
Once we add PCI, it starts trying to manage our interrupts.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:10 +0000 (15:15 +1030)]
lguest: add iomem region, where guest page faults get sent to userspace.
This lets us implement PCI.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:10 +0000 (15:15 +1030)]
lguest: don't disable iospace.
This no longer speeds up boot (IDE got better, I guess), but it does stop
us probing for a PCI bus.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:10 +0000 (15:15 +1030)]
lguest: suppress PS/2 keyboard polling.
While hacking on getting I/O out to the lguest launcher, I noticed
that returning 0xFF for the PS/2 keyboard status made it spin for a
while thinking there was a key pending. Fix this by returning 1
instead of 0xFF.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:10 +0000 (15:15 +1030)]
lguest: send trap 13 through to userspace.
We copy 7 bytes at eip for userspace's instruction decode; we have to
carefully handle the case where eip is at the end of a page. We can't
leave this to userspace since kernel has all the page table decode
logic.
The decode logic moves to userspace, basically unchanged.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
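The end-of-page case mentioned above comes down to splitting the copy where it would cross a page boundary, so each part can be checked against its own mapping. The sketch below only computes the split; the helper name and the 4096-byte page size are assumptions for illustration, and how the kernel side actually performs the copy is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u

/*
 * Hypothetical helper: split a copy of 'want' bytes starting at 'addr'
 * into the part that fits in the current page and the part that spills
 * into the next page.
 */
void demo_split_at_page(uint32_t addr, size_t want, size_t *first,
			size_t *second)
{
	size_t left = DEMO_PAGE_SIZE - (addr & (DEMO_PAGE_SIZE - 1));

	assert(first && second && want <= DEMO_PAGE_SIZE);
	*first = want < left ? want : left;
	*second = want - *first;
}
```

If the second part lands on an unmapped page, only the first part can be handed to the instruction decoder.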
Rusty Russell [Wed, 11 Feb 2015 04:45:09 +0000 (15:15 +1030)]
lguest: add infrastructure to check mappings.
We normally abort the guest unconditionally when it gives us a bad address,
but in the next patch we want to copy some bytes which may not be mapped.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:09 +0000 (15:15 +1030)]
lguest: add infrastructure for userspace to deliver a trap to the guest.
This is required for instruction emulation to move to userspace.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:09 +0000 (15:15 +1030)]
lguest: write more information to userspace about pending traps.
This is preparation for userspace handling MMIO and ioport accesses.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:09 +0000 (15:15 +1030)]
lguest: add operations to get/set a register from the Launcher.
We use the ptrace API struct, and we currently don't let them set
anything but the normal registers (we'd have to filter the others).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:45:09 +0000 (15:15 +1030)]
lguest: have --rng read from /dev/urandom not /dev/random.
Theoretical debates aside, now it boots.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:31:14 +0000 (15:01 +1030)]
virtio: don't require a config space on the console device.
Strictly, it's only needed when we have features (size or multiport).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 11 Feb 2015 04:31:14 +0000 (15:01 +1030)]
virtio_pci: use 16-bit accessor for queue_enable.
Since PCI is little endian, 8-bit access might work, but the spec section
is very clear on this:
4.1.3.1 Driver Requirements: PCI Device Layout
The driver MUST access each field using the “natural” access method,
i.e. 32-bit accesses for 32-bit fields, 16-bit accesses for 16-bit
fields and 8-bit accesses for 8-bit fields.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Rusty Russell [Wed, 11 Feb 2015 04:31:14 +0000 (15:01 +1030)]
virtio: Don't expose legacy config features when VIRTIO_CONFIG_NO_LEGACY defined.
The VIRTIO_F_ANY_LAYOUT and VIRTIO_F_NOTIFY_ON_EMPTY features are pre-1.0
only.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Rusty Russell [Wed, 11 Feb 2015 04:31:14 +0000 (15:01 +1030)]
virtio: Don't expose legacy block features when VIRTIO_BLK_NO_LEGACY defined.
This allows modern implementations to ensure they don't use legacy
feature bits or SCSI commands (which are not used in v1.0 non-legacy).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Rusty Russell [Wed, 11 Feb 2015 04:31:14 +0000 (15:01 +1030)]
virtio: define VIRTIO_PCI_CAP_PCI_CFG in header.
This provides backdoor access to the device MMIOs, and every device should
have one. From the virtio 1.0 spec (CS03):
4.1.4.7.1 Device Requirements: PCI configuration access capability
The device MUST present at least one VIRTIO_PCI_CAP_PCI_CFG capability.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tetsuo Handa [Wed, 11 Feb 2015 04:31:13 +0000 (15:01 +1030)]
virtio: Avoid possible kernel panic if DEBUG is enabled.
The virtqueue_add() calls START_USE() upon entry. The virtqueue_kick() is
called if vq->num_added == (1 << 16) - 1 before calling END_USE().
The virtqueue_kick_prepare() called via virtqueue_kick() calls START_USE()
upon entry, and will call panic() if DEBUG is enabled.
Move this virtqueue_kick() call to after END_USE() call.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
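The reentrancy problem described above can be modeled in a few lines. This is a toy model under stated assumptions: demo_vq, its counters, and the demo_ functions are hypothetical, and a counter stands in for the panic() that the DEBUG build would trigger.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_KICK_THRESHOLD ((1u << 16) - 1)

/* Toy virtqueue: just the state needed to show the ordering issue. */
struct demo_vq {
	bool in_use;
	unsigned reentry_errors;	/* stands in for panic() under DEBUG */
	unsigned num_added;
	unsigned kicks;
};

void demo_start_use(struct demo_vq *vq)
{
	assert(vq);
	if (vq->in_use)
		vq->reentry_errors++;
	vq->in_use = true;
}

void demo_end_use(struct demo_vq *vq)
{
	assert(vq);
	vq->in_use = false;
}

/* The kick path takes the same in-use marker on entry. */
void demo_kick(struct demo_vq *vq)
{
	demo_start_use(vq);
	vq->kicks++;
	vq->num_added = 0;
	demo_end_use(vq);
}

/* Old ordering: kick while still marked in use. */
void demo_add_buggy(struct demo_vq *vq)
{
	demo_start_use(vq);
	if (++vq->num_added == DEMO_KICK_THRESHOLD)
		demo_kick(vq);
	demo_end_use(vq);
}

/* New ordering: drop the marker first, then kick. */
void demo_add_fixed(struct demo_vq *vq)
{
	bool need_kick;

	demo_start_use(vq);
	need_kick = (++vq->num_added == DEMO_KICK_THRESHOLD);
	demo_end_use(vq);
	if (need_kick)
		demo_kick(vq);
}
```

In the model, only the buggy ordering ever increments reentry_errors, and only on the add that reaches the threshold.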
Pawel Moll [Fri, 23 Jan 2015 04:15:55 +0000 (14:45 +1030)]
virtio-mmio: Update the device to OASIS spec version
This patch adds support for the second version of the virtio-mmio device,
which follows OASIS "Virtual I/O Device (VIRTIO) Version 1.0"
specification.
Main changes:
1. The control register symbolic names use the new device/driver
nomenclature rather than the old guest/host one.
2. The driver detects the device version (version 1 is the pre-OASIS
spec, version 2 is compatible with first revision of the OASIS spec)
and drives the device accordingly.
3. New version uses direct addressing (64 bit address split into two
low/high registers) instead of the guest page size based one,
and addresses each part of the queue (descriptors, available, used)
separately.
4. The device activity is now explicitly triggered by writing to the
"queue ready" register.
5. Whole 64 bit features are properly handled now (both ways).
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
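Point 3 above (a 64 bit address split into low/high registers) amounts to the following arithmetic. The function names are hypothetical; the sketch shows only the split and rejoin, not the register writes themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit address into the values for a low/high register pair. */
void demo_split_addr(uint64_t addr, uint32_t *lo, uint32_t *hi)
{
	assert(lo && hi);
	*lo = (uint32_t)(addr & 0xFFFFFFFFu);
	*hi = (uint32_t)(addr >> 32);
}

/* Rebuild the 64-bit address from the two register values. */
uint64_t demo_join_addr(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Each part of the queue (descriptors, available, used) would get its own such pair, which is what lets them be placed independently.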
Michael S. Tsirkin [Tue, 20 Jan 2015 12:30:52 +0000 (14:30 +0200)]
virtio_pci_modern: drop an unused function
release function in modern driver is unused:
it's a left-over from when each driver had
to have its own release.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Thu, 15 Jan 2015 15:54:13 +0000 (17:54 +0200)]
virtio_pci: add module param to force legacy mode
If set, try legacy interface first, modern one if that fails. Useful to
work around device/driver bugs, and for compatibility testing.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Thu, 15 Jan 2015 14:06:26 +0000 (16:06 +0200)]
virtio_pci: add an option to disable legacy driver
Useful for testing device virtio 1 compatibility.
Based on patch by Rusty - couldn't resist putting
that flying car joke in there!
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Thu, 15 Jan 2015 12:16:24 +0000 (14:16 +0200)]
virtio_pci: drop Kconfig warnings
The ABI *is* stable, and has been for a while now.
Drop Kconfig warning saying that it's not guaranteed
to work.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Thu, 15 Jan 2015 12:15:51 +0000 (14:15 +0200)]
virtio_pci: Kconfig grammar fix
This drivers -> this driver.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Thu, 15 Jan 2015 11:50:06 +0000 (13:50 +0200)]
virtio_rng: drop extra empty line
makes code look a bit prettier.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Thu, 15 Jan 2015 11:33:31 +0000 (13:33 +0200)]
virtio_ring: coding style fix
Most of our code has
struct foo {
}
Fix one instance where ring is inconsistent.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Thu, 15 Jan 2015 11:33:31 +0000 (13:33 +0200)]
virtio_blk: coding style fixes
Most of our code has
struct foo {
}
Fix two instances where blk is inconsistent.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Thu, 15 Jan 2015 11:33:31 +0000 (13:33 +0200)]
virtio_balloon: coding style fixes
Most of our code has
struct foo {
}
Fix two instances where balloon is inconsistent.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Tue, 13 Jan 2015 14:34:58 +0000 (16:34 +0200)]
virtio_pci_modern: support devices with no config
Virtio 1.0 spec lists device config as optional.
Set get/set callbacks to NULL. Drivers can check that
and fail gracefully.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Wed, 14 Jan 2015 16:50:55 +0000 (18:50 +0200)]
virtio_pci_modern: reduce number of mappings
We don't know the # of VQs that drivers are going to use so it's hard to
predict how much memory we'll need to map. However, the relevant
capability does give us an upper limit.
If that's below a page, we can reduce the number of required
mappings by mapping it all once ahead of the time.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Thu, 30 May 2013 06:59:32 +0000 (16:29 +0930)]
virtio_pci: macros for PCI layout offsets
QEMU wants it, so why not? Trust, but verify.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Thu, 11 Dec 2014 11:59:51 +0000 (13:59 +0200)]
virtio_pci: modern driver
Lightly tested against qemu.
One thing *not* implemented here is separate mappings
for descriptor/avail/used rings. That's nice to have,
will be done later after we have core support.
This also exposes the PCI layout to userspace, and
adds macros for PCI layout offsets:
QEMU wants it, so why not? Trust, but verify.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rusty Russell [Wed, 29 May 2013 02:22:22 +0000 (11:52 +0930)]
virtio-pci: define layout for virtio 1.0
Based on patches by Michael S. Tsirkin <mst@redhat.com>, but I found it
hard to follow so changed to use structures which are more
self-documenting.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Michael S. Tsirkin [Tue, 13 Jan 2015 09:23:32 +0000 (11:23 +0200)]
virtio_pci: move probe/remove code to common
Most of initialization is device-independent.
Let's move it to common.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Sasha Levin [Fri, 2 Jan 2015 19:47:39 +0000 (14:47 -0500)]
virtio_pci: drop useless del_vqs call
Device VQs were getting freed twice: once in every device's removal
functions, and then again in virtio_pci_legacy_remove(). The ones in
devices are called first, so drop the useless second call.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Wed, 29 May 2013 02:22:21 +0000 (11:52 +0930)]
s390: add pci_iomap_range
Virtio drivers should map the part of the range they need, not
necessarily all of it.
To this end, support mapping ranges within BAR on s390.
Since multiple ranges can now be mapped within a BAR, we keep track of
the number of mappings created, and only clear out the mapping for a BAR
when this number reaches 0.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
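The bookkeeping described above (count the mappings per BAR, tear down only when the count reaches 0) is ordinary reference counting. A minimal sketch, with a hypothetical demo_bar struct and a flag standing in for the real mapping state:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-BAR state: how many ranges are mapped right now. */
struct demo_bar {
	unsigned map_count;
	bool mapped;
};

/* Map one more range; the first mapping sets up the BAR. */
void demo_map_range(struct demo_bar *bar)
{
	assert(bar);
	if (bar->map_count++ == 0)
		bar->mapped = true;
}

/* Unmap one range; the last unmapping clears the BAR's mapping. */
void demo_unmap_range(struct demo_bar *bar)
{
	assert(bar);
	if (bar->map_count == 0)
		return;
	if (--bar->map_count == 0)
		bar->mapped = false;
}
```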
Michael S. Tsirkin [Wed, 29 May 2013 02:22:21 +0000 (11:52 +0930)]
pci: add pci_iomap_range
Virtio drivers should map the part of the BAR they need, not necessarily
all of it.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Sun, 14 Dec 2014 12:54:17 +0000 (14:54 +0200)]
mn10300: drop dead code
pci-iomap.c was (apparently, mistakenly) reintroduced as part of
commit
83c2dc15ce824450e7044b9f90cd529c25747ae0
MN10300: Handle cacheable PCI regions in pci_iomap()
probably as side-effect of forward-porting the patch
from an old kernel.
It's not really needed: the generic pci_iomap does the right thing here.
The new file isn't compiled so it's safe to drop.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Cc: trivial@kernel.org
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Mon, 12 Jan 2015 14:23:37 +0000 (16:23 +0200)]
virtio/balloon: verify device has config space
Some devices might not implement config space access
(e.g. remoteproc used not to - before 3.9).
virtio/balloon needs config space access so make it
fail gracefully if not there.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Mon, 12 Jan 2015 14:23:37 +0000 (16:23 +0200)]
virtio/scsi: verify device has config space
Some devices might not implement config space access
(e.g. remoteproc used not to - before 3.9).
virtio/scsi needs config space access so make it
fail gracefully if not there.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Mon, 12 Jan 2015 14:23:37 +0000 (16:23 +0200)]
virtio/net: verify device has config space
Some devices might not implement config space access
(e.g. remoteproc used not to - before 3.9).
virtio/net needs config space access so make it
fail gracefully if not there.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Mon, 12 Jan 2015 14:23:37 +0000 (16:23 +0200)]
virtio/console: verify device has config space
Some devices might not implement config space access
(e.g. remoteproc used not to - before 3.9).
virtio/console needs config space access so make it
fail gracefully if not there.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Mon, 12 Jan 2015 14:23:37 +0000 (16:23 +0200)]
virtio/blk: verify device has config space
Some devices might not implement config space access
(e.g. remoteproc used not to - before 3.9).
virtio/blk needs config space access so make it
fail gracefully if not there.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Mon, 12 Jan 2015 14:23:37 +0000 (16:23 +0200)]
virtio/9p: verify device has config space
Some devices might not implement config space access
(e.g. remoteproc used not to - before 3.9).
virtio/9p needs config space access so make it
fail gracefully if not there.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Michael S. Tsirkin [Sun, 28 Dec 2014 10:35:05 +0000 (12:35 +0200)]
virtio_pci: drop virtio_config dependency
virtio_pci does not depend on virtio_config:
let's not include it, users can pull it in as necessary.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Linus Torvalds [Tue, 20 Jan 2015 19:54:16 +0000 (07:54 +1200)]
Merge branch 'for-3.19-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
- Bartlomiej will be co-maintaining PATA portion of libata. git
workflow will stay the same.
- sata_sil24 wasn't happy with tag ordered submission. An option to
restore the old tag allocation behavior is implemented for sil24.
- a very old race condition in PIO host state machine which can trigger
BUG fixed.
- other driver-specific changes
* 'for-3.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: prevent HSM state change race between ISR and PIO
libata: allow sata_sil24 to opt-out of tag ordered submission
ata: pata_at91: depend on !ARCH_MULTIPLATFORM
ahci: Remove Device ID for Intel Sunrise Point PCH
ahci: Use dev_info() to inform about the lack of Device Sleep support
libata: Whitelist SSDs that are known to properly return zeroes after TRIM
sata_dwc_460ex: fix resource leak on error path
ata: add MAINTAINERS entry for libata PATA drivers
libata: clean up MAINTAINERS entries
libata: export ata_get_cmd_descript()
ahci_xgene: Fix the DMA state machine lockup for the ATA_CMD_PACKET PIO mode command.
ahci_xgene: Fix the endianess issue in APM X-Gene SoC AHCI SATA controller driver.
Linus Torvalds [Tue, 20 Jan 2015 19:51:46 +0000 (07:51 +1200)]
Merge branch 'for-3.19-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"The xfs folks have been running into weird and very rare lockups for
some time now. I didn't think this could have been from workqueue
side because no one else was reporting it. This time, Eric had a
kdump which we looked into and it turned out this actually was a
workqueue bug and the bug has been there since the beginning of
concurrency managed workqueue.
A worker pool ensures forward progress of the workqueues associated
with it by always having at least one worker reserved from executing
work items. When the pool is under contention, the idle one tries to
create more workers for the pool and if that doesn't succeed quickly
enough, it calls the rescuers to the pool.
This logic had a subtle race condition in an early exit path. When a
worker invokes this manager function, the function may return %false
indicating that the caller may proceed to executing work items either
because another worker is already performing the role or conditions
have changed and the pool is no longer under contention.
The latter part depended on the assumption that whether more workers
are necessary or not remains stable while the pool is locked; however,
pool->nr_running (concurrency count) may change asynchronously and it
getting bumped from zero asynchronously could send off the last idle
worker to execute work items.
The race window is fairly narrow, and, even when it gets triggered,
the pool deadlocks iff all work items get blocked on pending work
items of the pool, which is highly unlikely but can be triggered by
xfs.
The patch removes the race window by removing the early exit path,
which doesn't serve any purpose anymore anyway"
* 'for-3.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix subtle pool management issue which can stall whole worker_pool
Linus Torvalds [Tue, 20 Jan 2015 09:23:41 +0000 (21:23 +1200)]
Merge tag 'pinctrl-v3.19-3' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Here is a (hopefully final) slew of pin control fixes for the v3.19
series. The deadlock fix is kind of serious and tagged for stable,
the rest is business as usual.
- Fix two deadlocks around the pin control mutexes, a long-standing
issue that manifests itself in plug/unplug of pin controllers.
(Tagged for stable.)
- Handle an error path with zero functions in the Qualcomm pin
controller.
- Drop a bogus second GPIO chip added in the Lantiq driver.
- Fix sudden IRQ loss on Rockchip pin controllers.
- Register the GIT tree in MAINTAINERS"
* tag 'pinctrl-v3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: MAINTAINERS: add git tree reference
pinctrl: qcom: Don't iterate past end of function array
pinctrl: lantiq: remove bogus of_gpio_chip_add
pinctrl: Fix two deadlocks
pinctrl: rockchip: Avoid losing interrupts when supporting both edges
Linus Torvalds [Tue, 20 Jan 2015 06:19:31 +0000 (18:19 +1200)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Socket addresses returned in the error queue need to be fully
initialized before being passed on to userspace, fix from Willem de
Bruijn.
2) Interrupt handling fixes to davinci_emac driver from Tony Lindgren.
3) Fix races between receive packet steering and cpu hotplug, from Eric
Dumazet.
4) Allowing netlink sockets to subscribe to unknown multicast groups
leads to crashes, don't allow it. From Johannes Berg.
5) One to many socket races in SCTP fixed by Daniel Borkmann.
6) Put in a guard against the mis-use of ipv6 atomic fragments, from
Hagen Paul Pfeifer.
7) Fix promisc mode and ethtool crashes in sh_eth driver, from Ben
Hutchings.
8) NULL deref and double kfree fix in sxgbe driver from Girish K.S and
Byungho An.
9) cfg80211 deadlock fix from Arik Nemtsov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
s2io: use snprintf() as a safety feature
r8152: remove sram_read
r8152: remove generic_ocp_read before writing
bgmac: activate irqs only if there is nothing to poll
bgmac: register napi before the device
sh_eth: Fix ethtool operation crash when net device is down
sh_eth: Fix promiscuous mode on chips without TSU
ipv6: stop sending PTB packets for MTU < 1280
net: sctp: fix race for one-to-many sockets in sendmsg's auto associate
genetlink: synchronize socket closing and family removal
genetlink: disallow subscribing to unknown mcast groups
genetlink: document parallel_ops
net: rps: fix cpu unplug
net: davinci_emac: Add support for emac on dm816x
net: davinci_emac: Fix ioremap for devices with MDIO within the EMAC address space
net: davinci_emac: Fix incomplete code for getting the phy from device tree
net: davinci_emac: Free clock after checking the frequency
net: davinci_emac: Fix runtime pm calls for davinci_emac
net: davinci_emac: Fix hangs with interrupts
ip: zero sockaddr returned on error queue
...
Linus Torvalds [Tue, 20 Jan 2015 06:17:34 +0000 (18:17 +1200)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a regression that arose from the change to add a crypto
prefix to module names which was done to prevent the loading of
arbitrary modules through the Crypto API.
In particular, a number of modules were missing the crypto prefix
which meant that they could no longer be autoloaded"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: add missing crypto module aliases
Dan Carpenter [Mon, 19 Jan 2015 19:34:51 +0000 (22:34 +0300)]
s2io: use snprintf() as a safety feature
"sp->desc[i]" has 25 characters. "dev->name" has 15 characters. If we
used all 15 characters then the sprintf() would overflow.
I changed the "sprintf(sp->name, "%s Neterion %s"" to snprintf() as
well, even though it can't overflow, just to be consistent.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
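A minimal sketch of the bounded-formatting idea described above. The helper
name, the format string, and the test strings are assumptions chosen for
illustration; only the 25-byte destination size comes from the commit
message. This is not the driver's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DESC_LEN 25   /* destination size quoted in the commit message */

/*
 * Hypothetical helper: format "<dev_name>:<suffix>" into a DESC_LEN-byte
 * buffer. Returns 1 if the whole string fit, 0 if snprintf() truncated it.
 * Unlike sprintf(), snprintf() never writes past DESC_LEN bytes.
 */
static int format_desc(char desc[DESC_LEN], const char *dev_name,
                       const char *suffix)
{
    int needed = snprintf(desc, DESC_LEN, "%s:%s", dev_name, suffix);

    /* snprintf() returns the length it wanted to write, excluding the NUL. */
    return needed >= 0 && needed < DESC_LEN;
}
```

With a 15-character device name and a long suffix the call reports
truncation instead of overflowing, and the buffer stays NUL-terminated.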
David S. Miller [Mon, 19 Jan 2015 21:16:36 +0000 (16:16 -0500)]
Merge branch 'r8152'
Hayes Wang says:
====================
r8152: couldn't read OCP_SRAM_DATA
Reading OCP_SRAM_DATA reads additional bytes and may put the hw into
an abnormal state.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Mon, 19 Jan 2015 09:02:46 +0000 (17:02 +0800)]
r8152: remove sram_read
Reading OCP registers 0xa43a~0xa43b clears some flags which the hw
uses, and that may cause the device to be lost. However, the unit of
reading is 4 bytes. That is, calling sram_read() to read OCP_SRAM_DATA
actually reads 0xa438~0xa43b.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Mon, 19 Jan 2015 09:02:45 +0000 (17:02 +0800)]
r8152: remove generic_ocp_read before writing
For ocp_write_word() and ocp_write_byte(), there is a generic_ocp_read()
which is used to read the whole 4 byte data, keep the unchanged bytes,
and modify the expected bytes. However, the "byen" could be used to
determine which bytes of the 4 bytes to write, so the action could be
removed.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
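The byte-enable idea can be sketched as follows. This is a user-space model
under stated assumptions, not the r8152 code: masked_write() stands in for
what the hardware does with a byte-enable mask, and write_word() for a
driver-side helper that no longer needs to read the register first. Both
names and the bit layout of the mask are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a 4-byte register with per-byte write enables.
 * Bit n of byen set means byte n (counting from the least significant
 * byte) is overwritten; all other bytes keep their old value.
 */
static uint32_t masked_write(uint32_t reg, uint32_t data, uint8_t byen)
{
    uint32_t mask = 0;

    for (int i = 0; i < 4; i++)
        if (byen & (1u << i))
            mask |= 0xffu << (8 * i);

    return (reg & ~mask) | (data & mask);
}

/* Write a 16-bit value at byte offset 0 or 2 without a prior read. */
static uint32_t write_word(uint32_t reg, unsigned offset, uint16_t value)
{
    unsigned shift = offset & 2;              /* 0 or 2 bytes */
    uint8_t byen = (uint8_t)(0x3u << shift);  /* enable two adjacent bytes */

    return masked_write(reg, (uint32_t)value << (8 * shift), byen);
}
```

Because the mask already selects which bytes change, the caller never has
to fetch the untouched bytes, which is the read the commit removes.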
David S. Miller [Mon, 19 Jan 2015 21:00:02 +0000 (16:00 -0500)]
Merge branch 'bgmac'
Hauke Mehrtens says:
====================
bgmac: some fixes to napi usage
I compared the napi documentation with the bgmac driver and found some
problems in that driver. These two patches should fix the problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hauke Mehrtens [Sun, 18 Jan 2015 18:49:59 +0000 (19:49 +0100)]
bgmac: activate irqs only if there is nothing to poll
IRQs should only get activated when there is nothing to poll in the
queue any more and not after every poll.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
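A rough sketch of the polling rule stated above, using a hypothetical
fake_dev structure rather than the bgmac or NAPI types: interrupts are
re-enabled only when the queue was drained before the budget ran out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not the bgmac or NAPI structures. */
struct fake_dev {
    int pending;       /* packets waiting in the queue */
    bool irq_enabled;  /* whether the device may raise interrupts */
};

/*
 * Process at most `budget` packets. If the budget is used up there may be
 * more work, so interrupts stay off and the caller is expected to poll
 * again. Only a poll that finishes early turns interrupts back on.
 */
static int fake_poll(struct fake_dev *dev, int budget)
{
    int done = 0;

    while (done < budget && dev->pending > 0) {
        dev->pending--;
        done++;
    }

    if (done < budget)
        dev->irq_enabled = true;

    return done;
}
```

With 10 pending packets and a budget of 4, the first two polls leave
interrupts disabled and only the third, which finds 2 packets, enables them.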
Hauke Mehrtens [Sun, 18 Jan 2015 18:49:58 +0000 (19:49 +0100)]
bgmac: register napi before the device
napi should get registered before the netdev and not after.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Jan 2015 20:37:44 +0000 (15:37 -0500)]
Merge branch 'sh_eth'
Ben Hutchings says:
====================
sh_eth fixes
I'm currently looking at Ethernet support on the R-Car H2 chip,
reviewing and testing the sh_eth driver. Here are fixes for two fairly
obvious bugs in the driver; I will probably have some more later.
These are not tested on any of the other supported chips.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Fri, 16 Jan 2015 17:51:25 +0000 (17:51 +0000)]
sh_eth: Fix ethtool operation crash when net device is down
The driver connects and disconnects the PHY device whenever the
net device is brought up and down. The ethtool get_settings,
set_settings and nway_reset operations will dereference a null
or dangling pointer if called while it is down.
I think it would be preferable to keep the PHY connected, but there
may be good reasons not to.
As an immediate fix for this bug:
- Set the phydev pointer to NULL after disconnecting the PHY
- Change those three operations to return -ENODEV while the PHY is
not connected
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
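The two-part fix listed above can be sketched like this. The fake_phy and
fake_priv types and both functions are hypothetical stand-ins, not the
sh_eth structures or ethtool callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PHY and the driver's private data. */
struct fake_phy { int speed; };
struct fake_priv { struct fake_phy *phydev; };

/* Bringing the device down: drop the PHY and clear the pointer. */
static void fake_close(struct fake_priv *priv)
{
    /* A real driver would disconnect the PHY here before clearing it. */
    priv->phydev = NULL;
}

/* An ethtool-style query that refuses to run while no PHY is attached. */
static int fake_get_speed(const struct fake_priv *priv, int *speed)
{
    if (!priv->phydev)
        return -ENODEV;

    *speed = priv->phydev->speed;
    return 0;
}
```

Clearing the pointer is what makes the NULL check reliable: a dangling
pointer would pass the check and still be dereferenced.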
Ben Hutchings [Fri, 16 Jan 2015 17:51:12 +0000 (17:51 +0000)]
sh_eth: Fix promiscuous mode on chips without TSU
Currently net_device_ops::set_rx_mode is only implemented for
chips with a TSU (multiple address table). However we do need
to turn the PRM (promiscuous) flag on and off for other chips.
- Remove the unlikely() from the TSU functions that we may safely
call for chips without a TSU
- Make setting of the MCT flag conditional on the tsu capability flag
- Rename sh_eth_set_multicast_list() to sh_eth_set_rx_mode() and plumb
it into both net_device_ops structures
- Remove the previously-unreachable branch in sh_eth_rx_mode() that
would otherwise reset the flags to defaults for non-TSU chips
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hagen Paul Pfeifer [Thu, 15 Jan 2015 21:34:25 +0000 (22:34 +0100)]
ipv6: stop sending PTB packets for MTU < 1280
Reduce the attack vector and stop generating IPv6 Fragment Header for
paths with an MTU smaller than the minimum required IPv6 MTU
size (1280 bytes) - called atomic fragments.
See IETF I-D "Deprecating the Generation of IPv6 Atomic Fragments" [1]
for more information and how this "feature" can be misused.
[1] https://tools.ietf.org/html/draft-ietf-6man-deprecate-atomfrag-generation-00
Signed-off-by: Fernando Gont <fgont@si6networks.com>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Jeffery [Mon, 19 Jan 2015 19:03:25 +0000 (13:03 -0600)]
libata: prevent HSM state change race between ISR and PIO
It is possible for ata_sff_flush_pio_task() to set ap->hsm_task_state to
HSM_ST_IDLE in between the time __ata_sff_port_intr() checks for HSM_ST_IDLE
and before it calls ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG().
This problem is hard to reproduce, making this patch hard to verify, but this
fix will prevent the race.
I have not been able to reproduce the problem, but here is a crash dump from
a 2.6.32 kernel.
On examining the ata port's state, its hsm_task_state field has a value of HSM_ST_IDLE:
crash> struct ata_port.hsm_task_state ffff881c1121c000
hsm_task_state = 0
Normally, this should not be possible as ata_sff_hsm_move() was called from ata_sff_host_intr(),
which checks hsm_task_state and won't call ata_sff_hsm_move() if it has a HSM_ST_IDLE value.
PID: 11053 TASK: ffff8816e846cae0 CPU: 0 COMMAND: "sshd"
#0 [ffff88008ba03960] machine_kexec at ffffffff81038f3b
#1 [ffff88008ba039c0] crash_kexec at ffffffff810c5d92
#2 [ffff88008ba03a90] oops_end at ffffffff8152b510
#3 [ffff88008ba03ac0] die at ffffffff81010e0b
#4 [ffff88008ba03af0] do_trap at ffffffff8152ad74
#5 [ffff88008ba03b50] do_invalid_op at ffffffff8100cf95
#6 [ffff88008ba03bf0] invalid_op at ffffffff8100bf9b
[exception RIP: ata_sff_hsm_move+317]
RIP: ffffffff813a77ad RSP: ffff88008ba03ca0 RFLAGS: 00010097
RAX: 0000000000000000 RBX: ffff881c1121dc60 RCX: 0000000000000000
RDX: ffff881c1121dd10 RSI: ffff881c1121dc60 RDI: ffff881c1121c000
RBP: ffff88008ba03d00 R8: 0000000000000000 R9: 000000000000002e
R10: 000000000001003f R11: 000000000000009b R12: ffff881c1121c000
R13: 0000000000000000 R14: 0000000000000050 R15: ffff881c1121dd78
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff88008ba03d08] ata_sff_host_intr at ffffffff813a7fbd
#8 [ffff88008ba03d38] ata_sff_interrupt at ffffffff813a821e
#9 [ffff88008ba03d78] handle_IRQ_event at ffffffff810e6ec0
--- <IRQ stack> ---
[exception RIP: pipe_poll+48]
RIP: ffffffff81192780 RSP: ffff880f26d459b8 RFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff880f26d459c8 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff881a0539fa80
RBP: ffffffff8100bb8e R8: ffff8803b23324a0 R9: 0000000000000000
R10: ffff880f26d45dd0 R11: 0000000000000008 R12: ffffffff8109b646
R13: ffff880f26d45948 R14: 0000000000000246 R15: 0000000000000246
ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018
RIP: 00007f26017435c3 RSP: 00007fffe020c420 RFLAGS: 00000206
RAX: 0000000000000017 RBX: ffffffff8100b072 RCX: 00007fffe020c45c
RDX: 00007f2604a3f120 RSI: 00007f2604a3f140 RDI: 000000000000000d
RBP: 0000000000000000 R8: 00007fffe020e570 R9: 0101010101010101
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffe020e5f0
R13: 00007fffe020e5f4 R14: 00007f26045f373c R15: 00007fffe020e5e0
ORIG_RAX: 0000000000000017 CS: 0033 SS: 002b
Somewhere between the ata_sff_hsm_move() check and the ata_sff_host_intr() check, the value changed.
On examining the other cpus to see what else was running, another cpu was running the error handler
routines:
PID: 326 TASK: ffff881c11014aa0 CPU: 1 COMMAND: "scsi_eh_1"
#0 [ffff88008ba27e90] crash_nmi_callback at ffffffff8102fee6
#1 [ffff88008ba27ea0] notifier_call_chain at ffffffff8152d515
#2 [ffff88008ba27ee0] atomic_notifier_call_chain at ffffffff8152d57a
#3 [ffff88008ba27ef0] notify_die at ffffffff810a154e
#4 [ffff88008ba27f20] do_nmi at ffffffff8152b1db
#5 [ffff88008ba27f50] nmi at ffffffff8152aaa0
[exception RIP: _spin_lock_irqsave+47]
RIP: ffffffff8152a1ff RSP: ffff881c11a73aa0 RFLAGS: 00000006
RAX: 0000000000000001 RBX: ffff881c1121deb8 RCX: 0000000000000000
RDX: 0000000000000246 RSI: 0000000000000020 RDI: ffff881c122612d8
RBP: ffff881c11a73aa0 R8: ffff881c17083800 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff881c1121c000
R13: 000000000000001f R14: ffff881c1121dd50 R15: ffff881c1121dc60
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
--- <NMI exception stack> ---
#6 [ffff881c11a73aa0] _spin_lock_irqsave at ffffffff8152a1ff
#7 [ffff881c11a73aa8] ata_exec_internal_sg at ffffffff81396fb5
#8 [ffff881c11a73b58] ata_exec_internal at ffffffff81397109
#9 [ffff881c11a73bd8] atapi_eh_request_sense at ffffffff813a34eb
Before it tried to acquire a spinlock, ata_exec_internal_sg() called ata_sff_flush_pio_task().
This function will set ap->hsm_task_state to HSM_ST_IDLE, and has no locking around setting this
value. ata_sff_flush_pio_task() can then race with the interrupt handler and potentially set
HSM_ST_IDLE at a fatal moment, which will trigger a kernel BUG.
v2: Fixup comment in ata_sff_flush_pio_task()
tj: Further updated comment. Use ap->lock instead of shost lock and
use the [un]lock_irq variant instead of the irqsave/restore one.
Signed-off-by: David Milburn <dmilburn@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
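The shape of the fix can be sketched in user space, assuming a pthread mutex
as a stand-in for the port spinlock. The fake_port type, the two-value state
enum, and all function names are hypothetical; this is not the libata code.
The point is that the flush path now changes the state under the same lock
the interrupt path holds while it checks the state and acts on it.

```c
#include <assert.h>
#include <pthread.h>

enum fake_hsm_state { ST_IDLE, ST_BUSY };   /* hypothetical, not libata's enum */

struct fake_port {
    pthread_mutex_t lock;        /* stand-in for the port lock */
    enum fake_hsm_state state;
    int moves;                   /* times the interrupt path advanced the state machine */
};

static void fake_port_init(struct fake_port *ap)
{
    pthread_mutex_init(&ap->lock, NULL);
    ap->state = ST_BUSY;
    ap->moves = 0;
}

/* Flush path: reset the state, but only while holding the lock. */
static void fake_flush(struct fake_port *ap)
{
    pthread_mutex_lock(&ap->lock);
    ap->state = ST_IDLE;
    pthread_mutex_unlock(&ap->lock);
}

/*
 * Interrupt path: the idle check and the state-machine step happen under
 * one lock hold, so the flush path cannot change the state in between.
 */
static int fake_intr(struct fake_port *ap)
{
    int handled = 0;

    pthread_mutex_lock(&ap->lock);
    if (ap->state != ST_IDLE) {
        ap->moves++;
        handled = 1;
    }
    pthread_mutex_unlock(&ap->lock);

    return handled;
}
```

Without the lock in fake_flush(), the state could flip to idle after the
check in fake_intr() but before the step, which is the window the crash
dump above appears to have hit.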
Dan Williams [Fri, 16 Jan 2015 23:13:02 +0000 (15:13 -0800)]
libata: allow sata_sil24 to opt-out of tag ordered submission
Ronny reports: https://bugzilla.kernel.org/show_bug.cgi?id=87101
"Since commit 8a4aeec8d "libata/ahci: accommodate tag ordered
controllers" the access to the harddisk on the first SATA-port is
failing on its first access. The access to the harddisk on the
second port is working normal.
When reverting the above commit, access to both harddisks is working
fine again."
Maintain tag ordered submission as the default, but allow sata_sil24 to
continue with the old behavior.
Cc: <stable@vger.kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Reported-by: Ronny Hegewald <Ronny.Hegewald@online.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Linus Walleij [Mon, 19 Jan 2015 10:27:19 +0000 (11:27 +0100)]
pinctrl: MAINTAINERS: add git tree reference
Reference my pinctrl GIT tree @kernel.org
Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Stephen Boyd [Mon, 19 Jan 2015 10:17:45 +0000 (11:17 +0100)]
pinctrl: qcom: Don't iterate past end of function array
Timur reports that this code crashes if nfunctions is 0. Fix the
loop iteration to only consider valid elements of the functions
array.
Reported-by: Timur Tabi <timur@codeaurora.org>
Cc: Pramod Gurav <pramod.gurav@smartplayin.com>
Cc: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Cc: Ivan T. Ivanov <iivanov@mm-sol.com>
Cc: Andy Gross <agross@codeaurora.org>
Fixes: 327455817a92 "pinctrl: qcom: Add support for reset for apq8064"
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
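A minimal sketch of a search loop that only touches valid elements, so an
array with zero entries is never read. The fake_function type, the helper
name, and the function names used in the example are assumptions for
illustration and are not taken from the qcom driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fake_function { const char *name; };   /* hypothetical */

/*
 * Return the index of `name` in funcs[0..nfunctions-1], or -1 if absent.
 * The loop condition is i < nfunctions, so with nfunctions == 0 the body
 * never runs and funcs is never dereferenced.
 */
static int find_function(const struct fake_function *funcs,
                         size_t nfunctions, const char *name)
{
    for (size_t i = 0; i < nfunctions; i++)
        if (strcmp(funcs[i].name, name) == 0)
            return (int)i;

    return -1;
}
```

A loop written with `i <= nfunctions` would read one element past the end,
and with an empty array that first read is already out of bounds.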