Javier Cardona [Thu, 10 Dec 2009 02:43:00 +0000 (18:43 -0800)]
mac80211: Fixed bug in mesh portal paths
commit
5d618cb81aeea19879975cd1f9a1e707694dfd7c upstream.
Paths to mesh portals were being timed out immediately after each use in
intermediate forwarding nodes. mppath->exp_time is set to the expiration time
so assigning it to jiffies was marking the path as expired.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Vasanthakumar Thiagarajan [Fri, 4 Dec 2009 12:11:34 +0000 (17:41 +0530)]
mac80211: Fix bug in computing crc over dynamic IEs in beacon
commit
1814077fd12a9cdf478c10076e9c42094e9d9250 upstream.
On a 32-bit machine, BIT() macro does not give the required
bit value if the bit is mroe than 31. In ieee802_11_parse_elems_crc(),
BIT() is suppossed to get the bit value more than 31 (42 (id of ERP_INFO_IE),
37 (CHANNEL_SWITCH_IE), (42), 32 (POWER_CONSTRAINT_IE), 45 (HT_CAP_IE),
61 (HT_INFO_IE)). As we do not get the required bit value for the above
IEs, crc over these IEs are never calculated, so any dynamic change in these
IEs after the association is not really handled on 32-bit platforms.
This patch fixes this issue.
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ian Jackson [Wed, 18 Nov 2009 10:08:11 +0000 (11:08 +0100)]
Serial: Do not read IIR in serial8250_start_tx when UART_BUG_TXEN
commit
68cb4f8e246bbbc649980be0628cae9265870a91 upstream.
Do not read IIR in serial8250_start_tx when UART_BUG_TXEN
Reading the IIR clears some oustanding interrupts so it is not safe.
Instead, simply transmit immediately if the buffer is empty without
regard to IIR.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Fri, 4 Dec 2009 16:06:57 +0000 (11:06 -0500)]
Driver core: fix race in dev_driver_string
commit
3589972e51fac1e02d0aaa576fa47f568cb94d40 upstream.
This patch (as1310) works around a race in dev_driver_string(). If
the device is unbound while the function is running, dev->driver might
become NULL after we test it and before we dereference it.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mathieu Desnoyers [Tue, 17 Nov 2009 22:40:26 +0000 (14:40 -0800)]
debugfs: fix create mutex racy fops and private data
commit
d3a3b0adad0865c12e39b712ca89efbd0a3a0dbc upstream.
Setting fops and private data outside of the mutex at debugfs file
creation introduces a race where the files can be opened with the wrong
file operations and private data. It is easy to trigger with a process
waiting on file creation notification.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sukadev Bhattiprolu [Wed, 18 Nov 2009 02:35:43 +0000 (18:35 -0800)]
devpts_get_tty() should validate inode
commit
edfacdd6f81119b9005615593f2cbd94b8c7e2d8 upstream.
devpts_get_tty() assumes that the inode passed in is associated with a valid
pty. But if the only reference to the pty is via a bind-mount, the inode
passed to devpts_get_tty() while valid, would refer to a pty that no longer
exists.
With a lot of debug effort, Grzegorz Nosek developed a small program (see
below) to reproduce a crash on recent kernels. This crash is a regression
introduced by the commit:
commit
527b3e4773628b30d03323a2cb5fb0d84441990f
Author: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Date: Mon Oct 13 10:43:08 2008 +0100
To fix, ensure that the dentry associated with the inode has not yet been
deleted/unhashed by devpts_pty_kill().
See also:
https://lists.linux-foundation.org/pipermail/containers/2009-July/019273.html
tty-bug.c:
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/mount.h>
#include <sys/signal.h>
#include <unistd.h>
#include <stdio.h>
#include <linux/fs.h>
void dummy(int sig)
{
}
static int child(void *unused)
{
int fd;
signal(SIGINT, dummy); signal(SIGHUP, dummy);
pause(); /* cheesy synchronisation to wait for /dev/pts/0 to appear */
mount("/dev/pts/0", "/dev/console", NULL, MS_BIND, NULL);
sleep(2);
fd = open("/dev/console", O_RDWR);
dup(0); dup(0);
write(1, "Hello world!\n", sizeof("Hello world!\n")-1);
return 0;
}
int main(void)
{
pid_t pid;
char *stack;
stack = malloc(16384);
pid = clone(child, stack+16384, CLONE_NEWNS|SIGCHLD, NULL);
open("/dev/ptmx", O_RDWR|O_NOCTTY|O_NONBLOCK);
unlockpt(fd); grantpt(fd);
sleep(2);
kill(pid, SIGHUP);
sleep(1);
return 0; /* exit before child opens /dev/console */
}
Reported-by: Grzegorz Nosek <root@localdomain.pl>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Andi Kleen [Tue, 8 Dec 2009 12:19:42 +0000 (13:19 +0100)]
futex: Take mmap_sem for get_user_pages in fault_in_user_writeable
commit
722d0172377a5697919b9f7e5beb95165b1dec4e upstream.
get_user_pages() must be called with mmap_sem held.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linuxfoundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Darren Hart <dvhltc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <
20091208121942.GA21298@basil.fritz.box>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
NeilBrown [Mon, 14 Dec 2009 01:49:46 +0000 (12:49 +1100)]
md/bitmap: protect against bitmap removal while being updated.
commit
aa5cbd103887011b4830355f88fb055f9ad2d556 upstream.
A write intent bitmap can be removed from an array while the
array is active.
When this happens, all IO is suspended and flushed before the
bitmap is removed.
However it is possible that bitmap_daemon_work is still running to
clear old bits from the bitmap. If it is, it can dereference the
bitmap after it has been freed.
So introduce a new mutex to protect bitmap_daemon_work and get it
before destroying a bitmap.
This is suitable for any current -stable kernel.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Trond Myklebust [Thu, 10 Dec 2009 14:05:55 +0000 (09:05 -0500)]
NFS: Fix nfs_migrate_page()
commit
190f38e5cedc910940b1da9015f00458c18f97b4 upstream.
The call to migrate_page() will cause the page->private field to be
cleared.
Also fix up the locking around the page->private transfer, so that we ensure
that calls to nfs_page_find_request() don't end up racing.
Finally, fix up a double free bug: nfs_unlock_request() already calls
nfs_release_request() for us...
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Tested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Roel Kluin [Tue, 8 Dec 2009 18:13:03 +0000 (13:13 -0500)]
SUNRPC: IS_ERR/PTR_ERR confusion
commit
480e3243df156e39eea6c91057e2ae612a6bbe19 upstream.
IS_ERR returns 1 or 0, PTR_ERR returns the error value.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Amerigo Wang [Tue, 15 Dec 2009 01:57:37 +0000 (17:57 -0800)]
hfs: fix a potential buffer overflow
commit
ec81aecb29668ad71f699f4e7b96ec46691895b6 upstream.
A specially-crafted Hierarchical File System (HFS) filesystem could cause
a buffer overflow to occur in a process's kernel stack during a memcpy()
call within the hfs_bnode_read() function (at fs/hfs/bnode.c:24). The
attacker can provide the source buffer and length, and the destination
buffer is a local variable of a fixed length. This local variable (passed
as "&entry" from fs/hfs/dir.c:112 and allocated on line 60) is stored in
the stack frame of hfs_bnode_read()'s caller, which is hfs_readdir().
Because the hfs_readdir() function executes upon any attempt to read a
directory on the filesystem, it gets called whenever a user attempts to
inspect any filesystem contents.
[amwang@redhat.com: modify this patch and fix coding style problems]
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Dave Anderson <anderson@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Igor Grinberg [Sun, 6 Dec 2009 13:45:43 +0000 (15:45 +0200)]
pxa/em-x270: fix usb hub power up/reset sequence
commit
1b82e4c32fba96d8805b1e2126ba5382e56fac32 upstream.
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Russ Dill [Tue, 15 Dec 2009 04:45:35 +0000 (21:45 -0700)]
USB: Close usb_find_interface race v3
commit
c2d284ee04ab6f6718de2ddcf1b43160e046c41d upstream.
USB drivers that create character devices call usb_register_dev in their
probe function. This associates the usb_interface device with that minor
number and creates the character device and announces it to the world.
However, the driver's probe function is called before the new
usb_interface is added to the driver's klist_devices.
This is a problem because userspace will respond to the character device
creation announcement by opening the character device. The driver's open
function will the call usb_find_interface to find the usb_interface
associated with that minor number. usb_find_interface will walk the
driver's list of devices and find the usb_interface with the matching
minor number.
Because the announcement happens before the usb_interface is added to the
driver's klist_devices, a race condition exists. A straightforward fix
is to walk the list of devices on usb_bus_type instead since the device
is added to that list before the announcement occurs.
bus_find_device calls get_device to bump the reference count on the found
device. It is arguable that the reference count should be dropped by the
caller of usb_find_interface instead of usb_find_interface, however,
the current users of usb_find_interface do not expect this.
The original version of this patch only matched against minor number
instead of driver and minor number. This version matches against both.
Signed-off-by: Russ Dill <Russ.Dill@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Mon, 7 Dec 2009 21:39:16 +0000 (16:39 -0500)]
USB: usb-storage: add BAD_SENSE flag
commit
a0bb108112a872c0b0c4b3ef4974f95fb75b155d upstream.
This patch (as1311) fixes a problem in usb-storage: Some devices are
pretty broken when it comes to reporting sense data. The information
they send back indicates that they have more than 18 bytes of sense
data available, but when the system asks for more than 18 they fail or
hang. The symptom is that probing fails with multiple resets.
The patch adds a new BAD_SENSE flag to indicate that usb-storage
should never ask for more than 18 bytes of sense data. The flag can
be set in an unusual_devs entry or via the "quirks=" module parameter,
and it is set automatically whenever a REQUEST SENSE command for more
than 18 bytes fails or times out.
An unusual_devs entry is added for the Agfa photo frame, which uses a
Prolific chip having this bug.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Daniel Kukula <daniel.kuku@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Andre Herms [Thu, 19 Nov 2009 17:14:49 +0000 (18:14 +0100)]
USB: usbtmc: repeat usb_bulk_msg until whole message is transfered
commit
ec412b92dbe3ea839716853eea058d1bcc5e6ca4 upstream.
usb_bulk_msg() transfers only bytes up to the maximum packet size.
It must be repeated by the usbtmc driver until all bytes of a TMC message
are transfered.
Without this patch, ETIMEDOUT is reported when writing TMC messages
larger than the maximum USB bulk size and the transfer remains incomplete.
The user will notice that the device hangs and must be reset by either closing
the application or pulling the plug.
Signed-off-by: Andre Herms <andre.herms@tec-venture.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Zhang Le [Tue, 17 Nov 2009 22:53:42 +0000 (14:53 -0800)]
USB: option.c: add support for D-Link DWM-162-U5
commit
54a8e144acad6506920f385f4ef2779664f05b21 upstream.
Add D-Link DWM-162-U5 device id 1e0e:ce16 into option driver. The device
has 4 interfaces, of which 1 is handled by storage and the other 3 by
option driver.
The device appears first as CD-only 05c6:2100 device and must be switched
to 1e0e:ce16 mode either by using "eject CD" or usb_modeswitch.
The MessageContent for usb_modeswitch.conf is:
"
55534243e0c26a85000000000000061b000000020000000000000000000000"
Signed-off-by: Zhang Le <r0bertz@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sergei Shtylyov [Mon, 16 Nov 2009 10:54:05 +0000 (16:24 +0530)]
USB: musb_gadget_ep0: fix unhandled endpoint 0 IRQs, again
commit
196f1b7a387546f425df2f1fad26772e3d513aea upstream.
Commit
a5073b52833e4df8e16c93dc4cbb7e0c558c74a2 (musb_gadget: fix
unhandled endpoint 0 IRQs) somehow missed its key change:
"The gadget EP0 code routinely ignores an interrupt at end of
the data phase because of musb_g_ep0_giveback() resetting the
state machine to "idle, waiting for SETUP" phase prematurely."
So, the majority of the cases of unhandled IRQs is still unfixed...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sarah Sharp [Tue, 1 Dec 2009 18:37:07 +0000 (10:37 -0800)]
USB: xhci: Add correct email and files to MAINTAINERS entry.
commit
36d0344c254a7b333272757f858c403ea3a2d29f upstream.
Add the xHCI driver files to its MAINTAINERS entry so that I'm Cc'd on
cleanup patches. Update the email address to one I actually use for
sending patches and responding to Linux mailing list emails.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Sun, 15 Nov 2009 20:31:37 +0000 (15:31 -0500)]
jbd2: don't wipe the journal on a failed journal checksum
commit
e6a47428de84e19fda52f21ab73fde2906c40d09 upstream.
If there is a failed journal checksum, don't reset the journal. This
allows for userspace programs to decide how to recover from this
situation. It may be that ignoring the journal checksum failure might
be a better way of recovering the file system. Once we add per-block
checksums, we can definitely do better. Until then, a system
administrator can try backing up the file system image (or taking a
snapshot) and and trying to determine experimentally whether ignoring
the checksum failure or aborting the journal replay results in less
data loss.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sebastian Andrzej Siewior [Sun, 29 Nov 2009 18:46:02 +0000 (19:46 +0100)]
UBI: flush wl before clearing update marker
commit
6afaf8a484cbbfd2ccf58a4e5396d1f280469789 upstream.
ubiupdatevol -t does the following:
- ubi_start_update()
- set_update_marker()
- for all LEBs ubi_eba_unmap_leb()
- clear_update_marker()
- ubi_wl_flush()
ubi_wl_flush() physically erases all PEB, once it returns all PEBs are
empty. clear_update_marker() has the update marker written after return.
If there is a power cut between the last two functions then the UBI
volume has no longer the "update" marker set and may have some valid
LEBs while some of them may be gone.
If that volume in question happens to be a UBIFS volume, then mount
will fail with
|UBIFS error (pid 1361): ubifs_read_node: bad node type (255 but expected 6)
|UBIFS error (pid 1361): ubifs_read_node: bad node at LEB 0:0
|Not a node, first 24 bytes:
|
00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
if there is at least one valid LEB and the wear-leveling worker managed
to clear LEB 0.
The patch waits for the wl worker to finish prior clearing the "update"
marker on flash. The two new LEB which are scheduled for erasing after
clear_update_marker() should not matter because they are only visible to
UBI.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alexey Dobriyan [Tue, 15 Dec 2009 01:57:34 +0000 (17:57 -0800)]
bsdacct: fix uid/gid misreporting
commit
4b731d50ff3df6b9141a6c12b088e8eb0109e83c upstream.
commit
d8e180dcd5bbbab9cd3ff2e779efcf70692ef541 "bsdacct: switch
credentials for writing to the accounting file" introduced credential
switching during final acct data collecting. However, uid/gid pair
continued to be collected from current which became credentials of who
created acct file, not who exits.
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14676
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: Juho K. Juopperi <jkj@kapsi.fi>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Roel Kluin [Fri, 20 Nov 2009 18:34:13 +0000 (15:34 -0300)]
V4L/DVB: Fix test in copy_reg_bits()
commit
c95a419a5604ec8a23cd73f61e9bb151e8cbe89b upstream.
The reg_pair2[j].reg was tested twice.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Hendrik Brueckner [Mon, 7 Dec 2009 11:44:42 +0000 (12:44 +0100)]
s390: clear high-order bits of registers after sam64
commit
cf87b7439ec81b9374e7772e44e9cb2eb9e57160 upstream.
When the kernel is IPLed without the CLEAR option and switches
to 64-bit, the high-order half of the registers might contain
random values. This can cause addressing exceptions and the
kernel enters an interrupt loop.
Initialize the high-order half of the general purpose registers
with zeros after switching to 64-bit mode.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sergei Shtylyov [Fri, 27 Nov 2009 18:29:02 +0000 (22:29 +0400)]
pata_hpt{37x|3x2n}: fix timing register masks (take 2)
commit
5600c70e576199a7552e1c0fff43f3fe16f5566e upstream.
These drivers inherited from the older 'hpt366' IDE driver the buggy timing
register masks in their set_piomode() metods. As a result, too low command
cycle active time is programmed for slow PIO modes. Quite fortunately, it's
later "fixed up" by the set_dmamode() methods which also "helpfully" reprogram
the command timings, usually to PIO mode 4; unfortunately, setting an UltraDMA
mode #N also reprograms already set PIO data timings, usually to MWDMA mode #
max(N, 2) timings...
However, the drivers added some breakage of their own too: the bit that they
set/clear to control the FIFO is sometimes wrong -- it's actually the MSB of
the command cycle setup time; also, setting it in DMA mode is wrong as this
bit is only for PIO actually and clearing it for PIO modes is not needed as
no mode in any timing table has it set...
Fix all this, inverting the masks while at it, like in the 'hpt366' and
'pata_hpt366' drivers; bump the drivers' versions, accounting for recent
patches that forgot to do it...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Jones [Tue, 10 Nov 2009 20:01:20 +0000 (15:01 -0500)]
x86: Fix typo in Intel CPU cache size descriptor
commit
e02e0e1a130b9ca37c5186d38ad4b3aaf58bb149 upstream.
I double-checked the datasheet. One of the existing
descriptors has a typo: it should be 2MB not 2038 KB.
Signed-off-by: Dave Jones <davej@redhat.com>
LKML-Reference: <
20091110200120.GA27090@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dave Jones [Tue, 10 Nov 2009 18:49:24 +0000 (13:49 -0500)]
x86: Add new Intel CPU cache size descriptors
commit
85160b92fbd35321104819283c91bfed2b553e3c upstream.
The latest rev of Intel doc AP-485 details new cache descriptors
that we don't yet support. 12MB, 18MB and 24MB 24-way assoc L3
caches.
Signed-off-by: Dave Jones <davej@redhat.com>
LKML-Reference: <
20091110184924.GA20337@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cliff Wickman [Fri, 11 Dec 2009 17:36:18 +0000 (11:36 -0600)]
x86: Fix duplicated UV BAU interrupt vector
commit
1d865fb728bd6bbcdfbd6ec1e2b8ade3b4805641 upstream.
Interrupt vector 0xec has been doubly defined in irq_vectors.h
It seems arbitrary whether LOCAL_PENDING_VECTOR or
UV_BAU_MESSAGE is the higher number. As long as they are
unique. If they are not unique we'll hit a BUG in
alloc_system_vector().
Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1NJ9Pe-0004P7-0Q@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cliff Wickman [Thu, 19 Nov 2009 23:12:43 +0000 (17:12 -0600)]
x86: SGI UV: Fix BAU initialization
commit
e38e2af1c57c3eb5211331a5b4fcaae0c4a2a918 upstream.
A memory mapped register that affects the SGI UV Broadcast
Assist Unit's interrupt handling may sometimes be unintialized.
Remove the condition on its initialization, as that condition
can be randomly satisfied by a hardware reset.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1NBGB9-0005nU-Dp@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jan Beulich [Tue, 8 Dec 2009 02:21:37 +0000 (11:21 +0900)]
x86/mce: Set up timer unconditionally
commit
bc09effabf0c5c6c7021e5ef9af15a23579b32a8 upstream.
mce_timer must be passed to setup_timer() in all cases, no
matter whether it is going to be actually used. Otherwise, when
the CPU gets brought down, its call to del_timer_sync() will
never return, as the timer won't have a base associated, and
hence lock_timer_base() will loop infinitely.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
LKML-Reference: <
4B1DB831.
2030801@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Hidetoshi Seto [Thu, 3 Dec 2009 02:33:08 +0000 (11:33 +0900)]
x86, mce: don't restart timer if disabled
commit
fe5ed91ddce85a0ed0e4f92c10b099873ef62167 upstream.
Even it is in error path unlikely taken, add_timer_on() at
CPU_DOWN_FAILED* needs to be skipped if mce_timer is disabled.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Gleixner [Sat, 28 Nov 2009 14:03:03 +0000 (15:03 +0100)]
x86: Use -maccumulate-outgoing-args for sane mcount prologues
commit
b8b7d791a8ff01d2380089279a69afa99115fb23 upstream.
commit
746357d (x86: Prevent GCC 4.4.x (pentium-mmx et al) function
prologue wreckage) uses -mtune=generic to work around the function
prologue problem with mcount on -march=pentium-mmx and others.
Jakub pointed out that we can use -maccumulate-outgoing-args instead
which is selected by -mtune=generic and prevents the problem without
losing the -march specific optimizations.
Pointed-out-by: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Gleixner [Fri, 20 Nov 2009 11:01:43 +0000 (12:01 +0100)]
x86: Prevent GCC 4.4.x (pentium-mmx et al) function prologue wreckage
commit
746357d6a526d6da9d89a2ec645b28406e959c2e upstream.
When the kernel is compiled with -pg for tracing GCC 4.4.x inserts
stack alignment of a function _before_ the mcount prologue if the
-march=pentium-mmx is set and -mtune=generic is not set. This breaks
the assumption of the function graph tracer which expects that the
mcount prologue
push %ebp
mov %esp, %ebp
is the first stack operation in a function because it needs to modify
the function return address on the stack to trap into the tracer
before returning to the real caller.
The generated code is:
push %edi
lea 0x8(%esp),%edi
and $0xfffffff0,%esp
pushl -0x4(%edi)
push %ebp
mov %esp,%ebp
so the tracer modifies the copy of the return address which is stored
after the stack alignment and therefor does not trap the return which
in turn breaks the call chain logic of the tracer and leads to a
kernel panic.
Aside of the fact that the generated code is horrible for no good
reason other -march -mtune options generate the expected:
push %ebp
mov %esp,%ebp
and $0xfffffff0,%esp
which does the same and keeps everything intact.
After some experimenting we found out that this problem is restricted
to gcc4.4.x and to the following -march settings:
i586, pentium, pentium-mmx, k6, k6-2, k6-3, winchip-c6, winchip2, c3,
geode
By adding -mtune=generic the code generator produces always the
expected code.
So forcing -mtune=generic when CONFIG_FUNCTION_GRAPH_TRACER=y is not
pretty, but at the moment the only way to prevent that the kernel
trips over gcc-shrooms induced code madness.
Most distro kernels have CONFIG_X86_GENERIC=y anyway which forces
-mtune=generic as well so it will not impact those.
References: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109
http://lkml.org/lkml/2009/11/19/17
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <alpine.LFD.2.00.
0911200206570.24119@localhost.localdomain>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
Cc: Jeff Law <law@redhat.com>
Cc: gcc@gcc.gnu.org
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Andrew Haley <aph@redhat.com>
Cc: Richard Guenther <richard.guenther@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Glauber Costa [Tue, 6 Oct 2009 17:24:50 +0000 (13:24 -0400)]
KVM: x86: include pvclock MSRs in msrs_to_save
commit
e3267cbbbfbcbe9c18833e89b10beabb1117cb55 upstream.
For a while now, we are issuing a rdmsr instruction to find out which
msrs in our save list are really supported by the underlying machine.
However, it fails to account for kvm-specific msrs, such as the pvclock
ones.
This patch moves then to the beginning of the list, and skip testing them.
Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Marcelo Tosatti [Sun, 18 Oct 2009 01:47:23 +0000 (22:47 -0300)]
KVM: fix irq_source_id size verification
commit
cd5a2685de4a642fd0bd763e8c19711ef08dbe27 upstream.
find_first_zero_bit works with bit numbers, not bytes.
Fixes
https://sourceforge.net/tracker/?func=detail&aid=
2847560&group_id=180599&atid=893831
Reported-by: "Xu, Jiajun" <jiajun.xu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Carsten Otte [Thu, 19 Nov 2009 13:21:16 +0000 (14:21 +0100)]
KVM: s390: Make psw available on all exits, not just a subset
commit
d7b0b5eb3000c6fb902f08c619fcd673a23d8fab upstream.
This patch moves s390 processor status word into the base kvm_run
struct and keeps it up-to date on all userspace exits.
The userspace ABI is broken by this, however there are no applications
in the wild using this. A capability check is provided so users can
verify the updated API exists.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Carsten Otte [Mon, 30 Nov 2009 16:14:41 +0000 (17:14 +0100)]
KVM: s390: Fix prefix register checking in arch/s390/kvm/sigp.c
commit
f50146bd7bdb75435638e60d4960edd9bcdf88b8 upstream.
This patch corrects the checking of the new address for the prefix register.
On s390, the prefix register is used to address the cpu's lowcore (address
0...8k). This check is supposed to verify that the memory is readable and
present.
copy_from_guest is a helper function, that can be used to read from guest
memory. It applies prefixing, adds the start address of the guest memory in
user, and then calls copy_from_user. Previous code was obviously broken for
two reasons:
- prefixing should not be applied here. The current prefix register is
going to be updated soon, and the address we're looking for will be
0..8k after we've updated the register
- we're adding the guest origin (gmsor) twice: once in subject code
and once in copy_from_guest
With kuli, we did not hit this problem because (a) we were lucky with
previous prefix register content, and (b) our guest memory was mmaped
very low into user address space.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Avi Kivity [Tue, 24 Nov 2009 13:20:15 +0000 (15:20 +0200)]
KVM: x86 emulator: limit instructions to 15 bytes
commit
eb3c79e64a70fb8f7473e30fa07e89c1ecc2c9bb upstream.
While we are never normally passed an instruction that exceeds 15 bytes,
smp games can cause us to attempt to interpret one, which will cause
large latencies in non-preempt hosts.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jaroslav Kysela [Wed, 9 Dec 2009 09:44:47 +0000 (10:44 +0100)]
ALSA: hda - Terradici HDA controllers does not support 64-bit mode
commit
396087eaead95fcb29eb36f1e59517aeb58c545e upstream.
Confirmed from vendor and tests in RedHat bugzilla #536782 .
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Takashi Iwai [Fri, 11 Dec 2009 11:51:05 +0000 (12:51 +0100)]
ALSA: hrtimer - Fix lock-up
commit
fcfdebe70759c74e2e701f69aaa7f0e5e32cf5a6 upstream.
The timer stop callback can be called from snd_timer_interrupt(), which
is called from the hrtimer callback. Since hrtimer_cancel() waits for
the callback completion, this eventually results in a lock-up.
This patch fixes the problem by just toggling a flag at stop callback
and call hrtimer_cancel() later.
Reported-and-tested-by: Wojtek Zabolotny <W.Zabolotny@elka.pw.edu.pl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Feng Tang [Thu, 3 Sep 2009 08:32:53 +0000 (16:32 +0800)]
hrtimer: Fix /proc/timer_list regression
commit
8629ea2eaba8ca0de2e38ce1b4a825e16255976e upstream.
commit
507e1231 (timer stats: Optimize by adding quick check to avoid
function calls) introduced a regression in /proc/timer_list.
/proc/timer_list shows now
#0: <
c27d46b0>, tick_sched_timer, S:01, <(null)>, /-1
instead of
#0: <
c27d46b0>, tick_sched_timer, S:01, hrtimer_start, swapper/0
Revert the hrtimer quick check for now. The optimization needs more
thought, but this is neither 2.6.32-rc7 nor stable material.
[ tglx: - Removed unrelated changes from the original patch
- Prevent unneccesary call to timer_stats_update_stats
- massaged the changelog ]
Signed-off-by: Feng Tang <feng.tang@intel.com>
LKML-Reference: <alpine.LFD.2.00.
0911181933540.24119@localhost.localdomain>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Luis R. Rodriguez [Wed, 25 Nov 2009 22:23:26 +0000 (17:23 -0500)]
ath5k: enable EEPROM checksum check
commit
512414b0bed0d376ac4d5ec1dd6f0b1a3551febc upstream.
Without this we have no gaurantee of the integrity of the
EEPROM and are likely to encounter a lot of bogus bug reports
due to actual issues on the EEPROM. With the EEPROM checksum
check in place we can easily rule those issues out.
If you run patch during a revert *you* have a card with a busted
EEPROM and only older kernel will support that concoction. This
patch is a trade off between not accepitng bogus EEPROMs and
avoiding bogus bug reports allowing developers to focus instead
on real concrete issues.
If stable keeps bogus bug reports because of a possibly busted EEPROM
feel free to apply this there too.
Tested on an AR5414
Cc: jirislaby@gmail.com
Cc: akpm@linux-foundation.org
Cc: rjw@sisk.pl
Cc: me@bobcopeland.com
Cc: david.quan@atheros.com
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Bob Copeland [Mon, 16 Nov 2009 13:30:29 +0000 (08:30 -0500)]
ath5k: allow setting txpower to 0
commit
2eb2fa67e5462a36e98172fb92c78bc405b3035f upstream.
As a holdover from earlier code when we used to set
the power limit to '0' after a reset to configure the
default transmit power, ath5k interprets txpower=0 as
12.5 dBm. Fix that by just passing 0 through.
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=14567
Reported-by: Daniel Folkers <daniel.folkers@task24.nl>
Tested-by: Daniel Folkers <daniel.folkers@task24.nl>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Michael Buesch [Mon, 23 Nov 2009 19:58:06 +0000 (20:58 +0100)]
ssb: Fix range check in sprom write
commit
e33761e6f23881de9f3ee77cc2204ab2e26f3d9a upstream.
The range check in the sprom image parser hex2sprom() is broken.
One sprom word is 4 hex characters.
This fixes the check and also adds much better sanity checks to the code.
We better make sure the image is OK by doing some sanity checks to avoid
bricking the device by accident.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mikael Pettersson [Thu, 3 Dec 2009 14:52:44 +0000 (15:52 +0100)]
x86, apic: Enable lapic nmi watchdog on AMD Family 11h
commit
7d1849aff6687a135a8da3a75e32a00e3137a5e2 upstream.
The x86 lapic nmi watchdog does not recognize AMD Family 11h,
resulting in:
NMI watchdog: CPU not supported
As far as I can see from available documentation (the BKDM),
family 11h looks identical to family 10h as far as the PMU
is concerned.
Extending the check to accept family 11h results in:
Testing NMI watchdog ... OK.
I've been running with this change on a Turion X2 Ultra ZM-82
laptop for a couple of weeks now without problems.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <19223.53436.931768.278021@pilspetsen.it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Leann Ogasawara [Fri, 4 Dec 2009 23:42:22 +0000 (15:42 -0800)]
x86: ASUS P4S800 reboot=bios quirk
commit
4832ddda2ec4df96ea1eed334ae2dbd65fc1f541 upstream.
Bug reporter noted their system with an ASUS P4S800 motherboard would
hang when rebooting unless reboot=b was specified. Their dmidecode
didn't contain descriptive System Information for Manufacturer or
Product Name, so I used their Base Board Information to create a
reboot quirk patch. The bug reporter confirmed this patch resolves
the reboot hang.
Handle 0x0001, DMI type 1, 25 bytes
System Information
Manufacturer: System Manufacturer
Product Name: System Name
Version: System Version
Serial Number: SYS-
1234567890
UUID:
E0BFCD8B-7948-D911-A953-
E486B4EEB67F
Wake-up Type: Power Switch
Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
Manufacturer: ASUSTeK Computer INC.
Product Name: P4S800
Version: REV 1.xx
Serial Number: xxxxxxxxxxx
BugLink: http://bugs.launchpad.net/bugs/366682
ASUS P4S800 will hang when rebooting unless reboot=b is specified.
Add a quirk to reboot through the bios.
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
LKML-Reference: <
1259972107.4629.275.camel@emiko>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Joe Perches [Tue, 10 Nov 2009 01:58:50 +0000 (17:58 -0800)]
x86: GART: pci-gart_64.c: Use correct length in strncmp
commit
41855b77547fa18d90ed6a5d322983d3fdab1959 upstream.
Signed-off-by: Joe Perches <joe@perches.com>
LKML-Reference: <
1257818330.12852.72.camel@Joe-Laptop.home>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Tejun Heo [Mon, 26 Oct 2009 14:41:46 +0000 (15:41 +0100)]
x86: Fix iommu=nodac parameter handling
commit
2ae8bb75db1f3de422eb5898f2a063c46c36dba8 upstream.
iommu=nodac should forbid dac instead of enabling it. Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Matteo Frigo <athena@fftw.org>
LKML-Reference: <
4AE5B52A.
4050408@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Darrick J. Wong [Wed, 2 Dec 2009 23:05:56 +0000 (15:05 -0800)]
x86, Calgary IOMMU quirk: Find nearest matching Calgary while walking up the PCI tree
commit
4528752f49c1f4025473d12bc5fa9181085c3f22 upstream.
On a multi-node x3950M2 system, there's a slight oddity in the
PCI device tree for all secondary nodes:
30:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
\-33:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
\-34:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
...as compared to the primary node:
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
\-01:00.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
03:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
\-04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
In both nodes, the LSI RAID controller hangs off a CalIOC2
device, but on the secondary nodes, the BIOS hides the VGA
device and substitutes the device tree ending with the disk
controller.
It would seem that Calgary devices don't necessarily appear at
the top of the PCI tree, which means that the current code to
find the Calgary IOMMU that goes with a particular device is
buggy.
Rather than walk all the way to the top of the PCI
device tree and try to match bus number with Calgary descriptor,
the code needs to examine each parent of the particular device;
if it encounters a Calgary with a matching bus number, simply
use that.
Otherwise, we BUG() when the bus number of the Calgary doesn't
match the bus number of whatever's at the top of the device tree.
Extra note: This patch appears to work correctly for the x3950
that came before the x3950 M2.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Jon D. Mason <jdmason@kudzu.us>
Cc: Corinna Schultz <coschult@us.ibm.com>
LKML-Reference: <
20091202230556.GG10295@tux1.beaverton.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Joerg Roedel [Mon, 23 Nov 2009 11:45:25 +0000 (12:45 +0100)]
x86/amd-iommu: un__init iommu_setup_msi
commit
9f800de38b05d84809e89f16671d636a140eede7 upstream.
This function may be called on the resume path and can not
be dropped after booting.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Joerg Roedel [Mon, 23 Nov 2009 11:50:00 +0000 (12:50 +0100)]
x86/amd-iommu: attach devices to pre-allocated domains early
commit
be831297716036de5b24308447ecb69f1706a846 upstream.
For some devices the ACPI table may define unity map
requirements which must me met when the IOMMU is enabled. So
we need to attach devices to their domains as early as
possible so that these mappings are in place when needed.
This patch assigns the domains right after they are
allocated. Otherwise this can result in I/O page faults
before a driver binds to a device and BIOS is still using
it.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mike Galbraith [Tue, 10 Nov 2009 02:50:02 +0000 (03:50 +0100)]
sched: Fix and clean up rate-limit newidle code
commit
eae0c9dfb534cb3449888b9601228efa6480fdb5 upstream.
Commit
1b9508f, "Rate-limit newidle" has been confirmed to fix
the netperf UDP loopback regression reported by Alex Shi.
This is a cleanup and a fix:
- moved to a more out of the way spot
- fix to ensure that balancing doesn't try to balance
runqueues which haven't gone online yet, which can
mess up CPU enumeration during boot.
Reported-by: Alex Shi <alex.shi@intel.com>
Reported-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <
1257821402.5648.17.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mike Galbraith [Wed, 4 Nov 2009 16:53:50 +0000 (17:53 +0100)]
sched: Rate-limit newidle
commit
1b9508f6831e10d53256825de8904caa22d1ca2c upstream.
Rate limit newidle to migration_cost. It's a win for all
stages of sysbench oltp tests.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mike Galbraith [Thu, 5 Nov 2009 09:57:46 +0000 (10:57 +0100)]
sched: Fix affinity logic in select_task_rq_fair()
commit
fd210738f6601d0fb462df9a2fe5a41896ff6a8f upstream.
Ingo Molnar reported:
[ 26.804000] BUG: using smp_processor_id() in preemptible [
00000000] code: events/1/10
[ 26.808000] caller is vmstat_update+0x26/0x70
[ 26.812000] Pid: 10, comm: events/1 Not tainted 2.6.32-rc5 #6887
[ 26.816000] Call Trace:
[ 26.820000] [<
c1924a24>] ? printk+0x28/0x3c
[ 26.824000] [<
c13258a0>] debug_smp_processor_id+0xf0/0x110
[ 26.824000] mount used greatest stack depth: 1464 bytes left
[ 26.828000] [<
c111d086>] vmstat_update+0x26/0x70
[ 26.832000] [<
c1086418>] worker_thread+0x188/0x310
[ 26.836000] [<
c10863b7>] ? worker_thread+0x127/0x310
[ 26.840000] [<
c108d310>] ? autoremove_wake_function+0x0/0x60
[ 26.844000] [<
c1086290>] ? worker_thread+0x0/0x310
[ 26.848000] [<
c108cf0c>] kthread+0x7c/0x90
[ 26.852000] [<
c108ce90>] ? kthread+0x0/0x90
[ 26.856000] [<
c100c0a7>] kernel_thread_helper+0x7/0x10
[ 26.860000] BUG: using smp_processor_id() in preemptible [
00000000] code: events/1/10
[ 26.864000] caller is vmstat_update+0x3c/0x70
Because this commit:
a1f84a3: sched: Check for an idle shared cache in select_task_rq_fair()
broke ->cpus_allowed.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: arjan@infradead.org
LKML-Reference: <
1257415066.12867.1.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mike Galbraith [Tue, 27 Oct 2009 14:35:38 +0000 (15:35 +0100)]
sched: Check for an idle shared cache in select_task_rq_fair()
commit
a1f84a3ab8e002159498814eaa7e48c33752b04b upstream.
When waking affine, check for an idle shared cache, and if
found, wake to that CPU/sibling instead of the waker's CPU.
This improves pgsql+oltp ramp up by roughly 8%. Possibly more
for other loads, depending on overlap. The trade-off is a
roughly 1% peak downturn if tasks are truly synchronous.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <
1256654138.17752.7.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Rafael J. Wysocki [Thu, 3 Dec 2009 19:21:21 +0000 (20:21 +0100)]
PM / Runtime: Fix lockdep warning in __pm_runtime_set_status()
commit
bab636b921017f0db6e0c2979438f50b898a9808 upstream.
Lockdep complains about taking the parent lock in
__pm_runtime_set_status(), so mark it as nested.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kristian Høgsberg [Tue, 1 Dec 2009 20:05:01 +0000 (15:05 -0500)]
perf: Don't free perf_mmap_data until work has been done
commit
ec70ccd806111ba3caf596def91a8580138b12db upstream.
In the CONFIG_PERF_USE_VMALLOC case, perf_mmap_data_free() only
schedules the cleanup of the perf_mmap_data struct. In that
case we have to wait until the work has been done before we free
data.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
1259697901-1747-1-git-send-email-krh@bitplanet.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Xiao Guangrong [Tue, 1 Dec 2009 09:30:08 +0000 (17:30 +0800)]
perf_event: Initialize data.period in perf_swevent_hrtimer()
commit
59d069eb5ae9b033ed1c124c92e1532c4a958991 upstream.
In current code in perf_swevent_hrtimer(), data.period is not
initialized, The result is obvious wrong:
# ./perf record -f -e cpu-clock make
# ./perf report
# Samples: 1740
#
# Overhead Command ......
# ........ ........ ..........................................
#
1025422183050275328.00% sh libc-2.9.90.so ...
1025422183050275328.00% perl libperl.so ...
1025422168240043264.00% perl [kernel] ...
1025422030011210752.00% perl [kernel] ...
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <
4B14E220.
2050107@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Arjan van de Ven [Sat, 14 Nov 2009 05:47:33 +0000 (21:47 -0800)]
perf_event: Fix invalid type in ioctl definition
commit
4c49b12853fbb5eff4849b7b6a1e895776f027a1 upstream.
u64 is invalid in userspace headers, including ioctl
definitions; use __u64 instead
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <
20091113214733.
7cd76be9@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Paul E. McKenney [Tue, 10 Nov 2009 21:37:19 +0000 (13:37 -0800)]
rcu: Remove inline from forward-referenced functions
commit
dbe01350fa8ce0c11948ab7d6be71a4d901be151 upstream.
Some variants of gcc are reputed to dislike forward references
to functions declared "inline". Remove the "inline" keyword
from such functions.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: Benjamin Gilbert <bgilbert@cs.cmu.edu>
LKML-Reference: <
12578890422402-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Paul E. McKenney [Mon, 2 Nov 2009 21:52:29 +0000 (13:52 -0800)]
rcu: Fix note_new_gpnum() uses of ->gpnum
commit
9160306e6f5b68bb64630c9031c517ca1cf463db upstream.
Impose a clear locking design on the note_new_gpnum()
function's use of the ->gpnum counter. This is done by updating
rdp->gpnum only from the corresponding leaf rcu_node structure's
rnp->gpnum field, and even then only under the protection of
that same rcu_node structure's ->lock field. Performance and
scalability are maintained using a form of double-checked
locking, and excessive spinning is avoided by use of the
spin_trylock() function. The use of spin_trylock() is safe due
to the fact that CPUs who fail to acquire this lock will try
again later. The hierarchical nature of the rcu_node data
structure limits contention (which could be limited further if
need be using the RCU_FANOUT kernel parameter).
Without this patch, obscure but quite possible races could
result in a quiescent state that occurred during one grace
period to be accounted to the following grace period, causing
this following grace period to end prematurely. Not good!
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <
12571987492350-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Paul E. McKenney [Mon, 2 Nov 2009 21:52:28 +0000 (13:52 -0800)]
rcu: Fix synchronization for rcu_process_gp_end() uses of ->completed counter
commit
d09b62dfa336447c52a5ec9bb88adbc479b0f3b8 upstream.
Impose a clear locking design on the rcu_process_gp_end()
function's use of the ->completed counter. This is done by
creating a ->completed field in the rcu_node structure, which
can safely be accessed under the protection of that structure's
lock. Performance and scalability are maintained by using a
form of double-checked locking, so that rcu_process_gp_end()
only acquires the leaf rcu_node structure's ->lock if a grace
period has recently ended.
This fix reduces rcutorture failure rate by at least two orders
of magnitude under heavy stress with force_quiescent_state()
being invoked artificially often. Without this fix,
unsynchronized access to the ->completed field can cause
rcu_process_gp_end() to advance callbacks whose grace period has
not yet expired. (Bad idea!)
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <
12571987494069-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Paul E. McKenney [Mon, 2 Nov 2009 21:52:27 +0000 (13:52 -0800)]
rcu: Prepare for synchronization fixes: clean up for non-NO_HZ handling of ->completed counter
commit
281d150c5f8892f158747594ab49ce2823fd8b8c upstream.
Impose a clear locking design on non-NO_HZ handling of the
->completed counter. This increases the distance between the
RCU and the CPU-hotplug mechanisms.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <
12571987491353-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jay Fenlason [Fri, 11 Dec 2009 19:23:58 +0000 (14:23 -0500)]
firewire: ohci: handle receive packets with a data length of zero
commit
8c0c0cc2d9f4c523fde04bdfe41e4380dec8ee54 upstream.
Queueing to receive an ISO packet with a payload length of zero
silently does nothing in dualbuffer mode, and crashes the kernel in
packet-per-buffer mode. Return an error in dualbuffer mode, because
the DMA controller won't let us do what we want, and work correctly in
packet-per-buffer mode.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
zhao.ming9@zte.com.cn [Mon, 7 Dec 2009 03:36:34 +0000 (11:36 +0800)]
USB: option: add pid for ZTE
commit
8d87cacda7c8db5c131bfcaaa1d90bfe918c2ebc upstream.
This patch adds ZTE modem devices.
Signed-off-by: Ming Zhao <zhao.ming9@zte.com.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Mon, 7 Dec 2009 21:47:43 +0000 (16:47 -0500)]
USB: usb-storage: fix bug in fill_inquiry
commit
f3f6faa9edf67c1018270793e0547b0f81abb47e upstream.
This patch (as1312) fixes a minor bug in usb-storage. The
fill_inquiry() routine neglects to pre-load the inquiry data buffer
with spaces. As a result, if the vendor name is shorter than 8
characters or the product name is shorter than 16, the remainder will
be filled with garbage.
The patch also removes some unnecessary calls to strlen().
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Greg Kroah-Hartman [Mon, 14 Dec 2009 17:47:25 +0000 (09:47 -0800)]
Linux 2.6.32.1
Theodore Ts'o [Thu, 10 Dec 2009 02:30:02 +0000 (21:30 -0500)]
ext4: Fix potential fiemap deadlock (mmap_sem vs. i_data_sem)
(cherry picked from commit
fab3a549e204172236779f502eccb4f9bf0dc87d)
Fix the following potential circular locking dependency between
mm->mmap_sem and ei->i_data_sem:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-04115-gec044c5 #37
-------------------------------------------------------
ureadahead/1855 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<
ffffffff81107224>] might_fault+0x5c/0xac
but task is already holding lock:
(&ei->i_data_sem){++++..}, at: [<
ffffffff811be1fd>] ext4_fiemap+0x11b/0x159
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&ei->i_data_sem){++++..}:
[<
ffffffff81099bfa>] __lock_acquire+0xb67/0xd0f
[<
ffffffff81099e7e>] lock_acquire+0xdc/0x102
[<
ffffffff81516633>] down_read+0x51/0x84
[<
ffffffff811a2414>] ext4_get_blocks+0x50/0x2a5
[<
ffffffff811a3453>] ext4_get_block+0xab/0xef
[<
ffffffff81154f39>] do_mpage_readpage+0x198/0x48d
[<
ffffffff81155360>] mpage_readpages+0xd0/0x114
[<
ffffffff811a104b>] ext4_readpages+0x1d/0x1f
[<
ffffffff810f8644>] __do_page_cache_readahead+0x12f/0x1bc
[<
ffffffff810f86f2>] ra_submit+0x21/0x25
[<
ffffffff810f0cfd>] filemap_fault+0x19f/0x32c
[<
ffffffff81107b97>] __do_fault+0x55/0x3a2
[<
ffffffff81109db0>] handle_mm_fault+0x327/0x734
[<
ffffffff8151aaa9>] do_page_fault+0x292/0x2aa
[<
ffffffff81518205>] page_fault+0x25/0x30
[<
ffffffff812a34d8>] clear_user+0x38/0x3c
[<
ffffffff81167e16>] padzero+0x20/0x31
[<
ffffffff81168b47>] load_elf_binary+0x8bc/0x17ed
[<
ffffffff81130e95>] search_binary_handler+0xc2/0x259
[<
ffffffff81166d64>] load_script+0x1b8/0x1cc
[<
ffffffff81130e95>] search_binary_handler+0xc2/0x259
[<
ffffffff8113255f>] do_execve+0x1ce/0x2cf
[<
ffffffff81027494>] sys_execve+0x43/0x5a
[<
ffffffff8102918a>] stub_execve+0x6a/0xc0
-> #0 (&mm->mmap_sem){++++++}:
[<
ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f
[<
ffffffff81099e7e>] lock_acquire+0xdc/0x102
[<
ffffffff81107251>] might_fault+0x89/0xac
[<
ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda
[<
ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157
[<
ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1
[<
ffffffff811be21e>] ext4_fiemap+0x13c/0x159
[<
ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6
[<
ffffffff811392ca>] sys_ioctl+0x56/0x79
[<
ffffffff81028cb2>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
1 lock held by ureadahead/1855:
#0: (&ei->i_data_sem){++++..}, at: [<
ffffffff811be1fd>] ext4_fiemap+0x11b/0x159
stack backtrace:
Pid: 1855, comm: ureadahead Not tainted
2.6.32-04115-gec044c5 #37
Call Trace:
[<
ffffffff81098c70>] print_circular_bug+0xa8/0xb7
[<
ffffffff81099aa4>] __lock_acquire+0xa11/0xd0f
[<
ffffffff8102f229>] ? sched_clock+0x9/0xd
[<
ffffffff81099e7e>] lock_acquire+0xdc/0x102
[<
ffffffff81107224>] ? might_fault+0x5c/0xac
[<
ffffffff81107251>] might_fault+0x89/0xac
[<
ffffffff81107224>] ? might_fault+0x5c/0xac
[<
ffffffff81124b44>] ? __kmalloc+0x13b/0x18c
[<
ffffffff81139382>] fiemap_fill_next_extent+0x95/0xda
[<
ffffffff811bcb43>] ext4_ext_fiemap_cb+0x138/0x157
[<
ffffffff811bca0b>] ? ext4_ext_fiemap_cb+0x0/0x157
[<
ffffffff811be069>] ext4_ext_walk_space+0x178/0x1f1
[<
ffffffff811be21e>] ext4_fiemap+0x13c/0x159
[<
ffffffff81107224>] ? might_fault+0x5c/0xac
[<
ffffffff811390e6>] do_vfs_ioctl+0x348/0x4d6
[<
ffffffff8129f6d0>] ? __up_read+0x8d/0x95
[<
ffffffff81517fb5>] ? retint_swapgs+0x13/0x1b
[<
ffffffff811392ca>] sys_ioctl+0x56/0x79
[<
ffffffff81028cb2>] system_call_fastpath+0x16/0x1b
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Akira Fujita [Mon, 7 Dec 2009 04:38:31 +0000 (23:38 -0500)]
ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT
(cherry picked from commit
4a58579b9e4e2a35d57e6c9c8483e52f6f1b7fd6)
This patch fixes three problems in the handling of the
EXT4_IOC_MOVE_EXT ioctl:
1. In current EXT4_IOC_MOVE_EXT, there are read access mode checks for
original and donor files, but they allow the illegal write access to
donor file, since donor file is overwritten by original file data. To
fix this problem, change access mode checks of original (r->r/w) and
donor (r->w) files.
2. Disallow the use of donor files that have a setuid or setgid bits.
3. Call mnt_want_write() and mnt_drop_write() before and after
ext4_move_extents() calling to get write access to a mount.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jan Kara [Wed, 9 Dec 2009 04:51:10 +0000 (23:51 -0500)]
ext4: Wait for proper transaction commit on fsync
(cherry picked from commit
b436b9bef84de6893e86346d8fbf7104bc520645)
We cannot rely on buffer dirty bits during fsync because pdflush can come
before fsync is called and clear dirty bits without forcing a transaction
commit. What we do is that we track which transaction has last changed
the inode and which transaction last changed allocation and force it to
disk on fsync.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dmitry Monakhov [Wed, 9 Dec 2009 03:42:28 +0000 (22:42 -0500)]
ext4: fix incorrect block reservation on quota transfer.
(cherry picked from commit
194074acacebc169ded90a4657193f5180015051)
Inside ->setattr() call both ATTR_UID and ATTR_GID may be valid
This means that we may end-up with transferring all quotas. Add
we have to reserve QUOTA_DEL_BLOCKS for all quotas, as we do in
case of QUOTA_INIT_BLOCKS.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dmitry Monakhov [Wed, 9 Dec 2009 03:42:15 +0000 (22:42 -0500)]
ext4: quota macros cleanup
(cherry picked from commit
5aca07eb7d8f14d90c740834d15ca15277f4820c)
Currently all quota block reservation macros contains hard-coded "2"
aka MAXQUOTAS value. This is no good because in some places it is not
obvious to understand what does this digit represent. Let's introduce
new macro with self descriptive name.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Acked-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dmitry Monakhov [Wed, 9 Dec 2009 03:41:52 +0000 (22:41 -0500)]
ext4: ext4_get_reserved_space() must return bytes instead of blocks
(cherry picked from commit
8aa6790f876e81f5a2211fe1711a5fe3fe2d7b20)
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Curt Wohlgemuth [Wed, 9 Dec 2009 03:18:25 +0000 (22:18 -0500)]
ext4: remove blocks from inode prealloc list on failure
(cherry picked from commit
b844167edc7fcafda9623955c05e4c1b3c32ebc7)
This fixes a leak of blocks in an inode prealloc list if device failures
cause ext4_mb_mark_diskspace_used() to fail.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Josef Bacik [Wed, 9 Dec 2009 02:48:58 +0000 (21:48 -0500)]
ext4: wait for log to commit when umounting
(cherry picked from commit
d4edac314e9ad0b21ba20ba8bc61b61f186f79e1)
There is a potential race when a transaction is committing right when
the file system is being umounting. This could reduce in a race
because EXT4_SB(sb)->s_group_info could be freed in ext4_put_super
before the commit code calls a callback so the mballoc code can
release freed blocks in the transaction, resulting in a panic trying
to access the freed s_group_info.
The fix is to wait for the transaction to finish committing before we
shutdown the multiblock allocator.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jan Kara [Wed, 9 Dec 2009 02:24:33 +0000 (21:24 -0500)]
ext4: Avoid data / filesystem corruption when write fails to copy data
(cherry picked from commit
b9a4207d5e911b938f73079a83cc2ae10524ec7f)
When ext4_write_begin fails after allocating some blocks or
generic_perform_write fails to copy data to write, we truncate blocks
already instantiated beyond i_size. Although these blocks were never
inside i_size, we have to truncate the pagecache of these blocks so
that corresponding buffers get unmapped. Otherwise subsequent
__block_prepare_write (called because we are retrying the write) will
find the buffers mapped, not call ->get_block, and thus the page will
be backed by already freed blocks leading to filesystem and data
corruption.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Roel Kluin [Mon, 7 Dec 2009 15:38:16 +0000 (10:38 -0500)]
ext4: Return the PTR_ERR of the correct pointer in setup_new_group_blocks()
(cherry picked from commit
c09eef305dd43846360944ad072f051f964fa383)
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Tue, 1 Dec 2009 14:04:42 +0000 (09:04 -0500)]
jbd2: Add ENOMEM checking in and for jbd2_journal_write_metadata_buffer()
(cherry picked from commit
e6ec116b67f46e0e7808276476554727b2e6240b)
OOM happens.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Akira Fujita [Tue, 24 Nov 2009 15:31:56 +0000 (10:31 -0500)]
ext4: move_extent_per_page() cleanup
(cherry picked from commit
ac48b0a1d068887141581bea8285de5fcab182b0)
Integrate duplicate lines (acquire/release semaphore and invalidate
extent cache in move_extent_per_page()) into mext_replace_branches(),
to reduce source and object code size.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kazuya Mio [Tue, 24 Nov 2009 15:28:48 +0000 (10:28 -0500)]
ext4: initialize moved_len before calling ext4_move_extents()
(cherry picked from commit
446aaa6e7e993b38a6f21c6acfa68f3f1af3dbe3)
The move_extent.moved_len is used to pass back the number of exchanged
blocks count to user space. Currently the caller must clear this
field; but we spend more code space checking for this requirement than
simply zeroing the field ourselves, so let's just make life easier for
everyone all around.
Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Akira Fujita [Tue, 24 Nov 2009 15:19:57 +0000 (10:19 -0500)]
ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXT
(cherry picked from commit
94d7c16cbbbd0e03841fcf272bcaf0620ad39618)
At the beginning of ext4_move_extent(), we call
ext4_discard_preallocations() to discard inode PAs of orig and donor
inodes. But in the following case, blocks can be double freed, so
move ext4_discard_preallocations() to the end of ext4_move_extents().
1. Discard inode PAs of orig and donor inodes with
ext4_discard_preallocations() in ext4_move_extents().
orig : [ DATA1 ]
donor: [ DATA2 ]
2. While data blocks are exchanging between orig and donor inodes, new
inode PAs is created to orig by other process's block allocation.
(Since there are semaphore gaps in ext4_move_extents().) And new
inode PAs is used partially (2-1).
2-1 Create new inode PAs to orig inode
orig : [ DATA1 | used PA1 | free PA1 ]
donor: [ DATA2 ]
3. Donor inode which has old orig inode's blocks is deleted after
EXT4_IOC_MOVE_EXT finished (3-1, 3-2). So the block bitmap
corresponds to old orig inode's blocks are freed.
3-1 After EXT4_IOC_MOVE_EXT finished
orig : [ DATA2 | free PA1 ]
donor: [ DATA1 | used PA1 ]
3-2 Delete donor inode
orig : [ DATA2 | free PA1 ]
donor: [ FREE SPACE(DATA1) | FREE SPACE(used PA1) ]
4. The double-free of blocks is occurred, when close() is called to
orig inode. Because ext4_discard_preallocations() for orig inode
frees used PA1 and free PA1, though used PA1 is already freed in 3.
4-1 Double-free of blocks is occurred
orig : [ DATA2 | FREE SPACE(free PA1) ]
donor: [ FREE SPACE(DATA1) | DOUBLE FREE(used PA1) ]
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Eric Sandeen [Thu, 19 Nov 2009 19:28:50 +0000 (14:28 -0500)]
ext4: make "norecovery" an alias for "noload"
(cherry picked from commit
e3bb52ae2bb9573e84c17b8e3560378d13a5c798)
Users on the linux-ext4 list recently complained about differences
across filesystems w.r.t. how to mount without a journal replay.
In the discussion it was noted that xfs's "norecovery" option is
perhaps more descriptively accurate than "noload," so let's make
that an alias for ext4.
Also show this status in /proc/mounts
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Eric Sandeen [Thu, 19 Nov 2009 19:25:42 +0000 (14:25 -0500)]
ext4: make trim/discard optional (and off by default)
(cherry picked from commit
5328e635315734d42080de9a5a1ee87bf4cae0a4)
It is anticipated that when sb_issue_discard starts doing
real work on trim-capable devices, we may see issues. Make
this mount-time optional, and default it to off until we know
that things are working out OK.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Jan Kara [Mon, 23 Nov 2009 12:24:48 +0000 (07:24 -0500)]
ext4: fix error handling in ext4_ind_get_blocks()
(cherry picked from commit
2bba702d4f88d7b010ec37e2527b552588404ae7)
When an error happened in ext4_splice_branch we failed to notice that
in ext4_ind_get_blocks and mapped the buffer anyway. Fix the problem
by checking for error properly.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Mon, 23 Nov 2009 12:24:57 +0000 (07:24 -0500)]
ext4: avoid issuing unnecessary barriers
(cherry picked from commit
6b17d902fdd241adfa4ce780df20547b28bf5801)
We don't to issue an I/O barrier on an error or if we force commit
because we are doing data journaling.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Sun, 15 Nov 2009 20:29:56 +0000 (15:29 -0500)]
ext4: fix block validity checks so they work correctly with meta_bg
(cherry picked from commit
1032988c71f3f85483b2b4319684d1205a704c02)
The block validity checks used by ext4_data_block_valid() wasn't
correctly written to check file systems with the meta_bg feature. Fix
this.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Mon, 23 Nov 2009 12:24:38 +0000 (07:24 -0500)]
ext4: fix uninit block bitmap initialization when s_meta_first_bg is non-zero
(cherry picked from commit
8dadb198cb70ef811916668fe67eeec82e8858dd)
The number of old-style block group descriptor blocks is
s_meta_first_bg when the meta_bg feature flag is set.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Mon, 23 Nov 2009 12:24:52 +0000 (07:24 -0500)]
ext4: don't update the superblock in ext4_statfs()
(cherry picked from commit
3f8fb9490efbd300887470a2a880a64e04dcc3f5)
commit
a71ce8c6c9bf269b192f352ea555217815cf027e updated ext4_statfs()
to update the on-disk superblock counters, but modified this buffer
directly without any journaling of the change. This is one of the
accesses that was causing the crc errors in journal replay as seen in
kernel.org bugzilla #14354.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Eric Sandeen [Sun, 15 Nov 2009 20:30:52 +0000 (15:30 -0500)]
ext4: journal all modifications in ext4_xattr_set_handle
(cherry picked from commit
86ebfd08a1930ccedb8eac0aeb1ed4b8b6a41dbc)
ext4_xattr_set_handle() was zeroing out an inode outside
of journaling constraints; this is one of the accesses that
was causing the crc errors in journal replay as seen in
kernel.org bugzilla #14354.
Reviewed-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Julia Lawall [Sun, 15 Nov 2009 20:30:58 +0000 (15:30 -0500)]
ext4: fix i_flags access in ext4_da_writepages_trans_blocks()
(cherry picked from commit
30c6e07a92ea4cb87160d32ffa9bce172576ae4c)
We need to be testing the i_flags field in the ext4 specific portion
of the inode, instead of the (confusingly aliased) i_flags field in
the generic struct inode.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Mon, 23 Nov 2009 12:17:34 +0000 (07:17 -0500)]
ext4: make sure directory and symlink blocks are revoked
(cherry picked from commit
50689696867d95b38d9c7be640a311494a04fb86)
When an inode gets unlinked, the functions ext4_clear_blocks() and
ext4_remove_blocks() call ext4_forget() for all the buffer heads
corresponding to the deleted inode's data blocks. If the inode is a
directory or a symlink, the is_metadata parameter must be non-zero so
ext4_forget() will revoke them via jbd2_journal_revoke(). Otherwise,
if these blocks are reused for a data file, and the system crashes
before a journal checkpoint, the journal replay could end up
corrupting these data blocks.
Thanks to Curt Wohlgemuth for pointing out potential problems in this
area.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Sat, 14 Nov 2009 13:19:05 +0000 (08:19 -0500)]
ext4: plug a buffer_head leak in an error path of ext4_iget()
(cherry picked from commit
567f3e9a70d71e5c9be03701b8578be77857293b)
One of the invalid error paths in ext4_iget() forgot to brelse() the
inode buffer head. Fix it by adding a brelse() in the common error
return path, which also simplifies function.
Thanks to Andi Kleen <ak@linux.intel.com> reporting the problem.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Akira Fujita [Mon, 23 Nov 2009 12:24:41 +0000 (07:24 -0500)]
ext4: fix possible recursive locking warning in EXT4_IOC_MOVE_EXT
(cherry picked from commit
49bd22bc4d603a2a4fc2a6a60e156cbea52eb494)
If CONFIG_PROVE_LOCKING is enabled, the double_down_write_data_sem()
will trigger a false-positive warning of a recursive lock. Since we
take i_data_sem for the two inodes ordered by their inode numbers,
this isn't a problem. Use of down_write_nested() will notify the lock
dependency checker machinery that there is no problem here.
This problem was reported by Brian Rogers:
http://marc.info/?l=linux-ext4&m=
125115356928011&w=1
Reported-by: Brian Rogers <brian@xyzw.org>
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Akira Fujita [Mon, 23 Nov 2009 12:24:43 +0000 (07:24 -0500)]
ext4: fix lock order problem in ext4_move_extents()
(cherry picked from commit
fc04cb49a898c372a22b21fffc47f299d8710801)
ext4_move_extents() checks the logical block contiguousness
of original file with ext4_find_extent() and mext_next_extent().
Therefore the extent which ext4_ext_path structure indicates
must not be changed between above functions.
But in current implementation, there is no i_data_sem protection
between ext4_ext_find_extent() and mext_next_extent(). So the extent
which ext4_ext_path structure indicates may be overwritten by
delalloc. As a result, ext4_move_extents() will exchange wrong blocks
between original and donor files. I change the place where
acquire/release i_data_sem to solve this problem.
Moreover, I changed move_extent_per_page() to start transaction first,
and then acquire i_data_sem. Without this change, there is a
possibility of the deadlock between mmap() and ext4_move_extents():
* NOTE: "A", "B" and "C" mean different processes
A-1: ext4_ext_move_extents() acquires i_data_sem of two inodes.
B: do_page_fault() starts the transaction (T),
and then tries to acquire i_data_sem.
But process "A" is already holding it, so it is kept waiting.
C: While "A" and "B" running, kjournald2 tries to commit transaction (T)
but it is under updating, so kjournald2 waits for it.
A-2: Call ext4_journal_start with holding i_data_sem,
but transaction (T) is locked.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Akira Fujita [Mon, 23 Nov 2009 12:25:48 +0000 (07:25 -0500)]
ext4: fix the returned block count if EXT4_IOC_MOVE_EXT fails
(cherry picked from commit
f868a48d06f8886cb0367568a12367fa4f21ea0d)
If the EXT4_IOC_MOVE_EXT ioctl fails, the number of blocks that were
exchanged before the failure should be returned to the userspace
caller. Unfortunately, currently if the block size is not the same as
the page size, the returned block count that is returned is the
page-aligned block count instead of the actual block count. This
commit addresses this bug.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Mon, 23 Nov 2009 12:24:46 +0000 (07:24 -0500)]
ext4: avoid divide by zero when trying to mount a corrupted file system
(cherry picked from commit
503358ae01b70ce6909d19dd01287093f6b6271c)
If s_log_groups_per_flex is greater than 31, then groups_per_flex will
will overflow and cause a divide by zero error. This can cause kernel
BUG if such a file system is mounted.
Thanks to Nageswara R Sastry for analyzing the failure and providing
an initial patch.
http://bugzilla.kernel.org/show_bug.cgi?id=14287
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Theodore Ts'o [Mon, 23 Nov 2009 12:25:49 +0000 (07:25 -0500)]
ext4: fix potential buffer head leak when add_dirent_to_buf() returns ENOSPC
(cherry picked from commit
2de770a406b06dfc619faabbf5d85c835ed3f2e1)
Previously add_dirent_to_buf() did not free its passed-in buffer head
in the case of ENOSPC, since in some cases the caller still needed it.
However, this led to potential buffer head leaks since not all callers
dealt with this correctly. Fix this by making simplifying the freeing
convention; now add_dirent_to_buf() *never* frees the passed-in buffer
head, and leaves that to the responsibility of its caller. This makes
things cleaner and easier to prove that the code is neither leaking
buffer heads or calling brelse() one time too many.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Yang, Bo [Tue, 6 Oct 2009 20:52:20 +0000 (14:52 -0600)]
SCSI: megaraid_sas: fix 64 bit sense pointer truncation
commit
7b2519afa1abd1b9f63aa1e90879307842422dae upstream.
The current sense pointer is cast to a u32 pointer, which can truncate
on 64 bits. Fix by using unsigned long instead.
Signed-off-by Bo Yang<bo.yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Martin Michlmayr [Mon, 16 Nov 2009 18:49:25 +0000 (20:49 +0200)]
SCSI: osd_protocol.h: Add missing #include
commit
0899638688f223fd9e9fee60d662665e11693d12 upstream.
include/scsi/osd_protocol.h uses ALIGN() without an #include
<linux/kernel.h>, leading to:
| include/scsi/osd_protocol.h:362: error: implicit declaration of function 'ALIGN'
Signed-off-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
James Bottomley [Thu, 5 Nov 2009 19:33:12 +0000 (13:33 -0600)]
SCSI: scsi_lib_dma: fix bug with dma maps on nested scsi objects
commit
d139b9bd0e52dda14fd13412e7096e68b56d0076 upstream.
Some of our virtual SCSI hosts don't have a proper bus parent at the
top, which can be a problem for doing DMA on them
This patch makes the host device cache a pointer to the physical bus
device and provides an extra API for setting it (the normal API picks
it up from the parent). This patch also modifies the qla2xxx and lpfc
vport logic to use the new DMA host setting API.
Acked-By: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sebastian Andrzej Siewior [Sun, 25 Oct 2009 14:37:58 +0000 (15:37 +0100)]
signal: Fix alternate signal stack check
commit
2a855dd01bc1539111adb7233f587c5c468732ac upstream.
All architectures in the kernel increment/decrement the stack pointer
before storing values on the stack.
On architectures which have the stack grow down sas_ss_sp == sp is not
on the alternate signal stack while sas_ss_sp + sas_ss_size == sp is
on the alternate signal stack.
On architectures which have the stack grow up sas_ss_sp == sp is on
the alternate signal stack while sas_ss_sp + sas_ss_size == sp is not
on the alternate signal stack.
The current implementation fails for architectures which have the
stack grow down on the corner case where sas_ss_sp == sp.This was
reported as Debian bug #544905 on AMD64.
Simplified test case: http://download.breakpoint.cc/tc-sig-stack.c
The test case creates the following stack scenario:
0xn0300 stack top
0xn0200 alt stack pointer top (when switching to alt stack)
0xn01ff alt stack end
0xn0100 alt stack start == stack pointer
If the signal is sent the stack pointer is pointing to the base
address of the alt stack and the kernel erroneously decides that it
has already switched to the alternate stack because of the current
check for "sp - sas_ss_sp < sas_ss_size"
On parisc (stack grows up) the scenario would be:
0xn0200 stack pointer
0xn01ff alt stack end
0xn0100 alt stack start = alt stack pointer base
(when switching to alt stack)
0xn0000 stack base
This is handled correctly by the current implementation.
[ tglx: Modified for archs which have the stack grow up (parisc) which
would fail with the correct implementation for stack grows
down. Added a check for sp >= current->sas_ss_sp which is
strictly not necessary but makes the code symetric for both
variants ]
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
LKML-Reference: <
20091025143758.GA6653@Chamillionaire.breakpoint.cc>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>