Bart Oldeman [Thu, 23 Sep 2010 17:16:58 +0000 (13:16 -0400)]
x86, vm86: Fix preemption bug for int1 debug and int3 breakpoint handlers.
Impact: fix kernel bug such as:
BUG: scheduling while atomic: dosemu.bin/19680/0x00000004
See also Ubuntu bug 455067 at
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/455067
Commits
4915a35e35a037254550a2ba9f367a812bc37d40
("Use preempt_conditional_sti/cli in do_int3, like on x86_64.")
and
3d2a71a596bd9c761c8487a2178e95f8a61da083
("x86, traps: converge do_debug handlers")
started disabling preemption in int1 and int3 handlers on i386.
The problem with vm86 is that the call to handle_vm86_trap() may jump
straight to entry_32.S and never returns so preempt is never enabled
again, and there is an imbalance in the preempt count.
Commit
be716615fe596ee117292dc615e95f707fb67fd1 ("x86, vm86:
fix preemption bug"), which was later (accidentally?) reverted by commit
08d68323d1f0c34452e614263b212ca556dae47f ("hw-breakpoints: modifying
generic debug exception to use thread-specific debug registers")
fixed the problem for debug exceptions but not for breakpoints.
There are three solutions to this problem.
1. Reenable preemption before calling handle_vm86_trap(). This
was the approach that was later reverted.
2. Do not disable preemption for i386 in breakpoint and debug handlers.
This was the situation before October 2008. As far as I understand
preemption only needs to be disabled on x86_64 because a seperate stack is
used, but it's nice to have things work the same way on
i386 and x86_64.
3. Let handle_vm86_trap() return instead of jumping to assembly code.
By setting a flag in _TIF_WORK_MASK, either TIF_IRET or TIF_NOTIFY_RESUME,
the code in entry_32.S is instructed to return to 32 bit mode from
V86 mode. The logic in entry_32.S was already present to handle signals.
(I chose TIF_IRET because it's slightly more efficient in
do_notify_resume() in signal.c, but in fact TIF_IRET can probably be
replaced by TIF_NOTIFY_RESUME everywhere.)
I'm submitting approach 3, because I believe it is the most elegant
and prevents future confusion. Still, an obvious
preempt_conditional_cli(regs); is necessary in traps.c to correct the
bug.
[ hpa: This is technically a regression, but because:
1. the regression is so old,
2. the patch seems relatively high risk, justifying more testing, and
3. we're late in the 2.6.36-rc cycle,
I'm queuing it up for the 2.6.37 merge window. It might, however,
justify as a -stable backport at a latter time, hence Cc: stable. ]
Signed-off-by: Bart Oldeman <bartoldeman@users.sourceforge.net>
LKML-Reference: <alpine.DEB.2.00.
1009231312330.4732@localhost.localdomain>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: <stable@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Linus Torvalds [Thu, 23 Sep 2010 15:06:55 +0000 (08:06 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: fix pcpu_last_unit_cpu
Yinghai Lu [Wed, 22 Sep 2010 20:05:15 +0000 (13:05 -0700)]
ipmi: fix hardcoded ipmi device exit path warning
When modprobe.conf has
options ipmi_si type="kcs" ports=0xCA2 regspacings="4"
ipmi_si can be loaded properly, but when try to unload it get:
Sep 20 15:00:27 xx abrt: Kerneloops: Reported 1 kernel oopses to Abrt
Sep 20 15:00:27 xx abrtd: Directory 'kerneloops-
1285020027-1' creation detected
Sep 20 15:00:27 xx abrtd: New crash /var/spool/abrt/kerneloops-
1285020027-1, processing
Sep 20 15:01:09 xx kernel: ------------[ cut here ]------------
Sep 20 15:01:09 xx kernel: WARNING: at drivers/base/driver.c:262 driver_unregister+0x8a/0xa0()
Sep 20 15:01:09 xx kernel: Hardware name: Sun Fire x4800
Sep 20 15:01:09 xx kernel: Unexpected driver unregister!
Sep 20 15:01:09 xx kernel: Modules linked in: ipmi_si(-) ipmi_msghandler ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb3i iw_cxgb3 cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm_intel kvm uinput sg ses enclosure ahci libahci pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support igb dca i7core_edac edac_core ext3 jbd mbcache sd_mod crc_t10dif megaraid_sas [last unloaded: ipmi_devintf]
Sep 20 15:01:09 xx kernel: Pid: 10625, comm: modprobe Tainted: G W 2.6.36-rc5-tip+ #6
Sep 20 15:01:09 xx kernel: Call Trace:
Sep 20 15:01:09 xx kernel: [<
ffffffff810600df>] warn_slowpath_common+0x7f/0xc0
Sep 20 15:01:09 xx kernel: [<
ffffffff810601d6>] warn_slowpath_fmt+0x46/0x50
Sep 20 15:01:09 xx kernel: [<
ffffffff812ff60a>] driver_unregister+0x8a/0xa0
Sep 20 15:01:09 xx kernel: [<
ffffffff812ae112>] pnp_unregister_driver+0x12/0x20
Sep 20 15:01:09 xx kernel: [<
ffffffffa01d0327>] cleanup_ipmi_si+0x3c/0xa7 [ipmi_si]
Sep 20 15:01:09 xx kernel: [<
ffffffff81099a60>] sys_delete_module+0x1a0/0x270
Sep 20 15:01:09 xx kernel: [<
ffffffff814b7070>] ? do_page_fault+0x150/0x320
Sep 20 15:01:09 xx kernel: [<
ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Sep 20 15:01:09 xx kernel: ---[ end trace
0d1967161adcee0d ]---
We need to check if ipmi_pnp_driver is loaded before we try to unload it.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vladimir Zapolskiy [Wed, 22 Sep 2010 20:05:13 +0000 (13:05 -0700)]
rtc: s3c: balance state changes of wakeup flag
This change resolves a problem about unbalanced calls of
enable_irq_wakeup() and disable_irq_wakeup() for alarm interrupt.
Bug reproduction:
root@eb600:~# echo 0 > /sys/class/rtc/rtc0/wakealarm
WARNING: at kernel/irq/manage.c:361 set_irq_wake+0x7c/0xe4()
Unbalanced IRQ 46 wake disable
Modules linked in:
[<
c0025708>] (unwind_backtrace+0x0/0xd8) from [<
c003358c>] (warn_slowpath_common+0x44/0x5c)
[<
c003358c>] (warn_slowpath_common+0x44/0x5c) from [<
c00335dc>] (warn_slowpath_fmt+0x24/0x30)
[<
c00335dc>] (warn_slowpath_fmt+0x24/0x30) from [<
c0058c20>] (set_irq_wake+0x7c/0xe4)
[<
c0058c20>] (set_irq_wake+0x7c/0xe4) from [<
c01b5e80>] (s3c_rtc_setalarm+0xa8/0xb8)
[<
c01b5e80>] (s3c_rtc_setalarm+0xa8/0xb8) from [<
c01b47a0>] (rtc_set_alarm+0x60/0x74)
[<
c01b47a0>] (rtc_set_alarm+0x60/0x74) from [<
c01b5a98>] (rtc_sysfs_set_wakealarm+0xc8/0xd8)
[<
c01b5a98>] (rtc_sysfs_set_wakealarm+0xc8/0xd8) from [<
c01891ec>] (dev_attr_store+0x20/0x24)
[<
c01891ec>] (dev_attr_store+0x20/0x24) from [<
c00be934>] (sysfs_write_file+0x104/0x13c)
[<
c00be934>] (sysfs_write_file+0x104/0x13c) from [<
c0080e7c>] (vfs_write+0xb0/0x158)
[<
c0080e7c>] (vfs_write+0xb0/0x158) from [<
c0080fcc>] (sys_write+0x3c/0x68)
[<
c0080fcc>] (sys_write+0x3c/0x68) from [<
c0020ec0>] (ret_fast_syscall+0x0/0x28)
Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Ben Dooks <ben@fluff.org.uk>
Cc: Atul Dahiya <atul.dahiya@samsung.com>
Cc: Taekgyun Ko <taeggyun.ko@samsung.com>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Wed, 22 Sep 2010 20:05:12 +0000 (13:05 -0700)]
mmap: call unlink_anon_vmas() in __split_vma() in case of error
If __split_vma fails because of an out of memory condition the
anon_vma_chain isn't teardown and freed potentially leading to rmap walks
accessing freed vma information plus there's a memleak.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Wed, 22 Sep 2010 20:05:12 +0000 (13:05 -0700)]
rmap: fix walk during fork
The below bug in fork led to the rmap walk finding the parent huge-pmd
twice instead of just once, because the anon_vma_chain objects of the
child vma still point to the vma->vm_mm of the parent.
The patch fixes it by making the rmap walk accurate during fork. It's not
a big deal normally but it worth being accurate considering the cost is
the same.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Johannes Weiner <jweiner@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Wed, 22 Sep 2010 20:05:11 +0000 (13:05 -0700)]
drivers/pci/intel-iommu.c: fix build with older gcc's
drivers/pci/intel-iommu.c: In function `__iommu_calculate_agaw':
drivers/pci/intel-iommu.c:437: sorry, unimplemented: inlining failed in call to 'width_to_agaw': function body not available
drivers/pci/intel-iommu.c:445: sorry, unimplemented: called from here
Move the offending function (and its siblings) to top-of-file, remove the
forward declaration.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=17441
Reported-by: Martin Mokrejs <mmokrejs@ribosome.natur.cuni.cz>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 22 Sep 2010 20:05:10 +0000 (13:05 -0700)]
oom: filter unkillable tasks from tasklist dump
/proc/sys/vm/oom_dump_tasks is enabled by default, so it's necessary to
limit as much information as possible that it should emit.
The tasklist dump should be filtered to only those tasks that are eligible
for oom kill. This is already done for memcg ooms, but this patch extends
it to both cpuset and mempolicy ooms as well as init.
In addition to suppressing irrelevant information, this also reduces
confusion since users currently don't know which tasks in the tasklist
aren't eligible for kill (such as those attached to cpusets or bound to
mempolicies with a disjoint set of mems or nodes, respectively) since that
information is not shown.
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Rosenberg [Wed, 22 Sep 2010 20:05:09 +0000 (13:05 -0700)]
drivers/video/sis/sis_main.c: prevent reading uninitialized stack memory
The FBIOGET_VBLANK device ioctl allows unprivileged users to read 16 bytes
of uninitialized stack memory, because the "reserved" member of the
fb_vblank struct declared on the stack is not altered or zeroed before
being copied back to the user. This patch takes care of it.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: Thomas Winischhofer <thomas@winischhofer.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Richard Weinberger [Wed, 22 Sep 2010 20:05:07 +0000 (13:05 -0700)]
uml: fix compile warning
This fixes:
incompatible pointer type: => 89
arch/um/kernel/exec.c: warning: passing argument 2 of 'execve1' from
incompatible pointer type: => 69, 85
arch/um/kernel/exec.c: warning: passing argument 3 of 'execve1' from
incompatible pointer type: => 69, 85
which was introduced by
d7627467b7a8d ("Make do_execve() take a const
filename pointer")
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Wed, 22 Sep 2010 20:05:06 +0000 (13:05 -0700)]
/proc/pid/smaps: fix dirty pages accounting
Currently, /proc/<pid>/smaps has wrong dirty pages accounting.
Shared_Dirty and Private_Dirty output only pte dirty pages and ignore
PG_dirty page flag. It is difference against documentation, but also
inconsistent against Referenced field. (Referenced checks both pte and
page flags)
This patch fixes it.
Test program:
large-array.c
---------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
char array[1*1024*1024*1024L];
int main(void)
{
memset(array, 1, sizeof(array));
pause();
return 0;
}
---------------------------------------------------
Test case:
1. run ./large-array
2. cat /proc/`pidof large-array`/smaps
3. swapoff -a
4. cat /proc/`pidof large-array`/smaps again
Test result:
<before patch>
00601000-
40601000 rw-p
00000000 00:00 0
Size:
1048576 kB
Rss:
1048576 kB
Pss:
1048576 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 218992 kB <-- showed pages as clean incorrectly
Private_Dirty: 829584 kB
Referenced: 388364 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
<after patch>
00601000-
40601000 rw-p
00000000 00:00 0
Size:
1048576 kB
Rss:
1048576 kB
Pss:
1048576 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty:
1048576 kB <-- fixed
Referenced: 388480 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jarek Poplawski [Wed, 22 Sep 2010 20:05:05 +0000 (13:05 -0700)]
fbcon: fix lockdep warning from fbcon_deinit()
Fix the lockdep warning:
[ 13.657164] INFO: trying to register non-static key.
[ 13.657169] the code is fine but needs lockdep annotation.
[ 13.657171] turning off the locking correctness validator.
[ 13.657177] Pid: 622, comm: modprobe Not tainted 2.6.36-rc3c #8
[ 13.657180] Call Trace:
[ 13.657194] [<
c13002c8>] ? printk+0x18/0x20
[ 13.657202] [<
c1056cf6>] register_lock_class+0x336/0x350
[ 13.657208] [<
c1058bf9>] __lock_acquire+0x449/0x1180
[ 13.657215] [<
c1059997>] lock_acquire+0x67/0x80
[ 13.657222] [<
c1042bf1>] ? __cancel_work_timer+0x51/0x230
[ 13.657227] [<
c1042c23>] __cancel_work_timer+0x83/0x230
[ 13.657231] [<
c1042bf1>] ? __cancel_work_timer+0x51/0x230
[ 13.657236] [<
c10582b2>] ? mark_held_locks+0x62/0x80
[ 13.657243] [<
c10b3a2f>] ? kfree+0x7f/0xe0
[ 13.657248] [<
c105853c>] ? trace_hardirqs_on_caller+0x11c/0x160
[ 13.657253] [<
c105858b>] ? trace_hardirqs_on+0xb/0x10
[ 13.657259] [<
c117f4cd>] ? fbcon_deinit+0x16d/0x1e0
[ 13.657263] [<
c117f4cd>] ? fbcon_deinit+0x16d/0x1e0
[ 13.657268] [<
c1042dea>] cancel_work_sync+0xa/0x10
[ 13.657272] [<
c117f444>] fbcon_deinit+0xe4/0x1e0
...
The warning is caused by trying to cancel an uninitialized work from
fbcon_exit(). Fix it by adding a check for queue.func, similarly to other
places in this code.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Luke Macken [Wed, 22 Sep 2010 20:05:04 +0000 (13:05 -0700)]
efifb: support the EFI framebuffer on more Apple hardware
Enable the EFI framebuffer on 14 more Macs, including the iMac11,1
iMac10,1 iMac8,1 Macmini3,1 Macmini4,1 MacBook5,1 MacBook6,1 MacBook7,1
MacBookPro2,2 MacBookPro5,2 MacBookPro5,3 MacBookPro6,1 MacBookPro6,2 and
MacBookPro7,1
Information gathered from various user submissions.
https://bugzilla.redhat.com/show_bug.cgi?id=528232
http://ubuntuforums.org/showthread.php?t=
1557326
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Luke Macken <lmacken@redhat.com>
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Jones [Wed, 22 Sep 2010 20:05:04 +0000 (13:05 -0700)]
efifb: check that the base address is plausible on pci systems
Some Apple machines have identical DMI data but different memory
configurations for the video. Given that, check that the address in our
table is actually within the range of a PCI BAR on a VGA device in the
machine.
This also fixes up the return value from set_system(), which has always
been wrong, but never resulted in bad behavior since there's only ever
been one matching entry in the dmi table.
The patch
1) stops people's machines from crashing when we get their display wrong,
which seems to be unfortunately inevitable,
2) allows us to support identical dmi data with differing video memory
configurations
This also adds me as the efifb maintainer, since I've effectively been
acting as such for quite some time.
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Wed, 22 Sep 2010 20:05:03 +0000 (13:05 -0700)]
aio: do not return ERESTARTSYS as a result of AIO
OCFS2 can return ERESTARTSYS from its write function when the process is
signalled while waiting for a cluster lock (and the filesystem is mounted
with intr mount option). Generally, it seems reasonable to allow
filesystems to return this error code from its IO functions. As we must
not leak ERESTARTSYS (and similar error codes) to userspace as a result of
an AIO operation, we have to properly convert it to EINTR inside AIO code
(restarting the syscall isn't really an option because other AIO could
have been already submitted by the same io_submit syscall).
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Wed, 22 Sep 2010 20:05:01 +0000 (13:05 -0700)]
vmscan: check all_unreclaimable in direct reclaim path
M. Vefa Bicakci reported 2.6.35 kernel hang up when hibernation on his
32bit 3GB mem machine.
(https://bugzilla.kernel.org/show_bug.cgi?id=16771). Also he bisected
the regression to
commit
bb21c7ce18eff8e6e7877ca1d06c6db719376e3c
Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Fri Jun 4 14:15:05 2010 -0700
vmscan: fix do_try_to_free_pages() return value when priority==0 reclaim failure
At first impression, this seemed very strange because the above commit
only chenged function return value and hibernate_preallocate_memory()
ignore return value of shrink_all_memory(). But it's related.
Now, page allocation from hibernation code may enter infinite loop if the
system has highmem. The reasons are that vmscan don't care enough OOM
case when oom_killer_disabled.
The problem sequence is following as.
1. hibernation
2. oom_disable
3. alloc_pages
4. do_try_to_free_pages
if (scanning_global_lru(sc) && !all_unreclaimable)
return 1;
If kswapd is not freozen, it would set zone->all_unreclaimable to 1 and
then shrink_zones maybe return true(ie, all_unreclaimable is true). So at
last, alloc_pages could go to _nopage_. If it is, it should have no
problem.
This patch adds all_unreclaimable check to protect in direct reclaim path,
too. It can care of hibernation OOM case and help bailout
all_unreclaimable case slightly.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Reported-by: M. Vefa Bicakci <bicave@superonline.com>
Reported-by: <caiqian@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: <caiqian@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Axel Lin [Wed, 22 Sep 2010 20:04:59 +0000 (13:04 -0700)]
drivers/rtc/rtc-ab3100.c: add missing platform_set_drvdata() in ab3100_rtc_probe()
Otherwise, calling platform_get_drvdata() in ab3100_rtc_remove() returns
NULL.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by:Wan ZongShun <mcuos.com@gmail.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hans-Christian Egtvedt [Wed, 22 Sep 2010 20:04:58 +0000 (13:04 -0700)]
MAINTAINERS: change AVR32 and AT32AP maintainer
Alter the maintainer of the AVR32 architecture and the AVR32/AT32AP
machine support to me. Haavard is moving on to new challenges, and we've
found it better to transfer the maintainer part to me. I will have good
contact with Haavard anyway.
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dmitry Torokhov [Wed, 22 Sep 2010 20:04:57 +0000 (13:04 -0700)]
vmware balloon: rename module
In an effort to minimize customer confusion we want to unify naming
convention for VMware-provided kernel modules. This change renames the
balloon driver from vmware_ballon to vmw_balloon.
We expect to follow this naming convention (vmw_<module_name>) for all
modules that are part of mainline kernel and/or being distributed by
VMware, with the sole exception of vmxnet3 driver (since the name of
mainline driver happens to match with the name used in VMware Tools).
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Acked-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
FUJITA Tomonori [Wed, 22 Sep 2010 20:04:55 +0000 (13:04 -0700)]
arm: fix "arm: fix pci_set_consistent_dma_mask for dmabounce devices"
This fixes the regression caused by the commit
6fee48cd330c68
("dma-mapping: arm: use generic pci_set_dma_mask and
pci_set_consistent_dma_mask").
ARM needs to clip the dma coherent mask for dmabounce devices. This
restores the old trick.
Note that strictly speaking, the DMA API doesn't allow architectures to do
such but I'm not sure it's worth adding the new API to set the dma mask
that allows architectures to clip it.
Reported-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Wed, 22 Sep 2010 20:04:54 +0000 (13:04 -0700)]
/proc/vmcore: fix seeking
Commit
73296bc611 ("procfs: Use generic_file_llseek in /proc/vmcore")
broke seeking on /proc/vmcore. This changes it back to use default_llseek
in order to restore the original behaviour.
The problem with generic_file_llseek is that it only allows seeks up to
inode->i_sb->s_maxbytes, which is zero on procfs and some other virtual
file systems. We should merge generic_file_llseek and default_llseek some
day and clean this up in a proper way, but for 2.6.35/36, reverting vmcore
is the safer solution.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: CAI Qian <caiqian@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yinghai Lu [Wed, 22 Sep 2010 20:04:53 +0000 (13:04 -0700)]
ipmi: fix acpi probe print
After
d9e1b6c45059ccf ("ipmi: fix ACPI detection with regspacing") we get
[ 11.026326] ipmi_si: probing via ACPI
[ 11.030019] ipmi_si 00:09: (null) regsize 1 spacing 1 irq 0
[ 11.035594] ipmi_si: Adding ACPI-specified kcs state machine
on an old system with only one range for ipmi kcs range.
Try to fix it by adding another res pointer.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 22 Sep 2010 20:04:52 +0000 (13:04 -0700)]
oom: always return a badness score of non-zero for eligible tasks
A task's badness score is roughly a proportion of its rss and swap
compared to the system's capacity. The scale ranges from 0 to 1000 with
the highest score chosen for kill. Thus, this scale operates on a
resolution of 0.1% of RAM + swap. Admin tasks are also given a 3% bonus,
so the badness score of an admin task using 3% of memory, for example,
would still be 0.
It's possible that an exceptionally large number of tasks will combine to
exhaust all resources but never have a single task that uses more than
0.1% of RAM and swap (or 3.0% for admin tasks).
This patch ensures that the badness score of any eligible task is never 0
so the machine doesn't unnecessarily panic because it cannot find a task
to kill.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Rosenberg [Wed, 22 Sep 2010 18:32:56 +0000 (14:32 -0400)]
Prevent freeing uninitialized pointer in compat_do_readv_writev
In 32-bit compatibility mode, the error handling for
compat_do_readv_writev() may free an uninitialized pointer, potentially
leading to all sorts of ugly memory corruption. This is reliably
triggerable by unprivileged users by invoking the readv()/writev()
syscalls with an invalid iovec pointer. The below patch fixes this to
emulate the non-compat version.
Introduced by commit
b83733639a49 ("compat: factor out
compat_rw_copy_check_uvector from compat_do_readv_writev")
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: stable@kernel.org (2.6.35)
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 22 Sep 2010 19:09:46 +0000 (12:09 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc: Prevent no-handler signal syscall restart recursion.
sparc: Don't mask signal when we can't setup signal frame.
sparc64: Fix race in signal instruction flushing.
sparc64: Support RAW perf events.
Al Viro [Mon, 20 Sep 2010 20:48:57 +0000 (21:48 +0100)]
powerpc: fix double syscall restarts
Make sigreturn zero regs->trap, make do_signal() do the same on all
paths. As it is, signal interrupting e.g. read() from fd 512 (==
ERESTARTSYS) with another signal getting unblocked when the first
handler finishes will lead to restart one insn earlier than it ought
to. Same for multiple signals with in-kernel handlers interrupting
that sucker at the same time. Same for multiple signals of any kind
interrupting that sucker on 64bit...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 22 Sep 2010 16:12:37 +0000 (09:12 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends
char: Mark /dev/zero and /dev/kmem as not capable of writeback
bdi: Initialize noop_backing_dev_info properly
cfq-iosched: fix a kernel OOPs when usb key is inserted
block: fix blk_rq_map_kern bio direction flag
cciss: freeing uninitialized data on error path
Jan Kara [Tue, 21 Sep 2010 09:51:01 +0000 (11:51 +0200)]
bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends
Inodes of devices such as /dev/zero can get dirty for example via
utime(2) syscall or due to atime update. Backing device of such inodes
(zero_bdi, etc.) is however unable to handle dirty inodes and thus
__mark_inode_dirty complains. In fact, inode should be rather dirtied
against backing device of the filesystem holding it. This is generally a
good rule except for filesystems such as 'bdev' or 'mtd_inodefs'. Inodes
in these pseudofilesystems are referenced from ordinary filesystem
inodes and carry mapping with real data of the device. Thus for these
inodes we have to use inode->i_mapping->backing_dev_info as we did so
far. We distinguish these filesystems by checking whether sb->s_bdi
points to a non-trivial backing device or not.
Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /.
There's a device inode A described by a path "/dev/sdb" on this
filesystem. This inode will be dirtied against backing device "8:0"
after this patch. bdev filesystem contains block device inode B coupled
with our inode A. When someone modifies a page of /dev/sdb, it's B that
gets dirtied and the dirtying happens against the backing device "8:16".
Thus both inodes get filed to a correct bdi list.
Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Jan Kara [Tue, 21 Sep 2010 09:49:01 +0000 (11:49 +0200)]
char: Mark /dev/zero and /dev/kmem as not capable of writeback
These devices don't do any writeback but their device inodes still can get
dirty so mark bdi appropriately so that bdi code does the right thing and files
inodes to lists of bdi carrying the device inodes.
Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Jan Kara [Tue, 21 Sep 2010 09:48:55 +0000 (11:48 +0200)]
bdi: Initialize noop_backing_dev_info properly
Properly initialize this backing dev info so that writeback code does not
barf when getting to it e.g. via sb->s_bdi.
Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
David S. Miller [Wed, 22 Sep 2010 05:30:13 +0000 (22:30 -0700)]
sparc: Prevent no-handler signal syscall restart recursion.
Explicitly clear the "in-syscall" bit when we have no signal
handler and back up the program counters to back up the system
call.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 22 Sep 2010 04:41:12 +0000 (21:41 -0700)]
sparc: Don't mask signal when we can't setup signal frame.
Don't invoke the signal handler tracehook in that situation
either.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 22 Sep 2010 01:21:05 +0000 (18:21 -0700)]
Merge branch 'for-linus/i2c/2636-rc5' of git://git.fluff.org/bjdooks/linux
* 'for-linus/i2c/2636-rc5' of git://git.fluff.org/bjdooks/linux:
i2c-omap: Make sure i2c bus is free before setting it to idle
Sage Weil [Tue, 21 Sep 2010 21:35:37 +0000 (14:35 -0700)]
fs: {lock,unlock}_flocks() stubs to prepare for BKL removal
The lock structs are currently protected by the BKL, but are accessed by
code in fs/locks.c and misc file system and DLM code. These stubs will
allow all users to switch to the new interface before the implementation
is changed to a spinlock.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mathias Nyman [Thu, 26 Aug 2010 07:36:44 +0000 (07:36 +0000)]
i2c-omap: Make sure i2c bus is free before setting it to idle
If the i2c bus receives an interrupt with both BB (bus busy) and
ARDY (register access ready) statuses set during the tranfer of the last message
the bus was put to idle while still busy.
This caused bus to timeout.
Signed-off-by: Mathias Nyman <mathias.nyman@nokia.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Linus Torvalds [Tue, 21 Sep 2010 20:22:10 +0000 (13:22 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix nohz balance kick
sched: Fix user time incorrectly accounted as system time on 32-bit
Linus Torvalds [Tue, 21 Sep 2010 20:21:42 +0000 (13:21 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hw breakpoints: Fix pid namespace bug
x86: Fix instruction breakpoint encoding
oprofile: Add Support for Intel CPU Family 6 / Model 22 (Intel Celeron 540)
kprobes: Fix Kconfig dependency
Linus Torvalds [Tue, 21 Sep 2010 18:20:10 +0000 (11:20 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: select CRYPTO
ceph: check mapping to determine if FILE_CACHE cap is used
ceph: only send one flushsnap per cap_snap per mds session
ceph: fix cap_snap and realm split
ceph: stop sending FLUSHSNAPs when we hit a dirty capsnap
ceph: correctly set 'follows' in flushsnap messages
ceph: fix dn offset during readdir_prepopulate
ceph: fix file offset wrapping at 4GB on 32-bit archs
ceph: fix reconnect encoding for old servers
ceph: fix pagelist kunmap tail
ceph: fix null pointer deref on anon root dentry release
Linus Torvalds [Tue, 21 Sep 2010 18:00:30 +0000 (11:00 -0700)]
Merge branch 'drm-intel-fixes' of git://git./linux/kernel/git/ickle/drm-intel
* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel:
drm/i915: Hold a reference to the object whilst unbinding the eviction list
drm/i915,agp/intel: Add second set of PCI-IDs for B43
drm/i915: Fix Sandybridge fence registers
drm/i915/crt: Downgrade warnings for hotplug failures
drm/i915: Ensure that the crtcinfo is populated during mode_fixup()
Linus Torvalds [Tue, 21 Sep 2010 18:00:09 +0000 (11:00 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: update comments to reflect LHCALL_LOAD_GDT_ENTRY.
virtio: console: Prevent userspace from submitting NULL buffers
virtio: console: Fix poll blocking even though there is data to read
Suresh Siddha [Mon, 13 Sep 2010 18:02:21 +0000 (11:02 -0700)]
sched: Fix nohz balance kick
There's a situation where the nohz balancer will try to wake itself:
cpu-x is idle which is also ilb_cpu
got a scheduler tick during idle
and the nohz_kick_needed() in trigger_load_balance() checks for
rq_x->nr_running which might not be zero (because of someone waking a
task on this rq etc) and this leads to the situation of the cpu-x
sending a kick to itself.
And this can cause a lockup.
Avoid this by not marking ourself eligible for kicking.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <
1284400941.2684.19.camel@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Vivek Goyal [Tue, 14 Sep 2010 06:47:11 +0000 (08:47 +0200)]
cfq-iosched: fix a kernel OOPs when usb key is inserted
Mike reported a kernel crash when a usb key hotplug is performed while all
kernel thrads are not in a root cgroup and are running in one of the child
cgroups of blkio controller.
BUG: unable to handle kernel NULL pointer dereference at
0000002c
IP: [<
c11c7b08>] cfq_get_queue+0x232/0x412
*pde =
00000000
Oops: 0000 [#1] PREEMPT
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host3/scsi_host/host3/uevent
[..]
Pid: 30039, comm: scsi_scan_3 Not tainted 2.6.35.2-fg.roam #1 Volvi2 /Aspire 4315
EIP: 0060:[<
c11c7b08>] EFLAGS:
00010086 CPU: 0
EIP is at cfq_get_queue+0x232/0x412
EAX:
f705f9c0 EBX:
e977abac ECX:
00000000 EDX:
00000000
ESI:
f00da400 EDI:
f00da4ec EBP:
e977a800 ESP:
dff8fd00
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process scsi_scan_3 (pid: 30039, ti=
dff8e000 task=
f6b6c9a0 task.ti=
dff8e000)
Stack:
00000000 00000000 00000001 01ff0000 f00da508 00000000 f00da524 f00da540
<0>
e7994940 dd631750 f705f9c0 e977a820 e977ac44 f00da4d0 00000001 f6b6c9a0
<0>
00000010 00008010 0000000b 00000000 00000001 e977a800 dd76fac0 00000246
Call Trace:
[<
c11c7f10>] ? cfq_set_request+0x228/0x34c
[<
c11c7ce8>] ? cfq_set_request+0x0/0x34c
[<
c11bb3b9>] ? elv_set_request+0xf/0x1c
[<
c11bdd51>] ? get_request+0x1ad/0x22f
[<
c11bddf2>] ? get_request_wait+0x1f/0x11a
[<
c11d013b>] ? kvasprintf+0x33/0x3b
[<
c127b537>] ? scsi_execute+0x1d/0x103
[<
c127b675>] ? scsi_execute_req+0x58/0x83
[<
c127c391>] ? scsi_probe_and_add_lun+0x188/0x7c2
[<
c12718c6>] ? attribute_container_add_device+0x15/0xfa
[<
c11c95d1>] ? kobject_get+0xf/0x13
[<
c126d1db>] ? get_device+0x10/0x14
[<
c127be93>] ? scsi_alloc_target+0x217/0x24d
[<
c127cbd8>] ? __scsi_scan_target+0x95/0x480
[<
c10204eb>] ? dequeue_entity+0x14/0x1fe
[<
c1020491>] ? update_curr+0x165/0x1ab
[<
c1020491>] ? update_curr+0x165/0x1ab
[<
c127d00d>] ? scsi_scan_channel+0x4a/0x76
[<
c127d0b0>] ? scsi_scan_host_selected+0x77/0xad
[<
c127d13c>] ? do_scan_async+0x0/0x11a
[<
c127d137>] ? do_scsi_scan_host+0x51/0x56
[<
c127d13c>] ? do_scan_async+0x0/0x11a
[<
c127d14a>] ? do_scan_async+0xe/0x11a
[<
c127d13c>] ? do_scan_async+0x0/0x11a
[<
c10354c5>] ? kthread+0x5e/0x63
[<
c1035467>] ? kthread+0x0/0x63
[<
c1002af6>] ? kernel_thread_helper+0x6/0x10
Code: 44 24 1c 54 83 44 24 18 54 83 fa 03 75 94 8b 06 c7 86 64 02 00 00 01 00 00 00 83 e0 03 09 f0 89 06 8b 44 24 28 8b 90 58 01 00 00 <8b> 42 2c 85 c0 75 03 8b 42 08 8d 54 24 48 52 8d 4c 24 50 51 68
EIP: [<
c11c7b08>] cfq_get_queue+0x232/0x412 SS:ESP 0068:
dff8fd00
CR2:
000000000000002c
---[ end trace
9a88306573f69b12 ]---
The problem here is that we don't have bdi->dev information available when
thread does some IO. Hence when dev_name() tries to access bdi->dev, it
crashes.
This problem does not happen if kernel threads are in root group as root
group is statically allocated at device initialization time and we don't
hit this piece of code.
Fix it by delaying the filling of major and minor number information of
device in blk_group. Initially a blk_group is created with 0 as device
information and this information is filled later once some more IO comes
in from same group.
Reported-by: Mike Kazantsev <mk.fraggod@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Benny Halevy [Mon, 13 Sep 2010 19:32:19 +0000 (21:32 +0200)]
block: fix blk_rq_map_kern bio direction flag
This bug was introduced in
7b6d91daee5cac6402186ff224c3af39d79f4a0e
"block: unify flags for struct bio and struct request"
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Dan Carpenter [Mon, 13 Sep 2010 12:09:33 +0000 (14:09 +0200)]
cciss: freeing uninitialized data on error path
The "h->scatter_list" is allocated inside a for loop. If any of those
allocations fail, then the rest of the list is uninitialized data. When
we free it we should start from the top and free backwards so that we
don't call kfree() on uninitialized pointers.
Also if the allocation for "h->scatter_list" fails then we would get an
Oops here. I should have noticed this when I send:
4ee69851c "cciss:
handle allocation failure." but I didn't. Sorry about that.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Chris Wilson [Tue, 21 Sep 2010 08:14:55 +0000 (09:14 +0100)]
Merge remote branch 'linus' into drm-intel-fixes
David S. Miller [Tue, 21 Sep 2010 06:24:52 +0000 (23:24 -0700)]
sparc64: Fix race in signal instruction flushing.
If another cpu does a very wide munmap() on the signal frame area,
it can tear down the page table hierarchy from underneath us.
Borrow an idea from the 64-bit fault path's get_user_insn(), and
disable cross call interrupts during the page table traversal
to lock them in place while we operate.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tejun Heo [Tue, 21 Sep 2010 05:57:19 +0000 (07:57 +0200)]
percpu: fix pcpu_last_unit_cpu
pcpu_first/last_unit_cpu are used to track which cpu has the first and
last units assigned. This in turn is used to determine the span of a
chunk for man/unmap cache flushes and whether an address belongs to
the first chunk or not in per_cpu_ptr_to_phys().
When the number of possible CPUs isn't power of two, a chunk may
contain unassigned units towards the end of a chunk. The logic to
determine pcpu_last_unit_cpu was incorrect when there was an unused
unit at the end of a chunk. It failed to ignore the unused unit and
assigned the unused marker NR_CPUS to pcpu_last_unit_cpu.
This was discovered through kdump failure which was caused by
malfunctioning per_cpu_ptr_to_phys() on a kvm setup with 50 possible
CPUs by CAI Qian.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: CAI Qian <caiqian@redhat.com>
Cc: stable@kernel.org
Rusty Russell [Tue, 21 Sep 2010 16:54:01 +0000 (10:54 -0600)]
lguest: update comments to reflect LHCALL_LOAD_GDT_ENTRY.
We used to have a hypercall which reloaded the entire GDT, then we
switched to one which loaded a single entry (to match the IDT code).
Some comments were not updated, so fix them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported by: Eviatar Khen <eviatarkhen@gmail.com>
Amit Shah [Tue, 14 Sep 2010 07:56:16 +0000 (13:26 +0530)]
virtio: console: Prevent userspace from submitting NULL buffers
A userspace could submit a buffer with 0 length to be written to the
host. Prevent such a situation.
This was not needed previously, but recent changes in the way write()
works exposed this condition to trigger a virtqueue event to the host,
causing a NULL buffer to be sent across.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CC: stable@kernel.org
Hans de Goede [Thu, 16 Sep 2010 09:13:08 +0000 (14:43 +0530)]
virtio: console: Fix poll blocking even though there is data to read
I found this while working on a Linux agent for spice, the symptom I was
seeing was select blocking on the spice vdagent virtio serial port even
though there were messages queued up there.
virtio_console's port_fops_poll checks port->inbuf != NULL to determine
if read won't block. However if an application reads enough bytes from
inbuf through port_fops_read, to empty the current port->inbuf,
port->inbuf will be NULL even though there may be buffers left in the
virtqueue.
This causes poll() to block even though there is data to be read,
this patch fixes this by using will_read_block(port) instead of the
port->inbuf != NULL check.
Signed-off-By: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
Linus Torvalds [Mon, 20 Sep 2010 23:56:53 +0000 (16:56 -0700)]
Linux 2.6.36-rc5
Linus Torvalds [Mon, 20 Sep 2010 23:45:08 +0000 (16:45 -0700)]
Merge git://git./linux/kernel/git/gregkh/staging-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6:
Staging: vt6655: fix buffer overflow
Revert: "Staging: batman-adv: Adding netfilter-bridge hooks"
Linus Torvalds [Mon, 20 Sep 2010 23:44:40 +0000 (16:44 -0700)]
Merge git://git./linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: musb: MAINTAINERS: Fix my mail address
USB: serial/mos*: prevent reading uninitialized stack memory
USB: otg: twl4030: fix phy initialization(v1)
USB: EHCI: Disable langwell/penwell LPM capability
usb: musb_debugfs: don't use the struct file private_data field with seq_files
Linus Torvalds [Mon, 20 Sep 2010 23:44:24 +0000 (16:44 -0700)]
Merge git://git./linux/kernel/git/gregkh/tty-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
serial: mfd: fix bug in serial_hsu_remove()
serial: amba-pl010: fix set_ldisc
Dan Carpenter [Mon, 6 Sep 2010 12:32:30 +0000 (14:32 +0200)]
Staging: vt6655: fix buffer overflow
"param->u.wpa_associate.wpa_ie_len" comes from the user. We should
check it so that the copy_from_user() doesn't overflow the buffer.
Also further down in the function, we assume that if
"param->u.wpa_associate.wpa_ie_len" is set then "abyWPAIE[0]" is
initialized. To make that work, I changed the test here to say that if
"wpa_ie_len" is set then "wpa_ie" has to be a valid pointer or we return
-EINVAL.
Oddly, we only use the first element of the abyWPAIE[] array. So I
suspect there may be some other issues in this function.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sven Eckelmann [Sat, 4 Sep 2010 23:58:18 +0000 (01:58 +0200)]
Revert: "Staging: batman-adv: Adding netfilter-bridge hooks"
This reverts commit
96d592ed599434d2d5f339a1d282871bc6377d2c.
The netfilter hook seems to be misused and may leak skbs in situations
when NF_HOOK returns NF_STOLEN. It may not filter everything as
expected. Also the ethernet bridge tables are not yet capable to
understand batman-adv packet correctly.
It was only added for testing purposes and can be removed again.
Reported-by: Vasiliy Kulikov <segooon@gmail.com>
Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Feng Tang [Mon, 6 Sep 2010 12:41:02 +0000 (13:41 +0100)]
serial: mfd: fix bug in serial_hsu_remove()
Medfield HSU driver deal with 4 pci devices(3 uart ports + 1 dma controller),
so in pci remove func, we need handle them differently
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mika Westerberg [Sat, 4 Sep 2010 07:23:23 +0000 (10:23 +0300)]
serial: amba-pl010: fix set_ldisc
Commit
d87d9b7d1 ("tty: serial - fix tty referencing in set_ldisc") changed
set_ldisc to take ldisc number as parameter. This patch fixes AMBA PL010 driver
according the new prototype.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Felipe Balbi [Thu, 9 Sep 2010 06:04:25 +0000 (09:04 +0300)]
USB: musb: MAINTAINERS: Fix my mail address
If we don't, contributors to musb and any USB OMAP
code will be sending mails to an unexistent inbox.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dan Rosenberg [Wed, 15 Sep 2010 21:44:16 +0000 (17:44 -0400)]
USB: serial/mos*: prevent reading uninitialized stack memory
The TIOCGICOUNT device ioctl in both mos7720.c and mos7840.c allows
unprivileged users to read uninitialized stack memory, because the
"reserved" member of the serial_icounter_struct struct declared on the
stack is not altered or zeroed before being copied back to the user.
This patch takes care of it.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ming Lei [Mon, 6 Sep 2010 15:27:09 +0000 (23:27 +0800)]
USB: otg: twl4030: fix phy initialization(v1)
Commit
461c317705eca5cac09a360f488715927fd0a927(into 2.6.36-v3)
is put forward to power down phy if no usb cable is connected,
but does introduce the two issues below:
1), phy is not into work state if usb cable is connected
with PC during poweron, so musb device mode is not usable
in such case, follows the reasons:
-twl4030_phy_resume is not called, so
regulators are not enabled
i2c access are not enabled
usb mode not configurated
2), The kernel warings[1] of regulators 'unbalanced disables'
is caused if poweron without usb cable connected
with PC or b-device.
This patch fixes the two issues above:
-power down phy only if no usb cable is connected with PC
and b-device
-do phy initialization(via __twl4030_phy_resume) if usb cable
is connected with PC(vbus event) or another b-device(ID event) in
twl4030_usb_probe.
This patch also doesn't put VUSB3V1 LDO into active mode in
twl4030_usb_ldo_init until VBUS/ID change detected, so we can
save more power consumption than before.
This patch is verified OK on Beagle board either connected with
usb cable or not when poweron.
[1]. warnings of 'unbalanced disables' of regulators.
[root@OMAP3EVM /]# dmesg
------------[ cut here ]------------
WARNING: at drivers/regulator/core.c:1357 _regulator_disable+0x38/0x128()
unbalanced disables for VUSB1V8
Modules linked in:
Backtrace:
[<
c0030c48>] (dump_backtrace+0x0/0x110) from [<
c034f5a8>] (dump_stack+0x18/0x1c)
r7:
c78179d8 r6:
c01ed6b8 r5:
c0410822 r4:
0000054d
[<
c034f590>] (dump_stack+0x0/0x1c) from [<
c0057da8>] (warn_slowpath_common+0x54/0x6c)
[<
c0057d54>] (warn_slowpath_common+0x0/0x6c) from [<
c0057e64>] (warn_slowpath_fmt+0x38/0x40)
r9:
00000000 r8:
00000000 r7:
c78e6608 r6:
00000000 r5:
fffffffb
r4:
c78e6c00
[<
c0057e2c>] (warn_slowpath_fmt+0x0/0x40) from [<
c01ed6b8>] (_regulator_disable+0x38/0x128)
r3:
c0410e53 r2:
c0410ad5
[<
c01ed680>] (_regulator_disable+0x0/0x128) from [<
c01ed87c>] (regulator_disable+0x24/0x38)
r7:
c78e6608 r6:
00000000 r5:
c78e6c40 r4:
c78e6c00
[<
c01ed858>] (regulator_disable+0x0/0x38) from [<
c02382dc>] (twl4030_phy_power+0x15c/0x17c)
r5:
c78595c0 r4:
00000000
[<
c0238180>] (twl4030_phy_power+0x0/0x17c) from [<
c023831c>] (twl4030_phy_suspend+0x20/0x2c)
r6:
00000000 r5:
c78595c0 r4:
c78595c0
[<
c02382fc>] (twl4030_phy_suspend+0x0/0x2c) from [<
c0238638>] (twl4030_usb_irq+0x11c/0x16c)
r5:
c78595c0 r4:
00000040
[<
c023851c>] (twl4030_usb_irq+0x0/0x16c) from [<
c034ec18>] (twl4030_usb_probe+0x2c4/0x32c)
r6:
00000000 r5:
00000000 r4:
c78595c0
[<
c034e954>] (twl4030_usb_probe+0x0/0x32c) from [<
c02152a0>] (platform_drv_probe+0x20/0x24)
r7:
00000000 r6:
c047d49c r5:
c78e6608 r4:
c047d49c
[<
c0215280>] (platform_drv_probe+0x0/0x24) from [<
c0214244>] (driver_probe_device+0xd0/0x190)
[<
c0214174>] (driver_probe_device+0x0/0x190) from [<
c02143d4>] (__device_attach+0x44/0x48)
r7:
00000000 r6:
c78e6608 r5:
c78e6608 r4:
c047d49c
[<
c0214390>] (__device_attach+0x0/0x48) from [<
c0213694>] (bus_for_each_drv+0x50/0x90)
r5:
c0214390 r4:
00000000
[<
c0213644>] (bus_for_each_drv+0x0/0x90) from [<
c0214474>] (device_attach+0x70/0x94)
r6:
c78e663c r5:
c78e6608 r4:
c78e6608
[<
c0214404>] (device_attach+0x0/0x94) from [<
c02134fc>] (bus_probe_device+0x2c/0x48)
r7:
00000000 r6:
00000002 r5:
c78e6608 r4:
c78e6600
[<
c02134d0>] (bus_probe_device+0x0/0x48) from [<
c0211e48>] (device_add+0x340/0x4b4)
[<
c0211b08>] (device_add+0x0/0x4b4) from [<
c021597c>] (platform_device_add+0x110/0x16c)
[<
c021586c>] (platform_device_add+0x0/0x16c) from [<
c0220cb0>] (add_numbered_child+0xd8/0x118)
r7:
00000000 r6:
c045f15c r5:
c78e6600 r4:
00000000
[<
c0220bd8>] (add_numbered_child+0x0/0x118) from [<
c001c618>] (twl_probe+0x3a4/0x72c)
[<
c001c274>] (twl_probe+0x0/0x72c) from [<
c02601ac>] (i2c_device_probe+0x7c/0xa4)
[<
c0260130>] (i2c_device_probe+0x0/0xa4) from [<
c0214244>] (driver_probe_device+0xd0/0x190)
r5:
c7856e20 r4:
c047c860
[<
c0214174>] (driver_probe_device+0x0/0x190) from [<
c02143d4>] (__device_attach+0x44/0x48)
r7:
c7856e04 r6:
c7856e20 r5:
c7856e20 r4:
c047c860
[<
c0214390>] (__device_attach+0x0/0x48) from [<
c0213694>] (bus_for_each_drv+0x50/0x90)
r5:
c0214390 r4:
00000000
[<
c0213644>] (bus_for_each_drv+0x0/0x90) from [<
c0214474>] (device_attach+0x70/0x94)
r6:
c7856e54 r5:
c7856e20 r4:
c7856e20
[<
c0214404>] (device_attach+0x0/0x94) from [<
c02134fc>] (bus_probe_device+0x2c/0x48)
r7:
c7856e04 r6:
c78fd048 r5:
c7856e20 r4:
c7856e20
[<
c02134d0>] (bus_probe_device+0x0/0x48) from [<
c0211e48>] (device_add+0x340/0x4b4)
[<
c0211b08>] (device_add+0x0/0x4b4) from [<
c0211fd8>] (device_register+0x1c/0x20)
[<
c0211fbc>] (device_register+0x0/0x20) from [<
c0260aa8>] (i2c_new_device+0xec/0x150)
r5:
c7856e00 r4:
c7856e20
[<
c02609bc>] (i2c_new_device+0x0/0x150) from [<
c0260dc0>] (i2c_register_adapter+0xa0/0x1c4)
r7:
00000000 r6:
c78fd078 r5:
c78fd048 r4:
c781d5c0
[<
c0260d20>] (i2c_register_adapter+0x0/0x1c4) from [<
c0260f80>] (i2c_add_numbered_adapter+0x9c/0xb4)
r7:
00000a28 r6:
c04600a8 r5:
c78fd048 r4:
00000000
[<
c0260ee4>] (i2c_add_numbered_adapter+0x0/0xb4) from [<
c034efa4>] (omap_i2c_probe+0x324/0x3e8)
r5:
00000000 r4:
c78fd000
[<
c034ec80>] (omap_i2c_probe+0x0/0x3e8) from [<
c02152a0>] (platform_drv_probe+0x20/0x24)
[<
c0215280>] (platform_drv_probe+0x0/0x24) from [<
c0214244>] (driver_probe_device+0xd0/0x190)
[<
c0214174>] (driver_probe_device+0x0/0x190) from [<
c021436c>] (__driver_attach+0x68/0x8c)
r7:
c78b2140 r6:
c047e214 r5:
c04600e4 r4:
c04600b0
[<
c0214304>] (__driver_attach+0x0/0x8c) from [<
c021399c>] (bus_for_each_dev+0x50/0x84)
r7:
c78b2140 r6:
c047e214 r5:
c0214304 r4:
00000000
[<
c021394c>] (bus_for_each_dev+0x0/0x84) from [<
c0214068>] (driver_attach+0x20/0x28)
r6:
c047e214 r5:
c047e214 r4:
c00270d0
[<
c0214048>] (driver_attach+0x0/0x28) from [<
c0213274>] (bus_add_driver+0xa8/0x228)
[<
c02131cc>] (bus_add_driver+0x0/0x228) from [<
c02146a4>] (driver_register+0xb0/0x13c)
[<
c02145f4>] (driver_register+0x0/0x13c) from [<
c0215744>] (platform_driver_register+0x4c/0x60)
r9:
00000000 r8:
c001f688 r7:
00000013 r6:
c005b6fc r5:
c00083dc
r4:
c00270d0
[<
c02156f8>] (platform_driver_register+0x0/0x60) from [<
c001f69c>] (omap_i2c_init_driver+0x14/0x1c)
[<
c001f688>] (omap_i2c_init_driver+0x0/0x1c) from [<
c002c460>] (do_one_initcall+0xd0/0x1a4)
[<
c002c390>] (do_one_initcall+0x0/0x1a4) from [<
c0008478>] (kernel_init+0x9c/0x154)
[<
c00083dc>] (kernel_init+0x0/0x154) from [<
c005b6fc>] (do_exit+0x0/0x688)
r5:
c00083dc r4:
00000000
---[ end trace
1b75b31a2719ed1d ]---
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Felipe Balbi <me@felipebalbi.com>
Cc: Anand Gadiyar <gadiyar@ti.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alek Du [Mon, 6 Sep 2010 13:50:57 +0000 (14:50 +0100)]
USB: EHCI: Disable langwell/penwell LPM capability
We have to do so due to HW limitation.
Signed-off-by: Alek Du <alek.du@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mathias Nyman [Mon, 6 Sep 2010 10:52:01 +0000 (13:52 +0300)]
usb: musb_debugfs: don't use the struct file private_data field with seq_files
seq_files use the private_data field of a file struct for storing a seq_file structure,
data should be stored in seq_file's own private field (e.g. file->private_data->private)
Otherwise seq_release() will free the private data when the file is closed.
Signed-off-by: Mathias Nyman <mathias.nyman@nokia.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Chris Wilson [Mon, 20 Sep 2010 09:31:40 +0000 (10:31 +0100)]
drm/i915: Hold a reference to the object whilst unbinding the eviction list
During heavy aperture thrashing we may be forced to wait upon several active
objects during eviction. The active list may be the last reference to
these objects and so the action of waiting upon one of them may cause
another to be freed (and itself unbound). To prevent the object
disappearing underneath us, we need to acquire and hold a reference
whilst unbinding.
This should fix the reported page refcount OOPS:
kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1444!
...
RIP: 0010:[<
ffffffffa0093026>] [<
ffffffffa0093026>] i915_gem_object_put_pages+0x25/0xf5 [i915]
Call Trace:
[<
ffffffffa009481d>] i915_gem_object_unbind+0xc5/0x1a7 [i915]
[<
ffffffffa0098ab2>] i915_gem_evict_something+0x3bd/0x409 [i915]
[<
ffffffffa0027923>] ? drm_gem_object_lookup+0x27/0x57 [drm]
[<
ffffffffa0093bc3>] i915_gem_object_bind_to_gtt+0x1d3/0x279 [i915]
[<
ffffffffa0095b30>] i915_gem_object_pin+0xa3/0x146 [i915]
[<
ffffffffa0027948>] ? drm_gem_object_lookup+0x4c/0x57 [drm]
[<
ffffffffa00961bc>] i915_gem_do_execbuffer+0x50d/0xe32 [i915]
Reported-by: Shawn Starr <shawn.starr@rogers.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=18902
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Al Viro [Mon, 20 Sep 2010 14:13:25 +0000 (15:13 +0100)]
frv: double syscall restarts, syscall restart in sigreturn()
We need to make sure that only the first do_signal() to be handled on
the way out syscall will bother with syscall restarts; additionally, the
check on the "signal has user handler" path had been wrong - compare
with restart prevention in sigreturn()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 20 Sep 2010 14:13:19 +0000 (15:13 +0100)]
frv: handling of restart into restart_syscall is fscked
do_signal() should place the syscall number in gr7, not gr8 when
handling ERESTART_WOULDBLOCK.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 20 Sep 2010 14:13:14 +0000 (15:13 +0100)]
frv: avoid infinite loop of SIGSEGV delivery
Use force_sigsegv() rather than force_sig(SIGSEGV, ...) as the former
resets the SEGV handler pointer which will kill the process, rather than
leaving it open to an infinite loop if the SEGV handler itself caused a
SEGV signal.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 20 Sep 2010 14:13:09 +0000 (15:13 +0100)]
frv: fix address verification holes in setup_frame/setup_rt_frame
a) sa_handler might be maliciously set to point to kernel memory;
blindly dereferencing it in FDPIC case is a Bad Idea(tm).
b) I'm not sure you need that set_fs(USER_DS) there at all, but if you
do, you'd better do it *before* checking the frame you've decided to
use with access_ok(), lest sigaltstack() becomes a convenient
roothole.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 20 Sep 2010 14:13:04 +0000 (15:13 +0100)]
frv: restart_block.fn needs to be reset on sigreturn
Reset restart_block.fn on executing a sigreturn such that any currently
pending system call restarts will be forced to return -EINTR.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Mon, 20 Sep 2010 02:40:22 +0000 (19:40 -0700)]
mm: further fix swapin race condition
Commit
4969c1192d15 ("mm: fix swapin race condition") is now agreed to
be incomplete. There's a race, not very much less likely than the
original race envisaged, in which it is further necessary to check that
the swapcache page's swap has not changed.
Here's the reasoning: cast in terms of reuse_swap_page(), but probably
could be reformulated to rely on try_to_free_swap() instead, or on
swapoff+swapon.
A, faults into do_swap_page(): does page1 = lookup_swap_cache(swap1) and
comes through the lock_page(page1).
B, a racing thread of the same process, faults on the same address: does
page1 = lookup_swap_cache(swap1) and now waits in lock_page(page1), but
for whatever reason is unlucky not to get the lock any time soon.
A carries on through do_swap_page(), a write fault, but cannot reuse the
swap page1 (another reference to swap1). Unlocks the page1 (but B
doesn't get it yet), does COW in do_wp_page(), page2 now in that pte.
C, perhaps the parent of A+B, comes in and write faults the same swap
page1 into its mm, reuse_swap_page() succeeds this time, swap1 is freed.
kswapd comes in after some time (B still unlucky) and swaps out some
pages from A+B and C: it allocates the original swap1 to page2 in A+B,
and some other swap2 to the original page1 now in C. But does not
immediately free page1 (actually it couldn't: B holds a reference),
leaving it in swap cache for now.
B at last gets the lock on page1, hooray! Is PageSwapCache(page1)? Yes.
Is pte_same(*page_table, orig_pte)? Yes, because page2 has now been
given the swap1 which page1 used to have. So B proceeds to insert page1
into A+B's page_table, though its content now belongs to C, quite
different from what A wrote there.
B ought to have checked that page1's swap was still swap1.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 19 Sep 2010 18:09:23 +0000 (11:09 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
alpha: deal with multiple simultaneously pending signals
alpha: fix a 14 years old bug in sigreturn tracing
alpha: unb0rk sigsuspend() and rt_sigsuspend()
alpha: belated ERESTART_RESTARTBLOCK race fix
alpha: Shift perf event pending work earlier in timer interrupt
alpha: wire up fanotify and prlimit64 syscalls
alpha: kill big kernel lock
alpha: fix build breakage in asm/cacheflush.h
alpha: remove unnecessary cast from void* in assignment.
alpha: Use static const char * const where possible
Linus Torvalds [Sun, 19 Sep 2010 18:06:34 +0000 (11:06 -0700)]
Merge git://git./linux/kernel/git/davem/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
ide: Fix ordering of procfs registry.
Linus Torvalds [Sun, 19 Sep 2010 18:05:50 +0000 (11:05 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
netpoll: Disable IRQ around RCU dereference in netpoll_rx
sctp: Do not reset the packet during sctp_packet_config().
net/llc: storing negative error codes in unsigned short
MAINTAINERS: move atlx discussions to netdev
drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory
drivers/net/eql.c: prevent reading uninitialized stack memory
drivers/net/usb/hso.c: prevent reading uninitialized memory
xfrm: dont assume rcu_read_lock in xfrm_output_one()
r8169: Handle rxfifo errors on 8168 chips
3c59x: Remove atomic context inside vortex_{set|get}_wol
tcp: Prevent overzealous packetization by SWS logic.
net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS
phylib: fix PAL state machine restart on resume
net: use rcu_barrier() in rollback_registered_many
bonding: correctly process non-linear skbs
ipv4: enable getsockopt() for IP_NODEFRAG
ipv4: force_igmp_version ignored when a IGMPv3 query received
ppp: potential NULL dereference in ppp_mp_explode()
net/llc: make opt unsigned in llc_ui_setsockopt()
...
Linus Torvalds [Sun, 19 Sep 2010 18:05:05 +0000 (11:05 -0700)]
Merge branch 's5p-fixes-for-linus' of git://git./linux/kernel/git/kgene/linux-samsung
* 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: S3C64XX: Add IORESOURCE_IRQ_HIGHLEVEL flag to dm9000 on mach-real6410
ARM: S3C64XX: Fix coding style errors on mach-real6410
ARM: S3C64XX: Prototype SPI devices
ARM: S3C64XX: Fix dev-spi build
ARM: SAMSUNG: Fix on s5p_gpio_[get,set]_drvstr
ARM: SAMSUNG: Fix on drive strength value
ARM: S5PV210: Add FIMC clocks
ARM: S5PV210: Reduce the iodesc length of systimer
ARM: S5PV210: Update I2C-1 Clock Register Property.
ARM: S5P: Decrease IO Registers memory region size on FIMC
ARM: S5P: Fix DMA coherent mask for FIMC
Jan Harkes [Sat, 18 Sep 2010 03:26:01 +0000 (23:26 -0400)]
Coda: mount hangs because of missed REQ_WRITE rename
Coda's REQ_* defines were renamed to avoid clashes with the block layer
(commit
4aeefdc69f7b: "coda: fixup clash with block layer REQ_*
defines").
However one was missed and response messages are no longer matched with
requests and waiting threads are no longer woken up. This patch fixes
this.
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
[ Also fixed up whitespace while at it -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Sat, 18 Sep 2010 12:42:27 +0000 (08:42 -0400)]
alpha: deal with multiple simultaneously pending signals
Unlike the other targets, alpha sets _one_ sigframe and
buggers off until the next syscall/interrupt, even if
more signals are pending. It leads to quite a few unpleasant
inconsistencies, starting with SIGSEGV potentially arriving
not where it should and including e.g. mess with sigsuspend();
consider two pending signals blocked until sigsuspend()
unblocks them. We pick the first one; then, if we are hit
by interrupt while in the handler, we process the second one
as well. If we are not, and if no syscalls had been made,
we get out of the first handler and leave the second signal
pending; normally sigreturn() would've picked it anyway, but
here it starts with restoring the original mask and voila -
the second signal is blocked again. On everything else we
get both delivered consistently.
It's actually easy to fix; the only thing to watch out for
is prevention of double syscall restart. Fortunately, the
idea I've nicked from arm fix by rmk works just fine...
Testcase demonstrating the behaviour in question; on alpha
we get one or both flags set (usually one), on everything
else both are always set.
#include <signal.h>
#include <stdio.h>
int had1, had2;
void f1(int sig) { had1 = 1; }
void f2(int sig) { had2 = 1; }
main()
{
sigset_t set1, set2;
sigemptyset(&set1);
sigemptyset(&set2);
sigaddset(&set2, 1);
sigaddset(&set2, 2);
signal(1, f1);
signal(2, f2);
sigprocmask(SIG_SETMASK, &set2, NULL);
raise(1);
raise(2);
sigsuspend(&set1);
printf("had1:%d had2:%d\n", had1, had2);
}
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Al Viro [Sat, 18 Sep 2010 12:41:16 +0000 (08:41 -0400)]
alpha: fix a 14 years old bug in sigreturn tracing
The way sigreturn() is implemented on alpha breaks PTRACE_SYSCALL,
all way back to 1.3.95 when alpha has grown PTRACE_SYSCALL support.
What happens is direct return to ret_from_syscall, in order to bypass
mangling of a3 (error indicator) and prevent other mutilations of
registers (e.g. by syscall restart). That's fine, but... the entire
TIF_SYSCALL_TRACE codepath is kept separate on alpha and post-syscall
stopping/notifying the tracer is after the syscall. And the normal
path we are forcibly switching to doesn't have it.
So we end up with *one* stop in traced sigreturn() vs. two in other
syscalls. And yes, strace is visibly broken by that; try to strace
the following
#include <signal.h>
#include <stdio.h>
void f(int sig) {}
main()
{
signal(SIGHUP, f);
raise(SIGHUP);
write(1, "eeeek\n", 6);
}
and watch the show. The
close(1) = 405
in the end of strace output is coming from return value of write() (6 ==
__NR_close on alpha) and syscall number of exit_group() (__NR_exit_group ==
405 there).
The fix is fairly simple - the only thing we end up missing is the call
of syscall_trace() and we can tell whether we'd been called from the
SYSCALL_TRACE path by checking ra value. Since we are setting the
switch_stack up (that's what sys_sigreturn() does), we have the right
environment for calling syscall_trace() - just before we call
undo_switch_stack() and return. Since undo_switch_stack() will overwrite
s0 anyway, we can use it to store the result of "has it been called from
SYSCALL_TRACE path?" check. The same thing applies in rt_sigreturn().
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Al Viro [Sat, 18 Sep 2010 12:40:07 +0000 (08:40 -0400)]
alpha: unb0rk sigsuspend() and rt_sigsuspend()
Old code used to set regs->r0 and regs->r19 to force the right
return value. Leaving that after switch to ERESTARTNOHAND
was a Bad Idea(tm), since now that screws the restart - if we
hit the case when get_signal_to_deliver() returns 0, we will
step back to syscall insn, with v0 set to EINTR and a3 to 1.
The latter won't matter, since EINTR is 4, aka __NR_write.
Testcase:
#include <signal.h>
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
main()
{
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGCONT);
sigprocmask(SIG_SETMASK, &mask, NULL);
kill(0, SIGCONT);
syscall(__NR_sigsuspend, 1, "b0rken\n", 7);
}
results on alpha in immediate message to stdout...
Fix is obvious; moreover, since we don't need regs anymore, we can
switch to normal prototypes for these guys and lose the wrappers.
Even better, rt_sigsuspend() is identical to generic version in
kernel/signal.c now.
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Al Viro [Sat, 18 Sep 2010 12:38:47 +0000 (08:38 -0400)]
alpha: belated ERESTART_RESTARTBLOCK race fix
same thing as had been done on other targets back in 2003 -
move setting ->restart_block.fn into {rt_,}sigreturn().
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Michael Cree [Sun, 19 Sep 2010 06:05:40 +0000 (02:05 -0400)]
alpha: Shift perf event pending work earlier in timer interrupt
Pending work from the performance event subsystem is executed in
the timer interrupt. This patch shifts the call to
perf_event_do_pending() before the call to update_process_times()
as the latter may call back into the perf event subsystem and it
is prudent to have the pending work executed first.
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Mikael Pettersson [Thu, 16 Sep 2010 18:12:55 +0000 (14:12 -0400)]
alpha: wire up fanotify and prlimit64 syscalls
The 2.6.36-rc kernel added three new system calls:
fanotify_init, fanotify_mark, and prlimit64. This
patch wires them up on Alpha.
Built and booted on an XP900. Untested beyond that.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Arnd Bergmann [Tue, 14 Sep 2010 23:34:56 +0000 (19:34 -0400)]
alpha: kill big kernel lock
All uses of the BKL on alpha are totally bogus, nothing
is really protected by this. Remove the remaining users
so we don't have to mark alpha as 'depends on BKL'.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: linux-alpha@vger.kernel.org
Signed-off-by: Matt Turner <mattst88@gmail.com>
Tejun Heo [Tue, 14 Sep 2010 13:00:22 +0000 (09:00 -0400)]
alpha: fix build breakage in asm/cacheflush.h
Alpha SMP flush_icache_user_range() is implemented as an inline
function inside include/asm/cacheflush.h. It dereferences @current
but doesn't include linux/sched.h and thus causes build failure if
linux/sched.h wasn't included previously. Fix it by including the
needed header file explicitly.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Matt Turner <mattst88@gmail.com>
matt mooney [Tue, 14 Sep 2010 09:27:39 +0000 (05:27 -0400)]
alpha: remove unnecessary cast from void* in assignment.
Acked-by: Jan-Benedict Glaw <jbglaw@lug-owl.de>
Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Joe Perches [Tue, 14 Sep 2010 08:23:47 +0000 (04:23 -0400)]
alpha: Use static const char * const where possible
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Sosnowski, Maciej [Thu, 16 Sep 2010 06:02:26 +0000 (06:02 +0000)]
dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
Direct Cache Access is not supported on IOAT ver.3.0 multiple-IOH platforms.
This patch blocks registering of dca providers when multiple IOH detected with IOAT ver.3.0.
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Darius Augulis [Thu, 9 Sep 2010 12:41:31 +0000 (21:41 +0900)]
ARM: S3C64XX: Add IORESOURCE_IRQ_HIGHLEVEL flag to dm9000 on mach-real6410
Add IORESOURCE_IRQ_HIGHLEVEL irq flag to dm9000 driver
platform data in board mach-real6410.
Signed-off-by: Darius Augulis <augulis.darius@gmail.com>
[kgene.kim@samsung.com: minor title fix]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Darius Augulis [Thu, 9 Sep 2010 12:40:22 +0000 (21:40 +0900)]
ARM: S3C64XX: Fix coding style errors on mach-real6410
Fix errors reported by checkpatch.pl script
Signed-off-by: Darius Augulis <augulis.darius@gmail.com>
[kgene.kim@samsung.com: minor title fix]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Mark Brown [Sat, 18 Sep 2010 00:54:38 +0000 (09:54 +0900)]
ARM: S3C64XX: Prototype SPI devices
Avoids build warnings due to the undeclared non-statics.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Herbert Xu [Fri, 17 Sep 2010 23:55:03 +0000 (16:55 -0700)]
netpoll: Disable IRQ around RCU dereference in netpoll_rx
We cannot use rcu_dereference_bh safely in netpoll_rx as we may
be called with IRQs disabled. We could however simply disable
IRQs as that too causes BH to be disabled and is safe in either
case.
Thanks to John Linville for discovering this bug and providing
a patch.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Wed, 15 Sep 2010 14:00:26 +0000 (10:00 -0400)]
sctp: Do not reset the packet during sctp_packet_config().
sctp_packet_config() is called when getting the packet ready
for appending of chunks. The function should not touch the
current state, since it's possible to ping-pong between two
transports when sending, and that can result packet corruption
followed by skb overlfow crash.
Reported-by: Thomas Dreibholz <dreibh@iem.uni-due.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sage Weil [Fri, 17 Sep 2010 19:30:31 +0000 (12:30 -0700)]
ceph: select CRYPTO
We select CRYPTO_AES, but not CRYPTO.
Signed-off-by: Sage Weil <sage@newdream.net>
Linus Torvalds [Fri, 17 Sep 2010 17:53:28 +0000 (10:53 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: pcm - Fix race with proc files
ALSA: pcm - Fix unbalanced pm_qos_request
ALSA: HDA: Enable internal speaker on Dell M101z
ALSA: patch_nvhdmi.c: Fix supported sample rate list.
sound: Remove pr_<level> uses of KERN_<level>
ALSA: hda - Add quirk for Toshiba C650D using a Conexant CX20585
ALSA: hda_intel: ALSA HD Audio patch for Intel Patsburg DeviceIDs
Linus Torvalds [Fri, 17 Sep 2010 17:25:47 +0000 (10:25 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (lm95241) Replace rate sysfs attribute with update_interval
hwmon: (adm1031) Replace update_rate sysfs attribute with update_interval
hwmon: (w83627ehf) Use proper exit sequence
hwmon: (emc1403) Remove unnecessary hwmon_device_unregister
hwmon: (f75375s) Do not overwrite values read from registers
hwmon: (f75375s) Shift control mode to the correct bit position
hwmon: New subsystem maintainers
hwmon: (lis3lv02d) Prevent NULL pointer dereference
Linus Torvalds [Fri, 17 Sep 2010 17:23:42 +0000 (10:23 -0700)]
Merge git://git./linux/kernel/git/steve/gfs2-2.6-fixes
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: gfs2_logd should be using interruptible waits
Linus Torvalds [Fri, 17 Sep 2010 17:23:08 +0000 (10:23 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: nosy: fix build when CONFIG_FIREWIRE=N
firewire: ohci: activate cycle timer register quirk on Ricoh chips
Linus Torvalds [Fri, 17 Sep 2010 17:22:48 +0000 (10:22 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md: fix v1.x metadata update when a disk is missing.
md: call md_update_sb even for 'external' metadata arrays.
Al Viro [Fri, 17 Sep 2010 13:34:39 +0000 (14:34 +0100)]
arm: fix really nasty sigreturn bug
If a signal hits us outside of a syscall and another gets delivered
when we are in sigreturn (e.g. because it had been in sa_mask for
the first one and got sent to us while we'd been in the first handler),
we have a chance of returning from the second handler to location one
insn prior to where we ought to return. If r0 happens to contain -513
(-ERESTARTNOINTR), sigreturn will get confused into doing restart
syscall song and dance.
Incredible joy to debug, since it manifests as random, infrequent and
very hard to reproduce double execution of instructions in userland
code...
The fix is simple - mark it "don't bother with restarts" in wrapper,
i.e. set r8 to 0 in sys_sigreturn and sys_rt_sigreturn wrappers,
suppressing the syscall restart handling on return from these guys.
They can't legitimately return a restart-worthy error anyway.
Testcase:
#include <unistd.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/time.h>
#include <errno.h>
void f(int n)
{
__asm__ __volatile__(
"ldr r0, [%0]\n"
"b 1f\n"
"b 2f\n"
"1:b .\n"
"2:\n" : : "r"(&n));
}
void handler1(int sig) { }
void handler2(int sig) { raise(1); }
void handler3(int sig) { exit(0); }
main()
{
struct sigaction s = {.sa_handler = handler2};
struct itimerval t1 = { .it_value = {1} };
struct itimerval t2 = { .it_value = {2} };
signal(1, handler1);
sigemptyset(&s.sa_mask);
sigaddset(&s.sa_mask, 1);
sigaction(SIGALRM, &s, NULL);
signal(SIGVTALRM, handler3);
setitimer(ITIMER_REAL, &t1, NULL);
setitimer(ITIMER_VIRTUAL, &t2, NULL);
f(-513); /* -ERESTARTNOINTR */
write(1, "buggered\n", 9);
return 1;
}
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sage Weil [Fri, 17 Sep 2010 16:54:08 +0000 (09:54 -0700)]
ceph: check mapping to determine if FILE_CACHE cap is used
See if the i_data mapping has any pages to determine if the FILE_CACHE
capability is currently in use, instead of assuming it is any time the
rdcache_gen value is set (i.e., issued -> used).
This allows the MDS RECALL_STATE process work for inodes that have cached
pages.
Signed-off-by: Sage Weil <sage@newdream.net>
Takashi Iwai [Fri, 17 Sep 2010 15:44:20 +0000 (17:44 +0200)]
Merge branch 'fix/hda' into for-linus