Thomas Gleixner [Thu, 15 Sep 2011 13:32:06 +0000 (15:32 +0200)]
sched: Fix idle_cpu()
On -rt we observed hackbench waking all 400 tasks to a single cpu.
This is because of select_idle_sibling()'s interaction with the new
ipi based wakeup scheme.
The existing idle_cpu() test only checks whether the current task on
that cpu is the idle task; it does not take already queued tasks into
account, nor does it take tasks queued to be woken into account.
If the remote wakeup IPIs come hard enough, there won't be time to
schedule away from the idle task, and we would thus keep thinking the cpu
was in fact idle, regardless of the fact that there were already
several hundred tasks runnable.
We couldn't reproduce on mainline, but there's no reason it couldn't
happen.
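A minimal user-space sketch of the stricter idle test described above.
It assumes a simplified runqueue model: struct model_rq, its fields and
both function names are hypothetical and only illustrate the idea of
also looking at queued tasks and pending remote wakeups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a per-cpu runqueue. */
struct model_task {
	int pid;
};

struct model_rq {
	struct model_task *curr;   /* task currently on the cpu */
	struct model_task *idle;   /* this cpu's idle task */
	unsigned int nr_running;   /* tasks already queued as runnable */
	size_t pending_wakeups;    /* remote wakeups queued, not yet processed */
};

/* Old-style test: only looks at the task currently on the cpu. */
static bool idle_cpu_curr_only(const struct model_rq *rq)
{
	return rq->curr == rq->idle;
}

/* Stricter test: queued and to-be-woken tasks also make the cpu busy. */
static bool idle_cpu_strict(const struct model_rq *rq)
{
	assert(rq != NULL);

	if (rq->curr != rq->idle)
		return false;
	if (rq->nr_running)
		return false;
	if (rq->pending_wakeups)
		return false;
	return true;
}
```

With 400 pending wakeups and the idle task still current, the first
test reports the cpu as idle while the second does not.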
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-3o30p18b2paswpc9ohy2gltp@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Mon, 12 Sep 2011 13:50:49 +0000 (15:50 +0200)]
llist: Remove cpu_relax() usage in cmpxchg loops
Initial benchmarks show they're a net loss:
$ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done
$ echo 4096 32000 64 128 > /proc/sys/kernel/sem
$ ./sembench -t 2048 -w 1900 -o 0
Pre:
run time 30 seconds 778936 worker burns per second
run time 30 seconds 912190 worker burns per second
run time 30 seconds 817506 worker burns per second
run time 30 seconds 830870 worker burns per second
run time 30 seconds 845056 worker burns per second
Post:
run time 30 seconds 905920 worker burns per second
run time 30 seconds 849046 worker burns per second
run time 30 seconds 886286 worker burns per second
run time 30 seconds 822320 worker burns per second
run time 30 seconds 900283 worker burns per second
So about 4% faster. (!)
cpu_relax() stalls the pipeline, therefore, when used in a tight loop
it has the following benefits:
- allows SMT siblings to have a go;
- reduces pressure on the CPU interconnect.
However, cmpxchg loops are unfair and thus have unbounded completion
time, therefore we should avoid getting in such heavily contended
situations where the above benefits make any difference.
A typical cmpxchg loop should not go round more than a handful of
times at worst, therefore adding extra delays just slows things down.
Since the llist primitives are new, there aren't any bad users yet,
and we should avoid growing them. Heavily contended sites should
generally be better off using the ticket locks for serialization since
they provide bounded completion times (fifo-fair over the cpus).
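For illustration, a user-space sketch of a compare-and-swap push loop
with no delay in the retry path. It uses C11 atomics rather than the
kernel's cmpxchg(), and struct node, struct stack_head, push() and
take_all() are hypothetical names, not the llist API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int value;
};

struct stack_head {
	_Atomic(struct node *) first;
};

/* Push one node; retry immediately on contention, without any relax/delay. */
static void push(struct stack_head *head, struct node *n)
{
	struct node *old = atomic_load_explicit(&head->first,
						memory_order_relaxed);

	do {
		/* On a failed exchange 'old' is refreshed, so relink and retry. */
		n->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&head->first, &old, n,
							memory_order_release,
							memory_order_relaxed));
}

/* Detach the whole list in one atomic exchange. */
static struct node *take_all(struct stack_head *head)
{
	return atomic_exchange_explicit(&head->first, (struct node *)NULL,
					memory_order_acquire);
}
```

The loop body is two stores and one exchange; under light contention it
is expected to complete in one or two iterations.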
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1315836358.26517.43.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Mon, 12 Sep 2011 11:06:17 +0000 (13:06 +0200)]
sched: Convert to struct llist
Use the generic llist primitives.
We had a private lockless list implementation in the scheduler in the wake-list
code, now that we have a generic llist implementation that provides all required
operations, switch to it.
This patch is not expected to change any behavior.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1315836353.26517.42.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Mon, 12 Sep 2011 11:12:28 +0000 (13:12 +0200)]
llist: Add llist_next()
So we don't have to expose the struct llist_node member.
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315836348.26517.41.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Huang Ying [Thu, 8 Sep 2011 06:00:46 +0000 (14:00 +0800)]
irq_work: Use llist in the struct irq_work logic
Use llist in irq_work instead of the lock-less linked list
implementation in irq_work to avoid the code duplication.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-6-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Huang Ying [Thu, 8 Sep 2011 06:00:45 +0000 (14:00 +0800)]
llist: Return whether list is empty before adding in llist_add()
Extend the llist_add*() functions to return a success indicator, this
allows us in the scheduler code to send an IPI if the queue was empty.
( There's no effect on existing users, because the llist_add_xxx() functions
are inline, thus this will be optimized out by the compiler if not used
by callers. )
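A sketch of how a caller might use such a return value, written with
C11 atomics in user space. wake_list_add(), queue_remote_wakeup() and
the ipis_sent counter are hypothetical stand-ins; the counter replaces
an actual IPI.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct wake_node {
	struct wake_node *next;
};

struct wake_list {
	_Atomic(struct wake_node *) first;
};

/* Add a node; return true if the list was empty before the add. */
static bool wake_list_add(struct wake_list *list, struct wake_node *n)
{
	struct wake_node *old = atomic_load_explicit(&list->first,
						     memory_order_relaxed);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&list->first, &old, n,
							memory_order_release,
							memory_order_relaxed));

	/* After a successful exchange 'old' is the previous head. */
	return old == NULL;
}

static unsigned int ipis_sent;	/* stand-in for sending an IPI */

/* Only the first enqueue on an empty list needs to kick the remote cpu. */
static void queue_remote_wakeup(struct wake_list *list, struct wake_node *n)
{
	if (wake_list_add(list, n))
		ipis_sent++;
}
```

Queueing three nodes back to back on an empty list bumps the counter
once, since the second and third add find a non-empty list.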
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-5-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Huang Ying [Thu, 8 Sep 2011 06:00:44 +0000 (14:00 +0800)]
llist: Move cpu_relax() to after the cmpxchg()
If in llist_add()/etc. functions the first cmpxchg() call succeeds, it is
not necessary to use cpu_relax() before the cmpxchg(). So cpu_relax() in
a busy loop involving cmpxchg() should go after cmpxchg() instead of before
that.
This patch fixes this for all involved llist functions.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-4-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 4 Oct 2011 10:43:11 +0000 (12:43 +0200)]
llist: Remove the platform-dependent NMI checks
Remove the in_nmi() checks spread around the code. in_nmi() is not available
on every architecture and it's a pretty obscure and ugly check in any case.
Cc: Huang Ying <ying.huang@intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-3-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Huang Ying [Thu, 8 Sep 2011 06:00:42 +0000 (14:00 +0800)]
llist: Make some llist functions inline
Because llist code will be used in performance critical scheduler
code path, make llist_add() and llist_del_all() inline to avoid
function calling overhead and related 'glue' overhead.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1315461646-1379-2-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Tue, 4 Oct 2011 09:08:16 +0000 (11:08 +0200)]
Merge branch 'linus' into sched/core
Merge reason: pick up the latest fixes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Mon, 3 Oct 2011 19:54:56 +0000 (12:54 -0700)]
Merge branch 'hwmon-for-linus' of git://github.com/groeck/linux
* 'hwmon-for-linus' of git://github.com/groeck/linux:
hwmon: (coretemp) Avoid leaving around dangling pointer
hwmon: (coretemp) Fixup platform device ID change
Linus Torvalds [Mon, 3 Oct 2011 19:53:43 +0000 (12:53 -0700)]
Merge git://github.com/davem330/ide
* git://github.com/davem330/ide:
ide-disk: Fix request requeuing
Linus Torvalds [Mon, 3 Oct 2011 19:17:44 +0000 (12:17 -0700)]
Merge branch 'btrfs-3.0' of git://github.com/chrismason/linux
* 'btrfs-3.0' of git://github.com/chrismason/linux:
Btrfs: force a page fault if we have a shorty copy on a page boundary
Borislav Petkov [Mon, 3 Oct 2011 18:28:18 +0000 (14:28 -0400)]
ide-disk: Fix request requeuing
Simon Kirby reported that on his RAID setup with idedisk underneath
the box OOMs after a couple of days of runtime. Running with
CONFIG_DEBUG_KMEMLEAK pointed to idedisk_prep_fn() which unconditionally
allocates an ide_cmd struct. However, ide_requeue_and_plug() can be
called more than once per request, either from the request issue or the
IRQ handler path, and so blk_peek_request() ends up in idedisk_prep_fn()
repeatedly, allocating a struct ide_cmd every time and "forgetting" the
previous pointer.
Make sure the code reuses the old allocated chunk.
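A user-space sketch of the reuse pattern, assuming a simplified request
model. struct model_request, struct model_cmd and model_prep() are
hypothetical names, and calloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct model_cmd {
	int opcode;
};

struct model_request {
	struct model_cmd *special;	/* command attached by the prep step */
};

static unsigned int allocations;	/* counts real allocations, for the demo */

/* Prep step that may run several times for the same request. */
static int model_prep(struct model_request *rq)
{
	struct model_cmd *cmd;

	if (rq->special) {
		/* Requeued request: reuse the chunk from the previous pass. */
		cmd = rq->special;
		memset(cmd, 0, sizeof(*cmd));
	} else {
		cmd = calloc(1, sizeof(*cmd));
		if (!cmd)
			return -1;
		allocations++;
	}

	rq->special = cmd;
	return 0;
}
```

Calling model_prep() twice on the same request allocates once; the
second call only clears and reuses the existing command.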
Reported-and-tested-by: Simon Kirby <sim@hostway.ca>
Cc: <stable@kernel.org> [ 39.x, 3.0.x ]
Link: http://marc.info/?l=linux-kernel&m=131667641517919
Link: http://lkml.kernel.org/r/20110922072643.GA27232@hostway.ca
Signed-off-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 3 Oct 2011 02:23:44 +0000 (19:23 -0700)]
Merge branch 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6
* 'for-linus' of git://git.infradead.org/users/sameo/mfd-2.6:
mfd: Fix generic irq chip ack function name for jz4740-adc
Linus Torvalds [Mon, 3 Oct 2011 02:22:44 +0000 (19:22 -0700)]
Merge branch 'for-linus' of git://github.com/tiwai/sound
* 'for-linus' of git://github.com/tiwai/sound:
ALSA: hda - Fix a regression of the position-buffer check
Linus Torvalds [Sun, 2 Oct 2011 00:46:13 +0000 (17:46 -0700)]
Merge branch 'perf-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip
* 'perf-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
perf tools: Fix raw sample reading
Linus Torvalds [Sat, 1 Oct 2011 15:37:25 +0000 (08:37 -0700)]
Merge branches 'irq-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip
* 'irq-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
irq: Fix check for already initialized irq_domain in irq_domain_add
irq: Add declaration of irq_domain_simple_ops to irqdomain.h
* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
x86/rtc: Don't recursively acquire rtc_lock
* 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
posix-cpu-timers: Cure SMP wobbles
sched: Fix up wchan borkage
sched/rt: Migrate equal priority tasks to available CPUs
Josef Bacik [Fri, 30 Sep 2011 19:23:54 +0000 (15:23 -0400)]
Btrfs: force a page fault if we have a shorty copy on a page boundary
A user reported a problem where ceph was getting into 100% cpu usage while doing
some writing. It turns out it's because we were doing a short write on a not
uptodate page, which means we'd fall back to one page at a time and fault the
page in. The problem is our position is on the page boundary, so our fault in
logic wasn't actually reading the page, so we'd just spin forever or until the
page got read in by somebody else. This will force a readpage if we end up
doing a short copy. Alexandre could reproduce this easily with ceph and reports
it fixes his problem. I also wrote a reproducer that no longer hangs my box
with this patch. Thanks,
Reported-and-tested-by: Alexandre Oliva <aoliva@redhat.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Ingo Molnar [Fri, 30 Sep 2011 18:08:56 +0000 (20:08 +0200)]
Merge branch 'perf/urgent' of git://github.com/acmel/linux into perf/urgent
Peter Zijlstra [Thu, 1 Sep 2011 10:42:04 +0000 (12:42 +0200)]
posix-cpu-timers: Cure SMP wobbles
David reported:
Attached below is a watered-down version of rt/tst-cpuclock2.c from
GLIBC. Just build it with "gcc -o test test.c -lpthread -lrt" or
similar.
Run it several times, and you will see cases where the main thread
will measure a process clock difference before and after the nanosleep
which is smaller than the cpu-burner thread's individual thread clock
difference. This doesn't make any sense since the cpu-burner thread
is part of the top-level process's thread group.
I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
64-bit binaries).
For example:
[davem@boricha build-x86_64-linux]$ ./test
process: before(0.001221967) after(0.498624371) diff(497402404)
thread: before(0.000081692) after(0.498316431) diff(498234739)
self: before(0.001223521) after(0.001240219) diff(16698)
[davem@boricha build-x86_64-linux]$
The diff of 'process' should always be >= the diff of 'thread'.
I make sure to wrap the 'thread' clock measurements the most tightly
around the nanosleep() call, and that the 'process' clock measurements
are the outer-most ones.
---
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <pthread.h>

static pthread_barrier_t barrier;

static void *chew_cpu(void *arg)
{
	pthread_barrier_wait(&barrier);
	while (1)
		__asm__ __volatile__("" : : : "memory");
	return NULL;
}

int main(void)
{
	clockid_t process_clock, my_thread_clock, th_clock;
	struct timespec process_before, process_after;
	struct timespec me_before, me_after;
	struct timespec th_before, th_after;
	struct timespec sleeptime;
	unsigned long diff;
	pthread_t th;
	int err;

	err = clock_getcpuclockid(0, &process_clock);
	if (err)
		return 1;
	err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
	if (err)
		return 1;
	pthread_barrier_init(&barrier, NULL, 2);
	err = pthread_create(&th, NULL, chew_cpu, NULL);
	if (err)
		return 1;
	err = pthread_getcpuclockid(th, &th_clock);
	if (err)
		return 1;
	pthread_barrier_wait(&barrier);

	err = clock_gettime(process_clock, &process_before);
	if (err)
		return 1;
	err = clock_gettime(my_thread_clock, &me_before);
	if (err)
		return 1;
	err = clock_gettime(th_clock, &th_before);
	if (err)
		return 1;

	sleeptime.tv_sec = 0;
	sleeptime.tv_nsec = 500000000;
	nanosleep(&sleeptime, NULL);

	err = clock_gettime(th_clock, &th_after);
	if (err)
		return 1;
	err = clock_gettime(my_thread_clock, &me_after);
	if (err)
		return 1;
	err = clock_gettime(process_clock, &process_after);
	if (err)
		return 1;

	diff = process_after.tv_nsec - process_before.tv_nsec;
	printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
	       process_before.tv_sec, process_before.tv_nsec,
	       process_after.tv_sec, process_after.tv_nsec, diff);
	diff = th_after.tv_nsec - th_before.tv_nsec;
	printf("thread: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
	       th_before.tv_sec, th_before.tv_nsec,
	       th_after.tv_sec, th_after.tv_nsec, diff);
	diff = me_after.tv_nsec - me_before.tv_nsec;
	printf("self: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
	       me_before.tv_sec, me_before.tv_nsec,
	       me_after.tv_sec, me_after.tv_nsec, diff);
	return 0;
}
This is due to us using p->se.sum_exec_runtime in
thread_group_cputime() where we iterate the thread group and sum all
data. This does not take time since the last schedule operation (tick
or otherwise) into account. We can cure this by using
task_sched_runtime() at the cost of having to take locks.
This also means we can (and must) do away with
thread_group_sched_runtime() since the modified thread_group_cputime()
is now more accurate and would deadlock when called from
thread_group_sched_runtime().
Aside from that, it makes the function safe on 32 bit systems. The old
code added t->se.sum_exec_runtime unprotected. sum_exec_runtime is a
64bit value and could be changed on another cpu at the same time.
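A simplified model of the accounting difference, as a user-space
sketch. struct model_thread and the three functions are hypothetical;
the sketch ignores locking entirely, whereas the real fix has to take
locks to read the values consistently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_thread {
	uint64_t sum_exec_runtime;	/* ns accounted at the last update */
	uint64_t exec_start;		/* timestamp of the last update */
	bool on_cpu;			/* currently running? */
};

/* Accurate per-thread runtime: add the time since the last update. */
static uint64_t thread_runtime(const struct model_thread *t, uint64_t now)
{
	uint64_t ns = t->sum_exec_runtime;

	if (t->on_cpu && now > t->exec_start)
		ns += now - t->exec_start;
	return ns;
}

/* Stale group sum: only what was accounted at the last update. */
static uint64_t group_runtime_stale(const struct model_thread *t, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += t[i].sum_exec_runtime;
	return sum;
}

/* Group sum built from the accurate per-thread value. */
static uint64_t group_runtime(const struct model_thread *t, size_t n,
			      uint64_t now)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += thread_runtime(&t[i], now);
	return sum;
}
```

Because the stale sum lags by a different amount at each sample, a
process clock delta taken from it can come out smaller than the delta
of a single running thread's clock, which is the effect David reported.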
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
Tested-by: David Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Takashi Iwai [Fri, 30 Sep 2011 06:52:26 +0000 (08:52 +0200)]
ALSA: hda - Fix a regression of the position-buffer check
The commit
a810364a0424c297242c6c66071a42f7675a5568
ALSA: hda - Handle -1 as invalid position, too
caused a regression on some machines that require the position-buffer
instead of LPIB, e.g. resulting in noises with mic recording with
PulseAudio.
This patch fixes the detection by delaying the test to the same point
as in 3.0, i.e. doing the position check only when requested in
azx_position_ok().
Reported-and-tested-by: Rocko Requin <rockorequin@hotmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Ram Pai [Thu, 22 Sep 2011 07:48:58 +0000 (15:48 +0800)]
Resource: fix wrong resource window calculation
__find_resource() incorrectly returns a resource window which overlaps
an existing allocated window. This happens when the parent's
resource-window spans 0x00000000 to 0xffffffff and is entirely allocated
to all its children resource-windows.
__find_resource() looks for gaps in resource allocation among the
children resource windows. When it encounters the last child window it
blindly tries the range next to one allocated to the last child. Since
the last child's window ends at 0xffffffff the calculation overflows,
leading the algorithm to believe that any window in the range 0x00000000
to 0xffffffff is available for allocation. This leads to a conflicting
window allocation.
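The wrap-around can be shown with a small sketch, assuming 32-bit
inclusive window bounds. struct window, gap_after() and
gap_after_unchecked() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct window {
	uint32_t start;
	uint32_t end;	/* inclusive */
};

/* Unchecked: wraps to 0 when the child ends at 0xffffffff. */
static uint32_t gap_after_unchecked(const struct window *child)
{
	return child->end + 1;
}

/*
 * Checked: report "no gap" when the child already ends at the top of
 * the address space, instead of letting the addition wrap.
 */
static bool gap_after(const struct window *child, uint32_t *next_start)
{
	assert(child->start <= child->end);

	if (child->end == UINT32_MAX)
		return false;
	*next_start = child->end + 1;
	return true;
}
```

For a last child ending at 0xffffffff the unchecked version returns 0,
which makes the whole range look free; the checked version returns
false and leaves *next_start untouched.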
Michal Ludvig reported this issue seen on his platform. The following
patch fixes the problem and has been verified by Michal. I believe this
bug has been there for ages. It got exposed by git commit 2bbc6942273b
("PCI : ability to relocate assigned pci-resources")
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Tested-by: Michal Ludvig <mludvig@logix.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 30 Sep 2011 02:58:58 +0000 (19:58 -0700)]
Merge branch 'for-linus' of git://github.com/NewDreamNetwork/ceph-client
* 'for-linus' of git://github.com/NewDreamNetwork/ceph-client:
libceph: fix pg_temp mapping update
libceph: fix pg_temp mapping calculation
libceph: fix linger request requeuing
libceph: fix parse options memory leak
libceph: initialize ack_stamp to avoid unnecessary connection reset
Linus Torvalds [Fri, 30 Sep 2011 02:29:45 +0000 (19:29 -0700)]
Merge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus
* 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus:
[media] omap3isp: Fix build error in ispccdc.c
[media] uvcvideo: Fix crash when linking entities
[media] v4l: Make sure we hold a reference to the v4l2_device before using it
[media] v4l: Fix use-after-free case in v4l2_device_release
[media] uvcvideo: Set alternate setting 0 on resume if the bus has been reset
[media] OMAP_VOUT: Fix build break caused by update_mode removal in DSS2
Linus Torvalds [Fri, 30 Sep 2011 02:28:26 +0000 (19:28 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] cio: fix cio_tpi ignoring adapter interrupts
[S390] gmap: always up mmap_sem properly
[S390] Do not clobber personality flags on exec
Linus Torvalds [Fri, 30 Sep 2011 02:24:33 +0000 (19:24 -0700)]
Merge git://github.com/davem330/sparc
* git://github.com/davem330/sparc:
sparc64: Force the execute bit in OpenFirmware's translation entries.
sparc: Make '-p' boot option meaningful again.
sparc, exec: remove redundant addr_limit assignment
sparc64: Future proof Niagara cpu detection.
Linus Torvalds [Fri, 30 Sep 2011 02:23:30 +0000 (19:23 -0700)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux
* 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux:
drm/i915: FBC off for ironlake and older, otherwise on by default
drm/i915: Enable SDVO hotplug interrupts for HDMI and DVI
drm/i915: Enable dither whenever display bpc < frame buffer bpc
Benjamin Herrenschmidt [Thu, 29 Sep 2011 05:57:01 +0000 (15:57 +1000)]
powerpc: Fix device-tree matching for Apple U4 bridge
Apple Quad G5 has some oddity in its device-tree which causes the new
generic matching code to fail to relate nodes for PCI-E devices below U4
with their respective struct pci_dev. This breaks graphics on those
machines among others.
This fixes it using a quirk which copies the node pointer from the host
bridge for the root complex, which makes the generic code work for the
children afterward.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
wangyanqing [Thu, 29 Sep 2011 07:09:40 +0000 (15:09 +0800)]
bootup: move 'usermodehelper_enable()' a little earlier
Commit d5767c53535a ("bootup: move 'usermodehelper_enable()' to the end
of do_basic_setup()") moved 'usermodehelper_enable()' to the end of
do_basic_setup(), after the initcalls. But with that change uvesafb
fails to work on my computer, and I lose the boot splash.
So maybe we could call usermodehelper_enable() a little earlier, so that
tasks that need early init with the help of user mode keep working.
[ I would *really* prefer that initcalls not call into user space - even
the real 'init' hasn't been execve'd yet, after all! But for uvesafb
it really does look like we don't have much choice.
I considered doing this when we mount the root filesystem, but
depending on config options that is in multiple places. We could do
the usermode helper enable as a rootfs_initcall()..
So I'm just using wang yanqing's trivial patch. It's not wonderful,
but it's simple and should work. We should revisit this some day,
though. - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Olsa [Thu, 29 Sep 2011 15:05:08 +0000 (17:05 +0200)]
perf tools: Fix raw sample reading
The wrong pointer is passed for raw data sanity checking when parsing
a sample event.
This results in an invalid event, and perf record gets stuck in the
__perf_session__process_events function while processing build IDs
(process_buildids function).
Following command hangs up in my setup:
./perf record -e raw_syscalls:sys_enter ls
The fix is to use proper pointer to the raw data instead of the 'u'
union.
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1317308709-9474-2-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David S. Miller [Thu, 29 Sep 2011 19:18:59 +0000 (12:18 -0700)]
sparc64: Force the execute bit in OpenFirmware's translation entries.
In the OF 'translations' property, the template TTEs in the mappings
never specify the executable bit. This is the case even though some
of these mappings are for OF's code segment.
Therefore, we need to force the execute bit on in every mapping.
This problem can only really trigger on Niagara/sun4v machines and the
history behind this is a little complicated.
Previous to sun4v, the sun4u TTE entries lacked a hardware execute
permission bit. So OF didn't have to ever worry about setting
anything to handle executable pages. Any valid TTE loaded into the
I-TLB would be respected by the chip.
But sun4v Niagara chips have a real hardware enforced executable bit
in their TTEs. So it has to be set or else the I-TLB throws an
instruction access exception with type code 6 (protection violation).
We've been extremely fortunate to not get bitten by this in the past.
The best I can tell is that the OF's mappings for its executable code
were mapped using permanent locked mappings on sun4v in the past.
Therefore, the fact that we didn't have the exec bit set in the OF
translations we would use did not matter in practice.
Thanks to Greg Onufer for helping me track this down.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 28 Sep 2011 17:23:44 +0000 (10:23 -0700)]
bootup: move 'usermodehelper_enable()' to the end of do_basic_setup()
Doing it just before starting to call into cpu_idle() made a sick kind
of sense only because the original bug we fixed (see commit
288d5abec831: "Boot up with usermodehelper disabled") was about problems
with some scheduler data structures not being initialized, and they had
better be initialized at that point.
But it really didn't make any other conceptual sense, and doing it after
the initial "schedule()" call for the idle thread actually opened up a
race: what if the main initialization thread did everything without
needing to sleep, and got all the way into user land too? Without
actually having scheduled back to the idle thread?
Now, in normal circumstances that doesn't ever happen, but it looks like
Richard Cochran triggered exactly that on his ARM IXP4xx machines:
"I have some ARM IXP4xx based machines that use the two on chip MAC
ports (aka NPEs). The NPE needs a firmware in order to function.
Ever since the following commit [that 288d5abec831 one], it is no
longer possible to bring up the interfaces during the init scripts."
with a call trace showing an ioctl coming from user space. Richard says:
"The init is busybox, and the startup script does mount, syslogd, and
then ifup, so that all can go by quickly."
The fix is to move the usermodehelper_enable() into the main 'init'
thread, and just put it after we've done all our initcalls. By then,
everything really should be up, but we've obviously not actually started
the user-mode portion of init yet.
Reported-and-tested-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sage Weil [Wed, 28 Sep 2011 17:11:04 +0000 (10:11 -0700)]
libceph: fix pg_temp mapping update
The incremental map updates have a record for each pg_temp mapping that is
to be added/updated (len > 0) or removed (len == 0). The old code was
written as if the updates were a complete enumeration; that was just wrong.
Update the code to remove 0-length entries and drop the rbtree traversal.
This avoids misdirected (and hung) requests that manifest as server
errors like
[WRN] client4104 10.0.1.219:0/275025290 misdirected client4104.1:129 0.1 to osd0 not [1,0] in e11/11
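A toy sketch of the incremental semantics described above, assuming a
fixed-size table indexed by pg number. struct model_pg_temp,
struct model_update and apply_incremental() are hypothetical and do not
reflect the rbtree used by the real code.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_PGS 8

/* len[i] == 0 means "no pg_temp mapping for pg i". */
struct model_pg_temp {
	unsigned int len[MODEL_MAX_PGS];
};

/* One record of an incremental map update. */
struct model_update {
	unsigned int pgid;
	unsigned int len;
};

/*
 * Apply an incremental update: each record touches exactly one pg.
 * len > 0 adds or replaces the mapping, len == 0 removes it. Mappings
 * not named in the update are left alone, since the update is not a
 * complete enumeration.
 */
static void apply_incremental(struct model_pg_temp *map,
			      const struct model_update *upd, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(upd[i].pgid < MODEL_MAX_PGS);
		map->len[upd[i].pgid] = upd[i].len;
	}
}
```

Treating the update as a full enumeration would instead drop every
mapping that the update does not mention.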
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil [Wed, 28 Sep 2011 17:08:27 +0000 (10:08 -0700)]
libceph: fix pg_temp mapping calculation
We need to apply the modulo pg_num calculation before looking up a pgid in
the pg_temp mapping rbtree. This fixes pg_temp mappings, and fixes
(some) misdirected requests that result in messages like
[WRN] client4104 10.0.1.219:0/275025290 misdirected client4104.1:129 0.1 to osd0 not [1,0] in e11/11
on the server and make the client block without getting a reply (at
least until the pg_temp mapping goes away, but that can take a long long
time).
Reorder calc_pg_raw() a bit to make more sense.
Signed-off-by: Sage Weil <sage@newdream.net>
Linus Torvalds [Wed, 28 Sep 2011 15:39:05 +0000 (08:39 -0700)]
Merge git://github.com/davem330/net
* git://github.com/davem330/net:
ipv6-multicast: Fix memory leak in IPv6 multicast.
ipv6: check return value for dst_alloc
net: check return value for dst_alloc
ipv6-multicast: Fix memory leak in input path.
bnx2x: add missing break in bnx2x_dcbnl_get_cap
bnx2x: fix WOL by enablement PME in config space
bnx2x: fix hw attention handling
net: fix a typo in Documentation/networking/scaling.txt
ath9k: Fix a dma warning/memory leak
rtlwifi: rtl8192cu: Fix unitialized struct
iwlagn: fix dangling scan request
batman-adv: do_bcast has to be true for broadcast packets only
cfg80211: Fix validation of AKM suites
iwlegacy: do not use interruptible waits
iwlegacy: fix command queue timeout
ath9k_hw: Fix Rx DMA stuck for AR9003 chips
Linus Torvalds [Wed, 28 Sep 2011 15:23:39 +0000 (08:23 -0700)]
Merge git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6
* git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6:
[SCSI] 3w-9xxx: fix iommu_iova leak
[SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference
[SCSI] scsi: qla4xxx needs libiscsi.o
[SCSI] libsas: fix failure to revalidate domain for anything but the first expander child.
[SCSI] aacraid: reset should disable MSI interrupt
Guenter Roeck [Sat, 24 Sep 2011 22:27:04 +0000 (15:27 -0700)]
hwmon: (coretemp) Avoid leaving around dangling pointer
Storing the struct temp_data pointer allocated from create_core_data()
when returning an error has the potential of leaving around a pointer
to freed memory. Reset it to NULL for error returns.
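The pattern can be sketched in user space as follows. All names
(struct model_pdata, model_create_core_data(), the fail_late flag) are
hypothetical; fail_late simply simulates an error that occurs after the
pointer has been stored.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_CORES 4

struct model_temp_data {
	int cpu;
};

struct model_pdata {
	struct model_temp_data *core_data[MODEL_MAX_CORES];
};

/* Allocate per-core data and publish it; undo both on a late error. */
static int model_create_core_data(struct model_pdata *pdata, unsigned int idx,
				  int fail_late)
{
	struct model_temp_data *td;

	assert(idx < MODEL_MAX_CORES);

	td = calloc(1, sizeof(*td));
	if (!td)
		return -1;

	pdata->core_data[idx] = td;

	if (fail_late) {
		/* Error path: clear the stored pointer before freeing. */
		pdata->core_data[idx] = NULL;
		free(td);
		return -1;
	}
	return 0;
}
```

Without the reset, core_data[idx] would keep pointing at freed memory
after the error return.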
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Wed, 28 Sep 2011 15:11:00 +0000 (08:11 -0700)]
hwmon: (coretemp) Fixup platform device ID change
With recent change "hwmon: (coretemp) don't use kernel assigned CPU
number as platform device ID", the microcode check is now running on
a random CPU. Fix that by checking the microcode before creating the
platform device rather than at probe time.
Also avoid calling TO_PHYS_ID(cpu) twice in the same function, it's
expensive.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Linus Torvalds [Wed, 28 Sep 2011 15:03:00 +0000 (08:03 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Free queue resources at blk_release_queue()
Linus Torvalds [Wed, 28 Sep 2011 15:01:05 +0000 (08:01 -0700)]
Merge branch 'writeback-for-linus' of git://github.com/fengguang/linux
* 'writeback-for-linus' of git://github.com/fengguang/linux:
writeback: show raw dirtied_when in trace writeback_single_inode
Hannes Reinecke [Wed, 28 Sep 2011 14:07:01 +0000 (08:07 -0600)]
block: Free queue resources at blk_release_queue()
A kernel crash is observed when a mounted ext3/ext4 filesystem is
physically removed. The problem is that blk_cleanup_queue() frees up
some resources, e.g. by calling elevator_exit(), which are not checked for
in normal operation. So we should rather move these calls to the
destructor function blk_release_queue() as at that point all remaining
references are gone. However, in doing so we have to ensure that any
externally supplied queue_lock is disconnected as the driver might free
up the lock after the call of blk_cleanup_queue().
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
David S. Miller [Wed, 28 Sep 2011 02:42:30 +0000 (22:42 -0400)]
Merge branch 'for-davem' of git://git.infradead.org/users/linville/wireless
Linus Torvalds [Tue, 27 Sep 2011 22:48:34 +0000 (15:48 -0700)]
Linux 3.1-rc8
Linus Torvalds [Tue, 27 Sep 2011 22:46:21 +0000 (15:46 -0700)]
Merge branch 'for-linus' of git://github.com/tiwai/sound
* 'for-linus' of git://github.com/tiwai/sound:
ASoC: ssm2602: Re-enable oscillator after suspend
ALSA: usb-audio: Check for possible chip NULL pointer before clearing probing flag
ALSA: hda/realtek - Don't detect LO jack when identical with HP
ALSA: hda/realtek - Avoid bogus HP-pin assignment
ALSA: HDA: No power nids on 92HD93
ASoC: omap-mcbsp: Do not attempt to change DAI sysclk if stream is active
Linus Torvalds [Tue, 27 Sep 2011 22:41:32 +0000 (15:41 -0700)]
Merge branch 'pm-fixes' of git://github.com/rjwysocki/linux-pm
* 'pm-fixes' of git://github.com/rjwysocki/linux-pm:
PM / Clocks: Do not acquire a mutex under a spinlock
John W. Linville [Tue, 27 Sep 2011 19:47:33 +0000 (15:47 -0400)]
Merge branch 'master' of git://git.infradead.org/users/linville/wireless into for-davem
Ben Greear [Fri, 23 Sep 2011 13:11:01 +0000 (13:11 +0000)]
ipv6-multicast: Fix memory leak in IPv6 multicast.
If reg_vif_xmit cannot find a routing entry, be sure to
free the skb before returning the error.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 26 Sep 2011 07:04:56 +0000 (07:04 +0000)]
ipv6: check return value for dst_alloc
return value of dst_alloc must be checked before use
Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 26 Sep 2011 07:04:36 +0000 (07:04 +0000)]
net: check return value for dst_alloc
return value of dst_alloc must be checked before use
Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Greear [Tue, 27 Sep 2011 19:16:08 +0000 (15:16 -0400)]
ipv6-multicast: Fix memory leak in input path.
Have to free the skb before returning if we fail
the fib lookup.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 27 Sep 2011 19:05:47 +0000 (15:05 -0400)]
Merge branch 'batman-adv/maint' of git://git.open-mesh.org/linux-merge
Shmulik Ravid [Thu, 22 Sep 2011 02:33:33 +0000 (02:33 +0000)]
bnx2x: add missing break in bnx2x_dcbnl_get_cap
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Thu, 22 Sep 2011 02:33:32 +0000 (02:33 +0000)]
bnx2x: fix WOL by enablement PME in config space
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Thu, 22 Sep 2011 02:33:31 +0000 (02:33 +0000)]
bnx2x: fix hw attention handling
Use register name to initialize attention mask
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 27 Sep 2011 17:26:27 +0000 (13:26 -0400)]
net: fix a typo in Documentation/networking/scaling.txt
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Takashi Iwai [Tue, 27 Sep 2011 16:21:41 +0000 (18:21 +0200)]
Merge branch 'fix/asoc' into for-linus
Linus Torvalds [Tue, 27 Sep 2011 15:12:33 +0000 (08:12 -0700)]
vfs: remove LOOKUP_NO_AUTOMOUNT flag
That flag no longer makes sense, since we don't look up automount points
as eagerly any more. Additionally, it turns out that the NO_AUTOMOUNT
handling was buggy to begin with: it would avoid automounting even for
cases where we really *needed* to do the automount handling, and could
return ENOENT for autofs entries that hadn't been instantiated yet.
With our new non-eager automount semantics, one discussion has been
about adding a AT_AUTOMOUNT flag to vfs_fstatat (and thus the
newfstatat() and fstatat64() system calls), but it's probably not worth
it: you can always force at least directory automounting by simply
adding the final '/' to the filename, which works for *all* of the stat
family system calls, old and new.
So AT_NO_AUTOMOUNT (and thus LOOKUP_NO_AUTOMOUNT) really were just a
result of our bad default behavior.
Acked-by: Ian Kent <raven@themaw.net>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lars-Peter Clausen [Tue, 27 Sep 2011 09:08:46 +0000 (11:08 +0200)]
ASoC: ssm2602: Re-enable oscillator after suspend
Currently the internal oscillator is powered down when entering BIAS_OFF
state, but not re-enabled when going back to BIAS_STANDBY. As a result the
CODEC will stop working after suspend if the internal oscillator is used to
generate the sysclock signal. This patch fixes it by clearing the appropriate
bit in the power down register when the CODEC is re-enabled.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
Trond Myklebust [Tue, 27 Sep 2011 00:36:09 +0000 (20:36 -0400)]
VFS: Fix the remaining automounter semantics regressions
The consensus seems to be that system calls such as stat() etc should
not trigger an automount. Neither should the l* versions.
This patch therefore adds a LOOKUP_AUTOMOUNT flag to tag those lookups
that _should_ trigger an automount on the last path element.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ Edited to leave out the cases that are already covered by LOOKUP_OPEN,
LOOKUP_DIRECTORY and LOOKUP_CREATE - all of which also fundamentally
force automounting for their own reasons - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 27 Sep 2011 00:44:55 +0000 (17:44 -0700)]
vfs pathname lookup: Add LOOKUP_AUTOMOUNT flag
Since we've now turned around and made LOOKUP_FOLLOW *not* force an
automount, we want to add the ability to force an automount event on
lookup even if we don't happen to have one of the other flags that force
it implicitly (LOOKUP_OPEN, LOOKUP_DIRECTORY, LOOKUP_PARENT..)
Most cases will never want to use this, since you'd normally want to
delay automounting as long as possible, which usually implies
LOOKUP_OPEN (when we open a file or directory, we really cannot avoid
the automount any more).
But Trond argued sufficiently forcefully that at a minimum bind mounting
a file and quotactl will want to force the automount lookup. Some other
cases (like nfs_follow_remote_path()) could use it too, although
LOOKUP_DIRECTORY would work there as well.
This commit just adds the flag and logic, no users yet, though. It also
doesn't actually touch the LOOKUP_NO_AUTOMOUNT flag that is related, and
was made irrelevant by the same change that made us not follow on
LOOKUP_FOLLOW.
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 26 Sep 2011 23:29:26 +0000 (16:29 -0700)]
Merge branch 'samsung-fixes-3' of git://github.com/kgene/linux-samsung
* 'samsung-fixes-3' of git://github.com/kgene/linux-samsung:
ARM: EXYNOS4: Rename sclk_cam clocks for FIMC driver
ARM: S5PV210: Rename sclk_cam clocks for FIMC media driver
ARM: S5P: fix incorrect loop iterator usage on gpio-interrupt
ARM: S3C2443: Fix bit-reset in setrate of clk_armdiv
Sylwester Nawrocki [Mon, 26 Sep 2011 22:00:59 +0000 (07:00 +0900)]
ARM: EXYNOS4: Rename sclk_cam clocks for FIMC driver
The sclk_cam clocks are now controlled by the top level FIMC media
device driver bound to "s5p-fimc-md" platform device.
Rename sclk_cam clocks so they are accessible by the corresponding
driver.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Sylwester Nawrocki [Mon, 26 Sep 2011 22:00:53 +0000 (07:00 +0900)]
ARM: S5PV210: Rename sclk_cam clocks for FIMC media driver
The sclk_cam clocks are now controlled by the top level FIMC media
device driver bound to "s5p-fimc-md" platform device.
Rename sclk_cam clocks so they are accessible by the corresponding
driver.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Linus Torvalds [Mon, 26 Sep 2011 20:35:43 +0000 (13:35 -0700)]
Merge branch 'hwmon-for-linus' of git://github.com/groeck/linux
* 'hwmon-for-linus' of git://github.com/groeck/linux:
hwmon: (coretemp) remove struct platform_data * parameter from create_core_data()
hwmon: (coretemp) constify static data
hwmon: (coretemp) don't use kernel assigned CPU number as platform device ID
hwmon: (ds620) Fix handling of negative temperatures
hwmon: (w83791d) rename prototype parameter from 'register' to 'reg'
hwmon: (coretemp) Don't use threshold registers for tempX_max
hwmon: (coretemp) Let the user force TjMax
hwmon: (coretemp) Drop duplicate function get_pkg_tjmax
Linus Torvalds [Mon, 26 Sep 2011 20:33:44 +0000 (13:33 -0700)]
Merge branch 'kvm-updates/3.1' of git://github.com/avikivity/kvm
* 'kvm-updates/3.1' of git://github.com/avikivity/kvm:
KVM: x86 emulator: fix Src2CL decode
KVM: MMU: fix incorrect return of spte
Linus Torvalds [Mon, 26 Sep 2011 20:26:30 +0000 (13:26 -0700)]
Merge branch 'fixes' of ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm
* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
ARM: 7099/1: futex: preserve oldval in SMP __futex_atomic_op
ARM: dma-mapping: free allocated page if unable to map
ARM: fix vmlinux.lds.S discarding sections
ARM: nommu: fix warning with checksyscalls.sh
ARM: 7091/1: errata: D-cache line maintenance operation by MVA may not succeed
Mohammed Shafi Shajakhan [Fri, 23 Sep 2011 09:03:14 +0000 (14:33 +0530)]
ath9k: Fix a dma warning/memory leak
Proper DMA unmapping and freeing of skbs has to be done in the rx
cleanup for EDMA chipsets when the device is unloaded. This also
seems to address the following warning, which shows up occasionally when
the device is unloaded:
Call Trace:
[<c0148cd2>] warn_slowpath_common+0x72/0xa0
[<c03b669c>] ? dma_debug_device_change+0x19c/0x200
[<c03b669c>] ? dma_debug_device_change+0x19c/0x200
[<c0148da3>] warn_slowpath_fmt+0x33/0x40
[<c03b669c>] dma_debug_device_change+0x19c/0x200
[<c0657f12>] notifier_call_chain+0x82/0xb0
[<c0171370>] __blocking_notifier_call_chain+0x60/0x90
[<c01713bf>] blocking_notifier_call_chain+0x1f/0x30
[<c044f594>] __device_release_driver+0xa4/0xc0
[<c044f647>] driver_detach+0x97/0xa0
[<c044e65c>] bus_remove_driver+0x6c/0xe0
[<c029af0b>] ? sysfs_addrm_finish+0x4b/0x60
[<c0450109>] driver_unregister+0x49/0x80
[<c0299f54>] ? sysfs_remove_file+0x14/0x20
[<c03c3ab2>] pci_unregister_driver+0x32/0x80
[<f92c2162>] ath_pci_exit+0x12/0x20 [ath9k]
[<f92c8467>] ath9k_exit+0x17/0x36 [ath9k]
[<c06523cd>] ? mutex_unlock+0xd/0x10
[<c018e27f>] sys_delete_module+0x13f/0x200
[<c02139bb>] ? sys_munmap+0x4b/0x60
[<c06547c5>] ? restore_all+0xf/0xf
[<c0657a20>] ? spurious_fault+0xe0/0xe0
[<c01832f4>] ? trace_hardirqs_on_caller+0xf4/0x180
[<c065b863>] sysenter_do_call+0x12/0x38
---[ end trace 16e1c1521c06bcf9 ]---
Mapped at:
[<c03b7938>] debug_dma_map_page+0x48/0x120
[<f92ba3e8>] ath_rx_init+0x3f8/0x4b0 [ath9k]
[<f92b5ae4>] ath9k_init_device+0x4c4/0x7b0 [ath9k]
[<f92c2813>] ath_pci_probe+0x263/0x330 [ath9k]
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Fri, 23 Sep 2011 03:59:02 +0000 (22:59 -0500)]
rtlwifi: rtl8192cu: Fix unitialized struct
Driver rtl8192cu assigns a new struct rtl_tcb_desc object, but fails to
clear it.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@kernel.org> [2.6.39+]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 22 Sep 2011 21:59:04 +0000 (14:59 -0700)]
iwlagn: fix dangling scan request
If iwl_scan_initiate() fails for any reason,
priv->scan_request and priv->scan_vif are left
dangling. This can lead to a crash later when
iwl_bg_scan_completed() tries to run a pending
scan request.
In practice, this seems to be very rare due to
the STATUS_SCANNING check earlier. That check,
however, is wrong -- it should allow a scan to
be queued when a reset/roc scan is going on.
When a normal scan is already going on, a new
one can't be issued by mac80211, so that code
can be removed completely. I introduced this
bug when adding off-channel support in commit
266af4c745952e9bebf687dd68af58df553cb59d.
Cc: stable@kernel.org [3.0]
Reported-by: Peng Yan <peng.yan@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Rafael J. Wysocki [Mon, 26 Sep 2011 17:40:23 +0000 (19:40 +0200)]
PM / Clocks: Do not acquire a mutex under a spinlock
Commit b7ab83e (PM: Use spinlock instead of mutex in clock
management functions) introduced a regression causing clocks_mutex
to be acquired under a spinlock. This happens because
pm_clk_suspend() and pm_clk_resume() call pm_clk_acquire() under
pcd->lock, but pm_clk_acquire() executes clk_get() which causes
clocks_mutex to be acquired. Similarly, __pm_clk_remove(),
executed under pcd->lock, calls clk_put(), which also causes
clocks_mutex to be acquired.
To fix those problems make pm_clk_add() call pm_clk_acquire(), so
that pm_clk_suspend() and pm_clk_resume() don't have to do that.
Change pm_clk_remove() and pm_clk_destroy() to separate
modifications of the pcd->clock_list list from the actual removal of
PM clock entry objects done by __pm_clk_remove().
Reported-and-tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
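The second half of the fix (separating list modification from object removal) follows a common pattern: detach entries while holding the non-sleeping lock, then run the possibly sleeping release after dropping it. The sketch below is a userspace illustration under stated assumptions; struct toy_lock, struct clk_entry, struct clk_data and all function names are hypothetical and are not the PM clock code.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy lock that only records whether it is held (stands in for a spinlock). */
struct toy_lock {
	int held;
};

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct clk_entry {
	struct clk_entry *next;
};

struct clk_data {
	struct toy_lock lock;
	struct clk_entry *list;
};

/* Models an operation that may sleep, so it must not run under the lock. */
static void entry_release(struct clk_data *d, struct clk_entry *e)
{
	assert(!d->lock.held);
	free(e);
}

/* Adds a new entry; the allocation happens before the lock is taken. */
int clk_data_add(struct clk_data *d)
{
	struct clk_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	toy_acquire(&d->lock);
	e->next = d->list;
	d->list = e;
	toy_release(&d->lock);
	return 0;
}

/* Detaches the whole list under the lock, then releases entries outside it. */
int clk_data_destroy(struct clk_data *d)
{
	struct clk_entry *detached, *next;
	int released = 0;

	toy_acquire(&d->lock);
	detached = d->list;
	d->list = NULL;
	toy_release(&d->lock);

	for (; detached; detached = next) {
		next = detached->next;
		entry_release(d, detached);
		released++;
	}
	return released;
}
```

The assert in entry_release() plays the role that a might-sleep debugging check would play in the kernel: it trips if the release is ever reached with the lock held.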
Peter Oberparleiter [Mon, 26 Sep 2011 14:40:35 +0000 (16:40 +0200)]
[S390] cio: fix cio_tpi ignoring adapter interrupts
Ensure that adapter interrupts are correctly processed when they are
retrieved using TEST PENDING INTERRUPTION.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Carsten Otte [Mon, 26 Sep 2011 14:40:34 +0000 (16:40 +0200)]
[S390] gmap: always up mmap_sem properly
If gmap_unmap_segment figures that the segment was not mapped in the
first place, it needs to up mmap_sem on exit.
Cc: <stable@kernel.org>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Mon, 26 Sep 2011 14:40:33 +0000 (16:40 +0200)]
[S390] Do not clobber personality flags on exec
Analogous to git commit 59e4c3a2fe9cb1681bb2cff508ff79466f7585ba,
do not clear the additional personality flags on exec. We
need to inherit the personality bits in PER_MASK across exec.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
James Bottomley [Sun, 18 Sep 2011 14:56:20 +0000 (18:56 +0400)]
[SCSI] 3w-9xxx: fix iommu_iova leak
Following reports on the list, it looks like the 3w-9xxx driver will leak dma
mappings every time we get a transient queueing error back from the card.
This is because it maps the sg list in the routine that sends the command, but
doesn't unmap again in the transient failure path (even though the command is
sent back to the block layer). Fix by unmapping before returning the status.
Reported-by: Chris Boot <bootc@bootc.net>
Tested-by: Chris Boot <bootc@bootc.net>
Acked-by: Adam Radford <aradford@gmail.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Neil Horman [Tue, 6 Sep 2011 17:59:13 +0000 (13:59 -0400)]
[SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference
This oops was reported recently:
d:mon> e
cpu 0xd: Vector: 300 (Data Access) at [c0000000fd4c7120]
pc: d00000000076f194: .t3_l2t_get+0x44/0x524 [cxgb3]
lr: d000000000b02108: .init_act_open+0x150/0x3d4 [cxgb3i]
sp: c0000000fd4c73a0
msr: 8000000000009032
dar: 0
dsisr: 40000000
current = 0xc0000000fd640d40
paca = 0xc00000000054ff80
pid = 5085, comm = iscsid
d:mon> t
[c0000000fd4c7450] d000000000b02108 .init_act_open+0x150/0x3d4 [cxgb3i]
[c0000000fd4c7500] d000000000e45378 .cxgbi_ep_connect+0x784/0x8e8 [libcxgbi]
[c0000000fd4c7650] d000000000db33f0 .iscsi_if_rx+0x71c/0xb18 [scsi_transport_iscsi2]
[c0000000fd4c7740] c000000000370c9c .netlink_data_ready+0x40/0xa4
[c0000000fd4c77c0] c00000000036f010 .netlink_sendskb+0x4c/0x9c
[c0000000fd4c7850] c000000000370c18 .netlink_sendmsg+0x358/0x39c
[c0000000fd4c7950] c00000000033be24 .sock_sendmsg+0x114/0x1b8
[c0000000fd4c7b50] c00000000033d208 .sys_sendmsg+0x218/0x2ac
[c0000000fd4c7d70] c00000000033f55c .sys_socketcall+0x228/0x27c
[c0000000fd4c7e30] c0000000000086a4 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 00000080da560cfc
The root cause was an EEH error, which sent us down the offload_close path in
the cxgb3 driver, which in turn sets cdev->l2opt to NULL without regard for
upper layer drivers (like the cxgbi drivers) which might have execution contexts
in the middle of its use. The result is the oops above, when t3_l2t_get attempts
to dereference L2DATA(cdev)->nentries in arp_hash right after the EEH error
handler sets it to NULL.
The fix is to prevent the setting of the NULL pointer until after there are no
further users of it. The t3cdev->l2opt pointer is now converted to be an rcu
pointer and the L2DATA macro is now called under the protection of
rcu_read_lock(). When the EEH error path:
t3_adapter_error->offload_close->cxgb3_offload_deactivate
is executed, setting of that l2opt pointer to NULL is now gated on an rcu
quiescence point, allowing L2DATA callers to safely check for a NULL
pointer without concern that the underlying data will be freed before the
pointer is dereferenced.
This has been tested by the reporter and shown to fix the reported oops.
[nhorman: fix up uninitialised variable reported by Dan Carpenter]
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Karen Xie <kxie@chelsio.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Thomas Pfaff [Mon, 26 Sep 2011 13:43:59 +0000 (15:43 +0200)]
ALSA: usb-audio: Check for possible chip NULL pointer before clearing probing flag
Before clearing the probing flag in the error exit path, check that the
chip pointer is not NULL.
Signed-off-by: Thomas Pfaff <tpfaff@gmx.net>
Cc: <stable@kernel.org> [2.6.39+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Mon, 26 Sep 2011 13:19:55 +0000 (15:19 +0200)]
ALSA: hda/realtek - Don't detect LO jack when identical with HP
The spec->autocfg.line_out_pins[] may contain the same pins as hp_pins[]
depending on the configuration. When they are identical, detecting the
line_jack_present flag screws up the auto-mute because alc_line_automute()
is called unconditionally at initialization while it won't be triggered
by unsol events, thus the old line_jack_present flag is kept for the
whole run.
For fixing this buggy behavior, the driver needs to check whether the
line-outs are really individual, and skip if same as headphone jacks.
Reference: https://bugzilla.novell.com/show_bug.cgi?id=716104
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Will Deacon [Fri, 23 Sep 2011 13:34:12 +0000 (14:34 +0100)]
ARM: 7099/1: futex: preserve oldval in SMP __futex_atomic_op
The SMP implementation of __futex_atomic_op clobbers oldval with the
status flag from the exclusive store. This causes it to always read as
zero when performing the FUTEX_OP_CMP_* operation.
This patch updates the ARM __futex_atomic_op implementations to take a
tmp argument, allowing us to store the strex status flag without
overwriting the register containing oldval.
Cc: stable@kernel.org
Reported-by: Minho Ban <mhban@samsung.com>
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Peter Zijlstra [Fri, 16 Sep 2011 09:16:43 +0000 (11:16 +0200)]
sched, tracing: Show PREEMPT_ACTIVE state in trace_sched_switch
We had need to see the difference between scheduling a runnable task and
a runnable task being involuntarily preempted.
No app should rely on the old string output (the binary
trace event record format is not changed).
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1316164603.10174.11.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Wang Xingchao [Fri, 16 Sep 2011 17:35:52 +0000 (13:35 -0400)]
sched: Remove redundant test in check_preempt_tick()
The caller already checks for nr_running > 1, therefore we don't have
to do so again.
Signed-off-by: Wang Xingchao <xingchao.wang@intel.com>
Reviewed-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1316194552-12019-1-git-send-email-xingchao.wang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Simon Kirby [Fri, 23 Sep 2011 00:03:46 +0000 (17:03 -0700)]
sched: Fix up wchan borkage
Commit c259e01a1ec ("sched: Separate the scheduler entry for
preemption") contained a boo-boo wrecking wchan output. It forgot to
put the new schedule() function in the __sched section and thereby
doesn't get properly ignored for things like wchan.
Tested-by: Simon Kirby <sim@hostway.ca>
Cc: stable@kernel.org # 2.6.39+
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110923000346.GA25425@hostway.ca
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Takashi Iwai [Mon, 26 Sep 2011 08:41:21 +0000 (10:41 +0200)]
ALSA: hda/realtek - Avoid bogus HP-pin assignment
When the headphone pin is assigned as primary output to line_out_pins[],
the automatic HP-pin assignment by ASSID must be suppressed. Otherwise
a wrong pin might be assigned to the headphone and breaks the auto-mute.
Reference: https://bugzilla.novell.com/show_bug.cgi?id=716104
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: <stable@kernel.org>
Russell King [Thu, 22 Sep 2011 09:32:25 +0000 (10:32 +0100)]
ARM: dma-mapping: free allocated page if unable to map
If the attempt to map a page for DMA fails (eg, because we're out of
mapping space) then we must not hold on to the page we allocated for
DMA - doing so will result in a memory leak.
Cc: <stable@kernel.org>
Reported-by: Bryan Phillippe <bp@darkforest.org>
Tested-by: Bryan Phillippe <bp@darkforest.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Marek Szyprowski [Mon, 26 Sep 2011 04:16:45 +0000 (13:16 +0900)]
ARM: S5P: fix incorrect loop iterator usage on gpio-interrupt
Loop iterator value after terminating list_for_each_entry()
is not NULL. This patch fixes incorrect iterator usage in
GPIO interrupt code for SAMSUNG S5P platforms.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
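To illustrate why testing the iterator against NULL after list_for_each_entry() is wrong, here is a small userspace sketch. The list helpers are simplified stand-ins written for this example (they rely on the GCC/Clang __typeof__ extension), and struct gpio_bank and find_bank() are hypothetical names, not the S5P GPIO interrupt code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-ins for the kernel's list helpers (illustrative). */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each_entry(pos, head, member)                           \
	for (pos = container_of((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head);                                     \
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Hypothetical record type; not the driver's real structure. */
struct gpio_bank {
	int id;
	struct list_head node;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Returns the bank with the given id, or NULL if no entry matches. */
static struct gpio_bank *find_bank(struct list_head *head, int id)
{
	struct gpio_bank *pos, *found = NULL;

	assert(head != NULL);
	list_for_each_entry(pos, head, node) {
		if (pos->id == id) {
			found = pos;
			break;
		}
	}
	/*
	 * After a full traversal 'pos' is not NULL: it is computed from the
	 * list head itself, so only 'found' says whether a match exists.
	 */
	return found;
}
```

Tracking the match in a separate variable keeps the "not found" case explicit instead of relying on the iterator's value after the loop.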
Heiko Stuebner [Mon, 26 Sep 2011 01:30:29 +0000 (10:30 +0900)]
ARM: S3C2443: Fix bit-reset in setrate of clk_armdiv
The changed statement should set the old armdiv bits to 0
and not everything else, before setting the new value.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
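The difference between the two statements can be shown with a short sketch. The field position and the names ARMDIV_SHIFT, ARMDIV_MASK, set_armdiv_buggy() and set_armdiv_fixed() are assumptions made for illustration; they are not taken from the S3C2443 clock code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define ARMDIV_SHIFT 9
#define ARMDIV_MASK  (0xfu << ARMDIV_SHIFT)

/* Buggy: keeps only the old divider bits and wipes every other field. */
uint32_t set_armdiv_buggy(uint32_t reg, uint32_t div)
{
	reg &= ARMDIV_MASK;
	reg |= (div << ARMDIV_SHIFT) & ARMDIV_MASK;
	return reg;
}

/* Fixed: clears only the divider bits, then installs the new value. */
uint32_t set_armdiv_fixed(uint32_t reg, uint32_t div)
{
	assert(div <= 0xfu);
	reg &= ~ARMDIV_MASK;
	reg |= (div << ARMDIV_SHIFT) & ARMDIV_MASK;
	return reg;
}
```

With a register value of 0x1e03 and a new divider of 2, the fixed variant yields 0x0403 (other bits preserved, field replaced), while the buggy one yields 0x1e00 (other bits lost, old field bits still set).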
Oleg Nesterov [Sun, 25 Sep 2011 17:46:22 +0000 (19:46 +0200)]
ptrace: PTRACE_LISTEN forgets to unlock ->siglock
If PTRACE_LISTEN fails after lock_task_sighand() it doesn't drop ->siglock.
Reported-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
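A common way to avoid this class of bug is to route every failure after the lock is taken through a single unlock label. The following is a userspace sketch under stated assumptions: struct toy_lock, struct task, task_listen() and the -EIO error code are hypothetical and do not reproduce the ptrace code.

```c
#include <assert.h>
#include <errno.h>

/* Toy lock that only records whether it is held (stands in for ->siglock). */
struct toy_lock {
	int held;
};

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct task {
	struct toy_lock siglock;
	int stopped;
	int listening;
};

/* Returns 0 on success or a negative errno; never returns with the lock held. */
int task_listen(struct task *t)
{
	int ret = -EIO;

	toy_acquire(&t->siglock);
	if (!t->stopped)
		goto unlock;	/* failing after the lock is taken must still unlock */
	t->listening = 1;
	ret = 0;
unlock:
	toy_release(&t->siglock);
	return ret;
}
```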
Avi Kivity [Tue, 13 Sep 2011 07:45:38 +0000 (10:45 +0300)]
KVM: x86 emulator: fix Src2CL decode
Src2CL decode (used for double width shifts) erroneously decodes only bit 3
of %rcx, instead of bits 7:0.
Fix by decoding %cl in its entirety.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
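The effect of masking a single bit instead of the whole low byte can be seen in a minimal sketch. The function names are hypothetical, and the masks are an assumption derived from the description above rather than a copy of the emulator source.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy: keeps only bit 3 of the register, so most shift counts are lost. */
uint8_t src2_cl_buggy(uint64_t rcx)
{
	return (uint8_t)(rcx & 0x8);
}

/* Fixed: takes bits 7:0, that is, the whole of %cl. */
uint8_t src2_cl_fixed(uint64_t rcx)
{
	return (uint8_t)(rcx & 0xff);
}
```

For example, with %rcx holding 0x1234 the fixed decode gives a count of 0x34, while the buggy decode gives 0.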
Zhao Jin [Mon, 19 Sep 2011 04:19:51 +0000 (12:19 +0800)]
KVM: MMU: fix incorrect return of spte
__update_clear_spte_slow should return original spte while the
current code returns low half of original spte combined with high
half of new spte.
Signed-off-by: Zhao Jin <cronozhj@gmail.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
David Henningsson [Sat, 24 Sep 2011 06:30:44 +0000 (08:30 +0200)]
ALSA: HDA: No power nids on 92HD93
This patch is necessary to make internal speakers work on this chip.
Cc: stable@kernel.org
BugLink: http://bugs.launchpad.net/bugs/854468
Tested-by: Alex Wolfson <alex.wolfson@canonical.com>
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Linus Torvalds [Fri, 23 Sep 2011 23:53:16 +0000 (16:53 -0700)]
Merge branch 'spi/merge' of git://git.secretlab.ca/git/linux-2.6
* 'spi/merge' of git://git.secretlab.ca/git/linux-2.6:
spi: Fix WARN when removing spi-fsl-spi module
spi/imx: Fix spi-imx when the hardware SPI chipselects are used
Jeff Harris [Fri, 23 Sep 2011 15:49:36 +0000 (11:49 -0400)]
spi: Fix WARN when removing spi-fsl-spi module
If CPM mode is not used, the fsl_dummy_rx variable is never allocated. When
the cleanup attempts to free it, the reference count is zero and a WARN is
generated. The same CPM mode check used in the initialize is applied to the
free as well.
Tested on 2.6.33 with the previous spi_mpc8xxx driver. The renamed
spi-fsl-spi driver looks to have the same problem.
Signed-off-by: Jeff Harris <jeff_harris@kentrox.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Randy Dunlap [Fri, 23 Sep 2011 22:40:50 +0000 (15:40 -0700)]
scsi: fix qla2xxx printk format warning
sector_t can be different types, so cast it to its largest possible
type.
drivers/scsi/qla2xxx/qla_isr.c:1509:5: warning: format '%lx' expects type 'long unsigned int', but argument 5 has type 'sector_t'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
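The same idea applies to any printf-style formatting of a type whose width depends on the configuration: cast to the widest candidate and use the matching format. In this userspace sketch the sector_t typedef and format_sector() are assumptions for illustration, with snprintf() standing in for printk().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in typedef; the real type may be 32 or 64 bits wide depending on config. */
typedef uint64_t sector_t;

/* Formats 'lba' into 'buf' and returns the length snprintf() reports. */
int format_sector(char *buf, size_t len, sector_t lba)
{
	assert(buf != NULL);
	/* The cast makes the argument match %llx whatever sector_t really is. */
	return snprintf(buf, len, "lba=0x%llx", (unsigned long long)lba);
}
```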
Randy Dunlap [Fri, 23 Sep 2011 22:43:54 +0000 (15:43 -0700)]
scsi: SCSI_ISCI needs to select SCSI_SAS_HOST_SMP, fixes build error
SCSI_ISCI needs to select SCSI_SAS_HOST_SMP to ensure that all
needed symbols are available to it.
Fixes this build error:
ERROR: "try_test_sas_gpio_gp_bit" [drivers/scsi/isci/isci.ko] undefined!
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 23 Sep 2011 22:17:02 +0000 (15:17 -0700)]
Merge branch 'perf-tools-for-linus' of git://github.com/acmel/linux
* 'perf-tools-for-linus' of git://github.com/acmel/linux:
perf python: Add missing perf_event__parse_sample 'swapped' parm
Linus Torvalds [Fri, 23 Sep 2011 20:59:37 +0000 (13:59 -0700)]
Merge branch 'perf-tools-for-linus' of git://github.com/acmel/linux
* 'perf-tools-for-linus' of git://github.com/acmel/linux:
perf tools: Add support for disabling -Werror via WERROR=0
perf top: Fix userspace sample addr map offset
perf symbols: Fix issue with binaries using 16-bytes buildids (v2)
perf tool: Fix endianness handling of u32 data in samples
perf sort: Fix symbol sort output by separating unresolved samples by type
perf symbols: Synthesize anonymous mmap events
perf record: Create events initially disabled and enable after init
perf symbols: Add some heuristics for choosing the best duplicate symbol
perf symbols: Preserve symbol scope when parsing /proc/kallsyms
perf symbols: /proc/kallsyms does not sort module symbols
perf symbols: Fix ppc64 SEGV in dso__load_sym with debuginfo files
perf probe: Fix regression of variable finder
Linus Torvalds [Fri, 23 Sep 2011 19:05:53 +0000 (12:05 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms: fix DDIA enable on some rs690 systems
Revert "drm/radeon/kms: fix typo in r100_blit_copy"
Linus Torvalds [Fri, 23 Sep 2011 19:04:32 +0000 (12:04 -0700)]
Merge branch 'for-linus' of git://github.com/tiwai/sound
* 'for-linus' of git://github.com/tiwai/sound:
ALSA: usb-audio - clear chip->probing on error exit
ALSA: fm801: Gracefully handle failure of tuner auto-detect
ALSA: fm801: Fix double free in case of error in tuner detection
ASoC: Ensure we generate a driver name
ASoC: Remove bitrotted wm8962_resume()
ASoC: bf5xx-ad73311: Fix prototype for bf5xx_probe
Arnaldo Carvalho de Melo [Fri, 23 Sep 2011 18:38:53 +0000 (15:38 -0300)]
perf python: Add missing perf_event__parse_sample 'swapped' parm
Problem introduced in 936be50, that missed one perf_event__parse_sample
user, the python binding.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ja4phms9618ggi657plyuch2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jan Beulich [Fri, 23 Sep 2011 10:40:08 +0000 (06:40 -0400)]
hwmon: (coretemp) remove struct platform_data * parameter from create_core_data()
The only caller of the function obtained the pointer solely for the
purpose of passing it to this function, while it can be easily
determined from the struct platform_device * parameter also passed.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>