Eric Dumazet [Fri, 4 Oct 2013 17:31:41 +0000 (10:31 -0700)]
tcp: do not forget FIN in tcp_shifted_skb()
[ Upstream commit
5e8a402f831dbe7ee831340a91439e46f0d38acd ]
Yuchung found following problem :
There are bugs in the SACK processing code, merging part in
tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
skbs FIN flag. When a receiver first SACK the FIN sequence, and later
throw away ofo queue (e.g., sack-reneging), the sender will stop
retransmitting the FIN flag, and hangs forever.
Following packetdrill test can be used to reproduce the bug.
$ cat sack-merge-bug.pkt
`sysctl -q net.ipv4.tcp_fack=0`
// Establish a connection and send 10 MSS.
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+.000 bind(3, ..., ...) = 0
+.000 listen(3, 1) = 0
+.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.001 < . 1:1(0) ack 1 win 1024
+.000 accept(3, ..., ...) = 4
+.100 write(4, ..., 12000) = 12000
+.000 shutdown(4, SHUT_WR) = 0
+.000 > . 1:10001(10000) ack 1
+.050 < . 1:1(0) ack 2001 win 257
+.000 > FP. 10001:12001(2000) ack 1
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
// SACK reneg
+.050 < . 1:1(0) ack 12001 win 257
+0 %{ print "unacked: ",tcpi_unacked }%
+5 %{ print "" }%
First, a typo inverted left/right of one OR operation, then
code forgot to advance end_seq if the merged skb carried FIN.
Bug was added in 2.6.29 by commit
832d11c5cd076ab
("tcp: Try to restore large SKBs while SACK processing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 15 Oct 2013 18:54:30 +0000 (11:54 -0700)]
tcp: must unclone packets before mangling them
[ Upstream commit
c52e2421f7368fd36cbe330d2cf41b10452e39a9 ]
TCP stack should make sure it owns skbs before mangling them.
We had various crashes using bnx2x, and it turned out gso_size
was cleared right before bnx2x driver was populating TC descriptor
of the _previous_ packet send. TCP stack can sometime retransmit
packets that are still in Qdisc.
Of course we could make bnx2x driver more robust (using
ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack.
We have identified two points where skb_unclone() was needed.
This patch adds a WARN_ON_ONCE() to warn us if we missed another
fix of this kind.
Kudos to Neal for finding the root cause of this bug. Its visible
using small MSS.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Fri, 27 Sep 2013 10:28:54 +0000 (03:28 -0700)]
tcp: TSQ can use a dynamic limit
[ Upstream commit
c9eeec26e32e087359160406f96e0949b3cc6f10 ]
When TCP Small Queues was added, we used a sysctl to limit amount of
packets queues on Qdisc/device queues for a given TCP flow.
Problem is this limit is either too big for low rates, or too small
for high rates.
Now TCP stack has rate estimation in sk->sk_pacing_rate, and TSO
auto sizing, it can better control number of packets in Qdisc/device
queues.
New limit is two packets or at least 1 to 2 ms worth of packets.
Low rates flows benefit from this patch by having even smaller
number of packets in queues, allowing for faster recovery,
better RTT estimations.
High rates flows benefit from this patch by allowing more than 2 packets
in flight as we had reports this was a limiting factor to reach line
rate. [ In particular if TX completion is delayed because of coalescing
parameters ]
Example for a single flow on 10Gbp link controlled by FQ/pacing
14 packets in flight instead of 2
$ tc -s -d qd
qdisc fq 8001: dev eth0 root refcnt 32 limit 10000p flow_limit 100p
buckets 1024 quantum 3028 initial_quantum 15140
Sent
1168459366606 bytes
771822841 pkt (dropped 0, overlimits 0
requeues
6822476)
rate 9346Mbit 771713pps backlog
953820b 14p requeues
6822476
2047 flow, 2046 inactive, 1 throttled, delay 15673 ns
2372 gc, 0 highprio, 0 retrans,
9739249 throttled, 0 flows_plimit
Note that sk_pacing_rate is currently set to twice the actual rate, but
this might be refined in the future when a flow is in congestion
avoidance.
Additional change : skb->destructor should be set to tcp_wfree().
A future patch (for linux 3.13+) might remove tcp_limit_output_bytes
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 27 Aug 2013 12:46:32 +0000 (05:46 -0700)]
tcp: TSO packets automatic sizing
[ Upstream commits
6d36824e730f247b602c90e8715a792003e3c5a7,
02cf4ebd82ff0ac7254b88e466820a290ed8289a, and parts of
7eec4174ff29cd42f2acfae8112f51c228545d40 ]
After hearing many people over past years complaining against TSO being
bursty or even buggy, we are proud to present automatic sizing of TSO
packets.
One part of the problem is that tcp_tso_should_defer() uses an heuristic
relying on upcoming ACKS instead of a timer, but more generally, having
big TSO packets makes little sense for low rates, as it tends to create
micro bursts on the network, and general consensus is to reduce the
buffering amount.
This patch introduces a per socket sk_pacing_rate, that approximates
the current sending rate, and allows us to size the TSO packets so
that we try to send one packet every ms.
This field could be set by other transports.
Patch has no impact for high speed flows, where having large TSO packets
makes sense to reach line rate.
For other flows, this helps better packet scheduling and ACK clocking.
This patch increases performance of TCP flows in lossy environments.
A new sysctl (tcp_min_tso_segs) is added, to specify the
minimal size of a TSO packet (default being 2).
A follow-up patch will provide a new packet scheduler (FQ), using
sk_pacing_rate as an input to perform optional per flow pacing.
This explains why we chose to set sk_pacing_rate to twice the current
rate, allowing 'slow start' ramp up.
sk_pacing_rate = 2 * cwnd * mss / srtt
v2: Neal Cardwell reported a suspect deferring of last two segments on
initial write of 10 MSS, I had to change tcp_tso_should_defer() to take
into account tp->xmit_size_goal_segs
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Tom Herbert <therbert@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 18 Oct 2013 17:44:19 +0000 (10:44 -0700)]
Linux 3.10.17
Linn Crosetto [Tue, 13 Aug 2013 21:46:41 +0000 (15:46 -0600)]
x86: avoid remapping data in parse_setup_data()
commit
30e46b574a1db7d14404e52dca8e1aa5f5155fd2 upstream.
Type SETUP_PCI, added by setup_efi_pci(), may advertise a ROM size
larger than early_memremap() is able to handle, which is currently
limited to 256kB. If this occurs it leads to a NULL dereference in
parse_setup_data().
To avoid this, remap the setup_data header and allow parsing functions
for individual types to handle their own data remapping.
Signed-off-by: Linn Crosetto <linn@hp.com>
Link: http://lkml.kernel.org/r/1376430401-67445-1-git-send-email-linn@hp.com
Acked-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 30 Sep 2013 20:45:26 +0000 (13:45 -0700)]
ipc,msg: prevent race with rmid in msgsnd,msgrcv
commit
4271b05a227dc6175b66c3d9941aeab09048aeb2 upstream.
This fixes a race in both msgrcv() and msgsnd() between finding the msg
and actually dealing with the queue, as another thread can delete shmid
underneath us if we are preempted before acquiring the
kern_ipc_perm.lock.
Manfred illustrates this nicely:
Assume a preemptible kernel that is preempted just after
msq = msq_obtain_object_check(ns, msqid)
in do_msgrcv(). The only lock that is held is rcu_read_lock().
Now the other thread processes IPC_RMID. When the first task is
resumed, then it will happily wait for messages on a deleted queue.
Fix this by checking for if the queue has been deleted after taking the
lock.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reported-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 30 Sep 2013 20:45:25 +0000 (13:45 -0700)]
ipc/sem.c: update sem_otime for all operations
commit
0e8c665699e953fa58dc1b0b0d09e5dce7343cc7 upstream.
In commit
0a2b9d4c7967 ("ipc/sem.c: move wake_up_process out of the
spinlock section"), the update of semaphore's sem_otime(last semop time)
was moved to one central position (do_smart_update).
But since do_smart_update() is only called for operations that modify
the array, this means that wait-for-zero semops do not update sem_otime
anymore.
The fix is simple:
Non-alter operations must update sem_otime.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reported-by: Jia He <jiakernel@gmail.com>
Tested-by: Jia He <jiakernel@gmail.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 30 Sep 2013 20:45:07 +0000 (13:45 -0700)]
ipc/sem.c: synchronize the proc interface
commit
d8c633766ad88527f25d9f81a5c2f083d78a2b39 upstream.
The proc interface is not aware of sem_lock(), it instead calls
ipc_lock_object() directly. This means that simple semop() operations
can run in parallel with the proc interface. Right now, this is
uncritical, because the implementation doesn't do anything that requires
a proper synchronization.
But it is dangerous and therefore should be fixed.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 30 Sep 2013 20:45:06 +0000 (13:45 -0700)]
ipc/sem.c: optimize sem_lock()
commit
6d07b68ce16ae9535955ba2059dedba5309c3ca1 upstream.
Operations that need access to the whole array must guarantee that there
are no simple operations ongoing. Right now this is achieved by
spin_unlock_wait(sem->lock) on all semaphores.
If complex_count is nonzero, then this spin_unlock_wait() is not
necessary, because it was already performed in the past by the thread
that increased complex_count and even though sem_perm.lock was dropped
inbetween, no simple operation could have started, because simple
operations cannot start when complex_count is non-zero.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 30 Sep 2013 20:45:04 +0000 (13:45 -0700)]
ipc/sem.c: fix race in sem_lock()
commit
5e9d527591421ccdb16acb8c23662231135d8686 upstream.
The exclusion of complex operations in sem_lock() is insufficient: after
acquiring the per-semaphore lock, a simple op must first check that
sem_perm.lock is not locked and only after that test check
complex_count. The current code does it the other way around - and that
creates a race. Details are below.
The patch is a complete rewrite of sem_lock(), based in part on the code
from Mike Galbraith. It removes all gotos and all loops and thus the
risk of livelocks.
I have tested the patch (together with the next one) on my i3 laptop and
it didn't cause any problems.
The bug is probably also present in 3.10 and 3.11, but for these kernels
it might be simpler just to move the test of sma->complex_count after
the spin_is_locked() test.
Details of the bug:
Assume:
- sma->complex_count = 0.
- Thread 1: semtimedop(complex op that must sleep)
- Thread 2: semtimedop(simple op).
Pseudo-Trace:
Thread 1: sem_lock(): acquire sem_perm.lock
Thread 1: sem_lock(): check for ongoing simple ops
Nothing ongoing, thread 2 is still before sem_lock().
Thread 1: try_atomic_semop()
<<< preempted.
Thread 2: sem_lock():
static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
int nsops)
{
int locknum;
again:
if (nsops == 1 && !sma->complex_count) {
struct sem *sem = sma->sem_base + sops->sem_num;
/* Lock just the semaphore we are interested in. */
spin_lock(&sem->lock);
/*
* If sma->complex_count was set while we were spinning,
* we may need to look at things we did not lock here.
*/
if (unlikely(sma->complex_count)) {
spin_unlock(&sem->lock);
goto lock_array;
}
<<<<<<<<<
<<< complex_count is still 0.
<<<
<<< Here it is preempted
<<<<<<<<<
Thread 1: try_atomic_semop() returns, notices that it must sleep.
Thread 1: increases sma->complex_count.
Thread 1: drops sem_perm.lock
Thread 2:
/*
* Another process is holding the global lock on the
* sem_array; we cannot enter our critical section,
* but have to wait for the global lock to be released.
*/
if (unlikely(spin_is_locked(&sma->sem_perm.lock))) {
spin_unlock(&sem->lock);
spin_unlock_wait(&sma->sem_perm.lock);
goto again;
}
<<< sem_perm.lock already dropped, thus no "goto again;"
locknum = sops->sem_num;
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Tue, 24 Sep 2013 00:04:45 +0000 (17:04 -0700)]
ipc: fix race with LSMs
commit
53dad6d3a8e5ac1af8bacc6ac2134ae1a8b085f1 upstream.
Currently, IPC mechanisms do security and auditing related checks under
RCU. However, since security modules can free the security structure,
for example, through selinux_[sem,msg_queue,shm]_free_security(), we can
race if the structure is freed before other tasks are done with it,
creating a use-after-free condition. Manfred illustrates this nicely,
for instance with shared mem and selinux:
-> do_shmat calls rcu_read_lock()
-> do_shmat calls shm_object_check().
Checks that the object is still valid - but doesn't acquire any locks.
Then it returns.
-> do_shmat calls security_shm_shmat (e.g. selinux_shm_shmat)
-> selinux_shm_shmat calls ipc_has_perm()
-> ipc_has_perm accesses ipc_perms->security
shm_close()
-> shm_close acquires rw_mutex & shm_lock
-> shm_close calls shm_destroy
-> shm_destroy calls security_shm_free (e.g. selinux_shm_free_security)
-> selinux_shm_free_security calls ipc_free_security(&shp->shm_perm)
-> ipc_free_security calls kfree(ipc_perms->security)
This patch delays the freeing of the security structures after all RCU
readers are done. Furthermore it aligns the security life cycle with
that of the rest of IPC - freeing them based on the reference counter.
For situations where we need not free security, the current behavior is
kept. Linus states:
"... the old behavior was suspect for another reason too: having the
security blob go away from under a user sounds like it could cause
various other problems anyway, so I think the old code was at least
_prone_ to bugs even if it didn't have catastrophic behavior."
I have tested this patch with IPC testcases from LTP on both my
quad-core laptop and on a 64 core NUMA server. In both cases selinux is
enabled, and tests pass for both voluntary and forced preemption models.
While the mentioned races are theoretical (at least no one as reported
them), I wanted to make sure that this new logic doesn't break anything
we weren't aware of.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:31 +0000 (14:26 -0700)]
ipc: drop ipc_lock_check
commit
20b8875abcf2daa1dda5cf70bd6369df5e85d4c1 upstream.
No remaining users, we now use ipc_obtain_object_check().
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:30 +0000 (14:26 -0700)]
ipc, shm: drop shm_lock_check
commit
7a25dd9e042b2b94202a67e5551112f4ac87285a upstream.
This function was replaced by a the lockless shm_obtain_object_check(),
and no longer has any users.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:29 +0000 (14:26 -0700)]
ipc: drop ipc_lock_by_ptr
commit
32a2750010981216fb788c5190fb0e646abfab30 upstream.
After previous cleanups and optimizations, this function is no longer
heavily used and we don't have a good reason to keep it. Update the few
remaining callers and get rid of it.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:28 +0000 (14:26 -0700)]
ipc, shm: guard against non-existant vma in shmdt(2)
commit
530fcd16d87cd2417c472a581ba5a1e501556c86 upstream.
When !CONFIG_MMU there's a chance we can derefence a NULL pointer when the
VM area isn't found - check the return value of find_vma().
Also, remove the redundant -EINVAL return: retval is set to the proper
return code and *only* changed to 0, when we actually unmap the segments.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:26 +0000 (14:26 -0700)]
ipc: document general ipc locking scheme
commit
05603c44a7627793219b0bd9a7b236099dc9cd9d upstream.
As suggested by Andrew, add a generic initial locking scheme used
throughout all sysv ipc mechanisms. Documenting the ids rwsem, how rcu
can be enough to do the initial checks and when to actually acquire the
kern_ipc_perm.lock spinlock.
I found that adding it to util.c was generic enough.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:25 +0000 (14:26 -0700)]
ipc,msg: drop msg_unlock
commit
4718787d1f626f45ddb239912bc07266b9880044 upstream.
There is only one user left, drop this function and just call
ipc_unlock_object() and rcu_read_unlock().
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:24 +0000 (14:26 -0700)]
ipc: rename ids->rw_mutex
commit
d9a605e40b1376eb02b067d7690580255a0df68f upstream.
Since in some situations the lock can be shared for readers, we shouldn't
be calling it a mutex, rename it to rwsem.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:23 +0000 (14:26 -0700)]
ipc,shm: shorten critical region for shmat
commit
c2c737a0461e61a34676bd0bd1bc1a70a1b4e396 upstream.
Similar to other system calls, acquire the kern_ipc_perm lock after doing
the initial permission and security checks.
[sasha.levin@oracle.com: dont leave do_shmat with rcu lock held]
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:22 +0000 (14:26 -0700)]
ipc,shm: cleanup do_shmat pasta
commit
f42569b1388b1408b574a5e93a23a663647d4181 upstream.
Clean up some of the messy do_shmat() spaghetti code, getting rid of
out_free and out_put_dentry labels. This makes shortening the critical
region of this function in the next patch a little easier to do and read.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:21 +0000 (14:26 -0700)]
ipc,shm: shorten critical region for shmctl
commit
2caacaa82a51b78fc0c800e206473874094287ed upstream.
With the *_INFO, *_STAT, IPC_RMID and IPC_SET commands already optimized,
deal with the remaining SHM_LOCK and SHM_UNLOCK commands. Take the
shm_perm lock after doing the initial auditing and security checks. The
rest of the logic remains unchanged.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:20 +0000 (14:26 -0700)]
ipc,shm: make shmctl_nolock lockless
commit
c97cb9ccab8c85428ec21eff690642ad2ce1fa8a upstream.
While the INFO cmd doesn't take the ipc lock, the STAT commands do acquire
it unnecessarily. We can do the permissions and security checks only
holding the rcu lock.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:18 +0000 (14:26 -0700)]
ipc,shm: introduce shmctl_nolock
commit
68eccc1dc345539d589ae78ee43b835c1a06a134 upstream.
Similar to semctl and msgctl, when calling msgctl, the *_INFO and *_STAT
commands can be performed without acquiring the ipc object.
Add a shmctl_nolock() function and move the logic of *_INFO and *_STAT out
of msgctl(). Since we are just moving functionality, this change still
takes the lock and it will be properly lockless in the next patch.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:17 +0000 (14:26 -0700)]
ipc: drop ipcctl_pre_down
commit
3b1c4ad37741e53804ffe0a30dd01e08b2ab6241 upstream.
Now that sem, msgque and shm, through *_down(), all use the lockless
variant of ipcctl_pre_down(), go ahead and delete it.
[akpm@linux-foundation.org: fix function name in kerneldoc, cleanups]
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:16 +0000 (14:26 -0700)]
ipc,shm: shorten critical region in shmctl_down
commit
79ccf0f8c8e04e8b9eda6645ba0f63b0915a3075 upstream.
Instead of holding the ipc lock for the entire function, use the
ipcctl_pre_down_nolock and only acquire the lock for specific commands:
RMID and SET.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Wed, 11 Sep 2013 21:26:15 +0000 (14:26 -0700)]
ipc,shm: introduce lockless functions to obtain the ipc object
commit
8b8d52ac382b17a19906b930cd69e2edb0aca8ba upstream.
This is the third and final patchset that deals with reducing the amount
of contention we impose on the ipc lock (kern_ipc_perm.lock). These
changes mostly deal with shared memory, previous work has already been
done for semaphores and message queues:
http://lkml.org/lkml/2013/3/20/546 (sems)
http://lkml.org/lkml/2013/5/15/584 (mqueues)
With these patches applied, a custom shm microbenchmark stressing shmctl
doing IPC_STAT with 4 threads a million times, reduces the execution
time by 50%. A similar run, this time with IPC_SET, reduces the
execution time from 3 mins and 35 secs to 27 seconds.
Patches 1-8: replaces blindly taking the ipc lock for a smarter
combination of rcu and ipc_obtain_object, only acquiring the spinlock
when updating.
Patch 9: renames the ids rw_mutex to rwsem, which is what it already was.
Patch 10: is a trivial mqueue leftover cleanup
Patch 11: adds a brief lock scheme description, requested by Andrew.
This patch:
Add shm_obtain_object() and shm_obtain_object_check(), which will allow us
to get the ipc object without acquiring the lock. Just as with other
forms of ipc, these functions are basically wrappers around
ipc_obtain_object*().
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Tue, 3 Sep 2013 14:00:08 +0000 (16:00 +0200)]
ipc/msg.c: Fix lost wakeup in msgsnd().
commit
bebcb928c820d0ee83aca4b192adc195e43e66a2 upstream.
The check if the queue is full and adding current to the wait queue of
pending msgsnd() operations (ss_add()) must be atomic.
Otherwise:
- the thread that performs msgsnd() finds a full queue and decides to
sleep.
- the thread that performs msgrcv() first reads all messages from the
queue and then sleeps, because the queue is empty.
- the msgrcv() calls do not perform any wakeups, because the msgsnd()
task has not yet called ss_add().
- then the msgsnd()-thread first calls ss_add() and then sleeps.
Net result: msgsnd() and msgrcv() both sleep forever.
Observed with msgctl08 from ltp with a preemptible kernel.
Fix: Call ipc_lock_object() before performing the check.
The patch also moves security_msg_queue_msgsnd() under ipc_lock_object:
- msgctl(IPC_SET) explicitely mentions that it tries to expunge any
pending operations that are not allowed anymore with the new
permissions. If security_msg_queue_msgsnd() is called without locks,
then there might be races.
- it makes the patch much simpler.
Reported-and-tested-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 8 Jul 2013 23:01:26 +0000 (16:01 -0700)]
ipc/sem.c: rename try_atomic_semop() to perform_atomic_semop(), docu update
commit
758a6ba39ef6df4cdc615e5edd7bd86eab81a5f7 upstream.
Cleanup: Some minor points that I noticed while writing the previous
patches
1) The name try_atomic_semop() is misleading: The function performs the
operation (if it is possible).
2) Some documentation updates.
No real code change, a rename and documentation changes.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 8 Jul 2013 23:01:25 +0000 (16:01 -0700)]
ipc/sem.c: replace shared sem_otime with per-semaphore value
commit
d12e1e50e47e0900dbbf52237b7e171f4f15ea1e upstream.
sem_otime contains the time of the last semaphore operation that
completed successfully. Every operation updates this value, thus access
from multiple cpus can cause thrashing.
Therefore the patch replaces the variable with a per-semaphore variable.
The per-array sem_otime is only calculated when required.
No performance improvement on a single-socket i3 - only important for
larger systems.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 8 Jul 2013 23:01:24 +0000 (16:01 -0700)]
ipc/sem.c: always use only one queue for alter operations
commit
f269f40ad5aeee229ed70044926f44318abe41ef upstream.
There are two places that can contain alter operations:
- the global queue: sma->pending_alter
- the per-semaphore queues: sma->sem_base[].pending_alter.
Since one of the queues must be processed first, this causes an odd
priorization of the wakeups: complex operations have priority over
simple ops.
The patch restores the behavior of linux <=3.0.9: The longest waiting
operation has the highest priority.
This is done by using only one queue:
- if there are complex ops, then sma->pending_alter is used.
- otherwise, the per-semaphore queues are used.
As a side effect, do_smart_update_queue() becomes much simpler: no more
goto logic.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 8 Jul 2013 23:01:23 +0000 (16:01 -0700)]
ipc/sem: separate wait-for-zero and alter tasks into seperate queues
commit
1a82e9e1d0f1b45f47a97c9e2349020536ff8987 upstream.
Introduce separate queues for operations that do not modify the
semaphore values. Advantages:
- Simpler logic in check_restart().
- Faster update_queue(): Right now, all wait-for-zero operations are
always tested, even if the semaphore value is not 0.
- wait-for-zero gets again priority, as in linux <=3.0.9
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 8 Jul 2013 23:01:22 +0000 (16:01 -0700)]
ipc/sem.c: cacheline align the semaphore structures
commit
f5c936c0f267ec58641451cf8b8d39b4c207ee4d upstream.
As now each semaphore has its own spinlock and parallel operations are
possible, give each semaphore its own cacheline.
On a i3 laptop, this gives up to 28% better performance:
#semscale 10 | grep "interleave 2"
- before:
Cpus 1, interleave 2 delay 0:
36109234 in 10 secs
Cpus 2, interleave 2 delay 0:
55276317 in 10 secs
Cpus 3, interleave 2 delay 0:
62411025 in 10 secs
Cpus 4, interleave 2 delay 0:
81963928 in 10 secs
-after:
Cpus 1, interleave 2 delay 0:
35527306 in 10 secs
Cpus 2, interleave 2 delay 0:
70922909 in 10 secs <<< + 28%
Cpus 3, interleave 2 delay 0:
80518538 in 10 secs
Cpus 4, interleave 2 delay 0:
89115148 in 10 secs <<< + 8.7%
i3, with 2 cores and with hyperthreading enabled. Interleave 2 in order
use first the full cores. HT partially hides the delay from cacheline
trashing, thus the improvement is "only" 8.7% if 4 threads are running.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Manfred Spraul [Mon, 8 Jul 2013 23:01:20 +0000 (16:01 -0700)]
ipc/util.c, ipc_rcu_alloc: cacheline align allocation
commit
196aa0132fc7261f34b10ae1bfb44abc1bc69b3c upstream.
Enforce that ipc_rcu_alloc returns a cacheline aligned pointer on SMP.
Rationale:
The SysV sem code tries to move the main spinlock into a seperate
cacheline (____cacheline_aligned_in_smp). This works only if
ipc_rcu_alloc returns cacheline aligned pointers. vmalloc and kmalloc
return cacheline algined pointers, the implementation of ipc_rcu_alloc
breaks that.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:19 +0000 (16:01 -0700)]
ipc: remove unused functions
commit
9ad66ae65fc8d3e7e3344310fb0aa835910264fe upstream.
We can now drop the msg_lock and msg_lock_check functions along with a
bogus comment introduced previously in semctl_down.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:18 +0000 (16:01 -0700)]
ipc,msg: shorten critical region in msgrcv
commit
41a0d523d0f626e9da0dc01de47f1b89058033cf upstream.
do_msgrcv() is the last msg queue function that abuses the ipc lock Take
it only when needed when actually updating msq.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:17 +0000 (16:01 -0700)]
ipc,msg: shorten critical region in msgsnd
commit
3dd1f784ed6603d7ab1043e51e6371235edf2313 upstream.
do_msgsnd() is another function that does too many things with the ipc
object lock acquired. Take it only when needed when actually updating
msq.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:16 +0000 (16:01 -0700)]
ipc,msg: make msgctl_nolock lockless
commit
ac0ba20ea6f2201a1589d6dc26ad1a4f0f967bb8 upstream.
While the INFO cmd doesn't take the ipc lock, the STAT commands do
acquire it unnecessarily. We can do the permissions and security checks
only holding the rcu lock.
This function now mimics semctl_nolock().
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:15 +0000 (16:01 -0700)]
ipc,msg: introduce lockless functions to obtain the ipc object
commit
a5001a0d9768568de5d613c3b3a5b9c7721299da upstream.
Add msq_obtain_object() and msq_obtain_object_check(), which will allow
us to get the ipc object without acquiring the lock. Just as with
semaphores, these functions are basically wrappers around
ipc_obtain_object*().
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:14 +0000 (16:01 -0700)]
ipc,msg: introduce msgctl_nolock
commit
2cafed30f150f7314f98717b372df8173516cae0 upstream.
Similar to semctl, when calling msgctl, the *_INFO and *_STAT commands
can be performed without acquiring the ipc object.
Add a msgctl_nolock() function and move the logic of *_INFO and *_STAT
out of msgctl(). This change still takes the lock and it will be
properly lockless in the next patch
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:13 +0000 (16:01 -0700)]
ipc,msg: shorten critical region in msgctl_down
commit
15724ecb7e9bab35fc694c666ad563adba820cc3 upstream.
Instead of holding the ipc lock for the entire function, use the
ipcctl_pre_down_nolock and only acquire the lock for specific commands:
RMID and SET.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:12 +0000 (16:01 -0700)]
ipc: move locking out of ipcctl_pre_down_nolock
commit
7b4cc5d8411bd4e9d61d8714f53859740cf830c2 upstream.
This function currently acquires both the rw_mutex and the rcu lock on
successful lookups, leaving the callers to explicitly unlock them,
creating another two level locking situation.
Make the callers (including those that still use ipcctl_pre_down())
explicitly lock and unlock the rwsem and rcu lock.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:11 +0000 (16:01 -0700)]
ipc: close open coded spin lock calls
commit
cf9d5d78d05bca96df7618dfc3a5ee4414dcae58 upstream.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:10 +0000 (16:01 -0700)]
ipc: introduce ipc object locking helpers
commit
1ca7003ab41152d673d9e359632283d05294f3d6 upstream.
Simple helpers around the (kern_ipc_perm *)->lock spinlock.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Davidlohr Bueso [Mon, 8 Jul 2013 23:01:09 +0000 (16:01 -0700)]
ipc: move rcu lock out of ipc_addid
commit
dbfcd91f06f0e2d5564b2fd184e9c2a43675f9ab upstream.
This patchset continues the work that began in the sysv ipc semaphore
scaling series, see
https://lkml.org/lkml/2013/3/20/546
Just like semaphores used to be, sysv shared memory and msg queues also
abuse the ipc lock, unnecessarily holding it for operations such as
permission and security checks.
This patchset mostly deals with mqueues, and while shared mem can be
done in a very similar way, I want to get these patches out in the open
first. It also does some pending cleanups, mostly focused on the two
level locking we have in ipc code, taking care of ipc_addid() and
ipcctl_pre_down_nolock() - yes there are still functions that need to be
updated as well.
This patch:
Make all callers explicitly take and release the RCU read lock.
This addresses the two level locking seen in newary(), newseg() and
newqueue(). For the last two, explicitly unlock the ipc object and the
rcu lock, instead of calling the custom shm_unlock and msg_unlock
functions. The next patch will deal with the open coded locking for
->perm.lock
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wojciech kapuscinski [Tue, 1 Oct 2013 23:54:33 +0000 (19:54 -0400)]
drm/radeon: fix hw contexts for SUMO2 asics
commit
50b8f5aec04ebec7dbdf2adb17220b9148c99e63 upstream.
They have 4 rather than 8.
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=63599
Signed-off-by: wojciech kapuscinski <wojtask9@wp.pl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Deucher [Tue, 1 Oct 2013 20:40:45 +0000 (16:40 -0400)]
drm/radeon: fix typo in CP DMA register headers
commit
aa3e146d04b6ae37939daeebaec060562b3db559 upstream.
Wrong bit offset for SRC endian swapping.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Mon, 1 Jul 2013 16:39:34 +0000 (19:39 +0300)]
drm/radeon: forever loop on error in radeon_do_test_moves()
commit
89cd67b326fa95872cc2b4524cd807128db6071d upstream.
The error path does this:
for (--i; i >= 0; --i) {
which is a forever loop because "i" is unsigned.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chris Wilson [Sun, 29 Sep 2013 18:15:07 +0000 (19:15 +0100)]
drm/i915: Only apply DPMS to the encoder if enabled
commit
c9976dcf55c8aaa7037427b239f15e5acfc01a3a upstream.
The current test for an attached enabled encoder fails if we have
multiple connectors aliased to the same encoder - both connectors
believe they own the enabled encoder and so we attempt to both enable
and disable DPMS on the encoder, leading to hilarity and an OOPs:
[ 354.803064] WARNING: CPU: 0 PID: 482 at
/usr/src/linux/dist/3.11.2/drivers/gpu/drm/i915/intel_display.c:3869 intel_modeset_check_state+0x764/0x770 [i915]()
[ 354.803064] wrong connector dpms state
[ 354.803084] Modules linked in: nfsd auth_rpcgss oid_registry exportfs nfs lockd sunrpc xt_nat iptable_nat nf_nat_ipv4 nf_nat xt_limit xt_LOG xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT ipv6 xt_recent xt_conntrack nf_conntrack iptable_filter ip_tables x_tables snd_hda_codec_realtek snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_intel coretemp kvm_intel snd_hda_codec i915 kvm snd_hwdep snd_pcm_oss snd_mixer_oss crc32_pclmul snd_pcm crc32c_intel e1000e intel_agp igb ghash_clmulni_intel intel_gtt aesni_intel cfbfillrect aes_x86_64 cfbimgblt lrw cfbcopyarea drm_kms_helper ptp video thermal processor gf128mul snd_page_alloc drm snd_timer glue_helper 8250_pci snd pps_core ablk_helper agpgart cryptd sg soundcore fan i2c_algo_bit sr_mod thermal_sys 8250 i2c_i801 serial_core
hwmon cdrom i2c_core evdev button
[ 354.803086] CPU: 0 PID: 482 Comm: kworker/0:1 Not tainted 3.11.2 #1
[ 354.803087] Hardware name: Supermicro X10SAE/X10SAE, BIOS 1.00 05/03/2013 [ 354.803091] Workqueue: events console_callback
[ 354.803092]
0000000000000009 ffff88023611db48 ffffffff814048ac ffff88023611db90
[ 354.803093]
ffff88023611db80 ffffffff8103d4e3 ffff880230d82800 ffff880230f9b800
[ 354.803094]
ffff880230f99000 ffff880230f99448 ffff8802351c0e00 ffff88023611dbe0
[ 354.803094] Call Trace:
[ 354.803098] [<
ffffffff814048ac>] dump_stack+0x54/0x8d
[ 354.803101] [<
ffffffff8103d4e3>] warn_slowpath_common+0x73/0x90
[ 354.803103] [<
ffffffff8103d547>] warn_slowpath_fmt+0x47/0x50
[ 354.803109] [<
ffffffffa089f1be>] ? intel_ddi_connector_get_hw_state+0x5e/0x110 [i915]
[ 354.803114] [<
ffffffffa0896974>] intel_modeset_check_state+0x764/0x770 [i915]
[ 354.803117] [<
ffffffffa08969bb>] intel_connector_dpms+0x3b/0x60 [i915]
[ 354.803120] [<
ffffffffa037e1d0>] drm_fb_helper_dpms.isra.11+0x120/0x160 [drm_kms_helper]
[ 354.803122] [<
ffffffffa037e24e>] drm_fb_helper_blank+0x3e/0x80 [drm_kms_helper]
[ 354.803123] [<
ffffffff812116c2>] fb_blank+0x52/0xc0
[ 354.803125] [<
ffffffff8121e04b>] fbcon_blank+0x21b/0x2d0
[ 354.803127] [<
ffffffff81062243>] ? update_rq_clock.part.74+0x13/0x30
[ 354.803129] [<
ffffffff81047486>] ? lock_timer_base.isra.30+0x26/0x50
[ 354.803130] [<
ffffffff810472b2>] ? internal_add_timer+0x12/0x40
[ 354.803131] [<
ffffffff81047f48>] ? mod_timer+0xf8/0x1c0
[ 354.803133] [<
ffffffff81266d61>] do_unblank_screen+0xa1/0x1c0
[ 354.803134] [<
ffffffff81268087>] poke_blanked_console+0xc7/0xd0
[ 354.803136] [<
ffffffff812681cf>] console_callback+0x13f/0x160
[ 354.803137] [<
ffffffff81053258>] process_one_work+0x148/0x3d0
[ 354.803138] [<
ffffffff81053f19>] worker_thread+0x119/0x3a0
[ 354.803140] [<
ffffffff81053e00>] ? manage_workers.isra.30+0x2a0/0x2a0
[ 354.803141] [<
ffffffff8105994b>] kthread+0xbb/0xc0
[ 354.803142] [<
ffffffff81059890>] ? kthread_create_on_node+0x120/0x120
[ 354.803144] [<
ffffffff8140b32c>] ret_from_fork+0x7c/0xb0
[ 354.803145] [<
ffffffff81059890>] ? kthread_create_on_node+0x120/0x120
This regression goes back to the big modeset rework and the conversion
to the new dpms helpers which started with:
commit
5ab432ef4997ce32c9406721b37ef6e97e57dae1
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sat Jun 30 08:59:56 2012 +0200
drm/i915/hdmi: convert to encoder->disable/enable
Fixes: igt/kms_flip/dpms-off-confusion
Reported-and-tested-by: Wakko Warner <wakko@animx.eu.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68030
Link: http://lkml.kernel.org/r/
20130928185023.GA21672@animx.eu.org
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add regression citation, mention the igt testcase this fixes
and slap a cc: stable on the patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al Viro [Sat, 24 Aug 2013 16:08:17 +0000 (12:08 -0400)]
cope with potentially long ->d_dname() output for shmem/hugetlb
commit
118b23022512eb2f41ce42db70dc0568d00be4ba upstream.
dynamic_dname() is both too much and too little for those - the
output may be well in excess of 64 bytes dynamic_dname() assumes
to be enough (thanks to ashmem feeding really long names to
shmem_file_setup()) and vsnprintf() is an overkill for those
guys.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Colin Cross <ccross@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Henningsson [Mon, 7 Oct 2013 08:39:59 +0000 (10:39 +0200)]
ALSA: hda - Fix mono speakers and headset mic on Dell Vostro 5470
This is a backport for stable. The original commit SHA is
338cae565c53755de9f87d6a801517940d2d56f7.
On this machine, DAC on node 0x03 seems to give mono output.
Also, it needs additional patches for headset mic support.
It supports CTIA style headsets only.
Alsa-info available at the bug link below.
BugLink: https://bugs.launchpad.net/bugs/1236228
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ingo Molnar [Thu, 10 Oct 2013 08:16:30 +0000 (10:16 +0200)]
compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
commit
3f0116c3238a96bc18ad4b4acefe4e7be32fa861 upstream.
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down
a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto'
constructs, as outlined here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
Implement a workaround suggested by Jakub Jelinek.
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131015062351.GA4666@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Fri, 23 Aug 2013 08:40:59 +0000 (11:40 +0300)]
watchdog: ts72xx_wdt: locking bug in ioctl
commit
8612ed0d97abcf1c016d34755b7cf2060de71963 upstream.
Calling the WDIOC_GETSTATUS & WDIOC_GETBOOTSTATUS and twice will cause a
interruptible deadlock.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vineet Gupta [Thu, 10 Oct 2013 14:03:57 +0000 (19:33 +0530)]
ARC: Ignore ptrace SETREGSET request for synthetic register "stop_pc"
commit
5b24282846c064ee90d40fcb3a8f63b8e754fd28 upstream.
ARCompact TRAP_S insn used for breakpoints, commits before exception is
taken (updating architectural PC). So ptregs->ret contains next-PC and
not the breakpoint PC itself. This is different from other restartable
exceptions such as TLB Miss where ptregs->ret has exact faulting PC.
gdb needs to know exact-PC hence ARC ptrace GETREGSET provides for
@stop_pc which returns ptregs->ret vs. EFA depending on the
situation.
However, writing stop_pc (SETREGSET request), which updates ptregs->ret
doesn't makes sense stop_pc doesn't always correspond to that reg as
described above.
This was not an issue so far since user_regs->ret / user_regs->stop_pc
had same value and both writing to ptregs->ret was OK, needless, but NOT
broken, hence not observed.
With gdb "jump", they diverge, and user_regs->ret updating ptregs is
overwritten immediately with stop_pc, which this patch fixes.
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christian Ruppert [Wed, 2 Oct 2013 09:13:38 +0000 (11:13 +0200)]
ARC: Fix signal frame management for SA_SIGINFO
commit
10469350e345599dfef3fa78a7c19fb230e674c1 upstream.
Previously, when a signal was registered with SA_SIGINFO, parameters 2
and 3 of the signal handler were written to registers r1 and r2 before
the register set was saved. This led to corruption of these two
registers after returning from the signal handler (the wrong values were
restored).
With this patch, registers are now saved before any parameters are
passed, thus maintaining the processor state from before signal entry.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vineet Gupta [Wed, 25 Sep 2013 11:23:32 +0000 (16:53 +0530)]
ARC: Workaround spinlock livelock in SMP SystemC simulation
commit
6c00350b573c0bd3635436e43e8696951dd6e1b6 upstream.
Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
can only use atomic EX insn (reg with mem) to build higher level R-M-W
primitives. This includes a SystemC based SMP simulation model.
So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
operation to update reader(s)/writer count.
The spinlock operation itself looks as follows:
mov reg, 1 ; 1=locked, 0=unlocked
retry:
EX reg, [lock] ; load existing, store 1, atomically
BREQ reg, 1, rety ; if already locked, retry
In single-threaded simulation, SystemC alternates between the 2 cores
with "N" insn each based scheduling. Additionally for insn with global
side effect, such as EX writing to shared mem, a core switch is
enforced too.
Given that, 2 cores doing a repeated EX on same location, Linux often
got into a livelock e.g. when both cores were fiddling with tasklist
lock (gdbserver / hackbench) for read/write respectively as the
sequence diagram below shows:
core1 core2
-------- --------
1. spin lock [EX r=0, w=1] - LOCKED
2. rwlock(Read) - LOCKED
3. spin unlock [ST 0] - UNLOCKED
spin lock [EX r=0,w=1] - LOCKED
-- resched core 1----
5. spin lock [EX r=1] - ALREADY-LOCKED
-- resched core 2----
6. rwlock(Write) - READER-LOCKED
7. spin unlock [ST 0]
8. rwlock failed, retry again
9. spin lock [EX r=0, w=1]
-- resched core 1----
10 spinlock locked in #9, retry #5
11. spin lock [EX gets 1]
-- resched core 2----
...
...
The fix was to unlock using the EX insn too (step 7), to trigger another
SystemC scheduling pass which would let core1 proceed, eliding the
livelock.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vineet Gupta [Thu, 26 Sep 2013 13:20:40 +0000 (18:50 +0530)]
ARC: Fix 32-bit wrap around in access_ok()
commit
0752adfda15f0eca9859a76da3db1800e129ad43 upstream.
Anton reported
| LTP tests syscalls/process_vm_readv01 and process_vm_writev01 fail
| similarly in one testcase test_iov_invalid -> lvec->iov_base.
| Testcase expects errno EFAULT and return code -1,
| but it gets return code 1 and ERRNO is 0 what means success.
Essentially test case was passing a pointer of -1 which access_ok()
was not catching. It was doing [@addr + @sz <= TASK_SIZE] which would
pass for @addr == -1
Fixed that by rewriting as [@addr <= TASK_SIZE - @sz]
Reported-by: Anton Kolesov <Anton.Kolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mischa Jonker [Thu, 26 Sep 2013 13:44:56 +0000 (15:44 +0200)]
ARC: Handle zero-overhead-loop in unaligned access handler
commit
c11eb222fd7d4db91196121dbf854178505d2751 upstream.
If a load or store is the last instruction in a zero-overhead-loop, and
it's misaligned, the loop would execute only once.
This fixes that problem.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mischa Jonker [Fri, 30 Aug 2013 09:56:25 +0000 (11:56 +0200)]
ARC: Fix __udelay calculation
commit
7efd0da2d17360e1cef91507dbe619db0ee2c691 upstream.
Cast usecs to u64, to ensure that the (usecs * 4295 * HZ)
multiplication is 64 bit.
Initially, the (usecs * 4295 * HZ) part was done as a 32 bit
multiplication, with the result casted to 64 bit. This led to some bits
falling off, causing a "DMA initialization error" in the stmmac Ethernet
driver, due to a premature timeout.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Noam Camus [Thu, 12 Sep 2013 07:37:39 +0000 (13:07 +0530)]
ARC: SMP failed to boot due to missing IVT setup
commit
c3567f8a359b7917dcffa442301f88ed0a75211f upstream.
Commit
05b016ecf5e7a "ARC: Setup Vector Table Base in early boot" moved
the Interrupt vector Table setup out of arc_init_IRQ() which is called
for all CPUs, to entry point of boot cpu only, breaking booting of others.
Fix by adding the same to entry point of non-boot CPUs too.
read_arc_build_cfg_regs() printing IVT Base Register didn't help the
casue since it prints a synthetic value if zero which is totally bogus,
so fix that to print the exact Register.
[vgupta: Remove the now stale comment from header of arc_init_IRQ and
also added the commentary for halt-on-reset]
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vineet Gupta [Mon, 17 Jun 2013 12:57:23 +0000 (18:27 +0530)]
ARC: Setup Vector Table Base in early boot
commit
05b016ecf5e7a8c24409d8e9effb5d2ec9107708 upstream.
Otherwise early boot exceptions such as instructions errors due to
configuration mismatch between kernel and hardware go off to la-la land,
as opposed to hitting the handler and panic()'ing properly.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Russell King [Tue, 6 Aug 2013 08:49:14 +0000 (09:49 +0100)]
ARM: Fix the world famous typo with is_gate_vma()
commit
1d0bbf428924f94867542d49d436cf254b9dbd06 upstream.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Colin Cross <ccross@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Tue, 1 Oct 2013 19:54:46 +0000 (21:54 +0200)]
parisc: fix interruption handler to respect pagefault_disable()
commit
59b33f148cc08fb33cbe823fca1e34f7f023765e upstream.
Running an "echo t > /proc/sysrq-trigger" crashes the parisc kernel. The
problem is, that in print_worker_info() we try to read the workqueue info via
the probe_kernel_read() functions which use pagefault_disable() to avoid
crashes like this:
probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq));
probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
probe_kernel_read(name, wq->name, sizeof(name) - 1);
The problem here is, that the first probe_kernel_read(&pwq) might return zero
in pwq and as such the following probe_kernel_reads() try to access contents of
the page zero which is read protected and generate a kernel segfault.
With this patch we fix the interruption handler to call parisc_terminate()
directly only if pagefault_disable() was not called (in which case
preempt_count()==0). Otherwise we hand over to the pagefault handler which
will try to look up the faulting address in the fixup tables.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Mackerras [Fri, 20 Sep 2013 23:53:28 +0000 (09:53 +1000)]
KVM: PPC: Book3S HV: Fix typo in saving DSCR
commit
cfc860253abd73e1681696c08ea268d33285a2c4 upstream.
This fixes a typo in the code that saves the guest DSCR (Data Stream
Control Register) into the kvm_vcpu_arch struct on guest exit. The
effect of the typo was that the DSCR value was saved in the wrong place,
so changes to the DSCR by the guest didn't persist across guest exit
and entry, and some host kernel memory got corrupted.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Jones [Fri, 11 Oct 2013 00:05:35 +0000 (20:05 -0400)]
ext4: fix memory leak in xattr
commit
6e4ea8e33b2057b85d75175dd89b93f5e26de3bc upstream.
If we take the 2nd retry path in ext4_expand_extra_isize_ea, we
potentionally return from the function without having freed these
allocations. If we don't do the return, we over-write the previous
allocation pointers, so we leak either way.
Spotted with Coverity.
[ Fixed by tytso to set is and bs to NULL after freeing these
pointers, in case in the retry loop we later end up triggering an
error causing a jump to cleanup, at which point we could have a double
free bug. -- Ted ]
Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josef Bacik [Wed, 9 Oct 2013 16:24:04 +0000 (12:24 -0400)]
Btrfs: use right root when checking for hash collision
commit
4871c1588f92c6c13f4713a7009f25f217055807 upstream.
btrfs_rename was using the root of the old dir instead of the root of the new
dir when checking for a hash collision, so if you tried to move a file into a
subvol it would freak out because it would see the file you are trying to move
in its current root. This fixes the bug where this would fail
btrfs subvol create test1
btrfs subvol create test2
mv test1 test2.
Thanks to Chris Murphy for catching this,
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Henrik Rydberg [Wed, 2 Oct 2013 17:15:03 +0000 (19:15 +0200)]
hwmon: (applesmc) Always read until end of data
commit
25f2bd7f5add608c1d1405938f39c96927b275ca upstream.
The crash reported and investigated in commit 5f4513 turned out to be
caused by a change to the read interface on newer (2012) SMCs.
Tests by Chris show that simply reading the data valid line is enough
for the problem to go away. Additional tests show that the newer SMCs
no longer wait for the number of requested bytes, but start sending
data right away. Apparently the number of bytes to read is no longer
specified as before, but instead found out by reading until end of
data. Failure to read until end of data confuses the state machine,
which eventually causes the crash.
As a remedy, assuming bit0 is the read valid line, make sure there is
nothing more to read before leaving the read function.
Tested to resolve the original problem, and runtested on MBA3,1,
MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on
machines before 2012.
Tested-by: Chris Murphy <chris@cmurf.com>
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Taras Kondratiuk [Mon, 7 Oct 2013 10:41:59 +0000 (13:41 +0300)]
i2c: omap: Clear ARDY bit twice
commit
4cdbf7d346e7461c3b93a26707c852e2c9db3753 upstream.
Initially commit
cb527ede1bf6ff2008a025606f25344b8ed7b4ac
"i2c-omap: Double clear of ARDY status in IRQ handler"
added a workaround for undocumented errata ProDB0017052.
But then commit
1d7afc95946487945cc7f5019b41255b72224b70
"i2c: omap: ack IRQ in parts" refactored code and missed
one of ARDY clearings. So current code violates errata.
It causes often i2c bus timeouts on my Pandaboard.
This patch adds a second clearing in place.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Mon, 30 Sep 2013 15:35:10 +0000 (08:35 -0700)]
vfs: allow O_PATH file descriptors for fstatfs()
commit
9d05746e7b16d8565dddbe3200faa1e669d23bbf upstream.
Olga reported that file descriptors opened with O_PATH do not work with
fstatfs(), found during further development of ksh93's thread support.
There is no reason to not allow O_PATH file descriptors here (fstatfs is
very much a path operation), so use "fdget_raw()". See commit
55815f70147d ("vfs: make O_PATH file descriptors usable for 'fstat()'")
for a very similar issue reported for fstat() by the same team.
Reported-and-tested-by: ольга крыжановская <olga.kryzhanovska@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Theodore Ts'o [Tue, 10 Sep 2013 14:52:35 +0000 (10:52 -0400)]
random: run random_int_secret_init() run after all late_initcalls
commit
47d06e532e95b71c0db3839ebdef3fe8812fca2c upstream.
The some platforms (e.g., ARM) initializes their clocks as
late_initcalls for some unknown reason. So make sure
random_int_secret_init() is run after all of the late_initcalls are
run.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Henningsson [Fri, 11 Oct 2013 08:18:45 +0000 (10:18 +0200)]
ALSA: hda - Fix microphone for Sony VAIO Pro 13 (Haswell model)
commit
88cfcf86aa3ada84d97195bcad74f4dadb4ae23b upstream.
The external mic showed up with a precense detect of "always present",
essentially disabling the internal mic. Therefore turn off presence
detection for this pin.
Note: The external mic seems not yet working, but an internal mic is
certainly better than no mic at all.
BugLink: https://bugs.launchpad.net/bugs/1227093
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Tue, 8 Oct 2013 17:57:50 +0000 (19:57 +0200)]
ALSA: hda - Add fixup for ASUS N56VZ
commit
c6cc3d58b4042f5cadae653ff8d3df26af1a0169 upstream.
ASUS N56VZ needs a fixup for the bass speaker pin, which was already
provided via model=asus-mode4.
Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=841645
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Anssi Hannula [Mon, 7 Oct 2013 16:24:52 +0000 (19:24 +0300)]
ALSA: hda - hdmi: Fix channel map switch not taking effect
commit
39edac70e9aedf451fccaa851b273ace9fcca0bd upstream.
Currently hdmi_setup_audio_infoframe() reprograms the HDA channel
mapping only when the infoframe is not up-to-date or the non-PCM flag
has changed.
However, when just the channel map has been changed, the infoframe may
still be up-to-date and non-PCM flag may not have changed, so the new
channel map is not actually programmed into the HDA codec.
Notably, this failing case is also always triggered when the device is
already in a prepared state and a new channel map is configured while
changing only the channel positions (for example, plain
"speaker-test -c2 -m FR,FL").
Fix that by always programming the channel map in
hdmi_setup_audio_infoframe(). Tested on Intel HDMI.
Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Mack [Wed, 2 Oct 2013 15:49:50 +0000 (17:49 +0200)]
ALSA: snd-usb-usx2y: remove bogus frame checks
commit
a9d14bc0b188a822e42787d01e56c06fe9750162 upstream.
The frame check in i_usX2Y_urb_complete() and
i_usX2Y_usbpcm_urb_complete() is bogus and produces false positives as
described in this LAU thread:
http://linuxaudio.org/mailarchive/lau/2013/5/20/200177
This patch removes the check code entirely.
Cc: fzu@wemgehoertderstaat.de
Reported-by: Dr Nicholas J Bailey <nicholas.bailey@glasgow.ac.uk>
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Sun, 13 Oct 2013 23:08:56 +0000 (16:08 -0700)]
Linux 3.10.16
Kent Overstreet [Fri, 11 Oct 2013 00:31:15 +0000 (17:31 -0700)]
bcache: Fix a null ptr deref regression
commit
2fe80d3bbf1c8bd9efc5b8154207c8dd104e7306 upstream.
Commit
c0f04d88e46d ("bcache: Fix flushes in writeback mode") was fixing
a reported data corruption bug, but it seems some last minute
refactoring or rebasing introduced a null pointer deref.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Reported-by: Gabriel de Perthuis <g2p.code@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Tue, 10 Sep 2013 13:06:20 +0000 (15:06 +0200)]
net: qmi_wwan: add new Qualcomm devices
commit
0470667caa8261beb8a9141102b04a5357dd45b5 upstream.
Adding the device list from the Windows driver description files
included with a new Qualcomm MDM9615 based device, "Alcatel-sbell
ASB TL131 TDD LTE", from China Mobile. This device is tested
and verified to work. The others are assumed to work based on
using the same Windows driver.
Many of these devices support multiple QMI/wwan ports, requiring
multiple interface matching entries. All devices are composite,
providing a mix of one or more serial, storage or Android Debug
Brigde functions in addition to the wwan function.
This device list included an update of one previously known device,
which was incorrectly assumed to have a Gobi 2K layout. This is
corrected.
Reported-by: 王康 <scateu@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Herrmann [Mon, 9 Sep 2013 16:33:54 +0000 (18:33 +0200)]
HID: uhid: allocate static minor
commit
19872d20c890073c5207d9e02bb8f14d451a11eb upstream.
udev has this nice feature of creating "dead" /dev/<node> device-nodes if
it finds a devnode:<node> modalias. Once the node is accessed, the kernel
automatically loads the module that provides the node. However, this
requires udev to know the major:minor code to use for the node. This
feature was introduced by:
commit
578454ff7eab61d13a26b568f99a89a2c9edc881
Author: Kay Sievers <kay.sievers@vrfy.org>
Date: Thu May 20 18:07:20 2010 +0200
driver core: add devname module aliases to allow module on-demand auto-loading
However, uhid uses dynamic minor numbers so this doesn't actually work. We
need to load uhid to know which minor it's going to use.
Hence, allocate a static minor (just like uinput does) and we're good
to go.
Reported-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marcel Holtmann [Sun, 1 Sep 2013 18:02:46 +0000 (11:02 -0700)]
HID: uhid: add devname module alias
commit
60cbd53e4bf623fe978e6f23a6da642e730fde3a upstream.
For simple device node creation, add the devname module alias.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Achatz [Fri, 30 Aug 2013 12:10:07 +0000 (14:10 +0200)]
HID: roccat: add support for KonePureOptical v2
commit
a4be0ed39f2b1ea990804ea54e39bc42d17ed5a5 upstream.
KonePureOptical is a KonePure with different sensor.
Signed-off-by: Stefan Achatz <erazor_de@users.sourceforge.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josef Bacik [Thu, 22 Aug 2013 21:03:29 +0000 (17:03 -0400)]
Btrfs: remove ourselves from the cluster list under lock
commit
b8d0c69b9469ffd33df30fee3e990f2d4aa68a09 upstream.
A user was reporting weird warnings from btrfs_put_delayed_ref() and I noticed
that we were doing this list_del_init() on our head ref outside of
delayed_refs->lock. This is a problem if we have people still on the list, we
could end up modifying old pointers and such. Fix this by removing us from the
list before we do our run_delayed_ref on our head ref. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josef Bacik [Mon, 12 Aug 2013 14:56:14 +0000 (10:56 -0400)]
Btrfs: skip subvol entries when checking if we've created a dir already
commit
a05254143cd183b18002cbba7759a1e4629aa762 upstream.
We have logic to see if we've already created a parent directory by check to see
if an inode inside of that directory has a lower inode number than the one we
are currently processing. The logic is that if there is a lower inode number
then we would have had to made sure the directory was created at that previous
point. The problem is that subvols inode numbers count from the lowest objectid
in the root tree, which may be less than our current progress. So just skip if
our dir item key is a root item. This fixes the original test and the xfstest
version I made that added an extra subvol create. Thanks,
Reported-by: Emil Karlson <jekarlson@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josef Bacik [Tue, 30 Jul 2013 20:30:30 +0000 (16:30 -0400)]
Btrfs: change how we queue blocks for backref checking
commit
b6c60c8018c4e9beb2f83fc82c09f9d033766571 upstream.
Previously we only added blocks to the list to have their backrefs checked if
the level of the block is right above the one we are searching for. This is
because we want to make sure we don't add the entire path up to the root to the
lists to make sure we process things one at a time. This assumes that if any
blocks in the path to the root are going to be not checked (shared in other
words) then they will be in the level right above the current block on up. This
isn't quite right though since we can have blocks higher up the list that are
shared because they are attached to a reloc root. But we won't add this block
to be checked and then later on we will BUG_ON(!upper->checked). So instead
keep track of wether or not we've queued a block to be checked in this current
search, and if we haven't go ahead and queue it to be checked. This patch fixed
the panic I was seeing where we BUG_ON(!upper->checked). Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martin Schwidefsky [Fri, 27 Sep 2013 13:24:38 +0000 (15:24 +0200)]
s390: fix system call restart after inferior call
commit
dbbfe487e5f3fc00c9fe5207d63309859704d12f upstream.
Git commit
616498813b11ffef "s390: system call path micro optimization"
introduced a regression in regard to system call restarting and inferior
function calls via the ptrace interface. The pointer to the system call
table needs to be loaded in sysc_sigpending if do_signal returns with
TIF_SYSCALl set after it restored a system call context.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chris Metcalf [Thu, 26 Sep 2013 17:24:53 +0000 (13:24 -0400)]
tile: use a more conservative __my_cpu_offset in CONFIG_PREEMPT
commit
f862eefec0b68e099a9fa58d3761ffb10bad97e1 upstream.
It turns out the kernel relies on barrier() to force a reload of the
percpu offset value. Since we can't easily modify the definition of
barrier() to include "tp" as an output register, we instead provide a
definition of __my_cpu_offset as extended assembly that includes a fake
stack read to hazard against barrier(), forcing gcc to know that it
must reread "tp" and recompute anything based on "tp" after a barrier.
This fixes observed hangs in the slub allocator when we are looping
on a percpu cmpxchg_double.
A similar fix for ARMv7 was made in June in change
509eb76ebf97.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Mon, 30 Sep 2013 10:13:44 +0000 (12:13 +0200)]
ALSA: hda - Fix GPIO for Acer Aspire 3830TG
commit
4a4370442c996be0fd08234a167c8a127c2488bb upstream.
Acer Aspire 3830TG seems requiring GPIO bit 0 as the primary mute
control. When a machine is booted after Windows 8, the GPIO pin is
turned off and it results in the silent output.
This patch adds the manual fixup of GPIO bit 0 for this model.
Reported-by: Christopher <DIDI2002@web.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ben Skeggs [Tue, 10 Sep 2013 02:11:01 +0000 (12:11 +1000)]
drm/nouveau/bios/init: stub opcode 0xaa
commit
5495e39fb3695182b9f2a72fe4169056cada37a1 upstream.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mark Tinguely [Mon, 23 Sep 2013 17:18:58 +0000 (12:18 -0500)]
xfs: fix node forward in xfs_node_toosmall
commit
997def25e4b9cee3b01609e18a52f926bca8bd2b upstream.
Commit
f5ea1100 cleans up the disk to host conversions for
node directory entries, but because a variable is reused in
xfs_node_toosmall() the next node is not correctly found.
If the original node is small enough (<= 3/8 of the node size),
this change may incorrectly cause a node collapse when it should
not. That will cause an assert in xfstest generic/319:
Assertion failed: first <= last && last < BBTOB(bp->b_length),
file: /root/newest/xfs/fs/xfs/xfs_trans_buf.c, line: 569
Keep the original node header to get the correct forward node.
(When a node is considered for a merge with a sibling, it overwrites the
sibling pointers of the original incore nodehdr with the sibling's
pointers. This leads to loop considering the original node as a merge
candidate with itself in the second pass, and so it incorrectly
determines a merge should occur.)
[v3: added Dave Chinner's (slightly modified) suggestion to the commit header,
cleaned up whitespace. -bpm]
Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lv Zheng [Fri, 13 Sep 2013 05:13:23 +0000 (13:13 +0800)]
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler()
commit
06a8566bcf5cf7db9843a82cde7a33c7bf3947d9 upstream.
This patch fixes the issues indicated by the test results that
ipmi_msg_handler() is invoked in atomic context.
BUG: scheduling while atomic: kipmi0/18933/0x10000100
Modules linked in: ipmi_si acpi_ipmi ...
CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2
Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.
070120100606 07/01/2010
ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8
ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4
0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8
Call Trace:
<IRQ> [<
ffffffff814c4a1e>] dump_stack+0x19/0x1b
[<
ffffffff814bfbab>] __schedule_bug+0x46/0x54
[<
ffffffff814c73e0>] __schedule+0x83/0x59c
[<
ffffffff81058853>] __cond_resched+0x22/0x2d
[<
ffffffff814c794b>] _cond_resched+0x14/0x1d
[<
ffffffff814c6d82>] mutex_lock+0x11/0x32
[<
ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58
[<
ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si]
[<
ffffffff812bf6e4>] deliver_response+0x55/0x5a
[<
ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65
[<
ffffffff81007ad1>] ? read_tsc+0x9/0x19
[<
ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc
[<
ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si]
...
Also Tony Camuso says:
We were getting occasional "Scheduling while atomic" call traces
during boot on some systems. Problem was first seen on a Cisco C210
but we were able to reproduce it on a Cisco c220m3. Setting
CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around
tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device.
=================================
[ INFO: inconsistent lock state ]
2.6.32-415.el6.x86_64-debug-splck #1
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&ipmi_device->tx_msg_lock){+.?...}, at: [<
ffffffff81337a27>] ipmi_msg_handler+0x71/0x126
{SOFTIRQ-ON-W} state was registered at:
[<
ffffffff810ba11c>] __lock_acquire+0x63c/0x1570
[<
ffffffff810bb0f4>] lock_acquire+0xa4/0x120
[<
ffffffff815581cc>] __mutex_lock_common+0x4c/0x400
[<
ffffffff815586ea>] mutex_lock_nested+0x4a/0x60
[<
ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234
[<
ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be
The fix implemented by this change has been tested by Tony:
Tested the patch in a boot loop with lockdep debug enabled and never
saw the problem in over 400 reboots.
Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Grzeschik [Tue, 17 Sep 2013 13:56:06 +0000 (15:56 +0200)]
dmaengine: imx-dma: fix slow path issue in prep_dma_cyclic
commit
edc530fe7ee5a562680615d2e7cd205879c751a7 upstream.
When perparing cyclic_dma buffers by the sound layer, it will dump the
following lockdep trace. The leading snd_pcm_action_single get called
with read_lock_irq called. To fix this, we change the kcalloc call from
GFP_KERNEL to GFP_ATOMIC.
WARNING: at kernel/lockdep.c:2740 lockdep_trace_alloc+0xcc/0x114()
DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
Modules linked in:
CPU: 0 PID: 832 Comm: aplay Not tainted 3.11.0-
20130823+ #903
Backtrace:
[<
c000b98c>] (dump_backtrace+0x0/0x10c) from [<
c000bb28>] (show_stack+0x18/0x1c)
r6:
c004c090 r5:
00000009 r4:
c2e0bd18 r3:
00404000
[<
c000bb10>] (show_stack+0x0/0x1c) from [<
c02f397c>] (dump_stack+0x20/0x28)
[<
c02f395c>] (dump_stack+0x0/0x28) from [<
c001531c>] (warn_slowpath_common+0x54/0x70)
[<
c00152c8>] (warn_slowpath_common+0x0/0x70) from [<
c00153dc>] (warn_slowpath_fmt+0x38/0x40)
r8:
00004000 r7:
a3b90000 r6:
000080d0 r5:
60000093 r4:
c2e0a000 r3:
00000009
[<
c00153a4>] (warn_slowpath_fmt+0x0/0x40) from [<
c004c090>] (lockdep_trace_alloc+0xcc/0x114)
r3:
c03955d8 r2:
c03907db
[<
c004bfc4>] (lockdep_trace_alloc+0x0/0x114) from [<
c008f16c>] (__kmalloc+0x34/0x118)
r6:
000080d0 r5:
c3800120 r4:
000080d0 r3:
c040a0f8
[<
c008f138>] (__kmalloc+0x0/0x118) from [<
c019c95c>] (imxdma_prep_dma_cyclic+0x64/0x168)
r7:
a3b90000 r6:
00000004 r5:
c39d8420 r4:
c3847150
[<
c019c8f8>] (imxdma_prep_dma_cyclic+0x0/0x168) from [<
c024618c>] (snd_dmaengine_pcm_trigger+0xa8/0x160)
[<
c02460e4>] (snd_dmaengine_pcm_trigger+0x0/0x160) from [<
c0241fa8>] (soc_pcm_trigger+0x90/0xb4)
r8:
c058c7b0 r7:
c3b8140c r6:
c39da560 r5:
00000001 r4:
c3b81000
[<
c0241f18>] (soc_pcm_trigger+0x0/0xb4) from [<
c022ece4>] (snd_pcm_do_start+0x2c/0x38)
r7:
00000000 r6:
00000003 r5:
c058c7b0 r4:
c3b81000
[<
c022ecb8>] (snd_pcm_do_start+0x0/0x38) from [<
c022e958>] (snd_pcm_action_single+0x40/0x6c)
[<
c022e918>] (snd_pcm_action_single+0x0/0x6c) from [<
c022ea64>] (snd_pcm_action_lock_irq+0x7c/0x9c)
r7:
00000003 r6:
c3b810f0 r5:
c3b810f0 r4:
c3b81000
[<
c022e9e8>] (snd_pcm_action_lock_irq+0x0/0x9c) from [<
c023009c>] (snd_pcm_common_ioctl1+0x7f8/0xfd0)
r8:
c3b7f888 r7:
005407b8 r6:
c2c991c0 r5:
c3b81000 r4:
c3b81000 r3:
00004142
[<
c022f8a4>] (snd_pcm_common_ioctl1+0x0/0xfd0) from [<
c023117c>] (snd_pcm_playback_ioctl1+0x464/0x488)
[<
c0230d18>] (snd_pcm_playback_ioctl1+0x0/0x488) from [<
c02311d4>] (snd_pcm_playback_ioctl+0x34/0x40)
r8:
c3b7f888 r7:
00004142 r6:
00000004 r5:
c2c991c0 r4:
005407b8
[<
c02311a0>] (snd_pcm_playback_ioctl+0x0/0x40) from [<
c00a14a4>] (vfs_ioctl+0x30/0x44)
[<
c00a1474>] (vfs_ioctl+0x0/0x44) from [<
c00a1fe8>] (do_vfs_ioctl+0x55c/0x5c0)
[<
c00a1a8c>] (do_vfs_ioctl+0x0/0x5c0) from [<
c00a208c>] (SyS_ioctl+0x40/0x68)
[<
c00a204c>] (SyS_ioctl+0x0/0x68) from [<
c0009380>] (ret_fast_syscall+0x0/0x44)
r8:
c0009544 r7:
00000036 r6:
bedeaa58 r5:
00000000 r4:
000000c0
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Cc: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Grzeschik [Tue, 17 Sep 2013 13:56:08 +0000 (15:56 +0200)]
dmaengine: imx-dma: fix callback path in tasklet
commit
fcaaba6c7136fe47e5a13352f99a64b019b6d2c5 upstream.
We need to free the ld_active list head before jumping into the callback
routine. Otherwise the callback could run into issue_pending and change
our ld_active list head we just going to free. This will run the channel
list into an currupted and undefined state.
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Cc: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Grzeschik [Tue, 17 Sep 2013 13:56:07 +0000 (15:56 +0200)]
dmaengine: imx-dma: fix lockdep issue between irqhandler and tasklet
commit
5a276fa6bdf82fd442046969603968c83626ce0b upstream.
The tasklet and irqhandler are using spin_lock while other routines are
using spin_lock_irqsave/restore. This leads to lockdep issues as
described bellow. This patch is changing the code to use
spinlock_irq_save/restore in both code pathes.
As imxdma_xfer_desc always gets called with spin_lock_irqsave lock held,
this patch also removes the spare call inside the routine to avoid
double locking.
[ 403.358162] =================================
[ 403.362549] [ INFO: inconsistent lock state ]
[ 403.366945] 3.10.0-
20130823+ #904 Not tainted
[ 403.371331] ---------------------------------
[ 403.375721] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 403.381769] swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 403.386762] (&(&imxdma->lock)->rlock){?.-...}, at: [<
c019d77c>] imxdma_tasklet+0x20/0x134
[ 403.395201] {IN-HARDIRQ-W} state was registered at:
[ 403.400108] [<
c004b264>] mark_lock+0x2a0/0x6b4
[ 403.404798] [<
c004d7c8>] __lock_acquire+0x650/0x1a64
[ 403.410004] [<
c004f15c>] lock_acquire+0x94/0xa8
[ 403.414773] [<
c02f74e4>] _raw_spin_lock+0x54/0x8c
[ 403.419720] [<
c019d094>] dma_irq_handler+0x78/0x254
[ 403.424845] [<
c0061124>] handle_irq_event_percpu+0x38/0x1b4
[ 403.430670] [<
c00612e4>] handle_irq_event+0x44/0x64
[ 403.435789] [<
c0063a70>] handle_level_irq+0xd8/0xf0
[ 403.440903] [<
c0060a20>] generic_handle_irq+0x28/0x38
[ 403.446194] [<
c0009cc4>] handle_IRQ+0x68/0x8c
[ 403.450789] [<
c0008714>] avic_handle_irq+0x3c/0x48
[ 403.455811] [<
c0008f84>] __irq_svc+0x44/0x74
[ 403.460314] [<
c0040b04>] cpu_startup_entry+0x88/0xf4
[ 403.465525] [<
c02f00d0>] rest_init+0xb8/0xe0
[ 403.470045] [<
c03e07dc>] start_kernel+0x28c/0x2d4
[ 403.474986] [<
a0008040>] 0xa0008040
[ 403.478709] irq event stamp: 50854
[ 403.482140] hardirqs last enabled at (50854): [<
c001c6b8>] tasklet_action+0x38/0xdc
[ 403.489954] hardirqs last disabled at (50853): [<
c001c6a0>] tasklet_action+0x20/0xdc
[ 403.497761] softirqs last enabled at (50850): [<
c001bc64>] _local_bh_enable+0x14/0x18
[ 403.505741] softirqs last disabled at (50851): [<
c001c268>] irq_exit+0x88/0xdc
[ 403.513026]
[ 403.513026] other info that might help us debug this:
[ 403.519593] Possible unsafe locking scenario:
[ 403.519593]
[ 403.525548] CPU0
[ 403.528020] ----
[ 403.530491] lock(&(&imxdma->lock)->rlock);
[ 403.534828] <Interrupt>
[ 403.537474] lock(&(&imxdma->lock)->rlock);
[ 403.541983]
[ 403.541983] *** DEADLOCK ***
[ 403.541983]
[ 403.547951] no locks held by swapper/0.
[ 403.551813]
[ 403.551813] stack backtrace:
[ 403.556222] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0-
20130823+ #904
[ 403.563039] Backtrace:
[ 403.565581] [<
c000b98c>] (dump_backtrace+0x0/0x10c) from [<
c000bb28>] (show_stack+0x18/0x1c)
[ 403.574054] r6:
00000000 r5:
c05c51d8 r4:
c040bd58 r3:
00200000
[ 403.579872] [<
c000bb10>] (show_stack+0x0/0x1c) from [<
c02f398c>] (dump_stack+0x20/0x28)
[ 403.587955] [<
c02f396c>] (dump_stack+0x0/0x28) from [<
c02f29c8>] (print_usage_bug.part.28+0x224/0x28c)
[ 403.597340] [<
c02f27a4>] (print_usage_bug.part.28+0x0/0x28c) from [<
c004b404>] (mark_lock+0x440/0x6b4)
[ 403.606682] r8:
c004a41c r7:
00000000 r6:
c040bd58 r5:
c040c040 r4:
00000002
[ 403.613566] [<
c004afc4>] (mark_lock+0x0/0x6b4) from [<
c004d844>] (__lock_acquire+0x6cc/0x1a64)
[ 403.622244] [<
c004d178>] (__lock_acquire+0x0/0x1a64) from [<
c004f15c>] (lock_acquire+0x94/0xa8)
[ 403.631010] [<
c004f0c8>] (lock_acquire+0x0/0xa8) from [<
c02f74e4>] (_raw_spin_lock+0x54/0x8c)
[ 403.639614] [<
c02f7490>] (_raw_spin_lock+0x0/0x8c) from [<
c019d77c>] (imxdma_tasklet+0x20/0x134)
[ 403.648434] r6:
c3847010 r5:
c040e890 r4:
c38470d4
[ 403.653194] [<
c019d75c>] (imxdma_tasklet+0x0/0x134) from [<
c001c70c>] (tasklet_action+0x8c/0xdc)
[ 403.662013] r8:
c0599160 r7:
00000000 r6:
00000000 r5:
c040e890 r4:
c3847114 r3:
c019d75c
[ 403.670042] [<
c001c680>] (tasklet_action+0x0/0xdc) from [<
c001bd4c>] (__do_softirq+0xe4/0x1f0)
[ 403.678687] r7:
00000101 r6:
c0402000 r5:
c059919c r4:
00000001
[ 403.684498] [<
c001bc68>] (__do_softirq+0x0/0x1f0) from [<
c001c268>] (irq_exit+0x88/0xdc)
[ 403.692652] [<
c001c1e0>] (irq_exit+0x0/0xdc) from [<
c0009cc8>] (handle_IRQ+0x6c/0x8c)
[ 403.700514] r4:
00000030 r3:
00000110
[ 403.704192] [<
c0009c5c>] (handle_IRQ+0x0/0x8c) from [<
c0008714>] (avic_handle_irq+0x3c/0x48)
[ 403.712664] r5:
c0403f28 r4:
c0593ebc
[ 403.716343] [<
c00086d8>] (avic_handle_irq+0x0/0x48) from [<
c0008f84>] (__irq_svc+0x44/0x74)
[ 403.724733] Exception stack(0xc0403f28 to 0xc0403f70)
[ 403.729841] 3f20:
00000001 00000004 00000000 20000013 c0402000 c04104a8
[ 403.738078] 3f40:
00000002 c0b69620 a0004000 41069264 a03fb5f4 c0403f7c c0403f40 c0403f70
[ 403.746301] 3f60:
c004b92c c0009e74 20000013 ffffffff
[ 403.751383] r6:
ffffffff r5:
20000013 r4:
c0009e74 r3:
c004b92c
[ 403.757210] [<
c0009e30>] (arch_cpu_idle+0x0/0x4c) from [<
c0040b04>] (cpu_startup_entry+0x88/0xf4)
[ 403.766161] [<
c0040a7c>] (cpu_startup_entry+0x0/0xf4) from [<
c02f00d0>] (rest_init+0xb8/0xe0)
[ 403.774753] [<
c02f0018>] (rest_init+0x0/0xe0) from [<
c03e07dc>] (start_kernel+0x28c/0x2d4)
[ 403.783051] r6:
c03fc484 r5:
ffffffff r4:
c040a0e0
[ 403.787797] [<
c03e0550>] (start_kernel+0x0/0x2d4) from [<
a0008040>] (0xa0008040)
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Cc: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rafał Miłecki [Thu, 10 Oct 2013 05:56:07 +0000 (07:56 +0200)]
Revert "drm/radeon: add missing hdmi callbacks for rv6xx"
This reverts commit
b2a9484006875ecd7d94582e7bcb72a02682be92.
Commit
99d79aa2f3b7729e7290e8bda5d0dd8b0240ec62 (backported by
b2a9484006875ecd7d94582e7bcb72a02682be92) was supposed to fix rv6xx_asic
struct.
In kernel 3.10 we didn't have that struct yet, so the original patch
should never be backported to the 3.10. Accidentally it has applied and
modified different struct (r520_asic) that shouldn't have any HDMI
callbacks at all.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Frederic Weisbecker [Mon, 23 Sep 2013 22:50:25 +0000 (00:50 +0200)]
irq: Force hardirq exit's softirq processing on its own stack
commit
ded797547548a5b8e7b92383a41e4c0e6b0ecb7f upstream.
The commit
facd8b80c67a3cf64a467c4a2ac5fb31f2e6745b
("irq: Sanitize invoke_softirq") converted irq exit
calls of do_softirq() to __do_softirq() on all architectures,
assuming it was only used there for its irq disablement
properties.
But as a side effect, the softirqs processed in the end
of the hardirq are always called on the inline current
stack that is used by irq_exit() instead of the softirq
stack provided by the archs that override do_softirq().
The result is mostly safe if the architecture runs irq_exit()
on a separate irq stack because then softirqs are processed
on that same stack that is near empty at this stage (assuming
hardirq aren't nesting).
Otherwise irq_exit() runs in the task stack and so does the softirq
too. The interrupted call stack can be randomly deep already and
the softirq can dig through it even further. To add insult to the
injury, this softirq can be interrupted by a new hardirq, maximizing
the chances for a stack overrun as reported in powerpc for example:
do_IRQ: stack overflow: 1920
CPU: 0 PID: 1602 Comm: qemu-system-ppc Not tainted 3.10.4-300.1.fc19.ppc64p7 #1
Call Trace:
[
c0000000050a8740] .show_stack+0x130/0x200 (unreliable)
[
c0000000050a8810] .dump_stack+0x28/0x3c
[
c0000000050a8880] .do_IRQ+0x2b8/0x2c0
[
c0000000050a8930] hardware_interrupt_common+0x154/0x180
--- Exception: 501 at .cp_start_xmit+0x3a4/0x820 [8139cp]
LR = .cp_start_xmit+0x390/0x820 [8139cp]
[
c0000000050a8d40] .dev_hard_start_xmit+0x394/0x640
[
c0000000050a8e00] .sch_direct_xmit+0x110/0x260
[
c0000000050a8ea0] .dev_queue_xmit+0x260/0x630
[
c0000000050a8f40] .br_dev_queue_push_xmit+0xc4/0x130 [bridge]
[
c0000000050a8fc0] .br_dev_xmit+0x198/0x270 [bridge]
[
c0000000050a9070] .dev_hard_start_xmit+0x394/0x640
[
c0000000050a9130] .dev_queue_xmit+0x428/0x630
[
c0000000050a91d0] .ip_finish_output+0x2a4/0x550
[
c0000000050a9290] .ip_local_out+0x50/0x70
[
c0000000050a9310] .ip_queue_xmit+0x148/0x420
[
c0000000050a93b0] .tcp_transmit_skb+0x4e4/0xaf0
[
c0000000050a94a0] .__tcp_ack_snd_check+0x7c/0xf0
[
c0000000050a9520] .tcp_rcv_established+0x1e8/0x930
[
c0000000050a95f0] .tcp_v4_do_rcv+0x21c/0x570
[
c0000000050a96c0] .tcp_v4_rcv+0x734/0x930
[
c0000000050a97a0] .ip_local_deliver_finish+0x184/0x360
[
c0000000050a9840] .ip_rcv_finish+0x148/0x400
[
c0000000050a98d0] .__netif_receive_skb_core+0x4f8/0xb00
[
c0000000050a99d0] .netif_receive_skb+0x44/0x110
[
c0000000050a9a70] .br_handle_frame_finish+0x2bc/0x3f0 [bridge]
[
c0000000050a9b20] .br_nf_pre_routing_finish+0x2ac/0x420 [bridge]
[
c0000000050a9bd0] .br_nf_pre_routing+0x4dc/0x7d0 [bridge]
[
c0000000050a9c70] .nf_iterate+0x114/0x130
[
c0000000050a9d30] .nf_hook_slow+0xb4/0x1e0
[
c0000000050a9e00] .br_handle_frame+0x290/0x330 [bridge]
[
c0000000050a9ea0] .__netif_receive_skb_core+0x34c/0xb00
[
c0000000050a9fa0] .netif_receive_skb+0x44/0x110
[
c0000000050aa040] .napi_gro_receive+0xe8/0x120
[
c0000000050aa0c0] .cp_rx_poll+0x31c/0x590 [8139cp]
[
c0000000050aa1d0] .net_rx_action+0x1dc/0x310
[
c0000000050aa2b0] .__do_softirq+0x158/0x330
[
c0000000050aa3b0] .irq_exit+0xc8/0x110
[
c0000000050aa430] .do_IRQ+0xdc/0x2c0
[
c0000000050aa4e0] hardware_interrupt_common+0x154/0x180
--- Exception: 501 at .bad_range+0x1c/0x110
LR = .get_page_from_freelist+0x908/0xbb0
[
c0000000050aa7d0] .list_del+0x18/0x50 (unreliable)
[
c0000000050aa850] .get_page_from_freelist+0x908/0xbb0
[
c0000000050aa9e0] .__alloc_pages_nodemask+0x21c/0xae0
[
c0000000050aaba0] .alloc_pages_vma+0xd0/0x210
[
c0000000050aac60] .handle_pte_fault+0x814/0xb70
[
c0000000050aad50] .__get_user_pages+0x1a4/0x640
[
c0000000050aae60] .get_user_pages_fast+0xec/0x160
[
c0000000050aaf10] .__gfn_to_pfn_memslot+0x3b0/0x430 [kvm]
[
c0000000050aafd0] .kvmppc_gfn_to_pfn+0x64/0x130 [kvm]
[
c0000000050ab070] .kvmppc_mmu_map_page+0x94/0x530 [kvm]
[
c0000000050ab190] .kvmppc_handle_pagefault+0x174/0x610 [kvm]
[
c0000000050ab270] .kvmppc_handle_exit_pr+0x464/0x9b0 [kvm]
[
c0000000050ab320] kvm_start_lightweight+0x1ec/0x1fc [kvm]
[
c0000000050ab4f0] .kvmppc_vcpu_run_pr+0x168/0x3b0 [kvm]
[
c0000000050ab9c0] .kvmppc_vcpu_run+0xc8/0xf0 [kvm]
[
c0000000050aba50] .kvm_arch_vcpu_ioctl_run+0x5c/0x1a0 [kvm]
[
c0000000050abae0] .kvm_vcpu_ioctl+0x478/0x730 [kvm]
[
c0000000050abc90] .do_vfs_ioctl+0x4ec/0x7c0
[
c0000000050abd80] .SyS_ioctl+0xd4/0xf0
[
c0000000050abe30] syscall_exit+0x0/0x98
Since this is a regression, this patch proposes a minimalistic
and low-risk solution by blindly forcing the hardirq exit processing of
softirqs on the softirq stack. This way we should reduce significantly
the opportunities for task stack overflow dug by softirqs.
Longer term solutions may involve extending the hardirq stack coverage to
irq_exit(), etc...
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Sat, 5 Oct 2013 20:15:30 +0000 (13:15 -0700)]
net: Update the sysctl permissions handler to test effective uid/gid
commit
2433c8f094a008895e66f25bd1773cdb01c91d01 upstream.
Modify the code to use current_euid(), and in_egroup_p, as in done
in fs/proc/proc_sysctl.c:test_perm()
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nicholas Bellinger [Thu, 3 Oct 2013 20:37:21 +0000 (13:37 -0700)]
iscsi-target: Only perform wait_for_tasks when performing shutdown
commit
e255a28598e8e63070322fc89bd34189dd660a89 upstream.
This patch changes transport_generic_free_cmd() to only wait_for_tasks
when shutdown=true is passed to iscsit_free_cmd().
With the advent of >= v3.10 iscsi-target code using se_cmd->cmd_kref,
the extra wait_for_tasks with shutdown=false is unnecessary, and may
end up causing an extra context switch when releasing WRITEs.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rafael Aquini [Mon, 30 Sep 2013 20:45:16 +0000 (13:45 -0700)]
mm: avoid reinserting isolated balloon pages into LRU lists
commit
117aad1e9e4d97448d1df3f84b08bd65811e6d6a upstream.
Isolated balloon pages can wrongly end up in LRU lists when
migrate_pages() finishes its round without draining all the isolated
page list.
The same issue can happen when reclaim_clean_pages_from_list() tries to
reclaim pages from an isolated page list, before migration, in the CMA
path. Such balloon page leak opens a race window against LRU lists
shrinkers that leads us to the following kernel panic:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000028
IP: [<
ffffffff810c2625>] shrink_page_list+0x24e/0x897
PGD
3cda2067 PUD
3d713067 PMD 0
Oops: 0000 [#1] SMP
CPU: 0 PID: 340 Comm: kswapd0 Not tainted
3.12.0-rc1-22626-g4367597 #87
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
RIP: shrink_page_list+0x24e/0x897
RSP: 0000:
ffff88003da499b8 EFLAGS:
00010286
RAX:
0000000000000000 RBX:
ffff88003e82bd60 RCX:
00000000000657d5
RDX:
0000000000000000 RSI:
000000000000031f RDI:
ffff88003e82bd40
RBP:
ffff88003da49ab0 R08:
0000000000000001 R09:
0000000081121a45
R10:
ffffffff81121a45 R11:
ffff88003c4a9a28 R12:
ffff88003e82bd40
R13:
ffff88003da0e800 R14:
0000000000000001 R15:
ffff88003da49d58
FS:
0000000000000000(0000) GS:
ffff88003fc00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000067d9000 CR3:
000000003ace5000 CR4:
00000000000407b0
Call Trace:
shrink_inactive_list+0x240/0x3de
shrink_lruvec+0x3e0/0x566
__shrink_zone+0x94/0x178
shrink_zone+0x3a/0x82
balance_pgdat+0x32a/0x4c2
kswapd+0x2f0/0x372
kthread+0xa2/0xaa
ret_from_fork+0x7c/0xb0
Code: 80 7d 8f 01 48 83 95 68 ff ff ff 00 4c 89 e7 e8 5a 7b 00 00 48 85 c0 49 89 c5 75 08 80 7d 8f 00 74 3e eb 31 48 8b 80 18 01 00 00 <48> 8b 74 0d 48 8b 78 30 be 02 00 00 00 ff d2 eb
RIP [<
ffffffff810c2625>] shrink_page_list+0x24e/0x897
RSP <
ffff88003da499b8>
CR2:
0000000000000028
---[ end trace
703d2451af6ffbfd ]---
Kernel panic - not syncing: Fatal exception
This patch fixes the issue, by assuring the proper tests are made at
putback_movable_pages() & reclaim_clean_pages_from_list() to avoid
isolated balloon pages being wrongly reinserted in LRU lists.
[akpm@linux-foundation.org: clarify awkward comment text]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christian Lamparter [Tue, 24 Sep 2013 19:56:46 +0000 (21:56 +0200)]
p54usb: add USB ID for Corega WLUSB2GTST USB adapter
commit
1e43692cdb7cc445d6347d8a5207d9cef0c71434 upstream.
Added USB ID for Corega WLUSB2GTST USB adapter.
Reported-by: Joerg Kalisch <the_force@gmx.de>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Larry Finger [Thu, 19 Sep 2013 02:21:35 +0000 (21:21 -0500)]
rtlwifi: Align private space in rtl_priv struct
commit
60ce314d1750fef843e9db70050e09e49f838b69 upstream.
The private array at the end of the rtl_priv struct is not aligned.
On ARM architecture, this causes an alignment trap and is fixed by aligning
that array with __align(sizeof(void *)). That should properly align that
space according to the requirements of all architectures.
Reported-by: Jason Andrews <jasona@cadence.com>
Tested-by: Jason Andrews <jasona@cadence.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jack Wang [Mon, 30 Sep 2013 08:09:05 +0000 (10:09 +0200)]
ib_srpt: always set response for task management
commit
c807f64340932e19f0d2ac9b30c8381e1f60663a upstream.
The SRP specification requires:
"Response data shall be provided in any SRP_RSP response that is sent in
response to an SRP_TSK_MGMT request (see 6.7). The information in the
RSP_CODE field (see table 24) shall indicate the completion status of
the task management function."
So fix this to avoid the SRP initiator interprets task management functions
that succeeded as failed.
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>