sched: fix the theoretical signal_wake_up() vs schedule() race
author Oleg Nesterov <oleg@redhat.com>
Mon, 12 Aug 2013 16:14:00 +0000 (18:14 +0200)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 9 Jan 2014 20:24:23 +0000 (12:24 -0800)
commit 57f74b6ecebf59991677dd2da0f0433e8be6c945
tree 4e3b332c34f41796e61bf7bcfb3e6bb06386e0b3
parent a29ccdd1b5a61fad7d4883b3ef63da3a313f1e44

commit e0acd0a68ec7dbf6b7a81a87a867ebd7ac9b76c4 upstream.

This is only theoretical, but after try_to_wake_up(p) was changed
to check p->state under p->pi_lock, code like

__set_current_state(TASK_INTERRUPTIBLE);
schedule();

can miss a signal. This is a special case of wait-for-condition:
it relies on the try_to_wake_up()/schedule() interaction and thus
does not need mb() between __set_current_state() and
if (signal_pending).

However, this __set_current_state() can move into the critical
section protected by rq->lock. Now that try_to_wake_up() takes
another lock, we need to ensure that it can't be reordered with
the "if (signal_pending(current))" check inside that section.

The patch is actually a one-liner: it simply adds smp_wmb() before
spin_lock_irq(rq->lock). This is what try_to_wake_up() already
does for the same reason.
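
The ordering described above can be sketched in userspace C11. This is
only a rough model, not the kernel code: struct model_task, the
model_*() functions and the MODEL_* constants are hypothetical names,
pthread mutexes stand in for p->pi_lock and rq->lock, and a full
seq_cst fence is used where the kernel only needs smp_wmb(), so the
model is stronger than the real barrier.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for TASK_RUNNING / TASK_INTERRUPTIBLE. */
enum { MODEL_RUNNING = 0, MODEL_INTERRUPTIBLE = 1 };

struct model_task {
	atomic_int state;          /* models p->state */
	atomic_bool sigpending;    /* models the pending-signal flag */
	pthread_mutex_t pi_lock;   /* taken by the waker */
	pthread_mutex_t rq_lock;   /* taken by the sleeper */
};

static void model_task_init(struct model_task *t)
{
	int rc;

	atomic_init(&t->state, MODEL_RUNNING);
	atomic_init(&t->sigpending, false);
	rc = pthread_mutex_init(&t->pi_lock, NULL);
	assert(rc == 0);
	rc = pthread_mutex_init(&t->rq_lock, NULL);
	assert(rc == 0);
	(void)rc;
}

/* Assumption: a full fence, stronger than the smp_wmb() the patch adds. */
static inline void model_mb_before_spinlock(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Waker: publish the signal, then check the state under pi_lock. */
static bool model_signal_wake_up(struct model_task *t)
{
	bool woke = false;

	atomic_store_explicit(&t->sigpending, true, memory_order_relaxed);
	model_mb_before_spinlock();
	pthread_mutex_lock(&t->pi_lock);
	if (atomic_load_explicit(&t->state, memory_order_relaxed) == MODEL_INTERRUPTIBLE) {
		atomic_store_explicit(&t->state, MODEL_RUNNING, memory_order_relaxed);
		woke = true;
	}
	pthread_mutex_unlock(&t->pi_lock);
	return woke;
}

/* Sleeper: returns true if the task would really go to sleep. */
static bool model_schedule(struct model_task *t)
{
	bool sleeps;

	/* Keep the earlier state store from sinking below the signal check. */
	model_mb_before_spinlock();
	pthread_mutex_lock(&t->rq_lock);
	if (atomic_load_explicit(&t->state, memory_order_relaxed) != MODEL_RUNNING &&
	    atomic_load_explicit(&t->sigpending, memory_order_relaxed))
		atomic_store_explicit(&t->state, MODEL_RUNNING, memory_order_relaxed);
	sleeps = atomic_load_explicit(&t->state, memory_order_relaxed) != MODEL_RUNNING;
	pthread_mutex_unlock(&t->rq_lock);
	return sleeps;
}
```

A caller would store MODEL_INTERRUPTIBLE into t->state with a relaxed
store (the analogue of __set_current_state()) and then call
model_schedule(). With both fences in place, either the waker sees the
new state or the sleeper sees the pending signal.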

We turn this wmb() into the new helper, smp_mb__before_spinlock(),
for better documentation and to allow the architectures to change
the default implementation.
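
One common way to let an architecture replace a default helper is an
#ifndef guard around the generic definition. The sketch below shows
that pattern in userspace C11 under stated assumptions: the
model_mb_before_spinlock() name, the call counter and the fence
choices are hypothetical and are not taken from the patch itself.

```c
#include <stdatomic.h>

/* Hypothetical counter, only here to show which definition was picked. */
static int model_arch_barrier_calls;

/* Stand-in for an architecture header that supplies its own version. */
#define model_mb_before_spinlock() \
	(model_arch_barrier_calls++, atomic_thread_fence(memory_order_seq_cst))

/* Stand-in for the generic header: used only if nothing was defined above. */
#ifndef model_mb_before_spinlock
#define model_mb_before_spinlock() atomic_thread_fence(memory_order_release)
#endif

/* Returns how many times the architecture override has run so far. */
static int model_lock_path(void)
{
	model_mb_before_spinlock();
	/* A lock acquisition would follow here. */
	return model_arch_barrier_calls;
}
```

Removing the first #define makes the generic fallback take effect
without touching any caller.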

While at it, kill smp_mb__after_lock(), it has no callers.

Perhaps we can also add smp_mb__before/after_spinunlock() for
prepare_to_wait().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
arch/x86/include/asm/spinlock.h
include/linux/spinlock.h
kernel/sched/core.c