genirq: Remove racy waitqueue_active check
authorChuansheng Liu <chuansheng.liu@intel.com>
Mon, 24 Feb 2014 03:29:50 +0000 (11:29 +0800)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Mon, 24 Mar 2014 04:38:15 +0000 (21:38 -0700)
commit56f1c4124bd0c769591071916abc5358b8811c1a
tree388d4eb722cdeb1ce8c076e13da61129bb7dfcef
parent3ec34db27ae6cc05ebb61732cbdeecce38c486c1
genirq: Remove racy waitqueue_active check

commit c685689fd24d310343ac33942e9a54a974ae9c43 upstream.

We hit one rare case below:

T1 calling disable_irq(), but hanging at synchronize_irq()
always;
The corresponding irq thread is in sleeping state;
And all CPUs are in idle state;

After analysis, we found there is one possible scenerio which
causes T1 is waiting there forever:
CPU0                                       CPU1
 synchronize_irq()
  wait_event()
    spin_lock()
                                           atomic_dec_and_test(&threads_active)
      insert the __wait into queue
    spin_unlock()
                                           if(waitqueue_active)
    atomic_read(&threads_active)
                                             wake_up()

Here after inserted the __wait into queue on CPU0, and before
test if queue is empty on CPU1, there is no barrier, it maybe
cause it is not visible for CPU1 immediately, although CPU0 has
updated the queue list.
It is similar for CPU0 atomic_read() threads_active also.

So we'd need one smp_mb() before waitqueue_active.that, but removing
the waitqueue_active() check solves it as wel l and it makes
things simple and clear.

Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Cc: Xiaoming Wang <xiaoming.wang@intel.com>
Link: http://lkml.kernel.org/r/1393212590-32543-1-git-send-email-chuansheng.liu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kernel/irq/manage.c