md/raid5: fix bug that could result in reads from a failed device.
authorNeilBrown <neilb@suse.de>
Wed, 14 Dec 2011 23:54:39 +0000 (10:54 +1100)
committerGreg Kroah-Hartman <gregkh@suse.de>
Wed, 21 Dec 2011 20:57:42 +0000 (12:57 -0800)
commit 355840e7a7e56bb2834fd3b0da64da5465f8aeaa upstream.

commit a847627709b3402163d99f7c6fda4a77bcd6b51b in linux-3.0.9
attempted to backport this to 3.0 but only made one change were two
were necessary.  This add the second change.

This bug was introduced in 415e72d034c50520ddb7ff79e7d1792c1306f0c9
which was in 2.6.36.

There is a small window of time between when a device fails and when
it is removed from the array.  During this time we might still read
from it, but we won't write to it - so it is possible that we could
read stale data.

We didn't need the test of 'Faulty' before because the test on
In_sync is sufficient.  Since we started allowing reads from the early
part of non-In_sync devices we need a test on Faulty too.

This is suitable for any kernel from 2.6.36 onwards, though the patch
might need a bit of tweaking in 3.0 and earlier.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
drivers/md/raid5.c

index cbb50d3883a40e3f5e66942edde1a008137fbfda..1f6c68df6f382de7f62fddb459c3a9d5bf2377e0 100644 (file)
@@ -3078,7 +3078,7 @@ static void handle_stripe5(struct stripe_head *sh)
                        /* Not in-sync */;
                else if (test_bit(In_sync, &rdev->flags))
                        set_bit(R5_Insync, &dev->flags);
-               else {
+               else if (!test_bit(Faulty, &rdev->flags)) {
                        /* could be in-sync depending on recovery/reshape status */
                        if (sh->sector + STRIPE_SECTORS <= rdev->recovery_offset)
                                set_bit(R5_Insync, &dev->flags);