md/raid5: fix bug that could result in reads from a failed device.
author NeilBrown <neilb@suse.de>
Tue, 25 Oct 2011 23:31:04 +0000 (10:31 +1100)
committer Greg Kroah-Hartman <gregkh@suse.de>
Fri, 11 Nov 2011 17:35:53 +0000 (09:35 -0800)
commit 355840e7a7e56bb2834fd3b0da64da5465f8aeaa upstream.

This bug was introduced in 415e72d034c50520ddb7ff79e7d1792c1306f0c9
which was in 2.6.36.

There is a small window of time between when a device fails and when
it is removed from the array.  During this time we might still read
from it, but we won't write to it - so it is possible that we could
read stale data.

We didn't need the test of 'Faulty' before because the test on
In_sync was sufficient.  Since we started allowing reads from the early
part of non-In_sync devices we need a test on Faulty too.

This is suitable for any kernel from 2.6.36 onwards, though the patch
might need a bit of tweaking in 3.0 and earlier.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
drivers/md/raid5.c

index 2581ba127354d8e8acf30d9ffbe273366b23c157..e509147318e6e9b2393582fd6ffa823e301cab0e 100644
@@ -3369,7 +3369,7 @@ static void handle_stripe6(struct stripe_head *sh)
                        /* Not in-sync */;
                else if (test_bit(In_sync, &rdev->flags))
                        set_bit(R5_Insync, &dev->flags);
-               else {
+               else if (!test_bit(Faulty, &rdev->flags)) {
                        /* in sync if before recovery_offset */
                        if (sh->sector + STRIPE_SECTORS <= rdev->recovery_offset)
                                set_bit(R5_Insync, &dev->flags);
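The effect of the one-line change can be illustrated with a small user-space model. This is only a sketch, not kernel code: the MODEL_* flag values, struct model_rdev, MODEL_STRIPE_SECTORS and both helper functions are hypothetical stand-ins, and the model covers only the In_sync and recovery_offset branches visible in the hunk. It also assumes that a failed device has Faulty set and In_sync clear, as the commit message implies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rdev flag bits; not the kernel's definitions. */
enum {
    MODEL_IN_SYNC = 1u << 0,
    MODEL_FAULTY  = 1u << 1
};

/* Assumed stripe size in sectors, chosen only for this illustration. */
#define MODEL_STRIPE_SECTORS 8u

/* Minimal model of the per-device state the hunk looks at. */
struct model_rdev {
    unsigned flags;
    uint64_t recovery_offset;
};

/* Before the fix: any non-In_sync device is treated as readable
 * below recovery_offset, even if it has already failed. */
static bool readable_before_fix(const struct model_rdev *rdev, uint64_t sector)
{
    if (rdev->flags & MODEL_IN_SYNC)
        return true;
    return sector + MODEL_STRIPE_SECTORS <= rdev->recovery_offset;
}

/* After the fix: the recovery_offset shortcut is skipped for a
 * device that is marked Faulty. */
static bool readable_after_fix(const struct model_rdev *rdev, uint64_t sector)
{
    if (rdev->flags & MODEL_IN_SYNC)
        return true;
    if (rdev->flags & MODEL_FAULTY)
        return false;
    return sector + MODEL_STRIPE_SECTORS <= rdev->recovery_offset;
}
```

In this model a device that failed partway through recovery (Faulty set, In_sync clear, recovery_offset nonzero) is still reported readable by the old logic for stripes below recovery_offset, while the new logic rejects it. A healthy recovering device behaves the same under both versions.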