drbd: Be more careful with SyncSource -> Ahead transitions
author Philipp Reisner <philipp.reisner@linbit.com>
Thu, 23 Dec 2010 13:24:33 +0000 (14:24 +0100)
committer Philipp Reisner <philipp.reisner@linbit.com>
Thu, 10 Mar 2011 10:45:26 +0000 (11:45 +0100)
We must not go from SyncSource to Ahead while we have sent some
P_RS_DATA_REPLY packets to the peer and are still waiting for the
corresponding P_WRITE_ACK.

Again, this is not relevant for properly tuned systems, but it makes
sure that an untuned system does not end up with diverging bitmaps.
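
The two guards this patch adds can be summarized in a small stand-alone
model. This is only a sketch: struct model_dev and both helper functions
are hypothetical names invented for illustration, not DRBD code; only the
counter names ap_in_flight and rs_pending_cnt are taken from the diff
below, and their meaning is simplified here.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the relevant fields of struct drbd_conf. */
struct model_dev {
	int ap_in_flight;    /* application data sent, not yet acknowledged */
	int rs_pending_cnt;  /* resync replies sent, P_WRITE_ACK still outstanding */
	bool resync_queued;  /* models !list_empty(&start_resync_work.list) */
};

/* Entering Ahead: allowed only when congested and no resync acks are pending. */
static bool may_enter_ahead(const struct model_dev *d, bool congested)
{
	return congested && d->rs_pending_cnt == 0;
}

/* Leaving Ahead (queueing the resync): everything in flight must have drained. */
static bool may_start_resync(const struct model_dev *d)
{
	return d->ap_in_flight == 0 &&
	       d->rs_pending_cnt == 0 &&
	       !d->resync_queued;
}
```

In this model a congested device with rs_pending_cnt > 0 simply stays in
its current state until the outstanding acknowledgements arrive, which is
the behaviour the message above describes.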

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
drivers/block/drbd/drbd_receiver.c
drivers/block/drbd/drbd_req.c

index bf865bd834143ffadfc26a84069b8fe1b8227cde..fd0957f9c230546b3ab8349d1e796a1ea72d399d 100644
@@ -4385,10 +4385,11 @@ static int got_BarrierAck(struct drbd_conf *mdev, struct p_header80 *h)
 
        if (mdev->state.conn == C_AHEAD &&
            atomic_read(&mdev->ap_in_flight) == 0 &&
+           atomic_read(&mdev->rs_pending_cnt) == 0 &&
            list_empty(&mdev->start_resync_work.list)) {
                    struct drbd_work *w = &mdev->start_resync_work;
                    w->cb = w_start_resync;
-                   drbd_queue_work_front(&mdev->data.work, w);
+                   drbd_queue_work(&mdev->data.work, w);
        }
 
        return true;
index 889175110c918f72a1eb926cfbf1a4a650d11ce0..a3f6b04ebabadc6fa252ebcbe4904ac1ca8805ce 100644
@@ -1002,7 +1002,13 @@ allocate_barrier:
                        congested = 1;
                }
 
-               if (congested) {
+               if (congested && atomic_read(&mdev->rs_pending_cnt) == 0) {
+                       /* rs_pending_cnt must be zero, otherwise the two peers
+                          might get different bitmaps. With sane configurations
+                          the resync stalls long before we might want to go into
+                          AHEAD mode.
+                          We could force the resync into PAUSE mode here if
+                          rs_pending_cnt is > 0 ... */
                        queue_barrier(mdev);
 
                        if (mdev->net_conf->on_congestion == OC_PULL_AHEAD)