From: Philipp Reisner <philipp.reisner@linbit.com>
Date: Thu, 29 Sep 2011 11:00:14 +0000 (+0200)
Subject: drbd: fix "stalled" empty resync
X-Git-Tag: firefly_0821_release~3680^2~1419^2~18^2~141
X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=9bcd2521827bb7c418a83a77474c449d6496d55c;p=firefly-linux-kernel-4.4.55.git

drbd: fix "stalled" empty resync

With sync-after dependencies, given "lucky" timing of pause/unpause
events, and the end of an empty (0 bits set) resync was sometimes not
detected on the SyncTarget, leading to a "stalled" SyncSource state.

Fixed this by expecting not only "Inconsistent -> UpToDate" but also
"Consistent -> UpToDate" transitions for the peer disk state
to end a resync.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index e64b1c897c82..16c3710e1b9c 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -2262,7 +2262,7 @@ int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info)
 
 	if (rs.no_resync && mdev->tconn->agreed_pro_version < 93) {
 		retcode = ERR_NEED_APV_93;
-		goto fail;
+		goto fail_ldev;
 	}
 
 	rcu_read_lock();
@@ -2272,7 +2272,7 @@ int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info)
 		new_disk_conf = kmalloc(sizeof(struct disk_conf), GFP_KERNEL);
 		if (!new_disk_conf) {
 			retcode = ERR_NOMEM;
-			goto fail;
+			goto fail_ldev;
 		}
 	}
 
@@ -2310,6 +2310,10 @@ int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info)
  fail:
 	drbd_adm_finish(info, retcode);
 	return 0;
+
+ fail_ldev:
+	put_ldev(mdev);
+	goto fail;
 }
 
 int drbd_adm_resource_opts(struct sk_buff *skb, struct genl_info *info)
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 4004a682d0ec..aba04d7dadf5 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -3748,9 +3748,14 @@ static int receive_state(struct drbd_tconn *tconn, struct packet_info *pi)
 	os = ns = drbd_read_state(mdev);
 	spin_unlock_irq(&mdev->tconn->req_lock);
 
-	/* peer says his disk is uptodate, while we think it is inconsistent,
-	 * and this happens while we think we have a sync going on. */
-	if (os.pdsk == D_INCONSISTENT && real_peer_disk == D_UP_TO_DATE &&
+	/* If this is the "end of sync" confirmation, usually the peer disk
+	 * transitions from D_INCONSISTENT to D_UP_TO_DATE. For empty (0 bits
+	 * set) resync started in PausedSyncT, or if the timing of pause-/
+	 * unpause-sync events has been "just right", the peer disk may
+	 * transition from D_CONSISTENT to D_UP_TO_DATE as well.
+	 */
+	if ((os.pdsk == D_INCONSISTENT || os.pdsk == D_CONSISTENT) &&
+	    real_peer_disk == D_UP_TO_DATE &&
 	    os.conn > C_CONNECTED && os.disk == D_UP_TO_DATE) {
 		/* If we are (becoming) SyncSource, but peer is still in sync
 		 * preparation, ignore its uptodate-ness to avoid flapping, it