From: Sage Weil
Date: Thu, 18 Mar 2010 22:20:53 +0000 (-0700)
Subject: ceph: fix connection fault con_work reentrancy problem
X-Git-Tag: firefly_0821_release~9833^2~2459^2~12
X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=3c3f2e32effd4c6acc3a9434bd7eecb0af653d89;p=firefly-linux-kernel-4.4.55.git

ceph: fix connection fault con_work reentrancy problem

The messenger fault was clearing the BUSY bit, for reasons unclear.  This
made it possible for the con->ops->fault function to reopen the connection,
and requeue work in the workqueue--even though the current thread was
already in con_work.

This avoids a problem where the client busy loops with connection failures
on an unreachable OSD, but doesn't address the root cause of that problem.

Signed-off-by: Sage Weil
---

diff --git a/fs/ceph/messenger.c b/fs/ceph/messenger.c
index 203c4359b549..983285540945 100644
--- a/fs/ceph/messenger.c
+++ b/fs/ceph/messenger.c
@@ -1836,8 +1836,6 @@ static void ceph_fault(struct ceph_connection *con)
 		goto out;
 	}
 
-	clear_bit(BUSY, &con->state);  /* to avoid an improbable race */
-
 	mutex_lock(&con->mutex);
 	if (test_bit(CLOSED, &con->state))
 		goto out_unlock;