tcp: select(writefds) don't hang up when a peer close connection
authorKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tue, 24 Aug 2010 16:05:48 +0000 (16:05 +0000)
committerDavid S. Miller <davem@davemloft.net>
Thu, 26 Aug 2010 06:02:48 +0000 (23:02 -0700)
This issue come from ruby language community. Below test program
hang up when only run on Linux.

% uname -mrsv
Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686
% ruby -rsocket -ve '
BasicSocket.do_not_reverse_lookup = true
serv = TCPServer.open("127.0.0.1", 0)
s1 = TCPSocket.open("127.0.0.1", serv.addr[1])
s2 = serv.accept
s2.close
s1.write("a") rescue p $!
s1.write("a") rescue p $!
Thread.new {
  s1.write("a")
}.join'
ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux]
#<Errno::EPIPE: Broken pipe>
[Hang Here]

FreeBSD, Solaris, Mac doesn't. because Ruby's write() method call
select() internally. and tcp_poll has a bug.

SUS defined 'ready for writing' of select() as following.

|  A descriptor shall be considered ready for writing when a call to an output
|  function with O_NONBLOCK clear would not block, whether or not the function
|  would transfer data successfully.

That said, EPIPE situation is clearly one of 'ready for writing'.

We don't have read-side issue because tcp_poll() already has read side
shutdown care.

|        if (sk->sk_shutdown & RCV_SHUTDOWN)
|                mask |= POLLIN | POLLRDNORM | POLLRDHUP;

So, Let's insert same logic in write side.

- reference url
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31065
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31068

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ipv4/tcp.c

index e2add5ff9cb144516cca25233d76a98190108689..3fb1428e526eedb521057a49624fa28dde8b41cd 100644 (file)
@@ -451,7 +451,8 @@ unsigned int tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
                                if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk))
                                        mask |= POLLOUT | POLLWRNORM;
                        }
-               }
+               } else
+                       mask |= POLLOUT | POLLWRNORM;
 
                if (tp->urg_data & TCP_URG_VALID)
                        mask |= POLLPRI;