kernel/5135: nfs client can't reconnect tcp mounts after disconnect

Jim Rees
>Number:         5135
>Category:       kernel
>Synopsis:       nfs client can't reconnect tcp mounts after disconnect
>Confidential:   yes
>Severity:       serious
>Priority:       medium
>Responsible:    bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri May 26 14:50:02 GMT 2006
>Originator:     Jim Rees
>Release:        3.9
>Organization:
University of Michigan CITI
>Environment:
        System      : OpenBSD 3.9
        Architecture: OpenBSD.i386
        Machine     : i386

>Description:
The NFS client sometimes can't reconnect to the server on a tcp mount after
a disconnect.  This happens when the process trying to reconnect does not
have uid=0 but tries to bind to a privileged port.
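The failure mode can be illustrated with a small userland model.  This is
only a sketch, not the kernel code path: the names model_bind_check() and
model_reconnect() and the constant MODEL_PORT_RESERVED are hypothetical, and
the 1024 limit and the EACCES return value are assumptions that mirror the
conventional reserved-port rule.

```c
#include <errno.h>

/*
 * Assumption: ports below 1024 may only be bound by uid 0, mirroring the
 * conventional reserved-port limit.
 */
#define MODEL_PORT_RESERVED 1024

/* Hypothetical stand-in for the privilege check made when binding. */
static int
model_bind_check(unsigned int uid, unsigned int port)
{
	if (port < MODEL_PORT_RESERVED && uid != 0)
		return EACCES;
	return 0;
}

/*
 * Hypothetical model of a reconnect: the bind runs with the uid of
 * whichever process happened to trigger the NFS request.
 */
static int
model_reconnect(unsigned int caller_uid, unsigned int port)
{
	return model_bind_check(caller_uid, port);
}
```

In this model, model_reconnect(0, 1023) succeeds while
model_reconnect(1000, 1023) returns EACCES, which corresponds to the case
where an unprivileged process is the one that triggers the reconnect.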

>How-To-Repeat:
Set up an nfs server that will disconnect tcp clients after a timeout.
Solaris and NetApp servers are known to do this and I have reproduced this
bug with both.  Find out what the disconnect timeout is, and if possible set
it to a small value.  I use ten minutes.

Mount the server with tcp from OpenBSD.  I use the following fstab entry,
but any tcp mount should work:
troy:/vol/home /home nfs rw,nosuid,-i,-3,-T,port=2049 0 0

Wait until the timeout interval has passed.  Make sure there is no nfs
traffic during this interval.  It is useful, but not necessary, to run
ethereal or tcpdump during this time so that you can see the disconnect.

Become any user other than root.  Try to "ls" or "cat" a file or directory
on the server.  This will fail with "nfs server not responding."

>Fix:
One possible fix is to become root before calling sobind() in nfs_connect().
I have tested this and it works.  I think a better solution would be to
modify sobind() so that it takes a cred argument, then use a root cred for
this call.
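The two approaches can be compared in a userland sketch.  Everything here is
hypothetical and simplified: struct sketch_cred, struct sketch_proc, and the
sketch_* functions are illustrative stand-ins, not the kernel's ucred, proc,
or sobind() interfaces, and the 1024 limit is an assumption.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_PORT_RESERVED 1024	/* assumption: reserved-port limit */

/* Hypothetical, simplified stand-ins for kernel structures. */
struct sketch_cred { unsigned int cr_uid; };
struct sketch_proc { struct sketch_cred *p_cred; };

/* Hypothetical bind that takes the credential as an explicit argument. */
static int
sketch_sobind_cred(unsigned int port, const struct sketch_cred *cred)
{
	if (cred == NULL)
		return EINVAL;
	if (port < SKETCH_PORT_RESERVED && cred->cr_uid != 0)
		return EACCES;
	return 0;
}

/* Variant A: swap the process credential around the call, then restore. */
static int
sketch_bind_swapped(struct sketch_proc *p, unsigned int port)
{
	struct sketch_cred root = { 0 };
	struct sketch_cred *ocred = p->p_cred;
	int error;

	p->p_cred = &root;		/* temporarily become root */
	error = sketch_sobind_cred(port, p->p_cred);
	p->p_cred = ocred;		/* restore on every path */
	return error;
}

/* Variant B: leave the process alone and pass a root credential. */
static int
sketch_bind_explicit(unsigned int port)
{
	static const struct sketch_cred root = { 0 };

	return sketch_sobind_cred(port, &root);
}
```

Variant B never modifies the calling process, so there is no window in which
the process holds elevated credentials and nothing to restore on an error
path.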

The error messages associated with the reconnect should also be removed,
because reconnect is normal, not an error.

Index: nfs/nfs_socket.c
RCS file: /cvs/src/sys/nfs/nfs_socket.c,v
retrieving revision 1.43
diff -u -r1.43 nfs_socket.c
--- nfs/nfs_socket.c 2006/01/24 15:06:41 1.43
+++ nfs/nfs_socket.c 2006/05/24 17:10:17
@@ -170,6 +170,7 @@
  if (saddr->sa_family == AF_INET) {
  struct mbuf *mopt;
  int *ip;
+ struct ucred *ocred;
  mopt->m_len = sizeof(int);
@@ -185,8 +186,16 @@
  sin->sin_family = AF_INET;
  sin->sin_addr.s_addr = INADDR_ANY;
  sin->sin_port = htons(0);
+ /* temporarily become root */
+ ocred = curproc->p_ucred;
+ curproc->p_ucred = crdup(ocred);
+ curproc->p_ucred->cr_uid = 0;
  error = sobind(so, m);
+ crfree(curproc->p_ucred);
+ curproc->p_ucred = ocred;
  if (error)
  goto bad;
@@ -400,8 +409,9 @@
  (struct mbuf *)0, flags);
  if (error) {
  if (rep) {
- log(LOG_INFO, "nfs send error %d for server %s\n",error,
-    rep->r_nmp->nm_mountp->mnt_stat.f_mntfromname);
+ if (error != EPIPE)
+ log(LOG_INFO, "nfs send error %d for server %s\n",error,
+    rep->r_nmp->nm_mountp->mnt_stat.f_mntfromname);
  * Deal with errors for the client side.
@@ -536,11 +546,6 @@
  } while (error == EWOULDBLOCK);
  if (!error && auio.uio_resid > 0) {
-    log(LOG_INFO,
- "short receive (%d/%d) from nfs server %s\n",
- sizeof(u_int32_t) - auio.uio_resid,
- sizeof(u_int32_t),
- rep->r_nmp->nm_mountp->mnt_stat.f_mntfromname);
     error = EPIPE;
  if (error)
@@ -567,10 +572,6 @@
  } while (error == EWOULDBLOCK || error == EINTR ||
  error == ERESTART);
  if (!error && auio.uio_resid > 0) {
-    log(LOG_INFO,
- "short receive (%d/%d) from nfs server %s\n",
- len - auio.uio_resid, len,
- rep->r_nmp->nm_mountp->mnt_stat.f_mntfromname);
     error = EPIPE;
  } else {