FS#27760 - [nfs-utils/rpcbind/libtirpc/libgssglue] incompatible with kernel 2.6.32-lts / long NFSv4 mount delay

Attached to Project: Arch Linux
Opened by Andreas Radke (AndyRTR) - Thursday, 29 December 2011, 17:14 GMT
Last edited by Andreas Radke (AndyRTR) - Friday, 13 January 2012, 14:58 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Dale Blount (dale)
Tobias Powalowski (tpowa)
Jan de Groot (JGC)
Thomas Bächler (brain0)
Andreas Radke (AndyRTR)
Tom Gundersen (tomegun)
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

I'm running into a very long mount delay (~1 hour) whenever I try to mount NFSv4 shares while my server is running our
current 2.6.32 LTS kernel, no matter which kernel the clients use.

With our stock 3.1.x kernel on the server, all clients can mount immediately after a server reboot.

Client log with the server running the failing LTS kernel:

Dec 29 11:21:16 workstation64 rpc.idmapd[15874]: libnfsidmap: using domain: home
Dec 29 11:21:16 workstation64 rpc.idmapd[15874]: libnfsidmap: processing 'Method' list
Dec 29 11:21:16 workstation64 rpc.idmapd[15874]: libnfsidmap: loaded plugin /usr/lib/libnfsidmap/nsswitch.so for method nsswitch
Dec 29 11:21:16 workstation64 rpc.idmapd[15875]: Expiration time is 600 seconds.
Dec 29 11:21:16 workstation64 rpc.idmapd[15875]: Opened /proc/net/rpc/nfs4.nametoid/channel
Dec 29 11:21:16 workstation64 rpc.idmapd[15875]: Opened /proc/net/rpc/nfs4.idtoname/channel
Dec 29 11:21:51 workstation64 rpc.idmapd[15875]: New client: 35
Dec 29 11:21:51 workstation64 rpc.idmapd[15875]: Opened /var/lib/nfs/rpc_pipefs/nfs/clnt35/idmap
Dec 29 11:21:51 workstation64 rpc.idmapd[15875]: New client: 36
Dec 29 11:24:51 workstation64 kernel: nfs: server 192.168.0.90 not responding, still trying
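
To make the stalls in the log above easier to see, here is a minimal sketch that computes the gap in seconds between consecutive log entries. It only copies the last four lines of the excerpt. The helper names (`parse_timestamp`, `gaps`) are made up for illustration, and the year 2011 is an assumption taken from the report date, since syslog timestamps do not carry one. Parsing the month abbreviation with `%b` also assumes an English/C locale.

```python
from datetime import datetime

# Last four lines of the client log excerpt, copied verbatim.
LOG_LINES = [
    "Dec 29 11:21:16 workstation64 rpc.idmapd[15875]: Opened /proc/net/rpc/nfs4.idtoname/channel",
    "Dec 29 11:21:51 workstation64 rpc.idmapd[15875]: New client: 35",
    "Dec 29 11:21:51 workstation64 rpc.idmapd[15875]: New client: 36",
    "Dec 29 11:24:51 workstation64 kernel: nfs: server 192.168.0.90 not responding, still trying",
]


def parse_timestamp(line, year=2011):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp.

    The year is an assumption: syslog lines do not include it.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def gaps(lines, year=2011):
    """Return the delay in whole seconds between each pair of consecutive lines."""
    times = [parse_timestamp(line, year) for line in lines]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for seconds, line in zip(gaps(LOG_LINES), LOG_LINES[1:]):
        # Print the wait that preceded each message, then the message itself.
        print(f"+{seconds:>4}s  {line.split(': ', 1)[1]}")
```

For this excerpt the gaps are 35, 0 and 180 seconds: a 35 second pause before the first new client shows up, then three minutes of silence before the kernel reports the server as not responding.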

NFS utils FAQ says:


D4. I frequently see this in my logs:
kernel: nfs: server server.domain.name not responding, still trying
kernel: nfs: task 10754 can't get a request slot
kernel: nfs: server server.domain.name OK
A. The "can't get a request slot" message means that the client-side RPC code has detected a lot of timeouts (perhaps due to network congestion, perhaps due to an overloaded server), and is throttling back the number of concurrent outstanding requests in an attempt to lighten the load. Some possible causes:
Network congestion
Overloaded server
Packets (input or output) dropped by a bad NIC or driver....

I guess this won't help much here. This is my local setup, and it works well with everything else.

After ~1 hour, rpcbind seems to have throttled down enough to allow the clients to mount, and from then on it works all the time.

As far as I can remember, the issue was introduced by the latest update to nfs-utils 1.2.5, but it may already have been present in 1.2.4 and have been introduced by a package it depends on.

I'm checking whether another distro has run into this issue...
I'm also assigning the bug to some devs who run servers.

Closed by  Andreas Radke (AndyRTR)
Friday, 13 January 2012, 14:58 GMT
Reason for closing:  Fixed
Additional comments about closing:  Updating to linux 3.0.x LTS solves this.
