FS#59890 - NFS servers sees thousands and thousands of RPC request reserved 84 but used 292

Attached to Project: Arch Linux
Opened by Richard PALO (risto3) - Friday, 31 August 2018, 07:22 GMT
Last edited by Andreas Radke (AndyRTR) - Tuesday, 17 March 2020, 09:27 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Andreas Radke (AndyRTR)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

Since quite recently on our nfs server (Linux 4.14.67-1-lts #1 SMP )
I notice tens of thousands of the following message (dmesg)
>... RPC request reserved 84 but used 292

looks similar to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898060

our nfs clients are still primarily nfs3 as they are still linux-odroid-c2
(3.14.79-28-ARCH) needing linux-aarch64 to satisfactorily run nfs4 but which
is still somewhat unstable for production.

Additional info:
* package version(s)
* config and/or log files etc.


Steps to reproduce:

nfs home + public r/w directories

NB a test machine is running linux-aarch64 (4.18.5-1-ARCH) and nfs4 mounting nfs4 vers=4.2
I will try the redhat workaround mentioned on debian to force mounting 4.0 to see if it helps that machine.
This task depends upon

Closed by  Andreas Radke (AndyRTR)
Tuesday, 17 March 2020, 09:27 GMT
Reason for closing:  No response
Comment by Richard PALO (risto3) - Friday, 31 August 2018, 08:22 GMT
BTW, this is the pertinent comment on RH: https://bugzilla.redhat.com/show_bug.cgi?id=1552037#c14
Comment by Richard PALO (risto3) - Thursday, 06 September 2018, 06:35 GMT
I guess I can confirm that it has to do with nfs4 clients and not nfs3.

With mount option vers=4.0 in auto.home I was able to run a few days without the problem at all.

I happened to remember that I set {TEMP,TMP,TMPDIR} to ~/.local/tmp (which is on the home directory)
in .profile, so I commented this out for the time being.

Setting the mount option to vers=4 (or latest 4.x == 4.2 at the moment) I didn't notice the problem immediately
(did a simple find . >/dev/null)

I then put back the overrides of temp variables to ~/.local/tmp and redid the find...
almost immediately I had '[419733.168008] RPC request reserved 84 but used 120' on the server.

So it seems related to the sort of file operations used in temp on nfs 4.2 mounted filesystems.
Comment by Richard PALO (risto3) - Thursday, 06 September 2018, 06:40 GMT
By the way, here are the mount stats visible on the client:

$ mountstats
Stats for smicro.local:/home/richard mounted on /home/richard:

NFS mount options: rw,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.39,local_lock=none
NFS server capabilities: caps=0x1eb7fff,wtmult=512,dtsize=32768,bsize=0,namlen=255
NFSv4 capability flags: bm0=0xfdffbfff,bm1=0x40f9be3e,bm2=0x20803,acl=0x3,sessions,pnfs=notconfigured
NFS security flavor: 1 pseudoflavor: 0

NFS byte counts:
applications read 72009603 bytes via read(2)
applications wrote 33762209 bytes via write(2)
applications read 0 bytes via O_DIRECT read(2)
applications wrote 0 bytes via O_DIRECT write(2)
client read 88463314 bytes via NFS READ
client wrote 34919048 bytes via NFS WRITE
RPC statistics:
50034 RPC requests sent, 50032 RPC replies received (2 XIDs not found)
average backlog queue length: 0

GETATTR:
11026 ops (22%)
avg bytes sent per op: 203 avg bytes received per op: 263
backlog wait: 0.014874 RTT: 2.900961 total execute time: 2.949574 (milliseconds)
READDIR:
9261 ops (18%)
avg bytes sent per op: 227 avg bytes received per op: 3808
backlog wait: 0.010042 RTT: 9.509880 total execute time: 9.537091 (milliseconds)
ACCESS:
8839 ops (17%)
avg bytes sent per op: 211 avg bytes received per op: 167
backlog wait: 0.009164 RTT: 2.797602 total execute time: 2.822718 (milliseconds)
LOOKUP:
8441 ops (16%)
avg bytes sent per op: 236 avg bytes received per op: 308
backlog wait: 0.011136 RTT: 2.902026 total execute time: 2.938277 (milliseconds)
READ:
2039 ops (4%)
avg bytes sent per op: 220 avg bytes received per op: 43489
backlog wait: 68.426680 RTT: 45.564492 total execute time: 114.033840 (milliseconds)
WRITE:
880 ops (1%)
avg bytes sent per op: 39921 avg bytes received per op: 176
backlog wait: 4.211364 RTT: 9.084091 total execute time: 13.340909 (milliseconds)
LOCK:
819 ops (1%)
avg bytes sent per op: 244 avg bytes received per op: 112
backlog wait: 0.330891 RTT: 4.516484 total execute time: 4.886447 (milliseconds)
LOCKU:
755 ops (1%)
avg bytes sent per op: 232 avg bytes received per op: 112
backlog wait: 0.235762 RTT: 5.274172 total execute time: 5.545695 (milliseconds)
CLOSE:
753 ops (1%)
avg bytes sent per op: 221 avg bytes received per op: 150
backlog wait: 0.037185 RTT: 4.184595 total execute time: 4.258964 (milliseconds)
REMOVE:
474 ops (0%)
avg bytes sent per op: 217 avg bytes received per op: 115
backlog wait: 0.010549 RTT: 8.582278 total execute time: 8.622363 (milliseconds)
OPEN_NOATTR:
471 ops (0%)
avg bytes sent per op: 272 avg bytes received per op: 360
backlog wait: 0.053079 RTT: 3.900212 total execute time: 4.002123 (milliseconds)
OPEN:
390 ops (0%)
avg bytes sent per op: 330 avg bytes received per op: 345
backlog wait: 0.797436 RTT: 9.630769 total execute time: 10.474359 (milliseconds)
SETATTR:
334 ops (0%)
avg bytes sent per op: 248 avg bytes received per op: 287
backlog wait: 0.086826 RTT: 8.808383 total execute time: 8.973054 (milliseconds)
DELEGRETURN:
305 ops (0%)
avg bytes sent per op: 224 avg bytes received per op: 160
backlog wait: 0.485246 RTT: 3.121311 total execute time: 3.645902 (milliseconds)
STATFS:
209 ops (0%)
avg bytes sent per op: 203 avg bytes received per op: 160
backlog wait: 0.023923 RTT: 3.846890 total execute time: 3.956938 (milliseconds)
RENAME:
197 ops (0%)
avg bytes sent per op: 280 avg bytes received per op: 152
backlog wait: 0.025381 RTT: 11.152284 total execute time: 11.213198 (milliseconds)
FREE_STATEID:
104 ops (0%)
avg bytes sent per op: 168 avg bytes received per op: 88
backlog wait: 0.019231 RTT: 3.086538 total execute time: 3.134615 (milliseconds)
COMMIT:
54 ops (0%)
avg bytes sent per op: 204 avg bytes received per op: 104
backlog wait: 0.018519 RTT: 17.185185 total execute time: 17.240741 (milliseconds)
LOCKT:
27 ops (0%)
avg bytes sent per op: 244 avg bytes received per op: 96
backlog wait: 8.592593 RTT: 5.444444 total execute time: 14.111111 (milliseconds)
CREATE:
15 ops (0%)
avg bytes sent per op: 260 avg bytes received per op: 249
backlog wait: 0.000000 RTT: 6.400000 total execute time: 6.533333 (milliseconds)
LINK:
4 ops (0%)
avg bytes sent per op: 272 avg bytes received per op: 316
backlog wait: 0.000000 RTT: 9.000000 total execute time: 9.250000 (milliseconds)
SYMLINK:
2 ops (0%)
avg bytes sent per op: 276 avg bytes received per op: 352
backlog wait: 0.000000 RTT: 6.500000 total execute time: 6.500000 (milliseconds)
SERVER_CAPS:
2 ops (0%)
avg bytes sent per op: 180 avg bytes received per op: 164
backlog wait: 0.000000 RTT: 1.500000 total execute time: 1.500000 (milliseconds)
DEALLOCATE:
2 ops (0%)
avg bytes sent per op: 240 avg bytes received per op: 160
backlog wait: 0.000000 RTT: 6.000000 total execute time: 6.000000 (milliseconds)
FSINFO:
1 ops (0%)
avg bytes sent per op: 180 avg bytes received per op: 164
backlog wait: 0.000000 RTT: 4.000000 total execute time: 5.000000 (milliseconds)
PATHCONF:
1 ops (0%)
avg bytes sent per op: 172 avg bytes received per op: 116
backlog wait: 0.000000 RTT: 1.000000 total execute time: 1.000000 (milliseconds)
READLINK:
1 ops (0%)
avg bytes sent per op: 192 avg bytes received per op: 156
backlog wait: 0.000000 RTT: 2.000000 total execute time: 2.000000 (milliseconds)
ALLOCATE:
1 ops (0%)
avg bytes sent per op: 240 avg bytes received per op: 96
backlog wait: 0.000000 RTT: 2.000000 total execute time: 2.000000 (milliseconds)

Comment by Andreas Radke (AndyRTR) - Tuesday, 10 December 2019, 11:20 GMT
Is this still an issue? From Debian report it should be solved now.

Loading...