Disable NFS version 4 in Proxmox 8.xx

phanos

Renowned Member
Oct 23, 2015
Hi, I am attempting to disable NFS server version 4 in my Proxmox VE 8.xx environment. I followed this guide:

Code:
https://unix.stackexchange.com/questions/205403/disable-nfsv4-server-on-debian-allow-nfsv3

and tried to update both the
Code:
/etc/default/nfs-kernel-server
and the
Code:
/etc/init.d/nfs-kernel-server
files directly, but with no luck. After restarting and running
Code:
cat /proc/fs/nfsd/versions
I get
Code:
+3 +4 +4.1 +4.2

Also, when I look at the rpc.mountd process by running ps -ef | grep rpc.mount, I see that it does not show any parameters next to it:

Code:
root        1883       1  0 15:43 ?        00:00:00 /usr/sbin/rpc.mountd
root        4069    2486  0 15:49 pts/0    00:00:00 grep rpc.mount

Does anyone have any idea why this is happening, and whether there is a way to disable NFSv4 on Proxmox 8.xx?

Thanks
 
That bug report is 9 years old and supposedly the bug didn't affect systemd (only sysv-init). Are you sure you used the right syntax?

RPCMOUNTDOPTS="--no-nfs-version 4"
RPCNFSDOPTS="--no-nfs-version 4"

WITH the quotes? The file /etc/default/nfs-kernel-server is a shell script fragment, so you need them. I use that setting to disable NFSv3 on a regular Debian server and it works fine. I don't see it in the output of ps either, but cat /proc/fs/nfsd/versions is correct.
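Spelled out, a minimal sketch of the fragment I mean, plus the restart and the re-check (assuming the stock Debian nfs-kernel-server packaging; keep whatever other options your file already has):

Code:
# /etc/default/nfs-kernel-server is sourced as shell, so the quotes matter
RPCMOUNTDOPTS="--manage-gids --no-nfs-version 4"
RPCNFSDOPTS="--no-nfs-version 4"

# then restart and re-check:
systemctl restart nfs-kernel-server
cat /proc/fs/nfsd/versions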

There is also /etc/nfs.conf but I've never touched that.
 
I am not aware of any situation in which this should be switched off on the server. The easiest way is to set this on the NFS client.
Here is an example fstab entry:

Code:
mynfsserver:/yourNFS-share /your_local_mountpoint nfs     defaults,nfsvers=3 0       0

In the Proxmox WebUI this can be set very simply:

[Screenshot of the NFS version setting in the Proxmox WebUI]
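Either way it is set, the client shows which version it actually negotiated after mounting; a quick check (a sketch; nfsstat ships with the nfs-common package, and the vers= field is what matters):

Code:
mount /your_local_mountpoint
nfsstat -m
# expect something like: Flags: rw,...,vers=3,... for this mount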
 
Hi, I used

1) RPCNFSDCOUNT="8 --no-nfs-version 4"
or
2) RPCMOUNTDOPTS="--manage-gids --no-nfs-version 4"
or
3) RPCMOUNTDOPTS="--no-nfs-version 4"

but separately, not together in one boot. Every time I reboot I still see

Code:
cat /proc/fs/nfsd/versions
+3 +4 +4.1 +4.2

Should I use what you are suggesting above?

RPCMOUNTDOPTS="--no-nfs-version 4"
RPCNFSDOPTS="--no-nfs-version 4"

Will the command cat /proc/fs/nfsd/versions then show +3 -4 -4.1 -4.2 instead of +3 +4 +4.1 +4.2, or does it not matter what it produces? Will version 4 be disabled?
 
Where is this WebUI option you mention? The NFS server is running on Proxmox, and the NFS clients are VMs that run on this machine as well. I am using /etc/fstab to mount the NFS shares. This was working fine in previous Proxmox versions, but in versions 7.xx and 8.xx I notice crashes of the NFS server on Proxmox, and the only way to recover is a full reboot of the server.


I tried mounting with NFS version 3 in /etc/fstab on one client and things did get better, but instead of mounting with NFS version 3 in every single VM I would prefer to disable NFS version 4 on the server and see if that fixes my issue. But I am not sure whether what I did above had any effect.
 
To disable it on the server side, edit /etc/nfs.conf, where the defaults for the versions are shown commented out with y or n, and restart the NFS server.
vers4=n should disable all v4 variants, while v3 is still on by default.
Code:
...
[nfsd]
# debug=0
# threads=8
# host=
# port=0
# grace-time=90
# lease-time=90
# udp=n
# tcp=y
# vers3=y
# vers4=y
# vers4.0=y
# vers4.1=y
# vers4.2=y
# rdma=n
# rdma-port=20049
vers4=n
...
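After the edit, restarting and re-checking should be all that is needed (a sketch; on Debian, nfs-server and nfs-kernel-server should resolve to the same systemd unit):

Code:
systemctl restart nfs-server
cat /proc/fs/nfsd/versions
# the v4 variants should now be listed with a minus, e.g. +3 -4 -4.0 -4.1 -4.2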
 
Thanks, your suggestion worked. Now I get

Code:
+3 -4 -4.0 -4.1 -4.2

We will see if this fixes the issue with the crashes.
 
I also found this bug from May 2024: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562

Seems like it might be similar to what you are seeing. I'm using a stock Debian 12 VM as the server, so it has kernel 6.1.0-26-amd64. The client VMs using the server are likewise Debian, although one has (had, it is no longer in use) the 6.9.7 kernel from backports.
My running kernel is 6.8.12-2-pve

Back when I was running Proxmox 4.xx on the same hardware, my system was rock solid. It could go months between reboots, and when I did reboot the server it was because I needed updates or had to do some maintenance on the machine.

When I tried Proxmox 7.xx I could not even get a running system; there were kernel panics just a few minutes after boot, so I had to revert to Proxmox 4.xx back then.

After Proxmox 8.xx came out it seemed stable enough, but after I installed it I noticed NFS crashes about once a month. Now, with the latest updates, I get one every few days. Not sure what to make of it. When the NFS server has crashed (I see the error in dmesg -wH), all the shares in the VMs become unresponsive. The VMs and the server do keep running, but it is useless since no data is accessible to my VMs, and the only way to get a working system is to reboot the machine.

I hope NFS v3 makes it more stable. We will see. Thanks for the help.
 
We didn't notice any NFS 4.2 problems in our 4-node PVE 8 cluster together with a Rocky 9 NFS server.
 
I know; I already checked whether other users on Proxmox 8.xx had similar issues, but I could not find any general problem.
What is described here (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562) is just about what is happening to me.

Please note that in my case I installed Proxmox on top of Debian as described here https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_12_Bookworm, so I am not sure if this matters or changes things somehow.

However, I did the same when I installed Proxmox 4.xx and 7.xx from similar guides, and on Proxmox 4.xx my system was rock solid.

Now only NFS server v3 is enabled on Proxmox 8.xx. I will report back after some time and let you know whether disabling NFS version 4 helps.
 
As PVE is based on Debian and updated regularly, the NFS server should be the same when the systems are up to date. The difference is the kernel, which on original Debian is 6.1..., but when "upgraded" to work as PVE you get "Ubuntu" 6.8... kernels, which again is the same as a pure PVE install.
So I cannot assume you have the linked Debian bug, which reports usage of 6.1...
We have two further NFSv4 shares between the nodes, ISOs and LXC templates on raidz1, and backups (on xfs) of running images from the Rocky 9 fileserver (xfs), and again no NFS (4.2) problems at all between the PVE nodes.
 
I see; if this is a bug, it must be a different one. But please note that what is described in that link is very similar to what I am experiencing. I mean the errors and stack traces after the crash look very similar (if not identical) to what I am getting. I notice that the crash usually starts when I download multiple files in a VM and write those files to the NFS storage at the same time.

If the problem continues, should I report it somewhere? I mean, do you have a bug tracking system, or should the issue be reported somewhere else?
 
Mmh, I can't imagine that a few parallel downloads to an NFS share will stress the protocol, given that it is in production use with many thousands of clients, e.g. on any NetApp production system. Even that is not your case, but just to illustrate normal NFS use cases.
Your case is different from mine, as we use NFS between the PVE nodes, while you have problems/crashes - of the VM or of the downloads (?) - from a VM to a share mounted inside that VM running on PVE. Do you have "virtio scsi single" defined, and have you tried different VM disk "cache settings"?
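For example, something along these lines (a sketch only, assuming a VM with ID 100 whose first disk lives on a storage called local-zfs; check qm config 100 for your real values first):

Code:
# show the current controller type and disk line
qm config 100 | grep -E 'scsihw|scsi0'
# switch the VM to the single virtio-scsi controller
qm set 100 --scsihw virtio-scsi-single
# re-declare the disk with another cache mode (volume name is assumed here)
qm set 100 --scsi0 local-zfs:vm-100-disk-0,cache=writeback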
 
The crash happens on the NFS server; the client VM that is downloading simply hangs, and from there all VMs that have NFS shares mounted from the server hang too. A reboot of each VM will make the VM start up again, but with no mounted NFS shares. If I try to manually mount an NFS share from within any VM, the VM simply hangs. I tried the async, sync, soft, etc. options in fstab, but the result is the same: eventually the NFS server crashes.

I am using the default settings when I create the VMs, the same settings I was using before on Proxmox 4.xx. I never tried to change the default virtio or cache settings in a VM.
 
The crash happens on the NFS server
Then you mean on the PVE node. But the node keeps running, right? What does systemctl status nfs-server say when the crash has occurred, and/or when a VM has rebooted and still cannot get the mount?
 
Yes, the node runs fine, and any VM that is rebooted will eventually start up, but the NFS mounts fail and are not present.

systemctl status nfs-server hangs and gives nothing.

I also tried systemctl restart, but that hangs as well.

Only a reboot of the server restores a fully working system.
 
Look into /var/log/syslog and /var/log/kern.log for entries from before your reboot (dmesg would show nothing, as it only covers the time since the reboot, when everything works again).
"systemctl status nfs-server hangs and gives nothing" is not normal... maybe you have a hardware problem on a disk or network controller...
 
Actually, I am running dmesg -wH every time I boot, and I run it under byobu so I can connect and view any messages. Before the crash everything seems quiet, but after the crash happens I get errors similar to this:

Code:
[300146.046666] INFO: task nfsd:1426 blocked for more than 241 seconds.
[300146.046732]       Not tainted 6.5.0-27-generic #28~22.04.1-Ubuntu
[300146.046770] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[300146.046813] task:nfsd            state:D stack:0     pid:1426  ppid:2      flags:0x00004000
[300146.046827] Call Trace:
[300146.046832]  <TASK>
[300146.046839]  __schedule+0x2cb/0x750
[300146.046860]  schedule+0x63/0x110
[300146.046870]  schedule_timeout+0x157/0x170
[300146.046881]  wait_for_completion+0x88/0x150
[300146.046894]  __flush_workqueue+0x140/0x3e0
[300146.046908]  nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
[300146.047074]  nfsd4_destroy_session+0x193/0x260 [nfsd]
[300146.047219]  nfsd4_proc_compound+0x3b7/0x770 [nfsd]
[300146.047365]  nfsd_dispatch+0xbf/0x1d0 [nfsd]
[300146.047497]  svc_process_common+0x420/0x6e0 [sunrpc]
[300146.047695]  ? __pfx_read_tsc+0x10/0x10
[300146.047706]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[300146.047848]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[300146.047977]  svc_process+0x132/0x1b0 [sunrpc]
[300146.048157]  nfsd+0xdc/0x1c0 [nfsd]
[300146.048287]  kthread+0xf2/0x120
[300146.048299]  ? __pfx_kthread+0x10/0x10
[300146.048310]  ret_from_fork+0x47/0x70
[300146.048321]  ? __pfx_kthread+0x10/0x10
[300146.048331]  ret_from_fork_asm+0x1b/0x30
[300146.048341]  </TASK>

After this, every time I try to mount an NFS share from a client, I get this error again in the server logs.

Maybe it is a hardware error, I cannot tell. I will have to test with NFS version 3 for a couple of days first and then check again.
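In case the journal on that install is volatile, making it persistent would let the kernel messages from before a reboot survive it (a sketch; skip the mkdir if /var/log/journal already exists):

Code:
mkdir -p /var/log/journal
systemctl restart systemd-journald
# after the next crash and reboot, the previous boot's kernel log:
journalctl -k -b -1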
 