updating storage.cfg not working on 7.4. Potential bug?

someone2 · Mar 22, 2024

I added an NFS mount in the storage.cfg on proxmox7.4. This shared storage is now available to all hosts, as I wanted too. I can then bind mount it to the containers as I wished.
After a while, that this NFS is already in use, I updated the options of the NFS in the storage.cfg, namely the timeo option.
this is when I noticed strange behaviors:
1. Deleting an storage in storage.cfg does not automatically unmount it from the hosts, even if that host (or given containers) is not using that given storage.
Its visible in the UI anymore, but it is still mounted.
2. If I unmount the shared storage, it is automagically remounted by proxmox.... I can understand this.... BUT the new configuration changes that I did (updating the timeo value) are not applied!
I tried to restart the pvedaemon.service or the pve-storage.target but this does not help.

Am I misunderstanding the purpose of storage.cfg?
So what should I do ?

Reboot the hosts so that new config is picked up? => not something I wish
Not using the storage.cfg for that and replace it with autofs for example? => a bit silly since there is this option in proxmox.

fabian · Mar 22, 2024

someone2 said:
1. Deleting an storage in storage.cfg does not automatically unmount it from the hosts, even if that host (or given containers) is not using that given storage.
Its visible in the UI anymore, but it is still mounted.

this is expected. there might be more than on storage.cfg entry for example pointing at the same usage, so deactivating can be dangerous.

someone2 said:
2. If I unmount the shared storage, it is automagically remounted by proxmox.... I can understand this.... BUT the new configuration changes that I did (updating the timeo value) are not applied!

the remount is also expected, but the storage.cfg should actually be reloaded. are you sure you edited it correctly? how did you do it?

someone2 · Mar 22, 2024

for 1: I asked someone else running proxmox 8 to try the same thing and he reported that it got unmounted (unless he missed the mounting), that's why i'm confused...

This is my configuration:

nfs: my_nfs
export /exports/nfs
path /mnt/pve/nfs
server XXX.XXX.XXX.XXX
content vztmpl
options noatime,soft,timeo=20

I edited it with vim on /etc/pve/storage.cfg. And there I changed the value of the timeo option, and added the noatime.
I don't see any error in journalctl after unmount the path.
But i also dont see any message concerning the remounting...

fabian · Mar 22, 2024

please run "journalctl -f" in one shell, and then "mount | grep pve/nfs; umount /mnt/pve/nfs; echo 'post unmount'; mount | grep pve/nfs; sleep 30; echo 'post sleep'; mount | grep pve/nfs;" in another, and post the full output of both shells

someone2 · Mar 22, 2024

Thank you, there it is.
I did not mention it earlier, but the configuration change in storage.cfg gets well propagated to all hosts.
And that issue is occuring on all hosts, its not specific to one entity

fabian · Mar 22, 2024

maybe you were just unlucky the first time, and whichever component did the re-activation was not aware of the updated config file yet..

someone2 · Mar 22, 2024

yah but as you can see on the screenshot, the options are not updated.
Even if i repeat that process a few time, nothing changes.

fabian · Mar 22, 2024

well, noatime and soft are handled, maybe the timeo has a lower limit? 20 seems very small (it's not in seconds, but deciseconds

)

someone2 · Mar 25, 2024

yah thats 2sec timeout, but thats what we need in that case... however i already tried with a higher value, and same results...
Thats why i think there is a bug somewhere

fabian · Mar 25, 2024

our code literally just does a "mount -t nfs $source $target -o $options", so either your config file is somehow messed up, or your NFS share is not mounted using our code, or it is because of your usage of NFS v4 and requires additional parameters to be set to take effect?

someone2 · Mar 25, 2024

Is there a tool to validate the storage.cfg? the entry looks ok compared to the others, I also posted it above.
it is mounted using pve, that I'm 100% sure, it was created this way. but again, if you know ways to verify i can post the output here.
if I use the same parameters for a CLI mount cmd it works with no problems

fabian · Mar 25, 2024

could you post the full storage.cfg here in code tags? you can of course censor IPs/export names etc...

you can see the code here:

https://git.proxmox.com/?p=pve-stor...c804986fcef938b3c47a6443ad0920bb5;hb=HEAD#l34
https://git.proxmox.com/?p=pve-stor...804986fcef938b3c47a6443ad0920bb5;hb=HEAD#l123

someone2 · Mar 25, 2024

Thank you, here is the file . The last one is the problematic one. I cannot modify any of the other mount points to test whether the problem would be the same.

Code:

dir: local
    disable
    path /var/XXXXXXXXXXXXXXXX/XXXXXXXXXXXXXXXX
    content vztmpl,rootdir,images,iso
    prune-backups keep-all=1
    shared 0

rbd: XXXXXXXXXXXXXXXX
    content images,rootdir
    krbd 1
    monhost XXXXXXXXXXXXXXXX
    pool XXXXXXXXXXXXXXXX
    username XXXXXXXXXXXXXXXX

zfspool: XXXXXXXXXXXXXXXX
    pool rpool/storage
    content images,rootdir

nfs: XXXXXXXXXXXXXXXX
    export /backup/XXXXXXXXXXXXXXXX
    path /mnt/pve/XXXXXXXXXXXXXXXX
    server XXXXXXXXXXXXXXXX
    content vztmpl,iso
    options async,intr,soft,noatime,vers=3
    prune-backups keep-last=5

dir: XXXXXXXXXXXXXXXX
    path /home/XXXXXXXXXXXXXXXX/XXXXXXXXXXXXXXXX
    content vztmpl
    nodes XXX,XXX,XXXX,XXXX
    prune-backups keep-all=1
    shared 0

zfspool: backup
    pool backup
    content rootdir,images
    nodes XXXX
    sparse 0

dir: XXXX
    path XXXXXX
    content vztmpl
    nodes XXXXXXXXXXXXXXXX
    shared 0

dir: XXXXXXXXXX
    path XXXXXXXXX
    content vztmpl
    nodes XXXXXXXXX
    shared 0

dir: XXXXXXXXX
    path XXXXXXXXX
    content vztmpl
    nodes XXXXXXXXX
    shared 0

dir: XXXXXXXXX
    path XXXXXXXXX
    content vztmpl
    nodes XXXXXXXXX
    shared 0

pbs: XXXXXXXXX
    datastore pbs
    server XXXXXXXXX
    content backup
    fingerprint XXXXXXXXX
    prune-backups keep-all=1
    username XXXXXXXXX

nfs: XXXXXXXXX
    export /exports/XXXXXXXXX/XXXXXXXXX
    path /mnt/pve/XXXXXXXXX
    server XXXXXXXXX
    content vztmpl
    options noatime,soft,timeo=40

fabian · Mar 25, 2024

that looks okay.. if mounting manually works, then you could adapt my command from above to mount manually after unmounting, and then verify after the next scheduled reboot that it worked properly after rebooting?

someone2 · Mar 27, 2024

So I did some more experiements:
activating/deactivating did not help.
deleting/renaming either. it always come up with the same timeo and retrans options

Interrestingly, I changed the soft/hard option and that seems to be the only one that works!

bbgeek17 · Mar 27, 2024

fyi

Code:

nfs: nfs
        export /mnt/data/shared
        path /mnt/pve/shared
        server bbnas
        content vztmpl
        prune-backups keep-all=1

mount|grep nfs
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)

pvesm set shared --options noatime,soft,timeo=40

umount /mnt/pve/shared

mount|grep nfs
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=40,retrans=2,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)

pveversion
pve-manager/8.1.4/ec5affc9e41f1d79 (running kernel: 6.5.13-1-pve)

and:

Code:

root@pve7test1:~# pveversion
pve-manager/7.4-17/513c62be (running kernel: 5.15.136-1-pve)
root@pve7test1:~# tail /etc/pve/storage.cfg
nfs: shared
        export /mnt/data/shared
        path /mnt/pve/shared
        server bbnas
        content vztmpl
        prune-backups keep-all=1

root@pve7test1:~# mount|grep shared
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.5.132,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.5.132)

root@pve7test1:~# pvesm set shared --options noatime,soft,timeo=40

root@pve7test1:~# umount /mnt/pve/shared

root@pve7test1:~# mount|grep shared

bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=40,retrans=2,sec=sys,mountaddr=172.16.5.132,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.5.132)
root@pve7test1:~#

Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox

someone2 · Mar 28, 2024

Are you trying to show that it is not dependant on the PVE version?
So i have noticed that the options timeo and retrans are always set to the defaults with soft, and it does not let me overwrite them, while with hard it works without issue. unless I miss understood the NFS documentation, it should work with soft too. And your example above seems to confirm my thoughts.... the only difference is that i use nfs4.2

someone2 · Mar 28, 2024

So some one else was facing the same problem
https://forum.proxmox.com/threads/nfs-soft-option-causes-i-o-errors.124074/

bbgeek17 · Mar 28, 2024

someone2 said:
Are you trying to show that it is not dependant on the PVE version?

No, I was showing that it was working as expected for me. I started with default NFS settings. Then used "pvesm" to update the settings from hard to soft, from default timeo to one lower, adding noatime. It worked as expected, changing the values after remount, in both PVE7 and PVE8. Unless I am doing something different that you complained about?

someone2 said:
So some one else was facing the same problem

One should never use "soft" with NFS. It already stands for No File Safe, doubly so with "soft".

Everything seems to be working for me, including "retrans":

Code:

root@pve7demo1:~# mount|grep shared
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)
root@pve7demo1:~# pvesm set shared --options noatime,soft,timeo=40,retrans=10
root@pve7demo1:~# umount /mnt/pve/shared

root@pve7demo1:~# mount|grep shared
root@pve7demo1:~# mount|grep shared
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=40,retrans=10,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)

Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox

someone2 · Mar 28, 2024

AS I wrote, it just does not work with timeo, setting async does not work either...
then i set `nocto` with all the other options and it works...

In my case I need to use the `soft`: the client is reading files from the NFS and serving them on a REST API. if the NFS goes down, the the request hang forever

updating storage.cfg not working on 7.4. Potential bug?

New Member

Proxmox Staff Member

New Member

Proxmox Staff Member

New Member

Attachments

Proxmox Staff Member

New Member

Proxmox Staff Member

New Member

Proxmox Staff Member

New Member

Proxmox Staff Member

New Member

Proxmox Staff Member

New Member

Distinguished Member

New Member

New Member

Distinguished Member

New Member

We value your privacy