updating storage.cfg not working on 7.4. Potential bug?

someone2

New Member
Feb 12, 2024
I added an NFS mount in storage.cfg on Proxmox 7.4. This shared storage is now available to all hosts, as I wanted, and I can then bind mount it into the containers as I wished.
After the NFS share had been in use for a while, I updated its options in storage.cfg, namely the timeo option.
This is when I noticed some strange behavior:
1. Deleting a storage in storage.cfg does not automatically unmount it from the hosts, even if that host (or its containers) is not using that storage.
It is no longer visible in the UI, but it is still mounted.
2. If I unmount the shared storage, it is automagically remounted by Proxmox... I can understand this... BUT the new configuration changes that I made (updating the timeo value) are not applied!
I tried restarting pvedaemon.service and pve-storage.target, but this does not help.

Am I misunderstanding the purpose of storage.cfg?
So what should I do?

Reboot the hosts so that the new config is picked up? => not something I want to do.
Stop using storage.cfg for this and replace it with autofs, for example? => a bit silly, since Proxmox already offers this option.
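What I would ideally like is to apply a changed entry without a reboot. A rough sketch of what I have in mind, assuming the storage can briefly be taken offline (the storage name and mount point are the ones from my entry, posted further down):

Code:
# sketch: disable the entry so PVE stops re-activating it ("my_nfs" is my storage ID)
pvesm set my_nfs --disable 1
# unmount it on the node(s)
umount /mnt/pve/nfs
# re-enable it so PVE mounts it again with whatever options are now in storage.cfg
pvesm set my_nfs --disable 0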
 
1. Deleting a storage in storage.cfg does not automatically unmount it from the hosts, even if that host (or its containers) is not using that storage.
It is no longer visible in the UI, but it is still mounted.

this is expected. there might be more than one storage.cfg entry pointing at the same underlying storage, for example, so deactivating/unmounting automatically could be dangerous.
2. If I unmount the shared storage, it is automagically remounted by Proxmox... I can understand this... BUT the new configuration changes that I made (updating the timeo value) are not applied!

the remount is also expected, but the storage.cfg should actually be reloaded. are you sure you edited it correctly? how did you do it?
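for example, instead of hand-editing the file, the options can also be set via pvesm, which updates the same storage.cfg through the API (a sketch; replace the storage ID and options with yours):

Code:
# sketch: update the mount options of an existing storage entry via the CLI
pvesm set <storageid> --options noatime,soft,timeo=20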
 
For point 1: I asked someone else running Proxmox 8 to try the same thing, and he reported that it did get unmounted (unless he missed the mount), which is why I'm confused...

This is my configuration:

Code:
nfs: my_nfs
    export /exports/nfs
    path /mnt/pve/nfs
    server XXX.XXX.XXX.XXX
    content vztmpl
    options noatime,soft,timeo=20

I edited /etc/pve/storage.cfg with vim: I changed the value of the timeo option and added noatime.
I don't see any error in journalctl after unmounting the path.
But I also don't see any message concerning the remounting...
 
please run "journalctl -f" in one shell, and then "mount | grep pve/nfs; umount /mnt/pve/nfs; echo 'post unmount'; mount | grep pve/nfs; sleep 30; echo 'post sleep'; mount | grep pve/nfs;" in another, and post the full output of both shells
 
Thank you, there it is.
I did not mention it earlier, but the configuration change in storage.cfg does get propagated to all hosts.
And the issue is occurring on all hosts; it is not specific to one node.
 

Attachments

  • proxmox_storage.png (94.6 KB)
maybe you were just unlucky the first time, and whichever component did the re-activation was not aware of the updated config file yet..
 
Yeah, but as you can see in the screenshot, the options are not updated.
Even if I repeat that process a few times, nothing changes.
 
well, noatime and soft are handled, maybe the timeo has a lower limit? 20 seems very small (it's not in seconds, but deciseconds ;))
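for reference, per nfs(5) the value is interpreted in tenths of a second:

Code:
# timeo is given in deciseconds (tenths of a second):
#   timeo=20   -> first retransmission after 2 seconds
#   timeo=600  -> 60 seconds (the usual default over TCP)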
 
Yeah, that's a 2-second timeout, but that's what we need in this case... However, I already tried with a higher value, with the same result...
That's why I think there is a bug somewhere.
 
our code literally just does a "mount -t nfs $source $target -o $options", so either your config file is somehow messed up, or your NFS share is not mounted using our code, or it is because of your usage of NFS v4, which might require additional parameters to be set for the change to take effect?
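for the entry posted above, that should expand to roughly this (a sketch; the server address is the placeholder from the post):

Code:
mount -t nfs XXX.XXX.XXX.XXX:/exports/nfs /mnt/pve/nfs -o noatime,soft,timeo=20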
 
Is there a tool to validate storage.cfg? The entry looks OK compared to the others; I also posted it above.
It is mounted using PVE, of that I'm 100% sure, since it was created this way. But again, if you know of a way to verify this, I can post the output here.
If I use the same parameters for a CLI mount command, it works with no problems.
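For what it's worth, this is how I check which options are actually in effect after a mount (standard tools, nothing PVE-specific, so only a sketch of my verification):

Code:
# show the effective mount options for that mount point
mount | grep /mnt/pve/nfs
# or with per-mount detail (from the nfs-common package)
nfsstat -m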
 
Thank you, here is the file. The last entry is the problematic one. I cannot modify any of the other mount points to test whether the problem would be the same.


Code:
dir: local
    disable
    path /var/XXXXXXXXXXXXXXXX/XXXXXXXXXXXXXXXX
    content vztmpl,rootdir,images,iso
    prune-backups keep-all=1
    shared 0

rbd: XXXXXXXXXXXXXXXX
    content images,rootdir
    krbd 1
    monhost XXXXXXXXXXXXXXXX
    pool XXXXXXXXXXXXXXXX
    username XXXXXXXXXXXXXXXX

zfspool: XXXXXXXXXXXXXXXX
    pool rpool/storage
    content images,rootdir

nfs: XXXXXXXXXXXXXXXX
    export /backup/XXXXXXXXXXXXXXXX
    path /mnt/pve/XXXXXXXXXXXXXXXX
    server XXXXXXXXXXXXXXXX
    content vztmpl,iso
    options async,intr,soft,noatime,vers=3
    prune-backups keep-last=5

dir: XXXXXXXXXXXXXXXX
    path /home/XXXXXXXXXXXXXXXX/XXXXXXXXXXXXXXXX
    content vztmpl
    nodes XXX,XXX,XXXX,XXXX
    prune-backups keep-all=1
    shared 0

zfspool: backup
    pool backup
    content rootdir,images
    nodes XXXX
    sparse 0

dir: XXXX
    path XXXXXX
    content vztmpl
    nodes XXXXXXXXXXXXXXXX
    shared 0

dir: XXXXXXXXXX
    path XXXXXXXXX
    content vztmpl
    nodes XXXXXXXXX
    shared 0

dir: XXXXXXXXX
    path XXXXXXXXX
    content vztmpl
    nodes XXXXXXXXX
    shared 0

dir: XXXXXXXXX
    path XXXXXXXXX
    content vztmpl
    nodes XXXXXXXXX
    shared 0

pbs: XXXXXXXXX
    datastore pbs
    server XXXXXXXXX
    content backup
    fingerprint XXXXXXXXX
    prune-backups keep-all=1
    username XXXXXXXXX

nfs: XXXXXXXXX
    export /exports/XXXXXXXXX/XXXXXXXXX
    path /mnt/pve/XXXXXXXXX
    server XXXXXXXXX
    content vztmpl
    options noatime,soft,timeo=40
 
that looks okay.. if mounting manually works, then you could adapt my command from above to mount manually after unmounting, and then verify after the next scheduled reboot whether the options are applied correctly?
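i.e. something along these lines (a sketch, reusing the paths and options from the first posted entry):

Code:
umount /mnt/pve/nfs
mount -t nfs XXX.XXX.XXX.XXX:/exports/nfs /mnt/pve/nfs -o noatime,soft,timeo=20
# after the next scheduled reboot, check whether the options survived:
mount | grep pve/nfs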
 
So I did some more experiments:
Activating/deactivating did not help.
Neither did deleting/renaming; it always comes back up with the same timeo and retrans options.

Interestingly, I changed the soft/hard option, and that seems to be the only change that takes effect!
 
fyi
Code:
nfs: nfs
        export /mnt/data/shared
        path /mnt/pve/shared
        server bbnas
        content vztmpl
        prune-backups keep-all=1

mount|grep nfs
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)

pvesm set shared --options noatime,soft,timeo=40

umount /mnt/pve/shared

mount|grep nfs
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=40,retrans=2,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)

pveversion
pve-manager/8.1.4/ec5affc9e41f1d79 (running kernel: 6.5.13-1-pve)

and:
Code:
root@pve7test1:~# pveversion
pve-manager/7.4-17/513c62be (running kernel: 5.15.136-1-pve)
root@pve7test1:~# tail /etc/pve/storage.cfg
nfs: shared
        export /mnt/data/shared
        path /mnt/pve/shared
        server bbnas
        content vztmpl
        prune-backups keep-all=1

root@pve7test1:~# mount|grep shared
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.5.132,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.5.132)

root@pve7test1:~# pvesm set shared --options noatime,soft,timeo=40

root@pve7test1:~# umount /mnt/pve/shared

root@pve7test1:~# mount|grep shared

bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=40,retrans=2,sec=sys,mountaddr=172.16.5.132,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.5.132)
root@pve7test1:~#


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
Are you trying to show that it is not dependent on the PVE version?
So I have noticed that the timeo and retrans options are always set to their defaults with soft, and it does not let me override them, while with hard it works without issue. Unless I misunderstood the NFS documentation, it should work with soft too. And your example above seems to confirm my thoughts... the only difference is that I use NFS 4.2.
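If it helps isolate the version difference, one could pin the NFS version explicitly in the storage options and compare again; just a sketch (the storage ID is the one from my first snippet, and vers=4.2 is the standard nfs mount option):

Code:
# hypothetical test: force NFS 4.2 (or vers=3) so both setups mount the same protocol version
pvesm set my_nfs --options noatime,soft,timeo=40,vers=4.2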
 
Are you trying to show that it is not dependent on the PVE version?
No, I was showing that it was working as expected for me. I started with default NFS settings, then used "pvesm" to update the settings from hard to soft, from the default timeo to a lower one, and added noatime. It worked as expected, changing the values after remount, in both PVE 7 and PVE 8. Unless I am doing something different from what you described?

So someone else was facing the same problem.
One should never use "soft" with NFS. It already stands for "No File Safe", doubly so with "soft".

Everything seems to be working for me, including "retrans":

Code:
root@pve7demo1:~# mount|grep shared
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)
root@pve7demo1:~# pvesm set shared --options noatime,soft,timeo=40,retrans=10
root@pve7demo1:~# umount /mnt/pve/shared

root@pve7demo1:~# mount|grep shared
root@pve7demo1:~# mount|grep shared
bbnas:/mnt/data/shared on /mnt/pve/shared type nfs (rw,noatime,vers=3,rsize=131072,wsize=131072,namlen=255,soft,proto=tcp,timeo=40,retrans=10,sec=sys,mountaddr=172.16.100.20,mountvers=3,mountport=911,mountproto=udp,local_lock=none,addr=172.16.100.20)


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
As I wrote, it just does not work with timeo, and setting async does not work either...
Then I set `nocto` together with all the other options, and it works...

In my case I need to use `soft`: the client is reading files from the NFS share and serving them over a REST API. If the NFS server goes down, the requests hang forever.
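For completeness, the combination that finally takes effect for me looks roughly like this in the storage entry (a sketch; nocto added on top of the options I already had):

Code:
options nocto,noatime,soft,timeo=40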
 
