NFS share not accessible/inactive after NFS server reboot

beriapl

Active Member
Jan 22, 2019
I have a big issue with the NFS storage that is used only for backups in my Proxmox cluster (12 physical nodes).

Every time the NFS server is rebooted, I need to remove the NFS share from Proxmox, restart all 12 nodes, and then remount the NFS share by adding it again from scratch.

Is there any way to force Proxmox to change the NFS status to active?

Code:
root@pve-07-08:~# pvesm status
got timeout
unable to activate storage 'BACKUP01-NFS' - directory '/mnt/pve/BACKUP01-NFS' does not exist or is unreachable
Name                Type     Status           Total            Used       Available        %
BACKUP01-NFS         nfs   inactive               0               0               0    0.00%

Restarting the whole cluster is just a nightmare.
 
hi,

Why do you have to remove the share and restart all 12 nodes every time the NFS server is rebooted?

To force the status back to active, can you try: pvesm set BACKUP01-NFS --disable 0 (and --disable 1 to deactivate)
 
That doesn't change anything.



Code:
root@pve-07-08:~# pvesm set BACKUP01-NFS --disable 0
root@pve-07-08:~# pvesm status
storage 'BACKUP01-NFS' is not online
Name                Type     Status           Total            Used       Available        %
BACKUP01-NFS         nfs   inactive               0               0               0    0.00%
root@pve-07-08:~# pvesm set BACKUP01-NFS --disable 1
root@pve-07-08:~# pvesm status
Name                Type     Status           Total            Used       Available        %
BACKUP01-NFS         nfs   disabled               0               0               0      N/A
root@pve-07-08:~# pvesm set BACKUP01-NFS --disable 0
root@pve-07-08:~# pvesm status
storage 'BACKUP01-NFS' is not online
Name                Type     Status           Total            Used       Available        %
BACKUP01-NFS         nfs   inactive               0               0               0    0.00%
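
With 12 nodes it may help to pick the inactive NFS storages out of the pvesm status output automatically. The following is only a sketch: the function name inactive_nfs is made up for illustration, and it assumes the column layout shown in the output above (Name, Type, Status). The sample text is copied from this thread.

```shell
#!/bin/sh
# Print the names of NFS storages that are reported as inactive.
# Reads `pvesm status` output on stdin and matches on the Type and
# Status columns, so warning lines such as "got timeout" are ignored.
inactive_nfs() {
    awk '$2 == "nfs" && $3 == "inactive" { print $1 }'
}

# Sample copied from the output posted in this thread.
sample="storage 'BACKUP01-NFS' is not online
Name                Type     Status           Total            Used       Available        %
BACKUP01-NFS         nfs   inactive               0               0               0    0.00%"

printf '%s\n' "$sample" | inactive_nfs
```

On a real node the function would be fed from the command itself, for example: pvesm status 2>/dev/null | inactive_nfs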
 
can you reach the NFS server? what do you get from rpcinfo -p your.nfs.server.ip

does mount -a bring the share back up?

please post the contents of cat /etc/pve/storage.cfg
 
rpcinfo:


Code:
root@pve-07-06:~# rpcinfo -p 10.2.128.43
   program vers proto   port  service
    100000    2   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    4   udp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    4   tcp    111  portmapper
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    4   tcp   2049  nfs
    100005    1   tcp   2049  mountd
    100005    2   tcp   2049  mountd
    100005    3   tcp   2049  mountd
    100005    1   udp   2049  mountd
    100005    2   udp   2049  mountd
    100005    3   udp   2049  mountd
    100021    1   tcp   2049  nlockmgr
    100021    2   tcp   2049  nlockmgr
    100021    3   tcp   2049  nlockmgr
    100021    4   tcp   2049  nlockmgr
    100021    1   udp   2049  nlockmgr
    100021    2   udp   2049  nlockmgr
    100021    3   udp   2049  nlockmgr
    100021    4   udp   2049  nlockmgr
    100024    1   tcp   2049  status
    100024    1   udp   2049  status
    252525    1   udp    653
    252525    1   tcp    655

showmount:


Code:
root@pve-07-06:~# showmount -e 10.2.128.43
Export list for 10.2.128.43:
/nfs (everyone)



Code:
root@pve-07-06:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content backup,vztmpl,iso


lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir


rbd: poolvm
        content rootdir,images
        krbd 1
        pool poolvm


cephfs: ISO
        path /mnt/pve/ISO
        content vztmpl,iso


rbd: poolct
        content images,rootdir
        krbd 0
        pool poolct


nfs: BACKUP01-NFS
        export /nfs
        path /mnt/pve/BACKUP01-NFS
        server 10.2.128.43
        content backup
        maxfiles 8

mount -a does not remount this share.
 
can you check journalctl | grep -i nfs output?
 

Only a lot of these errors:


Code:
May 05 14:18:31 pve-07-06 kernel: nfs: server 10.2.128.43 not responding, timed out
May 05 14:18:34 pve-07-06 pvestatd[2084]: storage 'BACKUP01-NFS' is not online
May 05 14:18:43 pve-07-06 pvestatd[2084]: unable to activate storage 'BACKUP01-NFS' - directory '/mnt/pve/BACKUP01-NFS' does not exist or is unreachable

And I've tried a lot of things, re-mounting etc. The only way is to reboot all nodes and then add the share again from the web GUI.
 
could you post the output of pveversion -v
 
Sure:
(I'm going to upgrade to the newest version this coming weekend)

Code:
root@pve-07-06:~# pveversion -v
proxmox-ve: 6.1-2 (running kernel: 5.3.10-1-pve)
pve-manager: 6.1-3 (running version: 6.1-3/37248ce6)
pve-kernel-5.3: 6.0-12
pve-kernel-helper: 6.0-12
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph: 14.2.18-pve1
ceph-fuse: 14.2.18-pve1
corosync: 3.0.2-pve4
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.13-pve1
libpve-access-control: 6.0-5
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-9
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-3
libpve-storage-perl: 6.1-2
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve3
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
openvswitch-switch: 2.10.7+ds1-0+deb10u1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-1
pve-cluster: 6.1-2
pve-container: 3.0-14
pve-docs: 6.1-3
pve-edk2-firmware: 2.20191002-1
pve-firewall: 4.0-9
pve-firmware: 3.0-4
pve-ha-manager: 3.0-8
pve-i18n: 2.0-3
pve-qemu-kvm: 4.1.1-2
pve-xtermjs: 3.13.2-1
qemu-server: 6.1-2
smartmontools: 7.0-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.2-pve2
 
Probably same problem here, with the latest and greatest pve. When my NFS server goes down and comes back up again, I have to reboot the PVE nodes to get the NFS share working again.

Did you ever find a better solution than rebooting?
 
maybe unmounting the share and remounting it works?
 
Same issue here.
Tried a few solutions mentioned both here and on Reddit.

What's strange for me is that, of my 3 NFS shares, one actually does auto-recover.
 
Interesting fix I blundered onto: just a straight umount. A second later the mount re-creates itself and continues working fine again.
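
For anyone who wants to try the umount approach carefully, here is a minimal sketch. The mount point is the one from the storage.cfg posted earlier; the function name force_umount and the DRY_RUN switch are invented for this example. The -f (force) and -l (lazy) options are an assumption added for the case where the server is still unreachable; the report above only mentions a plain umount. That Proxmox mounts the share again on its own afterwards is what was observed in this thread, not something this script guarantees.

```shell
#!/bin/sh
# Sketch only: detach a hung NFS mount so that it can be mounted again.
# MOUNTPOINT defaults to the path from storage.cfg in this thread.
MOUNTPOINT="${MOUNTPOINT:-/mnt/pve/BACKUP01-NFS}"

# force_umount PATH
# With DRY_RUN=1 (the default) the command is only printed.
# -f forces the unmount when the NFS server does not answer,
# -l detaches the mount lazily if it is still busy.
force_umount() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "umount -f -l $1"
    else
        umount -f -l "$1"
    fi
}

force_umount "$MOUNTPOINT"
```

Run it once as is to see the command, then with DRY_RUN=0 as root to actually unmount.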
 
Thanks for posting this.
Implemented, but not tested.

Either way, can't make it any worse! :p
I've got 8 Intel NUCs running Proxmox, all using a cron job to test if the NFS share is up, and it works (I just run it every 30 minutes, as it's just for nightly backups).
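
The cron script itself is not shown in the thread, so the following is only a guess at what such a check could look like, not the poster's actual job. It assumes the mount point from storage.cfg above; the function names, MOUNTS_FILE and CHECK_TIMEOUT are hypothetical. The idea: look the path up in /proc/mounts (which never blocks), then run stat under timeout, because a hung NFS mount blocks stat indefinitely. Only a mount that is listed but does not answer gets unmounted.

```shell
#!/bin/sh
# Sketch of a cron-driven check for a hung NFS mount.
# MOUNTPOINT is the path from this thread; the other names are hypothetical.
MOUNTPOINT="${MOUNTPOINT:-/mnt/pve/BACKUP01-NFS}"
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"
CHECK_TIMEOUT="${CHECK_TIMEOUT:-10}"

# Succeeds if PATH is listed as a mount point in the mounts table.
# Reading the table does not contact the NFS server, so it cannot hang.
is_mounted() {
    awk -v p="$1" '$2 == p { found = 1 } END { exit !found }' "$MOUNTS_FILE" 2>/dev/null
}

# Succeeds if stat on PATH returns within CHECK_TIMEOUT seconds.
# timeout(1) kills the stat process when the mount does not answer.
mount_responds() {
    timeout -s KILL "$CHECK_TIMEOUT" stat "$1" >/dev/null 2>&1
}

# Prints one of: not-mounted, ok, stale
share_state() {
    if ! is_mounted "$1"; then
        echo "not-mounted"
    elif mount_responds "$1"; then
        echo "ok"
    else
        echo "stale"
    fi
}

state=$(share_state "$MOUNTPOINT")
echo "$MOUNTPOINT: $state"

# Only a mount that is listed but unresponsive is detached.
if [ "$state" = "stale" ]; then
    umount -f -l "$MOUNTPOINT"
fi
```

To run it every 30 minutes as described, a line such as */30 * * * * root /usr/local/bin/check-nfs.sh in a file under /etc/cron.d would do; the script path is just an example.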
 
