Issues with NFS share not reconnecting if I have to reboot the NFS server.

ZerkerEOD · May 23, 2023

Hello, I am tired and finally at my wits' end with NFS. I currently have a home lab setup. I run a few VMs, including an Unraid server (which currently hosts my Docker containers). Unraid also holds a couple of VMs that I run from Proxmox. If I have to reboot the Unraid server for any reason, most of my NFS shares don't come back. This last time, only my appdata share reconnected, while my iso, domains, and data shares show as not connected even though they are all online. The only fix I have found is to reboot the Proxmox server. I read online that a Proxmox service should retry failed NFS mounts every minute or so, but they don't reconnect. It has been about 3 hours since I rebooted the Unraid server, and Proxmox still says the shares are offline and unreachable (they were reachable before the reboot), all except appdata, which came back.

Any help would be nice. I would like to get this working so that I can pass the shares into VMs, as I would ultimately like to move my Docker containers off Unraid and use Unraid purely as a dedicated storage array.

I also ran a couple of things from other posts, but the output doesn't look like the expected output from those posts, so I'm not sure if it will help.

Code:

root@pve:~# pvesm nfsscan 10.[redacted]
/mnt/user/appdata *
/mnt/user/data    *
/mnt/user/domains *
/mnt/user/isos    *
root@pve:~# time pvesm nfsscan 10.[redacted]
/mnt/user/appdata *
/mnt/user/data    *
/mnt/user/domains *
/mnt/user/isos    *

real    0m0.258s
user    0m0.248s
sys     0m0.008s

Chris · May 24, 2023

Hi,
please post the storage config (cat /etc/pve/storage.cfg) and the VM config for the VM in question (qm config <VMID>).

Also dump and attach the journal from around the time when the connection is lost: journalctl --since <DATETIME> --until <DATETIME> > journal.txt
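
For example, here is a minimal sketch of that journal dump wrapped in a small shell function. The function name dump_journal and the JOURNALCTL override variable are made up for illustration (the override only exists so the command can be dry-run); --since, --until, and --no-pager are standard journalctl options. The timestamps in the usage comment are placeholders to replace with the actual reboot window.

```shell
#!/bin/sh
# Sketch: dump the journal for a time window around the NFS server reboot.
# JOURNALCTL is a hypothetical override for dry runs, not a journalctl feature.
JOURNALCTL="${JOURNALCTL:-journalctl}"

dump_journal() {
    # $1 = window start, $2 = window end, $3 = output file
    "$JOURNALCTL" --no-pager --since "$1" --until "$2" > "$3"
}

# Usage (placeholder times, adjust to when the share went away):
#   dump_journal "2023-05-23 14:30:00" "2023-05-23 15:00:00" journal.txt
```

Quoting the timestamps matters because they contain a space; unquoted, the date and the time would be passed as separate arguments.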

ZerkerEOD said:
I run a few VMs, including an Unraid server (which currently hosts my Docker containers). Unraid also holds a couple of VMs that I run from Proxmox.

Can you clarify this further: who is providing the NFS share, and who is using it (PVE host, VMs/CTs)?

ZerkerEOD · May 24, 2023

Chris said:
Hi,
please post the storage config (cat /etc/pve/storage.cfg) and the VM config for the VM in question (qm config <VMID>).

Also dump and attach the journal from around the time when the connection is lost: journalctl --since <DATETIME> --until <DATETIME> > journal.txt

Can you clarify this further: who is providing the NFS share, and who is using it (PVE host, VMs/CTs)?

Here is the output of storage.cfg:

Code:

cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content vztmpl,iso,backup


lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir


nfs: unraid-domains
        export /mnt/user/domains
        path /mnt/pve/unraid-domains
        server 10.133.122.220
        content images
        options vers=3
        prune-backups keep-all=1


nfs: unraid-appdata
        export /mnt/user/appdata
        path /mnt/pve/unraid-appdata
        server 10.133.122.220
        content images
        options vers=3
        prune-backups keep-all=1


nfs: unraid-data
        export /mnt/user/data
        path /mnt/pve/unraid-data
        server 10.133.122.220
        content images
        options vers=3
        prune-backups keep-all=1


nfs: unraid-isos
        export /mnt/user/isos
        path /mnt/pve/unraid-isos
        server 10.133.122.220
        content iso
        nodes pve
        options vers=3
        prune-backups keep-all=1

Here is the config for the VM:

Code:

qm config 102
bios: ovmf
boot: order=virtio0
cores: 5
hostpci0: 0000:06:00
ide2: none,media=cdrom
machine: q35
memory: 12288
meta: creation-qemu=7.1.0,ctime=1673952885
name: Windows-11
net0: e1000=42:29:31:5F:9E:D8,bridge=vmbr0
numa: 0
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=5451d605-d4d9-42d5-8c0d-6788cf945c7b
sockets: 2
usb0: host=1-6
usb1: host=7-1
vga: qxl,memory=32
virtio0: unraid-domains:102/vm-102-disk-0.raw,iothread=1,size=100G
vmgenid: 89d247c2-6c98-4a78-a439-a3335bcac818

I stopped the array shortly after 9:02 AM, and it was brought back online around 9:08 AM BST. Here is the journal from after the array came online through a few minutes after it should have reconnected, hopefully enough to diagnose. I pulled a full day's worth because my issues were yesterday; today all the shares seemed to reconnect. I will attach the file, but the entries at May 23 14:44:49 contain errors that did not show up in this morning's test:

Code:

May 23 14:44:49 pve pvestatd[1840]: unable to activate storage 'unraid-domains' - directory '/mnt/pve/unraid-domains' does not exist or is unreachable
May 23 14:44:49 pve kernel: NFS: server 10.133.122.220 error: fileid changed
                            fsid 0:44: expected fileid 0x9, got 0xa
May 23 14:44:49 pve pvestatd[1840]: unable to activate storage 'unraid-isos' - directory '/mnt/pve/unraid-isos' does not exist or is unreachable
May 23 14:44:49 pve kernel: NFS: server 10.133.122.220 error: fileid changed
                            fsid 0:42: expected fileid 0xa, got 0xb
May 23 14:44:49 pve pvestatd[1840]: unable to activate storage 'unraid-data' - directory '/mnt/pve/unraid-data' does not exist or is unreachable
May 23 14:44:49 pve kernel: NFS: server 10.133.122.220 error: fileid changed
                            fsid 0:43: expected fileid 0x8, got 0x9

To answer the additional questions: I am hosting Proxmox on my own server. On that Proxmox host I have several VMs: the first one that starts is the Unraid VM (100), which manages storage, then a Home Assist VM (101) and a Windows 11 VM (102) that I use for remote gaming. The Unraid and Home Assist VMs are on the Proxmox drives, while the Windows 11 VM was placed on the Unraid share called unraid-domains. So Unraid provides the NFS shares, and PVE connects to them as storage. Unraid has 4 public NFS shares for this data because I didn't see how PVE could use a secure NFS share.

ZerkerEOD · May 26, 2023

Does anyone know how to fix this? I am taking steps to limit the number of reboots of the Unraid server, but 100% uptime is impossible.

supermarkert · Sep 25, 2023

I'm having the same issue (except with Open Media Vault instead of Unraid). OMV is running in a VM and sharing an NFS share to the host. If I reboot the VM, the NFS share goes down and there's no easy way to get it back in Proxmox. The only way to fix it is: remove the NFS share from the host, reboot the host, then add the share back.
If I could at least figure out, or get help with, a script to do this, I might be content with a workaround.

philliphs · Nov 21, 2024

umount -l -f /mnt/pve/<your share name>
from the shell on each node

I just remounted the drive every time it failed. I wish Proxmox did this automatically.
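
As a rough sketch of how that manual step could be scripted: the helper below reads the NFS mount paths out of a storage.cfg-style file, checks whether each one still answers, and lazy-unmounts the ones that don't. All function names are hypothetical, the 5-second timeout is an arbitrary choice, and it assumes the storage gets mounted again the next time Proxmox activates it (which the replies in this thread suggest but which is not verified here). It also assumes mount paths contain no spaces.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of Proxmox.

nfs_paths() {
    # Print the "path" value of every "nfs:" section in a storage.cfg-style file.
    awk '
        /^[a-z]+:/ { in_nfs = ($1 == "nfs:") }
        in_nfs && $1 == "path" { print $2 }
    ' "$1"
}

is_stale() {
    # Treat a path as stale if stat fails or gets no answer within 5 seconds.
    ! timeout -s KILL 5 stat -t "$1" >/dev/null 2>&1
}

remount_stale() {
    # $1 = config file (defaults to the Proxmox storage config)
    cfg="${1:-/etc/pve/storage.cfg}"
    for p in $(nfs_paths "$cfg"); do
        # Only touch paths the kernel still lists as mounted.
        if grep -qs " $p " /proc/mounts && is_stale "$p"; then
            echo "stale: $p, lazy-unmounting"
            umount -l -f "$p"
        fi
    done
}

# Usage, as root on each node (for example from cron):
#   remount_stale
```

The check reads /proc/mounts rather than calling a tool that touches the mount point, so the lookup itself cannot hang on an unreachable server.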

okiedokie · Jan 2, 2025

philliphs said:
umount -l -f /mnt/pve/<your share name>
from the shell on each node

I just remounted the drive every time it failed. I wish Proxmox did this automatically.

Thank you - this is way better than rebooting.
I wish Proxmox would do this automatically or through the GUI.

waltar · Jan 2, 2025

Proxmox does this automatically with a default NFS 4.2 mount (that is how it works on our 5-node PVE cluster).
I would look on the NFS server side for your issue. Is it exporting as 4.2 too? Does exportfs -v (on the NFS server) give the expected output when the PVE nodes fail to mount?
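
To see which NFS version a PVE node is actually using for a share, something like the following sketch could help. Note that the storage.cfg posted earlier in the thread pins "options vers=3", so those mounts would not be 4.2. The helper names are made up for illustration; the code only reads /proc/mounts and picks the vers= value out of the option string.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Proxmox or nfs-utils).

mount_options() {
    # Print the option string of the mount at path $1, read from /proc/mounts
    # so that a hung NFS server cannot block the lookup.
    awk -v m="$1" '$2 == m { print $4 }' /proc/mounts
}

nfs_vers() {
    # Extract the value of vers= from a comma-separated option string.
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^vers=//p'
}

# Usage on the PVE host (storage name taken from the storage.cfg above):
#   nfs_vers "$(mount_options /mnt/pve/unraid-domains)"
```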

okiedokie · Jan 3, 2025

waltar said:
Proxmox does this automatically with a default NFS 4.2 mount (that is how it works on our 5-node PVE cluster).
I would look on the NFS server side for your issue. Is it exporting as 4.2 too? Does exportfs -v (on the NFS server) give the expected output when the PVE nodes fail to mount?

Thanks Waltar
When I do exportfs -v I get:

Code:

/mnt/user/testnfs
                <world>(async,wdelay,hide,no_subtree_check,fsid=100,anonuid=99,anongid=100,sec=sys,rw,insecure,root_squash,all_squash)

This is using Unraid 7 rc.2.
I will set up an NFS share on TrueNAS and see if that works better.

waltar · Jan 3, 2025

For NFS I would prefer the default sync option over async, even if it is a little slower.
Did you change the default mount options from PVE for your NFS share (in /etc/pve/storage.cfg)?
On the PVE host, "mount | grep testnfs" should show "hard" (the default) and not "soft" in the output line.
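
The same hard/soft check can be done without eyeballing the mount output. This is only a sketch: mount_mode is a made-up helper name, and it simply looks for the hard, soft, or softerr keyword in a comma-separated option string such as the fourth field of /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper: classify an NFS mount option string as hard or soft.

mount_mode() {
    # $1 = comma-separated mount options, e.g. the 4th field of /proc/mounts
    case ",$1," in
        *,hard,*)    echo hard ;;
        *,soft,*)    echo soft ;;
        *,softerr,*) echo softerr ;;
        *)           echo unknown ;;
    esac
}

# Usage on the PVE host, printing "mountpoint mode" for every NFS mount:
#   awk '$3 ~ /^nfs/ { print $2, $4 }' /proc/mounts |
#   while read -r target opts; do
#       echo "$target $(mount_mode "$opts")"
#   done
```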

phillip-de · Apr 16, 2025

Can you give us an update?
I have the same issue with Unraid 7.
