Issues with NFS share not reconnecting if I have to reboot the nfs server.

ZerkerEOD

New Member
May 23, 2023
Hello, I am tired and finally at my wits' end with NFS. I have a home lab setup running Proxmox with a few VMs, including an Unraid server (which currently hosts my Docker containers). Unraid also stores a couple of VMs that I run from Proxmox. If I have to reboot the Unraid server for any reason, most of my NFS shares don't come back. This last time, only my appdata share reconnected, while my isos, domains, and data shares show as not connected even though they are all online. The only fix I have found is to reboot the Proxmox server. I have read online that a Proxmox service should retry failed NFS storages every minute or so, but they don't reconnect. It has been about 3 hours since I rebooted the Unraid server, and Proxmox still says the shares are offline and unreachable (they were reachable before the reboot). The one exception is appdata, which came back.

Any help would be appreciated. I would like to get this working so that I can pass the shares into VMs, as I would ultimately like to move my Docker containers off Unraid and use Unraid purely as a dedicated storage array.

[Attached screenshot: 1684859360978.png]

I also ran a couple of commands from other posts, but the output doesn't look like the expected output shown in those posts, so I'm not sure if it will help.

Code:
root@pve:~# pvesm nfsscan 10.[redacted]
/mnt/user/appdata *
/mnt/user/data    *
/mnt/user/domains *
/mnt/user/isos    *
root@pve:~# time pvesm nfsscan 10.[redacted]
/mnt/user/appdata *
/mnt/user/data    *
/mnt/user/domains *
/mnt/user/isos    *

real    0m0.258s
user    0m0.248s
sys     0m0.008s
 
Hi,
please post the storage config (cat /etc/pve/storage.cfg) and the VM config for the VM in question (qm config <VMID>).

Also dump and attach the journal from around the time when the connection is lost: journalctl --since <DATETIME> --until <DATETIME> > journal.txt.

ZerkerEOD said: "I run a few VM's to include an unraid server (that currently hosts my dockers). It currently holds a couple VM's that I run from proxmox."
Can you clarify this further: who is providing the NFS share and who is using it (PVE host, VMs/CTs)?
 
Here is the output of storage.cfg:

Code:
cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content vztmpl,iso,backup


lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir


nfs: unraid-domains
        export /mnt/user/domains
        path /mnt/pve/unraid-domains
        server 10.133.122.220
        content images
        options vers=3
        prune-backups keep-all=1


nfs: unraid-appdata
        export /mnt/user/appdata
        path /mnt/pve/unraid-appdata
        server 10.133.122.220
        content images
        options vers=3
        prune-backups keep-all=1


nfs: unraid-data
        export /mnt/user/data
        path /mnt/pve/unraid-data
        server 10.133.122.220
        content images
        options vers=3
        prune-backups keep-all=1


nfs: unraid-isos
        export /mnt/user/isos
        path /mnt/pve/unraid-isos
        server 10.133.122.220
        content iso
        nodes pve
        options vers=3
        prune-backups keep-all=1

Here is the config for the VM:
Code:
qm config 102
bios: ovmf
boot: order=virtio0
cores: 5
hostpci0: 0000:06:00
ide2: none,media=cdrom
machine: q35
memory: 12288
meta: creation-qemu=7.1.0,ctime=1673952885
name: Windows-11
net0: e1000=42:29:31:5F:9E:D8,bridge=vmbr0
numa: 0
ostype: win11
scsihw: virtio-scsi-single
smbios1: uuid=5451d605-d4d9-42d5-8c0d-6788cf945c7b
sockets: 2
usb0: host=1-6
usb1: host=7-1
vga: qxl,memory=32
virtio0: unraid-domains:102/vm-102-disk-0.raw,iothread=1,size=100G
vmgenid: 89d247c2-6c98-4a78-a439-a3335bcac818

I stopped the array shortly after 9:02 AM, and it was brought back online around 9:08 AM BST. Here is the journal from after the array came online until a few minutes after the shares should have reconnected; I am trying to capture enough to diagnose. I pulled a full day's worth because my issues were yesterday, whereas today all the shares seemed to reconnect. I will attach the file, but the entries at May 23 14:44:49 contain errors that did not show up in this morning's test:

Code:
May 23 14:44:49 pve pvestatd[1840]: unable to activate storage 'unraid-domains' - directory '/mnt/pve/unraid-domains' does not exist or is unreachable
May 23 14:44:49 pve kernel: NFS: server 10.133.122.220 error: fileid changed
                            fsid 0:44: expected fileid 0x9, got 0xa
May 23 14:44:49 pve pvestatd[1840]: unable to activate storage 'unraid-isos' - directory '/mnt/pve/unraid-isos' does not exist or is unreachable
May 23 14:44:49 pve kernel: NFS: server 10.133.122.220 error: fileid changed
                            fsid 0:42: expected fileid 0xa, got 0xb
May 23 14:44:49 pve pvestatd[1840]: unable to activate storage 'unraid-data' - directory '/mnt/pve/unraid-data' does not exist or is unreachable
May 23 14:44:49 pve kernel: NFS: server 10.133.122.220 error: fileid changed
                            fsid 0:43: expected fileid 0x8, got 0x9
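For anyone sifting through a full day of journal output, a small helper like the one below can summarise which storages pvestatd could not activate. This is only a sketch: the function name failed_storages and the default file name journal.txt are assumptions for illustration, and it relies solely on the "unable to activate storage '<name>'" wording visible in the log above.

```shell
# Count how often each storage appears in pvestatd "unable to activate storage" lines.
# The dump file is expected to be plain journalctl output saved to disk.
failed_storages() {
    local dump="${1:-journal.txt}"   # hypothetical default file name
    grep -o "unable to activate storage '[^']*'" "$dump" \
        | cut -d"'" -f2 \
        | sort \
        | uniq -c
}
```

Calling failed_storages journal.txt prints one line per storage with a leading count, which makes it easier to see whether a share fails only once or on every pvestatd cycle.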


To answer the additional questions: I host Proxmox on my own server. On it I have several VMs: the Unraid VM (100), which starts first and manages storage; a Home Assistant VM (101); and a Windows 11 VM (102) that I use for remote gaming. The Unraid and Home Assistant VMs are on the Proxmox drives, while the Windows 11 VM was placed on the Unraid share called unraid-domains. So Unraid provides the NFS shares, and PVE connects to them as storage. Unraid has 4 public NFS shares for this data because I didn't see a way for PVE to use a secure NFS share.
 
Does anyone know how to fix this? I am taking steps to limit the number of reboots of the Unraid server, but 100% uptime is impossible.
 
I'm having the same issue (except with Open Media Vault instead of Unraid). OMV is running in a VM and sharing an NFS share to the host. If I reboot the VM, the NFS share goes down and there's no easy way to get it back in Proxmox. The only way to fix it is to remove the NFS share from the host, reboot the host, then add the share back.
If I could at least figure out, or get help with, a script to do this, I may be content with a workaround.
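As a possible starting point for such a script, here is a minimal sketch that has not been tested against either setup in this thread. It assumes that the failure is a stale NFS mount left behind under /mnt/pve, and that pvestatd will mount the storage again on one of its next runs once the stale mount is gone. The function names (nfs_storages, is_nfs_mounted, is_stale, unstick_nfs) and the DRY_RUN variable are made up for illustration. A lazy, forced unmount can disrupt running VMs whose disks live on that share, so try it with DRY_RUN=1 first.

```shell
#!/usr/bin/env bash
# Sketch only: all function names and DRY_RUN are illustrative, not part of Proxmox.

# Print the names of all NFS storages defined in a storage.cfg-style file.
nfs_storages() {
    sed -n 's/^nfs:[[:space:]]*//p' "${1:-/etc/pve/storage.cfg}"
}

# Succeed if DIR is listed as an NFS mount in the mounts table.
# Reading the table does not touch the share, so it cannot hang.
is_nfs_mounted() {
    awk -v d="$1" '$2 == d && $3 ~ /^nfs/ { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Succeed if DIR returns an error or does not answer within SECS seconds.
# SIGKILL is used because a process blocked on NFS may ignore SIGTERM.
is_stale() {
    ! timeout -s KILL "${2:-5}" stat -t "$1" >/dev/null 2>&1
}

# Lazily force-unmount every configured NFS storage that is mounted but stale.
# Assumption: pvestatd re-activates (remounts) the storage afterwards.
unstick_nfs() {
    local base="${1:-/mnt/pve}" cfg="${2:-/etc/pve/storage.cfg}" name dir
    for name in $(nfs_storages "$cfg"); do
        dir="$base/$name"
        if is_nfs_mounted "$dir" && is_stale "$dir"; then
            echo "stale NFS mount: $dir"
            # Skip the actual unmount when DRY_RUN is set.
            [ -n "${DRY_RUN:-}" ] || umount -f -l "$dir"
        fi
    done
}
```

To try it, source the file as root and run DRY_RUN=1 unstick_nfs to see which mounts would be touched, then unstick_nfs to actually unmount them. If the storages come back in the Proxmox UI a short while later, the same call could be run from cron or a systemd timer after the storage VM reboots.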
 
