nfs-problem

TErxleben
Hi,
I have configured NFS shares from a third server as storage on two Proxmox servers. If the exporting NFS server goes offline, the two Proxmox servers notice it. Unfortunately, they never notice when that server is back online; the storage remains unreachable.
How can I reactivate the shares without restarting the Proxmox servers?
storage.cfg:
Code:
basic@pve1:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content backup,iso,vztmpl

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

nfs: pve
        export /mnt/backup
        path /mnt/pve
        server pve
        content backup
        options soft
        prune-backups keep-all=1
 
Hi,
the storage should become reachable again as soon as the NFS server is back online. Please check the connectivity to your NFS storage once it is back up. Can you e.g. scan for shares by running pvesm scan nfs <NFS-server-IP>?
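If that scan fails, it can help to rule out the NFS service itself. A minimal sketch of such a check, using the server IP that appears later in this thread (note that showmount/rpcinfo need the server's rpcbind to answer, which pure-NFSv4 setups may not):

Code:
# ask the PVE storage layer which exports the server offers
pvesm scan nfs 192.168.115.230
# cross-check with the plain NFS tooling
showmount -e 192.168.115.230
rpcinfo -p 192.168.115.230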
 
Btw., it's PVE 7.4-3.

Code:
basic@pve1:~# pvesm scan nfs pve
/mnt/backup pve2.local,pve1.local

Trying to mount from pve manually:

Code:
root@pve1:~# mount -v  pve:/mnt/backup /mnt/test
mount.nfs: timeout set for Wed Apr 26 17:04:07 2023
mount.nfs: trying text-based options 'vers=4.2,addr=192.168.115.230,clientaddr=192.168.115.231'
mount.nfs: mount(2): Stale file handle
mount.nfs: trying text-based options 'vers=4.2,addr=192.168.115.230,clientaddr=192.168.115.231'
mount.nfs: mount(2): Stale file handle
mount.nfs: trying text-based options 'vers=4.2,addr=192.168.115.230,clientaddr=192.168.115.231'
mount.nfs: mount(2): Stale file handle
^C
 
The storage remains unreachable.
How exactly does it remain unreachable? Can you share more details? Also check the output of mount | grep 'type nfs' and check the journal for errors related to the NFS storage with journalctl -b.
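Since a full boot journal is very noisy, one possible way to narrow it down to the relevant entries (just a suggestion) is:

Code:
# only show NFS- and RPC-related journal entries from the current boot
journalctl -b | grep -iE 'nfs|rpc'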
 
Hi Chris,

Code:
root@pve1:~# mount -v pve:/mnt/backup /mnt/test
mount.nfs: timeout set for Wed Apr 26 17:04:07 2023
mount.nfs: trying text-based options 'vers=4.2,addr=192.168.115.230,clientaddr=192.168.115.231'
mount.nfs: mount(2): Stale file handle

Code:
root@pve1:~# mount | grep 'type nfs'
pve:/mnt/backup on /mnt/pve/pve type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.115.231,local_lock=none,addr=192.168.115.230)
pve:/mnt/backup on /mnt/pve type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.115.231,local_lock=none,addr=192.168.115.230)


Code:
root@pve1:~# journalctl -b
-- Journal begins at Thu 2023-04-13 13:49:38 CEST, ends at Wed 2023-04-26 17:09:42 CEST. --
Apr 15 16:27:04 pve1 kernel: Linux version 5.15.104-1-pve (build@proxmox) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP PVE 5.15>
Apr 15 16:27:04 pve1 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.104-1-pve root=/dev/mapper/pve-root ro quiet
Apr 15 16:27:04 pve1 kernel: KERNEL supported cpus:
Apr 15 16:27:04 pve1 kernel:   Intel GenuineIntel
Apr 15 16:27:04 pve1 kernel:   AMD AuthenticAMD
Apr 15 16:27:04 pve1 kernel:   Hygon HygonGenuine
Apr 15 16:27:04 pve1 kernel:   Centaur CentaurHauls
Apr 15 16:27:04 pve1 kernel:   zhaoxin Shanghai
Apr 15 16:27:04 pve1 kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Apr 15 16:27:04 pve1 kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Apr 15 16:27:04 pve1 kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Apr 15 16:27:04 pve1 kernel: x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
Apr 15 16:27:04 pve1 kernel: x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
Apr 15 16:27:04 pve1 kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Apr 15 16:27:04 pve1 kernel: x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64
Apr 15 16:27:04 pve1 kernel: x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64
Apr 15 16:27:04 pve1 kernel: x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.
Apr 15 16:27:04 pve1 kernel: signal: max sigframe size: 2032
Apr 15 16:27:04 pve1 kernel: BIOS-provided physical RAM map:
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x0000000000100000-0x00000000a2a28fff] usable
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000a2a29000-0x00000000a2a29fff] ACPI NVS
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000a2a2a000-0x00000000a2a5afff] usable
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000a2a5b000-0x00000000a2a5bfff] reserved
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000a2a5c000-0x00000000b679afff] usable
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000b679b000-0x00000000b6c9afff] type 20
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000b6c9b000-0x00000000b7c7efff] reserved
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000b7c7f000-0x00000000b7e7efff] ACPI NVS
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000b7e7f000-0x00000000b7efefff] ACPI data
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000b7eff000-0x00000000b7efffff] usable
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000b7f00000-0x00000000cc7fffff] reserved
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
Apr 15 16:27:04 pve1 kernel: BIOS-e820: [mem 0x0000000100000000-0x000000102f7fffff] usable
Apr 15 16:27:04 pve1 kernel: NX (Execute Disable) protection: active
Apr 15 16:27:04 pve1 kernel: efi: EFI v2.70 by HP
Apr 15 16:27:04 pve1 kernel: efi: ACPI=0xb7efe000 ACPI 2.0=0xb7efe014 TPMFinalLog=0xb7e0e000 SMBIOS=0xb6f66000 ESRT=0xb6f7a218 MEMATTR=0xb2a22018
Apr 15 16:27:04 pve1 kernel: secureboot: Secure boot disabled
Apr 15 16:27:04 pve1 kernel: SMBIOS 3.2 present.
Apr 15 16:27:04 pve1 kernel: DMI: HP HP ProDesk 600 G6 Desktop Mini PC/8715, BIOS S22 Ver. 02.06.02 05/14/2021
 
root@pve1:~# mount | grep 'type nfs'
pve:/mnt/backup on /mnt/pve/pve type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.115.231,local_lock=none,addr=192.168.115.230)
pve:/mnt/backup on /mnt/pve type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.115.231,local_lock=none,addr=192.168.115.230)
Your issue is probably related to the fact that you have NFS shares mounted on both /mnt/pve and /mnt/pve/pve. So when the NFS server goes offline, the file handle for the mount point /mnt/pve/pve is also lost / becomes invalid. I would recommend separating the two mount points; see the sketch below for one way to clear the stale mounts without a reboot.
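A minimal sketch, assuming the paths from your output above; the pvesm disable/enable round-trip is just one way to make PVE remount the storage cleanly:

Code:
# lazily force-detach both stale NFS mount points
umount -f -l /mnt/pve/pve
umount -f -l /mnt/pve
# disable and re-enable the storage so pvestatd mounts it again
pvesm set pve --disable 1
pvesm set pve --disable 0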
 
Hi Chris,
Thank you for your keen eye.
I actually changed the WebGUI's suggested mount point from /mnt/pve/pve to /mnt/pve manually in storage.cfg. Unfortunately, that isn't possible in the WebGUI.
/mnt/pve/pve no longer appears in the WebGUI.
I will check it and give feedback.

Maybe there is a problem with the synchronization between CLI and WebGUI?
 
Problem solved.

I fixed this by explicitly un-exporting and re-exporting the relevant exports on the server. For example, to do this for all exports:

Code:
# exportfs -ua                # un-export all shares
# cat /proc/fs/nfs/exports    # verify the kernel export table is now empty
# exportfs -a                 # re-export everything from /etc/exports
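If you would rather not drop all exports at once, exportfs can also handle a single entry; the host and path below are the ones from this thread:

Code:
# exportfs -u pve1.local:/mnt/backup    # un-export just this one entry
# exportfs -r                           # re-export / re-sync everything from /etc/exports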
 
I have actually adapted the suggestion of the WebGUI from /mnt/pve/pve to /mnt/pve manually in the storage.cfg. Unfortunately this doesn't work in the WebGUI.
By this you probably created the double mount, since the storage was not unmounted before editing the config, I assume. I do not understand your motivation for doing this... maybe I am missing your point. NFS shares are mounted on a mount point named after the storage under /mnt/pve. I recommend not interfering with that.
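For reference, a sketch of how the storage entry would look with the default mount point kept (same values as in this thread, only the path changed back to the WebGUI's suggestion):

Code:
nfs: pve
        path /mnt/pve/pve
        export /mnt/backup
        server pve
        content backup
        options soft
        prune-backups keep-all=1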
 
Well Chris,
- The NFS server/system was probably to blame here.
- In the WebGUI, I'm missing some options for adapting the system to my needs.
- I would like to be able to choose the mount point myself or simply accept the system's suggestion.
- A "soft" option when mounting NFS shares should also be selectable, either by default or in the WebGUI.
- Deactivating an NFS share in the WebGUI should actually remove the active mount.

Nevertheless, thank you for your efforts. You already have a great system. Hats off!
 