[SOLVED] NFS share not online after recent updates

Hi!

I have a small Proxmox cluster with 2 nodes running 6.3. Over the weekend I installed the current updates that the web GUI offered.
I rebooted node 1, but after booting up, the host no longer connects to the NFS share on my Synology NAS... why is that?
Node 2 was not rebooted, and there the NFS share is still "online". Will it also go offline when I reboot that host?

node 1 (pve1) = 192.168.1.3
node 2 (pve3) = 192.168.1.6

I have googled a lot, but could not find a solution:
Code:
root@pve1:~# showmount -e 192.168.1.13
rpc mount export: RPC: Unable to receive; errno = No route to host

Code:
root@pve3:~# showmount -e 192.168.1.13
Export list for 192.168.1.13:
/volume1/vmbackup        192.166.1.3,192.168.1.6

However, I can ping the NFS server from both nodes, but "rpcinfo -p 192.168.1.13" only works on node 2:
Code:
root@pve1:~# rpcinfo -p 192.168.1.13
192.168.1.13: RPC: Remote system error - No route to host
Code:
root@pve3:~# rpcinfo -p 192.168.1.13
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp    892  mountd
    100005    1   tcp    892  mountd
    100005    2   udp    892  mountd
    100005    2   tcp    892  mountd
    100005    3   udp    892  mountd
    100005    3   tcp    892  mountd
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100021    1   udp  54715  nlockmgr
    100021    3   udp  54715  nlockmgr
    100021    4   udp  54715  nlockmgr
    100021    1   tcp  46220  nlockmgr
    100021    3   tcp  46220  nlockmgr
    100021    4   tcp  46220  nlockmgr
    100024    1   udp  60464  status
    100024    1   tcp  35473  status

What's the problem here? I cannot find any difference in the network configuration of the two nodes.
All hosts are in the same subnet on the same switch.
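For reference, this is roughly how the routing towards the NAS can be compared on the two nodes (run on both and compare the interface and source address shown):
Code:
# show which interface/source IP the kernel would use to reach the NAS
ip route get 192.168.1.13
# show the interface/bridge configuration for comparison between the nodes
ip addr show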
 
Hello,

Please provide the output of pveversion -v, and please also check the syslog with journalctl -f to see if there is any hint on the pve1 node.
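If the plain journal is too noisy, narrowing it down to the storage daemon and NFS-related messages should also work, roughly like this:
Code:
# follow only the Proxmox storage status daemon
journalctl -f -u pvestatd
# or search the current boot for NFS / mount related messages
journalctl -b | grep -iE 'nfs|mount'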
 
No problem, here it is:
Code:
root@pve1:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.106-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.11-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-9
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-8
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1

The only thing I can see about the NFS share in the syslog is this line, repeated every couple of seconds:
Code:
Apr 07 09:30:04 pve1 pvestatd[4039]: storage 'vmbackup' is not online

All the tips I found while googling around point to problems with iptables or some other firewall. However, I do not have any firewall enabled, neither on the PVE nodes nor on the NFS server!
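For reference, checking that should look roughly like this (no active rules expected on either node; whether iptables or nftables is relevant depends on the setup):
Code:
# Proxmox firewall status on the node
pve-firewall status
# list any iptables rules that might block RPC/NFS traffic
iptables -L -n -v
# on newer setups nftables may be used instead
nft list ruleset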
 
Hi,

Thanks for the outputs.

Does the NFS server log any errors that could point to the root cause?

I guess the reboot will fix the issue.
 
Hi!
I did not find any info about this problem in the log files on the server... which logfile should I look at?
I searched syslog and dmesg.

I also rebooted the NFS server just now... and now neither node can mount the NFS share anymore. :-(

However, I just tried to mount this share from another Debian system... no problem, the mount works!
Do you think another reboot of the PVE node should bring it back to work?

Thanks for your help.

Regards,
Ingo
 
I just rebooted node 1 but the problem remains. :-(
Any other ideas what I could try?

Like I wrote, mounting this NFS share from another Debian system works without a problem.
And node 1 has another NFS share configured, which simply reconnects successfully after rebooting.

:(
 
Hi,
Is the firewall enabled on the node having the connection issues? If so, maybe try disabling it to see if it interferes. Otherwise you will have to check your network with tools such as socat, tcpdump, nmap, etc.
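For example, capturing the portmapper/mountd/NFS traffic towards the NAS while the storage check runs could look something like this (vmbr0 is just the usual default bridge name, adjust to your setup):
Code:
# capture RPC portmapper, mountd and NFS traffic between the node and the NAS
tcpdump -ni vmbr0 'host 192.168.1.13 and (port 111 or port 892 or port 2049)'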
 
As I wrote in a post before:
All the tips I found while googling around point to problems with iptables or some other firewall. However, I do not have any firewall enabled, neither on the PVE nodes nor on the NFS server!

I checked the open TCP and UDP ports with nmap; everything looks fine to me:

Code:
root@pve1:~# nmap -sS -sU -p 111,2049 192.168.1.13
Starting Nmap 7.70 ( https://nmap.org ) at 2021-04-07 20:34 CEST
Nmap scan report for backup.xxx.de (192.168.1.13)
Host is up (0.00041s latency).

PORT     STATE SERVICE
111/tcp  open  rpcbind
2049/tcp open  nfs
111/udp  open  rpcbind
2049/udp open  nfs
MAC Address: 00:11:32:9A:CB:8A (Synology Incorporated)

Nmap done: 1 IP address (1 host up) scanned in 0.92 seconds

The other Debian system I use is a VM running on node 1. Here you can see that it mounts flawlessly:
Code:
root@checkmk:~# mount 192.168.1.13:/volume1/vmbackup /mnt/test/
root@checkmk:~# mount | grep vmbackup
192.168.1.13:/volume1/vmbackup on /mnt/test type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.1,local_lock=none,addr=192.168.1.13)

And also mounting NFS shares from a different Synology NAS on node 1 is not a problem:
Code:
root@pve1:~# mount | grep dss
dss:/volume1/esx-nfs-storage on /mnt/pve/dss-storage type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.4,mountvers=3,mountport=892,mountproto=udp,local_lock=none,addr=192.168.1.4)

How can I inspect this more deeply with socat or tcpdump?
 
Solved!

While still searching the web for a solution I came across this really long thread from 2019 in this forum, also dealing with a Synology as NFS server:
https://forum.proxmox.com/threads/mount-no-longer-works-in-proxmox-6-nfs-synology.56503/post-263048
I simply had to disable the Synology option "Reply to ARP requests if the target IP address is a local address configured on the incoming interface." and now everything is working again! :D

But I still do not really understand it:
On my other Synology NAS this option is enabled and PVE connects without a problem.
And the other Debian system also connects fine.

Strange...
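My guess is that with several interfaces on the NAS, the node ended up with an ARP entry pointing at the wrong one. If anyone wants to check this on their own setup, comparing the cached MAC address should look roughly like this (arping from iputils-arping, vmbr0 assumed as the bridge name):
Code:
# which MAC address does this node currently have cached for the NAS?
ip neigh show 192.168.1.13
# ask on the wire and see which MAC of the NAS answers
arping -I vmbr0 -c 3 192.168.1.13
# clear a possibly stale entry and let it be re-learned
ip neigh flush to 192.168.1.13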
 
Hello,

My NFS connections also went down for all of my containers after a Proxmox 6.3 update about a week ago.
Only Proxmox containers seem to be affected. The NFS client says the server is rejecting the connection, yet the NFS server doesn't seem to log any connection attempt. Running dmesg on the client spews about 3 pages of errors mentioning NFS and "apparmor", plus quite a bit of other noise.

All of my Proxmox VMs, as well as separate physical machines, can access the same server share just fine, with the same command-line options used in my Proxmox LXC containers. Only the LXC containers are failing to connect.

I am not clear if this issue is similar to the OP or not. Let me know if I need to start a new thread.
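In case it helps narrowing it down: the AppArmor messages in dmesg can be filtered, and for privileged containers Proxmox has a container feature flag that is supposed to allow NFS mounts from inside the container, something along these lines (CTID 100 is just a placeholder):
Code:
# show the AppArmor denials related to the NFS mount attempts
dmesg | grep -i 'apparmor.*denied'
# for a privileged container, allow NFS mounts via the container features
# (replace 100 with the actual container ID; takes effect after a container restart)
pct set 100 --features mount=nfs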
 
Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.106-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.103-1-pve: 5.4.103-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 15.2.9-pve1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-8
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.13-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-9
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-5
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-10
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1
 
