NFS or SSHfs shared from host randomly hangs VM.

ola · Renowned Member · Mar 6, 2013
Hi!

I've been hitting my head against the wall for a while now: one of my VMs that has an NFS mount from the Proxmox host started having issues where the NFS share just stalls randomly, for no apparent reason (I think this started after the kernel upgrade from 5.13 to 5.15 on the host).
I tried mounting the same folder over sshfs instead, but got the same issue.
I also installed a fresh VM, going from Ubuntu 20.04 to Ubuntu 22.04, but the same problem remains: NFS or sshfs randomly hangs.
I can keep a ping running from the VM to the host without losing any packets even while it drops the NFS connection, and I can access the same NFS share from a second VM at the very moment the first one says it's unreachable.
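(For anyone else landing here: sshfs at least can be told to re-establish a dead connection instead of hanging forever. A sketch of a guest-side /etc/fstab line — the host IP matches this thread, but the user and paths are placeholders:)

```
# /etc/fstab in the VM -- sshfs with automatic reconnects.
# user@.../paths are hypothetical; "reconnect" plus the ServerAlive*
# options make a dead session error out and reconnect rather than hang.
user@192.168.X.3:/srv/share  /mnt/share  fuse.sshfs  _netdev,reconnect,ServerAliveInterval=15,ServerAliveCountMax=3  0  0
```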

There is no information in any log on the host, and the only thing in the VM's syslog is
Code:
Jul 18 15:41:41 X kernel: [62512.474034] nfs: server 192.168.X.3 not responding, still trying
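(That kernel message is easy to count from a saved log if you want to correlate the stalls with other events. A tiny helper — the function name is made up, and it assumes the syslog has been saved to a plain file:)

```shell
#!/bin/sh
# Count "nfs: server ... not responding" events in a saved syslog file.
# Usage: count_nfs_stalls /var/log/syslog
count_nfs_stalls() {
  grep -c 'nfs: server .* not responding' "$1"
}
```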

I have changed from virtio networking to e1000 to test, without success,
and I'm pretty much out of ideas. It ran perfectly for over a year, and now I have no clue how to make it work again.
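(One more guest-side knob worth trying while the kernel angle gets sorted out: NFS mount options that make a stall surface as an error instead of an indefinite hang. A sketch for the VM's /etc/fstab — the share path and mount point are placeholders, and note that `soft` trades hangs for I/O errors, so don't use it for data that can't tolerate them:)

```
# /etc/fstab in the VM -- hypothetical share/mount paths.
# vers= pins the protocol version, to rule out v3/v4 negotiation issues;
# soft,timeo=150,retrans=3 returns an I/O error after a few retries
# instead of hanging forever ("not responding, still trying").
192.168.X.3:/srv/share  /mnt/share  nfs  vers=4.2,soft,timeo=150,retrans=3,_netdev  0  0
```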

This is the version I'm running on the host:
Code:
root@pmox1:~# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.39-1-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-6
pve-kernel-helper: 7.2-6
pve-kernel-5.13: 7.1-9
pve-kernel-5.4: 6.4-13
pve-kernel-5.15.39-1-pve: 5.15.39-1
pve-kernel-5.15.35-3-pve: 5.15.35-6
pve-kernel-5.15.35-2-pve: 5.15.35-5
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-4-pve: 5.13.19-9
pve-kernel-5.4.166-1-pve: 5.4.166-1
pve-kernel-5.4.140-1-pve: 5.4.140-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
ceph-fuse: 14.2.21-1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-3
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-5
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.3-1
proxmox-backup-file-restore: 2.2.3-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-1
pve-container: 4.2-1
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.4-2
pve-ha-manager: 3.3-4
pve-i18n: 2.7-2
pve-qemu-kvm: 6.2.0-11
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1

All help is much appreciated, because I'm starting to lose my sanity over this.

//Ola
 
Same here...

Code:
root@serwer:~# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.39-1-pve)
pve-manager: 7.2-7 (running version: 7.2-7/d0dd0e85)
pve-kernel-5.15: 7.2-6
pve-kernel-helper: 7.2-6
pve-kernel-5.15.39-1-pve: 5.15.39-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-3
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-5
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.3-1
proxmox-backup-file-restore: 2.2.3-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-1
pve-container: 4.2-1
pve-docs: 7.2-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.4-2
pve-ha-manager: 3.3-4
pve-i18n: 2.7-2
pve-qemu-kvm: 6.2.0-11
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.4-pve1
 
Do you have console access to your machine? I have not been able to make it boot the 5.13.x kernel remotely, so I think it has to be done locally, or via OOB access in the GRUB menu.
Also, interesting that I'm not alone! Have you seen any pattern? When did you discover the problem?
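(On the remote-boot problem: on hosts that boot via GRUB — not the ZFS/systemd-boot layout — the default entry can be changed over SSH without touching the boot menu. A sketch; the menu title below is only an example, you must copy the exact title from your own grub.cfg:)

```
# Find the exact menu entry title first:
#   grep "menuentry '" /boot/grub/grub.cfg
# Then set it in /etc/default/grub (copy your own title verbatim):
GRUB_DEFAULT="Advanced options for Proxmox VE GNU/Linux>Proxmox VE GNU/Linux, with Linux 5.13.19-6-pve"
# and apply:
#   update-grub
```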
 
It's a relatively new Proxmox install, since I decided to move my config to Proxmox less than two weeks ago.
But from the start I've had problems with NFS shared disk access from my Ubuntu 20.04 VM, so a few days ago I installed 22.04, and nothing changed.
Now I've moved to Samba shares instead of NFS, and things look like they've improved (but I still think it should work better).
I don't have direct access to the boot options on that machine, since it's totally headless: my motherboard lets me boot without a graphics card, so it currently doesn't have one.

Maybe we should try the Proxmox Edge kernel? https://github.com/fabianishere/pve-edge-kernel
I don't have any other ideas to make it work.
 
Ahh! So you share from a VM to other VMs, or from a VM to some external clients?
I share from the Proxmox host itself to a VM on the same host.

That's interesting!
 
I didn't write anywhere that I'm sharing from VM to VM.
I'm sharing from the host to VMs. That's why I wrote 'same here...'
The Ubuntu VMs are the ones giving me the most problems, but I suppose that if I used the shared drives as heavily on VMs with other OSes, the effect would be similar.

For now... Samba is slower, but at least it works (also sharing from host to VMs).
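(For anyone wanting to reproduce the Samba fallback: a minimal sketch of the host-side share plus the guest-side mount. The share name, paths, user, and credentials file here are all hypothetical:)

```
# /etc/samba/smb.conf on the host -- hypothetical share
[shared]
    path = /srv/share
    read only = no
    valid users = someuser

# /etc/fstab in the VM -- hypothetical mount; a credentials file
# keeps the password out of fstab
//192.168.X.3/shared  /mnt/shared  cifs  credentials=/etc/smb-cred,_netdev,nofail  0  0
```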
 
Sorry, my bad! I misunderstood this part:
... I've had problems with nfs shared disk access from my Ubuntu 20.04 ...
But yeah, maybe try a later kernel, or a 5.13.x one, and see. I'm working on a way to get IPMI access to my machine so I can test it.
 
Unfortunately, I'm experiencing the same thing with Samba shares.
I'll be going with the edge kernel now. We'll see if that helps.
 
After some short testing with the 5.18.12-edge kernel, the NFS shares work much, much better now.
 
Not everything is OK on that kernel, though.
From what I've observed, the VMs still take ages to reboot/shut down, showing timeouts on sync operations.
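(The slow shutdown symptom is often the guest waiting on a dead network mount. Systemd mount options can at least cap that wait; a sketch for the guest's /etc/fstab, with hypothetical paths and an arbitrary timeout — it works around the hang rather than fixing the underlying stall:)

```
# x-systemd.mount-timeout caps how long systemd waits on the mount,
# _netdev orders it against the network, nofail keeps boot from blocking
192.168.X.3:/srv/share  /mnt/share  nfs  _netdev,nofail,x-systemd.mount-timeout=30s  0  0
```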
 
I did see a new kernel, 5.15.39-2. Going to reboot into it tomorrow and see what it does.
 
OK, I think I'll stay with the edge one.
But if you try the official one, please give feedback on whether it fixed the issue.
 
How has the edge kernel worked out so far? The host is stable, but the VMs take a while to reboot?
 
