Node showing "?" after NFS VM hangs, qm command stuck

pulipulichen · Mar 17, 2026

Hi everyone, I've encountered a very tricky issue with my PVE cluster and I'm looking for a way to recover without rebooting the physical host.

Environment Setup:

3-node PVE Cluster.
Each node uses LACP (Bonding) with two NICs to handle PVE Management, VM traffic, and PBS backup traffic.
A VM on pve3003 acts as an NFS Server, using PCIe passthrough for an SSD.
The entire cluster (including the node itself) mounts this NFS share for ISO storage.

What Happened:

Everything was fine until I tried uploading a large ISO from pve3001's WebUI. While waiting for the transfer to pve3003, the node pve3003 suddenly displayed a question mark (disconnected state) in the cluster view.

Current Status:

I've force-unmounted the NFS paths on the nodes (umount -f -l). Now df -h works again.
However, pve3003 still shows a "?" in the WebUI and cannot be managed via other nodes.
I've tried restarting pve-cluster, pvedaemon, and pveproxy, but it didn't help.
Critically, any qm commands (like qm list) executed on pve3003 hang indefinitely.
I checked dmesg and the system console; both are flooded with "nfs: server not responding, still trying" messages along with kernel call traces related to I/O wait.

Other VMs on pve3003 are still running, but the NFS VM is completely unresponsive. It seems the passthrough I/O or the VM process is stuck.

Since this host runs critical services, a reboot would cause significant downtime. Is there any way to force reset these hung management processes (especially qm) and restore the node's cluster status without a full hardware reboot?

fiona · Mar 18, 2026

Hi @pulipulichen,
see this post in the thread 'Issues with backup / vzdump' (May 15, 2024):

Hi,
if you have a hanging NFS/network mount (verify by going there and issuing an ls command or similar - if it also ends up in uninterruptible D state/your shell hangs, it's very likely that), you can use umount -l -f /path/to/mount/point to tell the kernel to unmount it lazily.
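To run that check without tying up your interactive shell, something like the following sketch may help. It assumes GNU coreutils `timeout` is available; the helper name `probe_mount` and the 5-second default are my own choices, and the path is a placeholder. Note that a process stuck in a non-killable D state may not react to the signals, so the probe itself can linger in that case.

```shell
#!/bin/sh
# probe_mount PATH [SECONDS]
# Returns 0 if listing PATH finishes within the time limit, non-zero otherwise.
# probe_mount is a hypothetical helper name; the 5-second default is arbitrary.
probe_mount() {
    path=$1
    limit=${2:-5}
    # -k 2: follow up with SIGKILL if ls ignores SIGTERM for 2 more seconds.
    timeout -k 2 "$limit" ls -- "$path" >/dev/null 2>&1
}

# /path/to/mount/point is a placeholder, as in the text above.
if probe_mount /path/to/mount/point; then
    echo "mount responds"
else
    echo "mount hangs or is missing"
fi
```

A non-zero result only says the listing did not complete in time (or the path does not exist); it does not prove on its own that NFS is the cause.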

gfngfn256 · Mar 18, 2026

fiona said:
see this post in the thread 'Issues with backup / vzdump' (May 15, 2024):

Hi,
if you have a hanging NFS/network mount (verify by going there and issuing an ls command or similar - if it also ends up in uninterruptible D state/your shell hangs, it's very likely that), you can use umount -l -f /path/to/mount/point to tell the kernel to unmount it lazily.

fiona

It appears he already did that:

pulipulichen said:
I've force-unmounted the NFS paths on the nodes (umount -f -l). Now df -h works again.

fiona · Mar 18, 2026

Ah, sorry. Missed that. Is the storage disabled in the configuration? Otherwise, Proxmox VE will attempt to mount it again.
What does findmnt say?
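If findmnt is awkward to read, a rough equivalent is to filter the kernel's mount table directly. This is only a sketch: the helper name `nfs_mounts_from` is made up, and it assumes the usual /proc/mounts layout (device, mount point, filesystem type, options).

```shell
#!/bin/sh
# nfs_mounts_from FILE
# Print the mount points of all nfs/nfs4 entries in a /proc/mounts-style file.
# nfs_mounts_from is a hypothetical helper name, not a Proxmox VE tool.
nfs_mounts_from() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "$1"
}

# Reading /proc/mounts does not touch the NFS server, so it should not hang.
nfs_mounts_from /proc/mounts

# If the share shows up again after the lazy unmount, the storage is probably
# still enabled. It can be disabled with pvesm, for example:
#   pvesm set <storage> --disable 1
# where <storage> is the storage ID from your own configuration.
```

No output from the function means no NFS mount is currently listed on that node.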
I suppose there is not much that can be done about the VM that was accessing the NFS, since its process is most likely in uninterruptible D state. You could try to kill the QEMU process, but it might not help either.
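To see which processes are actually stuck, a sketch along these lines could be used. It relies on the standard procps `ps` output columns (`pid`, `stat`, `wchan`, `args`); the helper name `filter_d_state` is made up for illustration.

```shell
#!/bin/sh
# filter_d_state: read "PID STAT WCHAN COMMAND..." lines on stdin and print
# only those whose state starts with D (uninterruptible sleep).
# filter_d_state is a hypothetical helper name, not a Proxmox VE tool.
filter_d_state() {
    awk '$2 ~ /^D/'
}

# List stuck processes on this host. The wchan column hints at where in the
# kernel each one is waiting. The header line is dropped by the filter because
# its second field is "STAT".
ps -eo pid,stat,wchan:32,args | filter_d_state
```

If the QEMU process of the NFS VM or a qm process appears in that list, a kill signal will only take effect once the kernel call it is blocked in returns, which is why killing it might not help.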
