Hi all,
I'm experiencing host reboots on my Proxmox node and hoping for some insights.
Problem: My Proxmox host node crashes and reboots whenever I try to stop a specific Windows 10 VM (ID 509001) that has an NVIDIA GPU passed through via VFIO.
Symptoms & Findings:
- The host crash occurs when using the UI "Stop" button, the qm stop 509001 command, or the equivalent API call.
- Crucially, initiating a shutdown from within the Windows 10 guest OS works perfectly. The VM shuts down cleanly, and the Proxmox host remains stable.
- Removing the NVIDIA RTX 5090 GPU passthrough configuration entirely from the VM prevents the host crash: the VM then stops normally via any method.
- With passthrough active, the host logs from the boot that crashed (journalctl -b -1 -e) show repeated kvm:VFIO_MAP_DMA failed: Invalid argument errors from the VM's QEMU process just before the crash.
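
For anyone trying to reproduce the log check above, here is a minimal sketch of how the previous boot's journal could be filtered for these errors. The helper names filter_vfio_errors and count_vfio_errors are made up for illustration and are not Proxmox tools; they only match the literal string quoted above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Proxmox): both read journal text on stdin.

# Print only the lines that mention a failed VFIO DMA mapping.
filter_vfio_errors() {
    grep -F 'VFIO_MAP_DMA failed'
}

# Print how many such lines there are (prints 0 when none match).
count_vfio_errors() {
    grep -cF 'VFIO_MAP_DMA failed'
}
```

On the host this would be used as: journalctl -b -1 --no-pager | filter_vfio_errors (where -b -1 selects the boot before the current one).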
Question: This seems to point specifically to an instability during the forceful VFIO device detachment and resource cleanup process. Has anyone else experienced host crashes only during forced VM stops (not graceful shutdowns) with NVIDIA passthrough?
Any advice or suggestions would be greatly appreciated!
This summary clearly outlines:
- The specific action causing the crash (forceful stop). Other operations such as Start, Reboot, Shutdown, and snapshot rollback work fine.
- What works (guest shutdown, no passthrough).
- Evidence pointing to VFIO (VFIO_MAP_DMA failed).
- The exact help needed (solutions for forceful stop instability).
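
Since Shutdown works and only the forced stop crashes the host, an interim workaround could be a small wrapper that always requests a graceful shutdown and never falls back to a forced stop. This is only a sketch: graceful_stop is a hypothetical name, the 180-second default is an arbitrary assumption, and the QM override exists purely so the command can be dry-run with echo.

```shell
#!/bin/sh
# Hypothetical wrapper: ask for a graceful shutdown and wait up to a timeout.
# It does not pass --forceStop, so a timeout ends in an error rather than a forced stop.
# QM can be overridden (e.g. QM=echo) for a dry run; it defaults to the real qm binary.
graceful_stop() {
    vmid="$1"
    timeout="${2:-180}"   # seconds to wait; 180 is an arbitrary default
    "${QM:-qm}" shutdown "$vmid" --timeout "$timeout"
}
```

Example: graceful_stop 509001 120 runs qm shutdown 509001 --timeout 120. This avoids the crash trigger but does not address the underlying cause.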
Host spec:
- Proxmox 8.4.1
- 28 x Intel(R) Core(TM) i7-14700 (1 Socket)
- Linux 6.8.12-9-pve (2025-03-16T19:18Z)
- GPU NVIDIA RTX 5090
VM spec:
Code:
agent: 1
args: -cpu host,hv_passthrough,level=30
balloon: 0
bios: ovmf
boot: order=virtio0;net0;ide0
cores: 16
cpu: host
efidisk0: local-lvm:vm-509001-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:01:00,pcie=1
ide0: local:iso/virtio-win-0.1.266.iso,media=cdrom,size=707456K
machine: pc-q35-9.2
memory: 65536
meta: creation-qemu=9.2.0,ctime=1743998400
net0: virtio=BC:24:11:56:8E:0B,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
parent: session_start_point
scsihw: virtio-scsi-single
smbios1: uuid=64139eb6-cad7-4366-a1dd-e89fb117e7d4
sockets: 1
vga: std
virtio0: local-lvm:vm-509001-disk-1,cache=writeback,iothread=1,size=1500G
vmgenid: d18e9d69-86b8-43bd-b1d4-85b9b31dc6b8
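
In case it helps with debugging, here is a minimal sketch for pulling the passed-through PCI address out of a config like the one above, so it can be checked against the host's IOMMU groups. The helper name hostpci_addrs is made up, and it assumes the hostpciN value starts with a raw PCI address (as it does here) rather than a mapping= entry.

```shell
#!/bin/sh
# Hypothetical helper: read a VM config on stdin and print the first field
# (the PCI address) of every hostpciN line.
hostpci_addrs() {
    sed -n 's/^hostpci[0-9][0-9]*:[[:space:]]*\([^,]*\).*/\1/p'
}
```

Usage: qm config 509001 | hostpci_addrs. The devices sharing the GPU's IOMMU group can then be listed with ls /sys/bus/pci/devices/0000:01:00.0/iommu_group/devices/ (the .0 function suffix is an assumption, since the config gives the address without a function).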