GPU Passthrough crash PVE

frenk970

Good morning,
I replaced my server's motherboard because of problems with the SATA controller. Now, when I pass my GPU through to a Windows 11 VM, the PVE manager crashes and I have to restart. Does anyone know whether I need to do anything beyond what is written in the guide?

Here is the guide: https://akashrajvanshi.medium.com/step-by-step-guide-for-proxmox-gpu-passthrough-6e885898fdae

kernel: proxmox-kernel-6.11.11-2-pve
pve-manager: 8.3.5
GPU: NVIDIA GTX 1660
CPU: Intel i9-9900K
RAM: 64GB DDR4
 
Hello frenk970! First of all, please make sure to follow the official guides: the Proxmox VE documentation on PCI(e) passthrough and the wiki page with further information.

Could you please provide us with the output of the following commands on the host:
  1. lspci -nnk
  2. dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
  3. pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""
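
If it helps, the three commands above can be collected into a single file for posting. This is only a rough sketch: the `log_cmd` helper and the report file name are made up for illustration, and it assumes the node name equals the output of `hostname`, which is the usual case but not guaranteed.

```shell
#!/bin/sh
# Hypothetical helper: append a header line and the output of a command
# (stdout and stderr) to a report file. A failing command is noted, not fatal.
log_cmd() {
    report="$1"; shift
    {
        printf '### %s\n' "$*"
        "$@" 2>&1 || printf '(command exited with status %s)\n' "$?"
        printf '\n'
    } >> "$report"
}

# Assumption: a temporary file under /tmp is an acceptable place for the report.
report="$(mktemp /tmp/passthrough-report.XXXXXX)"

log_cmd "$report" lspci -nnk
# The pipeline is wrapped in 'sh -c' so it can be passed as a single command.
log_cmd "$report" sh -c 'dmesg | grep -e DMAR -e IOMMU -e AMD-Vi'
# Assumption: the Proxmox node name matches the system hostname.
log_cmd "$report" pvesh get "/nodes/$(hostname)/hardware/pci" --pci-class-blacklist ""

echo "Report written to $report"
```

Run it as root on the host and attach or paste the resulting file.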
 
You might also need to enable Above 4G Decoding.

In any case, this is a consumer board, right? I'd try to update the firmware/UEFI to the latest available version, if possible.
The IOMMU group allocation is suboptimal. The GPU itself is in group 2, which also contains other devices such as the 10G NIC (if you use that) as well as a core PCIe controller. The latter probably causes the crash: on passthrough, the host (obviously) loses its connection to every device in the group, but the controller is needed for normal operation.

Updating the firmware might change the IOMMU grouping. But PCIe passthrough in general is very much dependent on the motherboard and the firmware quality.
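
To see which devices share a group with the GPU, the grouping can be read directly from sysfs. The following is a minimal sketch, not an official tool: the function names are invented here, the optional directory argument exists only so the functions can be pointed at a test tree, and the PCI address in the usage note is a placeholder you would replace with the GPU's address from `lspci -nnk`.

```shell
#!/bin/sh
# Print one line per PCI device in the form "group <n>: <pci-address>".
# The kernel exposes IOMMU groups under /sys/kernel/iommu_groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # no groups: IOMMU is probably off
        group="${dev#"$root"/}"            # strip the leading directory
        group="${group%%/*}"               # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

# Print every device that sits in the same IOMMU group as the given address.
group_members() {
    addr="$1"
    root="${2:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/"$addr"; do
        [ -e "$dev" ] || continue
        for peer in "${dev%/*}"/*; do
            printf '%s\n' "${peer##*/}"
        done
    done
}

# sort -V (GNU sort) orders the groups numerically.
list_iommu_groups | sort -V
```

For example, `group_members 0000:01:00.0` (placeholder address) lists everything that would leave the host together with that device. Ideally the output contains only the GPU and its own audio function.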
 
I tried to update the BIOS, but it says the files are invalid. I have a GIGABYTE C246-WU4 motherboard.
 
Do you mean that you see a file, but your BIOS complains that it is invalid? You might want to check precisely that you have the correct motherboard model with the correct revision, and if that's already the case, try to re-download the BIOS update. You can also check that by using dmidecode -t system and/or dmidecode -t baseboard.

Which BIOS version are you using at the moment? You can also find that out directly from your Proxmox VE installation by executing dmidecode -t bios
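
As a quicker cross-check, the same board and BIOS identifiers are usually also exposed by the kernel under /sys/class/dmi/id, so they can be compared against the vendor's download page without reading the full dmidecode output. A small sketch follows. The function name is made up, and the optional directory argument is only there for testing.

```shell
#!/bin/sh
# Print the board and BIOS fields that matter when picking a firmware file.
# Fields that are missing or unreadable are skipped silently.
dmi_summary() {
    dmi="${1:-/sys/class/dmi/id}"
    for field in board_vendor board_name board_version bios_version bios_date; do
        if [ -r "$dmi/$field" ]; then
            printf '%s: %s\n' "$field" "$(cat "$dmi/$field")"
        fi
    done
}

dmi_summary
```

The `board_name` and `board_version` values are the ones to match against the model and revision printed on the vendor's BIOS download page.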
 
dmidecode -t bios: https://pastebin.com/XgCkJsjB - dmidecode -t system: https://pastebin.com/7UvUwrse - dmidecode -t baseboard: https://pastebin.com/csXXRysF
 
I solved it by following the guides carefully and updating the GRUB/kernel configuration after each change. After a reboot, the video card was separated from the 10 Gbit/s card's group. I then added the GPU to the VM, and nothing has crashed since.
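
For anyone landing here later: the post does not list the exact commands, so the following is only a sketch of how "update after each change, then verify" might look on a standard Proxmox VE host. The `cmdline_has` helper is invented for illustration, and which parameters you need (for example `intel_iommu=on` or `iommu=pt`) depends on your hardware and on the official guide.

```shell
#!/bin/sh
# After editing the kernel command line or the vfio module configuration,
# the boot files have to be regenerated before rebooting, for example:
#   update-grub                  # hosts booting via GRUB
#   proxmox-boot-tool refresh    # hosts managed by proxmox-boot-tool
#   update-initramfs -u -k all   # after changes under /etc/modprobe.d or /etc/modules

# Hypothetical helper: succeed if the exact parameter appears on the kernel
# command line. The second argument (default /proc/cmdline) is for testing.
cmdline_has() {
    param="$1"
    file="${2:-/proc/cmdline}"
    [ -r "$file" ] || return 1
    # One parameter per line, then match the whole line literally.
    tr -s ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# After the reboot, confirm that the running kernel actually got the change.
if cmdline_has intel_iommu=on; then
    echo "intel_iommu=on is present on the running kernel command line"
else
    echo "intel_iommu=on was not found on the running kernel command line"
fi
```

If the parameter is missing after a reboot, the change most likely never reached the boot loader configuration that the host actually uses.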