Hello,
I've had PCIe passthrough working for quite some time on a TRX50 Threadripper build with an Nvidia 4070ti Super Asus Tuf and an AMD 7900xtx Sapphire Pulse. I recently swapped the 4070ti Super out for an AMD 7700xt Sapphire Pulse, though, and started having issues.
On two different Windows VMs, the card passes through okay, but upon rebooting either VM, the Proxmox GUI becomes unresponsive for about 30 seconds. The host appears to go through a reboot, except the BIOS splash screen doesn't come up, and several PCIe devices in addition to the GPU no longer show up with lspci. I have to reboot from the GUI once it comes back up in order to get the PCIe devices back. Also, I have confirmed that the devices are not in the same IOMMU groups. All of my PCIe slots go to the CPU and not the chipset. The 7700xt is in the top x16 slot.
What's even stranger is that the 7700xt will pass through just fine to an EndeavourOS VM and I can reboot the guest without issue.
The issue sounds like the reset bug, but why would it only affect Windows guests?
A few other relevant things:
I'm using
softdep radeon pre: vfio-pci
softdep amdgpu pre: vfio-pci
softdep snd_hda_intel pre: vfio-pc
and setting the device ids in vfio.conf. I've tried blacklisting drivers and also not blacklisting while using the softdep lines.
HDR doesn't work when the 7700xt is passed through to the Windows VM, but it does in the EndeavourOS VM with everything else the same.
I have to use vga=1 for both the 7900xtx and the 7700xt on a Windows guest or there won't be any display at all. This is not the case for EndeavourOS VMs.
I get the following errors running journalctl -p 4 after the hard reboots:
Code:
Apr 05 11:42:48 pve3 kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 28: bea000000505080b
Apr 05 11:42:48 pve3 kernel: mce: [Hardware Error]: TSC 0 ADDR fffff080807fe0 MISC d0150fff00000000 PPIN 2b0bb7e3cdc4045 SYND 5d000009 IPID 1002e1f002c00
Apr 05 11:42:48 pve3 kernel: mce: [Hardware Error]: PROCESSOR 2:a10f81 TIME 1743874964 SOCKET 0 APIC 0 microcode a108108
These errors do not come up on normal reboots, but I've reset the CPU and RAM just in case. The RAM was tested for 8+ hours when I first built the system.
If I run journalctl -p 4 quickly enough before the machine crashes/reboots, I can get these errors:
Code:
Apr 05 11:39:27 pve3 pvedaemon[3134]: error writing '1' to '/sys/bus/pci/devices/0000:03:00.0/reset': Inappropriate ioctl for device
Apr 05 11:39:27 pve3 pvedaemon[3134]: failed to reset PCI device '0000:03:00.0', but trying to continue as not all devices need a reset
Apr 05 11:40:10 pve3 pvestatd[2051]: VM 303 qmp command failed - VM 303 not running
Apr 05 11:41:33 pve3 kernel: pcieport 0000:01:00.0: Unable to change power state from D3hot to D0, device inaccessible
The 7700xt is device 0000:03:00.0.
Can anyone help me out?
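For anyone who wants to compare against their own setup, a check along these lines lists the IOMMU groups and the reset methods the kernel exposes for the card. This is only a sketch and not necessarily the exact commands used for the checks described above: SYSFS_ROOT and the two function names are made up for illustration, and it assumes a kernel recent enough to expose the reset_method attribute in sysfs.

```shell
#!/bin/sh
# Sketch: list IOMMU groups and show the reset methods offered for one PCI device.
# SYSFS_ROOT is a hypothetical override so the functions can be pointed at a test tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print "group <n>: <pci address>" for every device in every IOMMU group.
list_iommu_groups() {
    for dev in "$SYSFS_ROOT"/kernel/iommu_groups/*/devices/*; do
        [ -e "$dev" ] || continue          # glob matched nothing
        group=${dev%/devices/*}            # strip "/devices/<addr>"
        group=${group##*/}                 # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}

# Print the reset methods listed for one device, or a note if none are exposed.
show_reset_methods() {
    f="$SYSFS_ROOT/bus/pci/devices/$1/reset_method"
    if [ -r "$f" ]; then
        echo "$1: $(cat "$f")"
    else
        echo "$1: no reset_method attribute"
    fi
}

list_iommu_groups
# Defaults to the 7700xt address from the logs above; pass another address as $1.
show_reset_methods "${1:-0000:03:00.0}"
```

If the second function reports no reset_method attribute, or the listed methods do not include a function-level reset, that would at least be consistent with the failed write to the reset file in the log above.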