Asus ROG Strix RX 570 freezing after some time on a Windows 10 virtual machine

ajaxr0

New Member
May 19, 2023
Hello, I have been having a lot of weird issues lately with GPU passthrough on Proxmox. I have an Asus ROG Strix RX 570 4GB, which is correctly blocked from loading on the host, and yet the entire system freezes after some time in the Windows 10 VM.

Here is my vfio-pci.conf file, which lists the IDs of both the video and audio functions so that vfio-pci claims them instead of the regular kernel drivers:
Rich (BB code):
options vfio-pci ids=1002:67df,1002:aaf0

And these do indeed match the IDs of the GPU:
Rich (BB code):
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev ef)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]

^ This is the output of lspci -nnk | grep ATI
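The ID match can also be checked mechanically rather than by eye. Below is a minimal Python sketch that does this, assuming the lspci lines and the options line are pasted in as strings. The helper names `ids_from_lspci` and `ids_from_vfio_options` are hypothetical and not part of any Proxmox tooling.

```python
import re

# Matches a bracketed vendor:device pair such as [1002:67df].
# Class codes like [0300] contain no colon and are ignored.
ID_PATTERN = re.compile(r"\[([0-9a-fA-F]{4}:[0-9a-fA-F]{4})\]")


def ids_from_lspci(text):
    """Collect vendor:device IDs from the device lines of `lspci -nn` output."""
    ids = set()
    for line in text.splitlines():
        # Device lines start at column 0; indented detail lines
        # (Subsystem, Kernel driver in use, ...) are skipped.
        if line and not line[0].isspace():
            ids.update(found.lower() for found in ID_PATTERN.findall(line))
    return ids


def ids_from_vfio_options(line):
    """Extract the comma-separated ids= list from an `options vfio-pci` line."""
    match = re.search(r"\bids=(\S+)", line)
    return {i.lower() for i in match.group(1).split(",")} if match else set()


# Sample data copied from the post above.
lspci_output = """\
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev ef)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
"""
vfio_options = "options vfio-pci ids=1002:67df,1002:aaf0"

# IDs present on the card but absent from the vfio-pci configuration.
missing = ids_from_lspci(lspci_output) - ids_from_vfio_options(vfio_options)
print("IDs not covered by vfio-pci.conf:", sorted(missing))
```

For the data in this post the difference is empty, so both functions are listed in the config. This only confirms that the IDs are configured, not that vfio-pci actually bound to them.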

The full output is below:
Rich (BB code):
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev ef)
        Subsystem: ASUSTeK Computer Inc. Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1043:04c2]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
        Subsystem: ASUSTeK Computer Inc. Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1043:aaf0]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel

From this we can see that the VGA function (03:00.0) is kept away from amdgpu and is bound to vfio-pci. The audio function (03:00.1), however, still shows snd_hda_intel as the driver in use.
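To make the per-function driver binding explicit, here is a small Python sketch that parses `lspci -nnk` output and lists which driver each PCI function is using. It assumes the output is supplied as a string, and the function name `drivers_in_use` is hypothetical.

```python
def drivers_in_use(lspci_nnk):
    """Map each PCI address to its 'Kernel driver in use' value (None if absent)."""
    drivers = {}
    current = None
    for line in lspci_nnk.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # A new device block starts with the PCI address, e.g. 03:00.0.
            current = line.split()[0]
            drivers[current] = None
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


# Sample data copied from the full lspci output in the post.
full_output = """\
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev ef)
        Subsystem: ASUSTeK Computer Inc. Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1043:04c2]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]
        Subsystem: ASUSTeK Computer Inc. Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1043:aaf0]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
"""

bound = drivers_in_use(full_output)
# Functions that are not held by vfio-pci and are therefore still host-owned.
not_vfio = [addr for addr, drv in bound.items() if drv != "vfio-pci"]
print(bound)
print("Not bound to vfio-pci:", not_vfio)  # for this post: ['03:00.1']
```

Run against the output pasted above, the sketch reports 03:00.1 as still bound to snd_hda_intel, so only the VGA function is held by vfio-pci at that point.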

I have these kernel arguments in my /etc/default/grub config:
Rich (BB code):
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init pcie_aspm=off radeon.blacklist=1 rd.driver.blacklist=radeon vfio-pci.ids=1002:67df,1002:aaf0 amdgpu.blacklist=1 rd.driver.blacklist=amdgpu"

The system ran fine for two months, until it started crashing randomly and the disk also started disappearing from the VM. The disk no longer disappears, and back then this freezing issue did not occur: the GPU worked fine. I previously had a Sapphire card installed in this system and it worked fine too, but I sold it, and right now I am trying to get the Asus one working. They have the same device IDs, so I did not change the vfio-pci config. I also have a blacklist conf that blocks the kernel modules for any GPU type from loading:

Rich (BB code):
blacklist amdgpu
blacklist radeon
blacklist nouveau
blacklist nvidia

Any help would be very much appreciated, thank you.
PS: No, the card is not defective; it was in use in my desktop before and worked completely fine. The same goes for the Sapphire one that I sold.
Also, yes, I do have vendor-reset.
 