Dual RX570 GPU passthrough

SupaInLaw

Mar 26, 2024
I have been digging through various similar "Solved" threads but am having no luck on my own.

Setup:
2x MSI RX570 4GB GPUs
Intel i7-4790K CPU
ASUS Z97-A motherboard
32GB RAM

I have an NVIDIA GTX 1080 and a GTX 1070 on hand if this is an AMD problem, but all installed hardware has been verified to be functional outside this setup.

Goal:
I want it to run 2 Windows VMs, each running one game (Valheim and Satisfactory). One game needs a GPU for sure, as it is a co-op multiplayer game where the character will just sit in the lobby while other clients join. Both GPUs will have dummy headless displays attached so the GPU is being used.

Progress so far: I have the RX570 showing up in the Windows guest, but no image out of the GPU.

Issues: Windows reports Error 43 and fails to start the GPU. The AMD driver software fails to load the driver. Disabling and re-enabling the driver makes the error go away, but the GPU is still unusable.

Setup:
Brand new install, 4 days old

Kernel Version : Linux 6.5.13-3-pve (2024-03-20T10:45Z)
Boot Mode: EFI (Secure Boot)
Manager Version: pve-manager/8.1.5/60e01c6ac2325b3f

Code:
root@gamehost:~# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
vendor-reset

/etc/modprobe.d/blacklist.conf is empty, as I've seen suggestions not to blacklist amdgpu or radeon. I have tried it both ways, with the same result.
Code:
root@gamehost:~# cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1002:67df,1002:aaf0 disable_vga=1
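
A quick way to confirm that this binding actually took effect is to look at the `driver` symlink each PCI device has in sysfs. The following is only a minimal sketch: `bound_driver` is a hypothetical helper name, and the two addresses are the ones that show up in the dmesg output in this thread, so adjust them for other hosts.

```shell
#!/usr/bin/env bash
# Print the kernel driver currently bound to a PCI device, or "none".
# sysfs_root defaults to /sys; it is a parameter only so the function can be
# pointed at a fake directory tree for testing (an assumption of this sketch).
bound_driver() {
    local addr="$1" sysfs_root="${2:-/sys}"
    local link="$sysfs_root/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # The symlink points at .../bus/pci/drivers/<driver name>.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# The two GPU addresses seen in this thread's dmesg output.
for addr in 0000:01:00.0 0000:02:00.0; do
    printf '%s -> %s\n' "$addr" "$(bound_driver "$addr")"
done
```

If the cards are claimed by vfio-pci, each line should end in `vfio-pci`; `amdgpu` or `none` would mean the ids= option is not doing what was intended.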

Code:
root@gamehost:~# cat /etc/default/grub
-/-
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=id:1002:67df,1002:aaf0 vfio_iommu_type1.allow_unsafe_interrupts=1 initcall_blacklist=sysfb_init"
GRUB_CMDLINE_LINUX=""
-/-

Code:
root@gamehost:~# cat /etc/pve/qemu-server/101.conf
args: -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
boot: order=scsi0;net0;ide0
cores: 4
cpu: host
hostpci0: 0000:01:00,pcie=1
ide0: local:iso/virtio-win-0.1.240.iso,media=cdrom,size=612812K
machine: pc-q35-6.2
memory: 8192
meta: creation-qemu=8.1.5,ctime=1711397582
name: Satisfactory
net0: virtio=BC:24:11:DB:78:DA,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: tbSeagate:101/vm-101-disk-0.qcow2,cache=writeback,iothread=1,size=256G
scsihw: virtio-scsi-single
smbios1: uuid=xxxx
sockets: 1
vmgenid: xxx

Code:
dmesg | grep -e DMAR -e IOMMU
[    0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[    0.007192] ACPI: DMAR 0x0000000088DE2B68 0000B8 (v01 INTEL  BDW      00000001 INTL 00000001)
[    0.007208] ACPI: Reserving DMAR table memory at [mem 0x88de2b68-0x88de2c1f]
[    0.030299] DMAR: IOMMU enabled
[    0.085443] DMAR: Host address width 39
[    0.085444] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.085448] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
[    0.085450] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.085452] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c20660462 ecap f010da
[    0.085453] DMAR: RMRR base: 0x00000088d50000 end: 0x00000088d5dfff
[    0.085455] DMAR: RMRR base: 0x0000008b000000 end: 0x0000008f1fffff
[    0.085457] DMAR-IR: IOAPIC id 8 under DRHD base  0xfed91000 IOMMU 1
[    0.085458] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.085459] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[    0.085459] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[    0.086231] DMAR-IR: Enabled IRQ remapping in xapic mode
[    0.240397] DMAR: No ATSR found
[    0.240398] DMAR: No SATC found
[    0.240399] DMAR: IOMMU feature pgsel_inv inconsistent
[    0.240400] DMAR: IOMMU feature sc_support inconsistent
[    0.240400] DMAR: IOMMU feature pass_through inconsistent
[    0.240401] DMAR: dmar0: Using Queued invalidation
[    0.240405] DMAR: dmar1: Using Queued invalidation
[    0.286268] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    4.322226] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug.
[    6.129789] i915 0000:00:02.0: [drm] DMAR active, disabling use of stolen memory

Code:
root@gamehost:~# dmesg | grep -e vfio
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.5.13-3-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt pcie_acs_override=id:1002:67df,1002:aaf0 vfio_iommu_type1.allow_unsafe_interrupts=1 initcall_blacklist=sysfb_init
[    0.030246] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.5.13-3-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt pcie_acs_override=id:1002:67df,1002:aaf0 vfio_iommu_type1.allow_unsafe_interrupts=1 initcall_blacklist=sysfb_init
[    3.867570] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    3.867721] vfio-pci 0000:02:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    3.867746] vfio_pci: add [1002:67df[ffffffff:ffffffff]] class 0x000000/00000000
[    6.130614] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[    6.130618] vfio-pci 0000:02:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[    6.180411] vfio_pci: add [1002:aaf0[ffffffff:ffffffff]] class 0x000000/00000000
[  112.035969] vfio-pci 0000:02:00.0: enabling device (0000 -> 0003)
[ 1150.072278] vfio-pci 0000:01:00.0: Unsupported reset method 'device_specific'
[ 1161.490004] vfio-pci 0000:02:00.0: Unsupported reset method 'device_specific'
[ 1211.993371] vfio-pci 0000:01:00.0: enabling device (0000 -> 0003)

I don't see any errors in dmesg that look out of the norm. I have VT-d enabled in the BIOS, can't find a Resizable BAR option in there, and the primary GPU is set to the CPU's iGPU.
I am very new to Proxmox and everything so far has been haphazard. For example, Windows would not install from the onboard ISO, requiring me to plug in and pass through a bootable USB. I'm not sure if that is related to this at all; I'm trying to provide as much info as possible to help solve this.

Thank you
 

Attachment: Screenshot 2024-03-26 113404.png
[ 1150.072278] vfio-pci 0000:01:00.0: Unsupported reset method 'device_specific'
[ 1161.490004] vfio-pci 0000:02:00.0: Unsupported reset method 'device_specific'
I don't see any errors in dmesg that look out of the norm
This suggests that vendor-reset is not loaded (and the RX570 does need it), or that it does not support the RX570 (which it is supposed to), or that something else is wrong (though you did activate it per device, which is needed).
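
To tell those cases apart, it may help to check whether the module is loaded at all and which reset methods the kernel currently has enabled for each card. This is just a sketch under a few assumptions: the function names are made up for illustration, the module is expected to appear as `vendor_reset` under /sys/module, and the addresses are the two from the dmesg lines quoted above.

```shell
#!/usr/bin/env bash
# sysfs_root defaults to /sys; it is a parameter only so the functions can be
# exercised against a fake directory tree (an assumption of this sketch).

# Succeeds if the vendor-reset module is loaded. The kernel lists it with an
# underscore (vendor_reset) under /sys/module.
vendor_reset_loaded() {
    local sysfs_root="${1:-/sys}"
    [ -d "$sysfs_root/module/vendor_reset" ]
}

# Print the reset methods currently enabled for one PCI device.
show_reset_method() {
    local addr="$1" sysfs_root="${2:-/sys}"
    local f="$sysfs_root/bus/pci/devices/$addr/reset_method"
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$addr" "$(cat "$f")"
    else
        printf '%s: no reset_method file\n' "$addr"
    fi
}

if vendor_reset_loaded; then
    echo "vendor-reset is loaded"
else
    echo "vendor-reset is NOT loaded"
fi

# The two GPU addresses from the dmesg output in this thread.
for addr in 0000:01:00.0 0000:02:00.0; do
    show_reset_method "$addr"
done
```

If the module is not loaded, that alone would explain the "Unsupported reset method 'device_specific'" lines.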
 
Would it be best to start from scratch? I'm not sure how to address this error. I have run the command:
Code:
echo 'device_specific' > /sys/bus/pci/devices/0000\:01\:00.0/reset_method
and then I saw it reset the card in dmesg, but again that did not fix Error 43.
 
I have run the command:
Code:
echo 'device_specific' > /sys/bus/pci/devices/0000\:01\:00.0/reset_method
and then I saw it reset the card in dmesg, but again that did not fix Error 43.
It does look like you ran the correct commands, but the message suggests that something is wrong.
Would it be best to start from scratch?
Might be. In my experience, NVIDIA passthrough is more difficult to get working than AMD, as long as vendor-reset is installed and working.
RX570s, when booting with the integrated Intel graphics, should "just work" with passthrough (once per Proxmox host reboot). Maybe reinstall, restore the VMs from backup, install vendor-reset (but don't blacklist or bind to vfio-pci like you did before) and activate it for the RX570s to see if that "just works". Are you sure the RX570s are in working order?
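
For the "activate it for the RX570s" step, something along these lines could be run as root after each host boot, since sysfs settings do not survive a reboot. It is a rough sketch only: `enable_device_specific_reset` is a hypothetical helper name, and the write is the same `echo 'device_specific'` command shown earlier in the thread, wrapped with a check and a read-back.

```shell
#!/usr/bin/env bash
# Ask the kernel to use the device-specific reset (the one vendor-reset
# provides) for one PCI device. Needs root on a real host.
# sysfs_root defaults to /sys; it is a parameter only so the function can be
# tested against a fake directory tree (an assumption of this sketch).
enable_device_specific_reset() {
    local addr="$1" sysfs_root="${2:-/sys}"
    local f="$sysfs_root/bus/pci/devices/$addr/reset_method"
    if [ ! -w "$f" ]; then
        echo "$addr: reset_method not writable (missing device or not root?)" >&2
        return 1
    fi
    echo device_specific > "$f" || return 1
    # Read the value back so a rejected write is easy to spot.
    printf '%s: reset_method is now "%s"\n' "$addr" "$(cat "$f")"
}

# Apply to every PCI address given on the command line (none by default).
for addr in "$@"; do
    enable_device_specific_reset "$addr"
done
```

Saved as a script, it would be called with the two card addresses from this thread, for example `0000:01:00.0 0000:02:00.0`. If the write fails or dmesg still reports an unsupported reset method, vendor-reset is probably not loaded or did not claim the card.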
 
The GPUs do work; I booted Ubuntu from a bootable USB and could get output on all ports.
I have reinstalled Proxmox and installed vendor-reset.
I also found a faulty RAM stick and removed it (all RAM has been tested since), which fixed my Windows install issues.
But alas, still Code 43 in the Windows guest.
 
I have the exact same issue with a 7th gen HD 620 GPU on a 7th gen i3 in a NUC. It works on Linux: I tried passing it through to a Fedora VM, which worked and also gave display on the HDMI output. But on Windows, it just doesn't work. I tried Windows 8, Windows 8.1 and Windows 11 and haven't been able to get any of them to work. I have tried changing the q35 version, hiding KVM, etc.
 
Well, I may have found that the MSI RX570s are to blame.
I threw in a new drive to keep the Proxmox setup backed up, unplugged those drives, and installed an old copy of the VMware hypervisor, since its PCIe passthrough is as simple as a checkbox and a reboot.
Still Error 43 in Windows.
So I wanted to narrow it down and see if it's the mobo, CPU or GPU. I put in a GTX 970 and everything worked. I got output on the GPU ports and Windows had no issues using it.
I will try a VBIOS update and see if I can get the MSI RX570s working right, since I have a few of them and would rather not build individual physical machines for them all.

I'll report back if that fixes the issue.
 
I managed to get it to work with GVT-g! It's not ideal, but it's something. I'm still looking into how I may be able to get full passthrough to work for my Windows 8.1 and 11 VMs.
 
As an update, I gave up on both Proxmox (for this specific setup) and the RX570s. The MSI RX570s just don't want to work with PCIe passthrough, while Proxmox kept giving me issues even with other GPUs. I went back to ESXi, where they just worked out of the box. I'll save Proxmox for non-passthrough applications.
 