GPU Passthrough with AMD Renoir for Windows 11 guest

christian789

I have Proxmox VE 8.3.1 installed on a Gigabyte Brix BRR-4300 and would like to pass through the GPU (plus audio and USB) to a Windows 11 guest. I worked through https://pve.proxmox.com/pve-docs/pve-admin-guide.html#qm_pci_passthrough:
Bash:
# dmesg | grep -e DMAR -e IOMMU
[    0.482306] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.483526] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
# dmesg | grep 'remapping'
[    0.483033] AMD-Vi: Interrupt remapping enabled
# pvesh get /nodes/pve-armc2oh/hardware/pci --pci-class-blacklist ""
┌──────────┬────────┬──────────────┬────────────┬────────┬────────────────────────────────────────────────
│ class    │ device │ id           │ iommugroup │ vendor │ device_name
╞══════════╪════════╪══════════════╪════════════╪════════╪════════════════════════════════════════════════
│ 0x010802 │ 0xa80a │ 0000:03:00.0 │         11 │ 0x144d │ NVMe SSD Controller PM9A1/PM9A3/980PRO
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x020000 │ 0x8125 │ 0000:02:00.0 │         10 │ 0x10ec │ RTL8125 2.5GbE Controller
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x028000 │ 0x2723 │ 0000:01:00.0 │          9 │ 0x8086 │ Wi-Fi 6 AX200
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x030000 │ 0x1636 │ 0000:05:00.0 │          6 │ 0x1002 │ Renoir
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x040300 │ 0x1637 │ 0000:05:00.1 │          6 │ 0x1002 │ Renoir Radeon High Definition Audio Controller
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x040300 │ 0x15e3 │ 0000:05:00.6 │          6 │ 0x1022 │ Family 17h/19h HD Audio Controller
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x1630 │ 0000:00:00.0 │         -1 │ 0x1022 │ Renoir/Cezanne Root Complex
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x1632 │ 0000:00:01.0 │          0 │ 0x1022 │ Renoir PCIe Dummy Host Bridge
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x1632 │ 0000:00:02.0 │          2 │ 0x1022 │ Renoir PCIe Dummy Host Bridge
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x1632 │ 0000:00:08.0 │          6 │ 0x1022 │ Renoir PCIe Dummy Host Bridge
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x1448 │ 0000:00:18.0 │          8 │ 0x1022 │ Renoir Device 24: Function 0
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x1449 │ 0000:00:18.1 │          8 │ 0x1022 │ Renoir Device 24: Function 1
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x144a │ 0000:00:18.2 │          8 │ 0x1022 │ Renoir Device 24: Function 2
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x144b │ 0000:00:18.3 │          8 │ 0x1022 │ Renoir Device 24: Function 3
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x144c │ 0000:00:18.4 │          8 │ 0x1022 │ Renoir Device 24: Function 4
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x144d │ 0000:00:18.5 │          8 │ 0x1022 │ Renoir Device 24: Function 5
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x144e │ 0000:00:18.6 │          8 │ 0x1022 │ Renoir Device 24: Function 6
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060000 │ 0x144f │ 0000:00:18.7 │          8 │ 0x1022 │ Renoir Device 24: Function 7
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060100 │ 0x790e │ 0000:00:14.3 │          7 │ 0x1022 │ FCH LPC Bridge
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060400 │ 0x1634 │ 0000:00:01.3 │          1 │ 0x1022 │ Renoir/Cezanne PCIe GPP Bridge
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060400 │ 0x1634 │ 0000:00:02.1 │          3 │ 0x1022 │ Renoir/Cezanne PCIe GPP Bridge
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060400 │ 0x1634 │ 0000:00:02.2 │          4 │ 0x1022 │ Renoir/Cezanne PCIe GPP Bridge
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060400 │ 0x1634 │ 0000:00:02.4 │          5 │ 0x1022 │ Renoir/Cezanne PCIe GPP Bridge
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x060400 │ 0x1635 │ 0000:00:08.1 │          6 │ 0x1022 │ Renoir Internal PCIe GPP Bridge to Bus
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x080600 │ 0x1631 │ 0000:00:00.2 │         -1 │ 0x1022 │ Renoir/Cezanne IOMMU
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x0c0330 │ 0x2142 │ 0000:04:00.0 │         12 │ 0x1b21 │ ASM2142/ASM3142 USB 3.1 Host Controller
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x0c0330 │ 0x1639 │ 0000:05:00.3 │          6 │ 0x1022 │ Renoir/Cezanne USB 3.1
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x0c0330 │ 0x1639 │ 0000:05:00.4 │          6 │ 0x1022 │ Renoir/Cezanne USB 3.1
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x0c0500 │ 0x790b │ 0000:00:14.0 │          7 │ 0x1022 │ FCH SMBus Controller
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x108000 │ 0x15df │ 0000:05:00.2 │          6 │ 0x1022 │ Family 17h (Models 10h-1fh) Platform Security P
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────────────────────────────────
│ 0x118000 │ 0x15e4 │ 0000:05:00.7 │          6 │ 0x1022 │ Sensor Fusion Hub
└──────────┴────────┴──────────────┴────────────┴────────┴────────────────────────────────────────────────
# cat /etc/modprobe.d/pci-passthrough.conf
options vfio-pci ids=1002:1636,1002:1637 disable_vga=1
blacklist amdgpu
# lspci -k | grep -A 3 "VGA"
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev c4)
        Subsystem: Gigabyte Technology Co., Ltd Renoir [Radeon Vega Series / Radeon Vega Mobile Series]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
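In addition to the blacklist above, I'm considering a softdep so that vfio-pci claims the iGPU before amdgpu could ever bind it, plus regenerating the initramfs so the options are applied at boot. A rough sketch of what I'd run (assuming the config file above; the last command depends on whether the box boots via proxmox-boot-tool or GRUB):
Bash:
# echo "softdep amdgpu pre: vfio-pci" >> /etc/modprobe.d/pci-passthrough.conf
# update-initramfs -u -k all
# proxmox-boot-tool refresh   # or update-grub, depending on the bootloader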
I'm not sure whether this means the host is properly prepared or whether something is still missing. This is my VM config:
Bash:
~# cat /etc/pve/qemu-server/204.conf
agent: 1
balloon: 2048
bios: ovmf
boot: order=sata0
cores: 4
cpu: host
efidisk0: local-zfs:vm-204-disk-5,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:05:00.0,pcie=1,x-vga=1
hostpci1: 0000:05:00.1,pcie=1
hotplug: disk,network,usb,memory,cpu
machine: pc-q35-9.0,viommu=virtio
memory: 16384
meta: creation-qemu=8.1.5,ctime=1714335466
name: Desktop-armc2oh
net0: virtio=BC:24:11:DC:BC:BD,bridge=vmbr0,firewall=1
numa: 1
ostype: win11
sata0: local-zfs:vm-204-disk-3,cache=writeback,discard=on,size=476941M,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=e95128c3-9e2a-45de-bff9-b9305e306175
sockets: 1
tpmstate0: local-zfs:vm-204-disk-2,size=4M,version=v2.0
unused0: local-zfs:vm-204-disk-0
unused1: local-zfs:vm-204-disk-4
unused2: local-zfs:vm-204-disk-1
vga: virtio
vmgenid: 6ebe16c2-e0a6-48cb-8cdd-accf313dd99d
Starting the VM takes very long, and I see the following:
Code:
~# dmesg | grep -i vfio
[    4.549868] VFIO - User Level meta-driver version: 0.3
[    4.586593] vfio-pci 0000:05:00.0: vgaarb: deactivate vga console
[    4.586608] vfio-pci 0000:05:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[    4.586842] vfio_pci: add [1002:1636[ffffffff:ffffffff]] class 0x000000/00000000
[    4.610538] vfio_pci: add [1002:1637[ffffffff:ffffffff]] class 0x000000/00000000
[ 1308.789291] vfio-pci 0000:05:00.0: enabling device (0002 -> 0003)
[ 1308.804572] vfio-pci 0000:05:00.1: enabling device (0000 -> 0002)
[ 1313.189356] vfio-pci 0000:05:00.0: not ready 1023ms after bus reset; waiting
[ 1314.235573] vfio-pci 0000:05:00.0: not ready 2047ms after bus reset; waiting
[ 1316.347566] vfio-pci 0000:05:00.0: not ready 4095ms after bus reset; waiting
[ 1320.699569] vfio-pci 0000:05:00.0: not ready 8191ms after bus reset; waiting
[ 1329.403566] vfio-pci 0000:05:00.0: not ready 16383ms after bus reset; waiting
[ 1346.299568] vfio-pci 0000:05:00.0: not ready 32767ms after bus reset; waiting
[ 1381.627570] vfio-pci 0000:05:00.0: not ready 65535ms after bus reset; giving up
[ 1382.056463] vfio-pci 0000:05:00.3: Unable to change power state from D0 to D3hot, device inaccessible
[ 1382.057355] vfio-pci 0000:05:00.6: Unable to change power state from D0 to D3hot, device inaccessible
[ 1382.058449] vfio-pci 0000:05:00.1: vfio_bar_restore: reset recovery - restoring BARs
[ 1382.059925] vfio-pci 0000:05:00.0: vfio_bar_restore: reset recovery - restoring BARs
[ 1382.070817] vfio-pci 0000:05:00.4: Unable to change power state from D0 to D3hot, device inaccessible
[ 1382.071321] vfio-pci 0000:05:00.2: Unable to change power state from D0 to D3hot, device inaccessible
[ 1382.071812] vfio-pci 0000:05:00.7: Unable to change power state from D0 to D3hot, device inaccessible
[ 1383.039971] vfio-pci 0000:05:00.0: vfio_bar_restore: reset recovery - restoring BARs
[ 1383.043485] vfio-pci 0000:05:00.1: vfio_bar_restore: reset recovery - restoring BARs
[ 1383.081582] vfio-pci 0000:05:00.0: vfio_bar_restore: reset recovery - restoring BARs
[ 1383.084716] vfio-pci 0000:05:00.1: vfio_bar_restore: reset recovery - restoring BARs
...
The VM is accessible via RDP, but the physically connected monitor still shows the Proxmox boot screen, and the only display adapters listed in Windows' Device Manager are Microsoft Remote Display and Red Hat VirtIO GPU.
Any ideas?
 
I assume I'm suffering from the AMD GPU reset bug; the iGPU is from 2020. I tried installing vendor-reset as described in [TUTORIAL] PCI/GPU Passthrough on Proxmox VE 8 : Installation and configuration, but that failed because my self-built vendor-reset module is not signed. So I followed tlex's DKMS module-signing procedure, removed vendor-reset and installed it again. In dmesg I now see vendor_reset_hook: installed, so I assume vendor-reset is loaded now.
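For reference, the installation steps I ended up with look roughly like this (a sketch from memory; the header package name and repository state may differ from what the tutorial uses):
Bash:
# apt install dkms git build-essential pve-headers
# git clone https://github.com/gnif/vendor-reset.git
# cd vendor-reset
# dkms install .
# echo "vendor-reset" >> /etc/modules
# update-initramfs -u -k all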
Now I'm struggling with the vreset.service mentioned in asded's tutorial (a sketch of my current unit is at the end of this post).
Code:
[   11.938222] vfio-pci 0000:05:00.0: Unsupported reset method 'device_specific'
[   31.882826] pcie_mp2_amd 0000:05:00.7: Failed to discover, sensors not enabled is 0
[   31.882846] pcie_mp2_amd 0000:05:00.7: amd_sfh_hid_client_init failed err -95
...
[  198.313750] vfio-pci 0000:05:00.6: Unable to change power state from D0 to D3hot, device inaccessible
[  198.314589] vfio-pci 0000:05:00.1: vfio_bar_restore: reset recovery - restoring BARs
[  198.319418] vfio-pci 0000:05:00.0: vfio_bar_restore: reset recovery - restoring BARs
[  198.323795] vfio-pci 0000:05:00.2: Unable to change power state from D0 to D3hot, device inaccessible
[  198.325259] vfio-pci 0000:05:00.7: Unable to change power state from D0 to D3hot, device inaccessible
[  198.325293] vfio-pci 0000:05:00.3: Unable to change power state from D0 to D3hot, device inaccessible
[  198.325301] vfio-pci 0000:05:00.4: Unable to change power state from D0 to D3hot, device inaccessible
[  199.320610] vfio-pci 0000:05:00.0: vfio_bar_restore: reset recovery - restoring BARs
[  199.325174] vfio-pci 0000:05:00.1: vfio_bar_restore: reset recovery - restoring BARs
...
[  205.433892] kvm_amd: kvm [1513]: vcpu0, guest rIP: 0xfffff82b31909d29 Unhandled WRMSR(0xc0010115) = 0x0
[  206.468532] kvm_amd: kvm [1513]: vcpu1, guest rIP: 0xfffff82b31909d29 Unhandled WRMSR(0xc0010115) = 0x0
[  206.592892] kvm_amd: kvm [1513]: vcpu2, guest rIP: 0xfffff82b31909d29 Unhandled WRMSR(0xc0010115) = 0x0
[  206.718155] kvm_amd: kvm [1513]: vcpu3, guest rIP: 0xfffff82b31909d29 Unhandled WRMSR(0xc0010115) = 0x0
The VM starts, but the AMD GPU is still not detected, and the physically connected display still shows "Loading Linux 6.8.12-5-pve ..." / "Loading initial ramdisk ...".
Can anyone help me here? Finding the root cause of Unsupported reset method 'device_specific' is probably the best starting point.
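For reference, this is roughly what my vreset.service looks like at the moment (a sketch along the lines of asded's tutorial; the PCI address is my iGPU's and the exact unit contents may differ from the original):
Code:
[Unit]
Description=Select the device_specific reset method for the Renoir iGPU
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'echo device_specific > /sys/bus/pci/devices/0000:05:00.0/reset_method'

[Install]
WantedBy=multi-user.target
If I read the dmesg line above correctly, writing device_specific to that file is rejected for 05:00.0, so reading /sys/bus/pci/devices/0000:05:00.0/reset_method to see which reset methods the kernel actually offers for the device is probably my next step.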
 
