[SOLVED] Proxmox 8.3.5 / Kernel 6.11.11-1: Unable to passthrough AMD iGPU (Ryzen 7 9800X3D / Granite Ridge) on X870E mobo

proxuser25

Mar 20, 2025
Hello all,

I am getting a little frustrated trying to pass through the iGPU of my Ryzen 7 9800X3D to one of my VMs.

Kernel (pve-headers installed): Linux deb 6.11.11-1-pve #1 SMP PREEMPT_DYNAMIC PMX 6.11.11-1 (2025-01-17T15:44Z) x86_64 GNU/Linux
Processor: AMD Ryzen 7 9800X3D
Mainboard: Gigabyte X870E

Both devices (the iGPU at 7c:00.0 and its audio function at 7c:00.1) are isolated at the IOMMU group level: no other devices share a group with them, so there are no conflicts. I have no problem passing through dedicated GPUs (Nvidia and AMD).
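
For anyone who wants to check the grouping on their own host, here is a minimal sketch that lists every PCI device with its IOMMU group from sysfs. The function name `list_iommu_groups` and its optional root argument are my own (hypothetical, the argument only exists so the function can be pointed at a test directory); the default path is the standard kernel location.

```shell
# Print one line per PCI device: "group <n>: <pci address>".
# $1: optional sysfs root (hypothetical parameter, handy for testing);
#     defaults to the standard kernel location.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded pattern if nothing matched (e.g. IOMMU disabled).
        [ -e "$dev" ] || continue
        group="${dev#"$root"/}"   # strip the root prefix -> "<n>/devices/<addr>"
        group="${group%%/*}"      # keep only "<n>"
        echo "group $group: ${dev##*/}"
    done
}

# Example call on a real host:
#   list_iommu_groups | sort -V
```

If 0000:7c:00.0 and 0000:7c:00.1 show up with no other address in the same group, the isolation described above holds.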

VM configuration:
agent: 1
args: -cpu 'host,-hypervisor,kvm=off'
bios: ovmf
boot: order=scsi0;ide0;net0
cores: 8
cpu: host
hostpci0: 0000:7c:00.0,pcie=1,romfile=vbios_1002_13c0.bin <- tried my own dump, the one from the GitHub repo, and no romfile at all
hostpci1: 0000:7c:00.1,pcie=1
efidisk0: local:123/vm-123-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
machine: pc-q35-9.2
memory: 8192
meta: creation-qemu=9.2.0,ctime=1234
name: iGPUTest
net0: virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr5,firewall=1
numa: 0
ostype: win10
scsi0: local:123/vm-123-disk-1.qcow2,iothread=1,size=50G
scsihw: virtio-scsi-single
smbios1: uuid=aaaaaaa-aaaaaaa-aaaaaaa-aaaaaaa-aaaaaaa
sockets: 1
vmgenid: aaaaaaa-aaaaaaa-aaaaaaa-aaaaaaa-aaaaaaa

dmesg |grep -e DMAR -e IOMMU -e AMD-Vi
[ 0.159159] AMD-Vi: Using global IVHD EFR:0x246577efa2254afa, EFR2:0x0

dmesg | grep 'remapping'
[ 0.480650] AMD-Vi: Interrupt remapping enabled

dmesg | grep -i vfio
[ 11.129426] VFIO - User Level meta-driver version: 0.3
[ 11.146535] vfio-pci 0000:7c:00.0: vgaarb: deactivate vga console
[ 11.146537] vfio-pci 0000:7c:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=none
[ 11.146643] vfio_pci: add [1002:13c0[ffffffff:ffffffff]] class 0x000000/00000000
[ 11.170511] vfio_pci: add [1002:1640[ffffffff:ffffffff]] class 0x000000/00000000
[ 129.110782] vfio-pci 0000:03:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=none
[ 129.182108] vfio-pci 0000:03:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=none
[ 129.182253] vfio-pci 0000:03:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=none
[ 130.559031] vfio-pci 0000:03:00.0: enabling device (0002 -> 0003)
[ 130.579019] vfio-pci 0000:03:00.1: enabling device (0000 -> 0002)

lspci -nnk (Host)
7c:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Granite Ridge [Radeon Graphics] [1002:13c0] (rev cb)
Subsystem: Gigabyte Technology Co., Ltd Granite Ridge [Radeon Graphics] [1458:d000]
Kernel driver in use: vfio-pci
Kernel modules: amdgpu
7c:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
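
The same "Kernel driver in use" information can be read straight from sysfs, which is handy in a script that should refuse to start the VM unless both functions are bound to vfio-pci. This is only a sketch: `pci_driver` and its optional second argument are hypothetical names of mine, not part of any tool.

```shell
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI address, e.g. 0000:7c:00.0
# $2: optional sysfs device root (hypothetical parameter for testing).
pci_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; keep its last component.
        target=$(readlink "$link")
        echo "${target##*/}"
    else
        echo "none"
    fi
}

# Example call on a real host:
#   pci_driver 0000:7c:00.0
```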

I also tried following https://github.com/isc30/ryzen-gpu-passthrough-proxmox by either dumping the vbios myself or using the 9800x3d.bin which the repo provides - or using none at all.

Inside the VM, the device is detected on both Linux (where I have tried multiple kernels) and Windows, by lspci and by the AMD Adrenalin installer respectively.
However, whenever I probe the driver (Linux) or open the Adrenalin app (Windows), I get errors. To be fair, it worked ONCE on Linux (iGPU listed in /dev/dri*), but never again. I had problems before that as well, so it must have been pure luck.
Installing the official AMD drivers on Linux did not help either.

Debian 12 (6.12.12+bpo-amd64) during boot:
shpchp 0000:05:04.0: pci_hp_register failed with error -16
shpchp 0000:05:04.0: Slot initialization failed
snd_hda_intel 0000:00:1b.0: no codecs found!

Debian 12 (6.12.12+bpo-amd64) probing amdgpu:
amdgpu 0000:01:00.0: amdgpu: Failed to load toc
amdgpu 0000:01:00.0: amdgpu: PSP tmr init failed!
amdgpu 0000:01:00.0: amdgpu: PSP firmware loading failed
amdgpu_device_fw_loading [amdgpu] *ERROR* hw_init of IP block <psp> failed -22
amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -22

Interestingly the sound device seems to be working!

lspci -nnk (VM)
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Granite Ridge [Radeon Graphics] [1002:13c0] (rev cb)
Subsystem: Gigabyte Technology Co., Ltd Granite Ridge [Radeon Graphics] [1458:d000]
Kernel modules: amdgpu
02:00.0 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

Finally, I am suffering from the reset bug even though I am using the vendor-reset service/module (Windows VM, Linux host). My host won't reboot properly (it hangs), so I have to power cycle.

Judging by https://linux-hardware.org/?id=pci:1002-13c0-1043-8877, I am not sure whether it should be working at all.

Any help is appreciated, thanks!!!
 
Got it working out of the box with Ubuntu Server 24.10 and Mesa Gallium driver 24.0.9. It keeps working after guest reboots (because rebooting the guest does not completely shut down the VM).
Nevertheless, the reset bug still bothers me, and recent chip families like Granite Ridge are not supported by vendor-reset.

Linux igpu 6.11.0-19-generic #19-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 12 21:43:43 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

VM config
agent: 1
bios: ovmf
boot: order=scsi0;net0
cores: 8
cpu: host
efidisk0: local:910/vm-910-disk-1.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:7c:00.0,pcie=1,romfile=vbios_9800x3d.bin
hostpci1: 0000:7c:00.1,pcie=1
machine: q35
memory: 8192
meta: creation-qemu=9.2.0,ctime=123
name: iGPU
net0: virtio=12:34:56:78:90,bridge=vmbr5,firewall=1
numa: 0
ostype: l26
scsi0: local:910/vm-910-disk-0.qcow2,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=aaasfsafasfsafasfas
sockets: 1
vga: none
vmgenid: aaasfsafasfsafasfas

radeontop
[Screenshot: radeontop output]

lspci -nnk
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Granite Ridge [Radeon Graphics] [1002:13c0] (rev cb)
Subsystem: Gigabyte Technology Co., Ltd Device [1458:d000]
Kernel driver in use: amdgpu
Kernel modules: amdgpu
02:00.0 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller [1002:1640]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

ls /dev/dri*
by-path card0 card1 renderD128

glxinfo (with Display=none)
name of display: :99
display: :99 screen: 0
direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.4

Some more info from jellyfin-ffmpeg for those who are interested:
Trying display: drm
libva info: VA-API version 1.22.0
libva info: Trying to open /usr/lib/jellyfin-ffmpeg/lib/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_22
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.22 (libva 2.22.0)
vainfo: Driver version: Mesa Gallium driver 24.0.9 for AMD Radeon Graphics (radeonsi, raphael_mendocino, LLVM 17.0.6, DRM 3.59, 6.11.0-19-generic)
vainfo: Supported profile and entrypoints
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointEncSlice
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointVLD
VAProfileAV1Profile0 : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc

[AVHWDeviceContext @ 0x5a6386878740] Opened DRM device /dev/dri/renderD128: driver amdgpu version 3.59.0.
Applying option init_hw_device (initialise hardware device) with argument vulkan@dr.
[AVHWDeviceContext @ 0x5a6386878940] Supported layers:
[AVHWDeviceContext @ 0x5a6386878940] VK_LAYER_MESA_device_select
[AVHWDeviceContext @ 0x5a6386878940] VK_LAYER_MESA_overlay
[AVHWDeviceContext @ 0x5a6386878940] Using instance extension VK_KHR_portability_enumeration
[AVHWDeviceContext @ 0x5a6386878940] GPU listing:
[AVHWDeviceContext @ 0x5a6386878940] 0: AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO) (integrated) (0x13c0)
[AVHWDeviceContext @ 0x5a6386878940] Requested device: 0x13c0
[AVHWDeviceContext @ 0x5a6386878940] Device 0 selected: AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO) (integrated) (0x13c0)
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_KHR_push_descriptor
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_EXT_descriptor_buffer
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_EXT_physical_device_drm
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_EXT_shader_atomic_float
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_KHR_external_memory_fd
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_EXT_external_memory_dma_buf
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_EXT_image_drm_format_modifier
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_KHR_external_semaphore_fd
[AVHWDeviceContext @ 0x5a6386878940] Using device extension VK_EXT_external_memory_host
[AVHWDeviceContext @ 0x5a6386878940] Queue families:
[AVHWDeviceContext @ 0x5a6386878940] 0: graphics compute transfer (queues: 1)
[AVHWDeviceContext @ 0x5a6386878940] 1: compute transfer (queues: 4)
[AVHWDeviceContext @ 0x5a6386878940] 2: sparse (queues: 1)
[AVHWDeviceContext @ 0x5a6386878940] Using device: AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO)
[AVHWDeviceContext @ 0x5a6386878940] Alignments:
[AVHWDeviceContext @ 0x5a6386878940] optimalBufferCopyRowPitchAlignment: 1
[AVHWDeviceContext @ 0x5a6386878940] minMemoryMapAlignment: 4096
[AVHWDeviceContext @ 0x5a6386878940] nonCoherentAtomSize: 64
[AVHWDeviceContext @ 0x5a6386878940] minImportedHostPointerAlignment: 4096
[AVHWDeviceContext @ 0x5a6386878940] Using queue family 0 (queues: 1) for graphics
[AVHWDeviceContext @ 0x5a6386878940] Using queue family 1 (queues: 4) for compute transfers
Successfully parsed a group of options.
 
TLDR: Everything is working as expected unless you care about the annoying reset bug. I have not found a solution for it so far other than rebooting the host (the iGPU only supports bus resets).
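
If you want to check which reset methods your own device offers, recent kernels expose them in the `reset_method` sysfs attribute of the PCI device. A minimal sketch, assuming your kernel provides that attribute; `pci_reset_methods` and its optional second argument are hypothetical names I made up for illustration.

```shell
# Print the reset methods the kernel lists for a PCI device
# (space-separated), or "unknown" if the attribute is missing
# (older kernel, or no such device).
# $1: PCI address, e.g. 0000:7c:00.0
# $2: optional sysfs device root (hypothetical parameter for testing).
pci_reset_methods() {
    file="${2:-/sys/bus/pci/devices}/$1/reset_method"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

# Example call on a real host:
#   pci_reset_methods 0000:7c:00.0
```

If the only entry printed is `bus`, that matches what I describe above: there is no function-level reset to fall back on.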