AMD Ryzen 7 "Renoir" 4750G APU and iGPU pass-thru (to Windows 10 guest)?

NetworkingMicrobe · Mar 23, 2021

Ramalama said:
As far as I remember, you need OVMF; try q35 v5.1 and v5.2.

I went with SeaBIOS because my VBIOS (which I don't pass as a romfile in the PCIe device declaration in PVE, since I found it made no difference) doesn't appear to support UEFI. I tested it using the guide on the Proxmox wiki here: https://pve.proxmox.com/wiki/Pci_pa...a_Graphics_Card_is_UEFI_.28OVMF.29_compatible

Output was:

Code:

# ./rom-parser Renoir-017.010.000.015.000000.ROM 
Valid ROM signature found @0h, PCIR offset 1b0h
    PCIR: type 0 (x86 PC-AT), vendor: 1002, device: 1636, class: 030000
    PCIR: revision 0, vendor revision: 110a
    Last image

I also couldn't get any output at all for any guest VM (W10, or otherwise) when trying with OVMF, another reason to stick with SeaBIOS.
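
For anyone who wants to double-check what rom-parser reports, here is a minimal Python sketch that walks the images in a PCI option ROM and lists each PCIR code type (0 is legacy x86 PC-AT, 3 is EFI). The offsets follow the standard option ROM layout as I understand it; the function names are my own and this is not how rom-parser itself is implemented.

```python
import struct

# Code type values used in the PCIR structure of a PCI option ROM image.
CODE_TYPES = {0x00: "x86 PC-AT (legacy BIOS)", 0x01: "Open Firmware",
              0x02: "HP PA-RISC", 0x03: "EFI"}


def list_rom_images(rom: bytes):
    """Return one (offset, code_type, is_last) tuple per image found in the ROM."""
    images = []
    offset = 0
    while offset + 0x1A <= len(rom):
        # Every image starts with the 0x55 0xAA signature.
        if rom[offset:offset + 2] != b"\x55\xaa":
            break
        # A little-endian 16-bit pointer to the PCIR structure sits at 0x18.
        (pcir_ptr,) = struct.unpack_from("<H", rom, offset + 0x18)
        pcir = offset + pcir_ptr
        if pcir + 0x16 > len(rom) or rom[pcir:pcir + 4] != b"PCIR":
            break
        # Image length is stored in 512-byte blocks at PCIR + 0x10.
        (image_blocks,) = struct.unpack_from("<H", rom, pcir + 0x10)
        code_type = rom[pcir + 0x14]
        # Bit 7 of the indicator byte marks the last image ("Last image" above).
        is_last = bool(rom[pcir + 0x15] & 0x80)
        images.append((offset, code_type, is_last))
        if is_last or image_blocks == 0:
            break
        offset += image_blocks * 512
    return images


def has_efi_image(rom: bytes) -> bool:
    """True if any image in the ROM carries the EFI code type (3)."""
    return any(code_type == 0x03 for _, code_type, _ in list_rom_images(rom))
```

Reading the dumped ROM with `open(path, "rb").read()` and passing the bytes to `list_rom_images` should give a single entry with code type 0 for a file like the one above, which is what "type 0 (x86 PC-AT)" followed by "Last image" indicates.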

How can I try with v5.1 and 5.2 q35 machines? I think only 4.0 is installed in the latest version of PVE?

Ramalama · Mar 23, 2021

You can switch: just double-click q35 and tick the advanced options.

The 4750G GPU doesn't support UEFI? I don't believe that xD

I mean, I see type 0, but I still don't believe that xD

NetworkingMicrobe · Mar 23, 2021

Ramalama said:
You can switch: just double-click q35 and tick the advanced options.

The 4750G GPU doesn't support UEFI? I don't believe that xD

I mean, I see type 0, but I still don't believe that xD

I can't seem to find the option to switch to other q35 versions, either in the "create a VM" wizard, or the "hardware" tab once I made a VM, even with advanced enabled in the wizard...

I also find it hard to believe that the VBIOS images I extracted from different BIOSes all report type 0, since I can clearly boot W10 in UEFI mode on bare metal and it works fine...so I'm not sure what that is about.
I did try with the q35 machine (honestly not sure which version, since I can't see any more details) and with OVMF, and the screen just goes from "Loading initial ramdisk...", which was left over after I disabled efifb on boot, to a fully black screen. No OVMF Proxmox logo or Windows logo at all.

Screen Shot 2021-03-23 at 4.47.38 PM.png

Screen Shot 2021-03-23 at 4.47.46 PM.png

Ramalama · Mar 23, 2021

Yes, exactly there. Under edit machine.
Did you update Proxmox to the latest version?

NetworkingMicrobe · Mar 23, 2021

Looks like I was running 6.3-2 instead of the latest 6.3-6, and now I see the option. Sorry about that; I had just downloaded it, so I thought I was on the latest version, but I guess not.

Tried out OVMF (q35 5.1 and 5.2), can't even get it to initialize the iGPU, screen is still on "Loading initial ramdisk..." from during bootup when efifb got turned off -- both with and without VBIOS romfile.

NetworkingMicrobe · Mar 23, 2021

Conversely, if I remove amdgpu from the pve-blacklist.conf file, remove the GPU and audio device from /etc/modprobe.d/vfio.conf (so they don't bind to VFIO on boot), and remove video=efifb:off from /etc/default/grub, then when I start my VM with OVMF, with or without VBIOS, the screen goes black and the monitor says "no signal detected". However, the iGPU still shows up in W10 Device Manager (code 43 though).
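
Since the result depends on whether amdgpu or vfio-pci owns the device when the VM starts, it helps to confirm the binding on the host before each test. A minimal Python sketch, assuming the usual sysfs layout where each PCI device directory has a `driver` symlink; the helper name and the `sysfs_root` parameter are my own.

```python
import os


def bound_driver(pci_addr: str, sysfs_root: str = "/sys"):
    """Return the name of the kernel driver bound to a PCI device, or None."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "driver")
    if not os.path.islink(link):
        return None  # no driver bound, or the device does not exist
    # The symlink points at .../bus/pci/drivers/<name>; keep only <name>.
    return os.path.basename(os.readlink(link))
```

On this host, `bound_driver("0000:03:00.0")` would be expected to return `vfio-pci` with the vfio.conf entries in place, and `amdgpu` with the configuration described above until the VM start rebinds the device.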

Ramalama · Mar 23, 2021

Yeah, the devs were busy providing updates last week xD

Lol, I wanted to give you a hint to add "textonly video=efifb:off" to the GRUB cmdline. There was one more option, to not initialize graphics at all, but I forgot what it was called...

I remember with my Nvidia graphics card that I couldn't get it working without the cmdline options: once the host touched the card during boot, I had code 43 in the guest...

But for me it is easy, because I actually have an AST2500 (IPMI/BMC) as a graphics card, so with textonly video=astdrmfb video=efifb:off I bound the whole output to that card and Nvidia passthrough worked perfectly...

However, you could try this: vfio-pci.ids=1002:1636 textonly video=efifb:off

NetworkingMicrobe · Mar 23, 2021

Ramalama said:
Yeah, the devs were busy providing updates last week xD

Lol, I wanted to give you a hint to add "textonly video=efifb:off" to the GRUB cmdline. There was one more option, to not initialize graphics at all, but I forgot what it was called...

I remember with my Nvidia graphics card that I couldn't get it working without the cmdline options: once the host touched the card during boot, I had code 43 in the guest...

But for me it is easy, because I actually have an AST2500 (IPMI/BMC) as a graphics card, so with textonly video=astdrmfb video=efifb:off I bound the whole output to that card and Nvidia passthrough worked perfectly...

However, you could try this: vfio-pci.ids=1002:1636 textonly video=efifb:off

I tried the options you suggested, and no real change. Screen still stuck on "Loading initial ramdisk" and I guess that means the VM can't "grab" control of the iGPU. Not sure if you saw my post above yours (#27), but at one point I did manage to get the VM to "grab" the iGPU I think, because the screen output went blank/no signal found (displayed on my monitor) as soon as I turn on the VM. I listed the details in post 27.

Really appreciate the time you and other users are taking to help me out with this! Thanks a lot.

EDIT: I followed dmesg live as I start the OVMF VM and I noticed an interesting error: [drm:amdgpu_pci_remove [amdgpu]] *ERROR* Hotplug removal is not supported

Code:

[  +0.033377] Generic FE-GE Realtek PHY r8169-200:00: attached PHY driver (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
[  +0.160763] amdgpu 0000:03:00.0: amdgpu: Unsupported power profile mode 0 on RENOIR
[  +0.047528] r8169 0000:02:00.0 enp2s0: Link is Down
[  +0.002766] vmbr0: port 1(enp2s0) entered blocking state
[  +0.000005] vmbr0: port 1(enp2s0) entered forwarding state
[  +0.150011] bpfilter: Loaded bpfilter_umh pid 873
[  +0.000200] Started bpfilter
[  +0.631307] vmbr0: port 1(enp2s0) entered disabled state
[  +2.714129] r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control rx/tx
[  +0.000027] vmbr0: port 1(enp2s0) entered blocking state
[  +0.000005] vmbr0: port 1(enp2s0) entered forwarding state
[  +0.000108] IPv6: ADDRCONF(NETDEV_CHANGE): vmbr0: link becomes ready

***THIS IS WHERE I CLICK START FOR THE W10 OVMF Q35 V5.1 VM***

[Mar23 18:49] [drm:amdgpu_pci_remove [amdgpu]] *ERROR* Hotplug removal is not supported
[  +0.001332] amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device.
[  +0.100417] Console: switching to colour dummy device 80x25
[  +0.054735] [drm] free PSP TMR buffer
[  +0.038138] [TTM] Zone  kernel: Used memory at exit: 0 KiB
[  +0.000011] [TTM] Zone   dma32: Used memory at exit: 0 KiB
[  +0.000005] [drm] amdgpu: ttm finalized
[  +0.000799] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  +0.479684] device tap101i0 entered promiscuous mode
[  +0.026777] fwbr101i0: port 1(fwln101i0) entered blocking state
[  +0.000004] fwbr101i0: port 1(fwln101i0) entered disabled state
[  +0.000059] device fwln101i0 entered promiscuous mode
[  +0.000034] fwbr101i0: port 1(fwln101i0) entered blocking state
[  +0.000001] fwbr101i0: port 1(fwln101i0) entered forwarding state
[  +0.002669] vmbr0: port 2(fwpr101p0) entered blocking state
[  +0.000004] vmbr0: port 2(fwpr101p0) entered disabled state
[  +0.000066] device fwpr101p0 entered promiscuous mode
[  +0.000042] vmbr0: port 2(fwpr101p0) entered blocking state
[  +0.000003] vmbr0: port 2(fwpr101p0) entered forwarding state
[  +0.002539] fwbr101i0: port 2(tap101i0) entered blocking state
[  +0.000003] fwbr101i0: port 2(tap101i0) entered disabled state
[  +0.000087] fwbr101i0: port 2(tap101i0) entered blocking state
[  +0.000002] fwbr101i0: port 2(tap101i0) entered forwarding state
[  +0.024589] kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable
[  +0.587221] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x270
[  +0.000009] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1b@0x2d0
[  +0.000002] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x25@0x400
[  +0.000001] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x26@0x410
[  +0.000001] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x27@0x440

So then I blacklist amdgpu again, and get a BAR memory reservation error instead...progress?! Weird how this isn't an issue at all with SeaBIOS, but I would like to continue down the OVMF path for now...

Code:

***THIS IS WHERE I CLICK START FOR THE W10 OVMF Q35 V5.1 VM***

[ +16.389389] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  +0.043602] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  +0.000598] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  +0.510428] device tap101i0 entered promiscuous mode
[  +0.028015] fwbr101i0: port 1(fwln101i0) entered blocking state
[  +0.000004] fwbr101i0: port 1(fwln101i0) entered disabled state
[  +0.000065] device fwln101i0 entered promiscuous mode
[  +0.000034] fwbr101i0: port 1(fwln101i0) entered blocking state
[  +0.000001] fwbr101i0: port 1(fwln101i0) entered forwarding state
[  +0.002576] vmbr0: port 2(fwpr101p0) entered blocking state
[  +0.000003] vmbr0: port 2(fwpr101p0) entered disabled state
[  +0.000063] device fwpr101p0 entered promiscuous mode
[  +0.000037] vmbr0: port 2(fwpr101p0) entered blocking state
[  +0.000001] vmbr0: port 2(fwpr101p0) entered forwarding state
[  +0.002624] fwbr101i0: port 2(tap101i0) entered blocking state
[  +0.000005] fwbr101i0: port 2(tap101i0) entered disabled state
[  +0.000092] fwbr101i0: port 2(tap101i0) entered blocking state
[  +0.000001] fwbr101i0: port 2(tap101i0) entered forwarding state
[  +0.024897] kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable
[  +0.586979] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x270
[  +0.000010] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1b@0x2d0
[  +0.000001] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x25@0x400
[  +0.000002] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x26@0x410
[  +0.000001] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x27@0x440
[  +0.000589] vfio-pci 0000:03:00.0: BAR 0: can't reserve [mem 0xd0000000-0xdfffffff 64bit pref]
[Mar23 19:24] usb 3-2.2: reset full-speed USB device number 3 using xhci_hcd

Then I check /proc/iomem to see what's using that range, and it is the efifb. So I have to set efifb:off again, almost back to square 1. Then my dmesg looks relatively fine, no errors, but still no output on the display, just "Loading initial ramdisk" from the kernel.
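
The /proc/iomem check can be scripted so it is easy to repeat after each change. A minimal Python sketch, assuming the usual `start-end : name` line format with two spaces of indentation per nesting level; the function name is my own. Read the file as root, since unprivileged reads typically show zeroed addresses.

```python
import re

# One /proc/iomem entry: optional indentation, hex range, then the owner name.
_LINE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def owners_of_range(iomem_text: str, start: int, end: int):
    """Return (depth, lo, hi, name) for every entry overlapping [start, end]."""
    hits = []
    for line in iomem_text.splitlines():
        match = _LINE.match(line)
        if not match:
            continue
        indent, lo, hi, name = match.groups()
        lo, hi = int(lo, 16), int(hi, 16)
        # Two closed intervals overlap when each starts before the other ends.
        if lo <= end and hi >= start:
            hits.append((len(indent) // 2, lo, hi, name.strip()))
    return hits
```

Calling `owners_of_range(open("/proc/iomem").read(), 0xd0000000, 0xdfffffff)` with the BAR 0 range from the error above lists every resource claiming part of it; an `efifb` entry nested under the device is the conflict described here.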

Code:

[Mar23 19:31] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  +0.037397] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  +0.000575] vfio-pci 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  +0.479181] device tap101i0 entered promiscuous mode
[  +0.027498] fwbr101i0: port 1(fwln101i0) entered blocking state
[  +0.000005] fwbr101i0: port 1(fwln101i0) entered disabled state
[  +0.000077] device fwln101i0 entered promiscuous mode
[  +0.000042] fwbr101i0: port 1(fwln101i0) entered blocking state
[  +0.000001] fwbr101i0: port 1(fwln101i0) entered forwarding state
[  +0.002831] vmbr0: port 2(fwpr101p0) entered blocking state
[  +0.000004] vmbr0: port 2(fwpr101p0) entered disabled state
[  +0.000070] device fwpr101p0 entered promiscuous mode
[  +0.000039] vmbr0: port 2(fwpr101p0) entered blocking state
[  +0.000001] vmbr0: port 2(fwpr101p0) entered forwarding state
[  +0.003233] fwbr101i0: port 2(tap101i0) entered blocking state
[  +0.000004] fwbr101i0: port 2(tap101i0) entered disabled state
[  +0.000070] fwbr101i0: port 2(tap101i0) entered blocking state
[  +0.000001] fwbr101i0: port 2(tap101i0) entered forwarding state
[  +0.024956] kvm: SMP vm created on host with unstable TSC; guest TSC will not be reliable
[  +0.586866] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x270
[  +0.000009] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1b@0x2d0
[  +0.000002] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x25@0x400
[  +0.000001] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x26@0x410
[  +0.000001] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x27@0x440
[Mar23 19:32] usb 3-2.2: reset full-speed USB device number 3 using xhci_hcd

Ramalama · Mar 24, 2021

I read your post above, that's why I wrote "lol, I wanted", but you had already tried that; I still decided to keep writing xD
Dunno, better safe than sorry xD

But yes, I guess you are unlucky and will never get that working, only wasting your time.

I have another idea, if the whole OVMF thing doesn't work and SeaBIOS only works halfway with a corrupted picture...
Maybe then something like Parsec/Openstream or RDP with acceleration will work?

Since these all use only the GPU's internal H.264/H.265 encoder, but don't actually render to the display.
Parsec/Openstream are easy; for accelerated RDP you need to google, there was something with group policy edits to activate that.

Just an idea.

Another idea is that someone with actually good IOMMU groups (X570 etc.) tries that, or you need to google whether anyone has had any luck at all passing through the G-series GPU xD

NetworkingMicrobe · Mar 24, 2021

Ramalama said:
I read your post above, that's why I wrote "lol, I wanted", but you had already tried that; I still decided to keep writing xD
Dunno, better safe than sorry xD

But yes, I guess you are unlucky and will never get that working, only wasting your time.

I have another idea, if the whole OVMF thing doesn't work and SeaBIOS only works halfway with a corrupted picture...
Maybe then something like Parsec/Openstream or RDP with acceleration will work?

Since these all use only the GPU's internal H.264/H.265 encoder, but don't actually render to the display.
Parsec/Openstream are easy; for accelerated RDP you need to google, there was something with group policy edits to activate that.

Just an idea.

Another idea is that someone with actually good IOMMU groups (X570 etc.) tries that, or you need to google whether anyone has had any luck at all passing through the G-series GPU xD

I will look into those options. Thanks again for your time and help!

Maybe one day a future driver will allow it to work with SeaBIOS so I could see more than just a cursor moving around over pixelated output.

Cheers!

thex · Apr 1, 2021

Any progress? Currently trying the same with Ubuntu 20.04.
I get Ubuntu to display on the GPU but not with proper AMD drivers.

Client dmesg filtered for amd:

Code:

[    0.000000] Linux version 5.8.0-48-generic (buildd@lgw01-amd64-008) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #54~20.04.1-Ubuntu SMP Sat Mar 20 13:40:25 UTC 2021 (Ubuntu 5.8.0-48.54~20.04.1-generic 5.8.18)
[    0.000000]   AMD AuthenticAMD
[    0.043320] RAMDISK: [mem 0x2f21f000-0x33906fff]
[    0.140038] Spectre V2 : Mitigation: Full AMD retpoline
[    0.249573] smpboot: CPU0: AMD Common KVM processor (family: 0xf, model: 0x6, stepping: 0x1)
[    0.249666] Performance Events: AMD PMU driver.
[    1.491208] amdkcl: loading out-of-tree module taints kernel.
[    1.491219] amdkcl: module verification failed: signature and/or required key missing - tainting kernel
[    1.551281] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[    1.551282] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[    1.566247] [drm] amdgpu kernel modesetting enabled.
[    1.566248] [drm] amdgpu version: 5.9.10.20.50
[    1.566373] amdgpu: CRAT table not found
[    1.566375] amdgpu: Virtual CRAT table created for CPU
[    1.566385] amdgpu: Topology: Add CPU node
[    1.568501] fb0: switching to amdgpudrmfb from VESA VGA
[    1.568653] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
[    1.568937] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[    1.573748] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
[    1.573750] amdgpu: ATOM BIOS: 113-RENOIR-033
[    1.573823] amdgpu 0000:01:00.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
[    1.573824] amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[    1.573825] amdgpu 0000:01:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[    1.573841] Modules linked in: amdgpu(OE+) iommu_v2 amd_sched(OE) amdttm(OE) amdkcl(OE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec psmouse rc_core virtio_net i2c_i801 ahci net_failover xhci_pci i2c_smbus drm libahci lpc_ich failover xhci_pci_renesas virtio_scsi i2c_algo_bit
[    1.573927]  amdgpu_bo_init+0x21/0xa0 [amdgpu]
[    1.573983]  gmc_v9_0_sw_init+0x365/0x590 [amdgpu]
[    1.574047]  amdgpu_device_init.cold+0x12c4/0x1ad1 [amdgpu]
[    1.574101]  amdgpu_driver_load_kms+0x30/0x200 [amdgpu]
[    1.574149]  amdgpu_pci_probe+0x15e/0x1e0 [amdgpu]
[    1.574205]  amdgpu_init+0xb1/0x1000 [amdgpu]
[    1.574620] caller amdgpu_ttm_init+0x116/0x410 [amdgpu] mapping multiple BARs
[    1.574632] [drm] amdgpu: 512M of VRAM memory ready
[    1.574637] [drm] amdgpu: 3934M of GTT memory ready.
[    3.193943] [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
[    3.194057] [drm:psp_hw_init [amdgpu]] *ERROR* PSP firmware loading failed
[    3.194174] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* hw_init of IP block <psp> failed -22
[    3.194177] amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
[    3.194206] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
[    3.194653] amdgpu: probe of 0000:01:00.0 failed with error -22
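
The numeric codes in these messages are negated Linux errno values, so "failed -22" can be translated into a name. A minimal Python sketch that pulls the failing IP block and errno out of a line like the ones above; the regular expression and function name are my own, written against the message format shown in this log.

```python
import errno
import re

# Matches e.g. "hw_init of IP block <psp> failed -22".
_FAIL = re.compile(r"hw_init of IP block <(\w+)> failed (-?\d+)")


def describe_hw_init_failure(line: str):
    """Return (ip_block, errno_name) for an amdgpu hw_init failure line, else None."""
    match = _FAIL.search(line)
    if not match:
        return None
    block, code = match.group(1), int(match.group(2))
    # Kernel functions return negative errno values; look up the symbolic name.
    return block, errno.errorcode.get(abs(code), "unknown")
```

For the line above this gives `("psp", "EINVAL")`, i.e. the PSP block's hardware init returned "invalid argument", and the later "probe ... failed with error -22" is the same code propagating upwards.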

I already tried supplying a VBIOS file instead of letting the ROM pass through from the device (which seems to work in general) (see: https://forums.unraid.net/topic/100729-help-requested-for-amd-ryzen-5-pro-4650g-passthrough/)

I also tried EFI boot, but this didn't even give me any output.

For the initial installation of the VM I had not yet attached the GPU, so that I could see the installer.
I also tried installing the latest amdgpu drivers, but it didn't help (from here: https://www.amd.com/en/support/grap.../amd-radeon-rx-5700-series/amd-radeon-rx-5700)

leesteken · Apr 1, 2021

I guess that the AMD driver does not expect an integrated GPU inside a VM, and therefore it does things that don't work or skips things that would work. Unfortunately, manufacturers often do not plan, test and support these kinds of configurations. It would be amazing if you do get this to work, but don't expect the manufacturer (e.g., the official drivers) to support and/or help you with this. My bet is on the open-source community-supported drivers (eventually), but please prove me wrong...
EDIT: integrated GPU is perfect for showing the Proxmox console, but not for VMs, and on Ryzen they take 8 PCIe lanes away from passthrough unfortunately.

NetworkingMicrobe · Apr 2, 2021

@thex interesting output, thanks for sharing. I have no updates sadly, and as user avw wrote, it's likely that the drivers don't expect the iGPU in a VM. Also, I'm the user CodingMicrobe over on the Unraid forums.

Having said that, the errors you are seeing are very curious. It looks like the driver fails specifically when initializing the AMD PSP (Platform Security Processor), also known as AMD Secure Technology. It would be worth trying to pass it through as well, on the same bus as the iGPU and audio device, since it is listed as such in the IOMMU groups.
I won't be able to test for a while, but would be very interested in hearing if this helps at all!

Here's a snippet of my PCI devices visible in the host, so 03:00.2 would be the PSP which seems to be needed for iGPU function:

Code:

03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d8)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]
03:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]

EDIT: just to add, I am not sure if the host OS (PVE) uses this encryption controller or how critical it is, so there is a chance that it may lead to your host OS crashing...keep this in mind and proceed with caution.
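
To see which devices share an IOMMU group with the iGPU before passing the PSP through, the groups can be read from sysfs. A minimal Python sketch, assuming the standard `/sys/kernel/iommu_groups/<n>/devices/` layout; the function names and the `sysfs_root` parameter are my own.

```python
import os


def iommu_groups(sysfs_root: str = "/sys"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    base = os.path.join(sysfs_root, "kernel", "iommu_groups")
    groups = {}
    if not os.path.isdir(base):
        return groups  # IOMMU disabled or not exposed by the kernel
    for name in os.listdir(base):
        devices = os.path.join(base, name, "devices")
        if name.isdigit() and os.path.isdir(devices):
            groups[int(name)] = sorted(os.listdir(devices))
    return groups


def group_of(pci_addr: str, sysfs_root: str = "/sys"):
    """Return (group_number, members) for the group holding pci_addr, or None."""
    for number, members in iommu_groups(sysfs_root).items():
        if pci_addr in members:
            return number, members
    return None
```

On the host, `group_of("0000:03:00.0")` shows whether 03:00.1 and 03:00.2 sit in the same group as the VGA function, or whether unrelated devices would be dragged along with it.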

thex · Apr 2, 2021

I added the PSP to the VM, but the result was exactly the same.

However, on one boot it looked a bit different, but I don't know why:

Code:

[    1.571310] [drm] amdgpu: 512M of VRAM memory ready
[    1.571312] [drm] amdgpu: 3934M of GTT memory ready.
[    2.843233] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.843237] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.843240] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x0000033D
[    2.843241] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP0 (0x1)
[    2.843242] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x1
[    2.843243] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x6
[    2.843244] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[    2.843245] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x1
[    2.843246] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.843248] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.843249] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.843251] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[    2.843252] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP1 (0x0)
[    2.843253] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x0
[    2.843254] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x0
[    2.843255] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x0
[    2.843256] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x0
[    2.843257] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[   26.832172] [drm:psp_hw_start [amdgpu]] *ERROR* PSP load tmr failed!
[   26.832250] [drm:psp_hw_init [amdgpu]] *ERROR* PSP firmware loading failed
[   26.832322] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* hw_init of IP block <psp> failed -22
[   26.832325] amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
[   26.832339] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
[   26.832692] amdgpu: probe of 0000:01:00.0 failed with error -22
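
The raw VM_L2_PROTECTION_FAULT_STATUS value and the decoded fields printed under it carry the same information. A minimal Python sketch of the decoding; the bit positions are my assumption, chosen to match the GFX9-style register layout as I understand it and checked against the values the driver prints for 0x0000033D in this log.

```python
# (field name, bit shift, bit width); the layout is an assumption, see the note above.
FIELDS = (
    ("MORE_FAULTS", 0, 1),
    ("WALKER_ERROR", 1, 3),
    ("PERMISSION_FAULTS", 4, 4),
    ("MAPPING_ERROR", 8, 1),
    ("CID", 9, 9),  # printed by the driver as "Faulty UTCL2 client ID"
    ("RW", 18, 1),
)


def decode_fault_status(status: int) -> dict:
    """Split a raw fault status word into its named bit fields."""
    return {name: (status >> shift) & ((1 << width) - 1)
            for name, shift, width in FIELDS}
```

`decode_fault_status(0x33D)` yields MORE_FAULTS 1, WALKER_ERROR 6, PERMISSION_FAULTS 3, MAPPING_ERROR 1, client ID 1 and RW 0, matching the first fault in the log. Client ID 0x1 is labelled MP0 by the driver, so the faulting access at address 0 comes from that block rather than from the graphics engine.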

Currently not sure if I will proceed with this. My main use case would be video transcoding, and for that I could go via LXC and forward /dev/dri (however, I also have a problem there: https://forum.proxmox.com/threads/amd-renoir-drivers-4650g.84205/)

NetworkingMicrobe · Apr 2, 2021

thex said:

I added the PSP to the VM, but the result was exactly the same.

However, on one boot it looked a bit different, but I don't know why:

Currently not sure if I will proceed with this. My main use case would be video transcoding, and for that I could go via LXC and forward /dev/dri (however, I also have a problem there: https://forum.proxmox.com/threads/amd-renoir-drivers-4650g.84205/)

Saw this post on L1T as well, the last comment proposes a possible solution - https://forum.level1techs.com/t/radeon-vii-not-initialising-psp-fails/143654/2

Essentially requiring

Code:

iommu=soft

since there may be issues with AMD IOMMU drivers. Worth a try? Maybe with and without passing through the PSP...

thex · Apr 2, 2021

no, didn't help

Same output on initial start.

Although some new output upon restarting the VM

Code:

[    1.509119] [drm] amdgpu: 512M of VRAM memory ready
[    1.509121] [drm] amdgpu: 3934M of GTT memory ready.
[    2.885269] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885272] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885274] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x0000033D
[    2.885275] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP0 (0x1)
[    2.885276] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x1
[    2.885277] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x6
[    2.885277] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[    2.885278] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x1
[    2.885278] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.885280] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885281] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885283] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x0000033C
[    2.885283] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP0 (0x1)
[    2.885284] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x0
[    2.885284] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x6
[    2.885285] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[    2.885286] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x1
[    2.885286] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.885288] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885289] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885290] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[    2.885291] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP1 (0x0)
[    2.885291] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x0
[    2.885292] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x0
[    2.885293] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x0
[    2.885293] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x0
[    2.885294] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.885319] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885320] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885321] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x0000033C
[    2.885322] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP0 (0x1)
[    2.885322] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x0
[    2.885323] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x6
[    2.885324] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[    2.885324] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x1
[    2.885325] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.885361] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885362] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885364] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x0000033D
[    2.885364] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP0 (0x1)
[    2.885365] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x1
[    2.885365] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x6
[    2.885366] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[    2.885367] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x1
[    2.885367] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.885368] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885369] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885370] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[    2.885371] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP1 (0x0)
[    2.885372] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x0
[    2.885372] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x0
[    2.885373] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x0
[    2.885373] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x0
[    2.885374] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.885402] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885403] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885405] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x0000033D
[    2.885405] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP0 (0x1)
[    2.885406] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x1
[    2.885407] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x6
[    2.885407] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[    2.885408] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x1
[    2.885408] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.885409] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885410] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885411] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
[    2.885412] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP1 (0x0)
[    2.885413] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x0
[    2.885413] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x0
[    2.885414] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x0
[    2.885414] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x0
[    2.885415] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    2.885459] amdgpu 0000:01:00.0: amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[    2.885459] amdgpu 0000:01:00.0: amdgpu:   in page starting at address 0x000000000000 from client 18
[    2.885461] amdgpu 0000:01:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x0000033D
[    2.885461] amdgpu 0000:01:00.0: amdgpu:      Faulty UTCL2 client ID: MP0 (0x1)
[    2.885462] amdgpu 0000:01:00.0: amdgpu:      MORE_FAULTS: 0x1
[    2.885463] amdgpu 0000:01:00.0: amdgpu:      WALKER_ERROR: 0x6
[    2.885463] amdgpu 0000:01:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
[    2.885464] amdgpu 0000:01:00.0: amdgpu:      MAPPING_ERROR: 0x1
[    2.885464] amdgpu 0000:01:00.0: amdgpu:      RW: 0x0
[    3.188360] [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
[    3.188417] [drm:psp_hw_init [amdgpu]] *ERROR* PSP firmware loading failed
[    3.188467] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* hw_init of IP block <psp> failed -22
[    3.188470] amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
[    3.188484] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
[    3.188838] amdgpu: probe of 0000:01:00.0 failed with error -22

NetworkingMicrobe · Apr 2, 2021

thex said:
no, didn't help
Same output on initial start.

Although some new output upon restarting the VM

Sorry I'm out of ideas then

It really seems to be something with the driver not supporting VMs. Even on a Windows 10 VM, all I could see was my mouse cursor moving as I moved it, while the rest of my screen was a pixelated mess.

thex · Apr 2, 2021

I think I'll follow the LXC route now...

I saw that there were some updates to the amdgpu driver in the 5.11 kernel regarding Renoir, but upgrading to 5.11 did not change anything.

thex · Apr 2, 2021

The last thing I tried was whether kernel 5.11 on the PVE side helps at all, but it doesn't (I needed it for the LXC route anyway).
I expected as much; my current guess is that the problem is in the driver's initialization, since PCI passthrough itself seems to be fine.
