For me the situation is as follows:
I added the device ID (0x1638 for the Ryzen 5 5600G iGPU) to device-db.h and the udev rule in vendor-reset, so it is reset like an AMD_NAVI10 card, and this made the card show up whenever I start QEMU!
Without vendor-reset, QEMU refuses to start with "pci_hp_register failed with error -16". On top of that, the first time you start it after a host reboot, the card can also fail inside the VM no matter what (for me, all VFIO PCIe cards are simply absent, and the VM is bugged out and reacts roughly 100x slower), even though other people have reported that this first start after a reboot is exactly when it should work, if it works at all.
It would seem that for me (or maybe in general, due to recent updates), the card is already reset by the time QEMU starts. Without vendor-reset, this vanilla Linux reset causes it to malfunction in the guest, and subsequently in the host, resulting in the loathed "pci_hp_register" error on the second invocation of QEMU after a reboot.
So thanks to vendor-reset now apparently also working with the Ryzen 5 5600G iGPU, the card showed up for the first time ever, sadly with Code 43. I then switched my QEMU command to UEFI boot, which requires reinstalling Windows. That seemed like an odd thing to try, despite being mentioned here and there by people such as LANnerd in this thread, because none of the tutorials give real instructions for it, and all the examples use BIOS boot. Anyway, I then also had to install Adrenalin with the driver again.
The card then showed up in Device Manager without Code 43. However, I had done this while still using qxl-vga, so when I started Looking Glass, it unfortunately seemed to capture that virtual card instead. So I rebooted the VM (for the 10th time without rebooting the host) just to remove the qxl-vga device. Sadly, the card then showed Code 43 again, and I had to use Remote Desktop to work with the machine. Using qxl-vga again out of desperation,
I managed to get Windows to load the driver without errors a second time, only to be faced with the same issue. After trying various things a dozen times in a row (repeating exactly what I did before, adding a dedicated pcie-root-port, using romfile=vgabios-cezanne-uefi.bin, which I hadn't done before, plus rebooting the host and hoping the first QEMU start would be different, though it never was) ... I concluded that it is too random and unlikely for the Adrenalin driver to fix the Code 43.
I am not sure what is going on, but I think the Adrenalin installer performs some sort of magic reset procedure that sometimes fixes Code 43; more often it doesn't and demands you restart the PC instead.
Before using UEFI, I was getting these weird warnings from QEMU on STDOUT (not STDERR), so you could not see them in the console with -daemonize:
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_container_dma_map(0x56141ecabab0, 0x380000000000, 0x10000000, 0x7ef58c000000) = -14 (Bad address)
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_container_dma_map(0x56141ecabab0, 0x380010000000, 0x4000, 0x7ef5b4400000) = -14 (Bad address)
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_container_dma_map(0x56141ecabab0, 0x380010005000, 0x1fb000, 0x7ef5b4405000) = -14 (Bad address)
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_container_dma_map(0x56141ecabab0, 0xfe900000, 0x42000, 0x7ef6b6a99000) = -14 (Bad address)
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_container_dma_map(0x56141ecabab0, 0xfe943000, 0x3d000, 0x7ef6b6adc000) = -14 (Bad address)
qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
qemu-system-x86_64: vfio_container_dma_map(0x56141ecabab0, 0xfe988000, 0x4000, 0x7ef6cc007000) = -14 (Bad address)
These errors disappear 100% of the time when I use UEFI with QEMU... I hope that's not just because UEFI breaks the logging somehow.
Here are all the detailed steps I took to make it work; use the normal tutorials as a reference to make sense of them and understand what each part of the procedure is for. I put the things I believe to be essential in bold, the probably redundant stuff in italics, and what I am unsure about in normal formatting.
Grub CMD: amd_iommu=on iommu=pt rd.driver.pre=vfio-pci vfio-pci.ids=1002:1638,1002:1637,1022:15df,1022:1639,1022:15e3,1022:1635,1022:1632 kvm.ignore_msrs=1 video=efifb:off video=vesafb:off
pcie_acs_override=downstream,multifunction
Note: the ACS override lets you "cherry-pick" the GPU out of its IOMMU group, but it allows the VM to read host memory (a real security trade-off)! The video=...:off entries are not needed if the GPU is not used during boot.
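For reference, on a GRUB-based distro these parameters typically go into /etc/default/grub, after which the config is regenerated (the exact path and command vary by distro, so treat this as a sketch with my IDs as placeholders):

```shell
# /etc/default/grub (excerpt) -- device IDs are from my system, substitute yours from `lspci -nn`
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt rd.driver.pre=vfio-pci vfio-pci.ids=1002:1638,1002:1637 kvm.ignore_msrs=1"

# then regenerate the GRUB config (output path differs per distro, e.g. /boot/grub2/grub.cfg):
#   grub-mkconfig -o /boot/grub/grub.cfg
```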
After boot, load vendor_reset and remove the unwanted drivers:
modprobe vendor_reset
rmmod amdgpu
rmmod snd_hda_intel
rmmod snd_hda_codec_hdmi
And rebind the devices to vfio-pci (so you don't have to blacklist the entire driver), plus fix the power states:
# My GPU is in IOMMU group 5 along with the TPM module, a sound card, a mystery USB controller with no external ports, and a dummy/empty PCIe root controller, none of which I need (this differs per system)
IOMMU_GROUP=5
for i in $(command ls /sys/kernel/iommu_groups/${IOMMU_GROUP}/devices); do
    # unbind the current driver, if one is bound at all
    [ -e /sys/bus/pci/devices/${i}/driver ] && echo "$i" > /sys/bus/pci/devices/${i}/driver/unbind
    echo "vfio-pci" > /sys/bus/pci/devices/${i}/driver_override
    echo "$i" > /sys/bus/pci/drivers/vfio-pci/bind
    # TODO: these two hurt power draw; they should go in the VM script later and be toggled only while the VM is on
    echo "0" > /sys/bus/pci/devices/${i}/d3cold_allowed
    echo "on" > /sys/bus/pci/devices/${i}/power/control
done
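To verify the rebind worked, you can check which driver each device ended up with. A small helper for that (the function name is mine, purely for illustration) that pulls the driver line out of `lspci -nnk`-style output:

```shell
# hypothetical helper: print the "Kernel driver in use" value from `lspci -nnk` output
driver_in_use() {
    awk -F': ' '/Kernel driver in use/ {print $2}'
}

# on the real system you would run, for example:
#   lspci -nnks 0e:00.0 | driver_in_use
# and expect it to print "vfio-pci" after the loop above
```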
In vendor-reset, add the following line to src/device-db.h (use lspci -nnk to get the device ID for your GPU, e.g. 1002:1638), right after #define _AMD_NAVI10(op) \:
{PCI_VENDOR_ID_ATI, 0x1638, op, DEVICE_INFO(AMD_NAVI10)}, \
Then you can also add your device ID to 99-vendor-reset.rules, which you have to copy to /etc/udev/rules.d/ by hand... or you can do the same thing manually in the VM script:
modprobe vendor_reset
mygpuid=0000:0e:00.0; echo "device_specific" > /sys/bus/pci/devices/${mygpuid}/reset_method
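After editing src/device-db.h, the module has to be rebuilt and reloaded before the new ID is picked up. A rough sketch, assuming you built vendor-reset straight from its source tree (if you installed it via DKMS, the corresponding dkms rebuild steps apply instead; paths and module file name are assumptions):

```shell
# rebuild and reload vendor-reset after adding the device ID (run inside the source tree)
cd vendor-reset
make
sudo rmmod vendor_reset 2>/dev/null   # ignore the error if it wasn't loaded yet
sudo insmod vendor-reset.ko

# sanity check: confirm the reset method stuck (0000:0e:00.0 is my GPU's address)
cat /sys/bus/pci/devices/0000:0e:00.0/reset_method
```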
The last thing is my qemu command:
qemu-system-x86_64 \
-enable-kvm \
-cpu host,kvm=on,l3-cache=on,hv_relaxed,hv_vapic,hv_time,hv_spinlocks=0x1fff,hv_vendor_id=hv_dummy \
-smp 3 \
-m 4G \
-machine q35,accel=kvm,kernel_irqchip=on \
-net tap,script=no,ifname=vm5,vnet_hdr=on -net nic,macaddr=52:13:37:2A:F1:75,model=e1000 \
-device ivshmem-plain,memdev=ivshmem,bus=pcie.0 \
-object memory-backend-file,id=ivshmem,share=on,mem-path=/dev/shm/looking-glass,size=64M \
-monitor telnet:127.0.0.1:4448,server,nowait \
-device virtio-mouse-pci \
-device virtio-keyboard-pci \
-device ich9-intel-hda \
-device hda-output \
-drive if=pflash,format=raw,readonly=on,file=/usr/share/edk2/x64/OVMF_CODE.4m.fd \
-drive if=pflash,format=raw,file=/home/myuser/STORE/VM/win10.OVMF_VARS.4m.fd \
-spice port=5900,addr=127.0.0.1,disable-ticketing=on,ipv4=on \
-device virtio-serial-pci \
-chardev spicevmc,id=vdagent,name=vdagent \
-device virtserialport,chardev=vdagent,name=com.redhat.spice.0 \
-drive index=0,file=/home/myuser/STORE/VM/win10_uefi.img,if=ide,cache=writeback,format=raw \
-vga none \
-serial file:/tmp/win10kvm.log \
-D /tmp/win10kvm.log2 \
-device pcie-root-port,id=root_port1,chassis=0,slot=0,bus=pcie.0,hotplug=on,multifunction=on \
-device vfio-pci,host=0e:00.0,bus=root_port1,addr=00.0,multifunction=on,x-vga=on,romfile=vgabios-cezanne-uefi.bin \
-device vfio-pci,host=0e:00.1,bus=root_port1,addr=00.1 \
-device vfio-pci,host=0e:00.2,bus=root_port1,addr=00.2 \
-device vfio-pci,host=0e:00.3,bus=root_port1,addr=00.3 \
-device vfio-pci,host=0e:00.4,bus=root_port1,addr=00.4 \
-device vfio-pci,host=0e:00.6,bus=root_port1,addr=00.6 \
-daemonize
The two lines with pflash enable UEFI (don't use the -bios switch). The file in the second line "should" be writable, which is why I copied it to a user folder... though that is probably overcomplicating things. When it first "worked", I was neither using a root port nor adding the entire IOMMU group; I only did that out of desperation. The normal switches without a root port are:
-device vfio-pci,host=0e:00.0,multifunction=on,x-vga=on,romfile=vgabios-cezanne-uefi.bin \
-device vfio-pci,host=0e:00.1 \
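One more note on the Looking Glass side of the command: the ivshmem backing file referenced by mem-path has to be readable by the Looking Glass client, and if QEMU runs as root it will create it root-owned. A common workaround (user/group names here are examples for my setup, adjust to yours) is to pre-create it with the right ownership before launching QEMU:

```shell
# pre-create the Looking Glass shared-memory file with usable permissions
# (must match mem-path and be at least as large as size=64M in the QEMU command)
touch /dev/shm/looking-glass
chown myuser:kvm /dev/shm/looking-glass
chmod 660 /dev/shm/looking-glass
```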
Also, I added
vfio_pci_core vfio_iommu_type1 vfio_pci
to MODULES in /etc/mkinitcpio.conf and ran mkinitcpio -P ... however, I don't think you really need to do this. Likewise, listing
[B]vfio-pci.ids=[/B]
on the kernel command line is, I think, only relevant if you want to pass through the same GPU that is used during the boot process. I think the only parameters you actually need in GRUB are amd_iommu=on iommu=pt , because they enable the IOMMU in case you cannot enable it in the BIOS by hand (as in my case; my BIOS has practically zero options). But if your BIOS lets you enable it there, you might not need to add anything to the kernel command line at all!
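For completeness, this is the kind of edit I mean (Arch-style mkinitcpio shown; other distros use dracut or update-initramfs for the same purpose, so take this as a sketch):

```shell
# /etc/mkinitcpio.conf (excerpt) -- load the vfio modules in early userspace,
# only needed if vfio-pci must claim the GPU before the regular driver does
MODULES=(vfio_pci_core vfio_iommu_type1 vfio_pci)

# then rebuild all initramfs images:
#   mkinitcpio -P
```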
I think it is possible that lots of unsupported GPUs would work with vendor-reset if you simply add their device IDs to either navi10, polaris10, vega10 or vega20. I had a brief look at the Linux amdgpu driver, and there is a comment explaining that the driver itself only uses a handful of reset strategies. So my guess is that most of those strategies are covered by these four options in vendor-reset... according to that explanation, "BACO" reset is only used for dedicated GPUs (so not iGPUs), and NAVI10 is the only one of the four without "BACO".
A couple of people have reported the 5600G or similar iGPUs working with VFIO, and not only in this thread. But none of them explicitly mentioned whether it still worked after restarting QEMU, which is vital for actually using it, and which I have now presented a solution for. So perhaps only a small detail is missing before VFIO can work properly with the 5600G, and possibly tons of other iGPUs and GPUs as well.