[SOLVED] AMD GPU passthrough doesn't work anymore since upgrade to 7.2-11

olixsm

Hello everyone,

I just upgraded from Proxmox 6.4 to 7.2-11.

This broke the passthrough for my AMD GPU, and I don't know how to resolve the problem.

I run different macOS VMs, but they now start without any screen output.

A Linux/Proxmox technician has already tried all the solutions proposed in previous posts, without success.
Can anybody help me?

23:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XTX [Radeon Vega Frontier Edition]
23:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 HDMI Audio [Radeon Vega 56/64]

Thanks in advance for your help.
 
I found that updating to kernel version 5.19 helped a lot. I needed work-arounds since 5.13 (and even more so with 5.15) but with version 5.19 the amdgpu does do a graceful hand-off to vfio-pci again.
And you need to select device_specific when using vendor-reset (which you probably already have installed for your GPU) since kernel version 5.15 (PVE 7.2).
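As a rough sketch of what "selecting device_specific" can look like: kernels from 5.15 on expose a reset_method attribute in sysfs for each PCI device. The helper name use_device_specific_reset is made up for this example, and the address 0000:23:00.0 is assumed from the lspci output in the first post; adjust both to your system.

```shell
#!/bin/sh
# Select the device-specific reset (provided by vendor-reset) for one PCI device.
# $1: PCI address, e.g. 0000:23:00.0
# $2: sysfs root; only overridden for dry runs, defaults to the real one
use_device_specific_reset() {
    attr="${2:-/sys/bus/pci/devices}/$1/reset_method"
    if [ ! -w "$attr" ]; then
        echo "cannot write $attr (kernel older than 5.15, or wrong address?)" >&2
        return 1
    fi
    echo device_specific > "$attr"
}

# On the Proxmox host, as root, before starting the VM:
# use_device_specific_reset 0000:23:00.0
```

The value is not stored across reboots, so it would have to be written again after each boot and before the VM starts.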
 
Hi @leesteken, I looked at Olixsm's configuration today, and we noticed a big improvement thanks to your suggestion to upgrade to kernel 5.19 (we moved to 5.19.7-2-pve).

Unfortunately, we are still facing some issues with the GPU passthrough. Since the upgrade, it no longer works with the PCI-Express option enabled for the graphics card in Proxmox. We have also had a lot of display and performance problems since moving to 5.19.

Can I ask you to review our configuration files? I am sure it will help us a lot :)

/etc/default/grub
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on nofb video=vesafb:off video=efifb:off"

/etc/modules
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/modprobes.d/vfio.conf
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd


/etc/modprobes.d/vfio.conf
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/modprobes.d/blacklist.conf
Code:
blacklist snd_hda_intel
blacklist snd_hda_codec_hdmi
blacklist i915
blacklist radeon
blacklist nouveau
blacklist nvidia
blacklist i2c-nvidia-gpu
blacklist amdgpu
blacklist nvidiafb
blacklist snd_hda_codec
blacklist snd_hda_core

/etc/pve/qemu-server/200.conf
Code:
args: -device isa-applesmc,osk="ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc" -smbios type=2  -smp 56,sockets=7,cores=4,threads=2 -device usb-kbd,bus=ehci.0,port=2 -cpu Penryn,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc,+pcid,+ssse3,+sse4.2,+popcnt,+avx,+avx2,+aes,+fma,+fma4,+bmi1,+bmi2,+xsave,+xsaveopt,check
balloon: 0
bios: ovmf
boot: order=sata0;net0
cores: 56
cpu: Penryn
efidisk0: local-zfs:vm-200-disk-0,size=1M
hostpci0: 0000:03:00.0,pcie=1
hostpci1: 0000:25:00.3,pcie=1
hostpci2: 0000:43:00.0,pcie=1
hostpci3: 0000:23:00,x-vga=1
machine: q35
memory: 108544
name: BigSur-500-VM
net0: vmxnet3=B2:7E:07:BB:9C:7C,bridge=vmbr0,firewall=1
numa: 0
ostype: other
sata0: local-zfs:vm-200-disk-1,cache=unsafe,discard=on,size=500G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=a2342af1-3db2-429d-9a20-a369bd21cb98
sockets: 1
vga: none
virtio1: /dev/disk/by-id/nvme-Force_MP600_20368295000130352032,backup=0,size=1953514584K
virtio2: /dev/disk/by-id/nvme-Force_MP600_2036829500013035201C,backup=0,size=1953514584K
virtio3: /dev/disk/by-label/NVMe2000_BAK,backup=0
vmgenid: 98ca5ced-167f-4a31-88f8-f1e8a8a432fa

Thanks in advance
Vincent
 
/etc/default/grub
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on nofb video=vesafb:off video=efifb:off"
The parameters amd_iommu=on, nofb, video=vesafb:off and video=efifb:off don't really do anything, so you might as well remove them.
/etc/modules
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
I would also expect vendor-reset here when you use an AMD GPU.
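A minimal sketch of adding it, assuming the vendor-reset module is already installed on the host. The helper name ensure_module_listed is made up for this example; it appends the name only if it is not listed yet.

```shell
#!/bin/sh
# Add a module name to a modules list (one name per line) unless already there.
# $1: module name; $2: file, defaults to /etc/modules
ensure_module_listed() {
    file="${2:-/etc/modules}"
    grep -qxF -- "$1" "$file" 2>/dev/null || printf '%s\n' "$1" >> "$file"
}

# On the host, as root, assuming vendor-reset is already installed:
# ensure_module_listed vendor-reset
```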
/etc/modprobes.d/vfio.conf
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
This is wrong. .conf-files in /etc/modprobe.d/ have a different format (and there is no s in modprobe.d).
/etc/modprobes.d/vfio.conf
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
Why do you repeat this? See my remark above.
/etc/modprobes.d/blacklist.conf
Code:
blacklist snd_hda_intel
blacklist snd_hda_codec_hdmi
blacklist i915
blacklist radeon
blacklist nouveau
blacklist nvidia
blacklist i2c-nvidia-gpu
blacklist amdgpu
blacklist nvidiafb
blacklist snd_hda_codec
blacklist snd_hda_core
My suggestion was to not blacklist amdgpu. I don't know why you are blacklisting all this instead of using early binding to vfio-pci. (there should be no s in modprobe.d).
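For comparison, here is a sketch of what an early-binding file in /etc/modprobe.d/ could look like. Files there contain directives such as options and softdep rather than bare module names. The helper name vfio_early_bind and the IDs 1002:aaaa,1002:bbbb are placeholders, not values from this thread; the real vendor:device IDs can be read from lspci -nn.

```shell
#!/bin/sh
# Print a modprobe.d snippet that binds the given PCI IDs to vfio-pci at boot.
# Usage: vfio_early_bind VENDOR:DEVICE[,VENDOR:DEVICE...]
vfio_early_bind() {
    ids="$1"
    [ -n "$ids" ] || { echo "usage: vfio_early_bind ID[,ID...]" >&2; return 1; }
    # One directive per line: "options <module> <parameter>=<value>"
    printf 'options vfio-pci ids=%s\n' "$ids"
    # Ask modprobe to load vfio-pci before the regular GPU and audio drivers
    printf 'softdep amdgpu pre: vfio-pci\n'
    printf 'softdep snd_hda_intel pre: vfio-pci\n'
}

# Example with placeholder IDs (read the real ones from `lspci -nn`):
# vfio_early_bind 1002:aaaa,1002:bbbb > /etc/modprobe.d/vfio.conf
```

After changing files in /etc/modprobe.d/, the initramfs usually has to be rebuilt (update-initramfs -u) and the host rebooted before the change takes effect.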
/etc/pve/qemu-server/200.conf
Code:
args: -device isa-applesmc,osk="ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc" -smbios type=2  -smp 56,sockets=7,cores=4,threads=2 -device usb-kbd,bus=ehci.0,port=2 -cpu Penryn,kvm=on,vendor=GenuineIntel,+kvm_pv_unhalt,+kvm_pv_eoi,+hypervisor,+invtsc,+pcid,+ssse3,+sse4.2,+popcnt,+avx,+avx2,+aes,+fma,+fma4,+bmi1,+bmi2,+xsave,+xsaveopt,check
balloon: 0
bios: ovmf
boot: order=sata0;net0
cores: 56
cpu: Penryn
efidisk0: local-zfs:vm-200-disk-0,size=1M
hostpci0: 0000:03:00.0,pcie=1
hostpci1: 0000:25:00.3,pcie=1
hostpci2: 0000:43:00.0,pcie=1
hostpci3: 0000:23:00,x-vga=1
machine: q35
memory: 108544
name: BigSur-500-VM
net0: vmxnet3=B2:7E:07:BB:9C:7C,bridge=vmbr0,firewall=1
numa: 0
ostype: other
sata0: local-zfs:vm-200-disk-1,cache=unsafe,discard=on,size=500G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=a2342af1-3db2-429d-9a20-a369bd21cb98
sockets: 1
vga: none
virtio1: /dev/disk/by-id/nvme-Force_MP600_20368295000130352032,backup=0,size=1953514584K
virtio2: /dev/disk/by-id/nvme-Force_MP600_2036829500013035201C,backup=0,size=1953514584K
virtio3: /dev/disk/by-label/NVMe2000_BAK,backup=0
vmgenid: 98ca5ced-167f-4a31-88f8-f1e8a8a432fa
I know nothing about Apple macOS VMs, sorry.
Hi @leesteken, I looked at Olixsm's configuration today, and we noticed a big improvement thanks to your suggestion to upgrade to kernel 5.19 (we moved to 5.19.7-2-pve).

Unfortunately, we are still facing some issues with the GPU passthrough. Since the upgrade, it no longer works with the PCI-Express option enabled for the graphics card in Proxmox. We have also had a lot of display and performance problems since moving to 5.19.
That's weird, because you are using the q35 machine type. What other things did you change, and what kernel version did you use before? What errors are you getting?
You don't specify what kind of GPU you are using. I also can't tell from the various hostpci-lines. There is really too little information about your hardware and IOMMU groups to work with. Your configuration is a bit all over the place and I doubt that it worked before in this state.
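One way to collect the missing IOMMU information is to walk the kernel's sysfs tree, as in the sketch below. The function name list_iommu_groups and its optional directory argument are made up for this example; the argument only exists so the function can be tried against a test directory.

```shell
#!/bin/sh
# List every PCI device together with its IOMMU group number.
# $1: optional root directory, defaults to the kernel's sysfs tree
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # nothing found: IOMMU is probably not active
        group="${dev%/devices/*}"      # strip "/devices/<address>"
        group="${group##*/}"           # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

list_iommu_groups
```

Posting this output together with lspci -nn for the listed addresses shows which devices share a group with the GPU.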
 
I found that updating to kernel version 5.19 helped a lot. I needed work-arounds since 5.13 (and even more so with 5.15) but with version 5.19 the amdgpu does do a graceful hand-off to vfio-pci again.
And you need to select device_specific when using vendor-reset (which you probably already have installed for your GPU) since kernel version 5.15 (PVE 7.2).
Kernel 5.19 fixed passthrough for me (6900 XT). No vendor-reset required. Thanks so much!
 
Hello
I just received the solution to my problem (thanks so much to Nick Sherlock).

Because of the new version of the q35 machine type, you need to add "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off" to your VM args in /etc/pve/qemu-server/xxx.conf, where xxx is the ID of your VM.

Anyway, thank you for your help
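For anyone who prefers the command line over editing the file by hand, the fix above could be applied roughly as follows. The helper name with_hotplug_flag is made up for this example, and VM ID 200 is assumed from the configuration posted earlier in the thread. Note that qm set replaces the whole args value, so the existing arguments are read first and the flag is appended to them.

```shell
#!/bin/sh
FLAG='-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off'

# Print the given args string with the workaround appended, unless it is
# already present.
with_hotplug_flag() {
    case "$1" in
        *acpi-pci-hotplug-with-bridge-support*) printf '%s\n' "$1" ;;
        '') printf '%s\n' "$FLAG" ;;
        *) printf '%s %s\n' "$1" "$FLAG" ;;
    esac
}

# Usage on the Proxmox host (as root), assuming VM ID 200 as in this thread:
# current=$(qm config 200 | sed -n 's/^args: //p')
# qm set 200 --args "$(with_hotplug_flag "$current")"
```

The VM needs a full stop and start afterwards so that QEMU is launched with the new arguments.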
 
