[SOLVED] Problem with GPU Passthrough

That was freaking awesome, dude!

Let me first say thank you. And right after this let me point out that I am pretty new to proxmox and all this.

I tried to follow the most common procedures, and as they didn't work I combined them with others, which was probably the beginning of the end :)

Let me know if you need more and I'll post it right away.


First things first...

Yes, VT-d is enabled.


My current VM config looks like this, but I really think the issue starts before the VM itself.
/etc/pve/qemu-server/100.conf:
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;ide0
cores: 12
cpu: host
efidisk0: nvpool02:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hookscript: local:snippets/gpu-hookscript.sh
hostpci0: 0000:03:00,pcie=1,romfile=Navi21.rom
ide0: local:iso/virtio-win-0.1.215.iso,media=cdrom,size=528322K
ide2: local:iso/virtio-win-0.1.215.iso,cache=unsafe,size=528322K
machine: pc-q35-5.2
memory: 16384
meta: creation-qemu=6.2.0,ctime=1654983483
name: legosmagic
net0: virtio=AE:8A:09:05:36:85,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: nvpool02:vm-100-disk-1,size=256G
scsihw: virtio-scsi-pci
smbios1: uuid=ef8e7f9e-e491-4f80-80ec-652921726db5
sockets: 1




I erased all entries from /etc/modprobe.d/blacklist.conf


modprobe.d/vfio.conf
options vfio-pci ids=1002:73bf,1002:ab28,1002:1479,1002:1478 disable_vga=1


/etc/kernel/cmdline now says:
root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt

You wrote: Weird. Looks like you are booting in UEFI from ZFS, but according to the manual systemd-boot is not used. Maybe CSM is enabled?
--> I actually heard that once already but didn't understand what was meant. So am I booting correctly? How do I change this CSM, or check whether it is enabled?


proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with uefi
ls: cannot access '/var/tmp/espmounts/44A2-FE31/vmlinuz-*': No such file or directory
44A2-FE31 is configured with: uefi (versions: 5.15.30-2-pve, 5.15.35-2-pve, 5.16.20-edge, 5.17.14-edge), grub (versions: )
44A3-35FA is configured with: uefi (versions: 5.15.30-2-pve, 5.15.35-2-pve, 5.16.20-edge, 5.17.14-edge)


lspci -nnk GPU

01:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c1)
Kernel driver in use: pcieport
02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
Kernel driver in use: pcieport
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c1)
Subsystem: ASUSTeK Computer Inc. Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1043:04f2]
Kernel driver in use: vfio-pci
Kernel modules: amdgpu
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel


lspci -k after changes and rebooting
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c1)
Kernel driver in use: pcieport
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
Kernel driver in use: pcieport
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c1)
Subsystem: ASUSTeK Computer Inc. Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
Kernel driver in use: vfio-pci
Kernel modules: amdgpu
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
 
That was freaking awesome, dude!
We have done nothing yet ;-)
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;ide0
cores: 12
cpu: host
efidisk0: nvpool02:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hookscript: local:snippets/gpu-hookscript.sh
hostpci0: 0000:03:00,pcie=1,romfile=Navi21.rom
ide0: local:iso/virtio-win-0.1.215.iso,media=cdrom,size=528322K
ide2: local:iso/virtio-win-0.1.215.iso,cache=unsafe,size=528322K
machine: pc-q35-5.2
memory: 16384
meta: creation-qemu=6.2.0,ctime=1654983483
name: legosmagic
net0: virtio=AE:8A:09:05:36:85,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: nvpool02:vm-100-disk-1,size=256G
scsihw: virtio-scsi-pci
smbios1: uuid=ef8e7f9e-e491-4f80-80ec-652921726db5
sockets: 1
Can you try without ,romfile=Navi21.rom? Because we want amdgpu to reset the GPU properly, it is probably best to let it use the physical ROM in its current state.
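If it helps, here is a minimal sketch of one way to apply that from the shell (assuming VM 100 as in the config above; editing 100.conf directly or using the web GUI works just as well):

# re-set hostpci0 without the romfile option so the card's own ROM is used
qm set 100 -hostpci0 0000:03:00,pcie=1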
I erased all entries from /etc/modprobe.d/blacklist.conf

modprobe.d/vfio.conf
options vfio-pci ids=1002:73bf,1002:ab28,1002:1479,1002:1478 disable_vga=1
Please remove 1002:73bf (and the comma) to allow the amdgpu driver to load for the GPU.
Also make sure to run update-initramfs -u (which runs proxmox-boot-tool refresh automatically) to apply the changes and reboot.
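Roughly, the whole sequence would look like this (just a sketch; double-check the IDs against your own lspci -nn output before copying anything):

# /etc/modprobe.d/vfio.conf should then contain only the non-GPU functions:
# options vfio-pci ids=1002:ab28,1002:1479,1002:1478 disable_vga=1
update-initramfs -u      # also triggers proxmox-boot-tool refresh on this setup
reboot
lspci -nnks 03:00        # 03:00.0 should now show "Kernel driver in use: amdgpu"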
/etc/kernel/cmdline now says:
root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt
You have not shown the IOMMU groups without pcie_acs_override yet.
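In case it is useful, this is the usual little script (a generic sketch, nothing specific to your board) to dump the groups so we can see whether the GPU functions sit in their own group:

#!/bin/bash
# print every PCI device grouped by IOMMU group
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        lspci -nns "${d##*/}"
    done
done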
You wrote: Weird. Looks like you are booting in UEFI from ZFS, but according to the manual systemd-boot is not used. Maybe CSM is enabled?
--> I actually heard that once already but didn't understand what was meant. So am I booting correctly? How do I change this CSM, or check whether it is enabled?

proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with uefi
ls: cannot access '/var/tmp/espmounts/44A2-FE31/vmlinuz-*': No such file or directory
44A2-FE31 is configured with: uefi (versions: 5.15.30-2-pve, 5.15.35-2-pve, 5.16.20-edge, 5.17.14-edge), grub (versions: )
44A3-35FA is configured with: uefi (versions: 5.15.30-2-pve, 5.15.35-2-pve, 5.16.20-edge, 5.17.14-edge)
If your cat /proc/cmdline corresponds (after an update-initramfs -u or proxmox-boot-tool refresh and a reboot) with /etc/kernel/cmdline, then you are booting in UEFI mode (with root on ZFS). If you have been seeing changes to /etc/kernel/cmdline take effect (instead of /etc/default/grub), then you are definitely booting with systemd-boot and not GRUB. It does not matter much which one is used, but we need to be sure to edit the right file for the kernel parameters. (Having two possible bootloaders really annoys me about Proxmox when trying to help people.)
Still, I would prefer that you used pve-kernel-5.15 (which I know to work fine with amdgpu) instead of 5.17 edge.
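If in doubt, a quick sketch of how to check (output will obviously differ per system; efibootmgr may need to be installed separately):

proxmox-boot-tool status                       # shows uefi vs grub per ESP
efibootmgr -v | grep -i -e systemd -e grub     # which EFI boot entry is actually used
cat /proc/cmdline                              # should match /etc/kernel/cmdline after a refresh and reboot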
lspci -k after changes and rebooting
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c1)
Subsystem: ASUSTeK Computer Inc. Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
Kernel driver in use: vfio-pci
Kernel modules: amdgpu
We want amdgpu to be the kernel driver in use. This is probably because of the 1002:73bf in /etc/modprobe.d/vfio.conf. You can easily check with lspci -nnks 03:00 after making the changes above (and rebooting and before starting the VM). It's fine that the other functions of the GPU are early bound to vfio-pci.
 
I thought I did...hmm..am I now using the correct one?

Linux hci01 5.15.39-1-pve #1 SMP PVE 5.15.39-1 (Wed, 22 Jun 2022 17:22:00 +0200) x86_64


cat /proc/cmdline
initrd=\EFI\proxmox\5.15.39-1-pve\initrd.img-5.15.39-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt

cat /etc/kernel/cmdline
root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt

cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1002:ab28,1002:1479,1002:1478 disable_vga=1

lspci -nnks 03:00
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c1)
Subsystem: ASUSTeK Computer Inc. Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1043:04f2]
Kernel driver in use: amdgpu
Kernel modules: amdgpu
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel

cat /etc/pve/qemu-server/100.conf
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;ide0
cores: 12
cpu: host
efidisk0: nvpool02:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hookscript: local:snippets/gpu-hookscript.sh
hostpci0: 0000:03:00,pcie=1
ide0: local:iso/virtio-win-0.1.215.iso,media=cdrom,size=528322K
ide2: local:iso/virtio-win-0.1.215.iso,cache=unsafe,size=528322K
machine: pc-q35-5.2
memory: 16384
meta: creation-qemu=6.2.0,ctime=1654983483
name: legosmagic
net0: virtio=AE:8A:09:05:36:85,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: nvpool02:vm-100-disk-1,size=256G
scsihw: virtio-scsi-pci
smbios1: uuid=ef8e7f9e-e491-4f80-80ec-652921726db5
sockets: 1


dmesg | grep -e DMAR -e IOMMU
[ 0.008635] ACPI: DMAR 0x0000000044D58000 000088 (v02 INTEL EDK2 00000002 01000013)
[ 0.008681] ACPI: Reserving DMAR table memory at [mem 0x44d58000-0x44d58087]
[ 0.077421] DMAR: IOMMU enabled
[ 0.187573] DMAR: Host address width 39
[ 0.187573] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.187577] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[ 0.187579] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.187581] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[ 0.187584] DMAR: RMRR base: 0x0000004c000000 end: 0x000000507fffff
[ 0.187586] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.187587] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.187588] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.188501] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.505339] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[ 0.566788] DMAR: No ATSR found
[ 0.566788] DMAR: No SATC found
[ 0.566789] DMAR: IOMMU feature fl1gp_support inconsistent
[ 0.566790] DMAR: IOMMU feature pgsel_inv inconsistent
[ 0.566791] DMAR: IOMMU feature nwfs inconsistent
[ 0.566791] DMAR: IOMMU feature dit inconsistent
[ 0.566791] DMAR: IOMMU feature sc_support inconsistent
[ 0.566792] DMAR: IOMMU feature dev_iotlb_support inconsistent
[ 0.566793] DMAR: dmar0: Using Queued invalidation
[ 0.566795] DMAR: dmar1: Using Queued invalidation
[ 0.567739] DMAR: Intel(R) Virtualization Technology for Directed I/O
[ 3.902125] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug.


lspci -k
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c1)
Kernel driver in use: pcieport
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
Kernel driver in use: pcieport
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c1)
Subsystem: ASUSTeK Computer Inc. Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
Kernel driver in use: amdgpu
Kernel modules: amdgpu
03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel


lspci -v
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c1) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 130, IOMMU group 16
Memory at 82300000 (32-bit, non-prefetchable) [size=16K]
Bus: primary=01, secondary=02, subordinate=03, sec-latency=0
I/O behind bridge: 00004000-00004fff [size=4K]
Memory behind bridge: 82100000-822fffff [size=2M]
Prefetchable memory behind bridge: 0000006000000000-000000640fffffff [size=16640M]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Upstream Port, MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [270] Secondary PCI Express
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [370] L1 PM Substates
Capabilities: [400] Data Link Feature <?>
Capabilities: [410] Physical Layer 16.0 GT/s <?>
Capabilities: [440] Lane Margining at the Receiver <?>
Kernel driver in use: pcieport

02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 131, IOMMU group 17
Bus: primary=02, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: 00004000-00004fff [size=4K]
Memory behind bridge: 82100000-822fffff [size=2M]
Prefetchable memory behind bridge: 0000006000000000-000000640fffffff [size=16640M]
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Downstream Port (Slot-), MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [c0] Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [270] Secondary PCI Express
Capabilities: [2a0] Access Control Services
Capabilities: [400] Data Link Feature <?>
Capabilities: [410] Physical Layer 16.0 GT/s <?>
Capabilities: [440] Lane Margining at the Receiver <?>
Kernel driver in use: pcieport

03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] (rev c1) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
Flags: bus master, fast devsel, latency 0, IRQ 218, IOMMU group 18
Memory at 6000000000 (64-bit, prefetchable) [size=16G]
Memory at 6400000000 (64-bit, prefetchable) [size=256M]
I/O ports at 4000
Memory at 82100000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at 82200000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [64] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [200] Physical Resizable BAR
Capabilities: [240] Power Budgeting <?>
Capabilities: [270] Secondary PCI Express
Capabilities: [2a0] Access Control Services
Capabilities: [2d0] Process Address Space ID (PASID)
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [410] Physical Layer 16.0 GT/s <?>
Capabilities: [440] Lane Margining at the Receiver <?>
Kernel driver in use: amdgpu
Kernel modules: amdgpu

03:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
Flags: fast devsel, IRQ 255, IOMMU group 19
Memory at 82220000 (32-bit, non-prefetchable) [disabled] [size=16K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [64] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150] Advanced Error Reporting
Capabilities: [2a0] Access Control Services
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
 
I thought I did...hmm..am I now using the correct one?

Linux hci01 5.15.39-1-pve #1 SMP PVE 5.15.39-1 (Wed, 22 Jun 2022 17:22:00 +0200) x86_64
Looks good.
cat /proc/cmdline
initrd=\EFI\proxmox\5.15.39-1-pve\initrd.img-5.15.39-1-pve root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt

cat /etc/kernel/cmdline
root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt
Looks good and makes me think you are editing the right file.
cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1002:ab28,1002:1479,1002:1478 disable_vga=1

lspci -nnks 03:00
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1002:73bf] (rev c1)
Subsystem: ASUSTeK Computer Inc. Navi 21 [Radeon RX 6800/6800 XT / 6900 XT] [1043:04f2]
Kernel driver in use: amdgpu
Kernel modules: amdgpu
Looks good.
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
This is fine.
cat /etc/pve/qemu-server/100.conf
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;ide0
cores: 12
cpu: host
efidisk0: nvpool02:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hookscript: local:snippets/gpu-hookscript.sh
hostpci0: 0000:03:00,pcie=1
ide0: local:iso/virtio-win-0.1.215.iso,media=cdrom,size=528322K
ide2: local:iso/virtio-win-0.1.215.iso,cache=unsafe,size=528322K
machine: pc-q35-5.2
memory: 16384
meta: creation-qemu=6.2.0,ctime=1654983483
name: legosmagic
net0: virtio=AE:8A:09:05:36:85,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: nvpool02:vm-100-disk-1,size=256G
scsihw: virtio-scsi-pci
smbios1: uuid=ef8e7f9e-e491-4f80-80ec-652921726db5
sockets: 1
I don't see anything wrong. If you get the VM to start and have problems with the drivers for the GPU, we might want to try a Linux Live ISO like Ubuntu 22.04.
dmesg | grep -e DMAR -e IOMMU
[ 0.008635] ACPI: DMAR 0x0000000044D58000 000088 (v02 INTEL EDK2 00000002 01000013)
[ 0.008681] ACPI: Reserving DMAR table memory at [mem 0x44d58000-0x44d58087]
[ 0.077421] DMAR: IOMMU enabled
[ 0.187573] DMAR: Host address width 39
[ 0.187573] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.187577] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[ 0.187579] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.187581] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[ 0.187584] DMAR: RMRR base: 0x0000004c000000 end: 0x000000507fffff
[ 0.187586] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.187587] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.187588] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.188501] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.505339] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[ 0.566788] DMAR: No ATSR found
[ 0.566788] DMAR: No SATC found
[ 0.566789] DMAR: IOMMU feature fl1gp_support inconsistent
[ 0.566790] DMAR: IOMMU feature pgsel_inv inconsistent
[ 0.566791] DMAR: IOMMU feature nwfs inconsistent
[ 0.566791] DMAR: IOMMU feature dit inconsistent
[ 0.566791] DMAR: IOMMU feature sc_support inconsistent
[ 0.566792] DMAR: IOMMU feature dev_iotlb_support inconsistent
[ 0.566793] DMAR: dmar0: Using Queued invalidation
[ 0.566795] DMAR: dmar1: Using Queued invalidation
[ 0.567739] DMAR: Intel(R) Virtualization Technology for Directed I/O
[ 3.902125] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug.
This does not tell me anything, which might just be my ignorance.
lspci -k
01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c1)
Kernel driver in use: pcieport
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
Kernel driver in use: pcieport
PCI(e) bridges/switches usually don't matter for passthrough.
I skipped the (hard to read) rest because you already showed the (relevant) information.

Is passthrough working now? Did the VM start or give errors? What errors are you encountering (inside the VM)? Anything relevant in journalctl -b 0 from around the time of starting the VM?

One more thing: 6000-series AMD GPUs support Resizable BAR (or Smart Access Memory as they call it), which is not supported by KVM/QEMU passthrough (as far as I know). Please disable it in the BIOS. Maybe I should have said this earlier, sorry.
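As a side note, you can roughly check from the host whether the card currently advertises a resized BAR; a sketch (needs root, and the exact capability offset varies per card):

lspci -vvs 03:00.0 | grep -A4 'Resizable BAR'
# a "BAR 0: current size: 16GB" here suggests Resizable BAR / SAM is still active in the firmware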
 
OMG! I think it kind of works.

I see the GPU in win10 without the error 43! I restarted the VM several times and it keeps working!

Guess it was a combination of everything, but in the end it was the CAM setting in the BIOS.

Thanks Thanks Thanks!!!!!!!!!

The only thing I don't know is whether it's important… when I start the VM I see this:

BdsDxe: loading Boot0007 "Windows Boot Manager" from HD(1,GPT,E4059DC1-9055-42C5-AAFE-9467B7FEEA14,0x800,0x32000)/\EFI\Microsoft\Boot\bootmgfw.efi
BdsDxe: starting Boot0007 "Windows Boot Manager" from HD(1,GPT,E4059DC1-9055-42C5-AAFE-9467B7FEEA14,0x800,0x32000)/\EFI\Microsoft\Boot\bootmgfw.efi


My next question is… how can I now get the VM/Windows 10 onto my monitor?

I keep seeing the Proxmox welcome message with the login prompt?!
 
OMG! I think it kind of works.

I see the GPU in win10 without the error 43! I restarted the VM several times and it keeps working!
I guess it works now.
Guess it was a combination of everything, but in the end it was the CAM setting in the BIOS.
What is CAM?
The only thing I don't know is whether it's important… when I start the VM I see this:

BdsDxe: loading Boot0007 "Windows Boot Manager" from HD(1,GPT,E4059DC1-9055-42C5-AAFE-9467B7FEEA14,0x800,0x32000)/\EFI\Microsoft\Boot\bootmgfw.efi
BdsDxe: starting Boot0007 "Windows Boot Manager" from HD(1,GPT,E4059DC1-9055-42C5-AAFE-9467B7FEEA14,0x800,0x32000)/\EFI\Microsoft\Boot\bootmgfw.efi
The Proxmox boot screen does show things like that for each VM when it boots.
My next question is… how can I now get the VM/Windows 10 onto my monitor?
I keep seeing the Proxmox welcome message with the login prompt?!
Connect a physical monitor to the GPU and set Display to none for the VM. It should use the physical display because it is the only GPU and output of the VM (and you can't look at the Windows desktop from Proxmox anymore).
Note that you won't be able to send mouse and keyboard input to the VM. For this you either pass some USB ports to the VM (and plug a keyboard and mouse into those ports) or PCI(e) passthrough an entire USB controller (just like the GPU) (and plug a keyboard and mouse into the ports of that controller).
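For the simple USB route, a rough sketch (the vendor:product ID is just an example; check yours with lsusb first):

lsusb                              # e.g. "ID 046d:c52b Logitech, Inc. Unifying Receiver"
qm set 100 -usb0 host=046d:c52b    # pass that USB device to VM 100
# or pass a physical port instead:  qm set 100 -usb0 host=1-2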
 
Oh lord, I could cry! I have literally been fiddling around with all this since March!!! Nobody was able to help! But you, mate!!! My hero! Thank you 1000 times! PM me your address and I will try to send you a bottle of good single malt! ❤️
 
Stupid question… if I shut down Windows I end up at the "Welcome to Proxmox" screen. As I am planning to run different VMs and OSes, using 3 monitors, how can I switch from one to the other?
If I shut down Windows I don't see Proxmox anymore, so I won't be able to start a VM, right?
 
Oh lord, I could cry! I have literally been fiddling around with all this since March!!! Nobody was able to help! But you, mate!!! My hero! Thank you 1000 times! PM me your address and I will try to send you a bottle of good single malt! ❤️
Tempting, but it would set a bad precedent, and accepting gifts can actually get me into trouble. Sorry!
 
Stupid question… if I shut down Windows I end up at the "Welcome to Proxmox" screen. As I am planning to run different VMs and OSes, using 3 monitors, how can I switch from one to the other?
You can only use a passed through device with one VM at a time. Just like it would be impossible to share the GPU between three physical computers.
With Proxmox, you usually run different VMs without a GPU and connect to the graphical user interface of the OS inside the VM with noVNC or SPICE.

You can use your Windows VM, with the GPU and keyboard and mouse, to browse to the Proxmox web GUI and use the Console button to connect to and interact with other VMs. I also use a virtual main desktop VM with a GPU on the Proxmox host and connect to other VMs and containers using a browser. If you connect multiple displays to your (Windows VM) GPU, you can move the other VM consoles to other monitors.
If I shut down Windows I don't see Proxmox anymore, so I won't be able to start a VM, right?
Indeed, the GPU will not be reused by Proxmox for a text console. It is possible to write a hook script to get this to work...
You can start other VMs using the Proxmox web GUI or by connecting to the Proxmox server with SSH.
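For the curious, a very rough sketch of what such a hook script could look like (untested here; the PCI address, the snippet path and the amdgpu rebind are assumptions, and it only helps if the GPU resets cleanly):

#!/bin/bash
# /var/lib/vz/snippets/gpu-hookscript.sh -- Proxmox calls it as: <script> <vmid> <phase>
vmid="$1"; phase="$2"
gpu="0000:03:00.0"

if [ "$phase" = "post-stop" ]; then
    # hand the GPU back to the host driver and re-enable the text console
    echo "$gpu" > /sys/bus/pci/drivers/vfio-pci/unbind 2>/dev/null || true
    echo "$gpu" > /sys/bus/pci/drivers/amdgpu/bind 2>/dev/null || true
    for c in /sys/class/vtconsole/vtcon*/bind; do echo 1 > "$c"; done
fi
exit 0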
 
ok. I see. Thanks.
CAM is how ASRock implements SAM.
It's called Clever Access Memory.

The only issue now is how to use or pass through Bluetooth, if possible. Does that actually work at all, or only via a dongle?
 
The only issue now is how to use or pass through Bluetooth, if possible. Does that actually work at all, or only via a dongle?
It depends. Is it a PCIe device, and can you get it to pass through (and reset properly)? Or is it an internal USB device, and can you use USB passthrough? Or do you want to save yourself the trouble and buy a cheap generic USB Bluetooth device?

EDIT: It's probably a combined device with WiFi and not in a separate IOMMU group (but in the big chipset group), so it probably won't work. Remember, there is no such thing as "the Bluetooth". It all depends on the specific physical device, the manufacturer, the drivers, and whether someone has already found out whether it can be passed through (via various work-arounds) before.
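A quick way to find out what kind of device it actually is (just a sketch; the grep patterns are guesses for a typical WiFi/Bluetooth combo card):

lsusb | grep -i bluetooth                              # shows up here -> it is a USB function, try USB passthrough
lspci -nnk | grep -i -B1 -A3 -e network -e wireless    # the WiFi half of a combo card is usually PCIe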
 
Well, whatever works, really. I installed the F1 2022 game to test and realized that I had no mouse within the game, even though it works within Windows etc. (Logitech MX Keys and MX Master 3 mouse) using the Unifying dongle. Any idea why?
I then thought I could connect my Xbox controller and realized Windows did not have a Bluetooth connection.
 
Well, whatever works, really.
Passthrough remains trial and error (and not guaranteed or officially supported). Feel free to try. ;-)
I installed the F1 2022 game to test and realized that I had no mouse within the game, even though it works within Windows etc. (Logitech MX Keys and MX Master 3 mouse) using the Unifying dongle. Any idea why?
I don't have experience with such a mouse and/or Windows games. Maybe PCIe passthrough of a whole USB controller device works better? Lower latency, higher throughput and more compatible with the OS inside the VM.
I then thought I could connect my Xbox controller and realized Windows did not have a Bluetooth connection.
I use a cheap generic USB Bluetooth controller. Not great, but it works for a remote keyboard on the couch.
 
First of all, you saved my a****. So many gray hairs and finally it worked with this patch:
echo 1 > /sys/bus/pci/devices/0000\:09\:00.0/remove
echo 1 > /sys/bus/pci/rescan

You can create a .sh file, make it executable with chmod +x, and add it to cron

File: /root/fix_gpu_pass.sh

# Note: Change "0000\:0X\:00.0" to your GPU PCI ID

#!/bin/bash
echo 1 > /sys/bus/pci/devices/0000\:0X\:00.0/remove
echo 1 > /sys/bus/pci/rescan

Add to cron:

crontab -e

add:

@reboot /root/fix_gpu_pass.sh

Even if I come across as a beginner, I would like to make some comments on this patch to help other beginners get it running a bit faster. Maybe they look obvious to everyone else, but I would have loved to have had them:

1. This patch applies to the host itself, not to the VM.

2. How to find the correct syntax for the PCI ID:

This is the note from the original idea, but how on earth does this map to my specific device?
# Note: Change "0000\:0X\:00.0" to your GPU PCI ID

To find out the correct syntax and exactly which ID to put in there, use
lspci
and look for output like this:
65:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)

Then you would write the line in fix_gpu_pass.sh accordingly:
echo 1 > /sys/bus/pci/devices/0000:65:00.0/remove

I only understood this syntax when I tried ls /sys/bus/pci/devices/ and found the leading 0000.

Because it's NOT the PCI vendor:device ID you get with lspci -s 65:00 -n, which is used in some other tutorials for different purposes:
65:00.0 0300: 10de:1b06 (rev a1)
65:00.1 0403: 10de:10ef (rev a1)

Again, thanks for this patch, and don't be too hard on me; I just wanted to note where I stumbled as a beginner.
 
Good afternoon!
I ran into the following problem: when the guest operating system starts, the following line appears in dmesg:

vfio-pci 0000:65:00.0: No more image in the PCI ROM

What could be the problem?

dmesg | grep -e DMAR -e IOMMU:

[ 0.006765] ACPI: DMAR 0x000000004BA22000 0000E8 (v01 ALASKA A M I 00000001 INTL 20091013)
[ 0.006792] ACPI: Reserving DMAR table memory at [mem 0x4ba22000-0x4ba220e7]
[ 0.154546] DMAR: IOMMU enabled
[ 0.404385] DMAR: Host address width 46
[ 0.404386] DMAR: DRHD base: 0x000000b5ffc000 flags: 0x0
[ 0.404389] DMAR: dmar0: reg_base_addr b5ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 0.404391] DMAR: DRHD base: 0x000000d8ffc000 flags: 0x0
[ 0.404394] DMAR: dmar1: reg_base_addr d8ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 0.404396] DMAR: DRHD base: 0x000000fbffc000 flags: 0x0
[ 0.404398] DMAR: dmar2: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 0.404399] DMAR: DRHD base: 0x00000092ffc000 flags: 0x1
[ 0.404401] DMAR: dmar3: reg_base_addr 92ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 0.404402] DMAR: RMRR base: 0x0000004d242000 end: 0x0000004d48bfff
[ 0.404404] DMAR: ATSR flags: 0x0
[ 0.404406] DMAR-IR: IOAPIC id 12 under DRHD base 0xfbffc000 IOMMU 2
[ 0.404407] DMAR-IR: IOAPIC id 11 under DRHD base 0xd8ffc000 IOMMU 1
[ 0.404408] DMAR-IR: IOAPIC id 10 under DRHD base 0xb5ffc000 IOMMU 0
[ 0.404409] DMAR-IR: IOAPIC id 8 under DRHD base 0x92ffc000 IOMMU 3
[ 0.404409] DMAR-IR: IOAPIC id 9 under DRHD base 0x92ffc000 IOMMU 3
[ 0.404410] DMAR-IR: HPET id 0 under DRHD base 0x92ffc000
[ 0.404411] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.405307] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 2.197911] DMAR: No SATC found
[ 2.197913] DMAR: dmar2: Using Queued invalidation
[ 2.197917] DMAR: dmar1: Using Queued invalidation
[ 2.197919] DMAR: dmar3: Using Queued invalidation
[ 2.200618] DMAR: Intel(R) Virtualization Technology for Directed I/O

lspci -k:

65:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
Subsystem: Gigabyte Technology Co., Ltd GK208B [GeForce GT 710]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
65:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
Subsystem: Gigabyte Technology Co., Ltd GK208 HDMI/DP Audio Controller
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel

cat /proc/cmdline:

BOOT_IMAGE=/boot/vmlinuz-5.15.30-2-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt initcall_blacklist=sysfb_init textonly video=vesafb:off video=efifb:off video=simplefb:off

cat /etc/modprobe.d/vfio.conf:

options vfio-pci ids=10de:128b,10de:0e0f disable_vga=1

cat /proc/iomem:

b6000000-d8ffffff : PCI Bus 0000:64
c8000000-d1ffffff : PCI Bus 0000:65
c8000000-cfffffff : 0000:65:00.0
d0000000-d1ffffff : 0000:65:00.0
d7000000-d80fffff : PCI Bus 0000:65
d7000000-d7ffffff : 0000:65:00.0
d8080000-d8083fff : 0000:65:00.1
d8100000-d8100fff : 0000:64:05.4
d8ffc000-d8ffcfff : dmar1

lspci -v:

65:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd GK208B [GeForce GT 710]
Flags: fast devsel, IRQ 108, NUMA node 0, IOMMU group 72
Memory at d7000000 (32-bit, non-prefetchable) [size=16M]
Memory at c8000000 (64-bit, prefetchable) [size=128M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at b000 [size=128]
Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Legacy Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [128] Power Budgeting <?>
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau

65:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
Subsystem: Gigabyte Technology Co., Ltd GK208 HDMI/DP Audio Controller
Flags: fast devsel, IRQ 109, NUMA node 0, IOMMU group 72
Memory at d8080000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel

Is it really necessary to pass a ROM/firmware file (romfile) to the VM?
 
What a nice post! I've been looking for a solution to this BAR 3 issue all night. Big thanks!

By the way, I also found a better solution in a Reddit post: adding "initcall_blacklist=sysfb_init" to the kernel parameters. There is no need for "video=efifb:off" or "video=simplefb:off" anymore. I tested it as well, and it does solve the problem!

Reference:
https://www.reddit.com/r/VFIO/comme...let_simplefb_stay_away_from_the_gpu/?sort=old
https://www.reddit.com/r/Proxmox/comments/vc9hw3/latest_proxmox_7_the_kernel_breaks_my_gpu/?sort=old
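For anyone copying this: where the parameter goes depends on the bootloader, as discussed earlier in the thread. A sketch for both cases:

# GRUB: append it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then:
update-grub

# systemd-boot (UEFI with root on ZFS): append it to the single line in /etc/kernel/cmdline, then:
proxmox-boot-tool refresh

# reboot afterwards and verify with: cat /proc/cmdline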
This worked for me with an Nvidia T400, a Ryzen 1700X, and a Gigabyte B450M UD3H V2 on BIOS F60.
I was pulling my hair out because just a few days ago this same setup was working, but all of a sudden after moving some VMs around it stopped working.

If someone could update the Proxmox Wiki that'd be amazing.
 
Using the suggestions from this post, this is my working configuration:

MB: Asus ROG Strix X570-F Gaming
BIOS F4403 (AGESA V2 PI 1.2.0.7): enable IOMMU, SVM; disable CSM
CPU: Ryzen 5950x
GPU: GTX 1060

Proxmox 7.2-7
5.15.39-1-pve

/etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet initcall_blacklist=sysfb_init"

/etc/modprobe.d/blacklist.conf
blacklist radeon
blacklist nouveau
blacklist nvidia

/etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1c03,10de:10f1 disable_vga=1

/etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1 report_ignored_msrs=0

/etc/modprobe.d/snd-hda-intel.conf
options snd-hda-intel enable_msi=1

I was able to passthrough the GTX 1060, onboard audio and USB controllers (include ARGB) to a W10 VM.
VM: ovmf, q35, virtio drivers (remove the CD-ROM as it sometimes makes the audio crackle)

Thanks.
 
What a nice post! I've been looking for a solution to this BAR 3 issue all night. Big thanks!

By the way, I also found a better solution in a Reddit post: adding "initcall_blacklist=sysfb_init" to the kernel parameters. There is no need for "video=efifb:off" or "video=simplefb:off" anymore. I tested it as well, and it does solve the problem!

Reference:
https://www.reddit.com/r/VFIO/comme...let_simplefb_stay_away_from_the_gpu/?sort=old
https://www.reddit.com/r/Proxmox/comments/vc9hw3/latest_proxmox_7_the_kernel_breaks_my_gpu/?sort=old
It works after I upgraded from 7.1 to 7.2-7
Thank you!
 
