[SOLVED] GPU passthrough problem

aldariz

Member
Apr 23, 2022
Hi,

My Windows VM stopped working after I updated Proxmox to the latest version, and I was wondering if anyone could help me find a fix. I believe it's related to the GPU passthrough, but I'm unsure what is causing it. I can still connect to the VM remotely once it has started, but the graphics card shows Code 43. The motherboard is an ASUS Prime Z690-P WiFi D4 and the CPU is an Intel Core i9-12900K.

I'm getting the following error when trying to start the VM:
Code:
Apr 28 22:10:05 host1 kernel: vfio-pci 0000:03:00.0: enabling device (0002 -> 0003)
Apr 28 22:10:06 host1 kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x270
Apr 28 22:10:06 host1 kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1b@0x2d0
Apr 28 22:10:06 host1 kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x26@0x410
Apr 28 22:10:06 host1 kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x27@0x440
Apr 28 22:10:06 host1 kernel: vfio-pci 0000:03:00.1: enabling device (0000 -> 0002)
Apr 28 22:10:08 host1 QEMU[2368]: kvm: VFIO_MAP_DMA failed: Invalid argument
Apr 28 22:10:08 host1 QEMU[2368]: kvm: vfio_dma_map(0x55e4839dc6f0, 0x380000000000, 0x400000000, 0x7fc7fc000000) = -22 (Invalid argument)
Apr 28 22:10:08 host1 QEMU[2368]: kvm: VFIO_MAP_DMA failed: Invalid argument
Apr 28 22:10:08 host1 QEMU[2368]: kvm: vfio_dma_map(0x55e4839dc6f0, 0x380400000000, 0x10000000, 0x7fc7ec000000) = -22 (Invalid argument)
Apr 28 22:10:08 host1 QEMU[2368]: kvm: VFIO_MAP_DMA failed: Invalid argument
Apr 28 22:10:08 host1 QEMU[2368]: kvm: vfio_dma_map(0x55e4839dc6f0, 0x380000000000, 0x400000000, 0x7fc7fc000000) = -22 (Invalid argument)
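The numbers in the failed vfio_dma_map() calls are easier to read once decoded. The sketch below is only an interpretation aid: it converts the two sizes from the log to MiB and checks whether the mapping address fits in 39 bits, on the assumption that the "Host address width 39" line in the dmesg output further down is the relevant limit. The helper names to_mib and fits_width are made up for this example.

```shell
#!/bin/bash
# Decode the arguments of the failed vfio_dma_map() calls from the log above.
# Assumption: "Host address width 39" (dmesg output further down) means the
# IOMMU can only translate addresses below 2^39.

# Print a byte count (hex or decimal) as whole MiB.
to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Succeed if address $1 lies below 2^$2.
fits_width() {
    [ $(( $1 >> $2 )) -eq 0 ]
}

# Sizes taken from the third argument of the two distinct log lines.
for size in 0x400000000 0x10000000; do
    echo "size $size = $(to_mib "$size") MiB"
done

# Address taken from the second argument of the first log line.
iova=0x380000000000
if fits_width "$iova" 39; then
    echo "iova $iova fits in 39 bits"
else
    echo "iova $iova does not fit in 39 bits"
fi
```

The sizes come out as 16384 MiB (16 GiB) and 256 MiB, and the address lies well above 2^39. That would be consistent with a very large device memory region being placed too high for the IOMMU to map, but this is a reading of the log, not something the log itself states.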

pveversion -v
Code:
proxmox-ve: 7.4-1 (running kernel: 5.15.107-1-pve)
pve-manager: 7.4-3 (running version: 7.4-3/9002ab8a)
pve-kernel-5.15: 7.4-2
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.107-1-pve: 5.15.107-1
pve-kernel-5.15.104-1-pve: 5.15.104-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4-2
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-4
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.5
libpve-storage-perl: 7.4-2
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.1-1
proxmox-backup-file-restore: 2.4.1-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.1-1
proxmox-widget-toolkit: 3.6.5
pve-cluster: 7.3-3
pve-container: 4.4-3
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-2
pve-firewall: 4.3-1
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-1
qemu-server: 7.4-3
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1

Grub
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet split_lock_detect=off intel_iommu=on initcall_blacklist=sysfb_init"

dmesg | grep -e DMAR -e IOMMU
Code:
[    0.005956] ACPI: DMAR 0x000000007844C000 000050 (v02 INTEL  EDK2     00000002      01000013)
[    0.005991] ACPI: Reserving DMAR table memory at [mem 0x7844c000-0x7844c04f]
[    0.170366] DMAR: IOMMU enabled
[    0.386396] DMAR: Host address width 39
[    0.386397] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.386401] DMAR: dmar0: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[    0.386403] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 0
[    0.386404] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.386405] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.387811] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    0.606888] DMAR: No RMRR found
[    0.606889] DMAR: No ATSR found
[    0.606889] DMAR: No SATC found
[    0.606891] DMAR: dmar0: Using Queued invalidation
[    0.609541] DMAR: Intel(R) Virtualization Technology for Directed I/O

/etc/modules
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

find /sys/kernel/iommu_groups/ -type l
Code:
/sys/kernel/iommu_groups/17/devices/0000:00:1f.0
/sys/kernel/iommu_groups/17/devices/0000:00:1f.5
/sys/kernel/iommu_groups/17/devices/0000:00:1f.3
/sys/kernel/iommu_groups/17/devices/0000:00:1f.4
/sys/kernel/iommu_groups/7/devices/0000:00:15.1
/sys/kernel/iommu_groups/7/devices/0000:00:15.2
/sys/kernel/iommu_groups/7/devices/0000:00:15.0
/sys/kernel/iommu_groups/25/devices/0000:0b:00.0
/sys/kernel/iommu_groups/15/devices/0000:00:1d.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/23/devices/0000:07:00.0
/sys/kernel/iommu_groups/23/devices/0000:07:00.1
/sys/kernel/iommu_groups/13/devices/0000:00:1c.0
/sys/kernel/iommu_groups/3/devices/0000:00:0a.0
/sys/kernel/iommu_groups/21/devices/0000:03:00.1
/sys/kernel/iommu_groups/11/devices/0000:00:1b.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/18/devices/0000:01:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:16.0
/sys/kernel/iommu_groups/16/devices/0000:00:1d.4
/sys/kernel/iommu_groups/6/devices/0000:00:14.3
/sys/kernel/iommu_groups/24/devices/0000:09:00.0
/sys/kernel/iommu_groups/14/devices/0000:00:1c.2
/sys/kernel/iommu_groups/4/devices/0000:00:0e.0
/sys/kernel/iommu_groups/22/devices/0000:04:00.0
/sys/kernel/iommu_groups/12/devices/0000:00:1b.4
/sys/kernel/iommu_groups/2/devices/0000:00:06.0
/sys/kernel/iommu_groups/20/devices/0000:03:00.0
/sys/kernel/iommu_groups/10/devices/0000:00:1a.0
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/19/devices/0000:02:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:17.0

/etc/modprobe.d/vfio.conf
Code:
options vfio-pci ids=1002:1478,1002:1479,1002:73bf,1002:ab28 disable_vga=1

/etc/modprobe.d/blacklist.conf
Code:
blacklist radeon
blacklist nouveau
blacklist nvidia

/etc/pve/qemu-server/100.conf
Code:
balloon: 0
bios: ovmf
boot: order=virtio0;net0
cores: 24
cpu: host
efidisk0: Win:100/vm-100-disk-4.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: 0000:03:00,pcie=1,x-vga=1
machine: pc-q35-7.2
memory: 32768
meta: creation-qemu=6.2.0,ctime=1650915814
name: Win11
net0: e1000=72:43:99:C1:CD:C5,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsihw: virtio-scsi-pci
smbios1: uuid=13d78e22-9f38-41e4-8002-b9de98f6cd8c
sockets: 1
startup: order=1,up=10
tpmstate0: Win:100/vm-100-disk-2.raw,size=16896,version=v2.0
unused0: macOS:100/vm-100-disk-1.qcow2
unused1: macOS:100/vm-100-disk-0.qcow2
unused2: Win:100/vm-100-disk-0.qcow2
unused3: Win:100/vm-100-disk-3.qcow2
usb0: host=1-11.1,usb3=1
usb1: host=1-12,usb3=1
usb2: host=1-10,usb3=1
vga: none
virtio0: Win:100/vm-100-disk-1.qcow2,size=1000G
vmgenid: 0250c817-459d-4692-b4e8-0601540c6401

Any help would be much appreciated, and thanks in advance.
 
I discovered what the issue was. For those wondering, it was the host BIOS: the CMOS battery had died, so after the Proxmox update and a restart of the PC the BIOS reset to its defaults, which left Resizable BAR / Smart Access Memory enabled.
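For anyone who wants to check the Resizable BAR state from the host without rebooting into the BIOS, the region sizes of the card can be read from sysfs. This is a minimal sketch, assuming the GPU is still at 0000:03:00.0 as in the VM config above; bar_sizes is a helper name made up for this example. It relies only on the kernel's per-device "resource" file, where each line holds a start address, end address and flags in hex.

```shell
#!/bin/bash
# Print the size of every non-empty region in a PCI device's sysfs "resource" file.
# Each line of that file is: <start> <end> <flags>, all in hex.
# Regions smaller than 1 MiB are reported as 0 MiB because of integer division.
bar_sizes() {
    local file=$1 idx=0 start end flags
    while read -r start end flags; do
        # Unused regions are listed as all zeros, so skip them.
        if (( end > start )); then
            echo "region $idx: $(( (end - start + 1) / 1024 / 1024 )) MiB"
        fi
        idx=$((idx + 1))
    done < "$file"
}

# Assumed device address, taken from hostpci0 in the VM config above.
res=/sys/bus/pci/devices/0000:03:00.0/resource
if [ -r "$res" ]; then
    bar_sizes "$res"
fi
```

With Resizable BAR enabled, the largest region would be expected to be in the range of the card's full video memory (the 16 GiB mapping in the log would fit that picture), while with it disabled the same region is typically much smaller, often 256 MiB. The exact numbers depend on the card and firmware.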