Updated to PVE 9.0, then GPU passthrough stopped working on Windows

jasonwch

Previously, Intel UHD 630 GPU passthrough was working fine on PVE 8 (for both Ubuntu and Windows 11 guests).

I just updated to 9.0, and Windows Device Manager now shows the device with error code 43. If I pass the GPU through to an Ubuntu VM instead, it works fine.

I checked the steps here and verified that the GPU should be passed through successfully (the Linux guest can also pick it up):
https://pve.proxmox.com/wiki/PCI(e)_Passthrough
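For reference, the host-side checks I ran roughly follow that wiki page; a minimal sketch (the exact grep patterns are just what I used, not taken verbatim from the wiki):

Code:
# Verify that the IOMMU is enabled
dmesg | grep -e DMAR -e IOMMU
# List IOMMU groups to confirm 00:02.0 sits in its own group
find /sys/kernel/iommu_groups/ -type l
# Confirm the GPU is bound to vfio-pci while the VM is running
lspci -nnk -s 00:02.0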

Thanks
 
The passthrough was working fine until I updated to PVE 9 just now.

Or do you want me to try reverting to the old kernel? If so, may I know how?
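In case reverting the kernel is the suggestion: as far as I understand, an older kernel that is still installed can be pinned with proxmox-boot-tool; a rough sketch (the version string is only a placeholder, use whatever "kernel list" shows):

Code:
# Show installed, bootable kernels
proxmox-boot-tool kernel list
# Pin one of them as the default boot entry (placeholder version)
proxmox-boot-tool kernel pin 6.8.12-9-pve
# Remove the pin again later
proxmox-boot-tool kernel unpin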
 
I have confirmed that the iGPU of N100 does not work with Proxmox VE 9.

It works with Proxmox VE 8.4.9 (kernel 6.14.8-2), but it does not work with Proxmox VE 9.0.3 (kernel 6.14.8-2).

pveversion
pve-manager/9.0.3/025864202ebb6109 (running kernel: 6.14.8-2-pve)
 
I have confirmed that the iGPU of N100 does not work with Proxmox VE 9.

It works with Proxmox VE 8.4.9 (kernel 6.14.8-2), but it does not work with Proxmox VE 9.0.3 (kernel 6.14.8-2).
So we can just wait for them to release a fix?

Thanks
 
It works with Proxmox VE 8.4.9 (kernel 6.14.8-2), but it does not work with Proxmox VE 9.0.3 (kernel 6.14.8-2).

pveversion
pve-manager/9.0.3/025864202ebb6109 (running kernel: 6.14.8-2-pve)
Thanks! Could you provide further information about the GPU passthrough (VM config, lspci -nnk, cat /proc/cmdline, any errors in the boot log / syslog at VM startup, etc.)?
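For reference, that information can be collected on the host with something like the following (VMID 100 is a placeholder):

Code:
qm config 100                              # VM configuration
lspci -nnk                                 # PCI devices and the drivers bound to them
cat /proc/cmdline                          # kernel command line
journalctl -b | grep -i -e vfio -e dmar    # host messages related to IOMMU/VFIO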
 
We are reverting to Proxmox VE 8.4.9 (kernel 6.14.8-2) to see if it works again.

We will install Proxmox VE 9.0.3 (kernel 6.14.8-2) from the ISO with the same settings as yours and log the results.

This is a test machine for checking the installation procedure, so there is no problem with deleting the data.
 
Thanks! Could you provide further information about the GPU passthrough (VM config, lspci -nnk, cat /proc/cmdline, any errors in the boot log / syslog at VM startup, etc.)?
Code:
root@pve:~# lspci -nnk
00:00.0 Host bridge [0600]: Intel Corporation Comet Lake-S 6c Host Bridge/DRAM Controller [8086:9b53] (rev 03)
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: skl_uncore
00:02.0 VGA compatible controller [0300]: Intel Corporation CometLake-S GT2 [UHD Graphics 630] [8086:9bc8] (rev 03)
        DeviceName: Onboard - Video
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: vfio-pci
        Kernel modules: i915
00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
00:12.0 Signal processing controller [1180]: Intel Corporation Comet Lake PCH Thermal Controller [8086:06f9]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: intel_pch_thermal
        Kernel modules: intel_pch_thermal
00:14.0 USB controller [0c03]: Intel Corporation Comet Lake USB 3.1 xHCI Host Controller [8086:06ed]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
00:14.2 RAM memory [0500]: Intel Corporation Comet Lake PCH Shared SRAM [8086:06ef]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
00:14.3 Network controller [0280]: Intel Corporation Comet Lake PCH CNVi WiFi [8086:06f0]
        DeviceName: Onboard - Ethernet
        Subsystem: Intel Corporation Device [8086:4070]
        Kernel driver in use: iwlwifi
        Kernel modules: iwlwifi
00:15.0 Serial bus controller [0c80]: Intel Corporation Comet Lake PCH Serial IO I2C Controller #0 [8086:06e8]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: intel-lpss
        Kernel modules: intel_lpss_pci
00:16.0 Communication controller [0780]: Intel Corporation Comet Lake HECI Controller [8086:06e0]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: mei_me
        Kernel modules: mei_me
00:16.3 Serial controller [0700]: Intel Corporation Comet Lake Keyboard and Text (KT) Redirection [8086:06e3]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: serial
00:17.0 SATA controller [0106]: Intel Corporation Comet Lake SATA AHCI Controller [8086:06d2]
        DeviceName: Onboard - SATA
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: ahci
        Kernel modules: ahci
00:1b.0 PCI bridge [0604]: Intel Corporation Comet Lake PCI Express Root Port #17 [8086:06c0] (rev f0)
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation Q470 Chipset LPC/eSPI Controller [8086:0687]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
00:1f.3 Audio device [0403]: Intel Corporation Comet Lake PCH cAVS [8086:06c8]
        DeviceName: Onboard - Sound
        Subsystem: Dell Device [1028:09a6]
        Kernel modules: snd_hda_intel, snd_soc_avs, snd_sof_pci_intel_cnl
00:1f.4 SMBus [0c05]: Intel Corporation Comet Lake PCH SMBus Controller [8086:06a3]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: i801_smbus
        Kernel modules: i2c_i801
00:1f.5 Serial bus controller [0c80]: Intel Corporation Comet Lake PCH SPI Controller [8086:06a4]
        DeviceName: Onboard - Other
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: intel-spi
        Kernel modules: spi_intel_pci
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (11) I219-LM [8086:0d4c]
        DeviceName: Onboard - Ethernet
        Subsystem: Dell Device [1028:09a6]
        Kernel driver in use: e1000e
        Kernel modules: e1000e
01:00.0 Non-Volatile memory controller [0108]: Sandisk Corp IX SN530 NVMe SSD (DRAM-less) [15b7:5007] (rev 01)
        Subsystem: Sandisk Corp IX SN530 NVMe SSD (DRAM-less) [15b7:5007]
        Kernel driver in use: nvme
        Kernel modules: nvme
root@pve:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.14.8-2-pve root=/dev/mapper/pve-root ro quiet intel_idle.max_cstate=1 processor.max_cstate=5 intel_iommu=on iommu=pt initcall_blacklist=sysfb_init

VM config
Code:
agent: 1
args: -cpu host,-hypervisor,+vmx
audio0: device=ich9-intel-hda,driver=none
bios: ovmf
boot: order=scsi0;net0;ide2
cores: 10
cpu: host
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:00:02,pcie=1,x-vga=1
ide2: none,media=cdrom
machine: pc-q35-9.2+pve1
memory: 16384
meta: creation-qemu=9.0.2,ctime=1739517307
name: JWPMWIN11
net0: virtio=BC:24:11:CA:6C:80,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsi0: local-lvm:vm-100-disk-1,aio=threads,cache=writeback,discard=on,iothread=1,size=60G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=320a01c6-b5aa-409b-9d39-e1be2dfa31fa
sockets: 1
tpmstate0: local-lvm:vm-100-disk-2,size=4M,version=v2.0
vga: none
vmgenid:

When I start the VM, I don't see any errors. Where can I collect the logs?
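(The host-side places I can check are roughly the following; please let me know if a specific log is needed:)

Code:
# Follow the journal in one shell while starting the VM in another
journalctl -f
# Afterwards, filter the current boot for VFIO / i915 / QEMU related lines
journalctl -b | grep -i -e vfio -e i915 -e qemu
# Kernel messages mentioning the GPU's PCI address
dmesg | grep 00:02.0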
 
It is running on Proxmox VE 8.4.9 with the following settings.

I don't want to blacklist the i915 driver, so I am applying the settings below instead.

I reinstalled and confirmed that it works on Proxmox VE 8.4.9 with these settings.

sed -i '/GRUB_CMDLINE_LINUX_DEFAULT=/c GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on iommu=pt"' /etc/default/grub

mv gen12_igd.rom /usr/share/kvm/
mv gen12_gop.rom /usr/share/kvm/
cat << EOF > /etc/modules
vfio
vfio_iommu_type1
vfio_pci
EOF
qm set 629 -args '-set device.hostpci0.addr=02.0 -set device.hostpci0.x-igd-gms=0x0 -set device.hostpci0.x-igd-opregion=on'
qm set 629 -hostpci0 0000:00:02.0,romfile=gen12_igd.rom
qm set 629 -hostpci1 0000:00:1f.3,romfile=gen12_gop.rom
nano /var/lib/vz/snippets/intel_igpu_reset.sh
---
#!/bin/bash
# Proxmox hookscript: $1 is the VMID, $2 is the phase
phase="$2"
echo "Phase is $phase"
if [ "$phase" == "pre-start" ]; then
    # Unbind the GPU from i915 so vfio-pci can claim it
    echo "0000:00:02.0" > /sys/bus/pci/drivers/i915/unbind 2>/dev/null
    sleep 5
elif [ "$phase" == "post-stop" ]; then
    # Unbind the GPU from vfio-pci ...
    sleep 5
    echo "0000:00:02.0" > /sys/bus/pci/drivers/vfio-pci/unbind 2>/dev/null
    sleep 2
    # ... and hand it back to i915
    echo "0000:00:02.0" > /sys/bus/pci/drivers/i915/bind 2>/dev/null
    sleep 2
fi
---
chmod +x /var/lib/vz/snippets/intel_igpu_reset.sh

qm set 629 -hookscript local:snippets/intel_igpu_reset.sh

update-initramfs -u -k all
proxmox-boot-tool refresh
update-grub
reboot
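After the reboot, a quick way to sanity-check that the hookscript rebinds the device as intended (not part of the original procedure, just a check):

Code:
# With the VM running this should show "Kernel driver in use: vfio-pci";
# after the VM is stopped and post-stop has run, it should show i915 again.
lspci -nnk -s 00:02.0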

Next, we will try the same settings on Proxmox VE 9.0.3.
 

Attachments

I created it using the same procedure and confirmed that it still resulted in Code 43.

There does not appear to be any error, but the Code 43 occurs only on PVE 9 and not on PVE 8 with exactly the same settings.
 

Attachments

So what is the current workaround? Roll back PVE, or use a specific Intel driver?
I am writing this comment because this thread was referenced in a post about issues in different environments (issues related to using external modules).

The issues in this thread need to be investigated by Proxmox, and we are currently waiting for their response.

It will not be resolved immediately (and it is unclear whether it will be resolved at all), so if you want to use it, you will need to revert to Proxmox VE 8.


<https://wiki.qemu.org/ChangeLog/10.1>

VFIO

  • Updated IGD passthrough documentation
  • Fixed L2 crash on pseries machines
  • Added automatic enablement of OpRegion for IGD passthrough
  • Fixed OpRegion detection in IGD passthrough
  • Added support to report vfio-ap configuration changes
  • Added support for vfio-user client device
  • Added live update (CPR) support
  • Added support for VFIO migration with multifd on aarch64
  • Introduced a property to override a device PCI class code
  • Support for VFIO on TDX and SNP virtual machines.
The kernel is the same, and it is unclear whether the driver has changed, but qemu-server has changed. Is it a QEMU issue?
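One way to see which of the involved packages actually differ between the two installs is to dump the versions on both hosts and compare; a sketch:

Code:
# Package versions of the virtualization stack (run on PVE 8 and PVE 9, then diff)
pveversion -v | grep -E 'qemu|kernel'
# QEMU build shipped with the host
/usr/bin/qemu-system-x86_64 --version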
 
It will not be resolved immediately (and it is unclear whether it will be resolved at all), so if you want to use it, you will need to revert to Proxmox VE 8.
Thanks. Is there any easy way to revert to 8.x? Can I still use Trixie as the base, or do I need to reinstall the whole server?

Thanks
 
A clean install of Proxmox VE 8 is required. (If you have a clone to restore from, that's fine.)
Set up Proxmox VE 8 with the information you recorded before the update.

If the virtual disks of your VMs are on the same storage as the boot disk, Proxmox VE must be installed on separate storage, otherwise the reinstall will destroy them.

Compatibility issues are always a risk with migrations.
I am keeping my main environment on 8.4.9 and updating only my migration test environment to track down the problem.
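If you do go the clean-install route, it may be worth saving the guest and its config to storage that survives the reinstall first; a minimal sketch (the storage name "backup" and VMID 100 are placeholders):

Code:
# Full backup of the VM
vzdump 100 --storage backup --mode stop --compress zstd
# Keep a copy of the raw VM config as well
cp /etc/pve/qemu-server/100.conf /root/100.conf.bak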
 
A clean install of Proxmox VE 8 is required. (If you have a clone to restore from, that's fine.)
Unfortunately, I have only one machine for the lab.
 
After upgrading to PVE 9, I see the same error on my machine. I'm also trying to pass through an Intel UHD 630 to Windows 11, which worked fine with PVE 8.
 
The kernel is the same, and it is unclear whether the driver has changed, but qemu-server has changed. Is it a QEMU issue?
Thank you both for the thorough reports!

I investigated a bit, and as already noted, the kernel images should be the same AFAICT: even though they have different build times, the versions indicate the same upstream base kernel and patches applied. The only thing that stood out is that there are slight differences in how the BIOS exposes and initializes the memory regions in the boot logs of @uzumo.
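(For comparison, the iGPU's memory regions as the host sees them can be dumped on both installs with something like the following; this is only meant as a rough diff aid:)

Code:
# BARs / region sizes of the iGPU
lspci -vv -s 00:02.0 | grep -i -e region -e memory
# Boot-time kernel messages mentioning the device
dmesg | grep -i '0000:00:02.0'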

@uzumo Could you provide the full VM config and the Windows version of the VM? It would also be helpful to get the event log of the Windows machine, which might give more insight into what lies behind the Error 43 inside the guest.