Hello,
I am having problems with Proxmox 8.3 and the iGPU of my Lenovo M90q. After the update, a Debian VM that has the iGPU passed through to it keeps crashing.
The VM crashes every 24-48 hours, sometimes during the daily backup and sometimes while it is idle during the day or night.
How can I solve this?
host
Code:
root@kronos:~# pveversion -v
proxmox-ve: 8.3.0 (running kernel: 6.8.12-4-pve)
pve-manager: 8.3.0 (running version: 8.3.0/c1689ccb1065a83b)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-4
proxmox-kernel-6.8.12-4-pve-signed: 6.8.12-4
proxmox-kernel-6.8.12-3-pve-signed: 6.8.12-3
proxmox-kernel-6.8.12-2-pve-signed: 6.8.12-2
proxmox-kernel-6.8.12-1-pve-signed: 6.8.12-1
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.2.0
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.10
libpve-cluster-perl: 8.0.10
libpve-common-perl: 8.2.9
libpve-guest-common-perl: 5.1.6
libpve-http-server-perl: 5.1.2
libpve-network-perl: 0.10.0
libpve-rs-perl: 0.9.0
libpve-storage-perl: 8.2.9
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.5.0-1
proxmox-backup-client: 3.2.9-1
proxmox-backup-file-restore: 3.2.9-1
proxmox-firewall: 0.6.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.3.1
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.1
pve-cluster: 8.0.10
pve-container: 5.2.2
pve-docs: 8.3.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.1.0
pve-firmware: 3.14-1
pve-ha-manager: 4.0.6
pve-i18n: 3.3.1
pve-qemu-kvm: 9.0.2-4
pve-xtermjs: 5.3.0-3
qemu-server: 8.3.0
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.6-pve1
root@kronos:~# lspci -s 00:02.0 -k
00:02.0 VGA compatible controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 03)
DeviceName: Onboard - Video
Subsystem: Lenovo CometLake-S GT2 [UHD Graphics 630]
Kernel driver in use: vfio-pci
Kernel modules: i915
root@kronos:~# uname -a
Linux kronos 6.8.12-4-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-4 (2024-11-06T15:04Z) x86_64 GNU/Linux
root@kronos:~# dmesg | grep -e DMAR -e IOMMU
[ 0.011896] ACPI: DMAR 0x000000008FA22000 0000C8 (v01 LENOVO TC-M2W 000015E0 01000013)
[ 0.011944] ACPI: Reserving DMAR table memory at [mem 0x8fa22000-0x8fa220c7]
[ 0.112474] DMAR: IOMMU enabled
[ 0.325654] DMAR: Host address width 39
[ 0.325655] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.325667] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
[ 0.325671] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.325676] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[ 0.325678] DMAR: RMRR base: 0x000000901c9000 end: 0x00000090412fff
[ 0.325681] DMAR: RMRR base: 0x00000093000000 end: 0x0000009f7fffff
[ 0.325682] DMAR: RMRR base: 0x0000008f8ab000 end: 0x0000008f92afff
[ 0.325685] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.325687] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.325689] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.327981] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.628809] DMAR: No ATSR found
[ 0.628810] DMAR: No SATC found
[ 0.628811] DMAR: IOMMU feature fl1gp_support inconsistent
[ 0.628813] DMAR: IOMMU feature pgsel_inv inconsistent
[ 0.628814] DMAR: IOMMU feature nwfs inconsistent
[ 0.628815] DMAR: IOMMU feature pasid inconsistent
[ 0.628816] DMAR: IOMMU feature eafs inconsistent
[ 0.628817] DMAR: IOMMU feature prs inconsistent
[ 0.628818] DMAR: IOMMU feature nest inconsistent
[ 0.628819] DMAR: IOMMU feature mts inconsistent
[ 0.628820] DMAR: IOMMU feature sc_support inconsistent
[ 0.628821] DMAR: IOMMU feature dev_iotlb_support inconsistent
[ 0.628822] DMAR: dmar0: Using Queued invalidation
[ 0.628826] DMAR: dmar1: Using Queued invalidation
[ 0.629474] DMAR: Intel(R) Virtualization Technology for Directed I/O
root@kronos:~# cat /etc/modprobe.d/blacklist.conf
blacklist i915
root@kronos:~# dmesg | grep -i vfio
[ 2.106645] VFIO - User Level meta-driver version: 0.3
[ 2.132616] vfio-pci 0000:00:02.0: vgaarb: deactivate vga console
[ 2.132620] vfio-pci 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 2.132749] vfio_pci: add [8086:9bc8[ffffffff:ffffffff]] class 0x000000/00000000
[ 899.729744] vfio-pci 0000:00:02.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xcb43
[284391.236288] vfio-pci 0000:00:02.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xcb43
dmesg inside the VM
Code:
[ 1.557050] i915 0000:01:00.0: [drm] VT-d active for gfx access
[ 1.557087] i915 0000:01:00.0: [drm] Using Transparent Hugepages
[ 1.561356] i915 0000:01:00.0: BAR 6: can't assign [??? 0x00000000 flags 0x20000000] (bogus alignment)
[ 1.561359] i915 0000:01:00.0: [drm] Failed to find VBIOS tables (VBT)
[ 1.561368] i915 0000:01:00.0: [drm] *ERROR* DC state mismatch (0x0 -> 0x2)
[ 1.561653] i915 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 1.565210] i915 0000:01:00.0: firmware: direct-loading firmware i915/kbl_dmc_ver1_04.bin
[ 1.565579] i915 0000:01:00.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
[ 1.594712] sr 1:0:0:0: Attached scsi CD-ROM sr0
[ 2.746842] i915 0000:01:00.0: [drm] failed to retrieve link info, disabling eDP
[ 2.753984] [drm] Initialized i915 1.6.0 20201103 for 0000:01:00.0 on minor 0
[ 2.786415] i915 0000:01:00.0: [drm] Cannot find any crtc or sizes
[ 2.789858] Console: switching to colour dummy device 80x25
[ 2.789928] bochs-drm 0000:00:01.0: vgaarb: deactivate vga console
[ 2.790091] bochs-drm 0000:00:01.0: enabling device (0004 -> 0006)
[ 2.791083] [drm] Found bochs VGA, ID 0xb0c5.
[ 2.791084] [drm] Framebuffer size 16384 kB @ 0x80000000, mmio @ 0x8304b000.
[ 2.792028] [drm] Found EDID data blob.
[ 2.792154] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:01.0 on minor 1
lspci
Code:
olimpo@persefone:~$ sudo lspci -s 01:00.0 -k
01:00.0 VGA compatible controller: Intel Corporation CometLake-S GT2 [UHD Graphics 630] (rev 03)
Subsystem: Lenovo CometLake-S GT2 [UHD Graphics 630]
Kernel driver in use: i915
Kernel modules: i915
crash during the backup
Code:
INFO: Starting Backup of VM 254 (qemu)
INFO: Backup started at 2024-11-26 10:00:50
INFO: status = running
INFO: VM Name: persefone
INFO: include disk 'virtio0' 'local-btrfs:254/vm-254-disk-1.raw' 48G
INFO: include disk 'efidisk0' 'local-btrfs:254/vm-254-disk-0.raw' 528K
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/254/2024-11-26T09:00:50Z'
INFO: started backup task '9be2454c-f4a8-481b-88a0-045e835aa228'
INFO: resuming VM again
ERROR: VM 254 qmp command 'cont' failed - Resetting the Virtual Machine is required
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 254 failed - VM 254 qmp command 'cont' failed - Resetting the Virtual Machine is required
INFO: Failed at 2024-11-26 10:00:50
INFO: Starting Backup of VM 255 (qemu)
INFO: Backup started at 2024-11-26 10:00:50
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
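To see how often the backup-time failure happens, the task logs can be scanned for the failure line shown above. This is only a minimal sketch: the function name `failed_backups` and the regular expression are my own assumptions based on the log format in this post, not part of any Proxmox tool, and the sample text is a shortened copy of the log above.

```python
import re

# Pattern assumed from the log above:
# "ERROR: Backup of VM <id> failed - <reason>"
FAILED_RE = re.compile(r"^ERROR: Backup of VM (\d+) failed - (.+)$")


def failed_backups(log_text):
    """Return a list of (vmid, reason) tuples, one per failed VM backup."""
    failures = []
    for line in log_text.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            failures.append((int(match.group(1)), match.group(2)))
    return failures


# Shortened copy of the backup log from this post.
sample = """\
INFO: Starting Backup of VM 254 (qemu)
INFO: resuming VM again
ERROR: VM 254 qmp command 'cont' failed - Resetting the Virtual Machine is required
ERROR: Backup of VM 254 failed - VM 254 qmp command 'cont' failed - Resetting the Virtual Machine is required
INFO: Starting Backup of VM 255 (qemu)
"""

for vmid, reason in failed_backups(sample):
    print(f"VM {vmid}: {reason}")
```

Only the summary line ("Backup of VM ... failed") is matched, so the earlier `qmp command 'cont' failed` line for the same VM is not counted twice.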