GVT-g locks up virtual machine and Proxmox

kcrawford

Hello,

I have a Windows 10 virtual machine (a Blue Iris NVR) that uses mediated PCIe passthrough so that the guest can take advantage of the graphics card. The issue I am facing is that the VM locks up somewhere between 20 minutes and an hour after adding the PCI device. I have tried passing the device through with and without rom-bar enabled, as well as with and without Primary GPU set. The result is always the same: an unresponsive virtual machine, and a reboot of the hypervisor is required to start the VM again. Below is the VM configuration file:

Code:
# cat /etc/pve/qemu-server/119.conf
agent: 1
bios: ovmf
boot: order=scsi0;net0
cores: 4
cpu: host
efidisk0: ssd500:119/vm-119-disk-0.qcow2,size=128K
hostpci0: 0000:00:02.0,mdev=i915-GVTg_V5_4,pcie=1,rombar=0
ide0: none,media=cdrom
localtime: 0
machine: pc-q35-6.0
memory: 6144
name: bi2
net0: e1000=E2:EF:80:37:91:82,bridge=vmbr0,firewall=1
net1: e1000=12:2D:CF:69:FC:6E,bridge=vmbr1,firewall=1
numa: 0
ostype: win10
scsi0: ssd500:119/vm-119-disk-1.qcow2,cache=writeback,discard=on,size=80G
scsi1: /dev/disk/by-id/ata-ST8000VX0022-2EJ112_ZA1KM67N,backup=0,size=7814026584K
scsihw: virtio-scsi-pci
smbios1: uuid=64b1ba38-8d07-43b4-bca6-9113ce6337e3
sockets: 1
vmgenid: 8e2c0e58-2d76-49b2-ac00-b34884c76878
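
The variants I tried only differed in the hostpci0 line; roughly these (with "Primary GPU" in the GUI corresponding to the x-vga flag, as far as I can tell):

Code:
# rombar toggled on and off
hostpci0: 0000:00:02.0,mdev=i915-GVTg_V5_4,pcie=1,rombar=0
hostpci0: 0000:00:02.0,mdev=i915-GVTg_V5_4,pcie=1,rombar=1
# and with Primary GPU set (x-vga assumed to be what the GUI option writes)
hostpci0: 0000:00:02.0,mdev=i915-GVTg_V5_4,pcie=1,rombar=0,x-vga=1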

I am seeing the following errors in dmesg:

Code:
[ 1757.538527] gvt: guest page write error, gpa 1561adae8
[ 1757.539045] gvt: vgpu 1: fail: shadow page 0000000000000000 guest entry 0xffffffffffffffff type 9
[ 1757.539563] gvt: vgpu 1: fail: spt 000000004f922f3d guest entry 0xffffffffffffffff type 9
[ 1757.540080] gvt: vgpu 1: fail: shadow page 000000004f922f3d guest entry 0xffffffffffffffff type 9.
[ 1757.540652] gvt: guest page write error, gpa 1561adaf0
[ 1757.541180] gvt: vgpu 1: fail: shadow page 0000000000000000 guest entry 0xffffffffffffffff type 9
[ 1757.541704] gvt: vgpu 1: fail: spt 000000004f922f3d guest entry 0xffffffffffffffff type 9
[ 1757.542226] gvt: vgpu 1: fail: shadow page 000000004f922f3d guest entry 0xffffffffffffffff type 9.

As well as:

Code:
[ 2659.794244] task:kvm             state:D stack:    0 pid:30173 ppid:     1 flags:0x00000000
[ 2659.794247] Call Trace:
[ 2659.794249]  __schedule+0x2ca/0x880
[ 2659.794252]  schedule+0x4f/0xc0
[ 2659.794254]  schedule_preempt_disabled+0xe/0x10
[ 2659.794255]  __mutex_lock.constprop.0+0x309/0x4d0
[ 2659.794258]  ? handle_ept_misconfig+0x64/0x100 [kvm_intel]
[ 2659.794263]  __mutex_lock_slowpath+0x13/0x20
[ 2659.794264]  mutex_lock+0x34/0x40
[ 2659.794266]  intel_vgpu_emulate_mmio_read+0x51/0x3f0 [i915]
[ 2659.794315]  intel_vgpu_rw+0x1e4/0x220 [kvmgt]
[ 2659.794317]  intel_vgpu_read+0x14a/0x1f0 [kvmgt]
[ 2659.794318]  vfio_mdev_read+0x22/0x30 [vfio_mdev]
[ 2659.794320]  vfio_device_fops_read+0x26/0x30 [vfio]
[ 2659.794323]  vfs_read+0xb5/0x1c0
[ 2659.794326]  __x64_sys_pread64+0x93/0xc0
[ 2659.794327]  do_syscall_64+0x38/0x90
[ 2659.794329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2659.794331] RIP: 0033:0x7fe329c171a7
[ 2659.794332] RSP: 002b:00007fe31cbfa140 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[ 2659.794334] RAX: ffffffffffffffda RBX: 000055ec9a67e2a0 RCX: 00007fe329c171a7
[ 2659.794335] RDX: 0000000000000004 RSI: 00007fe31cbfa188 RDI: 000000000000001e
[ 2659.794335] RBP: 0000000000000004 R08: 0000000000000000 R09: 00000000ffffffff
[ 2659.794336] R10: 0000000000002024 R11: 0000000000000293 R12: 0000000000002024
[ 2659.794337] R13: 000055ec9a67e1b0 R14: 0000000000000004 R15: 0000000000002024
[ 2766.095028] usb 1-11: usbfs: USBDEVFS_CONTROL failed cmd usbhid-ups rqt 161 rq 1 len 2 ret -71
[ 2780.622751] INFO: task kvm:30173 blocked for more than 966 seconds.
[ 2780.623360]       Tainted: P           O      5.11.22-2-pve #1

I have the following grub configuration:

Code:
# grep i915 /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on i915.enable_gvt=1"
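
With that in place, GVT-g should expose its mediated device types under the iGPU's sysfs node; a quick sanity check (the 0000:00:02.0 address is the same one used in hostpci0 above):

Code:
# list the mediated device types GVT-g provides for the iGPU
ls /sys/bus/pci/devices/0000:00:02.0/mdev_supported_types
# should include the i915-GVTg_V5_4 type used in the VM config
dmesg | grep -i gvt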

And these modules loaded:

Code:
# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.

# Module for wireguard
wireguard

# Modules for IOMMU
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
kvmgt
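
After a reboot, something like the following should confirm they actually loaded:

Code:
# kvmgt pulls in the vfio/mdev pieces it depends on
lsmod | grep -E 'kvmgt|vfio'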

My kernel is:

Code:
# uname -a
Linux bigbear 5.11.22-2-pve #1 SMP PVE 5.11.22-3 (Sun, 11 Jul 2021 13:45:15 +0200) x86_64 GNU/Linux

And pveversion:

Code:
# pveversion -v
proxmox-ve: 7.0-2 (running kernel: 5.11.22-2-pve)
pve-manager: 7.0-10 (running version: 7.0-10/d2f465d3)
pve-kernel-5.11: 7.0-5
pve-kernel-helper: 7.0-5
pve-kernel-5.4: 6.4-4
pve-kernel-5.3: 6.1-6
pve-kernel-5.11.22-2-pve: 5.11.22-3
pve-kernel-5.11.22-1-pve: 5.11.22-2
pve-kernel-5.4.124-1-pve: 5.4.124-1
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 14.2.21-1
corosync: 3.1.2-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx2
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.21-pve1
libproxmox-acme-perl: 1.1.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.0-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-5
libpve-guest-common-perl: 4.0-2
libpve-http-server-perl: 4.0-2
libpve-storage-perl: 7.0-9
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-2
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.0.4-1
proxmox-backup-file-restore: 2.0.4-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.3-4
pve-cluster: 7.0-3
pve-container: 4.0-8
pve-docs: 7.0-5
pve-edk2-firmware: 3.20200531-1
pve-firewall: 4.2-2
pve-firmware: 3.2-4
pve-ha-manager: 3.3-1
pve-i18n: 2.4-1
pve-qemu-kvm: 6.0.0-2
pve-xtermjs: 4.12.0-1
qemu-server: 7.0-10
smartmontools: 7.2-pve2
spiceterm: 3.2-2
vncterm: 1.7-1
zfsutils-linux: 2.0.5-pve1
 
After changing from mediated to full PCI device passthrough, the VM is no longer crashing. It seems like there is an issue with using mediated devices? Any guidance on getting mediated devices working would be appreciated.
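
For anyone hitting the same thing, the only change was the hostpci0 line, along these lines (full device passthrough instead of an mdev instance, so the VM owns the whole iGPU):

Code:
# before (mediated, locks up):
# hostpci0: 0000:00:02.0,mdev=i915-GVTg_V5_4,pcie=1,rombar=0
# after (full PCI passthrough, stable so far):
hostpci0: 0000:00:02.0,pcie=1,rombar=0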
 
Hi kcrawford,

I can't help, but I hope you can assist me.

I'm on my second attempt at getting GVT-g working, and so far I've never managed to boot a VM. Could you share your config?

The guide I was following did say the author had abandoned the effort due to instability issues.

Thanks
Derek
 
In GRUB_CMDLINE_LINUX_DEFAULT add: i915.enable_gvt=1 (and make sure i915.enable_guc=0 is also set).
In /etc/modules add: kvmgt

Then run "update-grub" and "update-initramfs -u -k all", and reboot.

You should now be able to add a mediated GPU from the VM hardware menu.
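
The end result should look roughly like this (the rest of the cmdline will differ depending on what else you already have set):

Code:
# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on i915.enable_gvt=1 i915.enable_guc=0"

# /etc/modules
kvmgt

# apply the changes and reboot
update-grub
update-initramfs -u -k all
reboot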
 
Windows 10 has always been a nightmare for me with Intel GVT, but I've seen much better stability with pve-edge-kernels and Windows Server 2019.
 
