GVT-g crash installing drivers in Windows 10 VM

drbob

Member
Jul 29, 2020
Hi,

I've been trying to get GPU virtualization working with a Windows 10 guest in Proxmox 6.2. My testbed is a Dell PowerEdge T30 server with a Xeon E3-1225 v5 Skylake CPU.

I followed the instructions in the documentation and got as far as a Windows 10 VM with the virtualised GPU visible in Device Manager, but no drivers installed.
1596503514600.png

When I try to install the Intel drivers in the guest, Windows blue-screens with a VIDEO TDR FAILURE error in igdkmd64.sys. The only way I've found so far to recover is to restore the VM to a snapshot taken before I attempted to install the driver.

1596505043733.png

Is anyone else here experiencing this issue/got any tips? My googling has come up short so far...
 
Looking in dmesg, I get the following errors when Windows crashes on driver install:
Code:
[ 1121.711044] gvt: len is not valid:  len=195  valid_len=3
[ 1121.711048] gvt: vgpu 1: MI_LOAD_REGISTER_IMM handler error
[ 1121.711049] gvt: vgpu 1: cmd parser error
[ 1121.711050] 0x0
[ 1121.711050] 0x22

[ 1121.711053] gvt: vgpu 1: scan wa ctx error
[ 1121.711058] GVT Internal error  for the guest
[ 1121.711059] Now vgpu 1 will enter failsafe mode.
[ 1121.711061] gvt: vgpu 1: failed to submit desc 0
[ 1121.711062] gvt: vgpu 1: fail submit workload on ring 0
[ 1121.711064] gvt: vgpu 1: fail to emulate MMIO write 00002230 len 4

Which led me to this bug (#130) in the gvt-g GitHub issue tracker. Apparently it's been fixed with this patch in kernel 5.7.

Might Proxmox consider backporting the patch, as gvt-g is currently unusable in Windows VMs?
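
For anyone comparing their own host against the log quoted above, the kernel log can be filtered for the same GVT-g messages. This is only a minimal sketch: `gvt_errors` is a hypothetical helper name, and the pattern is built from the lines in this post rather than from any complete list of GVT-g error strings.

```shell
# Filter kernel log text (read from stdin) for the GVT-g failure messages
# quoted in this thread. gvt_errors is a hypothetical helper name.
gvt_errors() {
    grep -E 'gvt: vgpu [0-9]+: |GVT Internal error|enter failsafe mode'
}

# Typical use on the host (reading dmesg usually needs root):
#   dmesg | gvt_errors
```

If this prints nothing after a guest crash, the blue screen probably has a different cause from the one discussed here.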
 
I believe that this is the same bug as this: https://forum.proxmox.com/threads/p...ugh-crashes-after-nvidia-driver-update.70531/

I've been having the same issue since May. If anyone is aware of a semi-easy fix, please post. As far as I'm aware, the only way to get it to work is adding the VM default display driver back, unticking primary display, and then going into safe mode to edit the MSI flag for said GPU. It's just a bit too much effort, so I've been sitting on the same driver since I updated in May.
 
@ferociousmilkyway whilst the Windows bluescreen message may be similar, I don't think it's the same issue. The problem I'm having is specifically due to a bug in the host Intel GVT-g driver.

GVT-g is a feature that virtualizes (rather than passes through) Intel GPUs so multiple VMs and the host can simultaneously share a GPU, in a similar fashion to how they share the host CPU. Sidenote: Nvidia and AMD chips can also be virtualised, but they artificially restrict the feature to their "Professional" branded cards (Tesla and Radeon Pro respectively).

I temporarily fixed my problem by explicitly setting grub to load kernel 5.3 as described in the top answer here.

I couldn't use the simpler solutions. Uninstalling kernel 5.4 wasn't an option (Proxmox 6.2 is set to depend on 5.4, so I would have uninstalled Proxmox as well), and I couldn't set GRUB to save the last-used kernel, because my default Proxmox installation has put /boot on LVM and it seems GRUB can read from, but not write to, LVM volumes.

I also filed a bug on the Proxmox bug tracker (2919) and wrote to the author of the patch for kernel 5.7. They have now nominated the patch for the 5.4 branch, so hopefully the fix will be in the LTS kernel soon.
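
Pinning an older kernel in GRUB, as described above, needs the exact menu titles from the generated config. The sketch below lists them; `grub_titles` is a hypothetical helper name, and it assumes the single-quoted `menuentry`/`submenu` lines that Debian's grub-mkconfig normally writes to /boot/grub/grub.cfg.

```shell
# Print submenu and menuentry titles from a grub.cfg so they can be copied
# into GRUB_DEFAULT. grub_titles is a hypothetical helper name; the default
# path is the usual Debian location and may differ on other setups.
grub_titles() {
    grep -E "^[[:space:]]*(submenu|menuentry) '" "${1:-/boot/grub/grub.cfg}" | cut -d "'" -f 2
}

# Typical use on the host:
#   grub_titles
# Then, in /etc/default/grub, join the submenu title and the 5.3 entry
# title with '>' (both titles are placeholders here):
#   GRUB_DEFAULT="SUBMENU_TITLE>ENTRY_TITLE"
# and regenerate the config:
#   update-grub
```

Using a fixed title rather than `GRUB_DEFAULT=saved` avoids the need for GRUB to write to /boot, which is the limitation mentioned above.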
 
+1 here. Any ideas when this will be fixed? I can't boot the machine with it enabled.
I don't seem to be able to edit my previous post, but I'm happy to compile my own kernel if someone can point me at patched source code that will work with Proxmox.
 
The patch from the mainline kernel was ported over in LTS kernel 5.4.66. As soon as Proxmox 6.2 moves onto that kernel or above, this issue with gvt-g should be fixed. Edit: Proxmox seems to update the kernel to a newer point release on an approximately monthly basis (for the non-subscription repo at least). The last update (to LTS kernel 5.4.65) was on 21st September, so I'd expect the patch to arrive in an update around the third week of October.

You can view the patch here.
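
A quick way to check a host against the 5.4.66 threshold mentioned above is to compare the running kernel release with it. This is a sketch only: `has_gvt_fix` is a hypothetical helper name, it assumes the Proxmox kernel version number tracks the upstream LTS point release, and it relies on GNU `sort -V`. It checks nothing but the version number, so a 5.3 kernel (which reportedly did not show the problem) is still reported as lacking the fix.

```shell
# Succeed if a kernel release is at or above 5.4.66, the first LTS release
# said in this thread to carry the patch. has_gvt_fix is a hypothetical
# helper name; with no argument it checks the running kernel.
has_gvt_fix() {
    release="${1:-$(uname -r)}"
    version="${release%%-*}"    # "5.4.65-1-pve" -> "5.4.65"
    lowest=$(printf '%s\n' "5.4.66" "$version" | sort -V | head -n 1)
    [ "$lowest" = "5.4.66" ]
}

# Typical use on the host:
#   if has_gvt_fix; then echo "kernel should include the fix"; fi
```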
 
Wow, thanks for the update and predicted ETA. I didn't expect more than a standard reply but have been watching the bug reports closely because I want this fixed to work on a project that I'm doing.

Thanks
Squeeky
 
Wow really taking their time on this one lol. I've been checking nearly every day for a new kernel.

Is this broken on Windows 7 also?
 
I've also been checking updates every day ;)

Really hope it's soon. In the interim I've been passing through an NVIDIA GPU to my Windows VM, but I'd really prefer to use Quick Sync via GVT-g for power savings.
 
Hi, I have gvt-g (HD 620) working fine in v6.2 with Win10 guests: one q35 SeaBIOS and another q35 OVMF. Did nothing special to get that working.

Will check later on the config used...
 
Would you mind doing a uname -a to see exactly which kernel you have, please?
 
Have 2 hosts, both with win10 guest with gvt-g enabled.

intel driver v27.20.100.8681

host 1:
GRUB:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on pci=nomsi i915.enable_gvt=1"

Code:
root@pve:~# uname -a
Linux pve 5.4.65-1-pve #1 SMP PVE 5.4.65-1 (Mon, 21 Sep 2020 15:40:22 +0200) x86_64 GNU/Linux

modules:
Code:
kvmgt
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
vfio-mdev

guest:
Code:
agent: 1
bios: ovmf
bootdisk: scsi0
cores: 2
cpu: host
efidisk0: ZFSdata:vm-102-disk-1,size=1M
hostpci0: 00:02.0,mdev=i915-GVTg_V5_8,pcie=1
ide0: data:iso/virtio-win-0.1.173.iso,media=cdrom,size=385062K
ide2: none,media=cdrom
machine: q35
memory: 4096
name: T-iGPU-Win10-EFI
net0: virtio=2A:02:F9:3C:F6:89,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: ZFSdata:vm-102-disk-0,discard=on,iothread=1,size=64G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=acb7827d-17c6-41a0-a57c-ecbf0b308f78
sockets: 1
vga: none
vmgenid: 2ffcdd09-32af-4585-a6a5-a536c10ba6e1

host 2:
systemd:
root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on i915.enable_gvt=1 acpi_enforce_resources=lax

Code:
root@pve2:~# uname -a
Linux pve2 5.4.65-1-pve #1 SMP PVE 5.4.65-1 (Mon, 21 Sep 2020 15:40:22 +0200) x86_64 GNU/Linux

modules:
Code:
kvmgt
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
vfio-mdev

guest1:
Code:
balloon: 2048
bios: ovmf
bootdisk: scsi0
cores: 2
cpu: host,hidden=1,flags=+pcid
hostpci0: 00:02.0,mdev=i915-GVTg_V5_2,pcie=1
efidisk0: local-zfs:vm-198-disk-1,size=1M
machine: q35
memory: 4096
name: T-igpu-win10-uefi
net0: virtio=86:3B:E8:C6:3B:1E,bridge=vmbr0,firewall=1
numa: 1
ostype: win10
scsi0: local-zfs:vm-198-disk-0,discard=on,iothread=1,size=32G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=02caef27-dae4-431c-9a58-d189e0c91b19
sockets: 1
tablet: 0
vga: none
vmgenid: 1af4ec83-6265-46c4-a1fe-3ce17ee11af2

guest 2:
Code:
balloon: 2048
bootdisk: scsi0
cores: 2
cpu: host
hostpci0: 00:02.0,mdev=i915-GVTg_V5_2,pcie=1
machine: q35
memory: 4096
name: T-igpu-win10-bios
net0: virtio=FA:2B:E3:B3:CB:6A,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: local-zfs:vm-199-disk-0,discard=on,iothread=1,size=32G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=ed0c4d0e-f3f6-4195-8e35-fab8bce39bde
sockets: 1
tablet: 0
vga: none
vmgenid: 5d6c2fd2-fae6-4975-8af4-5830e20a63f9
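
The `mdev=` values in the configs above (for example `i915-GVTg_V5_2`) have to match a type the host actually offers. A minimal sketch for listing them from sysfs follows; `list_mdev_types` is a hypothetical helper name, and the default path assumes the iGPU sits at PCI address 00:02.0 as in the `hostpci0` lines above.

```shell
# Print each vGPU (mdev) type offered by the iGPU together with the number
# of instances still available. list_mdev_types is a hypothetical helper
# name; pass a different directory if the GPU is not at 0000:00:02.0.
list_mdev_types() {
    base="${1:-/sys/bus/pci/devices/0000:00:02.0/mdev_supported_types}"
    for dir in "$base"/*/; do
        # Skip anything that is not a readable type directory.
        [ -r "${dir}available_instances" ] || continue
        printf '%s %s\n' "$(basename "$dir")" "$(cat "${dir}available_instances")"
    done
}

# Typical use on the host:
#   list_mdev_types
```

If the directory is missing or the function prints nothing, GVT-g is probably not enabled on the host (kernel parameter or modules), which is worth ruling out before looking at the guest.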
 
Thanks

I was missing vfio-mdev, which I didn't see documented anywhere.

I also have iommu=pt and pci=noaer set.

Will do some testing
 
Cloned your config and still got the error. The only thing that is possibly different is that we have different-era CPUs. I'm on a 6th-gen i5...
 
I think it has something to do with a driver conflict. The Intel driver cannot start because another driver is conflicting.

During installation of Win10, which Proxmox display setting did you use? Default / std VGA / SPICE / etc.?
 
Yeah, I used default. How else can you do it? Otherwise you can't see the screen. Pretty sure it's still a kernel issue.

I created this link to track changes to the proxmox repo. It alerts every 24 hours about changes. Might be useful for anyone keeping an eye on the problem.
 
hmm, strange.

just tested 2 fresh win10 installs. both work fine....

1. ovmf, q35, spice (qxl)
installed win10 and no gvt-g device, added spice/qxl driver via virtio-win.xxx.iso
after install of win10, shutdown, add gvt-g device (minimal vX_4), keep SPICE enabled.
boot, install intel igpu driver via windows update.
reboot, dual display adapter enabled and works fine.

2. seabios, i440fx, standard vga
installed win10 with no gvt-g device.
after install of win10, shutdown, add gvt-g device (minimal vX_4), keep vga enabled.
boot, install intel igpu driver via windows update.
reboot, dual display adapter enabled and works fine.
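
The "shutdown, add gvt-g device" step in both procedures above can also be done from the host shell with `qm set`. The sketch below only prints the command so it can be reviewed first; `gvtg_attach_cmd` is a hypothetical helper name, the PCI address 00:02.0 is taken from the configs earlier in the thread, and the VM ID and type name are whatever applies to your host.

```shell
# Print the qm command that would attach a GVT-g vGPU of the given mdev
# type to a VM. gvtg_attach_cmd is a hypothetical helper name; it does not
# change anything by itself.
gvtg_attach_cmd() {
    vmid="$1"
    mdev_type="$2"
    printf 'qm set %s --hostpci0 00:02.0,mdev=%s\n' "$vmid" "$mdev_type"
}

# Example (VM ID and type are illustrative):
#   gvtg_attach_cmd 102 i915-GVTg_V5_4
# Review the printed command, then run it on the host while the VM is off.
```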
 
For what it's worth, I recently gave it a try with my GeForce GTX 1650 also passed through (KVM VGA set to 'none' since I used Windows Remote Desktop). As soon as I install the Intel graphics driver, I get a BSOD. Same thing happens without the NVIDIA GPU passed through.

I'm on the latest Windows 10 Pro stable build and have a Xeon E3-1245 v5 (equivalent to i7-6700).
 