Urgent Help, Windows 10 Crash!!! BSOD VIDEO_DXGKRNL_FATAL_ERROR

Aug 7, 2023
Hi Folks,

I am getting a VIDEO_DXGKRNL_FATAL_ERROR BSOD when I run a load test such as PassMark.
It is random, but about 90% of the time the error happens within the first test.

Note: Strangely, if I switch the CPU type to kvm64/qemu64, the crash does not happen.


My Hardware/Software

AMD 5900X
32GB DDR4 RAM @ 3200MHZ (XMP)
1TB SSD
2 X NVIDIA 1080
X570 GIGABYTE AORUS PRO (LATEST BIOS 2023)

Bios:

SR-IOV - ENABLED
4G DECODING - DISABLED
CSM - ENABLED

Host:
ZFS (with ARC memory limited to 2 GB)
5.15.108-1-pve
Proxmox 7.4-16


VM:
Windows 10 Pro (fully updated)


Here is my Config

agent: 1,type=virtio
balloon: 0
bios: ovmf
boot: order=sata0
cores: 8
cpu: host,hidden=1
cpuunits: 10000
efidisk0: local-zfs:vm-100-disk-0,size=1M
hostpci0: 0000:09:00,pcie=1
machine: pc-q35-7.2
memory: 14048
name: ogn1
net0: rtl8139=AA:DD:33:51:FF:52,bridge=vmbr0
numa: 0
ostype: win10
sata0: local-zfs:vm-100-disk-1,aio=threads,cache=writeback,discard=on,size=100G,snapshot=0,ssd=1
sockets: 1
vga: none
vmgenid: f345cc23-e6ad-486e-8783-c909092232c1


Things I have tried:

1. Fresh Windows install
2. DDU cleanup in safe mode + Studio driver
3. MSI interrupts
4. Moved to SATA/Realtek (rtl8139) instead of VirtIO drivers
5. kvm=off and -hypervisor on the host


Nothing seems to work at all.


Please help. I need to run with the host CPU type to get the performance I need.
 
Hello,


Can you check whether a newer version of the GPU driver is available on the Windows side?
 
Could you upload the output of journalctl -b > journal.txt?
 
@Moayad I used the latest Studio driver. I also did a clean uninstall using DDU and tried the latest Game Ready driver (538), and the result is still the same.

Please find the log attached.

Update: Sometimes I randomly get a host reboot, and sometimes a VM reboot with a CRITICAL_PROCESS_DIED BSOD.
 
Thank you for the logs!

Code:
Aug 07 15:27:53 myproxmoxserver kernel: vfio-pci 0000:09:00.0: No more image in the PCI ROM
Aug 07 15:27:55 myproxmoxserver kernel: vfio-pci 0000:09:00.0: No more image in the PCI ROM
Aug 07 15:27:55 myproxmoxserver kernel: vfio-pci 0000:09:00.0: No more image in the PCI ROM
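
To see how often this message shows up, and for which device, the saved journal can be scanned with a few lines of Python. This is only a rough sketch: it assumes the journal was dumped to a text file with journalctl -b > journal.txt, and count_rom_warnings is a made-up helper name, not a Proxmox tool.

```python
import re
from collections import Counter

# Matches the vfio-pci warning and captures the PCI address (domain:bus:slot.function).
ROM_MSG = re.compile(
    r"vfio-pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"No more image in the PCI ROM"
)


def count_rom_warnings(journal_text: str) -> Counter:
    """Count 'No more image in the PCI ROM' warnings per PCI device."""
    counts = Counter()
    for line in journal_text.splitlines():
        match = ROM_MSG.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


if __name__ == "__main__":
    # journal.txt is the file produced by: journalctl -b > journal.txt
    with open("journal.txt", encoding="utf-8", errors="replace") as fh:
        for device, hits in count_rom_warnings(fh.read()).items():
            print(device, hits)
```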

As the next step, I would set up the ROM file [0] for the VM.

[0] https://pve.proxmox.com/wiki/PCI_Passthrough#The_.27romfile.27_option
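
Before pointing romfile at a dump, it may help to sanity-check that the file actually contains a PCI option ROM image. The sketch below assumes the standard layout: a 0x55 0xAA signature, with a 16-bit little-endian pointer at offset 0x18 to a "PCIR" data structure that holds the vendor and device IDs. Some dump tools may put extra header bytes in front of the first image, so the function searches for the signature instead of only checking offset 0. find_rom_image is a hypothetical helper for illustration, not something shipped with Proxmox.

```python
import struct
from typing import Optional, Tuple


def find_rom_image(data: bytes) -> Optional[Tuple[int, int, int]]:
    """Return (offset, vendor_id, device_id) of the first plausible
    PCI option ROM image in data, or None if no image is found."""
    pos = data.find(b"\x55\xaa")
    while pos != -1:
        # The pointer to the PCI data structure lives at offset 0x18 of the image.
        if pos + 0x1A <= len(data):
            (pcir_ptr,) = struct.unpack_from("<H", data, pos + 0x18)
            pcir = pos + pcir_ptr
            # The data structure starts with "PCIR", then vendor ID and device ID.
            if data[pcir:pcir + 4] == b"PCIR" and pcir + 8 <= len(data):
                vendor, device = struct.unpack_from("<HH", data, pcir + 4)
                return pos, vendor, device
        pos = data.find(b"\x55\xaa", pos + 1)
    return None


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        found = find_rom_image(fh.read())
    if found is None:
        print("no PCI option ROM image found")
    else:
        print("image at offset %d, vendor 0x%04x, device 0x%04x" % found)
```

A non-zero offset suggests the dump carries leading bytes in front of the image. For an NVIDIA card the vendor ID should read 0x10de.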
 
I don't know if it is related, but there were also a lot of memory errors. It may be worth double-checking your RAM and running memtest on it.
 
I encountered the exact same issue. My Windows 10 VM recently started experiencing random BSODs (VIDEO_DXGKRNL_FATAL_ERROR). It had been running smoothly for the past three weeks. I tried various methods, including reinstalling Windows 10, trying Windows 11, and passing the VBIOS ROM file, but the problem persisted. It wasn't until I came across your post that I tried changing the CPU type from "host" to "kvm64", and the issue was resolved.

After numerous attempts, I discovered a method that consistently triggered the BSOD: exporting the GPU's VBIOS using GPU-Z. Every time I confirmed the export, the system would BSOD.

Furthermore, I observed an unusual behavior. GPU-Z would show the UEFI status as "enabled" only the first time it was opened. Upon closing and reopening GPU-Z, the UEFI status would become "disabled." Additionally, a few seconds after opening GPU-Z, the bus interface information would change from "PCIe x16 3.0" to "PCI."
Opening GPU-Z the first time: [screenshot: gpu-z-bad.jpg]
Opening GPU-Z again: [screenshot: gpu-z-bad2.jpg]

Switching the CPU type from "host" to "kvm64" resolved the above issues. GPU-Z now displays the UEFI status and bus interface correctly, and they no longer change.
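
For anyone who wants to check which VMs are still on the "host" CPU type, here is a small Python sketch that parses the text printed by qm config and builds the qm set command for the switch. It assumes the usual "key: value" layout of that output; parse_qm_config, cpu_model and switch_cpu_command are names invented for this example, and the command is only built, not executed.

```python
from typing import Dict, List


def parse_qm_config(text: str) -> Dict[str, str]:
    """Parse 'key: value' lines as printed by qm config into a dict."""
    config = {}
    for line in text.splitlines():
        # Split on the first colon only, since values such as
        # PCI addresses and MAC addresses contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            config[key.strip()] = value.strip()
    return config


def cpu_model(config: Dict[str, str]) -> str:
    """Return the CPU model from the cpu option, or '' if it is not set."""
    first = config.get("cpu", "").split(",")[0].strip()
    # The model may be written as 'host' or as 'cputype=host'.
    if first.startswith("cputype="):
        first = first[len("cputype="):]
    return first


def switch_cpu_command(vmid: int, model: str = "kvm64") -> List[str]:
    """Build (but do not run) the qm command that changes the CPU type."""
    return ["qm", "set", str(vmid), "--cpu", model]


if __name__ == "__main__":
    sample = "cores: 8\ncpu: host,hidden=1\nhostpci0: 0000:09:00,pcie=1\n"
    if cpu_model(parse_qm_config(sample)) == "host":
        print(" ".join(switch_cpu_command(100)))
```

Note that this replaces the whole cpu option, so extra settings such as hidden=1 would have to be added back to the value if they are still wanted.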

I hope this problem can be fixed someday. Here is some of my system information for reference:

Hardware:
  • AMD 5800X
  • ASUS TUF B550
  • MSI GTX 1070
pve:

Code:
root@ryzen-pve ➜  ~ pveversion -v
proxmox-ve: 8.0.2 (running kernel: 6.2.16-12-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-12-pve: 6.2.16-12
proxmox-kernel-6.2: 6.2.16-12
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.1
libpve-access-control: 8.0.5
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.8
libpve-guest-common-perl: 5.0.4
libpve-http-server-perl: 5.0.4
libpve-rs-perl: 0.8.5
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.6
pve-cluster: 8.0.3
pve-container: 5.0.4
pve-docs: 8.0.4
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.5
pve-qemu-kvm: 8.0.2-5
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.12-pve1

Kernel, module parameters:

Code:
root@ryzen-pve ➜  ~ cat /etc/kernel/cmdline
root=ZFS=rpool/ROOT/pve-1 boot=zfs amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off video=efifb:off video=simplefb:off
root@ryzen-pve ➜  ~ cat /etc/modprobe.d/vifo.conf
options vfio-pci ids=10de:1b81,10de:10f0,10de:1c02,10de:10f1 disable_vga=1
options vfio_iommu_type1 allow_unsafe_interrupts=1
root@ryzen-pve ➜  ~ cat /etc/modprobe.d/blacklist.conf
blacklist nouveau
blacklist nvidia*
blacklist snd_hda_intel
blacklist nvidiafb
root@ryzen-pve ➜  ~ cat /etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1 report_ignored_msrs=0

VM Configuration:

YAML:
root@ryzen-pve ➜  ~ qm config 201
bios: ovmf
boot: order=scsi0;ide2
cores: 8
cpu: host
efidisk0: local-zfs:vm-201-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:08:00,pcie=1,romfile=MSI.GTX1070.8192.160520.rom
ide2: none,media=cdrom
machine: pc-q35-8.0
memory: 32768
meta: creation-qemu=8.0.2,ctime=1694880276
name: kvm-win11
net0: virtio=7E:EE:45:82:75:0C,bridge=vmbr1,firewall=1
numa: 0
ostype: win11
scsi0: local-zfs:vm-201-disk-system,discard=on,iothread=1,size=150G
scsi1: local-zfs:vm-200-disk-life,discard=on,iothread=1,size=200G,ssd=1
scsi2: local-zfs:vm-200-disk-common,discard=on,iothread=1,size=400G,ssd=1
scsi3: pool0:vm-200-disk-simulator,discard=on,iothread=1,size=600G,ssd=1
scsi4: pool0:vm-200-disk-gamessd,discard=on,iothread=1,size=600G,ssd=1
scsi5: /dev/disk/by-id/wwn-0x5000c500dca9e8c1-part2,size=1615895M
scsihw: virtio-scsi-single
smbios1: uuid=4353a00c-0248-49fd-9b29-ef792829b5ac
sockets: 1
tpmstate0: local-zfs:vm-201-disk-2,size=4M,version=v2.0
vmgenid: 14636e76-53f0-4fbc-bcab-77dfd55e89cf
 
Same here with an AMD 5950X. No problem on an Intel 9th-gen CPU or with the kvm64 CPU type. So is the problem somewhere with AMD CPUs?
 
