GPU Passthrough: Host crashes when VM attempts to start up after the GPU driver has initialized the card once

josephm101

New Member
Jun 4, 2024
I'm having this really annoying issue with a GeForce 9800 GT and a Windows 7 VM on the latest version of Proxmox. When I was installing Windows, everything worked just fine. The VM restarted multiple times without a hitch. Ever since I installed the NVIDIA drivers in the guest, the entire host crashes when the VM attempts to reboot. This issue only happens after the GPU has been initialized in the guest by the NVIDIA driver.

Does anyone know what's going on? Is it some kind of a reset bug?

No errors show up with `journalctl -f` or `dmesg -w` when I start the VM the second time.
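Because the whole host goes down, the last kernel messages may never reach a live follow session. A minimal sketch for reading the previous boot's kernel log after the host comes back up, assuming the journal is persistent (for example, `/var/log/journal` exists); the function name `prev_boot_kernel_log` is made up for illustration:

```shell
#!/bin/sh
# Show kernel messages from the previous boot (-b -1), warnings and worse.
# Assumption: journald keeps logs across reboots; otherwise "-b -1" has
# nothing to show and journalctl reports an error instead.
prev_boot_kernel_log() {
    journalctl -k -b -1 -p warning --no-pager "$@"
}

# Example (run on the host after it has rebooted from the crash):
# prev_boot_kernel_log | tail -n 50
```

If the crash is a hard hang, the final lines may still be missing, since they might not have been written to disk in time.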

Here's my VM config (let me know if you need me to provide more info):
[Attached image: 1717462736779.png]

ADDITIONAL INFO:
- The motherboard I'm using is a Gigabyte B450 AORUS M. It has POST LEDs on it. When the host crashes, two of the four LEDs start blinking: specifically, the ones labeled "CPU" and "DRAM". It will do this until I turn off the power.
- When the GPU is in its regular text mode, and the VM is abruptly stopped, whatever was on the display/in the framebuffer will remain on the screen. After the GPU has been initialized by its driver, if the VM is abruptly stopped, the screen goes black.

The most annoying bit is that I feel like I've solved this before, but I can't remember what the hell I did!!!
 
Many GPUs don't reset properly and, because they are real devices, they can crash the Proxmox host. Such devices only work once in a VM (per Proxmox host reboot), which seems to match your issue. Maybe search the internet for a workaround for your specific GPU (if one exists)?
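One way to see what the kernel thinks about a device's reset support is to look at its sysfs entry. This is only a rough sketch: it assumes a reasonably recent kernel (the `reset_method` attribute appeared around 5.15), the helper name `pci_reset_info` is made up, and the PCI address in the example is hypothetical. A listed method only means the kernel will attempt it, not that the card actually recovers.

```shell
#!/bin/sh
# Report which reset mechanisms the kernel lists for a PCI device.
# Usage: pci_reset_info <sysfs device directory>
pci_reset_info() {
    dev_dir=$1
    if [ ! -d "$dev_dir" ]; then
        echo "no such device: $dev_dir"
        return 1
    fi
    if [ -r "$dev_dir/reset_method" ]; then
        # Newer kernels: space-separated list such as "pm bus" or "flr".
        methods=$(cat "$dev_dir/reset_method")
        echo "reset methods: ${methods:-none}"
    elif [ -e "$dev_dir/reset" ]; then
        # Older kernels: only the write-only "reset" trigger is exposed.
        echo "reset methods: unknown (reset file present)"
    else
        echo "reset methods: none"
    fi
}

# Hypothetical address; find the real one with `lspci -D | grep -i nvidia`.
# pci_reset_info /sys/bus/pci/devices/0000:01:00.0
```

If nothing is listed, or only a bus reset is available, that would be consistent with the card working only once per host boot.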
 
I tried searching for workarounds but couldn't find any. I might have to look again. Everything I could find about GPU reset issues pointed to AMD GPUs.

I feel like the issue must be really bad if the motherboard starts throwing out POST errors after the crash. I have never seen that happen with any other (unrelated) system crash.

What sucks the most is that I feel like I had all of this working at one point. Would have been about a year ago, but it's all the same hardware. The only thing that I can think of being different is that the motherboard BIOS has been updated since.

This is a real puzzler... but thank you for that info. I have a couple other GPUs that were released around the same time as the 9800 GT that I can try. Maybe I can try back-flashing the BIOS?
 