Passthrough RTX 6000/5090: CPU soft lockup BUG, D3cold to D0, after guest shutdown

Trying four extra kernel parameters to see if they help. Will report when the issue happens again.

quiet idle=nomwait pci=nocrs pci=realloc processor.max_cstate=5 amd_iommu=on iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1 vfio-pci.ids=10de:22e8,10de:2bb1 initcall_blacklist=sysfb_init
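For completeness, a minimal sketch of where those parameters would go on a GRUB-booted host (assumption: GRUB; on a systemd-boot/ZFS Proxmox install the same string belongs in /etc/kernel/cmdline, followed by proxmox-boot-tool refresh):

Code:
# /etc/default/grub -- keep everything on one line
GRUB_CMDLINE_LINUX_DEFAULT="quiet idle=nomwait pci=nocrs pci=realloc processor.max_cstate=5 amd_iommu=on iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1 vfio-pci.ids=10de:22e8,10de:2bb1 initcall_blacklist=sysfb_init"

# apply the change and reboot
update-grub
reboot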
 
Have a look at the attachment: upgrading the firmware did not help. I am looking for next steps. I have already written to ASRock support and will also write to NVIDIA, but I think it might be a motherboard issue. I am thinking about what more I can do.
Anyway, I would like to thank you for your suggestion. If you need, for example, an RTX 5090 or RTX 6000 to use in a VM for a while, I do have a datacenter where I rent those out, and I can lend one for free for some time if you need it.
I will check out the scripts you sent later, once I have this stable.
 
Those commands did not help.
I am currently talking to NVIDIA enterprise support (Tier II) to fix this. They asked me to talk with the Proxmox support team, so I also sent them a support ticket.
 
Hi!
I have basically the same problem and opened another thread:
my thread

I think I tried the firmware update as well, but I am not quite sure about it.

Is there any help at the moment?

Regards

Christof
 
Today I got an answer from Proxmox. It is not an issue with my Proxmox installation (it was installed on top of a clean Debian), but here it is:

pve-edk2-firmware: not correctly installed
According to the report, the package is not installed correctly, which may affect EFI VMs. Please reinstall:

Code:
apt update
apt install --reinstall pve-edk2-firmware
# Check if apt is ok:
apt -f install

# You could also try the opt-in kernel 6.14. It has fixed GPU issues in the past.

apt install proxmox-kernel-6.14
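For what it's worth, a quick way to check that the reinstall took and which kernel is actually running (standard Debian/Proxmox commands, nothing specific to this report):

Code:
dpkg -s pve-edk2-firmware | grep -E '^(Status|Version)'   # should show "install ok installed"
uname -r                                                  # currently running kernel
pveversion                                                # PVE version plus running kernel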
 
I answered your thread on Level1Techs buddy - don't know if you saw it:

https://forum.level1techs.com/t/do-...ies-has-reset-bug-in-vm-passthrough/228549/35

What fixed it on my system (debian) was to disable the nvidia-drm modeset option on the VM - that's all. No changes needed on the host.
Oh, I actually did not see it. Thanks.
Hmm, that is interesting. I can try this setting in the VM: options nvidia-drm modeset=0
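If it helps anyone, a minimal sketch of how that could be set inside the guest (assuming a Debian/Ubuntu-style guest with the proprietary NVIDIA driver; the config file name is arbitrary):

Code:
# inside the VM
echo "options nvidia-drm modeset=0" > /etc/modprobe.d/nvidia-drm-modeset.conf
update-initramfs -u    # in case nvidia-drm is loaded from the initramfs
reboot
# verify after the reboot; should print N
cat /sys/module/nvidia_drm/parameters/modeset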

Still, you have rock-solid Windows, and we also got this issue after a Windows guest shutdown as well.
I am wondering if you set something special in Windows, or maybe the drivers have something to do with it. Or did you set something special for the VGA/display setting in Proxmox for the VM?
 
No, nothing special, just a typical Windows 11 installation with the latest NVIDIA driver. I'm not on Proxmox though, just a Debian 12 host running libvirt KVMs. I've attached my VM definitions in case it helps.
 

One of my clients confirmed that his RTX 6000 Blackwell crashed all the time when he was training with Unsloth, and after adding that nvidia-drm modeset=0 fix in the VM it no longer crashes!
Can anyone else confirm?
I have asked NVIDIA support if they can fix this on the host side or in the GPU BIOS, as changing anything in my clients' VMs would be impossible lol, and they could still change that config and crash it again.
 
Same issue after guest shutdown on Linux 6.14.8-2-pve, X670E PG Lightning, Proxmox VE 9.0.3 x86_64, NVIDIA GeForce RTX 5070, AMD Ryzen 9 7900

Code:
Aug 11 21:11:48 proxmox kernel: vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
Aug 11 21:11:48 proxmox kernel: vfio-pci 0000:01:00.0: resetting
Aug 11 21:11:48 proxmox kernel: vfio-pci 0000:01:00.1: resetting
Aug 11 21:11:48 proxmox kernel: vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
Aug 11 21:11:50 proxmox kernel: pcieport 0000:00:01.1: broken device, retraining non-functional downstream link at 2.5GT/s
Aug 11 21:11:50 proxmox kernel: vfio-pci 0000:01:00.0: reset done
Aug 11 21:11:50 proxmox kernel: vfio-pci 0000:01:00.1: reset done
Aug 11 21:11:50 proxmox kernel: vfio-pci 0000:01:00.1: Unable to change power state from D3cold to D0, device inaccessible
Aug 11 21:11:50 proxmox kernel: vfio-pci 0000:01:00.0: Unable to change power state from D0 to D3hot, device inaccessible
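When it gets into that state, the host can confirm from sysfs whether the functions really are parked in D3cold (paths assume the 01:00.x addresses from the log above and a reasonably recent kernel):

Code:
cat /sys/bus/pci/devices/0000:01:00.0/power_state     # D0 / D3hot / D3cold
cat /sys/bus/pci/devices/0000:01:00.0/d3cold_allowed  # 1 = D3cold permitted
cat /sys/bus/pci/devices/0000:01:00.0/power/control   # "auto" = runtime PM active, "on" = kept awake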
 
Got a response from NVIDIA that they were able to reproduce this issue and are thinking about a fix.
Also, I installed proxmox-kernel-6.14.8-2-bpo12-pve (apt install proxmox-kernel-6.14.8-2-bpo12-pve/stable) and I see that the RTX 6000 now boots super fast, versus very slow on the older 6.8 and 6.11 kernels. In 6.14 they added some support for Blackwell, so it is worth trying out.
https://www.phoronix.com/news/Linux-6.14-VFIO
Anyway, the crash on shutdown is caused by either the specific training workload itself and/or some module options for nvidia.
The training that caused issues no longer crashes the GPU after applying options nvidia-drm modeset=0 and the /etc/X11/xorg.conf.d change.
But since a client can do anything inside the VM, this is not a good solution.
 
Yeah, it seems like it!
I have upgraded a few servers with RTX 4090, RTX 5090, and also RTX 6000 Blackwell cards to that kernel, proxmox-kernel-6.14.8-2-bpo12-pve/stable.
And so far it works OK, plus that crazy fast startup.

So only one thing still remains: the GPU crashing when the VM guest does something strange like modeset=1 and other things and then shuts the VM down; then the GPU goes to sleep forever xD
Will see if NVIDIA fixes it or not.

I also suspect the RTX 4090 has the same issue, but I am not certain, as it might be a different issue. Will check it soon.
 
These messages are related to the GPU being put into a power-saving sleep state and not being able to wake up in time for the VM.
vfio-pci 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
vfio-pci 0000:02:00.1: Unable to change power state from D3cold to D0, device inaccessible

Try this on the VM host:
1. Kernel module

In /etc/default/grub, add to the kernel options:

vfio-pci.disable_idle_d3=1


2. Bind the GPU early and prevent idle D3

Create /etc/modprobe.d/vfio-pci.conf:

# RTX 5090 GPU (10de:2b85) + Audio (10de:22e8)
# Bind both functions to vfio-pci and prevent runtime D3 (D3cold)
options vfio-pci ids=10de:2b85,10de:22e8 disable_vga=1 disable_idle_d3=1


3. Reinforce power management rules

Create /etc/udev/rules.d/99-vfio-gpu-pm.rules:

# Force RTX 5090 functions to stay in D0 and block deepest sleep
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{device}=="0x2b85", ATTR{power/control}="on", ATTR{d3cold_allowed}="0"
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{device}=="0x22e8", ATTR{power/control}="on", ATTR{d3cold_allowed}="0"


4. Apply and Reboot
update-initramfs -u
udevadm control --reload-rules
udevadm trigger -s pci
update-grub
reboot
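A quick sketch of how to verify after the reboot that the options actually landed (substitute your own PCI address for 0000:02:00.0):

Code:
cat /sys/module/vfio_pci/parameters/disable_idle_d3   # expect Y
cat /sys/bus/pci/devices/0000:02:00.0/d3cold_allowed  # expect 0 (set by the udev rule)
cat /sys/bus/pci/devices/0000:02:00.0/power/control   # expect "on"
lspci -nnk -s 02:00.0 | grep -i 'in use'              # expect vfio-pci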
 
Actually mortise, I think this solves the issue. I am not sure, as it does not happen often, but I think you hit the spot.
Where did you hear about this? I am wondering why I could not find that solution anywhere.
I could not find a solution posted anywhere either :) A lot of time went into investigating and troubleshooting to find the cause. It makes sense though: the device is placed into a suspend state and cannot recover quickly enough when needed.

Once I figured it out, I searched the forum to see if I had missed something, or to post it in case anyone had a similar issue.

Note that the vendor and device IDs are specific to the 5090, e.g. 10de:2b85,10de:22e8.

If you have different hardware, you can find them using: lspci -nn | grep -i nvidia

02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GB202 [GeForce RTX 5090] [10de:2b85] (rev a1)
02:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)
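If the grep output is noisy, filtering by the NVIDIA vendor ID gives the same information (a small convenience, assuming standard pciutils):

Code:
lspci -nn -d 10de:    # list only NVIDIA functions with their [vendor:device] IDs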


Looking again at Jaques Schidt's answer (https://forum.level1techs.com/t/do-...ies-has-reset-bug-in-vm-passthrough/228549/34), that could also help if you are running a GUI, as binding Xorg to the nvidia driver and keeping a screen active helps prevent the GPU from being suspended. Not everyone runs a GUI though.
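For anyone going that route, a minimal sketch of what such an Xorg binding could look like (file name and BusID are assumptions; adjust the BusID to match your lspci output):

Code:
# /etc/X11/xorg.conf.d/10-nvidia.conf  (example file name)
Section "Device"
    Identifier "NvidiaGPU"
    Driver     "nvidia"
    BusID      "PCI:2:0:0"    # bus:device:function in decimal, from lspci
EndSection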

HTH
 
I believe I have implemented this correctly with systemd (using ZFS) and am still getting "D3cold to D0, device inaccessible".
I have noticed that, once the crash is reproduced, a hard shutdown of the VM will likely crash the host. If instead the guest VM hangs and I reset it, something different seems to happen with the hardware: the host won't crash, but the VM will fail to start again with
"kvm: ../hw/pci/pci.c:1654: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
TASK ERROR: start failed: QEMU exited with code 1" in Proxmox.

I then have to reboot the host to be able to pass the card through to the VM once more. If I hard shutdown the VM, about 5-10 later the whole host goes offline, including all the VMs on it.
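Before resorting to a host reboot, it may be worth trying a remove/rescan of the stuck functions from sysfs; it may not recover a link that is fully down, but it sometimes avoids the reboot (addresses are examples, run as root):

Code:
echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
echo 1 > /sys/bus/pci/devices/0000:01:00.1/remove
echo 1 > /sys/bus/pci/rescan    # re-enumerate the bus, then check dmesg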
 
Which device is reporting the D3cold to D0 message?

If you have different hardware, you can find them using: lspci -nn
e.g.:
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GB202 [GeForce RTX 5090] [10de:2b85] (rev a1)
02:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22e8] (rev a1)

The appropriate vendor and device IDs for your system should be used when binding to vfio-pci and setting the PM rules.

For the interrupt, in the guest check dmesg and lspci to determine which device is having the issue, then follow through to the host to see what's next.

dmesg -T
lspci -vv | grep -i 'interrupt:'
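And on the host side, link state and driver binding for the passed-through functions can be checked in one go (substitute your own address):

Code:
lspci -vvnnk -s 01:00.0 | grep -E 'LnkSta:|Kernel driver'
dmesg -T | grep -iE 'vfio|d3cold' | tail -n 20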