Can't change power state from D0 to D3hot

randoomkiller

New Member
Sep 20, 2023
Hi there,

I have a Proxmox hypervisor on which I run a Windows VM and sometimes an Ubuntu VM.
I am encountering a bug more and more often where GPU passthrough leaves the VM unable to boot because of power state issues with the GPU.
It worked well for a while, but recently I'm getting more of these errors.
Sometimes a reboot fixes it, but not always.

Does anyone know why this happens and how to fix it?

dmesg :

Code:
[  215.247402] ata1.00: Enabling discard_zeroes_data
[  215.248208] vfio-pci 0000:08:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  215.268648] vfio-pci 0000:08:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  215.272123]  sda: sda1
[  215.288102] vfio-pci 0000:08:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[  216.039182] device tap100i0 entered promiscuous mode
[  216.059654] fwbr100i0: port 1(fwln100i0) entered blocking state
[  216.059657] fwbr100i0: port 1(fwln100i0) entered disabled state
[  216.059693] device fwln100i0 entered promiscuous mode
[  216.059722] fwbr100i0: port 1(fwln100i0) entered blocking state
[  216.059723] fwbr100i0: port 1(fwln100i0) entered forwarding state
[  216.061771] vmbr0: port 2(fwpr100p0) entered blocking state
[  216.061773] vmbr0: port 2(fwpr100p0) entered disabled state
[  216.061801] device fwpr100p0 entered promiscuous mode
[  216.061825] vmbr0: port 2(fwpr100p0) entered blocking state
[  216.061825] vmbr0: port 2(fwpr100p0) entered forwarding state
[  216.063739] fwbr100i0: port 2(tap100i0) entered blocking state
[  216.063741] fwbr100i0: port 2(tap100i0) entered disabled state
[  216.063776] fwbr100i0: port 2(tap100i0) entered blocking state
[  216.063777] fwbr100i0: port 2(tap100i0) entered forwarding state
[  219.024153] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[  219.024177] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[  219.024184] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x26@0xc1c
[  219.024186] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x27@0xd00
[  219.024187] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x25@0xe00
[  219.025480] vfio-pci 0000:08:00.0: No more image in the PCI ROM
[  219.044076] vfio-pci 0000:08:00.1: vfio_ecap_init: hiding ecap 0x25@0x160
[  220.272565] vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
[  220.304561] vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
[  221.032390] vfio-pci 0000:08:00.0: timed out waiting for pending transaction; performing function level reset anyway
[  222.280386] vfio-pci 0000:08:00.0: not ready 1023ms after FLR; waiting
[  223.336401] vfio-pci 0000:08:00.0: not ready 2047ms after FLR; waiting
[  225.576288] vfio-pci 0000:08:00.0: not ready 4095ms after FLR; waiting
[  229.928272] vfio-pci 0000:08:00.0: not ready 8191ms after FLR; waiting
[  238.376115] vfio-pci 0000:08:00.0: not ready 16383ms after FLR; waiting
[  255.783947] vfio-pci 0000:08:00.0: not ready 32767ms after FLR; waiting
[  290.599246] vfio-pci 0000:08:00.0: not ready 65535ms after FLR; giving up
[  290.748260] fwbr100i0: port 2(tap100i0) entered disabled state
[  290.772594] fwbr100i0: port 1(fwln100i0) entered disabled state
[  290.772707] vmbr0: port 2(fwpr100p0) entered disabled state
[  290.772773] device fwln100i0 left promiscuous mode
[  290.772775] fwbr100i0: port 1(fwln100i0) entered disabled state
[  290.795133] device fwpr100p0 left promiscuous mode
[  290.795136] vmbr0: port 2(fwpr100p0) entered disabled state
[  290.891460] ata1.00: Enabling discard_zeroes_data
[  290.930613]  sda: sda1
[  291.041966] vfio-pci 0000:08:00.1: can't change power state from D0 to D3hot (config space inaccessible)
[  291.050074]  sdb: sdb1 sdb2 sdb3 sdb4 sdb5
[  291.783259] vfio-pci 0000:08:00.0: timed out waiting for pending transaction; performing function level reset anyway
[  293.031205] vfio-pci 0000:08:00.0: not ready 1023ms after FLR; waiting
[  294.087181] vfio-pci 0000:08:00.0: not ready 2047ms after FLR; waiting
[  296.231158] vfio-pci 0000:08:00.0: not ready 4095ms after FLR; waiting
[  300.583073] vfio-pci 0000:08:00.0: not ready 8191ms after FLR; waiting
[  309.030927] vfio-pci 0000:08:00.0: not ready 16383ms after FLR; waiting
[  327.462587] vfio-pci 0000:08:00.0: not ready 32767ms after FLR; waiting
[  362.277982] vfio-pci 0000:08:00.0: not ready 65535ms after FLR; giving up
[  363.368797] vfio-pci 0000:08:00.1: can't change power state from D0 to D3hot (config space inaccessible)
[  363.368808] vfio-pci 0000:08:00.0: can't change power state from D0 to D3hot (config space inaccessible)
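
To make a log like this easier to compare between a good and a bad boot, here is a rough Python sketch that counts the failed resets and the D3hot errors per PCI address. The two regular expressions are copied from the message wording in the dmesg output above and are only assumed to match other kernel versions; the `summarize` helper and the field names are made up for illustration.

```python
import re
from collections import defaultdict

# Patterns follow the message wording seen in the dmesg output above.
# Other kernel versions may word these messages differently (assumption).
FLR_RE = re.compile(r"vfio-pci (\S+): not ready (\d+)ms after FLR; (waiting|giving up)")
D3_RE = re.compile(r"vfio-pci (\S+): can't change power state from D0 to D3hot")


def summarize(lines):
    """Count FLR give-ups and D3hot failures per PCI address (hypothetical helper)."""
    stats = defaultdict(
        lambda: {"flr_give_ups": 0, "max_wait_ms": 0, "d3hot_failures": 0}
    )
    for line in lines:
        m = FLR_RE.search(line)
        if m:
            dev, waited_ms, outcome = m.group(1), int(m.group(2)), m.group(3)
            # Track the longest wait the kernel reported for this device.
            stats[dev]["max_wait_ms"] = max(stats[dev]["max_wait_ms"], waited_ms)
            if outcome == "giving up":
                stats[dev]["flr_give_ups"] += 1
            continue
        m = D3_RE.search(line)
        if m:
            stats[m.group(1)]["d3hot_failures"] += 1
    return dict(stats)


# A few lines taken verbatim from the log above.
SAMPLE = [
    "[  290.599246] vfio-pci 0000:08:00.0: not ready 65535ms after FLR; giving up",
    "[  291.041966] vfio-pci 0000:08:00.1: can't change power state from D0 to D3hot (config space inaccessible)",
    "[  362.277982] vfio-pci 0000:08:00.0: not ready 65535ms after FLR; giving up",
    "[  363.368808] vfio-pci 0000:08:00.0: can't change power state from D0 to D3hot (config space inaccessible)",
]

for device, counters in sorted(summarize(SAMPLE).items()):
    print(device, counters)
```

`summarize` accepts any iterable of lines, so it could also be given a saved dmesg dump opened as a file.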
 

The error from the Proxmox UI:


no efidisk configured! Using temporary efivars disk.
kvm: vfio: Unable to power on device, stuck in D3
kvm: vfio: Unable to power on device, stuck in D3
TASK ERROR: start failed: command '/usr/bin/kvm -id 100 -name Windows.P2V -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/100.qmp,server=on,wait=off' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/100.pid -daemonize -smbios 'type=1,uuid=6b269a63-971b-4f80-a579-4b8dac17cd0d' -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/tmp/100-ovmf.fd' -smp '12,sockets=1,cores=12,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga none -nographic -no-hpet -cpu 'host,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt' -m 25000 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=fc53f5df-13d9-41cc-b011-1140c7f41ed7' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=0000:08:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' -device 'vfio-pci,host=0000:08:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:8bc0d5e2cd2c' -drive 'file=/var/lib/vz/template/iso/virtio-win-0.1.240.iso,if=none,id=drive-ide2,media=cdrom,aio=io_uring' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=101' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/disk/by-id/ata-Samsung_SSD_860_QVO_1TB_S4CZNF0MB08077X,if=none,id=drive-scsi2,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=2,drive=drive-scsi2,id=scsi2' -device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' -drive 
'file=/dev/disk/by-id/ata-INTEL_SSDSC2CT240A4_CVKI3190023Q240DGN,if=none,id=drive-sata1,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'ide-hd,bus=ahci0.1,drive=drive-sata1,id=sata1,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=36:0D:C4:60:B8:3D,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=102' -rtc 'driftfix=slew,base=localtime' -machine 'type=pc-q35-6.1+pve0' -global 'kvm-pit.lost_tick_policy=discard'' failed: got timeout
 
