Hi! I recently converted my workstation from Windows 11 to Proxmox. Everything went fine: I created a few containers for my applications and they are working correctly.
Now here's the issue: I created a Windows 11 VM (mainly for gaming and other Windows apps), installed the OS onto a separate, dedicated NVMe drive, and then followed this guide for GPU passthrough, after which everything worked reasonably well: https://forum.proxmox.com/threads/2025-proxmox-pcie-gpu-passthrough-with-nvidia.169543/
I then moved the server from my home to my business (I have FTTH) and GPU passthrough stopped working.
The first time, everything started correctly, and I even used the Windows VM to test some games, but then it crashed and became unresponsive (both over Sunshine + Moonlight and over the Proxmox VNC console). I rebooted the system and now I'm having lots of issues:
- The GPU's PCI ID changes on every reboot: it goes from 01 to 02 to 03 and back to 01, and so on, so I have to change the ID by hand after every reboot.
- The VM doesn't start anymore. These are the main errors I get:
kvm: vfio: Unable to power on device, stuck in D3
kvm: vfio: Unable to power on device, stuck in D3
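To cross-check what the kernel itself reports for the card, the power state can be read from sysfs. This is only a minimal sketch: it assumes the kernel exposes the `power_state` attribute for PCI devices and that the GPU sits at 0000:03:00 as in my VM config. The names `pci_power_state` and `SYSFS_PCI` are made up for illustration.

```shell
# Print the kernel-reported power state of every PCI function in one slot.
# SYSFS_PCI is a hypothetical override (useful for dry runs); the default
# is the real sysfs location.
pci_power_state() {
    slot="$1"                                    # e.g. 0000:03:00
    root="${SYSFS_PCI:-/sys/bus/pci/devices}"
    for dev in "$root/$slot".*; do
        # Skip functions without the attribute (or an unmatched glob).
        [ -r "$dev/power_state" ] || continue
        printf '%s %s\n' "${dev##*/}" "$(cat "$dev/power_state")"
    done
}

# Usage on the host:
#   pci_power_state 0000:03:00
```

If this prints D3hot or D3cold for 03:00.0 while the VM is stopped, that would at least match the error above, although it does not say why the device fails to wake up.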
I checked the BIOS, my config, and everything else, and I haven't changed anything since it was working!
My hardware: i9-10850K, Nvidia RTX 3090 (MSI), 128 GB RAM, multiple disks, ASUS ROG STRIX Z490-F Gaming, Corsair 1200 W Platinum.
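As a stopgap for the changing address, the slot can be looked up by vendor:device ID instead of being typed in by hand. This is a rough sketch, not a fix for the underlying cause: `find_pci_slot` is a helper name I made up, 10de:2204 is the GPU ID from my vfio.conf, and VM 100 is my Windows VM.

```shell
# Read `lspci -Dn` output on stdin and print the slot (domain:bus:device,
# without the function suffix) of the first device matching vendor:device.
find_pci_slot() {
    awk -v id="$1" '$3 == id { sub(/\.[0-7]$/, "", $1); print $1; exit }'
}

# Hypothetical usage on the host, before starting the VM:
#   slot=$(lspci -Dn | find_pci_slot 10de:2204)
#   qm set 100 --hostpci0 "${slot},pcie=1,x-vga=1"
```

The helper only parses text, so it can be tried on saved `lspci -Dn` output first, before it is allowed to touch the VM config.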
Code:
root@supernova:~# pveversion -v
proxmox-ve: 9.0.0 (running kernel: 6.14.11-4-pve)
pve-manager: 9.0.11 (running version: 9.0.11/3bf5476b8a4699e2)
proxmox-kernel-helper: 9.0.4
proxmox-kernel-6.14.11-4-pve-signed: 6.14.11-4
proxmox-kernel-6.14: 6.14.11-4
proxmox-kernel-6.14.8-2-pve-signed: 6.14.8-2
ceph: 19.2.3-pve2
ceph-fuse: 19.2.3-pve2
corosync: 3.1.9-pve2
criu: 4.1.1-1
frr-pythontools: 10.3.1-1+pve4
ifupdown2: 3.3.0-1+pmx10
intel-microcode: 3.20250812.1~deb13u1
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libproxmox-acme-perl: 1.7.0
libproxmox-backup-qemu0: 2.0.1
libproxmox-rs-perl: 0.4.1
libpve-access-control: 9.0.3
libpve-apiclient-perl: 3.4.0
libpve-cluster-api-perl: 9.0.6
libpve-cluster-perl: 9.0.6
libpve-common-perl: 9.0.11
libpve-guest-common-perl: 6.0.2
libpve-http-server-perl: 6.0.5
libpve-network-perl: 1.1.8
libpve-rs-perl: 0.10.10
libpve-storage-perl: 9.0.13
libspice-server1: 0.15.2-1+b1
lvm2: 2.03.31-2+pmx1
lxc-pve: 6.0.5-1
lxcfs: 6.0.4-pve1
novnc-pve: 1.6.0-3
proxmox-backup-client: 4.0.16-1
proxmox-backup-file-restore: 4.0.16-1
proxmox-backup-restore-image: 1.0.0
proxmox-firewall: 1.2.0
proxmox-kernel-helper: 9.0.4
proxmox-mail-forward: 1.0.2
proxmox-mini-journalreader: 1.6
proxmox-offline-mirror-helper: 0.7.2
proxmox-widget-toolkit: 5.0.6
pve-cluster: 9.0.6
pve-container: 6.0.13
pve-docs: 9.0.8
pve-edk2-firmware: 4.2025.02-4
pve-esxi-import-tools: 1.0.1
pve-firewall: 6.0.3
pve-firmware: 3.17-2
pve-ha-manager: 5.0.5
pve-i18n: 3.6.1
pve-qemu-kvm: 10.0.2-4
pve-xtermjs: 5.5.0-2
qemu-server: 9.0.23
smartmontools: 7.4-pve1
spiceterm: 3.4.1
swtpm: 0.8.0+pve2
vncterm: 1.9.1
zfsutils-linux: 2.3.4-pve1
root@supernova:~# cat /etc/pve/qemu-server/100.conf
bios: ovmf
boot: order=ide2;ide3
cores: 12
cpu: host,hidden=1,flags=+pcid
efidisk0: local-lvm:vm-100-disk-0,efitype=4m,pre-enrolled-keys=0,size=4M
hostpci0: 0000:03:00,pcie=1,x-vga=1
ide2: none,media=cdrom
ide3: none,media=cdrom
machine: pc-q35-10.0
memory: 32768
meta: creation-qemu=10.0.2,ctime=1761692254
name: Windows11
net0: virtio=BC:24:11:74:6E:CF,bridge=vmbr0
ostype: win11
sata0: /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_Plus_500GB_S4EVNMFN739698M,size=488386584K
sata1: /dev/disk/by-id/ata-WDC_WD40EFAX-68JH4N0_WD-WX32D70ELR9J,size=3907018584K
sata2: /dev/disk/by-id/ata-Samsung_SSD_870_EVO_250GB_S61WNJ0NC43808E,size=244198584K
smbios1: uuid=ed16fcfd-3bca-44be-a128-84333bb6d9b9
tpmstate0: local-lvm:vm-100-disk-1,size=4M,version=v2.0
vga: std,clipboard=vnc
vmgenid: fe6418f9-1aca-4616-b721-9bb89bcac5bf
root@supernova:~# cat /proc/cpuinfo | grep -i "model name" | head -1
model name : Intel(R) Core(TM) i9-10850K CPU @ 3.60GHz
root@supernova:~# dmesg | grep -i iommu
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.14.11-4-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1
[ 0.088868] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.14.11-4-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt vfio_iommu_type1.allow_unsafe_interrupts=1
[ 0.088927] DMAR: IOMMU enabled
[ 0.261038] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.535984] iommu: Default domain type: Passthrough (set via kernel command line)
[ 0.604097] pci 0000:00:02.0: Adding to iommu group 0
[ 0.604124] pci 0000:00:00.0: Adding to iommu group 1
[ 0.604143] pci 0000:00:01.0: Adding to iommu group 2
[ 0.604151] pci 0000:00:01.1: Adding to iommu group 2
[ 0.604159] pci 0000:00:01.2: Adding to iommu group 2
[ 0.604173] pci 0000:00:14.0: Adding to iommu group 3
[ 0.604180] pci 0000:00:14.2: Adding to iommu group 3
[ 0.604193] pci 0000:00:15.0: Adding to iommu group 4
[ 0.604200] pci 0000:00:15.1: Adding to iommu group 4
[ 0.604210] pci 0000:00:16.0: Adding to iommu group 5
[ 0.604217] pci 0000:00:17.0: Adding to iommu group 6
[ 0.604230] pci 0000:00:1b.0: Adding to iommu group 7
[ 0.604245] pci 0000:00:1c.0: Adding to iommu group 8
[ 0.604260] pci 0000:00:1c.4: Adding to iommu group 9
[ 0.604269] pci 0000:00:1d.0: Adding to iommu group 10
[ 0.604288] pci 0000:00:1f.0: Adding to iommu group 11
[ 0.604296] pci 0000:00:1f.3: Adding to iommu group 11
[ 0.604304] pci 0000:00:1f.4: Adding to iommu group 11
[ 0.604312] pci 0000:00:1f.5: Adding to iommu group 11
[ 0.604316] pci 0000:03:00.0: Adding to iommu group 2
[ 0.604319] pci 0000:03:00.1: Adding to iommu group 2
[ 0.604332] pci 0000:04:00.0: Adding to iommu group 12
[ 0.604342] pci 0000:06:00.0: Adding to iommu group 13
[ 0.604351] pci 0000:07:00.0: Adding to iommu group 14
[ 15.494366] pci 0000:03:00.0: Adding to iommu group 2
[ 15.494477] pci 0000:03:00.1: Adding to iommu group 2
root@supernova:~# lsmod | grep vfio
vfio_pci 16384 0
vfio_pci_core 86016 1 vfio_pci
irqbypass 12288 2 vfio_pci_core,kvm
vfio_iommu_type1 49152 0
vfio 65536 4 vfio_pci_core,vfio_iommu_type1,vfio_pci
iommufd 110592 1 vfio
root@supernova:~# cat /etc/modprobe.d/vfio.conf
cat /etc/modprobe.d/blacklist.conf
options vfio-pci ids=10de:2204,10de:1aef disable_vga=1
softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci
softdep snd_hda_intel pre: vfio-pci
#options vfio_iommu_type1 allow_unsafe_interrupts=1
blacklist nouveau
blacklist nvidia
root@supernova:~# dmidecode -t baseboard | grep -E "Manufacturer|Product"
Manufacturer: ASUSTeK COMPUTER INC.
Product Name: ROG STRIX Z490-F GAMING
root@supernova:~# lspci -k -s 03:00
03:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1)
Subsystem: Micro-Star International Co., Ltd. [MSI] Device 3882
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
03:00.1 Audio device: NVIDIA Corporation GA102 High Definition Audio Controller (rev a1)
Subsystem: Micro-Star International Co., Ltd. [MSI] Device 3882
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
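The dmesg output above puts 03:00.0 and 03:00.1 in IOMMU group 2 together with 00:01.0, 00:01.1 and 00:01.2. To list a group's members directly from sysfs rather than from the boot log, something like the following should work. It is a minimal sketch: `iommu_group_members` and `IOMMU_ROOT` are illustrative names, not part of any Proxmox tool.

```shell
# List every device that shares an IOMMU group with the given PCI function.
# IOMMU_ROOT is a hypothetical override (useful for dry runs); the default
# is the real sysfs location.
iommu_group_members() {
    target="$1"                                  # e.g. 0000:03:00.0
    root="${IOMMU_ROOT:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/"$target"; do
        [ -e "$dev" ] || continue                # glob did not match anything
        group_dir="${dev%/devices/*}"
        echo "group ${group_dir##*/}:"
        ls "$group_dir/devices"
    done
}

# Usage on the host:
#   iommu_group_members 0000:03:00.0
```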
Any help is greatly appreciated