Issues Setting Up Windows 10 VM with GPU Passthrough

Partikel, Oct 22, 2023
Hello dear Proxmox community,

I am currently attempting to set up a Windows 10 VM with GPU passthrough, but I'm encountering two primary issues and am hoping for your support.

Problem 1: "kvm: vfio: Cannot reset device 0000:08:00.0, depends on group 22 which is not owned."

Upon starting the VM, I receive the following message:


Code:
kvm: vfio: Cannot reset device 0000:08:00.0, depends on group 22 which is not owned.
TASK OK


However, the VM starts normally and works well. Yet, after shutting it down, I'm unable to start it again as I encounter the following error:


Code:
kvm: ../hw/pci/pci.c:1613: pci_irq_handler: Assertion '0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
TASK ERROR: start failed: QEMU exited with code 1

It only starts working again after performing a hard reset of my main PC (Datacenter).

Problem 2: Routing Sound through the GPU

My second challenge is routing sound through my GPU. I have two devices in different IOMMU groups:

- 08:00.0 (GPU graphics function) in IOMMU group 21
- 08:00.1 (GPU audio function) in IOMMU group 22

class | device | id | iommugroup | vendor | device_name | subsystem_device | subsystem_device_name | subsystem_vendor | subsystem_vendor_name
0x030000 | 0x1c02 | 0000:08:00.0 | 21 | 0x10de | GP106 [GeForce GTX 1060 3GB] | 0x11c2 | - | 0x10de | NVIDIA Corporation
0x040300 | 0x10f1 | 0000:08:00.1 | 22 | 0x10de | GP106 High Definition Audio Controller | 0x11c2 | - | 0x10de | NVIDIA Corporation
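
For anyone who wants to re-check the grouping shown in the table directly on the host, here is a minimal Python sketch. It assumes the usual Linux sysfs layout (/sys/kernel/iommu_groups/<group>/devices/<address>); the function name iommu_groups is only illustrative and is not a Proxmox tool.

```python
from collections import defaultdict
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains.

    Assumes the standard sysfs layout <root>/<group>/devices/<address>.
    Returns an empty dict when the directory does not exist
    (for example when the IOMMU is disabled).
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    groups = defaultdict(list)
    for dev in base.glob("*/devices/*"):
        # dev.parent is ".../devices", dev.parent.parent is the group directory
        groups[int(dev.parent.parent.name)].append(dev.name)
    return {group: sorted(devs) for group, devs in sorted(groups.items())}


if __name__ == "__main__":
    for group, devices in iommu_groups().items():
        for addr in devices:
            print(f"group {group:>3}: {addr}")
```

If 0000:08:00.0 and 0000:08:00.1 show up under different group numbers, as in the table above, the two functions of the card are being treated as separately isolated devices.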

Here are some relevant configuration details:

Grub Line:

Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet textonly amd_iommu=on pcie_acs_override=downstream,multifunction video=efifb:off video=vesa:off video=vesafb:off video=simplefb:off nofb nomodeset vfio-pci.ids=10de:1c02,10de:10f1 vfio_iommu_typ>"

kvm.conf Config (/etc/modprobe.d/kvm.conf):

Code:
options kvm ignore_msrs=1

VM Config:

Code:
agent: 1
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off'
bios: ovmf
boot: order=sata0;ide2;net0
cores: 8
cpu: host,hidden=1,flags=+pcid
efidisk0: local-lvm:vm-104-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:08:00.0,pcie=1,x-vga=1
hugepages: 2
ide2: local:iso/Windows10.iso,media=cdrom,size=4725312K
machine: pc-q35-7.2
memory: 8192
meta: creation-qemu=7.2.0,ctime=1697566386
name: Windows10-Wohnzimmer
net0: e1000=0A:D1:C9:7E:E7:6B,bridge=vmbr0,firewall=1
numa: 1
ostype: win10
sata0: local-lvm:vm-104-disk-1,size=100G
scsihw: virtio-scsi-pci
smbios1: uuid=113f9316-3806-4c32-9216-923ea7d24f09
sockets: 1
usb0: host=145f:02f0
vga: none
vmgenid: 9ee3c758-39b4-45b6-990f-fb88e5569a8c

My Hardware in the Main PC:

Code:
CPU: AMD Ryzen 5 1400 Quad-Core Processor
GPU: GeForce GTX 1060 3GB
RAM: 16GB DDR4 3200 CL16 Corsair
Mainboard: ASUS PRIME B450M-A

Maybe this is also important for the audio passthrough:

Code:
45: PCI 800.1: 0403 Audio device
  [Created at pci.386]
  Unique ID: 7Wns.afGtLgFnDe7
  Parent ID: w+J7.0TU4LKoL980
  SysFS ID: /devices/pci0000:00/0000:00:03.1/0000:08:00.1
  SysFS BusID: 0000:08:00.1
  Hardware Class: sound
  Model: "nVidia GP106 High Definition Audio Controller"
  Vendor: pci 0x10de "nVidia Corporation"
  Device: pci 0x10f1 "GP106 High Definition Audio Controller"
  SubVendor: pci 0x10de "nVidia Corporation"
  SubDevice: pci 0x11c2
  Revision: 0xa1
  Driver: "vfio-pci"
  Driver Modules: "vfio_pci"
  Memory Range: 0xf6080000-0xf6083fff (rw,non-prefetchable)
  IRQ: 11 (no events)
  Module Alias: "pci:v000010DEd000010F1sv000010DEsd000011C2bc04sc03i00"
  Driver Info #0:
    Driver Status: snd_hda_intel is active
    Driver Activation Cmd: "modprobe snd_hda_intel"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #17 (PCI bridge)

46: PCI a00.3: 0403 Audio device
  [Created at pci.386]
  Unique ID: Dt9q.ul4ScaGyp28
  Parent ID: JZZT.XtQqpuv2hW0
  SysFS ID: /devices/pci0000:00/0000:00:08.1/0000:0a:00.3
  SysFS BusID: 0000:0a:00.3
  Hardware Class: sound
  Model: "AMD Family 17h (Models 00h-0fh) HD Audio Controller"
  Vendor: pci 0x1022 "AMD"
  Device: pci 0x1457 "Family 17h (Models 00h-0fh) HD Audio Controller"
  SubVendor: pci 0x1043 "ASUSTeK Computer Inc."
  SubDevice: pci 0x86c7
  Driver: "snd_hda_intel"
  Driver Modules: "snd_hda_intel"
  Memory Range: 0xf6700000-0xf6707fff (rw,non-prefetchable)
  IRQ: 50 (844 events)
  Module Alias: "pci:v00001022d00001457sv00001043sd000086C7bc04sc03i00"
  Driver Info #0:
    Driver Status: snd_hda_intel is active
    Driver Activation Cmd: "modprobe snd_hda_intel"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #39 (PCI bridge)

I would greatly appreciate your assistance and advice on how to resolve these issues. Thank you in advance for your support!

Best regards,
Partikel (Luca)
 
Here is some more information from the log after shutting down the VM and trying to start it again:
Code:
Oct 22 14:59:35 pve01 qmeventd[159142]: Finished cleanup for 104
Oct 22 14:59:35 pve01 pvedaemon[1099]: <root@pam> end task UPID:pve01:00026D65:0065283A:65351CAB:qmshutdown:104:root@pam: OK
Oct 22 14:59:35 pve01 kernel: vfio-pci 0000:08:00.0: not ready 1023ms after bus reset; waiting
Oct 22 14:59:36 pve01 kernel: vfio-pci 0000:08:00.0: not ready 2047ms after bus reset; waiting
Oct 22 14:59:39 pve01 kernel: vfio-pci 0000:08:00.0: not ready 4095ms after bus reset; waiting
Oct 22 14:59:43 pve01 kernel: vfio-pci 0000:08:00.0: not ready 8191ms after bus reset; waiting
Oct 22 14:59:52 pve01 kernel: vfio-pci 0000:08:00.0: not ready 16383ms after bus reset; waiting
Oct 22 14:59:57 pve01 pvedaemon[159209]: start VM 104: UPID:pve01:00026DE9:006535A0:65351CCD:qmstart:104:root@pam:
Oct 22 14:59:57 pve01 pvedaemon[1099]: <root@pam> starting task UPID:pve01:00026DE9:006535A0:65351CCD:qmstart:104:root@pam:
Oct 22 15:00:08 pve01 kernel: vfio-pci 0000:08:00.0: not ready 32767ms after bus reset; waiting
Oct 22 15:00:43 pve01 kernel: vfio-pci 0000:08:00.0: not ready 65535ms after bus reset; giving up
Oct 22 15:00:43 pve01 kernel: vfio-pci 0000:08:00.1: Unable to change power state from D0 to D3hot, device inaccessible
Oct 22 15:00:43 pve01 kernel: vfio-pci 0000:08:00.0: Unable to change power state from D0 to D3hot, device inaccessible
Oct 22 15:00:44 pve01 systemd[1]: 104.scope: Deactivated successfully.
Oct 22 15:00:44 pve01 systemd[1]: 104.scope: Consumed 9min 4.773s CPU time.
Oct 22 15:00:45 pve01 systemd[1]: Started 104.scope.
Oct 22 15:00:45 pve01 kernel: device tap104i0 entered promiscuous mode
Oct 22 15:00:45 pve01 kernel: vmbr0: port 2(fwpr104p0) entered blocking state
Oct 22 15:00:45 pve01 kernel: vmbr0: port 2(fwpr104p0) entered disabled state
Oct 22 15:00:45 pve01 kernel: device fwpr104p0 entered promiscuous mode
Oct 22 15:00:45 pve01 kernel: vmbr0: port 2(fwpr104p0) entered blocking state
Oct 22 15:00:45 pve01 kernel: vmbr0: port 2(fwpr104p0) entered forwarding state
Oct 22 15:00:45 pve01 kernel: fwbr104i0: port 1(fwln104i0) entered blocking state
Oct 22 15:00:45 pve01 kernel: fwbr104i0: port 1(fwln104i0) entered disabled state
Oct 22 15:00:45 pve01 kernel: device fwln104i0 entered promiscuous mode
Oct 22 15:00:45 pve01 kernel: fwbr104i0: port 1(fwln104i0) entered blocking state
Oct 22 15:00:45 pve01 kernel: fwbr104i0: port 1(fwln104i0) entered forwarding state
Oct 22 15:00:45 pve01 kernel: fwbr104i0: port 2(tap104i0) entered blocking state
Oct 22 15:00:45 pve01 kernel: fwbr104i0: port 2(tap104i0) entered disabled state
Oct 22 15:00:45 pve01 kernel: fwbr104i0: port 2(tap104i0) entered blocking state
Oct 22 15:00:45 pve01 kernel: fwbr104i0: port 2(tap104i0) entered forwarding state
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_cap_init: hiding cap 0xff@0xff
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_cap_init: hiding cap 0xff@0xff
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_cap_init: hiding cap 0xff@0xff
++
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_cap_init: hiding cap 0xff@0xff
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_cap_init: hiding cap 0xff@0xff
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_cap_init: hiding cap 0xff@0xff
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0x100
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
+++
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0xffff@0xffc
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.1: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.0: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:00:46 pve01 kernel: vfio-pci 0000:08:00.1: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:00:48 pve01 kernel: vfio-pci 0000:08:00.0: not ready 1023ms after bus reset; waiting
Oct 22 15:00:48 pve01 pvedaemon[1099]: VM 104 qmp command failed - VM 104 not running
Oct 22 15:00:49 pve01 kernel: vfio-pci 0000:08:00.0: not ready 2047ms after bus reset; waiting
Oct 22 15:00:51 pve01 kernel: vfio-pci 0000:08:00.0: not ready 4095ms after bus reset; waiting
Oct 22 15:00:55 pve01 kernel: vfio-pci 0000:08:00.0: not ready 8191ms after bus reset; waiting
Oct 22 15:01:04 pve01 kernel: vfio-pci 0000:08:00.0: not ready 16383ms after bus reset; waiting
Oct 22 15:01:22 pve01 kernel: vfio-pci 0000:08:00.0: not ready 32767ms after bus reset; waiting
Oct 22 15:01:57 pve01 kernel: vfio-pci 0000:08:00.0: not ready 65535ms after bus reset; giving up
Oct 22 15:01:57 pve01 kernel: vfio-pci 0000:08:00.1: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:01:57 pve01 kernel: vfio-pci 0000:08:00.0: Unable to change power state from D3cold to D0, device inaccessible
Oct 22 15:01:57 pve01 kernel: fwbr104i0: port 2(tap104i0) entered disabled state
Oct 22 15:01:57 pve01 kernel: fwbr104i0: port 2(tap104i0) entered disabled state
Oct 22 15:01:57 pve01 pvedaemon[159209]: start failed: QEMU exited with code 1
Oct 22 15:01:57 pve01 pvedaemon[1099]: <root@pam> end task UPID:pve01:00026DE9:006535A0:65351CCD:qmstart:104:root@pam: start failed: QEMU exited with code 1

Best regards,
Partikel (Luca)
 
It looks like the GPU cannot be reset because the audio function is not in the same IOMMU group (a side effect of pcie_acs_override) and/or because you pass only the VGA function of the GPU to the VM and not the audio function, which is literally what the error message says. Change hostpci0: 0000:08:00.0,pcie=1,x-vga=1 to hostpci0: 0000:08:00,pcie=1,x-vga=1 so that all functions of the GPU device are passed to the VM.
PS: There is lots of useless stuff in your kernel parameters but you can find my remarks on that in several threads of this forum.
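
The difference between the two hostpci0 values can be illustrated with a small Python sketch. It only looks at the address part of the value and reports sibling functions of the same slot that a function-specific entry leaves out. The helper name missing_functions is hypothetical, and the parsing is an assumption that covers only the simple single-address form used in this thread (optionally prefixed with host=), not every variant Proxmox accepts.

```python
def missing_functions(hostpci_value, present):
    """Return sibling PCI functions that a hostpci entry does not cover.

    hostpci_value: the text after "hostpciN:", e.g. "0000:08:00.0,pcie=1,x-vga=1".
    present: PCI addresses that exist on the host, e.g. from lspci -D.
    An address without a ".<function>" suffix is taken to cover all functions.
    """
    addr = hostpci_value.split(",")[0].strip()
    if addr.startswith("host="):
        addr = addr[len("host="):]
    if addr.count(":") == 1:
        addr = "0000:" + addr  # add the default PCI domain
    slot, _, func = addr.partition(".")
    if not func:
        return []  # whole slot is passed, nothing is left out
    return sorted(a for a in present if a.startswith(slot + ".") and a != addr)


if __name__ == "__main__":
    gpu = ["0000:08:00.0", "0000:08:00.1"]
    print(missing_functions("0000:08:00.0,pcie=1,x-vga=1", gpu))
    print(missing_functions("0000:08:00,pcie=1,x-vga=1", gpu))
```

With the original entry the audio function 0000:08:00.1 is reported as left out; with the shortened address nothing is.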
 
Hello leesteken,

I wanted to take a moment to express my appreciation for your engagement and willingness to assist. I have already tried several suggestions from your replies in other threads, with no luck :(

Could you help me find the right kernel parameters? There is a lot of useless stuff in there after two weeks of trial and error xD

I removed pcie_acs_override and changed hostpci0 to this:
hostpci0: 0000:08:00,pcie=1,x-vga=1

and now there is this error:

Code:
Oct 22 16:54:57 pve01 pvedaemon[1389]: start VM 104: UPID:pve01:0000056D:00001B26:653537C1:qmstart:104:root@pam:
Oct 22 16:54:57 pve01 pvedaemon[1098]: <root@pam> starting task UPID:pve01:0000056D:00001B26:653537C1:qmstart:104:root@pam:
Oct 22 16:54:58 pve01 systemd[1]: Created slice qemu.slice - Slice /qemu.
Oct 22 16:54:58 pve01 systemd[1]: Started 104.scope.
Oct 22 16:54:59 pve01 kernel: device tap104i0 entered promiscuous mode
Oct 22 16:54:59 pve01 kernel: vmbr0: port 2(fwpr104p0) entered blocking state
Oct 22 16:54:59 pve01 kernel: vmbr0: port 2(fwpr104p0) entered disabled state
Oct 22 16:54:59 pve01 kernel: device fwpr104p0 entered promiscuous mode
Oct 22 16:54:59 pve01 kernel: vmbr0: port 2(fwpr104p0) entered blocking state
Oct 22 16:54:59 pve01 kernel: vmbr0: port 2(fwpr104p0) entered forwarding state
Oct 22 16:54:59 pve01 kernel: fwbr104i0: port 1(fwln104i0) entered blocking state
Oct 22 16:54:59 pve01 kernel: fwbr104i0: port 1(fwln104i0) entered disabled state
Oct 22 16:54:59 pve01 kernel: device fwln104i0 entered promiscuous mode
Oct 22 16:54:59 pve01 kernel: fwbr104i0: port 1(fwln104i0) entered blocking state
Oct 22 16:54:59 pve01 kernel: fwbr104i0: port 1(fwln104i0) entered forwarding state
Oct 22 16:54:59 pve01 kernel: fwbr104i0: port 2(tap104i0) entered blocking state
Oct 22 16:54:59 pve01 kernel: fwbr104i0: port 2(tap104i0) entered disabled state
Oct 22 16:54:59 pve01 kernel: fwbr104i0: port 2(tap104i0) entered blocking state
Oct 22 16:54:59 pve01 kernel: fwbr104i0: port 2(tap104i0) entered forwarding state
Oct 22 16:55:00 pve01 kernel: vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
Oct 22 16:55:00 pve01 kernel: vfio-pci 0000:08:00.0: No more image in the PCI ROM
Oct 22 16:55:01 pve01 kernel: vfio-pci 0000:08:00.0: not ready 1023ms after bus reset; waiting
Oct 22 16:55:02 pve01 kernel: vfio-pci 0000:08:00.0: not ready 2047ms after bus reset; waiting
Oct 22 16:55:05 pve01 kernel: vfio-pci 0000:08:00.0: not ready 4095ms after bus reset; waiting
Oct 22 16:55:07 pve01 pvestatd[1071]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - got timeout
Oct 22 16:55:07 pve01 pvestatd[1071]: status update time (8.186 seconds)
Oct 22 16:55:09 pve01 kernel: vfio-pci 0000:08:00.0: not ready 8191ms after bus reset; waiting
Oct 22 16:55:17 pve01 kernel: vfio-pci 0000:08:00.0: not ready 16383ms after bus reset; waiting
Oct 22 16:55:17 pve01 pvestatd[1071]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - unable to connect to VM 104 qmp socket - timeout after 51 retries
Oct 22 16:55:18 pve01 pvestatd[1071]: status update time (8.226 seconds)
Oct 22 16:55:27 pve01 pvestatd[1071]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - unable to connect to VM 104 qmp socket - timeout after 51 retries
Oct 22 16:55:27 pve01 pvestatd[1071]: status update time (8.248 seconds)
Oct 22 16:55:35 pve01 kernel: vfio-pci 0000:08:00.0: not ready 32767ms after bus reset; waiting
Oct 22 16:55:37 pve01 pvestatd[1071]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - unable to connect to VM 104 qmp socket - timeout after 51 retries
Oct 22 16:55:37 pve01 pvestatd[1071]: status update time (8.230 seconds)
Oct 22 16:55:47 pve01 pvestatd[1071]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - unable to connect to VM 104 qmp socket - timeout after 51 retries
Oct 22 16:55:47 pve01 pvestatd[1071]: status update time (8.244 seconds)
Oct 22 16:55:57 pve01 pvestatd[1071]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - unable to connect to VM 104 qmp socket - timeout after 51 retries
Oct 22 16:55:58 pve01 pvestatd[1071]: status update time (8.235 seconds)
Oct 22 16:55:59 pve01 pve-ha-crm[1105]: successfully acquired lock 'ha_manager_lock'
Oct 22 16:55:59 pve01 pve-ha-crm[1105]: watchdog active
Oct 22 16:55:59 pve01 pve-ha-crm[1105]: status change slave => master
Oct 22 16:56:07 pve01 pvestatd[1071]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - unable to connect to VM 104 qmp socket - timeout after 51 retries
Oct 22 16:56:07 pve01 pvestatd[1071]: status update time (8.247 seconds)
Oct 22 16:56:10 pve01 kernel: vfio-pci 0000:08:00.0: not ready 65535ms after bus reset; giving up
Oct 22 16:56:10 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:10 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:10 pve01 pvedaemon[1098]: <root@pam> end task UPID:pve01:0000056D:00001B26:653537C1:qmstart:104:root@pam: OK
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: No more image in the PCI ROM
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: No more image in the PCI ROM
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:15 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:16 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:16 pve01 kernel: vfio-pci 0000:08:00.1: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:16 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:16 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
Oct 22 16:56:16 pve01 kernel: vfio-pci 0000:08:00.0: vfio_bar_restore: reset recovery - restoring BARs
 
Looks like your GPU does not reset properly (and will probably only work once after a Proxmox host reboot). I don't know how to fix that for NVidia GPUs. Maybe someone else knows; there are threads on this forum about ROM patching and other work-arounds.
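
To confirm this from the host log without reading every line, a rough Python sketch like the following can pick out the devices for which the kernel gave up waiting after a bus reset. The regular expression is written against the message format seen in the logs in this thread and may need adjusting for other kernel versions; the function name failed_resets is just illustrative.

```python
import re

# Matches kernel lines such as:
#   vfio-pci 0000:08:00.0: not ready 65535ms after bus reset; giving up
RESET_RE = re.compile(
    r"vfio-pci (?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"not ready (?P<ms>\d+)ms after bus reset; (?P<state>waiting|giving up)"
)


def failed_resets(lines):
    """Return the PCI addresses for which the kernel gave up after a bus reset."""
    failed = []
    for line in lines:
        match = RESET_RE.search(line)
        if match and match.group("state") == "giving up":
            addr = match.group("addr")
            if addr not in failed:
                failed.append(addr)
    return failed


if __name__ == "__main__":
    sample = [
        "Oct 22 16:55:35 pve01 kernel: vfio-pci 0000:08:00.0: not ready 32767ms after bus reset; waiting",
        "Oct 22 16:56:10 pve01 kernel: vfio-pci 0000:08:00.0: not ready 65535ms after bus reset; giving up",
    ]
    print(failed_resets(sample))
```

The list passed in could just as well be the lines of a saved kernel log, for example the output of journalctl -k -b written to a file.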
 
This happened to me as well. In my case there were two causes:
1. Resizable BAR was enabled in the BIOS.
2. The PCIe slot was set to "raid" mode (x4/x4/x4/x4 bifurcation) instead of x16.
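
The second cause can be checked from the host side: Linux exposes the negotiated and maximum PCIe link width of a device in sysfs (current_link_width and max_link_width). The Python sketch below simply reads those two attributes; the default address 0000:08:00.0 is the GPU from this thread, and the function name link_width is illustrative. It does not detect Resizable BAR, which still has to be checked in the BIOS.

```python
from pathlib import Path


def link_width(addr, root="/sys/bus/pci/devices"):
    """Return (current, maximum) PCIe link width for a device, or None where unreadable."""
    dev = Path(root) / addr

    def read(name):
        try:
            return int((dev / name).read_text().strip())
        except (OSError, ValueError):
            return None

    return read("current_link_width"), read("max_link_width")


if __name__ == "__main__":
    current, maximum = link_width("0000:08:00.0")
    print(f"current x{current}, maximum x{maximum}")
```

A GPU in an x16 slot that reports a current width of 4 would be consistent with the slot being bifurcated.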
 
