PCI passthrough: UEFI VM does not boot with 8 cards

tangyuty

New Member
May 2, 2024
I am trying to get PCI passthrough working on my GPU PVE servers:
With UEFI boot and fewer than 8 Nvidia RTX 3090 cards, both Win10 and Ubuntu 22.04 guests boot fine without any issue; with 8 cards they will not boot from disk (no bootable disk is found), for both Win10 and Ubuntu.

I tried:
  1. Turning off Secure Boot for the guest VM when starting it
  2. Switching the display mode: Standard VGA, SPICE, VirtIO-GPU

I assume the RTX 3090 supports UEFI, because 7 cards or fewer work fine.

I tested SeaBIOS with 8 cards, and both Win10 and Ubuntu work fine.

I have been struggling with this issue for a long time without finding any further clues. Below is my setup; could someone please offer some advice?


Code:
pveversion -v
proxmox-ve: 8.2.0 (running kernel: 6.8.4-2-pve)
pve-manager: 8.2.2 (running version: 8.2.2/9355359cd7afbae4)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.4-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.6
libpve-cluster-perl: 8.0.6
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.1
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.2.1
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.0-1
proxmox-backup-file-restore: 3.2.0-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.6
proxmox-widget-toolkit: 4.2.1
pve-cluster: 8.0.6
pve-container: 5.0.10
pve-docs: 8.2.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.0
pve-firewall: 5.0.5
pve-firmware: 3.11-1
pve-ha-manager: 4.0.4
pve-i18n: 3.2.2
pve-qemu-kvm: 8.1.5-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.3-pve2

Code:
root@node8:~# qm config 102
agent: 1
bios: ovmf
boot: order=scsi0;net0
cores: 8
cpu: host
efidisk0: vm1:102/vm-102-disk-1.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
hostpci0: mapping=RTX3090,pcie=1,x-vga=1
hostpci1: mapping=RTX3090,pcie=1
hostpci2: mapping=RTX3090,pcie=1
hostpci3: mapping=RTX3090,pcie=1
hostpci4: mapping=RTX3090,pcie=1
hostpci5: mapping=RTX3090,pcie=1
hostpci6: mapping=RTX3090,pcie=1
hostpci7: mapping=RTX3090,pcie=1
machine: pc-q35-8.1
memory: 32768
meta: creation-qemu=8.1.5,ctime=1716211018
name: windows-10-nvidia-base
net0: virtio=BC:24:11:9C:91:37,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
scsi0: vm1:102/vm-102-disk-0.qcow2,cache=writeback,discard=on,iothread=1,size=50G
scsihw: virtio-scsi-single
smbios1: uuid=621e908a-41d2-472f-b775-a94ebacbf5ba
sockets: 1
tablet: 1
tpmstate0: vm1:102/vm-102-disk-0.raw,size=4M,version=v2.0
vga: std
vmgenid: bf6a964b-21d5-49a9-b2b1-3830b84553ae

Code:
## /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

## /etc/modules:
vfio
vfio_iommu_type1
vfio_pci

## /etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1 report_ignored_msrs=0

## /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:2204,10de:1aef
softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci
softdep nvidiafb pre: vfio-pci
softdep nvidia_drm pre: vfio-pci
softdep drm pre: vfio-pci


Host machine:
1. Boots in UEFI mode
2. Intel VT for Directed I/O, ACS control, and interrupt remapping enabled
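
For completeness, a generic way to double-check on the host that the IOMMU and interrupt remapping are actually active, and how the cards are grouped (not specific to this problem, just a sanity check):

Code:
# look for messages showing the IOMMU is enabled and interrupt remapping is active
dmesg | grep -e DMAR -e IOMMU
# list all IOMMU groups and the devices assigned to them
find /sys/kernel/iommu_groups/ -type l | sort -V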
 
mhmm... sadly we don't have as many gpus here for testing, but imho that should work. can you boot with a live cd in the guest in that case?
or what error do you get exactly? (a screenshot might be helpful)

also can you post the output of the command

Code:
qm showcmd ID --pretty
?

and post the journal of the host while trying to boot the guest?
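
for example, something along these lines (with ID replaced by the actual VM ID), so the journal covers exactly the start attempt:

Code:
# shell 1: follow the host journal (kernel messages included)
journalctl -f
# shell 2: start the guest and note the timestamp
qm start ID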
 
Thanks again, @dcsapak.
This time I tried 8 RTX 3060 cards (which have less memory, 12G each) with Ubuntu.
Booting and installing from the live CD works fine, but after the installation finished, the VM cannot find a bootable disk.
(screenshot attached: 1724761777435.png)


Code:
qm showcmd 140 --pretty
/usr/bin/kvm \
  -id 140 \
  -name 'ubuntu-uefi6,debug-threads=on' \
  -no-shutdown \
  -chardev 'socket,id=qmp,path=/var/run/qemu-server/140.qmp,server=on,wait=off' \
  -mon 'chardev=qmp,mode=control' \
  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
  -mon 'chardev=qmp-event,mode=control' \
  -pidfile /var/run/qemu-server/140.pid \
  -daemonize \
  -smbios 'type=1,uuid=b16ea5a0-38eb-4e91-bd7f-eecff382150c' \
  -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE_4M.secboot.fd' \
  -drive 'if=pflash,unit=1,id=drive-efidisk0,format=qcow2,file=/mnt/pve/vm/images/140/vm-140-disk-0.qcow2' \
  -smp '32,sockets=2,cores=16,maxcpus=32' \
  -nodefaults \
  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
  -vnc 'unix:/var/run/qemu-server/140.vnc,password=on' \
  -cpu 'host,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt' \
  -m 131072 \
  -object 'iothread,id=iothread-virtioscsi0' \
  -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg \
  -device 'vmgenid,guid=4cf30955-5536-4b79-8208-420c98d8344d' \
  -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' \
  -device 'vfio-pci,host=0000:01:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:01:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' \
  -device 'vfio-pci,host=0000:02:00.0,id=hostpci1.0,bus=ich9-pcie-port-2,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:02:00.1,id=hostpci1.1,bus=ich9-pcie-port-2,addr=0x0.1' \
  -device 'vfio-pci,host=0000:41:00.0,id=hostpci2.0,bus=ich9-pcie-port-3,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:41:00.1,id=hostpci2.1,bus=ich9-pcie-port-3,addr=0x0.1' \
  -device 'vfio-pci,host=0000:42:00.0,id=hostpci3.0,bus=ich9-pcie-port-4,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:42:00.1,id=hostpci3.1,bus=ich9-pcie-port-4,addr=0x0.1' \
  -device 'pcie-root-port,id=ich9-pcie-port-5,addr=10.0,x-speed=16,x-width=32,multifunction=on,bus=pcie.0,port=5,chassis=5' \
  -device 'vfio-pci,host=0000:81:00.0,id=hostpci4.0,bus=ich9-pcie-port-5,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:81:00.1,id=hostpci4.1,bus=ich9-pcie-port-5,addr=0x0.1' \
  -device 'pcie-root-port,id=ich9-pcie-port-6,addr=10.1,x-speed=16,x-width=32,multifunction=on,bus=pcie.0,port=6,chassis=6' \
  -device 'vfio-pci,host=0000:82:00.0,id=hostpci5.0,bus=ich9-pcie-port-6,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:82:00.1,id=hostpci5.1,bus=ich9-pcie-port-6,addr=0x0.1' \
  -device 'pcie-root-port,id=ich9-pcie-port-7,addr=10.2,x-speed=16,x-width=32,multifunction=on,bus=pcie.0,port=7,chassis=7' \
  -device 'vfio-pci,host=0000:c4:00.0,id=hostpci6.0,bus=ich9-pcie-port-7,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:c4:00.1,id=hostpci6.1,bus=ich9-pcie-port-7,addr=0x0.1' \
  -device 'pcie-root-port,id=ich9-pcie-port-8,addr=10.3,x-speed=16,x-width=32,multifunction=on,bus=pcie.0,port=8,chassis=8' \
  -device 'vfio-pci,host=0000:c5:00.0,id=hostpci7.0,bus=ich9-pcie-port-8,addr=0x0.0,multifunction=on' \
  -device 'vfio-pci,host=0000:c5:00.1,id=hostpci7.1,bus=ich9-pcie-port-8,addr=0x0.1' \
  -device 'VGA,id=vga,bus=pcie.0,addr=0x1' \
  -chardev 'socket,path=/var/run/qemu-server/140.qga,server=on,wait=off,id=qga0' \
  -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
  -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:c1cdf5984fb' \
  -drive 'file=/var/lib/vz/template/iso/ubuntu-22.04.4-desktop-amd64.iso,if=none,id=drive-ide2,media=cdrom,aio=io_uring' \
  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=101' \
  -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1,iothread=iothread-virtioscsi0' \
  -drive 'file=/mnt/pve/vm/images/140/vm-140-disk-1.qcow2,if=none,id=drive-scsi0,cache=writeback,format=qcow2,aio=io_uring,detect-zeroes=on' \
  -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' \
  -netdev 'type=tap,id=net0,ifname=tap140i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
  -device 'virtio-net-pci,mac=BC:24:11:CD:58:2B,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=102' \
  -machine 'type=q35+pve0' \
  -fw_cfg 'name=opt/ovmf/X-PciMmio64Mb,string=65536'

journal log:
Code:
## journalctl -f
Aug 27 20:37:21 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:37:23 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:38:12 node12 pvedaemon[2106]: <root@pam> update VM 140: -hostpci7 mapping=RTX3060,pcie=1
Aug 27 20:38:14 node12 pvedaemon[2643]: start VM 140: UPID:node12:00000A53:000032C6:66CDC8B6:qmstart:140:root@pam:
Aug 27 20:38:14 node12 pvedaemon[2106]: <root@pam> starting task UPID:node12:00000A53:000032C6:66CDC8B6:qmstart:140:root@pam:
Aug 27 20:38:15 node12 systemd[1]: Stopped target sound.target - Sound Card.
Aug 27 20:38:16 node12 systemd[1]: Created slice qemu.slice - Slice /qemu.
Aug 27 20:38:16 node12 systemd[1]: Started 140.scope.
Aug 27 20:38:16 node12 kernel: tap140i0: entered promiscuous mode
Aug 27 20:38:16 node12 kernel: vmbr0: port 2(fwpr140p0) entered blocking state
Aug 27 20:38:16 node12 kernel: vmbr0: port 2(fwpr140p0) entered disabled state
Aug 27 20:38:16 node12 kernel: fwpr140p0: entered allmulticast mode
Aug 27 20:38:16 node12 kernel: fwpr140p0: entered promiscuous mode
Aug 27 20:38:16 node12 kernel: vmbr0: port 2(fwpr140p0) entered blocking state
Aug 27 20:38:16 node12 kernel: vmbr0: port 2(fwpr140p0) entered forwarding state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 1(fwln140i0) entered blocking state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 1(fwln140i0) entered disabled state
Aug 27 20:38:16 node12 kernel: fwln140i0: entered allmulticast mode
Aug 27 20:38:16 node12 kernel: fwln140i0: entered promiscuous mode
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 1(fwln140i0) entered blocking state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 1(fwln140i0) entered forwarding state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 2(tap140i0) entered blocking state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 2(tap140i0) entered disabled state
Aug 27 20:38:16 node12 kernel: tap140i0: entered allmulticast mode
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 2(tap140i0) entered blocking state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 2(tap140i0) entered forwarding state
Aug 27 20:38:25 node12 pvedaemon[2106]: VM 140 qmp command failed - VM 140 qmp command 'query-proxmox-support' failed - unable to connect to VM 140 qmp socket - timeout after 51 retries
Aug 27 20:38:26 node12 pvestatd[2081]: VM 140 qmp command failed - VM 140 qmp command 'query-proxmox-support' failed - unable to connect to VM 140 qmp socket - timeout after 51 retries
Aug 27 20:38:27 node12 pvestatd[2081]: status update time (8.225 seconds)
Aug 27 20:38:31 node12 kernel: vfio-pci 0000:01:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:02:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:41:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:42:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:81:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:82:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:c4:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:c5:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:35 node12 pvedaemon[2106]: <root@pam> end task UPID:node12:00000A53:000032C6:66CDC8B6:qmstart:140:root@pam: OK
Aug 27 20:38:36 node12 pvestatd[2081]: status update time (8.050 seconds)
Aug 27 20:38:36 node12 pvedaemon[2105]: <root@pam> starting task UPID:node12:00000BC2:00003B28:66CDC8CC:vncproxy:140:root@pam:
Aug 27 20:38:36 node12 pvedaemon[3010]: starting vnc proxy UPID:node12:00000BC2:00003B28:66CDC8CC:vncproxy:140:root@pam:
Aug 27 20:39:02 node12 pmxcfs[1968]: [status] notice: received log


kernel log:
Code:
# journalctl -k -f
Aug 27 20:38:16 node12 kernel: tap140i0: entered promiscuous mode
Aug 27 20:38:16 node12 kernel: vmbr0: port 2(fwpr140p0) entered blocking state
Aug 27 20:38:16 node12 kernel: vmbr0: port 2(fwpr140p0) entered disabled state
Aug 27 20:38:16 node12 kernel: fwpr140p0: entered allmulticast mode
Aug 27 20:38:16 node12 kernel: fwpr140p0: entered promiscuous mode
Aug 27 20:38:16 node12 kernel: vmbr0: port 2(fwpr140p0) entered blocking state
Aug 27 20:38:16 node12 kernel: vmbr0: port 2(fwpr140p0) entered forwarding state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 1(fwln140i0) entered blocking state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 1(fwln140i0) entered disabled state
Aug 27 20:38:16 node12 kernel: fwln140i0: entered allmulticast mode
Aug 27 20:38:16 node12 kernel: fwln140i0: entered promiscuous mode
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 1(fwln140i0) entered blocking state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 1(fwln140i0) entered forwarding state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 2(tap140i0) entered blocking state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 2(tap140i0) entered disabled state
Aug 27 20:38:16 node12 kernel: tap140i0: entered allmulticast mode
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 2(tap140i0) entered blocking state
Aug 27 20:38:16 node12 kernel: fwbr140i0: port 2(tap140i0) entered forwarding state
Aug 27 20:38:31 node12 kernel: vfio-pci 0000:01:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:02:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:41:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:42:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:81:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:82:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:c4:00.0: enabling device (0000 -> 0003)
Aug 27 20:38:32 node12 kernel: vfio-pci 0000:c5:00.0: enabling device (0000 -> 0003)
 
If I remove one card, I can see the boot disk in the BIOS and it boots fine.

(screenshot attached: 1724762876783.png)

Code:
## journalctl -f
Aug 27 20:39:13 node12 pvedaemon[2105]: <root@pam> end task UPID:node12:00000BC2:00003B28:66CDC8CC:vncproxy:140:root@pam: OK
Aug 27 20:39:16 node12 pvedaemon[2106]: <root@pam> update VM 140: -delete hostpci7
Aug 27 20:39:19 node12 pvedaemon[3198]: stop VM 140: UPID:node12:00000C7E:00004C37:66CDC8F7:qmstop:140:root@pam:
Aug 27 20:39:19 node12 pvedaemon[2105]: <root@pam> starting task UPID:node12:00000C7E:00004C37:66CDC8F7:qmstop:140:root@pam:
Aug 27 20:39:20 node12 kernel: tap140i0: left allmulticast mode
Aug 27 20:39:20 node12 kernel: fwbr140i0: port 2(tap140i0) entered disabled state
Aug 27 20:39:20 node12 kernel: fwbr140i0: port 1(fwln140i0) entered disabled state
Aug 27 20:39:20 node12 kernel: vmbr0: port 2(fwpr140p0) entered disabled state
Aug 27 20:39:20 node12 kernel: fwln140i0 (unregistering): left allmulticast mode
Aug 27 20:39:20 node12 kernel: fwln140i0 (unregistering): left promiscuous mode
Aug 27 20:39:20 node12 kernel: fwbr140i0: port 1(fwln140i0) entered disabled state
Aug 27 20:39:20 node12 kernel: fwpr140p0 (unregistering): left allmulticast mode
Aug 27 20:39:20 node12 kernel: fwpr140p0 (unregistering): left promiscuous mode
Aug 27 20:39:20 node12 kernel: vmbr0: port 2(fwpr140p0) entered disabled state
Aug 27 20:39:20 node12 qmeventd[1689]: read: Connection reset by peer
Aug 27 20:39:20 node12 pvedaemon[2105]: <root@pam> end task UPID:node12:00000C7E:00004C37:66CDC8F7:qmstop:140:root@pam: OK
Aug 27 20:39:20 node12 qmeventd[3218]: Starting cleanup for 140
Aug 27 20:39:20 node12 qmeventd[3218]: Finished cleanup for 140
Aug 27 20:39:25 node12 systemd[1]: 140.scope: Deactivated successfully.
Aug 27 20:39:25 node12 systemd[1]: 140.scope: Consumed 1min 53.589s CPU time.
Aug 27 20:39:26 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:39:27 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:39:27 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:39:37 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:39:38 node12 pvedaemon[3256]: start VM 140: UPID:node12:00000CB8:00005359:66CDC90A:qmstart:140:root@pam:
Aug 27 20:39:38 node12 pvedaemon[2106]: <root@pam> starting task UPID:node12:00000CB8:00005359:66CDC90A:qmstart:140:root@pam:
Aug 27 20:39:38 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:39:39 node12 systemd[1]: Started 140.scope.
Aug 27 20:39:39 node12 kernel: tap140i0: entered promiscuous mode
Aug 27 20:39:39 node12 kernel: vmbr0: port 2(fwpr140p0) entered blocking state
Aug 27 20:39:39 node12 kernel: vmbr0: port 2(fwpr140p0) entered disabled state
Aug 27 20:39:39 node12 kernel: fwpr140p0: entered allmulticast mode
Aug 27 20:39:39 node12 kernel: fwpr140p0: entered promiscuous mode
Aug 27 20:39:39 node12 kernel: vmbr0: port 2(fwpr140p0) entered blocking state
Aug 27 20:39:39 node12 kernel: vmbr0: port 2(fwpr140p0) entered forwarding state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 1(fwln140i0) entered blocking state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 1(fwln140i0) entered disabled state
Aug 27 20:39:39 node12 kernel: fwln140i0: entered allmulticast mode
Aug 27 20:39:39 node12 kernel: fwln140i0: entered promiscuous mode
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 1(fwln140i0) entered blocking state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 1(fwln140i0) entered forwarding state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 2(tap140i0) entered blocking state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 2(tap140i0) entered disabled state
Aug 27 20:39:39 node12 kernel: tap140i0: entered allmulticast mode
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 2(tap140i0) entered blocking state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 2(tap140i0) entered forwarding state
Aug 27 20:39:40 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:39:48 node12 pvedaemon[2107]: VM 140 qmp command failed - VM 140 qmp command 'query-proxmox-support' failed - unable to connect to VM 140 qmp socket - timeout after 51 retries
Aug 27 20:39:48 node12 pvedaemon[2105]: VM 140 qmp command failed - VM 140 qmp command 'query-proxmox-support' failed - unable to connect to VM 140 qmp socket - timeout after 51 retries
Aug 27 20:39:49 node12 pvedaemon[3603]: starting vnc proxy UPID:node12:00000E13:000057A4:66CDC915:vncproxy:140:root@pam:
Aug 27 20:39:49 node12 pvedaemon[2107]: <root@pam> starting task UPID:node12:00000E13:000057A4:66CDC915:vncproxy:140:root@pam:
Aug 27 20:39:49 node12 pveproxy[2120]: proxy detected vanished client connection
Aug 27 20:39:56 node12 pvestatd[2081]: VM 140 qmp command failed - VM 140 qmp command 'query-proxmox-support' failed - unable to connect to VM 140 qmp socket - timeout after 51 retries
Aug 27 20:39:56 node12 pvestatd[2081]: status update time (8.229 seconds)
Aug 27 20:39:57 node12 pvedaemon[2107]: VM 140 qmp command failed - VM 140 qmp command 'query-proxmox-support' failed - unable to connect to VM 140 qmp socket - timeout after 51 retries
Aug 27 20:39:57 node12 pvedaemon[3620]: starting vnc proxy UPID:node12:00000E24:00005AE4:66CDC91D:vncproxy:140:root@pam:
Aug 27 20:39:57 node12 pvedaemon[2106]: <root@pam> starting task UPID:node12:00000E24:00005AE4:66CDC91D:vncproxy:140:root@pam:
Aug 27 20:39:58 node12 pvedaemon[2106]: <root@pam> end task UPID:node12:00000CB8:00005359:66CDC90A:qmstart:140:root@pam: OK
Aug 27 20:39:59 node12 pvedaemon[3603]: connection timed out
Aug 27 20:39:59 node12 pvedaemon[2107]: <root@pam> end task UPID:node12:00000E13:000057A4:66CDC915:vncproxy:140:root@pam: connection timed out
Aug 27 20:39:59 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:40:42 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:40:42 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:40:42 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:40:50 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:40:51 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:40:53 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:41:13 node12 pmxcfs[1968]: [dcdb] notice: data verification successful
Aug 27 20:41:25 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:41:43 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:41:43 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:41:43 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:42:02 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:42:03 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:42:04 node12 pmxcfs[1968]: [status] notice: received log
Aug 27 20:44:04 node12 pmxcfs[1968]: [status] notice: received log

Code:
# journalctl -k -f
Aug 27 20:39:20 node12 kernel: tap140i0: left allmulticast mode
Aug 27 20:39:20 node12 kernel: fwbr140i0: port 2(tap140i0) entered disabled state
Aug 27 20:39:20 node12 kernel: fwbr140i0: port 1(fwln140i0) entered disabled state
Aug 27 20:39:20 node12 kernel: vmbr0: port 2(fwpr140p0) entered disabled state
Aug 27 20:39:20 node12 kernel: fwln140i0 (unregistering): left allmulticast mode
Aug 27 20:39:20 node12 kernel: fwln140i0 (unregistering): left promiscuous mode
Aug 27 20:39:20 node12 kernel: fwbr140i0: port 1(fwln140i0) entered disabled state
Aug 27 20:39:20 node12 kernel: fwpr140p0 (unregistering): left allmulticast mode
Aug 27 20:39:20 node12 kernel: fwpr140p0 (unregistering): left promiscuous mode
Aug 27 20:39:20 node12 kernel: vmbr0: port 2(fwpr140p0) entered disabled state
Aug 27 20:39:39 node12 kernel: tap140i0: entered promiscuous mode
Aug 27 20:39:39 node12 kernel: vmbr0: port 2(fwpr140p0) entered blocking state
Aug 27 20:39:39 node12 kernel: vmbr0: port 2(fwpr140p0) entered disabled state
Aug 27 20:39:39 node12 kernel: fwpr140p0: entered allmulticast mode
Aug 27 20:39:39 node12 kernel: fwpr140p0: entered promiscuous mode
Aug 27 20:39:39 node12 kernel: vmbr0: port 2(fwpr140p0) entered blocking state
Aug 27 20:39:39 node12 kernel: vmbr0: port 2(fwpr140p0) entered forwarding state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 1(fwln140i0) entered blocking state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 1(fwln140i0) entered disabled state
Aug 27 20:39:39 node12 kernel: fwln140i0: entered allmulticast mode
Aug 27 20:39:39 node12 kernel: fwln140i0: entered promiscuous mode
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 1(fwln140i0) entered blocking state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 1(fwln140i0) entered forwarding state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 2(tap140i0) entered blocking state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 2(tap140i0) entered disabled state
Aug 27 20:39:39 node12 kernel: tap140i0: entered allmulticast mode
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 2(tap140i0) entered blocking state
Aug 27 20:39:39 node12 kernel: fwbr140i0: port 2(tap140i0) entered forwarding state
 
ok, weird, could you try the following: https://forum.proxmox.com/threads/multi-gpu-passthrough-4g-decoding-error.49479/

Code:
 qm set VMID -args '-global q35-pcihost.pci-hole64-size=2048G'

i'm not sure why ovmf does not see the device, but sadly i can't reproduce it here. i tested with 8 nics (with sr-iov) and that worked as expected, so i guess it has to do with e.g. the amount of memory of the gpus (or something else that works differently for gpus than nics on a firmware level)
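
as a rough check of whether the 64-bit MMIO window is the limit, one could compare the GPUs' BAR sizes against the 64G window the showcmd already sets via X-PciMmio64Mb=65536 (the PCI address below is taken from the showcmd output above; repeat for the other cards):

Code:
# show the BAR ("Region") sizes of one of the passed-through GPUs
lspci -vv -s 0000:01:00.0 | grep -i region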
 
I have tried that with no luck; could this be caused by a boot-related issue rather than by PCI passthrough itself?
 
well it seems ovmf does not see the disk if all 8 devices are passed through, but i can't think of a reason why that would be, aside from the things i already mentioned (regarding memory etc.)

you could test different things to narrow it down, e.g. temporarily remove the virtual nic, or change the disk controller or the bus (e.g. to sata), and see if anything makes a difference (see the sketch below for the corresponding commands)
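
something along these lines, assuming VM ID 140; the storage/volume name for the sata example is hypothetical and should be taken from your actual config:

Code:
# temporarily detach the virtual NIC
qm set 140 --delete net0
# or attach the disk via SATA instead of SCSI (detach it from scsi0 first)
qm set 140 --delete scsi0
qm set 140 --sata0 vm:140/vm-140-disk-1.qcow2
qm set 140 --boot order=sata0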

also is there anything of note in the start task log?
 
Thanks for the hints.

I switched the SCSI controller: both VirtIO SCSI and VMware PVSCSI work now!!! The original was VirtIO SCSI single.

No idea why this is related to disk virtualization.
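
For anyone hitting the same thing: the controller can also be switched from the CLI (assuming VM ID 140); a full stop/start is needed afterwards, and the iothread=1 flag on the disk may need to be dropped, since per-disk I/O threads expect the "single" controller variant:

Code:
# switch from VirtIO SCSI single to plain VirtIO SCSI
qm set 140 --scsihw virtio-scsi-pci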
 
