System Setup
Proxmox 8.0.4
Supermicro H12SSL
1 Nvidia 4090
3 Nvidia 3080
Machine q35
virt101 - 3080 PCI Device 0000:02:00
virt103 - 4090 PCI Device 0000:01:00
I had virt105 with two 3080s, PCI devices 0000:81:00 and 0000:82:00.
Everything worked great with that setup. I then shut down 105, cloned it to 106, and assigned 0000:81:00 to 105 and 0000:82:00 to 106. If I start 105 first, 106 will not start; if I swap the order and start 106 first, it comes up fine, but then 105 won't start.
Does anyone have any ideas? How can I get better logs from KVM or QEMU? This is what I see in the journal:
Code:
Sep 29 11:10:55 virt1 pvedaemon[667600]: start VM 106: UPID:virt1:000A2FD0:0CDF7EA5:6516E8FF:qmstart:106:root@pam:
Sep 29 11:10:55 virt1 pvedaemon[651968]: <root@pam> starting task UPID:virt1:000A2FD0:0CDF7EA5:6516E8FF:qmstart:106:root@pam:
Sep 29 11:10:56 virt1 systemd[1]: Started 106.scope.
Sep 29 11:10:56 virt1 systemd-udevd[617]: Configuration file /etc/udev/rules.d/60-persistent-storage-hptblock.rules is marked executable. Please remove executable permission bits. Proceeding anyway.
Sep 29 11:10:56 virt1 kernel: device tap106i0 entered promiscuous mode
Sep 29 11:10:56 virt1 kernel: vmbr2: port 8(tap106i0) entered blocking state
Sep 29 11:10:56 virt1 kernel: vmbr2: port 8(tap106i0) entered disabled state
Sep 29 11:10:56 virt1 kernel: vmbr2: port 8(tap106i0) entered blocking state
Sep 29 11:10:56 virt1 kernel: vmbr2: port 8(tap106i0) entered forwarding state
Sep 29 11:10:57 virt1 kernel: device tap106i1 entered promiscuous mode
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 1(fwln106i1) entered disabled state
Sep 29 11:10:57 virt1 kernel: vmbr1: port 5(fwpr106p1) entered disabled state
Sep 29 11:10:57 virt1 kernel: device fwln106i1 left promiscuous mode
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 1(fwln106i1) entered disabled state
Sep 29 11:10:57 virt1 kernel: device fwpr106p1 left promiscuous mode
Sep 29 11:10:57 virt1 kernel: vmbr1: port 5(fwpr106p1) entered disabled state
Sep 29 11:10:57 virt1 kernel: vmbr1: port 5(fwpr106p1) entered blocking state
Sep 29 11:10:57 virt1 kernel: vmbr1: port 5(fwpr106p1) entered disabled state
Sep 29 11:10:57 virt1 kernel: device fwpr106p1 entered promiscuous mode
Sep 29 11:10:57 virt1 kernel: vmbr1: port 5(fwpr106p1) entered blocking state
Sep 29 11:10:57 virt1 kernel: vmbr1: port 5(fwpr106p1) entered forwarding state
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 1(fwln106i1) entered blocking state
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 1(fwln106i1) entered disabled state
Sep 29 11:10:57 virt1 kernel: device fwln106i1 entered promiscuous mode
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 1(fwln106i1) entered blocking state
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 1(fwln106i1) entered forwarding state
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 2(tap106i1) entered blocking state
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 2(tap106i1) entered disabled state
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 2(tap106i1) entered blocking state
Sep 29 11:10:57 virt1 kernel: fwbr106i1: port 2(tap106i1) entered forwarding state
Sep 29 11:11:05 virt1 pvedaemon[651969]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:07 virt1 pvestatd[3884]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:07 virt1 pvestatd[3884]: status update time (8.078 seconds)
Sep 29 11:11:17 virt1 pvestatd[3884]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:17 virt1 pvestatd[3884]: status update time (8.083 seconds)
Sep 29 11:11:17 virt1 pvedaemon[668116]: stop VM 106: UPID:virt1:000A31D4:0CDF8720:6516E915:qmstop:106:root@pam:
Sep 29 11:11:17 virt1 pvedaemon[651969]: <root@pam> starting task UPID:virt1:000A31D4:0CDF8720:6516E915:qmstop:106:root@pam:
Sep 29 11:11:27 virt1 pvestatd[3884]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:27 virt1 pvestatd[3884]: status update time (8.090 seconds)
Sep 29 11:11:27 virt1 pvedaemon[668116]: can't lock file '/var/lock/qemu-server/lock-106.conf' - got timeout
Sep 29 11:11:27 virt1 pvedaemon[651969]: <root@pam> end task UPID:virt1:000A31D4:0CDF8720:6516E915:qmstop:106:root@pam: can't lock file '/var/lock/qemu-server/lock-106.conf' - got timeout
Sep 29 11:11:30 virt1 pvedaemon[651970]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:37 virt1 pvestatd[3884]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:37 virt1 pvestatd[3884]: status update time (8.088 seconds)
Sep 29 11:11:47 virt1 pvestatd[3884]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:47 virt1 pvestatd[3884]: status update time (8.089 seconds)
Sep 29 11:11:55 virt1 pvedaemon[651969]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:57 virt1 pvestatd[3884]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:11:57 virt1 pvestatd[3884]: status update time (8.086 seconds)
Sep 29 11:12:00 virt1 pvedaemon[667600]: start failed: command '/usr/bin/kvm -id 106 -name 'gpu1,debug-threads=on' -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/106.qmp,server=on,wait=off' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/106.pid -daemonize -smbios 'type=1,uuid=619cd9b6-6474-4493-8c9b-33556553d455' -smp '8,sockets=1,cores=8,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga none -nographic -cpu 'host,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt' -m 65536 -object 'iothread,id=iothread-virtioscsi0' -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=8da5648e-d72a-405e-833e-1ae3cce1e881' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=0000:82:00.0,id=hostpci1.0,bus=ich9-pcie-port-2,addr=0x0.0,x-vga=on,multifunction=on' -device 'vfio-pci,host=0000:82:00.1,id=hostpci1.1,bus=ich9-pcie-port-2,addr=0x0.1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:eac3be58846' -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1,iothread=iothread-virtioscsi0' -drive 'file=/dev/zvol/nvme_virt1/vm-106-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap106i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=2A:96:50:42:E3:F6,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=101' -netdev 'type=tap,id=net1,ifname=tap106i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 
'virtio-net-pci,mac=DE:A3:76:A8:FC:05,netdev=net1,bus=pci.0,addr=0x13,id=net1,rx_queue_size=1024,tx_queue_size=256' -machine 'smm=off,type=q35+pve0'' failed: got timeout
Sep 29 11:12:00 virt1 pvedaemon[651968]: <root@pam> end task UPID:virt1:000A2FD0:0CDF7EA5:6516E8FF:qmstart:106:root@pam: start failed: command '/usr/bin/kvm -id 106 -name 'gpu1,debug-threads=on' -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/106.qmp,server=on,wait=off' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/106.pid -daemonize -smbios 'type=1,uuid=619cd9b6-6474-4493-8c9b-33556553d455' -smp '8,sockets=1,cores=8,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga none -nographic -cpu 'host,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt' -m 65536 -object 'iothread,id=iothread-virtioscsi0' -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=8da5648e-d72a-405e-833e-1ae3cce1e881' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=0000:82:00.0,id=hostpci1.0,bus=ich9-pcie-port-2,addr=0x0.0,x-vga=on,multifunction=on' -device 'vfio-pci,host=0000:82:00.1,id=hostpci1.1,bus=ich9-pcie-port-2,addr=0x0.1' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:eac3be58846' -device 'virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1,iothread=iothread-virtioscsi0' -drive 'file=/dev/zvol/nvme_virt1/vm-106-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap106i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=2A:96:50:42:E3:F6,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=101' -netdev 
'type=tap,id=net1,ifname=tap106i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=DE:A3:76:A8:FC:05,netdev=net1,bus=pci.0,addr=0x13,id=net1,rx_queue_size=1024,tx_queue_size=256' -machine 'smm=off,type=q35+pve0'' failed: got timeout
Sep 29 11:12:07 virt1 pvestatd[3884]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:12:07 virt1 pvestatd[3884]: status update time (8.089 seconds)
Sep 29 11:12:17 virt1 pvestatd[3884]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
Sep 29 11:12:17 virt1 pvestatd[3884]: status update time (8.089 seconds)
Sep 29 11:12:20 virt1 pvedaemon[651968]: VM 106 qmp command failed - VM 106 qmp command 'query-proxmox-support' failed - unable to connect to VM 106 qmp socket - timeout after 51 retries
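The failed command above shows VM 106 being handed 0000:82:00.0 and 0000:82:00.1 via vfio-pci. One check that may be worth running before digging further is whether the two 3080s (0000:81:00 and 0000:82:00) sit in the same IOMMU group, since devices in one group generally cannot be split across two running VMs. The sketch below is only a starting point and not a confirmed diagnosis: it assumes the standard Linux sysfs layout under /sys/kernel/iommu_groups, and the helper names iommu_groups and shared_groups are made up for this example.

```python
from collections import defaultdict
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted PCI addresses} read from sysfs.

    Assumes the usual layout <root>/<group>/devices/<pci address>.
    Returns an empty dict if the directory does not exist
    (for example when the IOMMU is disabled).
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    groups = defaultdict(list)
    for dev in base.glob("*/devices/*"):
        # dev.parent is ".../devices", dev.parent.parent is the group directory
        groups[int(dev.parent.parent.name)].append(dev.name)
    return {g: sorted(devs) for g, devs in sorted(groups.items())}


def shared_groups(groups, prefixes):
    """Return group numbers containing devices from more than one prefix."""
    shared = []
    for group, devs in groups.items():
        hit = {p for p in prefixes for d in devs if d.startswith(p)}
        if len(hit) > 1:
            shared.append(group)
    return shared


if __name__ == "__main__":
    found = iommu_groups()
    for group, devs in found.items():
        print(group, " ".join(devs))
    # The two address prefixes come from the post; adjust for other hosts.
    print("shared groups:", shared_groups(found, ["0000:81:00", "0000:82:00"]))
```

If the last line prints an empty list, the two cards are in separate groups and the cause is probably elsewhere; a non-empty list would point at group isolation as something to look into.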