GPU passthrough, qmp command failed - VM X qmp command 'query-proxmox-support' failed - unable to connect to VM id qmp socket

Mar 28, 2021
Hello all,

I've searched the forum and found similar "qmp command failed" errors, but those mostly had to do with replication or backups.

I have an Ubuntu 20 guest VM on my PowerEdge T630 (along with 5 other VMs), and it has an Nvidia Quadro passed through. This worked perfectly up until today.
I've since installed a second GPU, an Nvidia GTX 1050 Ti, and that is when the problems started. I'm not sure whether the two are related.

When I remove the GPUs altogether (from the guest config) I can start the VM and use it via noVNC without any issues.

As soon as I pass through one or both GPUs, the VM won't start. I'm using ZFS storage, by the way.

Below are the kvm command line, the VM config, and the syslog from a failed start. I hope someone can give me some pointers to fix this issue.
Thank you for this great software and an awesome community!

/usr/bin/kvm \
-id 112 \
-name Desktop \
-no-shutdown \
-chardev 'socket,id=qmp,path=/var/run/qemu-server/112.qmp,server,nowait' \
-mon 'chardev=qmp,mode=control' \
-chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
-mon 'chardev=qmp-event,mode=control' \
-pidfile /var/run/qemu-server/112.pid \
-daemonize \
-smbios 'type=1,uuid=de3bfcdc-f976-4464-b349-869bf83eb69b' \
-drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' \
-drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/zvol/SSD/vm-112-disk-0' \
-smp '8,sockets=2,cores=4,maxcpus=8' \
-nodefaults \
-boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
-vga none \
-nographic \
-cpu 'kvm64,enforce,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep' \
-m 16384 \
-readconfig /usr/share/qemu-server/pve-q35-4.0.cfg \
-device 'vmgenid,guid=4ca056a3-30a9-4aca-aa41-ba7b9f921918' \
-device 'nec-usb-xhci,id=xhci,bus=pci.1,addr=0x1b' \
-device 'usb-tablet,id=tablet,bus=ehci.0,port=1' \
-device 'vfio-pci,host=0000:42:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' \
-device 'vfio-pci,host=0000:42:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' \
-device 'usb-host,bus=xhci.0,vendorid=0x0461,productid=0x0010,id=usb0' \
-device 'usb-host,bus=xhci.0,vendorid=0x145f,productid=0x01c1,id=usb1' \
-device 'usb-host,bus=xhci.0,hostbus=1,hostport=1.5,id=usb2' \
-chardev 'socket,path=/var/run/qemu-server/112.qga,server,nowait,id=qga0' \
-device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' \
-device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' \
-iscsi 'initiator-name=iqn.1993-08.org.debian:01:7d26085f50' \
-drive 'file=/dev/zvol/SSD/vm-112-disk-1,if=none,id=drive-virtio0,cache=writeback,discard=on,format=raw,aio=threads,detect-zeroes=unmap' \
-device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' \
-netdev 'type=tap,id=net0,ifname=tap112i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
-device 'virtio-net-pci,mac=A6:A8:C7:B5:2E:17,netdev=net0,bus=pci.0,addr=0x12,id=net0' \
-machine 'type=q35+pve0'
agent: 1
balloon: 0
bios: ovmf
boot: order=virtio0
cores: 4
efidisk0: SSD:vm-112-disk-0,size=1M
hostpci0: 42:00,pcie=1,x-vga=1
machine: q35
memory: 16384
name: Desktop
net0: virtio=A6:A8:C7:B5:2E:17,bridge=vmbr0,tag=25
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=de3bfcdc-f976-4464-b349-869bf83eb69b
sockets: 2
usb0: host=0461:0010,usb3=1
usb1: host=145f:01c1,usb3=1
usb2: host=1-1.5,usb3=1
virtio0: SSD:vm-112-disk-1,cache=writeback,size=100G,discard=on
vmgenid: 4ca056a3-30a9-4aca-aa41-ba7b9f921918
Apr 23 20:42:06 pve systemd[1]: 112.scope: Succeeded.
Apr 23 20:42:07 pve qmeventd[5023]: OK
Apr 23 20:42:07 pve qmeventd[5023]: Finished cleanup for 112
Apr 23 20:42:07 pve pvedaemon[13218]: <root@pam> end task UPID:pve:00002042:00900DE5:608314D4:qmshutdown:112:root@pam: OK
Apr 23 20:42:13 pve pvedaemon[28167]: <root@pam> update VM 112: -hostpci0 42:00,pcie=1,x-vga=1
Apr 23 20:42:20 pve pvedaemon[18887]: start VM 112: UPID:pve:000049C7:009023E5:6083150C:qmstart:112:root@pam:
Apr 23 20:42:20 pve pvedaemon[13218]: <root@pam> starting task UPID:pve:000049C7:009023E5:6083150C:qmstart:112:root@pam:
Apr 23 20:42:21 pve systemd[1]: Started 112.scope.
Apr 23 20:42:21 pve systemd-udevd[18905]: Using default interface naming scheme 'v240'.
Apr 23 20:42:21 pve systemd-udevd[18905]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Apr 23 20:42:21 pve systemd-udevd[18905]: Could not generate persistent MAC address for tap112i0: No such file or directory
Apr 23 20:42:21 pve NetworkManager[5062]: <info> [1619203341.1383] manager: (tap112i0): new Tun device (/org/freedesktop/NetworkManager/Devices/64)
Apr 23 20:42:21 pve kernel: [94465.731346] device tap112i0 entered promiscuous mode
Apr 23 20:42:21 pve NetworkManager[5062]: <info> [1619203341.7010] device (tap112i0): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Apr 23 20:42:21 pve NetworkManager[5062]: <info> [1619203341.7028] device (tap112i0): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'external')
Apr 23 20:42:21 pve kernel: [94465.742154] vmbr0: port 5(tap112i0) entered blocking state
Apr 23 20:42:21 pve kernel: [94465.742156] vmbr0: port 5(tap112i0) entered disabled state
Apr 23 20:42:21 pve kernel: [94465.742677] vmbr0: port 5(tap112i0) entered blocking state
Apr 23 20:42:21 pve kernel: [94465.742678] vmbr0: port 5(tap112i0) entered forwarding state
Apr 23 20:42:27 pve pvedaemon[26522]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - got timeout
Apr 23 20:42:36 pve pvestatd[7040]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Apr 23 20:42:36 pve pvestatd[7040]: status update time (6.274 seconds)
Apr 23 20:42:37 pve pvedaemon[28167]: <root@pam> successful auth for user 'root@pam'
Apr 23 20:42:40 pve pveproxy[1937]: Clearing outdated entries from certificate cache
Apr 23 20:42:46 pve pvedaemon[13218]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Apr 23 20:42:46 pve pvestatd[7040]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Apr 23 20:42:46 pve pvestatd[7040]: status update time (6.268 seconds)
Apr 23 20:42:51 pve pvedaemon[18887]: start failed: command '/usr/bin/kvm -id 112 -name Desktop -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/112.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/112.pid -daemonize -smbios 'type=1,uuid=de3bfcdc-f976-4464-b349-869bf83eb69b' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/zvol/SSD/vm-112-disk-0' -smp '8,sockets=2,cores=4,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga none -nographic -cpu 'kvm64,enforce,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep' -m 16384 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=4ca056a3-30a9-4aca-aa41-ba7b9f921918' -device 'nec-usb-xhci,id=xhci,bus=pci.1,addr=0x1b' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=0000:42:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' -device 'vfio-pci,host=0000:42:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' -device 'usb-host,bus=xhci.0,vendorid=0x0461,productid=0x0010,id=usb0' -device 'usb-host,bus=xhci.0,vendorid=0x145f,productid=0x01c1,id=usb1' -device 'usb-host,bus=xhci.0,hostbus=1,hostport=1.5,id=usb2' -chardev 'socket,path=/var/run/qemu-server/112.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:7d26085f50' -drive 'file=/dev/zvol/SSD/vm-112-disk-1,if=none,id=drive-virtio0,cache=writeback,discard=on,format=raw,aio=threads,detect-zeroes=unmap' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' -netdev 
'type=tap,id=net0,ifname=tap112i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=A6:A8:C7:B5:2E:17,netdev=net0,bus=pci.0,addr=0x12,id=net0' -machine 'type=q35+pve0'' failed: got timeout
Apr 23 20:42:51 pve pvedaemon[13218]: <root@pam> end task UPID:pve:000049C7:009023E5:6083150C:qmstart:112:root@pam: start failed: command '/usr/bin/kvm -id 112 -name Desktop -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/112.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/112.pid -daemonize -smbios 'type=1,uuid=de3bfcdc-f976-4464-b349-869bf83eb69b' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/zvol/SSD/vm-112-disk-0' -smp '8,sockets=2,cores=4,maxcpus=8' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vga none -nographic -cpu 'kvm64,enforce,kvm=off,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep' -m 16384 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=4ca056a3-30a9-4aca-aa41-ba7b9f921918' -device 'nec-usb-xhci,id=xhci,bus=pci.1,addr=0x1b' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=0000:42:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' -device 'vfio-pci,host=0000:42:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' -device 'usb-host,bus=xhci.0,vendorid=0x0461,productid=0x0010,id=usb0' -device 'usb-host,bus=xhci.0,vendorid=0x145f,productid=0x01c1,id=usb1' -device 'usb-host,bus=xhci.0,hostbus=1,hostport=1.5,id=usb2' -chardev 'socket,path=/var/run/qemu-server/112.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:7d26085f50' -drive 'file=/dev/zvol/SSD/vm-112-disk-1,if=none,id=drive-virtio0,cache=writeback,discard=on,format=raw,aio=threads,detect-zeroes=unmap' -device 
'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap112i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=A6:A8:C7:B5:2E:17,netdev=net0,bus=pci.0,addr=0x12,id=net0' -machine 'type=q35+pve0'' failed: got timeout
Apr 23 20:42:56 pve pvestatd[7040]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Apr 23 20:42:57 pve pvestatd[7040]: status update time (6.260 seconds)
Apr 23 20:43:00 pve systemd[1]: Starting Proxmox VE replication runner...
Apr 23 20:43:01 pve systemd[1]: pvesr.service: Succeeded.
Apr 23 20:43:01 pve systemd[1]: Started Proxmox VE replication runner.
Apr 23 20:43:05 pve pvedaemon[28167]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Apr 23 20:43:06 pve pvestatd[7040]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Apr 23 20:43:06 pve pvestatd[7040]: status update time (6.281 seconds)
Apr 23 20:43:16 pve pvestatd[7040]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Apr 23 20:43:16 pve pvestatd[7040]: status update time (6.269 seconds)
Apr 23 20:43:19 pve kernel: [94523.650095] vmbr0: port 5(tap112i0) entered disabled state
Apr 23 20:43:19 pve NetworkManager[5062]: <info> [1619203399.6446] device (tap112i0): state change: disconnected -> unmanaged (reason 'connection-assumed', sys-iface-state: 'external')
Apr 23 20:43:19 pve NetworkManager[5062]: <info> [1619203399.6446] device (tap112i0): released from master device vmbr0
Apr 23 20:43:20 pve pvedaemon[28167]: VM 112 qmp command failed - VM 112 not running
Apr 23 20:43:24 pve systemd[1]: 112.scope: Succeeded.
 
Please show us your IOMMU groups with both GPUs installed, using:
for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
Sometimes PCI addresses shift when additional hardware is installed, and sometimes a device ends up in an IOMMU group that contains other devices.
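
If the full listing is too long to read through, a narrower check may help: look up only the address that is passed through and print everything that shares its IOMMU group. This is just a rough sketch, not an official Proxmox tool. The function name `iommu_group_members` and the `IOMMU_ROOT` override are made up for illustration; the override exists only so the function can be tried against a test directory.

```shell
#!/bin/sh
# List every device that shares an IOMMU group with the given PCI address.
# IOMMU_ROOT is a hypothetical override for trying this against a test
# directory; on a real host leave it unset so the sysfs default is used.
iommu_group_members() {
    addr=$1                                    # full address, e.g. 0000:42:00.0
    root=${IOMMU_ROOT:-/sys/kernel/iommu_groups}
    for d in "$root"/*/devices/"$addr"; do
        [ -e "$d" ] || continue                # glob did not match anything
        group=${d%/devices/*}                  # strip "/devices/<addr>"
        group=${group##*/}                     # keep only the group number
        for m in "$root/$group"/devices/*; do
            printf 'IOMMU group %s %s\n' "$group" "${m##*/}"
        done
        return 0
    done
    echo "no IOMMU group found for $addr" >&2
    return 1
}
```

With the config posted above, that would be `iommu_group_members 0000:42:00.0`. Each address it prints can be passed to `lspci -nns <address>` to see which device it is. If the group contains anything besides the GPU and its audio function, or if 42:00 turns out not to be the card you expect, that would fit the two cases described above.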