VM ### qmp command failed - VM ### qmp command 'query-proxmox-support' failed

MikeAndreev

Member
Feb 3, 2021
12
1
8
54
Hi all!
I'm getting these errors on 3 large VMs (only the large ones, 128-256 GB RAM) running Ubuntu as the guest OS on PVE 6.4; storage is an external Ceph cluster (RBD pools).

May 31 09:31:13 pm-cal-56-02 pvestatd[3589]: VM 134 qmp command failed - VM 134 qmp command 'query-proxmox-support' failed - unable to connect to VM 134 qmp socket - timeout after 31 retries
May 31 09:31:16 pm-cal-56-02 pvestatd[3589]: VM 138 qmp command failed - VM 138 qmp command 'query-proxmox-support' failed - got timeout
May 31 09:31:19 pm-cal-56-02 pvestatd[3589]: status update time (13.761 seconds)
May 31 09:31:25 pm-cal-56-02 pvestatd[3589]: VM 138 qmp command failed - VM 138 qmp command 'query-proxmox-support' failed - unable to connect to VM 138 qmp socket - timeout after 31 retries
May 31 09:31:28 pm-cal-56-02 pvestatd[3589]: VM 134 qmp command failed - VM 134 qmp command 'query-proxmox-support' failed - unable to connect to VM 134 qmp socket - timeout after 31 retries
May 31 09:31:33 pm-cal-56-02 pvestatd[3589]: got timeout
May 31 09:31:33 pm-cal-56-02 pvestatd[3589]: status update time (14.259 seconds)
May 31 09:31:42 pm-cal-56-02 pvestatd[3589]: VM 138 qmp command failed - VM 138 qmp command 'query-proxmox-support' failed - got timeout
May 31 09:31:45 pm-cal-56-02 pvestatd[3589]: VM 134 qmp command failed - VM 134 qmp command 'query-proxmox-support' failed - unable to connect to VM 134 qmp socket - timeout after 31 retries
May 31 09:31:45 pm-cal-56-02 pvestatd[3589]: status update time (12.259 seconds)
May 31 09:31:54 pm-cal-56-02 pvestatd[3589]: VM 138 qmp command failed - VM 138 qmp command 'query-proxmox-support' failed - unable to connect to VM 138 qmp socket - timeout after 31 retries
May 31 09:31:57 pm-cal-56-02 pvestatd[3589]: VM 134 qmp command failed - VM 134 qmp command 'query-proxmox-support' failed - unable to connect to VM 134 qmp socket - timeout after 31 retries
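For reference, the QMP socket that pvestatd is timing out on can also be checked by hand. A rough test (assuming socat is installed; the socket lives at the usual /var/run/qemu-server/<vmid>.qmp path) looks like this:

root@pm-cal-56-02:~# socat - UNIX-CONNECT:/var/run/qemu-server/134.qmp
{"execute": "qmp_capabilities"}
{"execute": "query-status"}

A responsive QEMU answers the greeting and both commands immediately; a hung one behaves the same way towards socat as it does towards pvestatd.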


root@pm-cal-56-02:~# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.106-1-pve)
pve-manager: 6.4-4 (running version: 6.4-4/337d6701)
pve-kernel-5.4: 6.4-1
pve-kernel-helper: 6.4-1
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 14.2.20-pve1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.4-1
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.4-2
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.2-1
libpve-storage-perl: 6.4-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.5-3
pve-cluster: 6.4-1
pve-container: 3.3-5
pve-docs: 6.4-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-6
pve-xtermjs: 4.7.0-3
qemu-server: 6.4-1
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1

cat /etc/pve/qemu-server/134.conf

#Ubuntu Focal Fossa LTS
agent: 1
boot: order=scsi0
cores: 21
cpu: Cascadelake-Server-noTSX,flags=+spec-ctrl;+ssbd
ide2: none,media=cdrom
localtime: 0
memory: 131072
name: **********
net0: virtio=********,bridge=vmbr10,firewall=1,tag=226
numa: 0
ostype: l26
scsi0: pm-ce1-ssd-ec42:vm-134-disk-0,cache=writeback,size=932G
scsi1: pm-ce1-ec42:vm-134-disk-1,cache=writeback,size=1397G
scsi2: pm-ce1-ec42:vm-134-disk-2,cache=writeback,size=1863G
scsihw: virtio-scsi-pci
smbios1: uuid=c9f3d864-76a1-444f-b054-bbc239aff0db
sockets: 2
tablet: 0
vga: qxl
vmgenid: 1961b584-14f1-4b75-8651-008eaf663ca7

any ideas?

WBR
Mike
 
Same thing here, any feedback would be appreciated.
Could this be related to a Ceph issue?
 
@MikeAndreev Not sure if this really did the trick, but I had the same problem, and since I applied your suggested change it hasn't happened anymore. Thanks for sharing your suggestion.
 
Hi,

I have triggered this bug on my Ceph cluster, with a VM that has 10 disks on a 100-OSD cluster.

Doing sequential reads of all disks opens connections to each OSD (so 10x100 connections just for the network, plus QEMU's internal file descriptors).

After reaching the limit, I got random block access timeouts (since it was not possible to open new connections).

The default max open files soft limit is 1024, which is really too low.
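To check whether a running VM is actually close to that limit, you can compare the number of open file descriptors of its kvm process against its nofile limit, e.g. (using VM 134's pidfile as an example):

VMPID=$(cat /var/run/qemu-server/134.pid)
ls /proc/$VMPID/fd | wc -l        # file descriptors currently open by the kvm process
prlimit --pid $VMPID --nofile     # current soft/hard NOFILE limits of that process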

I'll try to see if we can increase the default value in PVE 8.
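In the meantime, a possible workaround (just a sketch, not an official recommendation) is to raise the limit on an already running kvm process with prlimit:

prlimit --pid $VMPID --nofile=65536:65536     # raise soft and hard NOFILE limit to 65536 for this process

This only affects the running process; newly started VMs inherit their limits from the parent process that spawns them, so the change has to be reapplied per VM until the default itself is raised.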
 
