All VMs locking up after latest PVE update

After the upgrade to 5.2.0-4 I still get "VM ### qmp command failed - VM ### qmp command 'query-proxmox-support' failed - got timeout" when trying to resize a disk for a Windows VM (pc-i440fx-5.1+pve0) with the guest agent running. The only difference is that the VM does not hang completely, but it has errors in the event log:
* The IO operation at logical block address 0x150f0a408 for Disk 4 was retried (ID 153)
* Reset to device, \Device\RaidPort4, was issued (ID 129)

Code:
Mar 30 09:22:15 pvedaemon[34041]: update VM 274: resize --disk scsi5 --size +500G
Mar 30 09:22:16 qm[1119]: VM 274 qmp command failed - VM 274 qmp command 'change' failed - got timeout
Mar 30 09:22:19 pvedaemon[34041]: VM 274 qmp command failed - VM 274 qmp command 'block_resize' failed - got timeout
Mar 30 09:22:23 pvestatd[3918]: VM 274 qmp command failed - VM 274 qmp command 'query-proxmox-support' failed - unable to connect to VM 274 qmp socket - timeout after 31 retries
Mar 30 09:22:25 pvedaemon[17935]: VM 274 qmp command failed - VM 274 qmp command 'query-proxmox-support' failed - unable to connect to VM 274 qmp socket - timeout after 31 retries
Mar 30 09:22:26 pvedaemon[37379]: VM 274 qmp command failed - VM 274 qmp command 'query-proxmox-support' failed - unable to connect to VM 274 qmp socket - timeout after 31 retries
Mar 30 09:22:29 pvedaemon[17935]: VM 274 qmp command failed - VM 274 qmp command 'guest-ping' failed - got timeout

Same issue for Linux VM:
Code:
agent: 1
machine: pc-i440fx-5.1
ostype: l26
scsihw: virtio-scsi-single
scsi1: XXX:vm-159-disk-0,cache=writeback,iothread=1,size=32G
virtio0: XXX:vm-159-disk-0,backup=0,cache=writeback,iothread=1,size=35G
Resize virtio0 - OK, resize scsi1 - VM 159 qmp command 'block_resize' failed - got timeout (500)
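For anyone trying to reproduce: the resizes above are roughly equivalent to these CLI calls (a sketch; VM ID and disk names are taken from the config above, adjust as needed):
Code:
# resizing the virtio-blk disk works
qm resize 159 virtio0 +1G
# resizing the SCSI disk on virtio-scsi-single with iothread=1 hangs with the QMP timeout
qm resize 159 scsi1 +1G
# quick check whether the guest agent still answers (it times out once the VM is stuck)
qm agent 159 ping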
 
After the upgrade to 5.2.0-4 I still get "VM ### qmp command failed - VM ### qmp command 'query-proxmox-support' failed - got timeout" when trying to resize a disk for a Windows VM (pc-i440fx-5.1+pve0) with the guest agent running. The only difference is that the VM does not hang completely, but it has errors in the event log:
* The IO operation at logical block address 0x150f0a408 for Disk 4 was retried (ID 153)
* Reset to device, \Device\RaidPort4, was issued (ID 129)
What storage type are you using? If it is Ceph, could you try enabling "krbd" in the storage settings for your cephpool (in PVE), restarting the VM and then trying again?
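For reference, krbd can also be toggled from the CLI (a sketch; replace the storage ID and VM ID with your own):
Code:
# enable the kernel RBD client for the existing RBD storage definition
pvesm set <your-ceph-storage> --krbd 1
# the change only applies to newly started QEMU processes, so stop and start the VM
qm stop <vmid> && qm start <vmid>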
 
What storage type are you using? If it is Ceph, could you try enabling "krbd" in the storage settings for your cephpool (in PVE), restarting the VM and then trying again?
Yes, it's Ceph. Enabling krbd does not fix the issue:
Code:
Mar 30 12:08:13 kvm01 pvedaemon[56911]: update VM 110: resize --disk scsi1 --size +1G
Mar 30 12:08:13 kvm01 kernel: [526339.584535] rbd0: detected capacity change from 35433480192 to 36507222016
Mar 30 12:08:16 kvm01 pvedaemon[56911]: VM 110 qmp command failed - VM 110 qmp command 'block_resize' failed - got timeout
Mar 30 12:08:19 kvm01 pvedaemon[59485]: VM 110 qmp command failed - VM 110 qmp command 'query-proxmox-support' failed - got timeout


Code:
# pveversion --verbose
proxmox-ve: 6.3-1 (running kernel: 5.4.103-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.103-1-pve: 5.4.103-1
pve-kernel-4.15: 5.4-19
pve-kernel-4.15.18-30-pve: 4.15.18-58
pve-kernel-4.15.18-20-pve: 4.15.18-46
pve-kernel-4.4.117-2-pve: 4.4.117-110
pve-kernel-4.4.98-2-pve: 4.4.98-101
pve-kernel-4.4.49-1-pve: 4.4.49-86
pve-kernel-4.4.8-1-pve: 4.4.8-52
pve-kernel-4.2.6-1-pve: 4.2.6-36
pve-kernel-4.2.0-1-pve: 4.2.0-13
ceph-fuse: 14.2.18-pve1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.11-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-9
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.2.0-4
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-8
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1
 
What storage type are you using? If it is Ceph, could you try enabling "krbd" in the storage settings for your cephpool (in PVE), restarting the VM and then trying again?

We are also using Ceph and the error is the same. With virtio-scsi-pci the resizing works, but not with virtio-scsi-single.

Error:
Code:
Mar 30 10:10:16 pve-test01 pvedaemon[3165485]: <root@pam> update VM 102: resize --disk scsi0 --size +1G
Mar 30 10:10:19 pve-test01 pvedaemon[3165485]: VM 102 qmp command failed - VM 102 qmp command 'block_resize' failed - got timeout
Mar 30 10:10:22 pve-test01 pvedaemon[3878351]: VM 102 qmp command failed - VM 102 qmp command 'query-proxmox-support' failed - got timeout
Mar 30 10:10:30 pve-test01 pvestatd[2366]: VM 102 qmp command failed - VM 102 qmp command 'query-proxmox-support' failed - got timeout
Mar 30 10:10:40 pve-test01 pvestatd[2366]: VM 102 qmp command failed - VM 102 qmp command 'query-proxmox-support' failed - got timeout

The disk size gets resized on the storage but not on the VM:

Code:
root@pve-test01:~# qemu-img info rbd:ceph-data/vm-102-disk-0
image: json:{"driver": "raw", "file": {"pool": "ceph-data", "image": "vm-102-disk-0", "driver": "rbd", "namespace": ""}}
file format: raw
virtual size: 36 GiB (38654705664 bytes)
disk size: unavailable
cluster_size: 419430
Code:
root@pve-test01:~# qm config 102
agent: 1
boot:
cores: 8
cpu: host
memory: 16384
name: Test
numa: 0
ostype: l26
scsi0: ceph-data:vm-102-disk-0,cache=writeback,discard=on,iothread=1,size=35G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=432c26bc-a911-45bb-92af-8baf2d89c95f
sockets: 1
vmgenid: b0cd3b46-0903-497e-b977-4f7a300cbf39
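For completeness, the storage-side size can also be checked directly with the rbd tool (a sketch using the pool/image name from above; should show the new size right after the resize):
Code:
rbd info ceph-data/vm-102-disk-0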


With KRBD enabled it is still the same error.


Latest Proxmox and Windows updates; fresh Windows Server 2019 with the newest QEMU guest agent and the newest VirtIO drivers.

Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.106-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.98-1-pve: 5.4.98-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph: 15.2.9-pve1
ceph-fuse: 15.2.9-pve1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: not correctly installed
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.11-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-9
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.2.0-4
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-8
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1

Greetz
 
The disk size gets resized on the storage but not on the VM:
It does get resized; the value in the config is not meaningful in this case.

The problem is rather that it gets stuck... I can reproduce it here with RBD, but not KRBD (note that you need to restart the VMs after enabling krbd) - I'll take a look anyway. Do these errors only occur until the resize is done, or do they continue to appear in the syslog until the VM is stopped?
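If you want to double-check what the running QEMU instance itself reports, the block layer can be queried via the monitor (a sketch; this only works while the VM still responds to monitor commands, which it may not once it is stuck):
Code:
# open the human monitor of the VM and list block devices with their current sizes
qm monitor 102
# then, at the monitor prompt:
info block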
 
Do these errors only occur until the resize is done, or do they continue to appear in the syslog until the VM is stopped?
The errors stop ONLY after the kvm process is killed with SIGKILL. I waited about 2 hours, but the errors kept appearing, and writes to the disk stayed stuck:

Code:
[  240.063103] INFO: task ext4lazyinit:1081 blocked for more than 120 seconds.
[  240.063927] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.064912] ext4lazyinit    D ffff9d2b2fc1acc0     0  1081      2 0x00000080
[  240.064918] Call Trace:
[  240.064926]  [<ffffffff8f356d4d>] ? blk_peek_request+0x9d/0x2a0
[  240.064931]  [<ffffffff8f787dd9>] schedule+0x29/0x70
[  240.064935]  [<ffffffff8f7858c1>] schedule_timeout+0x221/0x2d0
[  240.064939]  [<ffffffff8f353869>] ? __blk_run_queue+0x39/0x50
[  240.064942]  [<ffffffff8f3575dc>] ? blk_queue_bio+0x31c/0x400
[  240.064946]  [<ffffffff8f06d39e>] ? kvm_clock_get_cycles+0x1e/0x20
[  240.064951]  [<ffffffff8f7874ad>] io_schedule_timeout+0xad/0x130
[  240.064954]  [<ffffffff8f78840d>] wait_for_completion_io+0xfd/0x140
[  240.064959]  [<ffffffff8f0db130>] ? wake_up_state+0x20/0x20
[  240.064964]  [<ffffffff8f35e1d9>] blkdev_issue_write_same+0x1c9/0x1e0
[  240.064968]  [<ffffffff8f35e294>] blkdev_issue_zeroout+0xa4/0x280
[  240.064978]  [<ffffffffc035efc2>] ? jbd2_journal_get_write_access+0x32/0x40 [jbd2]
[  240.065009]  [<ffffffffc037f6d5>] ext4_init_inode_table+0x185/0x410 [ext4]
[  240.065047]  [<ffffffffc03a0505>] ext4_lazyinit_thread+0x275/0x2f0 [ext4]
[  240.065063]  [<ffffffffc03a0290>] ? ext4_unregister_li_request+0x60/0x60 [ext4]
[  240.065066]  [<ffffffff8f0c5e11>] kthread+0xd1/0xe0
[  240.065071]  [<ffffffff8f794ec9>] ? system_call_after_swapgs+0x96/0x13a
[  240.065075]  [<ffffffff8f0c5d40>] ? insert_kthread_work+0x40/0x40
[  240.065079]  [<ffffffff8f794de4>] ret_from_fork_nospec_begin+0xe/0x21
[  240.065083]  [<ffffffff8f0c5d40>] ? insert_kthread_work+0x40/0x40
[  360.065076] INFO: task jbd2/sda-8:1079 blocked for more than 120 seconds.
[  360.066196] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.067472] jbd2/sda-8      D ffff9d2b2fc1acc0     0  1079      2 0x00000080
[  360.067480] Call Trace:
[  360.067493]  [<ffffffff8f0d2d90>] ? task_rq_unlock+0x20/0x20
[  360.067501]  [<ffffffff8f787dd9>] schedule+0x29/0x70
[  360.067520]  [<ffffffffc036033c>] jbd2_journal_commit_transaction+0x23c/0x19c0 [jbd2]
[  360.067527]  [<ffffffff8f0e11ee>] ? account_entity_dequeue+0xae/0xd0
[  360.067532]  [<ffffffff8f0e4afc>] ? dequeue_entity+0x11c/0x5c0
[  360.067536]  [<ffffffff8f06d3be>] ? kvm_sched_clock_read+0x1e/0x30
[  360.067540]  [<ffffffff8f0e6101>] ? put_prev_entity+0x31/0x400
[  360.067551]  [<ffffffff8f02b59e>] ? __switch_to+0xce/0x580
[  360.067554]  [<ffffffff8f0d4c8e>] ? finish_task_switch+0x4e/0x1c0
[  360.067559]  [<ffffffff8f0c6f00>] ? wake_up_atomic_t+0x30/0x30
[  360.067562]  [<ffffffff8f7878ff>] ? __schedule+0x3af/0x860
[  360.067567]  [<ffffffff8f0ae09e>] ? try_to_del_timer_sync+0x5e/0x90
[  360.067576]  [<ffffffffc0366f89>] kjournald2+0xc9/0x260 [jbd2]
[  360.067580]  [<ffffffff8f0c6f00>] ? wake_up_atomic_t+0x30/0x30
[  360.067588]  [<ffffffffc0366ec0>] ? commit_timeout+0x10/0x10 [jbd2]
[  360.067592]  [<ffffffff8f0c5e11>] kthread+0xd1/0xe0
[  360.067597]  [<ffffffff8f0c5d40>] ? insert_kthread_work+0x40/0x40
[  360.067602]  [<ffffffff8f794de4>] ret_from_fork_nospec_begin+0xe/0x21
[  360.067606]  [<ffffffff8f0c5d40>] ? insert_kthread_work+0x40/0x40
[  360.067610] INFO: task ext4lazyinit:1081 blocked for more than 120 seconds.
[  360.068761] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Code:
Syslog:
Mar 30 14:31:15 kvm01 pvedaemon[59485]: VM 110 qmp command failed - VM 110 qmp command 'query-proxmox-support' failed - unable to connect to VM 110 qmp socket - timeout after 31 retries
Mar 30 14:31:25 kvm01 pvestatd[2089]: VM 110 qmp command failed - VM 110 qmp command 'query-proxmox-support' failed - unable to connect to VM 110 qmp socket - timeout after 31 retries
Mar 30 14:31:25 kvm01 pvestatd[2089]: status update time (6.255 seconds)
Mar 30 14:31:34 kvm01 pvestatd[2089]: VM 110 qmp command failed - VM 110 qmp command 'query-proxmox-support' failed - unable to connect to VM 110 qmp socket - timeout after 31 retries
Mar 30 14:31:34 kvm01 pvestatd[2089]: status update time (6.248 seconds)
Mar 30 14:31:35 kvm01 pvedaemon[59485]: VM 110 qmp command failed - VM 110 qmp command 'query-proxmox-support' failed - unable to connect to VM 110 qmp socket - timeout after 31 retries
Mar 30 14:31:37 kvm01 qmeventd[1324]: read: Connection reset by peer <--- KILL -9
Mar 30 14:31:37 kvm01 systemd[1]: 110.scope: Succeeded.
Mar 30 14:31:38 kvm01 qmeventd[1324]: Starting cleanup for 110
Mar 30 14:31:38 kvm01 qmeventd[1324]: Finished cleanup for 110
... no errors
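For anyone hitting the same hang: this is roughly how the stuck kvm process was terminated (a sketch; assumes the standard PVE PID file path, adjust the VM ID):
Code:
# kill the QEMU/KVM process directly by its PID file; qmeventd then runs the usual cleanup, as seen above
kill -9 $(cat /var/run/qemu-server/110.pid)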
 
The problem is rather that it gets stuck... I can reproduce it here with RBD, but not KRBD (note that you need to restart the VMs after enabling krbd) - I'll take a look anyway. Do these errors only occur until the resize is done, or do they continue to appear in the syslog until the VM is stopped?

The block_resize failed - got timeout error only appears once; the query-proxmox-support failed - got timeout error appears every ~10 seconds until the VM is stopped. The VM gets stuck and only killing the kvm process helps.


To be sure, I created a new test pool with KRBD enabled and installed a new Windows Server 2019 VM in that pool, also with the latest VirtIO drivers.
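For reference, the test pool and its krbd-enabled storage were set up roughly like this (a sketch; pool and storage names are placeholders, not the ones actually used):
Code:
# create a new Ceph pool on the PVE node
pveceph pool create test-krbd
# add it as an RBD storage with the kernel client enabled
pvesm add rbd ceph-test-krbd --pool test-krbd --krbd 1 --content images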

Same error:
Code:
Mar 30 13:39:33 pve-test01 pvedaemon[4003803]: VM 104 qmp command failed - VM 104 qmp command 'block_resize' failed - got timeout
Mar 30 13:39:37 pve-test01 pvedaemon[3906002]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - got timeout
Mar 30 13:39:40 pve-test01 pvestatd[2366]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - got timeout
Mar 30 13:39:50 pve-test01 pvestatd[2366]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - got timeout
Mar 30 13:39:56 pve-test01 pvedaemon[3906002]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - got timeout
Mar 30 13:40:00 pve-test01 pvestatd[2366]: VM 104 qmp command failed - VM 104 qmp command 'query-proxmox-support' failed - got timeout

Greetz
 
I updated to the latest version today and then this issue appeared. Please help, I can't boot a new VM after I create it.
This is my pve info:

Code:
pve-qemu-kvm   5.2.0-5      amd64        Full virtualization on x86 hardware
pve-manager/6.3-6/2184247e (running kernel: 5.4.106-1-pve)

And below is the error info (PS: I use ZFS to store the VMs):

Code:
Apr 05 17:17:23 pve1 pveproxy[14040]: worker exit
Apr 05 17:17:23 pve1 pveproxy[3457]: worker 14040 finished
Apr 05 17:17:23 pve1 pveproxy[3457]: starting 1 worker(s)
Apr 05 17:17:23 pve1 pveproxy[3457]: worker 17945 started
Apr 05 17:17:52 pve1 pvedaemon[3449]: worker exit
Apr 05 17:17:52 pve1 pvedaemon[3447]: worker 3449 finished
Apr 05 17:17:52 pve1 pvedaemon[3447]: starting 1 worker(s)
Apr 05 17:17:52 pve1 pvedaemon[3447]: worker 18027 started
 
I updated to the latest version today and then this issue appeared. Please help, I can't boot a new VM after I create it.
No indication that this is related, open a separate new thread.

And below is the error info (PS: I use ZFS to store the VMs):
There's no error in there. Please open a new thread and post actual information (what did you try, what happened (actual task log), ...).
 
Update:
On the Hardware page, under BIOS, if I choose OVMF (UEFI), the cursor freezes on the screen when I start a newly created VM -- see the screenshot.
[Screenshot: 0405.PNG]

If I choose Default (SeaBIOS), the cursor freezes for a few seconds and then the VM boots into the OS.

Bug?
 
No indication that this is related, open a separate new thread.


There's no error in there. Please open a new thread and post actual information (what did you try, what happened (actual task log), ...).
I believe it's related. Before I updated to the latest version, my VMs would not lock up (freeze, in my words).
 
I believe it's related. Before I updated to the latest version, my VMs would not lock up (freeze, in my words).
No, this thread is about QMP errors and hangs on VMs that otherwise start just fine. If yours cannot start at all, that's something completely different. Please open a new thread and provide the actual information from your setup, as asked.
 
Hello everyone!
When backing up, errors appear in the log: VM XXX qmp command failed - VM XXX qmp command 'query-proxmox-support' failed - unable to connect to VM XXX qmp socket - timeout after 31 retries
After the backup is complete, the error disappears. Is this a bug?

vm disk format: raw
backup mode: stop vm

Code:
Apr 21 01:07:38 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:07:39 pve pvestatd[1071]: status update time (6.162 seconds)
Apr 21 01:07:48 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:07:48 pve pvestatd[1071]: status update time (6.168 seconds)
Apr 21 01:08:28 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:08:28 pve pvestatd[1071]: status update time (6.191 seconds)
Apr 21 01:08:58 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:08:58 pve pvestatd[1071]: status update time (6.160 seconds)
Apr 21 01:09:08 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:09:08 pve pvestatd[1071]: status update time (6.163 seconds)
Apr 21 01:10:48 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:10:48 pve pvestatd[1071]: status update time (6.158 seconds)
Apr 21 01:12:08 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:12:08 pve pvestatd[1071]: status update time (6.168 seconds)
Apr 21 01:12:48 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:12:48 pve pvestatd[1071]: status update time (6.159 seconds)
Apr 21 01:13:58 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:13:58 pve pvestatd[1071]: status update time (6.182 seconds)
Apr 21 01:17:58 pve pvestatd[1071]: VM 103 qmp command failed - VM 103 qmp command 'query-proxmox-support' failed - unable to connect to VM 103 qmp socket - timeout after 31 retries
Apr 21 01:17:58 pve pvestatd[1071]: status update time (6.170 seconds)

Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.106-1-pve)
pve-manager: 6.3-6 (running version: 6.3-6/2184247e)
pve-kernel-5.4: 6.3-8
pve-kernel-helper: 6.3-8
pve-kernel-5.4.106-1-pve: 5.4.106-1
pve-kernel-5.4.98-1-pve: 5.4.98-1
pve-kernel-5.4.44-2-pve: 5.4.44-2
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.2-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: not correctly installed
ifupdown2: 3.0.0-1+pve3
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.8
libproxmox-backup-qemu0: 1.0.3-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-5
libpve-guest-common-perl: 3.1-5
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-9
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.1.1-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.5-1
pve-cluster: 6.2-1
pve-container: 3.3-4
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-2
pve-ha-manager: 3.1-1
pve-i18n: 2.3-1
pve-qemu-kvm: 5.2.0-5
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-10
smartmontools: 7.2-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 2.0.4-pve1
 
After the backup is complete, the error disappears. Is this a bug?
Given that your pvestatd status update time gets quite high, it seems like your setup (PVE and/or the target storage) is highly loaded, maybe even overloaded.

When backing up, errors appear in the log: VM XXX qmp command failed - VM XXX qmp command 'query-proxmox-support' failed - unable to connect to VM XXX qmp socket - timeout after 31 retries
Do you observe any other negative effect, or is it just these messages that get logged?
 

t.lamprecht, thanks for the answer.
Given that your pvestatd status update time gets quite high, it seems like your setup (PVE and/or the target storage) is highly loaded, maybe even overloaded.
Yes, the load is high during the day, but at night (when the backups run) there is no load.
I ran an experiment: I shut down all VMs and started a backup of a single VM (ID 113). The errors were repeated.
Proxmox 5 was up and running 24x7 and the log was clean.

Logs from Proxmox 6:
The error disappears after the backup is completed.

VM storage: SSDs in RAID 1
OS storage: HDDs in RAID 1
File system everywhere: ext4
Backups go to another physical server via NFS.
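To rule out the NFS target as the bottleneck, a quick sequential write test against the backup mount can help (a sketch; the mount path matches the vzdump archive location in the task log below, the test file name is made up):
Code:
# write 2 GiB directly to the NFS backup storage, bypassing the page cache
dd if=/dev/zero of=/mnt/pve/backup-vm/ddtest.img bs=1M count=2048 oflag=direct
rm /mnt/pve/backup-vm/ddtest.img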

Code:
Apr 21 23:15:19 pve pvestatd[1071]: VM 113 qmp command failed - VM 113 qmp command 'query-proxmox-support' failed - unable to connect to VM 113 qmp socket - timeout after 31 retries
Apr 21 23:15:19 pve pvestatd[1071]: status update time (6.144 seconds)
Apr 21 23:15:49 pve pvestatd[1071]: VM 113 qmp command failed - VM 113 qmp command 'query-proxmox-support' failed - unable to connect to VM 113 qmp socket - timeout after 31 retries
Apr 21 23:15:49 pve pvestatd[1071]: status update time (6.150 seconds)
Apr 21 23:15:59 pve pvestatd[1071]: VM 113 qmp command failed - VM 113 qmp command 'query-proxmox-support' failed - unable to connect to VM 113 qmp socket - timeout after 31 retries
Apr 21 23:15:59 pve pvestatd[1071]: status update time (6.147 seconds)
Apr 21 23:16:49 pve pvestatd[1071]: VM 113 qmp command failed - VM 113 qmp command 'query-proxmox-support' failed - unable to connect to VM 113 qmp socket - timeout after 31 retries
Apr 21 23:16:49 pve pvestatd[1071]: status update time (6.147 seconds)
Apr 21 23:17:19 pve pvestatd[1071]: status update time (5.989 seconds)
Apr 21 23:17:49 pve pvestatd[1071]: status update time (5.557 seconds)
Apr 21 23:18:39 pve pvestatd[1071]: VM 113 qmp command failed - VM 113 qmp command 'query-proxmox-support' failed - unable to connect to VM 113 qmp socket - timeout after 31 retries
Apr 21 23:18:39 pve pvestatd[1071]: status update time (6.112 seconds)
Apr 21 23:19:29 pve pvestatd[1071]: VM 113 qmp command failed - VM 113 qmp command 'query-proxmox-support' failed - unable to connect to VM 113 qmp socket - timeout after 31 retries
Apr 21 23:19:29 pve pvestatd[1071]: status update time (6.110 seconds)
Apr 21 23:19:39 pve pvestatd[1071]: VM 113 qmp command failed - VM 113 qmp command 'query-proxmox-support' failed - unable to connect to VM 113 qmp socket - timeout after 31 retries
Apr 21 23:19:39 pve pvestatd[1071]: status update time (6.139 seconds)
Apr 21 23:19:49 pve pvestatd[1071]: VM 113 qmp command failed - VM 113 qmp command 'query-proxmox-support' failed - unable to connect to VM 113 qmp socket - timeout after 31 retries
Apr 21 23:19:49 pve pvestatd[1071]: status update time (6.130 seconds)


Code:
INFO: starting new backup job: vzdump 113 --mode stop --quiet 1 --mailto XXX --mailnotification failure --compress 0 --storage backup-vm
INFO: Starting Backup of VM 113 (qemu)
INFO: Backup started at 2021-04-21 23:15:02
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: srv13
INFO: include disk 'scsi0' 'vm3:113/vm-113-disk-0.raw' 35G
INFO: creating vzdump archive '/mnt/pve/backup-vm/dump/vzdump-qemu-113-2021_04_21-23_15_02.vma'
INFO: starting kvm to execute backup task
INFO: started backup task 'b909779b-8319-4db0-8467-94ba6002ea44'
INFO: 1% (525.3 MiB of 35.0 GiB) in 3s, read: 175.1 MiB/s, write: 133.2 MiB/s
INFO: 2% (954.4 MiB of 35.0 GiB) in 6s, read: 143.0 MiB/s, write: 142.5 MiB/s
INFO: 3% (1.3 GiB of 35.0 GiB) in 9s, read: 138.1 MiB/s, write: 136.7 MiB/s
INFO: 4% (1.5 GiB of 35.0 GiB) in 21s, read: 14.7 MiB/s, write: 14.6 MiB/s
INFO: 5% (1.9 GiB of 35.0 GiB) in 24s, read: 137.7 MiB/s, write: 135.3 MiB/s
INFO: 6% (2.3 GiB of 35.0 GiB) in 27s, read: 138.8 MiB/s, write: 138.3 MiB/s
INFO: 7% (2.7 GiB of 35.0 GiB) in 30s, read: 144.5 MiB/s, write: 144.2 MiB/s
INFO: 9% (3.2 GiB of 35.0 GiB) in 33s, read: 160.5 MiB/s, write: 155.0 MiB/s
INFO: 10% (3.6 GiB of 35.0 GiB) in 56s, read: 17.3 MiB/s, write: 17.0 MiB/s
INFO: 11% (4.0 GiB of 35.0 GiB) in 59s, read: 139.7 MiB/s, write: 136.9 MiB/s
INFO: 12% (4.3 GiB of 35.0 GiB) in 1m 8s, read: 32.8 MiB/s, write: 32.7 MiB/s
INFO: 13% (4.7 GiB of 35.0 GiB) in 1m 11s, read: 134.2 MiB/s, write: 133.7 MiB/s
INFO: 14% (5.1 GiB of 35.0 GiB) in 1m 14s, read: 152.0 MiB/s, write: 149.4 MiB/s
INFO: 15% (5.6 GiB of 35.0 GiB) in 1m 24s, read: 46.1 MiB/s, write: 45.9 MiB/s
INFO: 17% (6.0 GiB of 35.0 GiB) in 1m 27s, read: 135.8 MiB/s, write: 135.4 MiB/s
INFO: 18% (6.4 GiB of 35.0 GiB) in 1m 30s, read: 158.6 MiB/s, write: 156.6 MiB/s
INFO: 19% (6.9 GiB of 35.0 GiB) in 1m 33s, read: 148.1 MiB/s, write: 147.2 MiB/s
INFO: 20% (7.3 GiB of 35.0 GiB) in 1m 36s, read: 158.1 MiB/s, write: 156.6 MiB/s
INFO: 21% (7.5 GiB of 35.0 GiB) in 1m 46s, read: 14.8 MiB/s, write: 14.8 MiB/s
INFO: 22% (7.7 GiB of 35.0 GiB) in 1m 49s, read: 85.5 MiB/s, write: 85.4 MiB/s
INFO: 29% (10.3 GiB of 35.0 GiB) in 1m 52s, read: 869.7 MiB/s, write: 113.4 MiB/s
INFO: 30% (10.8 GiB of 35.0 GiB) in 1m 55s, read: 163.2 MiB/s, write: 145.4 MiB/s
INFO: 32% (11.2 GiB of 35.0 GiB) in 1m 58s, read: 160.4 MiB/s, write: 159.3 MiB/s
INFO: 33% (11.8 GiB of 35.0 GiB) in 2m 1s, read: 183.5 MiB/s, write: 182.6 MiB/s
INFO: 34% (12.0 GiB of 35.0 GiB) in 2m 15s, read: 18.9 MiB/s, write: 18.8 MiB/s
INFO: 35% (12.4 GiB of 35.0 GiB) in 2m 22s, read: 60.6 MiB/s, write: 60.5 MiB/s
INFO: 36% (12.6 GiB of 35.0 GiB) in 2m 25s, read: 65.3 MiB/s, write: 65.2 MiB/s
INFO: 37% (13.1 GiB of 35.0 GiB) in 2m 28s, read: 158.7 MiB/s, write: 158.6 MiB/s
INFO: 38% (13.4 GiB of 35.0 GiB) in 2m 35s, read: 49.5 MiB/s, write: 49.4 MiB/s
INFO: 39% (13.9 GiB of 35.0 GiB) in 2m 38s, read: 146.8 MiB/s, write: 146.7 MiB/s
INFO: 40% (14.1 GiB of 35.0 GiB) in 2m 46s, read: 23.9 MiB/s, write: 23.7 MiB/s
INFO: 41% (14.5 GiB of 35.0 GiB) in 2m 50s, read: 109.5 MiB/s, write: 108.4 MiB/s
INFO: 42% (15.0 GiB of 35.0 GiB) in 2m 53s, read: 159.8 MiB/s, write: 156.6 MiB/s
INFO: 43% (15.2 GiB of 35.0 GiB) in 3m 1s, read: 25.8 MiB/s, write: 25.4 MiB/s
INFO: 44% (15.6 GiB of 35.0 GiB) in 3m 8s, read: 60.6 MiB/s, write: 59.3 MiB/s
INFO: 45% (16.0 GiB of 35.0 GiB) in 3m 11s, read: 149.5 MiB/s, write: 148.0 MiB/s
INFO: 46% (16.4 GiB of 35.0 GiB) in 3m 14s, read: 140.4 MiB/s, write: 138.7 MiB/s
INFO: 48% (16.9 GiB of 35.0 GiB) in 3m 17s, read: 153.6 MiB/s, write: 152.9 MiB/s
INFO: 49% (17.3 GiB of 35.0 GiB) in 3m 20s, read: 156.6 MiB/s, write: 154.9 MiB/s
INFO: 50% (17.8 GiB of 35.0 GiB) in 3m 23s, read: 156.7 MiB/s, write: 156.1 MiB/s
INFO: 52% (18.2 GiB of 35.0 GiB) in 3m 26s, read: 154.9 MiB/s, write: 153.9 MiB/s
INFO: 53% (18.7 GiB of 35.0 GiB) in 3m 29s, read: 160.2 MiB/s, write: 158.3 MiB/s
INFO: 54% (18.9 GiB of 35.0 GiB) in 3m 46s, read: 13.2 MiB/s, write: 13.1 MiB/s
INFO: 55% (19.3 GiB of 35.0 GiB) in 3m 49s, read: 137.1 MiB/s, write: 136.3 MiB/s
INFO: 56% (19.7 GiB of 35.0 GiB) in 3m 52s, read: 137.2 MiB/s, write: 136.2 MiB/s
INFO: 57% (20.1 GiB of 35.0 GiB) in 3m 55s, read: 133.7 MiB/s, write: 133.0 MiB/s
INFO: 58% (20.5 GiB of 35.0 GiB) in 3m 58s, read: 140.7 MiB/s, write: 140.1 MiB/s
INFO: 59% (20.9 GiB of 35.0 GiB) in 4m 1s, read: 137.6 MiB/s, write: 135.4 MiB/s
INFO: 60% (21.3 GiB of 35.0 GiB) in 4m 4s, read: 134.0 MiB/s, write: 133.5 MiB/s
INFO: 62% (21.7 GiB of 35.0 GiB) in 4m 7s, read: 139.0 MiB/s, write: 133.4 MiB/s
INFO: 63% (22.1 GiB of 35.0 GiB) in 4m 10s, read: 135.6 MiB/s, write: 134.2 MiB/s
INFO: 64% (22.5 GiB of 35.0 GiB) in 4m 13s, read: 137.5 MiB/s, write: 136.6 MiB/s
INFO: 65% (22.9 GiB of 35.0 GiB) in 4m 16s, read: 134.4 MiB/s, write: 133.6 MiB/s
INFO: 66% (23.1 GiB of 35.0 GiB) in 4m 55s, read: 4.7 MiB/s, write: 4.7 MiB/s
INFO: 67% (23.5 GiB of 35.0 GiB) in 4m 58s, read: 148.4 MiB/s, write: 148.1 MiB/s
INFO: 68% (24.0 GiB of 35.0 GiB) in 5m 1s, read: 148.6 MiB/s, write: 148.6 MiB/s
INFO: 69% (24.3 GiB of 35.0 GiB) in 5m 10s, read: 38.9 MiB/s, write: 28.7 MiB/s
INFO: 70% (24.7 GiB of 35.0 GiB) in 5m 16s, read: 64.1 MiB/s, write: 63.9 MiB/s
INFO: 71% (24.9 GiB of 35.0 GiB) in 5m 20s, read: 53.3 MiB/s, write: 50.8 MiB/s
INFO: 72% (25.3 GiB of 35.0 GiB) in 5m 23s, read: 135.0 MiB/s, write: 134.9 MiB/s
INFO: 73% (25.6 GiB of 35.0 GiB) in 5m 30s, read: 45.2 MiB/s, write: 44.1 MiB/s
INFO: 74% (26.1 GiB of 35.0 GiB) in 5m 39s, read: 55.6 MiB/s, write: 55.5 MiB/s
INFO: 75% (26.6 GiB of 35.0 GiB) in 5m 42s, read: 156.8 MiB/s, write: 147.0 MiB/s
INFO: 76% (26.8 GiB of 35.0 GiB) in 5m 50s, read: 28.2 MiB/s, write: 28.2 MiB/s
INFO: 77% (27.3 GiB of 35.0 GiB) in 5m 53s, read: 169.5 MiB/s, write: 169.5 MiB/s
INFO: 78% (27.4 GiB of 35.0 GiB) in 6m, read: 23.1 MiB/s, write: 23.1 MiB/s
INFO: 100% (35.0 GiB of 35.0 GiB) in 6m 3s, read: 2.5 GiB/s, write: 39.9 MiB/s
INFO: backup is sparse: 10.15 GiB (29%) total zero data
INFO: transferred 35.00 GiB in 363 seconds (98.7 MiB/s)
INFO: stopping kvm after backup task
INFO: archive file size: 24.85GB
INFO: removing backup 'backup-vm:backup/vzdump-qemu-113-2021_04_06-04_28_21.vma'
INFO: Finished Backup of VM 113 (00:06:06)
INFO: Backup finished at 2021-04-21 23:21:08
INFO: Backup job finished successfully
TASK OK


Do you observe any other negative effect, or is it just these messages that get logged?
During VM shutdown, the pvestatd log contains entries of the following form (the VM ID can differ each time):

"Apr 21 23:03:43 pve pvestatd [1071]: VM 104 qmp command failed - VM 104 not running
Apr 21 23:07:03 pve pvestatd [1071]: VM 110 qmp command failed - unable to open monitor socket"

Code:
Apr 21 22:56:21 pve pvedaemon[1132]: <root@pam> successful auth for user 'root@pam'
Apr 21 23:04:11 pve pvedaemon[1131]: <root@pam> starting task UPID:pve:00001DD6:02B120CA:6080853A:qmshutdown:102:root@pam:
Apr 21 23:04:11 pve pvedaemon[7638]: shutdown VM 102: UPID:pve:00001DD6:02B120CA:6080853A:qmshutdown:102:root@pam:
Apr 21 23:04:23 pve pvedaemon[1132]: VM 102 qmp command failed - VM 102 not running
Apr 21 23:04:24 pve pvedaemon[1131]: <root@pam> end task UPID:pve:00001DD6:02B120CA:6080853A:qmshutdown:102:root@pam: OK
Apr 21 23:04:43 pve pvedaemon[7785]: shutdown VM 105: UPID:pve:00001E69:02B12D94:6080855B:qmshutdown:105:root@pam:
Apr 21 23:04:43 pve pvedaemon[1132]: <root@pam> starting task UPID:pve:00001E69:02B12D94:6080855B:qmshutdown:105:root@pam:
Apr 21 23:05:01 pve pvedaemon[1132]: <root@pam> end task UPID:pve:00001E69:02B12D94:6080855B:qmshutdown:105:root@pam: OK
Apr 21 23:05:15 pve pvedaemon[7950]: shutdown VM 106: UPID:pve:00001F0E:02B139E0:6080857B:qmshutdown:106:root@pam:
Apr 21 23:05:15 pve pvedaemon[1132]: <root@pam> starting task UPID:pve:00001F0E:02B139E0:6080857B:qmshutdown:106:root@pam:
Apr 21 23:05:27 pve pvedaemon[1132]: <root@pam> end task UPID:pve:00001F0E:02B139E0:6080857B:qmshutdown:106:root@pam: OK
Apr 21 23:06:42 pve pvedaemon[1132]: <root@pam> starting task UPID:pve:00002070:02B15BEB:608085D2:qmshutdown:110:root@pam:
Apr 21 23:06:42 pve pvedaemon[8304]: shutdown VM 110: UPID:pve:00002070:02B15BEB:608085D2:qmshutdown:110:root@pam:
Apr 21 23:07:03 pve pvedaemon[1131]: VM 110 qmp command failed - unable to open monitor socket
Apr 21 23:07:04 pve pvedaemon[1132]: <root@pam> end task UPID:pve:00002070:02B15BEB:608085D2:qmshutdown:110:root@pam: OK
Apr 21 23:09:06 pve pvedaemon[8947]: shutdown VM 113: UPID:pve:000022F3:02B19459:60808662:qmshutdown:113:root@pam:
Apr 21 23:09:06 pve pvedaemon[1132]: <root@pam> starting task UPID:pve:000022F3:02B19459:60808662:qmshutdown:113:root@pam:
Apr 21 23:09:21 pve pvedaemon[1131]: worker exit
Apr 21 23:09:23 pve pvedaemon[1132]: <root@pam> end task UPID:pve:000022F3:02B19459:60808662:qmshutdown:113:root@pam: OK
Apr 21 23:09:41 pve pvedaemon[1129]: worker 1131 finished
Apr 21 23:09:41 pve pvedaemon[1129]: starting 1 worker(s)
Apr 21 23:09:41 pve pvedaemon[1129]: worker 9076 started
Apr 21 23:10:01 pve pvedaemon[1130]: <root@pam> starting task UPID:pve:000023CA:02B1A9CC:60808699:qmshutdown:109:root@pam:
Apr 21 23:10:01 pve pvedaemon[9162]: shutdown VM 109: UPID:pve:000023CA:02B1A9CC:60808699:qmshutdown:109:root@pam:
Apr 21 23:10:28 pve pvedaemon[9076]: VM 109 qmp command failed - unable to open monitor socket
Apr 21 23:10:29 pve pvedaemon[1130]: <root@pam> end task UPID:pve:000023CA:02B1A9CC:60808699:qmshutdown:109:root@pam: OK
Apr 21 23:10:45 pve pvedaemon[9076]: <root@pam> starting task UPID:pve:0000249E:02B1BAE5:608086C5:qmshutdown:108:root@pam:
Apr 21 23:10:45 pve pvedaemon[9374]: shutdown VM 108: UPID:pve:0000249E:02B1BAE5:608086C5:qmshutdown:108:root@pam:
Apr 21 23:10:53 pve pvedaemon[9076]: <root@pam> end task UPID:pve:0000249E:02B1BAE5:608086C5:qmshutdown:108:root@pam: OK
Apr 21 23:10:57 pve pvedaemon[1130]: worker exit
Apr 21 23:10:59 pve pvedaemon[1129]: worker 1130 finished
Apr 21 23:10:59 pve pvedaemon[1129]: starting 1 worker(s)
Apr 21 23:10:59 pve pvedaemon[1129]: worker 9430 started
Apr 21 23:11:21 pve pvedaemon[9076]: <root@pam> successful auth for user 'root@pam'
Apr 21 23:26:21 pve pvedaemon[9076]: <root@pam> successful auth for user 'root@pam'
 
Proxmox 5 was up and running 24x7 and the log was clean.

Logs from Proxmox 6:
The error disappears after the backup is completed.
In PVE 5, that log line didn't exist yet. So this might have still occurred, we just wouldn't immediately log it.

These messages by themselves are not really a *problem* as such; they just indicate that the VM is currently busy with a blocking task - which flushing an NFS storage during a write-intensive backup might very well be. Still not ideal of course, but that's why we asked whether there are any other negative effects or just the log messages (i.e. are your VMs also hanging in the guest, are your services affected, etc.).

It's even more interesting that this happens during a stop mode backup for you... I unfortunately cannot reproduce the issue here on a quick try, even with a slow NFS server. Could you maybe try a "snapshot" mode backup? Using the QEMU guest agent usually means that consistency should still be given.
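A snapshot mode run of the same job would look roughly like this on the CLI (a sketch based on the job options visible in the task log above; mail options omitted):
Code:
# same backup target, but with the VM kept running (snapshot mode uses QEMU's live backup mechanism)
vzdump 113 --mode snapshot --storage backup-vm --compress 0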

During vm shutdown, the pvestatd log contains entries of the form. The ID can be different each time.

"Apr 21 23:03:43 pve pvestatd [1071]: VM 104 qmp command failed - VM 104 not running
Apr 21 23:07:03 pve pvestatd [1071]: VM 110 qmp command failed - unable to open monitor socket"
Basically more of the same. If the VM shuts down right as we try to get its state, this can happen. Again, not ideal, but not a problem either, as long as it doesn't indicate actual service disruptions or similar.
 
