High disk read/write loads for all VMs at the same time

ilucaslp

Hello, recently I have had a problem where all the VMs on my PVE host register a high read and write load at the same time.

The times are random: sometimes it happens at 1-hour intervals, other times it takes much longer.

Even idle Linux or Windows VMs with only basic functions register the high load at exactly the same moment.
Generally, the I/O usage of all VMs is between 40k and 60k.

It happens both in VMs on the local SSD storage of the PVE host and in VMs on the iSCSI NVMe storage, so I have already ruled out a problem with one particular disk.

The problem is that during these loads, some processes running in the VMs stop responding or simply shut down.
In my syslog I noticed that, at the time of these loads, entries like the following are logged for all VMs:

Oct 06 22:23:30 br209 pvestatd[2189]: VM 1007 qmp command failed - VM 1007 qmp command 'query-proxmox-support' failed - got timeout
Oct 06 22:23:35 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - unable to connect to VM 1002 qmp socket - timeout after 51 retries

Has anyone had something similar or have any idea what it could be?

My pve configuration is as follows:
Dual Xeon E5-2660 v2 @ 2.20GHz / 128GB RAM
Kernel 5.15.116-1-pve
PVE 7.4-16

Storage:
1 Micron M600 SSD = PVE and 1 Linux VM
iSCSI NVMe storage on a 10 Gbps port = 10 VMs, a mix of Windows and Linux

[Attached screenshot: imagem_2023-10-06_225918183.png]

I managed to find a pattern: the high loads and the problems on the VMs occur when pvestatd performs its status update.

Is there any way to resolve this?



Oct 08 23:41:38 br209 pvestatd[2189]: status update time (17.108 seconds)
Oct 08 23:51:46 br209 pvestatd[2189]: VM 210 qmp command failed - VM 210 qmp command 'query-proxmox-support' failed - got timeout
Oct 08 23:51:51 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - got timeout
Oct 08 23:51:52 br209 pvestatd[2189]: status update time (13.810 seconds)
Oct 08 23:52:02 br209 pvestatd[2189]: status update time (10.416 seconds)
Oct 09 01:22:51 br209 pvestatd[2189]: VM 1003 qmp command failed - VM 1003 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 01:22:56 br209 pvestatd[2189]: VM 1008 qmp command failed - VM 1008 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 01:22:59 br209 pvestatd[2189]: status update time (16.311 seconds)
Oct 09 01:24:17 br209 pvestatd[2189]: VM 1009 qmp command failed - VM 1009 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 01:24:22 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - unable to connect to VM 1002 qmp socket - timeout after 51 retries
Oct 09 01:24:31 br209 pvestatd[2189]: status update time (22.547 seconds)
Oct 09 05:58:16 br209 pvestatd[2189]: status update time (25.211 seconds)
Oct 09 09:46:24 br209 pvestatd[2189]: VM 1009 qmp command failed - VM 1009 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 09:46:24 br209 pvestatd[2189]: status update time (8.461 seconds)
Oct 09 11:15:06 br209 pvestatd[2189]: VM 209 qmp command failed - VM 209 not running
Oct 09 11:29:44 br209 pvestatd[2189]: VM 1006 qmp command failed - VM 1006 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 11:29:49 br209 pvestatd[2189]: VM 1007 qmp command failed - VM 1007 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 11:29:54 br209 pvestatd[2189]: VM 1001 qmp command failed - VM 1001 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 11:29:58 br209 pvestatd[2189]: status update time (21.861 seconds)
Oct 09 12:14:39 br209 pvestatd[2189]: VM 1006 qmp command failed - VM 1006 not running
Oct 09 12:46:36 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 12:46:43 br209 pvestatd[2189]: status update time (14.725 seconds)
Oct 09 15:04:01 br209 pvestatd[2189]: VM 1003 qmp command failed - VM 1003 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 15:04:06 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 15:04:10 br209 pvestatd[2189]: status update time (16.798 seconds)
Oct 09 15:21:00 br209 pvestatd[2189]: auth key pair too old, rotating..
Oct 09 15:35:59 br209 pvestatd[2189]: VM 1001 qmp command failed - VM 1001 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 15:35:59 br209 pvestatd[2189]: status update time (8.753 seconds)
Oct 09 15:36:11 br209 pvestatd[2189]: status update time (11.071 seconds)
Oct 09 15:47:09 br209 pvestatd[2189]: VM 1001 qmp command failed - VM 1001 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 15:47:14 br209 pvestatd[2189]: VM 1003 qmp command failed - VM 1003 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 15:47:22 br209 pvestatd[2189]: status update time (21.689 seconds)
Oct 09 15:52:30 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 15:52:46 br209 pvestatd[2189]: status update time (24.627 seconds)
Oct 09 16:04:35 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 16:04:40 br209 pvestatd[2189]: VM 1008 qmp command failed - VM 1008 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 16:04:50 br209 pvestatd[2189]: status update time (23.380 seconds)
Oct 09 17:17:28 br209 pvestatd[2189]: VM 1003 qmp command failed - VM 1003 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 17:17:41 br209 pvestatd[2189]: status update time (21.470 seconds)
Oct 09 18:33:32 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 18:33:37 br209 pvestatd[2189]: VM 1003 qmp command failed - VM 1003 qmp command 'query-proxmox-support' failed - unable to connect to VM 1003 qmp socket - timeout after 51 retries
Oct 09 18:33:42 br209 pvestatd[2189]: VM 1001 qmp command failed - VM 1001 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 18:33:42 br209 pvestatd[2189]: status update time (21.908 seconds)
Oct 09 18:47:30 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 18:47:33 br209 pvestatd[2189]: status update time (10.449 seconds)
Oct 09 18:55:51 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - unable to connect to VM 1002 qmp socket - timeout after 51 retries
Oct 09 18:55:51 br209 pvestatd[2189]: status update time (8.776 seconds)
Oct 09 18:55:59 br209 pvestatd[2189]: status update time (5.908 seconds)
Oct 09 18:59:22 br209 pvestatd[2189]: VM 1007 qmp command failed - VM 1007 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 18:59:22 br209 pvestatd[2189]: status update time (8.797 seconds)
Oct 09 18:59:29 br209 pvestatd[2189]: status update time (5.824 seconds)
Oct 09 19:56:41 br209 pvestatd[2189]: VM 1008 qmp command failed - VM 1008 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 19:56:47 br209 pvestatd[2189]: status update time (14.825 seconds)
Oct 09 20:04:15 br209 pvestatd[2189]: VM 1007 qmp command failed - VM 1007 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 20:04:20 br209 pvestatd[2189]: VM 1001 qmp command failed - VM 1001 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 20:04:21 br209 pvestatd[2189]: status update time (13.856 seconds)
Oct 09 20:04:29 br209 pvestatd[2189]: status update time (8.141 seconds)
Oct 09 20:09:19 br209 pvestatd[2189]: VM 1001 qmp command failed - VM 1001 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 20:09:24 br209 pvestatd[2189]: VM 1007 qmp command failed - VM 1007 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 20:09:25 br209 pvestatd[2189]: status update time (13.857 seconds)
Oct 09 20:09:32 br209 pvestatd[2189]: status update time (7.707 seconds)
Oct 09 20:24:02 br209 pvestatd[2189]: VM 1003 qmp command failed - VM 1003 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 20:24:07 br209 pvestatd[2189]: VM 1007 qmp command failed - VM 1007 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 20:24:07 br209 pvestatd[2189]: status update time (13.527 seconds)
Oct 09 20:24:14 br209 pvestatd[2189]: status update time (6.469 seconds)
Oct 09 20:25:15 br209 pvestatd[2189]: VM 209 qmp command failed - VM 209 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 20:25:20 br209 pvestatd[2189]: VM 1007 qmp command failed - VM 1007 qmp command 'query-proxmox-support' failed - got timeout
Oct 09 20:25:21 br209 pvestatd[2189]: status update time (13.831 seconds)
Oct 09 20:25:28 br209 pvestatd[2189]: status update time (6.759 seconds)
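
To make the pattern in the log above easier to see, a small script can count the QMP timeouts per VM and collect the reported status update times. This is only a sketch: the regular expressions are written against the message shapes pasted above, and the function name `summarize` and the sample lines are illustrative, not part of any Proxmox tool.

```python
import re
from collections import Counter

# Patterns follow the pvestatd message shapes shown in the log above.
TIMEOUT_RE = re.compile(
    r"pvestatd\[\d+\]: VM (\d+) qmp command failed - "
    r".*(?:got timeout|timeout after \d+ retries)"
)
UPDATE_RE = re.compile(
    r"pvestatd\[\d+\]: status update time \(([\d.]+) seconds\)"
)


def summarize(lines):
    """Return (timeouts per VM ID, list of status update times in seconds)."""
    timeouts = Counter()
    update_times = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[m.group(1)] += 1
            continue
        m = UPDATE_RE.search(line)
        if m:
            update_times.append(float(m.group(1)))
    return timeouts, update_times


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "Oct 08 23:41:38 br209 pvestatd[2189]: status update time (17.108 seconds)",
    "Oct 08 23:51:46 br209 pvestatd[2189]: VM 210 qmp command failed - VM 210 qmp command 'query-proxmox-support' failed - got timeout",
    "Oct 09 01:24:22 br209 pvestatd[2189]: VM 1002 qmp command failed - VM 1002 qmp command 'query-proxmox-support' failed - unable to connect to VM 1002 qmp socket - timeout after 51 retries",
    "Oct 09 11:15:06 br209 pvestatd[2189]: VM 209 qmp command failed - VM 209 not running",
]

timeouts, update_times = summarize(SAMPLE)
for vmid, count in timeouts.most_common():
    print(f"VM {vmid}: {count} timeout(s)")
if update_times:
    print(f"slowest status update: {max(update_times):.3f} s")
```

The "not running" messages are deliberately not counted as timeouts, since they indicate a stopped VM rather than an unresponsive one. To run this against the real log, replace `SAMPLE` with the lines of the saved syslog file.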
 
You can temporarily stop pvestatd with systemctl stop pvestatd.
But in my opinion, pvestatd is suffering from the slowdown; it is not the cause.
Stop some VMs and compare.
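
When comparing runs with some VMs stopped, it may help to record how much time the host itself spends stalled on I/O. The sketch below assumes the kernel exposes pressure stall information at /proc/pressure/io (PSI, available on Linux 4.20 and later when enabled); whether it is enabled on this particular host is an assumption, and the function names are illustrative.

```python
def parse_pressure(text):
    """Parse PSI text such as 'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]
        # Each field has the form key=value; all values are numeric.
        result[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return result


def read_io_pressure(path="/proc/pressure/io"):
    """Read and parse the host I/O pressure file (path is an assumption)."""
    with open(path) as fh:
        return parse_pressure(fh.read())


# Example with hypothetical values, not measurements from this host.
example = (
    "some avg10=1.50 avg60=0.20 avg300=0.05 total=12345\n"
    "full avg10=0.75 avg60=0.10 avg300=0.02 total=6789\n"
)
print(parse_pressure(example)["some"]["avg10"])
```

Calling `read_io_pressure()` every few seconds, once with all VMs running and once with some stopped, gives two sets of numbers to compare against the timestamps of the pvestatd timeouts.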
 
