Windows KVM freezes - qmp socket - timeout after 599 retries

encore

Member
May 4, 2018
Hi,

We have been facing issues on all KVM VMs (qcow2, VirtIO SCSI, directory storage on ext4) for months now.
These VMs freeze suddenly, meaning we are unable to access them by RDP or even by console (VNC).
After the crash, syslog shows:
May 23 20:30:47 captive020-74050-bl13 qm[8866]: VM 1200247 qmp command failed - VM 1200247 qmp command 'change' failed - unable to connect to VM 1200247 qmp socket - timeout after 599 retries
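
To see which VMs hit this timeout and at what time, the syslog can be filtered for the message. This is only a minimal sketch that assumes every entry has the same layout as the line quoted here; the function name `qmp_timeouts` is made up for illustration.

```shell
# Print "<timestamp> <vmid>" for each qmp socket timeout line read from stdin.
# Assumes the classic syslog timestamp ("May 23 20:30:47") and the message
# layout of the line quoted in this post; other formats are silently skipped.
qmp_timeouts() {
    sed -n 's/^\([A-Z][a-z][a-z] *[0-9]* [0-9:]*\) .*VM \([0-9][0-9]*\) qmp command failed.*unable to connect to VM [0-9]* qmp socket.*/\1 \2/p'
}
```

It could be run as `qmp_timeouts < /var/log/syslog` (the log path is an assumption and may differ per setup), which makes it easier to check whether several VMs on one node time out within the same minute.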

Stopping a frozen VM and trying to start it again leads to:
TASK ERROR: start failed: org.freedesktop.systemd1.UnitExists: Unit 1200192.scope already exists.
After the stop, there is still a QEMU process for the VM, so it cannot be started again and fails with the error above.
10-15 minutes later, the process is gone and I can start the VM again.
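
One way to tell when the leftover QEMU process has finally exited is to test the PID recorded in the VM's pid file. The helper below is a rough sketch, not an official tool: the name `qemu_still_running` is hypothetical, and the pid file location mentioned afterwards is an assumption about where qemu-server keeps it.

```shell
# Return 0 if the PID stored in the given pid file belongs to a live process,
# non-zero otherwise (missing file, empty file, or process gone).
# Run as root: without permission to signal the process, kill -0 also fails.
qemu_still_running() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1      # no readable pid file: treat as not running
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1          # empty pid file
    kill -0 "$pid" 2>/dev/null         # signal 0 only checks that the process exists
}
```

For example, `qemu_still_running /var/run/qemu-server/1200192.pid && echo "still running"` (path assumed) could be polled until it stops printing. `systemctl status 1200192.scope` should likewise show whether the scope unit named in the start error is still present.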

When it happens, several Windows VMs freeze at the same time on that node.
We use Win16 Datacenter R2 and Win19 Datacenter. We tried the stable virtio drivers and also the latest virtio drivers, with the same result.
Our cluster has 32 nodes with different hardware. All nodes are affected.

root@captive020-74050-bl13:~# pveversion -v
proxmox-ve: 5.4-1 (running kernel: 4.15.18-14-pve)
pve-manager: 5.4-5 (running version: 5.4-5/c6fdb264)
pve-kernel-4.15: 5.4-2
pve-kernel-4.15.18-14-pve: 4.15.18-38
pve-kernel-4.15.18-12-pve: 4.15.18-36
pve-kernel-4.13.13-2-pve: 4.13.13-33
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-9
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-51
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-42
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-26
pve-cluster: 5.0-37
pve-container: 2.0-37
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-20
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-2
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-51
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
Any ideas?
 