Hi,
We have been facing issues on all KVM VMs (qcow2, VirtIO SCSI, directory storage on ext4) for months now.
The VMs freeze suddenly, meaning we cannot reach them via RDP or even via the console (VNC).
After a crash, syslog shows:
May 23 20:30:47 captive020-74050-bl13 qm[8866]: VM 1200247 qmp command failed - VM 1200247 qmp command 'change' failed - unable to connect to VM 1200247 qmp socket - timeout after 599 retries
Stopping a frozen VM and trying to start it again leads to:
TASK ERROR: start failed: org.freedesktop.systemd1.UnitExists: Unit 1200192.scope already exists.
After the stop, a QEMU process for the VM is still listed as running, so the VM cannot be started again because of the error above.
10-15 minutes later, the process is gone and I can start the VM again.
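One thing that could be checked while the leftover process is still around is its kernel state. The snippet below is only a sketch, not something we have run as part of this report: `proc_state` is a hypothetical helper name, and the PID file path in the usage comment is an assumption based on the usual qemu-server layout. A process shown in state D is in uninterruptible sleep and typically cannot be killed until the blocked operation returns.

```shell
#!/bin/sh
# proc_state (hypothetical helper): print the "State:" field of a process
# from /proc, e.g. "S (sleeping)" or "D (disk sleep)".
proc_state() {
    pid="$1"
    if [ -r "/proc/$pid/status" ]; then
        # The status file uses "State:<TAB>X (description)".
        awk -F'\t' '/^State:/ { print $2 }' "/proc/$pid/status"
    else
        echo "no such process"
    fi
}

# Assumed usage on the affected node (path is an assumption):
#   proc_state "$(cat /var/run/qemu-server/1200192.pid)"
```

If the state stays at D for the whole 10-15 minutes, that would at least match the process refusing to go away after the stop.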
When it happens, several Windows VMs freeze at the same time on that node.
We use Win16 Datacenter R2 and Win19 Datacenter. We tried both the stable and the latest virtio drivers, with the same result.
Our cluster has 32 nodes with different hardware. All nodes are affected.
root@captive020-74050-bl13:~# pveversion -v
proxmox-ve: 5.4-1 (running kernel: 4.15.18-14-pve)
pve-manager: 5.4-5 (running version: 5.4-5/c6fdb264)
pve-kernel-4.15: 5.4-2
pve-kernel-4.15.18-14-pve: 4.15.18-38
pve-kernel-4.15.18-12-pve: 4.15.18-36
pve-kernel-4.13.13-2-pve: 4.13.13-33
corosync: 2.4.4-pve1
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.1-9
libpve-apiclient-perl: 2.0-5
libpve-common-perl: 5.0-51
libpve-guest-common-perl: 2.0-20
libpve-http-server-perl: 2.0-13
libpve-storage-perl: 5.0-42
libqb0: 1.0.3-1~bpo9
lvm2: 2.02.168-pve6
lxc-pve: 3.1.0-3
lxcfs: 3.0.3-pve1
novnc-pve: 1.0.0-3
proxmox-widget-toolkit: 1.0-26
pve-cluster: 5.0-37
pve-container: 2.0-37
pve-docs: 5.4-2
pve-edk2-firmware: 1.20190312-1
pve-firewall: 3.0-20
pve-firmware: 2.0-6
pve-ha-manager: 2.0-9
pve-i18n: 1.1-4
pve-libspice-server1: 0.14.1-2
pve-qemu-kvm: 3.0.1-2
pve-xtermjs: 3.12.0-1
qemu-server: 5.0-51
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.13-pve1~bpo2
Any ideas?