vm cpu stalled after backup

Sean510

Member
Apr 18, 2020
Hello everybody,
I tried searching but couldn't find a problem similar to mine (or maybe I used the wrong search terms).
It keeps happening, 3 or 4 times a month: when the PBS backup runs, the CPUs of the VMs stall at 55% and the VMs stop responding. This affects almost all the VMs in the cluster. I can't figure out what the problem is. Restarting all the VMs brings everything back to normal.

Has anyone encountered a similar problem?
Regards
 
We've had a few reports of VMs hanging on backup start, but not on finish AFAIK. The situation should be improved with pve-qemu-kvm 5.1.0-8, and even more so in the upcoming 5.2.0-1.

To debug your specific issue we'd need a bit more information:
  • Logs from when the hang starts (both PVE and PBS, journalctl, etc...)
  • "Restarting the VM" implies which operations exactly (i.e. "Restart", "Stop -> Start", "Shutdown -> Start")? What logs/output do they produce?
  • VM config (qm config <vmid>)
  • pveversion -v
  • /etc/proxmox-backup/datastore.cfg from your PBS server if you have anything special configured
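If it helps, the requested information could be gathered in one go with something like the following. This is only a rough sketch, not an official tool: the VMID and the one-hour journal window are placeholder assumptions to adjust for your setup.

```shell
#!/bin/sh
# Rough diagnostics-collection sketch (not an official Proxmox tool).
# VMID and the journal time window are placeholders -- adjust them.
VMID=100

for cmd in "qm config $VMID" "pveversion -v"; do
    # Only run each command if it actually exists on this host.
    if command -v "${cmd%% *}" >/dev/null 2>&1; then
        echo "== $cmd =="
        $cmd
    else
        echo "== $cmd == (command not available on this host)"
    fi
done

# Journal entries around the time the hang started (here: the last hour).
if command -v journalctl >/dev/null 2>&1; then
    journalctl --since "1 hour ago" --no-pager | tail -n 100
fi

# On the PBS server, the datastore configuration lives here:
[ -f /etc/proxmox-backup/datastore.cfg ] && cat /etc/proxmox-backup/datastore.cfg

echo "collection done"
```

Run it on the PVE node (and the last part on the PBS server) and attach the output to your reply.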
 
For the VM I execute "stop" and then "start"; the stop is very slow to complete.
It could really be a QEMU version problem, I've fallen behind on updates.

Code:
root@hv1:~# qm config 112
agent: 1
bootdisk: scsi0
cores: 1
cpulimit: 2
memory: 8192
name: vps.meteoregionelazio.it
net0: virtio=CE:32:9D:FC:5A:DB,bridge=vmbr0
numa: 0
ostype: l26
scsi0: storage:vm-112-disk-0,discard=on,iothread=1,size=20G,ssd=1
scsi1: storage:vm-112-disk-1,discard=on,iothread=1,size=30G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=140cc6f3-e2b4-4f25-90cf-d49b9dab31a7
sockets: 2
vcpus: 2
vmgenid: c12a8d22-c2f8-4b26-bc3c-8a7b12ffc637

root@hv1:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.4.55-1-pve)
pve-manager: 6.2-11 (running version: 6.2-11/22fb4983)
pve-kernel-5.4: 6.2-5
pve-kernel-helper: 6.2-5
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 14.2.10-pve1
ceph-fuse: 14.2.10-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-5
libpve-guest-common-perl: 3.1-2
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-10
pve-cluster: 6.1-8
pve-container: 3.1-12
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-2
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-12
pve-xtermjs: 4.7.0-1
qemu-server: 6.2-11
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1

datastore: bk-prox
	gc-schedule sat 18:15
	keep-daily 7
	path /bk-prox
	prune-schedule 0/2:00

datastore: bk-site
	path /bk-site
 
Hm, pve-qemu-kvm: 5.0.0-12 is pretty old indeed - with respect to PBS support anyway. I'd recommend updating to the newest version and trying again, there have been a heap of PBS-related fixes in the meantime.
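For completeness, the update itself is just the standard apt flow on the PVE host, sketched below. Run it as root, and note that a running VM keeps using the old QEMU binary until it is fully stopped and started again (or live-migrated to an updated node).

```shell
# Sketch of the suggested update (standard Debian/Proxmox apt flow).
if command -v apt-get >/dev/null 2>&1; then
    apt-get update
    apt-get -s full-upgrade      # -s: simulate first, review what would change
    # apt-get full-upgrade       # then apply for real
fi
# Verify the new QEMU package afterwards:
command -v pveversion >/dev/null 2>&1 && pveversion -v | grep pve-qemu-kvm
echo "update sketch finished"
```

After the upgrade, do a full stop and start of the affected VMs so they pick up the new QEMU binary.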

If that doesn't work, logs from the time of the crash would be very helpful.