vm cpu stalled after backup

Sean510

Member
Apr 18, 2020
Hello everybody,
I tried searching but couldn't find a problem similar to mine; perhaps I used the wrong search terms.
It still happens, as often as 3 or 4 times a month, that after the PBS backup has run, the VM's CPUs stall, staying at 55% usage while the VM stops responding. This affects almost all the VMs in the cluster. I can't work out what the problem is. Restarting all the VMs brings everything back to normal.

Has anyone encountered a similar problem?
Regards
 
We've had a few reports of VMs hanging on backup start, but not on finish AFAIK. The situation should be improved with pve-qemu-kvm 5.1.0-8, and even more so in the upcoming 5.2.0-1.

To debug your specific issue we'd need a bit more information:
  • Logs from when the hang starts (both PVE and PBS, journalctl, etc...)
  • "Restarting the VM" implies which operations exactly (i.e. "Restart", "Stop -> Start", "Shutdown -> Start")? What logs/output do they produce?
  • VM config (qm config <vmid>)
  • pveversion -v
  • /etc/proxmox-backup/datastore.cfg from your PBS server if you have anything special configured
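The PVE-side items in the list above could be gathered in one go with something like the following. This is only a rough sketch: `VMID`, `SINCE`, `UNTIL`, `OUT` and the `collect` helper are placeholders chosen for illustration, and the time window should be set to cover the hang. The PBS logs and `/etc/proxmox-backup/datastore.cfg` live on the PBS server and would have to be collected there separately.

```shell
#!/bin/sh
# Sketch: save the requested diagnostics into one directory on the PVE node.
# All values below are placeholders; adjust them to the affected VM and time.
VMID="${VMID:-100}"                 # hypothetical VM ID
SINCE="${SINCE:-yesterday}"         # start of the journal window
UNTIL="${UNTIL:-now}"               # end of the journal window
OUT="${OUT:-./pve-hang-diag}"       # output directory (assumed name)

# Run a command and store stdout+stderr in $OUT/<name>.txt.
# If the command is missing (e.g. not on a PVE host), note that instead.
collect() {
    name="$1"; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$OUT/$name.txt" 2>&1 || echo "exit status: $?" >>"$OUT/$name.txt"
    else
        echo "skipped: $1 not found" >"$OUT/$name.txt"
    fi
}

mkdir -p "$OUT"
collect vm-config  qm config "$VMID"
collect pveversion pveversion -v
collect journal    journalctl --since "$SINCE" --until "$UNTIL" --no-pager

echo "Diagnostics written to $OUT"
```

For example, `VMID=112 SINCE="2020-12-01 01:50" UNTIL="2020-12-01 03:00" sh collect.sh` would cover a backup window around 02:00; the script name and the date here are made up for the example.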
 
For the VM I run "Stop" and then "Start". The stop takes a very long time to complete.
It could well be a QEMU version problem; I have fallen behind on updates.

Code:
root@hv1:~# qm config 112
agent: 1
bootdisk: scsi0
cores: 1
cpulimit: 2
memory: 8192
name: vps.meteoregionelazio.it
net0: virtio=CE:32:9D:FC:5A:DB,bridge=vmbr0
numa: 0
ostype: l26
scsi0: storage:vm-112-disk-0,discard=on,iothread=1,size=20G,ssd=1
scsi1: storage:vm-112-disk-1,discard=on,iothread=1,size=30G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=140cc6f3-e2b4-4f25-90cf-d49b9dab31a7
sockets: 2
vcpus: 2
vmgenid: c12a8d22-c2f8-4b26-bc3c-8a7b12ffc637

root@hv1:~# pveversion -v
proxmox-ve: 6.2-1 (running kernel: 5.4.55-1-pve)
pve-manager: 6.2-11 (running version: 6.2-11/22fb4983)
pve-kernel-5.4: 6.2-5
pve-kernel-helper: 6.2-5
pve-kernel-5.4.55-1-pve: 5.4.55-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 14.2.10-pve1
ceph-fuse: 14.2.10-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.4
libpve-access-control: 6.1-2
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-5
libpve-guest-common-perl: 3.1-2
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-10
pve-cluster: 6.1-8
pve-container: 3.1-12
pve-docs: 6.2-5
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-2
pve-firmware: 3.1-2
pve-ha-manager: 3.0-9
pve-i18n: 2.1-3
pve-qemu-kvm: 5.0.0-12
pve-xtermjs: 4.7.0-1
qemu-server: 6.2-11
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve1

datastore: bk-prox
	gc-schedule sat 18:15
	keep-daily 7
	path /bk-prox
	prune-schedule 0/2:00

datastore: bk-site
	path /bk-site
 
Hm, pve-qemu-kvm: 5.0.0-12 is pretty old indeed - with respect to PBS support anyway. I'd recommend updating to the newest version and trying again, there have been a heap of PBS-related fixes in the meantime.

If that doesn't work, logs from the time of the crash would be very helpful.
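To check a node against the versions mentioned in this thread, something along these lines might help. It is a sketch, not an official procedure: `version_at_least` and `WANTED` are names made up here, and 5.1.0-8 is simply the first pve-qemu-kvm version named above as improving the situation. The script only reports; the update itself is left as a comment.

```shell
#!/bin/sh
# Sketch: compare the installed pve-qemu-kvm against a minimum version.
WANTED="${WANTED:-5.1.0-8}"   # version mentioned earlier in the thread

# Return 0 if version $1 >= version $2.
version_at_least() {
    if command -v dpkg >/dev/null 2>&1; then
        # Debian version ordering, as used by the package manager
        dpkg --compare-versions "$1" ge "$2"
    else
        # Fallback assumption: GNU sort -V is close enough for these strings
        [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
    fi
}

# Empty if the package is not installed (or dpkg-query is unavailable)
installed=$(dpkg-query -W -f='${Version}' pve-qemu-kvm 2>/dev/null)

if [ -z "$installed" ]; then
    echo "pve-qemu-kvm does not appear to be installed on this host"
elif version_at_least "$installed" "$WANTED"; then
    echo "pve-qemu-kvm $installed is at least $WANTED"
else
    echo "pve-qemu-kvm $installed is older than $WANTED"
    # The usual way to update a PVE node (run manually, as root):
    #   apt update && apt dist-upgrade
fi
```

Note that a VM that is already running keeps using the QEMU binary it was started with, so after the package update each VM would still need a full "Stop -> Start" (or a live migration to an updated node) before it actually runs the new version.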
 
