VM goes read-only when starting local backup

uFx

We upgraded one of our nodes to the latest Proxmox version (pve-manager/6.2-4/9824574a (running kernel: 5.4.34-1-pve)) and now we encounter an issue during backup of one specific VM (Linux guest). The backup starts:

INFO: Starting Backup of VM 123 (qemu)
INFO: Backup started at 2020-05-19 03:01:41
INFO: status = running
INFO: VM Name: test-VM
INFO: include disk 'virtio0' 'local:123/vm-123-disk-1.raw' 220G
INFO: backup mode: snapshot
INFO: bandwidth limit: 250000 KB/s
INFO: ionice priority: 7
INFO: creating archive '/var/lib/vz/dump/vzdump-qemu-123-2020_05_19-03_01_41.vma.lzo'
INFO: started backup task 'bfa59d15-0af8-4726-b21d-6faa2e4dcb33'
INFO: resuming VM again
ERROR: VM 123 qmp command 'cont' failed - got timeout
INFO: aborting backup job
ERROR: Backup of VM 123 failed - VM 123 qmp command 'cont' failed - got timeout

It fails after a few seconds. The guest goes into read-only mode, and we have to reboot the VM and run a filesystem check to fix it. This issue does not occur on any other VMs on this server, and there is enough space available in /var/lib/vz/dump.
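To confirm from inside a Linux guest that a filesystem really has been remounted read-only, the mount options can be inspected. The following is a minimal sketch, assuming the standard `/proc/mounts` layout (device, mount point, filesystem type, options, dump, pass); the function name `read_only_mounts` is made up for illustration.

```python
def read_only_mounts(mounts_text):
    """Return (mount_point, fstype) for every filesystem mounted read-only.

    Expects text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # Options are comma-separated; match "ro" exactly so that
        # "errors=remount-ro" on a healthy rw mount is not counted.
        options = fields[3].split(",")
        if "ro" in options:
            result.append((fields[1], fields[2]))
    return result
```

Inside the guest this could be called as `read_only_mounts(open("/proc/mounts").read())`. Pseudo-filesystems that are read-only by design may also show up in the result.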
 
We are now experiencing this with more VMs on this node. Some extra info:

All VMs use local directory storage with raw as the disk format. The problem occurred when we upgraded to Proxmox 6.2. The VMs are not using the qemu-agent. Not all VMs on this node have this problem.
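Since only some VMs are affected, it may help to list which ones failed with the `cont` timeout across several backup runs. This is a rough sketch that only assumes log lines shaped like the `ERROR: Backup of VM ... failed - ...` line in the output above; the function names are hypothetical, and how the log text is collected is left open.

```python
import re

# Matches the summary line vzdump printed in the log above, e.g.
# ERROR: Backup of VM 123 failed - VM 123 qmp command 'cont' failed - got timeout
FAILED = re.compile(r"^ERROR: Backup of VM (\d+) failed - (.*)$")


def failed_backups(log_text):
    """Map VM ID (as a string) to the failure reason from the log text."""
    failures = {}
    for line in log_text.splitlines():
        match = FAILED.match(line.strip())
        if match:
            failures[match.group(1)] = match.group(2)
    return failures


def cont_timeouts(log_text):
    """Return VM IDs (sorted as strings) that failed on the qmp 'cont' timeout."""
    return sorted(
        vmid
        for vmid, reason in failed_backups(log_text).items()
        if "qmp command 'cont' failed - got timeout" in reason
    )
```

Feeding it the concatenated text of several backup logs gives the list of VM IDs worth comparing against the unaffected ones.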

pveversion -v:
Code:
proxmox-ve: 6.2-1 (running kernel: 5.4.34-1-pve)
pve-manager: 6.2-4 (running version: 6.2-4/9824574a)
pve-kernel-5.4: 6.2-1
pve-kernel-helper: 6.2-1
pve-kernel-5.0: 6.0-11
pve-kernel-5.4.34-1-pve: 5.4.34-2
pve-kernel-4.15: 5.4-9
pve-kernel-5.0.21-5-pve: 5.0.21-10
pve-kernel-5.0.21-4-pve: 5.0.21-9
pve-kernel-5.0.21-2-pve: 5.0.21-7
pve-kernel-4.15.18-21-pve: 4.15.18-48
pve-kernel-4.15.18-11-pve: 4.15.18-34
pve-kernel-4.4.134-1-pve: 4.4.134-112
pve-kernel-4.4.98-3-pve: 4.4.98-103
pve-kernel-4.4.67-1-pve: 4.4.67-92
pve-kernel-4.4.62-1-pve: 4.4.62-88
pve-kernel-4.4.59-1-pve: 4.4.59-87
pve-kernel-4.4.49-1-pve: 4.4.49-86
pve-kernel-4.4.19-1-pve: 4.4.19-66
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libproxmox-acme-perl: 1.0.3
libpve-access-control: 6.1-1
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.1-2
libpve-guest-common-perl: 3.0-10
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-7
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.2-1
lxcfs: 4.0.3-pve2
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.2-1
pve-cluster: 6.1-8
pve-container: 3.1-5
pve-docs: 6.2-4
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.1-2
pve-firmware: 3.1-1
pve-ha-manager: 3.0-9
pve-i18n: 2.1-2
pve-qemu-kvm: 5.0.0-2
pve-xtermjs: 4.3.0-1
qemu-server: 6.2-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1
 
Another VM on this host hangs for a few moments when the backup starts but does not go read-only:

Code:
May 22 05:12:35 vps1 kernel: [442137.812027] INFO: rcu_sched detected stalls on CPUs/tasks:
May 22 05:12:35 vps1 kernel: [442137.812171]     0-...: (1 GPs behind) idle=a73/2/0 softirq=65007472/65007472 fqs=15000
May 22 05:12:35 vps1 kernel: [442137.814520]     (detected by 1, t=15002 jiffies, g=43606771, c=43606770, q=97)
May 22 05:12:35 vps1 kernel: [442137.815982] Task dump for CPU 0:
May 22 05:12:35 vps1 kernel: [442137.815985] swapper/0       R  running task        0     0      0 0x00000008
May 22 05:12:35 vps1 kernel: [442137.815990]  ffffffff81067af2 0000000000000010 0000000000000246 ffffffff81e03e98
May 22 05:12:35 vps1 kernel: [442137.815995]  0000000000000018 ffffffff81f43800 ffffffff81e03eb8 ffffffff8103914e
May 22 05:12:35 vps1 kernel: [442137.815998]  ffffffff81f43800 ffffffff81e04000 ffffffff81e03ec8 ffffffff81039ff5
May 22 05:12:35 vps1 kernel: [442137.816002] Call Trace:

What's the best way to downgrade to the previous kernel? Manually select an older one during boot?
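To see which older kernels are still installed and could be chosen, the `pveversion -v` output above can be filtered. This is only a sketch that parses text in the format shown in this post; the function names are illustrative and it does not change which kernel boots.

```python
import re

# Full kernel packages look like "pve-kernel-5.0.21-5-pve: 5.0.21-10";
# meta packages such as "pve-kernel-5.4" or "pve-kernel-helper" do not match.
KERNEL = re.compile(r"^pve-kernel-(\d+\.\d+\.\d+-\d+-pve):\s*(\S+)$")
RUNNING = re.compile(r"running kernel: ([^)\s]+)\)")


def installed_kernels(pveversion_text):
    """Return (kernel_release, package_version) pairs in the order listed."""
    kernels = []
    for line in pveversion_text.splitlines():
        match = KERNEL.match(line.strip())
        if match:
            kernels.append((match.group(1), match.group(2)))
    return kernels


def running_kernel(pveversion_text):
    """Return the running kernel release, or None if it is not reported."""
    match = RUNNING.search(pveversion_text)
    return match.group(1) if match else None
```

With the output from this post, the running kernel is reported as `5.4.34-1-pve`, and the most recent 5.0 entry in the list is `5.0.21-5-pve`.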
 
