Hello,
We've got a 6-node cluster with a bunch of VMs and CTs on it.
All nodes are updated to PVE 8.2.2 (pve-manager/8.2.2/9355359cd7afbae4, kernel 6.5.13-5-pve) and work (almost) smoothly.
Unfortunately, there are some weird issues with "rcu_sched self-detected stall on cpu" during the backup.
A backup (mode=snapshot, compression=none) of a single VM (type=kvm64, VirtIO SCSI controller) makes this VM unavailable for anywhere from a couple of seconds to a few minutes.
I see no errors on the Proxmox node itself, but there is quite a lot of output in the VM's dmesg (attached file).
Also, I was connected to the VM over SSH and received these messages right after the VM was "unfrozen":
Message from syslogd@fast at Jun 18 20:31:32 ...
kernel:[ 1292.104018] watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [sshd:580]
Message from syslogd@fast at Jun 18 20:31:37 ...
kernel:[ 1297.151211] watchdog: BUG: soft lockup - CPU#3 stuck for 72s! [kworker/3:2:138]
Message from syslogd@fast at Jun 18 20:31:37 ...
kernel:[ 1297.151432] Uhhuh. NMI received for unknown reason 20 on CPU 2.
Message from syslogd@fast at Jun 18 20:31:37 ...
kernel:[ 1297.151432] Do you have a strange power saving mode enabled?
Message from syslogd@fast at Jun 18 20:31:37 ...
kernel:[ 1297.151433] Dazed and confused, but trying to continue
What can I do to get rid of this issue?
Thanks,
Konrad