kernel:[ 9203.691567] watchdog: BUG: soft lockup - CPU#15 stuck for 6802s! [systemd-timesyn:639]

dengolius

I've just upgraded my PVE server from 8.1.3 to the latest 8.2.2 and got this:
Code:
sudo apt autoremove
[sudo] password for dengolius:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be REMOVED:
  pve-kernel-5.15 pve-kernel-5.15.143-1-pve
0 upgraded, 0 newly installed, 2 to remove and 0 not upgraded.
After this operation, 400 MB disk space will be freed.
Do you want to continue? [Y/n]
(Reading database ... 85017 files and directories currently installed.)
Removing pve-kernel-5.15 (7.4-11) ...
Removing pve-kernel-5.15.143-1-pve (5.15.143-1) ...
Examining /etc/kernel/postrm.d.
run-parts: executing /etc/kernel/postrm.d/initramfs-tools 5.15.143-1-pve /boot/vmlinuz-5.15.143-1-pve
update-initramfs: Deleting /boot/initrd.img-5.15.143-1-pve
run-parts: executing /etc/kernel/postrm.d/proxmox-auto-removal 5.15.143-1-pve /boot/vmlinuz-5.15.143-1-pve
run-parts: executing /etc/kernel/postrm.d/zz-proxmox-boot 5.15.143-1-pve /boot/vmlinuz-5.15.143-1-pve
Re-executing '/etc/kernel/postrm.d/zz-proxmox-boot' in new private mount namespace..
No /etc/kernel/proxmox-boot-uuids found, skipping ESP sync.
run-parts: executing /etc/kernel/postrm.d/zz-update-grub 5.15.143-1-pve /boot/vmlinuz-5.15.143-1-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.8.4-2-pve
Found initrd image: /boot/initrd.img-6.8.4-2-pve
Found linux image: /boot/vmlinuz-6.5.13-5-pve
Found initrd image: /boot/initrd.img-6.5.13-5-pve
Found linux image: /boot/vmlinuz-6.5.13-1-pve
Found initrd image: /boot/initrd.img-6.5.13-1-pve
Found linux image: /boot/vmlinuz-5.13.19-2-pve
Found initrd image: /boot/initrd.img-5.13.19-2-pve
done
dengolius@hub1:~$
Message from syslogd@hub1 at May  3 18:42:33 ...
 kernel:[ 1734.574973] watchdog: Watchdog detected hard LOCKUP on cpu 12

Message from syslogd@hub1 at May  3 18:44:56 ...
 kernel:[ 1923.987764] watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:45:24 ...
 kernel:[ 1951.986701] watchdog: BUG: soft lockup - CPU#15 stuck for 48s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:45:52 ...
 kernel:[ 1979.985629] watchdog: BUG: soft lockup - CPU#15 stuck for 75s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:46:20 ...
 kernel:[ 2007.984550] watchdog: BUG: soft lockup - CPU#15 stuck for 101s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:46:48 ...
 kernel:[ 2035.983465] watchdog: BUG: soft lockup - CPU#15 stuck for 127s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:47:16 ...
 kernel:[ 2063.982374] watchdog: BUG: soft lockup - CPU#15 stuck for 153s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:47:44 ...
 kernel:[ 2091.981277] watchdog: BUG: soft lockup - CPU#15 stuck for 179s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:48:12 ...
 kernel:[ 2119.980176] watchdog: BUG: soft lockup - CPU#15 stuck for 205s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:48:40 ...
 kernel:[ 2147.979070] watchdog: BUG: soft lockup - CPU#15 stuck for 231s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:49:08 ...
 kernel:[ 2175.977961] watchdog: BUG: soft lockup - CPU#15 stuck for 257s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:49:36 ...
 kernel:[ 2203.976848] watchdog: BUG: soft lockup - CPU#15 stuck for 283s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:50:04 ...
 kernel:[ 2231.975733] watchdog: BUG: soft lockup - CPU#15 stuck for 309s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:50:32 ...
 kernel:[ 2259.974615] watchdog: BUG: soft lockup - CPU#15 stuck for 335s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:51:00 ...
 kernel:[ 2287.973493] watchdog: BUG: soft lockup - CPU#15 stuck for 361s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:51:28 ...
 kernel:[ 2315.972370] watchdog: BUG: soft lockup - CPU#15 stuck for 387s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:51:56 ...
 kernel:[ 2343.971245] watchdog: BUG: soft lockup - CPU#15 stuck for 413s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:52:24 ...
 kernel:[ 2371.970118] watchdog: BUG: soft lockup - CPU#15 stuck for 440s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:52:52 ...
 kernel:[ 2399.968990] watchdog: BUG: soft lockup - CPU#15 stuck for 466s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:53:20 ...
 kernel:[ 2427.967859] watchdog: BUG: soft lockup - CPU#15 stuck for 492s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:53:48 ...
 kernel:[ 2455.966728] watchdog: BUG: soft lockup - CPU#15 stuck for 518s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:54:16 ...
 kernel:[ 2483.965596] watchdog: BUG: soft lockup - CPU#15 stuck for 544s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:54:44 ...
 kernel:[ 2511.964463] watchdog: BUG: soft lockup - CPU#15 stuck for 570s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:55:12 ...
 kernel:[ 2539.963328] watchdog: BUG: soft lockup - CPU#15 stuck for 596s! [systemd-timesyn:639]

Message from syslogd@hub1 at May  3 18:55:40 ...
 kernel:[ 2567.962193] watchdog: BUG: soft lockup - CPU#15 stuck for 622s! [systemd-timesyn:639]


See my bug report https://bugzilla.proxmox.com/show_bug.cgi?id=3821#c11
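
For anyone comparing their own logs against the wall of messages above, here is a minimal sketch that boils the soft lockup lines down to one line per CPU with the longest "stuck for" value. The function name summarize_soft_lockups is made up for illustration, and it assumes the message format shown in this post.

```shell
# Hypothetical helper: read kernel log text on stdin and print, per CPU,
# the longest "stuck for Ns" value reported by the soft lockup watchdog.
summarize_soft_lockups() {
    # Keep only soft lockup lines and reduce them to "<cpu> <seconds>".
    sed -n 's/.*soft lockup - CPU#\([0-9][0-9]*\) stuck for \([0-9][0-9]*\)s!.*/\1 \2/p' |
        awk '{
                 # Track the maximum stuck time seen for each CPU number.
                 if (!($1 in max) || $2 + 0 > max[$1] + 0) max[$1] = $2
             }
             END {
                 for (cpu in max) printf "CPU#%s max_stuck=%ss\n", cpu, max[cpu]
             }' |
        sort
}
```

It can be fed from the kernel log, for example `journalctl -k --no-pager | summarize_soft_lockups` or `dmesg | summarize_soft_lockups`. Hard LOCKUP lines use a different format and are ignored by this sketch.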
 
It looks like I've solved it by disabling the systemd-timesyncd service:

sudo systemctl disable systemd-timesyncd

and then enabling NTP synchronization with sudo timedatectl set-ntp true, plus uncommenting the FallbackNTP line in /etc/systemd/timesyncd.conf:

Bash:
cat /etc/systemd/timesyncd.conf
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it under the
#  terms of the GNU Lesser General Public License as published by the Free
#  Software Foundation; either version 2.1 of the License, or (at your option)
#  any later version.
#
# Entries in this file show the compile time defaults. Local configuration
# should be created by either modifying this file, or by creating "drop-ins" in
# the timesyncd.conf.d/ subdirectory. The latter is generally recommended.
# Defaults can be restored by simply deleting this file and all drop-ins.
#
# See timesyncd.conf(5) for details.

[Time]
NTP=ntp.hetzner.com
FallbackNTP=0.debian.pool.ntp.org 1.debian.pool.ntp.org 2.debian.pool.ntp.org 3.debian.pool.ntp.org
#RootDistanceMaxSec=5
#PollIntervalMinSec=32
#PollIntervalMaxSec=2048
#ConnectionRetrySec=30
#SaveIntervalSec=60
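
The comments in the file above recommend drop-ins over editing timesyncd.conf directly, so the same two settings could also be kept in a separate file. This is only a sketch of that approach: the function name write_timesyncd_dropin and the file name 10-ntp.conf are assumptions, while the server names are the ones from the config above.

```shell
# Hypothetical helper: write the NTP settings as a timesyncd drop-in.
# The target directory defaults to the real drop-in location; pass another
# directory as the first argument to try it out without touching /etc.
write_timesyncd_dropin() {
    dir="${1:-/etc/systemd/timesyncd.conf.d}"
    mkdir -p "$dir" &&
    cat > "$dir/10-ntp.conf" <<'EOF'
[Time]
NTP=ntp.hetzner.com
FallbackNTP=0.debian.pool.ntp.org 1.debian.pool.ntp.org 2.debian.pool.ntp.org 3.debian.pool.ntp.org
EOF
}

# On the real host this would be run as root, followed by roughly:
#   systemctl restart systemd-timesyncd
#   timedatectl timesync-status
```

Deleting the drop-in restores the defaults, which makes it easy to back out if it turns out not to help.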
 
I have a similar issue on Proxmox 8.2.2 hosted at Hetzner: a single Debian VM gets these CPU soft lockups when running database backups. Have you found a solution to the problem?

We have already changed hardware twice, so I suspect it's a software issue (running AMD EPYC 9454, 512 GB RAM at 4800 MT/s, NVMe disks).
 
I've applied the recommendations from https://forum.proxmox.com/threads/k...6802s-systemd-timesyn-639.146368/#post-665192 and my server has been running smoothly for the past month.

PS: I also ran into an issue with kernel 6.8 on my laptop on Sunday, but that was on Garuda Linux, so I just replaced the OS with Xubuntu 24.04 because I didn't have time for troubleshooting and experimenting with Arch :D
 
OK, our problem does seem to be solved with kernel 6.5.

The default 6.8 kernel has issues in our configuration (AMD EPYC). We also have an Intel i9-10940X CPU @ 3.30GHz, and that one doesn't seem to have the issue on 6.8.
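
In case it helps someone test the same kernel 6.5 fallback, here is a rough sketch for finding the newest installed kernel of a given series so it can be pinned. The function name newest_kernel_in_series is made up, and it assumes kernel images are named /boot/vmlinuz-<version>-pve as in the grub output earlier in this thread.

```shell
# Hypothetical helper: print the newest installed PVE kernel version in a
# series (e.g. "6.5"), based on the vmlinuz-* images in the boot directory.
# The second argument overrides the boot directory (defaults to /boot).
newest_kernel_in_series() {
    series="$1"
    boot_dir="${2:-/boot}"
    for image in "$boot_dir"/vmlinuz-"$series".*-pve; do
        [ -e "$image" ] || continue          # glob matched nothing
        printf '%s\n' "${image##*/vmlinuz-}" # strip path and "vmlinuz-"
    done | sort -V | tail -n 1               # version sort, keep the newest
}

# On a PVE host (as root) the result could then be pinned, roughly:
#   proxmox-boot-tool kernel pin "$(newest_kernel_in_series 6.5)"
# and undone later with:
#   proxmox-boot-tool kernel unpin
```

Note that sort -V is the GNU coreutils version sort, which is what Debian-based PVE hosts ship. Check which kernel is actually running after a reboot with uname -r.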
 
I spoke too soon, the issue is back.

We are now running kernel 6.5 with C-states disabled, and the issue still happens. Any ideas?