Proxmox VE 8.1.4 - watchdog: BUG: soft lockup - CPU#X stuck for Xs

Hello there.

Since I joined my nodes into a cluster, I've noticed that some Linux VMs report this error:
[Attached screenshot: 1708879705606.png]

I couldn't find a working solution for this. I suspect it has something to do with ZFS, because on the one node where ZFS is not in use the same VMs run without any issues.
Do you have any idea what could be wrong? The servers aren't overloaded and have plenty of free memory.
 
I doubt this is connected to ZFS; it looks like your VM was simply not scheduled for a while.

EDIT: I might have been wrong on this one. What wait times does zpool iostat -vl show for your vdevs?

Inside a VM (or two, for comparison), I would have a look at: journalctl -k -b all | grep "soft lockup"

Do the timestamps match across the VMs? Do they coincide with some batch job, perhaps?
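
One way to see whether the lockups cluster around particular times is to bucket the log timestamps by hour. This is only a sketch: the function name lockups_by_hour is made up for illustration, and it assumes the log is fed in with ISO timestamps (journalctl -o short-iso), so that the first 13 characters of each line are the date and hour.

```shell
# Hypothetical helper (not part of any package): count "soft lockup" kernel
# messages per hour. Reads log lines on stdin and assumes each line starts
# with an ISO 8601 timestamp, as printed by `journalctl -o short-iso`.
lockups_by_hour() {
    grep 'soft lockup' |
        cut -c1-13 |            # keep "YYYY-MM-DDTHH"
        sort | uniq -c |        # count identical hour prefixes
        awk '{ print $2, $1 }'  # print as "<hour> <count>"
}

# Usage inside a guest:
#   journalctl -k -b all -o short-iso | lockups_by_hour
```

If the counts pile up at the same hour on several VMs, that points to something scheduled on the host rather than to a single guest.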

Note that if the kernel log entries are all you worry about, you can always raise the threshold through /proc/sys/kernel/watchdog_thresh.
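
For reference, the kernel documentation defines the soft-lockup threshold as twice watchdog_thresh (the default of 10 gives 20 seconds). A small sketch for reading the effective value follows; the function name softlockup_threshold and its optional file argument are my own, for illustration only.

```shell
# Hypothetical helper: print the effective soft-lockup threshold in seconds,
# which the kernel defines as 2 * watchdog_thresh.
softlockup_threshold() {
    thresh_file="${1:-/proc/sys/kernel/watchdog_thresh}"
    thresh=$(cat "$thresh_file") || return 1
    echo $(( 2 * thresh ))
}

# Usage inside the guest:
#   softlockup_threshold
# Raising the threshold until the next reboot (as root); 20 is only an example value:
#   sysctl -w kernel.watchdog_thresh=20
```

Raising the value only quiets the messages; any vCPU stalls behind them would still be there.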
 
Code:
zpool iostat -vl
                                             capacity     operations     bandwidth    total_wait     disk_wait    syncq_wait    asyncq_wait  scrub   trim  rebuild
pool                                       alloc   free   read  write   read  write   read  write   read  write   read  write   read  write   wait   wait   wait
-----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
NVMe                                        383G  1.44T    867    189  5.23M  4.40M   95us  193us   78us   48us    2us   63us   27us  150us    7ms      -      -
  mirror-0                                  383G  1.44T    867    189  5.23M  4.40M   95us  193us   78us   48us    2us   63us   27us  150us    7ms      -      -
    nvme-WD_Red_SN700_2000GB_23041F800575      -      -    431     94  2.61M  2.20M   95us  197us   78us   49us    2us  100us   27us  152us    7ms      -      -
    nvme-WD_Red_SN700_2000GB_23024R800475      -      -    435     95  2.62M  2.20M   94us  188us   78us   48us    2us   25us   27us  149us    7ms      -      -
-----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
 
I'm still wondering whether you have noticed anything conspicuous about the times at which the lockups occur.

For now, I would get the sysstat service running:

Code:
apt install sysstat
systemctl enable sysstat
systemctl start sysstat
systemctl status sysstat

That way you can later check what the load on the system was over time.
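
Once sysstat has collected data for a while, sar can read it back. A sketch, assuming the Debian default location /var/log/sysstat/saDD for the daily files (DD = day of the month); the sa_file helper is hypothetical.

```shell
# Hypothetical helper: path of the sysstat data file for a day of the month.
# Assumes Debian's default data directory, /var/log/sysstat.
sa_file() {
    day=${1#0}    # drop a leading zero so printf does not read "08" as octal
    printf '/var/log/sysstat/sa%02d\n' "$day"
}

# Usage, e.g. for the 25th of the current month:
#   sar -q -f "$(sa_file 25)"   # run queue length and load averages
#   sar -u -f "$(sa_file 25)"   # CPU utilization, including %iowait and %steal
#   sar -r -f "$(sa_file 25)"   # memory utilization
```

Comparing these reports with the lockup timestamps should show whether the host was busy at those moments.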
 
I have no idea; it seems totally random. At the moment it happens on the two nodes that run ZFS, while the third one, without ZFS (same hardware spec), works fine. Maybe ZFS isn't releasing memory quickly enough for the guests to use it?
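
The memory theory can be checked on the ZFS hosts by comparing the current ARC size with its configured maximum. A sketch, assuming OpenZFS on Linux, which exposes these counters in /proc/spl/kstat/zfs/arcstats as "name type data" rows; the arc_stat function name is made up for illustration.

```shell
# Hypothetical helper: print one counter from the ZFS ARC statistics.
# Assumes the OpenZFS kstat layout of "name type data" per row.
arc_stat() {
    stats_file="${2:-/proc/spl/kstat/zfs/arcstats}"
    awk -v name="$1" '$1 == name { print $3 }' "$stats_file"
}

# Usage on a host with ZFS:
#   arc_stat size     # current ARC size in bytes
#   arc_stat c_max    # upper limit the ARC may grow to, in bytes
```

If size sits close to c_max while the guests are short on memory, the ARC is at least worth a closer look; this alone does not prove it causes the lockups.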
 
