Problem with one node - sporadically not available pve6

Hello together!
We have a five-node cluster with Ceph, pve1 to pve5. The problem is that pve1 is sometimes not "available": the node's OSDs keep working and pve1 responds to ping, but it cannot be reached via SSH. Here is a screenshot from the problem node. Does anyone know why it is sporadically unavailable? Everything is up to date and otherwise working, but this is a no-go for a production environment. It would be nice if someone recognizes this message.
upload_2019-8-19_10-14-55.jpeg
Thanks in advance!
 

Stoiko Ivanov

Proxmox Staff Member
The hung_task message for a kworker is a generic kernel warning: a kernel background thread has been blocked in uninterruptible sleep for more than 2 minutes (120 seconds is the default hung-task timeout) without making progress.

This can have various causes, and a closer diagnosis is impossible based on that information alone
(the causes range from slow disk access, broken hardware, and outdated firmware to bugs in particular hardware or in the kernel).
Check the node's `dmesg` and `journalctl -r` output for further hints that might explain what causes the issue.
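
As a rough sketch of how that log check could be narrowed down, a small shell helper can filter the kernel log for hung-task warnings and nearby I/O errors. The function name `filter_hung_task` and the search pattern are assumptions made for illustration only; adjust the pattern to whatever actually appears in the screenshot or logs.

```shell
# Hypothetical helper (name and pattern are assumptions, not a Proxmox tool):
# reads log text on stdin and prints only lines that look like hung-task
# warnings or I/O errors.
filter_hung_task() {
    grep -E -i 'blocked for more than [0-9]+ seconds|hung_task|I/O error'
}
```

It could then be used as `dmesg -T | filter_hung_task` for the running kernel, or `journalctl -k -r | filter_hung_task` for the kernel messages in the journal, newest first. If the journal is stored persistently, `journalctl -k -b -1` shows the kernel log of the previous boot, which may help when the node had to be reset.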

Hope this helps!
 
