PVE nodes getting disconnected from the cluster

Aug 16, 2023
Dear All,
We are running PVE 8 on an 8-node cluster (PVE 8 EE). For some time now, some of the nodes have been dropping out of the cluster and then rejoining on their own, and we have no clue what the reason is.
Here is a screenshot of the graph (you can see that the data comes back after an interval):
1696047959818.png


When we check the uptime of an affected machine on the server's monitor in the server room, it shows the expected uptime and no reboot.

This is a serious concern for us.
Could you advise how we can check what is happening and what the cause is? Are there any cluster log files that would show where the issue lies?
Thanks
Joseph John
 
Thanks for the reply


I would appreciate it if you could point me to which log files to check.
Start with the syslog of the nodes that drop out, around the times when the other nodes show no data for them.
 
Hello,

Do you have a dedicated Corosync network? How many nodes are being dropped from the cluster? What's the latency from these nodes to the others?

What's the output of `pvecm status`? Could you also share the contents of `/etc/pve/corosync.conf`?
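
A rough sketch of how the latency question above could be answered from one of the affected nodes. It assumes the node entries in `corosync.conf` use `ring0_addr:` lines, as in a default setup; the helper names `ring_addrs` and `check_latency` are made up for this example and are not part of any Proxmox tool.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Proxmox VE.

# Print every ring0_addr value found in a corosync.conf-style file.
# Defaults to /etc/pve/corosync.conf when no file is given.
ring_addrs() {
    awk '$1 == "ring0_addr:" { print $2 }' "${1:-/etc/pve/corosync.conf}"
}

# Ping each Corosync address five times and print the last line
# of ping's summary (normally the rtt min/avg/max/mdev line).
check_latency() {
    for addr in $(ring_addrs "$1"); do
        printf '%s: ' "$addr"
        ping -c 5 -q "$addr" | tail -n 1
    done
}

# Typical use on a node (nothing runs until you call it):
#   pvecm status
#   check_latency /etc/pve/corosync.conf
```

Running it on a node that stays in the cluster and on one that drops out gives two sets of numbers to compare.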

As mentioned already, the system logs (`journalctl`) are a good place to start.
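
For the log check, something along these lines may help narrow things down. It is only a sketch: the function names are hypothetical, the keyword list covers a few common Corosync link and membership messages and is certainly not complete, and the timestamps in the usage comment are placeholders to be replaced with the window where the graph shows a gap.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Proxmox VE.

# Keep only lines that hint at Corosync link or membership trouble.
# The pattern list is an assumption and can be extended as needed.
filter_cluster_events() {
    grep -E -i 'link: .* is down|has no active links|Token has not been received|Retransmit List|new membership|processor failed'
}

# Show corosync and pve-cluster messages between two timestamps,
# reduced to the lines matched by filter_cluster_events.
cluster_log() {
    journalctl -u corosync -u pve-cluster \
        --since "$1" --until "$2" --no-pager | filter_cluster_events
}

# Example (placeholder timestamps, nothing runs until you call it):
#   cluster_log "YYYY-MM-DD HH:MM" "YYYY-MM-DD HH:MM"
```

If the filter returns nothing, drop it and read the unfiltered `journalctl` output for the same window, since the cause may lie outside Corosync itself.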
 