Cluster problem after one node is unreachable

Alireza

Member
Jul 15, 2019
There is a problem in my cluster. Sometimes one node develops a fault (I don't know exactly what it is), becomes unreachable, and can't be pinged. Then all the other nodes switch to the red icon, the cluster network seems to become slow, and some virtual machines are no longer accessible. If I don't repair that machine within the first minutes of it being inaccessible, the problem gets bigger and more nodes may become inaccessible. I don't know why it behaves like that.
As far as I know, even when more than one machine is inaccessible, the cluster should keep working as long as a majority of the nodes (n/2+1) is still reachable.
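As a rough check of that expectation, here is a minimal sketch of the majority rule, assuming the default of one vote per node and no special quorum options. The function names are hypothetical and only for illustration; this is not taken from any Proxmox tool.

```python
# Sketch of strict-majority quorum arithmetic.
# Assumption: every node carries exactly one vote (the usual default).

def quorum_votes(total_nodes: int) -> int:
    """Votes needed for a strict majority of the cluster."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """Nodes that can be unreachable while the rest still hold a majority."""
    return total_nodes - quorum_votes(total_nodes)


if __name__ == "__main__":
    n = 30  # node count mentioned in this post
    print(quorum_votes(n), tolerated_failures(n))  # prints "16 14"
```

Under that assumption, a 30-node cluster should stay quorate with up to 14 nodes down, so a single unreachable node should not by itself cost the cluster its quorum.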
In such situations I run a script that stops all 5 related services on each node, one by one, and then starts them all again one by one. Another problem occurs once I have started about 60 or 70 percent of the cluster nodes: the rate at which nodes come online suddenly becomes very slow, and sometimes the original problem shows up again and I have to stop and start everything once more. I have dedicated a separate port to the cluster network on each node. I have moved the weak nodes into a separate cluster, and there are 30 nodes and 300 VMs. I have not yet found out what the problem is or where it comes from. Could you please help me? Thanks.
 
