[SOLVED] Cluster lost quorum after last update

Tsi_tech

Member
Aug 8, 2019
Hello,

We are running a simple two-node cluster (PVE 5.4.13).
After the last update and a reboot of both nodes, the cluster loses quorum after a few minutes.
We ran several tests and tried many of the solutions suggested in this forum, with no luck.

Running omping on both nodes returned this:

Code:
192.168.1.95 :   unicast, xmt/rcv/%loss = 10000/10000/0%, min/avg/max/std-dev = 0.094/0.188/0.825/0.037
192.168.1.95 : multicast, xmt/rcv/%loss = 10000/0/100%, min/avg/max/std-dev = 0.000/0.000/0.000/0.000
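
For anyone comparing their own results, here is a minimal sketch that pulls the multicast loss percentage out of omping summary lines like the ones above. The helper name `multicast_loss` is hypothetical, and it assumes the summary format shown in this post.

```shell
# Hypothetical helper: print "<host> <loss%>" for every omping multicast summary line.
# Assumes the summary format shown in this thread.
multicast_loss() {
    awk '/multicast, xmt\/rcv\/%loss/ {
        split($0, parts, "=")          # parts[2] holds " 10000/0/100%, min/avg/..."
        split(parts[2], fields, ",")   # fields[1] holds " 10000/0/100%"
        split(fields[1], counts, "/")  # counts[3] holds "100%"
        gsub(/[ %]/, "", counts[3])    # strip spaces and the percent sign
        print $1, counts[3]
    }'
}

# Possible usage (the invocation is an assumption; the post does not show it):
#   omping -c 10000 -i 0.001 -F -q <node1> <node2> | multicast_loss
```

Any value above 0 on a multicast line, while the unicast line shows 0% loss, points at multicast handling on the bridge or switch rather than at basic connectivity.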

Multicast is failing: every multicast packet was lost, while unicast works fine.

Meanwhile, the syslog showed this:

Code:
error [TOTEM ] FAILED TO RECEIVE

This post on the forum gave us a solution: disable multicast snooping by running this command on one of the two nodes:
Code:
echo 0 > /sys/class/net/vmbr0/bridge/multicast_snooping
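
Note that a value written to sysfs applies to the running bridge only and is gone after a reboot. Below is a minimal sketch of a wrapper that applies the setting and checks that it took effect. The function name `disable_mc_snooping` and the `SYSFS_NET` override are hypothetical and exist only for illustration and dry runs.

```shell
# Hypothetical helper: disable multicast snooping on a bridge and confirm the value.
# SYSFS_NET defaults to the real sysfs tree; override it only for dry runs.
disable_mc_snooping() {
    bridge="$1"
    knob="${SYSFS_NET:-/sys/class/net}/$bridge/bridge/multicast_snooping"
    # Fail early if the interface is missing or is not a bridge.
    [ -e "$knob" ] || { echo "no such bridge setting: $knob" >&2; return 1; }
    echo 0 > "$knob" || return 1
    # Read the value back; the function succeeds only if it is now 0.
    [ "$(cat "$knob")" = "0" ]
}

# Possible usage (as root): disable_mc_snooping vmbr0
```

One way to keep the setting across reboots, assuming the bridge is defined in /etc/network/interfaces, would be a `post-up` line in the `vmbr0` stanza that runs the same echo command.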

Quorum is now reached, and here is the result of the omping command:

Code:
prx1 :   unicast, xmt/rcv/%loss = 9999/9999/0%, min/avg/max/std-dev = 0.099/0.188/0.748/0.032
prx1 : multicast, xmt/rcv/%loss = 9999/9999/0%, min/avg/max/std-dev = 0.109/0.206/0.619/0.031
 
Great that your problem is solved!

You didn't mention an external voter. We generally recommend having one for two node clusters (see documentation).
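
To illustrate why a third vote helps, here is a small sketch of the majority arithmetic, assuming one vote per node and a simple majority rule (more than half of the expected votes). The function names are hypothetical.

```shell
# Votes needed for a majority of the expected votes (assumes simple majority).
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds if the cluster still has a majority after losing "failed" votes.
survives() {
    total="$1"; failed="$2"
    [ $(( total - failed )) -ge "$(quorum_needed "$total")" ]
}

# Two nodes: 2 votes needed, so losing either node loses quorum.
# Two nodes plus an external voter: 2 of 3 votes needed, so one failure is tolerated.
survives 2 1 || echo "2 votes, 1 lost: no quorum"
survives 3 1 && echo "3 votes, 1 lost: quorum kept"
```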
 
We will take a closer look at this part of the configuration.
Thank you for drawing our attention to it.
 
