Corosync3 reboot while using 2 rings

inspa

New Member
Jul 5, 2021
Hi!

Are there any known failover issues with corosync?
I triggered the issue I'm reporting here myself, as the tester, while trying to confirm that my corosync network is stable, which so far I don't believe it is.

The test I ran consists of shutting down the majority of the links on ring0 (the primary ring) and then shutting down one link on ring1 (the standby ring).

Rings total: 2
Nodes total: 19

* During the test I shut down 16 of the 19 ring0 links and quorum was not lost. I then recovered 4 ring0 links and shut down 1 ring1 link on a host that had both links up. After that the cluster lost quorum and all servers were rebooted.

* Interestingly enough, I had previously run the same failover test successfully, but in that test only 12 ring0 links and 1 ring1 link were shut down, and no ring0 links were recovered. In that case the only server that rebooted was the one that lost its link 1.
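
To make the first scenario easier to reason about, here is a rough sketch of the link states after the last step. It is only a simplified model, not how corosync actually computes membership: it assumes one vote per node, that ring0 and ring1 are separate networks, and that two nodes can talk directly only over a ring on which both have their link up. The node numbers are hypothetical, since the exact hosts do not matter for the counts.

```python
def quorum_threshold(total_votes):
    """Votes needed for a strict majority (assumes one vote per node)."""
    return total_votes // 2 + 1


def reachable_peers(node, links):
    """Peers sharing at least one ring with `node` where both links are up.

    `links` maps a node id to the set of rings (0 or 1) whose link is up.
    This is a simplified assumption, not corosync's membership algorithm.
    """
    return {peer for peer, up in links.items()
            if peer != node and links[node] & up}


def build_scenario1(total_nodes=19):
    """Link states at the end of the first scenario (hypothetical node ids)."""
    links = {n: {0, 1} for n in range(total_nodes)}
    for n in range(16):          # 16 ring0 links shut down
        links[n].discard(0)
    for n in range(12, 16):      # 4 ring0 links recovered
        links[n].add(0)
    links[18].discard(1)         # 1 ring1 link shut on a host with both up
    return links


if __name__ == "__main__":
    links = build_scenario1()
    print("votes needed for quorum:", quorum_threshold(len(links)))
    for node in (0, 12, 18):     # ring1 only, both rings, ring0 only
        print(f"node {node} reaches {len(reachable_peers(node, links))} peers")
```

Under these assumptions, 10 votes are needed for quorum. The host left with only ring0 reaches just the 6 other hosts that still have ring0 up, the 12 hosts with only ring1 reach 17 peers each, and the hosts with both links reach all 18. So the nodes no longer all see the same set of peers, which may be relevant to what the logs show.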

Was the cluster failure expected in the first scenario I described? From what I can see and understand in the logs, the cluster failed over to ring0 again once I shut down a link on ring1, but I did not see this happen in the second scenario.

Can someone help me understand this issue?

The logs for the first scenario are attached.

Thanks!
 

Attachments

  • quorum-lost-scenario1.log
    12.7 KB · Views: 4
