Question regarding fencing and HA

tyjtyj

New Member
Feb 7, 2022
I have a question about fencing and a forced node reboot that recently caused an outage on my cluster.

I have a 3-node cluster (call the nodes 1, 2 and 3), all on the same version, 8.0.3.

Node 3 had some issues, so it was shut down. No VM was running.
Node 2 was taken out for maintenance a few days ago.
Node 1 was running. Unfortunately it rebooted, even though I set quorum to 1 ('pvecm e 1') before 60 seconds had passed.

Here is the syslog from node 1:
Start timestamp
2023-07-27T22:28:11.728476+08:00 pve1 corosync[2835]: [CFG ] Node 3 was shut down by sysadmin
2023-07-27T22:28:11.743625+08:00 pve1 corosync[2835]: [QUORUM] Sync members[1]: 1
2023-07-27T22:28:11.743988+08:00 pve1 corosync[2835]: [QUORUM] Sync left[1]: 3
2023-07-27T22:28:11.744208+08:00 pve1 corosync[2835]: [TOTEM ] A new membership (1.ef2) was formed. Members left: 3
2023-07-27T22:28:11.744943+08:00 pve1 corosync[2835]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
2023-07-27T22:28:11.745266+08:00 pve1 corosync[2835]: [QUORUM] Members[1]: 1
2023-07-27T22:28:11.746199+08:00 pve1 corosync[2835]: [MAIN ] Completed service synchronization, ready to provide service.
2023-07-27T22:28:12.810077+08:00 pve1 corosync[2835]: [KNET ] link: host: 3 link: 0 is down
2023-07-27T22:28:12.810318+08:00 pve1 corosync[2835]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
2023-07-27T22:28:12.810489+08:00 pve1 corosync[2835]: [KNET ] host: host: 3 has no active links
Configured 'pvecm e 1' on pve1
2023-07-27T22:29:05.553133+08:00 pve1 corosync[2835]: [QUORUM] This node is within the primary component and will provide service.
2023-07-27T22:29:05.553542+08:00 pve1 corosync[2835]: [QUORUM] Members[1]: 1
[Thu Jul 27 22:29:53 2023] Pve1 reloaded by fencing
2023-07-27T22:30:53.161464+08:00 pve1 systemd[1]: Starting corosync.service - Corosync Cluster Engine...
2023-07-27T22:30:53.193733+08:00 pve1 corosync[3424]: [MAIN ] Corosync Cluster Engine starting up

My concern is why pve1 still rebooted even though Quorum was Yes again.

My understanding is that a node is fenced only if quorum is lost for 60 seconds.
22:28:11 to 22:29:05 is about 54 seconds.
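
To double-check the timing, here is a minimal Python sketch that computes the intervals from the timestamps in the log above. The variable names are mine, and I am assuming the reboot line (which has no timezone) is in the same +08:00 local time as the corosync lines.

```python
from datetime import datetime

# Timestamps copied from the syslog excerpt on pve1.
quorum_lost = datetime.fromisoformat("2023-07-27T22:28:11.744943+08:00")
quorum_regained = datetime.fromisoformat("2023-07-27T22:29:05.553133+08:00")

# The reboot line has no timezone; assumed to be the same +08:00 local time.
rebooted = datetime(2023, 7, 27, 22, 29, 53, tzinfo=quorum_lost.tzinfo)

# Time spent without quorum, and time from each quorum event to the reboot.
without_quorum = (quorum_regained - quorum_lost).total_seconds()
lost_to_reboot = (rebooted - quorum_lost).total_seconds()
regained_to_reboot = (rebooted - quorum_regained).total_seconds()

print(f"without quorum:             {without_quorum:.1f} s")
print(f"quorum lost -> reboot:      {lost_to_reboot:.1f} s")
print(f"quorum regained -> reboot:  {regained_to_reboot:.1f} s")
```

So by these timestamps the node was without quorum for just under 54 seconds, and the reboot happened roughly 47 seconds after quorum came back (about 101 seconds after it was first lost).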
 
