Whole cluster fenced when 1 node fails

Bruno ASTIER

Member
Dec 13, 2018
Hi all,


I'm facing a strange problem and hope you can help me.

I set up a 4-node + QDevice cluster on the latest version, 5.3-1, with a dedicated, multicast-capable Corosync network.

Case 1, all servers up and running:
- I reboot/shut down 1 node: the vote count drops to 4, everything is OK
- I reboot/shut down a second node: the vote count drops to 3 and all is still fine
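
For reference, the vote arithmetic behind these two cases can be sketched as below. This is only a sketch under assumptions: one vote per node plus one vote for the QDevice (5 expected votes, which matches the `expected_votes=5` in the log further down) and a simple-majority quorum rule. The names `majority_quorum` and `is_quorate` are illustrative, not a Corosync API.

```python
def majority_quorum(expected_votes: int) -> int:
    """Smallest vote count that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True if the votes still present reach the majority threshold."""
    return total_votes >= majority_quorum(expected_votes)


# Assumption: 4 nodes with 1 vote each + 1 vote from the QDevice.
EXPECTED_VOTES = 4 + 1

for nodes_down in (0, 1, 2, 3):
    total = EXPECTED_VOTES - nodes_down
    state = "quorate" if is_quorate(total, EXPECTED_VOTES) else "NOT quorate"
    print(f"{nodes_down} node(s) down: {total} votes -> {state}")
```

Under these assumptions, losing one or two nodes should leave the remaining partition quorate, which is why the reboots in the failure case look unexpected.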

Case 2, all servers up and running:
- 1 node fails due to a power outage. Everything seems fine, but in fact the watchdogs on the remaining servers are no longer updated, so all of them reboot.

Has anybody seen this before?
I can't understand why working, quorate nodes get fenced.
I can't believe that's a feature; it sounds to me like a bug in my configuration.


Let me know if you need some debug files.
Regards,
Bruno.

Sorry for my bad English
 
Dedicated Corosync network on 172.18.0.0/16
172.18.1.0/24 on datacenter 1
172.18.2.0/24 on datacenter 2
172.18.3.0/24 on datacenter 3
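
As a quick sanity check of this addressing plan, the sketch below maps the Corosync addresses that appear in the log further down to their datacenter. It uses only Python's standard `ipaddress` module; the helper name `datacenter_of` is made up for illustration.

```python
import ipaddress
from typing import Optional

COROSYNC_NET = ipaddress.ip_network("172.18.0.0/16")

# Subnets as listed above.
DATACENTERS = {
    "datacenter 1": ipaddress.ip_network("172.18.1.0/24"),
    "datacenter 2": ipaddress.ip_network("172.18.2.0/24"),
    "datacenter 3": ipaddress.ip_network("172.18.3.0/24"),
}


def datacenter_of(addr: str) -> Optional[str]:
    """Return the datacenter whose subnet contains addr, or None."""
    ip = ipaddress.ip_address(addr)
    for name, net in DATACENTERS.items():
        if ip in net:
            return name
    return None


# Every datacenter subnet should sit inside the dedicated Corosync network.
for name, net in DATACENTERS.items():
    print(name, net, "inside Corosync net:", net.subnet_of(COROSYNC_NET))

# Ring addresses taken from the syslog excerpt in this post.
for addr in ("172.18.1.201", "172.18.1.202", "172.18.2.201"):
    print(addr, "->", datacenter_of(addr))
```

Going by the log, the node that left the membership (172.18.2.201) is in datacenter 2, while the two surviving nodes are both in datacenter 1.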

Here node 4 is off, so this is a 3-node cluster + QnetDevice: quorum is OK
upload_2018-12-20_16-17-15.png
upload_2018-12-20_16-20-37.png

Now, a power outage on node 3:
upload_2018-12-20_16-23-45.png
Everything looks fine, but
in /var/log/syslog on node 1 I can see:
Dec 20 16:22:24 frdc1px001 corosync[4741]: debug [TOTEM ] The token was lost in the OPERATIONAL state.
Dec 20 16:22:24 frdc1px001 corosync[4741]: notice [TOTEM ] A processor failed, forming new configuration.
Dec 20 16:22:24 frdc1px001 corosync[4741]: [TOTEM ] The token was lost in the OPERATIONAL state.
Dec 20 16:22:24 frdc1px001 corosync[4741]: [TOTEM ] A processor failed, forming new configuration.
Dec 20 16:22:24 frdc1px001 corosync[4741]: debug [TOTEM ] Receive multicast socket recv buffer size (320000 bytes).
Dec 20 16:22:24 frdc1px001 corosync[4741]: debug [TOTEM ] Transmit multicast socket send buffer size (320000 bytes).
Dec 20 16:22:24 frdc1px001 corosync[4741]: debug [TOTEM ] Local receive multicast loop socket recv buffer size (320000 bytes).
Dec 20 16:22:24 frdc1px001 corosync[4741]: debug [TOTEM ] Local transmit multicast loop socket send buffer size (320000 bytes).
Dec 20 16:22:24 frdc1px001 corosync[4741]: debug [TOTEM ] entering GATHER state from 2(The token was lost in the OPERATIONAL state.).
Dec 20 16:22:24 frdc1px001 corosync[4741]: [TOTEM ] Receive multicast socket recv buffer size (320000 bytes).
Dec 20 16:22:24 frdc1px001 corosync[4741]: [TOTEM ] Transmit multicast socket send buffer size (320000 bytes).
Dec 20 16:22:24 frdc1px001 corosync[4741]: [TOTEM ] Local receive multicast loop socket recv buffer size (320000 bytes).
Dec 20 16:22:24 frdc1px001 corosync[4741]: [TOTEM ] Local transmit multicast loop socket send buffer size (320000 bytes).
Dec 20 16:22:24 frdc1px001 corosync[4741]: [TOTEM ] entering GATHER state from 2(The token was lost in the OPERATIONAL state.).
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] entering GATHER state from 0(consensus timeout).
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] Creating commit token because I am the rep.
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] Saving state aru a2bba high seq received a2bba
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [MAIN ] Storing new sequence id for ring 6c0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] entering COMMIT state.
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] entering GATHER state from 0(consensus timeout).
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] got commit token
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] entering RECOVERY state.
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] TRANS [0] member 172.18.1.201:
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] TRANS [1] member 172.18.1.202:
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] position [0] member 172.18.1.201:
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] previous ring seq 6bc rep 172.18.1.201
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] aru a2bba high delivered a2bba received flag 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] position [1] member 172.18.1.202:
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] previous ring seq 6bc rep 172.18.1.201
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] aru a2bba high delivered a2bba received flag 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] Did not need to originate any messages in recovery.
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] got commit token
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] Sending initial ORF token
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] Creating commit token because I am the rep.
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Votequorum nodelist notify callback:
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Ring_id = (1.6c0)
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Node list (size = 2):
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: 0 nodeid = 1
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: 1 nodeid = 2
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Algorithm decided to pause cast vote timer and result vote is No change
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Cast vote timer is now paused.
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: worker: qdevice_heuristics_worker_cmd_process_exec: Received exec command with seq_no "29" and timeout "15000"
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Received heuristics exec result command with seq_no "29" and result "Disabled"
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Votequorum heuristics exec result callback:
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: seq_number = 29, exec_result = Disabled
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Algorithm decided to send list, result vote is Wait for reply and heuristics is Undefined
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Sending membership node list seq = 50, ringid = (1.6c0), heuristics = Undefined.
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Node list:
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: 0 node_id = 1, data_center_id = 0, node_state = not set
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: 1 node_id = 2, data_center_id = 0, node_state = not set
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Cast vote timer is no longer paused.
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Cast vote timer is now stopped.
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Not scheduling heuristics timer because mode is not enabled
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Received membership node list reply
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: seq = 50
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: vote = Wait for reply
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: ring id = (1.6c0)
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Algorithm result vote is Wait for reply
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Cast vote timer remains stopped.
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] Resetting old ring state
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] recovery to regular 1-0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [MAIN ] Member left: r(0) ip(172.18.2.201)
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] waiting_trans_ack changed to 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [SYNC ] call init for locally known services
Dec 20 16:22:27 frdc1px001 corosync[4741]: info [VOTEQ ] waiting for quorum device Qdevice poll (but maximum for 30000 ms)
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [TOTEM ] entering OPERATIONAL state.
Dec 20 16:22:27 frdc1px001 corosync[4741]: notice [TOTEM ] A new membership (172.18.1.201:1728) was formed. Members left: 3
Dec 20 16:22:27 frdc1px001 corosync[4741]: notice [TOTEM ] Failed to receive the leave message. failed: 3
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [SYNC ] enter sync process
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [SYNC ] Committing synchronization for corosync configuration map access
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CMAP ] Not first sync -> no action
Dec 20 16:22:27 frdc1px001 corosync[4741]: warning [CPG ] downlist left_list: 1 received
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] got joinlist message from node 0x2
Dec 20 16:22:27 frdc1px001 corosync[4741]: warning [CPG ] downlist left_list: 1 received
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] got joinlist message from node 0x1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [SYNC ] Committing synchronization for corosync cluster closed process group service v1.01
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] my downlist: members(old:3 left:1)
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] left_list_entries:1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] left_list[0] group:pve_dcdb_v1\x00, ip:r(0) ip(172.18.2.201) , pid:4785
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] left_list_entries:1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] left_list[0] group:pve_kvstore_v1\x00, ip:r(0) ip(172.18.2.201) , pid:4785
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] joinlist_messages[0] group:pve_kvstore_v1\x00, ip:r(0) ip(172.18.1.201) , pid:4654
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] joinlist_messages[1] group:pve_dcdb_v1\x00, ip:r(0) ip(172.18.1.201) , pid:4654
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] joinlist_messages[2] group:pve_kvstore_v1\x00, ip:r(0) ip(172.18.1.202) , pid:4603
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] Saving state aru a2bba high seq received a2bba
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [CPG ] joinlist_messages[3] group:pve_dcdb_v1\x00, ip:r(0) ip(172.18.1.202) , pid:4603
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: Yes QdeviceAlive: Yes QdeviceCastVote: Yes QdeviceMasterWins: No
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] Sending nodelist callback. ring_id = 1/1728
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] got nodeinfo message from cluster node 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 5 flags: 113
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: Yes QdeviceAlive: Yes QdeviceCastVote: Yes QdeviceMasterWins: No
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] total_votes=3, expected_votes=5
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] node 1 state=1, votes=1, expected=5
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] node 2 state=1, votes=1, expected=5
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] node 3 state=2, votes=1, expected=5
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] node 4 state=2, votes=1, expected=5
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] node 0 state=1, votes=1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] lowest node id: 1 us: 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] highest node id: 2 us: 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] got nodeinfo message from cluster node 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] Received qdevice op 1 req from node 1 [Qdevice]
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] got nodeinfo message from cluster node 2
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] nodeinfo message[2]: votes: 1, expected: 5 flags: 113
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: Yes QdeviceAlive: Yes QdeviceCastVote: Yes QdeviceMasterWins: No
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] got nodeinfo message from cluster node 2
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: debug [VOTEQ ] Received qdevice op 1 req from node 2 [Qdevice]
Dec 20 16:22:27 frdc1px001 corosync[4741]: [MAIN ] Storing new sequence id for ring 6c0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] entering COMMIT state.
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] got commit token
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] entering RECOVERY state.
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] TRANS [0] member 172.18.1.201:
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] TRANS [1] member 172.18.1.202:
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] position [0] member 172.18.1.201:
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] previous ring seq 6bc rep 172.18.1.201
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] aru a2bba high delivered a2bba received flag 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] position [1] member 172.18.1.202:
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] previous ring seq 6bc rep 172.18.1.201
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] aru a2bba high delivered a2bba received flag 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] Did not need to originate any messages in recovery.
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] got commit token
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] Sending initial ORF token
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] install seq 0 aru 0 high seq received 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] Resetting old ring state
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] recovery to regular 1-0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [MAIN ] Member left: r(0) ip(172.18.2.201)
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] waiting_trans_ack changed to 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [SYNC ] call init for locally known services
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] waiting for quorum device Qdevice poll (but maximum for 30000 ms)
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] entering OPERATIONAL state.
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] A new membership (172.18.1.201:1728) was formed. Members left: 3
Dec 20 16:22:27 frdc1px001 corosync[4741]: [TOTEM ] Failed to receive the leave message. failed: 3
Dec 20 16:22:27 frdc1px001 corosync[4741]: [SYNC ] enter sync process
Dec 20 16:22:27 frdc1px001 corosync[4741]: [SYNC ] Committing synchronization for corosync configuration map access
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CMAP ] Not first sync -> no action
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] downlist left_list: 1 received
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] got joinlist message from node 0x2
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] downlist left_list: 1 received
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] got joinlist message from node 0x1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [SYNC ] Committing synchronization for corosync cluster closed process group service v1.01
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] my downlist: members(old:3 left:1)
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] left_list_entries:1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] left_list[0] group:pve_dcdb_v1\x00, ip:r(0) ip(172.18.2.201) , pid:4785
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] left_list_entries:1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] left_list[0] group:pve_kvstore_v1\x00, ip:r(0) ip(172.18.2.201) , pid:4785
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] joinlist_messages[0] group:pve_kvstore_v1\x00, ip:r(0) ip(172.18.1.201) , pid:4654
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] joinlist_messages[1] group:pve_dcdb_v1\x00, ip:r(0) ip(172.18.1.201) , pid:4654
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] joinlist_messages[2] group:pve_kvstore_v1\x00, ip:r(0) ip(172.18.1.202) , pid:4603
Dec 20 16:22:27 frdc1px001 corosync[4741]: [CPG ] joinlist_messages[3] group:pve_dcdb_v1\x00, ip:r(0) ip(172.18.1.202) , pid:4603
Dec 20 16:22:27 frdc1px001 pmxcfs[4654]: [dcdb] notice: members: 1/4654, 2/4603
Dec 20 16:22:27 frdc1px001 pmxcfs[4654]: [dcdb] notice: starting data syncronisation
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: Yes QdeviceAlive: Yes QdeviceCastVote: Yes QdeviceMasterWins: No
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] Sending nodelist callback. ring_id = 1/1728
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Votequorum nodelist notify callback:
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Ring_id = (1.6c0)
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Node list (size = 2):
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: 0 nodeid = 1
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: 1 nodeid = 2
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Algorithm decided to pause cast vote timer and result vote is No change
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Cast vote timer is now paused.
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: worker: qdevice_heuristics_worker_cmd_process_exec: Received exec command with seq_no "29" and timeout "15000"
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Received heuristics exec result command with seq_no "29" and result "Disabled"
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Votequorum heuristics exec result callback:
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: seq_number = 29, exec_result = Disabled
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Algorithm decided to send list, result vote is Wait for reply and heuristics is Undefined
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] got nodeinfo message from cluster node 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 5 flags: 113
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Sending membership node list seq = 50, ringid = (1.6c0), heuristics = Undefined.
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: Yes QdeviceAlive: Yes QdeviceCastVote: Yes QdeviceMasterWins: No
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Node list:
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] total_votes=3, expected_votes=5
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: 0 node_id = 1, data_center_id = 0, node_state = not set
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] node 1 state=1, votes=1, expected=5
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] node 2 state=1, votes=1, expected=5
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: 1 node_id = 2, data_center_id = 0, node_state = not set
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] node 3 state=2, votes=1, expected=5
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] node 4 state=2, votes=1, expected=5
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] node 0 state=1, votes=1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] lowest node id: 1 us: 1
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Cast vote timer is no longer paused.
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Cast vote timer is now stopped.
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] highest node id: 2 us: 1
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] got nodeinfo message from cluster node 1
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Not scheduling heuristics timer because mode is not enabled
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] Received qdevice op 1 req from node 1 [Qdevice]
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] got nodeinfo message from cluster node 2
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] nodeinfo message[2]: votes: 1, expected: 5 flags: 113
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: Yes QdeviceAlive: Yes QdeviceCastVote: Yes QdeviceMasterWins: No
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] got nodeinfo message from cluster node 2
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] nodeinfo message[0]: votes: 1, expected: 0 flags: 0
Dec 20 16:22:27 frdc1px001 corosync[4741]: [VOTEQ ] Received qdevice op 1 req from node 2 [Qdevice]
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Received membership node list reply
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: seq = 50
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: vote = Wait for reply
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: ring id = (1.6c0)
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Algorithm result vote is Wait for reply
Dec 20 16:22:27 frdc1px001 corosync-qdevice[4815]: Cast vote timer remains stopped.
Dec 20 16:22:28 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 10
Dec 20 16:22:29 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 20
Dec 20 16:22:29 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 10
Dec 20 16:22:30 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 30
Dec 20 16:22:30 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 20
Dec 20 16:22:31 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 40
Dec 20 16:22:31 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 30
Dec 20 16:22:32 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 50
Dec 20 16:22:32 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 40
Dec 20 16:22:33 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 60
Dec 20 16:22:33 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 50
Dec 20 16:22:34 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 70
Dec 20 16:22:34 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 60
Dec 20 16:22:35 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 80
Dec 20 16:22:35 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 70
Dec 20 16:22:36 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 90
Dec 20 16:22:36 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 80
Dec 20 16:22:37 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retry 100
Dec 20 16:22:37 frdc1px001 pmxcfs[4654]: [dcdb] notice: cpg_send_message retried 100 times
Dec 20 16:22:37 frdc1px001 pmxcfs[4654]: [status] notice: members: 1/4654, 2/4603
Dec 20 16:22:37 frdc1px001 pmxcfs[4654]: [status] notice: starting data syncronisation
Dec 20 16:22:37 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 90
Dec 20 16:22:38 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 10
Dec 20 16:22:38 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 100
Dec 20 16:22:38 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retried 100 times
Dec 20 16:22:38 frdc1px001 pmxcfs[4654]: [status] crit: cpg_send_message failed: 6
Dec 20 16:22:38 frdc1px001 pve-firewall[4834]: firewall update time (8.224 seconds)
Dec 20 16:22:39 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 20
Dec 20 16:22:39 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 10
Dec 20 16:22:40 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 30
Dec 20 16:22:40 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 20
Dec 20 16:22:41 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 40
Dec 20 16:22:41 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 30
Dec 20 16:22:42 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 50
Dec 20 16:22:42 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 40
Dec 20 16:22:43 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 60
Dec 20 16:22:43 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 50
Dec 20 16:22:44 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 70
Dec 20 16:22:44 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 60
Dec 20 16:22:45 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 80
Dec 20 16:22:45 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 70
Dec 20 16:22:46 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 90
Dec 20 16:22:46 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 80
Dec 20 16:22:47 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 100
Dec 20 16:22:47 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retried 100 times
Dec 20 16:22:47 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retry 90
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Received vote info
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Received vote info
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: seq = 2
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: vote = ACK
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: ring id = (1.6c0)
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Algorithm result vote is ACK
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Cast vote timer is now scheduled every 5000ms voting ACK.
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: seq = 2
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: vote = ACK
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: ring id = (1.6c0)
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Algorithm result vote is ACK
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Cast vote timer is now scheduled every 5000ms voting ACK.
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [SYNC ] Committing synchronization for corosync vote quorum service v1.0
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] total_votes=3, expected_votes=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] node 1 state=1, votes=1, expected=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] node 2 state=1, votes=1, expected=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] node 3 state=2, votes=1, expected=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] node 4 state=2, votes=1, expected=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] node 0 state=1, votes=1
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] lowest node id: 1 us: 1
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] highest node id: 2 us: 1
Dec 20 16:22:48 frdc1px001 corosync[4741]: notice [QUORUM] Members[2]: 1 2
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [QUORUM] sending quorum notification to (nil), length = 56
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [VOTEQ ] Sending quorum callback, quorate = 1
Dec 20 16:22:48 frdc1px001 corosync[4741]: notice [MAIN ] Completed service synchronization, ready to provide service.
Dec 20 16:22:48 frdc1px001 corosync[4741]: debug [TOTEM ] waiting_trans_ack changed to 0
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Votequorum quorum notify callback:
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Quorate = 1
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Node list (size = 5):
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 0 nodeid = 1, state = 1
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 1 nodeid = 2, state = 1
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 2 nodeid = 3, state = 2
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 3 nodeid = 4, state = 2
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 4 nodeid = 0, state = 0
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Algorithm decided to send list and result vote is No change
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Sending quorum node list seq = 51, quorate = 1
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Node list:
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 0 node_id = 1, data_center_id = 0, node_state = member
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 1 node_id = 2, data_center_id = 0, node_state = member
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 2 node_id = 3, data_center_id = 0, node_state = dead
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 3 node_id = 4, data_center_id = 0, node_state = dead
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Received quorum node list reply
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: seq = 51
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: vote = No change
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: ring id = (1.6c0)
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Algorithm result vote is No change
Dec 20 16:22:48 frdc1px001 corosync[4741]: [SYNC ] Committing synchronization for corosync vote quorum service v1.0
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] total_votes=3, expected_votes=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] node 1 state=1, votes=1, expected=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] node 2 state=1, votes=1, expected=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] node 3 state=2, votes=1, expected=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] node 4 state=2, votes=1, expected=5
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] node 0 state=1, votes=1
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] lowest node id: 1 us: 1
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] highest node id: 2 us: 1
Dec 20 16:22:48 frdc1px001 corosync[4741]: [QUORUM] Members[2]: 1 2
Dec 20 16:22:48 frdc1px001 corosync[4741]: [QUORUM] sending quorum notification to (nil), length = 56
Dec 20 16:22:48 frdc1px001 corosync[4741]: [VOTEQ ] Sending quorum callback, quorate = 1
Dec 20 16:22:48 frdc1px001 corosync[4741]: [MAIN ] Completed service synchronization, ready to provide service.
Dec 20 16:22:48 frdc1px001 corosync[4741]: [TOTEM ] waiting_trans_ack changed to 0
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Votequorum quorum notify callback:
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Quorate = 1
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Node list (size = 5):
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 0 nodeid = 1, state = 1
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 1 nodeid = 2, state = 1
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 2 nodeid = 3, state = 2
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 3 nodeid = 4, state = 2
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 4 nodeid = 0, state = 0
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Algorithm decided to send list and result vote is No change
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Sending quorum node list seq = 51, quorate = 1
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Node list:
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 0 node_id = 1, data_center_id = 0, node_state = member
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 1 node_id = 2, data_center_id = 0, node_state = member
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 2 node_id = 3, data_center_id = 0, node_state = dead
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: 3 node_id = 4, data_center_id = 0, node_state = dead
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Received quorum node list reply
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: seq = 51
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: vote = No change
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: ring id = (1.6c0)
Dec 20 16:22:48 frdc1px001 corosync-qdevice[4815]: Algorithm result vote is No change
Dec 20 16:22:48 frdc1px001 pmxcfs[4654]: [status] notice: cpg_send_message retried 98 times
Dec 20 16:22:48 frdc1px001 pve-firewall[4834]: firewall update time (7.812 seconds)
Dec 20 16:22:50 frdc1px001 pvestatd[4859]: got timeout
Dec 20 16:23:00 frdc1px001 systemd[1]: Starting Proxmox VE replication runner...
Dec 20 16:23:14 frdc1px001 watchdog-mux[3256]: client watchdog expired - disable watchdog updates
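
As a sanity check on the VOTEQ lines above, here is a minimal sketch of the vote arithmetic. It assumes the default simple-majority rule (quorum = expected_votes // 2 + 1) and that "node 0" is the QDevice vote. The variable and function names are illustrative only and do not come from corosync.

```python
# Values copied from the VOTEQ lines logged on node 1 at 16:22:48.
expected_votes = 5             # 4 nodes + 1 QDevice vote
member_node_votes = {1: 1, 2: 1}   # nodes 1 and 2 report state=1 (member)
qdevice_votes = 1              # "node 0 state=1, votes=1" (assumed to be the QDevice)


def quorum_threshold(expected: int) -> int:
    """Simple majority: strictly more than half of the expected votes."""
    return expected // 2 + 1


total_votes = sum(member_node_votes.values()) + qdevice_votes
quorate = total_votes >= quorum_threshold(expected_votes)

print(f"total_votes={total_votes}, needed={quorum_threshold(expected_votes)}, quorate={quorate}")
```

Under these assumptions the two remaining nodes plus the QDevice vote exactly reach the threshold, which matches the "quorate = 1" callback in the log, so the vote count alone does not explain the watchdog expiry.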

And in the QnetDevice (corosync-qnetd) logs:
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug Client 172.18.1.201:39012 (cluster PXv5, node_id 1) sent membership node list.
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug msg seq num = 50
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug ring id = (1.6c0)
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug heuristics = Undefined
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug node list:
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug node_id = 1, data_center_id = 0, node_state = not set
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug node_id = 2, data_center_id = 0, node_state = not set
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug ffsplit: Membership for cluster PXv5 is not yet stable
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug Algorithm result vote is Wait for reply
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug Client 172.18.1.202:39084 (cluster PXv5, node_id 2) sent membership node list.
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug msg seq num = 47
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug ring id = (1.6c0)
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug heuristics = Undefined
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug node list:
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug node_id = 1, data_center_id = 0, node_state = not set
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug node_id = 2, data_center_id = 0, node_state = not set
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug ffsplit: Membership for cluster PXv5 is not yet stable
Dec 20 16:22:27 frsapx000 corosync-qnetd: Dec 20 16:22:27 debug Algorithm result vote is Wait for reply
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 warning Client 172.18.2.201:38160 doesn't sent any message during 20000ms. Disconnecting
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Client 172.18.2.201:38160 (init_received 1, cluster PXv5, node_id 3) disconnect
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug ffsplit: Membership for cluster PXv5 is now stable
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug ffsplit: Quorate partition selected
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node list:
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 1, data_center_id = 0, node_state = not set
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 2, data_center_id = 0, node_state = not set
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug ffsplit: No client gets NACK
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Sending vote info to client 172.18.1.202:39084 (cluster PXv5, node_id 2)
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug msg seq num = 3
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug vote = ACK
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Sending vote info to client 172.18.1.201:39012 (cluster PXv5, node_id 1)
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug msg seq num = 2
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug vote = ACK
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Client 172.18.1.202:39084 (cluster PXv5, node_id 2) replied back to vote info message
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug msg seq num = 3
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Client 172.18.1.201:39012 (cluster PXv5, node_id 1) replied back to vote info message
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug msg seq num = 2
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug ffsplit: All ACK votes sent for cluster PXv5
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Client 172.18.1.202:39084 (cluster PXv5, node_id 2) sent quorum node list.
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug msg seq num = 48
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug quorate = 1
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node list:
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 1, data_center_id = 0, node_state = member
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 2, data_center_id = 0, node_state = member
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 3, data_center_id = 0, node_state = dead
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 4, data_center_id = 0, node_state = dead
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Algorithm result vote is No change
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Client 172.18.1.201:39012 (cluster PXv5, node_id 1) sent quorum node list.
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug msg seq num = 51
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug quorate = 1
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node list:
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 1, data_center_id = 0, node_state = member
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 2, data_center_id = 0, node_state = member
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 3, data_center_id = 0, node_state = dead
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug node_id = 4, data_center_id = 0, node_state = dead
Dec 20 16:22:48 frsapx000 corosync-qnetd: Dec 20 16:22:48 debug Algorithm result vote is No change

And after 30 s, the watchdog reboots the working servers.
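
To make the timing easier to see, the sketch below computes the gaps between three timestamps taken from the logs above. Only the timestamps come from the logs. The event labels are my own summaries, and the script is just a reading aid, not a diagnosis.

```python
from datetime import datetime

# (timestamp, summary) pairs; all events are on Dec 20, so only the time is parsed.
events = [
    ("16:22:27", "qnetd receives membership lists, answers 'Wait for reply'"),
    ("16:22:48", "qnetd disconnects node 3 (20000 ms timeout) and sends ACK; node 1 is quorate"),
    ("16:23:14", "watchdog-mux on node 1: client watchdog expired"),
]

parsed = [(datetime.strptime(ts, "%H:%M:%S"), label) for ts, label in events]

# Seconds elapsed between each event and the one before it.
gaps = [
    int((later[0] - earlier[0]).total_seconds())
    for earlier, later in zip(parsed, parsed[1:])
]

for (when, label), gap in zip(parsed[1:], gaps):
    print(f"+{gap:>2}s  {when:%H:%M:%S}  {label}")
```

Read this way, qnetd held its vote for 21 s until the connection from the dead node 3 timed out, and the watchdog client expired 26 s after node 1 had been told it was quorate again.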


Please help.
 
