HA cluster - both hosts always reboot without waiting for each other

glade

New Member
Feb 8, 2021
We have two Proxmox hosts in a cluster. I have defined two HA groups and assigned the LXC containers to the groups. The request state is set to "started" for the containers.
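For reference, a setup like the one described could have been created roughly like this. This is only an illustrative sketch; the group names are made up, and the node priorities are one plausible way to pin containers to a preferred host:

```shell
# Two HA groups, each preferring one node (higher priority = preferred).
# Group names "grp-prox05"/"grp-prox06" are examples, not from the thread.
ha-manager groupadd grp-prox05 --nodes "h-proxmox05:2,h-proxmox06:1"
ha-manager groupadd grp-prox06 --nodes "h-proxmox06:2,h-proxmox05:1"

# Register containers as HA resources with request state "started":
ha-manager add ct:100 --state started --group grp-prox05
ha-manager add ct:200 --state started --group grp-prox06
```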

When I restart one of the Proxmox hosts, I briefly see that one or two LXC containers are migrated. Then the connection breaks, and both host machines reboot.

Is there any way to prevent this? The idea was that during a reboot all LXC containers would automatically be moved to the other host, so that the second host could be restarted later. Or have I misunderstood the system?

many greetings
 
A quorum is present. However, I do not have shared storage; instead I have built everything on ZFS volumes.

Currently, the quorum always depends on one host. However, I can't find an example anywhere in which both hosts have quorum.
I had always assumed that the quorum moves to the other host on restart. But it is also interesting that both hosts are restarted even when I restart the host that does not hold the quorum.

# ha-manager status
quorum OK
master h-proxmox05 (active, Tue Feb 9 09:57:53 2021)
lrm h-proxmox05 (active, Tue Feb 9 09:57:55 2021)
lrm h-proxmox06 (active, Tue Feb 9 09:57:55 2021)
service ct:100 (h-proxmox05, started)
service ct:101 (h-proxmox05, started)
service ct:102 (h-proxmox05, started)
service ct:103 (h-proxmox05, started)
service ct:104 (h-proxmox05, started)
service ct:105 (h-proxmox05, started)
service ct:106 (h-proxmox05, started)
service ct:107 (h-proxmox05, started)
service ct:108 (h-proxmox05, started)
service ct:109 (h-proxmox05, started)
service ct:110 (h-proxmox05, started)
service ct:111 (h-proxmox05, started)
service ct:112 (h-proxmox05, started)
service ct:200 (h-proxmox06, started)
service ct:201 (h-proxmox06, started)
service ct:202 (h-proxmox06, started)
service ct:203 (h-proxmox06, started)
service ct:204 (h-proxmox06, started)


# pvecm status
Cluster information
-------------------
Name: proxmox-05-06
Config Version: 2
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Feb 9 10:01:38 2021
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000001
Ring ID: 1.e8
Quorate: Yes

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.24.2.22 (local)
0x00000002 1 172.24.2.21
 
HA cannot work with only two voters in the cluster: if one goes down, the other cannot tell whether it is rightfully still working or whether the cluster network split/broke, so it must not do anything, to avoid a split brain.
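The underlying arithmetic: quorum requires a strict majority of the expected votes, i.e. floor(n/2) + 1. A quick illustration of why two nodes cannot tolerate any failure while three nodes can:

```shell
# Majority needed for quorum at different cluster sizes:
# with 2 nodes both must be up; with 3 nodes one may fail.
for n in 2 3; do
    echo "nodes=$n majority=$(( n / 2 + 1 ))"
done
# prints:
# nodes=2 majority=2
# nodes=3 majority=2
```

So in your two-node cluster, rebooting either host drops the survivor below the required 2 of 2 votes; it loses quorum and the HA stack fences (self-reboots) it rather than risk running the same containers twice.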

If you want HA then it would be good to read through the basic requirements first:
https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#_requirements

Note, you can work around the missing third node with an external voter (which must NOT be hosted on either of the two existing nodes) - but in general a third node is better, as load can be distributed better during an outage.
https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support
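Roughly, the QDevice setup from that chapter looks like this (the IP of the external machine is an example; assuming a Debian-based external host and passwordless root SSH from the node, as the docs describe):

```shell
# On the external machine (the extra voter, outside the cluster):
apt install corosync-qnetd

# On every Proxmox node:
apt install corosync-qdevice

# On one Proxmox node, register the external voter (example IP):
pvecm qdevice setup 172.24.2.30

# Afterwards "pvecm status" should show 3 expected votes and a Qdevice entry,
# so the surviving node keeps quorum when the other one reboots.
```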