HA cluster - both hosts always reboot without waiting for each other

glade

New Member
Feb 8, 2021
We have two Proxmox hosts in a cluster. I have defined two HA groups and assigned the LXC containers to them. The requested state for the containers is set to "started".
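
For reference, the group and resource definitions in /etc/pve/ha/groups.cfg and /etc/pve/ha/resources.cfg look roughly like the sketch below (the group name and node priorities are only assumed examples, not my actual config):

# /etc/pve/ha/groups.cfg - one group per preferred node (example values)
group: prefer-05
        nodes h-proxmox05:2,h-proxmox06:1
        nofailback 0
        restricted 0

# /etc/pve/ha/resources.cfg - container pinned to a group, requested state "started"
ct: 100
        group prefer-05
        state started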

When I restart one of the Proxmox hosts, I briefly see that one or two LXC containers are migrated. Then the connection breaks, and both host machines reboot.

Is there any way to prevent this? The idea was actually that during the reboot all LXC containers are automatically moved to the other host, so that the second host can be restarted later. Or have I misunderstood the system?

Many greetings
 
A quorum is present. However, I do not have shared storage; I have built everything on ZFS volumes.

Currently, quorum always depends on one host, and I can't find an example anywhere in which both hosts hold quorum.
I had always assumed that quorum moves to the other host on restart. But interestingly, even when I restart the host that does not currently hold quorum, both hosts are rebooted.

# ha-manager status
quorum OK
master h-proxmox05 (active, Tue Feb 9 09:57:53 2021)
lrm h-proxmox05 (active, Tue Feb 9 09:57:55 2021)
lrm h-proxmox06 (active, Tue Feb 9 09:57:55 2021)
service ct:100 (h-proxmox05, started)
service ct:101 (h-proxmox05, started)
service ct:102 (h-proxmox05, started)
service ct:103 (h-proxmox05, started)
service ct:104 (h-proxmox05, started)
service ct:105 (h-proxmox05, started)
service ct:106 (h-proxmox05, started)
service ct:107 (h-proxmox05, started)
service ct:108 (h-proxmox05, started)
service ct:109 (h-proxmox05, started)
service ct:110 (h-proxmox05, started)
service ct:111 (h-proxmox05, started)
service ct:112 (h-proxmox05, started)
service ct:200 (h-proxmox06, started)
service ct:201 (h-proxmox06, started)
service ct:202 (h-proxmox06, started)
service ct:203 (h-proxmox06, started)
service ct:204 (h-proxmox06, started)


# pvecm status
Cluster information
-------------------
Name: proxmox-05-06
Config Version: 2
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Tue Feb 9 10:01:38 2021
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000001
Ring ID: 1.e8
Quorate: Yes

Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.24.2.22 (local)
0x00000002 1 172.24.2.21
 
HA cannot work with only two voters in the cluster: if one node goes down, the other cannot tell whether it is rightfully still running or whether the cluster network split or broke, so it must not do anything on its own in order to avoid a split brain.
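
In concrete terms, corosync's votequorum requires a strict majority of the expected votes:

quorum = floor(expected_votes / 2) + 1 = floor(2 / 2) + 1 = 2

So with only two voters, the node that stays up holds just 1 of the required 2 votes and can never be quorate on its own - that is the "Quorum: 2" in the pvecm output above. A node with active HA services that loses quorum gets fenced by its watchdog, which is why the second host reboots as well.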

If you want HA then it would be good to read through the basic requirements first:
https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#_requirements

Note: you can work around the need for a third node with an external voter (which must NOT be hosted on either of the two existing nodes) - but in general a real third node is better, as load can be distributed better during an outage.
https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support
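
A rough sketch of the QDevice setup from that chapter (the external voter's address below is just a placeholder):

# on the external machine (must not be one of the two cluster nodes):
apt install corosync-qnetd

# on both Proxmox nodes:
apt install corosync-qdevice

# on one cluster node, register the external voter:
pvecm qdevice setup 172.24.2.100

# pvecm status should then report 3 expected votes
pvecm status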
 
