I did this because of the following incident:
the master node died, and the LRM was stuck waiting for the agent lock. Since the master was down, the LRM could not get any activity from pve-ha-crm. So I decided to shut down all the nodes and reboot only node19. After node19 booted, we ran
pvecm expected 1 to make /etc/pve writable. Then I modified the corosync totem addresses from 10.10.30.0 (ring0) to 10.10.30.169 and from 10.20.30.0 (ring1) to 10.20.30.169, hoping the HA cluster would make node19 the master node.
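For reference, this is roughly the kind of edit I made in the totem section of /etc/pve/corosync.conf. This is only a sketch assuming a corosync 2.x style config with bindnetaddr; the exact layout may differ between versions:

```
totem {
  ...
  interface {
    ringnumber: 0
    bindnetaddr: 10.10.30.169   # was 10.10.30.0 (the network address)
  }
  interface {
    ringnumber: 1
    bindnetaddr: 10.20.30.169   # was 10.20.30.0
  }
}
```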
I assumed that after booting the remaining nodes they would pick up the new config from node19, but they didn't. They ended up running as separate clusters: node19 in one cluster and all the remaining nodes in a different one.
Finally, I decided to shut down node19, remove all its VM resources from HA (with the ha-manager remove sid command), and then start all the VPSs again.
None of the VPSs from node19 could start, so I had to
MOVE all their configs from /etc/pve/nodes/node19/qemu-server/*.conf to /etc/pve/nodes/node15/qemu-server/ and start them manually on node15.
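The move above was equivalent to something like the following sketch. The helper function is hypothetical (not a Proxmox command, just plain mv in a loop); inside /etc/pve this effectively reassigns each VM to the target node, since pmxcfs keeps one config file per node:

```shell
#!/bin/sh
# Hypothetical helper: move all qemu-server config files from one
# node directory to another. Under /etc/pve this reassigns the VMs
# to the target node.
move_vm_configs() {
    src="$1"
    dst="$2"
    for conf in "$src"/*.conf; do
        [ -e "$conf" ] || continue   # glob did not match: nothing to move
        mv "$conf" "$dst"/
    done
}

# What we ran was equivalent to:
# move_vm_configs /etc/pve/nodes/node19/qemu-server /etc/pve/nodes/node15/qemu-server
```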
Can we add node19 back to the cluster without reinstalling it? The VM configs are probably still listed under this directory: /etc/pve/nodes/node19/qemu-server/
Back to the question of this thread: can we find out which server held (controlled) pve-ha-crm on the date of the incident? We want to examine more logs on that server. node19 is shut down now, so maybe it was the node controlling pve-ha-crm on that date, and it may have more information for us.