PVE cluster not working anymore after updating system / power outage

hardwareadictos

Good morning

Yesterday I updated the Proxmox packages on my two-node cluster through the GUI.

For some reason my nodes cannot live together on the cluster anymore:

1655288098839.png

Time is synced on both servers.

Corosync config wasn't modified:

1655288324948.png


I tried restarting the core Proxmox services with no luck:

service pve-cluster restart && service pvedaemon restart && service pvestatd restart && service pveproxy restart

I also tried installing systemd-timesyncd (which wasn't present on the system), but it still fails.

I had to migrate my critical VMs/LXCs to a separate standalone Proxmox host.

I tried manually setting the expected votes to 1 by running pvecm expected 1 while the other node was powered off, but this just moves the problem to the other node.
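
For context on why pvecm expected 1 only helps the node it is run on: corosync's votequorum needs a strict majority of the expected votes, so in a two-node cluster with one vote each, a node that cannot see its peer holds 1 of 2 votes and is not quorate. A minimal sketch of that arithmetic follows. The function names are made up for illustration, and it ignores special options such as two_node or a QDevice.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only: majority arithmetic for quorum.

# Votes needed for quorum: a strict majority of the expected votes.
quorum_needed() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# Prints "yes" if the votes a node can see reach quorum, otherwise "no".
is_quorate() {
    visible=$1
    expected=$2
    if [ "$visible" -ge "$(quorum_needed "$expected")" ]; then
        echo yes
    else
        echo no
    fi
}

is_quorate 1 2   # two-node cluster, peer unreachable: prints "no"
is_quorate 1 1   # same node after lowering expected votes to 1: prints "yes"
```

Lowering the expected votes changes the threshold only on the node where it is run, which matches the observation that the other node then shows the same symptom.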

Can someone help me?

Thanks in advance.
 

Attachments

  • node2_packets.txt
    39.3 KB · Views: 0
  • node1_packets.txt
    39.2 KB · Views: 2
I had a similar issue this morning: a node in a small cluster would boot but no PVE services were available. The reason: I had moved the primary (and until a few days ago off-cluster) Nameserver onto that node. This does not work if you use DNS for resolution of the names of the cluster nodes (and did only configure one single NS...)

So I learned the hard way that a cluster-wide correct content of /etc/hosts is important under some circumstances :)
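
A quick way to check this on each node is to confirm that every cluster node name has an entry in the hosts file. This is only a rough sketch: check_hosts is a made-up helper, and pve1 / pve2 in the usage comment are placeholder node names, not the names from this thread.

```shell
#!/bin/sh
# Hypothetical check: does a hosts-format file contain an entry for each node name?
# Usage: check_hosts FILE NAME...
check_hosts() {
    file=$1
    shift
    missing=0
    for name in "$@"; do
        # Strip comments, then look for the name in any field after the address.
        if awk -v n="$name" '
            { sub(/#.*/, ""); for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file"; then
            echo "$name: ok"
        else
            echo "$name: MISSING"
            missing=1
        fi
    done
    return "$missing"
}

# Example (placeholder node names): check_hosts /etc/hosts pve1 pve2
```

The function returns non-zero if any name is missing, so it can be used in a script that runs the same check on every node.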

Best regards
 
Thank you @UdoB :)

In my case I didn't move any nameserver or modify the hosts file; I just ran an OS update from the web GUI.

I will check those later and report back anyway. Thank you!
 
Edit: I checked yesterday. There are no anomalies in the hosts file, and the primary nameserver is outside the cluster hosts (it runs on a separate Proxmox host).
 
Just to add some more information:

1655379374906.png

I added two votes on the primary host by raising the expected votes and the config version in corosync.conf. The change was applied successfully and persists across reboots, but for some reason this configuration is not replicated to the secondary host.

On node 2 I have to set the expected votes to 1 with pvecm expected 1 to get the web GUI working; otherwise it is impossible to log in because the node has no quorum.

The first node boots with quorum.

In the web GUI, the two nodes cannot see each other:
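
One way to confirm that the change did not reach the second node is to compare the config_version line in each node's copy of corosync.conf. A rough sketch, assuming the two copies have been saved side by side; the function names and the file names in the usage comment are placeholders, not part of any Proxmox tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the config_version value from a corosync.conf-style file.
config_version() {
    awk -F: '/^[[:space:]]*config_version[[:space:]]*:/ {
        gsub(/[[:space:]]/, "", $2); print $2; exit
    }' "$1"
}

# Compare the config_version of two copies of the file.
compare_versions() {
    ver_a=$(config_version "$1")
    ver_b=$(config_version "$2")
    if [ "$ver_a" = "$ver_b" ]; then
        echo "in sync (config_version $ver_a)"
    else
        echo "out of sync ($1 has $ver_a, $2 has $ver_b)"
    fi
}

# Example (placeholder file names):
# compare_versions node1-corosync.conf node2-corosync.conf
```

If the versions differ, the node with the lower number is still running the old configuration.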

Node 1:

1655380885450.png

Node 2 (after applying pvecm expected 1):

1655380958394.png


It seems there is no connectivity, as ping doesn't work between the cluster interfaces:

1655381673136.png

I have to check that.
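
To repeat the ping test in the same way on both nodes, something like the following could be used. It is only a sketch: both function names are made up, and it assumes the node addresses appear as ring0_addr entries in corosync.conf and are IPv4 addresses or hostnames (IPv6 addresses would need different parsing).

```shell
#!/bin/sh
# Hypothetical helper: list the ring0_addr values from a corosync.conf-style file.
ring_addrs() {
    awk -F: '/^[[:space:]]*ring0_addr[[:space:]]*:/ {
        gsub(/[[:space:]]/, "", $2); print $2
    }' "$1"
}

# Ping every listed address three times and report the result per address.
ping_ring_addrs() {
    for addr in $(ring_addrs "$1"); do
        if ping -c 3 -W 2 "$addr" >/dev/null 2>&1; then
            echo "$addr: reachable"
        else
            echo "$addr: NOT reachable"
        fi
    done
}

# Example (run on each node): ping_ring_addrs /etc/corosync/corosync.conf
```

If a node cannot reach its peer's address here, the cluster network itself is worth checking before looking further at corosync.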
 
