PVE cluster and GUI

strix

Active Member
Mar 20, 2018
We have a cluster with 3 nodes, and we have a very strange situation: if the pve-cluster service starts on node3, we lose all nodes and VMs in the GUI.

Proxmox version 5.4.7
Linux version 4.15.18-41

What can we do about that? Any advice?
Thanks
 
try again with -force:
#pvecm add ip_node1 -ring0_addr ip_node3 -force
 
Did you enable multicast on your network? Is your hosts file correct?

Could you post a screenshot of the PVE GUI when the problem comes up again?
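
If multicast is in doubt, the omping test from the Proxmox multicast notes is a quick way to verify it. Run it on all three nodes at the same time (srv1/srv2/srv3 stand in for your node hostnames here):

# short flood test, started simultaneously on every node
omping -c 10000 -i 0.001 -F -q srv1 srv2 srv3
# longer test (about 10 minutes) to catch IGMP querier timeouts
omping -c 600 -i 1 -q srv1 srv2 srv3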
 
After several restarts of the 3rd node (something I had done quite a few times before even posting here), nodes 1 & 2 are part of the cluster while node 3 believes it's the only member of the same cluster.
Here is the pvecm status from node 2
root@srv2:~# pvecm status
Quorum information
------------------
Date: Sat Jun 29 19:09:10 2019
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 0x00000003
Ring ID: 1/13860
Quorate: Yes

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 2
Quorum: 2
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 172.16.254.1
0x00000003 1 172.16.254.2 (local)

And here is the output of the same command on node 3
root@srv3:~# pvecm status
Quorum information
------------------
Date: Sat Jun 29 20:16:31 2019
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2/13396
Quorate: No

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid Votes Name
0x00000002 1 172.16.254.3 (local)
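
Note that node3 is on a different ring (Ring ID 2/13396 vs 1/13860 on the others) and only sees itself, i.e. the cluster is partitioned. If multicast between node3 and the rest turns out to be the culprit, one option on corosync 2.x (PVE 5.x) is to force unicast transport. A minimal sketch of the totem section in /etc/pve/corosync.conf, assuming your existing cluster name and a config_version one higher than the current value:

totem {
  cluster_name: yourcluster     # keep the existing name
  config_version: 4             # must be incremented on every change
  version: 2
  transport: udpu               # unicast instead of multicast
  interface {
    bindnetaddr: 172.16.254.0   # network of the ring0 addresses shown above
    ringnumber: 0
  }
}

Edit it on a quorate node so pmxcfs distributes the change, then restart corosync on all nodes.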
 
After several attempts, every time node3 joins the cluster we lose all nodes / VMs in the GUI. We have tried several ways... multicast, unicast... but unfortunately the result is the same every time. The hosts files are correct.

Any help?
 

Attachments

  • GUI-1.jpg (screenshot of the GUI when the problem occurs)
Hey Badji,

we can't do that, because without a working cluster we have no GUI. The cluster isn't working correctly right now; we would have to remove node3 from the cluster to get the GUI back, but that is not a solution right now.

Also, we must transfer VMs from one node to another in order to update to Corosync 3, and after that to upgrade to PVE 6.
 
So to sum up, you have a broken cluster and cannot repair it by bringing the third node online. Fixing this is possible but could be painful, with significant potential downtime. What I'd recommend depends highly on how much spare cluster capacity you have and what type of storage you're using. Do verify that you don't have network problems, especially on the corosync interface.
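
Before touching anything, a few checks worth running on node3 (and comparing with the other two); these are standard PVE 5.x tools and paths:

# the cluster-wide config and the local copy should be identical on every node
diff /etc/pve/corosync.conf /etc/corosync/corosync.conf
# ring status and the local view of quorum/membership
corosync-cfgtool -s
corosync-quorumtool -s
# logs from corosync and pmxcfs since boot often show why the GUI drops the nodes
journalctl -b -u corosync -u pve-cluster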

IF you have clustered storage, migrate all your assets to one of the surviving nodes and evict the misbehaving node.
IF you do not have clustered storage, you'll need to migrate the assets offline to one of the surviving nodes manually, either with zfs send (if available to you) or with vzdump; see the sketch below.
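
A rough sketch of the offline path with vzdump/qmrestore (VM ID 100 and the storage name "local" are placeholders):

# on the broken node: back up the stopped VM to a storage the survivor can read
vzdump 100 --mode stop --storage local --compress lzo
# on the surviving node: restore it under the same VM ID
qmrestore /var/lib/vz/dump/vzdump-qemu-100-<timestamp>.vma.lzo 100

Or, if the disks live on ZFS, replicate the dataset directly (dataset name is again a placeholder; 172.16.254.1 is node1 from the outputs above):

zfs snapshot rpool/data/vm-100-disk-0@move
zfs send rpool/data/vm-100-disk-0@move | ssh 172.16.254.1 zfs receive rpool/data/vm-100-disk-0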

Once you've done that, reinstall Proxmox on the broken node MAKING SURE TO GIVE IT A NEW NAME, re-add it to the cluster, and away you go.
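
The eviction and re-join are just a couple of pvecm calls; a sketch assuming the broken node is srv3 and the reinstalled box gets the new name srv4:

# on a quorate node (srv1 or srv2): remove the dead node from the cluster
pvecm delnode srv3
# optionally clean up its leftover config directory
rm -r /etc/pve/nodes/srv3
# on the freshly installed srv4: join using the IP of an existing member
pvecm add 172.16.254.1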
 
I had the same problem with the old version; I resolved it as in my previous suggestion.
I advise you to back up your VMs, do a fresh installation with the new version, and restore them.
I have created several clusters on dozens of physical machines and it works very well.
I can even give you remote access to my POC to test all that if you want.

Moula.
Thanks.
 
