Cannot create cluster at all

Bisser

New Member
May 13, 2019
12
0
1
49
In the web GUI I click the Create Cluster button, type the name of the cluster, and click Create.
I get an error: cluster config '/etc/pve/corosync.conf' already exists (500).

What to do next?

As far as I remember, I got into this situation as follows:
In the beginning I successfully created a cluster, then I tried to add a node. I was shown some error, which I don't remember, and there was a spinning wheel. I waited for some time, then closed the browser and opened it again. The second node I was adding got messed up, so I reinstalled it because there was nothing on it. On the node where I created the cluster, the failed node still shows in the web UI as not accessible. Then I searched the web and found some commands to run to delete the cluster. I don't remember everything I tried. So now I am able to click the Create Cluster button in the web UI, but I get the above error. Please advise.
 

Glowsome

Member
Jul 25, 2017
37
3
8
46
Check the following:

- On the node where you are trying to create the cluster, run pvecm nodes from a command line/console session.
If this returns information, the node you are trying to create the cluster on is already acting as a member from a previous attempt to create a cluster.

In short: don't try to recreate the cluster, as it seems to have been created already. Focus on the error from the node add; your problem is there.
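
As a rough sketch of that check (not an official procedure): the error message suggests the create step refuses to run while /etc/pve/corosync.conf is present, so a small shell helper can test for that file before retrying. The function names has_cluster_config and describe_node are made up for illustration; pvecm nodes is the command mentioned above.

```shell
# has_cluster_config [FILE]: succeed if a cluster config file is present.
# FILE defaults to the path named in the error message.
has_cluster_config() {
    [ -e "${1:-/etc/pve/corosync.conf}" ]
}

# describe_node [FILE]: print a one-line summary of what was found.
describe_node() {
    if has_cluster_config "$1"; then
        echo "cluster config present: inspect membership with 'pvecm nodes'"
    else
        echo "no cluster config found: node looks standalone"
    fi
}
```

Calling describe_node without arguments checks the default path; passing a different file is only useful for trying the helper out.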
 

Bisser

This is what I get from pvecm nodes:
Nodeid  Votes  Name
     1      1  51.89.X.X (local)

I cannot add a new node because I cannot access the Join Information button - it is Disabled.
 

t.lamprecht

Proxmox Staff Member
Staff member
Jul 28, 2015
1,489
212
63
South Tyrol/Italy
Bisser said: "I cannot add a new node because I cannot access the Join Information button - it is Disabled."
So you could click neither "Create Cluster" nor "Join Information"?

Can you post the output of:

Code:
pvesh get /cluster/config/join --output-format=yaml
 

Bisser

Create Cluster is enabled, but I cannot create a cluster because I get: cluster config '/etc/pve/corosync.conf' already exists (500).
Join Information is disabled. Join Cluster is enabled.

the result from
pvesh get /cluster/config/join --output-format=yaml
is
unable to read '/etc/pve/nodes/PX-XXX1/pve-ssl.pem' - No such file or directory

Output of
cat /etc/pve/corosync.conf

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: GE-XXX3
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 51.89.X.XXX
  }
  node {
    name: PX-XXX1
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 139.99.XXX.X
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: Bxxxxxx
  config_version: 2
  interface {
    bindnetaddr: 51.89.X.XXX
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  version: 2
}
 

t.lamprecht

Bisser said: "unable to read '/etc/pve/nodes/PX-XXX1/pve-ssl.pem' - No such file or directory"
Huh, did you fiddle with your SSL certificates? The above seems to be the underlying issue in your situation.

Can you run
Code:
pvecm updatecerts
and try again?
 

Bisser

I haven't touched anything related to certificates. But I remember that when I was trying to add the second server to the cluster, the error shown was something about SSL.
It could be because I tried to change the node name of the server I was adding. That created an even bigger mess, so I reverted the node name, but it still didn't work. I then reinstalled the second server, and now I have a problem with the main node.

pvecm updatecerts - returns
(re)generate node files
merge authorized SSH keys and known hosts

pvesh get /cluster/config/join --output-format=yaml
again the same
unable to read '/etc/pve/nodes/PX-XXX1/pve-ssl.pem' - No such file or directory


Isn't there a button to just get rid of everything cluster related and start from scratch?
 

t.lamprecht

Bisser said: "Isn't there a button to just get rid of everything cluster related and start from scratch?"
That alone won't help you; the missing SSL certificate file needs to be fixed too.

But it seems that both nodes are already added in your corosync configuration, so you are probably not quorate (see pvecm status), and thus the updatecerts command could not regenerate the missing SSL file.
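
To make the quorum point concrete, here is a minimal sketch, assuming corosync's default majority rule (more than half of all votes) and one quorum_votes line per node, as in the config posted above. The function names total_votes and quorum_needed are hypothetical helpers, not part of any Proxmox tool.

```shell
# total_votes FILE: sum all quorum_votes values found in FILE.
total_votes() {
    awk '$1 == "quorum_votes:" { sum += $2 } END { print sum + 0 }' "$1"
}

# quorum_needed FILE: votes required for a majority, floor(total / 2) + 1.
quorum_needed() {
    total=$(total_votes "$1")
    echo $(( total / 2 + 1 ))
}
```

Run against the two-node config posted above, quorum_needed would print 2, while pvecm nodes showed only one node with one vote. That would explain why the remaining node is not quorate.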

Maybe the best thing really is to kill the cluster. There is no simple button for this, because separating a cluster in a general way is not really possible: nodes share resources depending on the specific setup. Here are the required steps.

Note, this is something I'd only do in your situation and do not recommend in general (for others reading this):
Code:
# DANGEROUS, only do for single node cluster or those where all other nodes got re-installed/purged

# below two steps are only required if not quorate
systemctl stop pve-cluster
# restart in local mode
pmxcfs -l

# remove all corosync cluster configuration traces
rm -f /etc/pve/corosync.conf
rm -rf /etc/corosync/*
systemctl stop corosync pve-cluster
systemctl start pve-cluster
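
Since the steps above delete files, it may be worth keeping a copy first. The following is only a sketch under assumptions: backup_cluster_conf is a made-up helper, and the destination /root/cluster-backup in the usage note is a hypothetical path. It copies whichever of the given files or directories exist and skips the rest.

```shell
# backup_cluster_conf DEST_DIR [PATH...]: copy each existing PATH into DEST_DIR.
# Paths that do not exist are skipped silently.
backup_cluster_conf() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -e "$f" ]; then
            # -a keeps permissions and copies directories recursively
            cp -a "$f" "$dest/" || return 1
        fi
    done
}
```

For example, backup_cluster_conf /root/cluster-backup /etc/pve/corosync.conf /etc/corosync would need to run while /etc/pve is still mounted, that is before the rm commands.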
 

t.lamprecht

And then re-do a "pvecm updatecerts" and check whether the /etc/pve/nodes/PX-XXX1/pve-ssl.pem file is there.
 

Bisser

I saw plenty of articles about messed up clusters. This whole process seems quite fragile. In my case I think the time I spent reading about what to do is much more than it would take to simply reinstall and move the VMs. This time I will install the servers and set up the cluster first. Once it is running, I will install the VMs. Hopefully it doesn't fall apart at some point.
 

Bisser

While I was searching for solutions, I found that there seem to be many people with cluster problems: the cluster gets stuck and nothing works except a reinstall. I will try the steps you suggested, but first I will copy everything to another server just in case. I don't want to lose anything.
 

Glowsome

Bisser said: "While I was searching for solutions, I found that there seem to be many people with cluster problems: the cluster gets stuck and nothing works except a reinstall. I will try the steps you suggested, but first I will copy everything to another server just in case. I don't want to lose anything."
I've been running a 4-node cluster without huge problems for over 1.5 years now, apart from the perils I usually create myself.
And so far, with the help of the docs and forums, I've always been able to get it all back to a 'nice and tidy' state.
 

Bisser

OK, this cluster thing doesn't work. I did a clean install on 2 servers hosted at OVH: VPS Proxmox VE 5 (ZFS). Both servers have full access to each other, with no firewall restrictions. What I did:
1. Opened the web interface on SVR1 and clicked Create Cluster on SVR1. The cluster was created.
2. Opened the web interface on SVR2 and clicked Join Cluster. I copied the join information from SVR1 and used its root pass.
3. I got the following on SVR2
Establishing API connection with host '51.89.21.201'
Login succeeded.
Request addition of this node
Join request OK, finishing setup locally
stopping pve-cluster service

Also, in the background on SVR2 I can see a spinning wheel, and it says: permission denied - invalid PVE ticket (401).
It froze there and has stayed like that for about 15 minutes already. I don't think anything else will happen. This is what happened the last time I tried as well.

I rebooted both servers.

Now I can no longer open the web interface of SVR2.
SVR1 shows 2 nodes, but the second one is not active, and SVR1 says it is a standalone node, not a cluster anymore.

A clean install, and it doesn't work. Strange. Any suggestions are welcome. I will not attempt to repair it; I just want to make it work from a clean install. If anyone can provide instructions on how this can be done, it would be great. Thanks.
 