Setting up a cluster and made a mistake

mason64

Member
Apr 18, 2022
Hi

I started to set up a cluster with 2 nodes. The first node was running fine, and I installed PVE on a second node. Both are on the same network and I can reach both without issues.

On pve01 I started to create a cluster but didn't notice that the IP it showed was wrong: it said 10.0.0.49, but pve01 is on 10.0.0.41 and pve02 is on 10.0.0.42.

pve01 created the cluster. I went to add pve02, but it failed with an error about 10.0.0.49, which isn't an IP I use.

I tried this from a post on here:
Code:
systemctl stop pve-cluster corosync
pmxcfs -l
rm -R /etc/corosync/*
rm -R /etc/pve/nodes
killall pmxcfs
systemctl start pve-cluster

I logged back in to pve01, but my one VM is gone and the cluster is still set up. When I go to Cluster under Datacenter I get this error: '/etc/pve/nodes/pve01/pve-ssl.pem' does not exist! (500)

Is there any way I can get pve01 back?

Just an update: I can't even get on to the GUI for pve01 now.

pve02 is fine because it never ended up joining the cluster.

Thanks
 
Hi @spirit

Thanks for the reply.

Yes, I did, because on that post some of the replies said it fixed their issue.

All I have in that directory is config.db, config.db-shm and config.db-wal.

I think it all got messed up because the cluster used the wrong IP address, which was stored in the hosts file. When I changed my network adapter I forgot to update the hosts file, and creating the cluster must have taken the IP from there.

Thanks
Dave
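
For anyone hitting the same thing, a stale hosts entry like this can be spotted before creating a cluster. The snippet below is only a sketch: hosts_ip is a hypothetical helper (not a Proxmox tool) that prints the address a hosts file maps a given hostname to, so it can be compared with the address actually configured on the interface.

```shell
#!/bin/sh
# Sketch only: hosts_ip is a made-up helper name, not part of Proxmox.
# Prints the first address that a hosts file maps to the given hostname.
hosts_ip() {
    # $1 = hostname to look up, $2 = hosts file (defaults to /etc/hosts)
    awk -v h="$1" '
        {
            sub(/#.*/, "")                 # drop comments before matching
            for (i = 2; i <= NF; i++)      # field 1 is the IP, the rest are names
                if ($i == h) { print $1; exit }
        }
    ' "${2:-/etc/hosts}"
}
```

On a node, `hosts_ip "$(hostname)"` could then be compared by eye with the output of `ip -4 addr`. If the two disagree (10.0.0.49 in the hosts file versus 10.0.0.41 on the interface, as in this thread), fix the hosts file first.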
 
so, no backup :/

You can try to recreate the VM without a disk from the GUI, with the same ID, then write the disk entry into the VM config manually.
(/etc/pve/nodes/<node>/qemu-server/<vmid>.conf) scsi0:.....

Not sure about the other files for the node (certificates, ...).


If you have a VM backup, reinstall the whole node and restore the VM.



Good advice: always make a backup of /etc/pve and /var/lib/pve-cluster/config* for disaster recovery.
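
A minimal sketch of such a backup, assuming a plain tar archive is good enough. backup_pve_config and the archive name are made up for illustration; only the two source paths come from the advice above. It archives the whole /var/lib/pve-cluster directory rather than just config*, and copying config.db while the cluster filesystem is running may not give a perfectly consistent snapshot.

```shell
#!/bin/sh
# Sketch only: backup_pve_config is a hypothetical helper, not a Proxmox command.
# Archives etc/pve and var/lib/pve-cluster into a timestamped tarball.
backup_pve_config() {
    # $1 = destination directory
    # $2 = optional root prefix (leave empty on a real node; handy for dry runs)
    dest="$1"
    root="${2:-/}"
    out="$dest/pve-config-$(date +%Y%m%d-%H%M%S).tar.gz"
    mkdir -p "$dest" || return 1
    # -C makes the paths inside the archive relative to the root
    tar -czf "$out" -C "$root" etc/pve var/lib/pve-cluster || return 1
    echo "$out"    # print the archive path so the caller can copy it elsewhere
}
```

Run as root, for example `backup_pve_config /root/pve-backups`, and copy the resulting archive off the node, since a backup that only lives on the broken node does not help much.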
 
Hey, thanks so much. I do have a backup of the VM on a PBS, so that's all good. I will go ahead and reinstall node 1, get PVE back up and get a cluster working.

Can I just ask: is this a downside of having a cluster, that if one of the nodes goes down it can kill all the others? I am not looking at doing HA at the moment; I just wanted to set up a cluster so I can manage all my nodes from one place.

Thanks
Dave
 
You need 3 nodes for a cluster. With 2 nodes, if 1 node goes down, the other node loses quorum and becomes read-only. (Or install a corosync QDevice on the PBS to provide the third corosync vote.)
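
The arithmetic behind this, as a small sketch. It assumes the usual majority rule with one vote per node (more than half of all votes must be present); quorum_votes and has_quorum are made-up helper names, not corosync or pvecm commands.

```shell
#!/bin/sh
# Sketch only: illustrates simple majority voting with one vote per node.

# quorum_votes: votes needed for a majority out of $1 total votes.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum: succeeds if $2 votes present out of $1 total form a majority.
has_quorum() {
    [ "$2" -ge "$(quorum_votes "$1")" ]
}
```

With 2 votes in total the majority is 2, so one surviving node is not enough. With 3 votes (three nodes, or two nodes plus a QDevice) the majority is still 2, so one can fail and the rest keep working. On a real cluster, `pvecm status` shows the expected and current vote counts.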


You can use Proxmox Datacenter Manager if you want to manage standalone nodes centrally.
 
Great, does Proxmox Datacenter Manager work on its own physical machine?
It's a simple Debian package, so if you are running Debian on your machine, yes.

(You could also install it on the PBS as an alternative.)

There is also this alternative open-source project: https://pegaprox.com/. It's a Python-based web management interface that runs on any Linux.
 