[SOLVED] After failed cluster join, config in /etc/pve messed up

alh

Active Member
Jul 7, 2021
Hi, after an error during a cluster join operation, the config files and directories in /etc/pve are missing.

Bash:
root@srv1:/etc/pve# ls -lh
total 0
lrwxr-xr-x 1 root www-data 0 Jan  1  1970 local -> nodes/srv1
lrwxr-xr-x 1 root www-data 0 Jan  1  1970 lxc -> nodes/srv1/lxc
lrwxr-xr-x 1 root www-data 0 Jan  1  1970 openvz -> nodes/srv1/openvz
lrwxr-xr-x 1 root www-data 0 Jan  1  1970 qemu-server -> nodes/srv1/qemu-server
drwxr-xr-x 2 root www-data 0 Jul 19 02:33 virtual-guest

Is there a way to fix this without reinstalling?
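
One thing the listing shows: there is no nodes directory, so the four symlinks (local, lxc, openvz, qemu-server) point at targets that do not exist. A minimal sketch for spotting this, assuming GNU find (needed for -xtype and -printf); the function name list_dangling_links is made up for illustration and is not a Proxmox tool:

```shell
# Hypothetical helper (not part of Proxmox): print the names of symlinks
# directly inside a directory whose targets do not exist.
list_dangling_links() {
    local dir="$1"
    # With GNU find, -xtype l matches symlinks that cannot be resolved.
    find "$dir" -mindepth 1 -maxdepth 1 -xtype l -printf '%f\n' | sort
}
```

On the node this would be called as list_dangling_links /etc/pve; an empty result means every symlink resolves.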
 
Have you restarted the server after the failed join? What is the output of
Code:
systemctl status pve-cluster
journalctl -u pve-cluster
?
 
Yes, I had restarted the server after the failed join and I also tried restarting the different services. In the end I just did the following:

Bash:
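# separate node
systemctl stop pve-cluster
systemctl stop corosync
# start the cluster filesystem in local mode so /etc/pve is writable
pmxcfs -l
# remove the corosync configuration
rm /etc/pve/corosync.conf
rm -r /etc/corosync/*
# stop the local-mode pmxcfs and start the regular service again
killall pmxcfs
systemctl start pve-cluster
rm /var/lib/corosync/*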

# re-create nodes dir
mkdir -p /etc/pve/nodes/srv1

# re-create certs
pvecm updatecerts --force

# copy from another/similar node
# - datacenter.cfg
# - storage.cfg
# - user.cfg

I'm not sure this is the correct way, but it gave me back the web GUI etc.
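
Before copying datacenter.cfg, storage.cfg and user.cfg from another node, it may help to check which of them are actually absent. A minimal sketch; the function name missing_pve_configs is hypothetical, and the file list is only the three files named in the steps above:

```shell
# Hypothetical helper: report which of the config files named above
# are missing from a directory (pass /etc/pve on a real node).
missing_pve_configs() {
    local dir="$1" f
    for f in datacenter.cfg storage.cfg user.cfg; do
        # Print the file name only if it does not exist in the directory.
        [ -e "$dir/$f" ] || printf '%s\n' "$f"
    done
}
```

Calling missing_pve_configs /etc/pve prints one missing file name per line and prints nothing when all three are present.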

On a side note: is cluster mode suitable for centrally managing all PVE hosts across several sites (over the internet), or should it only be used inside one datacenter (because of latency etc.)?
 