/etc/pve/corosync.conf missing

Ulrar

Feb 4, 2016
Hi,

I've also posted on the mailing list, fyi.
I have just been tasked with managing an existing cluster (no HA, but still clustered for ease of use). The first thing I did was fix /etc/hosts on each node so that they can see each other, and now the "Summary" tab of each node works from each of the other nodes. But they still can't see each other as online.

I've looked, and it seems they are all missing the /etc/pve/corosync.conf file, even though they are configured as a cluster. The other nodes show up as offline in the web interface, and pvecm status on some of the nodes even shows a broken cluster with only the node itself as a member (on some nodes that command just complains about the missing corosync.conf).

Is there a way to fix this? Reinstalling isn't really an option. Everything still works fine, but having a bunch of offline nodes in the web interface isn't great. At the very least it would be good to remove them, but it would of course be better to fix it and have it working.
Thanks!
 
You can try this:

First of all: BACKUP, BACKUP (/etc/pve, the sqlite db for pve, VM images... check the manual!!), and check what runs on every node and where every VM's disk images are stored!! BE CAREFUL and check 2x, 3x before doing anything!!

It is better to do all this on node(s) without running VMs.
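
As a rough sketch of the config part of that backup (VM images are not covered here): the helper below just copies whatever paths you give it into one directory. The function name is made up for illustration, and the paths in the example call are the usual Proxmox VE defaults as far as I know, so verify them on your own nodes before relying on this.

```shell
#!/bin/sh
# Hypothetical helper: copy the given files/directories into one backup directory.
# Nothing here is Proxmox-specific; the caller decides which paths to save.
backup_pve_config() {
    dest="$1"    # destination directory (created if missing)
    shift
    mkdir -p "$dest" || return 1
    for src in "$@"; do
        if [ -e "$src" ]; then
            # -a keeps permissions, timestamps and symlinks
            cp -a "$src" "$dest/" || return 1
        else
            echo "skipped (not found): $src" >&2
        fi
    done
}

# Example call, run as root on each node (paths are assumed defaults):
# backup_pve_config "/root/pve-backup-$(date +%Y%m%d)" \
#     /etc/pve /var/lib/pve-cluster/config.db /etc/corosync \
#     /etc/hosts /root/.ssh/authorized_keys
```

Copying the sqlite db while the pve-cluster service is running may give an inconsistent copy, so treat it as a safety net next to the /etc/pve copy, not as the only backup.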

Now:
1] check /root/.ssh/authorized_keys - are there keys for every node in the cluster?
If not, a full reinstall of the cluster is the best option, but you can still try adding all the needed nodes' keys to that file
2] configure everything around corosync - check the manual
3] check multicast/unicast on the switches
4] restart the node
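
For step 1, something like this can list the nodes that have no key in the file. It is only a sketch: the function name is invented, the node names in the example are placeholders, and it assumes every key ends with a comment of the form root@<nodename> (the ssh-keygen default), which may not hold on your cluster.

```shell
#!/bin/sh
# Hypothetical helper: print each node name that has no matching key in the file.
# Assumption: the last field of every key line is a comment like root@<nodename>.
missing_node_keys() {
    keyfile="$1"
    shift
    for node in "$@"; do
        # Exact match on the last field, so node1 does not match root@node10
        awk -v want="root@$node" '$NF == want { found = 1 } END { exit !found }' \
            "$keyfile" 2>/dev/null || echo "$node"
    done
}

# Example (node names are placeholders):
# missing_node_keys /root/.ssh/authorized_keys node1 node2 node3
```

Empty output means every listed node has a key. Run it on each node, since every node needs the keys of all the others.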

And be very careful: if the nodes currently work standalone, you can lose all configs/VM images on a node when it is added to a cluster.