Please help, I am so close!!!

warloxian

Member
Jun 26, 2021
I have a cluster of 6 nodes, all running Ceph. When I set up my last node I accidentally gave it a static address that duplicated another node's. I went in and changed the IP of that node, fixing it in /hosts and /interfaces. I also fixed the IP address in /etc/pve/corosync.conf. When I run pvesh get cluster/config/nodes I can see all 6 nodes, including when I run it from the VGA terminal on the lost node.

When I am logged into my main node, 5 nodes show up green and the one I changed the IP on has a red X. When I log into the GUI of the changed node, it is green and the other 5 nodes have red X's. I am unable to access the shell when logged into the GUI of the changed node; I get error 1006. From the changed node I can ping google.com as well as 8.8.8.8, and I can ping the addresses of all of the other 5 nodes. I can also ping the changed IP from all other nodes. I am not able to edit any config files on the changed node; I get a permission denied error.

When I run pvecm nodes from any of the 5 working nodes I can only see those 5 nodes, and when I run the same from the changed node I can only see it and not the other 5. This is the only place I can find where all 6 nodes don't show up.
When I look at the log files I see "cluster not quorate - extending auth key lifetime", and when I run journalctl -u corosync on the changed node I see multiple instances of "host 1,2,3,4,5, has no active link".
Yes, I am a NOOB; yes, I have tried everything I can find and have reached the end of Google; yes, I am OVER MY HEAD. Yes, I am tired of reinstalling OS's every time I screw things up. It's time for me to start learning how to work through these problems instead of just writing over them. I have made tremendous progress with Linux in general, considering 12 months ago I knew nothing. I have several stable Linux OS's running on multiple laptops and have found ways to work through my BORKED Linux systems, but the server/VM world is my new "thorn in my side, from the tree that I planted", if I may quote Metallica.
Any help would be GREATLY appreciated.
 
/etc/pve/corosync.conf --> did you increase config_version?

Also, after changing it, verify on each node that /etc/corosync/corosync.conf is the same as /etc/pve/corosync.conf.
If not, overwrite /etc/corosync/corosync.conf and restart the corosync service.
 
As I am a NOOB, I guess I don't fully understand the question "increase config_version?". My problem with modifying the /etc/corosync/corosync.conf file is this: when I log into the changed node from the web GUI, I am not able to access the shell at all. When I plug my monitor and keyboard into the changed node and try to change any of the files, I get "access denied", even when I run it with sudo. This seems to be the hanging point. I appear to have been locked out of the ability to change any files for this reason.
 
In the /etc/pve/corosync.conf file, you have a field like:

Code:
totem {
  cluster_name: your cluster
  config_version: 4
  ...

In this example, config_version = 4, so you need to increase it (config_version + 1 = 5),
giving:

Code:
totem {
  cluster_name: your cluster
  config_version: 5
  ...


If you don't increase the config_version, the new config is not applied.
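
For illustration only, here is a minimal shell sketch of the "config_version + 1" step. The function name bump_config_version is made up for this example and is not a Proxmox or corosync tool. It assumes the config_version line looks like the one in the snippet above, and it only prints the modified config to stdout, so nothing is changed until you review the output and put it in place yourself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox or corosync): print the given
# corosync.conf with the first config_version value incremented by one.
# The input file itself is left untouched.
bump_config_version() {
  awk '
    /^[[:space:]]*config_version:/ && !done {
      n = $0
      sub(/^[^:]*:[[:space:]]*/, "", n)  # keep only the value after the colon
      sub(/[0-9]+/, n + 1)               # replace the number on this line
      done = 1
    }
    { print }
  ' "$1"
}

# Example use (assumed workflow, paths as in the post above):
#   bump_config_version /etc/pve/corosync.conf > /root/corosync.conf.new
#   # review /root/corosync.conf.new before replacing the real file
```

Writing to a separate file first means a typo cannot break the live config before you have looked at the result.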



When it's done, /etc/pve/corosync.conf will overwrite the local /etc/corosync/corosync.conf on each node.
As it seems that you have duplicate IPs or a bad config, it's possible that this did not happen correctly, so you need to verify manually that /etc/corosync/corosync.conf was updated correctly.
If not, manually copy /etc/pve/corosync.conf to /etc/corosync/corosync.conf on each node, and restart the corosync service with "systemctl restart corosync".

Then, when all is done, verify the corosync cluster status with "pvecm status".
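
The verify, copy, restart, and status steps above could be sketched as a small shell script, to be run as root on each node. The function names corosync_conf_in_sync and sync_and_restart are made up for this example; only cmp, cp, systemctl, and pvecm are real commands. Treat it as a rough outline of the manual procedure, not an official tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the local corosync config is byte-identical
# to the cluster-wide one. Defaults to the two paths named in this thread;
# other paths can be passed as arguments.
corosync_conf_in_sync() {
  cluster_conf="${1:-/etc/pve/corosync.conf}"
  local_conf="${2:-/etc/corosync/corosync.conf}"
  cmp -s "$cluster_conf" "$local_conf"
}

# Hypothetical wrapper for the manual fallback: copy the cluster-wide config
# over the local one only if they differ, restart corosync, then show status.
sync_and_restart() {
  if corosync_conf_in_sync; then
    echo "corosync.conf already in sync"
  else
    cp /etc/pve/corosync.conf /etc/corosync/corosync.conf
    systemctl restart corosync
  fi
  pvecm status
}

# Nothing runs until you call it explicitly:
#   sync_and_restart
```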
 
