[SOLVED] One node is still dead in cluster after changing the hostname

Egner

Hi,

I have a problem after making a huge mistake last night. I changed the hostname on one of the newly installed nodes in the cluster after finding out that the name was not correct.

I did it in the following steps:

1. Changed the name in /etc/hostname and /etc/hosts
2. Moved the old /etc/pve/nodes/tc05-vm-c to the new /etc/pve/nodes/tc05-c-h (which should be the right name)
3. Rebooted. After the reboot, the node named tc05-vm-c still exists in the cluster.

And if I look at my master, neither tc05-vm-c nor tc05-c-h exists in the pvecm status / nodes output.

root@w01-c-h:/etc/pve/nodes# pvecm status
Quorum information
------------------
Date: Tue Dec 20 10:11:03 2016
Quorum provider: corosync_votequorum
Nodes: 6
Node ID: 0x00000001
Ring ID: 1/1840
Quorate: Yes

Votequorum information
----------------------
Expected votes: 7
Highest expected: 7
Total votes: 6
Quorum: 4
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.10.13.101 (local)
0x00000002 1 10.10.13.102
0x00000003 1 10.10.13.103
0x00000005 1 10.10.13.104
0x00000007 1 10.10.13.105
0x00000004 1 10.10.13.117
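
The votequorum figures above fit a cluster where one configured node is offline: 7 votes are expected, 6 are present, and the quorum threshold is 4. A minimal sketch of that arithmetic, assuming a simple majority rule (which matches the Quorum: 4 line shown); the function names are made up for illustration:

```shell
#!/bin/sh
# Majority quorum for a given number of expected votes:
# more than half, i.e. floor(expected / 2) + 1.
quorum_for() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# How many more votes can be lost before quorum is lost.
votes_to_spare() {
    total=$1
    expected=$2
    echo $(( total - $(quorum_for "$expected") ))
}

# Values taken from the pvecm status output above.
quorum_for 7          # prints 4
votes_to_spare 6 7    # prints 2
```

So the cluster stays quorate with the renamed node missing, which is why everything else keeps working.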

As you can see, the IP 10.10.13.106 is no longer there, but the server still shows up in the Proxmox cluster (see my screenshot).

And if I look in /etc/corosync/corosync.conf, my old tc05-vm-c still shows up there.

Can someone help me get rid of this server from the cluster so I can reinstall it with the new hostname?

And when I tried to remove the host 10.10.13.106, the system said it does not exist in the cluster, as you can see under Membership information.

Thanks for the help!

/Egner
 

Attachments

  • proxmox-dead-host.jpg
Here is my node list, if that helps:

pvecm nodes

Membership information
----------------------
Nodeid Votes Name
1 1 w01-c-h (local)
2 1 tc01-c-h
3 1 tc02-c-h
5 1 tc03-c-h
7 1 tc04-c-h
4 1 c01-c-h

As you can see, node ID 6 is missing.
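
Spotting the gap by eye works for seven nodes; a small filter can do the same on a longer list. This is only a sketch: missing_node_ids is a hypothetical helper, it assumes the decimal ID format shown by pvecm nodes above, and a gap in the numbering is just a hint, since IDs do not have to be contiguous.

```shell
# Read `pvecm nodes` output on stdin and print every node ID that is
# absent between 1 and the highest ID seen.
missing_node_ids() {
    awk '
        # Only data rows start with a purely numeric node ID.
        $1 ~ /^[0-9]+$/ { seen[$1] = 1; if ($1 + 0 > max) max = $1 + 0 }
        END { for (i = 1; i <= max; i++) if (!(i in seen)) print i }
    '
}

# Usage on a cluster member:
#   pvecm nodes | missing_node_ids
```

Fed the listing above, it reports 6.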
 
Honestly your best bet is to do a clean install. Changing the hostname is tough.
 
Is there no better option? There are 3 live hosts today, and reinstalling will take some time since they contain a lot of LXC containers :/

Or do you mean reinstalling just the node I just added?
 
Not in my honest opinion. I have seen too many issues with changing hostnames; so much is tied to the name. It is much easier to reinstall the node.
 
But what happens to the dead node that is left in the cluster? Will it be overwritten if I use the same IP?
 
So, if I understand what you want me to do: I used that guide when I did the installation, and I have no problem reinstalling the node, but I still want the dead one gone.

But as I said before, no dead host is listed on the master.

pvecm nodes

Membership information
----------------------
Nodeid Votes Name
1 1 w01-c-h (local)
2 1 tc01-c-h
3 1 tc02-c-h
5 1 tc03-c-h
7 1 tc04-c-h
4 1 c01-c-h

pvecm add xxx.xxx.xxx.xxx -force
pvecm updatecerts

And thanks for your time right now! :-)
 
I would follow the steps to remove a node, then do a fresh install on that node and re-add it.

I'm not quite sure what you mean by no dead hosts being listed on the master. This is simply a hostname issue. So you need to remove c01-c-h from the cluster? Is that correct?
 
No, I need to remove tc05-vm-c from the cluster. But if I look at my master:

pvecm nodes

Membership information
----------------------
Nodeid Votes Name
1 1 w01-c-h (local)
2 1 tc01-c-h
3 1 tc02-c-h
5 1 tc03-c-h
7 1 tc04-c-h
4 1 c01-c-h

The node is not there at all, so there is nothing I can actually remove.
As you can see in my screenshot, a tc05-vm-c machine exists in the cluster but is not shown in pvecm nodes or pvecm status.
 
Oh, so you already started making changes. Not sure what to tell you there. That would push me to do a complete cluster reinstall. It's a headache, but take your time and make sure it's 100% before going into production.
 
If I could see the node in pvecm status, I would already know how to delete it from the cluster. But now I feel a little bit screwed after my mistake last night :(

One thing I have in mind: maybe I can remove the /etc/pve/nodes/tc05-c-h data and hopefully have corosync remove the old lines and repair the cluster. But before I do that I need some expertise :/
 
I can still see lines like these in corosync.conf:

nodelist {
  node {
    name: tc05-vm-c
    nodeid: 6
    quorum_votes: 1
    ring0_addr: tc05-vm-c
  }
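
This is the mismatch in a nutshell: the node is still configured in corosync.conf but is not a live member. A rough sketch of how the two views could be compared, assuming the node block layout shown in this thread and the three-column pvecm nodes table; both function names are hypothetical:

```shell
# Print "nodeid name" for every node block in a corosync.conf file ($1).
corosync_nodes() {
    awk '
        $1 == "name:"   { name = $2 }
        $1 == "nodeid:" { id = $2 }
        # A closing brace ends a node block only if a name was collected.
        $1 == "}" && name != "" { print id, name; name = ""; id = "" }
    ' "$1"
}

# Print configured nodes that do not appear in the live membership.
# $1: corosync.conf path, $2: file holding saved `pvecm nodes` output.
configured_but_absent() {
    corosync_nodes "$1" | while read -r id name; do
        # Column 3 of the membership table is the node name.
        if ! awk -v n="$name" '$3 == n { found = 1 } END { exit !found }' "$2"; then
            echo "$id $name"
        fi
    done
}

# Usage on a cluster member:
#   pvecm nodes > /tmp/members.txt
#   configured_but_absent /etc/corosync/corosync.conf /tmp/members.txt
```

With the data in this thread it would be expected to print the line for node 6, tc05-vm-c.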
 
adamb, I have just solved the issue :-)

I changed the hostname back to the original tc05-vm-c and copied the files back in /etc/pve/nodes, and now the cluster comes up green. Now I can remove the node from the cluster and reinstall the server :)

See below:

pvecm nodes

Membership information
----------------------
Nodeid Votes Name
1 1 w01-c-h (local)
2 1 tc01-c-h
3 1 tc02-c-h
5 1 tc03-c-h
7 1 tc04-c-h
6 1 tc05-vm-c
4 1 c01-c-h

The number 6 is back on track!!!
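
For anyone landing here later, the fix came down to making the local hostname agree with what the cluster already knew. A hedged sketch of a pre-reboot sanity check along those lines; hostname_is_consistent is a made-up helper, and it only checks the name: entry and the nodes directory, not ring0_addr or /etc/hosts:

```shell
# Succeeds only when the name in the hostname file ($1) is listed as a
# node in corosync.conf ($2) and has a directory under the nodes dir ($3).
hostname_is_consistent() {
    host=$(tr -d '[:space:]' < "$1")
    [ -n "$host" ] || return 1
    awk -v h="$host" '$1 == "name:" && $2 == h { ok = 1 } END { exit !ok }' "$2" || return 1
    [ -d "$3/$host" ]
}

# Typical use on the affected node, before rebooting:
#   hostname_is_consistent /etc/hostname /etc/corosync/corosync.conf /etc/pve/nodes \
#       && echo "hostname matches cluster config"
#
# Once the node is a visible member again and has been shut down for the
# reinstall, it can be removed from one of the remaining nodes with:
#   pvecm delnode tc05-vm-c
```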
 
