Rename node w/ HCI and Ceph

lknite

Goal:
- I have a 3-node cluster with one node I wish to rename.
- I'd like an assist to ensure that I don't break Ceph.

Detail:
- I have enough resources that I was able to move all VMs off the node.

The Plan (on the node to be renamed)
- Set OSDs to 'out'
- PGs will become 'active+clean+remapped'; wait until the count of remapped PGs drops to zero (is this true?)
- Once 'Used %' shows 0.00, Stop the OSDs
- Once stopped, select More and choose Destroy for each OSD
- If a ceph monitor is enabled on the node, Stop the monitor, and after stopped Destroy it
- If a ceph manager is enabled on the node, Stop the manager, and after stopped Destroy it
- (at least one manager is required, enable on another node if needed)
- Remove the node from the proxmox cluster using: pvecm delnode <node> (is this necessary?)
- Rename the node using: hostnamectl set-hostname <new_name>
- Reboot
- Rename the node in /etc/corosync/corosync.conf (or should this be edited as /etc/pve/corosync.conf on an existing cluster node?)
- Add renamed node back into cluster: pvecm add <ip_of_existing_cluster_member> (is this necessary?)
- Add the OSDs back in
- Enable ceph monitor
- Optionally, enable ceph manager on the node
- If you are using HA, then add the new node name to the relevant HA groups (a CLI sketch of these steps follows the list).
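
For reference, here's a rough CLI sketch of the plan above. The OSD id (2), the mon/mgr id, the device path, and the hostnames are all placeholders rather than my actual values, so adjust before running anything:

Code:
# --- on the node to be renamed: drain and remove its Ceph services ---
ceph osd out 2                        # mark this node's OSD out
ceph -s                               # wait for rebalance to finish (PGs active+clean)
systemctl stop ceph-osd@2             # stop the OSD once it holds no data
pveceph osd destroy 2 --cleanup       # destroy the OSD and wipe the disk
pveceph mon destroy <old_name>        # only if a monitor runs on this node
pveceph mgr destroy <old_name>        # only if a manager runs on this node

# --- from another cluster node: drop the old node name from corosync ---
pvecm delnode <old_name>

# --- on the node itself: rename and reboot ---
hostnamectl set-hostname <new_name>
# also update /etc/hosts and any remaining references to the old name before rebooting
reboot

# --- once the node is back in the cluster: recreate the Ceph services ---
pveceph osd create /dev/sdX
pveceph mon create
pveceph mgr create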
 
I'm trying the plan above, and after the reboot I can SSH to the renamed node, but the GUI isn't coming up.

What's needed to get the GUI to come up so I can join the node back into the cluster?

Or, how can I add the node back to the cluster via the CLI? I tried this but got an error:
Code:
# pvecm add 10.0.0.21
Please enter superuser (root) password for '10.0.0.21': **********
detected the following error(s):
* authentication key '/etc/corosync/authkey' already exists
* cluster config '/etc/pve/corosync.conf' already exists
* this host already contains virtual guests
Check if node may join a cluster failed!
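
Presumably the things to check while the GUI is down are something like this (standard PVE service names, nothing specific to my node):

Code:
# check whether the web GUI services are running, and restart them if not
systemctl status pve-cluster pvedaemon pveproxy
systemctl restart pvedaemon pveproxy

# check whether the node already considers itself a cluster member
# (the errors above suggest the old cluster config is still in place)
pvecm status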
 
OK, I'm not sure what changed; I did another reboot and the GUI came up... I didn't even have to add the node back in, it just worked.

If someone knows what step I did wrong, let me know and I'll update this thread so other folks can reference it in the future.
 
You may not have done anything wrong; it may just have taken a couple of reboots to get it back into the fold.
I'd like to know if it all worked myself. :D
We might rename some nodes to take 'em from POC to test/build, and if those are the right steps we'll be in good shape.
 
It's doing its recovery. I'll report back when it finishes as to whether it reaches 100% or if there's anything left over.

A couple of attempts to migrate a VM back to the node didn't work, but I figure that's because Ceph is busy recovering.
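
For anyone following along, recovery progress can be watched with the usual status commands (nothing cluster-specific here):

Code:
ceph -s        # one-shot health summary, shows recovery/backfill progress
ceph -w        # stream cluster status updates as recovery proceeds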
 
Finished after about 10 hours. Ceph is healthy. I had to add the new node name to the HA groups before I could move VMs to it.
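
For reference, the HA group change can also be done from the CLI; a minimal sketch, assuming a group called 'my-group' (group and node names are placeholders, so check the option spelling against your ha-manager version):

Code:
# show the current HA group definitions
ha-manager groupconfig
# replace the node list for the group, including the renamed node
ha-manager groupset my-group --nodes node1,node2,renamed-node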
 
