[SOLVED] Replace cluster node

frantek

Renowned Member
May 30, 2009
Hi,

I've a Proxmox / Ceph cluster with 3 nodes. Everything is set up as described in the wiki, with a Ceph mesh network. I have to replace the boot disk of one node. What is the best practice to do so? As the problem disk still works for some time, I could just duplicate it. Another alternative would be to install Proxmox on the new disk and add the networking setup, but what next?

TIA
Matthias
 
Hi,
that depends on what format the disk uses. LVM? Then you can add the new HDD (during runtime), create partitions, copy /boot, extend the VG with the new partition and use pvmove to migrate the data.
After that, vgreduce (remove the old partition) and write the boot sector to the new HDD.
Then shut down,
remove the old disk,
select the new one in the BIOS as the boot disk,
boot, and all should be fine.

For this you have a short downtime of one node (approx. 5 min.).
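
In commands, the migration might look roughly like this (a sketch only; the device names /dev/sda = old disk, /dev/sdb = new disk, the VG name pve and the partition numbers are assumptions, adjust them to your layout):

  # create matching partitions on /dev/sdb first (fdisk/sgdisk), then:
  mkfs.ext4 /dev/sdb2 && mount /dev/sdb2 /mnt && cp -ax /boot/. /mnt/   # copy /boot if it is a separate partition
  pvcreate /dev/sdb3
  vgextend pve /dev/sdb3             # extend the VG with the new partition
  pvmove /dev/sda3 /dev/sdb3         # migrate the PV data, works online
  vgreduce pve /dev/sda3             # remove the old partition from the VG
  grub-install /dev/sdb              # write the boot loader to the new disk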

Udo
 
Thanks for your reply. As the node is down anyway at the moment, I can replicate the disk using another Linux system. The disk is a pure boot disk; there is no data (images etc.) on it. All data is stored in Ceph storage.

What is the process when I'm not able to replicate the boot disk and have to use a Proxmox CD image for the installation? The node needs to rejoin the cluster. With the same name? The Ceph disks are still there. What is the best practice to get Ceph back online? I found no comprehensive documentation about disaster recovery of a single Proxmox/Ceph node.
 
Hi,
I don't think the disk is a pure boot disk!
I assume that the disk contains the Ceph mon data + keys and other Ceph keys too (and the PVE SQLite DB).
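If the old disk is still readable, these are the typical locations worth saving (paths may vary slightly between versions):

  /var/lib/ceph/mon/ceph-<nodename>/   # monitor data and keyring
  /etc/ceph/                           # ceph.conf and the admin/client keyrings
  /var/lib/ceph/bootstrap-osd/         # bootstrap keyrings used to activate OSDs
  /var/lib/pve-cluster/config.db       # the SQLite DB backing /etc/pve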
Of course you can do a fresh install and join the PVE cluster again (with -f for force). Then you must also delete the old Ceph mon and create this mon again. And do the same with the OSD disks (resync of all OSD data!). To avoid the resync it's perhaps easier to save the content below /var/lib/ceph (without the mounted OSDs), if your old HDD allows this.
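
A rough outline of those steps, assuming the reinstalled node is called pve3 and 192.168.1.10 is another cluster member (names and IPs are placeholders, and the exact subcommands differ between PVE versions, e.g. older releases use pveceph destroymon / createmon):

  # if the old disk is still readable, restore its Ceph state (keys, mon data) first:
  rsync -a /mnt/old-root/var/lib/ceph/ /var/lib/ceph/
  # re-join the PVE cluster under the same node name (run on the fresh node):
  pvecm add 192.168.1.10 --force
  # remove the stale monitor and create it again:
  pveceph mon destroy pve3
  pveceph mon create
  # with the OSD disks and keyrings intact, re-activating them avoids a full resync:
  ceph-volume lvm activate --all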

Udo
 
What I finally did: a fresh install to get a GPT setup (faster than figuring it out myself :)), deleted everything, copied all files over offline, and corrected the disk UUIDs in grub.cfg and fstab. Works.
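
For anyone doing the same, the UUID fix-up could look roughly like this from a live system (device names are assumptions; with a UEFI/systemd-boot install the boot loader steps differ):

  blkid                                # list the UUIDs of the new partitions
  mount /dev/sdb3 /mnt                 # mount the copied root
  nano /mnt/etc/fstab                  # replace the old UUIDs with the new ones
  nano /mnt/boot/grub/grub.cfg         # same for the root UUID referenced by GRUB
  # alternatively, regenerate grub.cfg and reinstall the boot loader from a chroot:
  for d in dev proc sys; do mount --bind /$d /mnt/$d; done
  chroot /mnt update-grub
  chroot /mnt grub-install /dev/sdb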
 
