Restoring a Node with Ceph OSDs and CRUSH Map

djfreak

New Member
May 27, 2023
Greetings,

I’m about to spin up a Proxmox Backup Server on dedicated hardware for the first time. I already have all of my VMs and LXCs backing up to my NFS storage, so I’m not worried about those. They should fence anyway if a node goes down. Even if they don’t fence, those backups are fine and I’ve successfully restored them many times.

Here is what I need to set up. These are very old servers and they are going to die one day. Each has 3 Ceph OSDs plugged in, so across 5 nodes this has become quite an elaborate Ceph configuration.

I need to be able to restore each node byte for byte so that the exact same OSDs can be plugged right back into new server hardware (renamed with the same Proxmox name and IP as the dead server) without missing a beat.

Is there a video or document that can walk me through this before I invest in this new backup server?

Thanks in advance.
 
Pasting this in this thread. I found this extremely helpful.

Basically, I'm accepting that if you build your Ceph cluster properly, meaning every pool has a size of 3 and a min_size of 2, you should (and in fact must) always be able to lose an entire node and its OSDs without losing any data. If a node goes down, mark its OSDs out and down, destroy them, and then re-add the formatted drives as freshly built OSDs on the replacement node. Apparently, if you keep the node's Ceph bucket intact in the CRUSH map, the new OSDs should be named like the old ones, I think. This is still hard for me to accept, but I'm learning to.
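For anyone who wants to sanity check this before a node actually dies, here is a rough Python sketch of the two ideas above: confirming every pool is at size 3 / min_size 2, and listing the commands to retire the dead node's OSDs and create new ones. It is only a sketch under assumptions. The pool list is assumed to be the parsed JSON from `ceph osd pool ls detail --format json` with `pool_name`, `size` and `min_size` fields, and the pool names, OSD IDs and device paths below are made up. The function only builds the command strings, it does not run anything.

```python
import json


def pools_below_replication(pools, size=3, min_size=2):
    """Return names of pools whose size or min_size is below the targets.

    `pools` is assumed to be the parsed output of
    `ceph osd pool ls detail --format json`.
    """
    return [
        p["pool_name"]
        for p in pools
        if p["size"] < size or p["min_size"] < min_size
    ]


def osd_replacement_plan(dead_osd_ids, new_devices):
    """Build (but do not run) the commands to replace a dead node's OSDs."""
    plan = []
    for osd_id in dead_osd_ids:
        # Mark the OSD out so Ceph rebalances its data onto the other nodes.
        plan.append(f"ceph osd out {osd_id}")
        # Remove the OSD from the CRUSH map, auth keys and OSD map.
        plan.append(f"ceph osd purge {osd_id} --yes-i-really-mean-it")
    for dev in new_devices:
        # Run on the replacement node once it has joined the cluster.
        plan.append(f"pveceph osd create {dev}")
    return plan


if __name__ == "__main__":
    # Hypothetical sample data standing in for real cluster output.
    sample = json.loads(
        '[{"pool_name": "vm-pool", "size": 3, "min_size": 2},'
        ' {"pool_name": "scratch", "size": 2, "min_size": 1}]'
    )
    print("Pools below 3/2:", pools_below_replication(sample))
    for cmd in osd_replacement_plan([3, 4, 5], ["/dev/sdb", "/dev/sdc", "/dev/sdd"]):
        print(cmd)
```

Any pool that shows up in the first list would be worth fixing before relying on "lose a node, rebuild the OSDs" as the recovery plan. Wait for the cluster to report healthy again before purging anything.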

Here is an excellent script I just used to back up node configs. Once you accept that OSDs will be rebuilt, though, as long as services fence properly, you really don't even have to restore from backup if a node goes down. And if they don't fence, the VM and LXC backups are there to restore onto the rebuilt node.

https://i12bretro.github.io/tutorials/0431.html
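To give an idea of what a node config backup can look like, here is a minimal Python sketch. This is not the script from the link above, just an illustration of the general approach: archive a handful of config paths into a timestamped tarball. The path list and the `backup_configs` helper are my own assumptions, so adjust them to whatever your nodes actually use.

```python
import tarfile
import time
from pathlib import Path

# Assumed set of paths worth keeping for a Proxmox node; adjust as needed.
CONFIG_PATHS = [
    "/etc/pve",
    "/etc/network/interfaces",
    "/etc/hosts",
    "/etc/ceph",
]


def backup_configs(paths, dest_dir, label="node"):
    """Archive the given paths into a timestamped .tar.gz under dest_dir.

    Paths that do not exist are skipped. Returns the archive's Path.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{label}-config-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for path in map(Path, paths):
            if path.exists():
                # Store entries relative to / so extraction location is a choice.
                tar.add(str(path), arcname=str(path).lstrip("/"))
    return archive
```

Run as root on each node, something like `backup_configs(CONFIG_PATHS, "/mnt/nfs/config-backups", label="pve1")` would drop one archive per run onto shared storage. The destination directory and the `pve1` label are placeholders.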
 
