Ceph Node Down - Proper Restore Procedures

MrPaul

Active Member
Apr 27, 2019
I've just lost my root drive, which was on Cisco FlexFlash. It was supposed to be RAID-0, but when I forced the master switch, that disk completely failed to boot, whereas on the other disk I was getting fsck errors and was unable to write to the filesystem even after a repair. For simplicity, I think I'll just consider the two FlexFlash disks a loss at this point.

I expect the first order of business will be to reinstall Proxmox, but what is the proper procedure to repair or replace the Ceph mon/OSD nodes? I've got the data on my other two servers, as I was running with triple redundancy. So even though I suspect the data is still complete locally, I don't mind replacing the server in its entirety and letting Ceph rebuild if that's easier.

For what it's worth, I'm running 6.2-4, and everything was up to date with patches as of about a week ago.
 
Upon further research, I see that running Proxmox off FlexFlash is less than ideal because of its limited write endurance. Luckily for me, this is just a lab environment, so I'll probably just let the cluster die off.
 
