3 nodes cluster messed up

Just for reference, in case anyone reading this is looking for help...

I ended up firing up a new virtual machine (Debian 11) to which I attached a copy of the faulty (virtual) drive. The drive showed no partition, but I succeeded in repairing the (raw) drive with testdisk. Once I had access to the data, I made a backup copy on another machine using rsync.

Finally, we rebuilt the whole cluster, but before that we replaced the internal disk drives on node2 and P3, which were HDDs, with SSDs.

It has been running for a few days now. Lesson learnt, of course: we now have a daily backup on an external drive (using PBS). The other lesson: a cluster (even with Ceph replication) is not three machines with replication, but should be seen as a whole that can tolerate one node being down. If this ever happens, and knowing that the Ceph settings are for 2 disks minimum, never ever touch (reboot or stop) the remaining working nodes before you have replaced the faulty one.
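To make the "never touch the remaining nodes" rule concrete, here is a minimal Python sketch of the arithmetic behind it. It rests on assumptions I did not spell out above: that "2 disks minimum" corresponds to a replicated pool with size 3 and min_size 2, and that each node holds exactly one replica. The function names are made up for illustration and are not part of any Ceph tool.

```python
def replicas_available(nodes_up: int, pool_size: int = 3) -> int:
    # Assumption: one replica per node, so the number of live replicas
    # can never exceed the number of nodes that are still up.
    return min(nodes_up, pool_size)


def pool_accepts_io(nodes_up: int, pool_size: int = 3, min_size: int = 2) -> bool:
    # Ceph blocks I/O on a placement group once fewer than min_size
    # replicas are available.
    return replicas_available(nodes_up, pool_size) >= min_size


for nodes_up in (3, 2, 1):
    state = "I/O continues" if pool_accepts_io(nodes_up) else "I/O blocked"
    print(f"{nodes_up} node(s) up: {state}")
```

With one node down the pool still has 2 replicas and keeps working. Rebooting or stopping either of the two remaining nodes drops it to 1, below the minimum, and that is when things stop.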
 