[SOLVED] General Question on Node Failures and Ceph

Seed

Renowned Member
Oct 18, 2019
Hello,

I have a 3-node Ceph cluster with 4 OSDs in each node, running a 3/2 (size/min_size) pool across 12 HDDs, 120TB in total.

If I have a power outage, and I lose my UPS batteries, what will happen?

1. If the servers shut off at different times because of lack of power.
2. I get to the hosts in time and shut them all down but even then they'll be out of sync I assume. What happens then?

I have about 30 minutes of UPS time per node and am wondering whether I need more, or whether I need to be concerned about this at all. The more of this lab I build out, the less I can afford a really bad failure, or the system getting wedged to the point where all the VMs and storage are broken.
 
If I have a power outage, and I lose my UPS batteries, what will happen?
Hello darkness my old friend... ;)

1. If the servers shut off at different times because of lack of power.
If the UPS isn't triggering a shutdown, then they simply die when the battery is empty.

2. I get to the hosts in time and shut them all down but even then they'll be out of sync I assume. What happens then?
No, the last node standing holds the latest epoch. On boot, the Ceph MONs should regain quorum, as will corosync.
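
To make the staggered power-off case concrete, here is a rough sketch in Python of what a 3-node, 3/2 cluster can still do as hosts drop out. It is a simplified model, not Ceph's actual logic, and it assumes things the thread does not state: one MON per host, and a CRUSH rule that places exactly one replica on each host. The function names are made up for illustration.

```python
# Simplified model of a 3-node Ceph cluster with a 3/2 pool.
# Assumptions (not confirmed in the thread): one MON per host and
# one replica of every object per host (host-level failure domain).

TOTAL_HOSTS = 3   # nodes in the cluster
POOL_SIZE = 3     # replicas kept per object ("3" in 3/2)
MIN_SIZE = 2      # replicas required before I/O is served ("2" in 3/2)


def mon_quorum_size(total_mons: int) -> int:
    """Smallest strict majority of monitors."""
    return total_mons // 2 + 1


def cluster_state(hosts_up: int) -> dict:
    """Report whether MON quorum and pool I/O survive with hosts_up hosts."""
    if not 0 <= hosts_up <= TOTAL_HOSTS:
        raise ValueError("hosts_up must be between 0 and TOTAL_HOSTS")
    # With one replica per host, replicas available == hosts still running.
    replicas_up = hosts_up
    return {
        "mon_quorum": hosts_up >= mon_quorum_size(TOTAL_HOSTS),
        "pool_io": replicas_up >= MIN_SIZE,
    }


if __name__ == "__main__":
    for up in (3, 2, 1):
        print(up, "hosts up:", cluster_state(up))
```

Under these assumptions the cluster keeps serving I/O after the first host dies, and blocks (rather than diverges) once the second one goes, which is why the nodes do not end up "out of sync" in the sense of conflicting data.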


I have about 30 minutes of UPS time per node and am wondering whether I need more, or whether I need to be concerned about this at all. The more of this lab I build out, the less I can afford a really bad failure, or the system getting wedged to the point where all the VMs and storage are broken.
Ordered shutdowns are always better than pulling the plug. How much UPS time you need depends on your requirements and on how long the system takes to shut down. This has to be tested.
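
Once the shutdown time has been measured, the budget is simple arithmetic. The sketch below, in Python, works out how long the nodes can stay on battery before an ordered shutdown has to start. The 30 minutes comes from the question; the 5-minute shutdown time and the safety factor of 2 are placeholder assumptions, to be replaced with your own measured values.

```python
def shutdown_trigger_minutes(ups_runtime_min: float,
                             measured_shutdown_min: float,
                             safety_factor: float = 2.0) -> float:
    """Minutes on battery after which the ordered shutdown must begin.

    safety_factor pads the measured shutdown time to leave headroom,
    for example for an aged battery or a slow VM shutdown (assumed value).
    """
    reserve = measured_shutdown_min * safety_factor
    if reserve >= ups_runtime_min:
        raise ValueError("UPS runtime is too short for a safe ordered shutdown")
    return ups_runtime_min - reserve


if __name__ == "__main__":
    # 30 min of UPS time (from the question), 5 min shutdown (hypothetical).
    print(shutdown_trigger_minutes(30, 5))
```

If the function raises, the UPS runtime does not cover the padded shutdown time, and either more battery capacity or a faster shutdown is needed.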