changing min_size automatically

Nov 14, 2019
Hello,

we would like to build a 4 node Proxmox/Ceph-Cluster that is able to recover from 2 nodes failing at once. To prevent data loss in such a case, we have to choose a min_size of 3. But when 2 nodes fail, there are only 2 nodes left. That is why we came up with the idea of reducing the min_size of our pools automatically when 2 nodes fail.

How good or bad is this idea?

Cedric
 
you need 5 nodes to tolerate 2 going down on the PVE side (or 4 + an external quorum device). for ceph, you'd need 5 nodes with a monitor each, and 3/3 or 4/3 or 5/3 replication.
 
We plan to have 4 nodes and 1 external quorum device for the PVE side. For Ceph, we plan to have a configuration of 3/3. Could you please comment on the idea of adapting the min_size automatically? To my understanding, it would enable writing to the rbd in the case of 2 nodes failing. Are there any downsides?
 
you need a majority of monitor nodes to be online and form a quorum for Ceph to be operational. with 2/4 nodes online, you lose quorum. this is irrespective of the replication settings.
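The majority rule can be checked with a few lines of Python. This is only a sketch of the arithmetic, not anything Ceph itself runs; the function names `has_quorum` and `max_tolerated_failures` are made up for illustration.

```python
def has_quorum(total_monitors: int, online_monitors: int) -> bool:
    """Return True if a strict majority of all configured monitors is online."""
    return online_monitors > total_monitors // 2


def max_tolerated_failures(total_monitors: int) -> int:
    """Largest number of monitors that can fail while a majority remains."""
    return (total_monitors - 1) // 2


if __name__ == "__main__":
    for total in (3, 4, 5):
        print(f"{total} monitors tolerate {max_tolerated_failures(total)} failure(s)")
    # 4 monitors with 2 offline: exactly half is not a majority
    print("quorum with 2 of 4 online:", has_quorum(4, 2))
    # 5 monitors (e.g. 4 nodes plus one extra monitor) with 2 offline
    print("quorum with 3 of 5 online:", has_quorum(5, 3))
```

Under this rule a 4th monitor adds no failure tolerance over 3; it takes a 5th one to survive two monitors going down.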
 
And would it work if the quorum device is a Ceph monitor as well?

if it has enough performance, and good enough connectivity to the rest of the cluster, yes. but then the question is - why don't you use it as a regular node? also, you'd need to run with 4/2 (and a failure domain of node/host) to ensure each OSD node has all of the data.
 
The quorum device doesn't have any OSDs. Wouldn't 3/3 be sufficient to ensure the data availability even if 2 nodes fail?

if you have 4 OSD nodes with 3/3, and 2 of those nodes are offline/destroyed/.., you will still have 1 copy of your data (and if you have the space, you'll soon have 2 again) - but it won't be available until it gets replicated to the full min_size (3) again. effectively your whole cluster is unavailable then, until you add new nodes (or your old ones are back up) and the PGs are replicated.
 
the question is what you mean by "data availability". 3/3 is enough to lose 2 nodes and not lose any data. 3/3 is not enough to lose 2 nodes and not have an outage. in fact, with 3/3 you lose one node and already have a partial outage, since access to the data that was on the failed node is blocked until it is replicated 3 times again.
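The difference between "no data lost" and "no outage" can be illustrated with a toy model. The sketch below is not how Ceph places data (CRUSH is not modelled, and recovery/backfill onto surviving nodes is ignored); it assumes a failure domain of host and simply enumerates every possible set of `size` nodes a PG could live on. The function name `pg_states` and the node names are hypothetical.

```python
from itertools import combinations


def pg_states(nodes, failed, size, min_size):
    """Classify every possible PG placement right after `failed` nodes go down.

    active  : surviving copies >= min_size (I/O is served)
    blocked : at least one copy survives, but fewer than min_size
    lost    : no copy survives
    """
    failed = set(failed)
    counts = {"active": 0, "blocked": 0, "lost": 0}
    for placement in combinations(nodes, size):
        surviving = len(set(placement) - failed)
        if surviving == 0:
            counts["lost"] += 1
        elif surviving >= min_size:
            counts["active"] += 1
        else:
            counts["blocked"] += 1
    return counts


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3", "n4"]
    # 3/3 with one node down: most placements are blocked, nothing is lost
    print("3/3, 1 failed:", pg_states(nodes, ["n1"], size=3, min_size=3))
    # 3/3 with two nodes down: everything is blocked, nothing is lost
    print("3/3, 2 failed:", pg_states(nodes, ["n1", "n2"], size=3, min_size=3))
    # 3/2 with two nodes down: placements that kept two copies stay active
    print("3/2, 2 failed:", pg_states(nodes, ["n1", "n2"], size=3, min_size=2))
```

In this model no placement ever ends up in `lost` with size 3 and two failed nodes, which matches the "no data loss" part, while the `blocked` count shows where the outage comes from.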
 
That is why we are thinking about reducing the min_size automatically in the case of nodes failing. That would make the Ceph storage writeable again, right?
 
yes, but especially if you go down to 1 you are running with a lot of risk (Ceph will ACK writes once the first OSD has the data, if that OSD dies the data is gone). it is a better idea to have more nodes and run with 3/2 (or 4/3 if you are extra paranoid). the more nodes and OSDs you have, the quicker the cluster can "absorb" any rebalancing needed when a node or OSD fails, and be back to regular redundancy.
 
I'd not recommend it.
 
