Smaller OSDs, bigger VM - what happens? Ceph OSD, replication settings.

MrBruce

New Member
Mar 7, 2020
Hi!

Please help me understand, so I can be clear about this:

Example setup (everything works fine, Ceph health is OK, the VMs are stored in Ceph, and by default only node1 runs the VMs):

node1: 1000 GB OSD (1 HDD)
node2: 1000 GB OSD (1 HDD)
node3: 1000 GB OSD (1 HDD)
node4: 500 GB OSD (1 HDD)
node5: 500 GB + 500 GB OSD (2 HDD)
node6: 250 GB + 250 GB + 250 GB + 250 GB OSD (4 HDD)
node7: 250 GB OSD (1 HDD)
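
For reference, per-node and per-OSD capacity and usage can be checked with a generic status command like this (nothing specific to my setup, just the standard check):

Code:
ceph osd df tree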


I set:

Code:
ceph osd pool set data size 7
ceph osd pool set data min_size 2
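
To double-check what the pool is actually set to (assuming the pool really is named data, as in my example), the values can be read back with:

Code:
ceph osd pool get data size
ceph osd pool get data min_size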

Then I add a 200 GB VM disk to the Ceph data pool. Everything works fine and the VM runs on node1.

Question1: At this point everything will work fine, and every node will hold 1 copy of the 200 GB VM once it is synchronized and the Ceph status is OK, right?
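
As far as I understand, whether every node ends up with exactly one copy also depends on the CRUSH rule's failure domain (host by default). For anyone who wants to check that on their own cluster, I believe these are the relevant commands (just a sketch, the output will differ per setup):

Code:
ceph osd tree
ceph osd crush rule dump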

Question2: If 5 random nodes go down, Ceph will still work, right? After the other nodes come back, they will sync and Ceph health will be OK again?

Question3: If the VM runs on node1 and only node7 is also available, the cluster and the VM still work fine with just node1 and node7 online and the other 5 nodes offline, right?

Question4: What happens if this 200 GB VM grows to 300 GB?
My theory: node7 cannot store a replica because it runs out of space, so only 6 replicas will be available, but it will still work on the other 6 nodes, including node6. No quorum, right?
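
From what I have read, what actually happens near capacity is governed by Ceph's nearfull/backfillfull/full ratios rather than by a disk simply hitting 100%; they should be visible with something like:

Code:
ceph osd dump | grep ratio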

Question5: Should I then change it to "ceph osd pool set data size 6" or "ceph osd pool set data size 5"? 5 for quorum, right? The starting value in this example is 7.

Question6: If I have 6 nodes but set "ceph osd pool set data size 5", what happens? Will the 5 replicas of the VM be stored randomly across the 6 machines, so that 1 or more nodes do not hold a full copy? Or, with 6 nodes, what should I set? (I know an odd number of nodes is recommended, like 3, 5, 7, 9, 11, ...)
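
From what I have read, replicas are placed per placement group / per object, not per whole VM disk, so one way to see where a given object would land is something like this (the pool name data is from my example and the object name is completely made up, just for illustration):

Code:
ceph osd map data some_object_name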

Then I set everything back to the "defaults" of this example:
Code:
ceph osd pool set data size 7
ceph osd pool set data min_size 2

Question7: What happens if this 200 GB VM grows to 600 GB?
My theory: node4 and node7 cannot store a replica because they run out of space (node4: only 500 GB, node7: only 250 GB), so only 5 replicas will be available, but it will still work on the remaining nodes, including node6. Will I see error messages or something? Ceph health will not be OK, right?
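
If my theory is right, I guess the degraded/undersized placement groups would show up in the health output, which I would check with:

Code:
ceph -s
ceph health detail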

Question8: What happens if, in the Q7 situation, where the VM is 600 GB but only 5 nodes can store 600+ GB, I set "ceph osd pool set data size 5"?
My theory: Ceph health will be OK, and node4 and node7 will store only part of the VM data, not a full copy?

Question9: Extending Q7. What happens if the VM is still 600 GB and the pool is still set with "ceph osd pool set data size 7", but I add more OSDs to node4 and node7, like this:
node4: 500 GB + 1000 GB OSD (2 HDD)
node7: 250 GB + 1000 GB OSD (2 HDD)

My theory: after node4 and node7 sync back, Ceph will be OK, because now all 7 nodes can store at least 600 GB, right?
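
For completeness, I assume adding the extra disk as an OSD would be done on each node with something like this (the device name /dev/sdb is only a placeholder; on Proxmox there is also the pveceph wrapper for creating OSDs):

Code:
ceph-volume lvm create --data /dev/sdb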

Also, feel free to add more questions to my topic, just please give each one a unique question ID, like Q10, Q11, etc. :)
I think this could be helpful for other newbies like me.
Please, even if you can answer only 1 question, post here and help me. Sorry for my English. Thank you.
 
In short, to all of your questions: never let Ceph run full! All IO will stop when the full limit is hit. It is also very unlikely that 5 out of 7 nodes will die at the same time. Ceph is self-healing, so if a node is down the data will be redistributed to meet the configured number of replicas again.
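
A rough way to keep an eye on overall and per-pool usage before it gets anywhere near full (just the standard usage report):

Code:
ceph df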
 

Sadly, I still don't clearly understand. So if I have 7 nodes and min_size = 2 and 5 nodes go down, will it not work? Or is it just not recommended to run like this?
 
It can work. But the performance will not be great and a lot more space will be used.
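To put the space usage into numbers: with size 7 every byte is stored 7 times, so a fully written 200 GB VM disk consumes roughly 7 × 200 GB = 1400 GB of raw capacity across the cluster.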