ceph 2 nodes with osd + 3 nodes without osd

robertozappaterreni

New Member
Nov 3, 2022
I have a Proxmox installation with 5 nodes: 2 nodes hold 22 + 22 OSDs, and the 3 nodes without OSDs run the monitors and managers.

All 44 OSDs are 1.92 TB SSDs.

The pool configuration is 2/2 with autoscaled PGs (currently 2048) and the rest of the settings at their defaults.

Since the other 3 nodes each have 8 free SSD slots, does it make sense to add another 24 OSDs distributed across the 3 nodes that currently have none? Would this increase the available space?

At the moment it appears that of 84 TB raw I can use 76 TB, of which 60 TB are occupied, so we are at 80%. Does this mean that I effectively don't have any replica of the data, correct?
 
Yes, it will increase space since you use a 2/2 configuration, but I would recommend going to 3/2 once you add the disks. Yes, you will lose some space, but the availability will be better.
 
A thousand thanks!
But is the indicated "76 TB" really all usable space, or should I account for replication and divide by 2 or 3?
 
Yes, from a usability point of view, divide it by 2 or 3 (depending on 2/2 or 3/2), and then don't fill the pool beyond roughly 60-80%, to leave room for the rebalance when something dies.
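To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in plain Python (not something Ceph reports itself), using the ~84 TB raw figure from the first post; the 80% ceiling is just the upper end of the rule of thumb above:

Code:
# Back-of-the-envelope only: usable space is roughly raw capacity divided
# by the replica count, and the pool should stay well below full so a
# failed node can still be recovered.

raw_tb = 44 * 1.92          # ~84.5 TB raw (44 x 1.92 TB OSDs)

for replicas in (2, 3):
    usable = raw_tb / replicas
    print(f"size={replicas}: ~{usable:.0f} TB usable, "
          f"try to stay below ~{usable * 0.8:.0f} TB (80% fill)")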
 
Could I consider giving the nodes different capacities?
What would this kind of configuration entail?

Example SSD:
node1: 4 x 0.9 TB + 2 x 1.92 TB = 7.44 TB
node2: 4 x 0.9 TB + 2 x 1.92 TB = 7.44 TB
node3: 4 x 0.9 TB + 2 x 1.92 TB = 7.44 TB
node4: 22 x 1.92 TB = 42.24 TB
node5: 22 x 1.92 TB = 42.24 TB

RAW TOTAL: ~106 TB
Replication 2/2 = ~53 TB usable
Replication 3/2 = ~35 TB usable

Is this a plausible calculation?
If the node with 22 disks fails, will Ceph have to re-replicate about 42 TB? In that situation, would it still manage with 2/2?
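As a quick sanity check of the arithmetic, something like this reproduces the totals (plain Python, node names and figures are the ones from this post, all values approximate):

Code:
# Quick sanity check of the proposed layout; approximate figures only,
# not something Ceph itself reports.

nodes = {
    "node1": 4 * 0.9 + 2 * 1.92,   # 7.44 TB
    "node2": 4 * 0.9 + 2 * 1.92,   # 7.44 TB
    "node3": 4 * 0.9 + 2 * 1.92,   # 7.44 TB
    "node4": 22 * 1.92,            # 42.24 TB
    "node5": 22 * 1.92,            # 42.24 TB
}

raw = sum(nodes.values())
print(f"raw total: ~{raw:.1f} TB")                        # ~106.8 TB
for replicas in (2, 3):
    print(f"size={replicas}: ~{raw / replicas:.1f} TB usable")

# After a node4 failure Ceph tries to re-create the replicas it held on
# the surviving hosts; nodes 1-3 together hold far less raw space than
# node4 did, which is the risk pointed out in the reply below.
small = nodes["node1"] + nodes["node2"] + nodes["node3"]
print(f"node4 raw: ~{nodes['node4']:.1f} TB, "
      f"nodes 1-3 combined: ~{small:.1f} TB")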

thanks for the support
 
If you actually go with 2/2 (which is in itself dangerous) and node 4 or 5 fails, Ceph might try to re-balance itself and nodes 1/2/3 will run out of space.
Disclaimer: I am NOT a ceph specialist.
 
Thanks for your interest. If I understand correctly, you are saying that to make up for the loss of one of the two nodes 4 or 5, nodes 1-2-3 together should have enough free space for all 42 TB?
 
Ask yourself: what happens when (not if) node 4 dies? In my understanding, Ceph tries to re-balance. Onto which nodes should it copy the data, and how much would that be?
As I have already said, I am not really a Ceph user. But in my understanding, a 2/2 system can only work while two copies are available. Every single bit that is written NEEDS to be stored on two devices/nodes/OSDs at the same time. If one of those two copies cannot be written, the whole Ceph cluster stops writing immediately. That's why I wrote "dangerous". (And no, 2/1 is not a solution nor a workaround.)
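Purely to illustrate that size/min_size rule, a toy model (this is not Ceph's implementation, just the per-placement-group condition it enforces):

Code:
# Toy model of the rule only, not Ceph's code: a placement group keeps
# accepting I/O while the number of surviving replicas is >= min_size.

def pg_accepts_io(size: int, min_size: int, failed_replicas: int) -> bool:
    surviving = size - failed_replicas
    return surviving >= min_size

print(pg_accepts_io(2, 2, 1))  # False: 2/2 blocks I/O after a single loss
print(pg_accepts_io(3, 2, 1))  # True: 3/2 keeps serving I/O and backfills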

And a third time: I am not really competent to give ultimate advice here. Maybe I am completely wrong. If you are using Ceph in this way, you need to make sure to understand the implications...
 
Thank you, I think I understand now: the data from the dead node will be rebalanced onto the other nodes, and with 2/2 it is kept in duplicate on two different nodes. Ideally, I should therefore have as much free space as the size of the dead node.