[SOLVED] Node redundancy on Ceph

vijayk

New Member
Apr 8, 2024
Hello,
I'm new to Proxmox. I just deployed a 5-node Proxmox cluster with Ceph installed on all nodes. Ceph is configured with 5 monitors.

I'm facing an issue: when I shut down 1 node, the VMs on Ceph storage migrate automatically to the next available node. But when I shut down 2 nodes, my Ceph storage is no longer accessible in the cluster. What should I look at?

Thanks in advance.
 
What should I look at?
The critical setting is "size/min_size", which is "3/2" by default: data is written to three nodes, and two of them need to be online. So you can only lose a single node.

If you want to be able to lose two hosts, you probably need to go for 5/3: all five nodes get a copy, and three of them need to be available at any time.

4/3 would not be sufficient: data is written to four (of the five) nodes, and three of those four need to be available.
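
As a sketch of how these values can be inspected and changed (the pool name "vm-pool" is just a placeholder for your own; these are standard Ceph commands, run from any node's shell):

Code:
# Show size/min_size (and more) for every pool
ceph osd pool ls detail

# Raise a pool to 5/3, i.e. five copies of which three must be online
ceph osd pool set vm-pool size 5
ceph osd pool set vm-pool min_size 3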

Disclaimer: I am not a Ceph specialist but a beginner, so please correct me if I am wrong...

Edit, thinking further: what about 4/2? That should work as well: four copies, two of which may be lost, so any two nodes can vanish.

Edit 2: for reference, see https://pve.proxmox.com/pve-docs/chapter-pveceph.html
 
@UdoB Tested with 4/2: if I shut down 2 nodes, the Ceph storage is not accessible.
Tested with 5/3 and it works: if I shut down 2 nodes, the Ceph storage is still accessible.
 
Tested with 4/2: if I shut down 2 nodes, the Ceph storage is not accessible.
Just to be sure: did you wait long enough for the re-balancing to finish first? I am surprised...
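
As a sketch, whether the re-balancing has finished can be checked from any node with the standard Ceph status commands:

Code:
# Overall cluster state: re-balancing is done once all PGs are "active+clean"
ceph -s

# Per-PG detail if anything is still degraded, undersized or backfilling
ceph health detail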

(PS: did you select "Tutorial" on purpose? You can change it by editing the first post and using the drop-down menu for the title.)
 
Yes, I waited for the re-balancing, but the Ceph storage was still not accessible.
Okay. I was focusing on the distribution/availability of the OSDs.

It seems plausible that there are different (but similar?) requirements regarding the MONs (and possibly the MGRs). I am not sure about this, but if the majority of the MONs are shut down, Ceph may refuse writes.
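
A sketch of how the monitor quorum can be checked with the standard Ceph commands (with 5 MONs, at least 3 of them must be up to keep quorum):

Code:
# List the MONs and show which of them are currently in quorum
ceph mon stat

# More detail, including the elected leader
ceph quorum_status --format json-pretty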

As I've said: I am not an expert... ;-)