2/4 nodes failure on CLUSTER with CEPH

dominiaz

Renowned Member
Sep 16, 2016
I have a 4-node setup - 4 identical servers.

Setup is ok, everything looks fine.

I am trying to simulate a node failure before I start using the cluster.

When I shut down 1 node (the other 3 nodes remain connected):
Cluster, HA, Ceph - everything is fine.

But the problem is when I shut down 2 nodes (the other 2 nodes remain connected):
Cluster, HA, Ceph - everything crashes and stops working.

Why? How to fix it?
 
If half of the cluster is gone, there is no more quorum, and thus no decisions can be made. You always need at least (N/2)+1 nodes for quorum, where N is the total number of nodes.
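The (N/2)+1 rule above can be sketched like this (illustrative Python, not Proxmox code):

```python
# Quorum rule of thumb: a cluster of N votes needs
# floor(N/2) + 1 live votes to keep quorum.
def quorum(n_nodes: int) -> int:
    """Minimum number of live nodes needed for quorum."""
    return n_nodes // 2 + 1

for n in (3, 4, 5):
    print(f"{n} nodes: quorum = {quorum(n)}, "
          f"tolerates {n - quorum(n)} node failure(s)")
```

Note that a 4-node cluster needs 3 live nodes for quorum, so it tolerates only a single node failure - the same as a 3-node cluster.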
 
OK, thanks for your answer.

Can I add an extra node to the cluster without Ceph storage, only for quorum?
 
Yes, that would solve the quorum issue. Whether you can still access your Ceph data when 2 of the 4 storage nodes are down depends on the Ceph configuration/replication.
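Whether the data stays reachable can be reasoned about with a small sketch. Assuming the default CRUSH behavior of placing each of a PG's replicas on a distinct host (an assumption here, not something stated in the thread), with size 3 on 4 hosts any 2-host failure still leaves at least one replica of every PG:

```python
from itertools import combinations

# Pool settings from the thread; host names are placeholders.
hosts = ("node1", "node2", "node3", "node4")
size, min_size = 3, 1

# Assumption: each of the `size` replicas of a PG lands on a
# distinct host (default CRUSH host failure domain). Check every
# possible placement against every possible 2-host failure.
worst = min(
    len(set(placement) - set(down))
    for placement in combinations(hosts, size)
    for down in combinations(hosts, 2)
)
print(f"worst case: {worst} replica(s) of each PG survive a 2-host failure")
```

Since the worst case leaves 1 surviving replica, a pool with min_size 1 could in principle keep serving I/O - but the Proxmox cluster itself still needs quorum, as discussed above.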
 
I have a 4 TB RAID 5 array in every server.

I have 4x OSDs (4 TB each).

My ceph pool is:

Size: 3
Min. Size: 1
pg_num: 300

Is it ok?
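For reference, a common rule of thumb for sizing pg_num is roughly 100 PGs per OSD, divided by the pool size, rounded to the nearest power of two (the function below is an illustrative sketch, not a Ceph API):

```python
import math

def suggested_pg_num(num_osds: int, size: int, pgs_per_osd: int = 100) -> int:
    """Rule-of-thumb pg_num: ~100 PGs per OSD / replication size,
    rounded to the nearest power of two."""
    raw = num_osds * pgs_per_osd / size
    return 2 ** round(math.log2(raw))

# 4 OSDs, size 3 -> ~133 raw, nearest power of two is 128
print(suggested_pg_num(4, 3))
```

By that rule, 4 OSDs with size 3 would suggest pg_num 128 rather than 300 (pg_num is normally chosen as a power of two).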