2/4 node failure on cluster with Ceph

dominiaz

Renowned Member
Sep 16, 2016
I have a 4-node setup - 4 identical servers.

The setup is OK, everything looks fine.

I am trying to simulate a node failure before I start using the cluster.

When I shut down 1 node (the other 3 nodes stay connected):
cluster, HA, Ceph - everything is fine.

The problem is when I shut down 2 nodes (the other 2 nodes stay connected):
cluster, HA, Ceph - everything crashes and stops working.

Why? How do I fix it?
 
If half of the cluster is gone, there is no more quorum, and thus no decisions can be made. You always need a strict majority for quorum, i.e. at least floor(N/2)+1 nodes, where N is the total number of nodes.
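A quick sketch of the arithmetic (Python, purely illustrative):

Code:
def quorum(n: int) -> int:
    # Minimum votes for quorum: a strict majority of n nodes.
    return n // 2 + 1

n = 4
print(quorum(n))           # 3 -> a 4-node cluster needs 3 votes up
print(n - 1 >= quorum(n))  # True: one node down is tolerated
print(n - 2 >= quorum(n))  # False: two nodes down lose quorum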
 
OK, thanks for your answer.

Can I add an extra node to the cluster without Ceph storage, just for quorum?
 
Yes, that would solve the quorum issue. Whether you can still access your Ceph data when 2 of the 4 storage nodes are down depends on the Ceph configuration / replication.
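Extending the sketch above: a fifth, quorum-only vote changes the math.

Code:
def quorum(n: int) -> int:
    return n // 2 + 1

n = 5  # 4 storage nodes + 1 quorum-only node
print(n - 2 >= quorum(n))  # True: 3 of 5 votes remain, cluster stays quorate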
 
I have a 4 TB RAID 5 array in every server.

I have 4 OSDs (4 TB each).

My ceph pool is:

Size: 3
Min. Size: 1
pg_num: 300

Is it ok?
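As a rough sanity check on availability (host names are placeholders for my 4 servers):

Code:
from itertools import combinations

hosts = ["node1", "node2", "node3", "node4"]  # placeholder names
size, min_size = 3, 1  # the pool settings above

# For every way 3 replicas could sit on distinct hosts, check that
# every possible 2-host outage leaves at least min_size replicas.
survives = all(
    len(set(placement) - set(failed)) >= min_size
    for placement in combinations(hosts, size)
    for failed in combinations(hosts, 2)
)
print(survives)  # True: at least one replica outlives any 2-host failure

This assumes Ceph spreads the 3 replicas across 3 distinct hosts (the default host failure domain), so it only checks availability, not whether min_size 1 is advisable.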
 
