server dead in pve-ceph cluster

Stefano Giunchi

Jan 17, 2016
I have a cluster with 3 nodes, which act both as VM nodes and as Ceph storage.
node1: 1 OSD, 1 TB disk
node2: 2 OSDs, 1 TB disks
node3: 2 OSDs, 1 TB disks

The total is 4.7 TB of usable disk space. I created a Ceph pool with size 2 / min_size 1, so only two copies of my data exist (one replica besides the original). I have since read that this is a risky setting; I will change the configuration to 3/2 once everything is healthy again.

After I shut down all the servers because I had to relocate everything, node3 did not come back up. More precisely, its RAID controller is dead: it held one mirror for the system disks and several single-drive configurations for the OSDs.

The cluster is working with node1 and node2, but the available space is dangerously close to the 2x data space I need.
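To make the space concern concrete, here is a rough back-of-the-envelope sketch. It assumes the pool uses the usual CRUSH rule with a per-host failure domain, so with size 2 on two surviving nodes each node has to hold one full copy and the smaller node (node1, one 1 TB OSD) becomes the limit. The 0.85 ceiling mirrors Ceph's default nearfull ratio; the function name is hypothetical and the disk sizes are the ones from the post.

```python
# Rough capacity estimate for a replicated pool (illustrative only).
# Assumption: failure domain is "host", so copies of an object never share a node.

def safe_data_capacity_tb(hosts, pool_size, ceiling=0.85):
    """Estimate how much data (TB, before replication) fits below `ceiling`.

    hosts:     mapping of node name -> list of raw OSD sizes in TB
    pool_size: the pool "size", i.e. number of copies of each object
    ceiling:   fraction of raw space treated as the safe limit
    """
    per_host_raw = [sum(osds) for osds in hosts.values()]
    # Upper bound: total raw space divided by the number of copies.
    bound = sum(per_host_raw) / pool_size
    # When there are exactly as many hosts as copies, every host stores
    # one full copy, so the smallest host caps the whole pool.
    if len(hosts) == pool_size:
        bound = min(bound, min(per_host_raw))
    return bound * ceiling


all_nodes = {"node1": [1.0], "node2": [1.0, 1.0], "node3": [1.0, 1.0]}
degraded = {"node1": [1.0], "node2": [1.0, 1.0]}

print("3 nodes, size 2 (upper bound):", safe_data_capacity_tb(all_nodes, 2), "TB")
print("2 nodes, size 2:", safe_data_capacity_tb(degraded, 2), "TB")
```

With three unevenly sized nodes the first figure is only an upper bound, since CRUSH weighting decides the real distribution; the actual per-OSD fill level is best checked with `ceph osd df`.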

I will repair or replace node3 ASAP. In the meantime, I set the OSD noout flag, hoping it stops Ceph from recreating the second replica on the remaining nodes, since I know OSDs must NEVER reach 100% full.
Is that a good move while I struggle to recover node3 (or at least add disks to node1 and node2), or is there another way to pause replication?
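For reference, the flag mentioned above is toggled with `ceph osd set noout` and `ceph osd unset noout`. The sketch below only builds and prints those command lines (dry run by default); the helper names are hypothetical. Note that `noout` only keeps down OSDs from being marked out, which is what triggers re-replication; it does not stop new writes from filling the remaining disks. Ceph also has the `norebalance`, `nobackfill` and `norecover` flags, which can be passed to the same helper.

```python
import shlex
import subprocess


def ceph_flag_command(flag, enable):
    """Build the ceph CLI argv that sets or clears a cluster-wide OSD flag."""
    return ["ceph", "osd", "set" if enable else "unset", flag]


def run(argv, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)


# While node3 is down: keep its OSDs from being marked out.
run(ceph_flag_command("noout", enable=True))

# Once node3 is back and its OSDs are up again: clear the flag.
run(ceph_flag_command("noout", enable=False))
```

Leaving the flag set after the repair would also suppress recovery on a later, real failure, so clearing it is part of the procedure.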

Any other suggestion is really appreciated.

Thanks
 
Hi,
"or at least add disks to node1 and node2,"
Yes, I would add the disks from node3 to node2 and node1 to increase the disk space, and set the pool replication to 2/2 to avoid warnings.

But a 2-node Ceph cluster is only for temporary use, so try to fix your node3 as soon as possible.
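The replication settings discussed in this thread (2/2 as a stopgap, 3/2 once node3 is back) map to `ceph osd pool set <pool> size <n>` and `ceph osd pool set <pool> min_size <n>`. A minimal sketch that prints the commands for both cases; the pool name `mypool` is a placeholder and the helper name is hypothetical.

```python
def pool_replication_commands(pool, size, min_size):
    """Return the ceph CLI argv lists that set size and min_size on a pool."""
    if not 1 <= min_size <= size:
        raise ValueError("min_size must be between 1 and size")
    return [
        ["ceph", "osd", "pool", "set", pool, "size", str(size)],
        ["ceph", "osd", "pool", "set", pool, "min_size", str(min_size)],
    ]


POOL = "mypool"  # placeholder: substitute the real pool name

# Temporary setting while only node1 and node2 are available.
for argv in pool_replication_commands(POOL, size=2, min_size=2):
    print(" ".join(argv))

# Target setting once all three nodes are healthy again.
for argv in pool_replication_commands(POOL, size=3, min_size=2):
    print(" ".join(argv))
```

Keep in mind that with 2/2, I/O to the pool pauses whenever one of the two remaining nodes is down, because fewer than `min_size` copies are available.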
 
