Hello!
Due to an HD crash I was forced to rebuild a server node from scratch, meaning I installed the OS and Proxmox VE (apt install proxmox-ve postfix open-iscsi) fresh on the server.
Then I ran pvecm add 192.168.10.11 -ring0_addr 192.168.10.12 -ring1_addr 192.168.20.12 to add the node to the existing cluster.
This all worked well.
As a next step I installed Ceph (pveceph install) and finally executed pveceph createmon.
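To recap, the full sequence on the rebuilt node was (IPs as above: 192.168.10.11 is an existing cluster member, the .12 addresses are the new node's ring addresses):

# fresh OS, then Proxmox VE packages
apt install proxmox-ve postfix open-iscsi
# join the existing Proxmox cluster
pvecm add 192.168.10.11 -ring0_addr 192.168.10.12 -ring1_addr 192.168.20.12
# install the Ceph packages and recreate the monitor
pveceph install
pveceph createmon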
The Ceph status shows that the monitor on this node (ld4464) is out of quorum:
ceph health detail
HEALTH_WARN noout flag(s) set; 20 osds down; 3 hosts (22 osds) down; Reduced data availability: 1429 pgs inactive; Degraded data redundancy: 7147773/14446416 objects degraded (49.478%), 1444 pgs degraded, 1845 pgs undersized; mon ld4257 is low on available space; 1/3 mons down, quorum ld4257,ld4465
OSDMAP_FLAGS noout flag(s) set
OSD_DOWN 20 osds down
[...]
MON_DOWN 1/3 mons down, quorum ld4257,ld4465
    mon.ld4464 (rank 1) addr 10.97.206.98:6789/0 is down (out of quorum)
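I assume I could inspect the monitor state further from one of the quorate nodes with the standard Ceph CLI, e.g.:

ceph quorum_status --format json-pretty   # who is in quorum, and who is missing
ceph mon dump                             # current monmap, incl. the address registered for mon.ld4464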
Questions:
Does it make sense to continue like this?
Will it be possible to rebuild the cluster?
In my understanding I must fix the issue with the failed monitor service on node ld4464.
How can I do this?
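My assumption is that I would first check the monitor daemon on ld4464 and, if it cannot be repaired, destroy and recreate it. A sketch of what I have in mind (the destroymon/createmon syntax is assumed from the PVE 5.x tooling):

systemctl status ceph-mon@ld4464      # is the mon daemon running at all?
journalctl -u ceph-mon@ld4464 -b      # why did it fail / not join quorum?
pveceph destroymon ld4464             # remove the broken monitor ...
pveceph createmon                     # ... and recreate it from scratch

Is this the right approach?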
THX