I have (had) a Proxmox 8.1 cluster with 2 nodes and one QDevice as the third witness.
/etc/ceph/ceph.conf
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 10.227.101.33/24
fsid = something
mon_allow_pool_delete = true
mon_host = 10.227.101.33 10.227.101.32
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 10.227.101.33/24
[client]
keyring = /etc/pve/priv/$cluster.$name.keyring
[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring
[mds.proxmox03]
host = proxmox03
mds_standby_for_name = pve
[mds.proxmox32]
host = proxmox32
mds_standby_for_name = pve
[mon.proxmox03]
public_addr = 10.227.101.33
[mon.proxmox32]
public_addr = 10.227.101.32
It had been working for a while, but a few days ago one of the nodes, 10.227.101.33, suddenly died, and my Ceph cluster went down with it.
Only after that did I notice that there may be an error in my config: I guess cluster_network (and public_network) should be a subnet (10.227.101.0/24) and not a specific host address, or at least not the address of a dead host.
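To check what the notation itself means, here is a small sketch using only Python's standard ipaddress module (the variable names are mine, not anything from Ceph). It shows that 10.227.101.33/24 has host bits set and that, once those are masked off, it names the same network as 10.227.101.0/24, which contains both monitor addresses. This is only CIDR arithmetic; it says nothing about how Ceph itself parses the value.

```python
import ipaddress

# Values copied from the ceph.conf above.
configured = "10.227.101.33/24"  # cluster_network / public_network as written
mon_hosts = ["10.227.101.33", "10.227.101.32"]

# strict=False masks off the host bits instead of raising ValueError.
network = ipaddress.ip_network(configured, strict=False)
print(network)  # 10.227.101.0/24

# Both monitor addresses should fall inside that /24.
for host in mon_hosts:
    print(host, ipaddress.ip_address(host) in network)
```

So, at least as far as the notation goes, the host-style value and the subnet-style value describe the same /24; whether that difference matters to Ceph is exactly what I am unsure about.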
But I am not able to edit ceph.conf because of a permission issue (/etc/ceph/ceph.conf is meant to be read-only), and I am not allowed to change its permissions with chmod either.
Is there anything I can do to save my cluster, or am I totally lost?