Hi.
I have a cluster of 10 servers. The root filesystem of one of them failed last week, so I reinstalled that server on a new root disk. Getting the Ceph configuration back in order is proving a bit more difficult.
Creating an empty /var/lib/ceph/mon/ceph-servername allowed me to use the "Destroy" button in the interface to remove the associated monitor.
But creating a new monitor is still not possible. It fails with: "monitor address x.x.x.x already in use (500)".
Checking with "ss -tupln" shows that no Ceph process is running.
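For what it's worth, the same thing can be double-checked from another node with a plain TCP probe instead of "ss". This is only a minimal sketch in Python: it assumes the monitor would listen on the default Ceph monitor ports (3300 for msgr2, 6789 for msgr1), and the helper names port_open and mon_ports_in_use are made up for illustration.

```python
import socket

# Assumption: default Ceph monitor ports (3300 = msgr2, 6789 = msgr1).
MON_PORTS = (3300, 6789)


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def mon_ports_in_use(host, ports=MON_PORTS, timeout=1.0):
    """Return the subset of ports on which something accepts connections."""
    return [p for p in ports if port_open(host, p, timeout)]


# Example call (not executed here, since it needs access to the cluster network):
# mon_ports_in_use("192.168.12.4")
```

An empty list for the failed node's address would agree with what "ss" reports: nothing is actually bound to the monitor ports there, so the error message is presumably not about a listening socket.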
Looking at ceph.conf, I see this:
Code:
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 192.168.12.0/24
fsid = 32376c87-d8b6-46e2-8aad-2587ed1f39f5
mon_allow_pool_delete = true
mon_host = 192.168.12.4 192.168.12.3 192.168.12.2 192.168.12.1 192.168.12.6 192.168.12.7 192.168.12.8 192.168.12.5 192.168.12.9 192.168.12.10
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 2
public_network = 192.168.12.0/24
[client]
keyring = /etc/pve/priv/$cluster.$name.keyring
[client.crash]
keyring = /etc/pve/ceph/$cluster.$name.keyring
[mon.bill]
public_addr = 192.168.12.7
[mon.bob]
public_addr = 192.168.12.9
[mon.grat]
public_addr = 192.168.12.10
[mon.henry]
public_addr = 192.168.12.6
[mon.jack]
public_addr = 192.168.12.2
[mon.jim]
public_addr = 192.168.12.8
[mon.joe]
public_addr = 192.168.12.1
[mon.marcel]
public_addr = 192.168.12.5
[mon.william]
public_addr = 192.168.12.3
The failed node was 192.168.12.4 (named Averell). The [mon.averell] section is no longer in the conf file, but its IP address is still listed in the mon_host parameter.
Can I safely edit the mon_host parameter? Is that the problem?
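The mismatch can be confirmed mechanically rather than by eye. Below is a minimal sketch in Python that parses a trimmed copy of the ceph.conf above and lists every mon_host address that has no matching [mon.*] section. The function name orphaned_mon_hosts is made up for illustration, and the embedded text is just the relevant subset of the file.

```python
import configparser

# Trimmed copy of the ceph.conf posted above (only the monitor-related parts).
CEPH_CONF = """
[global]
mon_host = 192.168.12.4 192.168.12.3 192.168.12.2 192.168.12.1 192.168.12.6 192.168.12.7 192.168.12.8 192.168.12.5 192.168.12.9 192.168.12.10
[mon.bill]
public_addr = 192.168.12.7
[mon.bob]
public_addr = 192.168.12.9
[mon.grat]
public_addr = 192.168.12.10
[mon.henry]
public_addr = 192.168.12.6
[mon.jack]
public_addr = 192.168.12.2
[mon.jim]
public_addr = 192.168.12.8
[mon.joe]
public_addr = 192.168.12.1
[mon.marcel]
public_addr = 192.168.12.5
[mon.william]
public_addr = 192.168.12.3
"""


def orphaned_mon_hosts(conf_text):
    """Return mon_host addresses that no [mon.*] section declares as public_addr."""
    # interpolation=None so values such as $cluster.$name are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)

    # mon_host may be space- or comma-separated; accept both.
    listed = parser["global"]["mon_host"].replace(",", " ").split()
    declared = {
        parser[section]["public_addr"]
        for section in parser.sections()
        if section.startswith("mon.") and "public_addr" in parser[section]
    }
    return [addr for addr in listed if addr not in declared]


print(orphaned_mon_hosts(CEPH_CONF))  # prints ['192.168.12.4']
```

This only inspects the config file. It says nothing about whether the cluster's monmap still contains the old monitor, which would have to be checked separately.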
Regards