Could not connect to ceph cluster despite configured monitors (500)

Magneto

Well-Known Member
Jul 30, 2017
Please help.

I tried to change my Ceph IPs from 192.168.10.0/24 to 192.168.11.0/24, but it went horribly wrong. My initial setup is / was as follows:
3 servers:
SRV1 - 192.168.10.241
SRV2 - 192.168.10.242
SRV3 - 192.168.10.243

I wanted to move Ceph to a second IP subnet, on different network cards:
Storage1 - 192.168.11.241
Storage2 - 192.168.11.242
Storage3 - 192.168.11.243

Both IP subnets work and all hosts are in /etc/hosts.



So I ran this code on each node, with the hostname and IP changed as necessary.

Code:
cd /home
mkdir tmp
ceph auth get mon. -o tmp/key-ceph-Storage1
ceph mon getmap -o tmp/map-ceph-Storage1
ceph-mon -i Storage1 --mkfs --monmap tmp/map-ceph-Storage1 --keyring tmp/key-ceph-Storage1
chown ceph:ceph -Rf /var/lib/ceph/mon/ceph-Storage1/
ceph-mon -i Storage1 --public-addr 192.168.11.241:6789



All 6 monitors showed up, so I started removing the SRV1-SRV3 monitors, and then they all disappeared. Now I am trying to re-add some monitors but keep getting this error:

Could not connect to ceph cluster despite configured monitors (500)
 
Here's the ceph.conf file:

Bash:
root@SRV1:/home# more /etc/ceph/ceph.conf
[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster network = 192.168.10.0/24
         cluster_network = 192.168.10.0/24
         fsid = 1d0f6b10-587b-46b1-9978-4ddf4a61b8fe
         mon_allow_pool_delete = true
         mon_host = 192.168.10.241 192.168.10.242 192.168.10.243
         osd_pool_default_min_size = 2
         osd_pool_default_size = 3
         public network = 192.168.10.0/24
         public_network = 192.168.10.0/24

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
         keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.SRV3]
         host = SRV3
         mds_standby_for_name = pve

[mon.SRV1]
         public_addr = 192.168.10.241

[mon.SRV2]
         public_addr = 192.168.10.242

[mon.SRV3]
         public_addr = 192.168.10.243
 
All 6 monitors showed up, so I started removing the SRV1-SRV3 monitors, and then they all disappeared. Now I am trying to re-add some monitors but keep getting this error:
Is there still an old monitor left? If the new MONs didn't connect to the old ones and you removed the old MONs, then the other maps (OSD, MDS, PG, CRUSH) are most likely lost.
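One way to check whether anything of the old monitors survives is to look for their data directories on each node. The sketch below assumes the default /var/lib/ceph/mon layout and the cluster name "ceph"; the helper name list_mon_stores is made up for illustration.

```shell
#!/bin/sh
# List the monitor IDs that still have a data directory on this node.
# list_mon_stores is a hypothetical helper; the default base path is an
# assumption about this cluster's layout.
list_mon_stores() {
    base="${1:-/var/lib/ceph/mon}"
    for dir in "$base"/ceph-*; do
        # Skip the unexpanded glob and anything that is not a directory.
        [ -d "$dir" ] || continue
        # Strip everything up to and including "ceph-" to get the monitor ID.
        echo "${dir##*/ceph-}"
    done
}

list_mon_stores

# For a surviving store, with that monitor daemon stopped, the monmap it
# holds can be inspected, for example:
#   ceph-mon -i SRV1 --extract-monmap /tmp/monmap
#   monmaptool --print /tmp/monmap
```

If an old ID such as SRV1 is still listed, its store may still hold the original maps, so it is worth leaving that directory untouched until the cluster is reachable again.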

cluster network = 192.168.10.0/24
cluster_network = 192.168.10.0/24
public network = 192.168.10.0/24
public_network = 192.168.10.0/24
Each network option only needs to be set once; the spellings with a space and with an underscore are the same option. You can also list multiple networks, separated by commas, in a single option.
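To illustrate the point about the network options, a deduplicated [global] fragment might look like the following. This is only a sketch: listing both subnets under public_network is an assumption about what would be wanted while the monitors are being moved, not something taken from the original file.

```ini
[global]
         # One entry per option; the space and underscore spellings are equivalent.
         cluster_network = 192.168.10.0/24
         # Assumption: accept both subnets while the MONs are being migrated.
         public_network = 192.168.10.0/24, 192.168.11.0/24
```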

In general, to change the public_network IP range, the new network must be routed to the existing subnet so that the new and old MONs can reach each other. Then the MONs can be replaced one at a time, each with a new MON on the new subnet.
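The one-at-a-time replacement could be planned roughly as below. This is a sketch that only prints the commands rather than running them; plan_mon_swap is a hypothetical helper, the IDs and addresses are the ones from this thread, and creating the new monitor itself (step 1) is left to whatever tooling you normally use.

```shell
#!/bin/sh
# Print (do not run) the steps for swapping a single monitor.
# plan_mon_swap is a hypothetical helper written for this illustration.
plan_mon_swap() {
    old_id="$1"
    new_id="$2"
    new_addr="$3"
    echo "# 1. create and start mon.$new_id on $new_addr, then confirm it joined:"
    echo "ceph quorum_status --format json-pretty"
    echo "# 2. only once mon.$new_id is in quorum, remove mon.$old_id:"
    echo "ceph mon remove $old_id"
    echo "# 3. confirm the remaining monitors still form a quorum:"
    echo "ceph mon stat"
}

# Swap one pair, check the cluster, and only then move on to the next pair.
plan_mon_swap SRV1 Storage1 192.168.11.241
plan_mon_swap SRV2 Storage2 192.168.11.242
plan_mon_swap SRV3 Storage3 192.168.11.243
```

The key design point is that an old MON is never removed before its replacement is confirmed in quorum. Once all three are swapped, mon_host and the [mon.*] sections in ceph.conf would also need to point at the new addresses.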
 
Hello,

I have the same problem with my Proxmox Ceph cluster: Could not connect to ceph cluster despite configured monitors (500). At the moment I am unable to retrieve data from my OSDs.

Have you found a solution?

Kind regards.
 
@vlazarus, please open up a new thread and elaborate on your cluster in more detail.
 
