How can I separate the Ceph Public Network on a 2-node Cluster?

cdemi

May 14, 2020
Let me start by saying that I know a 2-node cluster is not recommended, but I only have a limited budget for my homelab :(

I also know this isn't a forum for Ceph support, but any help or tips would be greatly appreciated, as I'm trying to learn this stuff by myself.

I don't mind having downtime for this procedure.

ceph.conf:

Code:
    [global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster_network = 172.18.21.200/24
         fsid = 083f7640-cb8e-467e-8fa1-4d29291c17aa
         mon_allow_pool_delete = true
         mon_host = 172.18.21.200 172.18.21.100
         osd_pool_default_min_size = 2
         osd_pool_default_size = 2
         public_network = 172.18.21.200/24

    [client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

    [mds]
         keyring = /var/lib/ceph/mds/ceph-$id/keyring

    [mds.legion]
         host = legion
         mds_standby_for_name = pve

    [mds.magneto]
         host = magneto
         mds_standby_for_name = pve

    [mon.legion]
         public_addr = 172.18.21.100

    [mon.magneto]
         public_addr = 172.18.21.200

I tried changing [global].public_network to 172.18.10.200/24, moving the [global].mon_host addresses to the same subnet, and setting each [mon.x].public_addr to an address in the new subnet. I made sure the updated ceph.conf was replicated to both hosts.

However, as soon as I make this change, the Ceph cluster stops responding. I can't query it with commands like ceph -s because they just hang until they time out.

If I tail /var/log/syslog, I get:

Code:
    May 14 10:23:25 legion ceph-mon[9507]: 2020-05-14 10:23:25.109 7f0a877d0700 -1 mon.legion@1(electing) e2 failed to get devid for : fallback method has serial ''but no model

I assume this happens because the ceph cluster no longer has quorum. If I revert the changes, everything starts working again.

What am I doing wrong? Or what is the correct procedure for achieving this?
 
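You can't change the IP of the existing MONs just by editing ceph.conf, since each MON's address is stored in the monitor map (monmap) that the MONs keep in their own database. You will need to transition gradually: add a MON in the new subnet first, then remove one from the old subnet. Changing the cluster_network is easier. During the transition, both subnets need to be routed so that the MONs can reach each other.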
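
The transition described above could be sketched roughly as follows. This is only a sketch under several assumptions, not a verified procedure: it assumes Proxmox VE 6 or later (where `pveceph mon create` accepts `--mon-address`; check `man pveceph` on your version), one monitor per node with the node name as monitor ID, and the hypothetical target addresses 172.18.10.100 and 172.18.10.200. Because each node here holds only one monitor, the sketch destroys and then re-creates that node's monitor instead of adding a new one first. It also assumes that `public_network` in ceph.conf has already been extended to list both subnets. `run` and `migrate_mon` are helper names made up for this example, and by default the script only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: move one Ceph monitor to a new subnet on Proxmox VE.
# Assumed prerequisite (not done here): ceph.conf lists both subnets, e.g.
#   public_network = 172.18.21.0/24, 172.18.10.0/24
# and the two subnets can reach each other.

# Print the command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: run it on the node that hosts the monitor.
migrate_mon() {
    node="$1"      # monitor ID; assumed to equal the node name
    new_addr="$2"  # assumed address of this node in the new subnet

    run ceph mon dump                                 # monmap before the change
    run pveceph mon destroy "$node"                   # drop the old-subnet monitor
    run pveceph mon create --mon-address "$new_addr"  # re-create it in the new subnet
    run ceph quorum_status --format json-pretty       # confirm quorum before continuing
}

# Example for the first node; repeat on the other node only once quorum is back.
migrate_mon legion 172.18.10.100
```

With the default dry run, the script prints the four commands prefixed with `+` so the plan can be reviewed first. Doing one node at a time and waiting for quorum in between matters, since at every step the remaining monitor has to stay reachable from the node being changed.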
 
