Unable to change ceph networks

mart.v

Well-Known Member
Hi guys,

I am trying to change the Ceph networks. Until now I had a single subnet (172.16.254.0/24) for both the public and the cluster network. The goal is to end up with this configuration:

Code:
cluster_network = 10.10.112.0/24
public_network = 10.10.111.0/24

I have followed this tutorial: https://forum.proxmox.com/threads/ceph-changing-public-network.119116/#post-614241

Now I have successfully changed the public network, so all clients are using the new IP range, as are the MON, MGR and MDS daemons.
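
To make the intermediate state explicit, the relevant [global] keys in /etc/pve/ceph.conf at this point are roughly the following (a sketch, all other keys unchanged):

Code:
[global]
    public_network  = 10.10.111.0/24   # already live for clients, MON, MGR and MDS
    cluster_network = 10.10.112.0/24   # the change that triggers the OSD problem below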

The problem appears when I try to restart the first OSD. It shows as up, but it is gone after a few seconds. I can see that the service is running and bound to the correct IP, but the other OSDs report it as dead:

Code:
2024-04-21T22:19:02.241484+0200 mon.node13 (mon.0) 9815 : cluster [DBG] osd.14 reported failed by osd.19
2024-04-21T22:19:02.527605+0200 mon.node13 (mon.0) 9818 : cluster [DBG] osd.14 reported failed by osd.15
2024-04-21T22:19:02.660661+0200 mon.node13 (mon.0) 9819 : cluster [DBG] osd.14 reported failed by osd.24
2024-04-21T22:19:02.803907+0200 mon.node13 (mon.0) 9820 : cluster [DBG] osd.14 reported failed by osd.58
2024-04-21T22:19:02.901519+0200 mon.node13 (mon.0) 9823 : cluster [DBG] osd.14 reported failed by osd.67
2024-04-21T22:19:03.462197+0200 mon.node13 (mon.0) 9831 : cluster [DBG] osd.14 reported failed by osd.32
2024-04-21T22:19:03.589426+0200 mon.node13 (mon.0) 9832 : cluster [DBG] osd.14 reported failed by osd.89
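
(Those lines come from the cluster log on the monitor node; they can be followed live with something like the commands below, assuming the default log location and standard Proxmox systemd unit names.)

Code:
# cluster log on a monitor node (mon.node13 in my case)
tail -f /var/log/ceph/ceph.log | grep osd.14

# journal of the affected OSD on the node that hosts it
journalctl -fu ceph-osd@14.service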

I have checked via telnet that the hosts running those OSDs can reach osd.14 on its ports.
Code:
"osd":{
         "hb_front_addr":"[v2:10.10.111.201:6802/464238,v1:10.10.111.201:6803/464238]",
         "front_addr":"[v2:10.10.111.201:6800/464238,v1:10.10.111.201:6801/464238]",
         "hb_back_addr":"[v2:172.16.254.201:6802/464238,v1:172.16.254.201:6803/464238]",
         "back_addr":"[v2:172.16.254.201:6800/464238,v1:172.16.254.201:6801/464238]",
      }
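
(The address excerpt above is what the cluster has registered for the OSD; it can be pulled per OSD with something like the following, using osd.14 as the example and trimming the output to the address fields.)

Code:
ceph osd metadata 14 | grep -E '(front|back)_addr'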

When I roll back the changes and restart the OSD with the old cluster network, everything is fine. I am running Ceph 16.2.15.
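
The rollback itself is nothing special; roughly this, on the node hosting osd.14, after switching cluster_network back in /etc/pve/ceph.conf:

Code:
systemctl restart ceph-osd@14.service
ceph osd stat   # the OSD shows as up again and stays up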

Any ideas what could be wrong?
 
Hi,


Now I have successfully changed the public network, so all clients are using the new IP range, as are the MON, MGR and MDS daemons.
Did you restart all the OSDs and MONs after the IP change? Could you please also share the contents of /etc/pve/ceph.conf and the output of the `pveceph status` command?
 
Did you restart all the OSDs and MONs after the IP change? Could you please also share the contents of /etc/pve/ceph.conf and the output of the `pveceph status` command?
Thank you for your reply. Yes, I did restart everything.

There is some progress now. As I mentioned in the first post, I was able to change the public network to 10.10.111.0/24, but I was unable to change the cluster network to 10.10.112.0/24.

BUT when I tried to change the cluster network from 172.16.254.0/24 to 10.10.111.0/24 (the same subnet as the new public network), it worked. I restarted every service and it is running smoothly.
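
(For reference, the change boiled down to editing cluster_network in /etc/pve/ceph.conf and then doing a rolling restart of the OSDs, node by node, roughly like this:)

Code:
# on each node, one at a time, after editing /etc/pve/ceph.conf
systemctl restart ceph-osd.target
ceph -s   # wait for HEALTH_OK before moving on to the next node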

I am still unable to change the cluster network from 10.10.111.0/24 to 10.10.112.0/24 in order to have separate networks. I encounter the very same error:
Code:
cluster [DBG] osd.14 reported failed by osd.19

Ceph status:

Code:
# pveceph status
  cluster:
    id:     ecc963a4-009f-4236-87fe-e672a7cb5d49
    health: HEALTH_OK


  services:
    mon: 5 daemons, quorum node13,node99,node98,node1,node97 (age 3h)
    mgr: node16(active, since 46h), standbys: node17
    mds: 1/1 daemons up, 1 standby
    osd: 84 osds: 84 up (since 3h), 84 in (since 4d)


  data:
    volumes: 1/1 healthy
    pools:   7 pools, 1985 pgs
    objects: 43.90M objects, 49 TiB
    usage:   130 TiB used, 70 TiB / 200 TiB avail
    pgs:     1984 active+clean
             1    active+clean+scrubbing+deep


  io:
    client:   34 MiB/s rd, 76 MiB/s wr, 1.34k op/s rd, 6.18k op/s wr
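
(To double-check the new addresses, each OSD's registered public and cluster address vectors can also be listed from the OSD map, e.g.:)

Code:
ceph osd dump | grep '^osd'   # each osd.N line includes its public and cluster address vectors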
 

Attachments

  • ceph.txt
