Monitor address already in use (500)

ivusi

I have a 3-node Ceph cluster. I was updating one of the nodes via 'apt upgrade' when it crashed. It rebooted fine and everything came back up, except that the monitor service on that node had stopped.
I tried restarting it, roughly as shown below, to no avail.
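This is roughly what I tried (a sketch from memory; I'm assuming the standard ceph-mon systemd unit name here):

Code:
wallis@prox03:~$ sudo systemctl restart ceph-mon@prox03
wallis@prox03:~$ sudo systemctl status ceph-mon@prox03
wallis@prox03:~$ sudo journalctl -u ceph-mon@prox03 -n 50 --no-pager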
I then checked that the other two monitors were still in quorum, which was fine:


Code:
wallis@prox01:~$ sudo ceph -s
  cluster:
    id:     68dd8e32-f7b1-479b-a406-dd8db83fd50d
    health: HEALTH_WARN
            1/3 mons down, quorum prox01,prox02
 
  services:
    mon: 3 daemons, quorum prox01,prox02 (age 4h), out of quorum: prox03
    mgr: prox02(active, since 5M), standbys: prox01, prox03
    mds: 1/1 daemons up, 2 standby
    osd: 18 osds: 18 up (since 4h), 18 in (since 11w)
 
  data:
    volumes: 1/1 healthy
    pools:   5 pools, 705 pgs
    objects: 941.62k objects, 3.5 TiB
    usage:   11 TiB used, 5.7 TiB / 16 TiB avail
    pgs:     705 active+clean
 
  io:
    client:   231 KiB/s rd, 1.2 MiB/s wr, 5 op/s rd, 113 op/s wr
 
wallis@prox01:~$


I then deleted the monitor on prox03 via the GUI and tried to recreate it, at which point I got the message "monitor address 10.0.31.20 already in use (500)".
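(For reference, I did this in the GUI, but I believe the CLI equivalent would be pveceph mon destroy followed by pveceph mon create on prox03; the create step is where the error shows up.)

Code:
wallis@prox03:~$ sudo pveceph mon destroy prox03
wallis@prox03:~$ sudo pveceph mon create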

I checked with ceph mon stat:

Code:
sudo ceph mon stat
e5: 2 mons at {prox01=[v2:10.0.31.18:3300/0,v1:10.0.31.18:6789/0],prox02=[v2:10.0.31.19:3300/0,v1:10.0.31.19:6789/0]} removed_ranks: {2}, election epoch 2204, leader 0 prox01, quorum 0,1 prox01,prox02

and saw that prox03 was no longer referenced there, but I did see its address (10.0.31.20) in the [global] section of ceph.conf:

Code:
wallis@prox03:~$ sudo cat /etc/pve/ceph.conf
[global]
    auth_client_required = cephx
    auth_cluster_required = cephx
    auth_service_required = cephx
    cluster_network = 10.0.31.16/28
    fsid = 68dd8e32-f7b1-479b-a406-dd8db83fd50d
    mon_allow_pool_delete = true
    mon_host = 10.0.31.18 10.0.31.19 10.0.31.20
    ms_bind_ipv4 = true
    ms_bind_ipv6 = false
    osd_pool_default_min_size = 2
    osd_pool_default_size = 3
    public_network = 10.0.31.16/28

[client]
    keyring = /etc/pve/priv/$cluster.$name.keyring

[client.crash]
    keyring = /etc/pve/ceph/$cluster.$name.keyring

[mds]
    keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.prox01]
    host = prox01
    mds_standby_for_name = pve

[mds.prox02]
    host = prox02
    mds_standby_for_name = pve

[mds.prox03]
    host = prox03
    mds_standby_for_name = pve

[mon.prox01]
    public_addr = 10.0.31.18

[mon.prox02]
    public_addr = 10.0.31.19

wallis@prox03:~$
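
Before changing anything I was also going to double-check that the monmap itself no longer lists prox03 (ceph mon dump and ceph quorum_status are standard Ceph commands for this, as far as I know; I haven't pasted their output here):

Code:
wallis@prox01:~$ sudo ceph mon dump
wallis@prox01:~$ sudo ceph quorum_status --format json-pretty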


Can I manually remove 10.0.31.20 from the mon_host line in ceph.conf on the cluster nodes?
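In case it makes the question clearer, this is the manual cleanup I have in mind (only a sketch of my guess, not something I have run yet; the paths and mon id are taken from the config above):

Code:
# on prox03: stop and disable the leftover monitor unit, then move its data dir aside
wallis@prox03:~$ sudo systemctl stop ceph-mon@prox03
wallis@prox03:~$ sudo systemctl disable ceph-mon@prox03
wallis@prox03:~$ sudo mv /var/lib/ceph/mon/ceph-prox03 /root/ceph-mon-prox03.bak

# edit mon_host in /etc/pve/ceph.conf so it only lists the two working monitors
# (it lives on the shared /etc/pve filesystem, so one edit should apply cluster-wide?)
#     mon_host = 10.0.31.18 10.0.31.19

# then recreate the monitor on prox03
wallis@prox03:~$ sudo pveceph mon create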
Many thanks
 
