can't destroy or create ceph monitor

sidereus

I removed the monitor via the web GUI and later created it again, but it doesn't start. Now I can't remove it or create it again.
Bash:
root@asr3:~# pveceph mon destroy asr3
no such monitor id 'asr3'
root@asr3:~# pveceph mon create --monid asr3
monitor address '192.168.121.3' already in use
root@asr3:~# pveceph status
  cluster:
    id:     8fc87072-5946-466f-a10a-6fa9bd6fa925
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum asr1,asr2,asr4 (age 22m)
    mgr: asr2(active, since 3d), standbys: asr1, asr4, asr3
    mds: cephfs:1 {0=asr4=up:active} 3 up:standby
    osd: 27 osds: 27 up (since 18m), 27 in (since 4h)

  data:
    pools:   4 pools, 193 pgs
    objects: 359.84k objects, 1.4 TiB
    usage:   4.0 TiB used, 40 TiB / 44 TiB avail
    pgs:     193 active+clean

  io:
    client:   0 B/s rd, 100 KiB/s wr, 0 op/s rd, 7 op/s wr
root@asr3:/etc/pve# cat ceph.conf
[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster_network = 192.168.121.0/24
         fsid = some-id
         mon_allow_pool_delete = true
         mon_host = 192.168.121.1 192.168.121.2 192.168.121.4 192.168.121.3
         osd_pool_default_min_size = 2
         osd_pool_default_size = 3
         public_network = 192.168.121.0/24

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
         keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.asr1]
         host = asr1
         mds_standby_for_name = pve

[mds.asr2]
         host = asr2
         mds_standby_for_name = pve

[mds.asr3]
         host = asr3
         mds_standby_for_name = pve

[mds.asr4]
         host = asr4
         mds_standby_for_name = pve

[mon.asr1]
         public_addr = 192.168.121.1

[mon.asr2]
         public_addr = 192.168.121.2

[mon.asr3]
         public_addr = 192.168.121.3

[mon.asr4]
         public_addr = 192.168.121.4
What to do?
I also found the same problem reported here, but with no solution.

EDIT: I created the monitor manually, but it doesn't get added to the monmap.
Code:
root@asr3:~# ceph mon getmap -o tmp/map
got monmap epoch 5
root@asr3:~# ceph auth get mon. -o tmp/key
exported keyring for mon.
root@asr3:~# rm -rf /var/lib/ceph/mon/ceph-asr3/
root@asr3:~# systemctl stop ceph-mon.target
root@asr3:~# systemctl stop ceph-mon@asr3.service
root@asr3:~# ceph-mon -i asr3 --mkfs --monmap tmp/map --keyring tmp/key
root@asr3:~# ceph-mon -i asr3 --public-network 192.168.121.0/24
root@asr3:~# ss -tulpn | grep '3300\|6789'
tcp     LISTEN   0        512        192.168.121.3:3300           0.0.0.0:*      users:(("ceph-mon",pid=2491129,fd=27))                                         
tcp     LISTEN   0        512        192.168.121.3:6789           0.0.0.0:*      users:(("ceph-mon",pid=2491129,fd=28))
root@asr3:~# ceph mon dump
dumped monmap epoch 5
epoch 5
fsid 8fc87072-5946-466f-a10a-6fa9bd6fa925
last_changed 2021-04-21T00:46:55.557939+0300
created 2021-04-06T16:06:42.329089+0300
min_mon_release 15 (octopus)
0: [v2:192.168.121.1:3300/0,v1:192.168.121.1:6789/0] mon.asr1
1: [v2:192.168.121.2:3300/0,v1:192.168.121.2:6789/0] mon.asr2
2: [v2:192.168.121.4:3300/0,v1:192.168.121.4:6789/0] mon.asr4
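The daemon runs and listens on 3300/6789, but it never shows up in the monmap. For reference, the mon's own log should say why it is not joining; roughly like this (assuming it is run under systemd rather than started by hand, and the default log path):
Code:
# start the mon through systemd so it logs to the journal
systemctl start ceph-mon@asr3.service
# follow its log to see why it does not join the quorum
journalctl -fu ceph-mon@asr3.service
# or check the default on-disk log
tail -f /var/log/ceph/ceph-mon.asr3.log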
 
I suggest cleaning up everything related to mon asr3 so that the pveceph tooling can recreate it cleanly!

To do so, try running pveceph mon destroy asr3 on that node!

If that does not work because the current state is somewhat broken, clean up manually as follows.

In the /etc/pve/ceph.conf file, remove the node's IP address (192.168.121.3) from the mon_host line in the [global] section.
Also remove the [mon.asr3] section.
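After the edit, the mon_host line in [global] should list only the three remaining monitors, roughly like this (everything else stays unchanged):
Code:
[global]
         mon_host = 192.168.121.1 192.168.121.2 192.168.121.4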

On node asr3, stop the mon service and disable it.
Code:
systemctl stop ceph-mon@asr3.service
systemctl disable ceph-mon@asr3.service

Next, remove the mon data directory:
Code:
rm -r /var/lib/ceph/mon/ceph-asr3

With that you should be able to recreate the monitor with pveceph mon create on node asr3.
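A short sketch of that last step, run on asr3 (without --monid, pveceph should default to the local node name); verify afterwards with pveceph status:
Code:
root@asr3:~# pveceph mon create
root@asr3:~# pveceph status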

But one thing! How many nodes do you have in that cluster? If you recreate mon asr3, you will have 4 monitors in the cluster, which is usually not a good idea. Ceph mons work on the majority principle, and an even number of votes can lead to a split-brain situation. Try to keep an odd number of monitors in your cluster!
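To check how many mons you currently have and whether they are all in quorum, something like:
Code:
ceph mon stat
ceph quorum_status --format json-pretty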

The same goes for the PVE cluster itself. If you run an even number of nodes, you might want to consider adding an external vote to the cluster to get to an odd number of votes. https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_corosync_external_vote_support
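A minimal sketch of adding such an external vote with a QDevice, assuming a separate host that can run the qnetd daemon (192.168.121.10 is just a placeholder address); see the linked docs for details and caveats:
Code:
# on the external host (not part of the PVE cluster)
apt install corosync-qnetd
# on every PVE cluster node
apt install corosync-qdevice
# on one PVE cluster node, register the external vote
pvecm qdevice setup 192.168.121.10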
 
