[Ceph] Can't delete ghost monitor [SOLVED]

fcarucci

May 13, 2023
Hello,

I have a three-node Ceph cluster with 3 OSDs. After some misadventures I ended up reinstalling one node from scratch and adding it back to the Proxmox cluster. Ceph is rebalancing as expected.
However, the old monitor from the reinstalled node shows up as "unknown" and I can't delete it. I've read all the forum posts on this, but I can't get it to go away, and I cannot create a new monitor on that node.

This is the error I get when I try to delete the existing unknown monitor:
hostname lookup 'undefined' failed - failed to get address info for: undefined: Name or service not known (500)

This is what I get if I try to stop the monitor:
entry has no host

This is what I get if I try to create a new one from the GUI or the command line:
command 'monmaptool --clobber --addv pve '[v2:10.0.20.1:3300,v1:10.0.20.1:6789]' --print /tmp/monmap' failed: exit code 1
root@pve:~# pveceph destroymon pve
monitor filesystem '/var/lib/ceph/mon/ceph-pve' does not exist on this node

root@pve:~# pveceph createmon
monmaptool: monmap file /tmp/monmap
monmaptool: map already contains mon.pve


Mon dump
Code:
root@pve-ceph1:~# ceph mon dump
epoch 3
fsid 8ae0a6fc-9140-4301-95ba-08ec6c78b220
last_changed 2024-03-16T09:05:05.556663-0700
created 2024-03-15T22:06:10.274733-0700
min_mon_release 18 (reef)
election_strategy: 1
0: [v2:10.0.20.4:3300/0,v1:10.0.20.4:6789/0] mon.pve-ceph1
1: [v2:10.0.20.5:3300/0,v1:10.0.20.5:6789/0] mon.pve-ceph2
2: [v2:10.0.20.1:3300/0,v1:10.0.20.1:6789/0] mon.pve
dumped monmap epoch 3

This is my ceph.conf
Code:
[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster_network = 10.30.0.4/24
         fsid = 8ae0a6fc-9140-4301-95ba-08ec6c78b220
         mon_allow_pool_delete = true
         mon_host = 10.0.20.4 10.0.20.5
         ms_bind_ipv4 = true
         ms_bind_ipv6 = false
         osd_pool_default_min_size = 2
         osd_pool_default_size = 3
         public_network = 10.0.20.4/24
         mon_cluster_log_to_file = false


[osd]
        osd_scrub_begin_hour = 0
        osd_scrub_end_hour = 7


[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring


[mds]
         keyring = /var/lib/ceph/mds/ceph-$id/keyring


[mds.pve-ceph1]
         host = pve-ceph1
         mds_standby_for_name = pve


[mds.pve-ceph2]
         host = pve-ceph2
         mds_standby_for_name = pve


[mon.pve-ceph1]
         public_addr = 10.0.20.4


[mon.pve-ceph2]
         public_addr = 10.0.20.5
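
The two listings above disagree: the monmap still carries mon.pve at 10.0.20.1, while mon_host in ceph.conf lists only 10.0.20.4 and 10.0.20.5. A minimal sketch of that comparison follows, using the values from this post as sample input. The helper name stale_mons is hypothetical, and the parsing assumes mon_host is a plain space-separated list of IPv4 addresses, as it is here. On a live node the inputs would come from `ceph mon dump` and the mon_host line of ceph.conf.

```shell
#!/bin/sh
# Sample inputs copied from the post above.
mon_dump='0: [v2:10.0.20.4:3300/0,v1:10.0.20.4:6789/0] mon.pve-ceph1
1: [v2:10.0.20.5:3300/0,v1:10.0.20.5:6789/0] mon.pve-ceph2
2: [v2:10.0.20.1:3300/0,v1:10.0.20.1:6789/0] mon.pve'
mon_host='10.0.20.4 10.0.20.5'

# stale_mons DUMP HOSTS (hypothetical helper): print "name ip" for each
# monmap entry whose v2 address does not appear in HOSTS.
stale_mons() {
    printf '%s\n' "$1" | while read -r _rank addrs name; do
        # Keep only monmap entry lines such as "2: [v2:...] mon.pve".
        case "$name" in mon.*) ;; *) continue ;; esac
        ip=${addrs#\[v2:}   # drop the leading "[v2:"
        ip=${ip%%:*}        # drop everything from the port onwards
        case " $2 " in
            *" $ip "*) ;;                          # listed in mon_host
            *) printf '%s %s\n' "$name" "$ip" ;;   # only in the monmap
        esac
    done
}

stale_mons "$mon_dump" "$mon_host"
```

With the sample values, the only entry reported is mon.pve, which matches the "unknown" monitor in the GUI.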

How do I get rid of this monitor? Thanks!
 
Hi,

fcarucci said:
This is what I get if I try to create a new one from the GUI or the command line:
command 'monmaptool --clobber --addv pve '[v2:10.0.20.1:3300,v1:10.0.20.1:6789]' --print /tmp/monmap' failed: exit code 1
root@pve:~# pveceph destroymon pve
monitor filesystem '/var/lib/ceph/mon/ceph-pve' does not exist on this node

Could you try the following:

Code:
mkdir -p /var/lib/ceph/mon/ceph-<Ceph-MonID>
pveceph mon destroy <Ceph-MonID>

Replace `<Ceph-MonID>` in the commands above with the name of the Ceph monitor (`pve` in your case). See [0] for more information.

[0] https://pve.proxmox.com/pve-docs/chapter-pveceph.html#pve_ceph_monitors
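
For the monitor in this thread, the steps above can be sketched as follows. This is only an illustration under a few assumptions: the directory name uses the ceph- prefix shown in the error message from the first post ('/var/lib/ceph/mon/ceph-pve'), recreating that directory is presumably what lets the destroy command get past its check, and the DRY_RUN switch and the run and remove_ghost_mon helpers are hypothetical names added here so the commands are printed rather than executed by default.

```shell
#!/bin/sh
# DRY_RUN=1 (default, hypothetical safety switch) only prints the commands.
# Set DRY_RUN=0 on the affected node to actually execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# remove_ghost_mon ID (hypothetical helper): recreate the missing monitor
# directory, destroy the monitor, then show the monmap for verification.
remove_ghost_mon() {
    # Path layout assumed from the error message in the first post.
    mon_dir="/var/lib/ceph/mon/ceph-$1"
    run mkdir -p "$mon_dir"
    run pveceph mon destroy "$1"
    run ceph mon dump    # check by hand that mon.$1 is no longer listed
}

remove_ghost_mon pve
```

If the final monmap no longer lists mon.pve, creating a fresh monitor on the reinstalled node should no longer hit the "map already contains mon.pve" error.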
 