remove dead CEPH monitor after removing cluster node?

athompso

I removed a PVE cluster node that was also a CEPH monitor (no OSD, just MON).
Of course, I forgot to remove the CEPH monitor before removing the node from the cluster.

When I attempt to remove the monitor from the PVE GUI, of course it fails because it's trying to cleanly remove it.

If I delete it from ceph.conf, it continues to show up in the GUI but now with hostname "unknown".

It was never referenced in storage.cfg (oops...) so I don't have to worry about that.

The only problem appears to be the permanent CEPH warning state, which makes monitoring (no pun intended) difficult, but I would still like to fix that.

(I'm not the only one with this problem; http://forum.proxmox.com/threads/18941-How-to-remove-a-dead-CEPH-nodes reports a similar situation, but with no answers. Maybe my post will get some answers...)
 
I did read that. Several times, today.

Ah... on about the 4th or 5th pass through that document, I decided to test "ceph mon remove 3", and that does the trick.

Thereafter, I also needed to edit /etc/pve/ceph.conf (which /etc/ceph/ceph.conf is a symlink to) to remove the reference to the missing monitor, and /etc/pve/storage.cfg to fill in the monitors that were missing there.
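For anyone following along, the sequence described above can be sketched roughly as below. The monitor ID 3 comes from the post; the "[mon.3]" section layout, the helper name strip_mon_section, and the backup path are assumptions for illustration only, so look at your own ceph.conf before editing it. The ceph commands are left commented out so that nothing runs against a live cluster by accident.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph or PVE): print a ceph.conf-style
# file with the [mon.<id>] section left out.
# Assumes the monitor has its own "[mon.<id>]" section header.
strip_mon_section() {
    # $1 = monitor id, $2 = path to the config file
    awk -v sect="[mon.$1]" '
        /^\[/ { skip = ($0 == sect) }   # at each section header, decide whether to skip
        !skip { print }                 # print every line outside the skipped section
    ' "$2"
}

# 1. Remove the dead monitor from the monitor map (on a node that has quorum):
#      ceph mon remove 3
# 2. Check that it is gone:
#      ceph mon stat
# 3. Drop its section from the config, working from a backup copy
#    (the backup path is an assumption):
#      cp /etc/pve/ceph.conf /root/ceph.conf.bak
#      strip_mon_section 3 /root/ceph.conf.bak > /etc/pve/ceph.conf
```

If the monitor's address is also listed elsewhere in the file (for example in a mon_host line under [global], which some setups use), that entry would need to be removed by hand as well; the helper only handles the per-monitor section.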
 
I ran into the same situation: after deleting a Ceph monitor, it is gone from the Ceph config, but it still shows up in the list of monitors as "unknown" and is also still in the list of monitors when creating RBD storage. I did not find any config file for it in /etc/pve, so I suppose it is stored in another location. Does anyone have a trick for this?
 
I have run ceph mon remove NODEID and then removed it from /etc/ceph/ceph.conf, but as you mentioned it is still in the GUI as "unknown".
It needs some time until the state on the GUI is updated.
 
If the node has failed and is no longer online, how can we do this?
I have run ceph mon remove NODEID and then removed it from /etc/ceph/ceph.conf, but as you mentioned it is still in the GUI as "unknown".
Does this work in PVE 5.3? Can you explain the steps?
 
Anyone got a solution for this? The ceph monitor was removed, but it is still visible in GUI with a question mark.
Edit: Solved. As the monitor node was dead anyway, I just removed the node from the cluster with 'pvecm delnode nodename' and removed the node's directory from /etc/pve/nodes while connected to a working node.
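The two steps above could be sketched as a small dry-run helper that only prints the commands for review instead of running them. The function name print_node_cleanup is made up for this illustration, and the directory path /etc/pve/nodes/<nodename> is an assumption based on the description in the post; run the printed commands from a working cluster node only after checking them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the cleanup commands for a dead node
# rather than executing them.
print_node_cleanup() {
    node="$1"
    # Refuse empty names, ".", "..", or names containing a slash, so the
    # rm path below can never point at /etc/pve/nodes itself or outside it.
    case "$node" in
        ""|*/*|.|..)
            echo "refusing suspicious node name: '$node'" >&2
            return 1
            ;;
    esac
    echo "pvecm delnode $node"
    echo "rm -r /etc/pve/nodes/$node"
}

# "nodename" is the placeholder used in the post; substitute the dead node's name.
print_node_cleanup nodename
```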
 