Ceph: Unable to create a monitor. Says it already exists.

NorViking

New Member
Nov 28, 2023
I need some help with Ceph. :)

In my home lab I have a few Intel NUCs that I run a Proxmox cluster (version 8.2.4) on.

A while ago, the Ceph monitor on one of the nodes (mox04) crashed and I was unable to restart it. I did a "destroy monitor" from the GUI. This caused the node to hang and I had to power cycle it.

The node works fine now, except that I'm not able to create a monitor on the node.

Code:
# ceph mon remove mox04
mon.mox04 does not exist or has already been removed

# pveceph mon create
monitor address '10.0.0.104' already in use

The node name or IP address does not appear in the Ceph configuration files (/etc/pve/ceph.conf) or in the output of Ceph commands (ceph -s, ceph mon dump).
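Roughly the kind of checks I mean (a sketch, not the exact commands I ran):

Code:
# look for the node in the cluster-wide config and in the live monitor map
grep -E 'mox04|10\.0\.0\.104' /etc/pve/ceph.conf
ceph -s | grep -i mon
ceph mon dump | grep -E 'mox04|10\.0\.0\.104'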

I used strace to trace the execution of the 'pveceph mon create' command, uploaded the output to Google Gemini and asked it to analyze it.
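The trace was captured with something along these lines (a sketch; the exact flags may have differed):

Code:
# follow forked children and write the full trace to a file for later analysis
strace -f -s 256 -o /tmp/pveceph-mon-create.trace pveceph mon create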

The answer was:

The strace output shows that the information claiming the monitor exists comes from the Proxmox cluster file system (pmxcfs). The relevant lines in the strace are:

write(5, "M)\0\0\0{\"prefix\":\"mon metadata\",\"f"..., 46) = 46
read(5, ">3\0\0\0", 5) = 5
read(5, "{\"return_code\":0,\"status_message"..., 51) = 51

This sequence indicates that the pveceph tool is sending a request to pmxcfs to retrieve monitor metadata. The response from pmxcfs includes a status message indicating that the monitor address is already in use.

This suggests that the monitor information is stored in the pmxcfs database, even though it doesn't appear in the Ceph configuration files or the output of Ceph commands.
 
I had a look at that case.

It's not quite the same problem. In that case the node showed a ghost monitor in the GUI. In my case it's not showing up in any of the usual configs or in the GUI.

The ceph-mon@mox04 service was already disabled and there are no files in '/var/lib/ceph/mon/' on the node.
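Checked roughly like this (a sketch, not the exact commands):

Code:
# confirm the monitor unit is disabled/inactive and the monitor data directory is empty
systemctl status ceph-mon@mox04
ls -la /var/lib/ceph/mon/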

One person said that stopping/starting the OSDs on the node fixed the problem. That did not fix the problem on my cluster.
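If I understood that suggestion correctly, it amounts to restarting the OSD services on the node, roughly like this (a sketch):

Code:
# restart all OSD daemons on this node
systemctl restart ceph-osd.target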

I've tried shutting down all the nodes in the cluster at once and restarting them. No change after the nodes restarted.

Everything works and the ceph cluster is healthy, except that I can't create a monitor on one of the nodes.

The linked case talks about removing Ceph on all nodes and reinstalling. I guess I would then have to reconfigure Ceph and restore the data.
That seems like a lot of time-consuming work to fix a minor problem like this.

However, I did find a clue: the following shows that the IP of the problem node, 10.0.0.104, still appears in the OSDs' running configuration.

Code:
# for i in {2..6}; do ceph config show osd.$i | grep 10.0.0.104; done
mon_host 10.0.0.105 10.0.0.106 10.0.0.104 10.0.0.102 10.0.0.103 file
mon_host 10.0.0.105 10.0.0.106 10.0.0.104 10.0.0.102 10.0.0.103 file
mon_host 10.0.0.105 10.0.0.106 10.0.0.104 10.0.0.102 10.0.0.103 file
mon_host 10.0.0.105 10.0.0.106 10.0.0.104 10.0.0.102 10.0.0.103 file

Where is this information stored? Is it possible to remove the node here?
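As far as I understand it, the trailing "file" in the ceph config show output means the value comes from the daemon's config file, which on Proxmox is the symlink to /etc/pve/ceph.conf, so something like this should show where it lives (a guess on my part):

Code:
# the "file" source should correspond to the mon_host line in the clustered config
grep mon_host /etc/pve/ceph.conf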
 
I tried this solution from the link:
Code:
# login to a working monitor host, stop the service and extract the map
service ceph-mon@myworkinghost stop
ceph-mon -i myworkinghost  --extract-monmap /tmp/monmap
2021-05-11 16:44:04.119 7f6087bd6400 -1 wrote monmap to /tmp/monmap
# start the service again and transfer the map to the stale host
service ceph-mon@myworkinghost start
scp /tmp/monmap mystalehost:/tmp/monmap

# on the stale mon host, stop the monitor and inject the map
service ceph-mon@mystalehost stop
ceph-mon -i mystalehost --inject-monmap /tmp/monmap
# after that, the access rights must be set manually for the user "ceph" (there were "permission denied" errors because after the injection some files belonged to root)
chown ceph.ceph /var/lib/ceph/mon/ceph-mystalehost/store.db/*
# and start again
service ceph-mon@mystalehost start

The result when I run the part on the "stale" host was:
Code:
# ceph-mon -i mox04 --inject-monmap /tmp/monmap
2024-07-02T20:39:47.115+0200 798e7b40ca00 -1 monitor data directory at '/var/lib/ceph/mon/ceph-mox04' does not exist: have you run 'mkfs'?

It seems that the solution assumes you have a running, but stale, monitor. In my case there doesn't seem to be a monitor on the node at all. I'm not able to create one, so the files do not exist.
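For reference, initializing a monitor data directory by hand would look roughly like this (a sketch based on the Ceph manual-deployment docs; normally 'pveceph mon create' does all of this, and the paths/keyring locations are assumptions):

Code:
# fetch the mon keyring and the current monmap, then create the data directory
ceph auth get mon. -o /tmp/mon-keyring
ceph mon getmap -o /tmp/monmap
ceph-mon --mkfs -i mox04 --monmap /tmp/monmap --keyring /tmp/mon-keyring
chown -R ceph:ceph /var/lib/ceph/mon/ceph-mox04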

I also tried this:
Code:
systemctl stop ceph-mon@<impacted-mon>
systemctl disable ceph-mon@<impacted-mon>
systemctl daemon-reload
systemctl reset-failed
pveceph mon destroy <impacted-mon>

The result was about the same:

Code:
# systemctl stop ceph-mon@mon04
# systemctl disable ceph-mon@mon04
# systemctl daemon-reload
# systemctl reset-failed
# pveceph mon destroy mox04
monitor filesystem '/var/lib/ceph/mon/ceph-mox04' does not exist on this node

# pveceph mon create
monitor address '10.0.0.104' already in use
 
Can you check that /etc/ceph/ceph.conf is a link to /etc/pve/ceph.conf on all nodes (and does not contain the outdated monitor info)?

What does monmaptool --print /tmp/monmap say? If there is a wrong entry, you might want to remove it (monmaptool --rm mox04 /tmp/monmap) and re-inject the new monitor map on the host with the working monitor.
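Roughly, on the host with the working monitor, that would be (a sketch, adapt the host name; only needed if a stale entry actually shows up):

Code:
# stop the working monitor, remove the stale entry from its monmap, inject and restart
systemctl stop ceph-mon@myworkinghost
ceph-mon -i myworkinghost --extract-monmap /tmp/monmap
monmaptool --rm mox04 /tmp/monmap
ceph-mon -i myworkinghost --inject-monmap /tmp/monmap
systemctl start ceph-mon@myworkinghost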
 
Can you check that /etc/ceph/ceph.conf is a link to /etc/pve/ceph.conf on all nodes (and does not contain the outdated monitor info)?

What does monmaptool --print /tmp/monmap say? If there is a wrong entry, you might want to remove it (monmaptool --rm mox04 /tmp/monmap) and re-inject the new monitor map on the host with the working monitor.

Code:
# for i in {1..6}; do ssh mox0$i ls -l /etc/ceph/ceph.conf; done
lrwxrwxrwx 1 root root 18 Jul 21  2022 /etc/ceph/ceph.conf -> /etc/pve/ceph.conf
lrwxrwxrwx 1 root root 18 Jul 19  2022 /etc/ceph/ceph.conf -> /etc/pve/ceph.conf
lrwxrwxrwx 1 root root 18 Aug 18  2022 /etc/ceph/ceph.conf -> /etc/pve/ceph.conf
lrwxrwxrwx 1 root root 18 Apr 16  2023 /etc/ceph/ceph.conf -> /etc/pve/ceph.conf
lrwxrwxrwx 1 root root 18 Apr 16  2023 /etc/ceph/ceph.conf -> /etc/pve/ceph.conf
lrwxrwxrwx 1 root root 18 Apr 16  2023 /etc/ceph/ceph.conf -> /etc/pve/ceph.conf

Code:
# monmaptool --print /tmp/monmap
monmaptool: monmap file /tmp/monmap
epoch 52
fsid 89f450f5-c3aa-44cc-b485-e31008029e19
last_changed 2024-07-02T20:15:30.021587+0200
created 2022-07-19T18:18:18.982406+0200
min_mon_release 17 (quincy)
election_strategy: 1
0: [v2:10.0.0.105:3300/0,v1:10.0.0.105:6789/0] mon.mox05
1: [v2:10.0.0.106:3300/0,v1:10.0.0.106:6789/0] mon.mox06
2: [v2:10.0.0.102:3300/0,v1:10.0.0.102:6789/0] mon.mox02
3: [v2:10.0.0.103:3300/0,v1:10.0.0.103:6789/0] mon.mox03
4: [v2:10.0.0.101:3300/0,v1:10.0.0.101:6789/0] mon.mox01

Looks ok to me.
mox04 (10.0.0.104) is not listed as a monitor. That's the node I'm unable to create a monitor on.

Edit:
I found this line in /etc/pve/ceph.conf:

mon_host = 10.0.0.105 10.0.0.106 10.0.0.104 10.0.0.102 10.0.0.103 10.0.0.101

It still lists the host 10.0.0.104.
I edited the file and let things sync between nodes.
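The edit essentially drops the stale 10.0.0.104 entry, so the line should end up looking something like this:

Code:
mon_host = 10.0.0.105 10.0.0.106 10.0.0.102 10.0.0.103 10.0.0.101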
Now I was able to create a monitor on mox04.
Then I removed the monitors on nodes mox01 and mox02. Removing mox01 went fine.
But when I removed mox02, it showed state unknown in the GUI.

Code:
# ceph mon remove mox02
mon.mox02 does not exist or has already been removed

# pveceph mon create
monitor 'mox02' already exists

Could removing two monitors right after each other, before /etc/pve has synced, be causing this problem?
 
What do you get when you execute the following script on the problematic node?
Code:
root@pve8a1 ~ # cat query-mon-host.pm
Code:
#!/usr/bin/perl

use strict;
use warnings;

use JSON;

use PVE::Ceph::Services;
use PVE::CephConfig;
use PVE::Cluster qw(cfs_read_file);
use PVE::RADOS;
use PVE::RPCEnvironment;

PVE::RPCEnvironment->init('cli');

# read the cluster-wide ceph.conf from pmxcfs
my $cfg = cfs_read_file('ceph.conf');

# connect to the cluster and collect the monitor services Proxmox knows about
my $rados = PVE::RADOS->new();
my $monhash = PVE::Ceph::Services::get_services_info("mon", $cfg, $rados);

# print the mon_host entry from the config and the detected monitors side by side
print to_json($cfg->{global}->{mon_host}, { canonical => 1, pretty => 1 });
print to_json($monhash, { canonical => 1, pretty => 1 });
Code:
root@pve8a1 ~ # perl query-mon-host.pm
 
I edited the file and let things sync between nodes.
Now I was able to create a monitor on mox04.
Then I removed the monitors on nodes mox01 and mox02. Removing mox01 went fine.
But when I removed mox02, it showed state unknown in the GUI.
It's better to let such operations finish before starting the next one.
Code:
# ceph mon remove mox02
mon.mox02 does not exist or has already been removed

# pveceph mon create
monitor 'mox02' already exists

Could removing two monitors right after each other, before /etc/pve has synced, be causing this problem?
What does the mon_host entry in /etc/pve/ceph.conf look like now? Does it contain the IP of a removed monitor?
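One quick way to compare what each node currently sees, along the lines of the loop you used earlier, would be something like:

Code:
# compare the mon_host line as each node reads it through pmxcfs
for i in {1..6}; do ssh mox0$i grep mon_host /etc/pve/ceph.conf; done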
 
It just kept getting worse. When I created new monitors they would not start and became ghosts.
I feared that even if I fixed it for now, problems could show up when upgrading to ceph 18.2 later.

I gave up, migrated the volumes on Ceph to other storage, purged Ceph and set everything up from scratch again.
Upgraded to 18.2 and everything seems to work fine now.
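For anyone ending up in the same spot, the rebuild goes roughly like this (a sketch, not the exact commands I used; the repository and network values are assumptions):

Code:
# on each node, after migrating data off Ceph and destroying OSDs/monitors/managers:
pveceph purge
# then reinstall and re-initialize Ceph for the cluster
pveceph install --repository no-subscription
pveceph init --network 10.0.0.0/24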

Thank you for all your help on this issue! :)
 
