Can't add monitor

Kaboom

Active Member
Mar 5, 2019
Dear All,

When I try to start a new monitor that I just added on node005, I get this error:

Mar 23 14:53:13 node005 systemd[1]: Started Ceph cluster monitor daemon.
Mar 23 14:53:14 node005 ceph-mon[3561]: 2020-03-23 14:53:14.762 7f58ca8e8700 -1 mon.node005@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (13) Permission denied

Does this ring a bell?

Thanks in advance!
 
How did you add the MON?
 
Can you please post a ceph -s and the ceph.conf?
 
ceph -s
cluster:
id: 09935360-cfe7-48d4-ac76-c02e0fdd95de
health: HEALTH_WARN
1 daemons have recently crashed

services:
mon: 2 daemons, quorum node003,node002 (age 2d)
mgr: node002(active, since 2d), standbys: node003, node004
osd: 36 osds: 36 up (since 10h), 36 in (since 4M)

data:
pools: 1 pools, 1024 pgs
objects: 842.82k objects, 3.0 TiB
usage: 8.2 TiB used, 7.5 TiB / 16 TiB avail
pgs: 1024 active+clean

io:
client: 91 MiB/s rd, 14 MiB/s wr, 2.13k op/s rd, 583 op/s wr
 
cat ceph.conf
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 10.0.1.0/24
fsid = 09657777360-cfe7-48764-ac76-c02e4566
mon_allow_pool_delete = true
mon_host = 10.0.1.2 10.0.1.3 10.0.1.5
osd_journal_size = 5120
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 10.0.1.0/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[mon.node002]
host = node002

[mon.node003]
host = node003

[mon.node005]
host = node005
 
node005 exists in the ceph.conf, but wasn't registered by the other MONs. The easiest fix is to try a pveceph mon destroy node005 and afterwards a pveceph mon create. Then hopefully the new MON starts working. If not, the log file /var/log/ceph/ceph-mon.node005.log should give some clues.
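To spot this kind of mismatch quickly, a small script can compare the [mon.<name>] sections in ceph.conf with the quorum line from ceph -s. This is only a rough sketch: the function names are made up for illustration, and the sample strings are trimmed copies of the output posted above, not a general parser for either format.

```python
import re

def mons_in_conf(conf_text):
    """Return the monitor names declared as [mon.<name>] sections."""
    return set(re.findall(r"\[mon\.([^\]\s]+)\]", conf_text))

def mons_in_quorum(status_text):
    """Return the names listed after 'quorum' on the 'mon:' line of `ceph -s`."""
    match = re.search(r"mon:.*?quorum\s+([\w.,-]+)", status_text)
    return set(match.group(1).split(",")) if match else set()

# Sample input, trimmed from the ceph.conf and `ceph -s` output in this thread.
conf = """
[mon.node002]
host = node002
[mon.node003]
host = node003
[mon.node005]
host = node005
"""
status = "mon: 2 daemons, quorum node003,node002 (age 2d)"

# Monitors that are configured but not part of the quorum.
missing = sorted(mons_in_conf(conf) - mons_in_quorum(status))
print(missing)
```

With the values from this thread, node005 is the only monitor that is configured but missing from the quorum.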
 
I found out this node had a different keyring. I don't understand why, but I copied the keyring from another node, double-checked all the files (some had the wrong owner and group), and now it starts.
 
I found out this node had a different keyring, I don't understand why but I copied this from another node...
I hope you didn't copy the keyring as well. Each new Ceph service creates its own keyring.
 
Talking about this keyring: /var/lib/ceph/mon/ceph-node005/keyring

When I use a unique keyring the monitor doesn't start. When I use all the same keyrings the monitor works. But I get this error:
mon.node005@0(electing) e16 failed to get devid for : fallback method has serial ''but no model
 
When I use a unique keyring the monitor doesn't start. When I use all the same keyrings the monitor works. But I get this error:
mon.node005@0(electing) e16 failed to get devid for : fallback method has serial ''but no model
That message is from a running MON and doesn't prevent it from joining the other MONs.

Talking about this keyring: /var/lib/ceph/mon/ceph-node005/keyring
Yes, even though they are the same for the MONs, they are different for the other services like MGR, OSD, MDS or clients. But anyway you will not need to copy those, since they are created by the MON on bootstrapping.
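If you want to confirm that two MON keyring files really carry the same key without eyeballing them, something along these lines could help. It is a minimal sketch that assumes the usual INI-like keyring layout (a [mon.] section with a "key = ..." line); the helper names are hypothetical and the key strings below are placeholders, not real keys.

```python
import re

def keyring_keys(text):
    """Map each [entity] section of a keyring file to its 'key = ...' value."""
    keys = {}
    entity = None
    for raw in text.splitlines():
        line = raw.strip()
        section = re.fullmatch(r"\[(.+)\]", line)
        if section:
            entity = section.group(1)
            continue
        key_line = re.fullmatch(r"key\s*=\s*(\S+)", line)
        if key_line and entity is not None:
            keys[entity] = key_line.group(1)
    return keys

def same_mon_key(a_text, b_text):
    """True only if both keyrings contain a [mon.] key and the keys are equal."""
    a = keyring_keys(a_text).get("mon.")
    b = keyring_keys(b_text).get("mon.")
    return a is not None and a == b

# Placeholder contents standing in for /var/lib/ceph/mon/ceph-<node>/keyring.
node_a = '[mon.]\n\tkey = PLACEHOLDERKEYAAA==\n\tcaps mon = "allow *"\n'
node_b = '[mon.]\n\tkey = PLACEHOLDERKEYBBB==\n\tcaps mon = "allow *"\n'

print(same_mon_key(node_a, node_a))
print(same_mon_key(node_a, node_b))
```

In practice you would read the file contents on each node (as root) and pass them in; the comparison itself never needs to print the key.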
 
I recreated all 3 monitors through the GUI, but they all have the same keyring, is that correct?

And I still got this error on all 3 monitors, is this important?
2020-03-27 22:00:05.118 7ff15cea9700 -1 mon.node002@1(electing) e29 failed to get devid for : fallback method has serial ''but no model

=====

This cluster looks healthy:

ceph -s
cluster:
id: 09935360-cfe7-48d4-ac76-c02e0fdd95de
health: HEALTH_OK

services:
mon: 3 daemons, quorum node003,node002,node004 (age 7m)
mgr: node003(active, since 12m), standbys: node002, node004
osd: 36 osds: 36 up (since 36h), 36 in (since 4M)

data:
pools: 1 pools, 1024 pgs
objects: 848.70k objects, 3.0 TiB
usage: 8.3 TiB used, 7.5 TiB / 16 TiB avail
pgs: 1024 active+clean

io:
client: 519 KiB/s rd, 11 MiB/s wr, 37 op/s rd, 371 op/s wr
 
I recreated all 3 monitors through the GUI, but they all have the same keyring, is that correct?
Yes they do.

And I still got this error on all 3 monitors, is this important?
2020-03-27 22:00:05.118 7ff15cea9700 -1 mon.node002@1(electing) e29 failed to get devid for : fallback method has serial ''but no model
This is more of an informational message.
 
And every night at 00:00 I get this error message, is this something to worry about?

Mar 29 00:00:00 node002 ceph-mon[3003622]: 2020-03-29 00:00:00.129 7ff1636b6700 -1 Fail to open '/proc/2987489/cmdline' error = (2) No such file or directory
Mar 29 00:00:00 node002 ceph-mon[3003622]: 2020-03-29 00:00:00.133 7ff1636b6700 -1 received signal: Hangup from <unknown> (PID: 2987489) UID: 0
Mar 29 00:00:00 node002 ceph-mon[3003622]: 2020-03-29 00:00:00.133 7ff1636b6700 -1 Fail to open '/proc/2987489/cmdline' error = (2) No such file or directory
Mar 29 00:00:00 node002 ceph-mon[3003622]: 2020-03-29 00:00:00.133 7ff1636b6700 -1 received signal: Hangup from <unknown> (PID: 2987489) UID: 0
Mar 30 00:00:00 node002 ceph-mon[3003622]: 2020-03-30 00:00:00.074 7ff1636b6700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror (PID: 2197612) UID: 0
Mar 30 00:00:00 node002 ceph-mon[3003622]: 2020-03-30 00:00:00.110 7ff1636b6700 -1 received signal: Hangup from (PID: 2197614) UID: 0
 
And every night at 00:00 I get this error message, is this something to worry about?
What happens every day at midnight? :) Log rotation.
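If you want to filter these out when scanning the journal, a sketch like the following separates the midnight Hangup lines sent by killall/pkill (as seen in the logs above) from everything else. The regular expression is written against the log lines posted in this thread only, and the function names are made up for illustration.

```python
import re

# Matches the "received signal: Hangup from <sender> (PID: n) UID: n" lines above.
HUP = re.compile(r"received signal: Hangup from (.*?)\s*\(PID: (\d+)\) UID: (\d+)")

def hangup_sender(line):
    """Return the sender command of a Hangup line, or None for other lines."""
    match = HUP.search(line)
    return match.group(1).strip() if match else None

def is_logrotate_hup(line):
    """True if the Hangup came from the killall/pkill commands seen in the logs."""
    sender = hangup_sender(line)
    return sender is not None and sender.startswith(("killall -q -1", "pkill -1"))

lines = [
    "Mar 30 00:00:00 node002 ceph-mon[3003622]: 2020-03-30 00:00:00.074 7ff1636b6700 -1 "
    "received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd "
    "ceph-fuse radosgw rbd-mirror (PID: 2197612) UID: 0",
    "2020-03-27 22:00:05.118 7ff15cea9700 -1 mon.node002@1(electing) e29 "
    "failed to get devid for : fallback method has serial ''but no model",
]

for line in lines:
    print(is_logrotate_hup(line))
```

Lines where the sender is empty or <unknown> are deliberately not classified as log rotation here, since the log alone does not say who sent the signal.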
 
I have this problem too, on Ceph 14.2.9. Every night at 00:00 I get this error message; is this something to worry about?
Jun 17 16:40:38 node1 ceph-mon[2147]: 2020-06-17 16:40:38.259 7f3369bf5700 -1 --2- [v2:16.16.16.14:3300/0,v1:16.16.16.14:6789/0] >> conn(0x55eb8eb5ad00 0x55eb6fdab600 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rx=0 tx=0)._handle_peer_banner peer is using msgr V1 protocol
Jun 18 00:00:00 node1 ceph-mon[2147]: 2020-06-18 00:00:00.500 7f3371404700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror (PID: 616042) UID: 0
Jun 18 00:00:00 node1 ceph-mon[2147]: 2020-06-18 00:00:00.516 7f3371404700 -1 received signal: Hangup from pkill -1 -x ceph-mon|ceph-mgr|ceph-mds|ceph-osd|ceph-fuse|radosgw|rbd-mirror (PID: 616043) UID: 0
Jun 18 08:12:54 node1 ceph-mon[2147]: 2020-06-18 08:12:54.357 7f336abf7700 -1 mon.node1@0(electing) e11 failed to get devid for : fallback method has serial ''but no model
Jun 18 08:13:00 node1 ceph-mon[2147]: 2020-06-18 08:13:00.665 7f336abf7700 -1 mon.node1@0(electing) e11 failed to get devid for : fallback method has serial ''but no model
Jun 18 08:13:12 node1 ceph-mon[2147]: 2020-06-18 08:13:12.493 7f336abf7700 -1 mon.node1@0(electing) e11 failed to get devid for : fallback method has serial ''but no model
Jun 18 08:13:30 node1 ceph-mon[2147]: 2020-06-18 08:13:30.593 7f336abf7700 -1 mon.node1@0(electing) e11 failed to get devid for : fallback method has serial ''but no model
Jun 19 00:00:00 node1 ceph-mon[2147]: 2020-06-19 00:00:00.495 7f3371404700 -1 received signal: Hangup from killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw rbd-mirror (PID: 1403869) UID: 0
Jun 19 00:00:00 node1 ceph-mon[2147]: 2020-06-19 00:00:00.511 7f3371404700 -1 received signal: Hangup from pkill -1 -x ceph-mon|ceph-mgr|ceph-mds|ceph-osd|ceph-fuse|radosgw|rbd-mirror (PID: 1403870) UID: 0
Jun 19 08:05:56 node1 ceph-mon[2147]: 2020-06-19 08:05:56.660 7f336abf7700 -1 mon.node1@0(electing) e11 failed to get devid for : fallback method has serial ''but no model
 
