Error in syslog: probably no mds server is up

cmonty14

Hi,
after rebooting one node serving MDS, I get this error message in the node's syslog:
root@ld3955:~# tail /var/log/syslog
Sep 17 12:21:18 ld3955 kernel: [ 3141.167834] ceph: probably no mds server is up
Sep 17 12:21:18 ld3955 pvestatd[2482]: mount error: exit code 2
Sep 17 12:21:28 ld3955 kernel: [ 3151.319780] libceph: mon2 10.97.206.95:6789 session established
Sep 17 12:21:28 ld3955 kernel: [ 3151.327118] libceph: client38594183 fsid 6b1b5117-6e08-4843-93d6-2da3cf8a6bae
Sep 17 12:21:28 ld3955 kernel: [ 3151.327163] ceph: probably no mds server is up
Sep 17 12:21:28 ld3955 pvestatd[2482]: mount error: exit code 2
Sep 17 12:21:38 ld3955 kernel: [ 3161.537316] libceph: mon0 10.97.206.93:6789 session established
Sep 17 12:21:38 ld3955 pvestatd[2482]: mount error: exit code 2
Sep 17 12:21:38 ld3955 kernel: [ 3161.543618] libceph: client38684721 fsid 6b1b5117-6e08-4843-93d6-2da3cf8a6bae
Sep 17 12:21:38 ld3955 kernel: [ 3161.544383] ceph: probably no mds server is up
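
For context, my understanding is that the kernel client logs "probably no mds server is up" when the MDS map it gets from the monitors contains no active MDS for the filesystem, which would also explain the pvestatd mount errors. To see the current MDS state I'd run something like this on any monitor node (Nautilus command names):

root@ld3955:~# ceph -s           # overall health; the services section shows the mds state
root@ld3955:~# ceph fs status    # which daemon is active / replay / standby per filesystem
root@ld3955:~# ceph mds stat     # compact MDS map summary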


There's no error in ceph-mds.ld3955.log after the MDS restart (apart from the earlier "_replay journaler got error -11" entry):
root@ld3955:~# tail -f /var/log/ceph/ceph-mds.ld3955.log
2019-09-17 12:08:14.670 7f9610af4700 0 ms_deliver_dispatch: unhandled message 0x563ee6090500 osd_map(183147..183147 src has 172245..183147) v4 from mon.0 v2:10.97.206.93:3300/0
2019-09-17 12:08:14.670 7f9610af4700 0 ms_deliver_dispatch: unhandled message 0x563ee2267440 mdsmap(e 66927) v1 from mon.0 v2:10.97.206.93:3300/0
2019-09-17 12:08:14.670 7f9610af4700 0 ms_deliver_dispatch: unhandled message 0x563ee2267200 mdsmap(e 66928) v1 from mon.0 v2:10.97.206.93:3300/0
2019-09-17 12:08:14.670 7f96092e5700 0 mds.0.log _replay journaler got error -11, aborting
2019-09-17 12:11:48.279 7fb75f9ca340 0 set uid:gid to 64045:64045 (ceph:ceph)
2019-09-17 12:11:48.279 7fb75f9ca340 0 ceph version 14.2.2 (a887fe9a5d3d97fe349065d3c1c9dbd7b8870855) nautilus (stable), process ceph-mds, pid 45678
2019-09-17 12:11:48.279 7fb75f9ca340 0 pidfile_write: ignore empty --pid-file
2019-09-17 12:11:48.283 7fb75bee3700 1 mds.ld3955 Updating MDS map to version 66928 from mon.2
2019-09-17 12:11:49.231 7fb75bee3700 1 mds.ld3955 Updating MDS map to version 66929 from mon.2
2019-09-17 12:11:49.231 7fb75bee3700 1 mds.ld3955 Map has assigned me to become a standby
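
If it matters, error -11 in the "_replay journaler got error -11, aborting" line should be EAGAIN, and after the restart this MDS only became a standby because the other node's daemon took over rank 0. To check which daemon holds the rank and whether replay ever completes, I assume something like this works ("cephfs" is just a placeholder for the filesystem name):

root@ld3955:~# ceph fs status                                         # shows the daemon in up:active / up:replay for rank 0
root@ld3955:~# ceph daemon mds.ld3955 status                          # admin-socket view of this daemon's own state
root@ld3955:~# cephfs-journal-tool --rank=cephfs:0 journal inspect    # read-only journal sanity check

I would only run any journal repair commands with the MDS daemons stopped.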


The other node's MDS is now in replay, and there is this error in ceph-mds.ld3976.log:
root@ld3976:~# tail -f /var/log/ceph/ceph-mds.ld3976.log
2019-09-17 12:33:28.189 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10c7a80 0x5589f10cf000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.193 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10d2480 0x5589ed271800 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.197 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10c7a80 0x5589f10cf000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.197 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10d2480 0x5589ed271800 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.201 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10c7a80 0x5589f10cf000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.205 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10d2480 0x5589ed271800 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.209 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10c7a80 0x5589f10cf000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.213 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10d2480 0x5589ed271800 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.213 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10c7a80 0x5589f10cf000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2019-09-17 12:33:28.221 7f576b46c700 0 --1- [v2:10.97.206.92:6800/1176103745,v1:10.97.206.92:6801/1176103745] >> v1:10.97.206.93:7058/3301343 conn(0x5589f10d2480 0x5589ed271800 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
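
From what I've read, BADAUTHORIZER usually means a cephx authentication failure, most often caused by clock skew between the nodes or by a keyring that no longer matches what the monitors expect. I plan to compare both, roughly like this (the keyring path is the default for a cluster named "ceph", adjust if yours differs):

root@ld3976:~# ceph time-sync-status                        # monitor-side clock skew report
root@ld3976:~# timedatectl                                  # local clock / NTP sync status on each node
root@ld3976:~# ceph auth get mds.ld3976                     # key the cluster expects for this MDS
root@ld3976:~# cat /var/lib/ceph/mds/ceph-ld3976/keyring    # key the daemon actually uses; both should match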


What is causing this error?
How can I fix it?
 
