One of the managers refuses to start after migrating a seven-node cluster to PVE 6 and Nautilus. Here is the debug trace:
Any hint?
Thanks,
rob
Code:
# /usr/bin/ceph-mgr -d --cluster ceph --id pvenode2 --setuser ceph --setgroup ceph --debug_ms 1 2>&1 | tee ceph-mgr.start.log
2019-11-28 12:17:39.136 7f40b23a1dc0 1 Processor -- start
2019-11-28 12:17:39.136 7f40b23a1dc0 1 -- start start
2019-11-28 12:17:39.136 7f40b23a1dc0 1 --2- >> v2:10.1.1.211:3300/0 conn(0x55f23d39e000 0x55f23d2ea580 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0 tx=0).connect
2019-11-28 12:17:39.136 7f40b23a1dc0 1 --2- >> v2:10.1.1.216:3300/0 conn(0x55f23d39e480 0x55f23d2eab00 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0 tx=0).connect
2019-11-28 12:17:39.136 7f40b23a1dc0 1 --2- >> v2:10.1.1.212:3300/0 conn(0x55f23d39e900 0x55f23d2eb080 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0 tx=0).connect
2019-11-28 12:17:39.136 7f40b23a1dc0 1 -- --> v2:10.1.1.211:3300/0 -- mon_getmap magic: 0 v1 -- 0x55f23c729180 con 0x55f23d39e000
2019-11-28 12:17:39.136 7f40b23a1dc0 1 -- --> v2:10.1.1.212:3300/0 -- mon_getmap magic: 0 v1 -- 0x55f23c729340 con 0x55f23d39e900
2019-11-28 12:17:39.136 7f40b23a1dc0 1 -- --> v2:10.1.1.216:3300/0 -- mon_getmap magic: 0 v1 -- 0x55f23c729500 con 0x55f23d39e480
2019-11-28 12:17:39.136 7f40b2157700 1 --2- >> v2:10.1.1.212:3300/0 conn(0x55f23d39e900 0x55f23d2eb080 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2019-11-28 12:17:39.136 7f40b1956700 1 --2- >> v2:10.1.1.211:3300/0 conn(0x55f23d39e000 0x55f23d2ea580 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2019-11-28 12:17:39.136 7f40b2157700 1 --2- >> v2:10.1.1.212:3300/0 conn(0x55f23d39e900 0x55f23d2eb080 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0).handle_hello peer v2:10.1.1.212:3300/0 says I am v2:10.1.1.212:35226/0 (socket says 10.1.1.212:35226)
2019-11-28 12:17:39.136 7f40b2157700 1 -- 10.1.1.212:0/1734694411 learned_addr learned my addr 10.1.1.212:0/1734694411 (peer_addr_for_me v2:10.1.1.212:0/0)
2019-11-28 12:17:39.136 7f40b1956700 1 --2- >> v2:10.1.1.211:3300/0 conn(0x55f23d39e000 0x55f23d2ea580 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0).handle_hello peer v2:10.1.1.211:3300/0 says I am v2:10.1.1.212:42474/0 (socket says 10.1.1.212:42474)
2019-11-28 12:17:39.136 7f40b1155700 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.216:3300/0 conn(0x55f23d39e480 0x55f23d2eab00 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2019-11-28 12:17:39.136 7f40b2157700 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.212:3300/0 conn(0x55f23d39e900 0x55f23d2eb080 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0).handle_auth_bad_method method=2 result (1) Operation not permitted, allowed methods=[2], allowed modes=[2,1]
2019-11-28 12:17:39.136 7f40b2157700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
2019-11-28 12:17:39.136 7f40b2157700 1 -- 10.1.1.212:0/1734694411 >> v2:10.1.1.212:3300/0 conn(0x55f23d39e900 msgr2=0x55f23d2eb080 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down
2019-11-28 12:17:39.136 7f40b2157700 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.212:3300/0 conn(0x55f23d39e900 0x55f23d2eb080 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0).stop
2019-11-28 12:17:39.136 7f40b1956700 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.211:3300/0 conn(0x55f23d39e000 0x55f23d2ea580 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0).handle_auth_bad_method method=2 result (1) Operation not permitted, allowed methods=[2], allowed modes=[2,1]
2019-11-28 12:17:39.136 7f40b1956700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
2019-11-28 12:17:39.136 7f40b1956700 1 -- 10.1.1.212:0/1734694411 >> v2:10.1.1.211:3300/0 conn(0x55f23d39e000 msgr2=0x55f23d2ea580 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down
2019-11-28 12:17:39.136 7f40b1956700 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.211:3300/0 conn(0x55f23d39e000 0x55f23d2ea580 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0).stop
2019-11-28 12:17:39.140 7f40b1155700 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.216:3300/0 conn(0x55f23d39e480 0x55f23d2eab00 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0).handle_auth_bad_method method=2 result (1) Operation not permitted, allowed methods=[2], allowed modes=[2,1]
2019-11-28 12:17:39.140 7f40b1155700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
2019-11-28 12:17:39.140 7f40b1155700 1 -- 10.1.1.212:0/1734694411 >> v2:10.1.1.216:3300/0 conn(0x55f23d39e480 msgr2=0x55f23d2eab00 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down
2019-11-28 12:17:39.140 7f40b1155700 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.216:3300/0 conn(0x55f23d39e480 0x55f23d2eab00 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rx=0 tx=0).stop
2019-11-28 12:17:39.140 7f40b23a1dc0 1 -- 10.1.1.212:0/1734694411 shutdown_connections
2019-11-28 12:17:39.140 7f40b23a1dc0 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.212:3300/0 conn(0x55f23d39e900 0x55f23d2eb080 unknown :-1 s=CLOSED pgs=0 cs=0 l=0 rx=0 tx=0).stop
2019-11-28 12:17:39.140 7f40b23a1dc0 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.211:3300/0 conn(0x55f23d39e000 0x55f23d2ea580 unknown :-1 s=CLOSED pgs=0 cs=0 l=0 rx=0 tx=0).stop
2019-11-28 12:17:39.140 7f40b23a1dc0 1 --2- 10.1.1.212:0/1734694411 >> v2:10.1.1.216:3300/0 conn(0x55f23d39e480 0x55f23d2eab00 unknown :-1 s=CLOSED pgs=0 cs=0 l=0 rx=0 tx=0).stop
2019-11-28 12:17:39.140 7f40b23a1dc0 1 -- 10.1.1.212:0/1734694411 shutdown_connections
2019-11-28 12:17:39.140 7f40b23a1dc0 1 -- 10.1.1.212:0/1734694411 wait complete.
2019-11-28 12:17:39.140 7f40b23a1dc0 1 -- 10.1.1.212:0/1734694411 >> 10.1.1.212:0/1734694411 conn(0x55f23c655a80 msgr2=0x55f23d398000 unknown :-1 s=STATE_NONE l=0).mark_down
failed to fetch mon config (--no-mon-config to skip)