Hi,
Today one of my monitors was in a bad state and did not come up after a node reboot.
So I tried to recover it with the solution from this thread, which has worked for me in the past (the steps are sketched below):
https://forum.proxmox.com/threads/i-managed-to-create-a-ghost-ceph-monitor.58435/#post-389798
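For reference, if I remember the thread correctly, the procedure is to stop a surviving monitor, edit the ghost entry out of its monmap with monmaptool, inject the map back, and start it again. Roughly this (the mon ID "ghostmon" is just a placeholder, not my actual entry):
Code:
# stop the monitor whose map we want to edit
systemctl stop ceph-mon@vmhost2
# extract its current monmap to a file
ceph-mon -i vmhost2 --extract-monmap /tmp/monmap
# list the entries, then remove the ghost one (placeholder ID)
monmaptool --print /tmp/monmap
monmaptool /tmp/monmap --rm ghostmon
# inject the edited map and start the monitor again
ceph-mon -i vmhost2 --inject-monmap /tmp/monmap
systemctl start ceph-mon@vmhost2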
I have three nodes; two of them have a monitor running, and on the third node the monitor was deleted.
But now both remaining monitors are stuck in the "probing" state.
So at the moment I have no working monitor at all.
When I select Ceph in the UI it does not respond.
pveceph status and ceph -s time out as well:
Code:
pveceph status
command 'ceph -s' failed: got timeout
The VMs on the RBD storage are all still up and running.
What can I do?
Code:
ceph --admin-daemon /var/run/ceph/ceph-mon.vmhost3.asok mon_status
{
    "name": "vmhost3",
    "rank": -1,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "features": {
        "required_con": "2449958197560098820",
        "required_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus"
        ],
        "quorum_con": "0",
        "quorum_mon": []
    },
    "outside_quorum": [
        "vmhost2"
    ],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 20,
        "fsid": "42ca65c7-716e-4357-802e-44178a1a0c03",
        "modified": "2021-06-22 14:09:26.898732",
        "created": "2017-01-30 17:14:40.940356",
        "min_mon_release": 14,
        "min_mon_release_name": "nautilus",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "vmhost2",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.0.99.82:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.0.99.82:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.0.99.82:6789/0",
                "public_addr": "10.0.99.82:6789/0"
            },
            {
                "rank": 1,
                "name": "vmhost5",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.0.99.83:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.0.99.83:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.0.99.83:6789/0",
                "public_addr": "10.0.99.83:6789/0"
            }
        ]
    },
    "feature_map": {
        "mon": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 1
            }
        ],
        "client": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 8
            }
        ]
    }
}
Logfile Vmhost3:
Code:
2021-06-22 16:30:20.480 7f0c8cf1e700 -1 mon.vmhost3@-1(probing) e20 get_health_metrics reporting 4 slow ops, oldest is log(1 entries from seq 1 at 2021-06-22 16:10:39.766768)
2021-06-22 16:30:20.744 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:20.924 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:20.928 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:20.948 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:21.044 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:21.108 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:21.188 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:21.188 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:21.280 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:21.280 7f0c8e721700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:21.348 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:21.852 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:22.056 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:30:22.148 7f0c8ef22700 1 mon.vmhost3@-1(probing) e20 handle_auth_request failed to assign global_id
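Since both monitors are only probing, I can also verify that they can reach each other on the messenger ports. This is roughly how I would test it from each node (assuming nc is installed):
Code:
# is the local mon listening on the v2/v1 ports?
ss -tlnp | grep ceph-mon
# can we reach the other monitor's ports?
nc -zv 10.0.99.82 3300
nc -zv 10.0.99.82 6789
nc -zv 10.0.99.83 3300
nc -zv 10.0.99.83 6789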
Code:
ceph --admin-daemon /var/run/ceph/ceph-mon.vmhost2.asok mon_status
{
    "name": "vmhost2",
    "rank": 0,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "features": {
        "required_con": "2449958747315912708",
        "required_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus"
        ],
        "quorum_con": "0",
        "quorum_mon": []
    },
    "outside_quorum": [
        "vmhost2"
    ],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 20,
        "fsid": "42ca65c7-716e-4357-802e-44178a1a0c03",
        "modified": "2021-06-22 14:09:26.898732",
        "created": "2017-01-30 17:14:40.940356",
        "min_mon_release": 14,
        "min_mon_release_name": "nautilus",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "vmhost2",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.0.99.82:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.0.99.82:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.0.99.82:6789/0",
                "public_addr": "10.0.99.82:6789/0"
            },
            {
                "rank": 1,
                "name": "vmhost5",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.0.99.83:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.0.99.83:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.0.99.83:6789/0",
                "public_addr": "10.0.99.83:6789/0"
            }
        ]
    },
    "feature_map": {
        "mon": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 1
            }
        ],
        "osd": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 3
            }
        ],
        "client": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 12
            }
        ],
        "mgr": [
            {
                "features": "0x3ffddff8ffecffff",
                "release": "luminous",
                "num": 3
            }
        ]
    }
}
Logfile Vmhost2:
Code:
2021-06-22 16:26:57.463 7f709f652700 -1 mon.vmhost2@0(probing) e20 get_health_metrics reporting 6149 slow ops, oldest is auth(proto 2 2 bytes epoch 0)
2021-06-22 16:26:57.471 7f709ce4d700 0 mon.vmhost2@0(probing) e20 handle_command mon_command({"prefix": "mon metadata", "id": "vmhost2"} v 0) v1
2021-06-22 16:26:57.471 7f709ce4d700 0 log_channel(audit) log [DBG] : from='mgr.122329882 10.0.99.82:0/1290' entity='mgr.vmhost2' cmd=[{"prefix": "mon metadata", "id": "vmhost2"}]: dispatch
2021-06-22 16:26:57.471 7f709ce4d700 0 mon.vmhost2@0(probing) e20 handle_command mon_command({"prefix": "mon metadata", "id": "vmhost5"} v 0) v1
2021-06-22 16:26:57.471 7f709ce4d700 0 log_channel(audit) log [DBG] : from='mgr.122329882 10.0.99.82:0/1290' entity='mgr.vmhost2' cmd=[{"prefix": "mon metadata", "id": "vmhost5"}]: dispatch
2021-06-22 16:26:57.483 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.483 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.515 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.531 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.531 7f70a0e55700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.539 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.571 7f709ce4d700 1 mon.vmhost2@0(probing) e20 adding peer [v2:10.0.99.84:3300/0,v1:10.0.99.84:6789/0] to list of hints
2021-06-22 16:26:57.575 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.591 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.607 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
2021-06-22 16:26:57.623 7f70a1656700 1 mon.vmhost2@0(probing) e20 handle_auth_request failed to assign global_id
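What strikes me in the vmhost2 log is the line "adding peer [v2:10.0.99.84:3300/0,v1:10.0.99.84:6789/0] to list of hints". That looks like the deleted monitor is still being probed. Could a stale reference in /etc/pve/ceph.conf keep the monitors in the probing state? This is the kind of entry I would look for (placeholder values, not my actual config):
Code:
# example only: a stale mon_host entry in /etc/pve/ceph.conf
[global]
    mon_host = 10.0.99.82 10.0.99.83 10.0.99.84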