Ceph Monitors not starting

faisa7847

Jul 25, 2024
Hello, I have just configured Ceph on Proxmox on a 3-node cluster, and the monitors on the 2nd and 3rd nodes show as stopped in the GUI. systemctl reports them as active and running, but for some reason the GUI still shows them as stopped.
I just found this in the Syslog view in the UI:
Sep 25 21:59:08 prx2 systemd[1]: Started ceph-mon@prx2.service - Ceph cluster monitor daemon.
Sep 25 23:53:56 prx2 systemd[1]: Stopping ceph-mon@prx2.service - Ceph cluster monitor daemon...
Sep 25 23:53:56 prx2 ceph-mon[126976]: 2024-09-25T23:53:56.287+0300 7e7462e006c0 -1 received signal: Terminated from /sbin/init (PID: 1) UID: 0
Sep 25 23:53:56 prx2 ceph-mon[126976]: 2024-09-25T23:53:56.287+0300 7e7462e006c0 -1 mon.prx2@0(probing) e0 *** Got Signal Terminated ***
Sep 25 23:53:56 prx2 systemd[1]: ceph-mon@prx2.service: Deactivated successfully.
Sep 25 23:53:56 prx2 systemd[1]: Stopped ceph-mon@prx2.service - Ceph cluster monitor daemon.
Sep 25 23:53:56 prx2 systemd[1]: ceph-mon@prx2.service: Consumed 2.413s CPU time.
Sep 25 23:53:56 prx2 systemd[1]: Started ceph-mon@prx2.service - Ceph cluster monitor daemon.
Sep 25 23:54:09 prx2 systemd[1]: Stopping ceph-mon@prx2.service - Ceph cluster monitor daemon...
Sep 25 23:54:09 prx2 ceph-mon[159435]: 2024-09-25T23:54:09.129+0300 7f9a0da006c0 -1 received signal: Terminated from /sbin/init (PID: 1) UID: 0
Sep 25 23:54:09 prx2 ceph-mon[159435]: 2024-09-25T23:54:09.129+0300 7f9a0da006c0 -1 mon.prx2@0(probing) e0 *** Got Signal Terminated ***
Sep 25 23:54:09 prx2 systemd[1]: ceph-mon@prx2.service: Deactivated successfully.
Sep 25 23:54:09 prx2 systemd[1]: Stopped ceph-mon@prx2.service - Ceph cluster monitor daemon.
Sep 26 00:01:01 prx2 systemd[1]: Started ceph-mon@prx2.service - Ceph cluster monitor daemon.


systemctl status on node 2:
ceph-mon@prx2.service - Ceph cluster monitor daemon
Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; preset: en>
Drop-In: /usr/lib/systemd/system/ceph-mon@.service.d
└─ceph-after-pve-cluster.conf
Active: active (running) since Thu 2024-09-26 00:01:01 +03; 14min ago
Main PID: 162283 (ceph-mon)
Tasks: 24
Memory: 15.9M
CPU: 635ms
CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@prx2.service
└─162283 /usr/bin/ceph-mon -f --cluster ceph --id prx2 --setuser c>

Sep 26 00:01:01 prx2 systemd[1]: Started ceph-mon@prx2.service - Ceph cluster m>

systemctl status on node 3:

ceph-mon@prx3.service - Ceph cluster monitor daemon
Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; preset: en>
Drop-In: /usr/lib/systemd/system/ceph-mon@.service.d
└─ceph-after-pve-cluster.conf
Active: active (running) since Wed 2024-09-25 22:04:21 +03; 2h 12min ago
Main PID: 136025 (ceph-mon)
Tasks: 24
Memory: 20.5M
CPU: 3.120s
CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@prx3.service
└─136025 /usr/bin/ceph-mon -f --cluster ceph --id prx3 --setuser c>

Sep 25 22:04:21 prx3 systemd[1]: Started ceph-mon@prx3.service - Ceph cluster m>
 
Hi!

Has this already been resolved? The UI sometimes takes a bit to update... What do 'ceph status' and 'ceph osd tree' say?

Can you do 'ceph ping mon.*'?
 
Hey, no, it hasn't. The Ceph UI shows the monitors as unknown, and the monitor configuration doesn't show the port as open.
root@prx1:~# ceph ping mon.*
2024-09-28T08:58:09.572+0300 7ae6e34006c0 0 ms_deliver_dispatch: unhandled message 0x7ae6e407e4d0 mon_map magic: 0 v1 from mon.0 v2:92.204.248.86:3300/0
mon.prx1
{
"health": {
"status": "HEALTH_OK",
"checks": {},
"mutes": []
},
"mon_status": {
"name": "prx1",
"rank": 0,
"state": "leader",
"election_epoch": 7,
"quorum": [
0
],
"quorum_age": 205067,
"features": {
"required_con": "2449958758054445060",
"required_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
],
"quorum_con": "4540138322906710015",
"quorum_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
]
},
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": {
"epoch": 1,
"fsid": "33fd9bcb-12b5-4548-99ce-d262e24ad979",
"modified": "2024-09-25T07:12:07.581916Z",
"created": "2024-09-25T07:12:07.581916Z",
"min_mon_release": 18,
"min_mon_release_name": "reef",
"election_strategy": 1,
"disallowed_leaders: ": "",
"stretch_mode": false,
"tiebreaker_mon": "",
"removed_ranks: ": "",
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "prx1",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "92.204.248.86:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "92.204.248.86:6789",
"nonce": 0
}
]
},
"addr": "92.204.248.86:6789/0",
"public_addr": "92.204.248.86:6789/0",
"priority": 0,
"weight": 0,
"crush_location": "{}"
}
]
},
"feature_map": {
"mon": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 1
}
],
"osd": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 3
}
],
"client": [
{
"features": "0x2f018fb87aa4aafe",
"release": "luminous",
"num": 5
},
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 10
}
],
"mgr": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 3
}
]
},
"stretch_mode": false
}
}




On node 2:
2024-09-28T08:59:39.205+0300 71a7bcc006c0 0 ms_deliver_dispatch: unhandled message 0x71a7b80a30d0 mon_map magic: 0 v1 from mon.0 v2:92.204.248.86:3300/0
mon.prx1
{
"health": {
"status": "HEALTH_OK",
"checks": {},
"mutes": []
},
"mon_status": {
"name": "prx1",
"rank": 0,
"state": "leader",
"election_epoch": 7,
"quorum": [
0
],
"quorum_age": 205156,
"features": {
"required_con": "2449958758054445060",
"required_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
],
"quorum_con": "4540138322906710015",
"quorum_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
]
},
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": {
"epoch": 1,
"fsid": "33fd9bcb-12b5-4548-99ce-d262e24ad979",
"modified": "2024-09-25T07:12:07.581916Z",
"created": "2024-09-25T07:12:07.581916Z",
"min_mon_release": 18,
"min_mon_release_name": "reef",
"election_strategy": 1,
"disallowed_leaders: ": "",
"stretch_mode": false,
"tiebreaker_mon": "",
"removed_ranks: ": "",
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "prx1",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "92.204.248.86:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "92.204.248.86:6789",
"nonce": 0
}
]
},
"addr": "92.204.248.86:6789/0",
"public_addr": "92.204.248.86:6789/0",
"priority": 0,
"weight": 0,
"crush_location": "{}"
}
]
},
"feature_map": {
"mon": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 1
}
],
"osd": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 3
}
],
"client": [
{
"features": "0x2f018fb87aa4aafe",
"release": "luminous",
"num": 5
},
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 10
}
],
"mgr": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 3
}
]
},
"stretch_mode": false
}
}





On node 3:




root@prx3:~# ceph ping mon.*
2024-09-28T09:00:42.135+0300 783d0be006c0 0 ms_deliver_dispatch: unhandled message 0x783d14065160 mon_map magic: 0 v1 from mon.0 v2:92.204.248.86:3300/0
mon.prx1
{
"health": {
"status": "HEALTH_OK",
"checks": {},
"mutes": []
},
"mon_status": {
"name": "prx1",
"rank": 0,
"state": "leader",
"election_epoch": 7,
"quorum": [
0
],
"quorum_age": 205219,
"features": {
"required_con": "2449958758054445060",
"required_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
],
"quorum_con": "4540138322906710015",
"quorum_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
]
},
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": {
"epoch": 1,
"fsid": "33fd9bcb-12b5-4548-99ce-d262e24ad979",
"modified": "2024-09-25T07:12:07.581916Z",
"created": "2024-09-25T07:12:07.581916Z",
"min_mon_release": 18,
"min_mon_release_name": "reef",
"election_strategy": 1,
"disallowed_leaders: ": "",
"stretch_mode": false,
"tiebreaker_mon": "",
"removed_ranks: ": "",
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging",
"quincy",
"reef"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "prx1",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"addr": "92.204.248.86:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "92.204.248.86:6789",
"nonce": 0
}
]
},
"addr": "92.204.248.86:6789/0",
"public_addr": "92.204.248.86:6789/0",
"priority": 0,
"weight": 0,
"crush_location": "{}"
}
]
},
"feature_map": {
"mon": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 1
}
],
"osd": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 3
}
],
"client": [
{
"features": "0x2f018fb87aa4aafe",
"release": "luminous",
"num": 5
},
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 10
}
],
"mgr": [
{
"features": "0x3f01cfbffffdffff",
"release": "luminous",
"num": 3
}
]
},
"stretch_mode": false
}
}
 
Sorry for the late reply, I was in between things.

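There seem to be communication issues, and quite possibly configuration issues, in your cluster: neither prx2 nor prx3 sees a monitor on itself. On all three nodes, 'ceph ping mon.*' is answered only by mon.prx1, and the monmap in that output lists prx1 as the only monitor.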
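
To make that reading of the output easier to repeat, here is a rough Python sketch (not an official Ceph tool) that summarizes which monitors appear in the monmap and in the quorum. It assumes you pass it only the JSON object from the 'ceph ping' output, without the log line and the "mon.prx1" line in front of it. The function name summarize_mon_status and the trimmed SAMPLE string are made up for illustration; the field names are the ones visible in the output posted in this thread.

```python
import json

# Trimmed copy of the JSON posted in this thread (only the fields used below).
SAMPLE = """
{
  "mon_status": {
    "name": "prx1",
    "state": "leader",
    "quorum": [0],
    "monmap": {
      "epoch": 1,
      "mons": [
        {"rank": 0, "name": "prx1"}
      ]
    }
  }
}
"""


def summarize_mon_status(ping_json, expected):
    """Report which expected monitors are in the monmap and in the quorum.

    ping_json: the JSON object printed by 'ceph ping mon.<id>' (as a string).
    expected:  monitor names you believe should exist, e.g. one per node.
    """
    status = json.loads(ping_json)["mon_status"]
    # Map each monitor's rank to its name, based on the monmap.
    rank_to_name = {m["rank"]: m["name"] for m in status["monmap"]["mons"]}
    known = set(rank_to_name.values())
    return {
        "in_monmap": sorted(known),
        # "quorum" holds ranks, so translate them back to names.
        "in_quorum": [rank_to_name.get(r, "rank %d" % r) for r in status["quorum"]],
        "missing_from_monmap": [n for n in expected if n not in known],
    }


print(summarize_mon_status(SAMPLE, ["prx1", "prx2", "prx3"]))
# {'in_monmap': ['prx1'], 'in_quorum': ['prx1'], 'missing_from_monmap': ['prx2', 'prx3']}
```

If prx2 and prx3 show up under missing_from_monmap, as they do with the output above, the daemons on those nodes may be running without ever having joined the cluster, which would fit the "probing" state in the syslog lines.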

Could you please post the output of the following commands from your node prx1:

cat /etc/ceph/ceph.conf

ceph status
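
Since you mentioned that the UI does not show the monitor port as open, it may also be worth checking whether the monitor ports can be reached between the nodes. Below is a minimal sketch using only the Python standard library. The ports 3300 (v2) and 6789 (v1) are taken from the monmap you posted; the function name check_ports and the MON_PORTS dictionary are made up for illustration. A successful TCP connect only shows that something is listening and reachable, not that the monitor is healthy.

```python
import socket

# Monitor ports as they appear in the monmap posted in this thread.
MON_PORTS = {"v2": 3300, "v1": 6789}


def check_ports(host, ports, timeout=2.0):
    """Try a plain TCP connect to each port and report True/False per label."""
    results = {}
    for label, port in ports.items():
        try:
            # create_connection resolves the host and connects, or raises OSError.
            with socket.create_connection((host, port), timeout=timeout):
                results[label] = True
        except OSError:
            # Covers refused connections, timeouts and name resolution errors.
            results[label] = False
    return results
```

You could run, for example, check_ports("prx2", MON_PORTS) from prx1 and the same against prx1 from the other two nodes. If a port is reported as False in one direction only, I would look at the firewall and at which network the monitors are bound to.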

How did you determine the following? Could you please post the systemctl command you are using?

"it shows stopped, but in systemctl it's running"