Hey all,
Hobbyist user here. I have a three-node cluster running Ceph, and after being away on vacation I returned to find a monitor down with the following status message:
root@pve01:~# systemctl status ceph-mon@pve01
× ceph-mon@pve01.service - Ceph cluster monitor daemon
Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; preset: enabled)
Drop-In: /usr/lib/systemd/system/ceph-mon@.service.d
└─ceph-after-pve-cluster.conf
Active: failed (Result: signal) since Sun 2024-05-19 15:07:58 EDT; 22min ago
Duration: 86ms
Process: 5432 ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id pve01 --setuser ceph --setgroup ceph (code=killed, signal=ABRT)
Main PID: 5432 (code=killed, signal=ABRT)
CPU: 60ms
May 19 15:07:58 pve01 systemd[1]: ceph-mon@pve01.service: Scheduled restart job, restart counter is at 6.
May 19 15:07:58 pve01 systemd[1]: Stopped ceph-mon@pve01.service - Ceph cluster monitor daemon.
May 19 15:07:58 pve01 systemd[1]: ceph-mon@pve01.service: Start request repeated too quickly.
May 19 15:07:58 pve01 systemd[1]: ceph-mon@pve01.service: Failed with result 'signal'.
May 19 15:07:58 pve01 systemd[1]: Failed to start ceph-mon@pve01.service - Ceph cluster monitor daemon.
May 19 15:28:17 pve01 systemd[1]: ceph-mon@pve01.service: Start request repeated too quickly.
May 19 15:28:17 pve01 systemd[1]: ceph-mon@pve01.service: Failed with result 'signal'.
May 19 15:28:17 pve01 systemd[1]: Failed to start ceph-mon@pve01.service - Ceph cluster monitor daemon.
root@pve01:~#
I have been looking at other posts, but I may be in over my head. Which logs would be helpful here, and where should I start troubleshooting?
Thanks,
James