Ceph: 2 OSDs down and out

scuppasteve

Mar 14, 2025
After an accidental power loss, I have two nodes whose OSDs are down and out. The drives are new, so I am pretty sure they are not the cause. Based on the output below I assume it is some kind of auth issue, but I can't seem to find it.

ceph health detail
Code:
HEALTH_WARN 2 osds down; 2 hosts (2 osds) down
[WRN] OSD_DOWN: 2 osds down
    osd.0 (root=default,host=optiswarm01) is down
    osd.1 (root=default,host=optiswarm02) is down
[WRN] OSD_HOST_DOWN: 2 hosts (2 osds) down
    host optiswarm01 (root=default) (1 osds) is down
    host optiswarm02 (root=default) (1 osds) is down

systemctl status ceph-osd@0.service
Code:
× ceph-osd@0.service - Ceph object storage daemon osd.0
     Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; preset: enabled)
    Drop-In: /usr/lib/systemd/system/ceph-osd@.service.d
             └─ceph-after-pve-cluster.conf
     Active: failed (Result: exit-code) since Mon 2025-03-24 14:34:46 PDT; 27min ago
   Duration: 45ms
    Process: 5563 ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 0 (code=exited, status=0/SUCCESS)
    Process: 5567 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 0 --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
   Main PID: 5567 (code=exited, status=1/FAILURE)
        CPU: 66ms

Mar 24 14:34:46 optiswarm01 systemd[1]: ceph-osd@0.service: Scheduled restart job, restart counter is at 3.
Mar 24 14:34:46 optiswarm01 systemd[1]: Stopped ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:34:46 optiswarm01 systemd[1]: ceph-osd@0.service: Start request repeated too quickly.
Mar 24 14:34:46 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 14:34:46 optiswarm01 systemd[1]: Failed to start ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 15:00:08 optiswarm01 systemd[1]: ceph-osd@0.service: Start request repeated too quickly.
Mar 24 15:00:08 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 15:00:08 optiswarm01 systemd[1]: Failed to start ceph-osd@0.service - Ceph object storage daemon osd.0.

journalctl -b -u ceph-osd@0.service
Code:
Mar 24 14:22:07 optiswarm01 systemd[1]: Starting ceph-osd@0.service - Ceph object storage daemon osd.0...
Mar 24 14:22:07 optiswarm01 systemd[1]: Started ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:22:10 optiswarm01 ceph-osd[1280]: 2025-03-24T14:22:10.782-0700 7b0f006006c0 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Mar 24 14:22:10 optiswarm01 ceph-osd[1280]: failed to fetch mon config (--no-mon-config to skip)
Mar 24 14:22:10 optiswarm01 systemd[1]: ceph-osd@0.service: Main process exited, code=exited, status=1/FAILURE
Mar 24 14:22:10 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 14:22:21 optiswarm01 systemd[1]: ceph-osd@0.service: Scheduled restart job, restart counter is at 1.
Mar 24 14:22:21 optiswarm01 systemd[1]: Stopped ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:22:21 optiswarm01 systemd[1]: Starting ceph-osd@0.service - Ceph object storage daemon osd.0...
Mar 24 14:22:21 optiswarm01 systemd[1]: Started ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:22:21 optiswarm01 ceph-osd[1578]: 2025-03-24T14:22:21.123-0700 7b880a0006c0 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Mar 24 14:22:21 optiswarm01 ceph-osd[1578]: failed to fetch mon config (--no-mon-config to skip)
Mar 24 14:22:21 optiswarm01 systemd[1]: ceph-osd@0.service: Main process exited, code=exited, status=1/FAILURE
Mar 24 14:22:21 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 14:22:31 optiswarm01 systemd[1]: ceph-osd@0.service: Scheduled restart job, restart counter is at 2.
Mar 24 14:22:31 optiswarm01 systemd[1]: Stopped ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:22:31 optiswarm01 systemd[1]: ceph-osd@0.service: Start request repeated too quickly.
Mar 24 14:22:31 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 14:22:31 optiswarm01 systemd[1]: Failed to start ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:34:15 optiswarm01 systemd[1]: Starting ceph-osd@0.service - Ceph object storage daemon osd.0...
Mar 24 14:34:15 optiswarm01 systemd[1]: Started ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:34:15 optiswarm01 ceph-osd[5419]: 2025-03-24T14:34:15.742-0700 72a8836006c0 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Mar 24 14:34:15 optiswarm01 ceph-osd[5419]: 2025-03-24T14:34:15.743-0700 72a8840006c0 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Mar 24 14:34:15 optiswarm01 ceph-osd[5419]: failed to fetch mon config (--no-mon-config to skip)
Mar 24 14:34:15 optiswarm01 systemd[1]: ceph-osd@0.service: Main process exited, code=exited, status=1/FAILURE
Mar 24 14:34:15 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 14:34:25 optiswarm01 systemd[1]: ceph-osd@0.service: Scheduled restart job, restart counter is at 1.
Mar 24 14:34:25 optiswarm01 systemd[1]: Stopped ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:34:25 optiswarm01 systemd[1]: Starting ceph-osd@0.service - Ceph object storage daemon osd.0...
Mar 24 14:34:25 optiswarm01 systemd[1]: Started ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:34:25 optiswarm01 ceph-osd[5504]: 2025-03-24T14:34:25.824-0700 75388a4006c0 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Mar 24 14:34:25 optiswarm01 ceph-osd[5504]: failed to fetch mon config (--no-mon-config to skip)
Mar 24 14:34:25 optiswarm01 systemd[1]: ceph-osd@0.service: Main process exited, code=exited, status=1/FAILURE
Mar 24 14:34:25 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 14:34:36 optiswarm01 systemd[1]: ceph-osd@0.service: Scheduled restart job, restart counter is at 2.
Mar 24 14:34:36 optiswarm01 systemd[1]: Stopped ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:34:36 optiswarm01 systemd[1]: Starting ceph-osd@0.service - Ceph object storage daemon osd.0...
Mar 24 14:34:36 optiswarm01 systemd[1]: Started ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:34:36 optiswarm01 ceph-osd[5567]: 2025-03-24T14:34:36.078-0700 79d5d34006c0 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]
Mar 24 14:34:36 optiswarm01 ceph-osd[5567]: failed to fetch mon config (--no-mon-config to skip)
Mar 24 14:34:36 optiswarm01 systemd[1]: ceph-osd@0.service: Main process exited, code=exited, status=1/FAILURE
Mar 24 14:34:36 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 14:34:46 optiswarm01 systemd[1]: ceph-osd@0.service: Scheduled restart job, restart counter is at 3.
Mar 24 14:34:46 optiswarm01 systemd[1]: Stopped ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 14:34:46 optiswarm01 systemd[1]: ceph-osd@0.service: Start request repeated too quickly.
Mar 24 14:34:46 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 14:34:46 optiswarm01 systemd[1]: Failed to start ceph-osd@0.service - Ceph object storage daemon osd.0.
Mar 24 15:00:08 optiswarm01 systemd[1]: ceph-osd@0.service: Start request repeated too quickly.
Mar 24 15:00:08 optiswarm01 systemd[1]: ceph-osd@0.service: Failed with result 'exit-code'.
Mar 24 15:00:08 optiswarm01 systemd[1]: Failed to start ceph-osd@0.service - Ceph object storage daemon osd.0.
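
The journal above shows the same monclient error on every start attempt, so it may help to condense it before digging further. Below is a minimal sketch, assuming a POSIX shell with grep, sed, sort, uniq and awk available. The function name summarize_auth_failures is made up for illustration; it only reads log text from stdin and does not touch the cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph or Proxmox): count the
# "handle_auth_bad_method" lines per ceph-osd process in journal text
# read from stdin. Prints one line per process: "<pid> <count>".
summarize_auth_failures() {
    grep 'handle_auth_bad_method' \
        | sed -n 's/.*ceph-osd\[\([0-9]*\)].*/\1/p' \
        | sort -n \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Typical use on the affected node (unit name taken from the post):
#   journalctl -b -u ceph-osd@0.service | summarize_auth_failures
```

If every ceph-osd PID in the journal shows up in that summary, the failure is consistent across restarts rather than intermittent. From there, two things that might be worth checking (neither is confirmed by the output in this post) are whether the node's clock is correct after the power loss and whether the key reported by `ceph auth get osd.0` matches the keyring the OSD has locally.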