Hi all! I really hope someone can guide me with this issue, because I do not know Ceph very well and I am not sure about the next steps.
I have a 3-node cluster with identical hardware. Ceph was working without any issue until I had to reboot because of a network problem. Now Ceph is down on one node, and most of its OSDs are down/out. I have tried marking those OSDs "in" again, but they still will not start, and some of them go back to "out" afterwards.
Here are some logs:
Code:
root@SOCSMI-SRV3:/var/log/ceph# ceph health detail
HEALTH_WARN 24 osds down; 1 host (24 osds) down; Degraded data redundancy: 4595689/13787067 objects degraded (33.333%), 2560 pgs degraded, 2560 pgs undersized
OSD_DOWN 24 osds down
osd.48 (root=default,host=SOCSMI-SRV3) is down
osd.49 (root=default,host=SOCSMI-SRV3) is down
osd.50 (root=default,host=SOCSMI-SRV3) is down
osd.51 (root=default,host=SOCSMI-SRV3) is down
osd.52 (root=default,host=SOCSMI-SRV3) is down
osd.53 (root=default,host=SOCSMI-SRV3) is down
osd.54 (root=default,host=SOCSMI-SRV3) is down
osd.55 (root=default,host=SOCSMI-SRV3) is down
osd.56 (root=default,host=SOCSMI-SRV3) is down
osd.57 (root=default,host=SOCSMI-SRV3) is down
osd.58 (root=default,host=SOCSMI-SRV3) is down
osd.59 (root=default,host=SOCSMI-SRV3) is down
osd.60 (root=default,host=SOCSMI-SRV3) is down
osd.61 (root=default,host=SOCSMI-SRV3) is down
osd.62 (root=default,host=SOCSMI-SRV3) is down
osd.63 (root=default,host=SOCSMI-SRV3) is down
osd.64 (root=default,host=SOCSMI-SRV3) is down
osd.65 (root=default,host=SOCSMI-SRV3) is down
osd.66 (root=default,host=SOCSMI-SRV3) is down
osd.67 (root=default,host=SOCSMI-SRV3) is down
osd.68 (root=default,host=SOCSMI-SRV3) is down
osd.69 (root=default,host=SOCSMI-SRV3) is down
osd.70 (root=default,host=SOCSMI-SRV3) is down
osd.71 (root=default,host=SOCSMI-SRV3) is down
OSD_HOST_DOWN 1 host (24 osds) down
host SOCSMI-SRV3 (root=default) (24 osds) is down
PG_DEGRADED Degraded data redundancy: 4595689/13787067 objects degraded (33.333%), 2560 pgs degraded, 2560 pgs undersized
pg 1.7cd is active+undersized+degraded, acting [35,23]
pg 1.7ce is stuck undersized for 73002.137990, current state active+undersized+degraded, last acting [6,37]
pg 1.7cf is stuck undersized for 73003.125764, current state active+undersized+degraded, last acting [0,38]
pg 1.7d0 is stuck undersized for 73003.115012, current state active+undersized+degraded, last acting [0,32]
pg 1.7d1 is stuck undersized for 73002.137108, current state active+undersized+degraded, last acting [34,20]
pg 1.7d2 is stuck undersized for 73002.120444, current state active+undersized+degraded, last acting [4,36]
pg 1.7d3 is stuck undersized for 73018.159866, current state active+undersized+degraded, last acting [27,7]
pg 1.7d4 is stuck undersized for 73002.141011, current state active+undersized+degraded, last acting [28,0]
pg 1.7d5 is stuck undersized for 73002.143990, current state active+undersized+degraded, last acting [19,32]
pg 1.7d6 is stuck undersized for 73018.171505, current state active+undersized+degraded, last acting [10,25]
pg 1.7d7 is stuck undersized for 73003.132391, current state active+undersized+degraded, last acting [17,46]
pg 1.7d8 is stuck undersized for 73003.129911, current state active+undersized+degraded, last acting [17,42]
pg 1.7d9 is stuck undersized for 73018.167376, current state active+undersized+degraded, last acting [4,45]
pg 1.7da is stuck undersized for 73018.151950, current state active+undersized+degraded, last acting [15,44]
pg 1.7db is stuck undersized for 73018.183700, current state active+undersized+degraded, last acting [24,23]
pg 1.7dc is stuck undersized for 73003.113184, current state active+undersized+degraded, last acting [35,13]
pg 1.7dd is stuck undersized for 73003.136707, current state active+undersized+degraded, last acting [31,19]
pg 1.7de is stuck undersized for 73002.143750, current state active+undersized+degraded, last acting [2,38]
pg 1.7df is stuck undersized for 73002.134902, current state active+undersized+degraded, last acting [11,26]
pg 1.7e0 is stuck undersized for 73002.119510, current state active+undersized+degraded, last acting [22,47]
pg 1.7e1 is stuck undersized for 73003.130654, current state active+undersized+degraded, last acting [10,29]
pg 1.7e2 is stuck undersized for 73003.134833, current state active+undersized+degraded, last acting [46,22]
pg 1.7e3 is stuck undersized for 73018.153941, current state active+undersized+degraded, last acting [19,39]
pg 1.7e4 is stuck undersized for 73002.141248, current state active+undersized+degraded, last acting [33,17]
pg 1.7e5 is stuck undersized for 73003.130155, current state active+undersized+degraded, last acting [39,1]
pg 1.7e6 is stuck undersized for 73002.125134, current state active+undersized+degraded, last acting [18,34]
pg 1.7e7 is stuck undersized for 73018.163222, current state active+undersized+degraded, last acting [35,23]
pg 1.7e8 is stuck undersized for 73018.169650, current state active+undersized+degraded, last acting [40,19]
pg 1.7e9 is stuck undersized for 73003.133874, current state active+undersized+degraded, last acting [23,28]
pg 1.7ea is stuck undersized for 73002.138131, current state active+undersized+degraded, last acting [37,4]
pg 1.7eb is stuck undersized for 73003.122079, current state active+undersized+degraded, last acting [46,20]
pg 1.7ec is stuck undersized for 73003.127444, current state active+undersized+degraded, last acting [20,31]
pg 1.7ed is stuck undersized for 73002.121792, current state active+undersized+degraded, last acting [10,37]
pg 1.7ee is stuck undersized for 73003.128326, current state active+undersized+degraded, last acting [40,13]
pg 1.7ef is stuck undersized for 73018.175634, current state active+undersized+degraded, last acting [46,5]
pg 1.7f0 is stuck undersized for 73002.123425, current state active+undersized+degraded, last acting [17,28]
pg 1.7f1 is stuck undersized for 73002.130478, current state active+undersized+degraded, last acting [30,19]
pg 1.7f2 is stuck undersized for 73003.142126, current state active+undersized+degraded, last acting [36,14]
pg 1.7f3 is stuck undersized for 73002.136261, current state active+undersized+degraded, last acting [34,1]
pg 1.7f4 is stuck undersized for 73003.147187, current state active+undersized+degraded, last acting [16,30]
pg 1.7f5 is stuck undersized for 73018.165342, current state active+undersized+degraded, last acting [33,22]
pg 1.7f6 is stuck undersized for 73002.137469, current state active+undersized+degraded, last acting [7,27]
pg 1.7f7 is stuck undersized for 73002.133176, current state active+undersized+degraded, last acting [40,18]
pg 1.7f8 is stuck undersized for 73018.167834, current state active+undersized+degraded, last acting [17,46]
pg 1.7f9 is stuck undersized for 73002.122180, current state active+undersized+degraded, last acting [10,39]
pg 1.7fa is stuck undersized for 73003.142536, current state active+undersized+degraded, last acting [9,24]
pg 1.7fb is stuck undersized for 73002.137872, current state active+undersized+degraded, last acting [7,35]
pg 1.7fc is stuck undersized for 73003.121659, current state active+undersized+degraded, last acting [30,23]
pg 1.7fd is stuck undersized for 73003.116221, current state active+undersized+degraded, last acting [11,41]
pg 1.7fe is stuck undersized for 73018.125948, current state active+undersized+degraded, last acting [13,38]
pg 1.7ff is stuck undersized for 73002.133519, current state active+undersized+degraded, last acting [8,32]
I have moved all my VMs to the other node, so I don't mind (I guess) deleting and recreating things if needed, but I am not sure what to do at this point. Thanks for any help or advice you can give.

R.