Ceph stops responding occasionally

asou

New Member
Sep 11, 2022
Hi all,
We are facing a strange issue with our external Ceph cluster (16.2.9): for no apparent reason it stops responding, and IO requests appear to pile up in a queue. We have also noticed that during this time "rbd ls" works fine but "rbd ls -l" does not respond. The problem goes away after we reboot random nodes in the cluster. This is a 4-node cluster with replica-3 pools, 5 mons and 52 OSDs (13 per node). Any suggestions?

"rbd ls" is working OK but "rbd ls -l" does not respond.
On a hunch: from what I remember, rbd ls does not need a ceph-mgr, while rbd ls -l does. Check the ceph-mgr instances in your cluster (or, if you have only one, consider adding another).
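
If it helps, here is a rough sketch of how that check could be scripted so it can run from cron or by hand while the cluster is stuck. It is not an official tool. It assumes the `ceph` CLI is on the PATH and that `ceph mgr stat` prints JSON with the fields `active_name`, `available` and `num_standby`; please verify those field names against your own 16.2.9 output. The helper names (`run_with_timeout`, `summarize_mgr_stat`, `mgr_warnings`) are made up for this example. The timeout wrapper is there because a command that hangs, like your "rbd ls -l", would otherwise block the script too.

```python
import json
import subprocess


def run_with_timeout(cmd, timeout_s=15):
    """Run cmd and return (status, stdout).

    status is "ok", "failed" (non-zero exit), "timeout" (still running
    after timeout_s seconds) or "missing" (binary not found).
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    except FileNotFoundError:
        return "missing", ""
    return ("ok" if proc.returncode == 0 else "failed"), proc.stdout


def summarize_mgr_stat(raw_json):
    """Reduce `ceph mgr stat` JSON to the three fields we care about.

    Assumption: the field names below match your Ceph release.
    """
    stat = json.loads(raw_json)
    return {
        "active": stat.get("active_name") or None,
        "available": bool(stat.get("available", False)),
        "standbys": int(stat.get("num_standby", 0)),
    }


def mgr_warnings(summary):
    """Return human-readable warnings for a summary from summarize_mgr_stat."""
    warnings = []
    if not summary["available"] or not summary["active"]:
        warnings.append("no active ceph-mgr available")
    if summary["standbys"] == 0:
        warnings.append("no standby ceph-mgr; consider adding one")
    return warnings


if __name__ == "__main__":
    status, out = run_with_timeout(["ceph", "mgr", "stat", "--format", "json"])
    if status != "ok":
        print(f"ceph mgr stat: {status}")
    else:
        for warning in mgr_warnings(summarize_mgr_stat(out)):
            print(warning)

    # Compare the two commands from the original post under the same timeout.
    for cmd in (["rbd", "ls"], ["rbd", "ls", "-l"]):
        status, _ = run_with_timeout(cmd)
        print(" ".join(cmd), "->", status)
```

Both rbd commands run against the default pool here; add your pool name if the images live elsewhere. If "rbd ls -l" reports "timeout" while the mgr looks healthy, the mgr is probably not the culprit and the logs are the next place to look.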

Otherwise, carefully read through the Ceph logs; they usually contain hints about what is going wrong.

I hope this helps!