Hello.
Having the same issue here.
When you said:
Do you mean that you only had to restart a service (and if so, which one), or did you reboot the whole server?
Also, how many MDS daemons are configured on your cluster (as shown by ceph status)? I have only one MDS, so I'm not sure what to do next.