Can't log in to web UI after a node failure

Jun 28, 2019
Hi,

I'm testing a three-node cluster with Ceph and HA, and everything is OK. But if I power off one node (cold reboot), I can't log in to the web UI anymore until that host comes back online. Ceph, HA, etc. also stop working in this state.

If I reboot the host normally (clean OS reboot), everything works as expected: the OSDs from this node appear as DOWN in the Ceph manager, HA works, and I can log in to the web UI. But if I force a node failure (powering it off directly via Dell iDRAC) to simulate a real disaster recovery scenario, everything breaks: no login to the web UI (Login failed try again), the OSDs from this node still appear as UP/IN in the Ceph manager, no Ceph storage is available, HA does not work, etc.

I'm using Proxmox 6, latest version.

Any tips?
 
I ran some tests here, and marking the OSDs as down manually works. I can access the Ceph storage again.

# ceph osd down osd.1
# ceph osd down osd.2

etc...
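
For a node with many OSDs, the manual commands above could be wrapped in a small loop. This is only a rough sketch of the same workaround, not a fix for the underlying detection problem. It assumes `ceph osd ls-tree` is available on the installed Ceph release and that the failed node's CRUSH bucket name matches its hostname. The function name `mark_host_osds_down` and the host name in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Sketch: mark every OSD below one CRUSH host bucket as down.
# Assumption: the CRUSH bucket name equals the node's hostname.
mark_host_osds_down() {
    failed_host="$1"
    # "ceph osd ls-tree <bucket>" prints the OSD ids under that bucket, one per line.
    for id in $(ceph osd ls-tree "$failed_host"); do
        echo "marking osd.$id down"
        ceph osd down "osd.$id"
    done
}

# Usage (hypothetical host name):
#   mark_host_osds_down pve-node2
```

Run it from one of the surviving nodes, since the powered-off node cannot answer.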

Is this by design, or should the OSDs be marked as down automatically when a host suffers a complete failure?

My understanding is that it should happen automatically, shouldn't it?