To me it looks like a Ceph cluster (hyper-converged on PVE or separate) causes trouble more often, and heavier trouble, than any NAS server, as the new threads here every week show. An HA NAS like a NetApp/Isilon, by contrast, only has a problem when the power is off, and that can be covered with a metro (second, coupled) installation while the switches run on emergency power. Over a span of 15 years I've seen 5 HA systems crash at one customer, and checking such a system and repairing its services and data once it is in chaos takes a lot of time. Sometimes it would simply be better to stop a service, check it, and start it again, rather than running into ever-growing data trouble while HA thinks it can keep doing its job when it can't.