Fencing/failover for upstream and Ceph networks?

etfz

Member
Aug 29, 2025
Hi,

I have a cluster with separate networks for VM upstream traffic, Ceph and corosync. When a host loses corosync connectivity, fencing kicks in, as expected, but can I do the same for the VM and Ceph networks? A host without upstream or Ceph connectivity has no business trying to run VMs. Is there any reason I wouldn't want to do this?
 
I believe it is standard practice to make the corosync, Ceph, and upstream networks redundant using bonding or MLAG.

To cover scenarios such as double failures, I have previously used a network monitoring script that triggers a kernel panic with `echo c > /proc/sysrq-trigger`, which in turn initiates fencing.
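
A minimal sketch of what such a monitoring script might look like, not the exact script described above. The probe addresses, the threshold, the interval, and all function names are assumptions for illustration. It also assumes the Linux iputils `ping` (where `-W` is a timeout in seconds) and that the script runs as root.

```python
#!/usr/bin/env python3
"""Watchdog sketch: crash the node when a monitored network stays unreachable."""
import subprocess
import time

# Hypothetical probe targets, one per network; replace with real peers.
TARGETS = {"ceph": "10.10.10.1", "upstream": "192.0.2.1"}
FAIL_THRESHOLD = 5    # consecutive failed rounds before crashing the node
INTERVAL_SECONDS = 2  # pause between probe rounds


def reachable(address: str) -> bool:
    """Return True if a single ICMP echo to address is answered within 1 s."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def next_failure_count(previous: int, results: dict) -> int:
    """Count consecutive rounds in which at least one network was down."""
    return 0 if all(results.values()) else previous + 1


def crash_node() -> None:
    """Equivalent of `echo c > /proc/sysrq-trigger`; requires root."""
    with open("/proc/sysrq-trigger", "w") as trigger:
        trigger.write("c")


def main() -> None:
    """Probe every target in a loop and crash once the threshold is hit."""
    failures = 0
    while True:
        results = {name: reachable(addr) for name, addr in TARGETS.items()}
        failures = next_failure_count(failures, results)
        if failures >= FAIL_THRESHOLD:
            crash_node()
        time.sleep(INTERVAL_SECONDS)
```

Calling `main()` from a service started at boot would run the loop. Requiring several consecutive failed rounds is a deliberate choice so that a single dropped ping does not take the node down. Probing more than one peer per network would further reduce the risk of a node crashing itself because the probe target, rather than its own link, has failed.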
 
I do employ redundancy, so it is certainly an unlikely scenario, but since the technology exists for corosync, I feel like there should be no reason I couldn't have the same safeguard in place for at least Ceph, which is very much a clustered feature.

If anything, this behaviour seems to encourage using the same network for Ceph and corosync.

Thanks for the tip regarding manual kernel panic, though.
 