We had a node fence itself when the IPMI reset counter reached zero (according to the motherboard logs), but we can't locate any logging information leading up to this event.
Surely corosync or pvecm logs events such as losing quorum, occasional heartbeat messages being lost or discarded?
We appear to have previously identified an issue where a node was fenced after systemd-timesyncd jumped the time because of a problem with one or more members of pool.ntp.org. We now run ntp, which drifts gradually to the correct time instead of jumping, and which discards answers from NTP servers that are not in agreement with the majority...