Hi,
A few days ago all nodes of our three-node cluster restarted almost at once. We are still looking for the cause. Apparently there was a DDoS attack on a server in the same VLAN, but we are still puzzled why the nodes rebooted (there are no log entries that would help).
The main problem, however, is something else. Once all nodes were up, NONE of the VMs started.
In the pve logs, I see that startall failed because there was no quorum.
From previous posts I understand that quorum is mandatory for the VMs to start, but if so, why is there no retry at 1-2 minute intervals? Quorum was established within a few minutes, but all VMs stayed down.
The same happened with the HA VMs (there are 4 of them). These failed because sheepdog was not ready. I see no sheepdog errors, BTW.
We are building an HA system, but this issue has ruined our expectations.
Our questions are:
- How can we make the cluster start the VMs even if the first attempt fails for lack of quorum?
- What caused the sheepdog error, why wasn't sheepdog ready once the cluster was up, and why did the HA VMs also fail to start?
- Are you aware of any issues, or have you seen any cases, where all three nodes restarted at nearly the same time? (We have no HW watchdog, but I have seen some software watchdog entries in the logs.)
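To make the first question concrete, here is a minimal sketch of the retry behaviour we expected. It is only an illustration: `has_quorum` and `start_all_vms` are hypothetical placeholders that would have to be wired to the real cluster tooling, and the 60-second interval and attempt limit are assumptions, not Proxmox settings.

```python
import time
from typing import Callable


def start_when_quorate(
    has_quorum: Callable[[], bool],       # hypothetical: returns True once the cluster is quorate
    start_all_vms: Callable[[], None],    # hypothetical: triggers the normal "start all" action
    retry_interval_s: float = 60.0,       # assumed retry interval (1 minute)
    max_attempts: int = 10,               # assumed upper bound so the loop cannot run forever
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll for quorum and start the VMs as soon as it is available.

    Returns True if the VMs were started, False if quorum never appeared
    within max_attempts checks.
    """
    for attempt in range(1, max_attempts + 1):
        if has_quorum():
            start_all_vms()
            return True
        # Wait before the next check, but not after the final failed attempt.
        if attempt < max_attempts:
            sleep(retry_interval_s)
    return False
```

In our case quorum was available after a few minutes, so a loop like this would have started the VMs on the third or fourth check instead of giving up after the first one.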
Thanks a lot!
Attila