Hi Dietmar,
The nodes on which I am testing PVE HA do not have an IPMI port. I am using the Linux softdog.
There is an easy way to test if the watchdog works correctly:
# echo 1 >/dev/watchdog
This should trigger a reboot within 60 seconds. Does that work?
Before unplugging the network cable of node2, I tested echo 1 > /dev/watchdog on all 3 nodes (node1, node2 and node3). All 3 nodes reported: "-bash: /dev/watchdog: Device or resource busy"
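For reference, "Device or resource busy" here normally means that another process already has /dev/watchdog open, since the kernel lets only one process open the device at a time. On a PVE node that process is presumably the watchdog multiplexer daemon, but that is an assumption and not something I have verified on these nodes. A minimal sketch for checking the holder, assuming fuser (from psmisc) is installed; the function name watchdog_holder is just illustrative:

```shell
# Hypothetical helper: report which process, if any, has a watchdog device open.
# Run as root, otherwise processes owned by other users may not be listed.
watchdog_holder() {
    dev="${1:-/dev/watchdog}"
    if [ ! -e "$dev" ]; then
        echo "absent: $dev"
        return 1
    fi
    if ! command -v fuser >/dev/null 2>&1; then
        echo "fuser not installed"
        return 2
    fi
    # fuser -v lists user, PID and command of each process holding the file;
    # it exits non-zero when no process has the file open.
    fuser -v "$dev" 2>&1 || echo "free: $dev"
}

# Usage: watchdog_holder /dev/watchdog
```

If a process is listed, the echo test cannot open the device on that node and therefore cannot trigger a reboot there.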
Then I unplugged the network cable of node2. HA worked: it migrated the VMs from node2 evenly to node1 and node3. After some minutes, I plugged the network cable back into node2. The membership quorum was finalised, but the VMs were not moved back to node2.
I then executed "echo 1 > /dev/watchdog" on node2, and this time the command was accepted (while node1 and node3 still say "Device or resource busy").
After executing the echo command on node2, I checked the syslog, and it fails with the same error I had before:
Aug 6 14:39:53 node2 kernel: [ 1320.827845] watchdog watchdog0: watchdog did not stop!
Aug 6 14:39:57 node2 pve-ha-lrm[1171]: watchdog update failed - Broken pipe
and it keeps repeating the last line ("Broken pipe") on node2. It does NOT trigger a reboot within 60 seconds.
Each time I execute echo 1 > /dev/watchdog, this error is repeated in the syslog.
Are there any settings that need to be changed in the BIOS?
Thanks
Shafeek