Running PVE 8.0.4
ipmi_watchdog configured.

After disabling maintenance mode via

ha-manager crm-command node-maintenance disable node3

ha-manager status shows:

lrm node3 (old timestamp - dead?, [date & time])
...
service vm:XXXX (node3, freeze)

systemctl status watchdog-mux pve-ha-lrm shows that neither service is running; both failed to start.

After attempting to (re)start those services, the logs show the output below. I'm not sure what causes the "ERROR: unable to open watchdog socket - No such file or directory" in it. Am I missing some additional configuration for the watchdog, or could it be something else?

Oct 12 14:59:25 node3 systemd[1]: Started watchdog-mux.service - Proxmox VE watchdog multiplexer.
Oct 12 14:59:25 node3 watchdog-mux[40674]: watchdog open: Device or resource busy
Oct 12 14:59:25 node3 systemd[1]: Starting pve-ha-lrm.service - PVE Local HA Resource Manager Daemon...
Oct 12 14:59:25 node3 systemd[1]: watchdog-mux.service: Main process exited, code=exited, status=1/FAILURE
Oct 12 14:59:25 node3 systemd[1]: watchdog-mux.service: Failed with result 'exit-code'.
Oct 12 14:59:26 node3 pve-ha-lrm[40694]: starting server
Oct 12 14:59:26 node3 pve-ha-lrm[40694]: status change startup => wait_for_agent_lock
Oct 12 14:59:26 node3 systemd[1]: Started pve-ha-lrm.service - PVE Local HA Resource Manager Daemon.
Oct 12 14:59:32 node3 pve-ha-lrm[40694]: successfully acquired lock 'ha_agent_node3_lock'
Oct 12 14:59:32 node3 pve-ha-lrm[40694]: ERROR: unable to open watchdog socket - No such file or directory
Oct 12 14:59:32 node3 pve-ha-lrm[40694]: restart LRM, freeze all services
Oct 12 14:59:32 node3 pve-ha-lrm[40694]: server stopped
Oct 12 14:59:32 node3 systemd[1]: pve-ha-lrm.service: Main process exited, code=exited, status=255/EXCEPTION
Oct 12 14:59:32 node3 systemd[1]: pve-ha-lrm.service: Failed with result 'exit-code'.
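
In the log above, watchdog-mux exits with "Device or resource busy" a few seconds before pve-ha-lrm reports the missing socket, so the socket error may simply follow from watchdog-mux not running. A minimal sketch of checks that could narrow this down is below. The paths /dev/watchdog and /run/watchdog-mux.sock are assumed defaults, describe_path is a hypothetical helper, and fuser (from psmisc) may need root to show the owning process.

```shell
#!/bin/sh
# Hypothetical helper: report what kind of file (if any) exists at a path.
# It only stats the path and never opens it, so it does not arm the watchdog.
describe_path() {
    if [ -S "$1" ]; then
        echo "socket"
    elif [ -c "$1" ]; then
        echo "chardev"
    elif [ -e "$1" ]; then
        echo "other"
    else
        echo "missing"
    fi
}

# Assumed default locations of the watchdog device and the watchdog-mux socket.
echo "/dev/watchdog: $(describe_path /dev/watchdog)"
echo "/run/watchdog-mux.sock: $(describe_path /run/watchdog-mux.sock)"

# Show which process, if any, currently holds the watchdog device.
if command -v fuser >/dev/null 2>&1; then
    fuser -v /dev/watchdog 2>&1 || true
fi

# List loaded kernel modules that look like watchdog drivers.
lsmod 2>/dev/null | grep -i -E 'watchdog|wdt|softdog' || true
```

If another process shows up as holding /dev/watchdog, that would be consistent with the "Device or resource busy" line, but this is only a starting point and not a confirmed diagnosis.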
Thanks!