Hello,
I have a 3-node cluster based on pvetest. Nothing has been installed on the PVE nodes apart from the Proxmox distribution itself.
On one of the nodes, pve-ha-lrm fails every time. It fails within a few seconds of being started from the GUI (or after a reboot), with the following messages in syslog:
May 5 15:26:32 pve2 pvedaemon[1260]: <root@pam> starting task UPID: pve2:00001747:0000AE60:572B4A08:srvstart: pve-ha-lrm:root@pam:
May 5 15:26:32 pve2 pvedaemon[5959]: starting service pve-ha-lrm: UPID: pve2:00001747:0000AE60:572B4A08:srvstart: pve-ha-lrm:root@pam:
May 5 15:26:32 pve2 watchdog-mux[5961]: watchdog active - unable to restart watchdog-mux
May 5 15:26:32 pve2 systemd[1]: watchdog-mux.service: main process exited, code=exited, status=1/FAILURE
May 5 15:26:32 pve2 systemd[1]: Unit watchdog-mux.service entered failed state.
May 5 15:26:32 pve2 pve-ha-lrm[5984]: starting server
May 5 15:26:32 pve2 pve-ha-lrm[5984]: status change startup => wait_for_agent_lock
May 5 15:26:34 pve2 pve-ha-lrm[5984]: successfully acquired lock 'ha_agent_pve2_lock'
May 5 15:26:34 pve2 pve-ha-lrm[5984]: ERROR: unable to open watchdog socket - No such file or directory
May 5 15:26:34 pve2 pve-ha-lrm[5984]: restart LRM, freeze all services
May 5 15:26:34 pve2 pve-ha-lrm[5984]: server stopped
May 5 15:26:34 pve2 systemd[1]: pve-ha-lrm.service: main process exited, code=exited, status=255/n/a
May 5 15:26:35 pve2 systemd[1]: Unit pve-ha-lrm.service entered failed state.
May 5 15:27:40 pve2 systemd-timesyncd[638]: interval/delta/delay/jitter/drift 512s/-0.013s/0.036s/0.014s/+40ppm
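To make the order of events easier to see, here is a minimal Python sketch that lists the systemd units in the order they enter the failed state. It is my own helper, not part of Proxmox; `LINE_RE` and `failed_units` are names I made up, and the sample only contains four lines copied from the excerpt above. From the log it looks like watchdog-mux fails first and pve-ha-lrm follows, which would fit the "unable to open watchdog socket" error, but that is only my reading.

```python
import re

# Four lines copied verbatim from the syslog excerpt above.
LOG = """\
May 5 15:26:32 pve2 watchdog-mux[5961]: watchdog active - unable to restart watchdog-mux
May 5 15:26:32 pve2 systemd[1]: Unit watchdog-mux.service entered failed state.
May 5 15:26:34 pve2 pve-ha-lrm[5984]: ERROR: unable to open watchdog socket - No such file or directory
May 5 15:26:35 pve2 systemd[1]: Unit pve-ha-lrm.service entered failed state.
"""

# timestamp, host, program, pid, message (format assumed from the excerpt)
LINE_RE = re.compile(r"^(\w+ +\d+ [\d:]+) (\S+) ([\w.-]+)\[(\d+)\]: (.*)$")
FAILED_RE = re.compile(r"Unit (\S+) entered failed state")


def failed_units(log_text):
    """Return systemd unit names in the order they entered the failed state."""
    units = []
    for line in log_text.splitlines():
        parsed = LINE_RE.match(line)
        if not parsed:
            continue  # skip anything that does not look like a syslog line
        program, message = parsed.group(3), parsed.group(5)
        failed = FAILED_RE.match(message)
        if program == "systemd" and failed:
            units.append(failed.group(1))
    return units


if __name__ == "__main__":
    for unit in failed_units(LOG):
        print(unit)
```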
All nodes are installed the same way; only their hardware differs. Any idea what could cause this issue?
Thanks a lot in advance!