pve cluster node rebooted

ontec

Jan 12, 2018
Hi,

we have a PVE cluster with 9 nodes (version: pve-manager/4.4-21/e0dadcf8, running kernel: 4.4.98-3-pve).
Yesterday one of the nodes rebooted and we do not know why.

These are the last lines we see in /var/log/syslog before the reboot:

Code:
Jan 11 21:59:51 bespin pve-firewall[3398]: firewall update time (27.120 seconds)
Jan 11 22:00:07 bespin pve-ha-crm[3503]: loop take too long (52 seconds)
Jan 11 22:00:07 bespin pve-firewall[3398]: firewall update time (5.811 seconds)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@

Can anybody tell us what went wrong on this machine?

Thanks
Rene
 
What type of server do you use exactly? Do you have hardware management like HP iLO or Dell iDRAC? If yes, such events are normally logged there too. Maybe it is a problem with a hardware watchdog, or a memory error.
 
Hi,

the server is a Dell R630 with iDRAC Enterprise. But we are monitoring the iDRAC and haven't noticed any hardware error messages.
 
OK, what would I do in this case? I would upgrade the Dell firmware to the latest version, check that the watchdog timer is disabled, update PVE to the latest version, and contact Dell support.
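
To compare against the iDRAC log it helps to pin down when the node actually went down. The run of ^@ characters in the syslog excerpt is a block of NUL bytes, which usually means the file was not written out cleanly before the reset. A minimal Python sketch that finds such runs and reports the last readable line before each one could look like this. The function name find_nul_runs and the min_run threshold are my own choices for illustration, not anything shipped with PVE.

```python
import re


def find_nul_runs(data: bytes, min_run: int = 16):
    """Return (last_readable_line, nul_count) for each run of NUL bytes.

    min_run is an arbitrary threshold (an assumption) so that a stray
    single NUL byte is not reported as a crash marker.
    """
    results = []
    pattern = re.compile(rb"\x00{%d,}" % min_run)
    for match in pattern.finditer(data):
        # Everything before the NUL run, without the trailing newline.
        before = data[:match.start()].rstrip(b"\n")
        # The last complete line written before the NUL block.
        last_line = before.rsplit(b"\n", 1)[-1].decode("utf-8", errors="replace")
        results.append((last_line, match.end() - match.start()))
    return results
```

Read the log in binary mode, for example open("/var/log/syslog", "rb").read(), and pass the bytes to find_nul_runs. The timestamp of each returned line is only a lower bound for the time of the reset, since syslog may not have flushed its last entries.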