I use a Supermicro server with PVE 6.2. I set up the watchdog this way:
Code:
/etc/default/pve-ha-manager:
WATCHDOG_MODULE=ipmi_watchdog
Code:
/etc/modprobe.d/ipmi_watchdog.conf:
options ipmi_watchdog action=power_cycle panic_wdt_timeout=10
Code:
/etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet nmi_watchdog=0"
Here is what I see in the output:
Code:
# ipmitool mc watchdog get
Watchdog Timer Use: SMS/OS (0x44)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: Power Cycle (0x03)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x00
Initial Countdown: 10 sec
Present Countdown: 9 sec
The server ran perfectly for months, and today it rebooted at a random moment for no apparent reason.
Is there any way to debug what the root cause was?
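One place I could start looking, as a rough sketch only: since the timer use byte above (0x44) does not have the "don't log" bit set, a watchdog expiration would normally be recorded by the BMC in its System Event Log, typically as a Watchdog2 sensor event. The snippet below filters the SEL for such entries. It assumes the BMC still holds the relevant SEL records, and `filter_watchdog_events` is a hypothetical helper name, not part of ipmitool.

```shell
#!/bin/sh
# Hypothetical helper: keep only the SEL lines that mention the watchdog.
filter_watchdog_events() {
    grep -i 'watchdog' || true   # no match is not an error here
}

# Only query the BMC when ipmitool is actually installed (needs root).
if command -v ipmitool >/dev/null 2>&1; then
    ipmitool sel elist 2>/dev/null | filter_watchdog_events
fi
```

If an entry shows up with a timestamp matching the reboot, the watchdog fired and the question becomes why the OS stopped resetting the timer within 10 seconds. If the journal is persistent on this host (an assumption), `journalctl -b -1 -e` shows the tail of the previous boot's log, which may reveal what the system was doing just before the reset.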