[SOLVED] Intel S5520HC Xeon reboots

Jun 8, 2016
Johannesburg, South Africa
We repurposed some old hardware to set up a proper sandbox environment. The cluster has 4 Dell R620 servers and a relatively old Intel S5520HC system with Intel E5620 (Westmere) CPUs. This last node won't host virtual machines; it serves primarily as a dedicated Ceph storage node.

The system had 1.5+ years of uptime running RHEL5, but we couldn't get it to stay up longer than 2 hours when booting Proxmox, Debian 9, 10, 11 or CentOS 7. This happened even when we simply booted the system into a rescue environment and left it sitting there.

System event logs (ipmi-sel) would report the following error:
Code:
3   | Sep-18-2019 | 07:17:14 | Pwr Unit Status  | Power Unit               | Power Unit Failure detected
4   | Sep-18-2019 | 07:17:19 | Pwr Unit Status  | Power Unit               | Power Off/Power Down
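For anyone chasing the same symptom, here is a small sketch for pulling only the Power Unit rows out of a long event log. It assumes the pipe-separated column layout shown above, with the sensor type in the fifth column; `power_events` is just a helper name made up for this example.

```shell
#!/bin/sh
# power_events: read ipmi-sel style output on stdin and print only the
# rows whose fifth column (sensor type) is exactly "Power Unit".
# Assumes the pipe-separated layout shown in the log excerpt above.
power_events() {
    awk -F'|' '{
        t = $5                    # work on a copy so the row prints unchanged
        gsub(/^ +| +$/, "", t)    # trim the column padding
        if (t == "Power Unit") print
    }'
}

# Typical use on the affected host (needs FreeIPMI and usually root):
#   ipmi-sel | power_events
```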


The physical symptoms were the system resetting and beeping a couple of times before it restarted itself again.


It turns out that a lot of (all?) Nehalem and early Westmere Intel Xeon CPUs have a physical design flaw that shows up when the CPUs switch into various low-power C-states. We carefully recorded the current BIOS settings, reset them to defaults and compared the defaults against the values we had recorded. This showed that the Processor options had previously been altered to disable the C6 power state, whilst C3 is disabled by default.

I assume Linux kernels newer than the one in RHEL5 initialise CPUs more directly, possibly ignoring or bypassing the BIOS, and that they consequently ignore C-states having been disabled there. The solution was to pass the following options to the kernel at boot, and the system has been stable ever since!


/etc/default/grub
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_pstate=disable intel_idle.max_cstate=0 processor.max_cstate=1"


PS: We always run with 'intel_pstate' disabled anyway; the two cstate options are what we added for this fix.
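Editing /etc/default/grub does nothing until the GRUB config is regenerated and the machine rebooted. On Debian and Proxmox that is normally `update-grub`; on CentOS 7 it is typically `grub2-mkconfig -o /boot/grub2/grub.cfg` (the path may differ on UEFI installs). The following is a rough sketch for confirming after the reboot that the running kernel really received both options. `has_param` and `check_cmdline` are helper names made up for this example.

```shell
#!/bin/sh
# Sketch: confirm the two C-state options reached the running kernel.

# has_param CMDLINE PARAM: succeed only if PARAM appears as a whole
# space-separated word in CMDLINE (so "max_cstate=10" does not match "=1").
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# check_cmdline CMDLINE: print one status line per expected option.
check_cmdline() {
    for p in intel_idle.max_cstate=0 processor.max_cstate=1; do
        if has_param "$1" "$p"; then
            echo "ok: $p"
        else
            echo "MISSING: $p"
        fi
    done
}

# On the live system, check what the kernel was actually booted with.
if [ -r /proc/cmdline ]; then
    check_cmdline "$(cat /proc/cmdline)"
fi
```

As a further sanity check, `cat /sys/devices/system/cpu/cpuidle/current_driver` shows which idle driver is in use. With `intel_idle.max_cstate=0` I would expect it to no longer report `intel_idle`.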
 
