HPET and watchdog

wosp

Renowned Member
Apr 18, 2015
The Netherlands
This weekend I upgraded a Proxmox VE 3.4 cluster to Proxmox VE 4.1. Everything works as expected, but on the 3.4 cluster we had "hpet=disable" in /etc/default/grub, because without it we get many (strange) log entries about hpet. After the upgrade I removed "hpet=disable" to see if the issue is still there in 4.1, and yes, the log entries came back. So I re-added "hpet=disable" to /etc/default/grub and rebooted all nodes. After roughly an hour one node decided to fence itself (watchdog timer expired), and three hours later the same thing happened. So I removed "hpet=disable" again, rebooted all nodes, and now there is no problem anymore (besides the log entries).
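For reference, setting the parameter looks roughly like this on a Proxmox/Debian node (a sketch; the rest of /etc/default/grub will differ per system):

Code:
# /etc/default/grub (excerpt) - append hpet=disable to the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet hpet=disable"

# regenerate the GRUB configuration and reboot the node afterwards
update-grub
reboot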

Example of log entries:

Code:
Apr 17 05:02:32 node02 kernel: [27974.504137] CE: hpet4 increased min_delta_ns to 20115 nsec
Apr 17 08:04:25 node02 kernel: [38888.003432] CE: hpet increased min_delta_ns to 20115 nsec
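
For anyone comparing: which clocksource the kernel is actually using can be checked via sysfs (standard paths on recent kernels; the outputs below are only examples):

Code:
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
# e.g. tsc (or hpet)
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
# e.g. tsc hpet acpi_pm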

Any experiences with this?

Edit: FWIW: I use a hardware watchdog (iDRAC).
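For context: on Proxmox VE 4.x the HA stack picks its watchdog module via /etc/default/pve-ha-manager; using the IPMI/iDRAC watchdog instead of the softdog fallback looks roughly like this (a sketch, not necessarily the exact setup used here):

Code:
# /etc/default/pve-ha-manager
WATCHDOG_MODULE=ipmi_watchdog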
 
Nope, no special BIOS settings. We only use Dell servers, and I only see this issue on Dell PowerEdge R310 servers with the X3450 CPU, not on any other Dell server (R610, R710, R320 and R420 tested). I don't know for sure whether other CPUs are also affected, since I only have R310s with X3450s. But it's not a real problem: when I just disable hpet in the GRUB config and switch to ntp instead of systemd-timesyncd, everything works fine. :)
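(The systemd-timesyncd to ntp switch is just the usual Debian procedure, roughly:)

Code:
systemctl stop systemd-timesyncd
systemctl disable systemd-timesyncd
apt-get install ntp   # the ntp daemon is enabled and started on install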
 
You should really check your BIOS for the "max performance" profile.
I think your hpet/clock problem is caused by an energy-saving/ondemand profile changing the CPU clock speed or enabling/shutting down cores.
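If the BIOS hands frequency control to the OS, the active cpufreq governor and the current core frequencies can be checked like this (the sysfs files may be absent when the BIOS keeps control itself):

Code:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# e.g. ondemand or performance
grep "cpu MHz" /proc/cpuinfo | sort -u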
 

I just checked 3 nodes, one R310 with the "problem", one R610 without the problem and one R710 without the problem.
On all 3 nodes I use the same settings for power management (Active Power Controller), which is the default, and I can choose between:

- Active Power Controller
- Maximum Performance
- OS Control
- Custom

But, since I don't have the problem on the R610s and R710s, we can only conclude that this isn't the problem. Right?
 

Attachments

  • R310.png
  • R610.png
  • R710.png
You should really set them to Max Performance.

It'll improve the stability of the VMs. (For example, with the default profile, when a VM causes a CPU spike and the CPU frequency or active cores change, the clock drifts, and sometimes this can crash the VM.)
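As a side note on the clock-drift point: inside a Linux KVM guest the same sysfs file shows which clocksource the VM itself uses (normally kvm-clock), which is a quick check when a guest's clock is drifting:

Code:
# run inside the VM, not on the host
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
# e.g. kvm-clock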
 
Okay, I'm changing them to "Max Performance" now on all nodes in all clusters (time-consuming job ;)). Will also remove the hpet=disable, just to see if the error messages come back. Will let you know soon. Thanks!
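(After the reboot it's easy to verify whether the parameter is really gone by looking at the running kernel's command line:)

Code:
cat /proc/cmdline
# should no longer contain hpet=disable after this change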
 
After I changed to the "Max Performance" power profile and removed "hpet=disable" from the GRUB config, the hpet log entries came back:

Code:
May 19 22:22:05 node01 kernel: [ 2727.779396] CE: hpet3 increased min_delta_ns to 20115 nsec
May 20 00:20:25 node03 kernel: [ 7720.535187] CE: hpet3 increased min_delta_ns to 20115 nsec
May 20 00:51:01 node02 kernel: [10942.805762] CE: hpet increased min_delta_ns to 20115 nsec

So the power profile wasn't the problem. However, I will keep using the "Max Performance" power profile, since it seems to be the better choice. But I again need to add "hpet=disable" to the GRUB config. Thanks anyway.
 
