Fan Control, LM-Sensors HP Workstation | CPU Temp too high??

gs800uk

Oct 2, 2023
Hi all,

I wanted to reach out to anyone with experience running Proxmox on HP workstations (in my case a Z8 G4 with 48 x Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz, 2 sockets) about how to control the fans and CPU temperatures.

I currently run around 5 Windows VMs on this machine with a fairly low load, about 5% host CPU usage on average, so nothing major at this point. But the output of watch sensors shows CPU temps averaging 60-65°C, which seems pretty high considering this CPU has a max Tcase of 85°C. Obviously this is a concern, as I will be adding more VMs with heavier loads.

Investigating further, all the system fans 'sound' like they are running at their lowest RPM. To test, I created another Windows VM (assigned 24 cores) and ran Prime95 to put more stress on the host and see if the fans kick in. They don't: I got near the 85°C mark and stopped the test. The fans seem to stay at the lowest RPM regardless, so I looked in the BIOS settings to see if the fan curve could be adjusted. The only option there is a 'min fan speed', which I set to 20% to be safe.

What I noticed is that the sensors output seems to have the wrong high and critical temps, since as mentioned above Intel says these chips are 85°C max:

Code:
coretemp-isa-0001
Adapter: ISA adapter
Package id 1:  +59.0°C  (high = +93.0°C, crit = +103.0°C)
Core 0:        +59.0°C  (high = +93.0°C, crit = +103.0°C)
Core 1:        +48.0°C  (high = +93.0°C, crit = +103.0°C)
Core 2:        +47.0°C  (high = +93.0°C, crit = +103.0°C)
Core 3:        +50.0°C  (high = +93.0°C, crit = +103.0°C)
Core 9:        +49.0°C  (high = +93.0°C, crit = +103.0°C)
Core 10:       +47.0°C  (high = +93.0°C, crit = +103.0°C)
Core 16:       +50.0°C  (high = +93.0°C, crit = +103.0°C)
Core 18:       +47.0°C  (high = +93.0°C, crit = +103.0°C)
Core 19:       +51.0°C  (high = +93.0°C, crit = +103.0°C)
Core 24:       +49.0°C  (high = +93.0°C, crit = +103.0°C)
Core 26:       +47.0°C  (high = +93.0°C, crit = +103.0°C)
Core 27:       +48.0°C  (high = +93.0°C, crit = +103.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +51.0°C  (high = +93.0°C, crit = +103.0°C)
Core 16:       +49.0°C  (high = +93.0°C, crit = +103.0°C)
Core 9:        +49.0°C  (high = +93.0°C, crit = +103.0°C)
Core 10:       +49.0°C  (high = +93.0°C, crit = +103.0°C)
Core 11:       +48.0°C  (high = +93.0°C, crit = +103.0°C)
Core 2:        +48.0°C  (high = +93.0°C, crit = +103.0°C)
Core 17:       +50.0°C  (high = +93.0°C, crit = +103.0°C)
Core 18:       +51.0°C  (high = +93.0°C, crit = +103.0°C)
Core 19:       +48.0°C  (high = +93.0°C, crit = +103.0°C)
Core 24:       +48.0°C  (high = +93.0°C, crit = +103.0°C)
Core 25:       +48.0°C  (high = +93.0°C, crit = +103.0°C)
Core 26:       +50.0°C  (high = +93.0°C, crit = +103.0°C)
Core 27:       +47.0°C  (high = +93.0°C, crit = +103.0°C)

I've tried the fancontrol setup in the hope that I could change these incorrect high and crit values to what they should be, or at least put my own custom behaviour in place, but I get 'There are no pwm-capable sensor modules installed'.
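
For anyone hitting the same message: as far as I know, it means no loaded hwmon driver exposes PWM control files, which on this kind of board usually means the firmware keeps fan control to itself. A minimal sketch for checking this by hand, assuming the usual sysfs layout where a controllable fan shows up as a pwmN file under /sys/class/hwmon/hwmonX (the function name find_pwm_controls is just an illustration, not part of lm-sensors):

```python
from pathlib import Path
import re


def find_pwm_controls(hwmon_root="/sys/class/hwmon"):
    """Return {"hwmonX (chip name)": ["pwm1", ...]} for devices exposing pwmN files.

    Assumption: PWM outputs appear as files named pwm1, pwm2, ... directly
    inside each hwmon device directory (the common layout on recent kernels).
    """
    found = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        if not dev.is_dir():
            continue
        # The "name" file holds the driver/chip name, e.g. "coretemp".
        name_file = dev / "name"
        chip = name_file.read_text().strip() if name_file.is_file() else "unknown"
        # Match only pwm1, pwm2, ... and not helpers such as pwm1_enable.
        pwms = sorted(p.name for p in dev.iterdir() if re.fullmatch(r"pwm\d+", p.name))
        if pwms:
            found[f"{dev.name} ({chip})"] = pwms
    return found


if __name__ == "__main__":
    controls = find_pwm_controls()
    if not controls:
        print("No pwmN files found: no loaded driver offers fan speed control.")
    for device, pwms in controls.items():
        print(device, "->", ", ".join(pwms))
```

If this prints nothing but coretemp-style devices (temperatures only, no pwmN files), then fancontrol has nothing to drive and the BIOS setting is likely the only knob available.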

What I would like to know is: is there any other way I can control this or edit those values? Perhaps I don't have the correct modules installed? I'm at a loss as to where to look next and would greatly appreciate any help :)
 
Did you ever get this figured out? Right now I have to use the BIOS, but that means an unnecessary restart.
 
I did figure it out.

I believe the values reported by that command are core (junction) temperatures, which are measured against Tjmax, as opposed to Tcase. From what I read, the two can differ by 15-20°C.

According to https://ark.intel.com/content/www/u...old-6136-processor-24-75m-cache-3-00-ghz.html my CPU's Tcase is 85°C.

So it makes sense that I panicked over the wrong type of reading. To take it a step further, I ran Prime95 in one of my VMs for 30 minutes to raise these temps to 80-85°C (the Tjmax-based reading), with both CPUs bouncing between 60%, 80% and 100% usage, and it stayed around this temp with no thermal shutdown etc. I ran this test a good 5 times and it still performed well.
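
To compare like with like, the reading can be checked against the crit value that sensors itself prints rather than against Tcase. A rough sketch, assuming the sensors text format shown in the first post; the headroom function and the SAMPLE string are only for illustration, and the margin is simply crit minus the current reading:

```python
import re

# Matches lines such as:
#   Package id 1:  +59.0°C  (high = +93.0°C, crit = +103.0°C)
LINE_RE = re.compile(
    r"^(?P<label>[^:]+):\s+\+?(?P<temp>-?\d+(?:\.\d+)?)°C\s+"
    r"\(high = \+?(?P<high>-?\d+(?:\.\d+)?)°C, "
    r"crit = \+?(?P<crit>-?\d+(?:\.\d+)?)°C\)"
)


def headroom(sensors_text):
    """Return a list of (chip, label, temp, margin_to_crit) tuples."""
    rows = []
    chip = None
    for raw in sensors_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match:
            temp = float(match["temp"])
            crit = float(match["crit"])
            rows.append((chip, match["label"].strip(), temp, crit - temp))
        elif ":" not in line:
            # A line without a colon is a chip header, e.g. coretemp-isa-0001.
            chip = line
        # Other lines ("Adapter: ISA adapter") are ignored.
    return rows


# Two lines copied from the sensors output earlier in the thread.
SAMPLE = """coretemp-isa-0001
Adapter: ISA adapter
Package id 1:  +59.0°C  (high = +93.0°C, crit = +103.0°C)
Core 1:        +48.0°C  (high = +93.0°C, crit = +103.0°C)
"""

if __name__ == "__main__":
    for chip, label, temp, margin in headroom(SAMPLE):
        print(f"{chip} {label}: {temp:.1f}°C, {margin:.1f}°C below crit")
```

To use it on live data, the SAMPLE string could be replaced with the captured output of the sensors command.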

In the HP BIOS I still have the minimum fan speed set to 20% to be safe, and I have been running this machine for a good 8 months now with no problems.
 
