[TUTORIAL] Fix always high CPU frequency on the Proxmox host.

Thanks @masgo for doing this research. I would not have found it on my own.

I have two dual socket Intel Xeon E5 Broadwell systems and it seems Proxmox is defaulting to intel_cpufreq (i.e. acpi_cpufreq) for me.

I have all the intel_pstate options enabled in the BIOS correctly (as far as I know).

Any other thoughts as to why Proxmox would fall back to intel_cpufreq?
 
I could not find a reason why it would not use intel_pstate. It might somehow detect the CPU as incompatible, but it should be compatible.

You could try to use GRUB flags to enable it manually. For testing purposes you can change the flags at boot time from within the GRUB menu. Such changes are gone after a reboot, so you can test and only make permanent changes when you are satisfied with the result.
Have a look here: https://askubuntu.com/a/19487/53410
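
A minimal sketch for the test described above, assuming a GRUB-booted host: at the GRUB menu press e, append a flag such as intel_pstate=active or intel_pstate=passive (both are documented kernel parameters) to the line starting with linux, then boot. Once the system is up, the snippet below reports which intel_pstate= flag the running kernel actually received. The helper name pstate_flag is made up for illustration.

```shell
#!/bin/sh
# Print the value of the last intel_pstate= token in a kernel command line,
# or "unset" if there is none.
# Usage: pstate_flag "<kernel command line>"
pstate_flag() (
    set -f                      # no filename expansion while word-splitting
    value=unset
    for word in $1; do
        case "$word" in
            intel_pstate=*) value=${word#intel_pstate=} ;;
        esac
    done
    printf '%s\n' "$value"
)

# Report the flag for the running kernel, if /proc/cmdline is readable.
if [ -r /proc/cmdline ]; then
    echo "intel_pstate flag: $(pstate_flag "$(cat /proc/cmdline)")"
fi
```

If this prints unset after a normal boot, the driver choice was made by the kernel on its own and not by a boot flag.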
 
I'm also curious what determines which CPU frequency driver is selected. I have a desktop (Ubuntu 20.04, Skylake) and a server (Proxmox 7, Ivy Bridge). The desktop selects intel_pstate and the server uses intel_cpufreq. AIUI the Ivy Bridge system could safely use intel_pstate too.

Desktop (Ubuntu 20.04 Skylake/i5-6600)
Bash:
~$ uname -srvmpi
Linux 5.4.0-100-generic #113-Ubuntu SMP Thu Feb 3 18:43:29 UTC 2022 x86_64 x86_64 x86_64
~$ cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 4294.55 ms.
  hardware limits: 800 MHz - 3.90 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 800 MHz and 3.90 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 800 MHz.

Proxmox 7 (Ivybridge/i7-3770S)
Bash:
~$ uname -srvmpi
Linux 5.13.19-5-pve #1 SMP PVE 5.13.19-12 (Mon, 07 Mar 2022 15:54:28 +0100) x86_64 unknown unknown
~$ cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: intel_cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 20.0 us.
  hardware limits: 1.60 GHz - 3.90 GHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance, schedutil
  current policy: frequency should be within 1.60 GHz and 3.90 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 2.04 GHz.

What struck me is that the uname -srvmpi output on my Proxmox system doesn't identify the CPU. I have a slightly flaky BIOS on that older board and I wonder if something (maybe an incomplete ACPI implementation) results in the default selection of the intel_cpufreq driver. I came across a thread on linuxquestions.org which mentions that intel_cpufreq is the default. I believe you can force the choice of frequency driver post-boot as well as via GRUB. I haven't tried that yet.
 
What struck me is the uname -srvmpi output on my proxmox system doesn't identify the CPU.
This is probably not the cause of the driver selection.
I have an Intel(R) Core(TM) i5-8400 CPU with Proxmox 7; it also shows unknown in uname but uses intel_pstate.

Code:
# uname -srvmpi
Linux 5.13.19-4-pve #1 SMP PVE 5.13.19-9 (Mon, 07 Feb 2022 11:01:14 +0100) x86_64 unknown unknown

# cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 4294.55 ms.
  hardware limits: 800 MHz - 4.00 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 800 MHz and 4.00 GHz.
                  The governor "powersave" may decide which speed to use
                  within this range.
  current CPU frequency is 1.05 GHz.
...
# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 158
model name      : Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz
stepping        : 10
microcode       : 0xc6
cpu MHz         : 2800.000
cache size      : 9216 KB
physical id     : 0
siblings        : 6
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
vmx flags       : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple shadow_vmcs pml ept_mode_based_exec
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
bogomips        : 5599.85
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:
...
 
Interesting. Do you think there would be instability if I forced the intel_pstate driver? Or would it just get stuck in a single P-state if the driver can't figure out which states it's allowed to switch to?
 
My understanding of the intel_pstate driver is that frequency scaling is no longer done by an algorithm within the kernel; instead the CPU itself determines how fast it should go (but I might be wrong). Therefore you should only be able to force the driver if the CPU supports it. If it does, I would expect it to run fine.

You could also try to boot a current Ubuntu Live from USB and have a look which driver gets chosen. If it's pstate, then you are most likely fine.
 
I doubt there is any way to “force” the intel_pstate driver if the default kernel doesn’t enable it directly. I can’t seem to get it enabled on my old T110 II with a “supported” E31230 Sandy Bridge. I suspect it is due to the ancient Dell firmware not fully supporting them, even though the processor does.
 
This is the default behavior in Proxmox and this thread is about disabling this behavior, so I'm not sure how your comment adds to the discussion at all. Am I missing something?
Not exactly. The performance governor will most often cause max frequency, but by forcing a maximum C-state you limit it on another, deeper level.
 
On AMD, on both Proxmox 6.x and 7.1, my system uses acpi_idle, which gives no control at all and no C-states below C1. I have been trying to get it to use acpi_cpufreq.

Code:
# cpupower frequency-info
analyzing CPU 0:
  no or unknown cpufreq driver is active on this CPU
  CPUs which run at the same hardware frequency: Not Available
  CPUs which need to have their frequency coordinated by software: Not Available
  maximum transition latency:  Cannot determine or is not supported.
Not Available
  available cpufreq governors: Not Available
  Unable to determine current policy
  current CPU frequency: Unable to call hardware
  current CPU frequency:  Unable to call to kernel
  boost state support:
    Supported: yes
    Active: no
    Boost States: 0
    Total States: 3
    Pstate-P0:  1100MHz
    Pstate-P1:  2700MHz
    Pstate-P2:  2500MHz

Code:
 # cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  no or unknown cpufreq driver is active on this CPU
  maximum transition latency: 4294.55 ms.
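
As a rough cross-check for the C-state side of this, the kernel lists the idle states it exposes under the cpuidle directory in sysfs. The sketch below prints the active cpuidle driver and the state names for CPU 0; the function name list_idle_states is only illustrative, and the paths assume a kernel built with cpuidle sysfs support.

```shell
#!/bin/sh
# List the idle states the kernel exposes for one CPU.
# Argument: cpuidle directory (defaults to CPU 0 in sysfs).
list_idle_states() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    found=0
    for state in "$dir"/state*; do
        # Skip the unexpanded glob when no state directories exist.
        [ -r "$state/name" ] || continue
        found=1
        printf '%s: %s\n' "${state##*/}" "$(cat "$state/name")"
    done
    [ "$found" -eq 1 ] || echo "no cpuidle states exposed"
}

# Show the active cpuidle driver, if the kernel reports one.
driver=/sys/devices/system/cpu/cpuidle/current_driver
[ -r "$driver" ] && echo "cpuidle driver: $(cat "$driver")"
list_idle_states
```

If only a polling state and C1 show up here, the limit sits in what the firmware/ACPI tables advertise or in a max_cstate style boot parameter, not in the cpufreq governor.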
 
The reason why intel_cpufreq is (apparently) used on some systems and intel_pstate on others might be that intel_pstate is operating in passive mode. See https://www.kernel.org/doc/html/lat...tate.html?highlight=intel_pstate#passive-mode for details.

You can check /sys/devices/system/cpu/intel_pstate/status to see which mode is active for the intel_pstate driver. If passive mode is chosen, the scaling_driver policy attribute in sysfs contains the string “intel_cpufreq” for all CPUFreq policies.
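
A small sketch of that check. It reads the status file and prints which scaling_driver name to expect; the function name pstate_mode is made up here, and the three values handled (active, passive, off) are the ones the kernel documentation lists for this file.

```shell
#!/bin/sh
# Map the intel_pstate status to the scaling_driver name it implies.
# Argument: path to the status file (defaults to the real sysfs location).
pstate_mode() {
    status_file=${1:-/sys/devices/system/cpu/intel_pstate/status}
    if [ ! -r "$status_file" ]; then
        echo "intel_pstate not loaded (no status file)"
        return 0
    fi
    mode=$(cat "$status_file")
    case "$mode" in
        active)  echo "active: scaling_driver reads intel_pstate" ;;
        passive) echo "passive: scaling_driver reads intel_cpufreq" ;;
        off)     echo "off: intel_pstate is not registered" ;;
        *)       echo "unexpected status: $mode" ;;
    esac
}

pstate_mode
```

According to the same kernel documentation, root can also write active or passive to that file to switch modes at runtime. Whether the switch succeeds depends on the hardware and firmware, so treat it as something to try rather than a guaranteed fix.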
 
I've noticed in my little homelab Ryzen 5600G DeskMini that all cores hover around 3.6-3.8 GHz. It would be nice if it worked like a normal PC, where the load determines the CPU state in the default configuration.

Thanks and bye :)
 
I think it's generally not that great with Ryzen 5000. Even when my bare-metal Win11 workstation with a 5800X is idling, the clock is always around 3.7 GHz.
So Windows also won't clock down the Ryzen here, even if the CPU clock is set to min 1% / max 100% in the energy profile. The only way I can get the Ryzen to clock below 3.7 GHz is to limit the maximum clock, and that isn't an option either, as it then won't clock up when performance is needed.
 
It would be nice if it worked as a normal PC where the load determined the CPU state as the default configuration.

schedutil was the default for a short time, but was quickly reverted to performance because people reported performance hiccups and inconsistency, afaik.

Did you test with schedutil, for example, and did you actually notice a reduction in power draw (and therefore temperature)?
I tested it with my two Ryzens and did not notice any difference in power consumption (according to my APC UPSs) or temperature...
Edit: And yes, clock speeds did of course go down with schedutil.
 
I'll read the link in the first post, see if that helps. Maybe I should spend the time and try.

The issue I have is that with AMD CPB (Core Performance Boost, which isn't PBO, i.e. Precision Boost Overdrive) enabled, the SoC voltage is constantly 1.4V, which causes excess heat and fan noise in my (granted, not server-grade) ASRock DeskMini X300 + 5600G. I believe the voltage at idle should be 1.1V or lower, so 1.4V basically means the motherboard is constantly pumping high voltage to the CPU (I assume due to this scheduling/driver issue or something similar).

Attached pic for reference. s-tui shows 3.8 GHz and /proc/cpuinfo shows 4.4 GHz; the bottom line is that the CPU appears to be running full tilt with no workload. This in turn tells the CPU to use its CPB feature, which adds higher voltage, and I end up with 1.4V in this tiny home server, which is meant to run cool and power efficient. Even with a Noctua CPU cooler, the fan spins so fast that I can hear it at idle.

Ryzen default.jpg
 
I also did a lot of research online, and most sources recommend not lowering the clock and letting the Ryzen run at high clocks, because Ryzens are designed to quickly switch a core between working at full performance and sleeping, and lowering the clocks would prevent that from working well. So instead of an analogue-like approach (where energy is saved by finely lowering/increasing the clocks and voltages based on load), it's more like a PWM approach (switching between high clock and zero clock with a duty cycle that depends on the load).
But I really don't like it that way. It might be true that it feels snappier and might not consume much more power, because work is done faster and the core can sleep longer to compensate, but I would really prefer to run the cores at lower voltages when there is no load, to be more efficient.
 
I wanted an overview; at least for myself. :cool:


Check currently used driver:
Bash:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver

Check currently used governor:
Bash:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Check current clock speeds (once):
Bash:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq

Check current clock speeds (continuous):
Bash:
watch -n 1 cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq

Check available governors:
Bash:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_available_governors

Set governor, for example schedutil (temporary):
Bash:
echo "schedutil" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Set governor, for example schedutil (permanent):
Add: cpufreq.default_governor=schedutil to the kernel command line [1].

[1] https://pve.proxmox.com/wiki/Host_Bootloader#sysboot_edit_kernel_cmdline
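
The individual checks above can be rolled into one line per CPU. This is only a sketch: the function name cpufreq_summary is made up, it assumes the standard cpufreq sysfs layout, and it relies on scaling_cur_freq being reported in kHz.

```shell
#!/bin/sh
# Print driver, governor and current clock (in MHz) for every CPU policy.
# Argument: base directory (defaults to the real sysfs CPU directory).
cpufreq_summary() {
    base=${1:-/sys/devices/system/cpu}
    for policy in "$base"/cpu[0-9]*/cpufreq; do
        # Skip CPUs without a cpufreq policy (or the unexpanded glob).
        [ -r "$policy/scaling_driver" ] || continue
        cpu=${policy%/cpufreq}
        cpu=${cpu##*/}
        # scaling_cur_freq is in kHz; fall back to 0 if it cannot be read.
        khz=$(cat "$policy/scaling_cur_freq" 2>/dev/null || echo 0)
        printf '%s %s %s %s MHz\n' "$cpu" \
            "$(cat "$policy/scaling_driver")" \
            "$(cat "$policy/scaling_governor")" \
            "$((khz / 1000))"
    done
}

cpufreq_summary
```

Wrapping the call in watch gives the continuous view, as long as the function is saved to a script file first so that watch can execute it.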
 