WTR Pro CPU throttling

phar

New Member
Jan 3, 2025
Hi all - happy new year!!

I'm running PVE 8.3.2 on an AOOSTAR WTR Pro. It has an AMD Ryzen 7 5825U that boosts up to 4.3 GHz. The problem I am having is that Proxmox seems to limit the CPU core speeds to about 2.3 GHz max. I have checked the CPU governor and it's set to performance.

Code:
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance


If I run stress-ng --cpu 16 --timeout 60 on the host and then watch the CPU with watch -n 1 "cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq", this is what I get (values are in kHz):

Code:
1916525
1916540
1916517
1916517
2395632
2395645
1916517
1916567
1916545
1916522
1916521
1916516
1916523
1916526
1916532
1916534
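
For anyone reproducing this: scaling_cur_freq reports kHz, so the values above are roughly 1.9 to 2.4 GHz. Below is a minimal sketch that condenses the per-core lines into one summary line. freq_summary is a made-up helper name, and the optional directory argument is only there so the function can be pointed somewhere other than the real sysfs tree.

```shell
# Summarise per-core frequencies (sysfs reports kHz) as min/avg/max in MHz.
# $1 = CPU sysfs root; defaults to the standard location.
freq_summary() {
    cat "${1:-/sys/devices/system/cpu}"/cpu*/cpufreq/scaling_cur_freq 2>/dev/null |
        awk '
            NR == 1 { min = $1 + 0; max = $1 + 0 }
            {
                sum += $1
                if ($1 + 0 < min) min = $1 + 0
                if ($1 + 0 > max) max = $1 + 0
            }
            END {
                if (NR == 0) { print "no cpufreq data found"; exit 1 }
                # %d truncates, so 1916525 kHz is shown as 1916 MHz
                printf "cores=%d min=%d MHz avg=%d MHz max=%d MHz\n",
                    NR, min / 1000, sum / NR / 1000, max / 1000
            }'
}
```

Usage while stress-ng is running: while sleep 1; do freq_summary; done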

I've live-booted ZorinOS 17 on the same system and run the same stress test, and every core maxes out at 4.3 GHz as expected. This tells me there is nothing in my BIOS limiting performance and that the issue is with my Proxmox install.

Any suggestions on where to go here?
 
It's not throttling, just not boosting. Have you tried enabling the AMD pstate driver? https://forum.proxmox.com/threads/amd-pstate-driver-steps-and-discussion.118873/

Whilst I haven't specifically enabled it, I believe it's the default and so is enabled already. Here's the output from cpupower:


Code:
root@pve:~# cpupower frequency-info
analyzing CPU 0:
  driver: amd-pstate-epp
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 400 MHz - 4.55 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 400 MHz and 4.55 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 1.92 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes
    Boost States: 0
    Total States: 3
    Pstate-P0:  2000MHz
    Pstate-P1:  1800MHz
    Pstate-P2:  1600MHz
 
The P0 state doesn’t look right at all. Surely that should be the max boost frequency of 4.55 GHz?
 
I've live booted ZorinOS 17 on the same system and run the same stress test and every core maxes at 4.3Ghz as expected.

What driver was used there? Also amd-pstate-epp?
Did you compare the cpupower outputs of both in general?
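
One way to make that comparison concrete is to dump the same handful of cpufreq settings on both systems and diff the results. This is only a sketch: cpufreq_snapshot is a hypothetical helper, and not every file exists with every driver or kernel version, which is why missing ones are reported rather than treated as errors.

```shell
# Print the cpufreq settings that matter for boosting, one "name: value" per line.
# $1 = CPU sysfs root; defaults to the real one.
cpufreq_snapshot() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    echo "kernel: $(uname -r)"
    for f in \
        cpu0/cpufreq/scaling_driver \
        cpu0/cpufreq/scaling_governor \
        cpu0/cpufreq/cpuinfo_max_freq \
        cpu0/cpufreq/scaling_max_freq \
        cpu0/cpufreq/energy_performance_preference \
        amd_pstate/status \
        cpufreq/boost
    do
        if [ -r "$cpu_root/$f" ]; then
            echo "$f: $(cat "$cpu_root/$f")"
        else
            # Some files only exist with certain drivers or kernels
            echo "$f: (not present)"
        fi
    done
}
```

Usage: run cpufreq_snapshot > pve.txt on Proxmox and cpufreq_snapshot > zorin.txt on the live system, then diff pve.txt zorin.txt.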
 
I think I found a solution, try upgrading to kernel 6.11 https://forum.proxmox.com/threads/o...e-8-available-on-test-no-subscription.156818/

On 6.8 I was stuck at 2.0 GHz no matter what.
Once I upgraded to 6.11 I was able to get 3.1 GHz all-core at 69 °C (w/ PTM7950) @ 20 W PL1(?),
which settles to 2.8 GHz all-core at 15 W and 61 °C PL2,
and 43 °C at idle.
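
Before trying the upgrade it helps to confirm which kernel is actually running. A small sketch, assuming GNU sort with -V is available (it is on a stock PVE install); version_at_least is a made-up helper name.

```shell
# Return 0 if version $1 is at least version $2, using sort -V for the comparison.
version_at_least() {
    # The smaller of the two versions sorts first; if that is $2, then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

Usage: version_at_least "$(uname -r)" 6.11 && echo "already on 6.11 or newer". As far as I know the opt-in kernel is installed with apt install proxmox-kernel-6.11 followed by a reboot, but that package name is from memory, so check the linked thread first.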

1736966456830.png

In the BIOS you should be able to set higher TDP limits. I haven't had the chance to plug in my PiKVM to mess with the BIOS settings yet.
 
OK, I've found the trigger for this problem. It happens when I pass through the SATA controllers on my WTR Pro in Proxmox. FWIW, I am running kernel 6.11.11-1-pve, but the same occurs on 6.8.

When I stop all VMs, remove the config that passes through the SATA controllers and reboot PVE, my CPU boosts correctly and the temp sensors work. I pass through an NVMe drive, a USB 3.0 port and the SATA controllers to a VM. Only the latter causes the problem. The moment the VM that has control of the SATA controllers starts, there is no CPU boosting and the temp sensors are no longer readable.

Steps to reproduce:

  1. Stop all VMs
  2. Remove all passthrough settings
  3. Reboot Proxmox
  4. Run stress test and confirm CPU is boosting and CPU temps are working (they are both now working).
  5. Pass through SATA controllers to VM (see screenshots)
  6. Start VM
  7. CPU no longer boosts and temp sensors are frozen at the last temp recorded before the VM was started

If I only pass through one of the controllers (7:00.0), then CPU boosting and temp monitoring work fine, but I've lost 50% of my disks. Something goes wrong when adding that second SATA controller (7:00.1). It's as if the thermal management of the unit is tied to one of the SATA controllers.
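
To see exactly when each function leaves the host, the driver bound to it can be read from sysfs before and after the VM starts. A sketch only: pci_driver is a hypothetical helper, and the 0000: PCI domain prefix in the usage line is an assumption (it is the usual one on single-domain systems).

```shell
# Print the kernel driver currently bound to a PCI function, or "none".
# $1 = full PCI address, e.g. 0000:07:00.1
# $2 = sysfs PCI device root; defaults to the real one
pci_driver() {
    dev="${2:-/sys/bus/pci/devices}/$1"
    if [ -L "$dev/driver" ]; then
        # The driver entry is a symlink whose last path component is the driver name
        basename "$(readlink "$dev/driver")"
    else
        echo "none"
    fi
}
```

Usage: for fn in 0000:07:00.0 0000:07:00.1; do echo "$fn -> $(pci_driver "$fn")"; done. With the VM stopped I would expect ahci, and vfio-pci once the VM owning the controller is running.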


Any ideas what's happening here and how to resolve this so I can safely passthrough my SATA controllers without nerfing the performance of my box?

66284c60-84bf-45fc-8066-68b90c220023.jpg


7314bb40-608d-43d0-850a-78999ed30e9c.jpg
 
I have the same machine and am experiencing the exact same problem.
IOMMU groups are separated correctly, so I can't tell what's causing it.
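
For others who want to check their own grouping, the layout can be listed straight from sysfs. A minimal sketch; iommu_groups is a made-up function name and the optional argument only exists so it can be pointed at a different directory.

```shell
# List every PCI device together with its IOMMU group, sorted by group number.
# $1 = IOMMU groups root; defaults to the real sysfs location.
iommu_groups() {
    groups_root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$groups_root"/*/devices/*; do
        # Skip the unexpanded glob when IOMMU is disabled or the path is wrong
        [ -e "$dev" ] || continue
        group="${dev#"$groups_root"/}"   # strip the root prefix
        group="${group%%/*}"             # keep only the group number
        echo "group $group: ${dev##*/}"
    done | sort -V
}
```

Usage: iommu_groups | grep 07:00 shows whether the two SATA functions share a group with anything else.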

I see this error message in the dmesg output. I don't know if it's related or not.
Code:
[   31.186531] pcie_mp2_amd 0000:04:00.7: Failed to discover, sensors not enabled is 0
[   31.186549] pcie_mp2_amd 0000:04:00.7: amd_sfh_hid_client_init failed err -95

I'm considering using TrueNAS SCALE (which is the reason I want SATA controller passthrough) as the hypervisor and running my other VMs in it.
 
If you join the Aoostar Discord, this very topic is being actively discussed. No fix yet, but Aoostar have been made aware. The only workaround at the moment is to compromise on performance by passing through the individual SATA drives instead of the SATA controllers themselves.
 
After speaking with Aoostar directly, it seems the SATA controllers are integrated into the CPU rather than being on a separate controller chip. This means that when the controllers are passed through, part of the SoC becomes inaccessible to the host.

My impression is that the thermal management system is tied to the SATA controllers as well as the CPU. When the SATA controllers are taken away from the host via passthrough, the host loses access to thermal management too and so protects itself by capping the boost frequency. This appears to be a hardware issue and not software. Whilst I hope I'm wrong, I'm not confident this will be fixed anytime soon, if at all.
 
Interesting. I have an issue with the WTR Pro: my TrueNAS SCALE VM (with 2 SATA controllers passed through) hangs almost every night, and in the morning it's stopped. I can't reboot it; I need to reboot the host to make TrueNAS work again. I guess the issue is the one in this thread.
Is there any workaround other than passing through the disks directly?
 
No, I actually refunded my unit and will build a PC based on the Fractal Node 804. It will be twice the size but have none of these problems.
I tried modifying the kernel, which did not help. An Aoostar rep confirmed that this issue cannot and will not be fixed because of how the APU is built, so it's not exactly the manufacturer's fault but the chip's. Nevertheless, it is not the customer's problem, it's theirs, so make of it what you will. A solution does not exist.
 
Is it risky to pass the SATA drives directly to a TrueNAS SCALE VM?
You can do it, but I wouldn't. It can and eventually will lead to data loss, so it all depends on your threat model.


SMART features will not work in the guest OS if you pass the drives directly. With such a layout, if you can't keep track of your short and long scans, you won't catch errors in time. However, if you were to run SMART on the host at the same time as the guest is doing a transfer, this could lead to problems and data loss. This whole thing is doable but not the proper way to do it. You have been warned.

PS: make sure to pass the serial numbers when passing the drives
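
A rough sketch of what passing a drive with its serial number can look like. The function only prints a qm set command so it can be reviewed before running. qm_disk_cmd is a hypothetical helper, the VMID, slot and disk name in the usage line are made-up examples, and taking the serial from the text after the last underscore of the by-id name is an assumption that holds for typical ata-MODEL_SERIAL links but should be checked per disk.

```shell
# Build (but do not run) a qm set command that attaches one whole disk to a VM
# and forwards its serial number to the guest.
# $1 = VMID, $2 = bus slot (e.g. scsi1), $3 = /dev/disk/by-id/... path
qm_disk_cmd() {
    vmid="$1"
    slot="$2"
    disk="$3"
    # Assumption: by-id names look like ata-MODEL_SERIAL, so take the last field
    serial="${disk##*_}"
    # The serial option accepts at most 20 bytes, so truncate to be safe
    serial=$(printf '%.20s' "$serial")
    echo "qm set $vmid -$slot $disk,serial=$serial"
}
```

Usage: qm_disk_cmd 100 scsi1 /dev/disk/by-id/ata-EXAMPLEMODEL_SER12345 prints the command. Run the printed line by hand once it looks right.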
 
Yes, I don’t want to go this way for sure; it's too risky for my data. And using the WTR just as a TrueNAS NAS would leave me with a powerful NAS that has nothing to do...
 
I have the previous version, the R7 dual-drive NAS with a 5825U, with TrueNAS installed and running 12-13 containers... It's barely breaking 2% utilization. I need to run several OSes and LXCs but I can't, because of this flaw. So I totally understand your pain. Hence I canceled the order and will go with a regular build.
 
Did anyone test with GRUB parameters to separate the IOMMU groups?
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction"
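
If someone does try this, it is worth confirming after the reboot that the parameters actually reached the running kernel. A small sketch; check_cmdline is a made-up helper that takes the command-line file as its first argument so it can also be run against a saved copy.

```shell
# Report which of the given kernel parameters are present on a command line.
# $1 = file holding the command line (normally /proc/cmdline), rest = parameters
# Returns 1 if at least one parameter is missing.
check_cmdline() {
    cmdline_file="$1"
    shift
    missing=0
    for param in "$@"; do
        # Pad with spaces so that iommu=pt does not match inside amd_iommu=pt
        if printf ' %s ' "$(cat "$cmdline_file")" | grep -qF -- " $param "; then
            echo "present: $param"
        else
            echo "missing: $param"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: check_cmdline /proc/cmdline amd_iommu=on iommu=pt pcie_acs_override=downstream,multifunction. On GRUB installs the change needs update-grub and a reboot to take effect. On installs that boot via systemd-boot the parameters go in /etc/kernel/cmdline, followed by proxmox-boot-tool refresh. Note that the ACS override weakens isolation between devices, and since the groups were reported above as already separated, it may not change anything here.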