Power savings + nvidia + rabbit hole.

adolfotregosa

Hi all. I need help, I have gone down a rabbit hole...

Let me try to put the issue into words. When I reboot my Proxmox host (13900 + Z790 + 4060 Ti; the rest doesn't matter), manually spin down the spinning rust (9 drives) and let Proxmox settle, my watt meter reads around 32W, give or take. Great! (There are more power-saving tweaks involved, but the post is long enough already and they are not relevant...)
For this to be possible, the nvidia driver has to be installed on Proxmox itself so the GPU can drop to the P8 state, and S0ix low power has to be enabled via BIOS NVRAM editing.
See this for the process.

...and now for the fun part (rabbit hole)...

When I start the VM that uses the 4060 Ti, the pre-start phase of its hookscript unloads the nvidia driver so vfio-pci can take over the GPU, and I can see the VM's OVMF BIOS on my screen. If I then stop the VM, the post-stop phase unbinds the GPU from vfio-pci and reloads the nvidia driver, so I get the Proxmox console back. For this to work correctly, the nvidia_drm module must be loaded with "options nvidia_drm modeset=1", so I have that in modprobe.d/nvidia.conf. So far so good: I can start the VM, see the OVMF BIOS screen, stop the VM and get Proxmox back. The watt meter goes back down to 32W.
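
A minimal sketch of what such a hookscript could look like, to make the pre-start/post-stop flow concrete. It is not the actual script: the PCI address 0000:01:00.0, the module list, and the DRY_RUN switch (which only prints each step) are assumptions for illustration, and extra steps such as releasing the host console from the GPU are left out. Proxmox calls a hookscript with the VM ID and the phase name as its two arguments.

```shell
#!/bin/sh
# Hypothetical hookscript sketch. Proxmox runs it as: <script> <vmid> <phase>.
# Assumed: GPU at 0000:01:00.0. DRY_RUN=1 prints each step instead of running it.
GPU="${GPU:-0000:01:00.0}"

# Run a command, or just print it in dry-run mode.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Write a value to a sysfs file, or just print the step in dry-run mode.
sysfs_write() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "write '$1' > $2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

gpu_hook() {
    case "$1" in
        pre-start)
            # Unload the host nvidia stack so vfio-pci can take over the GPU.
            run modprobe -r nvidia_drm nvidia_modeset nvidia_uvm nvidia
            ;;
        post-stop)
            # Release the GPU from vfio-pci, then reload the host driver.
            # modeset=1 is picked up from modprobe.d/nvidia.conf.
            sysfs_write "$GPU" /sys/bus/pci/drivers/vfio-pci/unbind
            run modprobe nvidia_drm
            ;;
    esac
}

# Only act when called the way Proxmox calls hookscripts.
if [ "$#" -ge 2 ]; then gpu_hook "$2"; fi
```

Running it with DRY_RUN=1 first (for example `DRY_RUN=1 ./hook.sh 100 pre-start`, where the file name and VM ID are placeholders) shows what would be done without touching the GPU.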
The problem arises when you let the VM load the nvidia driver and then stop/shut down the VM. The watt meter goes up to 44W and stays there.

The rabbit hole led me to the LTR (Latency Tolerance Reporting) capability changing.

After a fresh proxmox reboot, via ssh I can see that:
lspci -vv -s 01:00.0 | grep LTR
DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ 10BitTagReq- OBFF Disabled,

and at the exact moment the nvidia driver loads inside the VM:

lspci -vv -s 01:00.0 | grep LTR
DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,

LTR goes -

Inside the VM, DevCap2 does not even advertise LTR, so the guest's DevCtl2 becomes LTR-, and on the host both DevCap2 and DevCtl2 become LTR- as well.
I presume QEMU (this is way beyond my knowledge at this point) does not advertise something, which makes the nvidia driver disable LTR, and hence the power usage goes up. At least, that is what my very limited knowledge of the matter suggests.
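
To make the before/after comparison less error-prone than eyeballing grep output, the two LTR flags can be pulled out of the lspci text with a small helper. This is a sketch only; the function name ltr_flags is made up for illustration, and it assumes the DevCap2/DevCtl2 line layout shown above.

```shell
# Print the LTR flag of DevCap2 and DevCtl2 from `lspci -vv` text on stdin.
# Hypothetical helper; a missing field is printed as a bare "LTR".
ltr_flags() {
    awk '
        /DevCap2:/ { if (match($0, /LTR[+-]/)) cap = substr($0, RSTART + 3, 1) }
        /DevCtl2:/ { if (match($0, /LTR[+-]/)) ctl = substr($0, RSTART + 3, 1) }
        END { printf "DevCap2=LTR%s DevCtl2=LTR%s\n", cap, ctl }
    '
}

# Usage on the host (run as root so lspci can read the full capability list):
#   lspci -vv -s 01:00.0 | ltr_flags
```

Logging this once after a fresh boot and once after the VM has loaded its driver gives two one-line records that are easy to compare.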

After some more testing I got to this point:
I can change DevCtl2 from LTR+ to LTR- and vice versa using:

setpci -s 01:00.0 0xa0.w=0400 -> DevCtl2 LTR+
and
setpci -s 01:00.0 0xa0.w=0000 -> DevCtl2 LTR-

but the watt meter does not react to the manual change, which is annoying. DevCap2 still stays at LTR+, though; I'm only toggling LTR in DevCtl2, and I suspect that is why something on the motherboard still goes to sleep.
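
One caveat with the setpci commands above: writing 0400 or 0000 to the whole word also overwrites every other DevCtl2 bit. A gentler variant, sketched below, uses setpci's value:mask form so only bit 10 (the LTR enable bit) changes, and addresses the register as CAP_EXP+28, which on this card should be the same location as 0xa0. The helper name ltr_cmd is hypothetical, and it only prints the command so it can be checked before running.

```shell
# Assumed GPU address, as used in the commands above.
GPU="${GPU:-01:00.0}"

# Print a setpci command that changes only the LTR enable bit (0x0400)
# of Device Control 2, located at offset 0x28 of the PCI Express capability.
ltr_cmd() {
    case "$1" in
        on)  bits=0400 ;;
        off) bits=0000 ;;
        *)   return 1 ;;
    esac
    # value:mask -> only the bits set in the mask are modified.
    echo "setpci -s $GPU CAP_EXP+28.w=$bits:0400"
}

# Usage (as root): review the printed command, then run it, e.g.
#   ltr_cmd on
#   ltr_cmd on | sh
```

This does not address DevCap2, which is a read-only capability field from the host's point of view; it only makes the DevCtl2 toggle less invasive.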

After the nvidia driver loads inside the VM and you then stop the VM, "setpci" no longer works. That makes sense, since DevCap2 is now LTR- and something on the motherboard does not "sleep" anymore (that's my theory).

Now, I think I have tried basically everything there is to try to change DevCap2 from LTR- back to LTR+, even echo 1 > ..... /remove followed by echo 1 > /sys/bus/pci/rescan, but that does not cut it. Only a system reboot does.

This is kind of annoying, knowing you are wasting 12W "just because", and rebooting Proxmox every time is unreasonable: it is also running OPNsense + unRaid.

So I ask: am I alone? Any ideas?

Thank you so much for your time.
 