[SOLVED] High Power Usage for Passthrough GPU When VM Is Stopped

ovidiugm

I am running Proxmox installed on top of Debian 12 Bookworm, as per guide here: https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_12_Bookworm
pveversion shows pve-manager/8.0.3/bbf3993334bfa916 (running kernel: 6.2.16-5-pve)
Hardware config is Ryzen 7 5700G, MSI B450 Gaming Plus Max.
Passthrough ASUS GeForce 1660S GPU for a Win10 VM.

Win10 VM uses the 1660S just fine. When idle with VM on, power usage at outlet is 28-29W.
With Win10 VM stopped, power usage goes up to 45W.

Right after boot, lspci -vv would show:
10:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 SUPER] (rev a1) (prog-if 00 [VGA controller])
[...]
LnkCtl: ASPM Disabled
[...]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau
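To pull out just the two fields that matter for this problem (ASPM link state and bound driver), the lspci output can be filtered. This is only a rough sketch: the helper name pci_power_summary is made up for illustration, and the device address is the one from this post.

```shell
#!/bin/bash
# Hypothetical helper (not part of any package): reads `lspci -vv` output
# on stdin and prints only the ASPM link state and the bound kernel driver.
pci_power_summary() {
  awk '
    /LnkCtl:/ {
      # e.g. "LnkCtl: ASPM Disabled; RCB 64 bytes, ..." -> "ASPM Disabled"
      sub(/^.*LnkCtl:[ \t]*/, "")
      sub(/;.*$/, "")
      print "aspm=" $0
    }
    /Kernel driver in use:/ {
      sub(/^.*Kernel driver in use:[ \t]*/, "")
      print "driver=" $0
    }
  '
}

# Usage on the host (run as root so lspci can read the capability list):
#   lspci -vv -s 10:00.0 | pci_power_summary
```

Running it before and after each experiment makes it easy to confirm that the driver or ASPM setting really changed, independent of what the power meter shows.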

1. With the Win10 VM stopped, unbind from vfio-pci and bind to nouveau - this gets me a very small drop in power usage (1-5W) but system is still idling at 40-45W.
echo 0000:10:00.0 > /sys/bus/pci/drivers/vfio-pci/unbind
echo 0000:10:00.0 > /sys/bus/pci/drivers/nouveau/bind
Device correctly binds to nouveau as confirmed by lspci.
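For anyone repeating this test, the unbind/bind pair can be wrapped in a small function so switching back to vfio-pci before starting the VM is a single call. A minimal sketch, assuming the target driver module is already loaded; rebind_pci and the SYSFS_ROOT override are names invented here, not an existing tool.

```shell
#!/bin/bash
# Hypothetical helper: move a PCI device from whatever driver currently
# holds it to another one, using the sysfs bind/unbind files.
# SYSFS_ROOT is overridable only so the function can be dry-run on a copy.
rebind_pci() {
  local addr=$1 new_driver=$2 root=${SYSFS_ROOT:-/sys}
  local dev="$root/bus/pci/devices/$addr"
  local drv="$root/bus/pci/drivers/$new_driver"

  [ -e "$dev" ] || { echo "no such device: $addr" >&2; return 1; }
  [ -d "$drv" ] || { echo "driver not loaded: $new_driver" >&2; return 1; }

  # Release the device from its current driver, if it has one.
  if [ -e "$dev/driver/unbind" ]; then
    echo "$addr" > "$dev/driver/unbind" || return 1
  fi
  # Hand the device to the new driver.
  echo "$addr" > "$drv/bind"
}

# Usage (as root, with the VM stopped):
#   rebind_pci 0000:10:00.0 nouveau
#   rebind_pci 0000:10:00.0 vfio-pci   # before starting the VM again
```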


2. I also tried the enable_aspm script that is posted all over the internet to turn on ASPM. This changes the link control setting but makes no difference at all in power usage:
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+

3. Installed non-free Nvidia drivers (nvidia-smi). This brings the idle system to 27W with the Win10 VM off. However the passthrough does not work any more (bind/unbind don't work, VM does not start, device /dev/nvidia0 is locked by nvidia-persistenced service/process).

Anything else I could try?
Thanks so much!
 
You need drivers to power down parts of the GPU but nouveau often does not get the information from NVidia on how to do that. Keep the VM running in idle or create a smaller VM with the NVidia drivers to reduce the power draw of the GPU. Or switch to an AMD or Intel GPU that is properly supported by the Linux kernel drivers.
 
You need drivers to power down parts of the GPU but nouveau often does not get the information from NVidia on how to do that. Keep the VM running in idle or create a smaller VM with the NVidia drivers to reduce the power draw of the GPU. Or switch to an AMD or Intel GPU that is properly supported by the Linux kernel drivers.
Isn't that impossible in his situation?

What I mean is, he is passing the GPU through to the VM.
If the VM turns off (shutdown or whatever),
only vfio-pci is bound to the GPU on the host.
That means no real driver, no kernel driver or anything.

I actually never thought about such a scenario and what happens to the GPU's power consumption.
So it's actually a pretty interesting thread.

In the optimal case, I would say the VM should put the GPU into its deepest power-saving state during shutdown.
But I'm not sure whether that happens or whether it's the usual behavior.

However, I'm curious to read the replies here and see what people say, or how the GPU should actually behave.

Cheers

Edit: Forget it, I hadn't read the entire thread xD
Unbinding vfio-pci and binding to the driver is already a solution. It just doesn't seem pretty to me.
 
You need drivers to power down parts of the GPU but nouveau often does not get the information from NVidia on how to do that. Keep the VM running in idle or create a smaller VM with the NVidia drivers to reduce the power draw of the GPU. Or switch to an AMD or Intel GPU that is properly supported by the Linux kernel drivers.
Indeed, I saw this mentioned as a solution before. I would prefer not to keep the current Win10 VM active - it's configured to hog a lot of the total available RAM.
Question - is there any way to (reliably and programmatically) turn on the small VM when the other is shut down, and the other way around?

AMD was my first option, as my CPU has an integrated AMD Vega 8 GPU.
My first attempt was to pass this GPU through to Win10. I spent almost a week on that, following all the guides I could find. The farthest I got was to have the GPU show up in the Win10 VM's Device Manager with the dreaded Error 43 - the drivers would not work properly.
I gave up on it, and to my amazement the Nvidia GPU needed no config; it worked on the first try.
If anybody has an up-to-date guide on Ryzen integrated GPU passthrough, that would also be great.
 
Indeed, I saw this mentioned as a solution before. I would prefer not to keep the current Win10 VM active - it's configured to hog a lot of the total available RAM.
And all VMs with PCI(e) passthrough always use all of their memory because of possible device-initiated DMA.
Question - is there any way to (reliably and programmatically) turn on the small VM when the other is shut down, and the other way around?
Use a hookscript to start a smaller Windows VM (or try NVidia's Linux drivers inside a VM) when your big VM has stopped.
 
Isn't that impossible in his situation?

What I mean is, he is passing the GPU through to the VM.
If the VM turns off (shutdown or whatever),
only vfio-pci is bound to the GPU on the host.
That means no real driver, no kernel driver or anything.

I actually never thought about such a scenario and what happens to the GPU's power consumption.
So it's actually a pretty interesting thread.
You might also be interested in all the other threads about this on this forum.
In the optimal case, I would say the VM should put the GPU into its deepest power-saving state during shutdown.
But I'm not sure whether that happens or whether it's the usual behavior.
For that you need a driver that has knowledge about the GPU. As measured (by multiple people), the GPU does not power down completely without drivers. It's not what you expect or what people want, but that's how they work.
However, I'm curious to read the replies here and see what people say, or how the GPU should actually behave.
You might also be interested in all the other threads about this on this forum.
Unbinding vfio-pci and binding to the driver is already a solution. It just doesn't seem pretty to me.
But there is no (open source) Linux driver with enough knowledge about the GPU power settings. That's also why nouveau is slow on many NVidia GPUs: it cannot change the clock speeds of various components. That's why you need NVidia's drivers, which work best on a Windows VM.
 
Using a whole VM for power-saving doesn't look like an elegant solution, however I must say at first sight it's working great. No race conditions between the starting/stopping VMs, power usage stays low with the dummy Debian VM running, with Nvidia official drivers installed.

On guest VM, Debian w. 512MB RAM, 2 CPU cores, 4GB storage:
  • Add the PCI device for passthrough
  • Add the non-free repository in /etc/apt/sources.list:
    deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
  • Install nvidia drivers: apt install nvidia-smi
  • Reboot
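
After the reboot it may be worth confirming inside the dummy VM that the driver really dropped the GPU into a low-power state. A rough sketch: gpu_looks_idle is a made-up helper, and treating performance state P8 or a higher number as "idle" is an assumption, not something measured in this thread.

```shell
#!/bin/bash
# Hypothetical check: reads one "pstate, watts" CSV line on stdin and
# succeeds if the reported performance state is P8 or a higher number.
# The P8 threshold is an assumption; adjust it to what your card reports.
gpu_looks_idle() {
  local pstate watts
  IFS=', ' read -r pstate watts || return 2
  echo "pstate=$pstate power=${watts}W"
  case $pstate in
    P[0-9]|P[0-9][0-9]) [ "${pstate#P}" -ge 8 ] ;;
    *) return 2 ;;   # unexpected format
  esac
}

# Usage inside the dummy VM:
#   nvidia-smi --query-gpu=pstate,power.draw --format=csv,noheader,nounits | gpu_looks_idle
```

The wall-outlet reading is still the number that counts; this only shows whether the driver is doing its part.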

On Proxmox host, create the hookscript, make it executable:
cd /var/lib/vz/snippets
touch nvidiavm.sh
chmod +x nvidiavm.sh
nano nvidiavm.sh

Insert the following in the sh file, changing the VM IDs as necessary:
Bash:
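#!/bin/bash
# Proxmox calls the hookscript with the VM id and the lifecycle phase.
VMID=$1
PHASE=$2

# Only react to events from the Win10 VM (id 105).
if [ "$VMID" = "105" ]; then
  if [ "$PHASE" = "pre-start" ]; then
    # Free the GPU before the Win10 VM claims it.
    echo "Win10 VM pre-start: Stopping Nvidia Dummy VM"
    qm stop 200

  elif [ "$PHASE" = "post-stop" ]; then
    # Hand the GPU to the dummy VM so its driver can power it down.
    echo "Win10 VM post-stop: Starting Nvidia Dummy VM"
    qm start 200
  fi
fi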

Attach it to your resource-consuming VM:
qm set 105 --hookscript local:snippets/nvidiavm.sh

Execution steps will show in journal:
journalctl -b
Jul 29 01:53:48 deb qmeventd[135980]: Win10 VM post-stop: Starting Nvidia Dummy VM


Thanks for everyone's contribution on this!
 
You might also want to create a vzdump hook script. Otherwise the backup of your big VM will fail while the small VM is running (and vice versa), because a VM needs to be started in order to back it up and you can't start two VMs that are using the same PCIe device.

Edit:
Here is what I use, which shuts down both VMs before doing the backup and then starts that VM again after the backup job has finished:
https://forum.proxmox.com/threads/backup-vm-same-pcie-device-iommu-on-two-vm.129600/#post-567803
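
For readers who don't want to follow the link, the idea can be sketched roughly as below. This is not the script from the linked post, just an illustration under assumptions: VM ids 105 and 200 are taken from the hookscript earlier in this thread, STATE_FILE is a made-up path, and the interaction with a guest hookscript attached to VM 105 is not handled.

```shell
#!/bin/bash
# Sketch of a vzdump hook script. vzdump passes the phase name as $1.
# VM ids follow the hookscript earlier in the thread; STATE_FILE is a
# hypothetical location for remembering which VM was running.
PAIRED_VMS="105 200"
STATE_FILE=${STATE_FILE:-/run/gpu-vm-was-running}

vm_is_running() {
  qm status "$1" | grep -q 'status: running'
}

on_vzdump_phase() {
  local phase=$1 id
  case $phase in
    job-start)
      : > "$STATE_FILE"
      for id in $PAIRED_VMS; do
        if vm_is_running "$id"; then
          # Remember which VM held the GPU, then shut it down cleanly.
          echo "$id" >> "$STATE_FILE"
          qm shutdown "$id" --timeout 120
        fi
      done
      ;;
    job-end|job-abort)
      # Bring back whichever VM was running before the backup job.
      [ -f "$STATE_FILE" ] || return 0
      while read -r id; do
        qm start "$id" < /dev/null
      done < "$STATE_FILE"
      rm -f "$STATE_FILE"
      ;;
  esac
}

# Only act when vzdump actually passed a phase.
if [ -n "$1" ]; then
  on_vzdump_phase "$1"
fi
```

To try it, save it somewhere on the host, make it executable, and point vzdump at it with a script: line in /etc/vzdump.conf or the --script option. Test with a manual backup of both VMs before relying on it for scheduled jobs.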
 
I have the same issue as ovidiugm

Proxmox: VE 8.0.4, Linux 6.2.16-12-pve
Hardware: AMD EPYC 7532 32-Core, NVIDIA RTX A4000 GPU (140 W)

So I have a Windows 11 VM for games with the above GPU passed through. And the power consumption is as follows:
1) When Windows 11 VM is active and I play video games in Windows 11 VM: 270-290W
2) When Windows 11 VM is active and I do not play games in Windows 11 VM: 200-210W
3) When Windows 11 VM is shut down: 250-260W

I guess I will just keep the VM active all the time, but I wish I could keep it shut down when I do not need it.
 
