Returning a passed-through GPU to the Proxmox host

pwn

May 24, 2024
I run a Proxmox server 24/7 with some services and have a VM with a passed-through GPU.

I measured that when the VM is shut down, the GPU runs hotter and draws more watts than when it idles inside a VM that runs 24/7.

I wonder whether it is worthwhile, or even possible, to shut down the VM and give the GPU back to Proxmox. How can I do it? I think a hookscript is the solution, but is it possible to return the GPU to Proxmox without restarting the server?
 
I measured that when the VM is shut down, the GPU runs hotter and draws more watts than when it idles inside a VM that runs 24/7.
Yes, because no driver is loaded that knows how to put the GPU into its lowest power state. You could just (automatically) start a small VM with drivers.
I wonder whether it is worthwhile, or even possible, to shut down the VM and give the GPU back to Proxmox.
Yes, and there have been threads about this in the past.
How can I do it?
Unbind the vfio-pci driver and bind the actual driver for the GPU (which is the reverse of what Proxmox does automatically).
What is the output of lspci -knns XY:00, where XY:00 is the PCI address of the GPU that you pass through, and what is XY on your system?
I think a hookscript is the solution, but is it possible to return the GPU to Proxmox without restarting the server?
Yes, you can automate it that way.
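
As a rough illustration of the hookscript idea: Proxmox calls a hookscript with the VM ID and a phase name (pre-start, post-start, pre-stop, post-stop). The sketch below only shows the dispatch on the phase. The function names handle_phase and return_gpu_to_host are hypothetical, the PCI addresses are the ones that appear later in this thread, and the actual rebind logic is left as a placeholder that just prints what it would do.

```shell
#!/bin/sh
# Sketch of a Proxmox hookscript. Proxmox invokes it as: <script> <vmid> <phase>
# where <phase> is one of pre-start, post-start, pre-stop, post-stop.

# PCI functions of the GPU (assumed: the addresses reported in this thread).
GPU_FUNCTIONS="0000:2d:00.0 0000:2d:00.1"

# Hypothetical placeholder: replace the echo with the real unbind/bind steps.
return_gpu_to_host() {
    for dev in $GPU_FUNCTIONS; do
        echo "would return $dev to the host"
    done
}

# Decide what to do for a given phase ($1 = vmid, $2 = phase).
handle_phase() {
    case "$2" in
        post-stop)
            # The VM has fully stopped, so the GPU is no longer in use by it.
            return_gpu_to_host
            ;;
        pre-start|post-start|pre-stop)
            # Nothing to do: Proxmox binds vfio-pci itself when the VM starts.
            :
            ;;
        *)
            echo "unknown phase: $2" >&2
            return 1
            ;;
    esac
}

# Only act when called with both arguments, as Proxmox does.
if [ "$#" -ge 2 ]; then
    handle_phase "$1" "$2"
fi
```

Assuming the script is saved as an executable file on a storage that allows snippets, it could be attached with something like qm set 100 --hookscript local:snippets/gpu-return.sh, where the VM ID 100 and the file name are made up for this example.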
 
2d:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD104 [GeForce RTX 4070 SUPER] [10de:2783] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd AD104 [GeForce RTX 4070 SUPER] [1458:413b]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau

2d:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22bc] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:413b]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel

I have also blacklisted the drivers:
  • blacklist nouveau
  • blacklist nvidia

So, as I understand it, I should delete the entry from /etc/modprobe.d/vfio.conf when the VM shuts down and add it back when the VM starts?
 
If you don't want vfio-pci to bind to the GPU before the VM starts, you do need to remove the IDs 10de:2783 and 10de:22bc from that file and then run the necessary update-initramfs -u etc. Don't go editing that file when the VM starts or stops!
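
Before editing anything, it may help to see which modprobe configuration lines mention those two IDs. The thread does not show the contents of /etc/modprobe.d/vfio.conf, so the typical form of such an entry (options vfio-pci ids=10de:2783,10de:22bc) is an assumption here. The helper name find_vfio_id_lines is hypothetical; it only searches and changes nothing.

```shell
# Print every line under a directory that mentions either of two
# vendor:device IDs, prefixed with file name and line number.
# $1 = directory to search (e.g. /etc/modprobe.d), $2 and $3 = the IDs.
find_vfio_id_lines() {
    # -r: recurse into the directory, -n: show line numbers,
    # -F: treat the IDs as fixed strings, not regular expressions.
    grep -rnF -e "$2" -e "$3" "$1"
}
```

On the host this would be called as find_vfio_id_lines /etc/modprobe.d 10de:2783 10de:22bc. After removing the IDs from the reported lines, update-initramfs -u followed by a reboot makes the change take effect.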

Unbind the current driver (like vfio-pci) using echo "0000:2d:00.0" > "/sys/bus/pci/devices/0000:2d:00.0/driver/unbind", and do the same for 2d:00.1.
Then bind the actual drivers using echo "0000:2d:00.0" > "/sys/bus/pci/drivers/nouveau/bind" and echo "0000:2d:00.1" > "/sys/bus/pci/drivers/snd_hda_intel/bind".
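
The two steps above can be wrapped in one small function. This is only a sketch: the name rebind_pci_device and the SYSFS_ROOT variable are hypothetical (SYSFS_ROOT exists so the function can be tried against a scratch directory instead of the real /sys), and on a real host it has to run as root.

```shell
# Move one PCI function from its current driver to another one via sysfs.
# $1 = full PCI address (e.g. 0000:2d:00.0), $2 = target driver name.
rebind_pci_device() {
    addr="$1"
    driver="$2"
    root="${SYSFS_ROOT:-/sys}"   # hypothetical override, defaults to /sys

    # Release the device from whatever driver currently holds it, if any.
    if [ -e "$root/bus/pci/devices/$addr/driver/unbind" ]; then
        echo "$addr" > "$root/bus/pci/devices/$addr/driver/unbind" || return 1
    fi

    # The driver directory only exists once the driver is loaded.
    if [ ! -e "$root/bus/pci/drivers/$driver/bind" ]; then
        echo "driver $driver is not loaded" >&2
        return 1
    fi

    # Ask the target driver to take over the device.
    echo "$addr" > "$root/bus/pci/drivers/$driver/bind"
}
```

For the GPU in this thread the calls would be rebind_pci_device 0000:2d:00.0 nouveau and rebind_pci_device 0000:2d:00.1 snd_hda_intel.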

But since that is an NVidia GPU, and that company refuses to tell the nouveau driver developers how to change the clock speeds, you are probably better off starting a minimal (Windows) VM with the official NVidia drivers to put the GPU in low-power mode. The steps above only really work well with GPUs that have proper Linux support.
 
I will test the unbind approach for a while, and if there are any issues I'll create a VM for that purpose as you suggested. Thanks! ;]
 
echo "0000:2d:00.0" > "/sys/bus/pci/drivers/nouveau/bind" - I don't have a nouveau folder in that path. Is it possible that the folder was not created because the driver is blacklisted?
 
Try doing a modprobe nouveau first. If the driver was not loaded automatically before, this will load it. Check with lsmod | grep nouveau afterwards.
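
Another way to check, besides lsmod, is to test for the driver's directory under /sys/bus/pci/drivers, since that is the path the bind command writes to. A minimal sketch, with the hypothetical helper name pci_driver_available and the hypothetical SYSFS_ROOT override for trying it outside the real /sys:

```shell
# Succeed if a PCI driver with the given name is currently registered,
# i.e. its directory exists under <sysfs>/bus/pci/drivers.
# $1 = driver name (e.g. nouveau).
pci_driver_available() {
    [ -d "${SYSFS_ROOT:-/sys}/bus/pci/drivers/$1" ]
}
```

A script could then run pci_driver_available nouveau || modprobe nouveau before attempting the bind.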
 
OK, it shows up:

nouveau 2908160 0
mxm_wmi 12288 1 nouveau
drm_gpuvm 45056 1 nouveau
drm_exec 16384 2 drm_gpuvm,nouveau
gpu_sched 61440 1 nouveau
drm_ttm_helper 12288 1 nouveau
ttm 102400 2 drm_ttm_helper,nouveau
drm_display_helper 233472 1 nouveau
video 73728 1 nouveau
i2c_algo_bit 16384 3 igb,ast,nouveau
wmi 32768 4 video,wmi_bmof,mxm_wmi,nouveau


But now I just get a write error when I try to bind:
bash: echo: write error: File exists
 
Hmm, there is no driver in use for the VGA device:

2d:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD104 [GeForce RTX 4070 SUPER] [10de:2783] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd AD104 [GeForce RTX 4070 SUPER] [1458:413b]
        Kernel modules: nvidiafb, nouveau

2d:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:22bc] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:413b]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
 
I don't know; I don't use NVidia because of its poor Linux support. I would expect echo "0000:2d:00.0" > "/sys/bus/pci/drivers/nouveau/bind" to just work. Maybe nouveau does not support your GPU? You could install the NVidia drivers on the Proxmox host (but I don't want to help with all the hurdles that entails), or just run a minimal VM with NVidia drivers.
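
When a bind attempt fails like this, it can be useful to confirm from sysfs which driver, if any, currently holds each function. A bound device has a driver symlink in its sysfs directory; the missing "Kernel driver in use" line in the lspci output corresponds to that symlink being absent. The sketch below uses the hypothetical helper name current_pci_driver and the hypothetical SYSFS_ROOT override for trying it outside the real /sys.

```shell
# Print the name of the driver bound to a PCI function, or "none".
# $1 = full PCI address (e.g. 0000:2d:00.0).
current_pci_driver() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver's directory; keep only its name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

Based on the lspci output in this thread, current_pci_driver 0000:2d:00.0 would be expected to print none and current_pci_driver 0000:2d:00.1 would print snd_hda_intel. If the VGA function stays unbound after a bind attempt, the kernel log (for example via dmesg) may show why the driver did not accept the device.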
 
  • Like
Reactions: pwn