Hi,
I am running Proxmox on my desktop with the following specs:
- Intel 13600K
- MSI Z690 A WiFi D4
- AMD RX 6650 XT
- 2x32 GB of RAM
- 2x Samsung 870 Evo 500 GB as boot drives in ZFS RAID1 configuration
- XPG S70 Blade 1 TB in LVM-thin to run my VMs
Usage scenario: VMs running Windows or Linux with GPU passthrough.
What works: no driver blacklisting, no VFIO IDs. Just 'driverctl' to hand the GPU to vfio-pci and back, driven by a Perl hookscript (attached for reference).
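For illustration only (the attached script is the authoritative version), here is a minimal sketch of the idea, assuming Proxmox's standard hookscript phases and the PCI addresses from the lspci extract below:

#!/usr/bin/perl
# Sketch of a driverctl-based GPU handover hookscript.
# Assumptions: Proxmox invokes the script with (vmid, phase) as arguments,
# and the GPU plus its HDMI/DP audio function sit at the addresses below.
use strict;
use warnings;

my ($vmid, $phase) = @ARGV;
my @devices = ('0000:03:00.0', '0000:03:00.1');

if ($phase eq 'pre-start') {
    # Hand both functions to vfio-pci before the VM starts.
    for my $dev (@devices) {
        system('driverctl', 'set-override', $dev, 'vfio-pci') == 0
            or die "set-override failed for $dev\n";
    }
} elsif ($phase eq 'post-stop') {
    # Drop the override so the default host drivers
    # (amdgpu / snd_hda_intel) rebind automatically.
    for my $dev (@devices) {
        system('driverctl', 'unset-override', $dev) == 0
            or die "unset-override failed for $dev\n";
    }
}
exit 0;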
Here is an extract of the 'lspci -nnk' output taken after the VM is shut down, to show that the host (Proxmox) gets control back and the amdgpu driver takes over, i.e. VFIO is no longer bound to the card:
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6650 XT / 6700S / 6800S] [1002:73ef] (rev c1)
	Subsystem: ASUSTeK Computer Inc. Navi 23 [Radeon RX 6650 XT / 6700S / 6800S] [1043:05e3]
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
So, what's the issue: irrespective of how I pass the GPU through or which distro the VM runs, the card's temperature starts climbing every time it is returned to the host (Proxmox), and it gets really hot with no load at all.
I have no issues passing the GPU to different VMs; the hookscript works perfectly. Temperatures are fine as long as a VM, not the host, is controlling the card through its own drivers (amdgpu for Linux VMs, the proprietary drivers in Windows). But this GPU heat-up is driving me mad, and I am even contemplating bidding farewell to Proxmox if I can't find a solution; I have been struggling with this for several months now. I know I could keep a VM running just to manage the card, but running a VM for no purpose other than loading the same amdgpu driver is not a satisfying solution, especially since that same amdgpu driver is supposedly managing the card on the host, exactly as a Linux VM would when the GPU is passed through to it.
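For anyone who wants to reproduce the observation, the heat-up can be watched on the host without extra tooling; a minimal sketch reading amdgpu's hwmon interface, assuming the card sits at 0000:03:00.0 as in the lspci extract above (the hwmon subdirectory index varies per boot, hence the glob):

#!/usr/bin/perl
# Sketch: read the GPU edge temperature from amdgpu's hwmon node.
use strict;
use warnings;

# The hwmon directory name (hwmon0, hwmon1, ...) is not stable across boots.
my ($hwmon) = glob('/sys/bus/pci/devices/0000:03:00.0/hwmon/hwmon*');
die "no hwmon node found; is amdgpu bound to the card?\n" unless defined $hwmon;

open my $fh, '<', "$hwmon/temp1_input" or die "cannot read temp1_input: $!\n";
my $millideg = <$fh>;  # value is reported in millidegrees Celsius
close $fh;

printf "GPU edge temperature: %.1f C\n", $millideg / 1000;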
Wondering if there is a better way to handle this on the host itself, or whether I am missing something here.
Thanks,