[SOLVED] AMDGPU HWMON no longer exists.

eBell

Member
Jun 11, 2017
I was experimenting with the AMDGPU 20.45 drivers to see if they would allow me to pass the GPU through to a container for hardware acceleration under Jellyfin, but I was unable to install the drivers on the Debian host.
After I uninstalled the partially installed AMD driver packages, I could no longer control the fan on my MI25 and the GPU wasn't detected by rocm-smi. After investigating, it seems that the hwmon directory is no longer present in '/sys/class/drm/card0/device/'.
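
For anyone who wants to run the same check, here is a minimal Python sketch of it. It assumes the card is card0, as in the path above; the function name find_hwmon_dirs is made up for illustration. An empty result only says that no hwmon node is exposed, which would fit the driver not being bound to the card, but it does not prove the cause.

```python
from pathlib import Path


def find_hwmon_dirs(device_dir):
    """Return the hwmonN directories under <device_dir>/hwmon, sorted by path."""
    hwmon_root = Path(device_dir) / "hwmon"
    if not hwmon_root.is_dir():
        # No hwmon node at all, which is the situation described in this thread.
        return []
    return sorted(
        p for p in hwmon_root.iterdir()
        if p.is_dir() and p.name.startswith("hwmon")
    )


if __name__ == "__main__":
    # Assumption: the GPU is card0; adjust the index on multi-GPU hosts.
    found = find_hwmon_dirs("/sys/class/drm/card0/device")
    if found:
        for entry in found:
            print("hwmon node:", entry)
    else:
        print("no hwmon node found for card0")
```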

Could this be caused by the AMD driver installing older firmware, or firmware of a different version than the original ones available in the pve-firmware package?
 
Could this be caused by the AMD driver installing older firmware, or firmware of a different version than the original ones available in the pve-firmware package?
Likely. Did you reboot to load the old module?

And in general, it is better to pass the card through to a VM. The encapsulation also helps to avoid trashing the host. ;)
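
A quick way to see whether the amdgpu module is actually loaded after the reboot is to look for it in /proc/modules. A rough Python sketch, assuming amdgpu is built as a loadable module (built-in drivers are not listed there); is_module_loaded is a made-up helper name:

```python
from pathlib import Path


def is_module_loaded(name, proc_modules="/proc/modules"):
    """Return True if a kernel module called `name` is listed in /proc/modules."""
    path = Path(proc_modules)
    if not path.is_file():
        return False
    for line in path.read_text().splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    state = "loaded" if is_module_loaded("amdgpu") else "not loaded"
    print("amdgpu is", state)
```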
 
I've sorted it.
It was my own fault for not checking the module blacklists.
The AMD installer script blacklists the amdgpu module and doesn't remove the blacklist when uninstalled.
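
In case it helps someone else, this is roughly how the leftover blacklist can be found. It is only a sketch: it assumes the entry lives in a *.conf file under /etc/modprobe.d (I'm not naming the exact file the installer creates), and find_blacklist_entries is a made-up helper name.

```python
import re
from pathlib import Path


def find_blacklist_entries(module, conf_dir="/etc/modprobe.d"):
    """Return (file name, line number, line) for each active `blacklist <module>` line."""
    # Lines starting with '#' are comments and do not match this pattern.
    pattern = re.compile(r"^\s*blacklist\s+" + re.escape(module) + r"\s*$")
    hits = []
    root = Path(conf_dir)
    if not root.is_dir():
        return hits
    for conf in sorted(root.glob("*.conf")):
        lines = conf.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            if pattern.match(line):
                hits.append((conf.name, number, line.strip()))
    return hits


if __name__ == "__main__":
    for name, number, line in find_blacklist_entries("amdgpu"):
        print(f"{name}:{number}: {line}")
```

After removing or commenting out any line it reports, a reboot should let the module load again. If the blacklist was also copied into the initramfs, regenerating the initramfs may be needed as well.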

I don't run as many VMs these days; I've been trying to implement what I can in containers for ease of use and management.
Trashing the host is part of the fun. :p
 
