[SOLVED] AMD GPU 7900XTX passthrough to Ubuntu 22.04 guest VM on Proxmox 8.0.2 for PyTorch ROCm.

UltraRabbit

New Member
Nov 6, 2023
A few days ago I set up a new machine with an AMD 7950X CPU and a 7900XTX GPU on Proxmox 8.0.2. After successfully passing the dedicated GPU through to a Windows 11 guest VM and an Ubuntu 22.04 guest VM, I tried to install the latest AMD ROCm support with PyTorch in the Ubuntu VM.
Although the installation went smoothly, I got stuck on an error from the HIP kernel complaining that it can't continue with the present state. After searching on Google, I learned that the PCIe DevCaps AtomicOpsCap: Routing+ 32bit+ 64bit+ must be enabled all the way through the PCIe tree down to the GPU device. However, the default pcie-root-port implementation in QEMU/KVM doesn't enable them, and that causes the problem.
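To see which devices in the guest report these capabilities, the output of `lspci -vvv` can be scanned for the AtomicOpsCap line. The snippet below is a minimal sketch, not part of the original fix: the function names `parse_atomic_ops_cap` and `missing_flags` are made up for illustration, the sample text is invented, and it assumes lspci prints the flags as `Name+` / `Name-` on a single indented line under each device header, which may differ between pciutils versions.

```python
import re

# Invented sample in the style of `lspci -vvv` output (assumption: real output
# puts AtomicOpsCap on one indented line below an unindented device header).
SAMPLE = (
    "00:1c.0 PCI bridge: Example root port\n"
    "\t\tDevCap2: Completion Timeout: Not Supported, TimeoutDis-\n"
    "\t\t\t AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-\n"
    "01:00.0 VGA compatible controller: Example GPU\n"
    "\t\t\t AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-\n"
)


def parse_atomic_ops_cap(lspci_text):
    """Map each device header line to its AtomicOpsCap flags as booleans."""
    results = {}
    current = None
    for line in lspci_text.splitlines():
        # Device headers start in column 0; capability lines are indented.
        if line and not line[0].isspace():
            current = line.strip()
            continue
        match = re.search(r"AtomicOpsCap:\s*(.*)", line)
        if match and current is not None:
            flags = {}
            for name, sign in re.findall(r"(\w+)([+-])", match.group(1)):
                flags[name] = sign == "+"
            results[current] = flags
    return results


def missing_flags(flags, required=("Routing", "32bit", "64bit")):
    """List the required flags that are absent or reported as disabled."""
    return [name for name in required if not flags.get(name, False)]
```

In a real guest the input would come from something like `sudo lspci -vvv` (root is usually needed to see the capability lines); a root port for which `missing_flags` returns a non-empty list is a candidate for the problem described above.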
I found that a patch enabling these features already exists in the QEMU repo. I applied it and recompiled pve-qemu-kvm 8.0.2-7 to force-enable the DevCaps as required. PyTorch with ROCm now runs successfully.
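A quick way to confirm that PyTorch can actually use the GPU inside the guest is a small smoke test. This is a minimal sketch under a few assumptions: the function name `rocm_smoke_test` is made up, a ROCm build of PyTorch is installed (ROCm builds expose the GPU through the `torch.cuda` API and set `torch.version.hip`), and the passed-through card is device 0.

```python
def rocm_smoke_test():
    """Return a small report on whether PyTorch sees and can use the GPU."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        # A version string on ROCm builds, None on CUDA or CPU-only builds.
        "hip_version": getattr(torch.version, "hip", None),
        "gpu_visible": torch.cuda.is_available(),
    }
    if report["gpu_visible"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        # Launch a real kernel; detecting the device alone does not prove
        # that compute works.
        x = torch.rand(1024, 1024, device="cuda")
        report["matmul_ok"] = bool(torch.isfinite((x @ x).sum()).item())
    return report


if __name__ == "__main__":
    print(rocm_smoke_test())
```

The matrix multiply is there because the error described above appeared only once a HIP kernel ran, not during installation, so a test that stops at device detection would likely miss it.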
I'm wondering whether this patch will be included in a future pve-qemu-kvm release.

Best wishes to Proxmox community.
 
