GPU passthrough not allowing for PCIe atomics

MoxProxxer

Well-Known Member
Apr 25, 2018
So I'm trying ROCm with GPU passthrough and 2 x AMD WX4100 (Baffin, Polaris11).
I should mention that GPU passthrough works perfectly with an Nvidia 1050Ti.

ROCm ofc wants more HW features than plain passthrough provides, so OpenCL is no dice. dmesg contains:

[ 9.822248] kfd kfd: amdgpu: skipped device 1002:67e3, PCI rejects atomics
[ 11.580399] kfd kfd: amdgpu: skipped device 1002:67e3, PCI rejects atomics
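To see at a glance which devices are affected, the log lines above can be reduced to their vendor:device IDs. This is only a minimal sketch: the function name `kfd_rejected_devices` is made up for illustration, and it assumes the message wording is exactly as in the dmesg excerpt above.

```shell
# Print the unique vendor:device IDs that kfd skipped with
# "PCI rejects atomics", reading kernel log text from stdin.
# Assumes the message format shown in the dmesg excerpt above.
kfd_rejected_devices() {
    sed -n 's/.*skipped device \([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\), PCI rejects atomics.*/\1/p' \
        | sort -u
}
```

Inside the VM it would be called as `dmesg | kfd_rejected_devices`.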

I ofc plowed through this:
https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/26
and this:
https://github.com/RadeonOpenCompute/ROCR-Runtime/issues/24
and this:
https://pve.proxmox.com/wiki/Pci_passthrough

So how do I get PCIe atomics up and running in the VM, given that the HW evidently supports them?

edit: the VM config

Code:
#experimental VM
#
#It uses GPU pass-through to have access to the two AMD WX4100 GPUs.
#(see Tab "Hardware" - pci device)
#
#https://www.amd.com/de/products/professional-graphics/radeon-pro-wx-4100
boot: c
bootdisk: sata0
cores: 6
cpu: host
hostpci0: 81:00,pcie=1,x-vga=on
hostpci1: 82:00,pcie=1,x-vga=on
keyboard: de
machine: q35
memory: 16384
name: WX4100
net0: virtio=5A:61:35:52:C3:32,bridge=vmbr0
numa: 1
onboot: 1
ostype: l26
sata0: space:vm-100-disk-0,cache=unsafe,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=0b50a5ad-c422-4bed-9945-dcb6218916e4
sockets: 2
vga: qxl
vmgenid: 0d17868a-261f-4657-8b06-dabcee3da84e
 
It seems "my" problem is discussed in detail here:
https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/100 "KVM Support on proxmox"
While I succeeded in setting the PCIe bits for atomics, I'm stuck for now on hacking the amdgpu kernel module.

It seems the problem touches Proxmox only marginally. However, I wonder why PCIe atomics are not enabled by default when the HW provides them.


Code:
# lspci -s 01:00.0 -vvv | grep -i atom
                         AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-
                         AtomicOpsCtl: ReqEn-

(The setpci operation switches AtomicOpsCtl to ReqEn+.)
As long as this is not set by default, the user has to jump through hoops: blacklist the amdgpu module, set the PCIe atomics bit manually, then modprobe the module.
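For reference, the hoops described above could be sketched roughly as follows. This is a hedged sketch, not a tested recipe: the function names and the blacklist file name are made up for illustration, and the device address 01:00.0 is simply the one from the lspci call above. The setpci line relies on the PCIe spec layout, where Device Control 2 sits at offset 0x28 of the PCI Express capability and bit 6 (mask 0x0040) is "AtomicOp Requester Enable". The second function only prints the commands, so nothing is changed until they are run by hand as root.

```shell
# Print the AtomicOp requester state ("ReqEn+" or "ReqEn-") found in
# `lspci -vvv` output supplied on stdin.
atomics_req_state() {
    sed -n 's/.*AtomicOpsCtl:.*\(ReqEn[+-]\).*/\1/p' | head -n 1
}

# Print (do not execute) the manual workaround for the device at $1.
# Step 1 is done once and needs a reboot so amdgpu is not loaded at boot;
# steps 2 and 3 are then repeated after every boot of the VM.
# The blacklist file name is an assumption; any *.conf file under
# /etc/modprobe.d/ should do.
atomics_workaround_cmds() {
    bdf=$1
    echo "echo 'blacklist amdgpu' > /etc/modprobe.d/blacklist-amdgpu.conf"
    # Set bit 6 of Device Control 2 (PCIe capability + 0x28), leave the rest.
    echo "setpci -s $bdf CAP_EXP+28.w=0040:0040"
    echo "modprobe amdgpu"
}
```

Typical use would be `lspci -s 01:00.0 -vvv | atomics_req_state` to check the current state and `atomics_workaround_cmds 01:00.0` to list the commands. If amdgpu is loaded from the initramfs, the blacklist may additionally require regenerating the initramfs.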

By the way, it's quite unfortunate to have both

https://pve.proxmox.com/wiki/Pci_passthrough
and
https://pve.proxmox.com/wiki/PCI(e)_Passthrough

as documentation pages. They should be unified (and updated) IMHO.
 
