Installing a GPU breaks HBA passthrough

doctorhopps

Jul 20, 2024
System specs:

pveversion:
pve-manager/8.2.4/faa83925c9641325 (running kernel: 6.8.8-3-pve)
CPU: Intel Xeon E5-2683 v4
Mobo: Asus X99-A/USB 3.1
PCIe Peripherals: LSI SAS2308 HBA, MSI GTX 980 Ti GPU

Issue:

I have a VM running TrueNAS on this Proxmox install with an HBA passed through. It was working fine until I opened the case to install an old GPU for some tinkering. I moved the HBA to a different PCIe slot and updated the VM settings, but when I try to start the VM, it hangs at the BIOS and the following message floods dmesg:

vfio-pci 0000:02:00.0: BAR 0: can't reserve [io 0xd000-0xd0ff]
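
For anyone trying to reproduce this: one way to see what else sits in that I/O port window is to scan /proc/ioports for ranges that overlap 0xd000-0xd0ff. The snippet below is only a rough sketch. The function name `find_overlaps` is my own, and it assumes the usual "start-end : owner" layout with two spaces of indentation per nesting level. It needs to run as root, since the kernel shows zeroed addresses to unprivileged users.

```python
import os
import re

# Matches /proc/ioports lines of the form "<indent>start-end : owner".
LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.*)$")


def find_overlaps(ioports_text, start, end):
    """Return (depth, lo, hi, owner) for each range overlapping [start, end]."""
    hits = []
    for line in ioports_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        indent, lo, hi, owner = m.groups()
        lo, hi = int(lo, 16), int(hi, 16)
        # Two closed intervals overlap when each starts before the other ends.
        if lo <= end and hi >= start:
            # Assumption: nesting is shown with two spaces per level.
            hits.append((len(indent) // 2, lo, hi, owner.strip()))
    return hits


if __name__ == "__main__":
    path = "/proc/ioports"
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
        # Range taken from the dmesg message above.
        for depth, lo, hi, owner in find_overlaps(text, 0xD000, 0xD0FF):
            print(f"{'  ' * depth}{lo:04x}-{hi:04x} : {owner}")
```

If the window shows up under a different bridge or device with the GPU installed than without it, that would at least point at where the conflict comes from.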

I tried blacklisting all the drivers for both the GPU and the HBA, configuring vfio-pci to prioritize these devices, and disabling the GPU for the host machine (unfortunately, this is the machine's primary GPU). None of those seemed to help.
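
To double-check whether the blacklisting actually took effect, the bound driver can be read from sysfs: each PCI device directory has a `driver` symlink when something has claimed it. A minimal sketch, assuming the standard /sys/bus/pci/devices layout; the helper name `bound_driver` is made up for illustration, and the address in the example is simply the one from the dmesg line.

```python
import os


def bound_driver(address, root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to a PCI device, or None if unbound."""
    link = os.path.join(root, address, "driver")
    # The "driver" entry is a symlink that only exists while a driver is bound.
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Address from the dmesg message; add the GPU's functions as needed.
    for addr in ["0000:02:00.0"]:
        print(addr, "->", bound_driver(addr))
```

For passthrough, the devices in question would be expected to report vfio-pci here rather than a host driver.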

When I remove the GPU, everything starts working again, even with the HBA in a different slot, so I'm inclined to rule out the motherboard and the CPU.

Anyone know what might be happening? Do I need to try harder to make the GPU not bind to the host? Kernel shenanigans?
 
Oh right, thanks for the reminder, I felt like I had forgotten something.

The IOMMU groups were different when the GPU was installed, and the HBA was in its own group.
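
For reference, the grouping can be dumped by walking /sys/kernel/iommu_groups, where each numbered group directory has a `devices` subdirectory containing the PCI addresses. This is just a sketch of that walk; `iommu_groups` is a name I picked, and the `root` parameter exists only so it can be pointed at a different directory.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group name to the sorted list of device addresses in it."""
    groups = {}
    if not os.path.isdir(root):
        # IOMMU disabled or not exposed: nothing to list.
        return groups
    for group in os.listdir(root):
        devdir = os.path.join(root, group, "devices")
        if os.path.isdir(devdir):
            groups[group] = sorted(os.listdir(devdir))
    return groups


if __name__ == "__main__":
    groups = iommu_groups()
    # Group names are numeric on a real system, so sort them as integers.
    for name in sorted(groups, key=int):
        print(f"group {name}: {', '.join(groups[name])}")
```

Running it once with the GPU installed and once without makes the change in grouping easy to compare side by side.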
 
So I updated the system, which included kernel 6.8.12, and I still haven't had success.

I logged dmesg with and without the GPU. Nothing looks off except for some memory map layout messages and some ranges being different. Attaching in case folks see something I missed.
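
In case it helps anyone comparing the two logs: the timestamps make a plain diff noisy, so stripping them first narrows things down to the lines that really changed. A small sketch using Python's difflib; the function names are my own, the log file paths are passed on the command line, and it assumes the default `[ seconds.micros]` dmesg timestamp prefix.

```python
import difflib
import re
import sys

# Default dmesg timestamp prefix, e.g. "[    1.234567] ".
TS_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalize(log_text):
    """Drop the leading timestamp from every line so timing noise is ignored."""
    return [TS_RE.sub("", line) for line in log_text.splitlines()]


def diff_logs(without_gpu, with_gpu):
    """Return unified-diff lines (no context) between the two normalized logs."""
    return list(
        difflib.unified_diff(
            normalize(without_gpu),
            normalize(with_gpu),
            "without-gpu",
            "with-gpu",
            lineterm="",
            n=0,
        )
    )


if __name__ == "__main__":
    # Usage: python3 script.py <log-without-gpu> <log-with-gpu>
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as a, open(sys.argv[2]) as b:
            for line in diff_logs(a.read(), b.read()):
                print(line)
```

Lines mentioning BAR assignments or the 0000:02:00.0 address are probably the ones worth reading first in the output.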
 

Attachments