RTX 2070 GPU Pass-through - USB Host Controller Failure

NotQuiteAKing

New Member
Sep 4, 2024
I've been using Proxmox with a Windows 11 VM for the past year or so, with an RTX 2070 passed through. The card has a USB-C port that I plug a dongle into, and this was going great. A few months ago, though, the card's USB controller started failing randomly on startup. Rebooting the VM sometimes fixes it, but not always.

I tried a few things that didn't fix it, but it looks like a binding issue with xhci_hcd. The errors below came up when the controller failed to register; they did not appear when it registered correctly.

Code:
journalctl -r -p3
Sep 17 22:05:27 nas kernel: nvidia 0000:21:00.0: AER:   Error of this Agent is reported first
Sep 17 22:05:26 nas kernel: nvidia 0000:21:00.0: AER:   Error of this Agent is reported first
Sep 17 22:05:26 nas kernel: NVRM: GPU 0000:26:00.0 is already bound to vfio-pci.
Sep 17 22:05:24 nas kernel: mpt2sas_cm0: overriding NVDATA EEDPTagMode setting

Code:
dmesg
[   30.675710] xhci_hcd 0000:26:00.2: remove, state 4
[   30.676406] xhci_hcd 0000:26:00.2: USB bus 4 deregistered
[   30.676416] xhci_hcd 0000:26:00.2: remove, state 1
[   30.989480] xhci_hcd 0000:26:00.2: USB bus 3 deregistered
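
For anyone wanting to check which host driver currently holds the controller, here is a minimal sketch that reads the `driver` symlink in sysfs. It assumes the controller address 0000:26:00.2 from the dmesg output above; the function name `pci_driver` and the `SYSFS_ROOT` override are made up for illustration.

```shell
# Print the driver currently bound to a PCI device, or "none".
# SYSFS_ROOT is a hypothetical override so the function can be tried
# against a fake tree; on a real host it defaults to /sys.
pci_driver() {
    dev="$1"
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

pci_driver 0000:26:00.2
```

If this prints xhci_hcd before the VM starts, the host has claimed the controller ahead of vfio-pci. `lspci -nnk -s 26:00.2` should show the same information in its "Kernel driver in use" line.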

Grub:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on initcall_blacklist=sysfb_init vfio-pci.ids=10de:1f07,10de:10f9,10de:1ada,10de:1adb"

Code:
/etc/modprobe.d/blacklist.conf
blacklist nouveau
#blacklist nvidia*

Code:
/etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1f07,10de:10f9,10de:1ada,10de:1adb disable_vga=1

I was thinking of blacklisting xhci_hcd, but that might interfere with the host's other USB controllers.
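
A less drastic option than blacklisting, commonly used in VFIO setups, is a softdep that asks modprobe to load vfio-pci before the xHCI PCI driver, so that vfio-pci can claim the listed IDs first. This is only a sketch under assumptions: it has an effect only if xhci_pci is built as a module rather than compiled into the kernel (not verified here for the Proxmox kernel), and it has not been tested against this particular problem.

```
# /etc/modprobe.d/vfio.conf (additional line, untested assumption)
# Load vfio-pci before xhci_pci so the IDs in "options vfio-pci ids=..."
# are claimed before the USB host driver binds to them.
softdep xhci_pci pre: vfio-pci
```

After changing anything under /etc/modprobe.d, the initramfs normally has to be rebuilt (`update-initramfs -u -k all`) and the host rebooted for the change to apply.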

To note: there are no issues with the card itself, just the USB controller. I also have a P600 passed through to another VM running Plex, which is why I'm not blacklisting nvidia.
 
Kernel version 6.8.12 broke USB passthrough for me inside a VM, which I would not expect because I made sure that Proxmox does not touch the USB controller before the VM starts. Maybe you can try with Proxmox booting with kernel version 6.8.4?
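
A rough sketch of how the older kernel could be selected, assuming a 6.8.4 kernel package is still installed on the host. `KVER` is a placeholder, not a real version string: the exact name has to come from the list output.

```shell
# KVER is a placeholder: substitute the exact 6.8.4 version string
# that "proxmox-boot-tool kernel list" prints on your host.
KVER="<6.8.4 version from the list>"

if command -v proxmox-boot-tool >/dev/null 2>&1; then
    proxmox-boot-tool kernel list      # show the kernels available to boot
    echo "to pin:   proxmox-boot-tool kernel pin $KVER"
    echo "to unpin: proxmox-boot-tool kernel unpin"
else
    echo "proxmox-boot-tool not found; run this on the Proxmox host"
fi
```

The script only prints the pin command rather than running it, so the version can be checked first. A pinned kernel stays the boot default across reboots until it is unpinned.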