Thunderbolt 3 issues on X570 with Titan Ridge AIC

tristank

Apr 24, 2020
I'm running an ASRock X570 Taichi with a Gigabyte Titan Ridge AIC on Proxmox 6.1-8 with a Ryzen 9 3900X. I want to use the system as a workstation, so I followed the instructions in the wiki and installed the GNOME desktop environment.

$ inxi -F
System:    Host: pve-workstation Kernel: 5.3.18-3-pve x86_64 bits: 64 Console: tty 5
           Distro: Debian GNU/Linux 10 (buster)
Machine:   Type: Desktop Mobo: ASRock model: X570 Taichi serial: <root required>
           UEFI: American Megatrends v: P2.10 date: 09/09/2019
CPU:       Topology: 12-Core model: AMD Ryzen 9 3900X bits: 64 type: MT MCP L2 cache: 6144 KiB
           Speed: 3585 MHz min/max: 2200/3800 MHz Core speeds (MHz): 1: 3607 2: 3593 3: 3597 4: 3603
           5: 4004 6: 3963 7: 3622 8: 3968 9: 3967 10: 3658 11: 3968 12: 3620 13: 3596 14: 3859
           15: 4025 16: 3817 17: 3988 18: 3585 19: 3592 20: 3593 21: 3601 22: 3599 23: 3594 24: 3591
Graphics:  Device-1: NVIDIA GP107GL [Quadro P400] driver: nvidia v: 440.82
           Display: tty server: X.org 1.20.4 driver: nvidia unloaded: fbdev,modesetting,nouveau,vesa tty: 211x56
           Message: Advanced graphics data unavailable in console. Try -G --display
Audio:     Device-1: NVIDIA GP107GL High Definition Audio driver: snd_hda_intel
           Device-2: Advanced Micro Devices [AMD] Starship/Matisse HD Audio driver: snd_hda_intel
           Sound Server: ALSA v: k5.3.18-3-pve
Network:   Device-1: Intel driver: iwlwifi
           IF: wlp38s0 state: down mac: fe:81:48:eb:2d:79
           Device-2: Intel I211 Gigabit Network driver: igb
           IF: enp40s0 state: up speed: 1000 Mbps duplex: full mac: 70:85:c2:dd:60:1a
           IF-ID-1: vmbr0 state: up speed: N/A duplex: N/A mac: 70:85:c2:dd:60:1a
Drives:    Local Storage: total: 4.10 TiB used: 32.45 GiB (0.8%)
           ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 PRO 512GB size: 476.94 GiB
RAID:      Device-1: rpool type: zfs status: ONLINE raid: no-raid size: 424.00 GiB free: 339.00 GiB
           Components: online: nvme0n1p3
Partition: ID-1: / size: 357.92 GiB used: 32.45 GiB (9.1%) fs: zfs raid: rpool/ROOT/pve-1
Sensors:   System Temperatures: cpu: 53.8 C mobo: N/A
           Fan Speeds (RPM): N/A
Info:      Processes: 724 Uptime: 31m Memory: 62.74 GiB used: 5.10 GiB (8.1%) Init: systemd runlevel: 5 Shell: bash

[ 2055.227299] thunderbolt 0000:05:00.0: enabling device (0000 -> 0002)
[ 2055.289185] pci 0000:15:00.0: enabling device (0000 -> 0002)
[ 2055.289331] xhci_hcd 0000:15:00.0: xHCI Host Controller
[ 2055.289334] xhci_hcd 0000:15:00.0: new USB bus registered, assigned bus number 1
[ 2055.290502] xhci_hcd 0000:15:00.0: hcc params 0x200077c1 hci version 0x110 quirks 0x0000000200009810
[ 2055.290752] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.03
[ 2055.290753] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 2055.290754] usb usb1: Product: xHCI Host Controller
[ 2055.290755] usb usb1: Manufacturer: Linux 5.3.18-3-pve xhci-hcd
[ 2055.290756] usb usb1: SerialNumber: 0000:15:00.0
[ 2055.290831] hub 1-0:1.0: USB hub found
[ 2055.290838] hub 1-0:1.0: 2 ports detected

I'm having issues with the Thunderbolt controller. For some reason, a new USB bus is registered every few seconds. Here is a GitHub gist of boltctl monitor and dmesg -w. I have no clue what this is about. I would appreciate some help.
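To put a number on "every few seconds", a small helper along these lines could count the registrations per controller in a saved log. This is only a sketch: the function name count_usb_bus_registrations is made up for illustration, and it assumes the log lines look like the dmesg excerpt above (driver name xhci_hcd followed by the PCI address).

```shell
#!/bin/sh
# Hypothetical helper: count "new USB bus registered" messages per xHCI device.
# Reads dmesg-style text on stdin and prints one "<count> <pci-address>" line
# for each controller that appears in such a message.
count_usb_bus_registrations() {
    awk '/new USB bus registered/ {
        # The PCI address is the field right after the driver name.
        for (i = 1; i < NF; i++) {
            if ($i == "xhci_hcd") {
                dev = $(i + 1)
                sub(/:$/, "", dev)   # drop the trailing colon
                count[dev]++
                break
            }
        }
    }
    END { for (dev in count) print count[dev], dev }'
}
```

It could be run as "count_usb_bus_registrations < dmesg-w.log" or fed from "dmesg" directly. A count that keeps growing for the same address would confirm that one controller is being re-enumerated repeatedly rather than several controllers appearing once each.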
 

Attachments

  • boltctl-monitor.log (47.9 KB)
  • dmesg-w.log (49.3 KB)
I'm also trying to get Thunderbolt (TB) working in a VM. I'm using a Gigabyte (GB) Titan Ridge PCIe card on a GB TRX40 Designare mobo (presently in the large slot farthest from the CPU). First off, I'm having no trouble passing through USB, Ethernet or SATA controllers, nor any problems passing through a GPU or NVMe SSD; I seem to understand the basic techniques.

However, I am having trouble passing through the whole TB tree. I see it broken down into 7 addresses, not the one you referenced (but this may be due to what info we each requested through Linux commands). That is, when running "lspci -nnk", I get the output listed further down. From this you can see that only 4b:00 and 53:00 are being passed, with vfio-pci substituted for the TB kernel driver. Do you see anything similar for your TB section?

But the problem is, I can only pass through 2 of the 7 addresses (NHI and USB); without the other parts, there is no TB functionality.

49:00.0 PCI bridge [0604]: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] [8086:15ea] (rev 06)
	Kernel driver in use: pcieport
4a:00.0 PCI bridge [0604]: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] [8086:15ea] (rev 06)
	Kernel driver in use: pcieport
4a:01.0 PCI bridge [0604]: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] [8086:15ea] (rev 06)
	Kernel driver in use: pcieport
4a:02.0 PCI bridge [0604]: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] [8086:15ea] (rev 06)
	Kernel driver in use: pcieport
4a:04.0 PCI bridge [0604]: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] [8086:15ea] (rev 06)
	Kernel driver in use: pcieport
4b:00.0 System peripheral [0880]: Intel Corporation JHL7540 Thunderbolt 3 NHI [Titan Ridge 4C 2018] [8086:15eb] (rev 06)
	Subsystem: Device [2222:1111]
	Kernel driver in use: vfio-pci
	Kernel modules: thunderbolt
53:00.0 USB controller [0c03]: Intel Corporation JHL7540 Thunderbolt 3 USB Controller [Titan Ridge 4C 2018] [8086:15ec] (rev 06)
	Subsystem: Device [2222:1111]
	Kernel driver in use: vfio-pci
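For anyone comparing their own card against the listing above, the address-to-driver mapping can be condensed with a short filter. This is a rough sketch: list_pci_drivers is a made-up name, and it assumes the default "lspci -nnk" address format (bus:device.function without a domain prefix).

```shell
#!/bin/sh
# Hypothetical helper: reduce `lspci -nnk` output (stdin) to "<address> <driver>".
# Devices that have no "Kernel driver in use" line are printed with "-".
list_pci_drivers() {
    awk '
        # A device header starts with an address such as "4b:00.0 ".
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / {
            if (addr != "") print addr, drv
            addr = $1
            drv = "-"
            next
        }
        # The bound driver is the last word of this line.
        /Kernel driver in use:/ { drv = $NF }
        END { if (addr != "") print addr, drv }
    '
}
```

Running something like "lspci -nnk | list_pci_drivers" and filtering for the Titan Ridge addresses should make it easy to see which parts are on vfio-pci and which are still on pcieport.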

If "hostpciX: 49:00.0" or "hostpciX: 4a:00" is used in the VM's config file, the VM will crash, giving an error such as "vfio 0000:49:00.0: failed to open /dev/vfio/58: No such file or directory".

Examining the directory /dev/vfio/ shows the following files: 13 16 30 31 33 52 53 54 56 57 63 64 66 77 vfio. These are all of the IOMMU groups for the addresses which have been passed, including the two TB sections: 63 for 4b:00, 64 for 53:00.

The IOMMU groups present on my mobo can be viewed with this command: "find /sys/kernel/iommu_groups/ -type l", the results of which are listed at the end of this post. The groups for the un-passable addresses are here: 58 for 49:00, 59 for 4a:00, 60 for 4a:01, 61 for 4a:02, and 62 for 4a:04.

The missing /dev/vfio/ entries for these IOMMU groups seem to be what leads to the error noted above, which makes me think that at least part of the reason I cannot pass the TB device is Proxmox's failure to create those files in /dev/vfio/. (BTW, I've tried blacklisting not only thunderbolt but also pcieport, as well as running an un-binding routine on boot, both without success.)
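The comparison between IOMMU groups and /dev/vfio/ entries can be scripted roughly as follows. This is a sketch only: check_vfio_nodes is a made-up name, and SYSFS_ROOT and VFIO_DIR are hypothetical override variables added so the function can be tried against a copy of the tree. It relies on /sys/bus/pci/devices/<address>/iommu_group being a symlink whose last path component is the group number, and it only reports state without changing any driver bindings.

```shell
#!/bin/sh
# Hypothetical helper: for each full PCI address (e.g. 0000:49:00.0) given as
# an argument, print its IOMMU group and whether /dev/vfio/<group> exists.
# SYSFS_ROOT and VFIO_DIR are illustrative overrides, not standard variables.
check_vfio_nodes() {
    sysfs=${SYSFS_ROOT:-/sys}
    vfio=${VFIO_DIR:-/dev/vfio}
    for addr in "$@"; do
        link="$sysfs/bus/pci/devices/$addr/iommu_group"
        if [ ! -L "$link" ]; then
            echo "$addr no-iommu-group"
            continue
        fi
        # The symlink target ends in the group number.
        group=$(basename "$(readlink "$link")")
        if [ -e "$vfio/$group" ]; then
            state=present
        else
            state=missing
        fi
        echo "$addr group=$group vfio-node=$state"
    done
}
```

On the host it would be called with the seven Titan Ridge addresses, for example "check_vfio_nodes 0000:49:00.0 0000:4a:00.0 0000:4b:00.0". Going by the numbers in this post, the bridges would be expected to show up as missing and 4b:00.0 and 53:00.0 as present.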

I'd like to hear how to fix this, or what I'm doing incorrectly. (And I don't know if this is part of your problem with TB too.)

/sys/kernel/iommu_groups/55/devices/0000:43:00.0
/sys/kernel/iommu_groups/17/devices/0000:04:00.3
/sys/kernel/iommu_groups/45/devices/0000:40:07.1
/sys/kernel/iommu_groups/73/devices/0000:60:07.1
/sys/kernel/iommu_groups/35/devices/0000:25:00.3
/sys/kernel/iommu_groups/7/devices/0000:00:07.0
/sys/kernel/iommu_groups/63/devices/0000:4b:00.0
/sys/kernel/iommu_groups/25/devices/0000:20:07.1
/sys/kernel/iommu_groups/53/devices/0000:47:00.0
/sys/kernel/iommu_groups/53/devices/0000:42:09.0
/sys/kernel/iommu_groups/15/devices/0000:03:00.0
/sys/kernel/iommu_groups/43/devices/0000:40:05.0
/sys/kernel/iommu_groups/71/devices/0000:60:05.0
/sys/kernel/iommu_groups/33/devices/0000:25:00.0
/sys/kernel/iommu_groups/5/devices/0000:00:04.0
/sys/kernel/iommu_groups/61/devices/0000:4a:02.0
/sys/kernel/iommu_groups/23/devices/0000:20:05.0
/sys/kernel/iommu_groups/51/devices/0000:42:06.0
/sys/kernel/iommu_groups/13/devices/0000:01:00.0
/sys/kernel/iommu_groups/41/devices/0000:40:03.0
/sys/kernel/iommu_groups/31/devices/0000:23:00.1
/sys/kernel/iommu_groups/3/devices/0000:00:02.0
/sys/kernel/iommu_groups/21/devices/0000:20:03.1
/sys/kernel/iommu_groups/11/devices/0000:00:14.3
/sys/kernel/iommu_groups/11/devices/0000:00:14.0
/sys/kernel/iommu_groups/68/devices/0000:60:02.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.1
/sys/kernel/iommu_groups/58/devices/0000:49:00.0
/sys/kernel/iommu_groups/48/devices/0000:41:00.0
/sys/kernel/iommu_groups/76/devices/0000:61:00.0
/sys/kernel/iommu_groups/38/devices/0000:40:01.1
/sys/kernel/iommu_groups/66/devices/0000:5c:00.0
/sys/kernel/iommu_groups/28/devices/0000:21:00.0
/sys/kernel/iommu_groups/56/devices/0000:44:00.0
/sys/kernel/iommu_groups/18/devices/0000:20:01.0
/sys/kernel/iommu_groups/46/devices/0000:40:08.0
/sys/kernel/iommu_groups/74/devices/0000:60:08.0
/sys/kernel/iommu_groups/36/devices/0000:25:00.4
/sys/kernel/iommu_groups/8/devices/0000:00:07.1
/sys/kernel/iommu_groups/64/devices/0000:53:00.0
/sys/kernel/iommu_groups/26/devices/0000:20:08.0
/sys/kernel/iommu_groups/54/devices/0000:48:00.0
/sys/kernel/iommu_groups/54/devices/0000:42:0a.0
/sys/kernel/iommu_groups/16/devices/0000:04:00.0
/sys/kernel/iommu_groups/44/devices/0000:40:07.0
/sys/kernel/iommu_groups/72/devices/0000:60:07.0
/sys/kernel/iommu_groups/34/devices/0000:25:00.1
/sys/kernel/iommu_groups/6/devices/0000:00:05.0
/sys/kernel/iommu_groups/62/devices/0000:4a:04.0
/sys/kernel/iommu_groups/24/devices/0000:20:07.0
/sys/kernel/iommu_groups/52/devices/0000:46:00.3
/sys/kernel/iommu_groups/52/devices/0000:46:00.1
/sys/kernel/iommu_groups/52/devices/0000:46:00.0
/sys/kernel/iommu_groups/52/devices/0000:42:08.0
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/42/devices/0000:40:04.0
/sys/kernel/iommu_groups/70/devices/0000:60:04.0
/sys/kernel/iommu_groups/32/devices/0000:24:00.0
/sys/kernel/iommu_groups/4/devices/0000:00:03.0
/sys/kernel/iommu_groups/60/devices/0000:4a:01.0
/sys/kernel/iommu_groups/22/devices/0000:20:04.0
/sys/kernel/iommu_groups/50/devices/0000:42:05.0
/sys/kernel/iommu_groups/12/devices/0000:00:18.3
/sys/kernel/iommu_groups/12/devices/0000:00:18.1
/sys/kernel/iommu_groups/12/devices/0000:00:18.6
/sys/kernel/iommu_groups/12/devices/0000:00:18.4
/sys/kernel/iommu_groups/12/devices/0000:00:18.2
/sys/kernel/iommu_groups/12/devices/0000:00:18.0
/sys/kernel/iommu_groups/12/devices/0000:00:18.7
/sys/kernel/iommu_groups/12/devices/0000:00:18.5
/sys/kernel/iommu_groups/40/devices/0000:40:02.0
/sys/kernel/iommu_groups/69/devices/0000:60:03.0
/sys/kernel/iommu_groups/30/devices/0000:23:00.0
/sys/kernel/iommu_groups/2/devices/0000:00:01.2
/sys/kernel/iommu_groups/59/devices/0000:4a:00.0
/sys/kernel/iommu_groups/20/devices/0000:20:03.0
/sys/kernel/iommu_groups/49/devices/0000:42:03.0
/sys/kernel/iommu_groups/77/devices/0000:62:00.0
/sys/kernel/iommu_groups/10/devices/0000:00:08.1
/sys/kernel/iommu_groups/39/devices/0000:40:01.3
/sys/kernel/iommu_groups/67/devices/0000:60:01.0
/sys/kernel/iommu_groups/29/devices/0000:22:00.0
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/57/devices/0000:45:00.0
/sys/kernel/iommu_groups/19/devices/0000:20:02.0
/sys/kernel/iommu_groups/47/devices/0000:40:08.1
/sys/kernel/iommu_groups/75/devices/0000:60:08.1
/sys/kernel/iommu_groups/37/devices/0000:40:01.0
/sys/kernel/iommu_groups/9/devices/0000:00:08.0
/sys/kernel/iommu_groups/65/devices/0000:5b:00.0
/sys/kernel/iommu_groups/27/devices/0000:20:08.1
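The raw find output above comes out in no particular order, so a small filter like the one below might make it easier to read. It is a sketch under the assumption that every line has the form /sys/kernel/iommu_groups/<group>/devices/<address>; the name iommu_groups_sorted is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn `find /sys/kernel/iommu_groups/ -type l` output
# (stdin) into "<group> <address>" lines, sorted numerically by group and
# then by address within a group.
iommu_groups_sorted() {
    # Split on "/": field 5 is the group number, field 7 the PCI address.
    awk -F/ '$4 == "iommu_groups" && $6 == "devices" { print $5, $7 }' |
        sort -k1,1n -k2,2
}
```

For example, "find /sys/kernel/iommu_groups/ -type l | iommu_groups_sorted" lists the groups in ascending order, which puts the Titan Ridge groups 58 to 64 next to each other.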
 