Best approach for running TrueNAS and VM with (i)GPU passthrough

AndrewSpec

Member
Dec 24, 2021
I'm not sure if I can solve my problem.
I recently bought the DeskMeet X300 case with a Ryzen 7 4750G, 64 GB of ECC RAM and 2x12TB HDDs.

My goal was to virtualize TrueNAS with HDD passthrough and another VM with Batocera with iGPU passthrough for emulation.

The problem is that when I run the VM with GPU passthrough, the TrueNAS VM crashes because of a disk I/O error. I can't get the SATA controller passthrough working...

Can I use an external HDD bay like https://www.orico.cc/us/product/detail/6764.html ? I'm not sure how the disks will be recognized via USB. The best way would be to pass the whole USB device through to TrueNAS.
I have a free PCIe slot and could maybe put an NVIDIA GPU there for passthrough to Batocera, but the iGPU is so powerful (for an iGPU) that I'd like to use it for something.
 
The problem is that when I run the VM with GPU passthrough, the TrueNAS VM crashes because of a disk I/O error.
You cannot share devices from the same group between VMs and/or the Proxmox host. Most likely everything is in the same big "chipset group" unless you have an X570 motherboard. Check your IOMMU groups with:
Code:
for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
I can't get the SATA controller passthrough working...
Is this a separate issue from the two VMs using devices from the same group? Or just the symptom of starting both VMs?
Can I use an external HDD bay like https://www.orico.cc/us/product/detail/6764.html ? I'm not sure how the disks will be recognized via USB. The best way would be to pass the whole USB device through to TrueNAS.
USB passthrough does not work well for high-bandwidth or low-latency devices. Or are you passing through a USB controller (and might run into the same IOMMU grouping problem)?
I have a free PCIe slot and could maybe put an NVIDIA GPU there for passthrough to Batocera, but the iGPU is so powerful (for an iGPU) that I'd like to use it for something.
Often such a slot is also part of the big "chipset group" and you run into the same issue, except for one x16 PCIe slot and one M.2 slot physically connected to the CPU.
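The listing command suggested in this reply walks the groups in glob order, so on a host with ten or more groups, group 10 is printed before group 2. A minimal sketch that sorts numerically instead; the function names iommu_group_of and list_iommu_groups and the optional root argument are illustrative and not part of any Proxmox tool.

```shell
# Sketch: list PCI devices per IOMMU group, sorted numerically by group.
# Function names and the optional sysfs-root argument are illustrative.

# Extract the group number from a path such as
# /sys/kernel/iommu_groups/2/devices/0000:03:00.0
iommu_group_of() {
    local dir="${1%/devices/*}"       # -> /sys/kernel/iommu_groups/2
    printf '%s\n' "${dir##*/}"        # -> 2
}

# Print "group address" pairs, sorted by group number, then by address.
list_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}" d
    for d in "$root"/*/devices/*; do
        [ -e "$d" ] || continue       # glob matched nothing (IOMMU disabled?)
        printf '%s %s\n' "$(iommu_group_of "$d")" "${d##*/}"
    done | sort -k1,1n -k2,2
}

# On a real host, add the lspci description to every line.
if command -v lspci >/dev/null 2>&1; then
    list_iommu_groups | while read -r group addr; do
        printf 'IOMMU group %s ' "$group"
        lspci -nns "$addr"
    done
fi
```

Devices that appear under the same group number can only be passed through together, to a single VM.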
 
You cannot share devices from the same group between VMs and/or the Proxmox host. Most likely everything is in the same big "chipset group" unless you have an X570 motherboard. Check your IOMMU groups with:
Code:
for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
Code:
IOMMU group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
IOMMU group 1 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
IOMMU group 1 00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge [1022:1634]
IOMMU group 1 00:02.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge [1022:1634]
IOMMU group 1 01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd Device [144d:a809]
IOMMU group 1 02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
IOMMU group 2 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
IOMMU group 2 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus [1022:1635]
IOMMU group 2 00:08.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus [1022:1635]
IOMMU group 2 03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d8)
IOMMU group 2 03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]
IOMMU group 2 03:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
IOMMU group 2 03:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1 [1022:1639]
IOMMU group 2 03:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1 [1022:1639]
IOMMU group 2 03:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller [1022:15e3]
IOMMU group 2 04:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 81)
IOMMU group 3 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 51)
IOMMU group 3 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 4 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 0 [1022:1448]
IOMMU group 4 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 1 [1022:1449]
IOMMU group 4 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 2 [1022:144a]
IOMMU group 4 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 3 [1022:144b]
IOMMU group 4 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 4 [1022:144c]
IOMMU group 4 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 5 [1022:144d]
IOMMU group 4 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 6 [1022:144e]
IOMMU group 4 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir Device 24: Function 7 [1022:144f]

Is this a separate issue from the two VMs using devices from the same group? Or just the symptom of starting both VMs?
It's a separate issue - it was not working from the beginning.

USB passthrough does not work well for high-bandwidth or low-latency devices. Or are you passing through a USB controller (and might run into the same IOMMU grouping problem)?
Yup, that's the case. My first thought was that I could buy a USB SATA controller and pass it through to TrueNAS, but it may end up in the same IOMMU group...

Often such a slot is also part of the big "chipset group" and you run into the same issue, except for one x16 PCIe slot and one M.2 slot physically connected to the CPU.
It's an x16 PCIe slot, so it could work.
 
The SATA controller and the GPU (VGA device) are in the same group.
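This can also be checked directly for a pair of devices. A minimal sketch, assuming the addresses from the listing (iGPU at 0000:03:00.0, SATA controller at 0000:04:00.0) and the iommu_group symlink that the kernel exposes for each PCI device in sysfs; group_of_device and same_iommu_group are hypothetical helper names.

```shell
# Sketch: report whether two PCI devices share an IOMMU group.
# Helper names and the optional sysfs-root argument are illustrative.

# Resolve the group number of one device from its iommu_group symlink,
# whose target ends in .../kernel/iommu_groups/<number>.
group_of_device() {
    local addr="$1" root="${2:-/sys/bus/pci/devices}" target
    target=$(readlink "$root/$addr/iommu_group") || return 1
    printf '%s\n' "${target##*/}"
}

# Returns 0 if both devices share a group, 1 if not,
# 2 if either group could not be read.
same_iommu_group() {
    local a b
    a=$(group_of_device "$1" "$3") || return 2
    b=$(group_of_device "$2" "$3") || return 2
    [ "$a" = "$b" ]
}

# Addresses taken from the listing in this thread.
rc=0
same_iommu_group 0000:03:00.0 0000:04:00.0 || rc=$?
case "$rc" in
    0) echo "same group: these devices cannot go to different VMs" ;;
    1) echo "separate groups" ;;
    *) echo "could not read the group (IOMMU disabled or device missing?)" ;;
esac
```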
It's a separate issue - it was not working from the beginning.
Those on-board SATA controllers often don't reset properly and don't work with passthrough. You're not the first on this forum (search this forum for that particular SATA controller)...
Yup, that's the case. My first thought was that I could buy a USB SATA controller and pass it through to TrueNAS, but it may end up in the same IOMMU group...
Are you talking about USB passthrough or PCIe passthrough of a USB controller? USB passthrough is not ideal and you don't have a USB controller in a separate group.
It's an x16 PCIe slot, so it could work.
Do you have a link to the motherboard manual? Only the x16 slot closest to the CPU will be in a separate group (and work at x8 because of the integrated graphics).

If you don't care about the secure isolation between VMs and/or the Proxmox host, you could consider using the built-in pcie_acs_override to split (and lie about) the IOMMU groups.
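A sketch of how the override could be enabled and verified, assuming a stock Proxmox VE kernel (which carries the ACS override patch) and the value downstream,multifunction that is reported later in this thread. The function name acs_override_value is made up for illustration.

```shell
# Sketch: check whether the running kernel was booted with pcie_acs_override.
#
# To enable it on a host that boots via GRUB (assumption), add e.g.
#   pcie_acs_override=downstream,multifunction
# to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, run update-grub
# and reboot. Hosts that boot via systemd-boot keep the command line in
# /etc/kernel/cmdline and apply it with: proxmox-boot-tool refresh

# Read a kernel command line on stdin and print the override value, if any.
acs_override_value() {
    tr ' ' '\n' | sed -n 's/^pcie_acs_override=//p'
}

if [ -r /proc/cmdline ]; then
    value=$(acs_override_value < /proc/cmdline)
    if [ -n "$value" ]; then
        echo "pcie_acs_override is active: $value"
    else
        echo "pcie_acs_override is not set on the running kernel"
    fi
fi
```

With the override active, the reported groups no longer reflect the real hardware isolation, so devices in "separate" groups may still be able to reach each other.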
 
If you don't care about the secure isolation between VMs and/or the Proxmox host, you could consider using the built-in pcie_acs_override to split (and lie about) the IOMMU groups.
Yup, probably your only option, as you can't add a dedicated GPU + HBA card for PCI passthrough.
 
Those on-board SATA controllers often don't reset properly and don't work with passthrough. You're not the first on this forum (search this forum for that particular SATA controller)...
I've seen those topics.

Are you talking about USB passthrough or PCIe passthrough of a USB controller? USB passthrough is not ideal and you don't have a USB controller in a separate group.
I was thinking more about a SATA controller attached via USB.
Do you have a link to the motherboard manual? Only the x16 slot closest to the CPU will be in a separate group (and work at x8 because of the integrated graphics).
Here you go: https://download.asrock.com/Manual/X300-ITX.pdf

If you don't care about the secure isolation between VMs and/or the Proxmox host, you could consider using the built-in pcie_acs_override to split (and lie about) the IOMMU groups.
Well... I don't care that much. With pcie_acs_override I get a separate group for every device, but I still can't pass through the iGPU. I see that the module is still loading:
Code:
IOMMU group 11 03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d8)
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
 
Well... I don't care that much. With pcie_acs_override I get a separate group for every device, but I still can't pass through the iGPU. I see that the module is still loading:
Code:
IOMMU group 11 03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d8)
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636]
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu
If this is before starting the VM then it's fine because amdgpu is not loaded but vfio-pci is.
Since you are passing through the GPU that is used during boot, you need this work-around since kernel 5.15. You might also need to install and activate vendor-reset.
Look for the other threads here about passthrough issues with integrated AMD APUs as you might need more tricks to get it to work inside a VM.
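The driver check described in this reply can be scripted. A minimal sketch, assuming the iGPU sits at 03:00.0 as in the listing; driver_in_use is a hypothetical helper name. It only reads the "Kernel driver in use" line, since the "Kernel modules" line merely names modules that could drive the device.

```shell
# Sketch: confirm that vfio-pci, not amdgpu, is bound to the iGPU
# before the VM is started. The helper name is illustrative.

# Read `lspci -k` style output on stdin and print the bound driver, if any.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# 03:00.0 is the iGPU address from the listing in this thread.
if command -v lspci >/dev/null 2>&1; then
    drv=$(lspci -nnk -s 03:00.0 | driver_in_use)
    if [ "$drv" = "vfio-pci" ]; then
        echo "iGPU is bound to vfio-pci"
    else
        echo "iGPU is bound to: ${drv:-no driver}"
    fi
fi
```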
 
OK, so after a few days I had some more time to play with it. With pcie_acs_override=downstream,multifunction I was able to see the iGPU inside the VM, and booting that VM no longer affected the other VMs.
On my Windows VM the iGPU was detected, but when I tried to install drivers for it the system crashed with a BSOD.
On my Ubuntu VM the iGPU was detected, but when I changed the default graphics card to none, the VM booted while the external display stayed black: no video output from the VM.
 
