SOLUTION:
VM config:
- System -> Machine : q35
- System -> BIOS: OVMF (UEFI)
- System -> Pre-Enroll keys: UN-ticked (!)
PCIe Pass-through:
- VM -> Hardware -> PCI Device -> Raw Device -> Select one of the IDs, then tick "All Functions" and tick "PCI-Express"
//OR
- Datacenter -> Resource Mappings -> Add -> Select ID with "Pass through all functions as one device" in Device column, then
- VM -> Hardware -> PCI Device -> Mapped Device -> Select the mapped device from previous step, also tick "PCI-Express"
Start the VM. The host should release the kernel driver and bind the device to vfio-pci, and the device should then be fully available in the VM.
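For anyone who prefers the shell over the GUI, the steps above can roughly be sketched with qm set. This is a dry-run sketch that only prints the commands. The VM ID 100, the storage name local-lvm and the PCI address 0000:82:00 (taken from the lspci output in the original post) are assumptions, so adjust them before running anything. The efidisk0 line allocates a new EFI disk, so it only applies to a VM that does not have one yet.

```shell
#!/bin/sh
# Dry-run sketch: prints the qm commands instead of executing them.
# VMID, STORAGE and PCI_ADDR are assumptions; adjust them to your setup.
VMID="${VMID:-100}"
STORAGE="${STORAGE:-local-lvm}"
PCI_ADDR="${PCI_ADDR:-0000:82:00}"   # no ".N" suffix = all functions

print_passthrough_cmds() {
    # q35 machine type and OVMF (UEFI) firmware
    echo "qm set $VMID --machine q35 --bios ovmf"
    # New EFI disk with "Pre-Enroll keys" un-ticked
    echo "qm set $VMID --efidisk0 $STORAGE:1,efitype=4m,pre-enrolled-keys=0"
    # Whole device (all functions) with the PCI-Express flag
    echo "qm set $VMID --hostpci0 $PCI_ADDR,pcie=1"
}

print_passthrough_cmds
```

For the Resource Mappings variant, the last command would use mapping=NAME,pcie=1 instead of the raw address, where NAME is whatever the mapping was called in the Datacenter view.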
ORIGINAL POST:
Hello Proxmox gurus! I am at a dead end with the following, so I am hoping someone can chime in with a magical idea/solution.
I have a Dell R630 server with an additional PCIe NIC, a Chelsio Dual Port T520-CR 10GbE (0J6VY6), which I'd like to fully pass through to one of my (Linux) VMs.
I am running PVE 9.0.10 on ZFS.
(Details about IOMMU and other stuff at the bottom)
I see these under PCI Device -> Raw Device:

- If I pass through any of the four red ones, the VM boots fine but no NICs are available in the VM
- If I pass through the fifth (green) one, the VM boots fine and I get two NICs available in the VM - technically OK but not the desired state


- The blue ones are not NICs, so I ignore these
//EDIT: If I pass through one of these, the VM won't boot either. So this is what causes the issue, but why? //EDIT
BUT! What I'd like to do is pass through the whole PCIe device to the VM. The problem is that when I do so, the VM does not boot and gets stuck on this:

I tried both ways and the result is the same: the VM won't boot and I have to hard-stop it.
- VM -> Hardware -> PCI Device -> All Functions
- Datacenter -> Resource Mappings -> Pass through all functions as one device


Also, while in this "limbo" state, the node summary shows 1 CPU and all memory as fully utilized (could be misleading though)

Things I tried from various sources, threads, guides, etc. ... Note that none of this helped (I wouldn't be writing this love story otherwise)
- Set ROM-Bar = 0 (Un-ticked the option)
- Use q35 machine type instead of the default i440fx
- Use OVMF (UEFI) bios instead of SeaBIOS
- Use x86-64-v2-AES processor type (it is the default value these days, but I found an old thread with a similar case)
Details about the host machine:
- Dell PowerEdge R630 with 2x Intel Xeon E5-2640 v4
- BIOS updated to most-recent available version
- Following BIOS options are enabled
-- Virtualization Technology
-- X2Apic Mode
-- SR-IOV Global Enable
- I have NOT added the GRUB entry intel_iommu=on since this is on by default in recent kernels (as per the Proxmox docs)
- I see 90+ IOMMU groups when running the following, and together with the fact that I can pass through the ports via the one ID, I assume IOMMU works as expected. But for the sake of completeness...
for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
I will not paste all 100 lines, but here is the | grep Ethernet piece of it:
IOMMU group 24 01:00.0 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
IOMMU group 25 01:00.1 Ethernet controller [0200]: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb] (rev 01)
IOMMU group 27 08:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
IOMMU group 28 08:00.1 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
IOMMU group 4 81:00.0 Ethernet controller [0200]: Intel Corporation 82574L Gigabit Network Connection [8086:10d3]
IOMMU group 5 82:00.0 Ethernet controller [0200]: Chelsio Communications Inc T520-CR Unified Wire Ethernet Controller [1425:5001]
IOMMU group 5 82:00.1 Ethernet controller [0200]: Chelsio Communications Inc T520-CR Unified Wire Ethernet Controller [1425:5001]
IOMMU group 5 82:00.2 Ethernet controller [0200]: Chelsio Communications Inc T520-CR Unified Wire Ethernet Controller [1425:5001]
IOMMU group 5 82:00.3 Ethernet controller [0200]: Chelsio Communications Inc T520-CR Unified Wire Ethernet Controller [1425:5001]
IOMMU group 5 82:00.4 Ethernet controller [0200]: Chelsio Communications Inc T520-CR Unified Wire Ethernet Controller [1425:5401]
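Since every function of the Chelsio card lands in IOMMU group 5, it can help to list exactly which functions sit in a given group. A minimal sketch, assuming the output of the loop above is fed on stdin. The function name group_members is made up for illustration.

```shell
#!/bin/sh
# group_members GROUP: read "IOMMU group N BDF description..." lines on stdin
# and print the PCI address (BDF) of every device in the given group.
group_members() {
    awk -v g="$1" '$1 == "IOMMU" && $2 == "group" && $3 == g { print $4 }'
}

# Example with three lines copied from the output above.
sample='IOMMU group 4 81:00.0 Ethernet controller [0200]: Intel Corporation 82574L Gigabit Network Connection [8086:10d3]
IOMMU group 5 82:00.0 Ethernet controller [0200]: Chelsio Communications Inc T520-CR Unified Wire Ethernet Controller [1425:5001]
IOMMU group 5 82:00.4 Ethernet controller [0200]: Chelsio Communications Inc T520-CR Unified Wire Ethernet Controller [1425:5401]'

printf '%s\n' "$sample" | group_members 5
```

On the host itself, the full loop output (without the grep) can be piped into group_members 5 to also catch the non-Ethernet functions in the same group.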
dmesg | grep -e DMAR -e IOMMU
[ 0.010393] ACPI: DMAR 0x000000007BAFE000 0000D0 (v01 DELL PE_SC3 00000001 DELL 00000001)
[ 0.010426] ACPI: Reserving DMAR table memory at [mem 0x7bafe000-0x7bafe0cf]
[ 0.860192] DMAR: Host address width 46
[ 0.860193] DMAR: DRHD base: 0x000000fbffc000 flags: 0x0
[ 0.860202] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 0.860205] DMAR: DRHD base: 0x000000c7ffc000 flags: 0x1
[ 0.860209] DMAR: dmar1: reg_base_addr c7ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[ 0.860211] DMAR: ATSR flags: 0x0
[ 0.860215] DMAR: ATSR flags: 0x0
[ 0.860217] DMAR-IR: IOAPIC id 10 under DRHD base 0xfbffc000 IOMMU 0
[ 0.860219] DMAR-IR: IOAPIC id 8 under DRHD base 0xc7ffc000 IOMMU 1
[ 0.860221] DMAR-IR: IOAPIC id 9 under DRHD base 0xc7ffc000 IOMMU 1
[ 0.860222] DMAR-IR: HPET id 0 under DRHD base 0xc7ffc000
[ 0.860224] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.860600] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in kdump mode
[ 0.861090] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 1.374726] DMAR: No RMRR found
[ 1.374727] DMAR: No SATC found
[ 1.374729] DMAR: dmar0: Using Queued invalidation
[ 1.374734] DMAR: dmar1: Using Queued invalidation
[ 1.426819] DMAR: Intel(R) Virtualization Technology for Directed I/O
dmesg | grep remapping
[ 0.860224] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.860600] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in kdump mode
[ 0.861090] DMAR-IR: Enabled IRQ remapping in x2apic mode
dmesg | grep cxgb
This is the Chelsio network driver. I am adding this since I found some old posts saying there were bugs in earlier FW versions. I am on 1.27.5 (most recent), where the issues are resolved.
[ 5.113678] cxgb4 0000:82:00.4: Coming up as MASTER: Initializing adapter
[ 5.748434] cxgb4 0000:82:00.4: Direct firmware load for cxgb4/t5-config.txt failed with error -2
[ 6.412383] cxgb4 0000:82:00.4: Successfully configured using Firmware Configuration File "Firmware Default", version 0x0, computed checksum 0x0
[ 6.452391] cxgb4 0000:82:00.4: Hash filter supported only on T6
[ 6.462385] cxgb4 0000:82:00.4: max_ordird_qp 21 max_ird_adapter 387072
[ 6.470363] cxgb4 0000:82:00.4: Current filter mode/mask 0x632b:0x21
[ 6.492035] cxgb4 0000:82:00.4: 128 MSI-X vectors allocated, nic 32 eoqsets 34 per uld 8 mirrorqsets 2
[ 6.492043] cxgb4 0000:82:00.4: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 6.498888] cxgb4 0000:82:00.4 eth1: Chelsio T520-CR 1G/10GBASE-SFP
[ 6.499208] cxgb4 0000:82:00.4 eth2: Chelsio T520-CR 1G/10GBASE-SFP
[ 6.507398] cxgb4 0000:82:00.4: Chelsio T520-CR rev 0
[ 6.507407] cxgb4 0000:82:00.4: S/N: PT05140769, P/N: 110116050D0
[ 6.507411] cxgb4 0000:82:00.4: Firmware version: 1.27.5.0
[ 6.507416] cxgb4 0000:82:00.4: Bootstrap version: 1.1.0.0
[ 6.507420] cxgb4 0000:82:00.4: TP Microcode version: 0.1.4.9
[ 6.507439] cxgb4 0000:82:00.4: Expansion ROM version: 1.0.0.68
[ 6.507440] cxgb4 0000:82:00.4: Serial Configuration version: 0x1003000
[ 6.507442] cxgb4 0000:82:00.4: VPD version: 0x2
[ 6.507444] cxgb4 0000:82:00.4: Configuration: RNIC MSI-X, Offload capable
[ 6.510091] cxgb4 0000:82:00.4 enp130s0f4d1: renamed from eth2
[ 6.510438] cxgb4 0000:82:00.4 enp130s0f4: renamed from eth1
//EDIT2:
Adding output of lspci -v for the Chelsio Ethernet and Storage controllers:
82:00.4 Ethernet controller: Chelsio Communications Inc T520-CR Unified Wire Ethernet Controller
Subsystem: Chelsio Communications Inc Device 0000
Flags: fast devsel, IRQ 102, NUMA node 1, IOMMU group 5
Memory at c9200000 (64-bit, non-prefetchable) [size=512K]
Memory at c8000000 (64-bit, non-prefetchable) [size=16M]
Memory at c98c4000 (64-bit, non-prefetchable) [size=8K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit+
Capabilities: [70] Express Endpoint, IntMsgNum 0
Capabilities: [b0] MSI-X: Enable- Count=128 Masked-
Capabilities: [d0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [170] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
Capabilities: [1a0] Secondary PCI Express
Capabilities: [1c0] Single Root I/O Virtualization (SR-IOV)
Capabilities: [200] Transaction Processing Hints
Kernel driver in use: vfio-pci
Kernel modules: cxgb4
82:00.5 SCSI storage controller: Chelsio Communications Inc T520-CR Unified Wire Storage Controller
Subsystem: Chelsio Communications Inc Device 0000
Flags: fast devsel, IRQ 146, NUMA node 1, IOMMU group 5
Memory at c9380000 (64-bit, non-prefetchable) [size=512K]
Memory at c9400000 (64-bit, non-prefetchable) [size=512K]
Memory at c98c8000 (64-bit, non-prefetchable) [size=8K]
Expansion ROM at <ignored> [disabled]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit+
Capabilities: [70] Express Endpoint, IntMsgNum 0
Capabilities: [b0] MSI-X: Enable- Count=40 Masked-
Capabilities: [d0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [170] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
Capabilities: [1a0] Secondary PCI Express
Capabilities: [1c0] Single Root I/O Virtualization (SR-IOV)
Capabilities: [200] Transaction Processing Hints
Kernel driver in use: vfio-pci
82:00.6 Fibre Channel: Chelsio Communications Inc T520-CR Unified Wire Storage Controller
Subsystem: Chelsio Communications Inc Device 0000
Flags: fast devsel, IRQ 147, NUMA node 1, IOMMU group 5
Memory at c9280000 (64-bit, non-prefetchable) [size=512K]
Memory at c9300000 (64-bit, non-prefetchable) [size=512K]
Memory at c98c6000 (64-bit, non-prefetchable) [size=8K]
Expansion ROM at <ignored> [disabled]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit+
Capabilities: [70] Express Endpoint, IntMsgNum 0
Capabilities: [b0] MSI-X: Enable- Count=40 Masked-
Capabilities: [d0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [170] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
Capabilities: [1a0] Secondary PCI Express
Capabilities: [1c0] Single Root I/O Virtualization (SR-IOV)
Capabilities: [200] Transaction Processing Hints
Kernel driver in use: vfio-pci
Kernel modules: csiostor
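The "Kernel driver in use" lines above can also be read straight from sysfs, which makes it quick to confirm that every function of the card was handed over to vfio-pci once the VM starts. A minimal sketch: the helper name bound_driver and the optional second argument (an alternative sysfs root, there only so the function can be tried against a fake tree) are assumptions for illustration, and the device addresses are the ones from this post.

```shell
#!/bin/sh
# bound_driver DEVICE [SYSFS_ROOT]: print the kernel driver a PCI function is
# bound to, or "none" if no driver is bound. SYSFS_ROOT defaults to /sys.
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Walk all functions of the card (address taken from the post).
for fn in 0 1 2 3 4 5 6; do
    dev="0000:82:00.$fn"
    printf '%s %s\n' "$dev" "$(bound_driver "$dev")"
done
```

Before the VM starts, the Ethernet function would be expected to show cxgb4, and after a successful start all listed functions should show vfio-pci.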
Any idea, clue or magic checkmark how to fix this?
Thank you in advance!