Is it impossible to set up PCIe passthrough on ProLiant Gen8 servers?

Lancer

New Member
Sep 10, 2023
Hi
My server is a DL380p Gen8, and I have read about and tested various solutions, but ALL of them failed.
I must admit that most of them are about GPU passthrough, whereas I just need to pass through my P420i controller from Proxmox to TrueNAS (and a SAS HBA later).

So is there any proven way to do it, or do we have to live with it as is?
Thank you
 
My server is a DL380p Gen8, and I have read about and tested various solutions, but ALL of them failed.
Did you enable PCIe passthrough as per the Proxmox manual? Without a clearer description of the failures, it might just be a configuration mistake, or maybe you need a motherboard BIOS update.
I must admit that most of them are about GPU passthrough, whereas I just need to pass through my P420i controller from Proxmox to TrueNAS (and a SAS HBA later).
Did passthrough of the device give you trouble, or could you not enable VT-d? Do you know that Proxmox has two possible bootloaders?
So is there any proven way to do it, or do we have to live with it as is?
I don't know your particular hardware, so maybe someone else here can comment. There are some threads about the DL380 and passthrough. Your system might have "the RMRR problem" and require an involved work-around or a fix from HP.
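Since the bootloader decides where kernel parameters are configured, it can help to check what the running kernel was actually booted with. The following is only a rough sketch: the helper name `has_cmdline_flag` is made up for this post, and it assumes an Intel system where the `intel_iommu=on` parameter from the Proxmox manual applies.

```shell
#!/bin/sh
# has_cmdline_flag FLAG [FILE]
# Hypothetical helper (not part of Proxmox): succeeds if FLAG appears as a
# whole word on the kernel command line. FILE defaults to /proc/cmdline.
has_cmdline_flag() (
    flag="$1"
    file="${2:-/proc/cmdline}"
    [ -r "$file" ] || exit 2
    # Put each parameter on its own line, then look for an exact match.
    tr -s ' \t' '\n\n' < "$file" | grep -Fxq -e "$flag"
)

# Usage on the Proxmox host:
#   has_cmdline_flag intel_iommu=on && echo "intel_iommu=on is active"
```

If the flag is missing, the place to add it depends on the bootloader: with GRUB it goes into /etc/default/grub (followed by update-grub), with systemd-boot into /etc/kernel/cmdline (followed by proxmox-boot-tool refresh).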
 
Thank you so much

Yes, I followed all the instructions. I only skipped the kernel patch, since I doubt it would hold up in the long run, and as far as I understand it has been included in the Proxmox kernel anyway.

As described in the manual, I verified with (lspci -nnk) and I do get (Kernel driver in use: vfio-pci)!

But I still couldn't start the VM, and got this message:
kvm: -device vfio-pci,host=0000:02:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: vfio 0000:02:00.0: failed to setup container for group 49: Failed to set iommu for container: Operation not permitted
TASK ERROR: start failed: QEMU exited with code 1

So what to do now?
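The error names IOMMU group 49, so one thing worth looking at is which devices sit in that group. A small sketch, assuming the standard sysfs layout under /sys/kernel/iommu_groups; the function name `iommu_group_members` is just an illustration, and this only inspects the grouping, it does not fix the "Operation not permitted" error.

```shell
#!/bin/sh
# iommu_group_members GROUP [SYSFS_ROOT]
# Hypothetical helper: prints the PCI addresses of all devices in the given
# IOMMU group. SYSFS_ROOT defaults to /sys and is only a parameter so the
# function can be tried against a copy of the directory tree.
iommu_group_members() (
    group="$1"
    root="${2:-/sys}"
    dir="$root/kernel/iommu_groups/$group/devices"
    [ -d "$dir" ] || exit 1
    ls "$dir"
)

# Usage on the Proxmox host (group number taken from the error message):
#   iommu_group_members 49
```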
 
Just read this:
https://forum.proxmox.com/threads/hpe-ml-dl-server-series-pci-gpu-passthrough-pve8.131361/

kvm: -device vfio-pci,host=0000:02:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: vfio 0000:02:00.0: failed to setup container for group 49: Failed to set iommu for container: Operation not permitted
TASK ERROR: start failed: QEMU exited with code 1

HPE IOMMU/passthrough is not enabled. It needs to be enabled per PCIe slot using the "conrep" utility from HPE; once you have enabled it for the specific PCIe slot, the error will no longer appear.
 
Sounds like the RMRR problem. I have no experience with that, but maybe this can provide a work-around? It looks like you selected hardware that does not work well with passthrough.
 
Just read this:
https://forum.proxmox.com/threads/hpe-ml-dl-server-series-pci-gpu-passthrough-pve8.131361/



HPE IOMMU/passthrough is not enabled. It needs to be enabled per PCIe slot using the "conrep" utility from HPE; once you have enabled it for the specific PCIe slot, the error will no longer appear.
Thank you
I think it's enabled, as I checked with this command:
Code:
# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
[    0.007107] ACPI: DMAR 0x00000000BDDAD200 0006A8 (v01 HP     ProLiant 00000001 \xd2?   0000162E)
[    0.007164] ACPI: Reserving DMAR table memory at [mem 0xbddad200-0xbddad8a7]
[    1.162964] DMAR: IOMMU enabled
[    2.597420] DMAR: Host address width 46
[    2.597422] DMAR: DRHD base: 0x000000fbefe000 flags: 0x0
[    2.597431] DMAR: dmar0: reg_base_addr fbefe000 ver 1:0 cap d2078c106f0466 ecap f020de
[    2.597434] DMAR: DRHD base: 0x000000eaffe000 flags: 0x1
[    2.597445] DMAR: dmar1: reg_base_addr eaffe000 ver 1:0 cap d2078c106f0466 ecap f020de
[    2.597447] DMAR: RMRR base: 0x000000bdffd000 end: 0x000000bdffffff
[    2.597451] DMAR: RMRR base: 0x000000bdff6000 end: 0x000000bdffcfff
[    2.597452] DMAR: RMRR base: 0x000000bdf83000 end: 0x000000bdf84fff
[    2.597454] DMAR: RMRR base: 0x000000bdf7f000 end: 0x000000bdf82fff
[    2.597455] DMAR: RMRR base: 0x000000bdf6f000 end: 0x000000bdf7efff
[    2.597456] DMAR: RMRR base: 0x000000bdf6e000 end: 0x000000bdf6efff
[    2.597457] DMAR: RMRR base: 0x000000000f4000 end: 0x000000000f4fff
[    2.597458] DMAR: RMRR base: 0x000000000e8000 end: 0x000000000e8fff
[    2.597460] DMAR: [Firmware Bug]: No firmware reserved region can cover this RMRR [0x00000000000e8000-0x00000000000e8fff], contact BIOS vendor for fixes
[    2.597530] DMAR: [Firmware Bug]: Your BIOS is broken; bad RMRR [0x00000000000e8000-0x00000000000e8fff]
[    2.597532] DMAR: RMRR base: 0x000000bddde000 end: 0x000000bdddefff
[    2.597533] DMAR: ATSR flags: 0x0
[    2.597541] DMAR-IR: IOAPIC id 10 under DRHD base  0xfbefe000 IOMMU 0
[    2.597542] DMAR-IR: IOAPIC id 8 under DRHD base  0xeaffe000 IOMMU 1
[    2.597544] DMAR-IR: IOAPIC id 0 under DRHD base  0xeaffe000 IOMMU 1
[    2.597546] DMAR-IR: HPET id 0 under DRHD base 0xeaffe000
[    2.598376] DMAR-IR: Enabled IRQ remapping in xapic mode
[    4.051504] DMAR: No SATC found
[    4.051510] DMAR: dmar0: Using Queued invalidation
[    4.051540] DMAR: dmar1: Using Queued invalidation
[    4.063053] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    5.808849] DMAR: DRHD: handling fault status reg 2
[    5.808944] DMAR: [INTR-REMAP] Request device [01:00.0] fault index 0x23 [fault reason 0x26] Blocked an interrupt request due to source-id verification failure
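For a long dmesg dump like this, a quick summary of the RMRR entries can make the relevant part easier to spot. This is just a sketch that counts lines by pattern; the function name `rmrr_summary` is made up, and the patterns are taken from the log format shown in this post.

```shell
#!/bin/sh
# rmrr_summary
# Hypothetical helper: reads dmesg text on stdin and prints how many RMRR
# regions and how many "[Firmware Bug]" DMAR lines it contains.
rmrr_summary() {
    awk '
        /DMAR: RMRR base:/       { regions++ }
        /DMAR: \[Firmware Bug\]/ { bugs++ }
        END { printf "rmrr_regions=%d firmware_bugs=%d\n", regions, bugs }
    '
}

# Usage on the Proxmox host:
#   dmesg | rmrr_summary
```

Note that "DMAR: IOMMU enabled" together with a list of RMRR regions only shows that the IOMMU is on and that the firmware declares reserved regions; it does not by itself say whether a given device can be passed through.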

Anyway, I followed the instructions in your link, but hit a dead end at this command:
lspci -s 0c:00.0 -vvv | grep 'Physical Slot'
It returned nothing. Maybe because the P420i is embedded?
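To make the slot lookup a bit more explicit, here is a small sketch that pulls the slot number out of the lspci output and makes the "no slot" case visible. The name `physical_slot` is invented for this example, and the address 02:00.0 in the usage line is simply the one from the earlier error message.

```shell
#!/bin/sh
# physical_slot
# Hypothetical helper: reads `lspci -vvv` output for one device on stdin and
# prints the value of its "Physical Slot:" line. Prints nothing when lspci
# reports no slot for that device.
physical_slot() {
    sed -n 's/^[[:space:]]*Physical Slot:[[:space:]]*//p'
}

# Usage on the Proxmox host:
#   slot=$(lspci -s 02:00.0 -vvv | physical_slot)
#   [ -n "$slot" ] && echo "slot $slot" || echo "no physical slot reported"
```

An empty result only means that lspci does not report a slot for the device; whether conrep accepts a slot number for an embedded controller is a separate question.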

Hope you can help with that, thanks
 
He doesn't need any "hack/patch"; just use the HPE "conrep" utility to enable the IOMMU on the specific PCIe slot (it is disabled by default on all PCIe slots).
Please see my previous reply. I couldn't find a 'Physical Slot' for my P420i.
Can I do it without this step?
 
Please see my previous reply. I couldn't find a 'Physical Slot' for my P420i.
Can I do it without this step?
I checked; unlike the other hardware, it has no "busid":
Code:
$> ssacli ctrl all show
    Smart Array P420i in Slot 0 (Embedded)
You can try "Slot0" with conrep. I do not know if this works.

You also need to blacklist the "hpsa" kernel module to prevent the driver from loading on the HOST.

Example:

Code:
# /etc/modprobe.d/hba.conf
blacklist hpsa

# rebuild the initramfs for all installed kernels, then reboot
$> update-initramfs -u -k all

If conrep fails, then passthrough is not possible.
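After the reboot it is worth confirming that the blacklist actually took effect. A minimal sketch, assuming the usual /proc/modules format where the module name is the first field; `module_loaded` is a made-up helper name.

```shell
#!/bin/sh
# module_loaded NAME [FILE]
# Hypothetical helper: succeeds if a kernel module called NAME is listed in
# FILE (default: /proc/modules). Matches the first field exactly.
module_loaded() (
    name="$1"
    file="${2:-/proc/modules}"
    [ -r "$file" ] || exit 2
    awk -v m="$name" '$1 == m { found = 1 } END { exit found ? 0 : 1 }' "$file"
)

# Usage on the Proxmox host, after the reboot:
#   if module_loaded hpsa; then
#       echo "hpsa is still loaded, the blacklist did not take effect"
#   fi
```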
 
Thanks
I did blacklist hpsa.
And I have to point out two things:
1- Suddenly my server went to 100% fan speed, then iLO reported a storage controller failure (I think), not once but multiple times! I think it was triggered by my passthrough changes.
2- In the server console I got this message: "dmar device is ineligible for iommu domain attach due to platform rmrr requirement...", along with others you can see in the attachment.
Also, earlier I got this message: "L1TF CPU bug present and SMT on, data leak possible. ...".
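Since the console message is only visible in a screenshot, it may be easier to pull the affected device addresses straight from the kernel log. This is a sketch under assumptions: the name `rmrr_blocked_devices` is invented, and it assumes the kernel log line contains the phrase "ineligible for IOMMU domain attach" together with the PCI address of the device, as in the message quoted above.

```shell
#!/bin/sh
# rmrr_blocked_devices
# Hypothetical helper: reads dmesg text on stdin and prints the unique PCI
# addresses (domain:bus:device.function) found on lines that mention the
# RMRR "ineligible for IOMMU domain attach" warning.
rmrr_blocked_devices() {
    grep -i 'ineligible for IOMMU domain attach' |
        grep -oE '[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]' |
        sort -u
}

# Usage on the Proxmox host:
#   dmesg | rmrr_blocked_devices
```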
 

Attachments

  • dmar.png
