PCIe passthrough doesn't work after upgrade

lyter

Renowned Member
Oct 15, 2013
Hi there

This weekend I wanted to upgrade my setup from Proxmox v3.3 to v4.3, but I'm having trouble passing through a PCIe controller (SAS, M1015 in IT mode).
I've had this working since v3.2, for over 2 years.

Questions are at the bottom of this post.

The steps I've taken:
  1. Clean install of Proxmox v4.3, changed repo to non-subscription, apt-get update and dist-upgrade

  2. updated the grub config (/etc/default/grub)
    Code:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
    then ran "update-grub" and rebooted
  3. added to /etc/modules
    Code:
    vfio
    vfio_iommu_type1
    vfio_pci
    vfio_virqfd
    • Check PCI address, still the same -->
      Code:
      lspci
      
      ...
      05:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 02)
      ...
  4. Check IOMMU interrupt remapping
    Code:
    dmesg | grep ecap
    [    0.025698] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c9008020660262 ecap f010da
    Should be fine: the ecap value ends in "a", and a last hex digit of 8 or higher (8 to f) indicates interrupt remapping support.
  • Verify IOMMU isolation
    Code:
    find /sys/kernel/iommu_groups/ -type l
    /sys/kernel/iommu_groups/0/devices/0000:00:00.0
    /sys/kernel/iommu_groups/1/devices/0000:00:01.0
    /sys/kernel/iommu_groups/1/devices/0000:00:01.1
    /sys/kernel/iommu_groups/1/devices/0000:05:00.0
    /sys/kernel/iommu_groups/2/devices/0000:00:06.0
    /sys/kernel/iommu_groups/3/devices/0000:00:1a.0
    /sys/kernel/iommu_groups/4/devices/0000:00:1c.0
    /sys/kernel/iommu_groups/5/devices/0000:00:1c.4
    /sys/kernel/iommu_groups/6/devices/0000:00:1c.6
    /sys/kernel/iommu_groups/7/devices/0000:00:1c.7
    /sys/kernel/iommu_groups/8/devices/0000:00:1d.0
    /sys/kernel/iommu_groups/9/devices/0000:00:1e.0
    /sys/kernel/iommu_groups/10/devices/0000:00:1f.0
    /sys/kernel/iommu_groups/10/devices/0000:00:1f.2
    /sys/kernel/iommu_groups/11/devices/0000:0b:00.0
    /sys/kernel/iommu_groups/12/devices/0000:0c:02.0
    /sys/kernel/iommu_groups/12/devices/0000:0d:00.0
    /sys/kernel/iommu_groups/12/devices/0000:0d:00.1
    /sys/kernel/iommu_groups/13/devices/0000:0c:04.0
    /sys/kernel/iommu_groups/13/devices/0000:0e:00.0
    /sys/kernel/iommu_groups/13/devices/0000:0e:00.1
    /sys/kernel/iommu_groups/14/devices/0000:03:00.0
    /sys/kernel/iommu_groups/14/devices/0000:03:00.1
    /sys/kernel/iommu_groups/15/devices/0000:0f:00.0
    /sys/kernel/iommu_groups/16/devices/0000:01:00.0
    /sys/kernel/iommu_groups/16/devices/0000:01:00.1
    /sys/kernel/iommu_groups/16/devices/0000:01:00.2
    /sys/kernel/iommu_groups/16/devices/0000:01:00.4
  • verify IOMMU is working, weird errors at bottom
    Code:
    dmesg | grep -e DMAR -e IOMMU
    [    0.000000] ACPI: DMAR 0x00000000F1DE4A80 000504 (v01 HP     ProLiant 00000001 \xffffffd2?   0000162E)
    [    0.000000] DMAR: IOMMU enabled
    [    0.025694] DMAR: Host address width 39
    [    0.025695] DMAR: DRHD base: 0x000000fed90000 flags: 0x1
    [    0.025698] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c9008020660262 ecap f010da
    [    0.025699] DMAR: RMRR base: 0x000000f1ffd000 end: 0x000000f1ffffff
    [    0.025700] DMAR: RMRR base: 0x000000f1ff6000 end: 0x000000f1ffcfff
    [    0.025701] DMAR: RMRR base: 0x000000f1f93000 end: 0x000000f1f94fff
    [    0.025702] DMAR: RMRR base: 0x000000f1f8f000 end: 0x000000f1f92fff
    [    0.025703] DMAR: RMRR base: 0x000000f1f7f000 end: 0x000000f1f8efff
    [    0.025704] DMAR: RMRR base: 0x000000f1f7e000 end: 0x000000f1f7efff
    [    0.025705] DMAR: RMRR base: 0x000000000f4000 end: 0x000000000f4fff
    [    0.025706] DMAR: RMRR base: 0x000000000e8000 end: 0x000000000e8fff
    [    0.025707] DMAR: RMRR base: 0x000000f1dee000 end: 0x000000f1deefff
    [    0.025708] DMAR-IR: IOAPIC id 8 under DRHD base  0xfed90000 IOMMU 0
    [    0.025709] DMAR-IR: HPET id 0 under DRHD base 0xfed90000
    [    0.025710] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
    [    0.025711] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
    [    0.025860] DMAR-IR: Enabled IRQ remapping in xapic mode
    [    0.515260] DMAR: No ATSR found
    [    0.515336] DMAR: dmar0: Using Queued invalidation
    [    0.515344] DMAR: Setting RMRR:
    [    0.515357] DMAR: Setting identity map for device 0000:01:00.0 [0xf1dee000 - 0xf1deefff]
    ...
    [    0.515411] DMAR: Setting identity map for device 0000:05:00.0 [0xe8000 - 0xe8fff]
    ...
    [    0.515526] DMAR: Setting identity map for device 0000:05:00.0 [0xf4000 - 0xf4fff]
    ...
    [    0.515545] DMAR: Setting identity map for device 0000:05:00.0 [0xf1f7e000 - 0xf1f7efff]
    ...
    [    0.515612] DMAR: Setting identity map for device 0000:05:00.0 [0xf1f7f000 - 0xf1f8efff]
    ...
    [    0.515628] DMAR: Setting identity map for device 0000:05:00.0 [0xf1f8f000 - 0xf1f92fff]
    ...
    [    0.515641] DMAR: Setting identity map for device 0000:05:00.0 [0xf1f93000 - 0xf1f94fff]
    ...
    [    0.515684] DMAR: Prepare 0-16MiB unity mapping for LPC
    [    0.515689] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
    [    0.515760] DMAR: Intel(R) Virtualization Technology for Directed I/O
    [  102.071750] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [  222.457405] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [  229.274505] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [  393.803991] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [ 1385.978901] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [ 1403.995564] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [ 1430.046864] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [ 1565.871543] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [ 2463.169383] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    [ 2499.985486] vfio-pci 0000:05:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
  • double check VMID.conf (202.conf)
    Code:
    bootdisk: ide0
    cores: 2
    machine: q35
    hostpci0: 05:00.0,pcie=1
    net0: virtio=42:67:36:79:A9:89,bridge=vmbr0
    ide0: local-lvm:vm-202-disk-1,size=20G
    memory: 4096
    name: filer
    ostype: l26
    smbios1: uuid=a0f9ba02-ede2-44a2-9f9a-9dfb206b06d4
    sockets: 1
  • start VM via "qm start 202", get errors
    Code:
    qm start 202
    kvm: -device vfio-pci,host=05:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0,rombar=0: vfio: failed to set iommu for container: Operation not permitted
    kvm: -device vfio-pci,host=05:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0,rombar=0: vfio: failed to setup container for group 1
    kvm: -device vfio-pci,host=05:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0,rombar=0: vfio: failed to get group 1
    kvm: -device vfio-pci,host=05:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0,rombar=0: Device initialization failed
    start failed: command '/usr/bin/kvm -id 202 -chardev 'socket,id=qmp,path=/var/run/qemu-server/202.qmp,server,nowait' -mon ...
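
For anyone repeating the interrupt-remapping and isolation checks above, here is a minimal sketch of both as shell functions. The function names are made up for illustration. The first one assumes the "last hex digit of ecap is 8 or higher" rule (bit 3 of the value). The second assumes the path layout printed by the find command in this thread.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Proxmox.

# Print "yes" if the ecap hex value (taken from `dmesg | grep ecap`)
# ends in 8..f, i.e. the interrupt remapping bit is set; else "no".
ecap_supports_ir() {
    case "$1" in
        *[89abcdefABCDEF]) echo yes ;;
        *) echo no ;;
    esac
}

# Read `find /sys/kernel/iommu_groups/ -type l` output on stdin and print
# every device that shares an IOMMU group with the device given as $1.
# Exits non-zero if the device does not appear in the listing.
group_members() {
    awk -F/ -v dev="$1" '
        {
            grp[NR] = $(NF - 2)      # .../<group>/devices/<device>
            name[NR] = $NF
            grp_of[$NF] = $(NF - 2)
        }
        END {
            if (!(dev in grp_of)) exit 1
            for (i = 1; i <= NR; i++)
                if (grp[i] == grp_of[dev]) print name[i]
        }'
}
```

Usage would look like `ecap_supports_ir f010da` and `find /sys/kernel/iommu_groups/ -type l | group_members 0000:05:00.0`. With the listing above, the second call prints 0000:00:01.0, 0000:00:01.1 and 0000:05:00.0, i.e. everything in group 1.
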
I hope I've made clear where the problem is. On Proxmox v3.3 I had a different VM config, which worked fine:
Code:
df

Questions:
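  1. What has changed since v3.3 that breaks passthrough?
  2. How do I get passthrough working with v4.3?
  3. What do the errors at the bottom of the "verify IOMMU is working" output mean ("Device is ineligible for IOMMU domain attach due to platform RMRR requirement")?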
Thanks in advance. I'd really love to use Proxmox v4.3! :D
 
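This seems to be a problem with the hardware you use.
Some mainboards reserve address space for PCI devices (used e.g. for sideband communication) and report it through RMRR (Reserved Memory Region Reporting) entries, which is what the vfio-pci error in your dmesg output refers to.
Newer Linux kernels no longer allow such devices to be passed through.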

Here is a link to a Red Hat whitepaper with more detail on this issue:
https://access.redhat.com/articles/1434873
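
To see which memory ranges the firmware reserves for a given device, the "Setting identity map" lines in dmesg can be filtered. This is only a rough sketch: the function name is hypothetical, and it assumes the kernel message format shown in the dmesg output earlier in this thread, which may differ on other kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg output on stdin and print the identity-mapped
# (RMRR) ranges for the PCI device given as $1, one "start-end" pair per line.
# Prints nothing if the device has no such mapping in the log.
rmrr_ranges() {
    awk -v dev="$1" '
        /DMAR: Setting identity map for device/ {
            for (i = 1; i <= NF; i++)
                if ($i == "device" && $(i + 1) == dev) {
                    # The range follows as three fields: [0xe8000 - 0xe8fff]
                    first = $(i + 2)
                    last = $(i + 4)
                    sub(/^\[/, "", first)
                    sub(/\]$/, "", last)
                    print first "-" last
                }
        }'
}
```

Run as `dmesg | rmrr_ranges 0000:05:00.0`. Any output means the device has reserved regions attached to it, which matches the "platform RMRR requirement" message from vfio-pci.
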
 
So, to stay solution-oriented, these are the alternatives:
  • HP fixes this problem (very unlikely, since their support is quite poor)
  • Buy new hardware (is there any safe combination that avoids a repeat of this?)
  • Stay at v3.3 and miss out on the cool new features of v4.x+
I suppose downgrading the Linux kernel to an earlier version would make Proxmox unusable?
Is there any other fix that I'm totally unaware of?
Today, I've stumbled upon these posts:
https://lime-technology.com/forum/index.php?topic=48630.0
and
https://forums.servethehome.com/index.php?threads/lsi9211-8i-on-ubuntu-15-10-timeouts.8820/page-2

which looked promising, but didn't work for me. Could this be the solution?

Thanks!
 
Any further help? Or at least a confirmation of my assumptions would be great. Thanks
 
  • HP fixes this problem (very unlikely, since their support is quite poor)
  • Buy new hardware (is there any safe combination that avoids a repeat of this?)
  • Stay at v3.3 and miss out on the cool new features of v4.x+

Yes, it seems these are your options. Sadly, I cannot comment on the links, as we do not have this hardware here to test.
 